Science.gov

Sample records for distributed clustering algorithm

  1. Optimization of a Distributed Genetic Algorithm on a Cluster of Workstations for the Detection of Microcalcifications

    NASA Astrophysics Data System (ADS)

    Bevilacqua, A.; Campanini, R.; Lanconelli, N.

    We have developed a method for the detection of clusters of microcalcifications in digital mammograms. Here, we present a genetic algorithm used to optimize the choice of the parameters in the detection scheme. The optimization improved performance, enabled a detailed study of the influence of the various parameters on performance, and allowed an accurate investigation of the behavior of the detection method on unknown cases. We reach a sensitivity of 96.2% with 0.7 false positive clusters per image on the Nijmegen database; we are also able to identify the most significant parameters. In addition, we have examined the feasibility of a distributed genetic algorithm implemented on a non-dedicated Cluster Of Workstations. We obtain very good results in terms of both quality and efficiency.

  2. Basic cluster compression algorithm

    NASA Technical Reports Server (NTRS)

    Hilbert, E. E.; Lee, J.

    1980-01-01

    Feature extraction and data compression of LANDSAT data are accomplished by the BCCA program, which reduces costs associated with transmitting, storing, distributing, and interpreting multispectral image data. The algorithm uses spatially local clustering to extract features from the image data that describe the spectral characteristics of the data set. The approach requires only simple repetitive computations, and parallel processing can be used for very high data rates. The program is written in FORTRAN IV for batch execution and has been implemented on a SEL 32/55.

  3. A Novel Energy-Aware Distributed Clustering Algorithm for Heterogeneous Wireless Sensor Networks in the Mobile Environment.

    PubMed

    Gao, Ying; Wkram, Chris Hadri; Duan, Jiajie; Chou, Jarong

    2015-12-10

    In order to prolong the network lifetime, energy-efficient protocols adapted to the features of wireless sensor networks should be used. This paper explores in depth the nature of heterogeneous wireless sensor networks and proposes an algorithm to address the problem of energy-effective clustering in heterogeneous networks. The proposed algorithm selects cluster heads according to the degree of energy attenuation during network operation and the degree of the candidate nodes' effective coverage of the whole network, so as to obtain even energy consumption over the whole network while maintaining a high degree of coverage. Simulation results show that the proposed clustering protocol adapts better to heterogeneous environments than existing clustering algorithms in prolonging the network lifetime.
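    The abstract does not give the paper's exact selection formula; a minimal sketch of the general idea, ranking candidate nodes by a weighted combination of residual energy and effective coverage (the weights and field names here are assumptions made for illustration), might look like:

```python
def select_cluster_heads(nodes, num_heads, w_energy=0.6, w_coverage=0.4):
    """Rank candidate nodes by a weighted score of normalized residual
    energy and effective coverage, and pick the top scorers as cluster
    heads. `nodes` is a list of dicts with 'energy' (residual energy)
    and 'coverage' (fraction of the field the node covers, 0..1).
    The 0.6/0.4 weighting is an assumed default, not the paper's."""
    max_e = max(n["energy"] for n in nodes) or 1.0  # avoid division by zero
    scored = sorted(
        nodes,
        key=lambda n: w_energy * n["energy"] / max_e + w_coverage * n["coverage"],
        reverse=True,
    )
    return scored[:num_heads]
```

    In a real protocol the score would be recomputed each round as energies decay, so the cluster-head role rotates away from depleted nodes.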

  4. A Novel Energy-Aware Distributed Clustering Algorithm for Heterogeneous Wireless Sensor Networks in the Mobile Environment

    PubMed Central

    Gao, Ying; Wkram, Chris Hadri; Duan, Jiajie; Chou, Jarong

    2015-01-01

    In order to prolong the network lifetime, energy-efficient protocols adapted to the features of wireless sensor networks should be used. This paper explores in depth the nature of heterogeneous wireless sensor networks and proposes an algorithm to address the problem of energy-effective clustering in heterogeneous networks. The proposed algorithm selects cluster heads according to the degree of energy attenuation during network operation and the degree of the candidate nodes' effective coverage of the whole network, so as to obtain even energy consumption over the whole network while maintaining a high degree of coverage. Simulation results show that the proposed clustering protocol adapts better to heterogeneous environments than existing clustering algorithms in prolonging the network lifetime. PMID:26690440

  5. An incremental clustering algorithm based on Mahalanobis distance

    NASA Astrophysics Data System (ADS)

    Aik, Lim Eng; Choon, Tan Wee

    2014-12-01

    The classical fuzzy c-means clustering algorithm is insufficient for clustering non-spherical or elliptically distributed datasets. This paper replaces the Euclidean distance of classical fuzzy c-means clustering with the Mahalanobis distance, and applies the Mahalanobis distance to incremental learning for its merits. A Mahalanobis-distance-based fuzzy incremental clustering learning algorithm is proposed. Experimental results show that the algorithm not only remedies the defect of the fuzzy c-means algorithm but also increases training accuracy.
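    The abstract gives no formulas; the core substitution it describes, Mahalanobis distance in place of Euclidean distance so that spread along a cluster's long axis is not penalized, can be sketched for the 2-D case as:

```python
def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance for a 2-D point, inverting the
    2x2 covariance matrix directly (a sketch of the distance only;
    the paper's full fuzzy-membership updates are not reproduced)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    # (dx)^T * cov^{-1} * dx
    y0 = inv[0][0] * dx[0] + inv[0][1] * dx[1]
    y1 = inv[1][0] * dx[0] + inv[1][1] * dx[1]
    return dx[0] * y0 + dx[1] * y1
```

    For the elongated covariance diag(4, 1), the points (2, 0) and (0, 1) are equally distant from the origin in the Mahalanobis sense, although their Euclidean distances differ by a factor of two.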

  6. A novel clustering algorithm inspired by membrane computing.

    PubMed

    Peng, Hong; Luo, Xiaohui; Gao, Zhisheng; Wang, Jun; Pei, Zheng

    2015-01-01

    P systems are a class of distributed parallel computing models. This paper presents a novel clustering algorithm, called the membrane clustering algorithm, inspired by the mechanism of a tissue-like P system with a loop structure of cells. The objects of the cells express the candidate centers of the clusters and are evolved by evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior or competitive to the k-means algorithm and several evolutionary clustering algorithms recently reported in the literature.

  7. A Novel Clustering Algorithm Inspired by Membrane Computing

    PubMed Central

    Luo, Xiaohui; Gao, Zhisheng; Wang, Jun; Pei, Zheng

    2015-01-01

    P systems are a class of distributed parallel computing models. This paper presents a novel clustering algorithm, called the membrane clustering algorithm, inspired by the mechanism of a tissue-like P system with a loop structure of cells. The objects of the cells express the candidate centers of the clusters and are evolved by evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior or competitive to the k-means algorithm and several evolutionary clustering algorithms recently reported in the literature. PMID:25874264

  8. Overlapping clusters for distributed computation.

    SciTech Connect

    Mirrokni, Vahab; Andersen, Reid; Gleich, David F.

    2010-11-01

    Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex-partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication-avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al., Nature 2009; Mishra et al., WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability ρ∞; and (ii) the communication volume of a parallel PageRank solution (link-following α = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decrease.

  9. Cluster compression algorithm: A joint clustering/data compression concept

    NASA Technical Reports Server (NTRS)

    Hilbert, E. E.

    1977-01-01

    The Cluster Compression Algorithm (CCA), which was developed to reduce costs associated with transmitting, storing, distributing, and interpreting LANDSAT multispectral image data, is described. The CCA is a preprocessing algorithm that uses feature extraction and data compression to more efficiently represent the information in the image data. The format of the preprocessed data enables simple look-up-table decoding and direct use of the extracted features to reduce user computation for either image reconstruction or computer interpretation of the image data. Basically, the CCA uses spatially local clustering to extract features from the image data to describe spectral characteristics of the data set. In addition, the features may be used to form a sequence of scalar numbers that define each picture element in terms of the cluster features. This sequence, called the feature map, is then efficiently represented by using source encoding concepts. Various forms of the CCA are defined and experimental results are presented to show trade-offs and characteristics of the various implementations. Examples are provided that demonstrate the application of the cluster compression concept to multispectral images from LANDSAT and other sources.

  10. Hesitant fuzzy agglomerative hierarchical clustering algorithms

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaolu; Xu, Zeshui

    2015-02-01

    Recently, hesitant fuzzy sets (HFSs) have been studied by many researchers as a powerful tool to describe and deal with uncertain data, but relatively few studies focus on the clustering analysis of HFSs. In this paper, we propose a novel hesitant fuzzy agglomerative hierarchical clustering algorithm for HFSs. The algorithm considers each of the given HFSs as a unique cluster in the first stage, and then compares each pair of HFSs using the weighted Hamming distance or the weighted Euclidean distance. The two clusters with the smallest distance are joined. The procedure is repeated until the desired number of clusters is achieved. Moreover, we extend the algorithm to cluster interval-valued hesitant fuzzy sets, and finally illustrate the effectiveness of our clustering algorithms with experimental results.
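    The merge loop described above can be sketched as follows; representing each HFS as a plain membership vector and a merged cluster by the elementwise mean of its members are simplifications made for illustration, not necessarily the paper's choices:

```python
def weighted_hamming(a, b, w):
    """Weighted Hamming distance between two membership vectors."""
    return sum(wi * abs(ai - bi) for wi, ai, bi in zip(w, a, b))

def agglomerate(hfss, k, w):
    """Start with each set as its own cluster, then repeatedly merge
    the closest pair (by weighted Hamming distance between cluster
    representatives) until k clusters remain."""
    clusters = [[h] for h in hfss]

    def rep(c):  # elementwise mean of the cluster's members (assumed)
        return [sum(v) / len(c) for v in zip(*c)]

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = weighted_hamming(rep(clusters[i]), rep(clusters[j]), w)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters.pop(j)  # join the closest pair
    return clusters
```

    Swapping `weighted_hamming` for a weighted Euclidean distance gives the algorithm's second variant.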

  11. Basic firefly algorithm for document clustering

    NASA Astrophysics Data System (ADS)

    Mohammed, Athraa Jasim; Yusof, Yuhanis; Husni, Husniza

    2015-12-01

    Document clustering plays a significant role in Information Retrieval (IR), where it organizes documents prior to the retrieval process. To date, various clustering algorithms have been proposed, including K-means and Particle Swarm Optimization. Even though these algorithms have been widely applied in many disciplines due to their simplicity, such approaches tend to be trapped in a local minimum during the search for an optimal solution. To address this shortcoming, this paper proposes a Basic Firefly (Basic FA) algorithm to cluster text documents. The algorithm employs the Average Distance to Document Centroid (ADDC) as the objective function of the search. Experiments utilizing the proposed algorithm were conducted on the 20Newsgroups benchmark dataset. Results demonstrate that the Basic FA generates more robust and compact clusters than those produced by K-means and Particle Swarm Optimization (PSO).
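    One common formulation of the ADDC objective (the paper may normalize or weight it differently) is the mean, over clusters, of the average distance from each document vector to its cluster centroid; smaller values mean more compact clusters:

```python
import math

def addc(clusters):
    """Average Distance of Documents to the cluster Centroid.
    `clusters` is a list of clusters, each a list of document vectors.
    This is one common formulation, assumed here for illustration."""
    total = 0.0
    for docs in clusters:
        # centroid: elementwise mean of the cluster's document vectors
        centroid = [sum(dim) / len(docs) for dim in zip(*docs)]
        total += sum(math.dist(d, centroid) for d in docs) / len(docs)
    return total / len(clusters)
```

    A swarm-style optimizer such as the firefly algorithm would move candidate sets of centroids through the search space so as to minimize this value.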

  12. A Geometric Clustering Algorithm with Applications to Structural Data

    PubMed Central

    Xu, Shutan; Zou, Shuxue

    2015-01-01

    An important feature of structural data, especially those from structural determination and protein-ligand docking programs, is that their distribution could be mostly uniform. Traditional clustering algorithms developed specifically for nonuniformly distributed data may not be adequate for their classification. Here we present a geometric partitional algorithm that could be applied to both uniformly and nonuniformly distributed data. The algorithm is a top-down approach that recursively selects the outliers as the seeds to form new clusters until all the structures within a cluster satisfy a classification criterion. The algorithm has been evaluated on a diverse set of real structural data and six sets of test data. The results show that it is superior to the previous algorithms for the clustering of structural data and is similar to or better than them for the classification of the test data. The algorithm should be especially useful for the identification of the best but minor clusters and for speeding up an iterative process widely used in NMR structure determination. PMID:25517067

  13. Self-organization and clustering algorithms

    NASA Technical Reports Server (NTRS)

    Bezdek, James C.

    1991-01-01

    Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.

  14. Algorithm Calculates Cumulative Poisson Distribution

    NASA Technical Reports Server (NTRS)

    Bowerman, Paul N.; Nolty, Robert C.; Scheuer, Ernest M.

    1992-01-01

    The algorithm calculates accurate values of the cumulative Poisson distribution under conditions where other algorithms fail because the numbers are so small (underflow) or so large (overflow) that the computer cannot process them. Factors are inserted temporarily to prevent underflow and overflow. The method is implemented in the CUMPOIS computer program described in "Cumulative Poisson Distribution Program" (NPO-17714).
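    The CUMPOIS source is not reproduced here; the same over/underflow-avoidance idea can be sketched by evaluating each Poisson term in log space, where the naive formula would fail outright (e.g. exp(-1000) underflows to 0.0 in double precision):

```python
import math

def poisson_cdf(k, lam):
    """Cumulative Poisson P(X <= k) for rate lam > 0, computed term by
    term in log space so that neither exp(-lam) nor lam**i over- or
    underflows on its own. A sketch of the scaling idea described
    above, not CUMPOIS's exact scheme."""
    total = 0.0
    for i in range(k + 1):
        # log of the i-th Poisson term: i*ln(lam) - lam - ln(i!)
        log_term = i * math.log(lam) - lam - math.lgamma(i + 1)
        total += math.exp(log_term)
    return total
```

    Each exponentiated term lies in [0, 1], so the sum itself never overflows; terms too small to represent simply contribute zero, which is the correct rounding.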

  15. An algorithm for spatial hierarchy clustering

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Velasco, F. R. D.

    1981-01-01

    A method for utilizing both spectral and spatial redundancy in compacting and preclassifying images is presented. In multispectral satellite images, a high correlation exists between neighboring image points which tend to occupy dense and restricted regions of the feature space. The image is divided into windows of the same size where the clustering is made. The classes obtained in several neighboring windows are clustered, and then again successively clustered until only one region corresponding to the whole image is obtained. By employing this algorithm only a few points are considered in each clustering, thus reducing computational effort. The method is illustrated as applied to LANDSAT images.

  16. Color sorting algorithm based on K-means clustering algorithm

    NASA Astrophysics Data System (ADS)

    Zhang, BaoFeng; Huang, Qian

    2009-11-01

    In the process of raisin production, a variety of color impurities occur, which need to be removed effectively. A new, efficient raisin color-sorting algorithm is presented here. First, threshold-based image processing was applied for image pre-processing, and the gray-scale distribution characteristic of the raisin image was found. In order to obtain the chromatic-aberration image and reduce disturbance, we performed frame subtraction, subtracting the background image data from the target image data. Second, a Haar wavelet filter was used to obtain a smoothed image of the raisins. According to the different colors and to mildew, spots and other external features, their image characteristics were calculated so as to fully reflect the quality differences between raisins of different types. After this processing, the images were analyzed by the K-means clustering method, which achieves adaptive extraction of the statistical features; in accordance with these, the image data were divided into different categories, so that the categories of abnormal colors were distinguished. Using this algorithm, raisins of abnormal colors and those with mottles were eliminated. The sorting rate was up to 98.6%, and the ratio of normal raisins to sorted grains was less than one eighth.
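    The full pipeline above is image-specific; the final clustering step can be illustrated with a plain 1-D k-means (Lloyd's algorithm) on scalar color features, a stand-in for, not a reproduction of, the paper's method:

```python
import random

def kmeans_1d(values, k, iters=20, seed=0):
    """Plain 1-D k-means: alternate between assigning each value to
    its nearest center and recomputing each center as the mean of its
    assigned values. `values` are scalar color features (assumed)."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)            # random initial centers
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[i].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return sorted(centers)
```

    With normal raisins dark and impurities bright (or vice versa), the two resulting centers give an adaptive threshold separating the categories.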

  17. Sampling and clustering algorithm for determining the number of clusters based on the rosette pattern

    NASA Astrophysics Data System (ADS)

    Sadr, Ali; Momtaz, Amirkeyvan

    2012-01-01

    Clustering is one of the image-processing methods used in non-destructive testing (NDT). As one of their initialization parameters, most clustering algorithms, such as fuzzy c-means (FCM), iterative self-organizing data analysis (ISODATA), K-means, and their derivatives, require the number of clusters. This paper proposes an algorithm for clustering the pixels in C-scan images without any initialization parameters. In this method, an image is sampled based on the rosette pattern and, according to the pattern characteristics, the extracted samples are clustered and the number of clusters is then determined. The centroids of the classes are computed by means of a method used to calculate the distribution function. Based on different data sets, the results show that the algorithm improves clustering capability by 92.93% and 91.93% in comparison with the FCM and K-means algorithms, respectively. Moreover, when dealing with high-resolution data sets, the efficiency of the algorithm in terms of cluster detection and run time improves considerably.

  18. Coupled cluster algorithms for networks of shared memory parallel processors

    NASA Astrophysics Data System (ADS)

    Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.

    2007-05-01

    As the popularity of using SMP systems as the building blocks for high-performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques, to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry: the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm are presented on several large-scale clusters of SMPs.

  19. Parallel Clustering Algorithms for Structured AMR

    SciTech Connect

    Gunney, B T; Wissink, A M; Hysom, D A

    2005-10-26

    We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.

  20. Performance Comparison Of Evolutionary Algorithms For Image Clustering

    NASA Astrophysics Data System (ADS)

    Civicioglu, P.; Atasever, U. H.; Ozkan, C.; Besdok, E.; Karkinli, A. E.; Kesikoglu, A.

    2014-09-01

    Evolutionary computation tools are able to process real-valued numerical sets in order to extract a suboptimal solution of a designed problem. Data clustering algorithms have been used intensively for image segmentation in remote sensing applications. Despite the wide usage of evolutionary algorithms for data clustering, their clustering performance has been scarcely studied using clustering validation indexes. In this paper, recently proposed evolutionary algorithms (i.e., the Artificial Bee Colony Algorithm (ABC), Gravitational Search Algorithm (GSA), Cuckoo Search Algorithm (CS), Adaptive Differential Evolution Algorithm (JADE), Differential Search Algorithm (DSA) and Backtracking Search Optimization Algorithm (BSA)) and some classical image clustering techniques (i.e., k-means, FCM, SOM networks) have been used to cluster images, and their performances have been compared using four clustering validation indexes. Experimental results showed that evolutionary algorithms give more reliable cluster centers than classical clustering techniques, but their convergence time is quite long.

  1. Noise-enhanced clustering and competitive learning algorithms.

    PubMed

    Osoba, Osonde; Kosko, Bart

    2013-01-01

    Noise can provably speed up convergence in many centroid-based clustering algorithms. This includes the popular k-means clustering algorithm. The clustering noise benefit follows from the general noise benefit for the expectation-maximization algorithm because many clustering algorithms are special cases of the expectation-maximization algorithm. Simulations show that noise also speeds up convergence in stochastic unsupervised competitive learning, supervised competitive learning, and differential competitive learning.
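    A toy illustration of the noise-benefit idea: k-means with a small, decaying random perturbation added to each centroid update. The annealed noise schedule here is an assumption made for the sketch, not taken from the paper:

```python
import random

def noisy_kmeans_1d(values, k, iters=30, noise0=1.0, seed=1):
    """1-D k-means where each centroid update is perturbed by
    zero-mean Gaussian noise whose scale decays over iterations, so
    early iterations can escape poor configurations while late ones
    converge. The 1/(t+1) decay is an assumed schedule."""
    rng = random.Random(seed)
    centers = rng.sample(values, k)
    for t in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[i].append(v)
        scale = noise0 / (t + 1)               # decaying noise level
        centers = [
            (sum(g) / len(g) if g else centers[i]) + rng.gauss(0.0, scale)
            for i, g in enumerate(groups)
        ]
    return sorted(centers)
```

    As the noise anneals away, the update reduces to ordinary k-means, so well-separated clusters are still recovered.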

  2. Open cluster membership probability based on K-means clustering algorithm

    NASA Astrophysics Data System (ADS)

    El Aziz, Mohamed Abd; Selim, I. M.; Essam, A.

    2016-08-01

    In the field of galaxy images, the relative coordinate positions of each star with respect to all the other stars are adopted. The membership of a star cluster is therefore determined by two basic criteria, one for geometric membership and the other for physical (photometric) membership. In this paper, we present a new method for the determination of open cluster membership based on the K-means clustering algorithm. This algorithm allows us to efficiently discriminate cluster members from the field stars. To validate the method we applied it to NGC 188 and NGC 2266, and membership stars in these clusters have been obtained. The color-magnitude diagram of the membership stars is significantly clearer and shows a well-defined main sequence and a red giant branch in NGC 188, which allows us to better constrain the cluster members and estimate their physical parameters. The membership probabilities have been calculated and compared to those obtained by other methods. The results show that the K-means clustering algorithm can effectively select probable member stars in space without any assumption about the spatial distribution of stars in the cluster or field. Our results are in good agreement with those derived in previous works.

  3. Chaotic map clustering algorithm for EEG analysis

    NASA Astrophysics Data System (ADS)

    Bellotti, R.; De Carlo, F.; Stramaglia, S.

    2004-03-01

    The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with those obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class, without any training or supervision, thus providing a new, efficient methodology for the recognition of patterns affected by Huntington's disease.

  4. A Distributed Flocking Approach for Information Stream Clustering Analysis

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E

    2006-01-01

    Intelligence analysts are currently overwhelmed by the amount of information streams generated every day. There is a lack of comprehensive tools that can analyze information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to analyzing static document collections because they normally require a large amount of computational resources and a long time to get accurate results. It is very difficult to cluster a dynamically changing text information stream on an individual computer. Our early research resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach to text information stream clustering and discuss the decentralized architectures and communication schemes for load balance and status-information synchronization in this approach.

  5. Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters

    PubMed Central

    Tellaroli, Paola; Bazzi, Marco; Donato, Michele; Brazzale, Alessandra R.; Drăghici, Sorin

    2016-01-01

    Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well established hierarchical clustering algorithms: Ward's minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward's and Complete-linkage. We show, on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC to cluster samples in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository. PMID:27015427

  6. Improved Ant Colony Clustering Algorithm and Its Performance Study.

    PubMed

    Gao, Wei

    2016-01-01

    Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533

  7. Improved Ant Colony Clustering Algorithm and Its Performance Study

    PubMed Central

    Gao, Wei

    2016-01-01

    Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533

  8. Parallelization of Edge Detection Algorithm using MPI on Beowulf Cluster

    NASA Astrophysics Data System (ADS)

    Haron, Nazleeni; Amir, Ruzaini; Aziz, Izzatdin A.; Jung, Low Tan; Shukri, Siti Rohkmah

    In this paper, we present the design of a parallel Sobel edge detection algorithm using Foster's methodology. The parallel algorithm is implemented using the MPI message-passing library and a master/slave algorithm. Every processor performs the same sequential algorithm but on a different part of the image. Experimental results conducted on a Beowulf cluster are presented to demonstrate the performance of the parallel algorithm.
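    The domain decomposition itself can be sketched without an MPI installation: split the image rows into strips with one-row halos, run Sobel on each strip independently (sequentially here, standing in for the slave processes), and stitch the interior rows back together. The stitched result matches the sequential output exactly:

```python
def sobel(image):
    """Sobel gradient magnitude of a 2-D list of numbers; border
    pixels are left at zero."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y-1][x+1] + 2*image[y][x+1] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y][x-1] - image[y+1][x-1])
            gy = (image[y+1][x-1] + 2*image[y+1][x] + image[y+1][x+1]
                  - image[y-1][x-1] - 2*image[y-1][x] - image[y-1][x+1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def sobel_strips(image, parts):
    """Master/slave-style decomposition: each strip gets one halo row
    from its neighbors so its interior output is exact, then the halo
    output rows are dropped when stitching."""
    h = len(image)
    rows_per = h // parts                      # assumes h >= parts
    result = []
    for p in range(parts):
        lo = p * rows_per
        hi = h if p == parts - 1 else (p + 1) * rows_per
        strip = image[max(lo - 1, 0):min(hi + 1, h)]   # add halo rows
        local = sobel(strip)
        skip = 1 if lo > 0 else 0                      # drop halo output
        result.extend(local[skip:skip + (hi - lo)])
    return result
```

    In the MPI version the master would scatter the strips (with halos) to slaves and gather the interior rows back; the arithmetic per strip is identical.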

  9. A hybrid monkey search algorithm for clustering analysis.

    PubMed

    Chen, Xin; Zhou, Yongquan; Luo, Qifang

    2014-01-01

    Clustering is a popular data analysis and data mining technique, and the k-means clustering algorithm is one of the most commonly used methods. However, it depends strongly on the initial solution and easily falls into local optima. To address these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis; experiments on synthetic and real-life datasets show that it performs better than the basic monkey algorithm for clustering analysis.
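
    The initialization sensitivity mentioned above is easy to reproduce with a minimal Lloyd-style k-means (a generic sketch on made-up data, not the paper's hybrid monkey algorithm): two different seedings of the same data converge to local optima of very different quality.

```python
def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations from the given initial centers;
    returns the final centers and the total within-cluster inertia."""
    centers = [list(c) for c in centers]
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            clusters[j].append(p)
        for i, cl in enumerate(clusters):
            if cl:  # empty clusters keep their old center
                centers[i] = [sum(v) / len(cl) for v in zip(*cl)]
    inertia = sum(min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
                  for p in points)
    return centers, inertia
```

    On three well-separated pairs of points, seeding one center per pair finds the optimum, while seeding two centers inside one pair gets stuck in a much worse local optimum.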

  10. A Distributed Polygon Retrieval Algorithm Using MapReduce

    NASA Astrophysics Data System (ADS)

    Guo, Q.; Palanisamy, B.; Karimi, H. A.

    2015-07-01

    The burst of large-scale spatial terrain data due to the proliferation of data acquisition devices like 3D laser scanners poses challenges to spatial data analysis and computation. Among many spatial analyses and computations, polygon retrieval is a fundamental operation which is often performed under real-time constraints. However, existing sequential algorithms fail to meet this demand for larger sizes of terrain data. Motivated by the MapReduce programming model, a well-adopted large-scale parallel data processing technique, we present a MapReduce-based polygon retrieval algorithm designed with the objective of reducing the IO and CPU loads of spatial data processing. By indexing the data with a quad-tree approach, a significant amount of unneeded data is filtered out in the filtering stage, which reduces the IO overhead. The indexed data also facilitate querying the relationship between the terrain data and the query area in a shorter time. The results of the experiments performed on our Hadoop cluster demonstrate that our algorithm performs significantly better than the existing distributed algorithms.
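
    The filter-then-refine structure described above can be sketched in a map/reduce style, with a cheap bounding-box test standing in for the paper's quad-tree index; the data layout and function names here are illustrative.

```python
def bbox(poly):
    """Axis-aligned bounding box (x0, y0, x1, y1) of a vertex list."""
    xs, ys = zip(*poly)
    return min(xs), min(ys), max(xs), max(ys)

def boxes_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def map_phase(polygons, query_box):
    """Filtering stage: a cheap bounding-box test prunes most polygons,
    cutting the IO that the later exact test would incur."""
    for pid, poly in polygons:
        if boxes_overlap(bbox(poly), query_box):
            yield pid, poly

def reduce_phase(candidates, query_box):
    """Refinement stage: keep polygons with at least one vertex inside
    the query rectangle (a simplified exact test)."""
    x0, y0, x1, y1 = query_box
    return sorted(pid for pid, poly in candidates
                  if any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in poly))
```

    In Hadoop the map phase would run on data splits in parallel; here the two phases are chained directly.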

  11. Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models

    NASA Technical Reports Server (NTRS)

    Mjolsness, Eric; Castano, Rebecca; Gray, Alexander

    1999-01-01

    We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.
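
    A thresholded assignment conveys the zero/one/several-parents idea (a simplified sketch, not the paper's stochastic-grammar model): each data vector is attached to every center within a fixed radius, so it may end up with no parent, one parent, or several.

```python
def parent_sets(points, centers, radius):
    """Assign each point to all centers within `radius`; the returned
    list per point may be empty (no parent) or hold several indices."""
    out = []
    for p in points:
        out.append([i for i, c in enumerate(centers)
                    if sum((a - b) ** 2 for a, b in zip(p, c)) <= radius ** 2])
    return out
```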

  12. A Flocking Based algorithm for Document Clustering Analysis

    SciTech Connect

    Cui, Xiaohui; Gao, Jinzhu; Potok, Thomas E

    2006-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition clustering algorithms such as K-means, the Flocking based algorithm does not require initial partitional seeds. The algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for easy clustering result retrieval and visualization. Inspired by the self-organized behavior of bird flocks, we represent each document object with a flock boid. The simple local rules followed by each flock boid result in the entire document flock generating complex global behaviors, which eventually result in a clustering of the documents. We evaluate the efficiency of our algorithm with both a synthetic dataset and a real document collection that includes 100 news articles collected from the Internet. Our results show that the Flocking clustering algorithm achieves better performance compared to the K-means and the Ant clustering algorithm for real document clustering.

  13. A highly efficient multi-core algorithm for clustering extremely large datasets

    PubMed Central

    2010-01-01

    Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorial SNP data. Our new shared memory parallel algorithms show to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
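
    The parallelized assignment step of k-means, the part that dominates runtime, can be sketched as follows. Threads are used here for brevity, whereas the paper relies on multi-core shared memory with transactional-memory design principles (and real speedups in CPython would need processes or GIL-releasing numeric code).

```python
from concurrent.futures import ThreadPoolExecutor

def nearest(point, centers):
    """Index of the nearest center by squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])))

def parallel_assign(points, centers, n_workers=4):
    """Shared-memory parallel assignment step: each worker labels one
    contiguous chunk of the data against the shared center list."""
    chunk = (len(points) + n_workers - 1) // n_workers
    chunks = [points[i:i + chunk] for i in range(0, len(points), chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        labelled = pool.map(lambda ch: [nearest(p, centers) for p in ch], chunks)
    return [lab for part in labelled for lab in part]
```

    Because `Executor.map` preserves chunk order, the concatenated labels match the sequential result exactly.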

  14. A systematic comparison of genome-scale clustering algorithms

    PubMed Central

    2012-01-01

    Background A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard score of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. 
Conclusions Validation of clusters against known gene classifications demonstrate that for this data, graph-based techniques outperform conventional clustering approaches, suggesting that further
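
    The per-cluster scoring described in the Methods section reduces to taking the highest Jaccard agreement of a cluster with any annotation set; a minimal sketch (set-valued inputs assumed):

```python
def jaccard(a, b):
    """Jaccard similarity |a ∩ b| / |a ∪ b| of two gene sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def best_jaccard(cluster, annotation_sets):
    """Highest Jaccard agreement of one cluster with any GO/KEGG set;
    this is the per-cluster score that feeds the BAT5 average."""
    return max(jaccard(cluster, ann) for ann in annotation_sets)
```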

  15. The Enhanced Hoshen-Kopelman Algorithm for Cluster Analysis

    NASA Astrophysics Data System (ADS)

    Hoshen, Joseph

    1997-08-01

    In 1976 Hoshen and Kopelman (J. Hoshen and R. Kopelman, Phys. Rev. B, 14, 3438 (1976).) introduced a breakthrough algorithm, known today as the Hoshen-Kopelman algorithm, for cluster analysis. This algorithm revolutionized Monte Carlo cluster calculations in percolation theory, as it enables analysis of very large lattices containing 10^11 or more sites. Initially the HK algorithm's primary use was in the domain of pure and basic sciences. Later it began finding applications in diverse fields of technology and applied sciences. Examples of such applications are two- and three-dimensional image analysis, composite material modeling, polymers, remote sensing, brain modeling and food processing. While the original HK algorithm provides cluster size data for only one class of sites, the Enhanced HK (EHK) algorithm, presented in this paper, enables calculation of cluster spatial moments -- characteristics of cluster shapes -- for multiple classes of sites. These enhancements preserve the time and space complexities of the original HK algorithm, such that very large lattices can still be analyzed in a single pass through the lattice, simultaneously for cluster sizes, classes and shapes.
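
    The core of the original HK algorithm is a single raster pass with union-find label merging; the following sketch labels occupied sites of a 2D grid and returns cluster sizes (the multi-class and spatial-moment extensions of the EHK algorithm are omitted).

```python
def hoshen_kopelman(grid):
    """Label occupied sites (1s) of a 2D grid in one raster pass,
    merging labels with union-find when the left and top neighbors
    carry different cluster labels."""
    parent = [0]  # parent[label] -> label; index 0 means 'empty'

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    labels = [[0] * len(row) for row in grid]
    for y, row in enumerate(grid):
        for x, occ in enumerate(row):
            if not occ:
                continue
            left = labels[y][x - 1] if x > 0 else 0
            top = labels[y - 1][x] if y > 0 else 0
            if not left and not top:
                parent.append(len(parent))        # open a new cluster label
                labels[y][x] = len(parent) - 1
            elif left and top:
                a, b = find(left), find(top)
                parent[max(a, b)] = min(a, b)     # merge the two clusters
                labels[y][x] = min(a, b)
            else:
                labels[y][x] = find(left or top)
    # resolve final labels and accumulate cluster sizes
    sizes = {}
    for y, row in enumerate(labels):
        for x, lab in enumerate(row):
            if lab:
                r = find(lab)
                labels[y][x] = r
                sizes[r] = sizes.get(r, 0) + 1
    return labels, sorted(sizes.values())
```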

  16. Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension

    NASA Astrophysics Data System (ADS)

    Zhu, Zheng; Ochoa, Andrew J.; Katzgraber, Helmut G.

    2015-08-01

    Spin systems with frustration and disorder are notoriously difficult to study, both analytically and numerically. While the simulation of ferromagnetic statistical mechanical models benefits greatly from cluster algorithms, these accelerated dynamics methods remain elusive for generic spin-glass-like systems. Here, we present a cluster algorithm for Ising spin glasses that works in any space dimension and speeds up thermalization by at least one order of magnitude at temperatures where thermalization is typically difficult. Our isoenergetic cluster moves are based on the Houdayer cluster algorithm for two-dimensional spin glasses and lead to a speedup over conventional state-of-the-art methods that increases with the system size. We illustrate the benefits of the isoenergetic cluster moves in two and three space dimensions, as well as the nonplanar chimera topology found in the D-Wave Inc. quantum annealing machine.

  17. Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension.

    PubMed

    Zhu, Zheng; Ochoa, Andrew J; Katzgraber, Helmut G

    2015-08-14

    Spin systems with frustration and disorder are notoriously difficult to study, both analytically and numerically. While the simulation of ferromagnetic statistical mechanical models benefits greatly from cluster algorithms, these accelerated dynamics methods remain elusive for generic spin-glass-like systems. Here, we present a cluster algorithm for Ising spin glasses that works in any space dimension and speeds up thermalization by at least one order of magnitude at temperatures where thermalization is typically difficult. Our isoenergetic cluster moves are based on the Houdayer cluster algorithm for two-dimensional spin glasses and lead to a speedup over conventional state-of-the-art methods that increases with the system size. We illustrate the benefits of the isoenergetic cluster moves in two and three space dimensions, as well as the nonplanar chimera topology found in the D-Wave Inc. quantum annealing machine.

  18. Adaptive link selection algorithms for distributed estimation

    NASA Astrophysics Data System (ADS)

    Xu, Songcen; de Lamare, Rodrigo C.; Poor, H. Vincent

    2015-12-01

    This paper presents adaptive link selection algorithms for distributed estimation and considers their application to wireless sensor networks and smart grids. In particular, exhaustive search-based least mean squares (LMS) / recursive least squares (RLS) link selection algorithms and sparsity-inspired LMS / RLS link selection algorithms that can exploit the topology of networks with poor-quality links are considered. The proposed link selection algorithms are then analyzed in terms of their stability, steady-state, and tracking performance and computational complexity. In comparison with the existing centralized or distributed estimation strategies, the key features of the proposed algorithms are as follows: (1) more accurate estimates and faster convergence speed can be obtained and (2) the network is equipped with the ability of link selection that can circumvent link failures and improve the estimation performance. The performance of the proposed algorithms for distributed estimation is illustrated via simulations in applications of wireless sensor networks and smart grids.
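
    The LMS building block underlying these link selection schemes is the stochastic-gradient update w ← w + μ e x with error e = d − wᵀx; a minimal single-node sketch (the link selection logic itself is omitted, and the data below are illustrative):

```python
def lms(samples, mu=0.1):
    """Least-mean-squares sketch: for each (regressor x, measurement d)
    pair, compute the error e = d - w.x and update w <- w + mu * e * x."""
    w = [0.0] * len(samples[0][0])
    for x, d in samples:
        e = d - sum(wi * xi for wi, xi in zip(w, x))
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]
    return w
```

    With noise-free measurements and persistently exciting regressors, the estimate converges to the true parameter vector.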

  19. A space-time cluster algorithm for stochastic processes.

    SciTech Connect

    Gulbahce, N.

    2003-01-01

    We introduce a space-time cluster algorithm that will generate histories of stochastic processes. Michael Zimmer introduced a space-time MC algorithm for stochastic classical dynamics and applied it to simulate the Ising model with Glauber dynamics. Following his steps, we extended Brower and Tamayo's embedded φ^4 dynamics to space and time. We believe our algorithm can be applied to more general stochastic systems. Why space-time? To be able to study nonequilibrium systems, we need to know the probability of the 'history' of a nonequilibrium state. Histories are the entire space-time configurations. Cluster algorithms, first introduced by Swendsen and Wang (SW), are useful for overcoming critical slowing down. Brower and Tamayo mapped continuous field variables to Ising spins, and grew and flipped SW clusters to gain speed. Our algorithm is an extended version of theirs in space and time.

  20. A Fast Implementation of the ISODATA Clustering Algorithm

    NASA Technical Reports Server (NTRS)

    Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline

    2005-01-01

    Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.

  1. Efficient Record Linkage Algorithms Using Complete Linkage Clustering

    PubMed Central

    Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar

    2016-01-01

    Datasets from different agencies often contain records for the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. Many of the available algorithms for record linkage suffer from either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient and reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete-linkage hierarchical clustering to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a subroutine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times. PMID:27124604
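
    Two of the supporting techniques, blocking and sorting-based duplicate elimination, can be sketched as follows (the block key and record layout are illustrative; the complete-linkage clustering step is omitted):

```python
def deduplicate(records):
    """Blocking + sorting sketch: records are first partitioned by a
    cheap block key (here, the first letter of the surname), then
    sorted within each block so identical copies become adjacent and
    collapse into a single representative."""
    blocks = {}
    for rec in records:
        blocks.setdefault(rec[0][:1].lower(), []).append(rec)
    unique = []
    for key in sorted(blocks):
        prev = None
        for rec in sorted(blocks[key]):
            if rec != prev:       # adjacent duplicates are skipped
                unique.append(rec)
            prev = rec
    return unique
```

    Blocking keeps the sort cheap because only records sharing a key are ever compared.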

  2. Clustering of Hadronic Showers with a Structural Algorithm

    SciTech Connect

    Charles, M.J.; /SLAC

    2005-12-13

    The internal structure of hadronic showers can be resolved in a high-granularity calorimeter. This structure is described in terms of simple components and an algorithm for reconstruction of hadronic clusters using these components is presented. Results from applying this algorithm to simulated hadronic Z-pole events in the SiD concept are discussed.

  3. CCL: an algorithm for the efficient comparison of clusters

    PubMed Central

    Hundt, R.; Schön, J. C.; Neelamraju, S.; Zagorac, J.; Jansen, M.

    2013-01-01

    The systematic comparison of the atomic structure of solids and clusters has become an important task in crystallography, chemistry, physics and materials science, in particular in the context of structure prediction and structure determination of nanomaterials. In this work, an efficient and robust algorithm for the comparison of cluster structures is presented, which is based on the mapping of the point patterns of the two clusters onto each other. This algorithm has been implemented as the module CCL in the structure visualization and analysis program KPLOT. PMID:23682193

  4. A modified density-based clustering algorithm and its implementation

    NASA Astrophysics Data System (ADS)

    Ban, Zhihua; Liu, Jianguo; Yuan, Lulu; Yang, Hua

    2015-12-01

    This paper presents an improved density-based clustering algorithm based on the paper "Clustering by fast search and find of density peaks". A distance threshold is introduced for the purpose of economizing memory. In order to reduce the probability that two points share the same density value, similarity is utilized to define the proximity measure. We have tested the modified algorithm on a large data set, several small data sets and shape data sets. It turns out that the proposed algorithm can obtain acceptable results and can be applied more widely.
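
    The underlying density-peaks quantities can be sketched directly: local density ρ counts neighbors within a cutoff dc, and δ is the distance to the nearest point of higher density (the paper's distance threshold and similarity-based proximity refinements are omitted; cluster centers are the points with both ρ and δ large).

```python
def density_peaks(points, dc):
    """Compute rho (neighbors within cutoff dc) and delta (distance to
    the nearest point of higher density; for the densest points, the
    distance to the farthest point) as in the density-peaks method."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    n = len(points)
    rho = [sum(1 for j in range(n) if j != i and dist(points[i], points[j]) < dc)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist(points[i], points[j]) for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher
                     else max(dist(points[i], points[j])
                              for j in range(n) if j != i))
    return rho, delta
```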

  5. Measuring Constraint-Set Utility for Partitional Clustering Algorithms

    NASA Technical Reports Server (NTRS)

    Davidson, Ian; Wagstaff, Kiri L.; Basu, Sugato

    2006-01-01

    Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.

  6. A Distributed Agent Implementation of Multiple Species Flocking Model for Document Partitioning Clustering

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E

    2006-01-01

    The Flocking model, first proposed by Craig Reynolds, is one of the first bio-inspired computational collective behavior models and has many popular applications, such as animation. Our early research resulted in a flock clustering algorithm that can achieve better performance than the K-means or the Ant clustering algorithms for data clustering. This algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for efficient clustering result retrieval and visualization. In this paper, we propose a bio-inspired clustering model, the Multiple Species Flocking clustering model (MSF), and present a distributed multi-agent MSF approach for document clustering.

  7. Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.

    PubMed

    He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej

    2011-12-01

    Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level-3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: the α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression.
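
    A damped multiplicative update for symmetric NMF under the Euclidean loss can be sketched as below; this follows the general form of such updates and is only a simplified stand-in for the paper's α-SNMF/β-SNMF algorithms (the damping factor, epsilon guard, and iteration count are choices made here, not taken from the paper).

```python
import numpy as np

def snmf(A, rank, iters=300, beta=0.5, eps=1e-9, seed=0):
    """Damped multiplicative update for symmetric NMF, A ~ H H^T,
    under the Euclidean loss:
        H <- H * (1 - beta + beta * (A H) / (H H^T H))
    The update keeps H nonnegative because every factor is nonnegative."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], rank)) + 0.1   # positive initialization
    for _ in range(iters):
        H = H * (1 - beta + beta * (A @ H) / (H @ (H.T @ H) + eps))
    return H
```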

  8. Sampling Within k-Means Algorithm to Cluster Large Datasets

    SciTech Connect

    Bejarano, Jeremy; Bose, Koushiki; Brannan, Tyler; Thomas, Anita; Adragni, Kofi; Neerchal, Nagaraj; Ostrouchov, George

    2011-08-01

    Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the amount of data analyzed. We perform a simulation study to compare our sampling-based k-means to the standard k-means algorithm by analyzing both the speed and accuracy of the two methods. Results show that our algorithm is significantly more efficient than the existing algorithm with comparable accuracy. Further work on this project might include a more comprehensive study on both more varied test datasets and real weather datasets; this is especially important considering that this preliminary study was performed on rather tame datasets. Future studies should also analyze the performance of the algorithm for varied values of k. Lastly, this paper showed that the algorithm was accurate for relatively low sample sizes; we would like to analyze this further to see how accurate the algorithm is for even lower sample sizes. We could find the lowest sample sizes, by manipulating width and confidence level, for which the algorithm would be acceptably accurate. In order for our algorithm to be a success, it needs to meet two benchmarks: match the accuracy of the standard k-means algorithm and significantly reduce runtime. Both goals are accomplished for all six datasets analyzed. However, on datasets of three and four dimensions, as the data become more difficult to cluster, both algorithms fail to obtain the correct classifications on some trials. Nevertheless, our algorithm consistently matches the performance of the standard algorithm while becoming remarkably more efficient with time. Therefore, we conclude that analysts can use our algorithm, expecting accurate results in considerably less time.
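
    The core idea, fitting centers on a subsample and then labeling the full dataset, can be sketched as follows; a deterministic systematic sample is used here for reproducibility, whereas the paper draws random samples.

```python
def kmeans_fit(points, k, init, iters=10):
    """Plain Lloyd iterations from the given initial centers."""
    centers = [tuple(c) for c in init]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
            groups[j].append(p)
        centers = [tuple(sum(v) / len(g) for v in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers

def sample_kmeans(points, k, step=2):
    """Fit centers on a subsample, then label the full dataset.
    A systematic every-`step` sample stands in for the paper's
    random sampling, so only the sample is iterated over."""
    sample = points[::step]
    centers = kmeans_fit(sample, k, init=sample[:k])
    labels = [min(range(k),
                  key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))
              for p in points]
    return centers, labels
```

    Only the (small) sample pays the cost of the Lloyd iterations; the full dataset is touched once, in the final labeling pass.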

  9. A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering

    ERIC Educational Resources Information Center

    Chahine, Firas Safwan

    2012-01-01

    Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithm, but has a major…

  10. CACONET: Ant Colony Optimization (ACO) Based Clustering Algorithm for VANET.

    PubMed

    Aadil, Farhan; Bajwa, Khalid Bashir; Khan, Salabat; Chaudary, Nadeem Majeed; Akram, Adeel

    2016-01-01

    A vehicular ad hoc network (VANET) is a wirelessly connected network of vehicular nodes. A number of techniques, such as message ferrying, data aggregation, and vehicular node clustering, aim to improve communication efficiency in VANETs. Cluster heads (CHs), selected in the process of clustering, manage inter-cluster and intra-cluster communication. The lifetime of clusters and the number of CHs determine the efficiency of the network. In this paper, a clustering algorithm based on Ant Colony Optimization (ACO) for VANETs (CACONET) is proposed. CACONET forms optimized clusters for robust communication. CACONET is compared empirically with state-of-the-art baseline techniques like Multi-Objective Particle Swarm Optimization (MOPSO) and Comprehensive Learning Particle Swarm Optimization (CLPSO). Experiments varying the grid size of the network, the transmission range of nodes, and the number of nodes in the network were performed to evaluate the comparative effectiveness of these algorithms. For optimized clustering, the parameters considered are the transmission range, direction and speed of the nodes. The results indicate that CACONET significantly outperforms MOPSO and CLPSO. PMID:27149517

  11. CACONET: Ant Colony Optimization (ACO) Based Clustering Algorithm for VANET

    PubMed Central

    Bajwa, Khalid Bashir; Khan, Salabat; Chaudary, Nadeem Majeed; Akram, Adeel

    2016-01-01

    A vehicular ad hoc network (VANET) is a wirelessly connected network of vehicular nodes. A number of techniques, such as message ferrying, data aggregation, and vehicular node clustering, aim to improve communication efficiency in VANETs. Cluster heads (CHs), selected in the process of clustering, manage inter-cluster and intra-cluster communication. The lifetime of clusters and the number of CHs determine the efficiency of the network. In this paper, a clustering algorithm based on Ant Colony Optimization (ACO) for VANETs (CACONET) is proposed. CACONET forms optimized clusters for robust communication. CACONET is compared empirically with state-of-the-art baseline techniques like Multi-Objective Particle Swarm Optimization (MOPSO) and Comprehensive Learning Particle Swarm Optimization (CLPSO). Experiments varying the grid size of the network, the transmission range of nodes, and the number of nodes in the network were performed to evaluate the comparative effectiveness of these algorithms. For optimized clustering, the parameters considered are the transmission range, direction and speed of the nodes. The results indicate that CACONET significantly outperforms MOPSO and CLPSO. PMID:27149517

  12. Effective FCM noise clustering algorithms in medical images.

    PubMed

    Kannan, S R; Devi, R; Ramathilagam, S; Takezawa, K

    2013-02-01

    The main motivation of this paper is to introduce a class of robust non-Euclidean distance measures for the original data space, deriving new objective functions and thus clustering the non-Euclidean structures in data to enhance the robustness of the original clustering algorithms and to reduce noise and outliers. The new objective functions of the proposed algorithms are realized by incorporating the noise clustering concept into the entropy-based fuzzy C-means algorithm, with a suitable noise distance employed to capture information about noisy data in the clustering process. This paper presents initial cluster prototypes using a prototype initialization method, so that the final result is obtained with fewer iterations. To evaluate the performance of the proposed methods in reducing the noise level, experimental work has been carried out with a synthetic image corrupted by Gaussian noise. The superiority of the proposed methods has been examined through an experimental study on medical images. The experimental results show that the proposed algorithms perform significantly better than the standard existing algorithms. The accurate classification percentage of the proposed fuzzy C-means segmentation method is obtained using the silhouette validity index.
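
    For reference, the standard fuzzy C-means membership update that these noise-clustering variants extend is u_ik = 1 / Σ_j (d_ik/d_jk)^(2/(m−1)); a minimal sketch (without the paper's noise cluster, entropy term, or non-Euclidean distances):

```python
def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy C-means membership update: for each point, the
    membership in cluster i is 1 / sum_j (d_i / d_j)^(2/(m-1)), where
    d_i is the Euclidean distance to center i (clamped to avoid
    division by zero when a point coincides with a center)."""
    u = []
    for p in points:
        d = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
             for c in centers]
        u.append([1.0 / sum((d[i] / d[j]) ** (2 / (m - 1))
                            for j in range(len(centers)))
                  for i in range(len(centers))])
    return u
```

    Each membership row sums to one; a point equidistant from two centers gets membership 0.5 in each.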

  13. A survey of fuzzy clustering algorithms for pattern recognition. II.

    PubMed

    Baraldi, A; Blonda, P

    1999-01-01

    For pt. I see ibid., p. 775-85. In part I an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. In this paper, five clustering algorithms taken from the literature are reviewed, assessed and compared on the basis of the selected properties of interest. These clustering models are (1) self-organizing map (SOM); (2) fuzzy learning vector quantization (FLVQ); (3) fuzzy adaptive resonance theory (fuzzy ART); (4) growing neural gas (GNG); and (5) fully self-organizing simplified adaptive resonance theory (FOSART). Although our theoretical comparison is fairly simple, it yields observations that may appear paradoxical. First, only FLVQ, fuzzy ART, and FOSART exploit concepts derived from fuzzy set theory (e.g., relative and/or absolute fuzzy membership functions). Secondly, only SOM, FLVQ, GNG, and FOSART employ soft competitive learning mechanisms, which are affected by asymptotic misbehaviors in the case of FLVQ; i.e., only SOM, GNG, and FOSART are considered effective fuzzy clustering algorithms. PMID:18252358

  14. A Novel Complex Networks Clustering Algorithm Based on the Core Influence of Nodes

    PubMed Central

    Dai, Bin; Xie, Zhongyu

    2014-01-01

    In complex networks, cluster structure, identified by the heterogeneity of nodes, has become a common and important topological property. Network clustering methods are thus significant for the study of complex networks. Currently, many typical clustering algorithms have weaknesses such as inaccuracy and slow convergence. In this paper, we propose a clustering algorithm based on calculating the core influence of nodes. The clustering process is a simulation of the process of cluster formation in sociology. The algorithm detects the nodes with core influence through their betweenness centrality, and builds the cluster's core structure by discriminant functions. Next, the algorithm obtains the final cluster structure after clustering the rest of the nodes in the network by an optimization method. Experiments on different datasets show that the clustering accuracy of this algorithm is superior to that of the classical clustering algorithm (the Fast-Newman algorithm). It clusters faster and plays a positive role in revealing the real cluster structure of complex networks precisely. PMID:24741359

  15. A Community Detection Algorithm Based on Topology Potential and Spectral Clustering

    PubMed Central

    Wang, Zhixiao; Chen, Zhaotong; Zhao, Ya; Chen, Shaoda

    2014-01-01

    Community detection is of great value for complex networks in understanding their inherent law and predicting their behavior. Spectral clustering algorithms have been successfully applied in community detection. This kind of method has two inadequacies: one is that the input matrices they use cannot provide sufficient structural information for community detection, and the other is that they cannot necessarily derive the proper community number from the ladder distribution of eigenvector elements. In order to solve these problems, this paper puts forward a novel community detection algorithm based on topology potential and spectral clustering. The new algorithm constructs the normalized Laplacian matrix with nodes' topology potential, which contains rich structural information about the network. In addition, the new algorithm can automatically obtain the optimal community number from the local maximum potential nodes. Experimental results showed that the new algorithm gives excellent performance on artificial networks and real-world networks and outperforms other community detection methods. PMID:25147846
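
    The spectral clustering backbone these methods build on can be sketched for the two-community case: form the normalized Laplacian, take the eigenvector of the second-smallest eigenvalue, and split nodes by its sign. This is a generic sketch; the paper replaces the plain adjacency-based Laplacian with one built from topology potential.

```python
import numpy as np

def spectral_bipartition(adj):
    """Two-way spectral clustering: split nodes by the sign of the
    Fiedler vector (eigenvector of the second-smallest eigenvalue)
    of the normalized Laplacian L = I - D^{-1/2} A D^{-1/2}.
    Assumes a connected graph with no isolated nodes."""
    A = np.asarray(adj, dtype=float)
    d = A.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - d_inv_sqrt @ A @ d_inv_sqrt
    vals, vecs = np.linalg.eigh(L)     # eigenvalues in ascending order
    fiedler = vecs[:, 1]
    return (fiedler > 0).astype(int)
```

    On a graph of two triangles joined by a single bridge edge, the sign split recovers the two triangles.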

  16. Cluster size distribution in Gaussian glasses

    NASA Astrophysics Data System (ADS)

    Novikov, S. V.

    2011-03-01

    A simple method for estimating the asymptotics of cluster numbers in Gaussian glasses is described. The validity of the method was tested by comparison with the exact analytic result for the non-correlated field and with simulation data for the distribution of random energies in the strongly spatially correlated dipolar glass model.

  17. Functional clustering algorithm for the analysis of dynamic network data

    NASA Astrophysics Data System (ADS)

    Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.

    2009-05-01

    We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.

  18. FRCA: A Fuzzy Relevance-Based Cluster Head Selection Algorithm for Wireless Mobile Ad-Hoc Sensor Networks

    PubMed Central

    Lee, Chongdeuk; Jeong, Taegwon

    2011-01-01

    Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving the current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the dynamic node distribution caused by mobility, flat structures, and disturbance of cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In a simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results show that the proposed FRCA achieves better performance than the other existing mechanisms. PMID:22163905

  19. FRCA: a fuzzy relevance-based cluster head selection algorithm for wireless mobile ad-hoc sensor networks.

    PubMed

    Lee, Chongdeuk; Jeong, Taegwon

    2011-01-01

    Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving the current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the dynamic node distribution caused by mobility, flat structures, and disturbance of cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In a simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results show that the proposed FRCA achieves better performance than the other existing mechanisms.
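    The head-election idea can be sketched with simple fuzzy memberships; the factors (residual energy, speed, degree), membership shapes and weights below are assumptions for illustration, since the abstract does not specify FRCA's actual fuzzy rules:

```python
def ramp(x, lo, hi):
    """Simple fuzzy membership: 0 below lo, 1 above hi, linear between."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def relevance(node):
    """Fuzzy relevance of a node as a cluster-head candidate. Assumed
    factors: high residual energy and degree help, high speed hurts."""
    e = ramp(node["energy"], 0.2, 1.0)
    s = 1.0 - ramp(node["speed"], 0.0, 5.0)
    d = ramp(node["degree"], 1, 8)
    return 0.5 * e + 0.3 * s + 0.2 * d

def elect_head(neighbourhood):
    """The node with the highest relevance in its neighbourhood becomes head."""
    return max(neighbourhood, key=relevance)

nodes = [
    {"id": "a", "energy": 0.9, "speed": 1.0, "degree": 6},
    {"id": "b", "energy": 0.4, "speed": 4.0, "degree": 3},
]
head = elect_head(nodes)
```

    The slow, well-connected, energy-rich node wins the election here; re-running the election only when relevance changes substantially is what keeps clustering overhead low.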

  20. Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale

    PubMed Central

    Kobourov, Stephen; Gallant, Mike; Börner, Katy

    2016-01-01

    Overview Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms—Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large

  1. Characterising superclusters with the galaxy cluster distribution

    NASA Astrophysics Data System (ADS)

    Chon, Gayoung; Böhringer, Hans; Collins, Chris A.; Krause, Martin

    2014-07-01

    Superclusters are the largest observed matter density structures in the Universe. Recently, we presented the first supercluster catalogue constructed with a well-defined selection function, based on the X-ray flux-limited cluster survey REFLEX II. To construct the sample we proposed a concept for finding large objects with a minimum overdensity such that most of their mass can be expected to collapse in the future. The main goal here is to support this concept with simulations, showing that we can, on the basis of our observational sample of X-ray clusters, construct a supercluster sample defined by a certain minimum overdensity. With this sample we also test how well superclusters trace the underlying dark matter distribution. Our results confirm that an overdensity in the number of clusters is tightly correlated with an overdensity of the dark matter distribution. This enables us to define superclusters within which most of the mass will collapse in the future. We also obtain first-order mass estimates of superclusters on the basis of the properties of the member clusters, and we show that in this context the ratio of the cluster number density to the dark matter mass density is consistent with the theoretically expected cluster bias. Our previous work provided evidence that superclusters are a special environment in which the density structures of the dark matter grow differently from those in the field, as characterised by the X-ray luminosity function. Here we confirm for the first time, at high statistical significance established by a Kolmogorov-Smirnov test, that this originates from a top-heavy mass function. We also find, in close agreement with observations, that superclusters occupy only a small volume of a few per cent but contain more than half of the clusters in the present-day Universe.

  2. Modelling the Distribution of Globular Cluster Masses

    NASA Astrophysics Data System (ADS)

    McLaughlin, Dean E.; Pudritz, Ralph E.

    1994-12-01

    On the basis of various observational evidence, we argue that the overall present-day distribution of mass in globular cluster systems around galaxies as diverse as M87 and the Milky Way may be in large part reflective of robust formation processes, and little influenced by subsequent dynamical evolution of the globulars. With this in mind, Harris & Pudritz (1994, ApJ, 429, 177) have recently suggested that globular clusters with a range of masses are formed in pregalactic ``supergiant molecular clouds'' which grow by (coalescent) binary collisions with other clouds. We develop this idea more fully by solving for the steady-state mass distributions resulting from such coalescent encounters, with provisions made for the disruption of high-mass clouds due to star formation. Agglomeration models have been proposed in various guises to explain the mass spectra of planetesimals, stars, giant molecular clouds and their cores, and galaxies. The present theory generalizes aspects of these models, and appears able to account for the distribution of globular cluster masses at least above the so-called ``turnover'' of the globular cluster luminosity function.

  3. A Task-parallel Clustering Algorithm for Structured AMR

    SciTech Connect

    Gunney, B N; Wissink, A M

    2004-11-02

    A new parallel algorithm, based on the Berger-Rigoutsos algorithm for clustering grid points into logically rectangular regions, is presented. The clustering operation is frequently performed in the dynamic gridding steps of structured adaptive mesh refinement (SAMR) calculations. A previous study revealed that although the cost of clustering is generally insignificant for smaller problems run on relatively few processors, the algorithm scaled inefficiently in parallel and its cost grows with problem size. Hence, it can become significant for large scale problems run on very large parallel machines, such as the new BlueGene system (which has O(10^4) processors). We propose a new task-parallel algorithm designed to reduce communication wait times. Performance was assessed using dynamic SAMR re-gridding operations on up to 16K processors of currently available computers at Lawrence Livermore National Laboratory. The new algorithm was shown to be up to an order of magnitude faster than the baseline algorithm and had better scaling trends.
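    The clustering operation itself can be sketched serially: the Berger-Rigoutsos idea covers flagged cells with rectangles, cutting at holes in the row/column signatures until each box is efficient. This is a simplified single-process sketch (midpoint fallback cut, no inflection-point search), not the paper's task-parallel formulation:

```python
def cluster(flags, eff=0.7):
    """Recursively cover flagged cells (a set of (i, j)) with boxes whose
    flagged-cell fraction is at least `eff`."""
    if not flags:
        return []
    i0, i1 = min(i for i, _ in flags), max(i for i, _ in flags)
    j0, j1 = min(j for _, j in flags), max(j for _, j in flags)
    area = (i1 - i0 + 1) * (j1 - j0 + 1)
    if len(flags) / area >= eff:
        return [((i0, j0), (i1, j1))]
    # Signatures: number of flagged cells in each row / column of the box.
    sig_i = [sum(1 for i, _ in flags if i == k) for k in range(i0, i1 + 1)]
    sig_j = [sum(1 for _, j in flags if j == k) for k in range(j0, j1 + 1)]
    def find_hole(sig, lo):
        """Prefer a cut at an interior zero ('hole') in the signature."""
        for k, c in enumerate(sig[1:-1], 1):
            if c == 0:
                return lo + k
        return None
    cut = find_hole(sig_i, i0)
    if cut is not None:
        left = {c for c in flags if c[0] < cut}
    else:
        cut = find_hole(sig_j, j0)
        if cut is not None:
            left = {c for c in flags if c[1] < cut}
        elif i1 - i0 >= j1 - j0:          # no hole: bisect longer dimension
            left = {c for c in flags if c[0] <= (i0 + i1) // 2}
        else:
            left = {c for c in flags if c[1] <= (j0 + j1) // 2}
    right = flags - left
    return cluster(left, eff) + cluster(right, eff)

# Two separated 2x2 blobs of flagged cells -> two tight boxes.
flags = {(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)}
boxes = cluster(flags)
```

    The single bounding box of this example is only 16% flagged, so the hole in the row signature triggers a cut and each blob gets its own fully efficient box.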

  4. The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey

    SciTech Connect

    Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer, Hans; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang

    2005-03-01

    We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers ~2600 square degrees of sky and ranges in redshift from z = 0.02 to z = 0.17. The mean cluster membership is 36 galaxies (with redshifts) brighter than r = 17.7, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity (L_r), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the ΛCDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is ≈90% complete and 95% pure above M_200 = 1 x 10^14 h^-1 M_sun and within 0.03 ≤ z ≤ 0.12. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within 0.03 ≤ z ≤ 0.12. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the L_r of a cluster is a more robust estimator of the halo mass (M_200) than the galaxy line-of-sight velocity dispersion or the richness of the cluster. However, if we

  5. Adaptive clustering algorithm for community detection in complex networks.

    PubMed

    Ye, Zhenqing; Hu, Songnian; Yu, Jun

    2008-10-01

    Community structure is common in various real-world networks; methods or algorithms for detecting such communities in complex networks have attracted great attention in recent years. We introduce a different adaptive clustering algorithm capable of extracting modules from complex networks with considerable accuracy and robustness. In this approach, each node in a network acts as an autonomous agent demonstrating flocking behavior where vertices always travel toward their preferable neighboring groups. An optimal modular structure can emerge from a collection of these active nodes during a self-organization process where vertices constantly regroup. In addition, we show that our algorithm appears advantageous over other competing methods (e.g., the Newman-fast algorithm) through intensive evaluation. The applications in three real-world networks demonstrate the superiority of our algorithm in finding communities that parallel the actual organization in reality. PMID:18999501

  6. Clustering algorithm evaluation and the development of a replacement for procedure 1. [for crop inventories

    NASA Technical Reports Server (NTRS)

    Lennington, R. K.; Johnson, J. K.

    1979-01-01

    An efficient procedure which clusters data using a completely unsupervised clustering algorithm and then uses labeled pixels to label the resulting clusters or perform a stratified estimate using the clusters as strata is developed. Three clustering algorithms, CLASSY, AMOEBA, and ISOCLS, are compared for efficiency. Three stratified estimation schemes and three labeling schemes are also considered and compared.
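    The cluster-then-label scheme can be sketched with a plain k-means stand-in for the unsupervised step (CLASSY, AMOEBA and ISOCLS are not reproduced); the deterministic seeding and the crop labels below are illustrative:

```python
from collections import Counter

def kmeans(points, k, iters=20):
    """Plain 2-D k-means; seeding with the first k points keeps it reproducible."""
    centers = [points[i] for i in range(k)]
    def nearest(p):
        return min(range(k),
                   key=lambda j: (p[0] - centers[j][0]) ** 2 + (p[1] - centers[j][1]) ** 2)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p)].append(p)
        centers = [(sum(x for x, _ in g) / len(g), sum(y for _, y in g) / len(g))
                   if g else centers[j] for j, g in enumerate(groups)]
    return {p: nearest(p) for p in points}   # pixel -> cluster id

def label_clusters(assign, labeled):
    """Label each cluster by majority vote of the few labeled pixels it contains."""
    votes = {}
    for p, lab in labeled.items():
        votes.setdefault(assign[p], Counter())[lab] += 1
    return {c: v.most_common(1)[0][0] for c, v in votes.items()}

# Unsupervised clustering of all pixels, then labeling with two labeled pixels.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
assign = kmeans(points, 2)
names = label_clusters(assign, {(0, 0): "wheat", (10, 10): "corn"})
```

    Every unlabeled pixel inherits its cluster's voted label; alternatively, the clusters can serve directly as strata for a stratified estimate, as the abstract describes.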

  7. Metadata distribution algorithm based on directory hash in mass storage system

    NASA Astrophysics Data System (ADS)

    Wu, Wei; Luo, Dong-jian; Pei, Can-hao

    2008-12-01

    The distribution of metadata is very important in mass storage systems. Many storage systems use subtree partitioning or hash algorithms to distribute metadata among a metadata server (MDS) cluster. Although these algorithms improve system access performance, most of them have notable scalability problems. This paper proposes a new directory hash (DH) algorithm. It treats the directory as the hash key value, implements concentrated storage of metadata, and adopts a dynamic load-balancing strategy. By hashing on directories and placing metadata together at directory granularity, it improves the efficiency of metadata distribution and access in mass storage systems. The DH algorithm solves the scalability problems of file hash algorithms, such as changing a directory name or permission and adding or removing an MDS from the cluster. It reduces the number of additional requests and the scale of each data migration in scaling operations, remarkably enhancing the scalability of mass storage systems.
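    The placement idea can be sketched in a few lines: hash the parent directory rather than the full file path, so all metadata of a directory lands on one MDS and renaming a file never migrates it. The server names and the MD5 choice are illustrative; DH's dynamic load balancing and migration logic are not shown:

```python
import hashlib

def mds_for(path, servers):
    """Route a file's metadata by hashing its parent directory, so every
    entry of a directory lands on the same metadata server (MDS)."""
    directory = path.rsplit("/", 1)[0] or "/"
    digest = hashlib.md5(directory.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

servers = ["mds0", "mds1", "mds2"]        # illustrative server names
a = mds_for("/home/alice/a.txt", servers)
b = mds_for("/home/alice/b.txt", servers)
```

    Both files share a parent directory, so they map to the same MDS; a whole-path hash would instead scatter them and migrate metadata on every rename.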

  8. A Likelihood-Based SLIC Superpixel Algorithm for SAR Images Using Generalized Gamma Distribution.

    PubMed

    Zou, Huanxin; Qin, Xianxiang; Zhou, Shilin; Ji, Kefeng

    2016-01-01

    The simple linear iterative clustering (SLIC) method is a recently proposed popular superpixel algorithm. However, this method may generate bad superpixels for synthetic aperture radar (SAR) images due to effects of speckle and the large dynamic range of pixel intensity. In this paper, an improved SLIC algorithm for SAR images is proposed. This algorithm exploits the likelihood information of SAR image pixel clusters. Specifically, a local clustering scheme combining intensity similarity with spatial proximity is proposed. Additionally, for post-processing, a local edge-evolving scheme that combines spatial context and likelihood information is introduced as an alternative to the connected components algorithm. To estimate the likelihood information of SAR image clusters, we incorporated a generalized gamma distribution (GΓD). Finally, the superiority of the proposed algorithm was validated using both simulated and real-world SAR images. PMID:27438840

  9. A Likelihood-Based SLIC Superpixel Algorithm for SAR Images Using Generalized Gamma Distribution

    PubMed Central

    Zou, Huanxin; Qin, Xianxiang; Zhou, Shilin; Ji, Kefeng

    2016-01-01

    The simple linear iterative clustering (SLIC) method is a recently proposed popular superpixel algorithm. However, this method may generate bad superpixels for synthetic aperture radar (SAR) images due to effects of speckle and the large dynamic range of pixel intensity. In this paper, an improved SLIC algorithm for SAR images is proposed. This algorithm exploits the likelihood information of SAR image pixel clusters. Specifically, a local clustering scheme combining intensity similarity with spatial proximity is proposed. Additionally, for post-processing, a local edge-evolving scheme that combines spatial context and likelihood information is introduced as an alternative to the connected components algorithm. To estimate the likelihood information of SAR image clusters, we incorporated a generalized gamma distribution (GΓD). Finally, the superiority of the proposed algorithm was validated using both simulated and real-world SAR images. PMID:27438840

  10. A Likelihood-Based SLIC Superpixel Algorithm for SAR Images Using Generalized Gamma Distribution.

    PubMed

    Zou, Huanxin; Qin, Xianxiang; Zhou, Shilin; Ji, Kefeng

    2016-01-01

    The simple linear iterative clustering (SLIC) method is a recently proposed popular superpixel algorithm. However, this method may generate bad superpixels for synthetic aperture radar (SAR) images due to effects of speckle and the large dynamic range of pixel intensity. In this paper, an improved SLIC algorithm for SAR images is proposed. This algorithm exploits the likelihood information of SAR image pixel clusters. Specifically, a local clustering scheme combining intensity similarity with spatial proximity is proposed. Additionally, for post-processing, a local edge-evolving scheme that combines spatial context and likelihood information is introduced as an alternative to the connected components algorithm. To estimate the likelihood information of SAR image clusters, we incorporated a generalized gamma distribution (GΓD). Finally, the superiority of the proposed algorithm was validated using both simulated and real-world SAR images.
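    The local clustering scheme combines intensity similarity with spatial proximity, as in standard SLIC; a sketch of that distance is below. The paper's gamma-likelihood term for SAR speckle is not reproduced, and the parameter values are illustrative:

```python
import math

def slic_distance(pixel, center, S, m=10.0):
    """SLIC-style distance combining intensity similarity with spatial
    proximity. pixel/center are (x, y, intensity) tuples; S is the
    superpixel grid interval and m weights the spatial term."""
    dc = abs(pixel[2] - center[2])                               # intensity term
    ds = math.hypot(pixel[0] - center[0], pixel[1] - center[1])  # spatial term
    return math.sqrt(dc * dc + (ds / S) ** 2 * (m * m))

# A pixel is claimed by the closer of two candidate superpixel centers.
p = (5, 5, 100.0)
d_near = slic_distance(p, (4, 5, 105.0), S=10)
d_far = slic_distance(p, (20, 20, 100.0), S=10)
```

    The improved algorithm of the paper would replace the `dc` term with a likelihood under the fitted GΓD model, so that speckle fluctuations are scored statistically rather than as raw intensity differences.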

  11. Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering

    PubMed Central

    Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang

    2012-01-01

    Background Gravitation field algorithm (GFA) is a new optimization algorithm based on an imitation of natural phenomena. GFA performs well in searching for both the global minimum and multiple minima in computational biology, but it needs to be improved for greater efficiency and modified to apply to discrete data problems in systems biology. Method An improved GFA, called IGFA, is proposed in this paper. Two parts of IGFA were improved. The first is the rule of random division, a reasonable strategy that shortens running time. The other is the rotation factor, which improves the accuracy of IGFA. To apply IGFA to hierarchical clustering, the initialization and the movement operator were also modified. Results Two kinds of experiments were used to test IGFA, which was then applied to hierarchical clustering. The global-minimum experiment compared IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing); the multi-minima experiment compared IGFA and GFA. The results of the two experiments were compared and prove the efficiency of IGFA, which is better than GFA in both accuracy and running time. For hierarchical clustering, IGFA is used to optimize the smallest distance of gene pairs, and the results were compared with GA, SA, single-linkage clustering and UPGMA, proving the efficiency of IGFA. PMID:23173043

  12. Irreversible Growth Algorithm for Branched Polymers (Lattice Animals), and Their Relation to Colloidal Cluster-Cluster Aggregates

    NASA Astrophysics Data System (ADS)

    Ball, R. C.; Lee, J. R.

    1996-03-01

    We prove that a new, irreversible growth algorithm, Non-Deletion Reaction-Limited Cluster-cluster Aggregation (NDRLCA), produces equilibrium Branched Polymers, expected to exhibit Lattice Animal statistics [1]. We implement NDRLCA, off-lattice, as a computer simulation for embedding dimensions d=2 and 3, obtaining values for the critical exponents, fractal dimension D and cluster mass distribution exponent tau: d=2, D ≈ 1.53 ± 0.05, tau = 1.09 ± 0.06; d=3, D = 1.96 ± 0.04, tau = 1.50 ± 0.04, in good agreement with theoretical LA values. The simulation results do not support recent suggestions [2] that BPs may be in the same universality class as percolation. We also obtain values for a model-dependent critical “fugacity”, z_c, and investigate the finite-size effects of our simulation, quantifying notions of “inbreeding” that occur in this algorithm. Finally, we use an extension of the NDRLCA proof to show that standard Reaction-Limited Cluster-cluster Aggregation is very unlikely to be in the same universality class as Branched Polymers/Lattice Animals unless the backbone dimension of the latter is considerably less than the published value.

  13. A modified Fuzzy C-Means (FCM) Clustering algorithm and its application on carbonate fluid identification

    NASA Astrophysics Data System (ADS)

    Liu, Lifeng; Sun, Sam Zandong; Yu, Hongyu; Yue, Xingtong; Zhang, Dong

    2016-06-01

    Considering that the fluid distribution in carbonate reservoirs is very complicated and that existing fluid prediction methods do not produce ideal results, this paper proposes a new fluid identification method for carbonate reservoirs based on a modified Fuzzy C-Means (FCM) clustering algorithm. Both the initialization and the globally optimal cluster centers are produced by the Chaotic Quantum Particle Swarm Optimization (CQPSO) algorithm, which effectively avoids the traditional FCM algorithm's sensitivity to initial values and its tendency to fall into local convergence. The modified algorithm is then applied to fluid identification in the carbonate X area in the Tarim Basin of China, and a mapping relation between fluid properties and pre-stack elastic parameters is built in multi-dimensional space. The modified algorithm is shown to have good fuzzy clustering ability, and its total coincidence rate of fluid prediction reaches 97.10%. In addition, the memberships of different fluids can be accumulated to obtain their respective probabilities, which can be used to evaluate the uncertainty of the fluid identification result.
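    The underlying FCM updates are standard; a minimal 1-D sketch is below, with a fixed grid initialisation standing in for the paper's CQPSO seeding:

```python
def fcm(points, k, m=2.0, iters=50):
    """Plain Fuzzy C-Means on 1-D data. Centers start on a fixed grid here;
    the paper instead seeds them with CQPSO to avoid bad local optima."""
    lo, hi = min(points), max(points)
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    u = []
    for _ in range(iters):
        # Membership update: u_ij = 1 / sum_l (d_ij / d_lj)^(2/(m-1))
        u = []
        for x in points:
            d = [abs(x - c) + 1e-12 for c in centers]
            u.append([1.0 / sum((d[j] / d[l]) ** (2.0 / (m - 1.0)) for l in range(k))
                      for j in range(k)])
        # Center update: c_j = sum_i u_ij^m x_i / sum_i u_ij^m
        centers = [
            sum(u[i][j] ** m * x for i, x in enumerate(points))
            / sum(u[i][j] ** m for i in range(len(points)))
            for j in range(k)
        ]
    return centers, u

centers, u = fcm([1.0, 1.1, 0.9, 5.0, 5.1, 4.9], k=2)
```

    The fuzzy memberships `u` sum to one per point; accumulating them per fluid class is what yields the probability-style uncertainty measure the abstract mentions.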

  14. A new clustering algorithm for scanning electron microscope images

    NASA Astrophysics Data System (ADS)

    Yousef, Amr; Duraisamy, Prakash; Karim, Mohammad

    2016-04-01

    A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may suffer from insufficient brightness, anomalous contrast, jagged edges, and poor quality due to low signal-to-noise ratio, grained topography and poor surface details. Segmentation of SEM images is a challenging problem in the presence of these distortions. In this paper, we focus on the clustering of this type of image. We evaluate the performance of well-known unsupervised clustering and classification techniques such as connectivity-based (hierarchical) clustering, centroid-based clustering, distribution-based clustering and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of image and compare its results against these standard techniques in terms of clustering validation metrics.

  15. ABCluster: the artificial bee colony algorithm for cluster global optimization.

    PubMed

    Zhang, Jun; Dolg, Michael

    2015-10-01

    Global optimization of cluster geometries is of fundamental importance in chemistry and an interesting problem in applied mathematics. In this work, we introduce a relatively new swarm intelligence algorithm, i.e. the artificial bee colony (ABC) algorithm proposed in 2005, to this field. It is inspired by the foraging behavior of a bee colony, and only three parameters are needed to control it. We applied it to several potential functions of quite different nature, i.e., the Coulomb-Born-Mayer, Lennard-Jones, Morse, Z and Gupta potentials. The benchmarks reveal that for long-ranged potentials the ABC algorithm is very efficient in locating the global minimum, while for short-ranged ones it is sometimes trapped into a local minimum funnel on a potential energy surface of large clusters. We have released an efficient, user-friendly, and free program "ABCluster" to realize the ABC algorithm. It is a black-box program for non-experts as well as experts and might become a useful tool for chemists to study clusters. PMID:26327507
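    The three-parameter structure of ABC (colony size, abandonment limit, cycle count) can be sketched on a toy 1-D objective; real cluster optimization in ABCluster runs on the 3N atomic coordinates of a potential energy surface, which this sketch does not attempt:

```python
import random

def abc_minimize(f, bounds, n_food=10, limit=20, cycles=200, seed=1):
    """Minimal artificial bee colony for a 1-D objective f >= 0."""
    rng = random.Random(seed)
    lo, hi = bounds
    foods = [rng.uniform(lo, hi) for _ in range(n_food)]
    trials = [0] * n_food
    def neighbour(i):
        k = rng.randrange(n_food)
        phi = rng.uniform(-1.0, 1.0)
        return min(hi, max(lo, foods[i] + phi * (foods[i] - foods[k])))
    for _ in range(cycles):
        # Employed bees: local search around each food source.
        for i in range(n_food):
            cand = neighbour(i)
            if f(cand) < f(foods[i]):
                foods[i], trials[i] = cand, 0
            else:
                trials[i] += 1
        # Onlooker bees: revisit sources with fitness-proportional probability.
        fits = [1.0 / (1.0 + f(x)) for x in foods]
        total = sum(fits)
        for _ in range(n_food):
            r, acc, i = rng.uniform(0.0, total), 0.0, 0
            for j, ft in enumerate(fits):
                acc += ft
                if acc >= r:
                    i = j
                    break
            cand = neighbour(i)
            if f(cand) < f(foods[i]):
                foods[i], trials[i] = cand, 0
            else:
                trials[i] += 1
        # Scout bees: abandon sources that stopped improving.
        for i in range(n_food):
            if trials[i] > limit:
                foods[i], trials[i] = rng.uniform(lo, hi), 0
    return min(foods, key=f)

best = abc_minimize(lambda x: (x - 2.0) ** 2, (-10.0, 10.0))
```

    The abandonment-plus-scout mechanism is what lets ABC escape the local-minimum funnels the abstract mentions for short-ranged potentials, at the price of restarting exhausted sources at random.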

  16. Statistical physics based heuristic clustering algorithms with an application to econophysics

    NASA Astrophysics Data System (ADS)

    Baldwin, Lucia Liliana

    validating the partition. All of these techniques are competitive with existing algorithms and offer possible advantages for certain types of data distributions. A practical application is presented using the Percolation Clustering Algorithm to determine the taxonomy of the Dow Jones Industrial Average portfolio. The statistical properties of the correlation coefficients between DJIA components are studied along with the eigenvalues of the correlation matrix between the DJIA components.

  17. Large Data Visualization on Distributed Memory Multi-GPU Clusters

    SciTech Connect

    Childs, Henry R.

    2010-03-01

    Data sets of immense size are regularly generated on large scale computing resources. Even among more traditional methods for acquisition of volume data, such as MRI and CT scanners, data which is too large to be effectively visualized on standard workstations is now commonplace. One solution to this problem is to employ a 'visualization cluster,' a small to medium scale cluster dedicated to performing visualization and analysis of massive data sets generated on larger scale supercomputers. These clusters are designed to fit a different need than traditional supercomputers, and therefore their design mandates different hardware choices, such as increased memory, and more recently, graphics processing units (GPUs). While there has been much previous work on distributed memory visualization as well as GPU visualization, there is a relative dearth of algorithms which effectively use GPUs at a large scale in a distributed memory environment. In this work, we study a common visualization technique in a GPU-accelerated, distributed memory setting, and present performance characteristics when scaling to extremely large data sets.

  18. Vetoed jet clustering: the mass-jump algorithm

    NASA Astrophysics Data System (ADS)

    Stoll, Martin

    2015-04-01

    A new class of jet clustering algorithms is introduced. A criterion inspired by successful mass-drop taggers is applied that prevents the recombination of two hard prongs if their combined jet mass is substantially larger than the masses of the separate prongs. This "mass jump" veto effectively results in jets with variable radii in dense environments. Differences to existing methods are investigated. It is shown for boosted top quarks that the new algorithm has beneficial properties which can lead to improved tagging purity.

  19. Effective Memetic Algorithms for VLSI design = Genetic Algorithms + local search + multi-level clustering.

    PubMed

    Areibi, Shawki; Yang, Zhen

    2004-01-01

    Combining global and local search is a strategy used by many successful hybrid optimization approaches. Memetic Algorithms (MAs) are Evolutionary Algorithms (EAs) that apply some sort of local search to further improve the fitness of individuals in the population. Memetic Algorithms have been shown to be very effective in solving many hard combinatorial optimization problems. This paper provides a forum for identifying and exploring the key issues that affect the design and application of Memetic Algorithms. The approach combines a hierarchical design technique, Genetic Algorithms, constructive techniques and advanced local search to solve VLSI circuit layout in the form of circuit partitioning and placement. Results obtained indicate that Memetic Algorithms based on local search, clustering and good initial solutions improve solution quality on average by 35% for the VLSI circuit partitioning problem and 54% for the VLSI standard cell placement problem. PMID:15355604

  1. GEANT4 distributed computing for compact clusters

    NASA Astrophysics Data System (ADS)

    Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.

    2014-11-01

A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets, such as those covering a range of angles in computed tomography, calculating dose delivery with multiple fractions, or simply speeding the throughput of a single model.
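The client-server 'work ticket' flow can be sketched with a thread-safe queue. The function name and the `simulate` callback below are stand-ins for illustration, not the actual g4DistributedRunManager API:

```python
from queue import Queue
from threading import Thread

def run_distributed(tickets, simulate, n_workers=4):
    """Toy version of the work-ticket model: a server queues one ticket
    per run; worker clients pull tickets until a sentinel stops them."""
    q, results = Queue(), Queue()
    for t in tickets:
        q.put(t)
    for _ in range(n_workers):
        q.put(None)                      # one stop sentinel per worker

    def worker():
        while True:
            ticket = q.get()
            if ticket is None:           # sentinel: no more work
                return
            results.put((ticket, simulate(ticket)))

    threads = [Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return dict(results.queue)

# e.g. one run per computed-tomography angle (toy simulate callback):
out = run_distributed(range(0, 180, 45), simulate=lambda angle: angle * 2)
```

In the real system the workers are separate cluster nodes speaking a network protocol, and each ticket carries the full parameter set for one GEANT4 run.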

  2. BMI optimization by using parallel UNDX real-coded genetic algorithm with Beowulf cluster

    NASA Astrophysics Data System (ADS)

    Handa, Masaya; Kawanishi, Michihiro; Kanki, Hiroshi

    2007-12-01

    This paper deals with the global optimization algorithm of the Bilinear Matrix Inequalities (BMIs) based on the Unimodal Normal Distribution Crossover (UNDX) GA. First, analyzing the structure of the BMIs, the existence of the typical difficult structures is confirmed. Then, in order to improve the performance of algorithm, based on results of the problem structures analysis and consideration of BMIs characteristic properties, we proposed the algorithm using primary search direction with relaxed Linear Matrix Inequality (LMI) convex estimation. Moreover, in these algorithms, we propose two types of evaluation methods for GA individuals based on LMI calculation considering BMI characteristic properties more. In addition, in order to reduce computational time, we proposed parallelization of RCGA algorithm, Master-Worker paradigm with cluster computing technique.

  3. Mapping cultivable land from satellite imagery with clustering algorithms

    NASA Astrophysics Data System (ADS)

    Arango, R. B.; Campos, A. M.; Combarro, E. F.; Canas, E. R.; Díaz, I.

    2016-07-01

Open data satellite imagery provides valuable data for the planning and decision-making processes related to environmental domains. Specifically, agriculture uses remote sensing in a wide range of services, ranging from monitoring the health of the crops to forecasting the spread of crop diseases. In particular, this paper focuses on a methodology for the automatic delimitation of cultivable land by means of machine learning algorithms and satellite data. The method uses a partition clustering algorithm called Partitioning Around Medoids and considers the quality of the clusters obtained for each satellite band in order to evaluate which one better identifies cultivable land. The proposed method was tested with vineyards using as input the spectral and thermal bands of the Landsat 8 satellite. The experimental results show the great potential of this method for cultivable land monitoring from remote-sensed multispectral imagery.
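The Partitioning Around Medoids step can be sketched in a few lines. This is a 1-D toy with a plain greedy swap heuristic; the per-band cluster-quality evaluation from the paper is omitted:

```python
import random

def pam(points, k, dist=lambda a, b: abs(a - b), iters=20, seed=0):
    """Minimal Partitioning Around Medoids sketch: pick k medoids, then
    greedily swap a medoid with a non-medoid whenever the swap lowers
    the total within-cluster distance."""
    rng = random.Random(seed)
    medoids = rng.sample(points, k)

    def cost(meds):
        return sum(min(dist(p, m) for m in meds) for p in points)

    for _ in range(iters):
        improved = False
        for i in range(k):
            for p in points:
                if p in medoids:
                    continue
                trial = medoids[:i] + [p] + medoids[i + 1:]
                if cost(trial) < cost(medoids):      # accept improving swap
                    medoids, improved = trial, True
        if not improved:
            break                                    # local optimum reached
    labels = [min(range(k), key=lambda j: dist(p, medoids[j]))
              for p in points]
    return medoids, labels
```

For imagery, `points` would be per-pixel band values (or small feature vectors) and `dist` a suitable spectral distance.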

  4. Synchronous Firefly Algorithm for Cluster Head Selection in WSN

    PubMed Central

    Baskaran, Madhusudhanan; Sadagopan, Chitra

    2015-01-01

Wireless Sensor Network (WSN) consists of small low-cost, low-power multifunctional nodes interconnected to efficiently aggregate and transmit data to a sink. Cluster-based approaches use some nodes as Cluster Heads (CHs) and organize WSNs efficiently for aggregation of data and energy saving. A CH conveys information gathered by cluster nodes and aggregates/compresses data before transmitting it to a sink. However, this additional responsibility of the node results in a higher energy drain leading to uneven network degradation. Low Energy Adaptive Clustering Hierarchy (LEACH) offsets this by probabilistically rotating the cluster-head role among nodes with energy above a set threshold. CH selection in WSN is NP-hard, as optimal data aggregation with efficient energy savings cannot be solved in polynomial time. In this work, a modified firefly heuristic, synchronous firefly algorithm, is proposed to improve the network performance. Extensive simulation shows the proposed technique to perform well compared to LEACH and energy-efficient hierarchical clustering. Simulations show the effectiveness of the proposed method in decreasing the packet loss ratio by an average of 9.63% and improving the energy efficiency of the network when compared to LEACH and EEHC. PMID:26495431
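The LEACH-style probabilistic rotation that the abstract compares against can be sketched with the standard LEACH threshold formula. The firefly heuristic itself is not reproduced, and residual node energy is ignored here:

```python
import random

def leach_elect(node_ids, p=0.1, round_no=0, been_ch=frozenset(), seed=42):
    """Sketch of LEACH cluster-head rotation: each node that has not yet
    served as CH in the current epoch self-elects with probability
    T(n) = p / (1 - p * (r mod 1/p)), where p is the desired CH
    fraction and r the round number."""
    rng = random.Random(seed)
    threshold = p / (1 - p * (round_no % round(1 / p)))
    return [n for n in node_ids
            if n not in been_ch and rng.random() < threshold]
```

After `1/p` rounds every node has served once and the `been_ch` set is cleared, starting a new epoch; the threshold grows toward 1 within an epoch so remaining nodes are eventually chosen.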

  6. Pruning Neural Networks with Distribution Estimation Algorithms

    SciTech Connect

    Cantu-Paz, E

    2003-01-15

This paper describes the application of four evolutionary algorithms to the pruning of neural networks used in classification problems. Besides a simple genetic algorithm (GA), the paper considers three distribution estimation algorithms (DEAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine whether the DEAs present advantages over the simple GA in terms of accuracy or speed on this problem. The experiments used a feed-forward neural network trained with standard backpropagation and public-domain and artificial data sets. The pruned networks had accuracy better than or equal to the original fully-connected networks; only in a few cases did pruning result in less accurate networks. We found few differences in the accuracy of the networks pruned by the four EAs, but important differences in execution time. The results suggest that a simple GA with a small population might be the best algorithm for pruning networks on the data sets we tested.
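One of the DEAs compared, the compact GA, is small enough to sketch: it replaces the population with a probability vector, samples two candidates per step, and shifts the vector toward the winner. Here it maximizes OneMax rather than scoring pruned networks, and all parameters are illustrative:

```python
import random

def compact_ga(fitness, n_bits=16, pop_size=50, gens=2000, seed=3):
    """Compact GA sketch: p[i] is the probability that bit i is 1.
    Each step samples two individuals, compares them, and moves p
    toward the winner by 1/pop_size on every differing bit."""
    rng = random.Random(seed)
    p = [0.5] * n_bits                     # probability vector, no population

    def sample():
        return [1 if rng.random() < pi else 0 for pi in p]

    for _ in range(gens):
        a, b = sample(), sample()
        if fitness(b) > fitness(a):
            a, b = b, a                    # a is now the winner
        for i in range(n_bits):
            if a[i] != b[i]:               # shift p toward the winner
                step = 1.0 / pop_size
                p[i] += step if a[i] else -step
                p[i] = min(1.0, max(0.0, p[i]))
    return [round(pi) for pi in p]
```

In the pruning setting from the abstract, each bit would mark one weight or connection as kept or removed, and `fitness` would be validation accuracy of the pruned network.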

  7. ICANP2: Isoenergetic cluster algorithm for NP-complete Problems

    NASA Astrophysics Data System (ADS)

    Zhu, Zheng; Fang, Chao; Katzgraber, Helmut G.

NP-complete optimization problems with Boolean variables are of fundamental importance in computer science, mathematics and physics. Most notably, the minimization of general spin-glass-like Hamiltonians remains a difficult numerical task. There has been great interest in designing efficient heuristics to solve these computationally difficult problems. Inspired by the rejection-free isoenergetic cluster algorithm developed for Ising spin glasses, we present a generalized cluster update that can be applied to different NP-complete optimization problems with Boolean variables. The cluster updates allow for widespread sampling of phase space, thus speeding up optimization. By carefully tuning the pseudo-temperature (needed to randomize the configurations) of the problem, we show that the method can efficiently tackle problems on topologies with a large site-percolation threshold. We illustrate the ICANP2 heuristic on paradigmatic optimization problems, such as the satisfiability problem and the vertex cover problem.

  8. An algorithm for point cluster generalization based on the Voronoi diagram

    NASA Astrophysics Data System (ADS)

    Yan, Haowen; Weibel, Robert

    2008-08-01

This paper presents an algorithm for point cluster generalization. Four types of information, i.e. statistical, thematic, topological, and metric information, are considered, and measures are selected to describe each type quantitatively: the number of points for statistical information, the importance value for thematic information, the Voronoi neighbors for topological information, and the distribution range and relative local density for metric information. Based on these measures, an algorithm for point cluster generalization is developed. Firstly, the point clusters are triangulated and a border polygon of the point clusters is obtained. Using the border polygon, pseudo points are added to the original point clusters to form a new point set, and a range polygon that encloses all original points is constructed. Secondly, the Voronoi polygons of the new point set are computed in order to obtain the so-called relative local density of each point. Further, the selection probability of each point is computed from its relative local density and importance value, and the will-be-deleted points are marked as 'deleted' according to their selection probabilities and Voronoi neighboring relations. Thirdly, if the number of retained points does not match that computed by the Radical Law, the points marked as 'deleted' are physically deleted to form a new point set and the second step is repeated; otherwise, the pseudo points and the points marked as 'deleted' are physically deleted, and the generalized point clusters are obtained. Owing to the use of the Voronoi diagram, the algorithm is parameter-free and fully automatic. As our experiments show, it can be used in the generalization of point features arranged in clusters, such as thematic dot maps and control points on cartographic maps.
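The Radical Law used in the third step to fix the number of retained points is a one-line formula (Töpfer's law), shown here as a sketch; the scale arguments are the map-scale denominators:

```python
import math

def radical_law(n_source, scale_source, scale_target):
    """Töpfer's Radical Law: the number of features to retain when
    generalizing from a source scale to a (smaller) target scale is
    n_target = n_source * sqrt(scale_source / scale_target), where
    scales are denominators (e.g. 25_000 for 1:25,000)."""
    return round(n_source * math.sqrt(scale_source / scale_target))
```

In the algorithm above, the deletion loop repeats until the count of points not marked 'deleted' matches this target.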

  9. Algorithm for systematic peak extraction from atomic pair distribution functions.

    PubMed

    Granlund, L; Billinge, S J L; Duxbury, P M

    2015-07-01

    The study presents an algorithm, ParSCAPE, for model-independent extraction of peak positions and intensities from atomic pair distribution functions (PDFs). It provides a statistically motivated method for determining parsimony of extracted peak models using the information-theoretic Akaike information criterion (AIC) applied to plausible models generated within an iterative framework of clustering and chi-square fitting. All parameters the algorithm uses are in principle known or estimable from experiment, though careful judgment must be applied when estimating the PDF baseline of nanostructured materials. ParSCAPE has been implemented in the Python program SrMise. Algorithm performance is examined on synchrotron X-ray PDFs of 16 bulk crystals and two nanoparticles using AIC-based multimodeling techniques, and particularly the impact of experimental uncertainties on extracted models. It is quite resistant to misidentification of spurious peaks coming from noise and termination effects, even in the absence of a constraining structural model. Structure solution from automatically extracted peaks using the Liga algorithm is demonstrated for 14 crystals and for C60. Special attention is given to the information content of the PDF, theory and practice of the AIC, as well as the algorithm's limitations. PMID:26131896
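The AIC-based parsimony selection at the heart of the abstract can be sketched for least-squares (chi-square) fits. This is a toy stand-in for ParSCAPE's multimodeling step, not its actual implementation:

```python
def aic(chi_square, n_params):
    """Akaike information criterion for a least-squares fit, up to an
    additive constant: AIC = chi^2 + 2k.  Lower is better; the 2k term
    penalizes models with more peaks (more parameters)."""
    return chi_square + 2 * n_params

def best_model(models):
    """models: iterable of (name, chi_square, n_params) tuples;
    returns the name of the model with the smallest AIC."""
    return min(models, key=lambda m: aic(m[1], m[2]))[0]
```

Adding a peak always lowers chi-square a little, so raw fit quality alone would keep adding spurious peaks; AIC accepts a new peak only when the chi-square drop exceeds the 2k penalty, which is what makes the extracted model parsimonious.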

  10. NIC-based Reduction Algorithms for Large-scale Clusters

    SciTech Connect

    Petrini, F; Moody, A T; Fernandez, J; Frachtenberg, E; Panda, D K

    2004-07-30

    Efficient algorithms for reduction operations across a group of processes are crucial for good performance in many large-scale, parallel scientific applications. While previous algorithms limit processing to the host CPU, we utilize the programmable processors and local memory available on modern cluster network interface cards (NICs) to explore a new dimension in the design of reduction algorithms. In this paper, we present the benefits and challenges, design issues and solutions, analytical models, and experimental evaluations of a family of NIC-based reduction algorithms. Performance and scalability evaluations were conducted on the ASCI Linux Cluster (ALC), a 960-node, 1920-processor machine at Lawrence Livermore National Laboratory, which uses the Quadrics QsNet interconnect. We find NIC-based reductions on modern interconnects to be more efficient than host-based implementations in both scalability and consistency. In particular, at large-scale--1812 processes--NIC-based reductions of small integer and floating-point arrays provided respective speedups of 121% and 39% over the host-based, production-level MPI implementation.

  11. Robust growing neural gas algorithm with application in cluster analysis.

    PubMed

    Qin, A K; Suganthan, P N

    2004-01-01

We propose a novel robust clustering algorithm within the Growing Neural Gas (GNG) framework, called the Robust Growing Neural Gas (RGNG) network. The Matlab codes are available from . By incorporating several robust strategies, such as an outlier-resistant scheme, adaptive modulation of learning rates and a cluster repulsion method, into the traditional GNG framework, the proposed RGNG network possesses better robustness properties. The RGNG is insensitive to initialization, input sequence ordering and the presence of outliers. Furthermore, the RGNG network can automatically determine the optimal number of clusters by seeking the extreme value of the Minimum Description Length (MDL) measure during the network growing process. The resulting center positions of the optimal number of clusters, represented by prototype vectors, are close to the actual ones irrespective of the existence of outliers. Topology relationships among these prototypes can also be established. Experimental results have shown the superior performance of our proposed method over the original GNG incorporating the MDL method, called GNG-M, in static data clustering tasks on both artificial and UCI data sets. PMID:15555857

  12. Optical Cluster-Finding with an Adaptive Matched-Filter Technique: Algorithm and Comparison with Simulations

    SciTech Connect

    Dong, Feng; Pierpaoli, Elena; Gunn, James E.; Wechsler, Risa H.

    2007-10-29

We present a modified adaptive matched filter algorithm designed to identify clusters of galaxies in wide-field imaging surveys such as the Sloan Digital Sky Survey. The cluster-finding technique is fully adaptive to imaging surveys with spectroscopic coverage, multicolor photometric redshifts, no redshift information at all, and any combination of these within one survey. It works with high efficiency in multi-band imaging surveys where photometric redshifts can be estimated with well-understood error distributions. Tests of the algorithm on realistic mock SDSS catalogs suggest that the detected sample is ~85% complete and over 90% pure for clusters with masses above 1.0 × 10^14 h^-1 M_⊙ and redshifts up to z = 0.45. The errors of estimated cluster redshifts from the maximum likelihood method are shown to be small (typically less than 0.01) over the whole redshift range, with photometric redshift errors typical of those found in the Sloan survey. Inside the spherical radius corresponding to a galaxy overdensity of Δ = 200, we find the derived cluster richness Λ_200 to be a roughly linear indicator of its virial mass M_200, which recovers well the relation between total luminosity and cluster mass of the input simulation.

  13. Performance Analysis of Apriori Algorithm with Different Data Structures on Hadoop Cluster

    NASA Astrophysics Data System (ADS)

    Singh, Sudhakar; Garg, Rakhi; Mishra, P. K.

    2015-10-01

Mining frequent itemsets from massive datasets has always been one of the most important problems of data mining. Apriori is the most popular and simplest algorithm for frequent itemset mining. To enhance the efficiency and scalability of Apriori, a number of algorithms have been proposed addressing the design of efficient data structures, minimizing database scans, and parallel and distributed processing. MapReduce is the emerging parallel and distributed technology for processing big datasets on a Hadoop cluster. To mine big datasets it is essential to re-design data mining algorithms for this new paradigm. In this paper, we implement three variations of the Apriori algorithm using the data structures hash tree, trie, and hash table trie (i.e. trie with a hash technique) on the MapReduce paradigm. We emphasize and investigate the significance of these three data structures for Apriori on a Hadoop cluster, which has not yet been given attention. Experiments carried out on both real-life and synthetic datasets show that the hash table trie data structure performs far better than the trie and the hash tree in terms of execution time; the hash tree performs worst of all.
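The role a trie plays in Apriori's support counting can be sketched in plain Python: candidate itemsets are stored in a prefix tree and each transaction increments every candidate it contains. The MapReduce layer and the hash-tree and hash-table-trie variants compared in the paper are omitted:

```python
class _Node:
    """Prefix-tree node for candidate itemsets."""
    def __init__(self):
        self.children = {}
        self.terminal = False        # True where a candidate ends

def count_candidates(candidates, transactions):
    """Count the support of each candidate itemset with a trie walk."""
    root = _Node()
    for cand in candidates:
        node = root
        for item in sorted(cand):
            node = node.children.setdefault(item, _Node())
        node.terminal = True
    counts = {tuple(sorted(c)): 0 for c in candidates}

    def walk(node, items, path):
        if node.terminal:
            counts[path] += 1        # this transaction contains path
        for i, item in enumerate(items):
            child = node.children.get(item)
            if child is not None:    # descend only on matching prefixes
                walk(child, items[i + 1:], path + (item,))

    for t in transactions:
        walk(root, sorted(set(t)), ())
    return counts
```

Because the walk only descends on items actually present in the transaction, each transaction touches only the subtrees it can support, instead of testing every candidate independently.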

  14. Mammographic images segmentation based on chaotic map clustering algorithm

    PubMed Central

    2014-01-01

    Background This work investigates the applicability of a novel clustering approach to the segmentation of mammographic digital images. The chaotic map clustering algorithm is used to group together similar subsets of image pixels resulting in a medically meaningful partition of the mammography. Methods The image is divided into pixels subsets characterized by a set of conveniently chosen features and each of the corresponding points in the feature space is associated to a map. A mutual coupling strength between the maps depending on the associated distance between feature space points is subsequently introduced. On the system of maps, the simulated evolution through chaotic dynamics leads to its natural partitioning, which corresponds to a particular segmentation scheme of the initial mammographic image. Results The system provides a high recognition rate for small mass lesions (about 94% correctly segmented inside the breast) and the reproduction of the shape of regions with denser micro-calcifications in about 2/3 of the cases, while being less effective on identification of larger mass lesions. Conclusions We can summarize our analysis by asserting that due to the particularities of the mammographic images, the chaotic map clustering algorithm should not be used as the sole method of segmentation. It is rather the joint use of this method along with other segmentation techniques that could be successfully used for increasing the segmentation performance and for providing extra information for the subsequent analysis stages such as the classification of the segmented ROI. PMID:24666766

  15. Algorithm-dependent fault tolerance for distributed computing

    SciTech Connect

Hough, P. D.; Goldsby, M. E.; Walsh, E. J.

    2000-02-01

    Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.

  16. Combined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL): A Robust Method for Selection of Cluster Number, K.

    PubMed

    Sweeney, Timothy E; Chen, Albert C; Gevaert, Olivier

    2015-11-19

    In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.

  17. Dynamically Incremental K-means++ Clustering Algorithm Based on Fuzzy Rough Set Theory

    NASA Astrophysics Data System (ADS)

    Li, Wei; Wang, Rujing; Jia, Xiufang; Jiang, Qing

Since the classic K-means++ clustering algorithm handles only static data, a dynamically incremental K-means++ clustering algorithm (DK-Means++) based on fuzzy rough set theory is presented in this paper. Firstly, in the DK-Means++ algorithm, the similarity formula is improved with weights computed from the importance degree of attributes, which are reduced on the basis of rough fuzzy set theory. Secondly, new data only need to be matched against granules already clustered by the K-means++ algorithm; only rarely is new data re-clustered with the global data by the classic K-means++ algorithm. In this way, re-clustering all data each time the dynamic data set grows is avoided, so the efficiency of clustering is improved. Our experiments show that the DK-Means++ algorithm can objectively and efficiently deal with the clustering of dynamically incremental data.
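The incremental matching step can be sketched as follows. Here a cluster 'granule' is reduced to a centroid plus an acceptance radius, plain Euclidean distance stands in for the weighted fuzzy-rough similarity, and all names are illustrative:

```python
def assign_incremental(point, centroids, radius):
    """Match a new point against existing cluster granules: return the
    index of the nearest centroid if it is within the acceptance
    radius, else None (meaning a full re-clustering is needed)."""
    best, best_d = None, float("inf")
    for ci, c in enumerate(centroids):
        d = sum((a - b) ** 2 for a, b in zip(point, c)) ** 0.5
        if d < best_d:
            best, best_d = ci, d
    if best_d <= radius:
        return best          # absorbed by an existing cluster
    return None              # unmatched: cluster from scratch
```

The efficiency gain described in the abstract comes from the `None` branch being rare: most arriving points are absorbed in O(k) time instead of triggering a global K-means++ pass.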

  18. Gravitation field algorithm and its application in gene cluster

    PubMed Central

    2010-01-01

    Background Searching optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results This paper proposes a novel searching optimization algorithm called Gravitation Field Algorithm (GFA) which is derived from the famous astronomy theory Solar Nebular Disk Model (SNDM) of planetary formation. GFA simulates the Gravitation field and outperforms GA and SA in some multimodal functions optimization problem. And GFA also can be used in the forms of unimodal functions. GFA clusters the dataset well from the Gene Expression Omnibus. Conclusions The mathematical proof demonstrates that GFA could be convergent in the global optimum by probability 1 in three conditions for one independent variable mass functions. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA. PMID:20854683

  19. Novel similarity-based clustering algorithm for grouping broadcast news

    NASA Astrophysics Data System (ADS)

    Ibrahimov, Oktay V.; Sethi, Ishwar K.; Dimitrova, Nevenka

    2002-03-01

The goal of the current paper is to introduce a novel clustering algorithm designed for grouping transcribed textual documents obtained from audio and video segments. Since audio transcripts are normally highly erroneous documents, one of the major challenges at the text processing stage is to reduce the negative impact of errors introduced at the speech recognition stage. Other difficulties come from the nature of conversational speech. In the paper we describe the main difficulties of spoken documents and suggest an approach that restricts their negative effects. We also present a clustering algorithm that groups transcripts on the basis of the informative closeness of documents. To carry out such partitioning we give an intuitive definition of the informative field of a transcript and use it in our algorithm. To assess the informative closeness of transcripts, we apply a Chi-square similarity measure, which is also described in the paper. Our experiments with the Chi-square similarity measure showed its robustness and high efficacy. In particular, a performance analysis carried out for Chi-square and three other similarity measures (Cosine, Dice, and Jaccard) showed that Chi-square is more robust to the specific features of spoken documents.
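A chi-square-style similarity between two bag-of-words documents can be sketched as below. This is in the spirit of the measure the paper applies to noisy transcripts; the exact normalization used there may differ:

```python
def chi_square_similarity(doc_a, doc_b):
    """Compare two word-count dicts against the expectation that both
    documents draw words from a pooled distribution.  Smaller chi^2
    means more similar, so return 1 / (1 + chi^2) in (0, 1]."""
    vocab = set(doc_a) | set(doc_b)
    n_a, n_b = sum(doc_a.values()), sum(doc_b.values())
    chi2 = 0.0
    for w in vocab:
        oa, ob = doc_a.get(w, 0), doc_b.get(w, 0)
        ea = (oa + ob) * n_a / (n_a + n_b)   # expected count in doc_a
        eb = (oa + ob) * n_b / (n_a + n_b)   # expected count in doc_b
        chi2 += (oa - ea) ** 2 / ea + (ob - eb) ** 2 / eb
    return 1.0 / (1.0 + chi2)
```

Because the statistic compares observed counts with pooled expectations rather than raw overlaps, occasional recognizer errors on rare words perturb it less than set-overlap measures such as Jaccard, which is the intuition behind the robustness claim above.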

  20. A new detection algorithm for microcalcification clusters in mammographic screening

    NASA Astrophysics Data System (ADS)

    Xie, Weiying; Ma, Yide; Li, Yunsong

    2015-05-01

A novel approach for microcalcification cluster detection is proposed. We first briefly analyze mammographic images with microcalcification lesions to confirm that these lesions have much greater gray values than normal regions. After summarizing the specific features of microcalcification clusters in mammographic screening, we focus on the preprocessing steps, including background elimination, image enhancement, and pectoral muscle removal. In detail, the Chan-Vese model is used to eliminate the background. We then combine a morphology method with edge detection: after an AND operation and a Sobel filter, the Hough transform is applied, which proves effective at eliminating the pectoral muscle, whose gray level is close to that of microcalcifications. The enhancement step is likewise achieved by morphology. We design the preprocessing to keep computational complexity low. As is well known, robust analysis of mammograms is difficult owing to the low contrast between normal and lesion tissues, and such images also contain much noise. After this preprocessing, a method based on blob detection identifies microcalcification clusters according to their specific features. The proposed algorithm employs the Laplace operator to improve the Difference of Gaussians (DoG) function for low-contrast images. A preliminary evaluation of the proposed method is performed on the known public MIAS database, rather than on synthetic images. Comparison experiments and Cohen's kappa coefficients demonstrate that our proposed approach can potentially obtain better microcalcification cluster detection results in terms of accuracy, sensitivity, and specificity.
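The Difference-of-Gaussians blob detection underlying the method can be sketched with NumPy alone. This toy detector omits the Laplace-based refinement and all mammogram preprocessing from the paper, and the sigmas and threshold are illustrative assumptions:

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian filter built from scratch (no SciPy needed)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    pad = np.pad(img, r, mode="edge").astype(float)
    tmp = np.apply_along_axis(lambda row: np.convolve(row, k, "valid"), 1, pad)
    return np.apply_along_axis(lambda col: np.convolve(col, k, "valid"), 0, tmp)

def dog_blobs(img, s1=1.0, s2=2.0, thresh=0.1):
    """Toy DoG detector: bright, compact spots (stand-ins for
    microcalcifications) respond strongly to the difference between a
    fine and a coarse blur; return their pixel coordinates."""
    dog = gaussian_blur(img, s1) - gaussian_blur(img, s2)
    return np.argwhere(dog > thresh)
```

The DoG acts as a band-pass filter: structures much larger than `s2` (dense tissue) and pixel noise smaller than `s1` are both suppressed, leaving spot-sized responses.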

  1. Classification of posture maintenance data with fuzzy clustering algorithms

    NASA Technical Reports Server (NTRS)

    Bezdek, James C.

    1992-01-01

Sensory inputs from the visual, vestibular, and proprioreceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
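A minimal fuzzy c-means sketch is below: each sample receives a membership in every cluster (rows of U sum to 1) rather than a hard label, which is what makes FCM suited to gradual states such as pre-/post-adaptation postural data. This is plain FCM only, with no connection to the NASA data handling; all parameters are illustrative:

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=50, seed=0):
    """Standard FCM iteration: alternate weighted-centroid and
    membership updates with fuzzifier m > 1."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))            # random fuzzy memberships
    U /= U.sum(axis=1, keepdims=True)      # each row sums to 1
    for _ in range(iters):
        W = U ** m                         # fuzzified weights
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-9
        inv = d ** (-2.0 / (m - 1.0))      # standard membership update
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U
```

Taking `U.argmax(axis=1)` recovers hard labels when needed, while the raw memberships expose borderline samples, e.g. trials recorded mid-readaptation.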

  2. Radial distribution of metallicity in the LMC cluster systems

    NASA Technical Reports Server (NTRS)

    Kontizas, M.; Kontizas, E.; Michalitsianos, A. G.

    1993-01-01

    New determinations of the deprojected distances to the galaxy center for 94 star clusters and their metal abundances are used to investigate the variation of metallicity across the two LMC star cluster systems (Kontizas et al. 1990). A systematic radial trend of metallicity is observed in the extended outer cluster system, the outermost clusters being significantly metal poorer than the more central ones, with the exception of six clusters (which might lie out of the plane of the cluster system) out of 77. A radial metallicity gradient has been found, qualitatively comparable to that of the Milky Way for its system of the old disk clusters. If the six clusters are taken into consideration then the outer cluster system is well mixed up to 8 kpc. The spatial distribution of metallicities for the inner LMC cluster system, consisting of very young globulars does not show a systematic radial trend; they are all metal rich.

  3. jClustering, an Open Framework for the Development of 4D Clustering Algorithms

    PubMed Central

    Mateos-Pérez, José María; García-Villalba, Carmen; Pascau, Javier; Desco, Manuel; Vaquero, Juan J.

    2013-01-01

    We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary. PMID:23990913

  4. Classification of posture maintenance data with fuzzy clustering algorithms

    NASA Technical Reports Server (NTRS)

    Bezdek, James C.

    1991-01-01

    Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various conditions were collected in conjunction with JSC postural control studies using a Tilt-Translation Device (TTD). The University of West Florida proposed applying the Fuzzy C-Means (FCM) clustering algorithms to these data with a view toward identifying various states and stages. Data supplied by NASA/JSC were submitted to the FCM algorithms in an attempt to identify and characterize cluster substructure in a mixed ensemble of pre- and post-adaptational TTD data. Following several unsuccessful trials with FCM using the full 11-dimensional data set, a set of two channels (features) was found that enables FCM to separate pre- from post-adaptational TTD data. The main conclusions are that: (1) FCM seems able to separate pre- from post-TTD data for subject no. 2 on the one trial that was used, but only in certain subintervals of time; and (2) channels 2 (right rear transducer force) and 8 (hip sway bar) contain better discrimination information than the other supersets and combinations of the data tried so far.
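    The fuzzy c-means procedure referenced above can be sketched in a few lines. The following is a minimal pure-Python FCM on hypothetical toy points, not the NASA/JSC pipeline; the membership-update and center-update formulas are the standard ones with fuzzifier m:

    ```python
    import random

    def fuzzy_c_means(points, c=2, m=2.0, iters=50, seed=0):
        """Minimal fuzzy c-means: returns cluster centers and the membership
        matrix u, where u[k][i] is point k's degree of membership in cluster i."""
        rng = random.Random(seed)
        centers = [list(p) for p in rng.sample(points, c)]
        u = [[0.0] * c for _ in points]
        for _ in range(iters):
            # membership update: inverse relative distances to the centers
            for k, p in enumerate(points):
                d = [max(1e-12, sum((a - b) ** 2 for a, b in zip(p, ctr)) ** 0.5)
                     for ctr in centers]
                for i in range(c):
                    u[k][i] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                        for j in range(c))
            # center update: membership-weighted means
            for i in range(c):
                w = [u[k][i] ** m for k in range(len(points))]
                tot = sum(w)
                centers[i] = [sum(w[k] * points[k][dim] for k in range(len(points))) / tot
                              for dim in range(len(points[0]))]
        return centers, u
    ```

    On two well-separated blobs the membership matrix becomes nearly crisp; in adaptation data the interesting cases are the samples left with mixed memberships.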

  5. Inconsistent Denoising and Clustering Algorithms for Amplicon Sequence Data.

    PubMed

    Koskinen, Kaisa; Auvinen, Petri; Björkroth, K Johanna; Hultman, Jenni

    2015-08-01

    Natural microbial communities have been studied for decades using the 16S rRNA gene as a marker. In recent years, the application of second-generation sequencing technologies has revolutionized our understanding of the structure and function of microbial communities in complex environments. Using these highly parallel techniques, a detailed description of community characteristics can be constructed, and even the rare biosphere can be detected. The new approaches carry numerous advantages and avoid many of the features that skewed results obtained with traditional techniques, but serious biases remain, as does a lack of reliable comparability between results. Here, we contrasted publicly available amplicon sequence data analysis algorithms on two data sets: one with a defined clone-based structure, and one from a well-studied food spoilage community. We aimed to assess which software and parameters produce results that best resemble the benchmark community, how large the differences between methods are, and whether these differences are statistically significant. The results suggest that commonly accepted denoising and clustering methods used in different combinations produce significantly different outcomes: the clustering method greatly affects the number of operational taxonomic units (OTUs), while the denoising algorithm has more influence on taxonomic affiliations. The magnitude of the difference in OTU numbers was up to 40-fold, and the disparity between results appeared highly dependent on community structure and diversity. Statistically significant differences in taxonomies between methods were seen even at the phylum level. However, applying an effective denoising method seemed to even out the differences produced by clustering. PMID:25525895

  6. Node Self-Deployment Algorithm Based on an Uneven Cluster with Radius Adjusting for Underwater Sensor Networks.

    PubMed

    Jiang, Peng; Xu, Yiming; Wu, Feng

    2016-01-14

    Existing move-restricted node self-deployment algorithms assume a fixed node communication radius and evaluate performance only by network coverage or connectivity rate; they do not consider the number of nodes near the sink node or the energy-consumption distribution of the network topology, which degrades network reliability and the energy-consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins uneven clustering based on its distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, each cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm is novel in considering network reliability and the energy-consumption balance during node deployment, and in considering the coverage redundancy rate of all positions a node may reach during position adjustment. Simulation results show that, compared to the connected dominating set (CDS) based depth computation algorithm, the proposed algorithm can increase the number of nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve the network coverage rate and reduce energy consumption.

  7. Node Self-Deployment Algorithm Based on an Uneven Cluster with Radius Adjusting for Underwater Sensor Networks.

    PubMed

    Jiang, Peng; Xu, Yiming; Wu, Feng

    2016-01-01

    Existing move-restricted node self-deployment algorithms assume a fixed node communication radius and evaluate performance only by network coverage or connectivity rate; they do not consider the number of nodes near the sink node or the energy-consumption distribution of the network topology, which degrades network reliability and the energy-consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins uneven clustering based on its distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, each cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm is novel in considering network reliability and the energy-consumption balance during node deployment, and in considering the coverage redundancy rate of all positions a node may reach during position adjustment. Simulation results show that, compared to the connected dominating set (CDS) based depth computation algorithm, the proposed algorithm can increase the number of nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve the network coverage rate and reduce energy consumption. PMID:26784193

  8. Node Self-Deployment Algorithm Based on an Uneven Cluster with Radius Adjusting for Underwater Sensor Networks

    PubMed Central

    Jiang, Peng; Xu, Yiming; Wu, Feng

    2016-01-01

    Existing move-restricted node self-deployment algorithms assume a fixed node communication radius and evaluate performance only by network coverage or connectivity rate; they do not consider the number of nodes near the sink node or the energy-consumption distribution of the network topology, which degrades network reliability and the energy-consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins uneven clustering based on its distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, each cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm is novel in considering network reliability and the energy-consumption balance during node deployment, and in considering the coverage redundancy rate of all positions a node may reach during position adjustment. Simulation results show that, compared to the connected dominating set (CDS) based depth computation algorithm, the proposed algorithm can increase the number of nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve the network coverage rate and reduce energy consumption. PMID:26784193
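    The uneven-clustering idea used in records like this one can be sketched generically. Below is a hypothetical radius rule and a greedy head-election pass, not the authors' exact protocol: nodes closer to the sink get a smaller competition radius (so their heads keep spare energy for relaying), and heads are elected by residual energy subject to a spacing constraint:

    ```python
    def competition_radius(d_sink, d_min, d_max, r_max, alpha=0.5):
        """Uneven-clustering rule of thumb: the closer a node is to the sink,
        the smaller its competition radius, so near-sink clusters stay small."""
        frac = (d_max - d_sink) / max(d_max - d_min, 1e-9)
        return r_max * (1.0 - alpha * frac)

    def elect_heads(nodes, d_min, d_max, r_max):
        """Greedy head election: scan nodes in order of residual energy; a node
        becomes a head unless an already-elected head lies inside its radius.
        Each node is (x, y, d_sink, energy)."""
        heads = []
        for x, y, d_sink, e in sorted(nodes, key=lambda n: -n[3]):
            r = competition_radius(d_sink, d_min, d_max, r_max)
            if all(((x - hx) ** 2 + (y - hy) ** 2) ** 0.5 > r
                   for hx, hy, _, _ in heads):
                heads.append((x, y, d_sink, e))
        return heads
    ```

    The `alpha` parameter controls how strongly cluster size shrinks toward the sink; the paper's actual radius rule and head-selection criteria differ in detail.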

  9. A comparison of queueing, cluster and distributed computing systems

    NASA Technical Reports Server (NTRS)

    Kaplan, Joseph A.; Nelson, Michael L.

    1993-01-01

    Using workstation clusters for distributed computing has become popular with the proliferation of inexpensive, powerful workstations. Workstation clusters offer both a cost-effective alternative to batch processing and an easy entry into parallel computing. However, a number of workstations on a network does not constitute a cluster; cluster management software is necessary to harness the collective computing power. A variety of cluster management and queueing systems are compared: Distributed Queueing System (DQS), Condor, Load Leveler, Load Balancer, Load Sharing Facility (LSF, formerly Utopia), Distributed Job Manager (DJM), Computing in Distributed Networked Environments (CODINE), and NQS/Exec. The systems differ in their design philosophy and implementation. Based on published reports on the different systems and conversations with the systems' developers and vendors, the systems are compared on the integral issues of clustered computing.

  10. Dynamic Layered Dual-Cluster Heads Routing Algorithm Based on Krill Herd Optimization in UWSNs

    PubMed Central

    Jiang, Peng; Feng, Yang; Wu, Feng; Yu, Shanen; Xu, Huan

    2016-01-01

    To address the limited energy of nodes in underwater wireless sensor networks (UWSNs) and the heavy load on cluster heads in clustering routing algorithms, this paper proposes a dynamic layered dual-cluster-head routing algorithm based on Krill Herd optimization in UWSNs. Cluster size is first decided by the distance between the cluster head nodes and the sink node, and a dynamic layered mechanism is established to avoid repeated selection of the same cluster head nodes. The Krill Herd optimization algorithm selects the optimal and second-optimal cluster heads, and its Lagrangian model directs nodes toward a high-likelihood area. The algorithm ultimately realizes the functions of data collection and data transmission. The simulation results show that the proposed algorithm can effectively decrease cluster energy consumption, balance the network energy consumption, and prolong the network lifetime. PMID:27589744

  11. Dynamic Layered Dual-Cluster Heads Routing Algorithm Based on Krill Herd Optimization in UWSNs.

    PubMed

    Jiang, Peng; Feng, Yang; Wu, Feng; Yu, Shanen; Xu, Huan

    2016-01-01

    To address the limited energy of nodes in underwater wireless sensor networks (UWSNs) and the heavy load on cluster heads in clustering routing algorithms, this paper proposes a dynamic layered dual-cluster-head routing algorithm based on Krill Herd optimization in UWSNs. Cluster size is first decided by the distance between the cluster head nodes and the sink node, and a dynamic layered mechanism is established to avoid repeated selection of the same cluster head nodes. The Krill Herd optimization algorithm selects the optimal and second-optimal cluster heads, and its Lagrangian model directs nodes toward a high-likelihood area. The algorithm ultimately realizes the functions of data collection and data transmission. The simulation results show that the proposed algorithm can effectively decrease cluster energy consumption, balance the network energy consumption, and prolong the network lifetime. PMID:27589744

  12. Applying various algorithms for species distribution modelling.

    PubMed

    Li, Xinhai; Wang, Yuan

    2013-06-01

    Species distribution models have been used extensively in many fields, including climate change biology, landscape ecology and conservation biology. In the past 3 decades, a number of new models have been proposed, yet researchers still find it difficult to select appropriate models for data and objectives. In this review, we aim to provide insight into the prevailing species distribution models for newcomers in the field of modelling. We compared 11 popular models, including regression models (the generalized linear model, the generalized additive model, the multivariate adaptive regression splines model and hierarchical modelling), classification models (mixture discriminant analysis, the generalized boosting model, and classification and regression tree analysis) and complex models (artificial neural network, random forest, genetic algorithm for rule set production and maximum entropy approaches). Our objectives are: (i) to compare the strengths and weaknesses of the models, their characteristics and identify suitable situations for their use (in terms of data type and species-environment relationships) and (ii) to provide guidelines for model application, including 3 steps: model selection, model formulation and parameter estimation. PMID:23731809

  13. Applying various algorithms for species distribution modelling.

    PubMed

    Li, Xinhai; Wang, Yuan

    2013-06-01

    Species distribution models have been used extensively in many fields, including climate change biology, landscape ecology and conservation biology. In the past 3 decades, a number of new models have been proposed, yet researchers still find it difficult to select appropriate models for data and objectives. In this review, we aim to provide insight into the prevailing species distribution models for newcomers in the field of modelling. We compared 11 popular models, including regression models (the generalized linear model, the generalized additive model, the multivariate adaptive regression splines model and hierarchical modelling), classification models (mixture discriminant analysis, the generalized boosting model, and classification and regression tree analysis) and complex models (artificial neural network, random forest, genetic algorithm for rule set production and maximum entropy approaches). Our objectives are: (i) to compare the strengths and weaknesses of the models, their characteristics and identify suitable situations for their use (in terms of data type and species-environment relationships) and (ii) to provide guidelines for model application, including 3 steps: model selection, model formulation and parameter estimation.

  14. Clustering Algorithm for Unsupervised Monaural Musical Sound Separation Based on Non-negative Matrix Factorization

    NASA Astrophysics Data System (ADS)

    Park, Sang Ha; Lee, Seokjin; Sung, Koeng-Mo

    Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.
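    The NMF step that precedes the clustering discussed above can be sketched with the standard Lee-Seung multiplicative updates. This is a generic toy factorization (numpy, random non-negative matrix), not the authors' separation pipeline:

    ```python
    import numpy as np

    def nmf(V, r, iters=500, seed=0):
        """Multiplicative-update NMF minimizing ||V - WH||_F^2.
        V (n x m) must be non-negative; returns W (n x r), H (r x m)."""
        rng = np.random.default_rng(seed)
        n, m = V.shape
        W = rng.random((n, r)) + 0.1
        H = rng.random((r, m)) + 0.1
        for _ in range(iters):
            # Lee-Seung updates; the small constant avoids division by zero
            H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
            W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
        return W, H
    ```

    For a magnitude spectrogram V, each column of W is a spectral basis (a timbre candidate) and each row of H its activation over time; the clustering step then groups basis vectors, e.g. by similarity of timbre features, into per-instrument tracks.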

  15. Comparing simulated and experimental molecular cluster distributions.

    PubMed

    Olenius, Tinja; Schobesberger, Siegfried; Kupiainen-Määttä, Oona; Franchin, Alessandro; Junninen, Heikki; Ortega, Ismael K; Kurtén, Theo; Loukonen, Ville; Worsnop, Douglas R; Kulmala, Markku; Vehkamäki, Hanna

    2013-01-01

    Formation of secondary atmospheric aerosol particles starts with gas phase molecules forming small molecular clusters. High-resolution mass spectrometry enables the detection and chemical characterization of electrically charged clusters from the molecular scale upward, whereas the experimental detection of electrically neutral clusters, especially as a chemical composition measurement, down to 1 nm in diameter and beyond still remains challenging. In this work we simulated a set of both electrically neutral and charged small molecular clusters, consisting of sulfuric acid and ammonia molecules, with a dynamic collision and evaporation model. Collision frequencies between the clusters were calculated according to classical kinetics, and evaporation rates were derived from first principles quantum chemical calculations with no fitting parameters. We found a good agreement between the modeled steady-state concentrations of negative cluster ions and experimental results measured with the state-of-the-art Atmospheric Pressure interface Time-Of-Flight mass spectrometer (APi-TOF) in the CLOUD chamber experiments at CERN. The model can be used to interpret experimental results and give information on neutral clusters that cannot be directly measured.

  16. Comparing simulated and experimental molecular cluster distributions.

    PubMed

    Olenius, Tinja; Schobesberger, Siegfried; Kupiainen-Määttä, Oona; Franchin, Alessandro; Junninen, Heikki; Ortega, Ismael K; Kurtén, Theo; Loukonen, Ville; Worsnop, Douglas R; Kulmala, Markku; Vehkamäki, Hanna

    2013-01-01

    Formation of secondary atmospheric aerosol particles starts with gas phase molecules forming small molecular clusters. High-resolution mass spectrometry enables the detection and chemical characterization of electrically charged clusters from the molecular scale upward, whereas the experimental detection of electrically neutral clusters, especially as a chemical composition measurement, down to 1 nm in diameter and beyond still remains challenging. In this work we simulated a set of both electrically neutral and charged small molecular clusters, consisting of sulfuric acid and ammonia molecules, with a dynamic collision and evaporation model. Collision frequencies between the clusters were calculated according to classical kinetics, and evaporation rates were derived from first principles quantum chemical calculations with no fitting parameters. We found a good agreement between the modeled steady-state concentrations of negative cluster ions and experimental results measured with the state-of-the-art Atmospheric Pressure interface Time-Of-Flight mass spectrometer (APi-TOF) in the CLOUD chamber experiments at CERN. The model can be used to interpret experimental results and give information on neutral clusters that cannot be directly measured. PMID:24600997
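    The collision-evaporation bookkeeping behind such models can be illustrated with a deliberately tiny example: monomers collide at rate beta to form dimers, and dimers evaporate at rate gamma (hypothetical rates; the paper's model tracks sulfuric acid-ammonia clusters with quantum-chemistry-derived evaporation rates and no fitted parameters):

    ```python
    def dimer_steady_state(c1, beta, gamma, dt=1e-4, steps=200000):
        """Euler-integrate d[C2]/dt = 0.5*beta*C1^2 - gamma*C2 with the monomer
        concentration C1 held fixed; the 1/2 avoids double-counting collisions
        between identical monomers. Returns C2 after `steps` steps."""
        c2 = 0.0
        for _ in range(steps):
            c2 += dt * (0.5 * beta * c1 * c1 - gamma * c2)
        return c2
    ```

    The analytic steady state is C2 = 0.5*beta*C1^2/gamma, which the integration approaches once t >> 1/gamma; the full model simply carries one such balance equation per cluster composition and charge state.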

  17. a New Online Distributed Process Fault Detection and Isolation Approach Using Potential Clustering Technique

    NASA Astrophysics Data System (ADS)

    Bahrampour, Soheil; Moshiri, Behzad; Salahshoor, Karim

    2009-08-01

    Most process fault monitoring systems suffer from offline computation and difficulty confronting novel faults, which limit their applicability. This paper presents a new online fault detection and isolation (FDI) algorithm based on a distributed online clustering approach. In the proposed approach, the clustering algorithm is used for online detection of a new trend in time-series data, which indicates a faulty condition. A distributed technique is used to decompose the overall monitoring task into a series of local monitoring sub-tasks so as to locally track and capture process faults. This algorithm not only solves the problem of online FDI, but can also handle novel faults. The diagnostic performance of the proposed FDI approach is evaluated on the Tennessee Eastman process plant as a large-scale benchmark problem.
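    A generic online-clustering detector in this spirit (a hypothetical leader-style threshold rule, not the paper's potential-clustering formula): each new sample joins the nearest existing cluster if it falls within a radius, otherwise it opens a new cluster, and the appearance of a new cluster flags a candidate fault:

    ```python
    def online_fault_detector(stream, radius):
        """Leader-style online clustering on a scalar stream: each sample either
        joins (and shifts) the nearest cluster center within `radius`, or starts
        a new cluster. Returns (centers, indices where new clusters appeared)."""
        centers, counts, alarms = [], [], []
        for t, x in enumerate(stream):
            if centers:
                j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
                if abs(x - centers[j]) <= radius:
                    counts[j] += 1
                    centers[j] += (x - centers[j]) / counts[j]  # running mean
                    continue
            centers.append(x)
            counts.append(1)
            alarms.append(t)
        return centers, alarms
    ```

    In a distributed setting, one such detector would run per process unit on its local measurements, which is what localizes (isolates) the fault.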

  18. A distributed decision framework for building clusters with different heterogeneity settings

    DOE PAGES

    Jafari-Marandi, Ruholla; Omitaomu, Olufemi A.; Hu, Mengqi

    2016-01-05

    In the past few decades, extensive research has been conducted to develop operation and control strategies for smart buildings with the purpose of reducing energy consumption. Beyond studies of single buildings, it is envisioned that next-generation buildings can freely connect with one another to share energy and exchange information in the context of the smart grid. It has been demonstrated that a network of connected buildings (aka building clusters) can significantly reduce primary energy consumption and improve environmental sustainability and buildings' resilience capability. However, an analytic tool is missing to determine which types of buildings should form a cluster and how the heterogeneity of building clusters' energy profiles affects the energy performance of those clusters. To bridge these research gaps, we propose a self-organizing map clustering algorithm to divide multiple buildings into clusters based on their energy profiles, and a homogeneity index to evaluate the heterogeneity of different building cluster configurations. In addition, a bi-level distributed decision model is developed to study energy sharing in the building clusters. To demonstrate the effectiveness of the proposed clustering algorithm and decision model, we employ a dataset of monthly energy consumption data for 30 buildings in which the data are collected every 15 min. It is demonstrated that the proposed decision model can achieve at least 13% cost savings for building clusters. Furthermore, the results show that the heterogeneity of energy profiles is an important factor in selecting batteries and renewable energy sources for building clusters, and that shared batteries and renewable energy are preferred for more heterogeneous building clusters.
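    A small self-organizing-map sketch for grouping load profiles (pure Python on hypothetical toy profiles; the study used 15-min data for 30 buildings, and the exact map topology and schedules are assumptions here):

    ```python
    import math, random

    def train_som(profiles, n_nodes=2, iters=800, seed=0):
        """1-D SOM: nodes are prototype profiles; each step pulls the best
        matching unit (and, early on, its neighbours) toward a random sample."""
        rng = random.Random(seed)
        dim = len(profiles[0])
        nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
        for t in range(iters):
            x = rng.choice(profiles)
            d = [sum((a - b) ** 2 for a, b in zip(x, w)) for w in nodes]
            bmu = d.index(min(d))
            lr = 0.5 * (1.0 - t / iters)                       # decaying learning rate
            sigma = max(1e-3, n_nodes * (1.0 - t / iters))     # shrinking neighbourhood
            for i, w in enumerate(nodes):
                h = math.exp(-((i - bmu) ** 2) / (2 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
        return nodes

    def assign(profiles, nodes):
        """Map each profile to its best matching unit (its cluster label)."""
        return [min(range(len(nodes)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(p, nodes[i])))
                for p in profiles]
    ```

    After training, buildings mapped to the same node form a candidate cluster, and a homogeneity index can then be computed per cluster from the spread of member profiles around the node prototype.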

  19. Cluster-based distributed face tracking in camera networks.

    PubMed

    Yoder, Josiah; Medeiros, Henry; Park, Johnny; Kak, Avinash C

    2010-10-01

    In this paper, we present a distributed multicamera face tracking system suitable for large wired camera networks. Unlike previous multicamera face tracking systems, our system does not require a central server to coordinate the entire tracking effort. Instead, an efficient camera clustering protocol is used to dynamically form groups of cameras for in-network tracking of individual faces. The clustering protocol includes cluster propagation mechanisms that allow the computational load of face tracking to be transferred to different cameras as the target objects move. Furthermore, the dynamic election of cluster leaders provides robustness against system failures. Our experimental results show that our cluster-based distributed face tracker is capable of accurately tracking multiple faces in real-time. The overall performance of the distributed system is comparable to that of a centralized face tracker, while presenting the advantages of scalability and robustness. PMID:20423804

  20. Contributions to "k"-Means Clustering and Regression via Classification Algorithms

    ERIC Educational Resources Information Center

    Salman, Raied

    2012-01-01

    The dissertation deals with clustering algorithms and with transforming regression problems into classification problems. The main contributions of the dissertation are twofold: first, to improve (speed up) the clustering algorithms and, second, to develop a strict learning environment for solving regression problems as classification tasks by using…

  1. User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.

    ERIC Educational Resources Information Center

    Gordon, Michael D.

    1991-01-01

    Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
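    The genetic-algorithm machinery referenced in this record can be sketched generically. The following is a plain bitstring GA maximizing a toy fitness (the paper's chromosomes encode subject descriptions and its fitness is retrieval effectiveness, neither of which is reproduced here):

    ```python
    import random

    def genetic_algorithm(fitness, n_bits, pop_size=30, gens=60, p_mut=0.02, seed=0):
        """Plain generational GA: tournament selection, one-point crossover,
        bit-flip mutation, with elitist tracking of the best individual."""
        rng = random.Random(seed)
        pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
        best = max(pop, key=fitness)
        for _ in range(gens):
            nxt = []
            while len(nxt) < pop_size:
                p1 = max(rng.sample(pop, 3), key=fitness)   # tournament of 3
                p2 = max(rng.sample(pop, 3), key=fitness)
                cut = rng.randrange(1, n_bits)              # one-point crossover
                child = p1[:cut] + p2[cut:]
                child = [b ^ 1 if rng.random() < p_mut else b for b in child]
                nxt.append(child)
            pop = nxt
            best = max(pop + [best], key=fitness)
        return best
    ```

    With `fitness=sum` this solves the onemax problem; in the document-clustering setting the bits would instead toggle subject descriptors, and fitness would reward matching relevant queries.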

  2. On the impact of dissimilarity measure in k-modes clustering algorithm.

    PubMed

    Ng, Michael K; Li, Mark Junjie; Huang, Joshua Zhexue; He, Zengyou

    2007-03-01

    This correspondence describes extensions to the k-modes algorithm for clustering categorical data. By modifying a simple matching dissimilarity measure for categorical objects, a heuristic approach was developed in [4], [12] which allows the use of the k-modes paradigm to obtain a cluster with strong intrasimilarity and to efficiently cluster large categorical data sets. The main aim of this paper is to rigorously derive the updating formula of the k-modes clustering algorithm with the new dissimilarity measure and the convergence of the algorithm under the optimization framework. PMID:17224620
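    The matching dissimilarity and mode-updating formula discussed above can be sketched as follows (toy categorical records; the seeding rule here is naive and hypothetical, not the paper's):

    ```python
    def matching_dissim(a, b):
        """Simple matching dissimilarity: number of attributes that differ."""
        return sum(x != y for x, y in zip(a, b))

    def k_modes(data, k=2, iters=10):
        """Minimal k-modes for k=2: naive seeding (first record, then the
        record farthest from it), then alternate assignment and mode update."""
        modes = [list(data[0]),
                 list(max(data, key=lambda r: matching_dissim(r, data[0])))][:k]
        clusters = [[] for _ in range(k)]
        for _ in range(iters):
            clusters = [[] for _ in range(k)]
            for rec in data:
                j = min(range(k), key=lambda i: matching_dissim(rec, modes[i]))
                clusters[j].append(rec)
            for i, cl in enumerate(clusters):
                # the updating formula: each attribute takes its most frequent value
                for attr in range(len(data[0])):
                    vals = [r[attr] for r in cl] or [modes[i][attr]]
                    modes[i][attr] = max(set(vals), key=vals.count)
        return modes, clusters
    ```

    The convergence argument in the paper rests on exactly this alternation: reassignment never increases total dissimilarity, and neither does replacing a mode attribute by the cluster's most frequent value.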

  3. Security clustering algorithm based on reputation in hierarchical peer-to-peer network

    NASA Astrophysics Data System (ADS)

    Chen, Mei; Luo, Xin; Wu, Guowen; Tan, Yang; Kita, Kenji

    2013-03-01

    For the security problems of hierarchical P2P networks (HPNs), this paper presents a security clustering algorithm based on reputation (CABR). In the algorithm, a reputation mechanism ensures the security of transactions, and clusters are used to manage the reputation mechanism. In order to improve security, reduce the network cost incurred by reputation management, and enhance cluster stability, we select reputation, historical average online time, and network bandwidth as the basic factors of a node's comprehensive performance. Simulation results showed that the proposed algorithm improved security, reduced network overhead, and enhanced cluster stability.

  4. Signature extension through the application of cluster matching algorithms to determine appropriate signature transformations

    NASA Technical Reports Server (NTRS)

    Lambeck, P. F.; Rice, D. P.

    1976-01-01

    Signature extension is intended to increase the space-time range over which a set of training statistics can be used to classify data without significant loss of recognition accuracy. A first cluster matching algorithm MASC (Multiplicative and Additive Signature Correction) was developed at the Environmental Research Institute of Michigan to test the concept of using associations between training and recognition area cluster statistics to define an average signature transformation. A more recent signature extension module CROP-A (Cluster Regression Ordered on Principal Axis) has shown evidence of making significant associations between training and recognition area cluster statistics, with the clusters to be matched being selected automatically by the algorithm.
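    The "multiplicative and additive" correction amounts to fitting a per-band gain and offset between matched cluster statistics. A least-squares sketch on hypothetical matched cluster means (the cluster-matching step itself, which MASC and CROP-A automate, is assumed already done):

    ```python
    def fit_gain_offset(train_means, recog_means):
        """Least-squares fit of recog = gain * train + offset for one spectral
        band, given the means of already-matched training/recognition clusters."""
        n = len(train_means)
        mx = sum(train_means) / n
        my = sum(recog_means) / n
        sxx = sum((x - mx) ** 2 for x in train_means)
        sxy = sum((x - mx) * (y - my) for x, y in zip(train_means, recog_means))
        gain = sxy / sxx
        offset = my - gain * mx
        return gain, offset
    ```

    The fitted transform x -> gain * x + offset is then applied to the training signatures before classifying the recognition area, extending their space-time range of validity.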

  5. A fast readout algorithm for Cluster Counting/Timing drift chambers on a FPGA board

    NASA Astrophysics Data System (ADS)

    Cappelli, L.; Creti, P.; Grancagnolo, F.; Pepino, A.; Tassielli, G.

    2013-08-01

    A fast readout algorithm for Cluster Counting and Timing purposes has been implemented and tested on a Virtex 6 core FPGA board. The algorithm analyses and stores data coming from a Helium based drift tube instrumented by 1 GSPS fADC and represents the outcome of balancing between cluster identification efficiency and high speed performance. The algorithm can be implemented in electronics boards serving multiple fADC channels as an online preprocessing stage for drift chamber signals.
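    The core of cluster counting on an fADC sample stream can be sketched as rising-edge threshold crossings with a minimum separation (hypothetical threshold and dead time; the real algorithm runs pipelined on the FPGA and balances identification efficiency against throughput):

    ```python
    def count_clusters(samples, threshold, dead_time):
        """Return the sample indices of rising edges above `threshold`,
        ignoring re-crossings within `dead_time` samples of the previous
        accepted cluster (a crude pile-up guard)."""
        times, last = [], None
        prev = samples[0]
        for t in range(1, len(samples)):
            cur = samples[t]
            rising = prev <= threshold < cur
            if rising and (last is None or t - last >= dead_time):
                times.append(t)
                last = t
            prev = cur
        return times
    ```

    The returned indices give both the cluster count and the cluster arrival times, the two quantities a Cluster Counting/Timing readout must deliver per drift-tube signal.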

  6. Parallelization of the Wolff single-cluster algorithm.

    PubMed

    Kaupuzs, J; Rimsāns, J; Melnik, R V N

    2010-02-01

    A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models, and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of applications where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality, and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n ≥ 2. Furthermore, the application of the developed OpenMP code allows us to simulate larger lattices due to the greater operative (shared) memory available.

  7. Parallelization of the Wolff single-cluster algorithm

    NASA Astrophysics Data System (ADS)

    Kaupužs, J.; Rimšāns, J.; Melnik, R. V. N.

    2010-02-01

    A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models, and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of applications where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality, and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n ≥ 2. Furthermore, the application of the developed OpenMP code allows us to simulate larger lattices due to the greater operative (shared) memory available.
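    A single Wolff update is simple enough to show in full. This is a serial pure-Python sketch for the 2D Ising model (the paper parallelizes the 3D case with OpenMP); spins are stored in a dict keyed by lattice site:

    ```python
    import math, random

    def wolff_step(spin, L, beta, rng):
        """Grow one Wolff cluster from a random seed site and flip it.
        Bonds to aligned neighbours are activated with p = 1 - exp(-2*beta)."""
        p_add = 1.0 - math.exp(-2.0 * beta)
        seed = (rng.randrange(L), rng.randrange(L))
        s0 = spin[seed]
        cluster, stack = {seed}, [seed]
        while stack:
            x, y = stack.pop()
            neighbours = (((x + 1) % L, y), ((x - 1) % L, y),
                          (x, (y + 1) % L), (x, (y - 1) % L))
            for n in neighbours:
                if n not in cluster and spin[n] == s0 and rng.random() < p_add:
                    cluster.add(n)
                    stack.append(n)
        for site in cluster:       # flipping the whole cluster at once satisfies
            spin[site] = -s0       # detailed balance with no rejection step
        return len(cluster)
    ```

    Parallelizing this is nontrivial precisely because the cluster is grown sequentially from a single seed, which is what the paper's OpenMP decomposition addresses.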

  8. Using Clustering Algorithms to Identify Brown Dwarf Characteristics

    NASA Astrophysics Data System (ADS)

    Choban, Caleb

    2016-06-01

    Brown dwarfs are stars that are not massive enough to sustain core hydrogen fusion, and thus fade and cool over time. The molecular composition of brown dwarf atmospheres can be determined by observing absorption features in their infrared spectrum, which can be quantified using spectral indices. Comparing these indices to one another, we can determine what kind of brown dwarf it is, and if it is young or metal-poor. We explored a new method for identifying these subgroups through the expectation-maximization machine learning clustering algorithm, which provides a quantitative and statistical way of identifying index pairs which separate rare populations. We specifically quantified two statistics, completeness and concentration, to identify the best index pairs. Starting with a training set, we defined selection regions for young, metal-poor and binary brown dwarfs, and tested these on a large sample of L dwarfs. We present the results of this analysis, and demonstrate that new objects in these classes can be found through these methods.
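    Expectation-maximization clustering of the kind described above can be sketched in one dimension with two Gaussian components (pure Python on synthetic data; the study applies the method to pairs of spectral indices, not shown here):

    ```python
    import math

    def em_gmm_1d(xs, iters=60):
        """EM for a 2-component 1-D Gaussian mixture.
        Returns (weights, means, variances); init places means at min/max."""
        mu = [min(xs), max(xs)]
        var = [1.0, 1.0]
        w = [0.5, 0.5]
        for _ in range(iters):
            # E-step: responsibility of each component for each point
            resp = []
            for x in xs:
                p = [w[i] / math.sqrt(2 * math.pi * var[i])
                     * math.exp(-(x - mu[i]) ** 2 / (2 * var[i])) for i in range(2)]
                s = sum(p)
                resp.append([pi / s for pi in p])
            # M-step: re-estimate weights, means, variances
            for i in range(2):
                ni = sum(r[i] for r in resp)
                w[i] = ni / len(xs)
                mu[i] = sum(r[i] * x for r, x in zip(resp, xs)) / ni
                var[i] = max(1e-6, sum(r[i] * (x - mu[i]) ** 2
                                       for r, x in zip(resp, xs)) / ni)
        return w, mu, var
    ```

    In the brown-dwarf application, a selection region for a rare subgroup (young or metal-poor objects) would be drawn around the component that captures it, and completeness and concentration then score how well a given index pair isolates that component.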

  9. A vector reconstruction based clustering algorithm particularly for large-scale text collection.

    PubMed

    Liu, Ming; Wu, Chong; Chen, Lei

    2015-03-01

    Along with the fast evolution of internet technology, internet users face large amounts of textual data every day. Organizing texts into categories can help users extract useful information from large-scale text collections, and clustering is one of the most promising tools for categorizing texts due to its unsupervised character. Unfortunately, most traditional clustering algorithms degrade on large-scale text collections, mainly owing to the high-dimensional vector space and the semantic similarity among texts. To cluster large-scale text collections effectively and efficiently, this paper puts forward a vector-reconstruction-based clustering algorithm in which only the features that can represent a cluster are preserved in the cluster's representative vector. The algorithm alternates between two sub-processes until it converges. One is a partial tuning sub-process, where feature weights are fine-tuned by an iterative process similar to the self-organizing map (SOM) algorithm; to accelerate clustering, an intersection-based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The other is an overall tuning sub-process, where features are reallocated among different clusters and the features useless for representing a cluster are removed from its representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm performs well on both small-scale and large-scale text collections.

  10. The distribution of early- and late-type galaxies in the Coma cluster

    NASA Technical Reports Server (NTRS)

    Doi, M.; Fukugita, M.; Okamura, S.; Turner, E. L.

    1995-01-01

    The spatial distribution and the morphology-density relation of Coma cluster galaxies are studied using a new homogeneous photometric sample of 450 galaxies down to B = 16.0 mag with quantitative morphology classification. The sample covers a wide area (10 deg x 10 deg), extending well beyond the Coma cluster. Morphological classifications into early- (E+S0) and late- (S) type galaxies are made by an automated algorithm using simple photometric parameters, with which the misclassification rate is expected to be approximately 10% with respect to the early and late types given in the Third Reference Catalogue of Bright Galaxies. The flattened distribution of Coma cluster galaxies, as noted in previous studies, is seen most conspicuously if the early-type galaxies are selected. Early-type galaxies are distributed in a thick filament extended from the NE to the WSW direction that delineates a part of the large-scale structure. Spiral galaxies show a distribution with a modest density gradient toward the cluster center; at least the bright spiral galaxies are present close to the center of the Coma cluster. We also examine the morphology-density relation for the Coma cluster, including its surrounding regions.

  11. An efficient algorithm for estimating noise covariances in distributed systems

    NASA Technical Reports Server (NTRS)

    Dee, D. P.; Cohn, S. E.; Ghil, M.; Dalcher, A.

    1985-01-01

    An efficient computational algorithm for estimating the noise covariance matrices of large linear discrete stochastic-dynamic systems is presented. Such systems arise typically by discretizing distributed-parameter systems, and their size renders computational efficiency a major consideration. The proposed adaptive filtering algorithm is based on the ideas of Belanger, and is algebraically equivalent to his algorithm. The earlier algorithm, however, has computational complexity proportional to p^6, where p is the number of observations of the system state, while the new algorithm has complexity proportional to only p^3. Further, the formulation of noise covariance estimation as a secondary filter, analogous to state estimation as a primary filter, suggests several generalizations of the earlier algorithm. The performance of the proposed algorithm is demonstrated for a distributed system arising in numerical weather prediction.

  12. The distribution of nearby rich clusters of galaxies

    NASA Technical Reports Server (NTRS)

    Postman, Marc; Huchra, John P.; Geller, Margaret J.

    1992-01-01

    Redshifts are acquired for a complete sample of 351 Abell clusters with tenth-ranked galaxy magnitudes (m10) less than or equal to 16.5, including 115 entirely new cluster redshifts. Analysis of the spatial distribution of these clusters reveals no clustering on scales larger than 75/h Mpc. The correlation length is 20.0 (+/-4.3)/h Mpc, consistent with the results from other surveys. The frequency of voids with radii of order 60/h Mpc or less is consistent with the form and amplitude of the observed two-point correlation function. There is no significant difference between the clustering properties of clusters with richness class R = 0 and R >= 1. A percolation analysis yields 23 superclusters, 17 of which are new. The superclusters are not significantly elongated in the radial direction; large-scale peculiar motions are of order 1000 km/s or less.

  13. On the distribution of galaxy ellipticity in clusters

    NASA Astrophysics Data System (ADS)

    D'Eugenio, F.; Houghton, R. C. W.; Davies, R. L.; Dalla Bontà, E.

    2015-07-01

    We study the distribution of projected ellipticity n(ɛ) for galaxies in a sample of 20 rich (Richness ≥ 2) nearby (z < 0.1) clusters of galaxies. We find no evidence of differences in n(ɛ), although the nearest cluster in the sample (the Coma Cluster) is the largest outlier (P(same) < 0.05). We then study n(ɛ) within the clusters, and find that ɛ increases with projected cluster-centric radius R (hereafter the ɛ-R relation). This trend is preserved at fixed magnitude, showing that this relation exists over and above the trend of more luminous galaxies to be both rounder and more common in the centres of clusters. The ɛ-R relation is particularly strong in the subsample of intrinsically flattened galaxies (ɛ > 0.4), therefore it is not a consequence of the increasing fraction of round slow rotator galaxies near cluster centers. Furthermore, the ɛ-R relation persists for just smooth flattened galaxies and for galaxies with de Vaucouleurs-like light profiles, suggesting that the variation of the spiral fraction with radius is not the underlying cause of the trend. We interpret our findings in light of the classification of early type galaxies (ETGs) as fast and slow rotators. We conclude that the observed trend of decreasing ɛ towards the centres of clusters is evidence for physical effects in clusters causing fast rotator ETGs to have a lower average intrinsic ellipticity near the centres of rich clusters.

  14. Soft learning vector quantization and clustering algorithms based on non-Euclidean norms: single-norm algorithms.

    PubMed

    Karayiannis, Nicolaos B; Randolph-Gips, Mary M

    2005-03-01

    This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on a weighted norm to measure the distance between the feature vectors and their prototypes. The development of LVQ and clustering algorithms is based on the minimization of a reformulation function under the constraint that the generalized mean of the norm weights be constant. According to the proposed formulation, the norm weights can be computed from the data in an iterative fashion together with the prototypes. An error analysis provides some guidelines for selecting the parameter involved in the definition of the generalized mean in terms of the feature variances. The algorithms produced from this formulation are easy to implement and almost as fast as clustering algorithms relying on the Euclidean norm. An experimental evaluation on four data sets indicates that the proposed algorithms consistently outperform clustering algorithms relying on the Euclidean norm and are strong competitors to non-Euclidean algorithms, which are computationally more demanding.

  15. Mass distributions of star clusters for different star formation histories in a galaxy cluster environment

    NASA Astrophysics Data System (ADS)

    Schulz, C.; Pflamm-Altenburg, J.; Kroupa, P.

    2015-10-01

    Clusters of galaxies usually contain rich populations of globular clusters (GCs). We investigate how different star formation histories (SFHs) shape the final mass distribution of star clusters. We assumed that every star cluster population forms during a formation epoch of length δt at a constant star-formation rate (SFR). The mass distribution of such a population is described by the embedded cluster mass function (ECMF), which is a pure power law extending to an upper limit Mmax. Since the SFR determines Mmax, the ECMF implicitly depends on the SFR. Each SFH, i.e. the time evolution of the SFR, is divided into formation epochs of length δt at different SFRs. The desired mass function arises from the superposition of the star clusters of all formation epochs. An improved optimal sampling technique is introduced that allows generating number and mass distributions, both of which accurately agree with the ECMF. Moreover, for each SFH the distribution function of all involved SFRs, F(SFR), is computed. For monotonically decreasing SFHs, we found that F(SFR) always follows a power law. With F(SFR), we developed the theory of the integrated galactic embedded cluster mass function (IGECMF). The latter describes the distribution function of birth stellar masses of star clusters that accumulated over a formation episode much longer than δt. The IGECMF indeed reproduces the mass distribution of star clusters created according to the superposition principle. Interestingly, all considered SFHs lead to a turn-down with increasing star cluster mass in their respective IGECMFs, in a similar way as is observed for GC systems in different galaxy clusters, which offers the possibility of determining the conditions under which a GC system was assembled. Although a pure power-law ECMF is assumed, a Schechter-like IGECMF emerges from the superposition principle. In the past decade, a turn-down at the high-mass end has been observed in the cluster initial
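    A pure power-law ECMF truncated at an upper limit Mmax can be illustrated with a short inverse-CDF sampler. Note this draws masses randomly for illustration, whereas the paper uses (improved) optimal sampling rather than random draws; the slope and mass limits below are arbitrary assumptions:

```python
import numpy as np

def sample_ecmf(n, m_min, m_max, beta=2.0, seed=0):
    """Draw n cluster masses from dN/dM proportional to M**(-beta),
    truncated to [m_min, m_max], via inverse-CDF sampling."""
    rng = np.random.default_rng(seed)
    u = rng.random(n)
    if beta == 1.0:  # logarithmic special case of the inverse CDF
        return m_min * (m_max / m_min) ** u
    a = 1.0 - beta
    return (m_min**a + u * (m_max**a - m_min**a)) ** (1.0 / a)

# Illustrative draw: a beta = 2 power law between 10^2 and 10^6 solar masses.
masses = sample_ecmf(10000, 1e2, 1e6)
```

With beta = 2 the distribution is strongly bottom-heavy: the median mass sits near the lower limit, while the rare high-mass tail extends toward Mmax, which is why Mmax (set by the SFR) controls the shape of the superposed IGECMF.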

  16. Clustering performance comparison using K-means and expectation maximization algorithms

    PubMed Central

    Jung, Yong Gyu; Kang, Min Soo; Heo, Jun

    2014-01-01

    Clustering is an important means of data mining based on separating data categories by similar features. Unlike classification algorithms, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithms. Logistic regression extends linear regression analysis to category-type dependent variables by using a linear combination of independent variables to predict the probability that an event occurs. However, classifying all the data by means of logistic regression analysis alone cannot guarantee the accuracy of the results. In this paper, logistic regression analysis is applied to EM clusters and to the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results. PMID:26019610
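    The K-means versus EM comparison can be sketched with scikit-learn, where EM corresponds to fitting a Gaussian mixture. The synthetic blobs below stand in for the paper's red-wine quality features, which are an assumption here:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the wine-quality feature vectors.
X, y = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=42)

# K-means: hard assignments minimizing within-cluster sum of squares.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
# EM: soft assignments from a fitted Gaussian mixture, hardened by predict().
em_labels = GaussianMixture(n_components=3, random_state=0).fit(X).predict(X)

# Agreement with the generating labels, as one external quality metric.
km_ari = adjusted_rand_score(y, km_labels)
em_ari = adjusted_rand_score(y, em_labels)
```

On well-separated spherical blobs the two methods agree closely; EM's advantage appears when clusters have unequal covariances, since the mixture model estimates a covariance per component while K-means implicitly assumes identical spherical clusters.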

  17. A novel artificial immune algorithm for spatial clustering with obstacle constraint and its applications.

    PubMed

    Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji

    2014-01-01

    An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with an innovative obstacle distance measure for spatial clustering under obstacle constraints. First, we present a path searching algorithm to approximate the obstacle distance between two points in the presence of obstacles and facilitators. Taking the obstacle distance as the similarity metric, we then propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of the AICOE algorithm and classical clustering algorithms. Our clustering model based on an artificial immune system is also applied to a public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and a better clustering effect.

  18. Distribution-based fuzzy clustering of electrical resistivity tomography images for interface detection

    NASA Astrophysics Data System (ADS)

    Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.

    2014-04-01

    A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.
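    The fuzzy c-means core of the approach, together with the reciprocal-membership uncertainty idea, can be sketched in plain NumPy. This is a minimal textbook FCM on synthetic one-dimensional "resistivity" values; the paper's kernel-density guidance of the centroids is omitted here:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, n_iter=100, seed=0):
    """Minimal fuzzy c-means: returns centroids and membership matrix U
    (each row of U sums to 1)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m                                         # fuzzified weights
        centroids = (W.T @ X) / W.sum(axis=0)[:, None]     # weighted means
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))                      # standard FCM update
        U = inv / inv.sum(axis=1, keepdims=True)
    return centroids, U

# Two synthetic resistivity populations, e.g. overburden vs. bedrock.
rng = np.random.default_rng(1)
X = np.concatenate([rng.normal(10, 1, 50), rng.normal(100, 5, 50)])[:, None]
centroids, U = fuzzy_c_means(X, c=2)
# Ambiguous membership marks uncertain points near the interface.
uncertainty = 1.0 - U.max(axis=1)
```

Points deep inside either population get memberships near 1 and low uncertainty, while any point between the two populations would take intermediate memberships, which is the property the interface-detection step exploits.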

  19. A Special Local Clustering Algorithm for Identifying the Genes Associated With Alzheimer’s Disease

    PubMed Central

    Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R.; Rogers, Jack T.

    2010-01-01

    Clustering is the grouping of similar objects into a class. The local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed the special local clustering (SLC) algorithm, which was used to process gene microarray data related to Alzheimer’s disease (AD). The SLC algorithm was able to group together genes with similar expression patterns and to identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm to disease-associated gene identification, such as in AD, is rarely reported. PMID:20089478

  20. Towards enhancement of performance of K-means clustering using nature-inspired optimization algorithms.

    PubMed

    Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan

    2014-01-01

    Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730

  1. Towards Enhancement of Performance of K-Means Clustering Using Nature-Inspired Optimization Algorithms

    PubMed Central

    Deb, Suash; Yang, Xin-She

    2014-01-01

    Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730

  2. Towards enhancement of performance of K-means clustering using nature-inspired optimization algorithms.

    PubMed

    Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan

    2014-01-01

    Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.
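    The initialization sensitivity that motivates these hybrids is easy to demonstrate with scikit-learn. Multiple random restarts below are only a crude stand-in for the global search the swarm-based methods perform, not an implementation of them:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Many small, well-separated blobs create plenty of poor local optima.
X, _ = make_blobs(n_samples=500, centers=8, cluster_std=0.8, random_state=7)

# A single random initialization can stall in a poor local optimum ...
single = [KMeans(n_clusters=8, init="random", n_init=1, random_state=s).fit(X).inertia_
          for s in range(20)]
# ... whereas keeping the best of many starts (a crude stand-in for the
# global search of the nature-inspired hybrids) avoids the worst outcomes.
multi = KMeans(n_clusters=8, init="random", n_init=20, random_state=0).fit(X).inertia_
```

The spread of the twenty single-run inertia values shows how strongly the result depends on the initial centroids; a swarm-based optimizer plays the same role as the restarts but shares information between its search agents instead of running them independently.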

  3. Block clustering based on difference of convex functions (DC) programming and DC algorithms.

    PubMed

    Le, Hoai Minh; Le Thi, Hoai An; Dinh, Tao Pham; Huynh, Van Ngai

    2013-10-01

    We investigate difference of convex functions (DC) programming and the DC algorithm (DCA) to solve the block clustering problem in the continuous framework, which traditionally requires solving a hard combinatorial optimization problem. DC reformulation techniques and exact penalty in DC programming are developed to build an appropriate equivalent DC program of the block clustering problem. They lead to an elegant and explicit DCA scheme for the resulting DC program. Computational experiments show the robustness and efficiency of the proposed algorithm and its superiority over standard algorithms such as two-mode K-means, two-mode fuzzy clustering, and block classification EM.

  4. Optimization of composite structures by estimation of distribution algorithms

    NASA Astrophysics Data System (ADS)

    Grosset, Laurent

    The design of high performance composite laminates, such as those used in aerospace structures, leads to complex combinatorial optimization problems that cannot be addressed by conventional methods. These problems are typically solved by stochastic algorithms, such as evolutionary algorithms. This dissertation proposes a new evolutionary algorithm for composite laminate optimization, named Double-Distribution Optimization Algorithm (DDOA). DDOA belongs to the family of estimation of distribution algorithms (EDAs) that build a statistical model of promising regions of the design space based on sets of good points, and use it to guide the search. A generic framework for introducing statistical variable dependencies by making use of the physics of the problem is proposed. The algorithm uses two distributions simultaneously: the marginal distributions of the design variables, complemented by the distribution of auxiliary variables. The combination of the two generates complex distributions at a low computational cost. The dissertation demonstrates the efficiency of DDOA for several laminate optimization problems where the design variables are the fiber angles and the auxiliary variables are the lamination parameters. The results show that its reliability in finding the optima is greater than that of a simple EDA and of a standard genetic algorithm, and that its advantage increases with the problem dimension. A continuous version of the algorithm is presented and applied to a constrained quadratic problem. Finally, a modification of the algorithm incorporating probabilistic and directional search mechanisms is proposed. The algorithm exhibits a faster convergence to the optimum and opens the way for a unified framework for stochastic and directional optimization.

  5. Extensions of kmeans-type algorithms: a new clustering framework by integrating intracluster compactness and intercluster separation.

    PubMed

    Huang, Xiaohui; Ye, Yunming; Zhang, Haijun

    2014-08-01

    Kmeans-type clustering aims at partitioning a data set into clusters such that the objects in a cluster are compact and the objects in different clusters are well separated. However, most kmeans-type clustering algorithms rely on only intracluster compactness while overlooking intercluster separation. In this paper, a series of new clustering algorithms is proposed that extends the existing kmeans-type algorithms by integrating both intracluster compactness and intercluster separation. First, a set of new objective functions for clustering is developed. Based on these objective functions, the corresponding updating rules for the algorithms are then derived analytically. The properties and performances of these algorithms are investigated on several synthetic and real-life data sets. Experimental studies demonstrate that our proposed algorithms outperform the state-of-the-art kmeans-type clustering algorithms with respect to four metrics: accuracy, RandIndex, Fscore, and normalized mutual information.

  6. A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm.

    PubMed

    de Brito, Daniel M; Maracaja-Coutinho, Vinicius; de Farias, Savio T; Batista, Leonardo V; do Rêgo, Thaís G

    2016-01-01

    Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic island prediction have been proposed. However, none of them are capable of predicting precisely the complete repertoire of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distributions in different species. In this paper, we present a novel method to predict GIs that is built upon the mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP--Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me. PMID:26731657

  7. A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm

    PubMed Central

    de Brito, Daniel M.; Maracaja-Coutinho, Vinicius; de Farias, Savio T.; Batista, Leonardo V.; do Rêgo, Thaís G.

    2016-01-01

    Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic island prediction have been proposed. However, none of them are capable of predicting precisely the complete repertoire of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distributions in different species. In this paper, we present a novel method to predict GIs that is built upon the mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP—Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me. PMID:26731657

  8. A scalable and practical one-pass clustering algorithm for recommender system

    NASA Astrophysics Data System (ADS)

    Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali

    2015-12-01

    KMeans clustering-based recommendation algorithms have been proposed with the claim that they increase the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates as new data arrive, making them unsuitable for dynamic environments. Following this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
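    The abstract does not spell out the One-Pass algorithm, but the classic "leader" clustering scheme illustrates the general single-scan, incrementally updatable idea (the distance threshold and data below are assumptions for illustration):

```python
import numpy as np

def leader_cluster(points, threshold):
    """Classic one-pass 'leader' clustering: each point joins the first
    existing cluster whose leader lies within `threshold`, otherwise it
    starts a new cluster. A single scan, so newly arriving points can be
    absorbed incrementally without retraining."""
    leaders, labels = [], []
    for p in points:
        for i, leader in enumerate(leaders):
            if np.linalg.norm(p - leader) <= threshold:
                labels.append(i)
                break
        else:                      # no existing leader is close enough
            leaders.append(p)
            labels.append(len(leaders) - 1)
    return np.array(leaders), labels

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
leaders, labels = leader_cluster(pts, threshold=1.0)
# labels -> [0, 0, 1, 1]; two leaders at [0, 0] and [5, 5]
```

Unlike K-means, there is no iterative refinement over the whole data set, which is what makes such schemes attractive for dynamic recommender workloads, at the cost of sensitivity to presentation order and to the threshold choice.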

  9. Complex Network Clustering by a Multi-objective Evolutionary Algorithm Based on Decomposition and Membrane Structure

    PubMed Central

    Ju, Ying; Zhang, Songming; Ding, Ningxiang; Zeng, Xiangxiang; Zhang, Xingyi

    2016-01-01

    The field of complex network clustering has been gaining considerable attention in recent years. In this study, a multi-objective evolutionary algorithm based on membranes is proposed to solve the network clustering problem. The population is divided evenly among different membrane structures, the evolutionary algorithm is carried out within the membrane structures, and individuals are eliminated according to the vector of membranes. In the proposed method, two evaluation objectives, termed Kernel J-means and Ratio Cut, are minimized. Extensive experimental comparisons with state-of-the-art algorithms show that the proposed algorithm is effective and promising. PMID:27670156

  10. `Inter-Arrival Time' Inspired Algorithm and its Application in Clustering and Molecular Phylogeny

    NASA Astrophysics Data System (ADS)

    Kolekar, Pandurang S.; Kale, Mohan M.; Kulkarni-Kale, Urmila

    2010-10-01

    Bioinformatics, being a multidisciplinary field, involves applications of various methods from allied areas of science for data mining using computational approaches. Clustering and molecular phylogeny are key areas in bioinformatics that help in the study of the classification and evolution of organisms. Molecular phylogeny algorithms can be divided into distance-based and character-based methods, but most of these methods depend on pre-alignment of the sequences and become computationally intensive as the data grow, and hence demand alternative, efficient approaches. The 'inter-arrival time distribution' (IATD) is a popular concept in the theory of stochastic system modeling, but its potential in molecular data analysis has not been fully explored. The present study reports an application of IATDs in bioinformatics for clustering and molecular phylogeny. The proposed method computes the IATDs of nucleotides in genomic sequences. A distance function based on statistical parameters of the IATDs is proposed, and the distance matrix thus obtained is used for clustering and molecular phylogeny. The method is applied to a dataset of 3' non-coding region (NCR) sequences of Dengue virus type 3 (DENV-3), subtype III, reported in 2008. The phylogram thus obtained reveals the geographical distribution of the DENV-3 isolates. Sri Lankan DENV-3 isolates are further observed to cluster in two sub-clades corresponding to pre- and post-Dengue-hemorrhagic-fever emergence groups. These results are consistent with those reported earlier, which were obtained using pre-aligned sequence data as input. These findings encourage applications of the IATD-based method in molecular phylogenetic analysis in particular and data mining in general.
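    The inter-arrival time idea can be sketched in a few lines: for each nucleotide, collect the gaps between its successive occurrences and summarize them. The particular summary statistics and distance formula below are assumptions for illustration; the paper's exact choices are not given in the abstract:

```python
import numpy as np

def iatd_stats(seq, alphabet="ACGT"):
    """Mean and variance of the inter-arrival times (gaps between
    successive occurrences) of each nucleotide in a sequence."""
    stats = {}
    for base in alphabet:
        pos = [i for i, ch in enumerate(seq) if ch == base]
        gaps = np.diff(pos)
        stats[base] = (gaps.mean(), gaps.var()) if gaps.size else (0.0, 0.0)
    return stats

def iatd_distance(s1, s2):
    """Toy alignment-free distance: Euclidean distance between the stacked
    IATD summary statistics of two sequences."""
    f1, f2 = iatd_stats(s1), iatd_stats(s2)
    a = np.array([x for base in "ACGT" for x in f1[base]])
    b = np.array([x for base in "ACGT" for x in f2[base]])
    return float(np.linalg.norm(a - b))

d_same = iatd_distance("ACGTACGTACGT", "ACGTACGTACGT")      # identical -> 0.0
d_diff = iatd_distance("ACGTACGTACGT", "AAAACCCCGGGGTTTT")  # diverged -> > 0
```

A matrix of such pairwise distances can then feed any standard distance-based clustering or tree-building method, without ever aligning the sequences, which is the computational advantage the abstract emphasizes.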

  11. Performance Assessment of the Optical Transient Detector and Lightning Imaging Sensor. Part 2; Clustering Algorithm

    NASA Technical Reports Server (NTRS)

    Mach, Douglas M.; Christian, Hugh J.; Blakeslee, Richard; Boccippio, Dennis J.; Goodman, Steve J.; Boeck, William

    2006-01-01

    We describe the clustering algorithm used by the Lightning Imaging Sensor (LIS) and the Optical Transient Detector (OTD) for combining lightning pulse data into events, groups, flashes, and areas. Events are single pixels that exceed the LIS/OTD background level during a single frame (2 ms). Groups are clusters of events that occur within the same frame and in adjacent pixels. Flashes are clusters of groups that occur within 330 ms and either 5.5 km (for LIS) or 16.5 km (for OTD) of each other. Areas are clusters of flashes that occur within 16.5 km of each other. Many investigators are utilizing the LIS/OTD flash data; therefore, we test how variations in the event-group and group-flash clustering algorithms affect the flash count for a subset of the LIS data. We divided the subset into areas with low (1-3), medium (4-15), high (16-63), and very high (64+) flash counts to see how changes in the clustering parameters affect the flash rates in areas of these different sizes. We found that as long as the cluster parameters are within about a factor of two of the current values, the flash counts do not change by more than about 20%. The flash clustering algorithm used by the LIS and OTD sensors therefore produces flash rates that are relatively insensitive to reasonable variations in the clustering algorithms.
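    The group-to-flash step can be sketched with a greedy pass using the LIS thresholds quoted above (330 ms, 5.5 km). This is a simplification for illustration, not the actual LIS/OTD implementation; the sample groups are invented:

```python
import numpy as np

# Hypothetical pulse "groups": rows of (time in s, x in km, y in km).
groups = np.array([
    [0.000,  0.0, 0.0],
    [0.100,  1.0, 0.0],   # within 330 ms and 5.5 km of the first -> same flash
    [1.000,  0.5, 0.0],   # spatially close but too late -> new flash
    [1.050, 30.0, 0.0],   # close in time but too far away -> another flash
])

def cluster_flashes(groups, dt=0.330, dr=5.5):
    """Greedy group->flash clustering: a group joins an existing flash if
    it occurs within dt of that flash's latest group and within dr of the
    flash centroid; otherwise it starts a new flash."""
    flashes = []  # each flash is a list of group rows
    for g in groups:
        for f in flashes:
            centroid = np.mean([row[1:] for row in f], axis=0)
            if g[0] - f[-1][0] <= dt and np.linalg.norm(g[1:] - centroid) <= dr:
                f.append(g)
                break
        else:
            flashes.append([g])
    return flashes

flashes = cluster_flashes(groups)  # three flashes under these thresholds
```

Doubling or halving dt and dr in this sketch changes the assignments of only borderline groups, which mirrors the paper's finding that flash counts are insensitive to factor-of-two variations in the clustering parameters.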

  12. Improved Quantum Artificial Fish Algorithm Application to Distributed Network Considering Distributed Generation.

    PubMed

    Du, Tingsong; Hu, Yang; Ke, Xianting

    2015-01-01

    An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA is based on quantum computing, which offers exponential acceleration for heuristic algorithms; it uses quantum bits to encode the artificial fish and updates them, in the search for the optimal value, through the quantum revolving gate together with the preying behavior, following behavior, and variation of the quantum artificial fish. We then apply the proposed new algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) to simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems; the simulation results for a 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss in comparison with BAFSA, GAFSA, and QAFSA.

  13. A Novel Artificial Bee Colony Based Clustering Algorithm for Categorical Data

    PubMed Central

    2015-01-01

    Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data. PMID:25993469
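
    The "one-step k-modes procedure" mentioned above presumably performs a single assignment-and-update pass. A minimal sketch, assuming Hamming distance between categorical records and per-column majority modes (details the abstract does not spell out):

```python
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two records disagree."""
    return sum(x != y for x, y in zip(a, b))

def one_step_kmodes(data, modes):
    """One assignment + mode-update pass of k-modes for categorical data."""
    clusters = [[] for _ in modes]
    for rec in data:
        k = min(range(len(modes)), key=lambda i: hamming(rec, modes[i]))
        clusters[k].append(rec)
    new_modes = []
    for cl, old in zip(clusters, modes):
        if not cl:
            new_modes.append(old)  # keep the old mode for an empty cluster
            continue
        new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                               for col in zip(*cl)))
    return new_modes, clusters
```

    In the ABC framework, each employed bee would carry a candidate set of modes and use such a pass to refine it before the fitness evaluation.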

  14. An adaptive immune optimization algorithm with dynamic lattice searching operation for fast optimization of atomic clusters

    NASA Astrophysics Data System (ADS)

    Wu, Xia; Wu, Genhua

    2014-08-01

    Geometrical optimization of atomic clusters is performed by a development of the adaptive immune optimization algorithm (AIOA) with a dynamic lattice searching (DLS) operation (the AIOA-DLS method). Through a cycle of construction and searching of the dynamic lattice (DL), the DLS algorithm rapidly makes the clusters more regular and greatly reduces the potential energy. DLS can thus be used as an operation acting on the new individuals after the mutation operation in AIOA to improve the performance of the AIOA. The AIOA-DLS method combines the merit of evolutionary algorithms with the idea of a dynamic lattice. The performance of the proposed method is investigated in the optimization of Lennard-Jones clusters of up to 250 atoms and of silver clusters, described by the many-body Gupta potential, of up to 150 atoms. Results reported in the literature are reproduced, and the motif of the Ag61 cluster is found to be stacking-fault face-centered cubic, whose energy is lower than that of the previously obtained icosahedron.
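
    The Lennard-Jones objective minimized here is standard; a minimal energy function in reduced units (epsilon = sigma = 1), which any of the optimizers discussed could call on a candidate geometry:

```python
def lj_energy(coords):
    """Total Lennard-Jones potential energy of a cluster in reduced units:
    E = sum over pairs of 4 * (r^-12 - r^-6)."""
    e = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            inv6 = 1.0 / r2 ** 3  # r^-6
            e += 4.0 * (inv6 * inv6 - inv6)
    return e
```

    A dimer at the equilibrium separation r = 2^(1/6) has energy exactly -1 in these units, a handy sanity check for any implementation.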

  15. Hyperspectral image clustering method based on artificial bee colony algorithm and Markov random fields

    NASA Astrophysics Data System (ADS)

    Sun, Xu; Yang, Lina; Gao, Lianru; Zhang, Bing; Li, Shanshan; Li, Jun

    2015-01-01

    Center-oriented hyperspectral image clustering methods have been widely applied to hyperspectral remote sensing image processing; however, the drawbacks are obvious, including over-simple computing models and underutilized spatial information. In recent years, some studies have tried to improve this situation. We introduce the artificial bee colony (ABC) and Markov random field (MRF) algorithms and propose an ABC-MRF-cluster model to solve the problems mentioned above. In this model, a typical ABC algorithm framework is adopted in which cluster centers and the iterated conditional modes algorithm's results are treated as the feasible solutions and the objective function, respectively, and MRF is modified to be capable of dealing with the clustering problem. Finally, four datasets and two indices are used to show that the ABC-cluster and ABC-MRF-cluster methods can achieve better accuracy than conventional methods. Specifically, the ABC-cluster method is superior in terms of power of spectral discrimination, whereas the ABC-MRF-cluster method provides better results in terms of the adjusted Rand index. In experiments on simulated images with different signal-to-noise ratios, ABC-cluster and ABC-MRF-cluster showed good stability.

  16. Estimating the abundance of clustered animal population by using adaptive cluster sampling and negative binomial distribution

    NASA Astrophysics Data System (ADS)

    Bo, Yizhou; Shifa, Naima

    2013-09-01

    An estimator for finding the abundance of a rare, clustered, and mobile population is introduced. The model is based on adaptive cluster sampling (ACS) to identify the locations of the population and on the negative binomial distribution to estimate the total in each site. To identify the locations of the population, we consider both sampling with replacement (WR) and sampling without replacement (WOR). Some mathematical properties of the model are also developed.
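
    The abstract does not state the estimator itself. As a hedged illustration of the negative-binomial ingredient only, a method-of-moments fit of the parameters r and p from per-site counts (the ACS design and the paper's actual estimator are more involved):

```python
def negbin_moments(counts):
    """Method-of-moments fit of a negative binomial to site counts:
    with sample mean m and variance v, r = m^2 / (v - m), p = m / v.
    Requires overdispersion (v > m), the case the NB model is meant for."""
    n = len(counts)
    m = sum(counts) / n
    v = sum((c - m) ** 2 for c in counts) / n
    if v <= m:
        raise ValueError("no overdispersion: variance must exceed mean")
    return m * m / (v - m), m / v
```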

  17. Label propagation algorithm based on edge clustering coefficient for community detection in complex networks

    NASA Astrophysics Data System (ADS)

    Zhang, Xian-Kun; Tian, Xue; Li, Ya-Nan; Song, Chen

    2014-08-01

    The label propagation algorithm (LPA) is a graph-based semi-supervised learning algorithm that can predict the information of unlabeled nodes from a few labeled nodes. It is a community detection method in the field of complex networks. The algorithm is easy to implement, has low complexity, produces remarkable results, and is widely applied in various fields. However, the randomness of the label propagation leads to poor robustness, and the classification result is unstable. This paper proposes an LPA based on the edge clustering coefficient. Each node in the network selects the neighbor whose connecting edge has the highest edge clustering coefficient to update its label, rather than a random neighbor, so that the random spread of labels is effectively restrained. The experimental results show that the LPA based on the edge clustering coefficient improves the stability and accuracy of the algorithm.
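
    The abstract does not define the edge clustering coefficient; a common definition, assumed in this sketch, is (triangles on the edge + 1) / min(deg(u) - 1, deg(v) - 1), with the coefficient taken as infinite when either endpoint has degree 1:

```python
def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v) = (common neighbors of u and v + 1)
                   / min(deg(u) - 1, deg(v) - 1).
    adj maps each node to the set of its neighbors."""
    z = len(adj[u] & adj[v])  # triangles containing edge (u, v)
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return float("inf") if denom == 0 else (z + 1) / denom
```

    In the proposed LPA variant, a node would compute this coefficient for each incident edge and copy the label of the neighbor across the highest-scoring edge.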

  18. A Local Scalable Distributed Expectation Maximization Algorithm for Large Peer-to-Peer Networks

    NASA Technical Reports Server (NTRS)

    Bhaduri, Kanishka; Srivastava, Ashok N.

    2009-01-01

    This paper offers a local distributed algorithm for expectation maximization in large peer-to-peer environments. The algorithm can be used for a variety of well-known data mining tasks in a distributed environment, such as clustering, anomaly detection, and target tracking, to name a few. This technology is crucial for many emerging peer-to-peer applications in bioinformatics, astronomy, social networking, sensor networks, and web mining. Centralizing all or some of the data for building global models is impractical in such peer-to-peer environments because of the large number of data sources, the asynchronous nature of peer-to-peer networks, and the dynamic nature of the data/network. The distributed algorithm we develop in this paper is provably correct, i.e., it converges to the same result as a similar centralized algorithm, and it can automatically adapt to changes in the data and the network. We show that the communication overhead of the algorithm is very low due to its local nature. This monitoring algorithm is then used as a feedback loop to sample data from the network and rebuild the model when it is outdated. We present thorough experimental results to verify our theoretical claims.

  19. A fast general-purpose clustering algorithm based on FPGAs for high-throughput data processing

    NASA Astrophysics Data System (ADS)

    Annovi, A.; Beretta, M.

    2010-05-01

    We present a fast general-purpose algorithm for high-throughput clustering of data "with a two-dimensional organization". The algorithm is designed to be implemented with FPGAs or custom electronics. The key feature is a processing time that scales linearly with the amount of data to be processed. This means that clustering can be performed in pipeline with the readout, without suffering from combinatorial delays due to looping multiple times through all the data. This feature makes the algorithm especially well suited for problems where the data have high density, e.g. in the case of tracking devices working under high-luminosity conditions such as those of the LHC or super-LHC. The algorithm is organized in two steps: the first step (core) clusters the data; the second step analyzes each cluster of data to extract the desired information. The current algorithm is developed as a clustering device for modern high-energy physics pixel detectors. However, the algorithm has a much broader field of applications. In fact, its core does not specifically rely on the kind of data or detector it works with, while the second step can and should be tailored to a given application. For example, in the case of spatial measurements with silicon pixel detectors, the second step performs the center-of-charge calculation. Applications can thus be foreseen for other detectors and other scientific fields, ranging from HEP calorimeters to medical imaging. An additional advantage of this two-step approach is that the typical clustering-related calculations (second step) are separated from the combinatorial complications of clustering. This separation simplifies the design of the second step and enables it to perform sophisticated calculations, achieving offline quality in online applications. The algorithm is general-purpose in the sense that only minimal assumptions about the kind of clustering to be performed are made.
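
    A software analogue of the two-step structure might look like the following: step 1 groups fired pixels into 8-connected components, step 2 computes each cluster's center of charge. The FPGA pipeline itself works very differently; this only illustrates the decomposition:

```python
def cluster_pixels(hits):
    """Step 1: group fired pixels into 8-connected clusters.
    Step 2: compute each cluster's center of charge.
    hits: dict {(row, col): charge}. Returns [((row, col) centroid, charge)]."""
    seen, clusters = set(), []
    for start in hits:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # flood fill over the 8-neighborhood
            r, c = stack.pop()
            comp.append((r, c))
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in hits and nb not in seen:
                        seen.add(nb)
                        stack.append(nb)
        q = sum(hits[p] for p in comp)
        cr = sum(p[0] * hits[p] for p in comp) / q
        cc = sum(p[1] * hits[p] for p in comp) / q
        clusters.append(((cr, cc), q))
    return clusters
```

    Swapping the center-of-charge calculation for, say, an energy sum is exactly the kind of second-step tailoring the abstract describes.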

  20. Parallel matrix transpose algorithms on distributed memory concurrent computers

    SciTech Connect

    Choi, J.; Walker, D.W.; Dongarra, J.J. |

    1993-10-01

    This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. It is assumed that the matrix is distributed over a P x Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The communication schemes of the algorithms are determined by the greatest common divisor (GCD) of P and Q. If P and Q are relatively prime, the matrix transpose algorithm involves complete exchange communication. If P and Q are not relatively prime, processors are divided into GCD groups and the communication operations are overlapped for different groups of processors. Processors transpose GCD wrapped diagonal blocks simultaneously, and the matrix can be transposed with LCM/GCD steps, where LCM is the least common multiple of P and Q. The algorithms make use of non-blocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
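
    The step count quoted above follows directly from P and Q; a small helper, under the assumption that "LCM/GCD steps" means LCM(P, Q) / GCD(P, Q):

```python
from math import gcd

def transpose_comm_steps(P, Q):
    """Number of communication steps, LCM(P, Q) / GCD(P, Q), for the block
    scattered matrix transpose on a P x Q processor template."""
    g = gcd(P, Q)
    lcm = P * Q // g
    return lcm // g
```

    For relatively prime P and Q the count equals P*Q, matching the complete-exchange case; for P = Q it collapses to a single step.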

  1. Illinois Occupational Skill Standards: Finishing and Distribution Cluster.

    ERIC Educational Resources Information Center

    Illinois Occupational Skill Standards and Credentialing Council, Carbondale.

    This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the finishing and distribution cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards…

  2. Learning assignment order of instances for the constrained K-means clustering algorithm.

    PubMed

    Hong, Yi; Kwong, Sam

    2009-04-01

    The sensitivity of the constrained K-means clustering algorithm (Cop-Kmeans) to the assignment order of instances is studied, and a novel assignment order learning method for Cop-Kmeans, termed the clustering Uncertainty-based Assignment order Learning Algorithm (UALA), is proposed in this paper. The main idea of UALA is to rank all instances in the data set according to their clustering uncertainties, calculated using ensembles of multiple clustering algorithms. Experimental results on several real data sets with artificial instance-level constraints demonstrate that UALA can identify a good assignment order of instances for Cop-Kmeans. In addition, the effects of ensemble size on the performance of UALA are analyzed, and the generalization property of Cop-Kmeans is also studied.

  3. A New Collaborative Recommendation Approach Based on Users Clustering Using Artificial Bee Colony Algorithm

    PubMed Central

    Ju, Chunhua

    2013-01-01

    Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on the K-means clustering algorithm. In the clustering process, we use the artificial bee colony (ABC) algorithm to overcome the local-optimum problem caused by K-means. After that, we adopt the modified cosine similarity to compute the similarity between users in the same cluster. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on the benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on user clustering outperforms many other recommendation methods. PMID:24381525
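
    The "modified cosine similarity" is not defined in the abstract; a common variant subtracts each user's mean rating before taking the cosine over co-rated items, and that assumption is used in this sketch:

```python
import math

def modified_cosine(ratings_u, ratings_v):
    """Cosine similarity over co-rated items with each user's mean rating
    subtracted. ratings_*: dict {item: rating}."""
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    mu = sum(ratings_u.values()) / len(ratings_u)
    mv = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mu) * (ratings_v[i] - mv) for i in common)
    du = math.sqrt(sum((ratings_u[i] - mu) ** 2 for i in common))
    dv = math.sqrt(sum((ratings_v[i] - mv) ** 2 for i in common))
    return 0.0 if du == 0 or dv == 0 else num / (du * dv)
```

    Mean-centering compensates for users who rate systematically high or low, which plain cosine similarity ignores.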

  4. An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network

    PubMed Central

    Vimalarani, C.; Subramanian, R.; Sivanandam, S. N.

    2016-01-01

    A Wireless Sensor Network (WSN) is a network formed from a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, temperature, water level, and pressure, as well as in health care and various military applications. Sensor nodes are mostly equipped with self-supported battery power, through which they can perform adequate operations and communication among neighboring nodes. To maximize the lifetime of a wireless sensor network, energy conservation measures are essential for improving its performance. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for wireless sensor networks, in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm with the aim of minimizing power consumption in the WSN. The performance metrics are evaluated, and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption. PMID:26881273

  5. An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network.

    PubMed

    Vimalarani, C; Subramanian, R; Sivanandam, S N

    2016-01-01

    A Wireless Sensor Network (WSN) is a network formed from a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, temperature, water level, and pressure, as well as in health care and various military applications. Sensor nodes are mostly equipped with self-supported battery power, through which they can perform adequate operations and communication among neighboring nodes. To maximize the lifetime of a wireless sensor network, energy conservation measures are essential for improving its performance. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for wireless sensor networks, in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm with the aim of minimizing power consumption in the WSN. The performance metrics are evaluated, and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption. PMID:26881273

  6. A hybrid algorithm for clustering of time series data based on affinity search technique.

    PubMed

    Aghabozorgi, Saeed; Ying Wah, Teh; Herawan, Tutut; Jalab, Hamid A; Shaygan, Mohammad Amin; Jalali, Alireza

    2014-01-01

    Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped into subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model has two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using synthetic and real-world time series datasets.

  7. A randomized algorithm for two-cluster partition of a set of vectors

    NASA Astrophysics Data System (ADS)

    Kel'manov, A. V.; Khandeev, V. I.

    2015-02-01

    A randomized algorithm is substantiated for the strongly NP-hard problem of partitioning a finite set of vectors of Euclidean space into two clusters of given sizes according to the minimum-sum-of-squared-distances criterion. It is assumed that the centroid of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The centroid of the other cluster is fixed at the origin. For an established parameter value, the algorithm finds an approximate solution of the problem in time that is linear in the space dimension and the input size of the problem for given values of the relative error and failure probability. The conditions are established under which the algorithm is asymptotically exact and runs in time that is linear in the space dimension and quadratic in the input size of the problem.

  8. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis.

    PubMed

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to advances in sensor technology, growing large medical image data make it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes, and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features that provide no additional information for clustering analysis. The widely used methods for removing irrelevant features are sparse clustering algorithms that use a lasso-type penalty to select the features. However, the accuracy of clustering with a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which are difficult to determine in practice. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms current sparse clustering algorithms in image cluster analysis. PMID:26196383

  9. Functional Principal Component Analysis and Randomized Sparse Clustering Algorithm for Medical Image Analysis

    PubMed Central

    Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao

    2015-01-01

    Due to advances in sensor technology, growing large medical image data make it possible to visualize anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes, and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features that provide no additional information for clustering analysis. The widely used methods for removing irrelevant features are sparse clustering algorithms that use a lasso-type penalty to select the features. However, the accuracy of clustering with a lasso-type penalty depends on the selection of the penalty parameters and the threshold value, which are difficult to determine in practice. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms current sparse clustering algorithms in image cluster analysis. PMID:26196383

  10. An approximation polynomial-time algorithm for a sequence bi-clustering problem

    NASA Astrophysics Data System (ADS)

    Kel'manov, A. V.; Khamidullin, S. A.

    2015-06-01

    We consider a strongly NP-hard problem of partitioning a finite sequence of vectors in Euclidean space into two clusters using the criterion of the minimal sum of the squared distances from the elements of the clusters to the centers of the clusters. The center of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The center of the other cluster is fixed at the origin. Moreover, the partition is such that the difference between the indices of two successive vectors in the first cluster is bounded above and below by prescribed constants. A 2-approximation polynomial-time algorithm is proposed for this problem.

  11. An algorithm for image clusters detection and identification based on color for an autonomous mobile robot

    SciTech Connect

    Uy, D.L.

    1996-02-01

    An algorithm for detection and identification of image clusters or "blobs" based on color information for an autonomous mobile robot is developed. The input image data are first processed using a crisp color fuzzifier, a binary smoothing filter, and a median filter. The processed image data are then input to the image cluster detection and identification program. The program employs the concept of an "elastic rectangle" that stretches in such a way that the whole blob is finally enclosed in a rectangle. A C program was developed to test the algorithm. The algorithm has been tested only on image data of 8x8 size with different numbers of blobs in them. The algorithm works very well in detecting and identifying image clusters.
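
    A minimal reading of the "elastic rectangle" idea: grow a rectangle from a seed pixel until no blob pixel remains adjacent to it. The original C program is not available, so this sketch is an interpretation, not the paper's implementation:

```python
def elastic_rectangle(image, seed):
    """Stretch a rectangle from a seed pixel until it encloses the whole
    blob: repeatedly grow the bounds to cover any nonzero pixel in the
    one-pixel ring around the current rectangle.
    Returns (row_min, col_min, row_max, col_max)."""
    rows, cols = len(image), len(image[0])
    r0 = r1 = seed[0]
    c0 = c1 = seed[1]
    changed = True
    while changed:
        changed = False
        for r in range(max(r0 - 1, 0), min(r1 + 2, rows)):
            for c in range(max(c0 - 1, 0), min(c1 + 2, cols)):
                if image[r][c] and not (r0 <= r <= r1 and c0 <= c <= c1):
                    r0, r1 = min(r0, r), max(r1, r)
                    c0, c1 = min(c0, c), max(c1, c)
                    changed = True
    return (r0, c0, r1, c1)
```

    Because the rectangle regrows its search ring after every expansion, diagonally connected blob pixels are eventually swept in, matching the blob-enclosing behavior described above.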

  12. A review of estimation of distribution algorithms in bioinformatics

    PubMed Central

    Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro

    2008-01-01

    Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the majority of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112

  13. A fast density-based clustering algorithm for real-time Internet of Things stream.

    PubMed

    Amini, Amineh; Saboohi, Hadi; Wah, Teh Ying; Herawan, Tutut

    2014-01-01

    Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of algorithms for clustering data streams: they can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Density-based clustering is therefore a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering within limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a fast processing time, making it applicable to real-time IoT applications. Experimental results show that the proposed approach obtains high-quality results with low computation time on real and synthetic datasets.
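
    A single-pass micro-clustering sketch in the spirit of density-based stream methods (the paper's algorithm, with its density estimates and outlier handling, is more elaborate); eps is a hypothetical absorption radius:

```python
def stream_cluster(points, eps):
    """Single-pass micro-clustering: each arriving point is absorbed by the
    nearest micro-cluster centre within eps, otherwise it opens a new one.
    Keeps (n, sum_x, sum_y) per micro-cluster so centres update in O(1)."""
    mcs = []  # each micro-cluster: [n, sum_x, sum_y]
    for x, y in points:
        best, bd = None, eps
        for mc in mcs:
            cx, cy = mc[1] / mc[0], mc[2] / mc[0]
            d = ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5
            if d <= bd:
                best, bd = mc, d
        if best is None:
            mcs.append([1, x, y])
        else:
            best[0] += 1
            best[1] += x
            best[2] += y
    return [(sx / n, sy / n) for n, sx, sy in mcs]
```

    Maintaining only additive summaries per micro-cluster is what keeps the per-point cost constant, which is the property stream methods need for real-time operation.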

  14. A Fast Density-Based Clustering Algorithm for Real-Time Internet of Things Stream

    PubMed Central

    Ying Wah, Teh

    2014-01-01

    Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of algorithms for clustering data streams: they can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Density-based clustering is therefore a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering within limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a fast processing time, making it applicable to real-time IoT applications. Experimental results show that the proposed approach obtains high-quality results with low computation time on real and synthetic datasets. PMID:25110753

  15. Feature Subset Selection by Estimation of Distribution Algorithms

    SciTech Connect

    Cantu-Paz, E

    2002-01-17

    This paper describes the application of four evolutionary algorithms to the identification of feature subsets for classification problems. Besides a simple GA, the paper considers three estimation of distribution algorithms (EDAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine if the EDAs present advantages over the simple GA in terms of accuracy or speed in this problem. The experiments used a Naive Bayes classifier and public-domain and artificial data sets. In contrast with previous studies, we did not find evidence to support or reject the use of EDAs for this problem.

  16. Impacts of Time Delays on Distributed Algorithms for Economic Dispatch

    SciTech Connect

    Yang, Tao; Wu, Di; Sun, Yannan; Lian, Jianming

    2015-07-26

    The economic dispatch problem (EDP) is an important problem in power systems. It can be formulated as an optimization problem with the objective of minimizing the total generation cost subject to the power balance constraint and generator capacity limits. Recently, several consensus-based algorithms have been proposed to solve the EDP in a distributed manner. However, the impacts of communication time delays on these distributed algorithms are not fully understood, especially for the case where the communication network is directed, i.e., the information exchange is unidirectional. This paper investigates the effects of communication time delays on a distributed algorithm for directed communication networks. The algorithm has been tested by applying time delays to different types of information exchange. Several case studies are carried out to evaluate the effectiveness and performance of the algorithm in the presence of time delays in communication networks. It is found that time delays have a negative effect on the convergence rate and can even cause the algorithm to converge to an incorrect value or fail to converge at all.
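
    The delay effect can be illustrated on a toy consensus iteration over a directed ring, where each node mixes its own state with a (possibly delayed) state from its in-neighbor. This is not the paper's dispatch algorithm, only a minimal model of delayed information exchange on a directed network:

```python
def consensus(values, steps, delay=0):
    """Consensus on a directed ring: node i averages its own state with the
    state of node i-1 from `delay` iterations ago. With delay=0 this is a
    doubly stochastic update that converges to the exact average."""
    n = len(values)
    hist = [list(values)]
    for _ in range(steps):
        cur = hist[-1]
        lagged = hist[max(0, len(hist) - 1 - delay)]
        nxt = [0.5 * cur[i] + 0.5 * lagged[(i - 1) % n] for i in range(n)]
        hist.append(nxt)
    return hist[-1]
```

    Running this with increasing delay shows the spread between node states shrinking more slowly per iteration, the convergence-rate degradation the case studies above quantify for the dispatch setting.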

  17. Intracluster light in the Virgo cluster: large scale distribution

    NASA Astrophysics Data System (ADS)

    Castro-Rodriguéz, N.; Arnaboldi, M.; Aguerri, J. A. L.; Gerhard, O.; Okamura, S.; Yasuda, N.; Freeman, K. C.

    2009-11-01

    Aims: The intracluster light (ICL) is a faint diffuse stellar component of clusters made of stars that are not bound to individual galaxies. We have carried out a large scale study of this component in the nearby Virgo cluster. Methods: The diffuse light is traced using planetary nebulae (PNe). The surveyed areas were observed with a narrow-band filter centered on the [OIII]λ 5007 Å emission line redshifted to the Virgo cluster distance (the on-band image), and a broad-band filter (the off-band image). For some fields, additional narrow-band imaging data corresponding to the Hα emission were also obtained. The PNe are detected in the on-band image due to their strong emission in the [OIII]λ 5007 Å line, but disappear in the off-band image. The contribution of Ly-α emitters at z=3.14 is corrected statistically using blank field surveys when the Hα image at the field position is not available. Results: We have surveyed a total area of 3.3 square degrees in the Virgo cluster with eleven fields located at different radial distances. Those fields located at radii smaller than 80 arcmin from the cluster center contain most of the detected diffuse light. In this central region of the cluster, the ICL has a surface brightness in the range μB = 28.8-30 mag arcsec-2, it is not uniformly distributed, and it represents about 7% of the total galaxy light in this area. At distances larger than 80 arcmin the ICL is confined to single fields and individual sub-structures, e.g. in the sub-clump B, the M 60/M 59 group. For several fields at 2 and 3 degrees from the Virgo cluster center we set only upper limits. Conclusions: These results indicate that the ICL is not homogeneously distributed in the Virgo core, and it is concentrated in the high density regions of the Virgo cluster, e.g. the cluster core and other sub-structures. Outside these regions, the ICL is confined within areas of ~100 kpc in size, where tidal effects may be at work. These observational results link the

  18. Distributed Query Plan Generation Using Multiobjective Genetic Algorithm

    PubMed Central

    Panicker, Shina; Vijay Kumar, T. V.

    2014-01-01

A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using a single-objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, the DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being to minimize total LPC and to minimize total CC. These objectives are simultaneously optimized using the multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single-objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability. PMID:24963513
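The biobjective setup rests on the Pareto-dominance test that NSGA-II builds on: one plan dominates another if it is no worse on both total LPC and total CC and strictly better on at least one. The sketch below is a minimal, hypothetical rendering; the plan cost pairs are invented, and the full NSGA-II machinery (nondominated sorting into fronts, crowding distance, genetic operators) is omitted.

```python
def dominates(a, b):
    """True if plan cost vector a = (lpc, cc) Pareto-dominates b."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(plans):
    """Return the nondominated plans (the rank-1 front in NSGA-II terms)."""
    return [p for p in plans
            if not any(dominates(q, p) for q in plans if q is not p)]

# Hypothetical query plans given as (total LPC, total CC) pairs.
plans = [(10, 40), (15, 25), (30, 20), (12, 45), (35, 35)]
front = pareto_front(plans)
```

Here (12, 45) is dominated by (10, 40), which is at least as good on both objectives, so only the genuine LPC/CC trade-offs survive to the front.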

  19. Mesh Algorithms for PDE with Sieve I: Mesh Distribution

    DOE PAGES

    Knepley, Matthew G.; Karpeev, Dmitry A.

    2009-01-01

    We have developed a new programming framework, called Sieve, to support parallel numerical partial differential equation(s) (PDE) algorithms operating over distributed meshes. We have also developed a reference implementation of Sieve in C++ as a library of generic algorithms operating on distributed containers conforming to the Sieve interface. Sieve makes instances of the incidence relation, or arrows, the conceptual first-class objects represented in the containers. Further, generic algorithms acting on this arrow container are systematically used to provide natural geometric operations on the topology and also, through duality, on the data. Finally, coverings and duality are used to encode notmore » only individual meshes, but all types of hierarchies underlying PDE data structures, including multigrid and mesh partitions. In order to demonstrate the usefulness of the framework, we show how the mesh partition data can be represented and manipulated using the same fundamental mechanisms used to represent meshes. We present the complete description of an algorithm to encode a mesh partition and then distribute a mesh, which is independent of the mesh dimension, element shape, or embedding. Moreover, data associated with the mesh can be similarly distributed with exactly the same algorithm. The use of a high level of abstraction within the Sieve leads to several benefits in terms of code reuse, simplicity, and extensibility. We discuss these benefits and compare our approach to other existing mesh libraries.« less

  20. Structural evolution and electronic properties of medium-sized gallium clusters from ab initio genetic algorithm search.

    PubMed

    Sai, Linwei; Zhao, Jijun; Huang, Xiaoming; Wang, Jun

    2012-01-01

Using a genetic algorithm combined with density functional theory, we have explored the size evolution of the structural and electronic properties of neutral gallium clusters of 20-40 atoms in terms of their ground state structures, binding energies, second differences of energy, HOMO-LUMO gaps, distributions of bond length and bond angle, and electron density of states. In the size range studied, the Ga(n) clusters exhibit several growth patterns, and core-shell structures become dominant from Ga31. With high point group symmetries, Ga23 and Ga36 show particularly high stability, and Ga36 has a large HOMO-LUMO gap. The atomic structures and electronic states of Ga(n) clusters differ significantly from those of the α solid but resemble the β solid and the liquid to a certain extent.

  1. Evaluating clustering methods within the Artificial Ecosystem Algorithm and their application to bike redistribution in London.

    PubMed

    Adham, Manal T; Bentley, Peter J

    2016-08-01

This paper proposes and evaluates a solution to the truck redistribution problem prominent in London's Santander Cycle scheme. Due to the complexity of this NP-hard combinatorial optimisation problem, no efficient optimisation techniques are known to solve the problem exactly. This motivates our use of the heuristic Artificial Ecosystem Algorithm (AEA) to find good solutions in a reasonable amount of time. The AEA is designed to take advantage of highly distributed computer architectures and adapt to changing problems. In the AEA, a problem is first decomposed into its constituent sub-components; these then evolve solution building blocks that fit together to form a single optimal solution. Three variants of the AEA centred on evaluating clustering methods are presented: the baseline AEA, the community-based AEA, which groups stations according to journey flows, and the Adaptive AEA, which actively modifies clusters to cater for changes in demand. We applied these AEA variants to the redistribution problem prominent in bike share schemes (BSS). The AEA variants are empirically evaluated using historical data from Santander Cycles to validate the proposed approach and demonstrate its potential effectiveness.

  2. The distribution of dark matter in the A2256 cluster

    NASA Technical Reports Server (NTRS)

    Henry, J. Patrick; Briel, Ulrich G.; Nulsen, Paul E. J.

    1993-01-01

Using spatially resolved X-ray spectroscopy, it was determined that the X-ray emitting gas in the rich cluster A2256 is nearly isothermal to a radius of at least 0.76/h Mpc, or about three core radii. These data can be used to measure the distribution of the dark matter in the cluster. It was found that the total mass interior to 0.76/h Mpc and 1.5/h Mpc is (0.5 +/- 0.1 and 1.0 +/- 0.5) x 10(exp 15)/h solar masses, respectively, where the errors encompass the full range allowed by all models used. Thus, the mass appropriate to the region where spectral information was obtained is well determined, but the uncertainties become large upon extrapolating beyond that region. It is shown that the galaxy orbits are mildly anisotropic, which may cause the beta discrepancy in this cluster.

  3. Fuzzy-rough supervised attribute clustering algorithm and classification of microarray data.

    PubMed

    Maji, Pradipta

    2011-02-01

    One of the major tasks with gene expression data is to find groups of coregulated genes whose collective expression is strongly associated with sample categories. In this regard, a new clustering algorithm, termed as fuzzy-rough supervised attribute clustering (FRSAC), is proposed to find such groups of genes. The proposed algorithm is based on the theory of fuzzy-rough sets, which directly incorporates the information of sample categories into the gene clustering process. A new quantitative measure is introduced based on fuzzy-rough sets that incorporates the information of sample categories to measure the similarity among genes. The proposed algorithm is based on measuring the similarity between genes using the new quantitative measure, whereby redundancy among the genes is removed. The clusters are refined incrementally based on sample categories. The effectiveness of the proposed FRSAC algorithm, along with a comparison with existing supervised and unsupervised gene selection and clustering algorithms, is demonstrated on six cancer and two arthritis data sets based on the class separability index and predictive accuracy of the naive Bayes' classifier, the K-nearest neighbor rule, and the support vector machine. PMID:20542768

  4. The Gas Distribution in Galaxy Cluster Outer Regions

    NASA Technical Reports Server (NTRS)

Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Lau, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, S. L.; Gastaldello, F.

    2012-01-01

Aims. We present the analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We exploit the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We perform a stacking of the density profiles to detect a signal beyond r200 and measure the typical density and scatter in cluster outskirts. We also compute the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compare our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r(sub 500). Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping are in better agreement with the observed gas distribution. We report for the first time the high-confidence detection of a systematic difference between cool-core and non-cool-core clusters beyond 0.3r(sub 200), which we explain by a different distribution of the gas in the two classes. Beyond r(sub 500), galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside cluster

  5. The Gas Distribution in the Outer Regions of Galaxy Clusters

    NASA Technical Reports Server (NTRS)

    Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Lau, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, L.; Gastaldello, F.

    2012-01-01

Aims. We present our analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We have exploited the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We stacked the density profiles to detect a signal beyond r200 and measured the typical density and scatter in cluster outskirts. We also computed the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compared our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r(sub 500). Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping agree better with the observed gas distribution. We report high-confidence detection of a systematic difference between cool-core and non-cool-core clusters beyond approximately 0.3r(sub 200), which we explain by a different distribution of the gas in the two classes. Beyond approximately r(sub 500), galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the ENZO simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside

  6. A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics.

    PubMed

    Oyana, Tonny J

    2010-01-01

The central purpose of this study is to further evaluate the performance of a new algorithm. The study provides additional evidence on this algorithm, which was designed to increase the overall efficiency of the original k-means clustering technique: the Fast, Efficient, and Scalable k-means algorithm (FES-k-means). The FES-k-means algorithm uses a hybrid approach that comprises the k-d tree data structure, which enhances the nearest neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. This algorithm was tested using two real datasets and one synthetic dataset. It was employed twice on all three datasets: once on data trained by the innovative MIL-SOM method, and then on the actual untrained data, in order to evaluate its competence. This two-step approach of data training prior to clustering provides a solid foundation for knowledge discovery and data mining that clustering methods alone cannot offer. The benefits of this method are that it produces clusters similar to those of the original k-means method at a much faster rate, as shown by runtime comparison data, and that it provides efficient analysis of large geospatial data with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines.
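The adaptation-rate idea can be sketched as an online (sequential) k-means update in which each sample pulls its nearest centroid toward itself by a decaying fraction of the remaining gap. This is a hedged illustration only: the exact rate schedule proposed by Mashor and the k-d tree acceleration used by FES-k-means are not reproduced, and the data points are hypothetical.

```python
def sequential_kmeans(points, k, passes=10, eta0=0.5):
    """Online k-means sketch: each sample pulls its nearest centroid
    toward it with a decaying adaptation rate. A simple 1/n decay
    stands in for Mashor's schedule, which is not described here."""
    centroids = [tuple(p) for p in points[:k]]   # deterministic init for the sketch
    wins = [1] * k                               # per-centroid update counts
    for _ in range(passes):
        for p in points:
            # index of the nearest centroid (squared Euclidean distance)
            j = min(range(k),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], p)))
            eta = eta0 / wins[j]                 # decaying adaptation rate
            centroids[j] = tuple(c + eta * (x - c) for c, x in zip(centroids[j], p))
            wins[j] += 1
    return centroids

# Two well-separated hypothetical 2-D clusters
pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9)]
c = sequential_kmeans(pts, 2)
```

With the 1/n decay shown, early samples move a centroid aggressively while later ones only fine-tune it, which is the usual motivation for an adaptation rate.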

  7. Intelligent decision support algorithm for distribution system restoration.

    PubMed

    Singh, Reetu; Mehfuz, Shabana; Kumar, Parmod

    2016-01-01

The distribution system is the means of revenue for an electric utility. It needs to be restored at the earliest if any feeder or the complete system trips out due to a fault or any other cause. Further, uncertainty of the loads results in variations in the distribution network's parameters. Thus, an intelligent algorithm incorporating hybrid fuzzy-grey relation, which can take the uncertainties into account and compare the sequences, is discussed to analyse and restore the distribution system. Simulation studies are carried out to show the utility of the method by ranking the restoration plans for a typical distribution system. This algorithm also meets smart grid requirements in terms of an automated restoration plan for the partial/full blackout of the network.
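The grey relational part of such a ranking can be sketched as follows. This is a generic grey relational analysis against an ideal reference sequence, not the paper's hybrid fuzzy-grey formulation, and the restoration-plan criteria and scores are invented for illustration.

```python
def grey_relational_rank(plans, zeta=0.5):
    """Rank alternatives by grey relational grade against the ideal
    reference (best value per criterion after min-max normalization).
    zeta is the conventional distinguishing coefficient (0.5)."""
    n_crit = len(plans[0])
    # normalize each criterion to [0, 1], larger is better
    cols = list(zip(*plans))
    norm = [[(v - min(c)) / (max(c) - min(c)) for v in c] for c in cols]
    rows = list(zip(*norm))
    ref = [1.0] * n_crit                         # ideal reference sequence
    deltas = [[abs(r - x) for r, x in zip(ref, row)] for row in rows]
    dmin = min(min(d) for d in deltas)
    dmax = max(max(d) for d in deltas)
    # grey relational grade = mean grey relational coefficient per plan
    grades = [sum((dmin + zeta * dmax) / (d + zeta * dmax) for d in row) / n_crit
              for row in deltas]
    return sorted(range(len(plans)), key=lambda i: -grades[i])

# Hypothetical restoration plans scored on (restored load, switching speed, margin)
plans = [(0.9, 0.6, 0.7), (0.7, 0.9, 0.8), (0.5, 0.5, 0.6)]
order = grey_relational_rank(plans)
```

The returned indices order the plans from most to least similar to the ideal, which is how a grey relational scheme produces a restoration ranking.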

  8. Intelligent decision support algorithm for distribution system restoration.

    PubMed

    Singh, Reetu; Mehfuz, Shabana; Kumar, Parmod

    2016-01-01

The distribution system is the means of revenue for an electric utility. It needs to be restored at the earliest if any feeder or the complete system trips out due to a fault or any other cause. Further, uncertainty of the loads results in variations in the distribution network's parameters. Thus, an intelligent algorithm incorporating hybrid fuzzy-grey relation, which can take the uncertainties into account and compare the sequences, is discussed to analyse and restore the distribution system. Simulation studies are carried out to show the utility of the method by ranking the restoration plans for a typical distribution system. This algorithm also meets smart grid requirements in terms of an automated restoration plan for the partial/full blackout of the network. PMID:27512634

  9. The next generation Virgo cluster survey. VIII. The spatial distribution of globular clusters in the Virgo cluster

    SciTech Connect

    Durrell, Patrick R.; Accetta, Katharine; Côté, Patrick; Blakeslee, John P.; Ferrarese, Laura; McConnachie, Alan; Gwyn, Stephen; Peng, Eric W.; Zhang, Hongxin; Mihos, J. Christopher; Puzia, Thomas H.; Jordán, Andrés; Lançon, Ariane; Liu, Chengze; Cuillandre, Jean-Charles; Boissier, Samuel; Boselli, Alessandro; Courteau, Stéphane; Duc, Pierre-Alain; and others

    2014-10-20

We report on a large-scale study of the distribution of globular clusters (GCs) throughout the Virgo cluster, based on photometry from the Next Generation Virgo Cluster Survey (NGVS), a large imaging survey covering Virgo's primary subclusters (Virgo A = M87 and Virgo B = M49) out to their virial radii. Using the g{sub o}{sup ′}, (g' – i') {sub o} color-magnitude diagram of unresolved and marginally resolved sources within the NGVS, we have constructed two-dimensional maps of the (irregular) GC distribution over 100 deg{sup 2} to a depth of g{sub o}{sup ′} = 24. We present the clearest evidence to date showing the difference in concentration between red and blue GCs over the full extent of the cluster, where the red (more metal-rich) GCs are largely located around the massive early-type galaxies in Virgo, while the blue (metal-poor) GCs have a much more extended spatial distribution with significant populations still present beyond 83' (∼215 kpc) along the major axes of both M49 and M87. A comparison of our GC maps to the diffuse light in the outermost regions of M49 and M87 shows remarkable agreement in the shape, ellipticity, and boxiness of both luminous systems. We also find evidence for spatial enhancements of GCs surrounding M87 that may be indicative of recent interactions or an ongoing merger history. We compare the GC map to that of the locations of Virgo galaxies and the X-ray intracluster gas, and find generally good agreement between these various baryonic structures. We calculate the Virgo cluster contains a total population of N {sub GC} = 67,300 ± 14,400, of which 35% are located in M87 and M49 alone. For the first time, we compute a cluster-wide specific frequency S {sub N,} {sub CL} = 2.8 ± 0.7, after correcting for Virgo's diffuse light. We also find a GC-to-baryonic mass fraction ε {sub b} = 5.7 ± 1.1 × 10{sup –4} and a GC-to-total cluster mass formation efficiency ε {sub t} = 2.9 ± 0.5 × 10{sup –5}, the latter values

  10. Distributed genetic algorithms for the floorplan design problem

    NASA Technical Reports Server (NTRS)

    Cohoon, James P.; Hegde, Shailesh U.; Martin, Worthy N.; Richards, Dana S.

    1991-01-01

    Designing a VLSI floorplan calls for arranging a given set of modules in the plane to minimize the weighted sum of area and wire-length measures. A method of solving the floorplan design problem using distributed genetic algorithms is presented. Distributed genetic algorithms, based on the paleontological theory of punctuated equilibria, offer a conceptual modification to the traditional genetic algorithms. Experimental results on several problem instances demonstrate the efficacy of this method and indicate the advantages of this method over other methods, such as simulated annealing. The method has performed better than the simulated annealing approach, both in terms of the average cost of the solutions found and the best-found solution, in almost all the problem instances tried.

  11. An efficient clustering algorithm for partitioning Y-short tandem repeats data

    PubMed Central

    2012-01-01

    Background Y-Short Tandem Repeats (Y-STR) data consist of many similar and almost similar objects. This characteristic of Y-STR data causes two problems with partitioning: non-unique centroids and local minima problems. As a result, the existing partitioning algorithms produce poor clustering results. Results Our new algorithm, called k-Approximate Modal Haplotypes (k-AMH), obtains the highest clustering accuracy scores for five out of six datasets, and produces an equal performance for the remaining dataset. Furthermore, clustering accuracy scores of 100% are achieved for two of the datasets. The k-AMH algorithm records the highest mean accuracy score of 0.93 overall, compared to that of other algorithms: k-Population (0.91), k-Modes-RVF (0.81), New Fuzzy k-Modes (0.80), k-Modes (0.76), k-Modes-Hybrid 1 (0.76), k-Modes-Hybrid 2 (0.75), Fuzzy k-Modes (0.74), and k-Modes-UAVM (0.70). Conclusions The partitioning performance of the k-AMH algorithm for Y-STR data is superior to that of other algorithms, owing to its ability to solve the non-unique centroids and local minima problems. Our algorithm is also efficient in terms of time complexity, which is recorded as O(km(n-k)) and considered to be linear. PMID:23039132

  12. A non-parametric method for automatic neural spike clustering based on the non-uniform distribution of the data.

    PubMed

    Tiganj, Z; Mboup, M

    2011-12-01

In this paper, we propose a simple and straightforward algorithm for neural spike sorting. The algorithm is based on the observation that the distribution of a neural signal largely deviates from the uniform distribution and is rather unimodal. The detected spikes to be sorted are first processed with a feature extraction technique, such as PCA, and then represented in a space of reduced dimension by keeping only a few of the most important features. The resulting space is next filtered in order to emphasize the differences between the centers and the borders of the clusters. Using some prior knowledge of the lowest level of activity of a neuron, such as the minimal firing rate, we find the number of clusters and the center of each cluster. The spikes are then sorted using a simple greedy algorithm which grabs the nearest neighbors. We have tested the proposed algorithm on real extracellular recordings and used simultaneous intracellular recordings to verify the results of the sorting. The results suggest that the algorithm is robust and reliable and compares favorably with state-of-the-art approaches. The proposed algorithm tends to be conservative and is simple to implement, and is thus suitable for both research and clinical applications as an interesting alternative to more sophisticated approaches. PMID:22064910
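The final sorting step can be read as a greedy nearest-neighbor assignment once the cluster centers are known. The sketch below is one plausible interpretation of that step, with hypothetical 2-D feature vectors standing in for the reduced PCA space; it is not the authors' implementation.

```python
def greedy_sort(spikes, centers):
    """Assign each detected spike (a point in the reduced feature space)
    to a cluster by grabbing the globally nearest spike-center pairs
    first; each spike is labeled the first time it is reached."""
    pairs = [(sum((a - b) ** 2 for a, b in zip(s, c)), i, j)
             for i, s in enumerate(spikes) for j, c in enumerate(centers)]
    labels = [None] * len(spikes)
    for _, i, j in sorted(pairs):                # nearest pairs first
        if labels[i] is None:
            labels[i] = j
    return labels

# Hypothetical spike features around two cluster centers
spikes = [(0.1, 0.0), (0.0, 0.2), (2.0, 2.1), (1.9, 2.0)]
centers = [(0.0, 0.0), (2.0, 2.0)]
labels = greedy_sort(spikes, centers)
```

Processing the closest pairs first means well-centered spikes are claimed early, while border spikes are only labeled once the unambiguous ones are taken.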

  13. Clustering WHO-ART terms using semantic distance and machine learning algorithms.

    PubMed

    Iavindrasana, Jimison; Bousquet, Cedric; Degoulet, Patrice; Jaulent, Marie-Christine

    2006-01-01

WHO-ART was developed by the WHO collaborating centre for international drug monitoring in order to code adverse drug reactions. We assume that computation of semantic distance between WHO-ART terms may be an efficient way to group related medical conditions in the WHO database in order to improve signal detection. Our objective was to develop a method for clustering WHO-ART terms according to some proximity of their meanings. Our material comprises 758 WHO-ART terms. A formal definition was acquired for each term as a list of elementary concepts belonging to SNOMED international axes and characterized by modifier terms in some cases. Clustering was implemented as a terminology service on a J2EE server. Two different unsupervised machine learning algorithms (KMeans, Pvclust) clustered WHO-ART terms according to a semantic distance operator previously described. Pvclust grouped 51% of WHO-ART terms. KMeans grouped 100% of WHO-ART terms, but 25% of the clusters were heterogeneous with k = 180 clusters and 6% were heterogeneous with k = 32 clusters. Clustering algorithms associated with a semantic distance could suggest potential groupings of WHO-ART terms that need validation according to the user's requirements.

  14. The mass distribution of the Fornax dSph: constraints from its globular cluster distribution

    NASA Astrophysics Data System (ADS)

    Cole, David R.; Dehnen, Walter; Read, Justin I.; Wilkinson, Mark I.

    2012-10-01

    Uniquely among the dwarf spheroidal (dSph) satellite galaxies of the Milky Way, Fornax hosts globular clusters. It remains a puzzle as to why dynamical friction has not yet dragged any of Fornax's five globular clusters to the centre, and also why there is no evidence that any similar star cluster has been in the past (for Fornax or any other tidally undisrupted dSph). We set up a suite of 2800 N-body simulations that sample the full range of globular cluster orbits and mass models consistent with all existing observational constraints for Fornax. In agreement with previous work, we find that if Fornax has a large dark matter core, then its globular clusters remain close to their currently observed locations for long times. Furthermore, we find previously unreported behaviour for clusters that start inside the core region. These are pushed out of the core and gain orbital energy, a process we call 'dynamical buoyancy'. Thus, a cored mass distribution in Fornax will naturally lead to a shell-like globular cluster distribution near the core radius, independent of the initial conditions. By contrast, cold dark matter-type cusped mass distributions lead to the rapid infall of at least one cluster within Δt = 1-2 Gyr, except when picking unlikely initial conditions for the cluster orbits (˜2 per cent probability), and almost all clusters within Δt = 10 Gyr. Alternatively, if Fornax has only a weakly cusped mass distribution, then dynamical friction is much reduced. While over Δt = 10 Gyr this still leads to the infall of one to four clusters from their present orbits, the infall of any cluster within Δt = 1-2 Gyr is much less likely (with probability 0-70 per cent, depending on Δt and the strength of the cusp). Such a solution to the timing problem requires (in addition to a shallow dark matter cusp) that in the past the globular clusters were somewhat further from Fornax than today; they most likely did not form within Fornax, but were accreted.

  15. Whistler Waves Driven by Anisotropic Strahl Velocity Distributions: Cluster Observations

    NASA Astrophysics Data System (ADS)

    F-Viñas, A.; Gurgiolo, C.; Nieves-Chinchilla, T.; Gary, S. P.; Goldstein, M. L.

    2010-03-01

    Observed properties of the strahl using high resolution 3D electron velocity distribution data obtained from the Cluster/PEACE experiment are used to investigate its linear stability. An automated method to isolate the strahl is used to allow its moments to be computed independent of the solar wind core+halo. Results show that the strahl can have a high temperature anisotropy (T⊥/T||>~2). This anisotropy is shown to be an important free energy source for the excitation of high frequency whistler waves. The analysis suggests that the resultant whistler waves are strong enough to regulate the electron velocity distributions in the solar wind through pitch-angle scattering.

  16. Muster: Massively Scalable Clustering

    2010-05-20

    Muster is a framework for scalable cluster analysis. It includes implementations of classic K-Medoids partitioning algorithms, as well as infrastructure for making these algorithms run scalably on very large systems. In particular, Muster contains algorithms such as CAPEK (described in reference 1) that are capable of clustering highly distributed data sets in-place on a hundred thousand or more processes.
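A sequential K-Medoids core of the kind Muster parallelizes might look as follows. This naive PAM-style sketch, with deterministic initialization, a precomputed distance matrix, and no sampling or distribution, is an assumption for illustration only, not Muster's scalable CAPEK implementation.

```python
def k_medoids(dist, k, max_iter=20):
    """Naive K-Medoids on a precomputed distance matrix: alternate
    assigning points to the nearest medoid with re-choosing, per
    cluster, the member that minimizes total intra-cluster distance."""
    n = len(dist)
    medoids = list(range(k))                     # deterministic init for the sketch
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for i in range(n):
            j = min(range(k), key=lambda m: dist[i][medoids[m]])
            clusters[j].append(i)
        # new medoid of each cluster = member minimizing summed distance
        new = [min(c, key=lambda i: sum(dist[i][x] for x in c)) for c in clusters]
        if new == medoids:                       # converged
            break
        medoids = new
    return medoids, clusters

# Toy 1-D points expressed as a distance matrix
pts = [0.0, 0.2, 0.3, 9.0, 9.1, 9.4]
dist = [[abs(a - b) for b in pts] for a in pts]
medoids, clusters = k_medoids(dist, 2)
```

Because medoids are actual data members, the method needs only pairwise distances, which is what makes medoid-based clustering attractive for arbitrary metrics and for the distributed settings Muster targets.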

  17. Plot enchaining algorithm: a novel approach for clustering flocks of birds

    NASA Astrophysics Data System (ADS)

    Büyükaksoy Kaplan, Gülay; Lana, Adnan

    2014-06-01

In this study, an intuitive way of tracking flocks of birds is proposed and compared to a simple cluster-seeking algorithm on real radar observations. For a group of targets such as a flock of birds, there is no need to track each target individually. Instead, a cluster can be used to represent the closely spaced tracks of a possible group. Treating a group of targets as a single target for tracking provides significant performance improvement with almost no loss of information.
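The chaining idea behind grouping closely spaced plots can be sketched as a single-linkage flood fill: two plots end up in the same cluster whenever they lie within a gating distance of each other, directly or through intermediate plots. The gate value and plot coordinates below are hypothetical, and this sketch is a generic cluster-seeking baseline rather than the paper's exact plot enchaining algorithm.

```python
def chain_plots(plots, gate):
    """Group radar plots (detections) by chaining: flood-fill clusters
    of plots whose pairwise distance is within the gating distance."""
    n = len(plots)
    labels = [-1] * n
    cluster = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        stack = [seed]
        labels[seed] = cluster
        while stack:                              # grow the chain
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and \
                   sum((a - b) ** 2 for a, b in zip(plots[i], plots[j])) <= gate ** 2:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels

# Hypothetical plot positions: one flock of three, one pair
plots = [(0, 0), (1, 0), (2, 1), (10, 10), (11, 10)]
labels = chain_plots(plots, gate=1.5)
```

Note that (2, 1) joins the first cluster through (1, 0) even though it is farther than the gate from (0, 0); that transitivity is what distinguishes chaining from assigning plots to fixed cluster centers.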

  18. A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network.

    PubMed

    Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue

    2016-01-01

    Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency. PMID:26907272

  19. A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network.

    PubMed

    Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue

    2016-02-19

    Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency.

  20. A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network

    PubMed Central

    Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue

    2016-01-01

    Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency. PMID:26907272

  1. A fusion method of Gabor wavelet transform and unsupervised clustering algorithms for tissue edge detection.

    PubMed

    Ergen, Burhan

    2014-01-01

    This paper proposes two edge detection methods for medical images by integrating the advantages of Gabor wavelet transform (GWT) and unsupervised clustering algorithms. The GWT is used to enhance the edge information in an image while suppressing noise. Following this, the k-means and Fuzzy c-means (FCM) clustering algorithms are used to convert a gray level image into a binary image. The proposed methods are tested using medical images obtained through Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) devices, and a phantom image. The results prove that the proposed methods are successful for edge detection, even in noisy cases.
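
For a single gray-level image, the k-means step described above reduces to 1-D clustering of pixel intensities into two groups; a minimal sketch (assuming the Gabor wavelet filtering has already been applied) might look like:

```python
import numpy as np

def kmeans_binarize(image, iters=20):
    """Cluster pixel intensities into 2 groups (edge vs. background) with a
    tiny 1-D k-means, converting a gray-level image into a binary one.
    Sketch of the clustering stage only; the GWT enhancement is assumed."""
    x = image.ravel().astype(float)
    # initialize the two centroids at the minimum and maximum intensity
    c = np.array([x.min(), x.max()])
    for _ in range(iters):
        labels = (np.abs(x - c[0]) > np.abs(x - c[1])).astype(int)
        for k in (0, 1):
            if np.any(labels == k):
                c[k] = x[labels == k].mean()
    return labels.reshape(image.shape)

img = np.array([[0.1, 0.9, 0.8], [0.2, 0.1, 0.95]])
print(kmeans_binarize(img).tolist())  # → [[0, 1, 1], [0, 0, 1]]
```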

  2. A distributed Canny edge detector: algorithm and FPGA implementation.

    PubMed

    Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J

    2014-07-01

    The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
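
The idea of deriving hysteresis thresholds from a block's local gradient distribution can be sketched as follows; taking the high threshold at a fixed percentile of the block's gradient magnitudes is an illustrative simplification, not the paper's block-type and non-uniform-histogram computation.

```python
import numpy as np

def block_thresholds(grad_mag, p_high=0.8, ratio=0.4):
    """Per-block hysteresis thresholds from the local gradient distribution:
    the high threshold is a percentile of the block's gradient magnitudes,
    the low threshold a fixed fraction of it. Both choices are assumptions
    standing in for the paper's adaptive computation."""
    t_high = float(np.quantile(grad_mag, p_high))
    t_low = ratio * t_high
    return t_low, t_high

# toy gradient magnitudes for one block
t_low, t_high = block_thresholds(np.arange(10.0))  # t_high ≈ 7.2, t_low ≈ 2.88
```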

  3. A distributed Canny edge detector: algorithm and FPGA implementation.

    PubMed

    Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J

    2014-07-01

    The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100

  4. Node Non-Uniform Deployment Based on Clustering Algorithm for Underwater Sensor Networks.

    PubMed

    Jiang, Peng; Liu, Jun; Wu, Feng

    2015-12-01

    A node non-uniform deployment method based on a clustering algorithm for underwater sensor networks (UWSNs) is proposed in this study. This algorithm is proposed because existing node non-uniform deployment algorithms find it difficult to optimize the network connectivity rate and network lifetime while also improving the network coverage rate for UWSNs. A high network connectivity rate is achieved by determining the heterogeneous communication ranges of nodes during node clustering. Moreover, the concept of aggregate contribution degree is defined, and the nodes with lower aggregate contribution degrees are used to substitute the dying nodes, which decreases the total movement distance of nodes and prolongs the network lifetime. Simulation results show that the proposed algorithm can achieve better network coverage and connectivity rates, as well as decrease the total movement distance of nodes and prolong the network lifetime.

  5. Node Non-Uniform Deployment Based on Clustering Algorithm for Underwater Sensor Networks.

    PubMed

    Jiang, Peng; Liu, Jun; Wu, Feng

    2015-01-01

    A node non-uniform deployment method based on a clustering algorithm for underwater sensor networks (UWSNs) is proposed in this study. This algorithm is proposed because existing node non-uniform deployment algorithms find it difficult to optimize the network connectivity rate and network lifetime while also improving the network coverage rate for UWSNs. A high network connectivity rate is achieved by determining the heterogeneous communication ranges of nodes during node clustering. Moreover, the concept of aggregate contribution degree is defined, and the nodes with lower aggregate contribution degrees are used to substitute the dying nodes, which decreases the total movement distance of nodes and prolongs the network lifetime. Simulation results show that the proposed algorithm can achieve better network coverage and connectivity rates, as well as decrease the total movement distance of nodes and prolong the network lifetime. PMID:26633408

  6. Node Non-Uniform Deployment Based on Clustering Algorithm for Underwater Sensor Networks

    PubMed Central

    Jiang, Peng; Liu, Jun; Wu, Feng

    2015-01-01

    A node non-uniform deployment method based on a clustering algorithm for underwater sensor networks (UWSNs) is proposed in this study. This algorithm is proposed because existing node non-uniform deployment algorithms find it difficult to optimize the network connectivity rate and network lifetime while also improving the network coverage rate for UWSNs. A high network connectivity rate is achieved by determining the heterogeneous communication ranges of nodes during node clustering. Moreover, the concept of aggregate contribution degree is defined, and the nodes with lower aggregate contribution degrees are used to substitute the dying nodes, which decreases the total movement distance of nodes and prolongs the network lifetime. Simulation results show that the proposed algorithm can achieve better network coverage and connectivity rates, as well as decrease the total movement distance of nodes and prolong the network lifetime. PMID:26633408

  7. The Development of FPGA-Based Pseudo-Iterative Clustering Algorithms

    NASA Astrophysics Data System (ADS)

    Drueke, Elizabeth; Fisher, Wade; Plucinski, Pawel

    2016-03-01

    The Large Hadron Collider (LHC) in Geneva, Switzerland, is set to undergo major upgrades in 2025 in the form of the High-Luminosity Large Hadron Collider (HL-LHC). In particular, several hardware upgrades are proposed to the ATLAS detector, one of the two general purpose detectors. These hardware upgrades include, but are not limited to, a new hardware-level clustering algorithm, to be performed by a field programmable gate array, or FPGA. In this study, we develop that clustering algorithm and compare the output to a Python-implemented topoclustering algorithm developed at the University of Oregon. Here, we present the agreement between the FPGA output and expected output, with particular attention to the time required by the FPGA to complete the algorithm and other limitations set by the FPGA itself.

  8. An Efficient Algorithm for Clustering of Large-Scale Mass Spectrometry Data

    PubMed Central

    Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A.; Hoffert, Jason D.

    2012-01-01

    High-throughput spectrometers are capable of producing data sets containing thousands of spectra for a single biological sample. These data sets contain a substantial amount of redundancy from peptides that may get selected multiple times in a LC-MS/MS experiment. In this paper, we present an efficient algorithm, CAMS (Clustering Algorithm for Mass Spectra) for clustering mass spectrometry data which increases both the sensitivity and confidence of spectral assignment. CAMS utilizes a novel metric, called F-set, that allows accurate identification of the spectra that are similar. A graph theoretic framework is defined that allows the use of F-set metric efficiently for accurate cluster identifications. The accuracy of the algorithm is tested on real HCD and CID data sets with varying amounts of peptides. Our experiments show that the proposed algorithm is able to cluster spectra with very high accuracy in a reasonable amount of time for large spectral data sets. Thus, the algorithm is able to decrease the computational time by compressing the data sets while increasing the throughput of the data by interpreting low S/N spectra. PMID:23471471
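
The graph-theoretic clustering step can be sketched with a pairwise similarity threshold plus connected components; the shared-peak fraction used below is a hypothetical stand-in for the paper's F-set metric.

```python
def cluster_spectra(spectra, min_shared=0.5):
    """Group spectra whose shared-peak fraction exceeds a threshold, using
    union-find connected components. The shared-peak fraction is an
    illustrative substitute for CAMS's F-set metric, not the real measure."""
    n = len(spectra)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            a, b = set(spectra[i]), set(spectra[j])
            if len(a & b) / min(len(a), len(b)) >= min_shared:
                parent[find(i)] = find(j)  # union the two clusters

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# spectra as lists of binned peak m/z values
spec = [[100, 200, 300], [100, 200, 400], [900, 950]]
print(cluster_spectra(spec))  # → [[0, 1], [2]]
```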

  9. An efficient method of key-frame extraction based on a cluster algorithm.

    PubMed

    Zhang, Qiang; Yu, Shao-Pei; Zhou, Dong-Sheng; Wei, Xiao-Peng

    2013-12-18

    This paper proposes a novel method of key-frame extraction for use with motion capture data. This method is based on an unsupervised cluster algorithm. First, the motion sequence is clustered into two classes by the similarity distance of the adjacent frames so that the thresholds needed in the next step can be determined adaptively. Second, a dynamic cluster algorithm called ISODATA is used to cluster all the frames and the frames nearest to the center of each class are automatically extracted as key-frames of the sequence. Unlike many other clustering techniques, the present improved cluster algorithm can automatically address different motion types without any need for specified parameters from users. The proposed method is capable of summarizing motion capture data reliably and efficiently. The present work also provides a meaningful comparison between the results of the proposed key-frame extraction technique and other previous methods. These results are evaluated in terms of metrics that measure reconstructed motion and the mean absolute error value, which are derived from the reconstructed data and the original data.
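
The final extraction step, picking the frame nearest to each cluster center, is straightforward; this sketch assumes the ISODATA clustering has already produced the centers.

```python
import numpy as np

def keyframes(frames, centers):
    """For each cluster center, return the index of the nearest frame as a
    key-frame. The ISODATA clustering that produces `centers` is assumed
    to have run already; frames are feature vectors (e.g. joint angles)."""
    frames = np.asarray(frames, dtype=float)
    keys = []
    for c in np.asarray(centers, dtype=float):
        keys.append(int(np.linalg.norm(frames - c, axis=1).argmin()))
    return sorted(set(keys))

# toy 2-D "poses": two clusters of motion-capture frames
poses = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 4.9]]
print(keyframes(poses, centers=[[0.02, 0.0], [5.0, 4.97]]))  # → [0, 2]
```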

  10. Spitzer IR Colors and ISM Distributions of Virgo Cluster Spirals

    NASA Astrophysics Data System (ADS)

    Kenney, Jeffrey D.; Wong, I.; Kenney, Z.; Murphy, E.; Helou, G.; Howell, J.

    2012-01-01

    IRAC infrared images of 44 spiral and peculiar galaxies from the Spitzer Survey of the Virgo Cluster help reveal the interactions which transform galaxies in clusters. We explore how the location of galaxies in the IR 3.6-8μm color-magnitude diagram is related to the spatial distributions of ISM/star formation, as traced by PAH emission in the 8μm band. Based on their 8μm/PAH radial distributions, we divide the galaxies into 4 groups: normal, truncated, truncated/compact, and anemic. Normal galaxies have relatively normal PAH distributions. They are the "bluest" galaxies, with the largest 8/3.6μm ratios. They are relatively unaffected by the cluster environment, and have probably never passed through the cluster core. Truncated galaxies have a relatively normal 8μm/PAH surface brightness in the inner disk, but are abruptly truncated with little or no emission in the outer disk. They have intermediate ("green") colors, while those which are more severely truncated are "redder". Most truncated galaxies have undisturbed stellar disks and many show direct evidence of active ram pressure stripping. Truncated/compact galaxies have high 8μm/PAH surface brightness in the very inner disk (central 1 kpc) but are abruptly truncated close to the center with little or no emission in the outer disk. They have intermediate global colors, similar to the other truncated galaxies. While they have the most extreme ISM truncation, they have vigorous circumnuclear star formation. Most of these have disturbed stellar disks, and they are probably produced by a combination of gravitational interaction plus ram pressure stripping. Anemic galaxies have a low 8μm/PAH surface brightness even in the inner disk. These are the "reddest" galaxies, with the smallest 8/3.6μm ratios. The origin of the anemics seems to be a combination of starvation, gravitational interactions, and long-ago ram pressure stripping.

  11. A new distributed systems scheduling algorithm: a swarm intelligence approach

    NASA Astrophysics Data System (ADS)

    Haghi Kashani, Mostafa; Sarvizadeh, Raheleh; Jameii, Mahdi

    2011-12-01

    The scheduling problem in distributed systems is known to be NP-complete, and methods based on heuristic or metaheuristic search have been proposed to obtain optimal and suboptimal solutions. Task scheduling is a key factor for distributed systems to gain better performance. In this paper, an efficient method based on a memetic algorithm is developed to solve the problem of distributed systems scheduling. To balance load efficiently, Artificial Bee Colony (ABC) is applied as the local search in the proposed memetic algorithm. The proposed method has been compared to an existing memetic-based approach in which the Learning Automata method is used as the local search. The results demonstrate that the proposed method outperforms the above-mentioned method in terms of communication cost.
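
As a point of reference for the load-balancing objective such schedulers optimize, a greedy longest-processing-time baseline (not the memetic/ABC method itself) can be sketched as:

```python
import heapq

def greedy_schedule(task_costs, n_procs):
    """Longest-processing-time greedy assignment: repeatedly give the most
    expensive unassigned task to the least-loaded processor. A classical
    baseline for the load-balancing objective, not the paper's algorithm."""
    heap = [(0.0, p) for p in range(n_procs)]  # (current load, processor id)
    heapq.heapify(heap)
    assign = {}
    for t, c in sorted(enumerate(task_costs), key=lambda x: -x[1]):
        load, p = heapq.heappop(heap)
        assign[t] = p
        heapq.heappush(heap, (load + c, p))
    return assign

print(greedy_schedule([4, 3, 2, 2, 1], n_procs=2))
```

On this toy instance both processors end up with a load of 6, the balanced optimum.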

  12. Quantum cluster algorithm for frustrated Ising models in a transverse field

    NASA Astrophysics Data System (ADS)

    Biswas, Sounak; Rakala, Geet; Damle, Kedar

    2016-06-01

    Working within the stochastic series expansion framework, we introduce and characterize a plaquette-based quantum cluster algorithm for quantum Monte Carlo simulations of transverse field Ising models with frustrated Ising exchange interactions. As a demonstration of the capabilities of this algorithm, we show that a relatively small ferromagnetic next-nearest-neighbor coupling drives the transverse field Ising antiferromagnet on the triangular lattice from an antiferromagnetic three-sublattice ordered state at low temperature to a ferrimagnetic three-sublattice ordered state.

  13. A seed expanding cluster algorithm for deriving upwelling areas on sea surface temperature images

    NASA Astrophysics Data System (ADS)

    Nascimento, Susana; Casca, Sérgio; Mirkin, Boris

    2015-12-01

    In this paper a novel clustering algorithm is proposed as a version of the seeded region growing (SRG) approach for the automatic recognition of coastal upwelling from sea surface temperature (SST) images. The new algorithm, one seed expanding cluster (SEC), takes advantage of the concept of approximate clustering due to Mirkin (1996, 2013) to derive a homogeneity criterion in the format of a product rather than the conventional difference between a pixel value and the mean of values over the region of interest. It involves a boundary-oriented pixel labeling so that the cluster growing is performed by expanding its boundary iteratively. The starting point is a cluster consisting of just one seed, the pixel with the coldest temperature. The baseline version of the SEC algorithm uses Otsu's thresholding method to fine-tune the homogeneity threshold. Unfortunately, this method does not always lead to a satisfactory solution. Therefore, we introduce a self-tuning version of the algorithm in which the homogeneity threshold is locally derived from the approximation criterion over a window around the pixel under consideration. The window serves as a boundary regularizer. These two unsupervised versions of the algorithm have been applied to a set of 28 SST images of the western coast of mainland Portugal, and compared against a supervised version fine-tuned by maximizing the F-measure with respect to manually labeled ground-truth maps. The areas built by the unsupervised versions of the SEC algorithm are significantly coincident over the ground-truth regions in the cases at which the upwelling areas consist of a single continuous fragment of the SST map.
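
Otsu's thresholding method, which the baseline SEC algorithm uses to fine-tune the homogeneity threshold, picks the histogram split that maximizes between-class variance; a generic sketch (not the authors' implementation):

```python
import numpy as np

def otsu_threshold(values, bins=64):
    """Otsu's method: choose the threshold maximizing the between-class
    variance w0*w1*(m0-m1)^2 over all histogram splits. Generic sketch of
    the standard technique, as used by the baseline SEC algorithm."""
    hist, edges = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2
    best_t, best_var = edges[1], -1.0
    for k in range(1, bins):
        w0, w1 = p[:k].sum(), p[k:].sum()
        if w0 == 0 or w1 == 0:
            continue
        m0 = (p[:k] * centers[:k]).sum() / w0
        m1 = (p[k:] * centers[k:]).sum() / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, edges[k]
    return best_t

# bimodal "temperatures": cold upwelling pixels near 1, warm water near 10
t = otsu_threshold([1.0] * 50 + [10.0] * 50)  # lands between the two modes
```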

  14. Improved Quantum Artificial Fish Algorithm Application to Distributed Network Considering Distributed Generation

    PubMed Central

    Du, Tingsong; Hu, Yang; Ke, Xianting

    2015-01-01

    An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA is based on quantum computing, which offers exponential acceleration for heuristic algorithms; it uses quantum bits to encode the artificial fish and applies quantum rotation gates, preying behavior, following behavior, and variation of the quantum artificial fish to update the population in the search for the optimal value. Then, we apply the proposed new algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) in simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has a higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems, and the simulation results for a 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss when compared with BAFSA, GAFSA, and QAFSA. PMID:26447713

  15. Improved Quantum Artificial Fish Algorithm Application to Distributed Network Considering Distributed Generation.

    PubMed

    Du, Tingsong; Hu, Yang; Ke, Xianting

    2015-01-01

    An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA is based on quantum computing, which offers exponential acceleration for heuristic algorithms; it uses quantum bits to encode the artificial fish and applies quantum rotation gates, preying behavior, following behavior, and variation of the quantum artificial fish to update the population in the search for the optimal value. Then, we apply the proposed new algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) in simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has a higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems, and the simulation results for a 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss when compared with BAFSA, GAFSA, and QAFSA. PMID:26447713

  16. A Clustering Algorithm for Liver Lesion Segmentation of Diffusion-Weighted MR Images

    PubMed Central

    Jha, Abhinav K.; Rodríguez, Jeffrey J.; Stephen, Renu M.; Stopeck, Alison T.

    2010-01-01

    In diffusion-weighted magnetic resonance imaging, accurate segmentation of liver lesions in the diffusion-weighted images is required for computation of the apparent diffusion coefficient (ADC) of the lesion, the parameter that serves as an indicator of lesion response to therapy. However, the segmentation problem is challenging due to low SNR, fuzzy boundaries and speckle and motion artifacts. We propose a clustering algorithm that incorporates spatial information and a geometric constraint to solve this issue. We show that our algorithm provides improved accuracy compared to existing segmentation algorithms. PMID:21151837

  17. Fault-tolerant distributed algorithms for agreement and election

    SciTech Connect

    Abu-Amara, H.H.

    1988-01-01

    The first part characterizes completely the shared-memory requirements for achieving agreement in an asynchronous system of fail-stop processes that die undetectably. There is no agreement protocol that uses only read and write operations, even if at most one process dies. This result implies the impossibility of Byzantine agreement in asynchronous message-passing systems. The second part considers the election problem on asynchronous complete networks when the processors are reliable but some of the channels may be intermittently faulty. To be consistent with the standard model of distributed algorithms in which channel delays can be arbitrary but finite, it is assumed that channel failures are undetectable. An algorithm is given that correctly solves the problem when the channels fail before or during the execution of the algorithm.

  18. Improving permafrost distribution modelling using feature selection algorithms

    NASA Astrophysics Data System (ADS)

    Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

    2016-04-01

    The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional datasets is the number of input features (variables) involved. Applying ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps simplify the number of factors required and improves knowledge of the adopted features and their relation to the studied phenomenon. Moreover, removing irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidences (geophysical and thermal data and rock glacier inventories) that serve as permafrost training data. The FS algorithms used indicate which variables appear less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is an ML algorithm that performs FS as part of its
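
The Information Gain criterion mentioned above scores a predictor by the reduction in label entropy after splitting on it; for discrete features it can be sketched as:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """IG of a discrete feature w.r.t. presence/absence labels: H(labels)
    minus the expected entropy after splitting on the feature values.
    Generic sketch of the standard IG filter criterion."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# toy presence/absence labels: a perfectly informative predictor vs. a useless one
y = [1, 1, 0, 0]
print(information_gain([1, 1, 0, 0], y))  # → 1.0
print(information_gain([1, 0, 1, 0], y))  # → 0.0
```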

  19. Scalable method for demonstrating the Deutsch-Jozsa and Bernstein-Vazirani algorithms using cluster states

    SciTech Connect

    Tame, M. S.; Kim, M. S.

    2010-09-15

    We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate general n-qubit versions and a scalable method to produce them. For this purpose, we propose a versatile photonic on-chip setup.

  20. NEW MDS AND CLUSTERING BASED ALGORITHMS FOR PROTEIN MODEL QUALITY ASSESSMENT AND SELECTION.

    PubMed

    Wang, Qingguo; Shang, Charles; Xu, Dong; Shang, Yi

    2013-10-25

    In protein tertiary structure prediction, assessing the quality of predicted models is an essential task. Over the past years, many methods have been proposed for the protein model quality assessment (QA) and selection problem. Despite significant advances, the discerning power of current methods is still unsatisfactory. In this paper, we propose two new algorithms, CC-Select and MDS-QA, based on multidimensional scaling and k-means clustering. For the model selection problem, CC-Select combines consensus with clustering techniques to select the best models from a given pool. Given a set of predicted models, CC-Select first calculates a consensus score for each structure based on its average pairwise structural similarity to other models. Then, similar structures are grouped into clusters using multidimensional scaling and clustering algorithms. In each cluster, the one with the highest consensus score is selected as a candidate model. For the QA problem, MDS-QA combines single-model scoring functions with consensus to determine a more accurate assessment score for every model in a given pool. Using extensive benchmark sets of a large collection of predicted models, we compare the two algorithms with existing state-of-the-art quality assessment methods and show significant improvement. PMID:24808625
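
The consensus score that CC-Select computes first, each model's average pairwise structural similarity to all others, can be sketched as follows; the similarity matrix itself (e.g. from GDT-TS or TM-score comparisons) is assumed to be given.

```python
import numpy as np

def consensus_scores(sim):
    """First step of CC-Select: each model's average pairwise structural
    similarity to all other models in the pool. `sim[i][j]` is a symmetric
    similarity matrix assumed to be precomputed elsewhere."""
    sim = np.asarray(sim, dtype=float)
    n = sim.shape[0]
    # sum each row and drop the self-similarity on the diagonal
    return (sim.sum(axis=1) - sim.diagonal()) / (n - 1)

sim = [[1.0, 0.8, 0.6],
       [0.8, 1.0, 0.7],
       [0.6, 0.7, 1.0]]
scores = consensus_scores(sim)  # ≈ [0.70, 0.75, 0.65]; model 1 is the candidate
```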

  1. An effective trust-based recommendation method using a novel graph clustering algorithm

    NASA Astrophysics Data System (ADS)

    Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin

    2015-10-01

    Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by the scarcity of ratings relative to the unknowns that need to be predicted. Incorporating trust information into collaborative filtering systems is an attractive approach to resolving these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method, the problem space is first represented as a graph, and a sparsest-subgraph-finding algorithm is then applied to the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.
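
The final recommendation step, suggesting unseen items from the ratings of users in the same cluster, can be sketched as follows; the clustering that assigns users to clusters, and all names below, are illustrative assumptions.

```python
from collections import defaultdict

def recommend(ratings, cluster_of, user, top_n=2):
    """Recommend unseen items to `user`, ranked by the mean rating given by
    users in the same cluster. `cluster_of` stands in for the output of the
    graph clustering stage; this is a sketch, not the paper's method."""
    seen = set(ratings.get(user, {}))
    pooled = defaultdict(list)
    for u, items in ratings.items():
        if u != user and cluster_of[u] == cluster_of[user]:
            for item, r in items.items():
                if item not in seen:
                    pooled[item].append(r)
    # rank unseen items by the cluster's mean rating, highest first
    return sorted(pooled, key=lambda i: -sum(pooled[i]) / len(pooled[i]))[:top_n]

ratings = {"a": {"x": 5}, "b": {"x": 4, "y": 5, "z": 2}, "c": {"w": 5}}
cluster_of = {"a": 0, "b": 0, "c": 1}
print(recommend(ratings, cluster_of, "a"))  # → ['y', 'z']
```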

  2. NEW MDS AND CLUSTERING BASED ALGORITHMS FOR PROTEIN MODEL QUALITY ASSESSMENT AND SELECTION

    PubMed Central

    Wang, Qingguo; Shang, Charles; Xu, Dong

    2014-01-01

    In protein tertiary structure prediction, assessing the quality of predicted models is an essential task. Over the past years, many methods have been proposed for the protein model quality assessment (QA) and selection problem. Despite significant advances, the discerning power of current methods is still unsatisfactory. In this paper, we propose two new algorithms, CC-Select and MDS-QA, based on multidimensional scaling and k-means clustering. For the model selection problem, CC-Select combines consensus with clustering techniques to select the best models from a given pool. Given a set of predicted models, CC-Select first calculates a consensus score for each structure based on its average pairwise structural similarity to other models. Then, similar structures are grouped into clusters using multidimensional scaling and clustering algorithms. In each cluster, the one with the highest consensus score is selected as a candidate model. For the QA problem, MDS-QA combines single-model scoring functions with consensus to determine a more accurate assessment score for every model in a given pool. Using extensive benchmark sets of a large collection of predicted models, we compare the two algorithms with existing state-of-the-art quality assessment methods and show significant improvement. PMID:24808625

  3. Mobile agent and multilayer integrated distributed intrusion detection model for clustering ad hoc networks

    NASA Astrophysics Data System (ADS)

    Feng, Jianxin; Wang, Guangxing

    2004-04-01

    Ad hoc networks do not depend on any predefined infrastructure or centralized administration to operate. Their security characteristics require more complex security precautions. As the second line of defense, intrusion detection is a necessary means of achieving high survivability. In this paper, the security characteristics of ad hoc networks and the related aspects of intrusion detection are discussed. A Mobile Agent and Multi-layer Integrated Distributed Intrusion Detection Model (MAMIDIDM) and a heuristic global detection algorithm are tentatively proposed by combining mobile agent technology with the multi-layer concept. The heuristic global detection algorithm combines the mobile agent detection engine with the multi-layer detection engines and analyzes the results obtained by the corresponding detection engines. MAMIDIDM has better flexibility and extensibility and can perform intrusion detection in clustering ad hoc networks effectively.

  4. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    PubMed

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time. PMID:22254462

  6. BoCluSt: Bootstrap Clustering Stability Algorithm for Community Detection

    PubMed Central

    Garcia, Carlos

    2016-01-01

    The identification of modules or communities in sets of related variables is a key step in the analysis and modeling of biological systems. Procedures for this identification are usually designed to allow fast analyses of very large datasets and may produce suboptimal results when these sets are of a small to moderate size. This article introduces BoCluSt, a new, somewhat more computationally intensive, community detection procedure that is based on combining a clustering algorithm with a measure of stability under bootstrap resampling. Both computer simulation and analyses of experimental data showed that BoCluSt can outperform current procedures in the identification of multiple modules in data sets with a moderate number of variables. In addition, the procedure provides users with a null distribution of results to evaluate the support for the existence of community structure in the data. BoCluSt takes individual measures for a set of variables as input, and may be a valuable and robust exploratory tool of network analysis, as it provides 1) an estimation of the best partition of variables into modules, 2) a measure of the support for the existence of modular structures, and 3) an overall description of the whole structure, which may reveal hierarchical modular situations, in which modules are composed of smaller sub-modules. PMID:27258041

  8. Reexamining cluster radioactivity in trans-lead nuclei with consideration of specific density distributions in daughter nuclei and clusters

    NASA Astrophysics Data System (ADS)

    Qian, Yibin; Ren, Zhongzhou; Ni, Dongdong

    2016-08-01

    We further investigate the cluster emission from heavy nuclei beyond the lead region in the framework of the preformed cluster model. The refined cluster-core potential is constructed by the double-folding integral of the density distributions of the daughter nucleus and the emitted cluster, where the radius or the diffuseness parameter in the Fermi density distribution formula is determined according to the available experimental data on the charge radii and the neutron skin thickness. The Schrödinger equation of the cluster-daughter relative motion is then solved within the outgoing Coulomb wave-function boundary conditions to obtain the decay width. It is found that the present decay width of cluster emitters is clearly enhanced as compared to that in the previous case, which involved the fixed parametrization for the density distributions of daughter nuclei and clusters. Among the whole procedure, the nuclear deformation of clusters is also introduced into the calculations, and the degree of its influence on the final decay half-life is checked to some extent. Moreover, the effect from the bubble density distribution of clusters on the final decay width is carefully discussed by using the central depressed distribution.

  9. CHIMERA: Clustering of Heterogeneous Disease Effects via Distribution Matching of Imaging Patterns.

    PubMed

    Dong, Aoyan; Honnorat, Nicolas; Gaonkar, Bilwaj; Davatzikos, Christos

    2016-02-01

    Many brain disorders and diseases exhibit heterogeneous symptoms and imaging characteristics. This heterogeneity is typically not captured by commonly adopted neuroimaging analyses that seek only a main imaging pattern when two groups need to be differentiated (e.g., patients and controls, or clinical progressors and non-progressors). We propose a novel probabilistic clustering approach, CHIMERA, modeling the pathological process by a combination of multiple regularized transformations from the normal/control population to the patient population, thereby seeking to identify multiple imaging patterns that relate to disease effects and to better characterize disease heterogeneity. In our framework, normal and patient populations are considered as point distributions that are matched by a variant of the coherent point drift algorithm. We explain how the posterior probabilities produced during the MAP optimization of CHIMERA can be used for clustering the patients into groups and identifying disease subtypes. CHIMERA was first validated on a synthetic dataset and then on a clinical dataset mixing 317 control subjects and patients suffering from Alzheimer's Disease (AD) and Parkinson's Disease (PD). CHIMERA produced better clustering results compared to two standard clustering approaches. We further analyzed 390 T1 MRI scans from Alzheimer's patients. We discovered two main and reproducible AD subtypes displaying significant differences in cognitive performance.

  10. An Evaluation of Biosurveillance Grid—Dynamic Algorithm Distribution Across Multiple Computer Nodes

    PubMed Central

    Tsai, Ming-Chi; Tsui, Fu-Chiang; Wagner, Michael M.

    2007-01-01

    Performing fast data analysis to detect disease outbreaks plays a critical role in real-time biosurveillance. In this paper, we describe and evaluate an Algorithm Distribution Manager Service (ADMS) based on grid technologies, which dynamically partitions and distributes detection algorithms across multiple computers. We compared the execution time needed to perform the analysis on a single computer and on a grid network (3 computing nodes) with and without dynamic algorithm distribution. We found that algorithms with long runtimes completed approximately three times earlier in the distributed environment than on a single computer, while short-runtime algorithms performed worse in the distributed environment. A dynamic algorithm distribution approach also performed better than a static algorithm distribution approach. This pilot study shows great potential for reducing lengthy analysis times through dynamic algorithm partitioning and parallel processing, and provides the opportunity of distributing algorithms from a client to remote computers in a grid network. PMID:18693936
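
The gain from dynamic distribution can be illustrated with a longest-processing-time (LPT) greedy scheduler, which stands in here for the ADMS partitioning logic; the runtimes and node count below are made-up examples, not the study's measurements.

```python
import heapq

def distribute_lpt(runtimes, n_nodes):
    """Assign each task (longest first) to the currently least-loaded node."""
    heap = [(0, node) for node in range(n_nodes)]  # (load, node id)
    heapq.heapify(heap)
    assignment = {node: [] for node in range(n_nodes)}
    for task, t in sorted(enumerate(runtimes), key=lambda x: -x[1]):
        load, node = heapq.heappop(heap)
        assignment[node].append(task)
        heapq.heappush(heap, (load + t, node))
    return assignment, max(load for load, _ in heap)

# Two long detection algorithms and four short ones on 3 nodes:
tasks = [30, 5, 5, 5, 30, 5]
_, makespan = distribute_lpt(tasks, 3)
print(makespan)  # -> 30 (a static round-robin split finishes at 35)
```

This mirrors the paper's observation that dynamic load-aware placement beats a static split whenever task runtimes are uneven.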

  11. Distribution of Lick Indices in the Globular Cluster NGC 2808

    NASA Astrophysics Data System (ADS)

    O'Connell, Julia

    2012-01-01

    We have analyzed low resolution spectra for a large sample (~200) of red giant branch (RGB) stars in the Galactic globular cluster NGC 2808 for indications of bimodality with respect to Lick indices and radial dispersion. Target stars were observed in nine fields over 5 nights in March 2010 with the Blanco 4m telescope and Hydra multi-object positioner and bench spectrograph located at Cerro Tololo Inter-American Observatory. The full wavelength coverage spans from 3900--8200 Å for nights 1 and 2, and from 4500--6950 Å for nights 3, 4, and 5. The low resolution data (~500 for nights 1 and 2, ~1200 for nights 3, 4, and 5; ~70) were used to measure radial velocities (RVs) and to determine cluster membership. RVs were measured with the IRAF task xcsao using stars with known RVs included in our observed fields as templates. Template candidates were taken from Cacciari et al. (2004) and Carretta et al. (2006), and cross-matched by RA and Dec to stars in our sample. A conservative approach was adopted when including stars as cluster members, with RV = 101.9 ± 7.17 km s-1 (σ=2.83). Indexf was employed to measure 12 Lick indices, with particular attention to CN1, CN2 and Na. The peak distribution of these indices appears to be correlated in stars 2.5--3.5 arc minutes from the cluster center. The correlation becomes less apparent in stars outside this radius.

  12. Whistler Waves Driven by Anisotropic Strahl Velocity Distributions: Cluster Observations

    NASA Technical Reports Server (NTRS)

    Vinas, A.F.; Gurgiolo, C.; Nieves-Chinchilla, T.; Gary, S. P.; Goldstein, M. L.

    2010-01-01

    Observed properties of the strahl using high resolution 3D electron velocity distribution data obtained from the Cluster/PEACE experiment are used to investigate its linear stability. An automated method to isolate the strahl is used to allow its moments to be computed independent of the solar wind core+halo. Results show that the strahl can have a high temperature anisotropy (T(perpendicular)/T(parallel) approximately > 2). This anisotropy is shown to be an important free energy source for the excitation of high frequency whistler waves. The analysis suggests that the resultant whistler waves are strong enough to regulate the electron velocity distributions in the solar wind through pitch-angle scattering.

  13. Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs

    NASA Astrophysics Data System (ADS)

    Yoon, S.; Goltsev, A. V.; Dorogovtsev, S. N.; Mendes, J. F. F.

    2011-10-01

    We generalize the belief-propagation algorithm to sparse random networks with arbitrary distributions of motifs (triangles, loops, etc.). Each vertex in these networks belongs to a given set of motifs (generalization of the configuration model). These networks can be treated as sparse uncorrelated hypergraphs in which hyperedges represent motifs. Here a hypergraph is a generalization of a graph, where a hyperedge can connect any number of vertices. These uncorrelated hypergraphs are treelike (hypertrees), which crucially simplifies the problem and allows us to apply the belief-propagation algorithm to these loopy networks with arbitrary motifs. As natural examples, we consider motifs in the form of finite loops and cliques. We apply the belief-propagation algorithm to the ferromagnetic Ising model with pairwise interactions on the resulting random networks and obtain an exact solution of this model. We find an exact critical temperature of the ferromagnetic phase transition and demonstrate that with increasing the clustering coefficient and the loop size, the critical temperature increases compared to ordinary treelike complex networks. However, weak clustering does not change the critical behavior qualitatively. Our solution also gives the birth point of the giant connected component in these loopy networks.
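
On a tree (the loop-free base case this article generalizes from), sum-product belief propagation is exact, which the following toy Ising sketch checks against brute-force enumeration. The coupling J, fields h, and graph are arbitrary illustrative choices, not values from the paper.

```python
from itertools import product
import math

def bp_marginals(n, edges, J, h):
    """Sum-product belief propagation; exact on a tree (no loops)."""
    nbrs = {i: [] for i in range(n)}
    for a, b in edges:
        nbrs[a].append(b)
        nbrs[b].append(a)

    def message(i, j):
        """Message from spin i to neighbouring spin j, as {s_j: value}."""
        out = {}
        for sj in (-1, 1):
            total = 0.0
            for si in (-1, 1):
                val = math.exp(J * si * sj + h[i] * si)
                for k in nbrs[i]:
                    if k != j:
                        val *= message(k, i)[si]
            # accumulate the two spin states of i
                total += val
            out[sj] = total
        return out

    marginals = []
    for i in range(n):
        belief = {}
        for s in (-1, 1):
            val = math.exp(h[i] * s)
            for k in nbrs[i]:
                val *= message(k, i)[s]
            belief[s] = val
        marginals.append(belief[1] / (belief[-1] + belief[1]))
    return marginals

def brute_marginals(n, edges, J, h):
    """P(s_i = +1) by summing over all 2^n spin configurations."""
    z = 0.0
    up = [0.0] * n
    for spins in product((-1, 1), repeat=n):
        w = math.exp(sum(J * spins[a] * spins[b] for a, b in edges)
                     + sum(h[i] * spins[i] for i in range(n)))
        z += w
        for i in range(n):
            if spins[i] == 1:
                up[i] += w
    return [u / z for u in up]

edges = [(0, 1), (1, 2), (1, 3), (3, 4)]
J, h = 0.5, [0.2, 0.0, -0.1, 0.0, 0.3]
bp = bp_marginals(5, edges, J, h)
exact = brute_marginals(5, edges, J, h)
print(max(abs(a - b) for a, b in zip(bp, exact)))  # ~0 on a tree
```

The hypertree construction in the paper plays the same role as the tree here: it restores the loop-free structure that makes these messages exact.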

  14. A clustering algorithm for sample data based on environmental pollution characteristics

    NASA Astrophysics Data System (ADS)

    Chen, Mei; Wang, Pengfei; Chen, Qiang; Wu, Jiadong; Chen, Xiaoyun

    2015-04-01

    Environmental pollution has become an issue of serious international concern in recent years. Among the receptor-oriented pollution models, CMB, PMF, UNMIX, and PCA are widely used as source apportionment models. To improve the accuracy of source apportionment and classify the sample data for these models, this study proposes an easy-to-use, high-dimensional EPC algorithm that not only organizes all of the sample data into different groups according to the similarities in pollution characteristics such as pollution sources and concentrations but also simultaneously detects outliers. The main clustering process consists of selecting the first unlabelled point as the cluster centre, then assigning each data point in the sample dataset to its most similar cluster centre according to both the user-defined threshold and the value of similarity function in each iteration, and finally modifying the clusters using a method similar to k-Means. The validity and accuracy of the algorithm are tested using both real and synthetic datasets, which makes the EPC algorithm practical and effective for appropriately classifying sample data for source apportionment models and helpful for better understanding and interpreting the sources of pollution.
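
The clustering loop described above can be sketched roughly as follows. The similarity function, threshold value, and outlier rule here are illustrative stand-ins (Euclidean distance on toy 2-D points), not the actual EPC definitions.

```python
import math
from collections import Counter

def epc_sketch(points, threshold, min_size=2, iters=3):
    """Threshold-seeded clustering with a k-means-style refinement pass
    and small-cluster outlier flagging (label -1)."""
    centres, labels = [], []
    for p in points:                                  # seeding/assignment pass
        best = next((c for c, ctr in enumerate(centres)
                     if math.dist(p, ctr) <= threshold), None)
        if best is None:                              # first unlabelled point
            centres.append(p)                         # becomes a new centre
            best = len(centres) - 1
        labels.append(best)
    for _ in range(iters):                            # k-means-like refinement
        groups = {c: [] for c in range(len(centres))}
        for p, c in zip(points, labels):
            groups[c].append(p)
        centres = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centres[c]
                   for c, g in sorted(groups.items())]
        labels = [min(range(len(centres)), key=lambda c: math.dist(p, centres[c]))
                  for p in points]
    counts = Counter(labels)                          # flag tiny clusters as outliers
    return [c if counts[c] >= min_size else -1 for c in labels], centres

data = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.2, 9.9), (50, 50)]
labels, _ = epc_sketch(data, threshold=2.0)
print(labels)  # -> [0, 0, 0, 1, 1, -1]: the isolated (50, 50) is an outlier
```

The single pass over unlabelled points, the threshold-gated assignment, and the k-means-style modification mirror the three stages the abstract describes.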

  15. A priori data-driven multi-clustered reservoir generation algorithm for echo state network.

    PubMed

    Li, Xiumin; Zhong, Ling; Xue, Fangzheng; Zhang, Anguo

    2015-01-01

    Echo state networks (ESNs) with multi-clustered reservoir topology perform better in reservoir computing and robustness than those with random reservoir topology. However, these ESNs have a complex reservoir topology, which leads to difficulties in reservoir generation. This study focuses on the reservoir generation problem when ESN is used in environments with sufficient a priori data available. Accordingly, a priori data-driven multi-cluster reservoir generation algorithm is proposed. The a priori data in the proposed algorithm are used to evaluate reservoirs by calculating the precision and standard deviation of ESNs. The reservoirs are produced using the clustering method; only the reservoir with a better evaluation performance takes the place of a previous one. The final reservoir is obtained when its evaluation score reaches the preset requirement. The prediction experiment results obtained using the Mackey-Glass chaotic time series show that the proposed reservoir generation algorithm provides ESNs with extra prediction precision and increases the structure complexity of the network. Further experiments also reveal the appropriate values of the number of clusters and time window size to obtain optimal performance. The information entropy of the reservoir reaches the maximum when ESN gains the greatest precision. PMID:25875296

  16. FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks.

    PubMed

    Yang, Jing; Chen, Limin; Zhang, Jianpei

    2015-01-01

    It is important to cluster heterogeneous information networks. A fast clustering algorithm based on an approximate commute time embedding for heterogeneous information networks with a star network schema is proposed in this paper by utilizing the sparsity of heterogeneous information networks. First, a heterogeneous information network is transformed into multiple compatible bipartite graphs from a compatibility point of view. Second, the approximate commute time embedding of each bipartite graph is computed using random mapping and a linear time solver. All of the indicator subsets in each embedding simultaneously determine the target dataset. Finally, a general model is formulated by these indicator subsets, and a fast algorithm is derived by simultaneously clustering all of the indicator subsets using the sum of the weighted distances for all indicators for an identical target object. The proposed fast algorithm, FctClus, is shown to be efficient and generalizable and exhibits high clustering accuracy and fast computation speed based on a theoretic analysis and experimental verification. PMID:26090857

  18. A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database

    DOE PAGES

    Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; Shephard, Mark S.

    2013-01-01

    Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm of creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
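
The idea of ghost copies reduces to a simple picture on a 1-D mesh partitioned across ranks: each rank receives copies of its neighbours' boundary entities. The serial sketch below only illustrates the neighbour-to-neighbour (not all-to-all) communication pattern; it is not the FMDB algorithm, and the function name is hypothetical.

```python
def add_ghost_layers(partitions, n_layers=1):
    """Give each partition read-only copies (ghosts) of the n_layers
    boundary entities of its left and right neighbours."""
    ghosted = []
    for rank, local in enumerate(partitions):
        left = partitions[rank - 1][-n_layers:] if rank > 0 else []
        right = partitions[rank + 1][:n_layers] if rank < len(partitions) - 1 else []
        ghosted.append(left + local + right)
    return ghosted

parts = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(add_ghost_layers(parts))
# -> [[0, 1, 2, 3], [2, 3, 4, 5, 6], [5, 6, 7, 8]]
```

Raising `n_layers` corresponds to the neighbors-of-neighbors case in point (3): with enough layers every rank eventually holds a ghost of the whole mesh.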

  19. Performance Enhancement of Radial Distributed System with Distributed Generators by Reconfiguration Using Binary Firefly Algorithm

    NASA Astrophysics Data System (ADS)

    Rajalakshmi, N.; Padma Subramanian, D.; Thamizhavel, K.

    2015-03-01

    The extent of real power loss and voltage deviation associated with overloaded feeders in radial distribution system can be reduced by reconfiguration. Reconfiguration is normally achieved by changing the open/closed state of tie/sectionalizing switches. Finding optimal switch combination is a complicated problem as there are many switching combinations possible in a distribution system. Hence optimization techniques are finding greater importance in reducing the complexity of reconfiguration problem. This paper presents the application of firefly algorithm (FA) for optimal reconfiguration of radial distribution system with distributed generators (DG). The algorithm is tested on IEEE 33 bus system installed with DGs and the results are compared with binary genetic algorithm. It is found that binary FA is more effective than binary genetic algorithm in achieving real power loss reduction and improving voltage profile and hence enhancing the performance of radial distribution system. Results are found to be optimum when DGs are added to the test system, which proved the impact of DGs on distribution system.
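
A minimal sketch of the binary firefly mechanics follows. A real reconfiguration study would evaluate power loss from a load-flow solve under radiality constraints; here a simple Hamming-distance loss to a known-best switch state stands in for that objective, and all parameter values and names are illustrative assumptions rather than the paper's settings.

```python
import math
import random

def binary_firefly(loss, n_bits, n_flies=12, n_iter=60, gamma=1.0,
                   beta0=1.0, alpha=0.3, seed=1):
    """Binary firefly sketch: real-valued positions move toward brighter
    (lower-loss) flies, and a sigmoid turns them into 0/1 switch states."""
    rng = random.Random(seed)

    def binarize(x):
        return [1 if rng.random() < 1.0 / (1.0 + math.exp(-xi)) else 0 for xi in x]

    pos = [[rng.uniform(-1, 1) for _ in range(n_bits)] for _ in range(n_flies)]
    bits = [binarize(p) for p in pos]
    light = [loss(b) for b in bits]
    best, best_bits = min(zip(light, bits))
    for _ in range(n_iter):
        for i in range(n_flies):
            for j in range(n_flies):
                if light[j] < light[i]:          # j is brighter: move i toward j
                    r2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
                    beta = beta0 * math.exp(-gamma * r2)
                    pos[i] = [a + beta * (b - a) + alpha * (rng.random() - 0.5)
                              for a, b in zip(pos[i], pos[j])]
            bits[i] = binarize(pos[i])
            light[i] = loss(bits[i])
            if light[i] < best:
                best, best_bits = light[i], bits[i]
    return best_bits, best

# Toy stand-in for the load-flow loss: distance to a known optimal switch state.
target = [1, 0, 1, 1, 0, 0]
hamming = lambda b: sum(x != y for x, y in zip(b, target))
best_bits, best = binary_firefly(hamming, n_bits=6)
```

With this seed the search typically recovers the target switch state or lands within a bit or two of it; swapping `hamming` for a loss-plus-voltage-penalty function gives the reconfiguration setting.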

  20. Analysis of an algorithm for distributed recognition and accountability

    SciTech Connect

    Ko, C.; Frincke, D.A.; Goan, T. Jr.; Heberlein, L.T.; Levitt, K.; Mukherjee, B.; Wee, C.

    1993-08-01

    Computer and network systems are vulnerable to attacks. Abandoning the existing huge infrastructure of possibly insecure computer and network systems is impossible, and replacing them by totally secure systems may not be feasible or cost effective. A common element in many attacks is that a single user will often attempt to intrude upon multiple resources throughout a network. Detecting the attack can become significantly easier by compiling and integrating evidence of such intrusion attempts across the network rather than attempting to assess the situation from the vantage point of only a single host. To solve this problem, we suggest an approach for distributed recognition and accountability (DRA), which consists of algorithms which "process," at a central location, distributed and asynchronous "reports" generated by computers (or a subset thereof) throughout the network. Our highest-priority objectives are to observe the ways in which an individual moves around in a network of computers, including changing user names to possibly hide his/her true identity, and to associate all activities of multiple instances of the same individual with the same network-wide user. We present the DRA algorithm and a sketch of its proof under an initial set of simplifying albeit realistic assumptions. Later, we relax these assumptions to accommodate pragmatic aspects such as missing or delayed "reports," clock skew, tampered "reports," etc. We believe that such algorithms will have widespread applications in the future, particularly in intrusion-detection systems.

  1. Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs

    NASA Astrophysics Data System (ADS)

    Choi, Woo-Yong; Chatterjee, Mainak

    2015-03-01

    With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that extent, we propose a real-time cluster-based multipolling sequencing algorithm that drastically eliminates more than 90% of the polling overhead, particularly so when the dynamic search algorithm fails to derive the multipolling sequence in real time.

  2. Sampling cluster stability for peer-to-peer based content distribution networks

    NASA Astrophysics Data System (ADS)

    Darlagiannis, Vasilios; Mauthe, Andreas; Steinmetz, Ralf

    2006-01-01

    Several types of Content Distribution Networks are being deployed over the Internet today, based on different architectures to meet their requirements (e.g., scalability, efficiency and resiliency). Peer-to-Peer (P2P) based Content Distribution Networks are promising approaches that have several advantages. Structured P2P networks, for instance, take a proactive approach and provide efficient routing mechanisms. Nevertheless, their maintenance can increase considerably in highly dynamic P2P environments. In order to address this issue, a two-tier architecture that combines a structured overlay network with a clustering mechanism is suggested in a hybrid scheme. In this paper, we examine several sampling algorithms utilized in the aforementioned hybrid network that collect local information in order to apply a selective join procedure. The algorithms are based mostly on random walks inside the overlay network. The aim of the selective join procedure is to provide a well balanced and stable overlay infrastructure that can easily overcome the unreliable behavior of the autonomous peers that constitute the network. The sampling algorithms are evaluated using simulation experiments where several properties related to the graph structure are revealed.
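
The random-walk sampling underlying the selective join can be sketched generically. The overlay below is a small hand-made adjacency dict, and the stability statistic (mean peer degree over the walk) is a simple stand-in for the paper's measures, not its actual algorithm.

```python
import random

def random_walk_sample(graph, start, length, rng):
    """Walk `length` hops from `start`, recording every peer visited."""
    node, visited = start, [start]
    for _ in range(length):
        node = rng.choice(graph[node])
        visited.append(node)
    return visited

# Small overlay given as an adjacency dict (peer id -> neighbour ids).
overlay = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0, 4], 4: [3]}
walk = random_walk_sample(overlay, start=0, length=20, rng=random.Random(7))
# A joining peer could, for instance, prefer the sampled neighbourhood whose
# peers have the highest mean degree as a crude proxy for stability:
mean_degree = sum(len(overlay[n]) for n in walk) / len(walk)
```

Note that plain random walks sample peers with probability proportional to degree, which is one reason such local statistics need care before being used in a join decision.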

  3. A genetic algorithmic approach to antenna null-steering using a cluster computer.

    NASA Astrophysics Data System (ADS)

    Recine, Greg; Cui, Hong-Liang

    2001-06-01

    We apply a genetic algorithm (GA) to the problem of electronically steering the maximums and nulls of an antenna array to desired positions (null toward enemy listener/jammer, max toward friendly listener/transmitter). The antenna pattern itself is computed using NEC2 which is called by the main GA program. Since a GA naturally lends itself to parallelization, this simulation was applied to our new twin 64-node cluster computers (Gemini). Design issues and uses of the Gemini cluster in our group are also discussed.

  4. K-Means Re-Clustering-Algorithmic Options with Quantifiable Performance Comparisons

    SciTech Connect

    Meyer, A W; Paglieroni, D; Asteneh, C

    2002-12-17

    This paper presents various architectural options for implementing a K-Means Re-Clustering algorithm suitable for unsupervised segmentation of hyperspectral images. Performance metrics are developed based upon quantitative comparisons of convergence rates and segmentation quality. A methodology for making these comparisons is developed and used to establish K values that produce the best segmentations with minimal processing requirements. Convergence rates depend on the initial choice of cluster centers. Consequently, this same methodology may be used to evaluate the effectiveness of different initialization techniques.
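
The dependence of convergence rate on the initial choice of cluster centers is easy to reproduce with a bare-bones k-means. The data and centre choices below are illustrative, not the paper's hyperspectral setup.

```python
import math

def kmeans(points, centres, max_iter=100):
    """Plain k-means; returns final centres and the iterations to converge."""
    centres = list(centres)
    for it in range(1, max_iter + 1):
        clusters = [[] for _ in centres]
        for p in points:
            k = min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            clusters[k].append(p)
        new = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centres[i]
               for i, g in enumerate(clusters)]
        if new == centres:          # centres stable: converged
            return new, it
        centres = new
    return centres, max_iter

blobs = [(0, 0), (1, 0), (0, 1)] + [(10, 10), (11, 10), (10, 11)]
_, fast = kmeans(blobs, [(0, 0), (10, 10)])   # one seed near each blob
_, slow = kmeans(blobs, [(0, 0), (1, 0)])     # both seeds in one blob
print(fast, slow)  # -> 2 3
```

Even on this tiny example the well-spread initialization converges in fewer iterations, which is the effect the paper quantifies when comparing initialization techniques.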

  5. KD-tree based clustering algorithm for fast face recognition on large-scale data

    NASA Astrophysics Data System (ADS)

    Wang, Yuanyuan; Lin, Yaping; Yang, Junfeng

    2015-07-01

    This paper proposes an acceleration method for large-scale face recognition systems. When dealing with a large-scale database, face recognition is time-consuming. In order to tackle this problem, we employ the k-means clustering algorithm to classify face data. Specifically, the data in each cluster are stored in the form of a kd-tree, and face feature matching is conducted with a kd-tree based nearest neighborhood search. Experiments on CAS-PEAL and a self-collected database show the effectiveness of our proposed method.

  6. What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm

    PubMed Central

    Baig, Fahd; Little, Max A.

    2016-01-01

    The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism. PMID:27669525

  7. Fast randomized Hough transformation track initiation algorithm based on multi-scale clustering

    NASA Astrophysics Data System (ADS)

    Wan, Minjie; Gu, Guohua; Chen, Qian; Qian, Weixian; Wang, Pengcheng

    2015-10-01

    A fast randomized Hough transformation track initiation algorithm based on multi-scale clustering is proposed to overcome the shortcomings of traditional infrared search and track (IRST) systems, whose two-dimensional, bearing-only track association algorithms can neither provide movement information for the initial target nor select the correlation threshold automatically. All targets are assumed to move with uniform rectilinear motion throughout the new algorithm. Concepts of spatial random sampling, a dynamic linking table in parameter space, and convergent mapping from image space to parameter space are developed on the basis of the fast randomized Hough transformation. Because peak detection is built on thresholding, peak values tend to cluster, and accuracy can only be ensured when the parameter space has an obvious peak; a multi-scale idea is therefore added to the algorithm. First, a primary association selects several candidate tracks with a low threshold. The candidate tracks are then processed by multi-scale clustering, which automatically determines the correct number of tracks and their parameters by varying the scale parameter. The first three frames are processed by this algorithm to obtain the first three points of each track, from which two slightly different gate radii are computed; their mean is used as the global correlation threshold. Moreover, a new curvilinear-equation correction model is applied to the track initiation algorithm to remove the shape distortion that arises when a three-dimensional space curve is mapped into the two-dimensional bearing-only space. Simulations of sideways flight, launch, and landing scenarios demonstrate the effectiveness, accuracy, and adaptivity of the proposed approach.
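
    The voting step at the heart of a randomized Hough transform can be sketched in a few lines: sample point pairs at random, map each pair to the (theta, rho) parameters of the line through it, and accumulate votes in a sparse table. This is a generic illustration of the sampling idea, not the paper's track-initiation pipeline; the function name, resolutions, and toy data are invented for the example.

```python
import math
import random
from collections import defaultdict

def randomized_hough_line(points, n_samples=2000, theta_res=0.05, rho_res=0.5, seed=0):
    """Randomly sample point pairs, map each pair to the (theta, rho)
    parameters of the line through it, and vote in a sparse accumulator."""
    rng = random.Random(seed)
    votes = defaultdict(int)
    pts = list(points)
    for _ in range(n_samples):
        p, q = rng.sample(pts, 2)
        if q < p:                      # canonical order so a pair always
            p, q = q, p                # votes for the same (theta, rho) bin
        (x1, y1), (x2, y2) = p, q
        theta = math.atan2(y2 - y1, x2 - x1) + math.pi / 2  # line normal angle
        rho = x1 * math.cos(theta) + y1 * math.sin(theta)
        votes[(round(theta / theta_res), round(rho / rho_res))] += 1
    (ti, ri), _ = max(votes.items(), key=lambda kv: kv[1])
    return ti * theta_res, ri * rho_res

# 20 points on the line y = x plus two outliers; the winning bin should
# recover the line's normal angle 3*pi/4 and distance rho = 0.
points = [(float(i), float(i)) for i in range(20)] + [(3.0, 15.0), (17.0, 2.0)]
theta, rho = randomized_hough_line(points)
```

    Because most sampled pairs are collinear, their votes pile up in a single parameter-space bin while outlier pairs scatter, which is exactly the peak-detection situation the multi-scale clustering step is designed to sharpen.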

  8. Revised basin-hopping Monte Carlo algorithm for structure optimization of clusters and nanoparticles.

    PubMed

    Rondina, Gustavo G; Da Silva, Juarez L F

    2013-09-23

    Suggestions for improving the basin-hopping Monte Carlo (BHMC) algorithm for unbiased global optimization of clusters and nanoparticles are presented. The traditional basin-hopping exploration scheme with Monte Carlo sampling is improved by bringing together novel strategies and techniques employed in different global optimization methods, while taking care to keep the underlying BHMC algorithm unchanged. The improvements include a total of eleven local and nonlocal trial operators tailored for clusters and nanoparticles that allow an efficient exploration of the potential energy surface, two different strategies (static and dynamic) of operator selection, and a filter operator to handle unphysical solutions. In order to assess the efficiency of our strategies, we applied our implementation to several classes of systems, including Lennard-Jones and Sutton-Chen clusters with up to 147 and 148 atoms, respectively, a set of Lennard-Jones nanoparticles with sizes ranging from 200 to 1500 atoms, binary Lennard-Jones clusters with up to 100 atoms, (AgPd)55 alloy clusters described by the Sutton-Chen potential, and aluminum clusters with up to 30 atoms described within the density functional theory framework. Using unbiased global searches, our implementation successfully reproduced the great majority of published results for the systems considered, in many cases more efficiently than standard BHMC. We also located previously unknown global minimum structures for some of the systems considered. This revised BHMC method is a valuable tool for aiding theoretical investigations leading to a better understanding of the atomic structures of clusters and nanoparticles. PMID:23957311
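
    The basic basin-hopping loop (perturb, locally relax, Metropolis-accept, track the best minimum) is compact enough to sketch on a toy one-dimensional double-well potential. The cluster potentials in the paper are vastly higher-dimensional; the function, gradient-descent relaxation, step size, and temperature below are all illustrative assumptions.

```python
import math
import random

def f(x):
    # Toy double-well potential: local minimum near x = 2, global near x = -2.
    return x**4 - 8 * x**2 + x

def local_minimize(x, lr=0.01, steps=500):
    # Plain gradient descent stands in for the local relaxation step.
    for _ in range(steps):
        x -= lr * (4 * x**3 - 16 * x + 1)
    return x

def basin_hopping(x0, n_hops=100, step=3.0, temperature=1.0, seed=1):
    rng = random.Random(seed)
    x = local_minimize(x0)
    best_x, best_f = x, f(x)
    for _ in range(n_hops):
        trial = local_minimize(x + rng.uniform(-step, step))  # perturb + relax
        df = f(trial) - f(x)
        if df < 0 or rng.random() < math.exp(-df / temperature):  # Metropolis
            x = trial
        if f(x) < best_f:
            best_x, best_f = x, f(x)
    return best_x, best_f

best_x, best_f = basin_hopping(2.0)  # deliberately start in the shallow basin
```

    The paper's trial operators generalize the single `rng.uniform` perturbation here into a library of local and nonlocal moves, selected statically or dynamically.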

  9. Robustness of ‘cut and splice’ genetic algorithms in the structural optimization of atomic clusters

    NASA Astrophysics Data System (ADS)

    Froltsov, Vladimir A.; Reuter, Karsten

    2009-05-01

    We return to the geometry optimization problem of Lennard-Jones clusters to analyze the performance dependence of 'cut and splice' genetic algorithms (GAs) on the employed population size. We generally find that admixing twinning mutation moves leads to an improved robustness of the algorithm efficiency with respect to this a priori unknown technical parameter. The resulting very stable performance of the corresponding mutation + mating GA implementation over a wide range of population sizes is an important feature when addressing unknown systems with computationally involved first-principles based GA sampling.

  10. Modelling galaxy clustering: halo occupation distribution versus subhalo matching

    NASA Astrophysics Data System (ADS)

    Guo, Hong; Zheng, Zheng; Behroozi, Peter S.; Zehavi, Idit; Chuang, Chia-Hsun; Comparat, Johan; Favole, Ginevra; Gottloeber, Stefan; Klypin, Anatoly; Prada, Francisco; Rodríguez-Torres, Sergio A.; Weinberg, David H.; Yepes, Gustavo

    2016-07-01

    We model the luminosity-dependent projected and redshift-space two-point correlation functions (2PCFs) of the Sloan Digital Sky Survey (SDSS) Data Release 7 Main galaxy sample, using the halo occupation distribution (HOD) model and the subhalo abundance matching (SHAM) model and its extension. All the models are built on the same high-resolution N-body simulations. We find that the HOD model generally provides the best performance in reproducing the clustering measurements in both projected and redshift spaces. The SHAM model with the same halo-galaxy relation for central and satellite galaxies (or distinct haloes and subhaloes), when including scatters, has a best-fitting χ2/dof around 2-3. We therefore extend the SHAM model to the subhalo clustering and abundance matching (SCAM) by allowing the central and satellite galaxies to have different galaxy-halo relations. We infer the corresponding halo/subhalo parameters by jointly fitting the galaxy 2PCFs and abundances and consider subhaloes selected based on three properties, the mass Macc at the time of accretion, the maximum circular velocity Vacc at the time of accretion, and the peak maximum circular velocity Vpeak over the history of the subhaloes. The three subhalo models work well for luminous galaxy samples (with luminosity above L*). For low-luminosity samples, the Vacc model stands out in reproducing the data, with the Vpeak model slightly worse, while the Macc model fails to fit the data. We discuss the implications of the modelling results.

  11. 3D Hail Size Distribution Interpolation/Extrapolation Algorithm

    NASA Technical Reports Server (NTRS)

    Lane, John

    2013-01-01

    Radar data can usually detect hail; however, it is difficult for present day radar to accurately discriminate between hail and rain. Local ground-based hail sensors are much better at detecting hail against a rain background, and when incorporated with radar data, provide a much better local picture of a severe rain or hail event. The previous disdrometer interpolation/ extrapolation algorithm described a method to interpolate horizontally between multiple ground sensors (a minimum of three) and extrapolate vertically. This work is a modification to that approach that generates a purely extrapolated 3D spatial distribution when using a single sensor.

  12. Application of a clustering-based peak alignment algorithm to analyze various DNA fingerprinting data.

    PubMed

    Ishii, Satoshi; Kadota, Koji; Senoo, Keishi

    2009-09-01

    DNA fingerprinting analysis such as amplified ribosomal DNA restriction analysis (ARDRA), repetitive extragenic palindromic PCR (rep-PCR), ribosomal intergenic spacer analysis (RISA), and denaturing gradient gel electrophoresis (DGGE) are frequently used in various fields of microbiology. The major difficulty in DNA fingerprinting data analysis is the alignment of multiple peak sets. We report here an R program for a clustering-based peak alignment algorithm, and its application to analyze various DNA fingerprinting data, such as ARDRA, rep-PCR, RISA, and DGGE data. The results obtained by our clustering algorithm and by BioNumerics software showed high similarity. Since several R packages have been established to statistically analyze various biological data, the distance matrix obtained by our R program can be used for subsequent statistical analyses, some of which were not previously performed but are useful in DNA fingerprinting studies.
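
    A minimal version of such clustering-based alignment can be sketched as follows: pool the peak positions from all samples, cut the sorted pool wherever the gap exceeds a tolerance (single-linkage clustering in one dimension), and report one aligned bin per cluster plus a presence matrix. The tolerance and toy band positions are invented for illustration; the authors' R implementation is considerably more complete.

```python
def align_peaks(samples, tol=0.5):
    """Pool all peak positions, split the sorted pool wherever the gap to
    the previous position exceeds `tol`, and return cluster centers plus
    a cluster-by-sample presence matrix."""
    pooled = sorted((pos, idx) for idx, peaks in enumerate(samples) for pos in peaks)
    clusters, current = [], [pooled[0]]
    for item in pooled[1:]:
        if item[0] - current[-1][0] <= tol:
            current.append(item)
        else:
            clusters.append(current)
            current = [item]
    clusters.append(current)
    bins = [sum(pos for pos, _ in c) / len(c) for c in clusters]  # centers
    presence = [[any(idx == s for _, idx in c) for s in range(len(samples))]
                for c in clusters]
    return bins, presence

# Three fingerprint lanes with slightly shifted band positions (toy data).
samples = [[100.0, 150.2, 200.1], [100.3, 149.9], [150.0, 200.4]]
bins, presence = align_peaks(samples)
```

    The presence matrix is the kind of binary profile that can then feed a distance matrix for the downstream statistical analyses the abstract mentions.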

  13. A brief summary on formalizing parallel tensor distributions redistributions and algorithm derivations.

    SciTech Connect

    Schatz, Martin D.; Kolda, Tamara G.; van de Geijn, Robert

    2015-09-01

    Large-scale datasets in computational chemistry typically require distributed-memory parallel methods to perform a special operation known as tensor contraction. Tensors are multidimensional arrays, and a tensor contraction is akin to matrix multiplication with special types of permutations. Creating an efficient algorithm and optimized implementation in this domain is complex, tedious, and error-prone. To address this, we develop a notation to express data distributions so that we can apply automated methods to find optimized implementations for tensor contractions. We consider the spin-adapted coupled cluster singles and doubles method from computational chemistry and use our methodology to produce an efficient implementation. Experiments performed on the IBM Blue Gene/Q and Cray XC30 demonstrate both improved performance and reduced memory consumption.
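
    The core observation that a tensor contraction is a matrix multiplication after permuting and flattening modes can be shown in a few lines of plain Python. The toy sizes and data are invented; the paper's contribution is the distributed notation and automation, not this reduction itself.

```python
def contract_naive(A, B, I, K, L, J):
    # C[i][j] = sum over k, l of A[i][k][l] * B[k][l][j]
    return [[sum(A[i][k][l] * B[k][l][j] for k in range(K) for l in range(L))
             for j in range(J)] for i in range(I)]

def contract_as_matmul(A, B, I, K, L, J):
    # Flatten the contracted modes (k, l) into one index of size K*L:
    # the contraction becomes an ordinary (I x KL) times (KL x J) matmul.
    A2 = [[A[i][k][l] for k in range(K) for l in range(L)] for i in range(I)]
    B2 = [[B[k][l][j] for j in range(J)] for k in range(K) for l in range(L)]
    return [[sum(A2[i][m] * B2[m][j] for m in range(K * L)) for j in range(J)]
            for i in range(I)]

I, K, L, J = 2, 3, 4, 2
A = [[[i + k * l for l in range(L)] for k in range(K)] for i in range(I)]
B = [[[k - l + j for j in range(J)] for l in range(L)] for k in range(K)]
```

    In a distributed setting, the choice of which modes to flatten and how the flattened matrices are laid out across processes is exactly what the paper's notation makes explicit.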

  14. The radial distribution of galaxies in groups and clusters

    NASA Astrophysics Data System (ADS)

    Budzynski, J. M.; Koposov, S. E.; McCarthy, I. G.; McGee, S. L.; Belokurov, V.

    2012-06-01

    We present a new catalogue of 55 121 groups and clusters centred on luminous red galaxies from Sloan Digital Sky Survey Data Release 7 in the redshift range 0.15 ≤z≤ 0.4. We provide halo mass (M500) estimates for each of these groups derived from a calibration between the optical richness of bright galaxies (Mr≤-20.5) within 1 Mpc and X-ray-derived mass for a small subset of 129 groups and clusters with X-ray measurements. For 20 157 high-mass groups and clusters with M500 > 1013.7 M⊙, we find that the catalogue has a purity of >97 per cent and a completeness of ˜90 per cent. We derive the mean (stacked) surface number density profiles of galaxies as a function of total halo mass in different mass bins. We find that derived profiles can be well described by a projected Navarro-Frenk-White profile with a concentration parameter (≈ 2.6) that is approximately a factor of 2 lower than that of the dark matter (as predicted by N-body cosmological simulations) and nearly independent of halo mass. Interestingly, in spite of the difference in shape between the galaxy and dark matter radial distributions, both exhibit a high degree of self-similarity. We also stack the satellite profiles based on other observables, namely redshift, brightest cluster galaxy (BCG) luminosity and satellite luminosity and colour. We see no evidence for strong variation in profile shape with redshift over the range we probe or with BCG luminosity (or BCG luminosity fraction), but we do find a strong dependence on satellite luminosity and colours, in agreement with previous studies. A self-consistent comparison to several recent semi-analytic models of galaxy formation indicates that (1) beyond ≈0.3r500 current models are able to reproduce both the shape and normalization of the satellite profiles, and (2) within ≈0.3r500 the predicted profiles are sensitive to the details of the satellite-BCG merger time-scale calculation. The former is a direct result of the models

  15. Combustion distribution control using the extremum seeking algorithm

    NASA Astrophysics Data System (ADS)

    Marjanovic, A.; Krstic, M.; Djurovic, Z.; Kvascev, G.; Papic, V.

    2014-12-01

    Quality regulation of the combustion process inside the furnace is the basis of the high demands for increasing robustness, safety and efficiency of thermal power plants. The paper considers the possibility of controlling the spatial temperature distribution inside the boiler by correcting the distribution of coal over the mills. Such a control system keeps the flame focus away from the walls of the boiler, and thus preserves the equipment and reduces the possibility of ash slagging. At the same time, uniform heat dissipation over the mills enhances the energy efficiency of the boiler, while reducing the pollution of the system. A constrained multivariable extremum seeking algorithm is proposed as a tool for combustion process optimization with the main objective of centralizing the flame in the furnace. Simulations are conducted on a model corresponding to the 350 MW boiler of the Nikola Tesla Power Plant in Obrenovac, Serbia.
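
    The idea behind perturbation-based extremum seeking fits in a short sketch: dither the input sinusoidally, demodulate the measured objective with the same sinusoid to estimate the local gradient, and step in that direction. The quadratic map and all gains below are illustrative stand-ins for the flame-centering objective, not the plant model used in the paper.

```python
import math

def extremum_seeking(J, u0, a=0.1, gain=0.3, n_iters=200, n_dither=32):
    """Discrete-time dither/demodulation extremum seeking: each iteration
    probes J over one full sinusoidal dither period, demodulates the
    measurements to estimate dJ/du, and takes a gradient-ascent step."""
    u = u0
    for _ in range(n_iters):
        grad = 0.0
        for m in range(n_dither):
            phase = 2 * math.pi * m / n_dither
            grad += J(u + a * math.sin(phase)) * math.sin(phase)
        grad *= 2.0 / (a * n_dither)  # averaged demodulation ~ dJ/du
        u += gain * grad
    return u

# Quadratic performance map peaking at u = 3; the seeker should settle there.
u_star = extremum_seeking(lambda u: -(u - 3.0) ** 2, u0=0.0)
```

    The constrained multivariable version in the paper replaces the scalar input with a vector of coal-distribution corrections and restricts the steps to a feasible set, but the dither-demodulate-step loop is the same mechanism.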

  16. User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.

    PubMed

    Bourobou, Serge Thomas Mickala; Yoo, Younghwan

    2015-01-01

    This paper discusses the possibility of recognizing and predicting user activities in an IoT (Internet of Things) based smart environment. Activity recognition is usually done in two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, their performance is limited because they focus on only one of the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify such varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the smart environment is trained to recognize and predict user activities inside the user's personal space by utilizing an artificial neural network based on Allen's temporal relations. The experimental results show that our combined method provides higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home. PMID:26007738

  18. Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms

    NASA Astrophysics Data System (ADS)

    Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel

    2016-04-01

    Advances in graphics processing unit technology towards parallel architectures [1], comprising thousands of cores and multiples of parallel threads, provide the hardware foundation for the rapid processing of various parallel applications regarding seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are established and decades of data are compiled together [2]. Yet many processes in seismic data analysis are performed on each seismic event independently, or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, independent of one another, can be performed in parallel, narrowing down processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using CUDA C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel for comparison, are: fuzzy k-means clustering with expert knowledge [7] used to assign the overall number of clusters; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, CUDA C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering References [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publishers, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and

  19. A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets

    PubMed Central

    Zhang, Yipu; Wang, Ping

    2015-01-01

    The new high-throughput technique ChIP-seq, coupling chromatin immunoprecipitation experiments with high-throughput sequencing technologies, has extended the identification of the binding locations of a transcription factor to genome-wide regions. However, most existing motif discovery algorithms are time-consuming and limited in identifying binding motifs in ChIP-seq data, which is normally large in scale. In order to improve efficiency, we propose a fast cluster motif finding algorithm, named FCmotif, to identify (l, d) motifs in large-scale ChIP-seq data sets. It is inspired by the emerging substrings mining strategy: it finds the enriched substrings and then searches the neighborhood instances to construct PWMs and cluster motifs of different lengths. FCmotif does not follow the OOPS model constraint and can find long motifs. The effectiveness of the proposed algorithm has been proved by experiments on ChIP-seq data sets from mouse ES cells. The detection of the real binding motifs and processing of the full-size data of several megabytes finished in a few minutes. The experimental results show that FCmotif is advantageous for (l, d) motif finding in ChIP-seq data; meanwhile it also demonstrates better performance than other widely used algorithms such as MEME, Weeder, ChIPMunk, and DREME. PMID:26236718

  1. Commodity cluster and hardware-based massively parallel implementations of hyperspectral imaging algorithms

    NASA Astrophysics Data System (ADS)

    Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David

    2006-05-01

    The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is essential. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques, covering four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure the parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.

  2. Parallel OSEM Reconstruction Algorithm for Fully 3-D SPECT on a Beowulf Cluster.

    PubMed

    Rong, Zhou; Tianyu, Ma; Yongjie, Jin

    2005-01-01

    In order to improve the computation speed of the ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, an experimental Beowulf-type cluster was built and several parallel reconstruction schemes were described. We implemented a single-program-multiple-data (SPMD) parallel 3-D OSEM reconstruction algorithm based on the message passing interface (MPI) and tested it with combinations of different numbers of calculating processors and different reconstruction voxel grid sizes (64×64×64 and 128×128×128). Performance of the parallelization was evaluated in terms of the speedup factor and parallel efficiency. This parallel implementation methodology is expected to help make fully 3-D OSEM algorithms more feasible in clinical SPECT studies.
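
    The serial update that gets parallelized can be written compactly: for each ordered subset of measurements, forward-project the current image, compare with the measured data, and apply the multiplicative ML-EM correction. The 4-ray, 2-pixel toy system below is invented to keep the sketch self-contained; the real reconstruction distributes these loops over MPI ranks.

```python
def osem(A, y, subsets, n_iters=50):
    """Ordered-subset EM: cycle through measurement subsets, applying the
    multiplicative ML-EM update restricted to each subset in turn."""
    n = len(A[0])
    x = [1.0] * n                        # uniform positive initial image
    for _ in range(n_iters):
        for sub in subsets:
            # forward projection (A x)_i for the rows in this subset
            proj = {i: sum(A[i][j] * x[j] for j in range(n)) for i in sub}
            for j in range(n):
                num = sum(A[i][j] * y[i] / proj[i] for i in sub)
                den = sum(A[i][j] for i in sub)
                if den > 0:
                    x[j] *= num / den    # multiplicative correction
    return x

# Tiny 4-ray, 2-pixel system with known activity [2, 5] and noiseless data.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
x_true = [2.0, 5.0]
y = [sum(A[i][j] * x_true[j] for j in range(2)) for i in range(4)]
x = osem(A, y, subsets=[[0, 2], [1, 3]])
```

    Because each subset update touches only its own rows of A and y, an SPMD layout can partition rows across processors and reduce the per-pixel numerators and denominators, which is the essence of the parallel schemes described above.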

  3. Klystron Cluster Scheme for ILC High Power RF Distribution

    SciTech Connect

    Nantista, Christopher; Adolphsen, Chris; /SLAC

    2009-07-06

    We present a concept for powering the main linacs of the International Linear Collider (ILC) by delivering high power RF from the surface via overmoded, low-loss waveguides at widely spaced intervals. The baseline design employs a two-tunnel layout, with klystrons and modulators evenly distributed along a service tunnel running parallel to the accelerator tunnel. This new idea eliminates the need for the service tunnel. It also brings most of the warm heat load to the surface, dramatically reducing the tunnel water cooling and HVAC requirements. In the envisioned configuration, groups of 70 klystrons and modulators are clustered in surface buildings every 2.5 km. Their outputs are combined into two half-meter diameter circular TE{sub 01} mode evacuated waveguides. These are directed via special bends through a deep shaft and along the tunnel, one upstream and one downstream. Each feeds approximately 1.25 km of linac with power tapped off in 10 MW portions at 38 m intervals. The power is extracted through a novel coaxial tap-off (CTO), after which the local distribution is as it would be from a klystron. The tap-off design is also employed in reverse for the initial combining.

  4. A Novel Clustering Algorithm for Mobile Ad Hoc Networks Based on Determination of Virtual Links' Weight to Increase Network Stability

    PubMed Central

    Karimi, Abbas; Afsharfarnia, Abbas; Zarafshan, Faraneh; Al-Haddad, S. A. R.

    2014-01-01

    The stability of clusters is a serious issue in mobile ad hoc networks. Low stability of clusters may lead to rapid cluster failure, high energy consumption for reclustering, and a decrease in the overall stability of the network. In order to improve cluster stability, weight-based clustering algorithms are utilized. However, these algorithms use only limited features of the nodes, which reduces the accuracy of the weights in determining a node's competency and leads to incorrect selection of cluster heads. The new weight-based algorithm presented in this paper determines a node's weight not only from its own features but also from the direct effect of the features of adjacent nodes. It determines the weight of the virtual links between nodes and the effect of these weights on a node's final weight. With this strategy, the highest weights are assigned to the best candidates for cluster head, and the accuracy of node selection increases. The performance of the new algorithm is analyzed by computer simulation. The results show that the produced clusters have longer lifetimes and higher stability. Mathematical simulation shows that this algorithm has high availability in case of failure. PMID:25114965
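
    The flavor of such a scheme can be sketched in a few lines: give each virtual link the mean of its endpoints' own weights, let a node's final weight be its own weight plus its links' weights, and greedily elect the highest-weight uncovered node as cluster head. The topology, weights, and combination rule below are all invented for illustration and are much simpler than the paper's metric.

```python
def elect_cluster_heads(own, neighbors):
    """Weight-based election: a node's final weight adds the weights of its
    virtual links (mean of the endpoints' own weights); the highest-weight
    uncovered node is repeatedly chosen as a cluster head."""
    final = {n: own[n] + sum((own[n] + own[m]) / 2 for m in neighbors[n])
             for n in own}
    uncovered, heads = set(own), []
    while uncovered:
        head = max(sorted(uncovered), key=lambda n: final[n])
        heads.append(head)                          # a head covers itself
        uncovered -= {head} | set(neighbors[head])  # and its 1-hop neighbors
    return sorted(heads), final

# Toy 6-node topology; weights might combine residual energy, degree, mobility.
own = {0: 0.9, 1: 0.4, 2: 0.7, 3: 0.3, 4: 0.8, 5: 0.2}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
heads, final = elect_cluster_heads(own, neighbors)
```

    Note how node 2, despite a lower own weight than node 0, wins the election because its virtual links to well-weighted neighbors raise its final weight; this is the adjacency effect the abstract describes.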

  6. The Spatial Distribution of the Young Stellar Clusters in the Star-forming Galaxy NGC 628

    NASA Astrophysics Data System (ADS)

    Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Aloisi, A.; Bright, S. N.; Christian, C.; Cignoni, M.; Dale, D. A.; Dobbs, C.; Elmegreen, D. M.; Fumagalli, M.; Gallagher, J. S., III; Grebel, E. K.; Johnson, K. E.; Lee, J. C.; Messa, M.; Smith, L. J.; Ryon, J. E.; Thilker, D.; Ubeda, L.; Wofford, A.

    2015-12-01

    We present a study of the spatial distribution of the stellar cluster populations in the star-forming galaxy NGC 628. Using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey), we have identified 1392 potential young (≲ 100 Myr) stellar clusters within the galaxy using a combination of visual inspection and automatic selection. We investigate the clustering of these young stellar clusters and quantify the strength and change of clustering strength with scale using the two-point correlation function. We also investigate how image boundary conditions and dust lanes affect the observed clustering. The distribution of the clusters is well fit by a broken power law with negative exponent α. We recover a weighted mean index of α ∼ -0.8 for all spatial scales below the break at 3.″3 (158 pc at a distance of 9.9 Mpc) and an index of α ∼ -0.18 above 158 pc for the accumulation of all cluster types. The strength of the clustering increases with decreasing age and clusters older than 40 Myr lose their clustered structure very rapidly and tend to be randomly distributed in this galaxy, whereas the mass of the star cluster has little effect on the clustering strength. This is consistent with results from other studies that the morphological hierarchy in stellar clustering resembles the same hierarchy as the turbulent interstellar medium.
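
    The two-point statistic itself is easy to sketch: compare the fraction of close data-data pairs with the same fraction in an unclustered random catalogue; an excess signals clustering. The toy below uses a cumulative version of the Peebles-Hauser DD/RR - 1 estimator on synthetic 2-D points, not the LEGUS measurement, and all sizes and scales are invented.

```python
import random

def pair_fraction(points, r_max):
    """Fraction of distinct pairs separated by less than r_max."""
    n, close = len(points), 0
    for i in range(n):
        for j in range(i + 1, n):
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if dx * dx + dy * dy < r_max * r_max:
                close += 1
    return close / (n * (n - 1) / 2)

def two_point_estimate(data, randoms, r_max):
    """Cumulative Peebles-Hauser estimator xi(<r) = DD/RR - 1."""
    return pair_fraction(data, r_max) / pair_fraction(randoms, r_max) - 1

rng = random.Random(42)
randoms = [(rng.random(), rng.random()) for _ in range(400)]
# Clustered data: 20 cluster centers with 20 members each within +/- 0.02.
centers = [(rng.random(), rng.random()) for _ in range(20)]
data = [(cx + rng.uniform(-0.02, 0.02), cy + rng.uniform(-0.02, 0.02))
        for cx, cy in centers for _ in range(20)]
xi = two_point_estimate(data, randoms, r_max=0.05)
```

    A strongly positive xi at small separations, as this toy produces, corresponds to the steep small-scale power law the study fits below its 158 pc break.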

  7. The cascaded moving k-means and fuzzy c-means clustering algorithms for unsupervised segmentation of malaria images

    NASA Astrophysics Data System (ADS)

    Abdul-Nasir, Aimi Salihah; Mashor, Mohd Yusoff; Halim, Nurul Hazwani Abd; Mohamed, Zeehaida

    2015-05-01

    Malaria is a life-threatening parasitic infectious disease that accounts for nearly one million deaths each year. Because malaria requires prompt and accurate diagnosis, the current study proposes an unsupervised pixel segmentation based on clustering algorithms to obtain fully segmented red blood cells (RBCs) infected with malaria parasites from thin blood smear images of the P. vivax species. To obtain the segmented infected cells, the malaria images are first enhanced using a modified global contrast stretching technique. Then, an unsupervised segmentation technique based on clustering is applied to the intensity component of the malaria image in order to segment the infected cells from the blood-cell background. In this study, cascaded moving k-means (MKM) and fuzzy c-means (FCM) clustering algorithms are proposed for malaria slide image segmentation. After that, a median filter is applied to smooth the image and remove small unwanted regions such as isolated background pixels. Finally, a seeded region growing area extraction algorithm is applied to remove large unwanted regions that still appear in the image and, because of their size, cannot be removed by the median filter. The effectiveness of the proposed cascaded MKM and FCM clustering algorithms has been analyzed qualitatively and quantitatively by comparing the cascaded algorithm with the MKM and FCM clustering algorithms alone. Overall, the results indicate that segmentation using the proposed cascaded clustering algorithm produces the best segmentation performance, achieving acceptable sensitivity as well as higher specificity and accuracy than the segmentation results provided by the MKM and FCM algorithms.
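
    The FCM half of the cascade can be sketched compactly: alternate soft membership updates and membership-weighted center updates until the centers settle. One-dimensional toy intensities stand in for the image's intensity component; the initialization, fuzzifier, and data below are invented for the example.

```python
def fuzzy_c_means(data, c=2, m=2.0, n_iters=100):
    """Fuzzy c-means on 1-D data: alternate the membership update and the
    membership-weighted center update (fuzzifier m)."""
    centers = [min(data), max(data)] if c == 2 else data[:c]
    for _ in range(n_iters):
        u = []  # u[k][i]: membership of point k in cluster i
        for x in data:
            d = [abs(x - ctr) + 1e-12 for ctr in centers]
            u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0))
                                for j in range(c)) for i in range(c)])
        centers = [sum(u[k][i] ** m * data[k] for k in range(len(data))) /
                   sum(u[k][i] ** m for k in range(len(data)))
                   for i in range(c)]
    return centers

# Two well-separated intensity groups (toy gray levels).
data = [10.0, 11.0, 12.0, 48.0, 50.0, 52.0]
centers = sorted(fuzzy_c_means(data))
```

    On an actual blood-smear image the data would be the per-pixel intensities, and thresholding the resulting memberships yields the infected-cell mask that the median filter and region growing steps then clean up.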

  8. Effects of cluster location and cluster distribution on performance on the traveling salesman problem.

    PubMed

    MacGregor, James N

    2015-10-01

    Research on human performance in solving traveling salesman problems typically uses point sets as stimuli, and most models have proposed a processing stage at which stimulus dots are clustered. However, few empirical studies have investigated the effects of clustering on performance. In one recent study, researchers compared the effects of clustered, random, and regular stimuli, and concluded that clustering facilitates performance (Dry, Preiss, & Wagemans, 2012). Another study suggested that these results may have been influenced by the location rather than the degree of clustering (MacGregor, 2013). Two experiments are reported that mark an attempt to disentangle these factors. The first experiment tested several combinations of degree of clustering and cluster location, and revealed mixed evidence that clustering influences performance. In a second experiment, both factors were varied independently, showing that they interact. The results are discussed in terms of the importance of clustering effects, in particular, and perceptual factors, in general, during performance of the traveling salesman problem.

  9. INITIAL SIZE DISTRIBUTION OF THE GALACTIC GLOBULAR CLUSTER SYSTEM

    SciTech Connect

    Shin, Jihye; Kim, Sungsoo S.; Yoon, Suk-Jin; Kim, Juhan

    2013-01-10

    Despite the importance of their size evolution in understanding the dynamical evolution of globular clusters (GCs) of the Milky Way, studies that focus specifically on this issue are rare. Based on the advanced, realistic Fokker-Planck (FP) approach, we theoretically predict the initial size distribution (SD) of the Galactic GCs along with their initial mass function and radial distribution. Over one thousand FP calculations in a wide parameter space have pinpointed the best-fit initial conditions for the SD, mass function, and radial distribution. Our best-fit model shows that the initial SD of the Galactic GCs is of larger dispersion than today's SD, and that the typical projected half-light radius of the initial GCs is ~4.6 pc, which is 1.8 times larger than that of the present-day GCs (~2.5 pc). Their large size signifies greater susceptibility to the Galactic tides: the total mass of destroyed GCs reaches 3-5 × 10^8 M_⊙, several times larger than previous estimates. Our result challenges a recent view that the Milky Way GCs were born compact on the sub-pc scale, and rather implies that (1) the initial GCs were generally larger than the typical size of the present-day GCs, (2) the initially large GCs mostly shrank and/or disrupted as a result of the galactic tides, and (3) the initially small GCs expanded by two-body relaxation, and later shrank by the galactic tides.

  10. Lightweight and Distributed Connectivity-Based Clustering Derived from Schelling's Model

    NASA Astrophysics Data System (ADS)

    Tsugawa, Sho; Ohsaki, Hiroyuki; Imase, Makoto

    In the literature, two connectivity-based distributed clustering schemes exist: CDC (Connectivity-based Distributed node Clustering scheme) and SDC (SCM-based Distributed Clustering). While CDC and SDC have mechanisms for maintaining clusters against nodes joining and leaving, neither method assumes that frequent changes occur in the network topology. In this paper, we propose a lightweight distributed clustering method that we term SBDC (Schelling-Based Distributed Clustering) since this scheme is derived from Schelling's model — a popular segregation model in sociology. We evaluate the effectiveness of the proposed SBDC in an environment where frequent changes arise in the network topology. Our simulation results show that SBDC outperforms CDC and SDC under frequent changes in network topology caused by high node mobility.
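
    Schelling's model supplies the intuition behind SBDC: a node "unhappy" with its neighbourhood's composition changes state using purely local information. The toy sketch below applies a Schelling-style relabeling rule on a static graph; it illustrates the idea only and is not the SBDC protocol itself (the tolerance value and update rule are invented for the example).

```python
import random
from collections import Counter

def schelling_relabel(adj, labels, tolerance=0.5, rounds=20, seed=0):
    """Illustrative Schelling-style local relabeling on a graph.

    adj: {node: set(neighbors)}; labels: {node: cluster label}. A node
    whose fraction of same-label neighbours falls below `tolerance`
    adopts the majority label among its neighbours.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    for _ in range(rounds):
        rng.shuffle(nodes)
        for v in nodes:
            nbrs = adj[v]
            if not nbrs:
                continue
            same = sum(1 for u in nbrs if labels[u] == labels[v])
            if same / len(nbrs) < tolerance:
                labels[v] = Counter(labels[u] for u in nbrs).most_common(1)[0][0]
    return labels

# Two triangles joined by one edge; one "misplaced" label on each side.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
labels = {0: "A", 1: "A", 2: "B", 3: "A", 4: "B", 5: "B"}
labels = schelling_relabel(adj, labels)
```

    Because each node consults only its direct neighbours, the rule stays cheap under churn, which is the property SBDC exploits in mobile topologies.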

  11. Constraints on the richness-mass relation and the optical-SZE positional offset distribution for SZE-selected clusters

    NASA Astrophysics Data System (ADS)

    Saro, A.; Bocquet, S.; Rozo, E.; Benson, B. A.; Mohr, J.; Rykoff, E. S.; Soares-Santos, M.; Bleem, L.; Dodelson, S.; Melchior, P.; Sobreira, F.; Upadhyay, V.; Weller, J.; Abbott, T.; Abdalla, F. B.; Allam, S.; Armstrong, R.; Banerji, M.; Bauer, A. H.; Bayliss, M.; Benoit-Lévy, A.; Bernstein, G. M.; Bertin, E.; Brodwin, M.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Carlstrom, J. E.; Capasso, R.; Capozzi, D.; Carnero Rosell, A.; Carrasco Kind, M.; Chiu, I.; Covarrubias, R.; Crawford, T. M.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; DePoy, D. L.; Desai, S.; de Haan, T.; Diehl, H. T.; Dietrich, J. P.; Doel, P.; Cunha, C. E.; Eifler, T. F.; Evrard, A. E.; Fausti Neto, A.; Fernandez, E.; Flaugher, B.; Fosalba, P.; Frieman, J.; Gangkofner, C.; Gaztanaga, E.; Gerdes, D.; Gruen, D.; Gruendl, R. A.; Gupta, N.; Hennig, C.; Holzapfel, W. L.; Honscheid, K.; Jain, B.; James, D.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lin, H.; Maia, M. A. G.; March, M.; Marshall, J. L.; Martini, Paul; McDonald, M.; Miller, C. J.; Miquel, R.; Nord, B.; Ogando, R.; Plazas, A. A.; Reichardt, C. L.; Romer, A. K.; Roodman, A.; Sako, M.; Sanchez, E.; Schubnell, M.; Sevilla, I.; Smith, R. C.; Stalder, B.; Stark, A. A.; Strazzullo, V.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D.; Vikram, V.; von der Linden, A.; Walker, A. R.; Wechsler, R. H.; Wester, W.; Zenteno, A.; Ziegler, K. E.

    2015-12-01

    We cross-match galaxy cluster candidates selected via their Sunyaev-Zel'dovich effect (SZE) signatures in 129.1 deg2 of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness-mass relation as ⟨ln λ | M500⟩ ∝ B_λ ln M500 + C_λ ln E(z) and use SPT-SZ cluster masses and RM richnesses λ to constrain the parameters. We find B_λ = 1.14^{+0.21}_{-0.18} and C_λ =0.73^{+0.77}_{-0.75}. The associated scatter in mass at fixed richness is σ _{ln M|λ } = 0.18^{+0.08}_{-0.05} at a characteristic richness λ = 70. We demonstrate that our model provides an adequate description of the matched sample, showing that the fraction of SPT-SZ-selected clusters with RM counterparts is consistent with expectations and that the fraction of RM-selected clusters with SPT-SZ counterparts is in mild tension with expectation. We model the optical-SZE cluster positional offset distribution with the sum of two Gaussians, showing that it is consistent with a dominant, centrally peaked population and a subdominant population characterized by larger offsets. We also cross-match the RM catalogue with SPT-SZ candidates below the official catalogue threshold significance ξ = 4.5, using the RM catalogue to provide optical confirmation and redshifts for 15 additional clusters with ξ ∈ [4, 4.5].
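
    The fitted relation is a power law, λ ∝ M500^Bλ E(z)^Cλ. The sketch below evaluates it with the slopes quoted in the abstract; the normalisation (set so that λ = 70 at the pivot), the pivot mass, and the flat-ΛCDM E(z) with Ωm = 0.3 are illustrative placeholders, not values taken from the paper.

```python
import math

def expected_ln_richness(m500, z, A=math.log(70.0), B=1.14, C=0.73,
                         m_pivot=3e14,
                         Ez=lambda z: math.sqrt(0.3 * (1 + z) ** 3 + 0.7)):
    """ln<lambda> = A + B ln(M500/M_pivot) + C ln E(z).

    B and C are the best-fit slopes quoted in the abstract; A, the
    pivot mass, and the cosmology inside E(z) are placeholders.
    """
    return A + B * math.log(m500 / m_pivot) + C * math.log(Ez(z))

# At the pivot mass and z = 0 the model returns the pivot richness.
lam = math.exp(expected_ln_richness(3e14, 0.0))
```

    Doubling the mass multiplies the expected richness by 2^1.14 ≈ 2.2, which is the sense in which the relation is slightly steeper than linear.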

  12. Constraints on the richness-mass relation and the optical-SZE positional offset distribution for SZE-selected clusters

    DOE PAGES

    Saro, A.

    2015-10-12

    In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg2 of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness-mass relation as ⟨ln λ | M500⟩ ∝ B_λ ln M500 + C_λ ln E(z) and use SPT-SZ cluster masses and RM richnesses λ to constrain the parameters. We find B_λ = 1.14^{+0.21}_{-0.18} and C_λ = 0.73^{+0.77}_{-0.75}. The associated scatter in mass at fixed richness is σ_{ln M|λ} = 0.18^{+0.08}_{-0.05} at a characteristic richness λ = 70. We demonstrate that our model provides an adequate description of the matched sample, showing that the fraction of SPT-SZ-selected clusters with RM counterparts is consistent with expectations and that the fraction of RM-selected clusters with SPT-SZ counterparts is in mild tension with expectation. We model the optical-SZE cluster positional offset distribution with the sum of two Gaussians, showing that it is consistent with a dominant, centrally peaked population and a subdominant population characterized by larger offsets. We also cross-match the RM catalogue with SPT-SZ candidates below the official catalogue threshold significance ξ = 4.5, using the RM catalogue to provide optical confirmation and redshifts for 15 additional clusters with ξ ∈ [4, 4.5].

  13. Detection and clustering of features in aerial images by neuron network-based algorithm

    NASA Astrophysics Data System (ADS)

    Vozenilek, Vit

    2015-12-01

    The paper presents an algorithm for the detection and clustering of features in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general feature analysis and its use for clustering and the backward projection of clusters onto the aerial image. The basis of the algorithm is the calculation of the total error of the network and the adjustment of the network weights to minimize that error. A classic bipolar sigmoid was used as the activation function of the neurons, and standard backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, a web application was built (ASP.NET on the Microsoft .NET platform). The main findings include that man-made objects in aerial images can be successfully identified by the detection of shapes and anomalies. It was also found that an appropriate combination of comprehensive features describing the colors and selected shapes of individual areas can be useful for image analysis.
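
    The training loop described (total-error minimisation, bipolar sigmoid activation, plain backpropagation) can be sketched as follows. The layer sizes, learning rate, and data are invented for illustration; the paper's actual network is not reproduced here.

```python
import numpy as np

def bipolar_sigmoid(x):
    return 2.0 / (1.0 + np.exp(-x)) - 1.0        # output range (-1, 1)

def d_bipolar_sigmoid(y):
    # derivative of the bipolar sigmoid, expressed in terms of its output y
    return 0.5 * (1.0 + y) * (1.0 - y)

def train_step(W1, W2, x, target, lr=0.5):
    """One backpropagation step for a tiny one-hidden-layer network;
    returns the total squared error before the update."""
    h = bipolar_sigmoid(W1 @ x)                  # hidden activations
    y = bipolar_sigmoid(W2 @ h)                  # network output
    err = y - target
    delta2 = err * d_bipolar_sigmoid(y)          # output-layer deltas
    delta1 = (W2.T @ delta2) * d_bipolar_sigmoid(h)
    W2 -= lr * np.outer(delta2, h)               # gradient-descent updates
    W1 -= lr * np.outer(delta1, x)
    return 0.5 * float(err @ err)

rng = np.random.default_rng(1)
W1 = rng.normal(scale=0.5, size=(4, 3))          # 3 inputs -> 4 hidden units
W2 = rng.normal(scale=0.5, size=(2, 4))          # 4 hidden -> 2 outputs
x = np.array([0.2, -0.4, 0.7])
t = np.array([0.5, -0.5])
losses = [train_step(W1, W2, x, t) for _ in range(400)]
```

    The derivative identity used, f'(x) = ½(1 + y)(1 − y) with y = f(x), is what makes the bipolar sigmoid cheap to backpropagate through.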

  14. Clustering of tethered satellite system simulation data by an adaptive neuro-fuzzy algorithm

    NASA Technical Reports Server (NTRS)

    Mitra, Sunanda; Pemmaraju, Surya

    1992-01-01

    Recent developments in neuro-fuzzy systems indicate that the concepts of adaptive pattern recognition, when used to identify appropriate control actions corresponding to clusters of patterns representing system states in dynamic nonlinear control systems, may result in innovative designs. A modular, unsupervised neural network architecture, in which fuzzy learning rules have been embedded, is used for on-line identification of similar states. The architecture and control rules involved in Adaptive Fuzzy Leader Clustering (AFLC) allow this system to be incorporated in control systems for identification of system states corresponding to specific control actions. We have used this algorithm to cluster the simulation data of the Tethered Satellite System (TSS) to estimate the range of delta voltages necessary to maintain the desired length rate of the tether. The AFLC algorithm is capable of on-line estimation of the appropriate control voltages from the corresponding length error and length rate error without a priori knowledge of their membership functions and without familiarity with the behavior of the Tethered Satellite System.
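
    The "leader" idea behind AFLC can be shown with a stripped-down, crisp version: an incoming sample joins the nearest existing cluster only if it lies within a vigilance distance of that cluster's centroid; otherwise it seeds a new cluster. This sketch omits AFLC's fuzzy memberships entirely, and the vigilance value and data are invented.

```python
import numpy as np

def leader_cluster(samples, vigilance=1.0):
    """Simplified on-line leader clustering (a crisp stand-in for AFLC).

    Each sample joins the nearest existing cluster if its distance to
    that cluster's centroid is below `vigilance`; otherwise it seeds a
    new cluster. Centroids are updated as running means.
    """
    centers, counts, labels = [], [], []
    for s in samples:
        if centers:
            d = [np.linalg.norm(s - c) for c in centers]
            k = int(np.argmin(d))
        if not centers or d[k] > vigilance:
            centers.append(np.array(s, dtype=float))
            counts.append(1)
            labels.append(len(centers) - 1)      # new cluster leader
        else:
            counts[k] += 1
            centers[k] += (s - centers[k]) / counts[k]   # running mean
            labels.append(k)
    return labels, centers

pts = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
labels, centers = leader_cluster(pts, vigilance=1.0)
```

    Because clusters are created on the fly, the number of system states need not be fixed in advance, which is the property the control application relies on.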

  15. Crowded Cluster Cores. Algorithms for Deblending in Dark Energy Survey Images

    SciTech Connect

    Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; Jeltema, Tesla; Miller, Christopher J.; Rykoff, Eli; Song, Jeeseon

    2015-10-26

    Deep optical images are often crowded with overlapping objects. We found that this is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores in Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. Our paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.

  16. Crowded Cluster Cores. Algorithms for Deblending in Dark Energy Survey Images

    DOE PAGES

    Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; Jeltema, Tesla; Miller, Christopher J.; Rykoff, Eli; Song, Jeeseon

    2015-10-26

    Deep optical images are often crowded with overlapping objects. We found that this is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores inmore » Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. Our paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.« less

  17. Synthetic infrared images and spectral energy distributions of a young low-mass stellar cluster

    NASA Astrophysics Data System (ADS)

    Kurosawa, Ryuichi; Harries, Tim J.; Bate, Matthew R.; Symington, Neil H.

    2004-07-01

    We present three-dimensional Monte Carlo radiative-transfer models of a very young (<10^5 yr old) low-mass (50 M☉) stellar cluster containing 23 stars and 27 brown dwarfs. The models use the density and the stellar mass distributions from the large-scale smoothed particle hydrodynamics (SPH) simulation of the formation of a low-mass stellar cluster by Bate, Bonnell and Bromm. Using adaptive mesh refinement, the SPH density is mapped to the radiative-transfer grid without loss of resolution. The temperature of the ISM and the circumstellar dust is computed using Lucy's Monte Carlo radiative equilibrium algorithm. Based on this temperature, we compute the spectral energy distributions of the whole cluster and the individual objects. We also compute simulated far-infrared Spitzer Space Telescope (SST) images (24-, 70-, and 160-μm bands) and construct colour-colour diagrams (near-infrared HKL and mid-infrared SST bands). The presence of accretion discs around the light sources influences the morphology of the dust temperature structure on a large scale (up to several × 10^4 au). A considerable fraction of the interstellar dust is underheated compared with a model without the accretion discs because the radiation from the light sources is blocked/shadowed by the discs. The spectral energy distribution (SED) of the model cluster with accretion discs shows excess emission at λ= 3-30 μm and λ > 500 μm, compared with that without accretion discs. While the former excess is caused by the warm dust present in the discs, the latter is caused by the presence of the underheated (shadowed) dust. Our model with accretion discs around each object shows a similar distribution of spectral index (2.2-20 μm) values (i.e. Class 0-III sources) to that seen in the ρ Ophiuchus cloud. We confirm that the best diagnostics for identifying objects with accretion discs are mid-infrared (λ= 3-10 μm) colours (e.g. SST IRAC bands) rather than HKL colours.

  18. An improved scheduling algorithm for 3D cluster rendering with platform LSF

    NASA Astrophysics Data System (ADS)

    Xu, Wenli; Zhu, Yi; Zhang, Liping

    2013-10-01

    High-quality photorealistic rendering of 3D models needs powerful computing systems, and on this demand the highly efficient management of cluster resources has developed quickly. This paper addresses how to improve the efficiency of 3D rendering tasks in a cluster. It focuses on a dynamic feedback load balance (DFLB) algorithm, the working principle of the Load Sharing Facility (LSF) platform, and the optimization of an external scheduler plug-in. The algorithm applies to the match and allocation phases of a scheduling cycle: candidate hosts are ranked in the match phase, and the scheduler makes an allocation decision for each job in the allocation phase. With the dynamic feedback mechanism, a new weight is assigned to each candidate host for re-ranking, and the most suitable host is dispatched for rendering. A new plug-in module implementing this algorithm has been designed and integrated into the internal scheduler. Simulation experiments demonstrate that the improved plug-in module is superior to the default one for rendering tasks: it helps avoid load imbalance among servers, increases system throughput, and improves system utilization.
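
    The allocation-phase idea, re-weighting candidate hosts from fresh load feedback and dispatching the job to the most suitable one, can be sketched as below. The metric names and weights are invented for illustration and do not reflect LSF's actual plug-in API.

```python
def pick_host(hosts, w_cpu=0.6, w_mem=0.4):
    """Dynamic-feedback host selection sketch.

    Each candidate host is re-weighted from freshly reported CPU and
    memory utilisation; the least-loaded host receives the render job.
    The field names and weights are hypothetical.
    """
    def load(h):
        return w_cpu * h["cpu_util"] + w_mem * h["mem_util"]
    return min(hosts, key=load)["name"]

hosts = [
    {"name": "render01", "cpu_util": 0.90, "mem_util": 0.70},
    {"name": "render02", "cpu_util": 0.35, "mem_util": 0.50},
    {"name": "render03", "cpu_util": 0.40, "mem_util": 0.80},
]
best = pick_host(hosts)
```

    Recomputing the weights each scheduling cycle is what distinguishes this from static round-robin dispatch: a host that just received a heavy frame stops looking attractive on the next cycle.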

  19. Cluster mass fraction and size distribution determined by fs-time-resolved measurements

    NASA Astrophysics Data System (ADS)

    Gao, Xiaohui; Wang, Xiaoming; Shim, Bonggu; Arefiev, Alexey; Tushentsov, Mikhail; Breizman, Boris; Downer, Mike

    2009-11-01

    Characterization of supersonic gas jets is important for accurate interpretation and control of laser-cluster experiments. While the average cluster size and total atomic density can be found by standard Rayleigh scattering and interferometry, the cluster mass fraction and size distribution are usually difficult to measure. Here we determine the cluster fraction and the size distribution with fs-time-resolved refractive index and absorption measurements in cluster gas jets after ionization and heating by an intense pump pulse. The fs-time-resolved refractive index measured with a frequency-domain interferometer (FDI) shows different contributions from the monomer plasma and the cluster plasma in the time domain, enabling us to determine the cluster fraction. The fs-time-resolved absorption measured by a delayed probe shows the contribution from clusters of various sizes, allowing us to find the size distribution.

  20. Clustering Educational Digital Library Usage Data: A Comparison of Latent Class Analysis and K-Means Algorithms

    ERIC Educational Resources Information Center

    Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei

    2013-01-01

    This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…

  1. Development of a Genetic Algorithm to Automate Clustering of a Dependency Structure Matrix

    NASA Technical Reports Server (NTRS)

    Rogers, James L.; Korte, John J.; Bilardo, Vincent J.

    2006-01-01

    Much technology assessment and organization design data exists in Microsoft Excel spreadsheets. Tools are needed to put this data into a form that can be used by design managers to make design decisions. One need is to cluster data that is highly coupled. Tools such as the Dependency Structure Matrix (DSM) and a Genetic Algorithm (GA) can be of great benefit. However, no tool currently combines the DSM and a GA to solve the clustering problem. This paper describes a new software tool that interfaces a GA written as an Excel macro with a DSM in spreadsheet format. The results of several test cases are included to demonstrate how well this new tool works.
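
    A toy version of the GA-plus-DSM idea follows: each chromosome assigns every DSM element to a cluster, and fitness rewards dependencies kept inside clusters while penalising oversized clusters (without the size penalty, the trivial one-big-cluster solution would win). The penalty weight, population size, and operators are arbitrary choices; this is far simpler than the Excel-integrated tool described.

```python
import random

def ga_cluster_dsm(dsm, k=2, pop=60, gens=80, seed=0):
    """Toy GA for DSM clustering: a chromosome assigns each element to
    one of k clusters."""
    rng = random.Random(seed)
    n = len(dsm)

    def fitness(ch):
        intra = sum(dsm[i][j] for i in range(n) for j in range(n)
                    if i != j and ch[i] == ch[j])
        return intra - 0.2 * sum(ch.count(c) ** 2 for c in range(k))

    population = [[rng.randrange(k) for _ in range(n)] for _ in range(pop)]
    best = max(population, key=fitness)
    for _ in range(gens):
        nxt = [list(best)]                        # elitism: keep best found
        while len(nxt) < pop:
            a, b = rng.sample(population, 2)      # size-2 tournament
            child = list(max(a, b, key=fitness))
            if rng.random() < 0.3:                # point mutation
                child[rng.randrange(n)] = rng.randrange(k)
            nxt.append(child)
        population = nxt
        best = max(population + [best], key=fitness)
    return best

# Two 3-element modules, densely coupled inside, independent of each other.
dsm = [[0, 1, 1, 0, 0, 0],
       [1, 0, 1, 0, 0, 0],
       [1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 1],
       [0, 0, 0, 1, 0, 1],
       [0, 0, 0, 1, 1, 0]]
best = ga_cluster_dsm(dsm)
```

    On this tiny matrix the GA recovers the two modules; the point of the GA is that the same machinery scales to DSMs too large for exhaustive search.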

  2. Communities recognition in the Chesapeake Bay ecosystem by dynamical clustering algorithms based on different oscillators systems

    NASA Astrophysics Data System (ADS)

    Pluchino, A.; Rapisarda, A.; Latora, V.

    2008-10-01

    We have recently introduced [Phys. Rev. E 75, 045102(R) (2007); AIP Conference Proceedings 965, 2007, p. 323] an efficient method for the detection and identification of modules in complex networks, based on the de-synchronization properties (dynamical clustering) of phase oscillators. In this paper we apply the dynamical clustering technique to the identification of communities of marine organisms living in the Chesapeake Bay food web. We show that our algorithm is able to perform a very reliable classification of the real communities existing in this ecosystem by using different kinds of dynamical oscillators. We also compare our results with those of other methods for the detection of community structures in complex networks.
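
    The underlying de-synchronization idea can be illustrated with identical Kuramoto phase oscillators coupled along the edges of a graph: densely connected modules phase-lock internally well before the whole network does, so intermediate-time phases reveal community structure. This is a generic sketch, not the specific oscillator systems used in the paper.

```python
import numpy as np

def kuramoto_phases(adj, steps=400, dt=0.05, K=1.0, seed=0):
    """Euler-integrate identical Kuramoto oscillators coupled along the
    edges of `adj` (a 0/1 adjacency matrix)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, len(adj))
    for _ in range(steps):
        diff = np.sin(theta[None, :] - theta[:, None])   # sin(theta_j - theta_i)
        theta = theta + dt * K * (adj * diff).sum(axis=1)
    return theta

# Two 4-cliques joined by a single bridge edge.
A = np.zeros((8, 8))
A[:4, :4] = 1.0
A[4:, 4:] = 1.0
np.fill_diagonal(A, 0.0)
A[0, 4] = A[4, 0] = 1.0
theta = kuramoto_phases(A)
```

    After integration, each clique's phases are tightly bunched (its local order parameter |⟨e^{iθ}⟩| is close to 1), which is the signal a dynamical-clustering method reads off to identify modules.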

  3. Chirplet Clustering Algorithm for Black Hole Coalescence Signatures in Gravitational Wave Detectors

    NASA Astrophysics Data System (ADS)

    Nemtzow, Zachary; Chassande-Mottin, Eric; Mohapatra, Satyanarayan R. P.; Cadonati, Laura

    2012-03-01

    Within this decade, gravitational waves will become new astrophysical messengers with which we can learn about our universe. Gravitational wave emission from the coalescence of massive bodies is projected to be a promising source for the next generation of gravitational wave detectors: Advanced LIGO and Advanced Virgo. We describe a method for the detection of binary black hole coalescences using a chirplet template bank, Chirplet Omega. By appropriately clustering the linearly frequency-varying sine-Gaussian pixels that the algorithm uses to decompose the data, the signal-to-noise ratio (SNR) of events extended in time can be significantly increased. We present such a clustering method and discuss its impact on the performance and detectability of binary black hole coalescences in ground-based gravitational wave interferometers.

  4. Clusters and cycles in the cosmic ray age distributions of meteorites

    NASA Technical Reports Server (NTRS)

    Woodard, M. F.; Marti, K.

    1985-01-01

    Statistically significant clusters in the cosmic ray exposure age distributions of some groups of iron and stone meteorites were observed, suggesting epochs of enhanced collision and breakups. Fourier analyses of the age distributions of chondrites reveal no significant periods, nor does the same analysis when applied to iron meteorite clusters.

  5. CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection

    NASA Astrophysics Data System (ADS)

    Ao, Sio-Iong

    More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In this chapter, hierarchical clustering algorithms are proposed for efficient tag SNP selection.
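
    The tag-SNP strategy can be sketched with a naive complete-linkage agglomeration over 1 − r² distances: SNPs in high linkage disequilibrium collapse into one cluster, and a single tag is kept per cluster. The r² values below are invented, and real CLUSTAG/WCLUSTAG implementations add constraints this toy omits.

```python
def complete_linkage_clusters(dist, threshold):
    """Naive agglomerative clustering with complete linkage: repeatedly
    merge the two closest clusters until the smallest complete-linkage
    distance exceeds `threshold`."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Toy pairwise LD: distance = 1 - r^2; SNPs 0-1 and 2-3 form tight blocks.
r2 = [[1.0, 0.9, 0.1, 0.1],
      [0.9, 1.0, 0.1, 0.1],
      [0.1, 0.1, 1.0, 0.8],
      [0.1, 0.1, 0.8, 1.0]]
dist = [[1.0 - v for v in row] for row in r2]
blocks = complete_linkage_clusters(dist, threshold=0.5)
tags = [block[0] for block in blocks]   # pick one tag SNP per block
```

    Complete linkage is the natural choice here because it guarantees that every SNP in a cluster is within the chosen LD threshold of its tag, not merely of some cluster member.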

  6. [A cloud detection algorithm for MODIS images combining Kmeans clustering and multi-spectral threshold method].

    PubMed

    Wang, Wei; Song, Wei-Guo; Liu, Shi-Xing; Zhang, Yong-Ming; Zheng, Hong-Yang; Tian, Wei

    2011-04-01

    An improved method for cloud detection that combines Kmeans clustering and a multi-spectral threshold approach is described. On the basis of landmark spectrum analysis, MODIS data are first divided into two major classes by the Kmeans method. The first class includes clouds, smoke and snow, and the second class includes vegetation, water and land. A multi-spectral threshold detection is then applied to the first class to eliminate interference such as smoke and snow. The method was tested with MODIS data acquired at different times over different underlying surfaces. Visual inspection of the results shows that the algorithm can effectively detect small areas of cloud pixels and exclude interference from the underlying surface, which provides a good foundation for subsequent fire detection.
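
    The two-stage scheme, a coarse Kmeans split followed by multi-spectral thresholds that reject snow- and smoke-like interference, can be sketched on synthetic per-pixel features. The band choices and threshold values below are illustrative, not the MODIS tests used in the paper.

```python
import numpy as np

def kmeans_1d(x, k=2, iters=50, seed=0):
    """Plain Kmeans on a 1-D per-pixel feature (e.g. visible brightness)."""
    rng = np.random.default_rng(seed)
    centers = rng.choice(x, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        centers = np.array([x[labels == j].mean() for j in range(k)])
    return labels, centers

def cloud_mask(vis, bt11, vis_thresh=0.4, bt_thresh=270.0):
    """Step 1: Kmeans splits pixels into a bright and a dark class.
    Step 2: multi-spectral thresholds keep only bright AND cold pixels,
    rejecting bright-but-warm interference (thresholds illustrative)."""
    labels, centers = kmeans_1d(vis)
    bright = labels == int(np.argmax(centers))
    return bright & (vis > vis_thresh) & (bt11 < bt_thresh)

vis = np.array([0.10, 0.15, 0.80, 0.90, 0.85])          # visible reflectance
bt11 = np.array([290.0, 295.0, 250.0, 255.0, 285.0])    # 11 um brightness temp (K)
mask = cloud_mask(vis, bt11)
```

    The last pixel is bright but warm, so the thermal test removes it from the cloud class, which is exactly the role the multi-spectral stage plays for snow and smoke.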

  7. Multispectral image classification of MRI data using an empirically-derived clustering algorithm

    SciTech Connect

    Horn, K.M.; Osbourn, G.C.; Bouchard, A.M.; Sanders, J.A.

    1998-08-01

    Multispectral image analysis of magnetic resonance imaging (MRI) data has been performed using an empirically-derived clustering algorithm. This algorithm groups image pixels into distinct classes which exhibit similar response in the T2 1st- and 2nd-echo and T1 (with and without gadolinium) MRI images. The grouping is performed in an n-dimensional mathematical space; the n-dimensional volumes bounding each class define each specific tissue type. The classification results are rendered again in real space by color-coding each grouped class of pixels (associated with differing tissue types). This classification method is especially well suited for class volumes with complex boundary shapes, and is also expected to robustly detect abnormal tissue classes. The classification process is demonstrated using a three-dimensional data set of MRI scans of a human brain tumor.

  8. Meanie3D - a mean-shift based, multivariate, multi-scale clustering and tracking algorithm

    NASA Astrophysics Data System (ADS)

    Simon, Jürgen-Lorenz; Malte, Diederich; Silke, Troemel

    2014-05-01

    Project OASE is one of five working groups at the HErZ (Hans Ertel Centre for Weather Research), an ongoing effort by the German weather service (DWD) to further weather-prediction research at universities. The goal of project OASE is to gain an object-based perspective on convective events by identifying them early in the onset of convective initiation and following them through their entire lifecycle. The ability to follow objects in this fashion requires new ways of object definition and tracking that incorporate all the available data sets of interest, such as satellite imagery, weather radar or lightning counts. The Meanie3D algorithm provides the necessary tool for this purpose. Core features of this new approach to clustering (object identification) and tracking are the ability to identify objects using the mean-shift algorithm applied to a multitude of variables (multivariate), as well as the ability to detect objects on various scales (multi-scale) using elements of scale-space theory. The algorithm works in 2D as well as 3D without modification. It is an extension of a method well known from the fields of computer vision and image processing, tailored to serve the needs of the meteorological community. Although demonstrated here for a specific application (convective initiation), the algorithm is easily adapted to provide clustering and tracking for a wide class of data sets and problems. In this talk, the demonstration is carried out on two of the OASE group's own composite data sets: a 2D nationwide composite of Germany including C-band radar (2D) and satellite information, and a 3D local composite of the Bonn/Jülich area containing a high-resolution 3D X-band radar composite.
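
    Mean-shift, the core of Meanie3D's object identification, moves every sample to the mean of its neighbourhood until samples pile up at density modes; samples sharing a mode form one object. A flat-kernel toy in 2-D follows (Meanie3D itself is multivariate, multi-scale, and far more elaborate; the points and bandwidth are invented).

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, iters=50):
    """Flat-kernel mean-shift: each point repeatedly moves to the mean of
    the original points within `bandwidth`; points that stop at the same
    density mode belong to the same cluster."""
    shifted = points.astype(float).copy()
    for _ in range(iters):
        for i, p in enumerate(shifted):
            within = np.linalg.norm(points - p, axis=1) < bandwidth
            shifted[i] = points[within].mean(axis=0)
    return shifted

pts = np.array([[0.0, 0.0], [0.2, 0.0], [0.1, 0.1],
                [5.0, 5.0], [5.2, 5.0], [5.1, 5.1]])
modes = mean_shift(pts)
```

    Running the same procedure at several bandwidths is one simple way to obtain the multi-scale behaviour the abstract mentions: small bandwidths resolve sub-objects, large ones merge them.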

  9. An Energy-Efficient Spectrum-Aware Reinforcement Learning-Based Clustering Algorithm for Cognitive Radio Sensor Networks.

    PubMed

    Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal

    2015-01-01

    It is well-known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While energy efficiency has been well investigated for conventional wireless sensor networks, it has not been extensively explored for cognitive radio sensor networks. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm, and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment in achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that an energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach. PMID:26287191
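
    In the simplest single-state reading, the cluster-selection MDP reduces to a bandit problem a node can solve with tabular Q-learning: try clusters, observe noisy energy-plus-sensing costs, and converge on the cheapest. Everything below (cluster ids, mean costs, noise level, hyperparameters) is invented for illustration and is much simpler than the paper's formulation.

```python
import random

def q_learn_cluster_choice(costs, episodes=2000, alpha=0.1, eps=0.2, seed=0):
    """Single-state Q-learning sketch: a node samples clusters, observes
    a noisy energy+sensing cost, and learns which cluster is cheapest."""
    rng = random.Random(seed)
    q = {c: 0.0 for c in costs}
    for _ in range(episodes):
        if rng.random() < eps:
            c = rng.choice(list(costs))           # explore
        else:
            c = max(q, key=q.get)                 # exploit current estimate
        reward = -(costs[c] + rng.gauss(0.0, 0.05))
        q[c] += alpha * (reward - q[c])           # running Q update
    return max(q, key=q.get)

best_cluster = q_learn_cluster_choice({"c1": 0.9, "c2": 0.3, "c3": 0.6})
```

    The epsilon-greedy exploration term plays the role of the paper's exploration technique: without it, a node could lock onto the first cluster whose cost estimate happens to look good.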

  10. An Energy-Efficient Spectrum-Aware Reinforcement Learning-Based Clustering Algorithm for Cognitive Radio Sensor Networks

    PubMed Central

    Mustapha, Ibrahim; Ali, Borhanuddin Mohd; Rasid, Mohd Fadlee A.; Sali, Aduwati; Mohamad, Hafizal

    2015-01-01

    It is well-known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While energy efficiency has been well investigated for conventional wireless sensor networks, it has not been extensively explored for cognitive radio sensor networks. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm, and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment in achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that an energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach. PMID:26287191

  11. An Energy-Efficient Spectrum-Aware Reinforcement Learning-Based Clustering Algorithm for Cognitive Radio Sensor Networks.

    PubMed

    Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal

    2015-08-13

    It is well-known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While energy efficiency has been well investigated for conventional wireless sensor networks, it has not been extensively explored for cognitive radio sensor networks. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm, and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment in achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that an energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach.

  12. Using Hierarchical Time Series Clustering Algorithm and Wavelet Classifier for Biometric Voice Classification

    PubMed Central

    Fong, Simon

    2012-01-01

    Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Another application, voice classification, plays an important role in grouping unlabelled voice samples but has not been widely studied. Lately, voice classification has been found useful in phone monitoring and in classifying speakers' gender, ethnicity, emotional state, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms combine hierarchical clustering, the dynamic time warping transform, the discrete wavelet transform, and decision trees. The proposed algorithms are relatively more transparent and interpretable than existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machines, and Hidden Markov Models (which inherently function like black boxes) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm. PMID:22619492
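Dynamic time warping, one ingredient of the pipeline above, aligns two sequences of unequal length before clustering. The toy sequences below (standing in for real voice features) are illustrative; this is the textbook O(nm) recurrence, not the paper's implementation.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    filling an (n+1) x (m+1) table of cumulative alignment costs."""
    n, m = len(a), len(b)
    INF = float("inf")
    d = [[INF] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # insertion
                                 d[i][j - 1],      # deletion
                                 d[i - 1][j - 1])  # match
    return d[n][m]

# A sequence and a time-stretched copy align perfectly under DTW,
# whereas a flat sequence of the same length does not.
x = [0, 1, 2, 3, 2, 1, 0]
x_slow = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
y = [3, 3, 3, 3, 3, 3, 3]
```

A hierarchical clusterer can then use `dtw_distance` directly as its pairwise distance, which is what makes it suitable for voice samples of varying duration.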

  13. Modelling Paleoearthquake Slip Distributions using a Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Lindsay, Anthony; Simão, Nuno; McCloskey, John; Nalbant, Suleyman; Murphy, Shane; Bhloscaidh, Mairead Nic

    2013-04-01

    Along the Sunda trench, the annual growth rings of coral microatolls store long term records of tectonic deformation. Spread over large areas of an active megathrust fault, they offer the possibility of high resolution reconstructions of slip for a number of paleo-earthquakes. These data are complex with spatial and temporal variations in uncertainty. Rather than assuming that any one model will uniquely fit the data, Monte Carlo Slip Estimation (MCSE) modelling produces a catalogue of possible models for each event. From each earthquake's catalogue, a model is selected and a possible history of slip along the fault reconstructed. By generating multiple histories, then finding the average slip during each earthquake, a probabilistic history of slip along the fault can be generated and areas that may have a large slip deficit identified. However, the MCSE technique requires the production of many hundreds of billions of models to yield the few models that fit the observed coral data. In an attempt to accelerate this process, we have designed a Genetic Algorithm (GA). The GA uses evolutionary operators to recombine the information held by a population of possible slip models to produce a set of new models, based on how well they reproduce a set of coral deformation data. Repeated iterations of the algorithm produce populations of improved models, each generation better satisfying the coral data. Preliminary results have shown the GA to be capable of recovering synthetically generated slip distributions, from the displacements they produce in sets of corals, faster than the MCSE technique. The results of systematic testing of the GA technique and its performance on both synthetic and observed coral displacement data will be presented.
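The evolutionary loop sketched in this abstract (selection, crossover, mutation against a misfit score) can be illustrated on a toy inversion. Here the "forward model" mapping slip to coral displacement is simply the identity, the fitness is a negative squared misfit, and all operators and parameters are illustrative assumptions, not the authors' GA.

```python
import random

def fitness(model, observed):
    """Negative squared misfit between predicted and observed displacements
    (identity forward model -- a toy stand-in for the real physics)."""
    return -sum((m - o) ** 2 for m, o in zip(model, observed))

def evolve(observed, pop_size=40, generations=120, seed=1):
    """Elitist GA: keep the best half, refill with one-point crossover
    children, each perturbed by a single Gaussian point mutation."""
    random.seed(seed)
    n = len(observed)
    pop = [[random.uniform(0, 10) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(ind, observed), reverse=True)
        parents = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            pa, pb = random.sample(parents, 2)
            cut = random.randrange(1, n)            # one-point crossover
            child = pa[:cut] + pb[cut:]
            i = random.randrange(n)                 # point mutation
            child[i] += random.gauss(0, 0.3)
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda ind: fitness(ind, observed))

target = [2.0, 5.0, 3.0, 7.0]   # hypothetical "true" slip values
best = evolve(target)
```

Because the elite survive unmutated, the best misfit never worsens between generations, which is the property that lets a GA home in on good slip models far faster than blind Monte Carlo sampling.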

  14. Development of a memetic clustering algorithm for optimal spectral histology: application to FTIR images of normal human colon.

    PubMed

    Farah, Ihsen; Nguyen, Thi Nguyet Que; Groh, Audrey; Guenot, Dominique; Jeannesson, Pierre; Gobinet, Cyril

    2016-05-23

    The coupling between Fourier-transform infrared (FTIR) imaging and unsupervised classification is effective in revealing the different structures of human tissues based on their specific biomolecular IR signatures; thus the spectral histology of the studied samples is achieved. However, the most widely applied clustering methods in spectral histology are local search algorithms that converge to a local optimum depending on initialization; multiple runs of these techniques therefore yield multiple different solutions. Here, we propose a memetic algorithm, based on a genetic algorithm and a k-means clustering refinement, to perform optimal clustering. This approach was applied to FTIR images acquired from normal human colon tissues originating from five patients. The results show the efficiency of the proposed memetic algorithm in achieving the optimal spectral histology of these samples, contrary to k-means. PMID:27110605

  15. Thermal and log-normal distributions of plasma in laser driven Coulomb explosions of deuterium clusters

    NASA Astrophysics Data System (ADS)

    Barbarino, M.; Warrens, M.; Bonasera, A.; Lattuada, D.; Bang, W.; Quevedo, H. J.; Consoli, F.; de Angelis, R.; Andreoli, P.; Kimura, S.; Dyer, G.; Bernstein, A. C.; Hagel, K.; Barbui, M.; Schmidt, K.; Gaul, E.; Donovan, M. E.; Natowitz, J. B.; Ditmire, T.

    2016-08-01

    In this work, we explore the possibility that the motion of the deuterium ions emitted from Coulomb cluster explosions is disordered enough to resemble thermalization. We analyze the process of nuclear fusion reactions driven by laser-cluster interactions in experiments conducted at the Texas Petawatt laser facility using a mixture of D2+3He and CD4+3He cluster targets. When clusters explode by Coulomb repulsion, the emission of the energetic ions is "nearly" isotropic. In the framework of cluster Coulomb explosions, we analyze the energy distributions of the ions using a Maxwell-Boltzmann (MB) distribution, a shifted MB distribution (sMB), and the energy distribution derived from a log-normal (LN) size distribution of clusters. We show that the first two distributions reproduce well the experimentally measured ion energy distributions and the number of fusions from d-d and d-3He reactions. The LN distribution is a good representation of the ion kinetic energy distribution up to the high momenta where noise becomes dominant, but overestimates both the neutron and the proton yields. If the parameters of the LN distributions are chosen to reproduce the fusion yields correctly, the experimentally measured high energy ion spectrum is not well represented. We conclude that the ion kinetic energy distribution is highly disordered and practically indistinguishable from a thermalized one.

  16. Consumers' Kansei Needs Clustering Method for Product Emotional Design Based on Numerical Design Structure Matrix and Genetic Algorithms

    PubMed Central

    Chen, Deng-kai; Gu, Rong; Gu, Yu-feng; Yu, Sui-huai

    2016-01-01

    Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design.

  17. Consumers' Kansei Needs Clustering Method for Product Emotional Design Based on Numerical Design Structure Matrix and Genetic Algorithms

    PubMed Central

    Chen, Deng-kai; Gu, Rong; Gu, Yu-feng; Yu, Sui-huai

    2016-01-01

    Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design. PMID:27630709

  18. Consumers' Kansei Needs Clustering Method for Product Emotional Design Based on Numerical Design Structure Matrix and Genetic Algorithms.

    PubMed

    Yang, Yan-Pu; Chen, Deng-Kai; Gu, Rong; Gu, Yu-Feng; Yu, Sui-Huai

    2016-01-01

    Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design.

  20. Narrow-angle tail radio sources and the distribution of galaxy orbits in Abell clusters

    NASA Technical Reports Server (NTRS)

    O'Dea, Christopher P.; Sarazin, Craig L.; Owen, Frazer N.

    1987-01-01

    The present data on the orientations of the tails with respect to the cluster centers of a sample of 70 narrow-angle-tail (NAT) radio sources in Abell clusters show the distribution of tail angles to be inconsistent with purely radial or circular orbits in all the samples, while being consistent with isotropic orbits in (1) the whole sample, (2) the sample of NATs far from the cluster center, and (3) the samples of morphologically regular Abell clusters. Evidence for very radial orbits is found, however, in the sample of NATs near the cluster center. If these results can be generalized to all cluster galaxies, then the presence of radial orbits near the center of Abell clusters suggests that violent relaxation may not have been fully effective even within the cores of the regular clusters.

  1. KANTS: a stigmergic ant algorithm for cluster analysis and swarm art.

    PubMed

    Fernandes, Carlos M; Mora, Antonio M; Merelo, Juan J; Rosa, Agostinho C

    2014-06-01

    KANTS is a swarm intelligence clustering algorithm inspired by the behavior of social insects. It uses stigmergy as a strategy for clustering large datasets and, as a result, displays a typical behavior of complex systems: self-organization and global patterns emerging from the local interaction of simple units. This paper introduces a simplified version of KANTS and describes recent experiments with the algorithm in the context of a contemporary artistic and scientific trend called swarm art, a type of generative art in which swarm intelligence systems are used to create artwork or ornamental objects. KANTS is used here for generating color drawings from the input data that represent real-world phenomena, such as electroencephalogram sleep data. However, the main proposal of this paper is an art project based on well-known abstract paintings, from which the chromatic values are extracted and used as input. Colors and shapes are therefore reorganized by KANTS, which generates its own interpretation of the original artworks. The project won the 2012 Evolutionary Art, Design, and Creativity Competition.

  2. A contiguity-enhanced k-means clustering algorithm for unsupervised multispectral image segmentation

    SciTech Connect

    Theiler, J.; Gisler, G.

    1997-07-01

    The recent and continuing construction of multi- and hyperspectral imagers will provide detailed data cubes with information in both the spatial and spectral domain. These data show great promise for remote sensing applications ranging from environmental and agricultural to national security interests. The reduction of this voluminous data to useful intermediate forms is necessary both for downlinking all those bits and for interpreting them. Smart onboard hardware is required, as well as sophisticated earth bound processing. A segmented image (in which the multispectral data in each pixel is classified into one of a small number of categories) is one kind of intermediate form which provides some measure of data compression. Traditional image segmentation algorithms treat pixels independently and cluster the pixels according only to their spectral information. This neglects the implicit spatial information that is available in the image. We suggest a simple approach: a variant of the standard k-means algorithm which uses both spatial and spectral properties of the image. The segmented image has the property that pixels which are spatially contiguous are more likely to be in the same class than are random pairs of pixels. This property naturally comes at some cost in terms of the compactness of the clusters in the spectral domain, but we have found that the spatial contiguity and spectral compactness properties are nearly orthogonal, which means that we can make considerable improvements in the one with minimal loss in the other.
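One simple way to mix spatial and spectral information in k-means (a sketch of the general idea, not the authors' exact variant) is to append each pixel's coordinates, scaled by a weight `lam`, to its spectral vector before clustering; `lam = 0` recovers purely spectral k-means, and larger `lam` trades spectral compactness for spatial contiguity.

```python
import numpy as np

def spatial_spectral_features(cube, lam):
    """cube: (rows, cols, bands) image. Returns a (rows*cols, bands+2)
    feature matrix with (row, col) coordinates appended, weighted by lam.
    Larger lam favors spatially contiguous clusters over spectrally
    compact ones."""
    r, c, b = cube.shape
    ii, jj = np.meshgrid(np.arange(r), np.arange(c), indexing="ij")
    coords = np.stack([ii, jj], axis=-1).reshape(-1, 2).astype(float)
    spectra = cube.reshape(-1, b).astype(float)
    return np.hstack([spectra, lam * coords])

# Toy 4x4 two-band image: left half and right half differ spectrally.
cube = np.zeros((4, 4, 2))
cube[:, 2:, :] = 5.0
X0 = spatial_spectral_features(cube, lam=0.0)   # purely spectral features
X1 = spatial_spectral_features(cube, lam=1.0)   # spatial + spectral features
```

Either matrix can be fed to any off-the-shelf k-means; tuning `lam` moves the solution along the contiguity/compactness trade-off the abstract describes as nearly orthogonal.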

  3. Data Clustering

    NASA Astrophysics Data System (ADS)

    Wagstaff, Kiri L.

    2012-03-01

    clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, x_i ∈ R^D. The clustering goal is to produce an organization P of the items in X that optimizes an objective function f : P -> R, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure d : X x X -> R of the distance between two items. A partitioning algorithm produces a set of clusters P = {c1, . . . , ck} such that the clusters are nonoverlapping (c_i intersected with c_j = empty set, i != j) subsets of the data set (Union_i c_i = X). Hierarchical algorithms produce a series of partitions P = {p1, . . . , pn}. For a complete hierarchy, the number of partitions n' = n, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster c_j is represented by a model m_j, such as the cluster center or a Gaussian distribution. The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter.
Choosing among them for a
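The partitioning notion defined above (disjoint clusters whose union is X, chosen to minimize an objective f) can be made concrete with a minimal k-means, which alternates assignment to the nearest center with center re-estimation. This is a textbook sketch with a deterministic farthest-point initialization, not code from any record listed here.

```python
def kmeans(X, k, iters=20):
    """Minimal k-means on a list of D-dimensional points (lists of floats).
    Returns (centers, labels). The objective f is the total squared
    distance of each point to its assigned center; neither step of the
    loop can increase it."""
    # Farthest-point initialization: deterministic and well spread out.
    centers = [list(X[0])]
    while len(centers) < k:
        far = max(X, key=lambda p: min(sum((a - b) ** 2 for a, b in zip(p, c))
                                       for c in centers))
        centers.append(list(far))
    labels = [0] * len(X)
    for _ in range(iters):
        # Assignment step: nearest center under squared Euclidean distance.
        for i, p in enumerate(X):
            labels[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(p, centers[c])))
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [X[i] for i in range(len(X)) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels

# Two well-separated blobs partition cleanly: c_i disjoint, union = X.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [5.0, 5.1], [5.2, 5.0], [5.1, 4.9]]
centers, labels = kmeans(X, 2)
```

The same skeleton underlies several of the records in this listing; hierarchical and model-based methods differ mainly in what they produce (a series of partitions, or a model m_j per cluster) rather than in this basic optimize-an-objective structure.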

  4. The temperature and size distribution of large water clusters from a non-equilibrium model

    SciTech Connect

    Gimelshein, N.; Gimelshein, S.; Pradzynski, C. C.; Zeuch, T.; Buck, U.

    2015-06-28

    A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments.

  5. The temperature and size distribution of large water clusters from a non-equilibrium model.

    PubMed

    Gimelshein, N; Gimelshein, S; Pradzynski, C C; Zeuch, T; Buck, U

    2015-06-28

    A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments. PMID:26133426

  6. The temperature and size distribution of large water clusters from a non-equilibrium model

    NASA Astrophysics Data System (ADS)

    Gimelshein, N.; Gimelshein, S.; Pradzynski, C. C.; Zeuch, T.; Buck, U.

    2015-06-01

    A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments.

  7. Agent-based method for distributed clustering of textual information

    DOEpatents

    Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN

    2010-09-28

    A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches stored documents according to a search query having at least one term, identifies the documents found in the search, and displays them in a clustering display (80) so as to indicate the similarity of the documents to each other.
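The routing decision described in this patent record — compare a new document's vector against each cluster and forward it to the most similar one, or create a new cluster if none is similar enough — reduces to a nearest-centroid rule. The term-frequency vectors, cosine similarity, and threshold below are illustrative assumptions, not the patent's actual vectorization.

```python
import math
from collections import Counter

def doc_vector(text):
    """Term-frequency vector for a document (a stand-in for whatever
    vectorization the agents actually use)."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def route(new_doc, clusters, threshold=0.3):
    """Return the index of the most similar cluster centroid, or None if
    no cluster is similar enough (the multiplexing agent would then
    create a new cluster agent for the document)."""
    v = doc_vector(new_doc)
    scores = [cosine(v, c) for c in clusters]
    best = max(range(len(clusters)), key=lambda i: scores[i])
    return best if scores[best] >= threshold else None

clusters = [doc_vector("stellar cluster galaxy dynamics"),
            doc_vector("wireless sensor network energy")]
dest = route("energy efficient sensor clustering", clusters)   # routed to cluster 1
new = route("quarterly sales revenue report", clusters)        # no match -> None
```

In the distributed setting the score computation happens inside each cluster agent, and only the scalar similarity travels back upstream, which is what keeps the scheme scalable.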

  8. Imaging active faults in a region of distributed deformation from joint focal mechanism and hypocenter clustering: Application to western Iberia

    NASA Astrophysics Data System (ADS)

    Custodio, S.; Lima, V.; Vales, D.; Carrilho, F.; Cesca, S.

    2015-12-01

    Mainland Portugal, on the SW edge of the European continent, is located directly north of the boundary between the Eurasian and Nubian plates. It lies in a region of slow lithospheric deformation, which has generated some of the largest earthquakes in Europe, both intraplate (mainland) and interplate (offshore). The seismicity of mainland Portugal and its adjacent offshore has been repeatedly classified as diffuse. We analyse the instrumental earthquake catalog for western Iberia, enriched with data from recent dense broadband deployments. We show that although the plate boundary south of Portugal is diffuse, in that deformation is accommodated along several distributed faults rather than along one long linear plate boundary, the seismicity itself is not diffuse. Rather, when located using high quality data, earthquakes collapse into well-defined clusters and lineations. We then present a new joint focal mechanism and hypocenter cluster algorithm that is able to extract coherent information between hypocenter locations and focal mechanisms. We apply the method to the Azores-western Mediterranean region, with emphasis on western Iberia. In addition to identifying well-known seismo-tectonic features, the joint clustering algorithm identifies eight new clusters of earthquakes with a good match between the directions of epicentre lineations and focal mechanism fault planes. These clusters may signal single active faults or wider fault zones accommodating a consistent type of faulting. Mainland Portugal is dominated by strike-slip faulting, consistent with the NNE-SSW and WNW-ESE oriented lineations. The region offshore SW Iberia displays clusters that are either predominantly strike-slip or reverse, indicating slip partitioning. 
This work shows that the study of low-magnitude earthquakes using dense seismic deployments is a powerful tool to study lithospheric deformation in slowly deforming regions, where high-magnitude earthquakes occur with long recurrence intervals.

  9. Analysis Clustering of Electricity Usage Profile Using K-Means Algorithm

    NASA Astrophysics Data System (ADS)

    Amri, Yasirli; Lailatul Fadhilah, Amanda; Fatmawati; Setiani, Novi; Rani, Septia

    2016-01-01

    Electricity is one of the most important needs for human life in many sectors. Demand for electricity will increase in line with population and economic growth. Adjustment of the amount of electricity production at a specified time is important because the cost of storing electricity is expensive. To handle this problem, we need knowledge about the electricity usage patterns of clients. These patterns can be obtained by using clustering techniques. In this paper, clustering is used to obtain the similarity of electricity usage patterns over a specified time. We use the K-Means algorithm to cluster a dataset of electricity consumption from 370 clients collected over one year. As a result of this study, we obtained an interesting pattern: one large group of clients consumes the lowest electric load in the spring season, while in another group the lowest electricity consumption occurs in the winter season. From this result, an electricity provider can plan production for a given season based on patterns in electricity usage profiles.

  10. A comparative analysis of clustering algorithms: O2 migration in truncated hemoglobin I from transition networks

    NASA Astrophysics Data System (ADS)

    Cazade, Pierre-André; Zheng, Wenwei; Prada-Gracia, Diego; Berezovska, Ganna; Rao, Francesco; Clementi, Cecilia; Meuwly, Markus

    2015-01-01

    The ligand migration network for O2-diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k-means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. Also, for most of the states the populations coincide quite favourably, whereas the kinetics within and between the states differ. One of the major differences between k-means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site which points to its potential role as a hub in the network. This is also highlighted by the fact that LSDMap cannot identify this state. First-passage-time distributions from MCL clusterings using a one- (ligand-position) and two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand- and protein-motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion, highlighting that the best-performing method for a particular data set may differ depending on the questions at hand.
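The kinetics-based MCL method compared above works on a transition matrix rather than on coordinates. A bare-bones version of the idea (alternating expansion, i.e. matrix squaring that spreads random-walk flow, with inflation, an entrywise power that sharpens strong transitions) is sketched below; the graph, inflation parameter, and label read-off are illustrative, not the paper's setup.

```python
import numpy as np

def mcl(adj, inflation=2.0, iters=30):
    """Minimal Markov Clustering on an undirected adjacency matrix.
    Returns one integer label per node, read off the converged matrix."""
    M = adj.astype(float) + np.eye(len(adj))   # self-loops stabilize MCL
    M /= M.sum(axis=0, keepdims=True)          # make columns stochastic
    for _ in range(iters):
        M = M @ M                              # expansion: spread flow
        M = M ** inflation                     # inflation: sharpen flow
        M /= M.sum(axis=0, keepdims=True)      # renormalize columns
    # Nodes attracted to the same row belong to the same cluster.
    attractor = M.argmax(axis=0)
    _, labels = np.unique(attractor, return_inverse=True)
    return labels

# Two triangles joined by a single weak bridge (edge 2-3).
adj = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    adj[i, j] = adj[j, i] = 1.0
labels = mcl(adj)
```

Because the clustering emerges from where random-walk flow accumulates, MCL groups states that interconvert quickly, which is exactly why its partitions can disagree kinetically with coordinate-based k-means even when the geometric docking sites match.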

  11. MASS DISTRIBUTIONS OF STARS AND CORES IN YOUNG GROUPS AND CLUSTERS

    SciTech Connect

    Michel, Manon; Kirk, Helen; Myers, Philip C. E-mail: hkirk@cfa.harvard.edu

    2011-07-01

    We investigate the relation of the stellar initial mass function and the dense core mass function (CMF), using stellar masses and positions in 14 well-studied young groups. Initial column density maps are computed by replacing each star with a model initial core having the same star formation efficiency (SFE). For each group the SFE, core model, and observational resolution are varied to produce a realistic range of initial maps. A clump-finding algorithm parses each initial map into derived cores, derived core masses, and a derived CMF. The main result is that projected blending of initial cores causes derived cores to be too few and too massive. The number of derived cores is fewer than the number of initial cores by a mean factor of 1.4 in sparse groups and 5 in crowded groups. The mass at the peak of the derived CMF exceeds the mass at the peak of the initial CMF by a mean factor of 1.0 in sparse groups and 12.1 in crowded groups. These results imply that in crowded young groups and clusters, the mass distribution of observed cores may not reliably predict the mass distribution of protostars that will form in those cores.
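The core-blending effect this abstract quantifies is easy to reproduce with a toy clump finder: group above-threshold pixels of a column density map into connected components, and watch two resolved peaks merge into one derived core as the threshold drops. Real clump-finding algorithms also split blended peaks by stepping through contour levels; this flood-fill sketch (an illustrative assumption) does not.

```python
def find_clumps(grid, threshold):
    """Label 4-connected components of above-threshold pixels.
    Returns (number_of_clumps, label_grid with 0 for background)."""
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    clump = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] >= threshold and labels[r][c] == 0:
                clump += 1
                stack = [(r, c)]
                labels[r][c] = clump
                while stack:                      # iterative flood fill
                    i, j = stack.pop()
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ni, nj = i + di, j + dj
                        if (0 <= ni < rows and 0 <= nj < cols
                                and grid[ni][nj] >= threshold
                                and labels[ni][nj] == 0):
                            labels[ni][nj] = clump
                            stack.append((ni, nj))
    return clump, labels

# Two peaks connected by a low-density bridge: resolved at a high
# threshold, blended into a single more massive "core" at a low one.
grid = [[0, 0, 0, 0, 0],
        [0, 9, 0, 9, 0],
        [0, 5, 5, 5, 0],
        [0, 0, 0, 0, 0]]
n_high, _ = find_clumps(grid, 8)   # two cores resolved
n_low, _ = find_clumps(grid, 4)    # blended into one
```

This is the projected-blending mechanism in miniature: the derived cores become fewer and more massive than the initial ones, most strongly in crowded fields.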

  12. Ant colony optimization algorithm for continuous domains based on position distribution model of ant colony foraging.

    PubMed

    Liu, Liqiang; Dai, Yuntao; Gao, Jinyu

    2014-01-01

    Ant colony optimization for continuous domains is a major research direction for the ant colony optimization algorithm. In this paper, we propose a distribution model of ant colony foraging, through analysis of the relationship between the position distribution and the food source in the process of ant colony foraging. We design a continuous domain optimization algorithm based on the model and give the form of solution for the algorithm, the distribution model of pheromone, the update rules of ant colony position, and the processing method of constraint conditions. Algorithm performance was evaluated on a set of unconstrained optimization test functions and a set of constrained optimization test functions, and the test results are compared with those of other algorithms to verify the correctness and effectiveness of the proposed algorithm. PMID:24955402
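Continuous-domain ACO methods generally replace discrete pheromone trails with a probability distribution over positions. The sketch below follows the well-known ACO_R style: keep an archive of good solutions, treat it as the "pheromone" distribution, and sample new ants from Gaussians centered on archive members. The parameters and the objective are illustrative assumptions, not the position distribution model proposed in this paper.

```python
import random

def aco_continuous(f, bounds, ants=20, archive=10, iters=100, q=0.2, seed=3):
    """ACO_R-style continuous optimizer: minimize f over box bounds."""
    rng = random.Random(seed)

    def clip(x, lo, hi):
        return min(max(x, lo), hi)

    # Solution archive plays the role of the pheromone distribution.
    sols = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(archive)]
    sols.sort(key=f)
    for _ in range(iters):
        new = []
        for _ in range(ants):
            # Bias guide selection toward better-ranked archive members.
            guide = sols[min(int(abs(rng.gauss(0, q * archive))), archive - 1)]
            x = []
            for d, (lo, hi) in enumerate(bounds):
                # Sampling width from the archive's spread in dimension d.
                sigma = sum(abs(s[d] - guide[d]) for s in sols) / (archive - 1)
                x.append(clip(rng.gauss(guide[d], sigma + 1e-12), lo, hi))
            new.append(x)
        sols = sorted(sols + new, key=f)[:archive]   # keep the best
    return sols[0]

best = aco_continuous(lambda x: (x[0] - 1) ** 2 + (x[1] + 2) ** 2,
                      bounds=[(-5, 5), (-5, 5)])
```

As the archive concentrates, the sampling width shrinks automatically, so the colony transitions from exploration to exploitation without an explicit schedule.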

  13. Ant Colony Optimization Algorithm for Continuous Domains Based on Position Distribution Model of Ant Colony Foraging

    PubMed Central

    Liu, Liqiang; Dai, Yuntao

    2014-01-01

    Ant colony optimization for continuous domains is a major research direction for the ant colony optimization algorithm. In this paper, we propose a distribution model of ant colony foraging, through analysis of the relationship between the position distribution and the food source in the process of ant colony foraging. We design a continuous domain optimization algorithm based on the model and give the form of solution for the algorithm, the distribution model of pheromone, the update rules of ant colony position, and the processing method of constraint conditions. Algorithm performance was evaluated on a set of unconstrained optimization test functions and a set of constrained optimization test functions, and the test results are compared with those of other algorithms to verify the correctness and effectiveness of the proposed algorithm. PMID:24955402

  14. Nature and distribution of dark matter. II - Binaries, groups, and clusters

    NASA Astrophysics Data System (ADS)

    Vasanthi, M. M.; Padmanabhan, T.

    1989-12-01

    The mass-radius relationship for aggregates of galaxies (binaries, small groups, and clusters) is studied using a simple best-fit analysis. It is found that the binaries are just two galaxies, each with an individual isothermal dark matter halo, moving under mutual gravitational attraction. Power-law relations to describe the groups and clusters are examined, suggesting that two components exist in dark matter: one clustered around the galaxies, and one distributed smoothly.

  15. On Distribution Reduction and Algorithm Implementation in Inconsistent Ordered Information Systems

    PubMed Central

    Zhang, Yanqin

    2014-01-01

    As one part of our work on ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties of distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in dominance-relation-based information systems. A matrix algorithm for distribution reduction acquisition is presented step by step, and a program implementing the algorithm is provided. The approach provides an effective tool for theoretical research on, and practical applications of, ordered information systems. For more detailed and valid illustration, cases are employed to explain and verify the algorithm and the program, which shows the effectiveness of the algorithm in complicated information systems. PMID:25258721

  16. A New Database of Globular Clusters Parameters: Distributions of Cluster Properties and Correlations Between Them

    NASA Astrophysics Data System (ADS)

    Djorgovski, S.; Meylan, G.

    1993-05-01

    The forthcoming ASPCS volume, ``Structure and Dynamics of Globular Clusters'' (expected publication date: early summer of 1993) will contain a set of appendices with data resources on Galactic globular clusters; the authors of these papers include I.R. King, S. Peterson, C.T. Pryor, S.C. Trager, and ourselves. From these papers we have compiled a data base of various observed and derived parameters for globular clusters (143 of them at last count). Our main purpose is to use these data for correlative studies of globular cluster properties. Others may find it useful for similar purposes, for planning and support of observations, for testing of theoretical models, etc. We will describe the data base, and present some simple analysis of the cluster properties and correlations among them. The data will be made available to the community in a computer form, as ASCII files. Interested users should send an email message to the Internet address: george @ deimos.caltech.edu, and may also find the above mentioned ASPCS volume useful in their work. We thank our colleagues who contributed data for this compilation for their efforts. S.D. acknowledges a partial support from the NASA contract NAS5-31348, and the NSF PYI award AST-9157412.

  17. Spatial and kinematic distributions of transition populations in intermediate redshift galaxy clusters

    SciTech Connect

    Crawford, Steven M.; Wirth, Gregory D.; Bershady, Matthew A. E-mail: wirth@keck.hawaii.edu

    2014-05-01

    We analyze the spatial and velocity distributions of confirmed members in five massive clusters of galaxies at intermediate redshift (0.5 < z < 0.9) to investigate the physical processes driving galaxy evolution. Based on spectral classifications derived from broad- and narrow-band photometry, we define four distinct galaxy populations representing different evolutionary stages: red sequence (RS) galaxies, blue cloud (BC) galaxies, green valley (GV) galaxies, and luminous compact blue galaxies (LCBGs). For each galaxy class, we derive the projected spatial and velocity distribution and characterize the degree of subclustering. We find that RS, BC, and GV galaxies in these clusters have similar velocity distributions, but that BC and GV galaxies tend to avoid the core of the two z ≈ 0.55 clusters. GV galaxies exhibit subclustering properties similar to RS galaxies, but their radial velocity distribution is significantly platykurtic compared to the RS galaxies. The absence of GV galaxies in the cluster cores may explain their somewhat prolonged star-formation history. The LCBGs appear to have recently fallen into the cluster based on their larger velocity dispersion, absence from the cores of the clusters, and different radial velocity distribution than the RS galaxies. Both LCBG and BC galaxies show a high degree of subclustering on the smallest scales, leading us to conclude that star formation is likely triggered by galaxy-galaxy interactions during infall into the cluster.

  18. Hybrid clustering based fuzzy structure for vibration control - Part 1: A novel algorithm for building neuro-fuzzy system

    NASA Astrophysics Data System (ADS)

    Nguyen, Sy Dzung; Nguyen, Quoc Hung; Choi, Seung-Bok

    2015-01-01

    This paper presents a new algorithm, called B-ANFIS, for building an adaptive neuro-fuzzy inference system (ANFIS) from a training data set. In order to increase the accuracy of the model, the following steps are executed. Firstly, a data merging rule is proposed to build and perform a data-clustering strategy. Subsequently, a combination of clustering processes in the input data space and in the joint input-output data space is presented. The crucial reason for this task is to overcome problems related to initialization and contradictory fuzzy rules, which usually arise when building an ANFIS. The clustering process in the input data space is accomplished based on a proposed merging-possibilistic clustering (MPC) algorithm. The effectiveness of this process is evaluated before the clustering process is resumed in the joint input-output data space. The optimal parameters obtained after completion of the clustering process are used to build the ANFIS. Simulations based on a numerical data set, 'Daily Data of Stock A', and on measured data sets from a smart damper are performed to analyze and estimate accuracy. In addition, the convergence and robustness of the proposed algorithm are investigated based on both theoretical and testing approaches.

  19. Survivable algorithms and redundancy management in NASA's distributed computing systems

    NASA Technical Reports Server (NTRS)

    Malek, Miroslaw

    1992-01-01

    The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.

  20. Theoretical study of small sodium-potassium alloy clusters through genetic algorithm and quantum chemical calculations.

    PubMed

    Silva, Mateus X; Galvão, Breno R L; Belchior, Jadson C

    2014-05-21

    A genetic algorithm is employed to survey an empirical potential energy surface for small Na(x)K(y) clusters with x + y ≤ 15, providing initial conditions for electronic structure methods. The minima of this empirical potential are assessed and corrected using high-level ab initio methods such as CCSD(T), CR-CCSD(T)-L and MP2, and benchmark results are obtained for specific cases. These are the first calculations for such small alloy clusters and may serve as a reference for further studies. The validity and choice of a proper functional and basis set for DFT calculations are then explored using the benchmark data, where it was found that the usual DFT approach may fail to provide the correct qualitative result for specific systems. The best general agreement with the benchmark calculations is achieved with the def2-TZVPP basis set and the SVWN5 functional, although the LANL2DZ basis set (with effective core potential) and the SVWN5 functional provided the most cost-effective results. PMID:24691391

  1. Fuzzy ruling between core porosity and petrophysical logs: Subtractive clustering vs. genetic algorithm-pattern search

    NASA Astrophysics Data System (ADS)

    Bagheripour, Parisa; Asoodeh, Mojtaba

    2013-12-01

    Porosity, the void portion of reservoir rocks, determines the volume of hydrocarbon accumulation and has a great influence on the assessment and development of hydrocarbon reservoirs. Accurate determination of porosity from core analysis is highly cost-, time-, and labor-intensive, so finding an accurate, fast, and cheap way of determining porosity is essential. On the other hand, conventional well log data, available in almost all wells, contain invaluable implicit information about porosity, which an intelligent system can make explicit. Fuzzy logic is a powerful tool for handling geoscience problems associated with uncertainty. However, determining the best fuzzy formulation is still an issue. This study proposes an improved strategy, the hybrid genetic algorithm-pattern search (GA-PS) technique, against the widely used subtractive clustering (SC) method for setting up fuzzy rules between core porosity and petrophysical logs. The hybrid GA-PS technique is capable of extracting optimal parameters for fuzzy clusters (membership functions), which consequently yields the best fuzzy formulation. Results indicate that the GA-PS technique manipulates both the mean and the variance of the Gaussian membership functions, whereas SC controls only the mean. A comparison between the hybrid GA-PS technique and the SC method confirmed the superiority of GA-PS in setting up fuzzy rules. The proposed strategy was successfully applied to one of the Iranian carbonate reservoir rocks.

  2. A Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.

    PubMed

    He, Sheng; Samara, Petros; Burgers, Jan; Schomaker, Lambert

    2016-11-01

    It is of essential importance for historians to know the date and place of origin of the documents they study. It would be a huge advancement for historical scholars if it were possible to automatically estimate the geographical and temporal provenance of a handwritten document by inferring them from its handwriting style. We propose a multiple-label guided clustering algorithm to discover the correlations between the concrete low-level visual elements in historical documents and abstract labels, such as date and location. First, a novel descriptor, called the histogram of orientations of handwritten strokes, is proposed to extract and describe the visual elements; it is built on a scale-invariant polar-feature space. In addition, the multi-label self-organizing map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. The proposed MLSOM can be used to predict the labels directly. Moreover, the MLSOM can also be considered a pre-structured clustering method for building a codebook, which contains more discriminative information on date and geography. The experimental results on the medieval paleographic scale data set demonstrate that our method achieves state-of-the-art results. PMID:27576248
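The descriptor above is built from stroke orientations; a heavily simplified gradient-orientation histogram, without the paper's scale-invariant polar-feature space, might look like the following sketch:

```python
import math

def orientation_histogram(img, bins=8):
    """Sketch of a gradient-orientation histogram over a grayscale image
    (nested lists of intensities). The actual 'histogram of orientations of
    handwritten strokes' descriptor is considerably more elaborate."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]   # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            ang = math.atan2(gy, gx) % (2 * math.pi)
            hist[min(bins - 1, int(ang / (2 * math.pi) * bins))] += mag
    total = sum(hist)
    return [v / total for v in hist] if total else hist

# usage: a vertical edge produces purely horizontal gradients (bin 0)
img = [[0, 0, 1, 1]] * 4
hist = orientation_histogram(img)
```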

  3. Correlated Percolation, Fractal Structures, and Scale-Invariant Distribution of Clusters in Natural Images

    PubMed Central

    Saremi, Saeed; Sejnowski, Terrence J.

    2016-01-01

    Natural images are scale invariant with structures at all length scales. We formulated a geometric view of scale invariance in natural images using percolation theory, which describes the behavior of connected clusters on graphs. We map images to the percolation model by defining clusters on a binary representation for images. We show that critical percolating structures emerge in natural images and study their scaling properties by identifying fractal dimensions and exponents for the scale-invariant distributions of clusters. This formulation leads to a method for identifying clusters in images from underlying structures as a starting point for image segmentation. PMID:26415153
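The basic measurement underlying a cluster-size distribution, identifying connected clusters of pixels in a binarized image, can be sketched with a flood fill; the choice of 4-connectivity here is an assumption for illustration:

```python
from collections import deque

def cluster_sizes(binary):
    """Label 4-connected clusters of 1-pixels in a binary image (nested lists
    of 0/1) and return their sizes, sorted; histogramming these sizes gives
    the cluster-size distribution studied in the percolation picture."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    sizes = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] and not seen[sy][sx]:
                seen[sy][sx] = True
                size, queue = 0, deque([(sy, sx)])
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                sizes.append(size)
    return sorted(sizes)

# usage: two clusters, of sizes 1 and 3
img = [[1, 1, 0],
       [1, 0, 0],
       [0, 0, 1]]
sizes = cluster_sizes(img)
```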

  4. Dynamical quorum sensing and clustering dynamics in a population of spatially distributed active rotators

    NASA Astrophysics Data System (ADS)

    Sakaguchi, Hidetsugu; Maeyama, Satomi

    2013-02-01

    A model of clustering dynamics is proposed for a population of spatially distributed active rotators. A transition from excitable to oscillatory dynamics is induced by the increase of the local density of active rotators. It is interpreted as dynamical quorum sensing. In the oscillation regime, phase waves propagate without decay, which generates an effectively long-range interaction in the clustering dynamics. The clustering process becomes facilitated and only one dominant cluster appears rapidly as a result of the dynamical quorum sensing. An exact localized solution is found to a simplified model equation, and the competitive dynamics between two localized states is studied numerically.

  5. Discovery of a widely distributed toxin biosynthetic gene cluster

    PubMed Central

    Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.

    2008-01-01

    Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757

  6. Anticipation versus adaptation in Evolutionary Algorithms: The case of Non-Stationary Clustering

    NASA Astrophysics Data System (ADS)

    González, A. I.; Graña, M.; D'Anjou, A.; Torrealdea, F. J.

    1998-07-01

    From the technological point of view, it is usually more important to ensure the ability to react promptly to changing environmental conditions than to try to forecast them. Evolutionary Algorithms were proposed initially to drive the adaptation of complex systems to varying or uncertain environments. In the general setting, the adaptive-anticipatory dilemma reduces to the placement of the interaction with the environment in the computational schema. Adaptation consists of estimating the proper parameters from present data in order to react to a present environment situation. Anticipation consists of estimating them from present data in order to react to a future environment situation. This duality is expressed in the Evolutionary Computation paradigm by the precise location at which present data enter the computation of the individuals' fitness function. In this paper we consider several instances of Evolutionary Algorithms applied to a precise problem and perform an experiment that tests their response as anticipative and adaptive mechanisms. The non-stationary problem considered is that of Non-Stationary Clustering, more precisely the adaptive Color Quantization of image sequences. The experiment illustrates our ideas and gives some quantitative results that may support the proposal of the Evolutionary Computation paradigm for other tasks that require interaction with a non-stationary environment.

  7. Monte Carlo cluster algorithm for fluid phase transitions in highly size-asymmetrical binary mixtures.

    PubMed

    Ashton, Douglas J; Liu, Jiwen; Luijten, Erik; Wilding, Nigel B

    2010-11-21

    Highly size-asymmetrical fluid mixtures arise in a variety of physical contexts, notably in suspensions of colloidal particles to which much smaller particles have been added in the form of polymers or nanoparticles. Conventional schemes for simulating models of such systems are hamstrung by the difficulty of relaxing the large species in the presence of the small one. Here we describe how the rejection-free geometrical cluster algorithm of Liu and Luijten [J. Liu and E. Luijten, Phys. Rev. Lett. 92, 035504 (2004)] can be embedded within a restricted Gibbs ensemble to facilitate efficient and accurate studies of fluid phase behavior of highly size-asymmetrical mixtures. After providing a detailed description of the algorithm, we summarize the bespoke analysis techniques of [Ashton et al., J. Chem. Phys. 132, 074111 (2010)] that permit accurate estimates of coexisting densities and critical-point parameters. We apply our methods to study the liquid-vapor phase diagram of a particular mixture of Lennard-Jones particles having a 10:1 size ratio. As the reservoir volume fraction of small particles is increased in the range of 0%-5%, the critical temperature decreases by approximately 50%, while the critical density drops by some 30%. These trends imply that in our system, adding small particles decreases the net attraction between large particles, a situation that contrasts with hard-sphere mixtures where an attractive depletion force occurs.

  8. Study of cluster reconstruction and track fitting algorithms for CGEM-IT at BESIII

    NASA Astrophysics Data System (ADS)

    Guo, Yue; Wang, Liang-Liang; Ju, Xu-Dong; Wu, Ling-Hui; Xiu, Qing-Lei; Wang, Hai-Xia; Dong, Ming-Yi; Hu, Jing-Ran; Li, Wei-Dong; Li, Wei-Guo; Liu, Huai-Min; Qun, Ou-Yang; Shen, Xiao-Yan; Yuan, Ye; Zhang, Yao

    2016-01-01

    Considering the effects of aging on the existing Inner Drift Chamber (IDC) of BESIII, a GEM-based inner tracker, the Cylindrical-GEM Inner Tracker (CGEM-IT), is proposed to be designed and constructed as an upgrade candidate for the IDC. This paper introduces a full simulation package for the CGEM-IT with a simplified digitization model, and describes the development of software for cluster reconstruction and track fitting, using a track fitting algorithm based on the Kalman filter method. Preliminary results for the reconstruction algorithms which are obtained using a Monte Carlo sample of single muon events in the CGEM-IT, show that the CGEM-IT has comparable momentum resolution and transverse vertex resolution to the IDC, and a better z-direction resolution than the IDC. Supported by National Key Basic Research Program of China (2015CB856700), National Natural Science Foundation of China (11205184, 11205182) and Joint Funds of National Natural Science Foundation of China (U1232201)

  9. The Age, Mass, and Size Distributions of Star Clusters in M51

    NASA Astrophysics Data System (ADS)

    Chandar, Rupali; Whitmore, Bradley C.; Dinino, Daiana; Kennicutt, Robert C.; Chien, L.-H.; Schinnerer, Eva; Meidt, Sharon

    2016-06-01

    We present a new catalog of 3816 compact star clusters in the grand design spiral galaxy M51 based on observations taken with the Hubble Space Telescope. The age distribution of the clusters declines starting at very young ages, and can be represented by a power law, dN/dτ ∝ τ^γ, with γ = -0.65 ± 0.15. No significant change in the shape of the age distribution at different masses is observed. The mass function of the clusters younger than τ ≈ 400 Myr can also be described by a power law, dN/dM ∝ M^β, with β ≈ -2.1 ± 0.2. We compare these distributions with the predictions from various cluster disruption models, and find that they are consistent with models where clusters disrupt approximately independently of their initial mass, but not with models where lower-mass clusters are disrupted earlier than their higher-mass counterparts. We find that the half-light radii of clusters more massive than M ≈ 3 × 10^4 M⊙ and with ages between 100 and 400 Myr are larger by a factor of ≈3-4 than their counterparts younger than 10^7 years, suggesting that the clusters physically expand during their early life.
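A power-law index such as γ above is often estimated, to first order, as the slope of a log-log fit; here is a minimal sketch, ignoring the binning and completeness corrections a real analysis requires:

```python
import math

def loglog_slope(x, y):
    """Least-squares slope of log y versus log x: a quick estimate of a
    power-law index, e.g. gamma in dN/dtau ~ tau**gamma."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    return (sum((a - mx) * (b - my) for a, b in zip(lx, ly)) /
            sum((a - mx) ** 2 for a in lx))

# usage: data drawn exactly from dN/dtau = tau**(-0.65) recovers the index
taus = [1e6, 1e7, 1e8, 1e9]
dndt = [t ** -0.65 for t in taus]
gamma = loglog_slope(taus, dndt)
```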

  10. A uniform metal distribution in the intergalactic medium of the Perseus cluster of galaxies.

    PubMed

    Werner, Norbert; Urban, Ondrej; Simionescu, Aurora; Allen, Steven W

    2013-10-31

    Most of the metals (elements heavier than helium) produced by stars in the member galaxies of clusters currently reside within the hot, X-ray-emitting intra-cluster gas. Observations of X-ray line emission from this intergalactic medium have suggested a relatively small cluster-to-cluster scatter outside the cluster centres and enrichment with iron out to large radii, leading to the idea that the metal enrichment occurred early in the history of the Universe. Models with early enrichment predict a uniform metal distribution at large radii in clusters, whereas those with late-time enrichment are expected to introduce significant spatial variations of the metallicity. To discriminate clearly between these competing models, it is essential to test for potential inhomogeneities by measuring the abundances out to large radii along multiple directions in clusters, which has not hitherto been done. Here we report a remarkably uniform iron abundance, as a function of radius and azimuth, that is statistically consistent with a constant value of ZFe = 0.306 ± 0.012 in solar units out to the edge of the nearby Perseus cluster. This homogeneous distribution requires that most of the metal enrichment of the intergalactic medium occurred before the cluster formed, probably more than ten billion years ago, during the period of maximal star formation and black hole activity.

  11. The mass distribution of the strong lensing cluster SDSS J1531+3414

    SciTech Connect

    Sharon, Keren; Johnson, Traci L.; Gladders, Michael D.; Rigby, Jane R.; Wuyts, Eva; Bayliss, Matthew B.; Florian, Michael K.; Dahle, Håkon

    2014-11-01

    We present the mass distribution at the core of SDSS J1531+3414, a strong-lensing cluster at z = 0.335. We find that the mass distribution is well described by two cluster-scale halos with a contribution from cluster-member galaxies. New Hubble Space Telescope observations of SDSS J1531+3414 reveal a signature of ongoing star formation associated with the two central galaxies at the core of the cluster, in the form of a chain of star forming regions at the center of the cluster. Using the lens model presented here, we place upper limits on the contribution of a possible lensed image to the flux at the central region, and rule out that this emission is coming from a background source.

  12. Cluster Mass Distribution of the Hubble Frontier Fields - What have we learned?

    NASA Astrophysics Data System (ADS)

    Jean-Paul, Kneib

    2016-07-01

    The Hubble Frontier Fields have provided the deepest imaging of six of the most massive clusters in the Universe. Using strong- and weak-lensing techniques, we have investigated the mass models of these clusters with record-high precision. First we identify the multiple images, which are then confronted with an evolving model to best match the strong-lensing observable constraints. We then include weak lensing and flexion to investigate the mass distribution in the outer region. By investigating the accuracy of the model we show that we can constrain the small-scale mass distribution, thus probing the relation between a cluster galaxy's stellar mass and its dark matter halo. On larger scales, combining weak lensing with X-ray measurements lets us probe the assembly scenario of these clusters, confirming that massive clusters lie at the crossroads of filamentary structures.

  13. CHARACTERIZATION OF INTRACLUSTER MEDIUM TEMPERATURE DISTRIBUTIONS OF 62 GALAXY CLUSTERS WITH XMM-NEWTON

    SciTech Connect

    Frank, K. A.; Peterson, J. R.; Andersson, K.; Fabian, A. C.; Sanders, J. S.

    2013-02-10

    We measure the intracluster medium (ICM) temperature distributions for 62 galaxy clusters in the HIFLUGCS, an X-ray flux-limited sample, with available X-ray data from XMM-Newton. We search for correlations between the width of the temperature distributions and other cluster properties, including median cluster temperature, luminosity, size, presence of a cool core, active galactic nucleus (AGN) activity, and dynamical state. We use a Markov Chain Monte Carlo analysis, which models the ICM as a collection of X-ray emitting smoothed particles of plasma. Each smoothed particle is given its own set of parameters, including temperature, spatial position, redshift, size, and emission measure. This allows us to measure the width of the temperature distribution, median temperature, and total emission measure of each cluster. We find that none of the clusters have a temperature width consistent with isothermality. Counterintuitively, we also find that the temperature distribution widths of disturbed, non-cool-core, and AGN-free clusters tend to be wider than in other clusters. A linear fit to σ_kT versus kT_med finds σ_kT ≈ 0.20 kT_med + 1.08, with an estimated intrinsic scatter of ≈0.55 keV, demonstrating a large range in ICM thermal histories.

  14. New algorithms for the Vavilov distribution calculation and the corresponding energy loss sampling

    SciTech Connect

    Chibani, O.

    1998-10-01

    Two new algorithms for the fast calculation of the Vavilov distribution within the interval 0.01 ≤ κ ≤ 10, where neither the Gaussian approximation nor the Landau distribution may be used, are presented. These algorithms are particularly convenient for the sampling of the corresponding random energy loss. A comparison with the exact Vavilov distribution for the case of protons traversing Al slabs is given.

  15. Coulomb explosion in dicationic noble gas clusters: a genetic algorithm-based approach to critical size estimation for the suppression of Coulomb explosion and prediction of dissociation channels.

    PubMed

    Nandy, Subhajit; Chaudhury, Pinaki; Bhattacharyya, S P

    2010-06-21

    We present a genetic algorithm-based investigation of structural fragmentation in dicationic noble gas clusters, Ar(n)(+2), Kr(n)(+2), and Xe(n)(+2), where n denotes the size of the cluster. Dications are predicted to be stable above a threshold size of the cluster when positive charges are assumed to remain localized on two noble gas atoms and the Lennard-Jones potential, along with bare Coulomb and ion-induced dipole interactions, is taken into account for describing the potential energy surface. Our cutoff values are close to those obtained experimentally [P. Scheier and T. D. Mark, J. Chem. Phys. 11, 3056 (1987)] and theoretically [J. G. Gay and B. J. Berne, Phys. Rev. Lett. 49, 194 (1982)]. When the charges are allowed to be equally distributed over four noble gas atoms in the cluster and the nonpolarization interaction terms are allowed to remain unchanged, our method successfully identifies the size threshold for stability as well as the nature of the channels of dissociation as a function of cluster size. In Ar(n)(2+), for example, fissionlike fragmentation is predicted for n = 55, while for n = 43 the predicted outcome is nonfission fragmentation, in complete agreement with earlier work [Golberg et al., J. Chem. Phys. 100, 8277 (1994)]. PMID:20572686
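The energy model described above (Lennard-Jones plus charge interactions) can be sketched for a fixed configuration as follows; the ion-induced dipole term is omitted and all constants are in reduced units, so this is illustrative only, not the paper's potential:

```python
def cluster_energy(positions, charges, eps=1.0, sigma=1.0, ke=1.0):
    """Pairwise energy sketch for a charged rare-gas cluster: Lennard-Jones
    between all pairs plus a bare Coulomb term between charged sites.
    (The paper additionally includes ion-induced dipole interactions.)"""
    e = 0.0
    n = len(positions)
    for i in range(n):
        for j in range(i + 1, n):
            r = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j])) ** 0.5
            sr6 = (sigma / r) ** 6
            e += 4 * eps * (sr6 * sr6 - sr6)          # Lennard-Jones term
            e += ke * charges[i] * charges[j] / r     # bare Coulomb term
    return e

# usage: two like charges at the neutral LJ minimum distance; the Coulomb
# repulsion raises the pair energy above the neutral minimum of -eps
r_min = 2 ** (1 / 6)
e = cluster_energy([(0, 0, 0), (r_min, 0, 0)], charges=[1, 1])
```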

  17. Constraints on the richness-mass relation and the optical-SZE positional offset distribution for SZE-selected clusters

    SciTech Connect

    Saro, A.

    2015-10-12

    In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg2 of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation as ⟨ln λ | M500⟩ ∝ B_λ ln M500 + C_λ ln E(z) and use SPT-SZ cluster masses and RM richnesses λ to constrain the parameters. We find B_λ = 1.14 (+0.21, -0.18) and C_λ = 0.73 (+0.77, -0.75). The associated scatter in mass at fixed richness is σ_lnM|λ = 0.18 (+0.08, -0.05) at a characteristic richness λ = 70. We demonstrate that our model provides an adequate description of the matched sample, showing that the fraction of SPT-SZ-selected clusters with RM counterparts is consistent with expectations and that the fraction of RM-selected clusters with SPT-SZ counterparts is in mild tension with expectation. We model the optical-SZE cluster positional offset distribution with the sum of two Gaussians, showing that it is consistent with a dominant, centrally peaked population and a subdominant population characterized by larger offsets. We also cross-match the RM catalogue with SPT-SZ candidates below the official catalogue threshold significance ξ = 4.5, using the RM catalogue to provide optical confirmation and redshifts for 15 additional clusters with ξ ∈ [4, 4.5].

  18. New Statistical Perspective to Link Void Distributions with Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Russell, Esra; Pycke, Jean-Renaud

    2016-07-01

    Voids dominate the total observed volume of the large-scale structure and are very sensitive to their environments, which can strongly affect their shapes as well as their distributions. Therefore the void size distribution function may play an important role in understanding the dynamical processes affecting the structure formation of the Universe. Here, using the cosmic void data sets of Sutter et al. (2012), generated from galaxy mock catalogs tuned to three SDSS main samples, we obtain the size distribution of voids as a three-parameter, redshift-independent log-normal void probability function. We find that the shape of the three-parameter void distribution from the mock data samples is strikingly similar to the galaxy log-normal mass distribution obtained from numerical studies. This similarity of the void size and galaxy mass distributions may indicate evidence of large-scale nonlinear mechanisms affecting both voids and galaxies, such as large-scale accretion and tidal effects. Taking into account that all the voids we study are generated from galaxy mock catalogs and show hierarchical structure at different levels, it may be possible that the same nonlinear mechanisms of mass distribution affect the void size distribution.
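A three-parameter log-normal size distribution of the kind fitted above can be written as a shifted log-normal density; the parameter names below are illustrative, not the paper's:

```python
import math

def lognormal_pdf(r, r0, sigma, shift=0.0):
    """Three-parameter (shifted) log-normal probability density: log-normal
    in (r - shift) with median r0 and log-scale width sigma."""
    if r <= shift:
        return 0.0
    z = (math.log(r - shift) - math.log(r0)) / sigma
    return math.exp(-0.5 * z * z) / ((r - shift) * sigma * math.sqrt(2 * math.pi))

# sanity check: numerically integrate the density over a wide range
step = 0.01
area = sum(lognormal_pdf(r * step, r0=5.0, sigma=0.4, shift=1.0) * step
           for r in range(1, 10000))
```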

  19. Reducing communication costs in the conjugate gradient algorithm on distributed memory multiprocessors

    SciTech Connect

    D'Azevedo, E.F.; Romine, C.H.

    1992-09-01

    The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
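For reference, the standard formulation discussed above looks as follows in serial form; the two inner products marked in the comments are the two global reductions that the modified algorithm combines into a single communication phase (this sketch shows only the standard variant):

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=200):
    """Textbook conjugate gradient for a symmetric positive-definite matrix A
    (nested lists). On a distributed-memory machine, each of the two inner
    products below requires its own reduction and synchronization phase."""
    n = len(b)
    matvec = lambda v: [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
    x = [0.0] * n
    r = b[:]                      # residual r = b - A x, with x starting at zero
    p = r[:]
    rho = sum(v * v for v in r)   # inner product 1: (r, r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rho / sum(u * v for u, v in zip(p, Ap))  # inner product 2: (p, Ap)
        x = [u + alpha * v for u, v in zip(x, p)]        # update solution
        r = [u - alpha * v for u, v in zip(r, Ap)]       # update residual
        rho_new = sum(v * v for v in r)
        if rho_new < tol:
            break
        p = [u + (rho_new / rho) * v for u, v in zip(r, p)]  # new search direction
        rho = rho_new
    return x

# usage: solve a small SPD system
A = [[4.0, 1.0], [1.0, 3.0]]
x = conjugate_gradient(A, [1.0, 2.0])
```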

  20. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes

    PubMed Central

    Azevedo, Analice C.; Bento, Cláudia B. P.; Ruiz, Jeronimo C.; Queiroz, Marisa V.

    2015-01-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660

  1. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides.

  2. Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.

    PubMed

    Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C

    2015-10-01

    Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660

  3. Velocity fluctuations and population distribution in clusters of settling particles at low Reynolds number

    NASA Astrophysics Data System (ADS)

    Boschan, A.; Ocampo, B. L.; Annichini, M.; Gauthier, G.

    2016-06-01

    A study on the spatial organization and velocity fluctuations of non-Brownian spherical particles settling at low Reynolds number in a vertical Hele-Shaw cell is reported. The particle volume fraction ranged from 0.005 to 0.05, while the distance between cell plates ranged from 5 to 15 times the particle radius. Particle tracking revealed that particles were not uniformly distributed in space but assembled in transient settling clusters. The population distribution of these clusters followed an exponential law. The measured velocity fluctuations are in agreement with that predicted theoretically for spherical clusters, from the balance between the apparent weight and the drag force. This result suggests that particle clustering, more than a spatial distribution of particles derived from random and independent events, is at the origin of the velocity fluctuations.
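
    The theoretical prediction mentioned above balances the apparent weight of a cluster against its drag; a hedged sketch of that balance (Stokes drag on a sphere of radius `r_cluster` is an assumption for illustration, not the paper's exact model):

```python
import math

def stokes_velocity(a, delta_rho, mu, g=9.81):
    """Settling speed of a single solid sphere at low Reynolds number:
    v = 2 * delta_rho * g * a^2 / (9 * mu)."""
    return 2.0 * delta_rho * g * a**2 / (9.0 * mu)

def cluster_velocity(n, a, r_cluster, delta_rho, mu, g=9.81):
    """Illustrative estimate: the apparent weight of n particles of
    radius a is balanced by Stokes drag on an effective sphere of
    radius r_cluster (assumed spherical cluster)."""
    weight = n * (4.0 / 3.0) * math.pi * a**3 * delta_rho * g
    return weight / (6.0 * math.pi * mu * r_cluster)
```

For n = 1 and r_cluster = a this reduces to the single-particle Stokes velocity, which is a useful sanity check on the formula.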

  4. Parallel Processing Performance on Multi-Core PC Cluster Distributing Communication Load to Multiple Paths

    NASA Astrophysics Data System (ADS)

    Fukunaga, Takafumi

    Due to the advent of powerful multi-core PC clusters, the computation performance of each node has increased dramatically, and this trend will continue in the future. On the other hand, powerful network systems (Myrinet, Infiniband, etc.) are expensive, tend to increase programming difficulty, and degrade portability because they need dedicated libraries and protocol stacks. This paper proposes a relatively simple method to improve bandwidth-oriented parallel applications by improving communication performance without such dedicated hardware, libraries, protocol stacks, or IEEE802.3ad (LACP). Although this proposal resembles IEEE802.3ad in its use of multiple Ethernet ports, it performs equal to or better than IEEE802.3ad without LACP switches and drivers. Moreover, whereas the performance of LACP is influenced by the environment (MAC addresses, IP addresses, etc.) because its distribution algorithm uses these parameters, the proposed method achieves the same effect regardless of them.

  5. Virus evolutionary genetic algorithm for task collaboration of logistics distribution

    NASA Astrophysics Data System (ADS)

    Ning, Fanghua; Chen, Zichen; Xiong, Li

    2005-12-01

    In order to achieve JIT (Just-In-Time) delivery and maximum client satisfaction in logistics collaboration, a Virus Evolutionary Genetic Algorithm (VEGA) was put forward under the double constraints of logistics resources and operation sequence. Based on a mathematical description of a multiple-objective function, the algorithm was designed to schedule logistics tasks with different due dates and allocate them to network members. By introducing a penalty item, makespan and customer satisfaction were expressed in the fitness function, and a dynamic adaptive probability of infection was used to improve the performance of local search. Compared to a standard Genetic Algorithm (GA), experimental results illustrate the performance superiority of VEGA. The VEGA can thus provide a powerful decision-making technique for optimizing resource configuration in a logistics network.

  6. Parallel grid generation algorithm for distributed memory computers

    NASA Technical Reports Server (NTRS)

    Moitra, Stuti; Moitra, Anutosh

    1994-01-01

    A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and the implementation of multiple levels of parallelism on multiple instruction multiple data machines is indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations based on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.

  7. The fractal dimensions of the spatial distribution of young open clusters in the solar neighbourhood

    NASA Astrophysics Data System (ADS)

    de La Fuente Marcos, R.; de La Fuente Marcos, C.

    2006-06-01

    Context: Fractals are geometric objects with dimensionalities that are not integers. They play a fundamental role in the dynamics of chaotic systems. Observation of fractal structure in both the gas and the star-forming sites in galaxies suggests that the spatial distribution of young open clusters should follow a fractal pattern, too. Aims: Here we investigate the fractal pattern of the distribution of young open clusters in the Solar Neighbourhood using a volume-limited sample from WEBDA and a multifractal analysis. By counting the number of objects inside spheres of different radii centred on clusters, we study the homogeneity of the distribution. Methods: The fractal dimension D of the spatial distribution of a volume-limited sample of young open clusters is determined by analysing different moments of the count-in-cells. The spectrum of the Minkowski-Bouligand dimension of the distribution is studied as a function of the parameter q. The sample is corrected for dynamical effects. Results: The Minkowski-Bouligand dimension varies with q in the range 0.71-1.77; therefore the distribution of young open clusters is fractal. We estimate that the average value of the fractal dimension is <D> = 1.7 ± 0.2 for the distribution of young open clusters studied. Conclusions: The spatial distribution of young open clusters in the Solar Neighbourhood exhibits multifractal structure. The fractal dimension is time-dependent, increasing over time. The values found are consistent with the fractal dimension of star-forming sites in other spiral galaxies.
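
    The count-in-cells idea described above (counting objects inside spheres of different radii centred on each cluster, then reading a dimension off the scaling) can be sketched as a minimal correlation-dimension estimator; this is an illustration of the technique, not the WEBDA analysis itself:

```python
import numpy as np

def correlation_dimension(points, radii):
    """Estimate the correlation dimension D2: for each radius r, count
    the mean number of neighbours within r of each point, then fit the
    slope of log <N(r)> versus log r."""
    # pairwise distance matrix (fine for a few hundred points)
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # subtract 1 to exclude each point counting itself
    counts = np.array([(d < r).sum(axis=1).mean() - 1.0 for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope
```

Points strung along a line embedded in 3-D should yield a dimension near 1, while a space-filling sample would approach 3; edge effects bias the estimate slightly low.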

  8. The abundance and spatial distribution of ultra-diffuse galaxies in nearby galaxy clusters

    NASA Astrophysics Data System (ADS)

    van der Burg, Remco F. J.; Muzzin, Adam; Hoekstra, Henk

    2016-05-01

    Recent observations have highlighted a significant population of faint but large (r_eff > 1.5 kpc) galaxies in the Coma cluster. The origin of these ultra-diffuse galaxies (UDGs) remains puzzling, as the interpretation of these observational results has been hindered by the (partly) subjective selection of UDGs, and the limited study of only the Coma (and some examples in the Virgo) cluster. In this paper we extend the study of UDGs using eight clusters in the redshift range 0.044 cluster mass, reaching ~200 in typical haloes of M200 ≃ 10^15 M⊙. For the ensemble cluster we measure the size distribution of UDGs, their colour-magnitude distribution, and their completeness-corrected radial density distribution within the clusters. The morphologically-selected cluster UDGs have colours consistent with the cluster red sequence, and have a steep size distribution that, at a given surface brightness, declines as n [dex^-1] ∝ r_eff^(-3.4 ± 0.2). Their radial distribution is significantly steeper than NFW in the outskirts, and is significantly shallower in the inner parts. We find them to follow the same radial distribution as the more massive quiescent galaxies in the clusters, except within the core region of r ≲ 0.15 × R200 (or ≲ 300 kpc). Within this region the number density of UDGs drops and is consistent with zero. These diffuse galaxies can only resist tidal forces down to this cluster-centric distance if they are highly centrally dark-matter dominated. The observation that the radial distribution of more compact dwarf galaxies (r_eff < 1.0 kpc) with similar luminosities follows the same distribution as the UDGs, but exist down to a smaller distance of 100 kpc from the

  9. A Hybrid Method for Image Segmentation Based on Artificial Fish Swarm Algorithm and Fuzzy c-Means Clustering.

    PubMed

    Ma, Li; Li, Yang; Fan, Suohai; Fan, Runzhu

    2015-01-01

    Image segmentation plays an important role in medical image processing. Fuzzy c-means (FCM) clustering is one of the popular clustering algorithms for medical image segmentation. However, FCM depends on the initial clustering centers, falls into local optima easily, and is sensitive to noise disturbance. To solve these problems, this paper proposes a hybrid artificial fish swarm algorithm (HAFSA). The proposed algorithm combines the artificial fish swarm algorithm (AFSA) with FCM, utilizing AFSA's global optimization search and parallel computing ability to find a superior result. Meanwhile, the Metropolis criterion and a noise reduction mechanism are introduced into AFSA to enhance the convergence rate and noise resistance. An artificial grid graph and Magnetic Resonance Imaging (MRI) are used in the experiments, and the experimental results show that the proposed algorithm has stronger noise resistance and higher precision. A number of evaluation indicators also demonstrate that HAFSA outperforms both FCM and suppressed FCM (SFCM). PMID:26649068
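
    The FCM baseline that HAFSA builds on is compact enough to sketch; this is the standard alternating update of centres and memberships, not the hybrid algorithm of the paper:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means: alternate between (1) recomputing cluster
    centres as membership-weighted means and (2) recomputing the fuzzy
    membership matrix U from distances to the centres."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)        # rows of U sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) + 1e-12
        w = d ** (-2.0 / (m - 1.0))          # standard membership update
        U_new = w / w.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < tol:
            return centers, U_new
        U = U_new
    return centers, U
```

The random initialisation of U is exactly the sensitivity the abstract criticises: on hard instances different seeds can land in different local optima, which is what the AFSA global search is meant to mitigate.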

  10. Effects of Formation Epoch Distribution on X-Ray Luminosity and Temperature Functions of Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Enoki, Motohiro; Takahara, Fumio; Fujita, Yutaka

    2001-07-01

    We investigate statistical properties of galaxy clusters in the context of a hierarchical clustering scenario, taking into account their formation epoch distribution; this study is motivated by the recent finding by Fujita and Takahara that X-ray clusters form a fundamental plane in which the mass and the formation epoch are regarded as two independent parameters. Using the formalism that discriminates between major mergers and accretion, the epoch of a cluster formation is identified with that of the last major merger. Since tiny mass accretion following formation does not much affect the core structure of clusters, the properties of X-ray emission from clusters are determined by the total mass and density at their formation time. Under these assumptions, we calculate X-ray luminosity and temperature functions of galaxy clusters. We find that the behavior of the luminosity function differs from the model that does not take into account formation epoch distribution; the behavior of the temperature function, however, is not much different. In our model, the luminosity function is shifted to a higher luminosity and shows no significant evolution up to z~1, independent of cosmological models. The clusters are populated on the temperature-luminosity plane, with a finite dispersion. Since the simple scaling model in which the gas temperature is equal to the virial temperature fails to reproduce the observed luminosity-temperature relation, we also consider a model that takes into account the effects of preheating. The preheating model reproduces the observations much more accurately.

  11. The GSAM software: A global search algorithm of minima exploration for the investigation of low lying isomers of clusters

    SciTech Connect

    Marchal, Rémi; Carbonnière, Philippe; Pouchan, Claude

    2015-01-22

    The study of atomic clusters has become an increasingly active area of research in recent years because of the fundamental interest in studying a completely new area that can bridge the gap between atomic and solid-state physics. Due to their specific properties, such compounds are of great interest in the field of nanotechnology [1,2]. Here, we present our GSAM algorithm, based on a DFT exploration of the PES, to find the low-lying isomers of such compounds. This algorithm includes the generation of an initial set of structures from which the most relevant are selected. Moreover, an optimization process, called raking optimization, able to discard step by step all the non-physically-reasonable configurations, has been implemented to reduce the computational cost of this algorithm. Structural properties of Ga_nAs_m clusters will be presented as an illustration of the method.

  12. Distributive Education Competency-Based Curriculum Models by Occupational Clusters. Final Report.

    ERIC Educational Resources Information Center

    Davis, Rodney E.; Husted, Stewart W.

    To meet the needs of distributive education teachers and students, a project was initiated to develop competency-based curriculum models for marketing and distributive education clusters. The models which were developed incorporate competencies, materials and resources, teaching methodologies/learning activities, and evaluative criteria for the…

  13. A Scheduling Algorithm for the Distributed Student Registration System in Transaction-Intensive Environment

    ERIC Educational Resources Information Center

    Li, Wenhao

    2011-01-01

    Distributed workflow technology has been widely used in modern education and e-business systems. Distributed web applications have shown cross-domain and cooperative characteristics to meet the need of current distributed workflow applications. In this paper, the author proposes a dynamic and adaptive scheduling algorithm PCSA (Pre-Calculated…

  14. Clustering instability of the spatial distribution of inertial particles in turbulent flows.

    PubMed

    Elperin, Tov; Kleeorin, Nathan; L'vov, Victor S; Rogachevskii, Igor; Sokoloff, Dmitry

    2002-09-01

    A theory of clustering of inertial particles advected by a turbulent velocity field, caused by an instability of their spatial distribution, is suggested. The reason for the clustering instability is a combined effect of the particles' inertia and a finite correlation time of the velocity field. The crucial parameter for the clustering instability is the size of the particles. The critical size is estimated for strong clustering (with a finite fraction of particles in clusters), associated with the growth of the mean absolute value of the particle number density, and for weak clustering, associated with the growth of the second and higher moments. A new concept of compressibility of the turbulent diffusion tensor, caused by a finite correlation time of an incompressible velocity field, is introduced. In this model of the velocity field, the field of Lagrangian trajectories is not divergence-free. A mechanism of saturation of the clustering instability, associated with particle collisions in the clusters, is suggested. Applications of the analyzed effects to the dynamics of droplets in the turbulent atmosphere are discussed. An estimated nonlinear level of saturation of the droplet number density in clouds exceeds their mean number density by orders of magnitude. The critical size of cloud droplets required for cluster formation is more than 20 μm.

  15. A genetic algorithm for layered multisource video distribution

    NASA Astrophysics Data System (ADS)

    Cheok, Lai-Tee; Eleftheriadis, Alexandros

    2005-03-01

    We propose a genetic algorithm -- MckpGen -- for rate scaling and adaptive streaming of layered video streams from multiple sources in a bandwidth-constrained environment. A genetic algorithm (GA) consists of several components: a representation scheme; a generator for creating an initial population; a crossover operator for producing offspring solutions from parents; a mutation operator to promote genetic diversity and a repair operator to ensure feasibility of solutions produced. We formulated the problem as a Multiple-Choice Knapsack Problem (MCKP), a variant of Knapsack Problem (KP) and a decision problem in combinatorial optimization. MCKP has many successful applications in fault tolerance, capital budgeting, resource allocation for conserving energy on mobile devices, etc. Genetic algorithms have been used to solve NP-complete problems effectively, such as the KP, however, to the best of our knowledge, there is no GA for MCKP. We utilize a binary chromosome representation scheme for MCKP and design and implement the components, utilizing problem-specific knowledge for solving MCKP. In addition, for the repair operator, we propose two schemes (RepairSimple and RepairBRP). Results show that RepairBRP yields significantly better performance. We further show that the average fitness of the entire population converges towards the best fitness (optimal) value and compare the performance at various bit-rates.
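
    The authors attack MCKP with a GA; for intuition about the problem itself, a tiny exact dynamic-programming solver is useful as a reference on small instances (this is a standard DP over groups with integer weights, not the MckpGen algorithm):

```python
def mckp_max_value(groups, capacity):
    """Multiple-Choice Knapsack: pick exactly one (value, weight) item
    from each group so that total weight <= capacity and total value is
    maximal. Returns the optimal value (-inf if infeasible).
    Exact DP over groups; integer weights assumed."""
    NEG = float("-inf")
    best = [NEG] * (capacity + 1)   # best[w] = max value using weight exactly w
    best[0] = 0.0
    for items in groups:
        nxt = [NEG] * (capacity + 1)
        for w, v in enumerate(best):
            if v == NEG:
                continue
            for value, weight in items:
                if w + weight <= capacity and v + value > nxt[w + weight]:
                    nxt[w + weight] = v + value
        best = nxt                  # one item from this group is now forced
    return max(best)
```

For the layered-video setting in the abstract, each group would correspond to one source's candidate rate/quality layers and the capacity to the available bandwidth; the DP is exponential-free but pseudo-polynomial, which is why heuristics such as the proposed GA are attractive at scale.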

  16. Multiobjective Memetic Estimation of Distribution Algorithm Based on an Incremental Tournament Local Searcher

    PubMed Central

    Yang, Kaifeng; Mu, Li; Yang, Dongdong; Zou, Feng; Wang, Lei; Jiang, Qiaoyong

    2014-01-01

    A novel hybrid multiobjective algorithm is presented in this paper, which combines a new multiobjective estimation of distribution algorithm, an efficient local searcher, and ε-dominance. Besides, two multiobjective problems with variable linkages strictly based on manifold distribution are proposed. The Pareto set of a continuous multiobjective optimization problem is, in the decision space, a piecewise low-dimensional continuous manifold. Regularity-based modeling of these manifold features builds the probability distribution model only from global statistical information of the population, so the information carried by promising individuals is not well exploited, which hampers the search and optimization process. Hereby, an incremental tournament local searcher is designed to exploit local information efficiently and accelerate convergence to the true Pareto-optimal front. Moreover, since ε-dominance is a strategy that can provide a multiobjective algorithm with well-distributed solutions at low computational complexity, ε-dominance and the incremental tournament local searcher are combined here. The novel memetic multiobjective estimation of distribution algorithm, MMEDA, was proposed accordingly. The algorithm is validated by experiments on twenty-two test problems, with and without variable linkages, of diverse complexities. Compared with three state-of-the-art multiobjective optimization algorithms, our algorithm achieves comparable results in terms of convergence and diversity metrics. PMID:25170526
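
    MMEDA itself is not reproduced here, but the core "estimation of distribution" step the abstract refers to, fitting a probability model to the selected individuals and sampling the next population from it, can be illustrated with a minimal univariate Gaussian EDA on a toy single-objective function:

```python
import numpy as np

def gaussian_eda(f, dim, pop=100, elite=30, gens=150, seed=0):
    """Minimal univariate Gaussian EDA (UMDA-style): each generation,
    fit an independent normal per coordinate to the elite sample, then
    draw the next population from that model. No local searcher, no
    multiobjective machinery; a single-objective sketch only."""
    rng = np.random.default_rng(seed)
    mu, sigma = np.zeros(dim), np.full(dim, 3.0)
    best_x, best_f = None, np.inf
    for _ in range(gens):
        X = rng.normal(mu, sigma, size=(pop, dim))
        fit = np.array([f(x) for x in X])
        order = np.argsort(fit)
        if fit[order[0]] < best_f:
            best_f, best_x = fit[order[0]], X[order[0]].copy()
        sel = X[order[:elite]]                  # truncation selection
        mu, sigma = sel.mean(axis=0), sel.std(axis=0) + 1e-9
    return best_x, best_f
```

The abstract's criticism applies even to this sketch: the Gaussian model carries only global statistics of the elite set, which is precisely what the incremental tournament local searcher is added to compensate for.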

  17. A Comparison of Alternative Distributed Dynamic Cluster Formation Techniques for Industrial Wireless Sensor Networks

    PubMed Central

    Gholami, Mohammad; Brennan, Robert W.

    2016-01-01

    In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency. PMID:26751447

  18. A Comparison of Alternative Distributed Dynamic Cluster Formation Techniques for Industrial Wireless Sensor Networks.

    PubMed

    Gholami, Mohammad; Brennan, Robert W

    2016-01-06

    In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency.

  19. Determination of the Johnson-Cook Constitutive Model Parameters of Materials by Cluster Global Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Huang, Zhipeng; Gao, Lihong; Wang, Yangwei; Wang, Fuchi

    2016-06-01

    The Johnson-Cook (J-C) constitutive model is widely used in the finite element simulation, as this model shows the relationship between stress and strain in a simple way. In this paper, a cluster global optimization algorithm is proposed to determine the J-C constitutive model parameters of materials. A set of assumed parameters is used for the accuracy verification of the procedure. The parameters of two materials (401 steel and 823 steel) are determined. Results show that the procedure is reliable and effective. The relative error between the optimized and assumed parameters is no more than 4.02%, and the relative error between the optimized and assumed stress is 0.2 × 10^-5 %. The J-C constitutive parameters can be determined more precisely and quickly than the traditional manual procedure. Furthermore, all the parameters can be simultaneously determined using several curves under different experimental conditions. A strategy is also proposed to accurately determine the constitutive parameters.
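
    The J-C flow stress being fitted is a closed-form product of strain-hardening, strain-rate, and thermal-softening terms; a sketch follows (the constants in the test are illustrative textbook-style values, not the paper's 401/823 steel parameters):

```python
import math

def johnson_cook_stress(strain, strain_rate, T,
                        A, B, n, C, m,
                        ref_rate=1.0, T_room=298.0, T_melt=1793.0):
    """Johnson-Cook flow stress:
        sigma = (A + B*eps^n) * (1 + C*ln(rate/ref_rate)) * (1 - T*^m),
    where T* = (T - T_room) / (T_melt - T_room) is the homologous
    temperature. Units follow whatever system A and B are given in."""
    t_star = (T - T_room) / (T_melt - T_room)
    return ((A + B * strain ** n)
            * (1.0 + C * math.log(strain_rate / ref_rate))
            * (1.0 - t_star ** m))
```

At the reference strain rate and room temperature the last two factors are exactly 1, so the curve reduces to A + B*eps^n; an optimizer such as the paper's cluster global algorithm would adjust (A, B, n, C, m) to minimize the misfit against measured stress-strain curves.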

  20. Determination of the Johnson-Cook Constitutive Model Parameters of Materials by Cluster Global Optimization Algorithm

    NASA Astrophysics Data System (ADS)

    Huang, Zhipeng; Gao, Lihong; Wang, Yangwei; Wang, Fuchi

    2016-09-01

    The Johnson-Cook (J-C) constitutive model is widely used in the finite element simulation, as this model shows the relationship between stress and strain in a simple way. In this paper, a cluster global optimization algorithm is proposed to determine the J-C constitutive model parameters of materials. A set of assumed parameters is used for the accuracy verification of the procedure. The parameters of two materials (401 steel and 823 steel) are determined. Results show that the procedure is reliable and effective. The relative error between the optimized and assumed parameters is no more than 4.02%, and the relative error between the optimized and assumed stress is 0.2 × 10^-5 %. The J-C constitutive parameters can be determined more precisely and quickly than the traditional manual procedure. Furthermore, all the parameters can be simultaneously determined using several curves under different experimental conditions. A strategy is also proposed to accurately determine the constitutive parameters.

  1. Intensity-based signal separation algorithm for accurate quantification of clustered centrosomes in tissue sections

    SciTech Connect

    Fleisch, Markus C.; Maxell, Christopher A.; Kuper, Claudia K.; Brown, Erika T.; Parvin, Bahram; Barcellos-Hoff, Mary-Helen; Costes, Sylvain V.

    2006-03-08

    Centrosomes are small organelles that organize the mitotic spindle during cell division and are also involved in cell shape and polarity. Within epithelial tumors, such as breast cancer, and some hematological tumors, centrosome abnormalities (CA) are common, occur early in disease etiology, and correlate with chromosomal instability and disease stage. In situ quantification of CA by optical microscopy is hampered by overlap and clustering of these organelles, which appear as focal structures. CA has been frequently associated with Tp53 status in premalignant lesions and tumors. Here we describe an approach to accurately quantify centrosomes in tissue sections and tumors. Considering proliferation and baseline amplification rate, the resulting population-based ratio of centrosomes per nucleus allows the approximation of the proportion of cells with CA. Using this technique we show that 20-30 percent of cells have amplified centrosomes in Tp53-null mammary tumors. Combining fluorescence detection, deconvolution microscopy and a mathematical algorithm applied to a maximum intensity projection, we show that this approach is superior to traditional investigator-based visual analysis or threshold-based techniques.

  2. A Distributed Data Implementation of the Perspective Shear-Warp Volume Rendering Algorithm for Visualisation of Large Astronomical Cubes

    NASA Astrophysics Data System (ADS)

    Beeson, Brett; Barnes, David G.; Bourke, Paul D.

    We describe the first distributed data implementation of the perspective shear-warp volume rendering algorithm and explore its applications to large astronomical data cubes and simulation realisations. Our system distributes sub-volumes of 3-dimensional images to leaf nodes of a Beowulf-class cluster, where the rendering takes place. Junction nodes composite the sub-volume renderings together and pass the combined images upwards for further compositing or display. We demonstrate that our system out-performs other software solutions and can render a `worst-case' 512 × 512 × 512 data volume in less than four seconds using 16 rendering and 15 compositing nodes. Our system also performs very well compared with much more expensive hardware systems. With appropriate commodity hardware, such as Swinburne's Virtual Reality Theatre or a 3Dlabs Wildcat graphics card, stereoscopic display is possible.
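
    The junction-node compositing step described above can be sketched with the associative "over" operator on premultiplied-alpha RGBA images; because "over" is associative, leaf renderings can be combined pairwise up a tree and still match strict front-to-back order (an illustration of the idea, not the paper's implementation):

```python
import numpy as np

def over(front, back):
    """'Over' compositing of premultiplied-alpha RGBA images.
    Associative, so junction nodes may combine sub-volume renderings
    in any tree shape as long as front-to-back order is preserved."""
    alpha = front[..., 3:4]
    return front + (1.0 - alpha) * back

def composite_tree(layers):
    """Pairwise (binary-tree) reduction of depth-sorted layers, as the
    compositing nodes of the cluster would perform it."""
    while len(layers) > 1:
        layers = [over(layers[i], layers[i + 1]) if i + 1 < len(layers)
                  else layers[i]
                  for i in range(0, len(layers), 2)]
    return layers[0]
```

The test checks the key property: tree-order compositing agrees with a plain sequential front-to-back reduction.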

  3. A correction algorithm for particle size distribution measurements made with the forward-scattering spectrometer probe

    NASA Technical Reports Server (NTRS)

    Lock, James A.; Hovenac, Edward A.

    1989-01-01

    A correction algorithm for evaluating the particle size distribution measurements of atmospheric aerosols obtained with a forward-scattering spectrometer probe (FSSP) is examined. A model based on Poisson statistics is employed to calculate the average diameter and rms width of the particle size distribution. The dead time and coincidence errors in the measured number density are estimated. The model generated data are compared with a Monte Carlo simulation of the FSSP operation. It is observed that the correlation between the actual and measured size distribution is nonlinear. It is noted that the algorithm permits more accurate calculation of the average diameter and rms width of the distribution compared to uncorrected measured quantities.
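The dead-time and coincidence effects mentioned above can be illustrated with textbook Poisson-statistics formulas; these are standard expressions, not the paper's exact correction model, and the rates and dead time below are arbitrary.

```python
# Illustrative Poisson-statistics corrections of the kind an FSSP algorithm
# must account for: a non-paralyzable dead-time correction and the fraction
# of counted events that actually contain two or more coincident particles.
import math

def corrected_rate(measured_rate, dead_time):
    """Recover the true count rate from a measured rate with dead time tau."""
    return measured_rate / (1.0 - measured_rate * dead_time)

def coincidence_fraction(mean_occupancy):
    """Poisson probability that a counted event holds >= 2 particles,
    given the mean number of particles in the sample volume."""
    lam = mean_occupancy
    p_ge1 = 1.0 - math.exp(-lam)
    p_ge2 = p_ge1 - lam * math.exp(-lam)
    return p_ge2 / p_ge1
```

At low occupancy the coincidence fraction is roughly half the mean occupancy, which is why coincidence errors grow quickly with number density.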

  4. THE STEADY-STATE WIND MODEL FOR YOUNG STELLAR CLUSTERS WITH AN EXPONENTIAL STELLAR DENSITY DISTRIBUTION

    SciTech Connect

    Silich, Sergiy; Tenorio-Tagle, Guillermo; Martinez-Gonzalez, Sergio; Bisnovatyi-Kogan, Gennadiy E-mail: gkogan@iki.rssi.ru

    2011-12-20

    A hydrodynamic model for steady-state, spherically symmetric winds driven by young stellar clusters with an exponential stellar density distribution is presented. Unlike most previous calculations, the position of the singular point R{sub sp}, which separates the inner subsonic zone from the outer supersonic flow, is not associated with the star cluster edge, but is calculated self-consistently. When the radiative losses of energy are negligible, the transition from subsonic to supersonic flow always occurs at R{sub sp} ≈ 4R{sub c}, where R{sub c} is the characteristic scale of the stellar density distribution, irrespective of other star cluster parameters. This is not the case in the catastrophic cooling regime, in which the temperature drops abruptly at a short distance from the star cluster center and the transition from the subsonic to the supersonic regime occurs much closer to the star cluster center. The impact of the major star cluster parameters on the inner structure of the wind is thoroughly discussed. Particular attention is paid to the effects of radiative cooling on the flow. The results of the calculations for a set of input parameters that lead to different hydrodynamic regimes are presented and compared to the results of non-radiative one-dimensional numerical simulations and of calculations with a homogeneous stellar mass distribution.

  5. Statistical sampling of the distribution of uranium deposits using geologic/geographic clusters

    USGS Publications Warehouse

    Finch, W.I.; Grundy, W.D.; Pierson, C.T.

    1992-01-01

    The concept of geologic/geographic clusters was developed particularly to study grade and tonnage models for sandstone-type uranium deposits. A cluster is a grouping of mined as well as unmined uranium occurrences within an arbitrary area about 8 km across. A cluster is a statistical sample that will accurately reflect the distribution of uranium in large regions relative to various geologic and geographic features. The example of the Colorado Plateau Uranium Province reveals that only 3 percent of the total number of clusters is in the largest tonnage-size category, greater than 10,000 short tons U3O8, and that 80 percent of the clusters are hosted by Triassic and Jurassic rocks. The distributions of grade and tonnage for clusters in the Powder River Basin show a wide variation; the grade distribution is highly variable, reflecting a difference between roll-front deposits and concretionary deposits, and the Basin contains about half the number in the greater-than-10,000 tonnage-size class as does the Colorado Plateau, even though it is much smaller. The grade and tonnage models should prove useful in finding the richest and largest uranium deposits. © 1992 Oxford University Press.

  6. Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.

    PubMed

    Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan

    2015-11-01

    Wireless sensor networks are engaged in various data gathering applications. The major bottleneck in wireless data gathering systems is the finite energy of the sensor nodes. By conserving the on-board energy, the life span of a wireless sensor network can be extended considerably. Since data communication is the dominant energy-consuming activity in a wireless sensor network, data reduction is an effective way to conserve nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce data communications. Forming clusters of nodes with similar data is an effective way to exploit spatial correlation among neighboring sensors. Sending only a subset of the data and estimating the rest from that subset is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data-similar iso-clusters with minimal communication overhead. The intra-cluster communication is reduced using an adaptive normalized least mean squares based dual prediction framework. The cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both the intra-cluster and the inter-cluster communications, while maintaining the accuracy of the collected data.
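The dual-prediction idea can be sketched as below; the filter order, step size, and threshold are assumptions for illustration, not the paper's values. Sensor and cluster head run identical normalized LMS (NLMS) predictors, so the sensor only transmits when its own prediction would miss by more than a tolerance.

```python
# Minimal sketch of an NLMS-based dual-prediction framework: identical
# filters at the node and the cluster head stay synchronized because both
# update on exactly the same value (true reading if sent, prediction if not).
import numpy as np

class NLMSPredictor:
    def __init__(self, order=4, mu=0.5, eps=1e-6):
        self.w = np.zeros(order)
        self.x = np.zeros(order)   # most recent values, newest first
        self.mu, self.eps = mu, eps

    def predict(self):
        return float(self.w @ self.x)

    def update(self, actual):
        err = actual - self.predict()
        norm = self.x @ self.x + self.eps
        self.w += self.mu * err * self.x / norm   # NLMS weight update
        self.x = np.roll(self.x, 1)
        self.x[0] = actual
        return err

def run_sensor(readings, threshold=0.5):
    """Return the subset of readings the node actually transmits."""
    node, head = NLMSPredictor(), NLMSPredictor()
    sent = []
    for r in readings:
        if abs(r - node.predict()) > threshold:
            sent.append(r)          # transmit: both sides see the true value
            node.update(r); head.update(r)
        else:
            p = node.predict()      # suppress: both sides use the prediction
            node.update(p); head.update(p)
    return sent
```

For slowly varying signals most readings fall within the threshold, so only a small subset is transmitted while the cluster head reconstructs the rest.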

  7. Distribution of sea anemones (Cnidaria, Actiniaria) in Korea analyzed by environmental clustering

    USGS Publications Warehouse

    Cha, H.-R.; Buddemeier, R.W.; Fautin, D.G.; Sandhei, P.

    2004-01-01

    Using environmental data and the geospatial clustering tools LOICZView and DISCO, we empirically tested the postulated existence and boundaries of four biogeographic regions in the southern part of the Korean peninsula. Environmental variables used included wind speed, sea surface temperature (SST), salinity, tidal amplitude, and the chlorophyll spectral signal. Our analysis confirmed the existence of four biogeographic regions, but the details of the borders between them differ from those previously postulated. Specimen-level distribution records of intertidal sea anemones were mapped; their distribution relative to the environmental data supported the importance of the environmental parameters we selected in defining suitable habitats. From the geographic coincidence between anemone distribution and the clusters based on environmental variables, we infer that geospatial clustering has the power to delimit ranges for marine organisms within relatively small geographical areas.

  8. Distributed edge detection algorithm based on wavelet transform for wireless video sensor network

    NASA Astrophysics Data System (ADS)

    Li, Qiulin; Hao, Qun; Song, Yong; Wang, Dongsheng

    2010-12-01

    Edge detection algorithms are critical to image processing and computer vision. Traditional edge detection algorithms are not suitable for wireless video sensor networks (WVSNs), in which the nodes have limited computational capability and resources. In this paper, a distributed edge detection algorithm based on the wavelet transform, designed for WVSNs, is proposed. The wavelet transform decomposes the image into several parts, which are then assigned to different nodes over the wireless network. Each node performs the edge detection algorithm on its sub-image, and all results are sent to the sink node, where fusion and synthesis, including image binarization and edge connection, are executed; finally, the edge image is output. A lifting scheme and a parallel distributed algorithm are adopted to improve efficiency and, at the same time, decrease the computational complexity. Experimental results show that this method achieves higher efficiency and better results.
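A toy version of this scheme can be sketched as follows; the strip-wise node assignment, the one-level Haar lifting, and the threshold are illustrative assumptions, not details from the paper.

```python
# Toy distributed wavelet edge detection: the image is split into horizontal
# strips, each "node" runs a one-level lifting Haar transform on its strip
# and thresholds the detail coefficients, and the sink stitches the per-strip
# binary edge maps back together.
import numpy as np

def haar_lifting_rows(strip):
    """One-level Haar transform along rows via lifting (predict + update)."""
    even, odd = strip[:, 0::2].astype(float), strip[:, 1::2].astype(float)
    detail = odd - even            # predict step: horizontal differences
    approx = even + detail / 2.0   # update step: preserves the running mean
    return approx, detail

def node_edge_map(strip, thresh=20.0):
    _, detail = haar_lifting_rows(strip)
    return (np.abs(detail) > thresh).astype(np.uint8)

def sink_fuse(edge_maps):
    return np.vstack(edge_maps)    # sink node stitches strip results together
```

Each strip is processed independently, so the per-node work scales with the strip size rather than the full image, which is the point of distributing the transform.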

  9. Distributed edge detection algorithm based on wavelet transform for wireless video sensor network

    NASA Astrophysics Data System (ADS)

    Li, Qiulin; Hao, Qun; Song, Yong; Wang, Dongsheng

    2011-05-01

    Edge detection algorithms are critical to image processing and computer vision. Traditional edge detection algorithms are not suitable for wireless video sensor networks (WVSNs), in which the nodes have limited computational capability and resources. In this paper, a distributed edge detection algorithm based on the wavelet transform, designed for WVSNs, is proposed. The wavelet transform decomposes the image into several parts, which are then assigned to different nodes over the wireless network. Each node performs the edge detection algorithm on its sub-image, and all results are sent to the sink node, where fusion and synthesis, including image binarization and edge connection, are executed; finally, the edge image is output. A lifting scheme and a parallel distributed algorithm are adopted to improve efficiency and, at the same time, decrease the computational complexity. Experimental results show that this method achieves higher efficiency and better results.

  10. Localization of WSN using Distributed Particle Swarm Optimization algorithm with precise references

    NASA Astrophysics Data System (ADS)

    Janapati, Ravi Chander; Balaswamy, Ch.; Soundararajan, K.

    2016-08-01

    Localization, i.e., determining the exact position of a node, is a key research area in Wireless Sensor Networks, and many algorithms have been proposed. Here we consider a cooperative localization algorithm with censoring schemes using the Cramer-Rao Bound (CRB). This censoring scheme improves positioning accuracy and reduces computational complexity, traffic, and latency. Particle swarm optimization (PSO) is a population-based search algorithm modeled on swarm intelligence, such as the social behavior of flocks of birds, swarms of bees, or schools of fish. To improve algorithm efficiency and localization precision, this paper presents an objective function based on the normal distribution of the ranging error and a method for obtaining the search space of the particles. A distributed localization algorithm combining PSO with the CRB is proposed. The proposed method shows better results in terms of position accuracy, latency, and complexity.
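A minimal PSO for range-based localization is sketched below; the inertia and acceleration coefficients, swarm size, and the simple squared-error objective are illustrative assumptions, not the paper's objective function or censoring scheme.

```python
# Toy PSO localization: particles search 2-D space for the point whose
# distances to known anchor nodes best match the measured ranges.
import numpy as np

rng = np.random.default_rng(0)

def localize(anchors, ranges, n_particles=30, iters=200):
    anchors = np.asarray(anchors, float)

    def cost(p):   # squared ranging-error objective
        d = np.linalg.norm(anchors - p, axis=1)
        return float(np.sum((d - ranges) ** 2))

    pos = rng.uniform(0, 100, size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_cost = np.array([cost(p) for p in pos])
    gbest = pbest[pbest_cost.argmin()].copy()

    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        # standard velocity update: inertia + cognitive + social terms
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        for i, p in enumerate(pos):
            c = cost(p)
            if c < pbest_cost[i]:
                pbest_cost[i], pbest[i] = c, p.copy()
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest
```

With three or more non-collinear anchors the objective has a well-defined minimum at the true position, so the swarm converges reliably on this smooth 2-D landscape.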

  11. Kanerva's sparse distributed memory: An associative memory algorithm well-suited to the Connection Machine

    NASA Technical Reports Server (NTRS)

    Rogers, David

    1988-01-01

    The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
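The core of sparse distributed memory can be captured in a few lines; the sizes and activation radius below are toy values, not Kanerva's recommended parameters or the Connection Machine implementation.

```python
# Compact sketch of Kanerva's sparse distributed memory: hard locations are
# random binary addresses; a write updates the counters of every location
# within Hamming radius r of the cue, and a read takes a per-bit majority
# vote over the same activated set.
import numpy as np

class SDM:
    def __init__(self, n_locations=500, dim=64, radius=28, seed=1):
        rng = np.random.default_rng(seed)
        self.addresses = rng.integers(0, 2, size=(n_locations, dim))
        self.counters = np.zeros((n_locations, dim), dtype=int)
        self.radius = radius

    def _active(self, addr):
        dist = (self.addresses != addr).sum(axis=1)   # Hamming distances
        return dist <= self.radius

    def write(self, addr, data):
        bipolar = 2 * np.asarray(data) - 1            # {0,1} -> {-1,+1}
        self.counters[self._active(addr)] += bipolar

    def read(self, addr):
        s = self.counters[self._active(addr)].sum(axis=0)
        return (s > 0).astype(int)                    # majority vote per bit
```

Because reads and writes touch only the sparse activated subset, the operations are embarrassingly parallel across locations, which is what made the algorithm such a natural fit for the Connection Machine.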

  12. Phenotype Clustering of Breast Epithelial Cells in Confocal Imagesbased on Nuclear Protein Distribution Analysis

    SciTech Connect

    Long, Fuhui; Peng, Hanchuan; Sudar, Damir; Levievre, Sophie A.; Knowles, David W.

    2006-09-05

    Background: The distribution of chromatin-associated proteins plays a key role in directing nuclear function. Previously, we developed an image-based method to quantify the nuclear distributions of proteins and showed that these distributions depended on the phenotype of human mammary epithelial cells. Here we describe a method that creates a hierarchical tree of the given cell phenotypes and calculates the statistical significance between them, based on clustering analysis of nuclear protein distributions. Results: Nuclear distributions of nuclear mitotic apparatus protein were previously obtained for non-neoplastic S1 and malignant T4-2 human mammary epithelial cells cultured for up to 12 days. Cell phenotype was defined as S1 or T4-2 and the number of days in culture. A probabilistic ensemble approach was used to define a set of consensus clusters from the results of multiple traditional cluster analysis techniques applied to the nuclear distribution data. Cluster histograms were constructed to show how cells in any one phenotype were distributed across the consensus clusters. Grouping various phenotypes allowed us to build phenotype trees and calculate the statistical difference between each group. The results showed that non-neoplastic S1 cells could be distinguished from malignant T4-2 cells with 94.19 percent accuracy; that proliferating S1 cells could be distinguished from differentiated S1 cells with 92.86 percent accuracy; and that there was no significant difference between the various phenotypes of T4-2 cells corresponding to increasing tumor sizes. Conclusion: This work presents a cluster analysis method that can identify significant cell phenotypes, based on the nuclear distribution of specific proteins, with high accuracy.

  13. CORRIGENDUM: Multiscale electrical contact resistance in clustered contact distribution Multiscale electrical contact resistance in clustered contact distribution

    NASA Astrophysics Data System (ADS)

    Lee, Sangyoung; Cho, Hyun; Jang, Yong Hoon

    2010-06-01

    The authors wish to explain the similarity between some figures in the above paper (hereafter called the JPD paper) and in their other publication, Lee S, Jang Y H and Kim W 2008 Effects of nanosized contact spots on thermal contact resistance J. Appl. Phys.103 074308 (hereafter called the JAP paper), and to explain the differences between the two papers, which are not explicitly stated in the JPD paper. The main objective of the JAP paper is to calculate the thermal contact resistance of the nanosized contact spots in multiscale contact. During the process of multiscale analysis, the thermal conductivity varies, especially below the phonon mean free path. The JPD paper deals with the electrical contact resistance in the multiscale contact distribution with an assumption of constant electrical resistivity, which is known as a different kind of physics in a larger characteristic length scale. There are similar figures in the JPD paper and the JAP paper: figures 6, 7 and 8 in the JPD paper and figures 3, 4 and 5 in the JAP paper. Two research works were performed on the basis of a specific microcontact distribution. In the JAP paper, the scale of the contact distribution is in the range of the phonon mean free path of Si, which is a very small size of contact distribution. In the JPD paper, the scale of contact distribution is in the continuum scale, which is larger than the phonon mean free path. In addition, due to the characteristics of a fractal surface which repeatedly generates a similar shape of contact distribution in the different length scales, the shape of contact distribution looks similar, but the total sizes of domain in the JPD and JAP figures are different. The projected areas L × L of fractal surface of the JAP paper and JPD paper are 10 μm × 10 μm and 10 mm × 10 mm, respectively. The length scale is already stated in the JAP paper, but not in the JPD paper. Thus, we have to state that the figures were adapted from the JAP paper without clear

  14. Distribution of histaminergic neuronal cluster in the rat and mouse hypothalamus.

    PubMed

    Moriwaki, Chinatsu; Chiba, Seiichi; Wei, Huixing; Aosa, Taishi; Kitamura, Hirokazu; Ina, Keisuke; Shibata, Hirotaka; Fujikura, Yoshihisa

    2015-10-01

    Histidine decarboxylase (HDC) catalyzes the biosynthesis of histamine from L-histidine and is expressed throughout the mammalian nervous system by histaminergic neurons. Histaminergic neurons arise in the posterior mesencephalon during the early embryonic period and gradually develop into two histaminergic substreams around the lateral area of the posterior hypothalamus and the more anterior peri-cerebral aqueduct area before finally forming an adult-like pattern comprising five neuronal clusters, E1, E2, E3, E4, and E5, at the postnatal stage. This distribution of histaminergic neuronal clusters in the rat hypothalamus appears to be a consequence of neuronal development and reflects the functional differentiation within each neuronal cluster. However, the close linkage between the locations of histaminergic neuronal clusters and their physiological functions has yet to be fully elucidated because of the sparse information regarding the location and orientation of each histaminergic neuronal cluster in the hypothalamus of rats and mice. To clarify the distribution of the five histaminergic neuronal clusters more clearly, we performed an immunohistochemical study using the anti-HDC antibody on serial sections of the rat hypothalamus according to the brain maps of the rat and the mouse. Our results confirmed that the HDC-immunoreactive (HDCi) neuronal clusters in the hypothalamus of rats and mice are observed in the ventrolateral part of the most posterior hypothalamus (E1), the ventrolateral part of the posterior hypothalamus (E2), the ventromedial part from the medial to the posterior hypothalamus (E3), the periventricular part from the anterior to the medial hypothalamus (E4), and a diffusely extended part of the more dorsal and almost entire hypothalamus (E5). Stereological estimation of the total number of HDCi neurons in each cluster revealed a larger number in the rat than in the mouse. The characterization of histaminergic neuronal clusters in the hypothalamus of rats and

  15. THE DISTRIBUTION OF DARK MATTER OVER THREE DECADES IN RADIUS IN THE LENSING CLUSTER ABELL 611

    SciTech Connect

    Newman, Andrew B.; Ellis, Richard S.; Treu, Tommaso; Marshall, Philip J.; Sand, David J.; Richard, Johan; Capak, Peter; Miyazaki, Satoshi

    2009-12-01

    We present a detailed analysis of the baryonic and dark matter distribution in the lensing cluster Abell 611 (z = 0.288), with the goal of determining the dark matter profile over an unprecedented range of cluster-centric distance. By combining three complementary probes of the mass distribution, weak lensing from multi-color Subaru imaging, strong lensing constraints based on the identification of multiply imaged sources in Hubble Space Telescope images, and resolved stellar velocity dispersion measures for the brightest cluster galaxy secured using the Keck telescope, we extend the methodology for separating the dark and baryonic mass components introduced by Sand et al. Our resulting dark matter profile samples the cluster from approximately 3 kpc to 3.25 Mpc, thereby providing an excellent basis for comparisons with recent numerical models. We demonstrate that only by combining our three observational techniques can degeneracies in constraining the form of the dark matter profile be broken on scales crucial for detailed comparisons with numerical simulations. Our analysis reveals that a simple Navarro-Frenk-White (NFW) profile is an unacceptable fit to our data. We confirm earlier claims, based on less extensive analyses of other clusters, that the inner profile of the dark matter distribution deviates significantly from the NFW form, and we find an inner logarithmic slope beta flatter than 0.3 (at 68% confidence; where rho{sub DM} is proportional to r{sup -beta} at small radii). In order to reconcile our data with cluster formation in a Lambda-CDM cosmology, we speculate that it may be necessary to revise our understanding of the nature of baryon-dark matter interactions in cluster cores. Comprehensive weak and strong lensing data, when coupled with kinematic information on the brightest cluster galaxy, can readily be applied to a larger sample of clusters to test the universality of these results.
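The inner-slope comparison above is usually phrased in terms of the generalized NFW profile; the short function below (illustrative code, not the authors' fitting pipeline) shows how the inner logarithmic slope beta enters, with beta = 1 recovering the standard NFW form.

```python
# Generalized NFW density profile:
#   rho(r) = rho_s / [(r/r_s)^beta * (1 + r/r_s)^(3 - beta)]
# so rho ~ r^(-beta) at small radii and rho ~ r^(-3) at large radii.
import numpy as np

def gnfw_density(r, rho_s, r_s, beta=1.0):
    x = np.asarray(r, float) / r_s
    return rho_s / (x ** beta * (1.0 + x) ** (3.0 - beta))
```

A flat inner slope such as beta < 0.3 means the density profile levels off toward the cluster core instead of rising as r^-1, which is the tension with the NFW form discussed in the abstract.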

  16. A SIMPLE PHYSICAL MODEL FOR THE GAS DISTRIBUTION IN GALAXY CLUSTERS

    SciTech Connect

    Patej, Anna; Loeb, Abraham

    2015-01-01

    The dominant baryonic component of galaxy clusters is hot gas whose distribution is commonly probed through X-ray emission arising from thermal bremsstrahlung. The density profile thus obtained has been traditionally modeled with a β-profile, a simple function with only three parameters. However, this model is known to be insufficient for characterizing the range of cluster gas distributions and attempts to rectify this shortcoming typically introduce additional parameters to increase the fitting flexibility. We use cosmological and physical considerations to obtain a family of profiles for the gas with fewer parameters than the β-model but which better accounts for observed gas profiles over wide radial intervals.
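The three-parameter beta-model the abstract refers to is the standard X-ray surface-density profile shown below; the authors' replacement family of profiles is not reproduced here.

```python
# Standard beta-model for cluster gas density:
#   n(r) = n0 * [1 + (r/r_c)^2]^(-3*beta/2)
# with central density n0, core radius r_c, and slope parameter beta.
import numpy as np

def beta_model(r, n0, r_c, beta):
    return n0 * (1.0 + (np.asarray(r, float) / r_c) ** 2) ** (-1.5 * beta)
```

The model is flat inside the core radius and falls as a single power law outside it, which is exactly the limited flexibility the abstract says fails to capture the observed range of gas profiles.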

  17. Chandrasekhar equations and computational algorithms for distributed parameter systems

    NASA Technical Reports Server (NTRS)

    Burns, J. A.; Ito, K.; Powers, R. K.

    1984-01-01

    The Chandrasekhar equations arising in optimal control problems for linear distributed parameter systems are considered. The equations are derived via approximation theory. This approach is used to obtain existence, uniqueness, and strong differentiability of the solutions and provides the basis for a convergent computation scheme for approximating feedback gain operators. A numerical example is presented to illustrate these ideas.

  18. Distributed topology control algorithm for multihop wireless networks

    NASA Technical Reports Server (NTRS)

    Borbash, S. A.; Jennings, E. H.

    2002-01-01

    We present a network initialization algorithm for wireless networks with distributed intelligence. Each node (agent) has only local, incomplete knowledge and must make local decisions to meet a predefined global objective. Our objective is to use power control to establish a topology based on the relative neighborhood graph, which has good overall performance in terms of power usage, low interference, and reliability.
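The relative neighborhood graph underlying this topology has a simple definition that can be computed directly (a centralized O(n^3) sketch for illustration; the paper's algorithm builds it distributively from local knowledge): nodes u and v are linked unless some witness node w is strictly closer to both of them.

```python
# Relative neighborhood graph: keep edge (u, v) iff there is no witness w
# with max(d(u, w), d(v, w)) < d(u, v).
import numpy as np

def rng_edges(points):
    pts = np.asarray(points, float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    edges = []
    for u in range(n):
        for v in range(u + 1, n):
            blocked = any(max(d[u, w], d[v, w]) < d[u, v]
                          for w in range(n) if w not in (u, v))
            if not blocked:
                edges.append((u, v))
    return edges
```

Long links with a closer intermediate node are pruned, which is why the resulting topology tends to use lower transmit power and cause less interference than the full communication graph.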

  19. The grid-dose-spreading algorithm for dose distribution calculation in heavy charged particle radiotherapy

    SciTech Connect

    Kanematsu, Nobuyuki; Yonai, Shunsuke; Ishizaki, Azusa

    2008-02-15

    A new variant of the pencil-beam (PB) algorithm for dose distribution calculation for radiotherapy with protons and heavier ions, the grid-dose spreading (GDS) algorithm, is proposed. The GDS algorithm is intrinsically faster than conventional PB algorithms due to approximations in convolution integral, where physical calculations are decoupled from simple grid-to-grid energy transfer. It was effortlessly implemented to a carbon-ion radiotherapy treatment planning system to enable realistic beam blurring in the field, which was absent with the broad-beam (BB) algorithm. For a typical prostate treatment, the slowing factor of the GDS algorithm relative to the BB algorithm was 1.4, which is a great improvement over the conventional PB algorithms with a typical slowing factor of several tens. The GDS algorithm is mathematically equivalent to the PB algorithm for horizontal and vertical coplanar beams commonly used in carbon-ion radiotherapy while dose deformation within the size of the pristine spread occurs for angled beams, which was within 3 mm for a single 150-MeV proton pencil beam of 30 deg. incidence, and needs to be assessed against the clinical requirements and tolerances in practical situations.
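The grid-to-grid energy transfer idea can be illustrated in one dimension; the Gaussian lateral kernel, spread width, and grid below are assumptions for illustration, not the GDS paper's kernels or geometry.

```python
# 1-D toy of grid-based dose spreading: each beam spot deposits its energy
# onto dose-grid points with normalized Gaussian lateral weights, so total
# energy is conserved while the dose is blurred over neighboring grid points.
import numpy as np

def spread_to_grid(spot_positions, spot_energies, grid, sigma=2.0):
    grid = np.asarray(grid, float)
    dose = np.zeros_like(grid)
    for x0, e in zip(spot_positions, spot_energies):
        w = np.exp(-0.5 * ((grid - x0) / sigma) ** 2)
        dose += e * w / w.sum()   # normalize: deposited energy equals e
    return dose
```

Decoupling the physics (each spot's energy and position) from the simple weighted grid transfer is what makes this style of algorithm fast relative to convolving a full pencil-beam kernel per spot.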

  20. Multi-source feature extraction and target recognition in wireless sensor networks based on adaptive distributed wavelet compression algorithms

    NASA Astrophysics Data System (ADS)

    Hortos, William S.

    2008-04-01

    Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. 
The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at

  1. A Multi-Hop Energy Neutral Clustering Algorithm for Maximizing Network Information Gathering in Energy Harvesting Wireless Sensor Networks

    PubMed Central

    Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X.

    2015-01-01

    Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of the network. Recently, the emergence of energy harvesting techniques has brought with it the prospect of overcoming this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an energy neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers that transmit data to the base station (BS) cooperatively via multi-hop communication. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we derive the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is derived mathematically using convex optimization techniques while maximizing network information gathering. Simulation results show that our protocol can achieve perpetual network operation, so that consistent data delivery is guaranteed. In addition, substantial improvements in network throughput are achieved compared to the well-known traditional clustering protocol LEACH and to recent energy-harvesting-aware clustering protocols. PMID:26712764
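The energy-neutrality constraint described above reduces to a simple budget check per data-gathering cycle; the functions below are a toy illustration with made-up energy terms, not the paper's derived constraints or its convex optimization.

```python
# Toy energy-neutrality check: a node is energy neutral over a cycle if the
# harvested energy covers sensing plus intra- and inter-cluster transmission.

def energy_neutral(harvest_rate, cycle_s, e_sense, e_intra_tx, e_inter_tx):
    harvested = harvest_rate * cycle_s          # joules gained per cycle
    consumed = e_sense + e_intra_tx + e_inter_tx
    return harvested >= consumed

def min_cycle(harvest_rate, e_sense, e_intra_tx, e_inter_tx):
    """Shortest data-gathering cycle that keeps the node energy neutral."""
    return (e_sense + e_intra_tx + e_inter_tx) / harvest_rate
```

Minimizing the cycle length subject to this constraint, over every node's energy budget, is the intuition behind deriving the minimum network data transmission cycle that still permits perpetual operation.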

  2. A Multi-Hop Energy Neutral Clustering Algorithm for Maximizing Network Information Gathering in Energy Harvesting Wireless Sensor Networks.

    PubMed

    Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X

    2015-12-26

    Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of the network. Recently, the emergence of energy harvesting techniques has brought with it the prospect of overcoming this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an energy neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers that transmit data to the base station (BS) cooperatively via multi-hop communication. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we derive the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is derived mathematically using convex optimization techniques while maximizing network information gathering. Simulation results show that our protocol can achieve perpetual network operation, so that consistent data delivery is guaranteed. In addition, substantial improvements in network throughput are achieved compared to the well-known traditional clustering protocol LEACH and to recent energy-harvesting-aware clustering protocols.

  3. Integrating Parallel and Distributed Data Mining Algorithms into the NASA Earth Exchange (NEX)

    NASA Astrophysics Data System (ADS)

    Oza, N.; Kumar, V.; Nemani, R. R.; Boriah, S.; Das, K.; Khandelwal, A.; Matthews, B.; Michaelis, A.; Mithal, V.; Nayak, G.; Votava, P.

    2014-12-01

    There is an urgent need in global climate change science for efficient model and/or data analysis algorithms that can be deployed in distributed and parallel environments because of the proliferation of large and heterogeneous data sets. Members of our team from NASA Ames Research Center and the University of Minnesota have been developing new distributed data mining algorithms and developing distributed versions of algorithms originally developed to run on a single machine. We are integrating these algorithms together with the Terrestrial Observation and Prediction System (TOPS), an ecological nowcasting and forecasting system, on the NASA Earth Exchange (NEX). We are also developing a framework under which data mining algorithm developers can make their algorithms available for use by scientists in our system, model developers can set up their models to run within our system and make their results available, and data source providers can make their data available, all with as little effort as possible. We demonstrate the substantial time savings and new results that can be derived through this framework by demonstrating an improvement to the Burned Area (BA) data product on a global scale. Our improvement was derived through development and implementation on NEX of a novel spatiotemporal time series change detection algorithm which will also be presented.

  4. Mass and galaxy distributions of four massive galaxy clusters from Dark Energy Survey Science Verification data

    DOE PAGES

    Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; et al

    2015-03-31

    We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.

  5. Mass and galaxy distributions of four massive galaxy clusters from Dark Energy Survey Science Verification data

    SciTech Connect

    Melchior, P.; et al.

    2015-05-21

    We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modeling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modeling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. In addition, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.

  6. Mass and galaxy distributions of four massive galaxy clusters from Dark Energy Survey Science Verification data

    SciTech Connect

    Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. F.; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.

    2015-03-31

    We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.

  7. Distributed adaptive pinning control for cluster synchronization of nonlinearly coupled Lur'e networks

    NASA Astrophysics Data System (ADS)

    Tang, Ze; Park, Ju H.; Lee, Tae H.

    2016-10-01

    This paper is devoted to the cluster synchronization of nonlinearly coupled Lur'e networks under a distributed adaptive pinning control strategy. Time-varying delayed networks consisting of identical and nonidentical Lur'e systems are discussed by applying an edge-based pinning control scheme. In each cluster, the edges belonging to the spanning tree are pinned. In view of the nonlinear couplings of the networks, for the first time, an efficient distributed nonlinear adaptive update law based on local information about each node's dynamical behavior is proposed. Sufficient criteria for achieving cluster synchronization are derived from the S-procedure, the Kronecker product and Lyapunov stability theory. Illustrative examples demonstrate the effectiveness of the theoretical results.

  8. Differential Evolution Algorithm with Diversified Vicinity Operator for Optimal Routing and Clustering of Energy Efficient Wireless Sensor Networks.

    PubMed

    Sumithra, Subramaniam; Victoire, T Aruldoss Albert

    2015-01-01

    Due to the large dimension of clusters and the increasing number of sensor nodes, finding the optimal route and clustering for large wireless sensor networks (WSN) is highly complex and cumbersome. This paper proposes a new method to determine a reasonably better solution to the clustering and routing problem, with particular concern for efficient energy consumption of the sensor nodes to extend network lifetime. The proposed method is based on the Differential Evolution (DE) algorithm with an improvised search operator called the Diversified Vicinity Procedure (DVP), which models a trade-off between energy consumption of the cluster heads and delay in forwarding the data packets. The route obtained by the proposed method from all gateways to the base station is comparatively shorter in overall distance and requires fewer data forwards. Extensive numerical experiments demonstrate the superiority of the proposed method in managing the energy consumption of the WSN, and the results are compared with other algorithms reported in the literature. PMID:26516635

  9. Optical algorithm for calculating the quantity distribution of fiber assembly.

    PubMed

    Wu, Meiqin; Wang, Fumei

    2016-09-01

    A modification of the two-flux model of Kubelka-Munk was proposed to describe light propagating through a fiber-air mixture medium, simplifying the fibers' internal reflection as part of the scattering along the total fiber path length. A series of systematic experiments demonstrated higher consistency with the reference quantity distribution than the common Lambert law applied to the fibrogram used in the textile industry. Its application to fibrogram-based measurement of cotton fiber length was demonstrated to perform well, extending its applicability to wool fiber, whose length is harder to measure than that of cotton. PMID:27607296

  10. Waiting-time distributions of magnetic discontinuities: Clustering or Poisson process?

    SciTech Connect

    Greco, A.; Matthaeus, W. H.; Servidio, S.; Dmitruk, P.

    2009-10-15

    Using solar wind data from the Advanced Composition Explorer spacecraft, with the support of Hall magnetohydrodynamic simulations, the waiting-time distributions of magnetic discontinuities have been analyzed. A possible clustering of these discontinuities is studied in detail. We perform a local Poisson analysis to establish whether these intermittent events are randomly distributed or not. Possible implications for the nature of solar wind discontinuities are discussed.
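One simple diagnostic underlying this kind of analysis: for a Poisson process the waiting times are exponentially distributed, so their coefficient of variation (CV) is 1, while clustered (bursty) events give CV > 1. A minimal illustrative sketch on synthetic event times (not the authors' code; the burst parameters are made up):

```python
import numpy as np

def waiting_time_cv(event_times):
    """Coefficient of variation of inter-event waiting times.
    ~1 for a Poisson process; >1 indicates clustering (burstiness)."""
    dt = np.diff(np.sort(np.asarray(event_times, dtype=float)))
    return dt.std() / dt.mean()

rng = np.random.default_rng(42)

# Poisson process: exponential waiting times -> CV close to 1
poisson_times = np.cumsum(rng.exponential(1.0, size=5000))
cv_poisson = waiting_time_cv(poisson_times)

# Clustered process: mixture of short intra-burst and long inter-burst gaps
gaps = np.where(rng.random(5000) < 0.9,
                rng.exponential(0.1, 5000),    # within bursts
                rng.exponential(10.0, 5000))   # between bursts
cv_clustered = waiting_time_cv(np.cumsum(gaps))
```

A local Poisson analysis, as in the abstract, applies this kind of test within sliding windows rather than globally.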

  11. Visceral leishmaniasis in border areas: clustered distribution of phlebotomine sand flies in Clorinda, Argentina.

    PubMed

    Salomón, Oscar D; Quintana, María G; Bruno, Mario R; Quiriconi, Ricardo V; Cabral, Viviana

    2009-08-01

    Three years after the first report of Lutzomyia longipalpis in Clorinda, Argentina, a border city near Asunción, Paraguay, the city was surveyed again. Lu. longipalpis was found clustered in the same neighbourhoods in 2007 as in 2004, even though the scattered distribution of canine visceral leishmaniasis was more related to the traffic of dogs through the border.

  12. Improved mine blast algorithm for optimal cost design of water distribution systems

    NASA Astrophysics Data System (ADS)

    Sadollah, Ali; Guen Yoo, Do; Kim, Joong Hoon

    2015-12-01

    The design of water distribution systems is a large class of combinatorial, nonlinear optimization problems with complex constraints such as conservation of mass and energy equations. Since feasible solutions are often extremely complex, traditional optimization techniques are insufficient. Recently, metaheuristic algorithms have been applied to this class of problems because they are highly efficient. In this article, a recently developed optimizer called the mine blast algorithm (MBA) is considered. The MBA is improved and coupled with the hydraulic simulator EPANET to find the optimal cost design for water distribution systems. The performance of the improved mine blast algorithm (IMBA) is demonstrated using the well-known Hanoi, New York tunnels and Balerma benchmark networks. Optimization results obtained using IMBA are compared to those using MBA and other optimizers in terms of their minimum construction costs and convergence rates. For the complex Balerma network, IMBA offers the cheapest network design compared to other optimization algorithms.

  13. A swarm intelligence based memetic algorithm for task allocation in distributed systems

    NASA Astrophysics Data System (ADS)

    Sarvizadeh, Raheleh; Haghi Kashani, Mostafa

    2011-12-01

    This paper proposes a Swarm Intelligence based Memetic algorithm for task allocation and scheduling in distributed systems. Task scheduling in distributed systems is known to be NP-complete. Hence, many genetic algorithms have been proposed to search for optimal solutions over the entire solution space. However, these existing approaches scan the entire solution space without techniques that can reduce the complexity of the optimization; their main shortcoming is the excessive time spent on scheduling. Therefore, this paper uses a memetic algorithm to address this shortcoming. To balance load efficiently, Bee Colony Optimization (BCO) is applied as the local search in the proposed memetic algorithm. Extensive experimental results demonstrated that the proposed method outperformed the existing GA-based method in terms of CPU utilization.
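The memetic pattern described here, a genetic algorithm whose offspring are refined by a local search, can be sketched as follows. This is an illustrative toy, not the paper's method: a simple greedy hill-climb stands in for the BCO local search, the task costs are invented, and the fitness is the makespan (maximum node load).

```python
import random

def makespan(assign, costs, n_nodes):
    """Maximum node load for a task-to-node assignment."""
    loads = [0.0] * n_nodes
    for task, node in enumerate(assign):
        loads[node] += costs[task]
    return max(loads)

def local_search(assign, costs, n_nodes):
    """Greedy refinement: try moving each task to every node,
    keeping any move that lowers the makespan (stands in for BCO)."""
    assign = assign[:]
    changed = True
    while changed:
        changed = False
        for task in range(len(assign)):
            current = makespan(assign, costs, n_nodes)
            for node in range(n_nodes):
                candidate = assign[:]
                candidate[task] = node
                if makespan(candidate, costs, n_nodes) < current:
                    assign = candidate
                    current = makespan(candidate, costs, n_nodes)
                    changed = True
    return assign

def memetic(costs, n_nodes, pop=20, gens=30, seed=1):
    rng = random.Random(seed)
    n = len(costs)
    popu = [[rng.randrange(n_nodes) for _ in range(n)] for _ in range(pop)]
    for _ in range(gens):
        popu.sort(key=lambda a: makespan(a, costs, n_nodes))
        parents = popu[:pop // 2]                  # elitist selection
        children = []
        for _ in range(pop - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]              # one-point crossover
            if rng.random() < 0.3:                 # mutation
                child[rng.randrange(n)] = rng.randrange(n_nodes)
            children.append(local_search(child, costs, n_nodes))
        popu = parents + children
    return min(popu, key=lambda a: makespan(a, costs, n_nodes))

costs = [5, 3, 8, 2, 7, 4, 6, 1, 9, 2]   # hypothetical task costs
best = memetic(costs, n_nodes=3)
best_makespan = makespan(best, costs, 3)
```

The local search step is what distinguishes a memetic algorithm from a plain GA: every child is pushed to a local optimum before it competes, which is how such methods avoid blindly scanning the solution space.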

  14. A swarm intelligence based memetic algorithm for task allocation in distributed systems

    NASA Astrophysics Data System (ADS)

    Sarvizadeh, Raheleh; Haghi Kashani, Mostafa

    2012-01-01

    This paper proposes a Swarm Intelligence based Memetic algorithm for task allocation and scheduling in distributed systems. Task scheduling in distributed systems is known to be NP-complete. Hence, many genetic algorithms have been proposed to search for optimal solutions over the entire solution space. However, these existing approaches scan the entire solution space without techniques that can reduce the complexity of the optimization; their main shortcoming is the excessive time spent on scheduling. Therefore, this paper uses a memetic algorithm to address this shortcoming. To balance load efficiently, Bee Colony Optimization (BCO) is applied as the local search in the proposed memetic algorithm. Extensive experimental results demonstrated that the proposed method outperformed the existing GA-based method in terms of CPU utilization.

  15. Distributed learning automata-based algorithm for community detection in complex networks

    NASA Astrophysics Data System (ADS)

    Khomami, Mohammad Mehdi Daliri; Rezvanian, Alireza; Meybodi, Mohammad Reza

    2016-03-01

    Community structure is an important and universal topological property of many complex networks such as social and information networks. The detection of communities in a network is a significant technique for understanding the structure and function of networks. In this paper, we propose an algorithm based on distributed learning automata for community detection (DLACD) in complex networks. In the proposed algorithm, each vertex of the network is equipped with a learning automaton. Through cooperation among the network of learning automata and updates to each automaton's action probabilities, the algorithm iteratively identifies high-density local communities. The performance of the proposed algorithm is investigated through a number of simulations on popular synthetic and real networks. Experimental results, in comparison with popular community detection algorithms such as walk trap, Danon greedy optimization, fuzzy community detection, multi-resolution community detection and label propagation, demonstrated the superiority of DLACD in terms of modularity, NMI, performance, min-max-cut and coverage.
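The core primitive in such schemes is the learning automaton's probability update. A common choice (an assumption here; the paper's exact scheme may differ) is the linear reward-inaction rule, where a rewarded action's probability grows while the others shrink proportionally, and penalties leave the vector unchanged:

```python
def lri_update(probs, chosen, rewarded, a=0.1):
    """Linear reward-inaction (L_R-I) update for a learning automaton.
    On reward, the chosen action's probability grows by a*(1-p) and the
    others shrink by factor (1-a), keeping the vector normalized.
    On penalty, probabilities are unchanged."""
    if not rewarded:
        return list(probs)
    return [p + a * (1.0 - p) if i == chosen else p * (1.0 - a)
            for i, p in enumerate(probs)]

# Four candidate actions (e.g. candidate communities for a vertex)
p = [0.25, 0.25, 0.25, 0.25]
for _ in range(50):                 # environment repeatedly rewards action 0
    p = lri_update(p, chosen=0, rewarded=True)
```

Repeated rewards drive the automaton to converge on the favored action, which in DLACD-style algorithms corresponds to a vertex settling into a community.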

  16. A clustering analysis of eddies' spatial distribution in the South China Sea

    NASA Astrophysics Data System (ADS)

    Yi, J.; Du, Y.; Wang, X.; He, Z.; Zhou, C.

    2013-02-01

    Spatial variation is important for studying mesoscale eddies in the South China Sea (SCS). To investigate such spatial variations, this study performed a clustering analysis of the eddies' distribution using the K-means approach. Results showed that the clustering tendencies of anticyclonic eddies (AEs) and cyclonic eddies (CEs) were weak but not random, and the number of clusters was shown to be greater than four. Finer clustering results showed 10 regions densely populated by AEs and 6 regions for CEs in the SCS. Previous studies corroborate these partitions, and possible generation mechanisms are related to them. Comparisons between AEs and CEs revealed that the patterns of AEs are relatively more aggregated than those of CEs, with the following specific distinctions: (1) to the southwest of Luzon Island, AEs and CEs are generated spatially apart; AEs are likely located north of 14° N and closer to shore, while CEs are to the south and further offshore. (2) The central SCS and Nansha Trough are mostly dominated by AEs. (3) Along 112° E, clusters of AEs and CEs are located sequentially apart, and the pairs off Vietnam represent the dipole structures. (4) To the southwest of the Dongsha Islands, AEs are concentrated to the east of CEs. Overlaps of AEs and CEs in the northeastern and southern SCS were further examined considering seasonal variations. The northeastern overlap represented near-concentric distributions, while the southern one was a mixed effect of seasonal variations, complex circulations and topographic influences.
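The K-means approach used here partitions eddy centre positions into k groups by alternating assignment and centroid updates (Lloyd's algorithm). A minimal sketch on synthetic 2D positions (the lon/lat concentrations below are hypothetical, not the paper's data):

```python
import numpy as np

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm: alternate nearest-centre assignment
    and centroid recomputation until centres stop moving."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        new = np.array([points[labels == j].mean(axis=0)
                        if np.any(labels == j) else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return centers, labels

# Two synthetic "eddy" concentrations (hypothetical lon/lat in degrees)
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal([112.0, 10.0], 0.5, (200, 2)),
                 rng.normal([118.0, 20.0], 0.5, (200, 2))])
centers, labels = kmeans(pts, k=2)
```

Choosing k (the paper argues k > 4 for the SCS) is typically done by comparing within-cluster variance or a clustering-tendency statistic across candidate values of k.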

  17. Distributed Genetic Algorithm for Feature Selection in Gaia RVS Spectra: Application to ANN Parameterization

    NASA Astrophysics Data System (ADS)

    Fustes, Diego; Ordóñez, Diego; Dafonte, Carlos; Manteiga, Minia; Arcay, Bernardino

    This work presents an algorithm that was developed to select the most relevant areas of a stellar spectrum to extract its basic atmospheric parameters. We consider synthetic spectra obtained from models of stellar atmospheres in the spectral region of the radial velocity spectrograph instrument of the European Space Agency's Gaia space mission. The algorithm that demarcates the areas of the spectra sensitive to each atmospheric parameter (effective temperature and gravity, metallicity, and abundance of alpha elements) is a genetic algorithm, and the parameterization takes place through the learning of artificial neural networks. Due to the high computational cost of processing, we present a distributed implementation in both multiprocessor and multicomputer environments.

  18. 3D Drop Size Distribution Extrapolation Algorithm Using a Single Disdrometer

    NASA Technical Reports Server (NTRS)

    Lane, John

    2012-01-01

    Determining the Z-R relationship (where Z is the radar reflectivity factor and R is rainfall rate) from disdrometer data has been and is a common goal of cloud physicists and radar meteorology researchers. The usefulness of this quantity has traditionally been limited since radar represents a volume measurement, while a disdrometer corresponds to a point measurement. To solve that problem, a 3D-DSD (drop-size distribution) method of determining an equivalent 3D Z-R was developed at the University of Central Florida and tested at the Kennedy Space Center, FL. Unfortunately, that method required a minimum of three disdrometers clustered together within a microscale network (0.1-km separation). Since most commercial disdrometers used by the radar meteorology/cloud physics community are high-cost instruments, three disdrometers located within a microscale area is generally not a practical strategy given the limitations of typical research budgets. A relatively simple modification to the 3D-DSD algorithm provides an estimate of the 3D-DSD and therefore a 3D Z-R measurement using a single disdrometer. The basis of the horizontal extrapolation is mass conservation of a drop size increment, employing the mass conservation equation. For vertical extrapolation, convolution of a drop size increment using raindrop terminal velocity is used. Together, these two independent extrapolation techniques provide a complete 3D-DSD estimate in a volume around and above a single disdrometer. The estimation error is lowest along a vertical plane intersecting the disdrometer position in the direction of wind advection. This work demonstrates that multiple sensors are not required for successful implementation of the 3D interpolation/extrapolation algorithm. This is a great benefit, since multiple sensors in the required spatial arrangement are seldom available for this type of analysis.
The original software (developed at the University of Central Florida, 1998-2000) has
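The quantities named in this abstract are moments of the drop-size distribution: Z is the sixth moment of N(D), and R is the third moment weighted by terminal fall speed. A hedged numerical sketch, using an illustrative Marshall-Palmer exponential DSD and an assumed power-law terminal velocity (neither is from the article):

```python
import numpy as np

# Illustrative Marshall-Palmer DSD (not the article's measured DSD):
# N(D) = N0 * exp(-Lambda * D), with D in mm and N in m^-3 mm^-1
N0 = 8000.0                       # m^-3 mm^-1
lam = 4.1 * 10.0 ** (-0.21)       # mm^-1, MP value for R around 10 mm/h

dD = 0.05
D = np.arange(dD, 8.0, dD)        # drop diameters, mm
N = N0 * np.exp(-lam * D)

# Radar reflectivity factor: 6th moment of the DSD (mm^6 m^-3)
Z = np.sum(N * D**6) * dD
dbz = 10.0 * np.log10(Z)

# Rain rate: 3rd moment weighted by fall speed, in mm/h.
# v(D) = 3.78 * D**0.67 m/s is an assumed power-law terminal velocity.
v = 3.78 * D**0.67
R = 6.0e-4 * np.pi * np.sum(N * v * D**3) * dD
```

With a point disdrometer, these sums are evaluated from counted drops per size bin; the 3D-DSD method described above extrapolates N(D) horizontally (mass conservation) and vertically (terminal-velocity convolution) before forming the same moments over a volume.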

  19. Node status algorithm for load balancing in distributed service architectures at paperless medical institutions.

    PubMed

    Logeswaran, Rajasvaran; Chen, Li-Choo

    2008-12-01

    Service architectures are necessary for providing value-added services in telecommunications networks, including those in medical institutions. Separation of service logic and control from the actual call switching is the main idea of these service architectures; examples include the Intelligent Network (IN), Telecommunications Information Network Architecture (TINA), and Open Service Access (OSA). In Distributed Service Architectures (DSA), instances of the same object type can be placed on different physical nodes. Hence, network performance can be enhanced by introducing load balancing algorithms that efficiently distribute traffic between object instances, so that overall throughput and network performance are optimised. In this paper, we propose a new load balancing algorithm called the "Node Status Algorithm" for DSA infrastructure applicable to electronic-based medical institutions. The simulation results illustrate that the proposed algorithm outperforms the benchmark load balancing algorithms (Random Algorithm and Shortest Queue Algorithm), especially under medium and heavy network loads, which are typical of the increasing bandwidth utilization and processing requirements at paperless hospitals and in telemedicine environments.
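The two benchmarks named in the abstract are easy to sketch: Random assigns each request to a uniformly chosen node, while Shortest Queue sends it to the least-loaded node. A toy simulation with unit-cost tasks (illustrative only; the paper's Node Status Algorithm additionally weighs node status information and is not reproduced here):

```python
import random

def simulate(n_tasks, n_nodes, policy, seed=7):
    """Assign unit-cost tasks to nodes under a policy;
    return the maximum node load (a simple congestion measure)."""
    rng = random.Random(seed)
    loads = [0] * n_nodes
    for _ in range(n_tasks):
        if policy == "random":
            node = rng.randrange(n_nodes)       # uniform random node
        else:                                   # "shortest_queue"
            node = loads.index(min(loads))      # least-loaded node
        loads[node] += 1
    return max(loads)

rand_max = simulate(1000, 5, "random")
sq_max = simulate(1000, 5, "shortest_queue")
```

With equal tasks, Shortest Queue balances perfectly (200 per node here), while Random leaves some node above the average, which is why the gap between policies widens under heavy load.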

  20. A quick scan and lane recognition algorithm based on positional distribution and edge features

    NASA Astrophysics Data System (ADS)

    Wang, Jian; Zhang, Yuan; Chen, Xiaomin; Shi, Xiaoying

    2010-08-01

    With the growing number of vehicles on the road, the automatic guided vehicles (AGV) vision system for intelligent vehicles has been given more and more attention. Lane recognition is an important component in the automatic guided vehicles (AGV) vision system for intelligent vehicles. To improve the speed and accuracy of lane recognition, this paper proposed an image segmentation algorithm based on the normalized histogram matching and a specific image scan algorithm based on positional distribution of lanes to reduce runtime. The purpose of image segmentation is extracting useful road information and the algorithm is segmenting the image by calculating the similarity of Cumulative Distribution Function (CDF) of normalization histogram. The main idea of image scan algorithm proposed in this paper is regarding the lanes that have been found as starting points and looking for the new lanes. Then we use a novel lane screen algorithm based on the left and right edges of lanes' geometric feature to remove invalid information and improve the accuracy and promote efficiency effectively. At last, a lane prediction algorithm is proposed to predict the farther lanes which may be lost due to treating as noises. After our tests, this algorithm has better robustness and higher efficiency.

  1. A Feature of Stellar Density Distribution within the Tidal Radius of Globular Cluster NGC 6626 (M28) in the Galactic Bulge

    NASA Astrophysics Data System (ADS)

    Chun, Sang-Hyun; Kim, Jae-Woo; Kim, Myo Jin; Kim, Ho-Il; Park, Jang-Hyun; Sohn, Young-Jong

    2012-07-01

    Using wide-field J, H, and Ks images obtained with the WIRCam near-infrared camera on the Canada-France-Hawaii Telescope, we investigated the spatial density configuration of stars within the tidal radius of metal-poor globular cluster NGC 6626, which is known to be located near the bulge region and have a thick-disk orbit motion. In order to trace the stellar density around a target cluster, we sorted the cluster's member stars using a mask-filtering algorithm and weighted the stars on the color-magnitude diagram. From the weight-summed surface density map, we found that the spatial density distribution of stars within the tidal radius is asymmetric with distorted overdensity features. The extending overdensity features are likely associated with the effects of the dynamic interaction with the Galaxy and the cluster's space motion. An interesting finding is that the prominent overdensity features extend toward the direction of the Galactic plane. This orientation of stellar density distribution can be interpreted with the result of the disk-shock effect of the Galaxy, previously experienced by the cluster. Indeed, the radial surface density profile accurately describes this overdensity feature with a break in the slope inside the tidal radius. Thus, the observed results indicate that NGC 6626 experienced a strong dynamical interaction with the Galaxy. Based on the result of strong tidal interaction with the Galaxy and previously published results, we discussed possible origins and evolutions of the cluster: (1) the cluster might have formed in satellite galaxies that were merged and created the Galactic bulge region in the early universe, after which time its dynamical properties were modified by dynamical friction, or (2) the cluster might have formed in primordial and rotationally supported massive clumps in the thick disk of the Galaxy.

  2. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    ScienceCinema

    Li, Weizhong [San Diego Supercomputer Center

    2016-07-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  3. Effective Analysis of NGS Metagenomic Data with Ultra-Fast Clustering Algorithms (MICW - Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)

    SciTech Connect

    Li, Weizhong

    2011-10-12

    San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.

  4. Self-calibration of cluster dark energy studies: Observable-mass distribution

    SciTech Connect

    Lima, Marcos; Hu, Wayne

    2005-08-15

    The exponential sensitivity of cluster number counts to the properties of the dark energy implies a comparable sensitivity to not only the mean but also the actual distribution of an observable-mass proxy given the true cluster mass. For example a 25% scatter in mass can provide a ~50% change in the number counts at z ~ 2 for the upcoming SPT survey. Uncertainty in the scatter of this amount would degrade dark energy constraints to uninteresting levels. Given the shape of the actual mass function, the properties of the distribution may be internally monitored by the shape of the observable mass function. As a proof of principle, for a simple mass-independent Gaussian distribution the scatter may be self-calibrated to allow a measurement of the dark energy equation of state of σ(w) ~ 0.1. External constraints on the mass variance of the distribution that are more accurate than δσ²_lnM < 0.01 at z ~ 1 can further improve constraints by up to a factor of 2. More generally, cluster counts and their sample variance measured as a function of the observable provide internal consistency checks on the assumed form of the observable-mass distribution that will protect against misinterpretation of the dark energy constraints.
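The sensitivity to scatter comes from an Eddington-type bias: because the mass function falls steeply, symmetric scatter in the observable up-scatters many more low-mass objects above a selection threshold than it down-scatters rare high-mass ones. A toy Monte Carlo check (the slope, threshold, and boost here are illustrative, not the SPT forecast):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy steeply falling mass function: ln M drawn so that
# n(lnM) ~ exp(-4 lnM), mimicking the rare-cluster tail
lnM = rng.exponential(scale=0.25, size=200_000)
threshold = 1.0                         # selection cut in the observable

true_counts = int(np.sum(lnM > threshold))

# Observable-mass proxy with mass-independent Gaussian scatter
# of 0.25 in ln M (i.e. ~25% in mass)
obs = lnM + rng.normal(0.0, 0.25, size=lnM.size)
obs_counts = int(np.sum(obs > threshold))

boost = obs_counts / true_counts        # Eddington-bias boost factor
```

For an exponential tail exp(-λ·lnM) convolved with Gaussian scatter σ, the boost is exp(λ²σ²/2); here λ = 4 and σ = 0.25 give about 1.65, showing how tens-of-percent count changes follow from modest scatter, exactly the effect the self-calibration in this paper is designed to absorb.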

  5. A new stochastic algorithm for inversion of dust aerosol size distribution

    NASA Astrophysics Data System (ADS)

    Wang, Li; Li, Feng; Yang, Ma-ying

    2015-08-01

    Dust aerosol size distribution is an important source of information about atmospheric aerosols, and it can be determined from multiwavelength extinction measurements. This paper describes a stochastic inverse technique based on the artificial bee colony (ABC) algorithm to invert the dust aerosol size distribution by the light extinction method. The direct problems for the size distributions of water drops and dust particles, which are the main elements of atmospheric aerosols, are solved by Mie theory and the Lambert-Beer law in the multispectral region. The parameters of three widely used functions, i.e. the log-normal distribution (L-N), the Junge distribution (J-J), and the normal distribution (N-N), which provide the most useful representations of aerosol size distributions, are then inverted by the ABC algorithm in the dependent model. Numerical results show that the ABC algorithm can successfully recover the aerosol size distribution with high feasibility and reliability even in the presence of random noise.
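The inversion pattern described here, recovering distribution parameters by stochastically searching for the best fit to noisy measurements, can be sketched with a log-normal forward model. This is a hedged toy: a plain random search stands in for the artificial bee colony, the radii grid and noise level are invented, and the true forward model in the paper is Mie extinction, not the distribution itself.

```python
import numpy as np

def lognormal_pdf(x, mu, sigma):
    """Log-normal size distribution used as the toy forward model."""
    return np.exp(-(np.log(x) - mu) ** 2 / (2.0 * sigma ** 2)) / (
        x * sigma * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(3)
r = np.linspace(0.05, 5.0, 200)          # particle radii (hypothetical units)
true = lognormal_pdf(r, mu=0.2, sigma=0.5)
measured = true + rng.normal(0.0, 0.01, r.size)   # synthetic noisy data

# Stochastic search over (mu, sigma), minimizing the sum of squared
# residuals; a bee-colony optimizer would refine good candidates
# instead of sampling blindly
best, best_err = None, np.inf
for _ in range(20000):
    mu = rng.uniform(-1.0, 1.0)
    sigma = rng.uniform(0.1, 1.0)
    err = np.sum((lognormal_pdf(r, mu, sigma) - measured) ** 2)
    if err < best_err:
        best, best_err = (mu, sigma), err

mu_hat, sigma_hat = best
```

Even this naive search recovers the underlying parameters from noisy data, illustrating why population-based optimizers like ABC are attractive for such well-conditioned but nonlinear inverse problems.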

  6. Distributed cluster testing using new virtualized framework for XRootD

    NASA Astrophysics Data System (ADS)

    Salmon, Justin L.; Janyst, Lukasz

    2014-06-01

    The Extended ROOT Daemon (XRootD) is a distributed, scalable system for low-latency clustered data access. XRootD is mature and widely used in HEP, both standalone and as a core server framework for the EOS system at CERN, and hence requires extensive testing to ensure general stability. However, distributed testing poses many difficulties, such as cluster set-up, synchronization, orchestration, inter-cluster communication and controlled failure handling. A three-layer master/hypervisor/slave model is presented to ameliorate these difficulties by utilizing libvirt and QEMU/KVM virtualization technologies to automate the spawning of configurable virtual clusters and orchestrate multi-stage test suites. The framework also incorporates a user-friendly web interface for scheduling and monitoring tests. The prototype has been used successfully to build new test suites for XRootD and EOS with existing unit test integration. Future plans include generalizing the framework sufficiently to encourage its use with potentially any distributed system.

  7. Sinapinic acid clusters distribution from monomer to mega Dalton's region in MALDI process

    NASA Astrophysics Data System (ADS)

    Lai, Szu-Hsueh; Chang, Kuang-Hua; Lin, Jung-Lee; Wu, Chia-Lin; Chen, Chung-Hsuan

    2013-03-01

    In this work, we report the first complete sinapinic acid cluster distribution from the monomer to the megadalton region by matrix-assisted laser desorption/ionization (MALDI). A decrease of eight orders of magnitude in intensity was observed from the monomer ion to the 10 000-mer ion. The results fit a model of laser-ablation-induced desorption with bimodal power-law dependence. In addition, the detailed measurements of the populations of clusters of different sizes provide insight into different models of the MALDI mechanism.

  8. Imaging active faulting in a region of distributed deformation from the joint clustering of focal mechanisms and hypocentres: Application to the Azores-western Mediterranean region

    NASA Astrophysics Data System (ADS)

    Custódio, Susana; Lima, Vânia; Vales, Dina; Cesca, Simone; Carrilho, Fernando

    2016-04-01

    The matching between linear trends of hypocentres and fault planes indicated by focal mechanisms (FMs) is frequently used to infer the location and geometry of active faults. This practice works well in regions of fast lithospheric deformation, where earthquake patterns are clear and major structures accommodate the bulk of deformation, but typically fails in regions of slow and distributed deformation. We present a new joint FM and hypocentre cluster algorithm that is able to detect systematically the consistency between hypocentre lineations and FMs, even in regions of distributed deformation. We apply the method to the Azores-western Mediterranean region, with particular emphasis on western Iberia. The analysis relies on a compilation of hypocentres and FMs taken from regional and global earthquake catalogues, academic theses and technical reports, complemented by new FMs for western Iberia. The joint clustering algorithm images both well-known and new seismo-tectonic features. The Azores triple junction is characterised by FMs with vertical pressure (P) axes, in good agreement with the divergent setting, and the Iberian domain is characterised by NW-SE oriented P axes, indicating a response of the lithosphere to the ongoing oblique convergence between Nubia and Eurasia. Several earthquakes remain unclustered in the western Mediterranean domain, which may indicate a response to local stresses. The major regions of consistent faulting that we identify are the mid-Atlantic ridge, the Terceira rift, the Trans-Alboran shear zone and the north coast of Algeria. In addition, other smaller earthquake clusters present a good match between epicentre lineations and FM fault planes. These clusters may signal single active faults or wide zones of distributed but consistent faulting. Mainland Portugal is dominated by strike-slip earthquakes with fault planes coincident with the predominant NNE-SSW and WNW-ESE oriented earthquake lineations. Clusters offshore SW Iberia are

  9. Reconstructing merger timelines using star cluster age distributions: the case of MCG+08-11-002

    NASA Astrophysics Data System (ADS)

    Davies, Rebecca L.; Medling, Anne M.; U, Vivian; Max, Claire E.; Sanders, David; Kewley, Lisa J.

    2016-05-01

    We present near-infrared imaging and integral field spectroscopy of the centre of the dusty luminous infrared galaxy merger MCG+08-11-002, taken using the Near InfraRed Camera 2 (NIRC2) and the OH-Suppressing InfraRed Imaging Spectrograph (OSIRIS) on Keck II. We achieve a spatial resolution of ˜25 pc in the K band, allowing us to resolve 41 star clusters in the NIRC2 images. We calculate the ages of 22/25 star clusters within the OSIRIS field using the equivalent widths of the CO 2.3 μm absorption feature and the Br γ nebular emission line. The star cluster age distribution has a clear peak at ages ≲ 20 Myr, indicative of current starburst activity associated with the final coalescence of the progenitor galaxies. There is a possible second peak at ˜65 Myr which may be a product of the previous close passage of the galaxy nuclei. We fit single and double starburst models to the star cluster age distribution and use Monte Carlo sampling combined with two-sided Kolmogorov-Smirnov tests to calculate the probability that the observed data are drawn from each of the best-fitting distributions. There is a >90 per cent chance that the data are drawn from either a single or double starburst star formation history, but stochastic sampling prevents us from distinguishing between the two scenarios. Our analysis of MCG+08-11-002 indicates that star cluster age distributions provide valuable insights into the timelines of galaxy interactions and may therefore play an important role in the future development of precise merger stage classification systems.
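
    The model comparison described above hinges on a two-sided Kolmogorov-Smirnov test between observed cluster ages and ages drawn from a candidate star formation history. As a minimal sketch, the exponential "single burst" stand-in and all age numbers below are hypothetical, not the paper's fitted distributions, SciPy's two-sample KS test can be applied like this:

```python
import numpy as np
from scipy import stats

# Hypothetical stand-in: compare 22 "observed" cluster ages against a large
# Monte Carlo sample drawn from a candidate model age distribution (Myr).
rng = np.random.default_rng(0)
model_ages = rng.exponential(scale=20.0, size=10000)   # model age samples
observed_ages = rng.exponential(scale=20.0, size=22)   # the dated clusters

stat, p_value = stats.ks_2samp(observed_ages, model_ages)
# A large p-value means the observed ages are consistent with the model.
```

Repeating the draw of the small observed-size sample many times is the Monte Carlo element: with only ~22 clusters, stochastic sampling broadens the distribution of KS statistics enough that single- and double-burst histories can become indistinguishable, as the abstract reports.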

  10. Kinetic energy distribution of multiply charged ions in Coulomb explosion of Xe clusters.

    PubMed

    Heidenreich, Andreas; Jortner, Joshua

    2011-02-21

    We report on the calculations of kinetic energy distribution (KED) functions of multiply charged, high-energy ions in Coulomb explosion (CE) of an assembly of elemental Xe_n clusters (average size <n> = 200-2171) driven by ultra-intense, near-infrared, Gaussian laser fields (peak intensities 10^15 - 4 × 10^16 W cm^-2, pulse lengths 65-230 fs). In this cluster size and pulse parameter domain, outer ionization is incomplete/vertical, incomplete/nonvertical, or complete/nonvertical, with CE occurring in the presence of nanoplasma electrons. The KEDs were obtained from double averaging of single-trajectory molecular dynamics simulation ion kinetic energies: over a log-normal cluster size distribution and over the laser intensity distribution of a spatial Gaussian beam, which constitutes either a two-dimensional (2D) or a three-dimensional (3D) profile, with the 3D profile (when the cluster beam radius is larger than the Rayleigh length) usually being experimentally realized. The general features of the doubly averaged KEDs manifest the smearing out of the structure corresponding to the distribution of ion charges, a marked increase of the KEDs at very low energies due to the contribution from the persistent nanoplasma, a distortion of the KEDs and of the average energies toward lower energy values, and the appearance of long low-intensity high-energy tails caused by the admixture of contributions from large clusters by size averaging. The doubly averaged simulation results account reasonably well (within 30%) for the experimental data for the cluster-size dependence of the CE energetics and for its dependence on the laser pulse parameters, as well as for the anisotropy in the angular distribution of the energies of the Xe^q+ ions. Possible applications of this computational study include a control of the ion kinetic energies by the choice of the laser intensity profile (2D/3D) in the laser-cluster interaction volume.

  11. Spectrum parameter estimation in Brillouin scattering distributed temperature sensor based on cuckoo search algorithm combined with the improved differential evolution algorithm

    NASA Astrophysics Data System (ADS)

    Zhang, Yanjun; Yu, Chunjuan; Fu, Xinghu; Liu, Wenzhe; Bi, Weihong

    2015-12-01

    In the distributed optical fiber sensing system based on Brillouin scattering, strain and temperature are the main measured parameters, which can be obtained by analyzing the Brillouin center frequency shift. A novel algorithm combining the cuckoo search (CS) algorithm with the improved differential evolution (IDE) algorithm is proposed for Brillouin scattering parameter estimation. The CS-IDE algorithm is compared with the CS algorithm and analyzed in different situations. The results show that both the CS and CS-IDE algorithms have very good convergence. The analysis reveals that the CS-IDE algorithm can extract the scattering spectrum features with different linear weight ratios, linewidth combinations and SNRs. Moreover, a BOTDR temperature measuring system based on electron optical frequency shift is set up to verify the effectiveness of the CS-IDE algorithm. Experimental results show that there is a good linear relationship between the Brillouin center frequency shift and temperature changes.
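
    The CS-IDE hybrid itself is not publicly available. As a hedged stand-in, the estimation task it solves can be sketched with SciPy's stock differential evolution, recovering the center frequency of a synthetic Lorentzian Brillouin gain spectrum; all frequencies, linewidths and noise levels below are hypothetical illustration values, not the paper's:

```python
import numpy as np
from scipy.optimize import differential_evolution

def lorentzian(f, f_b, width, gain):
    """Idealized Brillouin gain spectrum (Lorentzian profile)."""
    return gain / (1.0 + ((f - f_b) / (width / 2.0)) ** 2)

# Synthetic noisy spectrum with a known center frequency shift.
rng = np.random.default_rng(0)
freq = np.linspace(10.6e9, 11.0e9, 400)    # scanned frequencies [Hz]
true_params = (10.8e9, 40e6, 1.0)          # center, linewidth, peak gain
data = lorentzian(freq, *true_params) + 0.02 * rng.standard_normal(freq.size)

def cost(p):
    # Least-squares misfit between the model spectrum and the measurement.
    return np.sum((lorentzian(freq, *p) - data) ** 2)

bounds = [(10.6e9, 11.0e9), (10e6, 100e6), (0.5, 1.5)]
result = differential_evolution(cost, bounds, seed=1, tol=1e-8)
f_b_hat = result.x[0]   # estimated Brillouin center frequency shift
```

A population-based search like this avoids getting trapped by local misfit minima, which is the motivation the abstract gives for hybridizing CS with DE.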

  12. YOUNG STELLAR CLUSTERS WITH A SCHUSTER MASS DISTRIBUTION. I. STATIONARY WINDS

    SciTech Connect

    Palous, Jan; Wuensch, Richard; Hueyotl-Zahuantitla, Filiberto; Martinez-Gonzalez, Sergio; Silich, Sergiy; Tenorio-Tagle, Guillermo

    2013-08-01

    Hydrodynamic models for spherically symmetric winds driven by young stellar clusters with a generalized Schuster stellar density profile are explored. For this we use both semi-analytic models and one-dimensional numerical simulations. We determine the properties of quasi-adiabatic and radiative stationary winds and define the radius at which the flow turns from subsonic to supersonic for all stellar density distributions. Strongly radiative winds significantly diminish their terminal speed and thus their mechanical luminosity is strongly reduced. This also reduces their potential negative feedback into their host galaxy interstellar medium. The critical luminosity above which radiative cooling becomes dominant within the clusters, leading to thermal instabilities which make the winds non-stationary, is determined, and its dependence on the star cluster density profile, core radius, and half-mass radius is discussed.

  13. Model-Based Clustering of Regression Time Series Data via APECM -- An AECM Algorithm Sung to an Even Faster Beat

    SciTech Connect

    Chen, Wei-Chen; Maitra, Ranjan

    2011-01-01

    We propose a model-based approach for clustering time series regression data in an unsupervised machine learning framework to identify groups under the assumption that each mixture component follows a Gaussian autoregressive regression model of order p. Given the number of groups, the traditional maximum likelihood approach of estimating the parameters using the expectation-maximization (EM) algorithm can be employed, although it is computationally demanding. The somewhat faster tune to the EM folk song provided by the Alternating Expectation Conditional Maximization (AECM) algorithm can alleviate the problem to some extent. In this article, we develop an alternative partial expectation conditional maximization (APECM) algorithm that uses an additional data augmentation storage step to efficiently implement AECM for finite mixture models. Results from our simulation experiments show improved performance in terms of both the number of iterations and computation time. The methodology is applied to the problem of clustering mutual funds data on the basis of their average annual per cent returns and in the presence of economic indicators.
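
    The APECM variant is not part of standard libraries, but the underlying model-based clustering step, fitting a finite Gaussian mixture by EM and assigning each series to its most likely component, can be sketched with scikit-learn. The two-feature fund summaries below are fabricated stand-ins, not the paper's autoregressive regression model:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Two synthetic groups of funds summarized by (mean annual return, volatility).
rng = np.random.default_rng(0)
group_a = rng.normal([0.02, 0.10], 0.01, size=(50, 2))
group_b = rng.normal([0.08, 0.25], 0.01, size=(50, 2))
X = np.vstack([group_a, group_b])

# EM fit of a two-component Gaussian mixture, then hard cluster assignment.
gm = GaussianMixture(n_components=2, random_state=0).fit(X)
labels = gm.predict(X)
```

The faster EM variants in the paper (AECM, APECM) change how the M-step is organized, not the clustering output sketched here.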

  14. Cluster size distributions: signatures of self-organization in spatial ecologies.

    PubMed Central

    Pascual, Mercedes; Roy, Manojit; Guichard, Frédéric; Flierl, Glenn

    2002-01-01

    Three different lattice-based models for antagonistic ecological interactions, both nonlinear and stochastic, exhibit similar power-law scalings in the geometry of clusters. Specifically, cluster size distributions and perimeter-area curves follow power-law scalings. In the coexistence regime, these patterns are robust: their exponents, and therefore the associated Korcak exponent characterizing patchiness, depend only weakly on the parameters of the systems. These distributions, in particular the values of their exponents, are close to those reported in the literature for systems associated with self-organized criticality (SOC), such as forest-fire models; however, the typical assumptions of SOC need not apply. Our results demonstrate that power-law scalings in cluster size distributions are not restricted to systems in which a clear separation of time-scales holds. The patterns are characteristic of processes of growth and inhibition in space, such as those in predator-prey and disturbance-recovery dynamics. Inversions of these patterns, that is, scalings with a positive slope as described for plankton distributions, would therefore require spatial forcing by environmental variability. PMID:12079527
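
    Measuring a cluster size distribution on a lattice, the quantity these models are compared on, can be sketched generically with SciPy's connected-component labelling. The random occupancy field below is a stand-in, not one of the paper's ecological models:

```python
import numpy as np
from scipy import ndimage

# Random presence/absence field as a stand-in for a lattice ecology snapshot.
rng = np.random.default_rng(0)
lattice = rng.random((256, 256)) < 0.4

# Label 4-connected clusters of occupied sites and tally their sizes.
labels, n_clusters = ndimage.label(lattice)
sizes = np.bincount(labels.ravel())[1:]   # drop the background (label 0) count

# Empirical cluster size distribution: number of clusters at each size.
size_hist = np.bincount(sizes)
```

A power-law scaling would show up as an approximately straight line when `size_hist` is plotted against size on log-log axes.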

  15. An efficient central DOA tracking algorithm for multiple incoherently distributed sources

    NASA Astrophysics Data System (ADS)

    Hassen, Sonia Ben; Samet, Abdelaziz

    2015-12-01

    In this paper, we develop a new tracking method for the direction of arrival (DOA) parameters assuming multiple incoherently distributed (ID) sources. The new approach is based on a simple covariance fitting optimization technique exploiting the central and noncentral moments of the source angular power densities to estimate the central DOAs. The current estimates are treated as measurements provided to the Kalman filter that models the dynamic property of directional changes for the moving sources. Then, the covariance-fitting-based algorithm and the Kalman filtering theory are combined to formulate an adaptive tracking algorithm. Our algorithm is compared to the fast approximated power iteration-total least square-estimation of signal parameters via rotational invariance technique (FAPI-TLS-ESPRIT) algorithm using the TLS-ESPRIT method and the subspace updating via the FAPI algorithm. It will be shown that the proposed algorithm offers excellent DOA tracking performance and outperforms the FAPI-TLS-ESPRIT method, especially at low signal-to-noise ratio (SNR) values. Moreover, the performances of the two methods improve as the SNR values increase, and this improvement is more prominent with the FAPI-TLS-ESPRIT method. However, their performances degrade when the number of sources increases. It will also be shown that our method depends on the form of the angular distribution function when tracking the central DOAs. Finally, it will be shown that the more widely the sources are spaced, the more accurately the proposed method can track the DOAs.

  16. Wide Distribution of O157-Antigen Biosynthesis Gene Clusters in Escherichia coli

    PubMed Central

    Seto, Kazuko; Ooka, Tadasuke; Ogura, Yoshitoshi; Hayashi, Tetsuya; Osawa, Kayo; Osawa, Ro

    2011-01-01

    Most Escherichia coli O157-serogroup strains are classified as enterohemorrhagic E. coli (EHEC), which is known as an important food-borne pathogen for humans. They usually produce Shiga toxin (Stx) 1 and/or Stx2, and express the H7 flagellar antigen (or are nonmotile). However, O157 strains that do not produce Stxs and express H antigens different from H7 are sometimes isolated from clinical and other sources. Multilocus sequence analysis revealed that the 21 O157:non-H7 strains tested in this study belong to multiple evolutionary lineages different from that of EHEC O157:H7 strains, suggesting a wide distribution of the gene set encoding the O157-antigen biosynthesis in multiple lineages. To gain insight into the gene organization and the sequence similarity of the O157-antigen biosynthesis gene clusters, we conducted genomic comparisons of the chromosomal regions (about 59 kb in each strain) covering the O-antigen gene cluster and its flanking regions between six O157:H7/non-H7 strains. Gene organization of the O157-antigen gene cluster was identical among O157:H7/non-H7 strains, but was divided into two distinct types at the nucleotide sequence level. Interestingly, the distribution of the two types did not clearly follow the evolutionary lineages of the strains, suggesting that horizontal gene transfer of both types of O157-antigen gene clusters has occurred independently among E. coli strains. Additionally, detailed sequence comparison revealed that some positions of the repetitive extragenic palindromic (REP) sequences in the regions flanking the O-antigen gene clusters were coincident with possible recombination points. From these results, we conclude that the horizontal transfer of the O157-antigen gene clusters induced the emergence of multiple O157 lineages within E. coli and speculate that REP sequences may be one of the driving forces for the exchange and evolution of O-antigen loci. PMID:21876740

  17. F Turnoff Distribution in the Galactic Halo Using Globular Clusters as Proxies

    NASA Astrophysics Data System (ADS)

    Newby, Matthew; Newberg, Heidi Jo; Simones, Jacob; Cole, Nathan; Monaco, Matthew

    2011-12-01

    F turnoff stars are important tools for studying Galactic halo substructure because they are plentiful, luminous, and can be easily selected by their photometric colors from large surveys such as the Sloan Digital Sky Survey (SDSS). We describe the absolute magnitude distribution of color-selected F turnoff stars, as measured from SDSS data, for 11 globular clusters in the Milky Way halo. We find that the Mg distribution of turnoff stars is intrinsically the same for all clusters studied, and is well fit by two half-Gaussian functions, centered at μ = 4.18, with a bright-side σ = 0.36, and with a faint-side σ = 0.76. However, the color errors and detection efficiencies cause the observed σ of the faint-side Gaussian to change with magnitude due to contamination from redder main-sequence stars (40% at 21st magnitude). We present a function that will correct for this magnitude-dependent change in selected stellar populations, when calculating stellar density from color-selected turnoff stars. We also present a consistent set of distances, ages, and metallicities for 11 clusters in the SDSS Data Release 7. We calculate a linear correction function to Padova isochrones so that they are consistent with SDSS globular cluster data from previous papers. We show that our cluster population falls along the Milky Way age-metallicity relationship (AMR), and further find that isochrones for stellar populations on the AMR have very similar turnoffs; increasing metallicity and decreasing age conspire to produce similar turnoff magnitudes and colors for all old clusters that lie on the AMR.
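
    The two half-Gaussian turnoff profile quoted above is simple to write down; the sketch below uses the abstract's fitted values, with a normalization constant added here so the joined halves form a proper density:

```python
import math

# Parameters from the abstract's double half-Gaussian fit to the Mg distribution.
MU = 4.18          # peak absolute magnitude
SIG_BRIGHT = 0.36  # sigma on the bright (mg < MU) side
SIG_FAINT = 0.76   # sigma on the faint side

# Normalization so the two halves together integrate to 1.
NORM = 2.0 / (math.sqrt(2.0 * math.pi) * (SIG_BRIGHT + SIG_FAINT))

def turnoff_pdf(mg):
    """Two half-Gaussian density: both halves share the peak MU but differ in width."""
    sigma = SIG_BRIGHT if mg < MU else SIG_FAINT
    return NORM * math.exp(-0.5 * ((mg - MU) / sigma) ** 2)
```

Because the halves share the same peak value, the density is continuous at MU even though its slope changes there.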

  19. Globally convergent algorithms for estimating generalized gamma distributions in fast signal and image processing.

    PubMed

    Song, Kai-Sheng

    2008-08-01

    Many applications in real-time signal, image, and video processing require automatic algorithms for rapid characterizations of signals and images through fast estimation of their underlying statistical distributions. We present fast and globally convergent algorithms for estimating the three-parameter generalized gamma distribution (GΓD). The proposed method is based on novel scale-independent shape estimation (SISE) equations. We show that the SISE equations have a unique global root in their semi-infinite domains and that the probability that the sample SISE equations have a unique global root tends to one. The consistency of the global root, its scale, and index shape estimators is obtained. Furthermore, we establish that, with probability tending to one, Newton-Raphson (NR) algorithms for solving the sample SISE equations converge globally to the unique root from any initial value in its given domain. In contrast to existing methods, another remarkable novelty is that the sample SISE equations are completely independent of gamma and polygamma functions and involve only elementary mathematical operations, making the algorithms well suited for real-time hardware and software implementations. The SISE estimators also allow the maximum likelihood (ML) ratio procedure to be carried out for testing the generalized Gaussian distribution (GGD) versus the GΓD. Finally, the fast global convergence and accuracy of our algorithms for finite samples are demonstrated by both simulation studies and real image analysis.
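
    The SISE equations themselves are not reproduced in the abstract, but the Newton-Raphson iteration used to solve them is standard. A minimal sketch on a hypothetical stand-in equation with a unique root in its domain (not the actual SISE equations):

```python
def newton(g, gprime, x0, tol=1e-12, max_iter=100):
    """Plain Newton-Raphson iteration: repeatedly step by -g(x)/g'(x)."""
    x = x0
    for _ in range(max_iter):
        step = g(x) / gprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Stand-in equation x^3 - 2 = 0, which has a unique positive root.
root = newton(lambda x: x ** 3 - 2.0, lambda x: 3.0 * x ** 2, x0=1.0)
```

The paper's contribution is precisely that its equations guarantee this iteration converges from any starting value in the domain, a property that does not hold for Newton's method in general.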

  20. Determination of algorithm parallelism in NP-complete problems for distributed architectures. Master's thesis

    SciTech Connect

    Beard, R.A.

    1990-03-01

    The purpose of this thesis is to explore the methods used to parallelize NP-complete problems and the degree of improvement that can be realized using a distributed parallel processor to solve these combinatoric problems. Common NP-complete problem characteristics such as a priori reductions, use of partial-state information, and inhomogeneous searches are identified and studied. The set covering problem (SCP) is implemented for this research because many applications such as information retrieval, task scheduling, and VLSI expression simplification can be structured as SCP problems. In addition, its generic NP-complete characteristics are well documented and a parallel implementation has not been reported. Parallel programming design techniques involve decomposing the problem and developing the parallel algorithms. The major components of a parallel solution are developed in a four-phase process. First, a meta-level design is accomplished using an appropriate design language such as UNITY. Then, the UNITY design is transformed into an algorithm and implementation specific to a distributed architecture. Finally, a complexity analysis of the algorithm is performed. The a priori reductions are divide-and-conquer algorithms, whereas the search for the optimal set cover is accomplished with a branch-and-bound algorithm. The search utilizes a global best cost maintained at a central location for distribution to all processors. Three methods of load balancing are implemented and studied: coarse grain with static allocation of the search space, fine grain with dynamic allocation, and dynamic load balancing.
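
    The best-cost pruning described above can be sketched serially; this is an illustrative minimum set cover solver with an incumbent-based bound, not the thesis's distributed implementation:

```python
def set_cover_bnb(universe, subsets):
    """Branch-and-bound minimum set cover: prune any branch that cannot beat
    the best cost found so far (serial analogue of the global best cost)."""
    best = {"cover": None, "cost": float("inf")}

    def branch(covered, chosen, idx):
        if covered == universe:                 # feasible cover found
            if len(chosen) < best["cost"]:
                best["cover"], best["cost"] = list(chosen), len(chosen)
            return
        if idx == len(subsets) or len(chosen) + 1 >= best["cost"]:
            return                              # bound: cannot improve on incumbent
        branch(covered | subsets[idx], chosen + [idx], idx + 1)  # take subset idx
        branch(covered, chosen, idx + 1)                         # skip subset idx

    branch(frozenset(), [], 0)
    return best["cover"]

# Tiny instance: the unique optimum uses subsets 0 and 2.
universe = frozenset(range(5))
subsets = [frozenset({0, 1, 2}), frozenset({1, 3}),
           frozenset({3, 4}), frozenset({0, 4})]
cover = set_cover_bnb(universe, subsets)  # -> [0, 2]
```

In the distributed version described by the thesis, each processor would run this search on its share of the space while the incumbent cost is held at a central location and broadcast to all processors.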

  1. A data distributed parallel algorithm for ray-traced volume rendering

    NASA Technical Reports Server (NTRS)

    Ma, Kwan-Liu; Painter, James S.; Hansen, Charles D.; Krogh, Michael F.

    1993-01-01

    This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit, and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on both the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
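
    The final gather step, compositing subimages in a predetermined depth order, can be sketched with the standard "over" operator. The two flat layers below are toy data, and the back-to-front ordering is assumed given, as in the text:

```python
import numpy as np

def composite_over(subimages):
    """Composite (rgb, alpha) subimages back to front with the 'over' operator;
    each successive layer is assumed closer to the viewer."""
    h, w = subimages[0][0].shape[:2]
    out_rgb = np.zeros((h, w, 3))
    out_a = np.zeros((h, w, 1))
    for rgb, a in subimages:                  # back to front
        out_rgb = rgb * a + out_rgb * (1.0 - a)
        out_a = a + out_a * (1.0 - a)
    return out_rgb, out_a

# Opaque red background layer, then a half-transparent blue layer in front.
red = (np.broadcast_to([1.0, 0.0, 0.0], (2, 2, 3)), np.ones((2, 2, 1)))
blue = (np.broadcast_to([0.0, 0.0, 1.0], (2, 2, 3)), np.full((2, 2, 1), 0.5))
rgb, alpha = composite_over([red, blue])      # each pixel -> (0.5, 0.0, 0.5)
```

Because the operator is associative, subimages can be combined pairwise in parallel, which is what makes the compositing step itself parallelizable.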

  2. EM algorithm in estimating the 2- and 3-parameter Burr Type III distributions

    NASA Astrophysics Data System (ADS)

    Ismail, Nor Hidayah Binti; Khalid, Zarina Binti Mohd

    2014-07-01

    The Burr Type III distribution has been applied in the study of income, wage and wealth. It is suitable for fitting lifetime data since it has a flexible shape and controllable scale parameters. The popularity of the Burr Type III distribution has increased because it includes the characteristics of other distributions such as the logistic and exponential. The Burr Type III distribution has two forms: a two-parameter distribution with two shape parameters, and a three-parameter distribution with a scale parameter and two shape parameters. The expectation-maximization (EM) algorithm is selected in this paper to estimate the two- and three-parameter Burr Type III distributions. Complete and censored data are simulated based on the derivation of the pdf and cdf of the Burr Type III distributions in parametric form. Then, the EM estimates are compared with estimates from the maximum likelihood estimation (MLE) approach through mean square error, and the approach whose estimates better approximate the true parameters is determined. The results show that the EM estimates perform better than the MLE estimates for the two- and three-parameter Burr Type III distributions for both complete and censored data.
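
    For a concrete point of reference, plain maximum likelihood fitting of a Burr Type III sample is available in SciPy, whose `burr` distribution is the Type III family. This sketches only the MLE side of the comparison on complete data, not the paper's EM derivation, and the shape values are hypothetical:

```python
from scipy import stats

# Draw a synthetic complete sample from a Burr Type III law with known shapes.
c_true, d_true = 3.0, 1.5                       # the two shape parameters
sample = stats.burr.rvs(c_true, d_true, size=5000, random_state=0)

# MLE fit: start the optimizer near plausible shapes and pin the location at 0,
# so the two shapes and the scale are estimated (the three-parameter form).
c_hat, d_hat, loc, scale = stats.burr.fit(sample, 2.0, 2.0, floc=0)
```

The paper's comparison repeats this kind of experiment for EM and MLE on both complete and censored samples and scores each by mean square error against the known true parameters.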

  3. A Sub-Clustering Algorithm Based on Spatial Data Correlation for Energy Conservation in Wireless Sensor Networks

    PubMed Central

    Tsai, Ming-Hui; Huang, Yueh-Min

    2014-01-01

    Wireless sensor networks (WSNs) have emerged as a promising solution for various applications due to their low cost and easy deployment. Typically, their limited power capability, i.e., battery power, makes extending the network lifetime a challenge for WSNs. Many hierarchical protocols in the literature show better energy efficiency. Besides, data reduction based on the correlation of sensed readings can efficiently reduce the number of required transmissions. Therefore, we use a sub-clustering procedure based on spatial data correlation to further separate the hierarchical (clustered) architecture of a WSN. The proposed algorithm (2TC-cor) is composed of two procedures: the prediction model construction procedure and the sub-clustering procedure. Energy conservation benefits from the reduced number of transmissions, which depends on the prediction model. The energy can be further conserved by the representative mechanism of sub-clustering. Simulation results show that 2TC-cor can effectively conserve energy while monitoring the environment accurately to within an acceptable level. PMID:25412220
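
    The spatial-correlation sub-clustering idea, grouping sensors whose readings track a representative member, can be sketched as follows. The threshold, the synthetic trends, and the greedy grouping rule are illustrative assumptions, not 2TC-cor's actual procedure:

```python
import numpy as np

# Hypothetical readings: rows are sensors, columns are time steps. Sensors 0-2
# follow one environmental trend and sensors 3-5 another, plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0, 6, 100)
readings = np.vstack(
    [np.sin(t) + 0.05 * rng.standard_normal(100) for _ in range(3)]
    + [np.cos(t) + 0.05 * rng.standard_normal(100) for _ in range(3)]
)

# Greedy sub-clustering: a sensor joins an existing sub-cluster if its readings
# correlate strongly with that sub-cluster's first (representative) member.
THRESHOLD = 0.9
corr = np.corrcoef(readings)
subclusters = []
for i in range(readings.shape[0]):
    for sc in subclusters:
        if corr[i, sc[0]] >= THRESHOLD:
            sc.append(i)
            break
    else:
        subclusters.append([i])   # start a new sub-cluster with i as representative
```

Once grouped, only the representative of each sub-cluster needs to transmit while its readings predict the others within tolerance, which is where the energy saving comes from.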

  4. On the Metallicity Distribution of the Peculiar Globular Cluster M22

    NASA Astrophysics Data System (ADS)

    Lee, Jae-Woo

    2016-10-01

    In our previous study, we showed that the peculiar globular cluster (GC) M22 contains two distinct stellar populations, namely the Ca-w and Ca-s groups, which have different physical properties, chemical compositions, spatial distributions, and kinematics. We proposed that M22 was most likely formed via a merger of two GCs with heterogeneous metallicities in a dwarf galaxy environment and was later accreted by our Galaxy. In their recent study, Mucciarelli et al. claimed that M22 is a normal monometallic globular cluster without any perceptible metallicity spread between the two groups of stars, which challenges our results and those of others. We devise new strategies for the local thermodynamic equilibrium abundance analysis of red giant branch stars in GCs and show that there exists a spread in the iron abundance distribution in M22.

  5. A Peer-to-Peer Distributed Selection Algorithm for the Internet.

    ERIC Educational Resources Information Center

    Loo, Alfred; Choi, Y. K.

    2002-01-01

    In peer-to-peer systems, the lack of dedicated links between constituent computers presents a major challenge. Discusses the problems of distributed database operation with reference to an example. Presents two statistical selection algorithms-one for the intranet with broadcast/multicast facilities, the other for Internet without…

  6. A comparative study of DIGNET, average, complete, single hierarchical and k-means clustering algorithms in 2D face image recognition

    NASA Astrophysics Data System (ADS)

    Thanos, Konstantinos-Georgios; Thomopoulos, Stelios C. A.

    2014-06-01

    The study in this paper is part of more general research on discovering facial sub-clusters in face databases of different ethnicities. These new sub-clusters, along with other metadata (such as race, sex, etc.), lead to a vector for each face in the database where each vector component represents the likelihood that the given face belongs to each cluster. This vector is then used as a feature vector in a human identification and tracking system based on face and other biometrics. The first stage in this system involves a clustering method which evaluates and compares the clustering results of five different clustering algorithms (average, complete and single hierarchical, k-means and DIGNET), and selects the best strategy for each data collection. In this paper we present the comparative performance of the clustering results of DIGNET and four clustering algorithms (average, complete and single hierarchical, and k-means) on fabricated 2D and 3D samples, and on actual face images from various databases, using four standard metrics. These metrics are the silhouette figure, the mean silhouette coefficient, the Hubert test Γ coefficient, and the classification accuracy for each clustering result. The results showed that, in general, DIGNET gives more trustworthy results than the other algorithms when the metric values are above a specific acceptance threshold. However, when the metric values fall below the acceptance threshold but not too far below it (values far below the threshold correspond to ambiguous or false results), the clustering results must be verified by the other algorithms.
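
    The metric-based comparison of clustering strategies can be sketched with scikit-learn. DIGNET is not publicly available, so only the four standard algorithms are compared here, on fabricated blob data rather than face features, using the mean silhouette coefficient:

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for face feature vectors: three well-separated blobs.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

candidates = {
    "k-means": KMeans(n_clusters=3, n_init=10, random_state=0),
    "average": AgglomerativeClustering(n_clusters=3, linkage="average"),
    "complete": AgglomerativeClustering(n_clusters=3, linkage="complete"),
    "single": AgglomerativeClustering(n_clusters=3, linkage="single"),
}

# Score each strategy by the mean silhouette coefficient of its labelling.
scores = {name: silhouette_score(X, model.fit_predict(X))
          for name, model in candidates.items()}
best = max(scores, key=scores.get)
```

In the paper's pipeline, a score like `scores[best]` would then be checked against the acceptance threshold before the winning strategy is trusted for a given data collection.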

  7. Optimal clustering of MGs based on droop controller for improving reliability using a hybrid of harmony search and genetic algorithms.

    PubMed

    Abedini, Mohammad; Moradi, Mohammad H; Hosseinian, S M

    2016-03-01

    This paper proposes a novel method to address reliability and technical problems of microgrids (MGs) based on designing a number of self-adequate autonomous sub-MGs by adopting MG clustering thinking. In doing so, a multi-objective optimization problem is developed in which power loss reduction, voltage profile improvement and reliability enhancement are the objective functions. To solve the optimization problem, a hybrid algorithm named HS-GA, based on genetic and harmony search algorithms, is provided, along with a load flow method that models different types of DGs as droop controllers. The performance of the proposed method is evaluated in two case studies, and the results support its effectiveness. PMID:26767800

  8. Swarm Intelligence in Text Document Clustering

    SciTech Connect

    Cui, Xiaohui; Potok, Thomas E

    2008-01-01

    Social animals or insects in nature often exhibit a form of emergent collective behavior. The research field that attempts to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies is called Swarm Intelligence. Compared to traditional algorithms, swarm algorithms are usually flexible, robust, decentralized and self-organized. These characteristics make swarm algorithms suitable for solving complex problems, such as document collection clustering. The major challenge in today's information society is that users are overwhelmed with information on any topic they search for. Fast and high-quality document clustering algorithms play an important role in helping users effectively navigate, summarize, and organize this overwhelming information. In this chapter, we introduce three nature-inspired swarm intelligence clustering approaches for document clustering analysis. These clustering algorithms use stochastic and heuristic principles discovered from observing bird flocks, fish schools and ant food foraging.

  9. Fast distributed large-pixel-count hologram computation using a GPU cluster.

    PubMed

    Pan, Yuechao; Xu, Xuewu; Liang, Xinan

    2013-09-10

    Large-pixel-count holograms are an essential component of large-size holographic three-dimensional (3D) displays, but the generation of such holograms is computationally demanding. In order to address this issue, we have built a graphics processing unit (GPU) cluster with 32.5 Tflop/s computing power and implemented distributed hologram computation on it with speed improvement techniques, such as shared memory on the GPU, GPU-level adaptive load balancing, and node-level load distribution. Using these speed improvement techniques on the GPU cluster, we have achieved a 71.4-fold computation speed increase for 186M-pixel holograms. Furthermore, we have used the approaches of diffraction limits and subdivision of holograms to overcome the GPU memory limit in computing large-pixel-count holograms. 745M-pixel and 1.80G-pixel holograms were computed in 343 and 3326 s, respectively, for more than 2 million object points with RGB colors. Color 3D objects with 1.02M points were successfully reconstructed from a 186M-pixel hologram computed in 8.82 s with all three speed improvement techniques. It is shown that distributed hologram computation using a GPU cluster is a promising approach to increasing the computation speed of large-pixel-count holograms for large-size holographic displays.

  10. Clustering and velocity distributions in granular gases cooling by solid friction

    NASA Astrophysics Data System (ADS)

    Das, Prasenjit; Puri, Sanjay; Schwartz, Moshe

    2016-09-01

    We present large-scale molecular dynamics simulations to study the free evolution of granular gases. Initially, the density of particles is homogeneous and the velocity follows a Maxwell-Boltzmann (MB) distribution. The system cools down due to solid friction between the granular particles. The density remains homogeneous, and the velocity distribution remains MB at early times, while the kinetic energy of the system decays with time. However, fluctuations in the density and velocity fields grow, and the system evolves via formation of clusters in the density field and the local ordering of the velocity field, consistent with the onset of plug flow. This is accompanied by a transition of the velocity distribution function from MB to non-MB behavior. We used equal-time correlation functions and structure factors of the density and velocity fields to study the morphology of clustering. From the correlation functions, we obtain the cluster size, L, as a function of time, t. We show that it exhibits power-law growth with L(t) ~ t^(1/3).

  11. Ginga observations of the Coma cluster and studies of the spatial distribution of iron

    NASA Technical Reports Server (NTRS)

    Hughes, John P.; Butcher, Jackie A.; Stewart, Gordon C.; Tanaka, Yasuo

    1993-01-01

    The large area counters on the Japanese satellite Ginga have been used to determine the X-ray spectrum from the central region of the Coma cluster of galaxies over the energy range from 1.5 to 20 keV. The spectrum is well represented by an isothermal model of temperature 8.21 +/- 0.16 keV and a heavy element (iron) abundance of 0.212 +/- 0.027, relative to the cosmic value. The Ginga spectrum was found to be consistent with the X-ray spectra from the Tenma and EXOSAT satellites for a large class of nonisothermal temperature distributions. The measured iron elemental abundances were used to set a lower limit on the total mass of iron in Coma under the assumption that the iron is not distributed uniformly throughout the cluster. The mass ratio of iron relative to hydrogen (within 2 Mpc) is not less than 18 percent of the cosmic iron to hydrogen mass ratio. This compares to an average abundance of 24 percent if the iron is distributed uniformly. We discuss these results in terms of models for the production of iron in galaxy clusters.

  12. A Wide-Field Hubble Space Telescope Study of the Cluster Cl 0024+1654 at z=0.4. II. The Cluster Mass Distribution

    NASA Astrophysics Data System (ADS)

    Kneib, Jean-Paul; Hudelot, Patrick; Ellis, Richard S.; Treu, Tommaso; Smith, Graham P.; Marshall, Phil; Czoske, Oliver; Smail, Ian; Natarajan, Priyamvada

    2003-12-01

    We present a comprehensive lensing analysis of the rich cluster Cl 0024+1654 (z=0.395) based on panoramic sparse-sampled imaging conducted with the WFPC2 and STIS cameras on board the Hubble Space Telescope. By comparing higher fidelity signals in the limited STIS data with the wider field data available from WFPC2, we demonstrate an ability to reliably detect weak-lensing signals out to a cluster radius of ~5 h_65^-1 Mpc, where the mean shear is around 1%. This enables us to study the distribution of dark matter with respect to the cluster light over an unprecedented range of cluster radii and environments. The projected mass distribution reveals a secondary concentration representing 30% of the overall cluster mass, which is also visible in the distribution of cluster member galaxies. We develop a method to derive the projected mass profile of the main cluster taking into account the influence of the secondary clump. We normalize the mass profile determined from the shear by assuming that background galaxies selected at magnitudes fainter than 23 have a redshift distribution statistically similar to that inferred photometrically in the Hubble Deep Fields. The total mass within the central region of the cluster is independently determined from strong-lensing constraints according to a detailed model that utilizes the multiply imaged arc at z=1.675. Combining strong and weak constraints, we are able to probe the mass profile of the cluster on scales of 0.1-5 Mpc, thus providing a valuable test of the universal form proposed by Navarro, Frenk, & White (NFW) on large scales. A generalized power-law fit indicates an asymptotic three-dimensional density distribution of ρ ~ r^-n with n > 2.4. An isothermal mass profile is therefore strongly rejected, whereas an NFW profile with M_200 = 6.1^{+1.2}_{-1.1} × 10^14 h_65^-1 M_solar provides a good fit to the lensing data. We isolate cluster members according to their optical/near-infrared colors; the red cluster light closely traces the dark matter with a mean

  13. Distributed Noise Generation for Density Estimation Based Clustering without Trusted Third Party

    NASA Astrophysics Data System (ADS)

    Su, Chunhua; Bao, Feng; Zhou, Jianying; Takagi, Tsuyoshi; Sakurai, Kouichi

    The rapid growth of the Internet provides people with tremendous opportunities for data collection, knowledge discovery and cooperative computation. However, it also brings the problem of sensitive information leakage. Both individuals and enterprises may suffer from massive data collection and information retrieval by distrusted parties. In this paper, we propose a privacy-preserving protocol for distributed kernel density estimation-based clustering. Our scheme applies the random data perturbation (RDP) technique and verifiable secret sharing to solve the security problem of the distributed kernel density estimation in [4], which assumed an intermediate party to help with the computation.
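The random data perturbation (RDP) idea can be illustrated without the secret-sharing machinery: each party adds zero-mean noise to its records before sharing them, and the joint kernel density estimate still exposes the cluster structure. A hedged sketch under simplified assumptions (the function names, noise level, and bandwidth are illustrative, not the protocol's actual parameters):

```python
import numpy as np

def perturb(data, sigma, rng):
    # RDP: each party adds zero-mean Gaussian noise before sharing its records
    return data + rng.normal(0.0, sigma, size=data.shape)

def kde(grid, samples, h):
    # Gaussian kernel density estimate evaluated on a grid of points
    d = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * d ** 2).sum(axis=1) / (len(samples) * h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(0)
party_a = rng.normal(-3.0, 0.5, 200)   # records held by party A
party_b = rng.normal(+3.0, 0.5, 200)   # records held by party B
shared = np.concatenate([perturb(party_a, 0.2, rng), perturb(party_b, 0.2, rng)])
grid = np.linspace(-6.0, 6.0, 241)
density = kde(grid, shared, h=0.4)

def d_at(x):
    # density at the grid point closest to x
    return density[np.argmin(np.abs(grid - x))]
```

The estimated density still shows two clear modes near -3 and +3 while no party ever reveals its raw records.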

  14. Better understanding of water quality evolution in water distribution networks using data clustering.

    PubMed

    Mandel, Pierre; Maurel, Marie; Chenu, Damien

    2015-12-15

    The complexity of water distribution networks raises challenges in managing, monitoring and understanding their behavior. This article proposes a novel methodology that applies data clustering to the results of hydraulic simulation in order to define quality zones, i.e., zones with the same dynamic water origin. The methodology is demonstrated on an existing water distribution network, and the definition of the quality zones is validated by a statistical comparison with a large dataset of 158,230 conductivity measurements recorded by 32 probes. The results show how quality zones help in better understanding network operation and how they can be used to analyze water quality events.
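The quality-zone idea can be sketched with plain k-means on simulated water-origin fractions; the paper's actual clustering method and feature set may differ, so treat this as an illustration only:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means, standing in for whichever clustering method defines
    the quality zones from simulated water-origin fractions."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # assign each node to its nearest center, then recompute the centers
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels, centers

# each row: fraction of water a network node receives from each of two sources
rng = np.random.default_rng(2)
X = np.vstack([rng.normal([0.9, 0.1], 0.05, (40, 2)),
               rng.normal([0.1, 0.9], 0.05, (40, 2))])
labels, centers = kmeans(X, k=2)
```

Nodes sharing a label form one quality zone: they receive water of the same dynamic origin.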

  15. Reconstruction of the mass distribution of galaxy clusters from the inversion of the thermal Sunyaev-Zel'dovich effect

    NASA Astrophysics Data System (ADS)

    Majer, C. L.; Meyer, S.; Konrad, S.; Sarli, E.; Bartelmann, M.

    2016-07-01

    This paper continues a series in which we intend to show how all observables of galaxy clusters can be combined to recover the two-dimensional, projected gravitational potential of individual clusters. Our goal is to develop a non-parametric algorithm for joint cluster reconstruction taking all cluster observables into account. For this reason we focus on the line-of-sight projected gravitational potential, proportional to the lensing potential, in order to extend existing reconstruction algorithms. In this paper, we begin with the relation between the Compton-y parameter and the Newtonian gravitational potential, assuming hydrostatic equilibrium and a polytropic stratification of the intracluster gas. Extending our first publication we now consider a spheroidal rather than a spherical cluster symmetry. We show how a Richardson-Lucy deconvolution can be used to convert the intensity change of the CMB due to the thermal Sunyaev-Zel'dovich effect into an estimate for the two-dimensional gravitational potential. We apply our reconstruction method to a cluster based on an N-body/hydrodynamical simulation processed with the characteristics (resolution and noise) of the ALMA interferometer for which we achieve a relative error of ≲20 per cent for a large fraction of the virial radius. We further apply our method to an observation of the galaxy cluster RXJ1347 for which we can reconstruct the potential with a relative error of ≲20 per cent for the observable cluster range.
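The Richardson-Lucy step at the heart of the reconstruction can be illustrated in one dimension: the estimate is multiplicatively corrected by the back-projected ratio of observed to predicted data. A generic sketch, not the authors' spheroidal-cluster formulation:

```python
import numpy as np

def richardson_lucy(observed, psf, iterations=50):
    """Plain 1-D Richardson-Lucy deconvolution: multiplicative updates by the
    back-projected ratio of observed to predicted data."""
    estimate = np.full_like(observed, observed.mean())
    psf_flipped = psf[::-1]  # adjoint of the convolution
    for _ in range(iterations):
        blurred = np.convolve(estimate, psf, mode="same")
        ratio = observed / np.maximum(blurred, 1e-12)
        estimate *= np.convolve(ratio, psf_flipped, mode="same")
    return estimate

# blur a two-spike signal with a known kernel and recover it
truth = np.zeros(64)
truth[20], truth[40] = 1.0, 0.5
psf = np.array([0.25, 0.5, 0.25])
observed = np.convolve(truth, psf, mode="same")
recovered = richardson_lucy(observed, psf, iterations=200)
```

For noiseless data the iteration sharpens the blurred spikes back toward their true positions; with noisy data (as in the ALMA-like mock) the iteration count acts as an implicit regularizer.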

  16. Fuzzy C-mean clustering on kinetic parameter estimation with generalized linear least square algorithm in SPECT

    NASA Astrophysics Data System (ADS)

    Choi, Hon-Chit; Wen, Lingfeng; Eberl, Stefan; Feng, Dagan

    2006-03-01

    Dynamic Single Photon Emission Computed Tomography (SPECT) has the potential to quantitatively estimate physiological parameters by fitting compartment models to the tracer kinetics. The generalized linear least square method (GLLS) is an efficient method to estimate unbiased kinetic parameters and parametric images. However, due to the low sensitivity of SPECT, noisy data can cause voxel-wise parameter estimation by GLLS to fail. Fuzzy C-Mean (FCM) clustering and modified FCM, which also utilizes information from the immediate neighboring voxels, are proposed to improve the voxel-wise parameter estimation of GLLS. Monte Carlo simulations were performed to generate dynamic SPECT data with different noise levels and processed by general and modified FCM clustering. Parametric images were estimated by Logan and Yokoi graphical analysis and GLLS. The influx rate (K_I) and volume of distribution (V_d) were estimated for the cerebellum, thalamus and frontal cortex. Our results show that (1) FCM reduces the bias and improves the reliability of parameter estimates for noisy data, (2) GLLS provides estimates of micro parameters (K_I-k_4) as well as macro parameters, such as the volume of distribution (V_d) and binding potential (BP_I & BP_II), and (3) FCM clustering incorporating neighboring voxel information does not improve the parameter estimates, but reduces noise in the parametric images. These findings indicate that pre-segmentation with traditional FCM clustering is desirable for generating voxel-wise parametric images with GLLS from dynamic SPECT data.
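Standard fuzzy C-means alternates between membership-weighted center updates and membership updates. A compact sketch with fuzzifier m=2 on synthetic 2-D data (not SPECT time-activity curves), assuming the textbook FCM update rules:

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=100, seed=0):
    """Basic fuzzy C-means: returns cluster centers and the membership
    matrix U (n_samples x c) whose rows sum to one."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)               # memberships sum to one
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]  # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                      # guard exact hits
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)      # membership update
    return centers, U

# two well-separated synthetic clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(4.0, 0.3, (50, 2))])
centers, U = fuzzy_c_means(X, c=2)
```

The soft memberships in U are what the paper exploits: voxels are grouped by kinetic similarity before GLLS fitting, rather than being hard-assigned.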

  17. A Topology Visualization Early Warning Distribution Algorithm for Large-Scale Network Security Incidents

    PubMed Central

    He, Hui; Fan, Guotao; Ye, Jianwei; Zhang, Weizhe

    2013-01-01

    It is of great significance to research early warning systems for large-scale network security incidents. Such a system can improve the network's emergency response capabilities, alleviate the damage of cyber attacks, and strengthen the system's counterattack ability. A comprehensive early warning system is presented in this paper, which combines active measurement and anomaly detection. The key visualization algorithm and technology of the system are mainly discussed. The large-scale network system's plane visualization is realized based on a divide-and-conquer approach. First, the topology of the large-scale network is divided into small-scale networks by the MLkP/CR algorithm. Second, the subgraph plane visualization algorithm is applied to each small-scale network. Finally, the small-scale networks' topologies are combined into one topology by the automatic distribution algorithm based on force analysis. As the algorithm transforms the large-scale network topology plane visualization problem into a series of small-scale network topology plane visualization and distribution problems, it has higher parallelism and is able to handle the display of ultra-large-scale network topologies. PMID:24191145

  20. Parallel paving: An algorithm for generating distributed, adaptive, all-quadrilateral meshes on parallel computers

    SciTech Connect

    Lober, R.R.; Tautges, T.J.; Vaughan, C.T.

    1997-03-01

    Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving, only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two-dimensional and three-dimensional surface geometries, and to compare the resulting parallel-produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.

  1. Distributing Power Grid State Estimation on HPC Clusters A System Architecture Prototype

    SciTech Connect

    Liu, Yan; Jiang, Wei; Jin, Shuangshuang; Rice, Mark J.; Chen, Yousu

    2012-08-20

    The future power grid is expected to further expand with highly distributed energy sources and smart loads. The increased size and complexity place a growing burden on existing computational resources in energy control centers. The need to perform real-time assessment of such systems thus calls for efficient means to distribute centralized functions, such as state estimation, in the power system. In this paper, we present an early prototype of a system architecture that connects distributed state estimators, each running parallel programs to solve the non-linear estimation procedure. The prototype consists of a middleware and data processing toolkits that allow data exchange in the distributed state estimation. We build a test case based on the IEEE 118-bus system and partition the state estimation of the whole system model across available HPC clusters. Measurements from the testbed demonstrate the low overhead of our solution.
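The core computation each distributed estimator performs is a weighted least-squares solve of its measurement equations; in a linear (DC) approximation this reduces to the normal equations. A toy sketch under that simplification, not the prototype's actual non-linear solver:

```python
import numpy as np

def wls_state_estimate(H, z, weights):
    """Weighted least-squares state estimation (linear/DC approximation):
    solve (H^T W H) x = H^T W z for the state vector x."""
    W = np.diag(weights)
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ z)

# toy system: 3 measurements of a 2-element state, as one partition might see it
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, -1.0]])
x_true = np.array([1.05, 0.98])
z = H @ x_true  # noiseless measurements for the sketch
x_hat = wls_state_estimate(H, z, weights=[1.0, 1.0, 2.0])
```

In the distributed setting each partition solves its own local problem like this one, and the middleware exchanges boundary-bus values between partitions.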

  2. Symbolic clustering

    SciTech Connect

    Reinke, R.E.

    1991-01-01

    Clustering is the problem of finding a good organization for data. Because there are many kinds of clustering problems, and because there are many possible clusterings for any data set, clustering programs use knowledge and assumptions about individual problems to make clustering tractable. Cluster-analysis techniques allow knowledge to be expressed in the choice of a pairwise distance measure and in the choice of clustering algorithm. Conceptual clustering adds knowledge and preferences about cluster descriptions. In this study the author describes symbolic clustering, which adds representation choice to the set of ways a data analyst can use problem-specific knowledge. He develops an informal model for symbolic clustering, and uses it to suggest where and how knowledge can be expressed in clustering. A language for creating symbolic clusters, based on the model, was developed and tested on three real clustering problems. The study concludes with a discussion of the implications of the model and the results for clustering in general.

  3. Joint Multiwavelength Measurements of the Matter Distribution in Clusters of Galaxies

    NASA Astrophysics Data System (ADS)

    Mahdavi, Andisheh

    We propose to make a comprehensive determination of the baryonic and dark matter distribution in clusters of galaxies using a new methodology developed by us. Using our technique, it is possible to combine all available multiwavelength data for a cluster into a single joint analysis involving X-ray hydrostatics, weak gravitational lensing, stellar dynamics, and the Sunyaev-Zel'dovich effect. This combination represents an improvement over existing methods, and can provide powerful observational constraints on the dark matter halo slope and concentration. We have marshalled data from X-ray observatories (Chandra, XMM-Newton), as well as from ground-based optical and radio facilities, to assemble a representative sample of 50 clusters suitable for testing the detailed predictions of CDM and gas simulations. Our first goals are to measure the logarithmic slope of the dark matter profile to an accuracy of +/- 0.2, and to measure its concentration to an accuracy of 20%, for the relaxed clusters in the sample. We are unique among large multiwavelength surveys in attempting such constraints using all available data. To complement these new limits on dark matter properties, we will also develop and apply triaxial hydrostatic models to the relaxed clusters in our sample, and measure the connection between the level of substructure and deviations from self-similarity. The physics code used for this project will be made publicly available to the broader scientific community for free use. This will prove to be an invaluable tool for the analysis of upcoming space-based (CLASH) and ground-based weak gravitational lensing surveys. The ground-based surveys will produce catalogs of 10^5 clusters of galaxies, and a uniform tool is required for analyzing such data in conjunction with complementary multiwavelength observations.

  4. A Chandra Study of Temperature Distributions of the Intracluster Medium in 50 Galaxy Clusters

    NASA Astrophysics Data System (ADS)

    Zhu, Zhenghao; Xu, Haiguang; Wang, Jingying; Gu, Junhua; Li, Weitian; Hu, Dan; Zhang, Chenhao; Gu, Liyi; An, Tao; Liu, Chengze; Zhang, Zhongli; Zhu, Jie; Wu, Xiang-Ping

    2016-01-01

    To investigate the spatial distribution of the intracluster medium temperature in galaxy clusters in a quantitative way and to probe the physics behind it, we analyze the X-ray spectra from a sample of 50 clusters that were observed with the Chandra ACIS instrument over the past 15 years and measure the radial temperature profiles out to 0.45r500. We construct a physical model that takes into consideration the effects of gravitational heating, thermal history (such as radiative cooling, active galactic nucleus feedback, and thermal conduction), and work done via gas compression, and use it to fit the observed temperature profiles by running Bayesian regressions. The results show that in all cases our model provides an acceptable fit at the 68% confidence level. For further validation, we select nine clusters that have been observed with both Chandra (out to ≳0.3r500) and Suzaku (out to ≳1.5r500) and fit their Chandra spectra with our model. We then compare the extrapolation of the best fits with the Suzaku measurements and find that the model profiles agree with the Suzaku results very well in seven clusters. In the remaining two clusters the difference between the model and the observation is possibly caused by local thermal substructures. Our study also implies that for most of the clusters the assumption of hydrostatic equilibrium is safe out to at least 0.5r500 and that the non-gravitational interactions between dark matter and its luminous counterparts are consistent with zero.

  5. Execution time supports for adaptive scientific algorithms on distributed memory machines

    NASA Technical Reports Server (NTRS)

    Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey

    1990-01-01

    Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives the appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communication patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
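The gather/scatter primitives can be pictured as a translation table from global indices to (owner, local offset) pairs. A toy single-process sketch with hypothetical names (PARTI's real primitives additionally derive message schedules at runtime, which are omitted here):

```python
def build_translation(block_sizes):
    """Block distribution: map each global index to (owner process, local index)."""
    table, g = {}, 0
    for owner, size in enumerate(block_sizes):
        for local in range(size):
            table[g] = (owner, local)
            g += 1
    return table

def gather(distributed, table, global_indices):
    # fetch values addressed by global indices from their owners' local blocks
    return [distributed[o][l] for o, l in (table[g] for g in global_indices)]

def scatter(distributed, table, global_indices, values):
    # write values back through the same global-to-local translation
    for g, v in zip(global_indices, values):
        o, l = table[g]
        distributed[o][l] = v

# three "processes" each own a block of the global array [0..8]
dist = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
table = build_translation([3, 3, 3])
vals = gather(dist, table, [7, 2, 4])
scatter(dist, table, [0], [99])
```

An irregular loop over a global index set can then be executed by gathering its operands, computing locally, and scattering the results, which is the usage pattern the PARTI primitives support.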

  6. An Efficient Distributed Algorithm for Constructing Spanning Trees in Wireless Sensor Networks

    PubMed Central

    Lachowski, Rosana; Pellenz, Marcelo E.; Penna, Manoel C.; Jamhour, Edgard; Souza, Richard D.

    2015-01-01

    Monitoring and data collection are the two main functions in wireless sensor networks (WSNs). Collected data are generally transmitted via multihop communication to a special node, called the sink. While in a typical WSN, nodes have a sink node as the final destination for the data traffic, in an ad hoc network, nodes need to communicate with each other. For this reason, routing protocols for ad hoc networks are inefficient for WSNs. Trees, on the other hand, are classic routing structures explicitly or implicitly used in WSNs. In this work, we implement and evaluate distributed algorithms for constructing routing trees in WSNs described in the literature. After identifying the drawbacks and advantages of these algorithms, we propose a new algorithm for constructing spanning trees in WSNs. The performance of the proposed algorithm and the quality of the constructed tree were evaluated in different network scenarios. The results showed that the proposed algorithm is a more efficient solution. Furthermore, the algorithm provides multiple routes to the sensor nodes to be used as mechanisms for fault tolerance and load balancing. PMID:25594593
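A spanning tree rooted at the sink gives every node a parent, i.e., its next hop toward the sink. A centralized BFS stand-in for the distributed construction (the paper's algorithm builds the same structure by message exchange between neighbors):

```python
from collections import deque

def build_spanning_tree(adjacency, sink):
    """BFS from the sink: each discovered node records the neighbor that
    discovered it as its parent (next hop toward the sink)."""
    parent = {sink: None}
    queue = deque([sink])
    while queue:
        u = queue.popleft()
        for v in adjacency[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def route(node, parent):
    # follow parent pointers from a sensor node up to the sink
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

# small sensor network: node 0 is the sink
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
parent = build_spanning_tree(adj, sink=0)
```

Note that node 3 has two neighbors (1 and 2) at equal depth; keeping such alternatives is what enables the multiple routes the paper uses for fault tolerance and load balancing.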

  8. A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems

    NASA Astrophysics Data System (ADS)

    Vlaj, Damjan; Kotnik, Bojan; Horvat, Bogomir; Kačič, Zdravko

    2005-12-01

    This paper presents a novel computationally efficient voice activity detection (VAD) algorithm and emphasizes the importance of such algorithms in distributed speech recognition (DSR) systems. When using VAD algorithms in telecommunication systems, the required capacity of the speech transmission channel can be reduced if only the speech parts of the signal are transmitted. A similar objective can be adopted in DSR systems, where the nonspeech parameters are not sent over the transmission channel. A novel approach is proposed for VAD decisions based on mel-filter bank (MFB) outputs with the so-called hangover criterion. Comparative tests are presented between the proposed MFB VAD algorithm and the three VAD algorithms used in the G.729, G.723.1, and DSR (advanced front-end) standards. These tests were made on the Aurora 2 database with different signal-to-noise ratios (SNRs). In the speech recognition tests, the proposed MFB VAD outperformed, in relative terms, all three standard VAD algorithms (G.723.1 VAD, G.729 VAD, and DSR VAD) at all SNRs.
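The hangover criterion is the part that is easiest to isolate: once speech is detected, the decision is held for a few extra frames so weak word endings are not clipped. A sketch using plain frame energies in place of the mel-filter bank statistic (threshold and hangover length are illustrative):

```python
def vad_with_hangover(frame_energies, threshold, hangover=5):
    """Energy-based VAD with a hangover: after the energy drops below the
    threshold, frames are still flagged as speech for `hangover` frames."""
    decisions, counter = [], 0
    for e in frame_energies:
        if e >= threshold:
            counter = hangover       # speech frame: rearm the hangover
            decisions.append(True)
        elif counter > 0:
            counter -= 1             # hangover tail: keep flagging speech
            decisions.append(True)
        else:
            decisions.append(False)  # silence
    return decisions

# a short burst of speech followed by low-energy frames
energies = [0.1, 0.1, 5.0, 6.0, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.1]
flags = vad_with_hangover(energies, threshold=1.0, hangover=3)
```

In a DSR front end, only the frames flagged True would have their feature parameters sent over the channel.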

  9. A recursive regularization algorithm for estimating the particle size distribution from multiangle dynamic light scattering measurements

    NASA Astrophysics Data System (ADS)

    Li, Lei; Yang, Kecheng; Li, Wei; Wang, Wanyan; Guo, Wenping; Xia, Min

    2016-07-01

    Conventional regularization methods have been widely used for estimating the particle size distribution (PSD) in single-angle dynamic light scattering, but they cannot be used directly in multiangle dynamic light scattering (MDLS) measurements for lack of accurate angular weighting coefficients, which greatly affects the PSD determination; moreover, none of the regularization methods perform well for both unimodal and multimodal distributions. In this paper, we propose a recursive regularization method, the Recursion Nonnegative Tikhonov-Phillips-Twomey (RNNT-PT) algorithm, for estimating the weighting coefficients and the PSD from MDLS data. This self-adaptive algorithm distinguishes characteristics of PSDs and chooses the optimal inversion method from the Nonnegative Tikhonov (NNT) and Nonnegative Phillips-Twomey (NNPT) regularization algorithms efficiently and automatically. In simulations, the proposed algorithm estimated the PSDs more accurately than the classical regularization methods, was stable against random noise, and was adaptable to both unimodal and multimodal distributions. Furthermore, we found that a six-angle analysis in the 30-130° range is an optimal angle set for both unimodal and multimodal PSDs.
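The Nonnegative Tikhonov (NNT) branch can be sketched as projected gradient descent on the regularized least-squares objective, with a projection onto the nonnegative orthant after every step; the recursive angle-weighting part of RNNT-PT is omitted here:

```python
import numpy as np

def nonnegative_tikhonov(A, b, lam=1e-6, iters=20000):
    """Minimize ||A x - b||^2 + lam ||x||^2 subject to x >= 0
    by projected gradient descent."""
    x = np.zeros(A.shape[1])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)  # 1 / Lipschitz constant
    for _ in range(iters):
        grad = A.T @ (A @ x - b) + lam * x
        x = np.maximum(x - step * grad, 0.0)        # nonnegativity projection
    return x

# ill-posed toy problem: a smooth kernel blurs a two-peak nonnegative PSD
n = 30
c = np.linspace(0.0, 1.0, n)
A = np.exp(-((c[:, None] - c[None, :]) ** 2) / 0.02)
x_true = np.zeros(n)
x_true[8], x_true[20] = 1.0, 0.6
b = A @ x_true
x_est = nonnegative_tikhonov(A, b)
```

The nonnegativity constraint is what keeps the recovered PSD physical; without it, Tikhonov solutions of ill-posed scattering kernels typically oscillate below zero.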

  10. Dark matter distribution in the Coma cluster from galaxy kinematics: breaking the mass-anisotropy degeneracy

    NASA Astrophysics Data System (ADS)

    Łokas, Ewa L.; Mamon, Gary A.

    2003-08-01

    We study velocity moments of elliptical galaxies in the Coma cluster using Jeans equations. The dark matter distribution in the cluster is modelled by a generalized formula based upon the results of cosmological N-body simulations. Its inner slope (cuspy or flat), concentration and mass within the virial radius are kept as free parameters, as well as the velocity anisotropy, assumed independent of position. We show that the study of line-of-sight velocity dispersion alone does not allow us to constrain the parameters. By a joint analysis of the observed profiles of velocity dispersion and kurtosis, we are able to break the degeneracy between the mass distribution and velocity anisotropy. We determine the dark matter distribution at radial distances larger than 3 per cent of the virial radius and we find that the galaxy orbits are close to isotropic. Due to limited resolution, different inner slopes are found to be consistent with the data and we observe a strong degeneracy between the inner slope α and concentration c; the best-fitting profiles have the two parameters related by c = 19 - 9.6α. Our best-fitting Navarro-Frenk-White profile has concentration c = 9, which is 50 per cent higher than standard values found in cosmological simulations for objects of similar mass. The total mass within the virial radius of 2.9 h_70^-1 Mpc is 1.4 × 10^15 h_70^-1 M_solar (with 30 per cent accuracy), 85 per cent of which is dark. At this distance from the cluster centre, the mass-to-light ratio in the blue band is 351 h_70 solar units. The total mass within the virial radius leads to estimates of the density parameter of the Universe, assuming that clusters trace the mass-to-light ratio and baryonic fraction of the Universe, with Ω_0 = 0.29 +/- 0.1.

  11. Multi-agent coordination algorithms for control of distributed energy resources in smart grids

    NASA Astrophysics Data System (ADS)

    Cortes, Andres

    Sustainable energy is a top-priority for researchers these days, since electricity and transportation are pillars of modern society. Integration of clean energy technologies such as wind, solar, and plug-in electric vehicles (PEVs), is a major engineering challenge in operation and management of power systems. This is due to the uncertain nature of renewable energy technologies and the large amount of extra load that PEVs would add to the power grid. Given the networked structure of a power system, multi-agent control and optimization strategies are natural approaches to address the various problems of interest for the safe and reliable operation of the power grid. The distributed computation in multi-agent algorithms addresses three problems at the same time: i) it allows for the handling of problems with millions of variables that a single processor cannot compute, ii) it allows certain independence and privacy to electricity customers by not requiring any usage information, and iii) it is robust to localized failures in the communication network, being able to solve problems by simply neglecting the failing section of the system. We propose various algorithms to coordinate storage, generation, and demand resources in a power grid using multi-agent computation and decentralized decision making. First, we introduce a hierarchical vehicle-one-grid (V1G) algorithm for coordination of PEVs under usage constraints, where energy only flows from the grid in to the batteries of PEVs. We then present a hierarchical vehicle-to-grid (V2G) algorithm for PEV coordination that takes into consideration line capacity constraints in the distribution grid, and where energy flows both ways, from the grid in to the batteries, and from the batteries to the grid. Next, we develop a greedy-like hierarchical algorithm for management of demand response events with on/off loads. Finally, we introduce distributed algorithms for the optimal control of distributed energy resources, i

  12. An Efficient Estimation of Distribution Algorithm for Job Shop Scheduling Problem

    NASA Astrophysics Data System (ADS)

    He, Xiao-Juan; Zeng, Jian-Chao; Xue, Song-Dong; Wang, Li-Fang

    An estimation of distribution algorithm for the job shop scheduling problem was proposed, with a probability model based on permutation information of neighboring operations. The probability model was built from frequency information of neighboring pair-wise operations. Then the structure of the optimal individual was marked and its operations were partitioned into independent sub-blocks. To avoid repeated search in the same area and to improve search speed, each sub-block was adjusted as a whole. Also, stochastic adjustment of the operations within each sub-block was introduced to enhance the local search ability. The experimental results show that the proposed algorithm is more robust and efficient.
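The probability model of neighboring-operation frequencies can be sketched as follows: count how often operation j follows operation i among elite individuals, then sample new permutations from those counts. A toy displacement objective stands in for the job-shop makespan, and the sub-block machinery of the paper is omitted:

```python
import random

def eda_permutation(cost_fn, n, pop=60, elite=15, gens=40, seed=0):
    """Toy estimation-of-distribution algorithm over permutations: the model
    is a (smoothed) matrix of how often operation j follows operation i in
    the elite set, and children are sampled from it."""
    rng = random.Random(seed)
    population = [rng.sample(range(n), n) for _ in range(pop)]
    for _ in range(gens):
        population.sort(key=cost_fn)
        best = population[:elite]
        freq = [[1.0] * n for _ in range(n)]       # Laplace-smoothed pair counts
        for ind in best:
            for a, b in zip(ind, ind[1:]):
                freq[a][b] += 1.0
        children = []
        for _ in range(pop - elite):
            remaining = set(range(n))
            cur = rng.choice(sorted(remaining))
            remaining.remove(cur)
            child = [cur]
            while remaining:                        # extend by sampled successor
                opts = sorted(remaining)
                weights = [freq[cur][o] for o in opts]
                cur = rng.choices(opts, weights)[0]
                remaining.remove(cur)
                child.append(cur)
            children.append(child)
        population = best + children                # elitist replacement
    return min(population, key=cost_fn)

def cost(perm):
    # toy stand-in for makespan: total displacement from the identity order
    return sum(abs(p - i) for i, p in enumerate(perm))

best_seq = eda_permutation(cost, n=8)
```

The elitist replacement guarantees the best-so-far individual is never lost, mirroring how the paper keeps the structure of the optimal individual between generations.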

  13. A Distributed Transmission Rate Adjustment Algorithm in Heterogeneous CSMA/CA Networks

    PubMed Central

    Xie, Shuanglong; Low, Kay Soon; Gunawan, Erry

    2015-01-01

    Distributed transmission rate tuning is important for a wide variety of IEEE 802.15.4 network applications such as industrial network control systems. Such systems often require each node to sustain a certain throughput demand in order to guarantee the system performance. It is thus essential to determine a proper transmission rate that can meet the application requirement and compensate for network imperfections (e.g., packet loss). Such tuning in a heterogeneous network is difficult due to the lack of modeling techniques that can deal with the heterogeneity of the network as well as the network traffic changes. In this paper, a distributed transmission rate tuning algorithm in a heterogeneous IEEE 802.15.4 CSMA/CA network is proposed. Each node uses the results of clear channel assessment (CCA) to estimate the busy channel probability. Then a mathematical framework is developed to estimate the ongoing heterogeneous traffic using the busy channel probability at runtime. Finally, a distributed algorithm is derived to tune the transmission rate of each node to accurately meet the throughput requirement. The algorithm does not require modifications to the IEEE 802.15.4 MAC layer, and it has been experimentally implemented and extensively tested using TelosB nodes with the TinyOS protocol stack. The results reveal that the algorithm is accurate and can satisfy the throughput demand. Compared with existing techniques, the algorithm is fully distributed and thus does not require any central coordination. With this property, it is able to adapt to traffic changes and re-adjust the transmission rate to the desired level, which cannot be achieved using the traditional modeling techniques. PMID:25822140
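A toy sketch of the idea only; the paper's estimator is considerably more elaborate, and all names, constants, and the independence assumption below are invented. A node estimates the busy-channel probability from its CCA outcomes, then inflates its packet rate so that the expected delivered throughput still meets the demand despite losses.

```python
def busy_probability(cca_samples):
    """cca_samples: list of booleans, True = clear channel assessment saw a busy channel."""
    return sum(cca_samples) / len(cca_samples)

def required_packet_rate(target_rate, p_busy, max_attempts=4):
    """Packets/s to send so that `target_rate` packets/s are delivered, assuming
    (simplistically) each of up to max_attempts CSMA/CA attempts fails
    independently with probability p_busy."""
    p_delivery = 1 - p_busy ** max_attempts
    return target_rate / p_delivery

samples = [True, False, False, True, False, False, False, True, False, False]
p = busy_probability(samples)         # 3 busy samples out of 10 -> 0.3
rate = required_packet_rate(50.0, p)  # slightly above 50 pkt/s to absorb losses
```

The compensated rate is only slightly above the demand here because four attempts make delivery failure unlikely at a 0.3 busy probability.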

  14. Algorithm for the direct reconstruction of the dark matter correlation function from weak lensing and galaxy clustering

    SciTech Connect

    Baldauf, Tobias; Smith, Robert E.; Seljak, Uros; Mandelbaum, Rachel

    2010-03-15

    The clustering of matter on cosmological scales is an essential probe for studying the physical origin and composition of our Universe. To date, most of the direct studies have focused on shear-shear weak lensing correlations, but it is also possible to extract the dark matter clustering by combining galaxy-clustering and galaxy-galaxy-lensing measurements. In order to extract the required information, one must relate the observable galaxy distribution to the underlying dark matter distribution. In this study we develop in detail a method that can constrain the dark matter correlation function from galaxy clustering and galaxy-galaxy-lensing measurements, by focusing on the correlation coefficient between the galaxy and matter overdensity fields. Our goal is to develop an estimator that maximally correlates the two. To generate a mock galaxy catalogue for testing purposes, we use the halo occupation distribution approach applied to a large ensemble of N-body simulations to model preexisting SDSS luminous red galaxy sample observations. Using this mock catalogue, we show that a direct comparison between the excess surface mass density measured by lensing and its corresponding galaxy clustering quantity is not optimal. We develop a new statistic that suppresses the small-scale contributions to these observations and show that this new statistic leads to a cross-correlation coefficient that is within a few percent of unity down to 5h{sup -1} Mpc. Furthermore, the residual incoherence between the galaxy and matter fields can be explained using a theoretical model for scale-dependent galaxy bias, giving us a final estimator that is unbiased to within 1%, so that we can reconstruct the dark matter clustering power spectrum at this accuracy up to k{approx}1h Mpc{sup -1}. We also perform a comprehensive study of other physical effects that can affect the analysis, such as redshift space distortions and differences in radial windows between galaxy clustering and weak

  15. Clustering social cues to determine social signals: developing learning algorithms using the "n-most likely states" approach

    NASA Astrophysics Data System (ADS)

    Best, Andrew; Kapalo, Katelynn A.; Warta, Samantha F.; Fiore, Stephen M.

    2016-05-01

    Human-robot teaming largely relies on the ability of machines to respond and relate to human social signals. Prior work in Social Signal Processing has drawn a distinction between social cues (discrete, observable features) and social signals (underlying meaning). For machines to attribute meaning to behavior, they must first understand some probabilistic relationship between the cues presented and the signal conveyed. Using data derived from a study in which participants identified a set of salient social signals in a simulated scenario and indicated the cues related to the perceived signals, we detail a learning algorithm, which clusters social cue observations and defines an "N-Most Likely States" set for each cluster. Since multiple signals may be co-present in a given simulation and a set of social cues often maps to multiple social signals, the "N-Most Likely States" approach provides a dramatic improvement over typical linear classifiers. We find that the target social signal appears in a "3 most-likely signals" set with up to 85% probability. This results in increased speed and accuracy on large amounts of data, which is critical for modeling social cognition mechanisms in robots to facilitate more natural human-robot interaction. These results also demonstrate the utility of such an approach in deployed scenarios where robots need to communicate with human teammates quickly and efficiently. In this paper, we detail our algorithm, comparative results, and offer potential applications for robot social signal detection and machine-aided human social signal detection.
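A minimal sketch of the "N-most likely states" idea with invented cue and signal names: group cue observations, then return the N most frequent signals for each group instead of a single hard label. (The study's actual pipeline clusters continuous cue observations; here the grouping is simplified to identical cue sets.)

```python
from collections import Counter, defaultdict

def train(observations):
    """observations: iterable of (cue_set, signal) pairs."""
    by_cues = defaultdict(Counter)
    for cues, signal in observations:
        by_cues[frozenset(cues)][signal] += 1
    return by_cues

def n_most_likely(model, cues, n=3):
    """Return the n most frequent signals seen with this cue set."""
    return [sig for sig, _ in model[frozenset(cues)].most_common(n)]

# Invented training data: multiple signals can co-occur with the same cues.
obs = [
    ({"crossed_arms", "frown"}, "disagreement"),
    ({"crossed_arms", "frown"}, "frustration"),
    ({"crossed_arms", "frown"}, "disagreement"),
    ({"smile", "nod"}, "agreement"),
    ({"smile", "nod"}, "rapport"),
    ({"crossed_arms", "frown"}, "boredom"),
]
model = train(obs)
top = n_most_likely(model, {"crossed_arms", "frown"}, n=2)
```

Because one cue set maps to several plausible signals, returning a ranked set rather than a single label is what lets the target signal appear in the top-N with high probability.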

  16. Distributions of short-lived radioactive nuclei produced by young embedded star clusters

    SciTech Connect

    Adams, Fred C.; Fatuzzo, Marco; Holden, Lisa

    2014-07-01

    Most star formation in the Galaxy takes place in clusters, where the most massive members can affect the properties of other constituent solar systems. This paper considers how clusters influence star formation and forming planetary systems through nuclear enrichment from supernova explosions, where massive stars deliver short-lived radioactive nuclei (SLRs) to their local environment. The decay of these nuclei leads to both heating and ionization, and thereby affects disk evolution, disk chemistry, and the accompanying process of planet formation. Nuclear enrichment can take place on two spatial scales: (1) within the cluster itself (ℓ ∼ 1 pc), the SLRs are delivered to the circumstellar disks associated with other cluster members. (2) On the next larger scale (ℓ ∼ 2-10 pc), SLRs are injected into the background molecular cloud; these nuclei provide heating and ionization to nearby star-forming regions and to the next generation of disks. For the first scenario, we construct the expected distributions of radioactive enrichment levels provided by embedded clusters. Clusters can account for the SLR mass fractions inferred for the early Solar Nebula, but typical SLR abundances are lower by a factor of ∼10. For the second scenario, we find that distributed enrichment of SLRs in molecular clouds leads to comparable abundances. For both the direct and distributed enrichment processes, the masses of {sup 26}Al and {sup 60}Fe delivered to individual circumstellar disks typically fall in the range 10-100 pM {sub ☉} (where 1 pM {sub ☉} = 10{sup –12} M {sub ☉}). The corresponding ionization rate due to SLRs typically falls in the range ζ{sub SLR} ∼ 1-5 × 10{sup –19} s{sup –1}. This ionization rate is smaller than that due to cosmic rays, ζ{sub CR} ∼ 10{sup –17} s{sup –1}, but will be important in regions where cosmic rays are attenuated (e.g., disk mid-planes).

  17. Two generalizations of Kohonen clustering

    NASA Technical Reports Server (NTRS)

    Bezdek, James C.; Pal, Nikhil R.; Tsao, Eric C. K.

    1993-01-01

    The relationship between the sequential hard c-means (SHCM), learning vector quantization (LVQ), and fuzzy c-means (FCM) clustering algorithms is discussed. LVQ and SHCM suffer from several major problems. For example, they depend heavily on initialization. If the initial values of the cluster centers are outside the convex hull of the input data, such algorithms, even if they terminate, may not produce meaningful results in terms of prototypes for cluster representation. This is due in part to the fact that they update only the winning prototype for every input vector. The impact and interaction of these two families with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but often lends ideas to clustering algorithms, is discussed. Two generalizations of LVQ that are explicitly designed as clustering algorithms are then presented: generalized LVQ (GLVQ) and fuzzy LVQ (FLVQ). Learning rules are derived to optimize an objective function whose goal is to produce 'good clusters'. GLVQ/FLVQ (may) update every node in the clustering net for each input vector. Neither GLVQ nor FLVQ depends upon a choice for the update neighborhood or learning rate distribution - these are taken care of automatically. Segmentation of a gray tone image is used as a typical application of these algorithms to illustrate the performance of GLVQ/FLVQ.
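A sketch of the winner-take-all update that the paper generalizes (variable names invented): classic LVQ/SHCM moves only the single closest prototype toward each input, which is exactly the behavior GLVQ/FLVQ relax by updating every prototype.

```python
def lvq_step(prototypes, x, lr=0.1):
    """Move the single closest prototype toward input x (winner-take-all update)."""
    dists = [sum((p_i - x_i) ** 2 for p_i, x_i in zip(p, x)) for p in prototypes]
    w = dists.index(min(dists))  # index of the winning prototype
    # Only the winner moves; all other prototypes are untouched.
    prototypes[w] = [p_i + lr * (x_i - p_i) for p_i, x_i in zip(prototypes[w], x)]
    return w

protos = [[0.0, 0.0], [10.0, 10.0]]
winner = lvq_step(protos, [1.0, 1.0])  # winner is prototype 0; it moves to [0.1, 0.1]
```

If the prototypes start outside the convex hull of the data, repeated winner-only updates can leave losing prototypes stranded forever, which illustrates the initialization sensitivity the abstract describes.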

  18. Distribution of centrifugal neurons targeting the soma clusters of the olfactory midbrain among decapod crustaceans.

    PubMed

    Schmidt, M

    1997-03-28

    To determine the distribution of two systems of centrifugal neurons innervating the soma clusters of the olfactory midbrain across decapod crustaceans, brains of the following nine species comprising most infraorders were immunostained with antibodies against dopamine and the neuropeptides substance P and FMRFamide: Macrobrachium rosenbergii, Homarus americanus, Cherax destructor, Orconectes limosus, Procambarus clarkii, Astacus leptodactylus, Carcinus maenas, Eriocheir sinensis and Pagurus bernhardus. One system consisting of several neurons with dopamine-like immunoreactivity that originate in the eyestalk ganglia was present in the four crayfish but not in any other species. These neurons project mainly into the lateral soma clusters (cluster 10) comprising the somata of ascending olfactory projection neurons and innervate very sparsely the medial soma clusters (clusters 9 and 11) containing the somata of local interneurons. In the innervation pattern of the lateral cluster, the dopamine-immunoreactive neurons showed large species-specific differences. The other system comprises a pair of giant neurons with substance P-like immunoreactivity. These neurons have somata in the median protocerebrum of the central brain and major projections into the lateral clusters and the core of the olfactory lobes, the neuropils that are the first synaptic relay in the central olfactory pathway of decapods; minor arborizations are present in the medial clusters. The system of substance P-immunoreactive giant neurons was present and of great morphological similarity in all studied species. Only in one species, the shrimp Macrobrachium rosenbergii, evidence for co-localization of FMRFamide-like with substance P-like immunoreactivity in these neurons was obtained. 
These and previously collected data indicate that the centrifugal neurons with dopamine-like immunoreactivity may be associated with the presence of an accessory lobe, a second-order neuropil that receives input from the

  19. Distributions of deposited energy and ionization clusters around ion tracks studied with Geant4 toolkit

    NASA Astrophysics Data System (ADS)

    Burigo, Lucas; Pshenichnov, Igor; Mishustin, Igor; Hilgers, Gerhard; Bleicher, Marcus

    2016-05-01

    The Geant4-based Monte Carlo model for Heavy-Ion Therapy (MCHIT) was extended to study the patterns of energy deposition at sub-micrometer distance from individual ion tracks. Dose distributions for low-energy {sup 1}H, {sup 4}He, {sup 12}C and {sup 16}O ions measured in several experiments are well described by the model in a broad range of radial distances, from 0.5 to 3000 nm. Despite the fact that such distributions are characterized by long tails, a dominant fraction of deposited energy (∼80%) is confined within a radius of about 10 nm. The probability distributions of clustered ionization events in nanoscale volumes of water traversed by {sup 1}H, {sup 2}H, {sup 4}He, {sup 6}Li, {sup 7}Li, and {sup 12}C ions are also calculated. A good agreement of calculated ionization cluster-size distributions with the corresponding experimental data suggests that the extended MCHIT can be used to characterize stochastic processes of energy deposition to sensitive cellular structures.

  20. Inverse estimation of the spheroidal particle size distribution using Ant Colony Optimization algorithms in multispectral extinction technique

    NASA Astrophysics Data System (ADS)

    He, Zhenzong; Qi, Hong; Wang, Yuqing; Ruan, Liming

    2014-10-01

    Four improved Ant Colony Optimization (ACO) algorithms, i.e. the probability density function based ACO (PDF-ACO) algorithm, the Region ACO (RACO) algorithm, the Stochastic ACO (SACO) algorithm and the Homogeneous ACO (HACO) algorithm, are employed to estimate the particle size distribution (PSD) of spheroidal particles. The direct problems are solved by the extended Anomalous Diffraction Approximation (ADA) and the Lambert-Beer law. Three commonly used monomodal distribution functions, i.e. the Rosin-Rammler (R-R) distribution function, the normal (N-N) distribution function, and the logarithmic normal (L-N) distribution function, are estimated under the dependent model. The influence of random measurement errors on the inverse results is also investigated. All the results reveal that the PDF-ACO algorithm is more accurate than the other three ACO algorithms and can be used as an effective technique to investigate the PSD of spheroidal particles. Furthermore, the Johnson's SB (J-SB) function and the modified beta (M-β) function are employed as general distribution functions to retrieve the PSD of spheroidal particles using the PDF-ACO algorithm. The investigation shows a reasonable agreement between the original distribution function and the general distribution function when only the length of the rotational semi-axis is varied.

  1. Improved Cost-Base Design of Water Distribution Networks using Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Moradzadeh Azar, Foad; Abghari, Hirad; Taghi Alami, Mohammad; Weijs, Steven

    2010-05-01

    Population growth and the progressive extension of urbanization in different parts of Iran cause an increasing demand for primary needs. Water, this vital liquid, is the most important natural need for human life. Meeting this need requires the design and construction of water distribution networks, which incur enormous costs on the country's budget. Any reduction in these costs enables more people in society to access water at least cost. Municipal councils therefore need to maximize benefits or minimize expenditures, and to achieve this the engineering design depends on cost optimization techniques. This paper presents optimization models based on a genetic algorithm (GA) to find the minimum design cost of Mahabad City's (North West Iran) water distribution network. By designing two models and comparing the resulting costs, the abilities of the GA were determined. The GA-based model could find optimum pipe diameters to reduce the design costs of the network. Results show that water distribution network design using a genetic algorithm could reduce project costs by at least 7% in comparison to the classic model. Keywords: Genetic Algorithm, Optimum Design of Water Distribution Network, Mahabad City, Iran.
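A toy illustration of the GA formulation, not the paper's hydraulic model: pick a diameter for each pipe to minimize cost while meeting each pipe's required capacity, with infeasible choices penalized. All diameters, costs, capacities, and demands below are invented.

```python
import random

DIAMETERS = [100, 150, 200, 250]                   # mm (invented options)
COST = {100: 8, 150: 14, 200: 22, 250: 33}         # cost per metre (invented)
CAPACITY = {100: 10, 150: 25, 200: 45, 250: 70}    # l/s (invented)
DEMAND = [8, 20, 40, 60, 15]                       # required l/s per pipe (invented)

def fitness(genome):
    """Total cost plus a large penalty for each under-sized pipe (lower is better)."""
    cost = sum(COST[d] for d in genome)
    penalty = sum(1000 for d, q in zip(genome, DEMAND) if CAPACITY[d] < q)
    return cost + penalty

def evolve(generations=200, pop_size=30, rng=random.Random(0)):
    pop = [[rng.choice(DIAMETERS) for _ in DEMAND] for _ in range(pop_size)]
    for _ in range(generations):
        # Tournament selection: each parent is the best of 3 random individuals.
        parents = [min(rng.sample(pop, 3), key=fitness) for _ in range(pop_size)]
        # One-point crossover plus occasional mutation.
        pop = []
        for a, b in zip(parents[::2], parents[1::2]):
            cut = rng.randrange(1, len(DEMAND))
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                if rng.random() < 0.2:
                    child[rng.randrange(len(child))] = rng.choice(DIAMETERS)
                pop.append(child)
    return min(pop, key=fitness)

best = evolve()  # a feasible, low-cost diameter assignment
```

The penalty term is what steers the search toward hydraulically feasible networks; the real design problem replaces it with pressure and head-loss constraints from a hydraulic solver.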

  2. Optimization of spatial light distribution through genetic algorithms for vision systems applied to quality control

    NASA Astrophysics Data System (ADS)

    Castellini, P.; Cecchini, S.; Stroppa, L.; Paone, N.

    2015-02-01

    The paper presents an adaptive illumination system for image quality enhancement in vision-based quality control systems. In particular, a spatial modulation of illumination intensity is proposed in order to improve image quality, thus compensating for different target scattering properties, local reflections and fluctuations of ambient light. The desired spatial modulation of illumination is obtained by a digital light projector, used to illuminate the scene with an arbitrary spatial distribution of light intensity, designed to improve feature extraction in the region of interest. The spatial distribution of illumination is optimized by running a genetic algorithm. An image quality estimator is used to close the feedback loop and to stop iterations once the desired image quality is reached. The technique proves particularly valuable for optimizing the spatial illumination distribution in the region of interest, with the remarkable capability of the genetic algorithm to adapt the light distribution to very different target reflectivity and ambient conditions. The final objective of the proposed technique is the improvement of the matching score in the recognition of parts through matching algorithms, hence of the diagnosis of machine vision-based quality inspections. The procedure has been validated both by a numerical model and by an experimental test, referring to a significant problem of quality control for the washing machine manufacturing industry: the recognition of a metallic clamp. Its applicability to other domains is also presented, specifically for the visual inspection of shoes with retro-reflective tape and T-shirts with paillettes.

  3. A comparative analysis of clustering algorithms: O{sub 2} migration in truncated hemoglobin I from transition networks

    SciTech Connect

    Cazade, Pierre-André; Berezovska, Ganna; Meuwly, Markus; Zheng, Wenwei; Clementi, Cecilia; Prada-Gracia, Diego; Rao, Francesco

    2015-01-14

    The ligand migration network for O{sub 2}–diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k–means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. Also, for most of the states, their population coincides quite favourably, whereas the kinetics of and between the states differs. One of the major differences between k–means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site which points to its potential role as a hub in the network. This is also highlighted in the fact that LSDMap cannot identify this state. First passage time distributions from MCL clusterings using a one- (ligand-position) and two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand- and protein-motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion and highlight that depending on the questions at hand the best-performing method for a particular data set may differ.
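A minimal sketch of Markov Clustering (MCL), one of the three schemes compared above, on a toy graph; real analyses run it on transition networks built from simulation data, and the graph here is invented.

```python
def mcl(adj, inflation=2.0, iterations=30):
    """Markov Clustering on an adjacency matrix (list of lists)."""
    n = len(adj)
    # Column-stochastic transition matrix with self-loops added.
    m = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]

    def normalize(mat):
        for j in range(n):
            s = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= s
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: matrix squaring spreads flow along longer paths.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: elementwise power sharpens strong flows, then renormalize.
        m = normalize([[v ** inflation for v in row] for row in m])
    # Columns that keep mass on the same rows belong to the same cluster.
    clusters = {}
    for j in range(n):
        support = frozenset(i for i in range(n) if m[i][j] > 1e-6)
        clusters.setdefault(support, set()).add(j)
    return list(clusters.values())

# Two triangles joined by a single weak edge -> two clusters expected.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adj)
```

Unlike k-means, MCL needs no preset cluster count: the inflation parameter controls granularity, which is one reason the kinetics it recovers can differ from coordinate-based clustering.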

  4. A Recursive Multiscale Correlation-Averaging Algorithm for an Automated Distributed Road Condition Monitoring System

    SciTech Connect

    Ndoye, Mandoye; Barker, Alan M; Krogmeier, James; Bullock, Darcy

    2011-01-01

    A signal processing approach is proposed to jointly filter and fuse spatially indexed measurements captured from many vehicles. It is assumed that these measurements are influenced by both sensor noise and measurement indexing uncertainties. Measurements from low-cost vehicle-mounted sensors (e.g., accelerometers and Global Positioning System (GPS) receivers) are properly combined to produce higher quality road roughness data for cost-effective road surface condition monitoring. The proposed algorithms are recursively implemented and thus require only moderate computational power and memory space. These algorithms are important for future road management systems, which will use on-road vehicles as a distributed network of sensing probes gathering spatially indexed measurements for condition monitoring, in addition to other applications, such as environmental sensing and/or traffic monitoring. Our method and the related signal processing algorithms have been successfully tested using field data.
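A sketch of the recursive flavor of such a system (class and constants invented, not the authors' filter): roughness samples from many probe vehicles are binned by road position and fused with a running-mean update, so no raw measurement history needs to be stored.

```python
class RoadRoughnessMap:
    """Fuses spatially indexed roughness samples into per-segment estimates."""

    def __init__(self, segment_m=10.0):
        self.segment_m = segment_m
        self.mean = {}   # segment index -> fused roughness estimate
        self.count = {}  # segment index -> number of samples fused so far

    def update(self, position_m, roughness):
        k = int(position_m // self.segment_m)
        n = self.count.get(k, 0) + 1
        m = self.mean.get(k, 0.0)
        # Recursive mean update: equivalent to averaging all samples,
        # but O(1) memory per segment.
        self.mean[k] = m + (roughness - m) / n
        self.count[k] = n

road = RoadRoughnessMap()
for pos, r in [(3.0, 2.0), (7.0, 4.0), (12.0, 10.0)]:  # invented probe data
    road.update(pos, r)
```

A production version would additionally weight samples by their GPS indexing uncertainty, which is the "correlation-averaging" part of the title; the recursion shown here is what keeps the computational and memory cost moderate.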

  5. cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design.

    PubMed

    Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian; Hallen, Mark; Donald, Bruce R; Zeng, Jianyang; Xu, Wei

    2016-09-01

    Finding the global minimum energy conformation (GMEC) of a huge combinatorial search space is the key challenge in computational protein design (CPD) problems. Traditional algorithms lack a scalable and efficient distributed design scheme, preventing researchers from taking full advantage of current cloud infrastructures. We design cloud OSPREY (cOSPREY), an extension to a widely used protein design software OSPREY, to allow the original design framework to scale to the commercial cloud infrastructures. We propose several novel designs to integrate both algorithm and system optimizations, such as GMEC-specific pruning, state search partitioning, asynchronous algorithm state sharing, and fault tolerance. We evaluate cOSPREY on three different cloud platforms using different technologies and show that it can solve a number of large-scale protein design problems that have not been possible with previous approaches.

  6. Intrusion-Aware Alert Validation Algorithm for Cooperative Distributed Intrusion Detection Schemes of Wireless Sensor Networks

    PubMed Central

    Shaikh, Riaz Ahmed; Jameel, Hassan; d’Auriol, Brian J.; Lee, Heejo; Lee, Sungyoung; Song, Young-Jae

    2009-01-01

    Existing anomaly and intrusion detection schemes of wireless sensor networks have mainly focused on the detection of intrusions. Once an intrusion is detected, alerts or claims will be generated. However, any unidentified malicious node in the network could send faulty anomaly and intrusion claims about legitimate nodes to the other nodes. Verifying the validity of such claims is a critical and challenging issue that is not considered in the existing cooperative-based distributed anomaly and intrusion detection schemes of wireless sensor networks. In this paper, we propose a validation algorithm that addresses this problem. This algorithm utilizes the concept of intrusion-aware reliability, which helps to provide adequate reliability at a modest communication cost. We also provide a security resiliency analysis of the proposed intrusion-aware alert validation algorithm. PMID:22454568

  7. Improved location algorithm for multiple intrusions in distributed Sagnac fiber sensing system.

    PubMed

    Wang, He; Sun, Qizhen; Li, Xiaolei; Wo, Jianghai; Shum, Perry Ping; Liu, Deming

    2014-04-01

    An improved algorithm named "twice-FFT" for multi-point intrusion location in a distributed Sagnac sensing system is proposed and demonstrated. To find the null frequencies more accurately and efficiently, a second FFT is applied to the frequency spectrum of the phase signal caused by an intrusion. After Gaussian fitting and searching for the peak response frequency in the twice-FFT curve, the intrusion position can be calculated stably. Meanwhile, the twice-FFT algorithm can solve the problem of multi-point intrusion location. In experiments with the twice-FFT algorithm, a location error of less than 100 m for a single intrusion is achieved at any position along the total length of 41 km, and the ability to locate two or three intrusions occurring simultaneously is also demonstrated. PMID:24718133

  9. [Investigation of the distribution of water clusters in vegetables, fruits, and natural waters by flicker noise spectroscopy].

    PubMed

    Zubov, A V; Zubov, K V; Zubov, V A

    2007-01-01

    The distribution of water clusters in fresh rain water and in rain water that was aged for 30 days (North Germany, 53 degrees 33' N, 12 degrees 47' E, 293 K, rain on 25.06.06) as well as in fresh vegetables and fruits was studied by flicker noise spectroscopy. In addition, the development of water clusters in apples and potatoes during ripening in 2006 was investigated. A different distribution of water clusters in irrigation water (river and rain) and in the biomatrix of vegetables (potatoes, onions, tomatoes, red beets) and fruits (apples, bananas) was observed. It was concluded that the cluster structure of irrigation water differs from that of water of the biomatrix of vegetables and fruits and depends on drought and the biomatrix nature. Water clusters in plants are more stable and reproducible than water clusters in natural water. The main characteristics of cluster formation in materials studied were given. The oscillation frequencies of water clusters in plants (biofield) are given at which they interact with water clusters of the Earth hydrosphere. A model of series of clusters 16(H2O)100 <--> 4(H2O)402 <--> 2(H2O)903 <--> (H2O)1889 in the biomatrix of vegetables and fruits was discussed.

  10. Distribution and Chemical State of Cu-rich Clusters in Silicon: Preprint

    SciTech Connect

    Buonassisi, T.; Marcus, M. A.; Istratov, A. A.; Heuer, M.; Ciszek, T. F.; Lai, B.; Cai, Z.; Weber, E. R.

    2004-08-01

    The chemical state and distribution of Cu-rich clusters were determined in four different silicon-based materials with varying contamination pathways and degrees of oxygen concentration, including as-grown multicrystalline silicon. In all four samples, Cu3Si was the only chemical state observed. Cu3Si clusters were observed at structural defects within all four materials; XBIC measurements revealed that the presence of Cu3Si corresponds to increased recombination activity. Oxidized Cu compounds are not likely to form in silicon. The +1 eV edge shift in the μ-XAS absorption spectrum of Cu3Si relative to Cu metal is believed to be an indication of a degree of covalent bonding between Cu atoms and their silicon neighbors.

  11. Constraining the initial conditions of globular clusters using their radius distribution

    NASA Astrophysics Data System (ADS)

    Alexander, Poul E. R.; Gieles, Mark

    2013-05-01

    Studies of extragalactic globular clusters (GCs) have shown that the peak size of the GC radius distribution (RD) depends only weakly on galactic environment. We model RDs of GC populations using a simple prescription for a Hubble time of relaxation-driven evolution of cluster mass and radius. We consider a power-law cluster initial mass function (CIMF) with and without an exponential truncation, and focus in particular on a flat and a steep CIMF (power-law indices of 0 and -2, respectively). For the initial half-mass radii at birth, we adopt either Roche volume (RV) filling conditions (`filling', meaning that the ratio of half-mass to Jacobi radius is approximately r{sub h}/r{sub J} ≃ 0.15) or strongly RV under-filling conditions (`under-filling', implying that initially r{sub h}/r{sub J} ≪ 0.15). Assuming a constant orbital velocity about the galaxy centre, we find for a steep CIMF that the typical half-light radius scales with the galactocentric radius R{sub G} as R{sub G}{sup 1/3}. This weak scaling is consistent with observations, but this scenario has the (well-known) problem that too many low-mass clusters survive. A flat CIMF with `filling' initial conditions results in the correct MF at old ages, but with too many large (massive) clusters at large R{sub G}. An `under-filling' GC population with a flat CIMF also results in the correct MF, and can also successfully reproduce the shape of the RD, with a peak size that is (almost) independent of R{sub G}. In this case, the peak size depends (almost) only on the peak mass of the GC MF. The (near) universality of the GC RD is therefore because of the (near) universality of the CIMF. There are some extended GCs in the outer halo of the Milky Way that cannot be explained by this model.

  12. Designing Topology-Aware Collective Communication Algorithms for Large Scale InfiniBand Clusters: Case Studies with Scatter and Gather

    SciTech Connect

    Kandalla, Krishna; Subramoni, Hari; Vishnu, Abhinav; Panda, Dhabaleswar K.

    2010-04-01

    Modern high performance computing systems are being increasingly deployed in a hierarchical fashion with multi-core computing platforms forming the base of the hierarchy. These systems are usually comprised of multiple racks, with each rack consisting of a finite number of chassis, with each chassis having multiple compute nodes or blades, based on multi-core architectures. The networks are also hierarchical with multiple levels of switches. Message exchange operations between processes that belong to different racks involve multiple hops across different switches and this directly affects the performance of collective operations. In this paper, we take on the challenges involved in detecting the topology of large scale InfiniBand clusters and leveraging this knowledge to design efficient topology-aware algorithms for collective operations. We also propose a communication model to analyze the communication costs involved in collective operations on large scale supercomputing systems. We have analyzed the performance characteristics of two collectives, MPI_Gather and MPI_Scatter on such systems and we have proposed topology-aware algorithms for these operations. Our experimental results have shown that the proposed algorithms can improve the performance of these collective operations by almost 54% at the micro-benchmark level.
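A conceptual sketch of why topology-aware scatter helps (no MPI here, just hop counting on an invented topology): the root first sends each rack leader its rack's whole chunk, then leaders scatter locally, replacing many expensive inter-rack messages with one per rack.

```python
def flat_scatter_hops(racks, inter_hop=3, intra_hop=1):
    """Root sends each node its chunk directly, paying inter-rack hops repeatedly."""
    root_rack = racks[0]
    hops = 0
    for rack in racks:
        for _node in rack:
            hops += intra_hop if rack is root_rack else inter_hop
    return hops

def hierarchical_scatter_hops(racks, inter_hop=3, intra_hop=1):
    """Two-level scheme: root -> rack leaders (bulk), then leaders -> rack members."""
    hops = 0
    for _rack in racks[1:]:
        hops += inter_hop                      # one bulk message per remote rack
    for rack in racks:
        hops += intra_hop * (len(rack) - 1)    # local scatter inside each rack
    return hops

# 4 invented racks of 8 nodes each.
racks = [[f"r{i}n{j}" for j in range(8)] for i in range(4)]
flat = flat_scatter_hops(racks)          # 8*1 + 24*3 = 80
hier = hierarchical_scatter_hops(racks)  # 3*3 + 4*7 = 37
```

Real implementations must also account for message sizes and switch contention, which is what the paper's communication model captures; this sketch only shows where the hop savings come from.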

  13. The rates and time-delay distribution of multiply imaged supernovae behind lensing clusters

    SciTech Connect

    Li, Xue; Hjorth, Jens; Richard, Johan E-mail: jens@dark-cosmology.dk

    2012-11-01

    Time delays of gravitationally lensed sources can be used to constrain the mass model of a deflector and determine cosmological parameters. We here present an analysis of the time-delay distribution of multiply imaged sources behind 17 strong-lensing galaxy clusters with well-calibrated mass models. We find that for time delays less than 1000 days, at z = 3.0, their logarithmic probability distribution functions are well represented by P(log Δt) = 5.3 × 10^-4 Δt^β̃ / M_250^(2β̃), with β̃ = 0.77, where M_250 is the projected cluster mass inside 250 kpc (in units of 10^14 M_☉) and β̃ is the power-law slope of the distribution. The resultant probability distribution function enables us to estimate the time-delay distribution in a lensing cluster of known mass. For a cluster with M_250 = 2 × 10^14 M_☉, the fraction of time delays less than 1000 days is approximately 3%. Taking Abell 1689 as an example, its dark halo and brightest galaxies, with central velocity dispersions σ ≥ 500 km s^-1, mainly produce large time delays, while galaxy-scale mass clumps are responsible for generating smaller time delays. We estimate the probability of observing multiple images of a supernova in the known images of Abell 1689. A two-component model for estimating the supernova rate is applied in this work. For a magnitude threshold of m_AB = 26.5, the yearly rate of Type Ia (core-collapse) supernovae with time delays less than 1000 days is 0.004 ± 0.002 (0.029 ± 0.001). If the magnitude threshold is lowered to m_AB ∼ 27.0, the rate of core-collapse supernovae suitable for time-delay observation is 0.044 ± 0.015 per year.

  14. A Distributed Parallel Genetic Algorithm of Placement Strategy for Virtual Machines Deployment on Cloud Platform

    PubMed Central

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as their main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to the requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) as a placement strategy for virtual machine deployment on the cloud platform. In the first stage, it executes the genetic algorithm in parallel and in a distributed manner on several selected physical hosts. It then executes the genetic algorithm of the second stage with the solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and is more effective and more energy efficient than other placement strategies on the cloud platform. PMID:25097872
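
    The two-stage flow — several independent GA runs standing in for the per-host first stage, whose winners seed a final run — can be sketched on a toy load-balancing instance of VM placement. All loads, operators and parameters below are illustrative assumptions, not the paper's DPGA.

```python
import random

random.seed(1)

LOADS = [4, 3, 3, 2, 2, 1]   # hypothetical VM resource demands
HOSTS = 3                    # physical hosts to place them on

def cost(assign):
    # Imbalance cost: load of the most heavily loaded host (lower is better).
    per_host = [0] * HOSTS
    for vm, h in enumerate(assign):
        per_host[h] += LOADS[vm]
    return max(per_host)

def ga(pop, gens=60):
    for _ in range(gens):
        pop.sort(key=cost)
        parents = pop[:len(pop) // 2]        # elitist selection
        children = []
        while len(children) < len(pop) - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, len(LOADS))
            child = a[:cut] + b[cut:]        # one-point crossover
            if random.random() < 0.2:        # mutation
                child[random.randrange(len(LOADS))] = random.randrange(HOSTS)
            children.append(child)
        pop = parents + children
    return min(pop, key=cost)

def rand_pop(n):
    return [[random.randrange(HOSTS) for _ in LOADS] for _ in range(n)]

# Stage 1: independent GA runs, as if on four separate physical hosts.
stage1 = [ga(rand_pop(20)) for _ in range(4)]
# Stage 2: a final GA seeded with the stage-1 winners as initial population.
best = ga(stage1 + rand_pop(16))
print(cost(best))  # the optimum for this instance is a max host load of 5
```

    The stage-1 runs would execute concurrently in the real algorithm; here they run sequentially for simplicity.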

  15. A distributed parallel genetic algorithm of placement strategy for virtual machines deployment on cloud platform.

    PubMed

    Dong, Yu-Shuang; Xu, Gao-Chao; Fu, Xiao-Dong

    2014-01-01

    The cloud platform provides various services to users. More and more cloud centers provide infrastructure as their main way of operating. To improve the utilization rate of the cloud center and to decrease the operating cost, the cloud center provides services according to the requirements of users by sharding the resources with virtualization. Considering both QoS for users and cost saving for cloud computing providers, we try to maximize performance and minimize energy cost as well. In this paper, we propose a distributed parallel genetic algorithm (DPGA) as a placement strategy for virtual machine deployment on the cloud platform. In the first stage, it executes the genetic algorithm in parallel and in a distributed manner on several selected physical hosts. It then executes the genetic algorithm of the second stage with the solutions obtained from the first stage as the initial population. The solution calculated by the genetic algorithm of the second stage is the optimal one of the proposed approach. The experimental results show that the proposed placement strategy of VM deployment can ensure QoS for users and is more effective and more energy efficient than other placement strategies on the cloud platform.

  16. A data distributed, parallel algorithm for ray-traced volume rendering

    SciTech Connect

    Ma, Kwan-Liu; Painter, J.S.; Hansen, C.D.; Krogh, M.F.

    1993-03-30

    This paper presents a divide-and-conquer ray-traced volume rendering algorithm and its implementation on networked workstations and a massively parallel computer, the Connection Machine CM-5. This algorithm distributes the data and the computational load to individual processing units to achieve fast, high-quality rendering of high-resolution data, even when only a modest amount of memory is available on each machine. The volume data, once distributed, is left intact. The processing nodes perform local ray-tracing of their subvolumes concurrently; no communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Implementations and tests on a group of networked workstations and on the Thinking Machines CM-5 demonstrate the practicality of our algorithm and expose different performance tuning issues for each platform. We use data sets from medical imaging and computational fluid dynamics simulations in the study of this algorithm.
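
    The final step — combining per-node subimages in a view order known a priori — is the "over" operator applied pixel by pixel. A minimal sketch with premultiplied-alpha pixels (pixel values and node count are illustrative):

```python
# Back-to-front "over" compositing of per-node subimages, the final
# step of the divide-and-conquer renderer. Each pixel is a
# (premultiplied color, alpha) pair; values here are toy examples.

def over(front, back):
    """Composite one pixel: front OVER back (premultiplied alpha)."""
    fc, fa = front
    bc, ba = back
    return (fc + bc * (1 - fa), fa + ba * (1 - fa))

def composite(subimages):
    """subimages: per-node pixel lists, ordered front to back."""
    result = subimages[0]
    for img in subimages[1:]:
        result = [over(f, b) for f, b in zip(result, img)]
    return result

# Two processing nodes, one-pixel subimages.
node_a = [(0.4, 0.5)]   # nearer subvolume
node_b = [(0.6, 1.0)]   # farther subvolume
print(composite([node_a, node_b]))  # ≈ [(0.7, 1.0)]
```

    Because "over" is associative, subimages can also be merged pairwise in a tree, which is how parallel compositing is usually organized.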

  18. Computational Methods for Decentralized Two-Level 0-1 Programming Problems through Distributed Genetic Algorithms

    NASA Astrophysics Data System (ADS)

    Niwa, Keiichi; Hayashida, Tomohiro; Sakawa, Masatoshi; Yang, Yishen

    2010-10-01

    We consider two-level programming problems in which there is one decision maker (the leader) at the upper level and two or more decision makers (the followers) at the lower level, and the decision variables of the leader and the followers are 0-1 variables. We assume that there is coordination among the followers, while between the leader and the group of all the followers there is no motivation to cooperate with each other; fuzzy goals for the objective functions of the leader and the followers are introduced so as to take the fuzziness of their judgments into consideration. The leader maximizes the degree of satisfaction (the value of the membership function), and the followers choose in concert in order to maximize the minimum among their degrees of satisfaction. We propose a modified computational method that resolves problems with the existing genetic-algorithm-based computational method for obtaining the Stackelberg solution. Specifically, a distributed genetic algorithm is introduced at the upper level, which handles the decision variables of the leader, in order to shorten the computational time of the existing method. Parallelization of the lower level genetic algorithm is also performed along with parallelization of the upper level genetic algorithm. In order to demonstrate the effectiveness of the proposed computational method, numerical experiments are carried out.

  19. Optimal Allocation of Distributed Generation Minimizing Loss and Voltage Sag Problem-Using Genetic Algorithm

    NASA Astrophysics Data System (ADS)

    Biswas, S.; Goswami, S. K.

    2010-10-01

    In the present paper an attempt has been made to place the distributed generation at an optimal location so as to improve the technical as well as economic performance. Among the technical issues, the sag performance and the loss have been considered. The genetic algorithm has been used as the optimization technique for this problem. For sag analysis, the impact of 3-phase symmetrical short-circuit faults is considered. The total load disturbed during the faults is considered as an indicator of sag performance. The solution algorithm is demonstrated on a 34-bus radial distribution system with some lateral branches. For simplicity, only one DG of predefined capacity is considered. MATLAB has been used as the programming environment.

  20. Optimal design of water distribution networks by a discrete state transition algorithm

    NASA Astrophysics Data System (ADS)

    Zhou, Xiaojun; Gao, David Y.; Simpson, Angus R.

    2016-04-01

    In this study it is demonstrated that, with respect to model formulation, the number of linear and nonlinear equations involved in water distribution networks can be reduced to the number of closed simple loops. Regarding the optimization technique, a discrete state transition algorithm (STA) is introduced to solve several cases of water distribution networks. Firstly, the focus is on a parametric study of the 'restoration probability and risk probability' in the dynamic STA. To deal effectively with head pressure constraints, the influence of the penalty coefficient and search enforcement on the performance of the algorithm is then investigated. Based on the experience gained from solving the Two-Loop network problem, the discrete STA successfully achieves the best known solutions for the Hanoi, triple Hanoi and New York network problems.

  1. Optimizing spherical light-emitting diode array for highly uniform illumination distribution by employing genetic algorithm

    NASA Astrophysics Data System (ADS)

    Shen, Yanxia; Ji, Zhicheng; Su, Zhouping

    2013-01-01

    A numerical optimization method (genetic algorithm) is employed to design the spherical light-emitting diode (LED) array for highly uniform illumination distribution. An evaluation function related to the nonuniformity is constructed for the numerical optimization. With the minimum of evaluation function, the LED array produces the best uniformity. The genetic algorithm is used to seek the minimum of evaluation function. By this method, we design two LED arrays. In one case, LEDs are positioned symmetrically on the sphere and the illuminated target surface is a plane. However, in the other case, LEDs are positioned nonsymmetrically with a spherical target surface. Both the symmetrical and nonsymmetrical spherical LED arrays generate good uniform illumination distribution with calculated nonuniformities of 6 and 8%, respectively.

  2. Analysis of Massive Emigration from Poland: The Model-Based Clustering Approach

    NASA Astrophysics Data System (ADS)

    Witek, Ewa

    The model-based approach assumes that the data are generated by a finite mixture of probability distributions, such as multivariate normal distributions. In finite mixture models, each component of the mixture corresponds to a cluster. The problem of determining the number of clusters and choosing an appropriate clustering method thus becomes a problem of statistical model choice. Hence, the model-based approach provides a key advantage over heuristic clustering algorithms, because it selects both the correct model and the number of clusters.
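
    The model-choice idea can be made concrete with a minimal sketch: fit one- and two-component Gaussian mixtures by EM and pick the component count with the lower BIC. The 1-D synthetic data, deterministic initialisation and fixed iteration count are simplifying assumptions for illustration.

```python
import math
import random

random.seed(0)

# Two well-separated 1-D groups; in the model-based view each group is
# one mixture component. The data are synthetic and illustrative.
data = ([random.gauss(0.0, 0.5) for _ in range(60)] +
        [random.gauss(5.0, 0.5) for _ in range(60)])

def normal_pdf(x, mu, sigma):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fit_mixture(xs, k, iters=50):
    """Plain EM for a k-component Gaussian mixture; returns the log-likelihood."""
    srt = sorted(xs)
    mus = [srt[i * (len(xs) - 1) // max(k - 1, 1)] for i in range(k)]  # spread init
    sigmas = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            ps = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
            tot = sum(ps)
            resp.append([p / tot for p in ps])
        # M-step: re-estimate weights, means and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(xs)
            mus[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - mus[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)
    return sum(math.log(sum(w * normal_pdf(x, m, s)
                            for w, m, s in zip(weights, mus, sigmas))) for x in xs)

def bic(ll, k, n):
    n_params = 3 * k - 1              # k means, k sigmas, k - 1 free weights
    return n_params * math.log(n) - 2 * ll

scores = {k: bic(fit_mixture(data, k), k, len(data)) for k in (1, 2)}
print(min(scores, key=scores.get))    # BIC selects 2 components here
```

    The BIC penalty term is what lets the mixture framework choose the number of clusters rather than taking it as an input.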

  3. A Hierarchical and Distributed Approach for Mapping Large Applications to Heterogeneous Grids using Genetic Algorithms

    NASA Technical Reports Server (NTRS)

    Sanyal, Soumya; Jain, Amit; Das, Sajal K.; Biswas, Rupak

    2003-01-01

    In this paper, we propose a distributed approach for mapping a single large application to a heterogeneous grid environment. To minimize the execution time of the parallel application, we distribute the mapping overhead to the available nodes of the grid. This approach not only provides a fast mapping of tasks to resources but is also scalable. We adopt a hierarchical grid model and accomplish the job of mapping tasks to this topology using a scheduler tree. Results show that our three-phase algorithm provides high quality mappings, and is fast and scalable.

  4. Spherical harmonic analysis of particle velocity distribution function: Comparison of moments and anisotropies using Cluster data

    NASA Astrophysics Data System (ADS)

    Viñas, Adolfo F.; Gurgiolo, Chris

    2009-01-01

    This paper presents a spherical harmonic analysis of the plasma velocity distribution function using high-angular, energy, and time resolution Cluster data obtained from the PEACE spectrometer instrument to demonstrate how this analysis models the particle distribution function and its moments and anisotropies. The results show that spherical harmonic analysis produced a robust physical representation model of the velocity distribution function, resolving the main features of the measured distributions. From the spherical harmonic analysis, a minimum set of nine spectral coefficients was obtained from which the moment (up to the heat flux), anisotropy, and asymmetry calculations of the velocity distribution function were obtained. The spherical harmonic method provides a potentially effective ``compression'' technique that can be easily carried out onboard a spacecraft to determine the moments and anisotropies of the particle velocity distribution function for any species. These calculations were implemented using three different approaches, namely, the standard traditional integration, the spherical harmonic (SPH) spectral coefficients integration, and the singular value decomposition (SVD) on the spherical harmonic methods. A comparison among the various methods shows that both SPH and SVD approaches provide remarkable agreement with the standard moment integration method.

  5. Spherical Harmonic Analysis of Particle Velocity Distribution Function: Comparison of Moments and Anisotropies using Cluster Data

    NASA Technical Reports Server (NTRS)

    Gurgiolo, Chris; Vinas, Adolfo F.

    2009-01-01

    This paper presents a spherical harmonic analysis of the plasma velocity distribution function using high-angular, energy, and time resolution Cluster data obtained from the PEACE spectrometer instrument to demonstrate how this analysis models the particle distribution function and its moments and anisotropies. The results show that spherical harmonic analysis produced a robust physical representation model of the velocity distribution function, resolving the main features of the measured distributions. From the spherical harmonic analysis, a minimum set of nine spectral coefficients was obtained from which the moment (up to the heat flux), anisotropy, and asymmetry calculations of the velocity distribution function were obtained. The spherical harmonic method provides a potentially effective "compression" technique that can be easily carried out onboard a spacecraft to determine the moments and anisotropies of the particle velocity distribution function for any species. These calculations were implemented using three different approaches, namely, the standard traditional integration, the spherical harmonic (SPH) spectral coefficients integration, and the singular value decomposition (SVD) on the spherical harmonic methods. A comparison among the various methods shows that both SPH and SVD approaches provide remarkable agreement with the standard moment integration method.

  6. Photodissociation of Arn + cluster ions: Kinetic energy distributions of neutral fragments

    NASA Astrophysics Data System (ADS)

    Nagata, Takashi; Kondow, Tamotsu

    1993-01-01

    The time-of-flight (TOF) spectra of fragments produced in the photodissociation of Arn+ (3≤n≤24) were measured at 532 nm. Analysis of these TOF spectra provides quantitative information on the kinetic energy distributions of the neutral Ar fragments. For Arn+ with n≤14, two types of Ar fragments were distinguished according to the kinetic energy release. One having a sizable amount of kinetic energy is ascribed to the fragments directly produced via the dissociation of the chromophoric core in the cluster ions. The other carrying a smaller amount of kinetic energy can be described by ``evaporation'' of solvent atoms in Arn+. The average translational energies of the ``fast'' and the ``slow'' fragments were estimated to be 0.35-0.38 and 0.07-0.1 eV, respectively, for n=7-11. The angular distribution of the fast fragments exhibits a preferential anisotropy with 1.5≲β≲2 along the direction of the polarization vector of the excitation laser, while an almost isotropic distribution was observed for the slow fragments. A possible photodissociation mechanism was proposed based on the theoretically predicted geometries of Arn+. In the TOF spectra for the larger Arn+ with 14≤n≤24, no indication was obtained for the production of the fast fragments. The average kinetic energy of the ejected neutral atoms is ˜0.05 eV at n=24. This finding indicates that the direct core dissociation no longer takes place in the larger Arn+ clusters, suggesting that the photophysical properties of Arn+ (n≥14) differ from those of the smaller cluster ions.

  7. Parallel multi-join query optimization algorithm for distributed sensor network in the internet of things

    NASA Astrophysics Data System (ADS)

    Zheng, Yan

    2015-03-01

    The Internet of things (IoT), which focuses on providing users with information exchange and intelligent control, has attracted much attention from researchers all over the world since the beginning of this century. The IoT consists of a large number of sensor nodes and data processing units, and its most important features are energy constraints, efficient communication and high redundancy. As the number of sensor nodes increases, communication efficiency and the available communication bandwidth become bottlenecks. Much existing research addresses cases in which the number of joins is small; this is not adequate for the growing number of multi-join queries across the whole Internet of things. To improve the communication efficiency between parallel units in the distributed sensor network, this paper proposes a parallel query optimization algorithm based on a distribution-attribute cost graph. The algorithm takes the storage relations and the network communication costs into account, and establishes an optimized information exchange rule. The experimental results show that the algorithm performs well and makes effective use of the resources of each node in the distributed sensor network, thereby improving the execution efficiency of multi-join queries across nodes.

  8. Application of the LSQR algorithm in non-parametric estimation of aerosol size distribution

    NASA Astrophysics Data System (ADS)

    He, Zhenzong; Qi, Hong; Lew, Zhongyuan; Ruan, Liming; Tan, Heping; Luo, Kun

    2016-05-01

    Based on the Least Squares QR decomposition (LSQR) algorithm, the aerosol size distribution (ASD) is retrieved in a non-parametric approach. The direct problem is solved by the Anomalous Diffraction Approximation (ADA) and the Lambert-Beer Law. An optimal wavelength selection method is developed to improve the retrieval accuracy of the ASD. The optimal wavelength set is chosen so as to make the measurement signals sensitive to wavelength and to effectively reduce the ill-conditioning of the coefficient matrix of the linear system, thereby enhancing the anti-interference ability of the retrieval results. Two common kinds of monomodal and bimodal ASDs, log-normal (L-N) and Gamma distributions, are estimated, respectively. Numerical tests show that the LSQR algorithm can be successfully applied to retrieve the ASD with high stability in the presence of random noise and low susceptibility to the shape of the distributions. Finally, an ASD measured experimentally over Harbin, China, is recovered reasonably well. All the results confirm that the LSQR algorithm combined with the optimal wavelength selection method is an effective and reliable technique for non-parametric estimation of the ASD.
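
    The retrieval solves a discretized linear system τ = K n for the size-distribution vector n. LSQR itself is a Krylov method for min ‖K n − τ‖; the stand-in below uses conjugate gradients on the normal equations (CGNR), which is mathematically close though less numerically stable. The 3 × 3 kernel and measurements are toy values, not a real extinction kernel.

```python
# Minimal CGNR solver as a stand-in for LSQR on the discretized
# retrieval problem tau = K @ n. Toy kernel; no real aerosol physics.

def cgnr(A, b, iters=50):
    n = len(A[0])
    def matvec(M, v):
        return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]
    def mat_t_vec(M, v):
        return [sum(M[i][j] * v[i] for i in range(len(M))) for j in range(n)]
    x = [0.0] * n
    r = [bi - yi for bi, yi in zip(b, matvec(A, x))]   # residual b - Ax
    z = mat_t_vec(A, r)                                # normal-equations gradient
    p = z[:]
    for _ in range(iters):
        Ap = matvec(A, p)
        denom = sum(v * v for v in Ap)
        if denom < 1e-30:
            break
        alpha = sum(v * v for v in z) / denom
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z_new = mat_t_vec(A, r)
        znorm = sum(v * v for v in z_new)
        if znorm < 1e-30:                              # converged
            break
        beta = znorm / sum(v * v for v in z)
        p = [zi + beta * pi for zi, pi in zip(z_new, p)]
        z = z_new
    return x

# Toy 3-wavelength x 3-size-bin kernel and synthetic optical depths.
K = [[1.0, 0.5, 0.2],
     [0.4, 1.0, 0.5],
     [0.1, 0.4, 1.0]]
true_n = [2.0, 1.0, 3.0]
tau = [sum(kij * nj for kij, nj in zip(row, true_n)) for row in K]
print([round(v, 6) for v in cgnr(K, tau)])  # recovers ≈ [2.0, 1.0, 3.0]
```

    In production one would use a vetted implementation such as scipy.sparse.linalg.lsqr rather than hand-rolling the iteration.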

  9. A fuzzy discrete harmony search algorithm applied to annual cost reduction in radial distribution systems

    NASA Astrophysics Data System (ADS)

    Ameli, Kazem; Alfi, Alireza; Aghaebrahimi, Mohammadreza

    2016-09-01

    Like other optimization algorithms, harmony search (HS) is quite sensitive to its tuning parameters. Several variants of the HS algorithm have been developed to reduce the parameter dependence of HS. This article proposes a novel version of the discrete harmony search (DHS) algorithm, namely fuzzy discrete harmony search (FDHS), for optimizing capacitor placement in distribution systems. In the FDHS, a fuzzy system is employed to dynamically adjust two parameter values, i.e. the harmony memory considering rate and the pitch adjusting rate, with respect to the normalized mean fitness of the harmony memory. The key aspect of FDHS is that it needs substantially fewer iterations to reach convergence in comparison with classical discrete harmony search (CDHS). To the authors' knowledge, this is the first application of DHS to specifying appropriate capacitor locations and sizes in distribution systems. Simulations are provided for 10-, 34-, 85- and 141-bus distribution systems using CDHS and FDHS. The results show the effectiveness of FDHS over previous related studies.
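
    The two parameters the fuzzy system tunes — the harmony memory considering rate (HMCR) and the pitch adjusting rate (PAR) — appear directly in the basic HS loop, sketched here with fixed values on an illustrative continuous objective (the paper's actual problem is discrete capacitor placement):

```python
import random

random.seed(3)

def harmony_search(f, dim, lo, hi, hmcr=0.9, par=0.3, hms=10, iters=2000):
    """Basic HS: HMCR picks from memory, PAR perturbs, else a random note."""
    memory = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(hms)]
    for _ in range(iters):
        new = []
        for d in range(dim):
            if random.random() < hmcr:
                value = random.choice(memory)[d]        # memory consideration
                if random.random() < par:
                    value += random.uniform(-0.1, 0.1)  # pitch adjustment
            else:
                value = random.uniform(lo, hi)          # fresh random note
            new.append(min(max(value, lo), hi))
        worst = max(memory, key=f)
        if f(new) < f(worst):                           # replace worst harmony
            memory[memory.index(worst)] = new
    return min(memory, key=f)

sphere = lambda x: sum(v * v for v in x)
best = harmony_search(sphere, dim=3, lo=-5.0, hi=5.0)
print(round(sphere(best), 4))  # near 0 after 2000 improvisations
```

    The fuzzy variant would recompute hmcr and par each iteration from the normalized mean fitness of the memory instead of keeping them fixed.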

  10. Angular distribution of neutrons from deuterated cluster explosions driven by femtosecond laser pulses.

    PubMed

    Buersgens, F; Madison, K W; Symes, D R; Hartke, R; Osterhoff, J; Grigsby, W; Dyer, G; Ditmire, T

    2006-07-01

    We have studied experimentally the angular distributions of fusion neutrons from plasmas of multi-keV ion temperature, created by 40 fs, multi-TW laser pulses in dense plumes of D2 and CD4 clusters. A slight anisotropy in the neutron emission is observed. We attribute this anisotropy to the fact that the differential cross section for DD fusion is anisotropic even at low collision energies, and this, coupled with the geometry of the gas jet target, leads to beam-target neutrons that are slightly directed. The qualitative features of this anisotropy are confirmed by Monte Carlo simulations.

  11. Angular distribution of neutrons from deuterated cluster explosions driven by femtosecond laser pulses

    NASA Astrophysics Data System (ADS)

    Buersgens, F.; Madison, K. W.; Symes, D. R.; Hartke, R.; Osterhoff, J.; Grigsby, W.; Dyer, G.; Ditmire, T.

    2006-07-01

    We have studied experimentally the angular distributions of fusion neutrons from plasmas of multi-keV ion temperature, created by 40 fs, multi-TW laser pulses in dense plumes of D2 and CD4 clusters. A slight anisotropy in the neutron emission is observed. We attribute this anisotropy to the fact that the differential cross section for DD fusion is anisotropic even at low collision energies, and this, coupled with the geometry of the gas jet target, leads to beam-target neutrons that are slightly directed. The qualitative features of this anisotropy are confirmed by Monte Carlo simulations.

  12. Comparison of Genetic Algorithm, Particle Swarm Optimization and Biogeography-based Optimization for Feature Selection to Classify Clusters of Microcalcifications

    NASA Astrophysics Data System (ADS)

    Khehra, Baljit Singh; Pharwaha, Amar Partap Singh

    2016-06-01

    Ductal carcinoma in situ (DCIS) is one type of breast cancer. Clusters of microcalcifications (MCCs) are symptoms of DCIS that are recognized by mammography. Selection of a robust feature vector is the process of selecting an optimal subset of features from a large number of available features in a given problem domain, after feature extraction and before any classification scheme. Feature selection reduces the feature space, which improves the performance of the classifier and decreases the computational burden imposed by using many features. Selection of an optimal subset of features from a large number of available features is a difficult search problem: for n features, the total number of possible subsets is 2^n, so the problem belongs to the category of NP-hard problems. In this paper, an attempt is made to find the optimal subset of MCC features from all possible subsets using the genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO). For simulation, a total of 380 benign and malignant MCC samples have been selected from mammogram images of the DDSM database. A total of 50 features extracted from benign and malignant MCC samples are used in this study. In these algorithms, the fitness function is the correct classification rate of the classifier. A support vector machine is used as the classifier. From the experimental results, it is observed that the performance of the PSO-based and BBO-based algorithms in selecting an optimal subset of features for classifying MCCs as benign or malignant is better than that of the GA-based algorithm.

  13. CNN universal machine as classification platform: an ART-like clustering algorithm.

    PubMed

    Bálya, David

    2003-12-01

    Fast and robust classification of feature vectors is a crucial task in a number of real-time systems. A cellular neural/nonlinear network universal machine (CNN-UM) can be very efficient as a feature detector. The next step is to post-process the results for object recognition. This paper shows how a robust classification scheme based on adaptive resonance theory (ART) can be mapped to the CNN-UM. Moreover, this mapping is general enough to include different types of feed-forward neural networks. The designed analogic CNN algorithm is capable of classifying the extracted feature vectors while keeping the advantages of ART networks, such as robust, plastic and fault-tolerant behavior. An analogic algorithm is presented for unsupervised classification with tunable sensitivity and automatic new-class creation, and the algorithm is extended to supervised classification. The presented binary feature vector classification is implemented on existing standard CNN-UM chips for fast classification. The experimental evaluation shows promising performance, with 100% accuracy on the training set.

  14. On Microscopic Mechanisms Which Elongate the Tail of Cluster Size Distributions: An Example of Random Domino Automaton

    NASA Astrophysics Data System (ADS)

    Czechowski, Zbigniew

    2015-07-01

    On the basis of a simple cellular automaton, the microscopic mechanisms that can be responsible for the elongation of tails of cluster size distributions were analyzed. It was shown that only appropriate forms of the rebound function can lead to inverse-power tails if the densities of the grid are small or moderate. For large densities, correlations between clusters become significant and lead to elongation of tails and flattening of the distribution to a straight line in log-log scale. The microscopic mechanism, given by the rebound function, included in the simple 1D RDA can be projected onto the geometric mechanism, which favours larger clusters in 2D RDA.

  15. Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment

    NASA Astrophysics Data System (ADS)

    Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.

    2014-11-01

    Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods that have been recently employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and Silhouette width validation values and the K means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source and the GAM was suitable to parameterise the PNSD data. These two techniques can help
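
    A minimal sketch of the winning combination — K-means plus the silhouette width used for validation — on toy 1-D stand-ins for the parameterised spectra (the data, deterministic seeding and cluster count are illustrative assumptions):

```python
# K-means with deterministic spread seeding, plus mean silhouette width,
# one of the validation indices used to compare clustering techniques.

def kmeans(xs, k, iters=100):
    centers = [xs[i * (len(xs) - 1) // (k - 1)] for i in range(k)]  # spread seeds
    labels = [0] * len(xs)
    for _ in range(iters):
        # Assign each point to its nearest centre, then recompute centres.
        labels = [min(range(k), key=lambda j: abs(x - centers[j])) for x in xs]
        for j in range(k):
            members = [x for x, l in zip(xs, labels) if l == j]
            if members:
                centers[j] = sum(members) / len(members)
    return labels

def silhouette(xs, labels):
    def mean_dist(x, group):
        return sum(abs(x - g) for g in group) / len(group)
    scores = []
    for i, (x, l) in enumerate(zip(xs, labels)):
        own = [y for j, (y, m) in enumerate(zip(xs, labels)) if m == l and j != i]
        if not own:                          # singleton cluster: define s = 0
            scores.append(0.0)
            continue
        a = mean_dist(x, own)                # mean intra-cluster distance
        b = min(mean_dist(x, [y for y, m in zip(xs, labels) if m == c])
                for c in set(labels) if c != l)  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

xs = [0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 9.8, 9.9, 10.0]
labels = kmeans(xs, 3)
print(round(silhouette(xs, labels), 3))  # close to 1 for well-separated clusters
```

    In the study itself, PAM, CLARA and SOM were evaluated with the same kind of validation indices before K-means was selected.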

  16. Assessment and application of clustering techniques to atmospheric particle number size distribution for the purpose of source apportionment

    NASA Astrophysics Data System (ADS)

    Salimi, F.; Ristovski, Z.; Mazaheri, M.; Laiman, R.; Crilley, L. R.; He, C.; Clifford, S.; Morawska, L.

    2014-06-01

    Long-term measurements of particle number size distribution (PNSD) produce a very large number of observations and their analysis requires an efficient approach in order to produce results in the least possible time and with maximum accuracy. Clustering techniques are a family of sophisticated methods that have recently been employed to analyse PNSD data; however, very little information is available comparing the performance of different clustering techniques on PNSD data. This study aims to apply several clustering techniques (i.e. K-means, PAM, CLARA and SOM) to PNSD data, in order to identify and apply the optimum technique to PNSD data measured at 25 sites across Brisbane, Australia. A new method, based on the Generalised Additive Model (GAM) with a basis of penalised B-splines, was proposed to parameterise the PNSD data and the temporal weight of each cluster was also estimated using the GAM. In addition, each cluster was associated with its possible source based on the results of this parameterisation, together with the characteristics of each cluster. The performances of four clustering techniques were compared using the Dunn index and silhouette width validation values and the K-means technique was found to have the highest performance, with five clusters being the optimum. Therefore, five clusters were found within the data using the K-means technique. The diurnal occurrence of each cluster was used together with other air quality parameters, temporal trends and the physical properties of each cluster, in order to attribute each cluster to its source and origin. The five clusters were attributed to three major sources and origins, including regional background particles, photochemically induced nucleated particles and vehicle generated particles. Overall, clustering was found to be an effective technique for attributing each particle size spectrum to its source and the GAM was suitable to parameterise the PNSD data. These two techniques can help

  17. UV TO FAR-IR CATALOG OF A GALAXY SAMPLE IN NEARBY CLUSTERS: SPECTRAL ENERGY DISTRIBUTIONS AND ENVIRONMENTAL TRENDS

    SciTech Connect

    Hernandez-Fernandez, Jonathan D.; Iglesias-Paramo, J.; Vilchez, J. M.

    2012-03-01

    In this paper, we present a sample of cluster galaxies assembled to study the environmental influence on star formation activity. The galaxies in this sample inhabit clusters with a rich variety of characteristics and have been observed by the SDSS-DR6 down to M{sub B} {approx} -18, and by the Galaxy Evolution Explorer AIS throughout sky regions corresponding to several megaparsecs. We assign broadband and emission-line fluxes from ultraviolet to far-infrared to each galaxy, building an accurate spectral energy distribution for spectral-fitting analysis. The clusters follow the general X-ray luminosity versus velocity dispersion trend of L{sub X} {proportional_to} {sigma}{sup 4.4}{sub c}. The analysis of the distributions of galaxy density counting up to the 5th nearest neighbor {Sigma}{sub 5} shows: (1) the virial regions and the cluster outskirts share a common range in the high density part of the distribution. This can be attributed to the presence of massive galaxy structures in the surroundings of the virial regions. (2) The virial regions of massive clusters ({sigma}{sub c} > 550 km s{sup -1}) present a {Sigma}{sub 5} distribution statistically distinguishable ({approx}96%) from the corresponding distribution of low-mass clusters ({sigma}{sub c} < 550 km s{sup -1}). Both massive and low-mass clusters follow a similar density-radius trend, but the low-mass clusters avoid the high density extreme. We illustrate, with ABELL 1185, the environmental trends of galaxy populations. Maps of sky-projected galaxy density show how low-luminosity star-forming galaxies appear distributed along more spread-out structures than their giant counterparts, whereas low-luminosity passive galaxies avoid the low-density environment. Giant passive and star-forming galaxies share rather similar sky regions, with passive galaxies exhibiting more concentrated distributions.
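
    The {Sigma}{sub 5} estimator used above is simple to state: the projected density out to the fifth nearest neighbour, {Sigma}{sub n} = n / ({pi} d{sub n}{sup 2}). A minimal sketch (not the authors' code; it assumes projected Cartesian positions, e.g. in Mpc, and ignores survey edge effects):

```python
import numpy as np

def sigma_n(positions, n=5):
    """Projected density to the n-th nearest neighbour of each point:
    Sigma_n = n / (pi * d_n**2), with d_n the projected distance to the
    n-th nearest neighbour (self-distance excluded)."""
    pos = np.asarray(positions, dtype=float)
    d = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=2)
    d.sort(axis=1)           # column 0 is the zero self-distance
    d_n = d[:, n]            # n-th neighbour, since index 0 is the point itself
    return n / (np.pi * d_n ** 2)
```

    Comparing the distribution of these values inside and outside the virial radius is what yields trends like points (1) and (2) above.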

  18. DETERMINING ALL GAS PROPERTIES IN GALAXY CLUSTERS FROM THE DARK MATTER DISTRIBUTION ALONE

    SciTech Connect

    Frederiksen, Teddy F.; Hansen, Steen H.; Host, Ole; Roncadelli, Marco

    2009-08-01

    We demonstrate that all properties of the hot X-ray emitting gas in galaxy clusters are completely determined by the underlying dark matter (DM) structure. Apart from the standard conditions of spherical symmetry and hydrostatic equilibrium for the gas, our proof is based on the Jeans equation for the DM and two simple relations which have recently emerged from numerical simulations: the equality of the gas and DM temperatures, and the almost linear relation between the DM velocity anisotropy profile and its density slope. For DM distributions described by the Navarro-Frenk-White or the Sersic profiles, the resulting gas density profile, the gas-to-total-mass ratio profile, and the entropy profile are all in good agreement with X-ray observations. All these profiles are derived using zero free parameters. Our result allows us to predict the X-ray luminosity profile of a cluster in terms of its DM content alone. As a consequence, a new strategy becomes available to constrain the DM morphology in galaxy clusters from X-ray observations. Our results can also be used as a practical tool for creating initial conditions for realistic cosmological structures to be used in numerical simulations.
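
    The logical chain in the abstract (hydrostatic equilibrium plus a temperature tied to the DM in an NFW halo) can be sketched in dimensionless form. The code below is only an illustrative toy under the stronger assumption of strictly isothermal gas, not the paper's full method (which uses the DM temperature profile and the anisotropy-slope relation); for the isothermal case a closed-form solution, rho_gas proportional to e^(-A) (1+x)^(A/x) (Makino et al. 1998), is available as a cross-check. The shape parameter A is a stand-in collecting G, rho_s, r_s, mu m_p and kT.

```python
import numpy as np

def nfw_mass(x):
    """Dimensionless NFW enclosed mass: m(x) = ln(1 + x) - x / (1 + x),
    with x = r / r_s."""
    return np.log1p(x) - x / (1.0 + x)

def hydrostatic_gas_density(x, A):
    """Isothermal hydrostatic gas density in an NFW halo, normalised to 1
    at x[0].  Integrates d ln(rho_gas)/dx = -A * m(x) / x**2 with the
    trapezoid rule on the (monotonic) grid x."""
    g = -A * nfw_mass(x) / x ** 2
    lnrho = np.concatenate(
        ([0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * np.diff(x))))
    return np.exp(lnrho)
```

    From such a gas density profile one can then derive the gas-to-total-mass ratio and, with an emissivity model, an X-ray luminosity profile, which is the spirit of the zero-free-parameter prediction described above.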

  19. CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    NASA Astrophysics Data System (ADS)

    Komura, Yukihiro; Okabe, Yutaka

    2014-03-01

    We present sample CUDA programs for the GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. We deal with the classical spin models; the Ising model, the q-state Potts model, and the classical XY model. As for the lattice, both the 2D (square) lattice and the 3D (simple cubic) lattice are treated. We already reported the idea of the GPU implementation for 2D models (Komura and Okabe, 2012). We here explain the details of sample programs, and discuss the performance of the present GPU implementation for the 3D Ising and XY models. We also show the calculated results of the moment ratio for these models, and discuss phase transitions.
    Catalogue identifier: AERM_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AERM_v1_0.html
    Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 5632
    No. of bytes in distributed program, including test data, etc.: 14688
    Distribution format: tar.gz
    Programming language: C, CUDA.
    Computer: System with an NVIDIA CUDA enabled GPU.
    Operating system: System with an NVIDIA CUDA enabled GPU.
    Classification: 23.
    External routines: NVIDIA CUDA Toolkit 3.0 or newer
    Nature of problem: Monte Carlo simulation of classical spin systems. Ising, q-state Potts model, and the classical XY model are treated for both two-dimensional and three-dimensional lattices.
    Solution method: GPU-based Swendsen-Wang multi-cluster spin flip Monte Carlo method. The CUDA implementation for the cluster-labeling is based on the work by Hawick et al. [1] and that by Kalentev et al. [2].
    Restrictions: The system size is limited depending on the memory of a GPU.
    Running time: For the parameters used in the sample programs, it takes about a minute for each program. Of course, it depends on the system size, the number of Monte Carlo steps, etc.
    References: [1] K
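
    For readers unfamiliar with the algorithm itself, one Swendsen-Wang sweep for the 2D Ising model can be sketched on the CPU in plain Python with union-find cluster labeling. This is a reference sketch, not the GPU code described above; it assumes J = 1, periodic boundaries and +/-1 spins.

```python
import numpy as np

def swendsen_wang_step(spins, beta, rng):
    """One Swendsen-Wang update of an L x L Ising configuration (in place).
    Bond-freezing probability between equal neighbours: p = 1 - exp(-2*beta)."""
    L = spins.shape[0]
    p = 1.0 - np.exp(-2.0 * beta)
    parent = np.arange(L * L)

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj

    # freeze bonds between aligned neighbouring spins with probability p
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for j in (x * L + (y + 1) % L, ((x + 1) % L) * L + y):
                if spins.flat[i] == spins.flat[j] and rng.random() < p:
                    union(i, j)

    # flip each cluster independently with probability 1/2
    flip = rng.random(L * L) < 0.5
    for i in range(L * L):
        if flip[find(i)]:
            spins.flat[i] *= -1
    return spins
```

    The GPU versions replace the serial union-find with parallel cluster-labeling kernels (Hawick et al.; Kalentev et al.), but the Monte Carlo step is the same.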

  20. Improved CUDA programs for GPU computing of Swendsen-Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

    NASA Astrophysics Data System (ADS)

    Komura, Yukihiro; Okabe, Yutaka

    2016-03-01

    We present new versions of the sample CUDA programs for GPU computing of the Swendsen-Wang multi-cluster spin flip algorithm. In this update, we add to those programs the GPU-based cluster-labeling algorithm that avoids conventional iteration (Komura, 2015). For high-precision calculations, we also add a random-number generator from the cuRAND library. Moreover, we fix several bugs and remove the extra usage of shared memory in the kernel functions.