Mass distributed clustering: a new algorithm for repeated measurements in gene expression data.
Matsumoto, Shinya; Aisaki, Ken-ichi; Kanno, Jun
2005-01-01
The availability of whole-genome sequence data and high-throughput techniques such as DNA microarrays enables researchers to monitor the alteration of gene expression in a given organ or tissue in a comprehensive manner. The quantity of gene expression data can be greater than 30,000 genes per measurement, making data clustering methods essential for analysis. Biologists usually design experimental protocols so that statistical significance can be evaluated; often, they conduct experiments in triplicate to generate a mean and standard deviation. Existing clustering methods usually use these mean or median values, rather than the original data, and take significance into account by omitting data showing large standard deviations, which eliminates potentially useful information. We propose a clustering method that uses each of the triplicate data sets as a probability distribution function instead of pooling data points into a median or mean. This method permits truly unsupervised clustering of the data from DNA microarrays. PMID:16901101
Gao, Ying; Wkram, Chris Hadri; Duan, Jiajie; Chou, Jarong
2015-01-01
In order to prolong the network lifetime, energy-efficient protocols adapted to the features of wireless sensor networks should be used. This paper explores in depth the nature of heterogeneous wireless sensor networks, and finally proposes an algorithm to address the problem of finding an effective pathway for heterogeneous clustering energy. The proposed algorithm implements cluster head selection according to the degree of energy attenuation during the network’s running and the degree of candidate nodes’ effective coverage on the whole network, so as to obtain an even energy consumption over the whole network for the situation with high degree of coverage. Simulation results show that the proposed clustering protocol has better adaptability to heterogeneous environments than existing clustering algorithms in prolonging the network lifetime. PMID:26690440
A clustering routing algorithm based on improved ant colony clustering for wireless sensor networks
NASA Astrophysics Data System (ADS)
Xiao, Xiaoli; Li, Yang
Because of the node distribution characteristics of real wireless sensor networks, this paper presents a clustering strategy based on the ant colony clustering algorithm (ACC-C). To reduce the energy consumption of the cluster heads near the base station and of the whole network, the algorithm applies ant colony clustering to non-uniform clustering. An improved route optimal degree is presented to evaluate the performance of the chosen route. Simulation results show that, compared with other algorithms such as LEACH and an improved particle-cluster clustering algorithm (PSC-C), the proposed approach is able to avoid nodes with less residual energy, which can extend the lifetime of the network.
Cluster algorithms and computational complexity
NASA Astrophysics Data System (ADS)
Li, Xuenan
Cluster algorithms for the 2D Ising model with a staggered field have been studied and a new cluster algorithm for path sampling has been worked out. The complexity properties of the Bak-Sneppen model and the Growing Network model have been studied using computational complexity theory. The dynamic critical behavior of the two-replica cluster algorithm is studied. Several versions of the algorithm are applied to the two-dimensional, square lattice Ising model with a staggered field. The dynamic exponent for the full algorithm is found to be less than 0.5. It is found that odd translations of one replica with respect to the other together with global flips are essential for obtaining a small value of the dynamic exponent. The path sampling problem for the 1D Ising model is studied using both a local algorithm and a novel cluster algorithm. The local algorithm is extremely inefficient at low temperature, where the integrated autocorrelation time is found to be proportional to the fourth power of correlation length. The dynamic exponent of the cluster algorithm is found to be zero and therefore proved to be much more efficient than the local algorithm. The parallel computational complexity of the Bak-Sneppen evolution model is studied. It is shown that Bak-Sneppen histories can be generated by a massively parallel computer in a time that is polylog in the length of the history, which means that the logical depth of producing a Bak-Sneppen history is exponentially less than the length of the history. The parallel dynamics for generating Bak-Sneppen histories is contrasted to standard Bak-Sneppen dynamics. The parallel computational complexity of the Growing Network model is studied. The growth of the network with linear kernels is shown to be not complex and an algorithm with polylog parallel running time is found. The growth of the network with gamma ≥ 2 super-linear kernels can be realized by a randomized parallel algorithm with polylog expected running time.
Energy Aware Clustering Algorithms for Wireless Sensor Networks
NASA Astrophysics Data System (ADS)
Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian
2011-09-01
The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire network is a primary design consideration. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy-efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and compare them based on metrics such as clustering distribution, cluster load balancing, cluster head (CH) selection strategy, CH role rotation, node mobility, cluster overlap, intra-cluster communications, reliability, security and location awareness.
Overlapping clusters for distributed computation.
Mirrokni, Vahab; Andersen, Reid; Gleich, David F.
2010-11-01
Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al., Nature 2009; Mishra et al., WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability ρ_∞; and (ii) the communication volume of a parallel PageRank solution (link-following α = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decrease.
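For readers unfamiliar with the PageRank computation mentioned in this abstract, the following is a minimal single-machine sketch of plain power iteration with link-following probability 0.85. It is not the authors' additive Schwarz solver and involves no overlapping partitions; the function name `pagerank`, the dictionary-based graph format, and the three-node example graph are assumptions made purely for illustration.

```python
def pagerank(out_links, alpha=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank on a dict {node: [successors]}.

    Hypothetical helper for illustration; every successor must also be a key.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        for v in nodes:
            for w in out_links[v]:
                # Each node passes an equal share of its rank to its successors.
                new[w] += alpha * rank[v] / len(out_links[v])
        if sum(abs(new[v] - rank[v]) for v in nodes) < tol:
            return new
        rank = new
    return rank


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

In a distributed setting, the inner loop over out-links is where ranks cross partition boundaries, which is the communication the abstract aims to reduce.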
Sparse subspace clustering: algorithm, theory, and applications.
Elhamifar, Ehsan; Vidal, René
2013-11-01
Many real-world problems deal with collections of high-dimensional data, such as images, videos, text, and web documents, DNA microarray data, and more. Often, such high-dimensional data lie close to low-dimensional structures corresponding to several classes or categories to which the data belong. In this paper, we propose and study an algorithm, called sparse subspace clustering, to cluster data points that lie in a union of low-dimensional subspaces. The key idea is that, among the infinitely many possible representations of a data point in terms of other points, a sparse representation corresponds to selecting a few points from the same subspace. This motivates solving a sparse optimization program whose solution is used in a spectral clustering framework to infer the clustering of the data into subspaces. Since solving the sparse optimization program is in general NP-hard, we consider a convex relaxation and show that, under appropriate conditions on the arrangement of the subspaces and the distribution of the data, the proposed minimization program succeeds in recovering the desired sparse representations. The proposed algorithm is efficient and can handle data points near the intersections of subspaces. Another key advantage of the proposed algorithm with respect to the state of the art is that it can deal directly with data nuisances, such as noise, sparse outlying entries, and missing entries, by incorporating the model of the data into the sparse optimization program. We demonstrate the effectiveness of the proposed algorithm through experiments on synthetic data as well as the two real-world problems of motion segmentation and face clustering. PMID:24051734
Cluster compression algorithm: A joint clustering/data compression concept
NASA Technical Reports Server (NTRS)
Hilbert, E. E.
1977-01-01
The Cluster Compression Algorithm (CCA), which was developed to reduce costs associated with transmitting, storing, distributing, and interpreting LANDSAT multispectral image data, is described. The CCA is a preprocessing algorithm that uses feature extraction and data compression to more efficiently represent the information in the image data. The format of the preprocessed data enables simple look-up table decoding and direct use of the extracted features to reduce user computation for either image reconstruction or computer interpretation of the image data. Basically, the CCA uses spatially local clustering to extract features from the image data to describe spectral characteristics of the data set. In addition, the features may be used to form a sequence of scalar numbers that define each picture element in terms of the cluster features. This sequence, called the feature map, is then efficiently represented by using source encoding concepts. Various forms of the CCA are defined and experimental results are presented to show trade-offs and characteristics of the various implementations. Examples are provided that demonstrate the application of the cluster compression concept to multi-spectral images from LANDSAT and other sources.
Basic firefly algorithm for document clustering
NASA Astrophysics Data System (ADS)
Mohammed, Athraa Jasim; Yusof, Yuhanis; Husni, Husniza
2015-12-01
Document clustering plays a significant role in Information Retrieval (IR), where it organizes documents prior to the retrieval process. To date, various clustering algorithms have been proposed, including K-means and Particle Swarm Optimization (PSO). Even though these algorithms have been widely applied in many disciplines due to their simplicity, such approaches tend to be trapped in a local minimum during the search for an optimal solution. To address this shortcoming, this paper proposes a Basic Firefly (Basic FA) algorithm to cluster text documents. The algorithm employs the Average Distance to Document Centroid (ADDC) as the objective function of the search. Experiments utilizing the proposed algorithm were conducted on the 20Newsgroups benchmark dataset. Results demonstrate that the Basic FA generates more robust and compact clusters than the ones produced by K-means and PSO.
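The ADDC objective named in this abstract can be sketched as follows. This is a minimal illustration assuming the commonly used definition (the mean, over clusters, of the average distance from each document vector to its cluster centroid) and Euclidean distance; the abstract does not state the distance measure, and the helper names `centroid` and `addc` as well as the toy vectors are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def addc(clusters):
    """Average Distance to Document Centroid (assumed definition).

    clusters: list of non-empty lists of document vectors.
    Lower values indicate more compact clusters.
    """
    total = 0.0
    for docs in clusters:
        c = centroid(docs)
        # Average Euclidean distance of this cluster's documents to its centroid.
        total += sum(math.dist(c, d) for d in docs) / len(docs)
    return total / len(clusters)


# Two alternative groupings of the same four toy "documents".
tight = [[[0.0, 0.0], [0.0, 2.0]], [[5.0, 5.0], [7.0, 5.0]]]
loose = [[[0.0, 0.0], [7.0, 5.0]], [[0.0, 2.0], [5.0, 5.0]]]
```

A search procedure such as the firefly algorithm would then prefer the grouping with the smaller ADDC value, here `tight`.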
Self-organization and clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.
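The difference between hard and fuzzy c-means discussed in this abstract shows up most clearly in the membership step. The sketch below assumes the standard fuzzy c-means membership update for fixed cluster centers, with Euclidean distance and fuzzifier m > 1; the function names and example points are illustrative only and are not taken from the paper.

```python
import math


def hard_memberships(point, centers):
    """Hard c-means: the nearest center receives the whole membership."""
    dists = [math.dist(point, c) for c in centers]
    nearest = dists.index(min(dists))
    return [1.0 if i == nearest else 0.0 for i in range(len(centers))]


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership update for fixed centers (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    if 0.0 in dists:
        # The point coincides with a center: fall back to the hard assignment.
        return hard_memberships(point, centers)
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))
    return [1.0 / sum((di / dj) ** power for dj in dists) for di in dists]


centers = [[0.0, 0.0], [4.0, 0.0]]
u = fuzzy_memberships([1.0, 0.0], centers)
h = hard_memberships([1.0, 0.0], centers)
```

With these inputs the fuzzy memberships are split roughly 0.9 to 0.1 between the two centers, whereas the hard assignment gives everything to the nearer one.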
Hierarchical link clustering algorithm in networks
NASA Astrophysics Data System (ADS)
Bodlaj, Jernej; Batagelj, Vladimir
2015-06-01
Hierarchical network clustering is an approach to find tightly and internally connected clusters (groups or communities) of nodes in a network based on its structure. Instead of nodes, it is possible to cluster links of the network. The sets of nodes belonging to clusters of links can overlap. While overlapping clusters of nodes are not always expected, they are natural in many applications. Using appropriate dissimilarity measures, we can complement the clustering strategy to consider, for example, the semantic meaning of links or nodes based on their properties. We propose a new hierarchical link clustering algorithm which in comparison to existing algorithms considers node and/or link properties (descriptions, attributes) of the input network alongside its structure using monotonic dissimilarity measures. The algorithm determines communities that form connected subnetworks (relational constraint) containing locally similar nodes with respect to their description. It is only implicitly based on the corresponding line graph of the input network, thus reducing its space and time complexities. We investigate both complexities analytically and statistically. Using provided dissimilarity measures, our algorithm can, in addition to the general overlapping community structure of input networks, uncover also related subregions inside these communities in a form of hierarchy. We demonstrate this ability on real-world and artificial network examples. PMID:26172761
Color sorting algorithm based on K-means clustering algorithm
NASA Astrophysics Data System (ADS)
Zhang, BaoFeng; Huang, Qian
2009-11-01
In raisin production, a variety of color impurities occur and need to be removed effectively. A new, efficient raisin color-sorting algorithm is presented here. First, threshold-based image processing was applied for image pre-processing, and the gray-scale distribution characteristic of the raisin image was found. In order to obtain the chromatic aberration image and reduce disturbance, we performed frame subtraction, subtracting the background image data from the target image data. Second, a Haar wavelet filter was used to obtain a smoothed image of the raisins. Based on differences in color, mildew, spots and other external features, image characteristics were computed so that they fully reflect the quality differences between raisins of different types. After this processing, the images were analyzed by the K-means clustering method, which achieves adaptive extraction of the statistical features; according to these features, the image data were divided into different categories, so that the categories of abnormal colors became distinct. By using this algorithm, raisins of abnormal colors and those with mottles were eliminated. The sorting rate was up to 98.6%, and the ratio of normal raisins to sorted grains was less than one eighth.
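The K-means step in this pipeline can be illustrated on scalar gray-level features. The following is a minimal sketch of Lloyd's algorithm on one-dimensional values; the feature values, the initial centers, and the function name `kmeans_1d` are assumptions for illustration, and the thresholding, frame subtraction, and Haar wavelet stages described in the abstract are not reproduced.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Lloyd's algorithm on scalars; `centers` holds the initial guesses."""
    centers = list(centers)

    def nearest(v):
        # Index of the center closest to value v.
        return min(range(len(centers)), key=lambda i: abs(v - centers[i]))

    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v)].append(v)
        # Move each center to the mean of its group; empty groups stay put.
        updated = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    labels = [nearest(v) for v in values]
    return centers, labels


# Hypothetical mean gray levels of seven raisin regions.
gray = [52, 55, 60, 58, 200, 210, 205]
centers, labels = kmeans_1d(gray, centers=[0.0, 255.0])
```

Here the dark and bright regions separate into two groups; in a sorting application one of the groups would correspond to the abnormal-color class.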
An algorithm for spatial heirarchy clustering
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Velasco, F. R. D.
1981-01-01
A method for utilizing both spectral and spatial redundancy in compacting and preclassifying images is presented. In multispectral satellite images, a high correlation exists between neighboring image points which tend to occupy dense and restricted regions of the feature space. The image is divided into windows of the same size where the clustering is made. The classes obtained in several neighboring windows are clustered, and then again successively clustered until only one region corresponding to the whole image is obtained. By employing this algorithm only a few points are considered in each clustering, thus reducing computational effort. The method is illustrated as applied to LANDSAT images.
Coupled cluster algorithms for networks of shared memory parallel processors
NASA Astrophysics Data System (ADS)
Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.
2007-05-01
As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too does the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques, to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry: the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm are presented on several large-scale clusters of SMPs.
Performance Comparison Of Evolutionary Algorithms For Image Clustering
NASA Astrophysics Data System (ADS)
Civicioglu, P.; Atasever, U. H.; Ozkan, C.; Besdok, E.; Karkinli, A. E.; Kesikoglu, A.
2014-09-01
Evolutionary computation tools are able to process real-valued numerical sets in order to extract a suboptimal solution of a designed problem. Data clustering algorithms have been intensively used for image segmentation in remote sensing applications. Despite the wide usage of evolutionary algorithms in data clustering, their clustering performance has scarcely been studied using clustering validation indexes. In this paper, recently proposed evolutionary algorithms (i.e., Artificial Bee Colony Algorithm (ABC), Gravitational Search Algorithm (GSA), Cuckoo Search Algorithm (CS), Adaptive Differential Evolution Algorithm (JADE), Differential Search Algorithm (DSA) and Backtracking Search Optimization Algorithm (BSA)) and some classical image clustering techniques (i.e., k-means, fcm, som networks) have been used to cluster images, and their performances have been compared using four clustering validation indexes. Experimental results showed that evolutionary algorithms give more reliable cluster centers than classical clustering techniques, but their convergence time is quite long.
Parallel Clustering Algorithms for Structured AMR
Gunney, B T; Wissink, A M; Hysom, D A
2005-10-26
We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.
Fusion and clustering algorithms for spatial data
NASA Astrophysics Data System (ADS)
Kuntala, Pavani
Spatial clustering is an approach for discovering groups of related data points in spatial data. Spatial clustering has attracted a lot of research attention due to various applications where it is needed. It holds practical importance in application domains such as geographic knowledge discovery, sensors, rare disease discovery, astronomy, remote sensing, and so on. The motivation for this work stems from the limitations of the existing spatial clustering methods. In most conventional spatial clustering algorithms, the similarity measurement mainly considers the geometric attributes. However, in many real applications, users are concerned about both the spatial and the non-spatial attributes. In conventional spatial clustering, the input data set is partitioned into several compact regions and data points that are similar to one another in their non-spatial attributes may be scattered over different regions, thus making the corresponding objective difficult to achieve. In this dissertation, a novel clustering methodology is proposed to explore the clustering problem within both spatial and non-spatial domains by employing a fusion-based approach. The goal is to optimize a given objective function in the spatial domain, while satisfying the constraint specified in the non-spatial attribute domain. Several experiments are conducted to provide insights into the proposed methodology. The algorithm first captures the spatial cores having the highest structure and then employs an iterative, heuristic mechanism to find the optimal number of spatial cores and non-spatial clusters that exist in the data. Such a fusion-based framework allows for the handling of data streams and provides a framework for comparing spatial clusters. The correctness and efficiency of the proposed clustering model are demonstrated on real world and synthetic data sets.
A dynamic clustering algorithm in wireless sensor networks
NASA Astrophysics Data System (ADS)
Wang, Rui; Liang, Yan; Pan, Quan; Wang, Quan; Cheng, Yongmei
2005-11-01
It is essential to prolong the lifetime of wireless sensor networks (WSNs) via effective cooperation of their sensor nodes. Here, a dynamic clustering algorithm, named DCA, is presented to optimally and dynamically select the micro-sensor nodes to construct a dynamic sensor cluster at each time based on an integrated performance index including information acquisition and energy consumption. In distributed target tracking with WSNs, the DCA can avoid the problem of "too frequent cluster head (CH) switches", save more than 80% of the energy, and maintain almost the same tracking accuracy compared with information-driven sensor querying (IDSQ).
Genetic algorithm optimization of atomic clusters
Morris, J.R.; Deaven, D.M.; Ho, K.M.; Wang, C.Z.; Pan, B.C.; Wacker, J.G.; Turner, D.E.
1996-12-31
The authors have been using genetic algorithms to study the structures of atomic clusters and related problems. This is a problem where local minima are easy to locate, but barriers between the many minima are large, and the number of minima prohibits a systematic search. They use a novel mating algorithm that preserves some of the geometrical relationship between atoms, in order to ensure that the resultant structures are likely to inherit the best features of the parent clusters. Using this approach, they have been able to find lower energy structures than had been previously obtained. Most recently, they have been able to turn around the building block idea, using optimized structures from the GA to learn about systematic structural trends. They believe that an effective GA can help provide such heuristic information, and (conversely) that such information can be introduced back into the algorithm to assist in the search process.
Open cluster membership probability based on K-means clustering algorithm
NASA Astrophysics Data System (ADS)
El Aziz, Mohamed Abd; Selim, I. M.; Essam, A.
2016-05-01
In the field of galaxy images, the relative coordinate positions of each star with respect to all the other stars are adapted. Therefore, the membership of a star cluster is adapted by two basic criteria, one for geometric membership and the other for physical (photometric) membership. In this paper, we present a new method for the determination of open cluster membership based on the K-means clustering algorithm. This algorithm allows us to efficiently discriminate the cluster members from the field stars. To validate the method, we applied it to NGC 188 and NGC 2266, and member stars in these clusters have been obtained. The color-magnitude diagram of the member stars is significantly clearer and shows a well-defined main sequence and a red giant branch in NGC 188, which allows us to better constrain the cluster members and estimate their physical parameters. The membership probabilities have been calculated and compared to those obtained by other methods. The results show that the K-means clustering algorithm can effectively select probable member stars in space without any assumption about the spatial distribution of stars in the cluster or field. Our results are in good agreement with those derived in previous works.
Chaotic map clustering algorithm for EEG analysis
NASA Astrophysics Data System (ADS)
Bellotti, R.; De Carlo, F.; Stramaglia, S.
2004-03-01
The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, the chaotic map clustering gives natural evidence of the pathological class, without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.
A Cross Unequal Clustering Routing Algorithm for Sensor Network
NASA Astrophysics Data System (ADS)
Tong, Wang; Jiyi, Wu; He, Xu; Jinghua, Zhu; Munyabugingo, Charles
2013-08-01
In clustering routing algorithms for wireless sensor networks, the cluster size is generally fixed, which can easily lead to the "hot spot" problem. Furthermore, the majority of routing algorithms barely consider the problem of long-distance communication between adjacent cluster heads, which brings high energy consumption. Therefore, this paper proposes a new cross unequal clustering routing algorithm based on the EEUC algorithm. To address the defects of the EEUC algorithm, the calculation of the competition radius takes the node's position and remaining energy into account to make the load of cluster heads more balanced. At the same time, cluster adjacent nodes are used to transport data and reduce the energy loss of cluster heads. Simulation experiments show that, compared with LEACH and EEUC, the proposed algorithm can effectively reduce the energy loss of cluster heads, balance the energy consumption among all nodes in the network, and improve the network lifetime.
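As background for the competition radius mentioned in this abstract, the sketch below shows the position-based radius as it is commonly stated for the original EEUC algorithm: candidate cluster heads closer to the base station get a smaller radius, which yields smaller clusters near the "hot spot". The energy-aware modification proposed in the paper is not given in the abstract and is not reproduced here; the function name, the default values of `r0` and `c`, and the distances in the example are assumptions.

```python
def eeuc_radius(d_to_bs, d_min, d_max, r0=90.0, c=0.5):
    """Position-based competition radius (assumed form of the EEUC rule).

    d_to_bs: distance from the candidate node to the base station
    d_min, d_max: smallest and largest node-to-base-station distances
    r0: maximum competition radius (hypothetical default)
    c: weighting constant between 0 and 1 (hypothetical default)
    """
    return (1.0 - c * (d_max - d_to_bs) / (d_max - d_min)) * r0


# Hypothetical network: nodes lie between 20 m and 220 m from the base station.
near = eeuc_radius(20.0, d_min=20.0, d_max=220.0)
middle = eeuc_radius(120.0, d_min=20.0, d_max=220.0)
far = eeuc_radius(220.0, d_min=20.0, d_max=220.0)
```

The radius grows monotonically with distance from the base station, from (1 - c) * r0 at the nearest node up to r0 at the farthest.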
Random networks with tunable degree distribution and clustering
NASA Astrophysics Data System (ADS)
Volz, Erik
2004-11-01
We present an algorithm for generating random networks with arbitrary degree distribution and clustering (frequency of triadic closure). We use this algorithm to generate networks with exponential, power law, and Poisson degree distributions with variable levels of clustering. Such networks may be used as models of social networks and as a testable null hypothesis about network structure. Finally, we explore the effects of clustering on the point of the phase transition where a giant component forms in a random network, and on the size of the giant component. Some analysis of these effects is presented.
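The clustering level referred to in this abstract, the frequency of triadic closure, can be measured on any undirected graph as the fraction of connected triples that are closed into triangles. The following sketch only measures that quantity on a small hand-built graph; it does not implement the network generation algorithm itself, and the adjacency format and the function name `transitivity` are assumptions.

```python
from itertools import combinations


def transitivity(adj):
    """Fraction of connected triples that are closed into triangles.

    adj: dict {node: set(neighbors)} for an undirected simple graph.
    """
    closed = 0
    triples = 0
    for node, nbrs in adj.items():
        for u, w in combinations(nbrs, 2):
            triples += 1          # path u - node - w centered on `node`
            if w in adj[u]:
                closed += 1       # u and w are also linked: triadic closure
    return closed / triples if triples else 0.0


# A triangle (1, 2, 3) with one pendant node attached to node 3.
adj = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
level = transitivity(adj)
```

In this example five connected triples exist and three of them are closed, so the measured clustering is 0.6.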
Improved Ant Colony Clustering Algorithm and Its Performance Study.
Gao, Wei
2016-01-01
Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
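The corpse-clustering behavior that inspires ant colony clustering is usually modeled with two probabilities: an unladen ant is likely to pick up an item that is dissimilar to its surroundings, and a laden ant is likely to drop an item where similar items are dense. The sketch below uses the classic rule generally attributed to Deneubourg et al.; the constants `k1` and `k2` are hypothetical defaults, and the abstraction mechanism and data combination step proposed in the paper are not reproduced.

```python
def pick_probability(f, k1=0.1):
    """Probability that an unladen ant picks up an item.

    f: local similarity density around the item, assumed to lie in [0, 1].
    """
    return (k1 / (k1 + f)) ** 2


def drop_probability(f, k2=0.15):
    """Probability that a laden ant drops its item at the current site."""
    return (f / (k2 + f)) ** 2


# Isolated item (f near 0) versus item in a dense, similar neighborhood.
isolated_pick = pick_probability(0.0)
crowded_pick = pick_probability(0.9)
isolated_drop = drop_probability(0.0)
crowded_drop = drop_probability(0.9)
```

Repeated over many ant moves, these two rules move isolated items toward neighborhoods of similar items, which is what produces the clusters.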
A Distributed Flocking Approach for Information Stream Clustering Analysis
Cui, Xiaohui; Potok, Thomas E
2006-01-01
Intelligence analysts are currently overwhelmed by the volume of information streams generated every day. There is a lack of comprehensive tools that can analyze these streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require large amounts of computational resources and a long time to produce accurate results. It is very difficult to cluster a dynamically changing text information stream on an individual computer. Our earlier research resulted in a dynamic reactive flock clustering algorithm that can continually refine the clustering result and react quickly to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.
Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters
Tellaroli, Paola; Bazzi, Marco; Donato, Michele; Brazzale, Alessandra R.; Drăghici, Sorin
2016-01-01
Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well established hierarchical clustering algorithms: Ward’s minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward’s and Complete-linkage. We show on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC to cluster samples in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository. PMID:27015427
An Artificial Immune Univariate Marginal Distribution Algorithm
NASA Astrophysics Data System (ADS)
Zhang, Qingbin; Kang, Shuo; Gao, Junxiang; Wu, Song; Tian, Yanping
Hybridization is an extremely effective way of improving the performance of the Univariate Marginal Distribution Algorithm (UMDA). Owing to its diversity and memory mechanisms, the artificial immune algorithm has been widely used to construct hybrid algorithms with other optimization algorithms. This paper proposes a hybrid algorithm which combines the UMDA with the principle of the general artificial immune algorithm. Experimental results on a deceptive function of order 3 show that the proposed hybrid algorithm can obtain more building blocks (BBs) than the UMDA.
A Hybrid Monkey Search Algorithm for Clustering Analysis
Chen, Xin; Zhou, Yongquan; Luo, Qifang
2014-01-01
Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis. PMID:24772039
A Distributed Polygon Retrieval Algorithm Using MapReduce
NASA Astrophysics Data System (ADS)
Guo, Q.; Palanisamy, B.; Karimi, H. A.
2015-07-01
The burst of large-scale spatial terrain data due to the proliferation of data acquisition devices like 3D laser scanners poses challenges to spatial data analysis and computation. Among many spatial analyses and computations, polygon retrieval is a fundamental operation which is often performed under real-time constraints. However, existing sequential algorithms fail to meet this demand for larger sizes of terrain data. Motivated by the MapReduce programming model, a well-adopted large-scale parallel data processing technique, we present a MapReduce-based polygon retrieval algorithm designed with the objective of reducing the IO and CPU loads of spatial data processing. By indexing the data based on a quad-tree approach, a significant amount of unneeded data is filtered in the filtering stage and it reduces the IO overhead. The indexed data also facilitates querying the relationship between the terrain data and query area in shorter time. The results of the experiments performed in our Hadoop cluster demonstrate that our algorithm performs significantly better than the existing distributed algorithms.
Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models
NASA Technical Reports Server (NTRS)
Mjolsness, Eric; Castano, Rebecca; Gray, Alexander
1999-01-01
We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.
An algorithm on distributed mining association rules
NASA Astrophysics Data System (ADS)
Xu, Fan
2005-12-01
With the rapid development of the Internet/Intranet, distributed databases have become a broadly used environment in various areas. Mining association rules in distributed databases is a critical task. The algorithms for distributed mining of association rules can be divided into two classes: DD algorithms and CD algorithms. A DD algorithm focuses on data partition optimization so as to enhance efficiency. A CD algorithm, on the other hand, considers a setting where the data is arbitrarily partitioned horizontally among the parties to begin with, and focuses on parallelizing the communication. A DD algorithm is not always applicable, however, because the data is often already partitioned at the time it is generated. In many cases, it cannot be gathered and repartitioned for reasons of security and secrecy, transmission cost, or sheer efficiency. A CD algorithm may be a more appealing solution for systems that are naturally distributed over large expanses, such as stock exchange and credit card systems. The FDM algorithm enhances the CD algorithm. However, the CD and FDM algorithms are both based on a net structure and execute on non-shareable resources. In practical applications, distributed databases are often star-structured. This paper proposes an algorithm based on star-structured networks, which are more practical in application, have lower maintenance costs, and are easier to construct. In addition, the algorithm provides high communication efficiency and good extensibility in parallel computation.
Color Distributions of 29 Galactic Globular Clusters
NASA Astrophysics Data System (ADS)
Sohn, Young-Jong; Byun, Yong-Ik; Yim, Hong-Suh; Rhee, Myung-Hyun; Chun, Mun-Suk
1998-06-01
U, B, and V CCD images are used to investigate the radial color gradients of twenty-nine Galactic globular clusters: twenty-two King-type clusters and seven post-core-collapse (PCC) clusters, classified by their surface brightness distributions. Among the King-type clusters, eight show radial color gradients with redder centers and seven with bluer centers in (B-V). Seven King-type clusters have redder centers in (U-B), and five show radial color gradients with bluer centers in the same color. Among the seven PCC clusters, one shows a redder center and five show bluer centers in (B-V). Two PCC clusters have redder centers in (U-B), and four show radial color gradients with bluer centers in the same color. These results provide evidence that the color gradient is not unique to PCC clusters with bluer centers. From Pearson's correlation coefficient tests, we found that the horizontal branch morphologies have weak correlations with the radial color gradients within globular clusters.
Incremental Clustering Algorithm For Earth Science Data Mining
Vatsavai, Raju
2009-01-01
Remote sensing data plays a key role in understanding complex geographic phenomena. Clustering is a useful tool for discovering interesting patterns and structures within multivariate geospatial data. One of the key issues in clustering is the specification of an appropriate number of clusters, which is not obvious in many practical situations. In this paper we provide an extension of the G-means algorithm which automatically learns the number of clusters present in the data and avoids overestimation of the number of clusters. Experimental evaluation on simulated and remotely sensed image data shows the effectiveness of our algorithm.
Clustering algorithms do not learn, but they can be learned
NASA Astrophysics Data System (ADS)
Brun, Marcel; Dougherty, Edward R.
2005-08-01
Pattern classification theory involves an error criterion, optimal classifiers, and a theory of learning. For clustering, there has historically been little theory; in particular, there has generally (but not always) been no learning. The key point is that clustering has not been grounded on a probabilistic theory. Recently, a clustering theory has been developed in the context of random sets. This paper discusses learning within that context, in particular, k-nearest-neighbor learning of clustering algorithms.
Clustering algorithms for Stokes space modulation format recognition.
Boada, Ricard; Borkowski, Robert; Monroy, Idelfonso Tafur
2015-06-15
Stokes space modulation format recognition (Stokes MFR) is a blind method enabling digital coherent receivers to infer modulation format information directly from a received polarization-division-multiplexed signal. A crucial part of the Stokes MFR is a clustering algorithm, which largely influences the performance of the detection process, particularly at low signal-to-noise ratios. This paper reports on an extensive study of six different clustering algorithms: k-means, expectation maximization, density-based DBSCAN and OPTICS, spectral clustering and maximum likelihood clustering, used for discriminating between dual polarization: BPSK, QPSK, 8-PSK, 8-QAM, and 16-QAM. We determine essential performance metrics for each clustering algorithm and modulation format under test: minimum required signal-to-noise ratio, detection accuracy and algorithm complexity. PMID:26193532
A scalable parallel graph coloring algorithm for distributed memory computers.
Bozdag, Doruk; Manne, Fredrik; Gebremedhin, Assefaw H.; Catalyurek, Umit; Boman, Erik Gunnar
2005-02-01
In large-scale parallel applications a graph coloring is often carried out to schedule computational tasks. In this paper, we describe a new distributed memory algorithm for doing the coloring itself in parallel. The algorithm operates in an iterative fashion; in each round vertices are speculatively colored based on limited information, and then a set of incorrectly colored vertices, to be recolored in the next round, is identified. Parallel speedup is achieved in part by reducing the frequency of communication among processors. Experimental results on a PC cluster using up to 16 processors show that the algorithm is scalable.
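The round structure described in this abstract (speculative coloring from limited information, followed by detection of incorrectly colored vertices that are recolored in the next round) can be illustrated with a small sequential simulation. This is only a sketch under stated assumptions, not the authors' distributed-memory implementation: the function name speculative_coloring, the use of a start-of-round snapshot to stand in for stale information, and the tie-breaking rule (the higher-numbered endpoint of a conflicting edge is recolored) are all hypothetical choices made for illustration.

```python
def speculative_coloring(adj):
    """Color an undirected graph in rounds.

    adj maps each vertex (a comparable id, e.g. an int) to its neighbours.
    Assumption for illustration: every pending vertex sees only the colors
    that were fixed before the current round began.
    """
    color = {v: None for v in adj}
    pending = set(adj)          # vertices still to be (re)colored
    rounds = 0
    while pending:
        rounds += 1
        snapshot = dict(color)  # stale information available in this round
        # Speculative phase: pick the smallest color unused in the snapshot.
        for v in pending:
            used = {snapshot[u] for u in adj[v]}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        # Conflict detection: the higher-numbered endpoint gives way.
        conflicts = {v for v in pending
                     if any(u < v and color[u] == color[v] for u in adj[v])}
        for v in conflicts:
            color[v] = None
        pending = conflicts
    return color, rounds


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
    print(speculative_coloring(graph))
```

Because the lowest-numbered pending vertex can never lose a conflict, the pending set shrinks every round and the loop terminates. In a real distributed setting the snapshot would correspond to boundary colors exchanged between processors, and communicating less often trades extra conflicts for lower overhead.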
A highly efficient multi-core algorithm for clustering extremely large datasets
2010-01-01
Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. Computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network-based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
A systematic comparison of genome-scale clustering algorithms
2012-01-01
Background: A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods: For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard scores of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results: Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. Conclusions: Validation of clusters against known gene classifications demonstrates that for this data, graph-based techniques outperform conventional clustering approaches, suggesting that further
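The scoring step in the Methods of this abstract (Jaccard similarity of each cluster against every annotation set, keeping the highest score, then averaging the top five scores) can be sketched as follows. This is a minimal illustration, not the study's code: the function names, the gene and annotation identifiers in the usage example, and the omission of the small/medium/large size bins (whose thresholds the abstract does not give) are assumptions.

```python
def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections of gene ids."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of one cluster over all annotation sets."""
    return max((jaccard(cluster, genes) for genes in annotation_sets.values()),
               default=0.0)


def mean_top_scores(clusters, annotation_sets, top=5):
    """Average of the `top` best per-cluster scores (binning by size omitted)."""
    scores = sorted((best_annotation_score(c, annotation_sets) for c in clusters),
                    reverse=True)
    best = scores[:top]
    return sum(best) / len(best) if best else 0.0


if __name__ == "__main__":
    # Hypothetical annotation sets and clusters, for illustration only.
    annotations = {"GO:x": {"g1", "g2"}, "KEGG:y": {"g1", "g2", "g3"}}
    clusters = [{"g1", "g2"}, {"g3", "g4"}]
    print(mean_top_scores(clusters, annotations))
```

To mirror the binned score described in the abstract, mean_top_scores would be applied separately to the clusters falling in each size bin.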
Adaptive link selection algorithms for distributed estimation
NASA Astrophysics Data System (ADS)
Xu, Songcen; de Lamare, Rodrigo C.; Poor, H. Vincent
2015-12-01
This paper presents adaptive link selection algorithms for distributed estimation and considers their application to wireless sensor networks and smart grids. In particular, exhaustive search-based least mean squares (LMS) / recursive least squares (RLS) link selection algorithms and sparsity-inspired LMS / RLS link selection algorithms that can exploit the topology of networks with poor-quality links are considered. The proposed link selection algorithms are then analyzed in terms of their stability, steady-state, and tracking performance and computational complexity. In comparison with the existing centralized or distributed estimation strategies, the key features of the proposed algorithms are as follows: (1) more accurate estimates and faster convergence speed can be obtained and (2) the network is equipped with the ability of link selection that can circumvent link failures and improve the estimation performance. The performance of the proposed algorithms for distributed estimation is illustrated via simulations in applications of wireless sensor networks and smart grids.
The Enhanced Hoshen-Kopelman Algorithm for Cluster Analysis
NASA Astrophysics Data System (ADS)
Hoshen, Joseph
1997-08-01
In 1976 Hoshen and Kopelman (J. Hoshen and R. Kopelman, Phys. Rev. B, 14, 3438 (1976)) introduced a breakthrough algorithm, known today as the Hoshen-Kopelman algorithm, for cluster analysis. This algorithm revolutionized Monte Carlo cluster calculations in percolation theory as it enables analysis of very large lattices containing 10^11 or more sites. Initially the HK algorithm's primary use was in the domain of pure and basic sciences. Later it began finding applications in diverse fields of technology and applied sciences. Examples of such applications are two- and three-dimensional image analysis, composite material modeling, polymers, remote sensing, brain modeling and food processing. While the original HK algorithm provides only cluster size data for only one class of sites, the Enhanced HK (EHK) algorithm, presented in this paper, enables calculations of cluster spatial moments -- characteristics of cluster shapes -- for multiple classes of sites. These enhancements preserve the time and space complexities of the original HK algorithm, such that very large lattices can still be analyzed in a single pass through the lattice for cluster sizes, classes and shapes simultaneously.
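The core idea of Hoshen-Kopelman-style cluster labeling, a raster scan combined with union-find bookkeeping of equivalent labels, can be sketched on a small two-dimensional lattice. This is an illustrative sketch of the basic labeling scheme only, not the Enhanced HK algorithm of this paper: it handles a single class of sites, computes sizes but no spatial moments, and uses a second sweep to resolve labels for readability, whereas sizes can also be accumulated during the first pass. The function name label_clusters is hypothetical.

```python
def label_clusters(grid):
    """Label 4-connected clusters of occupied (truthy) sites on a 2D lattice.

    Returns (labels, sizes): a grid of root labels (0 for empty sites) and a
    dict mapping each root label to its cluster size.
    """
    parent = [0]  # parent[0] is a dummy entry so that real labels start at 1

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if not up and not left:
                parent.append(len(parent))      # open a new cluster label
                labels[r][c] = len(parent) - 1
            elif up and left:
                ru, rl = find(up), find(left)   # site joins two clusters
                root = min(ru, rl)
                parent[max(ru, rl)] = root
                labels[r][c] = root
            else:
                labels[r][c] = find(up or left)
    # Second sweep: replace provisional labels by their roots and count sizes.
    sizes = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = root
                sizes[root] = sizes.get(root, 0) + 1
    return labels, sizes


if __name__ == "__main__":
    lattice = [[1, 0, 1],
               [1, 1, 1]]
    print(label_clusters(lattice))
```

Only the previous row and the current row of labels are consulted during the scan, which is why this family of algorithms scales to very large lattices.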
The ordered clustered travelling salesman problem: a hybrid genetic algorithm.
Ahmed, Zakir Hussain
2014-01-01
The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which the set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least-cost Hamiltonian tour in which the vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining a heuristic solution to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solutions to some more symmetric TSPLIB instances. PMID:24701148
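The problem definition in this abstract can be made concrete with a small feasibility and cost check. The sketch below is an illustration of the constraints only (start vertex first, each cluster visited contiguously, clusters in the prescribed order) and of the tour cost on a possibly asymmetric cost matrix; it is not the hybrid genetic algorithm itself. The function names and the example instance are hypothetical, and vertices are assumed to be integer indices into the cost matrix.

```python
def tour_cost(tour, cost):
    """Cost of the closed tour, including the edge back to the first vertex."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))


def is_feasible(tour, start, clusters):
    """Check the ordered clustered TSP constraints.

    clusters is a list of disjoint vertex sets in the required visiting order.
    """
    if not tour or tour[0] != start:
        return False
    expected = [v for cluster in clusters for v in cluster]
    if sorted(tour[1:]) != sorted(expected):    # every vertex exactly once
        return False
    pos = 1
    for cluster in clusters:
        block = tour[pos:pos + len(cluster)]
        if set(block) != set(cluster):          # cluster must be contiguous
            return False
        pos += len(cluster)
    return True


if __name__ == "__main__":
    # Hypothetical 4-vertex asymmetric instance, for illustration only.
    cost = [[0, 1, 2, 3],
            [4, 0, 5, 6],
            [7, 8, 0, 9],
            [10, 11, 12, 0]]
    clusters = [{1, 2}, {3}]
    tour = [0, 2, 1, 3]
    print(is_feasible(tour, 0, clusters), tour_cost(tour, cost))
```

A heuristic such as the one described would search only among tours for which is_feasible holds, which is why crossover and 2-opt moves have to respect cluster boundaries.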
Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension.
Zhu, Zheng; Ochoa, Andrew J; Katzgraber, Helmut G
2015-08-14
Spin systems with frustration and disorder are notoriously difficult to study, both analytically and numerically. While the simulation of ferromagnetic statistical mechanical models benefits greatly from cluster algorithms, these accelerated dynamics methods remain elusive for generic spin-glass-like systems. Here, we present a cluster algorithm for Ising spin glasses that works in any space dimension and speeds up thermalization by at least one order of magnitude at temperatures where thermalization is typically difficult. Our isoenergetic cluster moves are based on the Houdayer cluster algorithm for two-dimensional spin glasses and lead to a speedup over conventional state-of-the-art methods that increases with the system size. We illustrate the benefits of the isoenergetic cluster moves in two and three space dimensions, as well as the nonplanar chimera topology found in the D-Wave Inc. quantum annealing machine. PMID:26317743
MODELING THE METALLICITY DISTRIBUTION OF GLOBULAR CLUSTERS
Muratov, Alexander L.; Gnedin, Oleg Y.
2010-08-01
Observed metallicities of globular clusters reflect physical conditions in the interstellar medium of their high-redshift host galaxies. Globular cluster systems in most large galaxies display bimodal color and metallicity distributions, which are often interpreted as indicating two distinct modes of cluster formation. The metal-rich and metal-poor clusters have systematically different locations and kinematics in their host galaxies. However, the red and blue clusters have similar internal properties, such as their masses, sizes, and ages. It is therefore interesting to explore whether both metal-rich and metal-poor clusters could form by a common mechanism and still be consistent with the bimodal distribution. We present such a model, which prescribes the formation of globular clusters semi-analytically using galaxy assembly history from cosmological simulations coupled with observed scaling relations for the amount and metallicity of cold gas available for star formation. We assume that massive star clusters form only during mergers of massive gas-rich galaxies and tune the model parameters to reproduce the observed distribution in the Galaxy. A wide, but not the entire, range of model realizations produces metallicity distributions consistent with the data. We find that early mergers of smaller hosts create exclusively blue clusters, whereas subsequent mergers of more massive galaxies create both red and blue clusters. Thus, bimodality arises naturally as the result of a small number of late massive merger events. This conclusion is not significantly affected by the large uncertainties in our knowledge of the stellar mass and cold gas mass in high-redshift galaxies. The fraction of galactic stellar mass locked in globular clusters declines from over 10% at z > 3 to 0.1% at present.
A Fast Implementation of the ISODATA Clustering Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2005-01-01
Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
Efficient Record Linkage Algorithms Using Complete Linkage Clustering
Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar
2016-01-01
Datasets held by different agencies contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. Many of the available algorithms for record linkage suffer from either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times. PMID:27124604
Clustering of Hadronic Showers with a Structural Algorithm
Charles, M.J.; /SLAC
2005-12-13
The internal structure of hadronic showers can be resolved in a high-granularity calorimeter. This structure is described in terms of simple components and an algorithm for reconstruction of hadronic clusters using these components is presented. Results from applying this algorithm to simulated hadronic Z-pole events in the SiD concept are discussed.
CCL: an algorithm for the efficient comparison of clusters
Hundt, R.; Schön, J. C.; Neelamraju, S.; Zagorac, J.; Jansen, M.
2013-01-01
The systematic comparison of the atomic structure of solids and clusters has become an important task in crystallography, chemistry, physics and materials science, in particular in the context of structure prediction and structure determination of nanomaterials. In this work, an efficient and robust algorithm for the comparison of cluster structures is presented, which is based on the mapping of the point patterns of the two clusters onto each other. This algorithm has been implemented as the module CCL in the structure visualization and analysis program KPLOT. PMID:23682193
A knowledge-based clustering algorithm driven by Gene Ontology.
Cheng, Jill; Cline, Melissa; Martin, John; Finkelstein, David; Awad, Tarif; Kulp, David; Siani-Rose, Michael A
2004-08-01
We have developed an algorithm for inferring the degree of similarity between genes by using the graph-based structure of Gene Ontology (GO). We applied this knowledge-based similarity metric to a clique-finding algorithm for detecting sets of related genes with biological classifications. We also combined it with an expression-based distance metric to produce a co-cluster analysis, which accentuates genes with both similar expression profiles and similar biological characteristics and identifies gene clusters that are more stable and biologically meaningful. These algorithms are demonstrated in the analysis of MPRO cell differentiation time series experiments. PMID:15468759
A modified density-based clustering algorithm and its implementation
NASA Astrophysics Data System (ADS)
Ban, Zhihua; Liu, Jianguo; Yuan, Lulu; Yang, Hua
2015-12-01
This paper presents an improved density-based clustering algorithm based on the paper "Clustering by fast search and find of density peaks". A distance threshold is introduced to economize memory. To reduce the probability that two points share the same density value, similarity is used to define the proximity measure. We have tested the modified algorithm on a large data set, several small data sets and shape data sets. The proposed algorithm obtains acceptable results and can be applied more widely.
A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering
ERIC Educational Resources Information Center
Chahine, Firas Safwan
2012-01-01
Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…
Sampling Within k-Means Algorithm to Cluster Large Datasets
Bejarano, Jeremy; Bose, Koushiki; Brannan, Tyler; Thomas, Anita; Adragni, Kofi; Neerchal, Nagaraj; Ostrouchov, George
2011-08-01
Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the amount of data analyzed. We perform a simulation study to compare our sampling-based k-means to the standard k-means algorithm by analyzing both the speed and accuracy of the two methods. Results show that our algorithm is significantly more efficient than the existing algorithm with comparable accuracy. Further work on this project might include a more comprehensive study both on more varied test datasets and on real weather datasets. This is especially important considering that this preliminary study was performed on rather tame datasets. Such studies should also analyze the performance of the algorithm for varied values of k. Lastly, this paper showed that the algorithm was accurate for relatively low sample sizes. We would like to analyze this further to see how accurate the algorithm is for even lower sample sizes. By manipulating width and confidence level, we could find the lowest sample sizes for which the algorithm would be acceptably accurate. In order for our algorithm to be a success, it needs to meet two benchmarks: match the accuracy of the standard k-means algorithm and significantly reduce runtime. Both goals are accomplished for all six datasets analyzed. However, on datasets of three and four dimensions, as the data becomes more difficult to cluster, both algorithms fail to obtain the correct classifications on some trials. Nevertheless, our algorithm consistently matches the performance of the standard algorithm while becoming remarkably more efficient with time. Therefore, we conclude that analysts can use our algorithm, expecting accurate results in considerably less time.
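The general idea in this abstract, running k-means on a random sample and then using the resulting centers for the full dataset, can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the authors' algorithm: the abstract does not say how the sample size is derived from the width and confidence level, so sample_size is simply a parameter here, and the function names, the Lloyd-style iteration, and the random initialization are illustrative choices.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd iteration; returns the list of k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # An empty group keeps its previous center.
        new = [tuple(sum(x) / len(g) for x in zip(*g)) if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers


def sampled_kmeans(points, k, sample_size, seed=0):
    """Fit centers on a random sample, then label every point (assumed scheme)."""
    rng = random.Random(seed)
    sample = rng.sample(points, min(sample_size, len(points)))
    centers = kmeans(sample, k, seed=seed)
    return centers, [nearest(p, centers) for p in points]


if __name__ == "__main__":
    data = ([(0.1 * i, 0.0) for i in range(10)]
            + [(100 + 0.1 * i, 0.0) for i in range(10)])
    centers, labels = sampled_kmeans(data, k=2, sample_size=12)
    print(centers, labels)
```

The iterative fitting touches only sample_size points per iteration, while the full dataset is scanned once for the final assignment, which is where the runtime saving would come from.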
Sharma, Ashok; Podolsky, Robert; Zhao, Jieping; McIndoe, Richard A.
2009-01-01
Motivation: As the number of publicly available microarray experiments increases, the ability to analyze extremely large datasets across multiple experiments becomes critical. There is a need for algorithms that are fast and can cluster extremely large datasets without degrading cluster quality. Clustering is an unsupervised exploratory technique applied to microarray data to find similar data structures or expression patterns. Because of the high input/output costs involved and the large distance matrices calculated, most agglomerative clustering algorithms fail on large datasets (30 000+ genes/200+ arrays). In this article, we propose a new two-stage algorithm which partitions the high-dimensional space associated with microarray data using hyperplanes. The first stage is based on the Balanced Iterative Reducing and Clustering using Hierarchies algorithm, with the second stage being a conventional k-means clustering technique. This algorithm has been implemented in a software tool (HPCluster) designed to cluster gene expression data. We compared the clustering results using the two-stage hyperplane algorithm with the conventional k-means algorithm from other available programs. Because the first stage traverses the data in a single scan, performance and speed increase substantially. The data reduction accomplished in the first stage of the algorithm reduces the memory requirements, allowing us to cluster 44 460 genes without failure, and significantly decreases the time to completion when compared with popular k-means programs. The software was written in C# (.NET 1.1). Availability: The program is freely available and can be downloaded from http://www.amdcc.org/bioinformatics/bioinformatics.aspx. Contact: rmcindoe@mail.mcg.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:19261720
CACONET: Ant Colony Optimization (ACO) Based Clustering Algorithm for VANET.
Aadil, Farhan; Bajwa, Khalid Bashir; Khan, Salabat; Chaudary, Nadeem Majeed; Akram, Adeel
2016-01-01
A vehicular ad hoc network (VANET) is a wirelessly connected network of vehicular nodes. A number of techniques, such as message ferrying, data aggregation, and vehicular node clustering, aim to improve communication efficiency in VANETs. Cluster heads (CHs), selected in the process of clustering, manage inter-cluster and intra-cluster communication. The lifetime of clusters and the number of CHs determine the efficiency of the network. In this paper, a Clustering algorithm based on Ant Colony Optimization (ACO) for VANETs (CACONET) is proposed. CACONET forms optimized clusters for robust communication. CACONET is compared empirically with state-of-the-art baseline techniques like Multi-Objective Particle Swarm Optimization (MOPSO) and Comprehensive Learning Particle Swarm Optimization (CLPSO). Experiments varying the grid size of the network, the transmission range of nodes, and the number of nodes in the network were performed to evaluate the comparative effectiveness of these algorithms. For optimized clustering, the parameters considered are the transmission range, direction and speed of the nodes. The results indicate that CACONET significantly outperforms MOPSO and CLPSO. PMID:27149517
Personalized PageRank Clustering: A graph clustering algorithm based on random walks
NASA Astrophysics Data System (ADS)
A. Tabrizi, Shayan; Shakery, Azadeh; Asadpour, Masoud; Abbasi, Maziar; Tavallaie, Mohammad Ali
2013-11-01
Graph clustering has been an essential part in many methods and thus its accuracy has a significant effect on many applications. In addition, exponential growth of real-world graphs such as social networks, biological networks and electrical circuits demands clustering algorithms with nearly-linear time and space complexity. In this paper we propose Personalized PageRank Clustering (PPC) that employs the inherent cluster exploratory property of random walks to reveal the clusters of a given graph. We combine random walks and modularity to precisely and efficiently reveal the clusters of a graph. PPC is a top-down algorithm so it can reveal inherent clusters of a graph more accurately than other nearly-linear approaches that are mainly bottom-up. It also gives a hierarchy of clusters that is useful in many applications. PPC has a linear time and space complexity and has been superior to most of the available clustering algorithms on many datasets. Furthermore, its top-down approach makes it a flexible solution for clustering problems with different requirements.
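The random-walk building block named in this abstract, personalized PageRank, can be sketched as a short power iteration. This is only an illustration of that building block, not the PPC algorithm itself: the function name, the restart probability `alpha`, and the adjacency-list input format are assumptions, and the modularity-driven top-down splitting described in the abstract is not reproduced.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Random walk with restart to `seed`.

    adj: dict mapping node -> list of neighbours (undirected, no isolated nodes).
    alpha: assumed restart probability; iters: assumed number of power iterations.
    """
    rank = {v: 0.0 for v in adj}
    rank[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        for v, nbrs in adj.items():
            # Each node spreads the non-restarting part of its mass evenly.
            share = (1 - alpha) * rank[v] / len(nbrs)
            for u in nbrs:
                nxt[u] += share
        nxt[seed] += alpha  # restart mass always returns to the seed
        rank = nxt
    return rank
```

On a graph made of two triangles joined by a single edge, a walk seeded in one triangle keeps most of its mass there, which is the cluster-exploratory property the abstract refers to.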
Functional clustering algorithm for the analysis of dynamic network data
NASA Astrophysics Data System (ADS)
Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.
2009-05-01
We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.
Lee, Chongdeuk; Jeong, Taegwon
2011-01-01
Clustering is an important mechanism that efficiently provides information for mobile nodes and improves the processing capacity of routing, bandwidth allocation, and resource management and sharing. Clustering algorithms can be based on such criteria as the battery power of nodes, mobility, network size, distance, speed and direction. Above all, in order to achieve good clustering performance, overhead should be minimized, allowing mobile nodes to join and leave without perturbing the membership of the cluster while preserving current cluster structure as much as possible. This paper proposes a Fuzzy Relevance-based Cluster head selection Algorithm (FRCA) to solve problems found in existing wireless mobile ad hoc sensor networks, such as the node distribution found in dynamic properties due to mobility and flat structures and disturbance of the cluster formation. The proposed mechanism uses fuzzy relevance to select the cluster head for clustering in wireless mobile ad hoc sensor networks. In the simulation implemented on the NS-2 simulator, the proposed FRCA is compared with algorithms such as the Cluster-based Routing Protocol (CBRP), the Weighted-based Adaptive Clustering Algorithm (WACA), and the Scenario-based Clustering Algorithm for Mobile ad hoc networks (SCAM). The simulation results showed that the proposed FRCA achieves better performance than that of the other existing mechanisms. PMID:22163905
Performance impact of dynamic parallelism on different clustering algorithms
NASA Astrophysics Data System (ADS)
DiMarco, Jeffrey; Taufer, Michela
2013-05-01
In this paper, we aim to quantify the performance gains of dynamic parallelism. The newest version of CUDA, CUDA 5, introduces dynamic parallelism, which allows GPU threads to create new threads without CPU intervention and to adapt to their data. Through nested kernel computations, this effectively eliminates the superfluous back-and-forth communication between the GPU and CPU. The change in performance will be measured using two well-known clustering algorithms that exhibit data dependencies: K-means clustering and hierarchical clustering. K-means has a sequential data dependence wherein iterations occur in a linear fashion, while hierarchical clustering has a tree-like dependence that produces split tasks. Analyzing the performance of these data-dependent algorithms gives us a better understanding of the benefits or potential drawbacks of CUDA 5's new dynamic parallelism feature.
Development of clustering algorithms for Compressed Baryonic Matter experiment
NASA Astrophysics Data System (ADS)
Kozlov, G. E.; Ivanov, V. V.; Lebedev, A. A.; Vassiliev, Yu. O.
2015-05-01
A clustering problem for the coordinate detectors in the Compressed Baryonic Matter (CBM) experiment is discussed. Because of the high interaction rate and huge datasets to be dealt with, clustering algorithms are required to be fast and efficient and capable of processing events with high track multiplicity. At present there are two different approaches to the problem. In the first one each fired pad bears information about its charge, while in the second one a pad can or cannot be fired, thus rendering the separation of overlapping clusters a difficult task. To deal with the latter, two different clustering algorithms were developed, integrated into the CBMROOT software environment, and tested with various types of simulated events. Both of them are found to be highly efficient and accurate.
NCUBE - A clustering algorithm based on a discretized data space
NASA Technical Reports Server (NTRS)
Eigen, D. J.; Northouse, R. A.
1974-01-01
Cluster analysis involves the unsupervised grouping of data. The process provides an automatic procedure for generating known training samples for pattern classification. NCUBE, the clustering algorithm presented, is based upon the concept of imposing a gridwork on the data space. The NCUBE computer implementation of this concept provides an easily derived form of piecewise linear discrimination. This piecewise linear discrimination permits the separation of some types of data groups that are not linearly separable.
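The gridwork concept can be illustrated with a small sketch. The abstract does not say how NCUBE groups cells, so the rule used here (occupied cells that touch, including diagonally, belong to the same cluster) is an assumption, as are the function name `grid_cluster` and the `cell` side-length parameter.

```python
from itertools import product

def grid_cluster(points, cell):
    """Bin points into hypercubes of side `cell`, then join occupied cells that touch."""
    cells = {}
    for i, p in enumerate(points):
        key = tuple(int(x // cell) for x in p)
        cells.setdefault(key, []).append(i)

    labels = [None] * len(points)
    seen = set()
    cluster = 0
    for start in cells:
        if start in seen:
            continue
        # Flood-fill over neighbouring occupied cells (assumed merging rule).
        stack = [start]
        seen.add(start)
        while stack:
            c = stack.pop()
            for i in cells[c]:
                labels[i] = cluster
            for d in product((-1, 0, 1), repeat=len(c)):
                n = tuple(a + b for a, b in zip(c, d))
                if n in cells and n not in seen:
                    seen.add(n)
                    stack.append(n)
        cluster += 1
    return labels
```

Because every cluster is a union of axis-aligned cells, its boundary is made of flat faces, which is one way to read the piecewise linear discrimination the abstract mentions.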
Fast clustering algorithm for codebook production in image vector quantization
NASA Astrophysics Data System (ADS)
Al-Otum, Hazem M.
2001-04-01
In this paper, a fast clustering algorithm (FCA) is proposed for vector quantization codebook production. The algorithm avoids iterative averaging of vectors and is based on collecting vectors with similar or closely similar characteristics to produce the corresponding clusters. FCA gives an increase in peak signal-to-noise ratio (PSNR) of about 0.3 - 1.1 dB over the LBG algorithm and reduces the computational cost of codebook production by 10% - 60% at different bit rates. Two FCA modifications are also proposed: FCA with limited cluster size 1 and 2 (FCA-LCS1 and FCA-LCS2, respectively). FCA-LCS1 tends to subdivide large clusters into smaller ones, while FCA-LCS2 reduces a predetermined threshold by a step to reach the required cluster size. FCA-LCS1 and FCA-LCS2 give an increase in PSNR of about 0.9 - 1.0 and 0.9 - 1.1 dB, respectively, over the FCA algorithm, at the expense of an increase of about 15% - 25% and 18% - 28% in the output codebook size.
Particle flow reconstruction based on the directed tree clustering algorithm
Chakraborty, D.; Lima, J. G. R.; McIntosh, R.; Zutshi, V.
2006-10-27
We present the status of particle flow algorithm development at Northern Illinois University. A key element in our approach is the calorimeter-based directed tree clustering algorithm. We have attempted to identify and tackle the essential challenges and analyze the effect of several different approaches to the reconstruction of jet energies and the Z-boson mass. A number of possibilities have been studied, such as analog vs. digital energy measurement, hit density-based clustering and the use of single or multiple energy thresholds. We plan to use this PFA-based reconstruction to compare some of the proposed detector technologies and geometries.
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large
A Task-parallel Clustering Algorithm for Structured AMR
Gunney, B N; Wissink, A M
2004-11-02
A new parallel algorithm, based on the Berger-Rigoutsos algorithm for clustering grid points into logically rectangular regions, is presented. The clustering operation is frequently performed in the dynamic gridding steps of structured adaptive mesh refinement (SAMR) calculations. A previous study revealed that although the cost of clustering is generally insignificant for smaller problems run on relatively few processors, the algorithm scaled inefficiently in parallel and its cost grows with problem size. Hence, it can become significant for large scale problems run on very large parallel machines, such as the new BlueGene system (which has O(10^4) processors). We propose a new task-parallel algorithm designed to reduce communication wait times. Performance was assessed using dynamic SAMR re-gridding operations on up to 16K processors of currently available computers at Lawrence Livermore National Laboratory. The new algorithm was shown to be up to an order of magnitude faster than the baseline algorithm and had better scaling trends.
Six clustering algorithms applied to the WAIS-R: the problem of dissimilar cluster results.
Fraboni, M; Cooper, D
1989-11-01
Clusterings of the Wechsler Adult Intelligence Scale-Revised subtests were obtained from the application of six hierarchical clustering methods (N = 113). These sets of clusters were compared for similarities using the Rand index. The calculated indices suggested similarities of cluster group membership between the Complete Linkage and Centroid methods; Complete Linkage and Ward's methods; Centroid and Ward's methods; and Single Linkage and Average Linkage Between Groups methods. Cautious use of single clustering methods is implied, though the authors suggest some advantages of knowing specific similarities and differences. If between-method comparisons consistently reveal similar cluster membership, a choice could be made from those algorithms that tend to produce similar partitions, thereby enhancing cluster interpretation. PMID:2613904
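The Rand index used here to compare partitions is the fraction of item pairs on which two clusterings agree, meaning both place the pair together or both place it apart. A minimal sketch follows; the function name and the label-list input format are illustrative choices, not taken from the study.

```python
from itertools import combinations

def rand_index(a, b):
    """Rand index of two clusterings given as equal-length label lists (len >= 2)."""
    agree = total = 0
    for i, j in combinations(range(len(a)), 2):
        same_a = a[i] == a[j]
        same_b = b[i] == b[j]
        # A pair counts as agreement when both clusterings treat it the same way.
        agree += same_a == same_b
        total += 1
    return agree / total
```

The index depends only on which items are grouped together, so relabeling the clusters of either partition leaves it unchanged.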
Characterising superclusters with the galaxy cluster distribution
NASA Astrophysics Data System (ADS)
Chon, Gayoung; Böhringer, Hans; Collins, Chris A.; Krause, Martin
2014-07-01
Superclusters are the largest observed matter density structures in the Universe. Recently, we presented the first supercluster catalogue constructed with a well-defined selection function based on the X-ray flux-limited cluster survey, REFLEX II. To construct the sample we proposed a concept to find large objects with a minimum overdensity such that it can be expected that most of their mass will collapse in the future. The main goal here is to use simulations to support this concept, namely that we can, on the basis of our observational sample of X-ray clusters, construct a supercluster sample defined by a certain minimum overdensity. On this sample we also test how superclusters trace the underlying dark matter distribution. Our results confirm that an overdensity in the number of clusters is tightly correlated with an overdensity of the dark matter distribution. This enables us to define superclusters within which most of the mass will collapse in the future. We also obtain first-order mass estimates of superclusters on the basis of the properties of the member clusters. We also show that in this context the ratio of the cluster number density and dark matter mass density is consistent with the theoretically expected cluster bias. Our previous work provided evidence that superclusters are a special environment in which the density structures of the dark matter grow differently from those in the field, as characterised by the X-ray luminosity function. Here we confirm for the first time, at high statistical significance as shown by a Kolmogorov-Smirnov test, that this originates from a top-heavy mass function. We also find, in close agreement with observations, that the superclusters only occupy a small volume of a few per cent, but contain more than half of the clusters in the present-day Universe.
A Resampling Based Clustering Algorithm for Replicated Gene Expression Data.
Li, Han; Li, Chun; Hu, Jie; Fan, Xiaodan
2015-01-01
In gene expression data analysis, clustering is a fruitful exploratory technique to reveal the underlying molecular mechanism by identifying groups of co-expressed genes. To reduce the noise, usually multiple experimental replicates are performed. An integrative analysis of the full replicate data, instead of reducing the data to the mean profile, carries the promise of yielding more precise and robust clusters. In this paper, we propose a novel resampling based clustering algorithm for genes with replicated expression measurements. Assuming those replicates are exchangeable, we formulate the problem in the bootstrap framework, and aim to infer the consensus clustering based on the bootstrap samples of replicates. In our approach, we adopt the mixed effect model to accommodate the heterogeneous variances and implement a quasi-MCMC algorithm to conduct statistical inference. Experiments demonstrate that by taking advantage of the full replicate data, our algorithm produces more reliable clusters and has robust performance in diverse scenarios, especially when the data is subject to multiple sources of variance. PMID:26671802
The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey
Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer, Hans; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U., ICG /North Carolina U. /Chicago U., Astron. Astrophys. Ctr. /Chicago U., EFI /Michigan U. /Fermilab /Princeton U. Observ. /Garching, Max Planck Inst., MPE /Pittsburgh U. /Tokyo U., ICRR /Baltimore, Space Telescope Sci. /Penn State U. /Chicago U. /Stavropol, Astrophys. Observ. /Heidelberg, Max Planck Inst. Astron. /INI, SAO
2005-03-01
We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers ~2600 square degrees of sky and ranges in redshift from z = 0.02 to z = 0.17. The mean cluster membership is 36 galaxies (with redshifts) brighter than r = 17.7, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity (L_r), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the ΛCDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is ≈90% complete and 95% pure above M_200 = 1 × 10^14 h^-1 M_⊙ and within 0.03 ≤ z ≤ 0.12. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within 0.03 ≤ z ≤ 0.12. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the L_r of a cluster is a more robust estimator of the halo mass (M_200) than the galaxy line-of-sight velocity dispersion or the richness of the cluster. However, if we
Adaptive clustering algorithm for community detection in complex networks.
Ye, Zhenqing; Hu, Songnian; Yu, Jun
2008-10-01
Community structure is common in various real-world networks; methods or algorithms for detecting such communities in complex networks have attracted great attention in recent years. We introduced a different adaptive clustering algorithm capable of extracting modules from complex networks with considerable accuracy and robustness. In this approach, each node in a network acts as an autonomous agent demonstrating flocking behavior, where vertices always travel toward their preferred neighboring groups. An optimal modular structure can emerge from a collection of these active nodes during a self-organization process in which vertices constantly regroup. In addition, we show through intensive evaluation that our algorithm appears advantageous over other competing methods (e.g., the Newman-fast algorithm). The applications in three real-world networks demonstrate the ability of our algorithm to find communities that correspond to the actual organization of those networks. PMID:18999501
NASA Technical Reports Server (NTRS)
Lennington, R. K.; Johnson, J. K.
1979-01-01
An efficient procedure which clusters data using a completely unsupervised clustering algorithm and then uses labeled pixels to label the resulting clusters or perform a stratified estimate using the clusters as strata is developed. Three clustering algorithms, CLASSY, AMOEBA, and ISOCLS, are compared for efficiency. Three stratified estimation schemes and three labeling schemes are also considered and compared.
A Likelihood-Based SLIC Superpixel Algorithm for SAR Images Using Generalized Gamma Distribution
Zou, Huanxin; Qin, Xianxiang; Zhou, Shilin; Ji, Kefeng
2016-01-01
The simple linear iterative clustering (SLIC) method is a recently proposed popular superpixel algorithm. However, this method may generate bad superpixels for synthetic aperture radar (SAR) images due to effects of speckle and the large dynamic range of pixel intensity. In this paper, an improved SLIC algorithm for SAR images is proposed. This algorithm exploits the likelihood information of SAR image pixel clusters. Specifically, a local clustering scheme combining intensity similarity with spatial proximity is proposed. Additionally, for post-processing, a local edge-evolving scheme that combines spatial context and likelihood information is introduced as an alternative to the connected components algorithm. To estimate the likelihood information of SAR image clusters, we incorporated a generalized gamma distribution (GΓD). Finally, the superiority of the proposed algorithm was validated using both simulated and real-world SAR images. PMID:27438840
Biologically supervised hierarchical clustering algorithms for gene expression data.
Boratyn, Grzegorz M; Datta, Susmita; Datta, Somnath
2006-01-01
Cluster analysis has become a standard part of gene expression analysis. In this paper, we propose a novel semi-supervised approach that offers the same flexibility as hierarchical clustering. Yet it utilizes, along with the experimental gene expression data, common biological information about different genes that is being compiled at various public, Web-accessible databases. We argue that such an approach is inherently superior to the standard unsupervised approach of grouping genes based on expression data alone. It is shown that our biologically supervised methods produce better clustering results than the corresponding unsupervised methods, as judged by the distance from the model temporal profiles. R code for the clustering algorithm is available from the authors upon request. PMID:17947147
NASA Astrophysics Data System (ADS)
Liu, Lifeng; Sun, Sam Zandong; Yu, Hongyu; Yue, Xingtong; Zhang, Dong
2016-06-01
Because the fluid distribution in carbonate reservoirs is very complicated and existing fluid prediction methods do not produce ideal results, this paper proposes a new fluid identification method for carbonate reservoirs based on a modified Fuzzy C-Means (FCM) clustering algorithm. Both the initialization and the globally optimal cluster centers are produced by the Chaotic Quantum Particle Swarm Optimization (CQPSO) algorithm, which effectively avoids the sensitivity to initial values and the tendency to fall into local convergence of the traditional FCM clustering algorithm. The modified algorithm is then applied to fluid identification in the carbonate X area in the Tarim Basin of China, and a mapping relation between fluid properties and pre-stack elastic parameters is built in multi-dimensional space. The modified algorithm is shown to perform fuzzy clustering well, and its total coincidence rate of fluid prediction reaches 97.10%. In addition, the memberships of different fluids can be accumulated to obtain their respective probabilities, which can be used to evaluate the uncertainty in the fluid identification result.
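For orientation, the two alternating updates of standard Fuzzy C-Means can be sketched as below. This shows only the textbook FCM rules with fuzzifier `m`; the CQPSO-based initialization and center search that the paper adds are not reproduced, and the function names are hypothetical.

```python
def fcm_memberships(points, centers, m=2.0):
    """Membership of each point in each cluster for fixed centers (standard FCM rule)."""
    power = 2.0 / (m - 1.0)
    out = []
    for p in points:
        d = [sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5 for c in centers]
        if 0.0 in d:
            # Point coincides with a center: give it full membership there.
            row = [1.0 if x == 0.0 else 0.0 for x in d]
            s = sum(row)
            row = [x / s for x in row]
        else:
            row = [1.0 / sum((di / dk) ** power for dk in d) for di in d]
        out.append(row)
    return out

def fcm_centers(points, u, m=2.0):
    """Centers as membership-weighted means, with weights u ** m."""
    dims = len(points[0])
    centers = []
    for j in range(len(u[0])):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append(tuple(sum(wi * p[dim] for wi, p in zip(w, points)) / total
                             for dim in range(dims)))
    return centers
```

Each membership row sums to one, which is what allows memberships to be read as per-class probabilities in the way the abstract describes.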
A new clustering algorithm for scanning electron microscope images
NASA Astrophysics Data System (ADS)
Yousef, Amr; Duraisamy, Prakash; Karim, Mohammad
2016-04-01
A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may suffer from insufficient brightness, anomalous contrast, jagged edges, and poor quality due to low signal-to-noise ratio, grained topography and poor surface details. Segmentation of SEM images is a challenging problem in the presence of these distortions. In this paper, we focus on the clustering of this type of image. To that end, we evaluate the performance of well-known unsupervised clustering and classification techniques such as connectivity-based clustering (hierarchical clustering), centroid-based clustering, distribution-based clustering and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of image and compare its results against these standard techniques in terms of clustering validation metrics.
LYDIAN: An Extensible Educational Animation Environment for Distributed Algorithms
ERIC Educational Resources Information Center
Koldehofe, Boris; Papatriantafilou, Marina; Tsigas, Philippas
2006-01-01
LYDIAN is an environment to support the teaching and learning of distributed algorithms. It provides a collection of distributed algorithms as well as continuous animations. Users can combine algorithms and animations with arbitrary network structures defining the interconnection and behavior of the distributed algorithm. Further, it facilitates…
ABCluster: the artificial bee colony algorithm for cluster global optimization.
Zhang, Jun; Dolg, Michael
2015-10-01
Global optimization of cluster geometries is of fundamental importance in chemistry and an interesting problem in applied mathematics. In this work, we introduce a relatively new swarm intelligence algorithm, i.e. the artificial bee colony (ABC) algorithm proposed in 2005, to this field. It is inspired by the foraging behavior of a bee colony, and only three parameters are needed to control it. We applied it to several potential functions of quite different nature, i.e., the Coulomb-Born-Mayer, Lennard-Jones, Morse, Z and Gupta potentials. The benchmarks reveal that for long-ranged potentials the ABC algorithm is very efficient in locating the global minimum, while for short-ranged ones it is sometimes trapped into a local minimum funnel on a potential energy surface of large clusters. We have released an efficient, user-friendly, and free program "ABCluster" to realize the ABC algorithm. It is a black-box program for non-experts as well as experts and might become a useful tool for chemists to study clusters. PMID:26327507
NASA Astrophysics Data System (ADS)
Ball, R. C.; Lee, J. R.
1996-03-01
We prove that a new, irreversible growth algorithm, Non-Deletion Reaction-Limited Cluster-cluster Aggregation (NDRLCA), produces equilibrium Branched Polymers, expected to exhibit Lattice Animal statistics [1]. We implement NDRLCA, off-lattice, as a computer simulation for embedding dimension d=2 and 3, obtaining values for critical exponents, fractal dimension D and cluster mass distribution exponent tau: for d=2, D ≈ 1.53 ± 0.05, tau = 1.09 ± 0.06; for d=3, D = 1.96 ± 0.04, tau = 1.50 ± 0.04, in good agreement with theoretical LA values. The simulation results do not support recent suggestions [2] that BPs may be in the same universality class as percolation. We also obtain values for a model-dependent critical “fugacity”, z_c, and investigate the finite-size effects of our simulation, quantifying notions of “inbreeding” that occur in this algorithm. Finally we use an extension of the NDRLCA proof to show that standard Reaction-Limited Cluster-cluster Aggregation is very unlikely to be in the same universality class as Branched Polymers/Lattice Animals unless the backbone dimension for the latter is considerably less than the published value.
Pruning Neural Networks with Distribution Estimation Algorithms
Cantu-Paz, E
2003-01-15
This paper describes the application of four evolutionary algorithms to the pruning of neural networks used in classification problems. Besides a simple genetic algorithm (GA), the paper considers three distribution estimation algorithms (DEAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine whether the DEAs present advantages over the simple GA in terms of accuracy or speed in this problem. The experiments used a feed-forward neural network trained with standard back propagation and public-domain and artificial data sets. The pruned networks seemed to have accuracy better than or equal to that of the original fully-connected networks. Only in a few cases did pruning result in less accurate networks. We found few differences in the accuracy of the networks pruned by the four EAs, but found important differences in the execution time. The results suggest that a simple GA with a small population might be the best algorithm for pruning networks on the data sets we tested.
Algorithm for systematic peak extraction from atomic pair distribution functions.
Granlund, L; Billinge, S J L; Duxbury, P M
2015-07-01
The study presents an algorithm, ParSCAPE, for model-independent extraction of peak positions and intensities from atomic pair distribution functions (PDFs). It provides a statistically motivated method for determining parsimony of extracted peak models using the information-theoretic Akaike information criterion (AIC) applied to plausible models generated within an iterative framework of clustering and chi-square fitting. All parameters the algorithm uses are in principle known or estimable from experiment, though careful judgment must be applied when estimating the PDF baseline of nanostructured materials. ParSCAPE has been implemented in the Python program SrMise. Algorithm performance is examined on synchrotron X-ray PDFs of 16 bulk crystals and two nanoparticles using AIC-based multimodeling techniques, and particularly the impact of experimental uncertainties on extracted models. It is quite resistant to misidentification of spurious peaks coming from noise and termination effects, even in the absence of a constraining structural model. Structure solution from automatically extracted peaks using the Liga algorithm is demonstrated for 14 crystals and for C60. Special attention is given to the information content of the PDF, theory and practice of the AIC, as well as the algorithm's limitations. PMID:26131896
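The AIC-based model selection mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard least-squares form AIC = chi-square + 2k, which holds up to an additive constant for Gaussian errors of known variance. The function names and the toy numbers are hypothetical and are not taken from SrMise.

```python
def aic_from_chi2(chi2, n_params):
    """AIC up to an additive constant, assuming Gaussian errors of known variance."""
    return chi2 + 2 * n_params


def best_model(candidates):
    """Return the name of the candidate model with the lowest AIC.

    candidates: dict mapping a model name to a (chi2, n_params) tuple.
    """
    return min(candidates, key=lambda name: aic_from_chi2(*candidates[name]))


# Hypothetical fits: a third peak lowers chi-square only slightly,
# so the three extra parameters are not justified by the AIC.
fits = {"two_peaks": (105.0, 6), "three_peaks": (103.5, 9)}
print(best_model(fits))
```

In this toy case the two-peak model scores 117.0 against 121.5 for the three-peak model, so the more parsimonious model is preferred.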
Mapping cultivable land from satellite imagery with clustering algorithms
NASA Astrophysics Data System (ADS)
Arango, R. B.; Campos, A. M.; Combarro, E. F.; Canas, E. R.; Díaz, I.
2016-07-01
Open data satellite imagery provides valuable data for the planning and decision-making processes related to environmental domains. Specifically, agriculture uses remote sensing in a wide range of services, from monitoring the health of crops to forecasting the spread of crop diseases. In particular, this paper focuses on a methodology for the automatic delimitation of cultivable land by means of machine learning algorithms and satellite data. The method uses a partition clustering algorithm called Partitioning Around Medoids and considers the quality of the clusters obtained for each satellite band in order to evaluate which one better identifies cultivable land. The proposed method was tested with vineyards using as input the spectral and thermal bands of the Landsat 8 satellite. The experimental results show the great potential of this method for cultivable land monitoring from remote-sensed multispectral imagery.
Synchronous Firefly Algorithm for Cluster Head Selection in WSN.
Baskaran, Madhusudhanan; Sadagopan, Chitra
2015-01-01
A Wireless Sensor Network (WSN) consists of small, low-cost, low-power multifunctional nodes interconnected to efficiently aggregate and transmit data to a sink. Cluster-based approaches use some nodes as Cluster Heads (CHs) and organize WSNs efficiently for aggregation of data and energy saving. A CH conveys information gathered by cluster nodes and aggregates/compresses data before transmitting it to a sink. However, this additional responsibility of the node results in a higher energy drain, leading to uneven network degradation. Low Energy Adaptive Clustering Hierarchy (LEACH) offsets this by probabilistically rotating the cluster-head role among nodes with energy above a set threshold. CH selection in WSN is NP-Hard as optimal data aggregation with efficient energy savings cannot be solved in polynomial time. In this work, a modified firefly heuristic, the synchronous firefly algorithm, is proposed to improve the network performance. Extensive simulation shows the proposed technique to perform well compared to LEACH and energy-efficient hierarchical clustering (EEHC). Simulations show the effectiveness of the proposed method in decreasing the packet loss ratio by an average of 9.63% and improving the energy efficiency of the network when compared to LEACH and EEHC. PMID:26495431
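The probabilistic rotation used by LEACH, the baseline in the abstract above, can be sketched as follows. This is a minimal illustration of the commonly cited LEACH threshold T = p / (1 - p (r mod 1/p)); it is not the synchronous firefly algorithm of the paper. The function names and the `eligible` set (which stands for whatever eligibility rule is in force, for example sufficient residual energy) are assumptions made for the example.

```python
import random


def leach_threshold(p, round_no):
    """LEACH threshold for an eligible node in the given round.

    p is the desired fraction of cluster heads; an epoch lasts round(1/p) rounds.
    """
    epoch = round(1 / p)
    return p / (1 - p * (round_no % epoch))


def elect_cluster_heads(node_ids, eligible, p, round_no, rng=random):
    """An eligible node becomes cluster head if its random draw is below the threshold."""
    t = leach_threshold(p, round_no)
    return [n for n in node_ids if n in eligible and rng.random() < t]


# Hypothetical example: 10% cluster heads, first round of an epoch.
heads = elect_cluster_heads(range(100), set(range(100)), p=0.1, round_no=0)
```

The threshold grows over an epoch, from p in the first round to 1 in the last, so every eligible node that has not yet been selected is selected by the end of the epoch.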
Large Data Visualization on Distributed Memory Multi-GPU Clusters
Childs, Henry R.
2010-03-01
Data sets of immense size are regularly generated on large scale computing resources. Even among more traditional methods for acquisition of volume data, such as MRI and CT scanners, data which is too large to be effectively visualized on standard workstations is now commonplace. One solution to this problem is to employ a 'visualization cluster,' a small to medium scale cluster dedicated to performing visualization and analysis of massive data sets generated on larger scale supercomputers. These clusters are designed to fit a different need than traditional supercomputers, and therefore their design mandates different hardware choices, such as increased memory, and more recently, graphics processing units (GPUs). While there has been much previous work on distributed memory visualization as well as GPU visualization, there is a relative dearth of algorithms which effectively use GPUs at a large scale in a distributed memory environment. In this work, we study a common visualization technique in a GPU-accelerated, distributed memory setting, and present performance characteristics when scaling to extremely large data sets.
Advanced defect detection algorithm using clustering in ultrasonic NDE
NASA Astrophysics Data System (ADS)
Gongzhang, Rui; Gachagan, Anthony
2016-02-01
A range of materials used in industry exhibit scattering properties that limit ultrasonic NDE. Many algorithms have been proposed to enhance defect detection ability, such as the well-known Split Spectrum Processing (SSP) technique. Scattering noise usually cannot be fully removed, and the remaining noise can be easily confused with real feature signals, hence becoming artefacts during the image interpretation stage. This paper presents an advanced algorithm to further reduce the influence of artefacts remaining in A-scan data after processing using a conventional defect detection algorithm. The raw A-scan data can be acquired from either traditional single transducer or phased array configurations. The proposed algorithm uses the concept of unsupervised machine learning to cluster segmental defect signals from pre-processed A-scans into different classes. The distinction and similarity between each class and the ensemble of randomly selected noise segments can be observed by applying a classification algorithm. Each class is then labelled as 'legitimate reflector' or 'artefacts' based on this observation, and the expected probability of detection (PoD) and probability of false alarm (PFA) are determined. To facilitate data collection and validate the proposed algorithm, a 5 MHz linear array transducer is used to collect A-scans from both austenitic steel and Inconel samples. Each pulse-echo A-scan is pre-processed using SSP, and the subsequent application of the proposed clustering algorithm has provided an additional reduction in PFA while maintaining PoD for both samples compared with SSP results alone.
Non-equilibrium relaxation analysis in cluster algorithms
NASA Astrophysics Data System (ADS)
Nonomura, Yoshihiko
2014-03-01
In Monte Carlo studies of phase transitions, critical slowing down has been a serious problem. To overcome this difficulty, two kinds of approaches have been proposed. One is cluster algorithms, where a global update scheme based on percolation theory is introduced in order to avoid the power-law behavior at the critical point. The other is the non-equilibrium relaxation method, where the power-law critical relaxation process is analyzed by dynamical scaling theory in order to avoid time-consuming equilibration. The next step is then to fuse these two approaches: to investigate phase transitions with the early-stage relaxation process of cluster algorithms. Since dynamical scaling theory does not hold in cluster algorithms in principle, such an attempt had been considered impossible. In the present talk we show that such a fusion is actually possible using an empirical scaling form obtained from the 2D Ising models instead of dynamical scaling theory. Applications to the q >= 3 Potts models, +/- J Ising models, etc. will also be explained in the presentation.
An improved distance matrix computation algorithm for multicore clusters.
Al-Neama, Mohammed W; Reda, Naglaa M; Ghaleb, Fayed F M
2014-01-01
Distance matrices are used in diverse research areas. Their computation is typically an essential task in most bioinformatics applications, especially in multiple sequence alignment. The explosive growth of biological sequence databases leads to an urgent need for accelerating these computations. The DistVect algorithm was introduced in the paper of Al-Neama et al. (in press) as a recent approach for vectorizing distance matrix computation. It showed efficient performance in both sequential and parallel computing. However, the multicore cluster systems that are now available, with their scalability and performance/cost ratio, meet the need for more powerful and efficient performance. This paper proposes DistVect1 as a highly efficient parallel vectorized algorithm for computing distance matrices on multicore clusters. It reformulates the DistVect1 vectorized algorithm in terms of cluster primitives. It deduces an efficient approach to partitioning and scheduling computations suited to this type of architecture. Implementations exploit both the MPI and OpenMP libraries. Experimental results show that the proposed method achieves around a 3-fold speedup over SSE2. Further, it achieves speedups of more than 9 orders of magnitude compared to the publicly available parallel implementation utilized in ClustalW-MPI. PMID:25013779
Comparison of cluster expansion fitting algorithms for interactions at surfaces
NASA Astrophysics Data System (ADS)
Herder, Laura M.; Bray, Jason M.; Schneider, William F.
2015-10-01
Cluster expansions (CEs) are Ising-type interaction models that are increasingly used to model interaction and ordering phenomena at surfaces, such as the adsorbate-adsorbate interactions that control coverage-dependent adsorption or surface-vacancy interactions that control surface reconstructions. CEs are typically fit to a limited set of data derived from density functional theory (DFT) calculations. The CE fitting process involves iterative selection of DFT data points to include in a fit set and selection of interaction clusters to include in the CE. Here we compare the performance of three CE fitting algorithms, namely the MIT Ab-initio Phase Stability code (MAPS, the default in ATAT software), a genetic algorithm (GA), and a steepest descent (SD) algorithm, against synthetic data. The synthetic data is encoded in model Hamiltonians of varying complexity motivated by the observed behavior of atomic adsorbates on a face-centered-cubic transition metal close-packed (111) surface. We compare the performance of the leave-one-out cross-validation score against the true fitting error available from knowledge of the hidden CEs. For these systems, SD achieves lowest overall fitting and prediction error independent of the underlying system complexity. SD also most accurately predicts cluster interaction energies without ignoring or introducing extra interactions into the CE. MAPS achieves good results in fewer iterations, while the GA performs least well for these particular problems.
ICANP2: Isoenergetic cluster algorithm for NP-complete Problems
NASA Astrophysics Data System (ADS)
Zhu, Zheng; Fang, Chao; Katzgraber, Helmut G.
NP-complete optimization problems with Boolean variables are of fundamental importance in computer science, mathematics and physics. Most notably, the minimization of general spin-glass-like Hamiltonians remains a difficult numerical task. There has been a great interest in designing efficient heuristics to solve these computationally difficult problems. Inspired by the rejection-free isoenergetic cluster algorithm developed for Ising spin glasses, we present a generalized cluster update that can be applied to different NP-complete optimization problems with Boolean variables. The cluster updates allow for a wide-spread sampling of phase space, thus speeding up optimization. By carefully tuning the pseudo-temperature (needed to randomize the configurations) of the problem, we show that the method can efficiently tackle problems on topologies with a large site-percolation threshold. We illustrate the ICANP2 heuristic on paradigmatic optimization problems, such as the satisfiability problem and the vertex cover problem.
An algorithm for point cluster generalization based on the Voronoi diagram
NASA Astrophysics Data System (ADS)
Yan, Haowen; Weibel, Robert
2008-08-01
This paper presents an algorithm for point cluster generalization. Four types of information, i.e. statistical, thematic, topological, and metric information, are considered, and measures are selected to describe the corresponding types of information quantitatively in the algorithm, i.e. the number of points for statistical information, the importance value for thematic information, the Voronoi neighbors for topological information, and the distribution range and relative local density for metric information. Based on these measures, an algorithm for point cluster generalization is developed. Firstly, point clusters are triangulated and a border polygon of the point clusters is obtained. Using the border polygon, some pseudo points are added to the original point clusters to form a new point set, and a range polygon that encloses all original points is constructed. Secondly, the Voronoi polygons of the new point set are computed in order to obtain the so-called relative local density of each point. Further, the selection probability of each point is computed using its relative local density and importance value, and the points to be deleted are marked as 'deleted' according to their selection probabilities and Voronoi neighboring relations. Thirdly, if the number of retained points does not satisfy that computed by the Radical Law, the points marked as 'deleted' are physically deleted to form a new point set, and the second step is repeated; otherwise, the pseudo points and the points marked as 'deleted' are physically deleted, and the generalized point clusters are obtained. Owing to the use of the Voronoi diagram, the algorithm is parameter free and fully automatic. As our experiments show, it can be used in the generalization of point features arranged in clusters such as thematic dot maps and control points on cartographic maps.
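The stopping criterion in the third step above depends on the target point count given by the Radical Law. A minimal sketch of that calculation follows, assuming the basic form of Töpfer's Radical Law, n_target = n_source * sqrt(M_source / M_target), where M denotes a map scale denominator. The function name and the example scales are hypothetical, and the paper may use a variant with additional constants.

```python
import math


def radical_law_count(n_source, scale_source, scale_target):
    """Number of points to retain when generalizing from 1:scale_source to 1:scale_target."""
    return round(n_source * math.sqrt(scale_source / scale_target))


# Hypothetical example: 400 points on a 1:10,000 map generalized to 1:40,000.
print(radical_law_count(400, 10_000, 40_000))
```

Here the scale denominator quadruples, so the square-root rule keeps half of the points.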
NIC-based Reduction Algorithms for Large-scale Clusters
Petrini, F; Moody, A T; Fernandez, J; Frachtenberg, E; Panda, D K
2004-07-30
Efficient algorithms for reduction operations across a group of processes are crucial for good performance in many large-scale, parallel scientific applications. While previous algorithms limit processing to the host CPU, we utilize the programmable processors and local memory available on modern cluster network interface cards (NICs) to explore a new dimension in the design of reduction algorithms. In this paper, we present the benefits and challenges, design issues and solutions, analytical models, and experimental evaluations of a family of NIC-based reduction algorithms. Performance and scalability evaluations were conducted on the ASCI Linux Cluster (ALC), a 960-node, 1920-processor machine at Lawrence Livermore National Laboratory, which uses the Quadrics QsNet interconnect. We find NIC-based reductions on modern interconnects to be more efficient than host-based implementations in both scalability and consistency. In particular, at large scale (1812 processes), NIC-based reductions of small integer and floating-point arrays provided respective speedups of 121% and 39% over the host-based, production-level MPI implementation.
Algorithm-dependent fault tolerance for distributed computing
P. D. Hough; M. E. Goldsby; E. J. Walsh
2000-02-01
Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.
Finding reproducible cluster partitions for the k-means algorithm
2013-01-01
K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset. PMID:23369085
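The dependence of the SSQ on initialisation described above can be explored with a small experiment. The sketch below is a plain implementation of Lloyd's k-means with random initial centers, written only for illustration. It does not implement the stability index or the 2-d map proposed in the paper, and the function names and toy data are hypothetical.

```python
import random


def ssq(points, centers, labels):
    """Total within-cluster sum of squared distances."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(p, centers[l]))
        for p, l in zip(points, labels)
    )


def kmeans(points, k, seed, iters=100):
    """Lloyd's algorithm with initial centers sampled from the data."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments are stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, ssq(points, centers, labels)


# Toy data with two well-separated groups; repeat over several initialisations.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
        (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
runs = [kmeans(data, 2, seed) for seed in range(5)]
best_labels, best_ssq = min(runs, key=lambda r: r[1])
```

Comparing the partitions in `runs` that have similar SSQ values is the kind of check the abstract argues for: on harder data and larger k, near-equal SSQ does not guarantee the same partition.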
Einstein imaging observations of clusters with a bimodal mass distribution
NASA Technical Reports Server (NTRS)
Forman, W.; Bechtold, J.; Blair, W.; Giacconi, R.; Van Speybroeck, L.; Jones, C.
1981-01-01
Einstein imaging observations of four X-ray clusters of galaxies characterized by a double X-ray surface brightness and thus mass distribution are presented. The clusters A98, A115, A1750 and SC 0627-54 were found to exhibit two enhancements in their X-ray surface brightness distributions in observations made with the Einstein Imaging Proportional Counter. Calculations of the probability that the clusters represent chance superpositions indicate that the double clusters are physically associated. The radial distributions of the components are inconsistent with those of single point sources, and have been used to derive cluster luminosities which are typical of rich clusters. Masses of the subclusters are also found to be typical of bound and virialized clusters with gas contributing 10%. Within the framework of the hierarchical theory of galactic clustering, the double clusters are suggested to represent an intermediate evolutionary stage before the merger of subclusters into a relaxed Coma-type cluster.
Distributed cluster management techniques for unattended ground sensor networks
NASA Astrophysics Data System (ADS)
Essawy, Magdi A.; Stelzig, Chad A.; Bevington, James E.; Minor, Sharon
2005-05-01
Smart Sensor Networks are becoming important target detection and tracking tools. The challenging problems in such networks include the sensor fusion, data management and communication schemes. This work discusses techniques used to distribute sensor management and multi-target tracking responsibilities across an ad hoc, self-healing cluster of sensor nodes. Although miniaturized computing resources possess the ability to host complex tracking and data fusion algorithms, there still exist inherent bandwidth constraints on the RF channel. Therefore, special attention is placed on the reduction of node-to-node communications within the cluster by minimizing unsolicited messaging, and distributing the sensor fusion and tracking tasks onto local portions of the network. Several challenging problems are addressed in this work including track initialization and conflict resolution, track ownership handling, and communication control optimization. Emphasis is also placed on increasing the overall robustness of the sensor cluster through independent decision capabilities on all sensor nodes. Track initiation is performed using collaborative sensing within a neighborhood of sensor nodes, allowing each node to independently determine if initial track ownership should be assumed. This autonomous track initiation prevents the formation of duplicate tracks while eliminating the need for a central "management" node to assign tracking responsibilities. Track update is performed as an ownership node requests sensor reports from neighboring nodes based on track error covariance and the neighboring nodes geo-positional location. Track ownership is periodically recomputed using propagated track states to determine which sensing node provides the desired coverage characteristics. High fidelity multi-target simulation results are presented, indicating the distribution of sensor management and tracking capabilities to not only reduce communication bandwidth consumption, but to also
Performance Analysis of Apriori Algorithm with Different Data Structures on Hadoop Cluster
NASA Astrophysics Data System (ADS)
Singh, Sudhakar; Garg, Rakhi; Mishra, P. K.
2015-10-01
Mining frequent itemsets from massive datasets has always been one of the most important problems in data mining. Apriori is the most popular and simplest algorithm for frequent itemset mining. To enhance the efficiency and scalability of Apriori, a number of algorithms have been proposed addressing the design of efficient data structures, the minimization of database scans, and parallel and distributed processing. MapReduce is an emerging parallel and distributed technology for processing big datasets on a Hadoop cluster. To mine big datasets it is essential to re-design data mining algorithms for this new paradigm. In this paper, we implement three variations of the Apriori algorithm on the MapReduce paradigm using the data structures hash tree, trie, and hash table trie (i.e., trie with hashing). We investigate the significance of these three data structures for the Apriori algorithm on a Hadoop cluster, which has not yet been given attention. Experiments carried out on both real-life and synthetic datasets show that the hash table trie performs far better than the trie and hash tree in terms of execution time. Moreover, the hash tree performs worst.
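For readers unfamiliar with Apriori, the level-wise candidate generation it relies on can be sketched on a single machine. This is a minimal in-memory illustration that counts candidates in a plain Python dictionary; it does not use MapReduce or the hash tree, trie, or hash table trie structures compared in the paper. The function name and toy transactions are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items}
    frequent = {}
    k = 1
    while current:
        # Count the support of each candidate by scanning the transactions.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets that have size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = apriori(baskets, min_support=3)
```

With a minimum support of 3, all single items and all pairs are frequent, while the triple {a, b, c} appears in only two baskets and is dropped.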
Dynamically Incremental K-means++ Clustering Algorithm Based on Fuzzy Rough Set Theory
NASA Astrophysics Data System (ADS)
Li, Wei; Wang, Rujing; Jia, Xiufang; Jiang, Qing
Because the classic K-means++ clustering algorithm handles only static data, a dynamically incremental K-means++ clustering algorithm (DK-Means++) based on fuzzy rough set theory is presented in this paper. Firstly, in the DK-Means++ clustering algorithm, the similarity formula is improved with weights computed from the importance degree of the attributes, which are reduced on the basis of rough fuzzy set theory. Secondly, new data need only be matched to the granules already clustered by the K-means++ algorithm; only occasionally are new data clustered by the classic K-means++ algorithm over the global data. In this way, re-clustering all data each time the dynamic data set changes is avoided, so the efficiency of clustering is improved. Our experiments show that the DK-Means++ algorithm can objectively and efficiently deal with the clustering of dynamically incremental data.
A new detection algorithm for microcalcification clusters in mammographic screening
NASA Astrophysics Data System (ADS)
Xie, Weiying; Ma, Yide; Li, Yunsong
2015-05-01
A novel approach for the detection of microcalcification clusters is proposed. First, we briefly analyze mammographic images with microcalcification lesions to confirm that these lesions have much greater gray values than normal regions. After summarizing the specific features of microcalcification clusters in mammographic screening, we focus on the preprocessing step, which includes eliminating the background, enhancing the image, and eliminating the pectoral muscle. In detail, the Chan-Vese model is used for eliminating the background. We then combine a morphology method with an edge detection method. After the AND operation and Sobel filter, we apply the Hough transform, which gives good results in eliminating the pectoral muscle, whose gray level is close to that of microcalcifications. Additionally, the enhancement step is achieved by morphology. We put effort into mammographic image preprocessing to achieve lower computational complexity. As is well known, robust analysis of mammograms is difficult because of the low contrast between normal and lesion tissues, and such images also contain much noise. After this thorough preprocessing, a method based on blob detection is applied to microcalcification clusters according to their specific features. The proposed algorithm employs the Laplace operator to improve the Difference of Gaussians (DoG) function for low-contrast images. A preliminary evaluation of the proposed method is performed on a well-known public database, MIAS, rather than on synthetic images. The comparison experiments and Cohen's kappa coefficients all demonstrate that our proposed approach can potentially obtain better microcalcification cluster detection results in terms of accuracy, sensitivity and specificity.
Clustering based on conditional distributions in an auxiliary space.
Sinkkonen, Janne; Kaski, Samuel
2002-01-01
We study the problem of learning groups or categories that are local in the continuous primary space but homogeneous by the distributions of an associated auxiliary random variable over a discrete auxiliary space. Assuming that variation in the auxiliary space is meaningful, categories will emphasize similarly meaningful aspects of the primary space. From a data set consisting of pairs of primary and auxiliary items, the categories are learned by minimizing a Kullback-Leibler divergence-based distortion between (implicitly estimated) distributions of the auxiliary data, conditioned on the primary data. Still, the categories are defined in terms of the primary space. An online algorithm resembling the traditional Hebb-type competitive learning is introduced for learning the categories. Minimizing the distortion criterion turns out to be equivalent to maximizing the mutual information between the categories and the auxiliary data. In addition, connections to density estimation and to the distributional clustering paradigm are outlined. The method is demonstrated by clustering yeast gene expression data from DNA chips, with biological knowledge about the functional classes of the genes as the auxiliary data. PMID:11747539
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1992-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
A Fast Clustering Algorithm for Data with a Few Labeled Instances
Yang, Jinfeng; Xiao, Yong; Wang, Jiabing; Ma, Qianli; Shen, Yanhua
2015-01-01
The diameter of a cluster is the maximum intracluster distance between pairs of instances within the same cluster, and the split of a cluster is the minimum distance between instances within the cluster and instances outside the cluster. Given a few labeled instances, this paper makes two contributions. First, we present a simple and fast clustering algorithm with the following property: if the ratio of the minimum split to the maximum diameter (RSD) of the optimal solution is greater than one, the algorithm returns optimal solutions for three clustering criteria. Second, we study the metric learning problem: learn a distance metric to make the RSD as large as possible. Compared with existing metric learning algorithms, one of our metric learning algorithms is computationally efficient: it is a linear programming model rather than the semidefinite programming model used by most existing algorithms. We demonstrate empirically that the supervision and the learned metric can improve the clustering quality. PMID:25861252
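The definitions of diameter, split and RSD in the abstract above can be made concrete with a short sketch. It only evaluates the RSD of a given partition under the Euclidean metric; it is not the paper's clustering or metric learning algorithm, and the function names and toy points are assumptions made for illustration.

```python
from itertools import combinations
from math import dist

def diameter(cluster):
    """Maximum distance between two instances of the same cluster."""
    return max((dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)

def split(cluster, outside):
    """Minimum distance between an instance in the cluster and one outside it."""
    return min(dist(p, q) for p in cluster for q in outside)

def rsd(clusters):
    """Ratio of the minimum split to the maximum diameter of a partition.

    Assumes at least two clusters and at least one cluster with two or more
    distinct points, so that the maximum diameter is non-zero.
    """
    max_diameter = max(diameter(c) for c in clusters)
    min_split = min(
        split(c, [q for j, other in enumerate(clusters) if j != i for q in other])
        for i, c in enumerate(clusters)
    )
    return min_split / max_diameter

# Two tight, well-separated groups: diameters are 1, the split is 4.
clusters = [[(0.0, 0.0), (1.0, 0.0)], [(5.0, 0.0), (6.0, 0.0)]]
ratio = rsd(clusters)
```

An RSD greater than one means every instance is closer to all members of its own cluster than to any instance of another cluster, which is the regime in which the abstract claims optimality.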
HST Imaging of the Globular Clusters in the Fornax Cluster: Color and Luminosity Distributions
NASA Technical Reports Server (NTRS)
Grillmair, C. J.; Forbes, D. A.; Brodie, J.; Elson, R.
1998-01-01
We examine the luminosity and B - I color distribution of globular clusters for three early-type galaxies in the Fornax cluster using imaging data from the Wide Field/Planetary Camera 2 on the Hubble Space Telescope.
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various conditions were collected in conjunction with JSC postural control studies using a Tilt-Translation Device (TTD). The University of West Florida proposed applying the Fuzzy C-Means Clustering (FCM) Algorithms to these data with a view towards identifying various states and stages. Data supplied by NASA/JSC were submitted to the FCM algorithms in an attempt to identify and characterize cluster substructure in a mixed ensemble of pre- and post-adaptational TTD data. Following several unsuccessful trials with FCM using a full 11 dimensional data set, a set of two channels (features) was found to enable FCM to separate pre- from post-adaptational TTD data. The main conclusions are that: (1) FCM seems able to separate pre- from post-TTD subject no. 2 on the one trial that was used, but only in certain subintervals of time; and (2) Channels 2 (right rear transducer force) and 8 (hip sway bar) contain better discrimination information than other supersets and combinations of the data that were tried so far.
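For readers unfamiliar with FCM, the following is a minimal sketch of the textbook fuzzy c-means iteration on two features, in the spirit of the two-channel result described above. It is not the code used in the report: the toy points, the initial centres, the fuzzifier m = 2 and the function names are all assumptions chosen for illustration.

```python
from math import dist

def memberships(points, centers, m=2.0, eps=1e-12):
    """Standard FCM membership update; each row sums to 1 across clusters."""
    power = 2.0 / (m - 1.0)  # requires m > 1
    rows = []
    for x in points:
        d = [max(dist(x, c), eps) for c in centers]  # eps guards zero distances
        rows.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate the membership update and the weighted-mean centre update."""
    u = memberships(points, centers, m)
    dims = len(points[0])
    for _ in range(iters):
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]  # memberships raised to the fuzzifier
            total = sum(w)
            new_centers.append(tuple(
                sum(wk * x[j] for wk, x in zip(w, points)) / total
                for j in range(dims)
            ))
        centers = new_centers
        u = memberships(points, centers, m)
    return centers, u

# Hypothetical two-feature samples forming two groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, u = fuzzy_c_means(points, centers=[(1.0, 1.0), (9.0, 9.0)])
```

Unlike hard k-means, every sample keeps a graded membership in every cluster, so samples recorded during a transition between states would show intermediate memberships rather than being forced into one group.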
jClustering, an open framework for the development of 4D clustering algorithms.
Mateos-Pérez, José María; García-Villalba, Carmen; Pascau, Javier; Desco, Manuel; Vaquero, Juan J
2013-01-01
We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary. PMID:23990913
Applying various algorithms for species distribution modelling.
Li, Xinhai; Wang, Yuan
2013-06-01
Species distribution models have been used extensively in many fields, including climate change biology, landscape ecology and conservation biology. In the past 3 decades, a number of new models have been proposed, yet researchers still find it difficult to select appropriate models for data and objectives. In this review, we aim to provide insight into the prevailing species distribution models for newcomers in the field of modelling. We compared 11 popular models, including regression models (the generalized linear model, the generalized additive model, the multivariate adaptive regression splines model and hierarchical modelling), classification models (mixture discriminant analysis, the generalized boosting model, and classification and regression tree analysis) and complex models (artificial neural network, random forest, genetic algorithm for rule set production and maximum entropy approaches). Our objectives are: (i) to compare the strengths and weaknesses of the models, their characteristics and identify suitable situations for their use (in terms of data type and species-environment relationships) and (ii) to provide guidelines for model application, including 3 steps: model selection, model formulation and parameter estimation. PMID:23731809
Radial distribution of metallicity in the LMC cluster systems
NASA Technical Reports Server (NTRS)
Kontizas, M.; Kontizas, E.; Michalitsianos, A. G.
1993-01-01
New determinations of the deprojected distances to the galaxy center for 94 star clusters and their metal abundances are used to investigate the variation of metallicity across the two LMC star cluster systems (Kontizas et al. 1990). A systematic radial trend of metallicity is observed in the extended outer cluster system, the outermost clusters being significantly metal poorer than the more central ones, with the exception of six clusters (which might lie out of the plane of the cluster system) out of 77. A radial metallicity gradient has been found, qualitatively comparable to that of the Milky Way for its system of the old disk clusters. If the six clusters are taken into consideration then the outer cluster system is well mixed up to 8 kpc. The spatial distribution of metallicities for the inner LMC cluster system, consisting of very young globulars does not show a systematic radial trend; they are all metal rich.
Jiang, Peng; Xu, Yiming; Wu, Feng
2016-01-01
Existing move-restricted node self-deployment algorithms are based on a fixed node communication radius, evaluate the performance based on network coverage or the connectivity rate and do not consider the number of nodes near the sink node and the energy consumption distribution of the network topology, thereby degrading network reliability and the energy consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins the uneven clustering based on the distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, the cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm originally considers the network reliability and energy consumption balance during node deployment and considers the coverage redundancy rate of all positions that a node may reach during the node position adjustment. Simulation results show, compared to the connected dominating set (CDS) based depth computation algorithm, that the proposed algorithm can increase the number of the nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve network coverage rate and reduce energy consumption. PMID:26784193
Learning Based Approach for Optimal Clustering of Distributed Program's Call Flow Graph
NASA Astrophysics Data System (ADS)
Abofathi, Yousef; Zarei, Bager; Parsa, Saeed
Optimal clustering of a call flow graph to reach maximum concurrency in the execution of distributable components is an NP-complete problem. Learning automata (LAs) are search tools that are used to solve many NP-complete problems. In this paper, a learning-based algorithm is proposed for optimal clustering of the call flow graph and appropriate distribution of programs at the network level. The algorithm uses the learning feature of LAs to search the state space. It is shown that using LAs in the search process markedly increases the speed of reaching a solution and also prevents the algorithm from being trapped in local minima. Experimental results show the superiority of the proposed algorithm over others.
GX-Means: A model-based divide and merge algorithm for geospatial image clustering
Vatsavai, Raju; Symons, Christopher T; Chandola, Varun; Jun, Goo
2011-01-01
One of the practical issues in clustering is the specification of the appropriate number of clusters, which is not obvious when analyzing geospatial datasets, partly because they are huge (both in size and spatial extent) and high dimensional. In this paper we present a computationally efficient model-based split and merge clustering algorithm that incrementally finds model parameters and the number of clusters. Additionally, we attempt to provide insights into this problem and other data mining challenges that are encountered when clustering geospatial data. The basic algorithm we present is similar to the G-means and X-means algorithms; however, our proposed approach avoids certain limitations of these well-known clustering algorithms that are pertinent when dealing with geospatial data. We compare the performance of our approach with the G-means and X-means algorithms. Experimental evaluation on simulated data and on multispectral and hyperspectral remotely sensed image data demonstrates the effectiveness of our algorithm.
Applying Social Networking and Clustering Algorithms to Galaxy Groups in ALFALFA
NASA Astrophysics Data System (ADS)
Bramson, Ali; Wilcots, E. M.
2012-01-01
Because most galaxies live in groups, and the environment in which it resides affects the evolution of a galaxy, it is crucial to develop tools to understand how galaxies are distributed within groups. At the same time we must understand how groups are distributed and connected in the larger scale structure of the Universe. I have applied a variety of networking techniques to assess the substructure of galaxy groups, including distance matrices, agglomerative hierarchical clustering algorithms and dendrograms. We use distance matrices to locate groupings spatially in 3-D. Dendrograms created from agglomerative hierarchical clustering results allow us to quantify connections between galaxies and galaxy groups. The shape of the dendrogram reveals if the group is spatially homogenous or clumpy. These techniques are giving us new insight into the structure and dynamical state of galaxy groups and large scale structure. We specifically apply these techniques to the ALFALFA survey of the Coma-Abell 1367 supercluster and its resident galaxy groups.
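The use of agglomerative hierarchical clustering on 3-D positions, as described in the abstract above, can be sketched as follows. This is a naive single-linkage version written for clarity; the abstract does not say which linkage the authors used, and the coordinates and function names here are made up for illustration.

```python
from itertools import combinations
from math import dist

def linkage_distance(points, a, b):
    """Single linkage: smallest distance between members of clusters a and b."""
    return min(dist(points[i], points[j]) for i in a for j in b)

def single_linkage(points):
    """Return the merge history as (cluster_a, cluster_b, height) tuples.

    Clusters are tuples of point indices. The sequence of heights is the
    information a dendrogram displays.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: linkage_distance(points, *pair))
        merges.append((a, b, linkage_distance(points, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

# Hypothetical 3-D positions: two tight pairs far apart from each other.
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (10.0, 0.0, 0.0), (11.0, 0.0, 0.0)]
merges = single_linkage(positions)
heights = [h for _, _, h in merges]
```

In this toy case the early merges happen at small heights and the final merge at a much larger one, which is the dendrogram signature of a clumpy rather than spatially uniform group.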
Dynamic Layered Dual-Cluster Heads Routing Algorithm Based on Krill Herd Optimization in UWSNs.
Jiang, Peng; Feng, Yang; Wu, Feng; Yu, Shanen; Xu, Huan
2016-01-01
Aimed at the limited energy of nodes in underwater wireless sensor networks (UWSNs) and the heavy load of cluster heads in clustering routing algorithms, this paper proposes a dynamic layered dual-cluster routing algorithm based on Krill Herd optimization in UWSNs. Cluster size is first decided by the distance between the cluster head nodes and sink node, and a dynamic layered mechanism is established to avoid the repeated selection of the same cluster head nodes. The Krill Herd optimization algorithm is used to select the optimal and second optimal cluster heads, and its Lagrange model directs nodes to a high likelihood area. It ultimately realizes the functions of data collection and data transition. The simulation results show that the proposed algorithm can effectively decrease cluster energy consumption, balance the network energy consumption, and prolong the network lifetime. PMID:27589744
NASA Astrophysics Data System (ADS)
Park, Sang Ha; Lee, Seokjin; Sung, Koeng-Mo
Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.
Evaluation of particle clustering algorithms in the prediction of brownout dust clouds
NASA Astrophysics Data System (ADS)
Govindarajan, Bharath Madapusi
2011-07-01
A study of three Lagrangian particle clustering methods has been conducted with application to the problem of predicting brownout dust clouds that develop when rotorcraft land over surfaces covered with loose sediment. A significant impediment in performing such particle modeling simulations is the extremely large number of particles needed to obtain dust clouds of acceptable fidelity. Computing the motion of each and every individual sediment particle in a dust cloud (which can reach into tens of billions per cubic meter) is computationally prohibitive. The reported work involved the development of computationally efficient clustering algorithms that can be applied to the simulation of dilute gas-particle suspensions at low Reynolds numbers of the relative particle motion. The Gaussian distribution, k-means and Osiptsov's clustering methods were studied in detail to highlight the nuances of each method for a prototypical flow field that mimics the highly unsteady, two-phase vortical particle flow obtained when rotorcraft encounter brownout conditions. It is shown that although clustering algorithms can be problem dependent and have bounds of applicability, they offer the potential to significantly reduce computational costs while retaining the overall accuracy of a brownout dust cloud solution.
Study on 2D random medium inversion algorithm based on Fuzzy C-means Clustering theory
NASA Astrophysics Data System (ADS)
Xu, Z.; Zhu, P.; Gu, Y.; Yang, X.; Jiang, J.
2015-12-01
In seismic exploration for metal deposits, the traditional seismic inversion method based on layered homogeneous medium theory has difficulty inverting the small-scale inhomogeneity and spatial variation of the actual medium. The reason is that the physical properties of the actual medium are more likely to be randomly distributed than layered. Thus, it is necessary to investigate a random medium inversion algorithm. The velocity of a 2D random medium can be described as a function of five parameters: the background velocity (V0), the standard deviation of velocity (σ), the horizontal and vertical autocorrelation lengths (A and B), and the autocorrelation angle (θ). In this study, we propose an inversion algorithm for random media based on the Fuzzy C-means Clustering (FCM) theory, whose basic idea is that FCM is used to steer the inversion process in the desired direction by clustering the estimated parameters into groups. Our method can be divided into three steps: first, the three parameters (A, B, θ) are estimated from 2D post-stack seismic data using the non-stationary random medium parameter estimation method, and the estimated parameters are then clustered into different groups by FCM; second, the initial random medium model is constructed from the clustered groups and the remaining two parameters (V0 and σ) obtained from the well logging data; finally, inversion of the random medium is conducted to obtain velocity, impedance and random medium parameters using the Conjugate Gradient Method. The inversion experiments on synthetic seismic data show that the velocity models inverted by our algorithm are close to the real velocity distribution and that the boundaries of different media can be distinguished clearly.
Key words: random medium, inversion, FCM, parameter estimation
A comparison of queueing, cluster and distributed computing systems
NASA Technical Reports Server (NTRS)
Kaplan, Joseph A.; Nelson, Michael L.
1993-01-01
Using workstation clusters for distributed computing has become popular with the proliferation of inexpensive, powerful workstations. Workstation clusters offer both a cost effective alternative to batch processing and an easy entry into parallel computing. However, a number of workstations on a network does not constitute a cluster. Cluster management software is necessary to harness the collective computing power. A variety of cluster management and queuing systems are compared: Distributed Queueing Systems (DQS), Condor, Load Leveler, Load Balancer, Load Sharing Facility (LSF - formerly Utopia), Distributed Job Manager (DJM), Computing in Distributed Networked Environments (CODINE), and NQS/Exec. The systems differ in their design philosophy and implementation. Based on published reports on the different systems and conversations with the systems' developers and vendors, a comparison of the systems is made on the integral issues of clustered computing.
Algorithm to extract the spanning clusters and calculate conductivity in strip geometries
NASA Astrophysics Data System (ADS)
Babalievski, F.
1995-06-01
I present an improved algorithm to solve the random resistor problem using a transfer-matrix technique. Preconditioning by spanning cluster extraction both reduces the size of the matrix and yields faster execution times when compared to previous algorithms.
Moving target tracking through distributed clustering in directional sensor networks.
Enayet, Asma; Razzaque, Md Abdur; Hassan, Mohammad Mehedi; Almogren, Ahmad; Alamri, Atif
2014-01-01
The problem of moving target tracking in directional sensor networks (DSNs) introduces new research challenges, including optimal selection of sensing and communication sectors of the directional sensor nodes, determination of the precise location of the target and an energy-efficient data collection mechanism. Existing solutions allow individual sensor nodes to detect the target's location through collaboration among neighboring nodes, where most of the sensors are activated and communicate with the sink. Therefore, they incur much overhead, loss of energy and reduced target tracking accuracy. In this paper, we have proposed a clustering algorithm, where distributed cluster heads coordinate their member nodes in optimizing the active sensing and communication directions of the nodes, precisely determining the target location by aggregating reported sensing data from multiple nodes and transferring the resultant location information to the sink. Thus, the proposed target tracking mechanism minimizes the sensing redundancy and maximizes the number of sleeping nodes in the network. We have also investigated the dynamic approach of activating sleeping nodes on-demand so that the moving target tracking accuracy can be enhanced while maximizing the network lifetime. We have carried out our extensive simulations in ns-3, and the results show that the proposed mechanism achieves higher performance compared to the state-of-the-art works. PMID:25529205
Comparing simulated and experimental molecular cluster distributions.
Olenius, Tinja; Schobesberger, Siegfried; Kupiainen-Määttä, Oona; Franchin, Alessandro; Junninen, Heikki; Ortega, Ismael K; Kurtén, Theo; Loukonen, Ville; Worsnop, Douglas R; Kulmala, Markku; Vehkamäki, Hanna
2013-01-01
Formation of secondary atmospheric aerosol particles starts with gas phase molecules forming small molecular clusters. High-resolution mass spectrometry enables the detection and chemical characterization of electrically charged clusters from the molecular scale upward, whereas the experimental detection of electrically neutral clusters, especially as a chemical composition measurement, down to 1 nm in diameter and beyond still remains challenging. In this work we simulated a set of both electrically neutral and charged small molecular clusters, consisting of sulfuric acid and ammonia molecules, with a dynamic collision and evaporation model. Collision frequencies between the clusters were calculated according to classical kinetics, and evaporation rates were derived from first principles quantum chemical calculations with no fitting parameters. We found a good agreement between the modeled steady-state concentrations of negative cluster ions and experimental results measured with the state-of-the-art Atmospheric Pressure interface Time-Of-Flight mass spectrometer (APi-TOF) in the CLOUD chamber experiments at CERN. The model can be used to interpret experimental results and give information on neutral clusters that cannot be directly measured. PMID:24600997
MixSim : An R Package for Simulating Data to Study Performance of Clustering Algorithms
Melnykov, Volodymyr; Chen, Wei-Chen; Maitra, Ranjan
2012-01-01
The R package MixSim is a new tool that allows simulating mixtures of Gaussian distributions with different levels of overlap between mixture components. Pairwise overlap, defined as a sum of two misclassification probabilities, measures the degree of interaction between components and can be readily employed to control the clustering complexity of datasets simulated from mixtures. These datasets can then be used for systematic performance investigation of clustering and finite mixture modeling algorithms. Among other capabilities of MixSim, there are computing the exact overlap for Gaussian mixtures, simulating Gaussian and non-Gaussian data, simulating outliers and noise variables, calculating various measures of agreement between two partitionings, and constructing parallel distribution plots for the graphical display of finite mixture models. All features of the package are illustrated in great detail. The utility of the package is highlighted through a small comparison study of several popular clustering algorithms.
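The notion of pairwise overlap as a sum of two misclassification probabilities can be illustrated in the simplest possible setting. The sketch below is written in Python, not R, and does not use or reproduce the MixSim API. It assumes two univariate Gaussian components with equal mixing proportions and a common standard deviation, so that the decision boundary is the midpoint between the means. The function names are hypothetical.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))

def pairwise_overlap(mu1, mu2, sigma):
    """Overlap of two equal-weight, equal-variance 1-D Gaussian components.

    Each misclassification probability is the mass of one component that
    falls on the other component's side of the midpoint boundary. The
    overlap is the sum of the two, which are equal under these assumptions.
    """
    half_gap = abs(mu1 - mu2) / (2.0 * sigma)
    misclassified = normal_cdf(-half_gap)
    return misclassified + misclassified

close = pairwise_overlap(0.0, 2.0, 1.0)   # means two standard deviations apart
far = pairwise_overlap(0.0, 8.0, 1.0)     # means eight standard deviations apart
```

The overlap shrinks towards zero as the components separate, which is why it can serve as a single dial for how hard a simulated clustering problem is. The general multivariate case with unequal covariances has no such closed form and is what the package computes exactly.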
Efficient algorithms for distributed simulation and related problems
Kumar, D.
1987-01-01
This thesis presents efficient algorithms for distributed simulation, and for the related problems of termination detection and sequential simulation. Distributed simulation algorithms applicable to the simulation of special classes of systems, such that almost no overhead messages are required are presented. By contrast, previous distributed simulation algorithms, although applicable to the general class of any discrete-event system, usually require too many overhead messages. First, a simple distributed simulation algorithm is defined with nearly zero overhead messages for simulating feedforward systems. An approximate method is developed to predict its performance in simulating a class of feedforward-queuing networks. Performance of the scheme is evaluated in simulating specific subclasses of these queuing networks. It is shown that the scheme offers a high performance for serial-parallel networks. Next, another distributed simulation scheme is defined for a class of distributed systems whose topologies may have cycles. One important problem in devising distributed simulation algorithms is that of efficient detection of termination. With this in mind, a class of termination-detection algorithms using markers is devised. Finally, a new sequential simulation algorithm is developed, based on a distributed one. This algorithm often reduces the event-list manipulations of traditional-event list-driven simulation.
NASA Astrophysics Data System (ADS)
Bahrampour, Soheil; Moshiri, Behzad; Salahshoor, Karim
2009-08-01
Most process fault monitoring systems rely on offline computation and cannot cope with novel faults, which limits their applicability. This paper presents a new online fault detection and isolation (FDI) algorithm based on a distributed online clustering approach. In the proposed approach, a clustering algorithm is used for online detection of a new trend in time series data, which indicates a faulty condition. In addition, a distributed technique is used to decompose the overall monitoring task into a series of local monitoring sub-tasks so as to locally track and capture the process faults. This algorithm not only solves the problem of online FDI, but can also handle novel faults. The diagnostic performance of the proposed FDI approach is evaluated on the Tennessee Eastman process plant as a large-scale benchmark problem.
Contributions to "k"-Means Clustering and Regression via Classification Algorithms
ERIC Educational Resources Information Center
Salman, Raied
2012-01-01
The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold; first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…
User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.
ERIC Educational Resources Information Center
Gordon, Michael D.
1991-01-01
Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
Burst detection in district metering areas using a data driven clustering algorithm.
Wu, Yipeng; Liu, Shuming; Wu, Xue; Liu, Youfei; Guan, Yisheng
2016-09-01
This paper describes a novel methodology for burst detection in a water distribution system. The proposed method has two stages. In the first stage, a clustering algorithm was employed for outlier detection, while the second stage identified the presence of bursts. An important feature of this method is that data analysis is carried out dependent on multiple flow meters whose measurements vary simultaneously in a district metering area (DMA). Moreover, the clustering-based method can automatically cope with non-stationary conditions in historical data; namely, the method has no prior data selection process. An example application of this method has been implemented to confirm that relatively large bursts (simulated by flushing) with short duration can be detected effectively. Notably, the method has a low false positive rate compared with previous studies, and the appearance of detected abnormal water usage is consistent with weather changes, showing great promise in real application to multi-inlet and multi-outlet DMAs. PMID:27176651
A contour-line color layer separation algorithm based on fuzzy clustering and region growing
NASA Astrophysics Data System (ADS)
Liu, Tiange; Miao, Qiguang; Xu, Pengfei; Tong, Yubing; Song, Jianfeng; Xia, Ge; Yang, Yun; Zhai, Xiaojie
2016-03-01
The color layers of contour-lines separated from a scanned topographic map are the basis of contour-line extraction, but it is difficult to separate them well due to color aliasing and mixed color problems. This paper focuses on contour-line color layer separation and presents a novel approach to it based on fuzzy clustering and Single-prototype Region Growing for Contour-line Layer (SRGCL). The purpose of this paper is to provide a solution for processing scanned topographic maps on which contour-lines are abundant and densely distributed; for example, in hilly areas and mountainous regions, the contour-lines always occupy the largest proportion of the linear features and contour-line separation is the most difficult task. The proposed approach includes the following steps. First, line features are extracted from the map to reduce the interference from area features in fuzzy clustering. Second, a fuzzy clustering algorithm is employed to obtain the membership matrix of pixels in the line map. Third, based on the membership matrix, we obtain the most-similar prototype and the second-similar prototype of each pixel as the indicators of the pixel in SRGCL. The spatial relationship and the fuzzy similarity of color features are used in SRGCL to overcome the inaccurate classification of ambiguous pixels. The procedure focusing on a single contour-line layer improves the accuracy of the contour-line segmentation result of SRGCL relative to general segmentation methods. We verified the algorithm on several USGS historical maps. The experimental results show that our algorithm produces contour-line color layers with good continuity and little noise, which verifies the improvement in contour-line color layer separation of our algorithm relative to two general segmentation methods.
Security clustering algorithm based on reputation in hierarchical peer-to-peer network
NASA Astrophysics Data System (ADS)
Chen, Mei; Luo, Xin; Wu, Guowen; Tan, Yang; Kita, Kenji
2013-03-01
For the security problems of the hierarchical P2P network (HPN), the paper presents a security clustering algorithm based on reputation (CABR). In the algorithm, we take the reputation mechanism for ensuring the security of transaction and use cluster for managing the reputation mechanism. In order to improve security, reduce cost of network brought by management of reputation and enhance stability of cluster, we select reputation, the historical average online time, and the network bandwidth as the basic factors of the comprehensive performance of node. Simulation results showed that the proposed algorithm improved the security, reduced the network overhead, and enhanced stability of cluster.
A distributed decision framework for building clusters with different heterogeneity settings
Jafari-Marandi, Ruholla; Omitaomu, Olufemi A.; Hu, Mengqi
2016-01-05
In the past few decades, extensive research has been conducted to develop operation and control strategies for smart buildings with the purpose of reducing energy consumption. Besides studies of single buildings, it is envisioned that next-generation buildings can freely connect with one another to share energy and exchange information in the context of the smart grid. It was demonstrated that a network of connected buildings (aka building clusters) can significantly reduce primary energy consumption and improve environmental sustainability and buildings' resilience capability. However, an analytic tool to determine which types of buildings should form a cluster, and what impact the heterogeneity of building clusters (based on energy profile) has on their energy performance, is missing. To bridge these research gaps, we propose a self-organizing map clustering algorithm to divide multiple buildings into different clusters based on their energy profiles, and a homogeneity index to evaluate the heterogeneity of different building cluster configurations. In addition, a bi-level distributed decision model is developed to study energy sharing in the building clusters. To demonstrate the effectiveness of the proposed clustering algorithm and decision model, we employ a dataset including monthly energy consumption data for 30 buildings where the data is collected every 15 min. It is demonstrated that the proposed decision model can achieve at least 13% cost savings for building clusters. Furthermore, the results show that the heterogeneity of energy profiles is an important factor in selecting the battery and renewable energy source for building clusters, and that a shared battery and shared renewable energy are preferred for more heterogeneous building clusters.
An efficient algorithm for estimating noise covariances in distributed systems
NASA Technical Reports Server (NTRS)
Dee, D. P.; Cohn, S. E.; Ghil, M.; Dalcher, A.
1985-01-01
An efficient computational algorithm for estimating the noise covariance matrices of large linear discrete stochastic-dynamic systems is presented. Such systems arise typically by discretizing distributed-parameter systems, and their size renders computational efficiency a major consideration. The proposed adaptive filtering algorithm is based on the ideas of Belanger, and is algebraically equivalent to his algorithm. The earlier algorithm, however, has computational complexity proportional to p to the 6th, where p is the number of observations of the system state, while the new algorithm has complexity proportional to only p-cubed. Further, the formulation of noise covariance estimation as a secondary filter, analogous to state estimation as a primary filter, suggests several generalizations of the earlier algorithm. The performance of the proposed algorithm is demonstrated for a distributed system arising in numerical weather prediction.
A Formal Algorithm for Verifying the Validity of Clustering Results Based on Model Checking
Huang, Shaobin; Cheng, Yuan; Lang, Dapeng; Chi, Ronghua; Liu, Guofeng
2014-01-01
The limitations in general methods to evaluate clustering will remain difficult to overcome if verifying the clustering validity continues to be based on clustering results and evaluation index values. This study focuses on a clustering process to analyze crisp clustering validity. First, we define the properties that must be satisfied by valid clustering processes and model clustering processes based on program graphs and transition systems. We then recast the analysis of clustering validity as the problem of verifying whether the model of clustering processes satisfies the specified properties with model checking. That is, we try to build a bridge between clustering and model checking. Experiments on several datasets indicate the effectiveness and suitability of our algorithms. Compared with traditional evaluation indices, our formal method can not only indicate whether the clustering results are valid but, in the case the results are invalid, can also detect the objects that have led to the invalidity. PMID:24608823
Measles and Rubella: Scale Free Distribution of Local Infection Clusters.
Yoshikura, Hiroshi; Takeuchi, Fumihiko
2016-07-22
This study examined the size distribution of local infection clusters (referred to as clusters hereafter) of measles and rubella from 2008-2013 in Japan. When the logarithm of the cluster sizes was plotted on the x-axis and the logarithm of their frequencies was plotted on the y-axis, the plots fell on a rightward descending straight line. The size distribution was observed to follow a power law. As the size distribution of the clusters could be equated with that of local secondary infections initiated by 1 patient, the size distribution of the clusters, in fact, represented the effective reproduction numbers at the local level. As the power law distribution has no typical size, it was suggested that measles or rubella epidemics in Japan had no typical reproduction number. The higher the population size and the total number of patients, the flatter the slope of the plots, and thus the larger the proportion of larger clusters. An epidemic of measles or rubella in Japan could be represented more appropriately by the cluster size frequency distribution than by the reproduction number. PMID:26567836
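The log-log plot described above amounts to fitting a straight line to (log size, log frequency) pairs. The following is a minimal sketch of that fit using ordinary least squares; the function name is hypothetical and the data are synthetic values generated from an exact power law, not figures from the study.

```python
import math


def loglog_slope(sizes, freqs):
    """Least-squares slope and intercept of log10(frequency) against log10(size)."""
    xs = [math.log10(s) for s in sizes]
    ys = [math.log10(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example: frequency = 1000 * size**-2, an exact power law.
sizes = [1, 2, 4, 8, 16]
freqs = [1000.0 * s ** -2.0 for s in sizes]
slope, intercept = loglog_slope(sizes, freqs)
```

In this framing, a flatter (less negative) slope corresponds to a larger share of large clusters, which is the relationship the abstract reports for larger populations.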
Parallelization of the Wolff single-cluster algorithm.
Kaupuzs, J; Rimsāns, J; Melnik, R V N
2010-02-01
A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models, and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of applications in which a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n ≥ 2. Furthermore, the developed OpenMP code allows us to simulate larger lattices because of the greater operative (shared) memory available. PMID:20365669
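For orientation, a serial Wolff single-cluster update for the 3D Ising model can be sketched as follows. This is a textbook-style sketch, not the authors' OpenMP code: it assumes coupling J = 1, periodic boundaries, a flat list of spins, and Python's built-in generator in place of the shuffling scheme mentioned above. The function name and lattice size are illustrative.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """One Wolff update on an L x L x L periodic lattice; returns the cluster size.

    spins is a flat list of +1/-1 values of length L**3 and is modified in place.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)  # bond probability for coupling J = 1

    def neighbours(i):
        x, y, z = i % L, (i // L) % L, i // (L * L)
        for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                           (0, -1, 0), (0, 0, 1), (0, 0, -1)):
            yield ((x + dx) % L) + ((y + dy) % L) * L + ((z + dz) % L) * L * L

    seed = rng.randrange(L ** 3)
    old = spins[seed]
    spins[seed] = -old  # flipping on visit also marks the site as in the cluster
    stack = [seed]
    size = 1
    while stack:
        site = stack.pop()
        for nb in neighbours(site):
            # Only aligned, not-yet-flipped neighbours can join the cluster.
            if spins[nb] == old and rng.random() < p_add:
                spins[nb] = -old
                stack.append(nb)
                size += 1
    return size


rng = random.Random(1)
L = 4
spins = [1] * L ** 3
before = list(spins)
size = wolff_step(spins, L, 0.3, rng)
```

The speedups quoted in the abstract correspond to parallel efficiencies of roughly 1.79/2 ≈ 0.90 on two processors and 2.67/4 ≈ 0.67 on four.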
Using Clustering Algorithms to Identify Brown Dwarf Characteristics
NASA Astrophysics Data System (ADS)
Choban, Caleb
2016-06-01
Brown dwarfs are stars that are not massive enough to sustain core hydrogen fusion, and thus fade and cool over time. The molecular composition of brown dwarf atmospheres can be determined by observing absorption features in their infrared spectrum, which can be quantified using spectral indices. Comparing these indices to one another, we can determine what kind of brown dwarf it is, and if it is young or metal-poor. We explored a new method for identifying these subgroups through the expectation-maximization machine learning clustering algorithm, which provides a quantitative and statistical way of identifying index pairs which separate rare populations. We specifically quantified two statistics, completeness and concentration, to identify the best index pairs. Starting with a training set, we defined selection regions for young, metal-poor and binary brown dwarfs, and tested these on a large sample of L dwarfs. We present the results of this analysis, and demonstrate that new objects in these classes can be found through these methods.
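As a rough illustration of the expectation-maximization clustering mentioned above, the sketch below fits a two-component Gaussian mixture to one-dimensional values. It is a simplification under stated assumptions: the study works with pairs of spectral indices, whereas this example uses a single synthetic index, and the function name, initial means, and data are hypothetical.

```python
import math


def em_two_gaussians(xs, mu, iters=50):
    """EM for a 1-D mixture of two Gaussians; mu holds the two initial means."""
    mu = list(mu)
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [weight[k] * math.exp(-(x - mu[k]) ** 2 / (2.0 * var[k]))
                    / math.sqrt(2.0 * math.pi * var[k]) for k in range(2)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate mean, variance, and mixing weight per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor keeps the density finite
            weight[k] = nk / len(xs)
    return mu, var, weight


# Synthetic index values forming two well-separated groups.
values = [0.9, 1.0, 1.1, 1.2, 0.8, 4.9, 5.0, 5.1, 5.2, 4.8]
means, variances, weights = em_two_gaussians(values, (0.0, 6.0))
```

The fitted responsibilities give a soft assignment of each object to a population, which is what makes a statistical comparison of index pairs possible.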
A heart disease recognition embedded system with fuzzy cluster algorithm.
de Carvalho, Helton Hugo; Moreno, Robson Luiz; Pimenta, Tales Cleber; Crepaldi, Paulo C; Cintra, Evaldo
2013-06-01
This article presents the viability analysis and the development of a heart disease identification embedded system. It offers a time reduction in electrocardiogram (ECG) signal processing by reducing the number of data samples, without any significant loss. The goal of the developed system is the analysis of heart signals. The ECG signals are applied to the system, which performs an initial filtering and then uses a Gustafson-Kessel fuzzy clustering algorithm for signal classification and correlation. The classification indicated common heart diseases such as angina, myocardial infarction and coronary artery diseases. The system uses the European electrocardiogram ST-T Database (EDB) as a reference for tests and evaluation. The results prove the system can perform heart disease detection on a data set reduced from 213 to just 20 samples, thus providing a reduction to just 9.4% of the original set, while maintaining the same effectiveness. This system is validated in a Xilinx Spartan®-3A FPGA. The field programmable gate array (FPGA) implemented a Xilinx MicroBlaze® Soft-Core Processor running at a 50 MHz clock rate. PMID:23394802
Logistics distribution centers location problem and algorithm under fuzzy environment
NASA Astrophysics Data System (ADS)
Yang, Lixing; Ji, Xiaoyu; Gao, Ziyou; Li, Keping
2007-11-01
Distribution centers location problem is concerned with how to select distribution centers from the potential set so that the total relevant cost is minimized. This paper mainly investigates this problem under fuzzy environment. Consequentially, chance-constrained programming model for the problem is designed and some properties of the model are investigated. Tabu search algorithm, genetic algorithm and fuzzy simulation algorithm are integrated to seek the approximate best solution of the model. A numerical example is also given to show the application of the algorithm.
Modeling and convergence analysis of distributed coevolutionary algorithms.
Subbu, Raj; Sanderson, Arthur C
2004-04-01
A theoretical foundation is presented for modeling and convergence analysis of a class of distributed coevolutionary algorithms applied to optimization problems in which the variables are partitioned among p nodes. An evolutionary algorithm at each of the p nodes performs a local evolutionary search based on its own set of primary variables, and the secondary variable set at each node is clamped during this phase. An infrequent intercommunication between the nodes updates the secondary variables at each node. The local search and intercommunication phases alternate, resulting in a cooperative search by the p nodes. First, we specify a theoretical basis for a class of centralized evolutionary algorithms in terms of construction and evolution of sampling distributions over the feasible space. Next, this foundation is extended to develop a model for a class of distributed coevolutionary algorithms. Convergence and convergence rate analyses are pursued for basic classes of objective functions. Our theoretical investigation reveals that for certain unimodal and multimodal objectives, we can expect these algorithms to converge at a geometrical rate. The distributed coevolutionary algorithms are of most interest from the perspective of their performance advantage compared to centralized algorithms, when they execute in a network environment with significant local access and internode communication delays. The relative performance of these algorithms is therefore evaluated in a distributed environment with realistic parameters of network behavior. PMID:15376831
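The alternation of local search and intercommunication described above can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the authors' model: each node runs a simple accept-if-not-worse mutation search on its own block of variables while the other blocks stay clamped, the objective is the separable sphere function, and all names and parameter values are hypothetical.

```python
import random


def sphere(x):
    """Toy unimodal objective to minimize."""
    return sum(v * v for v in x)


def coevolve(dim=6, nodes=3, epochs=20, local_steps=30, seed=0):
    """Alternate per-node local search with an exchange of primary variables.

    dim must be divisible by nodes. Returns (initial, final) objective values.
    """
    rng = random.Random(seed)
    block = dim // nodes
    shared = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
    start = sphere(shared)
    for _ in range(epochs):
        proposals = []
        for node in range(nodes):
            # Local phase: this node sees a copy whose other blocks are clamped.
            local = list(shared)
            lo, hi = node * block, (node + 1) * block
            for _ in range(local_steps):
                trial = list(local)
                for i in range(lo, hi):
                    trial[i] += rng.gauss(0.0, 0.3)
                if sphere(trial) <= sphere(local):
                    local = trial
            proposals.append(local[lo:hi])
        # Intercommunication phase: every node publishes its primary variables.
        shared = [v for part in proposals for v in part]
    return start, sphere(shared)


initial_value, final_value = coevolve()
```

Because the sphere function is separable, merging the blocks can never increase the objective here; for coupled objectives that guarantee does not hold, which is where the convergence analysis in the paper becomes relevant.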
Clustering performance comparison using K-means and expectation maximization algorithms
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-01-01
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results. PMID:26019610
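For reference, the K-means procedure named above can be written as a short loop of assignment and centroid update (Lloyd's algorithm). This sketch assumes Euclidean distance and caller-supplied initial centroids; the function name and the two-dimensional sample points are hypothetical and are not taken from the wine-quality data.

```python
import math


def kmeans(points, centroids, iters=100):
    """Lloyd's algorithm; returns final centroids and a label per point."""
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # New centroid is the coordinate-wise mean; empty clusters keep their place.
        updated = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p) for p in points]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centres, labels = kmeans(points, [(0, 0), (10, 10)])
```

Unlike EM, each point receives a hard label, so a downstream classifier such as logistic regression sees crisp cluster identities rather than probabilities.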
The distribution of early- and late-type galaxies in the Coma cluster
NASA Technical Reports Server (NTRS)
Doi, M.; Fukugita, M.; Okamura, S.; Turner, E. L.
1995-01-01
The spatial distribution and the morphology-density relation of Coma cluster galaxies are studied using a new homogeneous photometric sample of 450 galaxies down to B = 16.0 mag with quantitative morphology classification. The sample covers a wide area (10 deg X 10 deg), extending well beyond the Coma cluster. Morphological classifications into early- (E+S0) and late- (S) type galaxies are made by an automated algorithm using simple photometric parameters, with which the misclassification rate is expected to be approximately 10% with respect to early and late types given in the Third Reference Catalogue of Bright Galaxies. The flattened distribution of Coma cluster galaxies, as noted in previous studies, is most conspicuously seen if the early-type galaxies are selected. Early-type galaxies are distributed in a thick filament extended from the NE to the WSW direction that delineates a part of large-scale structure. Spiral galaxies show a distribution with a modest density gradient toward the cluster center; at least bright spiral galaxies are present close to the center of the Coma cluster. We also examine the morphology-density relation for the Coma cluster including its surrounding regions.
Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji
2014-01-01
An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of AICOE algorithm and the classical clustering algorithms. Our clustering model based on artificial immune system is also applied to the case of public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and better clustering effect. PMID:25435862
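To make the idea of an obstacle distance concrete, the sketch below replaces straight-line distance with the length of the shortest path around blocked cells on a grid. Breadth-first search is used here as a stand-in, since the abstract does not specify the path searching algorithm; facilitators are not modeled, and the function name and grids are hypothetical.

```python
from collections import deque


def obstacle_distance(grid, start, goal):
    """Shortest 4-connected path length from start to goal, avoiding cells marked 1.

    Returns float('inf') when the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return float("inf")


# A wall in the middle column forces a detour through the bottom row.
partial_wall = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
full_wall = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
detour = obstacle_distance(partial_wall, (0, 0), (0, 2))
blocked = obstacle_distance(full_wall, (0, 0), (0, 2))
```

The two corner cells are 2 steps apart without the wall and 6 steps apart with it, so a clustering method that uses this metric would treat them as far less similar.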
Parallel matrix transpose algorithms on distributed memory concurrent computers
Choi, Jaeyoung; Dongarra, J.; Walker, D.W.
1994-12-31
This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. We assume that the matrix is distributed over a P × Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The algorithms make use of non-blocking, point-to-point communication between processors. The use of nonblocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A · B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T · B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
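The block scattered distribution mentioned above is commonly understood as a block-cyclic mapping of matrix entries to a P × Q processor grid. The sketch below assumes that interpretation with square blocks and zero-based indices; the function names are hypothetical, and the code only counts which entries change owner under transposition rather than performing any communication.

```python
def owner(i, j, block, P, Q):
    """Processor (row, col) holding global entry (i, j) in a block-cyclic layout."""
    return (i // block) % P, (j // block) % Q


def transpose_traffic(n, block, P, Q):
    """Count entries of an n x n matrix whose owner differs after transposition."""
    moved = 0
    for i in range(n):
        for j in range(n):
            # Entry A[i][j] must end up where position (j, i) is stored.
            if owner(i, j, block, P, Q) != owner(j, i, block, P, Q):
                moved += 1
    return moved


example_owner = owner(5, 2, 2, 2, 3)
moved_entries = transpose_traffic(4, 1, 2, 2)
```

On a 2 × 2 grid with unit blocks, half of a 4 × 4 matrix has to move; these are the messages that point-to-point, non-blocking communication can overlap.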
MEASURING THE MASS DISTRIBUTION IN GALAXY CLUSTERS
Geller, Margaret J.; Diaferio, Antonaldo; Rines, Kenneth J.; Serra, Ana Laura
2013-02-10
Cluster mass profiles are tests of models of structure formation. Only two current observational methods of determining the mass profile, gravitational lensing and the caustic technique, are independent of the assumption of dynamical equilibrium. Both techniques enable the determination of the extended mass profile at radii beyond the virial radius. For 19 clusters, we compare the mass profile based on the caustic technique with weak lensing measurements taken from the literature. This comparison offers a test of systematic issues in both techniques. Around the virial radius, the two methods of mass estimation agree to within ~30%, consistent with the expected errors in the individual techniques. At small radii, the caustic technique overestimates the mass as expected from numerical simulations. The ratio between the lensing profile and the caustic mass profile at these radii suggests that the weak lensing profiles are a good representation of the true mass profile. At radii larger than the virial radius, the extrapolated Navarro, Frenk and White fit to the lensing mass profile exceeds the caustic mass profile. Contamination of the lensing profile by unrelated structures within the lensing kernel may be an issue in some cases; we highlight the clusters MS0906+11 and A750, superposed along the line of sight, to illustrate the potential seriousness of contamination of the weak lensing signal by these unrelated structures.
C-element: a new clustering algorithm to find high quality functional modules in PPI networks.
Ghasemi, Mahdieh; Rahgozar, Maseud; Bidkhori, Gholamreza; Masoudi-Nejad, Ali
2013-01-01
Graph clustering algorithms are widely used in the analysis of biological networks. Extracting functional modules in protein-protein interaction (PPI) networks is one such use. Most clustering algorithms that focus on finding functional modules try either to find clique-like subnetworks or to grow clusters starting from vertices with high degrees as seeds. These algorithms do not distinguish between a biological network and any other network. In the current research, we present a new procedure to find functional modules in PPI networks. Our main idea is to model a biological concept and to use this concept for finding good functional modules in PPI networks. In order to evaluate the quality of the obtained clusters, we compared the results of our algorithm with those of some other widely used clustering algorithms on three high-throughput PPI networks from Saccharomyces cerevisiae, Homo sapiens and Caenorhabditis elegans as well as on some tissue-specific networks. Gene Ontology (GO) analyses were used to compare the results of different algorithms. Each algorithm's result was then compared with GO-term derived functional modules. We also analyzed the effect of using tissue-specific networks on the quality of the obtained clusters. The experimental results indicate that the new algorithm outperforms most of the others, and this improvement is more significant when tissue-specific networks are used. PMID:24039752
Optimization of composite structures by estimation of distribution algorithms
NASA Astrophysics Data System (ADS)
Grosset, Laurent
The design of high performance composite laminates, such as those used in aerospace structures, leads to complex combinatorial optimization problems that cannot be addressed by conventional methods. These problems are typically solved by stochastic algorithms, such as evolutionary algorithms. This dissertation proposes a new evolutionary algorithm for composite laminate optimization, named Double-Distribution Optimization Algorithm (DDOA). DDOA belongs to the family of estimation of distributions algorithms (EDA) that build a statistical model of promising regions of the design space based on sets of good points, and use it to guide the search. A generic framework for introducing statistical variable dependencies by making use of the physics of the problem is proposed. The algorithm uses two distributions simultaneously: the marginal distributions of the design variables, complemented by the distribution of auxiliary variables. The combination of the two generates complex distributions at a low computational cost. The dissertation demonstrates the efficiency of DDOA for several laminate optimization problems where the design variables are the fiber angles and the auxiliary variables are the lamination parameters. The results show that its reliability in finding the optima is greater than that of a simple EDA and of a standard genetic algorithm, and that its advantage increases with the problem dimension. A continuous version of the algorithm is presented and applied to a constrained quadratic problem. Finally, a modification of the algorithm incorporating probabilistic and directional search mechanisms is proposed. The algorithm exhibits a faster convergence to the optimum and opens the way for a unified framework for stochastic and directional optimization.
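The general EDA loop described above (sample from a statistical model, select good points, re-estimate the model) can be sketched with the simplest member of the family, a univariate marginal distribution algorithm on bit strings. This is explicitly not DDOA: there are no auxiliary variables or lamination parameters, and the fitness function, names, and parameter values are hypothetical.

```python
import random


def umda(fitness, n_bits, pop_size=60, keep=20, generations=40, seed=0):
    """Univariate EDA: maximize fitness over bit strings of length n_bits."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent marginal probability of a 1 per position
    best = None
    for _ in range(generations):
        # Sample a population from the current marginal model.
        pop = [[1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)]
        pop.sort(key=fitness, reverse=True)
        if best is None or fitness(pop[0]) > fitness(best):
            best = pop[0]
        elite = pop[:keep]
        # Re-estimate the marginals from the selected points, clamped so that
        # no position loses all diversity.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in elite) / keep))
                 for i in range(n_bits)]
    return best


# Toy objective (OneMax): the number of ones in the string.
best_string = umda(sum, 12)
```

A model of independent marginals cannot represent dependencies between variables, which is the gap the dissertation's second distribution over auxiliary variables is meant to address.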
On the distribution of galaxy ellipticity in clusters
NASA Astrophysics Data System (ADS)
D'Eugenio, F.; Houghton, R. C. W.; Davies, R. L.; Dalla Bontà, E.
2015-07-01
We study the distribution of projected ellipticity n(ɛ) for galaxies in a sample of 20 rich (Richness ≥ 2) nearby (z < 0.1) clusters of galaxies. We find no evidence of differences in n(ɛ), although the nearest cluster in the sample (the Coma Cluster) is the largest outlier (P(same) < 0.05). We then study n(ɛ) within the clusters, and find that ɛ increases with projected cluster-centric radius R (hereafter the ɛ-R relation). This trend is preserved at fixed magnitude, showing that this relation exists over and above the trend of more luminous galaxies to be both rounder and more common in the centres of clusters. The ɛ-R relation is particularly strong in the subsample of intrinsically flattened galaxies (ɛ > 0.4), therefore it is not a consequence of the increasing fraction of round slow rotator galaxies near cluster centers. Furthermore, the ɛ-R relation persists for just smooth flattened galaxies and for galaxies with de Vaucouleurs-like light profiles, suggesting that the variation of the spiral fraction with radius is not the underlying cause of the trend. We interpret our findings in light of the classification of early type galaxies (ETGs) as fast and slow rotators. We conclude that the observed trend of decreasing ɛ towards the centres of clusters is evidence for physical effects in clusters causing fast rotator ETGs to have a lower average intrinsic ellipticity near the centres of rich clusters.
A Special Local Clustering Algorithm for Identifying the Genes Associated With Alzheimer’s Disease
Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R.; Rogers, Jack T.
2010-01-01
Clustering is the grouping of similar objects into a class. Local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly within the cluster. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed as the special local clustering (SLC) algorithm that was used to process gene microarray data related to Alzheimer’s disease (AD). SLC algorithm was able to group together genes with similar expression patterns and identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm in disease-associated gene identification such as in AD is rarely reported. PMID:20089478
Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan
2014-01-01
Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730
Block clustering based on difference of convex functions (DC) programming and DC algorithms.
Le, Hoai Minh; Le Thi, Hoai An; Dinh, Tao Pham; Huynh, Van Ngai
2013-10-01
We investigate difference of convex functions (DC) programming and the DC algorithm (DCA) to solve the block clustering problem in the continuous framework, which traditionally requires solving a hard combinatorial optimization problem. DC reformulation techniques and exact penalty in DC programming are developed to build an appropriate equivalent DC program of the block clustering problem. They lead to an elegant and explicit DCA scheme for the resulting DC program. Computational experiments show the robustness and efficiency of the proposed algorithm and its superiority over standard algorithms such as two-mode K-means, two-mode fuzzy clustering, and block classification EM. PMID:23777526
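The DCA scheme referred to above iterates by linearizing the concave part of a DC decomposition f = g - h and solving the resulting convex subproblem. The sketch below shows that iteration on a one-dimensional toy function chosen so the subproblem has a closed form; it is not the block clustering formulation, and the function name and starting points are hypothetical.

```python
import math


def dca(x0, iters=60):
    """DCA on f(x) = x**4 - x**2 with g(x) = x**4 and h(x) = x**2, both convex."""
    x = x0
    for _ in range(iters):
        slope = 2.0 * x  # gradient of h at the current iterate
        # Convex subproblem: minimize x**4 - slope * x, i.e. solve 4 * x**3 = slope.
        target = slope / 4.0
        x = math.copysign(abs(target) ** (1.0 / 3.0), target)
    return x


right_minimum = dca(1.0)
left_minimum = dca(-0.2)
```

The iterates converge to the local minimizers at ±1/√2, depending on the sign of the starting point, which illustrates that DCA is a local method whose result depends on initialization.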
Ab initio study on (CO2)n clusters via electrostatics- and molecular tailoring-based algorithm
NASA Astrophysics Data System (ADS)
Jovan Jose, K. V.; Gadre, Shridhar R.
An algorithm based on molecular electrostatic potential (MESP) and molecular tailoring approach (MTA) for building energetically favorable molecular clusters is presented. This algorithm is tested on prototype (CO2)n clusters with n = 13, 20, and 25 to explore their structure, energetics, and properties. The most stable clusters in this series are seen to show a larger number of triangular motifs. Many-body energy decomposition analysis performed on the most stable clusters reveals that the 2-body term is the major contributor (>96%) to the total interaction energy. Vibrational frequencies and molecular electrostatic potentials are also evaluated for these large clusters through MTA. The MTA-based MESPs of these clusters show remarkably good agreement with the corresponding actual ones. The most intense MTA-based normal mode frequencies are in fair agreement with the actual ones for smaller clusters. These calculated asymmetric stretching frequencies are blue-shifted with reference to the CO2 monomer.
NASA Astrophysics Data System (ADS)
Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.
2014-04-01
A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.
A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm.
de Brito, Daniel M; Maracaja-Coutinho, Vinicius; de Farias, Savio T; Batista, Leonardo V; do Rêgo, Thaís G
2016-01-01
Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic islands prediction have been proposed. However, none of them are capable of predicting precisely the complete repertory of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distribution in different species. In this paper, we present a novel method to predict GIs that is built upon mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP--Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me. PMID:26731657
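Mean shift, the clustering step named above, moves each point uphill on a kernel density estimate until it reaches a mode, so the number of clusters does not have to be given in advance. The sketch below is a one-dimensional illustration with a Gaussian kernel and a fixed, hand-picked bandwidth; the automatic bandwidth heuristic of the tool is not reproduced, and the function name and feature values are synthetic.

```python
import math


def mean_shift_1d(points, bandwidth, iters=100, tol=1e-6):
    """Gaussian-kernel mean shift on 1-D data; returns mode centres and labels."""
    modes = []
    for x in points:
        for _ in range(iters):
            weights = [math.exp(-((x - p) / bandwidth) ** 2 / 2.0) for p in points]
            shifted = sum(w * p for w, p in zip(weights, points)) / sum(weights)
            if abs(shifted - x) < tol:
                break
            x = shifted
        modes.append(x)
    # Points whose modes land within half a bandwidth share a cluster.
    centres, labels = [], []
    for m in modes:
        for k, c in enumerate(centres):
            if abs(m - c) < bandwidth / 2.0:
                labels.append(k)
                break
        else:
            centres.append(m)
            labels.append(len(centres) - 1)
    return centres, labels


# Synthetic per-window feature values forming two groups.
values = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
centres, labels = mean_shift_1d(values, bandwidth=0.5)
```

The bandwidth controls how many modes survive, which is why choosing it automatically matters for a tool that should run without user tuning.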
Chinese Text Clustering Algorithm Based k-means
NASA Astrophysics Data System (ADS)
Yao, Mingyu; Pi, Dechang; Cong, Xiangxiang
Text clustering is an important method in text mining. This work focuses on the process of Chinese text clustering based on k-means. Experiments showed that the new center of a cluster is easily affected by isolated texts. To counter this, the average similarity of a cluster is multiplied by a coefficient between 0.75 and 1.25 to obtain a similarity threshold. The texts whose similarity to the original cluster center is greater than or equal to the threshold are collected as a candidate set, and the cluster center is then updated to the center of the candidate set. The experiments show that the improved method increases purity and F value by about 10 percent on average over the original method.
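The thresholded center update described in this abstract might look roughly like the sketch below. Cosine similarity over nonzero term vectors is an assumption (the abstract does not name the similarity measure), and `robust_center` and `coeff` are hypothetical names.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity of two nonzero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def robust_center(vectors, center, coeff=1.0):
    """Update a cluster center using only texts similar enough to the old center.

    coeff is the multiplier (0.75 to 1.25 in the abstract) applied to the
    cluster's average similarity to obtain the threshold.
    """
    vectors = np.asarray(vectors, dtype=float)
    center = np.asarray(center, dtype=float)
    sims = np.array([cosine(v, center) for v in vectors])
    threshold = coeff * sims.mean()
    candidates = vectors[sims >= threshold]
    if len(candidates) == 0:  # possible when coeff > 1; keep the old center
        return center
    return candidates.mean(axis=0)

# Three similar texts and one isolated text (hypothetical term vectors).
docs = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [0.0, 1.0]]
new_center = robust_center(docs, center=[1.0, 0.0])
```

With the isolated text excluded, the updated center stays near the three similar texts instead of being dragged toward the outlier, as a plain mean would be.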
'Inter-Arrival Time' Inspired Algorithm and its Application in Clustering and Molecular Phylogeny
NASA Astrophysics Data System (ADS)
Kolekar, Pandurang S.; Kale, Mohan M.; Kulkarni-Kale, Urmila
2010-10-01
Bioinformatics, being a multidisciplinary field, applies methods from allied areas of science to data mining using computational approaches. Clustering and molecular phylogeny are key areas of bioinformatics that help in the study of the classification and evolution of organisms. Molecular phylogeny algorithms can be divided into distance-based and character-based methods. However, most of these methods depend on pre-alignment of sequences and become computationally intensive as the size of the data increases, and hence demand alternative efficient approaches. The 'inter-arrival time distribution' (IATD) is a popular concept in the theory of stochastic system modeling, but its potential in molecular data analysis has not been fully explored. The present study reports the application of IATD in bioinformatics for clustering and molecular phylogeny. The proposed method provides IATDs of nucleotides in genomic sequences. A distance function based on statistical parameters of the IATDs is proposed, and the distance matrix thus obtained is used for clustering and molecular phylogeny. The method is applied to a dataset of 3' non-coding region (NCR) sequences of Dengue virus type 3 (DENV-3), subtype III, reported in 2008. The phylogram thus obtained revealed the geographical distribution of DENV-3 isolates. Sri Lankan DENV-3 isolates were further observed to be clustered in two sub-clades corresponding to pre- and post-Dengue hemorrhagic fever emergence groups. These results are consistent with those reported earlier, which were obtained using pre-aligned sequence data as input. These findings encourage applications of the IATD-based method in molecular phylogenetic analysis in particular and data mining in general.
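A rough sketch of the inter-arrival idea is given below. The abstract says only that the distance is based on statistical parameters of the IATDs; the choice of mean and variance per nucleotide and of a Euclidean distance between the resulting feature vectors is an assumption made for illustration, and all function names are hypothetical.

```python
from statistics import mean, pvariance

def inter_arrival_times(seq, symbol):
    """Gaps between successive occurrences of one nucleotide."""
    positions = [i for i, ch in enumerate(seq) if ch == symbol]
    return [b - a for a, b in zip(positions, positions[1:])]

def iatd_features(seq, alphabet="ACGT"):
    """Mean and variance of the inter-arrival times for each nucleotide."""
    feats = []
    for s in alphabet:
        gaps = inter_arrival_times(seq, s)
        feats += [mean(gaps), pvariance(gaps)] if gaps else [0.0, 0.0]
    return feats

def iatd_distance(seq_a, seq_b):
    """Assumed distance: Euclidean distance between the feature vectors."""
    fa, fb = iatd_features(seq_a), iatd_features(seq_b)
    return sum((x - y) ** 2 for x, y in zip(fa, fb)) ** 0.5

gaps = inter_arrival_times("ACAAGA", "A")
d_same = iatd_distance("ACGTACGT", "ACGTACGT")
d_diff = iatd_distance("ACGTACGT", "AACCGGTT")
```

Because the features are computed per sequence, no alignment step is needed before building the distance matrix, which is the advantage the abstract points to.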
A scalable and practical one-pass clustering algorithm for recommender system
NASA Astrophysics Data System (ADS)
Khalid, Asra; Ghazanfar, Mustansar Ali; Azam, Awais; Alahmari, Saad Ali
2015-12-01
K-Means clustering-based recommendation algorithms have been proposed with the claim of increasing the scalability of recommender systems. One potential drawback of these algorithms is that they perform training offline and hence cannot accommodate incremental updates as new data arrive, making them unsuitable for dynamic environments. Following this line of research, a new clustering algorithm called One-Pass is proposed, which is simple, fast, and accurate. We show empirically that the proposed algorithm outperforms K-Means in terms of recommendation and training time while maintaining a good level of accuracy.
Distributed concurrency control performance: A study of algorithms, distribution, and replication
Carey, M.J.; Livny, M.
1988-01-01
Many concurrency control algorithms have been proposed for use in distributed database systems. Despite the large number of available algorithms, and the fact that distributed database systems are becoming a commercial reality, distributed concurrency control performance tradeoffs are still not well understood. In this paper the authors attempt to shed light on some of the important issues by studying the performance of four representative algorithms - distributed 2PL, wound-wait, basic timestamp ordering, and a distributed optimistic algorithm - using a detailed simulation model of a distributed DBMS. The authors examine the performance of these algorithms for various levels of contention, "distributedness" of the workload, and data replication. The results should prove useful to designers of future distributed database systems.
NASA Technical Reports Server (NTRS)
Mach, Douglas M.; Christian, Hugh J.; Blakeslee, Richard; Boccippio, Dennis J.; Goodman, Steve J.; Boeck, William
2006-01-01
We describe the clustering algorithm used by the Lightning Imaging Sensor (LIS) and the Optical Transient Detector (OTD) for combining the lightning pulse data into events, groups, flashes, and areas. Events are single pixels that exceed the LIS/OTD background level during a single frame (2 ms). Groups are clusters of events that occur within the same frame and in adjacent pixels. Flashes are clusters of groups that occur within 330 ms and either 5.5 km (for LIS) or 16.5 km (for OTD) of each other. Areas are clusters of flashes that occur within 16.5 km of each other. Many investigators are utilizing the LIS/OTD flash data; therefore, we test how variations in the algorithms for the event-group and group-flash clustering affect the flash count for a subset of the LIS data. We divided the subset into areas with low (1-3), medium (4-15), high (16-63), and very high (64+) flashes to see how changes in the clustering parameters affect the flash rates in these different sizes of areas. We found that as long as the cluster parameters are within about a factor of two of the current values, the flash counts do not change by more than about 20%. Therefore, the flash clustering algorithm used by the LIS and OTD sensors creates flash rates that are relatively insensitive to reasonable variations in the clustering algorithms.
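The group-to-flash rule quoted above (330 ms and 5.5 km for LIS) can be illustrated with a simplified single-linkage sketch. This is not the operational LIS/OTD code: groups are reduced to hypothetical (time, x, y) centroids on a flat plane, and any two groups within both limits are linked transitively.

```python
from math import hypot

def cluster_flashes(groups, max_dt_ms=330.0, max_dist_km=5.5):
    """groups: list of (time_ms, x_km, y_km). Returns one flash label per group."""
    parent = list(range(len(groups)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (ti, xi, yi) in enumerate(groups):
        for j in range(i):
            tj, xj, yj = groups[j]
            close_in_time = abs(ti - tj) <= max_dt_ms
            close_in_space = hypot(xi - xj, yi - yj) <= max_dist_km
            if close_in_time and close_in_space:
                parent[find(i)] = find(j)

    # Relabel the union-find roots as 0, 1, 2, ... in order of appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(groups))]

# Hypothetical group centroids: three chained groups, one late, one far away.
labels = cluster_flashes([(0, 0, 0), (100, 1, 1), (400, 2, 2), (1000, 2, 2), (50, 50, 50)])
```

Doubling or halving `max_dt_ms` and `max_dist_km` in such a sketch is one way to reproduce, in miniature, the sensitivity test the abstract describes.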
A novel artificial bee colony based clustering algorithm for categorical data.
Ji, Jinchao; Pang, Wei; Zheng, Yanlin; Wang, Zhe; Ma, Zhiqiang
2015-01-01
Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data. PMID:25993469
Gardiner, Eleanor J; Gillet, Valerie J; Willett, Peter; Cosgrove, David A
2007-01-01
Chemical databases are routinely clustered, with the aim of grouping molecules which share similar structural features. Ideally, medicinal chemists are then able to browse a few representatives of the cluster in order to interpret the shared activity of the cluster members. However, when molecules are clustered using fingerprints, it may be difficult to decipher the structural commonalities which are present. Here, we seek to represent a cluster by means of a maximum common substructure based on the shared functionality of the cluster members. Previously, we have used reduced graphs, where each node corresponds to a generalized functional group, as topological molecular descriptors for virtual screening. In this work, we precluster a database using any clustering method. We then represent the molecules in a cluster as reduced graphs. By repeated application of a maximum common edge substructure (MCES) algorithm, we obtain one or more reduced graph cluster representatives. The sparsity of the reduced graphs means that the MCES calculations can be performed in real time. The reduced graph cluster representatives are readily interpretable in terms of functional activity and can be mapped directly back to the molecules to which they correspond, giving the chemist a rapid means of assessing potential activities contained within the cluster. Clusters of interest are then subject to a detailed R-group analysis using the same iterated MCES algorithm applied to the molecular graphs. PMID:17309248
NASA Astrophysics Data System (ADS)
Sun, Xu; Yang, Lina; Gao, Lianru; Zhang, Bing; Li, Shanshan; Li, Jun
2015-01-01
Center-oriented hyperspectral image clustering methods have been widely applied to hyperspectral remote sensing image processing; however, the drawbacks are obvious, including the over-simplicity of computing models and underutilized spatial information. In recent years, some studies have been conducted trying to improve this situation. We introduce the artificial bee colony (ABC) and Markov random field (MRF) algorithms to propose an ABC-MRF-cluster model to solve the problems mentioned above. In this model, a typical ABC algorithm framework is adopted in which cluster centers and iteration conditional model algorithm's results are considered as feasible solutions and objective functions separately, and MRF is modified to be capable of dealing with the clustering problem. Finally, four datasets and two indices are used to show that the application of ABC-cluster and ABC-MRF-cluster methods could help to obtain better image accuracy than conventional methods. Specifically, the ABC-cluster method is superior when used for a higher power of spectral discrimination, whereas the ABC-MRF-cluster method can provide better results when used for an adjusted random index. In experiments on simulated images with different signal-to-noise ratios, ABC-cluster and ABC-MRF-cluster showed good stability.
Improved fuzzy clustering algorithms in segmentation of DC-enhanced breast MRI.
Kannan, S R; Ramathilagam, S; Devi, Pandiyarajan; Sathya, A
2012-02-01
Segmentation of medical images is a difficult and challenging problem due to poor image contrast and artifacts that result in missing or diffuse organ/tissue boundaries. Many researchers have applied various techniques; however, fuzzy c-means (FCM) based algorithms are more effective than other methods. The objective of this work is to develop robust fuzzy clustering segmentation systems for effective segmentation of DCE breast MRI. This paper obtains robust fuzzy clustering algorithms by incorporating kernel methods, penalty terms, tolerance of the neighborhood attraction, an additional entropy term, and fuzzy parameters. The initial centers are obtained using an initialization algorithm to reduce the computational complexity and running time of the proposed algorithms. Experiments on breast images show that the proposed algorithms improve the similarity measurement and give better results on data corrupted by large amounts of noise and other artifacts. The clustering results of the proposed methods are validated using the Silhouette Method. PMID:20703716
Parallel matrix transpose algorithms on distributed memory concurrent computers
Choi, J.; Walker, D.W.; Dongarra, J.J.
1993-10-01
This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. It is assumed that the matrix is distributed over a P x Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The communication schemes of the algorithms are determined by the greatest common divisor (GCD) of P and Q. If P and Q are relatively prime, the matrix transpose algorithm involves complete exchange communication. If P and Q are not relatively prime, processors are divided into GCD groups and the communication operations are overlapped for different groups of processors. Processors transpose GCD wrapped diagonal blocks simultaneously, and the matrix can be transposed with LCM/GCD steps, where LCM is the least common multiple of P and Q. The algorithms make use of non-blocking, point-to-point communication between processors. The use of non-blocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
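The arithmetic behind the step count stated above can be checked with a few lines. This sketch only evaluates the GCD, the LCM, and the LCM/GCD step count for a given P x Q template; it does not model the communication itself, and `transpose_schedule` is a hypothetical helper name.

```python
from math import gcd

def transpose_schedule(p, q):
    """Summarise the quantities the abstract uses for a P x Q processor template."""
    g = gcd(p, q)
    lcm = p * q // g
    return {
        "gcd": g,                      # number of processor groups
        "lcm": lcm,
        "steps": lcm // g,             # LCM/GCD communication steps
        "complete_exchange": g == 1,   # P and Q relatively prime
    }

grid_4x6 = transpose_schedule(4, 6)
grid_2x3 = transpose_schedule(2, 3)
```

For a 4 x 6 template the GCD is 2 and the LCM is 12, giving 6 steps; a 2 x 3 template is relatively prime and falls into the complete exchange case.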
A Local Scalable Distributed Expectation Maximization Algorithm for Large Peer-to-Peer Networks
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Srivastava, Ashok N.
2009-01-01
This paper offers a local distributed algorithm for expectation maximization in large peer-to-peer environments. The algorithm can be used for a variety of well-known data mining tasks in a distributed environment, such as clustering, anomaly detection, and target tracking, to name a few. This technology is crucial for many emerging peer-to-peer applications for bioinformatics, astronomy, social networking, sensor networks and web mining. Centralizing all or some of the data for building global models is impractical in such peer-to-peer environments because of the large number of data sources, the asynchronous nature of the peer-to-peer networks, and the dynamic nature of the data and network. The distributed algorithm we have developed in this paper is provably correct, i.e., it converges to the same result as a similar centralized algorithm, and it can automatically adapt to changes in the data and the network. We show that the communication overhead of the algorithm is very low due to its local nature. This monitoring algorithm is then used as a feedback loop to sample data from the network and rebuild the model when it is outdated. We present thorough experimental results to verify our theoretical claims.
An Improved Fuzzy c-Means Clustering Algorithm Based on Shadowed Sets and PSO
Zhang, Jian; Shen, Ling
2014-01-01
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. This new method uses Xie-Beni index as cluster validity and automatically finds the optimal cluster number within a specific range with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect. PMID:25477953
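For reference, the Xie-Beni index named in this abstract can be computed as in the sketch below: the membership-weighted within-cluster scatter divided by the number of points times the smallest squared distance between two centers, with lower values indicating more compact and better-separated partitions. This shows only the validity index, not SP-FCM itself, and the data are hypothetical.

```python
import numpy as np

def xie_beni(data, centers, memberships, m=2.0):
    """Xie-Beni validity index; memberships has shape (n_clusters, n_points)."""
    data = np.asarray(data, dtype=float)
    centers = np.asarray(centers, dtype=float)
    u = np.asarray(memberships, dtype=float)
    # Squared distance from every centre to every point, shape (c, n).
    d2 = ((data[None, :, :] - centers[:, None, :]) ** 2).sum(axis=2)
    compactness = ((u ** m) * d2).sum()
    # Smallest squared distance between two different centres.
    sep = ((centers[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sep, np.inf)
    return compactness / (len(data) * sep.min())

# Two tight, well-separated groups with crisp memberships.
xb = xie_beni(
    data=[[0.0], [1.0], [10.0], [11.0]],
    centers=[[0.5], [10.5]],
    memberships=[[1, 1, 0, 0], [0, 0, 1, 1]],
)
```

To choose a cluster number in the spirit of the abstract, one would evaluate this index for each candidate number in a range and keep the partition with the smallest value.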
A new clustering algorithm applicable to multispectral and polarimetric SAR images
NASA Technical Reports Server (NTRS)
Wong, Yiu-Fai; Posner, Edward C.
1993-01-01
We describe an application of a scale-space clustering algorithm to the classification of a multispectral and polarimetric SAR image of an agricultural site. After the initial polarimetric and radiometric calibration and noise cancellation, we extracted a 12-dimensional feature vector for each pixel from the scattering matrix. The clustering algorithm was able to partition a set of unlabeled feature vectors from 13 selected sites, each site corresponding to a distinct crop, into 13 clusters without any supervision. The cluster parameters were then used to classify the whole image. The classification map is much less noisy and more accurate than those obtained by hierarchical rules. Starting with every point as a cluster, the algorithm works by melting the system to produce a tree of clusters in the scale space. It can cluster data in any multidimensional space and is insensitive to variability in cluster densities, sizes and ellipsoidal shapes. This algorithm, more powerful than existing ones, may be useful for remote sensing for land use.
The distribution of ejected brown dwarfs in clusters
NASA Astrophysics Data System (ADS)
Goodwin, S. P.; Hubber, D. A.; Moraux, E.; Whitworth, A. P.
2005-12-01
We examine the spatial distribution of brown dwarfs produced by the decay of small-N stellar systems as expected from the embryo ejection scenario. We model a cluster of several hundred stars grouped into 'cores' of a few stars/brown dwarfs. These cores decay, preferentially ejecting their lowest-mass members. Brown dwarfs are found to have a wider spatial distribution than stars, however once the effects of limited survey areas and unresolved binaries are taken into account it can be difficult to distinguish between clusters with many or no ejections. A large difference between the distributions probably indicates that ejections have occurred, however similar distributions sometimes arise even with ejections. Thus the spatial distribution of brown dwarfs is not necessarily a good discriminator between ejection and non-ejection scenarios.
A review of estimation of distribution algorithms in bioinformatics
Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro
2008-01-01
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
NASA Astrophysics Data System (ADS)
Bo, Yizhou; Shifa, Naima
2013-09-01
An estimator for finding the abundance of a rare, clustered and mobile population has been introduced. This model is based on adaptive cluster sampling (ACS) to identify the location of the population and negative binomial distribution to estimate the total in each site. To identify the location of the population we consider both sampling with replacement (WR) and sampling without replacement (WOR). Some mathematical properties of the model are also developed.
Feature Subset Selection by Estimation of Distribution Algorithms
Cantu-Paz, E
2002-01-17
This paper describes the application of four evolutionary algorithms to the identification of feature subsets for classification problems. Besides a simple GA, the paper considers three estimation of distribution algorithms (EDAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine if the EDAs present advantages over the simple GA in terms of accuracy or speed in this problem. The experiments used a Naive Bayes classifier and public-domain and artificial data sets. In contrast with previous studies, we did not find evidence to support or reject the use of EDAs for this problem.
Impacts of Time Delays on Distributed Algorithms for Economic Dispatch
Yang, Tao; Wu, Di; Sun, Yannan; Lian, Jianming
2015-07-26
The economic dispatch problem (EDP) is an important problem in power systems. It can be formulated as an optimization problem whose objective is to minimize the total generation cost subject to the power balance constraint and generator capacity limits. Recently, several consensus-based algorithms have been proposed to solve the EDP in a distributed manner. However, the impacts of communication time delays on these distributed algorithms are not fully understood, especially for the case where the communication network is directed, i.e., the information exchange is unidirectional. This paper investigates communication time delay effects on a distributed algorithm for directed communication networks. The algorithm has been tested by applying time delays to different types of information exchange. Several case studies are carried out to evaluate the effectiveness and performance of the algorithm in the presence of time delays in communication networks. It is found that time delays slow the convergence rate and can even cause the algorithm to converge to an incorrect value or fail to converge.
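The optimization problem described in words above can be written compactly as follows. The symbols are assumptions introduced here for illustration: P_i is the output of generator i, C_i its cost function, D the total demand, and P_i^min and P_i^max the capacity limits. The abstract does not specify the form of C_i.

```latex
\min_{P_1,\dots,P_n} \; \sum_{i=1}^{n} C_i(P_i)
\quad \text{subject to} \quad
\sum_{i=1}^{n} P_i = D,
\qquad
P_i^{\min} \le P_i \le P_i^{\max}, \quad i = 1,\dots,n
```

The equality is the power balance constraint and the box constraints are the generator capacity limits; a distributed algorithm has each generator update its own P_i using only information received from its neighbors in the communication network.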
Distributed Query Plan Generation Using Multiobjective Genetic Algorithm
Panicker, Shina; Vijay Kumar, T. V.
2014-01-01
A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability. PMID:24963513
An Improved Clustering Algorithm of Tunnel Monitoring Data for Cloud Computing
Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing
2014-01-01
With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the mass data of tunnels. To solve this problem, an improved parallel clustering algorithm based on k-means is proposed. It uses MapReduce within a cloud computing environment to process the data. It not only handles mass data but is also more efficient. Moreover, it computes the average dissimilarity degree of each cluster in order to clean abnormal data. PMID:24982971
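One k-means iteration expressed in map and reduce phases might be sketched as below. This is an in-memory, single-process illustration of the general pattern, not the authors' implementation or any particular MapReduce framework's API; the abnormal-data cleaning step is omitted and all names are hypothetical.

```python
from collections import defaultdict

def map_phase(points, centers):
    """Map: emit (index of nearest center, point) for every point."""
    for p in points:
        idx = min(
            range(len(centers)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])),
        )
        yield idx, p

def reduce_phase(pairs, centers):
    """Reduce: average the points assigned to each center."""
    buckets = defaultdict(list)
    for idx, p in pairs:
        buckets[idx].append(p)
    new_centers = list(centers)  # centers with no points keep their position
    for idx, members in buckets.items():
        new_centers[idx] = tuple(sum(c) / len(members) for c in zip(*members))
    return new_centers

def kmeans_iteration(points, centers):
    return reduce_phase(map_phase(points, centers), centers)

updated = kmeans_iteration([(0, 0), (0, 2), (10, 10), (10, 12)], [(1, 1), (9, 9)])
```

Because the map phase treats every point independently, it can be split across workers, which is what makes the pattern attractive for large monitoring data sets.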
An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network.
Vimalarani, C; Subramanian, R; Sivanandam, S N
2016-01-01
A Wireless Sensor Network (WSN) is a network formed with a large number of sensor nodes positioned in an application environment to monitor physical entities in a target area, for example, temperature, water level, and pressure, as well as in health care and various military applications. Most sensor nodes are equipped with self-supported battery power, through which they perform adequate operations and communicate with neighboring nodes. To maximize the lifetime of Wireless Sensor Networks, energy conservation measures are essential for improving the performance of WSNs. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm so as to minimize the power consumption in the WSN. The performance metrics are evaluated and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption. PMID:26881273
A Hybrid Algorithm for Clustering of Time Series Data Based on Affinity Search Technique
Aghabozorgi, Saeed; Ying Wah, Teh; Herawan, Tutut; Jalab, Hamid A.; Shaygan, Mohammad Amin; Jalali, Alireza
2014-01-01
Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped as subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model has two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with a low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using syntactic and real-world time series datasets. PMID:24982966
Fuzzy-Kohonen-clustering neural network trained by genetic algorithm and fuzzy competition learning
NASA Astrophysics Data System (ADS)
Xie, Weixing; Li, Wenhua; Gao, Xinbo
1995-08-01
Kohonen networks are well known for clustering analysis. Classical Kohonen networks for hard c-means clustering (trained by winner-take-all learning) have some severe drawbacks. Fuzzy Kohonen networks (FKCNN) for fuzzy c-means clustering are trained by fuzzy competition learning, and can get better clustering results than the classical Kohonen networks. However, both winner-take-all and fuzzy competition learning algorithms are in essence local search techniques that search for the optimum by using a hill-climbing technique. Thus, they often fail in the search for the global optimum. In this paper we combine genetic algorithms (GAs) with fuzzy competition learning to train the FKCNN. Our experimental results show that the proposed GA/FC learning algorithm has much higher probabilities of finding the global optimal solutions than either the winner-take-all or the fuzzy competition learning.
NASA Astrophysics Data System (ADS)
Brenden, T. O.; Clark, R. D.; Wiley, M. J.; Seelbach, P. W.; Wang, L.
2005-05-01
Remote sensing and geographic information systems have made it possible to attribute variables for streams at increasingly detailed resolutions (e.g., individual river reaches). Nevertheless, management decisions still must be made at large scales because land and stream managers typically lack sufficient resources to manage on an individual reach basis. Managers thus require a method for identifying stream management units that are ecologically similar and that can be expected to respond similarly to management decisions. We have developed a spatially-constrained clustering algorithm that can merge neighboring river reaches with similar ecological characteristics into larger management units. The clustering algorithm is based on the Cluster Affinity Search Technique (CAST), which was developed for clustering gene expression data. Inputs to the clustering algorithm are the neighbor relationships of the reaches that comprise the digital river network, the ecological attributes of the reaches, and an affinity value, which identifies the minimum similarity for merging river reaches. In this presentation, we describe the clustering algorithm in greater detail and contrast its use with other methods (expert opinion, classification approach, regular clustering) for identifying management units using several Michigan watersheds as a backdrop.
Mesh Algorithms for PDE with Sieve I: Mesh Distribution
Knepley, Matthew G.; Karpeev, Dmitry A.
2009-01-01
We have developed a new programming framework, called Sieve, to support parallel numerical partial differential equation(s) (PDE) algorithms operating over distributed meshes. We have also developed a reference implementation of Sieve in C++ as a library of generic algorithms operating on distributed containers conforming to the Sieve interface. Sieve makes instances of the incidence relation, or arrows, the conceptual first-class objects represented in the containers. Further, generic algorithms acting on this arrow container are systematically used to provide natural geometric operations on the topology and also, through duality, on the data. Finally, coverings and duality are used to encode not only individual meshes, but all types of hierarchies underlying PDE data structures, including multigrid and mesh partitions. In order to demonstrate the usefulness of the framework, we show how the mesh partition data can be represented and manipulated using the same fundamental mechanisms used to represent meshes. We present the complete description of an algorithm to encode a mesh partition and then distribute a mesh, which is independent of the mesh dimension, element shape, or embedding. Moreover, data associated with the mesh can be similarly distributed with exactly the same algorithm. The use of a high level of abstraction within the Sieve leads to several benefits in terms of code reuse, simplicity, and extensibility. We discuss these benefits and compare our approach to other existing mesh libraries.
An approximation polynomial-time algorithm for a sequence bi-clustering problem
NASA Astrophysics Data System (ADS)
Kel'manov, A. V.; Khamidullin, S. A.
2015-06-01
We consider a strongly NP-hard problem of partitioning a finite sequence of vectors in Euclidean space into two clusters using the criterion of the minimal sum of the squared distances from the elements of the clusters to the centers of the clusters. The center of one of the clusters is to be optimized and is determined as the mean value over all vectors in this cluster. The center of the other cluster is fixed at the origin. Moreover, the partition is such that the difference between the indices of two successive vectors in the first cluster is bounded above and below by prescribed constants. A 2-approximation polynomial-time algorithm is proposed for this problem.
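The optimization problem described in this abstract can be written out as below. The notation is an assumption introduced here for illustration (it is not taken from the paper): y_1, ..., y_N is the input sequence, M is the index set of the first cluster with elements n_1 < ... < n_{|M|}, and T_min, T_max are the prescribed bounds on index differences.

```latex
\min_{\mathcal{M} \subseteq \{1,\dots,N\}} \;
  \sum_{j \in \mathcal{M}} \bigl\| y_j - \bar{y}(\mathcal{M}) \bigr\|^2
  \;+\; \sum_{j \notin \mathcal{M}} \| y_j \|^2,
\qquad
\bar{y}(\mathcal{M}) = \frac{1}{|\mathcal{M}|} \sum_{j \in \mathcal{M}} y_j,

\text{subject to}\quad
T_{\min} \le n_m - n_{m-1} \le T_{\max},
\qquad m = 2, \dots, |\mathcal{M}| .
```

The second sum is the cost of the cluster whose center is fixed at the origin, so each vector left out of M contributes its squared norm.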
Uy, D.L.
1996-02-01
An algorithm for detection and identification of image clusters or "blobs" based on color information for an autonomous mobile robot is developed. The input image data are first processed using a crisp color fuzzifier, a binary smoothing filter, and a median filter. The processed image data are then input to the image cluster detection and identification program. The program employs the concept of an "elastic rectangle" that stretches in such a way that the whole blob is finally enclosed in a rectangle. A C program was developed to test the algorithm. The algorithm was tested only on image data of size 8x8 with different numbers of blobs in them. The algorithm works very well in detecting and identifying image clusters.
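The end result described here, each blob enclosed in its own rectangle, can be illustrated with a short sketch. It uses a flood fill with 4-connectivity to grow the rectangle, which is an assumed stand-in and not necessarily how the report's "elastic rectangle" stretches; the function name and the binary-image input format are hypothetical.

```python
def blob_rectangles(image):
    """Return (top, left, bottom, right) for each 4-connected blob in a binary image."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    rectangles = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Start a new rectangle at this pixel and stretch it as the blob grows.
            top, left, bottom, right = r, c, r, c
            stack = [(r, c)]
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            rectangles.append((top, left, bottom, right))
    return rectangles
```

On an 8x8 binary image, as in the tests mentioned in the abstract, the function returns one rectangle per blob in row-major order of first appearance.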
A network-assisted co-clustering algorithm to discover cancer subtypes based on gene expression
2014-01-01
Background Cancer subtype information is critically important for understanding tumor heterogeneity. Existing methods to identify cancer subtypes have primarily focused on utilizing generic clustering algorithms (such as hierarchical clustering) to identify subtypes based on gene expression data. The network-level interaction among genes, which is key to understanding the molecular perturbations in cancer, has been rarely considered during the clustering process. The motivation of our work is to develop a method that effectively incorporates molecular interaction networks into the clustering process to improve cancer subtype identification. Results We have developed a new clustering algorithm for cancer subtype identification, called “network-assisted co-clustering for the identification of cancer subtypes” (NCIS). NCIS combines gene network information to simultaneously group samples and genes into biologically meaningful clusters. Prior to clustering, we assign weights to genes based on their impact in the network. Then a new weighted co-clustering algorithm based on a semi-nonnegative matrix tri-factorization is applied. We evaluated the effectiveness of NCIS on simulated datasets as well as large-scale Breast Cancer and Glioblastoma Multiforme patient samples from The Cancer Genome Atlas (TCGA) project. NCIS was shown to better separate the patient samples into clinically distinct subtypes and achieve higher accuracy on the simulated datasets to tolerate noise, as compared to consensus hierarchical clustering. Conclusions The weighted co-clustering approach in NCIS provides a unique solution to incorporate gene network information into the clustering process. Our tool will be useful to comprehensively identify cancer subtypes that would otherwise be obscured by cancer heterogeneity, using high-throughput and high-dimensional gene expression data. PMID:24491042
A Fast Density-Based Clustering Algorithm for Real-Time Internet of Things Stream
Ying Wah, Teh
2014-01-01
Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of algorithms for clustering data streams. They can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough to be applicable in real-time IoT applications. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets. PMID:25110753
A fast density-based clustering algorithm for real-time Internet of Things stream.
Amini, Amineh; Saboohi, Hadi; Wah, Teh Ying; Herawan, Tutut
2014-01-01
Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of algorithms for clustering data streams. They can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough to be applicable in real-time IoT applications. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets. PMID:25110753
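The properties claimed for density-based clustering (arbitrarily shaped clusters, outlier handling, no preset number of clusters) can be seen in a minimal batch DBSCAN sketch. This is the classic offline algorithm used as an illustration, not the streaming method proposed in the paper; the parameter names eps and min_pts follow common usage and are assumptions here.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster index (0, 1, ...) or -1 for noise."""
    NOISE = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def region(i):
        # Indices of all points within distance eps of point i (including i itself).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps * eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels
```

The number of clusters falls out of the data and the two density parameters. The quadratic neighbor search used here is the kind of cost that a real-time streaming variant has to avoid.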
Efficient cluster Monte Carlo algorithm for Ising spin glasses in more than two space dimensions
NASA Astrophysics Data System (ADS)
Ochoa, Andrew J.; Zhu, Zheng; Katzgraber, Helmut G.
2015-03-01
A cluster algorithm that speeds up slow dynamics in simulations of nonplanar Ising spin glasses away from criticality is urgently needed. In theory, the cluster algorithm proposed by Houdayer poses no advantage over local moves in systems with a percolation threshold below 50%, such as cubic lattices. However, we show that the frustration present in Ising spin glasses prevents the growth of system-spanning clusters at temperatures roughly below the characteristic energy scale J of the problem. Adding Houdayer cluster moves to simulations of Ising spin glasses for T ~ J produces a speedup that grows with the system size over conventional local moves. We show results for the nonplanar quasi-two-dimensional Chimera graph of the D-Wave Two quantum annealer, as well as conventional three-dimensional Ising spin glasses, where in both cases the addition of cluster moves speeds up thermalization visibly in the physically-interesting low temperature regime.
Illinois Occupational Skill Standards: Finishing and Distribution Cluster.
ERIC Educational Resources Information Center
Illinois Occupational Skill Standards and Credentialing Council, Carbondale.
This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the finishing and distribution cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards…
A Novel Coverage-Preserving Clustering Algorithm for Wireless Sensor Networks
NASA Astrophysics Data System (ADS)
Di, Xin
Sensing coverage is one of the crucial characteristics of wireless sensor networks and has to be considered in the design of routing protocols. LEACH (Low Energy Adaptive Cluster Hierarchy) is a significant and representative routing protocol which organizes the sensing nodes by clustering. For LEACH, residual energy should be considered in order to overcome the inequality of energy dissipation rate. Considering the impact of these two factors (coverage and residual energy) on a network, we have proposed a coverage-preserving energy-based clustering algorithm (CEC), which is an improved LEACH. By improving the threshold for cluster-head selection, CEC achieved more effective results than the other baseline protocols.
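For context, the cluster-head selection threshold of the original LEACH protocol, which CEC is said to improve, can be sketched as follows. The abstract does not state how CEC modifies the threshold, so only the standard LEACH form is shown; the function and parameter names are assumptions.

```python
def leach_threshold(p, round_number, eligible=True):
    """Standard LEACH threshold T(n) = p / (1 - p * (r mod 1/p)).

    p:            desired fraction of nodes acting as cluster heads.
    round_number: current round r.
    eligible:     False if the node was already a cluster head in this epoch.
    A node becomes cluster head when a uniform random draw in [0, 1) is below T(n).
    """
    if not eligible:
        return 0.0
    period = round(1.0 / p)  # number of rounds in one epoch
    return p / (1.0 - p * (round_number % period))
```

The threshold rises from p at the start of an epoch to 1 in its last round, so every eligible node eventually serves once per epoch regardless of its remaining energy or its contribution to coverage.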
NASA Astrophysics Data System (ADS)
Mazure, A.; Katgert, P.; den Hartog, R.; Biviano, A.; Dubath, P.; Escalera, E.; Focardi, P.; Gerbal, D.; Giuricin, G.; Jones, B.; Le Fevre, O.; Moles, M.; Perea, J.; Rhee, G.
1996-06-01
The ESO Nearby Abell Cluster Survey (the ENACS) has yielded 5634 redshifts for galaxies in the directions of 107 rich, Southern clusters selected from the ACO catalogue (Abell et al. 1989). By combining these data with another 1000 redshifts from the literature, of galaxies in 37 clusters, we construct a volume-limited sample of 128 R_ACO >= 1 clusters in a solid angle of 2.55 sr centered on the South Galactic Pole, out to a redshift z=0.1. For a subset of 80 of these clusters we can calculate a reliable velocity dispersion, based on at least 10 (but very often between 30 and 150) redshifts. We deal with the main observational problem that hampers an unambiguous interpretation of the distribution of cluster velocity dispersions, namely the contamination by fore- and background galaxies. We also discuss in detail the completeness of the cluster samples for which we derive the distribution of cluster velocity dispersions. We find that a cluster sample which is complete in terms of the field-corrected richness count given in the ACO catalogue gives a result that is essentially identical to that based on a smaller and more conservative sample which is complete in terms of an intrinsic richness count that has been corrected for superposition effects. We find that the large apparent spread in the relation between velocity dispersion and richness count (based either on visual inspection or on machine counts) must be largely intrinsic; i.e. this spread is not primarily due to measurement uncertainties. One of the consequences of the (very) broad relation between cluster richness and velocity dispersion is that all samples of clusters that are defined complete with respect to richness count are unavoidably biased against low-σ_V clusters. For the richness limit of our sample this bias operates only for velocity dispersions less than about 800 km/s. We obtain a statistically reliable distribution of global velocity dispersions which, for velocity dispersions σ_V > 800 km/s, is
Intelligent decision support algorithm for distribution system restoration.
Singh, Reetu; Mehfuz, Shabana; Kumar, Parmod
2016-01-01
The distribution system is the means of revenue for an electric utility. It needs to be restored as early as possible if any feeder or the complete system is tripped out due to a fault or any other cause. Further, uncertainty in the loads results in variations in the distribution network's parameters. Thus, an intelligent algorithm incorporating a hybrid fuzzy-grey relation, which can take the uncertainties into account and compare the sequences, is discussed to analyse and restore the distribution system. Simulation studies are carried out to show the utility of the method by ranking the restoration plans for a typical distribution system. This algorithm also meets the smart grid requirements in terms of an automated restoration plan for the partial/full blackout of the network. PMID:27512634
NASA Technical Reports Server (NTRS)
Lennington, R. K.; Malek, H.
1978-01-01
A clustering method, CLASSY, was developed, which alternates maximum likelihood iteration with a procedure for splitting, combining, and eliminating the resulting statistics. The method maximizes the fit of a mixture of normal distributions to the observed first through fourth central moments of the data and produces an estimate of the proportions, means, and covariances in this mixture. The mathematical model which is the basis for CLASSY and the actual operation of the algorithm are described. Data comparing the performances of CLASSY and ISOCLS on simulated and actual LACIE data are presented.
Distributed genetic algorithms for the floorplan design problem
NASA Technical Reports Server (NTRS)
Cohoon, James P.; Hegde, Shailesh U.; Martin, Worthy N.; Richards, Dana S.
1991-01-01
Designing a VLSI floorplan calls for arranging a given set of modules in the plane to minimize the weighted sum of area and wire-length measures. A method of solving the floorplan design problem using distributed genetic algorithms is presented. Distributed genetic algorithms, based on the paleontological theory of punctuated equilibria, offer a conceptual modification to the traditional genetic algorithms. Experimental results on several problem instances demonstrate the efficacy of this method and indicate the advantages of this method over other methods, such as simulated annealing. The method has performed better than the simulated annealing approach, both in terms of the average cost of the solutions found and the best-found solution, in almost all the problem instances tried.
Comparing Different Fault Identification Algorithms in Distributed Power System
NASA Astrophysics Data System (ADS)
Alkaabi, Salim
A power system is a huge, complex system that delivers electrical power from the generation units to the consumers. As the demand for electrical power increased, distributed power generation was introduced to the power system. Faults may occur in the power system at any time and in different locations. These faults cause great damage to the system, as they might lead to full failure of the power system. Using distributed generation in the power system has made it even harder to identify the location of faults. The main objective of this work is to test different fault location identification algorithms on a power system with different amounts of power injected by distributed generators. As faults may lead the system to full failure, this is an important area for research. In this thesis, different fault location identification algorithms have been tested and compared while different amounts of power are injected from distributed generators. The algorithms were tested on the IEEE 34 node test feeder using MATLAB, and the results were compared to find when these algorithms might fail and how reliable these methods are.
Krivitsky, Pavel N.; Handcock, Mark S.; Raftery, Adrian E.; Hoff, Peter D.
2009-01-01
Social network data often involve transitivity, homophily on observed attributes, clustering, and heterogeneity of actor degrees. We propose a latent cluster random effects model to represent all of these features, and we describe a Bayesian estimation method for it. The model is applicable to both binary and non-binary network data. We illustrate the model using two real datasets. We also apply it to two simulated network datasets with the same, highly skewed, degree distribution, but very different network behavior: one unstructured and the other with transitivity and clustering. Models based on degree distributions, such as scale-free, preferential attachment and power-law models, cannot distinguish between these very different situations, but our model does. PMID:20191087
Jungle Computing: Distributed Supercomputing Beyond Clusters, Grids, and Clouds
NASA Astrophysics Data System (ADS)
Seinstra, Frank J.; Maassen, Jason; van Nieuwpoort, Rob V.; Drost, Niels; van Kessel, Timo; van Werkhoven, Ben; Urbani, Jacopo; Jacobs, Ceriel; Kielmann, Thilo; Bal, Henri E.
In recent years, the application of high-performance and distributed computing in scientific practice has become increasingly widespread. Among the most widely available platforms to scientists are clusters, grids, and cloud systems. Such infrastructures currently are undergoing revolutionary change due to the integration of many-core technologies, providing orders-of-magnitude speed improvements for selected compute kernels. With high-performance and distributed computing systems thus becoming more heterogeneous and hierarchical, programming complexity is vastly increased. Further complexities arise because an urgent desire for scalability and issues including data distribution, software heterogeneity, and ad hoc hardware availability commonly force scientists into simultaneous use of multiple platforms (e.g., clusters, grids, and clouds used concurrently). A true computing jungle.
An Effective Intrusion Detection Algorithm Based on Improved Semi-supervised Fuzzy Clustering
NASA Astrophysics Data System (ADS)
Li, Xueyong; Zhang, Baojian; Sun, Jiaxia; Yan, Shitao
An algorithm for intrusion detection based on improved evolutionary semi-supervised fuzzy clustering is proposed. It is suited to situations in which labeled data are more difficult to obtain than unlabeled data in intrusion detection systems. The algorithm requires only a small number of labeled data together with a large number of unlabeled data; the class label information provided by the labeled data is used to guide the evolution process of each fuzzy partition on the unlabeled data, which plays the role of a chromosome. The algorithm can deal with fuzzy labels, is not easily trapped in local optima, and is suited to implementation on parallel architectures. Experiments show that the algorithm can improve classification accuracy and has high detection efficiency.
Knitting distributed cluster-state ladders with spin chains
Ronke, R.; D'Amico, I.; Spiller, T. P.
2011-09-15
Recently there has been much study on the application of spin chains to quantum state transfer and communication. Here we discuss the utilization of spin chains (set up for perfect quantum state transfer) for the knitting of distributed cluster-state structures, between spin qubits repeatedly injected and extracted at the ends of the chain. The cluster states emerge from the natural evolution of the system across different excitation number sectors. We discuss the decohering effects of errors in the injection and extraction process as well as the effects of fabrication and random errors.
Distributed autonomous systems: resource management, planning, and control algorithms
NASA Astrophysics Data System (ADS)
Smith, James F., III; Nguyen, ThanhVu H.
2005-05-01
Distributed autonomous systems, i.e., systems that have separated distributed components, each of which exhibits some degree of autonomy, are increasingly providing solutions to naval and other DoD problems. Recently developed control, planning and resource allocation algorithms for two types of distributed autonomous systems will be discussed. The first distributed autonomous system (DAS) to be discussed consists of a collection of unmanned aerial vehicles (UAVs) that are under fuzzy logic control. The UAVs fly and conduct meteorological sampling in a coordinated fashion determined by their fuzzy logic controllers to determine the atmospheric index of refraction. Once in flight, no human intervention is required. A fuzzy planning algorithm determines the optimal trajectory, sampling rate and pattern for the UAVs and an interferometer platform while taking into account risk, reliability, priority for sampling in certain regions, fuel limitations, mission cost, and related uncertainties. The real-time fuzzy control algorithm running on each UAV will give the UAV limited autonomy, allowing it to change course immediately without consulting with any commander, request other UAVs to help it, alter its sampling pattern and rate when observing interesting phenomena, or terminate the mission and return to base. The algorithms developed will be compared to a resource manager (RM) developed for another DAS problem related to electronic attack (EA). This RM is based on fuzzy logic and optimized by evolutionary algorithms. It allows a group of dissimilar platforms to use EA resources distributed throughout the group. For both DAS types, significant theoretical and simulation results will be presented.
A distributed Canny edge detector: algorithm and FPGA implementation.
Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J
2014-07-01
The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
NASA Astrophysics Data System (ADS)
Zainuddin, Zarita; Lai, Kee Huong; Ong, Pauline
2013-04-01
Artificial neural networks (ANNs) are powerful mathematical models that are used to solve complex real world problems. Wavelet neural networks (WNNs), which were developed based on the wavelet theory, are a variant of ANNs. During the training phase of WNNs, several parameters need to be initialized, including the type of wavelet activation functions, translation vectors, and dilation parameter. The conventional k-means and fuzzy c-means clustering algorithms have been used to select the translation vectors. However, the solution vectors might get trapped at local minima. In this regard, the evolutionary harmony search algorithm, which is capable of searching for near-optimum solution vectors, both locally and globally, is introduced to circumvent this problem. In this paper, the conventional k-means and fuzzy c-means clustering algorithms were hybridized with the metaheuristic harmony search algorithm. In addition to obtaining the estimation of the global minima accurately, these hybridized algorithms also offer more than one solution to a particular problem, since many possible solution vectors can be generated and stored in the harmony memory. To validate the robustness of the proposed WNNs, the real world problem of epileptic seizure detection was presented. The overall classification accuracy from the simulation showed that the hybridized metaheuristic algorithms outperformed the standard k-means and fuzzy c-means clustering algorithms.
An efficient clustering algorithm for partitioning Y-short tandem repeats data
2012-01-01
Background Y-Short Tandem Repeats (Y-STR) data consist of many similar and almost similar objects. This characteristic of Y-STR data causes two problems with partitioning: non-unique centroids and local minima problems. As a result, the existing partitioning algorithms produce poor clustering results. Results Our new algorithm, called k-Approximate Modal Haplotypes (k-AMH), obtains the highest clustering accuracy scores for five out of six datasets, and produces an equal performance for the remaining dataset. Furthermore, clustering accuracy scores of 100% are achieved for two of the datasets. The k-AMH algorithm records the highest mean accuracy score of 0.93 overall, compared to that of other algorithms: k-Population (0.91), k-Modes-RVF (0.81), New Fuzzy k-Modes (0.80), k-Modes (0.76), k-Modes-Hybrid 1 (0.76), k-Modes-Hybrid 2 (0.75), Fuzzy k-Modes (0.74), and k-Modes-UAVM (0.70). Conclusions The partitioning performance of the k-AMH algorithm for Y-STR data is superior to that of other algorithms, owing to its ability to solve the non-unique centroids and local minima problems. Our algorithm is also efficient in terms of time complexity, which is recorded as O(km(n-k)) and considered to be linear. PMID:23039132
A new distributed systems scheduling algorithm: a swarm intelligence approach
NASA Astrophysics Data System (ADS)
Haghi Kashani, Mostafa; Sarvizadeh, Raheleh; Jameii, Mahdi
2011-12-01
The scheduling problem in distributed systems is known to be NP-complete, and methods based on heuristic or metaheuristic search have been proposed to obtain optimal and suboptimal solutions. Task scheduling is a key factor for distributed systems to gain better performance. In this paper, an efficient method based on a memetic algorithm is developed to solve the problem of distributed systems scheduling. To balance load efficiently, Artificial Bee Colony (ABC) has been applied as the local search in the proposed memetic algorithm. The proposed method has been compared to an existing memetic-based approach in which the Learning Automata method is used as the local search. The results demonstrate that the proposed method outperforms the above-mentioned method in terms of communication cost.
The distribution of dark matter in the A2256 cluster
NASA Technical Reports Server (NTRS)
Henry, J. Patrick; Briel, Ulrich G.; Nulsen, Paul E. J.
1993-01-01
Using spatially resolved X-ray spectroscopy, it was determined that the X-ray emitting gas in the rich cluster A2256 is nearly isothermal to a radius of at least 0.76/h Mpc, or about three core radii. These data can be used to measure the distribution of the dark matter in the cluster. It was found that the total mass interior to 0.76/h Mpc and 1.5/h Mpc is (0.5 +/- 0.1) x 10^15/h and (1.0 +/- 0.5) x 10^15/h solar masses, respectively, where the errors encompass the full range allowed by all models used. Thus, the mass appropriate to the region where spectral information was obtained is well determined, but the uncertainties become large upon extrapolating beyond that region. It is shown that the galaxy orbits are mildly anisotropic, which may cause the beta discrepancy in this cluster.
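A nearly isothermal gas makes the mass estimate particularly simple. Assuming the gas is in hydrostatic equilibrium in a spherically symmetric potential (the standard assumption for such measurements, though the abstract does not state it), the total mass inside radius r follows from the gas density profile alone. The symbols below are introduced here for illustration: T is the gas temperature, rho_g the gas density, mu the mean molecular weight, m_p the proton mass, and r_c and beta the core radius and slope of an assumed beta-model profile.

```latex
M(<r) \;=\; -\,\frac{k_B T\, r}{G \mu m_p}\,
            \frac{d \ln \rho_g}{d \ln r}
\qquad \text{(isothermal gas, hydrostatic equilibrium)}

\rho_g(r) \propto \left[ 1 + \left(\frac{r}{r_c}\right)^{2} \right]^{-3\beta/2}
\;\;\Longrightarrow\;\;
M(<r) \;=\; \frac{3 \beta\, k_B T\, r}{G \mu m_p}\,
            \frac{(r/r_c)^{2}}{1 + (r/r_c)^{2}}
```

Beyond the radius where the temperature is measured, T has to be extrapolated, which is consistent with the larger uncertainty quoted for the mass inside 1.5/h Mpc.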
Clustering-based robust three-dimensional phase unwrapping algorithm.
Arevalillo-Herráez, Miguel; Burton, David R; Lalor, Michael J
2010-04-01
Relatively recent techniques that produce phase volumes have motivated the study of three-dimensional (3D) unwrapping algorithms that inherently incorporate the third dimension into the process. We propose a novel 3D unwrapping algorithm that can be considered to be a generalization of the minimum spanning tree (MST) approach. The technique combines characteristics of some of the most robust existing methods: it uses a quality map to guide the unwrapping process, a region growing mechanism to progressively unwrap the signal, and also cut surfaces to avoid error propagation. The approach has been evaluated in the context of noncontact measurement of dynamic objects, suggesting a better performance than MST-based approaches. PMID:20357860
The Gas Distribution in the Outer Regions of Galaxy Clusters
NASA Technical Reports Server (NTRS)
Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Lau, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, L.; Gastaldello, F.
2012-01-01
Aims. We present our analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We have exploited the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We stacked the density profiles to detect a signal beyond r200 and measured the typical density and scatter in cluster outskirts. We also computed the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compared our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r(sub 500). Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping agree better with the observed gas distribution. We report high-confidence detection of a systematic difference between cool-core and non cool-core clusters beyond approximately 0.3r(sub 200), which we explain by a different distribution of the gas in the two classes. Beyond approximately r(sub 500), galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the ENZO simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside
The Gas Distribution in Galaxy Cluster Outer Regions
NASA Technical Reports Server (NTRS)
Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Lau, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, S. L.; Gastaldello, F.
2012-01-01
Aims. We present the analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We exploit the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We stack the density profiles to detect a signal beyond r(sub 200) and measure the typical density and scatter in cluster outskirts. We also compute the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compare our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r(sub 500). Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping are in better agreement with the observed gas distribution. We report for the first time the high-confidence detection of a systematic difference between cool-core and non-cool-core clusters beyond 0.3r(sub 200), which we explain by a different distribution of the gas in the two classes. Beyond r(sub 500), galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside cluster
A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network.
Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue
2016-01-01
Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency. PMID:26907272
Node Non-Uniform Deployment Based on Clustering Algorithm for Underwater Sensor Networks.
Jiang, Peng; Liu, Jun; Wu, Feng
2015-01-01
A node non-uniform deployment based on clustering algorithm for underwater sensor networks (UWSNs) is proposed in this study. This algorithm is proposed because optimizing network connectivity rate and network lifetime is difficult for the existing node non-uniform deployment algorithms under the premise of improving the network coverage rate for UWSNs. A high network connectivity rate is achieved by determining the heterogeneous communication ranges of nodes during node clustering. Moreover, the concept of aggregate contribution degree is defined, and the nodes with lower aggregate contribution degrees are used to substitute the dying nodes to decrease the total movement distance of nodes and prolong the network lifetime. Simulation results show that the proposed algorithm can achieve a better network coverage rate and network connectivity rate, as well as decrease the total movement distance of nodes and prolong the network lifetime. PMID:26633408
The Development of FPGA-Based Pseudo-Iterative Clustering Algorithms
NASA Astrophysics Data System (ADS)
Drueke, Elizabeth; Fisher, Wade; Plucinski, Pawel
2016-03-01
The Large Hadron Collider (LHC) in Geneva, Switzerland, is set to undergo major upgrades in 2025 in the form of the High-Luminosity Large Hadron Collider (HL-LHC). In particular, several hardware upgrades are proposed to the ATLAS detector, one of the two general purpose detectors. These hardware upgrades include, but are not limited to, a new hardware-level clustering algorithm, to be performed by a field programmable gate array, or FPGA. In this study, we develop that clustering algorithm and compare the output to a Python-implemented topoclustering algorithm developed at the University of Oregon. Here, we present the agreement between the FPGA output and expected output, with particular attention to the time required by the FPGA to complete the algorithm and other limitations set by the FPGA itself.
An Efficient Method of Key-Frame Extraction Based on a Cluster Algorithm
Zhang, Qiang; Yu, Shao-Pei; Zhou, Dong-Sheng; Wei, Xiao-Peng
2013-01-01
This paper proposes a novel method of key-frame extraction for use with motion capture data. This method is based on an unsupervised cluster algorithm. First, the motion sequence is clustered into two classes by the similarity distance of the adjacent frames so that the thresholds needed in the next step can be determined adaptively. Second, a dynamic cluster algorithm called ISODATA is used to cluster all the frames and the frames nearest to the center of each class are automatically extracted as key-frames of the sequence. Unlike many other clustering techniques, the present improved cluster algorithm can automatically address different motion types without any need for specified parameters from users. The proposed method is capable of summarizing motion capture data reliably and efficiently. The present work also provides a meaningful comparison between the results of the proposed key-frame extraction technique and other previous methods. These results are evaluated in terms of metrics that measure reconstructed motion and the mean absolute error value, which are derived from the reconstructed data and the original data. PMID:24511336
Tiganj, Z; Mboup, M
2011-12-01
In this paper, we propose a simple and straightforward algorithm for neural spike sorting. The algorithm is based on the observation that the distribution of a neural signal largely deviates from the uniform distribution and is rather unimodal. The detected spikes to be sorted are first processed with some feature extraction technique, such as PCA, and then represented in a space with reduced dimension by keeping only a few of the most important features. The resulting space is next filtered in order to emphasize the differences between the centers and the borders of the clusters. Using some prior knowledge of the lowest level activity of a neuron, such as the minimal firing rate, we find the number of clusters and the center of each cluster. The spikes are then sorted using a simple greedy algorithm which grabs the nearest neighbors. We have tested the proposed algorithm on real extracellular recordings and used the simultaneous intracellular recordings to verify the results of the sorting. The results suggest that the algorithm is robust and reliable and that it compares favorably with the state-of-the-art approaches. The proposed algorithm tends to be conservative; it is simple to implement and is thus suitable for both research and clinical applications as an interesting alternative to the more sophisticated approaches. PMID:22064910
Quantum cluster algorithm for frustrated Ising models in a transverse field
NASA Astrophysics Data System (ADS)
Biswas, Sounak; Rakala, Geet; Damle, Kedar
2016-06-01
Working within the stochastic series expansion framework, we introduce and characterize a plaquette-based quantum cluster algorithm for quantum Monte Carlo simulations of transverse field Ising models with frustrated Ising exchange interactions. As a demonstration of the capabilities of this algorithm, we show that a relatively small ferromagnetic next-nearest-neighbor coupling drives the transverse field Ising antiferromagnet on the triangular lattice from an antiferromagnetic three-sublattice ordered state at low temperature to a ferrimagnetic three-sublattice ordered state.
Du, Tingsong; Hu, Yang; Ke, Xianting
2015-01-01
An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA, based on quantum computing, which offers exponential acceleration for heuristic algorithms, uses quantum bits to encode artificial fish and uses the quantum revolving gate, preying behavior, following behavior, and variation of quantum artificial fish to update the artificial fish in the search for the optimal value. We then apply the proposed algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) to simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems, and the simulation results for a 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss compared with BAFSA, GAFSA, and QAFSA. PMID:26447713
Distributed-memory Parallel Algorithms for Matching and Coloring
Catalyurek, Umit; Dobrian, Florin; Gebremedhin, Assefaw H.; Halappanavar, Mahantesh; Pothen, Alex
2011-05-31
Graph matching and coloring constitute two fundamental classes of combinatorial problems having numerous established as well as emerging applications in computational science and engineering, high-performance computing, and informatics. We provide a snapshot of an on-going work on the design and implementation of new highly-scalable distributed-memory parallel algorithms for two prototypical problems from these classes, edge-weighted matching and distance-1 vertex coloring. Graph algorithms in general have low concurrency and poor data locality, making it challenging to achieve scalability on massively parallel machines. We overcome this challenge by employing a variety of techniques, including approximation, speculation and iteration, optimized communication, and randomization, in concert. We present preliminary results on weak and strong scalability studies conducted on an IBM Blue Gene/P machine employing up to tens of thousands of processors. The results show that the algorithms hold strong potential for computing at petascale.
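For readers unfamiliar with distance-1 vertex coloring, the problem itself can be illustrated with a serial greedy heuristic. This is only a minimal sketch of the problem, not the distributed-memory algorithm described in the abstract; the names `greedy_coloring` and `is_valid_coloring` and the example graph are hypothetical.

```python
def greedy_coloring(adjacency):
    """Give each vertex the smallest color not already used by a neighbor."""
    colors = {}
    for v in sorted(adjacency):
        used = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


def is_valid_coloring(adjacency, colors):
    """Distance-1 condition: adjacent vertices never share a color."""
    return all(colors[u] != colors[v] for u in adjacency for v in adjacency[u])


# Hypothetical example: a 5-cycle, which needs three colors.
graph = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
coloring = greedy_coloring(graph)
```

A parallel version would have to resolve conflicts between vertices colored concurrently on different processors, which is where the speculation and iteration mentioned in the abstract would come in.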
Durrell, Patrick R.; Accetta, Katharine; Côté, Patrick; Blakeslee, John P.; Ferrarese, Laura; McConnachie, Alan; Gwyn, Stephen; Peng, Eric W.; Zhang, Hongxin; Mihos, J. Christopher; Puzia, Thomas H.; Jordán, Andrés; Lançon, Ariane; Liu, Chengze; Cuillandre, Jean-Charles; Boissier, Samuel; Boselli, Alessandro; Courteau, Stéphane; Duc, Pierre-Alain; and others
2014-10-20
We report on a large-scale study of the distribution of globular clusters (GCs) throughout the Virgo cluster, based on photometry from the Next Generation Virgo Cluster Survey (NGVS), a large imaging survey covering Virgo's primary subclusters (Virgo A = M87 and Virgo B = M49) out to their virial radii. Using the $g'_{o}$, $(g' - i')_{o}$ color-magnitude diagram of unresolved and marginally resolved sources within the NGVS, we have constructed two-dimensional maps of the (irregular) GC distribution over 100 deg$^{2}$ to a depth of $g'_{o} = 24$. We present the clearest evidence to date showing the difference in concentration between red and blue GCs over the full extent of the cluster, where the red (more metal-rich) GCs are largely located around the massive early-type galaxies in Virgo, while the blue (metal-poor) GCs have a much more extended spatial distribution with significant populations still present beyond 83' (∼215 kpc) along the major axes of both M49 and M87. A comparison of our GC maps to the diffuse light in the outermost regions of M49 and M87 shows remarkable agreement in the shape, ellipticity, and boxiness of both luminous systems. We also find evidence for spatial enhancements of GCs surrounding M87 that may be indicative of recent interactions or an ongoing merger history. We compare the GC map to that of the locations of Virgo galaxies and the X-ray intracluster gas, and find generally good agreement between these various baryonic structures. We calculate that the Virgo cluster contains a total population of $N_{\rm GC} = 67{,}300 \pm 14{,}400$, of which 35% are located in M87 and M49 alone. For the first time, we compute a cluster-wide specific frequency $S_{N,{\rm CL}} = 2.8 \pm 0.7$, after correcting for Virgo's diffuse light. We also find a GC-to-baryonic mass fraction $\epsilon_{b} = 5.7 \pm 1.1 \times 10^{-4}$ and a GC-to-total cluster mass formation efficiency $\epsilon_{t} = 2.9 \pm 0.5 \times 10^{-5}$, the latter values
The mass distribution of the Fornax dSph: constraints from its globular cluster distribution
NASA Astrophysics Data System (ADS)
Cole, David R.; Dehnen, Walter; Read, Justin I.; Wilkinson, Mark I.
2012-10-01
Uniquely among the dwarf spheroidal (dSph) satellite galaxies of the Milky Way, Fornax hosts globular clusters. It remains a puzzle as to why dynamical friction has not yet dragged any of Fornax's five globular clusters to the centre, and also why there is no evidence that any similar star cluster has been in the past (for Fornax or any other tidally undisrupted dSph). We set up a suite of 2800 N-body simulations that sample the full range of globular cluster orbits and mass models consistent with all existing observational constraints for Fornax. In agreement with previous work, we find that if Fornax has a large dark matter core, then its globular clusters remain close to their currently observed locations for long times. Furthermore, we find previously unreported behaviour for clusters that start inside the core region. These are pushed out of the core and gain orbital energy, a process we call 'dynamical buoyancy'. Thus, a cored mass distribution in Fornax will naturally lead to a shell-like globular cluster distribution near the core radius, independent of the initial conditions. By contrast, cold dark matter-type cusped mass distributions lead to the rapid infall of at least one cluster within Δt = 1-2 Gyr, except when picking unlikely initial conditions for the cluster orbits (~2 per cent probability), and almost all clusters within Δt = 10 Gyr. Alternatively, if Fornax has only a weakly cusped mass distribution, then dynamical friction is much reduced. While over Δt = 10 Gyr this still leads to the infall of one to four clusters from their present orbits, the infall of any cluster within Δt = 1-2 Gyr is much less likely (with probability 0-70 per cent, depending on Δt and the strength of the cusp). Such a solution to the timing problem requires (in addition to a shallow dark matter cusp) that in the past the globular clusters were somewhat further from Fornax than today; they most likely did not form within Fornax, but were accreted.
Muster: Massively Scalable Clustering
2010-05-20
Muster is a framework for scalable cluster analysis. It includes implementations of classic K-Medoids partitioning algorithms, as well as infrastructure for making these algorithms run scalably on very large systems. In particular, Muster contains algorithms such as CAPEK (described in reference 1) that are capable of clustering highly distributed data sets in-place on a hundred thousand or more processes.
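As a reminder of what a K-Medoids partitioning does, here is a minimal serial sketch using the simple alternating (assign, then update) scheme. It is not Muster's API and not CAPEK; the function names, the deterministic "first k points" initialization, and the toy data are all assumptions made for illustration.

```python
def assign(points, medoids, dist):
    """Map each medoid index to the indices of the points closest to it."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: dist(p, points[m]))
        clusters[nearest].append(i)
    return clusters


def k_medoids(points, k, dist, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    medoids = list(range(k))  # assumption: start from the first k points
    for _ in range(max_iter):
        clusters = assign(points, medoids, dist)
        new_medoids = [
            # New medoid: the member with the smallest total distance to its cluster.
            min(members, key=lambda c: sum(dist(points[c], points[j]) for j in members))
            if members else m
            for m, members in clusters.items()
        ]
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(points, medoids, dist)


# Hypothetical 1-D data with two obvious groups.
data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
medoids, clusters = k_medoids(data, 2, lambda a, b: abs(a - b))
```

Unlike k-means centroids, the medoids are always actual data points, so the method only needs a pairwise distance function.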
Improving permafrost distribution modelling using feature selection algorithms
NASA Astrophysics Data System (ADS)
Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail
2016-04-01
The availability of an increasing amount of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional datasets is the number of input features (variables) involved. Applying ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps reduce the number of factors required and improves the knowledge of the adopted features and their relation with the studied phenomenon. Moreover, removing irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM-derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidence (geophysical and thermal data and rock glacier inventories) that serves as training permafrost data. The FS algorithms used indicated which variables appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its
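The information gain criterion named above can be made concrete with a small sketch. It assumes a discrete (or already binned) predictor and a binary presence/absence label; the function names and the toy feature values are hypothetical and are not taken from the study's dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical toy data: 1 = permafrost present, 0 = absent.
presence = [1, 1, 0, 0]
informative = ["north", "north", "south", "south"]  # separates the classes exactly
uninformative = ["a", "b", "a", "b"]                # independent of the classes
ig_informative = information_gain(informative, presence)
ig_uninformative = information_gain(uninformative, presence)
```

Continuous predictors such as altitude or slope would need to be discretized before a score like this can be computed; how that was done in the study is not stated in the abstract.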
Solvation Effects on Structure and Charge Distribution in Anionic Clusters
NASA Astrophysics Data System (ADS)
Weber, J. Mathias
2015-03-01
The interaction of ions with solvent molecules modifies the properties of both solvent and solute. Solvation generally stabilizes compact charge distributions compared to more diffuse ones. In the most extreme cases, solvation will alter the very composition of the ion itself. We use infrared photodissociation spectroscopy of mass-selected ions to probe how solvation affects the structures and charge distributions of metal-CO2 cluster anions. We gratefully acknowledge the National Science Foundation for funding through Grant CHE-0845618 (for graduate student support) and for instrumentation funding through Grant PHY-1125844.
Optimizing scheduling problem using an estimation of distribution algorithm and genetic algorithm
NASA Astrophysics Data System (ADS)
Qun, Jiang; Yang, Ou; Dong, Shi-Du
2007-12-01
This paper presents a methodology for using heuristic search methods to optimize scheduling problems. Specifically, an Estimation of Distribution Algorithm (EDA), namely Population Based Incremental Learning (PBIL), and a Genetic Algorithm (GA) have been applied to finding an effective arrangement of university curriculum schedules. To our knowledge, EDAs have been applied to fewer real-world problems than GAs, and the goal of the present paper is to expand the application domain of this technique. The experimental results indicate good applicability of PBIL to scheduling optimization.
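The core PBIL loop is short enough to sketch. The version below is a minimal illustration on a toy bit-counting objective (OneMax), not the curriculum scheduling encoding used in the paper; the parameter values, the fixed seed, and the omission of the optional mutation step are assumptions.

```python
import random


def pbil(fitness, n_bits, pop_size=50, generations=100, learning_rate=0.1, seed=0):
    """Population Based Incremental Learning over fixed-length bit strings."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # probability that each bit is 1
    best, best_fit = None, float("-inf")
    for _ in range(generations):
        # Sample a population from the current probability vector.
        population = [
            [1 if rng.random() < p else 0 for p in probs] for _ in range(pop_size)
        ]
        leader = max(population, key=fitness)
        if fitness(leader) > best_fit:
            best, best_fit = leader, fitness(leader)
        # Shift the probability vector toward the best sample of this generation.
        probs = [
            (1 - learning_rate) * p + learning_rate * bit
            for p, bit in zip(probs, leader)
        ]
    return best, best_fit, probs


# Toy objective (assumption): maximize the number of 1 bits.
best, best_fit, probs = pbil(sum, n_bits=20)
```

For a scheduling problem, the bit string would have to encode an assignment of courses to time slots and rooms, and the fitness function would penalize constraint violations; those details are specific to the paper and are not reproduced here.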
An Effective Tri-Clustering Algorithm Combining Expression Data with Gene Regulation Information
Li, Ao; Tuck, David
2009-01-01
Motivation: Bi-clustering algorithms aim to identify sets of genes sharing similar expression patterns across a subset of conditions. However, direct interpretation or prediction of gene regulatory mechanisms may be difficult as only gene expression data is used. Information about gene regulators may also be available, most commonly about which transcription factors may bind to the promoter region and thus control the expression level of a gene. Thus a method to integrate gene expression and gene regulation information is desirable for clustering and analyzing. Methods: By incorporating gene regulatory information with gene expression data, we define regulated expression values (REV) as indicators of how a gene is regulated by a specific factor. Existing bi-clustering methods are extended to a three dimensional data space by developing a heuristic TRI-Clustering algorithm. An additional approach named Automatic Boundary Searching algorithm (ABS) is introduced to automatically determine the boundary threshold. Results: Results based on incorporating ChIP-chip data representing transcription factor-gene interactions show that the algorithms are efficient and robust for detecting tri-clusters. Detailed analysis of the tri-cluster extracted from yeast sporulation REV data shows genes in this cluster exhibited significant differences during the middle and late stages. The implicated regulatory network was then reconstructed for further study of defined regulatory mechanisms. Topological and statistical analysis of this network demonstrated evidence of significant changes of TF activities during the different stages of yeast sporulation, and suggests this approach might be a general way to study regulatory networks undergoing transformations. PMID:19838334
A distributed geo-routing algorithm for wireless sensor networks.
Joshi, Gyanendra Prasad; Kim, Sung Won
2009-01-01
Geographic wireless sensor networks use position information for greedy routing. Greedy routing works well in dense networks, whereas in sparse networks it may fail and require a recovery algorithm. Recovery algorithms help the packet to get out of the communication void. However, these algorithms are generally costly for resource constrained position-based wireless sensor networks (WSNs). In this paper, we propose a void avoidance algorithm (VAA), a novel idea based on upgrading virtual distance. VAA allows wireless sensor nodes to remove all stuck nodes by transforming the routing graph and forwarding packets using only greedy routing. In VAA, the stuck node upgrades distance unless it finds a next hop node that is closer to the destination than it is. VAA guarantees packet delivery if there is a topologically valid path. Further, it is completely distributed, immediately responds to node failure or topology changes and does not require planarization of the network. NS-2 is used to evaluate the performance and correctness of VAA and we compare its performance to other protocols. Simulations show our proposed algorithm consumes less energy, has an efficient path and substantially less control overheads. PMID:22408514
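The idea of a stuck node raising its distance so that plain greedy forwarding can continue can be sketched as follows. The abstract does not give VAA's actual upgrade rule, so this sketch swaps in an LRTA*-style update (neighbor's virtual distance plus link length) as an assumption; the topology, node names, and the `route` function are hypothetical.

```python
import math


def route(src, dst, pos, links, vdist, max_hops=100):
    """Greedy forwarding on virtual distances; a stuck node raises its own value."""
    path = [src]
    node = src
    while node != dst and len(path) <= max_hops:
        nxt = min(links[node], key=lambda v: vdist[v])
        if vdist[nxt] >= vdist[node]:
            # Stuck: no neighbor is closer, so upgrade this node's virtual distance
            # (assumed rule: neighbor's virtual distance plus the link length).
            vdist[node] = vdist[nxt] + math.dist(pos[node], pos[nxt])
        node = nxt
        path.append(node)
    return path


# Hypothetical topology: A is a dead end that looks attractive to greedy routing.
pos = {"S": (0, 0), "A": (2, 0), "B": (0, 2), "C": (2, 2), "E": (4, 2), "D": (4, 0)}
links = {
    "S": ["A", "B"], "A": ["S"], "B": ["S", "C"],
    "C": ["B", "E"], "E": ["C", "D"], "D": ["E"],
}
# Virtual distances start as Euclidean distances to the destination D.
vdist = {n: math.dist(pos[n], pos["D"]) for n in pos}
first = route("S", "D", pos, links, vdist)
second = route("S", "D", pos, links, vdist)
```

In this toy run the first packet detours through the dead end and causes two upgrades; the second packet then reaches the destination using greedy forwarding only, because the raised values steer it away from the void.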
A seed expanding cluster algorithm for deriving upwelling areas on sea surface temperature images
NASA Astrophysics Data System (ADS)
Nascimento, Susana; Casca, Sérgio; Mirkin, Boris
2015-12-01
In this paper a novel clustering algorithm is proposed as a version of the seeded region growing (SRG) approach for the automatic recognition of coastal upwelling from sea surface temperature (SST) images. The new algorithm, one seed expanding cluster (SEC), takes advantage of the concept of approximate clustering due to Mirkin (1996, 2013) to derive a homogeneity criterion in the format of a product rather than the conventional difference between a pixel value and the mean of values over the region of interest. It involves a boundary-oriented pixel labeling so that the cluster growing is performed by expanding its boundary iteratively. The starting point is a cluster consisting of just one seed, the pixel with the coldest temperature. The baseline version of the SEC algorithm uses Otsu's thresholding method to fine-tune the homogeneity threshold. Unfortunately, this method does not always lead to a satisfactory solution. Therefore, we introduce a self-tuning version of the algorithm in which the homogeneity threshold is locally derived from the approximation criterion over a window around the pixel under consideration. The window serves as a boundary regularizer. These two unsupervised versions of the algorithm have been applied to a set of 28 SST images of the western coast of mainland Portugal, and compared against a supervised version fine-tuned by maximizing the F-measure with respect to manually labeled ground-truth maps. The areas built by the unsupervised versions of the SEC algorithm coincide closely with the ground-truth regions in cases where the upwelling areas consist of a single continuous fragment of the SST map.
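For context, the conventional seeded region growing that SEC departs from can be sketched in a few lines. This is the baseline difference-to-mean criterion mentioned in the abstract, not the product-form SEC criterion; the 4-neighborhood, the fixed threshold, and the toy temperature grid are assumptions.

```python
from collections import deque


def grow_from_coldest(sst, threshold):
    """Grow one region from the coldest pixel using a difference-to-mean test."""
    rows, cols = len(sst), len(sst[0])
    seed = min(
        ((r, c) for r in range(rows) for c in range(cols)),
        key=lambda rc: sst[rc[0]][rc[1]],
    )
    region = {seed}
    total = sst[seed[0]][seed[1]]  # running sum of temperatures inside the region
    frontier = deque([seed])
    while frontier:
        r, c = frontier.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                mean = total / len(region)
                # Conventional criterion: pixel value close to the current region mean.
                if abs(sst[nr][nc] - mean) <= threshold:
                    region.add((nr, nc))
                    total += sst[nr][nc]
                    frontier.append((nr, nc))
    return region


# Hypothetical 3x3 SST grid (degrees C) with a cold patch in the top-left corner.
sst = [
    [14.0, 14.5, 18.0],
    [14.2, 15.0, 18.5],
    [18.2, 18.4, 19.0],
]
upwelling = grow_from_coldest(sst, threshold=1.5)
```

The result depends on the fixed threshold, which is the tuning issue the abstract addresses with Otsu's method and with the self-tuning variant.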
Dark matter distribution in the merging cluster Abell 2163
NASA Astrophysics Data System (ADS)
Soucail, G.
2012-04-01
Context. The cluster Abell 2163 is a merging system of several subclusters with complex dynamics. It presents exceptional X-ray properties (high temperature and luminosity), suggesting that it is a very massive cluster. Recent 2D analysis of the gas distribution has revealed a complex and multiphase structure. Aims: This paper presents a wide-field weak lensing study of the dark matter distribution in the cluster in order to provide an alternative vision of the merging status of the cluster. The 2D mass distribution was built and compared to the galaxies and gas distributions. Methods: A Bayesian method, implemented in the Im2shape software, was used to fit the shape parameters of the faint background galaxies and to correct for PSF smearing. A careful color selection on the background galaxies was applied to retrieve the weak lensing signal. Shear signal was measured out to more than 2 Mpc (≃12' from the center). The radial shear profile was fit with different parametric mass profiles. The 2D mass map was built from the shear distribution and used to identify the different mass components. Results: The 2D mass map agrees with the galaxy distribution, while the total mass inferred from weak lensing shows a strong discrepancy with the X-ray deduced mass. Regardless of the method used, the virial mass $M_{200}$ falls in the range $8$ to $14 \times 10^{14}\,h_{70}^{-1}\,M_{\odot}$, a value that is two times less than the mass deduced from X-rays. The central mass clump appears bimodal in the dark matter distribution, with a mass ratio ~3:1 between the two components. The infalling clump A2163-B is detected in weak lensing as an independent entity. All these results are interpreted in the context of a multiple merger seen less than 1 Gyr after the main crossover. Based on observations obtained with MegaPrime/MegaCam, a joint project of Canada-France-Hawaii Telescope (CFHT) and CEA/DAPNIA, at the Canada-France-Hawaii Telescope (CFHT) which is operated by the National Research Council (NRC) of
Experimental realization of the Deutsch-Jozsa algorithm with a six-qubit cluster state
Vallone, Giuseppe; Donati, Gaia; Bruno, Natalia; Chiuri, Andrea; Mataloni, Paolo
2010-05-15
We describe an experimental realization of the Deutsch-Jozsa quantum algorithm to evaluate the properties of a two-bit Boolean function in the framework of one-way quantum computation. For this purpose, a two-photon six-qubit cluster state was engineered. Its peculiar topological structure is the basis of the original measurement pattern allowing the algorithm realization. The good agreement of the experimental results with the theoretical predictions, obtained at ~1 kHz success rate, demonstrates the correct implementation of the algorithm.
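What the Deutsch-Jozsa algorithm decides for a two-bit Boolean function can be checked with a small classical state-vector simulation. The sketch below uses the standard circuit model with a phase oracle, not the one-way (cluster-state) measurement pattern realized in the experiment; the function names and the encoding of inputs as integers 0 to 3 are assumptions.

```python
def hadamard_all(state):
    """Apply a Hadamard gate to every qubit of a state vector."""
    n_qubits = len(state).bit_length() - 1
    out = list(state)
    for q in range(n_qubits):
        step = 1 << q
        for i in range(len(out)):
            if not i & step:
                a, b = out[i], out[i | step]
                out[i], out[i | step] = (a + b) / 2 ** 0.5, (a - b) / 2 ** 0.5
    return out


def deutsch_jozsa(f, n_bits=2):
    """Classify f as 'constant' or 'balanced' with a single oracle application."""
    size = 1 << n_bits
    state = [1.0] + [0.0] * (size - 1)          # start in |00>
    state = hadamard_all(state)                  # uniform superposition
    state = [amp * (-1) ** f(x) for x, amp in enumerate(state)]  # phase oracle
    state = hadamard_all(state)
    p_zero = state[0] ** 2                       # probability of measuring |00>
    return "constant" if p_zero > 0.5 else "balanced"


# Inputs x are integers 0..3 encoding the two bits (assumption).
result_const = deutsch_jozsa(lambda x: 1)
result_parity = deutsch_jozsa(lambda x: (x ^ (x >> 1)) & 1)
```

The probability of the all-zero outcome is 1 for a constant function and 0 for a balanced one, so one oracle query suffices, whereas a deterministic classical test of a two-bit function would need up to three evaluations.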
Spitzer IR Colors and ISM Distributions of Virgo Cluster Spirals
NASA Astrophysics Data System (ADS)
Kenney, Jeffrey D.; Wong, I.; Kenney, Z.; Murphy, E.; Helou, G.; Howell, J.
2012-01-01
IRAC infrared images of 44 spiral and peculiar galaxies from the Spitzer Survey of the Virgo Cluster help reveal the interactions which transform galaxies in clusters. We explore how the location of galaxies in the IR 3.6-8μm color-magnitude diagram is related to the spatial distributions of ISM/star formation, as traced by PAH emission in the 8μm band. Based on their 8μm/PAH radial distributions, we divide the galaxies into 4 groups: normal, truncated, truncated/compact, and anemic. Normal galaxies have relatively normal PAH distributions. They are the "bluest" galaxies, with the largest 8/3.6μm ratios. They are relatively unaffected by the cluster environment, and have probably never passed through the cluster core. Truncated galaxies have a relatively normal 8μm/PAH surface brightness in the inner disk, but are abruptly truncated with little or no emission in the outer disk. They have intermediate ("green") colors, while those which are more severely truncated are "redder". Most truncated galaxies have undisturbed stellar disks and many show direct evidence of active ram pressure stripping. Truncated/compact galaxies have high 8μm/PAH surface brightness in the very inner disk (central 1 kpc) but are abruptly truncated close to center with little or no emission in the outer disk. They have intermediate global colors, similar to the other truncated galaxies. While they have the most extreme ISM truncation, they have vigorous circumnuclear star formation. Most of these have disturbed stellar disks, and they are probably produced by a combination of gravitational interaction plus ram pressure stripping. Anemic galaxies have a low 8μm/PAH surface brightness even in the inner disk. These are the "reddest" galaxies, with the smallest 8/3.6μm ratios. The origin of the anemics seems to be a combination of starvation, gravitational interactions, and long-ago ram pressure stripping.
A matching algorithm for the distribution of human pancreatic islets
Qian, Dajun; Kaddis, John; Niland, Joyce C.
2011-01-01
The success of human pancreatic islet transplantation in a subset of type 1 diabetic patients has led to an increased demand for this tissue in both clinical and basic research, yet the availability of such preparations is limited and the quality highly variable. Under the current process of islet distribution for basic science experimentation nationwide, specialized laboratories attempt to distribute islets to one or more scientists based on a list of known investigators. This Local Decision Making (LDM) process has been found to be ineffective and suboptimal. To alleviate these problems, a computerized Matching Algorithm for Islet Distribution (MAID) was developed to better match the functional, morphological, and quality characteristics of islet preparations to the criteria desired by basic research laboratories, i.e. requesters. The algorithm searches for an optimal combination of requesters using detailed screening, sorting, and search procedures. When applied to a data set of 68 human islet preparations distributed by the Islet Cell Resource (ICR) Center Consortium, MAID reduced the number of requesters that a) did not receive any islets, and b) received mis-matched shipments. These results suggest that MAID is an improved more efficient approach to the centralized distribution of human islets within a consortium setting. PMID:22199413
A new distribution vector and its application in genome clustering.
Zhao, Bo; He, Rong L; Yau, Stephen S-T
2011-05-01
In this paper we report a novel mathematical method to transform the DNA sequences into the distribution vectors which correspond to points in the sixty dimensional space. Each component of the distribution vector represents the distribution of one kind of nucleotide in k segments of the DNA sequences. The mathematical and statistical properties of the distribution vectors are demonstrated and examined with huge datasets of human DNA sequences and random sequences. The determined expectation and standard deviation can make the mapping stable and practicable. Moreover, we apply the distribution vectors to the clustering of the Haemagglutinin (HA) gene of 60 H1N1 viruses from Human, Swine and Avian, the complete mitochondrial genomes from 80 placental mammals and the complete genomes from 50 bacteria. The 60 H1N1 viruses, 80 placental mammals and 50 bacteria are classified accurately and rapidly compared to the multiple sequence alignment methods. The results indicate that the distribution vectors can reveal the similarity and evolutionary relationship among homologous DNA sequences based on the distances between any two of these distribution vectors. The advantage of fast computation offers the distribution vectors the opportunity to deal with a huge amount of DNA sequences efficiently. PMID:21385621
Borodovsky, M; Peresetsky, A
1994-09-01
Non-homogeneous Markov chain models can represent biologically important regions of DNA sequences. The statistical pattern that is described by these models is usually weak and was found primarily because of strong biological indications. The general method for extracting similar patterns is presented in the current paper. The algorithm incorporates cluster analysis, multiple alignment and entropy minimization. The method was first tested using the set of DNA sequences produced by Markov chain generators. It was shown that artificial gene sequences, which initially have been randomly set up along the multiple alignment panels, are aligned according to the hidden triplet phase. Then the method was applied to real protein-coding sequences and the resulting alignment clearly indicated the triplet phase and produced the parameters of the optimal 3-periodic non-homogeneous Markov chain model. These Markov models were already employed in the GeneMark gene prediction algorithm, which is used in genome sequencing projects. The algorithm can also handle the case in which the sequences to be aligned reveal different statistical patterns, such as Escherichia coli protein-coding sequences belonging to Class II and Class III. The algorithm accepts a random mix of sequences from different classes, and is able to separate them into two groups (clusters), align each cluster separately, and define a non-homogeneous Markov chain model for each sequence cluster. PMID:7952897
Tame, M. S.; Kim, M. S.
2010-09-15
We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate general n-qubit versions and a scalable method to produce them. For this purpose, we propose a versatile photonic on-chip setup.
Solving the depth of the repeated texture areas based on the clustering algorithm
NASA Astrophysics Data System (ADS)
Xiong, Zhang; Zhang, Jun; Tian, Jinwen
2015-12-01
Reconstructing a 3D scene in monocular stereo vision requires the depth of the scene points in the image. Matching errors are inevitable during image matching, and they become numerous when the images contain many repeated-texture areas. At present, the multiple-baseline stereo imaging algorithm is commonly used to eliminate matching errors in repeated-texture areas. This algorithm can resolve the ambiguity caused by common repeated textures, but it places restrictions on the baseline and is slow. In this paper, we put forward an algorithm that calculates the depth of the matching points in repeated-texture areas using a clustering algorithm. First, we preprocess the images with a Gaussian filter. Second, we segment the repeated-texture regions of the images into image blocks with a superpixel-based spectral clustering segmentation algorithm and tag the blocks. Then, we match the two images and solve for the depth of the matched points. Finally, the depth of each image block is taken as the median of the depth values of the calculated points in the block, which gives the depth of the repeated-texture areas. Experiments on a large number of images show that the algorithm performs very well at calculating the depth of repeated-texture areas.
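The final step of this abstract, assigning each tagged block the median of its matched-point depths, can be sketched as follows. This is only an illustration under assumptions: the segmentation and matching stages are taken as already done, the input format of `(block_label, depth)` pairs is hypothetical, and the sample values are made up.

```python
from collections import defaultdict
from statistics import median


def block_depths(matches):
    """Return {block_label: median depth} from (block_label, depth) pairs."""
    by_block = defaultdict(list)
    for label, depth in matches:
        by_block[label].append(depth)
    # The median is robust to a minority of wrong matches inside a block.
    return {label: median(depths) for label, depths in by_block.items()}


# Block "A" contains one outlier depth (9.5), e.g. from a false match.
matches = [("A", 2.0), ("A", 2.1), ("A", 9.5), ("B", 4.0), ("B", 5.0)]
print(block_depths(matches))
```

Using the median rather than the mean keeps a single mismatched point from shifting the depth of the whole block.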
NEW MDS AND CLUSTERING BASED ALGORITHMS FOR PROTEIN MODEL QUALITY ASSESSMENT AND SELECTION
WANG, QINGGUO; SHANG, CHARLES; XU, DONG
2014-01-01
In protein tertiary structure prediction, assessing the quality of predicted models is an essential task. Over the past years, many methods have been proposed for the protein model quality assessment (QA) and selection problem. Despite significant advances, the discerning power of current methods is still unsatisfactory. In this paper, we propose two new algorithms, CC-Select and MDS-QA, based on multidimensional scaling and k-means clustering. For the model selection problem, CC-Select combines consensus with clustering techniques to select the best models from a given pool. Given a set of predicted models, CC-Select first calculates a consensus score for each structure based on its average pairwise structural similarity to other models. Then, similar structures are grouped into clusters using multidimensional scaling and clustering algorithms. In each cluster, the one with the highest consensus score is selected as a candidate model. For the QA problem, MDS-QA combines single-model scoring functions with consensus to determine more accurate assessment score for every model in a given pool. Using extensive benchmark sets of a large collection of predicted models, we compare the two algorithms with existing state-of-the-art quality assessment methods and show significant improvement. PMID:24808625
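The selection rule described for CC-Select (a consensus score from average pairwise similarity, then the top-scoring model per cluster) can be sketched as below. This is a minimal illustration, not the authors' implementation: the similarity matrix is made up, the cluster labels are assumed to come from a separate multidimensional scaling and k-means step that is not shown, and the function names are hypothetical.

```python
def consensus_scores(sim):
    """Average pairwise similarity of each model to all other models."""
    n = len(sim)
    return [sum(sim[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]


def select_candidates(sim, labels):
    """Return {cluster_label: index of the model with the highest consensus score}."""
    scores = consensus_scores(sim)
    best = {}
    for i, label in enumerate(labels):
        if label not in best or scores[i] > scores[best[label]]:
            best[label] = i
    return best


# Symmetric, made-up structural similarities between four predicted models.
sim = [
    [1.0, 0.9, 0.2, 0.3],
    [0.9, 1.0, 0.4, 0.5],
    [0.2, 0.4, 1.0, 0.8],
    [0.3, 0.5, 0.8, 1.0],
]
labels = [0, 0, 1, 1]  # assumed output of the clustering step
print(select_candidates(sim, labels))
```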
NEW MDS AND CLUSTERING BASED ALGORITHMS FOR PROTEIN MODEL QUALITY ASSESSMENT AND SELECTION.
Wang, Qingguo; Shang, Charles; Xu, Dong; Shang, Yi
2013-10-25
In protein tertiary structure prediction, assessing the quality of predicted models is an essential task. Over the past years, many methods have been proposed for the protein model quality assessment (QA) and selection problem. Despite significant advances, the discerning power of current methods is still unsatisfactory. In this paper, we propose two new algorithms, CC-Select and MDS-QA, based on multidimensional scaling and k-means clustering. For the model selection problem, CC-Select combines consensus with clustering techniques to select the best models from a given pool. Given a set of predicted models, CC-Select first calculates a consensus score for each structure based on its average pairwise structural similarity to other models. Then, similar structures are grouped into clusters using multidimensional scaling and clustering algorithms. In each cluster, the one with the highest consensus score is selected as a candidate model. For the QA problem, MDS-QA combines single-model scoring functions with consensus to determine more accurate assessment score for every model in a given pool. Using extensive benchmark sets of a large collection of predicted models, we compare the two algorithms with existing state-of-the-art quality assessment methods and show significant improvement. PMID:24808625
An effective trust-based recommendation method using a novel graph clustering algorithm
NASA Astrophysics Data System (ADS)
Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin
2015-10-01
Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.
The Extended Spatial Distribution of Globular Clusters in the Core of the Fornax Cluster
NASA Astrophysics Data System (ADS)
D'Abrusco, R.; Cantiello, M.; Paolillo, M.; Pota, V.; Napolitano, N. R.; Limatola, L.; Spavone, M.; Grado, A.; Iodice, E.; Capaccioli, M.; Peletier, R.; Longo, G.; Hilker, M.; Mieske, S.; Grebel, E. K.; Lisker, T.; Wittmann, C.; van de Ven, G.; Schipani, P.; Fabbiano, G.
2016-03-01
We report the discovery of a complex extended density enhancement in the Globular Clusters (GCs) in the central ~0.5 deg^2 (~0.06 Mpc^2) of the Fornax cluster, corresponding to ~50% of the area within 1 core radius. This overdensity connects the GC system of NGC 1399 to most of those of neighboring galaxies within ~0.6° (~210 kpc) along the W-E direction. The asymmetric density structure suggests that the galaxies in the core of the Fornax cluster experienced a lively history of interactions that have left a clear imprint on the spatial distribution of GCs. The extended central dominant structure is more prominent in the distribution of blue GCs, while red GCs show density enhancements that are more centrally concentrated on the host galaxies. We propose that the relatively small-scale density structures in the red GCs are caused by galaxy-galaxy interactions, while the extensive spatial distribution of blue GCs is due to stripping of GCs from the halos of core massive galaxies by the Fornax gravitational potential. Our investigations are based on density maps of candidate GCs extracted from the multi-band VLT Survey Telescope (VST) survey of Fornax (FDS), identified in a three-dimensional color space and further selected based on their g-band magnitude and morphology.
An X-Ray Spectral Classification Algorithm with Application to Young Stellar Clusters
NASA Astrophysics Data System (ADS)
Hojnacki, S. M.; Kastner, J. H.; Micela, G.; Feigelson, E. D.; LaLonde, S. M.
2007-04-01
A large volume of low signal-to-noise, multidimensional data is available from the CCD imaging spectrometers aboard the Chandra X-Ray Observatory and the X-Ray Multimirror Mission (XMM-Newton). To make progress analyzing this data, it is essential to develop methods to sort, classify, and characterize the vast library of X-ray spectra in a nonparametric fashion (complementary to current parametric model fits). We have developed a spectral classification algorithm that handles large volumes of data and operates independently of the requirement of spectral model fits. We use proven multivariate statistical techniques including principal component analysis and an ensemble classifier consisting of agglomerative hierarchical clustering and K-means clustering applied for the first time for spectral classification. The algorithm positions the sources in a multidimensional spectral sequence and then groups the ordered sources into clusters based on their spectra. These clusters appear more distinct for sources with harder observed spectra. The apparent diversity of source spectra is reduced to a three-dimensional locus in principal component space, with spectral outliers falling outside this locus. The algorithm was applied to a sample of 444 strong sources selected from the 1616 X-ray emitting sources detected in deep Chandra imaging spectroscopy of the Orion Nebula Cluster. Classes form sequences in NH, AV, and accretion activity indicators, demonstrating that the algorithm efficiently sorts the X-ray sources into a physically meaningful sequence. The algorithm also isolates important classes of very deeply embedded, active young stellar objects, and yields trends between X-ray spectral parameters and stellar parameters for the lowest mass, pre-main-sequence stars.
Belief-propagation algorithm and the Ising model on networks with arbitrary distributions of motifs
NASA Astrophysics Data System (ADS)
Yoon, S.; Goltsev, A. V.; Dorogovtsev, S. N.; Mendes, J. F. F.
2011-10-01
We generalize the belief-propagation algorithm to sparse random networks with arbitrary distributions of motifs (triangles, loops, etc.). Each vertex in these networks belongs to a given set of motifs (generalization of the configuration model). These networks can be treated as sparse uncorrelated hypergraphs in which hyperedges represent motifs. Here a hypergraph is a generalization of a graph, where a hyperedge can connect any number of vertices. These uncorrelated hypergraphs are treelike (hypertrees), which crucially simplifies the problem and allows us to apply the belief-propagation algorithm to these loopy networks with arbitrary motifs. As natural examples, we consider motifs in the form of finite loops and cliques. We apply the belief-propagation algorithm to the ferromagnetic Ising model with pairwise interactions on the resulting random networks and obtain an exact solution of this model. We find an exact critical temperature of the ferromagnetic phase transition and demonstrate that with increasing the clustering coefficient and the loop size, the critical temperature increases compared to ordinary treelike complex networks. However, weak clustering does not change the critical behavior qualitatively. Our solution also gives the birth point of the giant connected component in these loopy networks.
A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database
Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; Shephard, Mark S.
2013-01-01
Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm of creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n number of ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
BoCluSt: Bootstrap Clustering Stability Algorithm for Community Detection
Garcia, Carlos
2016-01-01
The identification of modules or communities in sets of related variables is a key step in the analysis and modeling of biological systems. Procedures for this identification are usually designed to allow fast analyses of very large datasets and may produce suboptimal results when these sets are of a small to moderate size. This article introduces BoCluSt, a new, somewhat more computationally intensive, community detection procedure that is based on combining a clustering algorithm with a measure of stability under bootstrap resampling. Both computer simulation and analyses of experimental data showed that BoCluSt can outperform current procedures in the identification of multiple modules in data sets with a moderate number of variables. In addition, the procedure provides users with a null distribution of results to evaluate the support for the existence of community structure in the data. BoCluSt takes individual measures for a set of variables as input, and may be a valuable and robust exploratory tool of network analysis, as it provides 1) an estimation of the best partition of variables into modules, 2) a measure of the support for the existence of modular structures, and 3) an overall description of the whole structure, which may reveal hierarchical modular situations, in which modules are composed of smaller sub-modules. PMID:27258041
The RedGOLD cluster detection algorithm and its cluster candidate catalogue for the CFHT-LS W1
NASA Astrophysics Data System (ADS)
Licitra, Rossella; Mei, Simona; Raichoor, Anand; Erben, Thomas; Hildebrandt, Hendrik
2016-01-01
We present RedGOLD (Red-sequence Galaxy Overdensity cLuster Detector), a new optical/NIR galaxy cluster detection algorithm, and apply it to the CFHT-LS W1 field. RedGOLD searches for red-sequence galaxy overdensities while minimizing contamination from dusty star-forming galaxies. It imposes a Navarro-Frenk-White profile and calculates cluster detection significance and richness. We optimize these latter two parameters using both simulations and X-ray-detected cluster catalogues, and obtain a catalogue ~80 per cent pure up to z ~ 1, and ~100 per cent (~70 per cent) complete at z ≤ 0.6 (z ≲ 1) for galaxy clusters with M ≳ 10^14 M⊙ at the CFHT-LS Wide depth. In the CFHT-LS W1, we detect 11 cluster candidates per deg^2 out to z ~ 1.1. When we optimize both completeness and purity, RedGOLD obtains a cluster catalogue with higher completeness and purity than other public catalogues, obtained using CFHT-LS W1 observations, for M ≳ 10^14 M⊙. We use X-ray-detected cluster samples to extend the study of the X-ray temperature-optical richness relation to a lower mass threshold, and find a mass scatter at fixed richness of σ_lnM|λ = 0.39 ± 0.07 and σ_lnM|λ = 0.30 ± 0.13 for the Gozaliasl et al. and Mehrtens et al. samples. When considering similar mass ranges as previous work, we recover a smaller scatter in mass at fixed richness. We recover 93 per cent of the redMaPPer detections, and find that its richness estimates are on average ~40-50 per cent larger than ours at z > 0.3. RedGOLD recovers X-ray cluster spectroscopic redshifts at better than 5 per cent up to z ~ 1, and the centres within a few tens of arcseconds.
NASA Astrophysics Data System (ADS)
Rajalakshmi, N.; Padma Subramanian, D.; Thamizhavel, K.
2015-03-01
The extent of real power loss and voltage deviation associated with overloaded feeders in radial distribution system can be reduced by reconfiguration. Reconfiguration is normally achieved by changing the open/closed state of tie/sectionalizing switches. Finding optimal switch combination is a complicated problem as there are many switching combinations possible in a distribution system. Hence optimization techniques are finding greater importance in reducing the complexity of reconfiguration problem. This paper presents the application of firefly algorithm (FA) for optimal reconfiguration of radial distribution system with distributed generators (DG). The algorithm is tested on IEEE 33 bus system installed with DGs and the results are compared with binary genetic algorithm. It is found that binary FA is more effective than binary genetic algorithm in achieving real power loss reduction and improving voltage profile and hence enhancing the performance of radial distribution system. Results are found to be optimum when DGs are added to the test system, which proved the impact of DGs on distribution system.
Distributed Minimal Residual (DMR) method for acceleration of iterative algorithms
NASA Technical Reports Server (NTRS)
Lee, Seungsoo; Dulikravich, George S.
1991-01-01
A new method for enhancing the convergence rate of iterative algorithms for the numerical integration of systems of partial differential equations was developed. It is termed the Distributed Minimal Residual (DMR) method and it is based on general Krylov subspace methods. The DMR method differs from the Krylov subspace methods by the fact that the iterative acceleration factors are different from equation to equation in the system. At the same time, the DMR method can be viewed as an incomplete Newton iteration method. The DMR method was applied to Euler equations of gas dynamics and incompressible Navier-Stokes equations. All numerical test cases were obtained using either explicit four stage Runge-Kutta or Euler implicit time integration. The formulation for the DMR method is general in nature and can be applied to explicit and implicit iterative algorithms for arbitrary systems of partial differential equations.
FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks.
Yang, Jing; Chen, Limin; Zhang, Jianpei
2015-01-01
It is important to cluster heterogeneous information networks. A fast clustering algorithm based on an approximate commute time embedding for heterogeneous information networks with a star network schema is proposed in this paper by utilizing the sparsity of heterogeneous information networks. First, a heterogeneous information network is transformed into multiple compatible bipartite graphs from the compatible point of view. Second, the approximate commute time embedding of each bipartite graph is computed using random mapping and a linear time solver. All of the indicator subsets in each embedding simultaneously determine the target dataset. Finally, a general model is formulated by these indicator subsets, and a fast algorithm is derived by simultaneously clustering all of the indicator subsets using the sum of the weighted distances for all indicators for an identical target object. The proposed fast algorithm, FctClus, is shown to be efficient and generalizable and exhibits high clustering accuracy and fast computation speed based on a theoretic analysis and experimental verification. PMID:26090857
A priori data-driven multi-clustered reservoir generation algorithm for echo state network.
Li, Xiumin; Zhong, Ling; Xue, Fangzheng; Zhang, Anguo
2015-01-01
Echo state networks (ESNs) with multi-clustered reservoir topology perform better in reservoir computing and robustness than those with random reservoir topology. However, these ESNs have a complex reservoir topology, which leads to difficulties in reservoir generation. This study focuses on the reservoir generation problem when ESN is used in environments with sufficient priori data available. Accordingly, a priori data-driven multi-cluster reservoir generation algorithm is proposed. The priori data in the proposed algorithm are used to evaluate reservoirs by calculating the precision and standard deviation of ESNs. The reservoirs are produced using the clustering method; only the reservoir with a better evaluation performance takes the place of a previous one. The final reservoir is obtained when its evaluation score reaches the preset requirement. The prediction experiment results obtained using the Mackey-Glass chaotic time series show that the proposed reservoir generation algorithm provides ESNs with extra prediction precision and increases the structure complexity of the network. Further experiments also reveal the appropriate values of the number of clusters and time window size to obtain optimal performance. The information entropy of the reservoir reaches the maximum when ESN gains the greatest precision. PMID:25875296
Raindrop Size Distribution Observation for GPM/DPR algorithm development
NASA Astrophysics Data System (ADS)
Nakagawa, Katsuhiro; Hanado, Hiroshi; Nishikawa, Masanori; Nakamura, Kenji; Kaneko, Yuki; Kawamura, Seiji; Iwai, Hironori; Minda, Haruya; Oki, Riko
2013-04-01
In order to evaluate and improve the accuracy of rainfall intensity from space-borne radars (TRMM/PR and GPM/DPR), it is important to estimate the rain attenuation, namely the k-Z relationship (k is the specific attenuation, Z is the radar reflectivity), correctly. The National Institute of Information and Communications Technology (NICT) developed a mobile precipitation observation system for the dual Ka-band radar field campaign for GPM/DPR algorithm development. The precipitation measurement instruments are installed on the roof of a container. The instruments installed for raindrop size distribution (DSD) measurements are a 2-dimensional Video disdrometer (2DVD), a Joss-type disdrometer, and a Laser Optical disdrometer (Parsivel). The 2DVD and Parsivel can measure not only the raindrop size distribution but also ice and snow size distributions. Observations using the mobile precipitation observation system were performed on Okinawa Island, in Tsukuba, over the slope of Mt. Fuji, in Nagaoka, and in Sapporo, Japan. Using the DSD data observed in these different regions, the characteristics of the DSD itself are analyzed and the k-Z relationship is estimated for evaluation and improvement of the TRMM/PR and GPM/DPR algorithms.
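One common way to express a k-Z relationship is a power law, k = a·Z^b, fitted by least squares in log-log space. The abstract does not state the functional form it uses, so the power law here is an assumption, and the data below are synthetic values generated only to exercise the fit; `fit_power_law` is a hypothetical helper name.

```python
import math


def fit_power_law(z, k):
    """Least-squares fit of log k = log a + b * log Z; returns (a, b)."""
    x = [math.log(v) for v in z]
    y = [math.log(v) for v in k]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


# Synthetic, noise-free data from an arbitrary power law (not measured values).
z = [10.0, 100.0, 1000.0, 10000.0]
k = [3e-4 * v ** 0.8 for v in z]
a, b = fit_power_law(z, k)
print(a, b)
```

With real disdrometer data, k and Z would each be computed from the measured DSD before fitting, and the fit would be repeated per site to compare regions.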
K-Boost: a scalable algorithm for high-quality clustering of microarray gene expression data.
Geraci, Filippo; Leoncini, Mauro; Montangero, Manuela; Pellegrini, Marco; Renda, M Elena
2009-06-01
Microarray technology for profiling gene expression levels is a popular tool in modern biological research. Applications range from tissue classification to the detection of metabolic networks, from drug discovery to time-critical personalized medicine. Given the increase in size and complexity of the data sets produced, their analysis is becoming problematic in terms of time/quality trade-offs. Clustering genes with similar expression profiles is a key initial step for subsequent manipulations, and the increasing volumes of data to be analyzed require methods that are at the same time efficient (completing an analysis in minutes rather than hours) and effective (identifying significant clusters with high biological correlations). In this paper, we propose K-Boost, a clustering algorithm based on a combination of the furthest-point-first (FPF) heuristic for solving the metric k-center problem, a stability-based method for determining the number of clusters, and a k-means-like cluster refinement. K-Boost runs in O(|N| × k) time, where N is the input matrix and k is the number of proposed clusters. Experiments show that this low complexity is usually coupled with a very good quality of the computed clusterings, which we measure using both internal and external criteria. Supporting data can be found as online Supplementary Material at www.liebertonline.com. PMID:19522668
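The furthest-point-first heuristic named in this abstract is the classical greedy approach to the metric k-center problem: repeatedly pick the point furthest from the centers chosen so far. The sketch below shows only that generic heuristic with Euclidean distance; it is not K-Boost itself (no stability-based choice of k, no k-means-like refinement), and starting from index 0 is an arbitrary assumption.

```python
import math


def furthest_point_first(points, k):
    """Greedy k-center heuristic; returns the indices of the k chosen centers."""
    centers = [0]  # assumption: start from the first point
    # dist[i] = distance from point i to its nearest chosen center so far.
    dist = [math.dist(p, points[0]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return centers


points = [(0, 0), (1, 0), (10, 0), (10, 1), (5, 8)]
print(furthest_point_first(points, 3))
```

Each round scans the points once, so choosing k centers among n points costs on the order of n × k distance evaluations.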
NASA Astrophysics Data System (ADS)
Qian, Yibin; Ren, Zhongzhou; Ni, Dongdong
2016-08-01
We further investigate the cluster emission from heavy nuclei beyond the lead region in the framework of the preformed cluster model. The refined cluster-core potential is constructed by the double-folding integral of the density distributions of the daughter nucleus and the emitted cluster, where the radius or the diffuseness parameter in the Fermi density distribution formula is determined according to the available experimental data on the charge radii and the neutron skin thickness. The Schrödinger equation of the cluster-daughter relative motion is then solved with the outgoing Coulomb wave-function boundary conditions to obtain the decay width. It is found that the present decay width of cluster emitters is clearly enhanced as compared to that in the previous case, which involved the fixed parametrization for the density distributions of daughter nuclei and clusters. Throughout the procedure, the nuclear deformation of clusters is also introduced into the calculations, and the degree of its influence on the final decay half-life is checked to some extent. Moreover, the effect of the bubble density distribution of clusters on the final decay width is carefully discussed by using the central depressed distribution.
Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs
NASA Astrophysics Data System (ADS)
Choi, Woo-Yong; Chatterjee, Mainak
2015-03-01
With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that end, we propose a real-time cluster-based multipolling sequencing algorithm that eliminates more than 90% of the polling overhead, particularly when the dynamic search algorithm fails to derive the multipolling sequence in real time.
3D Hail Size Distribution Interpolation/Extrapolation Algorithm
NASA Technical Reports Server (NTRS)
Lane, John
2013-01-01
Radar data can usually detect hail; however, it is difficult for present-day radar to accurately discriminate between hail and rain. Local ground-based hail sensors are much better at detecting hail against a rain background, and when incorporated with radar data, provide a much better local picture of a severe rain or hail event. The previous disdrometer interpolation/extrapolation algorithm described a method to interpolate horizontally between multiple ground sensors (a minimum of three) and extrapolate vertically. This work is a modification to that approach that generates a purely extrapolated 3D spatial distribution when using a single sensor.
Metallicity distributions of globular cluster systems in galaxies
NASA Astrophysics Data System (ADS)
Eerik, H.; Tenjes, P.
We collected a sample of 100 galaxies for which different observers have determined colour indices of globular cluster candidates. The sample includes representatives of galaxies of various morphological types and different luminosities. Colour indices (in most cases (V-I), but also (B-I) and (C-T_1)) were transformed into metallicities [Fe/H] according to a relation by Kissler-Patig (1998). These data were analysed with the KMM software in order to estimate how well each distribution is described by a unimodal or a bimodal Gaussian distribution. We found that 45 of 100 systems have bimodal metallicity distributions. The mean metallicity of the metal-poor component for these galaxies is ⟨[Fe/H]⟩ = -1.40 ± 0.02, and that of the metal-rich component is ⟨[Fe/H]⟩ = -0.69 ± 0.03. Dispersions of the distributions are 0.15 and 0.18, respectively. The distribution of unimodal metallicities is rather wide. These data will be analysed in a subsequent paper in order to find correlations with parameters of galaxies and galactic environment.
Whistler Waves Driven by Anisotropic Strahl Velocity Distributions: Cluster Observations
NASA Technical Reports Server (NTRS)
Vinas, A.F.; Gurgiolo, C.; Nieves-Chinchilla, T.; Gary, S. P.; Goldstein, M. L.
2010-01-01
Observed properties of the strahl using high resolution 3D electron velocity distribution data obtained from the Cluster/PEACE experiment are used to investigate its linear stability. An automated method to isolate the strahl is used to allow its moments to be computed independent of the solar wind core+halo. Results show that the strahl can have a high temperature anisotropy (T(perpendicular)/T(parallel) approximately > 2). This anisotropy is shown to be an important free energy source for the excitation of high frequency whistler waves. The analysis suggests that the resultant whistler waves are strong enough to regulate the electron velocity distributions in the solar wind through pitch-angle scattering.
A genetic algorithmic approach to antenna null-steering using a cluster computer.
NASA Astrophysics Data System (ADS)
Recine, Greg; Cui, Hong-Liang
2001-06-01
We apply a genetic algorithm (GA) to the problem of electronically steering the maxima and nulls of an antenna array to desired positions (null toward enemy listener/jammer, max toward friendly listener/transmitter). The antenna pattern itself is computed using NEC2, which is called by the main GA program. Since a GA naturally lends itself to parallelization, this simulation was run on our new twin 64-node cluster computers (Gemini). Design issues and uses of the Gemini cluster in our group are also discussed.
KD-tree based clustering algorithm for fast face recognition on large-scale data
NASA Astrophysics Data System (ADS)
Wang, Yuanyuan; Lin, Yaping; Yang, Junfeng
2015-07-01
This paper proposes an acceleration method for large-scale face recognition systems. When dealing with a large-scale database, face recognition is time-consuming. In order to tackle this problem, we employ the k-means clustering algorithm to classify face data. Specifically, the data in each cluster are stored in the form of a kd-tree, and face feature matching is conducted with kd-tree based nearest-neighbor search. Experiments on CAS-PEAL and a self-collected database show the effectiveness of our proposed method.
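The two-stage lookup described in the abstract above (pick a cluster, then search a kd-tree built over that cluster) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the clusters are assumed to have been produced already (for example by k-means), feature vectors are short tuples, and the names `build_kdtree`, `nearest`, `centroid` and `search_clustered` are hypothetical.

```python
import math

def build_kdtree(points, depth=0):
    """Build a kd-tree as nested dicts, splitting on axes in rotation."""
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kdtree(points[:mid], depth + 1),
        "right": build_kdtree(points[mid + 1:], depth + 1),
    }

def nearest(node, query, best=None):
    """Return (distance, point) of the nearest stored point to query."""
    if node is None:
        return best
    d = math.dist(node["point"], query)
    if best is None or d < best[0]:
        best = (d, node["point"])
    diff = query[node["axis"]] - node["point"][node["axis"]]
    near, far = (node["left"], node["right"]) if diff < 0 else (node["right"], node["left"])
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    return tuple(sum(x) / len(points) for x in zip(*points))

def search_clustered(centroids, trees, query):
    """Choose the cluster with the nearest centroid, then search its tree."""
    c = min(range(len(centroids)), key=lambda i: math.dist(centroids[i], query))
    return nearest(trees[c], query)

# Assumption: these two clusters stand in for the output of k-means
clusters = [[(0, 0), (1, 1), (0, 2)], [(10, 10), (9, 11), (11, 9)]]
centroids = [centroid(c) for c in clusters]
trees = [build_kdtree(c) for c in clusters]
hit = search_clustered(centroids, trees, (9.2, 10.6))
```

Restricting the search to one cluster is what saves time, but it also makes the result approximate: a query near a cluster boundary may have its true nearest neighbor in an adjacent cluster.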
K-Means Re-Clustering-Algorithmic Options with Quantifiable Performance Comparisons
Meyer, A W; Paglieroni, D; Asteneh, C
2002-12-17
This paper presents various architectural options for implementing a K-Means Re-Clustering algorithm suitable for unsupervised segmentation of hyperspectral images. Performance metrics are developed based upon quantitative comparisons of convergence rates and segmentation quality. A methodology for making these comparisons is developed and used to establish K values that produce the best segmentations with minimal processing requirements. Convergence rates depend on the initial choice of cluster centers. Consequently, this same methodology may be used to evaluate the effectiveness of different initialization techniques.
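The remark above that convergence rates depend on the initial cluster centers can be illustrated with a small sketch. It uses plain Lloyd-style k-means on toy one-dimensional data rather than the re-clustering variant or hyperspectral pixels of the paper; the function name `kmeans`, the data, and the two initializations are assumptions chosen for illustration.

```python
import math

def kmeans(points, centers, max_iter=100):
    """Plain k-means; returns (final centers, number of passes until the
    centers stop changing, counting the final confirming pass)."""
    centers = [tuple(c) for c in centers]
    for iteration in range(1, max_iter + 1):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[j].append(p)
        # An empty cluster keeps its previous center
        new_centers = [
            tuple(sum(x) / len(g) for x in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter

data = [(0,), (1,), (9,), (10,)]
spread_init = kmeans(data, [(0,), (10,)])   # one seed in each true group
crowded_init = kmeans(data, [(0,), (1,)])   # both seeds in the same group
```

On this toy data both runs reach the same centers, but the crowded initialization needs one more pass, which is the kind of difference a convergence-rate comparison would quantify.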
Fast randomized Hough transformation track initiation algorithm based on multi-scale clustering
NASA Astrophysics Data System (ADS)
Wan, Minjie; Gu, Guohua; Chen, Qian; Qian, Weixian; Wang, Pengcheng
2015-10-01
A fast randomized Hough transformation track initiation algorithm based on multi-scale clustering is proposed to overcome problems in traditional infrared search and track (IRST) systems, in which a two-dimensional track association algorithm based on bearing-only information cannot provide movement information of the initial target or select the correlation threshold automatically. All targets are presumed to be in uniform rectilinear motion throughout this new algorithm. Concepts of space random sampling, a parameter space dynamic linking table, and convergent mapping of image to parameter space are developed on the basis of the fast randomized Hough transformation. Because peak detection built on a threshold value method suffers from peak value clustering, accuracy can only be ensured when the parameter space has an obvious peak value. A multi-scale idea is therefore added to the above-mentioned algorithm. First, a primary association is conducted to select several alternative tracks using a low threshold. Then, the alternative tracks are processed by multi-scale clustering methods, through which the accurate numbers and parameters of tracks are determined automatically by transforming scale parameters. The first three frames are processed by this algorithm in order to get the first three targets of the track, and then two slightly different gate radii are worked out, the mean value of which is used as the global threshold value of correlation. Moreover, a new model for curvilinear equation correction is applied to the above-mentioned track initiation algorithm to solve the problem of shape distortion when a three-dimensional space curve is mapped to a two-dimensional bearing-only space. Using sideways-flying, launch and landing as examples to build models and simulate, the application of the proposed approach in simulation proves its effectiveness, accuracy, and adaptivity.
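The random-sampling idea underlying the abstract above can be sketched with the basic randomized Hough transform for straight lines in a plane. This is not the paper's track initiation algorithm (it has no multi-scale clustering, dynamic linking table, or curvilinear correction); the function name `randomized_hough_lines`, the bin sizes, the sample count, and the toy points are assumptions for illustration.

```python
import math
import random

def randomized_hough_lines(points, n_samples=200, theta_step_deg=1.0,
                           rho_step=0.5, seed=0):
    """Sample random point pairs, map each pair to one (theta, rho) cell of
    the line x*cos(theta) + y*sin(theta) = rho, and return the most voted
    cell as ((theta_bin, rho_bin), votes)."""
    rng = random.Random(seed)
    votes = {}
    for _ in range(n_samples):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        dx, dy = x2 - x1, y2 - y1
        if dx == 0 and dy == 0:
            continue  # coincident points define no line
        # Normal direction of the line, folded into [0, pi); rho is signed.
        # Simplification: lines with theta near 0 or pi may split across bins.
        theta = math.atan2(dx, -dy) % math.pi
        rho = x1 * math.cos(theta) + y1 * math.sin(theta)
        key = (round(math.degrees(theta) / theta_step_deg), round(rho / rho_step))
        votes[key] = votes.get(key, 0) + 1
    return max(votes.items(), key=lambda kv: kv[1])

# Ten points on the line y = x + 1 plus three off-line points
pts = [(i, i + 1) for i in range(10)] + [(3, 9), (7, 0), (1, 6)]
best_bin, best_votes = randomized_hough_lines(pts)
```

Each sampled pair casts a single vote, unlike the standard Hough transform where every point votes for a whole curve of cells; this many-to-one mapping is what keeps the accumulator sparse. The returned bin indices are multiples of `theta_step_deg` degrees and `rho_step` distance units.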
Schatz, Martin D.; Kolda, Tamara G.; van de Geijn, Robert
2015-09-01
Large-scale datasets in computational chemistry typically require distributed-memory parallel methods to perform a special operation known as tensor contraction. Tensors are multidimensional arrays, and a tensor contraction is akin to matrix multiplication with special types of permutations. Creating an efficient algorithm and optimized implementation in this domain is complex, tedious, and error-prone. To address this, we develop a notation to express data distributions so that we can use automated methods to find optimized implementations for tensor contractions. We consider the spin-adapted coupled cluster singles and doubles method from computational chemistry and use our methodology to produce an efficient implementation. Experiments performed on the IBM Blue Gene/Q and Cray XC30 demonstrate both improved performance and reduced memory consumption.
Combustion distribution control using the extremum seeking algorithm
NASA Astrophysics Data System (ADS)
Marjanovic, A.; Krstic, M.; Djurovic, Z.; Kvascev, G.; Papic, V.
2014-12-01
Quality regulation of the combustion process inside the furnace is the basis of high demands for increasing robustness, safety and efficiency of thermal power plants. The paper considers the possibility of spatial temperature distribution control inside the boiler, based on the correction of distribution of coal over the mills. Such control system ensures the maintenance of the flame focus away from the walls of the boiler, and thus preserves the equipment and reduces the possibility of ash slugging. At the same time, uniform heat dissipation over mills enhances the energy efficiency of the boiler, while reducing the pollution of the system. A constrained multivariable extremum seeking algorithm is proposed as a tool for combustion process optimization with the main objective of centralizing the flame in the furnace. Simulations are conducted on a model corresponding to the 350MW boiler of the Nikola Tesla Power Plant, in Obrenovac, Serbia.
An efficient distributed navigation algorithm for mobile robot
NASA Astrophysics Data System (ADS)
Yang, Xiuping; Liu, Songyan; Liu, Zhen
2008-10-01
An efficient distributed algorithm for mobile nodes of a wireless sensor network (WSN) is proposed. A smart mobile robot is designed as the special node of the WSN, and the nodes of the WSN are deployed in the environment as signposts for the robot to follow. We use the RSSI value between the robot and the static nodes as the input of the navigation control system. The mobile robot state and navigation space are denoted by the RSSI potential field. Navigation directions are computed using a fuzzy logic method. To reduce the communication expense, each node within one-hop communication range of the robot acts as a distributed navigation unit. The fuzzy logic control centre then collects the control outputs from every beacon node and calculates the final outputs for the mobile robot based on data fusion. The experimental results confirm that the WSN-based navigation system successfully achieved its assigned tasks.
Robustness of ‘cut and splice’ genetic algorithms in the structural optimization of atomic clusters
NASA Astrophysics Data System (ADS)
Froltsov, Vladimir A.; Reuter, Karsten
2009-05-01
We return to the geometry optimization problem of Lennard-Jones clusters to analyze the performance dependence of 'cut and splice' genetic algorithms (GAs) on the employed population size. We generally find that admixing twinning mutation moves leads to an improved robustness of the algorithm efficiency with respect to this a priori unknown technical parameter. The resulting very stable performance of the corresponding mutation + mating GA implementation over a wide range of population sizes is an important feature when addressing unknown systems with computationally involved first-principles based GA sampling.
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-01-01
This paper discusses the possibility of recognizing and predicting user activities in an IoT (Internet of Things) based smart environment. Activity recognition is usually done in two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, their performance was limited because they focused on only one of the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify such varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the smart environment is trained to recognize and predict user activities inside his/her personal space by utilizing an artificial neural network based on Allen's temporal relations. The experimental results show that our combined method provides higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home. PMID:26007738
Double-layer evolutionary algorithm for distributed optimization of particle detection on the Grid
NASA Astrophysics Data System (ADS)
Padée, Adam; Kurek, Krzysztof; Zaremba, Krzysztof
2013-08-01
Reconstruction of particle tracks from information collected by position-sensitive detectors is an important procedure in HEP experiments. It is usually controlled by a set of numerical parameters which have to be manually optimized. This paper proposes an automatic approach to this task by utilizing an evolutionary algorithm (EA) operating on both real-valued and binary representations. Because of the computational complexity of the task, a special distributed architecture of the algorithm is proposed, designed to be run in a grid environment. It is a two-level hierarchical hybrid utilizing an asynchronous master-slave EA on the level of clusters and an island model EA on the level of the grid. The technical aspects of using production grid infrastructure are covered, including communication protocols on both levels. The paper also deals with the problem of heterogeneity of the resources, presenting efficiency tests on a benchmark function. These tests confirm that even relatively small islands (clusters) can be beneficial to the optimization process when connected to the larger ones. Finally, a real-life usage example is presented, which is an optimization of track reconstruction in the Large Angle Spectrometer of the NA-58 COMPASS experiment held at CERN, using a sample of Monte Carlo simulated data. The overall reconstruction efficiency gain achieved by the proposed method is more than 4%, compared to the manually optimized parameters.
Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms
NASA Astrophysics Data System (ADS)
Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel
2016-04-01
Advances in graphics processing units' technology towards encompassing parallel architectures [1], comprised of thousands of cores and multiples of parallel threads, provide the foundation in terms of hardware for the rapid processing of various parallel applications regarding seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are being established and decades of data are being compiled together [2]. Yet, many processes regarding seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, being independent of one another, can be performed in parallel, reducing processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using CUDA C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel for comparison, are: fuzzy k-means clustering with expert knowledge [7] in assigning the overall number of clusters; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, CUDA C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering References [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publishers, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and
Bisnovatyi-Kogan, Gennady S.
2009-09-20
We construct numerical models of spherically symmetric Newtonian stellar clusters with anisotropic distribution functions. These models generalize solutions obtained earlier for isotropic Maxwellian distribution functions with an energy cutoff and take into account distributions with different levels of anisotropy.
Modelling Galaxy Clustering: Halo Occupation Distribution versus Subhalo Matching
NASA Astrophysics Data System (ADS)
Guo, Hong; Zheng, Zheng; Behroozi, Peter S.; Zehavi, Idit; Chuang, Chia-Hsun; Comparat, Johan; Favole, Ginevra; Gottloeber, Stefan; Klypin, Anatoly; Prada, Francisco; Rodríguez-Torres, Sergio A.; Weinberg, David H.; Yepes, Gustavo
2016-04-01
We model the luminosity-dependent projected and redshift-space two-point correlation functions (2PCFs) of the Sloan Digital Sky Survey (SDSS) DR7 Main galaxy sample, using the halo occupation distribution (HOD) model and the subhalo abundance matching (SHAM) model and its extension. All the models are built on the same high-resolution N-body simulations. We find that the HOD model generally provides the best performance in reproducing the clustering measurements in both projected and redshift spaces. The SHAM model with the same halo-galaxy relation for central and satellite galaxies (or distinct haloes and subhaloes), when including scatters, has a best-fitting χ²/dof around 2-3. We therefore extend the SHAM model to the subhalo clustering and abundance matching (SCAM) by allowing the central and satellite galaxies to have different galaxy-halo relations. We infer the corresponding halo/subhalo parameters by jointly fitting the galaxy 2PCFs and abundances and consider subhaloes selected based on three properties, the mass M_acc at the time of accretion, the maximum circular velocity V_acc at the time of accretion, and the peak maximum circular velocity V_peak over the history of the subhaloes. The three subhalo models work well for luminous galaxy samples (with luminosity above L★). For low-luminosity samples, the V_acc model stands out in reproducing the data, with the V_peak model slightly worse, while the M_acc model fails to fit the data. We discuss the implications of the modeling results.
Modelling galaxy clustering: halo occupation distribution versus subhalo matching
NASA Astrophysics Data System (ADS)
Guo, Hong; Zheng, Zheng; Behroozi, Peter S.; Zehavi, Idit; Chuang, Chia-Hsun; Comparat, Johan; Favole, Ginevra; Gottloeber, Stefan; Klypin, Anatoly; Prada, Francisco; Rodríguez-Torres, Sergio A.; Weinberg, David H.; Yepes, Gustavo
2016-07-01
We model the luminosity-dependent projected and redshift-space two-point correlation functions (2PCFs) of the Sloan Digital Sky Survey (SDSS) Data Release 7 Main galaxy sample, using the halo occupation distribution (HOD) model and the subhalo abundance matching (SHAM) model and its extension. All the models are built on the same high-resolution N-body simulations. We find that the HOD model generally provides the best performance in reproducing the clustering measurements in both projected and redshift spaces. The SHAM model with the same halo-galaxy relation for central and satellite galaxies (or distinct haloes and subhaloes), when including scatters, has a best-fitting χ²/dof around 2-3. We therefore extend the SHAM model to the subhalo clustering and abundance matching (SCAM) by allowing the central and satellite galaxies to have different galaxy-halo relations. We infer the corresponding halo/subhalo parameters by jointly fitting the galaxy 2PCFs and abundances and consider subhaloes selected based on three properties, the mass M_acc at the time of accretion, the maximum circular velocity V_acc at the time of accretion, and the peak maximum circular velocity V_peak over the history of the subhaloes. The three subhalo models work well for luminous galaxy samples (with luminosity above L*). For low-luminosity samples, the V_acc model stands out in reproducing the data, with the V_peak model slightly worse, while the M_acc model fails to fit the data. We discuss the implications of the modelling results.
A Fast Cluster Motif Finding Algorithm for ChIP-Seq Data Sets.
Zhang, Yipu; Wang, Ping
2015-01-01
The new high-throughput technique ChIP-seq, which couples chromatin immunoprecipitation experiments with high-throughput sequencing technologies, has extended the identification of binding locations of a transcription factor to genome-wide regions. However, most existing motif discovery algorithms are time-consuming and limited in their ability to identify binding motifs in ChIP-seq data, which are normally characterized by their large scale. In order to improve the efficiency, we propose a fast cluster motif finding algorithm, named FCmotif, to identify (l, d) motifs in large scale ChIP-seq data sets. It is inspired by the emerging substrings mining strategy: it finds the enriched substrings and then searches the neighborhood instances to construct PWMs and cluster motifs of different lengths. FCmotif does not follow the OOPS model constraint and can find long motifs. The effectiveness of the proposed algorithm has been demonstrated by experiments on ChIP-seq data sets from mouse ES cells. The whole detection of the real binding motifs and processing of the full size data of several megabytes finished in a few minutes. The experimental results show that FCmotif has advantages in dealing with (l, d) motif finding in ChIP-seq data; it also demonstrates better performance than other currently widely-used algorithms such as MEME, Weeder, ChIPMunk, and DREME. PMID:26236718
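The (l, d) motif notion used in the abstract above (a pattern of length l whose instances differ from it in at most d positions) can be made concrete with a brute-force sketch. This is not FCmotif and makes no attempt at its speed; the function names `hamming` and `motif_support`, and the toy sequences with a planted motif, are assumptions for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def motif_support(seqs, l, d):
    """For every l-mer occurring in the input, count how many sequences
    contain at least one window within Hamming distance d of it."""
    candidates = {s[i:i + l] for s in seqs for i in range(len(s) - l + 1)}
    support = {}
    for cand in candidates:
        support[cand] = sum(
            any(hamming(cand, s[i:i + l]) <= d for i in range(len(s) - l + 1))
            for s in seqs
        )
    return support

# Toy input: ACGTAC planted in each sequence with at most one mismatch
seqs = ["TTACGTACGG", "GGACGAACTT", "CCACGTACAA", "ATACTTACGC"]
support = motif_support(seqs, l=6, d=1)
```

The sketch compares every candidate against every window, so its cost grows quadratically with the total input length; avoiding that cost on megabyte-scale data is the motivation for substring-mining approaches such as the one the abstract describes.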
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David
2006-05-01
The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in the selection of parallel hyperspectral algorithms for specific applications.
Dynamic scheduling study on engineering machinery of clusters using multi-agent system ant algorithm
NASA Astrophysics Data System (ADS)
Gao, Qiang; Wang, Hongli; Guo, Long; Xiang, Jianping
2005-12-01
In road surface construction, dispatchers' scheduling has relied on experience and been to some degree blind, and static scheduling has restricted the continuity of construction. Serious problems such as labor holdup, waiting for materials, and scheduling delays could occur when the old scheduling technique was used. This paper presents an ant colony algorithm based on a multi-agent system (MAS) that is capable of intelligent modeling and dynamic scheduling. The MAS model first handles the communication and coordination of individual agents in the engineering machinery cluster; we then apply the ant colony algorithm to solve dynamic scheduling in the plant. The ant colony algorithm can optimize the matching of agents and keep the system in dynamic balance. The effectiveness of the proposed method is demonstrated with MATLAB simulations.
Karimi, Abbas; Afsharfarnia, Abbas; Zarafshan, Faraneh; Al-Haddad, S. A. R.
2014-01-01
The stability of clusters is a serious issue in mobile ad hoc networks. Low stability of clusters may lead to rapid failure of clusters, high energy consumption for reclustering, and a decrease in the overall network stability. In order to improve the stability of clusters, weight-based clustering algorithms are utilized. However, these algorithms use only limited features of the nodes. Thus, they reduce the accuracy of the weight in determining a node's competency and lead to incorrect selection of cluster heads. The new weight-based algorithm presented in this paper not only determines a node's weight using its own features, but also considers the direct effect of the features of adjacent nodes. It determines the weight of virtual links between nodes and the effect of these weights on determining a node's final weight. By using this strategy, the highest weight is assigned to the best choices for cluster heads, and the accuracy of node selection increases. The performance of the new algorithm is analyzed using computer simulation. The results show that the produced clusters have a longer lifetime and higher stability. Mathematical simulation shows that this algorithm has high availability in case of failure. PMID:25114965
Nonclercq, Antoine; Foulon, Martine; Verheulpen, Denis; De Cock, Cathy; Buzatu, Marga; Mathys, Pierre; Van Bogaert, Patrick
2012-09-30
Visual quantification of interictal epileptiform activity is time consuming and requires a high level of expert vigilance. This is especially true for overnight recordings of patients suffering from epileptic encephalopathy with continuous spike and waves during slow-wave sleep (CSWS), as they can show tens of thousands of spikes. Automatic spike detection would be attractive for this condition, but available algorithms have methodological limitations related to variation in spike morphology both between patients and within a single recording. We propose a fully automated method of interictal spike detection that adapts to interpatient and intrapatient variation in spike morphology. The algorithm works in five steps. (1) Spikes are detected using parameters suitable for highly sensitive detection. (2) Detected spikes are separated into clusters. (3) The number of clusters is automatically adjusted. (4) Centroids are used as templates for more specific spike detections, therefore adapting to the types of spike morphology. (5) Detected spikes are summed. The algorithm was evaluated on EEG samples from 20 children suffering from epilepsy with CSWS. When compared to the manual scoring of 3 EEG experts (3 records), the algorithm demonstrated similar performance, since sensitivity and selectivity were 0.3% higher and 0.4% lower, respectively. The algorithm showed little difference compared to the manual scoring of another expert for the spike-and-wave index evaluation in 17 additional records (the mean absolute difference was 3.8%). This algorithm is therefore efficient for counting interictal spikes and determining a spike-and-wave index. PMID:22850558
NASA Astrophysics Data System (ADS)
Abdul-Nasir, Aimi Salihah; Mashor, Mohd Yusoff; Halim, Nurul Hazwani Abd; Mohamed, Zeehaida
2015-05-01
Malaria is a life-threatening parasitic infectious disease that accounts for nearly one million deaths each year. Because malaria requires prompt and accurate diagnosis, the current study proposes an unsupervised pixel segmentation based on a clustering algorithm in order to obtain fully segmented red blood cells (RBCs) infected with malaria parasites from thin blood smear images of the P. vivax species. To obtain the segmented infected cell, the malaria images are first enhanced using a modified global contrast stretching technique. Then, an unsupervised segmentation technique based on a clustering algorithm is applied to the intensity component of the malaria image in order to segment the infected cell from its blood cell background. In this study, cascaded moving k-means (MKM) and fuzzy c-means (FCM) clustering algorithms are proposed for malaria slide image segmentation. After that, a median filter is applied to smooth the image and to remove unwanted regions such as small background pixels. Finally, a seeded region growing area extraction algorithm is applied to remove large unwanted regions that still appear in the image because they are too large to be cleaned by the median filter. The effectiveness of the proposed cascaded MKM and FCM clustering algorithms has been analyzed qualitatively and quantitatively by comparing the proposed cascaded clustering algorithm with the MKM and FCM clustering algorithms alone. Overall, the results indicate that segmentation using the proposed cascaded clustering algorithm produced the best segmentation performance, achieving acceptable sensitivity as well as high specificity and accuracy values compared to the segmentation results provided by the MKM and FCM algorithms.
Effects of algorithm for diagnosis of active labour: cluster randomised trial
Hundley, Vanora; Dowding, Dawn; Bland, J Martin; McNamee, Paul; Greer, Ian; Styles, Maggie; Barnett, Carol A; Scotland, Graham; Niven, Catherine
2008-01-01
Objective To compare the effectiveness of an algorithm for diagnosis of active labour in primiparous women with standard care in terms of maternal and neonatal outcomes. Design Cluster randomised trial. Setting Maternity units in Scotland with at least 800 annual births. Participants 4503 women giving birth for the first time, in 14 maternity units. Seven experimental clusters collected data from a baseline sample of 1029 women and a post-implementation sample of 896 women. The seven control clusters had a baseline sample of 1291 women and a post-implementation sample of 1287 women. Intervention Use of an algorithm by midwives to assist their diagnosis of active labour, compared with standard care. Main outcomes Primary outcome: use of oxytocin for augmentation of labour. Secondary outcomes: medical interventions in labour, admission management, and birth outcome. Results No significant difference was found between groups in percentage use of oxytocin for augmentation of labour (experimental minus control, difference=0.3, 95% confidence interval −9.2 to 9.8; P=0.9) or in the use of medical interventions in labour. Women in the algorithm group were more likely to be discharged from the labour suite after their first labour assessment (difference=−19.2, −29.9 to −8.6; P=0.002) and to have more pre-labour admissions (0.29, 0.04 to 0.55; P=0.03). Conclusions Use of an algorithm to assist midwives with the diagnosis of active labour in primiparous women did not result in a reduction in oxytocin use or in medical intervention in spontaneous labour. Significantly more women in the experimental group were discharged home after their first labour ward assessment. Trial registration Current Controlled Trials ISRCTN00522952. PMID:19064606
The Spatial Distribution of the Young Stellar Clusters in the Star-forming Galaxy NGC 628
NASA Astrophysics Data System (ADS)
Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Aloisi, A.; Bright, S. N.; Christian, C.; Cignoni, M.; Dale, D. A.; Dobbs, C.; Elmegreen, D. M.; Fumagalli, M.; Gallagher, J. S., III; Grebel, E. K.; Johnson, K. E.; Lee, J. C.; Messa, M.; Smith, L. J.; Ryon, J. E.; Thilker, D.; Ubeda, L.; Wofford, A.
2015-12-01
We present a study of the spatial distribution of the stellar cluster populations in the star-forming galaxy NGC 628. Using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey), we have identified 1392 potential young (≲ 100 Myr) stellar clusters within the galaxy using a combination of visual inspection and automatic selection. We investigate the clustering of these young stellar clusters and quantify the strength and change of clustering strength with scale using the two-point correlation function. We also investigate how image boundary conditions and dust lanes affect the observed clustering. The distribution of the clusters is well fit by a broken power law with negative exponent α. We recover a weighted mean index of α ∼ −0.8 for all spatial scales below the break at 3.″3 (158 pc at a distance of 9.9 Mpc) and an index of α ∼ −0.18 above 158 pc for the accumulation of all cluster types. The strength of the clustering increases with decreasing age and clusters older than 40 Myr lose their clustered structure very rapidly and tend to be randomly distributed in this galaxy, whereas the mass of the star cluster has little effect on the clustering strength. This is consistent with results from other studies that the morphological hierarchy in stellar clustering resembles the same hierarchy as the turbulent interstellar medium.
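The conversion behind the quoted break scale (3.3 arcseconds corresponding to 158 pc at 9.9 Mpc) can be checked with the small-angle approximation. The helper name below is hypothetical; only the two input numbers come from the abstract.

```python
import math


def angular_to_physical_pc(theta_arcsec, distance_mpc):
    """Small-angle conversion: physical size = angle (rad) * distance."""
    theta_rad = math.radians(theta_arcsec / 3600.0)  # arcsec -> degrees -> radians
    return theta_rad * distance_mpc * 1.0e6           # Mpc -> pc


break_pc = angular_to_physical_pc(3.3, 9.9)
print(round(break_pc))  # 158
```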
Klystron Cluster Scheme for ILC High Power RF Distribution
Nantista, Christopher; Adolphsen, Chris; /SLAC
2009-07-06
We present a concept for powering the main linacs of the International Linear Collider (ILC) by delivering high power RF from the surface via overmoded, low-loss waveguides at widely spaced intervals. The baseline design employs a two-tunnel layout, with klystrons and modulators evenly distributed along a service tunnel running parallel to the accelerator tunnel. This new idea eliminates the need for the service tunnel. It also brings most of the warm heat load to the surface, dramatically reducing the tunnel water cooling and HVAC requirements. In the envisioned configuration, groups of 70 klystrons and modulators are clustered in surface buildings every 2.5 km. Their outputs are combined into two half-meter diameter circular TE₀₁ mode evacuated waveguides. These are directed via special bends through a deep shaft and along the tunnel, one upstream and one downstream. Each feeds approximately 1.25 km of linac with power tapped off in 10 MW portions at 38 m intervals. The power is extracted through a novel coaxial tap-off (CTO), after which the local distribution is as it would be from a klystron. The tap-off design is also employed in reverse for the initial combining.
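As a rough consistency check on the figures quoted in this abstract, the power carried by each waveguide can be estimated from the tap-off spacing. This is only back-of-envelope arithmetic that ignores transmission and combining losses and any overhead; the per-klystron figure at the end is an inference from these numbers, not a value stated in the source.

```latex
\begin{align*}
N_{\text{tap}} &\approx \frac{1250\ \text{m}}{38\ \text{m}} \approx 33
  && \text{tap-offs per waveguide} \\
P_{\text{guide}} &\approx 33 \times 10\ \text{MW} = 330\ \text{MW}
  && \text{power fed into each waveguide} \\
P_{\text{cluster}} &\approx 2 \times 330\ \text{MW} = 660\ \text{MW}
  && \text{upstream plus downstream} \\
\frac{P_{\text{cluster}}}{70} &\approx 9.4\ \text{MW}
  && \text{implied output per klystron, losses neglected}
\end{align*}
```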
Crowded Cluster Cores: An Algorithm for Deblending in Dark Energy Survey Images
NASA Astrophysics Data System (ADS)
Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; Jeltema, Tesla; Miller, Christopher J.; Rykoff, Eli; Song, Jeeseon
2015-11-01
Deep optical images are often crowded with overlapping objects. This is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores in Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. This paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.
Crowded Cluster Cores. Algorithms for Deblending in Dark Energy Survey Images
Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; Jeltema, Tesla; Miller, Christopher J.; Rykoff, Eli; Song, Jeeseon
2015-10-26
Deep optical images are often crowded with overlapping objects. We found that this is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores in Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. Our paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.
Detection and clustering of features in aerial images by neuron network-based algorithm
NASA Astrophysics Data System (ADS)
Vozenilek, Vit
2015-12-01
The paper presents an algorithm for detection and clustering of features in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general feature analysis and its use for clustering and backward projection of clusters onto the aerial image. The basis of the algorithm is a calculation of the total error of the network and a change of the weights of the network to minimize the error. A classic bipolar sigmoid was used for the activation function of the neurons and the basic method of backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, a web application was built (ASP.NET on the Microsoft .NET platform). The main achievements include the finding that man-made objects in aerial images can be successfully identified by detection of shapes and anomalies. It was also found that an appropriate combination of comprehensive features that describe the colors and selected shapes of individual areas can be useful for image analysis.
Clustering of tethered satellite system simulation data by an adaptive neuro-fuzzy algorithm
NASA Technical Reports Server (NTRS)
Mitra, Sunanda; Pemmaraju, Surya
1992-01-01
Recent developments in neuro-fuzzy systems indicate that the concepts of adaptive pattern recognition, when used to identify appropriate control actions corresponding to clusters of patterns representing system states in dynamic nonlinear control systems, may result in innovative designs. A modular, unsupervised neural network architecture, in which fuzzy learning rules have been embedded, is used for on-line identification of similar states. The architecture and control rules involved in Adaptive Fuzzy Leader Clustering (AFLC) allow this system to be incorporated in control systems for identification of system states corresponding to specific control actions. We have used this algorithm to cluster the simulation data of the Tethered Satellite System (TSS) to estimate the range of delta voltages necessary to maintain the desired length rate of the tether. The AFLC algorithm is capable of on-line estimation of the appropriate control voltages from the corresponding length error and length rate error without a priori knowledge of their membership functions or familiarity with the behavior of the Tethered Satellite System.
Saro, A.
2015-10-12
In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg² of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation with the following function $\langle \ln\lambda \mid M_{500} \rangle \propto B_\lambda \ln M_{500} + C_\lambda \ln E(z)$ and use SPT-SZ cluster masses and RM richnesses λ to constrain the parameters. We find $B_\lambda = 1.14^{+0.21}_{-0.18}$ and $C_\lambda = 0.73^{+0.77}_{-0.75}$. The associated scatter in mass at fixed richness is $\sigma_{\ln M|\lambda} = 0.18^{+0.08}_{-0.05}$ at a characteristic richness λ = 70. We demonstrate that our model provides an adequate description of the matched sample, showing that the fraction of SPT-SZ-selected clusters with RM counterparts is consistent with expectations and that the fraction of RM-selected clusters with SPT-SZ counterparts is in mild tension with expectation. We model the optical-SZE cluster positional offset distribution with the sum of two Gaussians, showing that it is consistent with a dominant, centrally peaked population and a subdominant population characterized by larger offsets. We also cross-match the RM catalogue with SPT-SZ candidates below the official catalogue threshold significance ξ = 4.5, using the RM catalogue to provide optical confirmation and redshifts for 15 additional clusters with ξ ∈ [4, 4.5].
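The richness-mass relation in this abstract is linear in ln M500 and ln E(z), with slopes Bλ and Cλ. The sketch below evaluates such a relation using the quoted best-fit slopes. The abstract gives only a proportionality, so the normalisation `ln_a`, the pivot mass `m_pivot`, the pivot redshift `z_pivot` and the flat ΛCDM parameters are hypothetical placeholders, not values from the paper.

```python
import math


def e_of_z(z, omega_m=0.3, omega_l=0.7):
    """Dimensionless Hubble parameter E(z) for flat LCDM (assumed cosmology)."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_l)


def mean_ln_richness(m500, z, ln_a=0.0, b=1.14, c=0.73,
                     m_pivot=3.0e14, z_pivot=0.6):
    """<ln lambda | M500> = ln_a + b ln(M500/M_piv) + c ln(E(z)/E(z_piv)).

    b and c are the slopes quoted in the abstract; ln_a, m_pivot and
    z_pivot are placeholders introduced here for illustration only.
    """
    mass_term = b * math.log(m500 / m_pivot)
    redshift_term = c * math.log(e_of_z(z) / e_of_z(z_pivot))
    return ln_a + mass_term + redshift_term


# Doubling the mass at fixed redshift shifts ln(richness) by b * ln 2.
shift = mean_ln_richness(6.0e14, 0.6) - mean_ln_richness(3.0e14, 0.6)
```

With a slope of 1.14, doubling the mass raises the expected richness by a factor of about 2^1.14, slightly more than a factor of two.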
NASA Astrophysics Data System (ADS)
Saro, A.; Bocquet, S.; Rozo, E.; Benson, B. A.; Mohr, J.; Rykoff, E. S.; Soares-Santos, M.; Bleem, L.; Dodelson, S.; Melchior, P.; Sobreira, F.; Upadhyay, V.; Weller, J.; Abbott, T.; Abdalla, F. B.; Allam, S.; Armstrong, R.; Banerji, M.; Bauer, A. H.; Bayliss, M.; Benoit-Lévy, A.; Bernstein, G. M.; Bertin, E.; Brodwin, M.; Brooks, D.; Buckley-Geer, E.; Burke, D. L.; Carlstrom, J. E.; Capasso, R.; Capozzi, D.; Carnero Rosell, A.; Carrasco Kind, M.; Chiu, I.; Covarrubias, R.; Crawford, T. M.; Crocce, M.; D'Andrea, C. B.; da Costa, L. N.; DePoy, D. L.; Desai, S.; de Haan, T.; Diehl, H. T.; Dietrich, J. P.; Doel, P.; Cunha, C. E.; Eifler, T. F.; Evrard, A. E.; Fausti Neto, A.; Fernandez, E.; Flaugher, B.; Fosalba, P.; Frieman, J.; Gangkofner, C.; Gaztanaga, E.; Gerdes, D.; Gruen, D.; Gruendl, R. A.; Gupta, N.; Hennig, C.; Holzapfel, W. L.; Honscheid, K.; Jain, B.; James, D.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Li, T. S.; Lin, H.; Maia, M. A. G.; March, M.; Marshall, J. L.; Martini, Paul; McDonald, M.; Miller, C. J.; Miquel, R.; Nord, B.; Ogando, R.; Plazas, A. A.; Reichardt, C. L.; Romer, A. K.; Roodman, A.; Sako, M.; Sanchez, E.; Schubnell, M.; Sevilla, I.; Smith, R. C.; Stalder, B.; Stark, A. A.; Strazzullo, V.; Suchyta, E.; Swanson, M. E. C.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D.; Vikram, V.; von der Linden, A.; Walker, A. R.; Wechsler, R. H.; Wester, W.; Zenteno, A.; Ziegler, K. E.
2015-12-01
We cross-match galaxy cluster candidates selected via their Sunyaev-Zel'dovich effect (SZE) signatures in 129.1 deg2 of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation with the following function
Modelling galaxy clustering: halo occupation distribution versus subhalo matching
Guo, Hong; Zheng, Zheng; Behroozi, Peter S.; Zehavi, Idit; Chuang, Chia-Hsun; Comparat, Johan; Favole, Ginevra; Gottloeber, Stefan; Klypin, Anatoly; Prada, Francisco; Rodríguez-Torres, Sergio A.; Weinberg, David H.; Yepes, Gustavo
2016-01-01
We model the luminosity-dependent projected and redshift-space two-point correlation functions (2PCFs) of the Sloan Digital Sky Survey (SDSS) Data Release 7 Main galaxy sample, using the halo occupation distribution (HOD) model and the subhalo abundance matching (SHAM) model and its extension. All the models are built on the same high-resolution N-body simulations. We find that the HOD model generally provides the best performance in reproducing the clustering measurements in both projected and redshift spaces. The SHAM model with the same halo–galaxy relation for central and satellite galaxies (or distinct haloes and subhaloes), when including scatters, has a best-fitting χ2/dof around 2–3. We therefore extend the SHAM model to the subhalo clustering and abundance matching (SCAM) by allowing the central and satellite galaxies to have different galaxy–halo relations. We infer the corresponding halo/subhalo parameters by jointly fitting the galaxy 2PCFs and abundances and consider subhaloes selected based on three properties, the mass Macc at the time of accretion, the maximum circular velocity Vacc at the time of accretion, and the peak maximum circular velocity Vpeak over the history of the subhaloes. The three subhalo models work well for luminous galaxy samples (with luminosity above L*). For low-luminosity samples, the Vacc model stands out in reproducing the data, with the Vpeak model slightly worse, while the Macc model fails to fit the data. We discuss the implications of the modelling results. PMID:27279784
INITIAL SIZE DISTRIBUTION OF THE GALACTIC GLOBULAR CLUSTER SYSTEM
Shin, Jihye; Kim, Sungsoo S.; Yoon, Suk-Jin; Kim, Juhan
2013-01-10
Despite the importance of their size evolution in understanding the dynamical evolution of globular clusters (GCs) of the Milky Way, studies that focus specifically on this issue are rare. Based on the advanced, realistic Fokker-Planck (FP) approach, we theoretically predict the initial size distribution (SD) of the Galactic GCs along with their initial mass function and radial distribution. Over one thousand FP calculations in a wide parameter space have pinpointed the best-fit initial conditions for the SD, mass function, and radial distribution. Our best-fit model shows that the initial SD of the Galactic GCs is of larger dispersion than today's SD, and that the typical projected half-light radius of the initial GCs is ≈4.6 pc, which is 1.8 times larger than that of the present-day GCs (≈2.5 pc). Their large size signifies greater susceptibility to the Galactic tides: the total mass of destroyed GCs reaches 3-5 × 10⁸ M⊙, several times larger than previous estimates. Our result challenges a recent view that the Milky Way GCs were born compact on the sub-pc scale, and rather implies that (1) the initial GCs were generally larger than the typical size of the present-day GCs, (2) the initially large GCs mostly shrank and/or disrupted as a result of the galactic tides, and (3) the initially small GCs expanded by two-body relaxation, and later shrank by the galactic tides.
Modelling Paleoearthquake Slip Distributions using a Genetic Algorithm
NASA Astrophysics Data System (ADS)
Lindsay, Anthony; Simão, Nuno; McCloskey, John; Nalbant, Suleyman; Murphy, Shane; Bhloscaidh, Mairead Nic
2013-04-01
Along the Sunda trench, the annual growth rings of coral microatolls store long-term records of tectonic deformation. Spread over large areas of an active megathrust fault, they offer the possibility of high resolution reconstructions of slip for a number of paleo-earthquakes. These data are complex with spatial and temporal variations in uncertainty. Rather than assuming that any one model will uniquely fit the data, Monte Carlo Slip Estimation (MCSE) modelling produces a catalogue of possible models for each event. From each earthquake's catalogue, a model is selected and a possible history of slip along the fault reconstructed. By generating multiple histories, then finding the average slip during each earthquake, a probabilistic history of slip along the fault can be generated and areas that may have a large slip deficit identified. However, the MCSE technique requires the production of many hundreds of billions of models to yield the few models that fit the observed coral data. In an attempt to accelerate this process, we have designed a Genetic Algorithm (GA). The GA uses evolutionary operators to recombine the information held by a population of possible slip models to produce a set of new models, based on how well they reproduce a set of coral deformation data. Repeated iterations of the algorithm produce populations of improved models, each generation better satisfying the coral data. Preliminary results have shown the GA to be capable of recovering synthetically generated slip distributions, based on the displacements they produce at sets of corals, faster than the MCSE technique. The results of the systematic testing of the GA technique and its performance using both synthetic and observed coral displacement data will be presented.
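The evolutionary loop described in this abstract (a population of candidate slip models, recombined and ranked by how well they reproduce the data) can be sketched generically. This is not the authors' GA: the identity forward model, truncation selection, one-point crossover, Gaussian mutation, elitism and all parameter values are assumptions made for illustration, and the "observed" vector stands in for coral displacement data.

```python
import random


def misfit(model, observed):
    """Sum of squared residuals; a stand-in for a real slip-to-displacement forward model."""
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def evolve(observed, pop_size=40, n_gen=60, mut_sigma=0.2, seed=0):
    """Evolve candidate models toward `observed`; returns (best_model, misfit_history)."""
    rng = random.Random(seed)
    n = len(observed)  # needs n >= 2 for one-point crossover
    pop = [[rng.uniform(0.0, 10.0) for _ in range(n)] for _ in range(pop_size)]
    history = []
    for _ in range(n_gen):
        pop.sort(key=lambda mdl: misfit(mdl, observed))
        history.append(misfit(pop[0], observed))
        parents = pop[: pop_size // 2]   # truncation selection: keep the better half
        children = [pop[0][:]]           # elitism: carry the best model over unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)    # one-point crossover
            child = a[:cut] + b[cut:]
            child = [g + rng.gauss(0.0, mut_sigma) for g in child]  # Gaussian mutation
            children.append(child)
        pop = children
    pop.sort(key=lambda mdl: misfit(mdl, observed))
    history.append(misfit(pop[0], observed))
    return pop[0], history


# Hypothetical synthetic "slip" values to be recovered.
best, history = evolve([1.0, 3.0, 5.0, 3.0, 1.0])
```

Because the best model of each generation is copied into the next, the recorded misfit can never increase from one generation to the following one, which mirrors the abstract's statement that each generation better satisfies the data.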
QoE collaborative evaluation method based on fuzzy clustering heuristic algorithm.
Bao, Ying; Lei, Weimin; Zhang, Wei; Zhan, Yuzhuo
2016-01-01
At present, realizing or improving the quality of experience (QoE) is a major goal for network media transmission services, and QoE evaluation is the basis for adjusting the transmission control mechanism. Therefore, a QoE collaborative evaluation method based on a fuzzy clustering heuristic algorithm is proposed in this paper, which concentrates on service score calculation at the server side. The server side collects network transmission quality of service (QoS) parameters, node location data, and user expectation values from client feedback information. It then manages the historical data in a database through the "big data" process mode, and predicts the user score according to heuristic rules. On this basis, it completes fuzzy clustering analysis, and generates the service QoE score and management message, which are finally fed back to clients. In addition, this paper discusses service evaluation generative rules, heuristic evaluation rules and fuzzy clustering analysis methods, and presents service-based QoE evaluation processes. The simulation experiments have verified the effectiveness of the QoE collaborative evaluation method based on fuzzy clustering heuristic rules. PMID:27398281
Applications of colored petri net and genetic algorithms to cluster tool scheduling
NASA Astrophysics Data System (ADS)
Liu, Tung-Kuan; Kuo, Chih-Jen; Hsiao, Yung-Chin; Tsai, Jinn-Tsong; Chou, Jyh-Horng
2005-12-01
In this paper, we propose a method that uses Coloured Petri Net (CPN) and genetic algorithm (GA) to obtain an optimal deadlock-free schedule and to solve the re-entrant problem for the flexible process of the cluster tool. The process of the cluster tool for producing a wafer can usually be classified into three types: 1) sequential process, 2) parallel process, and 3) sequential parallel process. But these processes are not economical enough to produce a variety of wafers in small volume. Therefore, this paper proposes the flexible process, where the operations of fabricating wafers are randomly arranged to achieve the best utilization of the cluster tool. However, the flexible process may have deadlock and re-entrant problems, which can be detected by CPN. On the other hand, GAs have been applied to find the optimal schedule for many types of manufacturing processes. Therefore, we integrate CPN and GAs to obtain an optimal schedule while handling the deadlock and re-entrant problems for the flexible process of the cluster tool.
Development of a Genetic Algorithm to Automate Clustering of a Dependency Structure Matrix
NASA Technical Reports Server (NTRS)
Rogers, James L.; Korte, John J.; Bilardo, Vincent J.
2006-01-01
Much technology assessment and organization design data exists in Microsoft Excel spreadsheets. Tools are needed to put this data into a form that can be used by design managers to make design decisions. One need is to cluster data that is highly coupled. Tools such as the Dependency Structure Matrix (DSM) and a Genetic Algorithm (GA) can be of great benefit. However, no tool currently combines the DSM and a GA to solve the clustering problem. This paper describes a new software tool that interfaces a GA written as an Excel macro with a DSM in spreadsheet format. The results of several test cases are included to demonstrate how well this new tool works.
Cluster mass fraction and size distribution determined by fs-time-resolved measurements
NASA Astrophysics Data System (ADS)
Gao, Xiaohui; Wang, Xiaoming; Shim, Bonggu; Arefiev, Alexey; Tushentsov, Mikhail; Breizman, Boris; Downer, Mike
2009-11-01
Characterization of supersonic gas jets is important for accurate interpretation and control of laser-cluster experiments. While average size and total atomic density can be found by standard Rayleigh scatter and interferometry, cluster mass fraction and size distribution are usually difficult to measure. Here we determine the cluster fraction and the size distribution with fs-time-resolved refractive index and absorption measurements in cluster gas jets after ionization and heating by an intense pump pulse. The fs-time-resolved refractive index measured with frequency domain interferometer (FDI) shows different contributions from monomer plasma and cluster plasma in the time domain, enabling us to determine the cluster fraction. The fs-time-resolved absorption measured by a delayed probe shows the contribution from clusters of various sizes, allowing us to find the size distribution.
CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection
NASA Astrophysics Data System (ADS)
Ao, Sio-Iong
More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In this chapter, hierarchical clustering algorithms are proposed for efficient tag SNP selection.
Reduced-cost sparsity-exploiting algorithm for solving coupled-cluster equations.
Brabec, Jiri; Yang, Chao; Epifanovsky, Evgeny; Krylov, Anna I; Ng, Esmond
2016-05-01
We present an algorithm for reducing the computational work involved in coupled-cluster (CC) calculations by sparsifying the amplitude correction within a CC amplitude update procedure. We provide a theoretical justification for this approach, which is based on the convergence theory of inexact Newton iterations. We demonstrate by numerical examples that, in the simplest case of the CCD equations, we can sparsify the amplitude correction by setting, on average, roughly 90% of the nonzero elements to zero without a major effect on the convergence of the inexact Newton iterations. PMID:26804120
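The idea of sparsifying a correction vector can be illustrated by zeroing all but its largest entries. The abstract does not say how the discarded elements are chosen, so the magnitude-based selection below, the function name `sparsify` and the `keep_fraction` parameter are assumptions; the amplitudes are made-up numbers, not data from the paper.

```python
def sparsify(correction, keep_fraction=0.1):
    """Zero all but the largest-magnitude entries of a flat list of amplitudes."""
    n_keep = max(1, int(round(keep_fraction * len(correction))))
    # Indices sorted by decreasing absolute value; keep only the first n_keep.
    order = sorted(range(len(correction)),
                   key=lambda i: abs(correction[i]), reverse=True)
    keep = set(order[:n_keep])
    return [x if i in keep else 0.0 for i, x in enumerate(correction)]


# Hypothetical amplitude correction with one dominant entry.
delta_t = [0.5, -0.01, 0.02, -0.9, 0.003, 0.0004, 0.03, -0.002, 0.001, 0.04]
sparse_delta_t = sparsify(delta_t, keep_fraction=0.1)
```

In an inexact Newton setting, the sparsified correction would be applied in place of the full one at each iteration, trading some accuracy per step for less work per step.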
NASA Astrophysics Data System (ADS)
Wang, Deguang; Han, Baochang; Huang, Ming
Computer forensics is the application of computer technology to access, investigate and analyze the evidence of computer crime. It mainly includes the processes of identifying and obtaining digital evidence, analyzing and extracting data, and filing and submitting results. Data analysis is the key link in computer forensics. Owing to the complexity and fuzzy character of real data, evidence analysis has had difficulty obtaining the desired results. This paper applies a fuzzy c-means clustering algorithm based on particle swarm optimization (FCMP) to computer forensics, and it can obtain more satisfactory results.
Multispectral image classification of MRI data using an empirically-derived clustering algorithm
Horn, K.M.; Osbourn, G.C.; Bouchard, A.M.; Sanders, J.A.
1998-08-01
Multispectral image analysis of magnetic resonance imaging (MRI) data has been performed using an empirically-derived clustering algorithm. This algorithm groups image pixels into distinct classes which exhibit similar response in the T₂ 1st and 2nd-echo, and T₁ (with and without gadolinium) MRI images. The grouping is performed in an n-dimensional mathematical space; the n-dimensional volumes bounding each class define each specific tissue type. The classification results are rendered again in real-space by color-coding each grouped class of pixels (associated with differing tissue types). This classification method is especially well suited for class volumes with complex boundary shapes, and is also expected to robustly detect abnormal tissue classes. The classification process is demonstrated using a three dimensional data set of MRI scans of a human brain tumor.
Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal
2015-01-01
It is well known that clustering partitions a network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While the topic of energy efficiency has been well investigated in conventional wireless sensor networks, the latter has not been extensively explored. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm, and the obtained simulation results show convergence, learning and adaptability of the algorithm to a dynamic environment towards achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that an energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach. PMID:26287191
Meanie3D - a mean-shift based, multivariate, multi-scale clustering and tracking algorithm
NASA Astrophysics Data System (ADS)
Simon, Jürgen-Lorenz; Malte, Diederich; Silke, Troemel
2014-05-01
Project OASE is one of five work groups at the HErZ (Hans Ertel Centre for Weather Research), an ongoing effort by the German weather service (DWD) to further research at universities concerning weather prediction. The goal of project OASE is to gain an object-based perspective on convective events by identifying them early in the onset of convective initiation and following them through their entire lifecycle. The ability to follow objects in this fashion requires new ways of object definition and tracking, which incorporate all the available data sets of interest, such as satellite imagery, weather radar or lightning counts. The Meanie3D algorithm provides the necessary tool for this purpose. Core features of this new approach to clustering (object identification) and tracking are the ability to identify objects using the mean-shift algorithm applied to a multitude of variables (multivariate), as well as the ability to detect objects on various scales (multi-scale) using elements of Scale-Space theory. The algorithm works in 2D as well as 3D without modifications. It is an extension of a method well known from the field of computer vision and image processing, which has been tailored to serve the needs of the meteorological community. Although a special application (convective initiation) is demonstrated here, the algorithm is easily tailored to provide clustering and tracking for a wide class of data sets and problems. In this talk, the demonstration is carried out on two of the OASE group's own composite sets. One is a 2D nationwide composite of Germany including C-Band radar (2D) and satellite information, the other a 3D local composite of the Bonn/Jülich area containing a high-resolution 3D X-Band radar composite.
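The mean-shift step named in the abstract above can be illustrated with a minimal sketch. It is not the Meanie3D implementation: it assumes a flat (uniform) kernel, a single fixed bandwidth, and points given as plain coordinate tuples, and the function names `mean_shift_point` and `mean_shift` are hypothetical.

```python
import math

def mean_shift_point(x, points, bandwidth, max_iter=100, tol=1e-6):
    """Move x to the mean of its neighbours within `bandwidth` until it stops moving."""
    for _ in range(max_iter):
        near = [p for p in points if math.dist(p, x) <= bandwidth]
        new_x = tuple(sum(coord) / len(near) for coord in zip(*near))
        if math.dist(new_x, x) < tol:
            return new_x
        x = new_x
    return x

def mean_shift(points, bandwidth):
    """Cluster points by the mode each one converges to (flat kernel, assumed)."""
    modes, labels = [], []
    for p in points:
        m = mean_shift_point(p, points, bandwidth)
        for i, existing in enumerate(modes):
            # Modes closer than half a bandwidth are treated as the same object.
            if math.dist(m, existing) < bandwidth / 2:
                labels.append(i)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels

# Two well-separated groups of 2D points (illustrative data only).
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
modes, labels = mean_shift(pts, bandwidth=1.0)
```

In a multivariate setting such as the one described, each point would presumably carry both spatial coordinates and measured values, so the same routine would run in the joint space.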
Farah, Ihsen; Nguyen, Thi Nguyet Que; Groh, Audrey; Guenot, Dominique; Jeannesson, Pierre; Gobinet, Cyril
2016-05-23
The coupling between Fourier-transform infrared (FTIR) imaging and unsupervised classification is effective in revealing the different structures of human tissues based on their specific biomolecular IR signatures; thus the spectral histology of the studied samples is achieved. However, the most widely applied clustering methods in spectral histology are local search algorithms, which converge to a local optimum, depending on initialization. Multiple runs of the techniques estimate multiple different solutions. Here, we propose a memetic algorithm, based on a genetic algorithm and a k-means clustering refinement, to perform optimal clustering. In addition, this approach was applied to the acquired FTIR images of normal human colon tissues originating from five patients. The results show the efficiency of the proposed memetic algorithm to achieve the optimal spectral histology of these samples, contrary to k-means. PMID:27110605
Clusters and cycles in the cosmic ray age distributions of meteorites
NASA Technical Reports Server (NTRS)
Woodard, M. F.; Marti, K.
1985-01-01
Statistically significant clusters in the cosmic ray exposure age distributions of some groups of iron and stone meteorites were observed, suggesting epochs of enhanced collision and breakups. Fourier analyses of the age distributions of chondrites reveal no significant periods, nor does the same analysis when applied to iron meteorite clusters.
Chen, Deng-kai; Gu, Rong; Gu, Yu-feng; Yu, Sui-huai
2016-01-01
Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design.
KANTS: a stigmergic ant algorithm for cluster analysis and swarm art.
Fernandes, Carlos M; Mora, Antonio M; Merelo, Juan J; Rosa, Agostinho C
2014-06-01
KANTS is a swarm intelligence clustering algorithm inspired by the behavior of social insects. It uses stigmergy as a strategy for clustering large datasets and, as a result, displays a typical behavior of complex systems: self-organization and global patterns emerging from the local interaction of simple units. This paper introduces a simplified version of KANTS and describes recent experiments with the algorithm in the context of a contemporary artistic and scientific trend called swarm art, a type of generative art in which swarm intelligence systems are used to create artwork or ornamental objects. KANTS is used here for generating color drawings from the input data that represent real-world phenomena, such as electroencephalogram sleep data. However, the main proposal of this paper is an art project based on well-known abstract paintings, from which the chromatic values are extracted and used as input. Colors and shapes are therefore reorganized by KANTS, which generates its own interpretation of the original artworks. The project won the 2012 Evolutionary Art, Design, and Creativity Competition. PMID:23912505
A contiguity-enhanced k-means clustering algorithm for unsupervised multispectral image segmentation
Theiler, J.; Gisler, G.
1997-07-01
The recent and continuing construction of multispectral and hyperspectral imagers will provide detailed data cubes with information in both the spatial and spectral domains. This data shows great promise for remote sensing applications ranging from environmental and agricultural to national security interests. The reduction of this voluminous data to useful intermediate forms is necessary both for downlinking all those bits and for interpreting them. Smart onboard hardware is required, as well as sophisticated earth-bound processing. A segmented image (in which the multispectral data in each pixel is classified into one of a small number of categories) is one kind of intermediate form which provides some measure of data compression. Traditional image segmentation algorithms treat pixels independently and cluster the pixels according only to their spectral information. This neglects the implicit spatial information that is available in the image. We suggest a simple approach: a variant of the standard k-means algorithm which uses both spatial and spectral properties of the image. The segmented image has the property that pixels which are spatially contiguous are more likely to be in the same class than are random pairs of pixels. This property naturally comes at some cost in terms of the compactness of the clusters in the spectral domain, but we have found that the spatial contiguity and spectral compactness properties are nearly orthogonal, which means that we can make considerable improvements in the one with minimal loss in the other.
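One way to combine spatial and spectral information in k-means is sketched below. The abstract does not give the authors' formulation, so this is only an assumed variant: the assignment cost adds a penalty `lam` for every 4-neighbour that currently carries a different label. The single-band image, the penalty form, and the name `contiguity_kmeans` are all illustrative assumptions.

```python
def contiguity_kmeans(image, centers, lam=0.5, n_iter=10):
    """k-means on a single-band image with an assumed neighbour-disagreement penalty."""
    rows, cols = len(image), len(image[0])
    centers = list(centers)
    classes = range(len(centers))
    # Start from a purely spectral assignment.
    labels = [[min(classes, key=lambda k: (image[r][c] - centers[k]) ** 2)
               for c in range(cols)] for r in range(rows)]
    for _ in range(n_iter):
        new_labels = [row[:] for row in labels]
        for r in range(rows):
            for c in range(cols):
                neigh = [labels[rr][cc]
                         for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                         if 0 <= rr < rows and 0 <= cc < cols]

                def cost(k):
                    spectral = (image[r][c] - centers[k]) ** 2
                    disagree = sum(1 for n in neigh if n != k)
                    return spectral + lam * disagree

                new_labels[r][c] = min(classes, key=cost)
        labels = new_labels
        # Update each class mean from its current members.
        for k in classes:
            vals = [image[r][c] for r in range(rows) for c in range(cols)
                    if labels[r][c] == k]
            if vals:
                centers[k] = sum(vals) / len(vals)
    return labels, centers

# Dark left half, bright right half, one ambiguous pixel (0.6) inside the dark region.
img = [[0.0, 0.1, 0.9, 1.0],
       [0.1, 0.6, 1.0, 0.9],
       [0.0, 0.1, 0.9, 1.0]]
spatial_labels, _ = contiguity_kmeans(img, [0.0, 1.0], lam=0.5)
spectral_labels, _ = contiguity_kmeans(img, [0.0, 1.0], lam=0.0)
```

With `lam=0.0` the routine reduces to ordinary k-means and the ambiguous pixel follows its spectral value; with the penalty switched on it is absorbed into the surrounding class, at a small cost in spectral compactness.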
NASA Astrophysics Data System (ADS)
Makino, Junichiro
2002-10-01
We present a novel, highly efficient algorithm to parallelize the O(N^2) direct summation method for N-body problems with individual timesteps on distributed-memory parallel machines such as Beowulf clusters. Previously known algorithms, in which all processors have complete copies of the N-body system, have the serious problem that the communication-computation ratio increases as we increase the number of processors, since the communication cost is independent of the number of processors. In the new algorithm, p processors are organized as a √p × √p two-dimensional array. Each processor has N/√p particles, but the data are distributed in such a way that the complete system is present if we look at any row or column consisting of √p processors. In this algorithm, the communication cost scales as N/√p, while the calculation cost scales as N^2/p. Thus, we can use a much larger number of processors without losing efficiency compared to what was practical with previously known algorithms.
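The scaling argument can be checked with a few lines of arithmetic. The sketch below assumes that the p processors form a square grid of side √p, that the per-processor calculation cost is N^2/p in both schemes, and that the communication cost is proportional to N for the full-copy scheme and to N/√p for the grid scheme; constant factors are ignored and the function name is hypothetical.

```python
import math

def comm_to_calc_ratio(n, p, scheme):
    """Per-processor communication cost divided by calculation cost (arbitrary units)."""
    calc = n * n / p
    if scheme == "copy":      # every processor holds the whole system
        comm = n
    elif scheme == "grid":    # square processor grid, side sqrt(p)
        side = math.isqrt(p)
        if side * side != p:
            raise ValueError("p must be a perfect square for the grid layout")
        comm = n / side
    else:
        raise ValueError("scheme must be 'copy' or 'grid'")
    return comm / calc

copy_ratio = comm_to_calc_ratio(1000, 16, "copy")   # grows like p / N
grid_ratio = comm_to_calc_ratio(1000, 16, "grid")   # grows like sqrt(p) / N
```

Under these assumptions the ratio grows linearly with p for the full-copy scheme but only with √p for the grid, which is why more processors remain usable before communication dominates.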
Analysis Clustering of Electricity Usage Profile Using K-Means Algorithm
NASA Astrophysics Data System (ADS)
Amri, Yasirli; Lailatul Fadhilah, Amanda; Fatmawati; Setiani, Novi; Rani, Septia
2016-01-01
Electricity is one of the most important needs of human life in many sectors. Demand for electricity will increase in line with population and economic growth. Adjusting the amount of electricity produced at a given time is important because storing electricity is expensive. To handle this problem, we need knowledge of the electricity usage patterns of clients. These patterns can be obtained by using clustering techniques. In this paper, clustering is used to find similar electricity usage patterns over a specified period. We apply the K-Means algorithm to a dataset of electricity consumption from 370 clients collected over a year. As a result, we obtained an interesting pattern: one large group of clients consumes its lowest electric load in the spring season, whereas in another group the lowest electricity consumption occurs in the winter season. From this result, an electricity provider can plan production for a given season based on the pattern of the electricity usage profiles.
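The clustering step described above can be sketched with a plain k-means loop. The seasonal profiles below are invented for illustration and are not the 370-client dataset; the initial centres are simply two of the profiles, and the function name `kmeans` is an assumption rather than the authors' code.

```python
def kmeans(profiles, centers, n_iter=20):
    """Plain k-means: assign each profile to the nearest centre, then update centres."""
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(len(centers)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centers[k])))
            for p in profiles
        ]
        for k in range(len(centers)):
            members = [p for p, lab in zip(profiles, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Hypothetical normalized load per season: (spring, summer, autumn, winter).
profiles = [
    [0.2, 0.8, 0.9, 0.7],
    [0.3, 0.9, 1.0, 0.8],
    [0.8, 0.9, 0.7, 0.2],
    [0.9, 1.0, 0.8, 0.3],
]
labels, centers = kmeans(profiles, centers=[profiles[0], profiles[2]])
# Index of the season with the lowest load in each cluster centre.
low_season = [c.index(min(c)) for c in centers]
```

Reading the minimum of each centre gives the season of lowest consumption per group, which is the kind of pattern the study reports.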
NASA Astrophysics Data System (ADS)
Cazade, Pierre-André; Zheng, Wenwei; Prada-Gracia, Diego; Berezovska, Ganna; Rao, Francesco; Clementi, Cecilia; Meuwly, Markus
2015-01-01
The ligand migration network for O2-diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k-means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. Also, for most of the states, their population coincides quite favourably, whereas the kinetics of and between the states differs. One of the major differences between k-means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site which points to its potential role as a hub in the network. This is also highlighted in the fact that LSDMap cannot identify this state. First passage time distributions from MCL clusterings using a one- (ligand-position) and two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand- and protein-motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion and highlight that depending on the questions at hand the best-performing method for a particular data set may differ.
Liu, Liqiang; Dai, Yuntao; Gao, Jinyu
2014-01-01
Ant colony optimization for continuous domains is a major research direction within ant colony optimization. In this paper, we propose a distribution model of ant colony foraging, based on an analysis of the relationship between the position distribution and the food source in the process of ant colony foraging. We design a continuous domain optimization algorithm based on the model and give the form of the solution for the algorithm, the distribution model of the pheromone, the update rules for the ant colony position, and the method for handling constraint conditions. The algorithm's performance is tested on a set of unconstrained optimization test functions and a further set of optimization test functions, and the results are compared with those of other algorithms to verify the correctness and effectiveness of the proposed algorithm. PMID:24955402
Xue, Zhong; Shen, Dinggang; Li, Hai; Wong, Stephen
2010-01-01
The traditional fuzzy clustering algorithm and its extensions have been successfully applied in medical image segmentation. However, because of the variability of tissues and anatomical structures, the clustering results might be biased by the tissue population and intensity differences. For example, clustering-based algorithms tend to over-segment white matter tissues of MR brain images. To solve this problem, we introduce a tissue probability map constrained clustering algorithm and apply it to serial MR brain image segmentation, i.e., a series of 3-D MR brain images of the same subject at different time points. Using the new serial image segmentation algorithm within the CLASSIC framework, which iteratively segments the images and estimates the longitudinal deformations, we improved both accuracy and robustness for serial image computing, and at the same time produced longitudinally consistent segmentation and stable measures. In the algorithm, the tissue probability maps consist of both the population-based and subject-specific segmentation priors. An experimental study using both simulated longitudinal MR brain data and the Alzheimer’s Disease Neuroimaging Initiative (ADNI) data confirmed that more accurate and robust segmentation results can be obtained by using both priors. The proposed algorithm can be applied in longitudinal follow-up studies of MR brain imaging with subtle morphological changes for neurological disorders. PMID:26566399
On Distribution Reduction and Algorithm Implementation in Inconsistent Ordered Information Systems
Zhang, Yanqin
2014-01-01
As one part of our work on ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties of distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in information systems based on dominance relations. A matrix algorithm for distribution reduction acquisition is given step by step, and a program implementing the algorithm is presented. The approach provides an effective tool for theoretical research on, and practical applications of, ordered information systems. For more detailed and valid illustration, cases are employed to explain and verify the algorithm and the program, which shows the effectiveness of the algorithm in complicated information systems. PMID:25258721
A maximum likelihood method for determining the distribution of galaxies in clusters
NASA Astrophysics Data System (ADS)
Sarazin, C. L.
1980-02-01
A maximum likelihood method is proposed for the analysis of the projected distribution of galaxies in clusters. It has many advantages compared to the standard method; principally, it does not require binning of the galaxy positions, applies to asymmetric clusters, and can simultaneously determine all cluster parameters. A rapid method of solving the maximum likelihood equations is given which also automatically gives error estimates for the parameters. Monte Carlo tests indicate this method applies even for rather sparse clusters. The Godwin-Peach data on the Coma cluster are analyzed; the core sizes derived agree reasonably with those of Bahcall. Some slight evidence of mass segregation is found.
Falcon: neural fuzzy control and decision systems using FKP and PFKP clustering algorithms.
Tung, W L; Quek, C
2004-02-01
Neural fuzzy networks proposed in the literature can be broadly classified into two groups. The first group is essentially fuzzy systems with self-tuning capabilities and requires an initial rule base to be specified prior to training. The second group of neural fuzzy networks, on the other hand, is able to automatically formulate the fuzzy rules from the numerical training data. Examples are the Falcon-ART and the POPFNN family of networks. A cluster analysis is first performed on the training data and the fuzzy rules are subsequently derived through the proper connections of these computed clusters. This correspondence proposes two new networks: Falcon-FKP and Falcon-PFKP. They are extensions of the Falcon-ART network and aim to overcome the shortcomings of the Falcon-ART network itself, i.e., poor classification ability when the classes of input data are very similar to each other; termination of the training cycle that depends heavily on a preset error parameter; a fuzzy rule base that may not be consistent (Nauck); no control over the number of fuzzy rules generated; and learning efficiency that may deteriorate when complementarily coded training data are used. These deficiencies are essentially inherent to the fuzzy ART clustering technique employed by the Falcon-ART network. Hence, two clustering techniques, Fuzzy Kohonen Partitioning (FKP) and its pseudo variant PFKP, are synthesized with the basic Falcon structure to compute the fuzzy sets and to automatically derive the fuzzy rules from the training data. The resultant neural fuzzy networks are Falcon-FKP and Falcon-PFKP, respectively. These two proposed networks have a lean and efficient training algorithm and consistent fuzzy rule bases. Extensive simulations are conducted using the two networks and their performances are encouraging when benchmarked against other neural and neural fuzzy systems. PMID:15369109
The temperature and size distribution of large water clusters from a non-equilibrium model.
Gimelshein, N; Gimelshein, S; Pradzynski, C C; Zeuch, T; Buck, U
2015-06-28
A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments. PMID:26133426
The temperature and size distribution of large water clusters from a non-equilibrium model
Gimelshein, N.; Gimelshein, S.; Pradzynski, C. C.; Zeuch, T.; Buck, U.
2015-06-28
A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments.
Agent-based method for distributed clustering of textual information
Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN
2010-09-28
A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
Studies of Entropy Distributions in X-ray Luminous Clusters of Galaxies
NASA Astrophysics Data System (ADS)
Cavagnolo, K. W.; Donahue, M. E.; Voit, G. M.; Sun, M.; Evrard, A. E.
2005-12-01
We present entropy distributions for a sample of galaxy clusters from the Chandra public archive, which builds on our previous analysis of nine nearby, bright clusters. By studying the entropy distribution within clusters we quantify the effect of radiative cooling, supernova feedback, and AGN feedback on cluster properties. This expanded sample contains both cooling flow and non-cooling flow clusters, while our previous work focused only on classical cooling flow clusters. We also test the predictions of Mathiesen and Evrard (2001) by checking whether the spectral fit temperature is an unbiased estimate of the mass-weighted temperature, and how this estimate affects the calculation of the intracluster medium mass. Temperature and entropy maps for the clusters in our sample using the Voronoi Tessellation method as employed by Statler et al. (in preparation) will also be presented. These maps serve as a prelude to future work in which we will investigate how well such maps may represent the "true" projected quantities of a cluster by comparing deprojected real and simulated clusters from our sample and the Virtual Cluster Exploratory, respectively. Our discussion focuses on tying together feedback mechanisms with the breaking of self-similar relations expected in cluster and galaxy formation models.
Survivable algorithms and redundancy management in NASA's distributed computing systems
NASA Technical Reports Server (NTRS)
Malek, Miroslaw
1992-01-01
The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.
On the distribution of dark matter in clusters of galaxies
NASA Astrophysics Data System (ADS)
Sand, David J.
2006-07-01
The goal of this thesis is to provide constraints on the dark matter density profile in galaxy clusters by developing and combining different techniques. The work is motivated by the fact that a precise measurement of the logarithmic slope of the dark matter on small scales provides a powerful test of the Cold Dark Matter paradigm for structure formation, where numerical simulations suggest a density profile ρ_DM ∝ r^(-1) or steeper in the innermost regions. We have obtained deep spectroscopy of gravitational arcs and the dominant brightest cluster galaxy in six carefully chosen galaxy clusters. Three of the clusters have both radial and tangential gravitational arcs while the other three display only tangential arcs. We analyze the stellar velocity dispersion for the brightest cluster galaxies in conjunction with axially symmetric lens models to jointly constrain the dark and baryonic mass profiles. For the radial arc systems we find the inner dark matter density profile is consistent with ρ_DM ∝ r^(-β), with ⟨β⟩ = [Special characters omitted.] (68% CL). Likewise, an upper limit on β for the tangential arc sample is found to be β < 0.57 (99% CL). We study a variety of possible systematic uncertainties, including the consequences of our one-dimensional mass model, fixed dark matter scale radius, and simple velocity dispersion analysis, and conclude that at most these systematics each contribute a Δβ ~ 0.2 systematic to our final conclusions. These results suggest the relationship between dark and baryonic matter in cluster cores is more complex than anticipated from dark matter only simulations. Recognizing the power of our technique, we have performed a systematic search of the Hubble Space Telescope Wide Field and Planetary Camera 2 data archive for further examples of systems containing tangential and radial gravitational arcs. We carefully examined 128 galaxy cluster cores and found 104 tangential arcs and 12
NASA Technical Reports Server (NTRS)
Cen, Renyue
1994-01-01
The mass and velocity distributions in the outskirts (0.5-3.0/h Mpc) of simulated clusters of galaxies are examined for a suite of cosmogonic models (two Omega(sub 0) = 1 and two Omega(sub 0) = 0.2 models) utilizing large-scale particle-mesh (PM) simulations. Through a series of model computations, designed to isolate the different effects, we find that both Omega(sub 0) and P(sub k) (lambda less than or = 16/h Mpc) are important to the mass distributions in clusters of galaxies. There is a correlation between power, P(sub k), and density profiles of massive clusters; more power tends to point to the direction of a stronger correlation between alpha and M(r less than 1.5/h Mpc); i.e., massive clusters being relatively extended and small mass clusters being relatively concentrated. A lower Omega(sub 0) universe tends to produce relatively concentrated massive clusters and relatively extended small mass clusters compared to their counterparts in a higher Omega(sub 0) model with the same power. Models with little (initial) small-scale power, such as the hot dark matter (HDM) model, produce more extended mass distributions than the isothermal distribution for most of the mass clusters. But the cold dark matter (CDM) models show mass distributions of most of the clusters more concentrated than the isothermal distribution. X-ray and gravitational lensing observations are beginning to provide useful information on the mass distribution in and around clusters; some interesting constraints on Omega(sub 0) and/or the (initial) power of the density fluctuations on scales lambda less than or = 16/h Mpc (where linear extrapolation is invalid) can be obtained when larger observational data sets, such as the Sloan Digital Sky Survey, become available.
MASS DISTRIBUTIONS OF STARS AND CORES IN YOUNG GROUPS AND CLUSTERS
Michel, Manon; Kirk, Helen; Myers, Philip C.
2011-07-01
We investigate the relation of the stellar initial mass function and the dense core mass function (CMF), using stellar masses and positions in 14 well-studied young groups. Initial column density maps are computed by replacing each star with a model initial core having the same star formation efficiency (SFE). For each group the SFE, core model, and observational resolution are varied to produce a realistic range of initial maps. A clump-finding algorithm parses each initial map into derived cores, derived core masses, and a derived CMF. The main result is that projected blending of initial cores causes derived cores to be too few and too massive. The number of derived cores is fewer than the number of initial cores by a mean factor of 1.4 in sparse groups and 5 in crowded groups. The mass at the peak of the derived CMF exceeds the mass at the peak of the initial CMF by a mean factor of 1.0 in sparse groups and 12.1 in crowded groups. These results imply that in crowded young groups and clusters, the mass distribution of observed cores may not reliably predict the mass distribution of protostars that will form in those cores.
Ahmad, Amir
2016-01-01
The early diagnosis of breast cancer is an important step in the fight against the disease. Machine learning techniques have shown promise in improving our understanding of the disease. As medical datasets consist of data points which cannot be precisely assigned to a class, fuzzy methods have been useful for studying these datasets. Sometimes breast cancer datasets are described by categorical features. Many fuzzy clustering algorithms have been developed for categorical datasets. However, in most of these methods Hamming distance is used to define the distance between two categorical feature values. In this paper, we use a probabilistic distance measure to compute the distance between a pair of categorical feature values. Experiments demonstrate that the distance measure performs better than Hamming distance for Wisconsin breast cancer data. PMID:27022504
OpenACC programs of the Swendsen-Wang multi-cluster spin flip algorithm
NASA Astrophysics Data System (ADS)
Komura, Yukihiro
2015-12-01
We present sample OpenACC programs of the Swendsen-Wang multi-cluster spin flip algorithm. OpenACC is a directive-based programming model for accelerators without requiring modification to the underlying CPU code itself. In this paper, we deal with the classical spin models as with the sample CUDA programs (Komura and Okabe, 2014), that is, two-dimensional (2D) Ising model, three-dimensional (3D) Ising model, 2D Potts model, 3D Potts model, 2D XY model and 3D XY model. We explain the details of sample OpenACC programs and compare the performance of the present OpenACC implementations with that of the CUDA implementations for the 2D and 3D Ising models and the 2D and 3D XY models.
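For orientation, one Swendsen-Wang update for the 2D Ising model can be written serially in a few lines. This is a plain Python sketch, not the OpenACC or CUDA code the paper describes; it assumes coupling J = 1, periodic boundaries, and a union-find structure for cluster labelling, and the function name is hypothetical.

```python
import math
import random

def swendsen_wang_step(spins, L, beta, rng):
    """One Swendsen-Wang update of an L x L Ising lattice stored as a flat list."""
    parent = list(range(L * L))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Bond probability between aligned neighbours (J = 1 assumed).
    p_bond = 1.0 - math.exp(-2.0 * beta)
    for x in range(L):
        for y in range(L):
            i = x * L + y
            for j in (((x + 1) % L) * L + y, x * L + (y + 1) % L):
                if spins[i] == spins[j] and rng.random() < p_bond:
                    parent[find(i)] = find(j)

    # Give every cluster a fresh random sign (equivalent to flipping with prob. 1/2).
    new_sign = {}
    for i in range(L * L):
        root = find(i)
        if root not in new_sign:
            new_sign[root] = rng.choice((-1, 1))
        spins[i] = new_sign[root]
    return spins

rng = random.Random(0)
cold = swendsen_wang_step([1] * 16, L=4, beta=50.0, rng=rng)  # one lattice-wide cluster
hot = swendsen_wang_step([1] * 16, L=4, beta=0.0, rng=rng)    # every site its own cluster
```

The cluster-labelling loop is the part that is hard to parallelize, which is presumably why accelerator implementations treat it specially.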
NASA Astrophysics Data System (ADS)
He, Tao; Sun, Yu-Jun; Xu, Ji-De; Wang, Xue-Jun; Hu, Chang-Ru
2014-01-01
Land use/cover (LUC) classification plays an important role in remote sensing and land change science. Because of the complexity of ground covers, LUC classification is still regarded as a difficult task. This study proposes a fusion algorithm that uses support vector machine (SVM) and fuzzy k-means (FKM) clustering algorithms. The main scheme was divided into two steps. First, a clustering map was obtained from the original remote sensing image using FKM; simultaneously, a normalized difference vegetation index layer was extracted from the original image. Then, the classification map was generated by using an SVM classifier. Three different classification algorithms were compared, tested, and verified: parametric (maximum likelihood), nonparametric (SVM), and hybrid (unsupervised-supervised, fusion of SVM and FKM) classifiers. The proposed algorithm obtained the highest overall accuracy in our experiments.
Kanters, René P F; Donald, Kelling J
2014-12-01
A new flexible implementation of a genetic algorithm for locating unique low energy minima of isomers of clusters is described and tested. The strategy employed can be applied to molecular or atomic clusters and has a flexible input structure so that a system with several different elements can be built up from a set of individual atoms or from fragments made up of groups of atoms. This cluster program is tested on several systems, and the results are compared to computational and experimental data from previous studies. The quality of the algorithm for locating reliably the most competitive low energy structures of an assembly of atoms is examined for strongly bound Si-Li clusters, and ZnF2 clusters, and the more weakly interacting water trimers. The use of the nuclear repulsion energy as a duplication criterion, an increasing population size, and avoiding mutation steps without loss of efficacy are distinguishing features of the program. For the Si-Li clusters, a few new low energy minima are identified in the testing of the algorithm, and our results for the metal fluorides and water show very good agreement with the literature. PMID:26583254
NASA Astrophysics Data System (ADS)
Nguyen, Sy Dzung; Nguyen, Quoc Hung; Choi, Seung-Bok
2015-01-01
This paper presents a new algorithm, called B-ANFIS, for building an adaptive neuro-fuzzy inference system (ANFIS) from a training data set. To increase the accuracy of the model, the following steps are carried out. First, a data merging rule is proposed to build and perform a data-clustering strategy. Subsequently, a combination of clustering processes in the input data space and in the joint input-output data space is presented. The main reason for this step is to overcome problems related to initialization and contradictory fuzzy rules, which usually arise when building ANFIS. The clustering process in the input data space is based on a proposed merging-possibilistic clustering (MPC) algorithm. The effectiveness of this process is evaluated to resume a clustering process in the joint input-output data space. The optimal parameters obtained after completion of the clustering process are used to build ANFIS. Simulations based on a numerical data set, 'Daily Data of Stock A', and measured data sets of a smart damper are performed to analyze and estimate accuracy. In addition, convergence and robustness of the proposed algorithm are investigated based on both theoretical and testing approaches.
NASA Astrophysics Data System (ADS)
Milham, Merrill E.
1994-10-01
In this report, relevant parts of the scattering theory for magnetic spheres are presented. Mass extinction coefficients, and the lognormal size distribution are defined. The theory and algorithms for integrating scattering parameters over size distributions are developed. The integrations are carried out in terms of dimensionless scattering, and size distribution parameters, which are simply related to the usual mass scattering coefficients. Fortran codes, which implement the algorithmic design, are presented, and examples of code use are given. Code listings are included.
New algorithms for the Vavilov distribution calculation and the corresponding energy loss sampling
Chibani, O.
1998-10-01
Two new algorithms for the fast calculation of the Vavilov distribution within the interval 0.01 ≤ κ ≤ 10, where neither the Gaussian approximation nor the Landau distribution may be used, are presented. These algorithms are particularly convenient for the sampling of the corresponding random energy loss. A comparison with the exact Vavilov distribution for the case of protons traversing Al slabs is given.
A global plan policy for coherent co-operation in distributed dynamic load balancing algorithms
NASA Astrophysics Data System (ADS)
Kara, M.
1995-12-01
Distributed-controlled dynamic load balancing algorithms are known to have several advantages over centralized algorithms such as scalability, and fault tolerance. Distributed implies that the control is decentralized and that a copy of the algorithm (called a scheduler) is replicated on each host of the network. However, distributed control also contributes to the lack of global goals and lack of coherence. This paper presents a new algorithm called DGP (decentralized global plans) that addresses the problem of coherence and co-ordination in distributed dynamic load balancing algorithms. The DGP algorithm is based on a strategy called global plans (GP), and aims at maintaining all computational loads of a distributed system within a band called delta. The rationale for the design of DGP is to allow each scheduler to consider the actions of its peer schedulers. With this level of co-ordination, the schedulers can act more as a coherent team. This new approach first explicitly specifies a global goal and then designs a strategy around this global goal such that each scheduler (i) takes into account local decisions made by other schedulers; (ii) takes into account the effect of its local decisions on the overall system and (iii) ensures load balancing. An experimental evaluation of DGP with two other well known dynamic load balancing algorithms published in the literature shows that DGP performs consistently better. More significantly, the results indicate that the global plan approach provides a better framework for the design of distributed dynamic load balancing algorithms.
NASA Astrophysics Data System (ADS)
Mao, Jiandong; Li, Jinxuan
2015-10-01
Particle size distribution is essential for describing direct and indirect radiation of aerosols. Because the relationship between the aerosol size distribution and optical thickness (AOT) is an ill-posed Fredholm integral equation of the first kind, the traditional techniques for determining such size distributions, such as the Phillips-Twomey regularization method, are often ambiguous. Here, we use an approach based on an improved particle swarm optimization algorithm (IPSO) to retrieve aerosol size distribution. Using AOT data measured by a CE318 sun photometer in Yinchuan, we compared the aerosol size distributions retrieved using a simple genetic algorithm, a basic particle swarm optimization algorithm and the IPSO. Aerosol size distributions for different weather conditions were analyzed, including sunny, dusty and hazy conditions. Our results show that the IPSO-based inversion method retrieved aerosol size distributions under all weather conditions, showing great potential for similar size distribution inversions.
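For orientation, the relationship referred to in this abstract is usually written in the following textbook form, which is a Fredholm integral equation of the first kind in the unknown size distribution. The symbols are the conventional ones and are assumptions here, since the abstract does not define them: τ(λ) is the aerosol optical thickness at wavelength λ, Q_ext is the Mie extinction efficiency for a particle of radius r and complex refractive index m, and n(r) is the columnar number size distribution.

```latex
\tau(\lambda) = \int_{r_{\min}}^{r_{\max}} \pi r^{2}\, Q_{\mathrm{ext}}(r, \lambda, m)\, n(r)\, \mathrm{d}r
```

The inversion is ill-posed because τ(λ) is measured at only a few wavelengths, while n(r) is a continuous function, so many different size distributions reproduce the same measurements within their errors.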
NASA Astrophysics Data System (ADS)
Bagheripour, Parisa; Asoodeh, Mojtaba
2013-12-01
Porosity, the void portion of reservoir rocks, determines the volume of hydrocarbon accumulation and strongly influences the assessment and development of hydrocarbon reservoirs. Accurate determination of porosity from core analysis is highly cost-, time-, and labor-intensive. Therefore, finding an accurate, fast, and cheap way of determining porosity is essential. On the other hand, conventional well log data, available in almost all wells, contain invaluable implicit information about porosity, which an intelligent system can make explicit. Fuzzy logic is a powerful tool for handling geoscience problems that are associated with uncertainty. However, determination of the best fuzzy formulation is still an issue. This study proposes an improved strategy, called the hybrid genetic algorithm-pattern search (GA-PS) technique, as an alternative to the widely used subtractive clustering (SC) method for setting up fuzzy rules between core porosity and petrophysical logs. The hybrid GA-PS technique is capable of extracting optimal parameters for fuzzy clusters (membership functions), which consequently results in the best fuzzy formulation. Results indicate that the GA-PS technique adjusts both the mean and the variance of the Gaussian membership functions, whereas SC controls only their mean. A comparison between the hybrid GA-PS technique and the SC method confirmed the superiority of the GA-PS technique in setting up fuzzy rules. The proposed strategy was successfully applied to an Iranian carbonate reservoir.
Silva, Mateus X; Galvão, Breno R L; Belchior, Jadson C
2014-05-21
A genetic algorithm is employed to survey an empirical potential energy surface for small Na(x)K(y) clusters with x + y ≤ 15, providing initial conditions for electronic structure methods. The minima of this empirical potential are assessed and corrected using high level ab initio methods such as CCSD(T), CR-CCSD(T)-L and MP2, and benchmark results are obtained for specific cases. The results are the first calculations for such small alloy clusters and may serve as a reference for further studies. The validity and choice of a proper functional and basis set for DFT calculations are then explored using the benchmark data, where it was found that the usual DFT approach may fail to provide the correct qualitative result for specific systems. The best general agreement with the benchmark calculations is achieved with the def2-TZVPP basis set and the SVWN5 functional, although the LANL2DZ basis set (with effective core potential) and SVWN5 functional provided the most cost-effective results. PMID:24691391
Parallel grid generation algorithm for distributed memory computers
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Moitra, Anutosh
1994-01-01
A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and implementation of multiple levels of parallelism on multiple instruction multiple data machines is indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations based on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.
Virus evolutionary genetic algorithm for task collaboration of logistics distribution
NASA Astrophysics Data System (ADS)
Ning, Fanghua; Chen, Zichen; Xiong, Li
2005-12-01
In order to achieve JIT (Just-In-Time) level and clients' maximum satisfaction in logistics collaboration, a Virus Evolutionary Genetic Algorithm (VEGA) was put forward under double constraints of logistics resource and operation sequence. Based on a mathematical description of a multiple objective function, the algorithm was designed to schedule logistics tasks with different due dates and allocate them to network members. By introducing a penalty term, makespan and customers' satisfaction were expressed in the fitness function. A dynamic adaptive probability of infection was used to improve the performance of local search. Experimental results illustrate the superior performance of VEGA compared to the standard Genetic Algorithm (GA). VEGA can therefore provide a powerful decision-making technique for optimizing resource configuration in logistics networks.
D'Azevedo, E.F.; Romine, C.H.
1992-09-01
The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
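To make the two inner products concrete, here is a minimal serial sketch of the standard conjugate gradient iteration with both marked in comments. It shows the textbook algorithm only, not the rearranged variant the paper proposes; the helper names and the small test system are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Standard CG for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x for the zero initial guess
    p = list(r)          # initial search direction
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)          # inner product 1: (p, A p)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)               # inner product 2: (r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
```

In a distributed memory setting each `dot` call becomes a global reduction. The second inner product needs the updated residual, which in turn needs `alpha` from the first, so in this form the two reductions cannot be combined into one communication phase.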
NASA Astrophysics Data System (ADS)
Ashton, Douglas J.; Liu, Jiwen; Luijten, Erik; Wilding, Nigel B.
2010-11-01
Highly size-asymmetrical fluid mixtures arise in a variety of physical contexts, notably in suspensions of colloidal particles to which much smaller particles have been added in the form of polymers or nanoparticles. Conventional schemes for simulating models of such systems are hamstrung by the difficulty of relaxing the large species in the presence of the small one. Here we describe how the rejection-free geometrical cluster algorithm of Liu and Luijten [J. Liu and E. Luijten, Phys. Rev. Lett. 92, 035504 (2004)] can be embedded within a restricted Gibbs ensemble to facilitate efficient and accurate studies of fluid phase behavior of highly size-asymmetrical mixtures. After providing a detailed description of the algorithm, we summarize the bespoke analysis techniques of [Ashton et al., J. Chem. Phys. 132, 074111 (2010)] that permit accurate estimates of coexisting densities and critical-point parameters. We apply our methods to study the liquid-vapor phase diagram of a particular mixture of Lennard-Jones particles having a 10:1 size ratio. As the reservoir volume fraction of small particles is increased in the range of 0%-5%, the critical temperature decreases by approximately 50%, while the critical density drops by some 30%. These trends imply that in our system, adding small particles decreases the net attraction between large particles, a situation that contrasts with hard-sphere mixtures where an attractive depletion force occurs.
Study of cluster reconstruction and track fitting algorithms for CGEM-IT at BESIII
NASA Astrophysics Data System (ADS)
Guo, Yue; Wang, Liang-Liang; Ju, Xu-Dong; Wu, Ling-Hui; Xiu, Qing-Lei; Wang, Hai-Xia; Dong, Ming-Yi; Hu, Jing-Ran; Li, Wei-Dong; Li, Wei-Guo; Liu, Huai-Min; Qun, Ou-Yang; Shen, Xiao-Yan; Yuan, Ye; Zhang, Yao
2016-01-01
Considering the effects of aging on the existing Inner Drift Chamber (IDC) of BESIII, a GEM-based inner tracker, the Cylindrical-GEM Inner Tracker (CGEM-IT), is proposed to be designed and constructed as an upgrade candidate for the IDC. This paper introduces a full simulation package for the CGEM-IT with a simplified digitization model, and describes the development of software for cluster reconstruction and track fitting, using a track fitting algorithm based on the Kalman filter method. Preliminary results for the reconstruction algorithms which are obtained using a Monte Carlo sample of single muon events in the CGEM-IT, show that the CGEM-IT has comparable momentum resolution and transverse vertex resolution to the IDC, and a better z-direction resolution than the IDC. Supported by National Key Basic Research Program of China (2015CB856700), National Natural Science Foundation of China (11205184, 11205182) and Joint Funds of National Natural Science Foundation of China (U1232201)
Crawford, Steven M.; Wirth, Gregory D.; Bershady, Matthew A.
2014-05-01
We analyze the spatial and velocity distributions of confirmed members in five massive clusters of galaxies at intermediate redshift (0.5 < z < 0.9) to investigate the physical processes driving galaxy evolution. Based on spectral classifications derived from broad- and narrow-band photometry, we define four distinct galaxy populations representing different evolutionary stages: red sequence (RS) galaxies, blue cloud (BC) galaxies, green valley (GV) galaxies, and luminous compact blue galaxies (LCBGs). For each galaxy class, we derive the projected spatial and velocity distribution and characterize the degree of subclustering. We find that RS, BC, and GV galaxies in these clusters have similar velocity distributions, but that BC and GV galaxies tend to avoid the core of the two z ≈ 0.55 clusters. GV galaxies exhibit subclustering properties similar to RS galaxies, but their radial velocity distribution is significantly platykurtic compared to the RS galaxies. The absence of GV galaxies in the cluster cores may explain their somewhat prolonged star-formation history. The LCBGs appear to have recently fallen into the cluster based on their larger velocity dispersion, absence from the cores of the clusters, and different radial velocity distribution than the RS galaxies. Both LCBG and BC galaxies show a high degree of subclustering on the smallest scales, leading us to conclude that star formation is likely triggered by galaxy-galaxy interactions during infall into the cluster.
NASA Astrophysics Data System (ADS)
Ma, Xiaoke; Gao, Lin
2011-05-01
The detection of community structure in complex networks is crucial since it provides insight into the substructures of the whole network. Spectral clustering algorithms that employ the eigenvalues and eigenvectors of an appropriate input matrix have been successfully applied in this field. Despite its empirical success in community detection, spectral clustering has been criticized for its inefficiency when dealing with large scale data sets. This is confirmed by the fact that the time complexity for spectral clustering is cubic with respect to the number of instances; even the memory efficient iterative eigensolvers, such as the power method, may converge slowly to the desired solutions. In efforts to improve the complexity and performance, many non-traditional spectral clustering algorithms have been proposed. Rather than using the real eigenvalues and eigenvectors as in the traditional methods, the non-traditional clusterings employ additional topological structure information characterized by the spectrum of a matrix associated with the network involved, such as the complex eigenvalues and their corresponding complex eigenvectors, eigenspaces and semi-supervised labels. However, to the best of our knowledge, no work has been devoted to comparison among these newly developed approaches. This is the main goal of this paper, through evaluating the effectiveness of these spectral algorithms against some benchmark networks. The experimental results demonstrate that the spectral algorithm based on the eigenspaces achieves the best performance but is the slowest algorithm; the semi-supervised spectral algorithm is the fastest but its performance largely depends on the prior knowledge; and the spectral method based on the complement network shows similar performance to the conventional ones.
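As a concrete illustration of the conventional spectral approach this abstract takes as its baseline, the sketch below splits a small graph into two communities using the sign of the Fiedler vector (the eigenvector for the second-smallest eigenvalue of the graph Laplacian), computed with a shifted power iteration. The example graph, the function name, the shift choice, and the iteration count are assumptions for illustration and are not taken from the paper.

```python
def fiedler_partition(adj, iters=500):
    """Two-way split from the sign of the Fiedler vector of L = D - A.

    Power iteration is run on (shift * I - L) with the constant vector
    projected out, so the dominant remaining eigenvector is the Fiedler
    vector. Assumes a connected, undirected graph given as a 0/1 matrix.
    """
    n = len(adj)
    deg = [sum(row) for row in adj]
    shift = 2 * max(deg)  # upper bound on the Laplacian eigenvalues
    x = [float(i) for i in range(n)]  # simple deterministic start vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant eigenvector
        y = [
            shift * x[i] - (deg[i] * x[i] - sum(adj[i][j] * x[j] for j in range(n)))
            for i in range(n)
        ]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return [0 if v < 0 else 1 for v in x]


# Two triangles (nodes 0-2 and 3-5) joined by the single edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
labels = fiedler_partition(adj)
```

Each iteration costs one matrix-vector product, which is why iterative solvers are memory efficient, but the convergence rate depends on the gap between neighboring eigenvalues and can be slow on large networks, as the abstract notes.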
NASA Astrophysics Data System (ADS)
Nandy, Subhajit; Chaudhury, Pinaki; Bhattacharyya, S. P.
2010-06-01
We present a genetic algorithm based investigation of structural fragmentation in dicationic noble gas clusters, Ar_n^{2+}, Kr_n^{2+}, and Xe_n^{2+}, where n denotes the size of the cluster. Dications are predicted to be stable above a threshold size of the cluster when positive charges are assumed to remain localized on two noble gas atoms and the Lennard-Jones potential along with bare Coulomb and ion-induced dipole interactions are taken into account for describing the potential energy surface. Our cutoff values are close to those obtained experimentally [P. Scheier and T. D. Mark, J. Chem. Phys. 11, 3056 (1987)] and theoretically [J. G. Gay and B. J. Berne, Phys. Rev. Lett. 49, 194 (1982)]. When the charges are allowed to be equally distributed over four noble gas atoms in the cluster and the nonpolarization interaction terms are allowed to remain unchanged, our method successfully identifies the size threshold for stability as well as the nature of the channels of dissociation as a function of cluster size. In Ar_n^{2+}, for example, fissionlike fragmentation is predicted for n = 55 while for n = 43, the predicted outcome is nonfission fragmentation in complete agreement with earlier work [Golberg et al., J. Chem. Phys. 100, 8277 (1994)].
What can the spatial distribution of galaxy clusters tell about their scaling relations?
NASA Astrophysics Data System (ADS)
Balaguera-Antolínez, Andrés
2014-03-01
Context. The clustering of galaxy clusters is sensitive not only to the parameters characterizing a given cosmological model, but also to the links between cluster intrinsic properties (e.g., the X-ray luminosity, X-ray temperature) and the total cluster mass. These links, referred to as the cluster scaling relations, represent the tip of the iceberg of the so-called cross-roads between cosmology and astrophysics on the cluster scale. Aims: In this paper we aim to quantify the capability of the inhomogeneous distribution of galaxy clusters, represented by the two-point statistics in Fourier space, to retrieve information on the underlying scaling relations. To that end, we make a case study using the mass-X-ray luminosity scaling relation for galaxy clusters and study its impact on the clustering pattern of these objects. Methods: To characterize the clustering of galaxy clusters, we define the luminosity-weighted power spectrum and introduce the luminosity power spectrum as direct assessment of the clustering of the property of interest, in our case, the cluster X-ray luminosity. Using a suite of halo catalogs extracted from N-body simulations and realistic estimates of the mass-X-ray luminosity relation, we measured these statistics with their corresponding covariance matrices. By carrying out a Fisher matrix analysis, we quantified the content of information (by means of a figure of merit) encoded in the amplitude, shape, and full shape of our probes for two-point statistics. Results: The full shape of the luminosity power spectrum, when analyzed up to scales of k ~ 0.2 h Mpc^-1, yields a figure of merit that is two orders of magnitude above the figure obtained from the unweighted power spectrum, and only one order of magnitude below the value encoded in X-ray luminosity function estimated from the same sample. This is a significant improvement over the analysis developed with the standard (i.e., unweighted) clustering probes. Conclusions: The measurements of the
NASA Astrophysics Data System (ADS)
Djorgovski, S.; Meylan, G.
1993-05-01
The forthcoming ASPCS volume, "Structure and Dynamics of Globular Clusters" (expected publication date: early summer of 1993) will contain a set of appendices with data resources on Galactic globular clusters; the authors of these papers include I.R. King, S. Peterson, C.T. Pryor, S.C. Trager, and ourselves. From these papers we have compiled a data base of various observed and derived parameters for globular clusters (143 of them at last count). Our main purpose is to use these data for correlative studies of globular cluster properties. Others may find it useful for similar purposes, for planning and support of observations, for testing of theoretical models, etc. We will describe the data base, and present some simple analysis of the cluster properties and correlations among them. The data will be made available to the community in a computer form, as ASCII files. Interested users should send an email message to the Internet address: george @ deimos.caltech.edu, and may also find the above mentioned ASPCS volume useful in their work. We thank our colleagues who contributed data for this compilation for their efforts. S.D. acknowledges a partial support from the NASA contract NAS5-31348, and the NSF PYI award AST-9157412.
Saremi, Saeed; Sejnowski, Terrence J.
2016-01-01
Natural images are scale invariant with structures at all length scales. We formulated a geometric view of scale invariance in natural images using percolation theory, which describes the behavior of connected clusters on graphs. We map images to the percolation model by defining clusters on a binary representation for images. We show that critical percolating structures emerge in natural images and study their scaling properties by identifying fractal dimensions and exponents for the scale-invariant distributions of clusters. This formulation leads to a method for identifying clusters in images from underlying structures as a starting point for image segmentation. PMID:26415153
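The idea of defining clusters on a binary representation of an image can be sketched as follows: threshold the image, then collect connected groups of "on" pixels and record their sizes, which is the raw material for a cluster size distribution. The fixed threshold, the 4-neighbor connectivity, and the toy image are assumptions for illustration; the abstract does not specify how the binary representation or the clusters are defined.

```python
from collections import deque


def binarize(image, threshold):
    """Map each pixel to 1 if it is at or above the threshold, else 0."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def cluster_sizes(binary):
    """Sizes of 4-connected clusters of 1-pixels, largest first."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first search over one connected cluster.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sorted(sizes, reverse=True)


# Hypothetical 4x4 grayscale image.
image = [
    [9, 8, 1, 0],
    [7, 1, 0, 6],
    [0, 2, 5, 7],
    [1, 0, 0, 8],
]
sizes = cluster_sizes(binarize(image, threshold=5))
```

On real images, a histogram of these sizes across thresholds is where a scale-invariant (power-law) distribution would be looked for.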
Saremi, Saeed; Sejnowski, Terrence J
2016-05-01
Natural images are scale invariant with structures at all length scales. We formulated a geometric view of scale invariance in natural images using percolation theory, which describes the behavior of connected clusters on graphs. We map images to the percolation model by defining clusters on a binary representation for images. We show that critical percolating structures emerge in natural images and study their scaling properties by identifying fractal dimensions and exponents for the scale-invariant distributions of clusters. This formulation leads to a method for identifying clusters in images from underlying structures as a starting point for image segmentation. PMID:26415153
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
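One simple way to picture the evaluation of hills and valleys is to count the local maxima in each feature's one-dimensional histogram: a feature whose histogram shows several separated hills is more likely to carry cluster structure than one with a single hill. The sketch below is an illustrative heuristic under that assumption; the function name and the example histograms are hypothetical, and the report's actual significance measure is not described in the abstract.

```python
def count_hills(counts):
    """Count local maxima (hills) in a histogram given as a list of bin counts.

    A hill is registered whenever the counts fall after having risen;
    a flat plateau at the top is counted once.
    """
    hills = 0
    rising = False
    prev = 0
    for c in counts + [0]:  # trailing zero closes a hill at the right edge
        if c > prev:
            rising = True
        elif c < prev:
            if rising:
                hills += 1
            rising = False
        prev = c
    return hills


# Hypothetical histograms for two features.
feature_a = [1, 5, 1, 0, 4, 6, 2]  # two hills separated by a valley
feature_b = [1, 3, 5, 3, 1]        # a single hill

ranking = sorted(
    {"feature_a": count_hills(feature_a), "feature_b": count_hills(feature_b)}.items(),
    key=lambda item: item[1],
    reverse=True,
)
```

Because only one histogram per feature is inspected, the cost grows linearly with the number of features, which fits the abstract's description of the approach as economical.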
Ansari, Elnaz Saberi; Eslahchi, Changiz; Pezeshk, Hamid; Sadeghi, Mehdi
2014-09-01
Decomposition of structural domains is an essential task in classifying protein structures, predicting protein function, and many other proteomics problems. As the number of known protein structures in PDB grows exponentially, the need for accurate automatic domain decomposition methods becomes more essential. In this article, we introduce a bottom-up algorithm for assigning protein domains using a graph theoretical approach. This algorithm is based on a center-based clustering approach. For constructing initial clusters, members of an independent dominating set for the graph representation of a protein are considered as the centers. A distance matrix is then defined for these clusters. To obtain final domains, these clusters are merged using the compactness principle of domains and a method similar to the neighbor-joining algorithm considering some thresholds. The thresholds are computed using a training set consisting of 50 protein chains. The algorithm is implemented using C++ language and is named ProDomAs. To assess the performance of ProDomAs, its results are compared with seven automatic methods, against five publicly available benchmarks. The results show that ProDomAs outperforms other methods applied on the mentioned benchmarks. The performance of ProDomAs is also evaluated against 6342 chains obtained from ASTRAL SCOP 1.71. ProDomAs is freely available at http://www.bioinf.cs.ipm.ir/software/prodomas. PMID:24596179
The Age, Mass, and Size Distributions of Star Clusters in M51
NASA Astrophysics Data System (ADS)
Chandar, Rupali; Whitmore, Bradley C.; Dinino, Daiana; Kennicutt, Robert C.; Chien, L.-H.; Schinnerer, Eva; Meidt, Sharon
2016-06-01
We present a new catalog of 3816 compact star clusters in the grand design spiral galaxy M51 based on observations taken with the Hubble Space Telescope. The age distribution of the clusters declines starting at very young ages, and can be represented by a power law, dN/dτ ∝ τ^γ, with γ = -0.65 ± 0.15. No significant change in the shape of the age distribution at different masses is observed. The mass function of the clusters younger than τ ≈ 400 Myr can also be described by a power law, dN/dM ∝ M^β, with β ≈ -2.1 ± 0.2. We compare these distributions with the predictions from various cluster disruption models, and find that they are consistent with models where clusters disrupt approximately independent of their initial mass, but not with models where lower mass clusters are disrupted earlier than their higher mass counterparts. We find that the half-light radii of clusters more massive than M ≈ 3 × 10^4 M_⊙ and with ages between 100 and 400 Myr are larger by a factor of ≈3–4 than their counterparts that are younger than 10^7 years old, suggesting that the clusters physically expand during their early life.
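A power-law index such as γ in dN/dτ ∝ τ^γ is commonly estimated as the slope of a straight line in log-log space. The sketch below shows that idea with an ordinary least-squares fit on synthetic points generated with γ = -0.65. The data are made up for illustration and are not the M51 catalog, and the authors' actual fitting procedure and binning are not described in the abstract.

```python
import math


def power_law_slope(x, y):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Synthetic age distribution following dN/dtau ∝ tau^(-0.65); ages in years.
ages = [1e6, 1e7, 1e8, 4e8]
dn_dtau = [1e3 * (t / 1e6) ** -0.65 for t in ages]
gamma = power_law_slope(ages, dn_dtau)
```

With real catalogs the recovered slope also depends on bin widths and completeness limits, which is part of why the abstract quotes an uncertainty on γ.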
A uniform metal distribution in the intergalactic medium of the Perseus cluster of galaxies.
Werner, Norbert; Urban, Ondrej; Simionescu, Aurora; Allen, Steven W
2013-10-31
Most of the metals (elements heavier than helium) produced by stars in the member galaxies of clusters currently reside within the hot, X-ray-emitting intra-cluster gas. Observations of X-ray line emission from this intergalactic medium have suggested a relatively small cluster-to-cluster scatter outside the cluster centres and enrichment with iron out to large radii, leading to the idea that the metal enrichment occurred early in the history of the Universe. Models with early enrichment predict a uniform metal distribution at large radii in clusters, whereas those with late-time enrichment are expected to introduce significant spatial variations of the metallicity. To discriminate clearly between these competing models, it is essential to test for potential inhomogeneities by measuring the abundances out to large radii along multiple directions in clusters, which has not hitherto been done. Here we report a remarkably uniform iron abundance, as a function of radius and azimuth, that is statistically consistent with a constant value of ZFe = 0.306 ± 0.012 in solar units out to the edge of the nearby Perseus cluster. This homogeneous distribution requires that most of the metal enrichment of the intergalactic medium occurred before the cluster formed, probably more than ten billion years ago, during the period of maximal star formation and black hole activity. PMID:24172976
Discovery of a widely distributed toxin biosynthetic gene cluster
Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.
2008-01-01
Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757
The mass distribution of the strong lensing cluster SDSS J1531+3414
Sharon, Keren; Johnson, Traci L.; Gladders, Michael D.; Rigby, Jane R.; Wuyts, Eva; Bayliss, Matthew B.; Florian, Michael K.; Dahle, Håkon
2014-11-01
We present the mass distribution at the core of SDSS J1531+3414, a strong-lensing cluster at z = 0.335. We find that the mass distribution is well described by two cluster-scale halos with a contribution from cluster-member galaxies. New Hubble Space Telescope observations of SDSS J1531+3414 reveal a signature of ongoing star formation associated with the two central galaxies at the core of the cluster, in the form of a chain of star forming regions at the center of the cluster. Using the lens model presented here, we place upper limits on the contribution of a possible lensed image to the flux at the central region, and rule out that this emission is coming from a background source.
Cluster Mass Distribution of the Hubble Frontier Fields - What have we learned?
NASA Astrophysics Data System (ADS)
Jean-Paul, Kneib
2016-07-01
The Hubble Frontier Fields have provided the deepest imaging of six of the most massive clusters in the Universe. Using strong lensing and weak lensing techniques, we have investigated the mass models of these clusters with record-high precision. First we identified the multiple images, which are then compared against an evolving model to best match the strong lensing observational constraints. We then include weak lensing and flexion to investigate the mass distribution in the outer region. By investigating the accuracy of the model we show that we can constrain the small-scale mass distribution, thus investigating the relation between the cluster galaxy stellar mass and its dark matter halo. On larger scales, combining weak lensing and X-ray measurements, we can probe the assembly scenario of these clusters, which confirms that massive clusters are at the crossroads of filamentary structures.
Angular distribution of ejected electrons at the laser-cluster interactions
NASA Astrophysics Data System (ADS)
Gets, A. V.; Krainov, V. P.
2007-06-01
The histograms of deflection angles of electrons ejected from Xe clusters irradiated by femtosecond super-intense laser pulses are presented. The dependence of the angular distribution on the peak laser intensity, the pulse duration, and the cluster position is considered. A clear relationship between the final electron energy and the deflection angle is shown. The deflection angles are calculated by solving the relativistic equation of motion taking into account the Lorentz force and the Coulomb field of the ionized cluster. The ions in the cluster undergo sequential multiple ionization up to charge multiplicity Z = 26. The measurements of the electron angular distributions allow us to reproduce the imaging dynamics of outer ionization of the cluster at the leading edge of the relativistic femtosecond laser pulse.
ERIC Educational Resources Information Center
Li, Wenhao
2011-01-01
Distributed workflow technology has been widely used in modern education and e-business systems. Distributed web applications have shown cross-domain and cooperative characteristics to meet the need of current distributed workflow applications. In this paper, the author proposes a dynamic and adaptive scheduling algorithm PCSA (Pre-Calculated…
Saro, A.
2015-10-12
In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg² of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation with the following function
Frank, K. A.; Peterson, J. R.; Andersson, K.; Fabian, A. C.; Sanders, J. S.
2013-02-10
We measure the intracluster medium (ICM) temperature distributions for 62 galaxy clusters in the HIFLUGCS, an X-ray flux-limited sample, with available X-ray data from XMM-Newton. We search for correlations between the width of the temperature distributions and other cluster properties, including median cluster temperature, luminosity, size, presence of a cool core, active galactic nucleus (AGN) activity, and dynamical state. We use a Markov Chain Monte Carlo analysis, which models the ICM as a collection of X-ray emitting smoothed particles of plasma. Each smoothed particle is given its own set of parameters, including temperature, spatial position, redshift, size, and emission measure. This allows us to measure the width of the temperature distribution, median temperature, and total emission measure of each cluster. We find that none of the clusters have a temperature width consistent with isothermality. Counterintuitively, we also find that the temperature distribution widths of disturbed, non-cool-core, and AGN-free clusters tend to be wider than in other clusters. A linear fit to σ_kT-kT_med finds σ_kT ≈ 0.20 kT_med + 1.08, with an estimated intrinsic scatter of ≈0.55 keV, demonstrating a large range in ICM thermal histories.
Ma, Li; Li, Yang; Fan, Suohai; Fan, Runzhu
2015-01-01
Image segmentation plays an important role in medical image processing. Fuzzy c-means (FCM) clustering is one of the popular clustering algorithms for medical image segmentation. However, FCM depends on the initial cluster centers, easily falls into local optima, and is sensitive to noise. To solve these problems, this paper proposes a hybrid artificial fish swarm algorithm (HAFSA). The proposed algorithm combines the artificial fish swarm algorithm (AFSA) with FCM, using the global optimization search and parallel computing ability of AFSA to find a superior result. Meanwhile, the Metropolis criterion and a noise reduction mechanism are introduced into AFSA to enhance the convergence rate and antinoise ability. An artificial grid graph and Magnetic Resonance Imaging (MRI) data are used in the experiments, and the experimental results show that the proposed algorithm has stronger antinoise ability and higher precision. A number of evaluation indicators also demonstrate that HAFSA outperforms FCM and suppressed FCM (SFCM). PMID:26649068
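For orientation, the plain FCM baseline that the abstract refers to can be sketched as follows. This is a minimal illustration on one-dimensional data, not the HAFSA method of the paper; the function name `fcm_1d`, the deterministic initialisation, and the default parameters are assumptions made for the example.

```python
def fcm_1d(data, c=2, m=2.0, iters=50):
    """Plain fuzzy c-means on 1-D data (illustrative sketch, not HAFSA)."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialisation: spread the centers evenly over the data range.
    centers = [lo + (hi - lo) * (i + 0.5) / c for i in range(c)]
    u = []
    for _ in range(iters):
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        u = []
        for x in data:
            d = [abs(x - v) for v in centers]
            if 0.0 in d:
                # A point sitting exactly on a center belongs fully to it.
                row = [1.0 if dj == 0.0 else 0.0 for dj in d]
            else:
                row = [dj ** (-2.0 / (m - 1.0)) for dj in d]
            s = sum(row)
            u.append([w / s for w in row])
        # Center update: mean of the data weighted by u_ij^m.
        centers = [
            sum(u[k][i] ** m * data[k] for k in range(n))
            / sum(u[k][i] ** m for k in range(n))
            for i in range(c)
        ]
    return centers, u
```

The dependence on the starting centers is visible in the first line of the loop: a different initial `centers` list can lead the same updates to a different local optimum, which is the weakness the paper targets.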
Wang, Xueyi
2011-01-01
The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high-dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds the nearest training objects starting from the cluster nearest to the query object and uses the triangle inequality to reduce the number of distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high-dimensional spaces. PMID:22247818
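The two stages described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the helper names `build` and `knn`, the fixed number of Lloyd iterations, and the random seeding are all hypothetical, and the published algorithm may order and prune candidates differently.

```python
import heapq
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build(points, n_clusters, iters=10, seed=0):
    """Buildup stage: plain k-means, then store each point's distance to its center."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[j].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[j].append((dist(p, centers[j]), p))
    return centers, clusters

def knn(query, centers, clusters, k):
    """Searching stage: visit clusters nearest-first, prune by the triangle inequality."""
    best = []    # max-heap of the k best candidates, stored as (-distance, point)
    n_dist = 0   # number of query-to-point distances actually computed
    order = sorted(range(len(centers)), key=lambda i: dist(query, centers[i]))
    for i in order:
        dqc = dist(query, centers[i])
        for dpc, p in clusters[i]:
            # d(q, p) >= |d(q, c) - d(p, c)|, so p cannot beat the current k-th best.
            if len(best) == k and abs(dqc - dpc) >= -best[0][0]:
                continue
            d = dist(query, p)
            n_dist += 1
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-nd, p) for nd, p in best), n_dist
```

Because a point is skipped only when its lower bound already meets or exceeds the current k-th best distance, the returned distances match a brute-force search; `n_dist` shows how many distance evaluations the pruning saved.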
Energy distributions of constituent atoms of cluster impacts on solid surface
NASA Astrophysics Data System (ADS)
Yamamura, Yasunori
1991-12-01
Using the time-evolution Monte Carlo simulation code DYACAT, the energy distributions of constituent atoms due to big cluster impacts on amorphous targets have been investigated, where (Ag)n and (Al)n clusters (n being 10 to 500) with energies of a few 100 eV/atom to keV/atom are bombarded on amorphous carbon and gold targets, respectively. It is found that the energy distribution of constituent atoms is strongly affected by the mass ratio M2/M1 (M1 and M2 being the atomic masses of the constituent atom and the target atom, respectively), the size of the cluster, and the cluster energy. In the case of the 1 keV/atom (Ag)500 cluster impacts on C (M2/M1 < 1) the shape of the energy distribution of constituent atoms is trapezoidal, while in the case of the Al cluster impacts on Au (M2/M1 > 1) the high-energy tail of the energy distribution of Al atoms due to the big cluster impact (n > 100) can be well described in terms of the Maxwell-Boltzmann function, and its temperature is linearly proportional to the energy. In the case of 1 keV/atom (Al)500 cluster impact on Au, the quasi-equilibrium state continues for more than 0.6×10^-13 s, but the temperature of the cluster impact region decreases as time passes. The present simulation supports the recent Ansatz of Echenique, Manson and Ritchie in their theory of cluster impact fusion.
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes.
Azevedo, Analice C; Bento, Cláudia B P; Ruiz, Jeronimo C; Queiroz, Marisa V; Mantovani, Hilário C
2015-10-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660
Marchal, Rémi; Carbonnière, Philippe; Pouchan, Claude
2015-01-22
The study of atomic clusters has become an increasingly active area of research in recent years because of the fundamental interest in studying a completely new area that can bridge the gap between atomic and solid state physics. Due to their specific properties, such compounds are of great interest in the field of nanotechnology [1,2]. Here, we present our GSAM algorithm, based on a DFT exploration of the PES, to find the low-lying isomers of such compounds. This algorithm includes the generation of an initial set of structures from which the most relevant are selected. Moreover, an optimization process called raking optimization, able to discard step by step all the non-physically reasonable configurations, has been implemented to reduce the computational cost of this algorithm. Structural properties of Ga_nAs_m clusters will be presented as an illustration of the method.
NASA Astrophysics Data System (ADS)
Boschan, A.; Ocampo, B. L.; Annichini, M.; Gauthier, G.
2016-06-01
A study on the spatial organization and velocity fluctuations of non-Brownian spherical particles settling at low Reynolds number in a vertical Hele-Shaw cell is reported. The particle volume fraction ranged from 0.005 to 0.05, while the distance between cell plates ranged from 5 to 15 times the particle radius. Particle tracking revealed that particles were not uniformly distributed in space but assembled in transient settling clusters. The population distribution of these clusters followed an exponential law. The measured velocity fluctuations are in agreement with that predicted theoretically for spherical clusters, from the balance between the apparent weight and the drag force. This result suggests that particle clustering, more than a spatial distribution of particles derived from random and independent events, is at the origin of the velocity fluctuations.
NASA Astrophysics Data System (ADS)
Ayvaz, M. Tamer
2007-11-01
This study proposes an inverse solution algorithm through which both the aquifer parameters and the zone structure of these parameters can be determined based on a given set of observations on piezometric heads. In the zone structure identification problem, the fuzzy c-means (FCM) clustering method is used. The association of the zone structure with the transmissivity distribution is accomplished through an optimization model. The meta-heuristic harmony search (HS) algorithm, which is conceptualized using the musical process of searching for a perfect state of harmony, is used as an optimization technique. The optimum parameter zone structure is identified based on three criteria, which are the residual error, parameter uncertainty, and structure discrimination. A numerical example given in the literature is solved to demonstrate the performance of the proposed algorithm. Also, a sensitivity analysis is performed to test the performance of the HS algorithm for different sets of solution parameters. Results indicate that the proposed solution algorithm is an effective way to simultaneously identify aquifer parameters and their corresponding zone structures.
NASA Astrophysics Data System (ADS)
Fukunaga, Takafumi
With the advent of powerful multi-core PC clusters, the computation performance of each node has increased dramatically, and this trend will continue in the future. On the other hand, powerful network systems (Myrinet, Infiniband, etc.) are expensive, and they tend to make programming harder and degrade portability because they need dedicated libraries and protocol stacks. This paper proposes a relatively simple method to improve bandwidth-oriented parallel applications by improving the communication performance without the above dedicated hardware, libraries, protocol stacks and IEEE802.3ad (LACP). Although this proposal resembles IEEE802.3ad in using multiple Ethernet ports, it performs equal to or better than IEEE802.3ad without LACP switches and drivers. Moreover, whereas the performance of LACP is influenced by the environment (MAC addresses, IP addresses, etc.) because its distribution algorithm uses these parameters, the proposed method shows the same effect regardless of them.
The fractal dimensions of the spatial distribution of young open clusters in the solar neighbourhood
NASA Astrophysics Data System (ADS)
de La Fuente Marcos, R.; de La Fuente Marcos, C.
2006-06-01
Context: Fractals are geometric objects with dimensionalities that are not integers. They play a fundamental role in the dynamics of chaotic systems. Observation of fractal structure in both the gas and the star-forming sites in galaxies suggests that the spatial distribution of young open clusters should follow a fractal pattern, too. Aims: Here we investigate the fractal pattern of the distribution of young open clusters in the Solar Neighbourhood using a volume-limited sample from WEBDA and a multifractal analysis. By counting the number of objects inside spheres of different radii centred on clusters, we study the homogeneity of the distribution. Methods: The fractal dimension D of the spatial distribution of a volume-limited sample of young open clusters is determined by analysing different moments of the count-in-cells. The spectrum of the Minkowski-Bouligand dimension of the distribution is studied as a function of the parameter q. The sample is corrected for dynamical effects. Results: The Minkowski-Bouligand dimension varies with q in the range 0.71-1.77, therefore the distribution of young open clusters is fractal. We estimate that the average value of the fractal dimension is <D> = 1.7 ± 0.2 for the distribution of young open clusters studied. Conclusions: The spatial distribution of young open clusters in the Solar Neighbourhood exhibits multifractal structure. The fractal dimension is time-dependent, increasing over time. The values found are consistent with the fractal dimension of star-forming sites in other spiral galaxies.
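The idea of counting objects inside spheres of different radii can be illustrated with a simple correlation-dimension estimate: if the mean neighbour count scales as N(r) ∝ r^D, then D is the slope of log N against log r. The sketch below is only that single-moment estimate, assuming hypothetical names and an ordinary least-squares fit; it does not reproduce the paper's multifractal spectrum in q or its correction for dynamical effects.

```python
import math

def correlation_dimension(points, radii):
    """Slope of log(mean neighbour count) versus log(radius).

    Assumes every radius is large enough that at least one pair of
    points falls within it; otherwise log(0) would be undefined.
    """
    n = len(points)
    log_r, log_c = [], []
    for r in radii:
        total = 0
        for i in range(n):
            for j in range(n):
                if i != j and math.dist(points[i], points[j]) <= r:
                    total += 1
        log_r.append(math.log(r))
        log_c.append(math.log(total / n))
    # Ordinary least-squares slope of log_c against log_r.
    mean_r = sum(log_r) / len(log_r)
    mean_c = sum(log_c) / len(log_c)
    num = sum((x - mean_r) * (y - mean_c) for x, y in zip(log_r, log_c))
    den = sum((x - mean_r) ** 2 for x in log_r)
    return num / den
```

For points laid out along a straight line the estimate comes out close to 1, and for a filled plane close to 2, with edge effects pulling the value slightly below the ideal; non-integer values in between indicate fractal clustering.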
The abundance and spatial distribution of ultra-diffuse galaxies in nearby galaxy clusters
NASA Astrophysics Data System (ADS)
van der Burg, Remco F. J.; Muzzin, Adam; Hoekstra, Henk
2016-05-01
Recent observations have highlighted a significant population of faint but large (r_eff > 1.5 kpc) galaxies in the Coma cluster. The origin of these ultra-diffuse galaxies (UDGs) remains puzzling, as the interpretation of these observational results has been hindered by the (partly) subjective selection of UDGs, and the limited study of only the Coma (and some examples in the Virgo-) cluster. In this paper we extend the study of UDGs using eight clusters in the redshift range 0.044
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
Learning of discriminant hyperplanes in imperfectly supervised or unsupervised training sample sets with unreliably labeled samples along the fuzzy joint boundaries between sample clusters is discussed, with the discriminant hyperplane designed to be a least-squares fit to the unreliably labeled data points. (Samples along the fuzzy boundary jump back and forth from one cluster to the other in recursive cluster stabilization and are considered unreliably labeled.) Minimization of the distances of these unreliably labeled samples from the hyperplanes does not sacrifice the ability to discriminate between classes represented by reliably labeled subsets of samples. An equivalent unconstrained linear inequality problem is formulated and algorithms for its solution are indicated. Landsat earth sensing data were used in confirming the validity and computational feasibility of the approach, which should be useful in deriving discriminant hyperplanes separating clusters with fuzzy boundaries, given supervised training sample sets with unreliably labeled boundary samples.
An Improved Source-Scanning Algorithm for Locating Earthquake Clusters or Aftershock Sequences
NASA Astrophysics Data System (ADS)
Liao, Y.; Kao, H.; Hsu, S.
2010-12-01
The Source-scanning Algorithm (SSA) was originally introduced in 2004 to locate non-volcanic tremors. Its application was later expanded to the identification of earthquake rupture planes and the near-real-time detection and monitoring of landslides and mud/debris flows. In this study, we further improve SSA for the purpose of locating earthquake clusters or aftershock sequences when only a limited number of waveform observations are available. The main improvements include the application of a ground motion analyzer to separate P and S waves, the automatic determination of resolution based on the grid size and time step of the scanning process, and a modified brightness function to utilize constraints from multiple phases. Specifically, the improved SSA (named as ISSA) addresses two major issues related to locating earthquake clusters/aftershocks. The first one is the massive amount of both time and labour to locate a large number of seismic events manually. And the second one is to efficiently and correctly identify the same phase across the entire recording array when multiple events occur closely in time and space. To test the robustness of ISSA, we generate synthetic waveforms consisting of 3 separated events such that individual P and S phases arrive at different stations in different order, thus making correct phase picking nearly impossible. Using these very complicated waveforms as the input, the ISSA scans all model space for possible combination of time and location for the existence of seismic sources. The scanning results successfully associate various phases from each event at all stations, and correctly recover the input. To further demonstrate the advantage of ISSA, we apply it to the waveform data collected by a temporary OBS array for the aftershock sequence of an offshore earthquake southwest of Taiwan. The overall signal-to-noise ratio is inadequate for locating small events; and the precise arrival times of P and S phases are difficult to
The kinematics of dense clusters of galaxies. II - The distribution of velocity dispersions
NASA Technical Reports Server (NTRS)
Zabludoff, Ann I.; Geller, Margaret J.; Huchra, John P.; Ramella, Massimo
1993-01-01
From the survey of 31 Abell R above 1 cluster fields within z of 0.02-0.05, we extract 25 dense clusters with velocity dispersions σ above 300 km/s and with number densities exceeding the mean for the Great Wall of galaxies by one deviation. From the CfA Redshift Survey (in preparation), we obtain an approximately volume-limited catalog of 31 groups with velocity dispersions above 100 km/s and with the same number density limit. We combine these well-defined samples to obtain the distribution of cluster velocity dispersions. The group sample enables us to correct for incompleteness in the Abell catalog at low velocity dispersions. The clusters from the Abell cluster fields populate the high dispersion tail. For systems with velocity dispersions above 700 km/s, approximately the median for R = 1 clusters, the group and cluster abundances are consistent. The combined distribution is consistent with cluster X-ray temperature functions.
NASA Astrophysics Data System (ADS)
Smail, Linda
2016-06-01
The basic task of any probabilistic inference system in Bayesian networks is computing the posterior probability distribution for a subset or subsets of random variables, given values or evidence for some other variables from the same Bayesian network. Many methods and algorithms have been developed for exact and approximate inference in Bayesian networks. This work compares two exact inference methods in Bayesian networks, the Lauritzen-Spiegelhalter algorithm and the successive restrictions algorithm, from the perspective of computational efficiency. The two methods were applied for comparison to a Chest Clinic Bayesian Network. Results indicate that the successive restrictions algorithm is more computationally efficient than the Lauritzen-Spiegelhalter algorithm.
Implementation of genetic algorithm for distribution systems loss minimum re-configuration
Nara, K.; Shiose, A.; Kitagawa, M.; Ishihara, T.
1992-08-01
In this paper, a distribution systems loss minimum reconfiguration method by genetic algorithm is proposed. The problem is a complex mixed integer programming problem and is very difficult to solve by a mathematical programming approach. A genetic algorithm (GA) is a search or optimization algorithm based on the mechanics of natural selection and natural genetics. Since GA is suitable to solve combinatorial optimization problems, it can be successfully applied to problems of loss minimum in distribution systems. Numerical examples demonstrate the validity and effectiveness of the proposed methodology.
Research on distributed QOS routing algorithm based on TCP/IP
NASA Astrophysics Data System (ADS)
Liu, Xiaoyue; Chen, Yongqiang
2011-10-01
At present, network environments that follow the IPv4 protocol standard provide only best-effort service to user applications, without regard for service quality. As a result, the packet loss rate is high and applications cannot achieve ideal results. By establishing a mathematical model, this article puts forward a new distributed multi-QoS routing algorithm, gives the realization process of this distributed QoS routing algorithm, and evaluates it using simulation software. The results show that the proposed algorithm can improve the utilization rate of network resources and the service quality of network applications.
Michiels, E.; Bijbels, R.
1984-06-01
Laser microprobe mass analysis (LAMMA) spectra are described for binary oxides belonging to different groups in the periodic table. The positive and negative cluster ion distributions show a strong correlation with the valence electron configuration of the metal in the oxide. The bond dissociation energy of the MO⁺ ion also affects the intensity distributions. 20 references, 10 figures.
Distributive Education Competency-Based Curriculum Models by Occupational Clusters. Final Report.
ERIC Educational Resources Information Center
Davis, Rodney E.; Husted, Stewart W.
To meet the needs of distributive education teachers and students, a project was initiated to develop competency-based curriculum models for marketing and distributive education clusters. The models which were developed incorporate competencies, materials and resources, teaching methodologies/learning activities, and evaluative criteria for the…
Velocity distributions in galaxy clusters - How to combine different normality tests
NASA Astrophysics Data System (ADS)
Sampaio, F. S.; Ribeiro, A. L. B.
2014-02-01
We study 416 galaxy systems with more than 7 members selected from the 2MASS catalog. We applied five well known normality tests to the velocity distributions of these systems to distinguish Gaussian and non-Gaussian clusters. Using controlled samples, we estimated type I and II errors for each test. We verified that individual tests minimize the chances of classifying a Gaussian system as non-Gaussian, while the Fisher’s meta-analysis method, a procedure to combine p-values from several statistical tests, minimizes the chances of classifying a non-Gaussian system as Gaussian. Taking the positive elements of each method and also including a modality analysis of the velocity distribution, we defined objective criteria to split up the sample into Gaussian and non-Gaussian clusters. Our analysis indicates that 50-58% of groups have Gaussian distribution, a lower fraction than that we found using individual normality tests, 71-87%. We also found that some properties of galaxy clusters are significantly different between Gaussian and non-Gaussian systems. For instance, non-Gaussian clusters have larger radii and contain more galaxies than Gaussian clusters. Finally, we discussed the importance of choosing the adequate methodology to classify galaxy systems from their velocity distributions and also the dependence of the results on the criteria used to identify clusters in galaxy surveys.
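Fisher's method, mentioned above as the procedure for combining p-values, can be sketched in a few lines. The statistic is X = -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis when the k tests are independent. The function name is an assumption, and the closed-form survival function below is valid only because the number of degrees of freedom is even; this is not the authors' pipeline, which also adds a modality analysis.

```python
import math

def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p). The chi-square survival function
    for 2k degrees of freedom has the closed form
    exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total
```

In a normality-testing context, each input would be the p-value returned by one test applied to the same velocity distribution. Tests run on the same data are not strictly independent, so the combined value is best read as indicative.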
Distributed Range-Free Localization Algorithm Based on Self-Organizing Maps
NASA Astrophysics Data System (ADS)
Tinh, Pham Doan; Kawai, Makoto
In Mobile Ad-Hoc Networks (MANETs), determining the physical location of nodes (localization) is very important for many network services and protocols. This paper proposes a new Distributed Range-free Localization Algorithm Based on Self-Organizing Maps (SOM) to deal with this issue. Our proposed algorithm utilizes only connectivity information to determine the location of nodes. By utilizing the intersection areas between radio coverage of neighboring nodes, the algorithm has maximized the correlation between neighboring nodes in distributed implementation of SOM and reduced the SOM learning time. An implementation of the algorithm on Network Simulator 2 (NS-2) was done with the mobility consideration to verify the performance of the proposed algorithm. From our intensive simulations, the results show that the proposed scheme achieves very good accuracy in most cases.
NASA Technical Reports Server (NTRS)
Rogers, David
1988-01-01
The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
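A toy version of sparse distributed memory helps make the description concrete. The sketch below follows the usual textbook formulation (random hard-location addresses, activation within a Hamming radius, per-bit counters, and a majority read-out); the class name, the sizes, and the radius are illustrative assumptions and say nothing about the Connection Machine implementation discussed in the report.

```python
import random

class SparseDistributedMemory:
    """Toy sparse distributed memory over n-bit words stored as Python ints."""

    def __init__(self, n_bits=256, n_locations=1000, radius=120, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Hard locations: fixed random addresses, each with one counter per bit.
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        # A location is activated if it lies within the Hamming radius.
        return [i for i, a in enumerate(self.addresses)
                if bin(a ^ address).count("1") <= self.radius]

    def write(self, address, word):
        # Increment counters for 1-bits and decrement for 0-bits.
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                row[b] += 1 if (word >> b) & 1 else -1

    def read(self, address):
        # Sum counters over active locations and threshold at zero.
        sums = [0] * self.n_bits
        for i in self._active(address):
            row = self.counters[i]
            for b in range(self.n_bits):
                sums[b] += row[b]
        return sum(1 << b for b in range(self.n_bits) if sums[b] > 0)
```

Writing a word at its own address and reading it back from a slightly corrupted address returns the original word, which is the associative behaviour the abstract refers to.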
Segmentation of SoHO/EIT Images using fuzzy clustering algorithms
NASA Astrophysics Data System (ADS)
Delouille, V.; Barra, V.; Hochedez, J.
2007-12-01
The study of the variability of the solar corona and the monitoring of its traditional regions (Coronal Holes, Quiet Sun and Active Regions) are of great importance in astrophysics as well as for Space Weather and Space Climate applications. In this presentation, I will propose a multi-channel unsupervised spatially-constrained fuzzy clustering algorithm that automatically segments EUV solar images into Coronal Holes, Quiet Sun and Active Regions. The use of fuzzy logic makes it possible to handle the various noises present in the images and the imprecision in the definition of the above-mentioned regions. The process is fast and automatic. It is applied to SoHO-EIT images taken from January 1997 to May 2005, thus spanning almost a full solar cycle. Results in terms of areas and intensity estimations are consistent with previous knowledge. The method reveals the rotational and other mid-term periodicities in the extracted time series across solar cycle 23. Further, such an approach paves the way to bridging observations between spatially resolved data from imaging telescopes and time series from radiometers. Time series resulting from the segmentation of EUV coronal images can indeed provide an essential component in the process of reconstructing the solar spectrum.
NASA Astrophysics Data System (ADS)
Huang, Zhipeng; Gao, Lihong; Wang, Yangwei; Wang, Fuchi
2016-06-01
The Johnson-Cook (J-C) constitutive model is widely used in finite element simulation, as this model shows the relationship between stress and strain in a simple way. In this paper, a cluster global optimization algorithm is proposed to determine the J-C constitutive model parameters of materials. A set of assumed parameters is used for the accuracy verification of the procedure. The parameters of two materials (401 steel and 823 steel) are determined. Results show that the procedure is reliable and effective. The relative error between the optimized and assumed parameters is no more than 4.02%, and the relative error between the optimized and assumed stress is 0.2% × 10⁻⁵. The J-C constitutive parameters can be determined more precisely and quickly than with the traditional manual procedure. Furthermore, all the parameters can be simultaneously determined using several curves under different experimental conditions. A strategy is also proposed to accurately determine the constitutive parameters.
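For reference, the standard Johnson-Cook flow-stress expression is σ = (A + B εⁿ)(1 + C ln(ε̇/ε̇₀))(1 − T*ᵐ), with T* = (T − T_ref)/(T_melt − T_ref). The sketch below evaluates it; the function name and the default reference strain rate and temperatures are assumptions for illustration and are not values from the paper.

```python
import math

def johnson_cook_stress(strain, strain_rate, temperature, A, B, n, C, m,
                        ref_strain_rate=1.0, T_ref=293.0, T_melt=1800.0):
    """Johnson-Cook flow stress; default reference values are assumed."""
    # Strain-hardening term.
    hardening = A + B * strain ** n
    # Strain-rate sensitivity term.
    rate_term = 1.0 + C * math.log(strain_rate / ref_strain_rate)
    # Thermal-softening term; valid for T_ref <= temperature <= T_melt.
    T_star = (temperature - T_ref) / (T_melt - T_ref)
    thermal_term = 1.0 - T_star ** m
    return hardening * rate_term * thermal_term
```

At the reference strain rate and reference temperature the last two factors equal one, so the expression reduces to A + B εⁿ. A parameter-identification procedure such as the one described above would minimize the mismatch between this function and measured stress-strain curves.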
Fleisch, Markus C.; Maxell, Christopher A.; Kuper, Claudia K.; Brown, Erika T.; Parvin, Bahram; Barcellos-Hoff, Mary-Helen; Costes, Sylvain V.
2006-03-08
Centrosomes are small organelles that organize the mitotic spindle during cell division and are also involved in cell shape and polarity. Within epithelial tumors, such as breast cancer, and some hematological tumors, centrosome abnormalities (CA) are common, occur early in disease etiology, and correlate with chromosomal instability and disease stage. In situ quantification of CA by optical microscopy is hampered by overlap and clustering of these organelles, which appear as focal structures. CA has been frequently associated with Tp53 status in premalignant lesions and tumors. Here we describe an approach to accurately quantify centrosomes in tissue sections and tumors. Considering proliferation and baseline amplification rate, the resulting population-based ratio of centrosomes per nucleus allows the approximation of the proportion of cells with CA. Using this technique we show that 20-30 percent of cells have amplified centrosomes in Tp53 null mammary tumors. Combining fluorescence detection, deconvolution microscopy and a mathematical algorithm applied to a maximum intensity projection we show that this approach is superior to traditional investigator based visual analysis or threshold-based techniques.
NASA Astrophysics Data System (ADS)
Chang, Seongmin; Baek, Sungmin; Kim, Ki-Ook; Cho, Maenghyo
2015-06-01
A system identification method has been proposed to validate finite element models of complex structures using measured modal data. The finite element method is used for the system identification as well as the structural analysis. In perturbation methods, the perturbed system is expressed as a combination of the baseline structure and the related perturbations. The changes in dynamic responses are applied to determine the structural modifications so that the equilibrium may be satisfied in the perturbed system. In practical applications, the dynamic measurements are carried out on a limited number of accessible nodes and associated degrees of freedom. The equilibrium equation is, in principle, expressed in terms of the measured (master, primary) and unmeasured (slave, secondary) degrees of freedom. Only the specified degrees of freedom are included in the equation formulation for identification and the unspecified degrees of freedom are eliminated through the iterative improved reduction scheme. A large number of system parameters are included as the unknown variables in the system identification of large-scale structures. The identification problem with a large number of system parameters requires a large amount of computation time and resources. In the present study, a hierarchical clustering algorithm is applied to reduce the number of system parameters effectively. Numerical examples demonstrate that the proposed method greatly improves the accuracy and efficiency in the inverse problem of identification.
Distributed topology control algorithm for multihop wireless networks
NASA Technical Reports Server (NTRS)
Borbash, S. A.; Jennings, E. H.
2002-01-01
We present a network initialization algorithm for wireless networks with distributed intelligence. Each node (agent) has only local, incomplete knowledge and it must make local decisions to meet a predefined global objective. Our objective is to use power control to establish a topology based on the relative neighborhood graph which has good overall performance in terms of power usage, low interference, and reliability.
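The relative neighborhood graph mentioned above joins two nodes u and v only if no third node w is closer to both of them than they are to each other, that is, unless max(d(u, w), d(v, w)) < d(u, v). The sketch below builds it centrally by brute force. It only illustrates the target topology and is not the distributed, power-controlled algorithm of the report; the function name is hypothetical.

```python
import math
from itertools import combinations

def relative_neighborhood_graph(points):
    """Return the RNG edges (i, j), i < j, for a list of coordinate tuples."""
    def dist(i, j):
        return math.dist(points[i], points[j])

    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = dist(i, j)
        # The edge is dropped if some third node is strictly closer to both ends.
        blocked = any(
            max(dist(i, k), dist(j, k)) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges
```

For three collinear nodes at (0, 0), (1, 0) and (2, 0), the long link between the outer nodes is pruned because the middle node can relay for both, which is the kind of power saving the topology is meant to provide.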
Chandrasekhar equations and computational algorithms for distributed parameter systems
NASA Technical Reports Server (NTRS)
Burns, J. A.; Ito, K.; Powers, R. K.
1984-01-01
The Chandrasekhar equations arising in optimal control problems for linear distributed parameter systems are considered. The equations are derived via approximation theory. This approach is used to obtain existence, uniqueness, and strong differentiability of the solutions and provides the basis for a convergent computation scheme for approximating feedback gain operators. A numerical example is presented to illustrate these ideas.
NASA Astrophysics Data System (ADS)
Hortos, William S.
2008-04-01
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at
DCPVP: distributed clustering protocol using voting and priority for wireless sensor networks.
Hematkhah, Hooman; Kavian, Yousef S
2015-01-01
This paper presents a new clustering protocol for designing energy-efficient hierarchical wireless sensor networks (WSNs) by dividing the distributed sensor network into virtual sensor groups to satisfy the scalability and prolong the network lifetime in large-scale applications. The proposed approach is a distributed clustering protocol called DCPVP, which is based on voting and priority ideas. In the DCPVP protocol, the size of clusters is based on the distance of nodes from the data link such as base station (BS) and the local node density. The cluster heads are elected based on the mean distance from neighbors, the remaining energy, and the number of times a node has already been elected as cluster head. The performance of the DCPVP protocol is compared with some well-known clustering protocols in the literature such as the LEACH, HEED, WCA, GCMRA and TCAC protocols. The simulation results confirm that the prioritizing- and voting-based election ideas decrease the construction time and the energy consumption of the clustering process in sensor networks and consequently improve the lifetime of networks with limited resources and battery-powered nodes in harsh and inaccessible environments. PMID:25763646
Gholami, Mohammad; Brennan, Robert W
2016-01-01
In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency. PMID:26751447
Positions and Spectral Energy Distributions of 41 Star Clusters in M33
NASA Astrophysics Data System (ADS)
Ma, Jun; Zhou, Xu; Chen, Jian-Sheng; Wu, Hong; Jiang, Zhao-Ji; Xue, Sui-Jian; Zhu, Jin
2002-06-01
We present accurate positions and multi-color photometry for 41 star clusters detected by Melnick & D'Odorico in the nearby spiral galaxy M33 as a part of the BATC Color Survey of the sky in 13 intermediate-band filters from 3800 to 10,000 Å. The coordinates of the clusters are found from the HST Guide Star Catalog. By aperture photometry, we obtain the spectral energy distributions of the clusters. Using the relations between the BATC intermediate-band system and UBVRI broadband system, we derive their V magnitudes and B-V colors and find that most of them are blue, which is consistent with previous findings.
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Rajagopal, Ram; Kiremidjian, Anne S.
2012-04-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method for the cases where the post-damage feature distribution is unknown a priori. This algorithm extracts features from structural vibration data using time-series analysis and then declares damage using the change point detection method. The change point detection method asymptotically minimizes detection delay for a given false alarm rate. The conventional method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori. Therefore, our algorithm estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using multiple sets of simulated data and a set of experimental data collected from a four-story steel special moment-resisting frame. Our algorithm was able to estimate the post-damage distribution consistently and resulted in detection delays only a few seconds longer than the delays from the conventional method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
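The conventional sequential test referred to above, in which both pre- and post-damage feature distributions are known, can be illustrated with a CUSUM detector. The sketch below assumes Gaussian features with a known shift in mean and a common standard deviation; these modelling choices, the function name, and the threshold are assumptions, and the code does not implement the paper's on-line estimation of the unknown post-damage distribution.

```python
def cusum_alarm(samples, mu0, mu1, sigma, threshold):
    """Return the index at which damage is declared, or None if never."""
    statistic = 0.0
    for index, x in enumerate(samples):
        # Log-likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2).
        llr = (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2.0)
        # CUSUM recursion: accumulate evidence, never dropping below zero.
        statistic = max(0.0, statistic + llr)
        if statistic > threshold:
            return index
    return None
```

Raising the threshold lowers the false alarm rate at the price of a longer detection delay, which is the trade-off the abstract describes.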
NASA Astrophysics Data System (ADS)
Karras, G.; Kosmidis, C.
2010-10-01
The angular distribution of the fragment ions ejected from the interaction of methyl iodide clusters with 20 fs strong laser pulses is studied by means of a mass spectrometer. Three types of angular distributions, one isotropic and two anisotropic, have been observed and their dependence on the laser intensity has been studied. There is strong evidence that the ions exhibiting anisotropic angular distribution with a maximum in the direction parallel to the laser polarization vector are produced via an electron impact ionization process.
NASA Astrophysics Data System (ADS)
Kauczor, Joanna; Norman, Patrick; Christiansen, Ove; Coriani, Sonia
2013-12-01
We present a reduced-space algorithm for solving the complex (damped) linear response equations required to compute the complex linear response function for the hierarchy of methods: coupled cluster singles, coupled cluster singles and iterative approximate doubles, and coupled cluster singles and doubles. The solver is the keystone element for the development of damped coupled cluster response methods for linear and nonlinear effects in resonant frequency regions.
Statistical sampling of the distribution of uranium deposits using geologic/geographic clusters
Finch, W.I.; Grundy, W.D.; Pierson, C.T.
1992-01-01
The concept of geologic/geographic clusters was developed particularly to study grade and tonnage models for sandstone-type uranium deposits. A cluster is a grouping of mined as well as unmined uranium occurrences within an arbitrary area about 8 km across. A cluster is a statistical sample that will reflect accurately the distribution of uranium in large regions relative to various geologic and geographic features. The example of the Colorado Plateau Uranium Province reveals that only 3 percent of the total number of clusters is in the largest tonnage-size category, greater than 10,000 short tons U3O8, and that 80 percent of the clusters are hosted by Triassic and Jurassic rocks. The distributions of grade and tonnage for clusters in the Powder River Basin show a wide variation; the grade distribution is highly variable, reflecting a difference between roll-front deposits and concretionary deposits, and the Basin contains about half the number in the greater-than-10,000 tonnage-size class as does the Colorado Plateau, even though it is much smaller. The grade and tonnage models should prove useful in finding the richest and largest uranium deposits. © 1992 Oxford University Press.
Integrating Parallel and Distributed Data Mining Algorithms into the NASA Earth Exchange (NEX)
NASA Astrophysics Data System (ADS)
Oza, N.; Kumar, V.; Nemani, R. R.; Boriah, S.; Das, K.; Khandelwal, A.; Matthews, B.; Michaelis, A.; Mithal, V.; Nayak, G.; Votava, P.
2014-12-01
There is an urgent need in global climate change science for efficient model and/or data analysis algorithms that can be deployed in distributed and parallel environments because of the proliferation of large and heterogeneous data sets. Members of our team from NASA Ames Research Center and the University of Minnesota have been developing new distributed data mining algorithms and developing distributed versions of algorithms originally developed to run on a single machine. We are integrating these algorithms together with the Terrestrial Observation and Prediction System (TOPS), an ecological nowcasting and forecasting system, on the NASA Earth Exchange (NEX). We are also developing a framework under which data mining algorithm developers can make their algorithms available for use by scientists in our system, model developers can set up their models to run within our system and make their results available, and data source providers can make their data available, all with as little effort as possible. We demonstrate the substantial time savings and new results that can be derived through this framework by demonstrating an improvement to the Burned Area (BA) data product on a global scale. Our improvement was derived through development and implementation on NEX of a novel spatiotemporal time series change detection algorithm which will also be presented.
NASA Astrophysics Data System (ADS)
Crawford, Steven M.; Wirth, Gregory D.; Bershady, Matthew A.
2014-05-01
We analyze the spatial and velocity distributions of confirmed members in five massive clusters of galaxies at intermediate redshift (0.5 < z < 0.9) to investigate the physical processes driving galaxy evolution. Based on spectral classifications derived from broad- and narrow-band photometry, we define four distinct galaxy populations representing different evolutionary stages: red sequence (RS) galaxies, blue cloud (BC) galaxies, green valley (GV) galaxies, and luminous compact blue galaxies (LCBGs). For each galaxy class, we derive the projected spatial and velocity distribution and characterize the degree of subclustering. We find that RS, BC, and GV galaxies in these clusters have similar velocity distributions, but that BC and GV galaxies tend to avoid the core of the two z ≈ 0.55 clusters. GV galaxies exhibit subclustering properties similar to RS galaxies, but their radial velocity distribution is significantly platykurtic compared to the RS galaxies. The absence of GV galaxies in the cluster cores may explain their somewhat prolonged star-formation history. The LCBGs appear to have recently fallen into the cluster based on their larger velocity dispersion, absence from the cores of the clusters, and different radial velocity distribution than the RS galaxies. Both LCBG and BC galaxies show a high degree of subclustering on the smallest scales, leading us to conclude that star formation is likely triggered by galaxy-galaxy interactions during infall into the cluster. Based in part on data obtained at the W. M. Keck Observatory, which is operated as a scientific partnership among the California Institute of Technology, the University of California and the National Aeronautics and Space Administration. The Observatory was made possible by the generous financial support of the W. M. Keck Foundation.
A multi-wavelength survey of AGN in massive clusters: AGN distribution and host galaxy properties
NASA Astrophysics Data System (ADS)
Klesman, Alison J.; Sarajedini, Vicki L.
2014-07-01
We investigate the effect of environment on the presence and fuelling of active galactic nuclei (AGN) by identifying galaxies hosting AGN in massive galaxy clusters and the fields around them. We have identified AGN candidates via optical variability (178), X-ray emission (74), and mid-IR SEDs (64) in multi-wavelength surveys covering regions centred on 12 galaxy clusters at redshifts 0.5 < z < 0.9. In this paper, we present the radial distribution of AGN in clusters to examine how local environment affects the presence of an AGN and its host galaxy. While distributions vary from cluster to cluster, we find that the radial distribution of AGN generally differs from that of normal galaxies. X-ray-selected AGN candidates appear to be more centrally concentrated than normal galaxies in the inner 20 per cent of the virial radius, while becoming less centrally concentrated in the outer regions. Mid-IR-selected AGN are less centrally concentrated overall. Optical variables have a similar distribution to normal galaxies in the inner regions, then become somewhat less centrally concentrated farther from the cluster centre. The host galaxies of AGN reveal a different colour distribution than normal galaxies, with many AGN hosts displaying galaxy colours in the 'green valley' between the red sequence and blue star-forming normal galaxies. This result is similar to those found in field galaxy studies. The colour distribution of AGN hosts is more pronounced in disturbed clusters where minor mergers, galaxy harassment, and interactions with cluster substructure may continue to prompt star formation in the hosts. Among normal galaxies, we find that galaxy colours become generally bluer with increasing cluster radius, as is expected. However, we find no relationship between host galaxy colour and cluster radius among AGN hosts, which may indicate that processes related to the accreting supermassive black hole have a greater impact on the star-forming properties of the host galaxy
Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.
Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan
2015-11-01
Wireless sensor networks are engaged in various data gathering applications. The major bottleneck in wireless data gathering systems is the finite energy of sensor nodes. By conserving the on-board energy, the life span of a wireless sensor network can be considerably extended. Since data communication is the dominant energy-consuming activity of a wireless sensor network, data reduction is an effective way to conserve nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce the data communications. Forming clusters of nodes with similar data is an effective way to exploit spatial correlation among neighboring sensors. Sending only a subset of the data and estimating the rest from that subset is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data similar iso-clusters with minimal communication overhead. The intra-cluster communication is reduced using an adaptive-normalized least mean squares based dual prediction framework. The cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both the intra-cluster and the inter-cluster communications, with the optimal data accuracy of collected data. PMID:26343165
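A dual prediction scheme of the kind mentioned above can be sketched as follows: the sensor and the cluster head run identical normalized least mean squares (NLMS) predictors, and the sensor transmits a reading only when the shared prediction misses it by more than a tolerance. This is a simplified single-node simulation under assumed parameters (filter order, step size, tolerance) and a hypothetical function name; it does not reproduce the paper's adaptive variant or its clustering and compression stages.

```python
import numpy as np

def dual_prediction_nlms(readings, order=3, mu=0.5, eps=1e-6, tol=0.1):
    """Simulate NLMS dual prediction; returns (reconstruction, n_transmitted)."""
    weights = np.zeros(order)
    history = np.zeros(order)  # most recent values as known to BOTH sides
    reconstructed = []
    sent = 0
    for x in readings:
        prediction = float(weights @ history)
        error = x - prediction
        if abs(error) > tol:
            # Prediction missed: transmit the reading, and let both sides
            # apply the same NLMS weight update with the true value.
            weights = weights + mu * error * history / (eps + history @ history)
            value = x
            sent += 1
        else:
            # Prediction is good enough: nothing is sent, and both sides
            # substitute the predicted value.
            value = prediction
        reconstructed.append(value)
        history = np.roll(history, 1)
        history[0] = value
    return np.array(reconstructed), sent
```

By construction the reconstruction at the cluster head never deviates from the true reading by more than the tolerance, while slowly varying signals need far fewer transmissions than samples.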
Optical algorithm for calculating the quantity distribution of fiber assembly.
Wu, Meiqin; Wang, Fumei
2016-09-01
A modification of the two-flux model of Kubelka-Munk was proposed for the description of light propagating through a fiber-air mixture medium, which simplified fibers' internal reflection as a part of the scattering on the total fiber path length. A series of systematic experiments demonstrated a higher consistency with the reference quantity distribution than did the common Lambert law applied to the fibrogram used in the textile industry. Its application in the fibrogram for measuring the cotton fiber's length was demonstrated to be good, extending its applicability to the wool fiber, the length of which is harder to measure than that of the cotton fiber. PMID:27607296
An Algorithm for Testing Unidimensionality and Clustering Items in Rasch Measurement
ERIC Educational Resources Information Center
Debelak, Rudolf; Arendasy, Martin
2012-01-01
A new approach to identify item clusters fitting the Rasch model is described and evaluated using simulated and real data. The proposed method is based on hierarchical cluster analysis and constructs clusters of items that show a good fit to the Rasch model. It thus gives an estimate of the number of independent scales satisfying the postulates of…
Distribution of sea anemones (Cnidaria, Actiniaria) in Korea analyzed by environmental clustering
Cha, H.-R.; Buddemeier, R.W.; Fautin, D.G.; Sandhei, P.
2004-01-01
Using environmental data and the geospatial clustering tools LOICZView and DISCO, we empirically tested the postulated existence and boundaries of four biogeographic regions in the southern part of the Korean peninsula. Environmental variables used included wind speed, sea surface temperature (SST), salinity, tidal amplitude, and the chlorophyll spectral signal. Our analysis confirmed the existence of four biogeographic regions, but the details of the borders between them differ from those previously postulated. Specimen-level distribution records of intertidal sea anemones were mapped; their distribution relative to the environmental data supported the importance of the environmental parameters we selected in defining suitable habitats. From the geographic coincidence between anemone distribution and the clusters based on environmental variables, we infer that geospatial clustering has the power to delimit ranges for marine organisms within relatively small geographical areas.
The shape of the mass distribution in M31 from its globular cluster system
NASA Technical Reports Server (NTRS)
Kent, Stephen M.; Huchra, John P.; Stauffer, John
1989-01-01
The velocity dispersion and rotation velocity of the M31 globular cluster system depend on the relative division of mass between the flat disk and a spherically symmetric halo. Using the tensor virial theorem, it is shown in detail how the mass ratio can be constrained. Radial velocities have been collected for 149 globular clusters in M31. With no assumptions about the isotropy of the velocity distribution, the globular cluster kinematics are consistent with the mass distribution inferred by the rotation curve but otherwise place no strong constraints on the relative division of the mass. If the velocity distribution is isotropic, models with the disk mass ranging between 1/2 and 1 times its maximum possible value are marginally favored.
Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X
2015-01-01
Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of the network. Recently, the emergence of energy harvesting techniques has brought with it the expectation of overcoming this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an Energy Neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers to transmit data to the base station (BS) cooperatively by a multi-hop communication method. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we give the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is mathematically derived using convex optimization techniques while the network information gathering is maximal. Simulation results show that our protocol can achieve perpetual network operation, so that consistent data delivery is guaranteed. In addition, substantial improvements in network throughput are also achieved compared with the well-known traditional clustering protocol LEACH and recent energy harvesting aware clustering protocols. PMID:26712764
Long, Fuhui; Peng, Hanchuan; Sudar, Damir; Levievre, Sophie A.; Knowles, David W.
2006-09-05
Background: The distribution of the chromatin-associated proteins plays a key role in directing nuclear function. Previously, we developed an image-based method to quantify the nuclear distributions of proteins and showed that these distributions depended on the phenotype of human mammary epithelial cells. Here we describe a method that creates a hierarchical tree of the given cell phenotypes and calculates the statistical significance between them, based on the clustering analysis of nuclear protein distributions. Results: Nuclear distributions of nuclear mitotic apparatus protein were previously obtained for non-neoplastic S1 and malignant T4-2 human mammary epithelial cells cultured for up to 12 days. Cell phenotype was defined as S1 or T4-2 and the number of days in culture. A probabilistic ensemble approach was used to define a set of consensus clusters from the results of multiple traditional cluster analysis techniques applied to the nuclear distribution data. Cluster histograms were constructed to show how cells in any one phenotype were distributed across the consensus clusters. Grouping various phenotypes allowed us to build phenotype trees and calculate the statistical difference between each group. The results showed that non-neoplastic S1 cells could be distinguished from malignant T4-2 cells with 94.19 percent accuracy; that proliferating S1 cells could be distinguished from differentiated S1 cells with 92.86 percent accuracy; and showed no significant difference between the various phenotypes of T4-2 cells corresponding to increasing tumor sizes. Conclusion: This work presents a cluster analysis method that can identify significant cell phenotypes, based on the nuclear distribution of specific proteins, with high accuracy.
What the Spatial Distribution of Stars tells us about Star Formation and Massive Cluster Formation
NASA Astrophysics Data System (ADS)
Bressert, Eli; Bastian, N.; Testi, L.; Patience, J.; Longmore, S.
2012-01-01
We present a dissertation study on two recent results regarding the clustering properties of young stars. First, we discuss a global study of young stellar object (YSO) surface densities in star forming regions based on a comprehensive collection of Spitzer Space Telescope surveys, which encompasses nearly all star formation in the solar neighbourhood. It is shown that the distribution of YSO surface densities is smooth, being adequately described by a lognormal function from a few to 10³ YSOs pc⁻², with a peak at 22 YSOs pc⁻² and a dispersion of 0.85. We find no evidence for multiple discrete modes of star formation (e.g. clustered and distributed), and find that not all stars form in clusters. A Herschel Space Observatory study confirms the YSO surface density results by observing and analyzing the prestellar core population in several star forming regions. Secondly, we propose that bound stellar clusters primarily form from dense clouds having escape speeds greater than the sound speed in photo-ionized gas. A list of giant molecular clumps with masses >10³ M⊙ that have escape speeds greater than the sound speed in photo-ionized plasma is compiled from the Bolocam Galactic Plane Survey. In these clumps, radiative feedback in the form of gas ionization is bottled up, enabling star formation to proceed to sufficiently high efficiency so that the resulting star cluster remains bound even after gas removal. We present over ten candidates that will most likely form >10³ M⊙ star clusters, two of which are comparable to NGC 3603 (>10⁴ M⊙). These candidates provide an outlook on the next generation of star clusters in the Milky Way and clues to the initial conditions of massive cluster formation.
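As a rough illustration of the escape-speed criterion in this abstract, the sketch below compares a clump's escape speed, v_esc = sqrt(2GM/R), with the sound speed of photo-ionized gas. The 10 km/s threshold, the function names, and the example masses and radii are assumptions chosen for illustration; they are not values taken from the study.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
PC = 3.086e16      # parsec, m

# Assumed sound speed of photo-ionized (~10^4 K) gas, in km/s (illustrative).
C_ION_KMS = 10.0


def escape_speed_kms(mass_msun, radius_pc):
    """Escape speed from the edge of a clump of given mass and radius, in km/s."""
    v_ms = math.sqrt(2.0 * G * mass_msun * M_SUN / (radius_pc * PC))
    return v_ms / 1.0e3


def exceeds_ionized_sound_speed(mass_msun, radius_pc, c_ion_kms=C_ION_KMS):
    """True if the clump's escape speed is above the assumed ionized sound speed."""
    return escape_speed_kms(mass_msun, radius_pc) > c_ion_kms


if __name__ == "__main__":
    # Hypothetical clumps: (mass in solar masses, radius in parsec).
    for mass, radius in [(1.0e5, 1.0), (1.0e3, 1.0)]:
        v = escape_speed_kms(mass, radius)
        print(mass, radius, round(v, 1), exceeds_ionized_sound_speed(mass, radius))
```

Under these assumed numbers, a compact 10⁵ M⊙ clump passes the test while a 10³ M⊙ clump of the same radius does not, which is the sense in which only dense, massive clumps can retain ionized gas.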
THE DISTRIBUTION OF DARK MATTER OVER THREE DECADES IN RADIUS IN THE LENSING CLUSTER ABELL 611
Newman, Andrew B.; Ellis, Richard S.; Treu, Tommaso; Marshall, Philip J.; Sand, David J.; Richard, Johan; Capak, Peter; Miyazaki, Satoshi
2009-12-01
We present a detailed analysis of the baryonic and dark matter distribution in the lensing cluster Abell 611 (z = 0.288), with the goal of determining the dark matter profile over an unprecedented range of cluster-centric distance. By combining three complementary probes of the mass distribution, weak lensing from multi-color Subaru imaging, strong lensing constraints based on the identification of multiply imaged sources in Hubble Space Telescope images, and resolved stellar velocity dispersion measures for the brightest cluster galaxy secured using the Keck telescope, we extend the methodology for separating the dark and baryonic mass components introduced by Sand et al. Our resulting dark matter profile samples the cluster from ~3 kpc to 3.25 Mpc, thereby providing an excellent basis for comparisons with recent numerical models. We demonstrate that only by combining our three observational techniques can degeneracies in constraining the form of the dark matter profile be broken on scales crucial for detailed comparisons with numerical simulations. Our analysis reveals that a simple Navarro-Frenk-White (NFW) profile is an unacceptable fit to our data. We confirm earlier claims based on less extensive analyses of other clusters that the inner dark matter profile deviates significantly from the NFW form and find an inner logarithmic slope β flatter than 0.3 (68%; where ρ_DM ∝ r^(-β) at small radii). In order to reconcile our data with cluster formation in a ΛCDM cosmology, we speculate that it may be necessary to revise our understanding of the nature of baryon-dark matter interactions in cluster cores. Comprehensive weak and strong lensing data, when coupled with kinematic information on the brightest cluster galaxy, can readily be applied to a larger sample of clusters to test the universality of these results.
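For orientation, an inner slope β that is free to differ from the NFW value is commonly written with a generalized NFW profile. The form below is the standard parameterization and is consistent with the stated small-radius behaviour; that it is the exact form fitted in this analysis is an assumption, and the symbols ρ_s and r_s (characteristic density and scale radius) are introduced here for illustration.

```latex
\rho_{\rm DM}(r) = \frac{\rho_s}{\left(r/r_s\right)^{\beta}\left(1 + r/r_s\right)^{3-\beta}},
\qquad
\rho_{\rm DM} \propto r^{-\beta} \ \ (r \ll r_s),
\qquad
\rho_{\rm DM} \propto r^{-3} \ \ (r \gg r_s).
```

Setting β = 1 recovers the NFW profile, so a measured β < 0.3 corresponds to a markedly shallower central cusp.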
Distribution of histaminergic neuronal cluster in the rat and mouse hypothalamus.
Moriwaki, Chinatsu; Chiba, Seiichi; Wei, Huixing; Aosa, Taishi; Kitamura, Hirokazu; Ina, Keisuke; Shibata, Hirotaka; Fujikura, Yoshihisa
2015-10-01
Histidine decarboxylase (HDC) catalyzes the biosynthesis of histamine from L-histidine and is expressed throughout the mammalian nervous system by histaminergic neurons. Histaminergic neurons arise in the posterior mesencephalon during the early embryonic period and gradually develop into two histaminergic substreams around the lateral area of the posterior hypothalamus and the more anterior peri-cerebral aqueduct area before finally forming an adult-like pattern comprising five neuronal clusters, E1, E2, E3, E4, and E5, at the postnatal stage. This distribution of histaminergic neuronal clusters in the rat hypothalamus appears to be a consequence of neuronal development and reflects the functional differentiation within each neuronal cluster. However, the close linkage between the locations of histaminergic neuronal clusters and their physiological functions has yet to be fully elucidated because of the sparse information regarding the location and orientation of each histaminergic neuronal cluster in the hypothalamus of rats and mice. To clarify the distribution of the five histaminergic neuronal clusters, we performed an immunohistochemical study using the anti-HDC antibody on serial sections of the rat hypothalamus according to the brain maps of rat and mouse. Our results confirmed that the HDC-immunoreactive (HDCi) neuronal clusters in the hypothalamus of rats and mice are observed in the ventrolateral part of the most posterior hypothalamus (E1), ventrolateral part of the posterior hypothalamus (E2), ventromedial part from the medial to the posterior hypothalamus (E3), periventricular part from the anterior to the medial hypothalamus (E4), and diffusely extended part of the more dorsal and almost entire hypothalamus (E5). Stereological estimation of the total number of HDCi neurons in each cluster revealed larger numbers in the rat than in the mouse. The characterization of histaminergic neuronal clusters in the hypothalamus of rats and
Three-Dimensional Distribution of Ryanodine Receptor Clusters in Cardiac Myocytes
Chen-Izu, Ye; McCulle, Stacey L.; Ward, Chris W.; Soeller, Christian; Allen, Bryan M.; Rabang, Cal; Cannell, Mark B.; Balke, C. William; Izu, Leighton T.
2006-01-01
The clustering of ryanodine receptors (RyR2) into functional Ca²⁺ release units is central to current models for cardiac excitation-contraction (E-C) coupling. Using immunolabeling and confocal microscopy, we have analyzed the distribution of RyR2 clusters in rat ventricular and atrial myocytes. The resolution of the three-dimensional structure was improved by a novel transverse sectioning method as well as digital deconvolution. In contrast to earlier reports, the mean RyR2 cluster transverse spacing was measured as 1.05 μm in ventricular myocytes and estimated as 0.97 μm in atrial myocytes. Intercalated RyR2 clusters were found interspersed between the Z-disks on the cell periphery but absent in the interior, forming double rows flanking the local Z-disks on the surface. The longitudinal spacing between the adjacent rows of RyR2 clusters on the Z-disks was measured to have a mean value of 1.87 μm in ventricular and 1.69 μm in atrial myocytes. The measured RyR2 cluster distribution is compatible with models of Ca²⁺ wave generation. The size of the typical RyR2 cluster was close to 250 nm, and this suggests that ∼100 RyR2s might be present in a cluster. The importance of cluster size and three-dimensional spacing for current E-C coupling models is discussed. PMID:16603500
Dynamic algorithm for correlation noise estimation in distributed video coding
NASA Astrophysics Data System (ADS)
Thambu, Kuganeswaran; Fernando, Xavier; Guan, Ling
2010-01-01
Low complexity encoders at the expense of high complexity decoders are advantageous in wireless video sensor networks. Distributed video coding (DVC) achieves this complexity balance, with the receivers computing side information (SI) by interpolating the key frames. Side information is modeled as a noisy version of the input video frame. In practice, correlation noise estimation at the receiver is a complex problem, and currently the noise is estimated based on a residual variance between pixels of the key frames. The estimated (fixed) variance is then used to calculate the bit-metric values. In this paper, we introduce a new variance estimation technique that relies on the bit pattern of each pixel and is dynamically calculated over the entire motion environment, which helps to calculate the soft-value information required by the decoder. Our results show that the proposed bit-based dynamic variance estimation significantly improves the peak signal-to-noise ratio (PSNR) performance.
Spectral distributed Lagrange multiplier method: algorithm and benchmark tests
NASA Astrophysics Data System (ADS)
Dong, Suchuan; Liu, Dong; Maxey, Martin R.; Karniadakis, George Em
2004-04-01
We extend the formulation of the distributed Lagrange multiplier (DLM) approach for particulate flows to high-order methods within the spectral/hp element framework. We implement the rigid-body motion constraint inside the particle via a penalty method. The high-order DLM method demonstrates a spectral convergence rate, i.e. discretization errors decrease exponentially as the order of the spectral polynomials increases. We provide detailed comparisons between the spectral DLM method, direct numerical simulations, and the force coupling method for a number of 2D and 3D benchmark flow problems. We also validate the spectral DLM method with available experimental data for a transient problem. The new DLM method can potentially be very effective in many-moving-body problems, where a smaller number of grid points is required in comparison with low-order methods.
Distributed Storage Algorithm for Geospatial Image Data Based on Data Access Patterns
Pan, Shaoming; Li, Yongkai; Xu, Zhengquan; Chong, Yanwen
2015-01-01
Declustering techniques are widely used in distributed environments to reduce query response time through parallel I/O by splitting large files into several small blocks and then distributing those blocks among multiple storage nodes. Unfortunately, however, many small geospatial image data files cannot be further split for distributed storage. In this paper, we propose a complete theoretical system for the distributed storage of small geospatial image data files based on mining the access patterns of geospatial image data using their historical access log information. First, an algorithm is developed to construct an access correlation matrix based on the analysis of the log information, which reveals the patterns of access to the geospatial image data. Then, a practical heuristic algorithm is developed to determine a reasonable solution based on the access correlation matrix. Finally, a number of comparative experiments are presented, demonstrating that our algorithm displays a higher total parallel access probability than those of other algorithms by approximately 10–15% and that the performance can be further improved by more than 20% by simultaneously applying a copy storage strategy. These experiments show that the algorithm can be applied in distributed environments to help realize parallel I/O and thereby improve system performance. PMID:26181628
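The two steps described in this abstract (building an access correlation matrix from logs, then placing files heuristically) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the log is assumed to be already grouped into sessions of files requested together, and the greedy rule and the names `access_correlation` and `greedy_placement` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def access_correlation(sessions):
    """Count how often each unordered pair of files occurs in the same session."""
    counts = defaultdict(int)
    for files in sessions:
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return dict(counts)


def greedy_placement(sessions, n_nodes):
    """Assign each file to the node holding the files it is least co-accessed with.

    Files that are often requested together end up on different nodes,
    so they can be read in parallel. Ties go to the emptier node.
    """
    corr = access_correlation(sessions)
    files = sorted({f for s in sessions for f in s})
    nodes = [[] for _ in range(n_nodes)]
    for f in files:
        def cost(node):
            # Total co-access count between f and the files already on this node.
            return sum(corr.get(tuple(sorted((f, g))), 0) for g in node)

        best = min(range(n_nodes), key=lambda i: (cost(nodes[i]), len(nodes[i]), i))
        nodes[best].append(f)
    return nodes


if __name__ == "__main__":
    # Hypothetical access log: each inner list is one query session.
    log = [["a", "b"], ["a", "b"], ["c", "d"]]
    print(access_correlation(log))
    print(greedy_placement(log, 2))
```

The greedy rule is only one simple way to exploit the matrix; the abstract does not say which heuristic the authors use.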
A SIMPLE PHYSICAL MODEL FOR THE GAS DISTRIBUTION IN GALAXY CLUSTERS
Patej, Anna; Loeb, Abraham
2015-01-01
The dominant baryonic component of galaxy clusters is hot gas whose distribution is commonly probed through X-ray emission arising from thermal bremsstrahlung. The density profile thus obtained has been traditionally modeled with a β-profile, a simple function with only three parameters. However, this model is known to be insufficient for characterizing the range of cluster gas distributions and attempts to rectify this shortcoming typically introduce additional parameters to increase the fitting flexibility. We use cosmological and physical considerations to obtain a family of profiles for the gas with fewer parameters than the β-model but which better accounts for observed gas profiles over wide radial intervals.
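For reference, the β-profile mentioned in this abstract is usually written as below. The symbols ρ_0 (central density) and r_c (core radius) are the conventional names for two of its three parameters; the abstract itself does not spell them out.

```latex
\rho_{\rm gas}(r) = \rho_0 \left[ 1 + \left( \frac{r}{r_c} \right)^{2} \right]^{-3\beta/2}
```

The three free parameters are ρ_0, r_c and β; at r ≫ r_c the profile falls off as r^{-3β}.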
A swarm intelligence based memetic algorithm for task allocation in distributed systems
NASA Astrophysics Data System (ADS)
Sarvizadeh, Raheleh; Haghi Kashani, Mostafa
2011-12-01
This paper proposes a Swarm Intelligence based Memetic algorithm for Task Allocation and scheduling in distributed systems. Task scheduling in distributed systems is known to be an NP-complete problem. Hence, many genetic algorithms have been proposed for searching for optimal solutions in the entire solution space. However, these existing approaches scan the entire solution space without considering techniques that can reduce the complexity of the optimization, and the excessive time spent on scheduling is considered their main shortcoming. Therefore, in this paper a memetic algorithm is used to cope with this shortcoming. To balance load efficiently, Bee Colony Optimization (BCO) is applied as the local search in the proposed memetic algorithm. Extended experimental results demonstrate that the proposed method outperforms the existing GA-based method in terms of CPU utilization.
A swarm intelligence based memetic algorithm for task allocation in distributed systems
NASA Astrophysics Data System (ADS)
Sarvizadeh, Raheleh; Haghi Kashani, Mostafa
2012-01-01
This paper proposes a Swarm Intelligence based Memetic algorithm for Task Allocation and scheduling in distributed systems. Task scheduling in distributed systems is known to be an NP-complete problem. Hence, many genetic algorithms have been proposed for searching for optimal solutions in the entire solution space. However, these existing approaches scan the entire solution space without considering techniques that can reduce the complexity of the optimization, and the excessive time spent on scheduling is considered their main shortcoming. Therefore, in this paper a memetic algorithm is used to cope with this shortcoming. To balance load efficiently, Bee Colony Optimization (BCO) is applied as the local search in the proposed memetic algorithm. Extended experimental results demonstrate that the proposed method outperforms the existing GA-based method in terms of CPU utilization.
Efficient implementation of Jacobi algorithms and Jacobi sets on distributed memory architectures
Eberlein, P.J.; Park, H.
1990-04-01
One-sided methods for implementing Jacobi diagonalization algorithms have been recently proposed for both distributed memory and vector machines. These methods are naturally well suited to distributed memory and vector architectures because of their inherent parallelism and their abundance of vector operations. Also, one-sided methods require substantially less message passing than the two-sided methods, and thus can achieve higher efficiency. The authors describe in detail the use of the one-sided Jacobi rotation as opposed to the rotation used in the Hestenes algorithm; they perceive the difference to have been widely misunderstood. Furthermore, the one-sided algorithm generalizes to other problems such as the nonsymmetric eigenvalue problem while the Hestenes algorithm does not. The authors discuss two new implementations for Jacobi sets for a ring connected array of processors and show their isomorphism to the round-robin ordering.
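The round-robin ordering mentioned at the end of this abstract can be illustrated with the classic circle method, which schedules all index pairs of a sweep into rounds of disjoint pairs so that each round's rotations can run in parallel. This is a generic sketch of round-robin ordering only; it is not the two new Jacobi sets proposed by the authors, and the function name is hypothetical.

```python
def round_robin_sweep(n):
    """Return n - 1 rounds of n // 2 disjoint index pairs (n must be even).

    Over one sweep every pair (i, j) with i < j appears exactly once.
    """
    if n % 2 != 0:
        raise ValueError("n must be even")
    ring = list(range(n))
    rounds = []
    for _ in range(n - 1):
        # Pair the k-th entry from the front with the k-th entry from the back.
        pairs = [tuple(sorted((ring[k], ring[n - 1 - k]))) for k in range(n // 2)]
        rounds.append(pairs)
        # Keep the first entry fixed and rotate the remaining entries by one.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return rounds


if __name__ == "__main__":
    for step, pairs in enumerate(round_robin_sweep(4)):
        print(step, pairs)
```

Because the pairs within a round share no index, the corresponding column rotations touch disjoint data, which is what makes the ordering attractive on a ring of processors.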
Distributed learning automata-based algorithm for community detection in complex networks
NASA Astrophysics Data System (ADS)
Khomami, Mohammad Mehdi Daliri; Rezvanian, Alireza; Meybodi, Mohammad Reza
2016-03-01
Community structure is an important and universal topological property of many complex networks such as social and information networks. The detection of communities in a network is a significant technique for understanding the structure and function of networks. In this paper, we propose an algorithm based on distributed learning automata for community detection (DLACD) in complex networks. In the proposed algorithm, each vertex of the network is equipped with a learning automaton. Through cooperation among the network of learning automata and updates to the action probabilities of each automaton, the algorithm interactively tries to identify high-density local communities. The performance of the proposed algorithm is investigated through a number of simulations on popular synthetic and real networks. Experimental results in comparison with popular community detection algorithms such as walk trap, Danon greedy optimization, fuzzy community detection, multi-resolution community detection and label propagation demonstrate the superiority of DLACD in terms of modularity, NMI, performance, min-max-cut and coverage.
NASA Astrophysics Data System (ADS)
Barkat, B.; Abed-Meraim, K.
2004-12-01
We propose novel algorithms to select and extract separately all the components, using the time-frequency distribution (TFD), of a given multicomponent frequency-modulated (FM) signal. These algorithms do not use any a priori information about the various components. However, their performances highly depend on the cross-terms suppression ability and high time-frequency resolution of the considered TFD. To illustrate the usefulness of the proposed algorithms, we applied them for the estimation of the instantaneous frequency coefficients of a multicomponent signal and the results are compared with those of the higher-order ambiguity function (HAF) algorithm. Monte Carlo simulation results show the superiority of the proposed algorithms over the HAF.
Sumithra, Subramaniam; Victoire, T. Aruldoss Albert
2015-01-01
Due to the large dimension of clusters and the increasing size of sensor nodes, finding the optimal route and cluster for large wireless sensor networks (WSN) is highly complex and cumbersome. This paper proposes a new method to determine a reasonably better solution of the clustering and routing problem, with the primary concern being efficient energy consumption of the sensor nodes in order to extend network lifetime. The proposed method is based on the Differential Evolution (DE) algorithm with an improvised search operator called Diversified Vicinity Procedure (DVP), which models a trade-off between energy consumption of the cluster heads and delay in forwarding the data packets. The route obtained by the proposed method from all the gateways to the base station is comparatively shorter in overall distance, with fewer data forwards. Extensive numerical experiments demonstrate the superiority of the proposed method in managing energy consumption of the WSN, and the results are compared with the other algorithms reported in the literature. PMID:26516635
NASA Astrophysics Data System (ADS)
Chaudhury, Pinaki; Bhattacharyya, S. P.
1999-03-01
It is demonstrated that a Genetic Algorithm in a floating point realisation can be a viable tool for locating critical points on a multi-dimensional potential energy surface (PES). For small clusters, the standard algorithm works well. For bigger ones, the search for the global minimum becomes more efficient when used in conjunction with coordinate stretching and partitioning of the strings into a core part and an outer part which are alternately optimized. The method works with equal facility for locating minima, local as well as global, and saddle points (SP) of arbitrary orders. The search for minima requires computation of the gradient vector, but not the Hessian, while that for SPs requires the information of the gradient vector and the Hessian, the latter only at some specific points on the path. The method proposed is tested on (i) a model 2-d PES, (ii) argon clusters (Ar₄-Ar₃₀) in which argon atoms interact via the Lennard-Jones potential, and (iii) ArₘX, m = 12 clusters where X may be a neutral atom or a cation. We also explore whether the method could be used to construct what may be called a stochastic representation of the reaction path on a given PES with reference to conformational changes in Arₙ clusters.
An algorithm to calculate the beam momentum distribution from flying wire profiles
Wang, X.Q.
1989-07-27
Horizontal flying wire measurements give beam profiles from which information about the beam momentum distribution and betatron distribution can be extracted. When calculating these beam characteristics in the past, the beam has been assumed to be Gaussian for simplicity. For beam profiles which may not be Gaussian, an algorithm to obtain the general beam momentum distribution is developed by applying the Fourier transform to the beam profiles. Since the profile is the convolution of the momentum distribution and the betatron distribution, using a Fourier transform method makes calculations easier. 6 figs.
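The reason the Fourier transform simplifies the calculation is the convolution theorem. As a sketch, write the measured profile as P, the momentum contribution as f_p and the betatron contribution as f_β, all expressed in the same transverse coordinate x. These symbols are introduced here for illustration, and the abstract does not say how f_β is obtained.

```latex
P(x) = (f_p * f_\beta)(x) = \int_{-\infty}^{\infty} f_p(x - x')\, f_\beta(x')\, dx'
\quad\Longrightarrow\quad
\tilde{P}(k) = \tilde{f}_p(k)\, \tilde{f}_\beta(k),
\qquad
\tilde{f}_p(k) = \frac{\tilde{P}(k)}{\tilde{f}_\beta(k)} \ \ \text{where } \tilde{f}_\beta(k) \neq 0 .
```

An inverse transform of the last expression then gives f_p without assuming a Gaussian shape for either distribution.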
Luminous red galaxies in clusters: central occupation, spatial distributions and miscentring
NASA Astrophysics Data System (ADS)
Hoshino, Hanako; Leauthaud, Alexie; Lackner, Claire; Hikage, Chiaki; Rozo, Eduardo; Rykoff, Eli; Mandelbaum, Rachel; More, Surhud; More, Anupreeta; Saito, Shun; Vulcani, Benedetta
2015-09-01
Luminous red galaxies (LRG) from the Sloan Digital Sky Survey are among the best understood samples of galaxies and are employed in a broad range of cosmological studies. In this paper, we study how LRGs occupy massive haloes via counts in clusters and reveal several unexpected trends. Using the red-sequence Matched-filter Probabilistic Percolation (redMaPPer) cluster catalogue, we derive the central occupation of LRGs as a function of richness. We show that clusters contain a significantly lower fraction of central LRGs than predicted from the two-point correlation function. At halo masses of 10^14.5 M⊙, we find N_cen = 0.73 compared to N_cen = 0.89 from correlation studies. Our central occupation function for LRGs converges to 0.95 at large halo masses. A strong anticorrelation between central luminosity and cluster mass at fixed richness is required to reconcile our results with those based on clustering studies. We derive the probability that the brightest cluster member is not the central galaxy. We find P_BNC ≈ 20-30 per cent, which is a factor of ~2 lower than the value found by Skibba et al. Finally, we study the radial offsets of bright non-central LRGs from cluster centres and show that bright non-central LRGs follow a different radial distribution compared to red cluster members. This work demonstrates that even the most massive clusters do not always have an LRG at the centre, and that the brightest galaxy in a cluster is not always the central galaxy.
3D Drop Size Distribution Extrapolation Algorithm Using a Single Disdrometer
NASA Technical Reports Server (NTRS)
Lane, John
2012-01-01
Determining the Z-R relationship (where Z is the radar reflectivity factor and R is rainfall rate) from disdrometer data has been and is a common goal of cloud physicists and radar meteorology researchers. The usefulness of this quantity has traditionally been limited since radar represents a volume measurement, while a disdrometer corresponds to a point measurement. To solve that problem, a 3D-DSD (drop-size distribution) method of determining an equivalent 3D Z-R was developed at the University of Central Florida and tested at the Kennedy Space Center, FL. Unfortunately, that method required a minimum of three disdrometers clustered together within a microscale network (.1-km separation). Since most commercial disdrometers used by the radar meteorology/cloud physics community are high-cost instruments, locating three disdrometers within a microscale area is generally not a practical strategy given the limitations of these kinds of research budgets. A relatively simple modification to the 3D-DSD algorithm provides an estimate of the 3D-DSD, and therefore a 3D Z-R measurement, using a single disdrometer. The basis of the horizontal extrapolation is mass conservation of a drop size increment, employing the mass conservation equation. For vertical extrapolation, convolution of a drop size increment using raindrop terminal velocity is used. Together, these two independent extrapolation techniques provide a complete 3D-DSD estimate in a volume around and above a single disdrometer. The estimation error is lowest along a vertical plane intersecting the disdrometer position in the direction of wind advection. This work demonstrates that multiple sensors are not required for successful implementation of the 3D interpolation/extrapolation algorithm. This is a great benefit since multiple sensors in the required spatial arrangement are seldom available for this type of analysis. The original software (developed at the University of Central Florida, 1998-2000) has
NASA Astrophysics Data System (ADS)
Fustes, Diego; Ordóñez, Diego; Dafonte, Carlos; Manteiga, Minia; Arcay, Bernardino
This work presents an algorithm that was developed to select the most relevant areas of a stellar spectrum to extract its basic atmospheric parameters. We consider synthetic spectra obtained from models of stellar atmospheres in the spectral region of the radial velocity spectrograph instrument of the European Space Agency's Gaia space mission. The algorithm that demarcates the areas of the spectra sensitive to each atmospheric parameter (effective temperature and gravity, metallicity, and abundance of alpha elements) is a genetic algorithm, and the parameterization takes place through the learning of artificial neural networks. Due to the high computational cost of processing, we present a distributed implementation in both multiprocessor and multicomputer environments.
Weng, Yang; Xiao, Wendong; Xie, Lihua
2011-01-01
Distributed estimation of Gaussian mixtures has many applications in wireless sensor network (WSN), and its energy-efficient solution is still challenging. This paper presents a novel diffusion-based EM algorithm for this problem. A diffusion strategy is introduced for acquiring the global statistics in EM algorithm in which each sensor node only needs to communicate its local statistics to its neighboring nodes at each iteration. This improves the existing consensus-based distributed EM algorithm which may need much more communication overhead for consensus, especially in large scale networks. The robustness and scalability of the proposed approach can be achieved by distributed processing in the networks. In addition, we show that the proposed approach can be considered as a stochastic approximation method to find the maximum likelihood estimation for Gaussian mixtures. Simulation results show the efficiency of this approach. PMID:22163956
Fragmentation and reliable size distributions of large ammonia and water clusters
NASA Astrophysics Data System (ADS)
Bobbert, C.; Schütte, S.; Steinbach, C.; Buck, U.
2002-05-01
The interaction of large ammonia and water clusters in the size range from ⟨n⟩ = 10 to 3400 with electrons is investigated in a reflectron time-of-flight mass spectrometer. The clusters are generated in adiabatic expansions through conical nozzles and are detected nearly fragmentation free by single photon ionization after they have been doped with one sodium atom. For ammonia, the (1+1) resonance enhanced two photon ionization through the Ã state with v = 6 also operates similarly. In this way reliable size distributions of the neutral clusters are obtained, which are analyzed in terms of a modified scaling law of the Hagena type [Surf. Sci. 106, 101 (1981)]. In contrast, using electron impact ionization, the clusters are strongly fragmented when varying the electron energy between 150 and 1500 eV. The number of evaporated molecules depends on the cluster size, and the energy dependence follows that of the stopping power of the solid material. Therefore we attribute the operating mechanism to that which is also responsible for the electronic sputtering of solid matter. The yields, however, are orders of magnitude larger for clusters than for the solid. This result is a consequence of the finite dimensions of the clusters, which cannot accommodate the released energy.
Optimal fusion rule for distributed detection in clustered wireless sensor networks
NASA Astrophysics Data System (ADS)
Aldalahmeh, Sami A.; Ghogho, Mounir; McLernon, Des; Nurellari, Edmond
2016-01-01
We consider distributed detection in a clustered wireless sensor network (WSN) deployed randomly in a large field for the purpose of intrusion detection. The WSN is modeled by a homogeneous Poisson point process. The sensor nodes (SNs) compute local decisions about the intruder's presence and send them to the cluster heads (CHs). A stochastic geometry framework is employed to derive the optimal cluster-based fusion rule (OCR), which is a weighted average of the local decision sum of each cluster. Interestingly, this structure reduces the effect of false alarm on the detection performance. Moreover, a generalized likelihood ratio test (GLRT) for cluster-based fusion (GCR) is developed to handle the case of unknown intruder's parameters. Simulation results show that the OCR performance is close to the Chair-Varshney rule. In fact, the latter benchmark can be reached by forming more clusters in the network without increasing the SN deployment intensity. Simulation results also show that the GCR performs very closely to the OCR when the number of clusters is large enough. The performance is further improved when the SN deployment intensity is increased.
Bhattacharya, Anindya; De, Rajat K
2010-08-01
Distance based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail in certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than that produced by some others. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to execution time. As in DCCA, we use the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using C and Visual Basic languages, and can be executed on the Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed. Two word files (included in the zip file) need to
A new stochastic algorithm for inversion of dust aerosol size distribution
NASA Astrophysics Data System (ADS)
Wang, Li; Li, Feng; Yang, Ma-ying
2015-08-01
Dust aerosol size distribution is an important source of information about atmospheric aerosols, and it can be determined from multiwavelength extinction measurements. This paper describes a stochastic inverse technique based on the artificial bee colony (ABC) algorithm to invert the dust aerosol size distribution by the light extinction method. The direct problems for the size distributions of water drops and dust particles, which are the main elements of atmospheric aerosols, are solved by Mie theory and the Lambert-Beer law in the multispectral region. Then, the parameters of three widely used functions, i.e. the log-normal distribution (L-N), the Junge distribution (J-J), and the normal distribution (N-N), which can provide the most useful representation of aerosol size distributions, are inverted by the ABC algorithm in the dependent model. Numerical results show that the ABC algorithm can be successfully applied to recover the aerosol size distribution with high feasibility and reliability even in the presence of random noise.
Automatic distribution of vision-tasks on computing clusters
NASA Astrophysics Data System (ADS)
Müller, Thomas; Tran, Binh An; Knoll, Alois
2011-01-01
In this paper, a consistent and efficient yet convenient system for parallel computer vision, and in fact also real-time actuator control, is proposed. The system implements the multi-agent paradigm and a blackboard information storage. This, in combination with a generic interface for hardware abstraction and integration of external software components, is set up on the basis of the message passing interface (MPI). The system allows for data- and task-parallel processing, and supports both synchronous communication strategies, as data exchange can be triggered by events, and asynchronous ones, as data can be polled. Also, by duplication of processing units (agents), redundant processing is possible to achieve greater robustness. As the system automatically distributes the task units to available resources, and a monitoring concept allows for combination of tasks and their composition into complex processes, it is easy to develop efficient parallel vision / robotics applications quickly. Multiple vision-based applications have already been implemented, including academic, research-related fields and prototypes for industrial automation. For the scientific community, the system has recently been released as open source.
NASA Astrophysics Data System (ADS)
Xu, Hao; Li, Hui; Collins, David C.; Li, Shengtai; Norman, Michael L.
2011-10-01
Theory and simulations suggest that magnetic fields from radio jets and lobes powered by their central super massive black holes can be an important source of magnetic fields in the galaxy clusters. This is Paper II in a series of studies where we present self-consistent high-resolution adaptive mesh refinement cosmological magnetohydrodynamic simulations that simultaneously follow the formation of a galaxy cluster and evolution of magnetic fields ejected by an active galactic nucleus. We studied 12 different galaxy clusters with virial masses ranging from 1 × 10^14 to 2 × 10^15 M⊙. In this work, we examine the effects of the mass and merger history on the final magnetic properties. We find that the evolution of magnetic fields is qualitatively similar to those of previous studies. In most clusters, the injected magnetic fields can be transported throughout the cluster and be further amplified by the intracluster medium (ICM) turbulence during the cluster formation process with hierarchical mergers, while the amplification history and the magnetic field distribution depend on the cluster formation and magnetism history. This can be very different for different clusters. The total magnetic energies in these clusters are between 4 × 10^57 and 10^61 erg, which is mainly decided by the cluster mass, scaling approximately with the square of the total mass. Dynamically older relaxed clusters usually have more magnetic fields in their ICM. The dynamically very young clusters may be magnetized weakly since there is not enough time for magnetic fields to be amplified.
NASA Astrophysics Data System (ADS)
Meyer, Michael R.
1996-04-01
The goal of this thesis is to test the following hypothesis: the initial distribution of stellar masses from a single "episode" of star formation is independent of the local physical conditions of the region. In other words, is the initial mass function (IMF) strictly universal over spatial scales d < 1 pc and over time intervals Delta-tau << 3 x 10^6 yrs? We discuss the utility of embedded clusters in addressing this question. Using a combination of spectroscopic and photometric techniques, we seek to characterize emergent mass distributions of embedded clusters in order to compare them both with each other and with the field star IMF. Medium resolution (R=1000) near-infrared spectra obtainable with the current generation of NIR grating spectrographs can provide estimates of the photospheric temperatures of optically-invisible stars. Deriving these spectral types requires a three-step process; i) setting up a classification scheme based on near-infrared spectra of spectral standards; ii) understanding the effects of accretion on this classification scheme by studying optically-visible young stellar objects; and iii) applying this classification technique to the deeply embedded clusters. Combining near-infrared photometry with spectral types, accurate stellar luminosities can be derived for heavily reddened young stars thus enabling their placement in the H-R diagram. From their position in the H-R diagram, masses and ages of stars can be estimated from comparison with theoretical pre-main sequence evolutionary models. Because it is not practical to obtain complete spectroscopic samples of embedded cluster members, a technique is developed based solely on near-IR photometry for estimating stellar luminosities from flux-limited surveys. We then describe how spectroscopic surveys of deeply embedded clusters are necessary in order to adopt appropriate mass-luminosity relationships. Stellar luminosity functions constructed from complete extinction-limited samples
NASA Astrophysics Data System (ADS)
Meyer, Michael R.
1996-02-01
The goal of this thesis is to test the following hypothesis: the initial distribution of stellar masses from a single "episode" of star formation is independent of the local physical conditions of the region. In other words, is the initial mass function (IMF) strictly universal over spatial scales d < 1 pc and over time intervals Δτ << 3 × 10^6 yrs? We discuss the utility of embedded clusters in addressing this question. Using a combination of spectroscopic and photometric techniques, we seek to characterize emergent mass distributions of embedded clusters in order to compare them both with each other and with the field star IMF. Medium resolution (R = 1000) near-infrared spectra obtainable with the current generation of NIR grating spectrographs can provide estimates of the photospheric temperatures of optically-invisible stars. Deriving these spectral types requires a three-step process; i) setting up a classification scheme based on near-infrared spectra of spectral standards; ii) understanding the effects of accretion on this classification scheme by studying optically-visible young stellar objects; and iii) applying this classification technique to the deeply embedded clusters. Combining near-infrared photometry with spectral types, accurate stellar luminosities can be derived for heavily reddened young stars thus enabling their placement in the H-R diagram. From their position in the H-R diagram, masses and ages of stars can be estimated from comparison with theoretical pre-main sequence evolutionary models. Because it is not practical to obtain complete spectroscopic samples of embedded cluster members, a technique is developed based solely on near-IR photometry for estimating stellar luminosities from flux-limited surveys. We then describe how spectroscopic surveys of deeply embedded clusters are necessary in order to adopt appropriate mass-luminosity relationships. Stellar luminosity functions constructed from complete extinction
Relation chain based clustering analysis
NASA Astrophysics Data System (ADS)
Zhang, Cheng-ning; Zhao, Ming-yang; Luo, Hai-bo
2011-08-01
Clustering analysis is currently one of the well-developed branches of data mining technology; it aims to find hidden structures in a multidimensional space called the feature or pattern space. A datum in the space usually takes vector form, and the elements of the vector represent several specifically selected features. These features are often chosen for their effectiveness for the problem at hand. Generally, clustering analysis falls into two divisions: one is based on the agglomerative clustering method, and the other is based on the divisive clustering method. The former refers to a bottom-up process which regards each datum as a singleton cluster, while the latter refers to a top-down process which regards the entire data set as one cluster. According to the collected literature, divisive clustering currently dominates both in application and in research. Although some famous divisive clustering methods have been designed and are well developed, clustering problems are still far from being solved. The k-means algorithm is the original divisive clustering method; it requires some important index values to be assigned initially, such as the number of clusters and the initial cluster prototype positions, and these choices may not be reasonable on some occasions. Beyond the initialization problem, the k-means algorithm may also fall into a local optimum, clusters in a rigid way, and is not suitable for non-Gaussian distributions. One can see that seeking a good or natural clustering result, in fact, originates from one's understanding of the concept of clustering. Thus, confusion about or misunderstanding of the definition of clustering always leads to unsatisfactory clustering results. One should consider the definition deeply and seriously. This paper demonstrates the nature of clustering, gives a way of understanding clustering, discusses the methodology of designing a clustering algorithm, and proposes a new clustering method based on relation chains among 2D patterns. In
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. F.; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.
2015-03-31
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
Melchior, P.; et al.
2015-05-21
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modeling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modeling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. In addition, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
NASA Astrophysics Data System (ADS)
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. Fausti; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.
2015-05-01
We measure the weak lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey (DES). This pathfinder study is meant to (1) validate the Dark Energy Camera (DECam) imager for the task of measuring weak lensing shapes, and (2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, point spread function (PSF) modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting Navarro-Frenk-White profiles to the clusters in this study, we determine weak lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak lensing mass, and richness. In addition, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1° (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; et al
2015-03-31
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
An efficient algorithm for generating random number pairs drawn from a bivariate normal distribution
NASA Technical Reports Server (NTRS)
Campbell, C. W.
1983-01-01
An efficient algorithm for generating random number pairs from a bivariate normal distribution was developed. Any desired value of the two means, two standard deviations, and correlation coefficient can be selected. Theoretically the technique is exact and in practice its accuracy is limited only by the quality of the uniform distribution random number generator, inaccuracies in computer function evaluation, and arithmetic. A FORTRAN routine was written to check the algorithm and good accuracy was obtained. Some small errors in the correlation coefficient were observed to vary in a surprisingly regular manner. A simple model was developed which explained the qualitative aspects of the errors.
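One way such a generator can be built is sketched below in Python. This is a minimal sketch of the standard textbook construction (Box-Muller normals followed by a linear mix that imposes the correlation), and it is an assumption that this resembles the routine in the report; the function names `bivariate_normal_pair` and `sample_correlation` are hypothetical.

```python
import math
import random


def bivariate_normal_pair(mean_x, mean_y, sd_x, sd_y, rho, rng=random):
    """Return one (x, y) pair drawn from a bivariate normal distribution.

    Assumption: textbook construction, not the FORTRAN routine of the report.
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Box-Muller: two uniform deviates -> two independent standard normals.
    u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    z1 = radius * math.cos(2.0 * math.pi * u2)
    z2 = radius * math.sin(2.0 * math.pi * u2)
    # Mix the two normals so that corr(x, y) equals rho.
    x = mean_x + sd_x * z1
    y = mean_y + sd_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    return sxy / math.sqrt(sxx * syy)
```

Comparing `sample_correlation` on a large sample with the requested `rho` is one simple way to look for the kind of small correlation errors the abstract mentions.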
NASA Astrophysics Data System (ADS)
Wang, Wei; Huang, Peter
2016-02-01
Evanescent wave nano-velocimetry offers a unique three-dimensional measurement capability that allows for inferring tracer position distribution through the imaged particle intensities. Our previous study suggested that tracer polydispersity and failure to account for a near-wall tracer depletion layer would lead to compromised measurement accuracy. In this work, we report on a hybrid algorithm that converts the measured tracer intensities as a whole into their overall position distribution. The algorithm achieves a superior accuracy by using tracer size variation as a statistical analysis parameter.
WARM GAS IN THE VIRGO CLUSTER. I. DISTRIBUTION OF Lyα ABSORBERS
Yoon, Joo Heon; Putman, Mary E.; Bryan, Greg L.; Thom, Christopher; Chen, Hsiao-Wen
2012-08-01
The first systematic study of the warm gas (T = 10^(4-5) K) distribution across a galaxy cluster is presented using multiple background QSOs in and around the Virgo Cluster. We detect 25 Lyα absorbers (N_HI = 10^(13.1-15.4) cm^-2) in the Virgo velocity range toward 9 of 12 QSO sightlines observed with the Cosmic Origin Spectrograph, with a cluster impact parameter range of 0.36-1.65 Mpc (0.23-1.05 R_vir). Including 18 Lyα absorbers previously detected by STIS or GHRS toward 7 of 11 background QSOs in and around the Virgo Cluster, we establish a sample of 43 absorbers toward a total of 23 background probes for studying the incidence of Lyα absorbers in and around the Virgo Cluster. With these absorbers, we find (1) warm gas is predominantly in the outskirts of the cluster and avoids the X-ray-detected hot intracluster medium (ICM). Also, Lyα absorption strength increases with cluster impact parameter. (2) Lyα-absorbing warm gas traces cold H I-emitting gas in the substructures of the Virgo Cluster. (3) Including the absorbers associated with the surrounding substructures, the warm gas covering fraction (100% for N_HI > 10^13.1 cm^-2) is in agreement with cosmological simulations. We speculate that the observed warm gas is part of large-scale gas flows feeding the cluster both in the ICM and galaxies.
Probing surface distributions of α clusters in 20Ne via α-transfer reaction
NASA Astrophysics Data System (ADS)
Fukui, Tokuro; Taniguchi, Yasutaka; Suhara, Tadahiro; Kanada-En'yo, Yoshiko; Ogata, Kazuyuki
2016-03-01
Background: Direct evidence of the α-cluster manifestation in bound states has not been obtained yet, although a number of experimental studies were carried out to extract the information of the clustering. In particular in conventional analyses of α-transfer reactions, there exist a few significant problems on reaction models, which are insufficient to qualitatively discuss the cluster structure. Purpose: We aim to verify the manifestation of the α-cluster structure from observables. As the first application, we plan to extract the spatial information of the cluster structure of the 20Ne nucleus in its ground state through the cross section of the α-transfer reaction 16O(6Li,d)20Ne. Methods: For the analysis of the transfer reaction, we work with the coupled-channel Born approximation (CCBA) approach, in which the breakup effect of 6Li is explicitly taken into account by means of the continuum-discretized coupled-channel method based on the three-body α + d + 16O model. The two methods are adopted to calculate the overlap function between 20Ne and α + 16O; one is the microscopic cluster model (MCM) with the generator coordinate method, and the other is the phenomenological two-body potential model (PM). Results: We show that the CCBA calculation with the MCM wave function gives a significant improvement of the theoretical result on the angular distribution of the transfer cross section, which is consistent with the experimental data. Employing the PM, it is discussed which region of the cluster wave function is probed on the transfer cross section. Conclusions: It is found that the surface region of the cluster wave function is sensitive to the cross section. The present work is situated as the first step in obtaining important information to systematically investigate the cluster structure.
Li, Weizhong [San Diego Supercomputer Center
2013-01-22
San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
Li, Weizhong
2011-10-12
San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
NASA Astrophysics Data System (ADS)
Zhang, Yanjun; Yu, Chunjuan; Fu, Xinghu; Liu, Wenzhe; Bi, Weihong
2015-12-01
In distributed optical fiber sensing systems based on Brillouin scattering, strain and temperature are the main measured parameters, which can be obtained by analyzing the Brillouin center frequency shift. A novel algorithm that combines the cuckoo search (CS) algorithm with the improved differential evolution (IDE) algorithm is proposed for Brillouin scattering parameter estimation. The CS-IDE algorithm is compared with the CS algorithm and analyzed in different situations. The results show that both the CS and CS-IDE algorithms have very good convergence. The analysis reveals that the CS-IDE algorithm can extract the scattering spectrum features with different linear weight ratios, linewidth combinations and SNRs. Moreover, a BOTDR temperature measuring system based on electron optical frequency shift is set up to verify the effectiveness of the CS-IDE algorithm. Experimental results show that there is a good linear relationship between the Brillouin center frequency shift and temperature changes.
An efficient central DOA tracking algorithm for multiple incoherently distributed sources
NASA Astrophysics Data System (ADS)
Hassen, Sonia Ben; Samet, Abdelaziz
2015-12-01
In this paper, we develop a new tracking method for the direction of arrival (DOA) parameters assuming multiple incoherently distributed (ID) sources. The new approach is based on a simple covariance fitting optimization technique exploiting the central and noncentral moments of the source angular power densities to estimate the central DOAs. The current estimates are treated as measurements provided to the Kalman filter that models the dynamic property of directional changes for the moving sources. Then, the covariance-fitting-based algorithm and Kalman filtering theory are combined to formulate an adaptive tracking algorithm. Our algorithm is compared to the fast approximated power iteration-total least square-estimation of signal parameters via rotational invariance technique (FAPI-TLS-ESPRIT) algorithm, which uses the TLS-ESPRIT method with subspace updating via the FAPI algorithm. It will be shown that the proposed algorithm offers excellent DOA tracking performance and outperforms the FAPI-TLS-ESPRIT method, especially at low signal-to-noise ratio (SNR) values. Moreover, the performance of the two methods improves as the SNR increases. This improvement is more prominent with the FAPI-TLS-ESPRIT method. However, their performance degrades when the number of sources increases. It will also be shown that our method depends on the form of the angular distribution function when tracking the central DOAs. Finally, it will be shown that the more widely the sources are spaced, the more accurately the proposed method can track the DOAs.
Beard, R.A.
1990-03-01
The purpose of this thesis is to explore the methods used to parallelize NP-complete problems and the degree of improvement that can be realized using a distributed parallel processor to solve these combinatoric problems. Common NP-complete problem characteristics such as a priori reductions, use of partial-state information, and inhomogeneous searches are identified and studied. The set covering problem (SCP) is implemented for this research because many applications such as information retrieval, task scheduling, and VLSI expression simplification can be structured as an SCP. In addition, its generic NP-complete common characteristics are well documented and a parallel implementation has not been reported. Parallel programming design techniques involve decomposing the problem and developing the parallel algorithms. The major components of a parallel solution are developed in a four phase process. First, a meta-level design is accomplished using an appropriate design language such as UNITY. Then, the UNITY design is transformed into an algorithm and implementation specific to a distributed architecture. Finally, a complexity analysis of the algorithm is performed. The a priori reductions are divide-and-conquer algorithms, whereas the search for the optimal set cover is accomplished with a branch-and-bound algorithm. The search utilizes a global best cost maintained at a central location for distribution to all processors. Three methods of load balancing are implemented and studied: coarse grain with static allocation of the search space, fine grain with dynamic allocation, and dynamic load balancing.
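The branch-and-bound search with a shared best cost can be illustrated by a small sequential Python sketch. It assumes unit subset costs and a single process, so the variable `best` only stands in for the global best cost that a distributed version would broadcast; the function name `min_set_cover` and the branching heuristics are hypothetical choices, not taken from the thesis.

```python
def min_set_cover(universe, subsets):
    """Return sorted indices of a smallest collection of subsets covering `universe`.

    Sequential sketch with unit costs; `best` plays the role of the
    globally shared best cost in a parallel implementation.
    """
    universe = frozenset(universe)
    subsets = [frozenset(s) & universe for s in subsets]
    if frozenset().union(*subsets) != universe:
        raise ValueError("the subsets do not cover the universe")
    best = list(range(len(subsets)))  # trivial incumbent: use every subset

    def search(uncovered, chosen):
        nonlocal best
        if not uncovered:
            if len(chosen) < len(best):
                best = list(chosen)  # new incumbent found
            return
        # Bound: at least one more subset is needed, so prune any branch
        # that can no longer beat the incumbent.
        if len(chosen) + 1 >= len(best):
            return
        # Branch on the uncovered element that the fewest subsets contain.
        element = min(uncovered, key=lambda e: sum(e in s for s in subsets))
        candidates = [i for i, s in enumerate(subsets) if element in s]
        # Try subsets that cover more of the uncovered elements first.
        candidates.sort(key=lambda i: -len(subsets[i] & uncovered))
        for i in candidates:
            chosen.append(i)
            search(uncovered - subsets[i], chosen)
            chosen.pop()

    search(universe, [])
    return sorted(best)
```

In a distributed setting, different processors would explore different branches of `search`, and pruning would improve whenever any processor reports a better incumbent.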
Percolation of randomly distributed growing clusters: the low initial density regime
NASA Astrophysics Data System (ADS)
Tsakiris, N.; Maragakis, M.; Kosmidis, K.; Argyrakis, P.
2011-06-01
We investigate the problem of growing clusters, which is modeled by two dimensional disks and three dimensional droplets. In this model we place a number of seeds on random locations on a lattice with an initial occupation probability, p. The seeds simultaneously grow with a constant velocity to form clusters. When two or more clusters eventually touch each other they immediately stop their growth. The probability that such a system will result in a percolating cluster depends on the density of the initially distributed seeds and the dimensionality of the system. For very low values of p we find a power law behavior for several properties that we investigate, namely for the size of the largest and second largest cluster, for the probability for a spanning cluster to occur, and for the mean radius of the finally formed droplets. We report the values of the corresponding scaling exponents. Finally, we show that for very low initial concentration of seeds the final coverage takes a constant value which depends on the system dimensionality.
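The growth rule described above (simultaneous growth at constant velocity, with clusters stopping when they touch) can be sketched as a small event-driven simulation in Python. This is a simplified illustration under stated assumptions: disks grow in the continuous plane rather than on a lattice, there are no boundaries, and the function name `grow_disks` is hypothetical.

```python
import math


def grow_disks(seeds, velocity=1.0):
    """Return the final radii of disks grown from `seeds`, a list of (x, y).

    Assumptions: continuous plane, no boundaries, all disks start at
    radius 0 and grow at the same velocity until they touch another disk.
    """
    n = len(seeds)
    if n < 2:
        raise ValueError("need at least two seeds; a lone disk never stops")
    radii = [0.0] * n
    growing = [True] * n
    while any(growing):
        best_dt, best_pair = math.inf, None
        for i in range(n):
            for j in range(i + 1, n):
                movers = growing[i] + growing[j]  # 0, 1 or 2 growing disks
                if movers == 0:
                    continue
                gap = math.dist(seeds[i], seeds[j]) - radii[i] - radii[j]
                # The gap closes at velocity times the number of growing disks.
                dt = max(gap, 0.0) / (velocity * movers)
                if dt < best_dt:
                    best_dt, best_pair = dt, (i, j)
        # Advance every growing disk to the time of the next contact.
        for k in range(n):
            if growing[k]:
                radii[k] += velocity * best_dt
        i, j = best_pair
        growing[i] = growing[j] = False  # touching disks stop growing
    return radii
```

For example, seeds at (0, 0), (2, 0) and (10, 0) give final radii 1, 1 and 7: the first two disks meet halfway, and the third keeps growing until it reaches the stopped second disk. The naive pair scan makes each step quadratic in the number of seeds, which is adequate for a sketch but not for large systems.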
An improved distributed routing algorithm for Benes based optical NoC
NASA Astrophysics Data System (ADS)
Zhang, Jing; Gu, Huaxi; Yang, Yintang
2010-08-01
Integrated optical interconnect is believed to be one of the main technologies to replace electrical wires. Optical Network-on-Chip (ONoC) has attracted increasing attention. Benes topology is a good choice for ONoC because of its rearrangeable non-blocking character, multistage structure and easy scalability. The routing algorithm plays an important role in determining the performance of ONoC. Because traditional routing algorithms for Benes networks are not suitable for ONoC communication, we develop a new distributed routing algorithm for Benes ONoC in this paper. Our algorithm selects the routing path dynamically according to network conditions and enables more path choices for messages traveling in the network. We used OPNET to evaluate the performance of our routing algorithm and compared it with a well-known bit-controlled routing algorithm. End-to-end (ETE) delay and throughput are shown for different packet lengths and network sizes. Simulation results show that our routing algorithm can provide better performance for ONoC.
A data distributed parallel algorithm for ray-traced volume rendering
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu; Painter, James S.; Hansen, Charles D.; Krogh, Michael F.
1993-01-01
This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and on networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on both the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
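The final compositing step can be sketched with the standard "over" operator applied to subimages sorted from nearest to farthest. This is a minimal illustration, assuming premultiplied RGBA pixels stored as tuples and subimages stored as flat lists. The names `over` and `composite` are hypothetical, and the paper's parallel compositing scheme is not reproduced here.

```python
def over(front, back):
    """Composite one premultiplied RGBA pixel over another."""
    fr, fg, fb, fa = front
    br, bg, bb, ba = back
    k = 1.0 - fa  # fraction of the back pixel that remains visible
    return (fr + k * br, fg + k * bg, fb + k * bb, fa + k * ba)


def composite(subimages_front_to_back):
    """Combine equally sized subimages (lists of RGBA pixels), nearest first."""
    result = list(subimages_front_to_back[0])
    for layer in subimages_front_to_back[1:]:
        result = [over(p, q) for p, q in zip(result, layer)]
    return result


if __name__ == "__main__":
    near = [(0.5, 0.0, 0.0, 0.5)]  # half-transparent red, premultiplied
    far = [(0.0, 1.0, 0.0, 1.0)]   # opaque green
    print(composite([near, far]))
```

Because "over" is associative, subimages may be merged pairwise in any grouping as long as the front-to-back order is preserved, which is what allows the compositing itself to be parallelised.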
Distributions of Short-lived Radioactive Nuclei Produced by Young Embedded Star Clusters
NASA Astrophysics Data System (ADS)
Adams, Fred C.; Fatuzzo, Marco; Holden, Lisa
2014-07-01
Most star formation in the Galaxy takes place in clusters, where the most massive members can affect the properties of other constituent solar systems. This paper considers how clusters influence star formation and forming planetary systems through nuclear enrichment from supernova explosions, where massive stars deliver short-lived radioactive nuclei (SLRs) to their local environment. The decay of these nuclei leads to both heating and ionization, and thereby affects disk evolution, disk chemistry, and the accompanying process of planet formation. Nuclear enrichment can take place on two spatial scales: (1) within the cluster itself (l ~ 1 pc), the SLRs are delivered to the circumstellar disks associated with other cluster members. (2) On the next larger scale (l ~ 2-10 pc), SLRs are injected into the background molecular cloud; these nuclei provide heating and ionization to nearby star-forming regions and to the next generation of disks. For the first scenario, we construct the expected distributions of radioactive enrichment levels provided by embedded clusters. Clusters can account for the SLR mass fractions inferred for the early Solar Nebula, but typical SLR abundances are lower by a factor of ~10. For the second scenario, we find that distributed enrichment of SLRs in molecular clouds leads to comparable abundances. For both the direct and distributed enrichment processes, the masses of 26Al and 60Fe delivered to individual circumstellar disks typically fall in the range 10-100 pM⊙ (where 1 pM⊙ = 10^-12 M⊙). The corresponding ionization rate due to SLRs typically falls in the range ζ_SLR ~ 1-5 × 10^-19 s^-1. This ionization rate is smaller than that due to cosmic rays, ζ_CR ~ 10^-17 s^-1, but will be important in regions where cosmic rays are attenuated (e.g., disk mid-planes).
NASA Astrophysics Data System (ADS)
Custódio, Susana; Lima, Vânia; Vales, Dina; Cesca, Simone; Carrilho, Fernando
2016-04-01
The matching between linear trends of hypocentres and fault planes indicated by focal mechanisms (FMs) is frequently used to infer the location and geometry of active faults. This practice works well in regions of fast lithospheric deformation, where earthquake patterns are clear and major structures accommodate the bulk of deformation, but typically fails in regions of slow and distributed deformation. We present a new joint FM and hypocentre cluster algorithm that is able to detect systematically the consistency between hypocentre lineations and FMs, even in regions of distributed deformation. We apply the method to the Azores-western Mediterranean region, with particular emphasis on western Iberia. The analysis relies on a compilation of hypocentres and FMs taken from regional and global earthquake catalogues, academic theses and technical reports, complemented by new FMs for western Iberia. The joint clustering algorithm images both well-known and new seismo-tectonic features. The Azores triple junction is characterised by FMs with vertical pressure (P) axes, in good agreement with the divergent setting, and the Iberian domain is characterised by NW-SE oriented P axes, indicating a response of the lithosphere to the ongoing oblique convergence between Nubia and Eurasia. Several earthquakes remain unclustered in the western Mediterranean domain, which may indicate a response to local stresses. The major regions of consistent faulting that we identify are the mid-Atlantic ridge, the Terceira rift, the Trans-Alboran shear zone and the north coast of Algeria. In addition, other smaller earthquake clusters present a good match between epicentre lineations and FM fault planes. These clusters may signal single active faults or wide zones of distributed but consistent faulting. Mainland Portugal is dominated by strike-slip earthquakes with fault planes coincident with the predominant NNE-SSW and WNW-ESE oriented earthquake lineations. Clusters offshore SW Iberia are
Chen, Wei-Chen; Maitra, Ranjan
2011-01-01
We propose a model-based approach for clustering time series regression data in an unsupervised machine learning framework to identify groups under the assumption that each mixture component follows a Gaussian autoregressive regression model of order p. Given the number of groups, the traditional maximum likelihood approach of estimating the parameters using the expectation-maximization (EM) algorithm can be employed, although it is computationally demanding. The somewhat fast tune to the EM folk song provided by the Alternating Expectation Conditional Maximization (AECM) algorithm can alleviate the problem to some extent. In this article, we develop an alternative partial expectation conditional maximization algorithm (APECM) that uses an additional data augmentation storage step to efficiently implement AECM for finite mixture models. Results on our simulation experiments show improved performance in both fewer numbers of iterations and computation time. The methodology is applied to the problem of clustering mutual funds data on the basis of their average annual per cent returns and in the presence of economic indicators.
A Peer-to-Peer Distributed Selection Algorithm for the Internet.
ERIC Educational Resources Information Center
Loo, Alfred; Choi, Y. K.
2002-01-01
In peer-to-peer systems, the lack of dedicated links between constituent computers presents a major challenge. Discusses the problems of distributed database operation with reference to an example. Presents two statistical selection algorithms: one for the intranet with broadcast/multicast facilities, the other for Internet without…
Vinogradov, S A; Wilson, D F
1994-01-01
A new method for analysis of phosphorescence lifetime distributions in heterogeneous systems has been developed. This method is based on decomposition of the data vector to a linearly independent set of exponentials and uses quadratic programming principles for χ² minimization. Solution of the resulting algorithm requires a finite number of calculations (it is not iterative) and is computationally fast and robust. The algorithm has been tested on various simulated decays and for analysis of phosphorescence measurements of experimental systems with discrete distributions of lifetimes. Critical analysis of the effect of signal-to-noise on the resolving capability of the algorithm is presented. This technique is recommended for resolution of the distributions of quencher concentration in heterogeneous samples, of which oxygen distribution in tissue is an important example. Two phosphors of practical importance for biological oxygen measurements, Pd-meso-tetra(4-carboxyphenyl)porphyrin (PdTCPP) and Pd-meso-porphyrin (PdMP), have been used to provide an experimental test of the algorithm. PMID:7858142
Distributed vs. Clustered Star Formation in Mon OB1 - NGC 2264
NASA Astrophysics Data System (ADS)
Fazio, Giovanni; Pipher, Judy; Allen, Lori; Gutermuth, Robert; Megeath, S. Thomas; Myers, Phillip
2007-05-01
Following successful IRAC + MIPS observations (75% complete) of the GMCs Cep OB3 and Mon R2 harboring high mass star formation in PID 20403 'Distributed vs. Clustered Star Formation', we propose in this cycle to obtain similar observations for 2 more GMCs, namely Mon OB1 and Cam OB1. The present GTO proposal begins this program by covering part of the distributed area around NGC 2264 in Mon OB1. The rest of Mon OB1 as well as Cam OB1 will be proposed as a GO program. We hope to establish the physical conditions leading to clustered vs. distributed star formation. We will compare results for all these regions against similar studies in clouds harboring low mass star formation, for example, the c2d survey.
Distributed cluster testing using new virtualized framework for XRootD
NASA Astrophysics Data System (ADS)
Salmon, Justin L.; Janyst, Lukasz
2014-06-01
The Extended ROOT Daemon (XRootD) is a distributed, scalable system for low-latency clustered data access. XRootD is mature and widely used in HEP, both standalone and as a core server framework for the EOS system at CERN, and hence requires extensive testing to ensure general stability. However, there are many difficulties posed by distributed testing, such as cluster set up, synchronization, orchestration, inter-cluster communication and controlled failure handling. A three-layer master/hypervisor/slave model is presented to ameliorate these difficulties by utilizing libvirt and QEMU/KVM virtualization technologies to automate spawning of configurable virtual clusters and orchestrate multi-stage test suites. The framework also incorporates a user-friendly web interface for scheduling and monitoring tests. The prototype has been used successfully to build new test suites for XRootD and EOS with existing unit test integration. It is planned for the future to sufficiently generalize the framework to encourage usage by potentially any distributed system.
Distributed control of cluster synchronisation in networks with randomly occurring non-linearities
NASA Astrophysics Data System (ADS)
Hu, Aihua; Cao, Jinde; Hu, Manfeng; Guo, Liuxiao
2016-08-01
This paper is concerned with the issue of mean square cluster synchronisation in complex networks, which consist of non-identical nodes with randomly occurring non-linearities. In order to guarantee synchronisation, distributed controllers depending on the information from the neighbours in the same cluster are applied to each node, while the control gains are updated according to the given laws. Based on the Lyapunov stability theory, sufficient synchronisation conditions are derived and proved theoretically. Finally, a numerical example is presented to demonstrate the effectiveness of the results.
Reconstructing merger timelines using star cluster age distributions: the case of MCG+08-11-002
NASA Astrophysics Data System (ADS)
Davies, Rebecca L.; Medling, Anne M.; U, Vivian; Max, Claire E.; Sanders, David; Kewley, Lisa J.
2016-05-01
We present near-infrared imaging and integral field spectroscopy of the centre of the dusty luminous infrared galaxy merger MCG+08-11-002, taken using the Near InfraRed Camera 2 (NIRC2) and the OH-Suppressing InfraRed Imaging Spectrograph (OSIRIS) on Keck II. We achieve a spatial resolution of ~25 pc in the K band, allowing us to resolve 41 star clusters in the NIRC2 images. We calculate the ages of 22/25 star clusters within the OSIRIS field using the equivalent widths of the CO 2.3 μm absorption feature and the Br γ nebular emission line. The star cluster age distribution has a clear peak at ages ≲ 20 Myr, indicative of current starburst activity associated with the final coalescence of the progenitor galaxies. There is a possible second peak at ~65 Myr which may be a product of the previous close passage of the galaxy nuclei. We fit single and double starburst models to the star cluster age distribution and use Monte Carlo sampling combined with two-sided Kolmogorov-Smirnov tests to calculate the probability that the observed data are drawn from each of the best-fitting distributions. There is a >90 per cent chance that the data are drawn from either a single or double starburst star formation history, but stochastic sampling prevents us from distinguishing between the two scenarios. Our analysis of MCG+08-11-002 indicates that star cluster age distributions provide valuable insights into the timelines of galaxy interactions and may therefore play an important role in the future development of precise merger stage classification systems.
NASA Astrophysics Data System (ADS)
Thanos, Konstantinos-Georgios; Thomopoulos, Stelios C. A.
2014-06-01
The study in this paper is part of a broader research effort to discover facial sub-clusters in face databases of different ethnicities. These new sub-clusters, along with other metadata (such as race, sex, etc.), lead to a vector for each face in the database, where each vector component represents the likelihood that a given face belongs to each cluster. This vector is then used as a feature vector in a human identification and tracking system based on face and other biometrics. The first stage in this system involves a clustering method which evaluates and compares the clustering results of five different clustering algorithms (average, complete, single hierarchical algorithm, k-means and DIGNET), and selects the best strategy for each data collection. In this paper we present the comparative performance of clustering results of DIGNET and four clustering algorithms (average, complete, single hierarchical and k-means) on fabricated 2D and 3D samples, and on actual face images from various databases, using four different standard metrics. These metrics are the silhouette figure, the mean silhouette coefficient, the Hubert test Γ coefficient, and the classification accuracy for each clustering result. The results showed that, in general, DIGNET gives more trustworthy results than the other algorithms when the metric values are above a specific acceptance threshold. However, when the evaluation metrics have values lower than the acceptance threshold but not too low (too low corresponds to ambiguous or false results), the clustering results need to be verified by the other algorithms.
Distributed power allocation for sink-centric clusters in multiple sink wireless sensor networks.
Cao, Lei; Xu, Chen; Shao, Wei; Zhang, Guoan; Zhou, Hui; Sun, Qiang; Guo, Yuehua
2010-01-01
Due to the battery resource constraints, saving energy is a critical issue in wireless sensor networks, particularly in large sensor networks. One possible solution is to deploy multiple sink nodes simultaneously. Another possible solution is to employ an adaptive clustering hierarchy routing scheme. In this paper, we propose a multiple sink cluster wireless sensor networks scheme which combines the two solutions, and propose an efficient transmission power control scheme for a sink-centric cluster routing protocol in multiple sink wireless sensor networks, denoted as MSCWSNs-PC. It is a distributed, scalable, self-organizing, adaptive system, and the sensor nodes do not require knowledge of the global network and their location. All sinks effectively work out a representative view of a monitored region, after which power control is employed to optimize network topology. The simulations demonstrate the advantages of our new protocol. PMID:22294911
YOUNG STELLAR CLUSTERS WITH A SCHUSTER MASS DISTRIBUTION. I. STATIONARY WINDS
Palous, Jan; Wuensch, Richard; Hueyotl-Zahuantitla, Filiberto; Martinez-Gonzalez, Sergio; Silich, Sergiy; Tenorio-Tagle, Guillermo
2013-08-01
Hydrodynamic models for spherically symmetric winds driven by young stellar clusters with a generalized Schuster stellar density profile are explored. For this we use both semi-analytic models and one-dimensional numerical simulations. We determine the properties of quasi-adiabatic and radiative stationary winds and define the radius at which the flow turns from subsonic to supersonic for all stellar density distributions. Strongly radiative winds significantly diminish their terminal speed and thus their mechanical luminosity is strongly reduced. This also reduces their potential negative feedback into their host galaxy interstellar medium. The critical luminosity above which radiative cooling becomes dominant within the clusters, leading to thermal instabilities which make the winds non-stationary, is determined, and its dependence on the star cluster density profile, core radius, and half-mass radius is discussed.
Determination of the distribution of galaxies in groups and clusters by correlation analysis
Sanroma, M.; Salvador-Sole, E.
1989-07-01
A new method for determining the two-dimensional number density profile in groups and clusters of galaxies is presented and tested. It is based on the use of the distribution of intergalactic projected separations, which can be calculated in a straightforward manner from the positions of galaxies in a plate. After establishing the mathematical foundation and practical implementation of the method, we show its potential by applying it to Monte Carlo clusters with a small number of galaxies. Several questions of interest to the general user are discussed and illustrated with real data. With this new method, reliable number density profiles can be determined for groups and clusters with a small number of galaxies. 21 refs.
NASA Astrophysics Data System (ADS)
Miquelez, Teresa; Bengoetxea, Endika; Mendiburu, Alexander; Larranaga, Pedro
2007-12-01
This paper introduces an evolutionary computation method that applies Bayesian classifiers to optimization problems. This approach is based on Estimation of Distribution Algorithms (EDAs) in which Bayesian or Gaussian networks are applied to the evolution of a population of individuals (i.e. potential solutions to the optimization problem) in order to improve the quality of the individuals of the next generation. Our new approach, called Evolutionary Bayesian Classifier-based Optimization Algorithm (EBCOA), employs Bayesian classifiers instead of Bayesian or Gaussian networks in order to evolve individuals to a fitter population. In brief, EBCOAs are characterized by applying Bayesian classification techniques, usually applied to supervised classification problems, to optimization in continuous domains. We propose and review in this paper different Bayesian classifiers for implementing our EBCOA method, focusing particularly on EBCOAs applying naïve Bayes, semi-naïve Bayes, and tree augmented naïve Bayes classifiers. This work presents an in-depth study of the behavior of these algorithms on classical optimization problems in continuous domains. The different parameters used for tuning the performance of the algorithms are discussed, and a comprehensive overview of their influence is provided. We also present experimental results to compare this new method with other state-of-the-art approaches of the evolutionary computation field for continuous domains, such as Evolutionary Strategies (ES) and Estimation of Distribution Algorithms (EDAs).
He, Hui; Fan, Guotao; Ye, Jianwei; Zhang, Weizhe
2013-01-01
It is of great significance to research the early warning system for large-scale network security incidents. It can improve the network system's emergency response capabilities, alleviate the cyber attacks' damage, and strengthen the system's counterattack ability. A comprehensive early warning system is presented in this paper, which combines active measurement and anomaly detection. The key visualization algorithm and technology of the system are mainly discussed. The large-scale network system's plane visualization is realized based on the divide and conquer thought. First, the topology of the large-scale network is divided into some small-scale networks by the MLkP/CR algorithm. Second, the sub graph plane visualization algorithm is applied to each small-scale network. Finally, the small-scale networks' topologies are combined into a topology based on the automatic distribution algorithm of force analysis. As the algorithm transforms the large-scale network topology plane visualization problem into a series of small-scale network topology plane visualization and distribution problems, it has higher parallelism and is able to handle the display of ultra-large-scale network topology. PMID:24191145
Lober, R.R.; Tautges, T.J.; Vaughan, C.T.
1997-03-01
Paving is an automated mesh generation algorithm which produces all-quadrilateral elements. It can additionally generate these elements in varying sizes such that the resulting mesh adapts to a function distribution, such as an error function. While powerful, conventional paving is a very serial algorithm in its operation. Parallel paving is the extension of serial paving into parallel environments to perform the same meshing functions as conventional paving only on distributed, discretized models. This extension allows large, adaptive, parallel finite element simulations to take advantage of paving's meshing capabilities for h-remap remeshing. A significantly modified version of the CUBIT mesh generation code has been developed to host the parallel paving algorithm and demonstrate its capabilities on both two dimensional and three dimensional surface geometries and compare the resulting parallel produced meshes to conventionally paved meshes for mesh quality and algorithm performance. Sandia's "tiling" dynamic load balancing code has also been extended to work with the paving algorithm to retain parallel efficiency as subdomains undergo iterative mesh refinement.
Efficient implementation of Jacobi algorithms and Jacobi sets on distributed memory architectures
NASA Astrophysics Data System (ADS)
Eberlein, P. J.; Park, Haesun
1990-04-01
One-sided methods for implementing Jacobi diagonalization algorithms have been recently proposed for both distributed memory and vector machines. These methods are naturally well suited to distributed memory and vector architectures because of their inherent parallelism and their abundance of vector operations. Also, one-sided methods require substantially less message passing than the two-sided methods, and thus can achieve higher efficiency. We describe in detail the use of the one-sided Jacobi rotation as opposed to the rotation used in the "Hestenes" algorithm; we perceive this difference to have been widely misunderstood. Furthermore, the one-sided algorithm generalizes to other problems such as the nonsymmetric eigenvalue problem while the Hestenes algorithm does not. We discuss two new implementations for Jacobi sets for a ring connected array of processors and show their isomorphism to the round-robin ordering. Moreover, we show that the two implementations produce Jacobi sets in identical orders up to a relabeling. These orderings are optimal in the sense that they complete each sweep in a minimum number of stages with minimal communication. We present implementation results of one-sided Jacobi algorithms using these orderings on the NCUBE/seven hypercube as well as the Intel iPSC/2 hypercube. Finally, we mention how other orderings can be implemented. The number of nonisomorphic Jacobi sets has recently been shown to become infinite with increasing n. The work of this author was supported by National Science Foundation Grant CCR-8813493.
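For readers unfamiliar with the round-robin ordering mentioned above, the sketch below generates it with the classic "circle method": for even n, one sweep consists of n - 1 stages, and each stage pairs all n column indices into n/2 disjoint pairs that could be rotated concurrently. This illustrates the reference ordering only. The ring-based implementations described in the abstract are not reproduced, and the function name is hypothetical.

```python
def round_robin_stages(n):
    """Return the n - 1 stages of one sweep as lists of disjoint index pairs."""
    if n < 2 or n % 2:
        raise ValueError("n must be an even integer >= 2")
    order = list(range(n))
    stages = []
    for _ in range(n - 1):
        # Pair the k-th entry from the left with the k-th entry from the right.
        stages.append([tuple(sorted((order[k], order[n - 1 - k])))
                       for k in range(n // 2)])
        # Keep the first index fixed and rotate the remaining ones by one.
        order = [order[0], order[-1]] + order[1:-1]
    return stages


if __name__ == "__main__":
    for stage in round_robin_stages(4):
        print(stage)
```

Every unordered pair (i, j) appears exactly once per sweep, so each off-diagonal position is visited once, and n - 1 is the smallest possible number of stages when each stage can hold at most n/2 pairs.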
NASA Astrophysics Data System (ADS)
Yang, Fugui; Wang, Anting; Dong, Lei; Ming, Hai
2013-04-01
A classical iterative Lucy-Richardson (LR) inversion algorithm is proposed for recovering particle-size distributions (PSD) from light-scattering data. The convergence of the iteration is validated in numerical simulations for three different distributions: the gamma, the log-normal, and the Rosin-Rammler. The accuracy of the inversion is checked graphically against the exact distribution with good results, even for synthesized intensity data with a signal-to-noise ratio smaller than 20 dB. Finally, an experiment with a linear charge-coupled device as the detector is carried out, and the PSD is recovered successfully by the LR inversion method.
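The LR iteration itself is compact. Given a non-negative kernel matrix K (row i is the response of detector element i to each size class) and measured intensities y, the estimate x is updated multiplicatively as x_j <- x_j * [sum_i K_ij * y_i / (K x)_i] / [sum_i K_ij]. The sketch below is a generic pure-Python version of that update, assuming a small dense kernel. The toy kernel in the usage example is invented for illustration and is not a physical scattering model.

```python
def lucy_richardson(kernel, measured, iterations=500):
    """Estimate a non-negative x with kernel @ x ~= measured by LR iteration.

    kernel: list of rows (m x n), all entries >= 0, no all-zero row or column.
    measured: list of m non-negative intensities.
    """
    m, n = len(kernel), len(kernel[0])
    col_sums = [sum(kernel[i][j] for i in range(m)) for j in range(n)]
    estimate = [1.0] * n  # flat, strictly positive starting guess
    for _ in range(iterations):
        predicted = [sum(kernel[i][j] * estimate[j] for j in range(n))
                     for i in range(m)]
        ratio = [measured[i] / predicted[i] if predicted[i] > 0 else 0.0
                 for i in range(m)]
        estimate = [estimate[j]
                    * sum(kernel[i][j] * ratio[i] for i in range(m))
                    / col_sums[j]
                    for j in range(n)]
    return estimate


if __name__ == "__main__":
    toy_kernel = [[0.8, 0.2, 0.0],
                  [0.2, 0.6, 0.2],
                  [0.0, 0.2, 0.8]]
    true_psd = [1.0, 2.0, 3.0]
    data = [sum(k * x for k, x in zip(row, true_psd)) for row in toy_kernel]
    print(lucy_richardson(toy_kernel, data))
```

The multiplicative form keeps every component non-negative without an explicit constraint, which is one reason the method suits distributions such as a PSD. With noisy data the iteration count acts as a regulariser, so it is usually stopped early rather than run to convergence.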
Execution time supports for adaptive scientific algorithms on distributed memory machines
NASA Technical Reports Server (NTRS)
Berryman, Harry; Saltz, Joel; Scroggs, Jeffrey
1990-01-01
Optimizations are considered that are required for efficient execution of code segments that consist of loops over distributed data structures. The PARTI (Parallel Automated Runtime Toolkit at ICASE) execution time primitives are designed to carry out these optimizations and can be used to implement a wide range of scientific algorithms on distributed memory machines. These primitives allow the user to control array mappings in a way that gives an appearance of shared memory. Computations can be based on a global index set. Primitives are used to carry out gather and scatter operations on distributed arrays. Communications patterns are derived at runtime, and the appropriate send and receive messages are automatically generated.
Abedini, Mohammad; Moradi, Mohammad H; Hosseinian, S M
2016-03-01
This paper proposes a novel method to address reliability and technical problems of microgrids (MGs) based on designing a number of self-adequate autonomous sub-MGs by adopting an MG clustering approach. In doing so, a multi-objective optimization problem is developed where power loss reduction, voltage profile improvement and reliability enhancement are considered as the objective functions. To solve the optimization problem, a hybrid algorithm named HS-GA, based on genetic and harmony search algorithms, is provided, and a load flow method is given to model different types of DGs as droop controllers. The performance of the proposed method is evaluated in two case studies. The results provide support for the performance of the proposed method. PMID:26767800
F TURNOFF DISTRIBUTION IN THE GALACTIC HALO USING GLOBULAR CLUSTERS AS PROXIES
Newby, Matthew; Newberg, Heidi Jo; Simones, Jacob; Cole, Nathan; Monaco, Matthew
2011-12-20
F turnoff stars are important tools for studying Galactic halo substructure because they are plentiful, luminous, and can be easily selected by their photometric colors from large surveys such as the Sloan Digital Sky Survey (SDSS). We describe the absolute magnitude distribution of color-selected F turnoff stars, as measured from SDSS data, for 11 globular clusters in the Milky Way halo. We find that the M_g distribution of turnoff stars is intrinsically the same for all clusters studied, and is well fit by two half-Gaussian functions, centered at μ = 4.18, with a bright-side σ = 0.36, and with a faint-side σ = 0.76. However, the color errors and detection efficiencies cause the observed σ of the faint-side Gaussian to change with magnitude due to contamination from redder main-sequence stars (40% at 21st magnitude). We present a function that will correct for this magnitude-dependent change in selected stellar populations, when calculating stellar density from color-selected turnoff stars. We also present a consistent set of distances, ages, and metallicities for 11 clusters in the SDSS Data Release 7. We calculate a linear correction function to Padova isochrones so that they are consistent with SDSS globular cluster data from previous papers. We show that our cluster population falls along the Milky Way age-metallicity relationship (AMR), and further find that isochrones for stellar populations on the AMR have very similar turnoffs; increasing metallicity and decreasing age conspire to produce similar turnoff magnitudes and colors for all old clusters that lie on the AMR.
F Turnoff Distribution in the Galactic Halo Using Globular Clusters as Proxies
NASA Astrophysics Data System (ADS)
Newby, Matthew; Newberg, H. J.; Simones, J.; Monaco, M.; Cole, N.
2012-01-01
F turnoff stars are important tools for studying Galactic halo substructure because they are plentiful, luminous, and can be easily selected by their photometric colors from large surveys such as the Sloan Digital Sky Survey (SDSS). We describe the absolute magnitude distribution of color-selected F turnoff stars, as measured from SDSS data, for eleven globular clusters in the Milky Way halo. We find that the absolute magnitude distribution of turnoff stars is intrinsically the same for all clusters studied, and is well fit by two half Gaussian functions, centered at μ = 4.18, with a bright-side σ = 0.36, and with a faint-side σ = 0.76. However, the color errors and detection efficiencies cause the observed σ of the faint-side Gaussian to change with magnitude due to contamination from redder main sequence stars (40% at 21st magnitude). We present a function that will correct for this magnitude-dependent change in selected stellar populations, when calculating stellar density from color-selected turnoff stars. We also present a consistent set of distances, ages and metallicities for eleven clusters in the SDSS Data Release 7. We calculate a linear correction function to Padova isochrones so that they are consistent with SDSS globular cluster data from previous papers. We show that our cluster population falls along the theoretical Age-Metallicity Relationship (AMR), and further find that isochrones for stellar populations on the AMR have very similar turnoffs; increasing metallicity and decreasing age conspire to produce similar turnoff magnitudes and colors for all old clusters that lie on the AMR. This research was supported by NSF grant AST 10-09670 and the NASA/NY Space Grant.
F Turnoff Distribution in the Galactic Halo Using Globular Clusters as Proxies
NASA Astrophysics Data System (ADS)
Newby, Matthew; Newberg, Heidi Jo; Simones, Jacob; Cole, Nathan; Monaco, Matthew
2011-12-01
F turnoff stars are important tools for studying Galactic halo substructure because they are plentiful, luminous, and can be easily selected by their photometric colors from large surveys such as the Sloan Digital Sky Survey (SDSS). We describe the absolute magnitude distribution of color-selected F turnoff stars, as measured from SDSS data, for 11 globular clusters in the Milky Way halo. We find that the M_g distribution of turnoff stars is intrinsically the same for all clusters studied, and is well fit by two half-Gaussian functions, centered at μ = 4.18, with a bright-side σ = 0.36, and with a faint-side σ = 0.76. However, the color errors and detection efficiencies cause the observed σ of the faint-side Gaussian to change with magnitude due to contamination from redder main-sequence stars (40% at 21st magnitude). We present a function that will correct for this magnitude-dependent change in selected stellar populations, when calculating stellar density from color-selected turnoff stars. We also present a consistent set of distances, ages, and metallicities for 11 clusters in the SDSS Data Release 7. We calculate a linear correction function to Padova isochrones so that they are consistent with SDSS globular cluster data from previous papers. We show that our cluster population falls along the Milky Way age-metallicity relationship (AMR), and further find that isochrones for stellar populations on the AMR have very similar turnoffs; increasing metallicity and decreasing age conspire to produce similar turnoff magnitudes and colors for all old clusters that lie on the AMR.
A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems
NASA Astrophysics Data System (ADS)
Vlaj, Damjan; Kotnik, Bojan; Horvat, Bogomir; Kačič, Zdravko
2005-12-01
This paper presents a novel computationally efficient voice activity detection (VAD) algorithm and emphasizes the importance of such algorithms in distributed speech recognition (DSR) systems. When using VAD algorithms in telecommunication systems, the required capacity of the speech transmission channel can be reduced if only the speech parts of the signal are transmitted. A similar objective can be adopted in DSR systems, where the nonspeech parameters are not sent over the transmission channel. A novel approach is proposed for VAD decisions based on mel-filter bank (MFB) outputs with the so-called Hangover criterion. Comparative tests are presented between the proposed MFB VAD algorithm and three VAD algorithms used in the G.729, G.723.1, and DSR (advanced front-end) Standards. These tests were made on the Aurora 2 database, with different signal-to-noise ratios (SNRs). In the speech recognition tests, the proposed MFB VAD outperformed all three VAD algorithms used in the standards by [InlineEquation not available: see fulltext.] relative (G.723.1 VAD), by [InlineEquation not available: see fulltext.] relative (G.729 VAD), and by [InlineEquation not available: see fulltext.] relative (DSR VAD) at all SNRs.
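A hangover criterion can be illustrated with a few lines of code. The sketch below is not the MFB VAD of the paper: it assumes a single per-frame energy value (a stand-in for the mel-filter bank outputs) compared against a fixed threshold, and the names `vad_with_hangover`, `threshold`, and `hangover_frames` are hypothetical.

```python
def vad_with_hangover(frame_energies, threshold, hangover_frames):
    """Return one speech/non-speech decision (True/False) per frame.

    Assumption: a frame is 'active' when its energy exceeds a fixed
    threshold. After the last active frame, the decision is held at
    'speech' for hangover_frames further frames, so that weak word
    endings are not clipped.
    """
    decisions = []
    remaining = 0  # hangover frames still available
    for energy in frame_energies:
        if energy > threshold:
            decisions.append(True)
            remaining = hangover_frames  # re-arm the hangover counter
        elif remaining > 0:
            decisions.append(True)
            remaining -= 1
        else:
            decisions.append(False)
    return decisions
```

In a DSR front end, only the frames marked True would be sent over the transmission channel.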
An efficient distributed algorithm for constructing spanning trees in wireless sensor networks.
Lachowski, Rosana; Pellenz, Marcelo E; Penna, Manoel C; Jamhour, Edgard; Souza, Richard D
2015-01-01
Monitoring and data collection are the two main functions in wireless sensor networks (WSNs). Collected data are generally transmitted via multihop communication to a special node, called the sink. While in a typical WSN, nodes have a sink node as the final destination for the data traffic, in an ad hoc network, nodes need to communicate with each other. For this reason, routing protocols for ad hoc networks are inefficient for WSNs. Trees, on the other hand, are classic routing structures explicitly or implicitly used in WSNs. In this work, we implement and evaluate distributed algorithms for constructing routing trees in WSNs described in the literature. After identifying the drawbacks and advantages of these algorithms, we propose a new algorithm for constructing spanning trees in WSNs. The performance of the proposed algorithm and the quality of the constructed tree were evaluated in different network scenarios. The results showed that the proposed algorithm is a more efficient solution. Furthermore, the algorithm provides multiple routes to the sensor nodes to be used as mechanisms for fault tolerance and load balancing. PMID:25594593
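As a point of reference for sink-rooted routing trees, the sketch below builds a breadth-first spanning tree from the sink. It is a centralized emulation of what a flooding-based distributed construction would produce, not the algorithm proposed in the paper; the function names and the adjacency-dictionary input are assumptions made for illustration.

```python
from collections import deque


def build_routing_tree(neighbors, sink):
    """Return {node: parent} for a BFS spanning tree rooted at the sink.

    neighbors maps each node to the nodes within its radio range.
    Each node adopts as parent the first node from which the
    (emulated) tree-construction message reaches it.
    """
    parent = {sink: None}
    queue = deque([sink])
    while queue:
        node = queue.popleft()
        for nxt in neighbors[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return parent


def route_to_sink(parent, node):
    """Follow parent pointers from node up to the sink."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path
```

A node that hears the construction message from several neighbors could keep the extra senders as backup parents, which is one simple way to obtain the multiple routes mentioned above.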
NASA Astrophysics Data System (ADS)
Li, Lei; Yang, Kecheng; Li, Wei; Wang, Wanyan; Guo, Wenping; Xia, Min
2016-07-01
Conventional regularization methods have been widely used for estimating particle size distribution (PSD) in single-angle dynamic light scattering, but they cannot be used directly in multiangle dynamic light scattering (MDLS) measurements for lack of accurate angular weighting coefficients, which greatly affect the PSD determination. Moreover, none of the regularization methods performs well for both unimodal and multimodal distributions. In this paper, we propose a recursive regularization method, the Recursion Nonnegative Tikhonov-Phillips-Twomey (RNNT-PT) algorithm, for estimating the weighting coefficients and PSD from MDLS data. This is a self-adaptive algorithm that distinguishes characteristics of PSDs and chooses the optimal inversion method between the Nonnegative Tikhonov (NNT) and Nonnegative Phillips-Twomey (NNPT) regularization algorithms efficiently and automatically. In simulations, the proposed algorithm estimated the PSDs more accurately than the classical regularization methods, performed stably against random noise, and was adaptable to both unimodal and multimodal distributions. Furthermore, we found that the six-angle analysis in the 30-130° range is an optimal angle set for both unimodal and multimodal PSDs.
A Comparison of Three Algorithms for Approximating the Distance Distribution in Real-World Graphs
NASA Astrophysics Data System (ADS)
Crescenzi, Pierluigi; Grossi, Roberto; Lanzi, Leonardo; Marino, Andrea
The distance for a pair of vertices in a graph G is the length of the shortest path between them. The distance distribution for G specifies how many vertex pairs are at distance h, for all feasible values h. We study three fast randomized algorithms to approximate the distance distribution in large graphs. The Eppstein-Wang (ew) algorithm exploits sampling through a limited (logarithmic) number of Breadth-First Searches (bfses). The Size-Estimation Framework (sef) by Cohen employs random ranking and least-element lists to provide several estimators. Finally, the Approximate Neighborhood Function (anf) algorithm by Palmer, Gibbons, and Faloutsos makes use of the probabilistic counting technique introduced by Flajolet and Martin, in order to estimate the number of distinct elements in a large multiset. We investigate how good the approximation of the distance distribution is when the three algorithms are run in similar settings. The analysis of anf derives from the results on the probabilistic counting method, while that of sef is given by Cohen. As for ew (originally designed for another problem), we extend its simple analysis in order to bound its error with high probability and to show its convergence. We then perform an experimental study on 30 real-world graphs, showing that our implementation of ew combines the accuracy of sef with the performance of anf.
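The sampling idea behind the BFS-based approach can be sketched as follows: run a BFS from a few random sources, count the vertices found at each distance, and scale the counts up to the whole graph. This is a minimal illustration, not the authors' implementation; it assumes an unweighted graph given as an adjacency dictionary, counts ordered vertex pairs, and samples sources without replacement. All names are hypothetical.

```python
import random
from collections import Counter, deque


def bfs_distances(adj, source):
    """Return {vertex: hop distance from source} for reachable vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def estimate_distance_distribution(adj, num_samples, seed=0):
    """Estimate how many ordered vertex pairs lie at each distance h >= 1."""
    rng = random.Random(seed)
    nodes = list(adj)
    sources = rng.sample(nodes, num_samples)  # distinct random sources
    counts = Counter()
    for s in sources:
        for d in bfs_distances(adj, s).values():
            if d > 0:
                counts[d] += 1
    scale = len(nodes) / num_samples  # extrapolate from sample to all sources
    return {h: c * scale for h, c in sorted(counts.items())}
```

When `num_samples` equals the number of vertices, the estimate coincides with the exact distribution; smaller samples trade accuracy for fewer BFS runs.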
NASA Astrophysics Data System (ADS)
Jia, Chun-Xiao; Liu, Run-Ran
2015-10-01
Recently, many scaling laws of the interevent time distribution of human behaviors have been observed, and researchers have also provided some quantitative understanding of human behaviors. In this paper, we propose a modified collaborative filtering algorithm that makes use of the scaling law of human behaviors for information filtering. Extensive experimental analyses demonstrate that the accuracies on the MovieLens and Last.fm datasets could be improved greatly, compared with standard collaborative filtering. Surprisingly, further statistical analyses suggest that the present algorithm could simultaneously improve the novelty and diversity of recommendations. This work provides a creditable way for highly efficient information filtering.
NASA Astrophysics Data System (ADS)
Tian, Shu; Zhao, Min
2013-03-01
To solve the difficult problem of locating single-phase ground faults in coal mine underground distribution networks, a fault location method is presented that uses an RBF network optimized by an improved PSO algorithm. The method is based on the mapping relationship between the fault point position and the wavelet packet transform modulus maxima of the transient zero-sequence current in specific frequency bands of the fault line. The simulation analysis results in the cases of different transition resistances and fault distances show that the RBF network optimized by the improved PSO algorithm can obtain accurate and reliable fault location results, and the fault location performance is better than that of the traditional RBF network.
Multi-agent coordination algorithms for control of distributed energy resources in smart grids
NASA Astrophysics Data System (ADS)
Cortes, Andres
Sustainable energy is a top priority for researchers these days, since electricity and transportation are pillars of modern society. Integration of clean energy technologies such as wind, solar, and plug-in electric vehicles (PEVs) is a major engineering challenge in the operation and management of power systems. This is due to the uncertain nature of renewable energy technologies and the large amount of extra load that PEVs would add to the power grid. Given the networked structure of a power system, multi-agent control and optimization strategies are natural approaches to address the various problems of interest for the safe and reliable operation of the power grid. The distributed computation in multi-agent algorithms addresses three problems at the same time: i) it allows for the handling of problems with millions of variables that a single processor cannot compute, ii) it allows certain independence and privacy to electricity customers by not requiring any usage information, and iii) it is robust to localized failures in the communication network, being able to solve problems by simply neglecting the failing section of the system. We propose various algorithms to coordinate storage, generation, and demand resources in a power grid using multi-agent computation and decentralized decision making. First, we introduce a hierarchical vehicle-one-grid (V1G) algorithm for coordination of PEVs under usage constraints, where energy only flows from the grid into the batteries of PEVs. We then present a hierarchical vehicle-to-grid (V2G) algorithm for PEV coordination that takes into consideration line capacity constraints in the distribution grid, and where energy flows both ways, from the grid into the batteries, and from the batteries to the grid. Next, we develop a greedy-like hierarchical algorithm for management of demand response events with on/off loads. Finally, we introduce distributed algorithms for the optimal control of distributed energy resources, i
Saito, Asami; Kawahara, Makoto; Ikeda, Seishi; Ishimine, Masato; Akao, Shoichiro; Minamisawa, Kiwamu
2008-01-01
Endophytic clostridia present on various plants as obligate anaerobes were surveyed by terminal restriction fragment length polymorphism (TRFLP) analysis specific to the clostridial 16S rRNA gene. Endophytic clostridia were detected in 10 plant types: sugarcane, cultivated rice, corn, tobacco, soybean, bermuda grass, tall fescue, and three mangrove species. Phylogenetically, cluster XIVa clostridia were detected more frequently than cluster I clostridia in aerial parts. Isolation of clostridia from surface-sterilized sugarcane stem validated the TRFLP results. Plant-derived clostridia occupied two unique phylogenetic positions (groups I and II) within cluster XIVa. Most of cluster XIVa clostridia from other sources (e.g., human, animal, and insect intestines) were located outside these groups. Thus two unique groups of cluster XIVa clostridia are widely distributed in plants, including crops. In field-grown soybeans, TRFLP analysis revealed clostridia only in a non-nodulating mutant. Ribosomal intergenic spacer analysis (RISA) showed that the bacterial community in soybean shoot depended partly on the soybean nodulation genotype. PMID:21558691
A Distributed Transmission Rate Adjustment Algorithm in Heterogeneous CSMA/CA Networks
Xie, Shuanglong; Low, Kay Soon; Gunawan, Erry
2015-01-01
Distributed transmission rate tuning is important for a wide variety of IEEE 802.15.4 network applications such as industrial network control systems. Such systems often require each node to sustain certain throughput demand in order to guarantee the system performance. It is thus essential to determine a proper transmission rate that can meet the application requirement and compensate for network imperfections (e.g., packet loss). Such a tuning in a heterogeneous network is difficult due to the lack of modeling techniques that can deal with the heterogeneity of the network as well as the network traffic changes. In this paper, a distributed transmission rate tuning algorithm in a heterogeneous IEEE 802.15.4 CSMA/CA network is proposed. Each node uses the results of clear channel assessment (CCA) to estimate the busy channel probability. Then a mathematical framework is developed to estimate the on-going heterogeneous traffics using the busy channel probability at runtime. Finally a distributed algorithm is derived to tune the transmission rate of each node to accurately meet the throughput requirement. The algorithm does not require modifications on IEEE 802.15.4 MAC layer and it has been experimentally implemented and extensively tested using TelosB nodes with the TinyOS protocol stack. The results reveal that the algorithm is accurate and can satisfy the throughput demand. Compared with existing techniques, the algorithm is fully distributed and thus does not require any central coordination. With this property, it is able to adapt to traffic changes and re-adjust the transmission rate to the desired level, which cannot be achieved using the traditional modeling techniques. PMID:25822140
Spatial Structures in the Globular Cluster Distribution of the 10 Brightest Virgo Galaxies
NASA Astrophysics Data System (ADS)
D'Abrusco, R.; Fabbiano, G.; Zezas, A.
2015-05-01
We report the discovery of significant localized structures in the projected two-dimensional spatial distributions of the Globular Cluster (GC) systems of the 10 brightest galaxies in the Virgo Cluster. We use catalogs of GCs extracted from the Hubble Space Telescope (HST) Advanced Camera for Surveys (ACS) Virgo Cluster Survey (ACSVCS) imaging data, complemented, when available, by additional archival ACS data. These structures have projected sizes ranging from ~5″ to a few arcminutes (~1 to ~25 kpc). Their morphologies range from localized, circular to coherent, complex shapes resembling arcs and streams. The largest structures are preferentially aligned with the major axis of the host galaxy. A few relatively smaller structures follow the minor axis. Differences in the shape and significance of the GC structures can be noticed by investigating the spatial distribution of GCs grouped by color and luminosity. The largest coherent GC structures are located in low-density regions within the Virgo cluster. This trend is more evident in the red GC population, which is believed to form in mergers involving late-type galaxies. We suggest that GC over-densities may be driven by either accretion of satellite galaxies, major dissipationless mergers, or wet dissipation mergers. We discuss caveats to these scenarios, and estimate the masses of the potential progenitor galaxies. These masses range in the interval $10^{8.5}$-$10^{9.5}\,M_{\odot}$, larger than those of the Local Group dwarf galaxies.
Biological response to nonuniform distributions of (210)Po in multicellular clusters.
Neti, Prasad V S V; Howell, Roger W
2007-09-01
Radionuclides are distributed nonuniformly in tissue. The present work examined the impact of nonuniformities at the multicellular level on the lethal effects of (210)Po. A three-dimensional (3D) tissue culture model was used wherein V79 cells were labeled with (210)Po-citrate and mixed with unlabeled cells, and multicellular clusters were formed by centrifugation. The labeled cells were located randomly in the cluster to achieve a uniform distribution of radioactivity at the macroscopic level that was nonuniform at the multicellular level. The clusters were maintained at 10.5 degrees C for 72 h to allow alpha-particle decays to accumulate and then dismantled, and the cells were seeded for colony formation. Unlike typical survival curves for alpha particles, two-component exponential dose-response curves were observed for all three labeling conditions. Furthermore, the slopes of the survival curves for 100, 10 and 1% labeling were different. Neither the mean cluster absorbed dose nor a semi-empirical multicellular dosimetry approach could accurately predict the lethal effects of (210)Po-citrate. PMID:17705637
Effects of charging and doping on orbital hybridizations and distributions in TiO2 clusters
NASA Astrophysics Data System (ADS)
Zhao, Hong Min; Wu, Miao Miao; Wang, Qian; Jena, Puru
2011-11-01
Charging and doping are two important strategies used in TiO2 quantum dots for photocatalysis and photovoltaics. Using small clusters as the prototypes for quantum dots, we have carried out density functional calculations to study the size-specific effects of charging and doping on geometry, electronic structure, frontier orbital distribution, and orbital hybridization. We find that in neutral (TiO2)n clusters the charge transfer from Ti to O is almost size independent, while for the anionic (TiO2)n clusters the corresponding charge transfer is reduced but it increases with size. When one O atom is substituted with N, the charge transfer is also reduced due to the smaller electron affinity of N. As the cluster size increases, the populations of 3d and 4s orbitals of Ti decrease with size, while the populations of the 4p orbital increase, suggesting size dependence of spd hybridizations. The present study clearly shows that charging and doping are effective ways for tailoring the energy gap, orbital distributions, and hybridizations.
NASA Astrophysics Data System (ADS)
Choi, Hon-Chit; Wen, Lingfeng; Eberl, Stefan; Feng, Dagan
2006-03-01
Dynamic Single Photon Emission Computed Tomography (SPECT) has the potential to quantitatively estimate physiological parameters by fitting compartment models to the tracer kinetics. The generalized linear least square method (GLLS) is an efficient method to estimate unbiased kinetic parameters and parametric images. However, due to the low sensitivity of SPECT, noisy data can cause voxel-wise parameter estimation by GLLS to fail. Fuzzy C-Means (FCM) clustering and modified FCM, which also utilizes information from the immediate neighboring voxels, are proposed to improve the voxel-wise parameter estimation of GLLS. Monte Carlo simulations were performed to generate dynamic SPECT data with different noise levels, which were processed by general and modified FCM clustering. Parametric images were estimated by Logan and Yokoi graphical analysis and GLLS. The influx rate (K_I) and volume of distribution (V_d) were estimated for the cerebellum, thalamus and frontal cortex. Our results show that (1) FCM reduces the bias and improves the reliability of parameter estimates for noisy data, (2) GLLS provides estimates of micro parameters (K_I to k_4) as well as macro parameters, such as volume of distribution (V_d) and binding potential (BP_I and BP_II) and (3) FCM clustering incorporating neighboring voxel information does not improve the parameter estimates, but reduces noise in the parametric images. These findings indicate that pre-segmentation with traditional FCM clustering is desirable for generating voxel-wise parametric images with GLLS from dynamic SPECT data.
Bradac, M.
2005-04-13
We have shown that the cluster-mass reconstruction method which combines strong and weak gravitational lensing data, developed in the first paper in the series, successfully reconstructs the mass distribution of a simulated cluster. In this paper we apply the method to the ground-based high-quality multi-colour data of RX J1347.5-1145, the most X-ray luminous cluster to date. A new analysis of the cluster core based on very deep, multi-colour VLT/FORS data reveals many more arc candidates than previously known for this cluster. The combined strong and weak lensing reconstruction confirms that the cluster is indeed very massive. If the redshift and identification of the multiple-image system as well as the redshift estimates of the source galaxies used for weak lensing are correct, we determine the enclosed cluster mass in a cylinder to be $M(<360\,h^{-1}\,\mathrm{kpc}) = (1.2 \pm 0.3) \times 10^{15}\,M_{\odot}$. In addition, the reconstructed mass distribution follows the distribution found with independent methods (X-ray measurements, SZ). With higher resolution (e.g. HST imaging data) more reliable multiple imaging information can be obtained and the reconstruction can be improved to accuracies greater than what is currently possible with weak and strong lensing techniques.
Bradač, M.; Erben, T.; Schneider, P.; Hildebrandt, H.; Lombardi, M.; Schirmer, M.; Miralles, J. -M.; Clowe, D.; Schindler, S.
2005-07-01
We have shown that the cluster-mass reconstruction method which combines strong and weak gravitational lensing data, developed in the first paper in the series, successfully reconstructs the mass distribution of a simulated cluster. In this paper we apply the method to the ground-based high-quality multi-colour data of RX J1347.5-1145, the most X-ray luminous cluster to date. A new analysis of the cluster core based on very deep, multi-colour VLT/FORS data reveals many more arc candidates than previously known for this cluster. The combined strong and weak lensing reconstruction confirms that the cluster is indeed very massive. If the redshift and identification of the multiple-image system as well as the redshift estimates of the source galaxies used for weak lensing are correct, we determine the enclosed cluster mass in a cylinder to be $M(<360\,h^{-1}\,\mathrm{kpc}) = (1.2 \pm 0.3) \times 10^{15}\,M_{\odot}$. In addition, the reconstructed mass distribution follows the distribution found with independent methods (X-ray measurements, SZ). With higher resolution (e.g. HST imaging data) more reliable multiple imaging information can be obtained and the reconstruction can be improved to accuracies greater than what is currently possible with weak and strong lensing techniques.
Swarm Intelligence in Text Document Clustering
Cui, Xiaohui; Potok, Thomas E
2008-01-01
Social animals or insects in nature often exhibit a form of emergent collective behavior. The research field that attempts to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies is called Swarm Intelligence. Compared to traditional algorithms, swarm algorithms are usually flexible, robust, decentralized and self-organized. These characteristics make swarm algorithms suitable for solving complex problems, such as document collection clustering. A major challenge of today's information society is that users are overwhelmed with information on any topic they search for. Fast and high-quality document clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize this overwhelming information. In this chapter, we introduce three nature-inspired swarm intelligence clustering approaches for document clustering analysis. These clustering algorithms use stochastic and heuristic principles discovered from observing bird flocks, fish schools and ant food foraging.
NASA Astrophysics Data System (ADS)
Wagstaff, Kiri L.
2012-03-01
clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set $X$ composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let $D$ be the number of dimensions used to represent each item, $x_i \in \mathbb{R}^D$. The clustering goal is to produce an organization $P$ of the items in $X$ that optimizes an objective function $f : P \to \mathbb{R}$, which quantifies the quality of solution $P$. Often $f$ is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure $d : X \times X \to \mathbb{R}$ of the distance between two items. A partitioning algorithm produces a set of clusters $P = \{c_1, \ldots, c_k\}$ such that the clusters are nonoverlapping ($c_i \cap c_j = \emptyset$, $i \neq j$) subsets of the data set ($\bigcup_i c_i = X$). Hierarchical algorithms produce a series of partitions $P = \{p_1, \ldots, p_{n'}\}$. For a complete hierarchy, the number of partitions $n' = n$, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains $n$ clusters, each containing a single item. For model-based clustering, each cluster $c_j$ is represented by a model $m_j$, such as the cluster center or a Gaussian distribution. The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a
NASA Astrophysics Data System (ADS)
Tropin, T. V.; Jargalan, N.; Avdeev, M. V.; Kyzyma, O. A.; Sangaa, D.; Aksenov, V. L.
2014-01-01
The aggregate growth in a C60/N-methylpyrrolidone (NMP) solution has been considered in the framework of the approach developed earlier for describing the cluster growth kinetics in fullerene polar solutions. The final cluster size distribution functions in model solutions have been estimated for two fullerene aggregation models including the influence of complex formation on the cluster growth using extrapolations of the characteristics of the cluster state and distribution parameters. Based on the obtained results, the model curves of small-angle neutron scattering have been calculated for a C60/NMP solution at various values of the model parameters.
Improved Cost-Base Design of Water Distribution Networks using Genetic Algorithm
NASA Astrophysics Data System (ADS)
Moradzadeh Azar, Foad; Abghari, Hirad; Taghi Alami, Mohammad; Weijs, Steven
2010-05-01
Population growth and the progressive extension of urbanization in different parts of Iran cause an increasing demand for primary needs. Water is the most important natural need for human life. Meeting this need requires the design and construction of water distribution networks, which impose enormous costs on the country's budget. Any reduction in these costs enables more people in society to benefit at the least cost. Therefore, municipal councils' investments need to maximize benefits or minimize expenditures. To achieve this purpose, engineering design depends on cost optimization techniques. This paper presents optimization models based on a genetic algorithm (GA) to find the minimum design cost of the water distribution network of Mahabad City (northwest Iran). By designing two models and comparing the resulting costs, the abilities of the GA were determined. The GA-based model could find optimum pipe diameters to reduce the design costs of the network. Results show that water distribution network design using the genetic algorithm could lead to a reduction of at least 7% in project costs in comparison to the classic model. Keywords: Genetic Algorithm, Optimum Design of Water Distribution Network, Mahabad City, Iran.
NASA Astrophysics Data System (ADS)
Castellini, P.; Cecchini, S.; Stroppa, L.; Paone, N.
2015-02-01
The paper presents an adaptive illumination system for image quality enhancement in vision-based quality control systems. In particular, a spatial modulation of illumination intensity is proposed in order to improve image quality, thus compensating for different target scattering properties, local reflections and fluctuations of ambient light. The desired spatial modulation of illumination is obtained by a digital light projector, used to illuminate the scene with an arbitrary spatial distribution of light intensity, designed to improve feature extraction in the region of interest. The spatial distribution of illumination is optimized by running a genetic algorithm. An image quality estimator is used to close the feedback loop and to stop iterations once the desired image quality is reached. The technique proves particularly valuable for optimizing the spatial illumination distribution in the region of interest, with the remarkable capability of the genetic algorithm to adapt the light distribution to very different target reflectivity and ambient conditions. The final objective of the proposed technique is the improvement of the matching score in the recognition of parts through matching algorithms, hence of the diagnosis of machine vision-based quality inspections. The procedure has been validated both by a numerical model and by an experimental test, referring to a significant problem of quality control for the washing machine manufacturing industry: the recognition of a metallic clamp. Its applicability to other domains is also presented, specifically for the visual inspection of shoes with retro-reflective tape and T-shirts with paillettes.
MC-BRM: A distributed RWA algorithm with minimum wavelength conversion
NASA Astrophysics Data System (ADS)
Barkabi Zanjani, Saeed; Ghaffarpour Rahbar, Akbar
2012-07-01
There are two strategies for solving Routing and Wavelength Assignment (RWA) in wavelength-routed networks: centralized and distributed. Centralized approaches are appropriate for small networks with light traffic, whereas distributed approaches are suitable for large networks with heavy traffic. Solving the RWA problem with distributed algorithms can generally be divided into two phases: a routing phase and a wavelength assignment phase. Allocating a wavelength over a physical path for a connection request can be performed by one of two major strategies: the Backward Reservation Method (BRM) and the Forward Reservation Method (FRM). In this work, we assume that every node in the network can be equipped with a number of wavelength converters. Wavelength converters are usually chosen in a free policy. However, we propose a distributed algorithm, called Minimum-Conversion Backward Reservation Method (MC-BRM), that attempts to establish light-paths with the minimum number of wavelength conversions. The MC-BRM algorithm can efficiently reduce the number of required wavelength conversions in the network. Besides improving blocking probability, MC-BRM can lead to better fairness in establishing light-paths with different numbers of hops. Finally, we perform a worst-case analysis to estimate wavelength conversion usage in individual nodes.
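The core selection problem, choosing one wavelength per hop so that the number of conversions along a fixed path is as small as possible, can be sketched with a short dynamic program. This is only an illustration of the objective, not the MC-BRM signaling protocol itself. It assumes that the set of free wavelengths on each hop is already known (for example, collected by a probe message), that any intermediate node can convert between any two wavelengths, and that each conversion costs one unit. The function name is hypothetical.

```python
def min_conversion_assignment(free_wavelengths):
    """Pick one wavelength per hop with the fewest wavelength conversions.

    free_wavelengths[i] is the set of wavelengths free on hop i of the path.
    Returns (number_of_conversions, wavelength_per_hop), or None when some
    hop has no free wavelength (the request is blocked).
    """
    if not free_wavelengths or any(not hop for hop in free_wavelengths):
        return None
    # best[w] = (conversions so far, assignment so far) ending on wavelength w
    best = {w: (0, [w]) for w in sorted(free_wavelengths[0])}
    for hop in free_wavelengths[1:]:
        cheapest_cost, cheapest_path = min(best.values(), key=lambda item: item[0])
        new_best = {}
        for w in sorted(hop):
            stay = best.get(w)
            if stay is not None and stay[0] <= cheapest_cost + 1:
                # Continue on the same wavelength: no conversion needed.
                new_best[w] = (stay[0], stay[1] + [w])
            else:
                # Convert from the cheapest previous state: one conversion.
                new_best[w] = (cheapest_cost + 1, cheapest_path + [w])
        best = new_best
    return min(best.values(), key=lambda item: item[0])
```

For a path whose hops have free sets {1, 2}, {2}, and {2, 5}, the sketch keeps wavelength 2 end to end and needs no conversion.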
NASA Astrophysics Data System (ADS)
He, Zhenzong; Qi, Hong; Wang, Yuqing; Ruan, Liming
2014-10-01
Four improved Ant Colony Optimization (ACO) algorithms, i.e., the probability density function based ACO (PDF-ACO) algorithm, the Region ACO (RACO) algorithm, the Stochastic ACO (SACO) algorithm and the Homogeneous ACO (HACO) algorithm, are employed to estimate the particle size distribution (PSD) of spheroidal particles. The direct problems are solved by the extended Anomalous Diffraction Approximation (ADA) and the Lambert-Beer law. Three commonly used monomodal distribution functions, i.e., the Rosin-Rammler (R-R) distribution function, the normal (N-N) distribution function, and the logarithmic normal (L-N) distribution function, are estimated under the dependent model. The influence of random measurement errors on the inverse results is also investigated. All the results reveal that the PDF-ACO algorithm is more accurate than the other three ACO algorithms and can be used as an effective technique to investigate the PSD of spheroidal particles. Furthermore, the Johnson's SB (J-SB) function and the modified beta (M-β) function are employed as the general distribution functions to retrieve the PSD of spheroidal particles using the PDF-ACO algorithm. The investigation shows a reasonable agreement between the original distribution function and the general distribution function when only the variation in the length of the rotational semi-axis is considered.
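For readers unfamiliar with the three monomodal functions named above (R-R, N-N, L-N), the sketch below writes them as probability densities over particle size. These are common textbook parameterizations and are an assumption here; the paper may define the functions with different parameters or as volume frequency distributions. The function and parameter names are illustrative.

```python
import math


def rosin_rammler_pdf(d, d_bar, k):
    """R-R (Weibull-type) density with characteristic size d_bar and spread k."""
    ratio = d / d_bar
    return (k / d_bar) * ratio ** (k - 1) * math.exp(-(ratio ** k))


def normal_pdf(d, mean, sigma):
    """Normal density with the given mean size and standard deviation."""
    z = (d - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def log_normal_pdf(d, median, sigma_ln):
    """Log-normal density; sigma_ln is the standard deviation of ln(d)."""
    z = (math.log(d) - math.log(median)) / sigma_ln
    return math.exp(-0.5 * z * z) / (d * sigma_ln * math.sqrt(2.0 * math.pi))
```

In an inverse problem of this kind, an optimizer such as ACO would search over the two parameters of the chosen function so that the predicted measurements match the observed ones.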
NASA Astrophysics Data System (ADS)
Zhang, Wei; Xiao, Li; Peng, Jiangde
2006-03-01
Two matrix algorithms aiming at a dynamic gain-spectrum adjustment in a backward-pumped distributed fiber Raman amplifier (B-DFRA) are developed based on the relation between changes in the pump power and the gain spectrum. Characteristic channels are chosen to reduce the dimension of matrices in the algorithm, which can be implemented by built-in microprocessors or DSP chips inside the B-DFRA module. Furthermore, as shown by the theoretical analysis and the numerical simulation, elements in the matrices can be directly and easily measured in deployed fiber plants without information on fiber parameters. These matrix algorithms are capable of adjusting the gain spectrum to fit the arbitrary profile desired in reality, while a wide dynamic range can be achieved by a multistage adjustment using matrices measured under several gain levels.