NASA Astrophysics Data System (ADS)
Graf, Norman A.
2001-07-01
An object-oriented framework for undertaking clustering algorithm studies has been developed. We present here the definitions for the abstract Cells and Clusters as well as the interface for the algorithm. We intend to use this framework to investigate the interplay between various clustering algorithms and the resulting jet reconstruction efficiency and energy resolutions to assist in the design of the calorimeter detector.
Basic cluster compression algorithm
NASA Technical Reports Server (NTRS)
Hilbert, E. E.; Lee, J.
1980-01-01
Feature extraction and data compression of LANDSAT data is accomplished by BCCA program which reduces costs associated with transmitting, storing, distributing, and interpreting multispectral image data. Algorithm uses spatially local clustering to extract features from image data to describe spectral characteristics of data set. Approach requires only simple repetitive computations, and parallel processing can be used for very high data rates. Program is written in FORTRAN IV for batch execution and has been implemented on SEL 32/55.
Ultrametric Hierarchical Clustering Algorithms.
ERIC Educational Resources Information Center
Milligan, Glenn W.
1979-01-01
Johnson has shown that the single linkage and complete linkage hierarchical clustering algorithms induce a metric on the data known as the ultrametric. Johnson's proof is extended to four other common clustering algorithms. Two additional methods also produce hierarchical structures which can violate the ultrametric inequality. (Author/CTM)
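The ultrametric inequality referred to above, d(x, z) <= max(d(x, y), d(y, z)), can be checked numerically. The following is a minimal sketch, not code from the cited paper: the function names are hypothetical, and single linkage is implemented naively for small inputs. It builds the merge-height (cophenetic) matrix that single linkage induces and tests the inequality on it.

```python
from itertools import permutations

def single_linkage_cophenetic(dist):
    """Return the merge-height (cophenetic) matrix produced by single linkage."""
    n = len(dist)
    clusters = [{i} for i in range(n)]
    coph = [[0.0] * n for _ in range(n)]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest minimum pairwise distance.
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        # Every pair split across the two clusters first joins at height d.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = d
        clusters[a] |= clusters[b]
        del clusters[b]
    return coph

def is_ultrametric(m, tol=1e-12):
    """Check m[x][z] <= max(m[x][y], m[y][z]) for all triples of distinct indices."""
    n = len(m)
    return all(m[x][z] <= max(m[x][y], m[y][z]) + tol
               for x, y, z in permutations(range(n), 3))
```

For points on a line, the raw distances generally violate the inequality while the single-linkage merge heights satisfy it, which is the property the abstract attributes to Johnson.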
Parallel Wolff Cluster Algorithms
NASA Astrophysics Data System (ADS)
Bae, S.; Ko, S. H.; Coddington, P. D.
The Wolff single-cluster algorithm is the most efficient method known for Monte Carlo simulation of many spin models. Due to the irregular size, shape and position of the Wolff clusters, this method does not easily lend itself to efficient parallel implementation, so that simulations using this method have thus far been confined to workstations and vector machines. Here we present two parallel implementations of this algorithm, and show that one gives fairly good performance on a MIMD parallel computer.
Gao, Ying; Wkram, Chris Hadri; Duan, Jiajie; Chou, Jarong
2015-01-01
In order to prolong the network lifetime, energy-efficient protocols adapted to the features of wireless sensor networks should be used. This paper explores in depth the nature of heterogeneous wireless sensor networks, and finally proposes an algorithm to address the problem of finding an effective pathway for heterogeneous clustering energy. The proposed algorithm implements cluster head selection according to the degree of energy attenuation during the network’s running and the degree of candidate nodes’ effective coverage on the whole network, so as to obtain an even energy consumption over the whole network for the situation with high degree of coverage. Simulation results show that the proposed clustering protocol has better adaptability to heterogeneous environments than existing clustering algorithms in prolonging the network lifetime. PMID:26690440
Multimodal Estimation of Distribution Algorithms.
Yang, Qiang; Chen, Wei-Neng; Li, Yun; Chen, C L Philip; Xu, Xiang-Min; Zhang, Jun
2016-02-15
Taking the advantage of estimation of distribution algorithms (EDAs) in preserving high diversity, this paper proposes a multimodal EDA. Integrated with clustering strategies for crowding and speciation, two versions of this algorithm are developed, which operate at the niche level. Then these two algorithms are equipped with three distinctive techniques: 1) a dynamic cluster sizing strategy; 2) an alternative utilization of Gaussian and Cauchy distributions to generate offspring; and 3) an adaptive local search. The dynamic cluster sizing affords a potential balance between exploration and exploitation and reduces the sensitivity to the cluster size in the niching methods. Taking advantages of Gaussian and Cauchy distributions, we generate the offspring at the niche level through alternatively using these two distributions. Such utilization can also potentially offer a balance between exploration and exploitation. Further, solution accuracy is enhanced through a new local search scheme probabilistically conducted around seeds of niches with probabilities determined self-adaptively according to fitness values of these seeds. Extensive experiments conducted on 20 benchmark multimodal problems confirm that both algorithms can achieve competitive performance compared with several state-of-the-art multimodal algorithms, which is supported by nonparametric tests. Especially, the proposed algorithms are very promising for complex problems with many local optima.
Energy Aware Clustering Algorithms for Wireless Sensor Networks
NASA Astrophysics Data System (ADS)
Rakhshan, Noushin; Rafsanjani, Marjan Kuchaki; Liu, Chenglian
2011-09-01
The sensor nodes deployed in wireless sensor networks (WSNs) are extremely power constrained, so maximizing the lifetime of the entire networks is mainly considered in the design. In wireless sensor networks, hierarchical network structures have the advantage of providing scalable and energy efficient solutions. In this paper, we investigate different clustering algorithms for WSNs and also compare these clustering algorithms based on metrics such as clustering distribution, cluster's load balancing, Cluster Head's (CH) selection strategy, CH's role rotation, node mobility, clusters overlapping, intra-cluster communications, reliability, security and location awareness.
Overlapping clusters for distributed computation.
Mirrokni, Vahab; Andersen, Reid; Gleich, David F.
2010-11-01
Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al., Nature 2009; Mishra et al., WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability ρ_∞; and (ii) the communication volume of a parallel PageRank solution (link-following α = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decrease.
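For readers unfamiliar with the PageRank computation mentioned above, the following is a minimal single-machine sketch using plain power iteration with a link-following probability of 0.85. It is not the additive Schwarz solver or the overlapping partitioning of the abstract; the function name and the dictionary-based graph format are assumptions made for illustration, and every successor is assumed to appear as a key.

```python
def pagerank(out_links, alpha=0.85, tol=1e-10, max_iter=1000):
    """Power iteration for PageRank on a dict {node: [successors]}."""
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - alpha) / n + alpha * dangling / n for v in nodes}
        # Each node passes alpha * rank, split evenly, to its successors.
        for v in nodes:
            for w in out_links[v]:
                new[w] += alpha * rank[v] / len(out_links[v])
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank
```

In a distributed setting, the inner loop over successors is where rank values cross partition boundaries, which is the communication volume the abstract measures.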
Cluster compression algorithm: A joint clustering/data compression concept
NASA Technical Reports Server (NTRS)
Hilbert, E. E.
1977-01-01
The Cluster Compression Algorithm (CCA), which was developed to reduce costs associated with transmitting, storing, distributing, and interpreting LANDSAT multispectral image data is described. The CCA is a preprocessing algorithm that uses feature extraction and data compression to more efficiently represent the information in the image data. The format of the preprocessed data enables simply a look-up table decoding and direct use of the extracted features to reduce user computation for either image reconstruction, or computer interpretation of the image data. Basically, the CCA uses spatially local clustering to extract features from the image data to describe spectral characteristics of the data set. In addition, the features may be used to form a sequence of scalar numbers that define each picture element in terms of the cluster features. This sequence, called the feature map, is then efficiently represented by using source encoding concepts. Various forms of the CCA are defined and experimental results are presented to show trade-offs and characteristics of the various implementations. Examples are provided that demonstrate the application of the cluster compression concept to multi-spectral images from LANDSAT and other sources.
Introduction to Cluster Monte Carlo Algorithms
NASA Astrophysics Data System (ADS)
Luijten, E.
This chapter provides an introduction to cluster Monte Carlo algorithms for classical statistical-mechanical systems. A brief review of the conventional Metropolis algorithm is given, followed by a detailed discussion of the lattice cluster algorithm developed by Swendsen and Wang and the single-cluster variant introduced by Wolff. For continuum systems, the geometric cluster algorithm of Dress and Krauth is described. It is shown how their geometric approach can be generalized to incorporate particle interactions beyond hardcore repulsions, thus forging a connection between the lattice and continuum approaches. Several illustrative examples are discussed.
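As an illustration of the single-cluster variant introduced by Wolff, the sketch below performs one cluster update on a two-dimensional ferromagnetic Ising model. The assumptions are a square L x L lattice with periodic boundaries, coupling J = 1, and L of at least 3; the function name and the list-of-lists spin layout are hypothetical and not taken from the chapter.

```python
import math
import random

def wolff_step(spins, L, beta, rng):
    """Flip one Wolff cluster on an L x L periodic Ising lattice (J = 1)."""
    p_add = 1.0 - math.exp(-2.0 * beta)   # probability of activating a bond
    i, j = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[i][j]
    spins[i][j] = -seed_spin              # flipping immediately marks the site as visited
    stack = [(i, j)]
    size = 1
    while stack:
        x, y = stack.pop()
        for nx, ny in (((x + 1) % L, y), ((x - 1) % L, y),
                       (x, (y + 1) % L), (x, (y - 1) % L)):
            # Only aligned, not-yet-flipped neighbours can join the cluster.
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size
```

A simulation would call `wolff_step(spins, L, beta, random.Random(seed))` repeatedly and measure observables between updates; the returned cluster size is the quantity whose distribution is studied in several of the other entries in this listing.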
A novel spatial clustering algorithm based on Delaunay triangulation
NASA Astrophysics Data System (ADS)
Yang, Xiankun; Cui, Weihong
2008-12-01
Exploratory data analysis is increasingly necessary as larger spatial data sets are managed in electro-magnetic media. Spatial clustering is one of the most important spatial data mining techniques. So far, many spatial clustering algorithms have been proposed. In this paper we propose a robust spatial clustering algorithm named SCABDT (Spatial Clustering Algorithm Based on Delaunay Triangulation). SCABDT demonstrates important advantages over previous work. First, it discovers clusters of arbitrary shape. Second, in order to execute SCABDT, we do not need any a priori knowledge of the distribution. Third, like DBSCAN, experiments show that SCABDT does not require much CPU processing time. Finally, it handles outliers efficiently.
Single-Pass Clustering Algorithm Based on Storm
NASA Astrophysics Data System (ADS)
Fang, LI; Longlong, DAI; Zhiying, JIANG; Shunzi, LI
2017-02-01
The dramatically increasing volume of data makes the computational complexity of traditional clustering algorithms rise rapidly, which leads to longer running times. To improve the efficiency of stream data clustering, a distributed real-time clustering algorithm (S-Single-Pass) based on the classic Single-Pass [1] algorithm and the Storm [2] computation framework was designed in this paper. By employing this method in Topic Detection and Tracking (TDT) [3], the real-time performance of topic detection improves effectively. The proposed method splits the clustering process into two parts: one part forms clusters through multi-thread parallel clustering, and the other part merges the clusters generated in the previous process and updates the global clusters. The experimental results show that the proposed method has nearly the same clustering accuracy as the traditional Single-Pass algorithm, that the clustering accuracy remains steady, and that the computing rate increases linearly with the number of cluster machines and nodes (processing threads).
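The classic Single-Pass procedure that the abstract builds on can be sketched as follows. This is a sequential, single-threaded illustration only and does not use Storm; the cosine similarity measure, the running-mean centroid update, and the function names are assumptions chosen for the example rather than details from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def single_pass(vectors, threshold):
    """Assign each vector, in arrival order, to the most similar cluster or open a new one."""
    centroids, counts, labels = [], [], []
    for v in vectors:
        sims = [cosine(v, c) for c in centroids]
        best = max(range(len(sims)), key=sims.__getitem__, default=None)
        if best is not None and sims[best] >= threshold:
            # Incrementally update the running mean of the chosen cluster.
            counts[best] += 1
            centroids[best] = [c + (a - c) / counts[best]
                               for c, a in zip(centroids[best], v)]
        else:
            # No existing cluster is similar enough: start a new one.
            best = len(centroids)
            centroids.append(list(v))
            counts.append(1)
        labels.append(best)
    return labels, centroids
```

Because each item is compared only against the current centroids, the result depends on arrival order and on the threshold, which is why a distributed variant needs a separate step to merge locally formed clusters into global ones.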
Online clustering algorithms for radar emitter classification.
Liu, Jun; Lee, Jim P Y; Li, Lingjie; Luo, Zhi-Quan; Wong, K Max
2005-08-01
Radar emitter classification is a special application of data clustering for classifying unknown radar emitters from received radar pulse samples. The main challenges of this task are the high dimensionality of radar pulse samples, small sample group size, and closely located radar pulse clusters. In this paper, two new online clustering algorithms are developed for radar emitter classification: One is model-based using the Minimum Description Length (MDL) criterion and the other is based on competitive learning. Computational complexity is analyzed for each algorithm and then compared. Simulation results show the superior performance of the model-based algorithm over competitive learning in terms of better classification accuracy, flexibility, and stability.
A geometric clustering algorithm with applications to structural data.
Xu, Shutan; Zou, Shuxue; Wang, Lincong
2015-05-01
An important feature of structural data, especially those from structural determination and protein-ligand docking programs, is that their distribution could be mostly uniform. Traditional clustering algorithms developed specifically for nonuniformly distributed data may not be adequate for their classification. Here we present a geometric partitional algorithm that could be applied to both uniformly and nonuniformly distributed data. The algorithm is a top-down approach that recursively selects the outliers as the seeds to form new clusters until all the structures within a cluster satisfy a classification criterion. The algorithm has been evaluated on a diverse set of real structural data and six sets of test data. The results show that it is superior to the previous algorithms for the clustering of structural data and is similar to or better than them for the classification of the test data. The algorithm should be especially useful for the identification of the best but minor clusters and for speeding up an iterative process widely used in NMR structure determination.
Self-organization and clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.
Quantum AdaBoost algorithm via cluster state
NASA Astrophysics Data System (ADS)
Li, Yuan
2017-03-01
The principles and theory of quantum computation have been investigated by researchers for many years and have been applied to improve the efficiency of classical machine learning algorithms. Based on this physical mechanism, a quantum version of the AdaBoost (Adaptive Boosting) training algorithm is proposed in this paper, the purpose of which is to construct a strong classifier. In the proposed scheme, cluster states are used to realize the weak learning algorithm and then to update the corresponding weights of the examples. As a result, a final classifier can be obtained by efficiently combining weak hypotheses, based on measuring the cluster state to reweight the distribution of examples.
Distributed Minimum Hop Algorithms
1982-01-01
acknowledgement), node d starts iteration i+1, and otherwise the algorithm terminates. A detailed description of the algorithm is given in pidgin algol...precise behavior of the algorithm under these circumstances is described by the pidgin algol program in the appendix which is executed by each node. The...l) < N!(2) for each neighbor j, and thus by induction,J -1 N!(2-1) < n-i + (Z-1) + N!(Z-1), completing the proof. Algorithm Dl in Pidgin Algol It is
Research on retailer data clustering algorithm based on Spark
NASA Astrophysics Data System (ADS)
Huang, Qiuman; Zhou, Feng
2017-03-01
Big data analysis is a hot topic in the IT field now. Spark is a high-reliability and high-performance distributed parallel computing framework for big data sets. K-means algorithm is one of the classical partition methods in clustering algorithm. In this paper, we study the k-means clustering algorithm on Spark. Firstly, the principle of the algorithm is analyzed, and then the clustering analysis is carried out on the supermarket customers through the experiment to find out the different shopping patterns. At the same time, this paper proposes the parallelization of k-means algorithm and the distributed computing framework of Spark, and gives the concrete design scheme and implementation scheme. This paper uses the two-year sales data of a supermarket to validate the proposed clustering algorithm and achieve the goal of subdividing customers, and then analyze the clustering results to help enterprises to take different marketing strategies for different customer groups to improve sales performance.
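The k-means procedure that the paper parallelizes can be sketched on a single machine as below. This is plain Python rather than Spark, and the function name and argument layout are assumptions; the comments mark the two phases that a distributed implementation would typically express as a map over points and a reduce per cluster.

```python
def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm on tuples of floats, starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # "Map" phase: label each point with the index of its nearest centroid.
        labels = [min(range(len(centroids)),
                      key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
                  for p in points]
        # "Reduce" phase: average the points that share a label.
        new = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new.append(tuple(sum(x) / len(members) for x in zip(*members))
                       if members else old)   # keep an empty cluster's centroid
        if new == centroids:
            break
        centroids = new
    return labels, centroids
```

For customer segmentation of the kind described, each point would be a feature vector per customer (for example, purchase frequency and spend), and the returned labels give the customer groups.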
Optimal Hops-Based Adaptive Clustering Algorithm
NASA Astrophysics Data System (ADS)
Xuan, Xin; Chen, Jian; Zhen, Shanshan; Kuo, Yonghong
This paper proposes an optimal hops-based adaptive clustering algorithm (OHACA). The algorithm sets an energy selection threshold before the cluster forms so that the nodes with less energy are more likely to go to sleep immediately. In the setup phase, OHACA introduces an adaptive mechanism to adjust the cluster head and balance the load. The optimal distance theory is then applied to discover the practical optimal routing path that minimizes the total transmission energy. Simulation results show that OHACA prolongs the network lifetime, improves the utilization rate and transmits more data because of its energy balance.
Algorithm Calculates Cumulative Poisson Distribution
NASA Technical Reports Server (NTRS)
Bowerman, Paul N.; Nolty, Robert C.; Scheuer, Ernest M.
1992-01-01
Algorithm calculates accurate values of cumulative Poisson distribution under conditions where other algorithms fail because numbers are so small (underflow) or so large (overflow) that computer cannot process them. Factors inserted temporarily to prevent underflow and overflow. Implemented in CUMPOIS computer program described in "Cumulative Poisson Distribution Program" (NPO-17714).
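The numerical difficulty described above can be illustrated with a short sketch. It does not reproduce the CUMPOIS program; instead it uses a related, standard device, summing the terms in log space and factoring out the largest term (log-sum-exp), so that no intermediate value overflows or underflows. The function name is hypothetical.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), summed in log space to avoid under/overflow."""
    if k < 0:
        return 0.0
    if lam == 0:
        return 1.0
    # Log of each term: -lam + i*log(lam) - log(i!), with log(i!) = lgamma(i + 1).
    logs = [-lam + i * math.log(lam) - math.lgamma(i + 1) for i in range(k + 1)]
    m = max(logs)
    # Factor out the largest term so every remaining exponent is <= 0.
    return min(1.0, math.exp(m) * sum(math.exp(t - m) for t in logs))
```

A direct evaluation of lam**i / i! fails for inputs such as lam = 1000 and i = 2000, because both numerator and denominator exceed the double-precision range, whereas the log-space form stays well within it.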
An algorithm for spatial hierarchy clustering
NASA Technical Reports Server (NTRS)
Dejesusparada, N. (Principal Investigator); Velasco, F. R. D.
1981-01-01
A method for utilizing both spectral and spatial redundancy in compacting and preclassifying images is presented. In multispectral satellite images, a high correlation exists between neighboring image points which tend to occupy dense and restricted regions of the feature space. The image is divided into windows of the same size where the clustering is made. The classes obtained in several neighboring windows are clustered, and then again successively clustered until only one region corresponding to the whole image is obtained. By employing this algorithm only a few points are considered in each clustering, thus reducing computational effort. The method is illustrated as applied to LANDSAT images.
Cluster hybrid Monte Carlo simulation algorithms.
Plascak, J A; Ferrenberg, Alan M; Landau, D P
2002-06-01
We show that addition of Metropolis single spin flips to the Wolff cluster-flipping Monte Carlo procedure leads to a dramatic increase in performance for the spin-1/2 Ising model. We also show that adding Wolff cluster flipping to the Metropolis or heat bath algorithms in systems where just cluster flipping is not immediately obvious (such as the spin-3/2 Ising model) can substantially reduce the statistical errors of the simulations. A further advantage of these methods is that systematic errors introduced by the use of imperfect random-number generation may be largely healed by hybridizing single spin flips with cluster flipping.
On scaling properties of cluster distributions in Ising models
NASA Astrophysics Data System (ADS)
Ruge, C.; Wagner, F.
1992-01-01
Scaling relations of cluster distributions for the Wolff algorithm are derived. We found them to be well satisfied for the Ising model in d=3 dimensions. Using scaling and a parametrization of the cluster distribution, we determine the critical exponent β/ν=0.516(6) with moderate effort in computing time.
Coupled cluster algorithms for networks of shared memory parallel processors
NASA Astrophysics Data System (ADS)
Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.
2007-05-01
As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry, the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS (General Atomic and Molecular Electronic Structure System) program suite [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363] and the Distributed Data Interface (DDI) [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]; however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm is presented on several large-scale clusters of SMPs.
Performance Comparison Of Evolutionary Algorithms For Image Clustering
NASA Astrophysics Data System (ADS)
Civicioglu, P.; Atasever, U. H.; Ozkan, C.; Besdok, E.; Karkinli, A. E.; Kesikoglu, A.
2014-09-01
Evolutionary computation tools are able to process real-valued numerical sets in order to extract suboptimal solutions of a designed problem. Data clustering algorithms have been intensively used for image segmentation in remote sensing applications. Despite the wide usage of evolutionary algorithms for data clustering, their clustering performance has scarcely been studied using clustering validation indexes. In this paper, recently proposed evolutionary algorithms (i.e., Artificial Bee Colony Algorithm (ABC), Gravitational Search Algorithm (GSA), Cuckoo Search Algorithm (CS), Adaptive Differential Evolution Algorithm (JADE), Differential Search Algorithm (DSA) and Backtracking Search Optimization Algorithm (BSA)) and some classical image clustering techniques (i.e., k-means, fcm, som networks) have been used to cluster images, and their performances have been compared using four clustering validation indexes. Experimental test results showed that evolutionary algorithms give more reliable cluster centers than classical clustering techniques, but their convergence time is quite long.
Parallel Clustering Algorithms for Structured AMR
Gunney, B T; Wissink, A M; Hysom, D A
2005-10-26
We compare several different parallel implementation approaches for the clustering operations performed during adaptive gridding operations in patch-based structured adaptive mesh refinement (SAMR) applications. Specifically, we target the clustering algorithm of Berger and Rigoutsos (BR91), which is commonly used in many SAMR applications. The baseline for comparison is a simplistic parallel extension of the original algorithm that works well for up to O(10^2) processors. Our goal is a clustering algorithm for machines of up to O(10^5) processors, such as the 64K-processor IBM BlueGene/Light system. We first present an algorithm that avoids the unneeded communications of the simplistic approach to improve the clustering speed by up to an order of magnitude. We then present a new task-parallel implementation to further reduce communication wait time, adding another order of magnitude of improvement. The new algorithms also exhibit more favorable scaling behavior for our test problems. Performance is evaluated on a number of large scale parallel computer systems, including a 16K-processor BlueGene/Light system.
Cluster Algorithm Special Purpose Processor
NASA Astrophysics Data System (ADS)
Talapov, A. L.; Shchur, L. N.; Andreichenko, V. B.; Dotsenko, Vl. S.
We describe a Special Purpose Processor, realizing the Wolff algorithm in hardware, which is fast enough to study the critical behaviour of 2D Ising-like systems containing more than one million spins. The processor has been checked to produce correct results for a pure Ising model and for Ising model with random bonds. Its data also agree with the Nishimori exact results for spin glass. Only minor changes of the SPP design are necessary to increase the dimensionality and to take into account more complex systems such as Potts models.
A hierarchical clustering algorithm for MIMD architecture.
Du, Zhihua; Lin, Feng
2004-12-01
Hierarchical clustering is the most often used method for grouping similar patterns of gene expression data. A fundamental problem with existing implementations of this clustering method is the inability to handle large data sets within a reasonable time and memory resources. We propose a parallelized algorithm of hierarchical clustering to solve this problem. Our implementation on a multiple instruction multiple data (MIMD) architecture shows considerable reduction in computational time and inter-node communication overhead, especially for large data sets. We use the standard message passing library, message passing interface (MPI) for any MIMD systems.
Noise-enhanced clustering and competitive learning algorithms.
Osoba, Osonde; Kosko, Bart
2013-01-01
Noise can provably speed up convergence in many centroid-based clustering algorithms. This includes the popular k-means clustering algorithm. The clustering noise benefit follows from the general noise benefit for the expectation-maximization algorithm because many clustering algorithms are special cases of the expectation-maximization algorithm. Simulations show that noise also speeds up convergence in stochastic unsupervised competitive learning, supervised competitive learning, and differential competitive learning.
Spectral clustering algorithms for ultrasound image segmentation.
Archip, Neculai; Rohling, Robert; Cooperberg, Peter; Tahmasebpour, Hamid; Warfield, Simon K
2005-01-01
Image segmentation algorithms derived from spectral clustering analysis rely on the eigenvectors of the Laplacian of a weighted graph obtained from the image. The NCut criterion was previously used for image segmentation in supervised manner. We derive a new strategy for unsupervised image segmentation. This article describes an initial investigation to determine the suitability of such segmentation techniques for ultrasound images. The extension of the NCut technique to the unsupervised clustering is first described. The novel segmentation algorithm is then performed on simulated ultrasound images. Tests are also performed on abdominal and fetal images with the segmentation results compared to manual segmentation. Comparisons with the classical NCut algorithm are also presented. Finally, segmentation results on other types of medical images are shown.
Morphology of open clusters NGC 1857 and Czernik 20 using clustering algorithms
NASA Astrophysics Data System (ADS)
Bhattacharya, S.; Mahulkar, V.; Pandaokar, S.; Singh, P. K.
2017-01-01
The morphology and cluster membership of the Galactic open clusters Czernik 20 and NGC 1857 were analyzed using two different clustering algorithms. We present the maiden use of density-based spatial clustering of applications with noise (DBSCAN) to determine open cluster morphology from spatial distribution. The region of analysis has also been spatially classified using a statistical membership determination algorithm. We utilized near infrared (NIR) data for a suitably large region around the clusters from the United Kingdom Infrared Deep Sky Survey Galactic Plane Survey star catalogue database, and also from the Two Micron All Sky Survey star catalogue database. The densest regions of the cluster morphologies (1 for Czernik 20 and 2 for NGC 1857) thus identified were analyzed with a K-band extinction map and color-magnitude diagrams (CMDs). To address significant discrepancy in known distance and reddening parameters, we carried out field decontamination of these CMDs and subsequent isochrone fitting of the cleaned CMDs to obtain reliable distance and reddening parameters for the clusters (Czernik 20: D = 2900 pc; E(J-K) = 0.33; NGC 1857: D = 2400 pc; E(J-K) = 0.18-0.19). The isochrones were also used to convert the luminosity functions for the densest regions of Czernik 20 and NGC 1857 into mass function, to derive their slopes. Additionally, a previously unknown over-density consistent with that of a star cluster is identified in the region of analysis.
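The DBSCAN step applied to the spatial distribution can be sketched as follows. This is a minimal brute-force version for two-dimensional coordinates, with hypothetical function and parameter names; it is not the authors' pipeline, and a real star catalogue would call for a spatial index and an appropriate sky-projection distance.

```python
def dbscan(points, eps, min_pts):
    """Label 2-D points with cluster ids 0, 1, ...; noise points get -1."""
    def neighbors(i):
        xi, yi = points[i]
        # A point counts as its own neighbour, as in the usual definition.
        return [j for j, (x, y) in enumerate(points)
                if (x - xi) ** 2 + (y - yi) ** 2 <= eps * eps]

    labels = [None] * len(points)       # None means "not yet visited"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1              # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reached from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:    # j is a core point, so keep expanding
                queue.extend(nbrs)
    return labels
```

Points labelled -1 correspond to the field population, while each non-negative label marks a connected over-density whose outline need not be circular, which is what makes the method suitable for studying morphology.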
Chaotic map clustering algorithm for EEG analysis
NASA Astrophysics Data System (ADS)
Bellotti, R.; De Carlo, F.; Stramaglia, S.
2004-03-01
The non-parametric chaotic map clustering algorithm has been applied to the analysis of electroencephalographic signals in order to recognize Huntington's disease, one of the most dangerous pathologies of the central nervous system. The performance of the method has been compared with that obtained through parametric algorithms, such as K-means and deterministic annealing, and a supervised multi-layer perceptron. While supervised neural networks need a training phase, performed by means of data tagged by the genetic test, and the parametric methods require a prior choice of the number of classes to find, chaotic map clustering gives natural evidence of the pathological class, without any training or supervision, thus providing a new efficient methodology for the recognition of patterns affected by Huntington's disease.
Dynamic exponents for Potts model cluster algorithms
NASA Astrophysics Data System (ADS)
Coddington, Paul D.; Baillie, Clive F.
We have studied the Swendsen-Wang and Wolff cluster update algorithms for the Ising model in 2, 3 and 4 dimensions. The data indicate simple relations between the specific heat and the Wolff autocorrelations, and between the magnetization and the Swendsen-Wang autocorrelations. This implies that the dynamic critical exponents are related to the static exponents of the Ising model. We also investigate the possibility of similar relationships for the Q-state Potts model.
First Cluster Algorithm Special Purpose Processor
NASA Astrophysics Data System (ADS)
Talapov, A. L.; Andreichenko, V. B.; Dotsenko S., Vi.; Shchur, L. N.
We describe the architecture of a special purpose processor built to realize the Wolff cluster algorithm in hardware; the algorithm is not hampered by critical slowing down. The processor simulates two-dimensional Ising-like spin systems. With minor changes, the same very effective architecture, which can be defined as a Memory Machine, can be used to study phase transitions in a wide range of models in two or three dimensions.
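For readers unfamiliar with the Wolff update, a software sketch for the two-dimensional Ising model follows. It is an illustration only, not a description of the processor: the function name `wolff_step`, the flat-list spin layout, and the coupling J = 1 are assumptions made for the example.

```python
import math
import random

def wolff_step(spins, L, beta, rng=random):
    """One Wolff update on an L x L periodic Ising lattice (assumes J = 1).

    `spins` is a flat list of +1/-1 values and is modified in place.
    Returns the size of the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)   # bond-activation probability
    seed = rng.randrange(L * L)
    s0 = spins[seed]
    in_cluster = {seed}
    stack = [seed]
    while stack:
        site = stack.pop()
        x, y = site % L, site // L
        nearest = [((x + 1) % L, y), ((x - 1) % L, y),
                   (x, (y + 1) % L), (x, (y - 1) % L)]
        for nx, ny in nearest:
            nb = ny * L + nx
            # Grow through aligned neighbours with probability p_add.
            if nb not in in_cluster and spins[nb] == s0 and rng.random() < p_add:
                in_cluster.add(nb)
                stack.append(nb)
    for site in in_cluster:
        spins[site] = -s0                 # flip the whole cluster at once
    return len(in_cluster)
```

Because a single step can flip a cluster comparable in size to the correlation length, successive configurations decorrelate much faster near the critical point than under single-spin updates.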
Dimensionality Reduction Particle Swarm Algorithm for High Dimensional Clustering
Cui, Xiaohui; ST Charles, Jesse Lee; Potok, Thomas E; Beaver, Justin M
2008-01-01
The Particle Swarm Optimization (PSO) clustering algorithm can generate more compact clustering results than the traditional K-means clustering algorithm. However, when clustering high dimensional datasets, the PSO clustering algorithm is notoriously slow because its computation cost increases exponentially with the size of the dataset dimension. Dimensionality reduction techniques offer solutions that both significantly improve the computation time, and yield reasonably accurate clustering results in high dimensional data analysis. In this paper, we introduce research that combines different dimensionality reduction techniques with the PSO clustering algorithm in order to reduce the complexity of high dimensional datasets and speed up the PSO clustering process. We report significant improvements in total runtime. Moreover, the clustering accuracy of the dimensionality reduction PSO clustering algorithm is comparable to the one that uses full dimension space.
A Distributed Flocking Approach for Information Stream Clustering Analysis
Cui, Xiaohui; Potok, Thomas E
2006-01-01
Intelligence analysts are currently overwhelmed by the number of information streams generated every day. There is a lack of comprehensive tools that can analyze these information streams in real time. Document clustering analysis plays an important role in improving the accuracy of information retrieval. However, most clustering technologies can only be applied to static document collections because they normally require a large amount of computational resources and a long time to produce accurate results. It is very difficult to cluster dynamically changing text information streams on an individual computer. Our early research has resulted in a dynamic reactive flock clustering algorithm which can continually refine the clustering result and quickly react to changes in document contents. This characteristic makes the algorithm suitable for cluster analysis of dynamically changing document information, such as text information streams. Because of the decentralized character of this algorithm, a distributed approach is a very natural way to increase its clustering speed. In this paper, we present a distributed multi-agent flocking approach for text information stream clustering and discuss the decentralized architectures and communication schemes for load balancing and status information synchronization in this approach.
Evaluation of Hierarchical Clustering Algorithms for Document Datasets
2002-06-03
new class of agglomerative algorithms, in which we introduced intermediate clusters obtained by partitional clustering algorithms to constrain the space ...of the corresponding clusters. The various clustering algorithms that are described in this paper use the vector-space model [26] to represent each...document. In this model, each document d is considered to be a vector in the term-space. In particular, we employed the tf-idf term weighting model
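The vector-space representation mentioned in this excerpt can be illustrated as follows. The weighting used here, term frequency times log(N / document frequency) followed by length normalization, is one common tf-idf variant chosen for the example; the excerpt does not state the exact formula used in the report, and the function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Map each tokenized document to a unit-length {term: weight} dict."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {t: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        # Leave an all-zero vector unnormalized to avoid dividing by zero.
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors

def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())
```

With unit-length vectors, cosine similarity reduces to a sparse dot product, which is the similarity measure document clustering algorithms typically operate on.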
Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters
Tellaroli, Paola; Bazzi, Marco; Donato, Michele; Brazzale, Alessandra R.; Drăghici, Sorin
2016-01-01
Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well established hierarchical clustering algorithms: Ward’s minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward’s and Complete-linkage. We show on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC to cluster samples in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository. PMID:27015427
Improved Ant Colony Clustering Algorithm and Its Performance Study
Gao, Wei
2016-01-01
Clustering analysis is used in many disciplines and applications; it is an important tool that descriptively identifies homogeneous groups of objects based on attribute values. The ant colony clustering algorithm is a swarm-intelligent method used for clustering problems that is inspired by the behavior of ant colonies that cluster their corpses and sort their larvae. A new abstraction ant colony clustering algorithm using a data combination mechanism is proposed to improve the computational efficiency and accuracy of the ant colony clustering algorithm. The abstraction ant colony clustering algorithm is used to cluster benchmark problems, and its performance is compared with the ant colony clustering algorithm and other methods used in existing literature. Based on similar computational difficulties and complexities, the results show that the abstraction ant colony clustering algorithm produces results that are not only more accurate but also more efficiently determined than the ant colony clustering algorithm and the other methods. Thus, the abstraction ant colony clustering algorithm can be used for efficient multivariate data clustering. PMID:26839533
Applying fuzzy clustering optimization algorithm to extracting traffic spatial pattern
NASA Astrophysics Data System (ADS)
Hu, Chunchun; Shi, Wenzhong; Meng, Lingkui; Liu, Min
2009-10-01
Traditional analytical methods for traffic information cannot meet the needs of intelligent traffic systems. Mining value-added information can address more traffic problems. The paper exploits a new clustering optimization algorithm to extract useful spatial clustered patterns for predicting long-term traffic flow from a macroscopic view. Because the FCM algorithm is sensitive to its initial parameters and easily falls into local extrema, the new algorithm applies the Particle Swarm Optimization (PSO) method, which can discover the global optimum, to the FCM algorithm. The algorithm uses the union of the clustering validity index and the objective function of the FCM algorithm as the fitness function of the PSO algorithm. The experimental results indicate that it is effective and efficient. For fuzzy clustering of road traffic data, it can produce useful spatial clustered patterns, and the cluster centers represent the locations which have heavy traffic flow. Moreover, the parameters of the patterns can provide intelligent traffic systems with decision support.
Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady
2017-02-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
Parallelization of Edge Detection Algorithm using MPI on Beowulf Cluster
NASA Astrophysics Data System (ADS)
Haron, Nazleeni; Amir, Ruzaini; Aziz, Izzatdin A.; Jung, Low Tan; Shukri, Siti Rohkmah
In this paper, we present the design of a parallel Sobel edge detection algorithm using Foster's methodology. The parallel algorithm is implemented using the MPI message passing library and a master/slave scheme. Every processor performs the same sequential algorithm but on a different part of the image. Experimental results obtained on a Beowulf cluster are presented to demonstrate the performance of the parallel algorithm.
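The data decomposition described here, the same sequential kernel applied to different parts of the image, can be sketched without MPI by simulating the workers in a loop. Everything below is an assumption for illustration: the row-strip partitioning, the |gx| + |gy| magnitude, zeroed border pixels, and the function names are not taken from the paper.

```python
def sobel_rows(image, r0, r1):
    """Sobel magnitude (|gx| + |gy|) for rows r0..r1-1; border pixels stay 0."""
    h, w = len(image), len(image[0])
    out = []
    for r in range(r0, r1):
        row = [0] * w
        if 0 < r < h - 1:
            for c in range(1, w - 1):
                gx = (image[r - 1][c + 1] + 2 * image[r][c + 1] + image[r + 1][c + 1]
                      - image[r - 1][c - 1] - 2 * image[r][c - 1] - image[r + 1][c - 1])
                gy = (image[r + 1][c - 1] + 2 * image[r + 1][c] + image[r + 1][c + 1]
                      - image[r - 1][c - 1] - 2 * image[r - 1][c] - image[r - 1][c + 1])
                row[c] = abs(gx) + abs(gy)
        out.append(row)
    return out

def sobel_partitioned(image, workers):
    """Split the rows into contiguous strips, one per simulated worker."""
    h = len(image)
    bounds = [h * k // workers for k in range(workers + 1)]
    result = []
    for r0, r1 in zip(bounds, bounds[1:]):
        # In a message-passing version, worker k would receive rows
        # r0-1..r1 (its strip plus one halo row on each side) and
        # return its strip of the output to the master.
        result.extend(sobel_rows(image, r0, r1))
    return result
```

Each output row depends only on the three input rows around it, so the strips are independent once every worker holds one extra halo row above and below its strip; the partitioned result is identical to the sequential one.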
A hybrid monkey search algorithm for clustering analysis.
Chen, Xin; Zhou, Yongquan; Luo, Qifang
2014-01-01
Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends strongly on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real-life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis.
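The dependence of k-means on its initial solution is easy to reproduce. The sketch below is plain Lloyd's algorithm started from explicit centers; it illustrates the weakness that motivates the paper, not the hybrid monkey algorithm itself, and the function name `kmeans` is an assumption for the example.

```python
from math import dist

def kmeans(points, centers, iters=100):
    """Lloyd's algorithm from explicit initial centers; returns (centers, sse)."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            k = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[k].append(p)
        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
               for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    sse = sum(min(dist(p, c) for c in centers) ** 2 for p in points)
    return centers, sse

# Four corners of a 4 x 1 rectangle, clustered with k = 2.
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]
good = kmeans(corners, [(0, 0), (4, 0)])   # converges to left/right pairs
bad = kmeans(corners, [(2, 0), (2, 1)])    # stuck on bottom/top pairs
```

Starting from the second pair of centers, the algorithm converges immediately to a partition whose sum of squared errors is 16 times that of the left/right split, which is the kind of local optimum global search heuristics aim to escape.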
The Georgi algorithms of jet clustering
NASA Astrophysics Data System (ADS)
Ge, Shao-Feng
2015-05-01
We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing a firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses, and the jet function is generalized to J_β^(n) with a jet function index n in order to achieve more degrees of freedom. We impose three basic requirements: that the result of jet clustering is process-independent and hence logically consistent; that the inclusion cone is larger for softer subjets, to conform with the fact that a parton shower tends to emit softer partons at an earlier stage with a larger opening angle; and that the cone size cannot be too large, in order to avoid mixing up neighboring jets. From these we derive constraints on the jet function parameter β and index n, which are closely related to the cone size cutoff. Finally, we discuss how jet function values can be made invariant under Lorentz boosts.
Clustering algorithm for determining community structure in large networks
NASA Astrophysics Data System (ADS)
Pujol, Josep M.; Béjar, Javier; Delgado, Jordi
2006-07-01
We propose an algorithm to find the community structure in complex networks based on the combination of spectral analysis and modularity optimization. The clustering produced by our algorithm is as accurate as the best algorithms in the literature on modularity optimization; however, the main asset of the algorithm is its efficiency. The best match for our algorithm is Newman’s fast algorithm, which is the reference algorithm for clustering in large networks due to its efficiency. When both algorithms are compared, our algorithm outperforms the fast algorithm both in efficiency and accuracy of the clustering, in terms of modularity. Thus, the results suggest that the proposed algorithm is a good choice to analyze the community structure of medium and large networks in the range of tens and hundreds of thousands of vertices.
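Modularity, the quality measure used in this comparison, can be computed directly from an edge list. The sketch below evaluates the standard Newman-Girvan modularity of a given partition of an undirected, unweighted graph; it does not implement the proposed algorithm, and the function name and input layout are assumptions for the example.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition of an undirected graph.

    `edges` is a list of (u, v) pairs; `community` maps vertex -> label.
    """
    m = len(edges)
    internal = defaultdict(int)   # edges with both ends in community c
    degree = defaultdict(int)     # total degree of community c
    for u, v in edges:
        degree[community[u]] += 1
        degree[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under random rewiring).
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)

# Two triangles joined by a single edge, split into the two triangles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
q = modularity(edges, {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"})
```

For this small graph the two-triangle partition gives Q = 5/14, while putting every vertex in one community gives Q = 0, so higher modularity tracks the intuitively better split.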
A Distributed Polygon Retrieval Algorithm Using MapReduce
NASA Astrophysics Data System (ADS)
Guo, Q.; Palanisamy, B.; Karimi, H. A.
2015-07-01
The burst of large-scale spatial terrain data due to the proliferation of data acquisition devices like 3D laser scanners poses challenges to spatial data analysis and computation. Among many spatial analyses and computations, polygon retrieval is a fundamental operation which is often performed under real-time constraints. However, existing sequential algorithms fail to meet this demand for larger sizes of terrain data. Motivated by the MapReduce programming model, a well-adopted large-scale parallel data processing technique, we present a MapReduce-based polygon retrieval algorithm designed with the objective of reducing the IO and CPU loads of spatial data processing. By indexing the data based on a quad-tree approach, a significant amount of unneeded data is filtered in the filtering stage and it reduces the IO overhead. The indexed data also facilitates querying the relationship between the terrain data and query area in shorter time. The results of the experiments performed in our Hadoop cluster demonstrate that our algorithm performs significantly better than the existing distributed algorithms.
Identifying multiple influential spreaders by a heuristic clustering algorithm
NASA Astrophysics Data System (ADS)
Bao, Zhong-Kui; Liu, Jian-Guo; Zhang, Hai-Feng
2017-03-01
The problem of influence maximization in social networks has attracted much attention. However, traditional centrality indices are suited to the case where a single spreader is chosen as the spreading source. Often, a spreading process is initiated by simultaneously choosing multiple nodes as the spreading sources. In this situation, choosing the top-ranked nodes as multiple spreaders is not an optimal strategy, since the chosen nodes are not sufficiently scattered in the network. Ideally, the multiple spreaders should be both influential and dispersed across the network, but it is difficult to meet the two conditions together. In this paper, we propose a heuristic clustering (HC) algorithm based on the similarity index to classify nodes into different clusters; the center nodes of the clusters are then chosen as the multiple spreaders. The HC algorithm not only ensures that the multiple spreaders are dispersed across the network but also avoids selecting nodes that are "negligible". Experimental results on synthetic and real networks indicate that the HC method outperforms traditional methods on influence maximization.
Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models
NASA Technical Reports Server (NTRS)
Mjolsness, Eric; Castano, Rebecca; Gray, Alexander
1999-01-01
We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.
Space complexity of estimation of distribution algorithms.
Gao, Yong; Culberson, Joseph
2005-01-01
In this paper, we investigate the space complexity of the Estimation of Distribution Algorithms (EDAs), a class of sampling-based variants of the genetic algorithm. By analyzing the nature of EDAs, we identify criteria that characterize the space complexity of two typical implementation schemes of EDAs, the factorized distribution algorithm and Bayesian network-based algorithms. Using random additive functions as the prototype, we prove that the space complexity of the factorized distribution algorithm and Bayesian network-based algorithms is exponential in the problem size even if the optimization problem has a very sparse interaction structure.
MM Algorithms for Some Discrete Multivariate Distributions.
Zhou, Hua; Lange, Kenneth
2010-09-01
The MM (minorization-maximization) principle is a versatile tool for constructing optimization algorithms. Every EM algorithm is an MM algorithm but not vice versa. This article derives MM algorithms for maximum likelihood estimation with discrete multivariate distributions such as the Dirichlet-multinomial and Connor-Mosimann distributions, the Neerchal-Morel distribution, the negative-multinomial distribution, certain distributions on partitions, and zero-truncated and zero-inflated distributions. These MM algorithms increase the likelihood at each iteration and reliably converge to the maximum from well-chosen initial values. Because they involve no matrix inversion, the algorithms are especially pertinent to high-dimensional problems. To illustrate the performance of the MM algorithms, we compare them to Newton's method on data used to classify handwritten digits.
A Comparative Study of Protein Sequence Clustering Algorithms
NASA Astrophysics Data System (ADS)
Eldin, A. Sharaf; Abdelgaber, S.; Soliman, T.; Kassim, S.; Abdo, A.
In this paper, we survey four clustering techniques and discuss their advantages and drawbacks. A review of eight different protein sequence clustering algorithms has been accomplished. Moreover, a comparison between the algorithms on the basis of some factors has been presented.
New SIMD Algorithms for Cluster Labeling on Parallel Computers
NASA Astrophysics Data System (ADS)
Apostolakis, John; Coddington, Paul; Marinari, Enzo
Cluster algorithms are non-local Monte Carlo update schemes which can greatly increase the efficiency of computer simulations of spin models of magnets. The major computational task in these algorithms is connected component labeling, to identify clusters of connected sites on a lattice. We have devised some new SIMD component labeling algorithms, and implemented them on the Connection Machine. We investigate their performance when applied to the cluster update of the two-dimensional Ising spin model. These algorithms could also be applied to other problems which use connected component labeling, such as percolation and image analysis.
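Connected component labeling, the task this abstract identifies as the computational bottleneck, is sketched below in its simplest sequential form using union-find. This is not one of the SIMD algorithms of the paper: the function name, the 4-connectivity rule, and the open (non-periodic) boundaries are assumptions for the example.

```python
def label_clusters(grid):
    """Label 4-connected clusters of truthy cells; returns a grid of ids (0 = empty)."""
    rows, cols = len(grid), len(grid[0])
    parent = list(range(rows * cols))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving keeps trees shallow
            i = parent[i]
        return i

    # Single raster scan: merge each occupied cell with its occupied
    # upper and left neighbours.
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c]:
                continue
            i = r * cols + c
            if r > 0 and grid[r - 1][c]:
                parent[find(i)] = find(i - cols)
            if c > 0 and grid[r][c - 1]:
                parent[find(i)] = find(i - 1)

    # Second pass: give each root a compact id in scan order.
    ids = {}
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c]:
                root = find(r * cols + c)
                out[r][c] = ids.setdefault(root, len(ids) + 1)
    return out
```

The raster scan is inherently sequential because labels propagate from earlier cells to later ones, which is why data-parallel machines need different formulations of the same problem.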
An algorithm on distributed mining association rules
NASA Astrophysics Data System (ADS)
Xu, Fan
2005-12-01
With the rapid development of the Internet/Intranet, distributed databases have become a broadly used environment in various areas. Mining association rules in distributed databases is a critical task. The algorithms for distributed mining of association rules can be divided into two classes: DD algorithms and CD algorithms. A DD algorithm focuses on data partition optimization so as to enhance efficiency. A CD algorithm, on the other hand, considers a setting where the data is arbitrarily partitioned horizontally among the parties to begin with, and focuses on parallelizing the communication. A DD algorithm is not always applicable, however, because the data is often already partitioned at the time it is generated. In many cases, it cannot be gathered and repartitioned for reasons of security and secrecy, transmission cost, or sheer efficiency. A CD algorithm may be a more appealing solution for systems which are naturally distributed over large expanses, such as stock exchange and credit card systems. An FDM algorithm provides an enhancement to the CD algorithm. However, the CD and FDM algorithms are both based on a net structure and execute on non-shareable resources. In practical applications, distributed databases are often star-structured. This paper proposes an algorithm based on star-structured networks, which are more practical to construct and apply and have lower maintenance costs. In addition, the algorithm provides high communication efficiency and good extensibility in parallel computation.
Clustering algorithms for Stokes space modulation format recognition.
Boada, Ricard; Borkowski, Robert; Monroy, Idelfonso Tafur
2015-06-15
Stokes space modulation format recognition (Stokes MFR) is a blind method enabling digital coherent receivers to infer modulation format information directly from a received polarization-division-multiplexed signal. A crucial part of the Stokes MFR is a clustering algorithm, which largely influences the performance of the detection process, particularly at low signal-to-noise ratios. This paper reports on an extensive study of six different clustering algorithms (k-means, expectation maximization, density-based DBSCAN and OPTICS, spectral clustering, and maximum likelihood clustering) used for discriminating between dual-polarization BPSK, QPSK, 8-PSK, 8-QAM, and 16-QAM. We determine essential performance metrics for each clustering algorithm and modulation format under test: minimum required signal-to-noise ratio, detection accuracy and algorithm complexity.
A highly efficient multi-core algorithm for clustering extremely large datasets
2010-01-01
Background: In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms rely largely on network communication protocols that connect, and therefore require, multiple computers. One answer to this problem is to utilize the intrinsic capabilities of current multi-core hardware to distribute the tasks among the different cores of one computer. Results: We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. The computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy compared to single-core implementations and a recently published network-based parallelization. Conclusions: Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer. PMID:20370922
A biased random-key genetic algorithm for data clustering.
Festa, P
2013-09-01
Cluster analysis aims at finding subsets (clusters) of a given set of entities, which are homogeneous and/or well separated. Starting from the 1990s, cluster analysis has been applied to several domains with numerous applications. It has emerged as one of the most exciting interdisciplinary fields, having benefited from concepts and theoretical results obtained by different scientific research communities, including genetics, biology, biochemistry, mathematics, and computer science. The last decade has brought several new algorithms, which are able to solve larger sized and real-world instances. We will give an overview of the main types of clustering and criteria for homogeneity or separation. Solution techniques are discussed, with special emphasis on the combinatorial optimization perspective, with the goal of providing conceptual insights and literature references to the broad community of clustering practitioners. A new biased random-key genetic algorithm is also described and compared with several efficient hybrid GRASP algorithms recently proposed to cluster biological data.
Distributed sensor data compression algorithm
NASA Astrophysics Data System (ADS)
Ambrose, Barry; Lin, Freddie
2006-04-01
Theoretically it is possible for two sensors to reliably send data at rates smaller than the sum of the necessary data rates for sending the data independently, essentially taking advantage of the correlation of sensor readings to reduce the data rate. In 2001, Caltech researchers Michelle Effros and Qian Zhao developed new techniques for data compression code design for correlated sensor data, which were published in a paper at the 2001 Data Compression Conference (DCC 2001). These techniques take advantage of correlations between two or more closely positioned sensors in a distributed sensor network. Given two signals, X and Y, the X signal is sent using standard data compression. The goal is to design a partition tree for the Y signal. The Y signal is sent using a code based on the partition tree. At the receiving end, if ambiguity arises when using the partition tree to decode the Y signal, the X signal is used to resolve the ambiguity. We have extended this work to increase the efficiency of the code search algorithms. Our results have shown that development of a highly integrated sensor network protocol that takes advantage of a correlation in sensor readings can result in 20-30% sensor data transport cost savings. In contrast, the best possible compression using state-of-the-art compression techniques that did not take into account the correlation of the incoming data signals achieved only 9-10% compression at most. This work was sponsored by MDA, but has very widespread applicability to ad hoc sensor networks, hyperspectral imaging sensors and vehicle health monitoring sensors for space applications.
A fuzzy clustering algorithm to detect planar and quadric shapes
NASA Technical Reports Server (NTRS)
Krishnapuram, Raghu; Frigui, Hichem; Nasraoui, Olfa
1992-01-01
In this paper, we introduce a new fuzzy clustering algorithm to detect an unknown number of planar and quadric shapes in noisy data. The proposed algorithm is computationally and implementationally simple, and it overcomes many of the drawbacks of the existing algorithms that have been proposed for similar tasks. Since the clustering is performed in the original image space, and since no features need to be computed, this approach is particularly suited for sparse data. The algorithm may also be used in pattern recognition applications.
CLAG: an unsupervised non hierarchical clustering algorithm handling biological data
2012-01-01
Background: Searching for similarities in a set of biological data is intrinsically difficult due to possible data points that should not be clustered, or that should group within several clusters. Under these hypotheses, hierarchical agglomerative clustering is not appropriate. Moreover, if the dataset is not well enough known, as is often the case, supervised classification is not appropriate either. Results: CLAG (for CLusters AGgregation) is an unsupervised non-hierarchical clustering algorithm designed to cluster a large variety of biological data and to provide a clustered matrix and numerical values indicating cluster strength. CLAG clusters correlation matrices for residues in protein families, gene-expression and miRNA data related to various cancer types, sets of species described by multidimensional vectors of characters, and binary matrices. It does not require all data points to cluster, and it converges, yielding the same result at each run. Its simplicity and speed allow it to run on reasonably large datasets. Conclusions: CLAG can be used to investigate the cluster structure present in biological datasets and to identify its underlying graph. It proved to be more informative and accurate than several known clustering methods, such as hierarchical agglomerative clustering, k-means, fuzzy c-means, model-based clustering, and affinity propagation clustering, and it does not suffer from the convergence problem of the latter. PMID:23216858
A robust fuzzy local information C-Means clustering algorithm.
Krinidis, Stelios; Chatzis, Vassilios
2010-05-01
This paper presents a variation of the fuzzy c-means (FCM) algorithm that provides image clustering. The proposed algorithm incorporates local spatial information and gray level information in a novel fuzzy way. The new algorithm is called fuzzy local information C-Means (FLICM). FLICM can overcome the disadvantages of the known fuzzy c-means algorithms and at the same time enhances the clustering performance. The major characteristic of FLICM is the use of a fuzzy local (both spatial and gray level) similarity measure, aiming to guarantee noise insensitiveness and image detail preservation. Furthermore, the proposed algorithm is fully free of the empirically adjusted parameters (a, λ_g, λ_s, etc.) incorporated into all other fuzzy c-means algorithms proposed in the literature. Experiments performed on synthetic and real-world images show that the FLICM algorithm is effective and efficient, providing robustness to noisy images.
Efficient Cluster Algorithm for Spin Glasses in Any Space Dimension
NASA Astrophysics Data System (ADS)
Zhu, Zheng; Ochoa, Andrew J.; Katzgraber, Helmut G.
2015-08-01
Spin systems with frustration and disorder are notoriously difficult to study, both analytically and numerically. While the simulation of ferromagnetic statistical mechanical models benefits greatly from cluster algorithms, these accelerated dynamics methods remain elusive for generic spin-glass-like systems. Here, we present a cluster algorithm for Ising spin glasses that works in any space dimension and speeds up thermalization by at least one order of magnitude at temperatures where thermalization is typically difficult. Our isoenergetic cluster moves are based on the Houdayer cluster algorithm for two-dimensional spin glasses and lead to a speedup over conventional state-of-the-art methods that increases with the system size. We illustrate the benefits of the isoenergetic cluster moves in two and three space dimensions, as well as the nonplanar chimera topology found in the D-Wave Inc. quantum annealing machine.
Adaptive link selection algorithms for distributed estimation
NASA Astrophysics Data System (ADS)
Xu, Songcen; de Lamare, Rodrigo C.; Poor, H. Vincent
2015-12-01
This paper presents adaptive link selection algorithms for distributed estimation and considers their application to wireless sensor networks and smart grids. In particular, exhaustive search-based least mean squares (LMS) / recursive least squares (RLS) link selection algorithms and sparsity-inspired LMS / RLS link selection algorithms that can exploit the topology of networks with poor-quality links are considered. The proposed link selection algorithms are then analyzed in terms of their stability, steady-state, and tracking performance and computational complexity. In comparison with the existing centralized or distributed estimation strategies, the key features of the proposed algorithms are as follows: (1) more accurate estimates and faster convergence speed can be obtained and (2) the network is equipped with the ability of link selection that can circumvent link failures and improve the estimation performance. The performance of the proposed algorithms for distributed estimation is illustrated via simulations in applications of wireless sensor networks and smart grids.
Mass Distribution in Galaxy Cluster Cores
NASA Astrophysics Data System (ADS)
Hogan, M. T.; McNamara, B. R.; Pulido, F.; Nulsen, P. E. J.; Russell, H. R.; Vantyghem, A. N.; Edge, A. C.; Main, R. A.
2017-03-01
Many processes within galaxy clusters, such as those believed to govern the onset of thermally unstable cooling and active galactic nucleus feedback, are dependent upon local dynamical timescales. However, accurate mapping of the mass distribution within individual clusters is challenging, particularly toward cluster centers where the total mass budget has substantial radially dependent contributions from the stellar (M_*), gas (M_gas), and dark matter (M_DM) components. In this paper we use a small sample of galaxy clusters with deep Chandra observations and good ancillary tracers of their gravitating mass at both large and small radii to develop a method for determining mass profiles that span a wide radial range and extend down into the central galaxy. We also consider potential observational pitfalls in understanding cooling in hot cluster atmospheres, and find tentative evidence for a relationship between the radial extent of cooling X-ray gas and nebular Hα emission in cool-core clusters. At large radii the entropy profiles of our clusters agree with the baseline power law of K ∝ r^1.1 expected from gravity alone. At smaller radii our entropy profiles become shallower but continue with a power law of the form K ∝ r^0.67 down to our resolution limit. Among this small sample of cool-core clusters we therefore find no support for the existence of a central flat “entropy floor.”
A Fast Implementation of the ISODATA Clustering Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2005-01-01
Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
Cui, Xiaohui; Potok, Thomas E
2006-01-01
The Flocking model, first proposed by Craig Reynolds, is one of the first bio-inspired computational collective behavior models and has many popular applications, such as animation. Our early research has resulted in a flock clustering algorithm that can achieve better performance than the K-means or the Ant clustering algorithms for data clustering. This algorithm generates a clustering of a given set of data through the embedding of the high-dimensional data items on a two-dimensional grid for efficient clustering result retrieval and visualization. In this paper, we propose a bio-inspired clustering model, the Multiple Species Flocking clustering model (MSF), and present a distributed multi-agent MSF approach for document clustering.
Implementing Agglomerative Hierarchic Clustering Algorithms for Use in Document Retrieval.
ERIC Educational Resources Information Center
Voorhees, Ellen M.
1986-01-01
Describes a computerized information retrieval system that uses three agglomerative hierarchic clustering algorithms--single link, complete link, and group average link--and explains their implementations. It is noted that these implementations have been used to cluster a collection of 12,000 documents. (LRW)
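The three linkage rules named in this abstract differ only in how the distance between two clusters is derived from pairwise item distances. The following is a minimal sketch of that idea, not the implementation described in the paper: the function names `linkage_distance` and `agglomerate`, the naive cubic-time merge loop, and the integer toy data are all assumptions made for illustration.

```python
from itertools import combinations


def linkage_distance(a, b, dist, method):
    """Distance between clusters a and b (lists of items) under one linkage rule."""
    pairs = [dist(x, y) for x in a for y in b]
    if method == "single":      # closest cross pair
        return min(pairs)
    if method == "complete":    # farthest cross pair
        return max(pairs)
    if method == "average":     # group average over all cross pairs
        return sum(pairs) / len(pairs)
    raise ValueError(f"unknown linkage: {method}")


def agglomerate(items, dist, method, n_clusters):
    """Naive agglomeration: merge the two closest clusters until n_clusters remain."""
    clusters = [[x] for x in items]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage_distance(
                clusters[ij[0]], clusters[ij[1]], dist, method
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def abs_diff(x, y):
    return abs(x - y)


# Hypothetical one-dimensional "documents"; 41 sits between a wide and a tight group.
points = [0, 20, 41, 63, 65]
single = agglomerate(points, abs_diff, "single", 2)
complete = agglomerate(points, abs_diff, "complete", 2)
average = agglomerate(points, abs_diff, "average", 2)
```

On this toy input, single link attaches 41 to the wide group on the left because its nearest member is closest, while complete link and group average attach it to the tight group on the right. That difference is why the choice of linkage can matter for retrieval.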
Learning factorizations in estimation of distribution algorithms using affinity propagation.
Santana, Roberto; Larrañaga, Pedro; Lozano, José A
2010-01-01
Estimation of distribution algorithms (EDAs) that use marginal product model factorizations have been widely applied to a broad range of mainly binary optimization problems. In this paper, we introduce the affinity propagation EDA (AffEDA) which learns a marginal product model by clustering a matrix of mutual information learned from the data using a very efficient message-passing algorithm known as affinity propagation. The introduced algorithm is tested on a set of binary and nonbinary decomposable functions and using a hard combinatorial class of problem known as the HP protein model. The results show that the algorithm is a very efficient alternative to other EDAs that use marginal product model factorizations such as the extended compact genetic algorithm (ECGA) and improves the quality of the results achieved by ECGA when the cardinality of the variables is increased.
Efficient Record Linkage Algorithms Using Complete Linkage Clustering
Mamun, Abdullah-Al; Aseltine, Robert; Rajasekaran, Sanguthevar
2016-01-01
Databases held by different agencies often contain records of the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. A large number of available algorithms for record linkage are prone to either time inefficiency or low accuracy in finding matches and non-matches among the records. In this paper we propose efficient as well as reliable sequential and parallel algorithms for the record linkage problem employing hierarchical clustering methods. We employ complete linkage hierarchical clustering algorithms to address this problem. In addition to hierarchical clustering, we also use two other techniques: elimination of duplicate records and blocking. Our algorithms use sorting as a sub-routine to identify identical copies of records. We have tested our algorithms on datasets with millions of synthetic records. Experimental results show that our algorithms achieve nearly 100% accuracy. Parallel implementations achieve almost linear speedups. Time complexities of these algorithms do not exceed those of previous best-known algorithms. Our proposed algorithms outperform previous best-known algorithms in terms of accuracy while consuming reasonable run times. PMID:27124604
Clustering of Hadronic Showers with a Structural Algorithm
Charles, M.J.; /SLAC
2005-12-13
The internal structure of hadronic showers can be resolved in a high-granularity calorimeter. This structure is described in terms of simple components and an algorithm for reconstruction of hadronic clusters using these components is presented. Results from applying this algorithm to simulated hadronic Z-pole events in the SiD concept are discussed.
Critical dynamics of cluster algorithms in the dilute Ising model
NASA Astrophysics Data System (ADS)
Hennecke, M.; Heyken, U.
1993-08-01
Autocorrelation times for thermodynamic quantities at T_C are calculated from Monte Carlo simulations of the site-diluted simple cubic Ising model, using the Swendsen-Wang and Wolff cluster algorithms. Our results show that for these algorithms the autocorrelation times decrease when reducing the concentration of magnetic sites from 100% down to 40%. This is of crucial importance when estimating static properties of the model, since the variances of these estimators increase with autocorrelation time. The dynamical critical exponents are calculated for both algorithms, observing pronounced finite-size effects in the energy autocorrelation data for the algorithm of Wolff. We conclude that, when applied to the dilute Ising model, cluster algorithms become even more effective than local algorithms, for which increasing autocorrelation times are expected.
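To make the Wolff single-cluster move concrete, here is a minimal sketch of one update on a site-diluted lattice. It is an illustration under stated assumptions, not the authors' code: it uses a two-dimensional periodic square lattice rather than the simple cubic lattice studied in the paper, and the function name `wolff_update`, the `occupied` mask, and the coupling `J` are hypothetical. The bond-activation probability 1 - exp(-2βJ) is the standard Wolff choice for the ferromagnetic Ising model.

```python
import math
import random


def wolff_update(spins, occupied, beta, rng, J=1.0):
    """One Wolff cluster flip on an L x L periodic lattice; returns the cluster size."""
    L = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta * J)   # probability of activating a bond
    sites = [(i, j) for i in range(L) for j in range(L) if occupied[i][j]]
    seed = rng.choice(sites)                  # start from a random magnetic site
    s0 = spins[seed[0]][seed[1]]
    cluster = {seed}
    stack = [seed]
    while stack:
        i, j = stack.pop()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            n = ((i + di) % L, (j + dj) % L)
            # Grow only through occupied, aligned neighbours not yet in the cluster.
            if (occupied[n[0]][n[1]] and n not in cluster
                    and spins[n[0]][n[1]] == s0 and rng.random() < p_add):
                cluster.add(n)
                stack.append(n)
    for i, j in cluster:                      # flip the whole cluster at once
        spins[i][j] = -s0
    return len(cluster)


rng = random.Random(0)
L = 4
spins = [[1] * L for _ in range(L)]
occupied = [[True] * L for _ in range(L)]
occupied[0][0] = False                        # one vacant (non-magnetic) site
size = wolff_update(spins, occupied, beta=50.0, rng=rng)
```

At this very low temperature every bond between aligned neighbours is activated, so the cluster covers all 15 occupied sites and the vacant site is left untouched. Near the critical temperature the clusters are smaller and of irregular shape.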
CCL: an algorithm for the efficient comparison of clusters
Hundt, R.; Schön, J. C.; Neelamraju, S.; Zagorac, J.; Jansen, M.
2013-01-01
The systematic comparison of the atomic structure of solids and clusters has become an important task in crystallography, chemistry, physics and materials science, in particular in the context of structure prediction and structure determination of nanomaterials. In this work, an efficient and robust algorithm for the comparison of cluster structures is presented, which is based on the mapping of the point patterns of the two clusters onto each other. This algorithm has been implemented as the module CCL in the structure visualization and analysis program KPLOT. PMID:23682193
Efficient cluster algorithm for CP(N-1) models
NASA Astrophysics Data System (ADS)
Beard, B. B.; Pepe, M.; Riederer, S.; Wiese, U.-J.
2006-11-01
Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z=0.
Measuring Constraint-Set Utility for Partitional Clustering Algorithms
NASA Technical Reports Server (NTRS)
Davidson, Ian; Wagstaff, Kiri L.; Basu, Sugato
2006-01-01
Clustering with constraints is an active area of machine learning and data mining research. Previous empirical work has convincingly shown that adding constraints to clustering improves the performance of a variety of algorithms. However, in most of these experiments, results are averaged over different randomly chosen constraint sets from a given set of labels, thereby masking interesting properties of individual sets. We demonstrate that constraint sets vary significantly in how useful they are for constrained clustering; some constraint sets can actually decrease algorithm performance. We create two quantitative measures, informativeness and coherence, that can be used to identify useful constraint sets. We show that these measures can also help explain differences in performance for four particular constrained clustering algorithms.
Digital News Graph Clustering using Chinese Whispers Algorithm
NASA Astrophysics Data System (ADS)
Pratama, M. F. E.; Kemas, R. S. W.; Anisa, H.
2017-01-01
With the exponential growth of news creation on the internet, the amount of digital news has reached the billions. Digital news items are naturally linked to each other, but they need to be grouped so that users can easily classify the news they read. A graph is the most suitable data model for representing digital news, since it can describe relations in an easy and flexible manner. Thus, to overcome the grouping problem, in this paper we use the Chinese Whispers algorithm as the graph clustering approach. We chose the Chinese Whispers algorithm because it is able to build clusters from large graph data with a relatively fast process [8], which suits the characteristics of digital news. In this research, we examine the graph quality by comparing the intra- and inter-cluster weights of every node. This scenario gives a fairly high result: 95% of nodes have an intra-cluster weight higher than their inter-cluster weight. We also investigate the graph accuracy by comparing the cluster results with expert judgement. As a result, the average accuracy of digital news graph clustering using the Chinese Whispers algorithm is 80%.
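The label-propagation idea behind Chinese Whispers can be sketched in a few lines. This is a simplified illustration, not the pipeline used in the paper: the function name `chinese_whispers`, the edge-list input format, the fixed iteration count, and the toy graph are assumptions. Ties are broken deterministically toward the smallest label here, whereas the usual formulation breaks them at random.

```python
import random
from collections import defaultdict


def chinese_whispers(edges, n_nodes, iterations=10, seed=0):
    """edges: iterable of (u, v, weight) for an undirected graph; returns one label per node."""
    neighbors = defaultdict(list)
    for u, v, w in edges:
        neighbors[u].append((v, w))
        neighbors[v].append((u, w))
    labels = list(range(n_nodes))            # every node starts in its own class
    rng = random.Random(seed)
    order = list(range(n_nodes))
    for _ in range(iterations):
        rng.shuffle(order)                   # visit nodes in random order
        for node in order:
            score = defaultdict(float)
            for nb, w in neighbors[node]:
                score[labels[nb]] += w       # sum edge weights per neighbouring class
            if score:
                # Strongest class wins; ties go to the smallest label (a simplification).
                labels[node] = min(score, key=lambda lab: (-score[lab], lab))
    return labels


# Hypothetical news graph: two tightly linked triples joined by one weak edge.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 0.1),
]
labels = chinese_whispers(edges, 6)
```

Each node ends up with the label that carries the most edge weight among its neighbours, so the weak bridge between nodes 2 and 3 never pulls a node across. The number of clusters is not fixed in advance; it emerges from the graph.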
A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering
ERIC Educational Resources Information Center
Chahine, Firas Safwan
2012-01-01
Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…
Sampling Within k-Means Algorithm to Cluster Large Datasets
Bejarano, Jeremy; Bose, Koushiki; Brannan, Tyler; Thomas, Anita; Adragni, Kofi; Neerchal, Nagaraj; Ostrouchov, George
2011-08-01
Due to current data collection technology, our ability to gather data has surpassed our ability to analyze it. In particular, k-means, one of the simplest and fastest clustering algorithms, is ill-equipped to handle extremely large datasets on even the most powerful machines. Our new algorithm uses a sample from a dataset to decrease runtime by reducing the amount of data analyzed. We perform a simulation study to compare our sampling-based k-means to the standard k-means algorithm by analyzing both the speed and accuracy of the two methods. Results show that our algorithm is significantly more efficient than the existing algorithm with comparable accuracy. Further work on this project might include a more comprehensive study both on more varied test datasets as well as on real weather datasets. This is especially important considering that this preliminary study was performed on rather tame datasets. Also, these studies should analyze the performance of the algorithm for varied values of k. Lastly, this paper showed that the algorithm was accurate for relatively low sample sizes. We would like to analyze this further to see how accurate the algorithm is for even lower sample sizes. We could find the lowest sample sizes, by manipulating width and confidence level, for which the algorithm would be acceptably accurate. In order for our algorithm to be a success, it needs to meet two benchmarks: match the accuracy of the standard k-means algorithm and significantly reduce runtime. Both goals are accomplished for all six datasets analyzed. However, on datasets of three and four dimensions, as the data becomes more difficult to cluster, both algorithms fail to obtain the correct classifications on some trials. Nevertheless, our algorithm consistently matches the performance of the standard algorithm while becoming remarkably more efficient with time. Therefore, we conclude that analysts can use our algorithm, expecting accurate results in considerably less time.
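The core idea, fitting the centers on a sample and then labelling the full dataset, can be sketched as follows. This is only an illustration under assumptions: the abstract does not say how the sample size is chosen or how k-means is initialised, so the function names (`nearest`, `kmeans`, `sampled_kmeans`), the fixed `sample_size` argument, the random initialisation, and the one-dimensional toy data are all hypothetical.

```python
import math
import random


def nearest(point, centers):
    """Index of the center closest to point."""
    return min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))


def kmeans(points, k, rng, max_iter=50):
    """Plain Lloyd iterations, initialised from k randomly chosen points."""
    centers = rng.sample(points, k)
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # New center = coordinate-wise mean; an empty group keeps its old center.
        new_centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[c]
            for c, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def sampled_kmeans(points, k, sample_size, seed=0):
    """Fit centers on a random sample, then label every point by its nearest center."""
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)
    centers = kmeans(sample, k, rng)
    return centers, [nearest(p, centers) for p in points]


# Hypothetical "tame" data: two well-separated groups of 1000 points each.
low = [(i / 1000,) for i in range(1000)]
high = [(9 + i / 1000,) for i in range(1000)]
data = low + high
centers, labels = sampled_kmeans(data, k=2, sample_size=200)
```

Only the 200 sampled points go through the iterative phase; the remaining points need a single nearest-center pass. On harder data the sample may miss small clusters, which is consistent with the caution in the abstract about less tame datasets.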
Characterising superclusters with the galaxy cluster distribution
NASA Astrophysics Data System (ADS)
Chon, Gayoung; Böhringer, Hans; Collins, Chris A.; Krause, Martin
2014-07-01
Superclusters are the largest observed matter density structures in the Universe. Recently, we presented the first supercluster catalogue constructed with a well-defined selection function based on the X-ray flux-limited cluster survey, REFLEX II. To construct the sample we proposed a concept to find large objects with a minimum overdensity such that it can be expected that most of their mass will collapse in the future. The main goal here is to support this concept by using simulations to show that we can, on the basis of our observational sample of X-ray clusters, construct a supercluster sample defined by a certain minimum overdensity. On this sample we also test how superclusters trace the underlying dark matter distribution. Our results confirm that an overdensity in the number of clusters is tightly correlated with an overdensity of the dark matter distribution. This enables us to define superclusters within which most of the mass will collapse in the future. We also obtain first-order mass estimates of superclusters on the basis of the properties of the member clusters. We also show that in this context the ratio of the cluster number density and dark matter mass density is consistent with the theoretically expected cluster bias. Our previous work provided evidence that superclusters are a special environment in which the density structures of the dark matter grow differently from those in the field, as characterised by the X-ray luminosity function. Here we confirm for the first time that this originates from a top-heavy mass function, at a high statistical significance given by a Kolmogorov-Smirnov test. We also find, in close agreement with observations, that the superclusters occupy only a small volume of a few per cent, but contain more than half of the clusters in the present-day Universe.
CACONET: Ant Colony Optimization (ACO) Based Clustering Algorithm for VANET
Aadil, Farhan; Bajwa, Khalid Bashir; Khan, Salabat; Chaudary, Nadeem Majeed; Akram, Adeel
2016-01-01
A vehicular ad hoc network (VANET) is a wirelessly connected network of vehicular nodes. A number of techniques, such as message ferrying, data aggregation, and vehicular node clustering aim to improve communication efficiency in VANETs. Cluster heads (CHs), selected in the process of clustering, manage inter-cluster and intra-cluster communication. The lifetime of clusters and number of CHs determines the efficiency of network. In this paper a Clustering algorithm based on Ant Colony Optimization (ACO) for VANETs (CACONET) is proposed. CACONET forms optimized clusters for robust communication. CACONET is compared empirically with state-of-the-art baseline techniques like Multi-Objective Particle Swarm Optimization (MOPSO) and Comprehensive Learning Particle Swarm Optimization (CLPSO). Experiments varying the grid size of the network, the transmission range of nodes, and number of nodes in the network were performed to evaluate the comparative effectiveness of these algorithms. For optimized clustering, the parameters considered are the transmission range, direction and speed of the nodes. The results indicate that CACONET significantly outperforms MOPSO and CLPSO. PMID:27149517
Exact and heuristic algorithms for weighted cluster editing.
Rahmann, Sven; Wittkop, Tobias; Baumbach, Jan; Martin, Marcel; Truss, Anke; Böcker, Sebastian
2007-01-01
Clustering objects according to given similarity or distance values is a ubiquitous problem in computational biology with diverse applications, e.g., in defining families of orthologous genes, or in the analysis of microarray experiments. While there exists a plenitude of methods, many of them produce clusterings that can be further improved. "Cleaning up" initial clusterings can be formalized as projecting a graph on the space of transitive graphs; it is also known as the cluster editing or cluster partitioning problem in the literature. In contrast to previous work on cluster editing, we allow arbitrary weights on the similarity graph. To solve the so-defined weighted transitive graph projection problem, we present (1) the first exact fixed-parameter algorithm, (2) a polynomial-time greedy algorithm that returns the optimal result on a well-defined subset of "close-to-transitive" graphs and works heuristically on other graphs, and (3) a fast heuristic that uses ideas similar to those from the Fruchterman-Reingold graph layout algorithm. We compare quality and running times of these algorithms on both artificial graphs and protein similarity graphs derived from the 66 organisms of the COG dataset.
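The objective of weighted cluster editing can be illustrated with a small cost function: a positive similarity between two objects placed in different clusters must be paid as a deletion, and a negative similarity between two objects placed in the same cluster must be paid as an insertion. The sketch below assumes that signed-weight convention and finds the optimum by brute-force enumeration of all partitions. That enumeration is a stand-in for illustration only, not the fixed-parameter, greedy, or layout-based algorithms of the paper, and the names `editing_cost`, `partitions`, and `best_partition` and the toy weights are hypothetical.

```python
def editing_cost(sim, clusters):
    """Cost of turning the similarity graph into the disjoint cliques given by clusters.

    sim maps frozenset({u, v}) to a signed weight: > 0 is an edge, < 0 a non-edge.
    """
    label = {v: c for c, members in enumerate(clusters) for v in members}
    cost = 0.0
    for pair, w in sim.items():
        u, v = tuple(pair)
        same = label[u] == label[v]
        if w > 0 and not same:       # existing edge between clusters: delete it
            cost += w
        elif w < 0 and same:         # missing edge inside a cluster: insert it
            cost += -w
    return cost


def partitions(items):
    """Yield every set partition of items (exponential; tiny inputs only)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        for i in range(len(part)):   # put `first` into an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        yield [[first]] + part       # or into a new block of its own


def best_partition(nodes, sim):
    """Exhaustive search for the cheapest transitive projection."""
    return min(partitions(nodes), key=lambda p: editing_cost(sim, p))


# Hypothetical similarity graph on four objects.
sim = {
    frozenset("ab"): 3, frozenset("bc"): 2, frozenset("ac"): -1,
    frozenset("cd"): 4, frozenset("bd"): -5, frozenset("ad"): -5,
}
best = best_partition(list("abcd"), sim)
best_cost = editing_cost(sim, best)
```

Here the cheapest solution separates {a, b} from {c, d} by deleting only the weight-2 edge between b and c. Exhaustive search is feasible only for a handful of objects, which is why exact parameterized and heuristic methods are needed in practice.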
A Community Detection Algorithm Based on Topology Potential and Spectral Clustering
Wang, Zhixiao; Chen, Zhaotong; Zhao, Ya; Chen, Shaoda
2014-01-01
Community detection is of great value for complex networks in understanding their inherent law and predicting their behavior. Spectral clustering algorithms have been successfully applied in community detection. Methods of this kind have two inadequacies: first, the input matrices they use cannot provide sufficient structural information for community detection; second, they cannot necessarily derive the proper community number from the ladder distribution of eigenvector elements. In order to solve these problems, this paper puts forward a novel community detection algorithm based on topology potential and spectral clustering. The new algorithm constructs the normalized Laplacian matrix with nodes' topology potential, which contains rich structural information of the network. In addition, the new algorithm can automatically get the optimal community number from the local maximum potential nodes. Experimental results showed that the new algorithm gave excellent performance on artificial networks and real-world networks and outperformed other community detection methods. PMID:25147846
Atmospheric Ion Clusters: Properties and Size Distributions
NASA Astrophysics Data System (ADS)
D'Auria, R.; Turco, R. P.
2002-12-01
Ions are continuously generated in the atmosphere by the action of galactic cosmic radiation. Measured charge concentrations are of the order of 10^3 cm^-3 throughout the troposphere, increasing to about 5 x 10^3 cm^-3 in the lower stratosphere [Cole and Pierce, 1965; Paltridge, 1965, 1966]. The lifetimes of these ions are sufficient to allow substantial clustering with common trace constituents in air, including water, nitric and sulfuric acids, ammonia, and a variety of organic compounds [e.g., D'Auria and Turco, 2001 and references cited therein]. The populations of the resulting charged molecular clusters represent a pre-nucleation phase of particle formation, and in this regard comprise a key segment of the over-all nucleation size spectrum [e.g., Castleman and Tang, 1972]. It has been suggested that these clusters may catalyze certain heterogeneous reactions, and given their characteristic crystal-like structures may act as freezing nuclei for supercooled droplets. To investigate these possibilities, basic information on cluster thermodynamic properties and chemical kinetics is needed. Here, we present new results for several relevant atmospheric ion cluster families. In particular, predictions based on quantum mechanical simulations of cluster structure, and related thermodynamic parameters, are compared against laboratory data. We also describe a hybrid approach for modeling cluster sequences that combines laboratory measurements and quantum predictions with the classical liquid droplet (Thomson) model to treat a wider range of cluster sizes. Calculations of cluster mass distributions based on this hybrid model are illustrated, and the advantages and limitations of such an analysis are summarized. References: Castleman, A. W., Jr., and I. N. Tang, Role of small clusters in nucleation about ions, J. Chem. Phys., 57, 3629-3638, 1972. Cole, R. K., and E. T. Pierce, Electrification in the Earth's atmosphere for altitudes between 0 and 100 kilometers, J
Cluster Recognition Algorithms for Battlefield Simulation.
1996-01-01
[Abstract not recoverable: the scanned text consists of figure residue. Surviving captions: Figure A-65, "Initial..."; Figure A-142, "Circular clustering from data set 45."]
Critical slowing down of cluster algorithms for Ising models coupled to 2-d gravity
NASA Astrophysics Data System (ADS)
Bowick, Mark; Falcioni, Marco; Harris, Geoffrey; Marinari, Enzo
1994-02-01
We simulate single and multiple Ising models coupled to 2-d gravity using both the Swendsen-Wang and Wolff algorithms to update the spins. We study the integrated autocorrelation time and find that there is considerable critical slowing down, particularly in the magnetization. We argue that this is primarily due to the local nature of the dynamical triangulation algorithm and to the generation of a distribution of baby universes which inhibits cluster growth.
Advanced algorithms for distributed fusion
NASA Astrophysics Data System (ADS)
Gelfand, A.; Smith, C.; Colony, M.; Bowman, C.; Pei, R.; Huynh, T.; Brown, C.
2008-03-01
The US Military has been undergoing a radical transition from a traditional "platform-centric" force to one capable of performing in a "Network-Centric" environment. This transformation will place all of the data needed to efficiently meet tactical and strategic goals at the warfighter's fingertips. With access to this information, the challenge of fusing data from across the battlespace into an operational picture for real-time Situational Awareness emerges. In such an environment, centralized fusion approaches will have limited application due to the constraints of real-time communications networks and computational resources. To overcome these limitations, we are developing a formalized architecture for fusion and track adjudication that allows the distribution of fusion processes over a dynamically created and managed information network. This network will support the incorporation and utilization of low level tracking information within the Army Distributed Common Ground System (DCGS-A) or Future Combat System (FCS). The framework is based on Bowman's Dual Node Network (DNN) architecture that utilizes a distributed network of interlaced fusion and track adjudication nodes to build and maintain a globally consistent picture across all assets.
Functional clustering algorithm for the analysis of dynamic network data
NASA Astrophysics Data System (ADS)
Feldt, S.; Waddell, J.; Hetrick, V. L.; Berke, J. D.; Żochowski, M.
2009-05-01
We formulate a technique for the detection of functional clusters in discrete event data. The advantage of this algorithm is that no prior knowledge of the number of functional groups is needed, as our procedure progressively combines data traces and derives the optimal clustering cutoff in a simple and intuitive manner through the use of surrogate data sets. In order to demonstrate the power of this algorithm to detect changes in network dynamics and connectivity, we apply it to both simulated neural spike train data and real neural data obtained from the mouse hippocampus during exploration and slow-wave sleep. Using the simulated data, we show that our algorithm performs better than existing methods. In the experimental data, we observe state-dependent clustering patterns consistent with known neurophysiological processes involved in memory consolidation.
Analysis of Network Clustering Algorithms and Cluster Quality Metrics at Scale
Kobourov, Stephen; Gallant, Mike; Börner, Katy
2016-01-01
Overview: Notions of community quality underlie the clustering of networks. While studies surrounding network clustering are increasingly common, a precise understanding of the relationship between different cluster quality metrics is lacking. In this paper, we examine the relationship between stand-alone cluster quality metrics and information recovery metrics through a rigorous analysis of four widely-used network clustering algorithms: Louvain, Infomap, label propagation, and smart local moving. We consider the stand-alone quality metrics of modularity, conductance, and coverage, and we consider the information recovery metrics of adjusted Rand score, normalized mutual information, and a variant of normalized mutual information used in previous work. Our study includes both synthetic graphs and empirical data sets of sizes varying from 1,000 to 1,000,000 nodes. Cluster Quality Metrics: We find significant differences among the results of the different cluster quality metrics. For example, clustering algorithms can return a value of 0.4 out of 1 on modularity but score 0 out of 1 on information recovery. We find conductance, though imperfect, to be the stand-alone quality metric that best indicates performance on the information recovery metrics. Additionally, our study shows that the variant of normalized mutual information used in previous work cannot be assumed to differ only slightly from traditional normalized mutual information. Network Clustering Algorithms: Smart local moving is the overall best performing algorithm in our study, but discrepancies between cluster evaluation metrics prevent us from declaring it an absolutely superior algorithm. Interestingly, Louvain performed better than Infomap in nearly all the tests in our study, contradicting the results of previous work in which Infomap was superior to Louvain. We find that although label propagation performs poorly when clusters are less clearly defined, it scales efficiently and accurately to large
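As a rough illustration of two of the stand-alone quality metrics named in this abstract, the sketch below computes modularity and conductance for a small undirected, unweighted graph. The function names, the edge-list input format, and the toy graph are assumptions made for this example; they are not taken from the paper, and the paper's own implementations may differ.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> label.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in one community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total_degree = defaultdict(int)  # summed degree per community
    for node, d in degree.items():
        total_degree[community[node]] += d
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)


def conductance(edges, cluster):
    """Cut size divided by the smaller of the two side volumes.

    Assumes both sides of the cut contain at least one edge endpoint.
    """
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for u, v in edges:
        u_in, v_in = u in cluster, v in cluster
        if u_in != v_in:
            cut += 1
        vol_in += u_in + v_in
        vol_out += (not u_in) + (not v_in)
    return cut / min(vol_in, vol_out)


# Toy graph (hypothetical): two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
q = modularity(edges, community)
phi = conductance(edges, [0, 1, 2])
```

Higher modularity and lower conductance both suggest well-separated clusters, which is why the two can disagree with information recovery metrics that compare against a known ground truth.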
Development of clustering algorithms for Compressed Baryonic Matter experiment
NASA Astrophysics Data System (ADS)
Kozlov, G. E.; Ivanov, V. V.; Lebedev, A. A.; Vassiliev, Yu. O.
2015-05-01
A clustering problem for the coordinate detectors in the Compressed Baryonic Matter (CBM) experiment is discussed. Because of the high interaction rate and huge datasets to be dealt with, clustering algorithms are required to be fast and efficient and capable of processing events with high track multiplicity. At present there are two different approaches to the problem. In the first one each fired pad bears information about its charge, while in the second one a pad can or cannot be fired, thus rendering the separation of overlapping clusters a difficult task. To deal with the latter, two different clustering algorithms were developed, integrated into the CBMROOT software environment, and tested with various types of simulated events. Both of them are found to be highly efficient and accurate.
Qin, Jiahu; Fu, Weiming; Gao, Huijun; Zheng, Wei Xing
2016-03-03
This paper is concerned with developing a distributed k-means algorithm and a distributed fuzzy c-means algorithm for wireless sensor networks (WSNs) where each node is equipped with sensors. The underlying topology of the WSN is supposed to be strongly connected. The consensus algorithm in multiagent consensus theory is utilized to exchange the measurement information of the sensors in WSN. To obtain a faster convergence speed as well as a higher possibility of having the global optimum, a distributed k-means++ algorithm is first proposed to find the initial centroids before executing the distributed k-means algorithm and the distributed fuzzy c-means algorithm. The proposed distributed k-means algorithm is capable of partitioning the data observed by the nodes into measure-dependent groups which have small in-group and large out-group distances, while the proposed distributed fuzzy c-means algorithm is capable of partitioning the data observed by the nodes into different measure-dependent groups with degrees of membership values ranging from 0 to 1. Simulation results show that the proposed distributed algorithms can achieve almost the same results as that given by the centralized clustering algorithms.
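The k-means++ seeding step mentioned in this abstract can be sketched in its ordinary centralized form as follows. This is only an illustration of the seeding rule (each new centroid is sampled with probability proportional to its squared distance from the nearest centroid chosen so far); it does not reproduce the distributed, consensus-based variant the paper proposes. The function name, the `seed` parameter, and the sample points are assumptions, and the sketch assumes the data contain at least `k` distinct points.

```python
import random


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centroids from `points` using k-means++ seeding."""
    rng = random.Random(seed)
    centroids = [rng.choice(points)]  # first centroid: uniform at random
    while len(centroids) < k:
        # Squared distance from every point to its nearest chosen centroid.
        d2 = [min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
              for p in points]
        # Sample the next centroid with probability proportional to d2.
        centroids.append(rng.choices(points, weights=d2, k=1)[0])
    return centroids


# Hypothetical sample data: two well-separated pairs of 2-D points.
points = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1)]
initial = kmeans_pp_init(points, 2)
```

Because an already chosen point has zero weight, it cannot be drawn again, so the initial centroids are distinct whenever the input points are.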
NCUBE - A clustering algorithm based on a discretized data space
NASA Technical Reports Server (NTRS)
Eigen, D. J.; Northouse, R. A.
1974-01-01
Cluster analysis involves the unsupervised grouping of data. The process provides an automatic procedure for generating known training samples for pattern classification. NCUBE, the clustering algorithm presented, is based upon the concept of imposing a gridwork on the data space. The NCUBE computer implementation of this concept provides an easily derived form of piecewise linear discrimination. This piecewise linear discrimination permits the separation of some types of data groups that are not linearly separable.
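The general idea of imposing a gridwork on the data space can be sketched as below: points are binned into cells, and occupied cells that touch are merged into one cluster. This is a generic grid-based clustering sketch under assumed conventions (uniform cell size, adjacency including diagonals); it is not the NCUBE algorithm itself, whose details are not given in the abstract.

```python
from collections import defaultdict
from itertools import product


def grid_cluster(points, cell_size):
    """Label points so that points in touching occupied cells share a label."""
    cells = defaultdict(list)  # cell index tuple -> indices of points in it
    for idx, p in enumerate(points):
        key = tuple(int(x // cell_size) for x in p)
        cells[key].append(idx)

    dim = len(points[0])
    # All neighbour offsets in `dim` dimensions, excluding the zero offset.
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    labels = {}
    next_label = 0
    for start in cells:
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # flood fill over adjacent occupied cells
            cell = stack.pop()
            for o in offsets:
                nb = tuple(c + d for c, d in zip(cell, o))
                if nb in cells and nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1

    result = [None] * len(points)
    for key, members in cells.items():
        for idx in members:
            result[idx] = labels[key]
    return result


# Hypothetical data: three nearby points and one distant point.
sample = [(0.1, 0.1), (0.9, 0.2), (1.1, 0.3), (5.0, 5.0)]
grid_labels = grid_cluster(sample, 1.0)
```

Since cluster boundaries follow cell edges, the resulting decision regions are piecewise linear, which matches the kind of discrimination the abstract describes.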
Parallel and Distributed Computing Combinatorial Algorithms
1993-10-01
PARALLEL AND DISTRIBUTED COMPUTING COMBINATORIAL ALGORITHMS. AUTHOR(S): DR. LEIGHTON. 2304/DS F49620-92-J-0125. PERFORMING ORGANIZATION NAME...on several problems involving parallel and distributed computing and combinatorial optimization. This research is reported in the numerous papers that...network decomposition. In Proceedings of the Eleventh Annual ACM Symposium on Principles of Distributed Computing, August 1992. [15] B. Awerbuch, B
Particle flow reconstruction based on the directed tree clustering algorithm
Chakraborty, D.; Lima, J. G. R.; McIntosh, R.; Zutshi, V.
2006-10-27
We present the status of particle flow algorithm development at Northern Illinois University. A key element in our approach is the calorimeter-based directed tree clustering algorithm. We have attempted to identify and tackle the essential challenges and analyze the effect of several different approaches to the reconstruction of jet energies and the Z-boson mass. A number of possibilities have been studied, such as analog vs. digital energy measurement, hit density-based clustering and the use of single or multiple energy thresholds. We plan to use this PFA-based reconstruction to compare some of the proposed detector technologies and geometries.
Salem, Sameh A; Salem, Nancy M; Nandi, Asoke K
2007-03-01
In this paper, segmentation of blood vessels from colour retinal images using a novel clustering algorithm with a partial supervision strategy is proposed. The proposed clustering algorithm, which is a RAdius based Clustering ALgorithm (RACAL), uses a distance based principle to map the distributions of the data by utilising the premise that clusters are determined by a distance parameter, without having to specify the number of clusters. Additionally, the proposed clustering algorithm is enhanced with a partial supervision strategy and it is demonstrated that it is able to segment blood vessels of small diameters and low contrasts. Results are compared with those from the KNN classifier and show that the proposed RACAL performs better than the KNN in case of abnormal images as it succeeds in segmenting small and low contrast blood vessels, while it achieves comparable results for normal images. For automation process, RACAL can be used as a classifier and results show that it performs better than the KNN classifier in both normal and abnormal images.
Generalized chemical distance distribution in all-sided critical percolation clusters
NASA Astrophysics Data System (ADS)
Katunin, Andrzej
2016-12-01
The algorithm of evaluation of chemical distance distribution in 2D and 3D critical percolation clusters is presented in the following study. The algorithm is enriched by numerous examples of 2D and 3D critical percolation clusters related to the currently investigated problem of electrical percolation in a mixture of conducting/dielectric polymers. The introduced measure of the chemical distance distribution can be a useful tool for characterization of percolation clusters, and in problems of percolation theory and graphs theory in general.
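The chemical distance between two sites of a percolation cluster is the length of the shortest path between them that stays inside the cluster. A minimal way to obtain its distribution on a 2D square lattice is a breadth-first search from a source site, sketched below. The grid encoding (1 = occupied), the 4-neighbour connectivity, and the function names are assumptions for illustration; the paper's own algorithm and its 3D and all-sided variants are not reproduced here.

```python
from collections import Counter, deque


def chemical_distances(grid, source):
    """Breadth-first search over occupied sites; returns site -> distance."""
    rows, cols = len(grid), len(grid[0])
    dist = {source: 0}
    queue = deque([source])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] and (nr, nc) not in dist:
                dist[(nr, nc)] = dist[(r, c)] + 1
                queue.append((nr, nc))
    return dist


def distance_histogram(dist):
    """Number of cluster sites at each chemical distance from the source."""
    return Counter(dist.values())


# Hypothetical 3x3 lattice with one snake-shaped cluster.
lattice = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
]
site_dist = chemical_distances(lattice, (0, 0))
histogram = distance_histogram(site_dist)
```

In this toy lattice the site (2, 2) lies at Euclidean distance of about 2.83 from the source but at chemical distance 4, which is the kind of gap the measure is meant to capture.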
The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey
Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer, Hans; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U., ICG /North Carolina U. /Chicago U., Astron. Astrophys. Ctr. /Chicago U., EFI /Michigan U. /Fermilab /Princeton U. Observ. /Garching, Max Planck Inst., MPE /Pittsburgh U. /Tokyo U., ICRR /Baltimore, Space Telescope Sci. /Penn State U. /Chicago U. /Stavropol, Astrophys. Observ. /Heidelberg, Max Planck Inst. Astron. /INI, SAO
2005-03-01
We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers ~2600 square degrees of sky and ranges in redshift from z = 0.02 to z = 0.17. The mean cluster membership is 36 galaxies (with redshifts) brighter than r = 17.7, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity (L_r), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the ΛCDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is ≈90% complete and 95% pure above M_200 = 1 × 10^14 h^-1 M_⊙ and within 0.03 ≤ z ≤ 0.12. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within 0.03 ≤ z ≤ 0.12. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the L_r of a cluster is a more robust estimator of the halo mass (M_200) than the galaxy line-of-sight velocity dispersion or the richness of the cluster. However, if we
2013-01-01
Index (Hubert & Arabie 1985) Range: [-1, 1] Similarity 2 FMI Fowlkes-Mallows Index (Fowlkes & Mallows 1983) Range: [0, 1] Similarity 3...distance between hand-crafted and algorithmic clusterings Clustering ARI FMI JC JMS MM NMI. RI VDM VI Mean-30k 0.6170 0.6628 0.4466 0.7208 29614 0.9579...0.7472 36312 0.9574 0.9985 1200 0.8556 Table 5: Similarity or distance between hand-crafted and random clusterings Clustering ARI FMI JC JMS MM NMI RI
Clustered Self Organising Migrating Algorithm for the Quadratic Assignment Problem
NASA Astrophysics Data System (ADS)
Davendra, Donald; Zelinka, Ivan; Senkerik, Roman
2009-08-01
An approach of population dynamics and clustering for permutative problems is presented in this paper. Diversity indicators are created from solution ordering and its mapping is shown as an advantage for population control in metaheuristics. Self Organising Migrating Algorithm (SOMA) is modified using this approach and vetted with the Quadratic Assignment Problem (QAP). Extensive experimentation is conducted on benchmark problems in this area.
Analysis and Implementation of Graph Clustering for Digital News Using Star Clustering Algorithm
NASA Astrophysics Data System (ADS)
Ahdi, A. B.; SW, K. R.; Herdiani, A.
2017-01-01
Since the notion of Web 2.0 emerged and came into extensive use by many Internet services, we have seen an unprecedented proliferation of digital news. Digital news is very rich in content and in links to other news and sources, but it lacks category information. As a result, users cannot easily identify or group the news they read into sets. Digital news is naturally linked data, because every news item has a relation or connection to other news items or resources. The most appropriate model for linked data is the graph model, which suits this purpose because of its flexibility in describing relations and its easy-to-understand visualization. To handle the grouping issue, we use a graph clustering approach. Many graph clustering algorithms are available, such as MST Clustering, Chameleon, Markov Clustering and Star Clustering. From these options, we choose Star Clustering because the algorithm is easier to understand, more accurate, efficient, and guarantees the quality of the cluster results. In this research, we investigate the accuracy of the cluster results by comparing them with expert judgement. We obtained a fairly high accuracy level of 80.98%, and a promising cluster quality result of 62.87%.
NASA Technical Reports Server (NTRS)
Lennington, R. K.; Johnson, J. K.
1979-01-01
An efficient procedure which clusters data using a completely unsupervised clustering algorithm and then uses labeled pixels to label the resulting clusters or perform a stratified estimate using the clusters as strata is developed. Three clustering algorithms, CLASSY, AMOEBA, and ISOCLS, are compared for efficiency. Three stratified estimation schemes and three labeling schemes are also considered and compared.
Large Data Visualization on Distributed Memory Multi-GPU Clusters
Childs, Henry R.
2010-03-01
Data sets of immense size are regularly generated on large scale computing resources. Even among more traditional methods for acquisition of volume data, such as MRI and CT scanners, data which is too large to be effectively visualized on standard workstations is now commonplace. One solution to this problem is to employ a 'visualization cluster,' a small to medium scale cluster dedicated to performing visualization and analysis of massive data sets generated on larger scale supercomputers. These clusters are designed to fit a different need than traditional supercomputers, and therefore their design mandates different hardware choices, such as increased memory, and more recently, graphics processing units (GPUs). While there has been much previous work on distributed memory visualization as well as GPU visualization, there is a relative dearth of algorithms which effectively use GPUs at a large scale in a distributed memory environment. In this work, we study a common visualization technique in a GPU-accelerated, distributed memory setting, and present performance characteristics when scaling to extremely large data sets.
GEANT4 distributed computing for compact clusters
NASA Astrophysics Data System (ADS)
Harrawood, Brian P.; Agasthya, Greeshma A.; Lakshmanan, Manu N.; Raterman, Gretchen; Kapadia, Anuj J.
2014-11-01
A new technique for distribution of GEANT4 processes is introduced to simplify running a simulation in a parallel environment such as a tightly coupled computer cluster. Using a new C++ class derived from the GEANT4 toolkit, multiple runs forming a single simulation are managed across a local network of computers with a simple inter-node communication protocol. The class is integrated with the GEANT4 toolkit and is designed to scale from a single symmetric multiprocessing (SMP) machine to compact clusters ranging in size from tens to thousands of nodes. User designed 'work tickets' are distributed to clients using a client-server work flow model to specify the parameters for each individual run of the simulation. The new g4DistributedRunManager class was developed and well tested in the course of our Neutron Stimulated Emission Computed Tomography (NSECT) experiments. It will be useful for anyone running GEANT4 for large discrete data sets such as covering a range of angles in computed tomography, calculating dose delivery with multiple fractions or simply speeding the through-put of a single model.
NASA Astrophysics Data System (ADS)
Ball, R. C.; Lee, J. R.
1996-03-01
We prove that a new, irreversible growth algorithm, Non-Deletion Reaction-Limited Cluster-cluster Aggregation (NDRLCA), produces equilibrium Branched Polymers, expected to exhibit Lattice Animal statistics [1]. We implement NDRLCA, off-lattice, as a computer simulation for embedding dimension d=2 and 3, obtaining values for critical exponents, fractal dimension D and cluster mass distribution exponent tau: d=2, D ≈ 1.53 ± 0.05, tau = 1.09 ± 0.06; d=3, D = 1.96 ± 0.04, tau = 1.50 ± 0.04, in good agreement with theoretical LA values. The simulation results do not support recent suggestions [2] that BPs may be in the same universality class as percolation. We also obtain values for a model-dependent critical “fugacity”, z_c, and investigate the finite-size effects of our simulation, quantifying notions of “inbreeding” that occur in this algorithm. Finally we use an extension of the NDRLCA proof to show that standard Reaction-Limited Cluster-cluster Aggregation is very unlikely to be in the same universality class as Branched Polymers/Lattice Animals unless the backbone dimension for the latter is considerably less than the published value.
A Likelihood-Based SLIC Superpixel Algorithm for SAR Images Using Generalized Gamma Distribution
Zou, Huanxin; Qin, Xianxiang; Zhou, Shilin; Ji, Kefeng
2016-01-01
The simple linear iterative clustering (SLIC) method is a recently proposed popular superpixel algorithm. However, this method may generate bad superpixels for synthetic aperture radar (SAR) images due to effects of speckle and the large dynamic range of pixel intensity. In this paper, an improved SLIC algorithm for SAR images is proposed. This algorithm exploits the likelihood information of SAR image pixel clusters. Specifically, a local clustering scheme combining intensity similarity with spatial proximity is proposed. Additionally, for post-processing, a local edge-evolving scheme that combines spatial context and likelihood information is introduced as an alternative to the connected components algorithm. To estimate the likelihood information of SAR image clusters, we incorporated a generalized gamma distribution (GΓD). Finally, the superiority of the proposed algorithm was validated using both simulated and real-world SAR images. PMID:27438840
The Geometric Cluster Algorithm: Rejection-Free Monte Carlo Simulation of Complex Fluids
NASA Astrophysics Data System (ADS)
Luijten, Erik
2005-03-01
The study of complex fluids is an area of intense research activity, in which exciting and counter-intuitive behavior continues to be uncovered. Ironically, one of the very factors responsible for such interesting properties, namely the presence of multiple relevant time and length scales, often greatly complicates accurate theoretical calculations and computer simulations that could explain the observations. We have recently developed a new Monte Carlo simulation method (J. Liu and E. Luijten, Phys. Rev. Lett. 92, 035504 (2004); see also Physics Today, March 2004, pp. 25-27) that overcomes this problem for several classes of complex fluids. Our approach can accelerate simulations by orders of magnitude by introducing nonlocal, collective moves of the constituents. Strikingly, these cluster Monte Carlo moves are proposed in such a manner that the algorithm is rejection-free. The identification of the clusters is based upon geometric symmetries and can be considered as the off-lattice generalization of the widely-used Swendsen-Wang and Wolff algorithms for lattice spin models. While phrased originally for complex fluids that are governed by the Boltzmann distribution, the geometric cluster algorithm can be used to efficiently sample configurations from an arbitrary underlying distribution function and may thus be applied in a variety of other areas. In addition, I will briefly discuss various extensions of the original algorithm, including methods to influence the size of the clusters that are generated and ways to introduce density fluctuations.
Estimation of distribution algorithms with Kikuchi approximations.
Santana, Roberto
2005-01-01
The question of finding feasible ways for estimating probability distributions is one of the main challenges for Estimation of Distribution Algorithms (EDAs). To estimate the distribution of the selected solutions, EDAs use factorizations constructed according to graphical models. The class of factorizations that can be obtained from these probability models is highly constrained. Expanding the class of factorizations that could be employed for probability approximation is a necessary step for the conception of more robust EDAs. In this paper we introduce a method for learning a more general class of probability factorizations. The method combines a reformulation of a probability approximation procedure known in statistical physics as the Kikuchi approximation of energy, with a novel approach for finding graph decompositions. We present the Markov Network Estimation of Distribution Algorithm (MN-EDA), an EDA that uses Kikuchi approximations to estimate the distribution, and Gibbs Sampling (GS) to generate new points. A systematic empirical evaluation of MN-EDA is done in comparison with different Bayesian network based EDAs. From our experiments we conclude that the algorithm can outperform other EDAs that use traditional methods of probability approximation in the optimization of functions with strong interactions among their variables.
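The basic loop shared by estimation of distribution algorithms (sample a population from a probability model, select the best solutions, re-estimate the model from them) can be sketched with the simplest possible model, a product of independent per-bit probabilities. This univariate sketch is deliberately not MN-EDA: it uses no Kikuchi approximation and no Gibbs sampling, and the OneMax objective, parameter values, and probability clamping are assumptions chosen only to keep the example short.

```python
import random


def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA maximizing the number of 1-bits (OneMax)."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # model: independent probability of a 1 per bit
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # 2. Select the fittest individuals (fitness = number of ones).
        pop.sort(key=sum, reverse=True)
        selected = pop[:n_select]
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
        # 3. Re-estimate the model from the selected set, clamped so that
        #    no bit value becomes impossible to sample.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in selected) / n_select))
                 for i in range(n_bits)]
    return best, probs


best_solution, final_probs = umda_onemax()
```

A univariate model like this cannot represent interactions among variables, which is the limitation that richer factorizations such as those discussed in the abstract aim to remove.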
A new clustering algorithm for scanning electron microscope images
NASA Astrophysics Data System (ADS)
Yousef, Amr; Duraisamy, Prakash; Karim, Mohammad
2016-04-01
A scanning electron microscope (SEM) is a type of electron microscope that produces images of a sample by scanning it with a focused beam of electrons. The electrons interact with the sample atoms, producing various signals that are collected by detectors. The gathered signals contain information about the sample's surface topography and composition. The electron beam is generally scanned in a raster scan pattern, and the beam's position is combined with the detected signal to produce an image. The most common configuration for an SEM produces a single value per pixel, with the results usually rendered as grayscale images. The captured images may be produced with insufficient brightness, anomalous contrast, jagged edges, and poor quality due to low signal-to-noise ratio, grained topography and poor surface details. The segmentation of SEM images is a challenging problem in the presence of the previously mentioned distortions. In this paper, we focus on the clustering of this type of image. In that sense, we evaluate the performance of well-known unsupervised clustering and classification techniques such as connectivity-based clustering (hierarchical clustering), centroid-based clustering, distribution-based clustering and density-based clustering. Furthermore, we propose a new spatial fuzzy clustering technique that works efficiently on this type of image and compare its results against these regular techniques in terms of clustering validation metrics.
Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering
Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Background: Gravitation field algorithm (GFA) is a new optimization algorithm based on an imitation of natural phenomena. GFA performs well in searching for both the global minimum and multiple minima in computational biology. However, GFA needs to be improved for efficiency, and modified for application to some discrete data problems in systems biology. Method: An improved GFA called IGFA is proposed in this paper. Two parts were improved in IGFA. The first is the rule of random division, which is a reasonable strategy and shortens the running time. The other is the rotation factor, which can improve the accuracy of IGFA. To apply IGFA to hierarchical clustering, the initial part and the movement operator were modified. Results: Two kinds of experiments were used to test IGFA, and IGFA was applied to hierarchical clustering. The global minimum experiment was run with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). The multi-minima experiment was run with IGFA and GFA. The results of the two experiments were compared with each other and demonstrated the efficiency of IGFA: IGFA is better than GFA in both accuracy and running time. For hierarchical clustering, IGFA is used to optimize the smallest distance of gene pairs, and the results were compared with GA and SA, single-linkage clustering, and UPGMA. The efficiency of IGFA is demonstrated. PMID:23173043
A comparison of clustering algorithms in article recommendation system
NASA Astrophysics Data System (ADS)
Tantanasiriwong, Supaporn
2011-12-01
A recommendation system is a tool that can recommend to researchers resources suited to their research interests by using content-based filtering. In this paper, clustering is introduced as an unsupervised learning method for grouping objects based on their selected features and similarities. Publication information from the Science Citation Index (SCI) is used as the dataset for clustering, with feature extraction in the sense of dimensionality reduction of these articles performed by comparing Latent Dirichlet Allocation (LDA), Principal Component Analysis (PCA), and K-Means to determine the best algorithm. In the experiment, the selected database consists of 2625 documents extracted from the SCI corpus from 2001 to 2009. Clustering into ranks of 50, 100, 200, and 250 is considered, and the F-measure is used to evaluate the three algorithms. The results show that the LDA technique gives an accuracy of up to 95.5%, the highest among the clustering techniques compared.
LYDIAN: An Extensible Educational Animation Environment for Distributed Algorithms
ERIC Educational Resources Information Center
Koldehofe, Boris; Papatriantafilou, Marina; Tsigas, Philippas
2006-01-01
LYDIAN is an environment to support the teaching and learning of distributed algorithms. It provides a collection of distributed algorithms as well as continuous animations. Users can combine algorithms and animations with arbitrary network structures defining the interconnection and behavior of the distributed algorithm. Further, it facilitates…
BMI optimization by using parallel UNDX real-coded genetic algorithm with Beowulf cluster
NASA Astrophysics Data System (ADS)
Handa, Masaya; Kawanishi, Michihiro; Kanki, Hiroshi
2007-12-01
This paper deals with a global optimization algorithm for Bilinear Matrix Inequalities (BMIs) based on the Unimodal Normal Distribution Crossover (UNDX) GA. First, by analyzing the structure of the BMIs, the existence of typical difficult structures is confirmed. Then, to improve the performance of the algorithm, based on the results of the problem structure analysis and consideration of the characteristic properties of BMIs, we propose an algorithm that uses a primary search direction with relaxed Linear Matrix Inequality (LMI) convex estimation. Moreover, for these algorithms, we propose two types of evaluation methods for GA individuals based on LMI calculation that take the characteristic properties of BMIs further into account. In addition, to reduce computational time, we propose a parallelization of the RCGA algorithm using the Master-Worker paradigm with cluster computing.
Mapping cultivable land from satellite imagery with clustering algorithms
NASA Astrophysics Data System (ADS)
Arango, R. B.; Campos, A. M.; Combarro, E. F.; Canas, E. R.; Díaz, I.
2016-07-01
Open data satellite imagery provides valuable data for the planning and decision-making processes related with environmental domains. Specifically, agriculture uses remote sensing in a wide range of services, ranging from monitoring the health of the crops to forecasting the spread of crop diseases. In particular, this paper focuses on a methodology for the automatic delimitation of cultivable land by means of machine learning algorithms and satellite data. The method uses a partition clustering algorithm called Partitioning Around Medoids and considers the quality of the clusters obtained for each satellite band in order to evaluate which one better identifies cultivable land. The proposed method was tested with vineyards using as input the spectral and thermal bands of the Landsat 8 satellite. The experimental results show the great potential of this method for cultivable land monitoring from remote-sensed multispectral imagery.
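The swap-based search at the heart of Partitioning Around Medoids can be sketched as below: starting from some set of medoids, a medoid is exchanged for a non-medoid whenever the exchange lowers the total distance of all points to their nearest medoid. This is a simplified sketch and not the paper's pipeline: the usual BUILD initialization is replaced by the naive assumption of taking the first `k` points, swaps are accepted greedily, and the 1-D sample values stand in for per-band pixel features purely for illustration.

```python
def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Greedy swap-based k-medoids on a precomputed distance matrix."""
    n = len(dist)
    medoids = list(range(k))  # naive start (assumption; BUILD step omitted)
    cost = total_cost(dist, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing the medoid at `pos` with candidate `cand`.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(dist, trial)
                if trial_cost < cost:
                    medoids, cost = trial, trial_cost
                    improved = True
    # Assign every point to its nearest medoid.
    labels = [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]
    return medoids, labels, cost


# Hypothetical 1-D feature values forming two obvious groups.
values = [0, 1, 2, 10, 11, 12]
dist_matrix = [[abs(a - b) for b in values] for a in values]
medoid_idx, assignment, final_cost = pam(dist_matrix, 2)
```

Because medoids are actual data points and only a distance matrix is needed, the same routine works with any dissimilarity measure, which is one reason PAM is often preferred over k-means for noisy imagery features.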
Synchronous Firefly Algorithm for Cluster Head Selection in WSN
Baskaran, Madhusudhanan; Sadagopan, Chitra
2015-01-01
Wireless Sensor Network (WSN) consists of small low-cost, low-power multifunctional nodes interconnected to efficiently aggregate and transmit data to sink. Cluster-based approaches use some nodes as Cluster Heads (CHs) and organize WSNs efficiently for aggregation of data and energy saving. A CH conveys information gathered by cluster nodes and aggregates/compresses data before transmitting it to a sink. However, this additional responsibility of the node results in a higher energy drain leading to uneven network degradation. Low Energy Adaptive Clustering Hierarchy (LEACH) offsets this by probabilistically rotating cluster heads role among nodes with energy above a set threshold. CH selection in WSN is NP-Hard as optimal data aggregation with efficient energy savings cannot be solved in polynomial time. In this work, a modified firefly heuristic, synchronous firefly algorithm, is proposed to improve the network performance. Extensive simulation shows the proposed technique to perform well compared to LEACH and energy-efficient hierarchical clustering. Simulations show the effectiveness of the proposed method in decreasing the packet loss ratio by an average of 9.63% and improving the energy efficiency of the network when compared to LEACH and EEHC. PMID:26495431
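The probabilistic rotation of the cluster-head role in LEACH is commonly described with a per-round threshold T = p / (1 - p (r mod 1/p)), where p is the desired fraction of cluster heads and r is the round number; a node that is still eligible draws a uniform random number and becomes a cluster head if the draw falls below T. The sketch below illustrates that commonly cited rule only. It is not the synchronous firefly method proposed in the paper, and the function names, the `eligible` set, and the parameter values are assumptions for this example.

```python
import random


def leach_threshold(p, round_no):
    """Election threshold for eligible nodes in the given round."""
    period = round(1 / p)  # rounds per rotation cycle
    return p / (1 - p * (round_no % period))


def elect_cluster_heads(node_ids, eligible, p, round_no, rng):
    """Return the nodes that elect themselves cluster head this round.

    `eligible` holds nodes that have not yet served in the current cycle.
    """
    threshold = leach_threshold(p, round_no)
    return [n for n in node_ids if n in eligible and rng.random() < threshold]


rng = random.Random(0)
nodes = list(range(10))
still_eligible = {2, 5, 7}  # hypothetical: the rest already served this cycle

t_first = leach_threshold(0.1, 0)  # start of a cycle
t_last = leach_threshold(0.1, 9)   # final round of the cycle
heads_last_round = elect_cluster_heads(nodes, still_eligible, 0.1, 9, rng)
```

The threshold rises over the cycle and reaches 1 in its last round, so every node that has not yet served is forced to take a turn, which spreads the extra energy drain across the network.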
Pruning Neural Networks with Distribution Estimation Algorithms
Cantu-Paz, E
2003-01-15
This paper describes the application of four evolutionary algorithms to the pruning of neural networks used in classification problems. Besides a simple genetic algorithm (GA), the paper considers three distribution estimation algorithms (DEAs): a compact GA, an extended compact GA, and the Bayesian Optimization Algorithm. The objective is to determine if the DEAs present advantages over the simple GA in terms of accuracy or speed in this problem. The experiments used a feed forward neural network trained with standard back propagation and public-domain and artificial data sets. The pruned networks seemed to have accuracy better than or equal to that of the original fully-connected networks. Only in a few cases did pruning result in less accurate networks. We found few differences in the accuracy of the networks pruned by the four EAs, but found important differences in the execution time. The results suggest that a simple GA with a small population might be the best algorithm for pruning networks on the data sets we tested.
Advanced defect detection algorithm using clustering in ultrasonic NDE
NASA Astrophysics Data System (ADS)
Gongzhang, Rui; Gachagan, Anthony
2016-02-01
A range of materials used in industry exhibit scattering properties which limit ultrasonic NDE. Many algorithms have been proposed to enhance defect detection ability, such as the well-known Split Spectrum Processing (SSP) technique. Scattering noise usually cannot be fully removed and the remaining noise can be easily confused with real feature signals, hence becoming artefacts during the image interpretation stage. This paper presents an advanced algorithm to further reduce the influence of artefacts remaining in A-scan data after processing using a conventional defect detection algorithm. The raw A-scan data can be acquired from either traditional single transducer or phased array configurations. The proposed algorithm uses the concept of unsupervised machine learning to cluster segmental defect signals from pre-processed A-scans into different classes. The distinction and similarity between each class and the ensemble of randomly selected noise segments can be observed by applying a classification algorithm. Each class will then be labelled as 'legitimate reflector' or 'artefacts' based on this observation, and the expected probability of detection (PoD) and probability of false alarm (PFA) determined. To facilitate data collection and validate the proposed algorithm, a 5 MHz linear array transducer is used to collect A-scans from both austenitic steel and Inconel samples. Each pulse-echo A-scan is pre-processed using SSP and the subsequent application of the proposed clustering algorithm has provided an additional reduction to PFA while maintaining PoD for both samples compared with SSP results alone.
ICANP2: Isoenergetic cluster algorithm for NP-complete Problems
NASA Astrophysics Data System (ADS)
Zhu, Zheng; Fang, Chao; Katzgraber, Helmut G.
NP-complete optimization problems with Boolean variables are of fundamental importance in computer science, mathematics and physics. Most notably, the minimization of general spin-glass-like Hamiltonians remains a difficult numerical task. There has been a great interest in designing efficient heuristics to solve these computationally difficult problems. Inspired by the rejection-free isoenergetic cluster algorithm developed for Ising spin glasses, we present a generalized cluster update that can be applied to different NP-complete optimization problems with Boolean variables. The cluster updates allow for a wide-spread sampling of phase space, thus speeding up optimization. By carefully tuning the pseudo-temperature (needed to randomize the configurations) of the problem, we show that the method can efficiently tackle problems on topologies with a large site-percolation threshold. We illustrate the ICANP2 heuristic on paradigmatic optimization problems, such as the satisfiability problem and the vertex cover problem.
Oscillator clustering in a resource distribution chain.
Postnov, Dmitry E; Sosnovtseva, Olga V; Mosekilde, Erik
2005-03-01
The paper investigates the special clustering phenomena that one can observe in systems of nonlinear oscillators that are coupled via a shared flow of primary resources (or a common power supply). This type of coupling, which appears to be quite frequent in nature, implies that one can no longer separate the inherent dynamics of the individual oscillator from the properties of the coupling network. Illustrated by examples from microbiological population dynamics, renal physiology, and electronic oscillator theory, we show how competition for primary resources in a resource distribution chain leads to a number of new generic phenomena, including partial synchronization, sliding of the synchronization region with the resource supply, and coupling-induced inhomogeneity.
Clustering Algorithms: Their Application to Gene Expression Data
Oyelade, Jelili; Isewon, Itunuoluwa; Oladipupo, Funke; Aromolaran, Olufemi; Uwoghiren, Efosa; Ameh, Faridah; Achas, Moses; Adebiyi, Ezekiel
2016-01-01
Gene expression data hide vital information required to understand the biological process that takes place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a considerable opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the volume of genes present increase the challenges of comprehending and interpreting the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Therefore, the use of clustering techniques is a first step toward addressing these challenges, which is essential in the data mining process to reveal natural structures and identify interesting patterns in the underlying data. The clustering of gene expression data has been proven to be useful in making known the natural structure inherent in gene expression data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. The other benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to discover and provide useful knowledge of the appropriate clustering technique that will guarantee stability and a high degree of accuracy in its analysis procedure. PMID:27932867
Dong, Feng; Pierpaoli, Elena; Gunn, James E.; Wechsler, Risa H.
2007-10-29
We present a modified adaptive matched filter algorithm designed to identify clusters of galaxies in wide-field imaging surveys such as the Sloan Digital Sky Survey. The cluster-finding technique is fully adaptive to imaging surveys with spectroscopic coverage, multicolor photometric redshifts, no redshift information at all, and any combination of these within one survey. It works with high efficiency in multi-band imaging surveys where photometric redshifts can be estimated with well-understood error distributions. Tests of the algorithm on realistic mock SDSS catalogs suggest that the detected sample is $\sim 85\%$ complete and over 90% pure for clusters with masses above $1.0 \times 10^{14}\,h^{-1}\,M_\odot$ and redshifts up to $z = 0.45$. The errors of estimated cluster redshifts from the maximum likelihood method are shown to be small (typically less than 0.01) over the whole redshift range with photometric redshift errors typical of those found in the Sloan survey. Inside the spherical radius corresponding to a galaxy overdensity of $\Delta = 200$, we find the derived cluster richness $\Lambda_{200}$ a roughly linear indicator of its virial mass $M_{200}$, which well recovers the relation between total luminosity and cluster mass of the input simulation.
HST Imaging of the Globular Clusters in the Fornax Cluster: Color and Luminosity Distributions
NASA Technical Reports Server (NTRS)
Grillmair, C. J.; Forbes, D. A.; Brodie, J.; Elson, R.
1998-01-01
We examine the luminosity and B - I color distribution of globular clusters for three early-type galaxies in the Fornax cluster using imaging data from the Wide Field/Planetary Camera 2 on the Hubble Space Telescope.
Performance Analysis of Apriori Algorithm with Different Data Structures on Hadoop Cluster
NASA Astrophysics Data System (ADS)
Singh, Sudhakar; Garg, Rakhi; Mishra, P. K.
2015-10-01
Mining frequent itemsets from massive datasets has always been one of the most important problems in data mining. Apriori is the most popular and simplest algorithm for frequent itemset mining. To enhance the efficiency and scalability of Apriori, a number of algorithms have been proposed addressing the design of efficient data structures, minimizing database scans, and parallel and distributed processing. MapReduce is the emerging parallel and distributed technology to process big datasets on a Hadoop cluster. To mine big datasets it is essential to re-design the data mining algorithm on this new paradigm. In this paper, we implement three variations of the Apriori algorithm using the data structures hash tree, trie and hash table trie (i.e., trie with hash technique) on the MapReduce paradigm. We emphasize and investigate the significance of these three data structures for the Apriori algorithm on a Hadoop cluster, which has not been given attention yet. Experiments are carried out on both real-life and synthetic datasets, and show that the hash table trie data structure performs far better than the trie and hash tree in terms of execution time. Moreover, the hash tree performs worst.
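As an illustration of how a trie can hold Apriori candidates, the following is a minimal single-process Python sketch. It is not the MapReduce implementation evaluated in the paper: the nested-dict trie, the `COUNT` sentinel, and all function names are assumptions made for this example, and no Hadoop API is used.

```python
from itertools import combinations

COUNT = object()  # sentinel key marking the end of a candidate itemset


def build_trie(candidates):
    """Store sorted candidate itemsets as paths in a nested-dict trie."""
    root = {}
    for cand in candidates:
        node = root
        for item in cand:
            node = node.setdefault(item, {})
        node[COUNT] = 0
    return root


def count_in_trie(node, transaction, start=0):
    """Increment every candidate that is a subset of the sorted transaction."""
    if COUNT in node:
        node[COUNT] += 1
    for i in range(start, len(transaction)):
        child = node.get(transaction[i])
        if child is not None:
            count_in_trie(child, transaction, i + 1)


def collect(node, prefix, min_support, out):
    """Gather candidates whose support count reaches min_support."""
    for key, child in node.items():
        if key is COUNT:
            if child >= min_support:
                out[prefix] = child
        else:
            collect(child, prefix + (key,), min_support, out)


def apriori(transactions, min_support):
    """Return {itemset tuple: support count} for all frequent itemsets."""
    transactions = [tuple(sorted(set(t))) for t in transactions]
    candidates = [(i,) for i in sorted({i for t in transactions for i in t})]
    frequent = {}
    while candidates:
        trie = build_trie(candidates)
        for t in transactions:
            count_in_trie(trie, t)
        level = {}
        collect(trie, (), min_support, level)
        frequent.update(level)
        # Join frequent k-itemsets sharing a (k-1)-prefix, then prune any
        # candidate that has an infrequent k-subset.
        candidates = []
        for a, b in combinations(sorted(level), 2):
            if a[:-1] == b[:-1]:
                cand = a + (b[-1],)
                if all(s in level for s in combinations(cand, len(cand) - 1)):
                    candidates.append(cand)
    return frequent
```

In a MapReduce setting the counting pass would typically run in the mappers over data splits, with reducers summing the per-split counts; that distribution step is left out here.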
Algorithm-dependent fault tolerance for distributed computing
P. D. Hough; M. E. Goldsby; E. J. Walsh
2000-02-01
Large-scale distributed systems assembled from commodity parts, like CPlant, have become common tools in the distributed computing world. Because of their size and diversity of parts, these systems are prone to failures. Applications that are being run on these systems have not been equipped to efficiently deal with failures, nor is there vendor support for fault tolerance. Thus, when a failure occurs, the application crashes. While most programmers make use of checkpoints to allow for restarting of their applications, this is cumbersome and incurs substantial overhead. In many cases, there are more efficient and more elegant ways in which to address failures. The goal of this project is to develop a software architecture for the detection of and recovery from faults in a cluster computing environment. The detection phase relies on the latest techniques developed in the fault tolerance community. Recovery is being addressed in an application-dependent manner, thus allowing the programmer to take advantage of algorithmic characteristics to reduce the overhead of fault tolerance. This architecture will allow large-scale applications to be more robust in high-performance computing environments that are comprised of clusters of commodity computers such as CPlant and SMP clusters.
Sweeney, Timothy E; Chen, Albert C; Gevaert, Olivier
2015-11-19
In order to discover new subsets (clusters) of a data set, researchers often use algorithms that perform unsupervised clustering, namely, the algorithmic separation of a dataset into some number of distinct clusters. Deciding whether a particular separation (or number of clusters, K) is correct is a sort of 'dark art', with multiple techniques available for assessing the validity of unsupervised clustering algorithms. Here, we present a new technique for unsupervised clustering that uses multiple clustering algorithms, multiple validity metrics, and progressively bigger subsets of the data to produce an intuitive 3D map of cluster stability that can help determine the optimal number of clusters in a data set, a technique we call COmbined Mapping of Multiple clUsteriNg ALgorithms (COMMUNAL). COMMUNAL locally optimizes algorithms and validity measures for the data being used. We show its application to simulated data with a known K, and then apply this technique to several well-known cancer gene expression datasets, showing that COMMUNAL provides new insights into clustering behavior and stability in all tested cases. COMMUNAL is shown to be a useful tool for determining K in complex biological datasets, and is freely available as a package for R.
Dynamically Incremental K-means++ Clustering Algorithm Based on Fuzzy Rough Set Theory
NASA Astrophysics Data System (ADS)
Li, Wei; Wang, Rujing; Jia, Xiufang; Jiang, Qing
Because the classic K-means++ clustering algorithm handles only static data, a dynamically incremental K-means++ clustering algorithm (DK-Means++) based on fuzzy rough set theory is presented in this paper. Firstly, in the DK-Means++ clustering algorithm, the similarity formula is improved with weights computed from the importance degree of attributes, which are reduced on the basis of rough fuzzy set theory. Secondly, new data only need to be matched to the granules already clustered by the K-means++ algorithm; only rarely are new data clustered by the classic K-means++ algorithm over the global data. In this way, re-clustering all data each time the dynamic data set changes is avoided, so the efficiency of clustering is improved. Our experiments show that the DK-Means++ algorithm can objectively and efficiently deal with the clustering problem of dynamically incremental data.
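For reference, the static K-means++ seeding step that the abstract takes as its starting point can be sketched in a few lines of Python. This shows only the standard D² sampling of initial centers; the fuzzy-rough attribute weighting and the incremental matching of DK-Means++ are not reproduced, and the function names are hypothetical. The sketch assumes the data contain at least `k` distinct points.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, rng):
    """Pick k initial centers by D^2 sampling (standard K-means++ seeding).

    Each new center is drawn with probability proportional to its squared
    distance from the nearest center chosen so far.
    """
    centers = [rng.choice(points)]
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        chosen = rng.choices(points, weights=d2, k=1)[0]
        centers.append(chosen)
        # Keep, for every point, the distance to its closest center.
        d2 = [min(old, sq_dist(p, chosen)) for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.0)]
    print(kmeanspp_seeds(pts, 3, random.Random(1)))
```

Points that are already centers have weight zero and cannot be drawn again, which is why the seeds tend to spread across well-separated groups.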
Algorithms for distributed and mobile sensing
NASA Astrophysics Data System (ADS)
Isler, Ibrahim Volkan
Sensing remote, complex and large environments is an important task that arises in diverse applications including planetary exploration, monitoring forest fires and the surveillance of large factories. Currently, automation of such sensing tasks in complex environments is achieved either by deploying many stationary sensors in the environment, or by mounting a sensor on a mobile device and using the device to sense the environment. The Eighties and Nineties witnessed tremendous advances in both distributed and mobile sensing technologies. To take advantage of these technologies, it is crucial to design algorithms to perform sensing tasks in an autonomous fashion. In this dissertation, we study four fundamental sensing problems that arise in sensing complex environments with distributed and mobile systems. For mobile sensing systems we study exploration and pursuit-evasion problems. In the exploration problem, the goal is to design a strategy for a mobile robot so that the robot sees every point in an unknown environment as quickly as possible. In the pursuit-evasion problem, the goal is to design a strategy for a pursuer to capture an adversarial evader. For distributed sensing systems we study placement and assignment problems. In the placement problem, the goal is to place sensors in an environment so that every point in the environment is in the range of at least one sensor. The assignment problem deals with the issue of assigning targets to sensors in a network, so that the overall error in estimating the position of the targets is minimized. We present algorithms to perform these sensing tasks in an efficient fashion. Performance guarantees of the algorithms are mathematically proven and evaluated by simulations.
New sampling distributions for evolution algorithms
NASA Astrophysics Data System (ADS)
Sweeney, Francis Dermot
Evolution algorithms are stochastic optimization methods based on evolutionary principles. They have long been used in optimization, and are gaining in popularity. They are particularly useful in high dimensional problems, or in problems where gradient methods fail. Evolution strategies, a class of evolutionary algorithms, are stochastic searches which evolve by mutation. This work proposes a new mutation distribution for use in single objective optimization. Up to now, cost function information obtained by mutations that do not improve fitness has been discarded. In many problems, particularly when cost function calls are expensive, it is desirable to use all available information to guide the search. The new method in this work patches Gaussians of different variances together to create a sampling distribution which delivers mutations designed to direct the search away from regions where low values of fitness have been observed. Analytic results for this new method are derived on idealized problems. The method is compared with existing methods on a range of test problems, and its overall performance attributes are assessed. A new method for multiobjective optimization is also developed. Genetic Algorithms introduce innovation into their populations by a process of bit mutation. This small scale mutation is often insufficient to successfully direct the search, unless the initial population is of sufficient quality. The new method proposed here, termed Rank Biased Sampling, uses the population to create new members, which are resampled across the entire search space from a distribution designed to favor regions which are inadequately represented by the current population. Again, this method is compared to existing methods on some standard test problems. These new optimization methods are then applied to some real-world problems of engineering interest. The optimization routines developed in this work performed well on these applications, and provide good
Gravitation field algorithm and its application in gene cluster
2010-01-01
Background: Searching for optima is one of the most challenging tasks in clustering genes from available experimental data or given functions. SA, GA, PSO and other similar efficient global optimization methods are used by biotechnologists. All these algorithms are based on the imitation of natural phenomena. Results: This paper proposes a novel search optimization algorithm called the Gravitation Field Algorithm (GFA), which is derived from the well-known astronomical theory of planetary formation, the Solar Nebular Disk Model (SNDM). GFA simulates the gravitation field and outperforms GA and SA on some multimodal function optimization problems. GFA can also be used for unimodal functions. GFA clusters the dataset from the Gene Expression Omnibus well. Conclusions: The mathematical proof demonstrates that GFA could converge to the global optimum with probability 1 under three conditions for mass functions of one independent variable. In addition to these results, the fundamental optimization concept in this paper is used to analyze how SA and GA affect the global search and the inherent defects in SA and GA. Some results and source code (in Matlab) are publicly available at http://ccst.jlu.edu.cn/CSBG/GFA. PMID:20854683
The Hierarchical Distribution of Young Stellar Clusters in Nearby Galaxies
NASA Astrophysics Data System (ADS)
Grasha, Kathryn; Calzetti, Daniela
2017-01-01
We investigate the spatial distributions of young stellar clusters in six nearby galaxies to trace the large scale hierarchical star-forming structures. The six galaxies are drawn from the Legacy ExtraGalactic UV Survey (LEGUS). We quantify the strength of the clustering among stellar clusters as a function of spatial scale and age to establish the survival timescale of the substructures. We separate the clusters into different classes, compact (bound) clusters and associations (unbound), and compare the clustering among them. We find that younger star clusters are more strongly clustered over small spatial scales and that the clustering disappears rapidly for ages as young as a few tens of Myr, consistent with clusters slowly losing the fractal dimension inherited at birth from their natal molecular clouds.
A new detection algorithm for microcalcification clusters in mammographic screening
NASA Astrophysics Data System (ADS)
Xie, Weiying; Ma, Yide; Li, Yunsong
2015-05-01
A novel approach for detecting microcalcification clusters is proposed. First, we briefly analyze mammographic images with microcalcification lesions to confirm that these lesions have much greater gray values than normal regions. After summarizing the specific features of microcalcification clusters in mammographic screening, we focus on the preprocessing step, which includes eliminating the background, enhancing the image, and eliminating the pectoral muscle. In detail, the Chan-Vese model is used to eliminate the background. We then combine a morphology method with an edge detection method. After the AND operation and Sobel filter, we apply the Hough transform; the results show that this effectively eliminates the pectoral muscle, whose gray level is approximately that of microcalcifications. Additionally, the enhancement step is achieved by morphology. We concentrate on mammographic image preprocessing to achieve lower computational complexity. As is well known, robust mammogram analysis is difficult because of the low contrast between normal and lesion tissues and the considerable noise in such images. After this thorough preprocessing, a method based on blob detection is applied to the microcalcification clusters according to their specific features. The proposed algorithm employs the Laplace operator to improve the Difference of Gaussians (DoG) function for low-contrast images. A preliminary evaluation of the proposed method is performed on a known public database, namely MIAS, rather than on synthetic images. The comparison experiments and Cohen's kappa coefficients all demonstrate that our proposed approach can potentially obtain better microcalcification cluster detection results in terms of accuracy, sensitivity and specificity.
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1992-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
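The fuzzy c-means procedure named in the abstract alternates two textbook updates: memberships $u_{ik} \propto d_{ik}^{-2/(m-1)}$ (normalized over clusters) and centers computed as $u_{ik}^m$-weighted means. The Python sketch below is a generic illustration of those updates, assuming Euclidean distance, fuzzifier `m = 2`, and a fixed iteration count; it is not the code applied to the TTD data, and the function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def memberships(points, centers, m):
    """Membership matrix u[k][i] of point k in cluster i; rows sum to 1."""
    u = []
    for p in points:
        d2 = [sq_dist(p, c) for c in centers]
        if 0.0 in d2:
            # Point coincides with one or more centers: share membership there.
            hits = [1.0 if x == 0.0 else 0.0 for x in d2]
            row = [h / sum(hits) for h in hits]
        else:
            # With squared distances the exponent -2/(m-1) becomes -1/(m-1).
            inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
            row = [v / sum(inv) for v in inv]
        u.append(row)
    return u


def update_centers(points, u, m):
    """Centers as means of the points weighted by u**m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        w = [row[i] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(wk * p[d] for wk, p in zip(w, points)) / total
            for d in range(dim)))
    return centers


def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Run a fixed number of FCM iterations from the given initial centers."""
    for _ in range(iters):
        centers = update_centers(points, memberships(points, centers, m), m)
    return centers, memberships(points, centers, m)
```

Unlike hard K-means, every point keeps a graded membership in every cluster, which is what makes the method attractive for identifying gradual transitions between states.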
Graphical representations and cluster algorithms I. Discrete spin systems
NASA Astrophysics Data System (ADS)
Chayes, L.; Machta, J.
1997-02-01
Graphical representations similar to the FK representation are developed for a variety of spin-systems. In several cases, it is established that these representations have (FKG) monotonicity properties which enable characterization theorems for the uniqueness phase and the low-temperature phase of the spin system. Certain systems with intermediate phases and/or first-order transitions are also described in terms of the percolation properties of the representations. In all cases, these representations lead, in a natural fashion, to Swendsen-Wang-type algorithms. Hence, at least in the above-mentioned instances, these algorithms realize the program described by Kandel and Domany, Phys. Rev. B 43 (1991) 8539-8548. All of the algorithms are shown to satisfy a Li-Sokal bound which (at least for systems with a divergent specific heat) implies critical slowing down. However, the representations also give rise to invaded cluster algorithms which may allow for the rapid simulation of some of these systems at their transition points.
Dynamic and static properties of the invaded cluster algorithm
NASA Astrophysics Data System (ADS)
Moriarty, K.; Machta, J.; Chayes, L. Y.
1999-02-01
Simulations of the two-dimensional Ising and three-state Potts models at their critical points are performed using the invaded cluster (IC) algorithm. It is argued that observables measured on a sublattice of size $l$ should exhibit a crossover to Swendsen-Wang (SW) behavior for $l$ sufficiently less than the lattice size $L$, and a scaling form is proposed to describe the crossover phenomenon. It is found that the energy autocorrelation time $\tau_\varepsilon(l,L)$ for an $l \times l$ sublattice attains a maximum in the crossover region, and a dynamic exponent $z_{\rm IC}$ for the IC algorithm is defined according to $\tau_{\varepsilon,\max} \sim L^{z_{\rm IC}}$. Simulation results for the three-state model yield $z_{\rm IC} = 0.346 \pm 0.002$, which is smaller than values of the dynamic exponent found for the SW and Wolff algorithms and also less than the Li-Sokal bound. The results are less conclusive for the Ising model, but it appears that $z_{\rm IC} < 0.21$ and possibly that $\tau_{\varepsilon,\max} \sim \ln L$ so that $z_{\rm IC} = 0$, similar to previous results for the SW and Wolff algorithms.
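A dynamic exponent defined through a power law such as $\tau_{\max} \sim L^{z}$ is commonly estimated as the slope of a straight-line fit of $\ln \tau_{\max}$ against $\ln L$. The Python sketch below shows that generic least-squares fit; it is not the analysis used in the paper, the function name is hypothetical, and the numbers in the usage lines are synthetic values generated from an exact power law, not simulation data.

```python
import math


def fit_power_law_exponent(sizes, taus):
    """Least-squares slope of ln(tau) versus ln(L), i.e. z in tau ~ L**z."""
    xs = [math.log(size) for size in sizes]
    ys = [math.log(tau) for tau in taus]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    # Synthetic check: data built from tau = 2 * L**0.5 should give z = 0.5.
    sizes = [16, 32, 64, 128, 256]
    taus = [2.0 * size ** 0.5 for size in sizes]
    print(fit_power_law_exponent(sizes, taus))
```

A logarithmic growth $\tau_{\max} \sim \ln L$ would show up in such a fit as a slope that keeps decreasing as larger lattice sizes are included, which is one reason the two scenarios are hard to separate on small lattices.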
A comparison of queueing, cluster and distributed computing systems
NASA Technical Reports Server (NTRS)
Kaplan, Joseph A.; Nelson, Michael L.
1993-01-01
Using workstation clusters for distributed computing has become popular with the proliferation of inexpensive, powerful workstations. Workstation clusters offer both a cost effective alternative to batch processing and an easy entry into parallel computing. However, a number of workstations on a network does not constitute a cluster. Cluster management software is necessary to harness the collective computing power. A variety of cluster management and queuing systems are compared: Distributed Queueing Systems (DQS), Condor, Load Leveler, Load Balancer, Load Sharing Facility (LSF - formerly Utopia), Distributed Job Manager (DJM), Computing in Distributed Networked Environments (CODINE), and NQS/Exec. The systems differ in their design philosophy and implementation. Based on published reports on the different systems and conversations with the systems' developers and vendors, a comparison of the systems is made on the integral issues of clustered computing.
A clustering method of Chinese medicine prescriptions based on modified firefly algorithm.
Yuan, Feng; Liu, Hong; Chen, Shou-Qiang; Xu, Liang
2016-12-01
This paper aims to study the clustering method for Chinese medicine (CM) medical cases. The traditional K-means clustering algorithm has shortcomings, such as the dependence of results on the selection of initial values and trapping in local optima, when processing prescriptions from CM medical cases. Therefore, a new clustering method based on the collaboration of the firefly algorithm and the simulated annealing algorithm was proposed. This algorithm dynamically determined the iteration of the firefly algorithm and the sampling of the simulated annealing algorithm according to fitness changes, and increased the diversity of the swarm by expanding the scope of the sudden jump, thereby effectively avoiding premature convergence. The results from confirmatory experiments on CM medical cases suggested that, compared with traditional K-means clustering algorithms, this method greatly improved individual diversity and the obtained clustering results, and that its computing results have a certain reference value for cluster analysis of CM prescriptions.
jClustering, an Open Framework for the Development of 4D Clustering Algorithms
Mateos-Pérez, José María; García-Villalba, Carmen; Pascau, Javier; Desco, Manuel; Vaquero, Juan J.
2013-01-01
We present jClustering, an open framework for the design of clustering algorithms in dynamic medical imaging. We developed this tool because of the difficulty involved in manually segmenting dynamic PET images and the lack of availability of source code for published segmentation algorithms. Providing an easily extensible open tool encourages publication of source code to facilitate the process of comparing algorithms and provide interested third parties with the opportunity to review code. The internal structure of the framework allows an external developer to implement new algorithms easily and quickly, focusing only on the particulars of the method being implemented and not on image data handling and preprocessing. This tool has been coded in Java and is presented as an ImageJ plugin in order to take advantage of all the functionalities offered by this imaging analysis platform. Both binary packages and source code have been published, the latter under a free software license (GNU General Public License) to allow modification if necessary. PMID:23990913
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various conditions were collected in conjunction with JSC postural control studies using a Tilt-Translation Device (TTD). The University of West Florida proposed applying the Fuzzy C-Means Clustering (FCM) Algorithms to this data with a view towards identifying various states and stages. Data supplied by NASA/JSC were submitted to the FCM algorithms in an attempt to identify and characterize cluster substructure in a mixed ensemble of pre- and post-adaptational TTD data. Following several unsuccessful trials with FCM using a full 11-dimensional data set, a set of two channels (features) was found to enable FCM to separate pre- from post-adaptational TTD data. The main conclusions are that: (1) FCM seems able to separate pre- from post-TTD subject no. 2 on the one trial that was used, but only in certain subintervals of time; and (2) Channels 2 (right rear transducer force) and 8 (hip sway bar) contain better discrimination information than other supersets and combinations of the data that were tried so far.
Thermodynamic Casimir effect in films: the exchange cluster algorithm.
Hasenbusch, Martin
2015-02-01
We study the thermodynamic Casimir force for films with various types of boundary conditions and the bulk universality class of the three-dimensional Ising model. To this end, we perform Monte Carlo simulations of the improved Blume-Capel model on the simple cubic lattice. In particular, we employ the exchange or geometric cluster algorithm [Heringa and Blöte, Phys. Rev. E 57, 4976 (1998)]. In a previous work, we demonstrated that this algorithm allows us to compute the thermodynamic Casimir force for the plate-sphere geometry efficiently. It turns out that also for the film geometry a substantial reduction of the statistical error can be achieved. Concerning physics, we focus on (O,O) boundary conditions, where O denotes the ordinary surface transition. These are implemented by free boundary conditions on both sides of the film. Films with such boundary conditions undergo a phase transition in the universality class of the two-dimensional Ising model. We determine the inverse transition temperature for a large range of thicknesses L(0) of the film and study the scaling of this temperature with L(0). In the neighborhood of the transition, the thermodynamic Casimir force is affected by finite size effects, where finite size refers to a finite transversal extension L of the film. We demonstrate that these finite size effects can be computed by using the universal finite size scaling function of the free energy of the two-dimensional Ising model.
Ternary alloy material prediction using genetic algorithm and cluster expansion
Chen, Chong
2015-12-01
This thesis summarizes our study of crystal structure prediction for the Fe-V-Si system using a genetic algorithm and cluster expansion. Our goal is to search for new stable compounds. We started from the ten currently known experimental phases and calculated the formation energies of those compounds using the density functional theory (DFT) package VASP. The convex hull was generated from the DFT calculations of the experimentally known phases. We then performed a random search on some metal-rich (Fe and V) compositions and found that the lowest-energy structures had a body-centered cubic (bcc) underlying lattice, on which we carried out systematic computational searches using the genetic algorithm and cluster expansion. Among the hundreds of compositions searched, thirteen were selected and their DFT formation energies were obtained with VASP. The stability of those thirteen compounds was checked against the experimental convex hull. We found that the composition 24-8-16, i.e., Fe_{3}VSi_{2}, is a new stable phase, which may motivate future experiments.
NASA Astrophysics Data System (ADS)
Komura, Yukihiro
2015-09-01
Cluster-labeling algorithms that use a single GPU can be roughly divided into direct and two-stage approaches. To date, both types use an iterative method to compare the labels of nearest-neighbor sites. In this paper, I present a GPU-based cluster-labeling algorithm that does not use conventional iteration. The proposed method is applicable to both direct algorithms and two-stage approaches. Under the proposed approach, only one comparison with the nearest-neighbor site is needed for a two-dimensional (2D) system, and just two comparisons are needed for three-dimensional (3D) systems. As an application of the new cluster-labeling algorithm, I consider the Swendsen-Wang (SW) multi-cluster spin flip algorithm. The performance of the proposed method is compared with that of other cluster-labeling algorithms for the SW multi-cluster spin flip problem using the 2D and 3D Ising models. As a result, the computation time of the new algorithm is shown to be 40% faster than that of the previous algorithm for the 2D Ising model, and 20% faster than that of the previous algorithm for the 3D Ising model at the critical temperature.
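For orientation, the cluster-labeling step of a Swendsen-Wang update can be sketched on a CPU with a union-find structure. This is a conventional serial illustration, not the GPU method of the abstract above; the function names (`find`, `union`, `sw_labels`), the coupling `J`, and the bond probability 1 - exp(-2*beta*J) for the Ising model are assumptions of this sketch.

```python
import math
import random

def find(parent, i):
    # Follow parent pointers to the root, halving the path on the way.
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

def union(parent, a, b):
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        # Keep the smaller index as root so labels are deterministic.
        if ra < rb:
            parent[rb] = ra
        else:
            parent[ra] = rb

def sw_labels(spins, beta, rng, J=1.0):
    """Return one cluster label per site of a periodic L x L Ising lattice."""
    L = len(spins)
    p_bond = 1.0 - math.exp(-2.0 * beta * J)  # bond activation probability
    parent = list(range(L * L))
    for x in range(L):
        for y in range(L):
            site = x * L + y
            # Visit only the "down" and "right" neighbours, so each bond
            # is considered once (for L >= 3).
            for nx, ny in (((x + 1) % L, y), (x, (y + 1) % L)):
                if spins[x][y] == spins[nx][ny] and rng.random() < p_bond:
                    union(parent, site, nx * L + ny)
    # The label of a site is the root index of its cluster.
    return [find(parent, i) for i in range(L * L)]
```

In a full Swendsen-Wang sweep, each labeled cluster would then be flipped independently with probability 1/2; that step is omitted here.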
Moving Target Tracking through Distributed Clustering in Directional Sensor Networks
Enayet, Asma; Razzaque, Md. Abdur; Hassan, Mohammad Mehedi; Almogren, Ahmad; Alamri, Atif
2014-01-01
The problem of moving target tracking in directional sensor networks (DSNs) introduces new research challenges, including optimal selection of sensing and communication sectors of the directional sensor nodes, determination of the precise location of the target and an energy-efficient data collection mechanism. Existing solutions allow individual sensor nodes to detect the target's location through collaboration among neighboring nodes, where most of the sensors are activated and communicate with the sink. Therefore, they incur much overhead, loss of energy and reduced target tracking accuracy. In this paper, we have proposed a clustering algorithm, where distributed cluster heads coordinate their member nodes in optimizing the active sensing and communication directions of the nodes, precisely determining the target location by aggregating reported sensing data from multiple nodes and transferring the resultant location information to the sink. Thus, the proposed target tracking mechanism minimizes the sensing redundancy and maximizes the number of sleeping nodes in the network. We have also investigated the dynamic approach of activating sleeping nodes on-demand so that the moving target tracking accuracy can be enhanced while maximizing the network lifetime. We have carried out our extensive simulations in ns-3, and the results show that the proposed mechanism achieves higher performance compared to the state-of-the-art works. PMID:25529205
Jiang, Peng; Xu, Yiming; Wu, Feng
2016-01-01
Existing move-restricted node self-deployment algorithms are based on a fixed node communication radius, evaluate the performance based on network coverage or the connectivity rate and do not consider the number of nodes near the sink node and the energy consumption distribution of the network topology, thereby degrading network reliability and the energy consumption balance. Therefore, we propose a distributed underwater node self-deployment algorithm. First, each node begins the uneven clustering based on the distance on the water surface. Each cluster head node selects its next-hop node to synchronously construct a connected path to the sink node. Second, the cluster head node adjusts its depth while maintaining the layout formed by the uneven clustering and then adjusts the positions of in-cluster nodes. The algorithm originally considers the network reliability and energy consumption balance during node deployment and considers the coverage redundancy rate of all positions that a node may reach during the node position adjustment. Simulation results show, compared to the connected dominating set (CDS) based depth computation algorithm, that the proposed algorithm can increase the number of the nodes near the sink node and improve network reliability while guaranteeing the network connectivity rate. Moreover, it can balance energy consumption during network operation, further improve network coverage rate and reduce energy consumption. PMID:26784193
NASA Astrophysics Data System (ADS)
Gong, Lina; Xu, Tao; Zhang, Wei; Li, Xuhong; Wang, Xia; Pan, Wenwen
2017-03-01
Traditional microblog recommendation algorithms suffer from low efficiency and modest effectiveness in the era of big data. To address these issues, this paper proposes a mixed recommendation algorithm with user clustering. The paper first introduces the state of the microblog marketing industry. It then describes the user interest modeling process and the advertisement recommendation methods in detail. Finally, it compares the mixed recommendation algorithm with user clustering against the traditional classification algorithm and the mixed recommendation algorithm without user clustering. The results show that the mixed recommendation algorithm with user clustering achieves good accuracy and recall in microblog advertisement promotion.
Comparing simulated and experimental molecular cluster distributions.
Olenius, Tinja; Schobesberger, Siegfried; Kupiainen-Määttä, Oona; Franchin, Alessandro; Junninen, Heikki; Ortega, Ismael K; Kurtén, Theo; Loukonen, Ville; Worsnop, Douglas R; Kulmala, Markku; Vehkamäki, Hanna
2013-01-01
Formation of secondary atmospheric aerosol particles starts with gas phase molecules forming small molecular clusters. High-resolution mass spectrometry enables the detection and chemical characterization of electrically charged clusters from the molecular scale upward, whereas the experimental detection of electrically neutral clusters, especially as a chemical composition measurement, down to 1 nm in diameter and beyond still remains challenging. In this work we simulated a set of both electrically neutral and charged small molecular clusters, consisting of sulfuric acid and ammonia molecules, with a dynamic collision and evaporation model. Collision frequencies between the clusters were calculated according to classical kinetics, and evaporation rates were derived from first principles quantum chemical calculations with no fitting parameters. We found a good agreement between the modeled steady-state concentrations of negative cluster ions and experimental results measured with the state-of-the-art Atmospheric Pressure interface Time-Of-Flight mass spectrometer (APi-TOF) in the CLOUD chamber experiments at CERN. The model can be used to interpret experimental results and give information on neutral clusters that cannot be directly measured.
GX-Means: A model-based divide and merge algorithm for geospatial image clustering
Vatsavai, Raju; Symons, Christopher T; Chandola, Varun; Jun, Goo
2011-01-01
One of the practical issues in clustering is the specification of the appropriate number of clusters, which is not obvious when analyzing geospatial datasets, partly because they are huge (both in size and spatial extent) and high dimensional. In this paper we present a computationally efficient model-based split and merge clustering algorithm that incrementally finds model parameters and the number of clusters. Additionally, we attempt to provide insights into this problem and other data mining challenges that are encountered when clustering geospatial data. The basic algorithm we present is similar to the G-means and X-means algorithms; however, our proposed approach avoids certain limitations of these well-known clustering algorithms that are pertinent when dealing with geospatial data. We compare the performance of our approach with the G-means and X-means algorithms. Experimental evaluation on simulated data and on multispectral and hyperspectral remotely sensed image data demonstrates the effectiveness of our algorithm.
Evaluation of particle clustering algorithms in the prediction of brownout dust clouds
NASA Astrophysics Data System (ADS)
Govindarajan, Bharath Madapusi
2011-07-01
A study of three Lagrangian particle clustering methods has been conducted with application to the problem of predicting brownout dust clouds that develop when rotorcraft land over surfaces covered with loose sediment. A significant impediment in performing such particle modeling simulations is the extremely large number of particles needed to obtain dust clouds of acceptable fidelity. Computing the motion of each and every individual sediment particle in a dust cloud (which can reach into tens of billions per cubic meter) is computationally prohibitive. The reported work involved the development of computationally efficient clustering algorithms that can be applied to the simulation of dilute gas-particle suspensions at low Reynolds numbers of the relative particle motion. The Gaussian distribution, k-means and Osiptsov's clustering methods were studied in detail to highlight the nuances of each method for a prototypical flow field that mimics the highly unsteady, two-phase vortical particle flow obtained when rotorcraft encounter brownout conditions. It is shown that although clustering algorithms can be problem dependent and have bounds of applicability, they offer the potential to significantly reduce computational costs while retaining the overall accuracy of a brownout dust cloud solution.
NASA Astrophysics Data System (ADS)
Park, Sang Ha; Lee, Seokjin; Sung, Koeng-Mo
Non-negative matrix factorization (NMF) is widely used for monaural musical sound source separation because of its efficiency and good performance. However, an additional clustering process is required because the musical sound mixture is separated into more signals than the number of musical tracks during NMF separation. In the conventional method, manual clustering or training-based clustering is performed with an additional learning process. Recently, a clustering algorithm based on the mel-frequency cepstrum coefficient (MFCC) was proposed for unsupervised clustering. However, MFCC clustering supplies limited information for clustering. In this paper, we propose various timbre features for unsupervised clustering and a clustering algorithm with these features. Simulation experiments are carried out using various musical sound mixtures. The results indicate that the proposed method improves clustering performance, as compared to conventional MFCC-based clustering.
NASA Astrophysics Data System (ADS)
Bahrampour, Soheil; Moshiri, Behzad; Salahshoor, Karim
2009-08-01
Most process fault monitoring systems rely on offline computation and have difficulty with novel faults, which limits their applicability. This paper presents a new online fault detection and isolation (FDI) algorithm based on a distributed online clustering approach. In the proposed approach, a clustering algorithm is used for online detection of a new trend in time series data, which indicates a faulty condition. A distributed technique is used to decompose the overall monitoring task into a series of local monitoring sub-tasks so as to locally track and capture the process faults. This algorithm not only solves the problem of online FDI, but can also handle novel faults. The diagnostic performance of the proposed FDI approach is evaluated on the Tennessee Eastman process plant as a large-scale benchmark problem.
NASA Astrophysics Data System (ADS)
Tramacere, A.; Paraficz, D.; Dubath, P.; Kneib, J.-P.; Courbin, F.
2016-12-01
We present a study on galaxy detection and shape classification using topometric clustering algorithms. We first use the DBSCAN algorithm to extract, from CCD frames, groups of adjacent pixels with significant fluxes and we then apply the DENCLUE algorithm to separate the contributions of overlapping sources. The DENCLUE separation is based on the localization of patterns of local maxima, through an iterative algorithm, which associates each pixel with the closest local maximum. Our main classification goal is to separate elliptical from spiral galaxies. We introduce new sets of features derived from the computation of geometrical invariant moments of the pixel group shape and from the statistics of the spatial distribution of the DENCLUE local maxima patterns. Ellipticals are characterized by a single group of local maxima, related to the galaxy core, while spiral galaxies have additional groups related to segments of spiral arms. We use two different supervised ensemble classification algorithms: Random Forest and Gradient Boosting. Using a sample of ≃24 000 galaxies taken from the Galaxy Zoo 2 main sample with spectroscopic redshifts, we test our classification against the Galaxy Zoo 2 catalogue. We find that features extracted from our pipeline give, on average, an accuracy of ≃93 per cent, when testing on a test set with a size of 20 per cent of our full data set, with features deriving from the angular distribution of density attractors ranking at the top of the discrimination power.
MEASURING THE MASS DISTRIBUTION IN GALAXY CLUSTERS
Geller, Margaret J.; Diaferio, Antonaldo; Rines, Kenneth J.; Serra, Ana Laura
2013-02-10
Cluster mass profiles are tests of models of structure formation. Only two current observational methods of determining the mass profile, gravitational lensing and the caustic technique, are independent of the assumption of dynamical equilibrium. Both techniques enable the determination of the extended mass profile at radii beyond the virial radius. For 19 clusters, we compare the mass profile based on the caustic technique with weak lensing measurements taken from the literature. This comparison offers a test of systematic issues in both techniques. Around the virial radius, the two methods of mass estimation agree to within approximately 30%, consistent with the expected errors in the individual techniques. At small radii, the caustic technique overestimates the mass as expected from numerical simulations. The ratio between the lensing profile and the caustic mass profile at these radii suggests that the weak lensing profiles are a good representation of the true mass profile. At radii larger than the virial radius, the extrapolated Navarro, Frenk and White fit to the lensing mass profile exceeds the caustic mass profile. Contamination of the lensing profile by unrelated structures within the lensing kernel may be an issue in some cases; we highlight the clusters MS0906+11 and A750, superposed along the line of sight, to illustrate the potential seriousness of contamination of the weak lensing signal by these unrelated structures.
A new concept of wildland-urban interface based on city clustering algorithm
NASA Astrophysics Data System (ADS)
Kanevski, M.; Champendal, A.; Vega Orozco, C.; Tonini, M.; Conedera, M.
2012-04-01
Wildland-Urban-Interface (WUI) is a widely used term in the context of wild and forest fires to indicate areas where human infrastructures interact with wildland/forest areas. Many complex problems are associated with the WUI, but the most relevant ones are those related to forest fire hazard and management in densely populated areas where the fire regime is dominated by anthropogenic ignitions. This coexistence enhances both anthropogenic ignition sources and flammable fuels. Furthermore, the growing trend of the WUI and global change effects may even worsen the situation in the near future. Therefore, many studies are dedicated to the WUI problem, focusing on refinement of its definition, development of mapping methods, implementation of measures into specific fire management plans and the validation of the proposed approaches. The present study introduces a new concept of WUI based on the city clustering algorithm (CCA) introduced in Rosenfeld et al., 2008. CCA was proposed as an automatic tool for studying the definition of cities and their distribution. The algorithm uses demographic data - either on a regular or non-regular grid in space - where a city (urban zone) is detected as a cluster of connected populated cells with maximal size. In the present study the CCA is proposed as a tool to develop a new concept of population dynamic analysis crucial to define and to localise WUI. The real case study is based on demographic/census data - organised in a regular grid with a resolution of 100 m - and the forest fire ignition points database from canton Ticino, Switzerland. By changing the spatial scales of demographic cells, the relationships between urban zones (demographic clusters) and forest fire events were statistically analyzed. Corresponding scaling laws were used to understand the interaction between urban zones and forest fires. The first results are good and indicate that the method can be applied to define WUI in an innovative way. Keywords: forest fires
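The CCA idea described above, a city as a maximal cluster of connected populated cells, can be sketched on a regular grid as a connected-component search. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `city_clusters`, the population `threshold`, and the choice of 4-neighbour connectivity are all hypothetical.

```python
from collections import deque

def city_clusters(population, threshold=0):
    """Label maximal clusters of connected populated cells on a regular grid.

    A cell counts as populated when its value exceeds `threshold`.
    Returns (labels, n_clusters); unpopulated cells keep label 0.
    """
    rows, cols = len(population), len(population[0])
    labels = [[0] * cols for _ in range(rows)]
    n_clusters = 0
    for r in range(rows):
        for c in range(cols):
            if population[r][c] <= threshold or labels[r][c]:
                continue
            # Start a new cluster and grow it breadth-first until no
            # populated neighbour remains, which makes it maximal.
            n_clusters += 1
            labels[r][c] = n_clusters
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc),
                               (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and population[nr][nc] > threshold
                            and not labels[nr][nc]):
                        labels[nr][nc] = n_clusters
                        queue.append((nr, nc))
    return labels, n_clusters
```

Coarsening the grid (summing blocks of cells) before calling the function is one way to mimic the change of spatial scale mentioned in the abstract.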
LMC clusters - Age calibration and age distribution revisited
NASA Technical Reports Server (NTRS)
Elson, Rebecca A.; Fall, S. Michael
1988-01-01
The empirical age relation for star clusters in the Large Magellanic Cloud presented by Elson and Fall (1985) is reexamined using ages based only on main-sequence turnoffs. The present sample includes 57 clusters, 24 of which have color-magnitude diagrams published since 1985. The new calibration is very similar to that found previously, and the scatter in the relation corresponds to uncertainties of about a factor of 2 in age. The age distribution derived from the new calibration does not differ significantly from that derived in earlier work. It is compared with age distributions estimated by other authors for different samples of clusters, and the results are discussed.
User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.
ERIC Educational Resources Information Center
Gordon, Michael D.
1991-01-01
Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
Contributions to "k"-Means Clustering and Regression via Classification Algorithms
ERIC Educational Resources Information Center
Salman, Raied
2012-01-01
The dissertation deals with clustering algorithms and transforming regression problems into classification problems. The main contributions of the dissertation are twofold; first, to improve (speed up) the clustering algorithms and second, to develop a strict learning environment for solving regression problems as classification tasks by using…
NASA Astrophysics Data System (ADS)
Liu, Fang
2011-06-01
Image segmentation remains one of the major challenges in image analysis and computer vision. Fuzzy clustering, as a soft segmentation method, has been widely studied and successfully applied in image clustering and segmentation. The fuzzy c-means (FCM) algorithm is the most popular method used in image segmentation. However, most clustering algorithms, such as the k-means and FCM algorithms, search for the final cluster values starting from predetermined initial centers. The FCM algorithm also does not consider the spatial information of pixels and is sensitive to noise. This paper presents a new FCM algorithm with adaptive evolutionary programming for image clustering. The features of this algorithm are as follows. First, it does not need predetermined initial centers: evolutionary programming helps FCM search for better centers and escape bad centers at local minima. Second, both the spatial distance and the Euclidean distance are considered in the FCM clustering, so the algorithm is more robust to noise. Third, an adaptive evolutionary programming scheme is proposed, in which the mutation rule is adaptively changed by learning useful knowledge during the evolving process. Experimental results show that the new image segmentation algorithm is effective and robust to noisy images.
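As background for the dependence on initial centers discussed above, the textbook FCM iteration can be sketched as follows. This is the standard algorithm on one-dimensional data, not the evolutionary or spatial variant proposed in the abstract; the function name `fcm_1d` and its parameters (fuzzifier `m`, `iters`, `tol`) are assumptions of this sketch.

```python
def fcm_1d(data, centers, m=2.0, iters=100, tol=1e-9):
    """Standard fuzzy c-means on 1-D data, started from the given centres.

    Returns (centers, u) where u[k][i] is the membership of data[k] in
    cluster i, computed from the centres of the last membership update.
    """
    centers = list(centers)
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        u = []
        for x in data:
            d = [abs(x - v) for v in centers]
            if 0.0 in d:
                # The point sits exactly on a centre: share full
                # membership among the coinciding centres.
                row = [1.0 if di == 0.0 else 0.0 for di in d]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((di / dj) ** power for dj in d) for di in d]
            u.append(row)
        # Centre update: weighted mean of the data with weights u_ik^m.
        new_centers = []
        for i in range(len(centers)):
            w = [u[k][i] ** m for k in range(len(data))]
            new_centers.append(sum(wk * x for wk, x in zip(w, data)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

Calling `fcm_1d` with different starting `centers` on the same data is a quick way to observe the sensitivity to initialization that motivates the approach in the abstract.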
A Cluster Algorithm for the 2-D SU(3) × SU(3) Chiral Model
NASA Astrophysics Data System (ADS)
Ji, Da-ren; Zhang, Jian-bo
1996-07-01
To extend the cluster algorithm to SU(N) × SU(N) chiral models, a variant of Wolff's cluster algorithm is proposed and tested for the 2-dimensional SU(3) × SU(3) chiral model. The results show that the new method can reduce the critical slowing down in the SU(3) × SU(3) chiral model.
Gravitational lensing by clusters of galaxies - Constraining the mass distribution
NASA Technical Reports Server (NTRS)
Miralda-Escude, Jordi
1991-01-01
The possibility of placing constraints on the mass distribution of a cluster of galaxies by analyzing the cluster's gravitational lensing effect on the images of more distant galaxies is investigated theoretically in the limit of weak distortion. The steps in the proposed analysis are examined in detail, and it is concluded that detectable distortion can be produced by clusters with line-of-sight velocity dispersions of over 500 km/sec. Hence it should be possible to determine (1) the cluster center position (with accuracy equal to the mean separation of the background galaxies), (2) the cluster-potential quadrupole moment (to within about 20 percent of the total potential if velocity dispersion is 1000 km/sec), and (3) the power law for the outer-cluster density profile (if enough background galaxies in the surrounding region are observed).
The distribution of early- and late-type galaxies in the Coma cluster
NASA Technical Reports Server (NTRS)
Doi, M.; Fukugita, M.; Okamura, S.; Turner, E. L.
1995-01-01
The spatial distribution and the morphology-density relation of Coma cluster galaxies are studied using a new homogeneous photometric sample of 450 galaxies down to B = 16.0 mag with quantitative morphology classification. The sample covers a wide area (10 deg × 10 deg), extending well beyond the Coma cluster. Morphological classifications into early- (E+S0) and late- (S) type galaxies are made by an automated algorithm using simple photometric parameters, with which the misclassification rate is expected to be approximately 10% with respect to early and late types given in the Third Reference Catalogue of Bright Galaxies. The flattened distribution of Coma cluster galaxies, as noted in previous studies, is most conspicuously seen if the early-type galaxies are selected. Early-type galaxies are distributed in a thick filament extended from the NE to the WSW direction that delineates a part of large-scale structure. Spiral galaxies show a distribution with a modest density gradient toward the cluster center; at least bright spiral galaxies are present close to the center of the Coma cluster. We also examine the morphology-density relation for the Coma cluster including its surrounding regions.
Security clustering algorithm based on reputation in hierarchical peer-to-peer network
NASA Astrophysics Data System (ADS)
Chen, Mei; Luo, Xin; Wu, Guowen; Tan, Yang; Kita, Kenji
2013-03-01
For the security problems of the hierarchical P2P network (HPN), the paper presents a security clustering algorithm based on reputation (CABR). In the algorithm, we take the reputation mechanism for ensuring the security of transaction and use cluster for managing the reputation mechanism. In order to improve security, reduce cost of network brought by management of reputation and enhance stability of cluster, we select reputation, the historical average online time, and the network bandwidth as the basic factors of the comprehensive performance of node. Simulation results showed that the proposed algorithm improved the security, reduced the network overhead, and enhanced stability of cluster.
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters
Wang, Zhihao; Yi, Jing
2016-01-01
For the shortcoming of fuzzy c-means algorithm (FCM) needing to know the number of clusters in advance, this paper proposed a new self-adaptive method to determine the optimal number of clusters. Firstly, a density-based algorithm was put forward. The algorithm, according to the characteristics of the dataset, automatically determined the possible maximum number of clusters instead of using the empirical rule √n and obtained the optimal initial cluster centroids, improving the limitation of FCM that randomly selected cluster centroids lead the convergence result to the local minimum. Secondly, this paper, by introducing a penalty function, proposed a new fuzzy clustering validity index based on fuzzy compactness and separation, which ensured that when the number of clusters verged on that of objects in the dataset, the value of clustering validity index did not monotonically decrease and was close to zero, so that the optimal number of clusters lost robustness and decision function. Then, based on these studies, a self-adaptive FCM algorithm was put forward to estimate the optimal number of clusters by the iterative trial-and-error process. At last, experiments were done on the UCI, KDD Cup 1999, and synthetic datasets, which showed that the method not only effectively determined the optimal number of clusters, but also reduced the iteration of FCM with the stable clustering result. PMID:28042291
A Self-Adaptive Fuzzy c-Means Algorithm for Determining the Optimal Number of Clusters.
Ren, Min; Liu, Peiyu; Wang, Zhihao; Yi, Jing
2016-01-01
For the shortcoming of fuzzy c-means algorithm (FCM) needing to know the number of clusters in advance, this paper proposed a new self-adaptive method to determine the optimal number of clusters. Firstly, a density-based algorithm was put forward. The algorithm, according to the characteristics of the dataset, automatically determined the possible maximum number of clusters instead of using the empirical rule [Formula: see text] and obtained the optimal initial cluster centroids, improving the limitation of FCM that randomly selected cluster centroids lead the convergence result to the local minimum. Secondly, this paper, by introducing a penalty function, proposed a new fuzzy clustering validity index based on fuzzy compactness and separation, which ensured that when the number of clusters verged on that of objects in the dataset, the value of clustering validity index did not monotonically decrease and was close to zero, so that the optimal number of clusters lost robustness and decision function. Then, based on these studies, a self-adaptive FCM algorithm was put forward to estimate the optimal number of clusters by the iterative trial-and-error process. At last, experiments were done on the UCI, KDD Cup 1999, and synthetic datasets, which showed that the method not only effectively determined the optimal number of clusters, but also reduced the iteration of FCM with the stable clustering result.
NASA Technical Reports Server (NTRS)
Lambeck, P. F.; Rice, D. P.
1976-01-01
Signature extension is intended to increase the space-time range over which a set of training statistics can be used to classify data without significant loss of recognition accuracy. A first cluster matching algorithm MASC (Multiplicative and Additive Signature Correction) was developed at the Environmental Research Institute of Michigan to test the concept of using associations between training and recognition area cluster statistics to define an average signature transformation. A more recent signature extension module CROP-A (Cluster Regression Ordered on Principal Axis) has shown evidence of making significant associations between training and recognition area cluster statistics, with the clusters to be matched being selected automatically by the algorithm.
Comparison and evaluation of network clustering algorithms applied to genetic interaction networks.
Hou, Lin; Wang, Lin; Berg, Arthur; Qian, Minping; Zhu, Yunping; Li, Fangting; Deng, Minghua
2012-01-01
The goal of network clustering algorithms is to detect dense clusters in a network, providing a first step towards the understanding of large-scale biological networks. With numerous recent advances in biotechnologies, large-scale genetic interactions are widely available, but there is a limited understanding of which clustering algorithms may be most effective. In order to address this problem, we conducted a systematic study to compare and evaluate six clustering algorithms in analyzing genetic interaction networks, and investigated influencing factors in choosing algorithms. The algorithms considered in this comparison include hierarchical clustering, topological overlap matrix, bi-clustering, Markov clustering, Bayesian discriminant analysis based community detection, and variational Bayes approach to modularity. Both experimentally identified and synthetically constructed networks were used in this comparison. The accuracy of the algorithms is measured by the Jaccard index in comparing predicted gene modules with benchmark gene sets. The results suggest that the choice differs according to the network topology and evaluation criteria. Hierarchical clustering proved best at predicting protein complexes; Bayesian discriminant analysis based community detection proved best on epistatic miniarray profile (EMAP) datasets; the variational Bayes approach to modularity was noticeably better than the other algorithms on the genome-scale networks.
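The Jaccard index mentioned above is the size of the intersection of two sets divided by the size of their union. The sketch below scores each predicted module by its best-matching benchmark set; that aggregation rule, and the names `jaccard` and `best_match_scores`, are assumptions of this illustration, since the abstract does not state how scores were combined.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)

def best_match_scores(predicted, benchmark):
    """For each predicted module, the highest Jaccard index over all
    benchmark gene sets (a hypothetical aggregation rule)."""
    return [max(jaccard(p, g) for g in benchmark) for p in predicted]
```

For example, the modules {A, B, C} and {B, C, D} share two of four distinct genes, giving an index of 0.5.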
Parallelization of the Wolff single-cluster algorithm
NASA Astrophysics Data System (ADS)
Kaupužs, J.; Rimšāns, J.; Melnik, R. V. N.
2010-02-01
A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of the applications, where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality, and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n≥2. Furthermore, the application of the developed OpenMP code allows us to simulate larger lattices due to the greater operative (shared) memory available.
Parallelization of the Wolff single-cluster algorithm.
Kaupuzs, J; Rimsāns, J; Melnik, R V N
2010-02-01
A parallel [open multiprocessing (OpenMP)] implementation of the Wolff single-cluster algorithm has been developed and tested for the three-dimensional (3D) Ising model. The developed procedure is generalizable to other lattice spin models and its effectiveness depends on the specific application at hand. The applicability of the developed methodology is discussed in the context of the applications, where a sophisticated shuffling scheme is used to generate pseudorandom numbers of high quality, and an iterative method is applied to find the critical temperature of the 3D Ising model with great accuracy. For the lattice with linear size L=1024, we have reached a speedup of about 1.79 times on two processors and about 2.67 times on four processors, as compared to the serial code. According to our estimation, a speedup of about three times on four processors is reachable for the O(n) models with n≥2. Furthermore, the application of the developed OpenMP code allows us to simulate larger lattices due to the greater operative (shared) memory available.
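For readers unfamiliar with the serial baseline, a single Wolff update can be sketched as below. This is a plain serial version on a two-dimensional periodic lattice chosen for brevity, not the OpenMP implementation or the 3D setup of the abstract; the function name `wolff_step`, the coupling `J`, and the add probability 1 - exp(-2*beta*J) for the Ising model are assumptions of this sketch.

```python
import math
import random

def wolff_step(spins, beta, rng, J=1.0):
    """Grow and flip one Wolff cluster in place; return the cluster size."""
    L = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta * J)  # probability to add an aligned neighbour
    x, y = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[x][y]
    # Flipping a site when it joins the cluster also marks it as visited,
    # because it no longer equals seed_spin.
    spins[x][y] = -seed_spin
    stack = [(x, y)]
    size = 1
    while stack:
        cx, cy = stack.pop()
        for nx, ny in (((cx + 1) % L, cy), ((cx - 1) % L, cy),
                       (cx, (cy + 1) % L), (cx, (cy - 1) % L)):
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size
```

The irregular growth of the cluster from a single seed, visible in the stack-driven loop, is what makes this update hard to parallelize efficiently.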
NASA Astrophysics Data System (ADS)
Ward, W. O. C.; Wilkinson, P. B.; Chambers, J. E.; Oxby, L. S.; Bai, L.
2014-04-01
A novel method for the effective identification of bedrock subsurface elevation from electrical resistivity tomography images is described. Identifying subsurface boundaries in the topographic data can be difficult due to smoothness constraints used in inversion, so a statistical population-based approach is used that extends previous work in calculating isoresistivity surfaces. The analysis framework involves a procedure for guiding a clustering approach based on the fuzzy c-means algorithm. An approximation of resistivity distributions, found using kernel density estimation, was utilized as a means of guiding the cluster centroids used to classify data. A fuzzy method was chosen over hard clustering due to uncertainty in hard edges in the topography data, and a measure of clustering uncertainty was identified based on the reciprocal of cluster membership. The algorithm was validated using a direct comparison of known observed bedrock depths at two 3-D survey sites, using real-time GPS information of exposed bedrock by quarrying on one site, and borehole logs at the other. Results show similarly accurate detection as a leading isosurface estimation method, and the proposed algorithm requires significantly less user input and prior site knowledge. Furthermore, the method is effectively dimension-independent and will scale to data of increased spatial dimensions without a significant effect on the runtime. A discussion on the results by automated versus supervised analysis is also presented.
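The fuzzy classification step can be illustrated with the standard fuzzy c-means membership formula. This is a sketch under stated assumptions, not the authors' implementation: the one-dimensional values, the fixed centroids (which in the paper are guided by kernel density estimation), the fuzzifier `m = 2`, and the reading of "reciprocal of cluster membership" as one over the largest membership are all illustrative choices.

```python
def fcm_memberships(values, centroids, m=2.0):
    """Standard fuzzy c-means membership of each 1-D value in each cluster."""
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for v in values:
        dists = [abs(v - c) for c in centroids]
        if 0.0 in dists:
            # A value lying exactly on a centroid belongs fully to it.
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [
                1.0 / sum((d_i / d_k) ** exponent for d_k in dists)
                for d_i in dists
            ]
        memberships.append(row)
    return memberships

# Hypothetical log-resistivity samples and two hypothetical centroids.
values = [1.0, 1.2, 2.0, 2.9, 3.0]
centroids = [1.0, 3.0]
u = fcm_memberships(values, centroids)

# Assumed uncertainty measure: reciprocal of the strongest membership.
uncertainty = [1.0 / max(row) for row in u]
```

A value midway between the two centroids receives membership 0.5 in each cluster and therefore the highest uncertainty, which is the behaviour one would expect near a gradual boundary.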
An efficient algorithm for estimating noise covariances in distributed systems
NASA Technical Reports Server (NTRS)
Dee, D. P.; Cohn, S. E.; Ghil, M.; Dalcher, A.
1985-01-01
An efficient computational algorithm for estimating the noise covariance matrices of large linear discrete stochastic-dynamic systems is presented. Such systems arise typically by discretizing distributed-parameter systems, and their size renders computational efficiency a major consideration. The proposed adaptive filtering algorithm is based on the ideas of Belanger, and is algebraically equivalent to his algorithm. The earlier algorithm, however, has computational complexity proportional to p^6, where p is the number of observations of the system state, while the new algorithm has complexity proportional to only p^3. Further, the formulation of noise covariance estimation as a secondary filter, analogous to state estimation as a primary filter, suggests several generalizations of the earlier algorithm. The performance of the proposed algorithm is demonstrated for a distributed system arising in numerical weather prediction.
NASA Astrophysics Data System (ADS)
Morales-Esteban, Antonio; Martínez-Álvarez, Francisco; Scitovski, Sanja; Scitovski, Rudolf
2014-12-01
In this paper we construct an efficient adaptive Mahalanobis k-means algorithm. In addition, we propose a new efficient algorithm to search for a globally optimal partition obtained by using the adaptive Mahalanobis distance-like function. The algorithm is a generalization of the previously proposed incremental algorithm (Scitovski and Scitovski, 2013). It successively finds optimal partitions with k = 2, 3, … clusters. Therefore, it can also be used to estimate the most appropriate number of clusters in a partition by using various validity indexes. The algorithm has been applied to the seismic catalogues of Croatia and the Iberian Peninsula. Both regions are characterized by moderate seismic activity. One of the main advantages of the algorithm is its ability to discover not only circular but also elliptical shapes, whose geometry fits the faults better. Three seismogenic zonings are proposed for Croatia and two for the Iberian Peninsula and adjacent areas, according to the clusters discovered by the algorithm.
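Why a Mahalanobis-type distance finds elliptical clusters can be seen in a small worked example. The sketch below uses the plain squared Mahalanobis distance in two dimensions, not the paper's adaptive distance-like function (whose exact form is not given in the abstract); the function name and the covariance matrix are hypothetical.

```python
def mahalanobis_sq(point, centre, cov):
    """Squared Mahalanobis distance in 2-D for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Inverse of a 2x2 matrix written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))

# Hypothetical cluster elongated along the x axis (for example along a fault).
cov = ((4.0, 0.0), (0.0, 0.25))
centre = (0.0, 0.0)

along = mahalanobis_sq((2.0, 0.0), centre, cov)   # 2 units along the long axis
across = mahalanobis_sq((0.0, 2.0), centre, cov)  # 2 units across it
```

Both points are the same Euclidean distance from the centre, yet the point along the elongated axis is 16 times closer in the squared Mahalanobis sense, so it is far more likely to be assigned to this cluster.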
The efficiency of star formation in clustered and distributed regions
NASA Astrophysics Data System (ADS)
Bonnell, Ian A.; Smith, Rowan J.; Clark, Paul C.; Bate, Matthew R.
2011-02-01
We investigate the formation of both clustered and distributed populations of young stars in a single molecular cloud. We present a numerical simulation of a 10^4 M⊙ elongated, turbulent molecular cloud and the formation of over 2500 stars. The stars form both in stellar clusters and in a distributed mode, which is determined by the local gravitational binding of the cloud. A density gradient along the major axis of the cloud produces bound regions that form stellar clusters and unbound regions that form a more distributed population. The initial mass function (IMF) also depends on the local gravitational binding of the cloud, with bound regions forming full IMFs, whereas in the unbound, distributed regions the stellar masses cluster around the local Jeans mass and lack both the high-mass and the low-mass stars. The overall efficiency of star formation is ≈ 15 per cent in the cloud when the calculation is terminated, but varies from less than 1 per cent in the regions of distributed star formation to ≈ 40 per cent in regions containing large stellar clusters. Considering that large-scale surveys are likely to catch clouds at all evolutionary stages, the estimate of the (time-averaged) star formation efficiency (SFE) for the giant molecular cloud reported here is only ≈ 4 per cent. This would lead to the erroneous conclusion of slow star formation when in fact it is occurring on a dynamical time-scale.
Logistics distribution centers location problem and algorithm under fuzzy environment
NASA Astrophysics Data System (ADS)
Yang, Lixing; Ji, Xiaoyu; Gao, Ziyou; Li, Keping
2007-11-01
Distribution centers location problem is concerned with how to select distribution centers from the potential set so that the total relevant cost is minimized. This paper mainly investigates this problem under fuzzy environment. Consequentially, chance-constrained programming model for the problem is designed and some properties of the model are investigated. Tabu search algorithm, genetic algorithm and fuzzy simulation algorithm are integrated to seek the approximate best solution of the model. A numerical example is also given to show the application of the algorithm.
Automatic Regionalization Algorithm for Distributed State Estimation in Power Systems: Preprint
Wang, Dexin; Yang, Liuqing; Florita, Anthony; Alam, S.M. Shafiul; Elgindy, Tarek; Hodge, Bri-Mathias
2016-08-01
The deregulation of the power system and the incorporation of generation from renewable energy sources necessitate faster state estimation in the smart grid. Distributed state estimation (DSE) has become a promising and scalable solution to this urgent demand. In this paper, we investigate the regionalization algorithms for the power system, a necessary step before distributed state estimation can be performed. To the best of the authors' knowledge, this is the first investigation into automatic regionalization (AR). We propose three spectral clustering based AR algorithms. Simulations show that our proposed algorithms outperform the two investigated manual regionalization cases. With the help of AR algorithms, we also show how the number of regions impacts the accuracy and convergence speed of the DSE and conclude that the number of regions needs to be chosen carefully to improve the convergence speed of DSEs.
Sun, Liping; Luo, Yonglong; Ding, Xintao; Zhang, Ji
2014-01-01
An important component of a spatial clustering algorithm is the distance measure between sample points in object space. In this paper, the traditional Euclidean distance measure is replaced with innovative obstacle distance measure for spatial clustering under obstacle constraints. Firstly, we present a path searching algorithm to approximate the obstacle distance between two points for dealing with obstacles and facilitators. Taking obstacle distance as similarity metric, we subsequently propose the artificial immune clustering with obstacle entity (AICOE) algorithm for clustering spatial point data in the presence of obstacles and facilitators. Finally, the paper presents a comparative analysis of AICOE algorithm and the classical clustering algorithms. Our clustering model based on artificial immune system is also applied to the case of public facility location problem in order to establish the practical applicability of our approach. By using the clone selection principle and updating the cluster centers based on the elite antibodies, the AICOE algorithm is able to achieve the global optimum and better clustering effect. PMID:25435862
C-element: a new clustering algorithm to find high quality functional modules in PPI networks.
Ghasemi, Mahdieh; Rahgozar, Maseud; Bidkhori, Gholamreza; Masoudi-Nejad, Ali
2013-01-01
Graph clustering algorithms are widely used in the analysis of biological networks. Extracting functional modules in protein-protein interaction (PPI) networks is one such use. Most clustering algorithms that focus on finding functional modules try either to find clique-like subnetworks or to grow clusters starting from high-degree vertices as seeds. These algorithms make no distinction between a biological network and any other network. In the current research, we present a new procedure to find functional modules in PPI networks. Our main idea is to model a biological concept and to use this concept for finding good functional modules in PPI networks. In order to evaluate the quality of the obtained clusters, we compared the results of our algorithm with those of some other widely used clustering algorithms on three high-throughput PPI networks from Saccharomyces cerevisiae, Homo sapiens and Caenorhabditis elegans as well as on some tissue-specific networks. Gene Ontology (GO) analyses were used to compare the results of different algorithms. Each algorithm's result was then compared with GO-term derived functional modules. We also analyzed the effect of using tissue-specific networks on the quality of the obtained clusters. The experimental results indicate that the new algorithm outperforms most of the others, and this improvement is more significant when tissue-specific networks are used.
A Special Local Clustering Algorithm for Identifying the Genes Associated With Alzheimer’s Disease
Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R.; Rogers, Jack T.
2010-01-01
Clustering is the grouping of similar objects into a class. Local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly within the cluster. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed as the special local clustering (SLC) algorithm that was used to process gene microarray data related to Alzheimer’s disease (AD). SLC algorithm was able to group together genes with similar expression patterns and identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm in disease-associated gene identification such as in AD is rarely reported. PMID:20089478
The distribution of ejected brown dwarfs in clusters
NASA Astrophysics Data System (ADS)
Goodwin, S. P.; Hubber, D. A.; Moraux, E.; Whitworth, A. P.
2005-12-01
We examine the spatial distribution of brown dwarfs produced by the decay of small-N stellar systems as expected from the embryo ejection scenario. We model a cluster of several hundred stars grouped into 'cores' of a few stars/brown dwarfs. These cores decay, preferentially ejecting their lowest-mass members. Brown dwarfs are found to have a wider spatial distribution than stars, however once the effects of limited survey areas and unresolved binaries are taken into account it can be difficult to distinguish between clusters with many or no ejections. A large difference between the distributions probably indicates that ejections have occurred, however similar distributions sometimes arise even with ejections. Thus the spatial distribution of brown dwarfs is not necessarily a good discriminator between ejection and non-ejection scenarios.
Block clustering based on difference of convex functions (DC) programming and DC algorithms.
Le, Hoai Minh; Le Thi, Hoai An; Dinh, Tao Pham; Huynh, Van Ngai
2013-10-01
We investigate difference of convex functions (DC) programming and the DC algorithm (DCA) to solve the block clustering problem in the continuous framework, which traditionally requires solving a hard combinatorial optimization problem. DC reformulation techniques and exact penalty in DC programming are developed to build an appropriate equivalent DC program of the block clustering problem. They lead to an elegant and explicit DCA scheme for the resulting DC program. Computational experiments show the robustness and efficiency of the proposed algorithm and its superiority over standard algorithms such as two-mode K-means, two-mode fuzzy clustering, and block classification EM.
Deb, Suash; Yang, Xin-She
2014-01-01
Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730
Fong, Simon; Deb, Suash; Yang, Xin-She; Zhuang, Yan
2014-01-01
Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario.
NASA Astrophysics Data System (ADS)
Bo, Yizhou; Shifa, Naima
2013-09-01
An estimator for finding the abundance of a rare, clustered and mobile population has been introduced. This model is based on adaptive cluster sampling (ACS) to identify the location of the population and negative binomial distribution to estimate the total in each site. To identify the location of the population we consider both sampling with replacement (WR) and sampling without replacement (WOR). Some mathematical properties of the model are also developed.
Ab initio study on (CO2)n clusters via electrostatics- and molecular tailoring-based algorithm
NASA Astrophysics Data System (ADS)
Jovan Jose, K. V.; Gadre, Shridhar R.
An algorithm based on molecular electrostatic potential (MESP) and molecular tailoring approach (MTA) for building energetically favorable molecular clusters is presented. This algorithm is tested on prototype (CO2)n clusters with n = 13, 20, and 25 to explore their structure, energetics, and properties. The most stable clusters in this series are seen to show a larger number of triangular motifs. Many-body energy decomposition analysis performed on the most stable clusters reveals that the 2-body term is the major contributor (>96%) to the total interaction energy. Vibrational frequencies and molecular electrostatic potentials are also evaluated for these large clusters through MTA. The MTA-based MESPs of these clusters show a remarkably good agreement with the corresponding actual ones. The most intense MTA-based normal mode frequencies are in fair agreement with the actual ones for smaller clusters. These calculated asymmetric stretching frequencies are blue-shifted with reference to the CO2 monomer.
Optimization of composite structures by estimation of distribution algorithms
NASA Astrophysics Data System (ADS)
Grosset, Laurent
The design of high performance composite laminates, such as those used in aerospace structures, leads to complex combinatorial optimization problems that cannot be addressed by conventional methods. These problems are typically solved by stochastic algorithms, such as evolutionary algorithms. This dissertation proposes a new evolutionary algorithm for composite laminate optimization, named Double-Distribution Optimization Algorithm (DDOA). DDOA belongs to the family of estimation of distributions algorithms (EDA) that build a statistical model of promising regions of the design space based on sets of good points, and use it to guide the search. A generic framework for introducing statistical variable dependencies by making use of the physics of the problem is proposed. The algorithm uses two distributions simultaneously: the marginal distributions of the design variables, complemented by the distribution of auxiliary variables. The combination of the two generates complex distributions at a low computational cost. The dissertation demonstrates the efficiency of DDOA for several laminate optimization problems where the design variables are the fiber angles and the auxiliary variables are the lamination parameters. The results show that its reliability in finding the optima is greater than that of a simple EDA and of a standard genetic algorithm, and that its advantage increases with the problem dimension. A continuous version of the algorithm is presented and applied to a constrained quadratic problem. Finally, a modification of the algorithm incorporating probabilistic and directional search mechanisms is proposed. The algorithm exhibits a faster convergence to the optimum and opens the way for a unified framework for stochastic and directional optimization.
A Novel Method to Predict Genomic Islands Based on Mean Shift Clustering Algorithm
de Brito, Daniel M.; Maracaja-Coutinho, Vinicius; de Farias, Savio T.; Batista, Leonardo V.; do Rêgo, Thaís G.
2016-01-01
Genomic Islands (GIs) are regions of bacterial genomes that are acquired from other organisms by the phenomenon of horizontal transfer. These regions are often responsible for many important acquired adaptations of the bacteria, with great impact on their evolution and behavior. Nevertheless, these adaptations are usually associated with pathogenicity, antibiotic resistance, degradation and metabolism. Identification of such regions is of medical and industrial interest. For this reason, different approaches for genomic islands prediction have been proposed. However, none of them are capable of predicting precisely the complete repertory of GIs in a genome. The difficulties arise due to the changes in performance of different algorithms in the face of the variety of nucleotide distribution in different species. In this paper, we present a novel method to predict GIs that is built upon mean shift clustering algorithm. It does not require any information regarding the number of clusters, and the bandwidth parameter is automatically calculated based on a heuristic approach. The method was implemented in a new user-friendly tool named MSGIP—Mean Shift Genomic Island Predictor. Genomes of bacteria with GIs discussed in other papers were used to evaluate the proposed method. The application of this tool revealed the same GIs predicted by other methods and also different novel unpredicted islands. A detailed investigation of the different features related to typical GI elements inserted in these new regions confirmed its effectiveness. Stand-alone and user-friendly versions for this new methodology are available at http://msgip.integrativebioinformatics.me. PMID:26731657
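The key property mentioned above, that mean shift needs no preset number of clusters, can be shown with a one-dimensional flat-kernel sketch. It is not the MSGIP implementation: the input values stand in for some per-window genomic feature, the fixed `bandwidth` replaces the paper's automatic heuristic, and the function name is hypothetical.

```python
def mean_shift_1d(points, bandwidth, iterations=50):
    """Shift every point to the mean of its neighbours until modes emerge.

    Uses a flat kernel: all points within `bandwidth` of the current
    position contribute equally to the next position.
    """
    modes = list(points)
    for _ in range(iterations):
        shifted = []
        for m in modes:
            window = [p for p in points if abs(p - m) <= bandwidth]
            shifted.append(sum(window) / len(window))
        modes = shifted
    return modes

# Hypothetical feature values from two groups of genomic windows.
points = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
modes = mean_shift_1d(points, bandwidth=1.0)

# Points that converge to the same mode form one cluster.
labels = [round(m, 6) for m in modes]
n_clusters = len(set(labels))
```

The number of clusters falls out of the data and the bandwidth alone, which is why the bandwidth choice matters so much in practice.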
Software Model Checking for Verifying Distributed Algorithms
2014-10-28
Verification procedure is an intelligent exhaustive search of the state space of the design. Verifying Synchronous Distributed App... Sagar Chaki, June 11, 2014, Carnegie Mellon University. Project webpage: http://mcda.googlecode.com
`Inter-Arrival Time' Inspired Algorithm and its Application in Clustering and Molecular Phylogeny
NASA Astrophysics Data System (ADS)
Kolekar, Pandurang S.; Kale, Mohan M.; Kulkarni-Kale, Urmila
2010-10-01
Bioinformatics, being a multidisciplinary field, applies methods from allied areas of science to data mining using computational approaches. Clustering and molecular phylogeny are key areas of bioinformatics that help in the study of the classification and evolution of organisms. Molecular phylogeny algorithms can be divided into distance-based and character-based methods. Most of these methods, however, depend on pre-alignment of sequences and become computationally intensive as the size of the data increases, and hence demand alternative efficient approaches. `Inter arrival time distribution' (IATD) is a popular concept in the theory of stochastic system modeling, but its potential in molecular data analysis has not been fully explored. The present study reports the application of IATD in bioinformatics for clustering and molecular phylogeny. The proposed method provides IATDs of nucleotides in genomic sequences. A distance function based on statistical parameters of IATDs is proposed, and the distance matrix thus obtained is used for clustering and molecular phylogeny. The method is applied to a dataset of 3' non-coding region (NCR) sequences of Dengue virus type 3 (DENV-3), subtype III, reported in 2008. The phylogram thus obtained revealed the geographical distribution of DENV-3 isolates. Sri Lankan DENV-3 isolates were further observed to be clustered in two sub-clades corresponding to pre- and post-Dengue hemorrhagic fever emergence groups. These results are consistent with those reported earlier, which were obtained using pre-aligned sequence data as input. These findings encourage applications of the IATD-based method in molecular phylogenetic analysis in particular and data mining in general.
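One plausible reading of the IATD of a nucleotide is the distribution of gaps between its successive occurrences in a sequence. The sketch below follows that reading; the abstract does not give the exact definition or the distance function, so the function names and the toy sequence are assumptions.

```python
from collections import Counter

def inter_arrival_times(sequence, symbol):
    """Gaps between successive positions at which `symbol` occurs."""
    positions = [i for i, ch in enumerate(sequence) if ch == symbol]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]

def iatd(sequence, symbol):
    """Empirical distribution (relative frequencies) of the gaps."""
    gaps = inter_arrival_times(sequence, symbol)
    if not gaps:
        return {}
    counts = Counter(gaps)
    return {gap: n / len(gaps) for gap, n in sorted(counts.items())}

sequence = "AACGATAA"  # toy sequence, not real viral data
gaps_a = inter_arrival_times(sequence, "A")
dist_a = iatd(sequence, "A")
mean_gap_a = sum(gaps_a) / len(gaps_a)
```

Summary statistics such as the mean gap can be computed per nucleotide without aligning sequences, which is the property that makes the approach alignment-free.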
Yu, Shanen; Liu, Shuai; Jiang, Peng
2016-01-01
Most existing deployment algorithms for event coverage in underwater wireless sensor networks (UWSNs) do not consider that network communication has non-uniform characteristics in three-dimensional underwater environments. Such deployment algorithms ignore that the nodes are distributed at different depths and have different probabilities for data acquisition, thereby leading to imbalances in the overall network energy consumption, decreasing the network performance, and resulting in poor and unreliable network operation in later stages. Therefore, in this study, we propose an uneven cluster deployment algorithm based on network layering for event coverage. First, according to the energy consumption requirement of the communication load at different depths of the underwater network, we obtain the expected number of deployed nodes and the distribution density of each network layer through theoretical analysis and deduction. Afterward, the network is divided into multiple layers based on uneven clusters, and the heterogeneous communication radius of nodes can improve the network connectivity rate. A recovery strategy is used to balance the energy consumption of nodes in the cluster and can efficiently reconstruct the network topology, which ensures that the network maintains high coverage and connectivity rates over a long period of data acquisition. Simulation results show that the proposed algorithm improves network reliability and prolongs network lifetime by significantly reducing the blind movement of overall network nodes while maintaining high network coverage and connectivity rates. PMID:27973448
NASA Astrophysics Data System (ADS)
Ju, Ying; Zhang, Songming; Ding, Ningxiang; Zeng, Xiangxiang; Zhang, Xingyi
2016-09-01
The field of complex network clustering has gained considerable attention in recent years. In this study, a multi-objective evolutionary algorithm based on membranes is proposed to solve the network clustering problem. The population is divided evenly among different membrane structures. The evolutionary algorithm is carried out in the membrane structures. Individuals in the population are eliminated by the vector of membranes. In the proposed method, two evaluation objectives termed Kernel J-means and Ratio Cut are to be minimized. Extensive experimental comparison with state-of-the-art algorithms shows that the proposed algorithm is effective and promising.
Ju, Ying; Zhang, Songming; Ding, Ningxiang; Zeng, Xiangxiang; Zhang, Xingyi
2016-01-01
The field of complex network clustering has gained considerable attention in recent years. In this study, a multi-objective evolutionary algorithm based on membranes is proposed to solve the network clustering problem. The population is divided evenly among different membrane structures. The evolutionary algorithm is carried out in the membrane structures. Individuals in the population are eliminated by the vector of membranes. In the proposed method, two evaluation objectives termed Kernel J-means and Ratio Cut are to be minimized. Extensive experimental comparison with state-of-the-art algorithms shows that the proposed algorithm is effective and promising. PMID:27670156
NASA Technical Reports Server (NTRS)
Mach, Douglas M.; Christian, Hugh J.; Blakeslee, Richard; Boccippio, Dennis J.; Goodman, Steve J.; Boeck, William
2006-01-01
We describe the clustering algorithm used by the Lightning Imaging Sensor (LIS) and the Optical Transient Detector (OTD) for combining the lightning pulse data into events, groups, flashes, and areas. Events are single pixels that exceed the LIS/OTD background level during a single frame (2 ms). Groups are clusters of events that occur within the same frame and in adjacent pixels. Flashes are clusters of groups that occur within 330 ms and either 5.5 km (for LIS) or 16.5 km (for OTD) of each other. Areas are clusters of flashes that occur within 16.5 km of each other. Many investigators are utilizing the LIS/OTD flash data; therefore, we test how variations in the algorithms for the event-group and group-flash clustering affect the flash count for a subset of the LIS data. We divided the subset into areas with low (1-3), medium (4-15), high (16-63), and very high (64+) numbers of flashes to see how changes in the clustering parameters affect the flash rates in these different sizes of areas. We found that as long as the cluster parameters are within about a factor of two of the current values, the flash counts do not change by more than about 20%. Therefore, the flash clustering algorithm used by the LIS and OTD sensors creates flash rates that are relatively insensitive to reasonable variations in the clustering algorithms.
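The group-to-flash step described above can be sketched with the stated thresholds (330 ms, and 5.5 km for LIS). This is a simplified greedy illustration, not the operational LIS/OTD code: groups are reduced to hypothetical `(time_ms, x_km, y_km)` tuples on a flat plane, each group joins the first flash that contains a group within both thresholds, and flashes are never merged afterwards.

```python
import math

def cluster_groups_into_flashes(groups, max_dt_ms=330.0, max_dist_km=5.5):
    """Greedily assign time-ordered groups to flashes.

    A group joins the first flash that already holds a group within
    max_dt_ms and max_dist_km of it; otherwise it starts a new flash.
    """
    flashes = []
    for group in sorted(groups, key=lambda g: g[0]):
        t, x, y = group
        for flash in flashes:
            if any(
                t - t0 <= max_dt_ms and math.hypot(x - x0, y - y0) <= max_dist_km
                for t0, x0, y0 in flash
            ):
                flash.append(group)
                break
        else:
            flashes.append([group])
    return flashes

# Hypothetical groups: (time in ms, x in km, y in km).
groups = [
    (0.0, 0.0, 0.0),
    (100.0, 3.0, 0.0),
    (300.0, 6.0, 0.0),    # chained to the flash through the 100 ms group
    (1000.0, 0.0, 0.0),   # same place, but too late
    (50.0, 40.0, 40.0),   # same time window, but too far away
]
flashes = cluster_groups_into_flashes(groups)
flash_sizes = [len(f) for f in flashes]
```

Rerunning with `max_dist_km=16.5` gives the OTD spatial threshold, and scaling either parameter is a simple way to reproduce the kind of sensitivity test the abstract reports.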
Du, Tingsong; Hu, Yang; Ke, Xianting
2015-01-01
An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA is based on quantum computing, which offers exponential acceleration for heuristic algorithms; it uses quantum bits to encode the artificial fish and uses the quantum revolving gate, preying behavior, following behavior, and variation of the quantum artificial fish to update the artificial fish in the search for the optimal value. Then, we apply the proposed new algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) to simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems, and the simulation results for the 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss compared with BAFSA, GAFSA, and QAFSA.
A novel harmony search-K means hybrid algorithm for clustering gene expression data.
Nazeer, Ka Abdul; Sebastian, Mp; Kumar, Sd Madhu
2013-01-01
Recent progress in bioinformatics research has led to the accumulation of huge quantities of biological data at various data sources. DNA microarray technology makes it possible to simultaneously analyze large numbers of genes across different samples. Clustering of microarray data can reveal hidden gene expression patterns in large quantities of expression data, which in turn offers tremendous possibilities in functional genomics, comparative genomics, disease diagnosis and drug development. The k-means clustering algorithm is widely used for many practical applications, but the original k-means algorithm has several drawbacks. It is computationally expensive and generates locally optimal solutions that depend on the random choice of the initial centroids. Several methods have been proposed in the literature for improving the performance of the k-means algorithm. A meta-heuristic optimization algorithm named harmony search helps find near-global optimal solutions by searching the entire solution space. Low clustering accuracy of the existing algorithms limits their use in many crucial applications of life sciences. In this paper we propose a novel Harmony Search-K means Hybrid (HSKH) algorithm for clustering gene expression data. Experimental results show that the proposed algorithm produces clusters with better accuracy in comparison with the existing algorithms.
Zhu, Bohui; Ding, Yongsheng; Hao, Kuangrong
2013-01-01
This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. This diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering of ECG arrhythmias. First, raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and waveform of the ECG signal is detected; then, features are extracted from ECG signal to cluster different types of arrhythmias by the IEMMC algorithm. Three types of performance evaluation indicators are used to assess the effect of the IEMMC method for ECG arrhythmias, such as sensitivity, specificity, and accuracy. Compared with K-means and iterSVR algorithms, the IEMMC algorithm reflects better performance not only in clustering result but also in terms of global search ability and convergence ability, which proves its effectiveness for the detection of ECG arrhythmias. PMID:23690875
A Novel Artificial Bee Colony Based Clustering Algorithm for Categorical Data
2015-01-01
Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data. PMID:25993469
Tidal Densities of Globular Clusters and the Galactic Mass Distribution
NASA Astrophysics Data System (ADS)
Lee, Hyung Mok
1990-12-01
The tidal radii of globular clusters reflect the tidal field of the Galaxy. The mass distribution of the Galaxy may thus be obtained if the tidal fields of clusters are well known. Although large uncertainties in the determination of tidal radii have been obstacles to utilizing this method, analysis of the tidal density could give an independent check on the Galactic mass distribution. Recent theoretical modeling of dynamical evolution including a steady Galactic tidal field shows that the observationally determined tidal radii could be systematically larger than the theoretical values by about a factor of 1.5. From the analysis of the entire sample of 148 globular clusters and 7 dwarf spheroidal systems compiled by Webbink (1985), we find that such a reduction from the observed values would make the distribution of tidal density (the mean density within the tidal radius) consistent with the flat rotation curve of our Galaxy out to large distances if the velocity distribution of clusters and dwarf spheroidals with respect to the Galactic center is isotropic.
Illinois Occupational Skill Standards: Finishing and Distribution Cluster.
ERIC Educational Resources Information Center
Illinois Occupational Skill Standards and Credentialing Council, Carbondale.
This document, which is intended as a guide for work force preparation program providers, details the Illinois occupational skill standards for programs preparing students for employment in occupations in the finishing and distribution cluster. The document begins with a brief overview of the Illinois perspective on occupational skill standards…
A distributed scheduling algorithm for heterogeneous real-time systems
NASA Technical Reports Server (NTRS)
Zeineldine, Osman; El-Toweissy, Mohamed; Mukkamala, Ravi
1991-01-01
Much of the previous work on load balancing and scheduling in distributed environments was concerned with homogeneous systems and homogeneous loads. Several of the results indicated that random policies are as effective as other more complex load allocation policies. The effects of heterogeneity on scheduling algorithms for hard real time systems is examined. A distributed scheduler specifically to handle heterogeneities in both nodes and node traffic is proposed. The performance of the algorithm is measured in terms of the percentage of jobs discarded. While a random task allocation is very sensitive to heterogeneities, the algorithm is shown to be robust to such non-uniformities in system components and load.
NASA Astrophysics Data System (ADS)
Zhang, Xian-Kun; Tian, Xue; Li, Ya-Nan; Song, Chen
2014-08-01
The label propagation algorithm (LPA) is a graph-based semi-supervised learning algorithm that can predict the information of unlabeled nodes from a few labeled nodes. It is also a community detection method in the field of complex networks. The algorithm is easy to implement, has low complexity, performs remarkably well, and is widely applied in various fields. However, the randomness of the label propagation makes the algorithm less robust and its classification results unstable. This paper proposes an LPA based on the edge clustering coefficient. To update its label, a node in the network selects the neighbor node whose edge clustering coefficient is the highest rather than a random neighbor node, which effectively restrains the random spread of labels. The experimental results show that the LPA based on the edge clustering coefficient improves the stability and accuracy of the algorithm.
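The neighbor-selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the coefficient follows the commonly used Radicchi-style definition (triangles on the edge plus one, divided by min(k_u - 1, k_v - 1)), edges with a zero denominator are given the lowest priority, ties are broken by node id, and the names `edge_clustering_coefficient` and `propagate_labels` are hypothetical.

```python
def edge_clustering_coefficient(adj, u, v):
    """Assumed Radicchi-style definition: (triangles on edge + 1) / min(k_u - 1, k_v - 1)."""
    triangles = len(adj[u] & adj[v])          # common neighbors close a triangle on (u, v)
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0                            # assumption: pendant edges get lowest priority
    return (triangles + 1) / denom


def propagate_labels(adj, max_rounds=20):
    """Asynchronous label propagation that copies the label of the neighbor
    reached through the edge with the highest clustering coefficient."""
    labels = {node: node for node in adj}     # every node starts with its own label
    for _ in range(max_rounds):
        changed = False
        for node in sorted(adj):
            if not adj[node]:
                continue
            # Deterministic choice instead of a random neighbor; ties go to the smallest id.
            best = max(sorted(adj[node]),
                       key=lambda nb: edge_clustering_coefficient(adj, node, nb))
            if labels[node] != labels[best]:
                labels[node] = labels[best]
                changed = True
        if not changed:
            break
    return labels


# Two triangles joined by a single bridge edge (2-3).
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
communities = propagate_labels(graph)
```

On this toy graph the bridge edge has the lowest coefficient, so labels do not leak across it and each triangle ends up with a single label.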
Improved fuzzy clustering algorithms in segmentation of DC-enhanced breast MRI.
Kannan, S R; Ramathilagam, S; Devi, Pandiyarajan; Sathya, A
2012-02-01
Segmentation of medical images is a difficult and challenging problem due to poor image contrast and artifacts that result in missing or diffuse organ/tissue boundaries. Many researchers have applied various techniques; however, fuzzy c-means (FCM) based algorithms are more effective than other methods. The objective of this work is to develop robust fuzzy clustering segmentation systems for effective segmentation of DCE breast MRI. This paper obtains the robust fuzzy clustering algorithms by incorporating kernel methods, penalty terms, tolerance of the neighborhood attraction, an additional entropy term and fuzzy parameters. The initial centers are obtained using an initialization algorithm to reduce the computational complexity and running time of the proposed algorithms. Experiments on breast images show that the proposed algorithms improve the similarity measurement, handle large amounts of noise, and give better results on data corrupted by noise and other artifacts. The clustering results of the proposed methods are validated using the Silhouette method.
An improved fuzzy c-means clustering algorithm based on shadowed sets and PSO.
Zhang, Jian; Shen, Ling
2014-01-01
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. This new method uses Xie-Beni index as cluster validity and automatically finds the optimal cluster number within a specific range with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect.
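As an illustration of the cluster validity measure mentioned above, the following sketch computes the Xie-Beni index in its usual textbook form (fuzzy compactness divided by the number of points times the minimum squared distance between centers; lower is better). The function name `xie_beni` and the toy data are assumptions for the example and do not come from the paper; the PSO and shadowed-set parts of SP-FCM are not shown.

```python
def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index; memberships[i][k] is the degree of point k in cluster i."""
    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Fuzzy within-cluster scatter.
    compactness = sum(
        (memberships[i][k] ** m) * sqdist(points[k], centers[i])
        for i in range(len(centers))
        for k in range(len(points))
    )
    # Smallest squared distance between any two cluster centers.
    separation = min(
        sqdist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
centers = [(0.0, 0.5), (10.0, 0.5)]
memberships = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
score = xie_beni(points, centers, memberships)
```

A search over candidate cluster numbers would evaluate this index for each partition and keep the one with the smallest value.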
A new clustering algorithm applicable to multispectral and polarimetric SAR images
NASA Technical Reports Server (NTRS)
Wong, Yiu-Fai; Posner, Edward C.
1993-01-01
We describe an application of a scale-space clustering algorithm to the classification of a multispectral and polarimetric SAR image of an agricultural site. After the initial polarimetric and radiometric calibration and noise cancellation, we extracted a 12-dimensional feature vector for each pixel from the scattering matrix. The clustering algorithm was able to partition a set of unlabeled feature vectors from 13 selected sites, each site corresponding to a distinct crop, into 13 clusters without any supervision. The cluster parameters were then used to classify the whole image. The classification map is much less noisy and more accurate than those obtained by hierarchical rules. Starting with every point as a cluster, the algorithm works by melting the system to produce a tree of clusters in the scale space. It can cluster data in any multidimensional space and is insensitive to variability in cluster densities, sizes and ellipsoidal shapes. This algorithm, more powerful than existing ones, may be useful for remote sensing for land use.
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
A Local Scalable Distributed Expectation Maximization Algorithm for Large Peer-to-Peer Networks
NASA Technical Reports Server (NTRS)
Bhaduri, Kanishka; Srivastava, Ashok N.
2009-01-01
This paper offers a local distributed algorithm for expectation maximization in large peer-to-peer environments. The algorithm can be used for a variety of well-known data mining tasks in a distributed environment such as clustering, anomaly detection, target tracking to name a few. This technology is crucial for many emerging peer-to-peer applications for bioinformatics, astronomy, social networking, sensor networks and web mining. Centralizing all or some of the data for building global models is impractical in such peer-to-peer environments because of the large number of data sources, the asynchronous nature of the peer-to-peer networks, and dynamic nature of the data/network. The distributed algorithm we have developed in this paper is provably-correct i.e. it converges to the same result compared to a similar centralized algorithm and can automatically adapt to changes to the data and the network. We show that the communication overhead of the algorithm is very low due to its local nature. This monitoring algorithm is then used as a feedback loop to sample data from the network and rebuild the model when it is outdated. We present thorough experimental results to verify our theoretical claims.
Knitting distributed cluster-state ladders with spin chains
Ronke, R.; D'Amico, I.; Spiller, T. P.
2011-09-15
Recently there has been much study on the application of spin chains to quantum state transfer and communication. Here we discuss the utilization of spin chains (set up for perfect quantum state transfer) for the knitting of distributed cluster-state structures, between spin qubits repeatedly injected and extracted at the ends of the chain. The cluster states emerge from the natural evolution of the system across different excitation number sectors. We discuss the decohering effects of errors in the injection and extraction process as well as the effects of fabrication and random errors.
An Efficient Document Clustering Algorithm and Its Application to a Document Browser.
ERIC Educational Resources Information Center
Tanaka, Hideki; Kumano, Tadashi; Uratani, Noriyoshi; Ehara, Terumasa
1999-01-01
Presents a document-clustering algorithm that uses a term frequency vector for each document in a Japanese collection to produce a hierarchy in the form of a document classification tree. Introduces an application of this algorithm to a Japanese-to-English translation-aid system. (Author/LRW)
Parallel matrix transpose algorithms on distributed memory concurrent computers
Choi, J.; Walker, D.W.; Dongarra, J.J.
1993-10-01
This paper describes parallel matrix transpose algorithms on distributed memory concurrent processors. It is assumed that the matrix is distributed over a P x Q processor template with a block scattered data distribution. P, Q, and the block size can be arbitrary, so the algorithms have wide applicability. The communication schemes of the algorithms are determined by the greatest common divisor (GCD) of P and Q. If P and Q are relatively prime, the matrix transpose algorithm involves complete exchange communication. If P and Q are not relatively prime, processors are divided into GCD groups and the communication operations are overlapped for different groups of processors. Processors transpose GCD wrapped diagonal blocks simultaneously, and the matrix can be transposed with LCM/GCD steps, where LCM is the least common multiple of P and Q. The algorithms make use of non-blocking, point-to-point communication between processors. The use of non-blocking communication allows a processor to overlap the messages that it sends to different processors, thereby avoiding unnecessary synchronization. Combined with the matrix multiplication routine, C = A·B, the algorithms are used to compute parallel multiplications of transposed matrices, C = A^T·B^T, in the PUMMA package. Details of the parallel implementation of the algorithms are given, and results are presented for runs on the Intel Touchstone Delta computer.
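The arithmetic behind the communication scheme can be made concrete with a short sketch. It only evaluates the quantities named in the abstract (GCD, LCM, the LCM/GCD step count, and whether P and Q are relatively prime); the function name `transpose_schedule` and the returned dictionary layout are hypothetical, and no actual message passing is modeled.

```python
from math import gcd


def transpose_schedule(p, q):
    """Summarize the communication pattern for a P x Q processor template."""
    g = gcd(p, q)
    lcm = p * q // g
    return {
        "gcd": g,                       # number of processor groups when g > 1
        "lcm": lcm,
        "steps": lcm // g,              # LCM/GCD steps stated in the abstract
        "complete_exchange": g == 1,    # relatively prime P and Q
    }


print(transpose_schedule(4, 6))   # {'gcd': 2, 'lcm': 12, 'steps': 6, 'complete_exchange': False}
print(transpose_schedule(3, 4))   # {'gcd': 1, 'lcm': 12, 'steps': 12, 'complete_exchange': True}
```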
GenClust: A genetic algorithm for clustering gene expression data
Di Gesú, Vito; Giancarlo, Raffaele; Lo Bosco, Giosué; Raimondi, Alessandra; Scaturro, Davide
2005-01-01
Background Clustering is a key step in the analysis of gene expression data, and in fact, many classical clustering algorithms are used, or more innovative ones have been designed and validated for the task. Despite the widespread use of artificial intelligence techniques in bioinformatics and, more generally, data analysis, there are very few clustering algorithms based on the genetic paradigm, yet that paradigm has great potential in finding good heuristic solutions to a difficult optimization problem such as clustering. Results GenClust is a new genetic algorithm for clustering gene expression data. It has two key features: (a) a novel coding of the search space that is simple, compact and easy to update; (b) it can be used naturally in conjunction with data driven internal validation methods. We have experimented with the FOM methodology, specifically conceived for validating clusters of gene expression data. The validity of GenClust has been assessed experimentally on real data sets, both with the use of validation measures and in comparison with other algorithms, i.e., Average Link, Cast, Click and K-means. Conclusion Experiments show that none of the algorithms we have used is markedly superior to the others across data sets and validation measures; i.e., in many cases the observed differences between the worst and best performing algorithm may be statistically insignificant and they could be considered equivalent. However, there are cases in which an algorithm may be better than others and therefore worthwhile. In particular, experiments for GenClust show that, although simple in its data representation, it converges very rapidly to a local optimum and that its ability to identify meaningful clusters is comparable, and sometimes superior, to that of more sophisticated algorithms. In addition, it is well suited for use in conjunction with data driven internal validation measures and, in particular, the FOM methodology. PMID:16336639
The loop-cluster algorithm for the case of the 6 vertex model
NASA Astrophysics Data System (ADS)
Evertz, Hans Gerd; Marcu, Mihai
1993-03-01
We present the loop algorithm, a new type of cluster algorithm that we recently introduced for the F model. Using the framework of Kandel and Domany, we show how to generalize the algorithm to the arrow flip symmetric 6 vertex model. We propose the principle of least possible freezing as the guide to choosing the values of free parameters in the algorithm. Finally, we briefly discuss the application of our algorithm to simulations of quantum spin systems. In particular, all necessary information is provided for the simulation of spin 1/2 Heisenberg and xxz models.
Semi-supervised clustering algorithm for haplotype assembly problem based on MEC model.
Xu, Xin-Shun; Li, Ying-Xin
2012-01-01
Haplotype assembly is to infer a pair of haplotypes from localized polymorphism data. In this paper, a semi-supervised clustering algorithm-SSK (semi-supervised K-means) is proposed for it, which, to our knowledge, is the first semi-supervised clustering method for it. In SSK, some positive information is firstly extracted. The information is then used to help k-means to cluster all SNP fragments into two sets from which two haplotypes can be reconstructed. The performance of SSK is tested on both real data and simulated data. The results show that it outperforms several state-of-the-art algorithms on minimum error correction (MEC) model.
CHRONICLE: A Two-Stage Density-Based Clustering Algorithm for Dynamic Networks
NASA Astrophysics Data System (ADS)
Kim, Min-Soo; Han, Jiawei
Information networks, such as social networks and that extracted from bibliographic data, are changing dynamically over time. It is crucial to discover time-evolving communities in dynamic networks. In this paper, we study the problem of finding time-evolving communities such that each community freely forms, evolves, and dissolves for any time period. Although the previous t-partite graph based methods are quite effective for discovering such communities from large-scale dynamic networks, they have some weak points such as finding only stable clusters of single path type and not being scalable w.r.t. the time period. We propose CHRONICLE, an efficient clustering algorithm that discovers not only clusters of single path type but also clusters of path group type. In order to find clusters of both types and also control the dynamicity of clusters, CHRONICLE performs the two-stage density-based clustering, which performs the 2nd-stage density-based clustering for the t-partite graph constructed from the 1st-stage density-based clustering result for each timestamp network. For a given data set, CHRONICLE finds all clusters in a fixed time by using a fixed amount of memory, regardless of the number of clusters and the length of clusters. Experimental results using real data sets show that CHRONICLE finds a wider range of clusters in a shorter time with a much smaller amount of memory than the previous method.
The cluster distribution as a test of dark matter models - I. Clustering properties
NASA Astrophysics Data System (ADS)
Borgani, Stefano; Plionis, Manolis; Coles, Peter; Moscardini, Lauro
1995-12-01
We present extended simulations of the large-scale distribution of galaxy clusters in several dark matter models, using an optimized version of the truncated Zel'dovich approximation (TZA). In order to test the reliability of our simulations, we compare them with N-body-based cluster simulations. We find that the TZA provides a very accurate description of the cluster distribution as long as fluctuations on the cluster mass-scale are in the mildly non-linear regime (sigma_8<~1). The low computational cost of this simulation technique allows us to run a large ensemble of 50 realizations for each model, so we are able to quantify accurately the effects of cosmic variance. Six different dark matter models are studied in this work: standard CDM (SCDM), tilted CDM (TCDM) with primordial spectral index n=0.7, cold+hot DM (CHDM) with Omega_hot=0.3, low Hubble constant (h=0.3) CDM (LOWH), and two spatially flat low-density CDM models with Omega_0=0.2 and Omega_Lambda=0.8, having two different normalizations, sigma_8=0.8 (LambdaCDM_1) and sigma_8=1.3 (LambdaCDM_2). We compare the cluster simulations with an extended redshift sample of Abell/ACO clusters, using various statistical measures, such as the integral of the two-point correlation function, J_3, and the probability density function (pdf). We find that the models that best reproduce the clustering of the Abell/ACO cluster sample are the CHDM and the LambdaCDM_1 models. The LambdaCDM_2 model is too strongly clustered, and this clustering is probably overestimated in our simulations as a result of the large sigma_8-value of this model. All of the other models are ruled out at a high confidence level. The pdfs of all models are well approximated by a lognormal distribution, consistent with similar findings for Abell/ACO clusters. The low-order moments of all the pdfs are found to obey a variance-skewness relation of the form gamma~=S_3sigma^4, with S_3~=1.9, independent of the primordial spectral shape and consistent
The distribution of dark matter in the A2256 cluster
NASA Technical Reports Server (NTRS)
Henry, J. Patrick; Briel, Ulrich G.; Nulsen, Paul E. J.
1993-01-01
Using spatially resolved X-ray spectroscopy, it was determined that the X-ray emitting gas in the rich cluster A2256 is nearly isothermal to a radius of at least 0.76/h Mpc, or about three core radii. These data can be used to measure the distribution of the dark matter in the cluster. It was found that the total mass interior to 0.76/h Mpc and 1.5/h Mpc is (0.5 +/- 0.1) x 10^15/h and (1.0 +/- 0.5) x 10^15/h solar masses, respectively, where the errors encompass the full range allowed by all models used. Thus, the mass appropriate to the region where spectral information was obtained is well determined, but the uncertainties become large upon extrapolating beyond that region. It is shown that the galaxy orbits are mildly anisotropic, which may cause the beta discrepancy in this cluster.
A clustering algorithm based on two distance functions for MEC model.
Wang, Ying; Feng, Enmin; Wang, Ruisheng
2007-04-01
Haplotype reconstruction, based on aligned single nucleotide polymorphism (SNP) fragments, is to infer a pair of haplotypes from localized polymorphism data gathered through short genome fragment assembly. This paper first presents two distance functions, which are used to measure the difference degree and similarity degree between SNP fragments. Based on the two distance functions, a clustering algorithm is proposed in order to solve MEC model. The algorithm involves two sections. One is to determine the initial haplotype pair, the other concerns with inferring true haplotype pair by re-clustering. The comparison results prove that our algorithm utilizing two distance functions is effective and feasible.
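To make the MEC objective concrete, the sketch below scores a candidate haplotype pair against a set of SNP fragments: each fragment is assigned to the closer haplotype and the number of positions that would have to be corrected is summed. The string encoding ('0'/'1' for alleles, '-' for uncovered sites) and the function names are assumptions for illustration; the paper's two distance functions and its re-clustering step are not reproduced here.

```python
def mismatches(fragment, haplotype):
    """Count covered positions where the fragment disagrees with the haplotype."""
    return sum(1 for f, h in zip(fragment, haplotype) if f != "-" and f != h)


def mec_score(fragments, hap_a, hap_b):
    """Minimum error correction score: each fragment is charged the number of
    mismatches against whichever of the two haplotypes it matches better."""
    return sum(min(mismatches(f, hap_a), mismatches(f, hap_b)) for f in fragments)


fragments = ["01-0", "0110", "10-1", "1101"]
print(mec_score(fragments, "0110", "1001"))  # 1: only the last fragment needs one correction
```

A clustering-based solver would search over two-way partitions of the fragments and keep the haplotype pair with the lowest score.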
Empirical relations between static and dynamic exponents for Ising model cluster algorithms
NASA Astrophysics Data System (ADS)
Coddington, Paul D.; Baillie, Clive F.
1992-02-01
We have measured the autocorrelations for the Swendsen-Wang and the Wolff cluster update algorithms for the Ising model in two, three, and four dimensions. The data for the Wolff algorithm suggest that the autocorrelations are linearly related to the specific heat, in which case the dynamic critical exponent is z_{int,E}^{W} = α/ν. For the Swendsen-Wang algorithm, scaling the autocorrelations by the average maximum cluster size gives either a constant or a logarithm, which implies that z_{int,E}^{SW} = β/ν for the Ising model.
The Gas Distribution in the Outer Regions of Galaxy Clusters
NASA Technical Reports Server (NTRS)
Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Lau, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, L.; Gastaldello, F.
2012-01-01
Aims. We present our analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We have exploited the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We stacked the density profiles to detect a signal beyond r_200 and measured the typical density and scatter in cluster outskirts. We also computed the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compared our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r_500. Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping agree better with the observed gas distribution. We report high-confidence detection of a systematic difference between cool-core and non-cool-core clusters beyond approximately 0.3 r_200, which we explain by a different distribution of the gas in the two classes. Beyond approximately r_500, galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the ENZO simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside
The Gas Distribution in Galaxy Cluster Outer Regions
NASA Technical Reports Server (NTRS)
Eckert, D.; Vazza, F.; Ettori, S.; Molendi, S.; Nagai, D.; Laue, E. T.; Roncarelli, M.; Rossetti, M.; Snowden, S. L.; Gastaldello, F.
2012-01-01
Aims. We present the analysis of a local (z = 0.04 - 0.2) sample of 31 galaxy clusters with the aim of measuring the density of the X-ray emitting gas in cluster outskirts. We compare our results with numerical simulations to set constraints on the azimuthal symmetry and gas clumping in the outer regions of galaxy clusters. Methods. We exploit the large field-of-view and low instrumental background of ROSAT/PSPC to trace the density of the intracluster gas out to the virial radius. We perform a stacking of the density profiles to detect a signal beyond r_200 and measure the typical density and scatter in cluster outskirts. We also compute the azimuthal scatter of the profiles with respect to the mean value to look for deviations from spherical symmetry. Finally, we compare our average density and scatter profiles with the results of numerical simulations. Results. As opposed to some recent Suzaku results, and confirming previous evidence from ROSAT and Chandra, we observe a steepening of the density profiles beyond approximately r_500. Comparing our density profiles with simulations, we find that non-radiative runs predict density profiles that are too steep, whereas runs including additional physics and/or treating gas clumping are in better agreement with the observed gas distribution. We report for the first time the high-confidence detection of a systematic difference between cool-core and non-cool-core clusters beyond 0.3 r_200, which we explain by a different distribution of the gas in the two classes. Beyond r_500, galaxy clusters deviate significantly from spherical symmetry, with only small differences between relaxed and disturbed systems. We find good agreement between the observed and predicted scatter profiles, but only when the 1% densest clumps are filtered out in the simulations. Conclusions. Comparing our results with numerical simulations, we find that non-radiative simulations fail to reproduce the gas distribution, even well outside cluster
An improved clustering algorithm of tunnel monitoring data for cloud computing.
Zhong, Luo; Tang, KunHao; Li, Lin; Yang, Guang; Ye, JingJing
2014-01-01
With the rapid development of urban construction, the number of urban tunnels is increasing and the data they produce are becoming more and more complex. As a result, traditional clustering algorithms cannot handle the mass data of tunnels. To solve this problem, an improved parallel clustering algorithm based on k-means is proposed. It uses MapReduce within cloud computing to process the data. It not only handles mass data but is also more efficient. Moreover, it can compute the average dissimilarity degree of each cluster in order to clean abnormal data.
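The general shape of a MapReduce-style k-means iteration can be sketched in plain Python: the map phase emits (nearest centroid index, point) pairs and the reduce phase averages the points per key. This is a single-process illustration under stated assumptions, not the paper's cloud implementation; the function names are hypothetical, and the dissimilarity-based cleaning of abnormal data is not shown.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))


def map_phase(points, centroids):
    """Map: emit (cluster index, point) for every input record."""
    for point in points:
        yield nearest(point, centroids), point


def reduce_phase(pairs, centroids):
    """Reduce: average the points that share a cluster index."""
    groups = defaultdict(list)
    for key, point in pairs:
        groups[key].append(point)
    updated = list(centroids)            # empty clusters keep their old centroid
    for key, members in groups.items():
        updated[key] = tuple(sum(col) / len(members) for col in zip(*members))
    return updated


def kmeans(points, centroids, rounds=10):
    """Repeat map and reduce until the centroids stop changing."""
    for _ in range(rounds):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
print(kmeans(data, [(1.0, 1.0), (9.0, 1.0)]))  # [(0.0, 1.0), (10.0, 1.0)]
```

In a real MapReduce job the map calls would run on separate data splits, with the current centroids broadcast to every worker before each round.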
An Enhanced PSO-Based Clustering Energy Optimization Algorithm for Wireless Sensor Network.
Vimalarani, C; Subramanian, R; Sivanandam, S N
2016-01-01
A Wireless Sensor Network (WSN) is a network formed with a maximum number of sensor nodes that are positioned in an application environment to monitor physical entities in a target area, for example, temperature, water level, and pressure monitoring, health care, and various military applications. Most sensor nodes are equipped with self-supported battery power, through which they can perform adequate operations and communicate with neighboring nodes. To maximize the lifetime of Wireless Sensor Networks, energy conservation measures are essential for improving the performance of WSNs. This paper proposes an Enhanced PSO-Based Clustering Energy Optimization (EPSO-CEO) algorithm for Wireless Sensor Networks in which clustering and cluster head selection are done using the Particle Swarm Optimization (PSO) algorithm so as to minimize the power consumption in the WSN. The performance metrics are evaluated and the results are compared with a competitive clustering algorithm to validate the reduction in energy consumption.
Ju, Chunhua
2013-01-01
Although there are many good collaborative recommendation methods, it is still a challenge to increase the accuracy and diversity of these methods to fulfill users' preferences. In this paper, we propose a novel collaborative filtering recommendation approach based on K-means clustering algorithm. In the process of clustering, we use artificial bee colony (ABC) algorithm to overcome the local optimal problem caused by K-means. After that we adopt the modified cosine similarity to compute the similarity between users in the same clusters. Finally, we generate recommendation results for the corresponding target users. Detailed numerical analysis on a benchmark dataset MovieLens and a real-world dataset indicates that our new collaborative filtering approach based on users clustering algorithm outperforms many other recommendation methods. PMID:24381525
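The abstract does not define its "modified cosine similarity". One common variant, shown here purely as an assumption, is the adjusted cosine: each user's mean rating is subtracted before the cosine is taken over co-rated items. The function name and the dictionary-based rating format are hypothetical.

```python
import math

def adjusted_cosine(ratings_u, ratings_v):
    """Cosine similarity over co-rated items after removing each user's mean rating.

    ratings_u, ratings_v: dicts mapping item id -> rating.
    """
    common = set(ratings_u) & set(ratings_v)
    if not common:
        return 0.0
    mean_u = sum(ratings_u.values()) / len(ratings_u)
    mean_v = sum(ratings_v.values()) / len(ratings_v)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = math.sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = math.sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # a user with constant ratings carries no preference signal
    return num / (den_u * den_v)
```

Restricting the comparison to users in the same cluster, as the abstract describes, would simply mean calling this function only for pairs that share a cluster label.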
GPU-based single-cluster algorithm for the simulation of the Ising model
NASA Astrophysics Data System (ADS)
Komura, Yukihiro; Okabe, Yutaka
2012-02-01
We present the GPU calculation with the common unified device architecture (CUDA) for the Wolff single-cluster algorithm of the Ising model. Proposing an algorithm for a quasi-block synchronization, we realize the Wolff single-cluster Monte Carlo simulation with CUDA. We perform parallel computations for the newly added spins in the growing cluster. As a result, the GPU calculation speed for the two-dimensional Ising model at the critical temperature with the linear size L = 4096 is 5.60 times as fast as the calculation speed on a current CPU core. For the three-dimensional Ising model with the linear size L = 256, the GPU calculation speed is 7.90 times as fast as the CPU calculation speed. The idea of quasi-block synchronization can be used not only in the cluster algorithm but also in many fields where the synchronization of all threads is required.
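For readers unfamiliar with the underlying method, here is a minimal serial sketch of one Wolff single-cluster update for the two-dimensional Ising model with coupling J = 1 and periodic boundaries. It is a plain CPU version for illustration only and says nothing about the CUDA quasi-block synchronization described in the abstract; the function name and the list-of-lists lattice layout are assumptions.

```python
import math
import random

def wolff_step(spins, L, beta, rng=random):
    """Grow and flip one Wolff cluster on an L x L lattice; return the cluster size."""
    p_add = 1.0 - math.exp(-2.0 * beta)   # bond activation probability for J = 1
    x, y = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[x][y]
    spins[x][y] = -seed_spin              # flip on visit, so a site is never added twice
    stack = [(x, y)]
    size = 1
    while stack:
        cx, cy = stack.pop()
        neighbours = [((cx + 1) % L, cy), ((cx - 1) % L, cy),
                      (cx, (cy + 1) % L), (cx, (cy - 1) % L)]
        for nx, ny in neighbours:
            # only aligned (not yet flipped) neighbours may join the cluster
            if spins[nx][ny] == seed_spin and rng.random() < p_add:
                spins[nx][ny] = -seed_spin
                stack.append((nx, ny))
                size += 1
    return size
```

The inner loop over newly added spins is the part the abstract reports parallelizing on the GPU.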
Uy, D.L.
1996-02-01
An algorithm for detection and identification of image clusters or "blobs" based on color information for an autonomous mobile robot is developed. The input image data are first processed using a crisp color fuzzifier, a binary smoothing filter, and a median filter. The processed image data are then input to the image cluster detection and identification program. The program employs the concept of an "elastic rectangle" that stretches in such a way that the whole blob is finally enclosed in a rectangle. A C program is developed to test the algorithm. The algorithm is tested only on image data of size 8x8 with different numbers of blobs in them. The algorithm works very well in detecting and identifying image clusters.
A review of estimation of distribution algorithms in bioinformatics
Armañanzas, Rubén; Inza, Iñaki; Santana, Roberto; Saeys, Yvan; Flores, Jose Luis; Lozano, Jose Antonio; Peer, Yves Van de; Blanco, Rosa; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro
2008-01-01
Evolutionary search algorithms have become an essential asset in the algorithmic toolbox for solving high-dimensional optimization problems across a broad range of bioinformatics problems. Genetic algorithms, the most well-known and representative evolutionary search technique, have been the subject of the major part of such applications. Estimation of distribution algorithms (EDAs) offer a novel evolutionary paradigm that constitutes a natural and attractive alternative to genetic algorithms. They make use of a probabilistic model, learnt from the promising solutions, to guide the search process. In this paper, we set out a basic taxonomy of EDA techniques, underlining the nature and complexity of the probabilistic model of each EDA variant. We review a set of innovative works that make use of EDA techniques to solve challenging bioinformatics problems, emphasizing the EDA paradigm's potential for further research in this domain. PMID:18822112
Haplotype-based quantitative trait mapping using a clustering algorithm
Li, Jing; Zhou, Yingyao; Elston, Robert C
2006-01-01
Background With the availability of large-scale, high-density single-nucleotide polymorphism (SNP) markers, substantial effort has been made in identifying disease-causing genes using linkage disequilibrium (LD) mapping by haplotype analysis of unrelated individuals. In addition to complex diseases, many continuously distributed quantitative traits are of primary clinical and health significance. However the development of association mapping methods using unrelated individuals for quantitative traits has received relatively less attention. Results We recently developed an association mapping method for complex diseases by mining the sharing of haplotype segments (i.e., phased genotype pairs) in affected individuals that are rarely present in normal individuals. In this paper, we extend our previous work to address the problem of quantitative trait mapping from unrelated individuals. The method is non-parametric in nature, and statistical significance can be obtained by a permutation test. It can also be incorporated into the one-way ANCOVA (analysis of covariance) framework so that other factors and covariates can be easily incorporated. The effectiveness of the approach is demonstrated by extensive experimental studies using both simulated and real data sets. The results show that our haplotype-based approach is more robust than two statistical methods based on single markers: a single SNP association test (SSA) and the Mann-Whitney U-test (MWU). The algorithm has been incorporated into our existing software package called HapMiner, which is available from our website at . Conclusion For QTL (quantitative trait loci) fine mapping, to identify QTNs (quantitative trait nucleotides) with realistic effects (the contribution of each QTN less than 10% of total variance of the trait), large samples sizes (≥ 500) are needed for all the methods. The overall performance of HapMiner is better than that of the other two methods. Its effectiveness further depends on other
A fast density-based clustering algorithm for real-time Internet of Things stream.
Amini, Amineh; Saboohi, Hadi; Wah, Teh Ying; Herawan, Tutut
2014-01-01
Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of techniques for clustering data streams. They can detect arbitrarily shaped clusters, handle outliers, and do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a proper choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a fast processing time, making it applicable to real-time applications of IoT devices. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets.
Efficient cluster Monte Carlo algorithm for Ising spin glasses in more than two space dimensions
NASA Astrophysics Data System (ADS)
Ochoa, Andrew J.; Zhu, Zheng; Katzgraber, Helmut G.
2015-03-01
A cluster algorithm that speeds up slow dynamics in simulations of nonplanar Ising spin glasses away from criticality is urgently needed. In theory, the cluster algorithm proposed by Houdayer poses no advantage over local moves in systems with a percolation threshold below 50%, such as cubic lattices. However, we show that the frustration present in Ising spin glasses prevents the growth of system-spanning clusters at temperatures roughly below the characteristic energy scale J of the problem. Adding Houdayer cluster moves to simulations of Ising spin glasses for T ~ J produces a speedup that grows with the system size over conventional local moves. We show results for the nonplanar quasi-two-dimensional Chimera graph of the D-Wave Two quantum annealer, as well as conventional three-dimensional Ising spin glasses, where in both cases the addition of cluster moves speeds up thermalization visibly in the physically-interesting low temperature regime.
A Class of Manifold Regularized Multiplicative Update Algorithms for Image Clustering.
Yang, Shangming; Yi, Zhang; He, Xiaofei; Li, Xuelong
2015-12-01
Multiplicative update algorithms are important tools for information retrieval, image processing, and pattern recognition. However, when the graph regularization is added to the cost function, different classes of sample data may be mapped to the same subspace, which leads to the increase of data clustering error rate. In this paper, an improved nonnegative matrix factorization (NMF) cost function is introduced. Based on the cost function, a class of novel graph regularized NMF algorithms is developed, which results in a class of extended multiplicative update algorithms with manifold structure regularization. Analysis shows that in the learning, the proposed algorithms can efficiently minimize the rank of the data representation matrix. Theoretical results presented in this paper are confirmed by simulations. For different initializations and data sets, variation curves of cost functions and decomposition data are presented to show the convergence features of the proposed update rules. Basis images, reconstructed images, and clustering results are utilized to present the efficiency of the new algorithms. Last, the clustering accuracies of different algorithms are also investigated, which shows that the proposed algorithms can achieve state-of-the-art performance in applications of image clustering.
Zhong, Wei; Altun, Gulsah; Harrison, Robert; Tai, Phang C; Pan, Yi
2005-09-01
Information about local protein sequence motifs is very important to the analysis of biologically significant conserved regions of protein sequences. These conserved regions can potentially determine the diverse conformation and activities of proteins. In this work, recurring sequence motifs of proteins are explored with an improved K-means clustering algorithm on a new dataset. The structural similarity of these recurring sequence clusters to produce sequence motifs is studied in order to evaluate the relationship between sequence motifs and their structures. To the best of our knowledge, the dataset used by our research is the most updated dataset among similar studies for sequence motifs. A new greedy initialization method for the K-means algorithm is proposed to improve traditional K-means clustering techniques. The new initialization method tries to choose suitable initial points, which are well separated and have the potential to form high-quality clusters. Our experiments indicate that the improved K-means algorithm satisfactorily increases the percentage of sequence segments belonging to clusters with high structural similarity. Careful comparison of sequence motifs obtained by the improved and traditional algorithms also suggests that the improved K-means clustering algorithm may discover some relatively weak and subtle sequence motifs, which are undetectable by the traditional K-means algorithms. Many biochemical tests reported in the literature show that these sequence motifs are biologically meaningful. Experimental results also indicate that the improved K-means algorithm generates more detailed sequence motifs representing common structures than previous research. Furthermore, these motifs are universally conserved sequence patterns across protein families, overcoming some weak points of other popular sequence motifs. The satisfactory result of the experiment suggests that this new K-means algorithm may be applied to other areas of bioinformatics
Impacts of Time Delays on Distributed Algorithms for Economic Dispatch
Yang, Tao; Wu, Di; Sun, Yannan; Lian, Jianming
2015-07-26
The economic dispatch problem (EDP) is an important problem in power systems. It can be formulated as an optimization problem with the objective of minimizing the total generation cost subject to the power balance constraint and generator capacity limits. Recently, several consensus-based algorithms have been proposed to solve the EDP in a distributed manner. However, the impacts of communication time delays on these distributed algorithms are not fully understood, especially for the case where the communication network is directed, i.e., the information exchange is unidirectional. This paper investigates communication time delay effects on a distributed algorithm for directed communication networks. The algorithm has been tested by applying time delays to different types of information exchange. Several case studies are carried out to evaluate the effectiveness and performance of the algorithm in the presence of time delays in communication networks. It is found that time delays have negative effects on the convergence rate, and can even cause the algorithm to converge to an incorrect value or fail to converge.
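To make the optimization problem concrete, the sketch below solves a small economic dispatch instance centrally by bisection on the incremental cost (the Lagrange multiplier of the power balance constraint). This is a textbook approach, not the distributed consensus algorithm studied in the paper. The quadratic cost model a*P**2 + b*P, the function name, and the tuple layout are assumptions, and the demand is assumed to lie within the total capacity range.

```python
def dispatch(gens, demand, tol=1e-9):
    """Centralized economic dispatch by bisection on the incremental cost.

    gens: list of (a, b, p_min, p_max) with generation cost a*P**2 + b*P, a > 0.
    Returns (incremental cost, list of generator outputs).
    """
    def outputs(lam):
        # each unit produces where its marginal cost 2*a*P + b equals lam,
        # clipped to its capacity limits
        return [min(max((lam - b) / (2 * a), p_lo), p_hi) for a, b, p_lo, p_hi in gens]

    lam_lo = min(2 * a * p_lo + b for a, b, p_lo, _ in gens)
    lam_hi = max(2 * a * p_hi + b for a, b, _, p_hi in gens)
    while lam_hi - lam_lo > tol:
        mid = (lam_lo + lam_hi) / 2
        if sum(outputs(mid)) < demand:
            lam_lo = mid   # not enough generation: raise the price
        else:
            lam_hi = mid
    lam = (lam_lo + lam_hi) / 2
    return lam, outputs(lam)
```

Consensus-based schemes aim at the same equal-incremental-cost condition, but each generator estimates the multiplier from messages exchanged with its neighbours, which is where communication delays enter.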
Distributed query plan generation using multiobjective genetic algorithm.
Panicker, Shina; Kumar, T V Vijay
2014-01-01
A distributed query processing strategy, which is a key performance determinant in accessing distributed databases, aims to minimize the total query processing cost. One way to achieve this is by generating efficient distributed query plans that involve fewer sites for processing a query. In the case of distributed relational databases, the number of possible query plans increases exponentially with respect to the number of relations accessed by the query and the number of sites where these relations reside. Consequently, computing optimal distributed query plans becomes a complex problem. This distributed query plan generation (DQPG) problem has already been addressed using single objective genetic algorithm, where the objective is to minimize the total query processing cost comprising the local processing cost (LPC) and the site-to-site communication cost (CC). In this paper, this DQPG problem is formulated and solved as a biobjective optimization problem with the two objectives being minimize total LPC and minimize total CC. These objectives are simultaneously optimized using a multiobjective genetic algorithm NSGA-II. Experimental comparison of the proposed NSGA-II based DQPG algorithm with the single objective genetic algorithm shows that the former performs comparatively better and converges quickly towards optimal solutions for an observed crossover and mutation probability.
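The biobjective formulation rests on Pareto dominance between candidate plans. The following sketch shows only that comparison for (LPC, CC) cost pairs, both minimized; it is not NSGA-II itself, and the function names and sample costs are illustrative assumptions.

```python
def dominates(a, b):
    """True if cost tuple a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(costs):
    """Return the cost tuples that no other tuple dominates (the first front)."""
    return [c for c in costs if not any(dominates(o, c) for o in costs)]

# Hypothetical (local processing cost, communication cost) pairs for five query plans.
plans = [(10, 5), (8, 7), (12, 4), (9, 9), (10, 6)]
front = pareto_front(plans)
```

NSGA-II repeats this kind of non-dominated sorting over the whole population and adds a crowding-distance measure to keep the front well spread.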
Durrell, Patrick R.; Accetta, Katharine; Côté, Patrick; Blakeslee, John P.; Ferrarese, Laura; McConnachie, Alan; Gwyn, Stephen; Peng, Eric W.; Zhang, Hongxin; Mihos, J. Christopher; Puzia, Thomas H.; Jordán, Andrés; Lançon, Ariane; Liu, Chengze; Cuillandre, Jean-Charles; Boissier, Samuel; Boselli, Alessandro; Courteau, Stéphane; Duc, Pierre-Alain; and others
2014-10-20
We report on a large-scale study of the distribution of globular clusters (GCs) throughout the Virgo cluster, based on photometry from the Next Generation Virgo Cluster Survey (NGVS), a large imaging survey covering Virgo's primary subclusters (Virgo A = M87 and Virgo B = M49) out to their virial radii. Using the g'_o, (g' – i')_o color-magnitude diagram of unresolved and marginally resolved sources within the NGVS, we have constructed two-dimensional maps of the (irregular) GC distribution over 100 deg^2 to a depth of g'_o = 24. We present the clearest evidence to date showing the difference in concentration between red and blue GCs over the full extent of the cluster, where the red (more metal-rich) GCs are largely located around the massive early-type galaxies in Virgo, while the blue (metal-poor) GCs have a much more extended spatial distribution with significant populations still present beyond 83' (∼215 kpc) along the major axes of both M49 and M87. A comparison of our GC maps to the diffuse light in the outermost regions of M49 and M87 shows remarkable agreement in the shape, ellipticity, and boxiness of both luminous systems. We also find evidence for spatial enhancements of GCs surrounding M87 that may be indicative of recent interactions or an ongoing merger history. We compare the GC map to that of the locations of Virgo galaxies and the X-ray intracluster gas, and find generally good agreement between these various baryonic structures. We calculate the Virgo cluster contains a total population of N_GC = 67,300 ± 14,400, of which 35% are located in M87 and M49 alone. For the first time, we compute a cluster-wide specific frequency S_N,CL = 2.8 ± 0.7, after correcting for Virgo's diffuse light. We also find a GC-to-baryonic mass fraction ε_b = 5.7 ± 1.1 × 10^–4 and a GC-to-total cluster mass formation efficiency ε_t = 2.9 ± 0.5 × 10^–5, the latter values
Mesh Algorithms for PDE with Sieve I: Mesh Distribution
Knepley, Matthew G.; Karpeev, Dmitry A.
2009-01-01
We have developed a new programming framework, called Sieve, to support parallel numerical partial differential equation(s) (PDE) algorithms operating over distributed meshes. We have also developed a reference implementation of Sieve in C++ as a library of generic algorithms operating on distributed containers conforming to the Sieve interface. Sieve makes instances of the incidence relation, or arrows, the conceptual first-class objects represented in the containers. Further, generic algorithms acting on this arrow container are systematically used to provide natural geometric operations on the topology and also, through duality, on the data. Finally, coverings and duality are used to encode not only individual meshes, but all types of hierarchies underlying PDE data structures, including multigrid and mesh partitions. In order to demonstrate the usefulness of the framework, we show how the mesh partition data can be represented and manipulated using the same fundamental mechanisms used to represent meshes. We present the complete description of an algorithm to encode a mesh partition and then distribute a mesh, which is independent of the mesh dimension, element shape, or embedding. Moreover, data associated with the mesh can be similarly distributed with exactly the same algorithm. The use of a high level of abstraction within the Sieve leads to several benefits in terms of code reuse, simplicity, and extensibility. We discuss these benefits and compare our approach to other existing mesh libraries.
Adham, Manal T; Bentley, Peter J
2016-08-01
This paper proposes and evaluates a solution to the truck redistribution problem prominent in London's Santander Cycle scheme. Due to the complexity of this NP-hard combinatorial optimisation problem, no efficient optimisation techniques are known to solve the problem exactly. This motivates our use of the heuristic Artificial Ecosystem Algorithm (AEA) to find good solutions in a reasonable amount of time. The AEA is designed to take advantage of highly distributed computer architectures and adapt to changing problems. In the AEA a problem is first decomposed into its relative sub-components; they then evolve solution building blocks that fit together to form a single optimal solution. Three variants of the AEA centred on evaluating clustering methods are presented: the baseline AEA, the community-based AEA which groups stations according to journey flows, and the Adaptive AEA which actively modifies clusters to cater for changes in demand. We applied these AEA variants to the redistribution problem prominent in bike share schemes (BSS). The AEA variants are empirically evaluated using historical data from Santander Cycles to validate the proposed approach and prove its potential effectiveness.
Solvation Effects on Structure and Charge Distribution in Anionic Clusters
NASA Astrophysics Data System (ADS)
Weber, J. Mathias
2015-03-01
The interaction of ions with solvent molecules modifies the properties of both solvent and solute. Solvation generally stabilizes compact charge distributions compared to more diffuse ones. In the most extreme cases, solvation will alter the very composition of the ion itself. We use infrared photodissociation spectroscopy of mass-selected ions to probe how solvation affects the structures and charge distributions of metal-CO2 cluster anions. We gratefully acknowledge the National Science Foundation for funding through Grant CHE-0845618 (for graduate student support) and for instrumentation funding through Grant PHY-1125844.
Spitzer IR Colors and ISM Distributions of Virgo Cluster Spirals
NASA Astrophysics Data System (ADS)
Kenney, Jeffrey D.; Wong, I.; Kenney, Z.; Murphy, E.; Helou, G.; Howell, J.
2012-01-01
IRAC infrared images of 44 spiral and peculiar galaxies from the Spitzer Survey of the Virgo Cluster help reveal the interactions which transform galaxies in clusters. We explore how the location of galaxies in the IR 3.6-8μm color-magnitude diagram is related to the spatial distributions of ISM/star formation, as traced by PAH emission in the 8μm band. Based on their 8μm/PAH radial distributions, we divide the galaxies into 4 groups: normal, truncated, truncated/compact, and anemic. Normal galaxies have relatively normal PAH distributions. They are the "bluest" galaxies, with the largest 8/3.6μm ratios. They are relatively unaffected by the cluster environment, and have probably never passed through the cluster core. Truncated galaxies have a relatively normal 8μm/PAH surface brightness in the inner disk, but are abruptly truncated with little or no emission in the outer disk. They have intermediate ("green") colors, while those which are more severely truncated are "redder". Most truncated galaxies have undisturbed stellar disks and many show direct evidence of active ram pressure stripping. Truncated/compact galaxies have high 8μm/PAH surface brightness in the very inner disk (central 1 kpc) but are abruptly truncated close to the center with little or no emission in the outer disk. They have intermediate global colors, similar to the other truncated galaxies. While they have the most extreme ISM truncation, they have vigorous circumnuclear star formation. Most of these have disturbed stellar disks, and they are probably produced by a combination of gravitational interaction plus ram pressure stripping. Anemic galaxies have a low 8μm/PAH surface brightness even in the inner disk. These are the "reddest" galaxies, with the smallest 8/3.6μm ratios. The origin of the anemics seems to be a combination of starvation, gravitational interactions, and long-ago ram pressure stripping.
A pairwise alignment algorithm which favors clusters of blocks.
Nédélec, Elodie; Moncion, Thomas; Gassiat, Elisabeth; Bossard, Bruno; Duchateau-Nguyen, Guillemette; Denise, Alain; Termier, Michel
2005-01-01
Pairwise sequence alignments aim to decide whether two sequences are related and, if so, to exhibit their related domains. Recent works have pointed out that a significant number of true homologous sequences are missed when using classical comparison algorithms. This is the case when two homologous sequences share several small blocks of homology, each too small to lead to a significant score. On the other hand, classical alignment algorithms, when detecting homologies, may fail to recognize all the significant biological signals. The aim of the paper is to give a solution to these two problems. We propose a new scoring method which tends to increase the score of an alignment when "blocks" are detected. This so-called Block-Scoring algorithm, which makes use of dynamic programming, is worth using as a complementary tool to classical exact alignment methods. We validate our approach by applying it to a large set of biological data. Finally, we give a limit theorem for the score statistics of the algorithm.
Intelligent decision support algorithm for distribution system restoration.
Singh, Reetu; Mehfuz, Shabana; Kumar, Parmod
2016-01-01
The distribution system is the means of revenue for an electric utility. It needs to be restored at the earliest if any feeder or the complete system is tripped out due to a fault or any other cause. Further, uncertainty of the loads results in variations in the distribution network's parameters. Thus, an intelligent algorithm incorporating a hybrid fuzzy-grey relation, which can take into account the uncertainties and compare the sequences, is discussed to analyse and restore the distribution system. Simulation studies are carried out to show the utility of the method by ranking the restoration plans for a typical distribution system. This algorithm also meets the smart grid requirements in terms of an automated restoration plan for the partial/full blackout of the network.
Generalized quantum counting algorithm for non-uniform amplitude distribution
NASA Astrophysics Data System (ADS)
Tan, Jianing; Ruan, Yue; Li, Xi; Chen, Hanwu
2017-03-01
We give a generalized quantum counting algorithm to increase the universality of the quantum counting algorithm. A non-uniform initial amplitude distribution is possible due to the diversity of situations in counting problems or external noise in the amplitude initialization procedure. We give the reason why the quantum counting algorithm is invalid in this situation. By modeling in the three-dimensional space spanned by the unmarked state, the marked state and the free state to the entire Hilbert space of n qubits, we find that the Grover iteration can be regarded as an improper rotation in this space. This allows us to give a formula to solve the counting problem. Furthermore, we express the initial amplitude distribution in the eigenvector basis of the improper rotation matrix. This is necessary to obtain a mathematical analysis of the counting problem in various situations. Finally, we design four simulation experiments, the results of which show that, compared with the original quantum counting algorithm, the generalized quantum counting algorithm performs satisfactorily with respect to three aspects: (1) whether the initial amplitude distribution is uniform; (2) the diversity of situations in counting problems; and (3) whether the phase estimation technique can get the phase exactly.
NASA Astrophysics Data System (ADS)
Monceau, Pascal; Hsiao, Pai-Yi
2002-09-01
We study the Wolff cluster size distributions obtained from Monte Carlo simulations of the Ising phase transition on Sierpinski fractals with Hausdorff dimensions Df between 2 and 3. These distributions are shown to be invariant when going from an iteration step of the fractal to the next under a scaling of the cluster sizes involving the exponent (β/ν)+(γ/ν). Moreover, the decay of the autocorrelation functions at the critical points enables us to calculate the Wolff dynamical critical exponents z for three different values of Df. The Wolff algorithm is more efficient in reducing the critical slowing down when Df is lowered.
Longitudinally-invariant k⊥-clustering algorithms for hadron-hadron collisions
NASA Astrophysics Data System (ADS)
Catani, S.; Dokshitzer, Yu. L.; Seymour, M. H.; Webber, B. R.
1993-09-01
We propose a version of the QCD-motivated "k⊥" jet-clustering algorithm for hadron-hadron collisions which is invariant under boosts along the beam directions. This leads to improved factorization properties and closer correspondence to experimental practice at hadron colliders. We examine alternative definitions of the resolution variables and cluster recombination scheme, and show that the algorithm can be implemented efficiently on a computer to provide a full clustering history of each event. Using simulated data at √S = 1.8 TeV, we study the effects of calorimeter segmentation, hadronization and the soft underlying event, and compare the results with those obtained using a conventional cone-type algorithm.
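The sketch below illustrates one widely used form of boost-invariant k⊥ clustering, written in the inclusive style: the beam distance is d_iB = pt_i^2, the pair distance is d_ij = min(pt_i^2, pt_j^2) * ΔR_ij^2 / R^2 with ΔR^2 = Δη^2 + Δφ^2, and merged objects are combined by pt-weighted averaging. The abstract itself says several resolution variables and recombination schemes were examined, so these particular choices, the radius parameter `R`, the stopping rule, and all function names are assumptions rather than the paper's definitive prescription. The naive O(N^3) loop is for clarity only.

```python
import math

def wrap(dphi):
    """Map an azimuthal difference into (-pi, pi]."""
    while dphi > math.pi:
        dphi -= 2 * math.pi
    while dphi <= -math.pi:
        dphi += 2 * math.pi
    return dphi

def delta_r2(a, b):
    """Squared distance in the (eta, phi) plane; objects are (pt, eta, phi)."""
    return (a[1] - b[1]) ** 2 + wrap(a[2] - b[2]) ** 2

def merge(a, b):
    """pt-weighted recombination (one of several possible schemes)."""
    pt = a[0] + b[0]
    eta = (a[0] * a[1] + b[0] * b[1]) / pt
    phi = wrap(a[2] + b[0] * wrap(b[2] - a[2]) / pt)
    return (pt, eta, phi)

def kt_cluster(particles, R=1.0):
    """Inclusive kt clustering; returns the list of final jets as (pt, eta, phi)."""
    objs = list(particles)
    jets = []
    while objs:
        # start from the smallest beam distance d_iB = pt^2
        best = min(range(len(objs)), key=lambda i: objs[i][0] ** 2)
        best_d, pair = objs[best][0] ** 2, None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d = min(objs[i][0], objs[j][0]) ** 2 * delta_r2(objs[i], objs[j]) / R ** 2
                if d < best_d:
                    best_d, pair = d, (i, j)
        if pair is None:
            jets.append(objs.pop(best))      # closest to the beam: promote to a jet
        else:
            i, j = pair
            merged = merge(objs[i], objs[j])
            objs.pop(j)                      # pop the larger index first
            objs.pop(i)
            objs.append(merged)
    return jets
```

Because only transverse momenta, pseudorapidity differences, and azimuthal differences enter the distances, the procedure is unchanged under boosts along the beam axis for massless inputs, which is the property the abstract emphasizes.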
NASA Astrophysics Data System (ADS)
Cruz, S. M. A.; Marques, J. M. C.; Pereira, F. B.
2016-10-01
We propose improvements to our evolutionary algorithm (EA) [J. M. C. Marques and F. B. Pereira, J. Mol. Liq. 210, 51 (2015)] in order to avoid dissociative solutions in the global optimization of clusters with competing attractive and repulsive interactions. The improved EA outperforms the original version of the method for charged colloidal clusters in the size range 3 ≤ N ≤ 25, which is a very stringent test for global optimization algorithms. While the Bernal spiral is the global minimum for clusters in the interval 13 ≤ N ≤ 18, the lowest-energy structure is a peculiar, so-called beaded-necklace, motif for 19 ≤ N ≤ 25. We have also applied the method for larger sizes and unusual quasi-linear and branched clusters arise as low-energy structures.
Tuning a Major Part of a Clustering Algorithm.
1988-02-01
with core points identified by the local density algorithm
TOTAL MINORITIES   t-4.5*   t.A4*
Median             8        Ilb
uHinge             10       20
Max                14       44
(Mean)             8.1      15.4
*Actual...2.7)
                   current*   average-linkage"   complete-linkage***
Median             25h        47h                35
uHinge             33         65                 39
Max                73         97                 56
(Mean)             29.7       51.1               36.0
*Actual total minorities
NASA Astrophysics Data System (ADS)
Tsarev, R. Yu; Gruzenkin, D. V.; Kovalev, I. V.; Prokopenko, A. V.; Knyazkov, A. N.
2016-11-01
Control and information processing systems, which often execute critical functions, must satisfy requirements not only of fault tolerance, but also of disaster tolerance. A cluster architecture is a reasonable way to provide disaster tolerance for these systems. In this case clusters are separate control and information processing centers united by means of communication channels. Thus, the clusters form a single hardware resource, interacting with each other to achieve system objectives. Remote cluster positioning makes it possible to ensure system availability and disaster tolerance even if some units fail or a whole cluster crashes. A technique for evaluating the availability of cluster distributed systems for control and information processing based on a cluster quorum is presented in the paper. This technique can be applied to different cluster distributed control and information processing systems that are claimed to be based on disaster tolerance principles. In the article we discuss a communications satellite system as an example of a cluster distributed disaster tolerant control and information processing system. An evaluation of the availability of the communications satellite system is provided. Possible scenarios of failures of the cluster-based components of the communications satellite system were analyzed. The analysis made it possible to choose the best way to implement the cluster structure for a distributed control and information processing system.
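A quorum-based availability estimate can be sketched with a simple binomial model: the system counts as available when at least a quorum of its clusters is up. The abstract does not give the paper's actual technique, so the independence of cluster failures, the identical per-cluster availability `p`, the majority default, and the function name are all assumptions made for illustration.

```python
from math import comb

def quorum_availability(n, p, quorum=None):
    """Probability that at least `quorum` of n independent clusters are up.

    n: number of clusters; p: availability of a single cluster (0..1).
    quorum defaults to a strict majority, n // 2 + 1.
    """
    if quorum is None:
        quorum = n // 2 + 1
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(quorum, n + 1))

# Example: three clusters, each 90% available, majority quorum of two.
a3 = quorum_availability(3, 0.9)
```

Comparing such values for different cluster counts and quorum sizes is one simple way to rank candidate cluster structures, in the spirit of the scenario analysis the abstract mentions.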
Distributed autonomous systems: resource management, planning, and control algorithms
NASA Astrophysics Data System (ADS)
Smith, James F., III; Nguyen, ThanhVu H.
2005-05-01
Distributed autonomous systems, i.e., systems that have separated distributed components, each of which exhibits some degree of autonomy, are increasingly providing solutions to naval and other DoD problems. Recently developed control, planning and resource allocation algorithms for two types of distributed autonomous systems will be discussed. The first distributed autonomous system (DAS) to be discussed consists of a collection of unmanned aerial vehicles (UAVs) that are under fuzzy logic control. The UAVs fly and conduct meteorological sampling in a coordinated fashion determined by their fuzzy logic controllers to determine the atmospheric index of refraction. Once in flight no human intervention is required. A fuzzy planning algorithm determines the optimal trajectory, sampling rate and pattern for the UAVs and an interferometer platform while taking into account risk, reliability, priority for sampling in certain regions, fuel limitations, mission cost, and related uncertainties. The real-time fuzzy control algorithm running on each UAV will give the UAV limited autonomy, allowing it to change course immediately without consulting any commander, request other UAVs to help it, alter its sampling pattern and rate when observing interesting phenomena, or terminate the mission and return to base. The algorithms developed will be compared to a resource manager (RM) developed for another DAS problem related to electronic attack (EA). This RM is based on fuzzy logic and optimized by evolutionary algorithms. It allows a group of dissimilar platforms to use EA resources distributed throughout the group. For both DAS types significant theoretical and simulation results will be presented.
A Game Theory Algorithm for Intra-Cluster Data Aggregation in a Vehicular Ad Hoc Network.
Chen, Yuzhong; Weng, Shining; Guo, Wenzhong; Xiong, Naixue
2016-02-19
Vehicular ad hoc networks (VANETs) have an important role in urban management and planning. The effective integration of vehicle information in VANETs is critical to traffic analysis, large-scale vehicle route planning and intelligent transportation scheduling. However, given the limitations in the precision of the output information of a single sensor and the difficulty of information sharing among various sensors in a highly dynamic VANET, effectively performing data aggregation in VANETs remains a challenge. Moreover, current studies have mainly focused on data aggregation in large-scale environments but have rarely discussed the issue of intra-cluster data aggregation in VANETs. In this study, we propose a multi-player game theory algorithm for intra-cluster data aggregation in VANETs by analyzing the competitive and cooperative relationships among sensor nodes. Several sensor-centric metrics are proposed to measure the data redundancy and stability of a cluster. We then study the utility function to achieve efficient intra-cluster data aggregation by considering both data redundancy and cluster stability. In particular, we prove the existence of a unique Nash equilibrium in the game model, and conduct extensive experiments to validate the proposed algorithm. Results demonstrate that the proposed algorithm has advantages over typical data aggregation algorithms in both accuracy and efficiency. PMID:26907272
Ergen, Burhan
2014-01-01
This paper proposes two edge detection methods for medical images by integrating the advantages of Gabor wavelet transform (GWT) and unsupervised clustering algorithms. The GWT is used to enhance the edge information in an image while suppressing noise. Following this, the k-means and Fuzzy c-means (FCM) clustering algorithms are used to convert a gray level image into a binary image. The proposed methods are tested using medical images obtained through Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) devices, and a phantom image. The results prove that the proposed methods are successful for edge detection, even in noisy cases. PMID:24790590
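The binarization step described in the abstract above (clustering gray levels into two groups) can be sketched with a two-cluster k-means on pixel intensities. This is a minimal illustration under stated assumptions, not the paper's method: the Gabor wavelet enhancement is omitted, the input is a plain list of rows of gray levels, and the function name `kmeans_binarize` is hypothetical.

```python
def kmeans_binarize(image, iters=20):
    """Two-cluster k-means on gray levels.

    `image` is a list of rows of numeric gray levels. Returns an image of
    the same shape with 1 for pixels in the brighter cluster, 0 otherwise.
    """
    pixels = [p for row in image for p in row]
    # Initialise the two centers at the darkest and brightest gray levels.
    lo, hi = float(min(pixels)), float(max(pixels))
    for _ in range(iters):
        dark = [p for p in pixels if abs(p - lo) <= abs(p - hi)]
        bright = [p for p in pixels if abs(p - lo) > abs(p - hi)]
        if not bright:  # constant image: only one cluster exists
            break
        new_lo = sum(dark) / len(dark)
        new_hi = sum(bright) / len(bright)
        if (new_lo, new_hi) == (lo, hi):  # centers stopped moving
            break
        lo, hi = new_lo, new_hi
    return [[1 if abs(p - hi) < abs(p - lo) else 0 for p in row]
            for row in image]

if __name__ == "__main__":
    demo = [[10, 12, 200],
            [11, 198, 205]]
    for row in kmeans_binarize(demo):
        print(row)
```

In the setting of the abstract, the input would be the GWT-enhanced edge response rather than raw intensities; a fuzzy c-means variant would replace the hard assignment with membership weights.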
A distributed Canny edge detector: algorithm and FPGA implementation.
Xu, Qian; Varadarajan, Srenivas; Chakrabarti, Chaitali; Karam, Lina J
2014-07-01
The Canny edge detector is one of the most widely used edge detection algorithms due to its superior performance. Unfortunately, not only is it computationally more intensive as compared with other edge detection algorithms, but it also has a higher latency because it is based on frame-level statistics. In this paper, we propose a mechanism to implement the Canny algorithm at the block level without any loss in edge detection performance compared with the original frame-level Canny algorithm. Directly applying the original Canny algorithm at the block-level leads to excessive edges in smooth regions and to loss of significant edges in high-detailed regions since the original Canny computes the high and low thresholds based on the frame-level statistics. To solve this problem, we present a distributed Canny edge detection algorithm that adaptively computes the edge detection thresholds based on the block type and the local distribution of the gradients in the image block. In addition, the new algorithm uses a nonuniform gradient magnitude histogram to compute block-based hysteresis thresholds. The resulting block-based algorithm has a significantly reduced latency and can be easily integrated with other block-based image codecs. It is capable of supporting fast edge detection of images and videos with high resolutions, including full-HD since the latency is now a function of the block size instead of the frame size. In addition, quantitative conformance evaluations and subjective tests show that the edge detection performance of the proposed algorithm is better than the original frame-based algorithm, especially when noise is present in the images. Finally, this algorithm is implemented using a 32 computing engine architecture and is synthesized on the Xilinx Virtex-5 FPGA. The synthesized architecture takes only 0.721 ms (including the SRAM READ/WRITE time and the computation time) to detect edges of 512 × 512 images in the USC SIPI database when clocked at 100
The Development of FPGA-Based Pseudo-Iterative Clustering Algorithms
NASA Astrophysics Data System (ADS)
Drueke, Elizabeth; Fisher, Wade; Plucinski, Pawel
2016-03-01
The Large Hadron Collider (LHC) in Geneva, Switzerland, is set to undergo major upgrades in 2025 in the form of the High-Luminosity Large Hadron Collider (HL-LHC). In particular, several hardware upgrades are proposed to the ATLAS detector, one of the two general purpose detectors. These hardware upgrades include, but are not limited to, a new hardware-level clustering algorithm, to be performed by a field programmable gate array, or FPGA. In this study, we develop that clustering algorithm and compare the output to a Python-implemented topoclustering algorithm developed at the University of Oregon. Here, we present the agreement between the FPGA output and expected output, with particular attention to the time required by the FPGA to complete the algorithm and other limitations set by the FPGA itself.
Node Non-Uniform Deployment Based on Clustering Algorithm for Underwater Sensor Networks
Jiang, Peng; Liu, Jun; Wu, Feng
2015-01-01
A node non-uniform deployment based on clustering algorithm for underwater sensor networks (UWSNs) is proposed in this study. This algorithm is proposed because optimizing network connectivity rate and network lifetime is difficult for the existing node non-uniform deployment algorithms under the premise of improving the network coverage rate for UWSNs. A high network connectivity rate is achieved by determining the heterogeneous communication ranges of nodes during node clustering. Moreover, the concept of aggregate contribution degree is defined, and the nodes with lower aggregate contribution degrees are used to substitute the dying nodes to decrease the total movement distance of nodes and prolong the network lifetime. Simulation results show that the proposed algorithm can achieve a better network coverage rate and network connectivity rate, as well as decrease the total movement distance of nodes and prolong the network lifetime. PMID:26633408
An Efficient Algorithm for Clustering of Large-Scale Mass Spectrometry Data.
Saeed, Fahad; Pisitkun, Trairak; Knepper, Mark A; Hoffert, Jason D
2012-10-04
High-throughput spectrometers are capable of producing data sets containing thousands of spectra for a single biological sample. These data sets contain a substantial amount of redundancy from peptides that may get selected multiple times in an LC-MS/MS experiment. In this paper, we present an efficient algorithm, CAMS (Clustering Algorithm for Mass Spectra), for clustering mass spectrometry data which increases both the sensitivity and confidence of spectral assignment. CAMS utilizes a novel metric, called F-set, that allows accurate identification of spectra that are similar. A graph theoretic framework is defined that allows the F-set metric to be used efficiently for accurate cluster identification. The accuracy of the algorithm is tested on real HCD and CID data sets with varying amounts of peptides. Our experiments show that the proposed algorithm is able to cluster spectra with very high accuracy in a reasonable amount of time for large spectral data sets. Thus, the algorithm is able to decrease the computational time by compressing the data sets while increasing the throughput of the data by interpreting low S/N spectra.
A Degree Distribution Optimization Algorithm for Image Transmission
NASA Astrophysics Data System (ADS)
Jiang, Wei; Yang, Junjie
2016-09-01
Luby Transform (LT) code is the first practical implementation of digital fountain code. The coding behavior of LT code is mainly decided by the degree distribution, which determines the relationship between source data and codewords. Luby suggested two degree distributions. They work well in typical situations but are not optimal when the number of encoding symbols is finite. In this work, a degree distribution optimization algorithm is proposed to explore the potential of LT code. First, a selection scheme of sparse degrees for LT codes is introduced. Then the probability distribution is optimized according to the selected degrees. In image transmission, the bit stream is sensitive to channel noise, and even a single bit error may cause loss of synchronization between the encoder and the decoder. The proposed algorithm is therefore designed for the image transmission scenario. Moreover, optimal class partition is studied for image transmission with unequal error protection. The experimental results are quite promising. Compared with LT code with the robust soliton distribution, the proposed algorithm noticeably improves the final quality of recovered images at the same overhead.
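For reference, the two degree distributions suggested by Luby (the ideal and robust soliton distributions, the baseline in the abstract above) can be sketched as follows. This shows only the baseline, not the optimized distribution proposed in the paper. The default parameter values `c` and `delta`, the rounding of the spike position, and the guard on `R` are assumptions made for illustration.

```python
import math

def ideal_soliton(k):
    """Ideal soliton distribution rho(d) for d = 0..k (rho[0] is unused)."""
    rho = [0.0] * (k + 1)
    rho[1] = 1.0 / k
    for d in range(2, k + 1):
        rho[d] = 1.0 / (d * (d - 1))
    return rho

def robust_soliton(k, c=0.1, delta=0.5):
    """Robust soliton distribution mu(d) for d = 0..k.

    c and delta are the usual tuning parameters; the defaults here are
    illustrative assumptions, not values taken from the paper.
    """
    R = c * math.log(k / delta) * math.sqrt(k)
    if R <= delta:
        # Guard added for this sketch: the spike term would not be positive.
        raise ValueError("parameters give R <= delta; increase k or c")
    # Position of the spike, k/R, rounded to an integer degree (assumption).
    pivot = min(k, max(1, int(round(k / R))))
    tau = [0.0] * (k + 1)
    for d in range(1, pivot):
        tau[d] = R / (d * k)
    tau[pivot] = R * math.log(R / delta) / k
    rho = ideal_soliton(k)
    beta = sum(rho) + sum(tau)  # normalisation constant
    return [(r + t) / beta for r, t in zip(rho, tau)]

if __name__ == "__main__":
    mu = robust_soliton(100)
    print(sum(mu), max(range(len(mu)), key=mu.__getitem__))
```

An encoder would draw each codeword's degree from `mu`, for example with `random.choices(range(len(mu)), weights=mu)`, and XOR that many randomly chosen source symbols.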
Du, Tingsong; Hu, Yang; Ke, Xianting
2015-01-01
An improved quantum artificial fish swarm algorithm (IQAFSA) for solving distributed network programming considering distributed generation is proposed in this work. The IQAFSA, based on quantum computing, which offers exponential acceleration for heuristic algorithms, uses quantum bits to encode artificial fish and uses the quantum revolving gate, preying behavior, following behavior, and variation of quantum artificial fish to update the artificial fish in the search for the optimal value. We then apply the proposed algorithm, the quantum artificial fish swarm algorithm (QAFSA), the basic artificial fish swarm algorithm (BAFSA), and the global edition artificial fish swarm algorithm (GAFSA) to simulation experiments on some typical test functions. The simulation results demonstrate that the proposed algorithm can escape from local extrema effectively and has higher convergence speed and better accuracy. Finally, IQAFSA is applied to distributed network problems, and the simulation results for a 33-bus radial distribution network system show that IQAFSA achieves the minimum power loss compared with BAFSA, GAFSA, and QAFSA. PMID:26447713
An improved K-means clustering algorithm in agricultural image segmentation
NASA Astrophysics Data System (ADS)
Cheng, Huifeng; Peng, Hui; Liu, Shanmei
Image segmentation is the first important step in image analysis and image processing. In this paper, based on the characteristics of color crop images, we first transform the color space of the image from RGB to HIS. We then select proper initial cluster centers and the number of clusters using a mean-variance approach and rough set theory, and perform the clustering calculation. In this way, color components are segmented automatically and rapidly and target objects are extracted accurately from the background, which provides a reliable basis for identification, analysis, and follow-up calculation and processing of crop images. Experimental results demonstrate that the improved k-means clustering algorithm reduces the amount of computation and enhances the precision and accuracy of clustering.
NASA Astrophysics Data System (ADS)
Vinograd, Victor L.; Saxena, Surendra K.; Putnis, Andrew
1997-11-01
The free energy of the Ising model in the cluster-variation method (CVM) is traditionally described as a function of many configuration variables (correlation functions) the number of which is equal to the number of all distinct subclusters of the chosen set of basic clusters. According to the present approach the description of the equilibrium distribution of basic clusters such as point, pair, triangle, square, hexagon, octagon, tetrahedron, cube, and octahedron requires no more than four basic variables corresponding to the four basic subclusters, namely, point, pair, triangle, and tetrahedron. The values of all other correlation functions can be found with the help of a set of irreversible transformations on basic clusters which equilibrate the cluster distributions with respect to the given distribution of their basic subclusters. The decrease in the number of cluster variables results in a significant simplification of formulation and minimization of the free-energy expressions used in the CVM.
Whistler Waves Driven by Anisotropic Strahl Velocity Distributions: Cluster Observations
NASA Technical Reports Server (NTRS)
Vinas, A.F.; Gurgiolo, C.; Nieves-Chinchilla, T.; Gary, S. P.; Goldstein, M. L.
2010-01-01
Observed properties of the strahl using high resolution 3D electron velocity distribution data obtained from the Cluster/PEACE experiment are used to investigate its linear stability. An automated method to isolate the strahl is used to allow its moments to be computed independent of the solar wind core+halo. Results show that the strahl can have a high temperature anisotropy (T(perpendicular)/T(parallel) approximately > 2). This anisotropy is shown to be an important free energy source for the excitation of high frequency whistler waves. The analysis suggests that the resultant whistler waves are strong enough to regulate the electron velocity distributions in the solar wind through pitch-angle scattering.
CHIMERA: Clustering of Heterogeneous Disease Effects via Distribution Matching of Imaging Patterns.
Dong, Aoyan; Honnorat, Nicolas; Gaonkar, Bilwaj; Davatzikos, Christos
2016-02-01
Many brain disorders and diseases exhibit heterogeneous symptoms and imaging characteristics. This heterogeneity is typically not captured by commonly adopted neuroimaging analyses that seek only a main imaging pattern when two groups need to be differentiated (e.g., patients and controls, or clinical progressors and non-progressors). We propose a novel probabilistic clustering approach, CHIMERA, modeling the pathological process by a combination of multiple regularized transformations from normal/control population to the patient population, thereby seeking to identify multiple imaging patterns that relate to disease effects and to better characterize disease heterogeneity. In our framework, normal and patient populations are considered as point distributions that are matched by a variant of the coherent point drift algorithm. We explain how the posterior probabilities produced during the MAP optimization of CHIMERA can be used for clustering the patients into groups and identifying disease subtypes. CHIMERA was first validated on a synthetic dataset and then on a clinical dataset mixing 317 control subjects and patients suffering from Alzheimer's Disease (AD) and Parkinson's Disease (PD). CHIMERA produced better clustering results compared to two standard clustering approaches. We further analyzed 390 T1 MRI scans from Alzheimer's patients. We discovered two main and reproducible AD subtypes displaying significant differences in cognitive performance.
Distributed-memory Parallel Algorithms for Matching and Coloring
Catalyurek, Umit; Dobrian, Florin; Gebremedhin, Assefaw H.; Halappanavar, Mahantesh; Pothen, Alex
2011-05-31
Graph matching and coloring constitute two fundamental classes of combinatorial problems having numerous established as well as emerging applications in computational science and engineering, high-performance computing, and informatics. We provide a snapshot of an on-going work on the design and implementation of new highly-scalable distributed-memory parallel algorithms for two prototypical problems from these classes, edge-weighted matching and distance-1 vertex coloring. Graph algorithms in general have low concurrency and poor data locality, making it challenging to achieve scalability on massively parallel machines. We overcome this challenge by employing a variety of techniques, including approximation, speculation and iteration, optimized communication, and randomization, in concert. We present preliminary results on weak and strong scalability studies conducted on an IBM Blue Gene/P machine employing up to tens of thousands of processors. The results show that the algorithms hold strong potential for computing at petascale.
Improving permafrost distribution modelling using feature selection algorithms
NASA Astrophysics Data System (ADS)
Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail
2016-04-01
The availability of an increasing amount of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional datasets is the number of input features (variables) involved. Applying ML classification algorithms to this large number of variables carries the risk of overfitting, with the consequence of poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps reduce the number of factors required and improves the knowledge of the adopted features and their relation to the studied phenomenon. Moreover, removing irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM-derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidence (geophysical and thermal data and rock glacier inventories) that serves as training permafrost data. The FS algorithms used indicated which variables appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its
A multilevel gamma-clustering layout algorithm for visualization of biological networks.
Hruz, Tomas; Wyss, Markus; Lucas, Christoph; Laule, Oliver; von Rohr, Peter; Zimmermann, Philip; Bleuler, Stefan
2013-01-01
Visualization of large complex networks has become an indispensable part of systems biology, where organisms need to be considered as one complex system. The visualization of the corresponding network is challenging due to the size and density of edges. In many cases, the use of standard visualization algorithms can lead to high running times and poorly readable visualizations due to many edge crossings. We suggest an approach that analyzes the structure of the graph first and then generates a new graph which contains specific semantic symbols for regular substructures like dense clusters. We propose a multilevel gamma-clustering layout visualization algorithm (MLGA) which proceeds in three subsequent steps: (i) a multilevel γ-clustering is used to identify the structure of the underlying network, (ii) the network is transformed to a tree, and (iii) finally, the resulting tree, which shows the network structure, is drawn using a variation of a force-directed algorithm. The algorithm has the potential to visualize very large networks because it uses modern clustering heuristics which are optimized for large graphs. Moreover, most of the edges are removed from the visual representation, which makes it possible to keep an overview of complex graphs with dense subgraphs.
Cluster algorithm for two-dimensional U(1) lattice gauge theory
NASA Astrophysics Data System (ADS)
Sinclair, R.
1992-03-01
We use gauge fixing to rewrite the two-dimensional U(1) pure gauge model with Wilson action and periodic boundary conditions as a nonfrustrated XY model on a closed chain. The Wolff single-cluster algorithm is then applied, eliminating critical slowing down of topological modes and Polyakov loops.
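A Wolff single-cluster update for XY spins on a closed chain, the model the abstract above arrives at after gauge fixing, can be sketched as below. This is a generic textbook-style sketch under stated assumptions, not the paper's code: spins are stored as angles, the coupling is ferromagnetic with unit strength, and the mapping from the gauge model to the chain is not shown. The function name `wolff_update` is hypothetical.

```python
import math
import random

def wolff_update(theta, beta, rng=random):
    """One Wolff single-cluster update on a closed XY chain, in place.

    theta: list of spin angles; site i couples to (i - 1) % n and (i + 1) % n.
    beta:  inverse temperature (unit ferromagnetic coupling assumed).
    Returns the size of the flipped cluster.
    """
    n = len(theta)
    phi = rng.uniform(0.0, 2.0 * math.pi)  # angle of the random axis r
    seed = rng.randrange(n)
    in_cluster = {seed}
    stack = [seed]
    while stack:
        i = stack.pop()
        p_i = math.cos(theta[i] - phi)  # r . s_i before the reflection
        # Reflect s_i in the line perpendicular to r: s -> s - 2 (r . s) r.
        theta[i] = (2.0 * phi + math.pi - theta[i]) % (2.0 * math.pi)
        for j in ((i - 1) % n, (i + 1) % n):
            if j in in_cluster:
                continue
            p_j = math.cos(theta[j] - phi)  # r . s_j (j not yet reflected)
            # Bond activation probability for the Wolff embedding.
            p_add = 1.0 - math.exp(min(0.0, -2.0 * beta * p_i * p_j))
            if rng.random() < p_add:
                in_cluster.add(j)
                stack.append(j)
    return len(in_cluster)

if __name__ == "__main__":
    rng = random.Random(1)
    spins = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(64)]
    sizes = [wolff_update(spins, beta=2.0, rng=rng) for _ in range(1000)]
    print(sum(sizes) / len(sizes))
```

Because whole clusters of aligned projections are reflected at once, long-wavelength modes can change in a single update, which is the general mechanism by which cluster algorithms reduce critical slowing down.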
Solving the depth of the repeated texture areas based on the clustering algorithm
NASA Astrophysics Data System (ADS)
Xiong, Zhang; Zhang, Jun; Tian, Jinwen
2015-12-01
Reconstructing a 3D scene in monocular stereo vision requires the depth of the scene points in the image. Mismatches are inevitable during image matching, and they become numerous when the images contain many repeated texture areas. At present, the multiple-baseline stereo imaging algorithm is commonly used to eliminate mismatches in repeated texture areas. This algorithm can resolve the ambiguity caused by common repeated textures, but it places restrictions on the baseline and is slow. In this paper, we put forward a clustering-based algorithm for calculating the depth of matching points in repeated texture areas. First, we apply a Gaussian filter to preprocess the images. Second, we segment the repeated texture regions into image blocks using a superpixel-based spectral clustering segmentation algorithm and tag the image blocks. Then, we match the two images and solve for the depth of the image. Finally, the depth of each image block is taken as the median of the depth values of the calculated points in the block, which gives the depth of the repeated texture areas. Experiments on a large number of images show that our algorithm is effective at calculating the depth of repeated texture areas.
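The final step described above, assigning each tagged block the median of the depths of its matched points, is simple enough to sketch directly. The inputs here (one block label and one depth estimate per matched point) and the function name `block_depths` are assumptions for illustration; the filtering, segmentation and matching stages are not shown.

```python
from collections import defaultdict
from statistics import median

def block_depths(labels, depths):
    """Median depth per image block.

    labels[i] is the block tag of matched point i and depths[i] is its
    estimated depth. Returns {block tag: median depth}.
    """
    by_block = defaultdict(list)
    for label, z in zip(labels, depths):
        by_block[label].append(z)
    return {label: median(zs) for label, zs in by_block.items()}

if __name__ == "__main__":
    labels = ["a", "a", "a", "b", "b"]
    depths = [2.0, 2.5, 9.0, 5.0, 6.0]  # 9.0 plays the role of a mismatch
    print(block_depths(labels, depths))
```

Using the median rather than the mean keeps a few grossly wrong depths, such as those produced by mismatches within a repeated texture, from shifting the depth assigned to the block.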
An effective trust-based recommendation method using a novel graph clustering algorithm
NASA Astrophysics Data System (ADS)
Moradi, Parham; Ahmadian, Sajad; Akhlaghian, Fardin
2015-10-01
Recommender systems are programs that aim to provide personalized recommendations to users for specific items (e.g. music, books) in online sharing communities or on e-commerce sites. Collaborative filtering methods are important and widely accepted types of recommender systems that generate recommendations based on the ratings of like-minded users. On the other hand, these systems confront several inherent issues such as data sparsity and cold start problems, caused by fewer ratings against the unknowns that need to be predicted. Incorporating trust information into the collaborative filtering systems is an attractive approach to resolve these problems. In this paper, we present a model-based collaborative filtering method by applying a novel graph clustering algorithm and also considering trust statements. In the proposed method first of all, the problem space is represented as a graph and then a sparsest subgraph finding algorithm is applied on the graph to find the initial cluster centers. Then, the proposed graph clustering algorithm is performed to obtain the appropriate users/items clusters. Finally, the identified clusters are used as a set of neighbors to recommend unseen items to the current active user. Experimental results based on three real-world datasets demonstrate that the proposed method outperforms several state-of-the-art recommender system methods.
Performance evaluation of simple linear iterative clustering algorithm on medical image processing.
Cong, Jinyu; Wei, Benzheng; Yin, Yilong; Xi, Xiaoming; Zheng, Yuanjie
2014-01-01
The Simple Linear Iterative Clustering (SLIC) algorithm is increasingly applied to different kinds of image processing because of its excellent, perceptually meaningful characteristics. To better meet the needs of medical image processing and to provide a technical reference for applying SLIC to medical image segmentation, two indicators, boundary accuracy and superpixel uniformity, are introduced alongside other indicators to systematically analyze the performance of the SLIC algorithm in comparison with the Normalized cuts and Turbopixels algorithms. Extensive experimental results show that SLIC is faster and less sensitive to the image type and the chosen number of superpixels than similar algorithms such as Turbopixels and Normalized cuts. It also offers clear benefits in boundary recall, robustness to fuzzy boundaries, the choice of superpixel size, and segmentation performance on medical images.
Unsupervised unstained cell detection by SIFT keypoint clustering and self-labeling algorithm.
Muallal, Firas; Schöll, Simon; Sommerfeldt, Björn; Maier, Andreas; Steidl, Stefan; Buchholz, Rainer; Hornegger, Joachim
2014-01-01
We propose a novel unstained cell detection algorithm based on unsupervised learning. The algorithm utilizes the scale invariant feature transform (SIFT), a self-labeling algorithm, and two clustering steps in order to achieve high performance in terms of time and detection accuracy. Unstained cell imaging is dominated by phase contrast and bright field microscopy. Therefore, the algorithm was assessed on images acquired using these two modalities. Five cell lines having in total 37 images and 7250 cells were considered for the evaluation: CHO, L929, Sf21, HeLa, and Bovine cells. The obtained F-measures were between 85.1 and 89.5. Compared to the state of the art, the algorithm achieves an F-measure very close to that of the supervised approaches in much less time.
A distributed geo-routing algorithm for wireless sensor networks.
Joshi, Gyanendra Prasad; Kim, Sung Won
2009-01-01
Geographic wireless sensor networks use position information for greedy routing. Greedy routing works well in dense networks, whereas in sparse networks it may fail and require a recovery algorithm. Recovery algorithms help the packet to get out of the communication void. However, these algorithms are generally costly for resource-constrained position-based wireless sensor networks (WSNs). In this paper, we propose a void avoidance algorithm (VAA), a novel idea based on upgrading virtual distance. VAA allows wireless sensor nodes to remove all stuck nodes by transforming the routing graph and forwarding packets using only greedy routing. In VAA, the stuck node upgrades distance unless it finds a next hop node that is closer to the destination than it is. VAA guarantees packet delivery if there is a topologically valid path. Further, it is completely distributed, immediately responds to node failure or topology changes and does not require planarization of the network. NS-2 is used to evaluate the performance and correctness of VAA and we compare its performance to other protocols. Simulations show that our proposed algorithm consumes less energy, finds efficient paths and incurs substantially less control overhead.
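The baseline that VAA builds on, greedy geographic forwarding and the stuck-node condition at a communication void, can be sketched as follows. This shows only plain greedy routing, not VAA's virtual-distance upgrade, whose details are not given in the abstract. The data layout (`positions` and `neighbors` dictionaries) and the function names are assumptions.

```python
import math

def greedy_next_hop(node, dest, positions, neighbors):
    """Neighbor of `node` that is closest to `dest` and strictly closer
    than `node` itself, or None if `node` is stuck at a void."""
    best = None
    best_dist = math.dist(positions[node], positions[dest])
    for nb in neighbors[node]:
        d = math.dist(positions[nb], positions[dest])
        if d < best_dist:
            best, best_dist = nb, d
    return best

def greedy_route(src, dest, positions, neighbors):
    """Follow greedy forwarding from src. Returns (path, delivered)."""
    path = [src]
    while path[-1] != dest:
        nxt = greedy_next_hop(path[-1], dest, positions, neighbors)
        if nxt is None:  # no neighbor makes progress: stuck node
            return path, False
        path.append(nxt)
    return path, True

if __name__ == "__main__":
    positions = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0)}
    neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    print(greedy_route("A", "D", positions, neighbors))
```

The loop always terminates because each hop strictly reduces the distance to the destination. A recovery scheme, or a void avoidance scheme such as the one the abstract proposes, is what handles the case where `greedy_next_hop` returns `None`.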
NASA Astrophysics Data System (ADS)
You, Tao; Cheng, Hui-Min; Ning, Yi-Zi; Shia, Ben-Chang; Zhang, Zhong-Yuan
2016-12-01
Like clustering analysis, community detection aims at assigning nodes in a network into different communities. Fdp is a recently proposed density-based clustering algorithm which does not need the number of clusters as prior input and whose result is insensitive to its parameter. However, Fdp cannot be directly applied to community detection due to its inability to recognize the community centers in the network. To solve the problem, a new community detection method (named IsoFdp) is proposed in this paper. First, we use the IsoMap technique to map the network data into a low-dimensional manifold which can reveal diverse pairwise similarity. Then Fdp is applied to detect the communities in the network. An improved partition density function is proposed to select the proper number of communities automatically. We test our method on both synthetic and real-world networks, and the results demonstrate the effectiveness of our algorithm over the state-of-the-art methods.
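Assuming that "Fdp" above refers to density-peaks clustering (clustering by fast search and find of density peaks), its two per-point quantities can be sketched as below: the local density rho and the distance delta to the nearest point of higher density. That reading of the name, the cutoff-kernel density, and the function name `density_peaks` are assumptions; the IsoMap embedding and the partition density function from the abstract are not shown.

```python
def density_peaks(dist, d_c):
    """Compute (rho, delta) from a symmetric distance matrix.

    rho[i]   = number of other points closer than the cutoff d_c.
    delta[i] = distance to the nearest point with strictly higher rho,
               or the largest distance from i if no such point exists.
    """
    n = len(dist)
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

if __name__ == "__main__":
    points = [0, 1, 2, 10, 11, 12]  # two well-separated 1-D groups
    dist = [[abs(p - q) for q in points] for p in points]
    rho, delta = density_peaks(dist, d_c=1.5)
    for i, (r, d) in enumerate(zip(rho, delta)):
        print(i, r, d)
```

Points with both high rho and high delta are candidate cluster centers; in this toy input those are the middle points of the two groups. For community detection as described in the abstract, `dist` would be computed in the low-dimensional embedding rather than on raw coordinates.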
Hybridization of evolutionary algorithms and local search by means of a clustering method.
Martínez-Estudillo, Alfonso C; Hervás-Martínez, César; Martínez-Estudillo, Francisco J; García-Pedrajas, Nicolás
2006-06-01
This paper presents a hybrid evolutionary algorithm (EA) to solve nonlinear-regression problems. Although EAs have proven their ability to explore large search spaces, they are comparatively inefficient in fine tuning the solution. This drawback is usually avoided by means of local optimization algorithms that are applied to the individuals of the population. The algorithms that use local optimization procedures are usually called hybrid algorithms. On the other hand, it is well known that the clustering process enables the creation of groups (clusters) with mutually close points that hopefully correspond to relevant regions of attraction. Local-search procedures can then be started once in every such region. This paper proposes the combination of an EA, a clustering process, and a local-search procedure to the evolutionary design of product-units neural networks. In the methodology presented, only a few individuals are subject to local optimization. Moreover, the local optimization algorithm is only applied at specific stages of the evolutionary process. Our results show a favorable performance when the regression method proposed is compared to other standard methods.
Numerical convergence and interpretation of the fuzzy c-shells clustering algorithm.
Bezdek, J C; Hathaway, R J
1992-01-01
R. N. Dave's (1990) version of fuzzy c-shells is an iterative clustering algorithm which requires the application of Newton's method or a similar general optimization technique at each half step in any sequence of iterates for minimizing the associated objective function. An important computational question concerns the accuracy of the solution required at each half step within the overall iteration. The general convergence theory for grouped coordination minimization is applied to this question to show that numerically exact solution of the half-step subproblems in Dave's algorithm is not necessary. One iteration of Newton's method in each coordinate minimization half step yields a sequence obtained using the fuzzy c-shells algorithm with numerically exact coordinate minimization at each half step. It is shown that fuzzy c-shells generates hyperspherical prototypes to the clusters it finds for certain special cases of the measure of dissimilarity used.
Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.
Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias
2011-01-01
The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
BoCluSt: Bootstrap Clustering Stability Algorithm for Community Detection.
Garcia, Carlos
2016-01-01
The identification of modules or communities in sets of related variables is a key step in the analysis and modeling of biological systems. Procedures for this identification are usually designed to allow fast analyses of very large datasets and may produce suboptimal results when these sets are of a small to moderate size. This article introduces BoCluSt, a new, somewhat more computationally intensive, community detection procedure that is based on combining a clustering algorithm with a measure of stability under bootstrap resampling. Both computer simulation and analyses of experimental data showed that BoCluSt can outperform current procedures in the identification of multiple modules in data sets with a moderate number of variables. In addition, the procedure provides users with a null distribution of results to evaluate the support for the existence of community structure in the data. BoCluSt takes individual measures for a set of variables as input, and may be a valuable and robust exploratory tool of network analysis, as it provides 1) an estimation of the best partition of variables into modules, 2) a measure of the support for the existence of modular structures, and 3) an overall description of the whole structure, which may reveal hierarchical modular situations, in which modules are composed of smaller sub-modules.
Sampling cluster stability for peer-to-peer based content distribution networks
NASA Astrophysics Data System (ADS)
Darlagiannis, Vasilios; Mauthe, Andreas; Steinmetz, Ralf
2006-01-01
Several types of Content Distribution Networks are being deployed over the Internet today, based on different architectures to meet their requirements (e.g., scalability, efficiency and resiliency). Peer-to-Peer (P2P) based Content Distribution Networks are promising approaches that have several advantages. Structured P2P networks, for instance, take a proactive approach and provide efficient routing mechanisms. Nevertheless, their maintenance can increase considerably in highly dynamic P2P environments. In order to address this issue, a two-tier architecture that combines a structured overlay network with a clustering mechanism is suggested in a hybrid scheme. In this paper, we examine several sampling algorithms utilized in the aforementioned hybrid network that collect local information in order to apply a selective join procedure. The algorithms are based mostly on random walks inside the overlay network. The aim of the selective join procedure is to provide a well balanced and stable overlay infrastructure that can easily overcome the unreliable behavior of the autonomous peers that constitute the network. The sampling algorithms are evaluated using simulation experiments where several properties related to the graph structure are revealed.
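The random-walk sampling mentioned in the abstract above can be pictured with a small sketch. This is only an illustration under assumptions: the overlay graph `overlay`, the function name `random_walk_sample`, and the walk length are hypothetical and are not taken from the paper, which evaluates several sampling algorithms that are not specified here.

```python
import random

def random_walk_sample(adjacency, start, steps, rng):
    """Return the peers visited by a simple random walk of at most `steps` hops."""
    visited = [start]
    current = start
    for _ in range(steps):
        neighbors = adjacency[current]
        if not neighbors:  # isolated peer: the walk cannot continue
            break
        current = rng.choice(neighbors)  # move to a uniformly chosen neighbor
        visited.append(current)
    return visited

# Hypothetical four-peer overlay, given as an adjacency list.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "c", "d"],
    "c": ["a", "b"],
    "d": ["b"],
}
walk = random_walk_sample(overlay, "a", 10, random.Random(0))
```

A joining peer could, for example, inspect the local information of the peers in `walk` before deciding where to attach; how that decision is made is outside this sketch.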
Tsai, Ming-Chi; Tsui, Fu-Chiang; Wagner, Michael M
2007-10-11
Performing fast data analysis to detect disease outbreaks plays a critical role in real-time biosurveillance. In this paper, we describe and evaluate an Algorithm Distribution Manager Service (ADMS) based on grid technologies, which dynamically partitions and distributes detection algorithms across multiple computers. We compared the execution time of the analysis on a single computer and on a grid network (3 computing nodes), with and without dynamic algorithm distribution. We found that algorithms with long runtimes completed approximately three times earlier in the distributed environment than on a single computer, while short-runtime algorithms performed worse in the distributed environment. The dynamic algorithm distribution approach also performed better than the static approach. This pilot study shows great potential to reduce lengthy analysis time through dynamic algorithm partitioning and parallel processing, and provides the opportunity to distribute algorithms from a client to remote computers in a grid network.
Chen, Wei-Chen; Ostrouchov, George; Pugmire, Dave; Prabhat; Wehner, Michael
2013-01-01
We develop a parallel EM algorithm for multivariate Gaussian mixture models and use it to perform model-based clustering of a large climate data set. Three variants of the EM algorithm are reformulated in parallel and a new variant that is faster is presented. All are implemented using the single program, multiple data (SPMD) programming model, which is able to take advantage of the combined collective memory of large distributed computer architectures to process larger data sets. Displays of the estimated mixture model rather than the data allow us to explore multivariate relationships in a way that scales to arbitrary size data. We study the performance of our methodology on simulated data and apply our methodology to a high resolution climate dataset produced by the community atmosphere model (CAM5). This article has supplementary material online.
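For orientation, one EM iteration for a Gaussian mixture can be sketched as follows. This is a sequential, one-dimensional sketch under assumptions, not the parallel multivariate method of the abstract: the function names, the toy data, the starting values, and the variance floor are all hypothetical. The M-step only needs sums over data points, which is the kind of quantity a distributed implementation would typically combine across processes.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(data, weights, means, variances):
    """One EM iteration for a 1D Gaussian mixture."""
    k = len(weights)
    # E-step: responsibility of each component for each data point.
    resp = []
    for x in data:
        dens = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(dens)
        resp.append([d / total for d in dens])
    # M-step: re-estimate parameters from responsibility-weighted sums.
    new_w, new_m, new_v = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        new_w.append(nj / len(data))
        new_m.append(mean)
        new_v.append(max(var, 1e-6))  # hypothetical floor to avoid a zero variance
    return new_w, new_m, new_v

# Toy data with two well-separated groups (illustrative values only).
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
weights, means, variances = [0.5, 0.5], [0.0, 6.0], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
```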
Modelling galaxy clustering: halo occupation distribution versus subhalo matching.
Guo, Hong; Zheng, Zheng; Behroozi, Peter S; Zehavi, Idit; Chuang, Chia-Hsun; Comparat, Johan; Favole, Ginevra; Gottloeber, Stefan; Klypin, Anatoly; Prada, Francisco; Rodríguez-Torres, Sergio A; Weinberg, David H; Yepes, Gustavo
2016-07-01
We model the luminosity-dependent projected and redshift-space two-point correlation functions (2PCFs) of the Sloan Digital Sky Survey (SDSS) Data Release 7 Main galaxy sample, using the halo occupation distribution (HOD) model and the subhalo abundance matching (SHAM) model and its extension. All the models are built on the same high-resolution N-body simulations. We find that the HOD model generally provides the best performance in reproducing the clustering measurements in both projected and redshift spaces. The SHAM model with the same halo-galaxy relation for central and satellite galaxies (or distinct haloes and subhaloes), when including scatters, has a best-fitting χ²/dof around 2-3. We therefore extend the SHAM model to the subhalo clustering and abundance matching (SCAM) by allowing the central and satellite galaxies to have different galaxy-halo relations. We infer the corresponding halo/subhalo parameters by jointly fitting the galaxy 2PCFs and abundances and consider subhaloes selected based on three properties, the mass Macc at the time of accretion, the maximum circular velocity Vacc at the time of accretion, and the peak maximum circular velocity Vpeak over the history of the subhaloes. The three subhalo models work well for luminous galaxy samples (with luminosity above L*). For low-luminosity samples, the Vacc model stands out in reproducing the data, with the Vpeak model slightly worse, while the Macc model fails to fit the data. We discuss the implications of the modelling results.
A priori data-driven multi-clustered reservoir generation algorithm for echo state network.
Li, Xiumin; Zhong, Ling; Xue, Fangzheng; Zhang, Anguo
2015-01-01
Echo state networks (ESNs) with multi-clustered reservoir topology perform better in reservoir computing and robustness than those with random reservoir topology. However, these ESNs have a complex reservoir topology, which leads to difficulties in reservoir generation. This study focuses on the reservoir generation problem when ESN is used in environments with sufficient priori data available. Accordingly, a priori data-driven multi-cluster reservoir generation algorithm is proposed. The priori data in the proposed algorithm are used to evaluate reservoirs by calculating the precision and standard deviation of ESNs. The reservoirs are produced using the clustering method; only the reservoir with a better evaluation performance takes the place of a previous one. The final reservoir is obtained when its evaluation score reaches the preset requirement. The prediction experiment results obtained using the Mackey-Glass chaotic time series show that the proposed reservoir generation algorithm provides ESNs with extra prediction precision and increases the structure complexity of the network. Further experiments also reveal the appropriate values of the number of clusters and time window size to obtain optimal performance. The information entropy of the reservoir reaches the maximum when ESN gains the greatest precision.
A Parallel Ghosting Algorithm for The Flexible Distributed Mesh Database
Mubarak, Misbah; Seol, Seegyoung; Lu, Qiukai; ...
2013-01-01
Critical to the scalability of parallel adaptive simulations are parallel control functions including load balancing, reduced inter-process communication and optimal data decomposition. In distributed meshes, many mesh-based applications frequently access neighborhood information for computational purposes which must be transmitted efficiently to avoid parallel performance degradation when the neighbors are on different processors. This article presents a parallel algorithm of creating and deleting data copies, referred to as ghost copies, which localize neighborhood data for computation purposes while minimizing inter-process communication. The key characteristics of the algorithm are: (1) It can create ghost copies of any permissible topological order in a 1D, 2D or 3D mesh based on selected adjacencies. (2) It exploits neighborhood communication patterns during the ghost creation process thus eliminating all-to-all communication. (3) For applications that need neighbors of neighbors, the algorithm can create n number of ghost layers up to a point where the whole partitioned mesh can be ghosted. Strong and weak scaling results are presented for the IBM BG/P and Cray XE6 architectures up to a core count of 32,768 processors. The algorithm also leads to scalable results when used in a parallel super-convergent patch recovery error estimator, an application that frequently accesses neighborhood data to carry out computation.
An adaptive enhancement algorithm for infrared video based on modified k-means clustering
NASA Astrophysics Data System (ADS)
Zhang, Linze; Wang, Jingqi; Wu, Wen
2016-09-01
In this paper, we propose a video enhancement algorithm to improve the output video of an infrared camera. The video obtained by an infrared camera is sometimes very dark because there is no clear target. In this case, the infrared video is divided into frame images by frame extraction so that image enhancement can be carried out. The first frame image is divided into k sub-images by K-means clustering according to the gray interval each occupies, and the k sub-images are then histogram-equalized according to the amount of information per sub-image; we also use a method to handle the cases in which the final cluster centers lie close to each other. For the other frame images, the initial cluster centers are determined by the final cluster centers of the previous frame, and the histogram equalization of each sub-image is carried out after image segmentation based on K-means clustering. Histogram equalization spreads the gray values of the image over the whole gray-level range, and the gray-level range of each sub-image is determined by its share of the pixels in the frame image. Experimental results show that this algorithm improves the contrast of infrared video in dim scenes where the night target is not obvious, and adaptively reduces the negative effect of overexposed pixels within a certain range.
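The idea of seeding each frame's clustering with the previous frame's final centers can be sketched as below. This is a minimal sketch under assumptions: plain one-dimensional K-means on gray values stands in for the paper's modified K-means, the histogram-equalization step is omitted, and the function name, gray values, and initial centers are hypothetical.

```python
def kmeans_1d(values, centers, max_iter=100):
    """Plain 1D K-means on gray values, starting from the given centers."""
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        groups = [[] for _ in centers]
        for v in values:
            # Assign each gray value to its nearest center.
            nearest = min(range(len(centers)), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Recompute centers; an empty group keeps its previous center.
        new_centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter

# Hypothetical gray values for two consecutive frames.
frame1 = [10, 12, 14, 100, 104, 200, 202, 204]
frame2 = [11, 13, 15, 101, 105, 201, 203, 205]

centers1, _ = kmeans_1d(frame1, [0, 128, 255])  # first frame: generic start
centers2, _ = kmeans_1d(frame2, centers1)       # next frame: warm start
```

Because consecutive frames are usually similar, the warm start places the initial centers close to their final positions.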
Density-based cluster algorithms for the identification of core sets
NASA Astrophysics Data System (ADS)
Lemke, Oliver; Keller, Bettina G.
2016-10-01
The core-set approach is a discretization method for Markov state models of complex molecular dynamics. Core sets are disjoint metastable regions in the conformational space, which need to be known prior to the construction of the core-set model. We propose to use density-based cluster algorithms to identify the cores. We compare three different density-based cluster algorithms: the CNN, the DBSCAN, and the Jarvis-Patrick algorithm. While the core-set models based on the CNN and DBSCAN clustering are well-converged, constructing core-set models based on the Jarvis-Patrick clustering cannot be recommended. In a well-converged core-set model, the number of core sets is up to an order of magnitude smaller than the number of states in a conventional Markov state model with comparable approximation error. Moreover, using the density-based clustering one can extend the core-set method to systems which are not strongly metastable. This is important for the practical application of the core-set method because most biologically interesting systems are only marginally metastable. The key point is to perform a hierarchical density-based clustering while monitoring the structure of the metric matrix which appears in the core-set method. We test this approach on a molecular-dynamics simulation of a highly flexible 14-residue peptide. The resulting core-set models have a high spatial resolution and can distinguish between conformationally similar yet chemically different structures, such as register-shifted hairpin structures.
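DBSCAN, one of the three algorithms compared above, can be sketched in a few lines. This is a minimal textbook-style sketch, not the implementation used in the study; the parameter names `eps` and `min_pts`, their values, and the toy points are assumptions chosen for illustration. Points left as noise (label -1) belong to no dense region, loosely mirroring how conformations outside every core remain unassigned.

```python
def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    def neighbors(i):
        # Indices of all points within distance eps of point i (including i).
        return [j for j, q in enumerate(points)
                if sum((a - b) ** 2 for a, b in zip(points[i], q)) <= eps ** 2]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels

# Two dense groups and one isolated point (illustrative coordinates).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
```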
Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs
NASA Astrophysics Data System (ADS)
Choi, Woo-Yong; Chatterjee, Mainak
2015-03-01
With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that extent, we propose a real-time cluster-based multipolling sequencing algorithm that drastically eliminates more than 90% of the polling overhead, particularly so when the dynamic search algorithm fails to derive the multipolling sequence in real time.
Distribution-based clustering: using ecology to refine the operational taxonomic unit.
Preheim, Sarah P; Perrotta, Allison R; Martin-Platero, Antonio M; Gupta, Anika; Alm, Eric J
2013-11-01
16S rRNA sequencing, commonly used to survey microbial communities, begins by grouping individual reads into operational taxonomic units (OTUs). There are two major challenges in calling OTUs: identifying bacterial population boundaries and differentiating true diversity from sequencing errors. Current approaches to identifying taxonomic groups or eliminating sequencing errors rely on sequence data alone, but both of these activities could be informed by the distribution of sequences across samples. Here, we show that using the distribution of sequences across samples can help identify population boundaries even in noisy sequence data. The logic underlying our approach is that bacteria in different populations will often be highly correlated in their abundance across different samples. Conversely, 16S rRNA sequences derived from the same population, whether slightly different copies in the same organism, variation of the 16S rRNA gene within a population, or sequences generated randomly in error, will have the same underlying distribution across sampled environments. We present a simple OTU-calling algorithm (distribution-based clustering) that uses both genetic distance and the distribution of sequences across samples and demonstrate that it is more accurate than other methods at grouping reads into OTUs in a mock community. Distribution-based clustering also performs well on environmental samples: it is sensitive enough to differentiate between OTUs that differ by a single base pair yet predicts fewer overall OTUs than most other methods. The program can decrease the total number of OTUs with redundant information and improve the power of many downstream analyses to describe biologically relevant trends.
Distributed Minimal Residual (DMR) method for acceleration of iterative algorithms
NASA Technical Reports Server (NTRS)
Lee, Seungsoo; Dulikravich, George S.
1991-01-01
A new method for enhancing the convergence rate of iterative algorithms for the numerical integration of systems of partial differential equations was developed. It is termed the Distributed Minimal Residual (DMR) method and it is based on general Krylov subspace methods. The DMR method differs from the Krylov subspace methods by the fact that the iterative acceleration factors are different from equation to equation in the system. At the same time, the DMR method can be viewed as an incomplete Newton iteration method. The DMR method was applied to Euler equations of gas dynamics and incompressible Navier-Stokes equations. All numerical test cases were obtained using either explicit four stage Runge-Kutta or Euler implicit time integration. The formulation for the DMR method is general in nature and can be applied to explicit and implicit iterative algorithms for arbitrary systems of partial differential equations.
K-Means Re-Clustering-Algorithmic Options with Quantifiable Performance Comparisons
Meyer, A W; Paglieroni, D; Asteneh, C
2002-12-17
This paper presents various architectural options for implementing a K-Means Re-Clustering algorithm suitable for unsupervised segmentation of hyperspectral images. Performance metrics are developed based upon quantitative comparisons of convergence rates and segmentation quality. A methodology for making these comparisons is developed and used to establish K values that produce the best segmentations with minimal processing requirements. Convergence rates depend on the initial choice of cluster centers. Consequently, this same methodology may be used to evaluate the effectiveness of different initialization techniques.
What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm.
Raykov, Yordan P; Boukouvalas, Alexis; Baig, Fahd; Little, Max A
2016-01-01
The K-means algorithm is one of the most popular clustering algorithms in current use as it is relatively fast yet simple to understand and deploy in practice. Nevertheless, its use entails certain restrictive assumptions about the data, the negative consequences of which are not always immediately apparent, as we demonstrate. While more flexible algorithms have been developed, their widespread use has been hindered by their computational and technical complexity. Motivated by these considerations, we present a flexible alternative to K-means that relaxes most of the assumptions, whilst remaining almost as fast and simple. This novel algorithm which we call MAP-DP (maximum a-posteriori Dirichlet process mixtures), is statistically rigorous as it is based on nonparametric Bayesian Dirichlet process mixture modeling. This approach allows us to overcome most of the limitations imposed by K-means. The number of clusters K is estimated from the data instead of being fixed a-priori as in K-means. In addition, while K-means is restricted to continuous data, the MAP-DP framework can be applied to many kinds of data, for example, binary, count or ordinal data. Also, it can efficiently separate outliers from the data. This additional flexibility does not incur a significant computational overhead compared to K-means with MAP-DP convergence typically achieved in the order of seconds for many practical problems. Finally, in contrast to K-means, since the algorithm is based on an underlying statistical model, the MAP-DP framework can deal with missing data and enables model testing such as cross validation in a principled way. We demonstrate the simplicity and effectiveness of this algorithm on the health informatics problem of clinical sub-typing in a cluster of diseases known as parkinsonism. PMID:27669525
Kinetic energy distributions of sputtered neutral aluminum clusters: Al--Al₆
Coon, S.R.; Calaway, W.F.; Pellin, M.J.; Curlee, G.A. (Dept. of Physics); White, J.M. (Dept. of Chemistry and Biochemistry)
1992-01-01
Neutral aluminum clusters sputtered from polycrystalline aluminum were analyzed by laser postionization time-of-flight (TOF) mass spectrometry. The kinetic energy distributions of Al through Al₆ were measured by a neutrals time-of-flight technique. The interpretation of laser postionization TOF data to extract velocity and energy distributions is presented. The aluminum cluster distributions are qualitatively similar to previous copper cluster distribution measurements from our laboratory. In contrast to the steep high energy tails predicted by the single- or multiple-collision models, the measured cluster distributions have high energy power law dependences in the range of E^(-3) to E^(-4.5). Correlated collision models may explain the substantial abundance of energetic clusters that are observed in these experiments. Possible influences of cluster fragmentation on the distributions are discussed.
Climatological analyses of LMA data with an open-source lightning flash-clustering algorithm
NASA Astrophysics Data System (ADS)
Fuchs, Brody R.; Bruning, Eric C.; Rutledge, Steven A.; Carey, Lawrence D.; Krehbiel, Paul R.; Rison, William
2016-07-01
Approximately 63 million lightning flashes have been identified and analyzed from multiple years of Washington, D. C., northern Alabama, and northeast Colorado lightning mapping array (LMA) data using an open-source flash-clustering algorithm. LMA networks detect radiation produced by lightning breakdown processes, allowing for high-resolution mapping of lightning flashes. Similar to other existing clustering algorithms, the algorithm described herein groups lightning-produced radiation sources by space and time to estimate total flash counts and information about each detected flash. Various flash characteristics and their sensitivity to detection efficiency are investigated to elucidate biases in the algorithm, detail detection efficiencies of various LMAs, and guide future improvements. Furthermore, flash density values in each region are compared to corresponding satellite estimates. While total flash density values produced by the algorithm in Washington, D. C. (~20 flashes km⁻² yr⁻¹), and Alabama (~35 flashes km⁻² yr⁻¹) are within 50% of satellite estimates, LMA-based estimates are approximately a factor of 3 larger (50 flashes km⁻² yr⁻¹) than satellite estimates in northeast Colorado. Accordingly, estimates of the ratio of in-cloud to cloud-to-ground flashes near the LMA network (~20) are approximately a factor of 3 larger than satellite estimates in Colorado. These large differences between estimates may be related to the distinct environment conducive to intense convection, low-altitude flashes, and unique charge structures in northeast Colorado.
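Grouping radiation sources by space and time, as described above, can be illustrated with a simple greedy sketch. This is not the open-source algorithm used in the study: the function name `group_sources`, the thresholds `max_dt` and `max_dist`, and the source list are hypothetical values chosen only to show the idea, and the sketch never merges two flashes once they have been started.

```python
import math

def group_sources(sources, max_dt, max_dist):
    """Greedily group (time, x, y) sources into flashes.

    A source joins the first flash that already contains a source within
    max_dt in time and max_dist in horizontal distance; otherwise it starts
    a new flash.
    """
    flashes = []
    for src in sorted(sources):  # process sources in time order
        t, x, y = src
        for flash in flashes:
            if any(t - ft <= max_dt and math.hypot(x - fx, y - fy) <= max_dist
                   for ft, fx, fy in flash):
                flash.append(src)
                break
        else:
            flashes.append([src])
    return flashes

# Hypothetical sources: time in seconds, position in kilometres.
sources = [
    (0.00, 0.0, 0.0),
    (0.05, 0.5, 0.2),
    (0.10, 1.0, 0.1),
    (0.06, 40.0, 40.0),  # far away in space
    (5.00, 0.2, 0.1),    # far away in time
]
flashes = group_sources(sources, max_dt=0.15, max_dist=3.0)
flash_sizes = sorted(len(f) for f in flashes)
```

The flash count is `len(flashes)`, and per-flash properties such as duration or extent could be derived from each group.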
An Auto-Recognizing System for Dice Games Using a Modified Unsupervised Grey Clustering Algorithm
Huang, Kuo-Yi
2008-01-01
In this paper, a novel identification method based on a machine vision system is proposed to recognize the score of dice. The system employs image processing techniques and the modified unsupervised grey clustering algorithm (MUGCA) to estimate the location of each die and identify the spot number accurately and effectively. The proposed algorithms replace manual recognition. The experimental results show that the system performs well in terms of flexibility, speed, and accuracy. PMID:27879761
Klystron Cluster Scheme for ILC High Power RF Distribution
Nantista, Christopher; Adolphsen, Chris (SLAC)
2009-07-06
We present a concept for powering the main linacs of the International Linear Collider (ILC) by delivering high power RF from the surface via overmoded, low-loss waveguides at widely spaced intervals. The baseline design employs a two-tunnel layout, with klystrons and modulators evenly distributed along a service tunnel running parallel to the accelerator tunnel. This new idea eliminates the need for the service tunnel. It also brings most of the warm heat load to the surface, dramatically reducing the tunnel water cooling and HVAC requirements. In the envisioned configuration, groups of 70 klystrons and modulators are clustered in surface buildings every 2.5 km. Their outputs are combined into two half-meter diameter circular TE₀₁ mode evacuated waveguides. These are directed via special bends through a deep shaft and along the tunnel, one upstream and one downstream. Each feeds approximately 1.25 km of linac with power tapped off in 10 MW portions at 38 m intervals. The power is extracted through a novel coaxial tap-off (CTO), after which the local distribution is as it would be from a klystron. The tap-off design is also employed in reverse for the initial combining.
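The figures quoted above can be combined into a rough consistency check. This is back-of-the-envelope arithmetic using only the numbers in the abstract; it ignores transmission losses and any overhead, and the variable names are illustrative.

```python
# Figures taken from the abstract.
feed_length_m = 1250.0      # linac length fed by each waveguide
tap_spacing_m = 38.0        # distance between tap-offs
tap_power_mw = 10.0         # power extracted at each tap-off
klystrons_per_cluster = 70  # klystrons per surface building
waveguides_per_cluster = 2  # one upstream, one downstream

# Derived quantities (idealized: no losses, no overhead).
taps_per_waveguide = feed_length_m / tap_spacing_m
power_per_waveguide_mw = taps_per_waveguide * tap_power_mw
power_per_cluster_mw = waveguides_per_cluster * power_per_waveguide_mw
power_per_klystron_mw = power_per_cluster_mw / klystrons_per_cluster
```

Under these assumptions each waveguide serves roughly 33 tap-offs and carries on the order of 330 MW, which corresponds to a little under 10 MW per klystron.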
3D Hail Size Distribution Interpolation/Extrapolation Algorithm
NASA Technical Reports Server (NTRS)
Lane, John
2013-01-01
Radar data can usually detect hail; however, it is difficult for present day radar to accurately discriminate between hail and rain. Local ground-based hail sensors are much better at detecting hail against a rain background, and when incorporated with radar data, provide a much better local picture of a severe rain or hail event. The previous disdrometer interpolation/ extrapolation algorithm described a method to interpolate horizontally between multiple ground sensors (a minimum of three) and extrapolate vertically. This work is a modification to that approach that generates a purely extrapolated 3D spatial distribution when using a single sensor.
Room Acoustical Simulation Algorithm Based on the Free Path Distribution
NASA Astrophysics Data System (ADS)
VORLÄNDER, M.
2000-04-01
A new algorithm is presented which provides estimates of impulse responses in rooms. It is applicable to arbitrary shaped rooms, thus including non-diffuse spaces like workrooms or offices. In the latter cases, for instance, sound propagation curves are of interest to be applied in noise control. In the case of concert halls and opera houses, the method enables very fast predictions of room acoustical criteria like reverberation time, strength or clarity. The method is based on a low-resolved ray tracing and recording of the free paths. Estimates of impulse responses are derived from evaluation of the free path distribution and of the free path transition probabilities.
Schatz, Martin D.; Kolda, Tamara G.; van de Geijn, Robert
2015-09-01
Large-scale datasets in computational chemistry typically require distributed-memory parallel methods to perform a special operation known as tensor contraction. Tensors are multidimensional arrays, and a tensor contraction is akin to matrix multiplication with special types of permutations. Creating an efficient algorithm and optimized implementation in this domain is complex, tedious, and error-prone. To address this, we develop a notation to express data distributions so that we can use automated methods to find optimized implementations for tensor contractions. We consider the spin-adapted coupled cluster singles and doubles method from computational chemistry and use our methodology to produce an efficient implementation. Experiments performed on the IBM Blue Gene/Q and Cray XC30 demonstrate both improved performance and reduced memory consumption.
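The remark that a tensor contraction is akin to matrix multiplication with permutations can be made concrete with a small sequential sketch. The particular contraction C[i][j][l] = sum over k of A[i][k][j] * B[k][l], the function names, and the tiny inputs are assumptions chosen for illustration; the distributed-memory aspects of the paper are not modeled.

```python
def contract_direct(A, B):
    """C[i][j][l] = sum_k A[i][k][j] * B[k][l], written as explicit loops."""
    I, K, J, L = len(A), len(A[0]), len(A[0][0]), len(B[0])
    return [[[sum(A[i][k][j] * B[k][l] for k in range(K)) for l in range(L)]
             for j in range(J)] for i in range(I)]

def matmul(X, Y):
    """Ordinary matrix product of two nested-list matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def contract_via_matmul(A, B):
    """Same contraction, expressed as permute + flatten + matmul + unflatten."""
    I, K, J = len(A), len(A[0]), len(A[0][0])
    # Permute A from index order (i, k, j) to (i, j, k), then flatten (i, j)
    # into a single row index so that A becomes an (I*J) x K matrix.
    rows = [[A[i][k][j] for k in range(K)] for i in range(I) for j in range(J)]
    flat = matmul(rows, B)  # shape (I*J) x L
    # Unflatten the row index back into (i, j).
    return [[flat[i * J + j] for j in range(J)] for i in range(I)]

# Small integer tensors: A has shape 2 x 3 x 2, B has shape 3 x 2.
A = [[[1, 2], [3, 4], [5, 6]],
     [[7, 8], [9, 10], [11, 12]]]
B = [[1, 0], [0, 1], [1, 1]]

C_direct = contract_direct(A, B)
C_matmul = contract_via_matmul(A, B)
```

Both routes give the same tensor; the second shows why data layout and permutation matter when the work is handed to an optimized matrix-multiplication kernel.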
The Spatial Distribution of the Young Stellar Clusters in the Star-forming Galaxy NGC 628
NASA Astrophysics Data System (ADS)
Grasha, K.; Calzetti, D.; Adamo, A.; Kim, H.; Elmegreen, B. G.; Gouliermis, D. A.; Aloisi, A.; Bright, S. N.; Christian, C.; Cignoni, M.; Dale, D. A.; Dobbs, C.; Elmegreen, D. M.; Fumagalli, M.; Gallagher, J. S., III; Grebel, E. K.; Johnson, K. E.; Lee, J. C.; Messa, M.; Smith, L. J.; Ryon, J. E.; Thilker, D.; Ubeda, L.; Wofford, A.
2015-12-01
We present a study of the spatial distribution of the stellar cluster populations in the star-forming galaxy NGC 628. Using Hubble Space Telescope broadband WFC3/UVIS UV and optical images from the Treasury Program LEGUS (Legacy ExtraGalactic UV Survey), we have identified 1392 potential young (≲ 100 Myr) stellar clusters within the galaxy using a combination of visual inspection and automatic selection. We investigate the clustering of these young stellar clusters and quantify the strength and change of clustering strength with scale using the two-point correlation function. We also investigate how image boundary conditions and dust lanes affect the observed clustering. The distribution of the clusters is well fit by a broken power law with negative exponent α. We recover a weighted mean index of α ∼ -0.8 for all spatial scales below the break at 3.″3 (158 pc at a distance of 9.9 Mpc) and an index of α ∼ -0.18 above 158 pc for the accumulation of all cluster types. The strength of the clustering increases with decreasing age and clusters older than 40 Myr lose their clustered structure very rapidly and tend to be randomly distributed in this galaxy, whereas the mass of the star cluster has little effect on the clustering strength. This is consistent with results from other studies that the morphological hierarchy in stellar clustering resembles the same hierarchy as the turbulent interstellar medium.
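As a rough illustration of how a two-point correlation function quantifies clustering with scale, the sketch below uses the simple "natural" estimator, 1 + ω = (DD/RR) with both pair counts normalised by the number of pairs. This estimator, the function names, and the flat 2-D geometry are assumptions for illustration; the abstract does not state which estimator or geometry the authors used.

```python
import math


def pair_counts(points, edges):
    """Count unordered pairs whose separation falls in each [edges[b], edges[b+1]) bin."""
    counts = [0] * (len(edges) - 1)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            d = math.dist(points[i], points[j])
            for b in range(len(counts)):
                if edges[b] <= d < edges[b + 1]:
                    counts[b] += 1
                    break
    return counts


def one_plus_omega(data, randoms, edges):
    """Natural estimator: normalised data-data pairs over normalised random-random pairs.

    Returns None for bins in which the random catalogue has no pairs.
    """
    dd = pair_counts(data, edges)
    rr = pair_counts(randoms, edges)
    n_dd = len(data) * (len(data) - 1) / 2
    n_rr = len(randoms) * (len(randoms) - 1) / 2
    return [(d / n_dd) / (r / n_rr) if r else None for d, r in zip(dd, rr)]
```

Values above 1 in a bin indicate an excess of pairs at that separation relative to the unclustered comparison catalogue; a power law in 1 + ω versus separation would appear as a straight line in log-log space.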
Tong, Zhen; Pu, Lixin; Dong, Fangjie
2013-08-01
As a common malignant tumor, breast cancer seriously affects women's physical and psychological health and can threaten their lives. Its incidence is also rising gradually in some parts of the world. As a common auxiliary technique for pathological diagnosis, immunohistochemistry plays an important role in the diagnosis of breast cancer. Usually, pathologists isolate positive cells from the stained specimen processed by immunohistochemistry and calculate the ratio of positive cells, which is a core indicator in breast cancer diagnosis. In this paper, we present a new algorithm, based on a modified watershed algorithm and concavity point searching, that identifies the positive cells, segments the clustered cells automatically, and then counts them automatically. Comparison of our experimental results with those of other methods shows that our method can segment the clustered cells exactly without losing any geometrical cell features and give the exact number of separated cells.
Ishii, Satoshi; Kadota, Koji; Senoo, Keishi
2009-09-01
DNA fingerprinting analysis such as amplified ribosomal DNA restriction analysis (ARDRA), repetitive extragenic palindromic PCR (rep-PCR), ribosomal intergenic spacer analysis (RISA), and denaturing gradient gel electrophoresis (DGGE) are frequently used in various fields of microbiology. The major difficulty in DNA fingerprinting data analysis is the alignment of multiple peak sets. We report here an R program for a clustering-based peak alignment algorithm, and its application to analyze various DNA fingerprinting data, such as ARDRA, rep-PCR, RISA, and DGGE data. The results obtained by our clustering algorithm and by BioNumerics software showed high similarity. Since several R packages have been established to statistically analyze various biological data, the distance matrix obtained by our R program can be used for subsequent statistical analyses, some of which were not previously performed but are useful in DNA fingerprinting studies.
Combustion distribution control using the extremum seeking algorithm
NASA Astrophysics Data System (ADS)
Marjanovic, A.; Krstic, M.; Djurovic, Z.; Kvascev, G.; Papic, V.
2014-12-01
Quality regulation of the combustion process inside the furnace is the basis of high demands for increasing robustness, safety and efficiency of thermal power plants. The paper considers the possibility of spatial temperature distribution control inside the boiler, based on the correction of distribution of coal over the mills. Such control system ensures the maintenance of the flame focus away from the walls of the boiler, and thus preserves the equipment and reduces the possibility of ash slugging. At the same time, uniform heat dissipation over mills enhances the energy efficiency of the boiler, while reducing the pollution of the system. A constrained multivariable extremum seeking algorithm is proposed as a tool for combustion process optimization with the main objective of centralizing the flame in the furnace. Simulations are conducted on a model corresponding to the 350MW boiler of the Nikola Tesla Power Plant, in Obrenovac, Serbia.
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-01-01
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen’s temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home. PMID:26007738
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-05-21
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen's temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home.
Numerical linked-cluster algorithms. I. Spin systems on square, triangular, and kagomé lattices.
Rigol, Marcos; Bryant, Tyler; Singh, Rajiv R P
2007-06-01
We discuss recently introduced numerical linked-cluster (NLC) algorithms that allow one to obtain temperature-dependent properties of quantum lattice models, in the thermodynamic limit, from exact diagonalization of finite clusters. We present studies of thermodynamic observables for spin models on square, triangular, and kagomé lattices. Results for several choices of clusters and extrapolations methods, that accelerate the convergence of NLCs, are presented. We also include a comparison of NLC results with those obtained from exact analytical expressions (where available), high-temperature expansions (HTE), exact diagonalization (ED) of finite periodic systems, and quantum Monte Carlo simulations. For many models and properties NLC results are substantially more accurate than HTE and ED.
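For orientation, the linked-cluster expansion that NLC algorithms evaluate is conventionally written as below. The notation is the standard one and is not quoted from this abstract: P is an extensive property, N the number of lattice sites, L(c) the number of embeddings of cluster c per site, and W_P(c) the weight of cluster c, defined recursively by subtracting the weights of all its proper subclusters s.

```latex
\frac{P(\mathcal{L})}{N} = \sum_{c} L(c)\, W_P(c),
\qquad
W_P(c) = P(c) - \sum_{s \subset c} W_P(s)
```

The property P(c) of each finite cluster is what exact diagonalization supplies; truncating the sum over c at a maximum cluster size gives the finite-order NLC result.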
Comments on "A robust fuzzy local information C-means clustering algorithm".
Celik, Turgay; Lee, Hwee Kuan
2013-03-01
In a recent paper, Krinidis and Chatzis proposed a variation of the fuzzy c-means algorithm for image clustering. The local spatial and gray-level information are incorporated in a fuzzy way through an energy function. Local minimizers of the designed energy function are proposed to obtain the fuzzy membership of each pixel and the cluster centers. In this paper, it is shown that the local minimizers of Krinidis and Chatzis, which obtain the fuzzy membership and the cluster centers in an iterative manner, are not exclusively solutions for true local minimizers of their designed energy function. Thus, the local minimizers of Krinidis and Chatzis fail to converge to the correct local minima of the designed energy function not because they become trapped in local minima, but because of the design of the energy function.
A spectral clustering search algorithm for predicting shallow landslide size and location
NASA Astrophysics Data System (ADS)
Bellugi, Dino; Milledge, David G.; Dietrich, William E.; McKean, Jim A.; Perron, J. Taylor; Sudderth, Erik B.; Kazian, Brian
2015-02-01
The potential hazard and geomorphic significance of shallow landslides depend on their location and size. Commonly applied one-dimensional stability models do not include lateral resistances and cannot predict landslide size. Multidimensional models must be applied to specific geometries, which are not known a priori, and testing all possible geometries is computationally prohibitive. We present an efficient deterministic search algorithm based on spectral graph theory and couple it with a multidimensional stability model to predict discrete landslides in applications at scales broader than a single hillslope using gridded spatial data. The algorithm is general, assuming only that instability results when driving forces acting on a cluster of cells exceed the resisting forces on its margins and that clusters behave as rigid blocks with a failure plane at the soil-bedrock interface. This algorithm recovers predefined clusters of unstable cells of varying shape and size on a synthetic landscape, predicts the size, location, and shape of an observed shallow landslide using field-measured physical parameters, and is robust to modest changes in input parameters. The search algorithm identifies patches of potential instability within large areas of stable landscape. Within these patches will be many different combinations of cells with a Factor of Safety less than one, suggesting that subtle variations in local conditions (e.g., pore pressure and root strength) may determine the ultimate form and exact location at a specific site. Nonetheless, the tests presented here suggest that the search algorithm enables the prediction of shallow landslide size as well as location across landscapes.
NASA Astrophysics Data System (ADS)
Sun, Jiajia; Li, Yaoguo
2017-02-01
Joint inversion that simultaneously inverts multiple geophysical data sets to recover a common Earth model is increasingly being applied to exploration problems. Petrophysical data can serve as an effective constraint to link different physical property models in such inversions. There are two challenges, among others, associated with the petrophysical approach to joint inversion. One is related to the multimodality of petrophysical data because there often exist more than one relationship between different physical properties in a region of study. The other challenge arises from the fact that petrophysical relationships have different characteristics and can exhibit point, linear, quadratic, or exponential forms in a crossplot. The fuzzy c-means (FCM) clustering technique is effective in tackling the first challenge and has been applied successfully. We focus on the second challenge in this paper and develop a joint inversion method based on variations of the FCM clustering technique. To account for the specific shapes of petrophysical relationships, we introduce several different fuzzy clustering algorithms that are capable of handling different shapes of petrophysical relationships. We present two synthetic and one field data examples and demonstrate that, by choosing appropriate distance measures for the clustering component in the joint inversion algorithm, the proposed joint inversion method provides an effective means of handling common petrophysical situations we encounter in practice. The jointly inverted models have both enhanced structural similarity and increased petrophysical correlation, and better represent the subsurface in the spatial domain and the parameter domain of physical properties.
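The role of the distance measure can be sketched by contrasting two prototype types that fuzzy clustering variants commonly use: a point prototype, as in standard FCM, and a line prototype, as in fuzzy c-lines. Which variants the authors actually use is not stated in the abstract, so the two functions below are illustrative assumptions. Here `v` is a prototype location in the crossplot and `e` is a unit direction vector.

```python
def point_dist2(x, v):
    """Squared Euclidean distance from sample x to a point prototype v (standard FCM)."""
    return sum((xi - vi) ** 2 for xi, vi in zip(x, v))


def line_dist2(x, v, e):
    """Squared distance from x to the line through v with unit direction e.

    Samples lying anywhere along the line are at distance zero, which suits a
    linear petrophysical trend better than a single point prototype does.
    """
    diff = [xi - vi for xi, vi in zip(x, v)]
    along = sum(d * ei for d, ei in zip(diff, e))   # component of diff along the line
    return sum(d * d for d in diff) - along ** 2
```

For a sample at (3, 0) and a horizontal line through the origin, the point distance is 9 but the line distance is 0, so a line-shaped relationship is treated as a single cluster rather than being split up.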
NASA Astrophysics Data System (ADS)
Sun, Jiajia; Li, Yaoguo
2016-11-01
Joint inversion that simultaneously inverts multiple geophysical data sets to recover a common Earth model is increasingly being applied to exploration problems. Petrophysical data can serve as an effective constraint to link different physical property models in such inversions. There are two challenges, among others, associated with the petrophysical approach to joint inversion. One is related to the multi-modality of petrophysical data because there often exist more than one relationship between different physical properties in a region of study. The other challenge arises from the fact that petrophysical relationships have different characteristics and can exhibit point, linear, quadratic, or exponential forms in a crossplot. The fuzzy c-means (FCM) clustering technique is effective in tackling the first challenge and has been applied successfully. We focus on the second challenge in this paper and develop a joint inversion method based on variations of the FCM clustering technique. To account for the specific shapes of petrophysical relationships, we introduce several different fuzzy clustering algorithms that are capable of handling different shapes of petrophysical relationships. We present two synthetic and one field data examples and demonstrate that, by choosing appropriate distance measures for the clustering component in the joint inversion algorithm, the proposed joint inversion method provides an effective means of handling common petrophysical situations we encounter in practice. The jointly inverted models have both enhanced structural similarity and increased petrophysical correlation, and better represent the subsurface in the spatial domain and the parameter domain of physical properties.
Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms
NASA Astrophysics Data System (ADS)
Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel
2016-04-01
Advances in graphics processing units' technology towards encompassing parallel architectures [1], comprised of thousands of cores and multiples of parallel threads, provide the foundation in terms of hardware for the rapid processing of various parallel applications regarding seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are being established and decades of data are being compiled together [2]. Yet, many processes regarding seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, being independent of one another, can be performed in parallel, reducing processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using Cuda C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel comparatively, are: fuzzy k-means clustering with expert knowledge [7] in assigning overall clusters' number; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, Cuda C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering References [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufmann Publisher, 2013 [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008 [3] Papadakis, S. and
Saro, A.
2015-10-12
In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg$^2$ of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation with the following function $\langle \ln \lambda | M_{500} \rangle \propto B_\lambda \ln M_{500} + C_\lambda \ln E(z)$ and use SPT-SZ cluster masses and RM richnesses λ to constrain the parameters. We find $B_\lambda = 1.14^{+0.21}_{-0.18}$ and $C_\lambda = 0.73^{+0.77}_{-0.75}$. The associated scatter in mass at fixed richness is $\sigma_{\ln M|\lambda} = 0.18^{+0.08}_{-0.05}$ at a characteristic richness λ = 70. We demonstrate that our model provides an adequate description of the matched sample, showing that the fraction of SPT-SZ-selected clusters with RM counterparts is consistent with expectations and that the fraction of RM-selected clusters with SPT-SZ counterparts is in mild tension with expectation. We model the optical-SZE cluster positional offset distribution with the sum of two Gaussians, showing that it is consistent with a dominant, centrally peaked population and a subdominant population characterized by larger offsets. We also cross-match the RM catalogue with SPT-SZ candidates below the official catalogue threshold significance ξ = 4.5, using the RM catalogue to provide optical confirmation and redshifts for 15 additional clusters with $\xi \in [4, 4.5]$.
Galaxy clusters in visible light (I): catalogues, large-scale distribution, and general properties.
NASA Astrophysics Data System (ADS)
Bian, Yulin
1995-12-01
While the nature, behaviour, and evolution of galaxy clusters is such a wide research field, only some of their optical properties are highlighted in the present review. The article is divided into two parts, of which this is the first, devoted to cluster catalogues, large-scale distribution, and some general characteristics of galaxy clusters.
NASA Astrophysics Data System (ADS)
Plaza, Antonio; Chang, Chein-I.; Plaza, Javier; Valencia, David
2006-05-01
The incorporation of hyperspectral sensors aboard airborne/satellite platforms is currently producing a nearly continual stream of multidimensional image data, and this high data volume has quickly introduced new processing challenges. The price paid for the wealth of spatial and spectral information available from hyperspectral sensors is the enormous amount of data that they generate. Several applications exist, however, where having the desired information calculated quickly enough for practical use is highly desirable. High computing performance of algorithm analysis is particularly important in homeland defense and security applications, in which swift decisions often involve detection of (sub-pixel) military targets (including hostile weaponry, camouflage, concealment, and decoys) or chemical/biological agents. In order to speed up the computational performance of hyperspectral imaging algorithms, this paper develops several fast parallel data processing techniques. Techniques include four classes of algorithms: (1) unsupervised classification, (2) spectral unmixing, (3) automatic target recognition, and (4) onboard data compression. A massively parallel Beowulf cluster (Thunderhead) at NASA's Goddard Space Flight Center in Maryland is used to measure parallel performance of the proposed algorithms. In order to explore the viability of developing onboard, real-time hyperspectral data compression algorithms, a Xilinx Virtex-II field programmable gate array (FPGA) is also used in experiments. Our quantitative and comparative assessment of parallel techniques and strategies may help image analysts in selection of parallel hyperspectral algorithms for specific applications.
Parallel OSEM Reconstruction Algorithm for Fully 3-D SPECT on a Beowulf Cluster.
Rong, Zhou; Tianyu, Ma; Yongjie, Jin
2005-01-01
In order to improve the computation speed of the ordered subset expectation maximization (OSEM) algorithm for fully 3-D single photon emission computed tomography (SPECT) reconstruction, an experimental Beowulf-type cluster was built and several parallel reconstruction schemes were described. We implemented a single-program-multiple-data (SPMD) parallel 3-D OSEM reconstruction algorithm based on the message passing interface (MPI) and tested it with combinations of different numbers of calculating processors and different sizes of voxel grid in reconstruction (64×64×64 and 128×128×128). Performance of parallelization was evaluated in terms of the speedup factor and parallel efficiency. This parallel implementation methodology is expected to help make fully 3-D OSEM algorithms more feasible in clinical SPECT studies.
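The two evaluation metrics named here have standard definitions, sketched below: speedup is serial run time divided by parallel run time, and parallel efficiency is speedup divided by the number of processors. The function names and the timings in the usage lines are hypothetical and are not results from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup factor S = T_1 / T_p."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency E = S / p, where p is the number of processors."""
    return speedup(t_serial, t_parallel) / n_procs


# Hypothetical timings: 120 s on one processor, 20 s on eight processors.
s = speedup(120.0, 20.0)
e = parallel_efficiency(120.0, 20.0, 8)
```

An efficiency below 1 reflects communication and synchronisation overhead, which typically grows as more processors share a fixed-size voxel grid.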
Double-layer evolutionary algorithm for distributed optimization of particle detection on the Grid
NASA Astrophysics Data System (ADS)
Padée, Adam; Kurek, Krzysztof; Zaremba, Krzysztof
2013-08-01
Reconstruction of particle tracks from information collected by position-sensitive detectors is an important procedure in HEP experiments. It is usually controlled by a set of numerical parameters which have to be manually optimized. This paper proposes an automatic approach to this task by utilizing an evolutionary algorithm (EA) operating on both real-valued and binary representations. Because of the computational complexity of the task, a special distributed architecture of the algorithm is proposed, designed to be run in a grid environment. It is a two-level hierarchical hybrid utilizing an asynchronous master-slave EA on the level of clusters and an island model EA on the level of the grid. The technical aspects of using production grid infrastructure are covered, including communication protocols on both levels. The paper also deals with the problem of heterogeneity of the resources, presenting efficiency tests on a benchmark function. These tests confirm that even relatively small islands (clusters) can be beneficial to the optimization process when connected to the larger ones. Finally, a real-life usage example is presented, which is an optimization of track reconstruction in the Large Angle Spectrometer of the NA-58 COMPASS experiment held at CERN, using a sample of Monte Carlo simulated data. The overall reconstruction efficiency gain achieved by the proposed method is more than 4% compared to the manually optimized parameters.
Cluster mass fraction and size distribution determined by fs-time-resolved measurements
NASA Astrophysics Data System (ADS)
Gao, Xiaohui; Wang, Xiaoming; Shim, Bonggu; Arefiev, Alexey; Tushentsov, Mikhail; Breizman, Boris; Downer, Mike
2009-11-01
Characterization of supersonic gas jets is important for accurate interpretation and control of laser-cluster experiments. While average size and total atomic density can be found by standard Rayleigh scatter and interferometry, cluster mass fraction and size distribution are usually difficult to measure. Here we determine the cluster fraction and the size distribution with fs-time-resolved refractive index and absorption measurements in cluster gas jets after ionization and heating by an intense pump pulse. The fs-time-resolved refractive index measured with frequency domain interferometer (FDI) shows different contributions from monomer plasma and cluster plasma in the time domain, enabling us to determine the cluster fraction. The fs-time-resolved absorption measured by a delayed probe shows the contribution from clusters of various sizes, allowing us to find the size distribution.
NASA Astrophysics Data System (ADS)
Abdul-Nasir, Aimi Salihah; Mashor, Mohd Yusoff; Halim, Nurul Hazwani Abd; Mohamed, Zeehaida
2015-05-01
Malaria is a life-threatening parasitic infectious disease that accounts for nearly one million deaths each year. Because malaria requires prompt and accurate diagnosis, the current study proposes an unsupervised pixel segmentation based on a clustering algorithm in order to obtain fully segmented red blood cells (RBCs) infected with malaria parasites from thin blood smear images of the P. vivax species. In order to obtain the segmented infected cell, the malaria images are first enhanced using a modified global contrast stretching technique. Then, an unsupervised segmentation technique based on a clustering algorithm is applied to the intensity component of the malaria image in order to segment the infected cell from its blood cell background. In this study, cascaded moving k-means (MKM) and fuzzy c-means (FCM) clustering algorithms are proposed for malaria slide image segmentation. After that, a median filter is applied to smooth the image as well as to remove any unwanted regions such as small background pixels. Finally, a seeded region growing area extraction algorithm is applied in order to remove large unwanted regions that still appear in the image because they are too large to be cleaned by the median filter. The effectiveness of the proposed cascaded MKM and FCM clustering algorithms has been analyzed qualitatively and quantitatively by comparing the proposed cascaded clustering algorithm with the MKM and FCM clustering algorithms. Overall, the results indicate that segmentation using the proposed cascaded clustering algorithm produces the best segmentation performance, achieving acceptable sensitivity as well as high specificity and accuracy values compared to the segmentation results provided by the MKM and FCM algorithms.
Selecting Random Distributed Elements for HIFU using Genetic Algorithm
NASA Astrophysics Data System (ADS)
Zhou, Yufeng
2011-09-01
As an effective and noninvasive therapeutic modality for tumor treatment, high-intensity focused ultrasound (HIFU) has attracted attention from both physicians and patients. New generations of HIFU systems with the ability to electrically steer the HIFU focus using phased array transducers have been under development. The presence of side and grating lobes may cause undesired thermal accumulation at the interface of the coupling medium (i.e. water) and skin, or in the intervening tissue. Although sparse randomly distributed piston elements could reduce the amplitude of grating lobes, there are theoretically no grating lobes with the use of concave elements in the new phased array HIFU. A new HIFU transmission strategy is proposed in this study, firing a number of but not all elements for a certain period and then changing to another group for the next firing sequence. The advantages are: 1) the asymmetric position of active elements may reduce the side lobes, and 2) each element has some resting time during the entire HIFU ablation (up to several hours for some clinical applications) so that the decreasing efficiency of the transducer due to thermal accumulation is minimized. Genetic algorithm was used for selecting randomly distributed elements in a HIFU array. Amplitudes of the first side lobes at the focal plane were used as the fitness value in the optimization. Overall, it is suggested that the proposed new strategy could reduce the side lobe and the consequent side-effects, and the genetic algorithm is effective in selecting those randomly distributed elements in a HIFU array.
Clusters and cycles in the cosmic ray age distributions of meteorites
NASA Technical Reports Server (NTRS)
Woodard, M. F.; Marti, K.
1985-01-01
Statistically significant clusters in the cosmic ray exposure age distributions of some groups of iron and stone meteorites were observed, suggesting epochs of enhanced collision and breakups. Fourier analyses of the age distributions of chondrites reveal no significant periods, nor does the same analysis when applied to iron meteorite clusters.
Ma, Li; Li, Yang; Fan, Suohai; Fan, Runzhu
2015-01-01
Image segmentation plays an important role in medical image processing. Fuzzy c-means (FCM) clustering is one of the popular clustering algorithms for medical image segmentation. However, FCM depends on the initial cluster centers, falls easily into local optima, and is sensitive to noise. To solve these problems, this paper proposes a hybrid artificial fish swarm algorithm (HAFSA). The proposed algorithm combines the artificial fish swarm algorithm (AFSA) with FCM, using the global optimization search and parallel computing ability of AFSA to find a superior result. Meanwhile, the Metropolis criterion and a noise reduction mechanism are introduced into AFSA to enhance the convergence rate and antinoise ability. An artificial grid graph and Magnetic Resonance Imaging (MRI) data are used in the experiments, and the experimental results show that the proposed algorithm has stronger antinoise ability and higher precision. A number of evaluation indicators also demonstrate that HAFSA performs better than FCM and suppressed FCM (SFCM).
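The dependence of FCM on its initial centers is easier to see with the baseline algorithm written out. The sketch below is plain FCM on 1-D data (for example, pixel intensities) using the standard alternating membership and center updates; it is not the HAFSA method of the abstract, and the function names, fuzzifier default m = 2, and the small floor `eps` guarding against zero distances are illustrative assumptions.

```python
def memberships(xs, centers, m=2.0, eps=1e-12):
    """u[i][k]: degree to which sample k belongs to cluster i (columns sum to 1)."""
    p = 1.0 / (m - 1.0)
    # Squared distances, floored at eps so a sample sitting on a center is safe.
    d2 = [[max((x - v) ** 2, eps) for x in xs] for v in centers]
    n_c = len(centers)
    return [[1.0 / sum((d2[i][k] / d2[j][k]) ** p for j in range(n_c))
             for k in range(len(xs))]
            for i in range(n_c)]


def fcm_1d(xs, init_centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(init_centers)
    for _ in range(iters):
        u = memberships(xs, centers, m)
        centers = [sum((u[i][k] ** m) * xs[k] for k in range(len(xs))) /
                   sum(u[i][k] ** m for k in range(len(xs)))
                   for i in range(len(centers))]
    return centers, memberships(xs, centers, m)


# Hypothetical intensities forming two well-separated groups.
xs = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
centers, u = fcm_1d(xs, init_centers=[0.0, 10.2])
```

Because the result is wherever the iteration settles from `init_centers`, a poor starting point can lead to a poor local optimum, which is the weakness a global search such as AFSA is meant to address.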
A modified estimation distribution algorithm based on extreme elitism.
Gao, Shujun; de Silva, Clarence W
2016-12-01
An existing estimation distribution algorithm (EDA) with univariate marginal Gaussian model was improved by designing and incorporating an extreme elitism selection method. This selection method highlighted the effect of a few top best solutions in the evolution and advanced EDA to form a primary evolution direction and obtain a fast convergence rate. Simultaneously, this selection can also keep the population diversity to make EDA avoid premature convergence. Then the modified EDA was tested by means of benchmark low-dimensional and high-dimensional optimization problems to illustrate the gains in using this extreme elitism selection. Besides, no-free-lunch theorem was implemented in the analysis of the effect of this new selection on EDAs.
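For readers unfamiliar with EDAs, the baseline being modified can be sketched as follows: a univariate marginal Gaussian model is sampled, the best individuals are selected, and the per-dimension mean and standard deviation are re-estimated from them. The sketch uses ordinary truncation selection and carries the single best solution forward; it is not the extreme elitism selection of the paper, whose details the abstract does not give. All names, population sizes, and the sphere benchmark are illustrative assumptions.

```python
import random
import statistics


def sphere(x):
    """Benchmark objective to minimise; the global minimum is 0 at the origin."""
    return sum(v * v for v in x)


def umda_gaussian(f, dim, pop_size=100, n_select=30, generations=60, seed=0):
    """Univariate marginal Gaussian EDA with truncation selection and one elite."""
    rng = random.Random(seed)
    mean = [1.0] * dim      # initial model: independent Gaussians per dimension
    sigma = [3.0] * dim
    best = None
    for _ in range(generations):
        pop = [[rng.gauss(mean[d], sigma[d]) for d in range(dim)]
               for _ in range(pop_size)]
        if best is not None:
            pop[0] = best   # elitism: keep the best-so-far solution in the population
        pop.sort(key=f)
        best = pop[0]
        selected = pop[:n_select]
        # Re-estimate each marginal from the selected individuals only.
        mean = [statistics.fmean(ind[d] for ind in selected) for d in range(dim)]
        sigma = [statistics.pstdev([ind[d] for ind in selected]) for d in range(dim)]
    return best, f(best)


best, value = umda_gaussian(sphere, dim=5)
```

The shrinking `sigma` is what gives fast convergence, and it is also the source of the premature-convergence risk the abstract mentions when the model narrows before reaching the optimum.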
Fast Parabola Detection Using Estimation of Distribution Algorithms
Guerrero-Turrubiates, Jose de Jesus; Cruz-Aceves, Ivan; Ledesma, Sergio; Sierra-Hernandez, Juan Manuel; Velasco, Jonas; Avina-Cervantes, Juan Gabriel; Avila-Garcia, Maria Susana; Rostro-Gonzalez, Horacio; Rojas-Laguna, Roberto
2017-01-01
This paper presents a new method based on Estimation of Distribution Algorithms (EDAs) to detect parabolic shapes in synthetic and medical images. The method computes a virtual parabola using three random boundary pixels to calculate the constant values of the generic parabola equation. The resulting parabola is evaluated by matching it with the parabolic shape in the input image by using the Hadamard product as fitness function. This proposed method is evaluated in terms of computational time and compared with two implementations of the generalized Hough transform and RANSAC method for parabola detection. Experimental results show that the proposed method outperforms the comparative methods in terms of execution time about 93.61% on synthetic images and 89% on retinal fundus and human plantar arch images. In addition, experimental results have also shown that the proposed method can be highly suitable for different medical applications. PMID:28321264
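The two ingredients named in the abstract, a parabola fixed by three boundary pixels and a Hadamard-product fitness, can be sketched as follows. The abstract does not state which form of the parabola equation is used, so this sketch assumes the explicit form y = a*x^2 + b*x + c; the function names and the nested-list image masks are illustrative only.

```python
def parabola_through(p1, p2, p3):
    """Coefficients (a, b, c) of y = a*x^2 + b*x + c through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    denom = (x1 - x2) * (x1 - x3) * (x2 - x3)
    if denom == 0:
        raise ValueError("the three x coordinates must be distinct")
    a = (x3 * (y2 - y1) + x2 * (y1 - y3) + x1 * (y3 - y2)) / denom
    b = (x3 ** 2 * (y1 - y2) + x2 ** 2 * (y3 - y1) + x1 ** 2 * (y2 - y3)) / denom
    c = (x2 * x3 * (x2 - x3) * y1
         + x3 * x1 * (x3 - x1) * y2
         + x1 * x2 * (x1 - x2) * y3) / denom
    return a, b, c

def hadamard_fitness(candidate_mask, edge_mask):
    """Sum of the element-wise product of two equally sized 0/1 masks."""
    return sum(c * e
               for c_row, e_row in zip(candidate_mask, edge_mask)
               for c, e in zip(c_row, e_row))

coeffs = parabola_through((0, 1), (1, 0), (2, 3))   # points on y = 2x^2 - 3x + 1
score = hadamard_fitness([[1, 0], [1, 1]], [[1, 1], [0, 1]])
```

With binary masks, the fitness simply counts the pixels where the candidate curve and the detected edges coincide, so a higher score means a better match.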
Clustering of tethered satellite system simulation data by an adaptive neuro-fuzzy algorithm
NASA Technical Reports Server (NTRS)
Mitra, Sunanda; Pemmaraju, Surya
1992-01-01
Recent developments in neuro-fuzzy systems indicate that the concepts of adaptive pattern recognition, when used to identify appropriate control actions corresponding to clusters of patterns representing system states in dynamic nonlinear control systems, may result in innovative designs. A modular, unsupervised neural network architecture, in which fuzzy learning rules have been embedded, is used for on-line identification of similar states. The architecture and control rules involved in Adaptive Fuzzy Leader Clustering (AFLC) allow this system to be incorporated in control systems for identification of system states corresponding to specific control actions. We have used this algorithm to cluster the simulation data of the Tethered Satellite System (TSS) to estimate the range of delta voltages necessary to maintain the desired length rate of the tether. The AFLC algorithm is capable of on-line estimation of the appropriate control voltages from the corresponding length error and length rate error without a priori knowledge of their membership functions and without familiarity with the behavior of the Tethered Satellite System.
Detection and clustering of features in aerial images by neuron network-based algorithm
NASA Astrophysics Data System (ADS)
Vozenilek, Vit
2015-12-01
The paper presents an algorithm for detection and clustering of features in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general feature analysis and its use for clustering and backward projection of clusters onto the aerial image. The basis of the algorithm is a calculation of the total error of the network and a change of the network weights to minimize the error. A classic bipolar sigmoid was used as the activation function of the neurons, and the basic method of backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, a web application was built (ASP.NET on the Microsoft .NET platform). The main achievements include the finding that man-made objects in aerial images can be successfully identified by detection of shapes and anomalies. It was also found that an appropriate combination of comprehensive features that describe the colors and selected shapes of individual areas can be useful for image analysis.
Probability-changing cluster algorithm for two-dimensional XY and clock models
NASA Astrophysics Data System (ADS)
Tomita, Yusuke; Okabe, Yutaka
2002-05-01
We extend the newly proposed probability-changing cluster (PCC) Monte Carlo algorithm to the study of systems with the vector order parameter. Wolff's idea of the embedded cluster formalism is used for assigning clusters. The Kosterlitz-Thouless (KT) transitions for the two-dimensional (2D) XY and q-state clock models are studied by using the PCC algorithm. Combined with the finite-size scaling analysis based on the KT form of the correlation length, ξ ∝ exp(c/√(T/T_KT − 1)), we determine the KT transition temperature and the decay exponent η as T_KT = 0.8933(6) and η = 0.243(4) for the 2D XY model. We investigate two transitions of the KT type for the 2D q-state clock models with q = 6, 8, 12 and confirm the prediction of η = 4/q^2 at T_1, the low-temperature critical point between the ordered and XY-like phases, systematically.
Crowded Cluster Cores. Algorithms for Deblending in Dark Energy Survey Images
Zhang, Yuanyuan; McKay, Timothy A.; Bertin, Emmanuel; Jeltema, Tesla; Miller, Christopher J.; Rykoff, Eli; Song, Jeeseon
2015-10-26
Deep optical images are often crowded with overlapping objects. We found that this is especially true in the cores of galaxy clusters, where images of dozens of galaxies may lie atop one another. Accurate measurements of cluster properties require deblending algorithms designed to automatically extract a list of individual objects and decide what fraction of the light in each pixel comes from each object. In this article, we introduce a new software tool called the Gradient And Interpolation based (GAIN) deblender. GAIN is used as a secondary deblender to improve the separation of overlapping objects in galaxy cluster cores in Dark Energy Survey images. It uses image intensity gradients and an interpolation technique originally developed to correct flawed digital images. Our paper is dedicated to describing the algorithm of the GAIN deblender and its applications, but we additionally include modest tests of the software based on real Dark Energy Survey co-add images. GAIN helps to extract an unbiased photometry measurement for blended sources and improve detection completeness, while introducing few spurious detections. When applied to processed Dark Energy Survey data, GAIN serves as a useful quick fix when a high level of deblending is desired.
ERIC Educational Resources Information Center
Xu, Beijie; Recker, Mimi; Qi, Xiaojun; Flann, Nicholas; Ye, Lei
2013-01-01
This article examines clustering as an educational data mining method. In particular, two clustering algorithms, the widely used K-means and the model-based Latent Class Analysis, are compared, using usage data from an educational digital library service, the Instructional Architect (IA.usu.edu). Using a multi-faceted approach and multiple data…
An improved scheduling algorithm for 3D cluster rendering with platform LSF
NASA Astrophysics Data System (ADS)
Xu, Wenli; Zhu, Yi; Zhang, Liping
2013-10-01
High-quality photorealistic rendering of 3D models needs powerful computing systems. To meet this demand, highly efficient management of cluster resources has developed quickly. This paper aims to improve the efficiency of 3D rendering tasks in a cluster. It focuses on a dynamic feedback load balance (DFLB) algorithm, the working principle of the load sharing facility (LSF), and optimization of the external scheduler plug-in. The algorithm can be applied in the match and allocation phases of a scheduling cycle. Candidate hosts are prepared in sequence in the match phase, and the scheduler makes allocation decisions for each job in the allocation phase. With the dynamic mechanism, a new weight is assigned to each candidate host for rearrangement, and the most suitable host is dispatched for rendering. A new plug-in module implementing this algorithm has been designed and integrated into the internal scheduler. Simulation experiments demonstrate that the improved plug-in module is superior to the default one for rendering tasks. It can help avoid load imbalance among servers, increase system throughput and improve system utilization.
NASA Astrophysics Data System (ADS)
Barbarino, M.; Warrens, M.; Bonasera, A.; Lattuada, D.; Bang, W.; Quevedo, H. J.; Consoli, F.; de Angelis, R.; Andreoli, P.; Kimura, S.; Dyer, G.; Bernstein, A. C.; Hagel, K.; Barbui, M.; Schmidt, K.; Gaul, E.; Donovan, M. E.; Natowitz, J. B.; Ditmire, T.
2016-08-01
In this work, we explore the possibility that the motion of the deuterium ions emitted from Coulomb cluster explosions is highly disordered enough to resemble thermalization. We analyze the process of nuclear fusion reactions driven by laser-cluster interactions in experiments conducted at the Texas Petawatt laser facility using a mixture of D2+3He and CD4+3He cluster targets. When clusters explode by Coulomb repulsion, the emission of the energetic ions is “nearly” isotropic. In the framework of cluster Coulomb explosions, we analyze the energy distributions of the ions using a Maxwell-Boltzmann (MB) distribution, a shifted MB distribution (sMB), and the energy distribution derived from a log-normal (LN) size distribution of clusters. We show that the first two distributions reproduce well the experimentally measured ion energy distributions and the number of fusions from d-d and d-3He reactions. The LN distribution is a good representation of the ion kinetic energy distribution well up to high momenta where the noise becomes dominant, but overestimates both the neutron and the proton yields. If the parameters of the LN distributions are chosen to reproduce the fusion yields correctly, the experimentally measured high energy ion spectrum is not well represented. We conclude that the ion kinetic energy distribution is highly disordered and practically not distinguishable from a thermalized one.
NASA Astrophysics Data System (ADS)
Gilligan, Martin; Lamont, Gary B.; Peterson, Michael R.
2009-05-01
An important aspect of contemporary military communications is the design of robust image transforms for defense surveillance applications. In particular, efficient yet effective transfer of critical image information is required for decision making. The generic use of wavelets to transform an image is a standard transform approach. However, the resulting bandwidth requirements can be quite high, suggesting that a different bandwidth-limited transform be developed. Thus, our specific use of genetic algorithms (GAs) attempts to replace standard wavelet filter coefficients with an optimized transform filter in order to retain or improve image quality for bandwidth-restricted surveillance applications. To find improved coefficients efficiently, we have developed a software engineered distributed design employing a genetic algorithm (GA) parallel island model on small and large computational clusters with multi-core nodes. The main objective is to determine whether running a distributed GA with multiple islands would either give statistically equivalent results quicker or obtain better results in the same amount of time. In order to compare computational performance with our previous serial results, we evaluate the obtained "optimal" wavelet coefficients from both approaches on test images, which results in excellent comparative metric values.
Development of a Genetic Algorithm to Automate Clustering of a Dependency Structure Matrix
NASA Technical Reports Server (NTRS)
Rogers, James L.; Korte, John J.; Bilardo, Vincent J.
2006-01-01
Much technology assessment and organization design data exists in Microsoft Excel spreadsheets. Tools are needed to put this data into a form that can be used by design managers to make design decisions. One need is to cluster data that is highly coupled. Tools such as the Dependency Structure Matrix (DSM) and a Genetic Algorithm (GA) can be of great benefit. However, no tool currently combines the DSM and a GA to solve the clustering problem. This paper describes a new software tool that interfaces a GA written as an Excel macro with a DSM in spreadsheet format. The results of several test cases are included to demonstrate how well this new tool works.
Narrow-angle tail radio sources and the distribution of galaxy orbits in Abell clusters
NASA Technical Reports Server (NTRS)
O'Dea, Christopher P.; Sarazin, Craig L.; Owen, Frazer N.
1987-01-01
The present data on the orientations of the tails with respect to the cluster centers of a sample of 70 narrow-angle-tail (NAT) radio sources in Abell clusters show the distribution of tail angles to be inconsistent with purely radial or circular orbits in all the samples, while being consistent with isotropic orbits in (1) the whole sample, (2) the sample of NATs far from the cluster center, and (3) the samples of morphologically regular Abell clusters. Evidence for very radial orbits is found, however, in the sample of NATs near the cluster center. If these results can be generalized to all cluster galaxies, then the presence of radial orbits near the center of Abell clusters suggests that violent relaxation may not have been fully effective even within the cores of the regular clusters.
CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection
NASA Astrophysics Data System (ADS)
Ao, Sio-Iong
More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In the chapter, hierarchical clustering algorithms have been proposed for efficient tag SNP selection.
NASA Astrophysics Data System (ADS)
Wang, Deguang; Han, Baochang; Huang, Ming
Computer forensics is the application of computer technology to access, investigate and analyze the evidence of computer crime. It mainly includes the processes of determining and obtaining digital evidence, analyzing and extracting data, and filing and submitting results. Data analysis is the key link of computer forensics. Because of the complexity and fuzziness of real data, evidence analysis has had difficulty obtaining the desired results. This paper applies a fuzzy c-means clustering algorithm based on particle swarm optimization (FCMP) to computer forensics, and it can obtain more satisfactory results.
The temperature and size distribution of large water clusters from a non-equilibrium model
Gimelshein, N.; Gimelshein, S.; Pradzynski, C. C.; Zeuch, T.; Buck, U.
2015-06-28
A hybrid Lagrangian-Eulerian approach is used to examine the properties of water clusters formed in neon-water vapor mixtures expanding through microscale conical nozzles. Experimental size distributions were reliably determined by the sodium doping technique in a molecular beam machine. The comparison of computed size distributions and experimental data shows satisfactory agreement, especially for (H2O)n clusters with n larger than 50. Thus validated simulations provide size selected cluster temperature profiles in and outside the nozzle. This information is used for an in-depth analysis of the crystallization and water cluster aggregation dynamics of recently reported supersonic jet expansion experiments.
Agent-based method for distributed clustering of textual information
Potok, Thomas E [Oak Ridge, TN; Reed, Joel W [Knoxville, TN; Elmore, Mark T [Oak Ridge, TN; Treadwell, Jim N [Louisville, TN
2010-09-28
A computer method and system for storing, retrieving and displaying information has a multiplexing agent (20) that calculates a new document vector (25) for a new document (21) to be added to the system and transmits the new document vector (25) to master cluster agents (22) and cluster agents (23) for evaluation. These agents (22, 23) perform the evaluation and return values upstream to the multiplexing agent (20) based on the similarity of the document to documents stored under their control. The multiplexing agent (20) then sends the document (21) and the document vector (25) to the master cluster agent (22), which then forwards it to a cluster agent (23) or creates a new cluster agent (23) to manage the document (21). The system also searches for stored documents according to a search query having at least one term and identifying the documents found in the search, and displays the documents in a clustering display (80) of similarity so as to indicate similarity of the documents to each other.
Mustapha, Ibrahim; Mohd Ali, Borhanuddin; Rasid, Mohd Fadlee A; Sali, Aduwati; Mohamad, Hafizal
2015-08-13
It is well-known that clustering partitions network into logical groups of nodes in order to achieve energy efficiency and to enhance dynamic channel access in cognitive radio through cooperative sensing. While the topic of energy efficiency has been well investigated in conventional wireless sensor networks, the latter has not been extensively explored. In this paper, we propose a reinforcement learning-based spectrum-aware clustering algorithm that allows a member node to learn the energy and cooperative sensing costs for neighboring clusters to achieve an optimal solution. Each member node selects an optimal cluster that satisfies pairwise constraints, minimizes network energy consumption and enhances channel sensing performance through an exploration technique. We first model the network energy consumption and then determine the optimal number of clusters for the network. The problem of selecting an optimal cluster is formulated as a Markov Decision Process (MDP) in the algorithm and the obtained simulation results show convergence, learning and adaptability of the algorithm to dynamic environment towards achieving an optimal solution. Performance comparisons of our algorithm with the Groupwise Spectrum Aware (GWSA)-based algorithm in terms of Sum of Square Error (SSE), complexity, network energy consumption and probability of detection indicate improved performance from the proposed approach. The results further reveal that an energy savings of 9% and a significant Primary User (PU) detection improvement can be achieved with the proposed approach. PMID:26287191
Meanie3D - a mean-shift based, multivariate, multi-scale clustering and tracking algorithm
NASA Astrophysics Data System (ADS)
Simon, Jürgen-Lorenz; Malte, Diederich; Silke, Troemel
2014-05-01
Project OASE is one of five work groups at the HErZ (Hans Ertel Centre for Weather Research), an ongoing effort by the German weather service (DWD) to further research at universities concerning weather prediction. The goal of project OASE is to gain an object-based perspective on convective events by identifying them early in the onset of convective initiation and following them through the entire lifecycle. The ability to follow objects in this fashion requires new ways of object definition and tracking, which incorporate all the available data sets of interest, such as satellite imagery, weather radar or lightning counts. The Meanie3D algorithm provides the necessary tool for this purpose. Core features of this new approach to clustering (object identification) and tracking are the ability to identify objects using the mean-shift algorithm applied to a multitude of variables (multivariate), as well as the ability to detect objects on various scales (multi-scale) using elements of Scale-Space theory. The algorithm works in 2D as well as 3D without modifications. It is an extension of a method well known from the field of computer vision and image processing, which has been tailored to serve the needs of the meteorological community. Although a special application (convective initiation) is demonstrated here, the algorithm is easily tailored to provide clustering and tracking for a wide class of data sets and problems. In this talk, the demonstration is carried out on two of the OASE group's own composite sets. One is a 2D nationwide composite of Germany including C-Band radar (2D) and satellite information, the other a 3D local composite of the Bonn/Jülich area containing a high-resolution 3D X-Band radar composite.
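The mean-shift step at the core of this approach can be sketched as below. This is a generic Gaussian-kernel mean-shift under stated assumptions, not the Meanie3D implementation: it has no multi-scale handling and no tracking, and the function names, the fixed `bandwidth`, and the rule of merging modes closer than half a bandwidth are all illustrative choices.

```python
import math

def mean_shift_point(x, data, bandwidth, n_iter=200, tol=1e-6):
    """Move x uphill to a density mode by repeated kernel-weighted averaging."""
    for _ in range(n_iter):
        weights = [math.exp(-math.dist(x, p) ** 2 / (2.0 * bandwidth ** 2))
                   for p in data]
        total = sum(weights)
        new_x = tuple(sum(w * p[d] for w, p in zip(weights, data)) / total
                      for d in range(len(x)))
        if math.dist(x, new_x) < tol:
            return new_x
        x = new_x
    return x

def mean_shift_clusters(data, bandwidth):
    """Label each point by the mode it converges to (modes merged if close)."""
    modes, labels = [], []
    for p in data:
        mode = mean_shift_point(p, data, bandwidth)
        for k, known in enumerate(modes):
            if math.dist(mode, known) < bandwidth / 2.0:
                labels.append(k)
                break
        else:
            modes.append(mode)
            labels.append(len(modes) - 1)
    return modes, labels

# Each tuple may mix several variables, which is what makes the method multivariate.
samples = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 5.1), (5.1, 4.9)]
modes, labels = mean_shift_clusters(samples, bandwidth=1.0)
```

The number of clusters is not specified in advance; it follows from the bandwidth, which is why varying the bandwidth gives a natural handle on scale.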
A new algorithm to find earthquake clusters using neighboring cell connection and rate analysis.
NASA Astrophysics Data System (ADS)
Peng, W.; Toda, S.
2015-12-01
To study earthquake interaction, it is important to objectively find groups of earthquakes that occurred close together in space and time. Earthquake clusters are chosen with previous techniques that characterize them as mainshock-aftershock sequences or swarm sequences by empirical laws (e.g., Omori-Utsu; ETAS) or direct assumptions about physical processes such as stress transfer, transient stress loading, and fluid migration. Recent papers instead proposed non-parameterized techniques such as a kernel-based smoothing method. The cumulative rate clustering method (CURATE, Jacobs et al., 2013) is one of the approaches without any direct assumptions. The CURATE method was applied in New Zealand and provided a good result for selecting swarm sequences compared with the ETAS model. However, it is still difficult to choose a proper confined area and time interval for extracting sequences from the catalog. To avoid arbitrariness in space and time parameters, here we propose a new method modifying the CURATE approach. We first identify the spatial clusters by looking into the spatial distribution in a 2-D cell-gridded map. The spatial clusters are defined as multiple neighboring cells, each of which contains at least one earthquake in a time period T. From the selected spatial clusters, we then evaluate temporal clustering, which is defined as a transient increase of seismicity rate compared to the rate before the target event. We tested this method focusing on shallow crustal seismicity in northern Honshu, Japan. We chose the parameter range from T = 1 to 100 days and cell size = 0.01° to 0.1°. As a result, the number of clusters increases with longer T and larger cell size. By choosing T = 30 days and cell size = 0.05°, we successfully selected the long-lasting aftershock sequences associated with the 2004 M6.8 Chuetsu and 2007 M6.8 Chuetsu-oki earthquakes, while other empirical models and the CURATE method failed to decluster.
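The spatial step described above, connecting neighboring occupied cells, can be sketched as a connected-component search on a grid. This is an illustration under stated assumptions rather than the authors' code: events are taken to be (longitude, latitude) pairs already restricted to one time period T, cells touching at a corner count as neighbors, and the rate analysis is not included. The function name `spatial_clusters` is hypothetical.

```python
import math
from collections import deque

def spatial_clusters(events, cell_size):
    """Group event indices whose grid cells are connected (8-neighborhood)."""
    # Bin each event into an integer grid cell.
    cells = {}
    for idx, (lon, lat) in enumerate(events):
        key = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cells.setdefault(key, []).append(idx)

    seen = set()
    clusters = []
    for start in cells:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        # Breadth-first search over occupied neighboring cells.
        while queue:
            cx, cy = queue.popleft()
            members.extend(cells[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in cells and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(sorted(members))
    return clusters

events = [(0.2, 0.2), (0.7, 0.3), (1.2, 0.8), (5.1, 5.1)]
groups = spatial_clusters(events, cell_size=0.5)
```

A larger cell size makes it more likely that two occupied cells touch, which changes how events are grouped. This sketch also returns isolated single-cell groups, which a caller would filter out if only multi-cell clusters are wanted.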
Fong, Simon
2012-01-01
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. Voice classification, another application that plays an important role in grouping unlabelled voice samples, has however not been widely studied. Lately, voice classification has been found useful in phone monitoring, in classifying speakers' gender, ethnicity and emotional states, and so forth. In this paper, a collection of computational algorithms is proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time warp transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one generated synthetically and the other collected empirically from a past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm. PMID:22619492
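Assuming the time-warping component mentioned above refers to standard dynamic time warping (DTW), the distance it computes between two sequences of different length can be sketched as follows. This is the textbook dynamic-programming recurrence on scalar sequences, not the paper's pipeline; the name `dtw_distance` and the absolute-difference local cost are illustrative.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: a[i-1] repeats, b[j-1] repeats, both advance.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

A matrix of such pairwise distances is the kind of input a hierarchical clustering step can work from, since it tolerates samples spoken at different speeds.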
Modelling Paleoearthquake Slip Distributions using a Genetic Algorithm
NASA Astrophysics Data System (ADS)
Lindsay, Anthony; Simão, Nuno; McCloskey, John; Nalbant, Suleyman; Murphy, Shane; Bhloscaidh, Mairead Nic
2013-04-01
Along the Sunda trench, the annual growth rings of coral microatolls store long term records of tectonic deformation. Spread over large areas of an active megathrust fault, they offer the possibility of high resolution reconstructions of slip for a number of paleo-earthquakes. These data are complex, with spatial and temporal variations in uncertainty. Rather than assuming that any one model will uniquely fit the data, Monte Carlo Slip Estimation (MCSE) modelling produces a catalogue of possible models for each event. From each earthquake's catalogue, a model is selected and a possible history of slip along the fault reconstructed. By generating multiple histories, then finding the average slip during each earthquake, a probabilistic history of slip along the fault can be generated and areas that may have a large slip deficit identified. However, the MCSE technique requires the production of many hundreds of billions of models to yield the few models that fit the observed coral data. In an attempt to accelerate this process, we have designed a Genetic Algorithm (GA). The GA uses evolutionary operators to recombine the information held by a population of possible slip models to produce a set of new models, based on how well they reproduce a set of coral deformation data. Repeated iterations of the algorithm produce populations of improved models, each generation better satisfying the coral data. Preliminary results have shown the GA to be capable of recovering synthetically generated slip distributions, based on their displacements of sets of corals, faster than the MCSE technique. The results of the systematic testing of the GA technique and its performance using both synthetic and observed coral displacement data will be presented.
MSClust: A Multi-Seeds Based Clustering Algorithm for microbiome profiling using 16S rRNA Sequence
Chen, Wei; Cheng, Yongmei; Zhang, Clarence; Zhang, Shaowu; Zhao, Hongyu
2013-01-01
Recent developments of next generation sequencing technologies have led to rapid accumulation of 16S rRNA sequences for microbiome profiling. One key step in data processing is to cluster short sequences into operational taxonomic units (OTUs). Although many methods have been proposed for OTU inferences, a major challenge is the balance between inference accuracy and computational efficiency, where inference accuracy is often sacrificed to accommodate the need to analyze large numbers of sequences. Inspired by the hierarchical clustering method and a modified greedy network clustering algorithm, we propose a novel multi-seeds based heuristic clustering method, named MSClust, for OTU inference. MSClust first adaptively selects multi-seeds instead of one seed for each candidate cluster, and the reads are then processed using a greedy clustering strategy. Through many numerical examples, we demonstrate that MSClust uses less memory and achieves better biological accuracy than existing heuristic clustering methods while preserving efficiency and scalability. PMID:23899776
NASA Astrophysics Data System (ADS)
Wagstaff, Kiri L.
2012-03-01
clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, x_i ∈ R^D. The clustering goal is to produce an organization P of the items in X that optimizes an objective function f : P → R, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure d : X × X → R of the distance between two items. A partitioning algorithm produces a set of clusters P = {c_1, ..., c_k} such that the clusters are nonoverlapping (c_i ∩ c_j = ∅ for i ≠ j) subsets of the data set (∪_i c_i = X). Hierarchical algorithms produce a series of partitions P = {p_1, ..., p_n'}. For a complete hierarchy, the number of partitions n' = n, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster c_j is represented by a model m_j, such as the cluster center or a Gaussian distribution. The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a
On the distribution of dark matter in clusters of galaxies
NASA Astrophysics Data System (ADS)
Sand, David J.
2006-07-01
The goal of this thesis is to provide constraints on the dark matter density profile in galaxy clusters by developing and combining different techniques. The work is motivated by the fact that a precise measurement of the logarithmic slope of the dark matter on small scales provides a powerful test of the Cold Dark Matter paradigm for structure formation, where numerical simulations suggest a density profile ρ_DM ∝ r^-1 or steeper in the innermost regions. We have obtained deep spectroscopy of gravitational arcs and the dominant brightest cluster galaxy in six carefully chosen galaxy clusters. Three of the clusters have both radial and tangential gravitational arcs while the other three display only tangential arcs. We analyze the stellar velocity dispersion for the brightest cluster galaxies in conjunction with axially symmetric lens models to jointly constrain the dark and baryonic mass profiles. For the radial arc systems we find the inner dark matter density profile is consistent with ρ_DM ∝ r^-β, with ⟨β⟩ = [Special characters omitted.] (68% CL). Likewise, an upper limit on β for the tangential arc sample is found to be β < 0.57 (99% CL). We study a variety of possible systematic uncertainties, including the consequences of our one-dimensional mass model, fixed dark matter scale radius, and simple velocity dispersion analysis, and conclude that each of these systematics contributes at most Δβ ~ 0.2 to our final conclusions. These results suggest the relationship between dark and baryonic matter in cluster cores is more complex than anticipated from dark-matter-only simulations. Recognizing the power of our technique, we have performed a systematic search of the Hubble Space Telescope Wide Field and Planetary Camera 2 data archive for further examples of systems containing tangential and radial gravitational arcs. We carefully examined 128 galaxy cluster cores and found 104 tangential arcs and 12
Studies of Entropy Distributions in X-ray Luminous Clusters of Galaxies
NASA Astrophysics Data System (ADS)
Cavagnolo, K. W.; Donahue, M. E.; Voit, G. M.; Sun, M.; Evrard, A. E.
2005-12-01
We present entropy distributions for a sample of galaxy clusters from the Chandra public archive, which builds on our previous analysis of nine nearby, bright clusters. By studying the entropy distribution within clusters we quantify the effect of radiative cooling, supernova feedback, and AGN feedback on cluster properties. This expanded sample contains both cooling flow and non-cooling flow clusters, while our previous work focused only on classical cooling flow clusters. We also test the predictions of Mathiesen and Evrard (2001) by checking whether the spectral fit temperature is an unbiased estimate of the mass-weighted temperature, and how this estimate affects the calculation of the intracluster medium mass. Temperature and entropy maps for the clusters in our sample, made using the Voronoi tessellation method as employed by Statler et al. (in preparation), will also be presented. These maps serve as a prelude to future work in which we will investigate how well such maps may represent the "true" projected quantities of a cluster by comparing deprojected real and simulated clusters from our sample and the Virtual Cluster Exploratory, respectively. Our discussion focuses on tying together feedback mechanisms with the breaking of self-similar relations expected in cluster and galaxy formation models.
Chen, Deng-kai; Gu, Rong; Gu, Yu-feng; Yu, Sui-huai
2016-01-01
Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design. PMID:27630709
Yang, Yan-Pu; Chen, Deng-Kai; Gu, Rong; Gu, Yu-Feng; Yu, Sui-Huai
2016-01-01
Consumers' Kansei needs reflect their perception about a product and always consist of a large number of adjectives. Reducing the dimension complexity of these needs to extract primary words not only enables the target product to be explicitly positioned, but also provides a convenient design basis for designers engaging in design work. Accordingly, this study employs a numerical design structure matrix (NDSM) by parameterizing a conventional DSM and integrating genetic algorithms to find optimum Kansei clusters. A four-point scale method is applied to assign link weights of every two Kansei adjectives as values of cells when constructing an NDSM. Genetic algorithms are used to cluster the Kansei NDSM and find optimum clusters. Furthermore, the process of the proposed method is presented. The details of the proposed approach are illustrated using an example of electronic scooter for Kansei needs clustering. The case study reveals that the proposed method is promising for clustering Kansei needs adjectives in product emotional design.
piClust: a density based piRNA clustering algorithm.
Jung, Inuk; Park, Jong Chan; Kim, Sun
2014-06-01
Piwi-interacting RNAs (piRNAs) are recently discovered, endogenous small non-coding RNAs. piRNAs protect the genome from invasive transposable elements (TE) and sustain the integrity of the genome in germ cell lineages. Small RNA-sequencing data can be used to detect piRNA activations in a cell under a specific condition. However, identification of cell-specific piRNA activations requires sophisticated computational methods. As of now, there is only one computational method, proTRAC, to locate activated piRNAs from the sequencing data. proTRAC detects piRNA clusters based on a probabilistic analysis that assumes a uniform distribution. Unfortunately, we were not able to locate activated piRNAs from our proprietary sequencing data in chicken germ cells using proTRAC. After a careful investigation of the data sets, we found that neither a uniform nor any other statistical distribution can be assumed when detecting piRNA clusters. Furthermore, small RNA-seq data contain many different types of RNAs, which were not carefully taken into account in previous studies. To improve piRNA cluster identification, we developed piClust, which uses a density-based clustering approach without assuming any parametric distribution. Previous studies have shown that piRNAs exhibit a strong tendency to form piRNA clusters in syntenic regions of the genome. Thus, the density-based clustering approach is effective and robust to the existence of non-piRNAs or noise in the data. In experiments with piRNA data from human, mouse, rat and chicken, piClust was able to detect piRNA clusters from total small RNA-seq data from germ cell lines, while proTRAC was not successful. piClust outperformed proTRAC in terms of sensitivity and running time (up to 200-fold). piClust is currently available as a web service at http://epigenomics.snu.ac.kr/piclustweb.
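The idea of finding clusters from local read density, without fitting any parametric distribution, can be illustrated with a much-simplified one-dimensional sketch in Python. This is not the piClust algorithm itself: the function `density_clusters`, its parameters `eps` (largest allowed gap between neighbouring reads) and `min_reads` (smallest group kept as a cluster), and the read positions are all hypothetical.

```python
def density_clusters(positions, eps, min_reads):
    """Group sorted genomic positions into dense runs.

    Neighbouring reads at most `eps` bases apart join the same run; runs with
    fewer than `min_reads` reads are discarded as noise.
    Returns a list of (start, end, read_count) tuples.
    """
    clusters, current = [], []
    for p in sorted(positions):
        if current and p - current[-1] > eps:
            # Gap too large: close the current run and keep it only if dense enough.
            if len(current) >= min_reads:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(p)
    if len(current) >= min_reads:
        clusters.append((current[0], current[-1], len(current)))
    return clusters

# Hypothetical read start positions on one chromosome.
reads = [100, 105, 110, 112, 5000, 9000, 9003, 9010, 9011, 9020]
found = density_clusters(reads, eps=20, min_reads=4)
print(found)  # → [(100, 112, 4), (9000, 9020, 5)]
```

The isolated read at position 5000 is dropped as noise, which is the behaviour the abstract attributes to density-based approaches in general.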
Rodríguez-Ramilo, Silvia T; Wang, Jinliang
2012-09-01
The inference of population genetic structures is essential in many research areas in population genetics, conservation biology and evolutionary biology. Recently, unsupervised Bayesian clustering algorithms have been developed to detect a hidden population structure from genotypic data, assuming among others that individuals taken from the population are unrelated. Under this assumption, markers in a sample taken from a subpopulation can be considered to be in Hardy-Weinberg and linkage equilibrium. However, close relatives might be sampled from the same subpopulation, and consequently, might cause Hardy-Weinberg and linkage disequilibrium and thus bias a population genetic structure analysis. In this study, we used simulated and real data to investigate the impact of close relatives in a sample on Bayesian population structure analysis. We also showed that, when close relatives were identified by a pedigree reconstruction approach and removed, the accuracy of a population genetic structure analysis can be greatly improved. The results indicate that unsupervised Bayesian clustering algorithms cannot be used blindly to detect genetic structure in a sample with closely related individuals. Rather, when closely related individuals are suspected to be frequent in a sample, these individuals should be first identified and removed before conducting a population structure analysis.
Ma, Tao; Wang, Fen; Cheng, Jianjun; Yu, Yang; Chen, Xiaoyun
2016-01-01
The development of intrusion detection systems (IDS) that are adapted to allow routers and network defence systems to detect malicious network traffic disguised as network protocols or normal access is a critical challenge. This paper proposes a novel approach called SCDNN, which combines spectral clustering (SC) and deep neural network (DNN) algorithms. First, the dataset is divided into k subsets based on sample similarity using cluster centres, as in SC. Next, the distance between data points in a testing set and the training set is measured based on similarity features and is fed into the deep neural network algorithm for intrusion detection. Six KDD-Cup99 and NSL-KDD datasets and a sensor network dataset were employed to test the performance of the model. The experimental results indicate that the SCDNN classifier performs better than backpropagation neural network (BPNN), support vector machine (SVM), random forest (RF) and Bayes tree models in both detection accuracy and the types of abnormal attacks found. It also provides an effective tool for the study and analysis of intrusion detection in large networks. PMID:27754380
KANTS: a stigmergic ant algorithm for cluster analysis and swarm art.
Fernandes, Carlos M; Mora, Antonio M; Merelo, Juan J; Rosa, Agostinho C
2014-06-01
KANTS is a swarm intelligence clustering algorithm inspired by the behavior of social insects. It uses stigmergy as a strategy for clustering large datasets and, as a result, displays a typical behavior of complex systems: self-organization and global patterns emerging from the local interaction of simple units. This paper introduces a simplified version of KANTS and describes recent experiments with the algorithm in the context of a contemporary artistic and scientific trend called swarm art, a type of generative art in which swarm intelligence systems are used to create artwork or ornamental objects. KANTS is used here for generating color drawings from the input data that represent real-world phenomena, such as electroencephalogram sleep data. However, the main proposal of this paper is an art project based on well-known abstract paintings, from which the chromatic values are extracted and used as input. Colors and shapes are therefore reorganized by KANTS, which generates its own interpretation of the original artworks. The project won the 2012 Evolutionary Art, Design, and Creativity Competition.
A contiguity-enhanced k-means clustering algorithm for unsupervised multispectral image segmentation
Theiler, J.; Gisler, G.
1997-07-01
The recent and continuing construction of multispectral and hyperspectral imagers will provide detailed data cubes with information in both the spatial and spectral domains. These data show great promise for remote sensing applications ranging from environmental and agricultural to national security interests. The reduction of this voluminous data to useful intermediate forms is necessary both for downlinking all those bits and for interpreting them. Smart onboard hardware is required, as well as sophisticated earth-bound processing. A segmented image (in which the multispectral data in each pixel is classified into one of a small number of categories) is one kind of intermediate form which provides some measure of data compression. Traditional image segmentation algorithms treat pixels independently and cluster the pixels according only to their spectral information. This neglects the implicit spatial information that is available in the image. We suggest a simple approach: a variant of the standard k-means algorithm which uses both spatial and spectral properties of the image. The segmented image has the property that pixels which are spatially contiguous are more likely to be in the same class than are random pairs of pixels. This property naturally comes at some cost in terms of the compactness of the clusters in the spectral domain, but we have found that the spatial contiguity and spectral compactness properties are nearly orthogonal, which means that we can make considerable improvements in the one with minimal loss in the other.
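One simple way to let k-means see both spatial and spectral information is to append weighted pixel coordinates to each pixel's spectral vector before clustering. The Python sketch below illustrates that idea only; it is an assumption and not necessarily the variant used by the authors. The names `kmeans`, `segment`, `seeds`, and `spatial_weight`, and the one-row test image, are hypothetical.

```python
from math import dist

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations from the given initial centers; returns labels."""
    centers = list(centers)
    labels = []
    for _ in range(iters):
        # Assignment step: nearest center for every point.
        labels = [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def segment(image, seeds, spatial_weight):
    """image: rows of spectral tuples. Appends weighted (row, col) to each pixel."""
    width = len(image[0])
    points = [tuple(px) + (spatial_weight * r, spatial_weight * c)
              for r, row in enumerate(image) for c, px in enumerate(row)]
    centers = [points[r * width + c] for r, c in seeds]  # seed pixels as initial centers
    labels = kmeans(points, centers)
    return [labels[r * width:(r + 1) * width] for r in range(len(image))]

# Hypothetical single-band image: one row, with one isolated bright and one isolated dark pixel.
row = [[(0.0,), (0.0,), (1.0,), (0.0,), (0.0,), (1.0,), (1.0,), (1.0,)]]
spectral_only = segment(row, seeds=[(0, 0), (0, 7)], spatial_weight=0.0)
with_contiguity = segment(row, seeds=[(0, 0), (0, 7)], spatial_weight=1.0)
print(spectral_only)    # → [[0, 0, 1, 0, 0, 1, 1, 1]]
print(with_contiguity)  # → [[0, 0, 0, 0, 1, 1, 1, 1]]
```

With zero spatial weight the labels follow the spectral values alone. With a nonzero weight the two isolated pixels are absorbed into their neighbourhoods, trading some spectral compactness for contiguity, which is the trade-off the abstract describes.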
Smith, Edward M; Littrell, Jack; Olivier, Michael
2007-12-01
High-throughput SNP genotyping platforms use automated genotype calling algorithms to assign genotypes. While these algorithms work efficiently for individual platforms, they are not compatible with other platforms, and have individual biases that result in missed genotype calls. Here we present data on the use of a second complementary SNP genotype clustering algorithm. The algorithm was originally designed for individual fluorescent SNP genotyping assays, and has been optimized to permit the clustering of large datasets generated from custom-designed Affymetrix SNP panels. In an analysis of data from a 3K array genotyped on 1,560 samples, the additional analysis increased the overall number of genotypes by over 45,000, significantly improving the completeness of the experimental data. This analysis suggests that the use of multiple genotype calling algorithms may be advisable in high-throughput SNP genotyping experiments. The software is written in Perl and is available from the corresponding author.
NASA Astrophysics Data System (ADS)
Cazade, Pierre-André; Zheng, Wenwei; Prada-Gracia, Diego; Berezovska, Ganna; Rao, Francesco; Clementi, Cecilia; Meuwly, Markus
2015-01-01
The ligand migration network for O2-diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k-means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. Also, for most of the states, their population coincides quite favourably, whereas the kinetics of and between the states differs. One of the major differences between k-means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site which points to its potential role as a hub in the network. This is also highlighted in the fact that LSDMap cannot identify this state. First passage time distributions from MCL clusterings using a one- (ligand-position) and two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand- and protein-motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion and highlight that depending on the questions at hand the best-performing method for a particular data set may differ.
Palma, G; Zambrano, D
2008-12-01
In this paper we propose a method to study critical systems numerically, which combines collective-mode algorithms and renormalization group on the lattice. This method is an improved version of the Monte Carlo renormalization group in the sense that it has all the advantages of cluster algorithms. As an application we considered the 2D Ising model and studied whether scale invariance or universality are possible underlying mechanisms responsible for the approximate "universal fluctuations" close to a so-called bulk temperature T(L). "Universal fluctuations" were first proposed in the work of Bramwell, Holdsworth, and Pinton [Nature (London) 396, 552 (1998)], who stated that the probability density function (PDF) of a global quantity for very dissimilar systems, such as a confined turbulent flow and a two-dimensional (2D) magnetic system, properly normalized to the first two moments, becomes similar to the "universal distribution" originally obtained for magnetization in the 2D XY model in the low-temperature region. The results for the critical exponents and the renormalization-group flow of the probability density function are very accurate and show no evidence that the approximate common shape of the PDF is related to either scale invariance or universal behavior.
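For readers unfamiliar with cluster algorithms, the following Python sketch shows one update of the well-known Wolff single-cluster algorithm for the 2D Ising model with coupling J = 1 and periodic boundaries. It is offered only as a representative cluster move; the abstract does not say which collective-mode algorithm the authors used, and the function name `wolff_step` and the lattice size are assumptions.

```python
import math
import random

def wolff_step(spins, beta, rng):
    """Grow one Wolff cluster on an L x L lattice of +1/-1 spins and flip it.

    Aligned neighbours are added with probability 1 - exp(-2 * beta).
    Returns the size of the flipped cluster.
    """
    L = len(spins)
    p_add = 1.0 - math.exp(-2.0 * beta)
    i, j = rng.randrange(L), rng.randrange(L)
    seed_spin = spins[i][j]
    cluster = {(i, j)}
    stack = [(i, j)]
    while stack:
        x, y = stack.pop()
        neighbours = [((x + 1) % L, y), ((x - 1) % L, y),
                      (x, (y + 1) % L), (x, (y - 1) % L)]
        for nx, ny in neighbours:
            if ((nx, ny) not in cluster and spins[nx][ny] == seed_spin
                    and rng.random() < p_add):
                cluster.add((nx, ny))
                stack.append((nx, ny))
    for x, y in cluster:          # flip the whole cluster at once
        spins[x][y] = -seed_spin
    return len(cluster)

rng = random.Random(1)
lattice = [[1] * 8 for _ in range(8)]
beta_c = 0.5 * math.log(1.0 + math.sqrt(2.0))   # exact 2D Ising critical coupling
sizes = [wolff_step(lattice, beta_c, rng) for _ in range(100)]
print(sum(sizes) / len(sizes))                   # mean cluster size over 100 updates
```

Because whole clusters are flipped in one move, such updates decorrelate configurations near the critical point far faster than single-spin flips, which is the advantage the abstract refers to.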
Crawford, Steven M.; Wirth, Gregory D.; Bershady, Matthew A.
2014-05-01
We analyze the spatial and velocity distributions of confirmed members in five massive clusters of galaxies at intermediate redshift (0.5 < z < 0.9) to investigate the physical processes driving galaxy evolution. Based on spectral classifications derived from broad- and narrow-band photometry, we define four distinct galaxy populations representing different evolutionary stages: red sequence (RS) galaxies, blue cloud (BC) galaxies, green valley (GV) galaxies, and luminous compact blue galaxies (LCBGs). For each galaxy class, we derive the projected spatial and velocity distribution and characterize the degree of subclustering. We find that RS, BC, and GV galaxies in these clusters have similar velocity distributions, but that BC and GV galaxies tend to avoid the core of the two z ≈ 0.55 clusters. GV galaxies exhibit subclustering properties similar to RS galaxies, but their radial velocity distribution is significantly platykurtic compared to the RS galaxies. The absence of GV galaxies in the cluster cores may explain their somewhat prolonged star-formation history. The LCBGs appear to have recently fallen into the cluster based on their larger velocity dispersion, absence from the cores of the clusters, and different radial velocity distribution than the RS galaxies. Both LCBG and BC galaxies show a high degree of subclustering on the smallest scales, leading us to conclude that star formation is likely triggered by galaxy-galaxy interactions during infall into the cluster.
Oña, Ofelia B.; Ferraro, Marta B.; Facelli, Julio C.
2010-01-01
The characterization and prediction of the structures of metal silicon clusters is important for nanotechnology research because these clusters can be used as building blocks for nano devices, integrated circuits and solar cells. Several authors have postulated that there is a transition from exo to endo absorption of Cu in Si_n clusters and showed that for n larger than 9 it is possible to find endohedral clusters. Unfortunately, no global searches have confirmed this observation, which is based on local optimizations of plausible structures. Here we use parallel Genetic Algorithms (GA), as implemented in our MGAC software, directly coupled with DFT energy calculations to show that the global search of CuSi_n cluster structures does not find endohedral clusters for n < 8 but finds them for n ≥ 10. PMID:21785526
NASA Astrophysics Data System (ADS)
Cherba, David M.
A set of Genetic Algorithm (GA) operators based on spatial location concepts will provide improved performance for a class of NP-hard search problems in N-dimensional spaces. A set of spatial operators for use with genetic algorithms is proposed for a class of problems with real-valued genes that consist of N-dimensional homogeneous vectors. Evolutionary computation is capable of providing solutions to problems that would be intractable using more conventional methods. A subset of these problems is represented in real-valued three-dimensional spaces or other more complex vector spaces. This thesis addresses a number of issues related to the natural influences that adjacent locations in these spaces have on the fitness functions used in genetic algorithms. A subset of building blocks (schema) will be utilized based on these natural influences. It will be shown that these operators can be described by a building-block style of theory that supports the experimental results. Further, the spatially based operators naturally preserve the interactions between genes for this class of problems. Genes have a natural influence on each other based on proximity. To be effective, the operators of a genetic algorithm need to take these proximity effects into consideration in order to preserve good contributions to fitness. Failure to utilize these spatial relationships will lead to very poor performance of the genetic algorithm or require statistical methods to try to capture the relationships. As a demonstration of these spatial operators, this dissertation will focus on the conformation of molecular clusters, where each atom's location represents a gene with real-valued coordinates. Further, the algorithm presented will work from unlabeled distance information available from experiments with limited preparation. A set of theories will be presented that form the basis for prediction of operator effectiveness, population size and convergence for this class of problems. The theory will be
A bio-inspired cooperative algorithm for distributed source localization with mobile nodes.
Khalili, Azam; Rastegarnia, Amir; Islam, Md Kafiul; Yang, Zhi
2013-01-01
In this paper we propose an algorithm for distributed optimization in mobile nodes. Compared with many published works, an important consideration here is that the nodes do not know the cost function beforehand. Instead of decision-making based on linear combination of the neighbor estimates, the proposed algorithm relies on information-rich nodes that are iteratively identified. To quickly find these nodes, the algorithm adopts a larger step size during the initial iterations. The proposed algorithm can be used in many different applications, such as distributed odor source localization and mobile robots. Comparative simulation results are presented to support the proposed algorithm.
Page, Andrew J; Keane, Thomas M; Naughton, Thomas J
2010-07-01
We present a multi-heuristic evolutionary task allocation algorithm to dynamically map tasks to processors in a heterogeneous distributed system. It utilizes a genetic algorithm, combined with eight common heuristics, in an effort to minimize the total execution time. It operates on batches of unmapped tasks and can preemptively remap tasks to processors. The algorithm has been implemented on a Java distributed system and evaluated with a set of six problems from the areas of bioinformatics, biomedical engineering, computer science and cryptography. Experiments using up to 150 heterogeneous processors show that the algorithm achieves better efficiency than other state-of-the-art heuristic algorithms.
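To make the task-mapping problem concrete, the Python sketch below implements min-min, a common list-scheduling heuristic for assigning a batch of tasks to heterogeneous processors. The abstract does not name its eight heuristics, so the use of min-min here is an assumption, as are the function name `min_min` and the run-time matrix `exec_time`.

```python
def min_min(exec_time):
    """Map a batch of tasks to processors with the min-min heuristic.

    exec_time[t][p] is the run time of task t on processor p.
    Returns (mapping, makespan), where mapping[t] is the chosen processor.
    """
    n_proc = len(exec_time[0])
    ready = [0.0] * n_proc                 # time at which each processor becomes free
    unmapped = set(range(len(exec_time)))
    mapping = {}
    while unmapped:
        # Among all unmapped tasks, pick the (task, processor) pair that
        # finishes earliest, then commit that assignment.
        finish, t, p = min((ready[p] + exec_time[t][p], t, p)
                           for t in unmapped for p in range(n_proc))
        mapping[t] = p
        ready[p] = finish
        unmapped.remove(t)
    return mapping, max(ready)

# Hypothetical batch: 3 tasks, 2 heterogeneous processors.
exec_time = [[3, 5],
             [2, 4],
             [6, 1]]
mapping, makespan = min_min(exec_time)
print(mapping, makespan)
```

In an evolutionary scheme of the kind the abstract describes, mappings produced by heuristics like this one could serve as candidate solutions that a genetic algorithm then refines.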
Liu, Liqiang; Dai, Yuntao; Gao, Jinyu
2014-01-01
Ant colony optimization for continuous domains is a major research direction for ant colony optimization algorithms. In this paper, we propose a distribution model of ant colony foraging, based on an analysis of the relationship between the position distribution and the food source during foraging. We design a continuous-domain optimization algorithm based on the model and give the form of the solution, the distribution model of pheromone, the update rules for ant colony position, and the method for handling constraint conditions. The algorithm's performance was tested on a set of unconstrained optimization test functions and a further set of optimization test functions, and the results were compared with those of other algorithms to verify the correctness and effectiveness of the proposed algorithm.
NASA Astrophysics Data System (ADS)
Sakaguchi, Hidetsugu; Maeyama, Satomi
2013-02-01
A model of clustering dynamics is proposed for a population of spatially distributed active rotators. A transition from excitable to oscillatory dynamics is induced by the increase of the local density of active rotators. It is interpreted as dynamical quorum sensing. In the oscillation regime, phase waves propagate without decay, which generates an effectively long-range interaction in the clustering dynamics. The clustering process becomes facilitated and only one dominant cluster appears rapidly as a result of the dynamical quorum sensing. An exact localized solution is found to a simplified model equation, and the competitive dynamics between two localized states is studied numerically.
Discovery of a widely distributed toxin biosynthetic gene cluster
Lee, Shaun W.; Mitchell, Douglas A.; Markley, Andrew L.; Hensler, Mary E.; Gonzalez, David; Wohlrab, Aaron; Dorrestein, Pieter C.; Nizet, Victor; Dixon, Jack E.
2008-01-01
Bacteriocins represent a large family of ribosomally produced peptide antibiotics. Here we describe the discovery of a widely conserved biosynthetic gene cluster for the synthesis of thiazole and oxazole heterocycles on ribosomally produced peptides. These clusters encode a toxin precursor and all necessary proteins for toxin maturation and export. Using the toxin precursor peptide and heterocycle-forming synthetase proteins from the human pathogen Streptococcus pyogenes, we demonstrate the in vitro reconstitution of streptolysin S activity. We provide evidence that the synthetase enzymes, as predicted from our bioinformatics analysis, introduce heterocycles onto precursor peptides, thereby providing molecular insight into the chemical structure of streptolysin S. Furthermore, our studies reveal that the synthetase exhibits relaxed substrate specificity and modifies toxin precursors from both related and distant species. Given our findings, it is likely that the discovery of similar peptidic toxins will rapidly expand to existing and emerging genomes. PMID:18375757
Reconciling fault-tolerant distributed algorithms and real-time computing.
Moser, Heinrich; Schmid, Ulrich
We present generic transformations that allow one to translate classic fault-tolerant distributed algorithms and their correctness proofs into a real-time distributed computing model (and vice versa). Owing to the non-zero-time, non-preemptible state transitions employed in our real-time model, scheduling and queuing effects (which are inherently abstracted away in classic zero step-time models, sometimes leading to overly optimistic time complexity results) can be accurately modeled. Our results thus make fault-tolerant distributed algorithms amenable to a sound real-time analysis, without sacrificing the wealth of algorithms and correctness proofs established in classic distributed computing research. By means of an example, we demonstrate that real-time algorithms generated by transforming classic algorithms can be competitive even with respect to optimal real-time algorithms, despite their comparatively simple real-time analysis.
On Distribution Reduction and Algorithm Implementation in Inconsistent Ordered Information Systems
Zhang, Yanqin
2014-01-01
As one part of our work on ordered information systems, distribution reduction is studied in inconsistent ordered information systems (OISs). Some important properties of distribution reduction are studied and discussed. The dominance matrix is restated for reduction acquisition in information systems based on dominance relations. A matrix algorithm for distribution reduction acquisition is set out step by step, and a program implementing the algorithm is given. The approach provides an effective tool for theoretical research on ordered information systems and for their application in practice. For more detailed and valid illustration, cases are employed to explain and verify the algorithm and the program, which shows the effectiveness of the algorithm in complicated information systems. PMID:25258721
NASA Astrophysics Data System (ADS)
Bitter, Ingmar; Brown, John E.; Brickman, Daniel; Summers, Ronald M.
2004-04-01
The presented method significantly reduces the time necessary to validate a computed tomographic colonography (CTC) computer aided detection (CAD) algorithm of colonic polyps applied to a large patient database. As the algorithm is being developed on Windows PCs and our target, a Beowulf cluster, is running on Linux PCs, we made the application dual platform compatible using a single source code tree. To maintain, share, and deploy source code, we used CVS (concurrent versions system) software. We built the libraries from their sources for each operating system. Next, we made the CTC CAD algorithm dual-platform compatible and validated that both Windows and Linux produced the same results. Eliminating system dependencies was mostly achieved using the Qt programming library, which encapsulates most of the system dependent functionality in order to present the same interface on either platform. Finally, we wrote scripts to execute the CTC CAD algorithm in parallel. Running hundreds of simultaneous copies of the CTC CAD algorithm on a Beowulf cluster computing network enables execution in less than four hours on our entire collection of over 2400 CT scans, as compared to a month on a single PC. As a consequence, our complete patient database can be processed daily, boosting research productivity. Large scale validation of a computer aided polyp detection algorithm for CT colonography using cluster computing significantly improves the round trip time of algorithm improvement and revalidation.
A uniform metal distribution in the intergalactic medium of the Perseus cluster of galaxies.
Werner, Norbert; Urban, Ondrej; Simionescu, Aurora; Allen, Steven W
2013-10-31
Most of the metals (elements heavier than helium) produced by stars in the member galaxies of clusters currently reside within the hot, X-ray-emitting intra-cluster gas. Observations of X-ray line emission from this intergalactic medium have suggested a relatively small cluster-to-cluster scatter outside the cluster centres and enrichment with iron out to large radii, leading to the idea that the metal enrichment occurred early in the history of the Universe. Models with early enrichment predict a uniform metal distribution at large radii in clusters, whereas those with late-time enrichment are expected to introduce significant spatial variations of the metallicity. To discriminate clearly between these competing models, it is essential to test for potential inhomogeneities by measuring the abundances out to large radii along multiple directions in clusters, which has not hitherto been done. Here we report a remarkably uniform iron abundance, as a function of radius and azimuth, that is statistically consistent with a constant value of ZFe = 0.306 ± 0.012 in solar units out to the edge of the nearby Perseus cluster. This homogeneous distribution requires that most of the metal enrichment of the intergalactic medium occurred before the cluster formed, probably more than ten billion years ago, during the period of maximal star formation and black hole activity.
The mass distribution of the strong lensing cluster SDSS J1531+3414
Sharon, Keren; Johnson, Traci L.; Gladders, Michael D.; Rigby, Jane R.; Wuyts, Eva; Bayliss, Matthew B.; Florian, Michael K.; Dahle, Håkon
2014-11-01
We present the mass distribution at the core of SDSS J1531+3414, a strong-lensing cluster at z = 0.335. We find that the mass distribution is well described by two cluster-scale halos with a contribution from cluster-member galaxies. New Hubble Space Telescope observations of SDSS J1531+3414 reveal a signature of ongoing star formation associated with the two central galaxies at the core of the cluster, in the form of a chain of star forming regions at the center of the cluster. Using the lens model presented here, we place upper limits on the contribution of a possible lensed image to the flux at the central region, and rule out that this emission is coming from a background source.
Cluster Mass Distribution of the Hubble Frontier Fields - What have we learned?
NASA Astrophysics Data System (ADS)
Jean-Paul, Kneib
2016-07-01
The Hubble Frontier Fields have provided the deepest imaging of six of the most massive clusters in the Universe. Using strong lensing and weak lensing techniques, we have investigated the mass models of these clusters with record high precision. First we identified the multiple images, which are then compared against an evolving model to best match the strong lensing observable constraints. We then include weak lensing and flexion to investigate the mass distribution in the outer region. By investigating the accuracy of the model we show that we can constrain the small scale mass distribution, thus investigating the relation between the cluster galaxy stellar mass and its dark matter halo. On larger scales, combining with weak lensing and X-ray measurements, we can probe the assembly scenario of these clusters, which confirms that massive clusters are at the crossroads of filamentary structures.
Multimodality in galaxy clusters from SDSS DR8: substructure and velocity distribution
NASA Astrophysics Data System (ADS)
Einasto, M.; Vennik, J.; Nurmi, P.; Tempel, E.; Ahvensalmi, A.; Tago, E.; Liivamägi, L. J.; Saar, E.; Heinämäki, P.; Einasto, J.; Martínez, V. J.
2012-04-01
Context. The study of the signatures of multimodality in groups and clusters of galaxies, an environment for most of the galaxies in the Universe, gives us information about the dynamical state of clusters and about merging processes, which affect the formation and evolution of galaxies, groups and clusters, and larger structures - superclusters of galaxies and the whole cosmic web. Aims: We search for the presence of substructure, a non-Gaussian, asymmetrical velocity distribution of galaxies, and large peculiar velocities of the main galaxies in clusters with at least 50 member galaxies, drawn from the SDSS DR8. Methods: We employ a number of 3D, 2D, and 1D tests to analyse the distribution of galaxies in clusters: 3D normal mixture modelling, the Dressler-Shectman test, the Anderson-Darling and Shapiro-Wilk tests, as well as the Anscombe-Glynn and the D'Agostino tests. We find the peculiar velocities of the main galaxies, and use principal component analysis to characterise our results. Results: More than 80% of the clusters in our sample have substructure according to 3D normal mixture modelling, and the Dressler-Shectman (DS) test shows substructure in about 70% of the clusters. The median value of the peculiar velocities of the main galaxies in clusters is 206 km s^{-1} (41% of the rms velocity). The velocities of galaxies in more than 20% of the clusters show significant non-Gaussianity. While multidimensional normal mixture modelling is more sensitive than the DS test in resolving substructure in the sky distribution of cluster galaxies, the DS test is better at detecting substructure expressed as tails in the velocity distribution of galaxies (possible line-of-sight mergers). Richer, larger, and more luminous clusters have a larger amount of substructure and larger (compared to the rms velocity) peculiar velocities of the main galaxies. Principal component analysis of both the substructure indicators and the physical parameters of clusters shows that galaxy clusters
NASA Astrophysics Data System (ADS)
Altarelli, F.; Monasson, R.; Zamponi, F.
2008-01-01
We study the performance of stochastic heuristic search algorithms on Uniquely Extendible Constraint Satisfaction Problems with random inputs. We show that, for any heuristic preserving the Poissonian nature of the underlying instance, the (heuristic-dependent) largest ratio α_a of constraints per variable for which a search algorithm is likely to find solutions is smaller than the critical ratio α_d above which solutions are clustered and highly correlated. In addition we show that the clustering ratio can be reached when the number k of variables per constraint goes to infinity by the so-called Generalized Unit Clause heuristic.
Saro, A.
2015-10-12
In this study, we cross-match galaxy cluster candidates selected via their Sunyaev–Zel'dovich effect (SZE) signatures in 129.1 deg^{2} of the South Pole Telescope 2500d SPT-SZ survey with optically identified clusters selected from the Dark Energy Survey science verification data. We identify 25 clusters between 0.1 ≲ z ≲ 0.8 in the union of the SPT-SZ and redMaPPer (RM) samples. RM is an optical cluster finding algorithm that also returns a richness estimate for each cluster. We model the richness λ-mass relation with the following function
Distribution and Genetic Diversity of Bacteriocin Gene Clusters in Rumen Microbial Genomes
Azevedo, Analice C.; Bento, Cláudia B. P.; Ruiz, Jeronimo C.; Queiroz, Marisa V.
2015-01-01
Some species of ruminal bacteria are known to produce antimicrobial peptides, but the screening procedures have mostly been based on in vitro assays using standardized methods. Recent sequencing efforts have made available the genome sequences of hundreds of ruminal microorganisms. In this work, we performed genome mining of the complete and partial genome sequences of 224 ruminal bacteria and 5 ruminal archaea to determine the distribution and diversity of bacteriocin gene clusters. A total of 46 bacteriocin gene clusters were identified in 33 strains of ruminal bacteria. Twenty gene clusters were related to lanthipeptide biosynthesis, while 11 gene clusters were associated with sactipeptide production, 7 gene clusters were associated with class II bacteriocin production, and 8 gene clusters were associated with class III bacteriocin production. The frequency of strains whose genomes encode putative antimicrobial peptide precursors was 14.4%. Clusters related to the production of sactipeptides were identified for the first time among ruminal bacteria. BLAST analysis indicated that the majority of the gene clusters (88%) encoding putative lanthipeptides contained all the essential genes required for lanthipeptide biosynthesis. Most strains of Streptococcus (66.6%) harbored complete lanthipeptide gene clusters, in addition to an open reading frame encoding a putative class II bacteriocin. Albusin B-like proteins were found in 100% of the Ruminococcus albus strains screened in this study. The in silico analysis provided evidence of novel biosynthetic gene clusters in bacterial species not previously related to bacteriocin production, suggesting that the rumen microbiota represents an underexplored source of antimicrobial peptides. PMID:26253660
Wu, Jia-Rui; Guo, Wei-Xian; Zhang, Xiao-Meng; Yang, Bing; Zhang, Bing
2014-02-01
Using the data mining methods of association rules and clustering, 188 prescriptions for cough written by Yan Zhenghua were collected and analyzed to obtain the frequency of drug usage and the relationships between drugs, from which Yan Zhenghua's experience in treating cough could be summarized. The analysis mined 20 core combinations, such as Bambusae Caulis in Taenias-Almond-Sactmarsh Aster, and found 10 new prescriptions, such as Sactmarsh Aster-Scutellariae Radix-Album Viscum-Bambusae Caulis in Taenian-Eriobotryae Folium. The results indicate that Yan Zhenghua was skilled at treating cough with traditional Chinese medicines that dispel wind and heat from the body and clear heat from the lung to relieve cough.
2016-01-01
The early diagnosis of breast cancer is an important step in the fight against the disease. Machine learning techniques have shown promise in improving our understanding of the disease. As medical datasets consist of data points which cannot be precisely assigned to a class, fuzzy methods have been useful for studying these datasets. Sometimes breast cancer datasets are described by categorical features. Many fuzzy clustering algorithms have been developed for categorical datasets. However, in most of these methods Hamming distance is used to define the distance between two categorical feature values. In this paper, we use a probabilistic distance measure to compute the distance between a pair of categorical feature values. Experiments demonstrate that the distance measure performs better than Hamming distance for Wisconsin breast cancer data. PMID:27022504
OpenACC programs of the Swendsen-Wang multi-cluster spin flip algorithm
NASA Astrophysics Data System (ADS)
Komura, Yukihiro
2015-12-01
We present sample OpenACC programs of the Swendsen-Wang multi-cluster spin flip algorithm. OpenACC is a directive-based programming model for accelerators without requiring modification to the underlying CPU code itself. In this paper, we deal with the classical spin models as with the sample CUDA programs (Komura and Okabe, 2014), that is, two-dimensional (2D) Ising model, three-dimensional (3D) Ising model, 2D Potts model, 3D Potts model, 2D XY model and 3D XY model. We explain the details of sample OpenACC programs and compare the performance of the present OpenACC implementations with that of the CUDA implementations for the 2D and 3D Ising models and the 2D and 3D XY models.
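As a rough illustration of the algorithm named in this abstract (not the authors' OpenACC or CUDA code), one Swendsen-Wang update for the 2D Ising model can be sketched in plain Python. The function name `swendsen_wang_sweep`, the union-find helpers, the periodic boundaries, and the coupling `J = 1` are assumptions made for this sketch; bonds between equal neighbouring spins are activated with probability 1 - exp(-2βJ), and each resulting cluster is flipped with probability 1/2.

```python
import math
import random


def swendsen_wang_sweep(spins, beta, rng, J=1.0):
    """One Swendsen-Wang update of an L x L Ising lattice (periodic boundaries).

    spins is a list of lists holding +1 / -1 and is modified in place.
    Returns the number of clusters found in this sweep.
    """
    L = len(spins)
    parent = list(range(L * L))  # union-find forest over lattice sites

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[rj] = ri

    # Step 1: activate bonds between equal neighbouring spins.
    p_bond = 1.0 - math.exp(-2.0 * beta * J)
    for x in range(L):
        for y in range(L):
            for nx, ny in (((x + 1) % L, y), (x, (y + 1) % L)):
                if spins[x][y] == spins[nx][ny] and rng.random() < p_bond:
                    union(x * L + y, nx * L + ny)

    # Step 2: flip every cluster independently with probability 1/2.
    flip = {}
    for x in range(L):
        for y in range(L):
            root = find(x * L + y)
            if root not in flip:
                flip[root] = rng.random() < 0.5
            if flip[root]:
                spins[x][y] = -spins[x][y]
    return len(flip)
```

The bond-activation loop and the cluster-flip loop are the parts that accelerator implementations typically parallelize; this serial version only shows the logic of a single sweep.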
NASA Astrophysics Data System (ADS)
Boschan, A.; Ocampo, B. L.; Annichini, M.; Gauthier, G.
2016-06-01
A study on the spatial organization and velocity fluctuations of non-Brownian spherical particles settling at low Reynolds number in a vertical Hele-Shaw cell is reported. The particle volume fraction ranged from 0.005 to 0.05, while the distance between cell plates ranged from 5 to 15 times the particle radius. Particle tracking revealed that particles were not uniformly distributed in space but assembled in transient settling clusters. The population distribution of these clusters followed an exponential law. The measured velocity fluctuations are in agreement with those predicted theoretically for spherical clusters, from the balance between the apparent weight and the drag force. This result suggests that particle clustering, more than a spatial distribution of particles derived from random and independent events, is at the origin of the velocity fluctuations.
NASA Astrophysics Data System (ADS)
Nguyen, Sy Dzung; Nguyen, Quoc Hung; Choi, Seung-Bok
2015-01-01
This paper presents a new algorithm, called B-ANFIS, for building an adaptive neuro-fuzzy inference system (ANFIS) from a training data set. To increase the accuracy of the model, the following steps are taken. First, a data merging rule is proposed to build and perform a data-clustering strategy. Subsequently, a combination of clustering processes in the input data space and in the joint input-output data space is presented. The main purpose of this step is to overcome problems related to initialization and contradictory fuzzy rules, which usually arise when building ANFIS. The clustering process in the input data space is based on a proposed merging-possibilistic clustering (MPC) algorithm. The effectiveness of this process is evaluated in order to resume a clustering process in the joint input-output data space. The optimal parameters obtained after completion of the clustering process are used to build ANFIS. Simulations based on a numerical data set, 'Daily Data of Stock A', and measured data sets of a smart damper are performed to analyze and estimate accuracy. In addition, convergence and robustness of the proposed algorithm are investigated using both theoretical and testing approaches.
Survivable algorithms and redundancy management in NASA's distributed computing systems
NASA Technical Reports Server (NTRS)
Malek, Miroslaw
1992-01-01
The design of survivable algorithms requires a solid foundation for executing them. While hardware techniques for fault-tolerant computing are relatively well understood, fault-tolerant operating systems, as well as fault-tolerant applications (survivable algorithms), are, by contrast, little understood, and much more work in this field is required. We outline some of our work that contributes to the foundation of ultrareliable operating systems and fault-tolerant algorithm design. We introduce our consensus-based framework for fault-tolerant system design. This is followed by a description of a hierarchical partitioning method for efficient consensus. A scheduler for redundancy management is introduced, and application-specific fault tolerance is described. We give an overview of our hybrid algorithm technique, which is an alternative to the formal approach given.
Community Detection Algorithm Combining Stochastic Block Model and Attribute Data Clustering
NASA Astrophysics Data System (ADS)
Kataoka, Shun; Kobayashi, Takuto; Yasuda, Muneki; Tanaka, Kazuyuki
2016-11-01
We propose a new algorithm to detect the community structure in a network that utilizes both the network structure and vertex attribute data. Suppose we have the network structure together with the vertex attribute data, that is, the information assigned to each vertex associated with the community to which it belongs. The problem addressed in this paper is the detection of the community structure from the information of both the network structure and the vertex attribute data. Our approach is a Bayesian one that models the posterior probability distribution of the community labels. The detection of the community structure in our method is achieved by using belief propagation and an EM algorithm. We numerically verified the performance of our method using computer-generated networks and real-world networks.
ACS imaging of star clusters in M 51. I. Identification and radius distribution
NASA Astrophysics Data System (ADS)
Scheepmaker, R. A.; Haas, M. R.; Gieles, M.; Bastian, N.; Larsen, S. S.; Lamers, H. J. G. L. M.
2007-07-01
Context: Size measurements of young star clusters are valuable tools to put constraints on the formation and early dynamical evolution of star clusters. Aims: We use HST/ACS observations of the spiral galaxy M 51 in F435W, F555W and F814W to select a large sample of star clusters with accurate effective radius measurements in an area covering the complete disc of M 51. We present the dataset and study the radius distribution and relations between radius, colour, arm/interarm region, galactocentric distance, mass and age. Methods: We select a sample of 7698 (F435W), 6846 (F555W) and 5024 (F814W) slightly resolved clusters and derive their effective radii (R_eff) by fitting the spatial profiles with analytical models convolved with the point spread function. The radii of 1284 clusters are studied in detail. Results: We find cluster radii between 0.5 and ~10 pc, and one exceptionally large cluster candidate with R_eff = 21.6 pc. The median R_eff is 2.1 pc. We find 70 clusters in our sample which have colours consistent with being old GC candidates and we find 6 new “faint fuzzy” clusters in, or projected onto, the disc of M 51. The radius distribution cannot be fitted with a power law similar to the one for star-forming clouds. We find an increase in R_eff with colour as well as a higher fraction of clusters with B-V ⪆ 0.05 in the interarm regions. We find a correlation between R_eff and galactocentric distance (R_G) of the form R_eff ∝ R_G^{0.12±0.02}, which is considerably weaker than the observed correlation for old Milky Way GCs. We find weak relations between cluster luminosity and radius: R_eff ∝ L^{0.15±0.02} for the interarm regions and R_eff ∝ L^{-0.11±0.01} for the spiral arm regions, but we do not observe a correlation between cluster mass and radius. Conclusions: The observed radius distribution indicates that shortly after the formation of the clusters from a fractal gas, the radii of the clusters have changed in a non-uniform way. We find tentative
NASA Astrophysics Data System (ADS)
Enoki, Motohiro; Takahara, Fumio; Fujita, Yutaka
2001-07-01
We investigate statistical properties of galaxy clusters in the context of a hierarchical clustering scenario, taking into account their formation epoch distribution; this study is motivated by the recent finding by Fujita and Takahara that X-ray clusters form a fundamental plane in which the mass and the formation epoch are regarded as two independent parameters. Using the formalism that discriminates between major mergers and accretion, the epoch of a cluster formation is identified with that of the last major merger. Since tiny mass accretion following formation does not much affect the core structure of clusters, the properties of X-ray emission from clusters are determined by the total mass and density at their formation time. Under these assumptions, we calculate X-ray luminosity and temperature functions of galaxy clusters. We find that the behavior of the luminosity function differs from the model that does not take into account formation epoch distribution; the behavior of the temperature function, however, is not much different. In our model, the luminosity function is shifted to a higher luminosity and shows no significant evolution up to z~1, independent of cosmological models. The clusters are populated on the temperature-luminosity plane, with a finite dispersion. Since the simple scaling model in which the gas temperature is equal to the virial temperature fails to reproduce the observed luminosity-temperature relation, we also consider a model that takes into account the effects of preheating. The preheating model reproduces the observations much more accurately.
Study of cluster reconstruction and track fitting algorithms for CGEM-IT at BESIII
NASA Astrophysics Data System (ADS)
Guo, Yue; Wang, Liang-Liang; Ju, Xu-Dong; Wu, Ling-Hui; Xiu, Qing-Lei; Wang, Hai-Xia; Dong, Ming-Yi; Hu, Jing-Ran; Li, Wei-Dong; Li, Wei-Guo; Liu, Huai-Min; Qun, Ou-Yang; Shen, Xiao-Yan; Yuan, Ye; Zhang, Yao
2016-01-01
Considering the effects of aging on the existing Inner Drift Chamber (IDC) of BESIII, a GEM-based inner tracker, the Cylindrical-GEM Inner Tracker (CGEM-IT), is proposed to be designed and constructed as an upgrade candidate for the IDC. This paper introduces a full simulation package for the CGEM-IT with a simplified digitization model, and describes the development of software for cluster reconstruction and track fitting, using a track fitting algorithm based on the Kalman filter method. Preliminary results for the reconstruction algorithms which are obtained using a Monte Carlo sample of single muon events in the CGEM-IT, show that the CGEM-IT has comparable momentum resolution and transverse vertex resolution to the IDC, and a better z-direction resolution than the IDC. Supported by National Key Basic Research Program of China (2015CB856700), National Natural Science Foundation of China (11205184, 11205182) and Joint Funds of National Natural Science Foundation of China (U1232201)
NASA Astrophysics Data System (ADS)
Girardi, L.; Bica, E.
1993-07-01
We present a comparison between photometric cluster models, based on classical and with-overshooting stellar tracks, and the enlarged sample of 624 LMC clusters recently gathered in integrated UBV photometry by Bica et al. Models based on Maeder and Meynet's tracks present two temporary red phases: the first at age 10 Myr, caused by a clump of red supergiants; the second at ~100 Myr due to the combined effect of both the progressive reduction of the blue loop of core He-burning stars, and their fading relative to top-MS stars. The 100 Myr red phase does not occur in models without overshooting. Taking into account stochastic effects on the mass distribution of stars, the models describe well the general distribution of clusters in the (U - B) vs. (B - V) diagram, except for the oldest, SWB types V-VII, clusters. The dispersion of cluster colours due to stochastic effects is found to be strongly variable along the ageing sequence: the general trend is a decrease with age due to the increasing population of post-MS phases, but the dispersion increases in the temporary red phases and is expected to increase again after the red giant branch phase transition due to the appearance of extended RGBs and carbon stars. We also study the LMC clusters age distribution function, based on the age frequency of clusters of equal initial masses, taking into account different values for the IMF slope.
NASA Astrophysics Data System (ADS)
Nandy, Subhajit; Chaudhury, Pinaki; Bhattacharyya, S. P.
2010-06-01
We present a genetic algorithm based investigation of structural fragmentation in dicationic noble gas clusters, Ar_n^{2+}, Kr_n^{2+}, and Xe_n^{2+}, where n denotes the size of the cluster. Dications are predicted to be stable above a threshold size of the cluster when positive charges are assumed to remain localized on two noble gas atoms and the Lennard-Jones potential along with bare Coulomb and ion-induced dipole interactions are taken into account for describing the potential energy surface. Our cutoff values are close to those obtained experimentally [P. Scheier and T. D. Mark, J. Chem. Phys. 11, 3056 (1987)] and theoretically [J. G. Gay and B. J. Berne, Phys. Rev. Lett. 49, 194 (1982)]. When the charges are allowed to be equally distributed over four noble gas atoms in the cluster and the nonpolarization interaction terms are allowed to remain unchanged, our method successfully identifies the size threshold for stability as well as the nature of the channels of dissociation as a function of cluster size. In Ar_n^{2+}, for example, fissionlike fragmentation is predicted for n = 55 while for n = 43, the predicted outcome is nonfission fragmentation in complete agreement with earlier work [Golberg et al., J. Chem. Phys. 100, 8277 (1994)].
Load flow and state estimation algorithms for three-phase unbalanced power distribution systems
NASA Astrophysics Data System (ADS)
Madvesh, Chiranjeevi
Distribution load flow and state estimation are two important functions in distribution energy management systems (DEMS) and advanced distribution automation (ADA) systems. Distribution load flow analysis is a tool which helps to analyze the status of a power distribution system under steady-state operating conditions. In this research, an effective and comprehensive load flow algorithm is developed to extensively incorporate the distribution system components. Distribution system state estimation is a mathematical procedure which aims to estimate the operating states of a power distribution system by utilizing the information collected from available measurement devices in real-time. An efficient and computationally effective state estimation algorithm adapting the weighted-least-squares (WLS) method has been developed in this research. Both the developed algorithms are tested on different IEEE test-feeders and the results obtained are justified.
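To make the weighted-least-squares (WLS) idea in this abstract concrete, the sketch below solves the textbook linear case: for measurements z = Hx + e with independent errors of standard deviation σ_k, the estimate satisfies (HᵀWH)x = HᵀWz with W = diag(1/σ_k²). This is a simplified stand-in, not the thesis algorithm; a real distribution state estimator uses nonlinear measurement functions and iterates this step. The names `solve` and `wls_estimate` and the example numbers are assumptions.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def wls_estimate(H, z, sigma):
    """Linear WLS estimate: solve (H^T W H) x = H^T W z, W = diag(1/sigma^2)."""
    m, n = len(H), len(H[0])
    w = [1.0 / s ** 2 for s in sigma]
    gain = [[sum(w[k] * H[k][i] * H[k][j] for k in range(m)) for j in range(n)]
            for i in range(n)]
    rhs = [sum(w[k] * H[k][i] * z[k] for k in range(m)) for i in range(n)]
    return solve(gain, rhs)


# Hypothetical example: two states, three measurements (x1, x2, x1 - x2).
H = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
z = [1.00, 0.98, 0.02]
estimate = wls_estimate(H, z, sigma=[0.01, 0.01, 0.02])
```

Measurements with a smaller σ get a larger weight, so the estimate stays closer to the more trusted devices when the measurements disagree.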
Distributive Education Competency-Based Curriculum Models by Occupational Clusters. Final Report.
ERIC Educational Resources Information Center
Davis, Rodney E.; Husted, Stewart W.
To meet the needs of distributive education teachers and students, a project was initiated to develop competency-based curriculum models for marketing and distributive education clusters. The models which were developed incorporate competencies, materials and resources, teaching methodologies/learning activities, and evaluative criteria for the…
NASA Astrophysics Data System (ADS)
Wu, Tin-Yu; Chang, Tse; Chu, Teng-Hao
2017-02-01
Many data mining applications use Artificial Neural Networks (ANNs), but training an ANN raises many issues, such as the number of labeled samples, the training time and performance, the number of hidden layers, and the choice of transfer function. If the results are not as expected, it is hard to know which dimension causes the deviation. The main reason is that an ANN is trained by modifying weights: training does not improve the original image feature extraction algorithm, but instead adjusts the weights until the output is correct. To address these problems, this paper proposes a method to assist ANN-based image data analysis. Normally, a parameter is set as the threshold for extracting feature vectors when processing an image, and we treat this parameter as a weight. The experiment uses the values extracted from Speeded Up Robust Features (SURF) feature points as the basis for training. Since SURF extracts different feature points depending on the extraction values, we perform an initial semi-supervised clustering according to these values and use Modified K-Nearest Neighbors (MFKNN) for training and classification. Unknown images are not matched by a complete one-to-one comparison; only group centroids are compared, which improves efficiency and speed, and the retrieved results are then observed and analyzed. The method clusters and classifies images using the properties of their feature points, assigns values to groups with a high error rate to produce new feature points, and feeds these into the input layer of the ANN for training. Finally, a comparative analysis is made with the Back-Propagation Neural Network (BPN) of Genetic Algorithm-Artificial Neural Network
D`Azevedo, E.F.; Romine, C.H.
1992-09-01
The standard formulation of the conjugate gradient algorithm involves two inner product computations. The results of these two inner products are needed to update the search direction and the computed solution. In a distributed memory parallel environment, the computation and subsequent distribution of these two values requires two separate communication and synchronization phases. In this paper, we present a mathematically equivalent rearrangement of the standard algorithm that reduces the number of communication phases. We give a second derivation of the modified conjugate gradient algorithm in terms of the natural relationship with the underlying Lanczos process. We also present empirical evidence of the stability of this modified algorithm.
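For orientation, the sketch below is the standard conjugate gradient iteration, with comments marking the two inner products that the abstract refers to. It is not the rearranged algorithm from the paper; it only shows where a distributed-memory implementation would need its two global reductions per iteration. The function name, the dense list-of-lists matrix, and the tolerance are assumptions.

```python
import math


def dot(u, v):
    # In a distributed-memory code this is a global reduction
    # (local partial sums followed by communication).
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, tol=1e-10, max_iter=100):
    """Standard CG for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = b[:]                      # residual b - A x for x = 0
    p = r[:]                      # initial search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)        # inner product 1: (p, A p)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, Ap)]
        rs_new = dot(r, r)                 # inner product 2: (r, r)
        beta = rs_new / rs_old
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Small symmetric positive definite example; exact solution is (1/11, 7/11).
solution = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

Because the second inner product depends on the residual updated with the result of the first, the two reductions cannot simply be merged in this form, which is the dependency a rearranged formulation has to remove.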
Parallel grid generation algorithm for distributed memory computers
NASA Technical Reports Server (NTRS)
Moitra, Stuti; Moitra, Anutosh
1994-01-01
A parallel grid-generation algorithm and its implementation on the Intel iPSC/860 computer are described. The grid-generation scheme is based on an algebraic formulation of homotopic relations. Methods for utilizing the inherent parallelism of the grid-generation scheme are described, and implementation of multiple levels of parallelism on multiple instruction multiple data machines are indicated. The algorithm is capable of providing near orthogonality and spacing control at solid boundaries while requiring minimal interprocessor communications. Results obtained on the Intel hypercube for a blended wing-body configuration are used to demonstrate the effectiveness of the algorithm. Fortran implementations based on the native programming model of the iPSC/860 computer and the Express system of software tools are reported. Computational gains in execution time speed-up ratios are given.
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
Ji, Ze-Xuan; Sun, Quan-Sen; Xia, De-Shen
2011-07-01
A modified possibilistic fuzzy c-means clustering algorithm is presented for fuzzy segmentation of magnetic resonance (MR) images that have been corrupted by intensity inhomogeneities and noise. By introducing a novel adaptive method to compute the weights of local spatial in the objective function, the new adaptive fuzzy clustering algorithm is capable of utilizing local contextual information to impose local spatial continuity, thus allowing the suppression of noise and helping to resolve classification ambiguity. To estimate the intensity inhomogeneity, the global intensity is introduced into the coherent local intensity clustering algorithm and takes the local and global intensity information into account. The segmentation target therefore is driven by two forces to smooth the derived optimal bias field and improve the accuracy of the segmentation task. The proposed method has been successfully applied to 3 T, 7 T, synthetic and real MR images with desirable results. Comparisons with other approaches demonstrate the superior performance of the proposed algorithm. Moreover, the proposed algorithm is robust to initialization, thereby allowing fully automatic applications.
Wang, Xueyi
2011-01-01
The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces. PMID:22247818
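The two-stage idea described above can be sketched as follows. This is a simplified illustration for a single nearest neighbor, not the published kMkNN code: the function names `build` and `nearest`, the fixed number of Lloyd iterations, and the symmetric bound |d(q,c) - d(p,c)| are assumptions made for the sketch.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build(points, k, iters=5, seed=0):
    """Buildup stage: a few Lloyd (k-means) iterations, then store d(p, center)."""
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: dist(p, centers[i]))
            groups[j].append(p)
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
    clusters = [[] for _ in range(k)]
    for p in points:
        j = min(range(k), key=lambda i: dist(p, centers[i]))
        clusters[j].append((dist(p, centers[j]), p))
    return centers, clusters


def nearest(query, centers, clusters):
    """Search stage: visit clusters nearest-first and prune with the triangle inequality."""
    best_d, best_p, computed = float("inf"), None, 0
    order = sorted(range(len(centers)), key=lambda i: dist(query, centers[i]))
    for j in order:
        d_qc = dist(query, centers[j])
        for d_pc, p in clusters[j]:
            # d(q, p) >= |d(q, c) - d(p, c)|, so p cannot beat the current best.
            if abs(d_qc - d_pc) >= best_d:
                continue
            d = dist(query, p)
            computed += 1
            if d < best_d:
                best_d, best_p = d, p
    return best_p, best_d, computed
```

Because a point is skipped only when its lower bound already matches or exceeds the best distance found so far, the result stays exact; `computed` counts how many full distance evaluations were actually needed.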
DCPVP: Distributed Clustering Protocol Using Voting and Priority for Wireless Sensor Networks
Hematkhah, Hooman; Kavian, Yousef S.
2015-01-01
This paper presents a new clustering protocol for designing energy-efficient hierarchical wireless sensor networks (WSNs) by dividing the distributed sensor network into virtual sensor groups to satisfy the scalability and prolong the network lifetime in large-scale applications. The proposed approach is a distributed clustering protocol called DCPVP, which is based on voting and priority ideas. In the DCPVP protocol, the size of clusters is based on the distance of nodes from the data link such as base station (BS) and the local node density. The cluster heads are elected based on the mean distance from neighbors, remaining energy and the times of being elected as cluster head. The performance of the DCPVP protocol is compared with some well-known clustering protocols in literature such as the LEACH, HEED, WCA, GCMRA and TCAC protocols. The simulation results confirm that the prioritizing- and voting-based election ideas decrease the construction time and the energy consumption of clustering progress in sensor networks and consequently improve the lifetime of networks with limited resources and battery powered nodes in harsh and inaccessible environments. PMID:25763646
Gholami, Mohammad; Brennan, Robert W.
2016-01-01
In this paper, we investigate alternative distributed clustering techniques for wireless sensor node tracking in an industrial environment. The research builds on extant work on wireless sensor node clustering by reporting on: (1) the development of a novel distributed management approach for tracking mobile nodes in an industrial wireless sensor network; and (2) an objective comparison of alternative cluster management approaches for wireless sensor networks. To perform this comparison, we focus on two main clustering approaches proposed in the literature: pre-defined clusters and ad hoc clusters. These approaches are compared in the context of their reconfigurability: more specifically, we investigate the trade-off between the cost and the effectiveness of competing strategies aimed at adapting to changes in the sensing environment. To support this work, we introduce three new metrics: a cost/efficiency measure, a performance measure, and a resource consumption measure. The results of our experiments show that ad hoc clusters adapt more readily to changes in the sensing environment, but this higher level of adaptability is at the cost of overall efficiency. PMID:26751447
Genetic algorithms in a distributed computing environment using PVM
Cronje, G.A.; Steeb, W.H.
1997-04-01
The Parallel Virtual Machine (PVM) is a software system that enables a collection of heterogeneous computer systems to be used as a coherent and flexible concurrent computation resource. We show that genetic algorithms can be implemented using a Parallel Virtual Machine and C++. Problems with constraints are also discussed.
Liu, Song; Zhu, Lizhe; Sheong, Fu Kit; Wang, Wei; Huang, Xuhui
2017-01-30
We present an efficient density-based adaptive-resolution clustering method APLoD for analyzing large-scale molecular dynamics (MD) trajectories. APLoD performs the k-nearest-neighbors search to estimate the density of MD conformations in a local fashion, which can group MD conformations in the same high-density region into a cluster. APLoD greatly improves the popular density peaks algorithm by reducing the running time and the memory usage by 2-3 orders of magnitude for systems ranging from alanine dipeptide to a 370-residue Maltose-binding protein. In addition, we demonstrate that APLoD can produce clusters with various sizes that are adaptive to the underlying density (i.e., larger clusters at low-density regions, while smaller clusters at high-density regions), which is a clear advantage over other popular clustering algorithms including k-centers and k-medoids. We anticipate that APLoD can be widely applied to split ultra-large MD datasets containing millions of conformations for subsequent construction of Markov State Models. © 2016 Wiley Periodicals, Inc.
Silich, Sergiy; Tenorio-Tagle, Guillermo; Martinez-Gonzalez, Sergio; Bisnovatyi-Kogan, Gennadiy
2011-12-20
A hydrodynamic model for steady-state, spherically symmetric winds driven by young stellar clusters with an exponential stellar density distribution is presented. Unlike in most previous calculations, the position of the singular point $R_{sp}$, which separates the inner subsonic zone from the outer supersonic flow, is not associated with the star cluster edge, but calculated self-consistently. When the radiative losses of energy are negligible, the transition from the subsonic to the supersonic flow always occurs at $R_{sp} \approx 4R_c$, where $R_c$ is the characteristic scale for the stellar density distribution, irrespective of other star cluster parameters. This is not the case in the catastrophic cooling regime, when the temperature drops abruptly at a short distance from the star cluster center, and the transition from the subsonic to the supersonic regime occurs at a much smaller distance from the star cluster center. The impact of the major star cluster parameters on the inner structure of the wind is thoroughly discussed. Particular attention is paid to the effects of radiative cooling on the flow. The results of the calculations for a set of input parameters, which lead to different hydrodynamic regimes, are presented and compared to the results from non-radiative one-dimensional numerical simulations and to those from calculations with a homogeneous stellar mass distribution.
Statistical sampling of the distribution of uranium deposits using geologic/geographic clusters
Finch, W.I.; Grundy, W.D.; Pierson, C.T.
1992-01-01
The concept of geologic/geographic clusters was developed particularly to study grade and tonnage models for sandstone-type uranium deposits. A cluster is a grouping of mined as well as unmined uranium occurrences within an arbitrary area about 8 km across. A cluster is a statistical sample that will reflect accurately the distribution of uranium in large regions relative to various geologic and geographic features. The example of the Colorado Plateau Uranium Province reveals that only 3 percent of the total number of clusters is in the largest tonnage-size category, greater than 10,000 short tons U3O8, and that 80 percent of the clusters are hosted by Triassic and Jurassic rocks. The distributions of grade and tonnage for clusters in the Powder River Basin show a wide variation; the grade distribution is highly variable, reflecting a difference between roll-front deposits and concretionary deposits, and the Basin contains about half the number in the greater-than-10,000 tonnage-size class as does the Colorado Plateau, even though it is much smaller. The grade and tonnage models should prove useful in finding the richest and largest uranium deposits. © 1992 Oxford University Press.
Marchal, Rémi; Carbonnière, Philippe; Pouchan, Claude
2015-01-22
The study of atomic clusters has become an increasingly active area of research in recent years because of the fundamental interest in studying a completely new area that can bridge the gap between atomic and solid state physics. Due to their specific properties, such compounds are of great interest in the field of nanotechnology [1,2]. Here, we present our GSAM algorithm, based on a DFT exploration of the PES, to find the low-lying isomers of such compounds. This algorithm includes the generation of an initial set of structures from which the most relevant are selected. Moreover, an optimization process, called raking optimization, able to discard step by step all the physically unreasonable configurations has been implemented to reduce the computational cost of this algorithm. Structural properties of Ga$_n$As$_m$ clusters will be presented as an illustration of the method.
ERIC Educational Resources Information Center
Li, Wenhao
2011-01-01
Distributed workflow technology has been widely used in modern education and e-business systems. Distributed web applications have shown cross-domain and cooperative characteristics to meet the need of current distributed workflow applications. In this paper, the author proposes a dynamic and adaptive scheduling algorithm PCSA (Pre-Calculated…
Zheng, Ying; Yeh, Chen-Wei; Yang, Chi-Da; Jang, Shi-Shang; Chu, I-Ming
2007-08-31
Biological information generated by high-throughput technology has made a systems approach feasible for many biological problems. With this approach, optimization of metabolic pathways has been successfully applied to amino acid production. However, in this technique, gene modifications of the metabolic control architecture as well as enzyme expression levels are coupled and result in a mixed integer nonlinear programming problem. Furthermore, the stoichiometric complexity of the metabolic pathway, along with the strongly nonlinear behaviour of the regulatory kinetic models, leads to a highly rugged contour in the whole optimization problem. There may exist locally optimal solutions that reach the same level of production as the global optimum through different flux distributions. The purpose of this work is to develop a novel stochastic optimization approach, the information guided genetic algorithm (IGA), to discover the local optima with different levels of modification of the regulatory loop and production rates. The novelties of this work include the use of information theory, local search, and clustering analysis to discover the local optima which have physical meaning among the qualified solutions.
Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks.
Arunraja, Muruganantham; Malathi, Veluchamy; Sakthivel, Erulappan
2015-11-01
Wireless sensor networks are engaged in various data gathering applications. The major bottleneck in wireless data gathering systems is the finite energy of sensor nodes. By conserving the on-board energy, the life span of a wireless sensor network can be considerably extended. Since data communication is the dominant energy-consuming activity of a wireless sensor network, data reduction is an effective way to conserve nodal energy. Spatial and temporal correlation among the sensor data is exploited to reduce the data communications. Forming clusters of nodes with similar data is an effective way to exploit spatial correlation among neighboring sensors. Sending only a subset of the data and estimating the rest from this subset is the contemporary way of exploiting temporal correlation. In Distributed Similarity based Clustering and Compressed Forwarding for wireless sensor networks, we construct data-similar iso-clusters with minimal communication overhead. The intra-cluster communication is reduced using an adaptive normalized least mean squares based dual prediction framework. The cluster head reduces the inter-cluster data payload using a lossless compressive forwarding technique. The proposed work achieves significant data reduction in both the intra-cluster and the inter-cluster communications, with optimal accuracy of the collected data.
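The dual prediction idea mentioned above can be illustrated with a small sketch: the sensor node and the cluster head run identical normalized least mean squares (NLMS) predictors, and the node transmits a reading only when its own prediction misses by more than a threshold. The class `NLMSPredictor`, the function `dual_prediction`, and the filter order, step size, and threshold are all assumptions for illustration, not the parameters of the cited work.

```python
class NLMSPredictor:
    """Normalized least-mean-squares one-step-ahead predictor."""

    def __init__(self, order=3, mu=0.5, eps=1e-6):
        self.w = [0.0] * order    # filter weights
        self.x = [0.0] * order    # most recent values, newest first
        self.mu, self.eps = mu, eps

    def predict(self):
        return sum(w * v for w, v in zip(self.w, self.x))

    def update(self, value):
        err = value - self.predict()
        norm = self.eps + sum(v * v for v in self.x)
        # NLMS step: w <- w + mu * err * x / (eps + ||x||^2)
        self.w = [w + self.mu * err * v / norm for w, v in zip(self.w, self.x)]
        self.x = [value] + self.x[:-1]


def dual_prediction(readings, threshold):
    """Return the values seen by the cluster head and the number of transmissions."""
    node, head = NLMSPredictor(), NLMSPredictor()
    received, sent = [], 0
    for r in readings:
        guess = node.predict()
        transmit = abs(r - guess) > threshold
        # The head uses the transmitted reading, or its own (identical) prediction.
        value = r if transmit else head.predict()
        sent += transmit
        node.update(r if transmit else guess)
        head.update(value)
        received.append(value)
    return received, sent
```

Both predictors are updated with the same value at every step, so they stay synchronized without extra messages, and every reconstructed value is within `threshold` of the true reading.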
Distribution of sea anemones (Cnidaria, Actiniaria) in Korea analyzed by environmental clustering
Cha, H.-R.; Buddemeier, R.W.; Fautin, D.G.; Sandhei, P.
2004-01-01
Using environmental data and the geospatial clustering tools LOICZView and DISCO, we empirically tested the postulated existence and boundaries of four biogeographic regions in the southern part of the Korean peninsula. Environmental variables used included wind speed, sea surface temperature (SST), salinity, tidal amplitude, and the chlorophyll spectral signal. Our analysis confirmed the existence of four biogeographic regions, but the details of the borders between them differ from those previously postulated. Specimen-level distribution records of intertidal sea anemones were mapped; their distribution relative to the environmental data supported the importance of the environmental parameters we selected in defining suitable habitats. From the geographic coincidence between anemone distribution and the clusters based on environmental variables, we infer that geospatial clustering has the power to delimit ranges for marine organisms within relatively small geographical areas.
Long, Fuhui; Peng, Hanchuan; Sudar, Damir; Levievre, Sophie A.; Knowles, David W.
2006-09-05
Background: The distribution of the chromatin-associated proteins plays a key role in directing nuclear function. Previously, we developed an image-based method to quantify the nuclear distributions of proteins and showed that these distributions depended on the phenotype of human mammary epithelial cells. Here we describe a method that creates a hierarchical tree of the given cell phenotypes and calculates the statistical significance between them, based on the clustering analysis of nuclear protein distributions. Results: Nuclear distributions of nuclear mitotic apparatus protein were previously obtained for non-neoplastic S1 and malignant T4-2 human mammary epithelial cells cultured for up to 12 days. Cell phenotype was defined as S1 or T4-2 and the number of days in culture. A probabilistic ensemble approach was used to define a set of consensus clusters from the results of multiple traditional cluster analysis techniques applied to the nuclear distribution data. Cluster histograms were constructed to show how cells in any one phenotype were distributed across the consensus clusters. Grouping various phenotypes allowed us to build phenotype trees and calculate the statistical difference between each group. The results showed that non-neoplastic S1 cells could be distinguished from malignant T4-2 cells with 94.19 percent accuracy; that proliferating S1 cells could be distinguished from differentiated S1 cells with 92.86 percent accuracy; and showed no significant difference between the various phenotypes of T4-2 cells corresponding to increasing tumor sizes. Conclusion: This work presents a cluster analysis method that can identify significant cell phenotypes, based on the nuclear distribution of specific proteins, with high accuracy.
Liang, Faming; Jin, Ick-Hoon
2013-08-01
Simulating from distributions with intractable normalizing constants has been a long-standing problem in machine learning. In this letter, we propose a new algorithm, the Monte Carlo Metropolis-Hastings (MCMH) algorithm, for tackling this problem. The MCMH algorithm is a Monte Carlo version of the Metropolis-Hastings algorithm. It replaces the unknown normalizing constant ratio by a Monte Carlo estimate in simulations, while still converging, as shown in the letter, to the desired target distribution under mild conditions. The MCMH algorithm is illustrated with spatial autologistic models and exponential random graph models. Unlike other auxiliary variable Markov chain Monte Carlo (MCMC) algorithms, such as the Møller and exchange algorithms, the MCMH algorithm avoids the requirement for perfect sampling, and thus can be applied to many statistical models for which perfect sampling is not available or very expensive. The MCMH algorithm can also be applied to Bayesian inference for random effect models and missing data problems that involve simulations from a distribution with intractable integrals.
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
Learning of discriminant hyperplanes in imperfectly supervised or unsupervised training sample sets with unreliably labeled samples along the fuzzy joint boundaries between sample clusters is discussed, with the discriminant hyperplane designed to be a least-squares fit to the unreliably labeled data points. (Samples along the fuzzy boundary jump back and forth from one cluster to the other in recursive cluster stabilization and are considered unreliably labeled.) Minimization of the distances of these unreliably labeled samples from the hyperplanes does not sacrifice the ability to discriminate between classes represented by reliably labeled subsets of samples. An equivalent unconstrained linear inequality problem is formulated and algorithms for its solution are indicated. Landsat earth sensing data were used in confirming the validity and computational feasibility of the approach, which should be useful in deriving discriminant hyperplanes separating clusters with fuzzy boundaries, given supervised training sample sets with unreliably labeled boundary samples.
Real-Time Distributed Algorithms for Visual and Battlefield Reasoning
2006-08-01
in many fields (e.g. designers of disk servers try to merge requests to read addresses on disk to reduce the search time on disk; designers of... designed algorithms to optimally split the set of N task conditions into such buckets. • We then analyzed the complexity of this problem and...in the preceding sections. The STM is a collection of modules documenting this effort. The STM modules themselves involve the design and
An Improved Source-Scanning Algorithm for Locating Earthquake Clusters or Aftershock Sequences
NASA Astrophysics Data System (ADS)
Liao, Y.; Kao, H.; Hsu, S.
2010-12-01
The Source-scanning Algorithm (SSA) was originally introduced in 2004 to locate non-volcanic tremors. Its application was later expanded to the identification of earthquake rupture planes and the near-real-time detection and monitoring of landslides and mud/debris flows. In this study, we further improve SSA for the purpose of locating earthquake clusters or aftershock sequences when only a limited number of waveform observations are available. The main improvements include the application of a ground motion analyzer to separate P and S waves, the automatic determination of resolution based on the grid size and time step of the scanning process, and a modified brightness function to utilize constraints from multiple phases. Specifically, the improved SSA (named as ISSA) addresses two major issues related to locating earthquake clusters/aftershocks. The first one is the massive amount of both time and labour to locate a large number of seismic events manually. And the second one is to efficiently and correctly identify the same phase across the entire recording array when multiple events occur closely in time and space. To test the robustness of ISSA, we generate synthetic waveforms consisting of 3 separated events such that individual P and S phases arrive at different stations in different order, thus making correct phase picking nearly impossible. Using these very complicated waveforms as the input, the ISSA scans all model space for possible combination of time and location for the existence of seismic sources. The scanning results successfully associate various phases from each event at all stations, and correctly recover the input. To further demonstrate the advantage of ISSA, we apply it to the waveform data collected by a temporary OBS array for the aftershock sequence of an offshore earthquake southwest of Taiwan. The overall signal-to-noise ratio is inadequate for locating small events; and the precise arrival times of P and S phases are difficult to
A genetic algorithm for layered multisource video distribution
NASA Astrophysics Data System (ADS)
Cheok, Lai-Tee; Eleftheriadis, Alexandros
2005-03-01
We propose a genetic algorithm -- MckpGen -- for rate scaling and adaptive streaming of layered video streams from multiple sources in a bandwidth-constrained environment. A genetic algorithm (GA) consists of several components: a representation scheme; a generator for creating an initial population; a crossover operator for producing offspring solutions from parents; a mutation operator to promote genetic diversity; and a repair operator to ensure feasibility of the solutions produced. We formulated the problem as a Multiple-Choice Knapsack Problem (MCKP), a variant of the Knapsack Problem (KP) and a decision problem in combinatorial optimization. MCKP has many successful applications in fault tolerance, capital budgeting, resource allocation for conserving energy on mobile devices, etc. Genetic algorithms have been used to solve NP-complete problems such as the KP effectively; however, to the best of our knowledge, there is no GA for MCKP. We utilize a binary chromosome representation scheme for MCKP and design and implement the components, utilizing problem-specific knowledge for solving MCKP. In addition, for the repair operator, we propose two schemes (RepairSimple and RepairBRP). Results show that RepairBRP yields significantly better performance. We further show that the average fitness of the entire population converges towards the best fitness (optimal) value and compare the performance at various bit-rates.
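To make the role of a repair operator concrete, here is a hypothetical sketch for a binary MCKP chromosome in which each class (for example, one video source) must select exactly one item (for example, one layer) and the total weight must not exceed the capacity. It is neither RepairSimple nor RepairBRP, whose details the abstract does not give; the function name `repair` and the greedy downgrade rule are assumptions.

```python
def repair(chromosome, weights, capacity):
    """Return a feasible chromosome: one item per class, total weight <= capacity.

    chromosome: list of per-class 0/1 gene lists.
    weights:    list of per-class item weights, same shape as chromosome.
    """
    picks = []
    for genes, w in zip(chromosome, weights):
        chosen = [i for i, g in enumerate(genes) if g]
        # Enforce exactly one item: keep the lightest selected item,
        # or fall back to the lightest item when none is selected.
        candidates = chosen if chosen else range(len(genes))
        picks.append(min(candidates, key=lambda i: w[i]))

    total = sum(w[i] for i, w in zip(picks, weights))
    while total > capacity:
        # Downgrade the class whose switch to its lightest item saves the most weight.
        savings = [w[i] - min(w) for i, w in zip(picks, weights)]
        c = max(range(len(picks)), key=lambda j: savings[j])
        if savings[c] == 0:
            break  # already at minimum weight everywhere; instance is infeasible
        picks[c] = weights[c].index(min(weights[c]))
        total -= savings[c]

    return [[1 if i == p else 0 for i in range(len(w))]
            for p, w in zip(picks, weights)]
```

A repair step of this kind is typically applied after crossover and mutation, so that every individual evaluated by the fitness function respects both the one-item-per-class rule and the bandwidth budget.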
A SIMPLE PHYSICAL MODEL FOR THE GAS DISTRIBUTION IN GALAXY CLUSTERS
Patej, Anna; Loeb, Abraham
2015-01-01
The dominant baryonic component of galaxy clusters is hot gas whose distribution is commonly probed through X-ray emission arising from thermal bremsstrahlung. The density profile thus obtained has been traditionally modeled with a β-profile, a simple function with only three parameters. However, this model is known to be insufficient for characterizing the range of cluster gas distributions and attempts to rectify this shortcoming typically introduce additional parameters to increase the fitting flexibility. We use cosmological and physical considerations to obtain a family of profiles for the gas with fewer parameters than the β-model but which better accounts for observed gas profiles over wide radial intervals.
Yang, Kaifeng; Mu, Li; Yang, Dongdong; Zou, Feng; Wang, Lei; Jiang, Qiaoyong
2014-01-01
A novel hybrid multiobjective algorithm is presented in this paper, which combines a new multiobjective estimation of distribution algorithm, an efficient local searcher and ε-dominance. In addition, two multiobjective problems with variable linkages strictly based on manifold distribution are proposed. The Pareto set of a continuous multiobjective optimization problem is, in the decision space, a piecewise low-dimensional continuous manifold. Models that rely on this regularity build the probability distribution model only from global statistical information of the population, so promising individuals are not well exploited, which hinders the search and optimization process. Therefore, an incremental tournament local searcher is designed to exploit local information efficiently and accelerate convergence to the true Pareto-optimal front. Moreover, since ε-dominance is a strategy that helps a multiobjective algorithm obtain well-distributed solutions and has low computational complexity, ε-dominance and the incremental tournament local searcher are combined here. The novel memetic multiobjective estimation of distribution algorithm, MMEDA, is proposed accordingly. The algorithm is validated by experiments on twenty-two test problems with and without variable linkages of diverse complexities. Compared with three state-of-the-art multiobjective optimization algorithms, our algorithm achieves comparable results in terms of convergence and diversity metrics. PMID:25170526
Baum's Algorithm Learns Intersections of Halfspaces with Respect to Log-Concave Distributions
NASA Astrophysics Data System (ADS)
Klivans, Adam R.; Long, Philip M.; Tang, Alex K.
In 1990, E. Baum gave an elegant polynomial-time algorithm for learning the intersection of two origin-centered halfspaces with respect to any symmetric distribution (i.e., any $\mathcal{D}$ such that $\mathcal{D}(E) = \mathcal{D}(-E)$) [3]. Here we prove that his algorithm also succeeds with respect to any mean zero distribution with a log-concave density (a broad class of distributions that need not be symmetric). As far as we are aware, prior to this work, it was not known how to efficiently learn any class of intersections of halfspaces with respect to log-concave distributions.
NASA Astrophysics Data System (ADS)
Huang, Zhipeng; Gao, Lihong; Wang, Yangwei; Wang, Fuchi
2016-09-01
The Johnson-Cook (J-C) constitutive model is widely used in finite element simulation, as this model expresses the relationship between stress and strain in a simple way. In this paper, a cluster global optimization algorithm is proposed to determine the J-C constitutive model parameters of materials. A set of assumed parameters is used for the accuracy verification of the procedure. The parameters of two materials (401 steel and 823 steel) are determined. Results show that the procedure is reliable and effective. The relative error between the optimized and assumed parameters is no more than 4.02%, and the relative error between the optimized and assumed stress is 0.2% × 10^-5. The J-C constitutive parameters can be determined more precisely and quickly than with the traditional manual procedure. Furthermore, all the parameters can be simultaneously determined using several curves under different experimental conditions. A strategy is also proposed to accurately determine the constitutive parameters.
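For reference, the commonly quoted form of the Johnson-Cook flow stress is sigma = (A + B*eps^n) * (1 + C*ln(rate/ref_rate)) * (1 - T*^m), with T* = (T - T_room) / (T_melt - T_room). The sketch below evaluates that expression and a simple relative-error measure of the kind an optimizer could minimize; the default reference rate and temperatures are placeholders, and the cited paper's cluster global optimization procedure is not reproduced here.

```python
import math


def johnson_cook_stress(strain, strain_rate, temp, A, B, n, C, m,
                        ref_rate=1.0, t_room=293.0, t_melt=1800.0):
    """Johnson-Cook flow stress; ref_rate, t_room, t_melt are placeholder defaults."""
    t_star = (temp - t_room) / (t_melt - t_room)   # homologous temperature
    t_star = min(max(t_star, 0.0), 1.0)            # clamp to the model's valid range
    hardening = A + B * strain ** n
    rate_term = 1.0 + C * math.log(strain_rate / ref_rate)
    softening = 1.0 - t_star ** m
    return hardening * rate_term * softening


def mean_relative_error(params, data):
    """Mean relative stress error over (strain, strain_rate, temp, stress) samples."""
    A, B, n, C, m = params
    errors = [
        abs(johnson_cook_stress(e, r, t, A, B, n, C, m) - s) / s
        for e, r, t, s in data
    ]
    return sum(errors) / len(errors)
```

Any parameter search, whether manual or automated, can use `mean_relative_error` as its objective over stress-strain curves measured at several strain rates and temperatures.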
Fleisch, Markus C.; Maxell, Christopher A.; Kuper, Claudia K.; Brown, Erika T.; Parvin, Bahram; Barcellos-Hoff, Mary-Helen; Costes, Sylvain V.
2006-03-08
Centrosomes are small organelles that organize the mitotic spindle during cell division and are also involved in cell shape and polarity. Within epithelial tumors, such as breast cancer, and some hematological tumors, centrosome abnormalities (CA) are common, occur early in disease etiology, and correlate with chromosomal instability and disease stage. In situ quantification of CA by optical microscopy is hampered by overlap and clustering of these organelles, which appear as focal structures. CA has been frequently associated with Tp53 status in premalignant lesions and tumors. Here we describe an approach to accurately quantify centrosomes in tissue sections and tumors. Considering proliferation and baseline amplification rate, the resulting population-based ratio of centrosomes per nucleus allows the approximation of the proportion of cells with CA. Using this technique we show that 20-30 percent of cells have amplified centrosomes in Tp53 null mammary tumors. Combining fluorescence detection, deconvolution microscopy and a mathematical algorithm applied to a maximum intensity projection, we show that this approach is superior to traditional investigator-based visual analysis or threshold-based techniques.
OpenCluster: A Flexible Distributed Computing Framework for Astronomical Data Processing
NASA Astrophysics Data System (ADS)
Wei, Shoulin; Wang, Feng; Deng, Hui; Liu, Cuiyin; Dai, Wei; Liang, Bo; Mei, Ying; Shi, Congming; Liu, Yingbo; Wu, Jingping
2017-02-01
The volume of data generated by modern astronomical telescopes is extremely large and rapidly growing. However, current high-performance data processing architectures/frameworks are not well suited for astronomers because of their limitations and programming difficulties. In this paper, we therefore present OpenCluster, an open-source distributed computing framework to support rapidly developing high-performance processing pipelines of astronomical big data. We first detail the OpenCluster design principles and implementations and present the APIs facilitated by the framework. We then demonstrate a case in which OpenCluster is used to resolve complex data processing problems for developing a pipeline for the Mingantu Ultrawide Spectral Radioheliograph. Finally, we present our OpenCluster performance evaluation. Overall, OpenCluster provides not only high fault tolerance and simple programming interfaces, but also a flexible means of scaling up the number of interacting entities. OpenCluster thereby provides an easily integrated distributed computing framework for quickly developing a high-performance data processing system of astronomical telescopes and for significantly reducing software development expenses.
The mass-ratio and eccentricity distributions of barium and S stars, and red giants in open clusters
NASA Astrophysics Data System (ADS)
Van der Swaelmen, M.; Boffin, H. M. J.; Jorissen, A.; Van Eck, S.
2017-01-01
Context. A complete set of orbital parameters for barium stars, including the longest orbits, has recently been obtained thanks to a radial-velocity monitoring with the HERMES spectrograph installed on the Flemish Mercator telescope. Barium stars are supposed to belong to post-mass-transfer systems. Aims: In order to identify diagnostics distinguishing between pre- and post-mass-transfer systems, the properties of barium stars (more precisely their mass-function distribution and their period-eccentricity (P-e) diagram) are compared to those of binary red giants in open clusters. As a side product, we aim to identify possible post-mass-transfer systems among the cluster giants from the presence of s-process overabundances. We investigate the relation between the s-process enrichment, the location in the (P-e) diagram, and the cluster metallicity and turn-off mass. Methods: To invert the mass-function distribution and derive the mass-ratio distribution, we used the method pioneered by Boffin et al. (1992) that relies on a Richardson-Lucy deconvolution algorithm. The derivation of s-process abundances in the open-cluster giants was performed through spectral synthesis with MARCS model atmospheres. Results: A fraction of 22% of post-mass-transfer systems is found among the cluster binary giants (with companion masses between 0.58 and 0.87 M⊙, typical for white dwarfs), and these systems occupy a wider area than barium stars in the (P-e) diagram. Barium stars have on average lower eccentricities at a given orbital period. When the sample of binary giant stars in clusters is restricted to the subsample of systems occupying the same locus as the barium stars in the (P-e) diagram, and with a mass function compatible with a WD companion, 33% (=4/12) show a chemical signature of mass transfer in the form of s-process overabundances (from rather moderate - about 0.3 dex - to more extreme - about 1 dex). The only strong barium star in our sample is found in the cluster with
Automatic distribution of vision-tasks on computing clusters
NASA Astrophysics Data System (ADS)
Müller, Thomas; Tran, Binh An; Knoll, Alois
2011-01-01
In this paper, a consistent, efficient, and yet convenient system for parallel computer vision, and in fact also real-time actuator control, is proposed. The system implements the multi-agent paradigm and a blackboard information storage. This, in combination with a generic interface for hardware abstraction and integration of external software components, is set up on the basis of the Message Passing Interface (MPI). The system allows for data- and task-parallel processing, and supports both synchronous communication strategies, as data exchange can be triggered by events, and asynchronous ones, as data can be polled. Also, by duplication of processing units (agents), redundant processing is possible to achieve greater robustness. As the system automatically distributes the task units to available resources, and a monitoring concept allows for combination of tasks and their composition to complex processes, it is easy to develop efficient parallel vision / robotics applications quickly. Multiple vision-based applications have already been implemented, including academic, research-related fields and prototypes for industrial automation. The system has recently been released as open source for the scientific community.
Asymptotic Properties of Distributed and Communicating Stochastic Approximation Algorithms,
1986-02-01
Lefschetz Center for Dynamical Systems, Division of Applied Mathematics. Unclassified. Approved for public release; distribution unlimited.
NASA Astrophysics Data System (ADS)
Barghorn, K.; Hilf, E. R.
1994-04-01
The collision process of low-energy gold atoms and solid targets has been simulated using our molecular dynamics simulation code CLIMPACT II. The algorithm used is a third-order predictor Verlet algorithm [L. Verlet, Phys. Rev. 159 (1967) 98; W.F. van Gunsteren and H.J.C. Berendsen, in: Molecular Liquids - Dynamics and Interfaces, A.J. Barnes et al., eds. (Reidel, 1984) p. 475]. The iteration time step is continuously optimized by the program. About 50% of the total computer time is spent integrating the motions during the first 100 fs of simulation time [B. Nitzschmann, Diploma thesis, Univ. of Oldenburg, Germany, 1992]. When the crater formation ends and the motions in the target are slower, the step increases up to 20 times the start step size. Using this algorithm we are able to simulate a target of up to 10^5 particles. We use new nonreflecting boundary conditions. Only mechanical interactions are considered. The projectile can be chosen as a cluster with variable impact angle. Specifically, the output yield under different impact angles and the distribution of the desorbed particles are presented and discussed. The temporal development of the desorption shows three distinct processes: an early explosive process, a surface ablation by an apparent surface shock wave, and a final thermal evaporation.
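The abstract names a third-order predictor Verlet scheme with a continuously optimized time step but gives no formulas. As a rough illustration of the Verlet family only, the sketch below implements the standard velocity Verlet update for a one-dimensional harmonic oscillator. The force law, the fixed step `dt`, and all function names are assumptions made for this example and are not taken from CLIMPACT II.

```python
def velocity_verlet(x, v, force, mass, dt, steps):
    """Advance position x and velocity v with the velocity Verlet scheme."""
    a = force(x) / mass
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt   # position update
        a_new = force(x) / mass              # acceleration at the new position
        v = v + 0.5 * (a + a_new) * dt       # velocity from the averaged acceleration
        a = a_new
    return x, v


def spring_force(x, k=1.0):
    # Hypothetical test force: linear spring, F = -k x.
    return -k * x


def energy(x, v, mass=1.0, k=1.0):
    # Total energy of the spring system, used to check the integrator.
    return 0.5 * mass * v * v + 0.5 * k * x * x


x_end, v_end = velocity_verlet(1.0, 0.0, spring_force, mass=1.0, dt=0.01, steps=1000)
```

A fixed step is used here for simplicity; an adaptive scheme like the one described would enlarge `dt` once the motions slow down.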
Superferromagnetism in mechanically alloyed fcc Fe23Cu77 with bimodal cluster size distribution
NASA Astrophysics Data System (ADS)
Silva, N. J. O.; Amaral, J. S.; Amaral, V. S.; Costa, B. F. O.; LeCaër, G.
2009-01-01
Magnetic measurements, x-ray diffraction and Mössbauer spectroscopy were used to characterize a nanostructured fcc Fe23Cu77 at.% alloy prepared by high-energy ball-milling, addressing in particular the effect of clustering on the nature of the interacting magnetic entities. The interpretation of magnetization measurements leads to the conclusion that grains, whose mean size is ~16 nm, contain two populations of magnetic Fe-rich nanoclusters with a bimodal size distribution. These two sets of clusters contain about 14 and 400 Fe atoms and have magnetic moments of 30 µB and 860 µB, respectively. The inter-cluster ferromagnetic interactions that lead to superferromagnetism with a Curie temperature TC~220 K can be described by a mean field determined by the smaller clusters only, which account for 90% of the magnetization.
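As a quick consistency check on the numbers quoted above (a worked division, not a result stated by the authors), the moment per Fe atom in each cluster population is approximately:

```latex
\frac{30\,\mu_B}{14\ \text{atoms}} \approx 2.14\,\mu_B/\text{atom},
\qquad
\frac{860\,\mu_B}{400\ \text{atoms}} \approx 2.15\,\mu_B/\text{atom}
```

Since the atom counts are themselves approximate, this only indicates that both populations carry roughly the same moment per Fe atom.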
NASA Technical Reports Server (NTRS)
Rogers, David
1988-01-01
The advent of the Connection Machine profoundly changes the world of supercomputers. The highly nontraditional architecture makes possible the exploration of algorithms that were impractical for standard Von Neumann architectures. Sparse distributed memory (SDM) is an example of such an algorithm. Sparse distributed memory is a particularly simple and elegant formulation for an associative memory. The foundations for sparse distributed memory are described, and some simple examples of using the memory are presented. The relationship of sparse distributed memory to three important computational systems is shown: random-access memory, neural networks, and the cerebellum of the brain. Finally, the implementation of the algorithm for sparse distributed memory on the Connection Machine is discussed.
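The abstract does not spell out the memory's mechanics, so the following is a minimal sketch of the commonly described Kanerva-style formulation: fixed random "hard location" addresses, activation within a Hamming radius, and per-bit counters. The class name, the sizes (64 bits, 500 locations, radius 28), and the seed are all assumptions chosen for illustration; this is a serial toy, not the Connection Machine implementation.

```python
import random


class SparseDistributedMemory:
    def __init__(self, n_bits=64, n_locations=500, radius=28, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Fixed random "hard location" addresses.
        self.addresses = [[rng.randint(0, 1) for _ in range(n_bits)]
                          for _ in range(n_locations)]
        # One signed counter per bit per hard location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        # Indices of hard locations within the Hamming radius of the address.
        return [i for i, hard in enumerate(self.addresses)
                if sum(a != b for a, b in zip(hard, address)) <= self.radius]

    def write(self, address, data):
        # Increment counters for 1-bits and decrement for 0-bits.
        for i in self._active(address):
            row = self.counters[i]
            for b, bit in enumerate(data):
                row[b] += 1 if bit else -1

    def read(self, address):
        # Sum counters over active locations and threshold at zero.
        active = self._active(address)
        sums = [sum(self.counters[i][b] for i in active)
                for b in range(self.n_bits)]
        return [1 if s > 0 else 0 for s in sums]
```

A typical use is autoassociative: call `write(pattern, pattern)` and later `read` with a noisy version of `pattern` as the address to recover the stored word.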
NASA Astrophysics Data System (ADS)
Meyer, Michael R.
1996-04-01
The goal of this thesis is to test the following hypothesis: the initial distribution of stellar masses from a single "episode" of star formation is independent of the local physical conditions of the region. In other words, is the initial mass function (IMF) strictly universal over spatial scales d < 1 pc and over time intervals Δτ << 3 × 10^6 yrs? We discuss the utility of embedded clusters in addressing this question. Using a combination of spectroscopic and photometric techniques, we seek to characterize emergent mass distributions of embedded clusters in order to compare them both with each other and with the field star IMF. Medium resolution (R = 1000) near-infrared spectra obtainable with the current generation of NIR grating spectrographs can provide estimates of the photospheric temperatures of optically-invisible stars. Deriving these spectral types requires a three-step process: i) setting up a classification scheme based on near-infrared spectra of spectral standards; ii) understanding the effects of accretion on this classification scheme by studying optically-visible young stellar objects; and iii) applying this classification technique to the deeply embedded clusters. Combining near-infrared photometry with spectral types, accurate stellar luminosities can be derived for heavily reddened young stars, thus enabling their placement in the H-R diagram. From their position in the H-R diagram, masses and ages of stars can be estimated from comparison with theoretical pre-main sequence evolutionary models. Because it is not practical to obtain complete spectroscopic samples of embedded cluster members, a technique is developed based solely on near-IR photometry for estimating stellar luminosities from flux-limited surveys. We then describe how spectroscopic surveys of deeply embedded clusters are necessary in order to adopt appropriate mass-luminosity relationships. Stellar luminosity functions constructed from complete extinction-limited samples
NASA Astrophysics Data System (ADS)
Meyer, Michael R.
1996-02-01
The goal of this thesis is to test the following hypothesis: the initial distribution of stellar masses from a single "episode" of star formation is independent of the local physical conditions of the region. In other words, is the initial mass function (IMF) strictly universal over spatial scales d < 1 pc and over time intervals Δτ << 3 × 10^6 yrs? We discuss the utility of embedded clusters in addressing this question. Using a combination of spectroscopic and photometric techniques, we seek to characterize emergent mass distributions of embedded clusters in order to compare them both with each other and with the field star IMF. Medium resolution (R = 1000) near-infrared spectra obtainable with the current generation of NIR grating spectrographs can provide estimates of the photospheric temperatures of optically-invisible stars. Deriving these spectral types requires a three-step process: i) setting up a classification scheme based on near-infrared spectra of spectral standards; ii) understanding the effects of accretion on this classification scheme by studying optically-visible young stellar objects; and iii) applying this classification technique to the deeply embedded clusters. Combining near-infrared photometry with spectral types, accurate stellar luminosities can be derived for heavily reddened young stars, thus enabling their placement in the H-R diagram. From their position in the H-R diagram, masses and ages of stars can be estimated from comparison with theoretical pre-main sequence evolutionary models. Because it is not practical to obtain complete spectroscopic samples of embedded cluster members, a technique is developed based solely on near-IR photometry for estimating stellar luminosities from flux-limited surveys. We then describe how spectroscopic surveys of deeply embedded clusters are necessary in order to adopt appropriate mass-luminosity relationships. Stellar luminosity functions constructed from complete extinction
Localization of WSN using Distributed Particle Swarm Optimization algorithm with precise references
NASA Astrophysics Data System (ADS)
Janapati, Ravi Chander; Balaswamy, Ch.; Soundararajan, K.
2016-08-01
Localization is a key research area in Wireless Sensor Networks (WSNs). Finding the exact position of a node is known as localization, and different algorithms have been proposed for it. Here we consider a cooperative localization algorithm with censoring schemes using the Cramér-Rao Bound (CRB). This censoring scheme can improve the positioning accuracy and reduce computational complexity, traffic and latency. Particle swarm optimization (PSO) is a population-based search algorithm inspired by swarm intelligence, such as the social behavior of birds, bees or a school of fish. To improve the algorithm efficiency and localization precision, this paper presents an objective function based on the normal distribution of ranging error and a method of obtaining the search space of particles. A distributed PSO localization algorithm with CRB is proposed. The proposed method shows better results in terms of position accuracy, latency and complexity.
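The abstract does not give the objective function or the PSO parameters, so the sketch below is only a generic, centralized illustration of PSO-based range localization. It assumes independent zero-mean Gaussian ranging errors, for which maximizing the likelihood reduces to minimizing the sum of squared range residuals. The anchor layout, swarm size, inertia and acceleration coefficients, and all function names are hypothetical; the censoring scheme and the distributed aspect of the paper are not modeled.

```python
import math
import random


def range_cost(pos, anchors, ranges):
    """Sum of squared ranging residuals (Gaussian negative log-likelihood
    up to constants, under the i.i.d. assumption stated above)."""
    return sum((math.dist(pos, a) - r) ** 2 for a, r in zip(anchors, ranges))


def pso_localize(anchors, ranges, bounds, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    # Particles start uniformly inside the rectangular search space.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [range_cost(p, anchors, ranges) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp so particles stay inside the search space.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            cost = range_cost(pos[i], anchors, ranges)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = pos[i][:], cost
    return gbest, gbest_cost


# Hypothetical scenario: four anchors at the corners of a 10 x 10 area.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
true_pos = (3.0, 7.0)
ranges = [math.dist(true_pos, a) for a in anchors]  # noise-free for clarity
estimate, cost = pso_localize(anchors, ranges, bounds=((0.0, 10.0), (0.0, 10.0)))
```

Restricting `bounds` to a smaller region around the anchors in range is one simple way to shrink the particles' search space, which is the kind of step the abstract alludes to.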
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. F.; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.
2015-03-31
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modeling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modeling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. In addition, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
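For reference, the NFW (Navarro-Frenk-White) density profile fitted above has the standard two-parameter form shown below. The symbols, a characteristic density \rho_s and a scale radius r_s, follow common usage and are not defined in the abstract itself.

```latex
\rho(r) = \frac{\rho_s}{\left(r/r_s\right)\left(1 + r/r_s\right)^{2}}
```

The profile falls off as r^{-1} well inside r_s and as r^{-3} well outside it, so a weak-lensing fit effectively constrains the cluster's mass and concentration.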
Planck intermediate results. XLIII. Spectral energy distribution of dust in clusters of galaxies
NASA Astrophysics Data System (ADS)
Planck Collaboration; Adam, R.; Ade, P. A. R.; Aghanim, N.; Ashdown, M.; Aumont, J.; Baccigalupi, C.; Banday, A. J.; Barreiro, R. B.; Bartolo, N.; Battaner, E.; Benabed, K.; Benoit-Lévy, A.; Bersanelli, M.; Bielewicz, P.; Bikmaev, I.; Bonaldi, A.; Bond, J. R.; Borrill, J.; Bouchet, F. R.; Burenin, R.; Burigana, C.; Calabrese, E.; Cardoso, J.-F.; Catalano, A.; Chiang, H. C.; Christensen, P. R.; Churazov, E.; Colombo, L. P. L.; Combet, C.; Comis, B.; Couchot, F.; Crill, B. P.; Curto, A.; Cuttaia, F.; Danese, L.; Davis, R. J.; de Bernardis, P.; de Rosa, A.; de Zotti, G.; Delabrouille, J.; Désert, F.-X.; Diego, J. M.; Dole, H.; Doré, O.; Douspis, M.; Ducout, A.; Dupac, X.; Elsner, F.; Enßlin, T. A.; Finelli, F.; Forni, O.; Frailis, M.; Fraisse, A. A.; Franceschi, E.; Galeotta, S.; Ganga, K.; Génova-Santos, R. T.; Giard, M.; Giraud-Héraud, Y.; Gjerløw, E.; González-Nuevo, J.; Górski, K. M.; Gregorio, A.; Gruppuso, A.; Gudmundsson, J. E.; Hansen, F. K.; Harrison, D. L.; Hernández-Monteagudo, C.; Herranz, D.; Hildebrandt, S. R.; Hivon, E.; Hobson, M.; Hornstrup, A.; Hovest, W.; Hurier, G.; Jaffe, A. H.; Jaffe, T. R.; Jones, W. C.; Keihänen, E.; Keskitalo, R.; Khamitov, I.; Kisner, T. S.; Kneissl, R.; Knoche, J.; Kunz, M.; Kurki-Suonio, H.; Lagache, G.; Lähteenmäki, A.; Lamarre, J.-M.; Lasenby, A.; Lattanzi, M.; Lawrence, C. R.; Leonardi, R.; Levrier, F.; Liguori, M.; Lilje, P. B.; Linden-Vørnle, M.; López-Caniego, M.; Macías-Pérez, J. F.; Maffei, B.; Maggio, G.; Mandolesi, N.; Mangilli, A.; Maris, M.; Martin, P. G.; Martínez-González, E.; Masi, S.; Matarrese, S.; Melchiorri, A.; Mennella, A.; Migliaccio, M.; Miville-Deschênes, M.-A.; Moneti, A.; Montier, L.; Morgante, G.; Mortlock, D.; Munshi, D.; Murphy, J. A.; Naselsky, P.; Nati, F.; Natoli, P.; Nørgaard-Nielsen, H. U.; Novikov, D.; Novikov, I.; Oxborrow, C. 
A.; Pagano, L.; Pajot, F.; Paoletti, D.; Pasian, F.; Perdereau, O.; Perotto, L.; Pettorino, V.; Piacentini, F.; Piat, M.; Plaszczynski, S.; Pointecouteau, E.; Polenta, G.; Ponthieu, N.; Pratt, G. W.; Prunet, S.; Puget, J.-L.; Rachen, J. P.; Rebolo, R.; Reinecke, M.; Remazeilles, M.; Renault, C.; Renzi, A.; Ristorcelli, I.; Rocha, G.; Rosset, C.; Rossetti, M.; Roudier, G.; Rubiño-Martín, J. A.; Rusholme, B.; Santos, D.; Savelainen, M.; Savini, G.; Scott, D.; Stolyarov, V.; Stompor, R.; Sudiwala, R.; Sunyaev, R.; Sutton, D.; Suur-Uski, A.-S.; Sygnet, J.-F.; Tauber, J. A.; Terenzi, L.; Toffolatti, L.; Tomasi, M.; Tristram, M.; Tucci, M.; Valenziano, L.; Valiviita, J.; Van Tent, F.; Vielva, P.; Villa, F.; Wade, L. A.; Wehus, I. K.; Yvon, D.; Zacchei, A.; Zonca, A.
2016-12-01
Although infrared (IR) overall dust emission from clusters of galaxies has been statistically detected using data from the Infrared Astronomical Satellite (IRAS), it has not been possible to sample the spectral energy distribution (SED) of this emission over its peak, and thus to break the degeneracy between dust temperature and mass. By complementing the IRAS spectral coverage with Planck satellite data from 100 to 857 GHz, we provide new constraints on the IR spectrum of thermal dust emission in clusters of galaxies. We achieve this by using a stacking approach for a sample of several hundred objects from the Planck cluster sample. This procedure averages out fluctuations from the IR sky, allowing us to reach a significant detection of the faint cluster contribution. We also use the large frequency range probed by Planck, together with component-separation techniques, to remove the contamination from both cosmic microwave background anisotropies and the thermal Sunyaev-Zeldovich effect (tSZ) signal, which dominate at ν ≤ 353 GHz. By excluding dominant spurious signals or systematic effects, averaged detections are reported at frequencies 353 GHz ≤ ν ≤ 5000 GHz. We confirm the presence of dust in clusters of galaxies at low and intermediate redshifts, yielding an SED with a shape similar to that of the Milky Way. Planck's resolution does not allow us to investigate the detailed spatial distribution of this emission (e.g. whether it comes from intergalactic dust or simply the dust content of the cluster galaxies), but the radial distribution of the emission appears to follow that of the stacked SZ signal, and thus the extent of the clusters. The recovered SED allows us to constrain the dust mass responsible for the signal and its temperature.
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. F.; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.
2015-03-31
We measure the weak lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey (DES). This pathfinder study is meant to (1) validate the Dark Energy Camera (DECam) imager for the task of measuring weak lensing shapes, and (2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, point spread function (PSF) modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting Navarro-Frenk-White profiles to the clusters in this study, we determine weak lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak lensing mass, and richness. In addition, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
Melchior, P.; Suchyta, E.; Huff, E.; ...
2015-03-31
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
Melchior, P.; Suchyta, E.; Huff, E.; Hirsch, M.; Kacprzak, T.; Rykoff, E.; Gruen, D.; Armstrong, R.; Bacon, D.; Bechtol, K.; Bernstein, G. M.; Bridle, S.; Clampitt, J.; Honscheid, K.; Jain, B.; Jouvel, S.; Krause, E.; Lin, H.; MacCrann, N.; Patton, K.; Plazas, A.; Rowe, B.; Vikram, V.; Wilcox, H.; Young, J.; Zuntz, J.; Abbott, T.; Abdalla, F. B.; Allam, S. S.; Banerji, M.; Bernstein, J. P.; Bernstein, R. A.; Bertin, E.; Buckley-Geer, E.; Burke, D. L.; Castander, F. J.; da Costa, L. N.; Cunha, C. E.; Depoy, D. L.; Desai, S.; Diehl, H. T.; Doel, P.; Estrada, J.; Evrard, A. E.; Neto, A. F.; Fernandez, E.; Finley, D. A.; Flaugher, B.; Frieman, J. A.; Gaztanaga, E.; Gerdes, D.; Gruendl, R. A.; Gutierrez, G. R.; Jarvis, M.; Karliner, I.; Kent, S.; Kuehn, K.; Kuropatkin, N.; Lahav, O.; Maia, M. A. G.; Makler, M.; Marriner, J.; Marshall, J. L.; Merritt, K. W.; Miller, C. J.; Miquel, R.; Mohr, J.; Neilsen, E.; Nichol, R. C.; Nord, B. D.; Reil, K.; Roe, N. A.; Roodman, A.; Sako, M.; Sanchez, E.; Santiago, B. X.; Schindler, R.; Schubnell, M.; Sevilla-Noarbe, I.; Sheldon, E.; Smith, C.; Soares-Santos, M.; Swanson, M. E. C.; Sypniewski, A. J.; Tarle, G.; Thaler, J.; Thomas, D.; Tucker, D. L.; Walker, A.; Wechsler, R.; Weller, J.; Wester, W.
2015-03-31
We measure the weak-lensing masses and galaxy distributions of four massive galaxy clusters observed during the Science Verification phase of the Dark Energy Survey. This pathfinder study is meant to 1) validate the DECam imager for the task of measuring weak-lensing shapes, and 2) utilize DECam's large field of view to map out the clusters and their environments over 90 arcmin. We conduct a series of rigorous tests on astrometry, photometry, image quality, PSF modelling, and shear measurement accuracy to single out flaws in the data and also to identify the optimal data processing steps and parameters. We find Science Verification data from DECam to be suitable for the lensing analysis described in this paper. The PSF is generally well-behaved, but the modelling is rendered difficult by a flux-dependent PSF width and ellipticity. We employ photometric redshifts to distinguish between foreground and background galaxies, and a red-sequence cluster finder to provide cluster richness estimates and cluster-galaxy distributions. By fitting NFW profiles to the clusters in this study, we determine weak-lensing masses that are in agreement with previous work. For Abell 3261, we provide the first estimates of redshift, weak-lensing mass, and richness. Additionally, the cluster-galaxy distributions indicate the presence of filamentary structures attached to 1E 0657-56 and RXC J2248.7-4431, stretching out as far as 1 degree (approximately 20 Mpc), showcasing the potential of DECam and DES for detailed studies of degree-scale features on the sky.
Spatial Distributions of Young Large Magellanic Cloud Clusters as Tracers of a Bar Perturbation
NASA Astrophysics Data System (ADS)
Dottori, H.; Bica, E.; Claria, J. J.; Puerari, I.
1996-04-01
The spatial distributions of SWB I [10 ≲ t(Myr) ≲ 30] and SWB II [30 ≲ t(Myr) ≲ 70] LMC clusters are analyzed using the enlarged sample of integrated UBV photometry of star clusters and associations published by Bica et al. in 1996. The differences in the clusters' spatial distributions are interpreted as dating back from their formation epoch and as being caused by a perturbation in the gaseous disk generated by the LMC stellar bar. The two SWB distributions present noncoincident bar or barlike structures with a position angle difference of ~22° ± 2°, which, together with the age difference between the groups, leads to a perturbation propagation velocity of 13.7 ± 2 km s^-1 kpc^-1, which suggests bar-induced star formation effects on the disk. Differences are also observed in the outer parts of the SWB I and SWB II spatial distributions. A spatial Fourier analysis reveals the predominance of m = 1 and m = 2 components in both cases. The spatial distribution of clusters younger than 10 Myr (SWB 0) is also studied. The patterns measured by the Fourier analysis for the SWB 0 clusters resemble the gas distribution in the potential of disk models with an off-center bar. The H I kinematics is also evidence of the presence of the perturbation in the LMC disk. Although some evidence of locally induced star formation effects is found, our analysis indicates that globally triggered star formation effects induced by the potential play an important role in organizing the overall patterns and the loci of Shapley's Constellations.
NASA Astrophysics Data System (ADS)
Rastogi, Richa; Londhe, Ashutosh; Srivastava, Abhishek; Sirasala, Kirannmayi M.; Khonde, Kiran
2017-03-01
In this article, a new scalable 3D Kirchhoff depth migration algorithm is presented on a state-of-the-art multicore CPU-based cluster. Parallelization of 3D Kirchhoff depth migration is challenging due to its high demand for compute time, memory, storage and I/O, along with the need for their effective management. The most resource-intensive modules of the algorithm are traveltime calculations and migration summation, which exhibit an inherent trade-off between compute time and other resources. The parallelization strategy of the algorithm largely depends on the storage of calculated traveltimes and its feeding mechanism to the migration process. The presented work is an extension of our previous work, wherein a 3D Kirchhoff depth migration application for a multicore CPU-based parallel system had been developed. Recently, we have worked on improving the parallel performance of this application by re-designing the parallelization approach. The new algorithm is capable of efficiently migrating both prestack and poststack 3D data. It exhibits flexibility for migrating a large number of traces within the available node memory and with minimal requirement of storage, I/O and inter-node communication. The resultant application is tested using 3D Overthrust data on PARAM Yuva II, which is a Xeon E5-2670 based multicore CPU cluster with 16 cores/node and 64 GB shared memory. Parallel performance of the algorithm is studied using different numerical experiments, and the scalability results show striking improvement over its previous version. An impressive 49.05X speedup with 76.64% efficiency is achieved for 3D prestack data and 32.00X speedup with 50.00% efficiency for 3D poststack data, using 64 nodes. The results also demonstrate the effectiveness and robustness of the improved algorithm with high scalability and efficiency on a multicore CPU cluster.
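The quoted efficiencies are consistent with the usual definition of parallel efficiency as speedup divided by the number of workers. The short check below assumes that definition with the 64 nodes as the worker count (the abstract does not state its baseline explicitly); the function name is hypothetical.

```python
def parallel_efficiency(speedup, workers):
    """Parallel efficiency as a fraction: achieved speedup over worker count."""
    return speedup / workers


prestack = parallel_efficiency(49.05, 64)    # 3D prestack run on 64 nodes
poststack = parallel_efficiency(32.00, 64)   # 3D poststack run on 64 nodes

print(f"prestack:  {prestack:.2%}")
print(f"poststack: {poststack:.2%}")
```

Under that assumption the two values come out at about 76.64% and 50.00%, matching the figures reported in the abstract.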
NASA Astrophysics Data System (ADS)
Kato, Takeyoshi; Minagata, Atsushi; Suzuoki, Yasuo
This paper discusses the influence of mass installation of a home co-generation system (H-CGS) using a polymer electrolyte fuel cell (PEFC) on the voltage profile of a power distribution system in a residential area. The influence of H-CGS is compared with that of photovoltaic power generation systems (PV systems). The operation pattern of H-CGS is assumed based on the electricity and hot-water demand observed in 10 households for a year. The main results are as follows. With the clustered H-CGS, the voltage of each bus is higher by about 1-3% compared with the conventional system without any distributed generators. Because H-CGS tends to increase its output during the early evening, it helps recover the voltage drop during the early evening, resulting in smaller voltage variation of the distribution system throughout the day. Because of its small rated power output of about 1 kW, the influence of the clustered H-CGS on the voltage profile is smaller than that of the clustered PV systems. The highest voltage during the daytime is not as high as in the distribution system with the clustered PV systems, even if reverse power flow from H-CGS is allowed.
Distributed topology control algorithm for multihop wireless networks
NASA Technical Reports Server (NTRS)
Borbash, S. A.; Jennings, E. H.
2002-01-01
We present a network initialization algorithm for wireless networks with distributed intelligence. Each node (agent) has only local, incomplete knowledge and it must make local decisions to meet a predefined global objective. Our objective is to use power control to establish a topology based on the relative neighborhood graph, which has good overall performance in terms of power usage, low interference, and reliability.
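For readers unfamiliar with the relative neighborhood graph (RNG), the sketch below builds one from node coordinates using the standard definition: two nodes are linked unless some third node is strictly closer to both of them than they are to each other. This is a centralized, brute-force illustration with full knowledge of all positions, not the distributed power-control algorithm of the abstract; the function name and sample coordinates are made up for the example.

```python
from itertools import combinations
from math import dist


def relative_neighborhood_graph(points):
    """Return the RNG edges as index pairs (i, j) with i < j.

    Edge (i, j) is dropped when a third point k satisfies
    max(d(i, k), d(j, k)) < d(i, j).
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = dist(points[i], points[j])
        blocked = any(
            max(dist(points[i], points[k]), dist(points[j], points[k])) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


# Hypothetical node positions: three on a line plus one just above the middle.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (1.0, 0.1)]
print(relative_neighborhood_graph(nodes))  # [(0, 1), (1, 2), (1, 3)]
```

The long link between nodes 0 and 2 is dropped because node 1 lies between them, which is the property that lets each node transmit at lower power.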
Chandrasekhar equations and computational algorithms for distributed parameter systems
NASA Technical Reports Server (NTRS)
Burns, J. A.; Ito, K.; Powers, R. K.
1984-01-01
The Chandrasekhar equations arising in optimal control problems for linear distributed parameter systems are considered. The equations are derived via approximation theory. This approach is used to obtain existence, uniqueness, and strong differentiability of the solutions and provides the basis for a convergent computation scheme for approximating feedback gain operators. A numerical example is presented to illustrate these ideas.
NASA Astrophysics Data System (ADS)
Hortos, William S.
2008-04-01
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at
NASA Astrophysics Data System (ADS)
Noh, Hae Young; Rajagopal, Ram; Kiremidjian, Anne S.
2012-04-01
This paper introduces a damage diagnosis algorithm for civil structures that uses a sequential change point detection method for the cases where the post-damage feature distribution is unknown a priori. This algorithm extracts features from structural vibration data using time-series analysis and then declares damage using the change point detection method. The change point detection method asymptotically minimizes detection delay for a given false alarm rate. The conventional method uses the known pre- and post-damage feature distributions to perform a sequential hypothesis test. In practice, however, the post-damage distribution is unlikely to be known a priori. Therefore, our algorithm estimates and updates this distribution as data are collected using the maximum likelihood and the Bayesian methods. We also applied an approximate method to reduce the computation load and memory requirement associated with the estimation. The algorithm is validated using multiple sets of simulated data and a set of experimental data collected from a four-story steel special moment-resisting frame. Our algorithm was able to estimate the post-damage distribution consistently and resulted in detection delays only a few seconds longer than the delays from the conventional method that assumes we know the post-damage feature distribution. We confirmed that the Bayesian method is particularly efficient in declaring damage with minimal memory requirement, but the maximum likelihood method provides an insightful heuristic approach.
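As an illustration of the conventional setting described above, where both the pre- and post-damage feature distributions are known, the sketch below runs a CUSUM-style sequential test on a scalar feature. It assumes Gaussian features with a known mean shift and common variance; the parameter names, threshold and data are hypothetical, and this is not the estimation scheme proposed in the paper.

```python
def cusum_alarm(samples, mu0, mu1, sigma, threshold):
    """Return the index of the first sample that triggers an alarm, else None.

    The statistic accumulates the log-likelihood ratio of N(mu1, sigma^2)
    against N(mu0, sigma^2) and is reset to zero whenever it goes negative.
    """
    stat = 0.0
    for n, x in enumerate(samples):
        llr = (mu1 - mu0) / sigma**2 * (x - (mu0 + mu1) / 2.0)
        stat = max(0.0, stat + llr)
        if stat > threshold:
            return n
    return None


# Synthetic feature sequence: the mean jumps from 0 to 2 at index 10.
features = [0.0] * 10 + [2.0] * 10
print(cusum_alarm(features, mu0=0.0, mu1=2.0, sigma=1.0, threshold=5.0))  # 12
```

Raising the threshold lowers the false alarm rate at the cost of a longer detection delay, which is the trade-off the abstract refers to.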
Mitavskiy, Boris; Cannings, Chris
2009-01-01
The stochastic process underlying an evolutionary algorithm is well known to be Markovian. Such Markov chains have been investigated in much of the theoretical evolutionary computing research. When the mutation rate is positive, the Markov chain modeling an evolutionary algorithm is irreducible and, therefore, has a unique stationary distribution. Rather little is known about the stationary distribution. In fact, the only quantitative facts established so far tell us that the stationary distributions of Markov chains modeling evolutionary algorithms concentrate on uniform populations (i.e., those populations consisting of a repeated copy of the same individual). At the same time, knowing the stationary distribution may provide some information about the expected time it takes for the algorithm to reach a certain solution and about the biases due to recombination and selection, and it is of importance in population genetics for assessing what is called a "genetic load" (see the introduction for more details). In the recent joint works of the first author, some bounds have been established on the rates at which the stationary distribution concentrates on the uniform populations. The primary tool used in these papers is the "quotient construction" method. It turns out that the quotient construction method can be exploited to derive much more informative bounds on ratios of the stationary distribution values of various subsets of the state space. In fact, some of the bounds obtained in the current work are expressed in terms of the parameters involved in all three main stages of an evolutionary algorithm: namely, selection, recombination, and mutation.
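To make the notion of a unique stationary distribution concrete, the sketch below approximates it for a small chain by repeatedly applying the transition matrix. The two-state matrix is a toy example with strictly positive entries, not a chain derived from any evolutionary algorithm, and the function name is illustrative.

```python
def stationary_distribution(P, tol=1e-12, max_iter=100_000):
    """Approximate pi with pi = pi P for a row-stochastic matrix P.

    Plain power iteration; it converges when the chain is irreducible and
    aperiodic, which holds for the strictly positive toy matrix below.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Toy chain: balance gives pi_0 * 0.1 = pi_1 * 0.5, so pi = (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print([round(p, 6) for p in pi])  # [0.833333, 0.166667]
```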
Yang, Liu; Lu, Yinzhi; Zhong, Yuanchang; Wu, Xuegang; Yang, Simon X.
2015-01-01
Energy resource limitation is a severe problem in traditional wireless sensor networks (WSNs) because it restricts the lifetime of network. Recently, the emergence of energy harvesting techniques has brought with them the expectation to overcome this problem. In particular, it is possible for a sensor node with energy harvesting abilities to work perpetually in an Energy Neutral state. In this paper, a Multi-hop Energy Neutral Clustering (MENC) algorithm is proposed to construct the optimal multi-hop clustering architecture in energy harvesting WSNs, with the goal of achieving perpetual network operation. All cluster heads (CHs) in the network act as routers to transmit data to base station (BS) cooperatively by a multi-hop communication method. In addition, by analyzing the energy consumption of intra- and inter-cluster data transmission, we give the energy neutrality constraints. Under these constraints, every sensor node can work in an energy neutral state, which in turn provides perpetual network operation. Furthermore, the minimum network data transmission cycle is mathematically derived using convex optimization techniques while the network information gathering is maximal. Simulation results show that our protocol can achieve perpetual network operation, so that the consistent data delivery is guaranteed. In addition, substantial improvements on the performance of network throughput are also achieved as compared to the famous traditional clustering protocol LEACH and recent energy harvesting aware clustering protocols. PMID:26712764
DeMaere, Matthew Z.
2016-01-01
Background: Chromosome conformation capture, coupled with high throughput DNA sequencing in protocols like Hi-C and 3C-seq, has been proposed as a viable means of generating data to resolve the genomes of microorganisms living in naturally occurring environments. Metagenomic Hi-C and 3C-seq datasets have begun to emerge, but the feasibility of resolving genomes when closely related organisms (strain-level diversity) are present in the sample has not yet been systematically characterised. Methods: We developed a computational simulation pipeline for metagenomic 3C and Hi-C sequencing to evaluate the accuracy of genomic reconstructions at, above, and below an operationally defined species boundary. We simulated datasets and measured accuracy over a wide range of parameters. Five clustering algorithms were evaluated (2 hard, 3 soft) using an adaptation of the extended B-cubed validation measure. Results: When all genomes in a sample are below 95% sequence identity, all of the tested clustering algorithms performed well. When sequence data contains genomes above 95% identity (our operational definition of strain-level diversity), a naive soft-clustering extension of the Louvain method achieves the highest performance. Discussion: Previously, only hard-clustering algorithms have been applied to metagenomic 3C and Hi-C data, yet none of these perform well when strain-level diversity exists in a metagenomic sample. Our simple extension of the Louvain method performed the best in these scenarios; however, accuracy remained well below the levels observed for samples without strain-level diversity. Strain resolution is also highly dependent on the amount of available 3C sequence data, suggesting that depth of sequencing must be carefully considered during experimental design. Finally, there appears to be great scope to improve the accuracy of strain resolution through further algorithm development. PMID:27843713
Distributions of Gas and Galaxies from Galaxy Clusters to Larger Scales
NASA Astrophysics Data System (ADS)
Patej, Anna
2017-01-01
We address the distributions of gas and galaxies on three scales: the outskirts of galaxy clusters, the clustering of galaxies on large scales, and the extremes of the galaxy distribution. In the outskirts of galaxy clusters, long-standing analytical models of structure formation and recent simulations predict the existence of density jumps in the gas and dark matter profiles. We use these features to derive models for the gas density profile, obtaining a simple fiducial model that is in agreement with both observations of cluster interiors and simulations of the outskirts. We next consider the galaxy density profiles of clusters; under the assumption that the galaxies in cluster outskirts follow similar collisionless dynamics as the dark matter, their distribution should show a steep jump as well. We examine the profiles of a low-redshift sample of clusters and groups, finding evidence for the jump in some of these clusters. Moving to larger scales where massive galaxies of different types are expected to trace the same large-scale structure, we present a test of this prediction by measuring the clustering of red and blue galaxies at z ~ 0.6, finding low stochasticity between the two populations. These results address a key source of systematic uncertainty - understanding how target populations of galaxies trace large-scale structure - in galaxy redshift surveys. Such surveys use baryon acoustic oscillations (BAO) as a cosmological probe, but are limited by the expense of obtaining sufficiently dense spectroscopy. With the intention of leveraging upcoming deep imaging data, we develop a new method of detecting the BAO in sparse spectroscopic samples via cross-correlation with a dense photometric catalog. This method will permit the extension of BAO measurements to higher redshifts than possible with the existing spectroscopy alone. Lastly, we connect galaxies near and far: the Local Group dwarfs and the high redshift galaxies observed by Hubble and Spitzer. We
Size Distribution of Star Clusters and Stellar Groups in IC2574
NASA Astrophysics Data System (ADS)
Pellerin, Anne; Meyer, Martin J.; Calzetti, Daniela
2017-01-01
We present an HST/ACS archival study of compact and dispersed star clusters and stellar groups found in the nearby galaxy IC 2574. In this work, we identified and characterized the properties of clusters with spatially unresolved stars. We combined these properties with those found in a companion work on the dispersed stellar groups in IC 2574 with spatially resolved stars. We find that the size distribution of all young stellar groups, sparse and compact together, is consistent with the hierarchical model of star formation.
Optical algorithm for calculating the quantity distribution of fiber assembly.
Wu, Meiqin; Wang, Fumei
2016-09-01
A modification of the Kubelka-Munk two-flux model was proposed to describe light propagating through a fiber-air mixture medium; it simplifies the fibers' internal reflection by treating it as part of the scattering over the total fiber path length. A series of systematic experiments demonstrated a higher consistency with the reference quantity distribution than that of the Lambert law commonly applied to the fibrogram in the textile industry. Its application to the fibrogram for measuring cotton fiber length gave good results, and its applicability extends to wool fiber, the length of which is harder to measure than that of cotton fiber.
Reconstructing merger timelines using star cluster age distributions: the case of MCG+08-11-002
NASA Astrophysics Data System (ADS)
Davies, Rebecca L.; Medling, Anne M.; U, Vivian; Max, Claire E.; Sanders, David; Kewley, Lisa J.
2016-05-01
We present near-infrared imaging and integral field spectroscopy of the centre of the dusty luminous infrared galaxy merger MCG+08-11-002, taken using the Near InfraRed Camera 2 (NIRC2) and the OH-Suppressing InfraRed Imaging Spectrograph (OSIRIS) on Keck II. We achieve a spatial resolution of ~25 pc in the K band, allowing us to resolve 41 star clusters in the NIRC2 images. We calculate the ages of 22/25 star clusters within the OSIRIS field using the equivalent widths of the CO 2.3 μm absorption feature and the Br γ nebular emission line. The star cluster age distribution has a clear peak at ages ≲ 20 Myr, indicative of current starburst activity associated with the final coalescence of the progenitor galaxies. There is a possible second peak at ~65 Myr which may be a product of the previous close passage of the galaxy nuclei. We fit single and double starburst models to the star cluster age distribution and use Monte Carlo sampling combined with two-sided Kolmogorov-Smirnov tests to calculate the probability that the observed data are drawn from each of the best-fitting distributions. There is a >90 per cent chance that the data are drawn from either a single or double starburst star formation history, but stochastic sampling prevents us from distinguishing between the two scenarios. Our analysis of MCG+08-11-002 indicates that star cluster age distributions provide valuable insights into the timelines of galaxy interactions and may therefore play an important role in the future development of precise merger stage classification systems.
Distributed Storage Algorithm for Geospatial Image Data Based on Data Access Patterns.
Pan, Shaoming; Li, Yongkai; Xu, Zhengquan; Chong, Yanwen
2015-01-01
Declustering techniques are widely used in distributed environments to reduce query response time through parallel I/O by splitting large files into several small blocks and then distributing those blocks among multiple storage nodes. Unfortunately, however, many small geospatial image data files cannot be further split for distributed storage. In this paper, we propose a complete theoretical system for the distributed storage of small geospatial image data files based on mining the access patterns of geospatial image data using their historical access log information. First, an algorithm is developed to construct an access correlation matrix based on the analysis of the log information, which reveals the patterns of access to the geospatial image data. Then, a practical heuristic algorithm is developed to determine a reasonable solution based on the access correlation matrix. Finally, a number of comparative experiments are presented, demonstrating that our algorithm displays a higher total parallel access probability than those of other algorithms by approximately 10-15% and that the performance can be further improved by more than 20% by simultaneously applying a copy storage strategy. These experiments show that the algorithm can be applied in distributed environments to help realize parallel I/O and thereby improve system performance.
NASA Astrophysics Data System (ADS)
Custódio, Susana; Lima, Vânia; Vales, Dina; Cesca, Simone; Carrilho, Fernando
2016-04-01
The matching between linear trends of hypocentres and fault planes indicated by focal mechanisms (FMs) is frequently used to infer the location and geometry of active faults. This practice works well in regions of fast lithospheric deformation, where earthquake patterns are clear and major structures accommodate the bulk of deformation, but typically fails in regions of slow and distributed deformation. We present a new joint FM and hypocentre cluster algorithm that is able to detect systematically the consistency between hypocentre lineations and FMs, even in regions of distributed deformation. We apply the method to the Azores-western Mediterranean region, with particular emphasis on western Iberia. The analysis relies on a compilation of hypocentres and FMs taken from regional and global earthquake catalogues, academic theses and technical reports, complemented by new FMs for western Iberia. The joint clustering algorithm images both well-known and new seismo-tectonic features. The Azores triple junction is characterised by FMs with vertical pressure (P) axes, in good agreement with the divergent setting, and the Iberian domain is characterised by NW-SE oriented P axes, indicating a response of the lithosphere to the ongoing oblique convergence between Nubia and Eurasia. Several earthquakes remain unclustered in the western Mediterranean domain, which may indicate a response to local stresses. The major regions of consistent faulting that we identify are the mid-Atlantic ridge, the Terceira rift, the Trans-Alboran shear zone and the north coast of Algeria. In addition, other smaller earthquake clusters present a good match between epicentre lineations and FM fault planes. These clusters may signal single active faults or wide zones of distributed but consistent faulting. Mainland Portugal is dominated by strike-slip earthquakes with fault planes coincident with the predominant NNE-SSW and WNW-ESE oriented earthquake lineations. Clusters offshore SW Iberia are
Improved mine blast algorithm for optimal cost design of water distribution systems
NASA Astrophysics Data System (ADS)
Sadollah, Ali; Guen Yoo, Do; Kim, Joong Hoon
2015-12-01
The design of water distribution systems is a large class of combinatorial, nonlinear optimization problems with complex constraints such as conservation of mass and energy equations. Since feasible solutions are often extremely complex, traditional optimization techniques are insufficient. Recently, metaheuristic algorithms have been applied to this class of problems because they are highly efficient. In this article, a recently developed optimizer called the mine blast algorithm (MBA) is considered. The MBA is improved and coupled with the hydraulic simulator EPANET to find the optimal cost design for water distribution systems. The performance of the improved mine blast algorithm (IMBA) is demonstrated using the well-known Hanoi, New York tunnels and Balerma benchmark networks. Optimization results obtained using IMBA are compared to those using MBA and other optimizers in terms of their minimum construction costs and convergence rates. For the complex Balerma network, IMBA offers the cheapest network design compared to other optimization algorithms.
A swarm intelligence based memetic algorithm for task allocation in distributed systems
NASA Astrophysics Data System (ADS)
Sarvizadeh, Raheleh; Haghi Kashani, Mostafa
2012-01-01
This paper proposes a Swarm Intelligence based Memetic algorithm for Task Allocation and scheduling in distributed systems. Task scheduling in distributed systems is known to be an NP-complete problem. Hence, many genetic algorithms have been proposed for searching for optimal solutions in the entire solution space. However, these existing approaches scan the entire solution space without considering techniques that can reduce the complexity of the optimization. Spending too much time on scheduling is considered the main shortcoming of these approaches. Therefore, in this paper a memetic algorithm is used to address this shortcoming. To balance load efficiently, Bee Colony Optimization (BCO) is applied as the local search in the proposed memetic algorithm. Extensive experimental results demonstrate that the proposed method outperforms the existing GA-based method in terms of CPU utilization.
Distributed learning automata-based algorithm for community detection in complex networks
NASA Astrophysics Data System (ADS)
Khomami, Mohammad Mehdi Daliri; Rezvanian, Alireza; Meybodi, Mohammad Reza
2016-03-01
Community structure is an important and universal topological property of many complex networks such as social and information networks. The detection of communities of a network is a significant technique for understanding the structure and function of networks. In this paper, we propose an algorithm based on distributed learning automata for community detection (DLACD) in complex networks. In the proposed algorithm, each vertex of the network is equipped with a learning automaton. Through cooperation among the network of learning automata and updates to the action probabilities of each automaton, the algorithm interactively tries to identify high-density local communities. The performance of the proposed algorithm is investigated through a number of simulations on popular synthetic and real networks. Experimental results in comparison with popular community detection algorithms such as walk trap, Danon greedy optimization, Fuzzy community detection, Multi-resolution community detection and label propagation demonstrated the superiority of DLACD in terms of modularity, NMI, performance, min-max-cut and coverage.
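Modularity, the first of the quality measures listed above, can be computed directly from an edge list and a community assignment. The sketch below uses the standard Newman-Girvan form, Q = sum over communities of (L_c / m) - (d_c / 2m)^2, where L_c is the number of edges inside community c, d_c is its total degree and m is the number of edges. It assumes an undirected, unweighted graph without self-loops or duplicate edges; the example graph and names are illustrative and unrelated to the DLACD experiments.

```python
def modularity(edges, community):
    """Newman-Girvan modularity of a partition of an undirected graph.

    edges: iterable of (u, v) pairs; community: dict vertex -> label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    internal = {}    # number of edges with both endpoints in the community
    degree_sum = {}  # sum of vertex degrees per community
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


# Two triangles joined by a single bridge edge (2, 3).
graph = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(round(modularity(graph, split), 4))  # 0.3571
```

Placing every vertex in one community gives Q = 0, so a positive value indicates more intra-community edges than expected by chance.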
YOUNG STELLAR CLUSTERS WITH A SCHUSTER MASS DISTRIBUTION. I. STATIONARY WINDS
Palous, Jan; Wuensch, Richard; Hueyotl-Zahuantitla, Filiberto; Martinez-Gonzalez, Sergio; Silich, Sergiy; Tenorio-Tagle, Guillermo
2013-08-01
Hydrodynamic models for spherically symmetric winds driven by young stellar clusters with a generalized Schuster stellar density profile are explored. For this we use both semi-analytic models and one-dimensional numerical simulations. We determine the properties of quasi-adiabatic and radiative stationary winds and define the radius at which the flow turns from subsonic to supersonic for all stellar density distributions. Strongly radiative winds significantly diminish their terminal speed and thus their mechanical luminosity is strongly reduced. This also reduces their potential negative feedback into their host galaxy interstellar medium. The critical luminosity above which radiative cooling becomes dominant within the clusters, leading to thermal instabilities which make the winds non-stationary, is determined, and its dependence on the star cluster density profile, core radius, and half-mass radius is discussed.
Distributed Power Allocation for Sink-Centric Clusters in Multiple Sink Wireless Sensor Networks
Cao, Lei; Xu, Chen; Shao, Wei; Zhang, Guoan; Zhou, Hui; Sun, Qiang; Guo, Yuehua
2010-01-01
Due to the battery resource constraints, saving energy is a critical issue in wireless sensor networks, particularly in large sensor networks. One possible solution is to deploy multiple sink nodes simultaneously. Another possible solution is to employ an adaptive clustering hierarchy routing scheme. In this paper, we propose a multiple sink cluster wireless sensor networks scheme which combines the two solutions, and propose an efficient transmission power control scheme for a sink-centric cluster routing protocol in multiple sink wireless sensor networks, denoted as MSCWSNs-PC. It is a distributed, scalable, self-organizing, adaptive system, and the sensor nodes do not require knowledge of the global network and their location. All sinks effectively work out a representative view of a monitored region, after which power control is employed to optimize network topology. The simulations demonstrate the advantages of our new protocol. PMID:22294911
NASA Astrophysics Data System (ADS)
Nonomura, Yoshihiko
2014-11-01
Nonequilibrium relaxation behaviors in the Ising model on a square lattice based on the Wolff algorithm are totally different from those based on local-update algorithms. In particular, the critical relaxation is described by the stretched-exponential decay. We propose a novel scaling procedure to connect nonequilibrium and equilibrium behaviors continuously, and find that the stretched-exponential scaling region in the Wolff algorithm is as wide as the power-law scaling region in local-update algorithms. We also find that relaxation to the spontaneous magnetization in the ordered phase is characterized by the exponential decay, not the stretched-exponential decay based on local-update algorithms.
3D Drop Size Distribution Extrapolation Algorithm Using a Single Disdrometer
NASA Technical Reports Server (NTRS)
Lane, John
2012-01-01
Determining the Z-R relationship (where Z is the radar reflectivity factor and R is rainfall rate) from disdrometer data has been and is a common goal of cloud physicists and radar meteorology researchers. The usefulness of this quantity has traditionally been limited since radar represents a volume measurement, while a disdrometer corresponds to a point measurement. To solve that problem, a 3D-DSD (drop-size distribution) method of determining an equivalent 3D Z-R was developed at the University of Central Florida and tested at the Kennedy Space Center, FL. Unfortunately, that method required a minimum of three disdrometers clustered together within a microscale network (.1-km separation). Since most commercial disdrometers used by the radar meteorology/cloud physics community are high-cost instruments, three disdrometers located within a microscale area is generally not a practical strategy due to the limitations of these kinds of research budgets. A relatively simple modification to the 3D-DSD algorithm provides an estimate of the 3D-DSD and therefore, a 3D Z-R measurement using a single disdrometer. The basis of the horizontal extrapolation is mass conservation of a drop size increment, employing the mass conservation equation. For vertical extrapolation, convolution of a drop size increment using raindrop terminal velocity is used. Together, these two independent extrapolation techniques provide a complete 3D-DSD estimate in a volume around and above a single disdrometer. The estimation error is lowest along a vertical plane intersecting the disdrometer position in the direction of wind advection. This work demonstrates that multiple sensors are not required for successful implementation of the 3D interpolation/extrapolation algorithm. This is a great benefit since it is seldom that multiple sensors in the required spatial arrangement are available for this type of analysis. The original software (developed at the University of Central Florida, 1998-2000) has
A Fast Elitism Gaussian Estimation of Distribution Algorithm and Application for PID Optimization
Xu, Qingyang; Zhang, Chengjin; Zhang, Li
2014-01-01
The estimation of distribution algorithm (EDA) is an intelligent optimization algorithm based on probability and statistics theory. A fast elitism Gaussian estimation of distribution algorithm (FEGEDA) is proposed in this paper. A Gaussian probability model is used to model the solution distribution. The parameters of the Gaussian are derived from the statistical information of the best individuals by a fast learning rule. The fast learning rule is used to enhance the efficiency of the algorithm, and an elitism strategy is used to maintain the convergence performance. The performance of the algorithm is examined on several benchmarks. In the simulations, a one-dimensional benchmark is used to visualize the optimization process and the probability model learning process during the evolution, and several two-dimensional and higher dimensional benchmarks are used to test the performance of FEGEDA. The experimental results indicate the capability of FEGEDA, especially in the higher dimensional problems, and FEGEDA exhibits better performance than some other algorithms and EDAs. Finally, FEGEDA is used in PID controller optimization of a PMSM and compared with the classical PID and GA. PMID:24892059
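The general loop the abstract describes (sample from a Gaussian model, select the best individuals, refit the model, keep an elite) can be sketched as follows. This is a generic univariate Gaussian EDA with a simple elitism step, written under assumed settings; it does not implement the paper's fast learning rule, and the population size, starting mean and spread, and the sphere benchmark are all choices made for this illustration.

```python
import random
import statistics


def gaussian_eda(objective, dim, pop_size=100, n_best=20, generations=40, seed=0):
    """Minimize `objective` with an independent Gaussian per dimension."""
    rng = random.Random(seed)
    mean = [2.0] * dim   # assumed starting mean of the model
    sigma = [3.0] * dim  # assumed starting standard deviation
    elite, elite_f = None, float("inf")
    for _ in range(generations):
        # Sample a new population from the current Gaussian model.
        pop = [[rng.gauss(mean[d], sigma[d]) for d in range(dim)]
               for _ in range(pop_size)]
        if elite is not None:
            pop[0] = elite  # elitism: the best-so-far individual survives
        pop.sort(key=objective)
        best = pop[:n_best]
        if objective(best[0]) < elite_f:
            elite, elite_f = best[0], objective(best[0])
        # Refit mean and standard deviation from the selected individuals.
        for d in range(dim):
            column = [ind[d] for ind in best]
            mean[d] = statistics.fmean(column)
            sigma[d] = max(statistics.pstdev(column), 1e-12)
    return elite, elite_f


def sphere(x):
    """Simple convex benchmark with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


solution, value = gaussian_eda(sphere, dim=2)
print(solution, value)
```

Because the elite is reinserted each generation, the best objective value never gets worse from one generation to the next.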
The Velocity Distribution Function of Galaxy Clusters as a Cosmological Probe
NASA Astrophysics Data System (ADS)
Ntampaka, M.; Trac, H.; Cisewski, J.; Price, L. C.
2017-01-01
We present a new approach for quantifying the abundance of galaxy clusters and constraining cosmological parameters using dynamical measurements. In the standard method, galaxy line-of-sight velocities, $v$, or velocity dispersions are used to infer cluster masses, $M$, to quantify the halo mass function (HMF), $dn(M)/d\log(M)$, which is strongly affected by mass measurement errors. In our new method, the probability distributions of velocities for each cluster in the sample are summed to create a new statistic called the velocity distribution function (VDF), $dn(v)/dv$. The VDF can be measured more directly and precisely than the HMF and can be robustly predicted with cosmological simulations that capture the dynamics of subhalos or galaxies. We apply these two methods to realistic (ideal) mock cluster catalogs with (without) interlopers and forecast the bias and constraints on the matter density parameter $\Omega_m$ and the amplitude of matter fluctuations $\sigma_8$ in flat ΛCDM cosmologies. For an example observation of 200 massive clusters, the VDF with (without) interloping galaxies constrains the parameter combination $\sigma_8 \Omega_m^{0.29\,(0.29)} = 0.589 \pm 0.014$ ($0.584 \pm 0.011$) and shows only minor bias. However, the HMF with interlopers is biased to low $\Omega_m$ and high $\sigma_8$ and the fiducial model lies well outside of the forecast constraints, prior to accounting for Eddington bias. When the VDF is combined with constraints from the cosmic microwave background, the degeneracy between cosmological parameters can be significantly reduced. Upcoming spectroscopic surveys that probe larger volumes and fainter magnitudes will provide clusters for applying the VDF as a cosmological probe.
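The summation that defines the VDF can be illustrated with a small sketch. The assumptions here are not taken from the abstract: each cluster's probability distribution is approximated by a normalised histogram of absolute line-of-sight velocities, the sum is divided by a survey volume to give a number density, and the function names, bin edges, velocities and volume are all hypothetical.

```python
def velocity_pdf(velocities, bin_edges):
    """Normalised histogram of |v| for one cluster (integrates to 1 over the bins)."""
    counts = [0] * (len(bin_edges) - 1)
    for v in velocities:
        speed = abs(v)
        for k in range(len(counts)):
            if bin_edges[k] <= speed < bin_edges[k + 1]:
                counts[k] += 1
                break
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / (total * (bin_edges[k + 1] - bin_edges[k])) for k, c in enumerate(counts)]


def velocity_distribution_function(clusters, bin_edges, volume):
    """Sum the per-cluster PDFs and divide by the survey volume: dn/dv per bin."""
    vdf = [0.0] * (len(bin_edges) - 1)
    for velocities in clusters:
        for k, p in enumerate(velocity_pdf(velocities, bin_edges)):
            vdf[k] += p / volume
    return vdf


# Made-up line-of-sight velocities (km/s) for two clusters in a volume of 2 units.
clusters = [[100.0, -250.0, 400.0], [50.0, 600.0, -700.0, 900.0]]
edges = [0.0, 500.0, 1000.0]
vdf = velocity_distribution_function(clusters, edges, volume=2.0)
```

Since every cluster contributes a distribution that integrates to one, the integral of the summed statistic equals the number of clusters per unit volume, regardless of how many galaxies each cluster contains.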
F Turnoff Distribution in the Galactic Halo Using Globular Clusters as Proxies
NASA Astrophysics Data System (ADS)
Newby, Matthew; Newberg, H. J.; Simones, J.; Monaco, M.; Cole, N.
2012-01-01
F turnoff stars are important tools for studying Galactic halo substructure because they are plentiful, luminous, and can be easily selected by their photometric colors from large surveys such as the Sloan Digital Sky Survey (SDSS). We describe the absolute magnitude distribution of color-selected F turnoff stars, as measured from SDSS data, for eleven globular clusters in the Milky Way halo. We find that the absolute magnitude distribution of turnoff stars is intrinsically the same for all clusters studied, and is well fit by two half Gaussian functions, centered at μ = 4.18, with a bright-side σ = 0.36, and with a faint-side σ = 0.76. However, the color errors and detection efficiencies cause the observed σ of the faint-side Gaussian to change with magnitude due to contamination from redder main sequence stars (40% at 21st magnitude). We present a function that will correct for this magnitude-dependent change in selected stellar populations, when calculating stellar density from color-selected turnoff stars. We also present a consistent set of distances, ages and metallicities for eleven clusters in the SDSS Data Release 7. We calculate a linear correction function to Padova isochrones so that they are consistent with SDSS globular cluster data from previous papers. We show that our cluster population falls along the theoretical Age-Metallicity Relationship (AMR), and further find that isochrones for stellar populations on the AMR have very similar turnoffs; increasing metallicity and decreasing age conspire to produce similar turnoff magnitudes and colors for all old clusters that lie on the AMR. This research was supported by NSF grant AST 10-09670 and the NASA/NY Space Grant.
Distributed consensus for metamorphic systems using a gossip algorithm for CAT(0) metric spaces
NASA Astrophysics Data System (ADS)
Bellachehab, Anass; Jakubowicz, Jérémie
2015-01-01
We present an application of distributed consensus algorithms to metamorphic systems. A metamorphic system is a set of identical units that can self-assemble to form a rigid structure. For instance, one can think of a robotic arm composed of multiple links connected by joints. The system can change its shape in order to adapt to different environments via reconfiguration of its constituent units. We assume in this work that several metamorphic systems form a network: two systems are connected whenever they are able to communicate with each other. The aim of this paper is to propose a distributed algorithm that synchronizes all the systems in the network. Synchronizing means that all the systems should end up having the same configuration. This aim is achieved in two steps: (i) we cast the problem as a consensus problem on a metric space and (ii) we use a recent distributed consensus algorithm that only makes use of metrical notions.
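A pairwise gossip update that relies only on metric notions can be sketched as follows: a random pair of connected systems wakes up and both move to the midpoint of the geodesic joining their states. The sketch assumes the simplest CAT(0) space, ordinary Euclidean space, where the geodesic midpoint is the arithmetic mean; the ring network, the initial configurations and the function names are hypothetical and are not taken from the paper.

```python
import random


def midpoint(p, q):
    """Geodesic midpoint in Euclidean space (a simple example of a CAT(0) space)."""
    return tuple((a + b) / 2.0 for a, b in zip(p, q))


def gossip_consensus(states, edges, rounds=2000, seed=1):
    """Randomised pairwise gossip: a random edge wakes up and both ends move to their midpoint."""
    rng = random.Random(seed)
    states = list(states)
    for _ in range(rounds):
        i, j = rng.choice(edges)
        m = midpoint(states[i], states[j])
        states[i] = m
        states[j] = m
    return states


# Hypothetical network: five systems on a ring, each with a 2-D configuration.
initial = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0), (2.0, 8.0)]
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
final = gossip_consensus(initial, ring)
spread = max(abs(a - b) for p in final for q in final for a, b in zip(p, q))
```

In the Euclidean case the update preserves the network average, so all states converge to the mean of the initial configurations; in a general CAT(0) space only the midpoint operation changes, which is why the update needs nothing beyond the metric.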
Estimation of Distribution Algorithm Based on PCFG-LA Mixture Model
NASA Astrophysics Data System (ADS)
Hasegawa, Yoshihiko; Iba, Hitoshi
Estimation of distribution algorithms (EDAs) are evolutionary algorithms which substitute traditional genetic operators with distribution estimation and sampling. Recently, the application of probabilistic techniques to program and function evolution has received increasing attention, and promises to provide a strong alternative to the traditional genetic programming (GP) techniques. Although PAGE (Programming with Annotated Grammar Estimation) is a state-of-the-art GP-EDA based on PCFG-LA (PCFG with Latent Annotations), PAGE cannot effectively estimate the distribution when there are multiple solutions. In this paper, we propose an extended PCFG-LA named PCFG-LAMM (PCFG-LA Mixture Model) and propose UPAGE (Unsupervised PAGE) based on PCFG-LAMM. By applying the proposed algorithm to three computational problems, it is demonstrated that our approach requires fewer fitness evaluations. We also show that UPAGE is capable of obtaining multiple solutions in a multimodal problem.
A new stochastic algorithm for inversion of dust aerosol size distribution
NASA Astrophysics Data System (ADS)
Wang, Li; Li, Feng; Yang, Ma-ying
2015-08-01
Dust aerosol size distribution is an important source of information about atmospheric aerosols, and it can be determined from multiwavelength extinction measurements. This paper describes a stochastic inverse technique based on the artificial bee colony (ABC) algorithm to invert the dust aerosol size distribution by the light extinction method. The direct problems for the size distributions of water drops and dust particles, which are the main elements of atmospheric aerosols, are solved by the Mie theory and the Lambert-Beer Law in the multispectral region. Then, the parameters of three widely used functions, i.e. the log normal distribution (L-N), the Junge distribution (J-J), and the normal distribution (N-N), which can provide the most useful representation of aerosol size distributions, are inverted by the ABC algorithm in the dependent model. Numerical results show that the ABC algorithm can be successfully applied to recover the aerosol size distribution with high feasibility and reliability even in the presence of random noise.
Li, Weizhong [San Diego Supercomputer Center
2016-07-12
San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
Mass distribution and Dynamical State of Galaxy Clusters in the LZLS Sample
NASA Astrophysics Data System (ADS)
Campusano, L. E.; Cypriano, E. S.; Sodré, L., Jr.; Kneib, J.-P.
We use the weak gravitational lensing effect to study the mass distribution of a sample of 50 southern Abell clusters (0.05
NASA Astrophysics Data System (ADS)
Koksal, Canan; Akbas, Ugur; Okutan, Murat; Demir, Bayram; Hakki Sarpun, Ismail
2015-07-01
Commercial treatment planning systems with different dose calculation algorithms have been developed for radiotherapy plans. The Ray Tracing and the Monte Carlo dose calculation algorithms are available for the MultiPlan treatment planning system. Many studies have indicated that the Monte Carlo algorithm yields more accurate dose distributions in heterogeneous regions such as the lung than the Ray Tracing algorithm. The purpose of this study was to compare the Ray Tracing algorithm with the Monte Carlo algorithm for lung tumors in the CyberKnife System. An Alderson Rando anthropomorphic phantom was used for creating CyberKnife treatment plans. The treatment plan was developed using the Ray Tracing algorithm. Then, this plan was recalculated with the Monte Carlo algorithm. EBT3 radiochromic films were put in the phantom to obtain measured dose distributions. The calculated doses were compared with the measured doses. The Monte Carlo algorithm is a more accurate dose calculation method than the Ray Tracing algorithm in nonhomogeneous structures.
An efficient algorithm for generating random number pairs drawn from a bivariate normal distribution
NASA Technical Reports Server (NTRS)
Campbell, C. W.
1983-01-01
An efficient algorithm for generating random number pairs from a bivariate normal distribution was developed. Any desired value of the two means, two standard deviations, and correlation coefficient can be selected. Theoretically the technique is exact and in practice its accuracy is limited only by the quality of the uniform distribution random number generator, inaccuracies in computer function evaluation, and arithmetic. A FORTRAN routine was written to check the algorithm and good accuracy was obtained. Some small errors in the correlation coefficient were observed to vary in a surprisingly regular manner. A simple model was developed which explained the qualitative aspects of the errors.
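One standard way to produce such pairs from a uniform generator is sketched below. The abstract does not say which construction the report uses, so this is an assumed textbook approach: the Box-Muller transform yields two independent standard normals, which are then mixed linearly to obtain the requested correlation. The function names and the example parameters are hypothetical.

```python
import math
import random


def bivariate_normal_pair(mu_x, mu_y, sigma_x, sigma_y, rho, rng=random):
    """Return one (x, y) pair with the requested means, standard deviations and correlation."""
    # Box-Muller: two independent uniforms -> two independent standard normals.
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    z1 = radius * math.cos(2.0 * math.pi * u2)
    z2 = radius * math.sin(2.0 * math.pi * u2)
    # Mix the independent normals so that corr(x, y) = rho.
    x = mu_x + sigma_x * z1
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y


def sample_correlation(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
pairs = [bivariate_normal_pair(1.0, -2.0, 2.0, 0.5, 0.6, rng) for _ in range(20000)]
r = sample_correlation(pairs)
```

The construction is exact in theory, so any deviation of the sample correlation from the requested value comes from finite sample size, the uniform generator, and floating-point evaluation, which matches the error sources listed in the abstract.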
NASA Astrophysics Data System (ADS)
Quinn, Kevin Martin
The total amount of precipitation integrated across a precipitation cluster (contiguous precipitating grid cells exceeding a minimum rain rate) is a useful measure of the aggregate size of the disturbance, expressed as the rate of water mass lost or latent heat released, i.e. the power of the disturbance. Probability distributions of cluster power are examined during boreal summer (May-September) and winter (January-March) using satellite-retrieved rain rates from the Tropical Rainfall Measuring Mission (TRMM) 3B42 and Special Sensor Microwave Imager and Sounder (SSM/I and SSMIS) programs, model output from the High Resolution Atmospheric Model (HIRAM, roughly 0.25-0.5° resolution), seven 1-2° resolution members of the Coupled Model Intercomparison Project Phase 5 (CMIP5) experiment, and National Center for Atmospheric Research Large Ensemble (NCAR LENS). Spatial distributions of precipitation-weighted centroids are also investigated in observations (TRMM-3B42) and climate models during winter as a metric for changes in mid-latitude storm tracks. Observed probability distributions for both seasons are scale-free from the smallest clusters up to a cutoff scale at high cluster power, after which the probability density drops rapidly. When low rain rates are excluded by choosing a minimum rain rate threshold in defining clusters, the models accurately reproduce observed cluster power statistics and winter storm tracks. Changes in behavior in the tail of the distribution, above the cutoff, are important for impacts since these quantify the frequency of the most powerful storms. End-of-century cluster power distributions and storm track locations are investigated in these models under a "business as usual" global warming scenario. The probability of high cluster power events increases by end-of-century across all models, by up to an order of magnitude for the highest-power events for which statistics can be computed. For the three models in the suite with continuous
Single-cluster algorithm for the site-bond-correlated Ising model
NASA Astrophysics Data System (ADS)
Campos, P. R. A.; Onody, R. N.
1997-12-01
We extend the Wolff algorithm to include correlated spin interactions in diluted magnetic systems. This algorithm is applied to study the site-bond-correlated Ising model on a two-dimensional square lattice. We use a finite-size scaling procedure to obtain the phase diagram in the temperature-concentration space. We also have verified that the autocorrelation time diminishes in the presence of dilution and correlation, showing that the Wolff algorithm performs even better in such situations.
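For reference, one update of the standard Wolff single-cluster algorithm for the plain ferromagnetic Ising model (coupling J = 1, periodic square lattice) can be sketched as follows. This is the textbook version only; the dilution and site-bond correlation that the paper adds are not modelled, and the function name and lattice size are hypothetical.

```python
import math
import random


def wolff_step(spins, size, beta, rng):
    """Grow and flip one Wolff cluster on a periodic size x size Ising lattice (J = 1)."""
    p_add = 1.0 - math.exp(-2.0 * beta)  # probability of activating a bond between aligned spins
    seed = (rng.randrange(size), rng.randrange(size))
    seed_spin = spins[seed[0]][seed[1]]
    cluster = {seed}
    stack = [seed]
    while stack:
        i, j = stack.pop()
        neighbours = [
            ((i + 1) % size, j),
            ((i - 1) % size, j),
            (i, (j + 1) % size),
            (i, (j - 1) % size),
        ]
        for ni, nj in neighbours:
            # Only aligned neighbours outside the cluster can join, each with probability p_add.
            if (ni, nj) not in cluster and spins[ni][nj] == seed_spin and rng.random() < p_add:
                cluster.add((ni, nj))
                stack.append((ni, nj))
    # Flip the whole cluster at once.
    for i, j in cluster:
        spins[i][j] = -seed_spin
    return len(cluster)


rng = random.Random(0)
lattice = [[1] * 8 for _ in range(8)]
flipped = wolff_step(lattice, 8, beta=10.0, rng=rng)
```

At very low temperature (large `beta`) nearly every bond is activated and the cluster spans the aligned lattice, while at `beta = 0` only the seed spin flips; between these limits the cluster size tracks the correlation length, which is what keeps autocorrelation times short near criticality.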
Clustering and velocity distributions in granular gases cooling by solid friction.
Das, Prasenjit; Puri, Sanjay; Schwartz, Moshe
2016-09-01
We present large-scale molecular dynamics simulations to study the free evolution of granular gases. Initially, the density of particles is homogeneous and the velocity follows a Maxwell-Boltzmann (MB) distribution. The system cools down due to solid friction between the granular particles. The density remains homogeneous, and the velocity distribution remains MB at early times, while the kinetic energy of the system decays with time. However, fluctuations in the density and velocity fields grow, and the system evolves via formation of clusters in the density field and the local ordering of velocity field, consistent with the onset of plug flow. This is accompanied by a transition of the velocity distribution function from MB to non-MB behavior. We used equal-time correlation functions and structure factors of the density and velocity fields to study the morphology of clustering. From the correlation functions, we obtain the cluster size, L, as a function of time, t. We show that it exhibits power law growth with L(t)∼t^{1/3}.
Clustering and velocity distributions in granular gases cooling by solid friction
NASA Astrophysics Data System (ADS)
Das, Prasenjit; Puri, Sanjay; Schwartz, Moshe
2016-09-01
We present large-scale molecular dynamics simulations to study the free evolution of granular gases. Initially, the density of particles is homogeneous and the velocity follows a Maxwell-Boltzmann (MB) distribution. The system cools down due to solid friction between the granular particles. The density remains homogeneous, and the velocity distribution remains MB at early times, while the kinetic energy of the system decays with time. However, fluctuations in the density and velocity fields grow, and the system evolves via formation of clusters in the density field and the local ordering of velocity field, consistent with the onset of plug flow. This is accompanied by a transition of the velocity distribution function from MB to non-MB behavior. We used equal-time correlation functions and structure factors of the density and velocity fields to study the morphology of clustering. From the correlation functions, we obtain the cluster size, L, as a function of time, t. We show that it exhibits power law growth with L(t)∼t^{1/3}.
Ginga observations of the Coma cluster and studies of the spatial distribution of iron
NASA Technical Reports Server (NTRS)
Hughes, John P.; Butcher, Jackie A.; Stewart, Gordon C.; Tanaka, Yasuo
1993-01-01
The large area counters on the Japanese satellite Ginga have been used to determine the X-ray spectrum from the central region of the Coma cluster of galaxies over the energy range from 1.5 to 20 keV. The spectrum is well represented by an isothermal model of temperature 8.21 +/- 0.16 keV and a heavy element (iron) abundance of 0.212 +/- 0.027, relative to the cosmic value. The Ginga spectrum was found to be consistent with the X-ray spectra from the Tenma and EXOSAT satellites for a large class of nonisothermal temperature distributions. The measured iron elemental abundances were used to set a lower limit on the total mass of iron in Coma under the assumption that the iron is not distributed uniformly throughout the cluster. The mass ratio of iron relative to hydrogen (within 2 Mpc) is not less than 18 percent of the cosmic iron to hydrogen mass ratio. This compares to an average abundance of 24 percent if the iron is distributed uniformly. We discuss these results in terms of models for the production of iron in galaxy clusters.
Fast distributed large-pixel-count hologram computation using a GPU cluster.
Pan, Yuechao; Xu, Xuewu; Liang, Xinan
2013-09-10
Large-pixel-count holograms are an essential part of large-size holographic three-dimensional (3D) displays, but the generation of such holograms is computationally demanding. In order to address this issue, we have built a graphics processing unit (GPU) cluster with 32.5 Tflop/s computing power and implemented distributed hologram computation on it with speed improvement techniques, such as shared memory on GPU, GPU level adaptive load balancing, and node level load distribution. Using these speed improvement techniques on the GPU cluster, we have achieved a 71.4-fold increase in computation speed for 186M-pixel holograms. Furthermore, we have used the approaches of diffraction limits and subdivision of holograms to overcome the GPU memory limit in computing large-pixel-count holograms. 745M-pixel and 1.80G-pixel holograms were computed in 343 and 3326 s, respectively, for more than 2 million object points with RGB colors. Color 3D objects with 1.02M points were successfully reconstructed from a 186M-pixel hologram computed in 8.82 s with all the above three speed improvement techniques. It is shown that distributed hologram computation using a GPU cluster is a promising approach to increase the computation speed of large-pixel-count holograms for large-size holographic display.
Mandel, Pierre; Maurel, Marie; Chenu, Damien
2015-12-15
The complexity of water distribution networks raises challenges in managing, monitoring and understanding their behavior. This article proposes a novel methodology applying data clustering to the results of hydraulic simulation to define quality zones, i.e. zones with the same dynamic water origin. The methodology is presented on an existing Water Distribution Network; a large dataset of conductivity measurements recorded by 32 probes validates the definition of the quality zones. The results show how quality zones help to better understand the network operation and how they can be used to analyze water quality events. Moreover, a statistical comparison with 158,230 conductivity measurements validates the definition of the quality zones.
NASA Astrophysics Data System (ADS)
Zhang, Yanjun; Yu, Chunjuan; Fu, Xinghu; Liu, Wenzhe; Bi, Weihong
2015-12-01
In the distributed optical fiber sensing system based on Brillouin scattering, strain and temperature are the main measured parameters, which can be obtained by analyzing the Brillouin center frequency shift. A novel algorithm which combines the cuckoo search (CS) algorithm with the improved differential evolution (IDE) algorithm is proposed for Brillouin scattering parameter estimation. The CS-IDE algorithm is compared with the CS algorithm and analyzed in different situations. The results show that both the CS and CS-IDE algorithms have very good convergence. The analysis reveals that the CS-IDE algorithm can extract the scattering spectrum features with different linear weight ratios, linewidth combinations and SNRs. Moreover, the BOTDR temperature measuring system based on electron optical frequency shift is set up to verify the effectiveness of the CS-IDE algorithm. Experimental results show that there is a good linear relationship between the Brillouin center frequency shift and temperature changes.
The size distribution of spatiotemporal extreme rainfall clusters around the globe
NASA Astrophysics Data System (ADS)
Traxl, D.; Boers, N.; Rheinwalt, A.; Goswami, B.; Kurths, J.
2016-09-01
The scaling behavior of rainfall has been extensively studied both in terms of event magnitudes and in terms of spatial extents of the events. Different heavy-tailed distributions have been proposed as candidates for both instances, but statistically rigorous treatments are rare. Here we combine the domains of event magnitudes and event area sizes by a spatiotemporal integration of 3-hourly rain rates corresponding to extreme events derived from the quasi-global high-resolution rainfall product Tropical Rainfall Measuring Mission 3B42. A maximum likelihood evaluation reveals that the distribution of spatiotemporally integrated extreme rainfall cluster sizes over the oceans is best described by a truncated power law, calling into question previous statements about scale-free distributions. The observed subpower law behavior of the distribution's tail is evaluated with a simple generative model, which indicates that the exponential truncation of an otherwise scale-free spatiotemporal cluster size distribution over the oceans could be explained by the existence of land masses on the globe.
Multirate parallel distributed compensation of a cluster in wireless sensor and actor networks
NASA Astrophysics Data System (ADS)
Yang, Chun-xi; Huang, Ling-yun; Zhang, Hao; Hua, Wang
2016-01-01
The stabilisation problem for one of the clusters with bounded multiple random time delays and packet dropouts in wireless sensor and actor networks is investigated in this paper. A new multirate switching model is constructed to describe the features of this single-input multiple-output linear system. Because controller design under multiple constraints is difficult in the multirate switching model, this model can be converted to a Takagi-Sugeno fuzzy model. By designing a multirate parallel distributed compensation, a sufficient condition is established to ensure that the closed-loop fuzzy control system is globally exponentially stable. The multirate parallel distributed compensation gains can be obtained by solving an auxiliary convex optimisation problem. Finally, two numerical examples are given to show that the multirate parallel distributed compensation is easier to obtain than a switching controller. Furthermore, it has stronger robust stability than an arbitrary switching controller and single-rate parallel distributed compensation under the same conditions.
Distributing Power Grid State Estimation on HPC Clusters A System Architecture Prototype
Liu, Yan; Jiang, Wei; Jin, Shuangshuang; Rice, Mark J.; Chen, Yousu
2012-08-20
The future power grid is expected to further expand with highly distributed energy sources and smart loads. The increased size and complexity lead to an increased burden on existing computational resources in energy control centers. Thus the need to perform real-time assessment on such systems entails efficient means to distribute centralized functions such as state estimation in the power system. In this paper, we present our early prototype of a system architecture that connects distributed state estimators, each running parallel programs to solve the non-linear estimation procedure. The prototype consists of a middleware and data processing toolkits that allow data exchange in the distributed state estimation. We build a test case based on the IEEE 118 bus system and partition the state estimation of the whole system model across the available HPC clusters. The measurements from the testbed demonstrate the low overhead of our solution.
A CHANDRA STUDY OF TEMPERATURE DISTRIBUTIONS OF THE INTRACLUSTER MEDIUM IN 50 GALAXY CLUSTERS
Zhu, Zhenghao; Xu, Haiguang; Li, Weitian; Hu, Dan; Zhang, Chenhao; Liu, Chengze; Wang, Jingying; Gu, Junhua; Wu, Xiang-Ping; Gu, Liyi; An, Tao; Zhang, Zhongli; Zhu, Jie
2016-01-10
To investigate the spatial distribution of the intracluster medium temperature in galaxy clusters in a quantitative way and probe the physics behind it, we analyze the X-ray spectra from a sample of 50 clusters that were observed with the Chandra ACIS instrument over the past 15 years and measure the radial temperature profiles out to $0.45\,r_{500}$. We construct a physical model that takes into consideration the effects of gravitational heating, thermal history (such as radiative cooling, active galactic nucleus feedback, and thermal conduction), and work done via gas compression, and use it to fit the observed temperature profiles by running Bayesian regressions. The results show that in all cases our model provides an acceptable fit at the 68% confidence level. For further validation, we select nine clusters that have been observed with both Chandra (out to $\gtrsim 0.3\,r_{500}$) and Suzaku (out to $\gtrsim 1.5\,r_{500}$) and fit their Chandra spectra with our model. We then compare the extrapolation of the best fits with the Suzaku measurements and find that the model profiles agree with the Suzaku results very well in seven clusters. In the remaining two clusters the difference between the model and the observation is possibly caused by local thermal substructures. Our study also implies that for most of the clusters the assumption of hydrostatic equilibrium is safe out to at least $0.5\,r_{500}$ and the non-gravitational interactions between dark matter and its luminous counterparts are consistent with zero.
Swarm Intelligence in Text Document Clustering
Cui, Xiaohui; Potok, Thomas E
2008-01-01
Social animals or insects in nature often exhibit a form of emergent collective behavior. The research field that attempts to design algorithms or distributed problem-solving devices inspired by the collective behavior of social insect colonies is called Swarm Intelligence. Compared to traditional algorithms, swarm algorithms are usually flexible, robust, decentralized and self-organized. These characteristics make swarm algorithms suitable for solving complex problems, such as document collection clustering. The major challenge of today's information society is that users are overwhelmed with information on any topic they search for. Fast and high-quality document clustering algorithms play an important role in helping users to effectively navigate, summarize, and organize this overwhelming amount of information. In this chapter, we introduce three nature-inspired swarm intelligence clustering approaches for document clustering analysis. These clustering algorithms use stochastic and heuristic principles discovered from observing bird flocks, fish schools and ant foraging.
McFarland, Christopher D
2015-01-01
The Ziggurat Algorithm is a very fast rejection sampling method for generating PseudoRandom Numbers (PRNs) from statistical distributions. In the algorithm, rectangular sampling domains are layered on top of each other (resembling a ziggurat) to encapsulate the desired probability density function. Random values within these layers are sampled and then returned if they lie beneath the graph of the probability density function. Here, we present an implementation where ziggurat layers reside completely beneath the probability density function, thereby eliminating the need for any rejection test within the ziggurat layers. In the new algorithm, small overhanging segments of probability density remain to the right of each ziggurat layer, which can be efficiently sampled with triangularly-shaped sampling domains. Median runtimes of the new algorithm for exponential and normal variates are reduced to 58% and 53% respectively (collective range: 41–93%). An accessible C library, along with extensions into Python and MATLAB/Octave, is provided. PMID:27041780
An improved distributed routing algorithm for Benes based optical NoC
NASA Astrophysics Data System (ADS)
Zhang, Jing; Gu, Huaxi; Yang, Yintang
2010-08-01
Integrated optical interconnect is believed to be one of the main technologies to replace electrical wires. Optical Network-on-Chip (ONoC) has attracted increasing attention. Benes topology is a good choice for ONoC because of its rearrangeable non-blocking character, multistage feature and easy scalability. The routing algorithm plays an important role in determining the performance of ONoC. Because traditional routing algorithms for the Benes network are not suitable for ONoC communication, we develop a new distributed routing algorithm for Benes ONoC in this paper. Our algorithm selects the routing path dynamically according to network conditions and enables more path choices for messages traveling in the network. We used OPNET to evaluate the performance of our routing algorithm and also compared it with a well-known bit-controlled routing algorithm. ETE delay and throughput are shown under different packet lengths and network sizes. Simulation results show that our routing algorithm can provide better performance for ONoC.
A data distributed parallel algorithm for ray-traced volume rendering
NASA Technical Reports Server (NTRS)
Ma, Kwan-Liu; Painter, James S.; Hansen, Charles D.; Krogh, Michael F.
1993-01-01
This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and networked workstations. This algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. A subimage is generated by each processing unit and the final image is obtained by compositing subimages in the proper order, which can be determined a priori. Test results on both the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
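The final compositing step can be illustrated with the standard "over" operator applied to subimages in front-to-back order. This is a serial sketch under stated assumptions, not the paper's parallel compositing method: pixels are assumed to be premultiplied (colour, alpha) pairs with a single colour channel, and the function names and the three example subimages are hypothetical.

```python
def over(front, back):
    """Composite one premultiplied (colour, alpha) pixel over another."""
    cf, af = front
    cb, ab = back
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)


def composite(subimages):
    """Combine per-node subimages, listed front to back, pixel by pixel."""
    result = subimages[0]
    for image in subimages[1:]:
        result = [over(f, b) for f, b in zip(result, image)]
    return result


# Hypothetical two-pixel subimages from three processing nodes, nearest subvolume first.
node_front = [(0.5, 0.5), (0.0, 0.0)]
node_mid = [(0.2, 0.4), (1.0, 1.0)]
node_back = [(1.0, 1.0), (0.3, 0.3)]
final_image = composite([node_front, node_mid, node_back])
```

The "over" operator is associative, so neighbouring subimages can be merged in any grouping as long as the front-to-back order is respected. That property is what allows compositing to be spread across processing units once the order has been fixed in advance.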
Tsai, Ming-Hui; Huang, Yueh-Min
2014-11-18
Wireless sensor networks (WSNs) have emerged as a promising solution for various applications due to their low cost and easy deployment. Typically, their limited power capability, i.e., battery power, makes extending network lifetime a challenge for WSNs. Many hierarchical protocols in the literature show better energy efficiency. In addition, data reduction based on the correlation of sensed readings can efficiently reduce the number of required transmissions. Therefore, we use a sub-clustering procedure based on spatial data correlation to further separate the hierarchical (clustered) architecture of a WSN. The proposed algorithm (2TC-cor) is composed of two procedures: the prediction model construction procedure and the sub-clustering procedure. Energy conservation benefits from the reduced transmissions, which depend on the prediction model. Energy can be further conserved because of the representative mechanism of sub-clustering. Simulation results show that 2TC-cor can effectively conserve energy and monitor the environment accurately within an acceptable level.
NASA Astrophysics Data System (ADS)
Łokas, Ewa L.; Mamon, Gary A.
2003-08-01
We study velocity moments of elliptical galaxies in the Coma cluster using Jeans equations. The dark matter distribution in the cluster is modelled by a generalized formula based upon the results of cosmological N-body simulations. Its inner slope (cuspy or flat), concentration and mass within the virial radius are kept as free parameters, as well as the velocity anisotropy, assumed independent of position. We show that the study of line-of-sight velocity dispersion alone does not allow us to constrain the parameters. By a joint analysis of the observed profiles of velocity dispersion and kurtosis, we are able to break the degeneracy between the mass distribution and velocity anisotropy. We determine the dark matter distribution at radial distances larger than 3 per cent of the virial radius and we find that the galaxy orbits are close to isotropic. Due to limited resolution, different inner slopes are found to be consistent with the data and we observe a strong degeneracy between the inner slope $\alpha$ and concentration $c$; the best-fitting profiles have the two parameters related with $c = 19 - 9.6\alpha$. Our best-fitting Navarro-Frenk-White profile has concentration $c = 9$, which is 50 per cent higher than standard values found in cosmological simulations for objects of similar mass. The total mass within the virial radius of $2.9\,h_{70}^{-1}$ Mpc is $1.4 \times 10^{15}\,h_{70}^{-1}\,M_\odot$ (with 30 per cent accuracy), 85 per cent of which is dark. At this distance from the cluster centre, the mass-to-light ratio in the blue band is $351\,h_{70}$ solar units. The total mass within the virial radius leads to estimates of the density parameter of the Universe, assuming that clusters trace the mass-to-light ratio and baryonic fraction of the Universe, with $\Omega_0 = 0.29 \pm 0.1$.
Balouchestani, Mohammadreza; Krishnan, Sridhar
2014-01-01
Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for the diagnosis and treatment of heart diseases. Clustering and classification of the collected data are essential for detecting concealed information in the P-QRS-T waves of long-term ECG recordings. Currently used algorithms have two main drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and sampling load. These drawbacks motivated us to develop a novel optimized clustering algorithm that can easily scan large ECG datasets to enable low-power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by sorting of the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are applied to the proposed algorithm. We show that our algorithm based on PCA features in combination with the K-NN classifier performs better than the other methods. The proposed algorithm outperforms existing algorithms, increasing classification accuracy by 11%. In addition, the proposed algorithm achieves classification accuracies of 99.98% and 99.83% for the K-NN and PNN classifiers, respectively, and a Receiver Operating Characteristics (ROC) area of 99.75%.
Vinogradov, S A; Wilson, D F
1994-01-01
A new method for analysis of phosphorescence lifetime distributions in heterogeneous systems has been developed. This method is based on decomposition of the data vector to a linearly independent set of exponentials and uses quadratic programming principles for χ² minimization. Solution of the resulting algorithm requires a finite number of calculations (it is not iterative) and is computationally fast and robust. The algorithm has been tested on various simulated decays and for analysis of phosphorescence measurements of experimental systems with discrete distributions of lifetimes. Critical analysis of the effect of signal-to-noise on the resolving capability of the algorithm is presented. This technique is recommended for resolution of the distributions of quencher concentration in heterogeneous samples, of which oxygen distributions in tissue are an important example. Phosphors of practical importance for biological oxygen measurements, Pd-meso-tetra(4-carboxyphenyl)porphyrin (PdTCPP) and Pd-meso-porphyrin (PdMP), have been used to provide an experimental test of the algorithm. PMID:7858142
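Why a fit to a fixed set of exponentials needs no iteration can be seen in a small sketch: once the lifetimes are fixed, the squared residual is quadratic in the amplitudes, so its minimum follows from a linear system. The example below is an unconstrained two-lifetime least-squares fit, not the authors' quadratic programming method (which the abstract does not detail and which would presumably also constrain the amplitudes); the lifetimes, time grid, and function name are assumptions for illustration.

```python
import math

def fit_two_lifetimes(times, signal, tau1, tau2):
    """Minimize sum_i (signal_i - a1*exp(-t_i/tau1) - a2*exp(-t_i/tau2))^2
    over (a1, a2) by solving the 2x2 normal equations in closed form."""
    e1 = [math.exp(-t / tau1) for t in times]
    e2 = [math.exp(-t / tau2) for t in times]
    s11 = sum(x * x for x in e1)
    s12 = sum(x * y for x, y in zip(e1, e2))
    s22 = sum(y * y for y in e2)
    b1 = sum(x * s for x, s in zip(e1, signal))
    b2 = sum(y * s for y, s in zip(e2, signal))
    det = s11 * s22 - s12 * s12  # nonzero when tau1 != tau2
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (s11 * b2 - s12 * b1) / det
    return a1, a2

# Synthetic noise-free decay with made-up amplitudes 2.0 and 1.0
# and made-up lifetimes of 50 and 300 (arbitrary time units).
times = [10.0 * i for i in range(100)]
signal = [2.0 * math.exp(-t / 50.0) + 1.0 * math.exp(-t / 300.0) for t in times]
print(fit_two_lifetimes(times, signal, 50.0, 300.0))
```

With a dense grid of candidate lifetimes instead of two, the same linear structure holds, and the recovered amplitudes approximate the lifetime distribution.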
Multi-objective optimization with estimation of distribution algorithm in a noisy environment.
Shim, Vui Ann; Tan, Kay Chen; Chia, Jun Yong; Al Mamun, Abdullah
2013-01-01
Many real-world optimization problems are subject to uncertainties that may be characterized by the presence of noise in the objective functions. The estimation of distribution algorithm (EDA), which models the global distribution of the population for searching tasks, is one of the evolutionary computation techniques that deal with noisy information. This paper studies the potential of EDAs, particularly an EDA based on restricted Boltzmann machines that handles multi-objective optimization problems in a noisy environment. Noise is introduced to the objective functions in the form of a Gaussian distribution. In order to reduce the detrimental effect of noise, a likelihood correction feature is proposed to tune the marginal probability distribution of each decision variable. The EDA is subsequently hybridized with a particle swarm optimization algorithm in a discrete domain to improve its search ability. The effectiveness of the proposed algorithm is examined via eight benchmark instances with different characteristics and shapes of the Pareto optimal front. The scalability, hybridization, and computational time are rigorously studied. Comparative studies show that the proposed approach outperforms other state-of-the-art algorithms.
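The basic EDA loop (sample a population from a probability model, select the better individuals under a noisy objective, re-estimate the model) can be sketched as follows. This is deliberately a much simpler stand-in than the method in the abstract: it uses independent per-bit marginals instead of a restricted Boltzmann machine, a single objective (counting ones) instead of multiple objectives, and no likelihood correction or particle swarm hybridization. All names and parameter values are assumptions.

```python
import random

def noisy_onemax(bits, rng, sigma):
    """Number of ones plus Gaussian noise, mimicking a noisy objective."""
    return sum(bits) + rng.gauss(0.0, sigma)

def umda(n_bits=20, pop_size=200, n_select=50, generations=40,
         sigma=1.0, seed=0):
    """Univariate marginal EDA: returns the learned per-bit probabilities."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits
    for _ in range(generations):
        # Sample a population from the current marginal model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        # Rank by one noisy evaluation per individual, keep the best.
        pop.sort(key=lambda ind: noisy_onemax(ind, rng, sigma), reverse=True)
        elite = pop[:n_select]
        # Re-estimate marginals, clamped so no bit value dies out entirely.
        probs = [min(0.95, max(0.05, sum(ind[i] for ind in elite) / n_select))
                 for i in range(n_bits)]
    return probs

probs = umda()
print([round(p, 2) for p in probs])
```

The clamping of the marginals is one simple guard against noise driving a variable to a wrong fixed value; the likelihood correction described in the abstract addresses the same concern in a more principled way.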
NASA Astrophysics Data System (ADS)
Thanos, Konstantinos-Georgios; Thomopoulos, Stelios C. A.
2014-06-01
The study in this paper is part of broader research on discovering facial sub-clusters in face databases of different ethnicities. These new sub-clusters, along with other metadata (such as race, sex, etc.), lead to a vector for each face in the database, where each vector component represents the likelihood that a given face belongs to each cluster. This vector is then used as a feature vector in a human identification and tracking system based on face and other biometrics. The first stage in this system involves a clustering method which evaluates and compares the clustering results of five different clustering algorithms (average, complete, single hierarchical algorithm, k-means and DIGNET), and selects the best strategy for each data collection. In this paper we present the comparative performance of clustering results of DIGNET and four clustering algorithms (average, complete, single hierarchical and k-means) on fabricated 2D and 3D samples, and on actual face images from various databases, using four different standard metrics. These metrics are the silhouette figure, the mean silhouette coefficient, the Hubert test Γ coefficient, and the classification accuracy for each clustering result. The results showed that, in general, DIGNET gives more trustworthy results than the other algorithms when the metric values are above a specific acceptance threshold. However, when the evaluation metrics have values lower than the acceptance threshold but not too low (too low corresponds to ambiguous or false results), the clustering results need to be verified by the other algorithms.
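Of the four metrics named above, the mean silhouette coefficient is easy to illustrate. The sketch below uses the standard definition (for each point, a is the mean distance to its own cluster, b the smallest mean distance to another cluster, and the score is (b - a) / max(a, b)); the Euclidean distance, the toy points, and the function name are assumptions, and the paper's acceptance threshold is not reproduced.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette coefficient; requires at least two distinct labels."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        same = [math.dist(p, q)
                for j, (q, l) in enumerate(zip(points, labels))
                if l == lab and j != i]
        if not same:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(same) / len(same)
        b = min(
            sum(math.dist(p, q) for q, l in zip(points, labels) if l == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)

# Two well-separated pairs of made-up 2D points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(mean_silhouette(points, [0, 0, 1, 1]))  # labels match the geometry
print(mean_silhouette(points, [0, 1, 0, 1]))  # labels cut across it
```

Values near 1 indicate compact, well-separated clusters, and negative values indicate points that sit closer to another cluster than to their own, which is the kind of signal a threshold on this metric would act on.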
Clelland, C C; Giles, K D; Farley, T D; Gee, Q; Wright, J R
1986-01-01
The 64Cu-radiolabeled mixed valence cluster ion of copper and penicillamine ([Cu(I)8Cu(II)6Pen12Cl]5-) is easily prepared and purified in a 30 minute procedure. Mice were dosed intravenously with this substance at 0.1 mg/kg, and the distribution of the isotope among organs and tissues was quantified by gamma scintillometry of the isotope's positron annihilation radiation. The largest concentration was found in urine (a 96-fold increase over whole body specific activity at 20 minutes), and this finding is consistent with the compound's inulin-like renal clearance. Isotope levels were also elevated in kidney and liver tissues (2.36- and 3.62-fold, respectively), while all other organs and tissues examined were found to be depleted of the label. A pre-injection of unlabeled cluster at 60 mg/kg effectively blocked radiolabel interaction with liver tissue. This intensely red-violet compound did not stain viable liver cells, and it was not significantly decomposed by liver homogenates over a three hour period. These results suggest cluster interaction at saturable binding sites on liver cell surfaces. The unlabeled cluster shows no indication of toxicity, and the labeled version might have biomedical applications.
The mass distribution of the unusual merging cluster Abell 2146 from strong lensing
NASA Astrophysics Data System (ADS)
Coleman, Joseph E.; King, Lindsay J.; Oguri, Masamune; Russell, Helen R.; Canning, Rebecca E. A.; Leonard, Adrienne; Santana, Rebecca; White, Jacob A.; Baum, Stefi A.; Clowe, Douglas I.; Edge, Alastair; Fabian, Andrew C.; McNamara, Brian R.; O'Dea, Christopher P.
2017-01-01
Abell 2146 consists of two galaxy clusters that have recently collided close to the plane of the sky, and it is unique in showing two large shocks on Chandra X-ray Observatory images. With an early stage merger, shortly after first core passage, one would expect the cluster galaxies and the dark matter to be leading the X-ray emitting plasma. In this regard, the cluster Abell 2146-A is very unusual in that the X-ray cool core appears to lead, rather than lag, the brightest cluster galaxy (BCG) in their trajectories. Here we present a strong-lensing analysis of multiple-image systems identified on Hubble Space Telescope images. In particular, we focus on the distribution of mass in Abell 2146-A in order to determine the centroid of the dark matter halo. We use object colours and morphologies to identify multiple-image systems; very conservatively, four of these systems are used as constraints on a lens mass model. We find that the centroid of the dark matter halo, constrained using the strongly lensed features, is coincident with the BCG, with an offset of ≈2 kpc between the centres of the dark matter halo and the BCG. Thus from the strong-lensing model, the X-ray cool core also leads the centroid of the dark matter in Abell 2146-A, with an offset of ≈30 kpc.
Abedini, Mohammad; Moradi, Mohammad H; Hosseinian, S M
2016-03-01
This paper proposes a novel method to address reliability and technical problems of microgrids (MGs) by designing a number of self-adequate autonomous sub-MGs through MG clustering. To this end, a multi-objective optimization problem is formulated in which power loss reduction, voltage profile improvement and reliability enhancement are the objective functions. To solve the optimization problem, a hybrid algorithm named HS-GA, based on genetic and harmony search algorithms, is provided, and a load flow method is given to model different types of DGs as droop controllers. The performance of the proposed method is evaluated in two case studies. The results provide support for the performance of the proposed method.
Pillió, Zoltán; Tajti, Attila; Szalay, Péter G
2012-09-11
A new algorithm is presented for the calculation of the ladder-type term of the coupled cluster singles and doubles (CCSD) equations using two-electron integrals in atomic orbital (AO) basis. The method is based on an orbital grouping scheme, which results in an optimal partitioning of the AO integral matrix into sparse and dense blocks allowing efficient matrix multiplication. Carefully chosen numerical tests have been performed to analyze the performance of all aspects of the new algorithm. It is shown that the suggested scheme allows an efficient utilization of modern highly parallel architectures and devices in CCSD calculations. Details of the implementation in the development version of CFOUR quantum chemical program package are also presented.
Burigo, Lucas; Pshenichnov, Igor; Mishustin, Igor; Hilgers, Gerhard; Bleicher, Marcus
2016-05-21
The Geant4-based Monte Carlo model for Heavy-Ion Therapy (MCHIT) was extended to study the patterns of energy deposition at sub-micrometer distance from individual ion tracks. Dose distributions for low-energy (1)H, (4)He, (12)C and (16)O ions measured in several experiments are well described by the model in a broad range of radial distances, from 0.5 to 3000 nm. Despite the fact that such distributions are characterized by long tails, a dominant fraction of deposited energy (∼80%) is confined within a radius of about 10 nm. The probability distributions of clustered ionization events in nanoscale volumes of water traversed by (1)H, (2)H, (4)He, (6)Li, (7)Li, and (12)C ions are also calculated. A good agreement of calculated ionization cluster-size distributions with the corresponding experimental data suggests that the extended MCHIT can be used to characterize stochastic processes of energy deposition to sensitive cellular structures.
Zubov, A V; Zubov, K V; Zubov, V A
2007-01-01
The distribution of water clusters in fresh rain water and in rain water that was aged for 30 days (North Germany, 53°33' N, 12°47' E, 293 K, rain on 25.06.06) as well as in fresh vegetables and fruits was studied by flicker noise spectroscopy. In addition, the development of water clusters in apples and potatoes during ripening in 2006 was investigated. A different distribution of water clusters in irrigation water (river and rain) and in the biomatrix of vegetables (potatoes, onions, tomatoes, red beets) and fruits (apples, bananas) was observed. It was concluded that the cluster structure of irrigation water differs from that of water of the biomatrix of vegetables and fruits and depends on drought and the biomatrix nature. Water clusters in plants are more stable and reproducible than water clusters in natural water. The main characteristics of cluster formation in materials studied were given. The oscillation frequencies of water clusters in plants (biofield) are given at which they interact with water clusters of the Earth hydrosphere. A model of series of clusters 16(H2O)100 <--> 4(H2O)402 <--> 2(H2O)903 <--> (H2O)1889 in the biomatrix of vegetables and fruits was discussed.
Two generalizations of Kohonen clustering
NASA Technical Reports Server (NTRS)
Bezdek, James C.; Pal, Nikhil R.; Tsao, Eric C. K.
1993-01-01
The relationship between the sequential hard c-means (SHCM), learning vector quantization (LVQ), and fuzzy c-means (FCM) clustering algorithms is discussed. LVQ and SHCM suffer from several major problems. For example, they depend heavily on initialization. If the initial values of the cluster centers are outside the convex hull of the input data, such algorithms, even if they terminate, may not produce meaningful results in terms of prototypes for cluster representation. This is due in part to the fact that they update only the winning prototype for every input vector. The impact and interaction of these two families with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but which often lends ideas to clustering algorithms, is discussed. Then two generalizations of LVQ that are explicitly designed as clustering algorithms are presented; these algorithms are referred to as generalized LVQ (GLVQ) and fuzzy LVQ (FLVQ). Learning rules are derived to optimize an objective function whose goal is to produce 'good clusters'. GLVQ/FLVQ (may) update every node in the clustering net for each input vector. Neither GLVQ nor FLVQ depends upon a choice for the update neighborhood or learning rate distribution; these are taken care of automatically. Segmentation of a gray tone image is used as a typical application of these algorithms to illustrate the performance of GLVQ/FLVQ.
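The initialization problem described above follows directly from the winner-only update and can be reproduced in a few lines. The sketch below implements a generic winner-take-all pass of the kind LVQ and SHCM share; it is not GLVQ or FLVQ, whose learning rules the abstract does not state, and the learning rate, data, and function name are assumptions.

```python
def winner_take_all_pass(data, prototypes, rate=0.1):
    """One sweep over the data: only the nearest prototype moves toward
    each input vector; every other prototype is left untouched."""
    protos = [list(v) for v in prototypes]
    for x in data:
        winner = min(
            range(len(protos)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(x, protos[k])),
        )
        protos[winner] = [v + rate * (a - v) for a, v in zip(x, protos[winner])]
    return protos

# Unit-square corners as data; the second prototype starts far outside
# the convex hull of the data (made-up values).
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
start = [(0.5, 0.5), (100.0, 100.0)]
print(winner_take_all_pass(data, start))
```

The distant prototype never wins, so it never moves and ends up representing no data, whereas a rule that updates every prototype for each input, as GLVQ/FLVQ may do, would pull it toward the data.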