Genetically derived fuzzy c-means clustering algorithm for segmentation
Nezamoddin N. Kachouie; Javad Alirezaie; Kaamran Raahemifar
2003-01-01
The proper classification of pixels is an important step in satellite imagery for partitioning different land cover regions. This paper describes a clustering method that utilizes hard and fuzzy clustering algorithms. The performance of the algorithm is optimized using a genetic algorithm, which searches for the best cluster centers to initialize the fuzzy partition matrix in place of random
Fang Liu
2011-01-01
Image segmentation remains one of the major challenges in image analysis and computer vision. Fuzzy clustering, as a soft segmentation method, has been widely studied and successfully applied in image clustering and segmentation. The fuzzy c-means (FCM) algorithm is the most popular method used in image segmentation. However, most clustering algorithms such as the k-means and the FCM clustering algorithms
Chu XiaoLi; Zhu Ying; Shi JunTao; Song JiQing
2010-01-01
By analyzing the advantages and disadvantages of the Fuzzy C-Means Clustering Algorithm, a method of image segmentation based on the Fuzzy C-Means Clustering Algorithm and the Artificial Fish Swarm Algorithm is proposed. The image is segmented according to the membership values of its pixels. The Artificial Fish Swarm Algorithm is introduced into the Fuzzy C-Means Clustering Algorithm, and through the behavior of prey, follow,
MRI Brain Image Segmentation Using Modified Fuzzy C-Means Clustering Algorithm
M. Shasidhar; V. Sudheer Raja; B. Vijay Kumar
2011-01-01
The clustering approach is widely used in biomedical applications, particularly for brain tumor detection in abnormal magnetic resonance (MRI) images. Fuzzy clustering using the fuzzy C-means (FCM) algorithm has proved superior to other clustering approaches in terms of segmentation efficiency. However, the major drawback of the FCM algorithm is the long computation time required for convergence. The effectiveness of the
An Image Segmentation Algorithm Based on Fuzzy C-Means Clustering
Xin-bo Zhang; Li Jiang
2009-01-01
The image segmentation algorithm based on fuzzy c-means clustering is an important and widely used algorithm in the image segmentation field. However, it does not successfully segment noisy images because it disregards spatial constraint information and considers only the gray information. Therefore, we proposed a weighted FCM algorithm based on a Gaussian kernel function for image
A New Algorithm for Image Segmentation Based on Fast Fuzzy C-Means Clustering
Zhi-bing Wang; Rui-hua Lu
2008-01-01
The fuzzy c-means algorithm with spatial constraints (FCM_S) is more effective for image segmentation. However, it still lacks robustness to noise and outliers, and it is computationally expensive. To overcome these problems, a new algorithm for image segmentation based on fast fuzzy c-means clustering is proposed in this paper. In order to reduce the number of iterations, the
NASA Astrophysics Data System (ADS)
Liu, Fang
2011-06-01
Image segmentation remains one of the major challenges in image analysis and computer vision. Fuzzy clustering, as a soft segmentation method, has been widely studied and successfully applied in image clustering and segmentation. The fuzzy c-means (FCM) algorithm is the most popular method used in image segmentation. However, most clustering algorithms, such as the k-means and FCM algorithms, search for the final cluster values starting from predetermined initial centers. The FCM algorithm also does not consider the spatial information of pixels and is sensitive to noise. This paper presents a new FCM algorithm with adaptive evolutionary programming for image clustering. The algorithm has three features. First, it does not need predetermined initial centers: evolutionary programming helps FCM search for better centers and escape bad centers at local minima. Second, both the spatial distance and the Euclidean distance are considered in the FCM clustering, so the algorithm is more robust to noise. Third, the evolutionary programming is adaptive: the mutation rule changes as useful knowledge is learned during the evolving process. Experimental results show that the new image segmentation algorithm is effective and robust to noisy images.
An improved fuzzy c-means algorithm for unbalanced sized clusters
NASA Astrophysics Data System (ADS)
Gu, Shuguo; Liu, Jingjing; Xie, Qingguo; Wang, Luyao
2012-02-01
In this paper, we propose an improved fuzzy c-means (FCM) algorithm based on cluster height information to deal with the sensitivity of FCM to clusters of unbalanced size. Cluster size sensitivity is a major drawback of FCM, which tends to balance the cluster sizes during iteration, so the center of a smaller cluster may be drawn toward the adjacent larger one, leading to poor classification. To overcome this problem, the cluster height information is introduced into the distance function to adjust the conventional Euclidean distance and thereby control the effect of cluster size differences on classification. Experimental results demonstrate that our algorithm can obtain good clustering results in spite of great size differences, while traditional FCM does not work well in such cases. The improved FCM has shown its potential for extracting small clusters, especially in medical image segmentation.
Image Segmentation Algorithm Based on Improved Weighted Fuzzy C-means Clustering
Jie Xin; Xiuyan Sha
2008-01-01
This paper introduces an image segmentation algorithm based on fuzzy c-means clustering weighted by the neighborhood gray difference (WFCM) and experiments with samples on the two-dimensional histogram formed by the original image and its median-filtered image. Experimental results demonstrate that this scheme can not only effectively segment low-contrast objects, but also reduce the noise from the background.
New two-dimensional fuzzy C-means clustering algorithm for image segmentation
Xian-cheng Zhou; Qun-tai Shen; Li-mei Liu
2008-01-01
To solve the problem of poor anti-noise performance of the traditional fuzzy C-means (FCM) algorithm in image segmentation, a novel two-dimensional FCM clustering algorithm for image segmentation was proposed. In this method, the image segmentation was converted into an optimization problem. The fitness function containing neighbor information was set up based on the gray information and the neighbor relations between
An Improved Fuzzy c-Means Clustering Algorithm Based on Shadowed Sets and PSO
Zhang, Jian; Shen, Ling
2014-01-01
To organize the wide variety of data sets automatically and acquire accurate classification, this paper presents a modified fuzzy c-means algorithm (SP-FCM) based on particle swarm optimization (PSO) and shadowed sets to perform feature clustering. SP-FCM introduces the global search property of PSO to deal with the problem of premature convergence of conventional fuzzy clustering, utilizes vagueness balance property of shadowed sets to handle overlapping among clusters, and models uncertainty in class boundaries. This new method uses Xie-Beni index as cluster validity and automatically finds the optimal cluster number within a specific range with cluster partitions that provide compact and well-separated clusters. Experiments show that the proposed approach significantly improves the clustering effect. PMID:25477953
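The Xie-Beni index mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard index (membership-weighted compactness divided by the sample count times the smallest squared distance between centers), not the SP-FCM implementation; the function name `xie_beni` and the list-of-lists data layout are assumptions made for the example. Lower values indicate more compact, better separated partitions.

```python
def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index for a fuzzy partition (lower is better).

    points:      list of n feature vectors
    centers:     list of c cluster centers (c >= 2)
    memberships: memberships[i][k] is the membership of point k in cluster i
    m:           fuzzifier exponent (hypothetical default of 2)
    """
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # Compactness: membership-weighted squared distances to the centers.
    compactness = sum(
        (memberships[i][k] ** m) * sqdist(points[k], centers[i])
        for i in range(len(centers))
        for k in range(len(points))
    )
    # Separation: smallest squared distance between two distinct centers.
    separation = min(
        sqdist(centers[i], centers[j])
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return compactness / (len(points) * separation)
```

To pick a cluster number within a range, one would run the clustering for each candidate count and keep the partition with the smallest index value.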
Robust weighted fuzzy c-means clustering
A. H. Hadjahmadi; M. M. Homayounpour; S. M. Ahadi
2008-01-01
The fuzzy c-means method (FCM) has become one of the most popular clustering methods based on minimization of a criterion function. However, the performance of this clustering algorithm may be significantly degraded in the presence of noise. This paper presents a robust clustering algorithm called robust weighted fuzzy c-means (RWFCM). We used a new objective function that uses some kinds
NASA Astrophysics Data System (ADS)
Wang, Deguang; Han, Baochang; Huang, Ming
Computer forensics is the application of computer technology to access, investigate, and analyze the evidence of computer crime. It mainly includes the processes of determining and obtaining digital evidence, analyzing and extracting data, and filing and submitting results. Data analysis is the key link in computer forensics. Because real data are complex and fuzzy, evidence analysis has had difficulty obtaining the desired results. This paper applies a fuzzy c-means clustering algorithm based on particle swarm optimization (FCMP) to computer forensics, and it can obtain more satisfactory results.
A Genetic Algorithm That Exchanges Neighboring Centers for Fuzzy c-Means Clustering
ERIC Educational Resources Information Center
Chahine, Firas Safwan
2012-01-01
Clustering algorithms are widely used in pattern recognition and data mining applications. Due to their computational efficiency, partitional clustering algorithms are better suited for applications with large datasets than hierarchical clustering algorithms. K-means is among the most popular partitional clustering algorithms, but has a major…
Fuzzy Clustering Hard-c-Means, Fuzzy-c-Means, Gustafson-Kessel
Rostock, Universität
Olaf Wolkenhauer, Control Systems … Contents (excerpt): 2 Clustering; 2.1 Equivalence Relations; 3 Hard-c-Means Clustering; 3.1 Objective Function, Partition Space; 3.2 Hard-c-Means Algorithm
Weiling Cai; Songcan Chen; Daoqiang Zhang
2007-01-01
Fuzzy c-means (FCM) algorithms with spatial constraints (FCM_S) have been proven effective for image segmentation. However, they still have the following disadvantages: (1) although the introduction of local spatial information to the corresponding objective functions enhances their insensitivity to noise to some extent, they still lack robustness to noise and outliers, especially in the absence of prior knowledge of the
NASA Astrophysics Data System (ADS)
Abdul-Nasir, Aimi Salihah; Mashor, Mohd Yusoff; Halim, Nurul Hazwani Abd; Mohamed, Zeehaida
2015-05-01
Malaria is a life-threatening parasitic infectious disease that accounts for nearly one million deaths each year. Because malaria requires prompt and accurate diagnosis, the current study proposes an unsupervised pixel segmentation based on a clustering algorithm in order to obtain fully segmented red blood cells (RBCs) infected with malaria parasites from thin blood smear images of the P. vivax species. To obtain the segmented infected cell, the malaria images are first enhanced using a modified global contrast stretching technique. Then, an unsupervised segmentation technique based on a clustering algorithm is applied to the intensity component of the malaria image in order to segment the infected cell from its blood cell background. In this study, cascaded moving k-means (MKM) and fuzzy c-means (FCM) clustering algorithms are proposed for malaria slide image segmentation. After that, a median filter is applied to smooth the image and to remove unwanted regions such as small background pixels. Finally, a seeded region growing area extraction algorithm is applied to remove large unwanted regions that still appear in the image because they are too large to be removed by the median filter. The effectiveness of the proposed cascaded MKM and FCM clustering algorithms has been analyzed qualitatively and quantitatively by comparing the cascaded algorithm with the individual MKM and FCM clustering algorithms. Overall, the results indicate that segmentation using the proposed cascaded clustering algorithm produced the best segmentation performance, achieving acceptable sensitivity as well as high specificity and accuracy compared to the segmentation results provided by the MKM and FCM algorithms.
Ji, Ze-Xuan; Sun, Quan-Sen; Xia, De-Shen
2011-07-01
A modified possibilistic fuzzy c-means clustering algorithm is presented for fuzzy segmentation of magnetic resonance (MR) images that have been corrupted by intensity inhomogeneities and noise. By introducing a novel adaptive method to compute the weights of local spatial in the objective function, the new adaptive fuzzy clustering algorithm is capable of utilizing local contextual information to impose local spatial continuity, thus allowing the suppression of noise and helping to resolve classification ambiguity. To estimate the intensity inhomogeneity, the global intensity is introduced into the coherent local intensity clustering algorithm and takes the local and global intensity information into account. The segmentation target therefore is driven by two forces to smooth the derived optimal bias field and improve the accuracy of the segmentation task. The proposed method has been successfully applied to 3 T, 7 T, synthetic and real MR images with desirable results. Comparisons with other approaches demonstrate the superior performance of the proposed algorithm. Moreover, the proposed algorithm is robust to initialization, thereby allowing fully automatic applications. PMID:21256710
Fuzzy c-means clustering with prior biological knowledge
Luis Tari; Chitta Baral; Seungchan Kim
2009-01-01
We propose a novel semi-supervised clustering method called GO Fuzzy c-means, which enables the simultaneous use of biological knowledge and gene expression data in a probabilistic clustering algorithm. Our method is based on the fuzzy c-means clustering algorithm and utilizes the Gene Ontology annotations as prior knowledge to guide the process of grouping functionally related genes. Unlike traditional clustering methods,
Image segmentation using probabilistic fuzzy c-means clustering
Dinh-tuan Pham
2001-01-01
A new approach for gray-level image segmentation is presented using a probabilistic fuzzy c-means clustering algorithm. This approach combines the spatial probabilistic information and the fuzzy membership function in the clustering process. The proposed probabilistic fuzzy c-means method can deal effectively with image segmentation in a noisy environment
Chahir, Youssef
Abstract-- In this paper, we present the application of the fuzzy c-means clustering algorithm or non skin ones[1]. An important step in the image classification process is color segmentation and interpretation. After segmenting out skin from images, this can be useful for identifying faces, hand sign
Xianyi Cheng; Xiangpu Gong
2008-01-01
This paper proposes a method of dynamic fuzzy clustering analysis based on an improved ant colony algorithm. This method makes use of the strong ability of the ant colony algorithm to handle local convergence, which overcomes the sensitivity of the fuzzy clustering method (FCM) to initialization and dynamically determines the number of clusters as well as the cluster centers. This paper improves
Image segmentation with cyclic load balanced parallel Fuzzy C - Means cluster analysis
Mogana Vadiveloo; Rosni Abdullah; Mandava Rajeswari; Ahmad Adel Abu-Shareha
2011-01-01
This paper proposes a cyclic load balancing strategy for the parallel Fuzzy C-Means cluster analysis algorithm. The problem is to minimize the total time cost and maximize the parallel processing efficiency when a subset of clusters is distributed over a set of processor cores on a shared-memory architecture. The parallel Fuzzy C-Means (FCM) cluster analysis algorithm is composed by
On cluster validity for the fuzzy c-means model
N. R. Pal; J. C. Bezdek
1995-01-01
Many functionals have been proposed for validation of partitions of object data produced by the fuzzy c-means (FCM) clustering algorithm. We examine the role that a subtle but important parameter, the weighting exponent m of the FCM model, plays in determining the validity of FCM partitions. The functionals considered are the partition coefficient and entropy indexes of Bezdek, the Xie-Beni (1991), and extended
Efficient inhomogeneity compensation using fuzzy c-means clustering models.
Szilágyi, László; Szilágyi, Sándor M; Benyó, Balázs
2012-10-01
Intensity inhomogeneity or intensity non-uniformity (INU) is an undesired phenomenon that represents the main obstacle for magnetic resonance (MR) image segmentation and registration methods. Various techniques have been proposed to eliminate or compensate the INU, most of which are embedded into classification or clustering algorithms; they generally have difficulties when INU reaches high amplitudes and usually suffer from high computational load. This study reformulates the design of c-means clustering based INU compensation techniques by identifying and separating those globally working, computationally costly operations that can be applied to gray intensity levels instead of individual pixels. The theoretical assumptions are demonstrated using the fuzzy c-means algorithm, but the proposed modification is compatible with a wide range of c-means clustering based INU compensation and MR image segmentation algorithms. Experiments carried out using synthetic phantoms and real MR images indicate that the proposed approach produces practically the same segmentation accuracy as the conventional formulation, but 20-30 times faster. PMID:22405524
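The idea of working on gray intensity levels instead of individual pixels can be illustrated with a plain fuzzy c-means loop that runs over histogram bins. This is only a sketch of that one idea under stated assumptions: it does not include the INU compensation described in the abstract, and the function name `fcm_on_histogram`, the evenly spaced initial centers, and the fixed iteration count are hypothetical choices for the example.

```python
def fcm_on_histogram(hist, c=2, m=2.0, iters=50):
    """Fuzzy c-means over gray levels, weighted by histogram counts.

    hist[g] is the number of pixels with gray level g, so each iteration
    touches at most len(hist) levels instead of every pixel.
    Returns the cluster centers in ascending order.
    """
    levels = [g for g, count in enumerate(hist) if count > 0]
    lo, hi = levels[0], levels[-1]
    # Assumed initialisation: centers spread evenly over the occupied range.
    centers = [lo + (hi - lo) * (i + 0.5) / c for i in range(c)]
    power = 2.0 / (m - 1.0)
    for _ in range(iters):
        num = [0.0] * c
        den = [0.0] * c
        for g in levels:
            d = [abs(g - v) for v in centers]
            nearest = min(range(c), key=lambda i: d[i])
            if d[nearest] < 1e-12:
                # The level sits on a center: give it full membership there.
                u = [1.0 if i == nearest else 0.0 for i in range(c)]
            else:
                inv = [x ** -power for x in d]
                total = sum(inv)
                u = [x / total for x in inv]
            for i in range(c):
                weight = hist[g] * u[i] ** m
                num[i] += weight * g
                den[i] += weight
        # Center update: weighted mean of the gray levels.
        centers = [num[i] / den[i] if den[i] > 0 else centers[i] for i in range(c)]
    return sorted(centers)
```

For an 8-bit image the inner loop visits at most 256 levels per iteration regardless of image size, which is the source of the saving; the speed-up figures quoted in the abstract refer to the authors' full method, not to this sketch.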
Pang, Yachun; Li, Li; Hu, Wenyong; Peng, Yanxia; Liu, Lizhi; Shao, Yuanzhi
2012-01-01
This paper presents a novel two-step approach that incorporates fuzzy c-means (FCMs) clustering and the gradient vector flow (GVF) snake algorithm for lesion contour segmentation on breast magnetic resonance imaging (BMRI). Manual delineation of the lesions by expert MR radiologists was taken as a reference standard in evaluating the computerized segmentation approach. The proposed algorithm was also compared with the FCMs clustering based method. With a database of 60 mass-like lesions (22 benign and 38 malignant cases), the proposed method demonstrated sufficiently good segmentation performance. The morphological and texture features were extracted and used to classify the benign and malignant lesions based on the proposed computerized segmentation contour and radiologists' delineation, respectively. Features extracted by the computerized characterization method were employed to differentiate the lesions with an area under the receiver-operating characteristic curve (AUC) of 0.968, in comparison with an AUC of 0.914 based on the features extracted from radiologists' delineation. The proposed method in the current study can assist radiologists in delineating and characterizing BMRI lesions, for example by quantifying morphological and texture features, and can improve the objectivity and efficiency of BMRI interpretation with a certain clinical value. PMID:22952558
Hybrid image segmentation using fuzzy c-means and gravitational search algorithm
NASA Astrophysics Data System (ADS)
Majd, Emadaldin Mozafari; As'ari, M. A.; Sheikh, U. U.; Abu-Bakar, S. A. R.
2012-04-01
In this paper, we propose a new hybrid approach for image segmentation. The proposed approach exploits spatial fuzzy c-means for clustering image pixels into homogeneous regions. In order to improve the performance of fuzzy c-means on segmentation problems, we employ the gravitational search algorithm, which is inspired by Newton's law of gravity. The gravitational search algorithm is incorporated into fuzzy c-means to take advantage of its ability to find optimum cluster centers that minimize the fitness function of fuzzy c-means. Experimental results show the effectiveness of the proposed method in segmenting different types of images as compared to classical fuzzy c-means.
A fast fuzzy c-means algorithm for color image segmentation
Paris-Sud XI, Université de
A fast fuzzy c-means algorithm for color image segmentation Hoel Le Capitaine and Carl Frélicot on the clustering approach, especially the fuzzy c-means algorithm (FCM, [3]), which is used by many segmentation segmentation. Furthermore, we introduce a novel fuzzy iterative algorithm allowing fast segmentation of images
A new modified fuzzy c-means algorithm for multispectral satellite images segmentation
Punya Thitimajshima
2000-01-01
The purpose of cluster analysis is to partition a data set into a number of disjoint groups or clusters. The members within a cluster are more similar to each other than members from different clusters. The fuzzy c-means (FCM) clustering is an iterative partitioning method that produces optimal c-partitions. Since the standard FCM algorithm takes a long time to partition
Synthetic aperture radar (SAR) image segmentation using a new modified fuzzy c-means algorithm
Warin Chumsamrong; Punya Thitimajshima; Yuttapong Rangsanseri
2000-01-01
The fuzzy c-means algorithm has generally proved well suited to remote sensing image segmentation, but it exhibits sensitivity to the initial guess with regard to both speed and stability, and it is also sensitive to noise. This paper proposes a fully automatic technique to obtain image clusters. A modified fuzzy c-means classification algorithm is used to provide a fuzzy partition.
A Modified Fuzzy C-means Algorithm in Remote Sensing Image Segmentation
Du Gen-yuan; Miao Fang; Tian Sheng-li; Liu Ye
2009-01-01
The Fuzzy C-Means (FCM) algorithm has good clustering efficiency, for which it is widely used in the field of image segmentation. However, problems still exist, such as the weak robustness of the distance measure, the need to give the number of clusters in advance, and the neglect of local image features. In essence, FCM is a local search algorithm. Improper selection of initial value
NASA Astrophysics Data System (ADS)
Schröter, Ingmar; Paasche, Hendik; Dietrich, Peter; Wollschläger, Ute
2014-05-01
Soil moisture is a key variable of the hydrological cycle. For example, it controls the partitioning of rainfall into a runoff and an infiltration component and modulates physical, chemical and biological processes within the soil. For a better understanding of these processes, knowledge about the spatio-temporal distribution of soil moisture is indispensable. For the field to the small catchment scale with survey areas up to a few square kilometres, there are numerous new and innovative ground-based and remote sensing technologies available which have great potential to provide temporal information about soil moisture patterns. The aim of this work is to design an optimal soil moisture monitoring program for a low-mountain catchment in central Germany. In a first step, the fuzzy c-means clustering technique (Paasche et al., 2006) was used to identify structure-relevant patterns in a set of different terrain attributes derived from a DEM. Based on these patterns, optimal measurement locations were identified to conduct in-situ soil moisture measurements. To consider different wetting and drying states in the catchment, several TDR measurement campaigns were conducted from April to October 2013. The TDR measurements have been integrated with the structure-relevant patterns obtained by the fuzzy cluster analysis to regionally predict soil moisture. In this study, we outline the conceptual framework of this integrative approach and present first results from field measurements. The results of the project are expected to improve the monitoring and understanding of small catchment-scale hydrological processes and to contribute to a better representation of soil moisture dynamics in physically-based hydrological models operating at the field to the small catchment scale.
Reference: Paasche, H., J. Tronicke, K. Holliger, A.G. Green, and H. Maurer (2006): Integration of diverse physical-property models: Subsurface zonation and petrophysical parameter estimation based on fuzzy c-means cluster analyses. Geophysics 71(3), H33-H44, doi:10.1190/1.2192927.
FUZZY IMAGE SEGMENTATION USING SUPPRESSED FUZZY C-MEANS CLUSTERING (SFCM)
M. Ameer Ali; Gour C Karmakar; Laurence S Dooley
Clustering algorithms are highly dependent on the features used and the type of the objects in a particular image. By considering object similar surface variations (SSV) as well as the arbitrariness of the fuzzy c-means (FCM) algorithm for pixel location, a fuzzy image segmentation considering object surface similarity (FSOS) algorithm was developed, but it was unable to segment objects having
Fingerprint Image Segmentation Using Modified Fuzzy C-Means Algorithm
Jiayin Kang; Wenjuan Zhang
2009-01-01
Fingerprint segmentation is a crucial step in a fingerprint recognition system, and it determines the results of fingerprint analysis and recognition. In this paper, an approach for fingerprint segmentation based on modified fuzzy c-means (FCM) is presented. The proposed algorithm is realized by modifying the objective function in Szilagyi's algorithm via the introduction of gray histogram-based weighting. Experimental results show that proposed
Fuzzy c-means clustering with spatial information for image segmentation
Keh-Shih Chuang; Hong-Long Tzeng; Sharon Chen; Jay Wu; Tzong-Jer Chen
2006-01-01
A conventional FCM algorithm does not fully utilize the spatial information in the image. In this paper, we present a fuzzy c-means (FCM) algorithm that incorporates spatial information into the membership function for clustering. The spatial function is the summation of the membership function in the neighborhood of each pixel under consideration. The advantages of the new method are the
Image Segmentation Based on Adaptive Fuzzy-C-Means Clustering
Mohamed Walid Ayech; Karim El Kalti; Béchir el Ayeb
2010-01-01
The clustering method “Fuzzy-C-Means” (FCM) is widely used in image segmentation. However, the major drawback of this method is its sensitivity to noise. In this paper, we propose a variant of this method which aims at resolving this problem. Our approach is based on an adaptive distance which is calculated according to the spatial position of the pixel in
Multivariate image segmentation based on geometrically guided fuzzy C-means clustering
J. C. Noordam; W. H. A. M. van den Broek
2002-01-01
This paper describes a new approach to geometrically guided fuzzy clustering. A modified version of fuzzy C-means (FCM) clustering, conditional FCM, is extended to incorporate a priori geometrical information from the spatial domain in order to improve image segmentation. This leads to a new algorithm (GGC-FCM) where the cluster guidance is determined by the membership values of neighbouring pixels. GGC-FCM
Marine Images Segmentation Using Adaptive Fuzzy c-Means Algorithm Based on Spatial Neighborhood
Shi-long Wang; Yu-ru Xu; Lei Wan; Xu-dong Tang
2011-01-01
In the marine environment, underwater images have low SNR and fuzzy detail due to scattering and absorption by a variety of suspended matter and by the water itself. If traditional methods are used to process underwater images directly, it is unlikely that satisfactory results will be obtained. Through further research on the traditional fuzzy c-means clustering algorithm, an
Integration and generalization of LVQ and c-means clustering (Invited Paper)
NASA Astrophysics Data System (ADS)
Bezdek, James C.
1992-11-01
This paper discusses the relationship between the sequential hard c-means (SHCM), learning vector quantization (LVQ), and fuzzy c-means (FCM) clustering algorithms. LVQ and SHCM suffer from several major problems. For example, they depend heavily on initialization. If the initial values of the cluster centers are outside the convex hull of the input data, such algorithms, even if they terminate, may not produce meaningful results in terms of prototypes for cluster representation. This is due in part to the fact that they update only the winning prototype for every input vector. We also discuss the impact and interaction of these two families with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method, but which often lends itself to clustering algorithms. Then we present two generalizations of LVQ that are explicitly designed as clustering algorithms: generalized LVQ (GLVQ) and fuzzy LVQ (FLVQ). Learning rules are derived to optimize an objective function whose goal is to produce 'good clusters'. GLVQ/FLVQ (may) update every node in the clustering net for each input vector. We use Anderson's IRIS data to compare the performance of GLVQ/FLVQ with a standard version of LVQ. Experiments show that the final centroids produced by GLVQ are independent of node initialization and learning coefficients. Neither GLVQ nor FLVQ depends upon a choice for the update neighborhood or learning rate distribution; these are taken care of automatically.
Xiaowei Yang; Guangquan Zhang; Jie Lu; Jun Ma
2011-01-01
The support vector machine (SVM) has provided higher performance than traditional learning machines and has been widely applied in real-world classification problems and nonlinear function estimation problems. Unfortunately, the training process of the SVM is sensitive to the outliers or noises in the training set. In this paper, a common misunderstanding of Gaussian-function-based kernel fuzzy clustering is
Generalized rough fuzzy c-means algorithm for brain MR image segmentation.
Ji, Zexuan; Sun, Quansen; Xia, Yong; Chen, Qiang; Xia, Deshen; Feng, Dagan
2012-11-01
Fuzzy sets and rough sets have been widely used in many clustering algorithms for medical image segmentation, and have recently been combined to better deal with the uncertainty implied in observed image data. Despite their widespread applications, traditional hybrid approaches are sensitive to the empirical weighting parameters and random initialization, and hence may produce less accurate results. In this paper, a novel hybrid clustering approach, namely the generalized rough fuzzy c-means (GRFCM) algorithm, is proposed for brain MR image segmentation. In this algorithm, each cluster is characterized by three automatically determined rough-fuzzy regions, and accordingly the membership of each pixel is estimated with respect to the region in which it is located. The importance of each region is balanced by a weighting parameter, and the bias field in MR images is modeled by a linear combination of orthogonal polynomials. The weighting parameter estimation and bias field correction have been incorporated into the iterative clustering process. Our algorithm has been compared to the existing rough c-means and hybrid clustering algorithms on both synthetic and clinical brain MR images. Experimental results demonstrate that the proposed algorithm is more robust to the initialization, noise, and bias field, and can produce more accurate and reliable segmentations. PMID:22088865
A New Approach to Lung Image Segmentation using Fuzzy Possibilistic C-Means Algorithm
Gomathi, M
2010-01-01
Image segmentation is a vital part of image processing. Segmentation is widely applied to medical images in order to diagnose curious diseases. The same medical images can be segmented manually, but segmentation algorithms achieve higher accuracy than manual segmentation. In the field of medical diagnosis, a wide variety of imaging techniques is presently available, such as radiography, computed tomography (CT) and magnetic resonance imaging (MRI). Medical image segmentation is an essential step for most subsequent image analysis tasks. Although the original FCM algorithm yields good results for segmenting noise-free images, it fails to segment images corrupted by noise, outliers and other imaging artifacts. This paper presents an image segmentation approach using the Modified Fuzzy C-Means (FCM) algorithm and the Fuzzy Possibilistic C-Means (FPCM) algorithm. This approach is a generalized version of standard Fuzzy C-Means Clustering (FCM) ...
Mandarin Digital Speech Recognition Based on a Chaotic Neural Network and Fuzzy C-means Clustering
Freeman, Walter J.
Mandarin Digital Speech Recognition Based on a Chaotic Neural Network and Fuzzy C-means Clustering model can perform digital speech recognition efficiently and the fuzzy c-means clustering has better performance than the hard k-means clustering. I. INTRODUCTION Digital speech recognition can be widely used
NASA Astrophysics Data System (ADS)
Akinin, M. V.; Akinina, N. V.; Klochkov, A. Y.; Nikiforov, M. B.; Sokolova, A. V.
2015-05-01
The report reviews the fuzzy c-means algorithm as applied to image segmentation, estimates the quality of its results using the Xie-Beni criterion, and presents experimental studies of the algorithm in the context of drawing up detailed two-dimensional maps with the use of unmanned aerial vehicles. The experimental results indicate that the algorithm can be applied to interpreting images obtained by aerial photography. The algorithm can partition the original image into a set of segments (clusters) in a relatively short period of time, which is achieved by modifying the original k-means algorithm to work in a fuzzy setting.
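The Xie-Beni criterion mentioned in this abstract can be illustrated with a short sketch. The version below is the commonly used generalized form (compactness weighted by memberships raised to the fuzzifier m, divided by n times the smallest squared distance between two centers); the function name, argument layout and sample data are assumptions for illustration and are not taken from the report.

```python
def xie_beni(points, centers, memberships, m=2.0):
    """Xie-Beni validity index: compactness / separation (lower is better).

    points:      list of n feature vectors
    centers:     list of c cluster centers (c >= 2)
    memberships: c x n nested list; memberships[i][k] is the degree to
                 which point k belongs to cluster i
    m:           fuzzifier (assumed default of 2)
    """
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    n = len(points)
    c = len(centers)
    # Membership-weighted sum of squared distances to the cluster centers.
    compactness = sum(
        memberships[i][k] ** m * sq_dist(points[k], centers[i])
        for i in range(c)
        for k in range(n)
    )
    # Smallest squared distance between any two distinct centers.
    separation = min(
        sq_dist(centers[i], centers[j])
        for i in range(c)
        for j in range(i + 1, c)
    )
    return compactness / (n * separation)


# Hypothetical toy data: two tight, well-separated 1-D clusters.
toy_points = [[0.0], [1.0], [9.0], [10.0]]
toy_centers = [[0.5], [9.5]]
toy_memberships = [[1.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 1.0]]
toy_index = xie_beni(toy_points, toy_centers, toy_memberships)
```

A small index value indicates compact clusters that are far apart, which is why the criterion is typically minimized when comparing partitions with different numbers of clusters.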
Modified fuzzy c-means applied to a Bragg grating-based spectral imager for material clustering
NASA Astrophysics Data System (ADS)
Rodríguez, Aida; Nieves, Juan Luis; Valero, Eva; Garrote, Estíbaliz; Hernández-Andrés, Javier; Romero, Javier
2012-01-01
We have modified the Fuzzy C-Means algorithm for an application related to segmentation of hyperspectral images. Classical fuzzy c-means algorithm uses Euclidean distance for computing sample membership to each cluster. We have introduced a different distance metric, Spectral Similarity Value (SSV), in order to have a more convenient similarity measure for reflectance information. SSV distance metric considers both magnitude difference (by the use of Euclidean distance) and spectral shape (by the use of Pearson correlation). Experiments confirmed that the introduction of this metric improves the quality of hyperspectral image segmentation, creating spectrally more dense clusters and increasing the number of correctly classified pixels.
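The idea of combining a magnitude term with a spectral-shape term can be sketched as follows. The abstract does not give the SSV formula, so the combination used here (band-averaged Euclidean distance together with 1 - r², where r is the Pearson correlation) is an assumed, illustrative form and may differ from the authors' definition; the function name `ssv_distance` is also hypothetical.

```python
import math


def ssv_distance(x, y):
    """Illustrative spectral similarity value between two spectra.

    ASSUMPTION: combines a magnitude term (band-averaged Euclidean
    distance) with a shape term (1 - r^2, r = Pearson correlation).
    The exact definition used in the paper is not given in the abstract.
    """
    n = len(x)
    # Magnitude term: root-mean-square difference over the bands.
    d_e = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / n)
    # Shape term: based on the Pearson correlation of the two spectra.
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    r = cov / math.sqrt(var_x * var_y)  # undefined for perfectly flat spectra
    shape = 1.0 - r * r
    return math.sqrt(d_e ** 2 + shape ** 2)


# Hypothetical reflectance spectra: `brighter` has the same shape as `base`.
base = [0.1, 0.2, 0.4]
brighter = [0.2, 0.4, 0.8]
same = ssv_distance(base, base)
scaled = ssv_distance(base, brighter)
```

In this sketch a uniformly brighter copy of a spectrum contributes only through the magnitude term, since its correlation with the original is 1; a plain Euclidean distance cannot separate the two effects.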
Multivariate image segmentation with cluster size insensitive Fuzzy C-means
J. C. Noordam; W. H. A. M. van den Broek; L. M. C. Buydens
2002-01-01
This paper describes a technique to overcome the sensitivity of fuzzy C-means clustering for unequal cluster sizes in multivariate images. As FCM tends to balance the number of points in each cluster, cluster centres of smaller clusters are drawn to larger adjacent clusters. In order to overcome this, a modified version of FCM, called Conditional FCM, is used to balance
Keller, Brad; Nathan, Diane; Wang, Yan; Zheng, Yuanjie; Gee, James; Conant, Emily; Kontos, Despina
2011-01-01
The relative fibroglandular tissue content in the breast, commonly referred to as breast density, has been shown to be the most significant risk factor for breast cancer after age. Currently, the most common approaches to quantify density are based on either semi-automated methods or visual assessment, both of which are highly subjective. This work presents a novel multi-class fuzzy c-means (FCM) algorithm for fully-automated identification and quantification of breast density, optimized for the imaging characteristics of digital mammography. The proposed algorithm involves adaptive FCM clustering based on an optimal number of clusters derived by the tissue properties of the specific mammogram, followed by generation of a final segmentation through cluster agglomeration using linear discriminant analysis. When evaluated on 80 bilateral screening digital mammograms, a strong correlation was observed between algorithm-estimated PD% and radiological ground-truth of r=0.83 (p<0.001) and an average Jaccard spatial similarity coefficient of 0.62. These results show promise for the clinical application of the algorithm in quantifying breast density in a repeatable manner. PMID:22003744
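For readers unfamiliar with the Jaccard spatial similarity coefficient reported above, a minimal sketch follows. It computes the ratio of intersection to union for two binary masks; the masks and the helper name `jaccard` are made up for illustration and have no connection to the mammography data in the study.

```python
def jaccard(mask_a, mask_b):
    """Jaccard coefficient |A n B| / |A u B| of two equal-sized binary masks."""
    flat_a = [bool(v) for row in mask_a for v in row]
    flat_b = [bool(v) for row in mask_b for v in row]
    intersection = sum(a and b for a, b in zip(flat_a, flat_b))
    union = sum(a or b for a, b in zip(flat_a, flat_b))
    # Two empty masks are treated as identical by convention (an assumption).
    return intersection / union if union else 1.0


# Hypothetical 2 x 3 segmentation masks (1 = dense tissue, 0 = background).
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
score = jaccard(predicted, reference)  # 2 shared pixels out of 4 in the union
```

A coefficient of 1 means the two masks coincide exactly, and 0 means they do not overlap at all.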
Inhomogeneity correction for magnetic resonance images with fuzzy C-mean algorithm
segmentation method. For example, the adaptive fuzzy c-mean (AFCM) image segmentation algorithm proposed Inhomogeneity correction for magnetic resonance images with fuzzy C-mean algorithm Xiang Li of New York at Stony Brook, USA Abstract: Segmentation of magnetic resonance (MR) images plays
An interval type-2 fuzzy c-means algorithm based on spatial information for image segmentation
Cunyong Qiu; Jian Xiao; Long Yu; Lu Han
2011-01-01
Fuzzy c-means algorithm (FCM) is a classic algorithm used in image segmentation. However, FCM is founded with type-1 fuzzy sets, which cannot handle the uncertainties existing in images and algorithm itself. The interval type-2 fuzzy c-means algorithm (IT2FCM) has better performance on handling uncertainties. But for image segmentation, IT2FCM hasn't taken the spatial information of images into account, which makes
Zhu Feng; Song Yuqing; Chen Jianmei
2010-01-01
Applying the fuzzy c-means algorithm to image segmentation without taking into account spatial information apart from intensity values leads to misclassification on boundaries and in inhomogeneous, noisy regions. To solve this problem, a new image segmentation method using adaptive spatially median neighborhood information and the fuzzy c-means algorithm is proposed in this paper. First,
A novel kernelized fuzzy C-means algorithm with application in medical image segmentation
Dao-qiang Zhang; Song-can Chen
2004-01-01
Image segmentation plays a crucial role in many medical imaging applications. In this paper, we present a novel algorithm for fuzzy segmentation of magnetic resonance imaging (MRI) data. The algorithm is realized by modifying the objective function in the conventional fuzzy C-means (FCM) algorithm using a kernel-induced distance metric and a spatial penalty on the membership functions. Firstly, the original
Dzung L. Pham; Jerry L. Prince
1999-01-01
We present a novel algorithm for obtaining fuzzy segmentations of images that are subject to multiplicative intensity inhomogeneities, such as magnetic resonance images. The algorithm is formulated by modifying the objective function in the fuzzy C-means algorithm to include a multiplier field, which allows the centroids for each class to vary across the image. First and second order regularization terms
A wavelet relational fuzzy C-means algorithm for 2D gel image segmentation.
Rashwan, Shaheera; Faheem, Mohamed Talaat; Sarhan, Amany; Youssef, Bayumy A B
2013-01-01
One of the most famous algorithms that appeared in the area of image segmentation is the Fuzzy C-Means (FCM) algorithm. This algorithm has been used in many applications such as data analysis, pattern recognition, and image segmentation. It has the advantages of producing high quality segmentation compared to the other available algorithms. Many modifications have been made to the algorithm to improve its segmentation quality. The proposed segmentation algorithm in this paper is based on the Fuzzy C-Means algorithm adding the relational fuzzy notion and the wavelet transform to it so as to enhance its performance especially in the area of 2D gel images. Both proposed modifications aim to minimize the oversegmentation error incurred by previous algorithms. The experimental results of comparing both the Fuzzy C-Means (FCM) and the Wavelet Fuzzy C-Means (WFCM) to the proposed algorithm on real 2D gel images acquired from human leukemias, HL-60 cell lines, and fetal alcohol syndrome (FAS) demonstrate the improvement achieved by the proposed algorithm in overcoming the segmentation error. In addition, we investigate the effect of denoising on the three algorithms. This investigation proves that denoising the 2D gel image before segmentation can improve (in most of the cases) the quality of the segmentation. PMID:24174990
Infrared image segmentation using Enhanced Fuzzy C-means clustering for automatic detection systems
Sitanshu Gupta; Asim Mukherjee
2011-01-01
This paper proposes Enhanced Fuzzy C-means technique (EFCM) based infrared image segmentation and its broad application in Automatic detection systems. The EFCM based image segmentation is able to approximate the exact number of clusters present in the image. EFCM based segmentation is applied on various infrared images that can be used for automatic detection systems and compared with widely used
Yukihiro Hamasuna; Yasunori Endo; Sadaaki Miyamoto
2011-01-01
This paper presents Mahalanobis distance based fuzzy c-means clustering for uncertain data using penalty vector regularization. When we handle a set of data, data contains inherent uncertainty e.g., errors, ranges or some missing value of attributes. In order to handle such uncertain data as a point in a pattern space the concept of penalty vector has been proposed. Some significant
Generalized fuzzy c-means clustering strategies using Lp norm distances
Richard J. Hathaway; James C. Bezdek; Yingkang Hu
2000-01-01
Fuzzy c-means (FCM) is a useful clustering technique. Modifications of FCM using L1 norm distances increase robustness to outliers. Object and relational data versions of FCM clustering are defined for the more general case where the Lp norm (p⩾1) or semi-norm (0
Performance Characterization of Clustering Algorithms for Colour Image Segmentation
Whelan, Paul F.
techniques (K-Means clustering, Fuzzy C-Means clustering and Adaptive K-Means clustering) that are applied algorithms: K-Means, Fuzzy C-Means and Adaptive K-Means (also called clustering by competitive agglomeration segmentation algorithm. To improve the immunity to noise and increase the performance of the colour e
Unsupervised Image Segmentation Using Automated Fuzzy c-Means
Supatra Sahaphong; Nualsawat Hiransakolwong
2007-01-01
The fuzzy c-means (FCM) clustering algorithm, an unsupervised fuzzy clustering technique, has been widely used in image segmentation. However, the conventional FCM algorithm requires expert users to determine the number of clusters. To overcome this limitation of the FCM algorithm, an automated fuzzy c-mean (AFCM) algorithm is presented in this paper. The proposed algorithm initiates the first two centroids of
A Modified Fuzzy C-Means Algorithm For Collaborative LMAM and School of Mathematical Sciences,
Li, Tiejun
to the Netflix Prize data set and acquire comparable accuracy with that of MF. Categories and Subject Descriptors Keywords Collaborative Filtering, Clustering, Matrix Factorization, Fuzzy C-means, Netflix Prize 1 on servers or to redistribute to lists, requires prior specific permission and/or a fee. 2nd Netflix
Automatic online spike sorting with singular value decomposition and fuzzy C-mean clustering
2012-01-01
Background Understanding how neurons contribute to perception, motor functions and cognition requires the reliable detection of spiking activity of individual neurons during a number of different experimental conditions. An important problem in computational neuroscience is thus to develop algorithms to automatically detect and sort the spiking activity of individual neurons from extracellular recordings. While many algorithms for spike sorting exist, the problem of accurate and fast online sorting still remains a challenging issue. Results Here we present a novel software tool, called FSPS (Fuzzy SPike Sorting), which is designed to optimize: (i) fast and accurate detection, (ii) offline sorting and (iii) online classification of neuronal spikes with very limited or null human intervention. The method is based on a combination of Singular Value Decomposition for fast and highly accurate pre-processing of spike shapes, unsupervised Fuzzy C-mean, high-resolution alignment of extracted spike waveforms, optimal selection of the number of features to retain, automatic identification of the number of clusters, and quantitative quality assessment of resulting clusters independent of their size. After being trained on a short testing data stream, the method can reliably perform supervised online classification and monitoring of single neuron activity. The generalized procedure has been implemented in our FSPS spike sorting software (available free for non-commercial academic applications at the address: http://www.spikesorting.com) using LabVIEW (National Instruments, USA). We evaluated the performance of our algorithm both on benchmark simulated datasets with different levels of background noise and on real extracellular recordings from premotor cortex of Macaque monkeys. The results of these tests showed an excellent accuracy in discriminating low-amplitude and overlapping spikes under strong background noise.
The performance of our method is competitive with respect to other robust spike sorting algorithms. Conclusions This new software provides neuroscience laboratories with a new tool for fast and robust online classification of single neuron activity. This feature could become crucial in situations when online spike detection from multiple electrodes is paramount, such as in human clinical recordings or in brain-computer interfaces. PMID:22871125
Image segmentation using kernel fuzzy c-means clustering on level set method
G. Raghotham Reddy; Tara Saikumar; R. Rameshwar Rao
2011-01-01
In this paper, kernel fuzzy c-means (KFCM) was used to generate an initial contour curve which overcomes leaking at the boundary during the curve propagation. Firstly, KFCM algorithm computes the fuzzy membership values for each pixel. On the basis of KFCM the edge indicator function was redefined. Using the edge indicator function the image segmentation of a medical image was
Image segmentation using kernel fuzzy c-means clustering on level set method on noisy images
G. Raghotham Reddy; K. Ramudu; Syed Zaheeruddin; R. Rameshwar Rao
2011-01-01
In this paper, kernel fuzzy c-means (KFCM) was used to generate an initial contour curve which overcomes leaking at the boundary during the curve propagation. Firstly, KFCM algorithm computes the fuzzy membership values for each pixel. On the basis of KFCM the edge indicator function was redefined. Using the edge indicator function the segmentation of medical images which are added
MR Image Segmentation Based On Fuzzy C-Means Clustering and the Level Set Method
Chengzhong Huang; Bin Yan; Hua Jiang; Dahui Wang
2008-01-01
The purpose of this study was to improve the segmentation performance of the level set method for magnetic resonance (MR) images with difficulties such as fuzzy boundaries or low contrast. In this paper, a level set method was presented in which fuzzy c-means (FCM) was used to prevent boundary leaking during curve propagation. Firstly, FCM algorithm was used to compute the
Self-organization and clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.
Image watermarking using a dynamically weighted fuzzy c-means algorithm
NASA Astrophysics Data System (ADS)
Kang, Myeongsu; Ho, Linh Tran; Kim, Yongmin; Kim, Cheol Hong; Kim, Jong-Myon
2011-10-01
Digital watermarking has received extensive attention as a new method of protecting multimedia content from unauthorized copying. In this paper, we present a nonblind watermarking system using a proposed dynamically weighted fuzzy c-means (DWFCM) technique combined with discrete wavelet transform (DWT), discrete cosine transform (DCT), and singular value decomposition (SVD) techniques for copyright protection. The proposed scheme efficiently selects blocks in which the watermark is embedded using new membership values of DWFCM as the embedding strength. We evaluated the proposed algorithm in terms of robustness against various watermarking attacks and imperceptibility compared to other algorithms [DWT-DCT-based and DCT- fuzzy c-means (FCM)-based algorithms]. Experimental results indicate that the proposed algorithm outperforms other algorithms in terms of robustness against several types of attacks, such as noise addition (Gaussian noise, salt and pepper noise), rotation, Gaussian low-pass filtering, mean filtering, median filtering, Gaussian blur, image sharpening, histogram equalization, and JPEG compression. In addition, the proposed algorithm achieves higher values of peak signal-to-noise ratio (approximately 49 dB) and lower values of measure-singular value decomposition (5.8 to 6.6) than other algorithms.
Fuzzy C-means clustering with local information and kernel metric for image segmentation.
Gong, Maoguo; Liang, Yan; Shi, Jiao; Ma, Wenping; Ma, Jingjing
2013-02-01
In this paper, we present an improved fuzzy C-means (FCM) algorithm for image segmentation by introducing a tradeoff weighted fuzzy factor and a kernel metric. The tradeoff weighted fuzzy factor depends on the space distance of all neighboring pixels and their gray-level difference simultaneously. By using this factor, the new algorithm can accurately estimate the damping extent of neighboring pixels. In order to further enhance its robustness to noise and outliers, we introduce a kernel distance measure to its objective function. The new algorithm adaptively determines the kernel parameter by using a fast bandwidth selection rule based on the distance variance of all data points in the collection. Furthermore, the tradeoff weighted fuzzy factor and the kernel distance measure are both parameter free. Experimental results on synthetic and real images show that the new algorithm is effective and efficient, and is relatively independent of this type of noise. PMID:23008257
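The kernel distance measure mentioned above can be illustrated for the Gaussian kernel, a common choice in kernelized FCM variants. For any kernel with K(x, x) = 1, the squared distance in the induced feature space is ||Φ(x) - Φ(v)||² = 2(1 - K(x, v)). The sketch below assumes a fixed, user-supplied bandwidth `sigma` and the 2σ² convention in the exponent; it does not reproduce the bandwidth selection rule or the tradeoff weighted fuzzy factor described in the abstract.

```python
import math


def gaussian_kernel(x, v, sigma):
    """Gaussian (RBF) kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_sq_distance(x, v, sigma):
    """Squared feature-space distance induced by the Gaussian kernel.

    Since K(x, x) = K(v, v) = 1, the distance reduces to 2 * (1 - K(x, v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


# Hypothetical gray levels: a nearby pixel and an extreme outlier.
near = kernel_sq_distance([100.0], [110.0], sigma=20.0)
outlier = kernel_sq_distance([100.0], [5000.0], sigma=20.0)
```

Unlike the squared Euclidean distance, this measure is bounded above by 2, so a single extreme value cannot dominate the center update; this is one common explanation for the improved robustness to noise and outliers.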
Yin, Jiandong; Sun, Hongzan; Yang, Jiawen; Guo, Qiyong
2014-01-01
The arterial input function (AIF) plays a crucial role in the quantification of cerebral perfusion parameters. The traditional method for AIF detection is based on manual operation, which is time-consuming and subjective. Two automatic methods have been reported that are based on two frequently used clustering algorithms: fuzzy c-means (FCM) and K-means. However, it is still not clear which is better for AIF detection. Hence, we compared the performance of these two clustering methods using both simulated and clinical data. The results demonstrate that K-means analysis can yield more accurate and robust AIF results, although it takes longer to execute than the FCM method. We consider that this longer execution time is trivial relative to the total time required for image manipulation in a PACS setting, and is acceptable if an ideal AIF is obtained. Therefore, the K-means method is preferable to FCM in AIF detection. PMID:24503700
Remote sensing ocean data analyses using fuzzy C-Means clustering
NASA Astrophysics Data System (ADS)
Xu, Suqin; Chen, Jie; Gao, Guoxing
2009-10-01
With the deepening understanding and exploitation of the ocean, more and more fine instruments are installed or loaded on measuring ships and other marine platforms. The high cost and complexity of corrosion place ever-increasing demands on the analysis of the surrounding ocean environment. In this paper, fuzzy C-Means clustering is used to analyze the surrounding ocean environment with remote sensing data. The studied ocean area is treated as a two-dimensional grid or image, and the fuzzy C-Means clustering technique is used to reveal the underlying relationships among the elements and to segment the ocean into regions with similar spectral properties with respect to their influence on instrument corrosion. The influence of environmental elements on instrument corrosion is studied, and a priori spatial information is added to improve the segmentation result. A fitness function containing neighbor information was set up based on the gray-level information and the neighbor relations between pixels. By making use of the global search ability of predator-prey particle swarm optimization, the optimal cluster centers can be obtained by iterative optimization and the segmentation accomplished. The calculation results show that the segmentation is accurate and reasonable. This ocean environment analysis has been used in real applications and has proved valuable in ship instrument corrosion monitoring and in guiding other ocean activities.
Multiple kernel fuzzy C-means algorithm with ALS method for satellite and medical image segmentation
Babu. J Sheshagiri
2012-01-01
Image segmentation refers to the process of partitioning an image into its constituent parts or objects. The same images can be segmented manually, but segmentation algorithms are more accurate than manual segmentation. Clustering is a primary data description method in applications such as data mining, pattern recognition and bioinformatics. The data clustering is
Image segmentation using kernel fuzzy c-means clustering on level set method
NASA Astrophysics Data System (ADS)
Reddy, G. Raghotham; Saikumar, Tara; Rao, R. Rameshwar
2011-10-01
In this paper, kernel fuzzy c-means (KFCM) was used to generate an initial contour curve which overcomes leaking at the boundary during the curve propagation. Firstly, KFCM algorithm computes the fuzzy membership values for each pixel. On the basis of KFCM the edge indicator function was redefined. Using the edge indicator function the image segmentation of a medical image was performed to extract the regions of interest for further processing. The above process of segmentation showed a considerable improvement in the evolution of the level set function.
A multiple-kernel fuzzy C-means algorithm for image segmentation.
Chen, Long; Chen, C L Philip; Lu, Mingzhu
2011-10-01
In this paper, a generalized multiple-kernel fuzzy C-means (FCM) (MKFCM) methodology is introduced as a framework for image-segmentation problems. In the framework, aside from the fact that the composite kernels are used in the kernel FCM (KFCM), a linear combination of multiple kernels is proposed and the updating rules for the linear coefficients of the composite kernel are derived as well. The proposed MKFCM algorithm provides us a new flexible vehicle to fuse different pixel information in image-segmentation problems. That is, different pixel information represented by different kernels is combined in the kernel space to produce a new kernel. It is shown that two successful enhanced KFCM-based image-segmentation algorithms are special cases of MKFCM. Several new segmentation algorithms are also derived from the proposed MKFCM framework. Simulations on the segmentation of synthetic and medical images demonstrate the flexibility and advantages of MKFCM-based approaches. PMID:21803693
Fuzzy and possibilistic clustering algorithms based on generalized reformulation
Nicolaos B. Karayiannis
1996-01-01
This paper presents a new approach to fuzzy and possibilistic clustering based on reformulation. The reformulation of fuzzy c-means (FCM) algorithms provides the basis for reformulating entropy constrained fuzzy clustering (ECFC) algorithms. This paper also proposes a generalized reformulation function and interprets both FCM and ECFC algorithms as special cases of the broad family of fuzzy and possibilistic clustering algorithms
Bias Field Estimation and Adaptive Segmentation of MRI Data Using a Modified Fuzzy C-Means Algorithm
Farag, Aly A.
Bias Field Estimation and Adaptive Segmentation of MRI Data Using a Modified Fuzzy C-Means Algorithm for adaptive fuzzy segmentation of MRI data and estimation of intensity inhomogeneities using fuzzy logic by modifying the objective function of the standard fuzzy c-means (FCM) algorithm to compensate
T1- and T2-weighted spatially constrained fuzzy c-means clustering for brain MRI segmentation
NASA Astrophysics Data System (ADS)
Despotović, Ivana; Goossens, Bart; Vansteenkiste, Ewout; Philips, Wilfried
2010-03-01
The segmentation of brain tissue in magnetic resonance imaging (MRI) plays an important role in clinical analysis and is useful for many applications including studying brain diseases, surgical planning and computer assisted diagnoses. In general, accurate tissue segmentation is a difficult task, not only because of the complicated structure of the brain and the anatomical variability between subjects, but also because of the presence of noise and low tissue contrasts in the MRI images, especially in neonatal brain images. Fuzzy clustering techniques have been widely used in automated image segmentation. However, since the standard fuzzy c-means (FCM) clustering algorithm does not consider any spatial information, it is highly sensitive to noise. In this paper, we present an extension of the FCM algorithm to overcome this drawback, by combining information from both T1-weighted (T1-w) and T2-weighted (T2-w) MRI scans and by incorporating spatial information. This new spatially constrained FCM (SCFCM) clustering algorithm preserves the homogeneity of the regions better than existing FCM techniques, which often have difficulties when tissues have overlapping intensity profiles. The performance of the proposed algorithm is tested on simulated and real adult MR brain images with different noise levels, as well as on neonatal MR brain images with the gestational age of 39 weeks. Experimental quantitative and qualitative segmentation results show that the proposed method is effective and more robust to noise than other FCM-based methods. Also, SCFCM appears as a very promising tool for complex and noisy image segmentation of the neonatal brain.
NASA Astrophysics Data System (ADS)
Sandhir, Radha Pyari; Muhuri, Sanjib; Nayak, Tapan K.
2012-07-01
In high-energy physics experiments, calorimetric data reconstruction requires a suitable clustering technique in order to obtain accurate information about the shower characteristics such as the position of the shower and energy deposition. Fuzzy clustering techniques have high potential in this regard, as they assign data points to more than one cluster, thereby acting as a tool to distinguish between overlapping clusters. Fuzzy c-means (FCM) is one such clustering technique that can be applied to calorimetric data reconstruction. However, it has a drawback: it cannot easily identify and distinguish clusters that are not uniformly spread. A version of the FCM algorithm called dynamic fuzzy c-means (dFCM) allows clusters to be generated and eliminated as required, with the ability to resolve non-uniformly distributed clusters. Both the FCM and dFCM algorithms have been studied and successfully applied to simulated data of a sampling tungsten-silicon calorimeter. It is seen that the FCM technique works reasonably well, and at the same time, the use of the dFCM technique improves the performance.
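The property highlighted above, that fuzzy clustering assigns each data point to more than one cluster, can be seen in a minimal sketch of the standard FCM alternating updates. This is the textbook algorithm, not the dFCM variant or the authors' calorimeter code; the function name `fcm`, the fixed iteration count (in place of a convergence test) and the 1-D toy data are assumptions for illustration.

```python
def fcm(points, centers, m=2.0, iterations=50, eps=1e-12):
    """Minimal standard fuzzy c-means on a list of feature vectors.

    Returns (centers, memberships); memberships[i][k] is the degree to
    which point k belongs to cluster i, and each column sums to 1.
    """
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    c, n, dim = len(centers), len(points), len(points[0])
    exponent = 1.0 / (m - 1.0)
    for _ in range(iterations):
        # Membership update: u_ik is proportional to d_ik^(-2 / (m - 1)).
        u = [[0.0] * n for _ in range(c)]
        for k, x in enumerate(points):
            # eps avoids division by zero when a point sits on a center.
            inv = [(1.0 / (sq_dist(x, v) + eps)) ** exponent for v in centers]
            total = sum(inv)
            for i in range(c):
                u[i][k] = inv[i] / total
        # Center update: mean of the points weighted by u_ik^m.
        centers = [
            [sum(u[i][k] ** m * points[k][d] for k in range(n))
             / sum(u[i][k] ** m for k in range(n))
             for d in range(dim)]
            for i in range(c)
        ]
    return centers, u


# Hypothetical 1-D deposits: two groups plus one point midway between them.
data = [[0.0], [1.0], [5.0], [9.0], [10.0]]
final_centers, u = fcm(data, centers=[[0.0], [10.0]])
```

In this toy run the midpoint at 5.0 receives a membership of about 0.5 in each cluster, while the end points belong almost entirely to their nearest cluster; a hard clustering would have to assign the midpoint to one side.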
Wen, Ying; He, Lianghua; von Deneen, Karen M; Lu, Yue
2013-11-01
We present an effective method for brain tissue classification based on diffusion tensor imaging (DTI) data. The method accounts for two main DTI segmentation obstacles: random noise and magnetic field inhomogeneities. In the proposed method, DTI parametric maps were used to resolve intensity inhomogeneities of brain tissue segmentation because they could provide complementary information for tissues and define accurate tissue maps. An improved fuzzy c-means with spatial constraints proposal was used to enhance the noise and artifact robustness of DTI segmentation. Fuzzy c-means clustering with spatial constraints (FCM_S) could effectively segment images corrupted by noise, outliers, and other imaging artifacts. Its effectiveness contributes not only to the introduction of fuzziness for belongingness of each pixel but also to the exploitation of spatial contextual information. We proposed an improved FCM_S applied on DTI parametric maps, which explores the mean and covariance of the feature spatial information for automated segmentation of DTI. The experiments on synthetic images and real-world datasets showed that our proposed algorithms, especially with new spatial constraints, were more effective. PMID:23891435
Carotid artery image segmentation using modified spatial fuzzy c-means and ensemble clustering.
Hassan, Mehdi; Chaudhry, Asmatullah; Khan, Asifullah; Kim, Jin Young
2012-12-01
Disease diagnosis based on ultrasound imaging is popular because of its non-invasive nature. However, ultrasound imaging systems produce low quality images due to the presence of speckle noise and wave interferences. This shortcoming requires a considerable effort from experts to diagnose a disease from the carotid artery ultrasound images. Image segmentation is one of the techniques that can help efficiently in diagnosing a disease from the carotid artery ultrasound images. Most of the pixels in an image are highly correlated. Considering the spatial information of surrounding pixels in the process of image segmentation may further improve the results. When data is highly correlated, one pixel may belong to more than one cluster with different degrees of membership. In this paper, we present an image segmentation technique, namely improved spatial fuzzy c-means, and an ensemble clustering approach for carotid artery ultrasound images to identify the presence of plaque. Spatial, wavelet and gray level co-occurrence matrix (GLCM) features are extracted from carotid artery ultrasound images. Redundant and less important features are removed from the feature set using a genetic search process. Finally, the segmentation process is performed on the optimal or reduced features. Ensemble clustering with the reduced feature set performs better with respect to segmentation time as well as clustering accuracy. Intima-media thickness (IMT) is measured from the images segmented by the proposed approach. Based on the measured IMT values, a Multi-Layer Back-Propagation Neural Network (MLBPNN) is used to classify the images as normal or abnormal. Experimental results show the learning capability of the MLBPNN classifier and validate the effectiveness of our proposed technique. The proposed approach of segmentation and classification of carotid artery ultrasound images seems to be very useful for detection of plaque in the carotid artery. PMID:22981822
Unsupervised Image Segmentation Using Penalized Fuzzy Clustering Algorithm
Yong Yang; Feng Zhang; Chongxun Zheng; Pan Lin
2005-01-01
Fuzzy c-means (FCM) clustering algorithm as an unsupervised fuzzy clustering technique has been widely used in image segmentation. However, the conventional FCM algorithm is very sensitive to noise because it incorporates no information about spatial context during segmentation. To overcome this limitation of FCM algorithm, a novel penalized fuzzy c-means (PFCM) algorithm for image segmentation is presented in
More Clustering CURE Algorithm
Ullman, Jeffrey D.
More Clustering: CURE Algorithm, Clustering Streams. The CURE Algorithm. Problem with BFR/k-means: assumes clusters are normally distributed in each dimension, and axes are fixed (ellipses at an angle are not OK). CURE: assumes a Euclidean distance; allows clusters to assume any shape. Example
Modified Fuzzy C-Mean in Medical Image Segmentation
Farag, Aly A.
Modified Fuzzy C-Mean in Medical Image Segmentation. Nevin A. Mohamed, M. N. Ahmed and A. Farag making or the defuzzification process. Keywords: Fuzzy c-mean, Image Segmentation, Fuzzy clustering. This paper describes in detail the implementation of a modified fuzzy c-mean algorithm in the brain image
More Clustering CURE Algorithm
Ullman, Jeffrey D.
More Clustering: CURE Algorithm, Non-Euclidean Approaches. The CURE Algorithm. Problem with BFR/k-means: assumes clusters are normally distributed in each dimension, and axes are fixed (ellipses at an angle are not OK). CURE: assumes a Euclidean distance; allows clusters to assume any shape. Example
Online Writer Identification Using Fuzzy C-means Clustering of Character Prototypes
Paris-Sud XI, Université de
information derived from the raw ink signal for indexing and retrieval of the documents. This paper proposes, information retrieval, online handwriting, fuzzy c-means 1. Introduction Technology has become an integral such as handwritten online documents are emerging, which are produced by digital devices such as Tablet PC, personal
Fuzzy Clustering with Improved Artificial Fish Swarm Algorithm
Si He; Nabil Belacel; Habib Hamam; Yassine Bouslimani
2009-01-01
This paper applies the artificial fish swarm algorithm (AFSA) to fuzzy clustering. An improved AFSA with adaptive visual and adaptive step is proposed. AFSA enhances the performance of the fuzzy C-means (FCM) algorithm. A computational experiment shows that the AFSA-improved FCM outperforms both the conventional FCM algorithm and the FCM improved by a genetic algorithm (GA).
A Multiple-Kernel Fuzzy C-Means Algorithm for Image Segmentation
Long Chen; C. L. Philip Chen; Mingzhu Lu
2011-01-01
In this paper, a generalized multiple-kernel fuzzy C-means (FCM) (MKFCM) methodology is introduced as a framework for image-segmentation problems. In the framework, aside from the fact that the composite kernels are used in the kernel FCM (KFCM), a linear combination of multiple kernels is proposed and the updating rules for the linear coefficients of the composite kernel are
NASA Astrophysics Data System (ADS)
Nasseri, Aynur; Jafar Mohammadzadeh, Mohammad; Hashem Tabatabaei Raeisi, S.
2015-04-01
This paper deals with the application of the ant colony algorithm (AC) to a seismic dataset from Dezful Embayment in the southwest region of Iran. The objective of the approach is to generate an accurate representation of faults and discontinuities to assist in pertinent matters such as well planning and field optimization. The AC analyzed all spatial discontinuities in the seismic attributes from which features were extracted. True fault information from the attributes was detected by many artificial ants, whereas noise and the remains of the reflectors were eliminated. Furthermore, the fracture enhancement procedure was conducted by three steps on seismic data of the area. In the first step several attributes such as chaos, variance/coherence and dip deviation were taken into account; the resulting maps indicate high-resolution contrast for the variance attribute. Subsequently, the enhancement of spatial discontinuities was performed and finally elimination of the noise and remains of non-faulting events was carried out by simulating the behavior of ant colonies. After considering stepwise attribute optimization, focusing on chaos and variance in particular, an attribute fusion was generated and used in the ant colony algorithm. The resulting map displayed the highest performance in feature detection along the main structural feature trend, confined to a NW–SE direction. Thus, the optimized attribute fusion might be used with greater confidence to map the structural feature network with more accuracy and resolution. In order to assess the performance of the AC in feature detection, and cross validate the reliability of the method used, fuzzy c-means clustering (FCMC) was employed for the same dataset. Comparing the maps illustrates the effectiveness and preference of the AC approach due to its high resolution contrast for structural feature detection compared to the FCMC method. 
Accordingly, 3D planes of discontinuity determined spatial distribution of fractures in the field in order to assist well planning. Results revealed that the high impedance location probability related to an area in the vicinity of the faults, whilst low impedance location probably could indicate zones of high permeability which indicate flow conduits. Analysis under the present study suggests that the orientation and magnitude of fractures exhibiting the main trend of NW–SE in Dezful Embayment is more susceptible to stimulation and is more likely to open for fluid flow.
MR brain image segmentation using an enhanced fuzzy C-means algorithm
L. Szilagyi; Z. Benyol; S. M. Szilagyi; H. S. Adam
2003-01-01
This paper presents a new algorithm for fuzzy segmentation of MR brain images. Starting from the standard FCM and its bias-corrected version BCFCM algorithm, by splitting up the two major steps of the latter, and by introducing a new factor, the amount of required calculations is considerably reduced. The algorithm provides good-quality segmented brain images in a very quick way, which
A modified fuzzy C-means with particle swarm optimization adaptive image segmentation algorithm
NASA Astrophysics Data System (ADS)
Gu, Yingjie; Jia, Zhenhong; Yang, Jie; Pang, Shaoning
2010-07-01
This paper proposes a new method in which the number of clusters is self-adapted and FCM with upper and lower cut-offs, combined with PSO, takes the place of the common FCM combined with PSO. Experimental results show that, compared with combining particle swarm optimization (PSO) with common FCM, the method improves image segmentation, optimizes the number of clusters, and converges quickly.
NASA Astrophysics Data System (ADS)
Wang, Shilong; Xu, Yuru; Pang, Yongjie
2011-03-01
An underwater image has a low S/N and fuzzy edges. Processing it directly with traditional methods gives unsatisfactory results. Though the traditional fuzzy C-means algorithm can sometimes divide the image into object and background, its time-consuming computation is often an obstacle. The mission of the vision system of an autonomous underwater vehicle (AUV) is to rapidly and exactly deal with the information about the object in a complex environment for the AUV to use the obtained result to execute the next task. So, by using the statistical characteristics of the gray image histogram, a fast and effective fuzzy C-means underwater image segmentation algorithm was presented. With the weighted histogram modifying the fuzzy membership, the above algorithm can not only cut down on a large amount of data processing and storage during the computation process compared with the traditional algorithm, so as to speed up the segmentation, but also improve the quality of underwater image segmentation. Finally, particle swarm optimization (PSO) described by the sine function was introduced to the algorithm mentioned above. It made up for the shortcoming that the FCM algorithm cannot get the global optimal solution. Thus, on the one hand, it considers the global impact and achieves the local optimal solution, and on the other hand, further greatly increases the computing speed. Experimental results indicate that the novel algorithm can reach a better segmentation quality and the processing time of each image is reduced. They enhance efficiency and satisfy the requirements of a highly effective, real-time AUV.
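The histogram idea in the abstract above can be illustrated with a minimal sketch. It runs the standard FCM updates over the occupied gray levels and weights each level by its histogram count, so the cost per iteration depends on the number of gray levels rather than the number of pixels. The function name `histogram_fcm`, the even initialization of centers, and all parameter defaults are assumptions made for illustration; the paper's weighted-membership modification and its sine-function PSO step are not reproduced here.

```python
def histogram_fcm(hist, n_clusters=2, m=2.0, n_iter=50, eps=1e-9):
    """Fuzzy c-means over gray levels, each weighted by its histogram count.

    hist[g] is the number of pixels with gray level g (hypothetical input).
    Returns (centers, memberships), where memberships[g] lists the degree
    to which gray level g belongs to each cluster.
    """
    levels = [g for g, h in enumerate(hist) if h > 0]
    weights = [hist[g] for g in levels]
    lo, hi = levels[0], levels[-1]
    # Assumed initialization: spread centers evenly over the occupied range.
    centers = [lo + (hi - lo) * (k + 0.5) / n_clusters for k in range(n_clusters)]
    power = 2.0 / (m - 1.0)

    def memberships_for(g):
        # Standard FCM membership: inverse distance ratios, normalized to 1.
        inv = [(abs(g - c) + eps) ** -power for c in centers]
        total = sum(inv)
        return [x / total for x in inv]

    for _ in range(n_iter):
        u = [memberships_for(g) for g in levels]
        # Center update: mean of gray levels weighted by count * u^m.
        new_centers = []
        for k in range(n_clusters):
            num = sum(w * (u[i][k] ** m) * g
                      for i, (g, w) in enumerate(zip(levels, weights)))
            den = sum(w * (u[i][k] ** m) for i, w in enumerate(weights))
            new_centers.append(num / den)
        centers = new_centers

    return centers, {g: memberships_for(g) for g in levels}
```

A pixel is then labeled by looking up its gray level in the returned membership table, which is where the saving over per-pixel FCM comes from.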
S. SAHEB BASHA; K. SATYA PRASAD
2009-01-01
Breast cancer is one of the major causes of death among women. Small clusters of microcalcifications appearing as collections of white spots on mammograms give an early warning of breast cancer. Primary prevention seems impossible since the causes of this disease still remain unknown. An improvement of early diagnostic techniques is critical for women's quality of life. Mammography
Survey of clustering algorithms
Rui Xu; Donald Wunsch II
2005-01-01
Data analysis plays an indispensable role for understanding various phenomena. Cluster analysis, primitive exploration with little or no prior knowledge, consists of research developed across a wide variety of communities. The diversity, on one hand, equips us with many tools. On the other hand, the profusion of options causes confusion. We survey clustering algorithms for data sets appearing in statistics,
Hassan, Mehdi; Chaudhry, Asmatullah; Khan, Asifullah; Iftikhar, M Aksam
2014-02-01
In this paper, a robust method is proposed for segmentation of medical images by exploiting the concept of information gain. Medical images contain inherent noise due to imaging equipment, operating environment and patient movement during image acquisition. A robust medical image segmentation technique is thus inevitable for accurate results in subsequent stages. The clustering technique proposed in this work updates fuzzy membership values and cluster centroids based on information gain computed from the local neighborhood of a pixel. The proposed approach is less sensitive to noise and produces homogeneous clustering. Experiments are performed on medical and non-medical images and results are compared with state of the art segmentation approaches. Analysis of visual and quantitative results verifies that the proposed approach outperforms other techniques both on noisy and noise free images. Furthermore, the proposed technique is used to segment a dataset of 300 real carotid artery ultrasound images. A decision system for plaque detection in the carotid artery is then proposed. Intima media thickness (IMT) is measured from the segmented images produced by the proposed approach. A feature vector based on IMT values is constructed for making decision about the presence of plaque in carotid artery using probabilistic neural network (PNN). The proposed decision system detects plaque in carotid artery images with high accuracy. Finally, effect of the proposed segmentation technique has also been investigated on classification of carotid artery ultrasound images. PMID:24239296
A modified fuzzy c-means algorithm for bias field estimation and segmentation of MRI data
Mohamed N. Ahmed; Sameh M. Yamany; Nevin Mohamed; Aly A. Farag; Thomas Moriarty
2002-01-01
In this paper, we present a novel algorithm for fuzzy segmentation of magnetic resonance imaging (MRI) data and estimation of intensity inhomogeneities using fuzzy logic. MRI intensity inhomogeneities can be attributed to imperfections in the radio-frequency coils or to problems associated with the acquisition sequences. The result is a slowly varying shading artifact over the image that
A NOVEL IMAGE SEGMENTATION APPROACH BASED ON NEUTROSOPHIC SET AND IMPROVED FUZZY C-MEANS ALGORITHM
H. D. CHENG; YANHUI GUO; YINGTAO ZHANG
2011-01-01
Image segmentation is an important component in image processing, pattern recognition and computer vision. Many segmentation algorithms have been proposed. However, segmentation methods for both noisy and noise-free images have not been studied in much detail. Neutrosophic set (NS), a part of neutrosophy theory, studies the origin, nature, and scope of neutralities, as well as their interaction with different ideational spectra.
A New Approach to Lung Image Segmentation using Fuzzy Possibilistic C-Means Algorithm
M. Gomathi; P. Thangaraj
2010-01-01
Image segmentation is a vital part of image processing. Segmentation has its application widespread in the field of medical images in order to diagnose curious diseases. The same medical images can be segmented manually. But the accuracy of image segmentation using the segmentation algorithms is more when compared with the manual segmentation. In the field of medical diagnosis an extensive
Image Segmentation Using Fast Fuzzy C Means Based on Particle Swarm Optimization
Zang Jing; Li Bo
2010-01-01
This paper presents an image segmentation technique using a fast fuzzy C-means clustering algorithm based on the particle swarm optimization (PSO) algorithm. The algorithm utilizes the strong global optimization ability of PSO and avoids the sensitivity of the fast FCM algorithm to local optima. Furthermore, the PSO defines the centers and number of clusters automatically. The two algorithms are combined to find
Miriam B. Brandl; Dominik Beck; Tuan D. Pham
2011-01-01
The high dimensionality of image-based dataset can be a drawback for classification accuracy. In this study, we propose the application of fuzzy c-means clustering, cluster validity indices and the notation of a joint-feature-clustering matrix to find redundancies of image-features. The introduced matrix indicates how frequently features are grouped in a mutual cluster. The resulting information can be used to find
Sopharak, Akara; Uyyanonvara, Bunyarit; Barman, Sarah
2009-01-01
Exudates are the primary sign of Diabetic Retinopathy. Early detection can potentially reduce the risk of blindness. An automatic method to detect exudates from low-contrast digital images of retinopathy patients with non-dilated pupils using a Fuzzy C-Means (FCM) clustering is proposed. Contrast enhancement preprocessing is applied before four features, namely intensity, standard deviation on intensity, hue and a number of edge pixels, are extracted to supply as input parameters to coarse segmentation using FCM clustering method. The first result is then fine-tuned with morphological techniques. The detection results are validated by comparing with expert ophthalmologists’ hand-drawn ground-truths. Sensitivity, specificity, positive predictive value (PPV), positive likelihood ratio (PLR) and accuracy are used to evaluate overall performance. It is found that the proposed method detects exudates successfully with sensitivity, specificity, PPV, PLR and accuracy of 87.28%, 99.24%, 42.77%, 224.26 and 99.11%, respectively. PMID:22574005
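The five evaluation measures named in the abstract above follow from a confusion matrix. The sketch below uses the textbook definitions; the function name `detection_metrics` and the counts in the usage note are hypothetical, and the abstract does not say how its reported figures were aggregated across images, so this is not a reconstruction of the paper's evaluation.

```python
def detection_metrics(tp, fp, tn, fn):
    """Textbook detection measures from confusion-matrix counts.

    tp, fp, tn, fn: true/false positives and negatives (e.g. pixel counts
    against a hand-drawn ground truth). Assumes every denominator is nonzero.
    """
    sensitivity = tp / (tp + fn)              # fraction of true exudate found
    specificity = tn / (tn + fp)              # fraction of background kept
    ppv = tp / (tp + fp)                      # positive predictive value
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "plr": plr,
        "accuracy": accuracy,
    }
```

For example, `detection_metrics(tp=80, fp=100, tn=900, fn=20)` gives a sensitivity of 0.8 and a specificity of 0.9 but a PPV of only about 0.44. Specificity and accuracy can be high while PPV stays low when positives are rare compared with the background.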
Louisville, University of
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 21, NO. 3, MARCH 2002 193 A Modified Fuzzy C-Means--In this paper, we present a novel algorithm for fuzzy segmentation of magnetic resonance imaging (MRI) data algorithm. Index Terms--Bias field, fuzzy logic, image segmentation, MR imaging. I. INTRODUCTION SPATIAL
NASA Astrophysics Data System (ADS)
An, Yu; Liu, Jie; Ye, Jinzuo; Mao, Yamin; Yang, Xin; Jiang, Shixin; Chi, Chongwei; Tian, Jie
2015-03-01
As an important molecular imaging modality, fluorescence molecular imaging (FMI) has the advantages of high sensitivity, low cost and ease of use. By labeling the regions of interest with fluorophore, FMI can noninvasively obtain the distribution of fluorophore in vivo. However, because the spectrum of fluorescence lies in the visible light range, there is a large amount of autofluorescence on the surface of the bio-tissues, which is a major disturbing factor in FMI. Meanwhile, the high level of dark current of the charge-coupled device (CCD) camera and other influencing factors can also produce a lot of background noise. In this paper, a novel method for image denoising of FMI based on fuzzy C-Means clustering (FCM) is proposed, because the fluorescent signal is the major component of the fluorescence images, and the intensity of autofluorescence and other background signals is relatively lower than the fluorescence signal. First, the fluorescence image is smoothed by sliding-neighborhood operations to initially eliminate the noise. Then, the wavelet transform (WLT) is performed on the fluorescence images to obtain the major component of the fluorescent signals. After that, the FCM method is adopted to separate the major component and background of the fluorescence images. Finally, the proposed method was validated using the original data obtained by an in vivo implanted fluorophore experiment, and the results show that our proposed method can effectively obtain the fluorescence signal while eliminating the background noise, which could increase the quality of fluorescence images.
NASA Astrophysics Data System (ADS)
Brandl, Miriam B.; Beck, Dominik; Pham, Tuan D.
2011-06-01
The high dimensionality of image-based dataset can be a drawback for classification accuracy. In this study, we propose the application of fuzzy c-means clustering, cluster validity indices and the notation of a joint-feature-clustering matrix to find redundancies of image-features. The introduced matrix indicates how frequently features are grouped in a mutual cluster. The resulting information can be used to find data-derived feature prototypes with a common biological meaning, reduce data storage as well as computation times and improve the classification accuracy.
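A matrix that records how often features share a cluster can be sketched as a co-occurrence frequency table. The sketch assumes each clustering run has already been reduced to one hard label per feature, for instance by taking the cluster with the highest fuzzy membership. The function name `co_clustering_matrix` and this input format are illustrative assumptions; the abstract does not give the paper's exact construction.

```python
def co_clustering_matrix(labelings):
    """Fraction of clustering runs in which two features share a cluster.

    labelings: list of runs; each run is a list with one hard cluster label
    per feature (hypothetical format). Entry [i][j] of the result is the
    fraction of runs that placed features i and j in the same cluster.
    """
    n_features = len(labelings[0])
    counts = [[0] * n_features for _ in range(n_features)]
    for labels in labelings:
        for i in range(n_features):
            for j in range(n_features):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    n_runs = len(labelings)
    return [[c / n_runs for c in row] for row in counts]
```

Feature pairs with entries close to 1 are candidates for redundancy, since they were grouped together in nearly every run.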
Fuzzy dynamic clustering algorithm Sankar K. PAL and Sushmita M ITRA
Pal, Sankar Kumar
algorithms include the fuzzy isodata [5], fuzzy C-means [6] and clustering by decomposition of induced fuzzy lems, e.g., in segmenting an image [7], in defining a feature evaluation index [8], in determining Fuzzy dynamic clustering algorithm Sankar K. PAL and Sushmita MITRA Electronics and Communication
Pekka Teppola; Satu-Pia Mujunen; Pentti Minkkinen
1998-01-01
In this paper, a combined approach of partial least squares (PLS) and fuzzy c-means (FCM) clustering for the monitoring of an activated-sludge waste-water treatment plant is presented. Their properties are also investigated. Both methods were applied together in process monitoring. PLS was used for extracting the most useful information from the control and process variables in order to predict a
Robust Image Segmentation Algorithm Using Fuzzy Clustering Based on Kernel-Induced Distance Measure
Yanling Li; Yi Shen
2008-01-01
Fuzzy c-means (FCM) algorithm is one of the most popular methods for image segmentation. However, the standard FCM algorithm is noise sensitive because of not taking into account the spatial information in the image. To overcome the above problem, Z. Yang, et al. propose a robust fuzzy clustering based image segmentation method for noisy image(RFCM). Although the RFCM algorithm is
Effective FCM noise clustering algorithms in medical images.
Kannan, S R; Devi, R; Ramathilagam, S; Takezawa, K
2013-02-01
The main motivation of this paper is to introduce a class of robust non-Euclidean distance measures for the original data space to derive new objective function and thus clustering the non-Euclidean structures in data to enhance the robustness of the original clustering algorithms to reduce noise and outliers. The new objective functions of proposed algorithms are realized by incorporating the noise clustering concept into the entropy based fuzzy C-means algorithm with suitable noise distance which is employed to take the information about noisy data in the clustering process. This paper presents initial cluster prototypes using prototype initialization method, so that this work tries to obtain the final result with less number of iterations. To evaluate the performance of the proposed methods in reducing the noise level, experimental work has been carried out with a synthetic image which is corrupted by Gaussian noise. The superiority of the proposed methods has been examined through the experimental study on medical images. The experimental results show that the proposed algorithms perform significantly better than the standard existing algorithms. The accurate classification percentage of the proposed fuzzy C-means segmentation method is obtained using silhouette validity index. PMID:23219569
Cluster center initialization algorithm for K-means clustering
Shehroz S. Khan; Amir Ahmad
2004-01-01
The performance of iterative clustering algorithms that converge to numerous local minima depends highly on the initial cluster centers. Generally, initial cluster centers are selected randomly. In this paper we propose an algorithm to compute initial cluster centers for K-means clustering. This algorithm is based on two observations that some of the patterns are very similar to each other and that is
Genetic clustering algorithm for searching the nonspherically shaped clusters
NASA Astrophysics Data System (ADS)
Yang, Shiueng B.; Lee, Yi L.
2003-04-01
The K-means algorithm is a well-known clustering method. However, it is suited to finding clusterings that consist of compact spherical clusters. If the clusters are not spherical, the K-means algorithm fails to find the clustering result. Therefore, in this study, a genetic clustering algorithm is proposed to find the clustering whether the shape of the clusters is spherical or not. The genetic clustering algorithm can also automatically find the number of clusters in the data set, so users need not predefine it. Experimental results show that our proposed genetic clustering algorithm achieves better performance than the traditional clustering algorithms.
Fast Algorithms for Projected Clustering
Charu C. Aggarwal; Cecilia Magdalena Procopiuc; Joel L. Wolf; Philip S. Yu; Jong Soo Park
1999-01-01
The clustering problem is well known in the database literature for its numerous applications in problems such as customer segmentation, classification and trend analysis. Unfortunately, all known algorithms tend to break down in high dimensional spaces because of the inherent sparsity of the points. In such high dimensional spaces not all dimensions may be relevant to a given cluster. One
Novel Clustering Algorithms Based on Improved Artificial Fish Swarm Algorithm
Yongming Cheng; Mingyan Jiang; Dongfeng Yuan
2009-01-01
An improved artificial fish swarm algorithm (IAFSA) is proposed, and its complexity is much less than the original algorithm (AFSA) because of a new proposed fish behavior. Based on IAFSA, two novel algorithms for data clustering are presented. One is the improved artificial fish swarm clustering (IAFSC) algorithm, the other is a hybrid fuzzy clustering algorithm that incorporates the fuzzy
A k-means Based Clustering Algorithm Domenico Daniele Bloisi and Luca Iocchi
Iocchi, Luca
and the performance of the proposed method. Key words: data clustering, k-means, image segmentation 1 Introduction of well known algorithms that perform the task of clustering data: k-means, Fuzzy C-means, Hierarchical segmentation of the images. Background subtraction is used to first distinguish foreground (i.e., boats) from
NASA Astrophysics Data System (ADS)
Sadr, Ali; Momtaz, Amirkeyvan
2012-01-01
Clustering is one of the image-processing methods used in non-destructive testing (NDT). As one of the initializing parameters, most clustering algorithms, like fuzzy C means (FCM), Iterative self-organization data analysis (ISODATA), K-means, and their derivatives, require the number of clusters. This paper proposes an algorithm for clustering the pixels in C-scan images without any initializing parameters. In this state-of-the-art method, an image is sampled based on the rosette pattern and according to the pattern characteristics, and extracted samples are clustered and then the number of clusters is determined. The centroids of the classes are computed by means of a method used to calculate the distribution function. Based on different data sets, the results show that the algorithm improves the clustering capability by 92.93% and 91.93% in comparison with FCM and K-means algorithms, respectively. Moreover, when dealing with high-resolution data sets, the efficiency of the algorithm in terms of cluster detection and run time improves considerably.
On Spectral Clustering: Analysis and an algorithm
Andrew Y. Ng; Michael I. Jordan; Yair Weiss
2001-01-01
Despite many empirical successes of spectral clustering methods (algorithms that cluster points using eigenvectors of matrices derived from the distances between the points), there are several unresolved issues. First, there is a wide variety of algorithms that use the eigenvectors in slightly different ways. Second, many of these algorithms have no proof that they will actually compute a reasonable clustering. In this paper, we present a simple
Scaling Clustering Algorithms to Large Databases
Paul S. Bradley; Usama M. Fayyad; Cory Reina
1998-01-01
Practical clustering algorithms require multiple data scans to achieve convergence. For large databases, these scans become prohibitively expensive. We present a scalable clustering framework applicable to a wide class of iterative clustering. We require at most one scan of the database. In this work, the framework is instantiated and numerically justified with the popular K-Means clustering algorithm. The method is
Choosing the cluster to split in bisecting divisive clustering algorithms
Boley, Daniel
- 1 - Choosing the cluster to split in bisecting divisive clustering algorithms Sergio M. Savaresi@ian.pv.cnr.it * Corresponding Author Abstract. This paper deals with the problem of clustering a data-set. In particular-problems: the problem of choosing which cluster must be divided, and the problem of splitting the selected cluster
HARP: A Practical Projected Clustering Algorithm
Kevin Y. Yip; David W. Cheung; Michael K. Ng
2004-01-01
In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed to identify such projected clusters, but most of them rely on some user parameters to guide the clustering process. The clustering accuracy can be seriously degraded if incorrect values are used. Unfortunately, in real situations, it is rarely
Color image segmentation using multiscale fuzzy C-means and graph theoretic merging
S. Makrogiannis; Christos Theoharatos; George Economou; Spiros Fotopoulos
2003-01-01
A multiresolution color image segmentation method is presented that incorporates the main principles of region-based and cluster analysis approaches. A multiscale dissimilarity measure in the feature space is proposed that makes use of non-parametric cluster validity analysis and fuzzy C-Means clustering. Detected clusters are utilized to assign membership functions to the image regions. In addition, a graph theoretic merging algorithm
Infrared Image Segmentation via Fast Fuzzy C-Means with Spatial Information
Jin Wu; Juan Li; Jian Liu; Jinwen Tian
2004-01-01
Forward looking infrared (FLIR) image segmentation is crucial for automatic target recognition (ATR). This paper presents a new clustering algorithm for FLIR image segmentation. We combine the standard fuzzy c-means algorithm (FCM) with two-dimensional histogram of the image, and modify the membership function of the FCM for taking into account the spatial information of image data. Since the FCM is
Hierarchical Clustering Algorithms for Document Datasets
Ying Zhao; George Karypis; Usama M. Fayyad
2005-01-01
Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In particular, clustering algorithms that build meaningful hierarchies out of large document collections are ideal tools for their interactive visualization and exploration as they provide data-views that are consistent, predictable,
HARP: A Practical Projected Clustering Algorithm
Cheung, David Wai-lok
HARP: A Practical Projected Clustering Algorithm Kevin Y. Yip, David W. Cheung, Member, IEEE Computer Society, and Michael K. Ng Abstract--In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed
Modified Fuzzy C-Mean in Medical Image Segmentation
Louisville, University of
Modified Fuzzy C-Mean in Medical Image Segmentation. Nevin A. Mohamed, M. N. Ahmed and A. Farag defuzzification process. Keywords: Fuzzy c-mean, Image Segmentation, Fuzzy clustering, Adaptive filter. I of brain images. We propose a fully automatic technique to obtain image clusters. A modified fuzzy c-mean
Introduction to Cluster Monte Carlo Algorithms
NASA Astrophysics Data System (ADS)
Luijten, E.
This chapter provides an introduction to cluster Monte Carlo algorithms for classical statistical-mechanical systems. A brief review of the conventional Metropolis algorithm is given, followed by a detailed discussion of the lattice cluster algorithm developed by Swendsen and Wang and the single-cluster variant introduced by Wolff. For continuum systems, the geometric cluster algorithm of Dress and Krauth is described. It is shown how their geometric approach can be generalized to incorporate particle interactions beyond hardcore repulsions, thus forging a connection between the lattice and continuum approaches. Several illustrative examples are discussed.
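The single-cluster variant mentioned above can be sketched for the two-dimensional Ising model. The choice of model, the square periodic lattice, the coupling J = 1, and the function name `wolff_step` are all assumptions for illustration, since the abstract does not name a specific system. A cluster is grown from a random seed by adding aligned neighbors with probability 1 - exp(-2*beta), and the whole cluster is then flipped.

```python
import math
import random


def wolff_step(spins, L, beta, rng):
    """One Wolff single-cluster update on an L x L periodic Ising lattice.

    spins: list of L lists of +1/-1 values, modified in place.
    beta: inverse temperature, with coupling J = 1 (assumed).
    rng: a random.Random instance.
    Returns the size of the flipped cluster.
    """
    p_add = 1.0 - math.exp(-2.0 * beta)   # bond activation probability
    start = (rng.randrange(L), rng.randrange(L))
    seed_spin = spins[start[0]][start[1]]
    cluster = {start}
    stack = [start]
    while stack:
        x, y = stack.pop()
        neighbors = (((x + 1) % L, y), ((x - 1) % L, y),
                     (x, (y + 1) % L), (x, (y - 1) % L))
        for nx, ny in neighbors:
            # Only aligned spins outside the cluster can join, each bond
            # being tested once with probability p_add.
            if ((nx, ny) not in cluster
                    and spins[nx][ny] == seed_spin
                    and rng.random() < p_add):
                cluster.add((nx, ny))
                stack.append((nx, ny))
    for x, y in cluster:
        spins[x][y] = -seed_spin          # flip the whole cluster at once
    return len(cluster)
```

At high temperature (small beta) the clusters are small and the update resembles single-spin flips. At low temperature the clusters span most of the lattice, which is what lets cluster updates decorrelate configurations faster than the Metropolis algorithm near criticality.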
NASA Astrophysics Data System (ADS)
Zainuddin, Zarita; Lai, Kee Huong; Ong, Pauline
2013-04-01
Artificial neural networks (ANNs) are powerful mathematical models that are used to solve complex real world problems. Wavelet neural networks (WNNs), which were developed based on the wavelet theory, are a variant of ANNs. During the training phase of WNNs, several parameters need to be initialized; including the type of wavelet activation functions, translation vectors, and dilation parameter. The conventional k-means and fuzzy c-means clustering algorithms have been used to select the translation vectors. However, the solution vectors might get trapped at local minima. In this regard, the evolutionary harmony search algorithm, which is capable of searching for near-optimum solution vectors, both locally and globally, is introduced to circumvent this problem. In this paper, the conventional k-means and fuzzy c-means clustering algorithms were hybridized with the metaheuristic harmony search algorithm. In addition to obtaining the estimation of the global minima accurately, these hybridized algorithms also offer more than one solution to a particular problem, since many possible solution vectors can be generated and stored in the harmony memory. To validate the robustness of the proposed WNNs, the real world problem of epileptic seizure detection was presented. The overall classification accuracy from the simulation showed that the hybridized metaheuristic algorithms outperformed the standard k-means and fuzzy c-means clustering algorithms.
Noisy Image Segmentation by a Robust Clustering Algorithm Based on DC Programming and DCA
Le Thi Hoai An; Le Hoai Minh; Nguyen Trong Phuc; Pham Dinh Tao
2008-01-01
We present a fast and robust algorithm for image segmentation problems via Fuzzy C-Means (FCM) clustering model. Our approach is based on DC (Difference of Convex functions) programming and DCA (DC Algorithms) that have been successfully applied in a lot of various fields of Applied Sciences, including Machine Learning. In an elegant way, the FCM model is reformulated as a
Farjam, Reza; Tsien, Christina I.; Lawrence, Theodore S. [Department of Radiation Oncology, University of Michigan, 1500 East Medical Center Drive, SPC 5010, Ann Arbor, Michigan 48109-5010 (United States)]; Cao, Yue, E-mail: yuecao@umich.edu [Department of Radiation Oncology, University of Michigan, 1500 East Medical Center Drive, SPC 5010, Ann Arbor, Michigan 48109-5010 (United States); Department of Radiology, University of Michigan, 1500 East Medical Center Drive, Med Inn Building C478, Ann Arbor, Michigan 48109-5842 (United States); Department of Biomedical Engineering, University of Michigan, 2200 Bonisteel Boulevard, Ann Arbor, Michigan 48109-2099 (United States)]
2014-01-15
Purpose: To develop a pharmacokinetic model-free framework to analyze the dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) data for assessment of response of brain metastases to radiation therapy. Methods: Twenty patients with 45 analyzable brain metastases had MRI scans prior to whole brain radiation therapy (WBRT) and at the end of the 2-week therapy. The volumetric DCE images covering the whole brain were acquired on a 3T scanner with approximately 5 s temporal resolution and a total scan time of about 3 min. DCE curves from all voxels of the 45 brain metastases were normalized and then temporally aligned. A DCE matrix that is constructed from the aligned DCE curves of all voxels of the 45 lesions obtained prior to WBRT is processed by principal component analysis to generate the principal components (PCs). Then, the projection coefficient maps prior to and at the end of WBRT are created for each lesion. Next, a pattern recognition technique, based upon fuzzy-c-means clustering, is used to delineate the tumor subvolumes relating to the value of the significant projection coefficients. The relationship between changes in different tumor subvolumes and treatment response was evaluated to differentiate responsive from stable and progressive tumors. Performance of the PC-defined tumor subvolume was also evaluated by receiver operating characteristic (ROC) analysis in prediction of nonresponsive lesions and compared with physiological-defined tumor subvolumes. Results: The projection coefficient maps of the first three PCs contain almost all response-related information in DCE curves of brain metastases. The first projection coefficient, related to the area under DCE curves, is the major component to determine response while the third one has a complementary role. In ROC analysis, areas under the curve of 0.88 ± 0.05 and 0.86 ± 0.06 were achieved for the PC-defined and physiological-defined tumor subvolumes, respectively, in response assessment.
Conclusions: The PC-defined subvolume of a brain metastasis could predict tumor response to therapy similar to the physiological-defined one, while the former is determined more rapidly for clinical decision-making support.
NASA Astrophysics Data System (ADS)
Guofeng, Jin; Wei, Zhang; Zhengwei, Yang; Zhiyong, Huang; Yuanjia, Song; Dongdong, Wang; Gan, Tian
2012-12-01
The Fuzzy C-Means (FCM) clustering algorithm is an effective image segmentation algorithm that combines unsupervised clustering with the idea of fuzzy sets. It is widely applied to image segmentation, but it has many problems, such as a large amount of computation, sensitivity to initial data values and to noise in images, and a tendency to fall into local optima. To overcome these problems, a fuzzy clustering algorithm based on Particle Swarm Optimization (PSO) is proposed. This article first uses the PSO algorithm, which has a powerful global search capability, to optimize the FCM centers, and then uses these centers to partition the images; the speed of image segmentation was boosted and the segmentation accuracy was improved. The experimental results show that the PSO-FCM algorithm can effectively avoid the disadvantages of FCM, boost the speed, and produce a better image segmentation result.
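The second stage described in this abstract, running FCM from centers supplied by an outer optimizer, can be illustrated with a minimal sketch of the standard FCM update rules. This is not the authors' implementation: the PSO stage is omitted, and the function name `fcm`, the fuzzifier default `m=2.0`, and the stopping tolerance are assumptions made for illustration only.

```python
import numpy as np

def fcm(X, centers, m=2.0, n_iter=100, tol=1e-5):
    """Standard fuzzy c-means started from caller-supplied centers.

    `centers` stands in for the output of a global search such as PSO
    (hypothetical here; any initial guess of shape (c, d) works).
    """
    X = np.asarray(X, dtype=float)
    centers = np.asarray(centers, dtype=float).copy()
    for _ in range(n_iter):
        # Squared distance from every sample to every center, shape (n, c).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        # Membership update: u_ik is proportional to d_ik^(-2/(m-1)).
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the samples weighted by u_ik^m.
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    # U holds the memberships computed in the last iteration.
    return centers, U

# Toy usage: six one-dimensional "pixel intensities" and two rough centers.
pixels = np.array([[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]])
centers, U = fcm(pixels, centers=[[0.3], [0.7]])
labels = U.argmax(axis=1)  # hard segmentation from the fuzzy memberships
```

Because FCM only refines the centers it is given, the quality of the starting point matters, which is the motivation the abstract gives for placing a global search in front of it.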
Improved fuzzy clustering algorithms in segmentation of DC-enhanced breast MRI.
Kannan, S R; Ramathilagam, S; Devi, Pandiyarajan; Sathya, A
2012-02-01
Segmentation of medical images is a difficult and challenging problem due to poor image contrast and artifacts that result in missing or diffuse organ/tissue boundaries. Many researchers have applied various techniques; however, fuzzy c-means (FCM) based algorithms are more effective than other methods. The objective of this work is to develop some robust fuzzy clustering segmentation systems for effective segmentation of DCE breast MRI. This paper obtains the robust fuzzy clustering algorithms by incorporating kernel methods, penalty terms, tolerance of the neighborhood attraction, an additional entropy term and fuzzy parameters. The initial centers are obtained using an initialization algorithm to reduce the computational complexity and running time of the proposed algorithms. Experimental work on breast images shows that the proposed algorithms are effective in improving the similarity measurement, in handling large amounts of noise, and in dealing with data corrupted by noise and other artifacts. The clustering results of the proposed methods are validated using the Silhouette Method. PMID:20703716
Sonar image segmentation on Fuzzy C-Mean using local texture feature
Xiufen Ye; Lei Wang; Tian Wang
2011-01-01
This paper proposes an improved Fuzzy C Mean (FCM) algorithm for scan sonar image segmentation. Taking into account the characteristics of sonar images, which are poor contrast, low resolution and strong noise, we propose to use local texture features and the original image to calculate the distance between the pixels and the cluster centers. First, we use
Francesco Di Maio; Piercesare Secchi; Simone Vantini; Enrico Zio
2011-01-01
This paper addresses the issue of the classification of accident scenarios generated in a dynamic safety and reliability analysis of a Nuclear Power Plant (NPP) equipped with a Digital Instrumentation and Control system (I&C). More specifically, the classification of the final state reached by the system at the end of an accident scenario is performed by Fuzzy C-Means clus-
K-nearest neighbors clustering algorithm
NASA Astrophysics Data System (ADS)
Gauza, Dariusz; ?ukowska, Anna; Nowak, Robert
2014-11-01
Cluster analysis, understood as an unsupervised method of assigning objects to groups solely on the basis of their measured characteristics, is a common method for analyzing DNA microarray data. Our proposal is to classify the results of the one-nearest-neighbor algorithm (1NN). The presented method copes well with complex, multidimensional data, where the number of groups is properly identified. Numerical experiments on benchmark microarray data show that the presented algorithm gives better results than k-means clustering.
The global k-means clustering algorithm
Aristidis Likas; Nikos A. Vlassis; Jakob J. Verbeek
2003-01-01
We present the global k-means algorithm which is an incremental approach to clustering that dynamically adds one cluster center at a time through a deterministic global search procedure consisting of N (with N being the size of the data set) executions of the k-means algorithm from suitable initial positions. We also propose modifications of the method to reduce the computational
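The incremental procedure described above can be sketched roughly as follows. This is a sketch under stated assumptions, not the authors' code: the abstract does not say what the "suitable initial positions" are, so the sketch assumes each of the N data points is tried in turn as the starting position of the newly added center. The helper names `kmeans` and `global_kmeans` are illustrative.

```python
import numpy as np

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations from the given centers; returns (centers, SSE)."""
    centers = centers.copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centers.copy()
        for j in range(len(centers)):
            if np.any(labels == j):  # leave empty clusters where they are
                new[j] = X[labels == j].mean(axis=0)
        if np.allclose(new, centers):
            break
        centers = new
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.min(axis=1).sum()

def global_kmeans(X, k):
    """Add one center at a time, keeping the best of N k-means runs."""
    X = np.asarray(X, dtype=float)
    centers = X.mean(axis=0, keepdims=True)  # the optimal single center
    for _ in range(1, k):
        best = None
        for x in X:  # assumption: every data point is a candidate position
            cand, err = kmeans(X, np.vstack([centers, x]))
            if best is None or err < best[1]:
                best = (cand, err)
        centers = best[0]
    return centers
```

The procedure is deterministic because no random initialization is involved; the cost is N k-means runs per added center, which is presumably what the proposed modifications aim to reduce.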
Efficient fuzzy C-means architecture for image segmentation.
Li, Hui-Ya; Hwang, Wen-Jyi; Chang, Chia-Yen
2011-01-01
This paper presents a novel VLSI architecture for image segmentation. The architecture is based on the fuzzy c-means algorithm with spatial constraint for reducing the misclassification rate. In the architecture, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to avoid the large storage requirement. In addition, an efficient pipelined circuit is used for the updating process to accelerate the computation. Experimental results show that the proposed circuit is an effective alternative for real-time image segmentation with low area cost and low misclassification rate. PMID:22163980
A local distribution based spatial clustering algorithm
NASA Astrophysics Data System (ADS)
Deng, Min; Liu, Qiliang; Li, Guangqiang; Cheng, Tao
2009-10-01
Spatial clustering is an important means for spatial data mining and spatial analysis, and it can be used to discover the potential spatial association rules and outliers among the spatial data. Most existing spatial clustering algorithms only utilize the spatial distance or local density to find the spatial clusters in a spatial database, without taking the local spatial distribution characteristics into account, so that the clustered results are unreasonable in many cases. To overcome such limitations, this paper first develops a new indicator (i.e. local median angle) to measure the local distribution, and then proposes a new algorithm, called the local distribution based spatial clustering algorithm (LDBSC for short). In the process of spatial clustering, a series of recursive searches is performed over all the entities so that entities whose local median angles are very close or equal are clustered together. In this way, all the spatial entities in the spatial database can be automatically divided into clusters. Finally, two tests are carried out to demonstrate that the proposed method outperforms DBSCAN, is robust and feasible, and can find clusters with different shapes.
Optimal Hops-Based Adaptive Clustering Algorithm
NASA Astrophysics Data System (ADS)
Xuan, Xin; Chen, Jian; Zhen, Shanshan; Kuo, Yonghong
This paper proposes an optimal hops-based adaptive clustering algorithm (OHACA). The algorithm sets an energy selection threshold before the cluster forms so that the nodes with less energy are more likely to go to sleep immediately. In the setup phase, OHACA introduces an adaptive mechanism to adjust cluster heads and balance the load. The optimal distance theory is applied to discover the practical optimal routing path that minimizes the total energy for transmission. Simulation results show that, because of energy balance, OHACA prolongs the life of the network, improves the utilization rate and transmits more data.
Hierarchical link clustering algorithm in networks
NASA Astrophysics Data System (ADS)
Bodlaj, Jernej; Batagelj, Vladimir
2015-06-01
Hierarchical network clustering is an approach to find tightly and internally connected clusters (groups or communities) of nodes in a network based on its structure. Instead of nodes, it is possible to cluster links of the network. The sets of nodes belonging to clusters of links can overlap. While overlapping clusters of nodes are not always expected, they are natural in many applications. Using appropriate dissimilarity measures, we can complement the clustering strategy to consider, for example, the semantic meaning of links or nodes based on their properties. We propose a new hierarchical link clustering algorithm which in comparison to existing algorithms considers node and/or link properties (descriptions, attributes) of the input network alongside its structure using monotonic dissimilarity measures. The algorithm determines communities that form connected subnetworks (relational constraint) containing locally similar nodes with respect to their description. It is only implicitly based on the corresponding line graph of the input network, thus reducing its space and time complexities. We investigate both complexities analytically and statistically. Using provided dissimilarity measures, our algorithm can, in addition to the general overlapping community structure of input networks, uncover also related subregions inside these communities in a form of hierarchy. We demonstrate this ability on real-world and artificial network examples.
Performance Comparison Of Evolutionary Algorithms For Image Clustering
NASA Astrophysics Data System (ADS)
Civicioglu, P.; Atasever, U. H.; Ozkan, C.; Besdok, E.; Karkinli, A. E.; Kesikoglu, A.
2014-09-01
Evolutionary computation tools are able to process real valued numerical sets in order to extract suboptimal solutions of a designed problem. Data clustering algorithms have been intensively used for image segmentation in remote sensing applications. Despite the wide usage of evolutionary algorithms in data clustering, their clustering performance has scarcely been studied using clustering validation indexes. In this paper, recently proposed evolutionary algorithms (i.e., Artificial Bee Colony Algorithm (ABC), Gravitational Search Algorithm (GSA), Cuckoo Search Algorithm (CS), Adaptive Differential Evolution Algorithm (JADE), Differential Search Algorithm (DSA) and Backtracking Search Optimization Algorithm (BSA)) and some classical image clustering techniques (i.e., k-means, FCM, SOM networks) have been used to cluster images, and their performances have been compared using four clustering validation indexes. Experimental test results showed that evolutionary algorithms give more reliable cluster centers than classical clustering techniques, but their convergence time is quite long.
Parallel Algorithms for Hierarchical Clustering and Cluster Validity
Xiaobo Li
1990-01-01
Parallel algorithms on SIMD (single-instruction stream multiple-data stream) machines for hierarchical clustering and cluster validity computation are proposed. The machine model uses a parallel memory system and an alignment network to facilitate parallel access to both pattern matrix and proximity matrix. For a problem with N patterns, the number of memory accesses is reduced from O(N^3) on a sequential
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various conditions were collected in conjunction with JSC postural control studies using a Tilt-Translation Device (TTD). The University of West Florida proposed applying the Fuzzy C-Means Clustering (FCM) Algorithms to this data with a view towards identifying various states and stages. Data supplied by NASA/JSC were submitted to the FCM algorithms in an attempt to identify and characterize cluster substructure in a mixed ensemble of pre- and post-adaptational TTD data. Following several unsuccessful trials with FCM using a full 11 dimensional data set, a set of two channels (features) was found to enable FCM to separate pre- from post-adaptational TTD data. The main conclusions are that: (1) FCM seems able to separate pre- from post-TTD subject no. 2 on the one trial that was used, but only in certain subintervals of time; and (2) Channels 2 (right rear transducer force) and 8 (hip sway bar) contain better discrimination information than other supersets and combinations of the data that were tried so far.
Agglomerative Fuzzy K-Means Clustering Algorithm with Selection of Number of Clusters
Cheung, Yiu-ming
Agglomerative Fuzzy K-Means Clustering Algorithm with Selection of Number of Clusters Mark Junjie--In this paper, we present an agglomerative fuzzy K-Means clustering algorithm for numerical data, an extension the clustering process not sensitive to the initial cluster centers. The new algorithm can produce more
CURE: An Efficient Clustering Algorithm for Large Databases
Sudipto Guha; Rajeev Rastogi; Kyuseok Shim
1998-01-01
Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes
Colour image segmentation using fuzzy clustering techniques and competitive neural network
B. Sowmya; B. Sheela Rani
2011-01-01
This paper explains the task of segmenting any given colour image using fuzzy clustering algorithms and competitive neural network. The fuzzy clustering algorithms used are Fuzzy C means algorithm, Possibilistic Fuzzy C means. Image segmentation is the process of dividing the pixels into homogeneous classes or clusters so that items in the same class are as similar as possible and
Traffic classification using clustering algorithms
Jeffrey Erman; Martin F. Arlitt; Anirban Mahanti
2006-01-01
Classification of network traffic using port-based or payload-based analysis is becoming increasingly difficult with many peer-to-peer (P2P) applications using dynamic port numbers, masquerading techniques, and encryption to avoid detection. An alternative approach is to classify traffic by exploiting the distinctive characteristics of applications when they communicate on a network. We pursue this latter approach and demonstrate how cluster analysis
An efficient k-means clustering algorithm
Khaled Alsabti; Sanjay Ranka; Vineet Singh
1997-01-01
In this paper, we present a novel algorithm for performing k-means clustering. It organizes all the patterns in a k-d tree structure such that one can find all the patterns which are closest to a given prototype efficiently. The main intuition behind our approach is as follows. All the prototypes are potential candidates for the closest prototype at the root
Fuzzy Ants and Clustering Parag M. Kanade and Lawrence O. Hall
Hall, Lawrence O.
from the ant based algorithm were better optimized than those from randomly initialized hard C Means. Keywords: Clustering, Swarm Intelligence, Ant Colony Optimization, Fuzzy C Means, Hard C Means, fuzzy optimization algorithm. Experiments were done with both deterministic algorithms which assign each example
Genetically Improved PSO Algorithm for Efficient Data Clustering
Rehab F. Abdel-Kader
2010-01-01
Clustering is an important research topic in data mining that appears in a wide range of unsupervised classification applications. Partitional clustering algorithms such as the k-means algorithm are the most popular for clustering large datasets. The major problem with the k-means algorithm is that it is sensitive to the selection of the initial partitions and it may converge to local
ON CLUSTER VALIDITY INDEXES IN FUZZY AND HARD CLUSTERING ALGORITHMS FOR IMAGE SEGMENTATION
Farag, Aly A.
ON CLUSTER VALIDITY INDEXES IN FUZZY AND HARD CLUSTERING ALGORITHMS FOR IMAGE SEGMENTATION Moumen addresses the issue of assessing the quality of the clusters found by fuzzy and hard clustering algorithms. In particular, it seeks an answer to the question on how well cluster validity indexes can automatically
A Cross Unequal Clustering Routing Algorithm for Sensor Network
NASA Astrophysics Data System (ADS)
Tong, Wang; Jiyi, Wu; He, Xu; Jinghua, Zhu; Munyabugingo, Charles
2013-08-01
In clustering routing algorithms for wireless sensor networks, the cluster size is generally fixed, which can easily lead to the "hot spot" problem. Furthermore, the majority of routing algorithms barely consider the problem of long distance communication between adjacent cluster heads, which brings high energy consumption. Therefore, this paper proposes a new cross unequal clustering routing algorithm based on the EEUC algorithm. To address the defects of the EEUC algorithm, the calculation of the competition radius takes the node's position and remaining energy into account to make the load of cluster heads more balanced. At the same time, cluster adjacent nodes are used to transport data and reduce the energy loss of cluster heads. Simulation experiments show that, compared with LEACH and EEUC, the proposed algorithm can effectively reduce the energy loss of cluster heads, balance the energy consumption among all nodes in the network, and improve the network lifetime.
Color image segmentation using fuzzy C-means and eigenspace projections
Jar-ferr Yang; Shu-sheng Hao; Pau-choo Chung
2002-01-01
In this paper, we propose two eigen-based fuzzy C-means (FCM) clustering algorithms to accurately segment the desired images, which have the same color as the pre-selected pixels. From the selected color pixels, we can first divide the color space into principal and residual eigenspaces. Combining the eigenspace transform and the FCM method, we can effectively achieve color image segmentation. The separate
The Georgi algorithms of jet clustering
NASA Astrophysics Data System (ADS)
Ge, Shao-Feng
2015-05-01
We reveal the direct link between the jet clustering algorithms recently proposed by Howard Georgi and parton shower kinematics, providing firm foundation from the theoretical side. The kinematics of this class of elegant algorithms is explored systematically for partons with arbitrary masses and the jet function is generalized to $J_\beta^{(n)}$ with a jet function index n in order to achieve more degrees of freedom. Based on three basic requirements, namely that the result of jet clustering is process-independent and hence logically consistent, that for softer subjets the inclusion cone is larger, to conform with the fact that parton shower tends to emit softer partons at earlier stage with larger opening angle, and that the cone size cannot be too large in order to avoid mixing up neighbor jets, we derive constraints on the jet function parameter $\beta$ and index n which are closely related to cone size cutoff. Finally, we discuss how jet function values can be made invariant under Lorentz boost.
A survey of fuzzy clustering algorithms for pattern recognition. II
Andrea Baraldi; Palma Blonda
1999-01-01
For pt.I see ibid., p.775-85. In part I an equivalence between the concepts of fuzzy clustering and soft competitive learning in clustering algorithms is proposed on the basis of the existing literature. Moreover, a set of functional attributes is selected for use as dictionary entries in the comparison of clustering algorithms. In this paper, five clustering algorithms taken from the
MRI brain tumor segmentation based on improved fuzzy c-means method
NASA Astrophysics Data System (ADS)
Deng, Wankai; Xiao, Wei; Pan, Chao; Liu, Jianguo
2009-10-01
This paper focuses on image segmentation, which is one of the key problems in medical image processing. A new medical image segmentation method is proposed based on the fuzzy c-means algorithm and spatial information. First, we classify the image into the region of interest and background using the fuzzy c-means algorithm. Then we use the information of the tissues' gradient and the intensity inhomogeneities of regions to improve the quality of segmentation. The sum of the mean variance in the region and the reciprocal of the mean gradient along the edge of the region is chosen as the objective function. The minimum of this sum is the optimum result. The result shows that the clustering segmentation algorithm is effective.
A New Artificial Immune System Algorithm for Clustering
Reda Younsi; Wenjia Wang
2004-01-01
This paper describes a new artificial immune system algorithm for data clustering. The proposed algorithm resembles the CLONALG, widely used AIS algorithm but much simpler as it uses one shot learning and omits cloning. The algorithm is tested using four simulated and two benchmark data sets for data clustering. Experimental results indicate it produced the correct clusters for the data
Determining the Number of Clusters/Segments in Hierarchical Clustering/Segmentation Algorithms
Chan, Philip K.
Determining the Number of Clusters/Segments in Hierarchical Clustering/Segmentation Algorithms Stan {ssalvado, pkc}@cs.fit.edu ABSTRACT Many clustering and segmentation algorithms both suffer from the limitation that the number of clusters/segments are specified by a human user. It is often impractical
Suresh Chandra Satapathy; J. V. R. Murthy; P. V. G. D. Prasada Reddy
2007-01-01
Data clustering is a process of putting similar data into groups. A clustering algorithm partitions a data set into several groups such that the similarity within a group is larger than among groups. This paper presents data clustering using improved genetic algorithm (IGA) and the popular Nelder-Mead (NM) Simplex search. To improve the accuracy of data clustering, an improved GA
ROCK: A Robust Clustering Algorithm for Categorical Attributes
Sudipto Guha; Rajeev Rastogi; Kyuseok Shim
2000-01-01
Clustering, in data mining, is useful to discover distribution patterns in the underlying data. Clustering algorithms usually employ a distance metric based (e.g., euclidean) similarity measure in order to partition the database such that data points in the same partition are more similar than points in different partitions. In this paper, we study clustering algorithms for data with boolean and
Efficiency and Stability of Clustering Algorithms for Linked Data
Isabel Drost; Tobias Scheffer
2004-01-01
We are interested in finding clusters (“communities”) in networks of linked data, such as citation networks or web pages. Hierarchical clustering for networks is reviewed and an algorithmic improvement that leads to a significant performance increase is introduced. Our main focus is on the development of partitioning clustering algorithms that can deal with data represented only by link information
A Hybrid Monkey Search Algorithm for Clustering Analysis
Chen, Xin; Zhou, Yongquan; Luo, Qifang
2014-01-01
Clustering is a popular data analysis and data mining technique. The k-means clustering algorithm is one of the most commonly used methods. However, it depends heavily on the initial solution and easily falls into a local optimum. In view of these disadvantages of the k-means method, this paper proposes a hybrid monkey algorithm based on the search operator of the artificial bee colony algorithm for clustering analysis. Experiments on synthetic and real life datasets show that the algorithm performs better than the basic monkey algorithm for clustering analysis. PMID:24772039
A Simulation Study of Data Partitioning Algorithms for Multiple Clusters
of the algorithms. I. INTRODUCTION The Single Program Multiple Data (SPMD) paradigm [3] has been used for data1 A Simulation Study of Data Partitioning Algorithms for Multiple Clusters Chen Yu, Dan C.morrison@cs.ucc.ie Abstract-- Recently we proposed algorithms for con- current execution on multiple clusters [9
Optimizing clustering algorithm in mobile ad hoc networks using genetic algorithmic approach
Damla Turgut; Sajal K. Das; Ramez Elmasri; Begumhan Turgut
2002-01-01
We show how genetic algorithms can be useful in enhancing the performance of clustering algorithms in mobile ad hoc networks. In particular, we optimize our recently proposed weighted clustering algorithm (WCA). The problem formulation along with the parameters are mapped to individual chromosomes as input to the genetic algorithmic technique. Encoding the individual chromosomes is an essential part of the
A Robust Competitive Clustering Algorithm With Applications in Computer Vision
Hichem Frigui; Raghu Krishnapuram
1999-01-01
This paper addresses three major issues associated with conventional partitional clustering, namely, sensitivity to initialization, difficulty in determining the number of clusters, and sensitivity to noise and outliers. The proposed robust competitive agglomeration (RCA) algorithm starts with a large number of clusters to reduce the sensitivity to initialization, and determines the actual number of clusters by a process of competitive
A novel clustering algorithm inspired by membrane computing.
Peng, Hong; Luo, Xiaohui; Gao, Zhisheng; Wang, Jun; Pei, Zheng
2015-01-01
P systems are a class of distributed parallel computing models; this paper presents a novel clustering algorithm, called the membrane clustering algorithm, which is inspired by the mechanism of a tissue-like P system with a loop structure of cells. The objects of the cells express the candidate centers of clusters and are evolved by the evolution rules. Based on the loop membrane structure, the communication rules realize a local neighborhood topology, which helps the coevolution of the objects and improves the diversity of objects in the system. The tissue-like P system can effectively search for the optimal partitioning with the help of its parallel computing advantage. The proposed clustering algorithm is evaluated on four artificial data sets and six real-life data sets. Experimental results show that the proposed clustering algorithm is superior to or competitive with the k-means algorithm and several evolutionary clustering algorithms recently reported in the literature. PMID:25874264
Energy-balanced WSN Uneven Clustering Routing Algorithm
LV Lin-tao; FAN Yong-lin
2009-01-01
Aiming at the problem of unbalanced energy consumption in the existing Wireless Sensor Networks (WSN) uneven clustering routing algorithm, this paper proposes an energy-balanced uneven clustering WSN routing algorithm. It constructs a middle layer in the divided non-uniform zone to achieve the energy consumption balance of cluster head and inter-cluster node, which realizes the overall energy consumption balance of WSN. Experimental
SEGMENTATION OF IMAGES WITH SEPARATING LAYERS BY FUZZY C-MEANS AND CONVEX OPTIMIZATION
Steidl, Gabriele
SEGMENTATION OF IMAGES WITH SEPARATING LAYERS BY FUZZY C-MEANS AND CONVEX OPTIMIZATION B. SHAFEI-dimensional images containing separated layers. We tackle this problem by combining the fuzzy c-means algorithm of the convex model. We compute the segment prototypes by the fuzzy c-means algorithm (FCM). Then we use
Multi-Parent Clustering Algorithms from Stochastic Grammar Data Models
NASA Technical Reports Server (NTRS)
Mjolsness, Eric; Castano, Rebecca; Gray, Alexander
1999-01-01
We introduce a statistical data model and an associated optimization-based clustering algorithm which allows data vectors to belong to zero, one or several "parent" clusters. For each data vector the algorithm makes a discrete decision among these alternatives. Thus, a recursive version of this algorithm would place data clusters in a Directed Acyclic Graph rather than a tree. We test the algorithm with synthetic data generated according to the statistical data model. We also illustrate the algorithm using real data from large-scale gene expression assays.
A Novel Image Segmentation Algorithm Based on Harmony Fuzzy Search Algorithm
O. M. Alia; Rajeswari Mandava; D. Ramachandram; M. E. Aziz
2009-01-01
Image segmentation is considered one of the crucial steps in the image analysis process and it is the most challenging task. Image segmentation can be modeled as a clustering problem. Therefore, clustering algorithms have been applied successfully in image segmentation problems. The fuzzy c-means (FCM) algorithm is considered one of the most popular clustering algorithms. Even that, FCM can generate
A Distributed Algorithm for Multispectral Image Segmentation
Claudiu Gruia; Florin Pop; Valentin Cristea
2007-01-01
This paper presents a distributed algorithm for multi-spectral image segmentation. Regions of the image are processed separately and then the results are combined. For this, the algorithm employs two types of clustering algorithms, each specialized in its task and steered toward obtaining a final meaningful segmentation. The workers use an iterative clustering algorithm which is an extension of fuzzy c-means
Tai Wai Cheng; Dmitry B. Goldgof; Lawrence O. Hall
1998-01-01
This paper presents a multistage random sampling fuzzy c-means-based clustering algorithm, which significantly reduces the computation time required to partition a data set into c classes. A series of subsets of the full data set are used to create initial cluster centers in order to provide an approximation to the final cluster centers. The quality of the final partitions is
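The idea of refining centers on a series of random subsets before touching the full data set can be sketched as below. This is only an illustration of the general scheme: the subset fractions, the random choice of starting centers, the fuzzifier `m=2.0`, and the names `fcm_centers` and `multistage_fcm` are all assumptions, since the abstract does not give the sampling schedule or stopping rules used by the authors.

```python
import numpy as np

def fcm_centers(X, centers, m=2.0, n_iter=100, tol=1e-5):
    """Standard FCM center refinement; returns only the final centers."""
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        inv = np.maximum(d2, 1e-12) ** (-1.0 / (m - 1.0))
        W = (inv / inv.sum(axis=1, keepdims=True)) ** m  # u_ik^m
        new = (W.T @ X) / W.sum(axis=0)[:, None]
        done = np.abs(new - centers).max() < tol
        centers = new
        if done:
            break
    return centers

def multistage_fcm(X, c, fractions=(0.1, 0.3), seed=0):
    """Refine centers on growing random subsets, then on the full data."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumed starting point: c distinct samples drawn at random.
    centers = X[rng.choice(len(X), size=c, replace=False)]
    for f in fractions:  # hypothetical sampling schedule
        size = min(len(X), max(c, int(f * len(X))))
        subset = X[rng.choice(len(X), size=size, replace=False)]
        centers = fcm_centers(subset, centers)
    # Final stage: the full data set, started from the approximate centers.
    return fcm_centers(X, centers)
```

Each FCM iteration costs time proportional to the number of samples, so most iterations in this scheme run on small subsets and the full data set is only needed for the last refinement.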
Hybridizing Evolutionary Algorithms and Clustering Algorithms to Find Source-Code Clones
Maletic, Jonathan I.
Hybridizing Evolutionary Algorithms and Clustering Algorithms to Find Source-Code Clones Andrew a hybrid approach to detect source-code clones that combines evolutionary algorithms and clustering. A case is effective in detecting groups of source-code clones. Categories and Subject Descriptors D.2.7 [Software
Clustering Algorithms Optimizer: A Framework for Large Datasets
Roy Varshavsky; David Horn; Michal Linial
2007-01-01
Clustering algorithms are employed in many bioinformatics tasks, including classification of protein sequences and analysis of gene-expression data. Although these algorithms are routinely applied, many of them suffer from the following limitations: (i) relying on predetermined parameters tuning, such as a-priori knowledge regarding the number of clusters; (ii) involving nondeterministic procedures that yield inconsistent outcomes. Thus, a framework that addresses
Intel Technical Report A Comparison of Three Document Clustering Algorithms: TreeCluster,
Mock, Kenrick
Intel Technical Report A Comparison of Three Document Clustering Algorithms: TreeCluster, Word of the document collection. Prior Work Hierarchical Agglomerative Clustering A common technique to cluster a collection of documents: Word-Intersection with GQF, Word-Intersection with hierarchical agglomerative
CASAN: Clustering algorithm for security in ad hoc networks
Mohamed Elhoucine Elhdhili; Lamia Ben Azzouz; Farouk Kamoun
2008-01-01
Clustering in ad hoc networks is an organization method which consists in grouping the nodes into clusters (groups) managed by nodes called clusterheads. This technique has been used for different goals such as routing efficiency, transmission management, information collection, etc. As far as we know, no existing clustering algorithms have taken into account the trust level of nodes for clusterhead election.
Evaluation of hierarchical clustering algorithms for document datasets
Ying Zhao; George Karypis
2002-01-01
Fast and high-quality document clustering algorithms play an important role in providing intuitive navigation and browsing mechanisms by organizing large amounts of information into a small number of meaningful clusters. In particular, hierarchical clustering solutions provide a view of the data at different levels of granularity, making them ideal for people to visualize and interactively explore large document collections. In this
CLICK: A Clustering Algorithm with Applications to Gene Expression Analysis
Sharan, Roded
CLICK: A Clustering Algorithm with Applications to Gene Expression Analysis Roded Sharan and Ron when the cell undergoes specific conditions or processes. Analyzing gene expression data requires the clustering of genes into groups with similar expression patterns. We have developed a novel clustering
A survey of fuzzy clustering algorithms for pattern recognition. I
Andrea Baraldi; Palma Blonda
1999-01-01
Clustering algorithms aim at modeling fuzzy (i.e., ambiguous) unlabeled patterns efficiently. Our goal is to propose a theoretical framework where the expressive power of clustering systems can be compared on the basis of a meaningful set of common functional features. Part I of this paper reviews the following issues related to clustering approaches found in the literature: relative (probabilistic) and
Incremental Clustering Algorithm For Earth Science Data Mining
Vatsavai, Raju [ORNL
2009-01-01
Remote sensing data plays a key role in understanding complex geographic phenomena. Clustering is a useful tool in discovering interesting patterns and structures within multivariate geospatial data. One of the key issues in clustering is the specification of the appropriate number of clusters, which is not obvious in many practical situations. In this paper we provide an extension of the G-means algorithm which automatically learns the number of clusters present in the data and avoids overestimation of the number of clusters. Experimental evaluation on simulated and remotely sensed image data shows the effectiveness of our algorithm.
Probabilistic analysis of the RNN-CLINK clustering algorithm
Sheau-Dong Lang; Li-Jen Mao; Wen-Lin Hsu
1999-01-01
Clustering is among the oldest techniques used in data mining applications. Typical implementations of the hierarchical agglomerative clustering methods (HACM) require an amount of O(N^2)-space, when there are N data objects, making such algorithms impractical for problems involving large datasets. The well-known clustering algorithm RNN-CLINK requires only O(N)-space, but O(N^3)-time in the worst case, although the average time appears
Harmony K-means algorithm for document clustering
Mehrdad Mahdavi; Hassan Abolhassani
2009-01-01
Fast and high quality document clustering is a crucial task in organizing information, search engine results, enhancing web crawling, and information retrieval or filtering. Recent studies have shown that the most commonly used partition-based clustering algorithm, the K-means algorithm, is more suitable for large datasets. However, the K-means algorithm can generate a local optimal solution. In this paper we propose
A practical comparison of two K-Means clustering algorithms
Wilkin, Gregory A; Huang, Xiuzhen
2008-01-01
Background Data clustering is a powerful technique for identifying data with similar characteristics, such as genes with similar expression patterns. However, not all implementations of clustering algorithms yield the same performance or the same clusters. Results In this paper, we study two implementations of a general method for data clustering: k-means clustering. Our experimentation compares the running times and distance efficiency of Lloyd's K-means Clustering and the Progressive Greedy K-means Clustering. Conclusion Based on our implementation, not just in processing time, but also in terms of mean squared-difference (MSD), Lloyd's K-means Clustering algorithm is more efficient. This analysis was performed using both a gene expression level sample and on randomly-generated datasets in three-dimensional space. However, other circumstances may dictate a different choice in some situations. PMID:18541054
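For orientation, Lloyd's iteration and the mean squared-difference measure mentioned above can be sketched as follows. This is a minimal illustration, not the implementation benchmarked in the paper; the function names and the convention of leaving an empty cluster's center unchanged are assumptions made here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda k: sq_dist(p, centers[k]))
        for p in points
    ]


def lloyd_kmeans(points, centers, max_iters=100):
    """Alternate assignment and mean-update steps until the centers stop moving."""
    centers = [list(c) for c in centers]
    for _ in range(max_iters):
        labels = assign(points, centers)
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                # Assumed convention: an empty cluster keeps its old center.
                new_centers.append(centers[k])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


def mean_squared_distance(points, centers, labels):
    """Average squared distance from each point to its assigned center."""
    return sum(sq_dist(p, centers[lab]) for p, lab in zip(points, labels)) / len(points)
```

Comparing two k-means variants on this basis amounts to running each from the same starting centers and reporting both the wall-clock time and the value returned by `mean_squared_distance`.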
Programming the K-means clustering algorithm in SQL
Carlos Ordonez
2004-01-01
Using SQL has not been considered an efficient and feasible way to implement data mining algorithms. Although this is true for many data mining, machine learning and statistical algorithms, this work shows it is feasible to get an efficient SQL implementation of the well-known K-means clustering algorithm that can work on top of a relational DBMS. The article emphasizes both
AUTOMATIC FEATURE SET SELECTION FOR MERGING IMAGE SEGMENTATION RESULTS USING FUZZY CLUSTERING
M. Ameer Ali; Laurence S Dooley; Gour C Karmakar
The image segmentation performance of clustering algorithms is highly dependent on the features used and the type of objects contained in the image, which limits the generalization ability of such algorithms. As a consequence, a fuzzy image segmentation using suppressed fuzzy c-means clustering (FSSC) algorithm was proposed that merged the initially segmented regions produced by a fuzzy clustering algorithm, using
A systematic comparison of genome-scale clustering algorithms
2012-01-01
Background A wealth of clustering algorithms has been applied to gene co-expression experiments. These algorithms cover a broad range of approaches, from conventional techniques such as k-means and hierarchical clustering, to graphical approaches such as k-clique communities, weighted gene co-expression networks (WGCNA) and paraclique. Comparison of these methods to evaluate their relative effectiveness provides guidance to algorithm selection, development and implementation. Most prior work on comparative clustering evaluation has focused on parametric methods. Graph theoretical methods are recent additions to the tool set for the global analysis and decomposition of microarray co-expression matrices that have not generally been included in earlier methodological comparisons. In the present study, a variety of parametric and graph theoretical clustering algorithms are compared using well-characterized transcriptomic data at a genome scale from Saccharomyces cerevisiae. Methods For each clustering method under study, a variety of parameters were tested. Jaccard similarity was used to measure each cluster's agreement with every GO and KEGG annotation set, and the highest Jaccard score was assigned to the cluster. Clusters were grouped into small, medium, and large bins, and the Jaccard score of the top five scoring clusters in each bin were averaged and reported as the best average top 5 (BAT5) score for the particular method. Results Clusters produced by each method were evaluated based upon the positive match to known pathways. This produces a readily interpretable ranking of the relative effectiveness of clustering on the genes. Methods were also tested to determine whether they were able to identify clusters consistent with those identified by other clustering methods. 
Conclusions Validation of clusters against known gene classifications demonstrate that for this data, graph-based techniques outperform conventional clustering approaches, suggesting that further development and application of combinatorial strategies is warranted. PMID:22759431
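The scoring step described in the Methods can be sketched as below. This is only an illustration of the Jaccard-based scoring for a single size bin; the gene and annotation identifiers in the usage note are made up, the function names are hypothetical, and the sketch does not reproduce the binning or the search over method parameters that yields the "best" average.

```python
def jaccard(a, b):
    """Jaccard similarity |A and B| / |A or B| of two collections of gene IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_annotation_score(cluster, annotation_sets):
    """Highest Jaccard score of a cluster against any annotation set."""
    return max(jaccard(cluster, genes) for genes in annotation_sets.values())


def average_top_scores(clusters, annotation_sets, top=5):
    """Average of the `top` highest per-cluster scores within one bin."""
    scores = sorted(
        (best_annotation_score(c, annotation_sets) for c in clusters),
        reverse=True,
    )
    top_scores = scores[:top]
    return sum(top_scores) / len(top_scores) if top_scores else 0.0
```

For example, with annotations `{"GO:x": {"g1", "g2", "g3"}, "KEGG:y": {"g4", "g5"}}` (hypothetical labels), the cluster `{"g1", "g2"}` scores 2/3 against the first set and 0 against the second, so it is assigned 2/3.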
DISC-Finder: A Distributed Algorithm for Identifying Galaxy Clusters Friends-of-Friends Algorithm
Fink, Eugene
cross-cube merging list of galaxies galaxy clusters We have developed a distributed versionDISC-Finder: A Distributed Algorithm for Identifying Galaxy Clusters SVD10 Friends-of-Friends Algorithm Bin Fu, Kai Ren, Julio López, Eugene Fink, Garth Gibson · Two galaxies are "friends
Scalable clustering algorithm for N-body simulations in a shared-nothing cluster
Balazinska, Magdalena
, climate change, and, as in this paper, the evolution of the universe. Improved technology -- and improvedScalable clustering algorithm for N-body simulations in a shared-nothing cluster YongChul Kwon1 simulation domain. In particular, we present a scalable, parallel algorithm for data clustering
A fuzzy clustering algorithm to detect planar and quadric shapes
NASA Technical Reports Server (NTRS)
Krishnapuram, Raghu; Frigui, Hichem; Nasraoui, Olfa
1992-01-01
In this paper, we introduce a new fuzzy clustering algorithm to detect an unknown number of planar and quadric shapes in noisy data. The proposed algorithm is computationally and implementationally simple, and it overcomes many of the drawbacks of the existing algorithms that have been proposed for similar tasks. Since the clustering is performed in the original image space, and since no features need to be computed, this approach is particularly suited for sparse data. The algorithm may also be used in pattern recognition applications.
Fuzzy clustering in parallel universes
B. Wiswedel; M. R. Berthold
2005-01-01
We propose a modified fuzzy c-means algorithm that operates on different feature spaces, so-called parallel universes, simultaneously. The method assigns membership values of patterns to different universes, which are then adopted throughout the training. This leads to better clustering results since patterns not contributing to clustering in a universe are (completely or partially) ignored. The outcome of the algorithm are
The New k-Windows Algorithm for Improving the k-Means Clustering Algorithm
Michael N. Vrahatis; Basilis Boutsinas; Panagiotis Alevizos; Georgios Pavlides
2002-01-01
The process of partitioning a large set of patterns into disjoint and homogeneous clusters is fundamental in knowledge acquisition. It is called Clustering in the literature and it is applied in various fields including data mining, statistical data analysis, compression and vector quantization. The k-means is a very popular algorithm and one of the best for implementing the clustering process.
The Ordered Clustered Travelling Salesman Problem: A Hybrid Genetic Algorithm
Ahmed, Zakir Hussain
2014-01-01
The ordered clustered travelling salesman problem is a variation of the usual travelling salesman problem in which a set of vertices (except the starting vertex) of the network is divided into some prespecified clusters. The objective is to find the least cost Hamiltonian tour in which vertices of any cluster are visited contiguously and the clusters are visited in the prespecified order. The problem is NP-hard, and it arises in practical transportation and sequencing problems. This paper develops a hybrid genetic algorithm using sequential constructive crossover, 2-opt search, and a local search for obtaining heuristic solution to the problem. The efficiency of the algorithm has been examined against two existing algorithms for some asymmetric and symmetric TSPLIB instances of various sizes. The computational results show that the proposed algorithm is very effective in terms of solution quality and computational time. Finally, we present solution to some more symmetric TSPLIB instances. PMID:24701148
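The two constraints in the problem statement (contiguous visits within a cluster, clusters in the prespecified order) can be checked with a few lines of code. The sketch below only evaluates a candidate tour; it is not the hybrid genetic algorithm of the paper. The representation is an assumption: a tour is a list of vertex indices beginning at the start vertex, and `clusters` is a list of disjoint sets given in the required visiting order.

```python
def tour_cost(tour, cost):
    """Cost of the closed tour, including the edge back to the first vertex."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))


def is_feasible(tour, start, clusters):
    """True if the tour starts at `start` and then visits each cluster as one
    contiguous block, with the blocks in the prespecified order."""
    if not tour or tour[0] != start:
        return False
    pos = 1
    for cluster in clusters:
        block = tour[pos:pos + len(cluster)]
        if set(block) != set(cluster):
            return False
        pos += len(cluster)
    # Any leftover vertices would mean the tour is not made of the clusters alone.
    return pos == len(tour)
```

A heuristic such as 2-opt or a crossover operator can call `is_feasible` to reject moves that break a cluster apart, and `tour_cost` to compare the candidates that remain.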
A NEW CLUSTERING ALGORITHM FOR SEGMENTATION OF MAGNETIC RESONANCE IMAGES
Slatton, Clint
A NEW CLUSTERING ALGORITHM FOR SEGMENTATION OF MAGNETIC RESONANCE IMAGES By ERHAN GOKCAY which improved the quality of this dissertation. I also wish to thank my wife Didem and my son Tugra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Magnetic Resonance Image Segmentation
Image segmentation using fuzzy clustering: A survey
Samina Naz; Hammad Majeed; Humayun Irshad
2010-01-01
This paper presents a survey of the latest image segmentation techniques using fuzzy clustering. Fuzzy C-Means (FCM) clustering is the most widespread clustering approach for image segmentation because of its robust characteristics for data classification. In this paper, four image segmentation algorithms using clustering, taken from the literature, are reviewed. To address the drawbacks of conventional FCM, all these approaches
Clustering and feature selection via PSO algorithm
Mehran Javani; Karim Faez; Davoud Aghlmandi
2011-01-01
Clustering is one of the popular techniques for data analysis. In this paper, we propose a new method for simultaneous clustering and feature selection through the use of multi-objective particle swarm optimization (PSO). Different features may have different importance in various contexts; some features may be irrelevant and some of them may be misleading in clustering. Therefore,
A Fast Implementation of the ISODATA Clustering Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2005-01-01
Clustering is central to many image processing and remote sensing applications. ISODATA is one of the most popular and widely used clustering methods in geoscience applications, but it can run slowly, particularly with large data sets. We present a more efficient approach to ISODATA clustering, which achieves better running times by storing the points in a kd-tree and through a modification of the way in which the algorithm estimates the dispersion of each cluster. We also present an approximate version of the algorithm which allows the user to further improve the running time, at the expense of lower fidelity in computing the nearest cluster center to each point. We provide both theoretical and empirical justification that our modified approach produces clusterings that are very similar to those produced by the standard ISODATA approach. We also provide empirical studies on both synthetic data and remotely sensed Landsat and MODIS images that show that our approach has significantly lower running times.
Modified k-means algorithm for clustering stock market companies
Parviz Rashidi; M. Analoui
In recent years, there has been a lot of interest in the database community in mining time series data, especially in finance markets. Partitioning assets into natural groups or identifying assets with similar properties are natural problems in finance. In this paper, we proposed a modified k-means clustering algorithm to cluster stock market companies, based on similarity measure between time
Multiresolution Mean Shift Clustering Algorithm for Shape Interpolation
Chen, Sheng-Wei
Multiresolution Mean Shift Clustering Algorithm for Shape Interpolation Hung-Kuo Chu and Tong that is as rigid as possible. Given input shapes with compatible connectivity, we propose a novel multiresolution configuration, multiresolution mean shift (MMS) clustering. 1 INTRODUCTION SHAPE interpolation has been
WCA: A Weighted Clustering Algorithm for Mobile Ad Hoc Networks
Mainak Chatterjee; Sajal K. Das; Damla Turgut
2002-01-01
Abstract: In this paper, we propose an on-demand distributed clustering algorithm for multi-hop packet radio networks. These types of networks, also known as ad hoc networks, are dynamic in nature due to the mobility of nodes. The association and dissociation of nodes to and from clusters perturb the stability of the network topology, and hence a reconfiguration of the system is often
A Nonparametric Algorithm for Detecting Clusters Using Hierarchical Structure
Riichiro Mizoguchi; Masamichi Shimura
1980-01-01
The present paper discusses a nonparametric algorithm for detecting clusters. In the algorithm a positive value called potential is associated with each datum based on dissimilarities. By defining subordination relations among data, hierarchical structure is introduced into the data set. As a result of the introduction of hierarchical structure, the data set is divided into some subsets called subclusters. A
Sodar image segmentation by fuzzy c-means
D. P. Mukherjee; P. Pal; J. Das
1996-01-01
Sodar facsimile records provide meteorological information of the atmospheric boundary layer (ABL). Segmentation of sodar image (digitised sodar record) is vital for subsequent interpretation of ABL structure. The problem of segmentation is presented here as classification problem. Fuzzy c-means classification algorithm has been adopted for segmentation purpose using a set of four feature vectors calculated over a mask such that
A local search approximation algorithm for k-means clustering
Tapas Kanungo; David M. Mount; Nathan S. Netanyahu; Christine D. Piatko; Ruth Silverman; Angela Y. Wu
2004-01-01
In k-means clustering we are given a set of n data points in d-dimensional space and an integer k, and the problem is to determine a set of k points in that space, called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically efficient approximation algorithms exist, these algorithms are not practical due to the very high constant factors involved.
Efficient Cluster Algorithm for CP(N-1) Models
B. B Beard; M. Pepe; S. Riederer; U. -J. Wiese
2006-02-14
Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard Wilson formulation of lattice field theory. In fact, there is a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. In this paper, we construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a regularization for CP(N-1) models in the framework of D-theory. We present detailed studies of the autocorrelations and find a dynamical critical exponent that is consistent with z = 0.
A geometric clustering algorithm with applications to structural data.
Xu, Shutan; Zou, Shuxue; Wang, Lincong
2015-05-01
An important feature of structural data, especially those from structural determination and protein-ligand docking programs, is that their distribution could be mostly uniform. Traditional clustering algorithms developed specifically for nonuniformly distributed data may not be adequate for their classification. Here we present a geometric partitional algorithm that could be applied to both uniformly and nonuniformly distributed data. The algorithm is a top-down approach that recursively selects the outliers as the seeds to form new clusters until all the structures within a cluster satisfy a classification criterion. The algorithm has been evaluated on a diverse set of real structural data and six sets of test data. The results show that it is superior to the previous algorithms for the clustering of structural data and is similar to or better than them for the classification of the test data. The algorithm should be especially useful for the identification of the best but minor clusters and for speeding up an iterative process widely used in NMR structure determination. PMID:25517067
A decentralized fuzzy C-means-based energy-efficient routing protocol for wireless sensor networks.
Alia, Osama Moh'd
2014-01-01
Energy conservation in wireless sensor networks (WSNs) is a vital consideration when designing wireless networking protocols. In this paper, we propose a Decentralized Fuzzy Clustering Protocol, named DCFP, which minimizes total network energy dissipation to promote maximum network lifetime. The process of constructing the infrastructure for a given WSN is performed only once at the beginning of the protocol at a base station, which remains unchanged throughout the network's lifetime. In this initial construction step, a fuzzy C-means algorithm is adopted to allocate sensor nodes into their most appropriate clusters. Subsequently, the protocol runs its rounds where each round is divided into a CH-Election phase and a Data Transmission phase. In the CH-Election phase, the election of new cluster heads is done locally in each cluster where a new multicriteria objective function is proposed to enhance the quality of elected cluster heads. In the Data Transmission phase, the sensing and data transmission from each sensor node to their respective cluster head is performed and cluster heads in turn aggregate and send the sensed data to the base station. Simulation results demonstrate that the proposed protocol improves network lifetime, data delivery, and energy consumption compared to other well-known energy-efficient protocols. PMID:25162060
K-Distributions: A New Algorithm for Clustering Categorical Data
NASA Astrophysics Data System (ADS)
Cai, Zhihua; Wang, Dianhong; Jiang, Liangxiao
Clustering is one of the most important tasks in data mining. The K-means algorithm is the most popular one for achieving this task because of its efficiency. However, it works only on numeric values, although data sets in data mining often contain categorical values. In response, the K-modes algorithm was presented to extend the K-means algorithm to categorical domains. Unfortunately, it suffers from computing the dissimilarity between each pair of objects and the mode of each cluster. To address these problems confronting K-modes, we present a new algorithm called K-distributions in this paper. We experimentally tested K-distributions using the well-known 36 UCI data sets selected by Weka, and compared it to K-modes. The experimental results show that K-distributions significantly outperforms K-modes in terms of clustering accuracy and log likelihood.
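As background for the comparison, the two building blocks of K-modes mentioned above (object-to-mode dissimilarity and the per-cluster mode) can be sketched as follows. This illustrates the K-modes baseline only, using the common simple-matching dissimilarity as an assumption; it is not the K-distributions algorithm, whose details the abstract does not give. The function names are hypothetical.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Most frequent category per attribute (ties go to the value seen first)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*records)]


def assign_to_modes(records, modes):
    """Index of the least dissimilar mode for every record (first mode wins ties)."""
    return [
        min(range(len(modes)), key=lambda k: matching_dissimilarity(r, modes[k]))
        for r in records
    ]
```

A K-modes iteration alternates `assign_to_modes` with recomputing `cluster_mode` for each cluster, in the same way K-means alternates nearest-center assignment with mean updates.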
Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering.
He, Zhaoshui; Xie, Shengli; Zdunek, Rafal; Zhou, Guoxu; Cichocki, Andrzej
2011-12-01
Nonnegative matrix factorization (NMF) is an unsupervised learning method useful in various applications including image processing and semantic analysis of documents. This paper focuses on symmetric NMF (SNMF), which is a special case of NMF decomposition. Three parallel multiplicative update algorithms using level 3 basic linear algebra subprograms directly are developed for this problem. First, by minimizing the Euclidean distance, a multiplicative update algorithm is proposed, and its convergence under mild conditions is proved. Based on it, we further propose another two fast parallel methods: α-SNMF and β-SNMF algorithms. All of them are easy to implement. These algorithms are applied to probabilistic clustering. We demonstrate their effectiveness for facial image clustering, document categorization, and pattern clustering in gene expression. PMID:22042156
Probabilistic analysis of the RNN-CLINK clustering algorithm
NASA Astrophysics Data System (ADS)
Lang, Sheau-Dong; Mao, Li-Jen; Hsu, Wen-Lin
1999-02-01
Clustering is among the oldest techniques used in data mining applications. Typical implementations of the hierarchical agglomerative clustering methods (HACM) require an amount of O(N^2)-space, when there are N data objects, making such algorithms impractical for problems involving large datasets. The well-known clustering algorithm RNN-CLINK requires only O(N)-space, but O(N^3)-time in the worst case, although the average time appears to be O(N^2 log N). We provide a probabilistic interpretation of the average time complexity of the algorithm. We also report experimental results, using randomly generated bit vectors and using NETNEWS articles as the input, to support our theoretical analysis.
Jiang, Peng; Li, Shengqiang
2010-01-01
A kind of data compression algorithm for sensor networks based on suboptimal clustering and virtual landmark routing within clusters is proposed in this paper. Firstly, temporal redundancy existing in data obtained by the same node in sequential instants can be eliminated. Then sensor networks nodes will be clustered. Virtual node landmarks in clusters can be established based on cluster heads. Routing in clusters can be realized by combining a greedy algorithm and a flooding algorithm. Thirdly, a global structure tree based on cluster heads will be established. During the course of data transmissions from nodes to cluster heads and from cluster heads to sink, the spatial redundancy existing in the data will be eliminated. Only part of the raw data needs to be transmitted from nodes to sink, and all raw data can be recovered in the sink based on a compression code and part of the raw data. Consequently, node energy can be saved, largely because transmission of redundant data can be avoided. As a result the overall performance of the sensor network can obviously be improved. PMID:22163396
Development of clustering algorithms for Compressed Baryonic Matter experiment
NASA Astrophysics Data System (ADS)
Kozlov, G. E.; Ivanov, V. V.; Lebedev, A. A.; Vassiliev, Yu. O.
2015-05-01
A clustering problem for the coordinate detectors in the Compressed Baryonic Matter (CBM) experiment is discussed. Because of the high interaction rate and huge datasets to be dealt with, clustering algorithms are required to be fast and efficient and capable of processing events with high track multiplicity. At present there are two different approaches to the problem. In the first one each fired pad bears information about its charge, while in the second one a pad can or cannot be fired, thus rendering the separation of overlapping clusters a difficult task. To deal with the latter, two different clustering algorithms were developed, integrated into the CBMROOT software environment, and tested with various types of simulated events. Both of them are found to be highly efficient and accurate.
A Mountain Clustering Based on Improved PSO Algorithm
Hong-yuan Shen; Xiao-qi Peng; Jun-nian Wang; Zhi-kun Hu
2005-01-01
In order to find most centre of the density of the sample set this paper combines MCA and PSO, and presents a mountain clustering based on improved PSO (MCBIPSO) algorithm. A mountain clustering method constructs a mountain function according to the density of the sample, but it is not easy to find all peaks of the mountain function. The improved
A multiresolution image segmentation technique based on pyramidal segmentation and fuzzy clustering
Mahmoud Ramze Rezaee; Pieter M. J. Van Der Zwet; Boudewijn P. F. Lelieveldt; Rob J. Van Der Geest; Johan H. C. Reiber
2000-01-01
In this paper, an unsupervised image segmentation technique is presented, which combines pyramidal image segmentation with the fuzzy c-means clustering algorithm. Each layer of the pyramid is split into a number of regions by a root labeling technique, and then fuzzy c-means is used to merge the regions of the layer with the highest image resolution. A cluster validity functional
ACO-Based Projection Pursuit: A Novel Clustering Algorithm
NASA Astrophysics Data System (ADS)
Li, Yancang; Zhao, Lina; Zhou, Shujing
In order to find an effective method for solving the problem of subjectivity and difficulty in high-dimensional data clustering, a new method, an improved Projection Pursuit (PP) based on the Ant Colony Optimization algorithm, is introduced. The ant colony optimization algorithm is employed to optimize the function of the projected indexes in the PP. The ant colony optimization algorithm has a strong global optimization ability, and the PP method is a powerful technique for extracting statistically significant features from high-dimensional data for automatic target detection and classification. Application results show that the method completes the selection more objectively and rationally, with objective weights, high resolving power, and stable results. The study provides a novel algorithm for high-dimensional data clustering.
A survey: hybrid evolutionary algorithms for cluster analysis
Mohamed Jafar Abul Hasan; Sivakumar Ramakrishnan
Clustering is a popular data analysis and data mining technique. It is the unsupervised classification of patterns into groups. Many algorithms for large data sets have been proposed in the literature using different techniques. However, conventional algorithms have some shortcomings such as slowness of the convergence, sensitive to initial value and preset classed in large scale data set etc. and
Clustering Algorithm for Network Constraint Trajectories
Zeitouni, Karine
- rithms have been proposed among which K-Means (Lloyd, 1981), BIRCH (Zhang et al., 1996), DBSCAN (Ester et al., 1996) and OPTICS (Ankerst et al., 1999). Recent researches on trajectory clustering uses these algorithms while adapting them to the studied domain (Lee et al., 2007), since trajectories
Content-based School Assignment Cluster Algorithm
Juan Feng; Jie Zhao; Guohua Zhan
2008-01-01
Based on the clustering technology in data mining, we aimed to establish a new schoolwork identifying mechanism. So that the normal answer adapts better to actual situations, we first generalized the normal answer, and then calculated the similarity between every sample and the normal answer, as well as the degree of similarity between school works. Based on the similarity, we
The C4 clustering algorithm: Clusters of galaxies in the Sloan Digital Sky Survey
Miller, Christopher J.; Nichol, Robert; Reichart, Dan; Wechsler, Risa H.; Evrard, August; Annis, James; McKay, Timothy; Bahcall, Neta; Bernardi, Mariangela; Boehringer,; Connolly, Andrew; Goto, Tomo; Kniazev, Alexie; Lamb, Donald; Postman, Marc; Schneider, Donald; Sheth, Ravi; Voges, Wolfgang; /Cerro-Tololo InterAmerican Obs. /Portsmouth U.,
2005-03-01
We present the "C4 Cluster Catalog", a new sample of 748 clusters of galaxies identified in the spectroscopic sample of the Second Data Release (DR2) of the Sloan Digital Sky Survey (SDSS). The C4 cluster-finding algorithm identifies clusters as overdensities in a seven-dimensional position and color space, thus minimizing projection effects that have plagued previous optical cluster selection. The present C4 catalog covers $\approx 2600$ square degrees of sky and ranges in redshift from $z = 0.02$ to $z = 0.17$. The mean cluster membership is 36 galaxies (with redshifts) brighter than $r = 17.7$, but the catalog includes a range of systems, from groups containing 10 members to massive clusters with over 200 cluster members with redshifts. The catalog provides a large number of measured cluster properties including sky location, mean redshift, galaxy membership, summed r-band optical luminosity ($L_r$), velocity dispersion, as well as quantitative measures of substructure and the surrounding large-scale environment. We use new, multi-color mock SDSS galaxy catalogs, empirically constructed from the $\Lambda$CDM Hubble Volume (HV) Sky Survey output, to investigate the sensitivity of the C4 catalog to the various algorithm parameters (detection threshold, choice of passbands and search aperture), as well as to quantify the purity and completeness of the C4 cluster catalog. These mock catalogs indicate that the C4 catalog is $\simeq 90\%$ complete and 95% pure above $M_{200} = 1 \times 10^{14}\,h^{-1} M_\odot$ and within $0.03 \le z \le 0.12$. Using the SDSS DR2 data, we show that the C4 algorithm finds 98% of X-ray identified clusters and 90% of Abell clusters within $0.03 \le z \le 0.12$. Using the mock galaxy catalogs and the full HV dark matter simulations, we show that the $L_r$ of a cluster is a more robust estimator of the halo mass ($M_{200}$) than the galaxy line-of-sight velocity dispersion or the richness of the cluster. However, if we exclude clusters embedded in complex large-scale environments, we find that the velocity dispersion of the remaining clusters is as good an estimator of $M_{200}$ as $L_r$. The final C4 catalog will contain $\simeq 2500$ clusters using the full SDSS data set and will represent one of the largest and most homogeneous samples of local clusters.
Efficient Algorithms for Sampling and Clustering of Large Nonuniform Networks
Pekka Orponen; Satu Elisa Schaeffer
2004-01-01
We propose efficient algorithms for two key tasks in the analysis of large nonuniform networks: uniform node sampling and cluster detection. Our sampling technique is based on augmenting a simple, but slowly mixing uniform MCMC sampler with a regular random walk in order to speed up its convergence; however the combined MCMC chain is then only sampled when it is
A Cycle-based Clustering Algorithm for Wireless Sensor Networks
Hassaan Khaliq Qureshi; M. Rajarajan; Veselin Rakocevic; A. Iqbal; M. Saleem; S. A. Khayam
2009-01-01
In energy constrained wireless sensor networks, it is important that a routing protocol provides network redundancy and reliability at a minimum energy consumption cost. To satisfy these conflicting constraints, wireless sensor networks (WSN) routing protocols generally employ a clustering algorithm in which the entire network does not have to be active at all times. In this paper, we propose a
Clustering via Similarity Functions: Theoretical Foundations and Algorithms
Blum, Avrim
biology, can be used to cluster well, and we design new efficient algorithms that are able to take fellowship, and by an IBM Graduate Fellowship. School of Computer Science, Georgia Institute of Technology from computational biology to computer vision to information retrieval. It has many variants
SpaRef: a clustering algorithm for multispectral images
Thanh N. Tran; Ron Wehrens; Lutgarde M. C. Buydens
2003-01-01
Multispectral images such as multispectral chemical images or multispectral satellite images provide detailed data with information in both the spatial and spectral domains. Many segmentation methods for multispectral images are based on a per-pixel classification, which uses only spectral information and ignores spatial information. A clustering algorithm based on both spectral and spatial information would produce better results. In this work,
A Clustering Genetic Algorithm for Actuator Optimization in Flow Control
Michele Milano; Petros Koumoutsakos
2000-01-01
Active flow control can provide a leap in the performance of engineering configurations. Although a number of sensor and actuator configurations have been proposed, the task of identifying optimal parameters for control devices is based on engineering intuition usually gathered from uncontrolled flow experiments. Here we propose a clustering genetic algorithm that adaptively identifies critical points in the
A Convergence Theorem for the Fuzzy ISODATA Clustering Algorithms
James C. Bezdek
1980-01-01
In this paper the convergence of a class of clustering procedures, popularly known as the fuzzy ISODATA algorithms, is established. The theory of Zangwill is used to prove that arbitrary sequences generated by these (Picard iteration) procedures always terminates at a local minimum, or at worst, always contains a subsequence which converges to a local minimum of the generalized least
An Approximation Algorithm for Broadcast Scheduling in Heterogeneous Clusters
Liu, Pangfeng
the fact that FNF heuristic does not guarantee an optimal broadcast time for general heterogeneous networkAn Approximation Algorithm for Broadcast Scheduling in Heterogeneous Clusters Pangfeng Liu1 , Da Science and Information Engineering, National Chung Cheng University, Chiayi, Taiwan. Abstract. Network
Scalable Clustering Algorithms with Balancing Constraints Arindam Banerjee
Banerjee, Arindam
of the stable marriage problem, whereas the refinement algorithm is a constrained iterative relocation scheme;Another method with a similar flavor [38] compresses the data objects into many small subclusters using modified index trees and performs clustering with these subclusters. A different approach is to subsample
Partitioning clustering algorithms for protein sequence data sets
Fayech, Sondes; Essoussi, Nadia; Limam, Mohamed
2009-01-01
Background Genome-sequencing projects are currently producing an enormous amount of new sequences and cause the rapid increasing of protein sequence databases. The unsupervised classification of these data into functional groups or families, clustering, has become one of the principal research objectives in structural and functional genomics. Computer programs to automatically and accurately classify sequences into families become a necessity. A significant number of methods have addressed the clustering of protein sequences and most of them can be categorized in three major groups: hierarchical, graph-based and partitioning methods. Among the various sequence clustering methods in literature, hierarchical and graph-based approaches have been widely used. Although partitioning clustering techniques are extremely used in other fields, few applications have been found in the field of protein sequence clustering. It is not fully demonstrated if partitioning methods can be applied to protein sequence data and if these methods can be efficient compared to the published clustering methods. Methods We developed four partitioning clustering approaches using Smith-Waterman local-alignment algorithm to determine pair-wise similarities of sequences. Four different sets of protein sequences were used as evaluation data sets for the proposed methods. Results We show that these methods outperform several other published clustering methods in terms of correctly predicting a classifier and especially in terms of the correctness of the provided prediction. The software is available to academic users from the authors upon request. PMID:19341454
Coupled cluster algorithms for networks of shared memory parallel processors
NASA Astrophysics Data System (ADS)
Bentz, Jonathan L.; Olson, Ryan M.; Gordon, Mark S.; Schmidt, Michael W.; Kendall, Ricky A.
2007-05-01
As the popularity of using SMP systems as the building blocks for high performance supercomputers increases, so too increases the need for applications that can utilize the multiple levels of parallelism available in clusters of SMPs. This paper presents a dual-layer distributed algorithm, using both shared-memory and distributed-memory techniques to parallelize a very important algorithm (often called the "gold standard") used in computational chemistry, the single and double excitation coupled cluster method with perturbative triples, i.e. CCSD(T). The algorithm is presented within the framework of the GAMESS [M.W. Schmidt, K.K. Baldridge, J.A. Boatz, S.T. Elbert, M.S. Gordon, J.J. Jensen, S. Koseki, N. Matsunaga, K.A. Nguyen, S. Su, T.L. Windus, M. Dupuis, J.A. Montgomery, General atomic and molecular electronic structure system, J. Comput. Chem. 14 (1993) 1347-1363]. (General Atomic and Molecular Electronic Structure System) program suite and the Distributed Data Interface [M.W. Schmidt, G.D. Fletcher, B.M. Bode, M.S. Gordon, The distributed data interface in GAMESS, Comput. Phys. Comm. 128 (2000) 190]. (DDI), however, the essential features of the algorithm (data distribution, load-balancing and communication overhead) can be applied to more general computational problems. Timing and performance data for our dual-level algorithm is presented on several large-scale clusters of SMPs.
An Efficient Cluster Algorithm for CP(N-1) Models
B. B. Beard; M. Pepe; S. Riederer; U. -J. Wiese
2005-10-06
We construct an efficient cluster algorithm for ferromagnetic SU(N)-symmetric quantum spin systems. Such systems provide a new regularization for CP(N-1) models in the framework of D-theory, which is an alternative non-perturbative approach to quantum field theory formulated in terms of discrete quantum variables instead of classical fields. Despite several attempts, no efficient cluster algorithm has been constructed for CP(N-1) models in the standard formulation of lattice field theory. In fact, there is even a no-go theorem that prevents the construction of an efficient Wolff-type embedding algorithm. We present various simulations for different correlation lengths, couplings and lattice sizes. We have simulated correlation lengths up to 250 lattice spacings on lattices as large as 640x640 and we detect no evidence for critical slowing down.
Fast source optimization by clustering algorithm based on lithography properties
NASA Astrophysics Data System (ADS)
Tawada, Masashi; Hashimoto, Takaki; Sakanushi, Keishi; Nojima, Shigeki; Kotani, Toshiya; Yanagisawa, Masao; Togawa, Nozomu
2015-03-01
Lithography is a technology for making circuit patterns on a wafer. UV light diffracted by a photomask forms optical images on a photoresist. The photoresist is then melted where the amount of exposed UV light exceeds the threshold. The UV light diffracted by a photomask through a lens exposes the photoresist on the wafer. Its lightness and darkness generate patterns on the photoresist. As the technology node advances, the feature sizes on the photoresist become much smaller. Diffracted UV light is dispersed on the wafer, and exposing photoresists has therefore become more difficult. Exposure source optimization (SO) techniques for optimizing the illumination shape have been studied. Although an exposure source has hundreds of grid-points, all previous works deal with them one by one. They therefore consume too much running time, which greatly increases design time. Reducing the number of parameters to be optimized in SO is the key to decreasing source optimization time. In this paper, we propose a variation-resilient and high-speed cluster-based exposure source optimization algorithm. We focus on image log slope (ILS) and use it for generating clusters. When an optical image formed by a source shape has a small ILS value at an EPE (edge placement error) evaluation point, dose/focus variation strongly affects the EPE values. When an optical image formed by a source shape has a large ILS value at an evaluation point, dose/focus variation affects the EPE value less. In our algorithm, we cluster several grid-points with similar ILS values and reduce the number of parameters to be simultaneously optimized in SO. Our clustering algorithm is composed of two STEPs: In STEP 1, we cluster grid-points into four groups based on ILS values of grid-points at each evaluation point. In STEP 2, we generate super clusters from the clusters generated in STEP 1. We consider a set of grid-points in each cluster to be a single light source element. As a result, we can optimize the SO problem very fast.
Experimental results demonstrate that our algorithm runs speed-up compared to a conventional algorithm with keeping the EPE values.
FUZZY CLUSTERING ALGORITHMS ON LANDSAT IMAGES FOR DETECTION OF WASTE AREAS: A COMPARISON
Masulli, Francesco
-map. Keywords: Fuzzy clustering algorithms, Landsat images segmentation, detection of waste. 1 IntroductionFUZZY CLUSTERING ALGORITHMS ON LANDSAT IMAGES FOR DETECTION OF WASTE AREAS: A COMPARISON A. In this paper we will present a comparison of fuzzy clustering algorithms for the segmentation of multi
Median graph shift: A new clustering algorithm for graph domain Salim Jouili, Salvatore Tabbone
Paris-Sud XI, Université de
Median graph shift: A new clustering algorithm for graph domain Salim Jouili, Salvatore Tabbone, a new algorithm for the domain of graphs is introduced. In this paper, the key idea is to adapt the mean-shift clustering and its variants proposed for the domain of feature vectors to graph clustering. These algorithms
An evolutionary technique based on K-Means algorithm for optimal clustering in $\mathbb{R}^N$
Sanghamitra Bandyopadhyay; Ujjwal Maulik
2002-01-01
A genetic algorithm-based efficient clustering technique that utilizes the principles of K-Means algorithm is described in this paper. The algorithm called KGA-clustering, while exploiting the searching capability of K-Means, avoids its major limitation of getting stuck at locally optimal values. Its superiority over the K-Means algorithm and another genetic algorithm-based clustering method, is extensively demonstrated for several artificial and real
Color image segmentation using histogram thresholding - Fuzzy C-means hybrid approach
Khang Siang Tan; Nor Ashidi Mat Isa
2011-01-01
This paper presents a novel histogram thresholding – fuzzy C-means hybrid (HTFCM) approach that could find different application in pattern recognition as well as in computer vision, particularly in color image segmentation. The proposed approach applies the histogram thresholding technique to obtain all possible uniform regions in the color image. Then, the Fuzzy C-means (FCM) algorithm is utilized to improve
Improved Gravitation Field Algorithm and Its Application in Hierarchical Clustering
Zheng, Ming; Sun, Ying; Liu, Gui-xia; Zhou, You; Zhou, Chun-guang
2012-01-01
Background Gravitation field algorithm (GFA) is a new optimization algorithm which is based on an imitation of natural phenomena. GFA can do well both for searching global minimum and multi-minima in computational biology. But GFA needs to be improved for increasing efficiency, and modified for applying to some discrete data problems in system biology. Method An improved GFA called IGFA was proposed in this paper. Two parts were improved in IGFA. The first one is the rule of random division, which is a reasonable strategy and makes running time shorter. The other one is rotation factor, which can improve the accuracy of IGFA. And to apply IGFA to the hierarchical clustering, the initial part and the movement operator were modified. Results Two kinds of experiments were used to test IGFA. And IGFA was applied to hierarchical clustering. The global minimum experiment was used with IGFA, GFA, GA (genetic algorithm) and SA (simulated annealing). Multi-minima experiment was used with IGFA and GFA. The two experiments results were compared with each other and proved the efficiency of IGFA. IGFA is better than GFA both in accuracy and running time. For the hierarchical clustering, IGFA is used to optimize the smallest distance of genes pairs, and the results were compared with GA and SA, singular-linkage clustering, UPGMA. The efficiency of IGFA is proved. PMID:23173043
A comparison of clustering algorithms in article recommendation system
NASA Astrophysics Data System (ADS)
Tantanasiriwong, Supaporn
2011-12-01
A recommendation system is a tool that can recommend to researchers resources suited to their research interests by using content-based filtering. In this paper, clustering, as an unsupervised learning method, is introduced for grouping objects based on their selected features and similarities. Publication information from the Science Citation Index (SCI) is used as the dataset for clustering, with feature extraction serving as dimensionality reduction of these articles, by comparing Latent Dirichlet Allocation (LDA), Principal Component Analysis (PCA), and K-Means to determine the best algorithm. In the experiment, the selected database consists of 2625 documents extracted from the SCI corpus from 2001 to 2009. Clustering into 50, 100, 200, and 250 ranks is considered, and the F-measure is used to evaluate the three algorithms. The results show that the LDA technique gives an accuracy of up to 95.5%, the highest among the clustering techniques compared.
A comparison of clustering algorithms in article recommendation system
NASA Astrophysics Data System (ADS)
Tantanasiriwong, Supaporn
2012-01-01
A recommendation system is a tool that can recommend to researchers resources suited to their research interests by using content-based filtering. In this paper, clustering, as an unsupervised learning method, is introduced for grouping objects based on their selected features and similarities. Publication information from the Science Citation Index (SCI) is used as the dataset for clustering, with feature extraction serving as dimensionality reduction of these articles, by comparing Latent Dirichlet Allocation (LDA), Principal Component Analysis (PCA), and K-Means to determine the best algorithm. In the experiment, the selected database consists of 2625 documents extracted from the SCI corpus from 2001 to 2009. Clustering into 50, 100, 200, and 250 ranks is considered, and the F-measure is used to evaluate the three algorithms. The results show that the LDA technique gives an accuracy of up to 95.5%, the highest among the clustering techniques compared.
A class of constrained clustering algorithms for object boundary extraction.
Abrantes, A J; Marques, J S
1996-01-01
Boundary extraction is a key task in many image analysis operations. This paper describes a class of constrained clustering algorithms for object boundary extraction that includes several well-known algorithms proposed in different fields (deformable models, constrained clustering, data ordering, and traveling salesman problems). The algorithms belonging to this class are obtained by the minimization of a cost function with two terms: a quadratic regularization term and an image-dependent term defined by a set of weighting functions. The minimization of the cost function is achieved by lowpass filtering the previous model shape and by attracting the model units toward the centroids of their attraction regions. To define a new algorithm belonging to this class, the user has to specify a regularization matrix and a set of weighting functions that control the attraction of the model units toward the data. The usefulness of this approach is twofold: it provides a unified framework for many existing algorithms in pattern recognition and deformable models, and allows the design of new recursive schemes. PMID:18290068
Vetoed jet clustering: the mass-jump algorithm
NASA Astrophysics Data System (ADS)
Stoll, Martin
2015-04-01
A new class of jet clustering algorithms is introduced. A criterion inspired by successful mass-drop taggers is applied that prevents the recombination of two hard prongs if their combined jet mass is substantially larger than the masses of the separate prongs. This "mass jump" veto effectively results in jets with variable radii in dense environments. Differences to existing methods are investigated. It is shown for boosted top quarks that the new algorithm has beneficial properties which can lead to improved tagging purity.
Centre based chromosomal representation of genetic algorithms to cluster new student
Zainudin Zukhri; Khairuddin Omar
2008-01-01
This paper explains the application of Genetic Algorithms (GA) to clustering new students into several classes. This is essential, since good educational service for a large number of students requires clustering them. No existing clustering method can handle this because of the capacity of each class. The supervised clustering method can only cluster students where
On an ensemble algorithm for clustering cancer patient data
2013-01-01
Background The TNM staging system is based on three anatomic prognostic factors: Tumor, Lymph Node and Metastasis. However, cancer is no longer considered an anatomic disease. Therefore, the TNM should be expanded to accommodate new prognostic factors in order to increase the accuracy of estimating cancer patient outcome. The ensemble algorithm for clustering cancer data (EACCD) by Chen et al. reflects an effort to expand the TNM without changing its basic definitions. Though results on using EACCD have been reported, there has been no study on the analysis of the algorithm. In this report, we examine various aspects of EACCD using a large breast cancer patient dataset. We compared the output of EACCD with the corresponding survival curves, investigated the effect of different settings in EACCD, and compared EACCD with alternative clustering approaches. Results Using the basic T and N definitions, EACCD generated a dendrogram that shows a graphic relationship among the survival curves of the breast cancer patients. The dendrograms from EACCD are robust for large values of m (the number of runs in the learning step). When m is large, the dendrograms depend on the linkage functions. The statistical tests, however, employed in the learning step have minimal effect on the dendrogram for large m. In addition, if omitting the step for learning dissimilarity in EACCD, the resulting approaches can have a degraded performance. Furthermore, clustering only based on prognostic factors could generate misleading dendrograms, and direct use of partitioning techniques could lead to misleading assignments to clusters. Conclusions When only the Partitioning Around Medoids (PAM) algorithm is involved in the step of learning dissimilarity, large values of m are required to obtain robust dendrograms, and for a large m EACCD can effectively cluster cancer patient data. PMID:24565417
A local search approximation algorithm for k-means clustering
Tapas Kanungo; David M. Mount; Nathan S. Netanyahu; Christine D. Piatko; Ruth Silverman; Angela Y. Wu
2002-01-01
In k-means clustering we are given a set of n data points in d-dimensional space $\mathbb{R}^d$ and an integer k, and the problem is to determine a set of k points in $\mathbb{R}^d$, called centers, to minimize the mean squared distance from each data point to its nearest center. No exact polynomial-time algorithms are known for this problem. Although asymptotically
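As an illustration of the objective described in this abstract, the following is a minimal sketch of the quantity being minimized: the mean squared distance from each data point to its nearest center. The function names `squared_distance` and `kmeans_cost` and the sample data are assumptions made for this example; the sketch only evaluates the cost for given centers and is not the local search approximation algorithm of the paper.

```python
from typing import Sequence

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_cost(points: Sequence[Point], centers: Sequence[Point]) -> float:
    """Mean squared distance from each point to its nearest center.

    Hypothetical helper: it evaluates the k-means objective for a fixed
    set of centers; it does not search for good centers.
    """
    if not points or not centers:
        raise ValueError("points and centers must both be non-empty")
    total = sum(min(squared_distance(p, c) for c in centers) for p in points)
    return total / len(points)


# Illustrative data: two well-separated pairs of points in the plane.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
two_centers = [(0.0, 1.0), (10.0, 1.0)]
one_center = [(5.0, 1.0)]

print(kmeans_cost(points, two_centers))
print(kmeans_cost(points, one_center))
```

Comparing the two calls shows how the cost drops when each group of points gets its own center, which is the behavior any k-means heuristic tries to exploit.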
Image Segmentation Using Ant System-Based Clustering Algorithm
Aleksandar Jevtić; Joel Quintanilla-Domínguez; José Barrón-Adame; Diego Andina
Industrial applications of computer vision sometimes require detection of atypical objects that occur as small groups of pixels in digital images. These objects are difficult to single out because they are small and randomly distributed. In this work we propose an image segmentation method using the novel Ant System-based Clustering Algorithm (ASCA). ASCA models the foraging behaviour of ants, which
An improved fuzzy clustering approach for image segmentation
Ivana Despotovic; Bart Goossens; Ewout Vansteenkiste; Wilfried Philips
2010-01-01
Fuzzy clustering techniques have been widely used in automated image segmentation. However, since the standard fuzzy c-means (FCM) clustering algorithm does not consider any spatial information, it is highly sensitive to noise. In this paper, we present an extension of the FCM algorithm to overcome this drawback, by incorporating spatial neighborhood information into a new similarity measure. We consider that
Color image segmentation using fuzzy clustering and supervised learning
Jing Wu; Hong Yan; Andrew N. Chalmers
1994-01-01
We propose a technique for the segmentation of color map images by means of an algorithm based on fuzzy clustering and prototype optimization. Its purpose is to facilitate the extraction of lines and characters from a wide variety of geographical map images. In this method, segmentation is considered to be a process of pixel classification. The fuzzy c-means clustering algorithm
On selecting an optimal number of clusters for color image segmentation Hoel Le Capitaine
Paris-Sud XI, Université de
--This paper addresses the problem of region-based color image segmentation using a fuzzy clustering algorithm, e.g. a spatial version of fuzzy c-means, in order to partition the image into clusters corresponding. II. FUZZY CLUSTERING AND IMAGE SEGMENTATION Clustering is an instance of unsupervised classification
A Geospatial Implementation of a Novel Delineation Clustering Algorithm Employing the K-means
Tonny J. Oyana; Kara E. Scott
2008-01-01
The overarching objective of this study is to report the implementation and performance of a novel delineation clustering algorithm employing the k-means. This study explores a newly proposed algorithm designed to increase the overall performance of the k-means clustering technique: the Fast, Efficient, and Scalable k-means algorithm (FES-k-means*). The algorithm reduces the computational load and produces quality clusters. Resulting improvements
A review on particle swarm optimization algorithms and their applications to data clustering
Sandeep Rana; Sanjay Jasola; Rajesh Kumar
2011-01-01
Data clustering is one of the most popular techniques in data mining. It is a method of grouping data into clusters, in which each cluster must have data of great similarity and high dissimilarity with other cluster data. The most popular clustering algorithm, K-means, and other classical algorithms suffer from disadvantages of initial centroid selection, local optima, low convergence rate
Cluster algorithm for Monte Carlo simulations of spin ice
NASA Astrophysics Data System (ADS)
Otsuka, Hiromi
2014-12-01
We present an algorithm for Monte Carlo simulations of a nearest-neighbor spin-ice model based on its cluster representation. To assess its performance, we estimate a relaxation time, and find that, in contrast to the Metropolis algorithm, our algorithm does not develop spin freezing. Also, to demonstrate the efficiency, we calculate the spin and charge structure factors, and observe pinch points in a high-resolution color map. We then find that Debye screening works among defects and brings about short-range correlations, and that the deconfinement transition triggered by a fugacity of defects z is dictated by a singular part of the free-energy density fs?z .
Finding reproducible cluster partitions for the k-means algorithm.
Lisboa, Paulo J G; Etchells, Terence A; Jarman, Ian H; Chambers, Simon J
2013-01-01
K-means clustering is widely used for exploratory data analysis. While its dependence on initialisation is well-known, it is common practice to assume that the partition with lowest sum-of-squares (SSQ) total i.e. within cluster variance, is both reproducible under repeated initialisations and also the closest that k-means can provide to true structure, when applied to synthetic data. We show that this is generally the case for small numbers of clusters, but for values of k that are still of theoretical and practical interest, similar values of SSQ can correspond to markedly different cluster partitions. This paper extends stability measures previously presented in the context of finding optimal values of cluster number, into a component of a 2-d map of the local minima found by the k-means algorithm, from which not only can values of k be identified for further analysis but, more importantly, it is made clear whether the best SSQ is a suitable solution or whether obtaining a consistently good partition requires further application of the stability index. The proposed method is illustrated by application to five synthetic datasets replicating a real world breast cancer dataset with varying data density, and a large bioinformatics dataset. PMID:23369085
Robust growing neural gas algorithm with application in cluster analysis.
Qin, A K; Suganthan, P N
2004-01-01
We propose a novel robust clustering algorithm within the Growing Neural Gas (GNG) framework, called Robust Growing Neural Gas (RGNG) network. The Matlab codes are available from . By incorporating several robust strategies, such as outlier resistant scheme, adaptive modulation of learning rates and cluster repulsion method into the traditional GNG framework, the proposed RGNG network possesses better robustness properties. The RGNG is insensitive to initialization, input sequence ordering and the presence of outliers. Furthermore, the RGNG network can automatically determine the optimal number of clusters by seeking the extreme value of the Minimum Description Length (MDL) measure during network growing process. The resulting center positions of the optimal number of clusters represented by prototype vectors are close to the actual ones irrespective of the existence of outliers. Topology relationships among these prototypes can also be established. Experimental results have shown the superior performance of our proposed method over the original GNG incorporating MDL method, called GNG-M, in static data clustering tasks on both artificial and UCI data sets. PMID:15555857
A fast clustering algorithm for data with a few labeled instances.
Yang, Jinfeng; Xiao, Yong; Wang, Jiabing; Ma, Qianli; Shen, Yanhua
2015-01-01
The diameter of a cluster is the maximum intracluster distance between pairs of instances within the same cluster, and the split of a cluster is the minimum distance between instances within the cluster and instances outside the cluster. Given a few labeled instances, this paper includes two aspects. First, we present a simple and fast clustering algorithm with the following property: if the ratio of the minimum split to the maximum diameter (RSD) of the optimal solution is greater than one, the algorithm returns optimal solutions for three clustering criteria. Second, we study the metric learning problem: learn a distance metric to make the RSD as large as possible. Compared with existing metric learning algorithms, one of our metric learning algorithms is computationally efficient: it is a linear programming model rather than a semidefinite programming model used by most of existing algorithms. We demonstrate empirically that the supervision and the learned metric can improve the clustering quality. PMID:25861252
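The definitions in this abstract can be made concrete with a short sketch that computes the diameter of a cluster, the split of a cluster, and the ratio of the minimum split to the maximum diameter (RSD) for a given partition. The function names, the use of Euclidean distance, and the sample partition are assumptions made for illustration; this only evaluates the RSD and is not the clustering or metric learning algorithm of the paper.

```python
import math
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def diameter(cluster: List[Point]) -> float:
    """Maximum distance between two instances of the same cluster."""
    return max((math.dist(p, q) for p, q in combinations(cluster, 2)), default=0.0)


def split(index: int, clusters: List[List[Point]]) -> float:
    """Minimum distance from an instance in clusters[index] to any instance outside it."""
    inside = clusters[index]
    outside = [q for j, c in enumerate(clusters) if j != index for q in c]
    return min(math.dist(p, q) for p in inside for q in outside)


def rsd(clusters: List[List[Point]]) -> float:
    """Ratio of the minimum split to the maximum diameter of a partition."""
    if len(clusters) < 2:
        raise ValueError("at least two clusters are needed to define a split")
    max_diameter = max(diameter(c) for c in clusters)
    min_split = min(split(i, clusters) for i in range(len(clusters)))
    if max_diameter == 0.0:
        return math.inf  # all clusters are singletons (assumed convention)
    return min_split / max_diameter


# Illustrative partition: two tight, well-separated clusters on a line.
clusters = [[(0.0, 0.0), (1.0, 0.0)], [(5.0, 0.0), (6.0, 0.0)]]
print(rsd(clusters))
```

An RSD greater than one means every cluster is more compact than it is close to any other cluster, which is the condition the abstract ties to optimality guarantees.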
Differential Evolution Based Fuzzy Clustering
NASA Astrophysics Data System (ADS)
Ravi, V.; Aggarwal, Nupur; Chauhan, Nikunj
In this work, two new fuzzy clustering (FC) algorithms based on Differential Evolution (DE) are proposed. Five well-known data sets viz. Iris, Wine, Glass, E. Coli and Olive Oil are used to demonstrate the effectiveness of DEFC-1 and DEFC-2. They are compared with Fuzzy C-Means (FCM) algorithm and Threshold Accepting Based Fuzzy Clustering algorithms proposed by Ravi et al., [1]. Xie-Beni index is used to arrive at the 'optimal' number of clusters. Based on the numerical experiments, we infer that, in terms of least objective function value, these variants can be used as viable alternatives to FCM algorithm.
Englebienne, Gwenn
Outline: Introduction. Everything observed. k-Means Clustering. Mixtures of Gaussians. The General EM Algorithm. Extensions to EM.
A Dependable Slepian-Wolf Coding Based Clustering Algorithm for Data Aggregation
Boyer, Edmond
A Dependable Slepian-Wolf Coding Based Clustering Algorithm for Data Aggregation in Wireless Sensor the Slepian-Wolf coding based data aggregation problem and the corresponding dependable clustering problem in wireless sensor networks (WSNs). A dependable Slepian-Wolf coding based clustering (D-SWC) algorithm
Nicolaos B. Karayiannis; Mary M. Randolph-Gips
2005-01-01
This paper presents the development of soft clustering and learning vector quantization (LVQ) algorithms that rely on a weighted norm to measure the distance between the feature vectors and their prototypes. The development of LVQ and clustering algorithms is based on the minimization of a reformulation function under the constraint that the generalized mean of the norm weights be constant.
Mammographic images segmentation based on chaotic map clustering algorithm
2014-01-01
Background This work investigates the applicability of a novel clustering approach to the segmentation of mammographic digital images. The chaotic map clustering algorithm is used to group together similar subsets of image pixels resulting in a medically meaningful partition of the mammography. Methods The image is divided into pixels subsets characterized by a set of conveniently chosen features and each of the corresponding points in the feature space is associated to a map. A mutual coupling strength between the maps depending on the associated distance between feature space points is subsequently introduced. On the system of maps, the simulated evolution through chaotic dynamics leads to its natural partitioning, which corresponds to a particular segmentation scheme of the initial mammographic image. Results The system provides a high recognition rate for small mass lesions (about 94% correctly segmented inside the breast) and the reproduction of the shape of regions with denser micro-calcifications in about 2/3 of the cases, while being less effective on identification of larger mass lesions. Conclusions We can summarize our analysis by asserting that due to the particularities of the mammographic images, the chaotic map clustering algorithm should not be used as the sole method of segmentation. It is rather the joint use of this method along with other segmentation techniques that could be successfully used for increasing the segmentation performance and for providing extra information for the subsequent analysis stages such as the classification of the segmented ROI. PMID:24666766
FAST AND EFFICIENT MODEL-BASED CLUSTERING WITH THE ASCENT-EM ALGORITHM
Jank, Wolfgang
In this paper we propose an efficient and fast EM algorithm for model-based clustering of large databases. Drawing ideas from its stochastic descendant, the Monte Carlo EM algorithm, the method uses only
Reconfigurable Mesh Algorithms For Image Shrinking, Expanding, Clustering, And Template Matching*
Sahni, Sartaj K.
Reconfigurable mesh algorithms are developed for the following image processing problems: shrinking, expanding, clustering, and template matching. Our N×N reconfigurable mesh algorithm for the q-step shrinking
Robust fuzzy clustering of relational data
Rajesh N. Davé; Sumit Sen
2002-01-01
Popular relational-data clustering algorithms, relational dual of fuzzy c-means (RFCM), non-Euclidean RFCM (NERFCM) (both by Hathaway et al), and FANNY (by Kaufman and Rousseeuw) are examined. A new algorithm, which is a generalization of FANNY, called the fuzzy relational data clustering (FRC) algorithm, is introduced, having an identical objective functional as RFCM. However, the FRC does not have the restriction
Evaluation and Comparison of Clustering Algorithms in Analyzing ES Cell Gene Expression Data
the robustness of clustering results after small perturbation). The results showed that the ES cell dataset posed others. Some algorithms require that every gene in the dataset belongs to one and only one cluster (i
Classification of posture maintenance data with fuzzy clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1992-01-01
Sensory inputs from the visual, vestibular, and proprioceptive systems are integrated by the central nervous system to maintain postural equilibrium. Sustained exposure to microgravity causes neurosensory adaptation during spaceflight, which results in decreased postural stability until readaptation occurs upon return to the terrestrial environment. Data which simulate sensory inputs under various sensory organization test (SOT) conditions were collected in conjunction with Johnson Space Center postural control studies using a tilt-translation device (TTD). The University of West Florida applied the fuzzy c-means (FCM) clustering algorithms to this data with a view towards identifying various states and stages of subjects experiencing such changes. Feature analysis, time step analysis, pooling data, response of the subjects, and the algorithms used are discussed.
A Modified Version of the K-Means Algorithm with a Distance Based on Cluster Symmetry
Mu-chun Su; Chien-hsing Chou
2001-01-01
Abstract: In this paper, we propose a modified version of the K-means algorithm to cluster data. The proposed algorithm adopts a novel nonmetric distance measure based on the idea of “point symmetry.” This kind of “point symmetry distance” can be applied in data clustering and human face detection. Several data sets are used to illustrate its effectiveness. Index Terms: Data clustering, pattern recognition, k-means algorithm, face
Fault diagnosis of rotating machinery based on a new hybrid clustering algorithm
Yaguo Lei; Zhengjia He; Yanyang Zi; Qiao Hu
2008-01-01
A new hybrid clustering algorithm based on a three-layer feed forward neural network (FFNN), a distribution density function, and a cluster validity index, is presented in this paper. In this algorithm, both feature weighting and sample weighting are considered, and an optimal cluster number is automatically determined by the cluster validity index. Feature weights are learnt via FFNN based on
Incorporation of Non-Euclidean Distance Metrics into Fuzzy Clustering on Graphics Processing Units
He, Zhihai "Henry"
is not a new concept. Shankar and Pal presented a progressive subsampling method called fast fuzzy c-means, but so is the size of the data set. In [2] Pal and Bezdek developed the extensible fast fuzzy c-means clustering (eFFCM) algorithm for the segmentation of very large digital images. In [3] Hathaway and Bezdek
An Improved Ant Colony Optimization Cluster Algorithm Based on Swarm Intelligence
Weihui Dai; Shouji Liu; Shuyi Liang
2009-01-01
This paper proposes an improved ant colony optimization cluster algorithm based on a classic algorithm, the LF algorithm. By introducing a new formula and a probability conversion function for the similarity metric, as well as a new distance formula, this algorithm can deal with categorical data easily. It also introduces a new adjustment process, which adjusts the
An Entropy Weighting k-Means Algorithm for Subspace Clustering of High-Dimensional Sparse Data
Liping Jing; Michael K. Ng; Joshua Zhexue Huang
2007-01-01
This paper presents a new k-means type algorithm for clustering high-dimensional objects in sub-spaces. In high-dimensional data, clusters of objects often exist in subspaces rather than in the entire space. For example, in text clustering, clusters of documents of different topics are categorized by different subsets of terms or keywords. The keywords for one cluster may not occur in the
Textural defect detect using a revised ant colony clustering algorithm
NASA Astrophysics Data System (ADS)
Zou, Chao; Xiao, Li; Wang, Bingwen
2007-11-01
We propose a novel method based on a revised ant colony clustering algorithm (ACCA) to explore the topic of textural defect detection. In this algorithm, our efforts are mainly devoted to the definition of a local irregularity measurement and the implementation of the revised ACCA. The local irregularity measurement evaluates the local textural inconsistency of each pixel against its mini-environment. In our revised ACCA, the behavior of each ant is divided into two steps: release pheromone and act. The quantity of pheromone released is proportional to the irregularity measurement; the next actions of the ants are chosen independently of each other in a stochastic way according to some evaluated heuristic knowledge. The independence of the ants implies an inherently parallel computation architecture for this algorithm. We apply the proposed method to some typical textural images with defects. From the series of pheromone distribution maps (PDMs), it can be clearly seen that the pheromone distribution approaches the textural defects gradually. After some post-processing, the final distribution of pheromone demonstrates the shape and area of the defects well.
Two generalizations of Kohonen clustering
NASA Technical Reports Server (NTRS)
Bezdek, James C.; Pal, Nikhil R.; Tsao, Eric C. K.
1993-01-01
The relationship between the sequential hard c-means (SHCM), learning vector quantization (LVQ), and fuzzy c-means (FCM) clustering algorithms is discussed. LVQ and SHCM suffer from several major problems. For example, they depend heavily on initialization. If the initial values of the cluster centers are outside the convex hull of the input data, such algorithms, even if they terminate, may not produce meaningful results in terms of prototypes for cluster representation. This is due in part to the fact that they update only the winning prototype for every input vector. The impact and interaction of these two families with Kohonen's self-organizing feature mapping (SOFM), which is not a clustering method but which often lends ideas to clustering algorithms, is discussed. Then two generalizations of LVQ that are explicitly designed as clustering algorithms are presented; these algorithms are referred to as generalized LVQ (GLVQ) and fuzzy LVQ (FLVQ). Learning rules are derived to optimize an objective function whose goal is to produce 'good clusters'. GLVQ/FLVQ (may) update every node in the clustering net for each input vector. Neither GLVQ nor FLVQ depends upon a choice for the update neighborhood or learning rate distribution; these are taken care of automatically. Segmentation of a gray tone image is used as a typical application of these algorithms to illustrate the performance of GLVQ/FLVQ.
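The initialization problem described above can be illustrated with a small winner-take-all sketch. It assumes the common sequential update in which the winning prototype moves a fraction `rate` of the way toward the input; the function name, the learning rate, and the toy data are hypothetical, and the GLVQ/FLVQ learning rules from the paper are not reproduced here.

```python
from math import dist


def winner_update(prototypes, x, rate=0.1):
    """Move only the prototype closest to x toward x; return its index."""
    win = min(range(len(prototypes)), key=lambda i: dist(prototypes[i], x))
    prototypes[win] = [v + rate * (xi - v) for v, xi in zip(prototypes[win], x)]
    return win


# Four inputs on the unit square; the second prototype starts far outside
# the convex hull of the data.
data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
prototypes = [[0.5, 0.5], [100.0, 100.0]]

for _ in range(50):
    for x in data:
        winner_update(prototypes, x)

print(prototypes[1])  # [100.0, 100.0]: it never wins, so it never moves
```

Because only the winner is updated, the distant prototype is never adjusted and ends up representing no cluster at all, which is the failure mode the abstract attributes to winner-only updates.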
Generalised Blurring Mean-Shift Algorithms for Nonparametric Clustering
Carreira-Perpiñán, Miguel Á.
Gaussian blurring mean-shift (GBMS) is a nonparametric clustering algorithm, having a single bandwidth and exponential updates, and depending on a step-size parameter. We give conditions on the step size for the convergence of these algorithms and show that the convergence rate for Gaussian clusters ranges from sublinear
User-Based Document Clustering by Redescribing Subject Descriptions with a Genetic Algorithm.
ERIC Educational Resources Information Center
Gordon, Michael D.
1991-01-01
Discussion of clustering of documents and queries in information retrieval systems focuses on the use of a genetic algorithm to adapt subject descriptions so that documents become more effective in matching relevant queries. Various types of clustering are explained, and simulation experiments used to test the genetic algorithm are described. (27…
The performance of minima hopping and evolutionary algorithms for cluster structure prediction
Oganov, Artem R.
The performance of minima hopping and evolutionary algorithms for cluster structure prediction compare evolutionary algorithms with minima hopping for global optimization in the field of cluster it with previously used operators. Minima hopping is improved with a softening method and a stronger feedback
New clustering algorithm-based fault diagnosis using compensation distance evaluation technique
Yaguo Lei; Zhengjia He; Yanyang Zi; Xuefeng Chen
2008-01-01
This paper presents a fault diagnosis method of rotating machinery based on a new clustering algorithm using a compensation distance evaluation technique (CDET). A two-stage feature selection and weighting technique is adopted in this algorithm. Feature weights are computed via CDET according to the sensitivity of features and assigned to the corresponding features to indicate their different importance in clustering.
A Repeated Local Search Algorithm for BiClustering of Gene Expression Data
Battiti, Roberto
A Repeated Local Search Algorithm for BiClustering of Gene Expression Data Duy Tin Truong, Roberto. Given a gene expression data matrix where each cell is the expression level of a gene under a certain-of-the-art algorithms. Keywords: biclustering, co-clustering, local search, gene expression, microarray. 1 Introduction
Multilevel image segmentation using fuzzy clustering and local membership variations detection
E. Levrat; V. Bombardier; M. Lamotte; J. Bremont
1992-01-01
A segmentation method for gray-level images with fuzzy clustering and local detection of membership variations is presented. The method is very efficient for edge detection in images where transitions between two regions are very large. Two fuzzy operations and a fuzzy c-means algorithm adaptation for pixel clustering are introduced. The influence of the number of clusters on the results is
A Formal Algorithm for Verifying the Validity of Clustering Results Based on Model Checking
Huang, Shaobin; Cheng, Yuan; Lang, Dapeng; Chi, Ronghua; Liu, Guofeng
2014-01-01
The limitations in general methods to evaluate clustering will remain difficult to overcome if verifying the clustering validity continues to be based on clustering results and evaluation index values. This study focuses on a clustering process to analyze crisp clustering validity. First, we define the properties that must be satisfied by valid clustering processes and model clustering processes based on program graphs and transition systems. We then recast the analysis of clustering validity as the problem of verifying whether the model of clustering processes satisfies the specified properties with model checking. That is, we try to build a bridge between clustering and model checking. Experiments on several datasets indicate the effectiveness and suitability of our algorithms. Compared with traditional evaluation indices, our formal method can not only indicate whether the clustering results are valid but, in the case the results are invalid, can also detect the objects that have led to the invalidity. PMID:24608823
Parallel Glowworm Swarm Optimization Clustering Algorithm based on MapReduce
Ludwig, Simone
clustering (MRCGSO) using MapReduce is introduced to handle big data. The proposed algorithm uses glowworm quality. Keywords--Big data clustering, Parallel Processing, Hadoop I. INTRODUCTION Data clustering [1 with each other to help reaching the food sources. There is no central member in the swarm, but rather all
Novel clustering algorithm for microarray expression data in a truncated David Horn and Inon Axel
Horn, David
to be better suited for information retrieval purposes than the original matrix. Taking a similar attitude- processing step of a clustering algorithm. The idea is that, rather than performing clustering on the initial data presented by the gene/sample matrix, we compress it first and perform clustering on the data
Alternatives to the k-means algorithm that find better clusterings
Greg Hamerly; Charles Elkan
2002-01-01
We investigate here the behavior of the standard k-means clustering algorithm and several alternatives to it: the k-harmonic means algorithm due to Zhang and colleagues, fuzzy k-means, Gaussian expectation-maximization, and two new variants of k-harmonic means. Our aim is to find which aspects of these algorithms contribute to finding good clusterings, as opposed to converging to a low-quality local optimum.
Reconfigurable Mesh Algorithms for Image Shrinking, Expanding, Clustering, and Template Matching
Jing-fu Fu Jenq; Sartaj Sahni
1991-01-01
Parallel reconfigurable mesh algorithms are developed for the following image processing problems: shrinking, expanding, clustering, and template matching. Our N×N reconfigurable mesh algorithm for the q-step shrinking and expansion of a binary image takes O(1) time. One pass of the clustering algorithm for N patterns and K centers can be done in O (MK + KlogN), O (KlogNM
A fast readout algorithm for Cluster Counting/Timing drift chambers on a FPGA board
NASA Astrophysics Data System (ADS)
Cappelli, L.; Creti, P.; Grancagnolo, F.; Pepino, A.; Tassielli, G.
2013-08-01
A fast readout algorithm for Cluster Counting and Timing purposes has been implemented and tested on a Virtex 6 core FPGA board. The algorithm analyses and stores data coming from a Helium based drift tube instrumented by 1 GSPS fADC and represents the outcome of balancing between cluster identification efficiency and high speed performance. The algorithm can be implemented in electronics boards serving multiple fADC channels as an online preprocessing stage for drift chamber signals.
NASA Astrophysics Data System (ADS)
Morales-Esteban, Antonio; Martínez-Álvarez, Francisco; Scitovski, Sanja; Scitovski, Rudolf
2014-12-01
In this paper we construct an efficient adaptive Mahalanobis k-means algorithm. In addition, we propose a new efficient algorithm to search for a globally optimal partition obtained by using the adaptive Mahalanobis distance-like function. The algorithm is a generalization of the previously proposed incremental algorithm (Scitovski and Scitovski, 2013). It successively finds optimal partitions with k = 2, 3, … clusters. Therefore, it can also be used for the estimation of the most appropriate number of clusters in a partition by using various validity indexes. The algorithm has been applied to the seismic catalogues of Croatia and the Iberian Peninsula. Both regions are characterized by moderate seismic activity. One of the main advantages of the algorithm is its ability to discover not only circular but also elliptical shapes, whose geometry fits the faults better. Three seismogenic zonings are proposed for Croatia and two for the Iberian Peninsula and adjacent areas, according to the clusters discovered by the algorithm.
Improving the Initial Centroids of k-means Clustering Algorithm to Generalize its Applicability
NASA Astrophysics Data System (ADS)
Goyal, M.; Kumar, S.
2014-12-01
k-means is one of the most widely used partition-based clustering algorithms. However, the initial centroids generated randomly by the k-means algorithm cause the algorithm to converge to a local optimum. To bring k-means closer to a global optimum, the initial centroids must be selected carefully rather than randomly. Although much research has already been carried out on enhancing the k-means algorithm, the existing approaches have their own limitations. In this paper a new method to formulate the initial centroids is proposed, which results in better clusters for both uniform and non-uniform data sets.
Color image segmentation using automatic thresholding and the fuzzy C-means techniques
S. Ben Chaabane; M. Sayadi; F. Fnaiech; E. Brassart
2008-01-01
In this paper, a color image segmentation approach based on automatic histogram thresholding and the fuzzy C-means (FCM) techniques is presented. The originality of this work lies in using thresholding and clustering techniques together for color image segmentation. The histogram considers the occurrence of the gray levels among pixels. In a first stage, the thresholding histogram is used for finding
An improved method of image segmentation using fuzzy c-means
B. Thamaraichelvi; G. Yamuna; U. Vanitha
2012-01-01
We propose a new method by incorporating improved k-means and modified fuzzy c-means clustering techniques for segmenting medical images. Segmentation of medical images plays a key role in estimating the object boundary and abnormalities, if any. Moreover, it is an important process to extract more information from complex medical images. Magnetic resonance images may contain noise due to the operator
NASA Astrophysics Data System (ADS)
Huang, Mingqiang; Hao, Liqing; Guo, Xiaoyong; Hu, Changjin; Gu, Xuejun; Zhao, Weixiong; Wang, Zhenya; Fang, Li; Zhang, Weijun
2013-01-01
Experiments on the formation of secondary organic aerosol (SOA) from photooxidation of 1,3,5-trimethylbenzene in a CH3ONO/NO/air mixture were carried out in a laboratory chamber. The size and chemical composition of the resultant individual particles were measured in real time by an aerosol laser time-of-flight mass spectrometer (ALTOFMS) recently designed in our group. We also developed a Fuzzy C-Means (FCM) algorithm to classify the mass spectra of large numbers of SOA particles. The study first started with mixed particles generated from standard benzaldehyde, phenol, benzoic acid, and nitrobenzene solutions to test the feasibility of applying the FCM. The FCM was then used to extract potential aerosol classes in the chamber experiments. The results demonstrate that FCM allowed a clear identification of ten distinct chemical particle classes in this study, namely, 3,5-dimethylbenzoic acid, 3,5-dimethylbenzaldehyde, 2,4,6-trimethyl-5-nitrophenol, 2-methyl-4-oxo-2-pentenal, 2,4,6-trimethylphenol, 3,5-dimethyl-2-furanone, glyoxal, and high-molecular-weight (HMW) components. Compared to offline methods such as gas chromatography-mass spectrometry (GC-MS) measurement, the real-time ALTOFMS detection approach coupled with the FCM data processing algorithm can perform cluster analysis of SOA successfully and provide more information about the products. Thus ALTOFMS is a useful tool to reveal the formation and transformation processes of SOA particles in smog chambers.
Beevi, S Zulaikha; Senthamaraikannan, K
2010-01-01
Medical image segmentation demands a segmentation algorithm that is efficient and robust against noise. The conventional fuzzy c-means (FCM) algorithm is an efficient clustering algorithm that is used in medical image segmentation. However, FCM is highly vulnerable to noise since it uses only intensity values for clustering the images. This paper aims to develop a novel and efficient fuzzy spatial c-means clustering algorithm which is robust to noise. The proposed clustering algorithm uses fuzzy spatial information to calculate the membership value. The input image is clustered using the proposed ISFCM algorithm. A comparative study has been made between the conventional FCM and the proposed ISFCM. The proposed approach is found to outperform the conventional FCM.
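To make the noise sensitivity concrete, here is a minimal sketch of one iteration of the conventional (textbook) FCM on scalar pixel intensities. It is not the ISFCM method of the paper, whose spatial term is not given in the abstract; the function names, the fuzzifier default `m=2.0`, and the small `eps` guard are assumptions for the example.

```python
def fcm_memberships(intensities, centers, m=2.0, eps=1e-12):
    """Return one membership row per pixel; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    rows = []
    for x in intensities:
        # Distance of this pixel's intensity to every cluster center.
        d = [abs(x - c) + eps for c in centers]
        rows.append([1.0 / sum((d_i / d_j) ** power for d_j in d) for d_i in d])
    return rows


def fcm_centers(intensities, memberships, m=2.0):
    """Update each center as the membership-weighted mean intensity."""
    centers = []
    for i in range(len(memberships[0])):
        w = [row[i] ** m for row in memberships]
        centers.append(sum(wi * x for wi, x in zip(w, intensities)) / sum(w))
    return centers


pixels = [0.0, 10.0, 5.0]
u = fcm_memberships(pixels, centers=[0.0, 10.0])
print(u[2])                    # [0.5, 0.5]: the mid-gray pixel is shared equally
print(fcm_centers(pixels, u))  # approximately [1.0, 9.0]
```

Each membership depends only on the pixel's own intensity, so a noisy pixel is assigned without reference to its neighbors. That is the weakness the abstract says the spatial variant is meant to address.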
S. Zulaikha Beevi; M. Mohammed Sathik; K. Senthamaraikannan; J. H. Jaseema Yasmin
2010-01-01
Segmentation is an important step in many medical imaging applications and a variety of image segmentation techniques do exist. Of them, a group of segmentation algorithms is based on the clustering concepts. In our research, we have intended to devise efficient variants of Fuzzy C-Means (FCM) clustering towards effective segmentation of medical images. The enhanced variants of FCM clustering are
UNSUPERVISED LEARNING CLUSTERING Clustering overview
Weske, Mathias
UNSUPERVISED LEARNING: CLUSTERING. Outline: Clustering overview; Internal and external clustering criteria; Impossibility Theorem for clustering; Hierarchical clustering; Single-link, complete-link heuristics; Partitional/flat clustering algorithms. Clustering overview: Why clustering? ... no labels
A special local clustering algorithm for identifying the genes associated with Alzheimer's disease.
Pang, Chao-Yang; Hu, Wei; Hu, Ben-Qiong; Shi, Ying; Vanderburg, Charles R; Rogers, Jack T; Huang, Xudong
2010-03-01
Clustering is the grouping of similar objects into a class. Local clustering feature refers to the phenomenon whereby one group of data is separated from another, and the data from these different groups are clustered locally. A compact class is defined as one cluster in which all similar elements cluster tightly within the cluster. Herein, the essence of the local clustering feature, revealed by mathematical manipulation, results in a novel clustering algorithm termed as the special local clustering (SLC) algorithm that was used to process gene microarray data related to Alzheimer's disease (AD). SLC algorithm was able to group together genes with similar expression patterns and identify significantly varied gene expression values as isolated points. If a gene belongs to a compact class in control data and appears as an isolated point in incipient, moderate and/or severe AD gene microarray data, this gene is possibly associated with AD. Application of a clustering algorithm in disease-associated gene identification such as in AD is rarely reported. PMID:20089478
Deb, Suash; Yang, Xin-She
2014-01-01
Traditional K-means clustering algorithms have the drawback of getting stuck at local optima that depend on the random values of initial centroids. Optimization algorithms have their advantages in guiding iterative computation to search for global optima while avoiding local optima. The algorithms help speed up the clustering process by converging into a global optimum early with multiple search agents in action. Inspired by nature, some contemporary optimization algorithms which include Ant, Bat, Cuckoo, Firefly, and Wolf search algorithms mimic the swarming behavior allowing them to cooperatively steer towards an optimal objective within a reasonable time. It is known that these so-called nature-inspired optimization algorithms have their own characteristics as well as pros and cons in different applications. When these algorithms are combined with K-means clustering mechanism for the sake of enhancing its clustering quality by avoiding local optima and finding global optima, the new hybrids are anticipated to produce unprecedented performance. In this paper, we report the results of our evaluation experiments on the integration of nature-inspired optimization methods into K-means algorithms. In addition to the standard evaluation metrics in evaluating clustering quality, the extended K-means algorithms that are empowered by nature-inspired optimization methods are applied on image segmentation as a case study of application scenario. PMID:25202730
Research on image segmentation algorithm based on fuzzy clustering
NASA Astrophysics Data System (ADS)
Qu, Bo
2013-07-01
Through an in-depth study of the existing classical FCM algorithms, this paper puts forward a scheme to improve the FCM algorithm. The improved algorithm introduces the statistical properties of images, thus greatly reducing the amount of data the algorithm processes and speeding up convergence, while its segmentation results are exactly the same as those of the FCM algorithm.
Algorithmic transformations in the implementation of K- means clustering on reconfigurable hardware
Mike Estlick; Miriam Leeser; James Theiler; John J. Szymanski
2001-01-01
In mapping the k-means algorithm to FPGA hardware, we examined algorithm level transforms that dramatically increased the achievable parallelism. We apply the k-means algorithm to multi-spectral and hyper-spectral images, which have tens to hundreds of channels per pixel of data. K-means is an iterative algorithm that assigns to each pixel a label indicating which of K clusters the pixel
An adaptive spatial clustering algorithm based on the minimum spanning tree-like
NASA Astrophysics Data System (ADS)
Deng, Min; Liu, Qiliang; Li, Guangqiang; Cheng, Tao
2009-10-01
Spatial clustering is an important means for spatial data mining and spatial analysis, and it can be used to discover the potential rules and outliers among the spatial data. Most existing spatial clustering methods cannot deal with the uneven density of the data and usually require predefined parameters which are hard to justify. In order to overcome such limitations, we firstly propose the concept of edge variation factor based upon the definition of distance variation among the entities in the spatial neighborhood. Then, an approach is presented to construct the minimum spanning tree-like (MST-L). Further, an adaptive MST-L based spatial clustering algorithm (AMSTLSC) is developed in this paper. The spatial clustering algorithm only involves the setting of the threshold of edge variation factor as an input parameter, which is easily made with the support of little priori information. Through this parameter, a series of MST-L can be automatically generated from the high-density region to the low-density one, where each MST-L represents a cluster. As a result, the algorithm proposed in this paper can adapt to the change of local density among spatial points. This property is also called the adaptiveness. Finally, two tests are implemented to demonstrate that the AMSTLSC algorithm is very robust and suitable to find the clusters with different shapes. Especially the algorithm has good adaptiveness. A comparative test is made to further prove the AMSTLSC algorithm better than classic DBSCAN algorithm.
NASA Technical Reports Server (NTRS)
Mach, Douglas M.; Christian, Hugh J.; Blakeslee, Richard; Boccippio, Dennis J.; Goodman, Steve J.; Boeck, William
2006-01-01
We describe the clustering algorithm used by the Lightning Imaging Sensor (LIS) and the Optical Transient Detector (OTD) for combining the lightning pulse data into events, groups, flashes, and areas. Events are single pixels that exceed the LIS/OTD background level during a single frame (2 ms). Groups are clusters of events that occur within the same frame and in adjacent pixels. Flashes are clusters of groups that occur within 330 ms and either 5.5 km (for LIS) or 16.5 km (for OTD) of each other. Areas are clusters of flashes that occur within 16.5 km of each other. Many investigators are utilizing the LIS/OTD flash data; therefore, we test how variations in the algorithms for the event group and group-flash clustering affect the flash count for a subset of the LIS data. We divided the subset into areas with low (1-3), medium (4-15), high (16-63), and very high (64+) flashes to see how changes in the clustering parameters affect the flash rates in these different sizes of areas. We found that as long as the cluster parameters are within about a factor of two of the current values, the flash counts do not change by more than about 20%. Therefore, the flash clustering algorithm used by the LIS and OTD sensors create flash rates that are relatively insensitive to reasonable variations in the clustering algorithms.
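The first clustering level described above (events into groups) amounts to finding connected sets of pixels within a single frame. The sketch below illustrates that idea only and is not the LIS/OTD production code. The `(frame, row, col)` event format, the function name, and the treatment of diagonal neighbors as adjacent (8-connectivity) are assumptions.

```python
from collections import defaultdict


def group_events(events):
    """Cluster (frame, row, col) events into groups of adjacent same-frame pixels."""
    by_frame = defaultdict(set)
    for frame, row, col in events:
        by_frame[frame].add((row, col))

    groups = []
    for frame in sorted(by_frame):
        unvisited = by_frame[frame]
        while unvisited:
            # Flood-fill outward from an arbitrary remaining event.
            stack = [unvisited.pop()]
            members = []
            while stack:
                r, c = stack.pop()
                members.append((frame, r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        neighbor = (r + dr, c + dc)
                        if neighbor in unvisited:
                            unvisited.remove(neighbor)
                            stack.append(neighbor)
            groups.append(sorted(members))
    return groups


events = [(0, 1, 1), (0, 1, 2), (0, 5, 5), (1, 1, 1)]
print(sorted(group_events(events)))
# [[(0, 1, 1), (0, 1, 2)], [(0, 5, 5)], [(1, 1, 1)]]
```

Grouping flashes would add the temporal (330 ms) and spatial (5.5 km or 16.5 km) thresholds quoted in the abstract on top of this step. How those thresholds are applied between groups is not specified there, so it is left out of the sketch.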
Rank-based spatial clustering: an algorithm for rapid outbreak detection
Tsui, Fu-Chiang
2011-01-01
Objective Public health surveillance requires outbreak detection algorithms with computational efficiency sufficient to handle the increasing volume of disease surveillance data. In response to this need, the authors propose a spatial clustering algorithm, rank-based spatial clustering (RSC), that rapidly detects infectious but non-contagious disease outbreaks. Design The authors compared the outbreak-detection performance of RSC with that of three well-established algorithms, namely the wavelet anomaly detector (WAD), the spatial scan statistic (KSS), and the Bayesian spatial scan statistic (BSS), using real disease surveillance data onto which they superimposed simulated disease outbreaks. Measurements The following outbreak-detection performance metrics were measured: receiver operating characteristic curve, activity monitoring operating curve, cluster positive predictive value, cluster sensitivity, and algorithm run time. Results RSC was computationally efficient. It outperformed the other two spatial algorithms in terms of detection timeliness and outbreak localization. RSC also had overall better timeliness than the time-series algorithm WAD at low false alarm rates. Conclusion RSC is an ideal algorithm for analyzing large datasets when the application of other spatial algorithms is not practical. It also allows timely investigation for public health practitioners by providing early detection and well-localized outbreak clusters. PMID:21486881
A novel artificial bee colony based clustering algorithm for categorical data.
Ji, Jinchao; Pang, Wei; Zheng, Yanlin; Wang, Zhe; Ma, Zhiqiang
2015-01-01
Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data. PMID:25993469
NASA Astrophysics Data System (ADS)
Zhang, Junda; Tang, Xiao-an; Jiang, Libing
2013-10-01
Clustering is an effective means of marine environment data analysis. This paper proposes a clustering algorithm based on the "Velocity-Direction" histogram. First, the "Velocity-Direction" histogram is constructed based on the characteristics of marine environment vector field data. Second, the exact surface of the histogram is reconstructed by the Gaussian kernel function to eliminate the contaminated data points in the "Velocity-Direction" histogram. Finally, the FCM algorithm is introduced and modified for the "Velocity-Direction" histogram clustering. The initial number of clusters and the cluster centers for the FCM algorithm are set from the local extrema of the constructed histogram surfaces. The experimental results based on the simulation and the NOAA marine environment vector field data verify the effectiveness of the proposed algorithm.
Zhu, Bohui; Ding, Yongsheng; Hao, Kuangrong
2013-01-01
This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. This diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering of ECG arrhythmias. First, the raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and the waveform of the ECG signal is detected; then, features are extracted from the ECG signal to cluster different types of arrhythmias by the IEMMC algorithm. Three performance indicators (sensitivity, specificity, and accuracy) are used to assess the IEMMC method on ECG arrhythmias. Compared with K-means and iterSVR algorithms, the IEMMC algorithm shows better performance not only in clustering result but also in terms of global search ability and convergence ability, which proves its effectiveness for the detection of ECG arrhythmias. PMID:23690875
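For readers unfamiliar with the three indicators named above, the following minimal sketch computes them from the counts of a binary confusion matrix using their standard definitions. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts.

    tp/fn: arrhythmic beats correctly / incorrectly labeled.
    tn/fp: normal beats correctly / incorrectly labeled.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only.
sens, spec, acc = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```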
Self-Stabilizing weight-based Clustering Algorithm for Ad hoc sensor Networks
Johnen, Colette
Colette Johnen, Le-mail: colette@lri.fr, lehuy@lri.fr June 2, 2006 Abstract Ad hoc sensor networks consist of large number
Decentralized intelligent routing algorithm in distributed database cluster
Guiduo Duan; Shuai Wu; Fei Xu; Yang Xu
2010-01-01
Cluster technology, which enables a group of computers to work closely together as a single computer, has been a booming research field in computer engineering and networking. State-of-the-art cluster applications, such as large-scale database processing, require thousands of computers to work together. However, current solutions for cluster database applications require some kinds of centralized control, such as
An improved fuzzy unequal clustering algorithm for wireless sensor network
Song Mao; Chenglin Zhao; Zheng Zhou; Yabin Ye
2011-01-01
This paper proposes a novel energy-efficient unequal clustering scheme for large-scale wireless sensor networks (WSNs) which aims to balance the node power consumption and prolong the network lifetime as long as possible. Our approach focuses on an energy-efficient clustering scheme and an inter-cluster routing protocol. On the one hand, considering some local information of each node, such as energy level, distance
A Zonal Algorithm for Clustering Ad Hoc Networks
Yuanzhu Peter Chen; Arthur L. Liestman
2003-01-01
A Mobile Ad Hoc Network (MANET) is an infrastructureless wireless network that can support highly dynamic mobile units. The multi-hop feature of a MANET suggests the use of clustering to simplify routing. Graph domination can be used in defining clusters in MANETs. A variant of dominating set which is more suitable for clustering MANETs is the weakly-connected dominating set. A
Efficient Barrier and Allreduce on IBA clusters using hardware multicast and adaptive algorithms
Amith R Mamidala; Jiuxing Liu; Dhabaleswar K Panda
Popular algorithms proposed in the literature for doing Barrier and Allreduce in clusters, such as pair-wise exchange, dissemination and gather-broadcast, do not give an optimal performance when there is skew among the nodes in the cluster. In pair-wise exchange and dissemination, all the nodes must arrive for the completion of each step. The gather-broadcast algorithm assumes a fixed tree
Accuracy and robustness of clustering algorithms for small-size applications in bioinformatics
NASA Astrophysics Data System (ADS)
Minicozzi, Pamela; Rapallo, Fabio; Scalas, Enrico; Dondero, Francesco
2008-11-01
The performance (accuracy and robustness) of several clustering algorithms is studied for linearly dependent random variables in the presence of noise. It turns out that the error percentage quickly increases when the number of observations is less than the number of variables. This situation is common in experiments with DNA microarrays. Moreover, an a posteriori criterion to choose between two discordant clustering algorithms is presented.
Longitudinally-invariant k⊥-clustering algorithms for hadron-hadron collisions
S. Catani; Yu. L. Dokshitzer; Michael H Seymour; Bryan R Webber
1993-01-01
We propose a version of the QCD-motivated "k⊥" jet-clustering algorithm for hadron-hadron collisions which is invariant under boosts along the beam directions. This leads to improved factorization properties and closer correspondence to experimental practice at hadron colliders. We examine alternative definitions of the resolution variables and cluster recombination scheme, and show that the algorithm can be implemented efficiently on a
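The boost invariance mentioned above can be illustrated with one commonly quoted form of the pair resolution variable, d_ij = min(pt_i, pt_j)^2 * dR_ij^2 with dR_ij^2 = (y_i - y_j)^2 + (phi_i - phi_j)^2, where pt is transverse momentum, y is rapidity, and phi is azimuth. This is a sketch under the assumption that this form is the one of interest; the abstract notes that alternative definitions are examined, and the function name `kt_distance` is hypothetical. A boost along the beam shifts every rapidity by the same constant, so the rapidity difference, and hence d_ij, is unchanged.

```python
from math import pi


def kt_distance(pt_i, y_i, phi_i, pt_j, y_j, phi_j):
    """Pair resolution variable min(pt_i, pt_j)^2 * dR^2 (assumed form)."""
    dphi = abs(phi_i - phi_j) % (2.0 * pi)
    if dphi > pi:                      # wrap azimuth difference into [0, pi]
        dphi = 2.0 * pi - dphi
    dr2 = (y_i - y_j) ** 2 + dphi ** 2
    return min(pt_i, pt_j) ** 2 * dr2


# Shifting both rapidities by the same amount models a longitudinal boost.
d_rest = kt_distance(10.0, 0.0, 0.0, 20.0, 1.0, 0.0)
d_boosted = kt_distance(10.0, 0.7, 0.0, 20.0, 1.7, 0.0)
```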
Jeremy Kepner; Rita Kim
2000-04-21
Clusters of galaxies are the most massive objects in the Universe and mapping their location is an important astronomical problem. This paper describes an algorithm (based on statistical signal processing methods), a software architecture (based on a hybrid layered approach) and a parallelization scheme (based on a client/server model) for finding clusters of galaxies in large astronomical databases. The Adaptive Matched Filter (AMF) algorithm presented here identifies clusters by finding the peaks in a cluster likelihood map generated by convolving a galaxy survey with a filter based on a cluster model and a background model. The method has proved successful in identifying clusters in real and simulated data. The implementation is flexible and readily executed in parallel on a network of workstations.
A memetic co-clustering algorithm for gene expression profiles and biological annotation
Nora Speer; Christian Spieth; Andreas Zell
2004-01-01
With the invention of microarrays, researchers are capable of measuring thousands of gene expression levels in parallel at various time points of the biological process. To investigate general regulatory mechanisms, biologists cluster genes based on their expression patterns. In this paper, we propose a new memetic co-clustering algorithm for expression profiles, which incorporates a priori knowledge in the form of
A N-dimensional Stochastic Control Algorithm for Electricity Asset Management on PC cluster
Vialle, Stéphane
A N-dimensional Stochastic Control Algorithm for Electricity Asset Management on PC cluster of stocks (N) and a big number of uncertainty factors. This paper introduces a distribution of a N-dimension stochastic dynamic programming application, on PC clusters and IBM Blue Gene/L super- computer. It has needed
Yannis A. Tolias; Stavros M. Panas
1998-01-01
We present an adaptive fuzzy clustering scheme for image segmentation, the adaptive fuzzy clustering/segmentation (AFCS) algorithm. In AFCS, the nonstationary nature of images is taken into account by modifying the prototype vectors as functions of the sample location in the image. The inherent high interpixel correlation is modeled using neighborhood information. A multiresolution model is utilized for estimating the spatially
A Fast General-Purpose Clustering Algorithm Based on FPGAs for High-Throughput Data Processing
A. Annovi; M. Beretta
2009-10-14
We present a fast general-purpose algorithm for high-throughput clustering of data "with a two dimensional organization". The algorithm is designed to be implemented with FPGAs or custom electronics. The key feature is a processing time that scales linearly with the amount of data to be processed. This means that clustering can be performed in pipeline with the readout, without suffering from combinatorial delays due to looping multiple times through all the data. This feature makes this algorithm especially well suited for problems where the data has high density, e.g. in the case of tracking devices working under high-luminosity condition such as those of LHC or Super-LHC. The algorithm is organized in two steps: the first step (core) clusters the data; the second step analyzes each cluster of data to extract the desired information. The current algorithm is developed as a clustering device for modern high-energy physics pixel detectors. However, the algorithm has much broader field of applications. In fact, its core does not specifically rely on the kind of data or detector it is working for, while the second step can and should be tailored for a given application. Applications can thus be foreseen to other detectors and other scientific fields ranging from HEP calorimeters to medical imaging. An additional advantage of this two steps approach is that the typical clustering related calculations (second step) are separated from the combinatorial complications of clustering. This separation simplifies the design of the second step and it enables it to perform sophisticated calculations achieving online-quality in online applications. The algorithm is general purpose in the sense that only minimal assumptions on the kind of clustering to be performed are made.
Multispectral Satellite Image Segmentation Using Fuzzy Clustering and Nonlinear Filtering Methods
Leonid P. Podenok; R. Kh. Sadykhov
2008-01-01
A segmentation method for processing multispectral satellite images based on fuzzy clustering and nonlinear filtering is presented. Three fuzzy clustering algorithms, namely Fuzzy C-means, Gustafson-Kessel, and Gath-Geva, with and without preliminary processing, have been tested. The experimental results obtained using these algorithms with and without preliminary nonlinear filtering of the source Landsat channels have shown that segmentation based on fuzzy clustering provides
Chinese Text Clustering Algorithm Based k-means
NASA Astrophysics Data System (ADS)
Yao, Mingyu; Pi, Dechang; Cong, Xiangxiang
Text clustering is an important method in text mining. This work focuses on the process of Chinese text clustering based on k-means. After some experiments, we found that the new center of a cluster was easily affected by isolated texts. The average similarity of a cluster was therefore used as a parameter and multiplied by a modulus between 0.75 and 1.25 to obtain a similarity threshold. The texts whose similarity to the original cluster center was greater than or equal to the threshold were collected as a candidate collection, and the cluster center was then updated with the center of the candidate collection. The experiments show that the improved method increased purity and F value by about 10 percent on average over the original method.
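The center-update rule described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: texts are assumed to be dense term-weight vectors, similarity is assumed to be cosine similarity, and the names `cosine` and `robust_center` as well as the fallback for an empty candidate set are invented for the example.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def robust_center(members, center, modulus=1.0):
    """Update a cluster center using only members close to the old center.

    threshold = modulus * (average similarity of members to the old center);
    the abstract reports moduli between 0.75 and 1.25.
    """
    sims = [cosine(m, center) for m in members]
    threshold = modulus * sum(sims) / len(sims)
    candidates = [m for m, s in zip(members, sims) if s >= threshold]
    if not candidates:            # assumption: fall back to all members
        candidates = members
    dim = len(center)
    return [sum(m[d] for m in candidates) / len(candidates) for d in range(dim)]


# The isolated text [0, 1] is excluded, so the center stays at [1.0, 0.0];
# a plain mean of all three members would drift toward the outlier.
new_center = robust_center([[1, 0], [1, 0], [0, 1]], center=[1, 0])
```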
The analysis of a simple k-means clustering algorithm
Tapas Kanungo; David M. Mount; Nathan S. Netanyahu; Christine D. Piatko; Ruth Silverman; Angela Y. Wu
2000-01-01
K-means clustering is a very popular clustering technique which is used in numerous applications. Given a set of n data points in R^d and an integer k, the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's
Adaptive fuzzy moving K-means clustering algorithm for image segmentation
Nor Mat Isa; Samy Salamah; Umi Ngah
2009-01-01
Image segmentation remains one of the major challenges in image analysis. Many segmentation algorithms have been developed for various applications. Unsatisfactory results have been encountered in some cases, for many existing segmentation algorithms. In this paper, we introduce three modified versions of the conventional moving k-means clustering algorithm called the fuzzy moving k-means, adaptive moving k-means and adaptive fuzzy moving
Stefan Janson; Daniel Merkle
2005-01-01
In this paper we introduce the new hybrid Particle Swarm Optimization algorithm for multi-objective optimization ClustMPSO. We combined the PSO algorithm with clustering techniques to divide all particles into several subswarms. Strategies for updating the personal best position of a particle, for selection of the neighbourhood best and for swarm dominance are proposed. The algorithm is analyzed on both artificial
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Owing to advances in sensor technology, the growing volume of medical image data makes it possible to visualize the anatomical changes in biological tissues. As a consequence, medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. At the same time, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the spatial variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
A hybrid algorithm for clustering of time series data based on affinity search technique.
Aghabozorgi, Saeed; Ying Wah, Teh; Herawan, Tutut; Jalab, Hamid A; Shaygan, Mohammad Amin; Jalali, Alireza
2014-01-01
Time series clustering is an important solution to various problems in numerous fields of research, including business, medical science, and finance. However, conventional clustering algorithms are not practical for time series data because they are essentially designed for static data. This impracticality results in poor clustering accuracy in several systems. In this paper, a new hybrid clustering algorithm is proposed based on the similarity in shape of time series data. Time series data are first grouped as subclusters based on similarity in time. The subclusters are then merged using the k-Medoids algorithm based on similarity in shape. This model has two contributions: (1) it is more accurate than other conventional and hybrid approaches and (2) it determines the similarity in shape among time series data with a low complexity. To evaluate the accuracy of the proposed model, the model is tested extensively using synthetic and real-world time series datasets. PMID:24982966
A Fast Density-Based Clustering Algorithm for Real-Time Internet of Things Stream
Ying Wah, Teh
2014-01-01
Data streams are continuously generated over time from Internet of Things (IoT) devices. The faster all of this data is analyzed, its hidden trends and patterns discovered, and new strategies created, the faster action can be taken, creating greater value for organizations. Density-based methods are a prominent class of methods for clustering data streams. They can detect clusters of arbitrary shape, handle outliers, and do not need the number of clusters in advance. Therefore, a density-based clustering algorithm is a suitable choice for clustering IoT streams. Recently, several density-based algorithms have been proposed for clustering data streams. However, density-based clustering in limited time is still a challenging issue. In this paper, we propose a density-based clustering algorithm for IoT streams. The method has a processing time fast enough to be applicable in real-time applications of IoT devices. Experimental results show that the proposed approach obtains high quality results with low computation time on real and synthetic datasets. PMID:25110753
H. Kim; C. Ho; J. Kim
2008-01-01
This study presents the pattern classification of tropical cyclone (TC) tracks over the western North Pacific (WNP) basin during the typhoon season (June through October) for 1965-2006 (42 years in total) using a fuzzy clustering method. After applying the fuzzy c-means clustering algorithm to the TC trajectories, each interpolated into 20 segments of equal length, we divided the tracks into 7 patterns.
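Interpolating every track into the same number of segments gives each trajectory a fixed-length representation that a clustering algorithm can compare directly. The sketch below shows one plausible way to do this; it is an assumption-laden illustration, not the authors' procedure. It treats positions as planar (x, y) pairs and uses linear interpolation along the cumulative path length, whereas real TC tracks are in longitude/latitude. The name `resample_track` is hypothetical.

```python
from math import hypot


def resample_track(points, n_segments=20):
    """Return n_segments + 1 points equally spaced in path length.

    points: list of at least two (x, y) positions along a track.
    """
    # Cumulative path length at each original point.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        cum.append(cum[-1] + hypot(x1 - x0, y1 - y0))
    total = cum[-1]

    out = []
    k = 0  # index of the original segment containing the current target
    for i in range(n_segments + 1):
        target = total * i / n_segments
        while k < len(points) - 2 and cum[k + 1] < target:
            k += 1
        seg = cum[k + 1] - cum[k]
        t = (target - cum[k]) / seg if seg else 0.0
        (x0, y0), (x1, y1) = points[k], points[k + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out


track = resample_track([(0, 0), (4, 0), (4, 4)], n_segments=4)
```

Flattening the resampled points of each track into one vector of 2 * (n_segments + 1) numbers is one straightforward way to obtain inputs for fuzzy c-means.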
Algorithms for Node Clustering in Wireless Sensor Networks: A Survey
P. Kumarawadu; D. J. Dechene; M. Luccini; A. Sauer
2008-01-01
The self-organizational ability of ad-hoc Wireless Sensor Networks (WSNs) has made them the most popular choice in ubiquitous computing. Clustering sensor nodes and organizing them hierarchically have proven to be an effective method to provide better data aggregation and scalability for the sensor network while conserving limited energy. In this paper we provide a comprehensive analysis of clustering
Towards clustering algorithms in wireless sensor networks: a survey
Congfeng Jiang; Daomin Yuan; Yinghui Zhao
2009-01-01
Wireless sensor networks (WSNs) are emerging as essential and popular ways of providing pervasive computing environments for various applications. In all these environments energy constraint is the most critical problem that must be considered. Clustering is introduced to WSNs because of its network scalability, energy-saving attributes and network topology stabilities. However, there also exist some disadvantages associated with individual clustering
Fuzzy Partitioning Using Real Coded Variable Length Genetic Algorithm for Pixel Classification
Bandyopadhyay, Sanghamitra
are compared with those obtained using the well known fuzzy C-means algorithm. Keywords: cluster validity in the image is made. In this article, we model the problem of unsupervised classification, or segmentation, of satellite image data as one of fuzzy clustering in the intensity domain of the different bands of the image
Xiaohe Li; Taiyi Zhang; Zhan Qu
2008-01-01
Image segmentation is an essential processing step for many image analysis applications. In this paper, a novel image segmentation algorithm using fuzzy C-means clustering (FCM) with spatial constraints based on Markov random field (MRF) via Bayesian theory is proposed. Due to disregard of spatial constraint information, the FCM algorithm fails to segment images corrupted by noise. In order to improve
Unsupervised Fuzzy Clustering of Multi-variate Image Data Klaus Baggesen Hilger and Allan Aasbjerg spectral-spatial characteristics. By applying the fuzzy c-means (FCM) algorithm, [1], we are able to segment an image into meaningful regions. For a given number of classes, the algorithm estimates
A MinMax spatial clustering algorithm under the complex geography environment
NASA Astrophysics Data System (ADS)
Guo, Maoyun; Zhang, Zhifen; Hu, Youqiang; Qu, Jianfeng; Chai, Yi
2007-12-01
The spatial clustering analysis of geographic entities in a complex geographic environment is important for applications such as natural resource analysis, public facility siting, district adjustment, and epidemic prevention. The cluster centers obtained from the analysis can serve as candidate locations for public facilities such as schools and hospitals. Such a center can also indicate the source of an epidemic, which supports epidemic prevention. The MinMax clustering algorithm is based on the Euclidean distance. But in a complex geographic environment that includes rivers, mountains, and so on, the Euclidean distance cannot reflect the relationship between one entity and another. So, by analyzing the influence of roads, rivers, and mountains on the distance between entities, the paper introduces factors that reflect the geographic environment and improves the MinMax clustering algorithm. The paper also carries out a clustering experiment whose result shows that the improved MinMax clustering algorithm is better than the basic one.
An Efficient k-Means Clustering Algorithm: Analysis and Implementation
Tapas Kanungo; David M. Mount; Nathan S. Netanyahu; Christine D. Piatko; Ruth Silverman; Angela Y. Wu
2002-01-01
In k-means clustering, we are given a set of n data points in d-dimensional space R^d and an integer k and the problem is to determine a set of k points in R^d, called centers, so as to minimize the mean squared distance from each data point to its nearest center. A popular heuristic for k-means clustering is Lloyd's (1982)
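The Lloyd heuristic named at the end of this truncated abstract alternates two steps: assign each point to its nearest center, then move each center to the mean of its assigned points. The sketch below is the plain textbook iteration in pure Python, written for illustration; it is not the efficient implementation the paper analyzes, and the function name and the rule of leaving an empty cluster's center unchanged are assumptions.

```python
def lloyd_kmeans(points, centers, n_iter=100):
    """Run Lloyd's iteration from the given initial centers.

    points, centers: sequences of equal-length numeric tuples.
    Returns the final list of centers.
    """
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: sqdist(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its previous center).
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
            for cl, c in zip(clusters, centers)
        ]
        if new_centers == centers:   # converged: assignments are stable
            break
        centers = new_centers
    return centers


final = lloyd_kmeans([(0, 0), (0, 2), (10, 0), (10, 2)], [(0, 0), (10, 0)])
```

Each iteration cannot increase the mean squared distance, but the result depends on the initial centers and may be only a local minimum.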
Accelerating Bayesian Hierarchical Clustering of Time Series Data with a Randomised Algorithm
Darkins, Robert; Cooke, Emma J.; Ghahramani, Zoubin; Kirk, Paul D. W.; Wild, David L.; Savage, Richard S.
2013-01-01
We live in an era of abundant data. This has necessitated the development of new and innovative statistical algorithms to get the most from experimental data. For example, faster algorithms make practical the analysis of larger genomic data sets, allowing us to extend the utility of cutting-edge statistical methods. We present a randomised algorithm that accelerates the clustering of time series data using the Bayesian Hierarchical Clustering (BHC) statistical method. BHC is a general method for clustering any discretely sampled time series data. In this paper we focus on a particular application to microarray gene expression data. We define and analyse the randomised algorithm, before presenting results on both synthetic and real biological data sets. We show that the randomised algorithm leads to substantial gains in speed with minimal loss in clustering quality. The randomised time series BHC algorithm is available as part of the R package BHC, which is available for download from Bioconductor (version 2.10 and above) via http://bioconductor.org/packages/2.10/bioc/html/BHC.html. We have also made available a set of R scripts which can be used to reproduce the analyses carried out in this paper. These are available from the following URL. https://sites.google.com/site/randomisedbhc/. PMID:23565168
A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics.
Oyana, Tonny J
2010-01-01
The central purpose of this study is to further evaluate the quality of the performance of a new algorithm. The study provides additional evidence on this algorithm that was designed to increase the overall efficiency of the original k-means clustering technique: the Fast, Efficient, and Scalable k-means algorithm (FES-k-means). The FES-k-means algorithm uses a hybrid approach that comprises the k-d tree data structure that enhances the nearest neighbor query, the original k-means algorithm, and an adaptation rate proposed by Mashor. This algorithm was tested using two real datasets and one synthetic dataset. It was employed twice on all three datasets: once on data trained by the innovative MIL-SOM method and then on the actual untrained data in order to evaluate its competence. This two-step approach of data training prior to clustering provides a solid foundation for knowledge discovery and data mining, otherwise unclaimed by clustering methods alone. The benefits of this method are that it produces clusters similar to the original k-means method at a much faster rate as shown by runtime comparison data; and it provides efficient analysis of large geospatial data with implications for disease mechanism discovery. From a disease mechanism discovery perspective, it is hypothesized that the linear-like pattern of elevated blood lead levels discovered in the city of Chicago may be spatially linked to the city's water service lines. PMID:20689710
Bowyer, Kevin W.
and Gabor multi-channel filtering. Three segmentation algorithms are considered: fuzzy c-means clustering Gabor filtering. Results are evaluated by comparing the segmented images with those obtained Evaluation of Texture Segmentation Algorithms Kyong I. Chang, Kevin W. Bowyer and Munish
NASA Astrophysics Data System (ADS)
Bansod, Babankumar S.; Pandey, O. P.
2013-05-01
Within-field variability is a well-known phenomenon and its study is at the centre of precision agriculture (PA). In this paper, site-specific spatial variability (SSSV) of apparent Electrical Conductivity (ECa) and crop yield, apart from pH, moisture, temperature and di-electric constant information, was analyzed to construct spatial distribution maps. Principal component analysis (PCA) and the fuzzy c-means (FCM) clustering algorithm were then applied to delineate management zones (MZs). Various performance indices such as Normalized Classification Entropy (NCE) and Fuzzy Performance Index (FPI) were calculated to determine the clustering performance. The geo-referenced sensor data was analyzed for within-field classification. Results revealed that the variables could be aggregated into MZs that characterize spatial variability in soil chemical properties and crop productivity. The resulting classified MZs showed favorable agreement between ECa and crop yield variability pattern. This enables a reduction in the number of soil analyses needed to create application maps for certain cultivation operations.
An efficient clustering algorithm for partitioning Y-short tandem repeats data
2012-01-01
Background Y-Short Tandem Repeats (Y-STR) data consist of many similar and almost similar objects. This characteristic of Y-STR data causes two problems with partitioning: non-unique centroids and local minima problems. As a result, the existing partitioning algorithms produce poor clustering results. Results Our new algorithm, called k-Approximate Modal Haplotypes (k-AMH), obtains the highest clustering accuracy scores for five out of six datasets, and produces an equal performance for the remaining dataset. Furthermore, clustering accuracy scores of 100% are achieved for two of the datasets. The k-AMH algorithm records the highest mean accuracy score of 0.93 overall, compared to that of other algorithms: k-Population (0.91), k-Modes-RVF (0.81), New Fuzzy k-Modes (0.80), k-Modes (0.76), k-Modes-Hybrid 1 (0.76), k-Modes-Hybrid 2 (0.75), Fuzzy k-Modes (0.74), and k-Modes-UAVM (0.70). Conclusions The partitioning performance of the k-AMH algorithm for Y-STR data is superior to that of other algorithms, owing to its ability to solve the non-unique centroids and local minima problems. Our algorithm is also efficient in terms of time complexity, which is recorded as O(km(n-k)) and considered to be linear. PMID:23039132
Thai Sign Language Translation Using Fuzzy C-Means and Scale Invariant Feature Transform
Suwannee Phitakwinai; Sansanee Auephanwiriyakul; Nipon Theera-umpon
2008-01-01
Visual communication is important for a deaf and/or mute person. It is also one of the tools for the communication between humans and machines. In this paper, we develop an automatic Thai finger-spelling sign language translation system using Fuzzy C-Means (FCM) and Scale Invariant Feature Transform (SIFT) algorithms. We collect key frames from several subjects at different times of day
Cui, Jie; Loewy, John; Kendall, Edward J
2006-05-01
Analysis of synovial fluid by infrared (IR) clinical chemistry requires expert interpretation and is susceptible to subjective error. The application of automated pattern recognition (APR) may enhance the utility of IR analysis. Here, we describe an APR method based on the fuzzy C-means cluster adaptive wavelet (FCMC-AW) algorithm, which consists of two parts: one is a FCMC using the features from an M-band feature extractor adopting the adaptive wavelet algorithm and the second is a Bayesian classifier using the membership matrix generated by the FCMC. A FCMC-cross-validated quadratic probability measure (FCMC-CVQPM) criterion is used under the assumption that the class probability density is equal to the value of the membership matrix. Therefore, both values of posterior probabilities and selection criterion MFQ can be obtained through the membership matrix. The distinctive advantage of this method is that it provides not only the 'hard' classification of a new pattern, but also the confidence of this classification, which is reflected by the membership matrix. PMID:16686402
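The idea that a fuzzy membership matrix yields both a hard class and a confidence for that class can be illustrated with the standard fuzzy c-means membership formula, u_i = (1/d_i^2)^(1/(m-1)) / sum_j (1/d_j^2)^(1/(m-1)), where d_i is the distance from a sample to center i and m > 1 is the fuzzifier. The sketch below is generic FCM, not the FCMC-AW algorithm itself; the function name, the fuzzifier default m = 2, and the example centers are assumptions.

```python
def fcm_memberships(x, centers, m=2.0):
    """Fuzzy c-means memberships of sample x to each center (they sum to 1)."""
    d2 = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centers]
    if 0.0 in d2:
        # The sample coincides with one or more centers: share membership
        # equally among those centers and give zero to the rest.
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]
    power = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** power for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (3.0, 0.0)])
hard_label = max(range(len(u)), key=u.__getitem__)   # index of the best class
confidence = u[hard_label]                           # its membership value
```

Here the hard label is the class with the largest membership, and that membership value serves as a rough confidence in the assignment.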
An effective software pipelining algorithm for clustered embedded VLIW processors
C. Akturan; M. F. Jacome
2002-01-01
This paper proposes a software pipelining framework, CALiBeR (Cluster Aware Load Balancing Retiming Algorithm), suitable for compilers targeting clustered embedded VLIW processors. CALiBeR can be used by embedded system designers to explore different code optimization alternatives, that is, high-quality customized retiming solutions for desired throughput and program memory size requirements, while minimizing register pressure. An extensive set of experimental results
A reliable cluster detection technique using photometric redshifts: introducing the 2TecX algorithm
NASA Astrophysics Data System (ADS)
van Breukelen, Caroline; Clewley, Lee
2009-06-01
We present a new cluster detection algorithm designed for finding high-redshift clusters using optical/infrared imaging data. The algorithm has two main characteristics. First, it utilizes each galaxy's full redshift probability function, instead of an estimate of the photometric redshift based on the peak of the probability function and an associated Gaussian error. Second, it identifies cluster candidates through cross-checking the results of two substantially different selection techniques (the name 2TecX representing the cross-check of the two techniques). These are adaptations of the Voronoi Tessellations and Friends-Of-Friends methods. Monte Carlo simulations of mock catalogues show that cross-checking the cluster candidates found by the two techniques significantly reduces the detection of spurious sources. Furthermore, we examine the selection effects and relative strengths and weaknesses of either method. The simulations also allow us to fine-tune the algorithm's parameters, and define completeness and mass limit as a function of redshift. We demonstrate that the algorithm isolates high-redshift clusters at a high level of efficiency and low contamination.
An Efficient Method of Key-Frame Extraction Based on a Cluster Algorithm
Zhang, Qiang; Yu, Shao-Pei; Zhou, Dong-Sheng; Wei, Xiao-Peng
2013-01-01
This paper proposes a novel method of key-frame extraction for use with motion capture data. This method is based on an unsupervised cluster algorithm. First, the motion sequence is clustered into two classes by the similarity distance of the adjacent frames so that the thresholds needed in the next step can be determined adaptively. Second, a dynamic cluster algorithm called ISODATA is used to cluster all the frames and the frames nearest to the center of each class are automatically extracted as key-frames of the sequence. Unlike many other clustering techniques, the present improved cluster algorithm can automatically address different motion types without any need for specified parameters from users. The proposed method is capable of summarizing motion capture data reliably and efficiently. The present work also provides a meaningful comparison between the results of the proposed key-frame extraction technique and other previous methods. These results are evaluated in terms of metrics that measure reconstructed motion and the mean absolute error value, which are derived from the reconstructed data and the original data. PMID:24511336
A Multi-Hop Energy Efficient Clustering Algorithm in Routing Protocol for Wireless Sensor Networks
Xiang Gao; Cuiqin Duan; Jingjing Sun; Yintang Yang
2009-01-01
Wireless sensor networks (WSN) including large quantity of small sensor nodes with low-power energy consumption can effectively extend its life-span. However, the performance of WSN is affected by the environment. In this paper, we devise a new multi-hop routing algorithm clustering sensor nodes into groups to minimize the total energy consumption of the WSN. The proposed algorithm optimizes the intensity
Partitioning clustering algorithms for protein sequence data sets
Sondes Fayech; Nadia Essoussi; Mohamed Limam
2009-01-01
BACKGROUND: Genome-sequencing projects are currently producing an enormous amount of new sequences and cause the rapid increasing of protein sequence databases. The unsupervised classification of these data into functional groups or families, clustering, has become one of the principal research objectives in structural and functional genomics. Computer programs to automatically and accurately classify sequences into families become a necessity. A
Algorithms for clustering expressed sequence tags: the wcd tool
Scott Hazelhurst
2008-01-01
Understanding which genes are active, and when and why, is an important question for molecular biology. Expressed Sequence Tags (ESTs) are a technology used to explore the transcriptome (a record of this gene activity). ESTs are short fragments of DNA created in the laboratory from mRNA extracted from a cell. The key computational step in their processing is clustering: putting
Subquadratic Approximation Algorithms for Clustering Problems in High Dimensional Spaces
Allan Borodin; Rafail Ostrovsky; Yuval Rabani
2004-01-01
One of the central problems in information retrieval, data mining, computational biology, statistical analysis, computer vision, geographic analysis, pattern recognition, distributed protocols is the question of classification of data according to some clustering rule. Often the data is noisy and even approximate classification is of extreme importance. The difficulty of such classification stems from the fact that usually the data
Particle swarm optimization algorithm and its application to clustering analysis
Ching-Yi Chen; Fun Ye
2004-01-01
Clustering analysis is applied generally to pattern recognition, color quantization and image classification. It can help the user to distinguish the structure of data and simplify the complexity of data from mass information. The user can understand the implied information behind extracting these data. In real case, the distribution of information can be any size and shape. A particle swarm
Clustering by fuzzy neural gas and evaluation of fuzzy clusters.
Geweniger, Tina; Fischer, Lydia; Kaden, Marika; Lange, Mandy; Villmann, Thomas
2013-01-01
We consider some modifications of the neural gas algorithm. First, fuzzy assignments as known from fuzzy c-means and neighborhood cooperativeness as known from self-organizing maps and neural gas are combined to obtain a basic Fuzzy Neural Gas. Further, a kernel variant and a simulated annealing approach are derived. Finally, we introduce a fuzzy extension of the ConnIndex to obtain an evaluation measure for clusterings based on fuzzy vector quantization. PMID:24396342
SS-ClusterTree: a subspace clustering based indexing algorithm over high-dimensional image features
Hongli Xu; Dantong Yu; De Xu; Aidong Zhang
2008-01-01
The rapid growth in the volume of image and video data collections motivates the research of building an index structure in image information retrieval. Constructing an index in the image database poses a very challenging problem due to the facts of image databases containing data with high dimensions, and lack of domain knowledge. ClusterTree is an indexing approach representing clusters
Neural-network-assisted genetic algorithm applied to silicon clusters
Marim, L.R.; Lemes, M.R.; Pino, A. Jr. dal [Department of Physics, Instituto Tecnologico de Aeronautica, Pca. Marechal Eduardo Gomes, 50-Sao Jose dos Campos, Sao Paulo 12228-900 (Brazil)
2003-03-01
Recently, a new optimization procedure that combines the power of artificial neural networks with the versatility of the genetic algorithm (GA) was introduced. This method, called neural-network-assisted genetic algorithm (NAGA), uses a neural network to restrict the search space and it is expected to speed up the solution of global optimization problems if some previous information is available. In this paper, we have tested NAGA to determine the ground-state geometry of Si_n (10 ≤ n ≤ 15) according to a tight-binding total-energy method. Our results indicate that NAGA was able to find the desired global minimum of the potential energy for all the test cases and it was at least ten times faster than pure genetic algorithm.
Fuzzy clustering in parallel universes
Bernd Wiswedel; Michael R. Berthold
2007-01-01
We present an extension of the fuzzy c-Means algorithm, which operates simultaneously on different feature spaces (so-called parallel universes) and also incorporates noise detection. The method assigns membership values of patterns to different universes, which are then adopted throughout the training. This leads to better clustering results since patterns not contributing to clustering in a universe are (completely or partially) ignored.
Law, Bonnie N.F.
IEEE Transactions on Fuzzy Systems, vol. 13, no. 4, August 2005, p. 444. ... and N. F. Law, Member, IEEE. Abstract: An image segmentation algorithm based on adaptive fuzzy c-means ... the effectiveness of the proposed algorithm. Index Terms: fuzzy clustering, image segmentation, prototype adaptation
A cluster finding algorithm based on the multiband identification of red sequence galaxies
NASA Astrophysics Data System (ADS)
Oguri, Masamune
2014-10-01
We present a new algorithm, CAMIRA, to identify clusters of galaxies in wide-field imaging survey data. We base our algorithm on the stellar population synthesis model to predict colours of red sequence galaxies at a given redshift for an arbitrary set of bandpass filters, with additional calibration using a sample of spectroscopic galaxies to improve the accuracy of the model prediction. We run the algorithm on 11 960 deg² of imaging data from the Sloan Digital Sky Survey (SDSS) Data Release 8 to construct a catalogue of 71 743 clusters in the redshift range 0.1 < z < 0.6 with richness after correcting for the incompleteness of the richness estimate greater than 20. We cross-match the cluster catalogue with external cluster catalogues to find that our photometric cluster redshift estimates are accurate with low bias and scatter, and that the corrected richness correlates well with X-ray luminosities and temperatures. We use the publicly available Canada-France-Hawaii Telescope Lensing Survey shear catalogue to calibrate the mass-richness relation from stacked weak lensing analysis. Stacked weak lensing signals are detected significantly for eight subsamples of the SDSS clusters divided by redshift and richness bins, which are then compared with model predictions including miscentring effects to constrain mean halo masses of individual bins. We find the richness correlates well with the halo mass, such that the corrected richness limit of 20 corresponds to the cluster virial mass limit of about 1 × 10¹⁴ h⁻¹ M⊙ for the SDSS DR8 cluster sample.
Experimental realization of the Deutsch-Jozsa algorithm with a six-qubit cluster state
Vallone, Giuseppe [Museo Storico della Fisica e Centro Studi e Ricerche Enrico Fermi, Via Panisperna 89/A, Compendio del Viminale, IT-00184 Roma (Italy); Dipartimento di Fisica, Universita Sapienza di Roma, IT-00185 Roma (Italy); Donati, Gaia; Bruno, Natalia; Chiuri, Andrea [Dipartimento di Fisica, Universita Sapienza di Roma, IT-00185 Roma (Italy); Mataloni, Paolo [Dipartimento di Fisica, Universita Sapienza di Roma, IT-00185 Roma (Italy); Istituto Nazionale di Ottica (INO-CNR), L.go E. Fermi 6, IT-50125 Florence (Italy)
2010-05-15
We describe an experimental realization of the Deutsch-Jozsa quantum algorithm to evaluate the properties of a two-bit Boolean function in the framework of one-way quantum computation. For this purpose, a two-photon six-qubit cluster state was engineered. Its peculiar topological structure is the basis of the original measurement pattern allowing the algorithm realization. The good agreement of the experimental results with the theoretical predictions, obtained at ~1 kHz success rate, demonstrates the correct implementation of the algorithm.
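As a reading aid for the entry above, the following minimal Python sketch shows what the Deutsch-Jozsa algorithm decides for a two-bit Boolean function. It is a classical calculation of the textbook circuit-model amplitude (Hadamards, phase oracle, Hadamards), not the one-way cluster-state measurement pattern used in the experiment. The function names `zero_state_amplitude` and `classify` are hypothetical and chosen only for illustration.

```python
def zero_state_amplitude(f, n):
    """Amplitude of |0...0> after H^n, a phase oracle (-1)^f(x), and H^n.

    Assumes the textbook phase-oracle form of Deutsch-Jozsa; f maps an
    integer in [0, 2**n) to 0 or 1.
    """
    return sum((-1) ** f(x) for x in range(2 ** n)) / 2 ** n


def classify(f, n):
    """Return 'constant' or 'balanced', assuming f satisfies the promise."""
    # For a constant f the amplitude is +1 or -1; for a balanced f it is 0.
    return "constant" if abs(zero_state_amplitude(f, n)) > 0.5 else "balanced"


def constant_zero(x):
    return 0


def parity_of_low_bit(x):
    return x & 1


result_constant = classify(constant_zero, 2)
result_balanced = classify(parity_of_low_bit, 2)
```

A single evaluation of the amplitude separates the two cases, whereas a classical deterministic check of a two-bit function may need three queries.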
Experimental Realization of the Deutsch-Jozsa Algorithm with a Six-Qubit Cluster State
Giuseppe Vallone; Gaia Donati; Natalia Bruno; Andrea Chiuri; Paolo Mataloni
2010-03-24
We describe the first experimental realization of the Deutsch-Jozsa quantum algorithm to evaluate the properties of a 2-bit boolean function in the framework of one-way quantum computation. For this purpose a novel two-photon six-qubit cluster state was engineered. Its peculiar topological structure is the basis of the original measurement pattern allowing the algorithm realization. The good agreement of the experimental results with the theoretical predictions, obtained at ~1 kHz success rate, demonstrates the correct implementation of the algorithm.
Simulation of a modified Hubbard model with a chemical potential using a meron-cluster algorithm
J. C. Osborn
2002-09-09
We show how a variant of the Hubbard model can be simulated using a meron-cluster algorithm. This provides a major improvement in our ability to determine the behavior of these types of models. We also present some results that clearly demonstrate the existence of a superconducting state in this model.
Conceptual Clustering Using Lingo Algorithm: Evaluation on Open Directory Project Data
Stanislaw Osinski; Dawid Weiss
2004-01-01
Search results clustering problem is defined as an automatic, on-line grouping of similar documents in a search hits list, returned from a search engine. In this paper we present the results of an experimental evaluation of a new algorithm named Lingo. We use Open Directory Project as a source of high-quality narrow- topic document references and mix them into several
An ideal seed non-hierarchical clustering algorithm for cellular manufacturing
M. P. Chandrasekharan; R. RAJAGOPALAN
1986-01-01
This paper describes the development of a non-heuristic algorithm for solving group technology problems. The problem is first formulated as a bipartite graph, and then an expression for the upper limit to the number of groups is derived. Using this limit, a non-hierarchical clustering method is adopted for grouping components into families and machines into cells. After diagonally correlating the
spDB presentation SCUBA: Scalable Cluster-Based Algorithm for
Kaiserslautern, Universität
... concerts, migrating birds, etc. [Slide outline: the idea and motivation; SCUBA algorithm; optimisation; results; conclusion; related work and contributions; evaluation.] Optimisation: moving-cluster-based load shedding can further decrease the number of joins
Application of Ant Colony Clustering Algorithm in Discrimination the Origin of Longjing Tea
Xian Zhang; Huajun Zheng; Xin Qiao
2010-01-01
The existence of fake tea from outside the region of origin seriously harms the credibility and sales of genuine Longjing tea. To reduce this impact, we propose a technique that uses an ant colony clustering algorithm to discriminate the origin of Longjing tea. We then acquired and comprehensively analyzed the characteristics of the genuine tea, the 16 parameters of the images and spectra
Multiobjective Particle Swarm Algorithm With Fuzzy Clustering for Electrical Power Dispatch
Shubham Agrawal; B. K. Panigrahi; Manoj Kumar Tiwari
2008-01-01
Economic dispatch is a highly constrained optimization problem encompassing interaction among decision variables. Environmental concerns that arise due to the operation of fossil-fuel-fired electric generators transform the classical problem into multiobjective environmental/economic dispatch (EED). In this paper, a fuzzy clustering-based particle swarm (FCPSO) algorithm has been proposed to solve the highly constrained EED problem involving conflicting objectives. FCPSO
Tame, M. S.; Kim, M. S. [QOLS, Blackett Laboratory, Imperial College London, Prince Consort Road, SW7 2BW, United Kingdom and Institute for Mathematical Sciences, Imperial College London, SW7 2PG (United Kingdom)
2010-09-15
We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate general n-qubit versions and a scalable method to produce them. For this purpose, we propose a versatile photonic on-chip setup.
SOME CLUSTERING ALGORITHMS TO ENHANCE THE PERFORMANCE OF THE NETWORK INTRUSION DETECTION SYSTEM
Mrutyunjaya Panda; Manas Ranjan Patra
2008-01-01
Most current intrusion detection systems are signature based ones or machine learning based methods. Despite the number of machine learning algorithms applied to KDD 99 cup, none of them have introduced a pre-model to reduce the huge information quantity present in the different KDD 99 datasets. Clustering is an important task in mining evolving data streams. Besides the limited memory
An Evaluation of ISOCLS and Classy Clustering Algorithms for Forest Classification in Northern Idaho
Lee F. Werth
1981-01-01
This paper represents one of many studies funded under AgRISTARS to determine how Landsat can contribute to the Forest Service's Renewable Resource Inventory effort as mandated by the National Forest Management Act of 1976. The specific objective of this study was to test clustering algorithms used at Johnson Space Flight Center - ISOCLS and CLASSY.
The Application of a Simulated Annealing Fuzzy Clustering Algorithm for Cancer Diagnosis
Aickelin, Uwe
Xiao Ying ... Spectroscopy (FTIR) is becoming a powerful tool for use in the study of biomedical conditions, including cancer diagnosis. As part of an ongoing programme of research into the potential early diagnosis of cervical cancer
Unsupervised unstained cell detection by SIFT keypoint clustering and self-labeling algorithm.
Muallal, Firas; Schöll, Simon; Sommerfeldt, Björn; Maier, Andreas; Steidl, Stefan; Buchholz, Rainer; Hornegger, Joachim
2014-01-01
We propose a novel unstained cell detection algorithm based on unsupervised learning. The algorithm utilizes the scale invariant feature transform (SIFT), a self-labeling algorithm, and two clustering steps in order to achieve high performance in terms of time and detection accuracy. Unstained cell imaging is dominated by phase contrast and bright field microscopy. Therefore, the algorithm was assessed on images acquired using these two modalities. Five cell lines having in total 37 images and 7250 cells were considered for the evaluation: CHO, L929, Sf21, HeLa, and Bovine cells. The obtained F-measures were between 85.1 and 89.5. Compared to the state-of-the-art, the algorithm achieves very close F-measure to the supervised approaches in much less time. PMID:25320822
Scalable Algorithms for Molecular Dynamics Simulations on Commodity Clusters
Kevin J. Bowers; Edmond Chow; Huafeng Xu; Ron O. Dror; Michael P. Eastwood; Brent A. Gregersen; John L. Klepeis; Istvan Kolossvary; Mark A. Moraes; Federico D. Sacerdoti; John K. Salmon; Yibing Shan; David E. Shaw
2006-01-01
Although molecular dynamics (MD) simulations of biomolecular systems often run for days to months, many events of great scientific interest and pharmaceutical relevance occur on long time scales that remain beyond reach. We present several new algorithms and implementation techniques that significantly accelerate parallel MD simulations compared with current state-of-the-art codes. These include a novel parallel decomposition method and message-passing
Numerical convergence and interpretation of the fuzzy c-shells clustering algorithm.
Bezdek, J C; Hathaway, R J
1992-01-01
R. N. Dave's (1990) version of fuzzy c-shells is an iterative clustering algorithm which requires the application of Newton's method or a similar general optimization technique at each half step in any sequence of iterates for minimizing the associated objective function. An important computational question concerns the accuracy of the solution required at each half step within the overall iteration. The general convergence theory for grouped coordinate minimization is applied to this question to show that numerically exact solution of the half-step subproblems in Dave's algorithm is not necessary. One iteration of Newton's method in each coordinate minimization half step yields a sequence obtained using the fuzzy c-shells algorithm with numerically exact coordinate minimization at each half step. It is shown that fuzzy c-shells generates hyperspherical prototypes to the clusters it finds for certain special cases of the measure of dissimilarity used. PMID:18276477
NASA Astrophysics Data System (ADS)
Quintanilla-Domínguez, Joel; Ojeda-Magaña, Benjamín; Marcano-Cedeño, Alexis; Cortina-Januchs, María G.; Vega-Corona, Antonio; Andina, Diego
2011-12-01
A new method for detecting microcalcifications in regions of interest (ROIs) extracted from digitized mammograms is proposed. The top-hat transform is a technique based on mathematical morphology operations and, in this paper, is used to perform contrast enhancement of the microcalcifications. To improve microcalcification detection, a novel image sub-segmentation approach based on the possibilistic fuzzy c-means algorithm is used. From the original ROIs, window-based features, such as the mean and standard deviation, were extracted; these features were used as an input vector in a classifier. The classifier is based on an artificial neural network to identify patterns belonging to microcalcifications and healthy tissue. Our results show that the proposed method is a good alternative for automatically detecting microcalcifications, because this stage is an important part of early breast cancer detection.
Anindya Bhattacharya; Rajat K. De
2010-01-01
Distance based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But
FctClus: A Fast Clustering Algorithm for Heterogeneous Information Networks.
Yang, Jing; Chen, Limin; Zhang, Jianpei
2015-01-01
It is important to cluster heterogeneous information networks. A fast clustering algorithm based on an approximate commute time embedding for heterogeneous information networks with a star network schema is proposed in this paper by utilizing the sparsity of heterogeneous information networks. First, a heterogeneous information network is transformed into multiple compatible bipartite graphs from the compatible point of view. Second, the approximate commute time embedding of each bipartite graph is computed using random mapping and a linear time solver. All of the indicator subsets in each embedding simultaneously determine the target dataset. Finally, a general model is formulated by these indicator subsets, and a fast algorithm is derived by simultaneously clustering all of the indicator subsets using the sum of the weighted distances for all indicators for an identical target object. The proposed fast algorithm, FctClus, is shown to be efficient and generalizable and exhibits high clustering accuracy and fast computation speed based on a theoretic analysis and experimental verification. PMID:26090857
Scalable fault tolerant algorithms for linear-scaling coupled-cluster electronic structure methods.
Leininger, Matthew L.; Nielsen, Ida Marie B.; Janssen, Curtis L.
2004-10-01
By means of coupled-cluster theory, molecular properties can be computed with an accuracy often exceeding that of experiment. The high-degree polynomial scaling of the coupled-cluster method, however, remains a major obstacle in the accurate theoretical treatment of mainstream chemical problems, despite tremendous progress in computer architectures. Although it has long been recognized that this super-linear scaling is non-physical, the development of efficient reduced-scaling algorithms for massively parallel computers has not been realized. We here present a locally correlated, reduced-scaling, massively parallel coupled-cluster algorithm. A sparse data representation for handling distributed, sparse multidimensional arrays has been implemented along with a set of generalized contraction routines capable of handling such arrays. The parallel implementation entails a coarse-grained parallelization, reducing interprocessor communication and distributing the largest data arrays but replicating as many arrays as possible without introducing memory bottlenecks. The performance of the algorithm is illustrated by several series of runs for glycine chains using a Linux cluster with an InfiniBand interconnect.
Cluster-Based Multipolling Sequencing Algorithm for Collecting RFID Data in Wireless LANs
NASA Astrophysics Data System (ADS)
Choi, Woo-Yong; Chatterjee, Mainak
2015-03-01
With the growing use of RFID (Radio Frequency Identification), it is becoming important to devise ways to read RFID tags in real time. Access points (APs) of IEEE 802.11-based wireless Local Area Networks (LANs) are being integrated with RFID networks that can efficiently collect real-time RFID data. Several schemes, such as multipolling methods based on the dynamic search algorithm and random sequencing, have been proposed. However, as the number of RFID readers associated with an AP increases, it becomes difficult for the dynamic search algorithm to derive the multipolling sequence in real time. Though multipolling methods can eliminate the polling overhead, we still need to enhance the performance of the multipolling methods based on random sequencing. To that extent, we propose a real-time cluster-based multipolling sequencing algorithm that drastically eliminates more than 90% of the polling overhead, particularly so when the dynamic search algorithm fails to derive the multipolling sequence in real time.
El Harchaoui, Nour-Eddine; Ait Kerroum, Mounir; Hammouch, Ahmed; Ouadou, Mohamed; Aboutajdine, Driss
2013-01-01
The analysis and processing of large data are a challenge for researchers. Several approaches have been used to model these complex data, and they are based on some mathematical theories: fuzzy, probabilistic, possibilistic, and evidence theories. In this work, we propose a new unsupervised classification approach that combines the fuzzy and possibilistic theories; our purpose is to overcome the problems of uncertain data in complex systems. We used the membership function of fuzzy c-means (FCM) to initialize the parameters of possibilistic c-means (PCM), in order to solve the problem of coinciding clusters that are generated by PCM and also overcome the sensitivity of FCM to noise. To validate our approach, we used several validity indexes and we compared them with other conventional classification algorithms: fuzzy c-means, possibilistic c-means, and possibilistic fuzzy c-means. The experiments were carried out on different synthetic data sets and real brain MR images. PMID:24489535
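The idea of seeding PCM from FCM memberships, mentioned in the entry above, can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard FCM membership formula and the commonly used Krishnapuram-Keller estimate of the PCM scale parameter, on one-dimensional data with fixed centers. The abstract does not say exactly how the authors initialize PCM, and all function names here are hypothetical.

```python
def sq_dist(x, c):
    # Squared distance with a tiny offset to avoid division by zero.
    return (x - c) ** 2 + 1e-12


def fcm_memberships(points, centers, m=2.0):
    """Standard FCM memberships u[i][j] of point j in cluster i."""
    p = 1.0 / (m - 1.0)
    return [
        [1.0 / sum((sq_dist(x, c) / sq_dist(x, k)) ** p for k in centers)
         for x in points]
        for c in centers
    ]


def pcm_scales(points, centers, u, m=2.0):
    """Per-cluster PCM scale eta_i estimated from FCM memberships (assumed form)."""
    etas = []
    for c, row in zip(centers, u):
        num = sum((uij ** m) * sq_dist(x, c) for x, uij in zip(points, row))
        den = sum(uij ** m for uij in row)
        etas.append(num / den)
    return etas


def pcm_typicalities(points, centers, etas, m=2.0):
    """PCM typicalities t[i][j]; unlike FCM they need not sum to 1 per point."""
    p = 1.0 / (m - 1.0)
    return [
        [1.0 / (1.0 + (sq_dist(x, c) / eta) ** p) for x in points]
        for c, eta in zip(centers, etas)
    ]


points = [0.0, 0.2, 0.4, 5.0, 5.2, 5.4, 50.0]  # last point is an outlier
centers = [0.2, 5.2]
u = fcm_memberships(points, centers)
etas = pcm_scales(points, centers, u)
t = pcm_typicalities(points, centers, etas)
```

In this toy run the outlier's FCM memberships still sum to 1, while its PCM typicalities are small for both clusters, which is the noise behaviour the abstract contrasts.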
A fast hierarchical clustering algorithm for large-scale protein sequence data sets.
Szilágyi, Sándor M; Szilágyi, László
2014-05-01
TRIBE-MCL is a Markov clustering algorithm that operates on a graph built from pairwise similarity information of the input data. Edge weights stored in the stochastic similarity matrix are alternately fed to the two main operations, inflation and expansion, and are normalized in each main loop to maintain the probabilistic constraint. In this paper we propose an efficient implementation of the TRIBE-MCL clustering algorithm, suitable for fast and accurate grouping of protein sequences. A modified sparse matrix structure is introduced that can efficiently handle most operations of the main loop. Taking advantage of the symmetry of the similarity matrix, a fast matrix squaring formula is also introduced to facilitate the time-consuming expansion. The proposed algorithm was tested on protein sequence databases like SCOP95. In terms of efficiency, the proposed solution improves execution speed by two orders of magnitude, compared to recently published efficient solutions, reducing the total runtime well below 1 min in the case of the 11,944 proteins of SCOP95. This improvement in computation time is reached without losing anything from the partition quality. Convergence is generally reached in approximately 50 iterations. The efficient execution enabled us to perform a thorough evaluation of classification results and to formulate recommendations regarding the choice of the algorithm's parameter values. PMID:24657908
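The main loop described in the entry above (expansion, inflation, and renormalization of a stochastic matrix) can be sketched in plain Python as follows. This is a dense, unoptimized illustration of generic Markov clustering, not the sparse implementation or the fast squaring formula proposed in the paper. The self-loop weight, the inflation value of 2.0, and all function names are assumptions made for the example.

```python
def normalize_columns(m):
    """Rescale each column to sum to 1 (the probabilistic constraint)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[v ** r for v in row] for row in m])


def mcl(similarity, inflation=2.0, iterations=50):
    n = len(similarity)
    # Add self-loops (assumed weight 1.0) before normalizing.
    m = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(iterations):
        m = inflate(expand(m), inflation)
    # Each distinct set of columns attracted to a surviving row is a cluster.
    clusters = []
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members and members not in clusters:
            clusters.append(members)
    return clusters


# Toy similarity graph: nodes 0-2 are mutually similar, as are nodes 3-4.
similarity = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
clusters = mcl(similarity)
```

The dense squaring in `expand` costs O(n³) per iteration, which is the step the paper targets with its sparse structure and symmetry-based squaring.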
Karimi, Abbas; Afsharfarnia, Abbas; Zarafshan, Faraneh; Al-Haddad, S A R
2014-01-01
The stability of clusters is a serious issue in mobile ad hoc networks. Low stability of clusters may lead to rapid failure of clusters, high energy consumption for reclustering, and decrease in the overall network stability in mobile ad hoc network. In order to improve the stability of clusters, weight-based clustering algorithms are utilized. However, these algorithms only use limited features of the nodes. Thus, they decrease the weight accuracy in determining node's competency and lead to incorrect selection of cluster heads. A new weight-based algorithm presented in this paper not only determines node's weight using its own features, but also considers the direct effect of feature of adjacent nodes. It determines the weight of virtual links between nodes and the effect of the weights on determining node's final weight. By using this strategy, the highest weight is assigned to the best choices for being the cluster heads and the accuracy of nodes selection increases. The performance of new algorithm is analyzed by using computer simulation. The results show that produced clusters have longer lifetime and higher stability. Mathematical simulation shows that this algorithm has high availability in case of failure. PMID:25114965
Search for the ground states of Ising spin clusters by using the genetic algorithms
NASA Astrophysics Data System (ADS)
Oda, Akifumi; Nagao, Hidemi; Kitagawa, Yasutaka; Shigeta, Yasuteru; Shoji, Mitsuo; Nitta, Hiroya; Okumura, Mitsutaka; Yamaguchi, Kizashi
Genetic algorithms (GAs) were applied to the search for ground states of randomly generated spin clusters. Two types of hybrid GAs and one type of pure GA modified by crossover operation were developed especially for the search for the ground state on Ising spin clusters. In the hybrid GAs, we used the 1-neighborhood search and the hill-climbing search for local searches. For modification of GAs, we also proposed a new crossover operator, x̄ crossover, which turns all spins upside down. The fitness function (F(x)) was defined in terms of the energies obtained by the Ising Hamiltonian. For the Ising spin clusters, the effective magnetic interaction (J_ab) values were calculated with the unrestricted Hartree-Fock method. These GAs worked well on the searches for the ground states, because the improved GAs avoided being trapped in the local minima frequently and picked up the global minima more effectively than the standard GAs.
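Two ingredients named in the entry above, an Ising-energy fitness and a hill-climbing local search, can be sketched as follows. This is not the authors' GA: the sign convention E = -Σ J_ab s_a s_b over pairs a < b, the single-spin-flip neighbourhood, and the function names are all assumptions, and the selection and crossover steps of a full GA are omitted.

```python
def ising_energy(spins, couplings):
    """E = -sum over pairs a < b of J_ab * s_a * s_b (assumed convention)."""
    n = len(spins)
    return -sum(couplings[a][b] * spins[a] * spins[b]
                for a in range(n) for b in range(a + 1, n))


def flip_all(spins):
    """Turn every spin upside down."""
    return [-s for s in spins]


def hill_climb(spins, couplings):
    """Accept single-spin flips while they strictly lower the energy."""
    spins = list(spins)
    improved = True
    while improved:
        improved = False
        for i in range(len(spins)):
            before = ising_energy(spins, couplings)
            spins[i] = -spins[i]
            if ising_energy(spins, couplings) < before:
                improved = True
            else:
                spins[i] = -spins[i]  # revert a non-improving flip
    return spins


# Toy three-spin ferromagnetic chain with J_01 = J_12 = 1.
couplings = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
start = [1, -1, 1]
relaxed = hill_climb(start, couplings)
```

With no external field, flipping all spins leaves this pairwise energy unchanged, so under the assumed Hamiltonian a global flip moves a candidate to a different configuration of equal fitness.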
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-01-01
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen's temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home. PMID:26007738
2011-01-01
Background A Bayesian approach based on a Dirichlet process (DP) prior is useful for inferring genetic population structures because it can infer the number of populations and the assignment of individuals simultaneously. However, the properties of the DP prior method are not well understood, and therefore, the use of this method is relatively uncommon. We characterized the DP prior method to increase its practical use. Results First, we evaluated the usefulness of the sequentially-allocated merge-split (SAMS) sampler, which is a technique for improving the mixing of Markov chain Monte Carlo algorithms. Although this sampler has been implemented in a preceding program, HWLER, its effectiveness has not been investigated. We showed that this sampler was effective for population structure analysis. Implementation of this sampler was useful with regard to the accuracy of inference and computational time. Second, we examined the effect of a hyperparameter for the prior distribution of allele frequencies and showed that the specification of this parameter was important and could be resolved by considering the parameter as a variable. Third, we compared the DP prior method with other Bayesian clustering methods and showed that the DP prior method was suitable for data sets with unbalanced sample sizes among populations. In contrast, although current popular algorithms for population structure analysis, such as those implemented in STRUCTURE, were suitable for data sets with uniform sample sizes, inferences with these algorithms for unbalanced sample sizes tended to be less accurate than those with the DP prior method. Conclusions The clustering method based on the DP prior was found to be useful because it can infer the number of populations and simultaneously assign individuals into populations, and it is suitable for data sets with unbalanced sample sizes among populations. 
Here we presented a novel program, DPART, that implements the SAMS sampler and can consider the hyperparameter for the prior distribution of allele frequencies to be a variable. PMID:21708038
A hybrid harmony search algorithm for MRI brain segmentation
Osama Moh’d Alia; Mandava Rajeswari; Mohd Ezane Aziz
2011-01-01
Automatic magnetic resonance imaging (MRI) brain segmentation is a challenging problem that has received significant attention in the field of medical image processing. In this paper, we present a new dynamic clustering algorithm based on the hybridization of harmony search (HS) and fuzzy c-means to automatically segment MRI brain images in an intelligent manner. In our algorithm, the capability of
NASA Astrophysics Data System (ADS)
Inclan, Eric; Geohegan, David; Yoon, Mina
2015-03-01
Nanostructured TiO2 materials have interesting properties that are highly relevant to energy and device applications. However, precise control of their morphologies and characterization are still a grand challenge in the field. Using a hybrid optimization algorithm we theoretically explored configuration spaces of energetically metastable TiO2 nanostructures. Our approach is to minimize the total energy of TiO2 clusters in order to identify the structural characteristics and energy landscape of plausible (TiO2)n (n = 1-100). The hybrid algorithm includes a modified differential evolution algorithm, a permutation operator to perform global optimization on a set of randomly generated structures, and then structure refinement using a BFGS Quasi-Newton algorithm. The results were compared against known physical structures and numerical results in the literature as well as our experimentally synthesized structures. Although the global minimum became more computationally expensive to locate with increasing number of TiO2 units, the optimizer successfully identified numerous plausible structures along a range of energies close to the global minimum energy structure for all clusters in the given range. This work is supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, Materials Sciences and Engineering Division.
Cloud classification from satellite data using a fuzzy sets algorithm - A polar example
NASA Technical Reports Server (NTRS)
Key, J. R.; Maslanik, J. A.; Barry, R. G.
1989-01-01
Where spatial boundaries between phenomena are diffuse, classification methods which construct mutually exclusive clusters seem inappropriate. The Fuzzy c-means (FCM) algorithm assigns each observation to all clusters, with membership values as a function of distance to the cluster center. The FCM algorithm is applied to AVHRR data for the purpose of classifying polar clouds and surfaces. Careful analysis of the fuzzy sets can provide information on which spectral channels are best suited to the classification of particular features, and can help determine likely areas of misclassification. General agreement in the resulting classes and cloud fraction was found between the FCM algorithm, a manual classification, and an unsupervised maximum likelihood classifier.
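The statement that FCM assigns each observation to all clusters, with membership depending on distance to the cluster centers, can be sketched with the standard FCM membership formula. This is a minimal illustration, not the code used in the paper: the function name, the Euclidean distance, and the fuzzifier default m = 2 are assumptions.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means memberships of one observation in every cluster.

    Uses u_i = d_i^(-2/(m-1)) / sum_j d_j^(-2/(m-1)), where d_i is the
    Euclidean distance to center i and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # An observation lying exactly on a center belongs fully to it.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]

# A point at distance 1 from the first center and 3 from the second.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
```

Memberships always sum to one, and pixels with no clearly dominant membership are natural candidates for the "likely areas of misclassification" the abstract mentions.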
Raghu Krishnapuram; Hichem Frigui; Olfa Nasraoui
1995-01-01
Abstract- Shell clustering algorithms are ideally suited for computer vision tasks such as boundary detection and surface approximation, particularly when the boundaries have jagged or scattered edges and when the range data is sparse. This is because shell clustering is insensitive to local aberrations, it can be performed directly in image space, and unlike traditional approaches it does assume
Meyerson, Adam W.
Streaming-Data Algorithms For High-Quality Clustering Liadan O'Callaghan Nina Mishra Adam Meyerson the problem of clustering data streams, which is important in the analysis a variety of sources of data, banks, stock-market analysts, and news organizations, for example, Contact author; e
Chang, Edward Y.
Streaming-Data Algorithms For High-Quality Clustering Liadan O'Callaghan Nina Mishra Adam. In this paper, we consider the problem of clustering data streams, which is important in the analysis a variety companies, banks, stock-market analysts, and news organizations, for example, Contact author; e
Algorithm for Image Segmentation Cor J. Veenman, Marcel J. T. Reinders, and Eric Backer, Member, IEEE useful in the image segmentation domain, which we especially address. Further, we propose a cellular-known nonexclusive method is the fuzzy C-means model [15]. In this method objects are soft clustered
Clustering of tethered satellite system simulation data by an adaptive neuro-fuzzy algorithm
NASA Technical Reports Server (NTRS)
Mitra, Sunanda; Pemmaraju, Surya
1992-01-01
Recent developments in neuro-fuzzy systems indicate that the concepts of adaptive pattern recognition, when used to identify appropriate control actions corresponding to clusters of patterns representing system states in dynamic nonlinear control systems, may result in innovative designs. A modular, unsupervised neural network architecture, in which fuzzy learning rules have been embedded, is used for on-line identification of similar states. The architecture and control rules involved in Adaptive Fuzzy Leader Clustering (AFLC) allow this system to be incorporated in control systems for identification of system states corresponding to specific control actions. We have used this algorithm to cluster the simulation data of the Tethered Satellite System (TSS) to estimate the range of delta voltages necessary to maintain the desired length rate of the tether. The AFLC algorithm is capable of on-line estimation of the appropriate control voltages from the corresponding length error and length rate error without a priori knowledge of their membership functions and familiarity with the behavior of the Tethered Satellite System.
A Multi-Core FPGA-Based 2D-Clustering Algorithm for High-Throughput Data Intensive Applications
NASA Astrophysics Data System (ADS)
Sotiropoulou, Calliope-Louisa; Nikolaidis, Spyridon; Annovi, Alberto; Beretta, Matteo; Volpi, Guido; Giannetti, Paola; Luciano, Pierluigi
2014-06-01
A multi-core FPGA-based clustering algorithm for high-throughput data intensive applications is presented. The algorithm is optimized for data with two dimensional organization (e.g. image processing, pixel detectors for high energy physics experiments etc.). It uses a moving window of generic size to adjust to the application's processing requirements (the cluster sizes and shapes that appear in the input data sets). One or more windows (cores) can be used to identify clusters in parallel, allowing for versatility to increase performance or reduce the amount of used resources. In addition to the inherent parallelism the algorithm is executed in a pipeline, thus allowing for readout to be performed in parallel with the cluster identification.
Dipankar Ray; D. Dutta Majumder
2009-01-01
A neuro-fuzzy clustering framework has been presented for a meaningful segmentation of Magnetic Resonance medical images. MR imaging provides detailed soft tissue descriptions of the target body object and it has immense importance in today’s non-invasive therapeutic planning and diagnosis methods. The unlabeled image data has been classified using fuzzy c-means approach and then the data has been used for
Development of a Genetic Algorithm to Automate Clustering of a Dependency Structure Matrix
NASA Technical Reports Server (NTRS)
Rogers, James L.; Korte, John J.; Bilardo, Vincent J.
2006-01-01
Much technology assessment and organization design data exists in Microsoft Excel spreadsheets. Tools are needed to put this data into a form that can be used by design managers to make design decisions. One need is to cluster data that is highly coupled. Tools such as the Dependency Structure Matrix (DSM) and a Genetic Algorithm (GA) can be of great benefit. However, no tool currently combines the DSM and a GA to solve the clustering problem. This paper describes a new software tool that interfaces a GA written as an Excel macro with a DSM in spreadsheet format. The results of several test cases are included to demonstrate how well this new tool works.
NASA Astrophysics Data System (ADS)
Miura, Shinichi
2007-03-01
In this paper, we present a path integral hybrid Monte Carlo (PIHMC) method for rotating molecules in quantum fluids. This is an extension of our PIHMC for correlated Bose fluids [S. Miura and J. Tanaka, J. Chem. Phys. 120, 2160 (2004)] to handle the molecular rotation quantum mechanically. A novel technique referred to as an effective potential of quantum rotation is introduced to incorporate the rotational degree of freedom in the path integral molecular dynamics or hybrid Monte Carlo algorithm. For a permutation move to satisfy Bose statistics, we devise a multilevel Metropolis method combined with a configurational-bias technique for efficiently sampling the permutation and the associated atomic coordinates. Then, we have applied the PIHMC to a helium-4 cluster doped with a carbonyl sulfide molecule. The effects of the quantum rotation on the solvation structure and energetics were examined. Translational and rotational fluctuations of the dopant in the superfluid cluster were also analyzed.
CLUSTAG & WCLUSTAG: Hierarchical Clustering Algorithms for Efficient Tag-SNP Selection
NASA Astrophysics Data System (ADS)
Ao, Sio-Iong
More than 6 million single nucleotide polymorphisms (SNPs) in the human genome have been genotyped by the HapMap project. Although only a proportion of these SNPs are functional, all can be considered as candidate markers for indirect association studies to detect disease-related genetic variants. The complete screening of a gene or a chromosomal region is nevertheless an expensive undertaking for association studies. A key strategy for improving the efficiency of association studies is to select a subset of informative SNPs, called tag SNPs, for analysis. In the chapter, hierarchical clustering algorithms have been proposed for efficient tag SNP selection.
Automatic segmentation of corpus callosum using Gaussian mixture modeling and Fuzzy C means methods.
İçer, Semra
2013-10-01
This paper presents a comparative study of the success and performance of the Gaussian mixture modeling and Fuzzy C means methods in determining the volume and cross-sectional areas of the corpus callosum (CC) using simulated and real MR brain images. The Gaussian mixture model (GMM) utilizes a weighted sum of Gaussian distributions and applies statistical decision procedures to define image classes. In Fuzzy C means (FCM), the image classes are represented by membership functions that express, as fuzziness information, the distance from the cluster centers. In this study, automatic segmentation of the midsagittal section of the CC was achieved from simulated and real brain images. The volume of the CC was obtained using the areas of sagittal sections. To compare the success of the methods, segmentation accuracy, Jaccard similarity and segmentation time were calculated. The results show that the GMM method gave slightly more accurate segmentation (midsagittal section segmentation accuracy of 98.3% and 97.01% for GMM and FCM, respectively); however, the FCM method was faster than GMM. With this study, an accurate and automatic segmentation system was developed that gives doctors the opportunity for quantitative comparison in planning treatment and diagnosing diseases that affect the size of the CC. This study can be adapted to perform segmentation on other regions of the brain, and thus can be put to practical use in the clinic. PMID:23871683
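Two of the evaluation measures named in this abstract, segmentation accuracy and Jaccard similarity, can be sketched for binary masks as follows. This is only an illustration under assumed conventions (flattened 0/1 masks, pixel-wise accuracy); the abstract does not give the exact definitions used by the author, and the function names are hypothetical.

```python
def jaccard_similarity(mask_a, mask_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two flattened binary masks."""
    a = {i for i, v in enumerate(mask_a) if v}
    b = {i for i, v in enumerate(mask_b) if v}
    union = a | b
    if not union:
        return 1.0  # two empty masks are treated as identical
    return len(a & b) / len(union)

def pixel_accuracy(predicted, reference):
    """Fraction of pixels whose label matches the reference mask."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of pixels")
    matches = sum(1 for p, r in zip(predicted, reference) if bool(p) == bool(r))
    return matches / len(reference)

# Toy four-pixel masks (hypothetical data).
segmented = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
score = jaccard_similarity(segmented, reference)
accuracy = pixel_accuracy(segmented, reference)
```

Accuracy counts background agreement as well, so for a small structure such as the CC it can stay high even when the overlap measured by the Jaccard index is modest.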
Fong, Simon
2012-01-01
Voice biometrics has a long history in biosecurity applications such as verification and identification based on characteristics of the human voice. The other application called voice classification which has its important role in grouping unlabelled voice samples, however, has not been widely studied in research. Lately voice classification is found useful in phone monitoring, classifying speakers' gender, ethnicity and emotion states, and so forth. In this paper, a collection of computational algorithms are proposed to support voice classification; the algorithms are a combination of hierarchical clustering, dynamic time wrap transform, discrete wavelet transform, and decision tree. The proposed algorithms are relatively more transparent and interpretable than the existing ones, though many techniques such as Artificial Neural Networks, Support Vector Machine, and Hidden Markov Model (which inherently function like a black box) have been applied for voice verification and voice identification. Two datasets, one that is generated synthetically and the other one empirically collected from past voice recognition experiment, are used to verify and demonstrate the effectiveness of our proposed voice classification algorithm. PMID:22619492
Clustering V Validating clustering results
Terzi, Evimaria
Clustering V Outline · Validating clustering results · Randomization tests Cluster Validity · All clustering algorithms provided with a set of points output a clustering · How to evaluate the "goodness" of the resulting clusters? · Tricky because "clusters are in the eye of the beholder"! · Then why
KANTS: a stigmergic ant algorithm for cluster analysis and swarm art.
Fernandes, Carlos M; Mora, Antonio M; Merelo, Juan J; Rosa, Agostinho C
2014-06-01
KANTS is a swarm intelligence clustering algorithm inspired by the behavior of social insects. It uses stigmergy as a strategy for clustering large datasets and, as a result, displays a typical behavior of complex systems: self-organization and global patterns emerging from the local interaction of simple units. This paper introduces a simplified version of KANTS and describes recent experiments with the algorithm in the context of a contemporary artistic and scientific trend called swarm art, a type of generative art in which swarm intelligence systems are used to create artwork or ornamental objects. KANTS is used here for generating color drawings from the input data that represent real-world phenomena, such as electroencephalogram sleep data. However, the main proposal of this paper is an art project based on well-known abstract paintings, from which the chromatic values are extracted and used as input. Colors and shapes are therefore reorganized by KANTS, which generates its own interpretation of the original artworks. The project won the 2012 Evolutionary Art, Design, and Creativity Competition. PMID:23912505
A contiguity-enhanced k-means clustering algorithm for unsupervised multispectral image segmentation
Theiler, J.; Gisler, G.
1997-07-01
The recent and continuing construction of multi and hyper spectral imagers will provide detailed data cubes with information in both the spatial and spectral domain. This data shows great promise for remote sensing applications ranging from environmental and agricultural to national security interests. The reduction of this voluminous data to useful intermediate forms is necessary both for downlinking all those bits and for interpreting them. Smart onboard hardware is required, as well as sophisticated earth bound processing. A segmented image (in which the multispectral data in each pixel is classified into one of a small number of categories) is one kind of intermediate form which provides some measure of data compression. Traditional image segmentation algorithms treat pixels independently and cluster the pixels according only to their spectral information. This neglects the implicit spatial information that is available in the image. We will suggest a simple approach; a variant of the standard k-means algorithm which uses both spatial and spectral properties of the image. The segmented image has the property that pixels which are spatially contiguous are more likely to be in the same class than are random pairs of pixels. This property naturally comes at some cost in terms of the compactness of the clusters in the spectral domain, but we have found that the spatial contiguity and spectral compactness properties are nearly orthogonal, which means that we can make considerable improvements in the one with minimal loss in the other.
Segmentation of a Thematic Mapper Image Using the Fuzzy c-Means Clustering Algorithm
Robert Cannon; Jitendra Dave; James Bezdek; Mohan Trivedi
1986-01-01
In this paper, a segmentation procedure that utilizes a clustering algorithm based upon fuzzy set theory is developed. The procedure operates in a nonparametric unsupervised mode. The feasibility of the methodology is demonstrated by segmenting a six-band Landsat-4 digital image with 324 scan lines and 392 pixels per scan line. For this image, 100-percent ground cover information is available for
Yanqiu Feng; Wufan Chen
2004-01-01
Unsupervised Fuzzy C-Means (FCM) clustering technique has been widely used in image segmentation. However, conventional FCM algorithm, being a histogram-based method when used in classification, has an intrinsic limitation: no spatial information is taken into account. This causes the FCM algorithm to work only on well-defined images with low level of noise. In this paper, a novel improvement to fuzzy
Tao Xie; Xiao Qin; Mais Nijim
2006-01-01
This paper addresses the problem of improving quality of security for real-time parallel applications on heterogeneous clusters. We propose a new security- and heterogeneity-driven scheduling algorithm (SHARP for short), which strives to maximize the probability that parallel applications are executed in time without any risk of being attacked. Because of high security overhead in existing clusters, an important step in scheduling
Serguei A. Mokhov
2008-01-01
This work reports experimental results and their analysis in various speech processing tasks using SpeakerIdentApp, a text-independent speaker identification application, based on Modular Audio Recognition Framework (MARF)'s API and its implementation in terms of best of the available algorithm configurations for each particular task using median clusters as opposed to the default mean clusters. This study focuses on the tasks
`Inter-Arrival Time' Inspired Algorithm and its Application in Clustering and Molecular Phylogeny
NASA Astrophysics Data System (ADS)
Kolekar, Pandurang S.; Kale, Mohan M.; Kulkarni-Kale, Urmila
2010-10-01
Bioinformatics, being multidisciplinary field, involves applications of various methods from allied areas of Science for data mining using computational approaches. Clustering and molecular phylogeny is one of the key areas in Bioinformatics, which help in study of classification and evolution of organisms. Molecular phylogeny algorithms can be divided into distance based and character based methods. But most of these methods are dependent on pre-alignment of sequences and become computationally intensive with increase in size of data and hence demand alternative efficient approaches. `Inter arrival time distribution' (IATD) is a popular concept in the theory of stochastic system modeling but its potential in molecular data analysis has not been fully explored. The present study reports application of IATD in Bioinformatics for clustering and molecular phylogeny. The proposed method provides IATDs of nucleotides in genomic sequences. The distance function based on statistical parameters of IATDs is proposed and distance matrix thus obtained is used for the purpose of clustering and molecular phylogeny. The method is applied on a dataset of 3' non-coding region sequences (NCR) of Dengue virus type 3 (DENV-3), subtype III, reported in 2008. The phylogram thus obtained revealed the geographical distribution of DENV-3 isolates. Sri Lankan DENV-3 isolates were further observed to be clustered in two sub-clades corresponding to pre and post Dengue hemorrhagic fever emergence groups. These results are consistent with those reported earlier, which are obtained using pre-aligned sequence data as an input. These findings encourage applications of the IATD based method in molecular phylogenetic analysis in particular and data mining in general.
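To make the idea of inter-arrival times of nucleotides concrete, the sketch below records, for each symbol in a sequence, the gaps between its successive occurrences. It shows only this first step; the distance function built on statistical parameters of the IATDs is not specified in the abstract and is not reproduced here. The function name and the toy sequence are assumptions.

```python
from collections import defaultdict

def inter_arrival_times(sequence):
    """Map each symbol to the list of gaps between its consecutive positions."""
    last_seen = {}
    gaps = defaultdict(list)
    for position, symbol in enumerate(sequence):
        if symbol in last_seen:
            gaps[symbol].append(position - last_seen[symbol])
        last_seen[symbol] = position
    return dict(gaps)

def mean_gap(gaps):
    """Mean inter-arrival time per symbol, one simple summary statistic."""
    return {symbol: sum(values) / len(values) for symbol, values in gaps.items()}

# Toy sequence (hypothetical data): A occurs at 0, 3, 4 and C at 1, 5.
iat = inter_arrival_times("ACGAAC")
summary = mean_gap(iat)
```

Because the gaps are computed on each sequence independently, no pre-alignment is needed, which is the property the abstract emphasizes.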
C. Lobo; A. Iovino; D. Lazzati; G. Chincarini
2000-06-29
We present a new algorithm to search for distant clusters of galaxies on catalogues deriving from imaging data, as those of the ESO Imaging Survey. Our algorithm is a matched filter one, similar to that adopted by Postman et al. (1996), aiming at identifying cluster candidates by using positional and photometric data simultaneously. The main novelty of our approach is that spatial and luminosity filter are run separately on the catalogue and no assumption is made on the typical size nor on the typical M* for clusters, as these parameters intervene in our algorithm as typical angular scale (sigma) and typical apparent magnitude m*. Moreover we estimate the background locally for each candidate, allowing us to overcome the hazards of inhomogeneous datasets. As a consequence our algorithm has a lower contamination rate - without loss of completeness - in comparison to other techniques, as tested through extensive simulations. We provide catalogues of galaxy cluster candidates as the result of applying our algorithm to the I-band data of the EIS-wide patches A and B.
Jonathan R Gair; Gareth Jones
2007-01-04
One of the most exciting prospects for the Laser Interferometer Space Antenna (LISA) is the detection of gravitational waves from the inspirals of stellar-mass compact objects into supermassive black holes. Detection of these sources is an extremely challenging computational problem due to the large parameter space and low amplitude of the signals. However, recent work has suggested that the nearest extreme mass ratio inspiral (EMRI) events will be sufficiently loud that they might be detected using computationally cheap, template-free techniques, such as a time-frequency analysis. In this paper, we examine a particular time-frequency algorithm, the Hierarchical Algorithm for Clusters and Ridges (HACR). This algorithm searches for clusters in a power map and uses the properties of those clusters to identify signals in the data. We find that HACR applied to the raw spectrogram performs poorly, but when the data is binned during the construction of the spectrogram, the algorithm can detect typical EMRI events at distances of up to $\sim 2.6$ Gpc. This is a little further than the simple Excess Power method that has been considered previously. We discuss the HACR algorithm, including tuning for single and multiple sources, and illustrate its performance for detection of typical EMRI events, and other likely LISA sources, such as white dwarf binaries and supermassive black hole mergers. We also discuss how HACR cluster properties could be used for parameter extraction.
Fuzzy clustering in Intelligent Scissors.
Wieclawek, W; Pietka, E
2012-07-01
In this study a modified Live-Wire approach is presented. A Fuzzy C-Means (FCM) clustering procedure has been implemented before the wavelet transform cost map function is defined. This shrinks the area to be searched resulting in a significant reduction of the computational complexity. The method has been employed to computed tomography (CT) and magnetic resonance (MR) studies. The 2D segmentation of lungs, abdominal structures and knee joint has been performed in order to evaluate the method. Significant numerical complexity reduction of the Live-Wire algorithm as well as improvement of the object delineation with a decreased number of user interactions have been obtained. PMID:22483373
Haddadi, Hamed
/CTS to avoid collisions from hidden terminals. But the packet delay and data congestion increase rapidly selected for study are carrier sense multiple access with collision avoidance (CSMA/CA) and dual busy tone1 Novel Clustering Algorithm Based on Minimal Path Loss Ratio for Medium Access Control in Vehicle
Ansari, Elnaz Saberi; Eslahchi, Changiz; Pezeshk, Hamid; Sadeghi, Mehdi
2014-09-01
Decomposition of structural domains is an essential task in classifying protein structures, predicting protein function, and many other proteomics problems. As the number of known protein structures in PDB grows exponentially, the need for accurate automatic domain decomposition methods becomes more essential. In this article, we introduce a bottom-up algorithm for assigning protein domains using a graph theoretical approach. This algorithm is based on a center-based clustering approach. For constructing initial clusters, members of an independent dominating set for the graph representation of a protein are considered as the centers. A distance matrix is then defined for these clusters. To obtain final domains, these clusters are merged using the compactness principle of domains and a method similar to the neighbor-joining algorithm considering some thresholds. The thresholds are computed using a training set consisting of 50 protein chains. The algorithm is implemented using C++ language and is named ProDomAs. To assess the performance of ProDomAs, its results are compared with seven automatic methods, against five publicly available benchmarks. The results show that ProDomAs outperforms other methods applied on the mentioned benchmarks. The performance of ProDomAs is also evaluated against 6342 chains obtained from ASTRAL SCOP 1.71. ProDomAs is freely available at http://www.bioinf.cs.ipm.ir/software/prodomas. PMID:24596179
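The abstract uses members of an independent dominating set as initial cluster centers. One simple way to obtain such a set, shown below as a sketch and not as the ProDomAs procedure, is to build a maximal independent set greedily: any maximal independent set is also dominating. The function name, the adjacency-dictionary input, and the highest-degree-first visiting order are assumptions of this example.

```python
def independent_dominating_set(adjacency):
    """Greedy maximal independent set of an undirected graph.

    `adjacency` maps each vertex to an iterable of its neighbours.
    The result is independent (no two chosen vertices are adjacent) and
    dominating (every other vertex has a chosen neighbour).
    """
    chosen = set()
    blocked = set()
    # Visit high-degree vertices first; this ordering is a heuristic choice.
    order = sorted(adjacency, key=lambda v: (-len(adjacency[v]), v))
    for vertex in order:
        if vertex not in blocked:
            chosen.add(vertex)
            blocked.add(vertex)
            blocked.update(adjacency[vertex])
    return chosen

# Toy path graph 0-1-2-3-4 (hypothetical data).
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
centers = independent_dominating_set(graph)
```

Each non-center vertex can then be attached to an adjacent center to form the initial clusters, which a merging step would subsequently combine.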
Cancer subtype discovery and biomarker identification via a new robust network clustering algorithm.
Wu, Meng-Yun; Dai, Dao-Qing; Zhang, Xiao-Fei; Zhu, Yuan
2013-01-01
In cancer biology, it is very important to understand the phenotypic changes of the patients and discover new cancer subtypes. Recently, microarray-based technologies have shed light on this problem based on gene expression profiles which may contain outliers due to either chemical or electrical reasons. These undiscovered subtypes may be heterogeneous with respect to underlying networks or pathways, and are related to only a few interdependent biomarkers. This motivates a need for robust gene expression-based methods capable of discovering such subtypes, elucidating the corresponding network structures and identifying cancer-related biomarkers. This study proposes a penalized model-based Student's t clustering with unconstrained covariance (PMT-UC) to discover cancer subtypes with cluster-specific networks, taking gene dependencies into account and having robustness against outliers. Meanwhile, biomarker identification and network reconstruction are achieved by imposing an adaptive [Formula: see text] penalty on the means and the inverse scale matrices. The model is fitted via the expectation maximization algorithm utilizing the graphical lasso. Here, a network-based gene selection criterion that identifies biomarkers not as individual genes but as subnetworks is applied. This allows us to implicate low discriminative biomarkers which play a central role in the subnetwork by interconnecting many differentially expressed genes, or have cluster-specific underlying network structures. Experimental results on simulated datasets and one available cancer dataset attest to the effectiveness and robustness of PMT-UC in cancer subtype discovery. Moreover, PMT-UC has the ability to select cancer-related biomarkers which have been verified in biochemical or biomedical research and to learn the biologically significant correlation among genes. PMID:23799085
NASA Astrophysics Data System (ADS)
Ashton, Douglas J.; Liu, Jiwen; Luijten, Erik; Wilding, Nigel B.
2010-11-01
Highly size-asymmetrical fluid mixtures arise in a variety of physical contexts, notably in suspensions of colloidal particles to which much smaller particles have been added in the form of polymers or nanoparticles. Conventional schemes for simulating models of such systems are hamstrung by the difficulty of relaxing the large species in the presence of the small one. Here we describe how the rejection-free geometrical cluster algorithm of Liu and Luijten [J. Liu and E. Luijten, Phys. Rev. Lett. 92, 035504 (2004)] can be embedded within a restricted Gibbs ensemble to facilitate efficient and accurate studies of fluid phase behavior of highly size-asymmetrical mixtures. After providing a detailed description of the algorithm, we summarize the bespoke analysis techniques of [Ashton et al., J. Chem. Phys. 132, 074111 (2010)] that permit accurate estimates of coexisting densities and critical-point parameters. We apply our methods to study the liquid-vapor phase diagram of a particular mixture of Lennard-Jones particles having a 10:1 size ratio. As the reservoir volume fraction of small particles is increased in the range of 0%-5%, the critical temperature decreases by approximately 50%, while the critical density drops by some 30%. These trends imply that in our system, adding small particles decreases the net attraction between large particles, a situation that contrasts with hard-sphere mixtures where an attractive depletion force occurs.
Ashton, Douglas J; Liu, Jiwen; Luijten, Erik; Wilding, Nigel B
2010-11-21
Highly size-asymmetrical fluid mixtures arise in a variety of physical contexts, notably in suspensions of colloidal particles to which much smaller particles have been added in the form of polymers or nanoparticles. Conventional schemes for simulating models of such systems are hamstrung by the difficulty of relaxing the large species in the presence of the small one. Here we describe how the rejection-free geometrical cluster algorithm of Liu and Luijten [J. Liu and E. Luijten, Phys. Rev. Lett. 92, 035504 (2004)] can be embedded within a restricted Gibbs ensemble to facilitate efficient and accurate studies of fluid phase behavior of highly size-asymmetrical mixtures. After providing a detailed description of the algorithm, we summarize the bespoke analysis techniques of [Ashton et al., J. Chem. Phys. 132, 074111 (2010)] that permit accurate estimates of coexisting densities and critical-point parameters. We apply our methods to study the liquid-vapor phase diagram of a particular mixture of Lennard-Jones particles having a 10:1 size ratio. As the reservoir volume fraction of small particles is increased in the range of 0%-5%, the critical temperature decreases by approximately 50%, while the critical density drops by some 30%. These trends imply that in our system, adding small particles decreases the net attraction between large particles, a situation that contrasts with hard-sphere mixtures where an attractive depletion force occurs. PMID:21090849
The multisynapse neural network and its application to fuzzy clustering.
Wei, Chih-Hsiu; Fahn, Chin-Shyurng
2002-01-01
In this paper, a new neural architecture, the multisynapse neural network, is developed for constrained optimization problems, whose objective functions may include high-order, logarithmic, and sinusoidal forms, etc., unlike the traditional Hopfield networks which can only handle quadratic form optimization. Meanwhile, based on the application of this new architecture, a fuzzy bidirectional associative clustering network (FBACN), which is composed of two layers of recurrent networks, is proposed for fuzzy-partition clustering according to the objective-functional method. It is well known that fuzzy c-means is a milestone algorithm in the area of fuzzy c-partition clustering. All of the following objective-functional-based fuzzy c-partition algorithms incorporate the formulas of fuzzy c-means as the prime mover in their algorithms. However, when an application of fuzzy c-partition has sophisticated constraints, the necessity of analytical solutions in a single iteration step becomes a fatal issue of the existing algorithms. The largest advantage of FBACN is that it does not need analytical solutions. For the problems on which some prior information is known, we bring a combination of part crisp and part fuzzy clustering in the third optimization problem. PMID:18244459
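The "analytical solutions in a single iteration step" mentioned above are the closed-form membership and center updates of standard fuzzy c-means. A minimal sketch of that alternating scheme is given here for orientation only; it is not the FBACN method, and the function name `fcm`, the deterministic initialization, and the default fuzzifier `m=2.0` are assumptions made for illustration.

```python
import math

def _dist(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def fcm(points, c, m=2.0, iters=100, tol=1e-9):
    """Plain fuzzy c-means; returns (centers, memberships)."""
    # Assumption of this sketch: deterministic start from evenly spaced input points.
    step = max(1, len(points) // c)
    centers = [list(points[min(i * step, len(points) - 1)]) for i in range(c)]
    power = 2.0 / (m - 1.0)
    u = []
    for _ in range(iters):
        # Closed-form membership update: u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)).
        u = []
        for x in points:
            d = [_dist(x, v) for v in centers]
            if min(d) == 0.0:
                # The point sits exactly on a center: share full membership there.
                hits = [1.0 if dj == 0.0 else 0.0 for dj in d]
                row = [h / sum(hits) for h in hits]
            else:
                row = [1.0 / sum((di / dj) ** power for dj in d) for di in d]
            u.append(row)
        # Closed-form center update: mean of the points weighted by u_ik^m.
        new_centers = []
        for i in range(c):
            w = [row[i] ** m for row in u]
            total = sum(w)
            new_centers.append([
                sum(wk * x[dim] for wk, x in zip(w, points)) / total
                for dim in range(len(points[0]))
            ])
        shift = max(_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, u
```

Both updates rely on the objective having a closed-form minimizer; once extra constraints are added to the partition, such formulas may no longer exist, which is the situation the abstract addresses.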
Nagwani, Naresh Kumar; Deo, Shirish V.
2014-01-01
Understanding of the compressive strength of concrete is important for activities like construction arrangement, prestressing operations, and proportioning new mixtures and for the quality assurance. Regression techniques are most widely used for prediction tasks where relationship between the independent variables and dependent (prediction) variable is identified. The accuracy of the regression techniques for prediction can be improved if clustering can be used along with regression. Clustering along with regression will ensure the more accurate curve fitting between the dependent and independent variables. In this work cluster regression technique is applied for estimating the compressive strength of the concrete and a novel state of the art is proposed for predicting the concrete compressive strength. The objective of this work is to demonstrate that clustering along with regression ensures less prediction errors for estimating the concrete compressive strength. The proposed technique consists of two major stages: in the first stage, clustering is used to group the similar characteristics concrete data and then in the second stage regression techniques are applied over these clusters (groups) to predict the compressive strength from individual clusters. It is found from experiments that clustering along with regression techniques gives minimum errors for predicting compressive strength of concrete; also fuzzy clustering algorithm C-means performs better than K-means algorithm. PMID:25374939
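The two-stage idea above (cluster first, then fit one regression per cluster) can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: it uses a single input feature, plain k-means rather than fuzzy C-means, ordinary least-squares lines, and hypothetical function names.

```python
def kmeans_1d(xs, k, iters=50):
    """Stage 1: k-means on one feature, with a deterministic quantile start."""
    s = sorted(xs)
    centers = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)
        # An empty cluster keeps its previous center.
        new = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def fit_cluster_regression(xs, ys, k):
    """Stage 2: one regression line per cluster found in stage 1."""
    centers = kmeans_1d(xs, k)
    models = []
    for j in range(k):
        members = [(x, y) for x, y in zip(xs, ys)
                   if min(range(k), key=lambda i: abs(x - centers[i])) == j]
        if members:
            models.append(fit_line([x for x, _ in members], [y for _, y in members]))
        else:
            models.append((0.0, 0.0))  # placeholder model for an empty cluster
    return centers, models

def predict(x, centers, models):
    """Route x to its nearest cluster and apply that cluster's line."""
    j = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
    slope, intercept = models[j]
    return slope * x + intercept
```

When the data follow different trends in different regions, the per-cluster lines can fit each region more closely than one global line, which is the effect the abstract reports.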
Local manifold spectral clustering with FCM data condensation
NASA Astrophysics Data System (ADS)
Liu, Hanqiang; Jiao, Licheng; Zhao, Feng
2009-10-01
In this paper, a novel local manifold spectral clustering with fuzzy c-means (FCM) data condensation is presented. Firstly, a multilayer FCM data condensation method is performed on the original data to contain a condensation subset. Secondly, the spectral clustering algorithm based on the local manifold distance measure is used to realize the classification of the condensation subset. Finally, the nearest neighbor method is adopted to obtain the clustering result of the original data. Compared with the standard spectral clustering algorithm, the novel method is more robust and has the advantages of effectively dealing with the large scale data. In our experiments, we first analyze the performances of multilayer FCM data condensation and local manifold distance measure, then apply our method to solve image segmentation and the large Brodatz texture images classification. The experimental results show that the method is effective and extensible, and especially the runtime of this method is acceptable.
Roweis, Sam
Lecture 8: Clustering and Tree Models. Sam Roweis, October 30, 2002. Clustering • Clustering; Several approaches: agglomerative, divisive, fixed number of clusters, hierarchical, ... • All distance $(x - y)^\top \Sigma^{-1} (x - y)$, ... Algorithm: K-means • Select a number of clusters K and covariance $\Sigma$
Bhattacharya, Anindya; De, Rajat K
2010-08-01
Distance based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have a similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of correlation clustering. But this algorithm may also fail in certain cases. In order to overcome these situations, we propose a new clustering algorithm, called average correlation clustering algorithm (ACCA), which is able to produce a better clustering solution than that produced by some others. ACCA is able to find groups of genes having more common transcription factors and a similar pattern of variation in their expression values. Moreover, ACCA is more efficient than DCCA with respect to the time of execution. Like DCCA, ACCA uses the concept of correlation clustering introduced by Bansal et al. ACCA uses the correlation matrix in such a way that all genes in a cluster have the highest average correlation values with the genes in that cluster. We have applied ACCA and some well-known conventional methods including DCCA to two artificial and nine gene expression datasets, and compared the performance of the algorithms. The clustering results of ACCA are found to be more significantly relevant to the biological annotations than those of the other methods. Analysis of the results shows the superiority of ACCA over some others in determining a group of genes having more common transcription factors and a similar pattern of variation in their expression profiles. Availability of the software: The software has been developed using C and Visual Basic languages, and can be executed on the Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/~rajat. Then it needs to be installed. Two Word files (included in the zip file) need to be consulted before installation and execution of the software. PMID:20144735
Stochastic modeling for trajectories drift in the ocean: Application of Density Clustering Algorithm
Shchekinova, E Y
2015-01-01
The aim of this study is to address the effects of wind-induced drift on floating sea objects using high-resolution ocean forecast data and atmospheric data. Two applications of the stochastic Leeway model for prediction of trajectory drift in the Mediterranean Sea are presented: long-term simulation of sea drifters in the western Adriatic Sea (21.06.2009-23.06.2009) and numerical reconstruction of the Elba accident (21.06.2009-23.06.2009). Long-term simulations in the western Adriatic Sea are performed using wind data from the European Center for Medium-Range Weather Forecast (ECMWF) and currents from the Adriatic Forecasting System (AFS). A spatial clustering algorithm is proposed to identify the most probable search areas with a high density of drifters. The results are compared for different simulation scenarios using different categories of drifters and forcing fields. The reconstruction of sea object drift near Elba Island is performed using surface currents from the Mediterranean Forecastin...
Stochastic modeling for trajectories drift in the ocean: Application of Density Clustering Algorithm
E. Y. Shchekinova; Y. Kumkar
2015-05-18
The aim of this study is to address the effects of wind-induced drift on floating sea objects using high-resolution ocean forecast data and atmospheric data. Two applications of the stochastic Leeway model for prediction of trajectory drift in the Mediterranean Sea are presented: long-term simulation of sea drifters in the western Adriatic Sea (21.06.2009-23.06.2009) and numerical reconstruction of the Elba accident (21.06.2009-23.06.2009). Long-term simulations in the western Adriatic Sea are performed using wind data from the European Center for Medium-Range Weather Forecast (ECMWF) and currents from the Adriatic Forecasting System (AFS). A spatial clustering algorithm is proposed to identify the most probable search areas with a high density of drifters. The results are compared for different simulation scenarios using different categories of drifters and forcing fields. The reconstruction of sea object drift near Elba Island is performed using surface currents from the Mediterranean Forecasting System (MFS) and atmospheric forcing fields from the ECMWF. The results showed that drifters whose draft is limited to the upper surface layer more closely reproduced the target trajectory during the accident.
Fleisch, Markus C.; Maxell, Christopher A.; Kuper, Claudia K.; Brown, Erika T.; Parvin, Bahram; Barcellos-Hoff, Mary-Helen; Costes, Sylvain V.
2006-03-08
Centrosomes are small organelles that organize the mitotic spindle during cell division and are also involved in cell shape and polarity. Within epithelial tumors, such as breast cancer, and some hematological tumors, centrosome abnormalities (CA) are common, occur early in disease etiology, and correlate with chromosomal instability and disease stage. In situ quantification of CA by optical microscopy is hampered by overlap and clustering of these organelles, which appear as focal structures. CA has been frequently associated with Tp53 status in premalignant lesions and tumors. Here we describe an approach to accurately quantify centrosomes in tissue sections and tumors. Considering proliferation and baseline amplification rate, the resulting population-based ratio of centrosomes per nucleus allows the approximation of the proportion of cells with CA. Using this technique we show that 20-30 percent of cells have amplified centrosomes in Tp53 null mammary tumors. Combining fluorescence detection, deconvolution microscopy and a mathematical algorithm applied to a maximum intensity projection, we show that this approach is superior to traditional investigator-based visual analysis or threshold-based techniques.
Julia Fornleitner; Gerhard Kahl
2007-09-03
Introducing genetic algorithms as a reliable and efficient tool to find ordered equilibrium structures, we predict minimum energy configurations of the square shoulder system for different values of corona width $\lambda$. Varying systematically the pressure for different values of $\lambda$ we obtain complete sequences of minimum energy configurations which provide a deeper understanding of the system's strategies to arrange particles in an energetically optimized fashion, leading to the competing self-assembly scenarios of cluster-formation vs. lane-formation.
NASA Astrophysics Data System (ADS)
Fornleitner, J.; Kahl, G.
2008-04-01
Introducing genetic algorithms as a reliable and effective tool to find ordered equilibrium structures, we predict minimum energy configurations of the square shoulder system for different values of corona width λ. By varying systematically the pressure and choosing different values of λ we are able to identify complete sequences of minimum energy configurations. The results provide a deeper understanding of the system's strategies to arrange particles in an energetically optimised fashion, leading to the competing self-assembly scenarios of cluster formation vs. lane formation.
M. S. Tame; M. S. Kim
2010-10-01
We show that fundamental versions of the Deutsch-Jozsa and Bernstein-Vazirani quantum algorithms can be performed using a small entangled cluster state resource of only six qubits. We then investigate the minimal resource states needed to demonstrate arbitrary n-qubit versions and a scalable method to produce them. For this purpose we propose a versatile on-chip photonic waveguide setup.
Li, Weizhong [San Diego Supercomputer Center] [San Diego Supercomputer Center
2011-10-12
San Diego Supercomputer Center's Weizhong Li on "Effective Analysis of NGS Metagenomic Data with Ultra-fast Clustering Algorithms" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011.
D. Napoleon; M. Praneesh; S. Sathya; M. SivaSubramani
2012-01-01
In computer vision, segmentation refers to the process of partitioning a digital image into multiple segments. Image segmentation is typically used to locate objects and boundaries in images. Image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain visual characteristics. In this paper present a new
Evolution Strategy for the C-Means Algorithm: Application to multimodal image
Masulli, Francesco
Francesco Masulli DIBRIS - Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi.massone@spin.cnr.it Andrea Schenone DIBRIS - Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi
An algorithm for identifying clusters of functionally related genes in genomes
Yi, Gang Man
2009-05-15
as in properties of gene clusters, including size distribution and functional annotation. These properties may be diagnostic of the evolutionary forces that lead to the formation of gene clusters. The approach finds all gene clusters in the data set and ranks them...
Statistical physics based heuristic clustering algorithms with an application to econophysics
Lucia Liliana Baldwin
2003-01-01
Three new approaches to the clustering of data sets are presented. They are heuristic methods and represent forms of unsupervised (non-parametric) clustering. Applied to an unknown set of data these methods automatically determine the number of clusters and their location using no a priori assumptions. All are based on analogies with different physical phenomena. The first technique, named the Percolation
Paweł Filipczuk; Marek Kowal; Andrzej Obuchowicz
The paper presents k-means based hybrid segmentation method for breast cancer diagnosis problem. It is part of the computer system to support diagnosis based on microscope images of the fine needle biopsy. The system assumes distinguishing malignant from benign cases. Described method is an alternative to the previously presented algorithms based on fuzzy c-means clustering and competitive neural networks. However,
Yin, Jiandong; Yang, Jiawen; Guo, Qiyong
2014-01-01
During dynamic susceptibility contrast-magnetic resonance imaging (DSC-MRI), it has been demonstrated that the arterial input function (AIF) can be obtained using fuzzy c-means (FCM) and k-means clustering methods. However, due to the dependence on the initial centers of clusters, both clustering methods have poor reproducibility between the calculation and recalculation steps. To address this problem, the present study developed an alternative clustering technique based on the agglomerative hierarchy (AH) method for AIF determination. The performance of AH method was evaluated using simulated data and clinical data based on comparisons with the two previously demonstrated clustering-based methods in terms of the detection accuracy, calculation reproducibility, and computational complexity. The statistical analysis demonstrated that, at the cost of a significantly longer execution time, AH method obtained AIFs more in line with the expected AIF, and it was perfectly reproducible at different time points. In our opinion, the disadvantage of AH method in terms of the execution time can be alleviated by introducing a professional high-performance workstation. The findings of this study support the feasibility of using AH clustering method for detecting the AIF automatically. PMID:24932638
NASA Astrophysics Data System (ADS)
Thanos, Konstantinos-Georgios; Thomopoulos, Stelios C. A.
2014-06-01
The study in this paper belongs to a more general research of discovering facial sub-clusters in different ethnicity face databases. These new sub-clusters along with other metadata (such as race, sex, etc.) lead to a vector for each face in the database where each vector component represents the likelihood of participation of a given face to each cluster. This vector is then used as a feature vector in a human identification and tracking system based on face and other biometrics. The first stage in this system involves a clustering method which evaluates and compares the clustering results of five different clustering algorithms (average, complete, single hierarchical algorithm, k-means and DIGNET), and selects the best strategy for each data collection. In this paper we present the comparative performance of clustering results of DIGNET and four clustering algorithms (average, complete, single hierarchical and k-means) on fabricated 2D and 3D samples, and on actual face images from various databases, using four different standard metrics. These metrics are the silhouette figure, the mean silhouette coefficient, the Hubert test Γ coefficient, and the classification accuracy for each clustering result. The results showed that, in general, DIGNET gives more trustworthy results than the other algorithms when the metrics values are above a specific acceptance threshold. However when the evaluation results metrics have values lower than the acceptance threshold but not too low (too low corresponds to ambiguous results or false results), then it is necessary for the clustering results to be verified by the other algorithms.
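One of the metrics named above, the mean silhouette coefficient, can be computed with a few lines of code. The sketch below uses the standard definition s = (b - a) / max(a, b) with Euclidean distance; the function name, the score of 0 for singleton clusters, and the requirement of at least two clusters are conventions assumed here, not details taken from the paper.

```python
import math

def mean_silhouette(points, labels):
    """Mean silhouette coefficient; expects at least two distinct labels."""
    def dist(p, q):
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    # Group point indices by cluster label.
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    scores = []
    for i, lab in enumerate(labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # assumed convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in other) / len(other)
            for other_lab, other in clusters.items() if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)
```

Values near 1 indicate compact, well-separated clusters, while values near or below 0 suggest overlapping or misassigned points, which is why such a score can serve as an acceptance threshold when comparing algorithms.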
Khotanlou, Hassan; Afrasiabi, Mahlagha
2011-07-01
This paper introduces a novel methodology for the segmentation of brain MS lesions in MRI volumes using a new clustering algorithm named SCPFCM. SCPFCM uses membership, typicality and spatial information to cluster each voxel. The proposed method relies on an initial segmentation of MS lesions in T1-w and T2-w images by applying SCPFCM algorithm, and the T1 image is then used as a mask and is compared with T2 image. The proposed method was applied to 10 clinical MRI datasets. The results obtained on different types of lesions have been evaluated by comparison with manual segmentations. PMID:22606670
DNS dispatching algorithms with state estimators for scalable Web-server clusters
Valeria Cardellini; Michele Colajanni; Philip S. Yu
1999-01-01
Replication of information across a server cluster provides a promising way to support popular Web sites. However, a Web-server cluster requires some mechanism for the scheduling of requests to the most available server. One common approach is to use the cluster Domain Name System (DNS) as a centralized dispatcher. The main problem is that WWW address caching mechanisms (although reducing
Christian Daul; P. Graebling; A. Tiedeu; D. Wolf
2005-01-01
The three-dimensional (3-D) shape of microcalcification clusters is an important indicator in early breast cancer detection. In fact, there is a relationship between the cluster topology and the type of lesion (malignant or benign). This paper presents a 3-D reconstruction method for such clusters using two 2-D views acquired during standard mammographic examinations. For this purpose, the mammographic unit was
KiWi: A Scalable Subspace Clustering Algorithm for Gene Expression Analysis
Obi L. Griffith; Byron J. Gao; Mikhail Bilenky; Yuliya Prichyna; Martin Ester; Steven J. M. Jones
2009-01-01
Subspace clustering has gained increasing popularity in the analysis of gene expression data. Among subspace cluster models, the recently introduced order-preserving sub-matrix (OPSM) has demonstrated high promise. An OPSM, essentially a pattern-based subspace cluster, is a subset of rows and columns in a data matrix for which all the rows induce the same linear ordering of columns. Existing OPSM discovery
Erban, Radek
, such as microarray and other high-throughput bioinformatics data. The most widely used method is the Fuzzy C-means been proposed in the past to address these two restrictions. Bezdek et al. [2] proposed the Fuzzy C-means) a number of examples in very diverse areas (such as in image segmentation (Trivedi and Bezdek [24
NASA Astrophysics Data System (ADS)
Tian, Dongxu; Zhang, Hualei; Zhao, Jijun
2007-10-01
Using a genetic algorithm followed by local optimization with density functional theory, the lowest-energy structures of Ag clusters in a size range of n=3-22 were studied. The Ag ( n=9-16) clusters prefer compact structures of flat shape, while the Ag (n=19,21,22) clusters adopt amorphous packing based on a 13-atom icosahedral core. For Ag, two competitive candidates for the lowest-energy structures, namely a hollow-cage structure and close-packed structures of flat shape, were found. Two competing candidates were found for Ag and Ag: hollow-cage structures versus icosahedron-based compact structures. The lowest-energy structure of Ag is a highly symmetric tetrahedron with Td symmetry. These results are significantly different from those predicted in earlier works using empirical methods. The ionization potentials and electron affinities for the lowest-energy structures of Ag ( n=3-22) clusters were computed and compared with experimental values.
Tsai, Ming-Hui; Huang, Yueh-Min
2014-01-01
Wireless sensor networks (WSNs) have emerged as a promising solution for various applications due to their low cost and easy deployment. Typically, their limited power capability, i.e., battery power, makes extending network lifetime a challenge for WSNs. Many hierarchical protocols in the literature show better energy efficiency. Besides, data reduction based on the correlation of sensed readings can efficiently reduce the amount of required transmissions. Therefore, we use a sub-clustering procedure based on spatial data correlation to further separate the hierarchical (clustered) architecture of a WSN. The proposed algorithm (2TC-cor) is composed of two procedures: the prediction model construction procedure and the sub-clustering procedure. Energy conservation benefits from the reduced transmissions, which depend on the prediction model. Also, energy can be further conserved because of the representative mechanism of sub-clustering. Simulation results show that 2TC-cor can effectively conserve energy and monitor the environment accurately within an acceptable level. PMID:25412220
An on-demand weighted clustering algorithm (WCA) for ad hoc networks
Mainak Chatterjee; S. K. Sas; Damla Turgut
2000-01-01
We consider a multi-cluster, multi-hop packet radio network architecture for wireless systems which can dynamically adapt itself with the changing network configurations. Due to the dynamic nature of the mobile nodes, their association and dissociation to and from clusters perturb the stability of the system, and hence a reconfiguration of the system is unavoidable. At the same time it is
Using the Kleinberg Algorithm and Vector Space Model for Software System Clustering
Giuseppe Scanniello; Anna D'Amico; Carmela D'Amico; Teodora D'Amico
2010-01-01
Clustering based approaches are generally difficult to use in practice since they need a significant human interaction for recovering software architectures, are conceived for a specific programming language, and very often do not use design knowledge (e.g., the implemented architectural model). In this paper we present a clustering based approach to recover the implemented architecture of software systems with a
Clustering: Partition Clustering Lecture outline
Terzi, Evimaria
Clustering: Partition Clustering. Lecture outline · Distance/Similarity between data objects · Data objects as geometric data points · Clustering problems and algorithms – Kmeans – Kmedian – Kcenter. What is clustering? · A grouping of data objects such that the objects within a group
Risk Mapping of Cutaneous Leishmaniasis via a Fuzzy C Means-based Neuro-Fuzzy Inference System
NASA Astrophysics Data System (ADS)
Akhavan, P.; Karimi, M.; Pahlavani, P.
2014-10-01
Identifying pathogenic factors and how they spread in the environment has recently become a global demand. Cutaneous Leishmaniasis (CL), caused by Leishmania, is a vector-borne parasitic disease that can be passed on to humans through Phlebotomus vectors. Studies show that economic situation, cultural issues, as well as environmental and ecological conditions can affect the prevalence of this disease. In this study, data mining is used to predict the CL prevalence rate and obtain a risk map, based on the environmental parameters that affect CL, using a Neuro-Fuzzy system. The learning capacity of neural networks on the one hand and the reasoning power of fuzzy systems on the other make Neuro-Fuzzy systems very efficient to use. In this research, to predict the CL prevalence rate, an adaptive Neuro-Fuzzy inference system was applied, with fuzzy C Means clustering used to determine the initial membership functions. Given the high incidence of CL in Ilam province, the counties of Ilam, Mehran, and Dehloran were examined and evaluated. The CL prevalence rate in 2012 was predicted from maps of the effective environmental and topographic properties, including temperature, moisture, annual rainfall, vegetation and elevation. Results indicate that the model with the fuzzy C Means clustering structure yields acceptable RMSE values for both training and checking data, which supports our analyses. Using the proposed data mining technique, the spatial distribution pattern of the disease and vulnerable areas become identifiable, and the map can be used by public health experts and decision makers as a useful tool in management and optimal decision-making.
NASA Astrophysics Data System (ADS)
Vrtilek, Saeqa Dil; Boroson, Bram S.; Richards, Joseph
2014-06-01
We have explored recent developments in machine learning algorithms, such as diffusion mapping (Richards et al. 2009) which allow us to identify physically similar clusters independent of prior knowledge. We have successfully used this method to separate out different classes of X-ray binaries and of different spectral states within a given system. Beyond the immediate astronomical application, a strength of our approach is to offer new and useful insight into the vast and rapidly growing multi-dimensional data collections in essentially all fields of investigation, not only the astrophysical ones which form our testbed and the immediate focus of our scientific interest.
Chen, Wei-Chen [ORNL] [ORNL; Ostrouchov, George [ORNL] [ORNL; Pugmire, Dave [ORNL] [ORNL; Prabhat, [Lawrence Berkeley National Laboratory (LBNL)] [Lawrence Berkeley National Laboratory (LBNL); Wehner, Michael [Lawrence Berkeley National Laboratory (LBNL)] [Lawrence Berkeley National Laboratory (LBNL)
2013-01-01
We develop a parallel EM algorithm for multivariate Gaussian mixture models and use it to perform model-based clustering of a large climate data set. Three variants of the EM algorithm are reformulated in parallel and a new variant that is faster is presented. All are implemented using the single program, multiple data (SPMD) programming model, which is able to take advantage of the combined collective memory of large distributed computer architectures to process larger data sets. Displays of the estimated mixture model rather than the data allow us to explore multivariate relationships in a way that scales to arbitrary size data. We study the performance of our methodology on simulated data and apply our methodology to a high resolution climate dataset produced by the community atmosphere model (CAM5). This article has supplementary material online.
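For readers unfamiliar with the underlying iteration, a serial, one-dimensional EM loop for a Gaussian mixture might look like the sketch below. It is not the parallel SPMD implementation described above; the function name, the quantile-based initialization, the fixed iteration count, and the variance floor are assumptions made for illustration, and a production version would typically work in the log domain.

```python
import math

def em_gmm_1d(xs, k=2, iters=200):
    """Serial EM for a 1-D Gaussian mixture; returns (weights, means, variances)."""
    n = len(xs)
    s = sorted(xs)
    # Assumed initialization: quantile means, pooled variance, equal weights.
    means = [s[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(xs) / n
    var_all = sum((x - mean_all) ** 2 for x in xs) / n
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        # Each observation is handled independently of the others.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-(x - mu) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, mu, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate the parameters from responsibility-weighted sums.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                1e-6,  # floor keeps a component from collapsing onto one point
            )
    return weights, means, variances
```

Because the E-step treats observations independently and the M-step only needs sums over observations, the data can in principle be split across processes that exchange a small set of sufficient statistics per iteration; the exact partitioning used in the paper is not reproduced here.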
NASA Astrophysics Data System (ADS)
Niwa, Keiichi; Nishizaki, Ichiro; Sakawa, Masatoshi
2009-01-01
This paper deals with the two level 0-1 programming problems in which there are two decision makers (DMs); the decision maker at the upper level and the decision maker at the lower level. The authors modify the problematic aspects of a computation method for the Stackelberg solution which they previously presented, and thus propose an improved computation method. Specifically, a genetic algorithm (GA) is proposed with the objective of boosting the accuracy of solutions while maintaining the diversity of the population, which adopts a clustering method instead of calculating distances during sharing. Also, in order to eliminate unnecessary computation, an additional algorithm is included for avoiding obtaining the rational reaction of the lower level DM in response to upper level DM's decisions when necessary. In order to verify the effectiveness of the proposed method, it is intended to make a comparison with existing methods by performing numerical experiments into both the accuracy of solutions and the computation time.
Visualizing Clustering Results Ian Davidson
Davidson, Ian
Visualizing Clustering Results Ian Davidson Introduction Non-hierarchical clustering has a long mining [2], statistical analysis [3] and information retrieval [17]. Clustering involves finding by d attributes. A clustering algorithm generates cluster descriptions and assigns each observation
Sensitivity evaluation of dynamic speckle activity measurements using clustering methods
Etchepareborda, Pablo; Federico, Alejandro; Kaufmann, Guillermo H.
2010-07-01
We evaluate and compare the use of competitive neural networks, self-organizing maps, the expectation-maximization algorithm, K-means, and fuzzy C-means techniques as partitional clustering methods, when the sensitivity of the activity measurement of dynamic speckle images needs to be improved. The temporal history of the acquired intensity generated by each pixel is analyzed in a wavelet decomposition framework, and it is shown that the mean energy of its corresponding wavelet coefficients provides a suitable feature space for clustering purposes. The sensitivity obtained by using the evaluated clustering techniques is also compared with the well-known methods of Konishi-Fujii, weighted generalized differences, and wavelet entropy. The performance of the partitional clustering approach is evaluated using simulated dynamic speckle patterns and also experimental data.
NASA Astrophysics Data System (ADS)
Kumar, Sanjeev; Ma, Y. G.
2015-03-01
The phase space obtained using the isospin quantum molecular dynamical (IQMD) model is analyzed by applying the binding energy cut in the most commonly and widely used secondary cluster recognition algorithm. In addition, for the present study, the energy contribution from momentum-dependent and symmetry potentials is also included during the calculation of total binding energy, which was absent in clusterization algorithms used earlier. The stability of fragments and isospin effects are explored by using the new clusterization algorithm. The findings are summarized as follows: (1) The clusterization algorithm identifies the fragments at quite early time. (2) It is more sensitive for free nucleons and light charged particles compared to intermediate mass fragments, which results in the enhanced (reduced) production of free nucleons (light charged particles, or LCPs). (3) It has affected the yield of isospin-sensitive observables, namely neutrons (n), protons (p), 3H, 3He, and the single ratio R(n/p), to a greater extent in the mid-rapidity and low kinetic energy region. In conclusion, the inclusion of the binding energy cut in the clusterization algorithm is found to play a crucial role in the study of isospin physics. This study will give another direction for the determination of symmetry energy in heavy-ion collisions at intermediate energies.
Slepetz, Brad; Laszlo, Istvan; Gogotsi, Yury; Hyde-Volpe, David; Kertesz, Miklos
2010-11-14
Point defects and pores in diamond affect its optical and electrical properties. We generated and evaluated a large number of vacancy V(n) clusters representing nanosized voids in diamonds for n up to 65. Our generational algorithm spawns the new generation n + 1 from the list of the most stable structures in the previous generation n. With energy as the only criterion, we generate a large structural diversity that allows their unbiased analysis. Since π-electron delocalization is important for carbon, we used quantum mechanical tight-binding density functional theory (TBDFT). Adamantane-like globular shapes are preferred for n up to ≈22. Beginning around n ≈ 35, the most stable structures show overall oblate shapes with some irregularities. These novel structures have not been seen before because hitherto only highly regular structures were considered. We see local graphitization in these relaxed structures providing an atomistic justification for the widely used "slit pore" model. The preference for structures with minimum number of cut bonds diminishes as n increases. There are no particularly stable "magic" sizes for vacancy clusters larger than n = 22 indicating that these larger voids can easily incorporate small vacancies and vacancy clusters. Radial distribution analysis shows that unusual contact or bond distances in the 1.6 to 2.8 Å range appear in the vicinity of the internal surfaces of the vacancy clusters. Extremely long C-C bonds emerge as a result of structural relaxation of the dangling bonds in the vicinity of the vacancy clusters that cannot be simply described by ordinary sp2/sp3 hybridization. PMID:20856969
Approximation Algorithms for Projective Clustering Pankaj K. Agarwal Cecilia M. Procopiuc
Agarwal, Pankaj K.
.S.-Israeli Binational Science Foundation. A preliminary version of this paper appeared in the Proceedings of the 11th increase exponentially with the dimension, projective clustering is used to reduce the dimensionality
Efficient Active Algorithms for Hierarchical Clustering Akshay Krishnamurthy akshaykr@cs.cmu.edu
Gordon, Geoffrey J.
on performance, measurement complexity and run-time complexity. We instantiate this framework with a simple spectral clustering algorithm and provide concrete results on its performance, showing that
The X-Alter algorithm : a parameter-free method to perform clustering
Paris-Sud XI, Université de
according to some distance measure. For a thorough introduction of the subject, we refer to the book method of clustering (MacQueen, 1967; Hartigan and Wong, 1979). Its attractiveness lies in its sym
David A. Levine; Ian F. Akyildiz; Mahmoud Naghshineh
1997-01-01
The concept can be used to estimate future resource requirements and to perform call admission decisions in wireless networks. Shadow clusters can be used to decide if a new call can be admitted to a wireless network based on its quality-of-service (QoS) requirements and local traffic conditions. The shadow cluster concept can especially be useful in future wireless networks with
Early Experiences with Algorithm Optimizations on Clusters of Playstation 3's
R. Linderman
2008-01-01
A 50 TeraFLOPS cluster of Playstation 3s (PS3) is being built at the Air Force Research Laboratory, Information Directorate (AFRL/RI) in Rome, NY as a High Performance Computing Modernization Program (HPCMP) effort to provide early access to the IBM Cell Broadband engine chip technology included in the low priced commodity gaming consoles. A heterogeneous cluster with powerful subcluster headnodes is
A Tabu Search Algorithm for Cluster Building in Wireless Sensor Networks
Abdelmorhit El Rhazi; Samuel Pierre
2009-01-01
The main challenge in wireless sensor network deployment pertains to optimizing energy consumption when collecting data from sensor nodes. This paper proposes a new centralized clustering method for a data collection mechanism in wireless sensor networks, which is based on network energy maps and quality-of-service (QoS) requirements. The clustering problem is modeled as a hypergraph partitioning and its resolution is
An Algorithm for Detecting Unimodal Fuzzy Sets and Its Application as a Clustering Technique
I. Gitman; M. D. Levine
1970-01-01
An algorithm is presented which partitions a given sample from a multimodal fuzzy set into unimodal fuzzy sets. It is proven that if certain assumptions are satisfied, then the algorithm will derive the optimal partition in the sense of maximum separation.
An effective four-stage smoke-detection algorithm using video images for early fire-alarm systems
Truong Xuan Tung; Jong-Myon Kim
2011-01-01
This paper proposes an effective, four-stage smoke-detection algorithm using video images. In the first stage, an approximate median method is used to segment moving regions in a video frame. In the second stage, a fuzzy c-means (FCM) method is used to cluster candidate smoke regions from these moving regions. In the third phase, a parameter extraction method is used to
Hualei Zhang; Dongxu Tian
2008-01-01
Structural evolution of Ag_n^v (v = ±1, 0; n = 3–14) clusters has been studied using an extensive, unbiased search based on genetic algorithm and density functional theory (DFT) methods. Cationic, neutral, and anionic silver clusters have planar shapes for their lowest-energy structures up to n = 7, 6, and 6, respectively. Most of the competitive candidates for Ag_n^v (v = ±1, 0; n = 9–14) are found to adopt close-flat configurations. The present results obtained
Solving the Parameter Setting in Multi-Objective Evolutionary Algorithms Using Grid::Cluster
Eduardo Segredo; Casiano Rodríguez; Coromoto León
The parameter values of a Multi-objective Evolutionary Algorithm greatly determine the behavior of the algorithm to find good solutions within a reasonable time for a particular problem. In general, static strategies consume lots of computational resources and time. In this work, a tool is used to develop a static strategy to solve the parameter setting problem, applied to the particular
NASA Astrophysics Data System (ADS)
Cazade, Pierre-André; Zheng, Wenwei; Prada-Gracia, Diego; Berezovska, Ganna; Rao, Francesco; Clementi, Cecilia; Meuwly, Markus
2015-01-01
The ligand migration network for O2-diffusion in truncated Hemoglobin N is analyzed based on three different clustering schemes. For coordinate-based clustering, the conventional k-means and the kinetics-based Markov Clustering (MCL) methods are employed, whereas the locally scaled diffusion map (LSDMap) method is a collective-variable-based approach. It is found that all three methods agree well in their geometrical definition of the most important docking site, and all experimentally known docking sites are recovered by all three methods. Also, for most of the states, their population coincides quite favourably, whereas the kinetics of and between the states differs. One of the major differences between k-means and MCL clustering on the one hand and LSDMap on the other is that the latter finds one large primary cluster containing the Xe1a, IS1, and ENT states. This is related to the fact that the motion within the state occurs on similar time scales, whereas structurally the state is found to be quite diverse. In agreement with previous explicit atomistic simulations, the Xe3 pocket is found to be a highly dynamical site which points to its potential role as a hub in the network. This is also highlighted in the fact that LSDMap cannot identify this state. First passage time distributions from MCL clusterings using a one- (ligand-position) and two-dimensional (ligand-position and protein-structure) descriptor suggest that ligand- and protein-motions are coupled. The benefits and drawbacks of the three methods are discussed in a comparative fashion and highlight that depending on the questions at hand the best-performing method for a particular data set may differ.
NASA Astrophysics Data System (ADS)
Fujii, Michiko; Iwasawa, Masaki; Funato, Yoko; Makino, Junichiro
2007-12-01
We developed a new direct-tree hybrid N-body algorithm for fully self-consistent N-body simulations of star clusters in their parent galaxies. In such simulations, star clusters need high accuracy, while galaxies need a fast scheme because of the large number of particles required to model them. In our new algorithm, the internal motion of the star cluster is calculated accurately using the direct Hermite scheme with individual timesteps, and all other motions are calculated using the tree code with a second-order leapfrog integrator. The direct and tree schemes are combined using an extension of the mixed variable symplectic (MVS) scheme. Thus, the Hamiltonian corresponding to everything other than the internal motion of the star cluster is integrated with the leapfrog, which is symplectic. Using this algorithm, we performed fully self-consistent N-body simulations of star clusters in their parent galaxy. The internal and orbital evolutions of the star cluster agreed well with those obtained using the direct scheme. We also performed a fully self-consistent N-body simulation for large-N models (N = 2 × 10^6). In this case, the calculation was seven times faster than it would have been with the direct scheme.
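The second-order leapfrog integrator mentioned in the abstract above can be sketched for a single particle as follows. This is only a generic kick-drift-kick leapfrog on a harmonic oscillator, chosen because its exact energy is known; it is not the authors' hybrid direct-tree scheme, and the names `leapfrog` and `energy` are assumptions for this example.

```python
def leapfrog(position, velocity, acceleration, dt, steps):
    """Kick-drift-kick leapfrog for one particle in one dimension."""
    a = acceleration(position)
    for _ in range(steps):
        velocity += 0.5 * dt * a     # half kick with the old acceleration
        position += dt * velocity    # full drift
        a = acceleration(position)   # acceleration at the new position
        velocity += 0.5 * dt * a     # half kick with the new acceleration
    return position, velocity


def energy(x, v):
    """Total energy of a unit-mass, unit-frequency harmonic oscillator."""
    return 0.5 * v * v + 0.5 * x * x


# Integrate x'' = -x from x = 1, v = 0 over many periods.
x, v = leapfrog(1.0, 0.0, lambda q: -q, dt=0.01, steps=10000)
print(energy(x, v))
```

Because the scheme is symplectic, the energy error stays bounded over long integrations rather than drifting, which is the property the abstract relies on for the galaxy particles.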
Nguyen Xuan Vinh
Normalizing gene expression profiles to zero mean and unit norm is a common normalization technique for microarray data. Working with this type of normalized data, the Spherical K-means algorithm (SPK-means) using the dot product similarity measure is a suitable gene clustering technique. Although it has been widely known that K-means type algorithms often converge to a local optimum, we noticed
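The normalization and dot-product assignment described in the snippet above can be illustrated with a small spherical k-means sketch. It assumes fixed initial centroids supplied by the caller and a fixed number of iterations; the function names and the toy profiles are hypothetical and are not taken from the paper.

```python
import math


def normalize(profile):
    """Shift a profile to zero mean and scale it to unit Euclidean norm."""
    mean = sum(profile) / len(profile)
    centered = [x - mean for x in profile]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0.0:
        raise ValueError("a constant profile cannot be scaled to unit norm")
    return [x / norm for x in centered]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def spherical_kmeans(profiles, centroids, iterations=20):
    """Assign by largest dot product; update centroids as normalized sums."""
    data = [normalize(p) for p in profiles]
    centroids = [normalize(c) for c in centroids]
    labels = [0] * len(data)
    for _ in range(iterations):
        labels = [max(range(len(centroids)), key=lambda j: dot(x, centroids[j]))
                  for x in data]
        for j in range(len(centroids)):
            members = [x for x, lab in zip(data, labels) if lab == j]
            if not members:
                continue  # keep an empty cluster's centroid unchanged
            total = [sum(col) for col in zip(*members)]
            norm = math.sqrt(dot(total, total))
            if norm > 0.0:
                centroids[j] = [t / norm for t in total]
    return labels, centroids


# Two rising and two falling toy profiles (hypothetical values).
profiles = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 2]]
labels, centroids = spherical_kmeans(profiles, [profiles[0], profiles[2]])
print(labels)
```

After normalization the first two profiles coincide, as do the last two, so the grouping depends only on the shape of each profile and not on its scale.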
Kent, University of
. Reference [1] E. Bonabeau and C. Meyer, `Swarm intelligence', Harvard Business Review, vol. 79, no. 5, pp (sociology, psychology, archaeology, education) and economics (marketing segmentation, business strategy) [1 Manufacturing Systems Seminar, 2007. [3] A. Jain, M. Murty and P. Flynn, `Data clustering: a review', ACM
Training sequence size in clustering algorithms and averaging single-particle images
Dong Sik Kim; Kiryung Lee
2003-01-01
In clustering the training vectors, we may consider an algorithm, which tries to find empirically optimal representative vectors that achieve the empirical minimum to inductively design optimal representative vectors yielding the true optimum. In order to evaluate the performance of the representative vectors, we may observe the empirical minimum with respect to the training ratio, the ratio
A PREDICTION-BASED CLUSTERING ALGORITHM TO ACHIEVE QUALITY OF SERVICE IN MULTIHOP AD HOC NETWORKS
Haddadi, Hamed
for dynamically organising mobile nodes (MNs), in order to support Quality of Service (QoS) in multihop mobile ad failure due to node mobility is the primary obstacle to effective QoS support in MANETs. Hence distributed clustering approach is based on intelligent mobility prediction and enables MNs of a dynamic MANET
An Energy Efficient Weight-Clustering Algorithm in Wireless Sensor Networks
Lu Cheng; Depei Qian; Weiguo Wu
2008-01-01
In recent years, wireless sensor networks (WSNs) have attracted increasing attention due to their bright application prospects in both military and civil fields. However, because a WSN is extremely energy limited, traditional network routing protocols are not suitable for it. Energy conservation becomes a crucial problem in WSN routing protocols. Cluster-based routing protocols such as LEACH conserve energy by
Zbigniew Burski
Summary. This thesis introduces the theoretical basics and application opportunities of an agglomeration method (tree clustering) in research on criminality in motor transport in Poland. It concerns a period of intensification of government transitions in the country, in which the problems of property larceny, including cars, have grown considerably. Application of the mathematical, statistical method enabled
A Unified Hierarchical Algorithm for Global Illumination with Scattering Volumes and Object Clusters
François X. Sillion
1995-01-01
This paper presents a new radiosity algorithm that allows the simultaneous computation of energy exchanges between surface elements, scattering volume distributions, and groups of surfaces, or object clusters. The new technique is based on a hierarchical formulation of the zonal method, and efficiently integrates volumes and surfaces. In particular no initial linking stage is needed, even for inhomogeneous volumes, thanks to the construction of a global spatial hierarchy.
to understand, compare, and extend subspace and projected clustering algorithms. In this work, we discuss [Figure 1: Subspace and projected; panel labels: re-implementation, rare case: common implementation, OpenSubspace, unified evaluation repository]
KmL3D: a non-parametric algorithm for clustering joint trajectories.
Genolini, C; Pingault, J B; Driss, T; Côté, S; Tremblay, R E; Vitaro, F; Arnaud, C; Falissard, B
2013-01-01
In cohort studies, variables are measured repeatedly and can be considered as trajectories. A classic way to work with trajectories is to cluster them in order to detect the existence of homogeneous patterns of evolution. Since cohort studies usually measure a large number of variables, it might be interesting to study the joint evolution of several variables (also called joint-variable trajectories). To date, the only way to cluster joint-trajectories is to cluster each trajectory independently, then to cross the partitions obtained. This approach is unsatisfactory because it does not take into account a possible co-evolution of variable-trajectories. KmL3D is an R package that implements a version of k-means dedicated to clustering joint-trajectories. It provides facilities for the management of missing values, offers several quality criteria and its graphic interface helps the user to select the best partition. KmL3D can work with any number of joint-variable trajectories. In the restricted case of two joint trajectories, it proposes 3D tools to visualize the partitioning and then export 3D dynamic rotating-graphs to PDF format. PMID:23127283
Study on skin color image segmentation used by Fuzzy-c-means arithmetic
Keke Shang; Peng Zhou; Guohui Li
2010-01-01
Skin color image segmentation is an important part in skin image analysis. Segmentation feature parameter and segmentation arithmetic significantly influence the segmentation result. In this article, we compared RGB, HSV, and Lab color spaces and found that HSV color space as segmentation feature parameter has the advantage. Furthermore, we used an improved Fuzzy-c-means arithmetic (IFCM) in skin color image segmentation
Clustering III Lecture outline
Terzi, Evimaria
Clustering III Lecture outline · Soft (model-based) clustering and EM algorithm · Clustering aggregation [A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2004] · Impossibility theorem for clustering [Jon Kleinberg, An impossibility theorem for clustering, NIPS 2002] Expectation
Clustering Aggregation References
Terzi, Evimaria
Clustering Aggregation · References - A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation and clustering, JACM 2008 Tuesday, October 15, 13 Clustering aggregation · Many different clusterings for the same dataset! - Different objective functions - Different algorithms - Different number of clusters
Fast accurate fuzzy clustering through data reduction
Steven Eschrich; Jingwei Ke; Lawrence O. Hall; Dmitry B. Goldgof
2003-01-01
Clustering is a useful approach in image segmentation, data mining, and other pattern recognition problems for which unlabeled data exist. Fuzzy clustering using fuzzy c-means or variants of it can provide a data partition that is both better and more meaningful than hard clustering approaches. The clustering process can be quite slow when there are many objects or patterns to
DWT-CEM: an algorithm for scale-temporal clustering in fMRI
João Ricardo Sato; André Fujita; Edson Amaro Jr.; Janaina Mourão Miranda; Pedro Alberto Morettin; Michal John Brammer
2007-01-01
The number of studies using functional magnetic resonance imaging (fMRI) has grown very rapidly since the first description of the technique in the early 1990s. Most published studies have utilized data analysis methods based on voxel-wise application of general linear models (GLM). On the other hand, temporal clustering analysis (TCA) focuses on the identification of relationships between cortical areas by
A Distributed Processing Algorithm for Solving Integer Programs Using a Cluster of Workstations
G. Mitra; I. Hai; M. T. Hajian
1997-01-01
The sequential branch and bound algorithm is the most established method for solving mixed integer and discrete programming problems. It is based on the tree search of the possible subproblems of the original problem. There are two goals in carrying out a tree search, namely, (i) finding a good and ultimately the best integer solution, and (ii) to prove that
RCMP: Reliable clustering based multi-path routing algorithm for wireless sensor networks
Tohid Bagheri; Ali Ghaffari; Saeed Rasouli Heikalabad
2011-01-01
Although data forwarding algorithms were among the first and most important issues explored in sensor networks, how to efficiently and reliably deliver sensing data through a large field of sensors remains a research challenge. Multi-path routing is a favored alternative for sensor networks, as it provides an easy mechanism to distribute traffic and balance the network's load, as well as considerate
An Investigation of Exploration and Exploitation Within Cluster Oriented Genetic Algorithms (COGAs)
Christopher R. Bonham; Ian C. Parmee
1999-01-01
When conducting a preliminary search across an engineering design space using an evolutionary search method such as the Genetic Algorithm (GA), it is important to achieve the correct balance between exploration and exploitation. If search is too explorative, progress may rapidly degenerate into a random walk where the benefits of evolutionary search are quickly lost. Conversely, if the degree of
NASA Astrophysics Data System (ADS)
Karayiannis, Nicolaos B.
1997-04-01
This paper presents entropy constrained fuzzy clustering and learning vector quantization algorithms and their application in image compression. Entropy constrained fuzzy clustering (ECFC) algorithms were developed by minimizing an objective function incorporating the fuzzy partition entropy and the average distortion between the feature vectors, which represent the image data, and the prototypes, which represent the codevectors or codewords. The reformulation of fuzzy c-means (FCM) algorithms provided the basis for the development of fuzzy learning vector quantization (FLVQ) algorithms and essentially established a link between clustering and learning vector quantization. Minimization of the reformulation function that corresponds to ECFC algorithms using gradient descent results in entropy constrained learning vector quantization (ECLVQ) algorithms. These algorithms allow the gradual transition from a maximally fuzzy partition to a nearly crisp partition of the feature vectors during the learning process. This paper presents two alternative implementations of the proposed algorithms, which differ in terms of the strategy employed for updating the prototypes during learning. The proposed algorithms are tested and evaluated on the design of codebooks used for image data compression.
Qiu-Rui Li; Jing-Qi Xiong; Rui Sun; Zun-Ling Li; Bo Liu
2009-01-01
In response to the roller bearing fault of the engineering vehicle, the excellent time-frequency characteristics of the wavelet packet are used to decompose the fault signal. Then we extract the necessary of the fault signals. Finally, we carry out the recognition of fault types using the multi-class support vector machine and dynamic clustering algorithm brought forward in the paper. It is
Soft Clustering Criterion Functions for Partitional Document Clustering
Karypis, George
Soft Clustering Criterion Functions for Partitional Document Clustering Ying Zhao University have shown that partitional clustering algorithms that optimize certain criterion functions, which measure key aspects of inter- and intra-cluster similarity, are very effective in producing hard
An unsupervised, ensemble clustering algorithm: A new approach for classification of X-ray sources
S. M. Hojnacki; J. H. Kastner; G. Micela; E. D. Feigelson; S. M. LaLonde
2007-01-01
A large volume of CCD X-ray spectra is being generated by the Chandra X-ray Observatory (Chandra) and XMM-Newton. Automated spectral analysis and classification methods can aid in sorting, characterizing, and classifying this large volume of CCD X-ray spectra in a non-parametric fashion, complementary to current parametric model fits. We have developed an algorithm that uses multivariate statistical techniques, including
A new clustered Directed Diffusion Algorithm based on credit of nodes for wireless sensor networks
Farnaz Dargahi; Amir Masoud Rahmani; Reza Samadabadi
2008-01-01
For data gathering in wireless sensor networks, sensors extract useful information from the environment; this information has to be routed through several intermediate nodes to reach the destination. How information can be effectively disseminated to the destination is one of the most important tasks in sensor networks. Due to limited power and slow processor in each node, algorithms of sensor networks must
Mutation Clustering Shamaila Hussain
Singer, Jeremy
Mutation Clustering Shamaila Hussain shamaila.2.hussain@kcl.ac.uk Student Number: 0425528 to reduce the computational cost of the mutation testing, reducing the number of the mutants by clustering. K-means clustering algorithm and agglomerative hierarchical clustering algorithm are implemented
NASA Astrophysics Data System (ADS)
Cameron, Maria; Vanden-Eijnden, Eric
2014-08-01
A set of analytical and computational tools based on transition path theory (TPT) is proposed to analyze flows in complex networks. Specifically, TPT is used to study the statistical properties of the reactive trajectories by which transitions occur between specific groups of nodes on the network. Sampling tools are built upon the outputs of TPT that allow one to generate these reactive trajectories directly, or even transition paths that travel from one group of nodes to the other without making any detour and carry the same probability current as the reactive trajectories. These objects make it possible to characterize the mechanism of the transitions, for example by quantifying the width of the tubes by which these transitions occur, the location and distribution of their dynamical bottlenecks, etc. These tools are applied to a network modeling the dynamics of the Lennard-Jones cluster with 38 atoms and used to understand the mechanism by which this cluster rearranges itself between its two most likely states at various temperatures.
Fuzzy clustering methods for the segmentation of multimodal medical images
Masulli, Francesco
Fuzzy clustering methods for the segmentation of multimodal medical images F. Masulli 2,1 , A volumes, segmentation, fuzzy clustering, neuro-fuzzy possibilistic c-means 1 Introduction Nowadays medical images (MMIs). In particular we consider the Fuzzy C-Means (FCM) [2], the Maximum Entropy
Wu, Shandong; Weinstein, Susan P.; Conant, Emily F.; Kontos, Despina
2013-01-01
Purpose: Breast magnetic resonance imaging (MRI) plays an important role in the clinical management of breast cancer. Studies suggest that the relative amount of fibroglandular (i.e., dense) tissue in the breast as quantified in MR images can be predictive of the risk for developing breast cancer, especially for high-risk women. Automated segmentation of the fibroglandular tissue and volumetric density estimation in breast MRI could therefore be useful for breast cancer risk assessment. Methods: In this work the authors develop and validate a fully automated segmentation algorithm, namely, an atlas-aided fuzzy C-means (FCM-Atlas) method, to estimate the volumetric amount of fibroglandular tissue in breast MRI. The FCM-Atlas is a 2D segmentation method working on a slice-by-slice basis. FCM clustering is first applied to the intensity space of each 2D MR slice to produce an initial voxelwise likelihood map of fibroglandular tissue. Then a prior learned fibroglandular tissue likelihood atlas is incorporated to refine the initial FCM likelihood map to achieve enhanced segmentation, from which the absolute volume of the fibroglandular tissue (|FGT|) and the relative amount (i.e., percentage) of the |FGT| relative to the whole breast volume (FGT%) are computed. The authors' method is evaluated by a representative dataset of 60 3D bilateral breast MRI scans (120 breasts) that span the full breast density range of the American College of Radiology Breast Imaging Reporting and Data System. The automated segmentation is compared to manual segmentation obtained by two experienced breast imaging radiologists. Segmentation performance is assessed by linear regression, Pearson's correlation coefficients, Student's paired t-test, and Dice's similarity coefficients (DSC). Results: The inter-reader correlation is 0.97 for FGT% and 0.95 for |FGT|. 
When compared to the average of the two readers’ manual segmentation, the proposed FCM-Atlas method achieves a correlation of r = 0.92 for FGT% and r = 0.93 for |FGT|, and the automated segmentation is not statistically significantly different (p = 0.46 for FGT% and p = 0.55 for |FGT|). The bilateral correlation between left breasts and right breasts for the FGT% is 0.94, 0.92, and 0.95 for reader 1, reader 2, and the FCM-Atlas, respectively; likewise, for the |FGT|, it is 0.92, 0.92, and 0.93, respectively. For the spatial segmentation agreement, the automated algorithm achieves a DSC of 0.69 ± 0.1 when compared to reader 1 and 0.61 ± 0.1 for reader 2, respectively, while the DSC between the two readers’ manual segmentation is 0.67 ± 0.15. Additional robustness analysis shows that the segmentation performance of the authors' method is stable both with respect to selecting different cases and to varying the number of cases needed to construct the prior probability atlas. The authors' results also show that the proposed FCM-Atlas method outperforms the commonly used two-cluster FCM-alone method. The authors' method runs at ≈5 min for each 3D bilateral MR scan (56 slices) for computing the FGT% and |FGT|, compared to ≈55 min needed for manual segmentation for the same purpose. Conclusions: The authors' method achieves robust segmentation and can serve as an efficient tool for processing large clinical datasets for quantifying the fibroglandular tissue content in breast MRI. It holds a great potential to support clinical applications in the future including breast cancer risk assessment. PMID:24320533
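The Dice similarity coefficient (DSC) used above to measure spatial agreement is twice the overlap of two segmentations divided by the sum of their sizes. A minimal sketch on flattened binary masks follows; the function name, the toy masks, and the convention for two empty masks are assumptions for illustration and do not come from the study.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two binary masks given as flat 0/1 sequences."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(1 for a in mask_a if a) + sum(1 for b in mask_b if b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * overlap / total


# Hypothetical six-voxel masks: automated versus manual segmentation.
automated = [1, 1, 0, 0, 1, 0]
manual = [1, 0, 0, 0, 1, 1]
score = dice_coefficient(automated, manual)
print(score)
```

Here two voxels overlap and each mask marks three voxels, so the score is 2 × 2 / (3 + 3) = 2/3.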
A new time dependent density functional algorithm for large systems and plasmons in metal clusters.
Baseggio, Oscar; Fronzoni, Giovanna; Stener, Mauro
2015-07-14
A new algorithm to solve the Time Dependent Density Functional Theory (TDDFT) equations in the space of the density fitting auxiliary basis set has been developed and implemented. The method extracts the spectrum from the imaginary part of the polarizability at any given photon energy, avoiding the bottleneck of Davidson diagonalization. The original idea which made the present scheme very efficient consists in the simplification of the double sum over occupied-virtual pairs in the definition of the dielectric susceptibility, allowing an easy calculation of such matrix as a linear combination of constant matrices with photon energy dependent coefficients. The method has been applied to very different systems in nature and size (from H2 to [Au147](-)). In all cases, the maximum deviations found for the excitation energies with respect to the Amsterdam density functional code are below 0.2 eV. The new algorithm has the merit not only to calculate the spectrum at whichever photon energy but also to allow a deep analysis of the results, in terms of transition contribution maps, Jacob plasmon scaling factor, and induced density analysis, which have been all implemented. PMID:26178089
An improved FCM medical image segmentation algorithm based on MMTD.
Zhou, Ningning; Yang, Tingting; Zhang, Shaobai
2014-01-01
Image segmentation plays an important role in medical image processing. Fuzzy c-means (FCM) is one of the popular clustering algorithms for medical image segmentation. But FCM is highly vulnerable to noise because it does not consider spatial information in image segmentation. This paper introduces the medium mathematics system, which is employed to process fuzzy information for image segmentation. It establishes the medium similarity measure based on the measure of medium truth degree (MMTD) and uses the correlation of the pixel and its neighbors to define the medium membership function. An improved FCM medical image segmentation algorithm based on MMTD which takes some spatial features into account is proposed in this paper. The experimental results show that the proposed algorithm is more robust to noise than the standard FCM, with more certainty and less fuzziness. This will lead to its practicable and effective applications in medical image segmentation. PMID:24648852
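For reference, the standard FCM that the abstract above takes as its baseline alternates between a membership update and a center update. The sketch below runs it on scalar pixel intensities with no spatial term, which is the limitation the abstract points to. It does not implement the MMTD-based variant; the function names, toy intensities, fuzzifier m = 2, and the small `eps` guard are assumptions for this example.

```python
def memberships_for(values, centers, m=2.0, eps=1e-12):
    """Standard FCM membership of each value in each cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    rows = []
    for x in values:
        dists = [abs(x - c) + eps for c in centers]  # eps avoids division by zero
        rows.append([1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                     for d_i in dists])
    return rows


def fcm_1d(values, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    centers = list(centers)
    for _ in range(iterations):
        rows = memberships_for(values, centers, m)
        for i in range(len(centers)):
            weights = [row[i] ** m for row in rows]
            centers[i] = sum(w * x for w, x in zip(weights, values)) / sum(weights)
    return centers, memberships_for(values, centers, m)


# Hypothetical grey levels: a dark group and a bright group.
pixels = [10, 12, 11, 200, 205, 198]
centers, memberships = fcm_1d(pixels, [0.0, 255.0])
print(centers)
```

Each pixel is treated independently of its neighbors here, so a single noisy pixel is assigned purely by its own intensity.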
Combining Multiple Weak Clusterings
Alexander P. Topchy; Anil K. Jain; William F. Punch
2003-01-01
A data set can be clustered in many ways depending on the clustering algorithm employed, parameter settings used and other factors. Can multiple clusterings be combined so that the final partitioning of data provides better clustering? The answer depends on the quality of clusterings to be combined as well as the properties of the fusion method. First, we introduce a
Distance Measures Hierarchical Clustering
Ullman, Jeffrey D.
Clustering: Distance Measures, Hierarchical Clustering, k-Means Algorithms. The Problem of Clustering: Given a set of points, with a notion of distance between points, group the points into some number of clusters, so that members of a cluster are in some sense as close to each other as possible. Example
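As a small illustration of the problem statement above, the following sketch groups points by repeatedly merging the two closest clusters. It assumes Euclidean distance and single linkage (distance between the closest pair of members); both choices, and the function name, are assumptions of this sketch rather than anything stated in the lecture excerpt.

```python
import math


def agglomerative(points, k):
    """Merge the two closest clusters (single linkage) until k remain."""
    clusters = [[p] for p in points]

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > k:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda ij: linkage(clusters[ij[0]],
                                                 clusters[ij[1]]))
        # j > i, so removing cluster j does not shift cluster i.
        clusters[i].extend(clusters.pop(j))
    return clusters
```

Stopping at a different `k` yields a coarser or finer level of the same merge hierarchy.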
Lecture outline Clustering aggregation
Terzi, Evimaria
Lecture outline · Clustering aggregation. Reference: A. Gionis, H. Mannila, P. Tsaparas: Clustering aggregation, ICDE 2004 · Co-clustering (or bi-clustering). References: A. Anagnostopoulos, A. Dasgupta and R. Kumar: Approximation Algorithms for co-clustering, PODS 2008. K. Puolamaki. S. Hanhijarvi
NASA Astrophysics Data System (ADS)
Domínguez, Eduardo; Lage-Castellanos, Alejandro; Mulet, Roberto
2015-07-01
We study two free energy approximations (Bethe and plaquette-CVM) for the Random Field Ising Model in two dimensions. We compare results obtained by these two methods in single instances of the model on the square grid, showing the difficulties arising in defining a robust critical line. We also attempt average case calculations using a replica-symmetric ansatz, and compare the results with single instances. Both the Bethe and plaquette-CVM approximations present a similar panorama in the phase space, predicting long range order at low temperatures and fields. We show that plaquette-CVM is more precise, in the sense that it predicts a lower critical line (the truth being no line at all). Furthermore, we give some insight into the non-trivial structure of the fixed points of different message passing algorithms. A study of the Monte Carlo dynamics for an arbitrary sample shows that GBP states are very well correlated with states that are attractors of the stochastic dynamics.
Automatic skull stripping in MRI based on morphological filters and fuzzy c-means segmentation.
Gambino, Orazio; Daidone, Enrico; Sciortino, Matteo; Pirrone, Roberto; Ardizzone, Edoardo
2011-01-01
In this paper a new automatic skull stripping method for T1-weighted MR images of the human brain is presented. Skull stripping is the process of separating the brain from the surrounding tissues. The proposed method is based on a 2D brain extraction making use of fuzzy c-means segmentation and morphological operators applied on transversal slices. The approach is extended to the 3D case, taking into account the result obtained from the preceding slice to solve the organ splitting problem. The proposed approach is compared with BET (Brain Extraction Tool) implemented in MRIcro software. PMID:22255471
Neighborgram Clustering Interactive Exploration of Cluster Neighborhoods
Berthold, Michael R.
Neighborgram Clustering Interactive Exploration of Cluster Neighborhoods Michael R. Berthold, Bernd of clusters for a given data set. The clustering is done by constructing local histograms, which can then be used to visualize, select, and fine-tune potential cluster candidates. The accompanying algorithm can
Chen, Zhijia; Zhu, Yuanchang; Di, Yanqiang; Feng, Shaochong
2015-01-01
In an IaaS (infrastructure as a service) cloud environment, users are provisioned with virtual machines (VMs). To allocate resources for users dynamically and effectively, accurate prediction of resource demands is essential. For this purpose, this paper proposes a self-adaptive prediction method using an ensemble model and a subtractive-fuzzy clustering based fuzzy neural network (ESFCFNN). We analyze the characteristics of user preferences and demands. Then the architecture of the prediction model is constructed. We adopt several base predictors to compose the ensemble model. Then the structure and learning algorithm of the fuzzy neural network are studied. To obtain the number of fuzzy rules and the initial values of the premise and consequent parameters, this paper proposes combining fuzzy c-means with the subtractive clustering algorithm, that is, subtractive-fuzzy clustering. Finally, we adopt different criteria to evaluate the proposed method. The experimental results show that the method is accurate and effective in predicting resource demands. PMID:25691896
Bio Inspired Swarm Algorithm for Tumor Detection in Digital Mammogram
NASA Astrophysics Data System (ADS)
Dheeba, J.; Selvi, Tamil
Microcalcification clusters in mammograms are a significant early sign of breast cancer. Individual clusters are difficult to detect, and hence an automatic computer-aided mechanism will help the radiologist detect the microcalcification clusters easily and efficiently. This paper presents a new classification approach for detection of microcalcification in digital mammograms using a particle swarm optimization (PSO) based clustering technique. The fuzzy C-means clustering technique, well established for clustering data sets, is used in combination with PSO. We adopt particle swarm optimization to search for the cluster centers in an arbitrary data set automatically. PSO can search for the best solution from the probability option of the Social-only model and Cognition-only model. This method is quite simple and valid, and it can avoid local minima. The proposed classification approach is applied to a database of 322 dense mammographic images, originating from the MIAS database. Results show that the proposed PSO-FCM approach gives better detection performance compared to conventional approaches.
Clustering method for estimating principal diffusion directions.
Nazem-Zadeh, Mohammad-Reza; Jafari-Khouzani, Kourosh; Davoodi-Bojd, Esmaeil; Jiang, Quan; Soltanian-Zadeh, Hamid
2011-08-01
Diffusion tensor magnetic resonance imaging (DTMRI) is a non-invasive tool for the investigation of white matter structure within the brain. However, the traditional tensor model is unable to characterize anisotropies of orders higher than two in heterogeneous areas containing more than one fiber population. To resolve this issue, high angular resolution diffusion imaging (HARDI) with a large number of diffusion encoding gradients is used along with reconstruction methods such as Q-ball. Using HARDI data, the fiber orientation distribution function (ODF) on the unit sphere is calculated and used to extract the principal diffusion directions (PDDs). Fast and accurate estimation of PDDs is a prerequisite for tracking algorithms that deal with fiber crossings. In this paper, the PDDs are defined as the directions around which the ODF data is concentrated. Estimates of the PDDs based on this definition are less sensitive to noise in comparison with the previous approaches. A clustering approach to estimate the PDDs is proposed, which is an extension of fuzzy c-means clustering developed for the orientation of points on a sphere. The minimum description length (MDL) principle is proposed to estimate the number of PDDs. Using both simulated and real diffusion data, the proposed method has been evaluated and compared with some previous protocols. Experimental results show that the proposed clustering algorithm is more accurate, more resistant to noise, and faster than some of the techniques currently being utilized. PMID:21642005
Clustering method for estimating principal diffusion directions
Nazem-Zadeh, Mohammad-Reza; Jafari-Khouzani, Kourosh; Davoodi-Bojd, Esmaeil; Jiang, Quan; Soltanian-Zadeh, Hamid
2012-01-01
Diffusion tensor magnetic resonance imaging (DTMRI) is a non-invasive tool for the investigation of white matter structure within the brain. However, the traditional tensor model is unable to characterize anisotropies of orders higher than two in heterogeneous areas containing more than one fiber population. To resolve this issue, high angular resolution diffusion imaging (HARDI) with a large number of diffusion encoding gradients is used along with reconstruction methods such as Q-ball. Using HARDI data, the fiber orientation distribution function (ODF) on the unit sphere is calculated and used to extract the principal diffusion directions (PDDs). Fast and accurate estimation of PDDs is a prerequisite for tracking algorithms that deal with fiber crossings. In this paper, the PDDs are defined as the directions around which the ODF data is concentrated. Estimates of the PDDs based on this definition are less sensitive to noise in comparison with the previous approaches. A clustering approach to estimate the PDDs is proposed, which is an extension of fuzzy c-means clustering developed for the orientation of points on a sphere. The minimum description length (MDL) principle is proposed to estimate the number of PDDs. Using both simulated and real diffusion data, the proposed method has been evaluated and compared with some previous protocols. Experimental results show that the proposed clustering algorithm is more accurate, more resistant to noise, and faster than some of the techniques currently being utilized. PMID:21642005
NASA Astrophysics Data System (ADS)
Li, Xiaohe; Zhang, Taiyi; Qu, Zhan
Image segmentation is an essential processing step for many image analysis applications. In this paper, a novel image segmentation algorithm using fuzzy C-means clustering (FCM) with spatial constraints based on a Markov random field (MRF) via Bayesian theory is proposed. Because it disregards spatial constraint information, the FCM algorithm fails to segment images corrupted by noise. In order to improve the robustness of FCM to noise, a powerful model for the membership functions that incorporates local correlation is given by an MRF defined through a Gibbs function. Spatial information is then incorporated into the FCM by Bayesian theory. Therefore, the proposed algorithm has the advantages of both FCM and MRF, and is robust to noise. Experimental results on synthetic and real-world images are given to demonstrate the robustness and validity of the proposed algorithm.
Document clustering using particle swarm optimization
Xiaohui Cui; Thomas E. Potok; Paul Palathingal
2005-01-01
Fast and high-quality document clustering algorithms play an important role in effectively navigating, summarizing, and organizing information. Recent studies have shown that partitional clustering algorithms are more suitable for clustering large datasets. However, the K-means algorithm, the most commonly used partitional clustering algorithm, can only generate a local optimal solution. In this paper, we present a particle swarm optimization (PSO)
Qin, Xiao
on Heterogeneous Clusters Tao Xie, Xiao Qin, and Mais Nijim Department of Computer Science New Mexico Institute on heterogeneous clusters. We propose a new security- and heterogeneity-driven scheduling algorithm (SHARP without any risk of being attacked. Because of high security overhead in existing clusters, an important
Srivastava, Subodh; Sharma, Neeraj; Singh, S K; Srivastava, R
2014-07-01
In this paper, a combined approach for enhancement and segmentation of mammograms is proposed. In the preprocessing stage, a contrast limited adaptive histogram equalization (CLAHE) method is applied to obtain mammograms with better contrast. After this, the proposed combined methods are applied. In the first step of the proposed approach, a two dimensional (2D) discrete wavelet transform (DWT) is applied to all the input images. In the second step, a proposed nonlinear complex diffusion based unsharp masking and crispening method is applied on the approximation coefficients of the wavelet transformed images to further highlight abnormalities such as micro-calcifications, tumours, etc., and to reduce the false positives (FPs). Thirdly, a modified fuzzy c-means (FCM) segmentation method is applied on the output of the second step. In the modified FCM method, mutual information is proposed as a similarity measure in place of the conventional Euclidean distance based dissimilarity measure for FCM segmentation. Finally, the inverse 2D-DWT is applied. The efficacy of the proposed unsharp masking and crispening method for image enhancement is evaluated in terms of signal-to-noise ratio (SNR), and that of the proposed segmentation method is evaluated in terms of Rand index (RI), global consistency error (GCE), and variation of information (VoI). The performance of the proposed segmentation approach is compared with other commonly used segmentation approaches such as Otsu's thresholding, texture based, k-means, and FCM clustering as well as thresholding. From the obtained results, it is observed that the proposed segmentation approach performs better and takes less processing time than the standard FCM and the other segmentation methods considered. PMID:25190996
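Of the segmentation scores named above, RI is presumably the Rand index, which measures how often two labelings agree on whether a pair of pixels belongs to the same segment. A minimal sketch under that assumption follows; the function name and the flat label-list input are illustrative, and the quadratic pair enumeration is only practical for small inputs, not full-size mammograms.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of element pairs on which two labelings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two elements")
    # A pair agrees when both labelings put it in the same segment,
    # or both put it in different segments.
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)
```

The score depends only on the grouping, not on the label values, so a segmentation and a relabeled copy of it score 1.0.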
Distributed Clustering for Ad Hoc Networks
Stefano Basagni
1999-01-01
A Distributed Clustering Algorithm (DCA) and a Distributed Mobility-Adaptive Clustering (DMAC) algorithm are presented that partition the nodes of a fully mobile network (ad hoc network) into clusters, thus giving the network a hierarchical organization.
NASA Astrophysics Data System (ADS)
Heard, Christopher J.; Heiles, Sven; Vajda, Stefan; Johnston, Roy L.
2014-09-01
The novel surface mode of the Birmingham Cluster Genetic Algorithm (S-BCGA) is employed for the global optimisation of noble metal tetramers upon an MgO (100) substrate at the GGA-DFT level of theory. The effect of element identity and alloying in surface-bound neutral subnanometre clusters is determined by energetic comparison between all compositions of PdnAg(4-n) and PdnPt(4-n). While the binding strengths to the surface increase in the order Pt > Pd > Ag, the excess energy profiles suggest a preference for mixed clusters for both cases. The binding of CO is also modelled, showing that the adsorption site can be predicted solely by electrophilicity. Comparison to CO binding on a single metal atom shows a reversal of the 5σ-d activation process for clusters, weakening the cluster-surface interaction on CO adsorption. Charge localisation determines homotop, CO binding and surface site preferences. The electronic behaviour, which is intermediate between molecular and metallic particles, allows for tunable features in the subnanometre size range.
A stable and unsupervised Fuzzy C-Means for data classification
NASA Astrophysics Data System (ADS)
Taher, Akar; Chehdi, Kacem; Cariou, Claude
2015-04-01
In this paper a stable and unsupervised version of the FCM algorithm, named FCMO, is presented. The originality of the proposed FCMO algorithm relies: i) on the use of an adaptive incremental technique to initialize the class centres that calls into question the intermediate initializations; this technique renders the algorithm stable and deterministic, so the classification results do not vary from one run to another, and ii) on unsupervised evaluation criteria applied to the intermediate classification result to estimate the optimal number of classes; this makes the algorithm unsupervised. The efficiency of this optimized version of FCM is shown through experimental results on its stability and its correct estimation of the number of classes.
Paris-Sud XI, Université de
SCENARIOS OF A NUCLEAR POWER PLANT DIGITAL INSTRUMENTATION AND CONTROL SYSTEM Francesco Di Maioa) equipped with a Digital Instrumentation and Control system (I&C). More specifically, the classification of computer-based systems, expected to improve the plant's safety, reliability and failure detection
Athens, University of
Efficient Image Compression of Medical Images Using the Wavelet Transform and Fuzzy c compression technique, when applied to a set of medical images. 1. Introduction Image compression plays as that they could be used in a multitude of purposes. For instance, it is necessary that medical images