Wen, Haiguang; Shi, Junxing; Chen, Wei; Liu, Zhongming
2018-02-28
The brain represents visual objects with topographic cortical patterns. To address how distributed visual representations enable object categorization, we established predictive encoding models based on a deep residual network, and trained them to predict cortical responses to natural movies. Using this predictive model, we mapped human cortical representations to 64,000 visual objects from 80 categories with high throughput and accuracy. Such representations covered both the ventral and dorsal pathways, reflected multiple levels of object features, and preserved semantic relationships between categories. In the entire visual cortex, object representations were organized into three clusters of categories: biological objects, non-biological objects, and background scenes. In a finer scale specific to each cluster, object representations revealed sub-clusters for further categorization. Such hierarchical clustering of category representations was mostly contributed by cortical representations of object features from middle to high levels. In summary, this study demonstrates a useful computational strategy to characterize the cortical organization and representations of visual features for rapid categorization.
Interactive classification and content-based retrieval of tissue images
NASA Astrophysics Data System (ADS)
Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof
2002-11-01
We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues in pixel, region and image levels. Pixel level features are generated using unsupervised clustering of color and texture values. Region level features include shape information and statistics of pixel level feature values. Image level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models which can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for classification and retrieval of tissue images.
NASA Astrophysics Data System (ADS)
Farsadnia, F.; Rostami Kamrood, M.; Moghaddam Nia, A.; Modarres, R.; Bray, M. T.; Han, D.; Sadatinejad, J.
2014-02-01
Regional frequency analysis is one of several methods for estimating flood quantiles in ungauged or data-scarce watersheds. Among the approaches to regional frequency analysis, different clustering techniques have been proposed in the literature to determine hydrologically homogeneous regions. Recently, the Self-Organizing feature Map (SOM), a modern hydroinformatic tool, has been applied in several studies for clustering watersheds. However, further studies are still needed on the interpretation of the SOM output map for identifying hydrologically homogeneous regions. In this study, a two-level SOM and three clustering methods (fuzzy c-means, K-means, and Ward's agglomerative hierarchical clustering) are applied in an effort to identify hydrologically homogeneous regions in Mazandaran province watersheds in the north of Iran, and their results are compared with each other. First, the SOM is used to form a two-dimensional feature map. Next, the output nodes of the SOM are clustered by using the unified distance matrix algorithm and the three clustering methods to form regions for flood frequency analysis. The heterogeneity test indicates that the four regions achieved by the two-level SOM and Ward approach after adjustments are sufficiently homogeneous. The results suggest that the combination of SOM and Ward is much better than the combination of either SOM and FCM or SOM and K-means.
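As an illustration of the second level of such a two-level scheme, the sketch below applies Ward's agglomerative criterion to a handful of SOM node weight vectors. It is a minimal sketch, not the authors' implementation: the function name `ward_cluster`, the node vectors, and the choice of two regions are all hypothetical, and the SOM training step and the heterogeneity test are omitted.

```python
from itertools import combinations


def ward_cluster(points, n_clusters):
    """Agglomerative clustering with Ward's criterion on a small set of vectors."""
    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            (ma, ca), (mb, cb) = clusters[a], clusters[b]
            na, nb = len(ma), len(mb)
            d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
            # Ward cost: increase in within-cluster sum of squares if a and b merge.
            cost = na * nb / (na + nb) * d2
            if best is None or cost < best[0]:
                best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        merged = (ma + mb, centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)] + [merged]
    return [sorted(members) for members, _ in clusters]


# Hypothetical SOM node weights (two standardized catchment attributes per node).
som_nodes = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8), (0.85, 0.9), (0.5, 0.1)]
regions = ward_cluster(som_nodes, 2)
print(regions)
```

Watersheds would then inherit the region of their best-matching SOM node. The exhaustive pair search is quadratic per merge, which is acceptable for the small number of nodes on a typical SOM grid.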
Cluster Analysis of Weighted Bipartite Networks: A New Copula-Based Approach
Chessa, Alessandro; Crimaldi, Irene; Riccaboni, Massimo; Trapin, Luca
2014-01-01
In this work we are interested in identifying clusters of “positional equivalent” actors, i.e. actors who play a similar role in a system. In particular, we analyze weighted bipartite networks that describe the relationships between actors on one side and features or traits on the other, together with the intensity level to which actors show their features. We develop a methodological approach that takes into account the underlying multivariate dependence among groups of actors. The idea is that positions in a network could be defined on the basis of the similar intensity levels that the actors exhibit in expressing some features, instead of just considering relationships that actors hold with each other. Moreover, we propose a new clustering procedure that exploits the potential of copula functions, a mathematical instrument for modeling the stochastic dependence structure. Our clustering algorithm can be applied both to binary and real-valued matrices. We validate it with simulations and applications to real-world data. PMID:25303095
NASA Astrophysics Data System (ADS)
Farsadnia, Farhad; Ghahreman, Bijan
2016-04-01
Hydrologic homogeneous group identification is considered both fundamental and applied research in hydrology. Clustering methods are among conventional methods to assess the hydrological homogeneous regions. Recently, Self-Organizing feature Map (SOM) method has been applied in some studies. However, the main problem of this method is the interpretation on the output map of this approach. Therefore, SOM is used as input to other clustering algorithms. The aim of this study is to apply a two-level Self-Organizing feature map and Ward hierarchical clustering method to determine the hydrologic homogenous regions in North and Razavi Khorasan provinces. At first by principal component analysis, we reduced SOM input matrix dimension, then the SOM was used to form a two-dimensional features map. To determine homogeneous regions for flood frequency analysis, SOM output nodes were used as input into the Ward method. Generally, the regions identified by the clustering algorithms are not statistically homogeneous. Consequently, they have to be adjusted to improve their homogeneity. After adjustment of the homogeneity regions by L-moment tests, five hydrologic homogeneous regions were identified. Finally, adjusted regions were created by a two-level SOM and then the best regional distribution function and associated parameters were selected by the L-moment approach. The results showed that the combination of self-organizing maps and Ward hierarchical clustering by principal components as input is more effective than the hierarchical method, by principal components or standardized inputs to achieve hydrologic homogeneous regions.
NASA Astrophysics Data System (ADS)
Hernawati, Kuswari; Insani, Nur; Bambang S. H., M.; Nur Hadi, W.; Sahid
2017-08-01
This research aims to map the 33 provinces in Indonesia into a clustered model, based on data on air, water and soil pollution as well as social, demographic and geographic data. The method used in this study was an unsupervised method based on the basic concept of Kohonen networks, or Self-Organizing Feature Maps (SOFM). The method provides the design parameters for the model based on data related directly or indirectly to pollution, namely the demographic and social data, the pollution levels of air, water and soil, and the geographical situation of each province. The parameters used consist of 19 features/characteristics, including the human development index, the number of vehicles, the availability of the plant's water absorption and flood prevention, as well as the geographic and demographic situation. The data used were secondary data from the Central Statistics Agency (BPS), Indonesia. The data are mapped by the SOFM from a high-dimensional vector space into a two-dimensional vector space according to the closeness of location in terms of Euclidean distance. The resulting outputs are represented as clustered groupings. The thirty-three provinces are grouped into five clusters, where each cluster has different features/characteristics and level of pollution. The result can be used to support efforts on prevention and resolution of pollution problems in each cluster in an effective and efficient way.
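The mapping from a high-dimensional feature vector to a two-dimensional grid by Euclidean closeness can be sketched as follows. This is a minimal self-organizing map under stated assumptions: the function names `train_som` and `best_matching_unit`, the 2x2 grid, the linear decay schedules, and the three-feature input vectors are hypothetical choices for illustration, not the configuration or data used in the study.

```python
import math
import random


def best_matching_unit(weights, x):
    """Grid coordinate of the node whose weight vector is closest to x (Euclidean)."""
    return min(weights, key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)))


def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighbourhood function."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighbourhood radius shrinks, floored at 0.5
        for x in data:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])  # pull node toward the input
    return weights


# Hypothetical indicator vectors, each feature already scaled to [0, 1].
provinces = [(0.1, 0.2, 0.1), (0.2, 0.1, 0.2), (0.9, 0.8, 0.9), (0.8, 0.9, 1.0)]
som = train_som(provinces, rows=2, cols=2)
assignment = [best_matching_unit(som, p) for p in provinces]
print(assignment)
```

Inputs that share a best-matching node (or adjacent nodes) form a candidate cluster. Scaling every feature to a common range beforehand matters because the Euclidean distance would otherwise be dominated by the features with the largest numeric range.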
m-BIRCH: an online clustering approach for computer vision applications
NASA Astrophysics Data System (ADS)
Madan, Siddharth K.; Dana, Kristin J.
2015-03-01
We adapt a classic online clustering algorithm called Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) to incrementally cluster large datasets of features commonly used in multimedia and computer vision. We call the adapted version modified-BIRCH (m-BIRCH). The algorithm uses only a fraction of the dataset memory to perform clustering, and updates the clustering decisions when new data comes in. Modifications made in m-BIRCH enable data-driven parameter selection and effectively handle varying density regions in the feature space. Data-driven parameter selection automatically controls the level of coarseness of the data summarization. Effective handling of varying density regions is necessary to represent the different density regions well in the data summarization. We use m-BIRCH to cluster 840K color SIFT descriptors and 60K outlier-corrupted grayscale patches. We use the algorithm to cluster datasets consisting of challenging non-convex clustering patterns. Our implementation of the algorithm provides a useful clustering tool and is made publicly available.
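The idea of summarizing a stream with only a fraction of the dataset memory rests on the clustering feature (CF) of classic BIRCH: a count, a linear sum, and a sum of squared norms, all of which can be updated incrementally. The sketch below shows that building block only. It assumes a flat list of CFs and a fixed, hand-picked radius threshold, so it is neither the CF tree of BIRCH nor the data-driven parameter selection of m-BIRCH; the class and function names are hypothetical.

```python
import math


class ClusteringFeature:
    """BIRCH-style summary of a cluster: count N, linear sum LS, squared-norm sum SS."""

    def __init__(self, dim):
        self.n = 0
        self.ls = [0.0] * dim
        self.ss = 0.0

    def add(self, x):
        self.n += 1
        self.ls = [a + b for a, b in zip(self.ls, x)]
        self.ss += sum(v * v for v in x)

    def centroid(self):
        return [v / self.n for v in self.ls]

    def radius_if_added(self, x):
        """Root-mean-square distance to the centroid if x were absorbed."""
        n = self.n + 1
        ls = [a + b for a, b in zip(self.ls, x)]
        ss = self.ss + sum(v * v for v in x)
        var = ss / n - sum((v / n) ** 2 for v in ls)
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding errors


def summarize(stream, threshold):
    """Absorb each point into the nearest CF if the radius stays under threshold."""
    cfs = []
    for x in stream:
        nearest = min(cfs, key=lambda cf: math.dist(cf.centroid(), x), default=None)
        if nearest is not None and nearest.radius_if_added(x) <= threshold:
            nearest.add(x)
        else:
            cf = ClusteringFeature(len(x))
            cf.add(x)
            cfs.append(cf)
    return cfs


# Hypothetical 2-D feature stream with two well-separated groups.
stream = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
summary = summarize(stream, threshold=0.5)
print([cf.n for cf in summary])
```

Each point is inspected once and then discarded, so memory grows with the number of CFs rather than the number of points. A single fixed threshold is exactly what struggles with varying density regions, which is the limitation the abstract says m-BIRCH addresses.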
Image Recommendation Algorithm Using Feature-Based Collaborative Filtering
NASA Astrophysics Data System (ADS)
Kim, Deok-Hwan
As the multimedia contents market continues its rapid expansion, the amount of image contents used in mobile phone services, digital libraries, and catalog service is increasing remarkably. In spite of this rapid growth, users experience high levels of frustration when searching for the desired image. Even though new images are profitable to the service providers, traditional collaborative filtering methods cannot recommend them. To solve this problem, in this paper, we propose feature-based collaborative filtering (FBCF) method to reflect the user's most recent preference by representing his purchase sequence in the visual feature space. The proposed approach represents the images that have been purchased in the past as the feature clusters in the multi-dimensional feature space and then selects neighbors by using an inter-cluster distance function between their feature clusters. Various experiments using real image data demonstrate that the proposed approach provides a higher quality recommendation and better performance than do typical collaborative filtering and content-based filtering techniques.
Clustering Suicide Attempters: Impulsive-Ambivalent, Well-Planned, or Frequent.
Lopez-Castroman, Jorge; Nogue, Erika; Guillaume, Sebastien; Picot, Marie Christine; Courtet, Philippe
2016-06-01
Attempts to predict suicidal behavior within high-risk populations have so far shown insufficient accuracy. Although several psychosocial and clinical features have been consistently associated with suicide attempts, investigations of latent structure in well-characterized populations of suicide attempters are lacking. We analyzed a sample of 1,009 hospitalized suicide attempters that were recruited between 1999 and 2012. Eleven clinically relevant items related to the characteristics of suicidal behavior were submitted to a Hierarchical Ascendant Classification. Phenotypic profiles were compared between the resulting clusters. A decisional tree was constructed to facilitate the differentiation of individuals classified within the first 2 clusters. Most individuals were included in a cluster characterized by less lethal means and planning ("impulse-ambivalent"). A second cluster featured more carefully planned attempts ("well-planned"), more alcohol or drug use before the attempt, and more precautions to avoid interruptions. Finally, a small, third cluster included individuals reporting more attempts ("frequent"), more often serious or violent attempts, and an earlier age at first attempt. Differences across clusters by demographic and clinical characteristics were also found, particularly with the third cluster whose participants had experienced high levels of childhood abuse. Cluster analysis consistently supported 3 distinct clusters of individuals with specific features in their suicidal behaviors and phenotypic profiles that could help clinicians to better focus prevention strategies. © Copyright 2016 Physicians Postgraduate Press, Inc.
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and performs clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. By providing more accurate information about the activity of a larger number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
NASA Astrophysics Data System (ADS)
Keshtkaran, Mohammad Reza; Yang, Zhi
2017-06-01
Objective. Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. Most of the feature extraction and dimensionality reduction techniques that have been used for spike sorting give a projection subspace which is not necessarily the most discriminative one. Therefore, the clusters which appear inherently separable in some discriminative subspace may overlap if projected using conventional feature extraction approaches, leading to poor sorting accuracy, especially when the noise level is high. In this paper, we propose a noise-robust and unsupervised spike sorting algorithm based on learning discriminative spike features for clustering. Approach. The proposed algorithm uses discriminative subspace learning to extract low-dimensional and most discriminative features from the spike waveforms and performs clustering with automatic detection of the number of clusters. The core part of the algorithm involves iterative subspace selection using linear discriminant analysis and clustering using a Gaussian mixture model with outlier detection. A statistical test in the discriminative subspace is proposed to automatically detect the number of clusters. Main results. Comparative results on publicly available simulated and real in vivo datasets demonstrate that our algorithm achieves substantially improved cluster distinction, leading to higher sorting accuracy and more reliable detection of clusters which are highly overlapping and not detectable using conventional feature extraction techniques such as principal component analysis or wavelets. Significance. By providing more accurate information about the activity of a larger number of individual neurons with high robustness to neural noise and outliers, the proposed unsupervised spike sorting algorithm facilitates more detailed and accurate analysis of single- and multi-unit activities in neuroscience and brain machine interface studies.
Graph-Based Object Class Discovery
NASA Astrophysics Data System (ADS)
Xia, Shengping; Hancock, Edwin R.
We are interested in the problem of discovering the set of object classes present in a database of images using a weakly supervised graph-based framework. Rather than making use of the "Bag-of-Features (BoF)" approach widely used in current work on object recognition, we represent each image by a graph using a group of selected local invariant features. Using local feature matching and iterative Procrustes alignment, we perform graph matching and compute a similarity measure. Borrowing the idea of query expansion, we develop a similarity propagation based graph clustering (SPGC) method. Using this method, class-specific clusters of the graphs can be obtained. Such a cluster can generally be represented by a higher-level graph model whose vertices are the clustered graphs and whose edge weights are determined by the pairwise similarity measure. Experiments are performed on a dataset in which the number of images increases from 1 to 50K and the number of objects increases from 1 to over 500. Some objects have been discovered with total recall and a precision of 1 in a single cluster.
Constrained clusters of gene expression profiles with pathological features.
Sese, Jun; Kurokawa, Yukinori; Monden, Morito; Kato, Kikuya; Morishita, Shinichi
2004-11-22
Gene expression profiles should be useful in distinguishing variations in disease, since they reflect accurately the status of cells. The primary clustering of gene expression reveals the genotypes that are responsible for the proximity of members within each cluster, while further clustering elucidates the pathological features of the individual members of each cluster. However, since the first clustering process and the second classification step, in which the features are associated with clusters, are performed independently, the initial set of clusters may omit genes that are associated with pathologically meaningful features. Therefore, it is important to devise a way of identifying gene expression clusters that are associated with pathological features. We present the novel technique of 'itemset constrained clustering' (IC-Clustering), which computes the optimal cluster that maximizes the interclass variance of gene expression between groups, which are divided according to the restriction that only divisions that can be expressed using common features are allowed. This constraint automatically labels each cluster with a set of pathological features which characterize that cluster. When applied to liver cancer datasets, IC-Clustering revealed informative gene expression clusters, which could be annotated with various pathological features, such as 'tumor' and 'man', or 'except tumor' and 'normal liver function'. In contrast, the k-means method overlooked these clusters.
Inherent Structure versus Geometric Metric for State Space Discretization
Liu, Hanzhong; Li, Minghai; Fan, Jue; Huo, Shuanghong
2016-01-01
Inherent structure (IS) and geometry-based clustering methods are commonly used for analyzing molecular dynamics trajectories. ISs are obtained by minimizing the sampled conformations into local minima on potential/effective energy surface. The conformations that are minimized into the same energy basin belong to one cluster. We investigate the influence of the applications of these two methods of trajectory decomposition on our understanding of the thermodynamics and kinetics of alanine tetrapeptide. We find that at the micro cluster level, the IS approach and root-mean-square deviation (RMSD) based clustering method give totally different results. Depending on the local features of energy landscape, the conformations with close RMSDs can be minimized into different minima, while the conformations with large RMSDs could be minimized into the same basin. However, the relaxation timescales calculated based on the transition matrices built from the micro clusters are similar. The discrepancy at the micro cluster level leads to different macro clusters. Although the dynamic models established through both clustering methods are validated approximately Markovian, the IS approach seems to give a meaningful state space discretization at the macro cluster level. PMID:26915811
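To make the link between a state space discretization and relaxation timescales concrete, the sketch below builds a row-normalized transition matrix from a sequence of cluster labels and derives the single relaxation timescale of a two-state model. It is a minimal sketch under stated assumptions: the label sequence, the lag of one frame, and the function name `transition_matrix` are hypothetical, and the two-state shortcut (second eigenvalue equals the trace minus one) replaces the general eigendecomposition needed for more states.

```python
import math


def transition_matrix(labels, n_states, lag=1):
    """Row-normalized transition probabilities estimated from a label trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i, j in zip(labels[:-lag], labels[lag:]):
        counts[i][j] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical trajectory: cluster index assigned to each saved frame.
labels = [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]
lag = 1
T = transition_matrix(labels, n_states=2, lag=lag)

# A 2x2 row-stochastic matrix has eigenvalues 1 and (trace - 1).
lambda2 = T[0][0] + T[1][1] - 1.0
timescale = -lag / math.log(lambda2)  # relaxation timescale in units of frames
print(T, lambda2, timescale)
```

Two discretizations of the same trajectory (for example IS-based and RMSD-based labels) can be compared by running both label sequences through the same estimate; similar timescales from different microstate assignments are what the abstract reports.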
Prevalence and correlates of binge eating disorder related features in the community.
Mustelin, Linda; Bulik, Cynthia M; Kaprio, Jaakko; Keski-Rahkonen, Anna
2017-02-01
Binge eating disorder (BED) is associated with high levels of obesity and psychological suffering, but little is known about 1) the distribution of features of BED in the general population and 2) their consequences for weight development and psychological distress in young adulthood. We investigated the prevalence of features of BED and their association with body mass index (BMI) and psychological distress among men (n = 2423) and women (n = 2825) from the longitudinal community-based FinnTwin16 cohort (born 1975-1979). Seven eating-related cognitions and behaviors similar to the defining features of BED were extracted from the Eating Disorder Inventory-2 and were assessed at a mean age of 24. BMI and psychological distress, measured with the General Health Questionnaire, were assessed at ages 24 and 34. We assessed prevalence of the features and their association with BMI and psychological distress cross-sectionally and prospectively. More than half of our participants reported at least one feature of BED; clustering of several features in one individual was less common, particularly among men. The most frequently reported feature was 'stuffing oneself with food', whereas the least common was 'eating or drinking in secrecy'. All individual features of BED and their clustering particularly were associated with higher BMI and more psychological distress cross-sectionally. Prospectively, the clustering of features of BED predicted increase in psychological distress but not additional weight gain when baseline BMI was accounted for. In summary, although some features of BED were common, the clustering of several features in one individual was not. The features were cumulatively associated with BMI and psychological distress and predicted further increase in psychological distress over ten years of follow-up. Copyright © 2016. Published by Elsevier Ltd.
Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering.
Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar
2015-01-01
Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted and, relying on the concept of event related synchronization (ERS) and desynchronization (ERD), the feature vector is formed.
Classification of Two Class Motor Imagery Tasks Using Hybrid GA-PSO Based K-Means Clustering
Suraj; Tiwari, Purnendu; Ghosh, Subhojit; Sinha, Rakesh Kumar
2015-01-01
Transferring the brain computer interface (BCI) from laboratory conditions to real-world applications requires the BCI to be applied asynchronously without any time constraint. The high level of dynamism in the electroencephalogram (EEG) signal motivates us to look toward evolutionary algorithms (EA). Motivated by these two facts, in this work a hybrid GA-PSO based K-means clustering technique has been used to distinguish two-class motor imagery (MI) tasks. The proposed hybrid GA-PSO based K-means clustering is found to outperform genetic algorithm (GA) and particle swarm optimization (PSO) based K-means clustering techniques in terms of both accuracy and execution time. The shorter execution time of the hybrid GA-PSO technique makes it suitable for real-time BCI applications. Time frequency representation (TFR) techniques have been used to extract the features of the signal under investigation. TFR-based features are extracted and, relying on the concept of event related synchronization (ERS) and desynchronization (ERD), the feature vector is formed. PMID:25972896
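For orientation, the baseline that the GA and PSO variants build on is plain K-means, sketched below with Lloyd's iteration. This is explicitly not the hybrid GA-PSO method of the abstract: the initial centroids are passed in by hand, whereas an evolutionary search would be used to choose them. The function name `kmeans` and the two-dimensional feature values are hypothetical.

```python
import math


def kmeans(points, initial_centroids, n_iter=20):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical two-feature vectors (e.g. band-power changes on two channels).
trials = [(0.2, 0.9), (0.25, 0.8), (0.8, 0.2), (0.9, 0.3)]
labels, centroids = kmeans(trials, initial_centroids=[trials[0], trials[2]])
print(labels, centroids)
```

K-means converges to a local optimum that depends on the starting centroids, which is the usual motivation for wrapping it in a global search such as GA or PSO.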
Unsupervised spike sorting based on discriminative subspace learning.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-01-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering in each level of the hierarchy until achieving almost unimodal distribution in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in lower dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
Inherent structure versus geometric metric for state space discretization.
Liu, Hanzhong; Li, Minghai; Fan, Jue; Huo, Shuanghong
2016-05-30
Inherent structure (IS) and geometry-based clustering methods are commonly used for analyzing molecular dynamics trajectories. ISs are obtained by minimizing the sampled conformations into local minima on potential/effective energy surface. The conformations that are minimized into the same energy basin belong to one cluster. We investigate the influence of the applications of these two methods of trajectory decomposition on our understanding of the thermodynamics and kinetics of alanine tetrapeptide. We find that at the microcluster level, the IS approach and root-mean-square deviation (RMSD)-based clustering method give totally different results. Depending on the local features of energy landscape, the conformations with close RMSDs can be minimized into different minima, while the conformations with large RMSDs could be minimized into the same basin. However, the relaxation timescales calculated based on the transition matrices built from the microclusters are similar. The discrepancy at the microcluster level leads to different macroclusters. Although the dynamic models established through both clustering methods are validated approximately Markovian, the IS approach seems to give a meaningful state space discretization at the macrocluster level in terms of conformational features and kinetics. © 2016 Wiley Periodicals, Inc.
A coloured oil level indicator detection method based on simple linear iterative clustering
NASA Astrophysics Data System (ADS)
Liu, Tianli; Li, Dongsong; Jiao, Zhiming; Liang, Tao; Zhou, Hao; Yang, Guoqing
2017-12-01
A detection method for coloured oil level indicators is put forward. The method is applied to an inspection robot in a substation, realizing automatic inspection and recognition of the oil level indicator. Firstly, an image of the oil level indicator is collected, and the image is clustered and segmented to obtain the label matrix of the image. Secondly, the image is processed by colour space transformation, and the feature matrix of the image is obtained. Finally, the label matrix and feature matrix are used to locate and segment the image, and the upper edge of the recognized region is obtained. If the upper edge line exceeds the preset oil level threshold, an alarm alerts the station staff. Through the above-mentioned image processing, the inspection robot can independently recognize the oil level of the oil level indicator, replacing manual inspection. It embodies the automatic and intelligent level of unattended operation.
Agent-based model with multi-level herding for complex financial systems
NASA Astrophysics Data System (ADS)
Chen, Jun-Jie; Tan, Lei; Zheng, Bo
2015-02-01
In complex financial systems, the sector structure and volatility clustering are respectively important features of the spatial and temporal correlations. However, the microscopic generation mechanism of the sector structure is not yet understood. Especially, how to produce these two features in one model remains challenging. We introduce a novel interaction mechanism, i.e., the multi-level herding, in constructing an agent-based model to investigate the sector structure combined with volatility clustering. According to the previous market performance, agents trade in groups, and their herding behavior comprises the herding at stock, sector and market levels. Further, we propose methods to determine the key model parameters from historical market data, rather than from statistical fitting of the results. From the simulation, we obtain the sector structure and volatility clustering, as well as the eigenvalue distribution of the cross-correlation matrix, for the New York and Hong Kong stock exchanges. These properties are in agreement with the empirical ones. Our results quantitatively reveal that the multi-level herding is the microscopic generation mechanism of the sector structure, and provide new insight into the spatio-temporal interactions in financial systems at the microscopic level.
Agent-based model with multi-level herding for complex financial systems
Chen, Jun-Jie; Tan, Lei; Zheng, Bo
2015-01-01
In complex financial systems, the sector structure and volatility clustering are respectively important features of the spatial and temporal correlations. However, the microscopic generation mechanism of the sector structure is not yet understood. Especially, how to produce these two features in one model remains challenging. We introduce a novel interaction mechanism, i.e., the multi-level herding, in constructing an agent-based model to investigate the sector structure combined with volatility clustering. According to the previous market performance, agents trade in groups, and their herding behavior comprises the herding at stock, sector and market levels. Further, we propose methods to determine the key model parameters from historical market data, rather than from statistical fitting of the results. From the simulation, we obtain the sector structure and volatility clustering, as well as the eigenvalue distribution of the cross-correlation matrix, for the New York and Hong Kong stock exchanges. These properties are in agreement with the empirical ones. Our results quantitatively reveal that the multi-level herding is the microscopic generation mechanism of the sector structure, and provide new insight into the spatio-temporal interactions in financial systems at the microscopic level. PMID:25669427
Online writer identification using alphabetic information clustering
NASA Astrophysics Data System (ADS)
Tan, Guo Xian; Viard-Gaudin, Christian; Kot, Alex C.
2009-01-01
Writer identification is a topic of much renewed interest today because of its importance in applications such as writer adaptation, routing of documents and forensic document analysis. Various algorithms have been proposed to handle such tasks. Of particular interest are the approaches that use allographic features [1-3] to perform a comparison of the documents in question. The allographic features are used to define prototypes that model the unique handwriting styles of the individual writers. This paper investigates a novel perspective that takes alphabetic information into consideration when the allographic features are clustered into prototypes at the character level. We hypothesize that alphabetic information provides additional clues which help in the clustering of allographic prototypes. An alphabet information coefficient (AIC) has been introduced in our study and the effect of this coefficient is presented. Our experiments showed an increase of writer identification accuracy from 66.0% to 87.0% when alphabetic information was used in conjunction with allographic features on a database of 200 reference writers.
Dynamical organization towards consensus in the Axelrod model on complex networks
NASA Astrophysics Data System (ADS)
Guerra, Beniamino; Poncela, Julia; Gómez-Gardeñes, Jesús; Latora, Vito; Moreno, Yamir
2010-05-01
We analyze the dynamics toward cultural consensus in the Axelrod model on scale-free networks. By looking at the microscopic dynamics of the model, we are able to show how culture traits spread across different cultural features. We compare the diffusion at the level of cultural features to the growth of cultural consensus at the global level, finding important differences between these two processes. In particular, we show that even when most of the cultural features have reached macroscopic consensus, there are still no signals of globalization. Finally, we analyze the topology of consensus clusters both for global culture and at the feature level of representation.
Interpersonal Subtypes Within Social Anxiety: The Identification of Distinct Social Features.
Cooper, Danielle; Anderson, Timothy
2017-10-05
Although social anxiety disorder is defined by anxiety-related symptoms, little research has focused on the interpersonal features of social anxiety. Prior studies (Cain, Pincus, & Grosse Holtforth, 2010; Kachin, Newman, & Pincus, 2001) identified distinct subgroups of socially anxious individuals' interpersonal circumplex problems that were blends of agency and communion, and yet inconsistencies remain. We predicted 2 distinct interpersonal subtypes would exist for individuals with high social anxiety, and that these social anxiety subtypes would differ on empathetic concern, paranoia, received peer victimization, perspective taking, and emotional suppression. From a sample of 175 undergraduate participants, 51 participants with high social anxiety were selected as above a clinical cutoff on the social phobia scale. Cluster analyses identified 2 interpersonal subtypes of socially anxious individuals: low hostility-high submissiveness (Cluster 1) and high hostility-high submissiveness (Cluster 2). Cluster 1 reported higher levels of empathetic concern, lower paranoia, less peer victimization, and lower emotional suppression compared to Cluster 2. There were no differences between subtypes on perspective taking or cognitive reappraisal. Findings are consistent with an interpersonal conceptualization of social anxiety, and provide evidence of distinct social features between these subtypes. Findings have implications for the etiology, classification, and treatment of social anxiety.
A two-stage method for microcalcification cluster segmentation in mammography by deformable models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Arikidis, N.; Kazantzi, A.; Skiadopoulos, S.
Purpose: Segmentation of microcalcification (MC) clusters in x-ray mammography is a difficult task for radiologists. Accurate segmentation is a prerequisite for quantitative image analysis of MC clusters and subsequent feature extraction and classification in computer-aided diagnosis schemes. Methods: In this study, a two-stage semiautomated segmentation method of MC clusters is investigated. The first stage is targeted to accurate and time efficient segmentation of the majority of the particles of an MC cluster, by means of a level set method. The second stage is targeted to shape refinement of selected individual MCs, by means of an active contour model. Both methods are applied in the framework of a rich scale-space representation, provided by the wavelet transform at integer scales. Segmentation reliability of the proposed method in terms of inter- and intraobserver agreements was evaluated in a case sample of 80 MC clusters originating from the digital database for screening mammography, corresponding to 4 morphology types (punctate: 22, fine linear branching: 16, pleomorphic: 18, and amorphous: 24) of MC clusters, assessing radiologists' segmentations quantitatively by two distance metrics (Hausdorff distance, HDIST_cluster, and average of minimum distance, AMINDIST_cluster) and the area overlap measure (AOM_cluster). The effect of the proposed segmentation method on MC cluster characterization accuracy was evaluated in a case sample of 162 pleomorphic MC clusters (72 malignant and 90 benign). Ten MC cluster features, targeted to capture morphologic properties of individual MCs in a cluster (area, major length, perimeter, compactness, and spread), were extracted and a correlation-based feature selection method yielded a feature subset to feed in a support vector machine classifier. Classification performance of the MC cluster features was estimated by means of the area under receiver operating characteristic curve (Az ± Standard Error) utilizing tenfold cross-validation methodology. A previously developed B-spline active rays segmentation method was also considered for comparison purposes. Results: Interobserver and intraobserver segmentation agreements (median and [25%, 75%] quartile range) were substantial with respect to the distance metrics HDIST_cluster (2.3 [1.8, 2.9] and 2.5 [2.1, 3.2] pixels) and AMINDIST_cluster (0.8 [0.6, 1.0] and 1.0 [0.8, 1.2] pixels), while moderate with respect to AOM_cluster (0.64 [0.55, 0.71] and 0.59 [0.52, 0.66]). The proposed segmentation method (0.80 ± 0.04) outperformed the B-spline active rays segmentation method (0.69 ± 0.04) with statistical significance (Mann-Whitney U-test, p < 0.05), suggesting the significance of the proposed semiautomated method. Conclusions: Results indicate a reliable semiautomated segmentation method for MC clusters offered by deformable models, which could be utilized in MC cluster quantitative image analysis.
NASA Astrophysics Data System (ADS)
De, Sandip; Schaefer, Bastian; Sadeghi, Ali; Sicher, Michael; Kanhere, D. G.; Goedecker, Stefan
2014-02-01
Based on a recently introduced metric for measuring distances between configurations, we introduce distance-energy (DE) plots to characterize the potential energy surface of clusters. Producing such plots is computationally feasible on the density functional level since it requires only a few hundred stable low energy configurations including the global minimum. By using standard criteria based on disconnectivity graphs and the dynamics of Lennard-Jones clusters, we show that the DE plots convey the necessary information about the character of the potential energy surface and allow us to distinguish between glassy and nonglassy systems. We then apply this analysis to real clusters at the density functional theory level and show that both glassy and nonglassy clusters can be found in simulations. It turns out that among our investigated clusters only those can be synthesized experimentally which exhibit a nonglassy landscape.
Object-Oriented Image Clustering Method Using UAS Photogrammetric Imagery
NASA Astrophysics Data System (ADS)
Lin, Y.; Larson, A.; Schultz-Fellenz, E. S.; Sussman, A. J.; Swanson, E.; Coppersmith, R.
2016-12-01
Unmanned Aerial Systems (UAS) have been used widely as an imaging modality to obtain remotely sensed multi-band surface imagery, and are growing in popularity due to their efficiency, ease of use, and affordability. Los Alamos National Laboratory (LANL) has employed UAS for geologic site characterization and change detection studies at a variety of field sites. The deployed UAS is equipped with a standard visible-band camera to collect imagery datasets. Based on the imagery collected, we use deep sparse algorithmic processing to detect and discriminate subtle topographic features created or impacted by subsurface activities. In this work, we develop an object-oriented remote sensing imagery clustering method for land cover classification. To improve the clustering and segmentation accuracy, instead of using conventional pixel-based clustering methods, we integrate the spatial information from neighboring regions to create super-pixels to avoid salt-and-pepper noise and subsequent over-segmentation. To further improve robustness of our clustering method, we also incorporate a custom digital elevation model (DEM) dataset generated using a structure-from-motion (SfM) algorithm together with the red, green, and blue (RGB) band data for clustering. In particular, we first employ an agglomerative clustering to create an initial segmentation map, in which every object is treated as a single (new) pixel. Based on the new pixels obtained, we generate new features to implement another level of clustering. We apply our clustering method to the RGB+DEM datasets collected at the field site. Through binary clustering and multi-object clustering tests, we verify that our method can accurately separate vegetation from non-vegetation regions, and is also able to differentiate object features on the surface.
IR Spectroscopic Studies on Microsolvation of HCl by Water
NASA Astrophysics Data System (ADS)
Mani, Devendra; Schwan, Raffael; Fischer, Theo; Dey, Arghya; Kaufmann, Matin; Redlich, Britta; van der Meer, Lex; Schwaab, Gerhard; Havenith, Martina
2016-06-01
Acid dissociation reactions are at the heart of chemistry. These reactions are well understood at the macroscopic level. However, a microscopic level understanding is still in the early stages of development. Questions such as 'how many H2O molecules are needed to dissociate one HCl molecule?' have been posed and explored both theoretically and experimentally [1-5]. Most of the theoretical calculations predict that four H2O molecules are sufficient to dissociate one HCl molecule, resulting in the formation of a solvent separated H3O+(H2O)3Cl- cluster [1-3]. IR spectroscopy in helium nanodroplets has earlier been used to study this dissociation process [3-5]. However, these studies were carried out in the region of O-H and H-Cl stretch, which is dominated by the spectral features of undissociated (HCl)m-(H2O)n clusters. This contributed to the ambiguity in assigning the spectral features arising from the dissociated cluster [4,5]. Recent predictions from Bowman's group suggest the presence of a broad spectral feature (1300-1360 cm^-1) for the H3O+(H2O)3Cl- cluster, corresponding to the umbrella motion of the H3O+ moiety [6]. This region is expected to be free from the spectral features due to the undissociated clusters. In conjunction with the FELIX laboratory, we have performed experiments on the (HCl)m(H2O)n (m=1-2, n≥4) clusters, aggregated in helium nanodroplets, in the 900-1700 cm^-1 region. Mass selective measurements on these clusters revealed the presence of a weak, broad feature which spans 1000-1450 cm^-1 and depends on both HCl and H2O concentration. Measurements are in progress for the different deuterated species. The details will be presented in the talk. References: 1) C.T. Lee et al., J. Chem. Phys., 104, 7081 (1996). 2) H. Forbert et al., J. Am. Chem. Soc., 133, 4062 (2011). 3) A. Gutberlet et al., Science, 324, 1545 (2009). 4) S. D. Flynn et al., J. Phys. Chem. Lett., 1, 2233 (2010). 5) M. Letzner et al., J. Chem. Phys., 139, 154304 (2013). 6) J. M. Bowman et al., Phys. Chem. Chem. Phys., 17, 6222 (2015).
Patterns of Dysmorphic Features in Schizophrenia
Scutt, L.E.; Chow, E.W.C.; Weksberg, R.; Honer, W.G.; Bassett, Anne S.
2011-01-01
Congenital dysmorphic features are prevalent in schizophrenia and may reflect underlying neurodevelopmental abnormalities. A cluster analysis approach delineating patterns of dysmorphic features has been used in genetics to classify individuals into more etiologically homogeneous subgroups. In the present study, this approach was applied to schizophrenia, using a sample with a suspected genetic syndrome as a testable model. Subjects (n = 159) with schizophrenia or schizoaffective disorder were ascertained from chronic patient populations (random, n=123) or referred with possible 22q11 deletion syndrome (referred, n = 36). All subjects were evaluated for presence or absence of 70 reliably assessed dysmorphic features, which were used in a three-step cluster analysis. The analysis produced four major clusters with different patterns of dysmorphic features. Significant between-cluster differences were found for rates of 37 dysmorphic features (P < 0.05), median number of dysmorphic features (P = 0.0001), and validating features not used in the cluster analysis: mild mental retardation (P = 0.001) and congenital heart defects (P = 0.002). Two clusters (1 and 4) appeared to represent more developmental subgroups of schizophrenia with elevated rates of dysmorphic features and validating features. Cluster 1 (n = 27) comprised mostly referred subjects. Cluster 4 (n= 18) had a different pattern of dysmorphic features; one subject had a mosaic Turner syndrome variant. Two other clusters had lower rates and patterns of features consistent with those found in previous studies of schizophrenia. Delineating patterns of dysmorphic features may help identify subgroups that could represent neurodevelopmental forms of schizophrenia with more homogeneous origins. PMID:11803519
NASA Astrophysics Data System (ADS)
Liang, Yun-Feng; Shen, Zhao-Qiang; Li, Xiang; Fan, Yi-Zhong; Huang, Xiaoyuan; Lei, Shi-Jun; Feng, Lei; Liang, En-Wei; Chang, Jin
2016-05-01
Galaxy clusters are the largest gravitationally bound objects in the Universe and may be suitable targets for indirect dark matter searches. With 85 months of Fermi LAT Pass 8 publicly available data, we analyze the gamma-ray emission in the direction of 16 nearby galaxy clusters with an unbinned likelihood analysis. No statistically or globally significant γ-ray line feature is identified and a tentative line signal may be present at ~43 GeV. The 95% confidence level upper limits on the velocity-averaged cross section of dark matter particles annihilating into double γ rays (i.e., ⟨σv⟩ for χχ→γγ) are derived. Unless very optimistic boost factors of dark matter annihilation in these galaxy clusters have been assumed, such constraints are much weaker than the bounds set by the Galactic γ-ray data.
NASA Astrophysics Data System (ADS)
Brandl, Miriam B.; Beck, Dominik; Pham, Tuan D.
2011-06-01
The high dimensionality of image-based dataset can be a drawback for classification accuracy. In this study, we propose the application of fuzzy c-means clustering, cluster validity indices and the notation of a joint-feature-clustering matrix to find redundancies of image-features. The introduced matrix indicates how frequently features are grouped in a mutual cluster. The resulting information can be used to find data-derived feature prototypes with a common biological meaning, reduce data storage as well as computation times and improve the classification accuracy.
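The matrix described above, which records how often features land in a mutual cluster, can be illustrated with a minimal sketch. It assumes (this is not stated in the abstract) that each fuzzy c-means run has already been reduced to one hard label per feature, for example by taking the highest membership, and that the matrix entry is the fraction of runs in which two features share a label. The function name and input layout are hypothetical.

```python
from itertools import combinations


def joint_feature_clustering_matrix(label_runs):
    """Fraction of runs in which each pair of features shares a cluster.

    label_runs: list of clusterings; each clustering is a list holding one
    hard cluster label per feature (hypothetical input format).
    """
    n_features = len(label_runs[0])
    counts = [[0] * n_features for _ in range(n_features)]
    for labels in label_runs:
        for i, j in combinations(range(n_features), 2):
            if labels[i] == labels[j]:
                counts[i][j] += 1
                counts[j][i] += 1
    n_runs = len(label_runs)
    # The diagonal is left at zero: a feature is not paired with itself.
    return [[c / n_runs for c in row] for row in counts]


# Three illustrative runs over four features (made-up labels).
runs = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
matrix = joint_feature_clustering_matrix(runs)
```

Pairs with entries close to 1 are candidates for redundancy under this reading, since they are grouped together in nearly every run.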
Inductive Approaches to Improving Diagnosis and Design for Diagnosability
NASA Technical Reports Server (NTRS)
Fisher, Douglas H. (Principal Investigator)
1995-01-01
The first research area under this grant addresses the problem of classifying time series according to their morphological features in the time domain. A supervised learning system called CALCHAS, which induces a classification procedure for signatures from preclassified examples, was developed. For each of several signature classes, the system infers a model that captures the class's morphological features using Bayesian model induction and the minimum message length approach to assign priors. After induction, a time series (signature) is classified in one of the classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. A second area of research assumes two sources of information about a system: a model or domain theory that encodes aspects of the system under study and data from actual system operations over time. A model, when it exists, represents strong prior expectations about how a system will perform. Our work with a diagnostic model of the RCS (Reaction Control System) of the Space Shuttle motivated the development of SIG, a system which combines information from a model (or domain theory) and data. As it tracks RCS behavior, the model computes quantitative and qualitative values. Induction is then performed over the data represented by both the 'raw' features and the model-computed high-level features. Finally, work on clustering for operating mode discovery motivated some important extensions to the clustering strategy we had used. One modification appends an iterative optimization technique onto the clustering system; this optimization strategy appears to be novel in the clustering literature. A second modification improves the noise tolerance of the clustering system. In particular, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus making post-clustering analysis easier.
Bennett, Robert M; Russell, Jon; Cappelleri, Joseph C; Bushmakin, Andrew G; Zlateva, Gergana; Sadosky, Alesia
2010-06-28
The purpose of this study was to determine whether some of the clinical features of fibromyalgia (FM) that patients would like to see improved aggregate into definable clusters. Seven hundred and eighty-eight patients with clinically confirmed FM and baseline pain ≥40 mm on a 100 mm visual analogue scale ranked 5 FM clinical features that the subjects would most like to see improved after treatment (one for each priority quintile) from a list of 20 developed during focus groups. For each subject, clinical features were transformed into vectors with rankings assigned values 1-5 (lowest to highest ranking). Logistic analysis was used to create a distance matrix and hierarchical cluster analysis was applied to identify cluster structure. The frequency of cluster selection was determined, and cluster importance was ranked using cluster scores derived from rankings of the clinical features. Multidimensional scaling was used to visualize and conceptualize cluster relationships. Six clinical feature clusters were identified and named based on their key characteristics. In order of selection frequency, the clusters were Pain (90%; 4 clinical features), Fatigue (89%; 4 clinical features), Domestic (42%; 4 clinical features), Impairment (29%; 3 functions), Affective (21%; 3 clinical features), and Social (9%; 2 functional). The "Pain Cluster" was ranked of greatest importance by 54% of subjects, followed by Fatigue, which was given the highest ranking by 28% of subjects. Multidimensional scaling mapped these clusters to two dimensions: Status (bounded by Physical and Emotional domains), and Setting (bounded by Individual and Group interactions). Common clinical features of FM could be grouped into 6 clusters (Pain, Fatigue, Domestic, Impairment, Affective, and Social) based on patient perception of relevance to treatment. Furthermore, these 6 clusters could be charted in the 2 dimensions of Status and Setting, thus providing a unique perspective for interpretation of FM symptomatology.
Semi-Supervised Learning to Identify UMLS Semantic Relations.
Luo, Yuan; Uzuner, Ozlem
2014-01-01
The UMLS Semantic Network is constructed by experts and requires periodic expert review to update. We propose and implement a semi-supervised approach for automatically identifying UMLS semantic relations from narrative text in PubMed. Our method analyzes biomedical narrative text to collect semantic entity pairs, and extracts multiple semantic, syntactic and orthographic features for the collected pairs. We experiment with seeded k-means clustering with various distance metrics. We create and annotate a ground truth corpus according to the top two levels of the UMLS semantic relation hierarchy. We evaluate our system on this corpus and characterize the learning curves of different clustering configuration. Using KL divergence consistently performs the best on the held-out test data. With full seeding, we obtain macro-averaged F-measures above 70% for clustering the top level UMLS relations (2-way), and above 50% for clustering the second level relations (7-way).
Hammond, R W
2003-06-01
Isolates of Prunus necrotic ringspot virus (PNRSV) were examined to establish the level of naturally occurring sequence variation in the coat protein (CP) gene and to identify group-specific genome features that may prove valuable for the generation of diagnostic reagents. Phylogenetic analysis of a 452 bp sequence of 68 virus isolates, 20 obtained from the European Union Ilarvirus Ringtest held in October 1998, confirmed the clustering of the isolates into three distinct groups. Although no correlation was found between the sequence and host or geographic origin, there was a general trend for severe isolates to cluster into one group. Group-specific features have been identified for discrimination between virus strains.
Gender differences in psychiatric disorders and clusters of self-esteem among detained adolescents.
Van Damme, Lore; Colins, Olivier F; Vanderplasschen, Wouter
2014-12-30
Detained minors display substantial mental health needs. This study focused on two features (psychopathology and self-esteem) that have received considerable attention in the literature and clinical work, but have rarely been studied simultaneously in detained youths. The aims of this study were to examine gender differences in psychiatric disorders and clusters of self-esteem, and to test the hypothesis that the cluster of adolescents with lower (versus higher) levels of self-esteem have higher rates of psychiatric disorders. The prevalence of psychiatric disorders was assessed in 440 Belgian, detained adolescents using the Diagnostic Interview Schedule for Children-IV. Self-esteem was assessed using the Self-perception Profile for Adolescents. Model-based cluster analyses were performed to identify youths with lower and/or higher levels of self-esteem across several domains. Girls have higher rates for most psychiatric disorders and lower levels of self-esteem than boys. A higher number of clusters was identified in boys (four) than girls (three). Generally, the cluster of adolescents with lower (versus higher) levels of self-esteem had a higher prevalence of psychiatric disorders. These results suggest that the detection of low levels of self-esteem in adolescents, especially girls, might help clinicians to identify a subgroup of detained adolescents with the highest prevalence of psychopathology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ogden, K; O’Dwyer, R; Bradford, T
Purpose: To reduce differences in features calculated from MRI brain scans acquired at different field strengths with or without Gadolinium contrast. Methods: Brain scans were processed for 111 epilepsy patients to extract hippocampus and thalamus features. Scans were acquired on 1.5 T scanners with Gadolinium contrast (group A), 1.5 T scanners without Gd (group B), and 3.0 T scanners without Gd (group C). A total of 72 features were extracted. Features were extracted from original scans and from scans where the image pixel values were rescaled to the mean of the hippocampi and thalami values. For each data set, cluster analysis was performed on the raw feature set and for feature sets with normalization (conversion to Z scores). Two methods of normalization were used: the first was over all values of a given feature, and the second by normalizing within the patient group membership. The clustering software was configured to produce 3 clusters. Group fractions in each cluster were calculated. Results: For features calculated from both the non-rescaled and rescaled data, cluster membership was identical for both the non-normalized and normalized data sets. Cluster 1 was comprised entirely of Group A data, Cluster 2 contained data from all three groups, and Cluster 3 contained data from only groups 1 and 2. For the categorically normalized data sets there was a more uniform distribution of group data in the three Clusters. A less pronounced effect was seen in the rescaled image data features. Conclusion: Image rescaling and feature renormalization can have a significant effect on the results of clustering analysis. These effects are also likely to influence the results of supervised machine learning algorithms. It may be possible to partly remove the influence of scanner field strength and the presence of Gadolinium based contrast in feature extraction for radiomics applications.
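The two normalization schemes named in the Methods (Z scores over all values of a feature versus Z scores within each patient group) can be contrasted with a small sketch. The function names, the toy values, and the use of the population standard deviation are assumptions for illustration; the abstract does not specify them.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Z scores over all values of one feature (population std; assumed)."""
    mu, sigma = mean(values), pstdev(values)
    # Note: sigma == 0 (a constant feature) would raise ZeroDivisionError.
    return [(v - mu) / sigma for v in values]


def z_scores_within_group(values, groups):
    """Z scores computed separately inside each group label."""
    out = [0.0] * len(values)
    for g in set(groups):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        for i, z in zip(idx, z_scores([values[i] for i in idx])):
            out[i] = z
    return out


# Made-up feature values with a systematic offset between two groups.
values = [10.0, 12.0, 14.0, 20.0, 22.0, 24.0]
groups = ["A", "A", "A", "C", "C", "C"]
overall = z_scores(values)
within = z_scores_within_group(values, groups)
```

In this toy case the overall Z scores still separate the two groups, while the within-group Z scores coincide, which is the kind of offset removal the abstract attributes to categorical normalization.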
NASA Astrophysics Data System (ADS)
Juniati, D.; Khotimah, C.; Wardani, D. E. K.; Budayasa, K.
2018-01-01
Heart abnormalities can be detected from heart sounds. A heart sound can be heard directly with a stethoscope or indirectly by a phonocardiograph, a machine that records heart sounds. This paper presents the implementation of fractal dimension theory to classify phonocardiograms as a normal heart sound, a murmur, or an extrasystole. The main algorithm used to calculate the fractal dimension was Higuchi's algorithm. Classification of phonocardiograms involved two steps: feature extraction and classification. For feature extraction, we used the Discrete Wavelet Transform (DWT) to decompose the heart sound signal into several sub-bands depending on the selected level. After the decomposition process, the signal was processed using the Fast Fourier Transform (FFT) to determine the spectral frequency. The fractal dimension of the FFT output was calculated using Higuchi's algorithm. The classification of the fractal dimensions of all phonocardiograms was done with KNN and fuzzy c-means clustering methods. Based on the research results, the best accuracy obtained was 86.17%, achieved with DWT decomposition at level 3, kmax = 50, 5-fold cross-validation, and 5 neighbors in the KNN algorithm. Meanwhile, for fuzzy c-means clustering, the accuracy was 78.56%.
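For readers unfamiliar with the Higuchi estimate, the following is a minimal sketch of the standard algorithm in plain Python. It covers only the fractal dimension step, not the DWT or FFT stages, and the function name and test signal are illustrative assumptions rather than the authors' code.

```python
import math


def higuchi_fd(signal, kmax):
    """Higuchi fractal dimension estimate of a 1-D sequence (needs kmax >= 2)."""
    n = len(signal)
    log_inv_k, log_len = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):  # one sub-series per starting offset m
            steps = (n - 1 - m) // k
            if steps < 1:
                continue
            dist = sum(abs(signal[m + i * k] - signal[m + (i - 1) * k])
                       for i in range(1, steps + 1))
            # Normalise the curve length for offset m and interval k.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        log_inv_k.append(math.log(1.0 / k))
        # A constant signal gives zero length and would fail at log().
        log_len.append(math.log(sum(lengths) / len(lengths)))
    # Least-squares slope of log L(k) against log(1/k).
    mx = sum(log_inv_k) / len(log_inv_k)
    my = sum(log_len) / len(log_len)
    num = sum((x - mx) * (y - my) for x, y in zip(log_inv_k, log_len))
    den = sum((x - mx) ** 2 for x in log_inv_k)
    return num / den


# A straight ramp is a smooth curve, so its estimate should be close to 1.
ramp = [float(i) for i in range(200)]
fd_ramp = higuchi_fd(ramp, kmax=10)
```

The parameter kmax plays the same role as the kmax value quoted in the abstract; irregular signals yield estimates between 1 and 2.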
Liu, Yuanchao; Liu, Ming; Wang, Xin
2015-01-01
The objective of text clustering is to divide document collections into clusters based on the similarity between documents. In this paper, an extension-based feature modeling approach towards semantically sensitive text clustering is proposed along with the corresponding feature space construction and similarity computation method. By combining the similarity in traditional feature space and that in extension space, the adverse effects of the complexity and diversity of natural language can be addressed and clustering semantic sensitivity can be improved correspondingly. The generated clusters can be organized using different granularities. The experimental evaluations on well-known clustering algorithms and datasets have verified the effectiveness of our approach. PMID:25794172
Socioeconomic Status (SES) and Childhood Acute Myeloid Leukemia (AML) Mortality
Knoble, Naomi B.; Alderfer, Melissa A.; Hossain, Md Jobayer
2016-01-01
Socioeconomic status (SES) is a complex construct of multiple indicators, known to impact cancer outcomes, but has not been adequately examined among pediatric AML patients. This study aimed to identify the patterns of co-occurrence of multiple community-level SES indicators and to explore associations between various patterns of these indicators and pediatric AML mortality risk. A nationally representative US sample of 3,651 pediatric AML patients, aged 0–19 years at diagnosis, was drawn from 17 Surveillance, Epidemiology, and End Results (SEER) database registries created between 1973 and 2012. Factor analysis, cluster analysis, stratified univariable and multivariable Cox proportional hazards models were used. Four SES factors accounting for 87% of the variance in SES indicators were identified: F1) economic/educational disadvantage, less immigration; F2) immigration-related features (foreign-born, language-isolation, crowding), less mobility; F3) housing instability; and F4) absence of moving. F1 and F3 showed elevated risk of mortality, adjusted hazards ratios (aHR) (95% CI): 1.07 (1.02–1.12) and 1.05 (1.00–1.10), respectively. Seven SES-defined cluster groups were identified. Cluster 1 (low economic/educational disadvantage, few immigration-related features, and residential-stability) showed the minimum risk of mortality. Compared to Cluster 1, Cluster 3 (high economic/educational disadvantage, high-mobility) and Cluster 6 (moderately-high economic/educational disadvantages, housing-instability and immigration-related features) exhibited substantially greater risk of mortality, aHR (95% CI) = 1.19 (1.0–1.4) and 1.23 (1.1–1.5), respectively. Factors of correlated SES-indicators and their pattern-based groups demonstrated differential risks in pediatric AML mortality, indicating the need for special public-health attention in areas with economic-educational disadvantages, housing-instability and immigration-related features. PMID:27543948
Method and system for data clustering for very large databases
NASA Technical Reports Server (NTRS)
Livny, Miron (Inventor); Zhang, Tian (Inventor); Ramakrishnan, Raghu (Inventor)
1998-01-01
Multi-dimensional data contained in very large databases is efficiently and accurately clustered to determine patterns therein and extract useful information from such patterns. Conventional computer processors may be used which have limited memory capacity and conventional operating speed, allowing massive data sets to be processed in a reasonable time and with reasonable computer resources. The clustering process is organized using a clustering feature tree structure wherein each clustering feature comprises the number of data points in the cluster, the linear sum of the data points in the cluster, and the square sum of the data points in the cluster. A dense region of data points is treated collectively as a single cluster, and points in sparsely occupied regions can be treated as outliers and removed from the clustering feature tree. The clustering can be carried out continuously with new data points being received and processed, and with the clustering feature tree being restructured as necessary to accommodate the information from the newly received data points.
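The clustering feature described above (point count, linear sum, square sum) is additive, which is what lets a dense region be summarized and merged without revisiting individual points. A minimal sketch follows; the class and method names are hypothetical, the tree structure itself is omitted, and the square sum is assumed to be a scalar sum of squared norms.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusteringFeature:
    n: int              # number of data points in the cluster
    linear_sum: list    # per-dimension sum of the points
    square_sum: float   # sum of squared norms (scalar form; an assumption)

    @classmethod
    def from_point(cls, point):
        return cls(1, list(point), sum(x * x for x in point))

    def merge(self, other):
        """Summaries are additive, so two clusters merge component-wise."""
        return ClusteringFeature(
            self.n + other.n,
            [a + b for a, b in zip(self.linear_sum, other.linear_sum)],
            self.square_sum + other.square_sum,
        )

    def centroid(self):
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root-mean-square distance of the points from the centroid."""
        c = self.centroid()
        var = self.square_sum / self.n - sum(x * x for x in c)
        return math.sqrt(max(var, 0.0))  # clamp tiny negative rounding error


# Four corners of a square, absorbed one point at a time.
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
cf = ClusteringFeature.from_point(points[0])
for p in points[1:]:
    cf = cf.merge(ClusteringFeature.from_point(p))
```

Because the centroid and radius are derived from the three stored quantities alone, new points can be absorbed incrementally, in line with the continuous processing the abstract describes.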
Yoshimoto, Junichiro; Shimizu, Yu; Okada, Go; Takamura, Masahiro; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
We propose a novel method for multiple clustering, which is useful for analysis of high-dimensional data containing heterogeneous types of features. Our method is based on nonparametric Bayesian mixture models in which features are automatically partitioned (into views) for each clustering solution. This feature partition works as feature selection for a particular clustering solution, which screens out irrelevant features. To make our method applicable to high-dimensional data, a co-clustering structure is newly introduced for each view. Further, the outstanding novelty of our method is that we simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block, which widens areas of application to real data. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data. PMID:29049392
Artificial neural networks for acoustic target recognition
NASA Astrophysics Data System (ADS)
Robertson, James A.; Mossing, John C.; Weber, Bruce A.
1995-04-01
Acoustic sensors can be used to detect, track and identify non-line-of-sight targets passively. Attempts to alter acoustic emissions often result in an undesirable performance degradation. This research project investigates the use of neural networks for differentiating between features extracted from the acoustic signatures of sources. Acoustic data were filtered and digitized using a commercially available analog-to-digital converter. The digital data were transformed to the frequency domain for additional processing using the FFT. Narrowband peak detection algorithms were incorporated to select peaks above a user-defined SNR. These peaks were then used to generate a set of robust features which relate specifically to target components in varying background conditions. The features were then used as input into a backpropagation neural network. A K-means unsupervised clustering algorithm was used to determine the natural clustering of the observations. Comparisons between a feature set consisting of the normalized amplitudes of the first 250 frequency bins of the power spectrum and a set of 11 harmonically related features were made. Initial results indicate that even though some different target types had a tendency to group in the same clusters, the neural network was able to differentiate the targets. Successful identification of acoustic sources under varying operational conditions with high confidence levels was achieved.
Searching for the 3.5 keV Line in the Stacked Suzaku Observations of Galaxy Clusters
NASA Technical Reports Server (NTRS)
Bulbul, Esra; Markevitch, Maxim; Foster, Adam; Miller, Eric; Bautz, Mark; Lowenstein, Mike; Randall, Scott W.; Smith, Randall K.
2016-01-01
We perform a detailed study of the stacked Suzaku observations of 47 galaxy clusters, spanning a redshift range of 0.01-0.45, to search for the unidentified 3.5 keV line. This sample provides an independent test for the previously detected line. We detect a 2sigma-significant spectral feature at 3.5 keV in the spectrum of the full sample. When the sample is divided into two subsamples (cool-core and non-cool core clusters), the cool-core subsample shows no statistically significant positive residuals at the line energy. A very weak (approx. 2sigma confidence) spectral feature at 3.5 keV is permitted by the data from the non-cool-core clusters sample. The upper limit on a neutrino decay mixing angle of sin(sup 2)(2theta) = 6.1 x 10(exp -11) from the full Suzaku sample is consistent with the previous detections in the stacked XMM-Newton sample of galaxy clusters (which had a higher statistical sensitivity to faint lines), M31, and Galactic center, at a 90% confidence level. However, the constraint from the present sample, which does not include the Perseus cluster, is in tension with previously reported line flux observed in the core of the Perseus cluster with XMM-Newton and Suzaku.
Left inferior parietal lobe engagement in social cognition and language
Bzdok, Danilo; Hartwigsen, Gesa; Reid, Andrew; Laird, Angela R.; Fox, Peter T.; Eickhoff, Simon B.
2017-01-01
Social cognition and language are two core features of the human species. Despite distributed recruitment of brain regions in each mental capacity, the left parietal lobe (LPL) represents a zone of topographical convergence. The present study quantitatively summarizes hundreds of neuroimaging studies on social cognition and language. Using connectivity-based parcellation on a meta-analytically defined volume of interest (VOI), regional coactivation patterns within this VOI allowed identifying distinct subregions. Across parcellation solutions, two clusters emerged consistently in rostro-ventral and caudo-ventral aspects of the parietal VOI. Both clusters were functionally significantly associated with social-cognitive and language processing. In particular, the rostro-ventral cluster was associated with lower-level processing facets, while the caudo-ventral cluster was associated with higher-level processing facets in both mental capacities. Contrarily, in the (less stable) dorsal parietal VOI, all clusters reflected computation of general-purpose processes, such as working memory and matching tasks, that are frequently co-recruited by social or language processes. Our results hence favour a rostro-caudal distinction of lower-versus higher-level processes underlying social cognition and language in the left inferior parietal lobe. PMID:27241201
The spatial clustering of obesity: does the built environment matter?
Huang, R; Moudon, A V; Cook, A J; Drewnowski, A
2015-12-01
Obesity rates in the USA show distinct geographical patterns. The present study used spatial cluster detection methods and individual-level data to locate obesity clusters and to analyse them in relation to the neighbourhood built environment. The 2008-2009 Seattle Obesity Study provided data on the self-reported height, weight, and sociodemographic characteristics of 1602 King County adults. Home addresses were geocoded. Clusters of high or low body mass index were identified using Anselin's Local Moran's I and a spatial scan statistic with regression models that searched for unmeasured neighbourhood-level factors from residuals, adjusting for measured individual-level covariates. Spatially continuous values of objectively measured features of the local neighbourhood built environment (SmartMaps) were constructed for seven variables obtained from tax rolls and commercial databases. Both the Local Moran's I and a spatial scan statistic identified similar spatial concentrations of obesity. High and low obesity clusters were attenuated after adjusting for age, gender, race, education and income, and they disappeared once neighbourhood residential property values and residential density were included in the model. Using individual-level data to detect obesity clusters with two cluster detection methods, the present study showed that the spatial concentration of obesity was wholly explained by neighbourhood composition and socioeconomic characteristics. These characteristics may serve to more precisely locate obesity prevention and intervention programmes. © 2014 The British Dietetic Association Ltd.
Poole, William; Leinonen, Kalle; Shmulevich, Ilya
2017-01-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C. PMID:28170390
Poole, William; Leinonen, Kalle; Shmulevich, Ilya; Knijnenburg, Theo A; Bernard, Brady
2017-02-01
Cancer researchers have long recognized that somatic mutations are not uniformly distributed within genes. However, most approaches for identifying cancer mutations focus on either the entire-gene or single amino-acid level. We have bridged these two methodologies with a multiscale mutation clustering algorithm that identifies variable length mutation clusters in cancer genes. We ran our algorithm on 539 genes using the combined mutation data in 23 cancer types from The Cancer Genome Atlas (TCGA) and identified 1295 mutation clusters. The resulting mutation clusters cover a wide range of scales and often overlap with many kinds of protein features including structured domains, phosphorylation sites, and known single nucleotide variants. We statistically associated these multiscale clusters with gene expression and drug response data to illuminate the functional and clinical consequences of mutations in our clusters. Interestingly, we find multiple clusters within individual genes that have differential functional associations: these include PTEN, FUBP1, and CDH1. This methodology has potential implications in identifying protein regions for drug targets, understanding the biological underpinnings of cancer, and personalizing cancer treatments. Toward this end, we have made the mutation clusters and the clustering algorithm available to the public. Clusters and pathway associations can be interactively browsed at m2c.systemsbiology.net. The multiscale mutation clustering algorithm is available at https://github.com/IlyaLab/M2C.
18F-FDG PET radiomics approaches: comparing and clustering features in cervical cancer.
Tsujikawa, Tetsuya; Rahman, Tasmiah; Yamamoto, Makoto; Yamada, Shizuka; Tsuyoshi, Hideaki; Kiyono, Yasushi; Kimura, Hirohiko; Yoshida, Yoshio; Okazawa, Hidehiko
2017-11-01
The aims of our study were to find the textural features on 18F-FDG PET/CT which reflect the different histological architectures between cervical cancer subtypes and to make a visual assessment of the association between 18F-FDG PET textural features in cervical cancer. Eighty-three cervical cancer patients [62 squamous cell carcinomas (SCCs) and 21 non-SCCs (NSCCs)] who had undergone pretreatment 18F-FDG PET/CT were enrolled. A texture analysis was performed on PET/CT images, from which 18 PET radiomics features were extracted including first-order features such as standardized uptake value (SUV), metabolic tumor volume (MTV) and total lesion glycolysis (TLG), second- and high-order textural features using SUV histogram, normalized gray-level co-occurrence matrix (NGLCM), and neighborhood gray-tone difference matrix, respectively. These features were compared between SCC and NSCC using a Bonferroni adjusted P value threshold of 0.0028 (0.05/18). To assess the association between PET features, a heat map analysis with hierarchical clustering, one of the radiomics approaches, was performed. Among 18 PET features, correlation, a second-order textural feature derived from NGLCM, was a stable parameter and it was the only feature which showed a robust trend toward significant difference between SCC and NSCC. Cervical SCC showed a higher correlation (0.70 ± 0.07) than NSCC (0.64 ± 0.07, P = 0.0030). The other PET features did not show any significant differences between SCC and NSCC. A higher correlation in SCC might reflect higher structural integrity and stronger spatial/linear relationship of cancer cells compared with NSCC. A heat map with a PET feature dendrogram clearly showed 5 distinct clusters, where correlation belonged to a cluster including MTV and TLG. However, the association between correlation and MTV/TLG was not strong. Correlation was a relatively independent PET feature in cervical cancer.
18F-FDG PET textural features might reflect the differences in histological architecture between cervical cancer subtypes. PET radiomics approaches reveal the association between PET features and will be useful for finding a single feature or a combination of features leading to precise diagnoses, potential prognostic models, and effective therapeutic strategies.
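The Bonferroni threshold quoted in this abstract is the family-wise level divided by the number of features tested. A short Python sketch of that arithmetic follows; the function names are hypothetical, and the numbers are the ones stated in the abstract (0.05, 18 features, P = 0.0030).

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def is_significant(p_value: float, alpha: float, n_tests: int) -> bool:
    """True if p_value falls below the Bonferroni-adjusted threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


threshold = bonferroni_threshold(0.05, 18)
print(round(threshold, 4))                # 0.0028
print(is_significant(0.0030, 0.05, 18))   # False
```

Under this reading, P = 0.0030 lies just above the adjusted threshold, which fits the abstract's wording that correlation showed a trend toward a significant difference rather than a significant difference.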
Weakly supervised image semantic segmentation based on clustering superpixels
NASA Astrophysics Data System (ADS)
Yan, Xiong; Liu, Xiaohua
2018-04-01
In this paper, we propose an image semantic segmentation model that is trained on images with image-level labels. The proposed model starts with superpixel segmentation, and features of the superpixels are extracted by a trained CNN. We introduce a superpixel-based graph and then apply a graph partition method to group correlated superpixels into clusters. To acquire inter-label correlations between the image-level labels in the dataset, we utilize label co-occurrence statistics and exploit visual contextual cues simultaneously. Finally, we formulate the task of mapping appropriate image-level labels to the detected clusters as a convex minimization problem. Experimental results on the MSRC-21 dataset and the LabelMe dataset show that the proposed method performs better than most weakly supervised methods and is even comparable to fully supervised methods.
Boland, Mary Regina; Miotto, Riccardo; Gao, Junfeng; Weng, Chunhua
2013-01-01
Background: When standard therapies fail, clinical trials provide experimental treatment opportunities for patients with drug-resistant illnesses or terminal diseases. Clinical trials can also provide free treatment and education for individuals who otherwise may not have access to such care. To find relevant clinical trials, patients often search online; however, they often encounter a significant barrier due to the large number of trials and ineffective indexing methods for reducing the trial search space. Objectives: This study explores the feasibility of feature-based indexing, clustering, and search of clinical trials and informs designs to automate these processes. Methods: We decomposed 80 randomly selected stage III breast cancer clinical trials into a vector of eligibility features, which were organized into a hierarchy. We clustered trials based on their eligibility feature similarities. In a simulated search process, manually selected features were used to generate specific eligibility questions to filter trials iteratively. Results: We extracted 1,437 distinct eligibility features and achieved an inter-rater agreement of 0.73 for feature extraction for 37 frequent features occurring in more than 20 trials. Using all the 1,437 features we stratified the 80 trials into six clusters containing trials recruiting similar patients by patient-characteristic features, five clusters by disease-characteristic features, and two clusters by mixed features. Most of the features were mapped to one or more Unified Medical Language System (UMLS) concepts, demonstrating the utility of named entity recognition prior to mapping with the UMLS for automatic feature extraction. Conclusions: It is feasible to develop feature-based indexing and clustering methods for clinical trials to identify trials with similar target populations and to improve trial search efficiency. PMID:23666475
Fully convolutional network with cluster for semantic segmentation
NASA Astrophysics Data System (ADS)
Ma, Xiao; Chen, Zhongbi; Zhang, Jianlin
2018-04-01
Image semantic segmentation is an active research topic in computer vision and artificial intelligence. In particular, extensive research on deep neural networks for image recognition has greatly promoted the development of semantic segmentation. This paper puts forward a method that combines a fully convolutional network with the k-means clustering algorithm. The clustering step uses the image's low-level features and initializes the cluster centers from a super-pixel segmentation; within each clustering region, points with low reliability, which are likely to be misclassified, are corrected using the points with high reliability. This method refines the segmentation of the target contour and improves the accuracy of the image segmentation.
Metabolomic biosignature differentiates melancholic depressive patients from healthy controls.
Liu, Yashu; Yieh, Lynn; Yang, Tao; Drinkenburg, Wilhelmus; Peeters, Pieter; Steckler, Thomas; Narayan, Vaibhav A; Wittenberg, Gayle; Ye, Jieping
2016-08-23
Major depressive disorder (MDD) is a heterogeneous disease at the level of clinical symptoms, and this heterogeneity is likely reflected at the level of biology. Two clinical subtypes within MDD that have garnered interest are "melancholic depression" and "anxious depression". Metabolomics enables us to characterize hundreds of small molecules that comprise the metabolome, and recent work suggests the blood metabolome may be able to inform treatment decisions for MDD; however, work is at an early stage. Here we examine a metabolomics data set to (1) test whether clinically homogeneous MDD subtypes are also more biologically homogeneous, and hence more predictable, (2) devise a robust machine learning framework that preserves biological meaning, and (3) describe the metabolomic biosignature for melancholic depression. With the proposed computational system we achieve around 80% classification accuracy, sensitivity and specificity for melancholic depression, but only ~72% for anxious depression or MDD, suggesting the blood metabolome contains more information about melancholic depression. We develop an ensemble feature selection framework (EFSF) in which features are first clustered, and learning then takes place on the cluster centroids, retaining information about correlated features during the feature selection process rather than discarding them as most machine learning methods will do. Analysis of the most discriminative feature clusters revealed differences in metabolic classes such as amino acids and lipids as well as pathways studied extensively in MDD such as the activation of cortisol in chronic stress. We find that greater clinical homogeneity does indeed lead to better prediction based on biological measurements in the case of melancholic depression. Melancholic depression is shown to be associated with changes in amino acids, catecholamines, lipids, stress hormones, and immune-related metabolites.
The proposed computational framework can be adapted to analyze data from many other biomedical applications where the data has similar characteristics.
MSFC Skylab contamination control systems mission evaluation
NASA Technical Reports Server (NTRS)
1974-01-01
Cluster external contamination control evaluation was made throughout the Skylab Mission. This evaluation indicated that contamination control measures instigated during the design, development, and operational phases of this program were adequate to reduce the general contamination environment external to the Cluster below the threshold sensitivity levels for experiments and affected subsystems. Launch and orbit contamination control features included eliminating certain vents, rerouting vents for minimum contamination impact, establishing filters, incorporating materials with minimum outgassing characteristics and developing operational constraints and mission rules to minimize contamination effects. Prior to the launch of Skylab, contamination control math models were developed which were used to predict Cluster surface deposition and background brightness levels throughout the mission. The report summarizes the Skylab system and experiment contamination control evaluation. The Cluster systems and experiments evaluated include Induced Atmosphere, Corollary and ATM Experiments, Thermal Control Surfaces, Solar Array Systems, Windows and Star Tracker.
Delineation of gravel-bed clusters via factorial kriging
NASA Astrophysics Data System (ADS)
Wu, Fu-Chun; Wang, Chi-Kuei; Huang, Guo-Hao
2018-05-01
Gravel-bed clusters are the most prevalent microforms that affect local flows and sediment transport. A growing consensus is that the practice of cluster delineation should be based primarily on bed topography rather than grain sizes. Here we present a novel approach for cluster delineation using patch-scale high-resolution digital elevation models (DEMs). We use a geostatistical interpolation method, i.e., factorial kriging, to decompose the short- and long-range (grain- and microform-scale) DEMs. The required parameters are determined directly from the scales of the nested variograms. The short-range DEM exhibits a flat bed topography, yet individual grains are sharply outlined, making the short-range DEM a useful aid for grain segmentation. The long-range DEM exhibits a smoother topography than the original full DEM, yet groupings of particles emerge as small-scale bedforms, making the contour percentile levels of the long-range DEM a useful tool for cluster identification. Individual clusters are delineated using the segmented grains and identified clusters via a range of contour percentile levels. Our results reveal that the density and total area of delineated clusters decrease with increasing contour percentile level, while the mean grain size of clusters and average size of the anchor clast (i.e., the largest particle in a cluster) increase with the contour percentile level. These results support the interpretation that larger particles group as clusters and protrude higher above the bed than other smaller grains. A striking feature of the delineated clusters is that anchor clasts are invariably greater than the D90 of the grain sizes even though a threshold anchor size was not adopted herein. The average areal fractal dimensions (Hausdorff-Besicovitch dimensions of the projected areas) of individual clusters, however, demonstrate that clusters delineated with different contour percentile levels exhibit similar planform morphologies.
Comparisons with a compilation of existing field data show consistency with the cluster properties documented in a wide variety of settings. This study thus points toward a promising, alternative DEM-based approach to characterizing sediment structures in gravel-bed rivers.
Esserman, Denise; Allore, Heather G.; Travison, Thomas G.
2016-01-01
Cluster-randomized clinical trials (CRT) are trials in which the unit of randomization is not a participant but a group (e.g. healthcare systems or community centers). They are suitable when the intervention applies naturally to the cluster (e.g. healthcare policy); when lack of independence among participants may occur (e.g. nursing home hygiene); or when it is most ethical to apply an intervention to all within a group (e.g. school-level immunization). Because participants in the same cluster receive the same intervention, CRT may approximate clinical practice, and may produce generalizable findings. However, when not properly designed or interpreted, CRT may induce biased results. CRT designs have features that add complexity to statistical estimation and inference. Chief among these is the cluster-level correlation in response measurements induced by the randomization. A critical consideration is the experimental unit of inference; often it is desirable to consider intervention effects at the level of the individual rather than the cluster. Finally, given that the number of clusters available may be limited, simple forms of randomization may not achieve balance between intervention and control arms at either the cluster or participant level. In non-clustered clinical trials, balance of key factors may be easier to achieve because the sample can be homogeneous by exclusion of participants with multiple chronic conditions (MCC). CRTs, which are often pragmatic, may eschew such restrictions. Failure to account for imbalance may induce bias and reduce validity. This article focuses on the complexities of randomization in the design of CRTs, such as the inclusion of patients with MCC, and imbalances in covariate factors across clusters. PMID:27478520
Male Inmate Profiles and Their Biological Correlates
Horn, Mathilde; Potvin, Stephane; Allaire, Jean-François; Côté, Gilles; Gobbi, Gabriella; Benkirane, Karim; Vachon, Jeanne; Dumais, Alexandre
2014-01-01
Objective: Borderline and antisocial personality disorders (PDs) share common clinical features (impulsivity, aggressiveness, substance use disorders [SUDs], and suicidal behaviours) that are greatly overrepresented in prison populations. These disorders have been associated biologically with testosterone and cortisol levels. However, the associations are ambiguous and the subject of controversy, perhaps because these heterogeneous disorders have been addressed as unitary constructs. A consideration of profiles of people, rather than of exclusive diagnoses, might yield clearer relationships. Methods: In our study, multiple correspondence analysis and cluster analysis were employed to identify subgroups among 545 newly convicted inmates. The groups were then compared in terms of clinical features and biological markers, including levels of cortisol, testosterone, estradiol, progesterone, and sulfoconjugated dehydroepiandrosterone (DHEA-S). Results: Four clusters with differing psychiatric, criminal, and biological profiles emerged. Clinically, one group had intermediate scores for each of the tested clinical features. Another group comprised people with little comorbidity. Two others displayed severe impulsivity, PD, and SUD. Biologically, cortisol levels were lowest in the last 2 groups and highest in the group with less comorbidity. In keeping with previous findings reported in the literature, testosterone was higher in a younger population with severe psychiatric symptoms. However, some apparently comparable behavioural outcomes were found to be related to distinct biological profiles. No differences were observed for estradiol, progesterone, or DHEA-S levels. Conclusions: The results not only confirm the importance of biological markers in the study of personality features but also demonstrate the need to consider the role of comorbidities and steroid coregulation. PMID:25161069
Population clustering based on copy number variations detected from next generation sequencing data.
Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping
2014-08-01
Copy number variations (CNVs) can be used as significant bio-markers and next generation sequencing (NGS) provides a high resolution detection of these CNVs. But how to extract features from CNVs and further apply them to genomic studies such as population clustering has become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results on the second real data analysis show that the proposed method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering.
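The decomposition of a feature matrix into a source matrix and a weight matrix can be illustrated with a small, dependency-free Python sketch. It uses the standard multiplicative-update rule for NMF, which the abstract does not confirm as the authors' solver. The orientation (rows are CNV features, columns are samples), the function names, and the rule of assigning each sample to its largest weight are all assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw)) ** 0.5


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor V (features x samples) into W (source) and H (weights)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def assign_groups(h):
    """Label each sample (column of H) by its largest weight."""
    return [max(range(len(h)), key=lambda k: h[k][j]) for j in range(len(h[0]))]
```

A usage call would look like `w, h = nmf(feature_matrix, rank=2)` followed by `assign_groups(h)`. The factors stay non-negative because every update multiplies by a non-negative ratio.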
On the game of life: population and its diversity
NASA Astrophysics Data System (ADS)
Sales, T. M.; Garcia, J. B. C.; Jyh, T. I.; Ren, T. I.; Gomes, M. A. F.
1993-08-01
One of the most important features of biological life at all levels is its astounding diversity. In this work we study the well-known game “Life” due to Conway, analysing the statistics of cluster population, N(t), and cluster diversity, D(t). We have performed simulations on “Life” for dimensions d = 1 and 2 starting with an uncorrelated distribution of live and dead sites at t = 0. For d = 2 we study the effect of different neighbourhood relations in identifying and counting clusters. An interesting scaling relation connecting the maxima of N(t) and D(t) is found.
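How the choice of neighbourhood relation changes the cluster count for d = 2 can be sketched in Python. This is an illustration under stated assumptions, not the authors' simulation code: the grid has dead cells beyond its border, clusters are connected sets of live sites, N is taken as the number of clusters, and D is assumed to be the number of distinct cluster sizes. All function names are hypothetical.

```python
from collections import deque

VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def life_step(grid):
    """One generation of Conway's Life on a bounded grid of 0/1 cells."""
    rows, cols = len(grid), len(grid[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            live = sum(
                grid[r + dr][c + dc]
                for dr, dc in MOORE
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            )
            # Birth on exactly 3 live neighbours, survival on 2 or 3.
            if live == 3 or (grid[r][c] == 1 and live == 2):
                new[r][c] = 1
    return new


def cluster_sizes(grid, neighbourhood):
    """Sizes of connected live clusters under the given neighbourhood."""
    rows, cols = len(grid), len(grid[0])
    seen, sizes = set(), []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or (r, c) in seen:
                continue
            queue, size = deque([(r, c)]), 0
            seen.add((r, c))
            while queue:  # breadth-first flood fill of one cluster
                cr, cc = queue.popleft()
                size += 1
                for dr, dc in neighbourhood:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and grid[nr][nc] == 1 and (nr, nc) not in seen):
                        seen.add((nr, nc))
                        queue.append((nr, nc))
            sizes.append(size)
    return sizes


def population_and_diversity(grid, neighbourhood):
    """Return (N, D): cluster count and number of distinct cluster sizes."""
    sizes = cluster_sizes(grid, neighbourhood)
    return len(sizes), len(set(sizes))
```

For example, two live sites that touch only diagonally count as two clusters under `VON_NEUMANN` but as one cluster under `MOORE`, which is the kind of difference the abstract refers to when it mentions different neighbourhood relations.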
Magnetic assembly of 3D cell clusters: visualizing the formation of an engineered tissue.
Ghosh, S; Kumar, S R P; Puri, I K; Elankumaran, S
2016-02-01
Contactless magnetic assembly of cells into 3D clusters has been proposed as a novel means for 3D tissue culture that eliminates the need for artificial scaffolds. However, thus far its efficacy has only been studied by comparing expression levels of generic proteins. Here, it has been evaluated by visualizing the evolution of cell clusters assembled by magnetic forces, to examine their resemblance to in vivo tissues. Cells were labeled with magnetic nanoparticles, then assembled into 3D clusters using magnetic force. Scanning electron microscopy was used to image intercellular interactions and morphological features of the clusters. When cells were held together by magnetic forces for a single day, they formed intercellular contacts through extracellular fibers. These kept the clusters intact once the magnetic forces were removed, thus serving the primary function of scaffolds. The cells self-organized into constructs consistent with the corresponding tissues in vivo. Epithelial cells formed sheets while fibroblasts formed spheroids and exhibited position-dependent morphological heterogeneity. Cells on the periphery of a cluster were flattened while those within were spheroidal, a well-known characteristic of connective tissues in vivo. Cells assembled by magnetic forces presented visual features representative of their in vivo states but largely absent in monolayers. This established the efficacy of contactless assembly as a means to fabricate in vitro tissue models. © 2016 John Wiley & Sons Ltd.
Sideris, Costas; Alshurafa, Nabil; Pourhomayoun, Mohammad; Shahmohammadi, Farhad; Samy, Lauren; Sarrafzadeh, Majid
2015-01-01
In this paper, we propose a novel methodology for utilizing disease diagnostic information to predict severity of condition for Congestive Heart Failure (CHF) patients. Our methodology relies on a novel, clustering-based feature extraction framework using disease diagnostic information. To reduce the dimensionality, we identify disease clusters using co-occurrence frequencies. We then utilize these clusters as features to predict patient severity of condition. We build our clustering and feature extraction algorithm using the 2012 National Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP), which contains 7 million discharge records and ICD-9-CM codes. The proposed framework is tested on Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 patients. We compare our cluster-based feature set with another that incorporates the Charlson comorbidity score as a feature and demonstrate an accuracy improvement of up to 14% in the predictability of the severity of condition.
Left inferior parietal lobe engagement in social cognition and language.
Bzdok, Danilo; Hartwigsen, Gesa; Reid, Andrew; Laird, Angela R; Fox, Peter T; Eickhoff, Simon B
2016-09-01
Social cognition and language are two core features of the human species. Despite distributed recruitment of brain regions in each mental capacity, the left parietal lobe (LPL) represents a zone of topographical convergence. The present study quantitatively summarizes hundreds of neuroimaging studies on social cognition and language. Using connectivity-based parcellation on a meta-analytically defined volume of interest (VOI), regional coactivation patterns within this VOI allowed identifying distinct subregions. Across parcellation solutions, two clusters emerged consistently in rostro-ventral and caudo-ventral aspects of the parietal VOI. Both clusters were functionally significantly associated with social-cognitive and language processing. In particular, the rostro-ventral cluster was associated with lower-level processing facets, while the caudo-ventral cluster was associated with higher-level processing facets in both mental capacities. Contrarily, in the (less stable) dorsal parietal VOI, all clusters reflected computation of general-purpose processes, such as working memory and matching tasks, that are frequently co-recruited by social or language processes. Our results hence favour a rostro-caudal distinction of lower- versus higher-level processes underlying social cognition and language in the left inferior parietal lobe. Copyright © 2016 Elsevier Ltd. All rights reserved.
Tong, Wing-Hang; Maio, Nunziata; Zhang, De-Liang; Palmieri, Erika M; Ollivierre, Hayden; Ghosh, Manik C; McVicar, Daniel W; Rouault, Tracey A
2018-05-22
Given the essential roles of iron-sulfur (Fe-S) cofactors in mediating electron transfer in the mitochondrial respiratory chain and supporting heme biosynthesis, mitochondrial dysfunction is a common feature in a growing list of human Fe-S cluster biogenesis disorders, including Friedreich ataxia and GLRX5-related sideroblastic anemia. Here, our studies showed that restriction of Fe-S cluster biogenesis not only compromised mitochondrial oxidative metabolism but also resulted in decreased overall histone acetylation and increased H3K9me3 levels in the nucleus and increased acetylation of α-tubulin in the cytosol by decreasing the lipoylation of the pyruvate dehydrogenase complex, decreasing levels of succinate dehydrogenase and the histone acetyltransferase ELP3, and increasing levels of the tubulin acetyltransferase MEC17. Previous studies have shown that the metabolic shift in Toll-like receptor (TLR)-activated myeloid cells involves rapid activation of glycolysis and subsequent mitochondrial respiratory failure due to nitric oxide (NO)-mediated damage to Fe-S proteins. Our studies indicated that TLR activation also actively suppresses many components of the Fe-S cluster biogenesis machinery, which exacerbates NO-mediated damage to Fe-S proteins by interfering with cluster recovery. These results reveal new regulatory pathways and novel roles of the Fe-S cluster biogenesis machinery in modifying the epigenome and acetylome and provide new insights into the etiology of Fe-S cluster biogenesis disorders.
Preferences mapping of household biodigester in Bandung
NASA Astrophysics Data System (ADS)
Humaira, S.; Rianawati, E.; Sagala, S.; Sasongko, M. A.
2018-05-01
The Bandung city government implemented household biodigester grants in 2015 and 2016. Unfortunately, some household biodigesters that still function well are not in use. This study is therefore an effort to improve the acceptance and usage rate of household biodigesters in Bandung. Its purpose is to understand citizens' preferences regarding household biodigesters. To this end, we conducted a survey through an online questionnaire based on the eight dimensions of quality defined by Garvin (1987), which served as the basis for constructing factors that might be favoured by current and potential users of household biodigesters. Based on the results of a cluster analysis, three clusters with different preferences were interpreted and profiled through Welch's ANOVA and the Games-Howell test. The study reveals that the cluster with the largest number of members regards reliability and features as the key determinants of current and potential users' preferences. It suggests that developers of household biodigesters focus on cluster 1 and prioritize reliability and features in developing the next household biodigester product to achieve a higher level of public acceptance.
A novel approach to internal crown characterization for coniferous tree species classification
NASA Astrophysics Data System (ADS)
Harikumar, A.; Bovolo, F.; Bruzzone, L.
2016-10-01
Knowledge about individual trees in forests is highly beneficial in forest management. High-density, small-footprint, multi-return airborne Light Detection and Ranging (LiDAR) data can provide very accurate information about the structural properties of individual trees in forests. Every tree species has a unique set of crown structural characteristics that can be used for tree species classification. In this paper, we use both the internal and external crown structural information of a conifer tree crown, derived from a high-density, small-footprint, multi-return LiDAR data acquisition, for species classification. Considering the fact that branches are the major building blocks of a conifer tree crown, we obtain the internal crown structural information using a branch-level analysis. The structure of each conifer branch is represented using clusters in the LiDAR point cloud. We propose the joint use of k-means clustering and geometric shape fitting, on the LiDAR data projected onto a novel 3-dimensional space, to identify branch clusters. After mapping the identified clusters back to the original space, six internal geometric features are estimated using a branch-level analysis. The external crown characteristics are modeled by using six least correlated features based on cone fitting and convex hull. Species classification is performed using a sparse Support Vector Machine (sparse SVM) classifier.
Cluster compression algorithm: A joint clustering/data compression concept
NASA Technical Reports Server (NTRS)
Hilbert, E. E.
1977-01-01
The Cluster Compression Algorithm (CCA), which was developed to reduce costs associated with transmitting, storing, distributing, and interpreting LANDSAT multispectral image data, is described. The CCA is a preprocessing algorithm that uses feature extraction and data compression to represent the information in the image data more efficiently. The format of the preprocessed data enables simple look-up table decoding and direct use of the extracted features to reduce user computation for either image reconstruction or computer interpretation of the image data. Basically, the CCA uses spatially local clustering to extract features from the image data to describe spectral characteristics of the data set. In addition, the features may be used to form a sequence of scalar numbers that define each picture element in terms of the cluster features. This sequence, called the feature map, is then efficiently represented by using source encoding concepts. Various forms of the CCA are defined and experimental results are presented to show trade-offs and characteristics of the various implementations. Examples are provided that demonstrate the application of the cluster compression concept to multi-spectral images from LANDSAT and other sources.
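The idea of a feature map decoded through a look-up table can be sketched as follows. The sketch assumes plain k-means with a farthest-point initialisation applied to a single local block of a multispectral image; the report does not specify these details here, so the clustering rule and the names `kmeans`, `compress_block` and `decode_block` are illustrative assumptions rather than the CCA itself. The source-encoding step for the feature map is omitted.

```python
import numpy as np

def kmeans(pixels, k, n_iter=20):
    """Plain k-means on an (n_pixels, n_bands) float array.

    Centroids start from a deterministic farthest-point initialisation.
    """
    centroids = [pixels[0]]
    for _ in range(1, k):
        dist = np.min(
            [np.sum((pixels - c) ** 2, axis=1) for c in centroids], axis=0
        )
        centroids.append(pixels[np.argmax(dist)])
    centroids = np.array(centroids, dtype=float)

    def assign(cents):
        d = ((pixels[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):              # keep old centroid if cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, assign(centroids)

def compress_block(block, k):
    """Return (cluster features, feature map) for one (H, W, bands) block."""
    h, w, bands = block.shape
    pixels = block.reshape(-1, bands).astype(float)
    centroids, labels = kmeans(pixels, k)
    return centroids, labels.reshape(h, w)

def decode_block(centroids, feature_map):
    """Look-up table decoding: replace each index by its cluster centroid."""
    return centroids[feature_map]

# Hypothetical 4x4 block with three bands and two spectral classes.
block = np.zeros((4, 4, 3))
block[:, :2] = (10, 20, 30)
block[:, 2:] = (200, 180, 90)
features, feature_map = compress_block(block, k=2)
restored = decode_block(features, feature_map)
```

The stored data are then `k` spectral vectors plus one small integer per pixel, which is where the compression comes from; with more spectral variety than clusters, the reconstruction becomes an approximation.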
Knoble, Naomi B; Alderfer, Melissa A; Hossain, Md Jobayer
2016-10-01
Socioeconomic status (SES) is a complex construct of multiple indicators, known to impact cancer outcomes, but has not been adequately examined among pediatric AML patients. This study aimed to identify the patterns of co-occurrence of multiple community-level SES indicators and to explore associations between various patterns of these indicators and pediatric AML mortality risk. A nationally representative US sample of 3651 pediatric AML patients, aged 0-19 years at diagnosis was drawn from 17 Surveillance, Epidemiology, and End Results (SEER) database registries created between 1973 and 2012. Factor analysis, cluster analysis, stratified univariable and multivariable Cox proportional hazards models were used. Four SES factors accounting for 87% of the variance in SES indicators were identified: F1) economic/educational disadvantage, less immigration; F2) immigration-related features (foreign-born, language-isolation, crowding), less mobility; F3) housing instability; and, F4) absence of moving. F1 and F3 showed elevated risk of mortality, adjusted hazards ratios (aHR) (95% CI): 1.07(1.02-1.12) and 1.05(1.00-1.10), respectively. Seven SES-defined cluster groups were identified. Cluster 1 (low economic/educational disadvantage, few immigration-related features, and residential-stability) showed the minimum risk of mortality. Compared to Cluster 1, Cluster 3 (high economic/educational disadvantage, high-mobility) and Cluster 6 (moderately-high economic/educational disadvantages, housing-instability and immigration-related features) exhibited substantially greater risk of mortality, aHR(95% CI)=1.19(1.0-1.4) and 1.23 (1.1-1.5), respectively. Factors of correlated SES-indicators and their pattern-based groups demonstrated differential risks in the pediatric AML mortality indicating the need of special public-health attention in areas with economic-educational disadvantages, housing-instability and immigration-related features. Copyright © 2016 Elsevier Ltd. 
Patient Stratification Using Electronic Health Records from a Chronic Disease Management Program.
Chen, Robert; Sun, Jimeng; Dittus, Robert S; Fabbri, Daniel; Kirby, Jacqueline; Laffer, Cheryl L; McNaughton, Candace D; Malin, Bradley
2016-01-04
The goal of this study is to devise a machine learning framework to assist care coordination programs in prognostic stratification to design and deliver personalized care plans and to allocate financial and medical resources effectively. This study is based on a de-identified cohort of 2,521 hypertension patients from a chronic care coordination program at the Vanderbilt University Medical Center. Patients were modeled as vectors of features derived from electronic health records (EHRs) over a six-year period. We applied a stepwise regression to identify risk factors associated with a decrease in mean arterial pressure of at least 2 mmHg after program enrollment. The resulting features were subsequently validated via a logistic regression classifier. Finally, risk factors were applied to group the patients through model-based clustering. We identified a set of predictive features that consisted of a mix of demographic, medication, and diagnostic concepts. Logistic regression over these features yielded an area under the ROC curve (AUC) of 0.71 (95% CI: [0.67, 0.76]). Based on these features, four clinically meaningful groups are identified through clustering - two of which represented patients with more severe disease profiles, while the remaining represented patients with mild disease profiles. Patients with hypertension can exhibit significant variation in their blood pressure control status and responsiveness to therapy. Yet this work shows that a clustering analysis can generate more homogeneous patient groups, which may aid clinicians in designing and implementing customized care programs. The study shows that predictive modeling and clustering using EHR data can be beneficial for providing a systematic, generalized approach for care providers to tailor their management approach based upon patient-level factors.
de Vries, Natalie Jane; Reis, Rodrigo; Moscato, Pablo
2015-01-01
Organisations in the not-for-profit and charity sector face increasing competition to win time, money and efforts from a common donor base. Consequently, these organisations need to be more proactive than ever. The increased level of communications between individuals and organisations today heightens the need for investigating the drivers of charitable giving and understanding the various consumer groups, or donor segments, within a population. It is contended that 'trust' is the cornerstone of the not-for-profit sector's survival, making it an inevitable topic for research in this context. It has become imperative for charities and not-for-profit organisations to adopt the for-profit sector's research, marketing and targeting strategies. This study provides the not-for-profit sector with an easily interpretable segmentation method based on a novel unsupervised clustering technique (MST-kNN) followed by a feature saliency method (the CM1 score). A sample of 1,562 respondents from a survey conducted by the Australian Charities and Not-for-profits Commission is analysed to reveal donor segments. Each cluster's most salient features are identified using the CM1 score. Furthermore, symbolic regression modelling is employed to find cluster-specific models to predict 'low' or 'high' involvement in clusters. The MST-kNN method found seven clusters. Based on their salient features they were labelled as: the 'non-institutionalist charities supporters', the 'resource allocation critics', the 'information-seeking financial sceptics', the 'non-questioning charity supporters', the 'non-trusting sceptics', the 'charity management believers' and the 'institutionalist charity believers'. Each cluster exhibits its own characteristics as well as different drivers of 'involvement'. The method in this study provides the not-for-profit sector with a guideline for clustering, segmenting, understanding and potentially targeting their donor base better. 
If charities and not-for-profit organisations adopt these strategies, they will be more successful in today's competitive environment. PMID:25849547
Cellucci, Tania; Tyrrell, Pascal N; Twilt, Marinka; Sheikh, Shehla; Benseler, Susanne M
2014-03-01
To identify distinct clusters of children with inflammatory brain diseases based on clinical, laboratory, and imaging features at presentation, to assess which features contribute strongly to the development of clusters, and to compare additional features between the identified clusters. A single-center cohort study was performed with children who had been diagnosed as having an inflammatory brain disease between June 1, 1989 and December 31, 2010. Demographic, clinical, laboratory, neuroimaging, and histologic data at diagnosis were collected. K-means cluster analysis was performed to identify clusters of patients based on their presenting features. Associations between the clusters and patient variables, such as diagnoses, were determined. A total of 147 children (50% female; median age 8.8 years) were identified: 105 with primary central nervous system (CNS) vasculitis, 11 with secondary CNS vasculitis, 8 with neuronal antibody syndromes, 6 with postinfectious syndromes, and 17 with other inflammatory brain diseases. Three distinct clusters were identified. Paresis and speech deficits were the most common presenting features in cluster 1. Children in cluster 2 were likely to present with behavior changes, cognitive dysfunction, and seizures, while those in cluster 3 experienced ataxia, vision abnormalities, and seizures. Lesions seen on T2/fluid-attenuated inversion recovery sequences of magnetic resonance imaging were common in all clusters, but unilateral ischemic lesions were more prominent in cluster 1. The clusters were associated with specific diagnoses and diagnostic test results. Children with inflammatory brain diseases presented with distinct phenotypical patterns that are associated with specific diagnoses. This information may inform the development of a diagnostic classification of childhood inflammatory brain diseases and suggest that specific pathways of diagnostic evaluation are warranted. Copyright © 2014 by the American College of Rheumatology.
A graph-Laplacian-based feature extraction algorithm for neural spike sorting.
Ghanbari, Yasser; Spence, Larry; Papamichalis, Panos
2009-01-01
Analysis of extracellular neural spike recordings is highly dependent upon the accuracy of neural waveform classification, commonly referred to as spike sorting. Feature extraction is an important stage of this process because it can limit the quality of clustering which is performed in the feature space. This paper proposes a new feature extraction method (which we call Graph Laplacian Features, GLF) based on minimizing the graph Laplacian and maximizing the weighted variance. The algorithm is compared with Principal Components Analysis (PCA, the most commonly-used feature extraction method) using simulated neural data. The results show that the proposed algorithm produces more compact and well-separated clusters compared to PCA. As an added benefit, tentative cluster centers are output which can be used to initialize a subsequent clustering stage.
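The PCA baseline named in this abstract can be sketched in a few lines. The sketch covers only the PCA comparison method, not the proposed Graph Laplacian Features, and it uses synthetic waveforms; the two spike templates, the noise level, and the function name `pca_features` are assumptions made purely for illustration.

```python
import numpy as np

def pca_features(waveforms, n_components=2):
    """Project spike waveforms (n_spikes, n_samples) onto leading principal axes."""
    centered = waveforms - waveforms.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Hypothetical data: 20 noisy copies of each of two spike templates.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 32)
template_a = np.exp(-((t - 0.3) / 0.05) ** 2)
template_b = -0.5 * np.exp(-((t - 0.5) / 0.1) ** 2)
waveforms = np.vstack([
    template_a + 0.05 * rng.standard_normal((20, t.size)),
    template_b + 0.05 * rng.standard_normal((20, t.size)),
])

features = pca_features(waveforms, n_components=2)
```

Each row of `features` is a low-dimensional point that a subsequent clustering stage could group by neuron.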
A computational microscopy study of nanostructural evolution in irradiated pressure vessel steels
NASA Astrophysics Data System (ADS)
Odette, G. R.; Wirth, B. D.
1997-11-01
Nanostructural features that form in reactor pressure vessel steels under neutron irradiation at around 300°C lead to significant hardening and embrittlement. Continuum thermodynamic-kinetic based rate theories have been very successful in modeling the general characteristics of the copper and manganese nickel rich precipitate evolution, often the dominant source of embrittlement. However, a more detailed atomic scale understanding of these features is needed to interpret experimental measurements and better underpin predictive embrittlement models. Further, other embrittling features, believed to be subnanometer defect (vacancy)-solute complexes and small regions of modest enrichment of solutes are not well understood. A general approach to modeling embrittlement nanostructures, based on the concept of a computational microscope, is described. The objective of the computational microscope is to self-consistently integrate atomic scale simulations with other sources of information, including a wide range of experiments. In this work, lattice Monte Carlo (LMC) simulations are used to resolve the chemically and structurally complex nature of CuMnNiSi precipitates. The LMC simulations unify various nanoscale analytical characterization methods and basic thermodynamics. The LMC simulations also reveal that significant coupled vacancy and solute clustering takes place during cascade aging. The cascade clustering produces the metastable vacancy-cluster solute complexes that mediate flux effects. Cascade solute clustering may also play a role in the formation of dilute atmospheres of solute enrichment and enhance the nucleation of manganese-nickel rich precipitates at low Cu levels. Further, the simulations suggest that complex, highly correlated processes (e.g. cluster diffusion, formation of favored vacancy diffusion paths and solute scavenging vacancy cluster complexes) may lead to anomalous fast thermal aging kinetics at temperatures below about 450°C. 
The potential technical significance of these phenomena is described.
Chou, A; Burke, J
1999-05-01
DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect, and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Consequences of realistic embedding for the L2,3 edge XAS of α-Fe2O3
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bagus, Paul S.; Nelin, Connie J.; Sassi, Michel
Cluster models of condensed systems are often used to simulate the core-level spectra obtained with X-ray Photoelectron Spectroscopy, XPS, or with X-ray Absorption Spectroscopy, XAS, especially for near edge features.
NASA Astrophysics Data System (ADS)
Yin, Bing; Li, Teng; Li, Jin-Feng; Yu, Yang; Li, Jian-Li; Wen, Zhen-Yi; Jiang, Zhen-Yi
2014-03-01
The first theoretical exploration of superhalogen properties of polynuclear structures based on pseudohalogen ligand is reported here via a case study on eight triply-bridged [Mg2(CN)5]- clusters. From our high-level ab initio results, all these clusters are superhalogens due to their high vertical electron detachment energies (VDE), of which the largest value is 8.67 eV at coupled-cluster single double triple (CCSD(T)) level. Although outer valence Green's function results are consistent with CCSD(T) in most cases, it overestimates the VDEs of three anions dramatically by more than 1 eV. Therefore, the combined usage of several theoretical methods is important for the accuracy of purely theoretical prediction of superhalogen properties of new structures. Spatial distribution of the extra electron of high-VDE anions here indicates two features: remarkable aggregation on bridging CN units and non-negligible distribution on every CN unit. These two features lower the potential and kinetic energies of the extra electron respectively and thus lead to high VDE. Besides superhalogen properties, the structures, relative stabilities and thermodynamic stabilities with respect to detachment of CN-1 were also investigated for these anions. The collection of these results indicates that polynuclear structures based on pseudohalogen ligand are promising candidates for new superhalogens with enhanced properties.
Hoadley, Katherine A; Yau, Christina; Hinoue, Toshinori; Wolf, Denise M; Lazar, Alexander J; Drill, Esther; Shen, Ronglai; Taylor, Alison M; Cherniack, Andrew D; Thorsson, Vésteinn; Akbani, Rehan; Bowlby, Reanne; Wong, Christopher K; Wiznerowicz, Maciej; Sanchez-Vega, Francisco; Robertson, A Gordon; Schneider, Barbara G; Lawrence, Michael S; Noushmehr, Houtan; Malta, Tathiane M; Stuart, Joshua M; Benz, Christopher C; Laird, Peter W
2018-04-05
We conducted comprehensive integrative molecular analyses of the complete set of tumors in The Cancer Genome Atlas (TCGA), consisting of approximately 10,000 specimens and representing 33 types of cancer. We performed molecular clustering using data on chromosome-arm-level aneuploidy, DNA hypermethylation, mRNA, and miRNA expression levels and reverse-phase protein arrays, of which all, except for aneuploidy, revealed clustering primarily organized by histology, tissue type, or anatomic origin. The influence of cell type was evident in DNA-methylation-based clustering, even after excluding sites with known preexisting tissue-type-specific methylation. Integrative clustering further emphasized the dominant role of cell-of-origin patterns. Molecular similarities among histologically or anatomically related cancer types provide a basis for focused pan-cancer analyses, such as pan-gastrointestinal, pan-gynecological, pan-kidney, and pan-squamous cancers, and those related by stemness features, which in turn may inform strategies for future therapeutic development. Copyright © 2018 Elsevier Inc. All rights reserved.
Jeong, Jeong-Won; Shin, Dae C; Do, Synho; Marmarelis, Vasilis Z
2006-08-01
This paper presents a novel segmentation methodology for automated classification and differentiation of soft tissues using multiband data obtained with the newly developed system of high-resolution ultrasonic transmission tomography (HUTT) for imaging biological organs. This methodology extends and combines two existing approaches: the L-level set active contour (AC) segmentation approach and the agglomerative hierarchical k-means approach for unsupervised clustering (UC). To prevent the trapping of the current iterative minimization AC algorithm in a local minimum, we introduce a multiresolution approach that applies the level set functions at successively increasing resolutions of the image data. The resulting AC clusters are subsequently rearranged by the UC algorithm, which seeks the optimal set of clusters yielding the minimum within-cluster distances in the feature space. The presented results from Monte Carlo simulations and experimental animal-tissue data demonstrate that the proposed methodology outperforms other existing methods without depending on heuristic parameters and provides a reliable means for soft tissue differentiation in HUTT images.
NASA Technical Reports Server (NTRS)
Eigen, D. J.; Fromm, F. R.; Northouse, R. A.
1974-01-01
A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.
A harmonic linear dynamical system for prominent ECG feature extraction.
Thi, Ngoc Anh Nguyen; Yang, Hyung-Jeong; Kim, SunHee; Do, Luu Ngoc
2014-01-01
Unsupervised mining of electrocardiography (ECG) time series is a crucial task in biomedical applications. For clustering to be effective, the prominent features extracted by preprocessing multiple ECG time series need to be investigated. In this paper, a Harmonic Linear Dynamical System is applied to discover prominent features by mining the evolving hidden dynamics and correlations in ECG time series. The comprehensible and interpretable features discovered by the proposed feature extraction methodology support the accuracy and reliability of the clustering results. In particular, the empirical evaluation of the proposed method demonstrates improved clustering performance compared to previous mainstream feature extraction approaches for ECG time series clustering tasks. Furthermore, the experimental results on real-world datasets show that the method scales, with computation time linear in the duration of the time series.
Shah, Sohil Atul
2017-01-01
Clustering is a fundamental procedure in the analysis of scientific data. It is used ubiquitously across the sciences. Despite decades of research, existing clustering algorithms have limited effectiveness in high dimensions and often require tuning parameters for different domains and datasets. We present a clustering algorithm that achieves high accuracy across multiple domains and scales efficiently to high dimensions and large datasets. The presented algorithm optimizes a smooth continuous objective, which is based on robust statistics and allows heavily mixed clusters to be untangled. The continuous nature of the objective also allows clustering to be integrated as a module in end-to-end feature learning pipelines. We demonstrate this by extending the algorithm to perform joint clustering and dimensionality reduction by efficiently optimizing a continuous global objective. The presented approach is evaluated on large datasets of faces, hand-written digits, objects, newswire articles, sensor readings from the Space Shuttle, and protein expression levels. Our method achieves high accuracy across all datasets, outperforming the best prior algorithm by a factor of 3 in average rank. PMID:28851838
Generic features of the primary relaxation in glass-forming materials (Review Article)
NASA Astrophysics Data System (ADS)
Kokshenev, Valery B.
2017-08-01
We discuss structural relaxation in molecular and polymeric supercooled liquids, metallic alloys and orientational glass crystals. The study stresses especially the relationships between observables raised from underlying constraints imposed on degrees of freedom of vitrification systems. A self-consistent parametrization of the α-timescale on macroscopic level results in the material-and-model independent universal equation, relating three fundamental temperatures, characteristic of the primary relaxation, that is numerically proven in all studied glass formers. During the primary relaxation, the corresponding small and large mesoscopic clusters modify their size and structure in a self-similar way, regardless of underlying microscopic realizations. We show that cluster-shape similarity, instead of cluster-size fictive divergence, gives rise to universal features observed in primary relaxation. In all glass formers with structural disorder, including orientational-glass materials (with the exception of plastic crystals), structural relaxation is shown to be driven by local random fields. Within the dynamic stochastic approach, the universal subdiffusive dynamics corresponds to random walks on small and large fractals.
Lin, Nan; Jiang, Junhai; Guo, Shicheng; Xiong, Momiao
2015-01-01
Due to the advancement in sensor technology, the growing large medical image data have the ability to visualize the anatomical changes in biological tissues. As a consequence, the medical images have the potential to enhance the diagnosis of disease, the prediction of clinical outcomes and the characterization of disease progression. But in the meantime, the growing data dimensions pose great methodological and computational challenges for the representation and selection of features in image cluster analysis. To address these challenges, we first extend the functional principal component analysis (FPCA) from one dimension to two dimensions to fully capture the space variation of the image signals. The image signals contain a large number of redundant features which provide no additional information for clustering analysis. The widely used methods for removing the irrelevant features are sparse clustering algorithms using a lasso-type penalty to select the features. However, the accuracy of clustering using a lasso-type penalty depends on the selection of the penalty parameters and the threshold value. In practice, they are difficult to determine. Recently, randomized algorithms have received a great deal of attention in big data analysis. This paper presents a randomized algorithm for accurate feature selection in image clustering analysis. The proposed method is applied to both the liver and kidney cancer histology image data from the TCGA database. The results demonstrate that the randomized feature selection method coupled with functional principal component analysis substantially outperforms the current sparse clustering algorithms in image cluster analysis. PMID:26196383
Maio, Nunziata; Palmieri, Erika M.; Ollivierre, Hayden; Ghosh, Manik C.
2018-01-01
Given the essential roles of iron-sulfur (Fe-S) cofactors in mediating electron transfer in the mitochondrial respiratory chain and supporting heme biosynthesis, mitochondrial dysfunction is a common feature in a growing list of human Fe-S cluster biogenesis disorders, including Friedreich ataxia and GLRX5-related sideroblastic anemia. Here, our studies showed that restriction of Fe-S cluster biogenesis not only compromised mitochondrial oxidative metabolism but also resulted in decreased overall histone acetylation and increased H3K9me3 levels in the nucleus and increased acetylation of α-tubulin in the cytosol by decreasing the lipoylation of the pyruvate dehydrogenase complex, decreasing levels of succinate dehydrogenase and the histone acetyltransferase ELP3, and increasing levels of the tubulin acetyltransferase MEC17. Previous studies have shown that the metabolic shift in Toll-like receptor (TLR)–activated myeloid cells involves rapid activation of glycolysis and subsequent mitochondrial respiratory failure due to nitric oxide (NO)–mediated damage to Fe-S proteins. Our studies indicated that TLR activation also actively suppresses many components of the Fe-S cluster biogenesis machinery, which exacerbates NO-mediated damage to Fe-S proteins by interfering with cluster recovery. These results reveal new regulatory pathways and novel roles of the Fe-S cluster biogenesis machinery in modifying the epigenome and acetylome and provide new insights into the etiology of Fe-S cluster biogenesis disorders. PMID:29784770
Ning, P; Guo, Y F; Sun, T Y; Zhang, H S; Chai, D; Li, X M
2016-09-01
To study the distinct clinical phenotypes of chronic airway diseases by hierarchical cluster analysis and two-step cluster analysis. A population sample of adult patients in Donghuamen community, Dongcheng district and Qinghe community, Haidian district, Beijing from April 2012 to January 2015, who had wheeze within the last 12 months, underwent detailed investigation, including a clinical questionnaire, pulmonary function tests, total serum IgE levels, blood eosinophil level and a peak flow diary. Nine variables were chosen as evaluating parameters, including pre-salbutamol forced expiratory volume in one second (FEV1)/forced vital capacity (FVC) ratio, pre-salbutamol FEV1, percentage of post-salbutamol change in FEV1, residual capacity, diffusing capacity of the lung for carbon monoxide/alveolar volume adjusted for haemoglobin level, peak expiratory flow (PEF) variability, serum IgE level, cumulative tobacco cigarette consumption (pack-years) and respiratory symptoms (cough and expectoration). Subjects' clinical phenotypes were identified by hierarchical cluster analysis and two-step cluster analysis. (1) Four clusters were identified by hierarchical cluster analysis. Cluster 1 was chronic bronchitis in smokers with normal pulmonary function. Cluster 2 was chronic bronchitis or mild chronic obstructive pulmonary disease (COPD) patients with mild airflow limitation. Cluster 3 included COPD patients with heavy smoking, poor quality of life and severe airflow limitation. Cluster 4 recognized atopic patients with mild airflow limitation, elevated serum IgE and clinical features of asthma. Significant differences were revealed regarding pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, maximal mid-expiratory flow curve (MMEF)% pred, carbon monoxide diffusing capacity per liter of alveolar volume (DLCO/VA)% pred, residual volume (RV)% pred, total serum IgE level, smoking history (pack-years), St. George's Respiratory Questionnaire (SGRQ) score, acute exacerbation in the past one year, PEF variability and allergic dermatitis (P<0.05). (2) Four clusters were also identified by two-step cluster analysis, as follows: cluster 1, COPD patients with moderate to severe airflow limitation; cluster 2, asthma and COPD patients with heavy smoking, airflow limitation and increased airways reversibility; cluster 3, patients having less smoking and normal pulmonary function with wheezing but no chronic cough; cluster 4, chronic bronchitis patients with normal pulmonary function and chronic cough. Significant differences were revealed regarding gender distribution, respiratory symptoms, pre-salbutamol FEV1/FVC%, pre-salbutamol FEV1% pred, post-salbutamol change in FEV1%, MMEF% pred, DLCO/VA% pred, RV% pred, PEF variability, total serum IgE level, cumulative tobacco cigarette consumption (pack-years), and SGRQ score (P<0.05). By different cluster analyses, distinct clinical phenotypes of chronic airway diseases are identified. Thus, different phenotypes may guide doctors in providing individualized treatments.
Hidden electronic rule in the “cluster-plus-glue-atom” model
Du, Jinglian; Dong, Chuang; Melnik, Roderick; Kawazoe, Yoshiyuki; Wen, Bin
2016-01-01
Electrons and their interactions are intrinsic factors affecting the structure and properties of materials. Based on the “cluster-plus-glue-atom” model, an electron counting rule for complex metallic alloys (CMAs) has been revealed in this work (i.e., the CPGAMEC rule). Our results on the cluster structure and electron concentration of CMAs with apparent cluster features indicate that the numbers of valence electrons per unit cluster formula for these CMAs are specific constants, multiples of eight and of twelve. This is thus termed the specific electrons cluster formula. The CPGAMEC rule has been demonstrated to be useful guidance for the design of CMAs with desired properties, while its practical applications and underlying mechanism have been illustrated on the basis of CMAs’ cluster structural features. Our investigation provides an aggregate picture of the intriguing electronic rule and atomic structural features of CMAs. PMID:27642002
Unsupervised feature relevance analysis applied to improve ECG heartbeat clustering.
Rodríguez-Sotelo, J L; Peluffo-Ordoñez, D; Cuesta-Frau, D; Castellanos-Domínguez, G
2012-10-01
The computer-assisted analysis of biomedical records has become an essential tool in clinical settings. However, current devices provide a growing amount of data that often exceeds the processing capacity of normal computers. As this amount of information rises, new demands for more efficient data extracting methods appear. This paper addresses the task of data mining in physiological records using a feature selection scheme. An unsupervised method based on relevance analysis is described. This scheme uses a least-squares optimization of the input feature matrix in a single iteration. The output of the algorithm is a feature weighting vector. The performance of the method was assessed using a heartbeat clustering test on real ECG records. The quantitative cluster validity measures yielded a correctly classified heartbeat rate of 98.69% (specificity), 85.88% (sensitivity) and 95.04% (general clustering performance), which is even higher than the performance achieved by other similar ECG clustering studies. The number of features was reduced on average from 100 to 18, and the temporal cost was 43% lower than in previous ECG clustering schemes. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Shi, Wenzhong; Deng, Susu; Xu, Wenbing
2018-02-01
For automatic landslide detection, landslide morphological features should be quantitatively expressed and extracted. High-resolution Digital Elevation Models (DEMs) derived from airborne Light Detection and Ranging (LiDAR) data allow fine-scale morphological features to be extracted, but noise in DEMs influences morphological feature extraction, and the multi-scale nature of landslide features should be considered. This paper proposes a method to extract landslide morphological features characterized by homogeneous spatial patterns. Both profile and tangential curvature are utilized to quantify land surface morphology, and a local Gi* statistic is calculated for each cell to identify significant patterns of clustering of similar morphometric values. The method was tested on both synthetic surfaces simulating natural terrain and airborne LiDAR data acquired over an area dominated by shallow debris slides and flows. The test results of the synthetic data indicate that the concave and convex morphologies of the simulated terrain features at different scales and distinctness could be recognized using the proposed method, even when random noise was added to the synthetic data. In the test area, cells with large local Gi* values were extracted at a specified significance level from the profile and the tangential curvature image generated from the LiDAR-derived 1-m DEM. The morphologies of landslide main scarps, source areas and trails were clearly indicated, and the morphological features were represented by clusters of extracted cells. A comparison with the morphological feature extraction method based on curvature thresholds proved the proposed method's robustness to DEM noise. When verified against a landslide inventory, the morphological features of almost all recent (< 5 years) landslides and approximately 35% of historical (> 10 years) landslides were extracted. 
This finding indicates that the proposed method can facilitate landslide detection, although the cell clusters extracted from curvature images should be filtered using a filtering strategy based on supplementary information provided by expert knowledge or other data sources.
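The local Gi* step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard Getis-Ord Gi* z-score with binary weights over a square moving window; the window size, the function name `local_gi_star`, and the toy grid standing in for a curvature raster are all hypothetical and are not taken from the paper.

```python
import math

def local_gi_star(grid, radius=1):
    """Getis-Ord Gi* z-score per cell, binary weights in a square window.

    Assumption: the neighbourhood is a (2*radius+1)^2 window clipped at the
    raster edge and includes the cell itself (the "star" in Gi*).
    """
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    # Global standard deviation of all cell values.
    s = math.sqrt(sum(v * v for v in values) / n - mean ** 2)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            w = len(window)  # with binary weights, sum(w_ij) == sum(w_ij^2) == w
            denom = s * math.sqrt((n * w - w * w) / (n - 1))
            out[r][c] = (sum(window) - mean * w) / denom
    return out

# Hypothetical 6x6 "curvature" raster: a 3x3 patch of high values in one corner.
toy = [[1.0 if r < 3 and c < 3 else 0.0 for c in range(6)] for r in range(6)]
z = local_gi_star(toy)
hot_cells = [(r, c) for r in range(6) for c in range(6) if z[r][c] > 1.96]
```

Cells whose z-score exceeds a chosen critical value (1.96 here, an illustrative two-sided 5% level) form the extracted clusters; a real workflow would compute this separately for the profile and tangential curvature rasters.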
Kebir, Sied; Khurshid, Zain; Gaertner, Florian C; Essler, Markus; Hattingen, Elke; Fimmers, Rolf; Scheffler, Björn; Herrlinger, Ulrich; Bundschuh, Ralph A; Glas, Martin
2017-01-31
Timely detection of pseudoprogression (PSP) is crucial for the management of patients with high-grade glioma (HGG) but remains difficult. Textural features of O-(2-[18F]fluoroethyl)-L-tyrosine positron emission tomography (FET-PET) mirror tumor uptake heterogeneity; some of them may be associated with tumor progression. Fourteen patients with HGG and suspected of PSP underwent FET-PET imaging. A set of 19 conventional and textural FET-PET features were evaluated and subjected to unsupervised consensus clustering. The final diagnosis of true progression vs. PSP was based on follow-up MRI using RANO criteria. Three robust clusters have been identified based on 10 predominantly textural FET-PET features. None of the patients with PSP fell into cluster 2, which was associated with high values for textural FET-PET markers of uptake heterogeneity. Three out of 4 patients with PSP were assigned to cluster 3 that was largely associated with low values of textural FET-PET features. By comparison, tumor-to-normal brain ratio (TNRmax) at the optimal cutoff 2.1 was less predictive of PSP (negative predictive value 57% for detecting true progression, p=0.07 vs. 75% with cluster 3, p=0.04). Clustering based on textural O-(2-[18F]fluoroethyl)-L-tyrosine PET features may provide valuable information in assessing the elusive phenomenon of pseudoprogression.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fransson, Thomas; Norman, Patrick; Coriani, Sonia
2013-03-28
Near carbon K-edge X-ray absorption fine structure spectra of a series of fluorine-substituted ethenes and acetone have been studied using coupled cluster and density functional theory (DFT) polarization propagator methods, as well as the static-exchange (STEX) approach. With the complex polarization propagator (CPP) implemented in coupled cluster theory, relaxation effects following the excitation of core electrons are accounted for in terms of electron correlation, enabling a systematic convergence of these effects with respect to electron excitations in the cluster operator. Coupled cluster results have been used as benchmarks for the assessment of propagator methods in DFT as well as the state-specific static-exchange approach. Calculations on ethene and 1,1-difluoroethene illustrate the possibility of using nonrelativistic coupled cluster singles and doubles (CCSD) with additional effects of electron correlation and relativity added as scalar shifts in energetics. It has been demonstrated that CPP spectra obtained with coupled cluster singles and approximate doubles (CC2), CCSD, and DFT (with a Coulomb attenuated exchange-correlation functional) yield excellent predictions of chemical shifts for vinylfluoride, 1,1-difluoroethene, trifluoroethene, as well as good spectral features for acetone in the case of CCSD and DFT. Following this, CPP-DFT is considered to be a viable option for the calculation of X-ray absorption spectra of larger π-conjugated systems, and CC2 is deemed applicable for chemical shifts but not for studies of fine structure features. The CCSD method as well as the more approximate CC2 method are shown to yield spectral features relating to π*-resonances in good agreement with experiment, not only for the aforementioned molecules but also for ethene, cis-1,2-difluoroethene, and tetrafluoroethene. The STEX approach is shown to underestimate π*-peak separations due to spectral compressions, a characteristic which is inherent to this method.
Fransson, Thomas; Coriani, Sonia; Christiansen, Ove; Norman, Patrick
2013-03-28
Near carbon K-edge X-ray absorption fine structure spectra of a series of fluorine-substituted ethenes and acetone have been studied using coupled cluster and density functional theory (DFT) polarization propagator methods, as well as the static-exchange (STEX) approach. With the complex polarization propagator (CPP) implemented in coupled cluster theory, relaxation effects following the excitation of core electrons are accounted for in terms of electron correlation, enabling a systematic convergence of these effects with respect to electron excitations in the cluster operator. Coupled cluster results have been used as benchmarks for the assessment of propagator methods in DFT as well as the state-specific static-exchange approach. Calculations on ethene and 1,1-difluoroethene illustrate the possibility of using nonrelativistic coupled cluster singles and doubles (CCSD) with additional effects of electron correlation and relativity added as scalar shifts in energetics. It has been demonstrated that CPP spectra obtained with coupled cluster singles and approximate doubles (CC2), CCSD, and DFT (with a Coulomb attenuated exchange-correlation functional) yield excellent predictions of chemical shifts for vinylfluoride, 1,1-difluoroethene, trifluoroethene, as well as good spectral features for acetone in the case of CCSD and DFT. Following this, CPP-DFT is considered to be a viable option for the calculation of X-ray absorption spectra of larger π-conjugated systems, and CC2 is deemed applicable for chemical shifts but not for studies of fine structure features. The CCSD method as well as the more approximate CC2 method are shown to yield spectral features relating to π∗-resonances in good agreement with experiment, not only for the aforementioned molecules but also for ethene, cis-1,2-difluoroethene, and tetrafluoroethene. The STEX approach is shown to underestimate π∗-peak separations due to spectral compressions, a characteristic which is inherent to this method.
Diagnostic and prognostic histopathology system using morphometric indices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parvin, Bahram; Chang, Hang; Han, Ju
Determining at least one of a prognosis or a therapy for a patient based on a stained tissue section of the patient. An image of a stained tissue section of a patient is processed by a processing device. A set of feature values for a set of cell-based features is extracted from the processed image, and the processed image is associated with a particular cluster of a plurality of clusters based on the set of feature values, where the plurality of clusters is defined with respect to a feature space corresponding to the set of features.
A multiple-feature and multiple-kernel scene segmentation algorithm for humanoid robot.
Liu, Zhi; Xu, Shuqiong; Zhang, Yun; Chen, Chun Lung Philip
2014-11-01
This technical correspondence presents a multiple-feature and multiple-kernel support vector machine (MFMK-SVM) methodology to achieve a more reliable and robust segmentation performance for humanoid robot. The pixel wise intensity, gradient, and C1 SMF features are extracted via the local homogeneity model and Gabor filter, which would be used as inputs of MFMK-SVM model. It may provide multiple features of the samples for easier implementation and efficient computation of MFMK-SVM model. A new clustering method, which is called feature validity-interval type-2 fuzzy C-means (FV-IT2FCM) clustering algorithm, is proposed by integrating a type-2 fuzzy criterion in the clustering optimization process to improve the robustness and reliability of clustering results by the iterative optimization. Furthermore, the clustering validity is employed to select the training samples for the learning of the MFMK-SVM model. The MFMK-SVM scene segmentation method is able to fully take advantage of the multiple features of scene image and the ability of multiple kernels. Experiments on the BSDS dataset and real natural scene images demonstrate the superior performance of our proposed method.
High precision relocation of earthquakes at Iliamna Volcano, Alaska
Statz-Boyer, P.; Thurber, C.; Pesicek, J.; Prejean, S.
2009-01-01
In August 1996, a period of elevated seismicity commenced beneath Iliamna Volcano, Alaska. This activity lasted until early 1997, consisted of over 3000 earthquakes, and was accompanied by elevated emissions of volcanic gases. No eruption occurred and seismicity returned to background levels where it has remained since. We use waveform alignment with bispectrum-verified cross-correlation and double-difference methods to relocate over 2000 earthquakes from 1996 to 2005 with high precision (~100 m). The results of this analysis greatly clarify the distribution of seismic activity, revealing distinct features previously hidden by location scatter. A set of linear earthquake clusters diverges upward and southward from the main group of earthquakes. The events in these linear clusters show a clear southward migration with time. We suggest that these earthquakes represent either a response to degassing of the magma body, circulation of fluids due to exsolution from magma or heating of ground water, or possibly the intrusion of new dikes beneath Iliamna's southern flank. In addition, we speculate that the deeper, somewhat diffuse cluster of seismicity near and south of Iliamna's summit indicates the presence of an underlying magma body between about 2 and 4 km depth below sea level, based on similar features found previously at several other Alaskan volcanoes. © 2009 Elsevier B.V.
A roadmap of clustering algorithms: finding a match for a biomedical application.
Andreopoulos, Bill; An, Aijun; Wang, Xiaogang; Schroeder, Michael
2009-05-01
Clustering is ubiquitously applied in bioinformatics with hierarchical clustering and k-means partitioning being the most popular methods. Numerous improvements of these two clustering methods have been introduced, as well as completely different approaches such as grid-based, density-based and model-based clustering. For improved bioinformatics analysis of data, it is important to match clusterings to the requirements of a biomedical application. In this article, we present a set of desirable clustering features that are used as evaluation criteria for clustering algorithms. We review 40 different clustering algorithms of all approaches and datatypes. We compare algorithms on the basis of desirable clustering features, and outline algorithms' benefits and drawbacks as a basis for matching them to biomedical applications.
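As a small illustration of the hierarchical family named in the abstract above, the following is a minimal sketch of agglomerative clustering with single linkage. The linkage choice, the function name `single_linkage`, and the toy one-dimensional data are assumptions made for illustration; the review itself covers many variants and does not prescribe this one.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Returns clusters as sorted lists of point indices. Single linkage is an
    assumed choice; complete or average linkage would change `linkage` only.
    """
    clusters = [[i] for i in range(len(points))]

    def dist(i, j):
        # Euclidean distance between two points given as tuples.
        return sum((a - b) ** 2 for a, b in zip(points[i], points[j])) ** 0.5

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(i, j) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = (
            (x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        a, b = min(pairs, key=lambda xy: linkage(clusters[xy[0]], clusters[xy[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected by the deletion
    return [sorted(c) for c in clusters]

# Hypothetical 1-D measurements forming three visually obvious groups.
samples = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (9.0,)]
groups = single_linkage(samples, 3)
```

This naive version recomputes all pairwise linkages at every merge, which is fine for a handful of points but not for large datasets; production code would typically rely on an established library implementation instead.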
An Objective Classification of Saturn Cloud Features from Cassini ISS Images
NASA Technical Reports Server (NTRS)
Del Genio, Anthony D.; Barbara, John M.
2016-01-01
A k-means clustering algorithm is applied to Cassini Imaging Science Subsystem continuum and methane band images of Saturn's northern hemisphere to objectively classify regional albedo features and aid in their dynamical interpretation. The procedure is based on a technique applied previously to visible-infrared images of Earth. It provides a new perspective on giant planet cloud morphology and its relationship to the dynamics and a meteorological context for the analysis of other types of simultaneous Saturn observations. The method identifies 6 clusters that exhibit distinct morphology, vertical structure, and preferred latitudes of occurrence. These correspond to areas dominated by deep convective cells; low contrast areas, some including thinner and thicker clouds possibly associated with baroclinic instability; regions with possible isolated thin cirrus clouds; darker areas due to thinner low level clouds or clearer skies due to downwelling, or due to absorbing particles; and fields of relatively shallow cumulus clouds. The spatial associations among these cloud types suggest that dynamically, there are three distinct types of latitude bands on Saturn: deep convectively disturbed latitudes in cyclonic shear regions poleward of the eastward jets; convectively suppressed regions near and surrounding the westward jets; and baroclinically unstable latitudes near eastward jet cores and in the anti-cyclonic regions equatorward of them. These are roughly analogous to some of the features of Earth's tropics, subtropics, and midlatitudes, respectively. This classification may be more useful for dynamics purposes than the traditional belt-zone partitioning. Temporal variations of feature contrast and cluster occurrence suggest that the upper tropospheric haze in the northern hemisphere may have thickened by 2014. The results suggest that routine use of clustering may be a worthwhile complement to many different types of planetary atmospheric data analysis.
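For readers unfamiliar with the k-means step in the abstract above, here is a minimal sketch of Lloyd's algorithm on two-channel pixel values. The two channels are only loosely analogous to the continuum and methane-band brightnesses; the data, the function name `kmeans`, the initialisation, and the choice of k = 2 are hypothetical (the study itself reports 6 clusters and does not describe its implementation here).

```python
import random

def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed init: k distinct data points
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so centers are final
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical (channel 1, channel 2) brightness pairs for six pixels.
pixels = [(0.10, 0.20), (0.15, 0.22), (0.12, 0.18),
          (0.90, 0.80), (0.88, 0.85), (0.92, 0.79)]
labels, centers = kmeans(pixels, 2)
```

Cluster numbering depends on the random initialisation, so only the grouping of pixels, not the label values, should be interpreted.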
NASA Astrophysics Data System (ADS)
Chen, Siyue; Leung, Henry; Dondo, Maxwell
2014-05-01
As computer network security threats increase, many organizations implement multiple Network Intrusion Detection Systems (NIDS) to maximize the likelihood of intrusion detection and provide a comprehensive understanding of intrusion activities. However, NIDS trigger a massive number of alerts on a daily basis. This can be overwhelming for computer network security analysts since it is a slow and tedious process to manually analyse each alert produced. Thus, automated and intelligent clustering of alerts is important to reveal the structural correlation of events by grouping alerts with common features. As the nature of computer network attacks, and therefore alerts, is not known in advance, unsupervised alert clustering is a promising approach to achieve this goal. We propose a joint optimization technique for feature selection and clustering to aggregate similar alerts and to reduce the number of alerts that analysts have to handle individually. More precisely, each identified feature is assigned a binary value, which reflects the feature's saliency. This value is treated as a hidden variable and incorporated into a likelihood function for clustering. Since computing the optimal solution of the likelihood function directly is analytically intractable, we use the Expectation-Maximisation (EM) algorithm to iteratively update the hidden variable and use it to maximize the expected likelihood. Our empirical results, using a labelled Defense Advanced Research Projects Agency (DARPA) 2000 reference dataset, show that the proposed method gives better results than the EM clustering without feature selection in terms of the clustering accuracy.
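The alternation between updating hidden variables and maximizing the expected likelihood can be seen in a much simpler setting than the one in the abstract above. The sketch below is plain EM for a two-component one-dimensional Gaussian mixture; it deliberately omits the binary feature-saliency variable that the authors add, and the data, function name `em_two_gaussians`, and initialisation are hypothetical.

```python
import math

def em_two_gaussians(xs, iters=100):
    """Plain EM for a 1-D mixture of two Gaussians (no feature saliency)."""
    mu = [min(xs), max(xs)]   # assumed init: means at the data extremes
    var = [1.0, 1.0]
    weight = [0.5, 0.5]
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in xs:
            p = [
                weight[k]
                * math.exp(-((x - mu[k]) ** 2) / (2 * var[k]))
                / math.sqrt(2 * math.pi * var[k])
                for k in range(2)
            ]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate parameters from the soft assignments.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            spread = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            var[k] = max(spread, 1e-6)  # floor to avoid a collapsed component
            weight[k] = nk / len(xs)
    hard = [0 if r[0] >= r[1] else 1 for r in resp]
    return weight, mu, var, hard

# Hypothetical scalar alert feature with two well-separated groups.
data = [0.0, 0.5, 1.0, 1.5, 2.0, 8.0, 8.5, 9.0, 9.5, 10.0]
weight, mu, var, hard = em_two_gaussians(data)
```

In the joint scheme described in the abstract, the saliency of each feature would be an additional hidden variable updated in the E-step; that extension is not shown here.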
Kebir, Sied; Khurshid, Zain; Gaertner, Florian C.; Essler, Markus; Hattingen, Elke; Fimmers, Rolf; Scheffler, Björn; Herrlinger, Ulrich; Bundschuh, Ralph A.; Glas, Martin
2017-01-01
Rationale Timely detection of pseudoprogression (PSP) is crucial for the management of patients with high-grade glioma (HGG) but remains difficult. Textural features of O-(2-[18F]fluoroethyl)-L-tyrosine positron emission tomography (FET-PET) mirror tumor uptake heterogeneity; some of them may be associated with tumor progression. Methods Fourteen patients with HGG and suspected of PSP underwent FET-PET imaging. A set of 19 conventional and textural FET-PET features were evaluated and subjected to unsupervised consensus clustering. The final diagnosis of true progression vs. PSP was based on follow-up MRI using RANO criteria. Results Three robust clusters have been identified based on 10 predominantly textural FET-PET features. None of the patients with PSP fell into cluster 2, which was associated with high values for textural FET-PET markers of uptake heterogeneity. Three out of 4 patients with PSP were assigned to cluster 3 that was largely associated with low values of textural FET-PET features. By comparison, tumor-to-normal brain ratio (TNRmax) at the optimal cutoff 2.1 was less predictive of PSP (negative predictive value 57% for detecting true progression, p=0.07 vs. 75% with cluster 3, p=0.04). Principal Conclusions Clustering based on textural O-(2-[18F]fluoroethyl)-L-tyrosine PET features may provide valuable information in assessing the elusive phenomenon of pseudoprogression. PMID:28030820
Enhanced HMAX model with feedforward feature learning for multiclass categorization.
Li, Yinlin; Wu, Wei; Zhang, Bo; Li, Fengfu
2015-01-01
In recent years, the interdisciplinary research between neuroscience and computer vision has promoted the development in both fields. Many biologically inspired visual models are proposed, and among them, the Hierarchical Max-pooling model (HMAX) is a feedforward model mimicking the structures and functions of V1 to posterior inferotemporal (PIT) layer of the primate visual cortex, which could generate a series of position- and scale-invariant features. However, it could be improved with attention modulation and memory processing, which are two important properties of the primate visual cortex. Thus, in this paper, based on recent biological research on the primate visual cortex, we still mimic the first 100-150 ms of visual cognition to enhance the HMAX model, which mainly focuses on the unsupervised feedforward feature learning process. The main modifications are as follows: (1) To mimic the attention modulation mechanism of V1 layer, a bottom-up saliency map is computed in the S1 layer of the HMAX model, which can support the initial feature extraction for memory processing; (2) To mimic the learning, clustering and short-term memory to long-term memory conversion abilities of V2 and IT, an unsupervised iterative clustering method is used to learn clusters with multiscale middle level patches, which are taken as long-term memory; (3) Inspired by the multiple feature encoding mode of the primate visual cortex, information including color, orientation, and spatial position are encoded in different layers of the HMAX model progressively. By adding a softmax layer at the top of the model, multiclass categorization experiments can be conducted, and the results on Caltech101 show that the enhanced model with a smaller memory size exhibits higher accuracy than the original HMAX model, and could also achieve better accuracy than other unsupervised feature learning methods in multiclass categorization task.
Cluster Analysis Identifies 3 Phenotypes within Allergic Asthma.
Sendín-Hernández, María Paz; Ávila-Zarza, Carmelo; Sanz, Catalina; García-Sánchez, Asunción; Marcos-Vadillo, Elena; Muñoz-Bellido, Francisco J; Laffond, Elena; Domingo, Christian; Isidoro-García, María; Dávila, Ignacio
Asthma is a heterogeneous chronic disease with different clinical expressions and responses to treatment. In recent years, several unbiased approaches based on clinical, physiological, and molecular features have described several phenotypes of asthma. Some phenotypes are allergic, but little is known about whether these phenotypes can be further subdivided. We aimed to phenotype patients with allergic asthma using an unbiased approach based on multivariate classification techniques (unsupervised hierarchical cluster analysis). From a total of 54 variables of 225 patients with well-characterized allergic asthma diagnosed following American Thoracic Society (ATS) recommendation, positive skin prick test to aeroallergens, and concordant symptoms, we finally selected 19 variables by multiple correspondence analyses. Then a cluster analysis was performed. Three groups were identified. Cluster 1 was constituted by patients with intermittent or mild persistent asthma, without family antecedents of atopy, asthma, or rhinitis. This group showed the lowest total IgE levels. Cluster 2 was constituted by patients with mild asthma with a family history of atopy, asthma, or rhinitis. Total IgE levels were intermediate. Cluster 3 included patients with moderate or severe persistent asthma that needed treatment with corticosteroids and long-acting β-agonists. This group showed the highest total IgE levels. We identified 3 phenotypes of allergic asthma in our population. Furthermore, we described 2 phenotypes of mild atopic asthma mainly differentiated by a family history of allergy. Copyright © 2017 American Academy of Allergy, Asthma & Immunology. Published by Elsevier Inc. All rights reserved.
Hierarchical Kohonen net for anomaly detection in network security.
Sarasamma, Suseela T; Zhu, Qiuming A; Huff, Julie
2005-04-01
A novel multilevel hierarchical Kohonen Net (K-Map) for an intrusion detection system is presented. Each level of the hierarchical map is modeled as a simple winner-take-all K-Map. One significant advantage of this multilevel hierarchical K-Map is its computational efficiency. Unlike other statistical anomaly detection methods such as nearest neighbor approach, K-means clustering or probabilistic analysis that employ distance computation in the feature space to identify the outliers, our approach does not involve costly point-to-point computation in organizing the data into clusters. Another advantage is the reduced network size. We use the classification capability of the K-Map on selected dimensions of data set in detecting anomalies. Randomly selected subsets that contain both attacks and normal records from the KDD Cup 1999 benchmark data are used to train the hierarchical net. We use a confidence measure to label the clusters. Then we use the test set from the same KDD Cup 1999 benchmark to test the hierarchical net. We show that a hierarchical K-Map in which each layer operates on a small subset of the feature space is superior to a single-layer K-Map operating on the whole feature space in detecting a variety of attacks in terms of detection rate as well as false positive rate.
Effective traffic features selection algorithm for cyber-attacks samples
NASA Astrophysics Data System (ADS)
Li, Yihong; Liu, Fangzheng; Du, Zhenyu
2018-05-01
By studying defense schemes against network attacks, this paper proposes an effective traffic feature selection algorithm based on k-means++ clustering to deal with the high dimensionality of traffic features extracted from cyber-attack samples. First, the algorithm divides the original feature set into an attack traffic feature set and a background traffic feature set by clustering. Then, it calculates the variation in clustering performance after removing a given feature. Finally, it evaluates the degree of distinctiveness of the feature vector according to the result. The effective feature vectors are those whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select the effective features from the extracted original feature set. In this way, the dimensionality of the features can be reduced so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and has some advantages over other selection algorithms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kazin, Eyal A.; Blanton, Michael R.; Scoccimarro, Roman
2010-08-20
We analyze the line-of-sight baryonic acoustic feature in the two-point correlation function ξ of the Sloan Digital Sky Survey luminous red galaxy (LRG) sample (0.16 < z < 0.47). By defining a narrow line-of-sight region, r_p < 5.5 h^-1 Mpc, where r_p is the transverse separation component, we measure a strong excess of clustering at ~110 h^-1 Mpc, as previously reported in the literature. We also test these results in an alternative coordinate system, by defining the line of sight as θ < 3°, where θ is the opening angle. This clustering excess appears much stronger than the feature in the better-measured monopole. A fiducial ΛCDM nonlinear model in redshift space predicts a much weaker signature. We use realistic mock catalogs to model the expected signal and noise. We find that the line-of-sight measurements can be explained well by our mocks as well as by a featureless ξ = 0. We conclude that there is no convincing evidence that the strong clustering measurement is the line-of-sight baryonic acoustic feature. We also evaluate how detectable such a signal would be in the upcoming Baryon Oscillation Spectroscopic Survey (BOSS) LRG volume. Mock LRG catalogs (z < 0.6) suggest that (1) the narrow line-of-sight cylinder and cone defined above probably will not reveal a detectable acoustic feature in BOSS; (2) a clustering measurement as high as that in the current sample can be ruled out (or confirmed) at a high confidence level using a BOSS-sized data set; (3) an analysis with wider angular cuts, which provide better signal-to-noise ratios, can nevertheless be used to compare line-of-sight and transverse distances, and thereby constrain the expansion rate H(z) and diameter distance D_A(z).
Zhu, Bin; Liu, Jinlin; Fu, Yang; Zhang, Bo; Mao, Ying
2018-04-02
Viral hepatitis, as one of the most serious notifiable infectious diseases in China, takes a heavy toll on the infected and causes a severe economic burden to society, yet few studies have systematically explored the spatio-temporal epidemiology of viral hepatitis in China. This study aims to explore, visualize and compare the epidemiologic trends and spatial changing patterns of different types of viral hepatitis (A, B, C, E and unspecified, based on the classification of CDC) at the provincial level in China. The growth rates of incidence are converted to box plots to visualize the epidemiologic trends, with the linear trend being tested by the chi-square linear-by-linear association test. Two complementary spatial cluster methods are used to explore the overall agglomeration level and identify spatial clusters: spatial autocorrelation analysis (measured by global and local Moran's I) and space-time scan analysis. Based on the spatial autocorrelation analysis, the hotspots of hepatitis A remained relatively stable and gradually shrank, with Yunnan and Sichuan successively moving out of the high-high (HH) cluster area. The HH clustering feature of hepatitis B in China gradually disappeared with time. However, the HH cluster area of hepatitis C has gradually moved towards the west, while for hepatitis E, the provincial units around the Yangtze River Delta region have shown HH cluster features since 2005. The space-time scan analysis also indicates the distinct spatial changing patterns of different types of viral hepatitis in China. We conclude that there is no one-size-fits-all plan for the prevention and control of viral hepatitis in all the provincial units. An effective response requires a package of coordinated actions, which should vary across localities according to the spatial-temporal epidemic dynamics of each type of virus and the specific conditions of each provincial unit.
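The global Moran's I statistic mentioned in the abstract above can be illustrated with a short sketch. This is the textbook formula, I = (n / W) * sum_ij w_ij (x_i - m)(x_j - m) / sum_i (x_i - m)^2, applied to a hypothetical chain of four regions with binary contiguity weights; the region layout and the incidence values are invented for illustration and are not taken from the study.

```python
def global_morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weight matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in weights)
    # Weighted cross-products of deviations over all ordered pairs (i, j).
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * cross / variance_sum


# Hypothetical example: four regions on a line; neighbors share a border.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = [10.0, 9.0, 2.0, 1.0]     # similar values sit next to each other
alternating = [10.0, 1.0, 10.0, 1.0]  # neighbors are always dissimilar

i_clustered = global_morans_i(clustered, chain)
i_alternating = global_morans_i(alternating, chain)
print(i_clustered, i_alternating)
```

A positive value indicates that similar values are spatially adjacent (the high-high and low-low patterns discussed in the abstract), while a negative value indicates a checkerboard-like arrangement.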
Neuropsychological assessment of decision making in alcohol-dependent commercial pilots.
Georgemiller, Randy; Machizawa, Sayaka; Young, Kathleen M; Martin, Cynthia N
2013-09-01
The aim of this exploratory archival study was to discern the utility of the Iowa Gambling Task (IGT) in identifying adaptive decision-making capacities among pilots with a history of alcohol dependence both with and without Cluster B personality features. Participants included 18 male airmen at the rank of captain with a history of receiving alcohol dependence treatment and subsequent referral for a fitness-for-duty evaluation. Data from prior comprehensive neuropsychological evaluations conducted in a private practice setting at the mandate of the FAA utilizing criteria outlined in the HIMS program were used. ANOVA was conducted to compare pilots with (N = 4) and without Cluster B personality features (N = 14) on measures of decision-making capacities, intelligence, and executive functioning. Pilots with Cluster B personality features were found to have a significantly lower Total Net T-Score on the IGT (M = 35.00, SD = 9.27) than pilots without features of Cluster B (M = 56.36, SD = 9.55). Furthermore, with the exception of the first 20 cards (i.e., Net 1), the groups significantly differed in their Net scores. No statistically significant difference was found in airmen's intelligence and executive functioning. The present study found that alcohol-dependent airmen with Cluster B personality features evidenced significantly poorer decision-making capacities as measured by the IGT in comparison to alcohol-dependent airmen without Cluster B personality features. Implications and limitations of the study are discussed.
NASA Astrophysics Data System (ADS)
Cannata, A.; Montalto, P.; Aliotta, M.; Cassisi, C.; Pulvirenti, A.; Privitera, E.; Patanè, D.
2011-04-01
Active volcanoes generate sonic and infrasonic signals, whose investigation provides useful information for both monitoring purposes and the study of the dynamics of explosive phenomena. At Mt. Etna volcano (Italy), a pattern recognition system based on infrasonic waveform features has been developed. First, using a parametric power spectrum method, the features describing and characterizing the infrasound events were extracted: peak frequency and quality factor. Then, together with the peak-to-peak amplitude, these features constituted a 3-D ‘feature space’, in which three clusters were recognized using the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm. After the clustering process, by using a common location method (semblance method) and additional volcanological information concerning the intensity of the explosive activity, we were able to associate each cluster with a particular source vent and/or a kind of volcanic activity. Finally, for automatic event location, clusters were used to train a model based on a Support Vector Machine, calculating optimal hyperplanes able to maximize the margins of separation among the clusters. After the training phase, this system automatically recognizes the active vent with no location algorithm and by using only a single station.
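To make the clustering step in the abstract above concrete, here is a minimal pure-Python sketch of DBSCAN applied to points in a 3-D feature space. The point values, `eps`, and `min_pts` are hypothetical; the abstract does not report the parameters or any feature scaling used, and the later Support Vector Machine stage is not shown.

```python
import math

NOISE = -1


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)

    def neighbors(i):
        # Brute-force region query; a point counts as its own neighbor.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Hypothetical, already-normalized (peak frequency, quality factor,
# peak-to-peak amplitude) triples: two dense groups and one outlier.
events = [
    (1.0, 1.0, 1.0), (1.1, 1.0, 0.9), (0.9, 1.1, 1.0),
    (5.0, 5.0, 5.0), (5.1, 4.9, 5.0), (4.9, 5.0, 5.1),
    (9.0, 0.0, 3.0),
]
event_labels = dbscan(events, eps=0.5, min_pts=3)
print(event_labels)
```

DBSCAN needs no preset number of clusters and leaves isolated events unlabeled, which fits a setting where the number of active vents is not known in advance.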
Hadjithomas, Michalis; Chen, I-Min A.; Chu, Ken; ...
2016-11-29
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery.
NASA Astrophysics Data System (ADS)
Sirakov, Nikolay M.; Suh, Sang; Attardo, Salvatore
2011-06-01
This paper presents a further step in research toward the development of a quick and accurate weapons identification methodology and system. A basic stage of this methodology is the automatic acquisition and updating of a weapons ontology as a source for deriving high-level weapons information. The present paper outlines the main ideas used to approach the goal. In the next stage, a clustering approach is suggested on the basis of a hierarchy of concepts. An inherent slot of every node of the proposed ontology is a low-level features vector (LLFV), which facilitates the search through the ontology. Part of the LLFV is the information about the object's parts. To partition an object, a new approach is presented that is capable of defining the object's concavities, which are used to mark the end points of weapon parts, considered as convexities. Further, an existing matching approach is optimized to determine whether an ontological object matches the objects from an input image. Objects from derived ontological clusters will be considered for the matching process. Image resizing is studied and applied to decrease the runtime of the matching approach and to investigate its rotational and scaling invariance. A set of experiments is performed to validate the theoretical concepts.
Schizophrenia classification using functional network features
NASA Astrophysics Data System (ADS)
Rish, Irina; Cecchi, Guillermo A.; Heuton, Kyle
2012-03-01
This paper focuses on discovering statistical biomarkers (features) that are predictive of schizophrenia, with a particular focus on topological properties of fMRI functional networks. We consider several network properties, such as node (voxel) strength, clustering coefficients, and local efficiency, as well as just a subset of pairwise correlations. While all types of features demonstrate highly significant statistical differences in several brain areas, and close to 80% classification accuracy, the most remarkable results of 93% accuracy are achieved by using a small subset of only a dozen of the most informative (lowest p-value) correlation features. Our results suggest that voxel-level correlations and functional network features derived from them are highly informative about schizophrenia and can be used as statistical biomarkers for the disease.
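Two of the network properties named in the abstract above, node strength and the local clustering coefficient, can be sketched from a correlation matrix as follows. The abstract does not say how the functional network was built; taking absolute correlations as edge weights and binarizing at a fixed threshold are assumptions made here for illustration, and the 4-node matrix is invented.

```python
def node_strength(corr):
    """Sum of absolute correlations between each node and all other nodes."""
    n = len(corr)
    return [sum(abs(corr[i][j]) for j in range(n) if j != i)
            for i in range(n)]


def clustering_coefficients(corr, threshold):
    """Local clustering coefficient of each node in the thresholded graph."""
    n = len(corr)
    # Assumed rule: an edge exists where |correlation| >= threshold.
    adj = [{j for j in range(n) if j != i and abs(corr[i][j]) >= threshold}
           for i in range(n)]
    coeffs = []
    for i in range(n):
        nbrs = sorted(adj[i])
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # fewer than two neighbors: no triangles
            continue
        # Count edges that exist among the neighbors of node i.
        links = sum(1 for a in range(k) for b in range(a + 1, k)
                    if nbrs[b] in adj[nbrs[a]])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return coeffs


# Hypothetical correlation matrix: nodes 0-2 form a tight triangle and
# node 3 is attached only to node 0.
corr = [
    [1.0, 0.8, 0.8, 0.6],
    [0.8, 1.0, 0.8, 0.1],
    [0.8, 0.8, 1.0, 0.1],
    [0.6, 0.1, 0.1, 1.0],
]
strengths = node_strength(corr)
coefficients = clustering_coefficients(corr, threshold=0.5)
print(strengths, coefficients)
```

Per-node values like these could then serve as candidate features for a classifier, alongside the raw pairwise correlations.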
Subgraph augmented non-negative tensor factorization (SANTF) for modeling clinical narrative text
Xin, Yu; Hochberg, Ephraim; Joshi, Rohit; Uzuner, Ozlem; Szolovits, Peter
2015-01-01
Objective: Extracting medical knowledge from electronic medical records requires automated approaches to combat scalability limitations and selection biases. However, existing machine learning approaches are often regarded by clinicians as black boxes. Moreover, training data for these automated approaches are often sparsely annotated at best. The authors target unsupervised learning for modeling clinical narrative text, aiming at improving both accuracy and interpretability. Methods: The authors introduce a novel framework named subgraph augmented non-negative tensor factorization (SANTF). In addition to relying on atomic features (e.g., words in clinical narrative text), SANTF automatically mines higher-order features (e.g., relations of lymphoid cells expressing antigens) from clinical narrative text by converting sentences into a graph representation and identifying important subgraphs. The authors compose a tensor using patients, higher-order features, and atomic features as its respective modes. They then apply non-negative tensor factorization to cluster patients, and simultaneously identify latent groups of higher-order features that link to patient clusters, as in clinical guidelines where a panel of immunophenotypic features and laboratory results are used to specify diagnostic criteria. Results and Conclusion: SANTF demonstrated over 10% improvement in averaged F-measure on patient clustering compared to widely used non-negative matrix factorization (NMF) and k-means clustering methods. Multiple baselines were established by modeling patient data using patient-by-features matrices with different feature configurations and then performing NMF or k-means to cluster patients. Feature analysis identified latent groups of higher-order features that lead to medical insights. The authors also found that the latent groups of atomic features help to better correlate the latent groups of higher-order features. PMID:25862765
Graph-based Data Modeling and Analysis for Data Fusion in Remote Sensing
NASA Astrophysics Data System (ADS)
Fan, Lei
Hyperspectral imaging provides the capability of increased sensitivity and discrimination over traditional imaging methods by combining standard digital imaging with spectroscopic methods. For each individual pixel in a hyperspectral image (HSI), a continuous spectrum is sampled as the spectral reflectance/radiance signature to facilitate identification of ground cover and surface material. The abundant spectral knowledge allows all available information from the data to be mined. The superior qualities of hyperspectral imaging allow wide applications such as mineral exploration, agriculture monitoring, and ecological surveillance. The processing of massive high-dimensional HSI datasets is a challenge since many data processing techniques have a computational complexity that grows exponentially with the dimension. Besides, an HSI dataset may contain a limited number of degrees of freedom due to the high correlations between data points and among the spectra. On the other hand, merely taking advantage of the sampled spectrum of an individual HSI data point may produce inaccurate results due to the mixed nature of raw HSI data, such as mixed pixels and optical interferences. Fusion strategies are widely adopted in data processing to achieve better performance, especially in the field of classification and clustering. There are mainly three types of fusion strategies, namely low-level data fusion, intermediate-level feature fusion, and high-level decision fusion. Low-level data fusion combines multi-source data that is expected to be complementary or cooperative. Intermediate-level feature fusion aims at selection and combination of features to remove redundant information. Decision-level fusion exploits a set of classifiers to provide more accurate results. The fusion strategies have wide applications including HSI data processing. With the fast development of multiple remote sensing modalities, e.g. Very High Resolution (VHR) optical sensors, LiDAR, etc., fusion of multi-source data can in principle produce more detailed information than each single source. On the other hand, besides the abundant spectral information contained in HSI data, features such as texture and shape may be employed to represent data points from a spatial perspective. Furthermore, feature fusion also includes the strategy of removing redundant and noisy features in the dataset. One of the major problems in machine learning and pattern recognition is to develop appropriate representations for complex nonlinear data. In HSI processing, a particular data point is usually described as a vector with coordinates corresponding to the intensities measured in the spectral bands. This vector representation permits the application of linear and nonlinear transformations with linear algebra to find an alternative representation of the data. More generally, HSI is multi-dimensional in nature and the vector representation may lose the contextual correlations. Tensor representation provides a more sophisticated modeling technique and a higher-order generalization to linear subspace analysis. In graph theory, data points can be generalized as nodes with connectivities measured from the proximity of a local neighborhood. The graph-based framework efficiently characterizes the relationships among the data and allows for convenient mathematical manipulation in many applications, such as data clustering, feature extraction, feature selection and data alignment. In this thesis, graph-based approaches applied to multi-source feature and data fusion in remote sensing are explored. We will mainly investigate the fusion of spatial, spectral and LiDAR information with linear and multilinear algebra under a graph-based framework for data clustering and classification problems.
A flexible data-driven comorbidity feature extraction framework.
Sideris, Costas; Pourhomayoun, Mohammad; Kalantarian, Haik; Sarrafzadeh, Majid
2016-06-01
Disease and symptom diagnostic codes are a valuable resource for classifying and predicting patient outcomes. In this paper, we propose a novel methodology for utilizing disease diagnostic information in a predictive machine learning framework. Our methodology relies on a novel, clustering-based feature extraction framework using disease diagnostic information. To reduce the data dimensionality, we identify disease clusters using co-occurrence statistics. We optimize the number of generated clusters in the training set and then utilize these clusters as features to predict patient severity of condition and patient readmission risk. We build our clustering and feature extraction algorithm using the 2012 National Inpatient Sample (NIS), Healthcare Cost and Utilization Project (HCUP) which contains 7 million hospital discharge records and ICD-9-CM codes. The proposed framework is tested on Ronald Reagan UCLA Medical Center Electronic Health Records (EHR) from 3041 Congestive Heart Failure (CHF) patients and the UCI 130-US diabetes dataset that includes admissions from 69,980 diabetic patients. We compare our cluster-based feature set with the commonly used comorbidity frameworks including Charlson's index, Elixhauser's comorbidities and their variations. The proposed approach was shown to have significant gains between 10.7-22.1% in predictive accuracy for CHF severity of condition prediction and 4.65-5.75% in diabetes readmission prediction. Copyright © 2016 Elsevier Ltd. All rights reserved.
Features of asthma which provide meaningful insights for understanding the disease heterogeneity.
Deliu, M; Yavuz, T S; Sperrin, M; Belgrave, D; Sahiner, U M; Sackesen, C; Kalayci, O; Custovic, A
2018-01-01
Data-driven methods such as hierarchical clustering (HC) and principal component analysis (PCA) have been used to identify asthma subtypes, with inconsistent results. We aimed to develop a framework for the discovery of stable and clinically meaningful asthma subtypes. We performed HC in a rich data set from 613 asthmatic children, using 45 clinical variables (Model 1), and after PCA dimensionality reduction (Model 2). Clinical experts then identified a set of asthma features/domains which informed clusters in the two analyses. In Model 3, we reclustered the data using these features to ascertain whether this improved the discovery process. Cluster stability was poor in Models 1 and 2. Clinical experts highlighted four asthma features/domains which differentiated the clusters in two models: age of onset, allergic sensitization, severity, and recent exacerbations. In Model 3 (HC using these four features), cluster stability improved substantially. The cluster assignment changed, providing more clinically interpretable results. In a 5-cluster model, we labelled the clusters as: "Difficult asthma" (n = 132); "Early-onset mild atopic" (n = 210); "Early-onset mild non-atopic" (n = 153); "Late-onset" (n = 105); and "Exacerbation-prone asthma" (n = 13). Multinomial regression demonstrated that lung function was significantly diminished among children with "Difficult asthma"; blood eosinophilia was a significant feature of "Difficult," "Early-onset mild atopic," and "Late-onset asthma." Children with moderate-to-severe asthma were present in each cluster. An integrative approach of blending the data with clinical expert domain knowledge identified four features, which may be informative for ascertaining asthma endotypes. These findings suggest that variables which are key determinants of asthma presence, severity, or control may not be the most informative for determining asthma subtypes. 
Our results indicate that exacerbation-prone asthma may be a separate asthma endotype and that severe asthma is not a single entity, but an extreme end of the spectrum of several different asthma endotypes. © 2017 The Authors. Clinical & Experimental Allergy published by John Wiley & Sons Ltd.
Multi scales based sparse matrix spectral clustering image segmentation
NASA Astrophysics Data System (ADS)
Liu, Zhongmin; Chen, Zhicai; Li, Zhanming; Hu, Wenjin
2018-04-01
In image segmentation, spectral clustering algorithms have to adopt an appropriate scaling parameter to calculate the similarity matrix between the pixels, which may have a great impact on the clustering result. Moreover, when the number of data instances is large, the computational complexity and memory use of the algorithm greatly increase. To solve these two problems, we propose a new spectral clustering image segmentation algorithm based on multiple scales and a sparse matrix. We first devise a new feature extraction method, then extract the features of the image at different scales, and finally use the feature information to construct a sparse similarity matrix, which improves the operation efficiency. Image segmentation experiments show that our algorithm has better accuracy and robustness than the traditional spectral clustering algorithm.
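The abstract above does not specify how its sparse similarity matrix is built. One common way to sparsify, shown here purely as an assumption, is to compute a Gaussian similarity with scaling parameter `sigma` and keep only each point's `k` most similar neighbors. The function name and the 1-D feature values are hypothetical.

```python
import math


def sparse_similarity(features, sigma, k):
    """Gaussian similarities kept only for each point's k nearest neighbors.

    Returns a dict {(i, j): similarity}; absent pairs are treated as zero.
    """
    n = len(features)
    sparse = {}
    for i in range(n):
        sims = [
            (math.exp(-math.dist(features[i], features[j]) ** 2
                      / (2 * sigma ** 2)), j)
            for j in range(n) if j != i
        ]
        # Keep the k largest similarities and store them symmetrically.
        for s, j in sorted(sims, reverse=True)[:k]:
            sparse[(i, j)] = s
            sparse[(j, i)] = s
    return sparse


# Hypothetical 1-D pixel features forming two well-separated groups.
pixels = [(0.0,), (0.1,), (5.0,), (5.1,)]
similarity = sparse_similarity(pixels, sigma=1.0, k=1)
print(sorted(similarity))
```

With n points, a dense similarity matrix stores n * n entries, whereas this representation stores at most 2 * k * n, which is where the memory saving comes from.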
Analysis of ground-motion simulation big data
NASA Astrophysics Data System (ADS)
Maeda, T.; Fujiwara, H.
2016-12-01
We developed a parallel distributed processing system which applies a big data analysis to the large-scale ground motion simulation data. The system uses ground-motion index values and earthquake scenario parameters as input. We used peak ground velocity value and velocity response spectra as the ground-motion index. The ground-motion index values are calculated from our simulation data. We used simulated long-period ground motion waveforms at about 80,000 meshes calculated by a three dimensional finite difference method based on 369 earthquake scenarios of a great earthquake in the Nankai Trough. These scenarios were constructed by considering the uncertainty of source model parameters such as source area, rupture starting point, asperity location, rupture velocity, fmax and slip function. We used these parameters as the earthquake scenario parameter. The system firstly carries out the clustering of the earthquake scenario in each mesh by the k-means method. The number of clusters is determined in advance using a hierarchical clustering by the Ward's method. The scenario clustering results are converted to the 1-D feature vector. The dimension of the feature vector is the number of scenario combination. If two scenarios belong to the same cluster the component of the feature vector is 1, and otherwise the component is 0. The feature vector shows a `response' of mesh to the assumed earthquake scenario group. Next, the system performs the clustering of the mesh by k-means method using the feature vector of each mesh previously obtained. Here the number of clusters is arbitrarily given. The clustering of scenarios and meshes are performed by parallel distributed processing with Hadoop and Spark, respectively. In this study, we divided the meshes into 20 clusters. The meshes in each cluster are geometrically concentrated. Thus this system can extract regions, in which the meshes have similar `response', as clusters. 
For each cluster, it is possible to determine particular scenario parameters which characterize the cluster. In other words, by utilizing this system, we can objectively obtain the critical scenario parameters of the ground-motion simulation for each evaluation point. This research was supported by CREST, JST.
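The binary feature vector described in the abstract above can be sketched as follows, reading "scenario combination" as an unordered pair of scenarios (an interpretation, not something the abstract states explicitly). The function name and the four-scenario labels are hypothetical.

```python
from itertools import combinations
from math import comb


def comembership_vector(labels):
    """1 for each scenario pair that shares a cluster label, else 0."""
    return [1 if labels[a] == labels[b] else 0
            for a, b in combinations(range(len(labels)), 2)]


# Hypothetical cluster labels of four scenarios at one mesh.
vector = comembership_vector([0, 0, 1, 1])
# Renaming the clusters leaves the vector unchanged, so vectors from
# different meshes can be compared directly.
relabeled = comembership_vector([1, 1, 0, 0])
print(vector, relabeled)

# Under the pair interpretation, 369 scenarios give this many components.
dimension = comb(369, 2)
print(dimension)
```

Because the vector depends only on which scenarios are grouped together, not on the arbitrary cluster numbers k-means assigns, it is a natural input for the second, mesh-level clustering step.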
OCEANIDS: Autonomous Data Acquisition, Management and Distribution System
NASA Technical Reports Server (NTRS)
Bingham, Andrew; Rigor, Eric; Cervantes, Alex; Armstrong, Edward
2004-01-01
OCEANIDS is a clearinghouse for mission essential and near-real-time satellite data streams. This viewgraph presentation describes this mission, and includes the following topics: 1) OCEANIDS Motivation; 2) High-Level Architecture; 3) OCEANIDS Features; 4) OCEANIDS GUI: Nodes; 5) OCEANIDS GUI: Cluster; 6) Data Streams; 7) Statistics; and 8) GHRSST-PP.
NASA Astrophysics Data System (ADS)
Hortos, William S.
2008-04-01
Proposed distributed wavelet-based algorithms are a means to compress sensor data received at the nodes forming a wireless sensor network (WSN) by exchanging information between neighboring sensor nodes. Local collaboration among nodes compacts the measurements, yielding a reduced fused set with equivalent information at far fewer nodes. Nodes may be equipped with multiple sensor types, each capable of sensing distinct phenomena: thermal, humidity, chemical, voltage, or image signals with low or no frequency content as well as audio, seismic or video signals within defined frequency ranges. Compression of the multi-source data through wavelet-based methods, distributed at active nodes, reduces downstream processing and storage requirements along the paths to sink nodes; it also enables noise suppression and more energy-efficient query routing within the WSN. Targets are first detected by the multiple sensors; then wavelet compression and data fusion are applied to the target returns, followed by feature extraction from the reduced data; feature data are input to target recognition/classification routines; targets are tracked during their sojourns through the area monitored by the WSN. Algorithms to perform these tasks are implemented in a distributed manner, based on a partition of the WSN into clusters of nodes. In this work, a scheme of collaborative processing is applied for hierarchical data aggregation and decorrelation, based on the sensor data itself and any redundant information, enabled by a distributed, in-cluster wavelet transform with lifting that allows multiple levels of resolution. The wavelet-based compression algorithm significantly decreases RF bandwidth and other resource use in target processing tasks. Following wavelet compression, features are extracted. 
The objective of feature extraction is to maximize the probabilities of correct target classification based on multi-source sensor measurements, while minimizing the resource expenditures at participating nodes. Therefore, the feature-extraction method based on the Haar DWT is presented that employs a maximum-entropy measure to determine significant wavelet coefficients. Features are formed by calculating the energy of coefficients grouped around the competing clusters. A DWT-based feature extraction algorithm used for vehicle classification in WSNs can be enhanced by an added rule for selecting the optimal number of resolution levels to improve the correct classification rate and reduce energy consumption expended in local algorithm computations. Published field trial data for vehicular ground targets, measured with multiple sensor types, are used to evaluate the wavelet-assisted algorithms. Extracted features are used in established target recognition routines, e.g., the Bayesian minimum-error-rate classifier, to compare the effects on the classification performance of the wavelet compression. Simulations of feature sets and recognition routines at different resolution levels in target scenarios indicate the impact on classification rates, while formulas are provided to estimate reduction in resource use due to distributed compression.
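The Haar-DWT energy features described above can be illustrated with a minimal sketch. It assumes a one-dimensional signal whose length is divisible by 2 raised to the number of levels, and it uses the plain per-level energy of the detail coefficients as the feature; the maximum-entropy coefficient selection and the distributed, in-cluster lifting scheme from the abstract are not reproduced. The function names are hypothetical.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_energy_features(signal, levels):
    """Energy of the detail coefficients per level, then the final approximation energy."""
    features = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in approx))
    return features

# Hypothetical sensor trace of length 8, decomposed over 3 resolution levels.
sig = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
feats = haar_energy_features(sig, levels=3)
```

Because the transform is orthonormal, the feature energies add up to the energy of the original signal, which gives a quick sanity check when choosing the number of resolution levels.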
Performance of Multiplexed XY Resistive Micromegas detectors in a high intensity beam
NASA Astrophysics Data System (ADS)
Banerjee, D.; Burtsev, V.; Chumakov, A.; Cooke, D.; Depero, E.; Dermenev, A. V.; Donskov, S. V.; Dubinin, F.; Dusaev, R. R.; Emmenegger, S.; Fabich, A.; Frolov, V. N.; Gardikiotis, A.; Gninenko, S. N.; Hösgen, M.; Karneyeu, A. E.; Ketzer, B.; Kirsanov, M. M.; Konorov, I. V.; Kramarenko, V. A.; Kuleshov, S. V.; Levchenko, E.; Lyubovitskij, V. E.; Lysan, V.; Mamon, S.; Matveev, V. A.; Mikhailov, Yu. V.; Myalkovskiy, V. V.; Peshekhonov, V. D.; Peshekhonov, D. V.; Polyakov, V. A.; Radics, B.; Rubbia, A.; Samoylenko, V. D.; Tikhomirov, V. O.; Tlisov, D. A.; Toropin, A. N.; Vasilishin, B.; Arenas, G. Vasquez; Ulloa, P.; Crivelli, P.
2018-02-01
We present the performance of multiplexed XY resistive Micromegas detectors tested in the CERN SPS 100 GeV/c electron beam at intensities up to 3.3 × 10⁵ e⁻/(s·cm²). So far, all studies with multiplexed Micromegas have only been reported for tests with radioactive sources and cosmic rays. The use of multiplexed modules in high intensity environments was not explored due to the effect of ambiguities in the reconstruction of the hit point caused by the multiplexing feature. For the specific mapping and beam intensities analyzed in this work with a multiplexing factor of five, more than 50% level of ambiguity is introduced due to particle pile-up as well as fake clusters due to the mapping feature. Our results prove that by using the additional information of cluster size and integrated charge from the signal clusters induced on the XY strips, the ambiguities can be reduced to a level below 2%. The tested detectors are used in the CERN NA64 experiment for tracking the incoming particles bending in a magnetic field in order to reconstruct their momentum. The average hit detection efficiency of each module was found to be ∼96% at the highest beam intensities. By using four modules a tracking resolution of 1.1% was obtained with ∼85% combined tracking efficiency.
The Analysis of Surface EMG Signals with the Wavelet-Based Correlation Dimension Method
Zhang, Yanyan; Wang, Jue
2014-01-01
Many attempts have been made to effectively improve a prosthetic system controlled by the classification of surface electromyographic (SEMG) signals. Recently, the development of methodologies to extract the effective features still remains a primary challenge. Previous studies have demonstrated that the SEMG signals have nonlinear characteristics. In this study, by combining the nonlinear time series analysis and the time-frequency domain methods, we proposed the wavelet-based correlation dimension method to extract the effective features of SEMG signals. The SEMG signals were firstly analyzed by the wavelet transform and the correlation dimension was calculated to obtain the features of the SEMG signals. Then, these features were used as the input vectors of a Gustafson-Kessel clustering classifier to discriminate four types of forearm movements. Our results showed that there are four separate clusters corresponding to different forearm movements at the third resolution level and the resulting classification accuracy was 100%, when two channels of SEMG signals were used. This indicates that the proposed approach can provide important insight into the nonlinear characteristics and the time-frequency domain features of SEMG signals and is suitable for classifying different types of forearm movements. By comparing with other existing methods, the proposed method exhibited more robustness and higher classification accuracy. PMID:24868240
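As a rough illustration of the correlation dimension used as a feature above, the following sketch applies the Grassberger-Procaccia correlation sum to a delay-embedded series. It is an assumption-laden simplification: the wavelet decomposition step is omitted, the embedding dimension and lag are arbitrary, and the slope is taken between only two radii rather than fitted over a scaling region. All names are hypothetical.

```python
import math

def delay_embed(series, dim, lag):
    """Time-delay embedding: each point is (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n = len(series) - (dim - 1) * lag
    return [tuple(series[i + k * lag] for k in range(dim)) for i in range(n)]

def correlation_sum(points, r):
    """Fraction of distinct point pairs that are closer than r."""
    n = len(points)
    close = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    )
    return 2.0 * close / (n * (n - 1))

def correlation_dimension(points, r_small, r_large):
    """Crude two-radius slope of log C(r) against log r."""
    c_small = correlation_sum(points, r_small)
    c_large = correlation_sum(points, r_large)
    return math.log(c_large / c_small) / math.log(r_large / r_small)

# A linear ramp embeds onto a straight line, so the estimate should be close to 1.
ramp = [0.005 * i for i in range(200)]
points = delay_embed(ramp, dim=2, lag=1)
dimension = correlation_dimension(points, r_small=0.05, r_large=0.1)
```

For a real SEMG channel the same functions would be applied to each wavelet sub-band, and the radii would have to be chosen inside the range where log C(r) is approximately linear in log r.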
Sharp, Carla; Kalpakci, Allison; Mellick, William; Venta, Amanda; Temple, Jeff R
2015-03-01
At least two leading developmental models of borderline personality disorder (BPD) emphasize the role of accurate reflection and understanding of internal states as significant to the development of BPD features (Fonagy, Int J Psycho-Anal 72:639-656, 1991; Linehan, Cognitive-behavioral treatment of borderline personality disorder, 1993). The current study used the construct of experiential avoidance (EA) to operationalize avoidance of internal states and sought to examine (1) the concurrent relations between EA and borderline features in a large and diverse community sample; and (2) the prospective relation between EA and borderline features over a 1-year follow-up, controlling for baseline levels of borderline features. N = 881 adolescents recruited from public schools in a large metropolitan area participated in baseline assessments and N = 730 completed follow-up assessments. Two main findings were reported. First, EA was associated with borderline features, depressive, and anxiety symptoms at the bivariate level, but when all variables were considered together, depression and anxiety no longer remained significantly associated with borderline features, suggesting that the relations among these symptom clusters may be accounted for by EA as a cross-cutting underlying psychological process. Second, EA predicted levels of borderline symptoms at 1-year follow-up, controlling for baseline levels of borderline symptoms, and symptoms of anxiety and depression. Results are interpreted against the background of developmental theories of borderline personality disorder.
OPTICAL COLORS OF INTRACLUSTER LIGHT IN THE VIRGO CLUSTER CORE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rudick, Craig S.; Mihos, J. Christopher; Harding, Paul
2010-09-01
We continue our deep optical imaging survey of the Virgo cluster using the CWRU Burrell Schmidt telescope by presenting B-band surface photometry of the core of the Virgo cluster in order to study the cluster's intracluster light (ICL). We find ICL features down to μ_B ≈ 29 mag arcsec⁻², confirming the results of Mihos et al., who saw a vast web of low surface brightness streams, arcs, plumes, and diffuse light in the Virgo cluster core using V-band imaging. By combining these two data sets, we are able to measure the optical colors of many of the cluster's low surface brightness features. While much of our imaging area is contaminated by galactic cirrus, the cluster core near the cD galaxy, M87, is unobscured. We trace the color profile of M87 out to over 2000'', and find a blueing trend with radius, continuing out to the largest radii. Moreover, we have measured the colors of several ICL features which extend beyond M87's outermost reaches and find that they have similar colors to the M87's halo itself, B − V ≈ 0.8. The common colors of these features suggest that the extended outer envelopes of cD galaxies, such as M87, may be formed from similar streams, created by tidal interactions within the cluster, that have since dissolved into a smooth background in the cluster potential.
Gaul, C; Christmann, N; Schröder, D; Weber, R; Shanib, H; Diener, H C; Holle, D
2012-05-01
Data on clinical differences between episodic (eCH) and chronic cluster headache (cCH) and accompanying migraine features are limited. History and clinical features of 209 consecutive cluster headache patients (144 eCH, 65 cCH; male:female ratio 3.4 : 1) were obtained in a tertiary headache centre by face-to-face interviews. Relationship between occurrence of accompanying symptoms, pain intensity, comorbid migraine, and circannual and circadian rhythmicity was analysed. 99.5% of patients reported a minimum of one ipsilateral cranial autonomic symptom (CAS); 80% showed at least three CAS. A seasonal rhythmicity was observed in both eCH and cCH. A comorbid headache disorder occurred in 25%. No significant difference was detected between patients with comorbid migraine and without regarding occurrence of phonophobia, photophobia or nausea during cluster attacks. Patients with comorbid migraine reported allodynia significantly (p = 0.022) more often during cluster attacks than patients without comorbid migraine. Occurrence of CAS and attack frequency, as well as periodic patterns of attacks, are relatively uniform in eCH and cCH. Multiple CAS are not related to pain intensity. Allodynia during cluster attacks is a frequent symptom. The unexpectedly high rate of accompanying migrainous features during cluster attacks cannot be explained by comorbid migraine.
Hadjithomas, Michalis; Chen, I-Min A; Chu, Ken; Huang, Jinghua; Ratner, Anna; Palaniappan, Krishna; Andersen, Evan; Markowitz, Victor; Kyrpides, Nikos C; Ivanova, Natalia N
2017-01-04
Secondary metabolites produced by microbes have diverse biological functions, which makes them a great potential source of biotechnologically relevant compounds with antimicrobial, anti-cancer and other activities. The proteins needed to synthesize these natural products are often encoded by clusters of co-located genes called biosynthetic gene clusters (BCs). In order to advance the exploration of microbial secondary metabolism, we developed the largest publicly available database of experimentally verified and predicted BCs, the Integrated Microbial Genomes Atlas of Biosynthetic gene Clusters (IMG-ABC) (https://img.jgi.doe.gov/abc/). Here, we describe an update of IMG-ABC, which includes ClusterScout, a tool for targeted identification of custom biosynthetic gene clusters across 40 000 isolate microbial genomes, and a new search capability to query more than 700 000 BCs from isolate genomes for clusters with similar Pfam composition. Additional features enable fast exploration and analysis of BCs through two new interactive visualization features, a BC function heatmap and a BC similarity network graph. These new tools and features add to the value of IMG-ABC's vast body of BC data, facilitating their in-depth analysis and accelerating secondary metabolite discovery. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Clustering and pasta phases in nuclear density functional theory
Schuetrumpf, Bastian; Zhang, Chunli; Nazarewicz, Witold
2017-05-23
Nuclear density functional theory is the tool of choice in describing properties of complex nuclei and intricate phases of bulk nucleonic matter. It is a microscopic approach based on an energy density functional representing the nuclear interaction. An attractive feature of nuclear DFT is that it can be applied to both finite nuclei and pasta phases appearing in the inner crust of neutron stars. While nuclear pasta clusters in a neutron star can be easily characterized through their density distributions, the level of clustering of nucleons in a nucleus can often be difficult to assess. To this end, we use the concept of nucleon localization. We demonstrate that the localization measure provides us with fingerprints of clusters in light and heavy nuclei, including fissioning systems. Furthermore we investigate the rod-like pasta phase using twist-averaged boundary conditions, which enable calculations in finite volumes accessible by state of the art DFT solvers.
Detection and clustering of features in aerial images by neuron network-based algorithm
NASA Astrophysics Data System (ADS)
Vozenilek, Vit
2015-12-01
The paper presents an algorithm for detection and clustering of features in aerial photographs based on artificial neural networks. The presented approach is not focused on the detection of specific topographic features, but on the combination of general feature analysis and its use for clustering and backward projection of clusters onto the aerial image. The basis of the algorithm is a calculation of the total error of the network and a change of the network weights to minimize that error. A classic bipolar sigmoid was used as the activation function of the neurons and the basic method of backpropagation was used for learning. To verify that a set of features is able to represent the image content from the user's perspective, a web application was built (ASP.NET on the Microsoft .NET platform). The main achievements include the finding that man-made objects in aerial images can be successfully identified by detection of shapes and anomalies. It was also found that an appropriate combination of comprehensive features that describe the colors and selected shapes of individual areas can be useful for image analysis.
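The learning rule mentioned above (total network error, bipolar sigmoid activation, gradient-based weight change) can be sketched for the simplest possible case. This is only an assumed illustration: a single neuron trained with the delta rule, which is the one-layer special case of backpropagation, on a hypothetical bipolar AND problem. The paper's network architecture, features, and learning rate are not given in the abstract.

```python
import math

def bipolar_sigmoid(x):
    """Bipolar sigmoid with outputs in (-1, 1)."""
    return 2.0 / (1.0 + math.exp(-x)) - 1.0

def bipolar_sigmoid_slope(y):
    """Derivative of the bipolar sigmoid, written in terms of its output y."""
    return 0.5 * (1.0 - y * y)

def forward(weights, bias, inputs):
    return bipolar_sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)

def total_error(samples, weights, bias):
    """Half the summed squared error over all training samples."""
    return 0.5 * sum((t - forward(weights, bias, x)) ** 2 for x, t in samples)

def train(samples, lr=0.5, epochs=200):
    """Online gradient descent on the squared error of one neuron."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            out = forward(weights, bias, inputs)
            delta = (target - out) * bipolar_sigmoid_slope(out)
            weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
            bias += lr * delta
    return weights, bias

# Hypothetical training set: logical AND with bipolar (-1 / +1) coding.
samples = [
    ((-1.0, -1.0), -1.0),
    ((-1.0, 1.0), -1.0),
    ((1.0, -1.0), -1.0),
    ((1.0, 1.0), 1.0),
]
error_before = total_error(samples, [0.0, 0.0], 0.0)
weights, bias = train(samples)
error_after = total_error(samples, weights, bias)
```

In a multi-layer network the same delta term is propagated backwards through the hidden layers; the single-neuron version is shown only because it keeps the error and weight-update steps visible.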
Comparative study of feature selection with ensemble learning using SOM variants
NASA Astrophysics Data System (ADS)
Filali, Ameni; Jlassi, Chiraz; Arous, Najet
2017-03-01
Ensemble learning has succeeded in improving stability and clustering accuracy, but its runtime prevents it from scaling up to real-world applications. This study deals with the problem of selecting a subset of the most pertinent features for every cluster from a dataset. The proposed method is an extension of the Random Forests approach to unlabeled data, using self-organizing map (SOM) variants, that estimates the out-of-bag feature importance from a set of partitions. Every partition is created using a different bootstrap sample and a random subset of the features. We then show that the internal estimates used to measure variable pertinence in Random Forests are also applicable to feature selection in unsupervised learning. This approach aims at dimensionality reduction, visualization and cluster characterization at the same time. We provide empirical results on nineteen benchmark data sets indicating that RFS can lead to significant improvement in clustering accuracy over several state-of-the-art unsupervised methods, with a very limited subset of features. The approach shows promise for application to very broad domains.
DBSCAN-based ROI extracted from SAR images and the discrimination of multi-feature ROI
NASA Astrophysics Data System (ADS)
He, Xin Yi; Zhao, Bo; Tan, Shu Run; Zhou, Xiao Yang; Jiang, Zhong Jin; Cui, Tie Jun
2009-10-01
The purpose of the paper is to extract the region of interest (ROI) from coarsely detected synthetic aperture radar (SAR) images and to discriminate whether the ROI contains a target or not, so as to eliminate false alarms and prepare for target recognition. Automatic target clustering is one of the most difficult tasks in a SAR-image automatic target recognition system. The density-based spatial clustering of applications with noise (DBSCAN) relies on a density-based notion of clusters and is designed to discover clusters of arbitrary shape. DBSCAN is first used in SAR image processing, where it has many excellent features: only two insensitive parameters (radius of neighborhood and minimum number of points) are needed; clusters of arbitrary shapes, which fit the coarsely detected SAR images, can be discovered; and the calculation time and memory can be reduced. In the multi-feature ROI discrimination scheme, we extract several target features, including geometry features such as the area discriminator and the Radon-transform based target profile discriminator, distribution characteristics such as the EFF discriminator, and EM scattering properties such as the PPR discriminator. The synthesized judgment effectively eliminates the false alarms.
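To make the two DBSCAN parameters concrete, here is a minimal, unoptimized sketch of the standard algorithm applied to hypothetical pixel coordinates of coarse detections. It uses a brute-force neighborhood query and is not the authors' implementation; the sample coordinates and parameter values are assumptions chosen for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within distance eps of point i (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels

# Hypothetical detections: two compact blobs and one isolated false alarm.
pixels = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]
labels = dbscan(pixels, eps=1.5, min_pts=3)
```

Each resulting cluster would correspond to one candidate ROI, while points labeled -1 are isolated detections that can be discarded before the multi-feature discrimination stage.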
Feature Clustering for Accelerating Parallel Coordinate Descent
DOE Office of Scientific and Technical Information (OSTI.GOV)
Scherrer, Chad; Tewari, Ambuj; Halappanavar, Mahantesh
2012-12-06
We demonstrate an approach for accelerating calculation of the regularization path for L1 sparse logistic regression problems. We show the benefit of feature clustering as a preconditioning step for parallel block-greedy coordinate descent algorithms.
Diagenetic Crystal Clusters and Dendrites, Lower Mount Sharp, Gale Crater
NASA Technical Reports Server (NTRS)
Kah, L. C.; Kronyak, R.; Van Beek, J.; Nachon, M.; Mangold, N.; Thompson, L.; Wiens, R.; Grotzinger, J.; Farmer, J.; Minitti, M.;
2015-01-01
Since approximately Sol 753 (to sol 840+) the Mars Science Laboratory Curiosity rover has been investigating the Pahrump locality. Mapping of HiRISE images suggests that the Pahrump locality represents the first occurrence of strata associated with basal Mount Sharp. Considerable efforts have been made to document the Pahrump locality in detail, in order to constrain both depositional and diagenetic facies. The Pahrump succession consists of approximately 13 meters of recessive-weathering mudstone interbedded with thin (decimeter-scale) intervals of more erosionally resistant mudstone, and crossbedded sandstone in the upper stratigraphic levels. Mudstone textures vary from massive, to poorly laminated, to well-laminated. Here we investigate the distribution and structure of unusual diagenetic features that occur in the lowermost portion of the Pahrump section. These diagenetic features consist of three-dimensional crystal clusters and dendrites that are erosionally resistant with respect to the host rock.
Polycystic Ovary-Like Abnormalities (PCO-L) in women with functional hypothalamic amenorrhea.
Robin, G; Gallo, C; Catteau-Jonard, S; Lefebvre-Maunoury, C; Pigny, P; Duhamel, A; Dewailly, D
2012-11-01
In the general population, about 30% of asymptomatic women have polycystic ovary-like abnormalities (PCO-L), i.e. polycystic ovarian morphology (PCOM) at ultrasound and/or increased anti-Müllerian hormone (AMH) serum level. PCOM has also been reported in 30-50% of women with functional hypothalamic amenorrhea (FHA). The aim of this study was to verify whether both PCOM and excessive AMH level indicate PCO-L in FHA and to elucidate its significance. We conducted a retrospective analysis using a database and comparison with a control population. Subjects received ambulatory care in an academic hospital. Fifty-eight patients with FHA were compared to 217 control women with nonendocrine infertility and body mass index of less than 25 kg/m². There were no interventions. We measured serum testosterone, androstenedione, FSH, LH, AMH, and ovarian area values. The antral follicle count (AFC) was used as a binary variable (i.e. negative or positive) because of the evolution of its sensitivity over the time of this study. The ability of these variables (except AFC) to detect PCO-L in both populations was tested by cluster analysis. One cluster (cluster 2) suggesting PCO-L was detected in the control population (n = 52; 24%), whereas two such clusters were observed in the FHA population (n = 22 and n = 6; 38 and 10%; clusters 2 and 3, respectively). Cluster 2 in FHA had similar features of PCO-L as cluster 2 in controls, with higher prevalence of positive AFC (70%) and PCOM (70%), higher values of ovarian area and higher serum AMH (P < 0.0001 for all), and testosterone levels (P < 0.01) than in cluster 1. Cluster 3 in FHA was peculiar, with frankly elevated AMH levels. In the whole population (controls + FHA), PCO-L was significantly associated with lower FSH values (P < 0.0001). PCO-L in FHA is a frequent and usually incidental finding of unclear significance, as in controls. 
The association of PCO-L with hypothalamic amenorrhea should not lead to a mistaken diagnosis of PCOS.
Topic modeling for cluster analysis of large biological and medical datasets
Zhao, Weizhong; Zou, Wen; Chen, James J
2014-01-01
Background: The big data moniker is nowhere better deserved than to describe the ever-increasing prodigiousness and complexity of biological and medical datasets. New methods are needed to generate and test hypotheses, foster biological interpretation, and build validated predictors. Although multivariate techniques such as cluster analysis may allow researchers to identify groups, or clusters, of related variables, the accuracies and effectiveness of traditional clustering methods diminish for large and hyper-dimensional datasets. Topic modeling is an active research field in machine learning and has been mainly used as an analytical tool to structure large textual corpora for data mining. Its ability to reduce high dimensionality to a small number of latent variables makes it suitable as a means for clustering or overcoming clustering difficulties in large biological and medical datasets. Results: In this study, three topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, are proposed and tested on the cluster analysis of three large datasets: Salmonella pulsed-field gel electrophoresis (PFGE) dataset, lung cancer dataset, and breast cancer dataset, which represent various types of large biological or medical datasets. All three methods are shown to improve the efficacy/effectiveness of clustering results on the three datasets in comparison to traditional methods. A preferable cluster analysis method emerged for each of the three datasets on the basis of replicating known biological truths. Conclusion: Topic modeling could be advantageously applied to the large datasets of biological or medical research. The three proposed topic model-derived clustering methods, highest probable topic assignment, feature selection and feature extraction, yield clustering improvements for the three different data types. 
Clusters more efficaciously represent truthful groupings and subgroupings in the data than traditional methods, suggesting that topic model-based methods could provide an analytic advancement in the analysis of large biological or medical datasets. PMID:25350106
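Of the three topic model-derived methods named above, highest probable topic assignment is the simplest to illustrate. The sketch below assumes that a topic model has already been fitted and has produced a per-sample topic distribution; the matrix values and function names are hypothetical, and the fitting step itself is not shown.

```python
from collections import defaultdict

def assign_by_highest_topic(doc_topic):
    """Label each sample with the index of its most probable topic."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in doc_topic]

def group_by_cluster(labels):
    """Map each cluster label to the list of sample indices assigned to it."""
    clusters = defaultdict(list)
    for sample_index, label in enumerate(labels):
        clusters[label].append(sample_index)
    return dict(clusters)

# Hypothetical sample-by-topic probabilities (each row sums to 1).
doc_topic = [
    [0.80, 0.15, 0.05],
    [0.10, 0.70, 0.20],
    [0.75, 0.05, 0.20],
    [0.05, 0.15, 0.80],
]
labels = assign_by_highest_topic(doc_topic)
clusters = group_by_cluster(labels)
```

The other two methods would instead use the topic model as a preprocessing step, either selecting informative original features or feeding the topic proportions to a conventional clustering algorithm.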
Jin, Yuanyuan; Lu, Shengjie; Hermann, Andreas; Kuang, Xiaoyu; Zhang, Chuanzhao; Lu, Cheng; Xu, Hongguang; Zheng, Weijun
2016-01-01
We present a combined experimental and theoretical study of ruthenium doped germanium clusters, RuGeₙ⁻ (n = 3–12), and their corresponding neutral species. Photoelectron spectra of RuGeₙ⁻ clusters are measured at 266 nm. The vertical detachment energies (VDEs) and adiabatic detachment energies (ADEs) are obtained. Unbiased CALYPSO structure searches confirm the low-lying structures of anionic and neutral ruthenium doped germanium clusters in the size range of 3 ≤ n ≤ 12. Subsequent geometry optimizations using density functional theory (DFT) at PW91/LANL2DZ level are carried out to determine the relative stability and electronic properties of ruthenium doped germanium clusters. It is found that most of the anionic and neutral clusters have very similar global features. Although the global minimum structures of the anionic and neutral clusters are different, their respective geometries are observed as the low-lying isomers in either case. In addition, for n > 8, the Ru atom in RuGeₙ⁻/⁰ clusters is absorbed endohedrally in the Ge cage. The theoretically predicted vertical and adiabatic detachment energies are in good agreement with the experimental measurements. The excellent agreement between DFT calculations and experiment enables a comprehensive evaluation of the geometrical and electronic structures of ruthenium doped germanium clusters. PMID:27439955
A curvature-based weighted fuzzy c-means algorithm for point clouds de-noising
NASA Astrophysics Data System (ADS)
Cui, Xin; Li, Shipeng; Yan, Xiutian; He, Xinhua
2018-04-01
In order to remove noise from a three-dimensional scattered point cloud and smooth the data without damaging sharp geometric features, a novel algorithm is proposed in this paper. A feature-preserving weight is added to the fuzzy c-means algorithm, yielding a curvature-weighted fuzzy c-means clustering algorithm. First, large-scale outliers are removed using statistics of the neighboring points within radius r. Then, the algorithm estimates the curvature of the point cloud data using a conicoid parabolic fitting method and calculates the curvature feature value. Finally, the proposed clustering algorithm is applied to calculate the weighted cluster centers, which are regarded as the new points. The experimental results show that this approach is effective for noise of different scales and intensities in point clouds, achieves high precision, and preserves features at the same time. It is also robust to different noise models.
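The weighted cluster-center step can be sketched as a standard fuzzy c-means iteration in which every point carries an extra weight. This is an assumption-based illustration only: the abstract does not give the formula for the curvature-derived weight, so the per-point weights here are hypothetical inputs, and the outlier removal and curvature estimation stages are omitted.

```python
import math

def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership of every point in every cluster."""
    exponent = 2.0 / (m - 1.0)
    result = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if any(d == 0.0 for d in dists):
            # The point coincides with a center: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            result.append([h / sum(hits) for h in hits])
            continue
        result.append([1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists])
    return result

def weighted_centers(points, u, weights, m=2.0):
    """Cluster centers where point k contributes with weight w_k * u_ik^m."""
    dim = len(points[0])
    centers = []
    for i in range(len(u[0])):
        coeffs = [w * (row[i] ** m) for w, row in zip(weights, u)]
        total = sum(coeffs)
        centers.append(tuple(
            sum(c * p[d] for c, p in zip(coeffs, points)) / total for d in range(dim)
        ))
    return centers

def weighted_fcm(points, centers, weights, m=2.0, iterations=50):
    """Alternate membership and weighted-center updates for a fixed number of steps."""
    for _ in range(iterations):
        u = memberships(points, centers, m)
        centers = weighted_centers(points, u, weights, m)
    return centers, memberships(points, centers, m)

# Hypothetical noisy samples around two surface locations, with assumed weights.
cloud = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
         (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0)]
point_weights = [1.0, 1.0, 2.0, 1.0, 1.0, 2.0]
centers, u = weighted_fcm(cloud, [cloud[0], cloud[3]], point_weights)
```

With all weights equal to 1 the update reduces to ordinary fuzzy c-means; larger weights pull the center, and hence the denoised point, toward samples that are considered more important for preserving a feature.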
Spadone, Sara; de Pasquale, Francesco; Mantini, Dante; Della Penna, Stefania
2012-09-01
Independent component analysis (ICA) is typically applied on functional magnetic resonance imaging, electroencephalographic and magnetoencephalographic (MEG) data due to its data-driven nature. In these applications, ICA needs to be extended from single to multi-session and multi-subject studies for interpreting and assigning a statistical significance at the group level. Here a novel strategy for analyzing MEG independent components (ICs) is presented, Multivariate Algorithm for Grouping MEG Independent Components K-means based (MAGMICK). The proposed approach is able to capture spatio-temporal dynamics of brain activity in MEG studies by running ICA at subject level and then clustering the ICs across sessions and subjects. Distinctive features of MAGMICK are: i) the implementation of an efficient set of "MEG fingerprints" designed to summarize properties of MEG ICs as they are built on spatial, temporal and spectral parameters; ii) the implementation of a modified version of the standard K-means procedure to improve its data-driven character. This algorithm groups the obtained ICs automatically estimating the number of clusters through an adaptive weighting of the parameters and a constraint on the ICs independence, i.e. components coming from the same session (at subject level) or subject (at group level) cannot be grouped together. The performances of MAGMICK are illustrated by analyzing two sets of MEG data obtained during a finger tapping task and median nerve stimulation. The results demonstrate that the method can extract consistent patterns of spatial topography and spectral properties across sessions and subjects that are in good agreement with the literature. In addition, these results are compared to those from a modified version of affinity propagation clustering method. The comparison, evaluated in terms of different clustering validity indices, shows that our methodology often outperforms the clustering algorithm. 
Finally, these results are confirmed by a comparison with a MEG-tailored version of the self-organizing group ICA, which is widely used for fMRI IC clustering. Copyright © 2012 Elsevier Inc. All rights reserved.
Non-redundant patent sequence databases with value-added annotations at two levels
Li, Weizhong; McWilliam, Hamish; de la Torre, Ana Richart; Grodowski, Adam; Benediktovich, Irina; Goujon, Mickael; Nauche, Stephane; Lopez, Rodrigo
2010-01-01
The European Bioinformatics Institute (EMBL-EBI) provides public access to patent data, including abstracts, chemical compounds and sequences. Sequences can appear multiple times due to the filing of the same invention with multiple patent offices, or the use of the same sequence by different inventors in different contexts. Information relating to the source invention may be incomplete, and biological information available in patent documents elsewhere may not be reflected in the annotation of the sequence. Search and analysis of these data have become increasingly challenging for both the scientific and intellectual-property communities. Here, we report a collection of non-redundant patent sequence databases, which cover the EMBL-Bank nucleotides patent class and the patent protein databases and contain value-added annotations from patent documents. The databases were created at two levels by the use of sequence MD5 checksums. Sequences within a level-1 cluster are 100% identical over their whole length. Level-2 clusters were defined by sub-grouping level-1 clusters based on patent family information. Value-added annotations, such as publication number corrections, earliest publication dates and feature collations, significantly enhance the quality of the data, allowing for better tracking and cross-referencing. The databases are available at: http://www.ebi.ac.uk/patentdata/nr/. PMID:19884134
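The two-level grouping described above can be sketched with Python's standard `hashlib` module. The record layout (`id`, `sequence`, `family`) and the normalization applied before hashing (whitespace removal and upper-casing) are assumptions made for this illustration; the abstract states only that MD5 checksums and patent family information were used.

```python
import hashlib
from collections import defaultdict

def md5_of_sequence(seq):
    """MD5 checksum of a sequence after an assumed normalization step."""
    normalized = "".join(seq.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()

def level1_clusters(entries):
    """Group entries whose sequences are identical over their whole length."""
    clusters = defaultdict(list)
    for entry in entries:
        clusters[md5_of_sequence(entry["sequence"])].append(entry)
    return dict(clusters)

def level2_clusters(level1):
    """Sub-group every level-1 cluster by patent family."""
    result = {}
    for checksum, members in level1.items():
        by_family = defaultdict(list)
        for entry in members:
            by_family[entry["family"]].append(entry["id"])
        result[checksum] = dict(by_family)
    return result

# Hypothetical patent sequence records.
entries = [
    {"id": "A1", "sequence": "ACGTACGT", "family": "F100"},
    {"id": "B7", "sequence": "acgt acgt", "family": "F200"},
    {"id": "C3", "sequence": "ACGTACGA", "family": "F100"},
]
l1 = level1_clusters(entries)
l2 = level2_clusters(l1)
shared = md5_of_sequence("ACGTACGT")
```

Here `A1` and `B7` fall into the same level-1 cluster because their normalized sequences are identical, and level 2 then separates them again because they belong to different patent families.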
Thermodynamically accessible titanium clusters Ti_N, N = 2-32.
Lazauskas, Tomas; Sokol, Alexey A; Buckeridge, John; Catlow, C Richard A; Escher, Susanne G E T; Farrow, Matthew R; Mora-Fonz, David; Blum, Volker W; Phaahla, Tshegofatso M; Chauke, Hasani R; Ngoepe, Phuti E; Woodley, Scott M
2018-05-10
We have performed a genetic algorithm search on the tight-binding interatomic potential energy surface (PES) for small Ti_N (N = 2-32) clusters. The low energy candidate clusters were further refined using density functional theory (DFT) calculations with the PBEsol exchange-correlation functional and evaluated with the PBEsol0 hybrid functional. The resulting clusters were analysed in terms of their structural features, growth mechanism and surface area. The results suggest a growth mechanism that is based on forming coordination centres by interpenetrating icosahedra, icositetrahedra and Frank-Kasper polyhedra. We identify centres of coordination, which act as centres of bulk nucleation in medium sized clusters and determine the morphological features of the cluster.
Knutson, Stacy T.; Westwood, Brian M.; Leuthaeuser, Janelle B.; Turner, Brandon E.; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D.; Harper, Angela F.; Brown, Shoshana D.; Morris, John H.; Ferrin, Thomas E.; Babbitt, Patricia C.
2017-01-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. PMID:28054422
Knutson, Stacy T; Westwood, Brian M; Leuthaeuser, Janelle B; Turner, Brandon E; Nguyendac, Don; Shea, Gabrielle; Kumar, Kiran; Hayden, Julia D; Harper, Angela F; Brown, Shoshana D; Morris, John H; Ferrin, Thomas E; Babbitt, Patricia C; Fetrow, Jacquelyn S
2017-04-01
Protein function identification remains a significant problem. Solving this problem at the molecular functional level would allow identification of mechanistic determinants: amino acids that distinguish details between functional families within a superfamily. Active site profiling was developed to identify mechanistic determinants. DASP and DASP2 were developed as tools to search sequence databases using active site profiling. Here, TuLIP (Two-Level Iterative clustering Process) is introduced as an iterative, divisive clustering process that utilizes active site profiling to separate structurally characterized superfamily members into functionally relevant clusters. Underlying TuLIP is the observation that functionally relevant families (curated by Structure-Function Linkage Database, SFLD) self-identify in DASP2 searches; clusters containing multiple functional families do not. Each TuLIP iteration produces candidate clusters, each evaluated to determine if it self-identifies using DASP2. If so, it is deemed a functionally relevant group. Divisive clustering continues until each structure is either a functionally relevant group member or a singlet. TuLIP is validated on enolase and glutathione transferase structures, superfamilies well-curated by SFLD. Correlation is strong; small numbers of structures prevent statistically significant analysis. TuLIP-identified enolase clusters are used in DASP2 GenBank searches to identify sequences sharing functional site features. Analysis shows a true positive rate of 96%, false negative rate of 4%, and maximum false positive rate of 4%. F-measure and performance analysis on the enolase search results and comparison to GEMMA and SCI-PHY demonstrate that TuLIP avoids the over-division problem of these methods. Mechanistic determinants for enolase families are evaluated and shown to correlate well with literature results. © 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.
Tran, Van Tan; Nguyen, Minh Thao; Tran, Quoc Tri
2017-10-12
Density functional theory and the multiconfigurational CASSCF/CASPT2 method have been employed to study the low-lying states of VGe_n^{-/0} (n = 1-4) clusters. For VGe^{-/0} and VGe_2^{-/0} clusters, the relative energies and geometrical structures of the low-lying states are reported at the CASSCF/CASPT2 level. For the VGe_3^{-/0} and VGe_4^{-/0} clusters, the computational results show that due to the large contribution of the Hartree-Fock exact exchange, the hybrid B3LYP, B3PW91, and PBE0 functionals overestimate the energies of the high-spin states as compared to the pure GGA BP86 and PBE functionals and the CASPT2 method. On the basis of the pure GGA BP86 and PBE functionals and the CASSCF/CASPT2 results, the ground states of anionic and neutral clusters are defined, the relative energies of the excited states are computed, and the electron detachment energies of the anionic clusters are evaluated. The computational results are employed to give new assignments for all features in the photoelectron spectra of VGe_3^{-} and VGe_4^{-} clusters.
NASA Technical Reports Server (NTRS)
Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.
1981-01-01
Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.
NASA Astrophysics Data System (ADS)
Vaknin, D.; Garlea, V. O.; Demmel, F.; Mamontov, E.; Nojiri, H.; Martin, C.; Chiorescu, I.; Qiu, Y.; Kögerler, P.; Fielden, J.; Engelhardt, L.; Rainey, C.; Luban, M.
2010-11-01
Inelastic neutron scattering (INS) in variable magnetic field and high-field magnetization measurements in the millikelvin temperature range were performed to gain insight into the low-energy magnetic excitation spectrum and the field-induced level crossings in the molecular spin cluster {Cr8}-cubane. These complementary techniques provide consistent estimates of the lowest level-crossing field. The overall features of the experimental data are explained using an isotropic Heisenberg model, based on three distinct exchange interactions linking the eight CrIII paramagnetic centers (spins s = 3/2), that is supplemented with a relatively large molecular magnetic anisotropy term for the lowest S = 1 multiplet. It is noted that the existence of the anisotropy is clearly evident from the magnetic field dependence of the excitations in the INS measurements, while the magnetization measurements are not sensitive to its effects.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vaknin, D.; Garlea, Vasile O; Demmel, F.
Inelastic neutron scattering (INS) in variable magnetic field and high-field magnetization measurements in the millikelvin temperature range were performed to gain insight into the low-energy magnetic excitation spectrum and the field-induced level crossings in the molecular spin cluster {Cr8}-cubane. These complementary techniques provide consistent estimates of the lowest level-crossing field. The overall features of the experimental data are explained using an isotropic Heisenberg model, based on three distinct exchange interactions linking the eight CrIII paramagnetic centers (spins s = 3/2), that is supplemented with a relatively large molecular magnetic anisotropy term for the lowest S = 1 multiplet. It is noted that the existence of the anisotropy is clearly evident from the magnetic field dependence of the excitations in the INS measurements, while the magnetization measurements are not sensitive to its effects.
NASA Astrophysics Data System (ADS)
Chun, Sang-Hyun; Kim, Jae-Woo; Sohn, Sangmo T.; Park, Jang-Hyun; Han, Wonyong; Kim, Ho-Il; Lee, Young-Wook; Lee, Myung Gyoon; Lee, Sang-Gak; Sohn, Young-Jong
2010-02-01
Wide-field deep g'r'i' images obtained with the Megacam of the Canada-France-Hawaii Telescope are used to investigate the spatial configuration of stars around five metal-poor globular clusters, M15, M30, M53, NGC 5053, and NGC 5466, in a field of view of ~3°. Applying a mask filtering algorithm to the color-magnitude diagrams of the observed stars, we selected candidate cluster member stars, which were used to examine the characteristics of the spatial stellar distribution surrounding the target clusters. The smoothed surface density maps and the overlaid isodensity contours indicate that all five metal-poor globular clusters exhibit strong evidence of extratidal overdensity features beyond their tidal radii, in the form of extended tidal tails around the clusters. The orientations of the observed extratidal features show signatures of tidal tails tracing the clusters' orbits, inferred from their proper motions, and effects of dynamical interactions with the Galaxy. Our findings include detections of a tidal bridge-like feature and an envelope structure around the pair of globular clusters M53 and NGC 5053. The observed radial surface density profiles of the target clusters deviate from theoretical King models: the profiles show a break at 0.5-0.7 r_t, with the overdensity features extending out to 1.5-2 r_t. Both radial surface density profiles for different angular sections and azimuthal number density profiles confirm the overdensity features of tidal tails around the five metal-poor globular clusters. Our results add further observational evidence that the observed metal-poor halo globular clusters originate from an accreted satellite system, indicative of the merging scenario of the formation of the Galactic halo. Based on observations carried out at the Canada-France-Hawaii Telescope, operated by the National Research Council of Canada, the Centre National de la Recherche Scientifique de France, and the University of Hawaii. This is part of the Searching for the Galactic Halo project using the CFHT, organized by the Korea Astronomy and Space Science Institute.
An adaptive clustering algorithm for image matching based on corner feature
NASA Astrophysics Data System (ADS)
Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song
2018-04-01
Traditional image matching algorithms struggle to balance real-time performance against accuracy. To address this problem, an adaptive clustering algorithm for image matching based on corner features is proposed in this paper. The method performs adaptive clustering on the matching point pairs, based on the similarity of the vectors formed by those pairs. Harris corner detection is carried out first to extract the feature points of the reference image and the perceived image, and the feature points of the two images are initially matched using the Normalized Cross Correlation (NCC) function. Then, using the improved algorithm proposed in this paper, the matching results are clustered to reduce ineffective operations and improve the matching speed and robustness. Finally, the Random Sample Consensus (RANSAC) algorithm is applied to the matching points that remain after clustering. The experimental results show that the proposed algorithm effectively eliminates most of the wrong matching points while retaining the correct ones, improves the accuracy of RANSAC matching, and at the same time reduces the computational load of the whole matching process.
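The NCC matching step mentioned in this abstract can be illustrated with a short sketch. It covers only the similarity measure and a simple best-candidate search, not the paper's adaptive clustering; the flat-list patch representation, the `best_match` helper, and the 0.8 acceptance threshold are assumptions chosen for the example.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized grey-level patches.

    Patches are given as flat lists of pixel intensities; the result lies
    in [-1, 1], with 1 meaning identical up to brightness and contrast.
    """
    n = len(patch_a)
    if n == 0 or n != len(patch_b):
        raise ValueError("patches must be non-empty and equal in size")
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    dev_a = [a - mean_a for a in patch_a]
    dev_b = [b - mean_b for b in patch_b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0.0:
        return 0.0  # a flat patch has no defined correlation; treat as no match
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def best_match(patch, candidates, threshold=0.8):
    """Return (index, score) of the best candidate, or (None, score) if weak."""
    if not candidates:
        return None, 0.0
    scores = [ncc(patch, c) for c in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] >= threshold:
        return best, scores[best]
    return None, scores[best]
```

In a full pipeline the patches would be windows centred on Harris corners in each image, and the accepted pairs would then be passed to the clustering and RANSAC stages.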
[Application of Kohonen Self-Organizing Feature Maps in QSAR of human ADMET and kinase data sets].
Hegymegi-Barakonyi, Bálint; Orfi, László; Kéri, György; Kövesdi, István
2013-01-01
QSAR predictions have proven very useful in a large number of drug design studies, such as the design of kinase inhibitors for cancer therapy; however, the overall predictability often remains unsatisfactory. To improve predictability of ADMET features and kinase inhibitory data, we present a new method using Kohonen's Self-Organizing Feature Map (SOFM) to cluster molecules based on explanatory variables (X) and separate dissimilar ones. We calculated SOFM clusters for a large number of molecules with human ADMET and kinase inhibitory data, and we showed that chemically similar molecules were in the same SOFM cluster, and within such clusters the QSAR models had significantly better predictability. We also used target variables (Y, e.g. ADMET) jointly with X variables to create a novel type of clustering. With our method, cells of loosely coupled XY data could be identified and separated into different model building sets.
A taxonomy of hospitals participating in Medicare accountable care organizations.
Bazzoli, Gloria J; Harless, David W; Chukmaitov, Askar S
2017-03-03
Medicare was an early innovator of accountable care organizations (ACOs), establishing the Medicare Shared Savings Program (MSSP) and Pioneer programs in 2012-2013. Existing research has documented that ACOs bring together an array of health providers with hospitals serving as important participants. Hospitals vary markedly in their service structure and organizational capabilities, and thus, one would expect hospital ACO participants to vary in these regards. Our research identifies hospital subgroups that share certain capabilities and competencies. Such research, in conjunction with existing ACO research, provides deeper understanding of the structure and operation of these organizations. Given that Medicare was an initiator of the ACO concept, our findings provide a baseline to track the evolution of ACO hospitals over time. Hierarchical clustering methods are used in separate analyses of MSSP and Pioneer ACO hospitals. Hospitals participating in ACOs with 2012-2013 start dates are identified through multiple sources. Study data come from the Centers for Medicare and Medicaid Services, American Hospital Association, and Health Information and Management Systems Society. Five-cluster solutions were developed separately for the MSSP and Pioneer hospital samples. Both the MSSP and Pioneer taxonomies had several clusters with high levels of health information technology capabilities. Also distinct clusters with strong physician linkages were present. We examined Pioneer ACO hospitals that subsequently left the program and found that they commonly had low levels of ambulatory care services or health information technology. Distinct subgroups of hospitals exist in both the MSSP and Pioneer programs, suggesting that individual hospitals serve different roles within an ACO. Health information technology and physician linkages appear to be particularly important features in ACO hospitals. 
ACOs need to consider not only geographic and service mix when selecting hospital participants but also their vertical integration features and management competencies.
DePianto, Daryle J; Chandriani, Sanjay; Abbas, Alexander R; Jia, Guiquan; N'Diaye, Elsa N; Caplazi, Patrick; Kauder, Steven E; Biswas, Sabyasachi; Karnik, Satyajit K; Ha, Connie; Modrusan, Zora; Matthay, Michael A; Kukreja, Jasleen; Collard, Harold R; Egen, Jackson G; Wolters, Paul J; Arron, Joseph R
2015-01-01
There is microscopic spatial and temporal heterogeneity of pathological changes in idiopathic pulmonary fibrosis (IPF) lung tissue, which may relate to heterogeneity in pathophysiological mediators of disease and clinical progression. We assessed relationships between gene expression patterns, pathological features, and systemic biomarkers to identify biomarkers that reflect the aggregate disease burden in patients with IPF. Gene expression microarrays (N=40 IPF; 8 controls) and immunohistochemical analyses (N=22 IPF; 8 controls) were performed on lung biopsies. Clinical characterisation and blood biomarker levels of MMP3 and CXCL13 were assessed in a separate cohort of patients with IPF (N=80). 2940 genes were significantly differentially expressed between IPF and control samples (|fold change| >1.5, p<0.05). Two clusters of co-regulated genes related to bronchiolar epithelium or lymphoid aggregates exhibited substantial heterogeneity within the IPF population. Gene expression in bronchiolar and lymphoid clusters corresponded to the extent of bronchiolisation and lymphoid aggregates determined by immunohistochemistry in adjacent tissue sections. Elevated serum levels of MMP3, encoded in the bronchiolar cluster, and CXCL13, encoded in the lymphoid cluster, corresponded to disease severity and shortened survival time (p<10^-7 for MMP3 and p<10^-5 for CXCL13; Cox proportional hazards model). Microscopic pathological heterogeneity in IPF lung tissue corresponds to specific gene expression patterns related to bronchiolisation and lymphoid aggregates. MMP3 and CXCL13 are systemic biomarkers that reflect the aggregate burden of these pathological features across total lung tissue. These biomarkers may have clinical utility as prognostic and/or surrogate biomarkers of disease activity in interventional studies in IPF. Published by the BMJ Publishing Group Limited.
Latha, Manohar; Kavitha, Ganesan
2018-02-03
Schizophrenia (SZ) is a psychiatric disorder that especially affects individuals during their adolescence. There is a need to study the subanatomical regions of SZ brain on magnetic resonance images (MRI) based on morphometry. In this work, an attempt was made to analyze alterations in structure and texture patterns in images of the SZ brain using the level-set method and Laws texture features. T1-weighted MRI of the brain from Center of Biomedical Research Excellence (COBRE) database were considered for analysis. Segmentation was carried out using the level-set method. Geometrical and Laws texture features were extracted from the segmented brain stem, corpus callosum, cerebellum, and ventricle regions to analyze pattern changes in SZ. The level-set method segmented multiple brain regions, with higher similarity and correlation values compared with an optimized method. The geometric features obtained from regions of the corpus callosum and ventricle showed significant variation (p < 0.00001) between normal and SZ brain. Laws texture feature identified a heterogeneous appearance in the brain stem, corpus callosum and ventricular regions, and features from the brain stem were correlated with Positive and Negative Syndrome Scale (PANSS) score (p < 0.005). A framework of geometric and Laws texture features obtained from brain subregions can be used as a supplement for diagnosis of psychiatric disorders.
Substance misuse subtypes among women convicted of homicide.
de Melo Nunes, Adriana; Baltieri, Danilo Antonio
2013-01-01
The proportion of women incarcerated is growing at a faster pace than that for men. The reasons for this important increase have been mainly attributed to drug-using lifestyle and drug-related offenses. About half of female inmates have a history of substance misuse and one third demonstrate high impulsiveness levels. The objectives of this study were to (a) identify subtypes of alcohol and drug problems and impulsiveness among women convicted of homicide, and (b) examine the association between psychosocial and criminological features and the resulting clusters. Data come from 158 female inmates serving a sentence for homicide in the Penitentiary of Sant'Ana in São Paulo State, Brazil. Latent class analysis was used to group participants into substance misuse and impulsiveness classes. Two classes were identified: nonproblematic (cluster 1: 54.43%, n = 86) and problematic (cluster 2: 45.57%, n = 72). After controlling for several psychosocial and criminological variables, cluster 2 inmates showed an earlier beginning of criminal activities and a lower educational level than their counterparts. Recognizing the needs of specific groups of female offenders is crucial for developing adequate health policies and for reducing criminal recidivism among those offenders who have shown higher risk.
Image Registration Algorithm Based on Parallax Constraint and Clustering Analysis
NASA Astrophysics Data System (ADS)
Wang, Zhe; Dong, Min; Mu, Xiaomin; Wang, Song
2018-01-01
To resolve the problems of slow computation and low matching accuracy in image registration, a new image registration algorithm based on a parallax constraint and clustering analysis is proposed. First, the Harris corner detection algorithm is used to extract the feature points of the two images. Second, the Normalized Cross Correlation (NCC) function is used to perform approximate matching of the feature points, yielding the initial feature pairs. Then, according to the parallax constraint condition, the initial feature pairs are preprocessed with the K-means clustering algorithm to remove feature point pairs with obvious errors from the approximate matching step. Finally, the Random Sample Consensus (RANSAC) algorithm is adopted to refine the feature points and obtain the final matching result, achieving fast and accurate image registration. The experimental results show that the image registration algorithm proposed in this paper can improve the accuracy of image matching while preserving the real-time performance of the algorithm.
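One way to picture the K-means pre-filtering step is sketched below. The abstract does not spell out the parallax constraint, so this example makes a strong simplifying assumption: correct matches share a similar displacement vector between the two images, the displacements are split into two clusters, and the smaller cluster is discarded as erroneous. The function names and the deterministic initialisation are likewise illustrative choices, not the authors' method.

```python
def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def two_means(points, iterations=20):
    """Plain 2-means on 2-D points with a deterministic start."""
    first = points[0]
    farthest = max(points, key=lambda p: sq_dist(p, first))
    centroids = [first, farthest]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid wins (ties go to cluster 0).
        labels = [0 if sq_dist(p, centroids[0]) <= sq_dist(p, centroids[1]) else 1
                  for p in points]
        # Update step: recompute each centroid as the mean of its members.
        updated = []
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append((sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members)))
            else:
                updated.append(centroids[k])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels


def filter_by_displacement(matches):
    """Keep the pairs whose displacement falls in the larger of two clusters."""
    if not matches:
        return []
    displacements = [(q[0] - p[0], q[1] - p[1]) for p, q in matches]
    labels = two_means(displacements)
    majority = 0 if labels.count(0) >= labels.count(1) else 1
    return [m for m, lab in zip(matches, labels) if lab == majority]


# Hypothetical matches: ((x, y) in image 1, (x, y) in image 2).
matches = [
    ((0, 0), (10, 0)), ((5, 5), (16, 5)), ((2, 8), (12, 9)),
    ((7, 1), (16, 1)), ((3, 3), (13, 2)),
    ((4, 4), (-46, 44)), ((6, 2), (-42, 44)),  # gross mismatches
]
kept = filter_by_displacement(matches)
```

The surviving pairs would then be handed to RANSAC, which has fewer outliers to reject and so needs fewer iterations.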
Feature Selection Using Information Gain for Improved Structural-Based Alert Correlation
Siraj, Maheyzah Md; Zainal, Anazida; Elshoush, Huwaida Tagelsir; Elhaj, Fatin
2016-01-01
Grouping and clustering alerts for intrusion detection based on the similarity of features is referred to as structural-based alert correlation and can discover a list of attack steps. Previous researchers selected different features and data sources manually based on their knowledge and experience, which led to less accurate identification of attack steps and inconsistent clustering accuracy. Furthermore, existing alert correlation systems deal with a huge amount of data that contains null values, incomplete information, and irrelevant features, causing the analysis of the alerts to be tedious, time-consuming and error-prone. Therefore, this paper focuses on selecting accurate and significant features of alerts that are appropriate to represent the attack steps, thus enhancing the structural-based alert correlation model. A two-tier feature selection method is proposed to obtain the significant features. The first tier ranks the subset of features by information gain (an entropy-based measure), in decreasing order. The second tier adds further features with a better discriminative ability than the initially ranked features. Performance analysis results show the significance of the selected features in terms of the clustering accuracy using the DARPA 2000 intrusion detection scenario-specific dataset. PMID:27893821
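The first-tier ranking by information gain can be illustrated as follows. This is a sketch of the standard information gain calculation for categorical features, not the paper's code; the feature names, the toy alert table, and the attack-step labels are invented for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return (feature, gain) pairs sorted by decreasing information gain."""
    gains = {name: information_gain(column, labels)
             for name, column in table.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical alert attributes and attack-step labels.
table = {
    "protocol": ["tcp", "tcp", "tcp", "tcp"],
    "src_zone": ["ext", "int", "ext", "int"],
    "dst_port": [80, 80, 22, 22],
}
labels = ["scan", "scan", "exploit", "exploit"]
ranking = rank_features(table, labels)
```

Here `dst_port` separates the two labels perfectly and receives the full one bit of gain, while the other two features carry no information about the label and score zero.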
Identification and characterization of near-fatal asthma phenotypes by cluster analysis.
Serrano-Pariente, J; Rodrigo, G; Fiz, J A; Crespo, A; Plaza, V
2015-09-01
Near-fatal asthma (NFA) is a heterogeneous clinical entity and several profiles of patients have been described according to different clinical, pathophysiological and histological features. However, there are no previous studies that identify different phenotypes of NFA in an unbiased way, using statistical methods such as cluster analysis. Therefore, the aim of the present study was to identify and to characterize phenotypes of near-fatal asthma using a cluster analysis. Over a period of 2 years, 33 Spanish hospitals enrolled 179 asthmatics admitted for an episode of NFA. A cluster analysis using a two-step algorithm was performed on data from 84 of these cases. The analysis defined three clusters of patients with NFA: cluster 1, the largest, including older patients with clinical and therapeutic criteria of severe asthma; cluster 2, with a high proportion of respiratory arrest (68%), impaired consciousness level (82%) and mechanical ventilation (93%); and cluster 3, which included younger patients, characterized by an insufficient anti-inflammatory treatment and frequent sensitization to Alternaria alternata and soybean. These results identify specific asthma phenotypes involved in NFA, confirming in part previous findings observed in studies with a clinical approach. The identification of patients with a specific NFA phenotype could suggest interventions to prevent future severe asthma exacerbations. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Perualila-Tan, Nolen Joy; Shkedy, Ziv; Talloen, Willem; Göhlmann, Hinrich W H; Moerbeke, Marijke Van; Kasim, Adetayo
2016-08-01
The modern process of discovering candidate molecules in the early drug discovery phase includes a wide range of approaches to extract vital information from the intersection of biology and chemistry. A typical strategy in compound selection involves compound clustering based on chemical similarity to obtain representative chemically diverse compounds (not incorporating potency information). In this paper, we propose an integrative clustering approach that makes use of both biological (compound efficacy) and chemical (structural features) data sources for the purpose of discovering a subset of compounds with aligned structural and biological properties. The datasets are integrated at the similarity level by assigning complementary weights to produce a weighted similarity matrix, which can serve as a generic input to any clustering algorithm. This new analysis workflow is a semi-supervised method since, after the clusters are determined, a secondary analysis is performed to find differentially expressed genes associated with the derived integrated cluster(s), to further explain the compound-induced biological effects inside the cell. In this paper, datasets from two drug development oncology projects are used to illustrate the usefulness of the weighted similarity-based clustering approach to integrate multi-source high-dimensional information to aid drug discovery. Compounds that are structurally and biologically similar to the reference compounds are discovered using this proposed integrative approach.
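The integration at the similarity level can be written down compactly. The sketch below assumes that "complementary weights" means a weight w for the biological similarity and 1 - w for the chemical similarity; the function names, the toy matrices, and the conversion to a distance matrix are illustrative assumptions rather than details from the paper.

```python
def weighted_similarity(sim_bio, sim_chem, w):
    """Combine two n x n similarity matrices with complementary weights.

    w weights the biological similarity and (1 - w) the chemical one,
    so w = 1 uses biology only and w = 0 uses chemistry only.
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    n = len(sim_bio)
    if len(sim_chem) != n:
        raise ValueError("matrices must have the same size")
    return [[w * sim_bio[i][j] + (1.0 - w) * sim_chem[i][j]
             for j in range(n)]
            for i in range(n)]


def to_distance(similarity):
    """Turn similarities in [0, 1] into distances for a clustering routine."""
    return [[1.0 - s for s in row] for row in similarity]


# Hypothetical similarities for three compounds.
sim_bio = [[1.0, 0.8, 0.1],
           [0.8, 1.0, 0.2],
           [0.1, 0.2, 1.0]]
sim_chem = [[1.0, 0.4, 0.6],
            [0.4, 1.0, 0.3],
            [0.6, 0.3, 1.0]]
combined = weighted_similarity(sim_bio, sim_chem, w=0.25)
```

Sweeping w between 0 and 1 and re-clustering at each value is one simple way to see how much each data source drives the resulting groups.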
Remote sensing imagery classification using multi-objective gravitational search algorithm
NASA Astrophysics Data System (ADS)
Zhang, Aizhu; Sun, Genyun; Wang, Zhenjie
2016-10-01
Simultaneous optimization of different validity measures can capture different data characteristics of remote sensing imagery (RSI) and thereby achieve high quality classification results. In this paper, two conflicting cluster validity indices, the Xie-Beni (XB) index and the fuzzy C-means (FCM) (Jm) measure, are integrated with a diversity-enhanced and memory-based multi-objective gravitational search algorithm (DMMOGSA) to present a novel multi-objective optimization based RSI classification method. In this method, the Gabor filter method is first applied to extract texture features of the RSI. Then, the texture features are combined with the spectral features to construct the spatial-spectral feature set of the RSI. Afterwards, the spatial-spectral feature set is clustered using the proposed method. Specifically, cluster centers are randomly generated initially. After that, the cluster centers are updated and optimized adaptively by employing the DMMOGSA. Accordingly, a set of non-dominated cluster centers is obtained. As a result, a number of RSI classification results are produced, and users can pick the most promising one according to their problem requirements. To quantitatively and qualitatively validate the effectiveness of the proposed method, it was applied to classify two aerial high-resolution remote sensing images. The obtained classification results are compared with those produced by two methods based on a single cluster validity index and by two state-of-the-art multi-objective optimization algorithms. Comparison results show that the proposed method can achieve more accurate RSI classification.
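The two objectives named in this abstract have standard textbook definitions, sketched below for reference. This covers only the evaluation of the Jm and XB measures for a given set of cluster centers and memberships, not the DMMOGSA optimizer; the data layout (memberships indexed as `memberships[cluster][sample]`) and the toy data are assumptions for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fcm_jm(data, centers, memberships, m=2.0):
    """Fuzzy C-means objective: sum of u_ik^m * ||x_k - v_i||^2."""
    return sum(memberships[i][k] ** m * sq_dist(data[k], centers[i])
               for i in range(len(centers))
               for k in range(len(data)))


def xie_beni(data, centers, memberships):
    """Xie-Beni index: compactness over (n * minimum center separation)."""
    if len(centers) < 2:
        raise ValueError("the XB index needs at least two cluster centers")
    compactness = fcm_jm(data, centers, memberships, m=2.0)
    separation = min(sq_dist(centers[i], centers[j])
                     for i in range(len(centers))
                     for j in range(i + 1, len(centers)))
    return compactness / (len(data) * separation)


# Toy feature vectors with two well-separated groups and crisp memberships.
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]
memberships = [[1.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 1.0]]
jm_value = fcm_jm(data, centers, memberships)
xb_value = xie_beni(data, centers, memberships)
```

Lower values are better for both measures. Jm rewards compact clusters only, whereas XB also rewards well-separated centers, which is why the two can pull a search in different directions.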
Modeling sports highlights using a time-series clustering framework and model interpretation
NASA Astrophysics Data System (ADS)
Radhakrishnan, Regunathan; Otsuka, Isao; Xiong, Ziyou; Divakaran, Ajay
2005-01-01
In our past work on sports highlights extraction, we have shown the utility of detecting audience reaction using an audio classification framework. The audio classes in the framework were chosen based on intuition. In this paper, we present a systematic way of identifying the key audio classes for sports highlights extraction using a time series clustering framework. We treat the low-level audio features as a time series and model the highlight segments as "unusual" events in a background of a "usual" process. The set of audio classes to characterize the sports domain is then identified by analyzing the consistent patterns in each of the clusters output from the time series clustering framework. The distribution of features from the training data so obtained for each of the key audio classes is parameterized by a Minimum Description Length Gaussian Mixture Model (MDL-GMM). We also interpret the meaning of each of the mixture components of the MDL-GMM for the key audio class (the "highlight" class) that is correlated with highlight moments. Our results show that the "highlight" class is a mixture of audience cheering and commentator's excited speech. Furthermore, we show that the precision-recall performance for highlights extraction based on this "highlight" class is better than that of our previous approach, which uses only audience cheering as the key highlight class.
Modulated Modularity Clustering as an Exploratory Tool for Functional Genomic Inference
Stone, Eric A.; Ayroles, Julien F.
2009-01-01
In recent years, the advent of high-throughput assays, coupled with their diminishing cost, has facilitated a systems approach to biology. As a consequence, massive amounts of data are currently being generated, requiring efficient methodology aimed at the reduction of scale. Whole-genome transcriptional profiling is a standard component of systems-level analyses, and to reduce scale and improve inference clustering genes is common. Since clustering is often the first step toward generating hypotheses, cluster quality is critical. Conversely, because the validation of cluster-driven hypotheses is indirect, it is critical that quality clusters not be obtained by subjective means. In this paper, we present a new objective-based clustering method and demonstrate that it yields high-quality results. Our method, modulated modularity clustering (MMC), seeks community structure in graphical data. MMC modulates the connection strengths of edges in a weighted graph to maximize an objective function (called modularity) that quantifies community structure. The result of this maximization is a clustering through which tightly-connected groups of vertices emerge. Our application is to systems genetics, and we quantitatively compare MMC both to the hierarchical clustering method most commonly employed and to three popular spectral clustering approaches. We further validate MMC through analyses of human and Drosophila melanogaster expression data, demonstrating that the clusters we obtain are biologically meaningful. We show MMC to be effective and suitable to applications of large scale. In light of these features, we advocate MMC as a standard tool for exploration and hypothesis generation. PMID:19424432
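The modularity objective that MMC maximizes can be made concrete with a small example. The sketch below computes the standard Newman modularity of a given partition of a weighted, undirected graph; it does not implement MMC's modulation of edge weights or its search over partitions, and the adjacency-matrix representation and toy graph are assumptions for illustration.

```python
def modularity(adjacency, communities):
    """Newman modularity Q of a partition of a weighted undirected graph.

    adjacency:   symmetric n x n matrix of non-negative edge weights.
    communities: list of n community labels, one per vertex.
    """
    n = len(adjacency)
    strength = [sum(row) for row in adjacency]  # weighted degree per vertex
    two_m = sum(strength)                       # twice the total edge weight
    if two_m == 0:
        raise ValueError("graph has no edges")
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:
                # Observed weight minus the weight expected by chance.
                q += adjacency[i][j] - strength[i] * strength[j] / two_m
    return q / two_m


def adjacency_from_edges(n, edges):
    """Build a symmetric matrix from (u, v, weight) triples."""
    matrix = [[0.0] * n for _ in range(n)]
    for u, v, weight in edges:
        matrix[u][v] += weight
        matrix[v][u] += weight
    return matrix


# Two triangles joined by a single edge.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
         (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
         (2, 3, 1.0)]
graph = adjacency_from_edges(6, edges)
q_split = modularity(graph, [0, 0, 0, 1, 1, 1])
q_single = modularity(graph, [0, 0, 0, 0, 0, 0])
```

Splitting the graph into its two triangles gives Q = 5/14, whereas placing every vertex in one community gives Q = 0, so the split is the better community structure under this objective.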
Banerjee, Amit; Misra, Milind; Pai, Deepa; Shih, Liang-Yu; Woodley, Rohan; Lu, Xiang-Jun; Srinivasan, A R; Olson, Wilma K; Davé, Rajesh N; Venanzi, Carol A
2007-01-01
Six rigid-body parameters (Shift, Slide, Rise, Tilt, Roll, Twist) are commonly used to describe the relative displacement and orientation of successive base pairs in a nucleic acid structure. The present work adapts this approach to describe the relative displacement and orientation of any two planes in an arbitrary molecule, specifically, planes which contain important pharmacophore elements. Relevant code from the 3DNA software package (Nucleic Acids Res. 2003, 31, 5108-5121) was generalized to treat molecular fragments other than DNA bases as input for the calculation of the corresponding rigid-body (or "planes") parameters. These parameters were used to construct feature vectors for a fuzzy relational clustering study of over 700 conformations of a flexible analogue of the dopamine reuptake inhibitor, GBR 12909. Several cluster validity measures were used to determine the optimal number of clusters. Translational (Shift, Slide, Rise) rather than rotational (Tilt, Roll, Twist) features dominate clustering based on planes that are relatively far apart, whereas both types of features are important to clustering when the pair of planes are close by. This approach was able to classify the data set of molecular conformations into groups and to identify representative conformers for use as template conformers in future Comparative Molecular Field Analysis studies of GBR 12909 analogues. The advantage of using the planes parameters, rather than the combination of atomic coordinates and angles between molecular planes used in our previous fuzzy relational clustering of the same data set (J. Chem. Inf. Model. 2005, 45, 610-623), is that the present clustering results are independent of molecular superposition and the technique is able to identify clusters in the molecule considered as a whole. This approach is easily generalizable to any two planes in any molecule.
Reboussin, Beth A; Preisser, John S; Song, Eun-Young; Wolfson, Mark
2012-07-01
Under-age drinking is an enormous public health issue in the USA. Evidence that community level structures may impact on under-age drinking has led to a proliferation of efforts to change the environment surrounding the use of alcohol. Although the focus of these efforts is to reduce drinking by individual youths, environmental interventions are typically implemented at the community level with entire communities randomized to the same intervention condition. A distinct feature of these trials is the tendency of the behaviours of individuals residing in the same community to be more alike than that of others residing in different communities, which is herein called 'clustering'. Statistical analyses and sample size calculations must account for this clustering to avoid type I errors and to ensure an appropriately powered trial. Clustering itself may also be of scientific interest. We consider the alternating logistic regressions procedure within the population-averaged modelling framework to estimate the effect of a law enforcement intervention on the prevalence of under-age drinking behaviours while modelling the clustering at multiple levels, e.g. within communities and within neighbourhoods nested within communities, by using pairwise odds ratios. We then derive sample size formulae for estimating intervention effects when planning a post-test-only or repeated cross-sectional community-randomized trial using the alternating logistic regressions procedure.
1985-07-30
OA 3’, one of the cases studied by Lohmann et al(2 ). The new feature in their measurement is that they normalize their cross section by an...dynamic behav- ior of neutral and ionic clusters[lO]. In the case of ionic clusters there have been already extensive studies on their stability and...The specific cases studied so far on an ab initio level (e + F2 [12], e + N2 [13], e + H2 [14]) indicate that nonlocal effects are generally important
The drug target genes show higher evolutionary conservation than non-target genes.
Lv, Wenhua; Xu, Yongdeng; Guo, Yiying; Yu, Ziqi; Feng, Guanglong; Liu, Panpan; Luan, Meiwei; Zhu, Hongjie; Liu, Guiyou; Zhang, Mingming; Lv, Hongchao; Duan, Lian; Shang, Zhenwei; Li, Jin; Jiang, Yongshuai; Zhang, Ruijie
2016-01-26
Although evidence indicates that drug target genes share some common evolutionary features, there have been few studies analyzing evolutionary features of drug targets at an overall level. Therefore, we conducted an analysis which aimed to investigate the evolutionary characteristics of drug target genes. We compared the evolutionary conservation between human drug target genes and non-target genes by combining both the evolutionary features and network topological properties in the human protein-protein interaction network. The evolution rate, conservation score and the percentage of orthologous genes of 21 species were included in our study. Meanwhile, four topological features including the average shortest path length, betweenness centrality, clustering coefficient and degree were considered for comparison analysis. We obtained four results. Compared with non-drug target genes, 1) drug target genes had lower evolutionary rates; 2) drug target genes had higher conservation scores; 3) drug target genes had higher percentages of orthologous genes; and 4) drug target genes had a tighter network structure, including higher degrees, betweenness centrality and clustering coefficients, and lower average shortest path lengths. These results demonstrate that drug target genes are more evolutionarily conserved than non-drug target genes. We hope that our study will provide valuable information for other researchers who are interested in evolutionary conservation of drug targets.
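Three of the four topological features named in the abstract above (degree, clustering coefficient, and average shortest path length) can be sketched for a single node as follows. This is an illustrative sketch on an unweighted, undirected toy graph, not the authors' pipeline; betweenness centrality is omitted for brevity, and the node names are hypothetical.

```python
from collections import deque


def degree(adj, node):
    """Number of direct interaction partners of a node."""
    return len(adj[node])


def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))


def average_shortest_path_length(adj, source):
    """Mean hop distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        cur = queue.popleft()
        for nxt in adj[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    others = [d for n, d in dist.items() if n != source]
    return sum(others) / len(others) if others else 0.0


# Toy interaction network (hypothetical), stored as adjacency sets.
toy_adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}
```

In this toy graph, node A has the highest degree and the shortest average path to the other nodes, the pattern the abstract reports for drug target genes relative to non-target genes.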
Strittmatter, M; Hamann, G F; Grauer, M; Fischer, C; Blaes, F; Hoffmann, K H; Schimrigk, K
1996-05-17
Twelve patients (age 43.4 +/- 6.3 years) with episodic cluster headache (CH) were examined during the cluster period. Plasma norepinephrine levels in patients suffering from CH were significantly decreased compared with the control group (p < 0.01). There were also statistically significant correlations between norepinephrine levels and clinical features of the pain attacks including duration (r = 0.75, p < 0.05), intensity (r = 0.64, p < 0.05) and frequency (r = 0.68, p < 0.06), thereby suggesting a pathophysiological involvement of the sympathetic nervous system in CH. Increased plasma levels of cortisol and ACTH in patients with CH, especially in the morning and in the evening, suggest an alteration of the feedback circuit involving the hypothalamus, the pituitary and the adrenal gland, an imbalance in the hormones related to these structures, as well as an alteration of the circadian rhythm. In addition, CH patients demonstrated significantly decreased levels of norepinephrine (p < 0.05), HVA (p < 0.01) and 5-HIAA (p < 0.01) in the cerebrospinal fluid (CSF), consistent with a central genesis of CH. These significant relationships between neurochemical parameters and the clinical patterns suggest a complex interplay between the hypothalamus, neuroendocrinological parameters, activity of the autonomic nervous system and the pain of CH.
Yang, Jing
2018-03-01
This study investigated the durational features of English word-initial /s/+stop clusters produced by bilingual Mandarin (L1)-English (L2) children and monolingual English children and adults. The participants included two groups of five- to six-year-old bilingual children: low proficiency in the L2 (Bi-low) and high proficiency in the L2 (Bi-high), one group of age-matched English children, and one group of English adults. Each participant produced a list of English words containing /sp, st, sk/ at the word-initial position followed by /a, i, u/, respectively. The absolute durations of the clusters and cluster elements and the durational proportions of elements to the overall cluster were measured. The results revealed that Bi-high children behaved similarly to the English monolinguals whereas Bi-low children used a different strategy of temporal organization to coordinate the cluster components in comparison to the English monolinguals and Bi-high children. The influence of language experience and continuing development of temporal features in children were discussed.
On-Line Pattern Analysis and Recognition System. OLPARS VI. Software Reference Manual,
1982-06-18
Discriminant Analysis Data Transformation, Feature Extraction, Feature Evaluation Cluster Analysis, Classification Computer Software 20Z. ABSTRACT... cluster /scatter cut-off value, (2) change the one-space bin factor, (3) change from long prompts to short prompts or vice versa, (4) change the...value, a cluster plot is displayed, otherwise a scatter plot is shown. if option 1 is selected, the program requests that a new value be input
Mining the modular structure of protein interaction networks.
Berenstein, Ariel José; Piñero, Janet; Furlong, Laura Inés; Chernomoretz, Ariel
2015-01-01
Cluster-based descriptions of biological networks have received much attention in recent years fostered by accumulated evidence of the existence of meaningful correlations between topological network clusters and biological functional modules. Several well-performing clustering algorithms exist to infer topological network partitions. However, due to respective technical idiosyncrasies they might produce dissimilar modular decompositions of a given network. In this contribution, we aimed to analyze how alternative modular descriptions could condition the outcome of follow-up network biology analysis. We considered a human protein interaction network and two paradigmatic cluster recognition algorithms, namely: the Clauset-Newman-Moore and the infomap procedures. We analyzed to what extent both methodologies yielded different results in terms of granularity and biological congruency. In addition, taking into account Guimera's cartographic role characterization of network nodes, we explored how the adoption of a given clustering methodology impinged on the ability to highlight relevant network meso-scale connectivity patterns. As a case study we considered a set of aging related proteins and showed that only the high-resolution modular description provided by infomap, could unveil statistically significant associations between them and inter/intra modular cartographic features. Besides reporting novel biological insights that could be gained from the discovered associations, our contribution warns against possible technical concerns that might affect the tools used to mine for interaction patterns in network biology studies. In particular our results suggested that sub-optimal partitions from the strict point of view of their modularity levels might still be worth being analyzed when meso-scale features were to be explored in connection with external source of biological knowledge.
SAR image segmentation using skeleton-based fuzzy clustering
NASA Astrophysics Data System (ADS)
Cao, Yun Yi; Chen, Yan Qiu
2003-06-01
SAR image segmentation can be converted to a clustering problem in which pixels or small patches are grouped together based on local feature information. In this paper, we present a novel framework for segmentation. The segmentation goal is achieved by unsupervised clustering of characteristic descriptors extracted from local patches. The mixture model of the characteristic descriptor, which combines intensity and texture features, is investigated. The unsupervised algorithm is derived from the recently proposed Skeleton-Based Data Labeling method. Skeletons are constructed as prototypes of clusters to represent arbitrary latent structures in image data. Segmentation using Skeleton-Based Fuzzy Clustering is able to detect the types of surfaces appearing in SAR images automatically without any user input.
Spatial correlations, clustering and percolation-like transitions in homicide crimes
NASA Astrophysics Data System (ADS)
Alves, L. G. A.; Lenzi, E. K.; Mendes, R. S.; Ribeiro, H. V.
2015-07-01
The spatial dynamics of criminal activities has been recently studied through statistical physics methods; however, models and results have been focusing on local scales (city level) and much less is known about these patterns at larger scales, e.g. at a country level. Here we report on a characterization of the spatial dynamics of homicide crimes across the Brazilian territory using data from all cities (~5000) over a period of more than thirty years. Our results show that the spatial correlation function in the per capita homicides decays exponentially with the distance between cities and that the characteristic correlation length displays an acute increasing trend in the latest years. We also investigate the formation of spatial clusters of cities via a percolation-like analysis, where clustering of cities and a phase-transition-like behavior describing the size of the largest cluster as a function of a homicide threshold are observed. This transition-like behavior presents evolutive features characterized by an increase in the homicide threshold (where the transitions occur) and by a decrease in the transition magnitudes (length of the jumps in the cluster size). We believe that our work sheds new light on the spatial patterns of criminal activities at large scales, which may contribute to better political decisions and resource allocation, and opens new possibilities for modeling criminal activities by setting up fundamental empirical patterns at large scales.
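The idea of tracking the largest cluster as a function of a homicide threshold can be sketched as below. The abstract does not state how cities are linked, so this sketch assumes that a city is "active" when its rate meets the threshold and that two active cities belong to the same cluster when they lie within a fixed distance of each other; the function name, the linking rule, and the toy coordinates are all assumptions.

```python
import math


def largest_cluster_size(cities, threshold, max_dist):
    """Size of the largest connected group of active cities.

    cities: list of (x, y, rate) tuples.
    A city is active if rate >= threshold (assumed rule); active cities
    closer than max_dist are linked (assumed rule).
    """
    active = [(x, y) for x, y, rate in cities if rate >= threshold]
    parent = list(range(len(active)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(active)):
        for j in range(i + 1, len(active)):
            dx = active[i][0] - active[j][0]
            dy = active[i][1] - active[j][1]
            if math.hypot(dx, dy) <= max_dist:
                parent[find(i)] = find(j)

    sizes = {}
    for i in range(len(active)):
        root = find(i)
        sizes[root] = sizes.get(root, 0) + 1
    return max(sizes.values(), default=0)


# Toy data (hypothetical): five cities on a line with per capita rates.
toy_cities = [(0, 0, 5), (1, 0, 12), (2, 0, 20), (10, 0, 30), (11, 0, 8)]
curve = [largest_cluster_size(toy_cities, t, 1.5) for t in (0, 10, 25, 50)]
```

Sweeping the threshold and plotting the resulting sizes gives the kind of curve in which a transition-like jump would show up.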
Lutfey, Karen E; Gerstenberger, Eric; McKinlay, John B
2013-06-01
To identify styles of physician decision making (as opposed to singular clinical actions) and to analyze their association with variations in the management of a vignette presentation of coronary heart disease (CHD). Primary data were collected from primary care physicians in North and South Carolina. In a balanced factorial experimental design, primary care physicians viewed one of 16 (2^4) video vignette presentations of CHD and provided detailed information about how they would manage the case. 256 MD primary care physicians were interviewed face-to-face in North and South Carolina. We identify three clusters depicting unique styles of CHD management that are robust to controls for physician (gender and level of experience) and patient characteristics (age, gender, socioeconomic status, and race) as well as key organizational features of physicians' work settings. Physicians in Cluster 1 "Cardiac" (N = 92) were more likely to focus on cardiac issues compared with their counterparts; physicians in Cluster 2 "Talkers" (N = 93) were more likely to give advice and take additional medical history; whereas physicians in Cluster 3 "Minimalists" (N = 71) were less likely than their counterparts to take action on any of the types of management behavior. Variations in styles of decision making, which encompass multiple outcome variables and extend beyond individual-level demographic predictors, may add to our understanding of disparities in health quality and outcomes. © Health Research and Educational Trust.
A new clustering algorithm applicable to multispectral and polarimetric SAR images
NASA Technical Reports Server (NTRS)
Wong, Yiu-Fai; Posner, Edward C.
1993-01-01
We describe an application of a scale-space clustering algorithm to the classification of a multispectral and polarimetric SAR image of an agricultural site. After the initial polarimetric and radiometric calibration and noise cancellation, we extracted a 12-dimensional feature vector for each pixel from the scattering matrix. The clustering algorithm was able to partition a set of unlabeled feature vectors from 13 selected sites, each site corresponding to a distinct crop, into 13 clusters without any supervision. The cluster parameters were then used to classify the whole image. The classification map is much less noisy and more accurate than those obtained by hierarchical rules. Starting with every point as a cluster, the algorithm works by melting the system to produce a tree of clusters in the scale space. It can cluster data in any multidimensional space and is insensitive to variability in cluster densities, sizes and ellipsoidal shapes. This algorithm, more powerful than existing ones, may be useful for remote sensing for land use.
Clustering of Multi-Temporal Fully Polarimetric L-Band SAR Data for Agricultural Land Cover Mapping
NASA Astrophysics Data System (ADS)
Tamiminia, H.; Homayouni, S.; Safari, A.
2015-12-01
Recently, the unique capabilities of Polarimetric Synthetic Aperture Radar (PolSAR) sensors have made them an important and efficient tool for natural resources and environmental applications, such as land cover and crop classification. The aim of this paper is to classify multi-temporal fully polarimetric SAR data over an agricultural region using a kernel-based fuzzy C-means clustering method. This method starts by transforming the input data into a higher dimensional space using kernel functions and then clusters them in the feature space. The feature space, due to its inherent properties, is able to take into account the nonlinear and complex nature of polarimetric data. Several SAR polarimetric features were extracted using target decomposition algorithms. Features from the Cloude-Pottier, Freeman-Durden and Yamaguchi algorithms were used as inputs for the clustering. This method was applied to multi-temporal UAVSAR L-band images acquired over an agricultural area near Winnipeg, Canada, during June and July in 2012. The results demonstrate the efficiency of this approach with respect to the classical methods. In addition, using multi-temporal data in the clustering process helped to investigate the phenological cycle of plants and significantly improved the performance of agricultural land cover mapping.
Dark matter phenomenology of high-speed galaxy cluster collisions
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mishchenko, Yuriy; Ji, Chueng-Ryong
2017-07-29
Here, we perform a general computational analysis of possible post-collision mass distributions in high-speed galaxy cluster collisions in the presence of self-interacting dark matter. Using this analysis, we show that astrophysically weakly self-interacting dark matter can impart subtle yet measurable features in the mass distributions of colliding galaxy clusters even without significant disruptions to the dark matter halos of the colliding galaxy clusters themselves. Most profound such evidence is found to reside in the tails of dark matter halos’ distributions, in the space between the colliding galaxy clusters. Such features appear in our simulations as shells of scattered dark matter expanding in alignment with the outgoing original galaxy clusters, contributing significant densities to projected mass distributions at large distances from collision centers and large scattering angles of up to 90°. Our simulations indicate that as much as 20% of the total collision’s mass may be deposited into such structures without noticeable disruptions to the main galaxy clusters. Such structures at large scattering angles are forbidden in purely gravitational high-speed galaxy cluster collisions. Convincing identification of such structures in real colliding galaxy clusters would be a clear indication of the self-interacting nature of dark matter. Our findings may offer an explanation for the ring-like dark matter feature recently identified in the long-range reconstructions of the mass distribution of the colliding galaxy cluster CL0024+017.
Heard, Christopher J.; Heiles, Sven; Vajda, Stefan; ...
2014-08-07
We employed the novel surface mode of the Birmingham Cluster Genetic Algorithm (S-BCGA) for the global optimisation of noble metal tetramers upon an MgO(100) substrate at the GGA-DFT level of theory. The effect of element identity and alloying in surface-bound neutral subnanometre clusters is determined by energetic comparison between all compositions of Pd_nAg_(4-n) and Pd_nPt_(4-n). And while the binding strengths to the surface increase in the order Pt > Pd > Ag, the excess energy profiles suggest a preference for mixed clusters for both cases. The binding of CO is also modelled, showing that the adsorption site can be predicted solely by electrophilicity. Comparison to CO binding on a single metal atom shows a reversal of the 5s-d activation process for clusters, weakening the cluster-surface interaction on CO adsorption. Charge localisation determines homotop, CO binding and surface site preferences. Furthermore, the electronic behaviour, which is intermediate between molecular and metallic particles, allows for tunable features in the subnanometre size range.
Lu, Chi-Jie; Chang, Chi-Chang
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then passed to the K-means algorithm for clustering the sales data into several disjoint clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting. PMID:25045738
DOE Office of Scientific and Technical Information (OSTI.GOV)
Moody, Daniela Irina
An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. A Hebbian learning rule may be used to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of pixel patches over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.
Moody, Daniela; Wohlberg, Brendt
2018-01-02
An approach for land cover classification, seasonal and yearly change detection and monitoring, and identification of changes in man-made features may use a clustering of sparse approximations (CoSA) on sparse representations in learned dictionaries. The learned dictionaries may be derived using efficient convolutional sparse coding to build multispectral or hyperspectral, multiresolution dictionaries that are adapted to regional satellite image data. Sparse image representations of images over the learned dictionaries may be used to perform unsupervised k-means clustering into land cover categories. The clustering process behaves as a classifier in detecting real variability. This approach may combine spectral and spatial textural characteristics to detect geologic, vegetative, hydrologic, and man-made features, as well as changes in these features over time.
Using earthquake clusters to identify fracture zones at Puna geothermal field, Hawaii
NASA Astrophysics Data System (ADS)
Lucas, A.; Shalev, E.; Malin, P.; Kenedi, C. L.
2010-12-01
The actively producing Puna geothermal system (PGS) is located on the Kilauea East Rift Zone (ERZ), which extends out from the active Kilauea volcano on Hawaii. In the Puna area the rift trend is identified as NE-SW from surface expressions of normal faulting with a corresponding strike; at PGS the surface expression offsets in a left step, but no rift perpendicular faulting is observed. An eight station borehole seismic network has been installed in the area of the geothermal system. Since June 2006, a total of 6162 earthquakes have been located close to or inside the geothermal system. The spread of earthquake locations follows the rift trend, but down rift to the NE of PGS almost no earthquakes are observed. Most earthquakes located within the PGS range between 2-3 km depth. Up rift to the SW of PGS the number of events decreases and the depth range increases to 3-4 km. All initial locations used Hypoinverse71 and showed no trends other than the dominant rift parallel. Double difference relocation of all earthquakes, using both catalog and cross-correlation, identified one large cluster but could not conclusively identify trends within the cluster. A large number of earthquake waveforms showed identifiable shear wave splitting. For five stations out of the six where shear wave splitting was observed, the dominant polarization direction was rift parallel. Two of the five stations also showed a smaller rift perpendicular signal. The sixth station (located close to the area of the rift offset) displayed a N-S polarization, approximately halfway between rift parallel and perpendicular. The shear wave splitting time delays indicate that fracture density is higher at the PGS compared to the surrounding ERZ. Correlation co-efficient clustering with independent P and S wave windows was used to identify clusters based on similar earthquake waveforms. In total, 40 localized clusters containing ten or more events were identified. 
The largest cluster was located in the production area for the power plant. Most of the clusters had linear features when their Hypoinverse locations were plotted. The concentration of individual linear features was higher in the PGS than in the surrounding ERZ. The resolution of the features was refined further by relocating each individual cluster through the catalog double difference method. Mapping of the linear features showed that a number of the larger features ran rift parallel. However, a large number of rift perpendicular features were also identified. In the area where the anomalous (N-S) shear wave polarization was observed, a number of linear features with a similar orientation were identified. We assume that events occurring on the same fracture zone have similar source mechanisms and thus similar waveforms. It is concluded that the linear features identified by earthquake clustering are fracture zones. The orientation and concentration of the fracture zones is consistent with that of the shear wave splitting polarizations.
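Grouping events by waveform similarity, as described in the abstract above, can be sketched as follows. This is a simplified illustration: it uses the zero-lag Pearson correlation of a single window per event and single-linkage grouping with a fixed cut-off, whereas the study used independent P and S wave windows and does not state its linkage rule. The function names, the threshold, and the toy waveforms are assumptions.

```python
import math


def correlation(a, b):
    """Zero-lag Pearson correlation coefficient of two equal-length traces."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def cluster_by_similarity(waveforms, min_corr):
    """Single-linkage grouping: two events share a cluster when a chain of
    pairwise correlations >= min_corr connects them."""
    unvisited = set(range(len(waveforms)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        group, stack = [seed], [seed]
        while stack:
            i = stack.pop()
            linked = [j for j in unvisited
                      if correlation(waveforms[i], waveforms[j]) >= min_corr]
            for j in linked:
                unvisited.remove(j)
                group.append(j)
                stack.append(j)
        clusters.append(sorted(group))
    return clusters


# Toy traces (hypothetical): events 0 and 1 share one shape, 2 and 3 another.
toy_waveforms = [
    [0, 1, 0, -1, 0, 1, 0, -1],
    [0, 2, 0, -2, 0, 2, 0, -2],
    [1, 0, -1, 0, 1, 0, -1, 0],
    [2, 0, -2, 0, 2, 0, -2, 0],
]
toy_clusters = cluster_by_similarity(toy_waveforms, 0.9)
```

A real workflow would also align the traces in time (for example by cross-correlation lag) before comparing them; that step is left out here.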
Analysis of multiplex gene expression maps obtained by voxelation.
An, Li; Xie, Hongbo; Chin, Mark H; Obradovic, Zoran; Smith, Desmond J; Megalooikonomou, Vasileios
2009-04-29
Gene expression signatures in the mammalian brain hold the key to understanding neural development and neurological disease. Researchers have previously used voxelation in combination with microarrays for acquisition of genome-wide atlases of expression patterns in the mouse brain. On the other hand, some work has been performed on studying gene functions, without taking into account the location information of a gene's expression in a mouse brain. In this paper, we present an approach for identifying the relation between gene expression maps obtained by voxelation and gene functions. To analyze the dataset, we chose typical genes as queries and aimed at discovering similar gene groups. Gene similarity was determined by using the wavelet features extracted from the left and right hemispheres averaged gene expression maps, and by the Euclidean distance between each pair of feature vectors. We also performed a multiple clustering approach on the gene expression maps, combined with hierarchical clustering. Among each group of similar genes and clusters, the gene function similarity was measured by calculating the average gene function distances in the gene ontology structure. By applying our methodology to find similar genes to certain target genes we were able to improve our understanding of gene expression patterns and gene functions. By applying the clustering analysis method, we obtained significant clusters, which have both very similar gene expression maps and very similar gene functions respectively to their corresponding gene ontologies. The cellular component ontology resulted in prominent clusters expressed in cortex and corpus callosum. The molecular function ontology gave prominent clusters in cortex, corpus callosum and hypothalamus. The biological process ontology resulted in clusters in cortex, hypothalamus and choroid plexus. Clusters from all three ontologies combined were most prominently expressed in cortex and corpus callosum. 
The experimental results confirm the hypothesis that genes with similar gene expression maps might have similar gene functions. The voxelation data takes into account the location information of gene expression level in mouse brain, which is novel in related research. The proposed approach can potentially be used to predict gene functions and provide helpful suggestions to biologists.
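The query step described in the abstract above, ranking genes by the Euclidean distance between their feature vectors, can be sketched as below. The wavelet feature extraction itself is not shown; the feature vectors are assumed to be given, and the gene names and values are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def most_similar(features, query, k):
    """Names of the k genes whose feature vectors lie closest to the query."""
    others = [(euclidean(features[query], vec), name)
              for name, vec in features.items() if name != query]
    return [name for _, name in sorted(others)[:k]]


# Toy feature vectors (hypothetical), standing in for wavelet features
# extracted from each gene's expression map.
toy_features = {
    "geneA": [1.0, 0.0, 2.0],
    "geneB": [1.1, 0.1, 2.0],
    "geneC": [5.0, 5.0, 5.0],
    "geneD": [0.9, 0.0, 2.2],
}
neighbours = most_similar(toy_features, "geneA", 2)
```

The same pairwise distances could feed a hierarchical clustering routine, which is the second analysis the abstract mentions.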
NASA Astrophysics Data System (ADS)
Ng, Tony T.
The mammalian cortex is a highly structured network of densely packed neurons that interact strongly with each other in very specific ways. Loosely speaking, neurons are cells that fire clicks at each other as a means of communication. Common sites of communication, known as synapses, are enabled by transmitter molecules released from presynaptic sender cells, which bind to receptors on postsynaptic receiver cells. There are two major classes of neurons - excitatory ones that prompt their downstream neighbors to fire spikes through depolarization, and inhibitory ones that suppress spike activity of their postsynaptic partners via hyperpolarization. Depolarization and hyperpolarization make membrane potential of a cell more positive and more negative, respectively. A sufficiently depolarized neuron fires a spike, which technically is called an action potential. In this thesis, we focus on the interplay between three of the cortex's most ubiquitous features and examine some of the consequences that their interactions have on cortical dynamics. One of the features, widespread projections between clusters of excitatory neurons, is topological. The two remaining features, homeostasis and balance between the amount of excitatory and inhibitory activity are dynamical. Here, homeostasis refers to the regulatory mechanism of individual cells or collections of cells that maintains constant levels of spike activity over time. Simply by varying the average homeostatic firing rate in clusters of excitatory neurons or by tuning the common homoeostatic rate of individual inhibitory neurons, we show via simulation that cluster-based activity bursts can exhibit critical dynamics and display power-law distributions with exponents that are consistent with those found in in vivo experiments of awake animals. Criticality is an idea that originated in statistical physics. 
At the critical point, activity levels of sites across an entire system, such as those of different cortical regions across the brain, can dynamically correlate not only over short distances, but also over large distances. The spatial extent of time-varying signal propagation can range from a couple of regions to a dozen regions to hundreds and thousands of regions and beyond. It has been shown in previous studies that size of a network's pattern repertoire, degree of information transmission from stimuli to responses, and potential to respond to a large range of stimulus intensities, are maximized at the critical state. In addition to demonstrating the presence of criticality in our class of networks, we show that (1) another pervasive connectivity motif in the cortex is incapable of supporting criticality, (2) excitation-inhibition balance modulates the distribution of spike-based bursts of various sizes, (3) how critical dynamics at the cluster level emerges from excitation-inhibition balance, and (4) how we can reconcile differences in burst statistics at spike-based and cluster-based levels observed in animal experiments.
Fast detection of vascular plaque in optical coherence tomography images using a reduced feature set
NASA Astrophysics Data System (ADS)
Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif
2018-03-01
Optical coherence tomography (OCT) images can be used to detect vascular plaque by applying the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, using the full set of 26 textural features is computationally expensive and may not be feasible for real-time implementation. In this work, we identified a reduced set of 3 textural features that characterizes vascular plaque and used a generalized Fuzzy C-means clustering algorithm. Our work involves three steps: 1) the reduction of the full set of 26 textural features to a reduced set of 3 textural features using a genetic algorithm (GA) optimization method, 2) the implementation of an unsupervised generalized clustering algorithm (Fuzzy C-means) on the reduced feature space, and 3) the validation of our results using histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real-time OCT imaging.
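As a rough illustration of the clustering step on a 3-feature space, the standard Fuzzy C-means updates can be sketched as below. This is the textbook algorithm, not the authors' "generalized" variant, which the abstract does not specify; the feature vectors, the fuzzifier m = 2, and the function names are assumptions made for the example.

```python
import math

def memberships(points, centers, m):
    """Fuzzy membership of every point in every cluster (each row sums to 1)."""
    exponent = 2.0 / (m - 1.0)
    table = []
    for p in points:
        # Guard against a zero distance when a point sits exactly on a center.
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        table.append([1.0 / sum((di / dk) ** exponent for dk in d) for di in d])
    return table

def fuzzy_c_means(points, centers, m=2.0, iters=50):
    """Alternate membership and center updates for a fixed number of rounds."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        u = memberships(points, centers, m)
        new_centers = []
        for i in range(len(centers)):
            w = [row[i] ** m for row in u]  # weights u_ij^m for cluster i
            total = sum(w)
            new_centers.append(tuple(
                sum(wj * p[t] for wj, p in zip(w, points)) / total
                for t in range(len(points[0]))))
        centers = new_centers
    return centers, memberships(points, centers, m)

# Hypothetical 3-feature texture vectors for six image patches.
features = [(0.1, 0.2, 0.0), (0.0, 0.1, 0.2), (0.2, 0.0, 0.1),
            (5.0, 5.1, 4.9), (4.9, 5.0, 5.2), (5.1, 4.8, 5.0)]
centers, u = fuzzy_c_means(features, [features[0], features[-1]])
print(centers)
```

Each row of `u` gives one patch's degree of membership in each cluster, so a hard label can be taken as the column with the largest value.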
A Clustering-Based Approach for Evaluation of EO Image Indexing
NASA Astrophysics Data System (ADS)
Bahmanyar, R.; Rigoll, G.; Datcu, M.
2013-09-01
The volume of Earth Observation data is increasing immensely, on the order of several terabytes a day. Therefore, to explore and investigate the content of this huge amount of data, more sophisticated Content-Based Information Retrieval (CBIR) systems are in high demand. These systems should be able not only to discover unknown structures behind the data, but also to provide relevant results to the users' queries. Since in any retrieval system the images are processed based on a discrete set of their features (i.e., feature descriptors), study and assessment of the structure of the feature space, built by different feature descriptors, is of high importance. In this paper, we introduce a clustering-based approach to study the content of image collections. In our approach, we claim that using both internal and external evaluation of clusters for different feature descriptors helps to understand the structure of the feature space. Moreover, the users' semantic understanding of the images can also be assessed. To validate the performance of our approach, we used an annotated Synthetic Aperture Radar (SAR) image collection. Quantitative results, together with the visualization of the feature space, demonstrate the applicability of our approach.
NASA Astrophysics Data System (ADS)
Houdashelt, M. L.
1992-05-01
Initial results are presented from an examination of near-infrared spectra (6800 - 9200 Angstroms) of 34 early-type galaxies - 17 in the Virgo cluster, 10 in the Coma cluster and seven field members. It has previously been speculated that E/S0 galaxies of similar luminosity in the Virgo and Coma clusters have different red stellar populations. To explore this possibility, pseudo-equivalent widths of a number of near-IR spectral features have been measured. The important features studied include the TiO bands near 7100, 7890, 8197, 8500 and 8950 Angstroms, which are mainly produced by the late-type stars whose flux contributes only about 10-20% in the near-IR. The strengths of the Ca triplet (8498, 8542, 8662 Angstroms) and Na I doublet (8183, 8195 Angstroms) are also measured, since these features are affected by the relative contribution of dwarf stars to the red light. Although the main focus of this work is the search for spectral differences among the Coma, Virgo and field E/S0 populations, each subgroup of galaxies (and the sample as a whole) is also examined for correlations among the feature strengths, galaxy color and luminosity.
Non-specific filtering of beta-distributed data.
Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori; Groshen, Susan; Siegmund, Kimberly D
2014-06-19
Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis. In DNA methylation studies, DNA methylation is measured as a proportion, bounded between 0 and 1, with variance a function of the mean. Filtering on standard deviation biases the selection of probes to those with mean values near 0.5. We explore the effect this has on clustering, and develop alternate filter methods that utilize a variance stabilizing transformation for Beta distributed data and do not share this bias. We compared results for 11 different non-specific filters on eight Infinium HumanMethylation data sets, selected to span a variety of biological conditions. We found that for data sets having a small fraction of samples showing abnormal methylation of a subset of normally unmethylated CpGs, a characteristic of the CpG island methylator phenotype in cancer, a novel filter statistic that utilized a variance-stabilizing transformation for Beta distributed data outperformed the common filter of using standard deviation of the DNA methylation proportion, or its log-transformed M-value, in its ability to detect the cancer subtype in a cluster analysis. However, the standard deviation filter always performed among the best for distinguishing subgroups of normal tissue. The novel filter and standard deviation filter tended to favour features in different genome contexts; for the same data set, the novel filter always selected more features from CpG island promoters and the standard deviation filter always selected more features from non-CpG island intergenic regions. Interestingly, despite selecting largely non-overlapping sets of features, the two filters did find sample subsets that overlapped for some real data sets. 
We found two different filter statistics that tended to prioritize features with different characteristics, each performed well for identifying clusters of cancer and non-cancer tissue, and identifying a cancer CpG island hypermethylation phenotype. Since cluster analysis is for discovery, we would suggest trying both filters on any new data sets, evaluating the overlap of features selected and clusters discovered.
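The bias of a standard-deviation filter toward probes with mean near 0.5 can be seen in a small sketch. Note the assumptions: the abstract does not state which variance-stabilizing transformation the authors used, so the classical arcsine square-root transform for proportions is substituted here purely as a stand-in, and the probe names and methylation values are invented for illustration.

```python
import math
from statistics import pstdev

def sd_score(values):
    """Common filter: standard deviation of the raw methylation proportions."""
    return pstdev(values)

def vst_score(values):
    """Standard deviation after an arcsine square-root transform
    (an assumed stand-in for the paper's variance-stabilizing transform)."""
    return pstdev([math.asin(math.sqrt(v)) for v in values])

def top_probes(data, k, score):
    """Return the k probe names with the largest filter score."""
    return sorted(data, key=lambda name: score(data[name]), reverse=True)[:k]

# Hypothetical probes: one hovering around 0.5, one normally unmethylated
# with a single sample showing abnormal methylation.
data = {
    "mid": [0.45, 0.50, 0.55, 0.50, 0.45, 0.55],
    "low": [0.01, 0.02, 0.01, 0.02, 0.10, 0.01],
}
print(top_probes(data, 1, sd_score))
print(top_probes(data, 1, vst_score))
```

With these made-up numbers the raw standard deviation prefers the probe near 0.5, while the transformed score prefers the mostly unmethylated probe with one abnormal sample, which mirrors the kind of difference the abstract describes.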
TU-CD-BRB-12: Radiogenomics of MRI-Guided Prostate Cancer Biopsy Habitats
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoyanova, R; Lynne, C; Abraham, S
2015-06-15
Purpose: Diagnostic prostate biopsies are subject to sampling bias. We hypothesize that quantitative imaging with multiparametric (MP)-MRI can more accurately direct targeted biopsies to index lesions associated with highest risk clinical and genomic features. Methods: Regionally distinct prostate habitats were delineated on MP-MRI (T2-weighted, perfusion and diffusion imaging). Directed biopsies were performed on 17 habitats from 6 patients using MRI-ultrasound fusion. Biopsy location was characterized with 52 radiographic features. Transcriptome-wide analysis of 1.4 million RNA probes was performed on RNA from each habitat. Genomics features with insignificant expression values (<0.25) and interquartile range <0.5 were filtered, leaving a total of 212 genes. Correlation between imaging features, genes and a 22 feature genomic classifier (GC), developed as a prognostic assay for metastasis after radical prostatectomy, was investigated. Results: High quality genomic data was derived from 17 (100%) biopsies. Using the 212 ‘unbiased’ genes, the samples clustered by patient origin in unsupervised analysis. When only prostate cancer related genomic features were used, hierarchical clustering revealed samples clustered by needle-biopsy Gleason score (GS). Similarly, principal component analysis of the imaging features found the primary source of variance segregated the samples into high (≥7) and low (6) GS. Pearson’s correlation analysis of genes with significant expression showed two main patterns of gene expression clustering prostate peripheral and transitional zone MRI features. Two-way hierarchical clustering of GC with radiomics features resulted in the expected groupings of high and low expressed genes in this metastasis signature. Conclusions: MP-MRI-targeted diagnostic biopsies can potentially improve risk stratification by directing pathological and genomic analysis to clinically significant index lesions. 
As determinant lesions are more reliably identified, targeting with radiotherapy should improve outcome. This is the first demonstration of a link between quantitative imaging features (radiomics) with genomic features in MRI-directed prostate biopsies. The research was supported by NIH-NCI R01 CA 189295 and R01 CA 189295; E Davicioni is partial owner of GenomeDx Biosciences, Inc. M Takhar, N Erho, L Lam, C Buerki and E Davicioni are current employees at GenomeDx Biosciences, Inc.
A Wavelet-Based Methodology for Grinding Wheel Condition Monitoring
DOE Office of Scientific and Technical Information (OSTI.GOV)
Liao, T. W.; Ting, C.F.; Qu, Jun
2007-01-01
Grinding wheel surface condition changes as more material is removed. This paper presents a wavelet-based methodology for grinding wheel condition monitoring based on acoustic emission (AE) signals. Grinding experiments in creep feed mode were conducted to grind alumina specimens with a resinoid-bonded diamond wheel using two different conditions. During the experiments, AE signals were collected when the wheel was 'sharp' and when the wheel was 'dull'. Discriminant features were then extracted from each raw AE signal segment using the discrete wavelet decomposition procedure. An adaptive genetic clustering algorithm was finally applied to the extracted features in order to distinguish different states of grinding wheel condition. The test results indicate that the proposed methodology can achieve 97% clustering accuracy for the high material removal rate condition, 86.7% for the low material removal rate condition, and 76.7% for the combined grinding conditions if the base wavelet, the decomposition level, and the GA parameters are properly selected.
Nanoclusters as a new family of high temperature superconductors (Conference Presentation)
NASA Astrophysics Data System (ADS)
Halder, Avik; Kresin, Vitaly V.
2017-03-01
Electrons in metal clusters organize into quantum shells, akin to atomic shells in the periodic table. Such nanoparticles are referred to as "superatoms". The electronic shell levels are highly degenerate giving rise to sharp peaks in the density of states, which can enable exceptionally strong electron pairing in certain clusters containing tens to hundreds of atoms. A spectroscopic investigation of size - resolved aluminum nanoclusters has revealed a sharp rise in the density of states near the Fermi level as the temperature decreases towards 100 K. The effect is especially prominent in the closed-shell "magic" cluster Al66 [1, 2]. The characteristics of this behavior are fully consistent with a pairing transition, implying a high temperature superconducting state with Tc < 100K. This value exceeds that of bulk aluminum by two orders of magnitude. As a new class of high-temperature superconductors, such metal nanocluster particles are promising building blocks for high-Tc materials, devices, and networks. ---------- 1. Halder, A., Liang, A., Kresin, V. V. A novel feature in aluminum cluster photoionization spectra and possibility of electron pairing at T 100K. Nano Lett 15, 1410 - 1413 (2015) 2. Halder, A., Kresin, V. V. A transition in the density of states of metal "superatom" nanoclusters and evidence for superconducting pairing at T 100K. Phys. Rev. B 92, 214506 (2015).
Grimsley, Jasmine M S; Gadziola, Marie A; Wenstrup, Jeffrey J
2012-01-01
Mouse pups vocalize at high rates when they are cold or isolated from the nest. The proportions of each syllable type produced carry information about disease state and are being used as behavioral markers for the internal state of animals. Manual classifications of these vocalizations identified 10 syllable types based on their spectro-temporal features. However, manual classification of mouse syllables is time consuming and vulnerable to experimenter bias. This study uses an automated cluster analysis to identify acoustically distinct syllable types produced by CBA/CaJ mouse pups, and then compares the results to prior manual classification methods. The cluster analysis identified two syllable types, based on their frequency bands, that have continuous frequency-time structure, and two syllable types featuring abrupt frequency transitions. Although cluster analysis computed fewer syllable types than manual classification, the clusters represented well the probability distributions of the acoustic features within syllables. These probability distributions indicate that some of the manually classified syllable types are not statistically distinct. The characteristics of the four classified clusters were used to generate a Microsoft Excel-based mouse syllable classifier that rapidly categorizes syllables, with over a 90% match, into the syllable types determined by cluster analysis.
Vajda, Szilárd; Rangoni, Yves; Cecotti, Hubert
2015-01-01
For training supervised classifiers to recognize different patterns, large data collections with accurate labels are necessary. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. In order to speed up the creation of a large-scale ground truth, the method combines unsupervised clustering and minimal expert knowledge. To exploit the potential discriminant complementarities across features, each character is projected into five different feature spaces. After clustering the images in each feature space, the human expert labels the cluster centers. Each data point inherits the label of its cluster’s center. A majority (or unanimity) vote decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters produced by the chosen clustering approach. To test the efficiency of the proposed approach, we compared and evaluated three state-of-the-art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set and a Lampung Indonesian character data set. Considering a k-nn classifier, we show that manually labeling only 1.3% (MNIST) and 3.2% (Lampung) of the training data provides the same range of performance as a completely labeled data set would. PMID:25870463
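The voting step of this labeling scheme can be sketched in a few lines. The sketch assumes that clustering has already been run in each feature space and that an expert has labeled every cluster center; the function name, the data layout, and the choice to return `None` when no strict majority (or unanimity) is reached are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter

def vote_label(cluster_ids, center_labels, unanimous=False):
    """Combine the labels a character inherits in each feature space.

    cluster_ids[s]      -- cluster index of the character in feature space s
    center_labels[s][c] -- expert label given to center c of feature space s
    Returns the winning label, or None when the vote is not decisive.
    """
    votes = [center_labels[s][c] for s, c in enumerate(cluster_ids)]
    label, count = Counter(votes).most_common(1)[0]
    if unanimous:
        return label if count == len(votes) else None
    return label if count > len(votes) / 2 else None

# Hypothetical expert labels for the cluster centers of five feature spaces.
center_labels = [
    {0: "3", 1: "8"},
    {0: "8", 1: "3"},
    {0: "3", 1: "8"},
    {0: "3", 1: "8"},
    {0: "8", 1: "3"},
]
# One character image: its cluster index in each of the five spaces.
cluster_ids = [0, 1, 1, 0, 1]
print(vote_label(cluster_ids, center_labels))
print(vote_label(cluster_ids, center_labels, unanimous=True))
```

Only the cluster centers need a human label, so the labeling effort scales with the number of clusters rather than the number of images.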
NASA Astrophysics Data System (ADS)
Saha, P.; Rahane, A. B.; Kumar, V.; Sukumar, N.
2016-05-01
Boron atomic clusters show several interesting and unusual size-dependent features due to the small covalent radius, electron deficiency, and higher coordination number of boron as compared to carbon. These include aromaticity and a diverse array of structures such as quasi-planar, ring or tubular shaped, and fullerene-like. In the present work, we have analyzed features of the computed electron density distributions of small boron clusters having up to 11 boron atoms, and investigated the effect of doping with C, P, Al, Si, and Zn atoms on their structural and physical properties, in order to understand the bonding characteristics and discern trends in bonding and stability. We find that in general there are covalent bonds as well as delocalized charge distribution in these clusters. We associate the strong stability of some of these planar/quasiplanar disc-type clusters with the electronic shell closing with effectively twelve delocalized valence electrons using a disc-shaped jellium model. B9-, B10, B7P, and B8Si, in particular, are found to be exceptional with very large gaps between the highest occupied molecular orbital and the lowest unoccupied molecular orbital, and these are suggested to be magic clusters.
Automatic detection of erythemato-squamous diseases using k-means clustering.
Ubeyli, Elif Derya; Doğdu, Erdoğan
2010-04-01
A new approach based on the implementation of k-means clustering is presented for automated detection of erythemato-squamous diseases. The purpose of clustering techniques is to find a structure for the given data by finding similarities between data according to data characteristics. The studied domain contained records of patients with known diagnosis. The k-means clustering algorithm's task was to classify the data points, in this case the patients with attribute data, to one of the five clusters. The algorithm was used to detect the five erythemato-squamous diseases when 33 features defining five disease indications were used. The purpose is to determine an optimum classification scheme for this problem. The present research demonstrated that the features well represent the erythemato-squamous diseases and the k-means clustering algorithm's task achieved high classification accuracies for only five erythemato-squamous diseases.
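A minimal sketch of how k-means output can be scored against known diagnoses is given below. It uses plain Lloyd iterations with caller-supplied initial centers and maps each cluster to its most common diagnosis; the two-feature records, the two generic disease names, and the scoring rule are assumptions chosen for brevity (the study itself used 33 features and five diseases, and its exact evaluation procedure is not given in the abstract).

```python
import math
from collections import Counter

def assign_points(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]

def k_means(points, centers, iters=20):
    """Lloyd's algorithm starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        assign = assign_points(points, centers)
        new_centers = []
        for i in range(len(centers)):
            members = [p for p, a in zip(points, assign) if a == i]
            # Keep the old center if a cluster happens to become empty.
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[i])
        if new_centers == centers:
            break
        centers = new_centers
    return assign_points(points, centers), centers

def cluster_accuracy(assign, diagnoses):
    """Fraction of records matching the majority diagnosis of their cluster."""
    correct = 0
    for c in set(assign):
        labels = [d for a, d in zip(assign, diagnoses) if a == c]
        correct += Counter(labels).most_common(1)[0][1]
    return correct / len(assign)

# Hypothetical patient records (two attributes each) with known diagnoses.
records = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (4.0, 4.0), (4.1, 3.9), (3.9, 4.2)]
diagnoses = ["disease A"] * 3 + ["disease B"] * 3
assign, centers = k_means(records, [records[0], records[3]])
print(cluster_accuracy(assign, diagnoses))
```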
State estimation and prediction using clustered particle filters.
Lee, Yoonsang; Majda, Andrew J
2016-12-20
Particle filtering is an essential tool to improve uncertain model predictions by incorporating noisy observational data from complex systems including non-Gaussian features. A class of particle filters, clustered particle filters, is introduced for high-dimensional nonlinear systems, which uses relatively few particles compared with the standard particle filter. The clustered particle filter captures non-Gaussian features of the true signal, which are typical in complex nonlinear dynamical systems such as geophysical systems. The method is also robust in the difficult regime of high-quality sparse and infrequent observations. The key features of the clustered particle filtering are coarse-grained localization through the clustering of the state variables and particle adjustment to stabilize the method; each observation affects only neighbor state variables through clustering and particles are adjusted to prevent particle collapse due to high-quality observations. The clustered particle filter is tested for the 40-dimensional Lorenz 96 model with several dynamical regimes including strongly non-Gaussian statistics. The clustered particle filter shows robust skill in both achieving accurate filter results and capturing non-Gaussian statistics of the true signal. It is further extended to multiscale data assimilation, which provides the large-scale estimation by combining a cheap reduced-order forecast model and mixed observations of the large- and small-scale variables. This approach enables the use of a larger number of particles due to the computational savings in the forecast model. The multiscale clustered particle filter is tested for one-dimensional dispersive wave turbulence using a forecast model with model errors. PMID:27930332
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
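One way to picture the "hills and valleys" idea is to count the modes of each feature's one-dimensional histogram and treat features with more than one hill as potentially useful for clustering. The sketch below is an interpretation only: the report's actual evaluation criterion is not given in the abstract, and the bin count, the `min_count` threshold, and the sample values are assumptions.

```python
def histogram(values, bins, lo, hi):
    """Equal-width histogram counts over the interval [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp v == hi into last bin
        counts[idx] += 1
    return counts

def count_hills(counts, min_count=2):
    """Count local maxima (hills) separated by lower bins (valleys)."""
    # Collapse plateaus so that a flat-topped hill is counted only once.
    collapsed = [c for i, c in enumerate(counts) if i == 0 or c != counts[i - 1]]
    padded = [0] + collapsed + [0]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i] >= min_count
               and padded[i] > padded[i - 1]
               and padded[i] > padded[i + 1])

# Hypothetical samples of two features, both scaled to [0, 1].
bimodal = [0.05, 0.10, 0.15, 0.12, 0.08, 0.85, 0.90, 0.95, 0.92, 0.88]
unimodal = [0.45, 0.50, 0.55, 0.52, 0.48, 0.50, 0.47, 0.53, 0.51, 0.49]

for name, values in [("bimodal", bimodal), ("unimodal", unimodal)]:
    print(name, count_hills(histogram(values, bins=5, lo=0.0, hi=1.0)))
```

A feature whose histogram shows a single hill gives no evidence of separating the data along that axis, so under this reading it would be a candidate for removal.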
Quantifying site-specific physical heterogeneity within an estuarine seascape
Kennedy, Cristina G.; Mather, Martha E.; Smith, Joseph M.
2017-01-01
Quantifying physical heterogeneity is essential for meaningful ecological research and effective resource management. Spatial patterns of multiple, co-occurring physical features are rarely quantified across a seascape because of methodological challenges. Here, we identified approaches that measured total site-specific heterogeneity, an often overlooked aspect of estuarine ecosystems. Specifically, we examined 23 metrics that quantified four types of common physical features: (1) river and creek confluences, (2) bathymetric variation including underwater drop-offs, (3) land features such as islands/sandbars, and (4) major underwater channel networks. Our research at 40 sites throughout Plum Island Estuary (PIE) provided solutions to two problems. The first problem was that individual metrics that measured heterogeneity of a single physical feature showed different regional patterns. We solved this first problem by combining multiple metrics for a single feature using a within-physical feature cluster analysis. With this approach, we identified sites with four different types of confluences and three different types of underwater drop-offs. The second problem was that when multiple physical features co-occurred, new patterns of total site-specific heterogeneity were created across the seascape. This pattern of total heterogeneity has potential ecological relevance to structure-oriented predators. To address this second problem, we identified sites with similar types of total physical heterogeneity using an across-physical feature cluster analysis. Then, we calculated an additive heterogeneity index, which integrated all physical features at a site. Finally, we tested if site-specific additive heterogeneity index values differed for across-physical feature clusters. 
In PIE, the sites with the highest additive heterogeneity index values were clustered together and corresponded to sites where a fish predator, adult striped bass (Morone saxatilis), aggregated in a related acoustic tracking study. In summary, we have shown general approaches to quantifying site-specific heterogeneity.
Analyzing Sub-Classifications of Glaucoma via SOM Based Clustering of Optic Nerve Images.
Yan, Sanjun; Abidi, Syed Sibte Raza; Artes, Paul Habib
2005-01-01
We present a data mining framework to cluster optic nerve images obtained by Confocal Scanning Laser Tomography (CSLT) in normal subjects and patients with glaucoma. We use self-organizing maps and expectation maximization methods to partition the data into clusters that provide insights into potential sub-classification of glaucoma based on morphological features. We conclude that our approach provides a first step towards a better understanding of morphological features in optic nerve images obtained from glaucoma patients and healthy controls.
Efficient architecture for spike sorting in reconfigurable hardware.
Hwang, Wen-Jyi; Lee, Wei-Hao; Lin, Shiow-Jyu; Lai, Sheng-Ying
2013-11-01
This paper presents a novel hardware architecture for fast spike sorting. The architecture is able to perform both the feature extraction and clustering in hardware. The generalized Hebbian algorithm (GHA) and fuzzy C-means (FCM) algorithm are used for feature extraction and clustering, respectively. The employment of GHA allows efficient computation of principal components for subsequent clustering operations. The FCM is able to achieve near optimal clustering for spike sorting. Its performance is insensitive to the selection of initial cluster centers. The hardware implementations of GHA and FCM feature low area costs and high throughput. In the GHA architecture, the computation of different weight vectors shares the same circuit to lower the area costs. Moreover, in the FCM hardware implementation, the usual iterative operations for updating the membership matrix and cluster centroid are merged into one single updating process to avoid the large storage requirement. To show the effectiveness of the circuit, the proposed architecture is physically implemented on a field programmable gate array (FPGA). It is embedded in a System-on-Chip (SOC) platform for performance measurement. Experimental results show that the proposed architecture is an efficient spike sorting design that attains a high correct classification rate and high-speed computation.
Yang, Guang; Nawaz, Tahir; Barrick, Thomas R; Howe, Franklyn A; Slabaugh, Greg
2015-12-01
Many approaches have been considered for automatic grading of brain tumors by means of pattern recognition with magnetic resonance spectroscopy (MRS). Providing an improved technique which can assist clinicians in accurately identifying brain tumor grades is our main objective. The proposed technique, which is based on the discrete wavelet transform (DWT) of whole-spectral or subspectral information of key metabolites, combined with unsupervised learning, inspects the separability of the extracted wavelet features from the MRS signal to aid the clustering. In total, we included 134 short echo time single voxel MRS spectra (SV MRS) in our study that cover normal controls, low grade and high grade tumors. The combination of DWT-based whole-spectral or subspectral analysis and unsupervised clustering achieved an overall clustering accuracy of 94.8% and a balanced error rate of 7.8%. To the best of our knowledge, it is the first study using DWT combined with unsupervised learning to cluster brain SV MRS. Instead of dimensionality reduction on SV MRS or feature selection using model fitting, our study provides an alternative method of extracting features to obtain promising clustering results.
High-dimensional cluster analysis with the Masked EM Algorithm
Kadir, Shabnam N.; Goodman, Dan F. M.; Harris, Kenneth D.
2014-01-01
Cluster analysis faces two problems in high dimensions: first, the “curse of dimensionality” that can lead to overfitting and poor generalization performance; and second, the sheer time taken for conventional algorithms to process large amounts of high-dimensional data. We describe a solution to these problems, designed for the application of “spike sorting” for next-generation high channel-count neural probes. In this problem, only a small subset of features provide information about the cluster membership of any one data vector, but this informative feature subset is not the same for all data points, rendering classical feature selection ineffective. We introduce a “Masked EM” algorithm that allows accurate and time-efficient clustering of up to millions of points in thousands of dimensions. We demonstrate its applicability to synthetic data, and to real-world high-channel-count spike sorting data. PMID:25149694
Automated thematic mapping and change detection of ERTS-A images
NASA Technical Reports Server (NTRS)
Gramenopoulos, N. (Principal Investigator)
1975-01-01
The author has identified the following significant results. In the first part of the investigation, spatial and spectral features were developed which were employed to automatically recognize terrain features through a clustering algorithm. In this part of the investigation, the size of the cell which is the number of digital picture elements used for computing the spatial and spectral features was varied. It was determined that the accuracy of terrain recognition decreases slowly as the cell size is reduced and coincides with increased cluster diffuseness. It was also proven that a cell size of 17 x 17 pixels when used with the clustering algorithm results in high recognition rates for major terrain classes. ERTS-1 data from five diverse geographic regions of the United States were processed through the clustering algorithm with 17 x 17 pixel cells. Simple land use maps were produced and the average terrain recognition accuracy was 82 percent.
NASA Astrophysics Data System (ADS)
Borgelt, Christian
In clustering we often face the situation that only a subset of the available attributes is relevant for forming clusters, even though this may not be known beforehand. In such cases it is desirable to have a clustering algorithm that automatically weights attributes or even selects a proper subset. In this paper I study such an approach for fuzzy clustering, which is based on the idea to transfer an alternative to the fuzzifier (Klawonn and Höppner, What is fuzzy about fuzzy clustering? Understanding and improving the concept of the fuzzifier, In: Proc. 5th Int. Symp. on Intelligent Data Analysis, 254-264, Springer, Berlin, 2003) to attribute weighting fuzzy clustering (Keller and Klawonn, Int J Uncertain Fuzziness Knowl Based Syst 8:735-746, 2000). In addition, by reformulating Gustafson-Kessel fuzzy clustering, a scheme for weighting and selecting principal axes can be obtained. While in Borgelt (Feature weighting and feature selection in fuzzy clustering, In: Proc. 17th IEEE Int. Conf. on Fuzzy Systems, IEEE Press, Piscataway, NJ, 2008) I already presented such an approach for a global selection of attributes and principal axes, this paper extends it to a cluster-specific selection, thus arriving at a fuzzy subspace clustering algorithm (Parsons, Haque, and Liu, 2004).
Wang, Jiaojiao; Cao, Zhidong; Zeng, Daniel Dajun; Wang, Quanyi; Wang, Xiaoli; Qian, Haikun
2014-01-01
Hand, foot, and mouth disease (HFMD) mostly affects the health of infants and preschool children. Many studies of HFMD in different regions have been published. However, the epidemiological characteristics and space-time patterns of individual-level HFMD cases in a major city such as Beijing are unknown. The objective of this study was to investigate epidemiological features and identify high relative risk space-time HFMD clusters at a fine spatial scale. Detailed information on age, occupation, pathogen and gender was used to analyze the epidemiological features of HFMD epidemics. Data on individual-level HFMD cases were examined using Local Indicators of Spatial Association (LISA) analysis to identify the spatial autocorrelation of HFMD incidence. Spatial filtering combined with scan statistics methods was used to detect HFMD clusters. A total of 157,707 HFMD cases (60.25% were male, 39.75% were female) reported in Beijing from 2008 to 2012 included 1465 severe cases and 33 fatal cases. The annual average incidence rate was 164.3 per 100,000 (range: 104.2 in 2008 to 231.5 in 2010). Male incidence was higher than female incidence for the 0 to 14-year age group, and 93.88% were nursery children or lived at home. Areas at a higher relative risk were mainly located in the urban-rural transition zones (the percentage of the population at risk ranged from 33.89% in 2011 to 39.58% in 2012) showing High-High positive spatial association for HFMD incidence. The most likely space-time cluster was located in the mid-east part of the Fangshan district, southwest of Beijing. The space-time patterns of Beijing HFMD (2008-2012) were relatively steady. The population at risk was mainly distributed in the urban-rural transition zones. Epidemiological features of Beijing HFMD were generally consistent with previous research. The findings generated computational insights useful for disease surveillance, risk assessment and early warning.
Clustering Gene Expression Regulators: New Approach to Disease Subtyping
Pyatnitskiy, Mikhail; Mazo, Ilya; Shkrob, Maria; Schwartz, Elena; Kotelnikova, Ekaterina
2014-01-01
One of the main challenges in modern medicine is to stratify different patient groups in terms of underlying disease molecular mechanisms so as to develop a more personalized approach to therapy. Here we propose a novel method for disease subtyping based on analysis of activated expression regulators on a sample-by-sample basis. Our approach relies on the Sub-Network Enrichment Analysis (SNEA) algorithm, which identifies gene subnetworks with significant concordant changes in expression between two conditions. Each subnetwork consists of a central regulator and downstream genes connected by relations extracted from a global literature-extracted regulation database. Regulators found in each patient separately are clustered together and assigned activity scores, which are used for the final patient grouping. We show that our approach performs well compared to other related methods and at the same time provides researchers with a complementary level of understanding of the pathway-level biology behind a disease by identifying significant expression regulators. We observed a reasonable grouping of neuromuscular disorders (triggered by structural damage vs triggered by unknown mechanisms) that was not revealed using standard expression profile clustering. In another experiment we were able to suggest the clusters of regulators responsible for colorectal carcinoma vs adenoma discrimination and to identify frequently genetically changed regulators that could be of specific importance for the individual characteristics of cancer development. The proposed approach can be regarded as biologically meaningful feature selection, reducing tens of thousands of genes down to dozens of clusters of regulators. The obtained clusters of regulators make it possible to generate valuable biological hypotheses about molecular mechanisms related to a clinical outcome for an individual patient. PMID:24416320
Kalpathy-Cramer, Jayashree; Hersh, William
2008-01-01
In 2006 and 2007, Oregon Health & Science University (OHSU) participated in the automatic image annotation task for medical images at ImageCLEF, an annual international benchmarking event that is part of the Cross Language Evaluation Forum (CLEF). The goal of the automatic annotation task was to classify 1000 test images based on the Image Retrieval in Medical Applications (IRMA) code, given a set of 10,000 training images. There were 116 distinct classes in 2006 and 2007. We evaluated the efficacy of a variety of primarily global features for this classification task. These included features based on histograms, gray level correlation matrices and the gist technique. A multitude of classifiers including k-nearest neighbors, two-level neural networks, support vector machines, and maximum likelihood classifiers were evaluated. Our official error rate for the 1000 test images was 26% in 2006 using the flat classification structure. The error count in 2007 was 67.8 using the hierarchical classification error computation based on the IRMA code. Confusion matrices as well as clustering experiments were used to identify visually similar classes. The use of the IRMA code did not help us in the classification task, as the semantic hierarchy of the IRMA classes did not correspond well with the hierarchy based on clustering of image features that we used. Our most frequent misclassification errors were along the view axis. Subsequent experiments based on a two-stage classification system decreased our error rate to 19.8% for the 2006 dataset and our error count to 55.4 for the 2007 data. PMID:19884953
A Typology of Social Workers in Long-Term Care Facilities in Israel.
Lev, Sagit; Ayalon, Liat
2018-04-01
This article explores moral distress among long-term care facility (LTCF) social workers by examining the relationships between moral distress and environmental and personal features. Based on these features, authors identified a typology of LTCF social workers and how they handle moral distress. Such a typology can assist in the identification of social workers who are in a particular need for assistance. Overall, 216 LTCF social workers took part in the study. A two-step cluster analysis was conducted to identify a typology of LTCF social workers based on features such as ethical environment, support in workplace, mastery, and resilience. The variance of the identified clusters and their associations with moral distress were examined, and four clusters of LTCF social workers were identified. The clusters varied from each other in relation to their personal and environmental features and in relation to their experience of moral distress. The article concludes with a discussion of the importance of developing programs for LTCF social workers that provide support and enhancement of personal resources and an adequate and ethical environment for practice.
Three-dimensional seismic velocity structure and earthquake relocations at Katmai, Alaska
Murphy, Rachel; Thurber, Clifford; Prejean, Stephanie G.; Bennington, Ninfa
2014-01-01
We invert arrival time data from local earthquakes occurring between September 2004 and May 2009 to determine the three-dimensional (3D) upper crustal seismic structure in the Katmai volcanic region. Waveforms for the study come from the Alaska Volcano Observatory's permanent network of 20 seismic stations in the area (predominantly single-component, short period instruments) plus a densely spaced temporary array of 11 broadband, 3-component stations. The absolute and relative arrival times are used in a double-difference seismic tomography inversion to solve for 3D P- and S-wave velocity models for an area encompassing the main volcanic centers. The relocated hypocenters provide insight into the geometry of seismogenic structures in the area, revealing clustering of events into four distinct zones associated with Martin, Mageik, Trident-Novarupta, and Mount Katmai. The seismic activity extends from about sea level to 2 km depth (all depths referenced to mean sea level) beneath Martin, is concentrated near 2 km depth beneath Mageik, and lies mainly between 2 and 4 km depth below Katmai and Trident-Novarupta. Many new features are apparent within these earthquake clusters. In particular, linear features are visible within all clusters, some associated with swarm activity, including an observation of earthquake migration near Trident in 2008. The final velocity model reveals a possible zone of magma storage beneath Mageik, but there is no clear evidence for magma beneath the Katmai-Novarupta area where the 1912 eruptive activity occurred, suggesting that the storage zone for that eruption may have largely been evacuated, or remnant magma has solidified.
Galaxy clusters as hydrodynamics laboratories
NASA Astrophysics Data System (ADS)
Roediger, Elke; Sheardown, Alexander; Fish, Thomas; ZuHone, John; Hunt, Matthew; Su, Yuanyuan; Kraft, Ralph P.; Nulsen, Paul; Forman, William R.; Churazov, Eugene; Randall, Scott W.; Jones, Christine; Machacek, Marie E.
2017-08-01
The intra-cluster medium (ICM) of galaxy clusters shows a wealth of hydrodynamical features that trace the growth of clusters via the infall of galaxies or smaller subclusters. Such hydrodynamical features include the wakes of the infalling objects as well as the interfaces between the host cluster’s ICM and the atmosphere of the infalling object. Furthermore, the cluster dynamics can be traced by merger shocks, bow shocks, and sloshing motions of the ICM. The characteristics of these dynamical features, e.g., the direction, length, brightness, and temperature of the galaxies' or subclusters' gas tails, vary significantly between different objects. This could be due to either dynamical conditions or ICM transport coefficients such as viscosity and thermal conductivity. For example, the cool long gas tails of some infalling galaxies and groups have been attributed to a substantial ICM viscosity suppressing mixing of the stripped galaxy or group gas with the hotter ambient ICM. Using hydrodynamical simulations of minor mergers we show, however, that these features can be explained naturally by the dynamical conditions of each particular galaxy or group infall. Specifically, we identify observable features to distinguish the first and second infall of a galaxy or group into its host cluster as well as characteristics during apocentre passage. Comparing our simulations with observations, we can explain several puzzling observations such as the long and cold tail of M86 in Virgo and the very long and tangentially oriented tail of the group LEDA 87445 in Hydra A. Using our simulations, we also assess the validity of the stagnation pressure method that is widely used to determine an infalling galaxy's velocity. We show that near pericentre passage the method gives reasonable results, but near apocentre it is not easily applicable.
NASA Astrophysics Data System (ADS)
Gandomkar, Ziba; Tay, Kevin; Ryder, Will; Brennan, Patrick C.; Mello-Thoms, Claudia
2016-03-01
Radiologists' gaze-related parameters combined with image-based features were utilized to classify suspicious mammographic areas ultimately scored as True Positives (TP) and False Positives (FP). Eight breast radiologists read 120 two-view digital mammograms, of which 59 had biopsy-proven cancer. Eye tracking data was collected and nearby fixations were clustered together. Suspicious areas on mammograms were independently identified based on thresholding an intensity saliency map followed by automatic segmentation and pruning steps. For each radiologist-reported area, the radiologist's fixation clusters in the area, as well as neighboring suspicious areas within 2.5° of the center of fixation, were found. A 45-dimensional feature vector containing gaze parameters of the corresponding cluster along with image-based characteristics was constructed. Gaze parameters included total number of fixations in the cluster, dwell time, time to hit the cluster for the first time, maximum number of consecutive fixations, and saccade magnitude of the first fixation in the cluster. Image-based features consisted of intensity, shape, and texture descriptors extracted from the region around the suspicious area, its surrounding tissue, and the entire breast. For each radiologist, a user-specific Support Vector Machine (SVM) model was built to classify the reported areas as TPs or FPs. Leave-one-out cross validation was utilized to avoid over-fitting. A feature selection step was embedded in the SVM training procedure by allowing radial basis function kernels to have 45 scaling factors. The proposed method was compared with the radiologists' performance using the jackknife alternative free-response receiver operating characteristic (JAFROC). The JAFROC figure of merit increased significantly for six radiologists.
Reefing Line Tension in CPAS Main Parachute Clusters
NASA Technical Reports Server (NTRS)
Ray, Eric S.
2013-01-01
Reefing lines are an essential feature to manage inflation loads. During each Engineering Development Unit (EDU) test of the Capsule Parachute Assembly System (CPAS), a chase aircraft is staged to be level with the cluster of Main ringsail parachutes during the initial inflation and reefed stages. This allows for capturing high-quality still photographs of the reefed skirt, suspension line, and canopy geometry. The over-inflation angles are synchronized with measured loads data in order to compute the tension force in the reefing line. The traditional reefing tension equation assumes radial symmetry, but cluster effects cause the reefed skirt of each parachute to elongate to a more elliptical shape. This effect was considered in evaluating multiple parachutes to estimate the semi-major and semi-minor axes. Three flight tests are assessed, including one with a skipped first stage, which had peak reefing line tension over three times higher than the nominal parachute disreef sequence.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schuetrumpf, Bastian; Zhang, Chunli; Nazarewicz, Witold
Nuclear density functional theory is the tool of choice in describing properties of complex nuclei and intricate phases of bulk nucleonic matter. It is a microscopic approach based on an energy density functional representing the nuclear interaction. An attractive feature of nuclear DFT is that it can be applied to both finite nuclei and pasta phases appearing in the inner crust of neutron stars. While nuclear pasta clusters in a neutron star can be easily characterized through their density distributions, the level of clustering of nucleons in a nucleus can often be difficult to assess. To this end, we use the concept of nucleon localization. We demonstrate that the localization measure provides us with fingerprints of clusters in light and heavy nuclei, including fissioning systems. Furthermore we investigate the rod-like pasta phase using twist-averaged boundary conditions, which enable calculations in finite volumes accessible by state of the art DFT solvers.
Low-level processing for real-time image analysis
NASA Technical Reports Server (NTRS)
Eskenazi, R.; Wilf, J. M.
1979-01-01
A system that detects object outlines in television images in real time is described. A high-speed pipeline processor transforms the raw image into an edge map and a microprocessor, which is integrated into the system, clusters the edges, and represents them as chain codes. Image statistics, useful for higher level tasks such as pattern recognition, are computed by the microprocessor. Peak intensity and peak gradient values are extracted within a programmable window and are used for iris and focus control. The algorithms implemented in hardware and the pipeline processor architecture are described. The strategy for partitioning functions in the pipeline was chosen to make the implementation modular. The microprocessor interface allows flexible and adaptive control of the feature extraction process. The software algorithms for clustering edge segments, creating chain codes, and computing image statistics are also discussed. A strategy for real time image analysis that uses this system is given.
NASA Astrophysics Data System (ADS)
Shah, Shishir
This paper presents a segmentation method for detecting cells in immunohistochemically stained cytological images. A two-phase approach to segmentation is used where an unsupervised clustering approach coupled with cluster merging based on a fitness function is used as the first phase to obtain a first approximation of the cell locations. A joint segmentation-classification approach incorporating ellipse as a shape model is used as the second phase to detect the final cell contour. The segmentation model estimates a multivariate density function of low-level image features from training samples and uses it as a measure of how likely each image pixel is to be a cell. This estimate is constrained by the zero level set, which is obtained as a solution to an implicit representation of an ellipse. Results of segmentation are presented and compared to ground truth measurements.
Semantic Shot Classification in Sports Video
NASA Astrophysics Data System (ADS)
Duan, Ling-Yu; Xu, Min; Tian, Qi
2003-01-01
In this paper, we present a unified framework for semantic shot classification in sports videos. Unlike previous approaches, which focus on clustering by aggregating shots with similar low-level features, the proposed scheme makes use of domain knowledge of a specific sport to perform a top-down video shot classification, including identification of video shot classes for each sport, and supervised learning and classification of the given sports video with low-level and middle-level features extracted from the sports video. It is observed that for each sport we can predefine a small number of semantic shot classes, about 5~10, which covers 90~95% of sports broadcasting video. With the supervised learning method, we can map the low-level features to middle-level semantic video shot attributes such as dominant object motion (a player), camera motion patterns, and court shape, etc. On the basis of the appropriate fusion of those middle-level shot classes, we classify video shots into the predefined video shot classes, each of which has a clear semantic meaning. The proposed method has been tested over 4 types of sports videos: tennis, basketball, volleyball and soccer. Good classification accuracy of 85~95% has been achieved. With correctly classified sports video shots, further structural and temporal analysis, such as event detection, video skimming, table of content, etc, will be greatly facilitated.
The Orion Nebula Cluster as a Paradigm of Star Formation
NASA Astrophysics Data System (ADS)
Robberto, Massimo
2014-10-01
We propose a 52-orbit Treasury Program to investigate two fundamental questions of star formation: a) the low-mass tail of the IMF, down to a few Jupiter masses; b) the dynamical evolution of clusters, as revealed by stellar proper motions. We target the Orion Nebula Cluster (ONC) using WFC3 and ACS in coordinated parallel mode to perform a synoptic survey in the 1.345 micron H2O feature and Ic broad-band. Our main objectives are: 1) to discover and classify ~500 brown dwarfs and planetary-mass objects in the field, extending the IMF down to the lowest masses formed by gravitational collapse. Using the latest generation of high contrast image processing we will also search for faint companions, reaching down to sub-arcsecond separations and 1E-4 flux ratios. 2) to derive high precision (~0.2 km/s) relative proper motions of low-mass stars and substellar objects (about 1000 sources total), leveraging first-epoch data obtained by our previous HST Treasury Program about 10 years ago. These data will unveil the cluster dynamics: velocity dispersion vs. mass, substructures, and the fraction of escaping sources. Only HST can access the IR H2O absorption feature sensitive to the effective temperature of substellar objects, while providing the exceptionally stable PSF needed for the detection of faint companions, and the identical ACS platform for our second epoch proper-motion survey. This program will provide the definitive HST legacy dataset on the ONC. Our High-Level Science Products will be mined by the community, both statistically to constrain competing theories of star formation, and to study in depth the multitude of exotic sources harboured by the cluster.
Fast and robust generation of feature maps for region-based visual attention.
Aziz, Muhammad Zaheer; Mertsching, Bärbel
2008-05-01
Visual attention is one of the important phenomena in biological vision which can be followed to achieve more efficiency, intelligence, and robustness in artificial vision systems. This paper investigates a region-based approach that performs pixel clustering prior to the processes of attention in contrast to late clustering as done by contemporary methods. The foundation steps of feature map construction for the region-based attention model are proposed here. The color contrast map is generated based upon the extended findings from the color theory, the symmetry map is constructed using a novel scanning-based method, and a new algorithm is proposed to compute a size contrast map as a formal feature channel. Eccentricity and orientation are computed using the moments of obtained regions and then saliency is evaluated using the rarity criteria. The efficient design of the proposed algorithms allows incorporating five feature channels while maintaining a processing rate of multiple frames per second. Another salient advantage over the existing techniques is the reusability of the salient regions in the high-level machine vision procedures due to preservation of their shapes and precise locations. The results indicate that the proposed model has the potential to efficiently integrate the phenomenon of attention into the main stream of machine vision and systems with restricted computing resources such as mobile robots can benefit from its advantages.
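The abstract above says eccentricity and orientation are computed from the moments of the obtained regions but does not give formulas. The following is a minimal sketch using the standard second-order central moments; the function name `region_shape` and the eigenvalue-based eccentricity definition are assumptions for illustration, not the authors' stated method.

```python
import math

def region_shape(pixels):
    """Return (orientation in radians, eccentricity) of a region.

    `pixels` is a list of (x, y) coordinates belonging to one region.
    Uses standard second-order central moments (an assumed formulation).
    """
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    mu20 = sum((x - cx) ** 2 for x, _ in pixels) / n
    mu02 = sum((y - cy) ** 2 for _, y in pixels) / n
    mu11 = sum((x - cx) * (y - cy) for x, y in pixels) / n

    # Angle of the major axis relative to the x axis.
    theta = 0.5 * math.atan2(2 * mu11, mu20 - mu02)

    # Eigenvalues of the covariance matrix give the axis variances.
    spread = math.sqrt((mu20 - mu02) ** 2 + 4 * mu11 ** 2)
    lam_major = (mu20 + mu02 + spread) / 2
    lam_minor = (mu20 + mu02 - spread) / 2
    if lam_major <= 0:
        return theta, 0.0  # degenerate single-point region
    ecc = math.sqrt(max(0.0, 1 - lam_minor / lam_major))
    return theta, ecc

bar = [(x, 0) for x in range(5)]                      # thin horizontal bar
square = [(x, y) for x in range(3) for y in range(3)]
print(region_shape(bar))
print(region_shape(square))
```

Under this definition a thin bar has eccentricity close to 1 and a square close to 0, so the value can serve as a simple per-region shape feature.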
Stewart, C M; Newlands, S D; Perachio, A A
2004-12-01
Rapid and accurate discrimination of single units from extracellular recordings is a fundamental process for the analysis and interpretation of electrophysiological recordings. We present an algorithm that performs detection, characterization, discrimination, and analysis of action potentials from extracellular recording sessions. The program was entirely written in LabVIEW (National Instruments), and requires no external hardware devices or a priori information about action potential shapes. Waveform events are detected by scanning the digital record for voltages that exceed a user-adjustable trigger. Detected events are characterized to determine nine different time and voltage levels for each event. Various algebraic combinations of these waveform features are used as axis choices for 2-D Cartesian plots of events. The user selects axis choices that generate distinct clusters. Multiple clusters may be defined as action potentials by manually generating boundaries of arbitrary shape. Events defined as action potentials are validated by visual inspection of overlain waveforms. Stimulus-response relationships may be identified by selecting any recorded channel for comparison to continuous and average cycle histograms of binned unit data. The algorithm includes novel aspects of feature analysis and acquisition, including higher acquisition rates for electrophysiological data compared to other channels. The program confirms that electrophysiological data may be discriminated with high speed and efficiency using algebraic combinations of waveform features derived from high-speed digital records.
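The detection step described above, scanning a digital record for voltages that exceed a user-adjustable trigger, can be sketched as follows. The original program was written in LabVIEW; this Python version is only an illustration, and the function name `detect_events` and the sample values are hypothetical.

```python
def detect_events(samples, trigger):
    """Return the indices where the signal first rises above `trigger`.

    Consecutive samples above the trigger count as one event, so each
    upward crossing is reported once.
    """
    events = []
    above = False
    for i, value in enumerate(samples):
        if value > trigger and not above:
            events.append(i)
        above = value > trigger
    return events

# Hypothetical voltage record (arbitrary units) with two crossings.
record = [0.0, 0.1, 0.9, 1.2, 0.3, -0.2, 0.0, 1.1, 0.8, 0.1]
print(detect_events(record, trigger=0.5))
```

Each detected index would then be the starting point for measuring the time and voltage features that the abstract mentions.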
STRONG GRAVITATIONAL LENSING BY THE SUPER-MASSIVE cD GALAXY IN ABELL 3827
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carrasco, E. R.; Gomez, P. L.; Lee, H.
2010-06-01
We have discovered strong gravitational lensing features in the core of the nearby cluster Abell 3827 by analyzing Gemini South GMOS images. The most prominent strong lensing feature is a highly magnified, ring-shaped configuration of four images around the central cD galaxy. GMOS spectroscopic analysis puts this source at z ≈ 0.2. Located ≈20'' away from the central galaxy is a secondary tangential arc feature which has been identified as a background galaxy with z ≈ 0.4. We have modeled the gravitational potential of the cluster core, taking into account the mass from the cluster, the brightest cluster galaxy (BCG), and other galaxies. We derive a total mass of (2.7 ± 0.4) × 10^13 M_sun within 37 h^-1 kpc. This mass is an order of magnitude larger than that derived from X-ray observations. The total mass derived from lensing data suggests that the BCG in this cluster is perhaps the most massive galaxy in the nearby universe.
Extreme Cannibalism and Strong Gravitational Lensing in the Galaxy Cluster Abell 3827 [Canibalismo Extremo y Lente Gravitacional Intensa en el Cúmulo de Galaxias Abell 3827]
NASA Astrophysics Data System (ADS)
Díaz, R. J.; West, M.; Bergmann, M.; Carrasco, E. R.; Gomez, P.; Lee, H.; Miller, B.; Turner, J.
Abell 3827 is one of the most massive known clusters, and at its center we observe an extreme example of galactic cannibalism: a supergiant elliptical galaxy in the process of formation, devouring five massive galaxies at the same time. Using high spatial resolution Gemini+GMOS imagery and multi-object spectroscopy, we derived the redshift (z=0.099) and the radial velocity dispersion of the 55 brightest galaxies in the cluster central region (1134 +- 125 km/s). The estimated virial mass is ~ 1E14 M(sun) inside a radius of 300 kpc of the cluster center. We have also found features corresponding to a strong gravitational lens: four annular features arranged in an Einstein ring from a galaxy (z=0.2) at twice the redshift of the cluster, and a fifth arclet feature corresponding to the lensed light of a more distant galaxy (z=0.4). The possible Einstein ring is of small angular size, and the gravitational lens morphology would confirm that the cluster is indeed very massive and dense. FULL TEXT IN SPANISH.
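As a rough consistency check on the quoted virial mass, the order-of-magnitude relation M ~ sigma^2 R / G can be evaluated with the velocity dispersion and radius given above. This is only a sketch: the abstract does not say which virial estimator was used, and prefactors of order unity are deliberately omitted here. The function name is hypothetical.

```python
# Gravitational constant in kpc (km/s)^2 / M_sun.
G_KPC = 4.30091e-6

def virial_mass_estimate(sigma_km_s, radius_kpc):
    """Order-of-magnitude virial mass in solar masses: sigma^2 * R / G.

    Assumption: no geometric prefactor, so the result is only good to
    within a factor of a few.
    """
    return sigma_km_s ** 2 * radius_kpc / G_KPC

# Values quoted in the abstract: 1134 km/s dispersion, 300 kpc radius.
mass = virial_mass_estimate(1134.0, 300.0)
print(f"{mass:.1e} M_sun")
```

The result is of order 1E14 solar masses, in line with the figure the abstract reports.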
Sensory over responsivity and obsessive compulsive symptoms: A cluster analysis.
Ben-Sasson, Ayelet; Podoly, Tamar Yonit
2017-02-01
Several studies have examined the sensory component in Obsessive Compulsive Disorder (OCD) and described an OCD subtype which has a unique profile, and that Sensory Phenomena (SP) is a significant component of this subtype. SP has some commonalities with Sensory Over Responsivity (SOR) and might be in part a characteristic of this subtype. Although there are some studies that have examined SOR and its relation to Obsessive Compulsive Symptoms (OCS), the literature lacks sufficient data on this interplay. Our first goal was to further examine the correlations between OCS and SOR, and to explore the correlations between SOR modalities (i.e. smell, touch, etc.) and OCS subscales (i.e. washing, ordering, etc.). Our second goal was to investigate the cluster analysis of SOR and OCS dimensions in adults, that is, to classify the sample using the sensory scores to find whether a sensory OCD subtype can be specified. Our third goal was to explore the psychometric features of a new sensory questionnaire: the Sensory Perception Quotient (SPQ). A sample of non-clinical adults (n=350) was recruited via e-mail, social media and social networks. Participants completed questionnaires for measuring SOR, OCS, and anxiety. SOR and OCI-F scores were moderately significantly correlated (n=274); significant correlations between all SOR modalities and OCS subscales were found, with no specific higher correlation between one modality and one OCS subscale. Cluster analysis revealed four distinct clusters: (1) No OC and SOR symptoms (NONE; n=100), (2) High OC and SOR symptoms (BOTH; n=28), (3) Moderate OC symptoms (OCS; n=63), (4) Moderate SOR symptoms (SOR; n=83). The BOTH cluster had significantly higher anxiety levels than the other clusters, and shared OC subscales scores with the OCS cluster. The BOTH cluster also reported higher SOR scores across tactile, vision, taste and olfactory modalities. The SPQ was found to be reliable and suitable for detecting SOR; the sample SPQ scores were normally distributed (n=350).
SOR is a dimensional feature that can influence the severity of OCS and may characterize a unique sensory OCD subtype. Copyright © 2016 Elsevier Inc. All rights reserved.
Searches for 3.5 keV Absorption Features in Cluster AGN Spectra
NASA Astrophysics Data System (ADS)
Conlon, Joseph P.
2018-06-01
We investigate possible evidence for a spectral dip around 3.5 keV in central cluster AGNs, motivated by previous results for archival Chandra observations of the Perseus cluster and the general interest in novel spectral features around 3.5 keV that may arise from dark matter physics. We use two deep Chandra observations of the Perseus and Virgo clusters that have recently been made public. In both cases, mild improvements in the fit (Δχ² = 4.2 and Δχ² = 2.5) are found by including such a dip at 3.5 keV in the spectrum. A comparable result (Δχ² = 6.5) is found re-analysing archival on-axis Chandra ACIS-S observations of the centre of the Perseus cluster.
The role of park conditions and features on park visitation and physical activity.
Rung, Ariane L; Mowen, Andrew J; Broyles, Stephanie T; Gustat, Jeanette
2011-09-01
Neighborhood parks play an important role in promoting physical activity. We examined the effect of activity area, condition, and presence of supporting features on number of park users and park-based physical activity levels. 37 parks and 154 activity areas within parks were assessed during summer 2008 for their features and park-based physical activity. Outcomes included any park use, number of park users, mean and total energy expenditure. Independent variables included type and condition of activity area, supporting features, size of activity area, gender, and day of week. Multilevel models controlled for clustering of observations at activity area and park levels. Type of activity area was associated with number of park users, mean and total energy expenditure, with basketball courts having the highest number of users and total energy expenditure, and playgrounds having the highest mean energy expenditure. Condition of activity areas was positively associated with number of basketball court users and inversely associated with number of green space users and total green space energy expenditure. Various supporting features were both positively and negatively associated with each outcome. This study provides evidence regarding characteristics of parks that can contribute to achieving physical activity goals within recreational spaces.
Does reflective functioning mediate the relationship between attachment and personality?
Nazzaro, Maria Paola; Boldrini, Tommaso; Tanzilli, Annalisa; Muzi, Laura; Giovanardi, Guido; Lingiardi, Vittorio
2017-10-01
Mentalization, operationalized as reflective functioning (RF), can play a crucial role in the psychological mechanisms underlying personality functioning. This study aimed to: (a) study the association between RF, personality disorders (cluster level) and functioning; (b) investigate whether RF and personality functioning are influenced by (secure vs. insecure) attachment; and (c) explore the potential mediating effect of RF on the relationship between attachment and personality functioning. The Shedler-Westen Assessment Procedure (SWAP-200) was used to assess personality disorders and levels of psychological functioning in a clinical sample (N = 88). Attachment and RF were evaluated with the Adult Attachment Interview (AAI) and Reflective Functioning Scale (RFS). Findings showed that RF had significant negative associations with cluster A and B personality disorders, and a significant positive association with psychological functioning. Moreover, levels of RF and personality functioning were influenced by attachment patterns. Finally, RF completely mediated the relationship between (secure/insecure) attachment and adaptive psychological features, and thus accounted for differences in overall personality functioning. Lack of mentalization seemed strongly associated with vulnerabilities in personality functioning, especially in patients with cluster A and B personality disorders. These findings provide support for the development of therapeutic interventions to improve patients' RF. Copyright © 2017 Elsevier B.V. All rights reserved.
Orban, Pierre; Doyon, Julien; Petrides, Michael; Mennes, Maarten; Hoge, Richard; Bellec, Pierre
2015-01-01
Functional magnetic resonance imaging can measure distributed and subtle variations in brain responses associated with task performance. However, it is unclear whether the rich variety of responses observed across the brain is functionally meaningful and consistent across individuals. Here, we used a multivariate clustering approach that grouped brain regions into clusters based on the similarity of their task-evoked temporal responses at the individual level, and then established the spatial consistency of these individual clusters at the group level. We observed a stable pseudohierarchy of task-evoked networks in the context of a delayed sequential motor task, where the fractionation of networks was driven by a gradient of involvement in motor sequence preparation versus execution. In line with theories about higher-level cognitive functioning, this gradient evolved in a rostro-caudal manner in the frontal lobe. In addition, parcellations in the cerebellum and basal ganglia matched with known anatomical territories and fiber pathways with the cerebral cortex. These findings demonstrate that subtle variations in brain responses associated with task performance are systematic enough across subjects to define a pseudohierarchy of task-evoked networks. Such networks capture meaningful functional features of brain organization as shaped by a given cognitive context. PMID:24729172
Data-driven cluster reinforcement and visualization in sparsely-matched self-organizing maps.
Manukyan, Narine; Eppstein, Margaret J; Rizzo, Donna M
2012-05-01
A self-organizing map (SOM) is a self-organized projection of high-dimensional data onto a typically 2-dimensional (2-D) feature map, wherein vector similarity is implicitly translated into topological closeness in the 2-D projection. However, when there are more neurons than input patterns, it can be challenging to interpret the results, due to diffuse cluster boundaries and limitations of current methods for displaying interneuron distances. In this brief, we introduce a new cluster reinforcement (CR) phase for sparsely-matched SOMs. The CR phase amplifies within-cluster similarity in an unsupervised, data-driven manner. Discontinuities in the resulting map correspond to between-cluster distances and are stored in a boundary (B) matrix. We describe a new hierarchical visualization of cluster boundaries displayed directly on feature maps, which requires no further clustering beyond what was implicitly accomplished during self-organization in SOM training. We use a synthetic benchmark problem and previously published microbial community profile data to demonstrate the benefits of the proposed methods.
Use of LANDSAT imagery for wildlife habitat mapping in northeast and eastcentral Alaska
NASA Technical Reports Server (NTRS)
Lent, P. C. (Principal Investigator)
1976-01-01
The author has identified the following significant results. There is strong indication that spatially rare feature classes may be missed in clustering classifications based on 2% random sampling. Therefore, it seems advisable to augment random sampling for cluster analysis with directed sampling of any spatially rare features which are relevant to the analysis.
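As a rough illustration of why a spatially rare class can be missed, the sketch below computes the chance that a 2% random sample contains no pixel at all from a class of a given size. It assumes each pixel is sampled independently with the same probability; that model and the pixel counts are illustrative assumptions, not figures from the report.

```python
def prob_class_missed(class_pixels: int, sampling_fraction: float = 0.02) -> float:
    """Probability that a random sample contains no pixel of a class.

    Assumes every pixel is drawn independently with probability
    `sampling_fraction` (an illustrative simplification).
    """
    return (1.0 - sampling_fraction) ** class_pixels


if __name__ == "__main__":
    # Hypothetical class sizes, in pixels, chosen only for illustration.
    for size in (10, 50, 200, 1000):
        print(size, round(prob_class_missed(size), 4))
```

Under this simplified model, a class must occupy a few hundred pixels before a 2% sample is very likely to include even one of them, which is consistent with the recommendation to add directed sampling for rare features.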
Gunn, Lucy Dubrelle; Mavoa, Suzanne; Boulangé, Claire; Hooper, Paula; Kavanagh, Anne; Giles-Corti, Billie
2017-12-04
Evidence-based metrics are needed to inform urban policy to create healthy walkable communities. Most active living research has developed metrics of the environment around residential addresses, ignoring other important walking locations. Therefore, this study examined: metrics for built environment features surrounding local shopping centres (known in Melbourne, Australia as neighbourhood activity centres (NACs), which are typically anchored by a supermarket); the association between NACs and transport walking; and policy compliance for supermarket provision. In this observational study, cluster analysis was used to categorize 534 NACs in Melbourne, Australia by their built environment features. The NACs were linked to eligible Victorian Integrated Survey of Travel Activity 2009-2010 (VISTA) survey participants (n=19,984). Adjusted multilevel logistic regressions estimated associations between each cluster typology and two outcomes of daily walking: any transport walking; and any 'neighbourhood' transport walking. Distance between residential dwellings and closest NAC was assessed to evaluate compliance with local planning policy on supermarket locations. Metrics for 19 built environment features were estimated and three NAC clusters associated with walkability were identified. NACs with significantly higher street connectivity (mean:161, SD:20), destination diversity (mean:16, SD:0.4), and net residential density (mean:77, SD:65) were interpreted as being 'highly walkable' when compared with 'low walkable' NACs, which had lower street connectivity (mean:57, SD:15), destination diversity (mean:11, SD:3), and net residential density (mean:10, SD:3). The odds of any daily transport walking were 5.85 times higher (95% CI: 4.22, 8.11), and the odds of any 'neighbourhood' transport walking 8.66 times higher (95% CI: 5.89, 12.72), for residents whose closest NAC was highly walkable compared with those living near low walkable NACs.
Only highly walkable NACs met the policy requirement that residents live within 1km of a local supermarket. Built environment features surrounding NACs must reach certain levels to encourage walking and deliver walkable communities. Research and metrics about the type and quantity of built environment features around both walking trip origins and destinations are needed to inform urban planning policies and urban design guidelines.
Identifying the ideal profile of French yogurts for different clusters of consumers.
Masson, M; Saint-Eve, A; Delarue, J; Blumenthal, D
2016-05-01
Identifying the sensory properties that affect consumer preferences for food products is an important feature of product development. Different methods, such as external preference mapping or partial least squares regression, are used to establish relationships between sensory data and consumer preferences and to identify sensory attributes that drive consumer preferences, by highlighting optimum products. Plain French yogurts were evaluated by a sensory profiling method performed by 12 trained judges. In parallel, 180 consumers were asked to score their overall liking and complete a cognitive restraint questionnaire. After hierarchical cluster analysis on the liking scores, preference mapping using a quadratic regression model was performed. Five clusters of consumers were identified as a function of different preference patterns. Contrary to our expectations, fat levels were not discriminating. For each cluster, the results of preference mapping enabled the identification of optimum products. A comparison of the 5 sensory profiles revealed numerous differences between key sensory attributes. For example, one consumer cluster had a strong preference for products perceived as very thick, grainy, but with a less flowing texture, less sticky, whey presence and color, in contrast to other clusters. In addition, each segment of consumers was characterized according to the results of the cognitive restraint questionnaire. Copyright © 2016 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Image fusion using sparse overcomplete feature dictionaries
Brumby, Steven P.; Bettencourt, Luis; Kenyon, Garrett T.; Chartrand, Rick; Wohlberg, Brendt
2015-10-06
Approaches for deciding what individuals in a population of visual system "neurons" are looking for using sparse overcomplete feature dictionaries are provided. A sparse overcomplete feature dictionary may be learned for an image dataset and a local sparse representation of the image dataset may be built using the learned feature dictionary. A local maximum pooling operation may be applied on the local sparse representation to produce a translation-tolerant representation of the image dataset. An object may then be classified and/or clustered within the translation-tolerant representation of the image dataset using a supervised classification algorithm and/or an unsupervised clustering algorithm.
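The local maximum pooling step can be sketched in a few lines. The example below is a minimal illustration on a one-dimensional list of non-negative sparse coefficients with non-overlapping windows; the function name, the 1-D layout and the window size are assumptions made for illustration and are not taken from the source.

```python
def local_max_pool(activations, window):
    """Keep the largest activation in each non-overlapping window."""
    return [
        max(activations[start:start + window])
        for start in range(0, len(activations), window)
    ]


if __name__ == "__main__":
    # The same feature response, shifted by one position inside a window.
    original = [0.0, 0.0, 5.0, 0.0]
    shifted = [0.0, 0.0, 0.0, 5.0]
    print(local_max_pool(original, 2))
    print(local_max_pool(shifted, 2))
```

Both inputs pool to the same output because the shift stays within one window, which is the sense in which the pooled representation tolerates small translations. A shift that crosses a window boundary would still change the result.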
Neighbourhood typology based on virtual audit of environmental obesogenic characteristics.
Feuillet, T; Charreire, H; Roda, C; Ben Rebah, M; Mackenbach, J D; Compernolle, S; Glonti, K; Bárdos, H; Rutter, H; De Bourdeaudhuij, I; McKee, M; Brug, J; Lakerveld, J; Oppert, J-M
2016-01-01
Virtual audit (using tools such as Google Street View) can help assess multiple characteristics of the physical environment. This exposure assessment can then be associated with health outcomes such as obesity. Strengths of virtual audit include the collection of large amounts of data, from various geographical contexts, following standard protocols. Using data from a virtual audit of obesity-related features carried out in five urban European regions, the current study aimed to (i) describe this international virtual audit dataset and (ii) identify neighbourhood patterns that can synthesize the complexity of such data and compare patterns across regions. Data were obtained from 4,486 street segments across urban regions in Belgium, France, Hungary, the Netherlands and the UK. We used multiple factor analysis and hierarchical clustering on principal components to build a typology of neighbourhoods and to identify similar/dissimilar neighbourhoods, regardless of region. Four neighbourhood clusters emerged, which differed in terms of food environment, recreational facilities and active mobility features, i.e. the three indicators derived from factor analysis. Clusters were unequally distributed across urban regions. Neighbourhoods mostly characterized by a high level of outdoor recreational facilities were predominantly located in Greater London, whereas neighbourhoods characterized by high urban density and large numbers of food outlets were mostly located in Paris. Neighbourhoods in the Randstad conurbation, Ghent and Budapest appeared to be very similar, characterized by relatively lower residential densities, greener areas and a very low percentage of streets offering food and recreational facility items. These results provide multidimensional constructs of obesogenic characteristics that may help target at-risk neighbourhoods more effectively than isolated features. © 2016 World Obesity.
Buried landmine detection using multivariate normal clustering
NASA Astrophysics Data System (ADS)
Duston, Brian M.
2001-10-01
A Bayesian classification algorithm is presented for discriminating buried land mines from buried and surface clutter in Ground Penetrating Radar (GPR) signals. This algorithm is based on multivariate normal (MVN) clustering, where feature vectors are used to identify populations (clusters) of mines and clutter objects. The features are extracted from two-dimensional images created from ground penetrating radar scans. MVN clustering is used to determine the number of clusters in the data and to create probability density models for target and clutter populations, producing the MVN clustering classifier (MVNCC). The Bayesian Information Criterion (BIC) is used to evaluate each model to determine the number of clusters in the data. An extension of the MVNCC allows the model to adapt to local clutter distributions by treating each of the MVN cluster components as a Poisson process and adaptively estimating the intensity parameters. The algorithm is developed using data collected by the Mine Hunter/Killer Close-In Detector (MH/K CID) at prepared mine lanes. The Mine Hunter/Killer is a prototype mine detecting and neutralizing vehicle developed for the U.S. Army to clear roads of anti-tank mines.
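To make the model-selection step concrete, the sketch below shows how a BIC score could be computed for a mixture of multivariate normal components. It assumes full covariance matrices and the "lower is better" sign convention, BIC = k ln(n) - 2 ln(L); neither choice is stated in the abstract, and the function names are hypothetical.

```python
import math


def mvn_mixture_param_count(n_components: int, n_dims: int) -> int:
    """Free parameters of a full-covariance MVN mixture (assumed form)."""
    mean_params = n_dims
    cov_params = n_dims * (n_dims + 1) // 2  # symmetric covariance matrix
    weight_params = n_components - 1         # mixing weights sum to one
    return n_components * (mean_params + cov_params) + weight_params


def bic(log_likelihood: float, n_params: int, n_samples: int) -> float:
    """BIC under the 'lower is better' convention."""
    return n_params * math.log(n_samples) - 2.0 * log_likelihood


if __name__ == "__main__":
    # Hypothetical log-likelihoods for 1 to 3 clusters on 100 two-dimensional
    # feature vectors; these numbers are invented for illustration only.
    fitted = {1: -420.0, 2: -350.0, 3: -345.0}
    scores = {
        k: bic(ll, mvn_mixture_param_count(k, 2), 100)
        for k, ll in fitted.items()
    }
    print(min(scores, key=scores.get), scores)
```

The penalty term grows with the number of components, so an extra cluster is kept only when it improves the likelihood enough to offset the added parameters.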
No Evidence for Multiple Stellar Populations in the Low-mass Galactic Globular Cluster E 3
NASA Astrophysics Data System (ADS)
Salinas, Ricardo; Strader, Jay
2015-08-01
Multiple stellar populations are a widespread phenomenon among Galactic globular clusters. Even though the origin of the enriched material from which new generations of stars are produced remains unclear, it is likely that self-enrichment will be feasible only in clusters massive enough to retain this enriched material. We searched for multiple populations in the low mass (M ∼ 1.4 × 10^4 M⊙) globular cluster E3, analyzing SOAR/Goodman multi-object spectroscopy centered on the blue cyanogen (CN) absorption features of 23 red giant branch stars. We find that the CN abundance does not present the typical bimodal behavior seen in clusters hosting multistellar populations, but rather a unimodal distribution that indicates the presence of a genuine single stellar population, or a level of enrichment much lower than in clusters that show evidence for two populations from high-resolution spectroscopy. E3 would be the first bona fide Galactic old globular cluster where no sign of self-enrichment is found. Based on observations obtained at the Southern Astrophysical Research (SOAR) Telescope, which is a joint project of the Ministério da Ciência, Tecnologia, e Inovação (MCTI) da República Federativa do Brasil, the US National Optical Astronomy Observatory (NOAO), the University of North Carolina at Chapel Hill (UNC), and Michigan State University (MSU).
Nuclear Potential Clustering As a New Tool to Detect Patterns in High Dimensional Datasets
NASA Astrophysics Data System (ADS)
Tonkova, V.; Paulus, D.; Neeb, H.
2013-02-01
We present a new approach for the clustering of high dimensional data without prior assumptions about the structure of the underlying distribution. The proposed algorithm is based on a concept adapted from nuclear physics. To partition the data, we model the dynamic behaviour of nucleons interacting in an N-dimensional space. An adaptive nuclear potential, comprised of a short-range attractive (strong interaction) and a long-range repulsive term (Coulomb force) is assigned to each data point. By modelling the dynamics, nucleons that are densely distributed in space fuse to build nuclei (clusters) whereas single point clusters repel each other. The formation of clusters is completed when the system reaches the state of minimal potential energy. The data are then grouped according to the particles' final effective potential energy level. The performance of the algorithm is tested with several synthetic datasets showing that the proposed method can robustly identify clusters even when complex configurations are present. Furthermore, quantitative MRI data from 43 multiple sclerosis patients were analyzed, showing a reasonable splitting into subgroups according to the individual patients' disease grade. The good performance of the algorithm on such highly correlated non-spherical datasets, which are typical for MRI derived image features, shows that Nuclear Potential Clustering is a valuable tool for automated data analysis, not only in the MRI domain.
Harper, Angela F; Leuthaeuser, Janelle B; Babbitt, Patricia C; Morris, John H; Ferrin, Thomas E; Poole, Leslie B; Fetrow, Jacquelyn S
2017-02-01
Peroxiredoxins (Prxs or Prdxs) are a large protein superfamily of antioxidant enzymes that rapidly detoxify damaging peroxides and/or affect signal transduction and, thus, have roles in proliferation, differentiation, and apoptosis. Prx superfamily members are widespread across phylogeny and multiple methods have been developed to classify them. Here we present an updated atlas of the Prx superfamily identified using a novel method called MISST (Multi-level Iterative Sequence Searching Technique). MISST is an iterative search process developed to be both agglomerative, to add sequences containing similar functional site features, and divisive, to split groups when functional site features suggest distinct functionally-relevant clusters. Superfamily members need not be identified initially-MISST begins with a minimal representative set of known structures and searches GenBank iteratively. Further, the method's novelty lies in the manner in which isofunctional groups are selected; rather than use a single or shifting threshold to identify clusters, the groups are deemed isofunctional when they pass a self-identification criterion, such that the group identifies itself and nothing else in a search of GenBank. The method was preliminarily validated on the Prxs, as the Prxs presented challenges of both agglomeration and division. For example, previous sequence analysis clustered the Prx functional families Prx1 and Prx6 into one group. Subsequent expert analysis clearly identified Prx6 as a distinct functionally relevant group. The MISST process distinguishes these two closely related, though functionally distinct, families. Through MISST search iterations, over 38,000 Prx sequences were identified, which the method divided into six isofunctional clusters, consistent with previous expert analysis. The results represent the most complete computational functional analysis of proteins comprising the Prx superfamily. 
The feasibility of this novel method is demonstrated by the Prx superfamily results, laying the foundation for potential functionally relevant clustering of the universe of protein sequences. PMID:28187133
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuo, J; Su, K; Department of Radiology, University Hospitals Case Medical Center, Case Western Reserve University, Cleveland, Ohio
Purpose: Accurate and robust photon attenuation derived from MR is essential for PET/MR and MR-based radiation treatment planning applications. Although the fuzzy C-means (FCM) algorithm has been applied for pseudo-CT generation, the input feature combination and the number of clusters have not been optimized. This study aims to optimize both for clinically practical pseudo-CT generation. Methods: Nine volunteers were recruited. A 190-second, single-acquisition UTE-mDixon with 25% (angular) sampling and 3D radial readout was performed to acquire three primitive MR features at TEs of 0.1, 1.5, and 2.8 ms: the free-induction-decay (FID), the first and the second echo images. Three derived images, Dixon-fat and Dixon-water generated by two-point Dixon water/fat separation, and an R2* (1/T2*) map, were also created. To identify informative inputs for generating a pseudo-CT image volume, all 63 combinations, choosing one to six of the feature images, were used as inputs to FCM for pseudo-CT generation. Further, the number of clusters was varied from four to seven to find the optimal approach. Mean prediction deviation (MPD), mean absolute prediction deviation (MAPD), and correlation coefficient (R) of different combinations were compared for feature selection. Results: Among the 63 feature combinations, the four that resulted in the best MAPD and R were further compared along with the set containing all six features. The results suggested that R2* and Dixon-water are the most informative features. Further, including FID also improved the performance of pseudo-CT generation. Consequently, the set containing FID, Dixon-water, and R2* resulted in the most accurate, robust pseudo-CT when the number of clusters equals five (5C). The clusters were interpreted as air, fat, bone, brain, and fluid. The six-cluster result additionally included bone marrow. Conclusion: The results suggested that FID, Dixon-water, and R2* are the most important features.
The findings can be used to facilitate pseudo-CT generation for unsupervised clustering. Please note that the project was completed with partial funding from the Ohio Department of Development grant TECH 11-063 and a sponsored research agreement with Philips Healthcare that is managed by Case Western Reserve University. As noted in the affiliations, some of the authors are Philips employees.
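The figure of 63 feature combinations follows from counting every non-empty subset of six feature images (2^6 - 1 = 63). The sketch below enumerates them; the short feature labels are shorthand chosen here, and pairing every subset with every cluster count from four to seven is an assumption about how the search might be organized, not a description of the study's procedure.

```python
from itertools import combinations

# Shorthand labels for the six feature images named in the abstract.
FEATURES = ["FID", "echo1", "echo2", "Dixon-fat", "Dixon-water", "R2*"]
CLUSTER_COUNTS = range(4, 8)  # four to seven clusters


def feature_subsets(features):
    """All non-empty subsets, from single features up to the full set."""
    return [
        subset
        for size in range(1, len(features) + 1)
        for subset in combinations(features, size)
    ]


subsets = feature_subsets(FEATURES)
# Assumed search grid: every subset paired with every cluster count.
grid = [(subset, k) for subset in subsets for k in CLUSTER_COUNTS]

if __name__ == "__main__":
    print(len(subsets))  # 63
    print(len(grid))     # 252
```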
iPcc: a novel feature extraction method for accurate disease class discovery and prediction
Ren, Xianwen; Wang, Yong; Zhang, Xiang-Sun; Jin, Qi
2013-01-01
Gene expression profiling has gradually become a routine procedure for disease diagnosis and classification. In the past decade, many computational methods have been proposed, resulting in great improvements on various levels, including feature selection and algorithms for classification and clustering. In this study, we present iPcc, a novel method from the feature extraction perspective to further propel gene expression profiling technologies from bench to bedside. We define ‘correlation feature space’ for samples based on the gene expression profiles by iterative employment of Pearson’s correlation coefficient. Numerical experiments on both simulated and real gene expression data sets demonstrate that iPcc can greatly highlight the latent patterns underlying noisy gene expression data and thus greatly improve the robustness and accuracy of the algorithms currently available for disease diagnosis and classification based on gene expression profiles. PMID:23761440
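One way to read the phrase "iterative employment of Pearson's correlation coefficient" is sketched below: each sample is re-described by its correlations with all samples, and the step is repeated on the resulting matrix. This is an interpretation of the abstract rather than the published algorithm, and the function names and toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def correlation_features(samples):
    """Describe each sample by its correlation with every sample."""
    return [[pearson(a, b) for b in samples] for a in samples]


def iterate_correlation(samples, rounds):
    """Apply the correlation mapping `rounds` times (assumed scheme)."""
    features = samples
    for _ in range(rounds):
        features = correlation_features(features)
    return features


if __name__ == "__main__":
    # Toy expression profiles: two rising and two falling samples.
    profiles = [[1, 2, 3, 4], [2, 4, 6, 9], [4, 3, 2, 1], [9, 6, 4, 2]]
    for row in iterate_correlation(profiles, 2):
        print([round(v, 3) for v in row])
```

In this toy case the repeated mapping pushes samples with similar profiles toward correlations near +1 and dissimilar ones toward -1, which illustrates how such a feature space could sharpen latent group structure. The sketch does not handle constant profiles, for which the correlation is undefined.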
Collaboration patterns in the German political science co-authorship network.
Leifeld, Philip; Wankmüller, Sandra; Berger, Valentin T Z; Ingold, Karin; Steiner, Christiane
2017-01-01
Research on social processes in the production of scientific output suggests that the collective research agenda of a discipline is influenced by its structural features, such as "invisible colleges" or "groups of collaborators" as well as academic "stars" that are embedded in, or connect, these research groups. Based on an encompassing dataset that takes into account multiple publication types including journals and chapters in edited volumes, we analyze the complete co-authorship network of all 1,339 researchers in German political science. Through the use of consensus graph clustering techniques and descriptive centrality measures, we identify the ten largest research clusters, their research topics, and the most central researchers who act as bridges and connect these clusters. We also aggregate the findings at the level of research organizations and consider the inter-university co-authorship network. The findings indicate that German political science is structured by multiple overlapping research clusters with a dominance of the subfields of international relations, comparative politics and political sociology. A small set of well-connected universities takes leading roles in these informal research groups. PMID:28388621
NASA Astrophysics Data System (ADS)
Hadida, Jonathan; Desrosiers, Christian; Duong, Luc
2011-03-01
The segmentation of anatomical structures in Computed Tomography Angiography (CTA) is a pre-operative task useful in image guided surgery. Even though very robust and precise methods have been developed to help achieve a reliable segmentation (level sets, active contours, etc.), it remains very time-consuming, both in terms of manual interactions and in terms of computation time. The goal of this study is to present a fast method to find coarse anatomical structures in CTA with few parameters, based on hierarchical clustering. The algorithm is organized as follows: first, a fast non-parametric histogram clustering method is proposed to compute a piecewise constant mask. A second step then indexes all the space-connected regions in the piecewise constant mask. Finally, a hierarchical clustering is achieved to build a graph representing the connections between the various regions in the piecewise constant mask. This step builds up a structural knowledge about the image. Several interactive features for segmentation are presented, for instance association or disassociation of anatomical structures. A comparison with the Mean-Shift algorithm is presented.
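The three steps listed in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: uniform intensity binning stands in for their non-parametric histogram clustering, only a flat region adjacency graph is built (the hierarchical merging is omitted), and the function names `quantize`, `label_regions` and `region_adjacency` are hypothetical.

```python
from collections import deque


def quantize(image, n_bins, lo, hi):
    """Map each intensity to a bin index, giving a piecewise-constant mask.

    Assumption: uniform bins replace the paper's histogram clustering.
    """
    width = (hi - lo) / n_bins
    return [[min(int((v - lo) / width), n_bins - 1) for v in row] for row in image]


def label_regions(mask):
    """Give every 4-connected region of equal mask value its own integer label."""
    rows, cols = len(mask), len(mask[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a labelled region
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and labels[ny][nx] == -1
                            and mask[ny][nx] == mask[y][x]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
            next_label += 1
    return labels, next_label


def region_adjacency(labels):
    """Return the set of label pairs (a, b), a < b, that share a pixel border."""
    rows, cols = len(labels), len(labels[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            for ny, nx in ((r + 1, c), (r, c + 1)):
                if ny < rows and nx < cols and labels[ny][nx] != labels[r][c]:
                    a, b = sorted((labels[r][c], labels[ny][nx]))
                    edges.add((a, b))
    return edges
```

A hierarchical scheme could then merge adjacent regions along the edges of this graph, for example in order of intensity similarity; that choice is left open here.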
Gattuso, Hugo; Durand, Elodie; Bignon, Emmanuelle; Morell, Christophe; Georgakilas, Alexandros G; Dumont, Elise; Chipot, Christophe; Dehez, François; Monari, Antonio
2016-10-06
In the present contribution, the interaction between damaged DNA and repair enzymes is examined by means of molecular dynamics simulations. More specifically, we consider clustered abasic DNA lesions processed by the primary human apurinic/apyrimidinic (AP) endonuclease, APE1. Our results show that, in stark contrast with the corresponding bacterial endonucleases, human APE1 imposes strong geometrical constraints on the DNA duplex. As a consequence, the level of recognition and, hence, the repair rate is higher. Important features that guide the DNA/protein interactions are the presence of an extended positively charged region and of a molecular tweezers that strongly constrains DNA. Our results are in very good agreement with the experimentally determined repair rate of clustered abasic lesions. The lack of repair for one particular arrangement of the two abasic sites is also explained considering the peculiar destabilizing interaction between the recognition region and the second lesion, resulting in a partial opening of the molecular tweezers and, thus, a less stable complex. This contribution cogently establishes the molecular bases for the recognition and repair of clustered DNA lesions by means of human endonucleases.
Blessy, S A Praylin Selva; Sulochana, C Helen
2015-01-01
Segmentation of brain tumors from Magnetic Resonance Imaging (MRI) is complicated by the structural complexity of the human brain and the presence of intensity inhomogeneities. This study proposes a method that effectively segments brain tumors from MR images and evaluates the performance of the unsupervised optimal fuzzy clustering (UOFC) algorithm for this task. Segmentation is done by preprocessing the MR image to standardize intensity inhomogeneities, followed by feature extraction, feature fusion and clustering. Different validation measures are used to evaluate the performance of the proposed method using different clustering algorithms. The proposed method using the UOFC algorithm produces high sensitivity (96%) and low specificity (4%) compared to other clustering methods. Validation results clearly show that the proposed method with the UOFC algorithm effectively segments brain tumors from MR images.
Dimensional assessment of personality pathology in patients with eating disorders.
Goldner, E M; Srikameswaran, S; Schroeder, M L; Livesley, W J; Birmingham, C L
1999-02-22
This study examined patients with eating disorders on personality pathology using a dimensional method. Female subjects who met DSM-IV diagnostic criteria for eating disorder (n = 136) were evaluated and compared to an age-controlled general population sample (n = 68). We assessed 18 features of personality disorder with the Dimensional Assessment of Personality Pathology - Basic Questionnaire (DAPP-BQ). Factor analysis and cluster analysis were used to derive three clusters of patients. A five-factor solution was obtained with limited intercorrelation between factors. Cluster analysis produced three clusters with the following characteristics: Cluster 1 members (constituting 49.3% of the sample and labelled 'rigid') had higher mean scores on factors denoting compulsivity and interpersonal difficulties; Cluster 2 (18.4% of the sample) showed highest scores in factors denoting psychopathy, neuroticism and impulsive features, and appeared to constitute a borderline psychopathology group; Cluster 3 (32.4% of the sample) was characterized by few differences in personality pathology in comparison to the normal population sample. Cluster membership was associated with DSM-IV diagnosis -- a large proportion of patients with anorexia nervosa were members of Cluster 1. An empirical classification of eating-disordered patients derived from dimensional assessment of personality pathology identified three groups with clinical relevance.
Blue emitting undecaplatinum clusters
NASA Astrophysics Data System (ADS)
Chakraborty, Indranath; Bhuin, Radha Gobinda; Bhat, Shridevi; Pradeep, T.
2014-07-01
A blue luminescent 11-atom platinum cluster showing step-like optical features and the absence of plasmon absorption was synthesized. The cluster was purified using high performance liquid chromatography (HPLC). Electrospray ionization (ESI) and matrix assisted laser desorption ionization (MALDI) mass spectrometry (MS) suggest a composition, Pt11(BBS)8, which was confirmed by a range of other experimental tools. The cluster is highly stable and compatible with many organic solvents. Electronic supplementary information (ESI) available: Details of experimental procedures, instrumentation, chromatogram of the crude cluster; SEM/EDAX, DLS, PXRD, TEM, FT-IR, and XPS of the isolated Pt11 cluster; UV/Vis, MALDI MS and SEM/EDAX of isolated 2 and 3; and 195Pt NMR of the K2PtCl6 standard. See DOI: 10.1039/c4nr02778g
Diffuse radio emission in the complex merging galaxy cluster Abell 2069
NASA Astrophysics Data System (ADS)
Drabent, A.; Hoeft, M.; Pizzo, R. F.; Bonafede, A.; van Weeren, R. J.; Klein, U.
2015-03-01
Context. Galaxy clusters with signs of a recent merger in many cases show extended diffuse radio features. This emission originates from relativistic electrons that suffer synchrotron losses due to the intracluster magnetic field. The mechanisms of particle acceleration and the properties of the magnetic field are still poorly understood. Aims: We search for diffuse radio emission in galaxy clusters. Here, we study the complex galaxy cluster Abell 2069, for which X-ray observations indicate a recent merger. Methods: We investigate the cluster's radio continuum emission by deep Westerbork Synthesis Radio Telescope (WSRT) observations at 346 MHz and Giant Metrewave Radio Telescope (GMRT) observations at 322 MHz. Results: We find an extended diffuse radio feature roughly coinciding with the main component of the cluster. We classify this emission as a radio halo and estimate its lower limit flux density at 25 ± 9 mJy. Moreover, we find a second extended diffuse source located at the cluster's companion and estimate its flux density at 15 ± 2 mJy. We speculate that this is a small halo or a mini-halo. If true, this cluster is the first example of a double-halo in a single galaxy cluster.
Pattern recognition approach to the subsequent event of damaging earthquakes in Italy
NASA Astrophysics Data System (ADS)
Gentili, S.; Di Giovambattista, R.
2017-05-01
In this study, we investigate the occurrence of large aftershocks following the most significant earthquakes that occurred in Italy after 1980. In accordance with previous studies (Vorobieva and Panza, 1993; Vorobieva, 1999), we group clusters associated with mainshocks into two categories: "type A" if, given a main shock of magnitude M, the subsequent strongest earthquake in the cluster has magnitude ≥ M - 1, or "type B" otherwise. In this paper, we apply a pattern recognition approach using statistical features to foresee the class of the analysed clusters. The classification of the two categories is based on some features of the time, space, and magnitude distribution of the aftershocks. Specifically, we analyse the temporal evolution of the radiated energy at different elapsed times after the mainshock, the spatio-temporal evolution of the aftershocks occurring within a few days, and the probability of a strong earthquake. An attempt is made to classify the studied region into smaller seismic zones with a prevalence of type A and B clusters. We demonstrate that the two types of clusters have distinct preferred geographic locations inside the Italian territory that likely reflected key properties of the deforming regions, different crustal domains and faulting style. We use decision trees as classifiers of single features to characterize the features depending on the cluster type. The performance of the classification is tested by the Leave-One-Out method. The analysis is performed on different time-spans after the mainshock to simulate the dependence of the accuracy on the information available as data increased over a longer period with increasing time after the mainshock.
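The type A / type B rule and the leave-one-out test described above can be made concrete with a minimal sketch. It makes several assumptions that the abstract does not state: a cluster with no subsequent events is labelled type B, the single-feature decision tree is reduced to a one-threshold stump, and the function names are hypothetical.

```python
def cluster_type(mainshock_mag, later_mags):
    """Return "A" if the strongest subsequent event has magnitude >= M - 1, else "B".

    Assumption: a cluster with no subsequent events is labelled "B".
    """
    if not later_mags:
        return "B"
    return "A" if max(later_mags) >= mainshock_mag - 1.0 else "B"


def fit_stump(xs, ys):
    """Fit a one-threshold classifier on a single feature (a depth-1 decision tree).

    Tries every observed value as threshold and both label orientations,
    keeping the first combination with the fewest training errors.
    """
    best = None
    for t in sorted(set(xs)):
        for above in ("A", "B"):
            below = "B" if above == "A" else "A"
            errors = sum((above if x >= t else below) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above, below)
    _, t, above, below = best
    return lambda x: above if x >= t else below


def leave_one_out_accuracy(xs, ys):
    """Train on all samples but one, test on the held-out sample, and average."""
    hits = 0
    for i in range(len(xs)):
        model = fit_stump(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        hits += model(xs[i]) == ys[i]
    return hits / len(xs)
```

Here `xs` would hold one feature value per cluster (for instance a radiated-energy measure at a given elapsed time) and `ys` the labels produced by `cluster_type`; both are placeholders for the paper's actual features.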
Universal dynamical properties preclude standard clustering in a large class of biochemical data.
Gomez, Florian; Stoop, Ralph L; Stoop, Ruedi
2014-09-01
Clustering of chemical and biochemical data based on observed features is a central cognitive step in the analysis of chemical substances, in particular in combinatorial chemistry, or of complex biochemical reaction networks. Often, for reasons unknown to the researcher, this step produces disappointing results. Once the sources of the problem are known, improved clustering methods might revitalize the statistical approach of compound and reaction search and analysis. Here, we present a generic mechanism that may be at the origin of many clustering difficulties. The variety of dynamical behaviors that can be exhibited by complex biochemical reactions on variation of the system parameters are fundamental system fingerprints. In parameter space, shrimp-like or swallow-tail structures separate parameter sets that lead to stable periodic dynamical behavior from those leading to irregular behavior. We work out the genericity of this phenomenon and demonstrate novel examples for their occurrence in realistic models of biophysics. Although we elucidate the phenomenon by considering the emergence of periodicity in dependence on system parameters in a low-dimensional parameter space, the conclusions from our simple setting are shown to continue to be valid for features in a higher-dimensional feature space, as long as the feature-generating mechanism is not too extreme and the dimension of this space is not too high compared with the amount of available data. For online versions of super-paramagnetic clustering see http://stoop.ini.uzh.ch/research/clustering. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Quantitative radiomic profiling of glioblastoma represents transcriptomic expression.
Kong, Doo-Sik; Kim, Junhyung; Ryu, Gyuha; You, Hye-Jin; Sung, Joon Kyung; Han, Yong Hee; Shin, Hye-Mi; Lee, In-Hee; Kim, Sung-Tae; Park, Chul-Kee; Choi, Seung Hong; Choi, Jeong Won; Seol, Ho Jun; Lee, Jung-Il; Nam, Do-Hyun
2018-01-19
Quantitative imaging biomarkers have increasingly emerged in the field of research utilizing available imaging modalities. We aimed to identify good surrogate radiomic features that can represent genetic changes of tumors, thereby establishing noninvasive means for predicting treatment outcome. From May 2012 to June 2014, we retrospectively identified 65 patients with treatment-naïve glioblastoma with available clinical information from the Samsung Medical Center data registry. Preoperative MR imaging data were obtained for all 65 patients with primary glioblastoma. A total of 82 imaging features, including first-order statistics, volume, and size features, were semi-automatically extracted from structural and physiologic images such as apparent diffusion coefficient and perfusion images. Using commercially available software, NordicICE, we performed quantitative imaging analysis and collected a dataset composed of radiophenotypic parameters. Unsupervised clustering methods revealed that the radiophenotypic dataset was composed of three clusters. Each cluster represented a distinct molecular classification of glioblastoma: classical type, proneural and neural types, and mesenchymal type. These clusters also reflected differential clinical outcomes. We found that the extracted imaging signatures do not represent copy number variation or somatic mutation. Quantitative radiomic features provide potential evidence to predict molecular phenotype and treatment outcome. Radiomic profiles represent transcriptomic phenotypes better.
NASA Astrophysics Data System (ADS)
Zhang, Han; Chen, Xuefeng; Du, Zhaohui; Li, Xiang; Yan, Ruqiang
2016-04-01
Fault information of aero-engine bearings presents two particular phenomena, i.e., waveform distortion and impulsive feature frequency band dispersion, which pose a challenging problem for current techniques of bearing fault diagnosis. Moreover, although sparse representation theory has made considerable progress in feature extraction of fault information, it still suffers inevitable performance degradation because relatively weak fault information does not have sufficiently prominent and sparse representations. Therefore, a novel nonlocal sparse model (coined NLSM) and its algorithm framework are proposed in this paper, which go beyond simple sparsity by introducing more intrinsic structures of feature information. This work exploits the underlying prior information that feature information exhibits nonlocal self-similarity by clustering similar signal fragments and stacking them together into groups. Within this framework, the prior information is transformed into a regularization term, and a sparse optimization problem is formulated that can be solved by the block coordinate descent (BCD) method. Additionally, an adaptive structural clustering sparse dictionary learning technique, which utilizes k-Nearest-Neighbor (kNN) clustering and principal component analysis (PCA) learning, is adopted to further enable sufficient sparsity of feature information. Moreover, the selection rule of the regularization parameter and the computational complexity are described in detail. The performance of the proposed framework is evaluated through numerical experiments, and its superiority with respect to the state-of-the-art method in the field is demonstrated on vibration signals from an experimental rig of aircraft engine bearings.
Tang, Jialin; Soua, Slim; Mares, Cristinel; Gan, Tat-Hean
2017-01-01
The identification of particular types of damage in wind turbine blades using acoustic emission (AE) techniques is a significant emerging field. In this work, a 45.7-m turbine blade was subjected to flap-wise fatigue loading for 21 days, during which AE was measured by internally mounted piezoelectric sensors. This paper focuses on using unsupervised pattern recognition methods to characterize different AE activities corresponding to different fracture mechanisms. A sequential feature selection method based on a k-means clustering algorithm is used to achieve a fine classification accuracy. The visualization of clusters in peak frequency−frequency centroid features is used to correlate the clustering results with failure modes. The positions of these clusters in time domain features, average frequency−MARSE, and average frequency−peak amplitude are also presented in this paper (where MARSE represents the Measured Area under Rectified Signal Envelope). The results show that these parameters are representative for the classification of the failure modes. PMID:29104245
Tang, Jialin; Soua, Slim; Mares, Cristinel; Gan, Tat-Hean
2017-11-01
The identification of particular types of damage in wind turbine blades using acoustic emission (AE) techniques is a significant emerging field. In this work, a 45.7-m turbine blade was subjected to flap-wise fatigue loading for 21 days, during which AE was measured by internally mounted piezoelectric sensors. This paper focuses on using unsupervised pattern recognition methods to characterize different AE activities corresponding to different fracture mechanisms. A sequential feature selection method based on a k-means clustering algorithm is used to achieve a fine classification accuracy. The visualization of clusters in peak frequency-frequency centroid features is used to correlate the clustering results with failure modes. The positions of these clusters in time domain features, average frequency-MARSE, and average frequency-peak amplitude are also presented in this paper (where MARSE represents the Measured Area under Rectified Signal Envelope). The results show that these parameters are representative for the classification of the failure modes.
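As a rough illustration of the clustering step named in the abstract above, the following sketch runs plain k-means on two-dimensional feature vectors such as (peak frequency, frequency centroid). It does not reproduce the paper's sequential feature selection; the initialization from the first `k` points, the fixed iteration cap and the function name `kmeans` are all assumptions made for the example.

```python
import math


def kmeans(points, k, iterations=50):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length tuples.

    Assumption: centroids start at the first k points, which keeps the
    sketch deterministic; real analyses usually use several random restarts.
    """
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids
```

Features with very different scales (for example frequency in kHz versus amplitude in dB) would normally be standardized before calling such a routine, since k-means relies on Euclidean distance.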
A Multiple-Label Guided Clustering Algorithm for Historical Document Dating and Localization.
He, Sheng; Samara, Petros; Burgers, Jan; Schomaker, Lambert
2016-11-01
It is of essential importance for historians to know the date and place of origin of the documents they study. It would be a huge advancement for historical scholars if it would be possible to automatically estimate the geographical and temporal provenance of a handwritten document by inferring them from the handwriting style of such a document. We propose a multiple-label guided clustering algorithm to discover the correlations between the concrete low-level visual elements in historical documents and abstract labels, such as date and location. First, a novel descriptor, called histogram of orientations of handwritten strokes, is proposed to extract and describe the visual elements, which is built on a scale-invariant polar-feature space. In addition, the multi-label self-organizing map (MLSOM) is proposed to discover the correlations between the low-level visual elements and their labels in a single framework. Our proposed MLSOM can be used to predict the labels directly. Moreover, the MLSOM can also be considered as a pre-structured clustering method to build a codebook, which contains more discriminative information on date and geography. The experimental results on the medieval paleographic scale data set demonstrate that our method achieves state-of-the-art results.
Artim-Esen, Bahar; Çene, Erhan; Şahinkaya, Yasemin; Ertan, Semra; Pehlivan, Özlem; Kamali, Sevil; Gül, Ahmet; Öcal, Lale; Aral, Orhan; Inanç, Murat
2014-07-01
Associations between autoantibodies and clinical features have been described in systemic lupus erythematosus (SLE). Herein, we aimed to define autoantibody clusters and their clinical correlations in a large cohort of patients with SLE. We analyzed 852 patients with SLE who attended our clinic. Seven autoantibodies were selected for cluster analysis: anti-DNA, anti-Sm, anti-RNP, anticardiolipin (aCL) immunoglobulin (Ig)G or IgM, lupus anticoagulant (LAC), anti-Ro, and anti-La. Two-step clustering and Kaplan-Meier survival analyses were used. Five clusters were identified. A cluster consisted of patients with only anti-dsDNA antibodies, a cluster of anti-Sm and anti-RNP, a cluster of aCL IgG/M and LAC, and a cluster of anti-Ro and anti-La antibodies. Analysis revealed 1 more cluster that consisted of patients who did not belong to any of the clusters formed by antibodies chosen for cluster analysis. Sm/RNP cluster had significantly higher incidence of pulmonary hypertension and Raynaud phenomenon. DsDNA cluster had the highest incidence of renal involvement. In the aCL/LAC cluster, there were significantly more patients with neuropsychiatric involvement, antiphospholipid syndrome, autoimmune hemolytic anemia, and thrombocytopenia. According to the Systemic Lupus International Collaborating Clinics damage index, the highest frequency of damage was in the aCL/LAC cluster. Comparison of 10 and 20 years survival showed reduced survival in the aCL/LAC cluster. This study supports the existence of autoantibody clusters with distinct clinical features in SLE and shows that forming clinical subsets according to autoantibody clusters may be useful in predicting the outcome of the disease. Autoantibody clusters in SLE may exhibit differences according to the clinical setting or population.
NASA Technical Reports Server (NTRS)
Chaban, Galina M.; Gerber, R. Benny; Kwak, Dochan (Technical Monitor)
2001-01-01
Anharmonic vibrational frequencies and intensities are computed for hydrogen fluoride clusters (HF)n with n=3,4 and mixed clusters of hydrogen fluoride with water (HF)n(H2O)n where n=1,2. For the (HF)4(H2O)4 complex, the vibrational spectra are calculated at the harmonic level, and anharmonic effects are estimated. Potential energy surfaces for these systems are obtained at the MP2/TZP level of electronic structure theory. Vibrational states are calculated from the potential surface points using the correlation-corrected vibrational self-consistent field (CC-VSCF) method. The method accounts for the anharmonicities and couplings between all vibrational modes and provides fairly accurate anharmonic vibrational spectra that can be directly compared with experimental results without a need for empirical scaling. For (HF)n, good agreement is found with experimental data. This agreement shows that the MP2 potential surfaces for these systems are reasonably reliable. The accuracy is best for the stiff intramolecular modes, which indicates the validity of MP2 in describing coupling between intramolecular and intermolecular degrees of freedom. For (HF)n(H2O)n experimental results are unavailable. The computed intramolecular frequencies show a strong dependence on cluster size. Intensity features are predicted for future experiments.
NASA Technical Reports Server (NTRS)
Chaban, Galina M.; Gerber, R. Benny
2002-01-01
Anharmonic vibrational frequencies and intensities are computed for hydrogen fluoride clusters (HF)n, with n = 3, 4 and mixed clusters of hydrogen fluoride with water (HF)n(H2O)n where n = 1, 2. For the (HF)4(H2O)4 complex, the vibrational spectra are calculated at the harmonic level, and anharmonic effects are estimated. Potential energy surfaces for these systems are obtained at the MP2/TZP level of electronic structure theory. Vibrational states are calculated from the potential surface points using the correlation-corrected vibrational self-consistent field method. The method accounts for the anharmonicities and couplings between all vibrational modes and provides fairly accurate anharmonic vibrational spectra that can be directly compared with experimental results without a need for empirical scaling. For (HF)n, good agreement is found with experimental data. This agreement shows that the Moller-Plesset (MP2) potential surfaces for these systems are reasonably reliable. The accuracy is best for the stiff intramolecular modes, which indicates the validity of MP2 in describing coupling between intramolecular and intermolecular degrees of freedom. For (HF)n(H2O)n experimental results are unavailable. The computed intramolecular frequencies show a strong dependence on cluster size. Intensity features are predicted for future experiments.
Yin, Zhong; Zhang, Jianhua
2014-07-01
Identifying the abnormal changes of mental workload (MWL) over time is quite crucial for preventing the accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify the MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach by combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated by using the experimentally measured data. The MWL indicators from different cortical regions are first elicited by using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as the binary class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, a SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Pfeiffenberger, Erik; Chaleil, Raphael A.G.; Moal, Iain H.
2017-01-01
Reliable identification of near‐native poses of docked protein–protein complexes is still an unsolved problem. The intrinsic heterogeneity of protein–protein interactions is challenging for traditional biophysical or knowledge based potentials and the identification of many false positive binding sites is not unusual. Often, ranking protocols are based on initial clustering of docked poses followed by the application of an energy function to rank each cluster according to its lowest energy member. Here, we present an approach of cluster ranking based not only on one molecular descriptor (e.g., an energy function) but also employing a large number of descriptors that are integrated in a machine learning model, whereby an extremely randomized tree classifier based on 109 molecular descriptors is trained. The protocol first locally enriches clusters with additional poses. The clusters are then characterized using features describing the distribution of molecular descriptors within the cluster, which are combined into a pairwise cluster comparison model to discriminate near‐native from incorrect clusters. The results show that our approach is able to identify clusters containing near‐native protein–protein complexes. In addition, we present an analysis of the descriptors with respect to their power to discriminate near‐native from incorrect clusters and how data transformations and recursive feature elimination can improve the ranking performance. Proteins 2017; 85:528–543. © 2016 Wiley Periodicals, Inc. PMID:27935158
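The conventional baseline mentioned in the abstract above, ranking each cluster by its lowest-energy member, and the idea of describing a cluster by the distribution of a descriptor can be sketched as follows. This is only an illustration: the machine learning model itself is not reproduced, and the function names, the dictionary layout and the choice of summary statistics are assumptions.

```python
import statistics


def rank_clusters_by_best_energy(clusters):
    """Order cluster ids by the energy of their lowest-energy pose (best first).

    `clusters` maps a cluster id to a list of pose energies (lower is better).
    """
    return sorted(clusters, key=lambda cid: min(clusters[cid]))


def describe_cluster(values):
    """Summarize how one molecular descriptor is distributed within a cluster.

    Assumption: minimum, mean and population standard deviation stand in for
    the distribution features used in the paper.
    """
    return {
        "min": min(values),
        "mean": statistics.fmean(values),
        "std": statistics.pstdev(values),
    }
```

In a learned ranking scheme, vectors like the one returned by `describe_cluster`, computed for many descriptors, would be the inputs to a classifier that compares clusters pairwise.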
Segmentation of Polarimetric SAR Images Using Wavelet Transformation and Texture Features
NASA Astrophysics Data System (ADS)
Rezaeian, A.; Homayouni, S.; Safari, A.
2015-12-01
Polarimetric Synthetic Aperture Radar (PolSAR) sensors can collect useful observations of the earth's surface and phenomena for various remote sensing applications, such as land cover mapping and change and target detection. These data can be acquired regardless of weather conditions, sun illumination and dust particles. As a result, SAR images, and PolSAR images in particular, are powerful tools for various environmental applications. Unlike optical images, SAR images suffer from unavoidable speckle, which makes segmentation of these data difficult. In this paper, we use the wavelet transformation for segmentation of PolSAR images. Our proposed method is based on multi-resolution analysis of texture features derived from the wavelet transformation. Here, we use both gray-level values and texture information. First, we produce coherency or covariance matrices and then generate a span image from them. In the next step, texture features are extracted from the sub-bands generated by the discrete wavelet transform (DWT). Finally, the PolSAR images are segmented using clustering methods such as fuzzy c-means (FCM) and k-means clustering. We applied the proposed methodology to fully polarimetric SAR images acquired by the Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR) L-band system in July 2012 over an agricultural area in Winnipeg, Canada.
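The first two processing steps in the abstract above can be sketched under simple assumptions. The span of a pixel is taken as the trace of its covariance (or coherency) matrix, i.e. the sum of the real diagonal entries, and a single-level orthonormal Haar transform stands in for whichever wavelet the authors used. The function names and the mean-square energy feature are illustrative choices, not the paper's.

```python
def span(cov_diagonal):
    """Total backscattered power of one pixel: the trace of its covariance matrix."""
    return sum(cov_diagonal)


def haar_dwt2(img):
    """One-level 2D orthonormal Haar transform of an image with even dimensions.

    Returns four sub-bands: the approximation plus three detail bands
    (row-difference, column-difference and diagonal).
    """
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, len(img), 2):
        a_row, rd_row, cd_row, dg_row = [], [], [], []
        for c in range(0, len(img[0]), 2):
            p, q = img[r][c], img[r][c + 1]          # top pair of the 2x2 block
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom pair
            a_row.append((p + q + s + t) / 2)
            rd_row.append((p + q - s - t) / 2)   # top rows minus bottom rows
            cd_row.append((p - q + s - t) / 2)   # left columns minus right columns
            dg_row.append((p - q - s + t) / 2)
        approx.append(a_row)
        row_diff.append(rd_row)
        col_diff.append(cd_row)
        diag.append(dg_row)
    return approx, row_diff, col_diff, diag


def band_energy(band):
    """Mean squared coefficient of a sub-band, a common simple texture feature."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)
```

Per-pixel or per-window energies of the detail bands, together with the span value, could then form the feature vectors passed to k-means or fuzzy c-means.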
NASA Astrophysics Data System (ADS)
Anick, David J.
2010-04-01
For (H2O)20X water clusters consisting of X enclosed by the 5^12 dodecahedral cage, X=empty, H2O, NH3, and H3O+, databases are made consisting of 55-82 isomers optimized via B3LYP/6-311++G**. Correlations are explored between ground state electronic energy (Ee) or electronic energy plus zero point energy (Ee+ZPE) and the clusters' topology, defined as the set of directed H-bonds. Linear regression is done to identify topological features that correlate with cluster energy. For each X, variables are found that account for 99% of the variance in Ee and predict it with a rms error under 0.2 kcal/mol. The method of analysis emphasizes the importance of an intermediate level of structure, the "O-topology," consisting of O-types and a list of O pairs that are bonded but omitting H-bond directions, as a device to organize the databases and reduce the number of structures one needs to consider. Relevant variables include three parameters, which count the number of H-bonds having particular donor and acceptor types; |M|^2, where M is the cluster's vector dipole moment; and the projection of M onto the symmetry axis of X. Scatter diagrams for Ee or Ee+ZPE versus |M| show that clusters fall naturally into "families" defined by the values of certain discrete parameters, the "major parameters," for each X. Combining "family" analysis and O-topologies, a small group of clusters is identified for each X that are candidates to be the global minimum, and the minimum is determined. For X=H3O+, one cluster with central hydronium lies just 2.08 kcal/mol above the lowest isomer with surface hydronium. Implications of the methodology for dodecahedral (H2O)20(NH4+) and (H2O)20(NH4+)(OH-) are discussed, and new lower energy isomers are found. For MP2/TZVP, the lowest-energy (H2O)20(NH4+) isomer features a trifurcated H-bond. The results suggest a much more efficient and comprehensive way of seeking low-energy water cluster geometries that may have wide applicability.
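Two of the regression variables named in the abstract above, the squared dipole magnitude and the projection of the dipole M onto the symmetry axis of X, are simple vector quantities. The sketch below shows how they could be computed from Cartesian components; the function names and the tuple representation are assumptions, and the H-bond count parameters and the regression itself are not reproduced.

```python
import math


def dipole_squared(m):
    """Squared magnitude |M|^2 of a dipole vector given as (mx, my, mz)."""
    return sum(component * component for component in m)


def projection_on_axis(m, axis):
    """Signed length of M along `axis`, i.e. M . axis / |axis|.

    `axis` need not be normalized; it only has to be non-zero.
    """
    norm = math.sqrt(sum(a * a for a in axis))
    return sum(mi * ai for mi, ai in zip(m, axis)) / norm
```

Together with the H-bond counts, values such as these would form one row of the design matrix for each isomer in a linear regression against Ee or Ee+ZPE.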
Anick, David J
2010-04-28
For (H2O)20X water clusters consisting of X enclosed by the 5^12 dodecahedral cage, X = empty, H2O, NH3, and H3O+, databases are made consisting of 55-82 isomers optimized via B3LYP/6-311++G**. Correlations are explored between ground state electronic energy (Ee) or electronic energy plus zero point energy (Ee+ZPE) and the clusters' topology, defined as the set of directed H-bonds. Linear regression is done to identify topological features that correlate with cluster energy. For each X, variables are found that account for 99% of the variance in Ee and predict it with a rms error under 0.2 kcal/mol. The method of analysis emphasizes the importance of an intermediate level of structure, the "O-topology," consisting of O-types and a list of O pairs that are bonded but omitting H-bond directions, as a device to organize the databases and reduce the number of structures one needs to consider. Relevant variables include three parameters, which count the number of H-bonds having particular donor and acceptor types; |M|^2, where M is the cluster's vector dipole moment; and the projection of M onto the symmetry axis of X. Scatter diagrams for Ee or Ee+ZPE versus |M| show that clusters fall naturally into "families" defined by the values of certain discrete parameters, the "major parameters," for each X. Combining "family" analysis and O-topologies, a small group of clusters is identified for each X that are candidates to be the global minimum, and the minimum is determined. For X = H3O+, one cluster with central hydronium lies just 2.08 kcal/mol above the lowest isomer with surface hydronium. Implications of the methodology for dodecahedral (H2O)20(NH4+) and (H2O)20(NH4+)(OH-) are discussed, and new lower energy isomers are found. For MP2/TZVP, the lowest-energy (H2O)20(NH4+) isomer features a trifurcated H-bond. The results suggest a much more efficient and comprehensive way of seeking low-energy water cluster geometries that may have wide applicability.
Ecological characteristics of Simulium breeding sites in West Africa.
Cheke, Robert A; Young, Stephen; Garms, Rolf
2017-03-01
Twenty-nine taxa of Simulium were identified amongst 527 collections of larvae and pupae from untreated rivers and streams in Liberia (362 collections in 1967-71 & 1989), Togo (125 in 1979-81), Benin (35 in 1979-81) and Ghana (5 in 1980-81). Presence or absence of associations between different taxa were used to group them into six clusters using Ward agglomerative hierarchical cluster analysis. Environmental data associated with the pre-imaginal habitats were then analysed in relation to the six clusters by one way ANOVA. The results revealed significant effects in determining the clusters of maximum river width (all P<0.001 unless stated otherwise), water temperature, dry bulb air temperature, relative humidity, altitude, type of water (on a range from trickle to large river), water level, slope, current, vegetation, light conditions, discharge, length of breeding area, environs, terrain, river bed type (P<0.01), and the supports to which the insects were attached (P<0.01). When four non-significant contributors (wet bulb temperature, river features, height of waterfall and depth) were excluded and the reduced data-set analysed by principal components analysis (PCA), the first two principal components (PCs) accounted for 87% of the variance, with geographical features dominant in PC1 and hydrological characteristics in PC2. The analyses also revealed the ecological characteristics of each taxon's pre-imaginal habitats, which are discussed with particular reference to members of the Simulium damnosum species complex, whose breeding site distributions were further analysed by canonical correspondence analysis (CCA), a method also applied to the data on non-vector species. Copyright © 2016 Elsevier B.V. All rights reserved.
Periorbital melasma: Hierarchical cluster analysis of clinical features in Asian patients.
Jung, Y S; Bae, J M; Kim, B J; Kang, J-S; Cho, S B
2017-11-01
Studies have shown melasma lesions to be distributed across the face in centrofacial, malar, and mandibular patterns. Meanwhile, however, melasma lesions of the periorbital area have yet to be thoroughly described. We analyzed normal and ultraviolet light-exposed photographs of patients with melasma. The periorbital melasma lesions were measured according to anatomical reference points and a hierarchical cluster analysis was performed. The periorbital melasma lesions showed clinical features of fine and homogenous melasma pigmentation, involving both the upper and lower eyelids that extended to other anatomical sites with a darker and coarser appearance. The hierarchical cluster analysis indicated that patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. Significant differences between cluster 1 and cluster 2 were found in lateral distance and inferolateral distance, but not in medial distance and superior distance. Comparing the two clusters, patients in cluster 2 were found to be significantly older and more commonly accompanied by melasma lesions of the temple and medial cheek. Our hierarchical cluster analysis of periorbital melasma lesions demonstrated that Asian patients with periorbital melasma can be categorized into two clusters according to the surface anatomy of the face. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Study of hot flow anomalies using Cluster multi-spacecraft measurements
NASA Astrophysics Data System (ADS)
Facskó, G.; Trotignon, J. G.; Dandouras, I.; Lucek, E. A.; Daly, P. W.
2010-02-01
Hot flow anomalies (HFAs) were first discovered in the early 1980s at the bow shock of the Earth. In the 1990s these features were studied, observed and simulated very intensively, and many new missions (Cluster, THEMIS, Cassini and Venus Express) drew attention to this phenomenon again. Many basic features and the HFA formation mechanism were clarified observationally and using hybrid simulation techniques. We described previous observational, theoretical and simulation results in the research field of HFAs. We introduced HFA observations performed at the Earth, Mars, Venus and Saturn in this paper. We present observational results from different space missions to give the reader an overview. Cluster multi-spacecraft measurements gave us more observed HFA events and finer, more sophisticated methods to understand them better. In this study, HFAs were studied using observations of the Cluster magnetometer and the Cluster plasma detector aboard the four Cluster spacecraft. Energetic particle measurements (28.2-68.9 keV) were also used to detect and select HFAs. We studied several specific features of tangential discontinuities generating HFAs on the basis of Cluster measurements in the periods February-April 2003, December 2005-April 2006 and January-April 2007, when the separation of the spacecraft was large and the Cluster fleet reached the bow shock. We have confirmed the condition for forming HFAs, namely that the solar wind speed is higher than the average. This condition was also confirmed by simultaneous ACE magnetic field and solar wind plasma observations at the L1 point 1.4 million km upstream of the Earth. The measured and calculated features of HFA events were compared with the results of different previous hybrid simulations. During the whole spring season of 2003, the solar wind speed was higher than the average. Here we checked whether a higher solar wind speed is also a real condition of HFA formation in 2006 and 2007. At the end we gave an outlook and suggested several desirable directions for further research on HFAs using the measurements of Cluster, THEMIS, the upcoming Cross Scale and other space missions.
Personality Disorder Features and Insomnia Status amongst Hypnotic-Dependent Adults
Ruiter, Megan E.; Lichstein, Kenneth L.; Nau, Sidney D.; Geyer, James
2012-01-01
Objective: To determine the prevalence of personality disorders and their relation to insomnia parameters among persons with chronic insomnia with hypnotic dependence. Methods: Eighty-four adults with chronic insomnia with hypnotic dependence completed the SCID-II personality questionnaire, two weeks of sleep diaries, polysomnography, and measures of insomnia severity, impact, fatigue severity, depression, anxiety, and quality of life. Frequencies, between-subjects t-tests and hierarchical regression models were conducted. Results: Cluster C personality disorders were most prevalent (50%). Obsessive-compulsive personality disorder (OCPD) was most common (n=39). These individuals, compared to participants with no personality disorders, did not differ in objective and subjective sleep parameters. Yet, they had poorer insomnia-related daytime functioning. OCPD and Avoidant personality disorder features were associated with poorer daytime functioning. OCPD features were related to greater fatigue severity, and overestimation of time awake was trending. Schizotypal and Schizoid features were positively associated with insomnia severity. Dependent personality disorder features were related to underestimating time awake. Conclusions: Cluster C personality disorders were highly prevalent in patients with chronic insomnia with hypnotic dependence. Features of Cluster C and A personality disorders were variously associated with poorer insomnia-related daytime functioning, fatigue, and estimation of nightly wake-time. Future interventions may need to address these personality features. PMID:22938862
Xu, Xin; Huang, Zhenhua; Graves, Daniel; Pedrycz, Witold
2014-12-01
In order to deal with the sequential decision problems with large or continuous state spaces, feature representation and function approximation have been a major research topic in reinforcement learning (RL). In this paper, a clustering-based graph Laplacian framework is presented for feature representation and value function approximation (VFA) in RL. By making use of clustering-based techniques, that is, K-means clustering or fuzzy C-means clustering, a graph Laplacian is constructed by subsampling in Markov decision processes (MDPs) with continuous state spaces. The basis functions for VFA can be automatically generated from spectral analysis of the graph Laplacian. The clustering-based graph Laplacian is integrated with a class of approximation policy iteration algorithms called representation policy iteration (RPI) for RL in MDPs with continuous state spaces. Simulation and experimental results show that, compared with previous RPI methods, the proposed approach needs fewer sample points to compute an efficient set of basis functions and the learning control performance can be improved for a variety of parameter settings.
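As a rough illustration of the construction described in this abstract, the following Python sketch builds a graph Laplacian over a handful of cluster centres and extracts its smoothest non-constant eigenvector as one candidate basis function. It is not the authors' implementation: the fully connected Gaussian-weighted graph, the `sigma` bandwidth, the function names and the shifted power iteration are all assumptions chosen to keep the example dependency-free, and the centres are taken as given rather than computed by K-means or fuzzy C-means.

```python
import math


def graph_laplacian(centers, sigma=1.0):
    """Combinatorial Laplacian L = D - W over cluster centres.

    Assumption: a fully connected graph with Gaussian edge weights.
    """
    n = len(centers)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(centers[i], centers[j])
            W[i][j] = W[j][i] = math.exp(-d * d / (2.0 * sigma ** 2))
    # Diagonal holds the node degree; off-diagonal entries are -W[i][j].
    return [[(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
            for i in range(n)]


def smoothest_basis(L, iters=500):
    """Approximate the eigenvector of L with the smallest eigenvalue
    among vectors orthogonal to the constant vector.

    Uses power iteration on (shift * I - L); shift = 2 * max degree
    bounds the spectrum of L by Gershgorin's theorem.
    Requires at least two centres.
    """
    n = len(L)
    shift = 2.0 * max(L[i][i] for i in range(n))
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector
        v = [shift * v[i] - sum(L[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return v


# Hypothetical one-dimensional state space with two groups of centres.
centers = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
phi = smoothest_basis(graph_laplacian(centers, sigma=1.0))
print(phi)
```

On this toy input the resulting vector takes one sign on the first group of centres and the opposite sign on the second, which is the kind of geometry-aware feature a value function approximator could use. Extending the basis from the centres to arbitrary states is not covered here.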
NASA Technical Reports Server (NTRS)
Houdashelt, Mark L.; Frogel, Jay A.
1993-01-01
Earlier researchers derived the relative distance between the Coma and Virgo clusters from color-magnitude relations of the early-type galaxies in each cluster. They found that the derived distance was color-dependent and concluded that the galaxies of similar luminosity in the two clusters differ in their red stellar populations. More recently, the color-dependence of the Coma-Virgo distance modulus has been called into question. However, because these two clusters differ so dramatically in their morphologies and kinematics, it is plausible that the star formation histories of the member galaxies also differed. If the conclusions of earlier researchers are indeed correct, then some signature of the resulting stellar population differences should appear in the near-infrared and/or infrared light of the respective galaxies. We have collected near-infrared spectra of 17 Virgo and 10 Coma early-type galaxies; this sample spans about four magnitudes in luminosity in each cluster. Seven field E/S0 galaxies have been observed for comparison. Pseudo-equivalent widths have been measured for all of the field galaxies, all but one of the Virgo members, and five of the Coma galaxies. The features examined are sensitive to the temperature, metallicity, and surface gravity of the reddest stars. A preliminary analysis of these spectral features has been performed, and, with a few notable exceptions, the measured pseudo-equivalent widths agree well with previously published values.
From hormones to secondary metabolism: the emergence of metabolic gene clusters in plants.
Chu, Hoi Yee; Wegel, Eva; Osbourn, Anne
2011-04-01
Gene clusters for the synthesis of secondary metabolites are a common feature of microbial genomes. Well-known examples include clusters for the synthesis of antibiotics in actinomycetes, and also for the synthesis of antibiotics and toxins in filamentous fungi. Until recently it was thought that genes for plant metabolic pathways were not clustered, and this is certainly true in many cases; however, five plant secondary metabolic gene clusters have now been discovered, all of them implicated in synthesis of defence compounds. An obvious assumption might be that these eukaryotic gene clusters have arisen by horizontal gene transfer from microbes, but there is compelling evidence to indicate that this is not the case. This raises intriguing questions about how widespread such clusters are, what the significance of clustering is, why genes for some metabolic pathways are clustered and those for others are not, and how these clusters form. In answering these questions we may hope to learn more about mechanisms of genome plasticity and adaptive evolution in plants. It is noteworthy that for the five plant secondary metabolic gene clusters reported so far, the enzymes for the first committed steps all appear to have been recruited directly or indirectly from primary metabolic pathways involved in hormone synthesis. This may or may not turn out to be a common feature of plant secondary metabolic gene clusters as new clusters emerge. © 2011 The Authors. The Plant Journal © 2011 Blackwell Publishing Ltd.
Zhu, Bohui; Ding, Yongsheng; Hao, Kuangrong
2013-01-01
This paper presents a novel maximum margin clustering method with immune evolution (IEMMC) for automatic diagnosis of electrocardiogram (ECG) arrhythmias. This diagnostic system consists of signal processing, feature extraction, and the IEMMC algorithm for clustering of ECG arrhythmias. First, the raw ECG signal is processed by an adaptive ECG filter based on wavelet transforms, and the waveform of the ECG signal is detected; then, features are extracted from the ECG signal to cluster different types of arrhythmias by the IEMMC algorithm. Three performance indicators, sensitivity, specificity, and accuracy, are used to assess the effect of the IEMMC method for ECG arrhythmias. Compared with K-means and iterSVR algorithms, the IEMMC algorithm shows better performance not only in clustering result but also in terms of global search ability and convergence ability, which proves its effectiveness for the detection of ECG arrhythmias. PMID:23690875
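The three indicators named in this abstract have standard definitions, sketched below in Python for the two-class case. The function name and the label encoding (1 for an arrhythmic beat, 0 for a normal beat) are assumptions for illustration, and the sketch presumes that cluster labels have already been matched to reference classes, a step the abstract does not describe.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity and accuracy from paired label lists.

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical reference labels and cluster-derived labels for eight beats.
reference = [1, 1, 1, 0, 0, 0, 0, 0]
assigned = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(reference, assigned))
```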
Micro-flock patterns and macro-clusters in chiral active Brownian disks
NASA Astrophysics Data System (ADS)
Levis, Demian; Liebchen, Benno
2018-02-01
Chiral active particles (or self-propelled circle swimmers) feature a rich collective behavior, comprising rotating macro-clusters and micro-flock patterns which consist of phase-synchronized rotating clusters with a characteristic self-limited size. These patterns emerge from the competition of alignment interactions and rotations, suggesting that they might occur generically in many chiral active matter systems. However, although excluded volume interactions occur naturally among typical circle swimmers, it is not yet clear if macro-clusters and micro-flock patterns survive their presence. The present work shows that both types of pattern do survive but feature strongly enhanced fluctuations regarding the size and shape of the individual clusters. Despite these fluctuations, we find that the average micro-flock size still follows the same characteristic scaling law as in the absence of excluded volume interactions, i.e. micro-flock sizes scale linearly with the single-swimmer radius.
Utilizing the Structure and Content Information for XML Document Clustering
NASA Astrophysics Data System (ADS)
Tran, Tien; Kutty, Sangeetha; Nayak, Richi
This paper reports on the experiments and results of a clustering approach used in the INEX 2008 document mining challenge. The clustering approach utilizes both the structure and content information of the Wikipedia XML document collection. A latent semantic kernel (LSK) is used to measure the semantic similarity between XML documents based on their content features. The construction of a latent semantic kernel involves computing a singular value decomposition (SVD). On a large feature space matrix, the computation of SVD is very expensive in terms of time and memory requirements. Thus in this clustering approach, the dimension of the document space of a term-document matrix is reduced before performing SVD. The document space reduction is based on the common structural information of the Wikipedia XML document collection. The proposed clustering approach has been shown to be effective on the Wikipedia collection in the INEX 2008 document mining challenge.
ten Have, Arjen; Dekkers, Ester; Kay, John; Phylip, Lowri H; van Kan, Jan A L
2004-07-01
Botrytis cinerea, an important fungal plant pathogen, secretes aspartic proteinase (AP) activity in axenic cultures. No cysteine, serine or metalloproteinase activity could be detected. Proteinase activity was higher in culture medium containing BSA or wheat germ extract, as compared to minimal medium. A proportion of the enzyme activity remained in the extracellular glucan sheath. AP was also the only type of proteinase activity in fluid obtained from B. cinerea-infected tissue of apple, pepper, tomato and zucchini. Five B. cinerea genes encoding an AP were cloned and denoted Bcap1-5. Features of the encoded proteins are discussed. BcAP1, especially, has novel characteristics. A phylogenetic analysis was performed comprising sequences originating from different kingdoms. BcAP1 and BcAP5 did not cluster in a bootstrap-supported clade. BcAP2 clusters with vacuolar APs. BcAP3 and BcAP4 cluster with secreted APs in a clade that also contains glycosylphosphatidylinositol-anchored proteinases from Saccharomyces cerevisiae and Candida albicans. All five Bcap genes are expressed in liquid cultures. Transcript levels of Bcap1, Bcap2, Bcap3 and Bcap4 are subject to glucose and peptone repression. Transcripts from all five Bcap genes were detected in infected plant tissue, indicating that at least part of the AP activity in planta originates from the pathogen.
Key-Node-Separated Graph Clustering and Layouts for Human Relationship Graph Visualization.
Itoh, Takayuki; Klein, Karsten
2015-01-01
Many graph-drawing methods apply node-clustering techniques based on the density of edges to find tightly connected subgraphs and then hierarchically visualize the clustered graphs. However, users may want to focus on important nodes and their connections to groups of other nodes for some applications. For this purpose, it is effective to separately visualize the key nodes detected based on adjacency and attributes of the nodes. This article presents a graph visualization technique for attribute-embedded graphs that applies a graph-clustering algorithm that accounts for the combination of connections and attributes. The graph clustering step divides the nodes according to the commonality of connected nodes and similarity of feature value vectors. It then calculates the distances between arbitrary pairs of clusters according to the number of connecting edges and the similarity of feature value vectors and finally places the clusters based on the distances. Consequently, the technique separates important nodes that have connections to multiple large clusters and improves the visibility of such nodes' connections. To test this technique, this article presents examples with human relationship graph datasets, including a coauthorship and Twitter communication network dataset.
Cluster Headache: Epidemiology, Pathophysiology, Clinical Features, and Diagnosis
Wei, Diana Yi-Ting; Yuan Ong, Jonathan Jia; Goadsby, Peter James
2018-01-01
Cluster headache is a primary headache disorder affecting up to 0.1% of the population. Patients suffer from cluster headache attacks lasting from 15 to 180 min up to 8 times a day. The attacks are characterized by severe unilateral pain mainly in the first division of the trigeminal nerve, with associated prominent unilateral cranial autonomic symptoms and a sense of agitation and restlessness during the attacks. The male-to-female ratio is approximately 2.5:1. Experimental, clinical, and neuroimaging studies have advanced our understanding of the pathogenesis of cluster headache. The pathophysiology involves activation of the trigeminovascular complex and the trigeminal-autonomic reflex and accounts for the unilateral severe headache and the prominent ipsilateral cranial autonomic symptoms. In addition, the circadian and circannual rhythmicity unique to this condition is postulated to involve the hypothalamus and suprachiasmatic nucleus. Although the clinical features are distinct, it may be misdiagnosed, with patients often presenting to the otolaryngologist or dentist with symptoms. The prognosis of cluster headache remains difficult to predict. Patients with episodic cluster headache can shift to chronic cluster headache and vice versa. Longitudinally, cluster headache tends to remit with age with less frequent bouts and more prolonged periods of remission in between bouts. PMID:29720812
Takeuchi, Hiroshi
2012-10-18
The structures of the simplest aromatic clusters, benzene clusters (C6H6)n, are not well elucidated. In the present study, benzene clusters (C6H6)n (n ≤ 30) were investigated with the all-atom optimized parameters for liquid simulation (OPLS) potential. The global minima and low-lying minima of the benzene clusters were searched with the heuristic method combined with geometrical perturbations. The structural features and growth sequence of the clusters were examined by carrying out local structure analyses and structural similarity evaluation with rotational constants. Because of the anisotropic interaction between the benzene molecules, the local structures consisting of 13 molecules deviate considerably from a regular icosahedron, and the geometries of some of the clusters are inconsistent with the shapes constructed by the interior molecules. The distribution of the angle between the lines normal to two neighboring benzene rings is anisotropic in the clusters, whereas that in liquid benzene is nearly isotropic. The geometries and energies of the low-lying configurations and the saddle points between them suggest that most of the configurations previously detected in supersonic expansions take different orientations for one to four neighboring molecules.
Adaptive Scaling of Cluster Boundaries for Large-Scale Social Media Data Clustering.
Meng, Lei; Tan, Ah-Hwee; Wunsch, Donald C
2016-12-01
The large scale and complex nature of social media data raises the need to scale clustering techniques to big data and make them capable of automatically identifying data clusters with few empirical settings. In this paper, we present our investigation and three algorithms based on the fuzzy adaptive resonance theory (Fuzzy ART) that have linear computational complexity, use a single parameter, i.e., the vigilance parameter to identify data clusters, and are robust to modest parameter settings. The contribution of this paper lies in two aspects. First, we theoretically demonstrate how complement coding, commonly known as a normalization method, changes the clustering mechanism of Fuzzy ART, and discover the vigilance region (VR) that essentially determines how a cluster in the Fuzzy ART system recognizes similar patterns in the feature space. The VR gives an intrinsic interpretation of the clustering mechanism and limitations of Fuzzy ART. Second, we introduce the idea of allowing different clusters in the Fuzzy ART system to have different vigilance levels in order to meet the diverse nature of the pattern distribution of social media data. To this end, we propose three vigilance adaptation methods, namely, the activation maximization (AM) rule, the confliction minimization (CM) rule, and the hybrid integration (HI) rule. With an initial vigilance value, the resulting clustering algorithms, namely, the AM-ART, CM-ART, and HI-ART, can automatically adapt the vigilance values of all clusters during the learning epochs in order to produce better cluster boundaries. Experiments on four social media data sets show that AM-ART, CM-ART, and HI-ART are more robust than Fuzzy ART to the initial vigilance value, and they usually achieve better or comparable performance and much faster speed than the state-of-the-art clustering algorithms that also do not require a predefined number of clusters.
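For readers unfamiliar with the baseline that this abstract builds on, the sketch below shows a minimal Fuzzy ART learner with complement coding and a single global vigilance value in Python. It follows the commonly stated Fuzzy ART equations (choice function, vigilance test, weight update) and does not implement the AM, CM or HI adaptation rules proposed in the paper. The parameter names `rho`, `alpha` and `beta` and the toy data are assumptions for illustration.

```python
def complement_code(x):
    """Map x in [0, 1]^d to the 2d-dimensional vector [x, 1 - x]."""
    return list(x) + [1.0 - v for v in x]


def fuzzy_and(a, b):
    """Element-wise minimum, the fuzzy AND operator."""
    return [min(u, v) for u, v in zip(a, b)]


def fuzzy_art(samples, rho=0.7, alpha=0.01, beta=1.0):
    """Single-pass Fuzzy ART clustering.

    rho: vigilance in [0, 1]; alpha: choice parameter; beta: learning rate.
    Returns (labels, weights), one weight vector per discovered cluster.
    """
    weights, labels = [], []
    for x in samples:
        i = complement_code(x)
        norm_i = sum(i)
        # Rank existing clusters by the choice function |I ^ w| / (alpha + |w|).
        order = sorted(
            range(len(weights)),
            key=lambda j: -sum(fuzzy_and(i, weights[j])) / (alpha + sum(weights[j])),
        )
        winner = None
        for j in order:
            # Vigilance test: match |I ^ w| / |I| must reach rho.
            if sum(fuzzy_and(i, weights[j])) / norm_i >= rho:
                winner = j
                break
        if winner is None:
            weights.append(i[:])  # no cluster resonates: create a new one
            winner = len(weights) - 1
        else:
            w = weights[winner]
            m = fuzzy_and(i, w)
            weights[winner] = [beta * a + (1.0 - beta) * b for a, b in zip(m, w)]
        labels.append(winner)
    return labels, weights


# Hypothetical two-dimensional feature vectors already scaled to [0, 1].
data = [(0.10, 0.10), (0.12, 0.08), (0.90, 0.90), (0.88, 0.92)]
labels, _ = fuzzy_art(data, rho=0.7)
print(labels)
```

Raising `rho` makes every cluster stricter about what it accepts, so the same data splits into more clusters. This sensitivity to a single global vigilance value is the motivation the abstract gives for letting each cluster adapt its own vigilance.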
Mining Co-Location Patterns with Clustering Items from Spatial Data Sets
NASA Astrophysics Data System (ADS)
Zhou, G.; Li, Q.; Deng, G.; Yue, T.; Zhou, X.
2018-05-01
The explosive growth of spatial data and the widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features that are frequently located together in geographic space. However, the appearance of a spatial feature C is often determined not by a single spatial feature A or B but by the two features together; that is, where A and B appear together, C often appears. This co-location pattern differs from the traditional co-location pattern. This paper therefore presents a new concept called clustering terms, and the resulting pattern is called a co-location pattern with clustering items. Because the traditional algorithm cannot mine this pattern, we introduce the related concepts in detail and propose a novel algorithm that extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
Oluwadare, Oluwatosin; Cheng, Jianlin
2017-11-14
With the development of chromosomal conformation capturing techniques, particularly the Hi-C technique, the study of the spatial conformation of a genome is becoming an important topic in bioinformatics and computational biology. The Hi-C technique can generate genome-wide chromosomal interaction (contact) data, which can be used to investigate the higher-level organization of chromosomes, such as Topologically Associated Domains (TADs), i.e., locally packed chromosome regions bound together by intrachromosomal contacts. The identification of the TADs for a genome is useful for studying gene regulation, genomic interaction, and genome function. Here, we formulate the TAD identification problem as an unsupervised machine learning (clustering) problem, and develop a new TAD identification method called ClusterTAD. We introduce a novel method to represent chromosomal contacts as features to be used by the clustering algorithm. Our results show that ClusterTAD can accurately predict the TADs on simulated Hi-C data. Our method is also largely complementary and consistent with existing methods on the real Hi-C datasets of two mouse cells. The validation with the chromatin immunoprecipitation (ChIP) sequencing (ChIP-Seq) data shows that the domain boundaries identified by ClusterTAD have a high enrichment of CTCF binding sites, promoter-related marks, and enhancer-related histone modifications. As ClusterTAD is based on a proven clustering approach, it opens a new avenue to apply a large array of clustering methods developed in the machine learning field to the TAD identification problem. The source code, the results, and the TADs generated for the simulated and real Hi-C datasets are available here: https://github.com/BDM-Lab/ClusterTAD.
Automatic classification of endoscopic images for premalignant conditions of the esophagus
NASA Astrophysics Data System (ADS)
Boschetto, Davide; Gambaretto, Gloria; Grisan, Enrico
2016-03-01
Barrett's esophagus (BE) is a precancerous complication of gastroesophageal reflux disease in which normal stratified squamous epithelium lining the esophagus is replaced by intestinal metaplastic columnar epithelium. Repeated endoscopies and multiple biopsies are often necessary to establish the presence of intestinal metaplasia. Narrow Band Imaging (NBI) is an imaging technique commonly used with endoscopies that enhances the contrast of the vascular pattern on the mucosa. We present a computer-based method for the automatic normal/metaplastic classification of endoscopic NBI images. Superpixel segmentation is used to identify and cluster pixels belonging to uniform regions. From each uniform clustered region of pixels, eight features maximizing differences between normal and metaplastic epithelium are extracted for the classification step. For each superpixel, the three mean intensities of each color channel are first selected as features. Three added features are the mean intensities for each superpixel after separately applying to the red-channel image three different morphological filters (top-hat filtering, entropy filtering and range filtering). The last two features require the computation of the Grey-Level Co-Occurrence Matrix (GLCM), and are reflective of the contrast and the homogeneity of each superpixel. The classification step is performed using an ensemble of 50 classification trees, with a 10-fold cross-validation scheme by training the classifier at each step on a random 70% of the images and testing on the remaining 30% of the dataset. Sensitivity and specificity are 79.2% and 87.3%, respectively, with an overall accuracy of 83.9%.
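The two GLCM-derived features mentioned in this abstract can be illustrated with a short Python sketch. The abstract does not state the pixel offset, the number of grey levels or the exact homogeneity formula, so the choices below are assumptions: a symmetric, normalized matrix for a one-pixel horizontal offset, and homogeneity weighted by 1 / (1 + |i - j|). The function names are hypothetical, and the sketch works on a small rectangular patch rather than an irregular superpixel.

```python
def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized grey-level co-occurrence matrix.

    image: list of rows of integer grey levels in range(levels).
    (dx, dy): pixel offset defining which neighbours are paired.
    Assumes at least one pixel pair exists for the chosen offset.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count both directions for symmetry
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Sum of p[i][j] * (i - j)^2: large when neighbours differ strongly."""
    return sum(p[i][j] * (i - j) ** 2
               for i in range(len(p)) for j in range(len(p)))


def homogeneity(p):
    """Sum of p[i][j] / (1 + |i - j|): close to 1 for uniform texture."""
    return sum(p[i][j] / (1 + abs(i - j))
               for i in range(len(p)) for j in range(len(p)))


# Hypothetical 2-level patches: a flat region and a checkerboard.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
for patch in (flat, checker):
    p = glcm(patch, levels=2)
    print(contrast(p), homogeneity(p))
```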
Bushel, Pierre R; Wolfinger, Russell D; Gibson, Greg
2007-01-01
Background Commonly employed clustering methods for analysis of gene expression data do not directly incorporate phenotypic data about the samples. Furthermore, clustering of samples with known phenotypes is typically performed in an informal fashion. The inability of clustering algorithms to incorporate biological data in the grouping process can limit proper interpretation of the data and its underlying biology. Results We present a more formal approach, the modk-prototypes algorithm, for clustering biological samples based on simultaneously considering microarray gene expression data and classes of known phenotypic variables such as clinical chemistry evaluations and histopathologic observations. The strategy involves constructing an objective function with the sum of the squared Euclidean distances for numeric microarray and clinical chemistry data and simple matching for histopathology categorical values in order to measure dissimilarity of the samples. Separate weighting terms are used for microarray, clinical chemistry and histopathology measurements to control the influence of each data domain on the clustering of the samples. The dynamic validity index for numeric data was modified with a category utility measure for determining the number of clusters in the data sets. A cluster's prototype, formed from the mean of the values for numeric features and the mode of the categorical values of all the samples in the group, is representative of the phenotype of the cluster members. The approach is shown to work well with a simulated mixed data set and two real data examples containing numeric and categorical data types. One from a heart disease study and another from acetaminophen (an analgesic) exposure in rat liver that causes centrilobular necrosis. 
Conclusion The modk-prototypes algorithm partitioned the simulated data into clusters with samples in their respective class group and the heart disease samples into two groups (sick and buff denoting samples having pain type representative of angina and non-angina respectively) with an accuracy of 79%. This is on par with, or better than, the assignment accuracy of the heart disease samples by several well-known and successful clustering algorithms. Following modk-prototypes clustering of the acetaminophen-exposed samples, informative genes from the cluster prototypes were identified that are descriptive of, and phenotypically anchored to, levels of necrosis of the centrilobular region of the rat liver. The biological processes cell growth and/or maintenance, amine metabolism, and stress response were shown to discern between no and moderate levels of acetaminophen-induced centrilobular necrosis. The use of well-known and traditional measurements directly in the clustering provides some guarantee that the resulting clusters will be meaningfully interpretable. PMID:17408499
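The dissimilarity and prototype described in this abstract can be sketched as follows. This is an illustrative reading of the text (weighted squared Euclidean distance for the two numeric domains plus weighted simple matching for the categorical domain), not the authors' implementation; the dictionary keys `expr`, `chem` and `histo` and both function names are hypothetical.

```python
from collections import Counter


def mixed_dissimilarity(sample, prototype, weights):
    """Weighted dissimilarity between one sample and a cluster prototype.

    sample / prototype : dicts with "expr" and "chem" (lists of floats)
                         and "histo" (list of category labels)
    weights            : dict with one non-negative weight per data domain
    """
    def squared_euclidean(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Simple matching: count the categorical positions that disagree.
    mismatches = sum(
        1 for x, y in zip(sample["histo"], prototype["histo"]) if x != y
    )
    return (
        weights["expr"] * squared_euclidean(sample["expr"], prototype["expr"])
        + weights["chem"] * squared_euclidean(sample["chem"], prototype["chem"])
        + weights["histo"] * mismatches
    )


def cluster_prototype(members):
    """Prototype = mean of numeric features, mode of categorical features."""
    n = len(members)
    proto = {}
    for key in ("expr", "chem"):
        columns = zip(*(m[key] for m in members))
        proto[key] = [sum(col) / n for col in columns]
    # Ties in the mode are broken by first occurrence (Counter ordering).
    proto["histo"] = [
        Counter(col).most_common(1)[0][0]
        for col in zip(*(m["histo"] for m in members))
    ]
    return proto
```

Raising one domain's weight makes that domain dominate the cluster assignment, which is how the separate weighting terms mentioned in the abstract control the influence of each data type.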
Anderson, Jordan M.; Kier, Brandon; Jurban, Brice; Byrne, Aimee; Shu, Irene; Eidenschink, Lisa A.; Shcherbakov, Alexander A.; Hudson, Mike; Fesinmeyer, R. M.; Andersen, Niels H.
2017-01-01
We have extended our studies of Trp/Trp to other Aryl/Aryl through-space interactions that stabilize hairpins and other small polypeptide folds. Herein we detail the NMR and CD spectroscopic features of these types of interactions. NMR data remains the best diagnostic for characterizing the common T-shape orientation. Designated as an edge-to-face (EtF or FtE) interaction, large ring current shifts are produced at the edge aryl ring hydrogens and, in most cases, large exciton couplets appear in the far UV circular dichroic (CD) spectrum. The preference for the face aryl in FtE clusters is W≫Y≥F (there are some exceptions in the Y/F order); this sequence corresponds to the order of fold stability enhancement and always predicts the amplitude of the lower energy feature of the exciton couplet in the CD spectrum. The CD spectra for FtE W/W, W/Y, Y/W, and Y/Y pairs all include an intense feature at 225–232 nm. An additional couplet feature seen for W/Y, W/F, Y/Y and F/Y clusters, is a negative feature at 197–200 nm. Tyr/Tyr (as well as F/Y and F/F) interactions produce much smaller exciton couplet amplitudes. The Trp-cage fold was employed to search for the CD effects of other Trp/Trp and Trp/Tyr cluster geometries: several were identified. In this account, we provide additional examples of the application of cross-strand aryl/aryl clusters for the design of stable β-sheet models and a scale of fold stability increments associated with all possible FtE Ar/Ar clusters in several structural contexts. PMID:26850220
Hierarchical video summarization
NASA Astrophysics Data System (ADS)
Ratakonda, Krishna; Sezan, M. Ibrahim; Crinon, Regis J.
1998-12-01
We address the problem of key-frame summarization of video in the absence of any a priori information about its content. This is a common problem that is encountered in home videos. We propose a hierarchical key-frame summarization algorithm where a coarse-to-fine key-frame summary is generated. A hierarchical key-frame summary facilitates multi-level browsing where the user can quickly discover the content of the video by accessing its coarsest but most compact summary and then view a desired segment of the video with increasingly more detail. At the finest level, the summary is generated on the basis of color features of video frames, using an extension of a recently proposed key-frame extraction algorithm. The finest level key-frames are recursively clustered using a novel pairwise K-means clustering approach with temporal consecutiveness constraint. We also address summarization of MPEG-2 compressed video without fully decoding the bitstream. We also propose efficient mechanisms that facilitate decoding the video when the hierarchical summary is utilized in browsing and playback of video segments starting at selected key-frames.
Yiu, Sean; Farewell, Vernon T; Tom, Brian D M
2018-02-01
In psoriatic arthritis, it is important to understand the joint activity (represented by swelling and pain) and damage processes because both are related to severe physical disability. The paper aims to provide a comprehensive investigation into both processes occurring over time, in particular their relationship, by specifying a joint multistate model at the individual hand joint level, which also accounts for many of their important features. As there are multiple hand joints, such an analysis will be based on the use of clustered multistate models. Here we consider an observation level random-effects structure with dynamic covariates and allow for the possibility that a subpopulation of patients is at minimal risk of damage. Such an analysis is found to provide further understanding of the activity-damage relationship beyond that provided by previous analyses. Consideration is also given to the modelling of mean sojourn times and jump probabilities. In particular, a novel model parameterization which allows easily interpretable covariate effects to act on these quantities is proposed.
3D variational brain tumor segmentation on a clustered feature set
NASA Astrophysics Data System (ADS)
Popuri, Karteek; Cobzas, Dana; Jagersand, Martin; Shah, Sirish L.; Murtha, Albert
2009-02-01
Tumor segmentation from MRI data is a particularly challenging and time consuming task. Tumors have a large diversity in shape and appearance with intensities overlapping the normal brain tissues. In addition, an expanding tumor can also deflect and deform nearby tissue. Our work addresses these last two difficult problems. We use the available MRI modalities (T1, T1c, T2) and their texture characteristics to construct a multi-dimensional feature set. Further, we extract clusters which provide a compact representation of the essential information in these features. The main idea in this paper is to incorporate these clustered features into the 3D variational segmentation framework. In contrast to the previous variational approaches, we propose a segmentation method that evolves the contour in a supervised fashion. The segmentation boundary is driven by the learned inside and outside region voxel probabilities in the cluster space. We incorporate prior knowledge about the normal brain tissue appearance during the estimation of these region statistics. In particular, we use a Dirichlet prior that discourages clusters in the ventricles from being assigned to the tumor and hence better disambiguates the tumor from brain tissue. We show the performance of our method on real MRI scans. The experimental dataset includes MRI scans from patients with difficult instances, with tumors that are inhomogeneous in appearance, small in size and in proximity to the major structures in the brain. Our method shows good results on these test cases.
Integrative Data Analysis of Multi-Platform Cancer Data with a Multimodal Deep Learning Approach.
Liang, Muxuan; Li, Zhizhong; Chen, Ting; Zeng, Jianyang
2015-01-01
Identification of cancer subtypes plays an important role in revealing useful insights into disease pathogenesis and advancing personalized therapy. The recent development of high-throughput sequencing technologies has enabled the rapid collection of multi-platform genomic data (e.g., gene expression, miRNA expression, and DNA methylation) for the same set of tumor samples. Although numerous integrative clustering approaches have been developed to analyze cancer data, few of them are particularly designed to exploit both deep intrinsic statistical properties of each input modality and complex cross-modality correlations among multi-platform input data. In this paper, we propose a new machine learning model, called multimodal deep belief network (DBN), to cluster cancer patients from multi-platform observation data. In our integrative clustering framework, relationships among inherent features of each single modality are first encoded into multiple layers of hidden variables, and then a joint latent model is employed to fuse common features derived from multiple input modalities. A practical learning algorithm, called contrastive divergence (CD), is applied to infer the parameters of our multimodal DBN model in an unsupervised manner. Tests on two available cancer datasets show that our integrative data analysis approach can effectively extract a unified representation of latent features to capture both intra- and cross-modality correlations, and identify meaningful disease subtypes from multi-platform cancer data. In addition, our approach can identify key genes and miRNAs that may play distinct roles in the pathogenesis of different cancer subtypes. Among those key miRNAs, we found that the expression level of miR-29a is highly correlated with survival time in ovarian cancer patients. 
These results indicate that our multimodal DBN based data analysis approach may have practical applications in cancer pathogenesis studies and provide useful guidelines for personalized cancer therapy.
NASA Astrophysics Data System (ADS)
Kazin, Eyal A.; Sánchez, Ariel G.; Cuesta, Antonio J.; Beutler, Florian; Chuang, Chia-Hsun; Eisenstein, Daniel J.; Manera, Marc; Padmanabhan, Nikhil; Percival, Will J.; Prada, Francisco; Ross, Ashley J.; Seo, Hee-Jong; Tinker, Jeremy; Tojeiro, Rita; Xu, Xiaoying; Brinkmann, J.; Joel, Brownstein; Nichol, Robert C.; Schlegel, David J.; Schneider, Donald P.; Thomas, Daniel
2013-10-01
We analyse the 2D correlation function of the Sloan Digital Sky Survey-III Baryon Oscillation Spectroscopic Survey (BOSS) CMASS sample of massive galaxies of the ninth data release to measure cosmic expansion H and the angular diameter distance DA at a mean redshift of
NASA Astrophysics Data System (ADS)
Tamiminia, Haifa; Homayouni, Saeid; McNairn, Heather; Safari, Abdoreza
2017-06-01
Polarimetric Synthetic Aperture Radar (PolSAR) data, thanks to their specific characteristics such as high resolution and weather and daylight independence, have become a valuable source of information for environment monitoring and management. The discrimination capability of observations acquired by these sensors can be used for land cover classification and mapping. The aim of this paper is to propose an optimized kernel-based C-means clustering algorithm for agriculture crop mapping from multi-temporal PolSAR data. Firstly, several polarimetric features are extracted from preprocessed data. These features are linear polarization intensities, and several statistical and physical based decompositions such as Cloude-Pottier, Freeman-Durden and Yamaguchi techniques. Then, kernelized versions of the hard and fuzzy C-means clustering algorithms are applied to these polarimetric features in order to identify crop types. Unlike conventional partitioning clustering algorithms, the kernel function simplifies the non-spherical and non-linear patterns in the data structure so that they can be clustered easily. In addition, in order to enhance the results, the Particle Swarm Optimization (PSO) algorithm is used to tune the kernel parameters and cluster centers and to optimize feature selection. The efficiency of this method was evaluated by using multi-temporal UAVSAR L-band images acquired over an agricultural area near Winnipeg, Manitoba, Canada, during June and July in 2012. The results demonstrate more accurate crop maps using the proposed method when compared to the classical approaches (e.g. a 12% improvement in general). In addition, when the optimization technique is used, greater improvement is observed in crop classification, e.g. 5% overall. Furthermore, a strong relationship between the Freeman-Durden volume scattering component, which is related to canopy structure, and phenological growth stages is observed.
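The membership step of a kernelized fuzzy C-means can be sketched as below. This is one common formulation and an assumption on our part: a Gaussian (RBF) kernel, cluster centres kept in the input space, and the kernel-induced squared distance 2(1 − K(x, v)). The abstract does not state which kernel or update rule the authors used, and the PSO tuning of `sigma` and the centres is not shown.

```python
import math


def rbf_kernel(x, v, sigma):
    """Gaussian kernel K(x, v) = exp(-||x - v||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def kernel_fuzzy_memberships(x, centers, sigma=1.0, m=2.0):
    """Fuzzy memberships of feature vector x in each cluster.

    Uses the kernel-induced squared distance d^2 = 2 * (1 - K(x, v)),
    which holds for kernels with K(x, x) = 1 such as the RBF kernel.
    m > 1 is the usual fuzzifier; larger m gives softer memberships.
    """
    d2 = [2.0 * (1.0 - rbf_kernel(x, v, sigma)) for v in centers]

    # If x coincides with one or more centres, share full membership among them.
    zeros = [i for i, d in enumerate(d2) if d == 0.0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(centers))]

    inv = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [w / total for w in inv]
```

A hard (crisp) assignment follows by taking the index of the largest membership; in an optimized variant, a search procedure such as PSO would adjust `sigma` and the centres to improve a clustering objective.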
Atik, Meryem; Işıklı, Rabia Canay; Ortaçeşme, Veli
2016-11-01
Landscape is the natural and cultural features of the environment. Characters are distinct recognisable patterns in the landscape that were comprised as a result of human and nature interactions. Landscape characters demonstrate precise features and values that exist in the current environment and provide information for those who use, manage, live in, benefit from and enjoy the landscape. The aim of this study was to interpret landscape characters with a common set of terminology, to evaluate clusters of characters, and to discuss how they can be used as a communicative tool in characterisation in the case of the Side District in the Turkish Mediterranean. A total of 35 landscape characters were analysed as variables with aesthetic, cultural, value, perceptual and natural features so as to communicate between characters and landscapes. The study results demonstrated that clusters of landscape characters were divided into 3 character groups; mainly cultural, mainly cultural and a joint cluster of aesthetic, and perceptual and value aspects while spatial composition of landscape character groups was named and mapped as natural, rural, historical, urban and buffer. Aesthetic features were the most prominent elements as they combined in all sub-clusters giving the evidence that landscape is a visual construct. However, landscape characters which can be either outstanding or ordinary and their clusters provide exchange of information about relationship between man and nature, natural and cultural, objective and subjective for planners and managers, for public and professionals. Landscape characters become a body of message which ultimately offers a framework for planners and decision makers for both maintenance and protection of landscapes. Copyright © 2016 Elsevier Ltd. All rights reserved.
Polymorphism of Lysozyme Condensates.
Safari, Mohammad S; Byington, Michael C; Conrad, Jacinta C; Vekilov, Peter G
2017-10-05
Protein condensates play essential roles in physiological processes and pathological conditions. Recently discovered mesoscopic protein-rich clusters may act as crucial precursors for the nucleation of ordered protein solids, such as crystals, sickle hemoglobin polymers, and amyloid fibrils. These clusters challenge settled paradigms of protein condensation as the constituent protein molecules present features characteristic of both partially misfolded and native proteins. Here we employ the antimicrobial enzyme lysozyme and examine the similarities between mesoscopic clusters, amyloid structures, and disordered aggregates consisting of chemically modified protein. We show that the mesoscopic clusters are distinct from the other two classes of aggregates. Whereas cluster formation and amyloid oligomerization are both reversible, aggregation triggered by reduction of the intramolecular S-S bonds is permanent. In contrast to the amyloid structures, protein molecules in the clusters retain their enzymatic activity. Furthermore, an essential feature of the mesoscopic clusters is their constant radius of less than 50 nm. The amyloid and disordered aggregates are significantly larger and rapidly grow. These findings demonstrate that the clusters are a product of limited protein structural flexibility. In view of the role of the clusters in the nucleation of ordered protein solids, our results suggest that fine-tuning the degree of protein conformational stability is a powerful tool to control and direct the pathways of protein condensation.
Bacterial chemoreceptors: high-performance signaling in networked arrays.
Hazelbauer, Gerald L; Falke, Joseph J; Parkinson, John S
2008-01-01
Chemoreceptors are crucial components in the bacterial sensory systems that mediate chemotaxis. Chemotactic responses exhibit exquisite sensitivity, extensive dynamic range and precise adaptation. The mechanisms that mediate these high-performance functions involve not only actions of individual proteins but also interactions among clusters of components, localized in extensive patches of thousands of molecules. Recently, these patches have been imaged in native cells, important features of chemoreceptor structure and on-off switching have been identified, and new insights have been gained into the structural basis and functional consequences of higher order interactions among sensory components. These new data suggest multiple levels of molecular interactions, each of which contributes specific functional features and together create a sophisticated signaling device. PMID:18165013
Spectrum syntheses of high-resolution integrated light spectra of Galactic globular clusters
NASA Astrophysics Data System (ADS)
Sakari, Charli M.; Shetrone, Matthew; Venn, Kim; McWilliam, Andrew; Dotter, Aaron
2013-09-01
Spectrum syntheses for three elements (Mg, Na and Eu) in high-resolution integrated light spectra of the Galactic globular clusters 47 Tuc, M3, M13, NGC 7006 and M15 are presented, along with calibration syntheses of the solar and Arcturus spectra. Iron abundances in the target clusters are also derived from integrated light equivalent width analyses. Line profiles in the spectra of these five globular clusters are well fitted after careful consideration of the atomic and molecular spectral features, providing levels of precision that are better than equivalent width analyses of the same integrated light spectra, and that are comparable to the precision in individual stellar analyses. The integrated light abundances from the 5528 and 5711 Å Mg I lines, the 6154 and 6160 Å Na I lines, and the 6645 Å Eu II line fall within the observed ranges from individual stars; however, these integrated light abundances do not always agree with the average literature abundances. Tests with the second parameter clusters M3, M13 and NGC 7006 show that assuming an incorrect horizontal branch morphology is likely to have only a small ( ≲ 0.06 dex) effect on these Mg, Na and Eu abundances. These tests therefore show that integrated light spectrum syntheses can be applied to unresolved globular clusters over a wide range of metallicities and horizontal branch morphologies. Such high precision in integrated light spectrum syntheses is valuable for interpreting the chemical abundances of globular cluster systems around other galaxies.
Biomarker clusters are differentially associated with longitudinal cognitive decline in late midlife
Racine, Annie M.; Koscik, Rebecca L.; Berman, Sara E.; Nicholas, Christopher R.; Clark, Lindsay R.; Okonkwo, Ozioma C.; Rowley, Howard A.; Asthana, Sanjay; Bendlin, Barbara B.; Blennow, Kaj; Zetterberg, Henrik; Gleason, Carey E.; Carlsson, Cynthia M.; Johnson, Sterling C.
2016-01-01
The ability to detect preclinical Alzheimer’s disease is of great importance, as this stage of the Alzheimer’s continuum is believed to provide a key window for intervention and prevention. As Alzheimer’s disease is characterized by multiple pathological changes, a biomarker panel reflecting co-occurring pathology will likely be most useful for early detection. Towards this end, 175 late middle-aged participants (mean age 55.9 ± 5.7 years at first cognitive assessment, 70% female) were recruited from two longitudinally followed cohorts to undergo magnetic resonance imaging and lumbar puncture. Cluster analysis was used to group individuals based on biomarkers of amyloid pathology (cerebrospinal fluid amyloid-β42/amyloid-β40 assay levels), magnetic resonance imaging-derived measures of neurodegeneration/atrophy (cerebrospinal fluid-to-brain volume ratio, and hippocampal volume), neurofibrillary tangles (cerebrospinal fluid phosphorylated tau181 assay levels), and a brain-based marker of vascular risk (total white matter hyperintensity lesion volume). Four biomarker clusters emerged consistent with preclinical features of (i) Alzheimer’s disease; (ii) mixed Alzheimer’s disease and vascular aetiology; (iii) suspected non-Alzheimer’s disease aetiology; and (iv) healthy ageing. Cognitive decline was then analysed between clusters using longitudinal assessments of episodic memory, semantic memory, executive function, and global cognitive function with linear mixed effects modelling. Cluster 1 exhibited a higher intercept and greater rates of decline on tests of episodic memory. Cluster 2 had a lower intercept on a test of semantic memory and both Cluster 2 and Cluster 3 had steeper rates of decline on a test of global cognition. Additional analyses on Cluster 3, which had the smallest hippocampal volume, suggest that its biomarker profile is more likely due to hippocampal vulnerability and not to detectable specific volume loss exceeding the rate of normal ageing. 
Our results demonstrate that pathology, as indicated by biomarkers, in a preclinical timeframe is related to patterns of longitudinal cognitive decline. Such biomarker patterns may be useful for identifying at-risk populations to recruit for clinical trials. PMID:27324877
Scalable Integrated Region-Based Image Retrieval Using IRM and Statistical Clustering.
ERIC Educational Resources Information Center
Wang, James Z.; Du, Yanping
Statistical clustering is critical in designing scalable image retrieval systems. This paper presents a scalable algorithm for indexing and retrieving images based on region segmentation. The method uses statistical clustering on region features and IRM (Integrated Region Matching), a measure developed to evaluate overall similarity between images…
Semi-Supervised Clustering for High-Dimensional and Sparse Features
ERIC Educational Resources Information Center
Yan, Su
2010-01-01
Clustering is one of the most common data mining tasks, used frequently for data organization and analysis in various application domains. Traditional machine learning approaches to clustering are fully automated and unsupervised where class labels are unknown a priori. In real application domains, however, some "weak" form of side…
NASA Technical Reports Server (NTRS)
Glick, B. J.
1985-01-01
Techniques for classifying objects into groups or classes go under many different names including, most commonly, cluster analysis. Mathematically, the general problem is to find a best mapping of objects into an index set consisting of class identifiers. When an a priori grouping of objects exists, the process of deriving the classification rules from samples of classified objects is known as discrimination. When such rules are applied to objects of unknown class, the process is denoted classification. The specific problem addressed involves the group classification of a set of objects that are each associated with a series of measurements (ratio, interval, ordinal, or nominal levels of measurement). Each measurement produces one variable in a multidimensional variable space. Cluster analysis techniques are reviewed and methods for including geographic location, distance measures, and spatial pattern (distribution) as parameters in clustering are examined. For the case of patterning, measures of spatial autocorrelation are discussed in terms of the kind of data (nominal, ordinal, or interval scaled) to which they may be applied.
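As one concrete example of a spatial autocorrelation measure for interval-scaled data, the sketch below computes Moran's I. The abstract does not name the specific measures it reviews, so the choice of Moran's I, the binary adjacency weights and the function name `morans_i` are assumptions made for illustration.

```python
def morans_i(values, weights):
    """Moran's I for interval-scaled values observed at n locations.

    values  : list of n numbers, one per location
    weights : n x n list of lists; weights[i][j] > 0 when locations i and j
              are considered neighbours (diagonal entries should be 0)

    I = (n / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2
    where W is the sum of all weights.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    w_sum = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if w_sum == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for zero weights or constant values")

    num = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / w_sum) * (num / denom)
```

Values near +1 indicate that similar values cluster in space, values near -1 indicate a checkerboard-like alternation, and values near the expectation -1/(n-1) indicate no spatial pattern; such a statistic could serve as the spatial-pattern parameter the abstract mentions.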
NASA Astrophysics Data System (ADS)
Chen, Xin; Liu, Li; Zhou, Sida; Yue, Zhenjiang
2016-09-01
Reduced order models (ROMs) based on snapshots of high-fidelity CFD simulations have received great attention recently due to their capability of capturing the features of complex geometries and flow configurations. To improve the efficiency and precision of the ROMs, it is indispensable to add extra sampling points to the initial snapshots, since the number of sampling points needed to achieve an adequately accurate ROM is generally unknown a priori, while a large number of initial sampling points reduces the parsimony of the ROMs. A fuzzy-clustering-based adding-point strategy is proposed, in which the fuzzy clustering acts as an indicator of the region in which the precision of the ROMs is relatively low. The proposed method is applied to construct ROMs for benchmark mathematical examples and a numerical example of hypersonic aerothermodynamics prediction for a typical control surface. The proposed method achieves a 34.5% improvement in efficiency over the estimated mean squared error prediction algorithm and shows the same level of prediction accuracy.
ERIC Educational Resources Information Center
Yang, Jing
2018-01-01
This study investigated the durational features of English word-initial /s/+stop clusters produced by bilingual Mandarin (L1)-English (L2) children and monolingual English children and adults. The participants included two groups of five- to six-year-old bilingual children: low proficiency in the L2 (Bi-low) and high proficiency in the L2…
Discriminative clustering on manifold for adaptive transductive classification.
Zhang, Zhao; Jia, Lei; Zhang, Min; Li, Bing; Zhang, Li; Li, Fanzhang
2017-10-01
In this paper, we mainly propose a novel adaptive transductive label propagation approach by joint discriminative clustering on manifolds for representing and classifying high-dimensional data. Our framework seamlessly combines the unsupervised manifold learning, discriminative clustering and adaptive classification into a unified model. Also, our method incorporates the adaptive graph weight construction with label propagation. Specifically, our method is capable of propagating label information using adaptive weights over low-dimensional manifold features, which is different from most existing studies that usually predict the labels and construct the weights in the original Euclidean space. For transductive classification by our formulation, we first perform the joint discriminative K-means clustering and manifold learning to capture the low-dimensional nonlinear manifolds. Then, we construct the adaptive weights over the learnt manifold features, where the adaptive weights are calculated through performing the joint minimization of the reconstruction errors over features and soft labels so that the graph weights can be joint-optimal for data representation and classification. Using the adaptive weights, we can easily estimate the unknown labels of samples. After that, our method returns the updated weights for further updating the manifold features. Extensive simulations on image classification and segmentation show that our proposed algorithm can deliver the state-of-the-art performance on several public datasets. Copyright © 2017 Elsevier Ltd. All rights reserved.
Constraints on inflation with LSS surveys: features in the primordial power spectrum
NASA Astrophysics Data System (ADS)
Palma, Gonzalo A.; Sapone, Domenico; Sypsas, Spyros
2018-06-01
We analyse the efficiency of future large scale structure surveys to unveil the presence of scale dependent features in the primordial spectrum, resulting from cosmic inflation, imprinted in the distribution of galaxies. Features may appear as a consequence of non-trivial dynamics during cosmic inflation, in which one or more background quantities experienced small but rapid deviations from their characteristic slow-roll evolution. We consider two families of features: localised features and oscillatory extended features. To characterise them we employ various possible templates parametrising their scale dependence and provide forecasts on the constraints on these parametrisations for LSST-like surveys. We perform a Fisher matrix analysis for three observables: cosmic microwave background (CMB), galaxy clustering and weak lensing. We find that the combined data set of these observables will be able to limit the presence of features down to levels that are more restrictive than current constraints coming from CMB observations only. In particular, we address the possibility of gaining information on currently known deviations from scale invariance inferred from CMB data, such as the feature appearing at the l ~ 20 multipole (which is the main contribution to the low-l deficit) and another one around l ~ 800.
Reaction and Aggregation Dynamics of Cell Surface Receptors
NASA Astrophysics Data System (ADS)
Wang, Michelle Dong
This dissertation is composed of both theoretical and experimental studies of cell surface receptor reaction and aggregation. Project I studies the reaction rate enhancement due to surface diffusion of a bulk dissolved ligand with its membrane embedded target, using numerical calculations. The results show that the reaction rate enhancement is determined by ligand surface adsorption and desorption kinetic rates, surface and bulk diffusion coefficients, and geometry. In particular, we demonstrate that the ligand surface adsorption and desorption kinetic rates, rather than their ratio (the equilibrium constant), are important in rate enhancement. The second and third projects are studies of acetylcholine receptor clusters on cultured rat myotubes using fluorescence techniques after labeling the receptors with tetramethylrhodamine-alpha-bungarotoxin. The second project studies when and where the clusters form by making time-lapse movies. The movies are made from overlays of the pseudocolored total internal reflection fluorescence (TIRF) images of the cluster and the schlieren images of the cell cultures. These movies are the first movies made using TIRF, and they clearly show the cluster formation from the myoblast fusion, the first appearance of clusters, and the eventual disappearance of clusters. The third project studies the fine structural features of individual clusters observed under TIRF. The features were characterized with six parameters by developing a novel fluorescence technique: spatial fluorescence autocorrelation. These parameters were then used to study the feature variations with age, and with treatments of drugs (oligomycin and carbachol). The results show little variation with age. However, drug treatment induced significant changes in some parameters. These changes were different for oligomycin and carbachol, which indicates that the two drugs may eliminate clusters through different mechanisms.
Bae, Hyoung Won; Ji, Yongwoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun
2015-01-01
Normal-tension glaucoma (NTG) is a heterogeneous disease, and there is still controversy about subclassifications of this disorder. On the basis of spectral-domain optical coherence tomography (SD-OCT), we subdivided NTG with hierarchical cluster analysis using optic nerve head (ONH) parameters and retinal nerve fiber layer (RNFL) thicknesses. A total of 200 eyes of 200 NTG patients between March 2011 and June 2012 underwent SD-OCT scans to measure ONH parameters and RNFL thicknesses. We classified NTG into homogeneous subgroups based on these variables using a hierarchical cluster analysis, and compared clusters to evaluate diverse NTG characteristics. Three clusters were found after hierarchical cluster analysis. Cluster 1 (62 eyes) had the thickest RNFL and widest rim area, and showed early glaucoma features. Cluster 2 (60 eyes) was characterized by the largest cup/disc ratio and cup volume, and showed advanced glaucomatous damage. Cluster 3 (78 eyes) had small disc areas in SD-OCT and comprised patients with significantly younger age, longer axial length, and greater myopia than the other 2 groups. A hierarchical cluster analysis of SD-OCT scans divided NTG patients into 3 groups based upon ONH parameters and RNFL thicknesses. It is anticipated that the small disc area group, composed of younger and more myopic patients, may show unique features unlike the other 2 groups.
Network Ecology and Adolescent Social Structure
McFarland, Daniel A.; Moody, James; Diehl, David; Smith, Jeffrey A.; Thomas, Reuben J.
2014-01-01
Adolescent societies—whether arising from weak, short-term classroom friendships or from close, long-term friendships—exhibit various levels of network clustering, segregation, and hierarchy. Some are rank-ordered caste systems and others are flat, cliquish worlds. Explaining the source of such structural variation remains a challenge, however, because global network features are generally treated as the agglomeration of micro-level tie-formation mechanisms, namely balance, homophily, and dominance. How do the same micro-mechanisms generate significant variation in global network structures? To answer this question we propose and test a network ecological theory that specifies the ways features of organizational environments moderate the expression of tie-formation processes, thereby generating variability in global network structures across settings. We develop this argument using longitudinal friendship data on schools (Add Health study) and classrooms (Classroom Engagement study), and by extending exponential random graph models to the study of multiple societies over time. PMID:25535409
Extracting intrinsic functional networks with feature-based group independent component analysis.
Calhoun, Vince D; Allen, Elena
2013-04-01
There is increasing use of functional imaging data to understand the macro-connectome of the human brain. Of particular interest is the structure and function of intrinsic networks (regions exhibiting temporally coherent activity both at rest and while a task is being performed), which account for a significant portion of the variance in functional MRI data. While networks are typically estimated based on the temporal similarity between regions (based on temporal correlation, clustering methods, or independent component analysis [ICA]), some recent work has suggested that these intrinsic networks can be extracted from the inter-subject covariation among highly distilled features, such as amplitude maps reflecting regions modulated by a task or even coordinates extracted from large meta analytic studies. In this paper our goal was to explicitly compare the networks obtained from a first-level ICA (ICA on the spatio-temporal functional magnetic resonance imaging (fMRI) data) to those from a second-level ICA (i.e., ICA on computed features rather than on the first-level fMRI data). Convergent results from simulations, task-fMRI data, and rest-fMRI data show that the second-level analysis is slightly noisier than the first-level analysis but yields strikingly similar patterns of intrinsic networks (spatial correlations as high as 0.85 for task data and 0.65 for rest data, well above the empirical null) and also preserves the relationship of these networks with other variables such as age (for example, default mode network regions tended to show decreased low frequency power for first-level analyses and decreased loading parameters for second-level analyses). In addition, the best-estimated second-level results are those which are the most strongly reflected in the input feature. In summary, the use of feature-based ICA appears to be a valid tool for extracting intrinsic networks. 
We believe it will become a useful and important approach in the study of the macro-connectome, particularly in the context of data fusion.
The application of data mining techniques to oral cancer prognosis.
Tseng, Wan-Ting; Chiang, Wei-Fan; Liu, Shyun-Yeu; Roan, Jinsheng; Lin, Chun-Nan
2015-05-01
This study adopted an integrated procedure that combines the clustering and classification features of data mining technology to determine the differences between the symptoms shown in past cases where patients died from or survived oral cancer. Two data mining tools, namely decision tree and artificial neural network, were used to analyze the historical cases of oral cancer, and their performance was compared with that of logistic regression, the popular statistical analysis tool. Both decision tree and artificial neural network models showed superiority to the traditional statistical model. However, for clinicians, the trees created by the decision tree models are easier to interpret than the artificial neural network models. Cluster analysis also reveals that stage 4 patients who also possess the following four characteristics have an extremely low survival rate: pN is N2b, level of RLNM is level I-III, AJCC-T is T4, and cell mutation status (G) is moderate.
Clustering method for counting passengers getting in a bus with single camera
NASA Astrophysics Data System (ADS)
Yang, Tao; Zhang, Yanning; Shao, Dapei; Li, Ying
2010-03-01
Automatic counting of passengers is very important for both business and security applications. We present a single-camera-based vision system that is able to count passengers in a highly crowded situation at the entrance of a traffic bus. The unique characteristics of the proposed system are as follows. First, a novel feature-point-tracking- and online-clustering-based passenger counting framework performs much better than background-modeling- and foreground-blob-tracking-based methods. Second, a simple and highly accurate clustering algorithm is developed that projects the high-dimensional feature point trajectories into a 2-D feature space by their appearance and disappearance times and counts the number of people through online clustering. Finally, all test video sequences in the experiment are captured from a real traffic bus in Shanghai, China. The results show that the system can process two 320×240 video sequences at a frame rate of 25 fps simultaneously, and can count passengers reliably in various difficult scenarios with complex interaction and occlusion among people. The method achieves accuracy rates of up to 96.5%.
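As a rough illustration of the second idea in this abstract, each tracked feature point can be reduced to the pair (appearance time, disappearance time) and grouped online, since points that enter and leave the view together plausibly belong to the same passenger. The sketch below is a toy under stated assumptions, not the authors' algorithm: the function name `count_passengers`, the distance threshold `radius`, and the minimum cluster size `min_points` are all hypothetical, and the greedy nearest-centroid rule is a stand-in for whatever online clustering the paper uses.

```python
from math import hypot

def count_passengers(trajectories, radius=5.0, min_points=3):
    """Greedy online clustering of (appear_frame, disappear_frame) pairs.

    Each trajectory joins the nearest existing cluster whose centroid lies
    within `radius` frames; otherwise it starts a new cluster. Clusters with
    fewer than `min_points` members are treated as noise (an assumption).
    """
    clusters = []  # each cluster is [sum_t_in, sum_t_out, n_points]
    for t_in, t_out in trajectories:
        best, best_dist = None, radius
        for c in clusters:
            cx, cy = c[0] / c[2], c[1] / c[2]  # running centroid
            d = hypot(t_in - cx, t_out - cy)
            if d <= best_dist:
                best, best_dist = c, d
        if best is None:
            clusters.append([t_in, t_out, 1])
        else:
            best[0] += t_in
            best[1] += t_out
            best[2] += 1
    return sum(1 for c in clusters if c[2] >= min_points)

# Hypothetical feature-point trajectories, in frame numbers.
tracks = [(10, 40), (11, 41), (12, 39), (11, 40),  # points on one person
          (60, 95), (61, 96), (59, 94),            # points on another
          (200, 203)]                              # short spurious track
print(count_passengers(tracks))  # prints 2
```

Because every trajectory is processed once and only running sums are stored, the count can be updated as tracks terminate, which is the property that makes an online formulation attractive for live video.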
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2014-12-30
Understanding neural functions requires knowledge from analysing electrophysiological data. The process of assigning spikes of a multichannel signal into clusters, called spike sorting, is one of the important problems in such analysis. There have been various automated spike sorting techniques with both advantages and disadvantages regarding accuracy and computational costs. Therefore, developing spike sorting methods that are highly accurate and computationally inexpensive is always a challenge in the biomedical engineering practice. An automatic unsupervised spike sorting method is proposed in this paper. The method uses features extracted by the locality preserving projection (LPP) algorithm. These features afterwards serve as inputs for the landmark-based spectral clustering (LSC) method. Gap statistics (GS) is employed to evaluate the number of clusters before the LSC can be performed. The proposed LPP-LSC is a highly accurate and computationally inexpensive spike sorting approach. LPP spike features are very discriminative and thereby boost the performance of clustering methods. Furthermore, the LSC method exhibits its efficiency when integrated with the cluster evaluator GS. The proposed method's accuracy is approximately 13% superior to that of the benchmark combination of wavelet transformation and superparamagnetic clustering (WT-SPC). Additionally, LPP-LSC computing time is six times less than that of the WT-SPC. LPP-LSC thus demonstrates a win-win spike sorting solution meeting both accuracy and computational cost criteria. LPP and LSC are linear algorithms that help reduce computational burden, and thus their combination can be applied to real-time spike analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
Lithium Abundance in M3 Red Giant
NASA Astrophysics Data System (ADS)
Givens, Rashad; Pilachowski, Catherine A.
2015-01-01
We present the abundance of lithium in the red giant star vZ 1050 (SK 291) in the globular cluster M3. A previous survey of giants in the cluster showed that, like IV-101, vZ 1050 displays a prominent Li I 6707 Å feature. vZ 1050 lies on the blue side of the red giant branch about 1.3 magnitudes above the level of the horizontal branch, and may be an asymptotic giant branch star. A high resolution spectrum of M3 vZ 1050 was obtained with the ARC 3.5m telescope and the ARC Echelle Spectrograph (ARCES). Atmospheric parameters were determined using Fe I and Fe II lines from the spectrum using the MOOG spectral analysis program, and the lithium abundance was determined using spectrum synthesis.
NeCamp, Timothy; Kilbourne, Amy; Almirall, Daniel
2017-08-01
Cluster-level dynamic treatment regimens can be used to guide sequential treatment decision-making at the cluster level in order to improve outcomes at the individual or patient-level. In a cluster-level dynamic treatment regimen, the treatment is potentially adapted and re-adapted over time based on changes in the cluster that could be impacted by prior intervention, including aggregate measures of the individuals or patients that compose it. Cluster-randomized sequential multiple assignment randomized trials can be used to answer multiple open questions preventing scientists from developing high-quality cluster-level dynamic treatment regimens. In a cluster-randomized sequential multiple assignment randomized trial, sequential randomizations occur at the cluster level and outcomes are observed at the individual level. This manuscript makes two contributions to the design and analysis of cluster-randomized sequential multiple assignment randomized trials. First, a weighted least squares regression approach is proposed for comparing the mean of a patient-level outcome between the cluster-level dynamic treatment regimens embedded in a sequential multiple assignment randomized trial. The regression approach facilitates the use of baseline covariates which is often critical in the analysis of cluster-level trials. Second, sample size calculators are derived for two common cluster-randomized sequential multiple assignment randomized trial designs for use when the primary aim is a between-dynamic treatment regimen comparison of the mean of a continuous patient-level outcome. The methods are motivated by the Adaptive Implementation of Effective Programs Trial which is, to our knowledge, the first-ever cluster-randomized sequential multiple assignment randomized trial in psychiatry.
Machine-learned cluster identification in high-dimensional data.
Ultsch, Alfred; Lötsch, Jörn
2017-02-01
High-dimensional biomedical data are frequently clustered to identify subgroup structures pointing at distinct disease subtypes. It is crucial that the clustering algorithm used works correctly. However, by imposing a predefined shape on the clusters, classical algorithms occasionally suggest a cluster structure in homogeneously distributed data or assign data points to incorrect clusters. We analyzed whether this can be avoided by using emergent self-organizing feature maps (ESOM). Data sets with different degrees of complexity were submitted to ESOM analysis with large numbers of neurons, using an interactive R-based bioinformatics tool. On top of the trained ESOM the distance structure in the high dimensional feature space was visualized in the form of a so-called U-matrix. Clustering results were compared with those provided by classical common cluster algorithms including single linkage, Ward and k-means. Ward clustering imposed cluster structures on cluster-less "golf ball", "cuboid" and "S-shaped" data sets that contained no structure at all (random data). Ward clustering also imposed structures on permuted real world data sets. By contrast, the ESOM/U-matrix approach correctly found that these data contain no cluster structure. However, ESOM/U-matrix was correct in identifying clusters in biomedical data truly containing subgroups. It was always correct in cluster structure identification in further canonical artificial data. Using intentionally simple data sets, it is shown that popular clustering algorithms typically used for biomedical data sets may fail to cluster data correctly, suggesting that they are also likely to perform erroneously on high dimensional biomedical data. The present analyses emphasized that generally established classical hierarchical clustering algorithms carry a considerable tendency to produce erroneous results.
By contrast, unsupervised machine-learned analysis of cluster structures, applied using the ESOM/U-matrix method, is a viable, unbiased method to identify true clusters in the high-dimensional space of complex data. Copyright © 2017 The Authors. Published by Elsevier Inc. All rights reserved.
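The failure mode described in this abstract, a classical algorithm reporting clusters where none exist, is easy to reproduce on a small scale. The sketch below is only an illustration and not the authors' R-based tool or the ESOM/U-matrix method: it is a naive pure-Python agglomeration using Ward's merge criterion, and the function name `ward_clusters`, the sample size, and the choice of three clusters are all assumptions made for the example.

```python
import random

def ward_clusters(points, k):
    """Naive agglomerative clustering with Ward's merge criterion."""
    clusters = [[p] for p in points]

    def centroid(c):
        return [sum(x) / len(c) for x in zip(*c)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares caused by merging a and b.
        ca, cb = centroid(a), centroid(b)
        d2 = sum((u - v) ** 2 for u, v in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > k:
        # Pick the pair whose merge increases the variance the least.
        i, j = min(
            ((i, j) for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because j > i
    return clusters

# Structureless data: 40 points drawn uniformly from the unit square.
rng = random.Random(0)
uniform = [(rng.random(), rng.random()) for _ in range(40)]
parts = ward_clusters(uniform, k=3)
print([len(c) for c in parts])  # three non-empty groups, despite no structure
```

The procedure returns exactly as many groups as it is asked for, regardless of whether the data support them, so a partition produced this way is not by itself evidence of subgroup structure.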
Zhu, Jianwei; Zhang, Haicang; Li, Shuai Cheng; Wang, Chao; Kong, Lupeng; Sun, Shiwei; Zheng, Wei-Mou; Bu, Dongbo
2017-12-01
Accurate recognition of protein fold types is a key step for template-based prediction of protein structures. The existing approaches to fold recognition mainly exploit the features derived from alignments of query protein against templates. These approaches have been shown to be successful for fold recognition at family level, but usually fail at superfamily/fold levels. To overcome this limitation, one of the key points is to explore more structurally informative features of proteins. Although residue-residue contacts carry abundant structural information, how to thoroughly exploit this information for fold recognition still remains a challenge. In this study, we present an approach (called DeepFR) to improve fold recognition at superfamily/fold levels. The basic idea of our approach is to extract fold-specific features from predicted residue-residue contacts of proteins using deep convolutional neural network (DCNN) technique. Based on these fold-specific features, we calculated similarity between query protein and templates, and then assigned query protein with fold type of the most similar template. DCNN has shown excellent performance in image feature extraction and image recognition; the rationale underlying the application of DCNN for fold recognition is that contact likelihood maps are essentially analogous to images, as they both display compositional hierarchy. Experimental results on the LINDAHL dataset suggest that even using the extracted fold-specific features alone, our approach achieved success rate comparable to the state-of-the-art approaches. When further combining these features with traditional alignment-related features, the success rate of our approach increased to 92.3%, 82.5% and 78.8% at family, superfamily and fold levels, respectively, which is about 18% higher than the state-of-the-art approach at fold level, 6% higher at superfamily level and 1% higher at family level.
An independent assessment on SCOP_TEST dataset showed consistent performance improvement, indicating robustness of our approach. Furthermore, bi-clustering results of the extracted features are compatible with fold hierarchy of proteins, implying that these features are fold-specific. Together, these results suggest that the features extracted from predicted contacts are orthogonal to alignment-related features, and the combination of them could greatly facilitate fold recognition at superfamily/fold levels and template-based prediction of protein structures. Source code of DeepFR is freely available through https://github.com/zhujianwei31415/deepfr, and a web server is available through http://protein.ict.ac.cn/deepfr. Contact: zheng@itp.ac.cn or dbu@ict.ac.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
Qin, Lei; Snoussi, Hichem; Abdallah, Fahed
2014-01-01
We propose a novel approach for tracking an arbitrary object in video sequences for visual surveillance. The first contribution of this work is an automatic feature extraction method that is able to extract compact discriminative features from a feature pool before computing the region covariance descriptor. As the feature extraction method is adaptive to a specific object of interest, we refer to the region covariance descriptor computed using the extracted features as the adaptive covariance descriptor. The second contribution is a weakly supervised method for updating the object appearance model during tracking. The method performs a mean-shift clustering procedure among the tracking result samples accumulated during a period of time and selects a group of reliable samples for updating the object appearance model. As such, the object appearance model is kept up-to-date and is prevented from contamination even in case of tracking mistakes. We conducted comparative experiments on real-world video sequences, which confirmed the effectiveness of the proposed approaches. The tracking system that integrates the adaptive covariance descriptor and the clustering-based model updating method accomplished stable object tracking on challenging video sequences. PMID:24865883
Supra-galactic colour patterns in globular cluster systems
NASA Astrophysics Data System (ADS)
Forte, Juan C.
2017-07-01
An analysis of globular cluster systems associated with galaxies included in the Virgo and Fornax Hubble Space Telescope-Advanced Camera Surveys reveals distinct (g - z) colour modulation patterns. These features appear on composite samples of globular clusters and, most evidently, in galaxies with absolute magnitudes Mg in the range from -20.2 to -19.2. These colour modulations are also detectable on some samples of globular clusters in the central galaxies NGC 1399 and NGC 4486 (and confirmed on data sets obtained with different instruments and photometric systems), as well as in other bright galaxies in these clusters. After discarding field contamination, photometric errors and statistical effects, we conclude that these supra-galactic colour patterns are real and reflect some previously unknown characteristic. These features suggest that the globular cluster formation process was not entirely stochastic but included a fraction of clusters that formed in a rather synchronized fashion over large spatial scales, and in a tentative time lapse of about 1.5 Gy at redshifts z between 2 and 4. We speculate that the putative mechanism leading to that synchronism may be associated with large scale feedback effects connected with violent star-forming events and/or with supermassive black holes.
Cluster ensemble based on Random Forests for genetic data.
Alhusain, Luluah; Hafez, Alaaeldin M
2017-01-01
Clustering plays a crucial role in several application domains, such as bioinformatics. In bioinformatics, clustering has been extensively used as an approach for detecting interesting patterns in genetic data. One application is population structure analysis, which aims to group individuals into subpopulations based on shared genetic variations, such as single nucleotide polymorphisms. Advances in DNA sequencing technology have made it possible to obtain genetic datasets of exceptional size. Genetic data usually contain hundreds of thousands of genetic markers genotyped for thousands of individuals, making an efficient means for handling such data desirable. Random Forests (RFs) has emerged as an efficient algorithm capable of handling high-dimensional data. RFs provides a proximity measure that can capture different levels of co-occurring relationships between variables. RFs has been widely considered a supervised learning method, although it can be converted into an unsupervised learning method. Therefore, an RF-derived proximity measure combined with a clustering technique may be well suited for determining the underlying structure of unlabeled data. This paper proposes RFcluE, a cluster ensemble approach for determining the underlying structure of genetic data based on RFs. The approach comprises a cluster ensemble framework to combine multiple runs of RF clustering. Experiments were conducted on a high-dimensional, real genetic dataset to evaluate the proposed approach. The experiments included an examination of the impact of parameter changes, a comparison of RFcluE performance against other clustering methods, and an assessment of the relationship between the diversity and quality of the ensemble and its effect on RFcluE performance. This paper proposes RFcluE, a cluster ensemble approach based on RF clustering, to address the problem of population structure analysis and demonstrates the effectiveness of the approach.
The paper also illustrates that applying a cluster ensemble approach, combining multiple RF clusterings, produces more robust and higher-quality results as a consequence of feeding the ensemble with diverse views of high-dimensional genetic data obtained through bagging and random subspace, the two key features of the RF algorithm.
Web Image Search Re-ranking with Click-based Similarity and Typicality.
Yang, Xiaopeng; Mei, Tao; Zhang, Yong Dong; Liu, Jie; Satoh, Shin'ichi
2016-07-20
In image search re-ranking, besides the well-known semantic gap, the intent gap, which is the gap between the representation of users' query/demand and the real intent of the users, is becoming a major problem restricting the development of image retrieval. To reduce human effects, in this paper, we use image click-through data, which can be viewed as the "implicit feedback" from users, to help overcome the intent gap and further improve the image search performance. Generally, the hypothesis that visually similar images should be close in a ranking list and the strategy that images with higher relevance should be ranked higher than others are widely accepted. Thus, to obtain satisfying search results, image similarity and the level of relevance typicality are the respective determining factors. However, when measuring image similarity and typicality, conventional re-ranking approaches only consider visual information and initial ranks of images, while overlooking the influence of click-through data. This paper presents a novel re-ranking approach, named spectral clustering re-ranking with click-based similarity and typicality (SCCST). First, to learn an appropriate similarity measurement, we propose a click-based multi-feature similarity learning algorithm (CMSL), which conducts metric learning based on click-based triplet selection, and integrates multiple features into a unified similarity space via multiple kernel learning. Then, based on the learnt click-based image similarity measure, we conduct spectral clustering to group visually and semantically similar images into the same clusters, and get the final re-rank list by calculating click-based cluster typicality and within-cluster click-based image typicality in descending order. Our experiments conducted on two real-world query-image datasets with diverse representative queries show that our proposed re-ranking approach can significantly improve initial search results and outperform several existing re-ranking approaches.
Multi-scale clustering by building a robust and self correcting ultrametric topology on data points.
Fushing, Hsieh; Wang, Hui; Vanderwaal, Kimberly; McCowan, Brenda; Koehl, Patrice
2013-01-01
The advent of high-throughput technologies and the concurrent advances in information sciences have led to an explosion in size and complexity of the data sets collected in biological sciences. The biggest challenge today is to assimilate this wealth of information into a conceptual framework that will help us decipher biological functions. A large and complex collection of data, usually called a data cloud, naturally embeds multi-scale characteristics and features, generically termed geometry. Understanding this geometry is the foundation for extracting knowledge from data. We have developed a new methodology, called data cloud geometry-tree (DCG-tree), to resolve this challenge. This new procedure has two main features that are keys to its success. Firstly, it derives from the empirical similarity measurements a hierarchy of clustering configurations that captures the geometric structure of the data. This hierarchy is then transformed into an ultrametric space, which is then represented via an ultrametric tree or a Parisi matrix. Secondly, it has a built-in mechanism for self-correcting clustering membership across different tree levels. We have compared the trees generated with this new algorithm to equivalent trees derived with the standard Hierarchical Clustering method on simulated as well as real data clouds from fMRI brain connectivity studies, cancer genomics, giraffe social networks, and Lewis Carroll's Doublets network. In each of these cases, we have shown that the DCG trees are more robust and less sensitive to measurement errors, and that they provide a better quantification of the multi-scale geometric structures of the data. As such, DCG-tree is an effective tool for analyzing complex biological data sets.
Weak Gravitational Lensing by the Nearby Cluster Abell 3667.
Joffre; Fischer; Frieman; McKay; Mohr; Nichol; Johnston; Sheldon; Bernstein
2000-05-10
We present two weak lensing reconstructions of the nearby (zcl=0.055) merging cluster Abell 3667, based on observations taken approximately 1 yr apart under different seeing conditions. This is the lowest redshift cluster with a weak lensing mass reconstruction to date. The reproducibility of features in the two mass maps demonstrates that weak lensing studies of low-redshift clusters are feasible. These data constitute the first results from an X-ray luminosity-selected weak lensing survey of 19 low-redshift (z<0.1) southern clusters.
Glioma grading using cell nuclei morphologic features in digital pathology images
NASA Astrophysics Data System (ADS)
Reza, Syed M. S.; Iftekharuddin, Khan M.
2016-03-01
This work proposes a computationally efficient cell nuclei morphologic feature analysis technique to characterize the brain gliomas in tissue slide images. In this work, our contributions are two-fold: 1) obtain an optimized cell nuclei segmentation method based on the pros and cons of the existing techniques in literature, 2) extract representative features by k-means clustering of nuclei morphologic features, including area, perimeter, eccentricity, and major axis length. This clustering-based representative feature extraction avoids shortcomings of extensive tile [1] [2] and nuclear score [3] based methods for brain glioma grading in pathology images. Multilayer perceptron (MLP) is used to classify extracted features into two tumor types: glioblastoma multiforme (GBM) and low grade glioma (LGG). Quantitative scores such as precision, recall, and accuracy are obtained using 66 clinical patients' images from The Cancer Genome Atlas (TCGA) [4] dataset. An average accuracy of ~94% from 10-fold cross-validation confirms the efficacy of the proposed method.
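One plausible reading of the clustering step in this abstract is that the per-nucleus measurements of an image are summarised by a few k-means centroids, which then serve as a fixed-length input to the classifier. The sketch below illustrates that reading only; the abstract does not say how centroids are turned into features, so the concatenation step, the function name `kmeans`, the value k = 2, and the sample measurements are all assumptions. It is a plain Lloyd's iteration in pure Python with no feature scaling, which a real pipeline would likely need because area dominates the distance.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the centroid with the smallest squared distance.
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            groups[nearest].append(p)
        for c, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster goes empty
                centroids[c] = [sum(x) / len(g) for x in zip(*g)]
    return centroids

# Hypothetical nuclei: (area, perimeter, eccentricity, major axis length).
nuclei = [
    (50.0, 26.0, 0.30, 9.0), (300.0, 70.0, 0.80, 28.0),
    (54.0, 27.0, 0.35, 9.5), (310.0, 72.0, 0.85, 29.0),
    (52.0, 26.5, 0.25, 9.2), (290.0, 68.0, 0.75, 27.0),
]

# Sorting the centroids gives a stable order, so concatenating them yields
# one fixed-length descriptor per image (an assumed design, see above).
descriptor = [v for c in sorted(kmeans(nuclei, k=2)) for v in c]
print(len(descriptor))  # prints 8
```

The appeal of such a summary is that images with very different numbers of segmented nuclei all map to a vector of the same length, which is what a multilayer perceptron expects.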
Hu, Jing; Zhang, Xiaolong; Liu, Xiaoming; Tang, Jinshan
2015-06-01
Discovering hot regions in protein-protein interaction is important for drug and protein design, while experimental identification of hot regions is a time-consuming and labor-intensive effort; thus, the development of predictive models can be very helpful. In hot region prediction research, some models are based on structure information, and others are based on a protein interaction network. However, the prediction accuracy of these methods can still be improved. In this paper, a new method is proposed for hot region prediction, which combines density-based incremental clustering with feature-based classification. The method uses density-based incremental clustering to obtain rough hot regions, and uses feature-based classification to remove the non-hot spot residues from the rough hot regions. Experimental results show that the proposed method significantly improves the prediction performance of hot regions. Copyright © 2015 Elsevier Ltd. All rights reserved.
ERIC Educational Resources Information Center
Jak, Suzanne; Oort, Frans J.; Dolan, Conor V.
2013-01-01
We present a test for cluster bias, which can be used to detect violations of measurement invariance across clusters in 2-level data. We show how measurement invariance assumptions across clusters imply measurement invariance across levels in a 2-level factor model. Cluster bias is investigated by testing whether the within-level factor loadings…
Gait patterns in hemiplegic patients with equinus foot deformity.
Manca, M; Ferraresi, G; Cosma, M; Cavazzuti, L; Morelli, M; Benedetti, M G
2014-01-01
Equinus deformity of the foot is a common feature of hemiplegia, which impairs the gait pattern of patients. The aim of the present study was to explore the role of ankle-foot deformity in gait impairment. A hierarchical cluster analysis was used to classify the gait patterns of 49 chronic hemiplegic patients with equinus deformity of the foot, based on temporal-distance parameters and joint kinematic measures obtained by an innovative protocol for motion assessment in the sagittal, frontal, and transverse planes, synthesized by parametrical analysis. Cluster analysis identified five subgroups of patients with homogeneous levels of dysfunction during gait. Specific joint kinematic abnormalities were found, according to the speed of progression in each cluster. Patients with faster walking were those with less ankle-foot complex impairment or with reduced range of motion of ankle-foot complex, that is, with a stiff ankle-foot complex. Slow walking was typical of patients with ankle-foot complex instability (i.e., larger motion in all the planes), severe equinus and hip internal rotation pattern, and patients with hip external rotation pattern. Clustering of gait patterns in these patients is helpful for a better understanding of dysfunction during gait and delivering more targeted treatment.
Izquierdo, Javier A; Sizova, Maria V; Lynd, Lee R
2010-06-01
The enrichment from nature of novel microbial communities with high cellulolytic activity is useful in the identification of novel organisms and novel functions that enhance the fundamental understanding of microbial cellulose degradation. In this work we identify predominant organisms in three cellulolytic enrichment cultures with thermophilic compost as an inoculum. Community structure based on 16S rRNA gene clone libraries featured extensive representation of clostridia from cluster III, with minor representation of clostridial clusters I and XIV and a novel Lutispora species cluster. Our studies reveal different levels of 16S rRNA gene diversity, ranging from 3 to 18 operational taxonomic units (OTUs), as well as variability in community membership across the three enrichment cultures. By comparison, glycosyl hydrolase family 48 (GHF48) diversity analyses revealed a narrower breadth of novel clostridial genes associated with cultured and uncultured cellulose degraders. The novel GHF48 genes identified in this study were related to the novel clostridia Clostridium straminisolvens and Clostridium clariflavum, with one cluster sharing as little as 73% sequence similarity with the closest known relative. In all, 14 new GHF48 gene sequences were added to the known diversity of 35 genes from cultured species.
Gone with the Wind: Watching Galaxy Transformation in Abell 2125
NASA Astrophysics Data System (ADS)
Keel, W.; Owen, F.; Ledlow, M.; Wang, D.
2003-12-01
Dense environments clearly foster the transformation of galaxies, but it has proven difficult to untangle the roles of various processes in cluster environments. We have found a uniquely strong case for ongoing stripping of gas from the galaxy C153 in Abell 2125. The cluster, at z=0.25, includes merging subsystems with a relative line-of-sight velocity near 2000 km/s. C153, identified using the VLA as a strong radio source powered by star formation, is the brightest cluster member with activity of this kind, and part of the less populous blueshifted grouping. Several lines of evidence indicate that it is being swept by a stripping event. (1) A tail of ionized gas is seen in [O II] emission, which extends at least 70 kpc toward the cluster core, coinciding with a soft X-ray feature seen in the Chandra observations reported by Wang et al. (2) HST WFPC2 images reveal disturbed and clumpy morphology, including luminous star-forming complexes and chaotic dust features. (3) The spectral energy distribution and Gemini GMOS absorption-line spectrum indicate a massive burst of star formation ≈ 10^8 years ago superimposed on an older and much fainter population. (4) The stellar and gas kinematics are decoupled, with multiple gas velocity systems including counter-rotating components. The large velocity difference between the galaxy and (most of the) intracluster medium may contribute to the signatures being more prominent than hitherto seen. The starburst age is consistent with estimates of the time since the closest encounter of the major subsystems during the cluster-level merger. We continue to explore whether a starburst outflow or tidal damage has added to the role of stripping by the ICM, and how star formation has proceeded in the gas after leaving the galaxy disk. This work was supported by NASA through HST grant GO-07279.01-96A, and by the NSF through facilities at NRAO, Kitt Peak, and Gemini-North.
VizieR Online Data Catalog: Redshift reliability flags (VVDS data) (Jamal+, 2018)
NASA Astrophysics Data System (ADS)
Jamal, S.; Le Brun, V.; Le Fevre, O.; Vibert, D.; Schmitt, A.; Surace, C.; Copin, Y.; Garilli, B.; Moresco, M.; Pozzetti, L.
2017-09-01
The VIMOS VLT Deep Survey (Le Fevre et al. 2013A&A...559A..14L) is a combination of 3 i-band magnitude limited surveys: Wide (17.5<=i_AB<=22.5; 8.6 deg^2), Deep (17.5<=i_AB<=24; 0.6 deg^2) and Ultra-Deep (23<=i_AB<=24.75; 512 arcmin^2), that produced a total of 35526 spectroscopic galaxy redshifts between 0 and 6.7 (22434 in Wide, 12051 in Deep and 1041 in UDeep). We supplement spectra of the VIMOS VLT Deep Survey (VVDS) with newly defined redshift reliability flags obtained by clustering (unsupervised classification in machine learning) a set of descriptors from individual zPDFs. In this paper, we exploit a set of 24519 spectra from the VVDS database. After computing the zPDF for each individual spectrum, a set of 8 descriptors of the zPDF is extracted to build a feature matrix X (dimension = 24519 rows, 8 columns). Then, we use a clustering algorithm (unsupervised machine learning) to partition the feature space into 5 distinct clusters (C1, C2, C3, C4, C5), each depicting a different level of confidence to associate with the measured redshift zMAP (the maximum-a-posteriori estimate, which corresponds to the maximum of the redshift PDF). The clustering results (C1, C2, C3, C4, C5) reported in the table are those used in the paper (Jamal et al., 2017) to present the new methodology for automating the zspec reliability assessment. In particular, we would like to point out that they were obtained from first tests conducted on the VVDS spectroscopic data (end of 2016). Therefore, the table does not depict immutable results (improvements are ongoing). Future updates of the VVDS redshift reliability flags can be expected. (1 data file).
High ozone levels in the northeast of Portugal: Analysis and characterization
NASA Astrophysics Data System (ADS)
Carvalho, A.; Monteiro, A.; Ribeiro, I.; Tchepel, O.; Miranda, A. I.; Borrego, C.; Saavedra, S.; Souto, J. A.; Casares, J. J.
2010-03-01
Each summer, extremely high ozone levels are registered at the rural background station of Lamas d'Olo, located in the northeast of Portugal. On average, 30% of the total alert threshold exceedances registered in Portugal are detected at this site. The main purpose of this study is to characterize the atmospheric conditions that lead to the ozone-rich episodes at this site. Synoptic pattern anomalies and back-trajectory cluster analysis were examined for the period between 2004 and 2007, considering 76 days when ozone maximum hourly concentrations were above 200 μg m^-3. The obtained atmospheric anomaly fields show a positive temperature anomaly above the Iberian Peninsula. A strong wind flow pattern from the NE is observable in the north of Portugal and Galicia, in Spain. These two features may lead to an enhancement of the photochemical production and to the transport of pollutants from Spain to Portugal. In addition, the 3D mean back trajectories associated with the ozone episode days were analysed. A clustering method was applied to the obtained back trajectories. Four main clusters of ozone-rich episodes were identified, with different frequencies of occurrence: north-westerly flows (11%), north-easterly flows (45%), southern flow (4%) and westerly flows (40%). Both analyses highlight the NE flow as a dominant pattern over the north of Portugal during summer. The analysis of the ozone concentrations for each selected cluster indicates that this northeast circulation pattern, together with the southern flow, is responsible for the highest ozone peak episodes. This also suggests that long-range transport of atmospheric pollutants is the main contributor to the ozone levels registered at Lamas d'Olo. This is also highlighted by the correlation of the ozone time-series with the meteorological parameters analysed in the frequency domain.
Techniques to derive geometries for image-based Eulerian computations
Dillard, Seth; Buchholz, James; Vigmostad, Sarah; Kim, Hyunggun; Udaykumar, H.S.
2014-01-01
Purpose: The performance of three frequently used level set-based segmentation methods is examined for the purpose of defining features and boundary conditions for image-based Eulerian fluid and solid mechanics models. The focus of the evaluation is to identify an approach that produces the best geometric representation from a computational fluid/solid modeling point of view. In particular, extraction of geometries from a wide variety of imaging modalities and noise intensities, to supply to an immersed boundary approach, is targeted. Design/methodology/approach: Two- and three-dimensional images, acquired from optical, X-ray CT, and ultrasound imaging modalities, are segmented with active contours, k-means, and adaptive clustering methods. Segmentation contours are converted to level sets and smoothed as necessary for use in fluid/solid simulations. Results produced by the three approaches are compared visually and with contrast ratio, signal-to-noise ratio, and contrast-to-noise ratio measures. Findings: While the active contours method possesses built-in smoothing and regularization and produces continuous contours, the clustering methods (k-means and adaptive clustering) produce discrete (pixelated) contours that require smoothing using speckle-reducing anisotropic diffusion (SRAD). Thus, for images with high contrast and low to moderate noise, active contours are generally preferable. However, adaptive clustering is found to be far superior to the other two methods for images possessing high levels of noise and global intensity variations, due to its more sophisticated use of local pixel/voxel intensity statistics. Originality/value: It is often difficult to know a priori which segmentation will perform best for a given image type, particularly when geometric modeling is the ultimate goal. This work offers insight into the algorithm selection process, as well as outlining a practical framework for generating useful geometric surfaces in an Eulerian setting.
PMID:25750470
NASA Astrophysics Data System (ADS)
Rai, Akhand; Upadhyay, S. H.
2017-09-01
The bearing is the most critical component in rotating machinery since it is the most susceptible to failure. Monitoring degradation in bearings is therefore of great concern for averting sudden machinery breakdown. In this study, a novel method for bearing performance degradation assessment (PDA) based on a combination of empirical mode decomposition (EMD) and k-medoids clustering is proposed. The fault features are extracted from the bearing signals using the EMD process. The extracted features are then subjected to k-medoids based clustering to obtain the normal state and failure state cluster centres. A confidence value (CV) curve based on the dissimilarity of the test data object to the normal state is obtained and employed as the degradation indicator for assessing the health of bearings. The proposed approach is applied to the vibration signals collected in run-to-failure tests of bearings to assess its effectiveness in bearing PDA. To validate the suggested approach, it is compared with the commonly used time-domain features RMS and kurtosis, the well-known fault diagnosis method envelope analysis (EA), and existing PDA classifiers, i.e. self-organizing maps (SOM) and fuzzy c-means (FCM). The results demonstrate that the recommended method outperforms the time-domain features and the SOM and FCM based PDA in detecting early stage degradation more precisely. Moreover, EA can be used as an accompanying method to confirm the early stage defect detected by the proposed bearing PDA approach. The study shows the potential application of k-medoids clustering as an effective tool for PDA of bearings.
Comparative study of icy patches on comet nuclei
NASA Astrophysics Data System (ADS)
Oklay, Nilda; Pommerol, Antoine; Barucci, Maria Antonietta; Sunshine, Jessica; Sierks, Holger; Pajola, Maurizio
2016-07-01
Cometary missions Deep Impact, EPOXI and Rosetta investigated the nuclei of comets 9P/Tempel 1, 103P/Hartley 2 and 67P/Churyumov-Gerasimenko respectively. Bright patches were observed on the surfaces of each of these three comets [1-5]. Of these, the surface of 67P is mapped at the highest spatial resolution via the narrow angle camera (NAC) of the Optical, Spectroscopic, and Infrared Remote Imaging System (OSIRIS, [6]) on board the Rosetta spacecraft. OSIRIS NAC is equipped with twelve filters covering the wavelength range of 250 nm to 1000 nm. Various filter combinations are used during surface mapping. With high spatial resolution data of comet 67P, three types of bright features were detected on the comet surface: clustered, isolated and bright boulders [2]. In the visible spectral range, clustered bright features on comet 67P display bluer spectral slopes than the average surface [2, 4] while isolated bright features on comet 67P have flat spectra [4]. Icy patches observed on the surface of comets 9P and 103P display bluer spectral slopes than the average surface [1, 5]. Clustered and isolated bright features are blue in the RGB composites generated by using the images taken in NIR, visible and NUV wavelengths [2, 4]. This is also valid for the icy patches observed on comets 9P and 103P [1, 5]. Spectroscopic observations of bright patches on comets 9P and 103P confirmed the existence of water [1, 5]. More than a hundred bright features were detected on the northern hemisphere of comet 67P [2]. Analysis of those features from both multispectral data and spectroscopic data is ongoing. Water ice has been detected in eight of the bright features so far [7]. Additionally, spectroscopic observations of two clustered bright features on the surface of comet 67P revealed the existence of water ice [3].
The spectral properties of one of the icy patches were studied by [4] using OSIRIS NAC images and compared with the spectral properties of the active regions observed on comet 67P. Additionally, jets rising from the same clustered bright feature were detected visually [4]. We analyzed bright patches on the surface of comets 9P, 103P and 67P using multispectral data obtained by the high-resolution instrument (HRI), medium-resolution instrument (MRI) and OSIRIS NAC using various spectral analysis techniques. Clustered bright features on comet 67P have similar visible spectra to the bright patches on comets 9P and 103P. The comparison of the bright patches includes the published results of the IR spectra. References: [1] Sunshine et al., 2006, Science, 311, 1453 [2] Pommerol et al., 2015, A&A, 583, A25 [3] Filacchione et al., 2016, Nature, 529, 368-372 [4] Oklay et al., 2016, A&A, 586, A80 [5] Sunshine et al. 2012, ACM [6] Keller et al., 2007, Space Sci. Rev., 128, 433 [7] Barucci et al., 2016, COSPAR, B04
The extended stellar substructures of four metal-poor globular clusters in the galactic bulge
NASA Astrophysics Data System (ADS)
Chun, Sang-Hyun; Sohn, Young-Jong
2015-08-01
We investigated the stellar spatial density distribution around four metal-poor globular clusters (NGC 6266, NGC 6626, NGC 6642 and NGC 6723) in order to find extended stellar substructures. Wide-field deep J, H, and K imaging data were taken using the WFCAM near-infrared array on the United Kingdom Infrared Telescope (UKIRT). The contamination of field stars around the clusters was minimised by applying a statistical weighted filtering algorithm for the stars on the color-magnitude diagram. In the two-dimensional isodensity contour maps, we find that all four of the globular clusters show tidal stripping stellar features in the form of tidal tails (NGC 6266 and NGC 6723) or small density lobes/chunk (NGC 6642 and NGC 6723). The stellar substructures extend toward the Galactic centre or anticentre, and along the proper motion direction of the clusters. The radial density profiles of the clusters also depart from theoretical King and Wilson models and show an overdensity feature with a break in the slope of the profile at the outer region of the clusters. The observed results indicate that the four globular clusters in the Galactic bulge have experienced strong tidal forces or bulge/disk shock effects of the Galaxy. These observational results provide further constraints for understanding the evolution of clusters in the Galactic bulge region as well as the formation of the Galaxy.
Bipolar disorder with comorbid cluster B personality disorder features: impact on suicidality.
Garno, Jessica L; Goldberg, Joseph F; Ramirez, Paul Michael; Ritzler, Barry A
2005-03-01
Because of their overlapping phenomenology and mutually chronic, persistent nature, distinctions between bipolar disorder and cluster B personality disorders remain a source of unresolved clinical controversy. The extent to which comorbid personality disorders impact course and outcome for bipolar patients also has received little systematic study. One hundred DSM-IV bipolar I (N = 73) or II (N = 27) patients consecutively underwent diagnostic evaluations with structured clinical interviews for DSM-IV Axis I and cluster B Axis II disorders, along with assessments of histories of childhood trauma or abuse. Cluster B diagnostic comorbidity was examined relative to lifetime substance abuse, suicide attempt histories, and other clinical features. Thirty percent of subjects met DSM-IV criteria for a cluster B personality disorder (17% borderline, 6% antisocial, 5% histrionic, 8% narcissistic). Cluster B diagnoses were significantly linked with histories of childhood emotional abuse (p = .009), physical abuse (p = .014), and emotional neglect (p = .022), but not sexual abuse or physical neglect. Cluster B comorbidity was associated with significantly more lifetime suicide attempts and current depression. Lifetime suicide attempts were significantly associated with cluster B comorbidity (OR = 3.195, 95% CI = 1.124 to 9.088), controlling for current depression severity, lifetime substance abuse, and past sexual or emotional abuse. Cluster B personality disorders are prevalent comorbid conditions identifiable in a substantial number of individuals with bipolar disorder, making an independent contribution to increased lifetime suicide risk.
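The reported odds ratio and its confidence interval can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes the interval is a Wald-type interval that is symmetric on the log-odds scale with a normal critical value of 1.96; the abstract does not state how the interval was computed, so this is an assumption rather than the authors' procedure. The variable names are hypothetical.

```python
import math

# Values reported in the abstract above.
or_point, ci_low, ci_high = 3.195, 1.124, 9.088

# ASSUMPTION: a Wald interval, exp(log(OR) +/- 1.96 * SE), i.e. symmetric in log space.
log_or = math.log(or_point)
se_log_or = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)

# If the interval is symmetric in log space, the geometric mean of the
# bounds should recover the point estimate.
geometric_mid = math.sqrt(ci_low * ci_high)

# Wald z statistic implied by the point estimate and the back-calculated SE.
z = log_or / se_log_or

print(f"SE(log OR) ~ {se_log_or:.3f}")
print(f"geometric midpoint of CI ~ {geometric_mid:.3f}")
print(f"implied z ~ {z:.2f}")
```

Under this assumption the geometric midpoint of the bounds lands very close to the reported point estimate, and the implied z exceeds 1.96, which agrees with an interval that excludes 1.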
Search for gamma-ray lines towards galaxy clusters with the Fermi-LAT
Anderson, B.; Zimmer, S.; Conrad, J.; ...
2016-02-09
We report on a search for monochromatic γ-ray features in the spectra of galaxy clusters observed by the Fermi Large Area Telescope. Galaxy clusters are the largest structures in the Universe that are bound by dark matter (DM), making them an important testing ground for possible self-interactions or decays of the DM particles. Monochromatic γ-ray lines provide a unique signature due to the absence of astrophysical backgrounds and are as such considered a smoking-gun signature for new physics. An unbinned joint likelihood analysis of the sixteen most promising clusters using five years of data at energies between 10 and 400 GeV revealed no significant features. For the case of self-annihilation, we set upper limits on the monochromatic velocity-averaged interaction cross section. These limits are compatible with those obtained from observations of the Galactic Center, albeit weaker due to the larger distance to the studied clusters.
The very massive star content of the nuclear star clusters in NGC 5253
NASA Astrophysics Data System (ADS)
Smith, Linda J.; Crowther, Paul A.; Calzetti, Daniela
2017-11-01
The blue compact dwarf galaxy NGC 5253 hosts a very young starburst containing twin nuclear star clusters. Calzetti et al. (2015) find that the two clusters have an age of 1 Myr, in contradiction to the age of 3-5 Myr inferred from the presence of Wolf-Rayet (W-R) spectral features. We use Hubble Space Telescope (HST) far-ultraviolet (FUV) and ground-based optical spectra to show that the cluster stellar features arise from very massive stars (VMS), with masses greater than 100 M⊙, at an age of 1-2 Myr. We discuss the implications of this and show that the very high ionizing flux can only be explained by VMS. We further discuss our findings in the context of VMS contributing to He ii λ1640 emission in high redshift galaxies, and emphasize that population synthesis models with upper mass cut-offs greater than 100 M⊙ are crucial for future studies of young massive clusters.
Going beyond Clustering in MD Trajectory Analysis: An Application to Villin Headpiece Folding
Rajan, Aruna; Freddolino, Peter L.; Schulten, Klaus
2010-01-01
Recent advances in computing technology have enabled microsecond long all-atom molecular dynamics (MD) simulations of biological systems. Methods that can distill the salient features of such large trajectories are now urgently needed. Conventional clustering methods used to analyze MD trajectories suffer from various setbacks, namely (i) they are not data driven, (ii) they are unstable to noise and changes in cut-off parameters such as cluster radius and cluster number, and (iii) they do not reduce the dimensionality of the trajectories, and hence are unsuitable for finding collective coordinates. We advocate the application of principal component analysis (PCA) and a non-metric multidimensional scaling (nMDS) method to reduce MD trajectories and overcome the drawbacks of clustering. To illustrate the superiority of nMDS over other methods in reducing data and reproducing salient features, we analyze three complete villin headpiece folding trajectories. Our analysis suggests that the folding process of the villin headpiece is structurally heterogeneous. PMID:20419160
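As a rough illustration of the dimensionality-reduction step advocated above, the following sketch computes the first principal component of a small set of "frames" by power iteration on the covariance matrix. It covers PCA only, not nMDS, and it is not the authors' pipeline: the function name, the toy data, and the representation of each frame as a flat tuple of coordinates are all assumptions made for illustration.

```python
import math

def first_principal_component(frames, n_iter=200):
    """Return (unit direction, projections) for the leading principal component.

    `frames` is a list of equal-length coordinate tuples (hypothetical
    stand-ins for flattened, aligned MD snapshots).
    """
    n, d = len(frames), len(frames[0])
    mean = [sum(f[k] for f in frames) / n for k in range(d)]
    centered = [[f[k] - mean[k] for k in range(d)] for f in frames]

    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]

    # Power iteration: repeatedly apply cov and renormalise.
    # Assumes the data have nonzero variance and the start vector is not
    # orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Collective coordinate: projection of each centered frame onto v.
    projections = [sum(r[k] * v[k] for k in range(d)) for r in centered]
    return v, projections

# Toy data: all variation lies along the direction (1, 2, 0).
frames = [(0.0, 0.0, 1.0), (1.0, 2.0, 1.0), (2.0, 4.0, 1.0), (3.0, 6.0, 1.0)]
direction, coords = first_principal_component(frames)
print(direction)
```

The projections play the role of a single collective coordinate; a real trajectory would typically need several components and a dedicated linear-algebra library rather than this hand-rolled loop.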
Unsupervised texture image segmentation by improved neural network ART2
NASA Technical Reports Server (NTRS)
Wang, Zhiling; Labini, G. Sylos; Mugnuolo, R.; Desario, Marco
1994-01-01
We here propose a segmentation algorithm of texture image for a computer vision system on a space robot. An improved adaptive resonance theory (ART2) for analog input patterns is adapted to classify the image based on a set of texture image features extracted by a fast spatial gray level dependence method (SGLDM). The nonlinear thresholding functions in input layer of the neural network have been constructed by two parts: firstly, to reduce the effects of image noises on the features, a set of sigmoid functions is chosen depending on the types of the feature; secondly, to enhance the contrast of the features, we adopt fuzzy mapping functions. The cluster number in output layer can be increased by an autogrowing mechanism constantly when a new pattern happens. Experimental results and original or segmented pictures are shown, including the comparison between this approach and K-means algorithm. The system written in C language is performed on a SUN-4/330 sparc-station with an image board IT-150 and a CCD camera.
High-Resolution Remote Sensing Image Building Extraction Based on Markov Model
NASA Astrophysics Data System (ADS)
Zhao, W.; Yan, L.; Chang, Y.; Gong, L.
2018-04-01
As resolution increases, remote sensing images carry a greater information load, more noise, and more complex feature geometry and texture information, which makes the extraction of building information more difficult. To solve this problem, this paper designs a building extraction method for high-resolution remote sensing images based on a Markov model. The method introduces Contourlet domain map clustering and a Markov model, captures and enhances the contour and texture information of high-resolution remote sensing image features in multiple directions, and further designs a spectral feature index that can characterize "pseudo-buildings" in the building area. Through multi-scale segmentation and extraction of image features, fine extraction from the building area down to the individual building is realized. Experiments show that this method can suppress the noise of high-resolution remote sensing images, reduce the interference of non-target ground texture information, and remove shadow, vegetation and other pseudo-building information. Compared with traditional pixel-level image information extraction, it performs better in building extraction precision, accuracy and completeness.
The Peculiarities in O-Type Galaxy Clusters
NASA Astrophysics Data System (ADS)
Panko, E. A.; Emelyanov, S. I.
We present the results of an analysis of the 2D distribution of galaxies in galaxy cluster fields. The Catalogue of Galaxy Clusters and Groups PF (Panko & Flin) was used as the input observational data set. We selected open rich PF galaxy clusters, containing 100 or more galaxies, for our study. According to the Panko classification scheme, open galaxy clusters (O-type) have no concentration toward the cluster center. The data set contains both pure O-type clusters and O-type clusters with overdense belts, namely the OL and OF types. According to the ideas of Rood & Sastry and Struble & Rood, open galaxy clusters are the beginning stage of cluster evolution. In the O-type clusters we found some types of statistically significant regular peculiarities, such as two crossed belts or a curved strip. We suppose that the features found are connected with galaxy cluster evolution and the distribution of DM inside the clusters.
Ullrich, Susann; Kotz, Sonja A.; Schmidtke, David S.; Aryani, Arash; Conrad, Markus
2016-01-01
While linguistic theory posits an arbitrary relation between signifiers and the signified (de Saussure, 1916), our analysis of a large-scale German database containing affective ratings of words revealed that certain phoneme clusters occur more often in words denoting concepts with negative and arousing meaning. Here, we investigate how such phoneme clusters that potentially serve as sublexical markers of affect can influence language processing. We registered the EEG signal during a lexical decision task with a novel manipulation of the words' putative sublexical affective potential: the means of valence and arousal values for single phoneme clusters, each computed as a function of respective values of words from the database these phoneme clusters occur in. Our experimental manipulations also investigate potential contributions of formal salience to the sublexical affective potential: Typically, negative high-arousing phonological segments—based on our calculations—tend to be less frequent and more structurally complex than neutral ones. We thus constructed two experimental sets, one involving this natural confound, while controlling for it in the other. A negative high-arousing sublexical affective potential in the strictly controlled stimulus set yielded an early posterior negativity (EPN), in similar ways as an independent manipulation of lexical affective content did. When other potentially salient formal features at the sublexical level were not controlled for, the effect of the sublexical affective potential was strengthened and prolonged (250–650 ms), presumably because formal salience helps making specific phoneme clusters efficient sublexical markers of negative high-arousing affective meaning. These neurophysiological data support the assumption that the organization of a language's vocabulary involves systematic sound-to-meaning correspondences at the phonemic level that influence the way we process language. PMID:27588008
Diversity and Community Can Coexist.
Stivala, Alex; Robins, Garry; Kashima, Yoshihisa; Kirley, Michael
2016-03-01
We examine the (in)compatibility of diversity and sense of community by means of agent-based models based on the well-known Schelling model of residential segregation and Axelrod model of cultural dissemination. We find that diversity and highly clustered social networks, on the assumptions of social tie formation based on spatial proximity and homophily, are incompatible when agent features are immutable, and this holds even for multiple independent features. We include both mutable and immutable features into a model that integrates Schelling and Axelrod models, and we find that even for multiple independent features, diversity and highly clustered social networks can be incompatible on the assumptions of social tie formation based on spatial proximity and homophily. However, this incompatibility breaks down when cultural diversity can be sufficiently large, at which point diversity and clustering need not be negatively correlated. This implies that segregation based on immutable characteristics such as race can possibly be overcome by sufficient similarity on mutable characteristics based on culture, which are subject to a process of social influence, provided a sufficiently large "scope of cultural possibilities" exists. © Society for Community Research and Action 2016.
NASA Astrophysics Data System (ADS)
Wagstaff, Kiri L.
2012-03-01
On obtaining a new data set, the researcher is immediately faced with the challenge of obtaining a high-level understanding from the observations. What does a typical item look like? What are the dominant trends? How many distinct groups are included in the data set, and how is each one characterized? Which observable values are common, and which rarely occur? Which items stand out as anomalies or outliers from the rest of the data? This challenge is exacerbated by the steady growth in data set size [11] as new instruments push into new frontiers of parameter space, via improvements in temporal, spatial, and spectral resolution, or by the desire to "fuse" observations from different modalities and instruments into a larger-picture understanding of the same underlying phenomenon. Data clustering algorithms provide a variety of solutions for this task. They can generate summaries, locate outliers, compress data, identify dense or sparse regions of feature space, and build data models. It is useful to note up front that "clusters" in this context refer to groups of items within some descriptive feature space, not (necessarily) to "galaxy clusters" which are dense regions in physical space. The goal of this chapter is to survey a variety of data clustering methods, with an eye toward their applicability to astronomical data analysis. In addition to improving the individual researcher’s understanding of a given data set, clustering has led directly to scientific advances, such as the discovery of new subclasses of stars [14] and gamma-ray bursts (GRBs) [38]. All clustering algorithms seek to identify groups within a data set that reflect some observed, quantifiable structure. Clustering is traditionally an unsupervised approach to data analysis, in the sense that it operates without any direct guidance about which items should be assigned to which clusters. 
There has been a recent trend in the clustering literature toward supporting semisupervised or constrained clustering, in which some partial information about item assignments or other components of the resulting output are already known and must be accommodated by the solution. Some algorithms seek a partition of the data set into distinct clusters, while others build a hierarchy of nested clusters that can capture taxonomic relationships. Some produce a single optimal solution, while others construct a probabilistic model of cluster membership. More formally, clustering algorithms operate on a data set X composed of items represented by one or more features (dimensions). These could include physical location, such as right ascension and declination, as well as other properties such as brightness, color, temporal change, size, texture, and so on. Let D be the number of dimensions used to represent each item, $x_i \in \mathbb{R}^D$. The clustering goal is to produce an organization P of the items in X that optimizes an objective function $f : P \to \mathbb{R}$, which quantifies the quality of solution P. Often f is defined so as to maximize similarity within a cluster and minimize similarity between clusters. To that end, many algorithms make use of a measure $d : X \times X \to \mathbb{R}$ of the distance between two items. A partitioning algorithm produces a set of clusters $P = \{c_1, \ldots, c_k\}$ such that the clusters are nonoverlapping ($c_i \cap c_j = \emptyset$, $i \neq j$) subsets of the data set ($\bigcup_i c_i = X$). Hierarchical algorithms produce a series of partitions $P = \{p_1, \ldots, p_{n'}\}$. For a complete hierarchy, the number of partitions $n' = n$, the number of items in the data set; the top partition is a single cluster containing all items, and the bottom partition contains n clusters, each containing a single item. For model-based clustering, each cluster $c_j$ is represented by a model $m_j$, such as the cluster center or a Gaussian distribution.
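The formal definitions above can be made concrete with a short sketch. It checks the two partition conditions (pairwise disjoint clusters whose union is the whole data set) and evaluates one possible objective f, the within-cluster sum of squared distances to each cluster mean. That objective is chosen here purely for illustration (lower is better), and the function names and toy data are hypothetical rather than taken from the chapter.

```python
from itertools import combinations

def is_partition(clusters, items):
    """True if clusters are pairwise disjoint and their union equals `items`."""
    disjoint = all(not (a & b) for a, b in combinations(clusters, 2))
    covers = set().union(*clusters) == set(items)
    return disjoint and covers

def sq_dist(x, y):
    """Squared Euclidean distance, one possible choice of d(x, y)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))

def centroid(points):
    """Coordinate-wise mean, acting as the cluster model m_j."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def within_cluster_ss(clusters, X):
    """Illustrative objective f(P): sum of squared distances to cluster means."""
    total = 0.0
    for c in clusters:
        pts = [X[i] for i in c]
        m = centroid(pts)
        total += sum(sq_dist(p, m) for p in pts)
    return total

# Four items in D = 2 dimensions; clusters are sets of item indices.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
good = [{0, 1}, {2, 3}]   # groups nearby items
bad = [{0, 2}, {1, 3}]    # mixes distant items

print(is_partition(good, range(len(X))), within_cluster_ss(good, X))
print(is_partition(bad, range(len(X))), within_cluster_ss(bad, X))
```

Both candidate organizations are valid partitions, but the objective separates them clearly: the grouping of nearby items scores far lower than the one that mixes distant items.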
The wide array of available clustering algorithms may seem bewildering, and covering all of them is beyond the scope of this chapter. Choosing among them for a particular application involves considerations of the kind of data being analyzed, algorithm runtime efficiency, and how much prior knowledge is available about the problem domain, which can dictate the nature of clusters sought. Fundamentally, the clustering method and its representations of clusters carry with them a definition of what a cluster is, and it is important that this be aligned with the analysis goals for the problem at hand. In this chapter, I emphasize this point by identifying for each algorithm the cluster representation as a model, $m_j$, even for algorithms that are not typically thought of as creating a “model.” This chapter surveys a basic collection of clustering methods useful to any practitioner who is interested in applying clustering to a new data set. The algorithms include k-means (Section 25.2), EM (Section 25.3), agglomerative (Section 25.4), and spectral (Section 25.5) clustering, with side mentions of variants such as kernel k-means and divisive clustering. The chapter also discusses each algorithm’s strengths and limitations and provides pointers to additional in-depth reading for each subject. Section 25.6 discusses methods for incorporating domain knowledge into the clustering process. This chapter concludes with a brief survey of interesting applications of clustering methods to astronomy data (Section 25.7). The chapter begins with k-means because it is both generally accessible and so widely used that understanding it can be considered a necessary prerequisite for further work in the field. EM can be viewed as a more sophisticated version of k-means that uses a generative model for each cluster and probabilistic item assignments.
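Since k-means is named here as the entry point to the field, a minimal sketch of the standard Lloyd iteration may help. It is not taken from the chapter: the initial centers are supplied by the caller, ties and empty clusters are handled in the simplest way, and the toy data are invented. Each cluster's model is its center, matching the cluster-as-model view described above.

```python
def squared_distance(x, y):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def mean(points):
    """Component-wise mean of a list of feature vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(X, centers, max_iter=100):
    """Lloyd's algorithm: alternate hard assignments and center updates."""
    centers = [tuple(c) for c in centers]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each item goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(x, centers[j]))
            for x in X
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: each center becomes the mean of its members.
        for j in range(len(centers)):
            members = [x for x, lab in zip(X, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean(members)
    return labels, centers


# Toy data: two compact groups; one item from each group seeds the centers.
X = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centers = kmeans(X, centers=[X[0], X[3]])
print(labels)
print(centers)
```

EM for a mixture model follows the same alternation, but replaces the hard nearest-center assignment with per-cluster membership probabilities and updates a full generative model rather than only a center.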
Agglomerative clustering is the most basic form of hierarchical clustering and provides a basis for further exploration of algorithms in that vein. Spectral clustering permits a departure from feature-vector-based clustering and can operate on data sets that are instead represented as affinity (or similarity) matrices, cases in which only pairwise information is known. The list of algorithms covered in this chapter is representative of those most commonly in use, but it is by no means comprehensive. There is an extensive collection of existing books on clustering that provide additional background and depth. Three early books that remain useful today are Anderberg’s Cluster Analysis for Applications [3], Hartigan’s Clustering Algorithms [25], and Gordon’s Classification [22]. The last of these covers basics on similarity measures, partitioning and hierarchical algorithms, fuzzy clustering, overlapping clustering, conceptual clustering, validation methods, and visualization or data reduction techniques such as principal components analysis (PCA), multidimensional scaling, and self-organizing maps. More recently, Jain et al. provided a useful and informative survey [27] of a variety of different clustering algorithms, including those mentioned here as well as fuzzy, graph-theoretic, and evolutionary clustering. Everitt’s Cluster Analysis [19] provides a modern overview of algorithms, similarity measures, and evaluation methods.
SUPERMODEL ANALYSIS OF GALAXY CLUSTERS
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fusco-Femiano, R.; Cavaliere, A.; Lapi, A.
2009-11-01
We present the analysis of the X-ray brightness and temperature profiles for six clusters belonging to both the Cool Core (CC) and Non Cool Core (NCC) classes, in terms of the Supermodel (SM) developed by Cavaliere et al. Based on the gravitational wells set by the dark matter (DM) halos, the SM straightforwardly expresses the equilibrium of the intracluster plasma (ICP) modulated by the entropy deposited at the boundary by standing shocks from gravitational accretion, and injected at the center by outgoing blast waves from mergers or from outbursts of active galactic nuclei. The cluster set analyzed here highlights not only how simply the SM represents the main dichotomy CC versus NCC clusters in terms of a few ICP parameters governing the radial entropy run, but also how accurately it fits even complex brightness and temperature profiles. For CC clusters like A2199 and A2597, the SM with a low level of central entropy straightforwardly yields the characteristic peaked profile of the temperature marked by a decline toward the center, without requiring currently strong radiative cooling and high mass deposition rates. NCC clusters like A1656 require instead a central entropy floor of a substantial level, and some like A2256 and even more A644 feature structured temperature profiles that also call for a definite floor extension; in such conditions the SM accurately fits the observations, and suggests that in these clusters the ICP has been just remolded by a merger event, in the way of a remnant cool core. The SM also predicts that DM halos with high concentration should correlate with flatter entropy profiles and steeper brightness in the outskirts; this is indeed the case with A1689, for which from X-rays we find concentration values c ≈ 10, the hallmark of an early halo formation.
Thus, we show the SM to constitute a fast tool not only to provide wide libraries of accurate fits to X-ray temperature and density profiles, but also to retrieve from the ICP archives specific information concerning the physical histories of DM and baryons in the inner and the outer cluster regions.
Weider, Karola; Bergmann, Martin; Giese, Sarah; Guillou, Florian; Failing, Klaus; Brehm, Ralph
2011-07-01
Histological analysis revealed that Sertoli cell specific knockout of the predominant testicular gap junction protein connexin 43 results in a spermatogenic arrest at the level of spermatogonia or Sertoli cell-only syndrome, intratubular cell clusters and still proliferating adult Sertoli cells, implying an important role for connexin 43 in the Sertoli and germ cell development. This study aimed to determine the (1) Sertoli cell maturation state, (2) time of occurrence and (3) composition, differentiation and fate of clustered cells in knockout mice. Using immunohistochemistry, connexin 43 deficient Sertoli cells showed an accurate start of the mature markers androgen receptor and GATA-1 during puberty and a vimentin expression from neonatal to adult. Expression of anti-Müllerian hormone, as a marker of Sertoli cell immaturity, was finally down-regulated during puberty, but its disappearance was delayed. This observed extended anti-Müllerian hormone synthesis during puberty was confirmed by western blot and Real-Time PCR and suggests a partial alteration in the Sertoli cell differentiation program. Additionally, Sertoli cells of adult knockouts showed a permanent and uniform expression of GATA-1 at protein and mRNA level, possibly caused by the lack of maturing germ cells and missing negative feedback signals. At ultrastructural level, basally located adult Sertoli cells obtained their mature appearance, demonstrated by the tripartite nucleolus as a typical feature of differentiated Sertoli cells. Intratubular clustered cells were mainly formed by abnormal Sertoli cells and single attached apoptotic germ cells, verified by immunohistochemistry, TUNEL staining and transmission electron microscopy. Clusters first appeared during puberty and became more numerous in adulthood with increasing cell numbers per cluster, suggesting an age-related process.
In conclusion, adult connexin 43 deficient Sertoli cells seem to proliferate while maintaining expression of mature markers and their adult morphology, indicating a unique and abnormal intermediate phenotype with characteristics common to both undifferentiated and differentiated Sertoli cells. Copyright © 2011 International Society of Differentiation. Published by Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Troxel, M. A.; Ishak, Mustapha; Peel, Austin
2014-03-01
The study of relativistic, higher order, and nonlinear effects has become necessary in recent years in the pursuit of precision cosmology. We develop and apply here a framework to study gravitational lensing in exact models in general relativity that are not restricted to homogeneity and isotropy, and where full nonlinearity and relativistic effects are thus naturally included. We apply the framework to a specific, anisotropic galaxy cluster model which is based on a modified NFW halo density profile and described by the Szekeres metric. We examine the effects of increasing levels of anisotropy in the galaxy cluster on lensing observables like the convergence and shear for various lensing geometries, finding a strong nonlinear response in both the convergence and shear for rays passing through anisotropic regions of the cluster. Deviations from the expected values in a spherically symmetric structure are asymmetric with respect to path direction and thus will persist as a statistical effect when averaged over some ensemble of such clusters. The resulting relative difference in various geometries can be as large as approximately 2%, 8%, and 24% in the measure of convergence (1−κ) for levels of anisotropy of 5%, 10%, and 15%, respectively, as a fraction of total cluster mass. For the total magnitude of shear, the relative difference can grow near the center of the structure to be as large as 15%, 32%, and 44% for the same levels of anisotropy, averaged over the two extreme geometries. The convergence is impacted most strongly for rays which pass in directions along the axis of maximum dipole anisotropy in the structure, while the shear is most strongly impacted for rays which pass in directions orthogonal to this axis, as expected. The rich features found in the lensing signal due to anisotropic substructure are nearly entirely lost when one treats the cluster in the traditional FLRW lensing framework.
These effects due to anisotropic structures are thus likely to impact lensing measurements and must be fully examined in an era of precision cosmology.
Interactive color display for multispectral imagery using correlation clustering
NASA Technical Reports Server (NTRS)
Haskell, R. E. (Inventor)
1979-01-01
A method for processing multispectral data is provided, which permits an operator to make parameter level changes during the processing of the data. The system is directed to production of a color classification map on a video display in which a given color represents a localized region in multispectral feature space. Interactive controls permit an operator to alter the size and change the location of these regions, permitting the classification of such region to be changed from a broad to a narrow classification.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Andersson, Karl E.; /Stockholm U. /SLAC; Peterson, J.R.
2007-04-17
We propose a new Monte Carlo method to study extended X-ray sources with the European Photon Imaging Camera (EPIC) aboard XMM-Newton. The Smoothed Particle Inference (SPI) technique, described in a companion paper, is applied here to the EPIC data for the clusters of galaxies Abell 1689, Centaurus and RXJ 0658-55 (the "bullet cluster"). We aim to show the advantages of this method of simultaneous spectral-spatial modeling over traditional X-ray spectral analysis. In Abell 1689 we confirm our earlier findings about structure in temperature distribution and produce a high resolution temperature map. We also confirm our findings about velocity structure within the gas. In the bullet cluster, RXJ 0658-55, we produce the highest resolution temperature map ever to be published of this cluster allowing us to trace what looks like the motion of the bullet in the cluster. We even detect a south to north temperature gradient within the bullet itself. In the Centaurus cluster we detect, by dividing up the luminosity of the cluster in bands of gas temperatures, a striking feature to the north-east of the cluster core. We hypothesize that this feature is caused by a subcluster left over from a substantial merger that slightly displaced the core. We conclude that our method is very powerful in determining the spatial distributions of plasma temperatures and very useful for systematic studies in cluster structure.
Cluster headache in Greece: an observational clinical and demographic study of 302 patients.
Vikelis, Michail; Rapoport, Alan M
2016-12-01
Cluster headache (CH) is considered the most excruciating primary headache syndrome; although much less prevalent than migraine, it is not rare as it affects more than 1/1000 people. While its clinical presentation is considered stereotypic, atypical features are often encountered. Internationally, cluster headache is often misdiagnosed, undertreated and mistreated. We prospectively studied 302 CH patients, all examined by the same headache specialist. The aim of our study was to describe the demographic and clinical characteristics of CH patients in Greece and draw attention to under-management, under-treatment and mis-treatment often encountered in clinical practice; our purpose is to improve recognition and successful treatment of cluster patients by Greek neurologists and other physicians. In the present cohort, clinical characteristics of CH are similar to those described in other populations. Beyond the standard clinical characteristics, features like side shifts (12.6 %), location of maximal pain intensity outside the first trigeminal branch division (10.2 %), lack of autonomic features (7 %), presence of associated features of migraine and aggravation by physical activity (10 %) were encountered. Four out of five patients had consulted a physician prior to diagnosis. The median number of physicians seen prior to diagnosis was 3 and the median time to diagnosis was 5 years, though it improved for patients with recent onset. Chronic cluster headache, side shifts, pain location in the face or the back of the head and aggravation by physical activity were found, among others, to be statistically significantly related to delayed diagnosis or more physicians seen prior to diagnosis. Even properly diagnosed patients were often undertreated or mistreated. Cluster headache, in a large cohort of Greek patients, has the same phenotypic characteristics as described internationally. 
Uncommon clinical features do exist and physicians should be aware of those, since they may eventuate in diagnostic problems. Most CH patients in Greece remain misdiagnosed or undiagnosed for rather lengthy periods of time, but time to diagnosis has improved recently. Even after diagnosis, treatment received was suboptimal.
Bag of Lines (BoL) for Improved Aerial Scene Representation
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sridharan, Harini; Cheriyadat, Anil M.
2014-09-22
Feature representation is a key step in automated visual content interpretation. In this letter, we present a robust feature representation technique, referred to as bag of lines (BoL), for high-resolution aerial scenes. The proposed technique involves extracting and compactly representing low-level line primitives from the scene. The compact scene representation is generated by counting the different types of lines representing various linear structures in the scene. Through extensive experiments, we show that the proposed scene representation is invariant to scale changes and scene conditions and can discriminate urban scene categories accurately. We compare the BoL representation with the popular scale-invariant feature transform (SIFT) and Gabor wavelets for their classification and clustering performance on an aerial scene database consisting of images acquired by sensors with different spatial resolutions. The proposed BoL representation outperforms the SIFT- and Gabor-based representations.
Wang, Z; Wang, W H; Wang, S L; Jin, J; Song, Y W; Liu, Y P; Ren, H; Fang, H; Tang, Y; Chen, B; Qi, S N; Lu, N N; Li, N; Tang, Y; Liu, X F; Yu, Z H; Li, Y X
2016-06-23
To find phenotypic subgroups of patients with pT1-2N0 invasive breast cancer by means of cluster analysis and estimate the prognosis and clinicopathological features of these subgroups. From 1999 to 2013, 4979 patients with pT1-2N0 invasive breast cancer were recruited for hierarchical clustering analysis. Age (≤40, 41-70, >70 years), size of primary tumor, pathological type, grade of differentiation, microvascular invasion, estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER-2) were used to compute the distance between patients. Hierarchical cluster analysis was performed using Ward's method. Cophenetic correlation coefficient (CPCC) and Spearman correlation coefficient were used to validate clustering structures. The CPCC was 0.603. The Spearman correlation coefficient was 0.617 (P<0.001), which indicated a good fit of hierarchy to the data. A twelve-cluster model seemed to best illustrate our patient cohort. Patients in clusters 5, 9 and 12 had the best prognosis and were characterized by age >40 years, smaller primary tumor, lower histologic grade, positive ER and PR status, and mainly negative HER-2. Patients in clusters 1 and 11 had the worst prognosis. Cluster 1 was characterized by a larger tumor, higher grade and negative ER and PR status, while cluster 11 was characterized by positive microvascular invasion. Patients in the other 7 clusters had a moderate prognosis, and patients in each cluster had distinctive clinicopathological features and recurrence patterns. This study identified distinctive clinicopathologic phenotypes in a large cohort of patients with pT1-2N0 breast cancer through hierarchical clustering and revealed differences in prognosis. This integrative model may help physicians to make more personalized decisions regarding adjuvant therapy.
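To illustrate how Ward's method and the cophenetic correlation coefficient fit together, here is a minimal pure-Python sketch. It is not the study's pipeline: it assumes numeric feature vectors with Euclidean distance (the abstract does not say how its categorical variables were turned into distances), uses a naive search over cluster pairs that only suits small inputs, and runs on invented toy points. Merge heights are reported as the square root of twice the Ward cost, so that merging two single items gives their Euclidean distance.

```python
from itertools import combinations
from math import dist, sqrt


def centroid(points):
    """Component-wise mean of a list of feature vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def ward_cost(a, b, X):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca = centroid([X[i] for i in a])
    cb = centroid([X[i] for i in b])
    return len(a) * len(b) / (len(a) + len(b)) * dist(ca, cb) ** 2


def ward_cophenetic(X):
    """Agglomerate with Ward's criterion; return the merge height for every item pair."""
    clusters = [[i] for i in range(len(X))]
    coph = {}
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda p: ward_cost(p[0], p[1], X))
        height = sqrt(2.0 * ward_cost(a, b, X))
        for i in a:
            for j in b:
                coph[frozenset((i, j))] = height  # first time i and j share a cluster
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return coph


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    su = sqrt(sum((p - mu) ** 2 for p in u))
    sv = sqrt(sum((q - mv) ** 2 for q in v))
    return cov / (su * sv)


def cpcc(X):
    """Cophenetic correlation: original pairwise distances vs. merge heights."""
    coph = ward_cophenetic(X)
    pairs = list(combinations(range(len(X)), 2))
    original = [dist(X[i], X[j]) for i, j in pairs]
    cophenetic = [coph[frozenset(p)] for p in pairs]
    return pearson(original, cophenetic)


# Toy data (invented): two well-separated pairs of points.
X = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
coph = ward_cophenetic(X)
r = cpcc(X)
print(r)
```

A value near 1 means the hierarchy preserves the original pairwise distances well; the toy data are constructed so that this is the case.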
Iris Recognition Using Image Moments and k-Means Algorithm
Khan, Yaser Daanial; Khan, Sher Afzal; Ahmad, Farooq; Islam, Saeed
2014-01-01
This paper presents a biometric technique for identification of a person using the iris image. The iris is first segmented from the acquired image of an eye using an edge detection algorithm. The disk shaped area of the iris is transformed into a rectangular form. Described moments are extracted from the grayscale image which yields a feature vector containing scale, rotation, and translation invariant moments. Images are clustered using the k-means algorithm and centroids for each cluster are computed. An arbitrary image is assumed to belong to the cluster whose centroid is the nearest to the feature vector in terms of Euclidean distance computed. The described model exhibits an accuracy of 98.5%. PMID:24977221
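The final classification rule described above, assigning an image to the cluster whose centroid is nearest in Euclidean distance, can be sketched in a few lines of Python. The centroid values, labels, and query vector below are invented placeholders; the paper's actual moment features and cluster centroids are not given in the abstract.

```python
from math import dist


def nearest_centroid(feature_vector, centroids):
    """Return the label of the centroid closest to feature_vector (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(feature_vector, centroids[label]))


# Hypothetical centroids, as if produced by a prior k-means run on moment features.
centroids = {
    "cluster_a": (0.20, 0.10, 0.90),
    "cluster_b": (0.80, 0.70, 0.10),
}

query = (0.25, 0.20, 0.80)  # hypothetical feature vector of a new iris image
print(nearest_centroid(query, centroids))
```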
AMOEBA clustering revisited. [cluster analysis, classification, and image display program
NASA Technical Reports Server (NTRS)
Bryant, Jack
1990-01-01
A description of the clustering, classification, and image display program AMOEBA is presented. Using a difficult high resolution aircraft-acquired MSS image, the steps the program takes in forming clusters are traced. A number of new features are described here for the first time. Usage of the program is discussed. The theoretical foundation (the underlying mathematical model) is briefly presented. The program can handle images of any size and dimensionality.
Evidence for cluster shape effects on the kinetic energy spectrum in thermionic emission.
Calvo, F; Lépine, F; Baguenard, B; Pagliarulo, F; Concina, B; Bordas, C; Parneix, P
2007-11-28
Experimental kinetic energy release distributions obtained for the thermionic emission from $C_n^-$ clusters, $10 \le n \le 20$, exhibit significant non-Boltzmann variations. Using phase space theory, these different features are analyzed and interpreted as the consequence of contrasting shapes in the daughter clusters; linear and nonlinear isomers have clearly distinct signatures. These results provide a novel indirect structural probe for atomic clusters associated with their thermionic emission spectra.
Inano, Rika; Oishi, Naoya; Kunieda, Takeharu; Arakawa, Yoshiki; Kikuchi, Takayuki; Fukuyama, Hidenao; Miyamoto, Susumu
2016-07-26
Preoperative glioma grading is important for therapeutic strategies and influences prognosis. Intratumoral heterogeneity can cause an underestimation of grading because of the sampling error in biopsies. We developed a voxel-based unsupervised clustering method with multiple magnetic resonance imaging (MRI)-derived features using a self-organizing map followed by K-means. This method produced novel magnetic resonance-based clustered images (MRcIs) that enabled the visualization of glioma grades in 36 patients. The 12-class MRcIs revealed the highest classification performance for the prediction of glioma grading (area under the receiver operating characteristic curve = 0.928; 95% confidence interval = 0.920-0.936). Furthermore, we also created 12-class MRcIs in four new patients using the previous data from the 36 patients as training data and obtained tissue sections of the classes 11 and 12, which were significantly higher in high-grade gliomas (HGGs), and those of classes 4, 5 and 9, which were not significantly different between HGGs and low-grade gliomas (LGGs), according to a MRcI-based navigational system. The tissues of classes 11 and 12 showed features of malignant glioma, whereas those of classes 4, 5 and 9 showed LGGs without anaplastic features. These results suggest that the proposed voxel-based clustering method provides new insights into preoperative regional glioma grading.
Co-Clustering by Bipartite Spectral Graph Partitioning for Out-of-Tutor Prediction
ERIC Educational Resources Information Center
Trivedi, Shubhendu; Pardos, Zachary A.; Sarkozy, Gabor N.; Heffernan, Neil T.
2012-01-01
Learning a more distributed representation of the input feature space is a powerful method to boost the performance of a given predictor. Often this is accomplished by partitioning the data into homogeneous groups by clustering so that separate models could be trained on each cluster. Intuitively each such predictor is a better representative of…
Tavazzi, Eleonora; Laganà, Maria Marcella; Bergsland, Niels; Tortorella, Paola; Pinardi, Giovanna; Lunetta, Christian; Corbo, Massimo; Rovaris, Marco
2015-03-01
Primary progressive multiple sclerosis (PPMS) and amyotrophic lateral sclerosis (ALS) seem to share some clinical and pathological features. MRI studies revealed the presence of grey matter (GM) atrophy in both diseases, but no comparative data are available. The objective was to compare the regional patterns of GM tissue loss in PPMS and ALS with voxel-based morphometry (VBM). Eighteen PPMS patients, 20 ALS patients, and 31 healthy controls (HC) were studied with a 1.5 Tesla scanner. VBM was performed to assess volumetric GM differences with age and sex as covariates. Threshold-free cluster enhancement analysis was used to obtain significant clusters. Group comparisons were tested with family-wise error correction for multiple comparisons (p < 0.05) except for HC versus MND which was tested at a level of p < 0.001 uncorrected and a cluster threshold of 20 contiguous voxels. Compared to HC, ALS patients showed GM tissue reduction in selected frontal and temporal areas, while PPMS patients showed a widespread bilateral GM volume decrease, involving both deep and cortical regions. Compared to ALS, PPMS patients showed tissue volume reductions in both deep and cortical GM areas. This preliminary study confirms that PPMS is characterized by a more diffuse cortical and subcortical GM atrophy than ALS and that, in the latter condition, brain damage is present outside the motor system. These results suggest that PPMS and ALS may share pathological features leading to GM tissue loss.
A Survey on Node Clustering in Cognitive Radio Wireless Sensor Networks.
Joshi, Gyanendra Prasad; Kim, Sung Won
2016-09-10
Cognitive radio wireless sensor networks (CR-WSNs) have attracted a great deal of attention recently due to the emerging spectrum scarcity issue. This work attempts to provide a detailed analysis of the role of node clustering in CR-WSNs. We outline the objectives, requirements, and advantages of node clustering in CR-WSNs. We describe how a CR-WSN with node clustering differs from conventional wireless sensor networks, and we discuss its characteristics, architecture, and topologies. We survey the existing clustering algorithms and compare their objectives and features. We suggest how clustering issues and challenges can be handled.
Procedure of Partitioning Data Into Number of Data Sets or Data Group - A Review
NASA Astrophysics Data System (ADS)
Kim, Tai-Hoon
The goal of clustering is to decompose a dataset into similar groups based on an objective function. Several well-established algorithms exist for data clustering. Their objective is to divide the data points of the feature space into a number of groups (or classes) so that a predefined set of criteria is satisfied. The article presents a comparative study of the effectiveness and efficiency of traditional data clustering algorithms. To evaluate the performance of the clustering algorithms, the Minkowski score is used on different data sets.
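The abstract does not define the Minkowski score, so the sketch below assumes the commonly used pair-counting form: with n11 the number of item pairs grouped together in both the reference and the computed clustering, n10 the pairs together only in the reference, and n01 the pairs together only in the computed clustering, the score is sqrt((n01 + n10) / (n11 + n10)). Under this definition 0 means perfect agreement and lower is better. The function name and label lists are illustrative.

```python
from itertools import combinations
from math import sqrt


def minkowski_score(true_labels, pred_labels):
    """Pair-counting Minkowski score (assumed definition); 0 means perfect agreement."""
    n11 = n10 = n01 = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_true = true_labels[i] == true_labels[j]
        same_pred = pred_labels[i] == pred_labels[j]
        if same_true and same_pred:
            n11 += 1
        elif same_true:
            n10 += 1
        elif same_pred:
            n01 += 1
    # Undefined (division by zero) if the reference puts no two items together.
    return sqrt((n01 + n10) / (n11 + n10))


reference = [0, 0, 0, 1, 1, 1]
relabelled = [1, 1, 1, 0, 0, 0]   # same grouping, different label names
one_moved = [0, 0, 1, 1, 1, 1]    # one item placed in the wrong group

print(minkowski_score(reference, relabelled))
print(minkowski_score(reference, one_moved))
```

Because the score only compares which pairs are grouped together, it is unaffected by how the clusters happen to be numbered.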
Paraskevopoulou, Sivylla E; Barsakcioglu, Deren Y; Saberi, Mohammed R; Eftekhar, Amir; Constandinou, Timothy G
2013-04-30
Next generation neural interfaces aspire to achieve real-time multi-channel systems by integrating spike sorting on chip to overcome limitations in communication channel capacity. The feasibility of this approach relies on developing highly efficient algorithms for feature extraction and clustering with the potential of low-power hardware implementation. We are proposing a feature extraction method, not requiring any calibration, based on first and second derivative features of the spike waveform. The accuracy and computational complexity of the proposed method are quantified and compared against commonly used feature extraction methods, through simulation across four datasets (with different single units) at multiple noise levels (ranging from 5 to 20% of the signal amplitude). The average classification error is shown to be below 7% with a computational complexity of 2N-3, where N is the number of sample points of each spike. Overall, this method presents a good trade-off between accuracy and computational complexity and is thus particularly well-suited for hardware-efficient implementation. Copyright © 2013 Elsevier B.V. All rights reserved.
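A small Python sketch shows one way to read the stated 2N-3 complexity, assuming it counts subtractions: the first derivative of an N-sample spike takes N-1 differences and the second derivative takes N-2, giving 2N-3 in total. The particular summary features returned below (extrema of the two derivatives) are a hypothetical choice for illustration; the abstract does not list which derivative features the method uses.

```python
def derivative_features(spike):
    """Discrete first and second derivatives of a spike waveform (needs >= 3 samples)."""
    n = len(spike)
    first = [spike[i + 1] - spike[i] for i in range(n - 1)]    # N - 1 subtractions
    second = [first[i + 1] - first[i] for i in range(n - 2)]   # N - 2 subtractions
    subtractions = len(first) + len(second)                    # 2N - 3 in total
    # Hypothetical feature choice: extrema of the derivatives.
    features = (max(first), min(first), max(second))
    return features, subtractions


spike = [0, 1, 4, 9, 4, 1, 0]  # invented 7-sample waveform
features, subtractions = derivative_features(spike)
print(features, subtractions)
```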
DOE Office of Scientific and Technical Information (OSTI.GOV)
2014-08-14
RolX takes the features from Re-FeX or any other feature matrix as input and outputs role assignments (clusters). The output of RolX is a csv file containing the node-role memberships and a csv file containing the role-feature definitions.
Lindesmith, Lisa C; Kocher, Jacob F; Donaldson, Eric F; Debbink, Kari; Mallory, Michael L; Swann, Excel W; Brewer-Jensen, Paul D; Baric, Ralph S
2017-12-05
Human norovirus is a significant public health burden, with >30 genotypes causing endemic levels of disease and strains from the GII.4 genotype causing serial pandemics as the virus evolves new ligand binding and antigenicity features. During 2014-2015, genotype GII.17 cluster IIIb strains emerged as the leading cause of norovirus infection in select global locations. Comparison of capsid sequences indicates that GII.17 is evolving at previously defined GII.4 antibody epitopes. Antigenicity of virus-like particles (VLPs) representative of clusters I, II, and IIIb GII.17 strains were compared by a surrogate neutralization assay based on antibody blockade of ligand binding. Sera from mice immunized with a single GII.17 VLP identified antigenic shifts between each cluster of GII.17 strains. Ligand binding of GII.17 cluster IIIb VLP was blocked only by antisera from mice immunized with cluster IIIb VLPs. Exchange of residues 393-396 from GII.17.2015 into GII.17.1978 ablated ligand binding and altered antigenicity, defining an important varying epitope in GII.17. The capsid sequence changes in GII.17 strains result in loss of blockade antibody binding, indicating that viral evolution, specifically at residues 393-396, may have contributed to the emergence of cluster IIIb strains and the persistence of GII.17 in human populations. © The Author 2017. Published by Oxford University Press for the Infectious Diseases Society of America. All rights reserved. For permissions, e-mail: journals.permissions@oup.com.
Muraoka, Azusa; Inokuchi, Yoshiya; Hammer, Nathan I; Shin, Joong-Won; Johnson, Mark A; Nagata, Takashi
2009-08-06
The [(CO2)n(H2O)]⁻ cluster anions are studied using infrared photodissociation (IPD) spectroscopy in the 2800-3800 cm⁻¹ range. The observed IPD spectra display a drastic change in the vibrational band features at n = 4, indicating a sharp discontinuity in the structural evolution of the monohydrated cluster anions. The n = 2 and 3 spectra are composed of a series of sharp bands around 3600 cm⁻¹, which are assignable to the stretching vibrations of H2O bound to C2O4⁻ in a double ionic hydrogen-bonding (DIHB) configuration, as was previously discussed (J. Chem. Phys. 2005, 122, 094303). In the n ≥ 4 spectrum, a pair of intense bands additionally appears at approximately 3300 cm⁻¹. With the aid of ab initio calculations at the MP2/6-31+G* level, the 3300 cm⁻¹ bands are assigned to the bending overtone and the hydrogen-bonded OH vibration of H2O bound to CO2⁻ via a single O-H...O linkage. Thus, the structures of [(CO2)n(H2O)]⁻ evolve with cluster size such that DIHB to C2O4⁻ is favored in the smaller clusters with n = 2 and 3 whereas CO2⁻ is preferentially stabilized via the formation of a single ionic hydrogen-bonding (SIHB) configuration in the larger clusters with n ≥ 4.
Tidal stripping stellar substructures around four metal-poor globular clusters in the galactic bulge
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chun, Sang-Hyun; Kang, Minhee; Jung, DooSeok
2015-01-01
We investigate the spatial density configuration of stars around four metal-poor globular clusters (NGC 6266, NGC 6626, NGC 6642, and NGC 6723) in the Galactic bulge region using wide-field deep J, H, and K imaging data obtained with the Wide Field Camera near-infrared array on the United Kingdom Infrared Telescope. A statistical weighted filtering algorithm for the stars on the color–magnitude diagram is applied in order to sort cluster member candidates from the field star contamination. In two-dimensional isodensity contour maps of the clusters, we find that all four of the globular clusters exhibit strong evidence of tidally stripped stellar features beyond the tidal radius in the form of tidal tails or small density lobes/chunks. The orientations of the extended stellar substructures are likely to be associated with the effect of dynamic interaction with the Galaxy and the cluster's space motion. The observed radial density profiles of the four globular clusters also describe the extended substructures; they depart from theoretical King and Wilson models and have an overdensity feature with a break in the slope of the profile at the outer region of clusters. The observed results could imply that four globular clusters in the Galactic bulge region have experienced strong environmental effects such as tidal forces or bulge/disk shocks of the Galaxy during the dynamical evolution of globular clusters. These observational results provide further details which add to our understanding of the evolution of clusters in the Galactic bulge region as well as the formation of the Galaxy.
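To make the comparison with a King model concrete, the sketch below assumes the empirical King (1962) surface-density form, f(r) = k [ (1 + (r/r_c)^2)^(-1/2) - (1 + (r_t/r_c)^2)^(-1/2) ]^2 for r < r_t and zero beyond the tidal radius r_t. The abstract does not state which King formulation was fitted, and the parameter values and "observed" profile are invented for illustration. Under this form, any positive density measured at or beyond r_t is an excess over the model, which is the kind of extra-tidal feature described above.

```python
from math import sqrt


def king_profile(r, k, r_c, r_t):
    """Empirical King (1962) surface density; zero at and beyond the tidal radius r_t."""
    if r >= r_t:
        return 0.0
    term = 1.0 / sqrt(1.0 + (r / r_c) ** 2) - 1.0 / sqrt(1.0 + (r_t / r_c) ** 2)
    return k * term ** 2


def excess_beyond_tidal_radius(profile, r_t):
    """Return (radius, density) points with positive density at or beyond r_t."""
    return [(r, dens) for r, dens in profile if r >= r_t and dens > 0.0]


# Invented parameters and profile, in arbitrary units.
k, r_c, r_t = 1.0, 1.0, 5.0
observed = [(0.5, 0.60), (2.0, 0.06), (4.0, 0.004), (6.0, 0.002), (8.0, 0.001)]

for r, dens in observed:
    print(r, dens, king_profile(r, k, r_c, r_t))
print(excess_beyond_tidal_radius(observed, r_t))
```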
Sutton, Katherine S; Stratton, Natalie; Pytyck, Jennifer; Kolla, Nathan J; Cantor, James M
2015-01-01
Hypersexuality remains an increasingly common but poorly understood patient complaint. Despite diversity in clinical presentations of patients referred for hypersexuality, the literature has maintained treatment approaches that are assumed to apply to the entire phenomenon. This approach has proven ineffective, despite its application over several decades. The present study used quantitative methods to examine demographic, mental health, and sexological correlates of common clinical subtypes of hypersexuality referrals. Findings support the existence of subtypes, each with distinct clusters of features. Paraphilic hypersexuals reported greater numbers of sexual partners, more substance abuse, initiation to sexual activity at an earlier age, and novelty as a driving force behind their sexual behavior. Avoidant masturbators reported greater levels of anxiety, delayed ejaculation, and use of sex as an avoidance strategy. Chronic adulterers reported premature ejaculation and later onset of puberty. Designated patients were less likely to report substance abuse, employment, or finance problems. Although quantitative, this article nonetheless presents a descriptive study in which the underlying typology emerged from features most salient in routine sexological assessment. Future studies might apply purely empirical statistical techniques, such as cluster analyses, to ascertain to what extent similar typologies emerge when examined prospectively.
The Impact of Partial Measurement Invariance on Testing Moderation for Single and Multi-Level Data
Hsiao, Yu-Yu; Lai, Mark H. C.
2018-01-01
Moderation effect is a commonly used concept in the field of social and behavioral science. Several studies regarding the implication of moderation effects have been done; however, little is known about how partial measurement invariance influences the properties of tests for moderation effects when categorical moderators are used. Additionally, whether the impact is the same across single and multilevel data is still unknown. Hence, the purpose of the present study is twofold: (a) To investigate the performance of the moderation test in single-level studies when measurement invariance does not hold; (b) To examine whether unique features of multilevel data, such as intraclass correlation (ICC) and number of clusters, influence the effect of measurement non-invariance on the performance of tests for moderation. Simulation results indicated that falsely assuming measurement invariance led to biased estimates, inflated Type I error rates, and more gain or more loss in power (depending on simulation conditions) for the test of moderation effects. Such patterns were more salient as sample size and the number of non-invariant items increased for both single- and multi-level data. With multilevel data, the cluster size seemed to have a larger impact than the number of clusters when falsely assuming measurement invariance in the moderation estimation. ICC was trivially related to the moderation estimates. Overall, when testing moderation effects with categorical moderators, employing a model that accounts for the measurement (non)invariance structure of the predictor and/or the outcome is recommended. PMID:29867692
NASA Astrophysics Data System (ADS)
Myint, S. W.; Zheng, B.; Fan, C.; Kaplan, S.; Brazel, A.; Middel, A.; Smith, M.
2014-12-01
While the relationship between fractional cover of anthropogenic and vegetation features and the urban heat island has been well studied, the effect of spatial arrangements (e.g., clustered, dispersed) of these features on urban warming or cooling is not well understood. The goal of this study is to examine if and how the spatial configuration of land cover features influences land surface temperatures (LST) in urban areas. This study focuses on Phoenix, AZ and Las Vegas, NV, which have undergone dramatic urban expansion. The data used to classify detailed urban land cover types include Geoeye-1 (Las Vegas) and QuickBird (Phoenix). The Geoeye-1 image (3 m resolution) was acquired on October 12, 2011 and the QuickBird image (2.4 m resolution) was taken on May 29, 2007. Classification was performed using object based image analysis (OBIA). We employed a spatial autocorrelation approach (i.e., Moran's I) that measures the spatial dependence of a point on its neighboring points and describes how clustered or dispersed points are arranged in space. We used Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) data acquired over Phoenix (daytime on June 10, 2011 and nighttime on October 17, 2011) and Las Vegas (daytime on July 6, 2005 and nighttime on August 27, 2005) to examine daytime and nighttime LST with regard to the spatial arrangement of anthropogenic and vegetation features. We spatially correlate Moran's I values of each land cover with surface temperature, and develop regression models. The spatial configuration of grass and trees shows strong negative correlations with LST, implying that clustered vegetation lowers surface temperatures more effectively. In contrast, a clustered spatial arrangement of anthropogenic land-cover features, especially impervious surfaces, significantly elevates surface temperatures. Results from this study suggest that the spatial configuration of anthropogenic and vegetation features influences urban warming and cooling.
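The Moran's I statistic named in this abstract can be illustrated with a minimal Python sketch. The four-cell transect, the binary adjacency weights, and the function name `morans_i` are assumptions made for illustration; they are not the study's data or code.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    # Cross-products of deviations, weighted by spatial adjacency.
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / w_sum) * (num / den)

# Hypothetical 1-D transect of four cells; neighbours share an edge.
chain_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1, 1, 0, 0], chain_weights)   # like values sit together
dispersed = morans_i([1, 0, 1, 0], chain_weights)   # like values alternate
print(round(clustered, 3), round(dispersed, 3))
```

Positive values indicate that similar cells sit next to each other (clustered), and negative values indicate alternation (dispersed), which is the contrast the abstract relates to surface temperature.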
NASA Astrophysics Data System (ADS)
Soltanian-Zadeh, Hamid; Windham, Joe P.; Peck, Donald J.
1997-04-01
This paper presents development and performance evaluation of an MRI feature space method. The method is useful for: identification of tissue types; segmentation of tissues; and quantitative measurements on tissues, to obtain information that can be used in decision making (diagnosis, treatment planning, and evaluation of treatment). The steps of the work accomplished are as follows: (1) Four T2-weighted and two T1-weighted images (before and after injection of Gadolinium) were acquired for ten tumor patients. (2) Images were analyzed by two image analysts according to the following algorithm. The intracranial brain tissues were segmented from the scalp and background. The additive noise was suppressed using a multi-dimensional non-linear edge-preserving filter which preserves partial volume information on average. Image nonuniformities were corrected using a modified lowpass filtering approach. The resulting images were used to generate and visualize an optimal feature space. Cluster centers were identified on the feature space. Then images were segmented into normal tissues and different zones of the tumor. (3) Biopsy samples were extracted from each patient and were subsequently analyzed by the pathology laboratory. (4) Image analysis results were compared to each other and to the biopsy results. Pre- and post-surgery feature spaces were also compared. The proposed algorithm made it possible to visualize the MRI feature space and to segment the image. In all cases, the operators were able to find clusters for normal and abnormal tissues. Also, clusters for different zones of the tumor were found. Based on the clusters marked for each zone, the method successfully segmented the image into normal tissues (white matter, gray matter, and CSF) and different zones of the lesion (tumor, cyst, edema, radiation necrosis, necrotic core, and infiltrated tumor). The results agreed with those obtained from the biopsy samples. 
Comparison of pre- to post-surgery and radiation feature spaces confirmed that the tumor was not present in the second study but radiation necrosis was generated as a result of radiation.
SU-F-R-33: Can CT and CBCT Be Used Simultaneously for Radiomics Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Luo, R; Wang, J; Zhong, H
2016-06-15
Purpose: To investigate whether CBCT and CT can be used in radiomics analysis simultaneously, and to establish a batch correction method for radiomics in two similar image modalities. Methods: Four sites, including rectum, bladder, femoral head and lung, were considered as regions of interest (ROI) in this study. For each site, 10 treatment planning CT images were collected. In addition, 10 CBCT images from the same site of the same patient were acquired at the first radiotherapy fraction. 253 radiomics features, which were selected by our test-retest study at rectum cancer CT (ICC>0.8), were calculated for both CBCT and CT images in MATLAB. Simple scaling (z-score) and nonlinear correction methods were applied to the CBCT radiomics features. The Pearson Correlation Coefficient was calculated to analyze the correlation between radiomics features of CT and CBCT images before and after correction. Cluster analysis of mixed data (for each site, 5 CT and 5 CBCT data are randomly selected) was implemented to validate the feasibility of merging radiomics data from CBCT and CT. The consistency of clustering result and site grouping was verified by a chi-square test for the different datasets respectively. Results: For simple scaling, 234 of the 253 features have correlation coefficient ρ>0.8, among which 154 features have ρ>0.9. For radiomics data after nonlinear correction, 240 of the 253 features have ρ>0.8, among which 220 features have ρ>0.9. Cluster analysis of mixed data shows that data of the four sites was almost precisely separated for simple scaling (p = 1.29 × 10⁻⁷, χ² test) and nonlinear correction (p = 5.98 × 10⁻⁷, χ² test), which is similar to the cluster result of CT data (p = 4.52 × 10⁻⁸, χ² test). Conclusion: Radiomics data from CBCT can be merged with those from CT by simple scaling or nonlinear correction for radiomics analysis.
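The simple scaling step and the correlation check described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the feature values are made up, the z-score is taken per feature across patients within each modality, and the helper names `z_score` and `pearson` are hypothetical. The abstract does not specify how its scaling or correlation was set up, so this is not the authors' procedure.

```python
import math
import statistics

def z_score(xs):
    """Rescale one feature to zero mean and unit (population) std."""
    mu = statistics.mean(xs)
    sd = statistics.pstdev(xs)
    return [(x - mu) / sd for x in xs]

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Made-up values of one radiomics feature for five patients.
ct_feature = [10.0, 12.0, 15.0, 11.0, 14.0]
cbct_feature = [105.0, 118.0, 160.0, 112.0, 139.0]  # different raw scale

ct_z = z_score(ct_feature)
cbct_z = z_score(cbct_feature)
r = pearson(ct_z, cbct_z)
print(round(r, 3))
```

In this per-feature setup the z-score puts both modalities on a common scale so they can be pooled for clustering; Pearson's r itself is unchanged by such a linear rescaling.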
A Probabilistic Embedding Clustering Method for Urban Structure Detection
NASA Astrophysics Data System (ADS)
Lin, X.; Li, H.; Zhang, Y.; Gao, L.; Zhao, L.; Deng, M.
2017-09-01
Urban structure detection is a basic task in urban geography. Clustering is a core technology for detecting patterns of urban spatial structure, urban functional regions, and so on. In the big data era, diverse urban sensing datasets that record information such as human behaviour and human social activity suffer from the complexity of high dimensionality and high noise. Unfortunately, state-of-the-art clustering methods do not handle high dimensionality and high noise concurrently. In this paper, a probabilistic embedding clustering method is proposed. Firstly, we propose a Probabilistic Embedding Model (PEM) to find latent features in high-dimensional urban sensing data by "learning" via a probabilistic model. With latent features, we can capture the essential features, known as patterns, hidden in high-dimensional data; with the probabilistic model, we can also reduce the uncertainty caused by high noise. Secondly, by tuning the parameters, our model can discover two kinds of urban structure, homophily and structural equivalence, that is, communities with intensive interaction or with the same roles in the urban structure. We evaluated the performance of our model by conducting experiments on real-world data; experiments with real data from Shanghai (China) showed that our method can discover both kinds of urban structure.
Contribution of long-range transport to the ozone levels recorded in the Northeast of Portugal
NASA Astrophysics Data System (ADS)
Gama, C.; Nunes, T.; Marques, M. C.; Ferreira, F.
2009-04-01
In the past four years (2004-2007), measurements carried out at Lamas de Olo, the only background air quality monitoring station in the Northeast of Portugal, showed high ozone concentrations (97.7 ± 29.7 µg m⁻³). This remote site, located in the middle of Alvão Natural Park, in Portugal, 1086 m asl, plays a significant role in the total number of exceedances registered in the national air quality network. The analysis of the data recorded at this monitoring station revealed an annual cycle of ozone concentrations similar to the ones observed in other background sites of the Northern Hemisphere (Monks, 2000; Vingarzan and Taylor, 2003). This common feature comprises a distinct maximum during spring (peaking during the month of April). Nevertheless, it is during the summer that the hourly concentrations are higher, due to the typical atmospheric and meteorological conditions that promote photochemical pollution episodes. Photochemical pollution episodes can be related to production of ozone on a local scale or on a global scale due to the transport of polluted air masses. For this reason, analysing these events is crucial to fully understand the behaviour of ozone in the Northeast of Portugal, in order to adopt the correct long-term policies. With the purpose of studying the influence of long-range transport on the ozone levels recorded at Lamas de Olo, a cluster analysis was performed on 96-hour air mass back trajectories. Different trajectory clusters represent air masses with different source regions of atmospheric pollutants, and the influence of these regions on the atmospheric composition at the arrival point (receptor) of the trajectories can therefore be assessed (EMPA, 2008). The back trajectories were simulated 4 times per day, using the HYSPLIT model. A "bottom-up" cluster methodology was used to group trajectories into clusters according to their characteristics, for several time periods with similar ozone levels and/or distributions. 
Average ozone levels were calculated for each cluster and the differences between the groups were validated using the Kruskal-Wallis statistical test. The results have shown a significant influence of the transport path on ozone concentrations, which is more noticeable when photochemical pollution episodes are more probable. Air masses from Europe (Spain, France, United Kingdom, etc.) generally bring higher ozone levels than the ones arriving from the Atlantic Ocean. This feature shows the role of photochemical production along long-range transport phenomena, and the input of pollutants into air masses along their path. A more detailed analysis at local/regional scale, supported mainly by an intensive field campaign performed during spring/summer of 2006 in the vicinity of Alvão Natural Park (FOTONET Project), at different altitudes, together with pollutant measurements from rural air quality stations in the north of Portugal and one from Spain (Peñausende), was carried out in order to evaluate the extent of photochemical pollution in the Northeast of Portugal. Ozone concentration measurements in the region showed a noticeable decrease with altitude, mainly at night. In summary, the back-trajectory-based analysis has demonstrated that other countries, mainly Spain, contribute decisively to the ozone levels registered at the station used for this study. Based on this knowledge, we point to the need to consider common international policies when controlling ozone levels in the environment. References: Monks, P. (2000): A review of the observations and origins of the spring ozone maximum. Atmospheric Environment 34, 3545-3561. Vingarzan, R., Taylor, B. (2003): Trend analysis of ground level ozone in the greater Vancouver / Fraser Valley area of British Columbia. Atmospheric Environment 37, 2159-2171. EMPA (2008): Air mass trajectory clustering. 
Retrieved 01 November 2008 from: http://www.empa.ch/plugin/template/empa/*/63288/—/l=1
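The comparison of ozone levels between trajectory clusters with the Kruskal-Wallis test can be sketched in plain Python. The cluster names and ozone values below are hypothetical, and the function omits the tie correction, so it is an illustration of the H statistic rather than a replacement for a statistics package.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of the 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * h - 3.0 * (n_total + 1)

# Hypothetical mean ozone levels per trajectory cluster.
atlantic = [60.0, 65.0, 70.0]
iberian = [80.0, 85.0, 90.0]
continental = [100.0, 105.0, 110.0]

h_stat = kruskal_wallis_h([atlantic, iberian, continental])
print(round(h_stat, 2))
```

For larger samples, H is usually compared against a chi-square distribution with (number of groups − 1) degrees of freedom; with groups this small an exact table would be more appropriate.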
NASA Astrophysics Data System (ADS)
Tian, Wen-Juan; Zhao, Li-Juan; Chen, Qiang; Ou, Ting; Xu, Hong-Guang; Zheng, Wei-Jun; Zhai, Hua-Jin; Li, Si-Dian
2015-04-01
Gas-phase anion photoelectron spectroscopy (PES) is combined with global structural searches and electronic structure calculations at the hybrid Becke 3-parameter exchange functional and Lee-Yang-Parr correlation functional (B3LYP) and single-point coupled-cluster with single, double, and perturbative triple excitations (CCSD(T)) levels to probe the structural and electronic properties and chemical bonding of the B4O4^{0/-} clusters. The measured PES spectra of B4O4- exhibit a major band with the adiabatic and vertical detachment energies (ADE and VDE) of 2.64 ± 0.10 and 2.81 ± 0.10 eV, respectively, as well as a weak peak with the ADE and VDE of 1.42 ± 0.08 and 1.48 ± 0.08 eV. The former band proves to correspond to the Y-shaped global minimum of Cs B4O4- (2A″), with the calculated ADE/VDE of 2.57/2.84 eV at the CCSD(T) level, whereas the weak band is associated with the second lowest-energy, rhombic isomer of D2h B4O4- (2B2g) with the predicted ADE/VDE of 1.43/1.49 eV. Both anion structures are planar, featuring a B atom or a B2O2 core bonded with terminal BO and/or BO2 groups. The same Y-shaped and rhombic structures are also located for the B4O4 neutral cluster, albeit with a reversed energy order. Bonding analyses reveal dual three-center four-electron (3c-4e) π hyperbonds in the Y-shaped B4O4^{0/-} clusters and a four-center four-electron (4c-4e) π bond, that is, the so-called o-bond in the rhombic B4O4^{0/-} clusters. This work is the first experimental study on a molecular system with an o-bond.
Method for indexing and retrieving manufacturing-specific digital imagery based on image content
Ferrell, Regina K.; Karnowski, Thomas P.; Tobin, Jr., Kenneth W.
2004-06-15
A method for indexing and retrieving manufacturing-specific digital images based on image content comprises three steps. First, at least one feature vector can be extracted from a manufacturing-specific digital image stored in an image database. In particular, each extracted feature vector corresponds to a particular characteristic of the manufacturing-specific digital image, for instance, a digital image modality and overall characteristic, a substrate/background characteristic, and an anomaly/defect characteristic. Notably, the extracting step includes generating a defect mask using a detection process. Second, using an unsupervised clustering method, each extracted feature vector can be indexed in a hierarchical search tree. Third, a manufacturing-specific digital image associated with a feature vector stored in the hierarchical search tree can be retrieved, wherein the manufacturing-specific digital image has image content comparably related to the image content of the query image. More particularly, the retrieving step can include two data reductions, the first performed based upon a query vector extracted from a query image. Subsequently, a user can select relevant images resulting from the first data reduction. From the selection, a prototype vector can be calculated, from which a second-level data reduction can be performed. The second-level data reduction can result in a subset of feature vectors comparable to the prototype vector, and further comparable to the query vector. An additional fourth step can include managing the hierarchical search tree by substituting a vector average for several redundant feature vectors encapsulated by nodes in the hierarchical search tree.
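The two-level data reduction described in this abstract (query, user selection, prototype vector, second query) can be sketched as below. This is a simplified illustration: it searches a flat dictionary with Euclidean distance instead of a hierarchical search tree, and the image names, vectors, and the helpers `nearest` and `prototype` are all hypothetical.

```python
import math

def nearest(db, target, k):
    """Names of the k feature vectors in db closest to target."""
    return sorted(db, key=lambda name: math.dist(db[name], target))[:k]

def prototype(db, names):
    """Component-wise mean of the selected feature vectors."""
    vecs = [db[n] for n in names]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

# Hypothetical image database: name -> feature vector.
db = {
    "img_a": [0.0, 0.0],
    "img_b": [1.0, 0.0],
    "img_c": [0.0, 1.0],
    "img_d": [5.0, 5.0],
    "img_e": [1.0, 1.0],
}

query = [0.2, 0.1]
first = nearest(db, query, 4)                  # first data reduction
selected = ["img_b", "img_e"]                  # stands in for the user's picks
proto = prototype(db, selected)                # prototype vector
subset = {name: db[name] for name in first}
second = nearest(subset, proto, 2)             # second-level data reduction
print(first, proto, second)
```

Restricting the second search to the survivors of the first keeps the final results comparable to both the prototype and the original query, which is the property the abstract describes.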
Pollen morphology and plant taxonomy of white oaks in eastern North America
DOE Office of Scientific and Technical Information (OSTI.GOV)
Solomon, A.M.
An evaluation of possible approaches to fossil oak pollen identification utilized scanning electron microscopy to examine exine-surface features of 171 collections, representing 16 Quercus subgenus Lepidobalanus species and varieties of eastern North America. Twenty qualitative pollen morphological characters were defined and tabulated for each of 217 pollen grains. The data were subjected to cluster analysis and cluster diagrams were compared with published white oak taxonomy. Pollen morphology and plant taxonomy compared well in series of the subgenus Lepidobalanus due primarily to consistency of character presence and absence within species and varieties. Pollen morphology of white oaks appears to reflect plant systematics above the species level. Use of routine SEM analysis to identify series of white oaks among fossil pollen grains likely will yield valid results. 38 references.
Partitioning of the degradation space for OCR training
NASA Astrophysics Data System (ADS)
Barney Smith, Elisa H.; Andersen, Tim
2006-01-01
Generally speaking, optical character recognition algorithms tend to perform better when presented with homogeneous data. This paper studies a method that is designed to increase the homogeneity of training data, based on an understanding of the types of degradations that occur during the printing and scanning process, and how these degradations affect the homogeneity of the data. While it has been shown that dividing the degradation space by edge spread improves recognition accuracy over dividing the degradation space by threshold or point spread function width alone, the challenge is in deciding how many partitions to use and at what values of edge spread the divisions should be made. Clustering of different types of character features, fonts, sizes, resolutions and noise levels shows that edge spread is indeed a strong indicator of the homogeneity of character data clusters.
Kiran, Asha; Knights, Janice
2010-08-01
This study investigated the effectiveness of Traditional Indigenous Games (TIG) to improve physical activity and cultural connectedness among primary school students in the community renewal areas of Townsville in North Queensland. A cluster randomised control trial was conducted in four primary schools in 2007. Baseline and post-implementation surveys were conducted in two intervention and two control schools and the results were compared. TIG delivered in primary schools every week over a period of three months did not contribute to any statistically significant improvement in intervention and control groups in physical activity levels or cultural connectedness. Further research, specifically in terms of intensity and duration of TIG, may inform whether physical activity may be improved. Enhancing the Indigenous cultural features of the existing TIG kit might positively influence Indigenous cultural connectedness.
Uber, Amy; Sadler, Richard C; Chassee, Todd; Reynolds, Joshua C
2017-08-01
Geographic clustering of bystander cardiopulmonary resuscitation (CPR) is associated with demographic and socioeconomic features of the community where out-of-hospital cardiac arrest (OHCA) occurred, although this association remains largely untested in rural areas. With a significant rural component and relative racial homogeneity, Kent County, Michigan, provides a unique setting to externally validate or identify new community features associated with bystander CPR. Using a large, countywide data set, we tested for geographic clustering of bystander CPR and its associations with community socioeconomic features. Secondary analysis of adult OHCA subjects (2010-2015) in the Cardiac Arrest Registry to Enhance Survival (CARES) data set for Kent County, Michigan. After linking geocoded OHCA cases to U.S. census data, we used Moran's I-test to assess for spatial autocorrelation of population-weighted cardiac arrest rate by census block group. Getis-Ord Gi statistic assessed for spatial clustering of bystander CPR and mixed-effects hierarchical logistic regression estimated adjusted associations between community features and bystander CPR. Of 1,592 subjects, 1,465 met inclusion criteria. Geospatial analysis revealed significant clustering of OHCA in more populated/urban areas. Conversely, bystander CPR was less likely in these areas (99% confidence) and more likely in suburban and rural areas (99% confidence). Adjusting for clinical, demographic, and socioeconomic covariates, bystander CPR was associated with public location (odds ratio [OR] = 1.19; 95% confidence interval [CI] = 1.03-1.39), initially shockable rhythms (OR = 1.48; 95% CI = 1.12-1.96), and those in urban neighborhoods (OR = 0.54; 95% CI = 0.38-0.77). Out-of-hospital cardiac arrest and bystander CPR are geographically clustered in Kent County, Michigan, but bystander CPR is inversely associated with urban designation. 
These results offer new insight into bystander CPR patterns in mixed urban and rural regions and afford the opportunity for targeted community CPR education in areas of low bystander CPR prevalence. © 2017 by the Society for Academic Emergency Medicine.
Partial wave analysis of the reaction p(3.5 GeV) + p → pK⁺Λ to search for the "ppK⁻" bound state
Agakishiev, G.; Arnold, O.; Belver, D.; ...
2015-01-26
Employing the Bonn–Gatchina partial wave analysis framework (PWA), we have analyzed HADES data of the reaction p(3.5 GeV) + p → pK⁺Λ. This reaction might contain information about the kaonic cluster “ppK⁻” (with quantum numbers J^P = 0⁻ and total isospin I = 1/2) via its decay into pΛ. Due to interference effects in our coherent description of the data, a hypothetical K̄NN (or, specifically “ppK⁻”) cluster signal need not necessarily show up as a pronounced feature (e.g. a peak) in an invariant mass spectrum like pΛ. Our PWA analysis includes a variety of resonant and non-resonant intermediate states and delivers a good description of our data (various angular distributions and two-hadron invariant mass spectra) without a contribution of a K̄NN cluster. At a confidence level of CL_s = 95% such a cluster cannot contribute more than 2–12% to the total cross section with a pK⁺Λ final state, which translates into a production cross-section between 0.7 μb and 4.2 μb, respectively. The range of the upper limit depends on the assumed cluster mass, width and production process.
Insulin Resistance: Regression and Clustering
Yoon, Sangho; Assimes, Themistocles L.; Quertermous, Thomas; Hsiao, Chin-Fu; Chuang, Lee-Ming; Hwu, Chii-Min; Rajaratnam, Bala; Olshen, Richard A.
2014-01-01
In this paper we try to define insulin resistance (IR) precisely for a group of Chinese women. Our definition deliberately does not depend upon body mass index (BMI) or age, although in other studies, with particular random effects models quite different from models used here, BMI accounts for a large part of the variability in IR. We accomplish our goal through application of Gauss mixture vector quantization (GMVQ), a technique for clustering that was developed for application to lossy data compression. Defining data come from measurements that play major roles in medical practice. A precise statement of what the data are is in Section 1. Their family structures are described in detail. They concern levels of lipids and the results of an oral glucose tolerance test (OGTT). We apply GMVQ to residuals obtained from regressions of outcomes of an OGTT and lipids on functions of age and BMI that are inferred from the data. A bootstrap procedure developed for our family data supplemented by insights from other approaches leads us to believe that two clusters are appropriate for defining IR precisely. One cluster consists of women who are IR, and the other of women who seem not to be. Genes and other features are used to predict cluster membership. We argue that prediction with “main effects” is not satisfactory, but prediction that includes interactions may be. PMID:24887437
Social phobia subtypes in the general population revealed by cluster analysis.
Furmark, T; Tillfors, M; Stattin, H; Ekselius, L; Fredrikson, M
2000-11-01
Epidemiological data on subtypes of social phobia are scarce and their defining features are debated. Hence, the present study explored the prevalence and descriptive characteristics of empirically derived social phobia subgroups in the general population. To reveal subtypes, data on social distress, functional impairment, number of social fears and criteria fulfilled for avoidant personality disorder were extracted from a previously published epidemiological study of 188 social phobics and entered into a hierarchical cluster analysis. Criterion validity was evaluated by comparing clusters on the Social Phobia Scale (SPS) and the Social Interaction Anxiety Scale (SIAS). Finally, profile analyses were performed in which clusters were compared on a set of sociodemographic and descriptive characteristics. Three clusters emerged, consisting of phobics scoring either high (generalized subtype), intermediate (non-generalized subtype) or low (discrete subtype) on all variables. Point prevalence rates were 2.0%, 5.9% and 7.7% respectively. All subtypes were distinguished on both SPS and SIAS. Generalized or severe social phobia tended to be over-represented among individuals with low levels of educational attainment and social support. Overall, public speaking was the most common fear. Although categorical distinctions may be used, the present data suggest that social phobia subtypes in the general population mainly differ dimensionally along a mild-moderate-severe continuum, and that the number of cases declines with increasing severity.
Wang, Juan; Nishikawa, Robert M; Yang, Yongyi
2017-04-01
In computerized detection of clustered microcalcifications (MCs) from mammograms, the traditional approach is to apply a pattern detector to locate the presence of individual MCs, which are subsequently grouped into clusters. Such an approach is often susceptible to the occurrence of false positives (FPs) caused by local image patterns that resemble MCs. We investigate the feasibility of a direct detection approach to determining whether an image region contains clustered MCs or not. Toward this goal, we develop a deep convolutional neural network (CNN) as the classifier model to which the input consists of a large image window ([Formula: see text] in size). The multiple layers in the CNN classifier are trained to automatically extract image features relevant to MCs at different spatial scales. In the experiments, we demonstrated this approach on a dataset consisting of both screen-film mammograms and full-field digital mammograms. We evaluated the detection performance both on classifying image regions of clustered MCs using a receiver operating characteristic (ROC) analysis and on detecting clustered MCs from full mammograms by a free-response receiver operating characteristic analysis. For comparison, we also considered a recently developed MC detector with FP suppression. In classifying image regions of clustered MCs, the CNN classifier achieved 0.971 in the area under the ROC curve, compared to 0.944 for the MC detector. In detecting clustered MCs from full mammograms, at 90% sensitivity, the CNN classifier obtained an FP rate of 0.69 clusters/image, compared to 1.17 clusters/image by the MC detector. These results indicate that using global image features can be more effective in discriminating clustered MCs from FPs caused by various sources, such as linear structures, thereby providing a more accurate detection of clustered MCs on mammograms.
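The area under the ROC curve reported in this abstract can be computed from classifier scores with the rank-based (Mann-Whitney) formulation sketched below. The scores are invented for illustration and the function name `roc_auc` is hypothetical; nothing here reproduces the study's classifier or data.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical classifier outputs for image regions.
with_cluster = [0.9, 0.8, 0.4]      # regions that contain clustered MCs
without_cluster = [0.7, 0.3, 0.2]   # regions that do not

auc = roc_auc(with_cluster, without_cluster)
print(round(auc, 3))
```

An AUC of 1.0 means every positive region scores above every negative one, and 0.5 corresponds to chance-level ranking.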
Comparison of organs' shapes with geometric and Zernike 3D moments.
Broggio, D; Moignier, A; Ben Brahim, K; Gardumi, A; Grandgirard, N; Pierrat, N; Chea, M; Derreumaux, S; Desbrée, A; Boisserie, G; Aubert, B; Mazeron, J-J; Franck, D
2013-09-01
The morphological similarity of organs is studied with feature vectors based on geometric and Zernike 3D moments. In particular, it is investigated whether outliers and average models can be identified. For this purpose, the relative proximity to the mean feature vector is defined, and principal coordinate and clustering analyses are also performed. To study the consistency and usefulness of this approach, voxel models of 17 livers and 76 hearts from several sources are considered. In the liver case, models with similar morphological features are identified. For the limited number of studied cases, the liver of the ICRP male voxel model is identified as a better surrogate than the female one. For hearts, the clustering analysis shows that three heart shapes represent about 80% of the morphological variations. The relative proximity and clustering analyses rather consistently identify outliers and average models. For the two cases, identification of outliers and surrogates of average models is rather robust. However, deeper classification of morphological features is subject to caution and can only be performed after cross analysis of at least two kinds of feature vectors. Finally, the Zernike moments contain all the information needed to reconstruct the studied objects and thus appear as a promising tool to derive statistical organ shapes. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Mo, Yun; Zhang, Zhongzhao; Meng, Weixiao; Ma, Lin; Wang, Yao
2014-01-01
Indoor positioning systems based on the fingerprint method are widely used due to the large number of existing devices with a wide range of coverage. However, extensive positioning regions with a massive fingerprint database may cause high computational complexity and error margins; clustering methods are therefore widely applied as a solution. Traditional clustering methods in positioning systems, however, can only measure the similarity of the Received Signal Strength without being concerned with the continuity of physical coordinates. Besides, outage of access points could result in asymmetric matching problems which severely affect the fine positioning procedure. To solve these issues, in this paper we propose a positioning system based on the Spatial Division Clustering (SDC) method for clustering the fingerprint dataset subject to physical distance constraints. With the Genetic Algorithm and Support Vector Machine techniques, SDC can achieve higher coarse positioning accuracy than traditional clustering algorithms. In terms of fine localization, based on the Kernel Principal Component Analysis method, the proposed positioning system outperforms its counterparts based on other feature extraction methods in low dimensionality. Apart from balancing online matching computational burden, the new positioning system exhibits advantageous performance on radio map clustering, and also shows better robustness and adaptability in the asymmetric matching problem aspect. PMID:24451470
The doubling of stellar black hole nuclei
NASA Astrophysics Data System (ADS)
Kazandjian, Mher V.; Touma, J. R.
2013-04-01
It is strongly believed that Andromeda's double nucleus signals a disc of stars revolving around its central supermassive black hole on eccentric Keplerian orbits with nearly aligned apsides. A self-consistent stellar dynamical origin for such apparently long-lived alignment has so far been lacking, with indications that cluster self-gravity is capable of sustaining such lopsided configurations if and when stimulated by external perturbations. Here, we present results of N-body simulations which show unstable counter-rotating stellar clusters around supermassive black holes saturating into uniformly precessing lopsided nuclei. The double nucleus in our featured experiment decomposes naturally into a thick eccentric disc of apo-apse aligned stars which is embedded in a lighter triaxial cluster. The eccentric disc reproduces key features of Keplerian disc models of Andromeda's double nucleus; the triaxial cluster has a distinctive kinematic signature which is evident in Hubble Space Telescope observations of Andromeda's double nucleus, and has been difficult to reproduce with Keplerian discs alone. Our simulations demonstrate how the combination of an eccentric disc and a triaxial cluster arises naturally when a star cluster accreted over a preexisting and counter-rotating disc of stars drives disc and cluster into a mutually destabilizing dance. Such accretion events are inherent to standard galaxy formation scenarios. They are here shown to double stellar black hole nuclei as they feed them.
Hierarchical clustering of EMD based interest points for road sign detection
NASA Astrophysics Data System (ADS)
Khan, Jesmin; Bhuiyan, Sharif; Adhami, Reza
2014-04-01
This paper presents an automatic road traffic sign detection and recognition system based on hierarchical clustering of interest points and joint transform correlation. The proposed algorithm consists of the following three stages: interest point detection, clustering of those points, and similarity search. At the first stage, highly discriminative, rotation- and scale-invariant interest points are selected from the image edges based on the 1-D empirical mode decomposition (EMD). We propose a two-step unsupervised clustering technique, which is adaptive and based on two criteria. In this context, the detected points are initially clustered based on the stable local features related to brightness and color, which are extracted using a Gabor filter. Then the points belonging to each partition are reclustered depending on the dispersion of the points in the initial cluster using the position feature. This two-step hierarchical clustering yields the possible candidate road signs or regions of interest (ROIs). Finally, a fringe-adjusted joint transform correlation (JTC) technique is used for matching the unknown signs with the existing known reference road signs stored in the database. The presented framework provides a novel way to detect a road sign from natural scenes, and the results demonstrate the efficacy of the proposed technique, which yields a very low false hit rate.
AGN Feedback in Clusters of Galaxies
2010-01-01
cooling non-radiatively or being heated to higher temperatures. Throughout this paper, we use the term “cooling flow” to indicate clusters with...taurus cluster [51] and M87/Virgo [24]. Concentric ripple-like features are also seen surrounding the center of Abell 2052, but current analysis shows that...2002) Chandra Imaging of the X-ray Core of the Virgo Cluster. ApJ 579:560-570. 37. Fujita Y et al. (2002) Chandra Observations of the Disruption of the
An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images.
Chin Neoh, Siew; Srisukkham, Worawut; Zhang, Li; Todryk, Stephen; Greystoke, Brigit; Peng Lim, Chee; Alamgir Hossain, Mohammed; Aslam, Nauman
2015-10-09
This research proposes an intelligent decision support system for acute lymphoblastic leukaemia diagnosis from microscopic blood images. A novel clustering algorithm with stimulating discriminant measures (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of nucleus and cytoplasm of lymphocytes/lymphoblasts. Specifically, the proposed between-cluster evaluation is formulated based on the trade-off of several between-cluster measures of well-known feature extraction methods. The SDM measures are used in conjunction with a Genetic Algorithm for clustering nucleus, cytoplasm, and background regions. Subsequently, a total of eighty features consisting of shape, texture, and colour information of the nucleus and cytoplasm sub-images are extracted. A number of classifiers (multi-layer perceptron, Support Vector Machine (SVM) and Dempster-Shafer ensemble) are employed for lymphocyte/lymphoblast classification. Evaluated with the ALL-IDB2 database, the proposed SDM-based clustering overcomes the shortcomings of Fuzzy C-means, which focuses purely on within-cluster scatter variance. It also outperforms Linear Discriminant Analysis and Fuzzy Compactness and Separation for nucleus-cytoplasm separation. The overall system achieves superior recognition accuracies of 96.72% and 96.67% using bootstrapping and 10-fold cross validation with Dempster-Shafer and SVM, respectively. The results also compare favourably with those reported in the literature, indicating the usefulness of the proposed SDM-based clustering method.
SAIL: Summation-bAsed Incremental Learning for Information-Theoretic Text Clustering.
Cao, Jie; Wu, Zhiang; Wu, Junjie; Xiong, Hui
2013-04-01
Information-theoretic clustering aims to exploit information-theoretic measures as the clustering criteria. A common practice on this topic is the so-called Info-Kmeans, which performs K-means clustering with KL-divergence as the proximity function. While expert efforts on Info-Kmeans have shown promising results, a remaining challenge is to deal with high-dimensional sparse data such as text corpora. Indeed, it is possible that the centroids contain many zero-value features for high-dimensional text vectors, which leads to infinite KL-divergence values and creates a dilemma in assigning objects to centroids during the iteration process of Info-Kmeans. To meet this challenge, in this paper, we propose a Summation-bAsed Incremental Learning (SAIL) algorithm for Info-Kmeans clustering. Specifically, by using an equivalent objective function, SAIL replaces the computation of KL-divergence by the incremental computation of Shannon entropy. This can avoid the zero-feature dilemma caused by the use of KL-divergence. To improve the clustering quality, we further introduce the variable neighborhood search scheme and propose the V-SAIL algorithm, which is then accelerated by a multithreaded scheme in PV-SAIL. Our experimental results on various real-world text collections have shown that, with SAIL as a booster, the clustering performance of Info-Kmeans can be significantly improved. Also, V-SAIL and PV-SAIL indeed help improve the clustering quality at a lower cost of computation.
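The zero-feature dilemma described in this abstract can be illustrated with a minimal sketch. The toy distributions `doc`, `centroid_a`, and `centroid_b` are hypothetical and are not taken from the paper, and the code does not implement SAIL itself; it only shows why a KL-divergence assignment step can stall on sparse vectors while a Shannon entropy of the same centroid stays finite.

```python
import math

def kl_divergence(p, q):
    """D(p || q) for two discrete distributions given as equal-length lists."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # a zero-probability term contributes nothing, by convention
        if qi == 0.0:
            return math.inf  # p has mass on a feature the centroid lacks
        total += pi * math.log(pi / qi)
    return total

def shannon_entropy(p):
    """H(p) in nats; zero entries are skipped, so the value is always finite."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)

# Hypothetical sparse "text" vectors, normalised to sum to 1.
doc = [0.5, 0.5, 0.0, 0.0]
centroid_a = [0.0, 0.5, 0.5, 0.0]
centroid_b = [0.0, 0.0, 0.5, 0.5]

# Both divergences are infinite, so a nearest-centroid rule cannot rank them.
print(kl_divergence(doc, centroid_a), kl_divergence(doc, centroid_b))
# The entropy of a sparse centroid is well defined.
print(shannon_entropy(centroid_a))
```

With both divergences infinite, the assignment of `doc` is undetermined, which is the situation the abstract says SAIL avoids by working with entropies instead.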
Clustering of Farsi sub-word images for whole-book recognition
NASA Astrophysics Data System (ADS)
Soheili, Mohammad Reza; Kabir, Ehsanollah; Stricker, Didier
2015-01-01
Redundancy of word and sub-word occurrences in large documents can be effectively utilized in an OCR system to improve recognition results. Most OCR systems employ language modeling techniques as a post-processing step; however, these techniques do not use important pictorial information that exists in the text image. In the case of large-scale recognition of degraded documents, this information is even more valuable. In our previous work, we proposed a sub-word image clustering method for applications dealing with large printed documents. In our clustering method, the ideal case is when all equivalent sub-word images lie in one cluster. To overcome the issues of low print quality, the clustering method uses an image matching algorithm for measuring the distance between two sub-word images. The measured distance, together with a set of simple shape features, was used to cluster all sub-word images. In this paper, we analyze the effects of adding more shape features on processing time, purity of clustering, and the final recognition rate. Previously published experiments have shown the efficiency of our method on a book. Here we present extended experimental results and evaluate our method on another book with a totally different font face. We also show that the number of newly created clusters in a page can be used as a criterion for assessing the quality of print and evaluating preprocessing phases.
An Intelligent Decision Support System for Leukaemia Diagnosis using Microscopic Blood Images
Chin Neoh, Siew; Srisukkham, Worawut; Zhang, Li; Todryk, Stephen; Greystoke, Brigit; Peng Lim, Chee; Alamgir Hossain, Mohammed; Aslam, Nauman
2015-01-01
This research proposes an intelligent decision support system for acute lymphoblastic leukaemia diagnosis from microscopic blood images. A novel clustering algorithm with stimulating discriminant measures (SDM) of both within- and between-cluster scatter variances is proposed to produce robust segmentation of nucleus and cytoplasm of lymphocytes/lymphoblasts. Specifically, the proposed between-cluster evaluation is formulated based on the trade-off of several between-cluster measures of well-known feature extraction methods. The SDM measures are used in conjunction with a Genetic Algorithm for clustering nucleus, cytoplasm, and background regions. Subsequently, a total of eighty features consisting of shape, texture, and colour information of the nucleus and cytoplasm sub-images are extracted. A number of classifiers (multi-layer perceptron, Support Vector Machine (SVM) and Dempster-Shafer ensemble) are employed for lymphocyte/lymphoblast classification. Evaluated with the ALL-IDB2 database, the proposed SDM-based clustering overcomes the shortcomings of Fuzzy C-means, which focuses purely on within-cluster scatter variance. It also outperforms Linear Discriminant Analysis and Fuzzy Compactness and Separation for nucleus-cytoplasm separation. The overall system achieves superior recognition accuracies of 96.72% and 96.67% using bootstrapping and 10-fold cross validation with Dempster-Shafer and SVM, respectively. The results also compare favourably with those reported in the literature, indicating the usefulness of the proposed SDM-based clustering method. PMID:26450665
Cross-entropy clustering framework for catchment classification
NASA Astrophysics Data System (ADS)
Tongal, Hakan; Sivakumar, Bellie
2017-09-01
There is an increasing interest in catchment classification and regionalization in hydrology, as they are useful for identification of appropriate model complexity and transfer of information from gauged catchments to ungauged ones, among others. This study introduces a nonlinear cross-entropy clustering (CEC) method for classification of catchments. The method specifically considers embedding dimension (m), sample entropy (SampEn), and coefficient of variation (CV) to represent dimensionality, complexity, and variability of the time series, respectively. The method is applied to daily streamflow time series from 217 gauging stations across Australia. The results suggest that a combination of linear and nonlinear parameters (i.e. m, SampEn, and CV), representing different aspects of the underlying dynamics of streamflows, could be useful for determining distinct patterns of flow generation mechanisms within a nonlinear clustering framework. For the 217 streamflow time series, nine hydrologically homogeneous clusters that have distinct patterns of flow regime characteristics and specific dominant hydrological attributes with different climatic features are obtained. Comparison of the results with those obtained using the widely employed k-means clustering method (which results in five clusters, with the loss of some information about the features of the clusters) suggests the superiority of the cross-entropy clustering method. The outcomes from this study provide a useful guideline for employing the nonlinear dynamic approaches based on hydrologic signatures and for gaining an improved understanding of streamflow variability at a large scale.
Spectroscopic Confirmation of Five Galaxy Clusters at z > 1.25 in the 2500 deg^2 SPT-SZ Survey
NASA Astrophysics Data System (ADS)
Khullar, Gourav; Bleem, Lindsey; Bayliss, Matthew; Gladders, Michael; South Pole Telescope (SPT) Collaboration
2018-06-01
We present spectroscopic confirmation of 5 galaxy clusters at 1.25 < z < 1.5, discovered in the 2500 deg^2 South Pole Telescope Sunyaev-Zel’dovich (SPT-SZ) survey. These clusters, taken from a nearly redshift-independent mass-limited sample of clusters, have multi-wavelength follow-up imaging data from the X-ray to the near-IR, and currently form the most homogeneous massive high-redshift cluster sample in existence. We briefly describe the analysis pipeline used on the low S/N spectra of these faint galaxies, and describe the multiple techniques used to extract robust redshifts from a combination of absorption-line (Ca II H&K doublet - λλ3934,3968Å) and emission-line ([OII] λλ3727,3729Å) spectral features. We present several ensemble analyses of cluster member galaxies that demonstrate the reliability of the measured redshifts. We also identify modest [OII] emission and pronounced CN and Hδ absorption in a composite stacked spectrum of 28 low S/N passive galaxy spectra with redshifts derived primarily from Ca II H&K features. This work increases the number of spectroscopically confirmed SPT-SZ galaxy clusters at z > 1.25 from 2 to 7, further demonstrating the efficacy of SZ selection for the highest redshift massive clusters, and enabling further detailed study of these confirmed systems.
Li, Ke; Liu, Yi; Wang, Quanxin; Wu, Yalei; Song, Shimin; Sun, Yi; Liu, Tengchong; Wang, Jun; Li, Yang; Du, Shaoyi
2015-01-01
This paper proposes a novel multi-label classification method for resolving spacecraft electrical characteristics problems, which involve processing large amounts of unlabeled test data, high-dimensional features, long computing times, and a slow identification rate. Firstly, both the fuzzy c-means (FCM) offline clustering and the principal component feature extraction algorithms are applied for the feature selection process. Secondly, the approximate weighted proximal support vector machine (WPSVM) online classification algorithm is used to reduce the feature dimension and further improve the rate of recognition for spacecraft electrical characteristics. Finally, a threshold-based data capture contribution method is proposed to guarantee the validity and consistency of the data selection. The experimental results indicate that the proposed method can obtain better data features of the spacecraft electrical characteristics, improve the accuracy of identification, and shorten the computing time effectively. PMID:26544549
Coupled-cluster and explicitly correlated perturbation-theory calculations of the uracil anion.
Bachorz, Rafał A; Klopper, Wim; Gutowski, Maciej
2007-02-28
A valence-type anion of the canonical tautomer of uracil has been characterized using explicitly correlated second-order Møller-Plesset perturbation theory (RI-MP2-R12) in conjunction with conventional coupled-cluster theory with single, double, and perturbative triple excitations. At this level of electron-correlation treatment and after inclusion of a zero-point vibrational energy correction, determined in the harmonic approximation at the RI-MP2 level of theory, the valence anion is adiabatically stable with respect to the neutral molecule by 40 meV. The anion is characterized by a vertical detachment energy of 0.60 eV. To obtain accurate estimates of the vertical and adiabatic electron binding energies, a scheme was applied in which electronic energy contributions from various levels of theory were added, each of them extrapolated to the corresponding basis-set limit. The MP2 basis-set limits were also evaluated using an explicitly correlated approach, and the results of these calculations are in agreement with the extrapolated values. A remarkable feature of the valence anionic state is that the adiabatic electron binding energy is positive but smaller than the adiabatic electron binding energy of the dipole-bound state.
Berenguer, Roberto; Pastor-Juan, María Del Rosario; Canales-Vázquez, Jesús; Castro-García, Miguel; Villas, María Victoria; Legorburo, Francisco Mansilla; Sabater, Sebastià
2018-04-24
Purpose: To identify the reproducible and nonredundant radiomics features (RFs) for computed tomography (CT). Materials and Methods: Two phantoms were used to test RF reproducibility by using test-retest analysis, by changing the CT acquisition parameters (hereafter, intra-CT analysis), and by comparing five different scanners with the same CT parameters (hereafter, inter-CT analysis). Reproducible RFs were selected by using the concordance correlation coefficient (as a measure of the agreement between variables) and the coefficient of variation (defined as the ratio of the standard deviation to the mean). Redundant features were grouped by using hierarchical cluster analysis. Results: A total of 177 RFs including intensity, shape, and texture features were evaluated. The test-retest analysis showed that 91% (161 of 177) of the RFs were reproducible according to concordance correlation coefficient. Reproducibility of intra-CT RFs, based on coefficient of variation, ranged from 89.3% (151 of 177) to 43.1% (76 of 177) where the pitch factor and the reconstruction kernel were modified, respectively. Reproducibility of inter-CT RFs, based on coefficient of variation, also showed large material differences, from 85.3% (151 of 177; wood) to only 15.8% (28 of 177; polyurethane). Ten clusters were identified after the hierarchical cluster analysis and one RF per cluster was chosen as representative. Conclusion: Many RFs were redundant and nonreproducible. If all the CT parameters are fixed except field of view, tube voltage, and milliamperage, then the information provided by the analyzed RFs can be summarized in only 10 RFs (each representing a cluster) because of redundancy. © RSNA, 2018. Online supplemental material is available for this article.
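The two reproducibility statistics named in this abstract can be sketched as follows. This is an illustrative sketch only: it assumes Lin's form of the concordance correlation coefficient with population (1/n) moments and a sample standard deviation in the coefficient of variation, neither of which the abstract specifies, and the function names and feature values are hypothetical.

```python
import statistics

def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between paired measurements."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    vx = statistics.pvariance(x, mx)  # population variance (assumption)
    vy = statistics.pvariance(y, my)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    # Penalises both poor correlation and a systematic shift between x and y.
    return 2.0 * cov / (vx + vy + (mx - my) ** 2)

def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean, as defined in the abstract."""
    return statistics.stdev(values) / statistics.fmean(values)

# Hypothetical values of one radiomics feature in a test scan and a retest scan.
test_scan = [1.0, 2.0, 3.0, 4.0]
retest_scan = [2.0, 3.0, 4.0, 5.0]  # perfectly correlated but shifted by 1

print(concordance_correlation(test_scan, test_scan))
print(concordance_correlation(test_scan, retest_scan))
print(coefficient_of_variation(test_scan))
```

The shifted retest is perfectly correlated with the test scan yet has a concordance below 1, which is why agreement rather than correlation is the natural criterion for test-retest reproducibility.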
Comorbid personality traits in schizophrenia: prevalence and clinical characteristics.
Moore, Elizabeth A; Green, Melissa J; Carr, Vaughan J
2012-03-01
Accumulating evidence suggests high rates of personality disorder (PD) in schizophrenia (Sz), and as such, the implications of PD in this context are beginning to be studied more thoroughly. We examined clinical, cognitive and experiential (i.e., reported childhood adversity) correlates of aberrant personality traits in schizophrenia and healthy controls (HC) as measured by the International Personality Disorder Examination Questionnaire (IPDEQ). Participants were 549 individuals with schizophrenia or schizoaffective disorder, and 572 healthy adults recruited to the Australian Schizophrenia Research Bank (ASRB). Schizophrenia participants were significantly more likely than healthy controls to screen positive for personality disorder across all ICD-10 subtypes, and there was substantial overlap between clusters, with ∼33% of Sz participants screening positive for all 3 personality disorder clusters. Among both Sz and HC groups, cluster B personality characteristics were significantly associated with increased suicidal behaviours, lower cognitive performance, and the experience of childhood adversity. In addition, Cluster C personality features were associated with higher overall ratings of affective blunting in schizophrenia, and Cluster A personality features were associated with childhood 'loss' in HC participants only. The cumulative effects of screening positive for more than one personality disorder in Sz was associated with higher likelihood of suicidal behaviour, earlier age of onset of Sz, and poorer cognitive functioning. The results suggest that abnormal co-occurrence of personality traits across DSM-IV clusters is evident in a significant proportion of individuals with schizophrenia, and that these personality features impact significantly on clinical and cognitive characteristics of Sz. Copyright © 2011 Elsevier Ltd. All rights reserved.
Phonologic errors as a clinical marker of the logopenic variant of PPA.
Leyton, Cristian E; Ballard, Kirrie J; Piguet, Olivier; Hodges, John R
2014-05-06
To disentangle the clinical heterogeneity of nonsemantic variants of primary progressive aphasia (PPA) and to identify a coherent linguistic-anatomical marker for the logopenic variant of PPA (lv-PPA). Key speech and language features of 14 cases of lv-PPA and 18 cases of nonfluent/agrammatic variant of PPA were systematically evaluated and scored by an independent rater blinded to diagnosis. Every case underwent a structural MRI and a Pittsburgh compound B (PiB)-PET scan, a putative biomarker of Alzheimer disease. Key speech and language features that showed association with the PiB-PET status were entered into a hierarchical cluster analysis. The linguistic features and patterns of cortical thinning in each resultant cluster were analyzed. The cluster analysis revealed 3 coherent clinical groups, each of which was linked to a specific PiB-PET status. The first cluster was linked to high PiB retention and characterized by phonologic errors and cortical thinning focused on the left superior temporal gyrus. The second and third clusters were characterized by grammatical production errors and motor speech disorders, respectively, and were associated with low PiB brain retention. A fourth cluster, however, demonstrated nonspecific language deficits and unpredictable PiB-PET status. These findings suggest that despite the clinical and pathologic heterogeneity of nonsemantic variants, discrete clinical syndromes can be distinguished and linked to specific likelihood of PiB-PET status. Phonologic errors seem to be highly predictive of high amyloid burden in PPA and can provide a specific clinical marker for lv-PPA.
NASA Astrophysics Data System (ADS)
Lo, Joseph Y.; Gavrielides, Marios A.; Markey, Mia K.; Jesneck, Jonathan L.
2003-05-01
We developed an ensemble classifier for the task of computer-aided diagnosis of breast microcalcification clusters, which are very challenging to characterize for radiologists and computer models alike. The purpose of this study is to help radiologists identify whether suspicious calcification clusters are benign vs. malignant, such that they may potentially recommend fewer unnecessary biopsies for actually benign lesions. The data consists of mammographic features extracted by automated image processing algorithms as well as manually interpreted by radiologists according to a standardized lexicon. We used 292 cases from a publicly available mammography database. From each case, we extracted 22 image processing features pertaining to lesion morphology, 5 radiologist features also pertaining to morphology, and the patient age. Linear discriminant analysis (LDA) models were designed using each of the three data types. Each local model performed poorly; the best was one based upon image processing features, which yielded ROC area index AZ of 0.59 +/- 0.03 and partial AZ above 90% sensitivity of 0.08 +/- 0.03. We then developed ensemble models using different combinations of those data types, and these models all improved performance compared to the local models. The final ensemble model was based upon 5 features selected by stepwise LDA from all 28 available features. This ensemble performed with AZ of 0.69 +/- 0.03 and partial AZ of 0.21 +/- 0.04, which was statistically significantly better than the model based on the image processing features alone (p<0.001 and p=0.01 for full and partial AZ respectively). This demonstrated the value of the radiologist-extracted features as a source of information for this task. It also suggested there is potential for improved performance using this ensemble classifier approach to combine different sources of currently available data.
A Comparison of Single Sample and Bootstrap Methods to Assess Mediation in Cluster Randomized Trials
ERIC Educational Resources Information Center
Pituch, Keenan A.; Stapleton, Laura M.; Kang, Joo Youn
2006-01-01
A Monte Carlo study examined the statistical performance of single sample and bootstrap methods that can be used to test and form confidence interval estimates of indirect effects in two cluster randomized experimental designs. The designs were similar in that they featured random assignment of clusters to one of two treatment conditions and…
NASA Astrophysics Data System (ADS)
Basalto, Nicolas; Bellotti, Roberto; de Carlo, Francesco; Facchi, Paolo; Pantaleo, Ester; Pascazio, Saverio
2008-10-01
A clustering algorithm based on the Hausdorff distance is analyzed and compared to the single, complete, and average linkage algorithms. The four clustering procedures are applied to a toy example and to the time series of financial data. The dendrograms are scrutinized and their features compared. The Hausdorff linkage relies on firm mathematical grounds and turns out to be very effective when one has to discriminate among complex structures.
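As a rough illustration of the distance behind the Hausdorff linkage, the sketch below computes the Hausdorff distance between two finite point sets and contrasts it with the single- and complete-linkage distances. The abstract does not say how the financial time series are embedded as points, so the 2-D coordinates and the function names here are purely hypothetical.

```python
import math

def directed_hausdorff(a, b):
    """Largest distance from a point of a to its nearest neighbour in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two finite point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

def single_linkage(a, b):
    """Smallest pairwise distance between the two sets."""
    return min(math.dist(p, q) for p in a for q in b)

def complete_linkage(a, b):
    """Largest pairwise distance between the two sets."""
    return max(math.dist(p, q) for p in a for q in b)

# Two hypothetical clusters that share one point but differ in extent.
cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(0.0, 0.0), (4.0, 0.0)]

print(single_linkage(cluster_a, cluster_b))
print(hausdorff(cluster_a, cluster_b))
print(complete_linkage(cluster_a, cluster_b))
```

For these two sets the Hausdorff distance falls between the single- and complete-linkage values, because it asks how far the worst-matched point of either cluster is from the other cluster as a whole.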
Featured Image: New Detail in the Toothbrush Cluster
NASA Astrophysics Data System (ADS)
Kohler, Susanna
2018-01-01
This spectacular composite reveals the galaxy cluster 1RXS J0603.3+4214, known as the Toothbrush cluster due to the shape of its most prominent radio relic. Featured in a recent publication led by Kamlesh Rajpurohit (Thuringian State Observatory, Germany), this image contains new Very Large Array (VLA) 1.5-GHz observations (red) showing the radio emission within the cluster. This is composited with a Chandra view of the X-ray emitting gas of the cluster (blue) and an optical image of the background from Subaru data. The new deep VLA data totaling 26 hours of observations provides a detailed look at the complex structure within the Toothbrush relic, revealing enigmatic filaments and twists. This new data will help us to explore the possible merger history of this cluster, which is theorized to have caused the unusual shapes we see today. Figure caption: High resolution VLA 12 GHz image of the Toothbrush showing the complex, often filamentary structures. [Rajpurohit et al. 2018] Citation: K. Rajpurohit et al 2018 ApJ 852 65. doi:10.3847/1538-4357/aa9f13
Parents' personality clusters and eating disordered daughters' personality and psychopathology.
Amianto, Federico; Ercole, Roberta; Marzola, Enrica; Abbate Daga, Giovanni; Fassino, Secondo
2015-11-30
The present study explores how parents' personality clusters relate to their eating disordered daughters' personality and psychopathology. Mothers and fathers were tested with the Temperament Character Inventory. Their daughters were assessed with the following: Temperament and Character Inventory, Eating Disorder Inventory-2, Symptom Checklist-90, Parental Bonding Instrument, Attachment Style Questionnaire, and Family Assessment Device. Daughters' personality traits and psychopathology scores were compared between clusters. Daughters' features were related to those of their parents. Explosive/adventurous mothers were found to relate to their daughters' borderline personality profile and more severe interoceptive awareness. Mothers' immaturity was correlated to their daughters' higher character immaturity, inadequacy, and depressive feelings. Fathers who were explosive/methodic correlated with their daughters' character immaturity, severe eating, and general psychopathology. Fathers' character immaturity only marginally related to their daughters' specific features. Both parents' temperament clusters and mothers' character clusters related to patients' personality and eating psychopathology. The cluster approach to personality-related dynamics of families with an individual affected by an eating disorder expands the knowledge on the relationship between parents' characteristics and daughters' illness, suggesting complex and unique relationships correlating parents' personality traits to their daughters' disorder. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Clustering methods for the optimization of atomic cluster structure
NASA Astrophysics Data System (ADS)
Bagattini, Francesco; Schoen, Fabio; Tigli, Luca
2018-04-01
In this paper, we propose a revised global optimization method and apply it to large scale cluster conformation problems. In the 1990s, the so-called clustering methods were considered among the most efficient general purpose global optimization techniques; however, their usage has quickly declined in recent years, mainly due to the inherent difficulties of clustering approaches in large dimensional spaces. Inspired by the machine learning literature, we redesigned clustering methods in order to deal with molecular structures in a reduced feature space. Our aim is to show that by suitably choosing a good set of geometrical features coupled with a very efficient descent method, an effective optimization tool is obtained which is capable of finding, with a very high success rate, all known putative optima for medium size clusters without any prior information, both for Lennard-Jones and Morse potentials. The main result is that, beyond being a reliable approach, the proposed method, based on the idea of starting a computationally expensive deep local search only when it seems worth doing so, is capable of saving a huge number of searches with respect to an analogous algorithm which does not employ a clustering phase. In this paper, we are not claiming the superiority of the proposed method compared to specific, refined, state-of-the-art procedures, but rather indicating a quite straightforward way to save local searches by means of a clustering scheme working in a reduced variable space, which might prove useful when included in many modern methods.
NASA Astrophysics Data System (ADS)
Ulbrich, Sven; Pinto, Joaquim G.; Economou, Theodoros; Stephenson, David B.; Karremann, Melanie K.; Shaffrey, Len C.
2017-04-01
Cyclone families are a frequent synoptic weather feature in the Euro-Atlantic area, particularly during wintertime. Given appropriate large-scale conditions, such series (clusters) of storms may cause large socio-economic impacts and cumulative losses. Recent studies analyzing reanalysis data using single cyclone tracking methods have shown that serial clustering of cyclones occurs on both flanks and downstream regions of the North Atlantic storm track. Based on winter (DJF) cyclone counts from the IMILAST cyclone database, we explore the representation of serial clustering in the ERA-Interim period and its relationship with the NAO-phase and jet intensity. With this aim, clustering is estimated by the dispersion of winter (DJF) cyclone passages for each grid point over the Euro-Atlantic area. Results indicate that clustering over the Eastern North Atlantic and Western Europe can be identified for all methods, although the exact location and the dispersion magnitude may vary. The relationship between clustering and (i) the NAO-phase and (ii) jet intensity over the North Atlantic is statistically evaluated. Results show that the NAO-index and the jet intensity show a strong contribution to clustering, even though some spread is found between methods. We conclude that the general features of clustering of extratropical cyclones over the North Atlantic and Western Europe are robust to the choice of tracking method. The same is true for the influence of the NAO and jet intensity on cyclone dispersion.
Robustness of serial clustering of extra-tropical cyclones to the choice of tracking method
NASA Astrophysics Data System (ADS)
Pinto, Joaquim G.; Ulbrich, Sven; Karremann, Melanie K.; Stephenson, David B.; Economou, Theodoros; Shaffrey, Len C.
2016-04-01
Cyclone families are a frequent synoptic weather feature in the Euro-Atlantic area in winter. Given appropriate large-scale conditions, the occurrence of such series (clusters) of storms may lead to large socio-economic impacts and cumulative losses. Recent studies analyzing reanalysis data using single cyclone tracking methods have shown that serial clustering of cyclones occurs on both flanks and downstream regions of the North Atlantic storm track. This study explores the sensitivity of serial clustering to the choice of tracking method. With this aim, the IMILAST cyclone track database based on ERA-Interim data is analysed. Clustering is estimated by the dispersion (ratio of variance to mean) of winter (DJF) cyclone passages near each grid point over the Euro-Atlantic area. Results indicate that while the general pattern of clustering is identified for all methods, there are considerable differences in detail. This can primarily be attributed to the differences in the variance of cyclone counts between the methods, which range up to one order of magnitude. Nevertheless, clustering over the Eastern North Atlantic and Western Europe can be identified for all methods and can thus be generally considered as a robust feature. The statistical links between large-scale patterns like the NAO and clustering are obtained for all methods, though with different magnitudes. We conclude that the occurrence of cyclone clustering over the Eastern North Atlantic and Western Europe is largely independent from the choice of tracking method and hence from the definition of a cyclone.
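The dispersion statistic named in the abstract above (ratio of variance to mean of winter cyclone counts) can be illustrated with a minimal sketch. The counts are invented for illustration and are not taken from the IMILAST database; the function name and the use of the sample variance are assumptions, since the abstract does not say which variance estimator was used.

```python
from statistics import mean, variance

def dispersion_ratio(counts):
    """Variance-to-mean ratio of event counts at one grid point.

    A ratio near 1 is what a Poisson (random) process would give;
    values well above 1 indicate overdispersion, i.e. serial clustering.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; dispersion is undefined")
    return variance(counts) / m  # sample variance (assumed estimator)

# Hypothetical DJF cyclone counts per winter at a single grid point.
regular_winters = [5, 5, 5, 5, 5, 5]      # same count every winter
clustered_winters = [1, 12, 2, 10, 0, 5]  # same mean, bursty winters

print(dispersion_ratio(regular_winters))
print(dispersion_ratio(clustered_winters))
```

Both series have the same mean count, so only the year-to-year variance separates them; this is why differences in count variance between tracking methods feed directly into the dispersion.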
Automatic Clustering Using Multi-objective Particle Swarm and Simulated Annealing
Abubaker, Ahmad; Baharum, Adam; Alrefaei, Mahmoud
2015-01-01
This paper puts forward a new automatic clustering algorithm based on Multi-Objective Particle Swarm Optimization and Simulated Annealing, “MOPSOSA”. The proposed algorithm is capable of automatic clustering which is appropriate for partitioning datasets to a suitable number of clusters. MOPSOSA combines the features of the multi-objective based particle swarm optimization (PSO) and the Multi-Objective Simulated Annealing (MOSA). Three cluster validity indices were optimized simultaneously to establish the suitable number of clusters and the appropriate clustering for a dataset. The first cluster validity index is centred on Euclidean distance, the second on the point symmetry distance, and the last cluster validity index is based on short distance. A number of algorithms have been compared with the MOPSOSA algorithm in resolving clustering problems by determining the actual number of clusters and optimal clustering. Computational experiments were carried out to study fourteen artificial and five real life datasets. PMID:26132309
Adaptive Water Sampling based on Unsupervised Clustering
NASA Astrophysics Data System (ADS)
Py, F.; Ryan, J.; Rajan, K.; Sherman, A.; Bird, L.; Fox, M.; Long, D.
2007-12-01
Autonomous Underwater Vehicles (AUVs) are widely used for oceanographic surveys, during which data is collected from a number of on-board sensors. Engineers and scientists at MBARI have extended this approach by developing a water sampler specially for the AUV, which can sample a specific patch of water at a specific time. The sampler, named the Gulper, captures 2 liters of seawater in less than 2 seconds on a 21" MBARI Odyssey AUV. Each sample chamber of the Gulper is filled with seawater through a one-way valve, which protrudes through the fairing of the AUV. This new kind of device raises a new problem: when to trigger the Gulper autonomously? For example, scientists interested in studying the mobilization and transport of shelf sediments would like to detect intermediate nepheloïd layers (INLs). To be able to detect this phenomenon we need to extract a model based on AUV sensors that can detect this feature in-situ. The formation of such a model is not obvious as identification of this feature is generally based on data from multiple sensors. We have developed an unsupervised data clustering technique to extract the different features which will then be used for on-board classification and triggering of the Gulper. We use a three-phase approach: 1) Use data from past missions to learn the different classes of data from sensor inputs. The clustering algorithm will then extract the set of features that can be distinguished within this large data set. 2) Scientists on shore then identify these features and point out which correspond to those of interest (e.g. nepheloïd layer, upwelling material, etc.). 3) Embed the corresponding classifier into the AUV control system to indicate the most probable feature of the water depending on sensory input. The triggering algorithm looks at this result and triggers the Gulper if the classifier indicates that we are within the feature of interest with a predetermined threshold of confidence.
We have deployed this method of online classification and sampling based on AUV depth and HOBI Labs Hydroscat-2 sensor data. Using approximately 20,000 data samples the clustering algorithm generated 14 clusters with one identified as corresponding to a nepheloïd layer. We demonstrate that such a technique can be used to reliably and efficiently sample water based on multiple sources of data in real-time.
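The on-board phase described above (classify the current sensor reading, then trigger only above a confidence threshold) might look roughly like the following sketch. Everything concrete in it is an assumption made for illustration: the centroid values, the feature scaling, the softmax-style confidence and the 0.7 threshold are hypothetical and are not taken from the MBARI system, whose clustering algorithm and confidence measure the abstract does not specify.

```python
import math

# Hypothetical cluster centroids, standing in for the output of phase 1,
# given as (depth in m, optical backscatter in arbitrary units).
CENTROIDS = {
    "surface_water": (5.0, 0.2),
    "inl": (60.0, 0.9),  # assumed to be the cluster experts labelled in phase 2
    "deep_clear": (120.0, 0.1),
}
SCALE = (50.0, 0.5)  # assumed per-feature scaling so both inputs are comparable

def classify(reading):
    """Return (label, confidence) for one sensor reading.

    Confidence is a softmax over negative scaled distances to each
    centroid; this is one simple choice, not the deployed method.
    """
    scaled = [r / s for r, s in zip(reading, SCALE)]
    weights = {}
    for name, centre in CENTROIDS.items():
        scaled_centre = [c / s for c, s in zip(centre, SCALE)]
        weights[name] = math.exp(-math.dist(scaled, scaled_centre))
    label = max(weights, key=weights.get)
    return label, weights[label] / sum(weights.values())

def should_trigger(reading, target="inl", threshold=0.7):
    """Phase 3: fire the sampler only inside the target feature with enough confidence."""
    label, confidence = classify(reading)
    return label == target and confidence >= threshold

print(should_trigger((60.0, 0.9)))  # reading at the INL centroid
print(should_trigger((40.0, 0.6)))  # nearest to INL, but ambiguous
print(should_trigger((5.0, 0.2)))   # surface water
```

Separating `classify` from `should_trigger` mirrors the split in the abstract between the embedded classifier and the triggering algorithm that reads its result.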
Identification of Alfalfa Leaf Diseases Using Image Recognition Technology
Qin, Feng; Liu, Dongxia; Sun, Bingda; Ruan, Liu; Ma, Zhanhong; Wang, Haiguang
2016-01-01
Common leaf spot (caused by Pseudopeziza medicaginis), rust (caused by Uromyces striatus), Leptosphaerulina leaf spot (caused by Leptosphaerulina briosiana) and Cercospora leaf spot (caused by Cercospora medicaginis) are the four common types of alfalfa leaf diseases. Timely and accurate diagnoses of these diseases are critical for disease management, alfalfa quality control and the healthy development of the alfalfa industry. In this study, the identification and diagnosis of the four types of alfalfa leaf diseases were investigated using pattern recognition algorithms based on image-processing technology. A sub-image with one or multiple typical lesions was obtained by artificial cutting from each acquired digital disease image. Then the sub-images were segmented using twelve lesion segmentation methods integrated with clustering algorithms (including K_means clustering, fuzzy C-means clustering and K_median clustering) and supervised classification algorithms (including logistic regression analysis, Naive Bayes algorithm, classification and regression tree, and linear discriminant analysis). After a comprehensive comparison, the segmentation method integrating the K_median clustering algorithm and linear discriminant analysis was chosen to obtain lesion images. After the lesion segmentation using this method, a total of 129 texture, color and shape features were extracted from the lesion images. Based on the features selected using three methods (ReliefF, 1R and correlation-based feature selection), disease recognition models were built using three supervised learning methods, including the random forest, support vector machine (SVM) and K-nearest neighbor methods. A comparison of the recognition results of the models was conducted. The results showed that when the ReliefF method was used for feature selection, the SVM model built with the most important 45 features (selected from a total of 129 features) was the optimal model. 
For this SVM model, the recognition accuracies of the training set and the testing set were 97.64% and 94.74%, respectively. Semi-supervised models for disease recognition were built based on the 45 effective features that were used for building the optimal SVM model. For the optimal semi-supervised models built with three ratios of labeled to unlabeled samples in the training set, the recognition accuracies of the training set and the testing set were both approximately 80%. The results indicated that image recognition of the four alfalfa leaf diseases can be implemented with high accuracy. This study provides a feasible solution for lesion image segmentation and image recognition of alfalfa leaf disease. PMID:27977767
Comprehensive cluster analysis with Transitivity Clustering.
Wittkop, Tobias; Emig, Dorothea; Truss, Anke; Albrecht, Mario; Böcker, Sebastian; Baumbach, Jan
2011-03-01
Transitivity Clustering is a method for the partitioning of biological data into groups of similar objects, such as genes. It provides integrated access to various functions addressing each step of a typical cluster analysis. To facilitate this, Transitivity Clustering is accessible online and offers three user-friendly interfaces: a powerful stand-alone version, a web interface, and a collection of Cytoscape plug-ins. In this paper, we describe three major workflows: (i) protein (super)family detection with Cytoscape, (ii) protein homology detection with incomplete gold standards and (iii) clustering of gene expression data. This protocol guides the user through the most important features of Transitivity Clustering and takes ∼1 h to complete.
Analysis of 3D vortex motion in a dusty plasma
NASA Astrophysics Data System (ADS)
Mulsow, M.; Himpel, M.; Melzer, A.
2017-12-01
Dust clusters of about 50-1000 particles have been confined near the sheath region of a gaseous radio-frequency plasma discharge. These compact clusters exhibit a vortex motion which has been reconstructed in full three dimensions from stereoscopy. Smaller clusters are found to show a competition between solid-like cluster structure and vortex motion, whereas larger clusters feature very pronounced vortices. From the three-dimensional analysis, the dust flow field has been found to be nearly incompressible. The vortices in all observed clusters are essentially poloidal. The dependence of the vorticity on the cluster size is discussed. Finally, the vortex motion has been quantitatively attributed to radial gradients of the ion drag force.
Forck, Richard M; Pradzynski, Christoph C; Wolff, Sabine; Ončák, Milan; Slavíček, Petr; Zeuch, Thomas
2012-03-07
Size resolved IR action spectra of neutral sodium doped methanol clusters have been measured using IR excitation modulated photoionisation mass spectroscopy. The Na(CH(3)OH)(n) clusters were generated in a supersonic He seeded expansion of methanol by subsequent Na doping in a pick-up cell. A combined analysis of IR action spectra, IP evolutions and harmonic predictions of IR spectra (using density functional theory) of the most stable structures revealed that for n = 4, 5 structures with an exterior Na atom showing high ionisation potentials (IPs) of ~4 eV dominate, while for n = 6, 7 clusters with lower IPs (~3.2 eV) featuring fully solvated Na atoms and solvated electrons emerge and dominate the IR action spectra. For n = 4 simulations of photoionisation spectra using an ab initio MD approach confirm the dominance of exterior structures and explain the previously reported appearance IP of 3.48 eV by small fractions of clusters with partly solvated Na atoms. Only for this cluster size a shift in the isomer composition with cluster temperature has been observed, which may be related to kinetic stabilisation of less Na solvated clusters at low temperatures. Features of slow fragmentation dynamics of cationic Na(+)(CH(3)OH)(6) clusters have been observed for the photoionisation near the adiabatic limit. This finding points to the relevance of previously proposed non-vertical photoionisation dynamics of this system.
NASA Astrophysics Data System (ADS)
Lyle, Justin; Wedig, Olivia; Gulania, Sahil; Krylov, Anna I.; Mabbs, Richard
2017-12-01
We report photoelectron spectra of CH2CN-, recorded at photon energies between 13 460 and 15 384 cm-1, which show rapid intensity variations in particular detachment channels. The branching ratios for various spectral features reveal rotational structure associated with autodetachment from an intermediate anion state. Calculations using equation-of-motion coupled-cluster method with single and double excitations reveal the presence of two dipole-bound excited anion states (a singlet and a triplet). The computed oscillator strength for the transition to the singlet dipole-bound state provides an estimate of the autodetachment channel contribution to the total photoelectron yield. Analysis of the different spectral features allows identification of the dipole-bound and neutral vibrational levels involved in the autodetachment processes. For the most part, the autodetachment channels are consistent with the vibrational propensity rule and normal mode expectation. However, examination of the rotational structure shows that autodetachment from the ν3 (v = 1 and v = 2) levels of the dipole-bound state displays behavior counter to the normal mode expectation with the final state vibrational level belonging to a different mode.
Costa, Marta; Manton, James D; Ostrovsky, Aaron D; Prohaska, Steffen; Jefferis, Gregory S X E
2016-07-20
Neural circuit mapping is generating datasets of tens of thousands of labeled neurons. New computational tools are needed to search and organize these data. We present NBLAST, a sensitive and rapid algorithm, for measuring pairwise neuronal similarity. NBLAST considers both position and local geometry, decomposing neurons into short segments; matched segments are scored using a probabilistic scoring matrix defined by statistics of matches and non-matches. We validated NBLAST on a published dataset of 16,129 single Drosophila neurons. NBLAST can distinguish neuronal types down to the finest level (single identified neurons) without a priori information. Cluster analysis of extensively studied neuronal classes identified new types and unreported topographical features. Fully automated clustering organized the validation dataset into 1,052 clusters, many of which map onto previously described neuronal types. NBLAST supports additional query types, including searching neurons against transgene expression patterns. Finally, we show that NBLAST is effective with data from other invertebrates and zebrafish. Copyright © 2016 MRC Laboratory of Molecular Biology. Published by Elsevier Inc. All rights reserved.
Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.
Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun
2012-01-01
The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
Ab initio and empirical energy landscapes of (MgF2)n clusters (n = 3, 4).
Neelamraju, S; Schön, J C; Doll, K; Jansen, M
2012-01-21
We explore the energy landscape of (MgF(2))(3) on both the empirical and ab initio level using the threshold algorithm. In order to determine the energy landscape and the dynamics of the trimer we investigate not only the stable isomers but also the barriers separating these isomers. Furthermore, we study the probability flows in order to estimate the stability of all the isomers found. We find that there is reasonable qualitative agreement between the ab initio and empirical potential, and important features such as sub-basins and energetic barriers follow similar trends. However, we observe that the energies are systematically different for the less compact clusters, when comparing empirical and ab initio energies. Since the underlying motivation of this work is to identify the possible clusters present in the gas phase during a low-temperature atom beam deposition synthesis of MgF(2), we employ the same procedure to additionally investigate the energy landscape of the tetramer. For this case, however, we use only the empirical potential.
Geometry and topology of the space of sonar target echos.
Robinson, Michael; Fennell, Sean; DiZio, Brian; Dumiak, Jennifer
2018-03-01
Successful synthetic aperture sonar target classification depends on the "shape" of the scatterers within a target signature. This article presents a workflow that computes a target-to-target distance from persistence diagrams, since the "shape" of a signature informs its persistence diagram in a structure-preserving way. The target-to-target distances derived from persistence diagrams compare favorably against those derived from spectral features and have the advantage of being substantially more compact. While spectral features produce clusters associated to each target type that are reasonably dense and well formed, the clusters are not well-separated from one another. In rather dramatic contrast, a distance derived from persistence diagrams results in highly separated clusters at the expense of some misclassification of outliers.
An unusual 2p-3d-4f heterometallic coordination polymer featuring Ln8Na and Cu8I clusters as nodes
NASA Astrophysics Data System (ADS)
Zhao, Mingjuan; Chen, Shimin; Huang, Yutian; Dan, Youmeng
2017-01-01
A new cluster-based three-dimensional 2p-3d-4f heterometallic framework {[Ho8Na(OH)6Cu16I2(CPT)24](NO3)9(H2O)6(CH3CN)18}n (1, HCPT = 4-(4-carboxyphenyl)-1,2,4-triazole) has been prepared under solvothermal conditions using a custom-designed bifunctional organic ligand. Single-crystal structure analysis reveals that this framework features novel Ln8Na and Cu8I clusters as nodes; these nodes are further connected by the CPT ligands to give rise to a (6,14)-connected network. The magnetic properties of this framework have also been investigated.
Clustering of Multivariate Geostatistical Data
NASA Astrophysics Data System (ADS)
Fouedjio, Francky
2017-04-01
Multivariate data indexed by geographical coordinates have become omnipresent in the geosciences and pose substantial analysis challenges. One of them is the grouping of data locations into spatially contiguous clusters so that data locations belonging to the same cluster have a certain degree of homogeneity while data locations in different clusters are as different as possible. However, groups of data locations created through classical clustering techniques turn out to show poor spatial contiguity, a feature that is inconvenient for many geoscience applications. In this work, we develop a clustering method that overcomes this problem by accounting for the spatial dependence structure of the data, thus reinforcing the spatial contiguity of the resulting clusters. The capability of the proposed clustering method to provide spatially contiguous and meaningful clusters of data locations is assessed using both synthetic and real datasets. Keywords: clustering, geostatistics, spatial contiguity, spatial dependence.
Chaos theory perspective for industry clusters development
NASA Astrophysics Data System (ADS)
Yu, Haiying; Jiang, Minghui; Li, Chengzhang
2016-03-01
Industry clusters have performed strongly in the economic development of most developing countries. Their contributions have been recognized as the promotion of regional business and the alleviation of economic and social costs. Globalization is undoubtedly pushing clusters to accelerate the competitiveness of economic activities. Accordingly, many ideas and concepts have been put forward to illustrate evolutionary tendencies, stimulate cluster development, and avoid the recession of industrial clusters. Chaos theory is introduced to explain the inherent relationships among features within industry clusters. A life cycle approach is proposed for analyzing the recession of industrial clusters. Lyapunov exponents and the Wolf model are presented for chaos identification and examination. A case study of Tianjin, China verifies the effectiveness of the model. The investigations indicate that the approach performs well in explaining the chaotic properties of industrial clusters, which helps describe cluster evolution, address empirical issues, and generate corresponding strategies.
A Fine-Scale Functional Logic to Convergence from Retina to Thalamus.
Liang, Liang; Fratzl, Alex; Goldey, Glenn; Ramesh, Rohan N; Sugden, Arthur U; Morgan, Josh L; Chen, Chinfei; Andermann, Mark L
2018-05-31
Numerous well-defined classes of retinal ganglion cells innervate the thalamus to guide image-forming vision, yet the rules governing their convergence and divergence remain unknown. Using two-photon calcium imaging in awake mouse thalamus, we observed a functional arrangement of retinal ganglion cell axonal boutons in which coarse-scale retinotopic ordering gives way to fine-scale organization based on shared preferences for other visual features. Specifically, at the ∼6 μm scale, clusters of boutons from different axons often showed similar preferences for either one or multiple features, including axis and direction of motion, spatial frequency, and changes in luminance. Conversely, individual axons could "de-multiplex" information channels by participating in multiple, functionally distinct bouton clusters. Finally, ultrastructural analyses demonstrated that retinal axonal boutons in a local cluster often target the same dendritic domain. These data suggest that functionally specific convergence and divergence of retinal axons may impart diverse, robust, and often novel feature selectivity to visual thalamus. Copyright © 2018 Elsevier Inc. All rights reserved.
Tomkova, Veronika; Korenkova, Vlasta; Langerova, Lucie; Simonova, Ekaterina; Zjablovskaja, Polina; Alberich-Jorda, Meritxell; Neuzil, Jiri; Truksa, Jaroslav
2017-01-01
The importance of iron in the growth and progression of tumors has been widely documented. In this report, we show that tumor-initiating cells (TICs), represented by spheres derived from the MCF7 cell line, exhibit a higher intracellular labile iron pool and mitochondrial iron accumulation, and are more susceptible to iron chelation. TICs also show activation of the IRP/IRE system, leading to higher iron uptake and a decrease in iron storage, suggesting that the level of properly assembled cytosolic iron-sulfur clusters (FeS) is reduced. This finding is confirmed by lower enzymatic activity of aconitase and FeS cluster biogenesis enzymes, as well as lower levels of reduced glutathione, implying reduced FeS cluster synthesis/utilization in TICs. Importantly, we have identified a specific gene signature related to iron metabolism consisting of genes regulating iron uptake, mitochondrial FeS cluster biogenesis and hypoxic response (ABCB10, ACO1, CYBRD1, EPAS1, GLRX5, HEPH, HFE, IREB2, QSOX1 and TFRC). Principal component analysis based on this signature is able to distinguish TICs from cancer cells in vitro and also leukemia-initiating cells (LICs) from non-LICs in the mouse model of acute promyelocytic leukemia (APL). The majority of the described changes were also recapitulated in an alternative model represented by MCF7 cells resistant to tamoxifen (TAMR) that exhibit features of TICs. Our findings point to the critical importance of redox balance and iron metabolism-related genes and proteins in the context of cancer and TICs that could be potentially used for cancer diagnostics or therapy. PMID:28031527
Global occurrence and heterogeneity of the Roseobacter-clade species Ruegeria mobilis
Sonnenschein, Eva C; Nielsen, Kristian F; D'Alvise, Paul; Porsby, Cisse H; Melchiorsen, Jette; Heilmann, Jens; Kalatzis, Panos G; López-Pérez, Mario; Bunk, Boyke; Spröer, Cathrin; Middelboe, Mathias; Gram, Lone
2017-01-01
Tropodithietic acid (TDA)-producing Ruegeria mobilis strains of the Roseobacter clade have primarily been isolated from marine aquaculture and have probiotic potential due to inhibition of fish pathogens. We hypothesized that TDA producers with additional novel features are present in the oceanic environment. We isolated 42 TDA-producing R. mobilis strains during a global marine research cruise. While highly similar on the 16S ribosomal RNA gene level (99–100% identity), the strains separated into four sub-clusters in a multilocus sequence analysis. They were further differentiated to the strain level by average nucleotide identity using pairwise genome comparison. The four sub-clusters could not be associated with a specific environmental niche; however, they correlated with the pattern of sub-typing using co-isolated phages, the number of prophages in the genomes and the distribution in ocean provinces. Major genomic differences within the sub-clusters include prophages and toxin-antitoxin systems. In general, the genome of R. mobilis revealed adaptation to a particle-associated lifestyle, and querying TARA ocean data confirmed that R. mobilis is more abundant in the particle-associated fraction than in the free-living fraction, occurring in 40% and 6% of the samples, respectively. Our data and the TARA data, although lacking sufficient data from the polar regions, demonstrate that R. mobilis is a globally distributed marine bacterial species found primarily in the upper open oceans. It has preserved key phenotypic behaviors such as the production of TDA, but contains diverse sub-clusters, which could provide new capabilities for utilization in aquaculture. PMID:27552638
Pellacani, Giovanni; Vinceti, Marco; Bassoli, Sara; Braun, Ralph; Gonzalez, Salvador; Guitera, Pascale; Longo, Caterina; Marghoob, Ashfaq A; Menzies, Scott W; Puig, Susana; Scope, Alon; Seidenari, Stefania; Malvehy, Josep
2009-10-01
To test the interobserver and intraobserver reproducibility of the standard terminology for description and diagnosis of melanocytic lesions in in vivo confocal microscopy. A dedicated Web platform was developed to train the participants and to allow independent distant evaluations of confocal images via the Internet. Department of Dermatology, University of Modena and Reggio Emilia, Modena, Italy. The study population was composed of 15 melanomas, 30 nevi, and 5 Spitz/Reed nevi. Six expert centers were invited to participate in the study. Intervention: Evaluation of 36 features in 345 confocal microscopic images from melanocytic lesions. Interobserver and intraobserver agreement was assessed by calculating the Cohen kappa statistic for each descriptor. High overall levels of reproducibility were shown for most of the evaluated features. In both the training and test sets there was a parallel trend of decreasing kappa values as deeper anatomic skin levels were evaluated. All of the features, except 1, used for melanoma diagnosis, including roundish pagetoid cells, nonedged papillae, atypical cells in basal layer, cerebriform clusters, and nucleated cells infiltrating dermal papillae, showed high overall levels of reproducibility. However, less-than-ideal reproducibility was obtained for some descriptors, such as grainy appearance of the epidermis, junctional thickening, mild atypia in basal layer, plump bright cells, small bright cells, and reticulated fibers in the dermis. Conclusion: The standard consensus confocal terminology useful for the evaluation of melanocytic lesions was reproducibly recognized by independent observers.
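The agreement measure used in the abstract above, Cohen's kappa, compares observed agreement between two raters with the agreement expected by chance. A minimal sketch for one descriptor follows; the ratings are invented (1 = feature present, 0 = absent) and the function name is an assumption, not part of the study's software.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected from each rater's marginal frequencies.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of one descriptor on ten images by two observers.
observer_1 = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 1, 0, 1, 1]

print(round(cohen_kappa(observer_1, observer_2), 3))
```

Here the observers agree on 8 of 10 images while chance alone would give 5 of 10, so kappa is about 0.6. The same function applied to one observer's first and repeated readings would give an intraobserver value.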
Application of Classification Methods for Forecasting Mid-Term Power Load Patterns
NASA Astrophysics Data System (ADS)
Piao, Minghao; Lee, Heon Gyu; Park, Jin Hyoung; Ryu, Keun Ho
An automated methodology based on data mining techniques is presented for the prediction of customer load patterns in long-duration load profiles. The proposed approach consists of three stages: (i) data preprocessing, in which noise and outliers are removed and continuous attribute-valued features are transformed to discrete values; (ii) cluster analysis, in which k-means clustering is used to create load pattern classes and the representative load profile for each class; and (iii) classification, in which several supervised learning methods are evaluated in order to select a suitable prediction method. Power load measured by an AMR (automatic meter reading) system, together with customer indexes, was used as input for clustering. The output of clustering was the set of representative load profiles (or classes). To evaluate the forecasting of load patterns, several classification methods were applied to a set of high-voltage customers of the Korean power system, with the class labels derived from clustering and other features used as inputs to build the classifiers. Finally, the results of our experiments are presented.
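The cluster-analysis stage described above can be sketched with plain k-means (Lloyd's algorithm). This is only an illustration under stated assumptions: the four-point load profiles are invented, the deterministic initialisation is a convenience for reproducibility, and the names `kmeans` and `squared_distance` are hypothetical rather than the authors' implementation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(profiles, k, iterations=50):
    """Plain k-means; returns (labels, centroids)."""
    # Deterministic initialisation (an assumption): evenly spaced profiles.
    step = len(profiles) // k
    centroids = [list(profiles[i * step]) for i in range(k)]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each profile joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical load profiles (four readings per customer), not AMR data.
profiles = [
    [1.0, 5.0, 6.0, 2.0],  # daytime peak
    [1.0, 6.0, 5.0, 2.0],  # daytime peak
    [6.0, 1.0, 1.0, 5.0],  # night-time peak
    [5.0, 2.0, 1.0, 6.0],  # night-time peak
]
labels, centroids = kmeans(profiles, k=2)
print(labels)
print(centroids)
```

The centroids play the role of the representative load profiles, and the labels are the class labels that a downstream supervised classifier would be trained to predict.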
Features of globular cluster's dynamics with an intermediate-mass black hole
NASA Astrophysics Data System (ADS)
Ryabova, Marina V.; Gorban, Alena S.; Shchekinov, Yuri A.; Vasiliev, Evgenii O.
2018-02-01
In this paper, we address the question of how a central intermediate-mass black hole (IMBH) in a globular cluster (GC) affects dynamics, core collapse, and formation of the binary population. It is shown that the central IMBH forms a binary system that significantly affects the dynamics of stars in the cluster. The presence of an intermediate-mass black hole with mass ≥ 1.0-1.7% of the total stellar mass in the cluster inhibits the formation of a binary star population.
Computer aided detection of clusters of microcalcifications on full field digital mammograms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ge Jun; Sahiner, Berkman; Hadjiiski, Lubomir M.
2006-08-15
We are developing a computer-aided detection (CAD) system to identify microcalcification clusters (MCCs) automatically on full field digital mammograms (FFDMs). The CAD system includes six stages: preprocessing; image enhancement; segmentation of microcalcification candidates; false positive (FP) reduction for individual microcalcifications; regional clustering; and FP reduction for clustered microcalcifications. At the stage of FP reduction for individual microcalcifications, a truncated sum-of-squares error function was used to improve the efficiency and robustness of the training of an artificial neural network in our CAD system for FFDMs. At the stage of FP reduction for clustered microcalcifications, morphological features and features derived from the artificial neural network outputs were extracted from each cluster. Stepwise linear discriminant analysis (LDA) was used to select the features. An LDA classifier was then used to differentiate clustered microcalcifications from FPs. A data set of 96 cases with 192 images was collected at the University of Michigan. This data set contained 96 MCCs, of which 28 clusters were proven by biopsy to be malignant and 68 were proven to be benign. The data set was separated into two independent data sets for training and testing of the CAD system in a cross-validation scheme. When one data set was used to train and validate the convolution neural network (CNN) in our CAD system, the other data set was used to evaluate the detection performance. With the use of a truncated error metric, the training of CNN could be accelerated and the classification performance was improved. The CNN in combination with an LDA classifier could substantially reduce FPs with a small tradeoff in sensitivity. By using the free-response receiver operating characteristic methodology, it was found that our CAD system can achieve a cluster-based sensitivity of 70%, 80%, and 90% at 0.21, 0.61, and 1.49 FPs/image, respectively. For case-based performance evaluation, a sensitivity of 70%, 80%, and 90% can be achieved at 0.07, 0.17, and 0.65 FPs/image, respectively. We also used a data set of 216 mammograms negative for clustered microcalcifications to further estimate the FP rate of our CAD system. The corresponding FP rates were 0.15, 0.31, and 0.86 FPs/image for cluster-based detection when negative mammograms were used for estimation of FP rates.
Akar, Servet; Solmaz, Dilek; Kasifoglu, Timucin; Bilge, Sule Yasar; Sari, Ismail; Gumus, Zeynep Zehra; Tunca, Mehmet
2016-02-01
The aim of this study was to evaluate whether there are clinical subgroups that may have different prognoses among FMF patients. The cumulative clinical features of a large group of FMF patients [1168 patients, 593 (50.8%) male, mean age 35.3 years (s.d. 12.4)] were studied. To analyse our data and identify groups of FMF patients with similar clinical characteristics, a two-step cluster analysis using log-likelihood distance measures was performed. For clustering the FMF patients, we evaluated the following variables: gender, current age, age at symptom onset, age at diagnosis, presence of major clinical features, variables related with therapy and family history for FMF, renal failure and carriage of M694V. Three distinct groups of FMF patients were identified. Cluster 1 was characterized by a high prevalence of arthritis, pleuritis, erysipelas-like erythema (ELE) and febrile myalgia. The dosage of colchicine and the frequency of amyloidosis were lower in cluster 1. Patients in cluster 2 had an earlier age of disease onset and diagnosis. M694V carriage and amyloidosis prevalence were the highest in cluster 2. This group of patients was using the highest dose of colchicine. Patients in cluster 3 had the lowest prevalence of arthritis, ELE and febrile myalgia. The frequencies of M694V carriage and amyloidosis were lower in cluster 3 than the overall FMF patients. Non-response to colchicine was also slightly lower in cluster 3. Patients with FMF can be clustered into distinct patterns of clinical and genetic manifestations and these patterns may have different prognostic significance. © The Author 2015. Published by Oxford University Press on behalf of the British Society for Rheumatology. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Phenotypes determined by cluster analysis in severe or difficult-to-treat asthma.
Schatz, Michael; Hsu, Jin-Wen Y; Zeiger, Robert S; Chen, Wansu; Dorenbaum, Alejandro; Chipps, Bradley E; Haselkorn, Tmirah
2014-06-01
Asthma phenotyping can facilitate understanding of disease pathogenesis and potential targeted therapies. To further characterize the distinguishing features of phenotypic groups in difficult-to-treat asthma. Children ages 6-11 years (n = 518) and adolescents and adults ages ≥12 years (n = 3612) with severe or difficult-to-treat asthma from The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) study were evaluated in this post hoc cluster analysis. Analyzed variables included sex, race, atopy, age of asthma onset, smoking (adolescents and adults), passive smoke exposure (children), obesity, and aspirin sensitivity. Cluster analysis used the hierarchical clustering algorithm with the Ward minimum variance method. The results were compared among clusters by χ² analysis; variables with significant (P < .05) differences among clusters were considered as distinguishing feature candidates. Associations among clusters and asthma-related health outcomes were assessed in multivariable analyses by adjusting for socioeconomic status, environmental exposures, and intensity of therapy. Five clusters were identified in each age stratum. Sex, atopic status, and nonwhite race were distinguishing variables in both strata; passive smoke exposure was distinguishing in children and aspirin sensitivity in adolescents and adults. Clusters were not related to outcomes in children, but 2 adult and adolescent clusters distinguished by nonwhite race and aspirin sensitivity manifested poorer quality of life (P < .0001), and the aspirin-sensitive cluster experienced more frequent asthma exacerbations (P < .0001). Distinct phenotypes appear to exist in patients with severe or difficult-to-treat asthma, which is related to outcomes in adolescents and adults but not in children. The study of the therapeutic implications of these phenotypes is warranted. Copyright © 2013 American Academy of Allergy, Asthma & Immunology. Published by Mosby, Inc. All rights reserved.
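As a rough illustration of the Ward minimum variance method named in the abstract above, the sketch below merges, at each step, the pair of clusters whose fusion increases the within-cluster sum of squares the least. The numeric toy points and the function name `ward_clusters` are assumptions made for illustration; the study's actual variables were largely categorical and its software is not stated.

```python
def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion.

    Returns a list of clusters, each a sorted list of point indices.
    """
    clusters = [
        {"members": [i], "centroid": list(p)} for i, p in enumerate(points)
    ]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                a, b = clusters[i], clusters[j]
                na, nb = len(a["members"]), len(b["members"])
                d2 = sum(
                    (x - y) ** 2 for x, y in zip(a["centroid"], b["centroid"])
                )
                # Increase in within-cluster sum of squares if a and b merge.
                cost = na * nb / (na + nb) * d2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        a, b = clusters[i], clusters[j]
        na, nb = len(a["members"]), len(b["members"])
        merged_centroid = [
            (na * x + nb * y) / (na + nb)
            for x, y in zip(a["centroid"], b["centroid"])
        ]
        clusters[i] = {
            "members": a["members"] + b["members"],
            "centroid": merged_centroid,
        }
        del clusters[j]
    return [sorted(c["members"]) for c in clusters]


# Hypothetical two-dimensional observations (not TENOR data).
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]]
groups = ward_clusters(points, n_clusters=3)
print(groups)
```

This brute-force version rescans all pairs at every merge, which is fine for a handful of points; library implementations use update formulas to avoid that cost.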
Matsui, Naruaki; Kajiwara, Hiroshi; Morishita, Akihiro; Tsukada, Hitomi; Nakazawa, Kazumi; Miyazawa, Masaki; Mikami, Mikio; Nakamura, Naoya; Sato, Shinkichi
2015-06-20
The aim of this study was to clarify the cytological characteristics of grade 3 endometrioid adenocarcinoma of endometrial origin (G3 EA) by endometrial brushing cytology. The subjects were 11 patients in whom G3 EA was diagnosed by review of preoperative cytological specimens obtained at our hospital and related institutions between 2000 and 2010. These patients were investigated with respect to the preoperative cytological diagnosis, background changes, cell cluster patterns, and individual cellular findings. Background changes were classified as inflammatory or tumorous, while cell clusters were classified as overlapping cell cluster, sheet-like cell cluster, clump of high dense gland, papillary, or other cell cluster. Cellular findings were investigated by comparing the incidence of squamous and clear cell metaplasia, the nuclear rounding rate, and the nuclear area with the findings in a control group (35 patients with G1-2 EA). Background changes were classified as inflammatory in 63.6% and necrotic in 36.4%. The cell clusters were classified as overlapping cell cluster in 44.8%, sheet-like cell cluster in 21.7%, clump of high dense gland in 10.0%, papillary in 4.0%, and other cell cluster in 19.5%. The incidence of squamous and clear cell metaplasia was 27.2% and 18.1%, respectively. The mean nuclear rounding rate was 0.97, and the mean nuclear area was 55.98 µm². Investigation of the cytoarchitecture of G3 EA with endometrial brushing cytology revealed overlapping cell clusters and tumor cells of a relatively uniform size. These findings suggest that it is necessary to recognize that there are differences between the cytological findings of G3 EA and the usual features of G1-2 EA.
NASA Astrophysics Data System (ADS)
Burchett, Joseph N.; Tripp, Todd M.; Wang, Q. Daniel; Willmer, Christopher N. A.; Bowen, David V.; Jenkins, Edward B.
2018-04-01
We analyse the intracluster medium (ICM) and circumgalactic medium (CGM) in seven X-ray-detected galaxy clusters using spectra of background quasi-stellar objects (QSOs) (HST-COS/STIS), optical spectroscopy of the cluster galaxies (MMT/Hectospec and SDSS), and X-ray imaging/spectroscopy (XMM-Newton and Chandra). First, we report a very low covering fraction of H I absorption in the CGM of these cluster galaxies, $f_c = 25^{+25}_{-15}$ per cent, to stringent detection limits ($N(\mathrm{H\,I}) < 10^{13}$ cm$^{-2}$). As field galaxies have an H I covering fraction of ~100 per cent at similar radii, the dearth of CGM H I in our data indicates that the cluster environment has effectively stripped or overionized the gaseous haloes of these cluster galaxies. Secondly, we assess the contribution of warm-hot ($10^5$-$10^6$ K) gas to the ICM as traced by O VI and broad Ly α (BLA) absorption. Despite the high signal-to-noise ratio of our data, we do not detect O VI in any cluster, and we only detect BLA features in the QSO spectrum probing one cluster. We estimate that the total column density of warm-hot gas along this line of sight totals to ~3 per cent of that contained in the hot $T > 10^7$ K X-ray emitting phase. Residing at high relative velocities, these features may trace pre-shocked material outside the cluster. Comparing gaseous galaxy haloes from the low-density 'field' to galaxy groups and high-density clusters, we find that the CGM is progressively depleted of H I with increasing environmental density, and the CGM is most severely transformed in galaxy clusters. This CGM transformation may play a key role in environmental galaxy quenching.
Theory of mind predicts severity level in autism.
Hoogenhout, Michelle; Malcolm-Smith, Susan
2017-02-01
We investigated whether theory of mind skills can indicate autism spectrum disorder severity. In all, 62 children with autism spectrum disorder completed a developmentally sensitive theory of mind battery. We used intelligence quotient, Diagnostic and Statistical Manual of Mental Disorders (4th ed.) diagnosis and level of support needed as indicators of severity level. Using hierarchical cluster analysis, we found three distinct clusters of theory of mind ability: early-developing theory of mind (Cluster 1), false-belief reasoning (Cluster 2) and sophisticated theory of mind understanding (Cluster 3). The clusters corresponded to severe, moderate and mild autism spectrum disorder. As an indicator of level of support needed, cluster grouping predicted the type of school children attended. All Cluster 1 children attended autism-specific schools; Cluster 2 was divided between autism-specific and special needs schools and nearly all Cluster 3 children attended general special needs and mainstream schools. Assessing theory of mind skills can reliably discriminate severity levels within autism spectrum disorder.
Bessette, Katie L; Jenkins, Lisanne M; Skerrett, Kristy A; Gowins, Jennifer R; DelDonno, Sophie R; Zubieta, Jon-Kar; McInnis, Melvin G; Jacobs, Rachel H; Ajilore, Olusola; Langenecker, Scott A
2018-01-01
There is substantial variability across studies of default mode network (DMN) connectivity in major depressive disorder, and reliability and time-invariance are not reported. This study evaluates whether DMN dysconnectivity in remitted depression (rMDD) is reliable over time and symptom-independent, and explores convergent relationships with cognitive features of depression. A longitudinal study was conducted with 82 young adults free of psychotropic medications (47 rMDD, 35 healthy controls) who completed clinical structured interviews, neuropsychological assessments, and 2 resting-state fMRI scans across 2 study sites. Functional connectivity analyses from bilateral posterior cingulate and anterior hippocampal formation seeds in DMN were conducted at both time points within a repeated-measures analysis of variance to compare groups and evaluate reliability of group-level connectivity findings. Eleven hyper- (from posterior cingulate) and 6 hypo- (from hippocampal formation) connectivity clusters in rMDD were obtained with moderate to adequate reliability in all but one cluster (ICC range = 0.50 to 0.76 for 16 of 17). The significant clusters were reduced with a principal component analysis (5 components obtained) to explore these connectivity components, and were then correlated with cognitive features (rumination, cognitive control, learning and memory, and explicit emotion identification). At the exploratory level, for convergent validity, components consisting of posterior cingulate with cognitive control network hyperconnectivity in rMDD were related to cognitive control (inverse) and rumination (positive). Components consisting of anterior hippocampal formation with social emotional network and DMN hypoconnectivity were related to memory (inverse) and happy emotion identification (positive). Thus, time-invariant DMN connectivity differences exist early in the lifespan course of depression and are reliable. 
The nuanced results suggest a ventral within-network hypoconnectivity associated with poor memory and a dorsal cross-network hyperconnectivity linked to poorer cognitive control and elevated rumination. Study of early course remitted depression with attention to reliability and symptom independence could lead to more readily translatable clinical assessment tools for biomarkers.
New approaches to model and study social networks
NASA Astrophysics Data System (ADS)
Lind, P. G.; Herrmann, H. J.
2007-07-01
We describe and develop three recent novelties in network research which are particularly useful for studying social systems. The first one concerns the discovery of some basic dynamical laws that enable the emergence of the fundamental features observed in social networks, namely the nontrivial clustering properties, the existence of positive degree correlations and the subdivision into communities. To reproduce all these features, we describe a simple model of mobile colliding agents, whose collisions define the connections between the agents which are the nodes in the underlying network, and develop some analytical considerations. The second point addresses the particular feature of clustering and its relationship with global network measures, namely with the distribution of the size of cycles in the network. Since in social bipartite networks it is not possible to measure the clustering from standard procedures, we propose an alternative clustering coefficient that can be used to extract an improved normalized cycle distribution in any network. Finally, the third point addresses dynamical processes occurring on networks, namely when studying the propagation of information in them. In particular, we focus on the particular features of gossip propagation which impose some restrictions in the propagation rules. To this end we introduce a quantity, the spread factor, which measures the average maximal fraction of nearest neighbours which get in contact with the gossip, and find the striking result that there is an optimal non-trivial number of friends for which the spread factor is minimized, decreasing the danger of being gossiped about.
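To make the remark about bipartite networks concrete, here is a minimal sketch of the standard local clustering coefficient (the fraction of a node's neighbour pairs that are themselves linked). It is the conventional measure, not the alternative coefficient the authors propose, and the two toy graphs are invented for illustration.

```python
from itertools import combinations


def local_clustering(adjacency, node):
    """Standard clustering coefficient of one node in an undirected graph."""
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # fewer than two neighbours: no pair can form a triangle
    linked_pairs = sum(
        1 for u, v in combinations(neighbours, 2) if v in adjacency[u]
    )
    return 2.0 * linked_pairs / (k * (k - 1))


def average_clustering(adjacency):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adjacency, n) for n in adjacency) / len(adjacency)


# Hypothetical graphs stored as adjacency sets.
triangle_with_tail = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
# Bipartite example: people p1, p2 each linked to groups g1, g2.
bipartite_square = {
    "p1": {"g1", "g2"},
    "p2": {"g1", "g2"},
    "g1": {"p1", "p2"},
    "g2": {"p1", "p2"},
}
print(average_clustering(triangle_with_tail))
print(average_clustering(bipartite_square))
```

A bipartite graph contains no triangles, so this coefficient is zero for every node regardless of how densely the four-node cycles are packed, which is the limitation that motivates a cycle-based alternative.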
NASA Astrophysics Data System (ADS)
Sembolini, Federico; De Petris, Marco; Yepes, Gustavo; Foschi, Emma; Lamagna, Luca; Gottlöber, Stefan
2014-06-01
In this work, we study the properties of protoclusters of galaxies by employing the MultiDark SImulations of galaxy Clusters (MUSIC) set of hydrodynamical simulations, featuring a sample of 282 resimulated clusters with available merger trees up to z = 4. We study the characteristics and redshift evolution of the mass and the spatial distribution for all the protoclusters, which we define as the most massive progenitors of the clusters identified at z = 0. We extend the study of the baryon content to redshifts larger than 1, also in terms of gas and stellar budgets: no remarkable variations with redshift are found. Furthermore, motivated by the proven potential of Sunyaev-Zel'dovich surveys to blindly search for faint distant objects, we compute the scaling relation between total object mass and integrated Compton y-parameter. We find that the slope of this scaling law is steeper than expected under the assumption of self-similarity among these objects, and that it increases with redshift mainly when radiative processes are included. We use three different criteria to account for the dynamical state of the protoclusters, and find no significant dependence of the scaling parameters on the level of relaxation. We exclude the dynamical state as the cause of the observed deviations from self-similarity in protoclusters.
NASA Astrophysics Data System (ADS)
Macleod, Neil A.; Simons, John P.
2002-10-01
The conformational landscapes of 2-phenoxy ethanol (POX) and its hydrated clusters have been studied in the gas-phase, providing a model for pharmaceutical β-blockers. A combination of experimental techniques, including resonant two-photon ionisation (R2PI), laser-induced-fluorescence (LIF) and resonant ion-dip infra-red spectroscopy (RIDIRS), coupled with high-level ab initio calculations has allowed the assignment of the individually resolved spectral features to discrete conformational and supra-molecular structures. Assignments were made by comparison of experimental vibrational spectra and partially resolved ultra-violet rotational band contours with those predicted from quantum chemical calculations. The isolated molecule displays a solitary structure with an extended geometry of the side-chain which is stabilised by an intramolecular hydrogen-bond between the alcohol (proton donor) and the ether (proton acceptor) groups of the side-chain. In singly hydrated clusters the water molecule is accommodated by insertion into the intramolecular hydrogen-bond. In the doubly hydrated and higher clusters cyclic structures are generated which incorporate both the water molecules and the terminal OH group of the side-chain; additional (weak) hydrogen bonded interactions with the phenoxy group provide a degree of selectivity but essentially, the water 'droplet' forms on the end of the alcohol side-chain.
Medical Imaging Lesion Detection Based on Unified Gravitational Fuzzy Clustering
Vianney Kinani, Jean Marie; Gallegos Funes, Francisco; Mújica Vargas, Dante; Ramos Díaz, Eduardo; Arellano, Alfonso
2017-01-01
We develop a swift, robust, and practical tool for detecting brain lesions with minimal user intervention to assist clinicians and researchers in the diagnosis process, radiosurgery planning, and assessment of the patient's response to the therapy. We propose a unified gravitational fuzzy clustering-based segmentation algorithm, which integrates the Newtonian concept of gravity into fuzzy clustering. We first perform fuzzy rule-based image enhancement on our database which is comprised of T1/T2 weighted magnetic resonance (MR) and fluid-attenuated inversion recovery (FLAIR) images to facilitate a smoother segmentation. The scalar output obtained is fed into a gravitational fuzzy clustering algorithm, which separates healthy structures from the unhealthy. Finally, the lesion contour is automatically outlined through the initialization-free level set evolution method. An advantage of this lesion detection algorithm is its precision and its simultaneous use of features computed from the intensity properties of the MR scan in a cascading pattern, which makes the computation fast, robust, and self-contained. Furthermore, we validate our algorithm with large-scale experiments using clinical and synthetic brain lesion datasets. As a result, an 84%–93% overlap performance is obtained, with an emphasis on robustness with respect to different and heterogeneous types of lesion and a swift computation time. PMID:29158887
A graph-based watershed merging using fuzzy C-means and simulated annealing for image segmentation
NASA Astrophysics Data System (ADS)
Vadiveloo, Mogana; Abdullah, Rosni; Rajeswari, Mandava
2015-12-01
In this paper, we address the issue of over-segmented regions produced by the watershed transform by merging the regions using a global feature. The global feature information is obtained by clustering the image in its feature space using Fuzzy C-Means (FCM) clustering. The over-segmented regions produced by performing watershed on the gradient of the image are then mapped to this global information in the feature space. In addition, the global feature information is optimized using Simulated Annealing (SA). The optimal global feature information is used to derive the similarity criterion for merging the over-segmented watershed regions, which are represented by the region adjacency graph (RAG). The proposed method has been tested on a simulated digital brain phantom dataset to segment white matter (WM), gray matter (GM) and cerebrospinal fluid (CSF) soft tissue regions. The experiments showed that the proposed method performs statistically better than the immersion watershed, merging on average 95.242% of the regions, and improves average accuracy by 8.850% in comparison with RAG-based immersion watershed merging using global and local features.
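The FCM step mentioned above can be sketched on one-dimensional intensities. This is a textbook formulation with fuzzifier m = 2, written under assumptions: the intensity values, the initial centres, and the names `fcm_memberships` and `fuzzy_c_means` are hypothetical, and the watershed, RAG and simulated annealing stages of the paper are not reproduced.

```python
def fcm_memberships(values, centers, m=2.0):
    """Membership of every value in every cluster; each row sums to 1."""
    rows = []
    for x in values:
        dists = [abs(x - c) for c in centers]
        if 0.0 in dists:
            # The value coincides with a centre: give it full membership there.
            hits = [1.0 if d == 0.0 else 0.0 for d in dists]
            rows.append([h / sum(hits) for h in hits])
        else:
            inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
            total = sum(inv)
            rows.append([w / total for w in inv])
    return rows


def fuzzy_c_means(values, centers, m=2.0, iterations=100, tol=1e-6):
    """Alternate membership and centre updates until the centres stop moving."""
    centers = list(centers)
    for _ in range(iterations):
        u = fcm_memberships(values, centers, m)
        new_centers = []
        for j in range(len(centers)):
            weights = [row[j] ** m for row in u]
            new_centers.append(
                sum(w * x for w, x in zip(weights, values)) / sum(weights)
            )
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, fcm_memberships(values, centers, m)


# Hypothetical voxel intensities for three tissue classes (not phantom data).
intensities = [10.0, 12.0, 11.0, 50.0, 52.0, 51.0, 90.0, 91.0, 89.0]
centers, memberships = fuzzy_c_means(intensities, centers=[0.0, 45.0, 100.0])
hard_labels = [row.index(max(row)) for row in memberships]
print([round(c, 2) for c in centers])
print(hard_labels)
```

Unlike k-means, every value keeps a graded membership in every cluster, which is the kind of soft global information that a merging criterion can draw on.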
The detection methods of dynamic objects
NASA Astrophysics Data System (ADS)
Knyazev, N. L.; Denisova, L. A.
2018-01-01
The article deals with the application of cluster analysis methods to the task of aircraft detection, based on distributing selected navigation parameters into groups (clusters). A modified cluster analysis method is suggested that searches for and detects objects, iteratively combines them into clusters, and then counts the clusters in order to increase the accuracy of aircraft detection. The operation of the method and the features of its implementation are considered. In conclusion, the efficiency of the proposed method for accurate cluster analysis in finding targets is shown.
Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu
2013-01-01
DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.
Influence of magnetism and correlation on the spectral properties of doped Mott insulators
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Yao; Moritz, Brian; Chen, Cheng-Chien
Unraveling the nature of the doping-induced transition between a Mott insulator and a weakly correlated metal is crucial to understanding novel emergent phases in strongly correlated materials. Here, for this purpose, we study the evolution of spectral properties upon doping Mott insulating states by utilizing the cluster perturbation theory on the Hubbard and t-J-like models. Specifically, a quasifree dispersion crossing the Fermi level develops with small doping, and it eventually evolves into the most dominant feature at high doping levels. Although this dispersion is related to the free-electron hopping, our study shows that this spectral feature is, in fact, influenced inherently by both electron-electron correlation and spin-exchange interaction: the correlation destroys coherence, while the coupling between spin and mobile charge restores it in the photoemission spectrum. Due to the persistent impact of correlations and spin physics, the onset of gaps or the high-energy anomaly in the spectral functions can be expected in doped Mott insulators.
Influence of magnetism and correlation on the spectral properties of doped Mott insulators
Wang, Yao; Moritz, Brian; Chen, Cheng-Chien; ...
2018-03-01
Unraveling the nature of the doping-induced transition between a Mott insulator and a weakly correlated metal is crucial to understanding novel emergent phases in strongly correlated materials. Here, for this purpose, we study the evolution of spectral properties upon doping Mott insulating states by utilizing the cluster perturbation theory on the Hubbard and t-J-like models. Specifically, a quasifree dispersion crossing the Fermi level develops with small doping, and it eventually evolves into the most dominant feature at high doping levels. Although this dispersion is related to the free-electron hopping, our study shows that this spectral feature is, in fact, influenced inherently by both electron-electron correlation and spin-exchange interaction: the correlation destroys coherence, while the coupling between spin and mobile charge restores it in the photoemission spectrum. Due to the persistent impact of correlations and spin physics, the onset of gaps or the high-energy anomaly in the spectral functions can be expected in doped Mott insulators.
Many Specialists for Suppressing Cortical Excitation
Burkhalter, Andreas
2008-01-01
Cortical computations are critically dependent on GABA-releasing neurons for dynamically balancing excitation with inhibition that is proportional to the overall level of activity. Although it is widely accepted that there are multiple types of interneurons, defining their identities based on qualitative descriptions of morphological, molecular and physiological features has failed to produce a universally accepted 'parts list', which is needed to understand the roles that interneurons play in cortical processing. A list of features has been published by the Petilla Interneurons Nomenclature Group, which represents an important step toward an unbiased classification of interneurons. To this end, some essential features have recently been studied quantitatively and their association was examined using multidimensional cluster analyses. These studies revealed at least 3 distinct electrophysiological, 6 morphological and 15 molecular phenotypes. This is a conservative estimate of the number of interneuron types, which almost certainly will be revised as more quantitative studies are performed and similarities are defined objectively. It is clear that interneurons are organized with physiological attributes representing the most general, molecular characteristics the most detailed and morphological features occupying the middle ground. By themselves, none of these features is sufficient to define classes of interneurons. The challenge will be to determine which features belong together and how cell type-specific feature combinations are genetically specified. PMID:19225588
2013-06-24
NASA's Cassini spacecraft has been monitoring propeller features such as Bleriot since their discovery. The bright dash-like features are regions where a small moonlet has caused ring particles to cluster together more densely than normal.
Protein attributes contribute to halo-stability, bioinformatics approach
2011-01-01
Halophile proteins can tolerate high salt concentrations. Understanding halophilicity features is the first step toward engineering halostable crops. To this end, we examined protein features contributing to the halo-tolerance of halophilic organisms. We compared more than 850 features for halophilic and non-halophilic proteins with various screening, clustering, decision tree, and generalized rule induction models to search for patterns that code for halo-tolerance. Up to 251 protein attributes were selected by various attribute weighting algorithms as important features contributing to halo-stability; of these, 14 attributes were selected by 90% of the models, and the count of hydrogen gained the highest value (1.0) in 70% of the attribute weighting models, showing the importance of this attribute in feature selection modeling. The other attributes were mostly the frequencies of di-peptides. No changes were found in the numbers of groups when K-Means and TwoStep clustering were performed on datasets with or without feature selection filtering. Although the depths of the induced trees were not high, the accuracies of the trees were higher than 94%, and the frequency of hydrophobic residues was identified as the most important feature for building the trees. The performance evaluations of the decision tree models had the same values, and the best correctness percentage was recorded with the Exhaustive CHAID and CHAID models. We did not find any significant difference in the percent of correctness, performance evaluation, and mean correctness of various decision tree models with or without feature selection. For the first time, we analyzed the performance of different screening, clustering, and decision tree algorithms for discriminating halophilic and non-halophilic proteins, and the results showed that amino acid composition can be used to discriminate between halo-tolerant and halo-sensitive proteins. PMID:21592393
Rare k-mer DNA: Identification of sequence motifs and prediction of CpG island and promoter.
Mohamed Hashim, Ezzeddin Kamil; Abdullah, Rosni
2015-12-21
Empirical analysis of k-mer DNA has proven to be an effective tool for finding unique patterns in DNA sequences, which can lead to the discovery of potential sequence motifs. In an extensive study of empirical k-mer DNA on hundreds of organisms, the researchers found that unique multi-modal k-mer spectra occur only in the genomes of organisms from the tetrapod clade, which includes all mammals. The multi-modality is caused by the formation of the two lowest modes, where the k-mers under them are referred to as the rare k-mers. The suppression of the two lowest modes (or the rare k-mers) can be attributed to the CG dinucleotide inclusions in them. Apart from that, the rare k-mers are selectively distributed in certain genomic features of CpG Island (CGI), promoter, 5' UTR, and exon. We correlated the rare k-mers with hundreds of annotated features using several bioinformatic tools, performed further intrinsic rare k-mer analyses within the correlated features, and modeled the elucidated rare k-mer clustering feature into a classifier to predict the correlated CGI and promoter features. Our correlation results show that rare k-mers are highly associated with several annotated features of CGI, promoter, 5' UTR, and open chromatin regions. Our intrinsic results show that rare k-mers have several unique topological, compositional, and clustering properties in CGI and promoter features. Finally, the performance of our RWC (rare-word clustering) method in predicting the CGI and promoter features is ranked among the top three in eight of the CGI and promoter evaluations, among eight of the benchmarked datasets. Crown Copyright © 2015. Published by Elsevier Ltd. All rights reserved.
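The k-mer spectrum idea in this abstract can be illustrated with a minimal sketch. It assumes a toy sequence and a hand-picked occurrence threshold for "rare"; the abstract defines rare k-mers through the two lowest modes of the spectrum, which this sketch does not attempt to detect. All function names here are hypothetical.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count every overlapping k-mer in an upper-cased DNA sequence."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_spectrum(counts):
    """Map each occurrence count to the number of distinct k-mers having it."""
    return Counter(counts.values())

def cg_fraction(counts, max_count):
    """Fraction of k-mers seen at most max_count times that contain 'CG'.

    max_count is an assumed stand-in for the mode-based rarity cut-off.
    """
    rare = [kmer for kmer, n in counts.items() if n <= max_count]
    if not rare:
        return 0.0
    return sum("CG" in kmer for kmer in rare) / len(rare)

toy = "ATATATATCGATATATAT"  # hypothetical toy sequence, not real genomic data
counts = kmer_counts(toy, 2)
spectrum = kmer_spectrum(counts)
rare_cg = cg_fraction(counts, max_count=1)
```

On real genomes k would be larger and the spectrum would be inspected for multiple modes before choosing a cut-off.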
van Bronswijk, Suzanne C; Lemmens, Lotte H J M; Viechtbauer, Wolfgang; Huibers, Marcus J H; Arntz, Arnoud; Peeters, Frenk P M L
2018-01-01
Despite extensive research, there is no consensus on how Personality Disorders (PD) and PD features affect outcome for Major Depressive Disorder (MDD). The present study evaluated the effects of PD (features) on treatment continuation and effectiveness in Cognitive Therapy (CT) and Interpersonal Psychotherapy (IPT) for MDD. Depressed outpatients were randomized to CT (n=72) and IPT (n=74). Primary outcome was depression severity measured repeatedly with the Beck Depression Inventory-II (BDI-II) at baseline, three months, at the start of each therapy session, at post-treatment and monthly during five months follow-up. Comorbid PD and PD features did not affect dropout. Multilevel and Cox regression models indicated no negative effect of PD on BDI-II change and remission rates during treatment and follow-up, irrespective of the treatment received. For both therapies, higher dependent PD features predicted overall lower BDI-II scores during treatment; however, this effect was not sustained through follow-up. Cluster A PD features moderated treatment outcome during treatment and follow-up: individuals with high cluster A PD features had greater BDI-II reductions over time in CT as compared to IPT. Not all therapists and participants were blind to the assessment of PD (features), and assessments were performed by one rater. Further research must investigate the state and trait dependent changes of PD and MDD over time. We found no negative impact of PD on the effectiveness and treatment retention of CT and IPT for MDD during treatment and follow-up. If replicated, cluster A PD features can be used to optimize treatment selection. Copyright © 2017 Elsevier B.V. All rights reserved.
Nonredundant sparse feature extraction using autoencoders with receptive fields clustering.
Ayinde, Babajide O; Zurada, Jacek M
2017-09-01
This paper proposes new techniques for data representation in the context of deep learning using agglomerative clustering. Existing autoencoder-based data representation techniques tend to produce a number of encoding and decoding receptive fields of layered autoencoders that are duplicative, thereby leading to extraction of similar features, thus resulting in filtering redundancy. We propose a way to address this problem and show that such redundancy can be eliminated. This yields smaller networks and produces unique receptive fields that extract distinct features. It is also shown that autoencoders with nonnegativity constraints on weights are capable of extracting fewer redundant features than conventional sparse autoencoders. The concept is illustrated using conventional sparse autoencoder and nonnegativity-constrained autoencoders with MNIST digits recognition, NORB normalized-uniform object data and Yale face dataset. Copyright © 2017 Elsevier Ltd. All rights reserved.
Properties of the gold-sulphur interface: from self-assembled monolayers to clusters
NASA Astrophysics Data System (ADS)
Bürgi, Thomas
2015-09-01
The gold-sulphur interface of self-assembled monolayers (SAMs) was extensively studied some time ago. More recently tremendous progress has been made in the preparation and characterization of thiolate-protected gold clusters. In this feature article we address different properties of the two systems such as their structure, the mobility of the thiolates on the surface and other dynamical aspects, the chirality of the structures and characteristics related to it and their vibrational properties. SAMs and clusters are in the focus of different communities that typically use different experimental approaches to study the respective systems. However, it seems that the nature of the Au-S interfaces in the two cases is quite similar. Recent single crystal X-ray structures of thiolate-protected gold clusters reveal staple motifs characterized by gold ad-atoms sandwiched between two sulphur atoms. This finding contradicts older work on SAMs. However, newer studies on SAMs also reveal ad-atoms. Whether this finding can be generalized remains to be shown. In any case, more and more studies highlight the dynamic nature of the Au-S interface, both on flat surfaces and in clusters. At temperatures slightly above ambient, thiolates migrate on the gold surface and on clusters. Evidence for desorption of thiolates at room temperature, at least under certain conditions, has been demonstrated for both systems. The adsorbed thiolate can lead to chirality at different length scales, which has been shown both on surfaces and for clusters. Chirality emerges from the organization of the thiolates as well as locally at the molecular level. Chirality can also be transferred from a chiral surface to an adsorbate, as evidenced by vibrational spectroscopy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Correnti, Matteo; Goudfrooij, Paul; Kalirai, Jason S.
2014-10-01
We use the Wide Field Camera 3 on board the Hubble Space Telescope (HST) to obtain deep, high-resolution images of two intermediate-age star clusters in the Large Magellanic Cloud of relatively low mass (≈10^4 M☉) and significantly different core radii, namely NGC 2209 and NGC 2249. For comparison purposes, we also reanalyzed archival HST images of NGC 1795 and IC 2146, two other relatively low-mass star clusters. From the comparison of the observed color-magnitude diagrams with Monte Carlo simulations, we find that the main-sequence turnoff (MSTO) regions in NGC 2209 and NGC 2249 are significantly wider than that derived from simulations of simple stellar populations, while those in NGC 1795 and IC 2146 are not. We determine the evolution of the clusters' masses and escape velocities from an age of 10 Myr to the present age. We find that differences among these clusters can be explained by dynamical evolution arguments if the currently extended clusters (NGC 2209 and IC 2146) experienced stronger levels of initial mass segregation than the currently compact ones (NGC 2249 and NGC 1795). Under this assumption, we find that NGC 2209 and NGC 2249 have estimated escape velocities, V_esc ≳ 15 km s^-1 at an age of 10 Myr, large enough to retain material ejected by slow winds of first-generation stars, while the two clusters that do not feature extended MSTOs have V_esc ≲ 12 km s^-1 at that age. These results suggest that the extended MSTO phenomenon can be better explained by a range of stellar ages rather than a range of stellar rotation velocities or interacting binaries.
Mainshock-Aftershocks Clustering Detection in Volcanic Regions
NASA Astrophysics Data System (ADS)
Garza Giron, R.; Brodsky, E. E.; Prejean, S. G.
2017-12-01
Crustal earthquakes tend to break their general Poissonian process behavior by gathering into two main kinds of seismic bursts: swarms and mainshock-aftershocks sequences. The former is commonly related to volcanic or geothermal processes whereas the latter is a characteristic feature of tectonically driven seismicity. We explore the mainshock-aftershock clustering behavior of different active volcanic regions in Japan and its comparison to non-volcanic regions. We find that aftershock production in volcanoes shows mainshock-aftershocks clustering similar to what is observed in non-volcanic areas. The ratio of volcanic areas that cluster in mainshock-aftershocks sequences vs the areas that do not is comparable to the ratio of non-volcanic regions that show clustering vs the ones that do not. Furthermore, the level of production of aftershocks for most volcanic areas where clustering is present seems to be of the same order of magnitude, or slightly higher, as the median of the non-volcanic regions. An interesting example of highly aftershock-productive volcanoes emerges from the 2000 Miyakejima dike intrusion. A big seismic cluster started to build up rapidly in the south-west flank of Miyakejima to later propagate to the north-west towards the Kozushima and Niijima volcanoes. In Miyakejima the seismicity showed a swarm-like signature with a constant earthquake rate, whereas Kozushima and Niijima both had expressions of highly productive mainshock-aftershocks sequences. These findings are surprising given the alternative mechanisms available in volcanic systems for releasing deviatoric strain. We speculate that aftershock behavior might hold a relationship with the rheological properties of the rocks of each system and with the capacity of a system to accumulate or release the internal pressures caused by magmatic or hydrothermal systems.
What Happens in the Atmospheres of Hot Horizontal Branch Stars Near 20,000K?
NASA Astrophysics Data System (ADS)
Brown, Thomas
2016-10-01
In the color-magnitude diagrams (CMDs) of many globular clusters, the horizontal branch (HB) exhibits a long blue tail extending to high effective temperatures. In such clusters, two discontinuities appear within the HB locus. The first discontinuity occurs at 12,000K, and was discovered by Grundahl et al. (1998). It is associated with the radiative levitation of metals and the gravitational settling of helium in the atmospheres of HB stars hotter than 12,000K. The hot subdwarf stars of the Galactic field population exhibit the same phenomenon. The second discontinuity occurs at 20,000K, and was discovered by Momany et al. (2002). Its origin is unknown, but it appears at the same effective temperature in all globular clusters hosting HB stars near 20,000K, regardless of cluster properties (age, chemical composition, mass, etc.). We propose STIS long-slit spectroscopy of 6 HB stars that straddle this feature in the HB distribution of omega Cen, the nearest globular cluster where the feature is well populated. With this approach, we can efficiently obtain high-quality UV and blue spectra that span the full wavelength range of the photometric bands where this CMD feature is most prominent, a range that is only accessible by HST. The resulting spectra will unambiguously reveal the nature of this phenomenon, one that is universal in the atmospheres of hot evolved stars, and will yield new insight into the role of diffusion and radiative levitation in these stars.
JPL-20171101-WHATSUf-0001-What's Up November 2017
2017-11-01
Monthly series for amateur astronomers. November features: Viewing the moon, star clusters (the Pleiades, M35, and the Beehive Cluster), and a close pairing of Venus and Jupiter. Plus three meteor showers: the Leonids, the northern and southern Taurids, and the Orionids.
Objects and categories: feature statistics and object processing in the ventral stream.
Tyler, Lorraine K; Chiu, Shannon; Zhuang, Jie; Randall, Billi; Devereux, Barry J; Wright, Paul; Clarke, Alex; Taylor, Kirsten I
2013-10-01
Recognizing an object involves more than just visual analyses; its meaning must also be decoded. Extensive research has shown that processing the visual properties of objects relies on a hierarchically organized stream in ventral occipitotemporal cortex, with increasingly more complex visual features being coded from posterior to anterior sites culminating in the perirhinal cortex (PRC) in the anteromedial temporal lobe (aMTL). The neurobiological principles of the conceptual analysis of objects remain more controversial. Much research has focused on two neural regions, the fusiform gyrus and the aMTL, both of which show semantic category differences, but of different types. fMRI studies show category differentiation in the fusiform gyrus, based on clusters of semantically similar objects, whereas category-specific deficits, specifically for living things, are associated with damage to the aMTL. These category-specific deficits for living things have been attributed to problems in differentiating between highly similar objects, a process that involves the PRC. To determine whether the PRC and the fusiform gyri contribute to different aspects of an object's meaning, with differentiation between confusable objects in the PRC and categorization based on object similarity in the fusiform, we carried out an fMRI study of object processing based on a feature-based model that characterizes the degree of semantic similarity and difference between objects and object categories. Participants saw 388 objects for which feature statistic information was available and named the objects at the basic level while undergoing fMRI scanning.
After controlling for the effects of visual information, we found that feature statistics that capture similarity between objects formed category clusters in fusiform gyri, such that objects with many shared features (typical of living things) were associated with activity in the lateral fusiform gyri whereas objects with fewer shared features (typical of nonliving things) were associated with activity in the medial fusiform gyri. Significantly, a feature statistic reflecting differentiation between highly similar objects, enabling object-specific representations, was associated with bilateral PRC activity. These results confirm that the statistical characteristics of conceptual object features are coded in the ventral stream, supporting a conceptual feature-based hierarchy, and integrating disparate findings of category responses in fusiform gyri and category deficits in aMTL into a unifying neurocognitive framework.
NASA Astrophysics Data System (ADS)
Waltmunson, Jeremy C.
2005-07-01
This study has investigated the L2 acquisition of Spanish word-medial /d, t, r, ɾ/, word-initial /r/, and onset cluster /ɾ/. Two similar experiments were designed to address the relative degree of difficulty of the word-medial contrasts, as well as the effect of word-position on /r/ and /ɾ/ accuracy scores. In addition, the effect of vowel height on the production of [r] and the L2 emergence of the svarabhakti vowel in onset cluster /ɾ/ were investigated. Participants included 34 L1 English speakers from a range of L2 Spanish levels who were recorded in multiple sessions across a 6-month or 2-month period. The criteria for assessing segment accuracy were based on auditory and acoustic features found in productions by native Spanish speakers. In order to be scored as accurate, the L2 productions had to evidence both the auditory and acoustic features found in native speaker productions. L2 participant scores for each target were normalized in order to account for the variation of features found across native speaker productions. The results showed that word-medial accuracy scores followed two significant rankings (from lowest to highest): /r <= d <= ɾ <= t/ and /r <= ɾ <= d <= t/; however, when scores for /t/ included a voice onset time criterion, only the ranking /r <= ɾ <= d <= t/ was significant. These results suggest that /r/ is most difficult for learners while /t/ is least difficult, although individual variation was found. Regarding /r/, there was a strong effect of word position and vowel height on accuracy scores. For productions of /ɾ/, there was a strong effect of syllable position on accuracy scores. Acoustic analyses of taps in onset cluster revealed that only the experienced L2 Spanish participants demonstrated svarabhakti vowel emergence with native-like performance, suggesting that its emergence occurs relatively late in L2 acquisition.
NASA Astrophysics Data System (ADS)
Hummel, M.; Wood, N. J.; Stacey, M. T.; Schweikert, A.; Barnard, P.; Erikson, L. H.
2016-12-01
The threat of tidal flooding in coastal regions is exacerbated by sea level rise (SLR), which can lead to more frequent and persistent nuisance flooding and permanent inundation of low-lying areas. When coupled with extreme storm events, SLR also increases the extent and depth of flooding due to storm surges. To mitigate these impacts, bayfront communities are considering a variety of options for shoreline protection, including restoration of natural features such as wetlands and hardening of the shoreline using levees and sea walls. These shoreline modifications can produce changes in the tidal dynamics in a basin, either by increasing dissipation of tidal energy or enhancing tidal amplification [1]. As a result, actions taken by individual communities not only impact local inundation, but can also have implications for flooding on a regional scale. However, regional collaboration is lacking in flood mitigation planning, which is often done on a community-by-community basis. This can lead to redundancy in planning efforts and can also have adverse effects on communities that are not included in discussions about shoreline infrastructure improvements. Using flooding extent outputs from a hydrodynamic model of San Francisco Bay, we performed a K-means clustering analysis to identify similarities between 65 bayfront communities in terms of the spatial, demographic, and economic characteristics of their vulnerable assets for a suite of SLR and storm scenarios. Our clustering analysis identifies communities with similar vulnerabilities and allows for more effective collaboration and decision-making at a regional level by encouraging comparable communities to work together and pool resources to find effective adaptation strategies as flooding becomes more frequent and severe. [1] Holleman RC, Stacey MT (2014) Coupling of sea level rise, tidal amplification, and inundation. Journal of Physical Oceanography 44:1439-1455.
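The K-means step named in this abstract can be sketched in a few lines. This is a minimal illustration of Lloyd's algorithm, assuming each community is reduced to a short tuple of normalized vulnerability features; the feature values and the function name `kmeans` are hypothetical and are not taken from the study.

```python
import random

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical (exposed population share, exposed asset share) per community.
communities = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.18),
               (0.90, 0.80), (0.85, 0.95), (0.92, 0.88)]
labels, centers = kmeans(communities, k=2)
```

Communities that share a label would be candidates for joint adaptation planning; in practice the features would be standardized and the number of clusters chosen with a validity measure.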
Diversification versus Specialization in Complex Ecosystems
Di Clemente, Riccardo; Chiarotti, Guido L.; Cristelli, Matthieu; Tacchella, Andrea; Pietronero, Luciano
2014-01-01
By analyzing the distribution of revenues across the production sectors of quoted firms we suggest a novel dimension that drives the firms' diversification process at country level. Data show a nontrivial macro-regional clustering of the diversification process, which underlines the relevance of geopolitical environments in determining the microscopic dynamics of economic entities. These findings demonstrate the possibility of singling out in complex ecosystems those micro-features that emerge at macro-levels, which could be of particular relevance for decision-makers in selecting the appropriate parameters to be acted upon in order to achieve desirable results. The understanding of this micro-macro information exchange is further deepened through the introduction of a simplified dynamic model. PMID:25384059
NASA Astrophysics Data System (ADS)
Sams, Michael; Silye, Rene; Göhring, Janett; Muresan, Leila; Schilcher, Kurt; Jacak, Jaroslaw
2014-01-01
We present a cluster spatial analysis method using nanoscopic dSTORM images to determine changes in protein cluster distributions within brain tissue. Such methods are suitable to investigate human brain tissue and will help to achieve a deeper understanding of brain disease along with aiding drug development. Human brain tissue samples are usually treated postmortem via standard fixation protocols, which are established in clinical laboratories. Therefore, our localization microscopy-based method was adapted to characterize protein density and protein cluster localization in samples fixed using different protocols followed by common fluorescent immunohistochemistry techniques. The localization microscopy allows nanoscopic mapping of serotonin 5-HT1A receptor groups within a two-dimensional image of a brain tissue slice. These nanoscopically mapped proteins can be confined to clusters by applying the proposed statistical spatial analysis. Selected features of such clusters were subsequently used to characterize and classify the tissue. Samples were obtained from different types of patients, fixed with different preparation methods, and finally stored in a human tissue bank. To verify the proposed method, samples of a cryopreserved healthy brain have been compared with epitope-retrieved and paraffin-fixed tissues. Furthermore, samples of healthy brain tissues were compared with data obtained from patients suffering from mental illnesses (e.g., major depressive disorder). Our work demonstrates the applicability of localization microscopy and image analysis methods for comparison and classification of human brain tissues at a nanoscopic level. Furthermore, the presented workflow marks a unique technological advance in the characterization of protein distributions in brain tissue sections.
Schuppert, H Marieke; Albers, Casper J; Minderaa, Ruud B; Emmelkamp, Paul Mg; Nauta, Maaike H
2012-08-27
A combination of multiple factors, including a strong genetic predisposition and environmental factors, is considered to contribute to the developmental pathways to borderline personality disorder (BPD). However, these factors have mostly been investigated retrospectively, and hardly in adolescents. The current study focuses on maternal factors in BPD features in adolescence. Actual parenting was investigated in a group of referred adolescents with BPD features (N = 101) and a healthy control group (N = 44). Self-reports of perceived concurrent parenting were completed by the adolescents. Questionnaires on parental psychopathology (both Axis I and Axis II disorders) were completed by their mothers. Adolescents reported significantly less emotional warmth, more rejection and more overprotection from their mothers in the BPD group than in the control group. Mothers in the BPD group reported significantly more parenting stress compared to mothers in the control group. Also, these mothers showed significantly more general psychopathology and cluster C personality traits than mothers in the control group. Contrary to expectations, mothers of adolescents with BPD features reported the same level of cluster B personality traits, compared to mothers in the control group. Hierarchical logistic regression revealed that parental rearing styles (less emotional warmth, and more overprotection) and general psychopathology of the mother were the strongest factors differentiating between controls and adolescents with BPD symptoms. Adolescents with BPD features experience less emotional warmth and more overprotection from their mothers, while the mothers themselves report more symptoms of anxiety and depression. Addition of family interventions to treatment programs for adolescents might increase the effectiveness of such early interventions, and prevent the adverse outcome that is often seen in adult BPD patients.
Discovering semantic features in the literature: a foundation for building functional associations
Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto
2006-01-01
Background: Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results: We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, utilized in gene/protein classification or can be even combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion: The presented method can create literature profiles from documents relevant to sets of genes.
The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716
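The non-negative matrix factorization named in this abstract can be sketched with the standard multiplicative update rules for the squared-error objective. This is an illustrative sketch only: the toy matrix, the rank, and the helper names are assumptions, and the abstract does not state which NMF variant or objective the authors used.

```python
import random

def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix V as W @ H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h

def squared_error(v, w, h):
    approx = matmul(w, h)
    return sum((a - b) ** 2 for ra, rb in zip(v, approx) for a, b in zip(ra, rb))

# Hypothetical gene-by-term weight matrix with two obvious blocks.
v = [[3, 2, 0, 0], [2, 3, 0, 0], [0, 0, 4, 1], [0, 0, 1, 4]]
w, h = nmf(v, rank=2)
w_early, h_early = nmf(v, rank=2, iterations=1)
```

Each row of `h` plays the role of one semantic feature (a weighting over terms), and each row of `w` expresses one gene as an additive combination of those features.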
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gobrecht, David; Cristallo, Sergio; Piersanti, Luciano
Silicon carbide (SiC) grains are a major dust component in carbon-rich asymptotic giant branch stars. However, the formation pathways of these grains are not fully understood. We calculate ground states and energetically low-lying structures of (SiC)n, n = 1, 16 clusters by means of simulated annealing and Monte Carlo simulations of seed structures and subsequent quantum-mechanical calculations on the density functional level of theory. We derive the infrared (IR) spectra of these clusters and compare the IR signatures to observational and laboratory data. According to energetic considerations, we evaluate the viability of SiC cluster growth at several densities and temperatures, characterizing various locations and evolutionary states in circumstellar envelopes. We discover new, energetically low-lying structures for Si4C4, Si5C5, Si15C15, and Si16C16 and new ground states for Si10C10 and Si15C15. The clusters with carbon-segregated substructures tend to be more stable by 4–9 eV than their bulk-like isomers with alternating Si–C bonds. However, we find ground states with cage geometries resembling buckminsterfullerenes (“bucky-like”) for Si12C12 and Si16C16 and low-lying stable cage structures for n ≥ 12. The latter findings thus indicate a regime of cluster sizes that differ from small clusters as well as from large-scale crystals. Thus, and owing to their stability and geometry, the latter clusters may mark a transition from a quantum-confined cluster regime to a crystalline, solid bulk-material. The calculated vibrational IR spectra of the ground-state SiC clusters show significant emission. They include the 10–13 μm wavelength range and the 11.3 μm feature inferred from laboratory measurements and observations, respectively, although the overall intensities are rather low.
Clustering by reordering of similarity and Laplacian matrices: Application to galaxy clusters
NASA Astrophysics Data System (ADS)
Mahmoud, E.; Shoukry, A.; Takey, A.
2018-04-01
Similarity metrics, kernels and similarity-based algorithms have gained much attention due to their increasing applications in information retrieval, data mining, pattern recognition and machine learning. Similarity graphs are often adopted as the underlying representation of similarity matrices and are at the origin of known clustering algorithms such as spectral clustering. Similarity matrices offer the advantage of working in object-object (two-dimensional) space, where visualization of cluster similarities is available, instead of object-features (multi-dimensional) space. In this paper, sparse ɛ-similarity graphs are constructed and decomposed into strong components using appropriate methods such as Dulmage-Mendelsohn permutation (DMperm) and/or Reverse Cuthill-McKee (RCM) algorithms. The obtained strong components correspond to groups (clusters) in the input (feature) space. The parameter ɛ_i is estimated locally, at each data point i, from a corresponding narrow range of the number of nearest neighbors. Although more advanced clustering techniques are available, our method has the advantages of simplicity, better complexity and direct visualization of the cluster similarities in a two-dimensional space. Also, no prior information about the number of clusters is needed. We conducted our experiments on two- and three-dimensional, low- and high-sized synthetic datasets as well as on a real astronomical dataset. The results are verified graphically and analyzed using gap statistics over a range of neighbors to verify the robustness of the algorithm and the stability of the results. Combining the proposed algorithm with gap statistics provides a promising tool for solving clustering problems. An astronomical application is conducted for confirming the existence of 45 galaxy clusters around the X-ray positions of galaxy clusters in the redshift range 0.1 to 0.8.
We re-estimate the photometric redshifts of the identified galaxy clusters and obtain acceptable values compared to published spectroscopic redshifts with a 0.029 standard deviation of their differences.
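The graph-based grouping described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the local radius is taken here as the distance to the k-th nearest neighbour, two points are linked when they fall within either point's radius, and groups are found with a plain breadth-first search instead of the DMperm or RCM permutations named in the abstract. All function names are hypothetical.

```python
import math
from collections import deque

def local_epsilon(points, k):
    """Assumed rule: eps_i is the distance from point i to its k-th nearest neighbour."""
    eps = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        eps.append(dists[k - 1])
    return eps

def epsilon_graph(points, k):
    """Sparse undirected graph: link i and j if they lie within either local radius."""
    eps = local_epsilon(points, k)
    adj = [[] for _ in points]
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= max(eps[i], eps[j]):
                adj[i].append(j)
                adj[j].append(i)
    return adj

def components(adj):
    """Label connected components with breadth-first search (swapped in for DMperm/RCM)."""
    labels = [-1] * len(adj)
    current = 0
    for start in range(len(adj)):
        if labels[start] != -1:
            continue
        labels[start] = current
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if labels[v] == -1:
                    labels[v] = current
                    queue.append(v)
        current += 1
    return labels

# Two well-separated toy groups (made-up coordinates).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = components(epsilon_graph(points, k=2))
# Sorting indices by label reorders the similarity matrix into diagonal blocks.
order = sorted(range(len(points)), key=lambda i: labels[i])
```

No number of clusters is passed in; it emerges from the number of components, which matches the abstract's remark that no prior information about the cluster count is needed.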
NASA Astrophysics Data System (ADS)
Pasquato, Mario; Chung, Chul
2016-05-01
Context. Machine-learning (ML) solves problems by learning patterns from data with limited or no human guidance. In astronomy, ML is mainly applied to large observational datasets, e.g. for morphological galaxy classification. Aims: We apply ML to gravitational N-body simulations of star clusters that are either formed by merging two progenitors or evolved in isolation, planning to later identify globular clusters (GCs) that may have a history of merging from observational data. Methods: We create mock-observations from simulated GCs, from which we measure a set of parameters (also called features in the machine-learning field). After carrying out dimensionality reduction on the feature space, the resulting datapoints are fed in to various classification algorithms. Using repeated random subsampling validation, we check whether the groups identified by the algorithms correspond to the underlying physical distinction between mergers and monolithically evolved simulations. Results: The three algorithms we considered (C5.0 trees, k-nearest neighbour, and support-vector machines) all achieve a test misclassification rate of about 10% without parameter tuning, with support-vector machines slightly outperforming the others. The first principal component of feature space correlates with cluster concentration. If we exclude it from the regression, the performance of the algorithms is only slightly reduced.
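Repeated random subsampling validation, as mentioned in the Methods, can be illustrated with a short sketch. It assumes a k-nearest-neighbour classifier, a 30% test fraction, and synthetic two-dimensional features; none of these settings or names come from the paper.

```python
import math
import random

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

def repeated_subsampling_error(data, n_repeats=20, test_fraction=0.3, k=3, seed=0):
    """Average test misclassification rate over repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, int(len(data) * test_fraction))
    errors = []
    for _ in range(n_repeats):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        wrong = sum(knn_predict(train, x, k) != y for x, y in test)
        errors.append(wrong / n_test)
    return sum(errors) / len(errors)

# Synthetic stand-in for reduced features of simulated clusters (not real data).
rng = random.Random(1)
data = [((rng.gauss(0, 0.5), rng.gauss(0, 0.5)), "isolated") for _ in range(30)]
data += [((rng.gauss(4, 0.5), rng.gauss(4, 0.5)), "merger") for _ in range(30)]
error = repeated_subsampling_error(data)
```

Averaging over many random splits gives a steadier error estimate than a single held-out set when the number of simulations is small.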
Rahman, Quazi Abidur; Pirbaglou, Meysam; Ritvo, Paul; Heffernan, Jane M; Clarke, Hance; Katz, Joel
2017-01-01
Background: Pain is one of the most prevalent health-related concerns and is among the top 3 most common reasons for seeking medical help. Scientific publications of data collected from pain tracking and monitoring apps are important to help consumers and healthcare professionals select the right app for their use. Objective: The main objectives of this paper were to (1) discover user engagement patterns of the pain management app, Manage My Pain, using data mining methods; and (2) identify the association between several attributes characterizing individual users and their levels of engagement. Methods: User engagement was defined by 2 key features of the app: longevity (number of days between the first and last pain record) and number of records. Users were divided into 5 user engagement clusters employing the k-means clustering algorithm. Each cluster was characterized by 6 attributes: gender, age, number of pain conditions, number of medications, pain severity, and opioid use. Z tests and chi-square tests were used for analyzing categorical attributes. Effects of gender and cluster on numerical attributes were analyzed using 2-way analysis of variances (ANOVAs) followed up by pairwise comparisons using Tukey honest significant difference (HSD). Results: The clustering process produced 5 clusters representing different levels of user engagement. The proportion of males and females was significantly different in 4 of the 5 clusters (all P ≤.03). The proportion of males was higher than females in users with relatively high longevity. Mean ages of users in 2 clusters with high longevity were higher than users from other 3 clusters (all P <.001). Overall, males were significantly older than females (P <.001). Across clusters, females reported more pain conditions than males (all P <.001). Users from highly engaged clusters reported taking more medication than less engaged users (all P <.001). Females reported taking a greater number of medications than males (P =.04).
In 4 of 5 clusters, the percentage of males taking an opioid was significantly greater (all P ≤.05) than that of females. The proportion of males with mild pain was significantly higher than that of females in 3 clusters (all P ≤.008). Conclusions: Although most users of the app reported being female, male users were more likely to be highly engaged in the app. Users in the most engaged clusters self-reported a higher number of pain conditions, a higher number of current medications, and a higher incidence of opioid usage. The high engagement by males in these clusters does not appear to be driven by pain severity which may, in part, be the case for females. Use of a mobile pain app may be relatively more attractive to highly-engaged males than highly-engaged females, and to those with relatively more complex chronic pain problems. PMID:28701291
Rahman, Quazi Abidur; Janmohamed, Tahir; Pirbaglou, Meysam; Ritvo, Paul; Heffernan, Jane M; Clarke, Hance; Katz, Joel
2017-07-12
Pain is one of the most prevalent health-related concerns and is among the top 3 most common reasons for seeking medical help. Scientific publications of data collected from pain tracking and monitoring apps are important to help consumers and healthcare professionals select the right app for their use. The main objectives of this paper were to (1) discover user engagement patterns of the pain management app, Manage My Pain, using data mining methods; and (2) identify the association between several attributes characterizing individual users and their levels of engagement. User engagement was defined by 2 key features of the app: longevity (number of days between the first and last pain record) and number of records. Users were divided into 5 user engagement clusters employing the k-means clustering algorithm. Each cluster was characterized by 6 attributes: gender, age, number of pain conditions, number of medications, pain severity, and opioid use. Z tests and chi-square tests were used for analyzing categorical attributes. Effects of gender and cluster on numerical attributes were analyzed using 2-way analysis of variances (ANOVAs) followed up by pairwise comparisons using Tukey honest significant difference (HSD). The clustering process produced 5 clusters representing different levels of user engagement. The proportion of males and females was significantly different in 4 of the 5 clusters (all P ≤.03). The proportion of males was higher than females in users with relatively high longevity. Mean ages of users in 2 clusters with high longevity were higher than users from other 3 clusters (all P <.001). Overall, males were significantly older than females (P <.001). Across clusters, females reported more pain conditions than males (all P <.001). Users from highly engaged clusters reported taking more medication than less engaged users (all P <.001). Females reported taking a greater number of medications than males (P =.04). 
In 4 of 5 clusters, the percentage of males taking an opioid was significantly greater (all P ≤.05) than that of females. The proportion of males with mild pain was significantly higher than that of females in 3 clusters (all P ≤.008). Although most users of the app reported being female, male users were more likely to be highly engaged in the app. Users in the most engaged clusters self-reported a higher number of pain conditions, a higher number of current medications, and a higher incidence of opioid usage. The high engagement by males in these clusters does not appear to be driven by pain severity which may, in part, be the case for females. Use of a mobile pain app may be relatively more attractive to highly-engaged males than highly-engaged females, and to those with relatively more complex chronic pain problems. ©Quazi Abidur Rahman, Tahir Janmohamed, Meysam Pirbaglou, Paul Ritvo, Jane M Heffernan, Hance Clarke, Joel Katz. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 12.07.2017.
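The engagement clustering step can be sketched with a small k-means routine over the two engagement features (longevity and number of records). This is an illustrative sketch only: the user data are made up, the deterministic farthest-point initialisation is an assumption chosen for reproducibility, and real data would likely need feature scaling first.

```python
import math

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation (assumed)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each user goes to the nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Hypothetical users as (longevity in days, number of records).
users = [(1, 2), (2, 1), (3, 3), (200, 10), (210, 12), (190, 8),
         (400, 300), (420, 310), (390, 290)]
labels, centers = kmeans(users, k=3)
```

The study itself used 5 clusters; k=3 is used here only because the toy data contain three obvious groups.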
Neuro- and social-cognitive clustering highlights distinct profiles in adults with anorexia nervosa.
Renwick, Beth; Musiat, Peter; Lose, Anna; DeJong, Hannah; Broadbent, Hannah; Kenyon, Martha; Loomes, Rachel; Watson, Charlotte; Ghelani, Shreena; Serpell, Lucy; Richards, Lorna; Johnson-Sabine, Eric; Boughton, Nicky; Treasure, Janet; Schmidt, Ulrike
2015-01-01
This study aimed to explore the neuro- and social-cognitive profile of a consecutive series of adult outpatients with anorexia nervosa (AN) when compared with widely available age and gender matched historical control data. The relationship between performance profiles, clinical characteristics, service utilization, and treatment adherence was also investigated. Consecutively recruited outpatients with a broad diagnosis of AN (restricting subtype AN-R: n = 44, binge-purge subtype AN-BP: n = 33 or Eating Disorder Not Otherwise Specified-AN subtype EDNOS-AN: n = 23) completed a comprehensive set of neurocognitive (set-shifting, central coherence) and social-cognitive measures (Emotional Theory of Mind). Data were subjected to hierarchical cluster analysis and a discriminant function analysis. Three separate, meaningful clusters emerged. Cluster 1 (n = 45) showed overall average to high average neuro- and social- cognitive performance, Cluster 2 (n = 38) showed mixed performance characterized by distinct strengths and weaknesses, and Cluster 3 (n = 17) showed poor overall performance (Autism Spectrum disorder (ASD) like cluster). The three clusters did not differ in terms of eating disorder symptoms, comorbid features or service utilization and treatment adherence. A discriminant function analysis confirmed that the clusters were best characterized by performance in perseveration and set-shifting measures. The findings suggest that considerable neuro- and social-cognitive heterogeneity exists in patients with AN, with a subset showing ASD-like features. The value of this method of profiling in predicting longer term patient outcomes and in guiding development of etiologically targeted treatments remains to be seen. © 2014 Wiley Periodicals, Inc.
The myeloproliferative neoplasms, unclassifiable: clinical and pathological considerations.
Gianelli, Umberto; Cattaneo, Daniele; Bossi, Anna; Cortinovis, Ivan; Boiocchi, Leonardo; Liu, Yen-Chun; Augello, Claudia; Bonometti, Arturo; Fiori, Stefano; Orofino, Nicola; Guidotti, Francesca; Orazi, Attilio; Iurlo, Alessandra
2017-02-01
In this study, we investigate in detail the morphological, clinical and molecular features of 71 consecutive patients with a diagnosis of myeloproliferative neoplasms, unclassifiable. We performed a meticulous morphological analysis and found that most of the cases displayed a hypercellular bone marrow (70%) with normal erythropoiesis without left-shifting (59%), increased granulopoiesis with left-shifting (73%) and increased megakaryocytes with loose clustering (96%). Megakaryocytes displayed frequent giant forms with hyperlobulated or bulbous nuclei and/or other maturation defects. Interestingly, more than half of the cases displayed severe bone marrow fibrosis (59%). Median values of hemoglobin level and white blood cells count were all within the normal range; in contrast, median platelets count and lactate dehydrogenase were increased. Little less than half of the patients (44%) showed splenomegaly. JAK2V617F mutation was detected in 72% of all patients. Among the JAK2-negative cases, MPLW515L mutation was found in 17% and CALR mutations in 67% of the investigated cases, respectively. Finally, by multiple correspondence analysis of the morphological profiles, we found that all but four of the cases could be grouped in three morphological clusters with some features similar to those of the classic BCR-ABL1-negative myeloproliferative neoplasms. Analysis of the clinical parameters in these three clusters revealed discrepancies with the morphological profile in about 55% of the patients. In conclusion, we found that the category of myeloproliferative neoplasm, unclassifiable is heterogeneous but identification of different subgroups is possible and should be recommended for a better management of these patients.
Kumar, Surendra; Ghosh, Subhojit; Tetarway, Suhash; Sinha, Rakesh Kumar
2015-07-01
In this study, the magnitude and spatial distribution of the frequency spectrum in the resting electroencephalogram (EEG) were examined to address the problem of detecting alcoholism in the cerebral motor cortex. EEG signals were recorded from subjects with chronic alcoholism (n = 20) and from a control group (n = 20). Data were taken from the motor cortex region and divided into five sub-bands (delta, theta, alpha, beta-1 and beta-2). Three methodologies were adopted for feature extraction: (1) absolute power, (2) relative power and (3) peak power frequency. The dimension of the extracted features was reduced by linear discriminant analysis, and the features were classified by a support vector machine (SVM) and fuzzy C-means clustering. The maximum classification accuracy (88%) with the SVM was achieved with the EEG spectral features with absolute power frequency on the F4 channel. Among the bands, relatively higher classification accuracy was found over the theta and beta-2 bands in most of the channels when computed with the EEG features of relative power. Electrode-wise, CZ, C3 and P4 showed the most alteration. Considering the good classification accuracy obtained by the SVM with relative band power features in most of the EEG channels of the motor cortex, it can be suggested that a noninvasive automated online diagnostic system for the chronic alcoholic condition can be developed with the help of EEG signals.
Ortiz-Rosario, Alexis; Adeli, Hojjat; Buford, John A
2017-01-15
Researchers often rely on simple methods to identify involvement of neurons in a particular motor task. The historical approach has been to inspect large groups of neurons and subjectively separate neurons into groups based on the expertise of the investigator. In cases where neuron populations are small it is reasonable to inspect these neuronal recordings and their firing rates carefully to avoid data omissions. In this paper, a new methodology is presented for automatic objective classification of neurons recorded in association with behavioral tasks into groups. By identifying characteristics of neurons in a particular group, the investigator can then identify functional classes of neurons based on their relationship to the task. The methodology is based on integration of a multiple signal classification (MUSIC) algorithm to extract relevant features from the firing rate and an expectation-maximization Gaussian mixture algorithm (EM-GMM) to cluster the extracted features. The methodology is capable of identifying and clustering similar firing rate profiles automatically based on specific signal features. An empirical wavelet transform (EWT) was used to validate the features found in the MUSIC pseudospectrum and the resulting signal features captured by the methodology. Additionally, this methodology was used to inspect behavioral elements of neurons to physiologically validate the model. This methodology was tested using a set of data collected from awake behaving non-human primates. Copyright © 2016 Elsevier B.V. All rights reserved.
Personality assessment of homeless adults as a tool for service planning.
Tolomiczenko, G S; Sota, T; Goering, P N
2000-01-01
The psychiatric status of homeless adults has been described primarily in terms of Axis I disorders. By adding a subset of the Personality Assessment Inventory, this study tests the feasibility and usefulness of a brief, self-administered questionnaire to obtain scores on several dimensions of personality. Cluster analysis sorted 112 tested subjects into four groups characterized by distinct profiles. Two of these were characterized by extreme scores on pathological dimensions of personality (borderline features, antisocial traits, and aggressivity) and differed primarily on the dimension of suicidality. The third reflected moderate levels of personality dysfunction and the fourth did not deviate from adult nonclinical norms. The validity of the clusters was supported by demographic, background, and diagnostic subgroup differences. Brief personality assessment can be a cost-effective approach to matching services with clinical needs of homeless adults by attending to interpersonal dimensions that will likely affect service provision.
Finding Clothing That Fit through Cluster Analysis and Objective Interestingness Measures
NASA Astrophysics Data System (ADS)
Peña, Isis; Viktor, Herna L.; Paquet, Eric
Clothes should fit consumers well, be aesthetically pleasing and comfortable. However, repeated studies of customers’ levels of satisfaction indicate that this is often not the case. For example, more robust males often find it difficult to find pants that are the correct length and fit their waists well. What, then, are the typical body profiles of the population? Would it be possible to identify the measurements that are of importance for different sizes and genders? Furthermore, assuming that we have access to an anthropometric database would there be a way to guide the data mining process to discover only those relevant body measurements that are of the most interest for apparel designers? This paper describes our results when addressing these questions through cluster analysis and interestingness measures-based feature selection. We explore a database containing anthropometric measurements as well as 3-D body scans, of a representative sample of the Dutch population.
Wang, Juan; Nishikawa, Robert M; Yang, Yongyi
2016-01-01
In computer-aided detection of microcalcifications (MCs), the detection accuracy is often compromised by frequent occurrence of false positives (FPs), which can be attributed to a number of factors, including imaging noise, inhomogeneity in tissue background, linear structures, and artifacts in mammograms. In this study, the authors investigated a unified classification approach for combating the adverse effects of these heterogeneous factors for accurate MC detection. To accommodate FPs caused by different factors in a mammogram image, the authors developed a classification model to which the input features were adapted according to the image context at a detection location. For this purpose, the input features were defined in two groups, of which one group was derived from the image intensity pattern in a local neighborhood of a detection location, and the other group was used to characterize how a MC is different from its structural background. Owing to the distinctive effect of linear structures in the detector response, the authors introduced a dummy variable into the unified classifier model, which allowed the input features to be adapted according to the image context at a detection location (i.e., presence or absence of linear structures). To suppress the effect of inhomogeneity in tissue background, the input features were extracted from different domains aimed for enhancing MCs in a mammogram image. To demonstrate the flexibility of the proposed approach, the authors implemented the unified classifier model by two widely used machine learning algorithms, namely, a support vector machine (SVM) classifier and an Adaboost classifier. In the experiment, the proposed approach was tested for two representative MC detectors in the literature [difference-of-Gaussians (DoG) detector and SVM detector]. 
The detection performance was assessed using free-response receiver operating characteristic (FROC) analysis on a set of 141 screen-film mammogram (SFM) images (66 cases) and a set of 188 full-field digital mammogram (FFDM) images (95 cases). The FROC analysis results show that the proposed unified classification approach can significantly improve the detection accuracy of two MC detectors on both SFM and FFDM images. Despite the difference in performance between the two detectors, the unified classifiers can reduce their FP rate to a similar level in the output of the two detectors. In particular, with true-positive rate at 85%, the FP rate on SFM images for the DoG detector was reduced from 1.16 to 0.33 clusters/image (unified SVM) and 0.36 clusters/image (unified Adaboost), respectively; similarly, for the SVM detector, the FP rate was reduced from 0.45 clusters/image to 0.30 clusters/image (unified SVM) and 0.25 clusters/image (unified Adaboost), respectively. Similar FP reduction results were also achieved on FFDM images for the two MC detectors. The proposed unified classification approach can be effective for discriminating MCs from FPs caused by different factors (such as MC-like noise patterns and linear structures) in MC detection. The framework is general and can be applicable for further improving the detection accuracy of existing MC detectors.
Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug
2011-11-01
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
Isolation and Versatile Derivatization of an Unsaturated Anionic Silicon Cluster (Siliconoid)
Willmes, Philipp; Leszczyńska, Kinga; Heider, Yannic; Abersfelder, Kai; Zimmer, Michael; Huch, Volker
2016-01-01
The characteristic features of bulk silicon surfaces are echoed in the related partially substituted (and thus unsaturated) neutral silicon clusters (siliconoids). The incorporation of siliconoids into more‐extended frameworks is promising owing to their unique electronic features, but further developments in this regard are limited by the notable absence of functionalized siliconoid derivatives until now. Herein we report the isolation and full characterization of the lithium salt of an anionic R5Si6‐siliconoid, thus providing the missing link between silicon‐based Zintl anions and siliconoid clusters. Proof‐of‐principle for the high potential of this species for the efficient transfer of the intact unsaturated R5Si6 moiety is demonstrated by clean reactions with representative electrophiles of Groups 13, 14, and 15. PMID:26800440
NASA Astrophysics Data System (ADS)
Sa, Qila; Wang, Zhihui
2018-03-01
At present, content-based video retrieval (CBVR) is the mainstream video retrieval method; it uses features of the video itself to perform automatic identification and retrieval. A key technology in this method is shot segmentation. In this paper, a method for automatic video shot boundary detection using K-means clustering and an improved adaptive dual threshold comparison is proposed. First, the visual features of every frame are extracted and the frames are divided into two categories using the K-means clustering algorithm: those with significant change and those with no significant change. Then, based on the classification results, the improved adaptive dual threshold comparison method is used to determine both abrupt and gradual shot boundaries. Finally, an automatic video shot boundary detection system is implemented.
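The first stage described above, splitting frames into "significant change" and "no significant change" with K-means, might look like the following one-dimensional sketch. It assumes each frame is summarised by a single difference score against the previous frame (for example a histogram difference); the scores are made up, and the paper's improved adaptive dual threshold stage is not reproduced here.

```python
def two_means_1d(values, n_iter=100):
    """K-means with K=2 on scalar values; returns the (low, high) cluster means."""
    low, high = min(values), max(values)
    for _ in range(n_iter):
        low_group = [v for v in values if abs(v - low) <= abs(v - high)]
        high_group = [v for v in values if abs(v - low) > abs(v - high)]
        new_low = sum(low_group) / len(low_group)
        # Keep the old mean if no value is closer to the high center.
        new_high = sum(high_group) / len(high_group) if high_group else high
        if (new_low, new_high) == (low, high):
            break
        low, high = new_low, new_high
    return low, high

def significant_changes(frame_diffs):
    """Indices of frames whose difference score falls in the high-change cluster."""
    low, high = two_means_1d(frame_diffs)
    return [i for i, d in enumerate(frame_diffs) if abs(d - high) < abs(d - low)]

# Hypothetical per-frame difference scores in [0, 1].
frame_diffs = [0.02, 0.03, 0.01, 0.85, 0.02, 0.04, 0.90, 0.03]
candidates = significant_changes(frame_diffs)
```

Frames flagged here would be the candidates passed on to a threshold comparison that separates abrupt cuts from gradual transitions.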
Texture Analysis and Cartographic Feature Extraction.
1985-01-01
Investigations into using various image descriptors as well as developing interactive feature extraction software on the Digital Image Analysis Laboratory...system. Originator-supplied keywords: Ad-Hoc image descriptor; Bayes classifier; Bhattacharyya distance; Clustering; Digital Image Analysis Laboratory
Molecular evolution of the clustered MMIC-3 multigene family of Gossypium species
USDA-ARS's Scientific Manuscript database
Uniqueness, content, localization, and defense-related features of the root-knot nematode resistance-associated MIC-3 supergene cluster in the genus Gossypium are all of interest for molecular evolutionary studies of duplicate supergenes in allopolyploids. Here we report molecular evolutionary rates...
Discussion of CoSA: Clustering of Sparse Approximations
DOE Office of Scientific and Technical Information (OSTI.GOV)
Armstrong, Derek Elswick
2017-03-07
The purpose of this talk is to discuss the possible applications of CoSA (Clustering of Sparse Approximations) to the exploitation of HSI (HyperSpectral Imagery) data. CoSA is presented by Moody et al. in the Journal of Applied Remote Sensing (“Land cover classification in multispectral imagery using clustering of sparse approximations over learned feature dictionaries”, Vol. 8, 2014) and is based on machine learning techniques.
Self-organization and clustering algorithms
NASA Technical Reports Server (NTRS)
Bezdek, James C.
1991-01-01
Kohonen's feature maps approach to clustering is often likened to the k or c-means clustering algorithms. Here, the author identifies some similarities and differences between the hard and fuzzy c-Means (HCM/FCM) or ISODATA algorithms and Kohonen's self-organizing approach. The author concludes that some differences are significant, but at the same time there may be some important unknown relationships between the two methodologies. Several avenues of research are proposed.
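The distinction between hard and fuzzy c-means mentioned here comes down to how a point's membership in each cluster is computed. The sketch below shows the standard fuzzy membership formula next to a hard (one-hot) assignment; the function names and the toy centers are illustrative assumptions, and the full iterative algorithms and Kohonen's map are not shown.

```python
import math

def hard_memberships(point, centers):
    """Hard c-means: membership 1 for the nearest center, 0 for all others."""
    dists = [math.dist(point, c) for c in centers]
    nearest = dists.index(min(dists))
    return [1.0 if i == nearest else 0.0 for i in range(len(centers))]

def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)), with fuzzifier m > 1."""
    dists = [math.dist(point, c) for c in centers]
    zeros = dists.count(0.0)
    if zeros:
        # The point coincides with one or more centers: share membership among them.
        return [1.0 / zeros if d == 0.0 else 0.0 for d in dists]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]

centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)
hard = hard_memberships(point, centers)
fuzzy = fuzzy_memberships(point, centers)
```

As m approaches 1 the fuzzy memberships approach the hard assignment, which is one way to see the two methods as members of the same family.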
Configurational coupled cluster approach with applications to magnetic model systems
NASA Astrophysics Data System (ADS)
Wu, Siyuan; Nooijen, Marcel
2018-05-01
A general exponential, coupled cluster like, approach is discussed to extract an effective Hamiltonian in configurational space, as a sum of 1-body, 2-body up to n-body operators. The simplest two-body approach is illustrated by calculations on simple magnetic model systems. A key feature of the approach is that equations up to a certain rank do not depend on higher body cluster operators.
Analysis of perceived similarity between pairs of microcalcification clusters in mammograms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Juan; Jing, Hao; Wernick, Miles N.
2014-05-15
Purpose: Content-based image retrieval aims to assist radiologists by presenting example images with known pathology that are visually similar to the case being evaluated. In this work, the authors investigate several fundamental issues underlying the similarity ratings between pairs of microcalcification (MC) lesions on mammograms as judged by radiologists: the degree of variability in the similarity ratings, the impact of this variability on agreement between readers in retrieval of similar lesions, and the factors contributing to the readers’ similarity ratings. Methods: The authors conduct a reader study on a set of 1000 image pairs of MC lesions, in which a group of experienced breast radiologists rated the degree of similarity between each image pair. The image pairs are selected, from among possible pairings of 222 cases (110 malignant, 112 benign), based on quantitative image attributes (features) and the results of a preliminary reader study. Next, the authors apply analysis of variance (ANOVA) to quantify the level of variability in the readers’ similarity ratings, and study how the variability in individual reader ratings affects consistency between readers. The authors also measure the extent to which readers agree on images which are most similar to a given query, for which the Dice coefficient is used. To investigate how the similarity ratings potentially relate to the attributes underlying the cases, the authors study the fraction of perceptually similar images that also share the same benign or malignant pathology as the query image; moreover, the authors apply multidimensional scaling (MDS) to embed the cases according to their mutual perceptual similarity in a two-dimensional plot, which allows the authors to examine the manner in which similar lesions relate to one another in terms of benign or malignant pathology and clustered MCs.
Results: The ANOVA results show that the coefficient of determination in the reader similarity ratings is 0.59. The variability level in the similarity ratings proves to be a limiting factor, leading to only moderate correlation between the readers in their readings. The Dice coefficient, measuring agreement between readers in retrieval of similar images, can vary from 0.45 to 0.64 with different levels of similarity for individual readers, but is higher for average ratings from a group of readers (from 0.59 to 0.78). More importantly, the fraction of retrieved cases that match the benign or malignant pathology of the query image was found to increase with the degree of similarity among the retrieved images, reaching an average value as high as 0.69 for the radiologists (p-value < 10⁻⁴ compared to random guessing). Moreover, MDS embedding of all the cases shows that cases having the same pathology tend to cluster together, and that neighboring cases in the plot tend to be similar in their clustered MCs. Conclusions: While individual readers exhibit substantial variability in their similarity ratings, similarity ratings averaged from a group of readers can achieve a high level of intergroup consistency and agreement in retrieval of similar images. More importantly, perceptually similar cases are also likely to be similar in their underlying benign or malignant pathology and image features of clustered MCs, which could be of diagnostic value in computer-aided diagnosis for lesions with clustered MCs.
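The Dice coefficient used to measure agreement between readers is twice the size of the overlap between two retrieved sets divided by the sum of their sizes. A minimal sketch follows; the case identifiers are hypothetical and do not come from the study.

```python
def dice(a, b):
    """Dice coefficient of two collections: 2|A ∩ B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention assumed here: two empty sets agree perfectly
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical sets of cases that two readers rated most similar to one query.
reader_a = {"case03", "case17", "case42", "case58"}
reader_b = {"case17", "case42", "case58", "case71"}
agreement = dice(reader_a, reader_b)
```

A value of 1 means the two readers retrieved exactly the same cases and 0 means no overlap, so the reported range of 0.45 to 0.64 for individual readers indicates partial overlap.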
Editing ERTS-1 data to exclude land aids cluster analysis of water targets
NASA Technical Reports Server (NTRS)
Erb, R. B. (Principal Investigator)
1973-01-01
The author has identified the following significant results. It has been determined that an increase in the number of spectrally distinct coastal water types is achieved when data values over the adjacent land areas are excluded from the processing routine. This finding resulted from an automatic clustering analysis of ERTS-1 system corrected MSS scene 1002-18134 of 25 July 1972 over Monterey Bay, California. When the entire study area data set was submitted to the clustering only two distinct water classes were extracted. However, when the land area data points were removed from the data set and resubmitted to the clustering routine, four distinct groupings of water features were identified. Additionally, unlike the previous separation, the four types could be correlated to features observable in the associated ERTS-1 imagery. This exercise demonstrates that by proper selection of data submitted to the processing routine, based upon the specific application of study, additional information may be extracted from the ERTS-1 MSS data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ben-Naim, Eli; Krapivsky, Paul
Here we generalize the ordinary aggregation process to allow for choice. In ordinary aggregation, two random clusters merge and form a larger aggregate. In our implementation of choice, a target cluster and two candidate clusters are randomly selected and the target cluster merges with the larger of the two candidate clusters. We study the long-time asymptotic behavior and find that as in ordinary aggregation, the size density adheres to the standard scaling form. However, aggregation with choice exhibits a number of different features. First, the density of the smallest clusters exhibits anomalous scaling. Second, both the small-size and the large-size tails of the density are overpopulated, at the expense of the density of moderate-size clusters. Finally, we also study the complementary case where the smaller candidate cluster participates in the aggregation process and find an abundance of moderate clusters at the expense of small and large clusters. Additionally, we investigate aggregation processes with choice among multiple candidate clusters and a symmetric implementation where the choice is between two pairs of clusters.
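The merging rule described above can be sketched as a small Monte Carlo simulation. The sketch assumes that the target and the two candidates are three distinct clusters drawn uniformly at random, independent of their size, and that the system starts from single-particle clusters; the function and parameter names are hypothetical.

```python
import random


def aggregate_with_choice(n_monomers, n_steps, pick_larger=True, seed=0):
    """Simulate aggregation with choice and return the list of cluster sizes."""
    rng = random.Random(seed)
    clusters = [1] * n_monomers  # assumed initial condition: all monomers
    for _ in range(n_steps):
        if len(clusters) < 3:
            break  # a target and two candidates are needed
        target, cand_1, cand_2 = rng.sample(range(len(clusters)), 3)
        if pick_larger:
            chosen = cand_1 if clusters[cand_1] >= clusters[cand_2] else cand_2
        else:
            # Complementary case: the smaller candidate merges.
            chosen = cand_1 if clusters[cand_1] <= clusters[cand_2] else cand_2
        clusters[target] += clusters[chosen]
        # Remove the absorbed cluster in O(1) by overwriting it with the last one.
        clusters[chosen] = clusters[-1]
        clusters.pop()
    return clusters


sizes = aggregate_with_choice(n_monomers=1000, n_steps=900)
print(len(sizes), sum(sizes))  # each merger removes one cluster; mass is conserved
```

Running the same function with `pick_larger=False` gives the complementary process in which the smaller candidate is absorbed, so the two size distributions can be compared on equal footing.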
Wu, K; Daruwalla, Z J; Wong, K L; Murphy, D; Ren, H
2015-08-01
The commercial humeral implants based on the Western population are currently not entirely compatible with Asian patients, due to differences in bone size, shape and structure. Surgeons may have to compromise or use different implants that are less conforming, which may cause complications as well as difficulties with implant positioning. The construction of Asian humerus atlases of different clusters has therefore been proposed to eradicate this problem and to facilitate planning minimally invasive surgical procedures [6,31]. According to the features of the atlases, new implants could be designed specifically for different patients. Furthermore, an automatic implant selection algorithm has been proposed as well in order to reduce the complications caused by implant and bone mismatch. Prior to the design of the implant, data clustering and extraction of the relevant features were carried out on the datasets of each gender. The fuzzy C-means clustering method is explored in this paper. In addition, two new implant selection schemes, namely the Procrustes analysis-based scheme and the group average distance-based scheme, were proposed to better search the database for matching implants for incoming patients. Neither algorithm has previously been used in this area, yet both turn out to have excellent performance in implant selection. Additionally, algorithms to calculate the matching scores between various implants and the patient data are proposed in this paper to assist the implant selection procedure. The results obtained have indicated the feasibility of the proposed development and selection scheme. The 16 sets of male data were divided into two clusters with 8 and 8 subjects, respectively, and the 11 female datasets were also divided into two clusters with 5 and 6 subjects, respectively.
Based on the features of each cluster, the implants designed by the proposed algorithm fit very well on their reference humeri and the proposed implant selection procedure allows for a scenario of treating a patient with merely a preoperative anatomical model in order to correctly select the implant that has the best fit. Based on the leave-one-out validation, it can be concluded that both the PA-based method and GAD-based method are able to achieve excellent performance when dealing with the problem of implant selection. The accuracy and average execution time for the PA-based method were 100% and 0.132 s, respectively, while those of the GAD-based method were 100% and 0.058 s. Therefore, the GAD-based method outperformed the PA-based method in terms of execution speed. The primary contributions of this paper include the proposal of methods for development of Asian-, gender- and cluster-specific implants based on shape features and selection of the best fit implants for future patients according to their features. To the best of our knowledge, this is the first work that proposes implant design and selection for Asian patients automatically based on features extracted from cluster-specific statistical atlases.
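To make the fuzzy C-means step mentioned above concrete, the sketch below computes the standard fuzzy membership of one feature vector in each cluster, given the cluster centers. The feature vectors, the centers, and the fuzzifier value m = 2 are assumptions chosen for illustration; the abstract does not state which shape features or parameters the authors used.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy C-means memberships of `point` in each cluster center."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)); the memberships sum to 1.
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]


# Hypothetical two-dimensional shape features and two cluster centers.
centers = [(0.0, 0.0), (3.0, 0.0)]
patient = (1.0, 0.0)
memberships = fcm_memberships(patient, centers)
print(memberships)  # the nearer center receives the larger membership
```

In a full fuzzy C-means run this membership update alternates with recomputing the centers as membership-weighted means until the assignments stabilize.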
Spatial distribution of 12 class B notifiable infectious diseases in China: A retrospective study.
Zhu, Bin; Fu, Yang; Liu, Jinlin; Mao, Ying
2018-01-01
China is the largest developing country with a relatively developed public health system. To further prevent and eliminate the spread of infectious diseases, China has listed 39 notifiable infectious diseases characterized by wide prevalence or great harm, and classified them into classes A, B, and C, with severity decreasing across classes. Class A diseases have been almost eradicated in China, thus making class B diseases a priority in infectious disease prevention and control. In this retrospective study, we analyze the spatial distribution patterns of 12 class B notifiable infectious diseases that remain active all over China. Global and local Moran's I and corresponding graphic tools are adopted to explore and visualize the global and local spatial distribution of the incidence of the selected epidemics, respectively. Inter-correlations of clustering patterns of each pair of diseases and a cumulative summary of the high/low cluster frequency of the provincial units are also provided by means of figures and maps. Of the 12 most commonly notifiable class B infectious diseases, viral hepatitis and tuberculosis show high incidence rates and account for more than half of the reported cases. Almost all the diseases, except pertussis, exhibit positive spatial autocorrelation at the provincial level. All diseases feature varying spatial concentrations. Nevertheless, associations exist between spatial distribution patterns, with some provincial units displaying the same type of cluster features for two or more infectious diseases. Overall, high-low (a unit with high incidence surrounded by units with low incidence; the other labels follow the same convention) and high-high spatial cluster areas tend to be prevalent in the provincial units located in western and southwest China, whereas low-low and low-high spatial cluster areas abound in provincial units in north and east China.
Despite the various distribution patterns of 12 class B notifiable infectious diseases, certain similarities between their spatial distributions are present. Substantial evidence is available to support disease-specific, location-specific, and disease-combined interventions. Regarding provinces that show high-high/high-low patterns of multiple diseases, comprehensive interventions targeting different diseases should be established. As to the adjacent provincial units revealing similar patterns, coordinated actions need to be taken across borders.
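The global Moran's I statistic used in the study above can be computed directly from a vector of incidence values and a spatial weight matrix. The sketch below is illustrative only: it assumes binary adjacency weights and uses four made-up units arranged in a row, whereas the abstract does not specify how the provincial weights were defined.

```python
def global_morans_i(values, weights):
    """Global Moran's I for `values` under the square weight matrix `weights`."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]  # deviations from the mean
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in z)


# Hypothetical layout: four units in a row, each adjacent to its neighbors.
adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = global_morans_i([1, 2, 3, 4], adjacency)       # similar values adjoin
alternating = global_morans_i([1, 4, 1, 4], adjacency)  # unlike values adjoin
print(smooth, alternating)
```

Positive values indicate that similar incidence levels sit next to each other, and negative values indicate that high and low units alternate.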
Hsu, Chien-Chang; Cheng, Ching-Wen; Chiu, Yi-Shiuan
2017-02-15
Electroencephalograms can record wave variations in any brain activity. Beta waves are produced when an external stimulus induces logical thinking, computation, and reasoning during consciousness. This work uses the beta wave of major scale working memory N-back tasks to analyze the differences between young musicians and non-musicians. The feature analysis applies signal filtering, the Hilbert-Huang transform, and feature extraction methods to identify differences, and a k-means clustering algorithm is then used to group them into different clusters. The results of the feature analysis showed that beta waves differ significantly between young musicians and non-musicians under the low memory load of the working memory task. Copyright © 2017 Elsevier B.V. All rights reserved.
[Changes in mammographic features of breast cancer--comparison with previous films].
Matsunaga, T; Hagiwara, K; Kimura, K; Kusama, M
1992-11-25
Mammographic features of 87 breast cancer patients were studied in comparison with their previous survey films. Changes in the mammographic features included microcalcification (28 cases), tumor shadow (35 cases) and intratumorous microcalcifications (6 cases). Seven cases had several extremely faint calcifications on the previous films, and three of six cases with clustered and scattered microcalcifications that extended over an entire breast quadrant had increased in number, density and extent. Eight cases in which clustered microcalcifications had increased in number, density and extent suggested a relationship between the increase in the extent of microcalcifications and length of time between visits. In most cases with tumor shadow, a slight localized increase in mammary gland density, irregular margins and straightened trabeculae were overlooked because of breast density.
Pipelining Architecture of Indexing Using Agglomerative Clustering
NASA Astrophysics Data System (ADS)
Goyal, Deepika; Goyal, Deepti; Gupta, Parul
2010-11-01
The World Wide Web is an interlinked collection of billions of documents. Ironically, the huge size of this collection has become an obstacle to information retrieval. Search engines are used to access information on the Internet, and a search engine retrieves pages from an indexer. This paper introduces a novel pipelining technique for structuring the core index-building system that substantially reduces index construction time, together with a clustering algorithm that partitions the set of documents into ordered clusters so that documents within the same cluster are similar and are assigned close document identifiers. After documents are assigned to clusters, the system creates a hierarchy of indexes so that searching is efficient, building super clusters and then mega clusters automatically. The pipeline architecture creates the index in a way that is efficient in both space and time. A search is directed from the higher levels of the index, or of the cluster hierarchy, down to the lower levels, so that the user obtains matching results quickly. Because each cluster is formed from only two clusters, the search at each lower level of the index is limited to two clusters, which keeps it fast.
A flower image retrieval method based on ROI feature.
Hong, An-Xiang; Chen, Gang; Li, Jun-Li; Chi, Zhe-Ru; Zhang, Dan
2004-07-01
Flower image retrieval is a very important step for computer-aided plant species recognition. In this paper, we propose an efficient segmentation method based on color clustering and domain knowledge to extract flower regions from flower images. For flower retrieval, we use the color histogram of a flower region to characterize the color features of the flower and two shape-based feature sets, Centroid-Contour Distance (CCD) and Angle Code Histogram (ACH), to characterize the shape features of a flower contour. Experimental results showed that our flower region extraction method based on color clustering and domain knowledge can produce accurate flower regions. Flower retrieval results on a database of 885 flower images collected from 14 plant species showed that our Region-of-Interest (ROI) based retrieval approach using both color and shape features can perform better than a method based on the global color histogram proposed by Swain and Ballard (1991) and a method based on domain knowledge-driven segmentation and color names proposed by Das et al. (1999).
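The color-histogram comparison underlying this kind of retrieval can be sketched as follows. The abstract does not say how region histograms are compared, so the sketch assumes histogram intersection, the matching measure associated with Swain and Ballard's color indexing; the bin count, function names, and pixel values are hypothetical.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) pixels with values 0..255.

    Assumes bins_per_channel divides 256 evenly (for example 2, 4, 8, 16).
    """
    step = 256 // bins_per_channel
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel + g // step) * bins_per_channel + b // step
        hist[idx] += 1.0
        count += 1
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical pixels from two segmented flower regions.
region_a = [(250, 10, 10)] * 3 + [(10, 200, 10)]
region_b = [(240, 20, 20)] * 2 + [(10, 10, 240)] * 2
similarity = histogram_intersection(color_histogram(region_a), color_histogram(region_b))
print(similarity)
```

Computing the histogram over the extracted flower region only, rather than the whole image, keeps background colors from dominating the comparison, which is the motivation for the ROI-based approach described in the abstract.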