Unsupervised learning of natural languages
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-01-01
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885
Unsupervised learning of natural languages.
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-08-16
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.
Hsu, Arthur L; Tang, Sen-Lin; Halgamuge, Saman K
2003-11-01
Current Self-Organizing Maps (SOMs) approaches to gene expression pattern clustering require the user to predefine the number of clusters likely to be expected. Hierarchical clustering methods used in this area do not provide unique partitioning of data. We describe an unsupervised dynamic hierarchical self-organizing approach, which suggests an appropriate number of clusters, to perform class discovery and marker gene identification in microarray data. In the process of class discovery, the proposed algorithm identifies corresponding sets of predictor genes that best distinguish one class from other classes. The approach integrates merits of hierarchical clustering with robustness against noise known from self-organizing approaches. The proposed algorithm applied to DNA microarray data sets of two types of cancers has demonstrated its ability to produce the most suitable number of clusters. Further, the corresponding marker genes identified through the unsupervised algorithm also have a strong biological relationship to the specific cancer class. The algorithm tested on leukemia microarray data, which contains three leukemia types, was able to determine three major and one minor cluster. Prediction models built for the four clusters indicate that the prediction strength for the smaller cluster is generally low, therefore labelled as uncertain cluster. Further analysis shows that the uncertain cluster can be subdivided further, and the subdivisions are related to two of the original clusters. Another test performed using colon cancer microarray data has automatically derived two clusters, which is consistent with the number of classes in data (cancerous and normal). JAVA software of dynamic SOM tree algorithm is available upon request for academic use. A comparison of rectangular and hexagonal topologies for GSOM is available from http://www.mame.mu.oz.au/mechatronics/journalinfo/Hsu2003supp.pdf
Unsupervised learning on scientific ocean drilling datasets from the South China Sea
NASA Astrophysics Data System (ADS)
Tse, Kevin C.; Chiu, Hon-Chim; Tsang, Man-Yin; Li, Yiliang; Lam, Edmund Y.
2018-06-01
Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean floor sediment core samples coming from scientific ocean drilling in the South China Sea. Compared to studies on similar datasets, but using supervised learning methods which are designed to make predictions based on sample training data, unsupervised learning methods require no a priori information and focus only on the input data. In this study, popular unsupervised learning methods including K-means, self-organizing maps, hierarchical clustering and random forest were coupled with different distance metrics to form exploratory data clusters. The resulting data clusters were externally validated with lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected data clusters displayed varying degrees of correspondence with existing classification by lithologic units and geologic time scales. K-means and self-organizing maps were observed to perform better with lithologic units while random forest corresponded best with geologic time scales. This study sets a pioneering example of how unsupervised machine learning methods can be used as an automatic processing tool for the increasingly high volume of scientific ocean drilling data.
Hierarchical clustering of HPV genotype patterns in the ASCUS-LSIL triage study
Wentzensen, Nicolas; Wilson, Lauren E.; Wheeler, Cosette M.; Carreon, Joseph D.; Gravitt, Patti E.; Schiffman, Mark; Castle, Philip E.
2010-01-01
Anogenital cancers are associated with about 13 carcinogenic HPV types in a broader group that cause cervical intraepithelial neoplasia (CIN). Multiple concurrent cervical HPV infections are common which complicate the attribution of HPV types to different grades of CIN. Here we report the analysis of HPV genotype patterns in the ASCUS-LSIL triage study using unsupervised hierarchical clustering. Women who underwent colposcopy at baseline (n = 2780) were grouped into 20 disease categories based on histology and cytology. Disease groups and HPV genotypes were clustered using complete linkage. Risk of 2-year cumulative CIN3+, viral load, colposcopic impression, and age were compared between disease groups and major clusters. Hierarchical clustering yielded four major disease clusters: Cluster 1 included all CIN3 histology with abnormal cytology; Cluster 2 included CIN3 histology with normal cytology and combinations with either CIN2 or high-grade squamous intraepithelial lesion (HSIL) cytology; Cluster 3 included older women with normal or low grade histology/cytology and low viral load; Cluster 4 included younger women with low grade histology/cytology, multiple infections, and the highest viral load. Three major groups of HPV genotypes were identified: Group 1 included only HPV16; Group 2 included nine carcinogenic types plus non-carcinogenic HPV53 and HPV66; and Group 3 included non-carcinogenic types plus carcinogenic HPV33 and HPV45. Clustering results suggested that colposcopy missed a prevalent precancer in many women with no biopsy/normal histology and HSIL. This result was confirmed by an elevated 2-year risk of CIN3+ in these groups. Our novel approach to study multiple genotype infections in cervical disease using unsupervised hierarchical clustering can address complex genotype distributions on a population level. PMID:20959485
2015-12-01
group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages ( UPGMA ) based on...log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity met- ric. For...differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as
2013-10-01
correct group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages ( UPGMA ) based on...centering of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity met...A) The 108 differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with
Nonparametric Hierarchical Bayesian Model for Functional Brain Parcellation
Lashkari, Danial; Sridharan, Ramesh; Vul, Edward; Hsieh, Po-Jang; Kanwisher, Nancy; Golland, Polina
2011-01-01
We develop a method for unsupervised analysis of functional brain images that learns group-level patterns of functional response. Our algorithm is based on a generative model that comprises two main layers. At the lower level, we express the functional brain response to each stimulus as a binary activation variable. At the next level, we define a prior over the sets of activation variables in all subjects. We use a Hierarchical Dirichlet Process as the prior in order to simultaneously learn the patterns of response that are shared across the group, and to estimate the number of these patterns supported by data. Inference based on this model enables automatic discovery and characterization of salient and consistent patterns in functional signals. We apply our method to data from a study that explores the response of the visual cortex to a collection of images. The discovered profiles of activation correspond to selectivity to a number of image categories such as faces, bodies, and scenes. More generally, our results appear superior to the results of alternative data-driven methods in capturing the category structure in the space of stimuli. PMID:21841977
Robust Real-Time Music Transcription with a Compositional Hierarchical Model.
Pesek, Matevž; Leonardis, Aleš; Marolt, Matija
2017-01-01
The paper presents a new compositional hierarchical model for robust music transcription. Its main features are unsupervised learning of a hierarchical representation of input data, transparency, which enables insights into the learned representation, as well as robustness and speed which make it suitable for real-world and real-time use. The model consists of multiple layers, each composed of a number of parts. The hierarchical nature of the model corresponds well to hierarchical structures in music. The parts in lower layers correspond to low-level concepts (e.g. tone partials), while the parts in higher layers combine lower-level representations into more complex concepts (tones, chords). The layers are learned in an unsupervised manner from music signals. Parts in each layer are compositions of parts from previous layers based on statistical co-occurrences as the driving force of the learning process. In the paper, we present the model's structure and compare it to other hierarchical approaches in the field of music information retrieval. We evaluate the model's performance for the multiple fundamental frequency estimation. Finally, we elaborate on extensions of the model towards other music information retrieval tasks.
NASA Astrophysics Data System (ADS)
Chen, B.; Chehdi, K.; De Oliveria, E.; Cariou, C.; Charbonnier, B.
2015-10-01
In this paper a new unsupervised top-down hierarchical classification method to partition airborne hyperspectral images is proposed. The unsupervised approach is preferred because the difficulty of area access and the human and financial resources required to obtain ground truth data, constitute serious handicaps especially over large areas which can be covered by airborne or satellite images. The developed classification approach allows i) a successive partitioning of data into several levels or partitions in which the main classes are first identified, ii) an estimation of the number of classes automatically at each level without any end user help, iii) a nonsystematic subdivision of all classes of a partition Pj to form a partition Pj+1, iv) a stable partitioning result of the same data set from one run of the method to another. The proposed approach was validated on synthetic and real hyperspectral images related to the identification of several marine algae species. In addition to highly accurate and consistent results (correct classification rate over 99%), this approach is completely unsupervised. It estimates at each level, the optimal number of classes and the final partition without any end user intervention.
Sadeghi, Zahra; Testolin, Alberto
2017-08-01
In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
NASA Astrophysics Data System (ADS)
Sarparandeh, Mohammadali; Hezarkhani, Ardeshir
2017-12-01
The use of efficient methods for data processing has always been of interest to researchers in the field of earth sciences. Pattern recognition techniques are appropriate methods for high-dimensional data such as geochemical data. Evaluation of the geochemical distribution of rare earth elements (REEs) requires the use of such methods. In particular, the multivariate nature of REE data makes them a good target for numerical analysis. The main subject of this paper is application of unsupervised pattern recognition approaches in evaluating geochemical distribution of REEs in the Kiruna type magnetite-apatite deposit of Se-Chahun. For this purpose, 42 bulk lithology samples were collected from the Se-Chahun iron ore deposit. In this study, 14 rare earth elements were measured with inductively coupled plasma mass spectrometry (ICP-MS). Pattern recognition makes it possible to evaluate the relations between the samples based on all these 14 features, simultaneously. In addition to providing easy solutions, discovery of the hidden information and relations of data samples is the advantage of these methods. Therefore, four clustering methods (unsupervised pattern recognition) - including a modified basic sequential algorithmic scheme (MBSAS), hierarchical (agglomerative) clustering, k-means clustering and self-organizing map (SOM) - were applied and results were evaluated using the silhouette criterion. Samples were clustered in four types. Finally, the results of this study were validated with geological facts and analysis results from, for example, scanning electron microscopy (SEM), X-ray diffraction (XRD), ICP-MS and optical mineralogy. The results of the k-means clustering and SOM methods have the best matches with reality, with experimental studies of samples and with field surveys. Since only the rare earth elements are used in this division, a good agreement of the results with lithology is considerable. It is concluded that the combination of the proposed methods and geological studies leads to finding some hidden information, and this approach has the best results compared to using only one of them.
Haakensen, Vilde D; Lingjaerde, Ole Christian; Lüders, Torben; Riis, Margit; Prat, Aleix; Troester, Melissa A; Holmen, Marit M; Frantzen, Jan Ole; Romundstad, Linda; Navjord, Dina; Bukholm, Ida K; Johannesen, Tom B; Perou, Charles M; Ursin, Giske; Kristensen, Vessela N; Børresen-Dale, Anne-Lise; Helland, Aslaug
2011-11-01
Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer.
Paraskevopoulou, Sivylla E; Wu, Di; Eftekhar, Amir; Constandinou, Timothy G
2014-09-30
This work presents a novel unsupervised algorithm for real-time adaptive clustering of neural spike data (spike sorting). The proposed Hierarchical Adaptive Means (HAM) clustering method combines centroid-based clustering with hierarchical cluster connectivity to classify incoming spikes using groups of clusters. It is described how the proposed method can adaptively track the incoming spike data without requiring any past history, iteration or training and autonomously determines the number of spike classes. Its performance (classification accuracy) has been tested using multiple datasets (both simulated and recorded) achieving a near-identical accuracy compared to k-means (using 10-iterations and provided with the number of spike classes). Also, its robustness in applying to different feature extraction methods has been demonstrated by achieving classification accuracies above 80% across multiple datasets. Last but crucially, its low complexity, that has been quantified through both memory and computation requirements makes this method hugely attractive for future hardware implementation. Copyright © 2014 Elsevier B.V. All rights reserved.
A Hierarchical Bayesian Model for Crowd Emotions
Urizar, Oscar J.; Baig, Mirza S.; Barakova, Emilia I.; Regazzoni, Carlo S.; Marcenaro, Lucio; Rauterberg, Matthias
2016-01-01
Estimation of emotions is an essential aspect in developing intelligent systems intended for crowded environments. However, emotion estimation in crowds remains a challenging problem due to the complexity in which human emotions are manifested and the capability of a system to perceive them in such conditions. This paper proposes a hierarchical Bayesian model to learn in unsupervised manner the behavior of individuals and of the crowd as a single entity, and explore the relation between behavior and emotions to infer emotional states. Information about the motion patterns of individuals are described using a self-organizing map, and a hierarchical Bayesian network builds probabilistic models to identify behaviors and infer the emotional state of individuals and the crowd. This model is trained and tested using data produced from simulated scenarios that resemble real-life environments. The conducted experiments tested the efficiency of our method to learn, detect and associate behaviors with emotional states yielding accuracy levels of 74% for individuals and 81% for the crowd, similar in performance with existing methods for pedestrian behavior detection but with novel concepts regarding the analysis of crowds. PMID:27458366
Cluster analysis of sputum cytokine-high profiles reveals diversity in T(h)2-high asthma patients.
Seys, Sven F; Scheers, Hans; Van den Brande, Paul; Marijsse, Gudrun; Dilissen, Ellen; Van Den Bergh, Annelies; Goeminne, Pieter C; Hellings, Peter W; Ceuppens, Jan L; Dupont, Lieven J; Bullens, Dominique M A
2017-02-23
Asthma is characterized by a heterogeneous inflammatory profile and can be subdivided into T(h)2-high and T(h)2-low airway inflammation. Profiling of a broader panel of airway cytokines in large unselected patient cohorts is lacking. Patients (n = 205) were defined as being "cytokine-low/high" if sputum mRNA expression of a particular cytokine was outside the respective 10 th /90 th percentile range of the control group (n = 80). Unsupervised hierarchical clustering was used to determine clusters based on sputum cytokine profiles. Half of patients (n = 108; 52.6%) had a classical T(h)2-high ("IL-4-, IL-5- and/or IL-13-high") sputum cytokine profile. Unsupervised cluster analysis revealed 5 clusters. Patients with an "IL-4- and/or IL-13-high" pattern surprisingly did not cluster but were equally distributed among the 5 clusters. Patients with an "IL-5-, IL-17A-/F- and IL-25- high" profile were restricted to cluster 1 (n = 24) with increased sputum eosinophil as well as neutrophil counts and poor lung function parameters at baseline and 2 years later. Four other clusters were identified: "IL-5-high or IL-10-high" (n = 16), "IL-6-high" (n = 8), "IL-22-high" (n = 25). Cluster 5 (n = 132) consists of patients without "cytokine-high" pattern or patients with only high IL-4 and/or IL-13. We identified 5 unique asthma molecular phenotypes by biological clustering. Type 2 cytokines cluster with non-type 2 cytokines in 4 out of 5 clusters. Unsupervised analysis thus not supports a priori type 2 versus non-type 2 molecular phenotypes. www.clinicaltrials.gov NCT01224938. Registered 18 October 2010.
Personalized Medicine in Veterans with Traumatic Brain Injuries
2014-07-01
9 control cases are subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity...in unsu- pervised hierarchical clustering by the Un- weighted Pair-Group Method using Arithmetic averages ( UPGMA ) based on cosine correlation of row...of log2 trans- formed MAS5.0 signal values; probe set cluster- ing was performed by the UPGMA method using Cosine correlation as the similarity
Hierarchical models for epidermal nerve fiber data.
Andersson, Claes; Rajala, Tuomas; Särkkä, Aila
2018-02-10
While epidermal nerve fiber (ENF) data have been used to study the effects of small fiber neuropathies through the density and the spatial patterns of the ENFs, little research has been focused on the effects on the individual nerve fibers. Studying the individual nerve fibers might give a better understanding of the effects of the neuropathy on the growth process of the individual ENFs. In this study, data from 32 healthy volunteers and 20 diabetic subjects, obtained from suction induced skin blister biopsies, are analyzed by comparing statistics for the nerve fibers as a whole and for the segments that a nerve fiber is composed of. Moreover, it is evaluated whether this type of data can be used to detect diabetic neuropathy, by using hierarchical models to perform unsupervised classification of the subjects. It is found that using the information about the individual nerve fibers in combination with the ENF counts yields a considerable improvement as compared to using the ENF counts only. Copyright © 2017 John Wiley & Sons, Ltd.
2011-01-01
Background Increased understanding of the variability in normal breast biology will enable us to identify mechanisms of breast cancer initiation and the origin of different subtypes, and to better predict breast cancer risk. Methods Gene expression patterns in breast biopsies from 79 healthy women referred to breast diagnostic centers in Norway were explored by unsupervised hierarchical clustering and supervised analyses, such as gene set enrichment analysis and gene ontology analysis and comparison with previously published genelists and independent datasets. Results Unsupervised hierarchical clustering identified two separate clusters of normal breast tissue based on gene-expression profiling, regardless of clustering algorithm and gene filtering used. Comparison of the expression profile of the two clusters with several published gene lists describing breast cells revealed that the samples in cluster 1 share characteristics with stromal cells and stem cells, and to a certain degree with mesenchymal cells and myoepithelial cells. The samples in cluster 1 also share many features with the newly identified claudin-low breast cancer intrinsic subtype, which also shows characteristics of stromal and stem cells. More women belonging to cluster 1 have a family history of breast cancer and there is a slight overrepresentation of nulliparous women in cluster 1. Similar findings were seen in a separate dataset consisting of histologically normal tissue from both breasts harboring breast cancer and from mammoplasty reductions. Conclusion This is the first study to explore the variability of gene expression patterns in whole biopsies from normal breasts and identified distinct subtypes of normal breast tissue. Further studies are needed to determine the specific cell contribution to the variation in the biology of normal breasts, how the clusters identified relate to breast cancer risk and their possible link to the origin of the different molecular subtypes of breast cancer. PMID:22044755
Unsupervised active learning based on hierarchical graph-theoretic clustering.
Hu, Weiming; Hu, Wei; Xie, Nianhua; Maybank, Steve
2009-10-01
Most existing active learning approaches are supervised. Supervised active learning has the following problems: inefficiency in dealing with the semantic gap between the distribution of samples in the feature space and their labels, lack of ability in selecting new samples that belong to new categories that have not yet appeared in the training samples, and lack of adaptability to changes in the semantic interpretation of sample categories. To tackle these problems, we propose an unsupervised active learning framework based on hierarchical graph-theoretic clustering. In the framework, two promising graph-theoretic clustering algorithms, namely, dominant-set clustering and spectral clustering, are combined in a hierarchical fashion. Our framework has some advantages, such as ease of implementation, flexibility in architecture, and adaptability to changes in the labeling. Evaluations on data sets for network intrusion detection, image classification, and video classification have demonstrated that our active learning framework can effectively reduce the workload of manual classification while maintaining a high accuracy of automatic classification. It is shown that, overall, our framework outperforms the support-vector-machine-based supervised active learning, particularly in terms of dealing much more efficiently with new samples whose categories have not yet appeared in the training samples.
Nonparametric weighted stochastic block models
NASA Astrophysics Data System (ADS)
Peixoto, Tiago P.
2018-01-01
We present a Bayesian formulation of weighted stochastic block models that can be used to infer the large-scale modular structure of weighted networks, including their hierarchical organization. Our method is nonparametric, and thus does not require the prior knowledge of the number of groups or other dimensions of the model, which are instead inferred from data. We give a comprehensive treatment of different kinds of edge weights (i.e., continuous or discrete, signed or unsigned, bounded or unbounded), as well as arbitrary weight transformations, and describe an unsupervised model selection approach to choose the best network description. We illustrate the application of our method to a variety of empirical weighted networks, such as global migrations, voting patterns in congress, and neural connections in the human brain.
Kolodny, Oren; Lotem, Arnon; Edelman, Shimon
2015-03-01
We introduce a set of biologically and computationally motivated design choices for modeling the learning of language, or of other types of sequential, hierarchically structured experience and behavior, and describe an implemented system that conforms to these choices and is capable of unsupervised learning from raw natural-language corpora. Given a stream of linguistic input, our model incrementally learns a grammar that captures its statistical patterns, which can then be used to parse or generate new data. The grammar constructed in this manner takes the form of a directed weighted graph, whose nodes are recursively (hierarchically) defined patterns over the elements of the input stream. We evaluated the model in seventeen experiments, grouped into five studies, which examined, respectively, (a) the generative ability of grammar learned from a corpus of natural language, (b) the characteristics of the learned representation, (c) sequence segmentation and chunking, (d) artificial grammar learning, and (e) certain types of structure dependence. The model's performance largely vindicates our design choices, suggesting that progress in modeling language acquisition can be made on a broad front-ranging from issues of generativity to the replication of human experimental findings-by bringing biological and computational considerations, as well as lessons from prior efforts, to bear on the modeling approach. Copyright © 2014 Cognitive Science Society, Inc.
Hierarchical classification method and its application in shape representation
NASA Astrophysics Data System (ADS)
Ireton, M. A.; Oakley, John P.; Xydeas, Costas S.
1992-04-01
In this paper we describe a technique for performing shaped-based content retrieval of images from a large database. In order to be able to formulate such user-generated queries about visual objects, we have developed an hierarchical classification technique. This hierarchical classification technique enables similarity matching between objects, with the position in the hierarchy signifying the level of generality to be used in the query. The classification technique is unsupervised, robust, and general; it can be applied to any suitable parameter set. To establish the potential of this classifier for aiding visual querying, we have applied it to the classification of the 2-D outlines of leaves.
Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi
2017-01-01
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization. PMID:28786986
Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong
2017-01-01
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.
NASA Astrophysics Data System (ADS)
Graham, James; Ternovskiy, Igor V.
2013-06-01
We applied a two stage unsupervised hierarchical learning system to model complex dynamic surveillance and cyber space monitoring systems using a non-commercial version of the NeoAxis visualization software. The hierarchical scene learning and recognition approach is based on hierarchical expectation maximization, and was linked to a 3D graphics engine for validation of learning and classification results and understanding the human - autonomous system relationship. Scene recognition is performed by taking synthetically generated data and feeding it to a dynamic logic algorithm. The algorithm performs hierarchical recognition of the scene by first examining the features of the objects to determine which objects are present, and then determines the scene based on the objects present. This paper presents a framework within which low level data linked to higher-level visualization can provide support to a human operator and be evaluated in a detailed and systematic way.
Heterogeneous patterns of brain atrophy in Alzheimer's disease.
Poulakis, Konstantinos; Pereira, Joana B; Mecocci, Patrizia; Vellas, Bruno; Tsolaki, Magda; Kłoszewska, Iwona; Soininen, Hilkka; Lovestone, Simon; Simmons, Andrew; Wahlund, Lars-Olof; Westman, Eric
2018-05-01
There is increasing evidence showing that brain atrophy varies between patients with Alzheimer's disease (AD), suggesting that different anatomical patterns might exist within the same disorder. We investigated AD heterogeneity based on cortical and subcortical atrophy patterns in 299 AD subjects from 2 multicenter cohorts. Clusters of patients and important discriminative features were determined using random forest pairwise similarity, multidimensional scaling, and distance-based hierarchical clustering. We discovered 2 typical (72.2%) and 3 atypical (28.8%) subtypes with significantly different demographic, clinical, and cognitive characteristics, and different rates of cognitive decline. In contrast to previous studies, our unsupervised random forest approach based on cortical and subcortical volume measures and their linear and nonlinear interactions revealed more typical AD subtypes with important anatomically discriminative features, while the prevalence of atypical cases was lower. The hippocampal-sparing and typical AD subtypes exhibited worse clinical progression in visuospatial, memory, and executive cognitive functions. Our findings suggest there is substantial heterogeneity in AD that has an impact on how patients function and progress over time. Copyright © 2018 Elsevier Inc. All rights reserved.
Accuracy of latent-variable estimation in Bayesian semi-supervised learning.
Yamazaki, Keisuke
2015-09-01
Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than that in unsupervised, and one of the concerns is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of the estimation of latent variables. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified. Copyright © 2015 Elsevier Ltd. All rights reserved.
Yang, Guang; Raschke, Felix; Barrick, Thomas R; Howe, Franklyn A
2015-09-01
To investigate whether nonlinear dimensionality reduction improves unsupervised classification of (1) H MRS brain tumor data compared with a linear method. In vivo single-voxel (1) H magnetic resonance spectroscopy (55 patients) and (1) H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With (1) H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of (1) H MRSI data after cluster analysis. © 2014 Wiley Periodicals, Inc.
Video mining using combinations of unsupervised and supervised learning techniques
NASA Astrophysics Data System (ADS)
Divakaran, Ajay; Miyahara, Koji; Peker, Kadir A.; Radhakrishnan, Regunathan; Xiong, Ziyou
2003-12-01
We discuss the meaning and significance of the video mining problem, and present our work on some aspects of video mining. A simple definition of video mining is unsupervised discovery of patterns in audio-visual content. Such purely unsupervised discovery is readily applicable to video surveillance as well as to consumer video browsing applications. We interpret video mining as content-adaptive or "blind" content processing, in which the first stage is content characterization and the second stage is event discovery based on the characterization obtained in stage 1. We discuss the target applications and find that using a purely unsupervised approach are too computationally complex to be implemented on our product platform. We then describe various combinations of unsupervised and supervised learning techniques that help discover patterns that are useful to the end-user of the application. We target consumer video browsing applications such as commercial message detection, sports highlights extraction etc. We employ both audio and video features. We find that supervised audio classification combined with unsupervised unusual event discovery enables accurate supervised detection of desired events. Our techniques are computationally simple and robust to common variations in production styles etc.
ERIC Educational Resources Information Center
Kolodny, Oren; Lotem, Arnon; Edelman, Shimon
2015-01-01
We introduce a set of biologically and computationally motivated design choices for modeling the learning of language, or of other types of sequential, hierarchically structured experience and behavior, and describe an implemented system that conforms to these choices and is capable of unsupervised learning from raw natural-language corpora. Given…
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-01-01
Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We presents a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle). PMID:28608824
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-06-13
Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We presents a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle).
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder.
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed self-supervised video hashing (SSVH), which is able to capture the temporal nature of videos in an end-to-end learning to hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary auto-encoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world data sets show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the current best performance on the task of unsupervised video retrieval.
Self-Supervised Video Hashing With Hierarchical Binary Auto-Encoder
NASA Astrophysics Data System (ADS)
Song, Jingkuan; Zhang, Hanwang; Li, Xiangpeng; Gao, Lianli; Wang, Meng; Hong, Richang
2018-07-01
Existing video hash functions are built on three isolated stages: frame pooling, relaxed learning, and binarization, which have not adequately explored the temporal order of video frames in a joint binary optimization model, resulting in severe information loss. In this paper, we propose a novel unsupervised video hashing framework dubbed Self-Supervised Video Hashing (SSVH), that is able to capture the temporal nature of videos in an end-to-end learning-to-hash fashion. We specifically address two central problems: 1) how to design an encoder-decoder architecture to generate binary codes for videos; and 2) how to equip the binary codes with the ability of accurate video retrieval. We design a hierarchical binary autoencoder to model the temporal dependencies in videos with multiple granularities, and embed the videos into binary codes with less computations than the stacked architecture. Then, we encourage the binary codes to simultaneously reconstruct the visual content and neighborhood structure of the videos. Experiments on two real-world datasets (FCVID and YFCC) show that our SSVH method can significantly outperform the state-of-the-art methods and achieve the currently best performance on the task of unsupervised video retrieval.
Bagur, M G; Morales, S; López-Chicano, M
2009-11-15
Unsupervised and supervised pattern recognition techniques such as hierarchical cluster analysis, principal component analysis, factor analysis and linear discriminant analysis have been applied to water samples recollected in Rodalquilar mining district (Southern Spain) in order to identify different sources of environmental pollution caused by the abandoned mining industry. The effect of the mining activity on waters was monitored determining the concentration of eleven elements (Mn, Ba, Co, Cu, Zn, As, Cd, Sb, Hg, Au and Pb) by inductively coupled plasma mass spectrometry (ICP-MS). The Box-Cox transformation has been used to transform the data set in normal form in order to minimize the non-normal distribution of the geochemical data. The environmental impact is affected mainly by the mining activity developed in the zone, the acid drainage and finally by the chemical treatment used for the benefit of gold.
Basati, Zahra; Jamshidi, Bahareh; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef
2018-05-30
The presence of sunn pest-damaged grains in wheat mass reduces the quality of flour and bread produced from it. Therefore, it is essential to assess the quality of the samples in collecting and storage centers of wheat and flour mills. In this research, the capability of visible/near-infrared (Vis/NIR) spectroscopy combined with pattern recognition methods was investigated for discrimination of wheat samples with different percentages of sunn pest-damaged. To this end, various samples belonging to five classes (healthy and 5%, 10%, 15% and 20% unhealthy) were analyzed using Vis/NIR spectroscopy (wavelength range of 350-1000 nm) based on both supervised and unsupervised pattern recognition methods. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) as the unsupervised techniques and soft independent modeling of class analogies (SIMCA) and partial least squares-discriminant analysis (PLS-DA) as supervised methods were used. The results showed that Vis/NIR spectra of healthy samples were correctly clustered using both PCA and HCA. Due to the high overlapping between the four unhealthy classes (5%, 10%, 15% and 20%), it was not possible to discriminate all the unhealthy samples in individual classes. However, when considering only the two main categories of healthy and unhealthy, an acceptable degree of separation between the classes can be obtained after classification with supervised pattern recognition methods of SIMCA and PLS-DA. SIMCA based on PCA modeling correctly classified samples in two classes of healthy and unhealthy with classification accuracy of 100%. Moreover, the power of the wavelengths of 839 nm, 918 nm and 995 nm were more than other wavelengths to discriminate two classes of healthy and unhealthy. It was also concluded that PLS-DA provides excellent classification results of healthy and unhealthy samples (R 2 = 0.973 and RMSECV = 0.057). Therefore, Vis/NIR spectroscopy based on pattern recognition techniques can be useful for rapid distinguishing the healthy wheat samples from those damaged by sunn pest in the maintenance and processing centers. Copyright © 2018 Elsevier B.V. All rights reserved.
A novel unsupervised spike sorting algorithm for intracranial EEG.
Yadav, R; Shah, A K; Loeb, J A; Swamy, M N S; Agarwal, R
2011-01-01
This paper presents a novel, unsupervised spike classification algorithm for intracranial EEG. The method combines template matching and principal component analysis (PCA) for building a dynamic patient-specific codebook without a priori knowledge of the spike waveforms. The problem of misclassification due to overlapping classes is resolved by identifying similar classes in the codebook using hierarchical clustering. Cluster quality is visually assessed by projecting inter- and intra-clusters onto a 3D plot. Intracranial EEG from 5 patients was utilized to optimize the algorithm. The resulting codebook retains 82.1% of the detected spikes in non-overlapping and disjoint clusters. Initial results suggest a definite role of this method for both rapid review and quantitation of interictal spikes that could enhance both clinical treatment and research studies on epileptic patients.
Unsupervised deep learning reveals prognostically relevant subtypes of glioblastoma.
Young, Jonathan D; Cai, Chunhui; Lu, Xinghua
2017-10-03
One approach to improving the personalized treatment of cancer is to understand the cellular signaling transduction pathways that cause cancer at the level of the individual patient. In this study, we used unsupervised deep learning to learn the hierarchical structure within cancer gene expression data. Deep learning is a group of machine learning algorithms that use multiple layers of hidden units to capture hierarchically related, alternative representations of the input data. We hypothesize that this hierarchical structure learned by deep learning will be related to the cellular signaling system. Robust deep learning model selection identified a network architecture that is biologically plausible. Our model selection results indicated that the 1st hidden layer of our deep learning model should contain about 1300 hidden units to most effectively capture the covariance structure of the input data. This agrees with the estimated number of human transcription factors, which is approximately 1400. This result lends support to our hypothesis that the 1st hidden layer of a deep learning model trained on gene expression data may represent signals related to transcription factor activation. Using the 3rd hidden layer representation of each tumor as learned by our unsupervised deep learning model, we performed consensus clustering on all tumor samples-leading to the discovery of clusters of glioblastoma multiforme with differential survival. One of these clusters contained all of the glioblastoma samples with G-CIMP, a known methylation phenotype driven by the IDH1 mutation and associated with favorable prognosis, suggesting that the hidden units in the 3rd hidden layer representations captured a methylation signal without explicitly using methylation data as input. We also found differentially expressed genes and well-known mutations (NF1, IDH1, EGFR) that were uniquely correlated with each of these clusters. Exploring these unique genes and mutations will allow us to further investigate the disease mechanisms underlying each of these clusters. In summary, we show that a deep learning model can be trained to represent biologically and clinically meaningful abstractions of cancer gene expression data. Understanding what additional relationships these hidden layer abstractions have with the cancer cellular signaling system could have a significant impact on the understanding and treatment of cancer.
NASA Astrophysics Data System (ADS)
D'Amore, M.; Le Scaon, R.; Helbert, J.; Maturilli, A.
2017-12-01
Machine-learning achieved unprecedented results in high-dimensional data processing tasks with wide applications in various fields. Due to the growing number of complex nonlinear systems that have to be investigated in science and the bare raw size of data nowadays available, ML offers the unique ability to extract knowledge, regardless the specific application field. Examples are image segmentation, supervised/unsupervised/ semi-supervised classification, feature extraction, data dimensionality analysis/reduction.The MASCS instrument has mapped Mercury surface in the 400-1145 nm wavelength range during orbital observations by the MESSENGER spacecraft. We have conducted k-means unsupervised hierarchical clustering to identify and characterize spectral units from MASCS observations. The results display a dichotomy: a polar and equatorial units, possibly linked to compositional differences or weathering due to irradiation. To explore possible relations between composition and spectral behavior, we have compared the spectral provinces with elemental abundance maps derived from MESSENGER's X-Ray Spectrometer (XRS).For the Vesta application on DAWN Visible and infrared spectrometer (VIR) data, we explored several Machine Learning techniques: image segmentation method, stream algorithm and hierarchical clustering.The algorithm successfully separates the Olivine outcrops around two craters on Vesta's surface [1]. New maps summarizing the spectral and chemical signature of the surface could be automatically produced.We conclude that instead of hand digging in data, scientist could choose a subset of algorithms with well known feature (i.e. efficacy on the particular problem, speed, accuracy) and focus their effort in understanding what important characteristic of the groups found in the data mean. [1] E Ammannito et al. "Olivine in an unexpected location on Vesta's surface". In: Nature 504.7478 (2013), pp. 122-125.
Ramón, M; Martínez-Pastor, F
2018-04-23
Computer-aided sperm analysis (CASA) produces a wealth of data that is frequently ignored. The use of multiparametric statistical methods can help explore these datasets, unveiling the subpopulation structure of sperm samples. In this review we analyse the significance of the internal heterogeneity of sperm samples and its relevance. We also provide a brief description of the statistical tools used for extracting sperm subpopulations from the datasets, namely unsupervised clustering (with non-hierarchical, hierarchical and two-step methods) and the most advanced supervised methods, based on machine learning. The former method has allowed exploration of subpopulation patterns in many species, whereas the latter offering further possibilities, especially considering functional studies and the practical use of subpopulation analysis. We also consider novel approaches, such as the use of geometric morphometrics or imaging flow cytometry. Finally, although the data provided by CASA systems provides valuable information on sperm samples by applying clustering analyses, there are several caveats. Protocols for capturing and analysing motility or morphometry should be standardised and adapted to each experiment, and the algorithms should be open in order to allow comparison of results between laboratories. Moreover, we must be aware of new technology that could change the paradigm for studying sperm motility and morphology.
UNSUPERVISED TRANSIENT LIGHT CURVE ANALYSIS VIA HIERARCHICAL BAYESIAN INFERENCE
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sanders, N. E.; Soderberg, A. M.; Betancourt, M., E-mail: nsanders@cfa.harvard.edu
2015-02-10
Historically, light curve studies of supernovae (SNe) and other transient classes have focused on individual objects with copious and high signal-to-noise observations. In the nascent era of wide field transient searches, objects with detailed observations are decreasing as a fraction of the overall known SN population, and this strategy sacrifices the majority of the information contained in the data about the underlying population of transients. A population level modeling approach, simultaneously fitting all available observations of objects in a transient sub-class of interest, fully mines the data to infer the properties of the population and avoids certain systematic biases. Wemore » present a novel hierarchical Bayesian statistical model for population level modeling of transient light curves, and discuss its implementation using an efficient Hamiltonian Monte Carlo technique. As a test case, we apply this model to the Type IIP SN sample from the Pan-STARRS1 Medium Deep Survey, consisting of 18,837 photometric observations of 76 SNe, corresponding to a joint posterior distribution with 9176 parameters under our model. Our hierarchical model fits provide improved constraints on light curve parameters relevant to the physical properties of their progenitor stars relative to modeling individual light curves alone. Moreover, we directly evaluate the probability for occurrence rates of unseen light curve characteristics from the model hyperparameters, addressing observational biases in survey methodology. We view this modeling framework as an unsupervised machine learning technique with the ability to maximize scientific returns from data to be collected by future wide field transient searches like LSST.« less
Personalized Medicine in Veterans with Traumatic Brain Injuries
2013-05-01
Pair-Group Method using Arithmetic averages ( UPGMA ) based on cosine correlation of row mean centered log2 signal values; this was the top 50%-tile...cluster- ing was performed by the UPGMA method using Cosine correlation as the similarity metric. For comparative purposes, clustered heat maps included...non-mTBI cases were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as the similarity
Molecular subtyping of bladder cancer using Kohonen self-organizing maps
Borkowska, Edyta M; Kruk, Andrzej; Jedrzejczyk, Adam; Rozniecki, Marek; Jablonowski, Zbigniew; Traczyk, Magdalena; Constantinou, Maria; Banaszkiewicz, Monika; Pietrusinski, Michal; Sosnowski, Marek; Hamdy, Freddie C; Peter, Stefan; Catto, James WF; Kaluzewski, Bogdan
2014-01-01
Kohonen self-organizing maps (SOMs) are unsupervised Artificial Neural Networks (ANNs) that are good for low-density data visualization. They easily deal with complex and nonlinear relationships between variables. We evaluated molecular events that characterize high- and low-grade BC pathways in the tumors from 104 patients. We compared the ability of statistical clustering with a SOM to stratify tumors according to the risk of progression to more advanced disease. In univariable analysis, tumor stage (log rank P = 0.006) and grade (P < 0.001), HPV DNA (P < 0.004), Chromosome 9 loss (P = 0.04) and the A148T polymorphism (rs 3731249) in CDKN2A (P = 0.02) were associated with progression. Multivariable analysis of these parameters identified that tumor grade (Cox regression, P = 0.001, OR.2.9 (95% CI 1.6–5.2)) and the presence of HPV DNA (P = 0.017, OR 3.8 (95% CI 1.3–11.4)) were the only independent predictors of progression. Unsupervised hierarchical clustering grouped the tumors into discreet branches but did not stratify according to progression free survival (log rank P = 0.39). These genetic variables were presented to SOM input neurons. SOMs are suitable for complex data integration, allow easy visualization of outcomes, and may stratify BC progression more robustly than hierarchical clustering. PMID:25142434
Effective implementation of hierarchical clustering
NASA Astrophysics Data System (ADS)
Verma, Mudita; Vijayarajan, V.; Sivashanmugam, G.; Bessie Amali, D. Geraldine
2017-11-01
Hierarchical clustering is generally used for cluster analysis in which we build up a hierarchy of clusters. In order to find that which cluster should be split a large amount of observations are being carried out. Here the data set of US based personalities has been considered for clustering. After implementation of hierarchical clustering on the data set we group it in three different clusters one is of politician, sports person and musicians. Training set is the main parameter which decides the category which has to be assigned to the observations that are being collected. The category of these observations must be known. Recognition comes from the formulation of classification. Supervised learning has the main instance in the form of classification. While on the other hand Clustering is an instance of unsupervised procedure. Clustering consists of grouping of data that have similar properties which are either their own or are inherited from some other sources.
NASA Astrophysics Data System (ADS)
Holtzman, B. K.; Paté, A.; Paisley, J.; Waldhauser, F.; Repetto, D.; Boschi, L.
2017-12-01
The earthquake process reflects complex interactions of stress, fracture and frictional properties. New machine learning methods reveal patterns in time-dependent spectral properties of seismic signals and enable identification of changes in faulting processes. Our methods are based closely on those developed for music information retrieval and voice recognition, using the spectrogram instead of the waveform directly. Unsupervised learning involves identification of patterns based on differences among signals without any additional information provided to the algorithm. Clustering of 46,000 earthquakes of $0.3
Deep Learning with Hierarchical Convolutional Factor Analysis
Chen, Bo; Polatkan, Gungor; Sapiro, Guillermo; Blei, David; Dunson, David; Carin, Lawrence
2013-01-01
Unsupervised multi-layered (“deep”) models are considered for general data, with a particular focus on imagery. The model is represented using a hierarchical convolutional factor-analysis construction, with sparse factor loadings and scores. The computation of layer-dependent model parameters is implemented within a Bayesian setting, employing a Gibbs sampler and variational Bayesian (VB) analysis, that explicitly exploit the convolutional nature of the expansion. In order to address large-scale and streaming data, an online version of VB is also developed. The number of basis functions or dictionary elements at each layer is inferred from the data, based on a beta-Bernoulli implementation of the Indian buffet process. Example results are presented for several image-processing applications, with comparisons to related models in the literature. PMID:23787342
Yang, Yang; Saleemi, Imran; Shah, Mubarak
2013-07-01
This paper proposes a novel representation of articulated human actions and gestures and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one or k-shot learning, and 2) meaningful organization of unlabeled datasets by unsupervised clustering. Our proposed representation is obtained by automatically discovering high-level subactions or motion primitives, by hierarchical clustering of observed optical flow in four-dimensional, spatial, and motion flow space. The completely unsupervised proposed method, in contrast to state-of-the-art representations like bag of video words, provides a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, like directional motion of limb or torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one--shot and k-shot learning, the sequence of primitive labels discovered in a test video are labeled using KL divergence, and can then be represented as a string and matched against similar strings of training videos. The same sequence can also be collapsed into a histogram of primitives or be used to learn a Hidden Markov model to represent classes. We have performed extensive experiments on recognition by one and k-shot learning as well as unsupervised action clustering on six human actions and gesture datasets, a composite dataset, and a database of facial expressions. These experiments confirm the validity and discriminative nature of the proposed representation.
Data-driven cluster reinforcement and visualization in sparsely-matched self-organizing maps.
Manukyan, Narine; Eppstein, Margaret J; Rizzo, Donna M
2012-05-01
A self-organizing map (SOM) is a self-organized projection of high-dimensional data onto a typically 2-dimensional (2-D) feature map, wherein vector similarity is implicitly translated into topological closeness in the 2-D projection. However, when there are more neurons than input patterns, it can be challenging to interpret the results, due to diffuse cluster boundaries and limitations of current methods for displaying interneuron distances. In this brief, we introduce a new cluster reinforcement (CR) phase for sparsely-matched SOMs. The CR phase amplifies within-cluster similarity in an unsupervised, data-driven manner. Discontinuities in the resulting map correspond to between-cluster distances and are stored in a boundary (B) matrix. We describe a new hierarchical visualization of cluster boundaries displayed directly on feature maps, which requires no further clustering beyond what was implicitly accomplished during self-organization in SOM training. We use a synthetic benchmark problem and previously published microbial community profile data to demonstrate the benefits of the proposed methods.
Statistical Significance for Hierarchical Clustering
Kimes, Patrick K.; Liu, Yufeng; Hayes, D. Neil; Marron, J. S.
2017-01-01
Summary Cluster analysis has proved to be an invaluable tool for the exploratory and unsupervised analysis of high dimensional datasets. Among methods for clustering, hierarchical approaches have enjoyed substantial popularity in genomics and other fields for their ability to simultaneously uncover multiple layers of clustering structure. A critical and challenging question in cluster analysis is whether the identified clusters represent important underlying structure or are artifacts of natural sampling variation. Few approaches have been proposed for addressing this problem in the context of hierarchical clustering, for which the problem is further complicated by the natural tree structure of the partition, and the multiplicity of tests required to parse the layers of nested clusters. In this paper, we propose a Monte Carlo based approach for testing statistical significance in hierarchical clustering which addresses these issues. The approach is implemented as a sequential testing procedure guaranteeing control of the family-wise error rate. Theoretical justification is provided for our approach, and its power to detect true clustering structure is illustrated through several simulation studies and applications to two cancer gene expression datasets. PMID:28099990
Learning of Chunking Sequences in Cognition and Behavior
Rabinovich, Mikhail
2015-01-01
We often learn and recall long sequences in smaller segments, such as a phone number 858 534 22 30 memorized as four segments. Behavioral experiments suggest that humans and some animals employ this strategy of breaking down cognitive or behavioral sequences into chunks in a wide variety of tasks, but the dynamical principles of how this is achieved remains unknown. Here, we study the temporal dynamics of chunking for learning cognitive sequences in a chunking representation using a dynamical model of competing modes arranged to evoke hierarchical Winnerless Competition (WLC) dynamics. Sequential memory is represented as trajectories along a chain of metastable fixed points at each level of the hierarchy, and bistable Hebbian dynamics enables the learning of such trajectories in an unsupervised fashion. Using computer simulations, we demonstrate the learning of a chunking representation of sequences and their robust recall. During learning, the dynamics associates a set of modes to each information-carrying item in the sequence and encodes their relative order. During recall, hierarchical WLC guarantees the robustness of the sequence order when the sequence is not too long. The resulting patterns of activities share several features observed in behavioral experiments, such as the pauses between boundaries of chunks, their size and their duration. Failures in learning chunking sequences provide new insights into the dynamical causes of neurological disorders such as Parkinson’s disease and Schizophrenia. PMID:26584306
Molecular subtyping of bladder cancer using Kohonen self-organizing maps.
Borkowska, Edyta M; Kruk, Andrzej; Jedrzejczyk, Adam; Rozniecki, Marek; Jablonowski, Zbigniew; Traczyk, Magdalena; Constantinou, Maria; Banaszkiewicz, Monika; Pietrusinski, Michal; Sosnowski, Marek; Hamdy, Freddie C; Peter, Stefan; Catto, James W F; Kaluzewski, Bogdan
2014-10-01
Kohonen self-organizing maps (SOMs) are unsupervised Artificial Neural Networks (ANNs) that are good for low-density data visualization. They easily deal with complex and nonlinear relationships between variables. We evaluated molecular events that characterize high- and low-grade BC pathways in the tumors from 104 patients. We compared the ability of statistical clustering with a SOM to stratify tumors according to the risk of progression to more advanced disease. In univariable analysis, tumor stage (log rank P = 0.006) and grade (P < 0.001), HPV DNA (P < 0.004), Chromosome 9 loss (P = 0.04) and the A148T polymorphism (rs 3731249) in CDKN2A (P = 0.02) were associated with progression. Multivariable analysis of these parameters identified that tumor grade (Cox regression, P = 0.001, OR.2.9 (95% CI 1.6-5.2)) and the presence of HPV DNA (P = 0.017, OR 3.8 (95% CI 1.3-11.4)) were the only independent predictors of progression. Unsupervised hierarchical clustering grouped the tumors into discreet branches but did not stratify according to progression free survival (log rank P = 0.39). These genetic variables were presented to SOM input neurons. SOMs are suitable for complex data integration, allow easy visualization of outcomes, and may stratify BC progression more robustly than hierarchical clustering. © 2014 The Authors. Cancer Medicine published by John Wiley & Sons Ltd.
Rough Set Based Splitting Criterion for Binary Decision Tree Classifiers
2006-09-26
Alata O. Fernandez-Maloigne C., and Ferrie J.C. (2001). Unsupervised Algorithm for the Segmentation of Three-Dimensional Magnetic Resonance Brain ...instinctual and learned responses in the brain , causing it to make decisions based on patterns in the stimuli. Using this deceptively simple process...2001. [2] Bohn C. (1997). An Incremental Unsupervised Learning Scheme for Function Approximation. In: Proceedings of the 1997 IEEE International
Deep Unsupervised Learning on a Desktop PC: A Primer for Cognitive Scientists.
Testolin, Alberto; Stoianov, Ivilin; De Filippo De Grazia, Michele; Zorzi, Marco
2013-01-01
Deep belief networks hold great promise for the simulation of human cognition because they show how structured and abstract representations may emerge from probabilistic unsupervised learning. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. However, learning in deep networks typically requires big datasets and it can involve millions of connection weights, which implies that simulations on standard computers are unfeasible. Developing realistic, medium-to-large-scale learning models of cognition would therefore seem to require expertise in programing parallel-computing hardware, and this might explain why the use of this promising approach is still largely confined to the machine learning community. Here we show how simulations of deep unsupervised learning can be easily performed on a desktop PC by exploiting the processors of low cost graphic cards (graphic processor units) without any specific programing effort, thanks to the use of high-level programming routines (available in MATLAB or Python). We also show that even an entry-level graphic card can outperform a small high-performance computing cluster in terms of learning time and with no loss of learning quality. We therefore conclude that graphic card implementations pave the way for a widespread use of deep learning among cognitive scientists for modeling cognition and behavior.
Deep Unsupervised Learning on a Desktop PC: A Primer for Cognitive Scientists
Testolin, Alberto; Stoianov, Ivilin; De Filippo De Grazia, Michele; Zorzi, Marco
2013-01-01
Deep belief networks hold great promise for the simulation of human cognition because they show how structured and abstract representations may emerge from probabilistic unsupervised learning. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. However, learning in deep networks typically requires big datasets and it can involve millions of connection weights, which implies that simulations on standard computers are unfeasible. Developing realistic, medium-to-large-scale learning models of cognition would therefore seem to require expertise in programing parallel-computing hardware, and this might explain why the use of this promising approach is still largely confined to the machine learning community. Here we show how simulations of deep unsupervised learning can be easily performed on a desktop PC by exploiting the processors of low cost graphic cards (graphic processor units) without any specific programing effort, thanks to the use of high-level programming routines (available in MATLAB or Python). We also show that even an entry-level graphic card can outperform a small high-performance computing cluster in terms of learning time and with no loss of learning quality. We therefore conclude that graphic card implementations pave the way for a widespread use of deep learning among cognitive scientists for modeling cognition and behavior. PMID:23653617
Andreev, Victor P; Gillespie, Brenda W; Helfand, Brian T; Merion, Robert M
2016-01-01
Unsupervised classification methods are gaining acceptance in omics studies of complex common diseases, which are often vaguely defined and are likely the collections of disease subtypes. Unsupervised classification based on the molecular signatures identified in omics studies have the potential to reflect molecular mechanisms of the subtypes of the disease and to lead to more targeted and successful interventions for the identified subtypes. Multiple classification algorithms exist but none is ideal for all types of data. Importantly, there are no established methods to estimate sample size in unsupervised classification (unlike power analysis in hypothesis testing). Therefore, we developed a simulation approach allowing comparison of misclassification errors and estimating the required sample size for a given effect size, number, and correlation matrix of the differentially abundant proteins in targeted proteomics studies. All the experiments were performed in silico. The simulated data imitated the expected one from the study of the plasma of patients with lower urinary tract dysfunction with the aptamer proteomics assay Somascan (SomaLogic Inc, Boulder, CO), which targeted 1129 proteins, including 330 involved in inflammation, 180 in stress response, 80 in aging, etc. Three popular clustering methods (hierarchical, k-means, and k-medoids) were compared. K-means clustering performed much better for the simulated data than the other two methods and enabled classification with misclassification error below 5% in the simulated cohort of 100 patients based on the molecular signatures of 40 differentially abundant proteins (effect size 1.5) from among the 1129-protein panel. PMID:27524871
Modeling language and cognition with deep unsupervised learning: a tutorial overview
Zorzi, Marco; Testolin, Alberto; Stoianov, Ivilin P.
2013-01-01
Deep unsupervised learning in stochastic recurrent neural networks with many layers of hidden units is a recent breakthrough in neural computation research. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. In this article we discuss the theoretical foundations of this approach and we review key issues related to training, testing and analysis of deep networks for modeling language and cognitive processing. The classic letter and word perception problem of McClelland and Rumelhart (1981) is used as a tutorial example to illustrate how structured and abstract representations may emerge from deep generative learning. We argue that the focus on deep architectures and generative (rather than discriminative) learning represents a crucial step forward for the connectionist modeling enterprise, because it offers a more plausible model of cortical learning as well as a way to bridge the gap between emergentist connectionist models and structured Bayesian models of cognition. PMID:23970869
Modeling language and cognition with deep unsupervised learning: a tutorial overview.
Zorzi, Marco; Testolin, Alberto; Stoianov, Ivilin P
2013-01-01
Deep unsupervised learning in stochastic recurrent neural networks with many layers of hidden units is a recent breakthrough in neural computation research. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. In this article we discuss the theoretical foundations of this approach and we review key issues related to training, testing and analysis of deep networks for modeling language and cognitive processing. The classic letter and word perception problem of McClelland and Rumelhart (1981) is used as a tutorial example to illustrate how structured and abstract representations may emerge from deep generative learning. We argue that the focus on deep architectures and generative (rather than discriminative) learning represents a crucial step forward for the connectionist modeling enterprise, because it offers a more plausible model of cortical learning as well as a way to bridge the gap between emergentist connectionist models and structured Bayesian models of cognition.
Unsupervised spike sorting based on discriminative subspace learning.
Keshtkaran, Mohammad Reza; Yang, Zhi
2014-01-01
Spike sorting is a fundamental preprocessing step for many neuroscience studies which rely on the analysis of spike trains. In this paper, we present two unsupervised spike sorting algorithms based on discriminative subspace learning. The first algorithm simultaneously learns the discriminative feature subspace and performs clustering. It uses histogram of features in the most discriminative projection to detect the number of neurons. The second algorithm performs hierarchical divisive clustering that learns a discriminative 1-dimensional subspace for clustering in each level of the hierarchy until achieving almost unimodal distribution in the subspace. The algorithms are tested on synthetic and in-vivo data, and are compared against two widely used spike sorting methods. The comparative results demonstrate that our spike sorting methods can achieve substantially higher accuracy in lower dimensional feature space, and they are highly robust to noise. Moreover, they provide significantly better cluster separability in the learned subspace than in the subspace obtained by principal component analysis or wavelet transform.
Lopez-Meyer, Paulo; Schuckers, Stephanie; Makeyev, Oleksandr; Fontana, Juan M; Sazonov, Edward
2012-09-01
The number of distinct foods consumed in a meal is of significant clinical concern in the study of obesity and other eating disorders. This paper proposes the use of information contained in chewing and swallowing sequences for meal segmentation by food types. Data collected from experiments of 17 volunteers were analyzed using two different clustering techniques. First, an unsupervised clustering technique, Affinity Propagation (AP), was used to automatically identify the number of segments within a meal. Second, performance of the unsupervised AP method was compared to a supervised learning approach based on Agglomerative Hierarchical Clustering (AHC). While the AP method was able to obtain 90% accuracy in predicting the number of food items, the AHC achieved an accuracy >95%. Experimental results suggest that the proposed models of automatic meal segmentation may be utilized as part of an integral application for objective Monitoring of Ingestive Behavior in free living conditions.
Vékony, Hedy; Röser, Kerstin; Löning, Thomas; Ylstra, Bauke; Meijer, Gerrit A; van Wieringen, Wessel N; van de Wiel, Mark A; Carvalho, Beatriz; Kok, Klaas; Leemans, C René; van der Waal, Isaäc; Bloemena, Elisabeth
2009-02-01
Salivary gland myoepithelial tumors are relatively uncommon tumors with an unpredictable clinical course. More knowledge about their genetic profiles is necessary to identify novel predictors of disease. In this study, we subjected 27 primary tumors (15 myoepitheliomas and 12 myoepithelial carcinomas) to genome-wide microarray-based comparative genomic hybridization (array CGH). We set out to delineate known chromosomal aberrations in more detail and to unravel chromosomal differences between benign myoepitheliomas and myoepithelial carcinomas. Patterns of DNA copy number aberrations were analyzed by unsupervised hierarchical cluster analysis. Both benign and malignant tumors revealed a limited amount of chromosomal alterations (median of 5 and 7.5, respectively). In both tumor groups, high frequency gains (> or =20%) were found mainly at loci of growth factors and growth factor receptors (e.g., PDGF, FGF(R)s, and EGFR). In myoepitheliomas, high frequency losses (> or =20%) were detected at regions of proto-cadherins. Cluster analysis of the array CGH data identified three clusters. Differential copy numbers on chromosome arm 8q and chromosome 17 set the clusters apart. Cluster 1 contained a mixture of the two phenotypes (n = 10), cluster 2 included mostly benign tumors (n = 10), and cluster 3 only contained carcinomas (n = 7). Supervised analysis between malignant and benign tumors revealed a 36 Mbp-region at 8q being more frequently gained in malignant tumors (P = 0.007, FDR = 0.05). This is the first study investigating genomic differences between benign and malignant myoepithelial tumors of the salivary glands at a genomic level. Both unsupervised and supervised analysis of the genomic profiles revealed chromosome arm 8q to be involved in the malignant phenotype of salivary gland myoepitheliomas.
Layher, Georg; Schrodt, Fabian; Butz, Martin V.; Neumann, Heiko
2014-01-01
The categorization of real world objects is often reflected in the similarity of their visual appearances. Such categories of objects do not necessarily form disjunct sets of objects, neither semantically nor visually. The relationship between categories can often be described in terms of a hierarchical structure. For instance, tigers and leopards build two separate mammalian categories, both of which are subcategories of the category Felidae. In the last decades, the unsupervised learning of categories of visual input stimuli has been addressed by numerous approaches in machine learning as well as in computational neuroscience. However, the question of what kind of mechanisms might be involved in the process of subcategory learning, or category refinement, remains a topic of active investigation. We propose a recurrent computational network architecture for the unsupervised learning of categorial and subcategorial visual input representations. During learning, the connection strengths of bottom-up weights from input to higher-level category representations are adapted according to the input activity distribution. In a similar manner, top-down weights learn to encode the characteristics of a specific stimulus category. Feedforward and feedback learning in combination realize an associative memory mechanism, enabling the selective top-down propagation of a category's feedback weight distribution. We suggest that the difference between the expected input encoded in the projective field of a category node and the current input pattern controls the amplification of feedforward-driven representations. Large enough differences trigger the recruitment of new representational resources and the establishment of additional (sub-) category representations. We demonstrate the temporal evolution of such learning and show how the proposed combination of an associative memory with a modulatory feedback integration successfully establishes category and subcategory representations. PMID:25538637
Jiang, Jheng Jie; Lee, Chon Lin; Fang, Meng Der; Boyd, Kenneth G.; Gibb, Stuart W.
2015-01-01
This paper presents a methodology based on multivariate data analysis for characterizing potential source contributions of emerging contaminants (ECs) detected in 26 river water samples across multi-scape regions during dry and wet seasons. Based on this methodology, we unveil an approach toward potential source contributions of ECs, a concept we refer to as the “Pharmaco-signature.” Exploratory analysis of data points has been carried out by unsupervised pattern recognition (hierarchical cluster analysis, HCA) and receptor model (principal component analysis-multiple linear regression, PCA-MLR) in an attempt to demonstrate significant source contributions of ECs in different land-use zone. Robust cluster solutions grouped the database according to different EC profiles. PCA-MLR identified that 58.9% of the mean summed ECs were contributed by domestic impact, 9.7% by antibiotics application, and 31.4% by drug abuse. Diclofenac, ibuprofen, codeine, ampicillin, tetracycline, and erythromycin-H2O have significant pollution risk quotients (RQ>1), indicating potentially high risk to aquatic organisms in Taiwan. PMID:25874375
Subgroup Specific Alternative Splicing in Medulloblastoma
Kloosterhof, Nanne K; Northcott, Paul A; Yu, Emily PY; Shih, David; Peacock, John; Grajkowska, Wieslawa; van Meter, Timothy; Eberhart, Charles G; Pfister, Stefan; Marra, Marco A; Weiss, William A; Scherer, Stephen W; Rutka, James T; French, Pim J; Taylor, Michael D
2014-01-01
Medulloblastoma is comprised of four distinct molecular variants: WNT, SHH, Group 3, and Group 4. We analyzed alternative splicing usage in 14 normal cerebellar samples and 103 medulloblastomas of known subgroup. Medulloblastoma samples have a statistically significant increase in alternative splicing as compared to normal fetal cerebella (2.3-times; P<6.47E-8). Splicing patterns are distinct and specific between molecular subgroups. Unsupervised hierarchical clustering of alternative splicing events accurately assigns medulloblastomas to their correct subgroup. Subgroup-specific splicing and alternative promoter usage was most prevalent in Group 3 (19.4%) and SHH (16.2%) medulloblastomas, while observed less frequently in WNT (3.2%), and Group 4 (9.3%) tumors. Functional annotation of alternatively spliced genes reveals over-representation of genes important for neuronal development. Alternative splicing events in medulloblastoma may be regulated in part by the correlative expression of antisense transcripts, suggesting a possible mechanism affecting subgroup specific alternative splicing. Our results identify additional candidate markers for medulloblastoma subgroup affiliation, further support the existence of distinct subgroups of the disease, and demonstrate an additional level of transcriptional heterogeneity between medulloblastoma subgroups. PMID:22358458
Collected Notes on the Workshop for Pattern Discovery in Large Databases
NASA Technical Reports Server (NTRS)
Buntine, Wray (Editor); Delalto, Martha (Editor)
1991-01-01
These collected notes are a record of material presented at the Workshop. The core data analysis is addressed that have traditionally required statistical or pattern recognition techniques. Some of the core tasks include classification, discrimination, clustering, supervised and unsupervised learning, discovery and diagnosis, i.e., general pattern discovery.
Some simple guides to finding useful information in exploration geochemical data
Singer, D.A.; Kouda, R.
2001-01-01
Most regional geochemistry data reflect processes that can produce superfluous bits of noise and, perhaps, information about the mineralization process of interest. There are two end-member approaches to finding patterns in geochemical data-unsupervised learning and supervised learning. In unsupervised learning, data are processed and the geochemist is given the task of interpreting and identifying possible sources of any patterns. In supervised learning, data from known subgroups such as rock type, mineralized and nonmineralized, and types of mineralization are used to train the system which then is given unknown samples to classify into these subgroups. To locate patterns of interest, it is helpful to transform the data and to remove unwanted masking patterns. With trace elements use of a logarithmic transformation is recommended. In many situations, missing censored data can be estimated using multiple regression of other uncensored variables on the variable with censored values. In unsupervised learning, transformed values can be standardized, or normalized, to a Z-score by subtracting the subset's mean and dividing by its standard deviation. Subsets include any source of differences that might be related to processes unrelated to the target sought such as different laboratories, regional alteration, analytical procedures, or rock types. Normalization removes effects of different means and measurement scales as well as facilitates comparison of spatial patterns of elements. These adjustments remove effects of different subgroups and hopefully leave on the map the simple and uncluttered pattern(s) related to the mineralization only. Supervised learning methods, such as discriminant analysis and neural networks, offer the promise of consistent and, in certain situations, unbiased estimates of where mineralization might exist. These methods critically rely on being trained with data that encompasses all populations fairly and that can possibly fall into only the identified populations. ?? 2001 International Association for Mathematical Geology.
An automatic taxonomy of galaxy morphology using unsupervised machine learning
NASA Astrophysics Data System (ADS)
Hocking, Alex; Geach, James E.; Sun, Yi; Davey, Neil
2018-01-01
We present an unsupervised machine learning technique that automatically segments and labels galaxies in astronomical imaging surveys using only pixel data. Distinct from previous unsupervised machine learning approaches used in astronomy we use no pre-selection or pre-filtering of target galaxy type to identify galaxies that are similar. We demonstrate the technique on the Hubble Space Telescope (HST) Frontier Fields. By training the algorithm using galaxies from one field (Abell 2744) and applying the result to another (MACS 0416.1-2403), we show how the algorithm can cleanly separate early and late type galaxies without any form of pre-directed training for what an 'early' or 'late' type galaxy is. We then apply the technique to the HST Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS) fields, creating a catalogue of approximately 60 000 classifications. We show how the automatic classification groups galaxies of similar morphological (and photometric) type and make the classifications public via a catalogue, a visual catalogue and galaxy similarity search. We compare the CANDELS machine-based classifications to human-classifications from the Galaxy Zoo: CANDELS project. Although there is not a direct mapping between Galaxy Zoo and our hierarchical labelling, we demonstrate a good level of concordance between human and machine classifications. Finally, we show how the technique can be used to identify rarer objects and present lensed galaxy candidates from the CANDELS imaging.
Lopane, Giovanna; Mellone, Sabato; Corzani, Mattia; Chiari, Lorenzo; Cortelli, Pietro; Calandra-Buonaura, Giovanna; Contin, Manuela
2018-06-01
We aimed to assess the intrasubject reproducibility of a technology-based levodopa (LD) therapeutic monitoring protocol administered in supervised versus unsupervised conditions in patients with Parkinson's disease (PD). The study design was pilot, intrasubject, single center, open and prospective. Twenty patients were recruited. Patients performed a standardized monitoring protocol instrumented by an ad hoc embedded platform after their usual first morning LD dose in two different randomized ambulatory sessions: one under a physician's supervision, the other self-administered. The protocol is made up of serial motor and non-motor tests, including alternate finger tapping, Timed Up and Go test, and measurement of blood pressure. Primary motor outcomes included comparisons of intrasubject LD subacute motor response patterns over the 3-h test in the two experimental conditions. Secondary outcomes were the number of intrasession serial test repetitions due to technical or handling errors and patients' satisfaction with the unsupervised LD monitoring protocol. Intrasubject LD motor response patterns were concordant between the two study sessions in all patients but one. Platform handling problems averaged 4% of total planned serial tests for both sessions. Ninety-five percent of patients were satisfied with the self-administered LD monitoring protocol. To our knowledge, this study is the first to explore the potential of unsupervised technology-based objective motor and non-motor tasks to monitor subacute LD dosing effects in PD patients. The results are promising for future telemedicine applications.
Semi-supervised and unsupervised extreme learning machines.
Huang, Gao; Song, Shiji; Gupta, Jatinder N D; Wu, Cheng
2014-12-01
Extreme learning machines (ELMs) have proven to be efficient and effective learning mechanisms for pattern classification and regression. However, ELMs are primarily applied to supervised learning problems. Only a few existing research papers have used ELMs to explore unlabeled data. In this paper, we extend ELMs for both semi-supervised and unsupervised tasks based on the manifold regularization, thus greatly expanding the applicability of ELMs. The key advantages of the proposed algorithms are as follows: 1) both the semi-supervised ELM (SS-ELM) and the unsupervised ELM (US-ELM) exhibit learning capability and computational efficiency of ELMs; 2) both algorithms naturally handle multiclass classification or multicluster clustering; and 3) both algorithms are inductive and can handle unseen data at test time directly. Moreover, it is shown in this paper that all the supervised, semi-supervised, and unsupervised ELMs can actually be put into a unified framework. This provides new perspectives for understanding the mechanism of random feature mapping, which is the key concept in ELM theory. Empirical study on a wide range of data sets demonstrates that the proposed algorithms are competitive with the state-of-the-art semi-supervised or unsupervised learning algorithms in terms of accuracy and efficiency.
NASA Astrophysics Data System (ADS)
Langer, H. K.; Falsaperla, S. M.; Behncke, B.; Messina, A.; Spampinato, S.
2009-12-01
Artificial Intelligence (AI) has found broad applications in volcano observatories worldwide with the aim of reducing volcanic hazard. The need to process larger and larger quantity of data makes indeed AI techniques appealing for monitoring purposes. Tools based on Artificial Neural Networks and Support Vector Machine have proved to be particularly successful in the classification of seismic events and volcanic tremor changes heralding eruptive activity, such as paroxysmal explosions and lava fountaining at Stromboli and Mt Etna, Italy (e.g., Falsaperla et al., 1996; Langer et al., 2009). Moving on from the excellent results obtained from these applications, we present KKAnalysis, a MATLAB based software which combines several unsupervised pattern classification methods, exploiting routines of the SOM Toolbox 2 for MATLAB (http://www.cis.hut.fi/projects/somtoolbox). KKAnalysis is based on Self Organizing Maps (SOM) and clustering methods consisting of K-Means, Fuzzy C-Means, and a scheme based on a metrics accounting for correlation between components of the feature vector. We show examples of applications of this tool to volcanic tremor data recorded at Mt Etna between 2007 and 2009. This time span - during which Strombolian explosions, 7 episodes of lava fountaining and effusive activity occurred - is particularly interesting, as it encompassed different states of volcanic activity (i.e., non-eruptive, eruptive according to different styles) for the unsupervised classifier to identify, highlighting their development in time. Even subtle changes in the signal characteristics allow the unsupervised classifier to recognize features belonging to the different classes and stages of volcanic activity. A convenient color-code representation shows up the temporal development of the different classes of signal, making this method extremely helpful for monitoring purposes and surveillance. Though being developed for volcanic tremor classification, KKAnalysis is generally applicable to any type of physical or chemical pattern, provided that feature vectors are given in numerical form. References: Falsaperla, S., S. Graziani, G. Nunnari, and S. Spampinato (1996). Automatic classification of volcanic earthquakes by using multy-layered neural networks. Natural Hazard, 13, 205-228. Langer, H., S. Falsaperla, M. Masotti, R. Campanini, S. Spampinato, and A. Messina (2008). Synopsis of supervised and unsupervised pattern classification techniques applied to volcanic tremor data at Mt Etna, Italy. Geophys. J. Int., doi:10.1111/j.1365-246X.2009.04179.x.
Multi person detection and tracking based on hierarchical level-set method
NASA Astrophysics Data System (ADS)
Khraief, Chadia; Benzarti, Faouzi; Amiri, Hamid
2018-04-01
In this paper, we propose an efficient unsupervised method for mutli-person tracking based on hierarchical level-set approach. The proposed method uses both edge and region information in order to effectively detect objects. The persons are tracked on each frame of the sequence by minimizing an energy functional that combines color, texture and shape information. These features are enrolled in covariance matrix as region descriptor. The present method is fully automated without the need to manually specify the initial contour of Level-set. It is based on combined person detection and background subtraction methods. The edge-based is employed to maintain a stable evolution, guide the segmentation towards apparent boundaries and inhibit regions fusion. The computational cost of level-set is reduced by using narrow band technique. Many experimental results are performed on challenging video sequences and show the effectiveness of the proposed method.
Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan
2008-11-06
Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting con-textual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies.
Xu, Rong; Supekar, Kaustubh; Morgan, Alex; Das, Amar; Garber, Alan
2008-01-01
Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting contextual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35–88%) over available, manually created disease terminologies. PMID:18999169
Identification of chronic rhinosinusitis phenotypes using cluster analysis.
Soler, Zachary M; Hyer, J Madison; Ramakrishnan, Viswanathan; Smith, Timothy L; Mace, Jess; Rudmik, Luke; Schlosser, Rodney J
2015-05-01
Current clinical classifications of chronic rhinosinusitis (CRS) have been largely defined based upon preconceived notions of factors thought to be important, such as polyp or eosinophil status. Unfortunately, these classification systems have little correlation with symptom severity or treatment outcomes. Unsupervised clustering can be used to identify phenotypic subgroups of CRS patients, describe clinical differences in these clusters and define simple algorithms for classification. A multi-institutional, prospective study of 382 patients with CRS who had failed initial medical therapy completed the Sino-Nasal Outcome Test (SNOT-22), Rhinosinusitis Disability Index (RSDI), Medical Outcomes Study Short Form-12 (SF-12), Pittsburgh Sleep Quality Index (PSQI), and Patient Health Questionnaire (PHQ-2). Objective measures of CRS severity included Brief Smell Identification Test (B-SIT), CT, and endoscopy scoring. All variables were reduced and unsupervised hierarchical clustering was performed. After clusters were defined, variations in medication usage were analyzed. Discriminant analysis was performed to develop a simplified, clinically useful algorithm for clustering. Clustering was largely determined by age, severity of patient reported outcome measures, depression, and fibromyalgia. CT and endoscopy varied somewhat among clusters. Traditional clinical measures, including polyp/atopic status, prior surgery, B-SIT and asthma, did not vary among clusters. A simplified algorithm based upon productivity loss, SNOT-22 score, and age predicted clustering with 89% accuracy. Medication usage among clusters did vary significantly. A simplified algorithm based upon hierarchical clustering is able to classify CRS patients and predict medication usage. Further studies are warranted to determine if such clustering predicts treatment outcomes. © 2015 ARS-AAOA, LLC.
Berger, Christoph T; Greiff, Victor; Mehling, Matthias; Fritz, Stefanie; Meier, Marc A; Hoenger, Gideon; Conen, Anna; Recher, Mike; Battegay, Manuel; Reddy, Sai T; Hess, Christoph
2015-01-01
Vaccines dramatically reduce infection-related morbidity and mortality. Determining factors that modulate the host response is key to rational vaccine design and demands unsupervised analysis. To longitudinally resolve influenza-specific humoral immune response dynamics we constructed vaccine response profiles of influenza A- and B-specific IgM and IgG levels from 42 healthy and 31 HIV infected influenza-vaccinated individuals. Pre-vaccination antibody levels and levels at 3 predefined time points after vaccination were included in each profile. We performed hierarchical clustering on these profiles to study the extent to which HIV infection associated immune dysfunction, adaptive immune factors (pre-existing influenza-specific antibodies, T cell responses), an innate immune factor (Mannose Binding Lectin, MBL), demographic characteristics (gender, age), or the vaccine preparation (split vs. virosomal) impacted the immune response to influenza vaccination. Hierarchical clustering associated vaccine preparation and pre-existing IgG levels with the profiles of healthy individuals. In contrast to previous in vitro and animal data, MBL levels had no impact on the adaptive vaccine response. Importantly, while HIV infected subjects with low CD4 T cell counts showed a reduced magnitude of their vaccine response, their response profiles were indistinguishable from those of healthy controls, suggesting quantitative but not qualitative deficits. Unsupervised profile-based analysis ranks factors impacting the vaccine-response by relative importance, with substantial implications for comparing, designing and improving vaccine preparations and strategies. Profile similarity between HIV infected and HIV negative individuals suggests merely quantitative differences in the vaccine response in these individuals, offering a rationale for boosting strategies in the HIV infected population.
Discursive Hierarchical Patterning in Economics Cases
ERIC Educational Resources Information Center
Lung, Jane
2011-01-01
This paper attempts to apply Lung's (2008) model of the discursive hierarchical patterning of cases to a closer and more specific study of Economics cases and proposes a model of the distinct discursive hierarchical patterning of the same. It examines a corpus of 150 Economics cases with a view to uncovering the patterns of discourse construction.…
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Arevalo, John; Basavanhally, Ajay; Madabhushi, Anant; González, Fabio
2015-01-01
Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building-blocks that better represent the information in it. Digitized histopathology images represents a very good testbed for representation learning since it involves large amounts of high complex, visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on what type of learning architectures (deep or shallow), type of learning (unsupervised or supervised) is optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruct independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances and that topographic constraints provide useful invariant features in scale and rotations for efficient tumor differentiation.
Learning Behavior Characterization with Multi-Feature, Hierarchical Activity Sequences
ERIC Educational Resources Information Center
Ye, Cheng; Segedy, James R.; Kinnebrew, John S.; Biswas, Gautam
2015-01-01
This paper discusses Multi-Feature Hierarchical Sequential Pattern Mining, MFH-SPAM, a novel algorithm that efficiently extracts patterns from students' learning activity sequences. This algorithm extends an existing sequential pattern mining algorithm by dynamically selecting the level of specificity for hierarchically-defined features…
Wang, Nancy X. R.; Olson, Jared D.; Ojemann, Jeffrey G.; Rao, Rajesh P. N.; Brunton, Bingni W.
2016-01-01
Fully automated decoding of human activities and intentions from direct neural recordings is a tantalizing challenge in brain-computer interfacing. Implementing Brain Computer Interfaces (BCIs) outside carefully controlled experiments in laboratory settings requires adaptive and scalable strategies with minimal supervision. Here we describe an unsupervised approach to decoding neural states from naturalistic human brain recordings. We analyzed continuous, long-term electrocorticography (ECoG) data recorded over many days from the brain of subjects in a hospital room, with simultaneous audio and video recordings. We discovered coherent clusters in high-dimensional ECoG recordings using hierarchical clustering and automatically annotated them using speech and movement labels extracted from audio and video. To our knowledge, this represents the first time techniques from computer vision and speech processing have been used for natural ECoG decoding. Interpretable behaviors were decoded from ECoG data, including moving, speaking and resting; the results were assessed by comparison with manual annotation. Discovered clusters were projected back onto the brain revealing features consistent with known functional areas, opening the door to automated functional brain mapping in natural settings. PMID:27148018
Semi-supervised clustering for parcellating brain regions based on resting state fMRI data
NASA Astrophysics Data System (ADS)
Cheng, Hewei; Fan, Yong
2014-03-01
Many unsupervised clustering techniques have been adopted for parcellating brain regions of interest into functionally homogeneous subregions based on resting state fMRI data. However, the unsupervised clustering techniques are not able to take advantage of exiting knowledge of the functional neuroanatomy readily available from studies of cytoarchitectonic parcellation or meta-analysis of the literature. In this study, we propose a semi-supervised clustering method for parcellating amygdala into functionally homogeneous subregions based on resting state fMRI data. Particularly, the semi-supervised clustering is implemented under the framework of graph partitioning, and adopts prior information and spatial consistent constraints to obtain a spatially contiguous parcellation result. The graph partitioning problem is solved using an efficient algorithm similar to the well-known weighted kernel k-means algorithm. Our method has been validated for parcellating amygdala into 3 subregions based on resting state fMRI data of 28 subjects. The experiment results have demonstrated that the proposed method is more robust than unsupervised clustering and able to parcellate amygdala into centromedial, laterobasal, and superficial parts with improved functionally homogeneity compared with the cytoarchitectonic parcellation result. The validity of the parcellation results is also supported by distinctive functional and structural connectivity patterns of the subregions and high consistency between coactivation patterns derived from a meta-analysis and functional connectivity patterns of corresponding subregions.
Unsupervised Learning in an Ensemble of Spiking Neural Networks Mediated by ITDP.
Shim, Yoonsik; Philippides, Andrew; Staras, Kevin; Husbands, Phil
2016-10-01
We propose a biologically plausible architecture for unsupervised ensemble learning in a population of spiking neural network classifiers. A mixture of experts type organisation is shown to be effective, with the individual classifier outputs combined via a gating network whose operation is driven by input timing dependent plasticity (ITDP). The ITDP gating mechanism is based on recent experimental findings. An abstract, analytically tractable model of the ITDP driven ensemble architecture is derived from a logical model based on the probabilities of neural firing events. A detailed analysis of this model provides insights that allow it to be extended into a full, biologically plausible, computational implementation of the architecture which is demonstrated on a visual classification task. The extended model makes use of a style of spiking network, first introduced as a model of cortical microcircuits, that is capable of Bayesian inference, effectively performing expectation maximization. The unsupervised ensemble learning mechanism, based around such spiking expectation maximization (SEM) networks whose combined outputs are mediated by ITDP, is shown to perform the visual classification task well and to generalize to unseen data. The combined ensemble performance is significantly better than that of the individual classifiers, validating the ensemble architecture and learning mechanisms. The properties of the full model are analysed in the light of extensive experiments with the classification task, including an investigation into the influence of different input feature selection schemes and a comparison with a hierarchical STDP based ensemble architecture.
GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge
Wagner, Florian
2015-01-01
Method Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. Results I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets. PMID:26575370
GO-PCA: An Unsupervised Method to Explore Gene Expression Data Using Prior Knowledge.
Wagner, Florian
2015-01-01
Genome-wide expression profiling is a widely used approach for characterizing heterogeneous populations of cells, tissues, biopsies, or other biological specimen. The exploratory analysis of such data typically relies on generic unsupervised methods, e.g. principal component analysis (PCA) or hierarchical clustering. However, generic methods fail to exploit prior knowledge about the molecular functions of genes. Here, I introduce GO-PCA, an unsupervised method that combines PCA with nonparametric GO enrichment analysis, in order to systematically search for sets of genes that are both strongly correlated and closely functionally related. These gene sets are then used to automatically generate expression signatures with functional labels, which collectively aim to provide a readily interpretable representation of biologically relevant similarities and differences. The robustness of the results obtained can be assessed by bootstrapping. I first applied GO-PCA to datasets containing diverse hematopoietic cell types from human and mouse, respectively. In both cases, GO-PCA generated a small number of signatures that represented the majority of lineages present, and whose labels reflected their respective biological characteristics. I then applied GO-PCA to human glioblastoma (GBM) data, and recovered signatures associated with four out of five previously defined GBM subtypes. My results demonstrate that GO-PCA is a powerful and versatile exploratory method that reduces an expression matrix containing thousands of genes to a much smaller set of interpretable signatures. In this way, GO-PCA aims to facilitate hypothesis generation, design of further analyses, and functional comparisons across datasets.
Unsupervised Learning in an Ensemble of Spiking Neural Networks Mediated by ITDP
Staras, Kevin
2016-01-01
We propose a biologically plausible architecture for unsupervised ensemble learning in a population of spiking neural network classifiers. A mixture of experts type organisation is shown to be effective, with the individual classifier outputs combined via a gating network whose operation is driven by input timing dependent plasticity (ITDP). The ITDP gating mechanism is based on recent experimental findings. An abstract, analytically tractable model of the ITDP driven ensemble architecture is derived from a logical model based on the probabilities of neural firing events. A detailed analysis of this model provides insights that allow it to be extended into a full, biologically plausible, computational implementation of the architecture which is demonstrated on a visual classification task. The extended model makes use of a style of spiking network, first introduced as a model of cortical microcircuits, that is capable of Bayesian inference, effectively performing expectation maximization. The unsupervised ensemble learning mechanism, based around such spiking expectation maximization (SEM) networks whose combined outputs are mediated by ITDP, is shown to perform the visual classification task well and to generalize to unseen data. The combined ensemble performance is significantly better than that of the individual classifiers, validating the ensemble architecture and learning mechanisms. The properties of the full model are analysed in the light of extensive experiments with the classification task, including an investigation into the influence of different input feature selection schemes and a comparison with a hierarchical STDP based ensemble architecture. PMID:27760125
NASA Technical Reports Server (NTRS)
Jammu, Vinay B.; Danai, Kourosh; Lewicki, David G.
1996-01-01
A new unsupervised pattern classifier is introduced for on-line detection of abnormality in features of vibration that are used for fault diagnosis of helicopter gearboxes. This classifier compares vibration features with their respective normal values and assigns them a value in (0, 1) to reflect their degree of abnormality. Therefore, the salient feature of this classifier is that it does not require feature values associated with faulty cases to identify abnormality. In order to cope with noise and changes in the operating conditions, an adaptation algorithm is incorporated that continually updates the normal values of the features. The proposed classifier is tested using experimental vibration features obtained from an OH-58A main rotor gearbox. The overall performance of this classifier is then evaluated by integrating the abnormality-scaled features for detection of faults. The fault detection results indicate that the performance of this classifier is comparable to the leading unsupervised neural networks: Kohonen's Feature Mapping and Adaptive Resonance Theory (AR72). This is significant considering that the independence of this classifier from fault-related features makes it uniquely suited to abnormality-scaling of vibration features for fault diagnosis.
Hierarchical Aligned Cluster Analysis for Temporal Clustering of Human Motion.
Zhou, Feng; De la Torre, Fernando; Hodgins, Jessica K
2013-03-01
Temporal segmentation of human motion into plausible motion primitives is central to understanding and building computational models of human motion. Several issues contribute to the challenge of discovering motion primitives: the exponential nature of all possible movement combinations, the variability in the temporal scale of human actions, and the complexity of representing articulated motion. We pose the problem of learning motion primitives as one of temporal clustering, and derive an unsupervised hierarchical bottom-up framework called hierarchical aligned cluster analysis (HACA). HACA finds a partition of a given multidimensional time series into m disjoint segments such that each segment belongs to one of k clusters. HACA combines kernel k-means with the generalized dynamic time alignment kernel to cluster time series data. Moreover, it provides a natural framework to find a low-dimensional embedding for time series. HACA is efficiently optimized with a coordinate descent strategy and dynamic programming. Experimental results on motion capture and video data demonstrate the effectiveness of HACA for segmenting complex motions and as a visualization tool. We also compare the performance of HACA to state-of-the-art algorithms for temporal clustering on data of a honey bee dance. The HACA code is available online.
Apollo, Alessandro; Pescucci, Chiara; Licastro, Danilo; Urso, Carmelo; Gerlini, Gianni; Borgognoni, Lorenzo; Luzzatto, Lucio; Stecca, Barbara
2016-01-01
Cutaneous melanoma is one of the most aggressive type of skin tumor. Early stage melanoma can be often cured by surgery; therefore current management guidelines dictate a different approach for thin (<1mm) versus thick (>4mm) melanomas. We have carried out whole-exome sequencing in 5 thin and 5 thick fresh-frozen primary cutaneous melanomas. Unsupervised hierarchical clustering analysis of somatic copy number alterations (SCNAs) identified two groups corresponding to thin and thick melanomas. The most striking difference between them was the much greater abundance of SCNAs in thick melanomas, whereas mutation frequency did not significantly change between the two groups. We found novel mutations and focal SCNAs in genes that are embryonic regulators of axon guidance, predominantly in thick melanomas. Analysis of publicly available microarray datasets provided further support for a potential role of Ephrin receptors in melanoma progression. In addition, we have identified a set of SCNAs, including amplification of BRAF and ofthe epigenetic modifier EZH2, that are specific for the group of thick melanomas that developed metastasis during the follow-up. Our data suggest that mutations occur early during melanoma development, whereas SCNAs might be involved in melanoma progression. PMID:27095580
Montagnani, Valentina; Benelli, Matteo; Apollo, Alessandro; Pescucci, Chiara; Licastro, Danilo; Urso, Carmelo; Gerlini, Gianni; Borgognoni, Lorenzo; Luzzatto, Lucio; Stecca, Barbara
2016-05-24
Cutaneous melanoma is one of the most aggressive type of skin tumor. Early stage melanoma can be often cured by surgery; therefore current management guidelines dictate a different approach for thin (<1mm) versus thick (>4mm) melanomas. We have carried out whole-exome sequencing in 5 thin and 5 thick fresh-frozen primary cutaneous melanomas. Unsupervised hierarchical clustering analysis of somatic copy number alterations (SCNAs) identified two groups corresponding to thin and thick melanomas. The most striking difference between them was the much greater abundance of SCNAs in thick melanomas, whereas mutation frequency did not significantly change between the two groups. We found novel mutations and focal SCNAs in genes that are embryonic regulators of axon guidance, predominantly in thick melanomas. Analysis of publicly available microarray datasets provided further support for a potential role of Ephrin receptors in melanoma progression. In addition, we have identified a set of SCNAs, including amplification of BRAF and ofthe epigenetic modifier EZH2, that are specific for the group of thick melanomas that developed metastasis during the follow-up. Our data suggest that mutations occur early during melanoma development, whereas SCNAs might be involved in melanoma progression.
Exploring supervised and unsupervised methods to detect topics in biomedical text
Lee, Minsuk; Wang, Weiqing; Yu, Hong
2006-01-01
Background Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on information content. Topic detection will benefit many other natural language processing tasks including information retrieval, text summarization and question answering; and is a necessary step towards the building of an information system that provides an efficient way for biologists to seek information from an ocean of literature. Results We have explored the methods of Topic Spotting, a task of text categorization that applies the supervised machine-learning technique naïve Bayes to assign automatically a document into one or more predefined topics; and Topic Clustering, which apply unsupervised hierarchical clustering algorithms to aggregate documents into clusters such that each cluster represents a topic. We have applied our methods to detect topics of more than fifteen thousand of articles that represent over sixteen thousand entries in the Online Mendelian Inheritance in Man (OMIM) database. We have explored bag of words as the features. Additionally, we have explored semantic features; namely, the Medical Subject Headings (MeSH) that are assigned to the MEDLINE records, and the Unified Medical Language System (UMLS) semantic types that correspond to the MeSH terms, in addition to bag of words, to facilitate the tasks of topic detection. Our results indicate that incorporating the MeSH terms and the UMLS semantic types as additional features enhances the performance of topic detection and the naïve Bayes has the highest accuracy, 66.4%, for predicting the topic of an OMIM article as one of the total twenty-five topics. Conclusion Our results indicate that the supervised topic spotting methods outperformed the unsupervised topic clustering; on the other hand, the unsupervised topic clustering methods have the advantages of being robust and applicable in real world settings. PMID:16539745
Unsupervised classification of multivariate geostatistical data: Two algorithms
NASA Astrophysics Data System (ADS)
Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques
2015-12-01
With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study
Guerra, Luis; McGarry, Laura M; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael
2011-01-01
In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem to discern among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies. © 2010 Wiley Periodicals, Inc. Develop Neurobiol 71: 71–82, 2011 PMID:21154911
Smith, D. R. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Bell, R. E. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Podesta, M. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Smith, D. R. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Fonck, R. J. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); McKee, G. R. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Diallo, A. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Kaye, S. M. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); LeBlanc, B. P. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States); Sabbagh, S. A. [Princeton Plasma Physics Lab. (PPPL), Princeton, NJ (United States)
2015-09-01
We implement unsupervised machine learning techniques to identify characteristic evolution patterns and associated parameter regimes in edge localized mode (ELM) events observed on the National Spherical Torus Experiment. Multi-channel, localized measurements spanning the pedestal region capture the complex evolution patterns of ELM events on Alfven timescales. Some ELM events are active for less than 100~microsec, but others persist for up to 1~ms. Also, some ELM events exhibit a single dominant perturbation, but others are oscillatory. Clustering calculations with time-series similarity metrics indicate the ELM database contains at least two and possibly three groups of ELMs with similar evolution patterns. The identified ELM groups trigger similar stored energy loss, but the groups occupy distinct parameter regimes for ELM-relevant quantities like plasma current, triangularity, and pedestal height. Notably, the pedestal electron pressure gradient is not an effective parameter for distinguishing the ELM groups, but the ELM groups segregate in terms of electron density gradient and electron temperature gradient. The ELM evolution patterns and corresponding parameter regimes can shape the formulation or validation of nonlinear ELM models. Finally, the techniques and results demonstrate an application of unsupervised machine learning at a data-rich fusion facility.
Unsupervised classification of Space Acceleration Measurement System (SAMS) data using ART2-A
NASA Technical Reports Server (NTRS)
Smith, A. D.; Sinha, A.
1999-01-01
The Space Acceleration Measurement System (SAMS) has been developed by NASA to monitor the microgravity acceleration environment aboard the space shuttle. The amount of data collected by a SAMS unit during a shuttle mission is in the several gigabytes range. Adaptive Resonance Theory 2-A (ART2-A), an unsupervised neural network, has been used to cluster these data and to develop cause and effect relationships among disturbances and the acceleration environment. Using input patterns formed on the basis of power spectral densities (psd), data collected from two missions, STS-050 and STS-057, have been clustered.
NASA Astrophysics Data System (ADS)
Salman, S. S.; Abbas, W. A.
2018-05-01
The goal of the study is to support analysis Enhancement of Resolution and study effect on classification methods on bands spectral information of specific and quantitative approaches. In this study introduce a method to enhancement resolution Landsat 8 of combining the bands spectral of 30 meters resolution with panchromatic band 8 of 15 meters resolution, because of importance multispectral imagery to extracting land - cover. Classification methods used in this study to classify several lands -covers recorded from OLI- 8 imagery. Two methods of Data mining can be classified as either supervised or unsupervised. In supervised methods, there is a particular predefined target, that means the algorithm learn which values of the target are associated with which values of the predictor sample. K-nearest neighbors and maximum likelihood algorithms examine in this work as supervised methods. In other hand, no sample identified as target in unsupervised methods, the algorithm of data extraction searches for structure and patterns between all the variables, represented by Fuzzy C-mean clustering method as one of the unsupervised methods, NDVI vegetation index used to compare the results of classification method, the percent of dense vegetation in maximum likelihood method give a best results.
Hashemi, Jamileh; Fotouhi, Omid; Sulaiman, Luqman; Kjellman, Magnus; Höög, Anders; Zedenius, Jan; Larsson, Catharina
2013-10-29
Small intestinal neuroendocrine tumors (SI-NETs) are typically slow-growing tumors that have metastasized already at the time of diagnosis. The purpose of the present study was to further refine and define regions of recurrent copy number (CN) alterations (CNA) in SI-NETs. Genome-wide CNAs was determined by applying array CGH (a-CGH) on SI-NETs including 18 primary tumors and 12 metastases. Quantitative PCR analysis (qPCR) was used to confirm CNAs detected by a-CGH as well as to detect CNAs in an extended panel of SI-NETs. Unsupervised hierarchical clustering was used to detect tumor groups with similar patterns of chromosomal alterations based on recurrent regions of CN loss or gain. The log rank test was used to calculate overall survival. Mann-Whitney U test or Fisher's exact test were used to evaluate associations between tumor groups and recurrent CNAs or clinical parameters. The most frequent abnormality was loss of chromosome 18 observed in 70% of the cases. CN losses were also frequently found of chromosomes 11 (23%), 16 (20%), and 9 (20%), with regions of recurrent CN loss identified in 11q23.1-qter, 16q12.2-qter, 9pter-p13.2 and 9p13.1-11.2. Gains were most frequently detected in chromosomes 14 (43%), 20 (37%), 4 (27%), and 5 (23%) with recurrent regions of CN gain located to 14q11.2, 14q32.2-32.31, 20pter-p11.21, 20q11.1-11.21, 20q12-qter, 4 and 5. qPCR analysis confirmed most CNAs detected by a-CGH as well as revealed CNAs in an extended panel of SI-NETs. Unsupervised hierarchical clustering of recurrent regions of CNAs revealed two separate tumor groups and 5 chromosomal clusters. Loss of chromosomes 18, 16 and 11 and gain of chromosome 20 were found in both tumor groups. Tumor group II was enriched for alterations in chromosome cluster-d, including gain of chromosomes 4, 5, 7, 14 and gain of 20 in chromosome cluster-b. Gain in 20pter-p11.21 was associated with short survival. Statistically significant differences were observed between primary tumors and metastases for loss of 16q and gain of 7. Our results revealed recurrent CNAs in several candidate regions with a potential role in SI-NET development. Distinct genetic alterations and pathways are involved in tumorigenesis of SI-NETs.
A hierarchical approach to forest landscape pattern characterization.
Wang, Jialing; Yang, Xiaojun
2012-01-01
Landscape spatial patterns have increasingly been considered to be essential for environmental planning and resources management. In this study, we proposed a hierarchical approach for landscape classification and evaluation by characterizing landscape spatial patterns across different hierarchical levels. The case study site is the Red Hills region of northern Florida and southwestern Georgia, well known for its biodiversity, historic resources, and scenic beauty. We used one Landsat Enhanced Thematic Mapper image to extract land-use/-cover information. Then, we employed principal-component analysis to help identify key class-level landscape metrics for forests at different hierarchical levels, namely, open pine, upland pine, and forest as a whole. We found that the key class-level landscape metrics varied across different hierarchical levels. Compared with forest as a whole, open pine forest is much more fragmented. The landscape metric, such as CONTIG_MN, which measures whether pine patches are contiguous or not, is more important to characterize the spatial pattern of pine forest than to forest as a whole. This suggests that different metric sets should be used to characterize landscape patterns at different hierarchical levels. We further used these key metrics, along with the total class area, to classify and evaluate subwatersheds through cluster analysis. This study demonstrates a promising approach that can be used to integrate spatial patterns and processes for hierarchical forest landscape planning and management.
Chimera states in networks of logistic maps with hierarchical connectivities
NASA Astrophysics Data System (ADS)
zur Bonsen, Alexander; Omelchenko, Iryna; Zakharova, Anna; Schöll, Eckehard
2018-04-01
Chimera states are complex spatiotemporal patterns consisting of coexisting domains of coherence and incoherence. We study networks of nonlocally coupled logistic maps and analyze systematically how the dilution of the network links influences the appearance of chimera patterns. The network connectivities are constructed using an iterative Cantor algorithm to generate fractal (hierarchical) connectivities. Increasing the hierarchical level of iteration, we compare the resulting spatiotemporal patterns. We demonstrate that a high clustering coefficient and symmetry of the base pattern promotes chimera states, and asymmetric connectivities result in complex nested chimera patterns.
Sauwen, N; Acou, M; Van Cauter, S; Sima, D M; Veraart, J; Maes, F; Himmelreich, U; Achten, E; Van Huffel, S
2016-01-01
Tumor segmentation is a particularly challenging task in high-grade gliomas (HGGs), as they are among the most heterogeneous tumors in oncology. An accurate delineation of the lesion and its main subcomponents contributes to optimal treatment planning, prognosis and follow-up. Conventional MRI (cMRI) is the imaging modality of choice for manual segmentation, and is also considered in the vast majority of automated segmentation studies. Advanced MRI modalities such as perfusion-weighted imaging (PWI), diffusion-weighted imaging (DWI) and magnetic resonance spectroscopic imaging (MRSI) have already shown their added value in tumor tissue characterization, hence there have been recent suggestions of combining different MRI modalities into a multi-parametric MRI (MP-MRI) approach for brain tumor segmentation. In this paper, we compare the performance of several unsupervised classification methods for HGG segmentation based on MP-MRI data including cMRI, DWI, MRSI and PWI. Two independent MP-MRI datasets with a different acquisition protocol were available from different hospitals. We demonstrate that a hierarchical non-negative matrix factorization variant which was previously introduced for MP-MRI tumor segmentation gives the best performance in terms of mean Dice-scores for the pathologic tissue classes on both datasets.
Brasier, Allan R; Victor, Sundar; Boetticher, Gary; Ju, Hyunsu; Lee, Chang; Bleecker, Eugene R; Castro, Mario; Busse, William W; Calhoun, William J
2008-01-01
Asthma is a heterogeneous clinical disorder. Methods for objective identification of disease subtypes will focus on clinical interventions and help identify causative pathways. Few studies have explored phenotypes at a molecular level. We sought to discriminate asthma phenotypes on the basis of cytokine profiles in bronchoalveolar lavage (BAL) samples from patients with mild-moderate and severe asthma. Twenty-five cytokines were measured in BAL samples of 84 patients (41 severe, 43 mild-moderate) using bead-based multiplex immunoassays. The normalized data were subjected to statistical and informatics analysis. Four groups of asthmatic profiles could be identified on the basis of unsupervised analysis (hierarchical clustering) that were independent of treatment. One group, enriched in patients with severe asthma, showed differences in BAL cellular content, reductions in baseline pulmonary function, and enhanced response to methacholine provocation. Ten cytokines were identified that accurately predicted this group. Classification methods for predicting methacholine sensitivity were developed. The best model analysis predicted hyperresponders with 88% accuracy in 10 trials by using a 10-fold cross-validation. The cytokines that contributed to this model were IL-2, IL-4, and IL-5. On the basis of this classifier, 3 distinct hyperresponder classes were identified that varied in BAL eosinophil count and PC20 methacholine. Cytokine expression patterns in BAL can be used to identify distinct types of asthma and identify distinct subsets of methacholine hyperresponders. Further biomarker discovery in BAL may be informative.
Unsupervised Neural Network Quantifies the Cost of Visual Information Processing.
Orbán, Levente L; Chartier, Sylvain
2015-01-01
Untrained, "flower-naïve" bumblebees display behavioural preferences when presented with visual properties such as colour, symmetry, spatial frequency and others. Two unsupervised neural networks were implemented to understand the extent to which these models capture elements of bumblebees' unlearned visual preferences towards flower-like visual properties. The computational models, which are variants of Independent Component Analysis and Feature-Extracting Bidirectional Associative Memory, use images of test-patterns that are identical to ones used in behavioural studies. Each model works by decomposing images of floral patterns into meaningful underlying factors. We reconstruct the original floral image using the components and compare the quality of the reconstructed image to the original image. Independent Component Analysis matches behavioural results substantially better across several visual properties. These results are interpreted to support a hypothesis that the temporal and energetic costs of information processing by pollinators served as a selective pressure on floral displays: flowers adapted to pollinators' cognitive constraints.
Higgins, Irina; Stringer, Simon; Schnupp, Jan
2017-01-01
The nature of the code used in the auditory cortex to represent complex auditory stimuli, such as naturally spoken words, remains a matter of debate. Here we argue that such representations are encoded by stable spatio-temporal patterns of firing within cell assemblies known as polychronous groups, or PGs. We develop a physiologically grounded, unsupervised spiking neural network model of the auditory brain with local, biologically realistic, spike-time dependent plasticity (STDP) learning, and show that the plastic cortical layers of the network develop PGs which convey substantially more information about the speaker independent identity of two naturally spoken word stimuli than does rate encoding that ignores the precise spike timings. We furthermore demonstrate that such informative PGs can only develop if the input spatio-temporal spike patterns to the plastic cortical areas of the model are relatively stable.
Stringer, Simon
2017-01-01
The nature of the code used in the auditory cortex to represent complex auditory stimuli, such as naturally spoken words, remains a matter of debate. Here we argue that such representations are encoded by stable spatio-temporal patterns of firing within cell assemblies known as polychronous groups, or PGs. We develop a physiologically grounded, unsupervised spiking neural network model of the auditory brain with local, biologically realistic, spike-time dependent plasticity (STDP) learning, and show that the plastic cortical layers of the network develop PGs which convey substantially more information about the speaker independent identity of two naturally spoken word stimuli than does rate encoding that ignores the precise spike timings. We furthermore demonstrate that such informative PGs can only develop if the input spatio-temporal spike patterns to the plastic cortical areas of the model are relatively stable. PMID:28797034
NASA Astrophysics Data System (ADS)
Yi, Wei-song; Cui, Dian-sheng; Li, Zhi; Wu, Lan-lan; Shen, Ai-guo; Hu, Ji-ming
2013-01-01
The manuscript has investigated the application of near-infrared (NIR) spectroscopy for differentiation gastric cancer. The 90 spectra from cancerous and normal tissues were collected from a total of 30 surgical specimens using Fourier transform near-infrared spectroscopy (FT-NIR) equipped with a fiber-optic probe. Major spectral differences were observed in the CH-stretching second overtone (9000-7000 cm-1), CH-stretching first overtone (6000-5200 cm-1), and CH-stretching combination (4500-4000 cm-1) regions. By use of unsupervised pattern recognition, such as principal component analysis (PCA) and cluster analysis (CA), all spectra were classified into cancerous and normal tissue groups with accuracy up to 81.1%. The sensitivity and specificity was 100% and 68.2%, respectively. These present results indicate that CH-stretching first, combination band and second overtone regions can serve as diagnostic markers for gastric cancer.
Hierarchical clustering of EMD based interest points for road sign detection
NASA Astrophysics Data System (ADS)
Khan, Jesmin; Bhuiyan, Sharif; Adhami, Reza
2014-04-01
This paper presents an automatic road traffic signs detection and recognition system based on hierarchical clustering of interest points and joint transform correlation. The proposed algorithm consists of the three following stages: interest points detection, clustering of those points and similarity search. At the first stage, good discriminative, rotation and scale invariant interest points are selected from the image edges based on the 1-D empirical mode decomposition (EMD). We propose a two-step unsupervised clustering technique, which is adaptive and based on two criterion. In this context, the detected points are initially clustered based on the stable local features related to the brightness and color, which are extracted using Gabor filter. Then points belonging to each partition are reclustered depending on the dispersion of the points in the initial cluster using position feature. This two-step hierarchical clustering yields the possible candidate road signs or the region of interests (ROIs). Finally, a fringe-adjusted joint transform correlation (JTC) technique is used for matching the unknown signs with the existing known reference road signs stored in the database. The presented framework provides a novel way to detect a road sign from the natural scenes and the results demonstrate the efficacy of the proposed technique, which yields a very low false hit rate.
Metal hierarchical patterning by direct nanoimprint lithography
Radha, Boya; Lim, Su Hui; Saifullah, Mohammad S. M.; Kulkarni, Giridhar U.
2013-01-01
Three-dimensional hierarchical patterning of metals is of paramount importance in diverse fields involving photonics, controlling surface wettability and wearable electronics. Conventionally, this type of structuring is tedious and usually involves layer-by-layer lithographic patterning. Here, we describe a simple process of direct nanoimprint lithography using palladium benzylthiolate, a versatile metal-organic ink, which not only leads to the formation of hierarchical patterns but also is amenable to layer-by-layer stacking of the metal over large areas. The key to achieving such multi-faceted patterning is hysteretic melting of ink, enabling its shaping. It undergoes transformation to metallic palladium under gentle thermal conditions without affecting the integrity of the hierarchical patterns on micro- as well as nanoscale. A metallic rice leaf structure showing anisotropic wetting behavior and woodpile-like structures were thus fabricated. Furthermore, this method is extendable for transferring imprinted structures to a flexible substrate to make them robust enough to sustain numerous bending cycles. PMID:23446801
Pedretti, G; Milo, V; Ambrogio, S; Carboni, R; Bianchi, S; Calderoni, A; Ramaswamy, N; Spinelli, A S; Ielmini, D
2017-07-13
Brain-inspired computation can revolutionize information technology by introducing machines capable of recognizing patterns (images, speech, video) and interacting with the external world in a cognitive, humanlike way. Achieving this goal requires first to gain a detailed understanding of the brain operation, and second to identify a scalable microelectronic technology capable of reproducing some of the inherent functions of the human brain, such as the high synaptic connectivity (~10 4 ) and the peculiar time-dependent synaptic plasticity. Here we demonstrate unsupervised learning and tracking in a spiking neural network with memristive synapses, where synaptic weights are updated via brain-inspired spike timing dependent plasticity (STDP). The synaptic conductance is updated by the local time-dependent superposition of pre- and post-synaptic spikes within a hybrid one-transistor/one-resistor (1T1R) memristive synapse. Only 2 synaptic states, namely the low resistance state (LRS) and the high resistance state (HRS), are sufficient to learn and recognize patterns. Unsupervised learning of a static pattern and tracking of a dynamic pattern of up to 4 × 4 pixels are demonstrated, paving the way for intelligent hardware technology with up-scaled memristive neural networks.
Unsupervised EEG analysis for automated epileptic seizure detection
NASA Astrophysics Data System (ADS)
Birjandtalab, Javad; Pouyan, Maziyar Baran; Nourani, Mehrdad
2016-07-01
Epilepsy is a neurological disorder which can, if not controlled, potentially cause unexpected death. It is extremely crucial to have accurate automatic pattern recognition and data mining techniques to detect the onset of seizures and inform care-givers to help the patients. EEG signals are the preferred biosignals for diagnosis of epileptic patients. Most of the existing pattern recognition techniques used in EEG analysis leverage the notion of supervised machine learning algorithms. Since seizure data are heavily under-represented, such techniques are not always practical particularly when the labeled data is not sufficiently available or when disease progression is rapid and the corresponding EEG footprint pattern will not be robust. Furthermore, EEG pattern change is highly individual dependent and requires experienced specialists to annotate the seizure and non-seizure events. In this work, we present an unsupervised technique to discriminate seizures and non-seizures events. We employ power spectral density of EEG signals in different frequency bands that are informative features to accurately cluster seizure and non-seizure events. The experimental results tried so far indicate achieving more than 90% accuracy in clustering seizure and non-seizure events without having any prior knowledge on patient's history.
A parallelized binary search tree
USDA-ARS?s Scientific Manuscript database
PTTRNFNDR is an unsupervised statistical learning algorithm that detects patterns in DNA sequences, protein sequences, or any natural language texts that can be decomposed into letters of a finite alphabet. PTTRNFNDR performs complex mathematical computations and its processing time increases when i...
Chang, Hang; Han, Ju; Zhong, Cheng; Snijders, Antoine M.; Mao, Jian-Hua
2017-01-01
The capabilities of (I) learning transferable knowledge across domains; and (II) fine-tuning the pre-learned base knowledge towards tasks with considerably smaller data scale are extremely important. Many of the existing transfer learning techniques are supervised approaches, among which deep learning has the demonstrated power of learning domain transferrable knowledge with large scale network trained on massive amounts of labeled data. However, in many biomedical tasks, both the data and the corresponding label can be very limited, where the unsupervised transfer learning capability is urgently needed. In this paper, we proposed a novel multi-scale convolutional sparse coding (MSCSC) method, that (I) automatically learns filter banks at different scales in a joint fashion with enforced scale-specificity of learned patterns; and (II) provides an unsupervised solution for learning transferable base knowledge and fine-tuning it towards target tasks. Extensive experimental evaluation of MSCSC demonstrates the effectiveness of the proposed MSCSC in both regular and transfer learning tasks in various biomedical domains. PMID:28129148
System Biology Approach: Gene Network Analysis for Muscular Dystrophy.
Censi, Federica; Calcagnini, Giovanni; Mattei, Eugenio; Giuliani, Alessandro
2018-01-01
Phenotypic changes at different organization levels from cell to entire organism are associated to changes in the pattern of gene expression. These changes involve the entire genome expression pattern and heavily rely upon correlation patterns among genes. The classical approach used to analyze gene expression data builds upon the application of supervised statistical techniques to detect genes differentially expressed among two or more phenotypes (e.g., normal vs. disease). The use of an a posteriori, unsupervised approach based on principal component analysis (PCA) and the subsequent construction of gene correlation networks can shed a light on unexpected behaviour of gene regulation system while maintaining a more naturalistic view on the studied system.In this chapter we applied an unsupervised method to discriminate DMD patient and controls. The genes having the highest absolute scores in the discrimination between the groups were then analyzed in terms of gene expression networks, on the basis of their mutual correlation in the two groups. The correlation network structures suggest two different modes of gene regulation in the two groups, reminiscent of important aspects of DMD pathogenesis.
Regional differences in mitochondrial DNA methylation in human post-mortem brain tissue.
Devall, Matthew; Smith, Rebecca G; Jeffries, Aaron; Hannon, Eilis; Davies, Matthew N; Schalkwyk, Leonard; Mill, Jonathan; Weedon, Michael; Lunnon, Katie
2017-01-01
DNA methylation is an important epigenetic mechanism involved in gene regulation, with alterations in DNA methylation in the nuclear genome being linked to numerous complex diseases. Mitochondrial DNA methylation is a phenomenon that is receiving ever-increasing interest, particularly in diseases characterized by mitochondrial dysfunction; however, most studies have been limited to the investigation of specific target regions. Analyses spanning the entire mitochondrial genome have been limited, potentially due to the amount of input DNA required. Further, mitochondrial genetic studies have been previously confounded by nuclear-mitochondrial pseudogenes. Methylated DNA Immunoprecipitation Sequencing is a technique widely used to profile DNA methylation across the nuclear genome; however, reads mapped to mitochondrial DNA are often discarded. Here, we have developed an approach to control for nuclear-mitochondrial pseudogenes within Methylated DNA Immunoprecipitation Sequencing data. We highlight the utility of this approach in identifying differences in mitochondrial DNA methylation across regions of the human brain and pre-mortem blood. We were able to correlate mitochondrial DNA methylation patterns between the cortex, cerebellum and blood. We identified 74 nominally significant differentially methylated regions ( p < 0.05) in the mitochondrial genome, between anatomically separate cortical regions and the cerebellum in matched samples ( N = 3 matched donors). Further analysis identified eight significant differentially methylated regions between the total cortex and cerebellum after correcting for multiple testing. Using unsupervised hierarchical clustering analysis of the mitochondrial DNA methylome, we were able to identify tissue-specific patterns of mitochondrial DNA methylation between blood, cerebellum and cortex. Our study represents a comprehensive analysis of the mitochondrial methylome using pre-existing Methylated DNA Immunoprecipitation Sequencing data to identify brain region-specific patterns of mitochondrial DNA methylation.
Padilla, Jennifer E.; Liu, Wenyan; Seeman, Nadrian C.
2012-01-01
We introduce a hierarchical self assembly algorithm that produces the quasiperiodic patterns found in the Robinson tilings and suggest a practical implementation of this algorithm using DNA origami tiles. We modify the abstract Tile Assembly Model, (aTAM), to include active signaling and glue activation in response to signals to coordinate the hierarchical assembly of Robinson patterns of arbitrary size from a small set of tiles according to the tile substitution algorithm that generates them. Enabling coordinated hierarchical assembly in the aTAM makes possible the efficient encoding of the recursive process of tile substitution. PMID:23226722
Padilla, Jennifer E; Liu, Wenyan; Seeman, Nadrian C
2012-06-01
We introduce a hierarchical self assembly algorithm that produces the quasiperiodic patterns found in the Robinson tilings and suggest a practical implementation of this algorithm using DNA origami tiles. We modify the abstract Tile Assembly Model, (aTAM), to include active signaling and glue activation in response to signals to coordinate the hierarchical assembly of Robinson patterns of arbitrary size from a small set of tiles according to the tile substitution algorithm that generates them. Enabling coordinated hierarchical assembly in the aTAM makes possible the efficient encoding of the recursive process of tile substitution.
Bichler, Olivier; Querlioz, Damien; Thorpe, Simon J; Bourgoin, Jean-Philippe; Gamrat, Christian
2012-08-01
A biologically inspired approach to learning temporally correlated patterns from a spiking silicon retina is presented. Spikes are generated from the retina in response to relative changes in illumination at the pixel level and transmitted to a feed-forward spiking neural network. Neurons become sensitive to patterns of pixels with correlated activation times, in a fully unsupervised scheme. This is achieved using a special form of Spike-Timing-Dependent Plasticity which depresses synapses that did not recently contribute to the post-synaptic spike activation, regardless of their activation time. Competitive learning is implemented with lateral inhibition. When tested with real-life data, the system is able to extract complex and overlapping temporally correlated features such as car trajectories on a freeway, after only 10 min of traffic learning. Complete trajectories can be learned with a 98% detection rate using a second layer, still with unsupervised learning, and the system may be used as a car counter. The proposed neural network is extremely robust to noise and it can tolerate a high degree of synaptic and neuronal variability with little impact on performance. Such results show that a simple biologically inspired unsupervised learning scheme is capable of generating selectivity to complex meaningful events on the basis of relatively little sensory experience. Copyright © 2012 Elsevier Ltd. All rights reserved.
Mwangi, Benson; Soares, Jair C; Hasan, Khader M
2014-10-30
Neuroimaging machine learning studies have largely utilized supervised algorithms - meaning they require both neuroimaging scan data and corresponding target variables (e.g. healthy vs. diseased) to be successfully 'trained' for a prediction task. Noticeably, this approach may not be optimal or possible when the global structure of the data is not well known and the researcher does not have an a priori model to fit the data. We set out to investigate the utility of an unsupervised machine learning technique; t-distributed stochastic neighbour embedding (t-SNE) in identifying 'unseen' sample population patterns that may exist in high-dimensional neuroimaging data. Multimodal neuroimaging scans from 92 healthy subjects were pre-processed using atlas-based methods, integrated and input into the t-SNE algorithm. Patterns and clusters discovered by the algorithm were visualized using a 2D scatter plot and further analyzed using the K-means clustering algorithm. t-SNE was evaluated against classical principal component analysis. Remarkably, based on unlabelled multimodal scan data, t-SNE separated study subjects into two very distinct clusters which corresponded to subjects' gender labels (cluster silhouette index value=0.79). The resulting clusters were used to develop an unsupervised minimum distance clustering model which identified 93.5% of subjects' gender. Notably, from a neuropsychiatric perspective this method may allow discovery of data-driven disease phenotypes or sub-types of treatment responders. Copyright © 2014 Elsevier B.V. All rights reserved.
Enhanced HMAX model with feedforward feature learning for multiclass categorization.
Li, Yinlin; Wu, Wei; Zhang, Bo; Li, Fengfu
2015-01-01
In recent years, the interdisciplinary research between neuroscience and computer vision has promoted the development in both fields. Many biologically inspired visual models are proposed, and among them, the Hierarchical Max-pooling model (HMAX) is a feedforward model mimicking the structures and functions of V1 to posterior inferotemporal (PIT) layer of the primate visual cortex, which could generate a series of position- and scale- invariant features. However, it could be improved with attention modulation and memory processing, which are two important properties of the primate visual cortex. Thus, in this paper, based on recent biological research on the primate visual cortex, we still mimic the first 100-150 ms of visual cognition to enhance the HMAX model, which mainly focuses on the unsupervised feedforward feature learning process. The main modifications are as follows: (1) To mimic the attention modulation mechanism of V1 layer, a bottom-up saliency map is computed in the S1 layer of the HMAX model, which can support the initial feature extraction for memory processing; (2) To mimic the learning, clustering and short-term memory to long-term memory conversion abilities of V2 and IT, an unsupervised iterative clustering method is used to learn clusters with multiscale middle level patches, which are taken as long-term memory; (3) Inspired by the multiple feature encoding mode of the primate visual cortex, information including color, orientation, and spatial position are encoded in different layers of the HMAX model progressively. By adding a softmax layer at the top of the model, multiclass categorization experiments can be conducted, and the results on Caltech101 show that the enhanced model with a smaller memory size exhibits higher accuracy than the original HMAX model, and could also achieve better accuracy than other unsupervised feature learning methods in multiclass categorization task.
Narayanan, Shrikanth
2009-01-01
We describe a method for unsupervised region segmentation of an image using its spatial frequency domain representation. The algorithm was designed to process large sequences of real-time magnetic resonance (MR) images containing the 2-D midsagittal view of a human vocal tract airway. The segmentation algorithm uses an anatomically informed object model, whose fit to the observed image data is hierarchically optimized using a gradient descent procedure. The goal of the algorithm is to automatically extract the time-varying vocal tract outline and the position of the articulators to facilitate the study of the shaping of the vocal tract during speech production. PMID:19244005
Representation learning: a unified deep learning framework for automatic prostate MR segmentation.
Liao, Shu; Gao, Yaozong; Oto, Aytekin; Shen, Dinggang
2013-01-01
Image representation plays an important role in medical image analysis. The key to the success of different medical image analysis algorithms is heavily dependent on how we represent the input data, namely features used to characterize the input image. In the literature, feature engineering remains as an active research topic, and many novel hand-crafted features are designed such as Haar wavelet, histogram of oriented gradient, and local binary patterns. However, such features are not designed with the guidance of the underlying dataset at hand. To this end, we argue that the most effective features should be designed in a learning based manner, namely representation learning, which can be adapted to different patient datasets at hand. In this paper, we introduce a deep learning framework to achieve this goal. Specifically, a stacked independent subspace analysis (ISA) network is adopted to learn the most effective features in a hierarchical and unsupervised manner. The learnt features are adapted to the dataset at hand and encode high level semantic anatomical information. The proposed method is evaluated on the application of automatic prostate MR segmentation. Experimental results show that significant segmentation accuracy improvement can be achieved by the proposed deep learning method compared to other state-of-the-art segmentation approaches.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Zhuang, Chengxu; Wang, Yulong; Yamins, Daniel; Hu, Xiaolin
2017-01-01
Visual information in the visual cortex is processed in a hierarchical manner. Recent studies show that higher visual areas, such as V2, V3, and V4, respond more vigorously to images with naturalistic higher-order statistics than to images lacking them. This property is a functional signature of higher areas, as it is much weaker or even absent in the primary visual cortex (V1). However, the mechanism underlying this signature remains elusive. We studied this problem using computational models. In several typical hierarchical visual models including the AlexNet, VggNet, and SHMAX, this signature was found to be prominent in higher layers but much weaker in lower layers. By changing both the model structure and experimental settings, we found that the signature strongly correlated with sparse firing of units in higher layers but not with any other factors, including model structure, training algorithm (supervised or unsupervised), receptive field size, and property of training stimuli. The results suggest an important role of sparse neuronal activity underlying this special feature of higher visual areas.
Zhuang, Chengxu; Wang, Yulong; Yamins, Daniel; Hu, Xiaolin
2017-01-01
Visual information in the visual cortex is processed in a hierarchical manner. Recent studies show that higher visual areas, such as V2, V3, and V4, respond more vigorously to images with naturalistic higher-order statistics than to images lacking them. This property is a functional signature of higher areas, as it is much weaker or even absent in the primary visual cortex (V1). However, the mechanism underlying this signature remains elusive. We studied this problem using computational models. In several typical hierarchical visual models including the AlexNet, VggNet, and SHMAX, this signature was found to be prominent in higher layers but much weaker in lower layers. By changing both the model structure and experimental settings, we found that the signature strongly correlated with sparse firing of units in higher layers but not with any other factors, including model structure, training algorithm (supervised or unsupervised), receptive field size, and property of training stimuli. The results suggest an important role of sparse neuronal activity underlying this special feature of higher visual areas. PMID:29163117
The Common Prescription Patterns Based on the Hierarchical Clustering of Herb-Pairs Efficacies
2016-01-01
Prescription patterns are rules or regularities used to generate, recognize, or judge a prescription. Most of existing studies focused on the specific prescription patterns for diverse diseases or syndromes, while little attention was paid to the common patterns, which reflect the global view of the regularities of prescriptions. In this paper, we designed a method CPPM to find the common prescription patterns. The CPPM is based on the hierarchical clustering of herb-pair efficacies (HPEs). Firstly, HPEs were hierarchically clustered; secondly, the individual herbs are labeled by the HPEC (the clusters of HPEs); and then the prescription patterns were extracted from the combinations of HPEC; finally the common patterns are recognized statistically. The results showed that HPEs have hierarchical clustering structure. When the clustering level is 2 and the HPEs were classified into two clusters, the common prescription patterns are obvious. Among 332 candidate prescriptions, 319 prescriptions follow the common patterns. The description of the patterns is that if a prescription contains the herbs of the cluster (C 1), it is very likely to have other herbs of another cluster (C 2); while a prescription has the herbs of C 2, it may have no herbs of C 1. Finally, we discussed that the common patterns are mathematically coincident with the Blood-Qi theory. PMID:27190534
Inferring a District-Based Hierarchical Structure of Social Contacts from Census Data
Yu, Zhiwen; Liu, Jiming; Zhu, Xianjun
2015-01-01
Researchers have recently paid attention to social contact patterns among individuals due to their useful applications in such areas as epidemic evaluation and control, public health decisions, chronic disease research and social network research. Although some studies have estimated social contact patterns from social networks and surveys, few have considered how to infer the hierarchical structure of social contacts directly from census data. In this paper, we focus on inferring an individual’s social contact patterns from detailed census data, and generate various types of social contact patterns such as hierarchical-district-structure-based, cross-district and age-district-based patterns. We evaluate newly generated contact patterns derived from detailed 2011 Hong Kong census data by incorporating them into a model and simulation of the 2009 Hong Kong H1N1 epidemic. We then compare the newly generated social contact patterns with the mixing patterns that are often used in the literature, and draw the following conclusions. First, the generation of social contact patterns based on a hierarchical district structure allows for simulations at different district levels. Second, the newly generated social contact patterns reflect individuals social contacts. Third, the newly generated social contact patterns improve the accuracy of the SEIR-based epidemic model. PMID:25679787
Lasko, Thomas A; Denny, Joshua C; Levy, Mia A
2013-01-01
Inferring precise phenotypic patterns from population-scale clinical data is a core computational task in the development of precision, personalized medicine. The traditional approach uses supervised learning, in which an expert designates which patterns to look for (by specifying the learning task and the class labels), and where to look for them (by specifying the input variables). While appropriate for individual tasks, this approach scales poorly and misses the patterns that we don't think to look for. Unsupervised feature learning overcomes these limitations by identifying patterns (or features) that collectively form a compact and expressive representation of the source data, with no need for expert input or labeled examples. Its rising popularity is driven by new deep learning methods, which have produced high-profile successes on difficult standardized problems of object recognition in images. Here we introduce its use for phenotype discovery in clinical data. This use is challenging because the largest source of clinical data - Electronic Medical Records - typically contains noisy, sparse, and irregularly timed observations, rendering them poor substrates for deep learning methods. Our approach couples dirty clinical data to deep learning architecture via longitudinal probability densities inferred using Gaussian process regression. From episodic, longitudinal sequences of serum uric acid measurements in 4368 individuals we produced continuous phenotypic features that suggest multiple population subtypes, and that accurately distinguished (0.97 AUC) the uric-acid signatures of gout vs. acute leukemia despite not being optimized for the task. The unsupervised features were as accurate as gold-standard features engineered by an expert with complete knowledge of the domain, the classification task, and the class labels. Our findings demonstrate the potential for achieving computational phenotype discovery at population scale. We expect such data-driven phenotypes to expose unknown disease variants and subtypes and to provide rich targets for genetic association studies.
Lasko, Thomas A.; Denny, Joshua C.; Levy, Mia A.
2013-01-01
Inferring precise phenotypic patterns from population-scale clinical data is a core computational task in the development of precision, personalized medicine. The traditional approach uses supervised learning, in which an expert designates which patterns to look for (by specifying the learning task and the class labels), and where to look for them (by specifying the input variables). While appropriate for individual tasks, this approach scales poorly and misses the patterns that we don’t think to look for. Unsupervised feature learning overcomes these limitations by identifying patterns (or features) that collectively form a compact and expressive representation of the source data, with no need for expert input or labeled examples. Its rising popularity is driven by new deep learning methods, which have produced high-profile successes on difficult standardized problems of object recognition in images. Here we introduce its use for phenotype discovery in clinical data. This use is challenging because the largest source of clinical data – Electronic Medical Records – typically contains noisy, sparse, and irregularly timed observations, rendering them poor substrates for deep learning methods. Our approach couples dirty clinical data to deep learning architecture via longitudinal probability densities inferred using Gaussian process regression. From episodic, longitudinal sequences of serum uric acid measurements in 4368 individuals we produced continuous phenotypic features that suggest multiple population subtypes, and that accurately distinguished (0.97 AUC) the uric-acid signatures of gout vs. acute leukemia despite not being optimized for the task. The unsupervised features were as accurate as gold-standard features engineered by an expert with complete knowledge of the domain, the classification task, and the class labels. Our findings demonstrate the potential for achieving computational phenotype discovery at population scale. We expect such data-driven phenotypes to expose unknown disease variants and subtypes and to provide rich targets for genetic association studies. PMID:23826094
Unsupervised data mining in nanoscale x-ray spectro-microscopic study of NdFeB magnet
DOE Office of Scientific and Technical Information (OSTI.GOV)
Duan, Xiaoyue; Yang, Feifei; Antono, Erin
Novel developments in X-ray based spectro-microscopic characterization techniques have increased the rate of acquisition of spatially resolved spectroscopic data by several orders of magnitude over what was possible a few years ago. This accelerated data acquisition, with high spatial resolution at nanoscale and sensitivity to subtle differences in chemistry and atomic structure, provides a unique opportunity to investigate hierarchically complex and structurally heterogeneous systems found in functional devices and materials systems. However, handling and analyzing the large volume data generated poses significant challenges. Here we apply an unsupervised data-mining algorithm known as DBSCAN to study a rare-earth element based permanentmore » magnet material, Nd 2Fe 14B. We are able to reduce a large spectro-microscopic dataset of over 300,000 spectra to 3, preserving much of the underlying information. Scientists can easily and quickly analyze in detail three characteristic spectra. Our approach can rapidly provide a concise representation of a large and complex dataset to materials scientists and chemists. For instance, it shows that the surface of common Nd 2Fe 14B magnet is chemically and structurally very different from the bulk, suggesting a possible surface alteration effect possibly due to the corrosion, which could affect the material’s overall properties.« less
Miotto, Riccardo; Li, Li; Kidd, Brian A.; Dudley, Joel T.
2016-01-01
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name “deep patient”. We evaluated this representation as broadly predictive of health states by assessing the probability of patients to develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers were among the top performing. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems. PMID:27185194
NASA Astrophysics Data System (ADS)
Miotto, Riccardo; Li, Li; Kidd, Brian A.; Dudley, Joel T.
2016-05-01
Secondary use of electronic health records (EHRs) promises to advance clinical research and better inform clinical decision making. Challenges in summarizing and representing patient data prevent widespread practice of predictive modeling using EHRs. Here we present a novel unsupervised deep feature learning method to derive a general-purpose patient representation from EHR data that facilitates clinical predictive modeling. In particular, a three-layer stack of denoising autoencoders was used to capture hierarchical regularities and dependencies in the aggregated EHRs of about 700,000 patients from the Mount Sinai data warehouse. The result is a representation we name “deep patient”. We evaluated this representation as broadly predictive of health states by assessing the probability of patients to develop various diseases. We performed evaluation using 76,214 test patients comprising 78 diseases from diverse clinical domains and temporal windows. Our results significantly outperformed those achieved using representations based on raw EHR data and alternative feature learning strategies. Prediction performance for severe diabetes, schizophrenia, and various cancers were among the top performing. These findings indicate that deep learning applied to EHRs can derive patient representations that offer improved clinical predictions, and could provide a machine learning framework for augmenting clinical decision systems.
Gomes, Liliane R.; Gomes, Marcelo; Jung, Bryan; Paniagua, Beatriz; Ruellas, Antonio C.; Gonçalves, João Roberto; Styner, Martin A.; Wolford, Larry; Cevidanes, Lucia
2015-01-01
Abstract. This study aimed to investigate imaging statistical approaches for classifying three-dimensional (3-D) osteoarthritic morphological variations among 169 temporomandibular joint (TMJ) condyles. Cone-beam computed tomography scans were acquired from 69 subjects with long-term TMJ osteoarthritis (OA), 15 subjects at initial diagnosis of OA, and 7 healthy controls. Three-dimensional surface models of the condyles were constructed and SPHARM-PDM established correspondent points on each model. Multivariate analysis of covariance and direction-projection-permutation (DiProPerm) were used for testing statistical significance of the differences between the groups determined by clinical and radiographic diagnoses. Unsupervised classification using hierarchical agglomerative clustering was then conducted. Compared with healthy controls, OA average condyle was significantly smaller in all dimensions except its anterior surface. Significant flattening of the lateral pole was noticed at initial diagnosis. We observed areas of 3.88-mm bone resorption at the superior surface and 3.10-mm bone apposition at the anterior aspect of the long-term OA average model. DiProPerm supported a significant difference between the healthy control and OA group (p-value=0.001). Clinically meaningful unsupervised classification of TMJ condylar morphology determined a preliminary diagnostic index of 3-D osteoarthritic changes, which may be the first step towards a more targeted diagnosis of this condition. PMID:26158119
Unsupervised data mining in nanoscale x-ray spectro-microscopic study of NdFeB magnet
Duan, Xiaoyue; Yang, Feifei; Antono, Erin; ...
2016-09-29
Novel developments in X-ray based spectro-microscopic characterization techniques have increased the rate of acquisition of spatially resolved spectroscopic data by several orders of magnitude over what was possible a few years ago. This accelerated data acquisition, with high spatial resolution at nanoscale and sensitivity to subtle differences in chemistry and atomic structure, provides a unique opportunity to investigate hierarchically complex and structurally heterogeneous systems found in functional devices and materials systems. However, handling and analyzing the large volume data generated poses significant challenges. Here we apply an unsupervised data-mining algorithm known as DBSCAN to study a rare-earth element based permanentmore » magnet material, Nd 2Fe 14B. We are able to reduce a large spectro-microscopic dataset of over 300,000 spectra to 3, preserving much of the underlying information. Scientists can easily and quickly analyze in detail three characteristic spectra. Our approach can rapidly provide a concise representation of a large and complex dataset to materials scientists and chemists. For instance, it shows that the surface of common Nd 2Fe 14B magnet is chemically and structurally very different from the bulk, suggesting a possible surface alteration effect possibly due to the corrosion, which could affect the material’s overall properties.« less
Butaciu, Sinziana; Senila, Marin; Sarbu, Costel; Ponta, Michaela; Tanaselia, Claudiu; Cadar, Oana; Roman, Marius; Radu, Emil; Sima, Mihaela; Frentiu, Tiberiu
2017-04-01
The study proposes a combined model based on diagrams (Gibbs, Piper, Stuyfzand Hydrogeochemical Classification System) and unsupervised statistical approaches (Cluster Analysis, Principal Component Analysis, Fuzzy Principal Component Analysis, Fuzzy Hierarchical Cross-Clustering) to describe natural enrichment of inorganic arsenic and co-occurring species in groundwater in the Banat Plain, southwestern Romania. Speciation of inorganic As (arsenite, arsenate), ion concentrations (Na + , K + , Ca 2+ , Mg 2+ , HCO 3 - , Cl - , F - , SO 4 2- , PO 4 3- , NO 3 - ), pH, redox potential, conductivity and total dissolved substances were performed. Classical diagrams provided the hydrochemical characterization, while statistical approaches were helpful to establish (i) the mechanism of naturally occurring of As and F - species and the anthropogenic one for NO 3 - , SO 4 2- , PO 4 3- and K + and (ii) classification of groundwater based on content of arsenic species. The HCO 3 - type of local groundwater and alkaline pH (8.31-8.49) were found to be responsible for the enrichment of arsenic species and occurrence of F - but by different paths. The PO 4 3- -AsO 4 3- ion exchange, water-rock interaction (silicates hydrolysis and desorption from clay) were associated to arsenate enrichment in the oxidizing aquifer. Fuzzy Hierarchical Cross-Clustering was the strongest tool for the rapid simultaneous classification of groundwaters as a function of arsenic content and hydrogeochemical characteristics. The approach indicated the Na + -F - -pH cluster as marker for groundwater with naturally elevated As and highlighted which parameters need to be monitored. A chemical conceptual model illustrating the natural and anthropogenic paths and enrichment of As and co-occurring species in the local groundwater supported by mineralogical analysis of rocks was established. Copyright © 2016 Elsevier Ltd. All rights reserved.
Molecular Subgroup of Primary Prostate Cancer Presenting with Metastatic Biology.
Walker, Steven M; Knight, Laura A; McCavigan, Andrena M; Logan, Gemma E; Berge, Viktor; Sherif, Amir; Pandha, Hardev; Warren, Anne Y; Davidson, Catherine; Uprichard, Adam; Blayney, Jaine K; Price, Bethanie; Jellema, Gera L; Steele, Christopher J; Svindland, Aud; McDade, Simon S; Eden, Christopher G; Foster, Chris; Mills, Ian G; Neal, David E; Mason, Malcolm D; Kay, Elaine W; Waugh, David J; Harkin, D Paul; Watson, R William; Clarke, Noel W; Kennedy, Richard D
2017-10-01
Approximately 4-25% of patients with early prostate cancer develop disease recurrence following radical prostatectomy. To identify a molecular subgroup of prostate cancers with metastatic potential at presentation resulting in a high risk of recurrence following radical prostatectomy. Unsupervised hierarchical clustering was performed using gene expression data from 70 primary resections, 31 metastatic lymph nodes, and 25 normal prostate samples. Independent assay validation was performed using 322 radical prostatectomy samples from four sites with a mean follow-up of 50.3 months. Molecular subgroups were identified using unsupervised hierarchical clustering. A partial least squares approach was used to generate a gene expression assay. Relationships with outcome (time to biochemical and metastatic recurrence) were analysed using multivariable Cox regression and log-rank analysis. A molecular subgroup of primary prostate cancer with biology similar to metastatic disease was identified. A 70-transcript signature (metastatic assay) was developed and independently validated in the radical prostatectomy samples. Metastatic assay positive patients had increased risk of biochemical recurrence (multivariable hazard ratio [HR] 1.62 [1.13-2.33]; p=0.0092) and metastatic recurrence (multivariable HR=3.20 [1.76-5.80]; p=0.0001). A combined model with Cancer of the Prostate Risk Assessment post surgical (CAPRA-S) identified patients at an increased risk of biochemical and metastatic recurrence superior to either model alone (HR=2.67 [1.90-3.75]; p<0.0001 and HR=7.53 [4.13-13.73]; p<0.0001, respectively). The retrospective nature of the study is acknowledged as a potential limitation. The metastatic assay may identify a molecular subgroup of primary prostate cancers with metastatic potential. The metastatic assay may improve the ability to detect patients at risk of metastatic recurrence following radical prostatectomy. The impact of adjuvant therapies should be assessed in this higher-risk population. Copyright © 2017 European Association of Urology. Published by Elsevier B.V. All rights reserved.
Matsubara, Takashi
2017-01-01
Precise spike timing is considered to play a fundamental role in communications and signal processing in biological neural networks. Understanding the mechanism of spike timing adjustment would deepen our understanding of biological systems and enable advanced engineering applications such as efficient computational architectures. However, the biological mechanisms that adjust and maintain spike timing remain unclear. Existing algorithms adopt a supervised approach, which adjusts the axonal conduction delay and synaptic efficacy until the spike timings approximate the desired timings. This study proposes a spike timing-dependent learning model that adjusts the axonal conduction delay and synaptic efficacy in both unsupervised and supervised manners. The proposed learning algorithm approximates the Expectation-Maximization algorithm, and classifies the input data encoded into spatio-temporal spike patterns. Even in the supervised classification, the algorithm requires no external spikes indicating the desired spike timings unlike existing algorithms. Furthermore, because the algorithm is consistent with biological models and hypotheses found in existing biological studies, it could capture the mechanism underlying biological delay learning. PMID:29209191
Matsubara, Takashi
2017-01-01
Precise spike timing is considered to play a fundamental role in communications and signal processing in biological neural networks. Understanding the mechanism of spike timing adjustment would deepen our understanding of biological systems and enable advanced engineering applications such as efficient computational architectures. However, the biological mechanisms that adjust and maintain spike timing remain unclear. Existing algorithms adopt a supervised approach, which adjusts the axonal conduction delay and synaptic efficacy until the spike timings approximate the desired timings. This study proposes a spike timing-dependent learning model that adjusts the axonal conduction delay and synaptic efficacy in both unsupervised and supervised manners. The proposed learning algorithm approximates the Expectation-Maximization algorithm, and classifies the input data encoded into spatio-temporal spike patterns. Even in the supervised classification, the algorithm requires no external spikes indicating the desired spike timings unlike existing algorithms. Furthermore, because the algorithm is consistent with biological models and hypotheses found in existing biological studies, it could capture the mechanism underlying biological delay learning.
Unsupervised classification of variable stars
NASA Astrophysics Data System (ADS)
Valenzuela, Lucas; Pichara, Karim
2018-03-01
During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human efforts. Every time data come from new surveys; the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.
Hierarchical animal movement models for population-level inference
Hooten, Mevin B.; Buderman, Frances E.; Brost, Brian M.; Hanks, Ephraim M.; Ivans, Jacob S.
2016-01-01
New methods for modeling animal movement based on telemetry data are developed regularly. With advances in telemetry capabilities, animal movement models are becoming increasingly sophisticated. Despite a need for population-level inference, animal movement models are still predominantly developed for individual-level inference. Most efforts to upscale the inference to the population level are either post hoc or complicated enough that only the developer can implement the model. Hierarchical Bayesian models provide an ideal platform for the development of population-level animal movement models but can be challenging to fit due to computational limitations or extensive tuning required. We propose a two-stage procedure for fitting hierarchical animal movement models to telemetry data. The two-stage approach is statistically rigorous and allows one to fit individual-level movement models separately, then resample them using a secondary MCMC algorithm. The primary advantages of the two-stage approach are that the first stage is easily parallelizable and the second stage is completely unsupervised, allowing for an automated fitting procedure in many cases. We demonstrate the two-stage procedure with two applications of animal movement models. The first application involves a spatial point process approach to modeling telemetry data, and the second involves a more complicated continuous-time discrete-space animal movement model. We fit these models to simulated data and real telemetry data arising from a population of monitored Canada lynx in Colorado, USA.
Protein expression patterns of cell cycle regulators in operable breast cancer.
Zagouri, Flora; Kotoula, Vassiliki; Kouvatseas, George; Sotiropoulou, Maria; Koletsa, Triantafyllia; Gavressea, Theofani; Valavanis, Christos; Trihia, Helen; Bobos, Mattheos; Lazaridis, Georgios; Koutras, Angelos; Pentheroudakis, George; Skarlos, Pantelis; Bafaloukos, Dimitrios; Arnogiannaki, Niki; Chrisafi, Sofia; Christodoulou, Christos; Papakostas, Pavlos; Aravantinos, Gerasimos; Kosmidis, Paris; Karanikiotis, Charisios; Zografos, George; Papadimitriou, Christos; Fountzilas, George
2017-01-01
To evaluate the prognostic role of elaborate molecular clusters encompassing cyclin D1, cyclin E1, p21, p27 and p53 in the context of various breast cancer subtypes. Cyclin E1, cyclin D1, p53, p21 and p27 were evaluated with immunohistochemistry in 1077 formalin-fixed paraffin-embedded tissues from breast cancer patients who had been treated within clinical trials. Jaccard distances were computed for the markers and the resulted matrix was used for conducting unsupervised hierarchical clustering, in order to identify distinct groups correlating with prognosis. Luminal B and triple-negative (TNBC) tumors presented with the highest and lowest levels of cyclin D1 expression, respectively. By contrast, TNBC frequently expressed Cyclin E1, whereas ER-positive tumors did not. Absence of Cyclin D1 predicted for worse OS, while absence of Cyclin E1 for poorer DFS. The expression patterns of all examined proteins yielded 3 distinct clusters; (1) Cyclin D1 and/or E1 positive with moderate p21 expression; (2) Cyclin D1 and/or E1, and p27 positive, p53 protein negative; and, (3) Cyclin D1 or E1 positive, p53 positive, p21 and p27 negative or moderately positive. The 5-year DFS rates for clusters 1, 2 and 3 were 70.0%, 79.1%, 67.4% and OS 88.4%, 90.4%, 78.9%, respectively. It seems that the expression of cell cycle regulators in the absence of p53 protein is associated with favorable prognosis in operable breast cancer.
Gaussian process based independent analysis for temporal source separation in fMRI.
Hald, Ditte Høvenhoff; Henao, Ricardo; Winther, Ole
2017-05-15
Functional Magnetic Resonance Imaging (fMRI) gives us a unique insight into the processes of the brain, and opens up for analyzing the functional activation patterns of the underlying sources. Task-inferred supervised learning with restrictive assumptions in the regression set-up, restricts the exploratory nature of the analysis. Fully unsupervised independent component analysis (ICA) algorithms, on the other hand, can struggle to detect clear classifiable components on single-subject data. We attribute this shortcoming to inadequate modeling of the fMRI source signals by failing to incorporate its temporal nature. fMRI source signals, biological stimuli and non-stimuli-related artifacts are all smooth over a time-scale compatible with the sampling time (TR). We therefore propose Gaussian process ICA (GPICA), which facilitates temporal dependency by the use of Gaussian process source priors. On two fMRI data sets with different sampling frequency, we show that the GPICA-inferred temporal components and associated spatial maps allow for a more definite interpretation than standard temporal ICA methods. The temporal structures of the sources are controlled by the covariance of the Gaussian process, specified by a kernel function with an interpretable and controllable temporal length scale parameter. We propose a hierarchical model specification, considering both instantaneous and convolutive mixing, and we infer source spatial maps, temporal patterns and temporal length scale parameters by Markov Chain Monte Carlo. A companion implementation made as a plug-in for SPM can be downloaded from https://github.com/dittehald/GPICA. Copyright © 2017 Elsevier Inc. All rights reserved.
Niegowski, Maciej; Zivanovic, Miroslav
2016-03-01
We present a novel approach aimed at removing electrocardiogram (ECG) perturbation from single-channel surface electromyogram (EMG) recordings by means of unsupervised learning of wavelet-based intensity images. The general idea is to combine the suitability of certain wavelet decomposition bases which provide sparse electrocardiogram time-frequency representations, with the capacity of non-negative matrix factorization (NMF) for extracting patterns from images. In order to overcome convergence problems which often arise in NMF-related applications, we design a novel robust initialization strategy which ensures proper signal decomposition in a wide range of ECG contamination levels. Moreover, the method can be readily used because no a priori knowledge or parameter adjustment is needed. The proposed method was evaluated on real surface EMG signals against two state-of-the-art unsupervised learning algorithms and a singular spectrum analysis based method. The results, expressed in terms of high-to-low energy ratio, normalized median frequency, spectral power difference and normalized average rectified value, suggest that the proposed method enables better ECG-EMG separation quality than the reference methods. Copyright © 2015 IPEM. Published by Elsevier Ltd. All rights reserved.
Classifying seismic noise and sources from OBS data using unsupervised machine learning
NASA Astrophysics Data System (ADS)
Mosher, S. G.; Audet, P.
2017-12-01
The paradigm of plate tectonics was established mainly by recognizing the central role of oceanic plates in the production and destruction of tectonic plates at their boundaries. Since that realization, however, seismic studies of tectonic plates and their associated deformation have slowly shifted their attention toward continental plates due to the ease of installation and maintenance of high-quality seismic networks on land. The result has been a much more detailed understanding of the seismicity patterns associated with continental plate deformation in comparison with the low-magnitude deformation patterns within oceanic plates and at their boundaries. While the number of high-quality ocean-bottom seismometer (OBS) deployments within the past decade has demonstrated the potential to significantly increase our understanding of tectonic systems in oceanic settings, OBS data poses significant challenges to many of the traditional data processing techniques in seismology. In particular, problems involving the detection, location, and classification of seismic sources occurring within oceanic settings are much more difficult due to the extremely noisy seafloor environment in which data are recorded. However, classifying data without a priori constraints is a problem that is routinely pursued via unsupervised machine learning algorithms, which remain robust even in cases involving complicated datasets. In this research, we apply simple unsupervised machine learning algorithms (e.g., clustering) to OBS data from the Cascadia Initiative in an attempt to classify and detect a broad range of seismic sources, including various noise sources and tremor signals occurring within ocean settings.
Fossil Signatures Using Elemental Abundance Distributions and Bayesian Probabilistic Classification
NASA Technical Reports Server (NTRS)
Hoover, Richard B.; Storrie-Lombardi, Michael C.
2004-01-01
Elemental abundances (C6, N7, O8, Na11, Mg12, Al3, P15, S16, Cl17, K19, Ca20, Ti22, Mn25, Fe26, and Ni28) were obtained for a set of terrestrial fossils and the rock matrix surrounding them. Principal Component Analysis extracted five factors accounting for the 92.5% of the data variance, i.e. information content, of the elemental abundance data. Hierarchical Cluster Analysis provided unsupervised sample classification distinguishing fossil from matrix samples on the basis of either raw abundances or PCA input that agreed strongly with visual classification. A stochastic, non-linear Artificial Neural Network produced a Bayesian probability of correct sample classification. The results provide a quantitative probabilistic methodology for discriminating terrestrial fossils from the surrounding rock matrix using chemical information. To demonstrate the applicability of these techniques to the assessment of meteoritic samples or in situ extraterrestrial exploration, we present preliminary data on samples of the Orgueil meteorite. In both systems an elemental signature produces target classification decisions remarkably consistent with morphological classification by a human expert using only structural (visual) information. We discuss the possibility of implementing a complexity analysis metric capable of automating certain image analysis and pattern recognition abilities of the human eye using low magnification optical microscopy images and discuss the extension of this technique across multiple scales.
Lifelong learning of human actions with deep neural network self-organization.
Parisi, German I; Tani, Jun; Weber, Cornelius; Wermter, Stefan
2017-12-01
Lifelong learning is fundamental in autonomous robotics for the acquisition and fine-tuning of knowledge through experience. However, conventional deep neural models for action recognition from videos do not account for lifelong learning but rather learn a batch of training data with a predefined number of action classes and samples. Thus, there is the need to develop learning systems with the ability to incrementally process available perceptual cues and to adapt their responses over time. We propose a self-organizing neural architecture for incrementally learning to classify human actions from video sequences. The architecture comprises growing self-organizing networks equipped with recurrent neurons for processing time-varying patterns. We use a set of hierarchically arranged recurrent networks for the unsupervised learning of action representations with increasingly large spatiotemporal receptive fields. Lifelong learning is achieved in terms of prediction-driven neural dynamics in which the growth and the adaptation of the recurrent networks are driven by their capability to reconstruct temporally ordered input sequences. Experimental results on a classification task using two action benchmark datasets show that our model is competitive with state-of-the-art methods for batch learning also when a significant number of sample labels are missing or corrupted during training sessions. Additional experiments show the ability of our model to adapt to non-stationary input avoiding catastrophic interference. Copyright © 2017 The Author(s). Published by Elsevier Ltd.. All rights reserved.
Mat-Desa, Wan N S; Ismail, Dzulkiflee; NicDaeid, Niamh
2011-10-15
Three different medium petroleum distillate (MPD) products (white spirit, paint brush cleaner, and lamp oil) were purchased from commercial stores in Glasgow, Scotland. Samples of 10, 25, 50, 75, 90, and 95% evaporated product were prepared, resulting in 56 samples in total which were analyzed using gas chromatography-mass spectrometry. Data sets from the chromatographic patterns were examined and preprocessed for unsupervised multivariate analyses using principal component analysis (PCA), hierarchical cluster analysis (HCA), and a self organizing feature map (SOFM) artificial neural network. It was revealed that data sets comprised of higher boiling point hydrocarbon compounds provided a good means for the classification of the samples and successfully linked highly weathered samples back to their unevaporated counterpart in every case. The classification abilities of SOFM were further tested and validated for their predictive abilities where one set of weather data in each case was withdrawn from the sample set and used as a test set of the retrained network. This revealed SOFM to be an outstanding mechanism for sample discrimination and linkage over the more conventional PCA and HCA methods often suggested for such data analysis. SOFM also has the advantage of providing additional information through the evaluation of component planes facilitating the investigation of underlying variables that account for the classification. © 2011 American Chemical Society
NASA Astrophysics Data System (ADS)
Li, Zheng; Jiang, Yi-han; Duan, Lian; Zhu, Chao-zhe
2017-08-01
Objective. Functional near infra-red spectroscopy (fNIRS) is a promising brain imaging technology for brain-computer interfaces (BCI). Future clinical uses of fNIRS will likely require operation over long time spans, during which neural activation patterns may change. However, current decoders for fNIRS signals are not designed to handle changing activation patterns. The objective of this study is to test via simulations a new adaptive decoder for fNIRS signals, the Gaussian mixture model adaptive classifier (GMMAC). Approach. GMMAC can simultaneously classify and track activation pattern changes without the need for ground-truth labels. This adaptive classifier uses computationally efficient variational Bayesian inference to label new data points and update mixture model parameters, using the previous model parameters as priors. We test GMMAC in simulations in which neural activation patterns change over time and compare to static decoders and unsupervised adaptive linear discriminant analysis classifiers. Main results. Our simulation experiments show GMMAC can accurately decode under time-varying activation patterns: shifts of activation region, expansions of activation region, and combined contractions and shifts of activation region. Furthermore, the experiments show the proposed method can track the changing shape of the activation region. Compared to prior work, GMMAC performed significantly better than the other unsupervised adaptive classifiers on a difficult activation pattern change simulation: 99% versus <54% in two-choice classification accuracy. Significance. We believe GMMAC will be useful for clinical fNIRS-based brain-computer interfaces, including neurofeedback training systems, where operation over long time spans is required.
ERIC Educational Resources Information Center
Wu, Chung-Hsien; Su, Hung-Yu; Liu, Chao-Hong
2013-01-01
This study presents an efficient approach to personalized mispronunciation detection of Taiwanese-accented English. The main goal of this study was to detect frequently occurring mispronunciation patterns of Taiwanese-accented English instead of scoring English pronunciations directly. The proposed approach quickly identifies personalized…
Quick fuzzy backpropagation algorithm.
Nikov, A; Stoeva, S
2001-03-01
A modification of the fuzzy backpropagation (FBP) algorithm called QuickFBP algorithm is proposed, where the computation of the net function is significantly quicker. It is proved that the FBP algorithm is of exponential time complexity, while the QuickFBP algorithm is of polynomial time complexity. Convergence conditions of the QuickFBP, resp. the FBP algorithm are defined and proved for: (1) single output neural networks in case of training patterns with different targets; and (2) multiple output neural networks in case of training patterns with equivalued target vector. They support the automation of the weights training process (quasi-unsupervised learning) establishing the target value(s) depending on the network's input values. In these cases the simulation results confirm the convergence of both algorithms. An example with a large-sized neural network illustrates the significantly greater training speed of the QuickFBP rather than the FBP algorithm. The adaptation of an interactive web system to users on the basis of the QuickFBP algorithm is presented. Since the QuickFBP algorithm ensures quasi-unsupervised learning, this implies its broad applicability in areas of adaptive and adaptable interactive systems, data mining, etc. applications.
Tan, Jie; Doing, Georgia; Lewis, Kimberley A; Price, Courtney E; Chen, Kathleen M; Cady, Kyle C; Perchuk, Barret; Laub, Michael T; Hogan, Deborah A; Greene, Casey S
2017-07-26
Cross-experiment comparisons in public data compendia are challenged by unmatched conditions and technical noise. The ADAGE method, which performs unsupervised integration with denoising autoencoder neural networks, can identify biological patterns, but because ADAGE models, like many neural networks, are over-parameterized, different ADAGE models perform equally well. To enhance model robustness and better build signatures consistent with biological pathways, we developed an ensemble ADAGE (eADAGE) that integrated stable signatures across models. We applied eADAGE to a compendium of Pseudomonas aeruginosa gene expression profiling experiments performed in 78 media. eADAGE revealed a phosphate starvation response controlled by PhoB in media with moderate phosphate and predicted that a second stimulus provided by the sensor kinase, KinB, is required for this PhoB activation. We validated this relationship using both targeted and unbiased genetic approaches. eADAGE, which captures stable biological patterns, enables cross-experiment comparisons that can highlight measured but undiscovered relationships. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.
Method for indexing and retrieving manufacturing-specific digital imagery based on image content
Ferrell, Regina K.; Karnowski, Thomas P.; Tobin, Jr., Kenneth W.
2004-06-15
A method for indexing and retrieving manufacturing-specific digital images based on image content comprises three steps. First, at least one feature vector can be extracted from a manufacturing-specific digital image stored in an image database. In particular, each extracted feature vector corresponds to a particular characteristic of the manufacturing-specific digital image, for instance, a digital image modality and overall characteristic, a substrate/background characteristic, and an anomaly/defect characteristic. Notably, the extracting step includes generating a defect mask using a detection process. Second, using an unsupervised clustering method, each extracted feature vector can be indexed in a hierarchical search tree. Third, a manufacturing-specific digital image associated with a feature vector stored in the hierarchicial search tree can be retrieved, wherein the manufacturing-specific digital image has image content comparably related to the image content of the query image. More particularly, can include two data reductions, the first performed based upon a query vector extracted from a query image. Subsequently, a user can select relevant images resulting from the first data reduction. From the selection, a prototype vector can be calculated, from which a second-level data reduction can be performed. The second-level data reduction can result in a subset of feature vectors comparable to the prototype vector, and further comparable to the query vector. An additional fourth step can include managing the hierarchical search tree by substituting a vector average for several redundant feature vectors encapsulated by nodes in the hierarchical search tree.
Unsupervised Structure Detection in Biomedical Data.
Vogt, Julia E
2015-01-01
A major challenge in computational biology is to find simple representations of high-dimensional data that best reveal the underlying structure. In this work, we present an intuitive and easy-to-implement method based on ranked neighborhood comparisons that detects structure in unsupervised data. The method is based on ordering objects in terms of similarity and on the mutual overlap of nearest neighbors. This basic framework was originally introduced in the field of social network analysis to detect actor communities. We demonstrate that the same ideas can successfully be applied to biomedical data sets in order to reveal complex underlying structure. The algorithm is very efficient and works on distance data directly without requiring a vectorial embedding of data. Comprehensive experiments demonstrate the validity of this approach. Comparisons with state-of-the-art clustering methods show that the presented method outperforms hierarchical methods as well as density based clustering methods and model-based clustering. A further advantage of the method is that it simultaneously provides a visualization of the data. Especially in biomedical applications, the visualization of data can be used as a first pre-processing step when analyzing real world data sets to get an intuition of the underlying data structure. We apply this model to synthetic data as well as to various biomedical data sets which demonstrate the high quality and usefulness of the inferred structure.
Hierarchical Gene Selection and Genetic Fuzzy System for Cancer Microarray Data Classification
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2015-01-01
This paper introduces a novel approach to gene selection based on a substantial modification of analytic hierarchy process (AHP). The modified AHP systematically integrates outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal to noise ratio are employed to rank genes. These ranked genes are then considered as inputs for the modified AHP. Additionally, a method that uses fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. Genetic algorithm (GA) is incorporated in-between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional-low-sample nature of microarray data and thus enhance the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection against the single ranking methods. Furthermore, the combination of AHP-FSAM shows a great accuracy in microarray data classification compared to various competing classifiers. The proposed approach therefore is useful for medical practitioners and clinicians as a decision support system that can be implemented in the real medical practice. PMID:25823003
Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.
Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid
2015-01-01
This paper introduces a novel approach to gene selection based on a substantial modification of analytic hierarchy process (AHP). The modified AHP systematically integrates outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal to noise ratio are employed to rank genes. These ranked genes are then considered as inputs for the modified AHP. Additionally, a method that uses fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. Genetic algorithm (GA) is incorporated in-between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional-low-sample nature of microarray data and thus enhance the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection against the single ranking methods. Furthermore, the combination of AHP-FSAM shows a great accuracy in microarray data classification compared to various competing classifiers. The proposed approach therefore is useful for medical practitioners and clinicians as a decision support system that can be implemented in the real medical practice.
Fong, Allan; Clark, Lindsey; Cheng, Tianyi; Franklin, Ella; Fernandez, Nicole; Ratwani, Raj; Parker, Sarah Henrickson
2017-07-01
The objective of this paper is to identify attribute patterns of influential individuals in intensive care units using unsupervised cluster analysis. Despite the acknowledgement that culture of an organisation is critical to improving patient safety, specific methods to shift culture have not been explicitly identified. A social network analysis survey was conducted and an unsupervised cluster analysis was used. A total of 100 surveys were gathered. Unsupervised cluster analysis was used to group individuals with similar dimensions highlighting three general genres of influencers: well-rounded, knowledge and relational. Culture is created locally by individual influencers. Cluster analysis is an effective way to identify common characteristics among members of an intensive care unit team that are noted as highly influential by their peers. To change culture, identifying and then integrating the influencers in intervention development and dissemination may create more sustainable and effective culture change. Additional studies are ongoing to test the effectiveness of utilising these influencers to disseminate patient safety interventions. This study offers an approach that can be helpful in both identifying and understanding influential team members and may be an important aspect of developing methods to change organisational culture. © 2017 John Wiley & Sons Ltd.
Application of diffusion maps to identify human factors of self-reported anomalies in aviation.
Andrzejczak, Chris; Karwowski, Waldemar; Mikusinski, Piotr
2012-01-01
A study investigating what factors are present leading to pilots submitting voluntary anomaly reports regarding their flight performance was conducted. Diffusion Maps (DM) were selected as the method of choice for performing dimensionality reduction on text records for this study. Diffusion Maps have seen successful use in other domains such as image classification and pattern recognition. High-dimensionality data in the form of narrative text reports from the NASA Aviation Safety Reporting System (ASRS) were clustered and categorized by way of dimensionality reduction. Supervised analyses were performed to create a baseline document clustering system. Dimensionality reduction techniques identified concepts or keywords within records, and allowed the creation of a framework for an unsupervised document classification system. Results from the unsupervised clustering algorithm performed similarly to the supervised methods outlined in the study. The dimensionality reduction was performed on 100 of the most commonly occurring words within 126,000 text records describing commercial aviation incidents. This study demonstrates that unsupervised machine clustering and organization of incident reports is possible based on unbiased inputs. Findings from this study reinforced traditional views on what factors contribute to civil aviation anomalies, however, new associations between previously unrelated factors and conditions were also found.
Yang, Guang; Nawaz, Tahir; Barrick, Thomas R; Howe, Franklyn A; Slabaugh, Greg
2015-12-01
Many approaches have been considered for automatic grading of brain tumors by means of pattern recognition with magnetic resonance spectroscopy (MRS). Providing an improved technique which can assist clinicians in accurately identifying brain tumor grades is our main objective. The proposed technique, which is based on the discrete wavelet transform (DWT) of whole-spectral or subspectral information of key metabolites, combined with unsupervised learning, inspects the separability of the extracted wavelet features from the MRS signal to aid the clustering. In total, we included 134 short echo time single voxel MRS spectra (SV MRS) in our study that cover normal controls, low grade and high grade tumors. The combination of DWT-based whole-spectral or subspectral analysis and unsupervised clustering achieved an overall clustering accuracy of 94.8% and a balanced error rate of 7.8%. To the best of our knowledge, it is the first study using DWT combined with unsupervised learning to cluster brain SV MRS. Instead of dimensionality reduction on SV MRS or feature selection using model fitting, our study provides an alternative method of extracting features to obtain promising clustering results.
Mehryary, Farrokh; Kaewphan, Suwisa; Hakala, Kai; Ginter, Filip
2016-01-01
Biomedical event extraction is one of the key tasks in biomedical text mining, supporting various applications such as database curation and hypothesis generation. Several systems, some of which have been applied at a large scale, have been introduced to solve this task. Past studies have shown that the identification of the phrases describing biological processes, also known as trigger detection, is a crucial part of event extraction, and notable overall performance gains can be obtained by solely focusing on this sub-task. In this paper we propose a novel approach for filtering falsely identified triggers from large-scale event databases, thus improving the quality of knowledge extraction. Our method relies on state-of-the-art word embeddings, event statistics gathered from the whole biomedical literature, and both supervised and unsupervised machine learning techniques. We focus on EVEX, an event database covering the whole PubMed and PubMed Central Open Access literature containing more than 40 million extracted events. The top most frequent EVEX trigger words are hierarchically clustered, and the resulting cluster tree is pruned to identify words that can never act as triggers regardless of their context. For rarely occurring trigger words we introduce a supervised approach trained on the combination of trigger word classification produced by the unsupervised clustering method and manual annotation. The method is evaluated on the official test set of BioNLP Shared Task on Event Extraction. The evaluation shows that the method can be used to improve the performance of the state-of-the-art event extraction systems. This successful effort also translates into removing 1,338,075 of potentially incorrect events from EVEX, thus greatly improving the quality of the data. The method is not solely bound to the EVEX resource and can be thus used to improve the quality of any event extraction system or database. The data and source code for this work are available at: http://bionlp-www.utu.fi/trigger-clustering/.
Srinivasa, Narayan; Cho, Youngkwan
2014-01-01
A spiking neural network model is described for learning to discriminate among spatial patterns in an unsupervised manner. The network anatomy consists of source neurons that are activated by external inputs, a reservoir that resembles a generic cortical layer with an excitatory-inhibitory (EI) network and a sink layer of neurons for readout. Synaptic plasticity in the form of STDP is imposed on all the excitatory and inhibitory synapses at all times. While long-term excitatory STDP enables sparse and efficient learning of the salient features in inputs, inhibitory STDP enables this learning to be stable by establishing a balance between excitatory and inhibitory currents at each neuron in the network. The synaptic weights between source and reservoir neurons form a basis set for the input patterns. The neural trajectories generated in the reservoir due to input stimulation and lateral connections between reservoir neurons can be readout by the sink layer neurons. This activity is used for adaptation of synapses between reservoir and sink layer neurons. A new measure called the discriminability index (DI) is introduced to compute if the network can discriminate between old patterns already presented in an initial training session. The DI is also used to compute if the network adapts to new patterns without losing its ability to discriminate among old patterns. The final outcome is that the network is able to correctly discriminate between all patterns-both old and new. This result holds as long as inhibitory synapses employ STDP to continuously enable current balance in the network. The results suggest a possible direction for future investigation into how spiking neural networks could address the stability-plasticity question despite having continuous synaptic plasticity.
van Doorn, Remco; Dijkman, Remco; Vermeer, Maarten H; Out-Luiting, Jacoba J; van der Raaij-Helmer, Elisabeth M H; Willemze, Rein; Tensen, Cornelis P
2004-08-15
Sézary syndrome (Sz) is a malignancy of CD4+ memory skin-homing T cells and presents with erythroderma, lymphadenopathy, and peripheral blood involvement. To gain more insight into the molecular features of Sz, oligonucleotide array analysis was performed comparing gene expression patterns of CD4+ T cells from peripheral blood of patients with Sz with those of patients with erythroderma secondary to dermatitis and healthy controls. Using unsupervised hierarchical clustering gene, expression patterns of T cells from patients with Sz were classified separately from those of benign T cells. One hundred twenty-three genes were identified as significantly differentially expressed and had an average fold change exceeding 2. T cells from patients with Sz demonstrated decreased expression of the following hematopoietic malignancy-linked tumor suppressor genes: TGF-beta receptor II, Mxi1, Riz1, CREB-binding protein, BCL11a, STAT4, and Forkhead Box O1A. Moreover, the tyrosine kinase receptor EphA4 and the potentially oncogenic transcription factor Twist were highly and selectively expressed in T cells of patients with Sz. High expression of EphA4 and Twist was also observed in lesional skin biopsy specimens of a subset of patients with cutaneous T cell lymphomas related to Sz, whereas their expression was nearly undetectable in benign T cells or in skin lesions of patients with inflammatory dermatoses. Detection of EphA4 and Twist may be used in the molecular diagnosis of Sz and related cutaneous T-cell lymphomas. Furthermore, the membrane-bound EphA4 receptor may serve as a target for directed therapeutic intervention.
Lei, Zhengdeng; Tan, Iain Beehuat; Das, Kakoli; Deng, Niantao; Zouridis, Hermioni; Pattison, Sharon; Chua, Clarinda; Feng, Zhu; Guan, Yeoh Khay; Ooi, Chia Huey; Ivanova, Tatiana; Zhang, Shenli; Lee, Minghui; Wu, Jeanie; Ngo, Anna; Manesh, Sravanthy; Tan, Elisabeth; Teh, Bin Tean; So, Jimmy Bok Yan; Goh, Liang Kee; Boussioutas, Alex; Lim, Tony Kiat Hon; Flotow, Horst; Tan, Patrick; Rozen, Steven G
2013-09-01
Almost all gastric cancers are adenocarcinomas, which have considerable heterogeneity among patients. We sought to identify subtypes of gastric adenocarcinomas with particular biological properties and responses to chemotherapy and targeted agents. We compared gene expression patterns among 248 gastric tumors; using a robust method of unsupervised clustering, consensus hierarchical clustering with iterative feature selection, we identified 3 major subtypes. We developed a classifier for these subtypes and validated it in 70 tumors from a different population. We identified distinct genomic and epigenomic properties of the subtypes. We determined drug sensitivities of the subtypes in primary tumors using clinical survival data, and in cell lines through high-throughput drug screening. We identified 3 subtypes of gastric adenocarcinoma: proliferative, metabolic, and mesenchymal. Tumors of the proliferative subtype had high levels of genomic instability, TP53 mutations, and DNA hypomethylation. Cancer cells of the metabolic subtype were more sensitive to 5-fluorouracil than the other subtypes. Furthermore, in 2 independent groups of patients, those with tumors of the metabolic subtype appeared to have greater benefits with 5-fluorouracil treatment. Tumors of the mesenchymal subtype contain cells with features of cancer stem cells, and cell lines of this subtype are particularly sensitive to phosphatidylinositol 3-kinase-AKT-mTOR inhibitors in vitro. Based on gene expression patterns, we classified gastric cancers into 3 subtypes, and validated these in an independent set of tumors. The subgroups have differences in molecular and genetic features and response to therapy; this information might be used to select specific treatment approaches for patients with gastric cancer. Copyright © 2013 AGA Institute. Published by Elsevier Inc. All rights reserved.
Corbin, Karen D.; Abdelmalek, Manal F.; Spencer, Melanie D.; da Costa, Kerry-Ann; Galanko, Joseph A.; Sha, Wei; Suzuki, Ayako; Guy, Cynthia D.; Cardona, Diana M.; Torquati, Alfonso; Diehl, Anna Mae; Zeisel, Steven H.
2013-01-01
Choline metabolism is important for very low-density lipoprotein secretion, making this nutritional pathway an important contributor to hepatic lipid balance. The purpose of this study was to assess whether the cumulative effects of multiple single nucleotide polymorphisms (SNPs) across genes of choline/1-carbon metabolism and functionally related pathways increase susceptibility to developing hepatic steatosis. In biopsy-characterized cases of nonalcoholic fatty liver disease and controls, we assessed 260 SNPs across 21 genes in choline/1-carbon metabolism. When SNPs were examined individually, using logistic regression, we only identified a single SNP (PNPLA3 rs738409) that was significantly associated with severity of hepatic steatosis after adjusting for confounders and multiple comparisons (P=0.02). However, when groupings of SNPs in similar metabolic pathways were defined using unsupervised hierarchical clustering, we identified groups of subjects with shared SNP signatures that were significantly correlated with steatosis burden (P=0.0002). The lowest and highest steatosis clusters could also be differentiated by ethnicity. However, unique SNP patterns defined steatosis burden irrespective of ethnicity. Our results suggest that analysis of SNP patterns in genes of choline/1-carbon metabolism may be useful for prediction of severity of steatosis in specific subsets of people, and the metabolic inefficiencies caused by these SNPs should be examined further.—Corbin, K. D., Abdelmalek, M. F., Spencer, M. D., da Costa, K.-A., Galanko, J. A., Sha, W., Suzuki, A., Guy, C. D., Cardona, D. M., Torquati, A., Diehl, A. M., Zeisel, S. H. Genetic signatures in choline and 1-carbon metabolism are associated with the severity of hepatic steatosis. PMID:23292069
Iatropoulos, Paraskevas; Daina, Erica; Curreri, Manuela; Piras, Rossella; Valoti, Elisabetta; Mele, Caterina; Bresin, Elena; Gamba, Sara; Alberti, Marta; Breno, Matteo; Perna, Annalisa; Bettoni, Serena; Sabadini, Ettore; Murer, Luisa; Vivarelli, Marina; Noris, Marina; Remuzzi, Giuseppe
2018-01-01
Membranoproliferative GN (MPGN) was recently reclassified as alternative pathway complement-mediated C3 glomerulopathy (C3G) and immune complex-mediated membranoproliferative GN (IC-MPGN). However, genetic and acquired alternative pathway abnormalities are also observed in IC-MPGN. Here, we explored the presence of distinct disease entities characterized by specific pathophysiologic mechanisms. We performed unsupervised hierarchical clustering, a data-driven statistical approach, on histologic, genetic, and clinical data and data regarding serum/plasma complement parameters from 173 patients with C3G/IC-MPGN. This approach divided patients into four clusters, indicating the existence of four different pathogenetic patterns. Specifically, this analysis separated patients with fluid-phase complement activation (clusters 1-3) who had low serum C3 levels and a high prevalence of genetic and acquired alternative pathway abnormalities from patients with solid-phase complement activation (cluster 4) who had normal or mildly altered serum C3, late disease onset, and poor renal survival. In patients with fluid-phase complement activation, those in clusters 1 and 2 had massive activation of the alternative pathway, including activation of the terminal pathway, and the highest prevalence of subendothelial deposits, but those in cluster 2 had additional activation of the classic pathway and the highest prevalence of nephrotic syndrome at disease onset. Patients in cluster 3 had prevalent activation of C3 convertase and highly electron-dense intramembranous deposits. In addition, we provide a simple algorithm to assign patients with C3G/IC-MPGN to specific clusters. These distinct clusters may facilitate clarification of disease etiology, improve risk assessment for ESRD, and pave the way for personalized treatment. Copyright © 2018 by the American Society of Nephrology.
Selective hierarchical patterning of silicon nanostructures via soft nanostencil lithography
NASA Astrophysics Data System (ADS)
Du, Ke; Ding, Junjun; Wathuthanthri, Ishan; Choi, Chang-Hwan
2017-11-01
It is challenging to hierarchically pattern high-aspect-ratio nanostructures on microstructures using conventional lithographic techniques, where photoresist (PR) film is not able to uniformly cover on the microstructures as the aspect ratio increases. Such non-uniformity causes poor definition of nanopatterns over the microstructures. Nanostencil lithography can provide an alternative means to hierarchically construct nanostructures on microstructures via direct deposition or plasma etching through a free-standing nanoporous membrane. In this work, we demonstrate the multiscale hierarchical fabrication of high-aspect-ratio nanostructures on microstructures of silicon using a free-standing nanostencil, which is a nanoporous membrane consisting of metal (Cr), PR, and anti-reflective coating. The nanostencil membrane is used as a deposition mask to define Cr nanodot patterns on the predefined silicon microstructures. Then, deep reactive ion etching is used to hierarchically create nanostructures on the microstructures using the Cr nanodots as an etch mask. With simple modification of the main fabrication processes, high-aspect-ratio nanopillars are selectively defined only on top of the microstructures, on bottom, or on both top and bottom.
Selective hierarchical patterning of silicon nanostructures via soft nanostencil lithography.
Du, Ke; Ding, Junjun; Wathuthanthri, Ishan; Choi, Chang-Hwan
2017-11-17
It is challenging to hierarchically pattern high-aspect-ratio nanostructures on microstructures using conventional lithographic techniques, where photoresist (PR) film is not able to uniformly cover on the microstructures as the aspect ratio increases. Such non-uniformity causes poor definition of nanopatterns over the microstructures. Nanostencil lithography can provide an alternative means to hierarchically construct nanostructures on microstructures via direct deposition or plasma etching through a free-standing nanoporous membrane. In this work, we demonstrate the multiscale hierarchical fabrication of high-aspect-ratio nanostructures on microstructures of silicon using a free-standing nanostencil, which is a nanoporous membrane consisting of metal (Cr), PR, and anti-reflective coating. The nanostencil membrane is used as a deposition mask to define Cr nanodot patterns on the predefined silicon microstructures. Then, deep reactive ion etching is used to hierarchically create nanostructures on the microstructures using the Cr nanodots as an etch mask. With simple modification of the main fabrication processes, high-aspect-ratio nanopillars are selectively defined only on top of the microstructures, on bottom, or on both top and bottom.
Schwieger, Wilhelm; Machoke, Albert Gonche; Weissenberger, Tobias; Inayat, Amer; Selvam, Thangaraj; Klumpp, Michael; Inayat, Alexandra
2016-06-13
'Hierarchy' is a property which can be attributed to a manifold of different immaterial systems, such as ideas, items and organisations or material ones like biological systems within living organisms or artificial, man-made constructions. The property 'hierarchy' is mainly characterised by a certain ordering of individual elements relative to each other, often in combination with a certain degree of branching. Especially mass-flow related systems in the natural environment feature special hierarchically branched patterns. This review is a survey into the world of hierarchical systems with special focus on hierarchically porous zeolite materials. A classification of hierarchical porosity is proposed based on the flow distribution pattern within the respective pore systems. In addition, this review might serve as a toolbox providing several synthetic and post-synthetic strategies to prepare zeolitic or zeolite containing material with tailored hierarchical porosity. Very often, such strategies with their underlying principles were developed for improving the performance of the final materials in different technical applications like adsorptive or catalytic processes. In the present review, besides on the hierarchically porous all-zeolite material, special focus is laid on the preparation of zeolitic composite materials with hierarchical porosity capable to face the demands of industrial application.
NASA Astrophysics Data System (ADS)
Lins, D. B.; Zullo, J.; Friedel, M. J.
2013-12-01
The Cerrado (savanna ecosystem) of São Paulo state (Brazil) represent a complex mosaic of different typologies of uses, actors and biophysical and social restrictions. Originally, 14% of the state of São Paulo area was covered by the diversity of Cerrado phytophysiognomies. Currently, only 1% of this original composition remains fragmented into numerous relicts of biodiversity, mainly concentrated in the central-eastern of the state. A relevant part of the fragments are found in areas of intense coverage change by human activities, whereas the greatest pressure comes from sugar cane cultivation, either by direct replacement of Cerrado vegetation or occupying pasture areas in the fragments edges. As a result, new local level dynamics has been introduced, directly or indirectly, affecting the established of processes in climate systems. In this study, the main goal is analyzing the relationship between the Cerrado landscape changing and the climate dynamics in regional and local areas. The multi-temporal MODIS 250 m Vegetation Index (VI) datasets (period of 2000 to 2012) are integrated with precipitation data of the correspondent period (http://www.agritempo.gov.br/),one of the most important variable of the spatial phytophysiognomies distribution. The integration of meteorological data enable the development of an integrated approach to understand the relationship between climatic seasonality and the changes in the spatial patterns. A procedure to congregated diverse dynamics information is the Self Organizing Map (SOM, Kohonen, 2001), a technique that relies on unsupervised competitive learning (Kohonen and Somervuo 2002) to recognize patterns. In this approach, high-dimensional data are represented on two dimensions, making possible to obtain patterns that takes into account information from different natures. Observed advances will contribute to bring machine-learning techniques as a valid tool to provide improve in land use/land cover (LULC) analyzes at different hierarchical scales to support numerous science and policy applications.
Exploring the Dynamics of Dyadic Interactions via Hierarchical Segmentation
ERIC Educational Resources Information Center
Hsieh, Fushing; Ferrer, Emilio; Chen, Shu-Chun; Chow, Sy-Miin
2010-01-01
In this article we present an exploratory tool for extracting systematic patterns from multivariate data. The technique, hierarchical segmentation (HS), can be used to group multivariate time series into segments with similar discrete-state recurrence patterns and it is not restricted by the stationarity assumption. We use a simulation study to…
Quasi-Supervised Scoring of Human Sleep in Polysomnograms Using Augmented Input Variables
Yaghouby, Farid; Sunderam, Sridhar
2015-01-01
The limitations of manual sleep scoring make computerized methods highly desirable. Scoring errors can arise from human rater uncertainty or inter-rater variability. Sleep scoring algorithms either come as supervised classifiers that need scored samples of each state to be trained, or as unsupervised classifiers that use heuristics or structural clues in unscored data to define states. We propose a quasi-supervised classifier that models observations in an unsupervised manner but mimics a human rater wherever training scores are available. EEG, EMG, and EOG features were extracted in 30s epochs from human-scored polysomnograms recorded from 42 healthy human subjects (18 to 79 years) and archived in an anonymized, publicly accessible database. Hypnograms were modified so that: 1. Some states are scored but not others; 2. Samples of all states are scored but not for transitional epochs; and 3. Two raters with 67% agreement are simulated. A framework for quasi-supervised classification was devised in which unsupervised statistical models—specifically Gaussian mixtures and hidden Markov models—are estimated from unlabeled training data, but the training samples are augmented with variables whose values depend on available scores. Classifiers were fitted to signal features incorporating partial scores, and used to predict scores for complete recordings. Performance was assessed using Cohen's K statistic. The quasi-supervised classifier performed significantly better than an unsupervised model and sometimes as well as a completely supervised model despite receiving only partial scores. The quasi-supervised algorithm addresses the need for classifiers that mimic scoring patterns of human raters while compensating for their limitations. PMID:25679475
Quasi-supervised scoring of human sleep in polysomnograms using augmented input variables.
Yaghouby, Farid; Sunderam, Sridhar
2015-04-01
The limitations of manual sleep scoring make computerized methods highly desirable. Scoring errors can arise from human rater uncertainty or inter-rater variability. Sleep scoring algorithms either come as supervised classifiers that need scored samples of each state to be trained, or as unsupervised classifiers that use heuristics or structural clues in unscored data to define states. We propose a quasi-supervised classifier that models observations in an unsupervised manner but mimics a human rater wherever training scores are available. EEG, EMG, and EOG features were extracted in 30s epochs from human-scored polysomnograms recorded from 42 healthy human subjects (18-79 years) and archived in an anonymized, publicly accessible database. Hypnograms were modified so that: 1. Some states are scored but not others; 2. Samples of all states are scored but not for transitional epochs; and 3. Two raters with 67% agreement are simulated. A framework for quasi-supervised classification was devised in which unsupervised statistical models-specifically Gaussian mixtures and hidden Markov models--are estimated from unlabeled training data, but the training samples are augmented with variables whose values depend on available scores. Classifiers were fitted to signal features incorporating partial scores, and used to predict scores for complete recordings. Performance was assessed using Cohen's Κ statistic. The quasi-supervised classifier performed significantly better than an unsupervised model and sometimes as well as a completely supervised model despite receiving only partial scores. The quasi-supervised algorithm addresses the need for classifiers that mimic scoring patterns of human raters while compensating for their limitations. Copyright © 2015 Elsevier Ltd. All rights reserved.
Unsupervised classification of major depression using functional connectivity MRI.
Zeng, Ling-Li; Shen, Hui; Liu, Li; Hu, Dewen
2014-04-01
The current diagnosis of psychiatric disorders including major depressive disorder based largely on self-reported symptoms and clinical signs may be prone to patients' behaviors and psychiatrists' bias. This study aims at developing an unsupervised machine learning approach for the accurate identification of major depression based on single resting-state functional magnetic resonance imaging scans in the absence of clinical information. Twenty-four medication-naive patients with major depression and 29 demographically similar healthy individuals underwent resting-state functional magnetic resonance imaging. We first clustered the voxels within the perigenual cingulate cortex into two subregions, a subgenual region and a pregenual region, according to their distinct resting-state functional connectivity patterns and showed that a maximum margin clustering-based unsupervised machine learning approach extracted sufficient information from the subgenual cingulate functional connectivity map to differentiate depressed patients from healthy controls with a group-level clustering consistency of 92.5% and an individual-level classification consistency of 92.5%. It was also revealed that the subgenual cingulate functional connectivity network with the highest discriminative power primarily included the ventrolateral and ventromedial prefrontal cortex, superior temporal gyri and limbic areas, indicating that these connections may play critical roles in the pathophysiology of major depression. The current study suggests that subgenual cingulate functional connectivity network signatures may provide promising objective biomarkers for the diagnosis of major depression and that maximum margin clustering-based unsupervised machine learning approaches may have the potential to inform clinical practice and aid in research on psychiatric disorders. Copyright © 2013 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Li, Bin
Spatial control behaviors account for a large proportion of human everyday activities from normal daily tasks, such as reaching for objects, to specialized tasks, such as driving, surgery, or operating equipment. These behaviors involve intensive interactions within internal processes (i.e. cognitive, perceptual, and motor control) and with the physical world. This dissertation builds on a concept of interaction pattern and a hierarchical functional model. Interaction pattern represents a type of behavior synergy that humans coordinates cognitive, perceptual, and motor control processes. It contributes to the construction of the hierarchical functional model that delineates humans spatial control behaviors as the coordination of three functional subsystems: planning, guidance, and tracking/pursuit. This dissertation formalizes and validates these two theories and extends them for the investigation of human spatial control skills encompassing development and assessment. Specifically, this dissertation first presents an overview of studies in human spatial control skills encompassing definition, characteristic, development, and assessment, to provide theoretical evidence for the concept of interaction pattern and the hierarchical functional model. The following, the human experiments for collecting motion and gaze data and techniques to register and classify gaze data, are described. This dissertation then elaborates and mathematically formalizes the hierarchical functional model and the concept of interaction pattern. These theories then enables the construction of a succinct simulation model that can reproduce a variety of human performance with a minimal set of hypotheses. This validates the hierarchical functional model as a normative framework for interpreting human spatial control behaviors. The dissertation then investigates human skill development and captures the emergence of interaction pattern. The final part of the dissertation applies the hierarchical functional model for skill assessment and introduces techniques to capture interaction patterns both from the top down using their geometric features and from the bottom up using their dynamical characteristics. The validity and generality of the skill assessment is illustrated using two the remote-control flight and laparoscopic surgical training experiments.
Handfield, Louis-François; Chong, Yolanda T.; Simmons, Jibril; Andrews, Brenda J.; Moses, Alan M.
2013-01-01
Protein subcellular localization has been systematically characterized in budding yeast using fluorescently tagged proteins. Based on the fluorescence microscopy images, subcellular localization of many proteins can be classified automatically using supervised machine learning approaches that have been trained to recognize predefined image classes based on statistical features. Here, we present an unsupervised analysis of protein expression patterns in a set of high-resolution, high-throughput microscope images. Our analysis is based on 7 biologically interpretable features which are evaluated on automatically identified cells, and whose cell-stage dependency is captured by a continuous model for cell growth. We show that it is possible to identify most previously identified localization patterns in a cluster analysis based on these features and that similarities between the inferred expression patterns contain more information about protein function than can be explained by a previous manual categorization of subcellular localization. Furthermore, the inferred cell-stage associated to each fluorescence measurement allows us to visualize large groups of proteins entering the bud at specific stages of bud growth. These correspond to proteins localized to organelles, revealing that the organelles must be entering the bud in a stereotypical order. We also identify and organize a smaller group of proteins that show subtle differences in the way they move around the bud during growth. Our results suggest that biologically interpretable features based on explicit models of cell morphology will yield unprecedented power for pattern discovery in high-resolution, high-throughput microscopy images. PMID:23785265
Bio-inspired computational heuristics to study Lane-Emden systems arising in astrophysics model.
Ahmad, Iftikhar; Raja, Muhammad Asif Zahoor; Bilal, Muhammad; Ashraf, Farooq
2016-01-01
This study reports novel hybrid computational methods for the solutions of nonlinear singular Lane-Emden type differential equation arising in astrophysics models by exploiting the strength of unsupervised neural network models and stochastic optimization techniques. In the scheme the neural network, sub-part of large field called soft computing, is exploited for modelling of the equation in an unsupervised manner. The proposed approximated solutions of higher order ordinary differential equation are calculated with the weights of neural networks trained with genetic algorithm, and pattern search hybrid with sequential quadratic programming for rapid local convergence. The results of proposed solvers for solving the nonlinear singular systems are in good agreements with the standard solutions. Accuracy and convergence the design schemes are demonstrated by the results of statistical performance measures based on the sufficient large number of independent runs.
Unsupervised daily routine and activity discovery in smart homes.
Jie Yin; Qing Zhang; Karunanithi, Mohan
2015-08-01
The ability to accurately recognize daily activities of residents is a core premise of smart homes to assist with remote health monitoring. Most of the existing methods rely on a supervised model trained from a preselected and manually labeled set of activities, which are often time-consuming and costly to obtain in practice. In contrast, this paper presents an unsupervised method for discovering daily routines and activities for smart home residents. Our proposed method first uses a Markov chain to model a resident's locomotion patterns at different times of day and discover clusters of daily routines at the macro level. For each routine cluster, it then drills down to further discover room-level activities at the micro level. The automatic identification of daily routines and activities is useful for understanding indicators of functional decline of elderly people and suggesting timely interventions.
Unsupervised pattern recognition methods in ciders profiling based on GCE voltammetric signals.
Jakubowska, Małgorzata; Sordoń, Wanda; Ciepiela, Filip
2016-07-15
This work presents a complete methodology of distinguishing between different brands of cider and ageing degrees, based on voltammetric signals, utilizing dedicated data preprocessing procedures and unsupervised multivariate analysis. It was demonstrated that voltammograms recorded on glassy carbon electrode in Britton-Robinson buffer at pH 2 are reproducible for each brand. By application of clustering algorithms and principal component analysis visible homogenous clusters were obtained. Advanced signal processing strategy which included automatic baseline correction, interval scaling and continuous wavelet transform with dedicated mother wavelet, was a key step in the correct recognition of the objects. The results show that voltammetry combined with optimized univariate and multivariate data processing is a sufficient tool to distinguish between ciders from various brands and to evaluate their freshness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Comparisons of non-Gaussian statistical models in DNA methylation analysis.
Ma, Zhanyu; Teschendorff, Andrew E; Yu, Hong; Taghia, Jalil; Guo, Jun
2014-06-16
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance.
Comparisons of Non-Gaussian Statistical Models in DNA Methylation Analysis
Ma, Zhanyu; Teschendorff, Andrew E.; Yu, Hong; Taghia, Jalil; Guo, Jun
2014-01-01
As a key regulatory mechanism of gene expression, DNA methylation patterns are widely altered in many complex genetic diseases, including cancer. DNA methylation is naturally quantified by bounded support data; therefore, it is non-Gaussian distributed. In order to capture such properties, we introduce some non-Gaussian statistical models to perform dimension reduction on DNA methylation data. Afterwards, non-Gaussian statistical model-based unsupervised clustering strategies are applied to cluster the data. Comparisons and analysis of different dimension reduction strategies and unsupervised clustering methods are presented. Experimental results show that the non-Gaussian statistical model-based methods are superior to the conventional Gaussian distribution-based method. They are meaningful tools for DNA methylation analysis. Moreover, among several non-Gaussian methods, the one that captures the bounded nature of DNA methylation data reveals the best clustering performance. PMID:24937687
Srinivasa, Narayan; Cho, Youngkwan
2014-01-01
A spiking neural network model is described for learning to discriminate among spatial patterns in an unsupervised manner. The network anatomy consists of source neurons that are activated by external inputs, a reservoir that resembles a generic cortical layer with an excitatory-inhibitory (EI) network and a sink layer of neurons for readout. Synaptic plasticity in the form of STDP is imposed on all the excitatory and inhibitory synapses at all times. While long-term excitatory STDP enables sparse and efficient learning of the salient features in inputs, inhibitory STDP enables this learning to be stable by establishing a balance between excitatory and inhibitory currents at each neuron in the network. The synaptic weights between source and reservoir neurons form a basis set for the input patterns. The neural trajectories generated in the reservoir due to input stimulation and lateral connections between reservoir neurons can be readout by the sink layer neurons. This activity is used for adaptation of synapses between reservoir and sink layer neurons. A new measure called the discriminability index (DI) is introduced to compute if the network can discriminate between old patterns already presented in an initial training session. The DI is also used to compute if the network adapts to new patterns without losing its ability to discriminate among old patterns. The final outcome is that the network is able to correctly discriminate between all patterns—both old and new. This result holds as long as inhibitory synapses employ STDP to continuously enable current balance in the network. The results suggest a possible direction for future investigation into how spiking neural networks could address the stability-plasticity question despite having continuous synaptic plasticity. PMID:25566045
Insights into quasar UV spectra using unsupervised clustering analysis
NASA Astrophysics Data System (ADS)
Tammour, A.; Gallagher, S. C.; Daley, M.; Richards, G. T.
2016-06-01
Machine learning techniques can provide powerful tools to detect patterns in multidimensional parameter space. We use K-means - a simple yet powerful unsupervised clustering algorithm which picks out structure in unlabelled data - to study a sample of quasar UV spectra from the Quasar Catalog of the 10th Data Release of the Sloan Digital Sky Survey (SDSS-DR10) of Paris et al. Detecting patterns in large data sets helps us gain insights into the physical conditions and processes giving rise to the observed properties of quasars. We use K-means to find clusters in the parameter space of the equivalent width (EW), the blue- and red-half-width at half-maximum (HWHM) of the Mg II 2800 Å line, the C IV 1549 Å line, and the C III] 1908 Å blend in samples of broad absorption line (BAL) and non-BAL quasars at redshift 1.6-2.1. Using this method, we successfully recover correlations well-known in the UV regime such as the anti-correlation between the EW and blueshift of the C IV emission line and the shape of the ionizing spectra energy distribution (SED) probed by the strength of He II and the Si III]/C III] ratio. We find this to be particularly evident when the properties of C III] are used to find the clusters, while those of Mg II proved to be less strongly correlated with the properties of the other lines in the spectra such as the width of C IV or the Si III]/C III] ratio. We conclude that unsupervised clustering methods (such as K-means) are powerful methods for finding `natural' binning boundaries in multidimensional data sets and discuss caveats and future work.
Scale of association: hierarchical linear models and the measurement of ecological systems
Sean M. McMahon; Jeffrey M. Diez
2007-01-01
A fundamental challenge to understanding patterns in ecological systems lies in employing methods that can analyse, test and draw inference from measured associations between variables across scales. Hierarchical linear models (HLM) use advanced estimation algorithms to measure regression relationships and variance-covariance parameters in hierarchically structured...
A hierarchical framework of aquatic ecological units in North America (Nearctic Zone).
James R. Maxwell; Clayton J. Edwards; Mark E. Jensen; Steven J. Paustian; Harry Parrott; Donley M. Hill
1995-01-01
Proposes a framework for classifying and mapping aquatic systems at various scales using ecologically significant physical and biological criteria. Classification and mapping concepts follow tenets of hierarchical theory, pattern recognition, and driving variables. Criteria are provided for the hierarchical classification and mapping of aquatic ecological units of...
Han, S; Humphreys, G W; Chen, L
1999-10-01
The role of perceptual grouping and the encoding of closure of local elements in the processing of hierarchical patterns was studied. Experiments 1 and 2 showed a global advantage over the local level for 2 tasks involving the discrimination of orientation and closure, but there was a local advantage for the closure discrimination task relative to the orientation discrimination task. Experiment 3 showed a local precedence effect for the closure discrimination task when local element grouping was weakened by embedding the stimuli from Experiment 1 in a background made up of cross patterns. Experiments 4A and 4B found that dissimilarity of closure between the local elements of hierarchical stimuli and the background figures could facilitate the grouping of closed local elements and enhanced the perception of global structure. Experiment 5 showed that the advantage for detecting the closure of local elements in hierarchical analysis also held under divided- and selective-attention conditions. Results are consistent with the idea that grouping between local elements takes place in parallel and competes with the computation of closure of local elements in determining the selection between global and local levels of hierarchical patterns for response.
Sparse alignment for robust tensor learning.
Lai, Zhihui; Wong, Wai Keung; Xu, Yong; Zhao, Cairong; Sun, Mingming
2014-10-01
Multilinear/tensor extensions of manifold learning based algorithms have been widely used in computer vision and pattern recognition. This paper first provides a systematic analysis of the multilinear extensions for the most popular methods by using alignment techniques, thereby obtaining a general tensor alignment framework. From this framework, it is easy to show that the manifold learning based tensor learning methods are intrinsically different from the alignment techniques. Based on the alignment framework, a robust tensor learning method called sparse tensor alignment (STA) is then proposed for unsupervised tensor feature extraction. Different from the existing tensor learning methods, L1- and L2-norms are introduced to enhance the robustness in the alignment step of the STA. The advantage of the proposed technique is that the difficulty in selecting the size of the local neighborhood can be avoided in the manifold learning based tensor feature extraction algorithms. Although STA is an unsupervised learning method, the sparsity encodes the discriminative information in the alignment step and provides the robustness of STA. Extensive experiments on the well-known image databases as well as action and hand gesture databases by encoding object images as tensors demonstrate that the proposed STA algorithm gives the most competitive performance when compared with the tensor-based unsupervised learning methods.
Unsupervised segmentation of lungs from chest radiographs
NASA Astrophysics Data System (ADS)
Ghosh, Payel; Antani, Sameer K.; Long, L. Rodney; Thoma, George R.
2012-03-01
This paper describes our preliminary investigations for deriving and characterizing coarse-level textural regions present in the lung field on chest radiographs using unsupervised grow-cut (UGC), a cellular automaton based unsupervised segmentation technique. The segmentation has been performed on a publicly available data set of chest radiographs. The algorithm is useful for this application because it automatically converges to a natural segmentation of the image from random seed points using low-level image features such as pixel intensity values and texture features. Our goal is to develop a portable screening system for early detection of lung diseases for use in remote areas in developing countries. This involves developing automated algorithms for screening x-rays as normal/abnormal with a high degree of sensitivity, and identifying lung disease patterns on chest x-rays. Automatically deriving and quantitatively characterizing abnormal regions present in the lung field is the first step toward this goal. Therefore, region-based features such as geometrical and pixel-value measurements were derived from the segmented lung fields. In the future, feature selection and classification will be performed to identify pathological conditions such as pulmonary tuberculosis on chest radiographs. Shape-based features will also be incorporated to account for occlusions of the lung field and by other anatomical structures such as the heart and diaphragm.
Kopriva, Ivica; Hadžija, Mirko; Popović Hadžija, Marijana; Korolija, Marina; Cichocki, Andrzej
2011-01-01
A methodology is proposed for nonlinear contrast-enhanced unsupervised segmentation of multispectral (color) microscopy images of principally unstained specimens. The methodology exploits spectral diversity and spatial sparseness to find anatomical differences between materials (cells, nuclei, and background) present in the image. It consists of rth-order rational variety mapping (RVM) followed by matrix/tensor factorization. Sparseness constraint implies duality between nonlinear unsupervised segmentation and multiclass pattern assignment problems. Classes not linearly separable in the original input space become separable with high probability in the higher-dimensional mapped space. Hence, RVM mapping has two advantages: it takes implicitly into account nonlinearities present in the image (ie, they are not required to be known) and it increases spectral diversity (ie, contrast) between materials, due to increased dimensionality of the mapped space. This is expected to improve performance of systems for automated classification and analysis of microscopic histopathological images. The methodology was validated using RVM of the second and third orders of the experimental multispectral microscopy images of unstained sciatic nerve fibers (nervus ischiadicus) and of unstained white pulp in the spleen tissue, compared with a manually defined ground truth labeled by two trained pathophysiologists. The methodology can also be useful for additional contrast enhancement of images of stained specimens. PMID:21708116
Design of partially supervised classifiers for multispectral image data
NASA Technical Reports Server (NTRS)
Jeon, Byeungwoo; Landgrebe, David
1993-01-01
A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.
Geophysical phenomena classification by artificial neural networks
NASA Technical Reports Server (NTRS)
Gough, M. P.; Bruckner, J. R.
1995-01-01
Space science information systems involve accessing vast data bases. There is a need for an automatic process by which properties of the whole data set can be assimilated and presented to the user. Where data are in the form of spectrograms, phenomena can be detected by pattern recognition techniques. Presented are the first results obtained by applying unsupervised Artificial Neural Networks (ANN's) to the classification of magnetospheric wave spectra. The networks used here were a simple unsupervised Hamming network run on a PC and a more sophisticated CALM network run on a Sparc workstation. The ANN's were compared in their geophysical data recognition performance. CALM networks offer such qualities as fast learning, superiority in generalizing, the ability to continuously adapt to changes in the pattern set, and the possibility to modularize the network to allow the inter-relation between phenomena and data sets. This work is the first step toward an information system interface being developed at Sussex, the Whole Information System Expert (WISE). Phenomena in the data are automatically identified and provided to the user in the form of a data occurrence morphology, the Whole Information System Data Occurrence Morphology (WISDOM), along with relationships to other parameters and phenomena.
Predictability and hierarchy in Drosophila behavior.
Berman, Gordon J; Bialek, William; Shaevitz, Joshua W
2016-10-18
Even the simplest of animals exhibit behavioral sequences with complex temporal dynamics. Prominent among the proposed organizing principles for these dynamics has been the idea of a hierarchy, wherein the movements an animal makes can be understood as a set of nested subclusters. Although this type of organization holds potential advantages in terms of motion control and neural circuitry, measurements demonstrating this for an animal's entire behavioral repertoire have been limited in scope and temporal complexity. Here, we use a recently developed unsupervised technique to discover and track the occurrence of all stereotyped behaviors performed by fruit flies moving in a shallow arena. Calculating the optimally predictive representation of the fly's future behaviors, we show that fly behavior exhibits multiple time scales and is organized into a hierarchical structure that is indicative of its underlying behavioral programs and its changing internal states.
NASA Astrophysics Data System (ADS)
Zhou, Zheng; Liu, Chen; Shen, Wensheng; Dong, Zhen; Chen, Zhe; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng
2017-04-01
A binary spike-time-dependent plasticity (STDP) protocol based on one resistive-switching random access memory (RRAM) device was proposed and experimentally demonstrated in the fabricated RRAM array. Based on the STDP protocol, a novel unsupervised online pattern recognition system including RRAM synapses and CMOS neurons is developed. Our simulations show that the system can efficiently compete the handwritten digits recognition task, which indicates the feasibility of using the RRAM-based binary STDP protocol in neuromorphic computing systems to obtain good performance.
ERIC Educational Resources Information Center
Guy, Maggie W.; Reynolds, Greg D.; Zhang, Dantong
2013-01-01
Event-related potentials (ERPs) were utilized in an investigation of 21 six-month-olds' attention to and processing of global and local properties of hierarchical patterns. Overall, infants demonstrated an advantage for processing the overall configuration (i.e., global properties) of local features of hierarchical patterns; however,…
Wang, Junping; Xie, Xinfang; Feng, Jinsong; Chen, Jessica C; Du, Xin-jun; Luo, Jiangzhao; Lu, Xiaonan; Wang, Shuo
2015-07-02
Listeria monocytogenes is a facultatively anaerobic, Gram-positive, rod-shape foodborne bacterium causing invasive infection, listeriosis, in susceptible populations. Rapid and high-throughput detection of this pathogen in dairy products is critical as milk and other dairy products have been implicated as food vehicles in several outbreaks. Here we evaluated confocal micro-Raman spectroscopy (785 nm laser) coupled with chemometric analysis to distinguish six closely related Listeria species, including L. monocytogenes, in both liquid media and milk. Raman spectra of different Listeria species and other bacteria (i.e., Staphylococcus aureus, Salmonella enterica and Escherichia coli) were collected to create two independent databases for detection in media and milk, respectively. Unsupervised chemometric models including principal component analysis and hierarchical cluster analysis were applied to differentiate L. monocytogenes from Listeria and other bacteria. To further evaluate the performance and reliability of unsupervised chemometric analyses, supervised chemometrics were performed, including two discriminant analyses (DA) and soft independent modeling of class analogies (SIMCA). By analyzing Raman spectra via two DA-based chemometric models, average identification accuracies of 97.78% and 98.33% for L. monocytogenes in media, and 95.28% and 96.11% in milk were obtained, respectively. SIMCA analysis also resulted in satisfied average classification accuracies (over 93% in both media and milk). This Raman spectroscopic-based detection of L. monocytogenes in media and milk can be finished within a few hours and requires no extensive sample preparation. Copyright © 2015 Elsevier B.V. All rights reserved.
Unsupervised segmentation of MRI knees using image partition forests
NASA Astrophysics Data System (ADS)
Marčan, Marija; Voiculescu, Irina
2016-03-01
Nowadays many people are affected by arthritis, a condition of the joints with limited prevention measures, but with various options of treatment the most radical of which is surgical. In order for surgery to be successful, it can make use of careful analysis of patient-based models generated from medical images, usually by manual segmentation. In this work we show how to automate the segmentation of a crucial and complex joint -- the knee. To achieve this goal we rely on our novel way of representing a 3D voxel volume as a hierarchical structure of partitions which we have named Image Partition Forest (IPF). The IPF contains several partition layers of increasing coarseness, with partitions nested across layers in the form of adjacency graphs. On the basis of a set of properties (size, mean intensity, coordinates) of each node in the IPF we classify nodes into different features. Values indicating whether or not any particular node belongs to the femur or tibia are assigned through node filtering and node-based region growing. So far we have evaluated our method on 15 MRI knee images. Our unsupervised segmentation compared against a hand-segmented gold standard has achieved an average Dice similarity coefficient of 0.95 for femur and 0.93 for tibia, and an average symmetric surface distance of 0.98 mm for femur and 0.73 mm for tibia. The paper also discusses ways to introduce stricter morphological and spatial conditioning in the bone labelling process.
Unsupervised classification of operator workload from brain signals.
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects' error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
Advanced Treatment Monitoring for Olympic-Level Athletes Using Unsupervised Modeling Techniques
Siedlik, Jacob A.; Bergeron, Charles; Cooper, Michael; Emmons, Russell; Moreau, William; Nabhan, Dustin; Gallagher, Philip; Vardiman, John P.
2016-01-01
Context Analysis of injury and illness data collected at large international competitions provides the US Olympic Committee and the national governing bodies for each sport with information to best prepare for future competitions. Research in which authors have evaluated medical contacts to provide the expected level of medical care and sports medicine services at international competitions is limited. Objective To analyze the medical-contact data for athletes, staff, and coaches who participated in the 2011 Pan American Games in Guadalajara, Mexico, using unsupervised modeling techniques to identify underlying treatment patterns. Design Descriptive epidemiology study. Setting Pan American Games. Patients or Other Participants A total of 618 US athletes (337 males, 281 females) participated in the 2011 Pan American Games. Main Outcome Measure(s) Medical data were recorded from the injury-evaluation and injury-treatment forms used by clinicians assigned to the central US Olympic Committee Sport Medicine Clinic and satellite locations during the operational 17-day period of the 2011 Pan American Games. We used principal components analysis and agglomerative clustering algorithms to identify and define grouped modalities. Lift statistics were calculated for within-cluster subgroups. Results Principal component analyses identified 3 components, accounting for 72.3% of the variability in datasets. Plots of the principal components showed that individual contacts focused on 4 treatment clusters: massage, paired manipulation and mobilization, soft tissue therapy, and general medical. Conclusions Unsupervised modeling techniques were useful for visualizing complex treatment data and provided insights for improved treatment modeling in athletes. Given its ability to detect clinically relevant treatment pairings in large datasets, unsupervised modeling should be considered a feasible option for future analyses of medical-contact data from international competitions. PMID:26794628
Unsupervised classification of operator workload from brain signals
NASA Astrophysics Data System (ADS)
Schultze-Kraft, Matthias; Dähne, Sven; Gugler, Manfred; Curio, Gabriel; Blankertz, Benjamin
2016-06-01
Objective. In this study we aimed for the classification of operator workload as it is expected in many real-life workplace environments. We explored brain-signal based workload predictors that differ with respect to the level of label information required for training, including entirely unsupervised approaches. Approach. Subjects executed a task on a touch screen that required continuous effort of visual and motor processing with alternating difficulty. We first employed classical approaches for workload state classification that operate on the sensor space of EEG and compared those to the performance of three state-of-the-art spatial filtering methods: common spatial patterns (CSPs) analysis, which requires binary label information; source power co-modulation (SPoC) analysis, which uses the subjects’ error rate as a target function; and canonical SPoC (cSPoC) analysis, which solely makes use of cross-frequency power correlations induced by different states of workload and thus represents an unsupervised approach. Finally, we investigated the effects of fusing brain signals and peripheral physiological measures (PPMs) and examined the added value for improving classification performance. Main results. Mean classification accuracies of 94%, 92% and 82% were achieved with CSP, SPoC, cSPoC, respectively. These methods outperformed the approaches that did not use spatial filtering and they extracted physiologically plausible components. The performance of the unsupervised cSPoC is significantly increased by augmenting it with PPM features. Significance. Our analyses ensured that the signal sources used for classification were of cortical origin and not contaminated with artifacts. Our findings show that workload states can be successfully differentiated from brain signals, even when less and less information from the experimental paradigm is used, thus paving the way for real-world applications in which label information may be noisy or entirely unavailable.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.
A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguishmore » between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.« less
Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; ...
2015-07-09
A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. Moreover, a range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. In this paper, we illustrate how the method can be used to: (1) distinguishmore » between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.« less
Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.
2015-01-01
A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required. PMID:26156000
Slow feature analysis: unsupervised learning of invariances.
Wiskott, Laurenz; Sejnowski, Terrence J
2002-04-01
Invariant features of temporally varying signals are useful for analysis and classification. Slow feature analysis (SFA) is a new method for learning invariant or slowly varying features from a vectorial input signal. It is based on a nonlinear expansion of the input signal and application of principal component analysis to this expanded signal and its time derivative. It is guaranteed to find the optimal solution within a family of functions directly and can learn to extract a large number of decorrelated features, which are ordered by their degree of invariance. SFA can be applied hierarchically to process high-dimensional input signals and extract complex features. SFA is applied first to complex cell tuning properties based on simple cell output, including disparity and motion. Then more complicated input-output functions are learned by repeated application of SFA. Finally, a hierarchical network of SFA modules is presented as a simple model of the visual system. The same unstructured network can learn translation, size, rotation, contrast, or, to a lesser degree, illumination invariance for one-dimensional objects, depending on only the training stimulus. Surprisingly, only a few training objects suffice to achieve good generalization to new objects. The generated representation is suitable for object recognition. Performance degrades if the network is trained to learn multiple invariances simultaneously.
NASA Astrophysics Data System (ADS)
Musah, Rabi A.; Espinoza, Edgard O.; Cody, Robert B.; Lesiak, Ashton D.; Christensen, Earl D.; Moore, Hannah E.; Maleknia, Simin; Drijfhout, Falko P.
2015-07-01
A high throughput method for species identification and classification through chemometric processing of direct analysis in real time (DART) mass spectrometry-derived fingerprint signatures has been developed. The method entails introduction of samples to the open air space between the DART ion source and the mass spectrometer inlet, with the entire observed mass spectral fingerprint subjected to unsupervised hierarchical clustering processing. A range of both polar and non-polar chemotypes are instantaneously detected. The result is identification and species level classification based on the entire DART-MS spectrum. Here, we illustrate how the method can be used to: (1) distinguish between endangered woods regulated by the Convention for the International Trade of Endangered Flora and Fauna (CITES) treaty; (2) assess the origin and by extension the properties of biodiesel feedstocks; (3) determine insect species from analysis of puparial casings; (4) distinguish between psychoactive plants products; and (5) differentiate between Eucalyptus species. An advantage of the hierarchical clustering approach to processing of the DART-MS derived fingerprint is that it shows both similarities and differences between species based on their chemotypes. Furthermore, full knowledge of the identities of the constituents contained within the small molecule profile of analyzed samples is not required.
Kruse, Christian
2018-06-01
To review current practices and technologies within the scope of "Big Data" that can further our understanding of diabetes mellitus and osteoporosis from large volumes of data. "Big Data" techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples of current literature. Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and understand relative predictor importance of diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in data between diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can allow us to be less dependent on surrogate predictors and use large volumes of images to classify diabetes-induced osteoporosis and predict future outcomes directly from images. "Big Data" techniques herald new possibilities to understand diabetes-induced osteoporosis and ascertain our current ability to classify, understand, and predict this condition.
Li, Yan; Zhang, Ji; Zhao, Yanli; Liu, Honggao; Wang, Yuanzhong; Jin, Hang
2016-01-01
In this study the geographical differentiation of dried sclerotia of the medicinal mushroom Wolfiporia extensa, obtained from different regions in Yunnan Province, China, was explored using Fourier-transform infrared (FT-IR) spectroscopy coupled with multivariate data analysis. The FT-IR spectra of 97 samples were obtained for wave numbers ranging from 4000 to 400 cm-1. Then, the fingerprint region of 1800-600 cm-1 of the FT-IR spectrum, rather than the full spectrum, was analyzed. Different pretreatments were applied on the spectra, and a discriminant analysis model based on the Mahalanobis distance was developed to select an optimal pretreatment combination. Two unsupervised pattern recognition procedures- principal component analysis and hierarchical cluster analysis-were applied to enhance the authenticity of discrimination of the specimens. The results showed that excellent classification could be obtained after optimizing spectral pretreatment. The tested samples were successfully discriminated according to their geographical locations. The chemical properties of dried sclerotia of W. extensa were clearly dependent on the mushroom's geographical origins. Furthermore, an interesting finding implied that the elevations of collection areas may have effects on the chemical components of wild W. extensa sclerotia. Overall, this study highlights the feasibility of FT-IR spectroscopy combined with multivariate data analysis in particular for exploring the distinction of different regional W. extensa sclerotia samples. This research could also serve as a basis for the exploitation and utilization of medicinal mushrooms.
Double-Barrier Memristive Devices for Unsupervised Learning and Pattern Recognition.
Hansen, Mirko; Zahari, Finn; Ziegler, Martin; Kohlstedt, Hermann
2017-01-01
The use of interface-based resistive switching devices for neuromorphic computing is investigated. In a combined experimental and numerical study, the important device parameters and their impact on a neuromorphic pattern recognition system are studied. The memristive cells consist of a layer sequence Al/Al 2 O 3 /Nb x O y /Au and are fabricated on a 4-inch wafer. The key functional ingredients of the devices are a 1.3 nm thick Al 2 O 3 tunnel barrier and a 2.5 mm thick Nb x O y memristive layer. Voltage pulse measurements are used to study the electrical conditions for the emulation of synaptic functionality of single cells for later use in a recognition system. The results are evaluated and modeled in the framework of the plasticity model of Ziegler et al. Based on this model, which is matched to experimental data from 84 individual devices, the network performance with regard to yield, reliability, and variability is investigated numerically. As the network model, a computing scheme for pattern recognition and unsupervised learning based on the work of Querlioz et al. (2011), Sheridan et al. (2014), Zahari et al. (2015) is employed. This is a two-layer feedforward network with a crossbar array of memristive devices, leaky integrate-and-fire output neurons including a winner-takes-all strategy, and a stochastic coding scheme for the input pattern. As input pattern, the full data set of digits from the MNIST database is used. The numerical investigation indicates that the experimentally obtained yield, reliability, and variability of the memristive cells are suitable for such a network. Furthermore, evidence is presented that their strong I - V non-linearity might avoid the need for selector devices in crossbar array structures.
Bowd, Christopher; Weinreb, Robert N; Balasubramanian, Madhusudhanan; Lee, Intae; Jang, Giljin; Yousefi, Siamak; Zangwill, Linda M; Medeiros, Felipe A; Girkin, Christopher A; Liebmann, Jeffrey M; Goldbaum, Michael H
2014-01-01
The variational Bayesian independent component analysis-mixture model (VIM), an unsupervised machine-learning classifier, was used to automatically separate Matrix Frequency Doubling Technology (FDT) perimetry data into clusters of healthy and glaucomatous eyes, and to identify axes representing statistically independent patterns of defect in the glaucoma clusters. FDT measurements were obtained from 1,190 eyes with normal FDT results and 786 eyes with abnormal FDT results from the UCSD-based Diagnostic Innovations in Glaucoma Study (DIGS) and African Descent and Glaucoma Evaluation Study (ADAGES). For all eyes, VIM input was 52 threshold test points from the 24-2 test pattern, plus age. FDT mean deviation was -1.00 dB (S.D. = 2.80 dB) and -5.57 dB (S.D. = 5.09 dB) in FDT-normal eyes and FDT-abnormal eyes, respectively (p<0.001). VIM identified meaningful clusters of FDT data and positioned a set of statistically independent axes through the mean of each cluster. The optimal VIM model separated the FDT fields into 3 clusters. Cluster N contained primarily normal fields (1109/1190, specificity 93.1%) and clusters G1 and G2 combined, contained primarily abnormal fields (651/786, sensitivity 82.8%). For clusters G1 and G2 the optimal number of axes were 2 and 5, respectively. Patterns automatically generated along axes within the glaucoma clusters were similar to those known to be indicative of glaucoma. Fields located farther from the normal mean on each glaucoma axis showed increasing field defect severity. VIM successfully separated FDT fields from healthy and glaucoma eyes without a priori information about class membership, and identified familiar glaucomatous patterns of loss.
CNN: a speaker recognition system using a cascaded neural network.
Zaki, M; Ghalwash, A; Elkouny, A A
1996-05-01
The main emphasis of this paper is to present an approach for combining supervised and unsupervised neural network models to the issue of speaker recognition. To enhance the overall operation and performance of recognition, the proposed strategy integrates the two techniques, forming one global model called the cascaded model. We first present a simple conventional technique based on the distance measured between a test vector and a reference vector for different speakers in the population. This particular distance metric has the property of weighting down the components in those directions along which the intraspeaker variance is large. The reason for presenting this method is to clarify the discrepancy in performance between the conventional and neural network approach. We then introduce the idea of using unsupervised learning technique, presented by the winner-take-all model, as a means of recognition. Due to several tests that have been conducted and in order to enhance the performance of this model, dealing with noisy patterns, we have preceded it with a supervised learning model--the pattern association model--which acts as a filtration stage. This work includes both the design and implementation of both conventional and neural network approaches to recognize the speakers templates--which are introduced to the system via a voice master card and preprocessed before extracting the features used in the recognition. The conclusion indicates that the system performance in case of neural network is better than that of the conventional one, achieving a smooth degradation in respect of noisy patterns, and higher performance in respect of noise-free patterns.
Asiimwe, Stephen; Oloya, James; Song, Xiao; Whalen, Christopher C
2014-12-01
Unsupervised HIV self-testing (HST) has potential to increase knowledge of HIV status; however, its accuracy is unknown. To estimate the accuracy of unsupervised HST in field settings in Uganda, we performed a non-blinded, randomized controlled, non-inferiority trial of unsupervised compared with supervised HST among selected high HIV risk fisherfolk (22.1 % HIV Prevalence) in three fishing villages in Uganda between July and September 2013. The study enrolled 246 participants and randomized them in a 1:1 ratio to unsupervised HST or provider-supervised HST. In an intent-to-treat analysis, the HST sensitivity was 90 % in the unsupervised arm and 100 % among the provider-supervised, yielding a difference 0f -10 % (90 % CI -21, 1 %); non-inferiority was not shown. In a per protocol analysis, the difference in sensitivity was -5.6 % (90 % CI -14.4, 3.3 %) and did show non-inferiority. We conclude that unsupervised HST is feasible in rural Africa and may be non-inferior to provider-supervised HST.
New insight in magnetic saturation behavior of nickel hierarchical structures
NASA Astrophysics Data System (ADS)
Ma, Ji; Zhang, Jianxing; Liu, Chunting; Chen, Kezheng
2017-09-01
It is unanimously accepted that non-ferromagnetic inclusions in a ferromagnetic system will lower down total saturation magnetization in unit of emu/g. In this study, ;lattice strain; was found to be another key factor to have critical impact on magnetic saturation behavior of the system. The lattice strain determined assembling patterns of primary nanoparticles in hierarchical structures and was intimately related with the formation process of these architectures. Therefore, flower-necklace-like and cauliflower-like nickel hierarchical structures were used as prototype systems to evidence the relationship between assembling patterns of primary nanoparticles and magnetic saturation behaviors of these architectures. It was found that the influence of lattice strain on saturation magnetization outperformed that of non-ferromagnetic inclusions in these hierarchical structures. This will enable new insights into fundamental understanding of related magnetic effects.
NASA Astrophysics Data System (ADS)
Huerta-Murillo, D.; Aguilar-Morales, A. I.; Alamri, S.; Cardoso, J. T.; Jagdheesh, R.; Lasagni, A. F.; Ocaña, J. L.
2017-11-01
In this work, hierarchical surface patterns fabricated on Ti-6Al-4V alloy combining two laser micro-machining techniques are presented. The used technologies are based on nanosecond Direct Laser Writing and picosecond Direct Laser Interference Patterning. Squared shape micro-cells with different hatch distances were produced by Direct Laser Writing with depths values in the micro-scale, forming a well-defined closed packet. Subsequently, cross-like periodic patterns were fabricated by means of Direct Laser Interference Patterning using a two-beam configuration, generating a dual-scale periodic surface structure in both micro- and nano-scale due to the formation of Laser-Induced Periodic Surface Structure after the picosecond process. As a result a triple hierarchical periodic surface structure was generated. The surface morphology of the irradiated area was characterized with scanning electron microscopy and confocal microscopy. Additionally, static contact angle measurements were made to analyze the wettability behavior of the structures, showing a hydrophobic behavior for the hierarchical structures.
Wolff, J Gerard
2016-01-01
The SP theory of intelligence , with its realization in the SP computer model , aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory- SP-neural -is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory-outlined in the paper-provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns , where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a "pattern" is realized as an array of neurons called a pattern assembly , similar to Hebb's concept of a "cell assembly" but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment , borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals. It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory-and significantly different from the "Hebbian" kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence.
None of the above: A Bayesian account of the detection of novel categories.
Navarro, Daniel J; Kemp, Charles
2017-10-01
Every time we encounter a new object, action, or event, there is some chance that we will need to assign it to a novel category. We describe and evaluate a class of probabilistic models that detect when an object belongs to a category that has not previously been encountered. The models incorporate a prior distribution that is influenced by the distribution of previous objects among categories, and we present 2 experiments that demonstrate that people are also sensitive to this distributional information. Two additional experiments confirm that distributional information is combined with similarity when both sources of information are available. We compare our approach to previous models of unsupervised categorization and to several heuristic-based models, and find that a hierarchical Bayesian approach provides the best account of our data. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Sampling schemes and parameter estimation for nonlinear Bernoulli-Gaussian sparse models
NASA Astrophysics Data System (ADS)
Boudineau, Mégane; Carfantan, Hervé; Bourguignon, Sébastien; Bazot, Michael
2016-06-01
We address the sparse approximation problem in the case where the data are approximated by the linear combination of a small number of elementary signals, each of these signals depending non-linearly on additional parameters. Sparsity is explicitly expressed through a Bernoulli-Gaussian hierarchical model in a Bayesian framework. Posterior mean estimates are computed using Markov Chain Monte-Carlo algorithms. We generalize the partially marginalized Gibbs sampler proposed in the linear case in [1], and build an hybrid Hastings-within-Gibbs algorithm in order to account for the nonlinear parameters. All model parameters are then estimated in an unsupervised procedure. The resulting method is evaluated on a sparse spectral analysis problem. It is shown to converge more efficiently than the classical joint estimation procedure, with only a slight increase of the computational cost per iteration, consequently reducing the global cost of the estimation procedure.
Kopriva, Ivica; Hadžija, Mirko; Popović Hadžija, Marijana; Korolija, Marina; Cichocki, Andrzej
2011-08-01
A methodology is proposed for nonlinear contrast-enhanced unsupervised segmentation of multispectral (color) microscopy images of principally unstained specimens. The methodology exploits spectral diversity and spatial sparseness to find anatomical differences between materials (cells, nuclei, and background) present in the image. It consists of rth-order rational variety mapping (RVM) followed by matrix/tensor factorization. Sparseness constraint implies duality between nonlinear unsupervised segmentation and multiclass pattern assignment problems. Classes not linearly separable in the original input space become separable with high probability in the higher-dimensional mapped space. Hence, RVM mapping has two advantages: it takes implicitly into account nonlinearities present in the image (ie, they are not required to be known) and it increases spectral diversity (ie, contrast) between materials, due to increased dimensionality of the mapped space. This is expected to improve performance of systems for automated classification and analysis of microscopic histopathological images. The methodology was validated using RVM of the second and third orders of the experimental multispectral microscopy images of unstained sciatic nerve fibers (nervus ischiadicus) and of unstained white pulp in the spleen tissue, compared with a manually defined ground truth labeled by two trained pathophysiologists. The methodology can also be useful for additional contrast enhancement of images of stained specimens. Copyright © 2011 American Society for Investigative Pathology. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Jiang, Guo-Qian; Xie, Ping; Wang, Xiao; Chen, Meng; He, Qun
2017-11-01
The performance of traditional vibration based fault diagnosis methods greatly depends on those handcrafted features extracted using signal processing algorithms, which require significant amounts of domain knowledge and human labor, and do not generalize well to new diagnosis domains. Recently, unsupervised representation learning provides an alternative promising solution to feature extraction in traditional fault diagnosis due to its superior learning ability from unlabeled data. Given that vibration signals usually contain multiple temporal structures, this paper proposes a multiscale representation learning (MSRL) framework to learn useful features directly from raw vibration signals, with the aim to capture rich and complementary fault pattern information at different scales. In our proposed approach, a coarse-grained procedure is first employed to obtain multiple scale signals from an original vibration signal. Then, sparse filtering, a newly developed unsupervised learning algorithm, is applied to automatically learn useful features from each scale signal, respectively, and then the learned features at each scale to be concatenated one by one to obtain multiscale representations. Finally, the multiscale representations are fed into a supervised classifier to achieve diagnosis results. Our proposed approach is evaluated using two different case studies: motor bearing and wind turbine gearbox fault diagnosis. Experimental results show that the proposed MSRL approach can take full advantages of the availability of unlabeled data to learn discriminative features and achieved better performance with higher accuracy and stability compared to the traditional approaches.
TU-CD-BRB-12: Radiogenomics of MRI-Guided Prostate Cancer Biopsy Habitats
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stoyanova, R; Lynne, C; Abraham, S
2015-06-15
Purpose: Diagnostic prostate biopsies are subject to sampling bias. We hypothesize that quantitative imaging with multiparametric (MP)-MRI can more accurately direct targeted biopsies to index lesions associated with highest risk clinical and genomic features. Methods: Regionally distinct prostate habitats were delineated on MP-MRI (T2-weighted, perfusion and diffusion imaging). Directed biopsies were performed on 17 habitats from 6 patients using MRI-ultrasound fusion. Biopsy location was characterized with 52 radiographic features. Transcriptome-wide analysis of 1.4 million RNA probes was performed on RNA from each habitat. Genomics features with insignificant expression values (<0.25) and interquartile range <0.5 were filtered, leaving total of 212more » genes. Correlation between imaging features, genes and a 22 feature genomic classifier (GC), developed as a prognostic assay for metastasis after radical prostatectomy was investigated. Results: High quality genomic data was derived from 17 (100%) biopsies. Using the 212 ‘unbiased’ genes, the samples clustered by patient origin in unsupervised analysis. When only prostate cancer related genomic features were used, hierarchical clustering revealed samples clustered by needle-biopsy Gleason score (GS). Similarly, principal component analysis of the imaging features, found the primary source of variance segregated the samples into high (≥7) and low (6) GS. Pearson’s correlation analysis of genes with significant expression showed two main patterns of gene expression clustering prostate peripheral and transitional zone MRI features. Two-way hierarchical clustering of GC with radiomics features resulted in the expected groupings of high and low expressed genes in this metastasis signature. Conclusions: MP-MRI-targeted diagnostic biopsies can potentially improve risk stratification by directing pathological and genomic analysis to clinically significant index lesions. As determinant lesions are more reliably identified, targeting with radiotherapy should improve outcome. This is the first demonstration of a link between quantitative imaging features (radiomics) with genomic features in MRI-directed prostate biopsies. The research was supported by NIH- NCI R01 CA 189295 and R01 CA 189295; E Davicioni is partial owner of GenomeDx Biosciences, Inc. M Takhar, N Erho, L Lam, C Buerki and E Davicioni are current employees at GenomeDx Biosciences, Inc.« less
Statistics of voids in hierarchical universes
NASA Technical Reports Server (NTRS)
Fry, J. N.
1986-01-01
As one alternative to the N-point galaxy correlation function statistics, the distribution of holes or the probability that a volume of given size and shape be empty of galaxies can be considered. The probability of voids resulting from a variety of hierarchical patterns of clustering is considered, and these are compared with the results of numerical simulations and with observations. A scaling relation required by the hierarchical pattern of higher order correlation functions is seen to be obeyed in the simulations, and the numerical results show a clear difference between neutrino models and cold-particle models; voids are more likely in neutrino universes. Observational data do not yet distinguish but are close to being able to distinguish between models.
Topological dimension tunes activity patterns in hierarchical modular networks
NASA Astrophysics Data System (ADS)
Safari, Ali; Moretti, Paolo; Muñoz, Miguel A.
2017-11-01
Connectivity patterns of relevance in neuroscience and systems biology can be encoded in hierarchical modular networks (HMNs). Recent studies highlight the role of hierarchical modular organization in shaping brain activity patterns, providing an excellent substrate to promote both segregation and integration of neural information. Here, we propose an extensive analysis of the critical spreading rate (or ‘epidemic’ threshold)—separating a phase with endemic persistent activity from one in which activity ceases—on diverse HMNs. By employing analytical and computational techniques we determine the nature of such a threshold and scrutinize how it depends on general structural features of the underlying HMN. We critically discuss the extent to which current graph-spectral methods can be applied to predict the onset of spreading in HMNs and, most importantly, we elucidate the role played by the network topological dimension as a relevant and unifying structural parameter, controlling the epidemic threshold.
Wolff, J. Gerard
2016-01-01
The SP theory of intelligence, with its realization in the SP computer model, aims to simplify and integrate observations and concepts across artificial intelligence, mainstream computing, mathematics, and human perception and cognition, with information compression as a unifying theme. This paper describes how abstract structures and processes in the theory may be realized in terms of neurons, their interconnections, and the transmission of signals between neurons. This part of the SP theory—SP-neural—is a tentative and partial model for the representation and processing of knowledge in the brain. Empirical support for the SP theory—outlined in the paper—provides indirect support for SP-neural. In the abstract part of the SP theory (SP-abstract), all kinds of knowledge are represented with patterns, where a pattern is an array of atomic symbols in one or two dimensions. In SP-neural, the concept of a “pattern” is realized as an array of neurons called a pattern assembly, similar to Hebb's concept of a “cell assembly” but with important differences. Central to the processing of information in SP-abstract is information compression via the matching and unification of patterns (ICMUP) and, more specifically, information compression via the powerful concept of multiple alignment, borrowed and adapted from bioinformatics. Processes such as pattern recognition, reasoning and problem solving are achieved via the building of multiple alignments, while unsupervised learning is achieved by creating patterns from sensory information and also by creating patterns from multiple alignments in which there is a partial match between one pattern and another. It is envisaged that, in SP-neural, short-lived neural structures equivalent to multiple alignments will be created via an inter-play of excitatory and inhibitory neural signals. It is also envisaged that unsupervised learning will be achieved by the creation of pattern assemblies from sensory information and from the neural equivalents of multiple alignments, much as in the non-neural SP theory—and significantly different from the “Hebbian” kinds of learning which are widely used in the kinds of artificial neural network that are popular in computer science. The paper discusses several associated issues, with relevant empirical evidence. PMID:27857695
Modular and hierarchical structure of social contact networks
NASA Astrophysics Data System (ADS)
Ge, Yuanzheng; Song, Zhichao; Qiu, Xiaogang; Song, Hongbin; Wang, Yong
2013-10-01
Social contact networks exhibit overlapping qualities of communities, hierarchical structure and spatial-correlated nature. We propose a mixing pattern of modular and growing hierarchical structures to reconstruct social contact networks by using an individual’s geospatial distribution information in the real world. The hierarchical structure of social contact networks is defined based on the spatial distance between individuals, and edges among individuals are added in turn from the modular layer to the highest layer. It is a gradual process to construct the hierarchical structure: from the basic modular model up to the global network. The proposed model not only shows hierarchically increasing degree distribution and large clustering coefficients in communities, but also exhibits spatial clustering features of individual distributions. As an evaluation of the method, we reconstruct a hierarchical contact network based on the investigation data of a university. Transmission experiments of influenza H1N1 are carried out on the generated social contact networks, and results show that the constructed network is efficient to reproduce the dynamic process of an outbreak and evaluate interventions. The reproduced spread process exhibits that the spatial clustering of infection is accordant with the clustering of network topology. Moreover, the effect of individual topological character on the spread of influenza is analyzed, and the experiment results indicate that the spread is limited by individual daily contact patterns and local clustering topology rather than individual degree.
Detection of Tree Crowns Based on Reclassification Using Aerial Images and LIDAR Data
NASA Astrophysics Data System (ADS)
Talebi, S.; Zarea, A.; Sadeghian, S.; Arefi, H.
2013-09-01
Tree detection using aerial sensors in early decades was focused by many researchers in different fields including Remote Sensing and Photogrammetry. This paper is intended to detect trees in complex city areas using aerial imagery and laser scanning data. Our methodology is a hierarchal unsupervised method consists of some primitive operations. This method could be divided into three sections, in which, first section uses aerial imagery and both second and third sections use laser scanners data. In the first section a vegetation cover mask is created in both sunny and shadowed areas. In the second section Rate of Slope Change (RSC) is used to eliminate grasses. In the third section a Digital Terrain Model (DTM) is obtained from LiDAR data. By using DTM and Digital Surface Model (DSM) we would get to Normalized Digital Surface Model (nDSM). Then objects which are lower than a specific height are eliminated. Now there are three result layers from three sections. At the end multiplication operation is used to get final result layer. This layer will be smoothed by morphological operations. The result layer is sent to WG III/4 to evaluate. The evaluation result shows that our method has a good rank in comparing to other participants' methods in ISPRS WG III/4, when assessed in terms of 5 indices including area base completeness, area base correctness, object base completeness, object base correctness and boundary RMS. With regarding of being unsupervised and automatic, this method is improvable and could be integrate with other methods to get best results.
Diagnostic index of 3D osteoarthritic changes in TMJ condylar morphology
NASA Astrophysics Data System (ADS)
Gomes, Liliane R.; Gomes, Marcelo; Jung, Bryan; Paniagua, Beatriz; Ruellas, Antonio C.; Gonçalves, João. Roberto; Styner, Martin A.; Wolford, Larry; Cevidanes, Lucia
2015-03-01
The aim of this study was to investigate imaging statistical approaches for classifying 3D osteoarthritic morphological variations among 169 Temporomandibular Joint (TMJ) condyles. Cone beam Computed Tomography (CBCT) scans were acquired from 69 patients with long-term TMJ Osteoarthritis (OA) (39.1 ± 15.7 years), 15 patients at initial diagnosis of OA (44.9 ± 14.8 years) and 7 healthy controls (43 ± 12.4 years). 3D surface models of the condyles were constructed and Shape Correspondence was used to establish correspondent points on each model. The statistical framework included a multivariate analysis of covariance (MANCOVA) and Direction-Projection- Permutation (DiProPerm) for testing statistical significance of the differences between healthy control and the OA group determined by clinical and radiographic diagnoses. Unsupervised classification using hierarchical agglomerative clustering (HAC) was then conducted. Condylar morphology in OA and healthy subjects varied widely. Compared with healthy controls, OA average condyle was statistically significantly smaller in all dimensions except its anterior surface. Significant flattening of the lateral pole was noticed at initial diagnosis (p < 0.05). It was observed areas of 3.88 mm bone resorption at the superior surface and 3.10 mm bone apposition at the anterior aspect of the long-term OA average model. 1000 permutation statistics of DiProPerm supported a significant difference between the healthy control group and OA group (t = 6.7, empirical p-value = 0.001). Clinically meaningful unsupervised classification of TMJ condylar morphology determined a preliminary diagnostic index of 3D osteoarthritic changes, which may be the first step towards a more targeted diagnosis of this condition.
Sul, Woo Jun; Cole, James R.; Jesus, Ederson da C.; Wang, Qiong; Farris, Ryan J.; Fish, Jordan A.; Tiedje, James M.
2011-01-01
High-throughput sequencing of 16S rRNA genes has increased our understanding of microbial community structure, but now even higher-throughput methods to the Illumina scale allow the creation of much larger datasets with more samples and orders-of-magnitude more sequences that swamp current analytic methods. We developed a method capable of handling these larger datasets on the basis of assignment of sequences into an existing taxonomy using a supervised learning approach (taxonomy-supervised analysis). We compared this method with a commonly used clustering approach based on sequence similarity (taxonomy-unsupervised analysis). We sampled 211 different bacterial communities from various habitats and obtained ∼1.3 million 16S rRNA sequences spanning the V4 hypervariable region by pyrosequencing. Both methodologies gave similar ecological conclusions in that β-diversity measures calculated by using these two types of matrices were significantly correlated to each other, as were the ordination configurations and hierarchical clustering dendrograms. In addition, our taxonomy-supervised analyses were also highly correlated with phylogenetic methods, such as UniFrac. The taxonomy-supervised analysis has the advantages that it is not limited by the exhaustive computation required for the alignment and clustering necessary for the taxonomy-unsupervised analysis, is more tolerant of sequencing errors, and allows comparisons when sequences are from different regions of the 16S rRNA gene. With the tremendous expansion in 16S rRNA data acquisition underway, the taxonomy-supervised approach offers the potential to provide more rapid and extensive community comparisons across habitats and samples. PMID:21873204
A unique patterned diamond stamp for a periodically hierarchical nanoarray structure.
Wang, Yi; Shen, Yanting; Xu, Weiqing; Xu, Shuping; Li, Hongdong
2016-09-23
A diamond stamp with a hierarchical pattern was designed for the direct preparation of a periodic nanoarray structure, which was prepared by the reactive ion etching technique with a hierarchical ultrathin alumina membrane (HUTAM) as a mask. The optimal etching conditions for fabricating the diamond stamp were discussed in order to realize a vertical nanopore structure, avoiding structural damage from lateral etching. By using this diamond stamp, a polymer film with the desired hierarchical nanorod array structure can be obtained easily via the simple stamping process, which greatly simplifies the processing procedure. More importantly, the stamp is reusable because of its super-hardness, which ensures the reproducibility of the nanorod array pattern. Another merit is that the smooth surface of the etched diamond can avoid the use of a release agent. Our results prove that this hard stamp can be used for quick preparation of an elaborate periodic nanoarray structure. This study is significant in that it solves the problems of high cost and easy damage of stamps in nanoimprint lithography, and it might inspire more sophisticated applications of such an ordered structure in nanoplasmonics, biochemical sensing and nanophotonic devices.
González-Calabozo, Jose M; Valverde-Albacete, Francisco J; Peláez-Moreno, Carmen
2016-09-15
Gene Expression Data (GED) analysis poses a great challenge to the scientific community that can be framed into the Knowledge Discovery in Databases (KDD) and Data Mining (DM) paradigm. Biclustering has emerged as the machine learning method of choice to solve this task, but its unsupervised nature makes result assessment problematic. This is often addressed by means of Gene Set Enrichment Analysis (GSEA). We put forward a framework in which GED analysis is understood as an Exploratory Data Analysis (EDA) process where we provide support for continuous human interaction with data aiming at improving the step of hypothesis abduction and assessment. We focus on the adaptation to human cognition of data interpretation and visualization of the output of EDA. First, we give a proper theoretical background to bi-clustering using Lattice Theory and provide a set of analysis tools revolving around [Formula: see text]-Formal Concept Analysis ([Formula: see text]-FCA), a lattice-theoretic unsupervised learning technique for real-valued matrices. By using different kinds of cost structures to quantify expression we obtain different sequences of hierarchical bi-clusterings for gene under- and over-expression using thresholds. Consequently, we provide a method with interleaved analysis steps and visualization devices so that the sequences of lattices for a particular experiment summarize the researcher's vision of the data. This also allows us to define measures of persistence and robustness of biclusters to assess them. Second, the resulting biclusters are used to index external omics databases-for instance, Gene Ontology (GO)-thus offering a new way of accessing publicly available resources. This provides different flavors of gene set enrichment against which to assess the biclusters, by obtaining their p-values according to the terminology of those resources. We illustrate the exploration procedure on a real data example confirming results previously published. The GED analysis problem gets transformed into the exploration of a sequence of lattices enabling the visualization of the hierarchical structure of the biclusters with a certain degree of granularity. The ability of FCA-based bi-clustering methods to index external databases such as GO allows us to obtain a quality measure of the biclusters, to observe the evolution of a gene throughout the different biclusters it appears in, to look for relevant biclusters-by observing their genes and what their persistence is-to infer, for instance, hypotheses on their function.
Yousefi, Siamak; Balasubramanian, Madhusudhanan; Goldbaum, Michael H; Medeiros, Felipe A; Zangwill, Linda M; Weinreb, Robert N; Liebmann, Jeffrey M; Girkin, Christopher A; Bowd, Christopher
2016-05-01
To validate Gaussian mixture-model with expectation maximization (GEM) and variational Bayesian independent component analysis mixture-models (VIM) for detecting glaucomatous progression along visual field (VF) defect patterns (GEM-progression of patterns (POP) and VIM-POP). To compare GEM-POP and VIM-POP with other methods. GEM and VIM models separated cross-sectional abnormal VFs from 859 eyes and normal VFs from 1117 eyes into abnormal and normal clusters. Clusters were decomposed into independent axes. The confidence limit (CL) of stability was established for each axis with a set of 84 stable eyes. Sensitivity for detecting progression was assessed in a sample of 83 eyes with known progressive glaucomatous optic neuropathy (PGON). Eyes were classified as progressed if any defect pattern progressed beyond the CL of stability. Performance of GEM-POP and VIM-POP was compared to point-wise linear regression (PLR), permutation analysis of PLR (PoPLR), and linear regression (LR) of mean deviation (MD), and visual field index (VFI). Sensitivity and specificity for detecting glaucomatous VFs were 89.9% and 93.8%, respectively, for GEM and 93.0% and 97.0%, respectively, for VIM. Receiver operating characteristic (ROC) curve areas for classifying progressed eyes were 0.82 for VIM-POP, 0.86 for GEM-POP, 0.81 for PoPLR, 0.69 for LR of MD, and 0.76 for LR of VFI. GEM-POP was significantly more sensitive to PGON than PoPLR and linear regression of MD and VFI in our sample, while providing localized progression information. Detection of glaucomatous progression can be improved by assessing longitudinal changes in localized patterns of glaucomatous defect identified by unsupervised machine learning.
Unsupervised laparoscopic appendicectomy by surgical trainees is safe and time-effective.
Wong, Kenneth; Duncan, Tristram; Pearson, Andrew
2007-07-01
Open appendicectomy is the traditional standard treatment for appendicitis. Laparoscopic appendicectomy is perceived as a procedure with greater potential for complications and longer operative times. This paper examines the hypothesis that unsupervised laparoscopic appendicectomy by surgical trainees is a safe and time-effective valid alternative. Medical records, operating theatre records and histopathology reports of all patients undergoing laparoscopic and open appendicectomy over a 15-month period in two hospitals within an area health service were retrospectively reviewed. Data were analysed to compare patient features, pathology findings, operative times, complications, readmissions and mortality between laparoscopic and open groups and between unsupervised surgical trainee operators versus consultant surgeon operators. A total of 143 laparoscopic and 222 open appendicectomies were reviewed. Unsupervised trainees performed 64% of the laparoscopic appendicectomies and 55% of the open appendicectomies. There were no significant differences in complication rates, readmissions, mortality and length of stay between laparoscopic and open appendicectomy groups or between trainee and consultant surgeon operators. Conversion rates (laparoscopic to open approach) were similar for trainees and consultants. Unsupervised senior surgical trainees did not take significantly longer to perform laparoscopic appendicectomy when compared to unsupervised trainee-performed open appendicectomy. Unsupervised laparoscopic appendicectomy by surgical trainees is safe and time-effective.
Lee, Yoon Ho; Lee, Tae Kyung; Kim, Hongki; Song, Inho; Lee, Jiwon; Kang, Saewon; Ko, Hyunhyub; Kwak, Sang Kyu; Oh, Joon Hak
2018-03-01
In insect eyes, ommatidia with hierarchical structured cornea play a critical role in amplifying and transferring visual signals to the brain through optic nerves, enabling the perception of various visual signals. Here, inspired by the structure and functions of insect ommatidia, a flexible photoimaging device is reported that can simultaneously detect and record incoming photonic signals by vertically stacking an organic photodiode and resistive memory device. A single-layered, hierarchical multiple-patterned back reflector that can exhibit various plasmonic effects is incorporated into the organic photodiode. The multiple-patterned flexible organic photodiodes exhibit greatly enhanced photoresponsivity due to the increased light absorption in comparison with the flat systems. Moreover, the flexible photoimaging device shows a well-resolved spatiotemporal mapping of optical signals with excellent operational and mechanical stabilities at low driving voltages below half of the flat systems. Theoretical calculation and scanning near-field optical microscopy analyses clearly reveal that multiple-patterned electrodes have much stronger surface plasmon coupling than flat and single-patterned systems. The developed methodology provides a versatile and effective route for realizing high-performance optoelectronic and photonic systems. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
A continuous-time neural model for sequential action.
Kachergis, George; Wyatte, Dean; O'Reilly, Randall C; de Kleijn, Roy; Hommel, Bernhard
2014-11-05
Action selection, planning and execution are continuous processes that evolve over time, responding to perceptual feedback as well as evolving top-down constraints. Existing models of routine sequential action (e.g. coffee- or pancake-making) generally fall into one of two classes: hierarchical models that include hand-built task representations, or heterarchical models that must learn to represent hierarchy via temporal context, but thus far lack goal-orientedness. We present a biologically motivated model of the latter class that, because it is situated in the Leabra neural architecture, affords an opportunity to include both unsupervised and goal-directed learning mechanisms. Moreover, we embed this neurocomputational model in the theoretical framework of the theory of event coding (TEC), which posits that actions and perceptions share a common representation with bidirectional associations between the two. Thus, in this view, not only does perception select actions (along with task context), but actions are also used to generate perceptions (i.e. intended effects). We propose a neural model that implements TEC to carry out sequential action control in hierarchically structured tasks such as coffee-making. Unlike traditional feedforward discrete-time neural network models, which use static percepts to generate static outputs, our biological model accepts continuous-time inputs and likewise generates non-stationary outputs, making short-timescale dynamic predictions. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
McCann, Cooper; Repasky, Kevin S.; Morin, Mikindra; ...
2017-05-23
Hyperspectral image analysis has benefited from an array of methods that take advantage of the increased spectral depth compared to multispectral sensors; however, the focus of these developments has been on supervised classification methods. Lack of a priori knowledge regarding land cover characteristics can make unsupervised classification methods preferable under certain circumstances. An unsupervised classification technique is presented in this paper that utilizes physically relevant basis functions to model the reflectance spectra. These fit parameters used to generate the basis functions allow clustering based on spectral characteristics rather than spectral channels and provide both noise and data reduction. Histogram splittingmore » of the fit parameters is then used as a means of producing an unsupervised classification. Unlike current unsupervised classification techniques that rely primarily on Euclidian distance measures to determine similarity, the unsupervised classification technique uses the natural splitting of the fit parameters associated with the basis functions creating clusters that are similar in terms of physical parameters. The data set used in this work utilizes the publicly available data collected at Indian Pines, Indiana. This data set provides reference data allowing for comparisons of the efficacy of different unsupervised data analysis. The unsupervised histogram splitting technique presented in this paper is shown to be better than the standard unsupervised ISODATA clustering technique with an overall accuracy of 34.3/19.0% before merging and 40.9/39.2% after merging. Finally, this improvement is also seen as an improvement of kappa before/after merging of 24.8/30.5 for the histogram splitting technique compared to 15.8/28.5 for ISODATA.« less
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCann, Cooper; Repasky, Kevin S.; Morin, Mikindra
Hyperspectral image analysis has benefited from an array of methods that take advantage of the increased spectral depth compared to multispectral sensors; however, the focus of these developments has been on supervised classification methods. Lack of a priori knowledge regarding land cover characteristics can make unsupervised classification methods preferable under certain circumstances. An unsupervised classification technique is presented in this paper that utilizes physically relevant basis functions to model the reflectance spectra. These fit parameters used to generate the basis functions allow clustering based on spectral characteristics rather than spectral channels and provide both noise and data reduction. Histogram splittingmore » of the fit parameters is then used as a means of producing an unsupervised classification. Unlike current unsupervised classification techniques that rely primarily on Euclidian distance measures to determine similarity, the unsupervised classification technique uses the natural splitting of the fit parameters associated with the basis functions creating clusters that are similar in terms of physical parameters. The data set used in this work utilizes the publicly available data collected at Indian Pines, Indiana. This data set provides reference data allowing for comparisons of the efficacy of different unsupervised data analysis. The unsupervised histogram splitting technique presented in this paper is shown to be better than the standard unsupervised ISODATA clustering technique with an overall accuracy of 34.3/19.0% before merging and 40.9/39.2% after merging. Finally, this improvement is also seen as an improvement of kappa before/after merging of 24.8/30.5 for the histogram splitting technique compared to 15.8/28.5 for ISODATA.« less
Bae, Won-Gyu; Kim, Hong Nam; Kim, Doogon; Park, Suk-Hee; Jeong, Hoon Eui; Suh, Kahp-Yang
2014-02-01
Multiscale, hierarchically patterned surfaces, such as lotus leaves, butterfly wings, adhesion pads of gecko lizards are abundantly found in nature, where microstructures are usually used to strengthen the mechanical stability while nanostructures offer the main functionality, i.e., wettability, structural color, or dry adhesion. To emulate such hierarchical structures in nature, multiscale, multilevel patterning has been extensively utilized for the last few decades towards various applications ranging from wetting control, structural colors, to tissue scaffolds. In this review, we highlight recent advances in scalable multiscale patterning to bring about improved functions that can even surpass those found in nature, with particular focus on the analogy between natural and synthetic architectures in terms of the role of different length scales. This review is organized into four sections. First, the role and importance of multiscale, hierarchical structures is described with four representative examples. Second, recent achievements in multiscale patterning are introduced with their strengths and weaknesses. Third, four application areas of wetting control, dry adhesives, selectively filtrating membranes, and multiscale tissue scaffolds are overviewed by stressing out how and why multiscale structures need to be incorporated to carry out their performances. Finally, we present future directions and challenges for scalable, multiscale patterned surfaces. © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Paik, Taejong; Yun, Hongseok; Fleury, Blaise
We demonstrate the fabrication of hierarchical materials by controlling the structure of highly ordered binary nanocrystal superlattices (BNSLs) on multiple length scales. Combinations of magnetic, plasmonic, semiconducting, and insulating colloidal nanocrystal (NC) building blocks are self-assembled into BNSL membranes via the liquid–interfacial assembly technique. Free-standing BNSL membranes are transferred onto topographically structured poly(dimethylsiloxane) molds via the Langmuir–Schaefer technique and then deposited in patterns onto substrates via transfer printing. BNSLs with different structural motifs are successfully patterned into various meso- and microstructures such as lines, circles, and even three-dimensional grids across large-area substrates. A combination of electron microscopy and grazing incidencemore » small-angle X-ray scattering (GISAXS) measurements confirm the ordering of NC building blocks in meso- and micropatterned BNSLs. This technique demonstrates structural diversity in the design of hierarchical materials by assembling BNSLs from NC building blocks of different composition and size by patterning BNSLs into various size and shape superstructures of interest for a broad range of applications.« less
Network structure of subway passenger flows
NASA Astrophysics Data System (ADS)
Xu, Q.; Mao, B. H.; Bai, Y.
2016-03-01
The results of transportation infrastructure network analyses have been used to analyze complex networks in a topological context. However, most modeling approaches, including those based on complex network theory, do not fully account for real-life traffic patterns and may provide an incomplete view of network functions. This study utilizes trip data obtained from the Beijing Subway System to characterize individual passenger movement patterns. A directed weighted passenger flow network was constructed from the subway infrastructure network topology by incorporating trip data. The passenger flow networks exhibit several properties that can be characterized by power-law distributions based on flow size, and log-logistic distributions based on the fraction of boarding and departing passengers. The study also characterizes the temporal patterns of in-transit and waiting passengers and provides a hierarchical clustering structure for passenger flows. This hierarchical flow organization varies in the spatial domain. Ten cluster groups were identified, indicating a hierarchical urban polycentric structure composed of large concentrated flows at urban activity centers. These empirical findings provide insights regarding urban human mobility patterns within a large subway network.
Jiménez-Hernández, Hugo; González-Barbosa, Jose-Joel; Garcia-Ramírez, Teresa
2010-01-01
This investigation demonstrates an unsupervised approach for modeling traffic flow and detecting abnormal vehicle behaviors at intersections. In the first stage, the approach reveals and records the different states of the system. These states are the result of coding and grouping the historical motion of vehicles as long binary strings. In the second stage, using sequences of the recorded states, a stochastic graph model based on a Markovian approach is built. A behavior is labeled abnormal when current motion pattern cannot be recognized as any state of the system or a particular sequence of states cannot be parsed with the stochastic model. The approach is tested with several sequences of images acquired from a vehicular intersection where the traffic flow and duration used in connection with the traffic lights are continuously changed throughout the day. Finally, the low complexity and the flexibility of the approach make it reliable for use in real time systems. PMID:22163616
Jiménez-Hernández, Hugo; González-Barbosa, Jose-Joel; Garcia-Ramírez, Teresa
2010-01-01
This investigation demonstrates an unsupervised approach for modeling traffic flow and detecting abnormal vehicle behaviors at intersections. In the first stage, the approach reveals and records the different states of the system. These states are the result of coding and grouping the historical motion of vehicles as long binary strings. In the second stage, using sequences of the recorded states, a stochastic graph model based on a Markovian approach is built. A behavior is labeled abnormal when current motion pattern cannot be recognized as any state of the system or a particular sequence of states cannot be parsed with the stochastic model. The approach is tested with several sequences of images acquired from a vehicular intersection where the traffic flow and duration used in connection with the traffic lights are continuously changed throughout the day. Finally, the low complexity and the flexibility of the approach make it reliable for use in real time systems.
An unsupervised method for summarizing egocentric sport videos
NASA Astrophysics Data System (ADS)
Habibi Aghdam, Hamed; Jahani Heravi, Elnaz; Puig, Domenec
2015-12-01
People are getting more interested to record their sport activities using head-worn or hand-held cameras. This type of videos which is called egocentric sport videos has different motion and appearance patterns compared with life-logging videos. While a life-logging video can be defined in terms of well-defined human-object interactions, notwithstanding, it is not trivial to describe egocentric sport videos using well-defined activities. For this reason, summarizing egocentric sport videos based on human-object interaction might fail to produce meaningful results. In this paper, we propose an unsupervised method for summarizing egocentric videos by identifying the key-frames of the video. Our method utilizes both appearance and motion information and it automatically finds the number of the key-frames. Our blind user study on the new dataset collected from YouTube shows that in 93:5% cases, the users choose the proposed method as their first video summary choice. In addition, our method is within the top 2 choices of the users in 99% of studies.
Neurons with two sites of synaptic integration learn invariant representations.
Körding, K P; König, P
2001-12-01
Neurons in mammalian cerebral cortex combine specific responses with respect to some stimulus features with invariant responses to other stimulus features. For example, in primary visual cortex, complex cells code for orientation of a contour but ignore its position to a certain degree. In higher areas, such as the inferotemporal cortex, translation-invariant, rotation-invariant, and even view point-invariant responses can be observed. Such properties are of obvious interest to artificial systems performing tasks like pattern recognition. It remains to be resolved how such response properties develop in biological systems. Here we present an unsupervised learning rule that addresses this problem. It is based on a neuron model with two sites of synaptic integration, allowing qualitatively different effects of input to basal and apical dendritic trees, respectively. Without supervision, the system learns to extract invariance properties using temporal or spatial continuity of stimuli. Furthermore, top-down information can be smoothly integrated in the same framework. Thus, this model lends a physiological implementation to approaches of unsupervised learning of invariant-response properties.
Twellmann, Thorsten; Meyer-Baese, Anke; Lange, Oliver; Foo, Simon; Nattkemper, Tim W.
2008-01-01
Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) has become an important tool in breast cancer diagnosis, but evaluation of multitemporal 3D image data holds new challenges for human observers. To aid the image analysis process, we apply supervised and unsupervised pattern recognition techniques for computing enhanced visualizations of suspicious lesions in breast MRI data. These techniques represent an important component of future sophisticated computer-aided diagnosis (CAD) systems and support the visual exploration of spatial and temporal features of DCE-MRI data stemming from patients with confirmed lesion diagnosis. By taking into account the heterogeneity of cancerous tissue, these techniques reveal signals with malignant, benign and normal kinetics. They also provide a regional subclassification of pathological breast tissue, which is the basis for pseudo-color presentations of the image data. Intelligent medical systems are expected to have substantial implications in healthcare politics by contributing to the diagnosis of indeterminate breast lesions by non-invasive imaging. PMID:19255616
A neural-visualization IDS for honeynet data.
Herrero, Álvaro; Zurutuza, Urko; Corchado, Emilio
2012-04-01
Neural intelligent systems can provide a visualization of the network traffic for security staff, in order to reduce the widely known high false-positive rate associated with misuse-based Intrusion Detection Systems (IDSs). Unlike previous work, this study proposes an unsupervised neural models that generate an intuitive visualization of the captured traffic, rather than network statistics. These snapshots of network events are immensely useful for security personnel that monitor network behavior. The system is based on the use of different neural projection and unsupervised methods for the visual inspection of honeypot data, and may be seen as a complementary network security tool that sheds light on internal data structures through visual inspection of the traffic itself. Furthermore, it is intended to facilitate verification and assessment of Snort performance (a well-known and widely-used misuse-based IDS), through the visualization of attack patterns. Empirical verification and comparison of the proposed projection methods are performed in a real domain, where two different case studies are defined and analyzed.
The Convallis Rule for Unsupervised Learning in Cortical Networks
Yger, Pierre; Harris, Kenneth D.
2013-01-01
The phenomenology and cellular mechanisms of cortical synaptic plasticity are becoming known in increasing detail, but the computational principles by which cortical plasticity enables the development of sensory representations are unclear. Here we describe a framework for cortical synaptic plasticity termed the “Convallis rule”, mathematically derived from a principle of unsupervised learning via constrained optimization. Implementation of the rule caused a recurrent cortex-like network of simulated spiking neurons to develop rate representations of real-world speech stimuli, enabling classification by a downstream linear decoder. Applied to spike patterns used in in vitro plasticity experiments, the rule reproduced multiple results including and beyond STDP. However STDP alone produced poorer learning performance. The mathematical form of the rule is consistent with a dual coincidence detector mechanism that has been suggested by experiments in several synaptic classes of juvenile neocortex. Based on this confluence of normative, phenomenological, and mechanistic evidence, we suggest that the rule may approximate a fundamental computational principle of the neocortex. PMID:24204224
Unsupervised Categorization in a Sample of Children with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Edwards, Darren J.; Perlman, Amotz; Reed, Phil
2012-01-01
Studies of supervised Categorization have demonstrated limited Categorization performance in participants with autism spectrum disorders (ASD), however little research has been conducted regarding unsupervised Categorization in this population. This study explored unsupervised Categorization using two stimulus sets that differed in their…
Unsupervised Deep Hashing With Pseudo Labels for Scalable Image Retrieval.
Zhang, Haofeng; Liu, Li; Long, Yang; Shao, Ling
2018-04-01
In order to achieve efficient similarity searching, hash functions are designed to encode images into low-dimensional binary codes with the constraint that similar features will have a short distance in the projected Hamming space. Recently, deep learning-based methods have become more popular, and outperform traditional non-deep methods. However, without label information, most state-of-the-art unsupervised deep hashing (DH) algorithms suffer from severe performance degradation for unsupervised scenarios. One of the main reasons is that the ad-hoc encoding process cannot properly capture the visual feature distribution. In this paper, we propose a novel unsupervised framework that has two main contributions: 1) we convert the unsupervised DH model into supervised by discovering pseudo labels; 2) the framework unifies likelihood maximization, mutual information maximization, and quantization error minimization so that the pseudo labels can maximumly preserve the distribution of visual features. Extensive experiments on three popular data sets demonstrate the advantages of the proposed method, which leads to significant performance improvement over the state-of-the-art unsupervised hashing algorithms.
Tian, Moqian; Grill-Spector, Kalanit
2015-01-01
Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link among object views. Specifically, researchers argue whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, there was no advantage to unsupervised learning with spatiotemporal continuity or motion information than training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples. PMID:26024454
NASA Technical Reports Server (NTRS)
Bradshaw, G. A.
1995-01-01
There has been an increased interest in the quantification of pattern in ecological systems over the past years. This interest is motivated by the desire to construct valid models which extend across many scales. Spatial methods must quantify pattern, discriminate types of pattern, and relate hierarchical phenomena across scales. Wavelet analysis is introduced as a method to identify spatial structure in ecological transect data. The main advantage of the wavelet transform over other methods is its ability to preserve and display hierarchical information while allowing for pattern decomposition. Two applications of wavelet analysis are illustrated, as a means to: (1) quantify known spatial patterns in Douglas-fir forests at several scales, and (2) construct spatially-explicit hypotheses regarding pattern generating mechanisms. Application of the wavelet variance, derived from the wavelet transform, is developed for forest ecosystem analysis to obtain additional insight into spatially-explicit data. Specifically, the resolution capabilities of the wavelet variance are compared to the semi-variogram and Fourier power spectra for the description of spatial data using a set of one-dimensional stationary and non-stationary processes. The wavelet cross-covariance function is derived from the wavelet transform and introduced as a alternative method for the analysis of multivariate spatial data of understory vegetation and canopy in Douglas-fir forests of the western Cascades of Oregon.
Unsupervised Feature Selection Based on the Morisita Index for Hyperspectral Images
NASA Astrophysics Data System (ADS)
Golay, Jean; Kanevski, Mikhail
2017-04-01
Hyperspectral sensors are capable of acquiring images with hundreds of narrow and contiguous spectral bands. Compared with traditional multispectral imagery, the use of hyperspectral images allows better performance in discriminating between land-cover classes, but it also results in large redundancy and high computational data processing. To alleviate such issues, unsupervised feature selection techniques for redundancy minimization can be implemented. Their goal is to select the smallest subset of features (or bands) in such a way that all the information content of a data set is preserved as much as possible. The present research deals with the application to hyperspectral images of a recently introduced technique of unsupervised feature selection: the Morisita-Based filter for Redundancy Minimization (MBRM). MBRM is based on the (multipoint) Morisita index of clustering and on the Morisita estimator of Intrinsic Dimension (ID). The fundamental idea of the technique is to retain only the bands which contribute to increasing the ID of an image. In this way, redundant bands are disregarded, since they have no impact on the ID. Besides, MBRM has several advantages over benchmark techniques: in addition to its ability to deal with large data sets, it can capture highly-nonlinear dependences and its implementation is straightforward in any programming environment. Experimental results on freely available hyperspectral images show the good effectiveness of MBRM in remote sensing data processing. Comparisons with benchmark techniques are carried out and random forests are used to assess the performance of MBRM in reducing the data dimensionality without loss of relevant information. References [1] C. Traina Jr., A.J.M. Traina, L. Wu, C. Faloutsos, Fast feature selection using fractal dimension, in: Proceedings of the XV Brazilian Symposium on Databases, SBBD, pp. 158-171, 2000. [2] J. Golay, M. Kanevski, A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48(12), pp. 4070-4081, 2015. [3] J. Golay, M. Kanevski, Unsupervised feature selection based on the Morisita estimator of intrinsic dimension, arXiv:1608.05581, 2016.
Experience-Driven Formation of Parts-Based Representations in a Model of Layered Visual Memory
Jitsev, Jenia; von der Malsburg, Christoph
2009-01-01
Growing neuropsychological and neurophysiological evidence suggests that the visual cortex uses parts-based representations to encode, store and retrieve relevant objects. In such a scheme, objects are represented as a set of spatially distributed local features, or parts, arranged in stereotypical fashion. To encode the local appearance and to represent the relations between the constituent parts, there has to be an appropriate memory structure formed by previous experience with visual objects. Here, we propose a model how a hierarchical memory structure supporting efficient storage and rapid recall of parts-based representations can be established by an experience-driven process of self-organization. The process is based on the collaboration of slow bidirectional synaptic plasticity and homeostatic unit activity regulation, both running at the top of fast activity dynamics with winner-take-all character modulated by an oscillatory rhythm. These neural mechanisms lay down the basis for cooperation and competition between the distributed units and their synaptic connections. Choosing human face recognition as a test task, we show that, under the condition of open-ended, unsupervised incremental learning, the system is able to form memory traces for individual faces in a parts-based fashion. On a lower memory layer the synaptic structure is developed to represent local facial features and their interrelations, while the identities of different persons are captured explicitly on a higher layer. An additional property of the resulting representations is the sparseness of both the activity during the recall and the synaptic patterns comprising the memory traces. PMID:19862345
Perkins, Timothy N.; Peeters, Paul M.; Shukla, Arti; Arijs, Ingrid; Dragon, Julie; Wouters, Emiel F.M.; Reynaert, Niki L.; Mossman, Brooke T.
2015-01-01
Occupational and environmental exposures to airborne asbestos and silica are associated with the development of lung fibrosis in the forms of asbestosis and silicosis, respectively. However, both diseases display distinct pathologic presentations, likely associated with differences in gene expression induced by different mineral structures, composition and bio-persistent properties. We hypothesized that effects of mineral exposure in the airway epithelium may dictate deviating molecular events that may explain the different pathologies of asbestosis versus silicosis. Using robust gene expression-profiling in conjunction with in-depth pathway analysis, we assessed early (24 h) alterations in gene expression associated with crocidolite asbestos or cristobalite silica exposures in primary human bronchial epithelial cells (NHBEs). Observations were confirmed in an immortalized line (BEAS-2B) by QRT-PCR and protein assays. Utilization of overall gene expression, unsupervised hierarchical cluster analysis and integrated pathway analysis revealed gene alterations that were common to both minerals or unique to either mineral. Our findings reveal that both minerals had potent effects on genes governing cell adhesion/migration, inflammation, and cellular stress, key features of fibrosis. Asbestos exposure was most specifically associated with aberrant cell proliferation and carcinogenesis, whereas silica exposure was highly associated with additional inflammatory responses, as well as pattern recognition, and fibrogenesis. These findings illustrate the use of gene-profiling as a means to determine early molecular events that may dictate pathological processes induced by exogenous cellular insults. In addition, it is a useful approach for predicting the pathogenicity of potentially harmful materials. PMID:25351596
Ibarrola-Villava, Maider; Peña-Chilet, María; Mongort, Cristina; Martinez-Ciarpaglini, Carolina; Navarro, Lara; Gambardella, Valentina; Castillo, Josefa; Roselló, Susana; Navarro, Samuel; Ribas, Gloria; Cervantes, Andrés
2016-01-01
Gastric cancer (GC) pathogenesis involves genetic, epigenetic and environmental factors. Epigenetic alterations, such as DNA methylation are considered pivotal in the inactivation of tumor-related genes. We assessed a methylation panel of 5 genes to study their association to GC progression and microsatellite instability (MSI), and studied the role of RUNX3 in GC pathogenesis and the tumor immune microenvironment. The methylation status of 47 promoter-CpG islands was studied through MALDI-TOF mass spectrometry analysis in 35 Microsatellite stable (MSS) GC, 26 MSI, and 18 cancer-free samples (CFS), and 6 MSS GC and 4 MSI GC cell lines. We also studied RUNX3 expression by immunohistochemistry (IHC) in 40 samples, and validated differences in methylation levels between tumor, normal, and immune tissue in 14 additional samples. Unsupervised hierarchical clustering of methylation levels revealed no distinct subgroups between MSI and MSS samples or cell lines. CFSs clustered together showing higher levels of RUNX3 methylation compared to GC samples. RUNX3 showed protein silencing in cancer and normal mucosa, compared to inflammatory peritumoural infiltrate in almost all cases, showing a non-lymphocytic predominant pattern and being correlated with epigenetic silencing. Our results show aberrant promoter's methylation in APC, CDH1, CDKN2A, MLH1 and RUNX3 associated with GC, as well as a non-lymphocytic predominant infiltrate with high expression of RUNX3. Deep study of RUNX3 inflammation signaling could help in understanding inflammation and immune activation in the tumor microenvironment. PMID:27566570
Methylation profiling identifies 2 groups of gliomas according to their tumorigenesis.
Laffaire, Julien; Everhard, Sibille; Idbaih, Ahmed; Crinière, Emmanuelle; Marie, Yannick; de Reyniès, Aurelien; Schiappa, Renaud; Mokhtari, Karima; Hoang-Xuan, Khê; Sanson, Marc; Delattre, Jean-Yves; Thillet, Joëlle; Ducray, François
2011-01-01
Extensive genomic and gene expression studies have been performed in gliomas, but the epigenetic alterations that characterize different subtypes of gliomas remain largely unknown. Here, we analyzed the methylation patterns of 807 genes (1536 CpGs) in a series of 33 low-grade gliomas (LGGs), 36 glioblastomas (GBMs), 8 paired initial and recurrent gliomas, and 9 controls. This analysis was performed with Illumina's Golden Gate Bead methylation arrays and was correlated with clinical, histological, genomic, gene expression, and genotyping data, including IDH1 mutations. Unsupervised hierarchical clustering resulted in 2 groups of gliomas: a group corresponding to de novo GBMs and a group consisting of LGGs, recurrent anaplastic gliomas, and secondary GBMs. When compared with de novo GBMs and controls, this latter group was characterized by a very high frequency of IDH1 mutations and by a hypermethylated profile similar to the recently described glioma CpG island methylator phenotype. MGMT methylation was more frequent in this group. Among the LGG cluster, 1p19q codeleted LGG displayed a distinct methylation profile. A study of paired initial and recurrent gliomas demonstrated that methylation profiles were remarkably stable across glioma evolution, even during anaplastic transformation, suggesting that epigenetic alterations occur early during gliomagenesis. Using the Cancer Genome Atlas data set, we demonstrated that GBM samples that had an LGG-like hypermethylated profile had a high rate of IDH1 mutations and a better outcome. Finally, we identified several hypermethylated and downregulated genes that may be associated with LGG and GBM oncogenesis, LGG oncogenesis, 1p19q codeleted LGG oncogenesis, and GBM oncogenesis.
Methylation profiling identifies 2 groups of gliomas according to their tumorigenesis
Laffaire, Julien; Everhard, Sibille; Idbaih, Ahmed; Crinière, Emmanuelle; Marie, Yannick; de Reyniès, Aurelien; Schiappa, Renaud; Mokhtari, Karima; Hoang-Xuan, Khê; Sanson, Marc; Delattre, Jean-Yves; Thillet, Joëlle; Ducray, François
2011-01-01
Extensive genomic and gene expression studies have been performed in gliomas, but the epigenetic alterations that characterize different subtypes of gliomas remain largely unknown. Here, we analyzed the methylation patterns of 807 genes (1536 CpGs) in a series of 33 low-grade gliomas (LGGs), 36 glioblastomas (GBMs), 8 paired initial and recurrent gliomas, and 9 controls. This analysis was performed with Illumina's Golden Gate Bead methylation arrays and was correlated with clinical, histological, genomic, gene expression, and genotyping data, including IDH1 mutations. Unsupervised hierarchical clustering resulted in 2 groups of gliomas: a group corresponding to de novo GBMs and a group consisting of LGGs, recurrent anaplastic gliomas, and secondary GBMs. When compared with de novo GBMs and controls, this latter group was characterized by a very high frequency of IDH1 mutations and by a hypermethylated profile similar to the recently described glioma CpG island methylator phenotype. MGMT methylation was more frequent in this group. Among the LGG cluster, 1p19q codeleted LGG displayed a distinct methylation profile. A study of paired initial and recurrent gliomas demonstrated that methylation profiles were remarkably stable across glioma evolution, even during anaplastic transformation, suggesting that epigenetic alterations occur early during gliomagenesis. Using the Cancer Genome Atlas data set, we demonstrated that GBM samples that had an LGG-like hypermethylated profile had a high rate of IDH1 mutations and a better outcome. Finally, we identified several hypermethylated and downregulated genes that may be associated with LGG and GBM oncogenesis, LGG oncogenesis, 1p19q codeleted LGG oncogenesis, and GBM oncogenesis. PMID:20926426
Shipham, Ashlee; Schmidt, Daniel J; Hughes, Jane M
2013-01-01
Recent work has highlighted the need to account for hierarchical patterns of genetic structure when estimating evolutionary and ecological parameters of interest. This caution is particularly relevant to studies of riverine organisms, where hierarchical structure appears to be commonplace. Here, we indirectly estimate dispersal distance in a hierarchically structured freshwater fish, Mogurnda adspersa. Microsatellite and mitochondrial DNA (mtDNA) data were obtained for 443 individuals across 27 sites separated by an average of 1.3 km within creeks of southeastern Queensland, Australia. Significant genetic structure was found among sites (mtDNA Φ(ST) = 0.508; microsatellite F(ST) = 0.225, F'(ST) = 0.340). Various clustering methods produced congruent patterns of hierarchical structure reflecting stream architecture. Partial mantel tests identified contiguous sets of sample sites where isolation by distance (IBD) explained F(ST) variation without significant contribution of hierarchical structure. Analysis of mean natal dispersal distance (σ) within sets of IBD-linked sample sites suggested most dispersal occurs over less than 1 km, and the average effective density (D(e)) was estimated at 11.5 individuals km(-1); indicating sedentary behavior and small effective population size are responsible for the remarkable patterns of genetic structure observed. Our results demonstrate that Rousset's regression-based method is applicable to estimating the scale of dispersal in riverine organisms and that identifying contiguous populations that satisfy the assumptions of this model is achievable with genetic clustering methods and partial correlations.
Pothos, Emmanuel M; Bailey, Todd M
2009-07-01
Naïve observers typically perceive some groupings for a set of stimuli as more intuitive than others. The problem of predicting category intuitiveness has been historically considered the remit of models of unsupervised categorization. In contrast, this article develops a measure of category intuitiveness from one of the most widely supported models of supervised categorization, the generalized context model (GCM). Considering different category assignments for a set of instances, the authors asked how well the GCM can predict the classification of each instance on the basis of all the other instances. The category assignment that results in the smallest prediction error is interpreted as the most intuitive for the GCM-the authors refer to this way of applying the GCM as "unsupervised GCM." The authors systematically compared predictions of category intuitiveness from the unsupervised GCM and two models of unsupervised categorization: the simplicity model and the rational model. The unsupervised GCM compared favorably with the simplicity model and the rational model. This success of the unsupervised GCM illustrates that the distinction between supervised and unsupervised categorization may need to be reconsidered. However, no model emerged as clearly superior, indicating that there is more work to be done in understanding and modeling category intuitiveness.
Wendel, Jochen; Buttenfield, Barbara P.; Stanislawski, Larry V.
2016-01-01
Knowledge of landscape type can inform cartographic generalization of hydrographic features, because landscape characteristics provide an important geographic context that affects variation in channel geometry, flow pattern, and network configuration. Landscape types are characterized by expansive spatial gradients, lacking abrupt changes between adjacent classes; and as having a limited number of outliers that might confound classification. The US Geological Survey (USGS) is exploring methods to automate generalization of features in the National Hydrography Data set (NHD), to associate specific sequences of processing operations and parameters with specific landscape characteristics, thus obviating manual selection of a unique processing strategy for every NHD watershed unit. A chronology of methods to delineate physiographic regions for the United States is described, including a recent maximum likelihood classification based on seven input variables. This research compares unsupervised and supervised algorithms applied to these seven input variables, to evaluate and possibly refine the recent classification. Evaluation metrics for unsupervised methods include the Davies–Bouldin index, the Silhouette index, and the Dunn index as well as quantization and topographic error metrics. Cross validation and misclassification rate analysis are used to evaluate supervised classification methods. The paper reports the comparative analysis and its impact on the selection of landscape regions. The compared solutions show problems in areas of high landscape diversity. There is some indication that additional input variables, additional classes, or more sophisticated methods can refine the existing classification.
Differential principal component analysis of ChIP-seq.
Ji, Hongkai; Li, Xia; Wang, Qian-fei; Ning, Yang
2013-04-23
We propose differential principal component analysis (dPCA) for analyzing multiple ChIP-sequencing datasets to identify differential protein-DNA interactions between two biological conditions. dPCA integrates unsupervised pattern discovery, dimension reduction, and statistical inference into a single framework. It uses a small number of principal components to summarize concisely the major multiprotein synergistic differential patterns between the two conditions. For each pattern, it detects and prioritizes differential genomic loci by comparing the between-condition differences with the within-condition variation among replicate samples. dPCA provides a unique tool for efficiently analyzing large amounts of ChIP-sequencing data to study dynamic changes of gene regulation across different biological conditions. We demonstrate this approach through analyses of differential chromatin patterns at transcription factor binding sites and promoters as well as allele-specific protein-DNA interactions.
Na, Kyoung-Sae; Lee, Soyoung Irene; Hong, Hyun Ju; Oh, Myoung-Ja; Bahn, Geon Ho; Ha, Kyunghee; Shin, Yun Mi; Song, Jungeun; Park, Eun Jin; Yoo, Heejung; Kim, Hyunsoo; Kyung, Yun-Mi
2014-06-01
In the last few decades, changing socioeconomic and family structures have increasingly left children alone without adult supervision. Carefully prepared and limited periods of unsupervised time are not harmful for children. However, long unsupervised periods have harmful effects, particularly for those children at high risk for inattention and problem behaviors. In this study, we examined the influence of unsupervised time on behavior problems by studying a sample of elementary school children at high risk for inattention and problem behaviors. The study analyzed data from the Children's Mental Health Promotion Project, which was conducted in collaboration with education, government, and mental health professionals. The child behavior checklist (CBCL) was administered to assess problem behaviors among first- and fourth-grade children. Multivariate logistic regression analysis was used to evaluate the influence of unsupervised time on children's behavior. A total of 3,270 elementary school children (1,340 first-graders and 1,930 fourth-graders) were available for this study; 1,876 of the 3,270 children (57.4%) reportedly spent a significant amount of time unsupervised during the day. Unsupervised time that exceeded more than 2h per day increased the risk of delinquency, aggressive behaviors, and somatic complaints, as well as externalizing and internalizing problems. Carefully planned afterschool programming and care should be provided to children at high risk for inattention and problem behaviors. Also, a more comprehensive approach is needed to identify the possible mechanisms by which unsupervised time aggravates behavior problems in children predisposed for these behaviors. Copyright © 2013 Elsevier Ltd. All rights reserved.
Processing prosodic structure by adults with language-based learning disability.
Bahl, Megha; Plante, Elena; Gerken, LouAnn
2009-01-01
Two experiments investigated the ability of adults with a history of language-based learning disability (hLLD) and their normal language (NL) peers to learn prosodic patterns of a novel language. Participants were exposed to stimuli from an artificial language and tested on items that required generalization of the stress patterns and the hierarchical principles of stress assignment that could be inferred from the input. In Study 1, the NL group successfully generalized the patterns of stress heard during familiarization, but failed to show generalization of the hierarchical principles. The hLLD group performed at chance for both types of generalization items. In Study 2, the intensity of stress elements was increased. The performance of the NL group improved whereas the hLLD groups' performance decreased on both types of generalization items. The results indicate that NL adults are able to successfully abstract the complex hierarchical rules of stress if the prosodic cues are made sufficiently salient, but this same task is difficult for adults with hLLD. The reader will be able to understand: (1) the difference in the ability of hLLD and NL adults to process stress assignment in an implicit learning context and (2) that typical adults can abstract complex hierarchical rules of stress assignment when provided with strong cues.
Hierarchical model analysis of the Atlantic Flyway Breeding Waterfowl Survey
Sauer, John R.; Zimmerman, Guthrie S.; Klimstra, Jon D.; Link, William A.
2014-01-01
We used log-linear hierarchical models to analyze data from the Atlantic Flyway Breeding Waterfowl Survey. The survey has been conducted by state biologists each year since 1989 in the northeastern United States from Virginia north to New Hampshire and Vermont. Although yearly population estimates from the survey are used by the United States Fish and Wildlife Service for estimating regional waterfowl population status for mallards (Anas platyrhynchos), black ducks (Anas rubripes), wood ducks (Aix sponsa), and Canada geese (Branta canadensis), they are not routinely adjusted to control for time of day effects and other survey design issues. The hierarchical model analysis permits estimation of year effects and population change while accommodating the repeated sampling of plots and controlling for time of day effects in counting. We compared population estimates from the current stratified random sample analysis to population estimates from hierarchical models with alternative model structures that describe year to year changes as random year effects, a trend with random year effects, or year effects modeled as 1-year differences. Patterns of population change from the hierarchical model results generally were similar to the patterns described by stratified random sample estimates, but significant visibility differences occurred between twilight to midday counts in all species. Controlling for the effects of time of day resulted in larger population estimates for all species in the hierarchical model analysis relative to the stratified random sample analysis. The hierarchical models also provided a convenient means of estimating population trend as derived statistics from the analysis. We detected significant declines in mallard and American black ducks and significant increases in wood ducks and Canada geese, a trend that had not been significant for 3 of these 4 species in the prior analysis. We recommend using hierarchical models for analysis of the Atlantic Flyway Breeding Waterfowl Survey.
Mapping Informative Clusters in a Hierarchial Framework of fMRI Multivariate Analysis
Xu, Rui; Zhen, Zonglei; Liu, Jia
2010-01-01
Pattern recognition methods have become increasingly popular in fMRI data analysis, which are powerful in discriminating between multi-voxel patterns of brain activities associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical framework of multivariate approach that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers were served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters were highly overlapped for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical framework of multivariate approach is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081
Tessem, May-Britt; Bathen, Tone F; Cejková, Jitka; Midelfart, Anna
2005-03-01
This study was conducted to investigate metabolic changes in aqueous humor from rabbit eyes exposed to either UV-A or -B radiation, by using (1)H nuclear magnetic resonance (NMR) spectroscopy and unsupervised pattern recognition methods. Both eyes of adult albino rabbits were irradiated with UV-A (366 nm, 0.589 J/cm(2)) or UV-B (312 nm, 1.667 J/cm(2)) radiation for 8 minutes, once a day for 5 days. Three days after the last irradiation, samples of aqueous humor were aspirated, and the metabolic profiles analyzed with (1)H NMR spectroscopy. The metabolic concentrations in the exposed and control materials were statistically analyzed and compared, with multivariate methods and one-way ANOVA. UV-B radiation caused statistically significant alterations of betaine, glucose, ascorbate, valine, isoleucine, and formate in the rabbit aqueous humor. By using principal component analysis, the UV-B-irradiated samples were clearly separated from the UV-A-irradiated samples and the control group. No significant metabolic changes were detected in UV-A-irradiated samples. This study demonstrates the potential of using unsupervised pattern recognition methods to extract valuable metabolic information from complex (1)H NMR spectra. UV-B irradiation of rabbit eyes led to significant metabolic changes in the aqueous humor detected 3 days after the last exposure.
Geospatiotemporal Data Mining of Remotely Sensed Phenology for Unsupervised Forest Threat Detection
NASA Astrophysics Data System (ADS)
Mills, R. T.; Hoffman, F. M.; Kumar, J.; Vulli, S. S.; Hargrove, W. W.; Spruce, J.
2010-12-01
Hargrove and Hoffman have previously developed and applied a scalable geospatiotemporal data mining approach to define a set of categorical, multivariate classes or states for describing and tracking the behavior of ecosystem properties through time within a multi-dimensional phase or state space. The method employs a standard k-means cluster analysis with enhancements that reduce the number of required comparisons, dramatically accelerating iterative convergence. In support of efforts by the USDA Forest Service to develop a National Early Warning System for Forest Disturbances, we have applied this geospatiotemporal cluster analysis procedure to annual phenology patterns derived from Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) for unsupervised change detection. We will present initial results from the analysis of seven years of 250-m MODIS NDVI data for the conterminous United States. While determining what constitutes a "normal" phenological pattern for any given location is challenging due to interannual climate variability, a spatially varying climate change trend, and the relatively short record of MODIS NDVI observations, these results demonstrate the utility of the method for detecting significant mortality events, like the progressive damage from mountain pine beetle, and suggest that the technique may be successfully implemented as a key component in an early warning system for identifying forest threats from natural and anthropogenic disturbances at a continental scale.
Hierarchical Spatiotemporal Dynamics of Speech Rhythm and Articulation
ERIC Educational Resources Information Center
Tilsen, Samuel Edward
2009-01-01
Hierarchy is one of the most important concepts in the scientific study of language. This dissertation aims to understand why we observe hierarchical structures in speech by investigating the cognitive processes from which they emerge. To that end, the dissertation explores how articulatory, rhythmic, and prosodic patterns of speech interact.…
Fabelo, Himar; Ortega, Samuel; Ravi, Daniele; Kiran, B Ravi; Sosa, Coralia; Bulters, Diederik; Callicó, Gustavo M; Bulstrode, Harry; Szolna, Adam; Piñeiro, Juan F; Kabwama, Silvester; Madroñal, Daniel; Lazcano, Raquel; J-O'Shanahan, Aruma; Bisshopp, Sara; Hernández, María; Báez, Abelardo; Yang, Guang-Zhong; Stanciulescu, Bogdan; Salvador, Rubén; Juárez, Eduardo; Sarmiento, Roberto
2018-01-01
Surgery for brain cancer is a major problem in neurosurgery. The diffuse infiltration into the surrounding normal brain by these tumors makes their accurate identification by the naked eye difficult. Since surgery is the common treatment for brain cancer, an accurate radical resection of the tumor leads to improved survival rates for patients. However, the identification of the tumor boundaries during surgery is challenging. Hyperspectral imaging is a non-contact, non-ionizing and non-invasive technique suitable for medical diagnosis. This study presents the development of a novel classification method taking into account the spatial and spectral characteristics of the hyperspectral images to help neurosurgeons to accurately determine the tumor boundaries in surgical-time during the resection, avoiding excessive excision of normal tissue or unintentionally leaving residual tumor. The algorithm proposed in this study to approach an efficient solution consists of a hybrid framework that combines both supervised and unsupervised machine learning methods. Firstly, a supervised pixel-wise classification using a Support Vector Machine classifier is performed. The generated classification map is spatially homogenized using a one-band representation of the HS cube, employing the Fixed Reference t-Stochastic Neighbors Embedding dimensional reduction algorithm, and performing a K-Nearest Neighbors filtering. The information generated by the supervised stage is combined with a segmentation map obtained via unsupervised clustering employing a Hierarchical K-Means algorithm. The fusion is performed using a majority voting approach that associates each cluster with a certain class. To evaluate the proposed approach, five hyperspectral images of surface of the brain affected by glioblastoma tumor in vivo from five different patients have been used. The final classification maps obtained have been analyzed and validated by specialists. These preliminary results are promising, obtaining an accurate delineation of the tumor area.
Kabwama, Silvester; Madroñal, Daniel; Lazcano, Raquel; J-O’Shanahan, Aruma; Bisshopp, Sara; Hernández, María; Báez, Abelardo; Yang, Guang-Zhong; Stanciulescu, Bogdan; Salvador, Rubén; Juárez, Eduardo; Sarmiento, Roberto
2018-01-01
Surgery for brain cancer is a major problem in neurosurgery. The diffuse infiltration into the surrounding normal brain by these tumors makes their accurate identification by the naked eye difficult. Since surgery is the common treatment for brain cancer, an accurate radical resection of the tumor leads to improved survival rates for patients. However, the identification of the tumor boundaries during surgery is challenging. Hyperspectral imaging is a non-contact, non-ionizing and non-invasive technique suitable for medical diagnosis. This study presents the development of a novel classification method taking into account the spatial and spectral characteristics of the hyperspectral images to help neurosurgeons to accurately determine the tumor boundaries in surgical-time during the resection, avoiding excessive excision of normal tissue or unintentionally leaving residual tumor. The algorithm proposed in this study to approach an efficient solution consists of a hybrid framework that combines both supervised and unsupervised machine learning methods. Firstly, a supervised pixel-wise classification using a Support Vector Machine classifier is performed. The generated classification map is spatially homogenized using a one-band representation of the HS cube, employing the Fixed Reference t-Stochastic Neighbors Embedding dimensional reduction algorithm, and performing a K-Nearest Neighbors filtering. The information generated by the supervised stage is combined with a segmentation map obtained via unsupervised clustering employing a Hierarchical K-Means algorithm. The fusion is performed using a majority voting approach that associates each cluster with a certain class. To evaluate the proposed approach, five hyperspectral images of surface of the brain affected by glioblastoma tumor in vivo from five different patients have been used. The final classification maps obtained have been analyzed and validated by specialists. These preliminary results are promising, obtaining an accurate delineation of the tumor area. PMID:29554126
Unsupervised Segmentation of Head Tissues from Multi-modal MR Images for EEG Source Localization.
Mahmood, Qaiser; Chodorowski, Artur; Mehnert, Andrew; Gellermann, Johanna; Persson, Mikael
2015-08-01
In this paper, we present and evaluate an automatic unsupervised segmentation method, hierarchical segmentation approach (HSA)-Bayesian-based adaptive mean shift (BAMS), for use in the construction of a patient-specific head conductivity model for electroencephalography (EEG) source localization. It is based on a HSA and BAMS for segmenting the tissues from multi-modal magnetic resonance (MR) head images. The evaluation of the proposed method was done both directly in terms of segmentation accuracy and indirectly in terms of source localization accuracy. The direct evaluation was performed relative to a commonly used reference method brain extraction tool (BET)-FMRIB's automated segmentation tool (FAST) and four variants of the HSA using both synthetic data and real data from ten subjects. The synthetic data includes multiple realizations of four different noise levels and several realizations of typical noise with a 20% bias field level. The Dice index and Hausdorff distance were used to measure the segmentation accuracy. The indirect evaluation was performed relative to the reference method BET-FAST using synthetic two-dimensional (2D) multimodal magnetic resonance (MR) data with 3% noise and synthetic EEG (generated for a prescribed source). The source localization accuracy was determined in terms of localization error and relative error of potential. The experimental results demonstrate the efficacy of HSA-BAMS, its robustness to noise and the bias field, and that it provides better segmentation accuracy than the reference method and variants of the HSA. They also show that it leads to a more accurate localization accuracy than the commonly used reference method and suggest that it has potential as a surrogate for expert manual segmentation for the EEG source localization problem.
Two Clinical Phenotypes in Polycythemia Vera
Spivak, Jerry L.; Considine, Michael; Williams, Donna M.; Talbot, Conover C.; Rogers, Ophelia; Moliterno, Alison R.; Jie, Chunfa; Ochs, Michael F.
2014-01-01
BACKGROUND Polycythemia vera is the ultimate phenotypic consequence of the V617F mutation in Janus kinase 2 (encoded by JAK2), but the extent to which this mutation influences the behavior of the involved CD34+ hematopoietic stem cells is unknown. METHODS We analyzed gene expression in CD34+ peripheral-blood cells from 19 patients with polycythemia vera, using oligonucleotide microarray technology after correcting for potential confounding by sex, since the phenotypic features of the disease differ between men and women. RESULTS Men with polycythemia vera had twice as many up-regulated or down-regulated genes as women with polycythemia vera, in a comparison of gene expression in the patients and in healthy persons of the same sex, but there were 102 genes with differential regulation that was concordant in men and women. When these genes were used for class discovery by means of unsupervised hierarchical clustering, the 19 patients could be divided into two groups that did not differ significantly with respect to age, neutrophil JAK2 V617F allele burden, white-cell count, platelet count, or clonal dominance. However, they did differ significantly with respect to disease duration; hemoglobin level; frequency of thromboembolic events, palpable splenomegaly, and splenectomy; chemotherapy exposure; leukemic transformation; and survival. The unsupervised clustering was confirmed by a supervised approach with the use of a top-scoring-pair classifier that segregated the 19 patients into the same two phenotypic groups with 100% accuracy. CONCLUSIONS Removing sex as a potential confounder, we identified an accurate molecular method for classifying patients with polycythemia vera according to disease behavior, independently of their JAK2 V617F allele burden, and identified previously unrecognized molecular pathways in polycythemia vera outside the canonical JAK2 pathway that may be amenable to targeted therapy. PMID:25162887
A hierarchical graph neuron scheme for real-time pattern recognition.
Nasution, B B; Khan, A I
2008-02-01
The hierarchical graph neuron (HGN) implements a single cycle memorization and recall operation through a novel algorithmic design. The HGN is an improvement on the already published original graph neuron (GN) algorithm. In this improved approach, it recognizes incomplete/noisy patterns. It also resolves the crosstalk problem, which is identified in the previous publications, within closely matched patterns. To accomplish this, the HGN links multiple GN networks for filtering noise and crosstalk out of pattern data inputs. Intrinsically, the HGN is a lightweight in-network processing algorithm which does not require expensive floating point computations; hence, it is very suitable for real-time applications and tiny devices such as the wireless sensor networks. This paper describes that the HGN's pattern matching capability and the small response time remain insensitive to the increases in the number of stored patterns. Moreover, the HGN does not require definition of rules or setting of thresholds by the operator to achieve the desired results nor does it require heuristics entailing iterative operations for memorization and recall of patterns.
Atherton, Olivia E; Schofield, Thomas J; Sitka, Angela; Conger, Rand D; Robins, Richard W
2016-04-01
Despite widespread speculation about the detrimental effect of unsupervised self-care on adolescent outcomes, little is known about which children are particularly prone to problem behaviors when left at home without adult supervision. The present research used data from a longitudinal study of 674 Mexican-origin children residing in the United States to examine the prospective effect of unsupervised self-care on conduct problems, and the moderating roles of hostile aggression and gender. Results showed that unsupervised self-care was related to increases over time in conduct problems such as lying, stealing, and bullying. However, unsupervised self-care only led to conduct problems for boys and for children with an aggressive temperament. The main and interactive effects held for both mother-reported and observational-rated hostile aggression and after controlling for potential confounds. Copyright © 2016 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Using Unsupervised Learning to Unlock the Potential of Hydrologic Similarity
NASA Astrophysics Data System (ADS)
Chaney, N.; Newman, A. J.
2017-12-01
By clustering environmental data into representative hydrologic response units (HRUs), hydrologic similarity aims to harness the covariance between a system's physical environment and its hydrologic response to create reduced-order models. This is the primary approach through which sub-grid hydrologic processes are represented in large-scale models (e.g., Earth System Models). Although the possibilities of hydrologic similarity are extensive, its practical implementations have been limited to 1-d bins of oversimplistic metrics of hydrologic response (e.g., topographic index)—this is a missed opportunity. In this presentation we will show how unsupervised learning is unlocking the potential of hydrologic similarity; clustering methods enable generalized frameworks to effectively and efficiently harness the petabytes of global environmental data to robustly characterize sub-grid heterogeneity in large-scale models. To illustrate the potential that unsupervised learning has towards advancing hydrologic similarity, we introduce a hierarchical clustering algorithm (HCA) that clusters very high resolution (30-100 meters) elevation, soil, climate, and land cover data to assemble a domain's representative HRUs. These HRUs are then used to parameterize the sub-grid heterogeneity in land surface models; for this study we use the GFDL LM4 model—the land component of the GFDL Earth System Model. To explore HCA and its impacts on the hydrologic system we use a ¼ grid cell in southeastern California as a test site. HCA is used to construct an ensemble of 9 different HRU configurations—each configuration has a different number of HRUs; for each ensemble member LM4 is run between 2002 and 2014 with a 26 year spinup. The analysis of the ensemble of model simulations show that: 1) clustering the high-dimensional environmental data space leads to a robust representation of the role of the physical environment in the coupled water, energy, and carbon cycles at a relatively low number of HRUs; 2) the reduced-order model with around 300 HRUs effectively reproduces the fully distributed model simulation (30 meters) with less than 1/1000 of computational expense; 3) assigning each grid cell of the fully distributed grid to an HRU via HCA enables novel visualization methods for large-scale models—this has significant implications for how these models are applied and evaluated. We will conclude by outlining the potential that this work has within operational prediction systems including numerical weather prediction, Earth System models, and Early Warning systems.
Unsupervised universal steganalyzer for high-dimensional steganalytic features
NASA Astrophysics Data System (ADS)
Hou, Xiaodan; Zhang, Tao
2016-11-01
The research in developing steganalytic features has been highly successful. These features are extremely powerful when applied to supervised binary classification problems. However, they are incompatible with unsupervised universal steganalysis because the unsupervised method cannot distinguish embedding distortion from varying levels of noises caused by cover variation. This study attempts to alleviate the problem by introducing similarity retrieval of image statistical properties (SRISP), with the specific aim of mitigating the effect of cover variation on the existing steganalytic features. First, cover images with some statistical properties similar to those of a given test image are searched from a retrieval cover database to establish an aided sample set. Then, unsupervised outlier detection is performed on a test set composed of the given test image and its aided sample set to determine the type (cover or stego) of the given test image. Our proposed framework, called SRISP-aided unsupervised outlier detection, requires no training. Thus, it does not suffer from model mismatch mess. Compared with prior unsupervised outlier detectors that do not consider SRISP, the proposed framework not only retains the universality but also exhibits superior performance when applied to high-dimensional steganalytic features.
Clonal Selection Based Artificial Immune System for Generalized Pattern Recognition
NASA Technical Reports Server (NTRS)
Huntsberger, Terry
2011-01-01
The last two decades has seen a rapid increase in the application of AIS (Artificial Immune Systems) modeled after the human immune system to a wide range of areas including network intrusion detection, job shop scheduling, classification, pattern recognition, and robot control. JPL (Jet Propulsion Laboratory) has developed an integrated pattern recognition/classification system called AISLE (Artificial Immune System for Learning and Exploration) based on biologically inspired models of B-cell dynamics in the immune system. When used for unsupervised or supervised classification, the method scales linearly with the number of dimensions, has performance that is relatively independent of the total size of the dataset, and has been shown to perform as well as traditional clustering methods. When used for pattern recognition, the method efficiently isolates the appropriate matches in the data set. The paper presents the underlying structure of AISLE and the results from a number of experimental studies.
An Improved Hierarchical Genetic Algorithm for Sheet Cutting Scheduling with Process Constraints
Rao, Yunqing; Qi, Dezhong; Li, Jinling
2013-01-01
For the first time, an improved hierarchical genetic algorithm for sheet cutting problem which involves n cutting patterns for m non-identical parallel machines with process constraints has been proposed in the integrated cutting stock model. The objective of the cutting scheduling problem is minimizing the weighted completed time. A mathematical model for this problem is presented, an improved hierarchical genetic algorithm (ant colony—hierarchical genetic algorithm) is developed for better solution, and a hierarchical coding method is used based on the characteristics of the problem. Furthermore, to speed up convergence rates and resolve local convergence issues, a kind of adaptive crossover probability and mutation probability is used in this algorithm. The computational result and comparison prove that the presented approach is quite effective for the considered problem. PMID:24489491
An improved hierarchical genetic algorithm for sheet cutting scheduling with process constraints.
Rao, Yunqing; Qi, Dezhong; Li, Jinling
2013-01-01
For the first time, an improved hierarchical genetic algorithm for sheet cutting problem which involves n cutting patterns for m non-identical parallel machines with process constraints has been proposed in the integrated cutting stock model. The objective of the cutting scheduling problem is minimizing the weighted completed time. A mathematical model for this problem is presented, an improved hierarchical genetic algorithm (ant colony--hierarchical genetic algorithm) is developed for better solution, and a hierarchical coding method is used based on the characteristics of the problem. Furthermore, to speed up convergence rates and resolve local convergence issues, a kind of adaptive crossover probability and mutation probability is used in this algorithm. The computational result and comparison prove that the presented approach is quite effective for the considered problem.
Xiao, Fuyuan; Aritsugi, Masayoshi; Wang, Qing; Zhang, Rong
2016-09-01
For efficient and sophisticated analysis of complex event patterns that appear in streams of big data from health care information systems and support for decision-making, a triaxial hierarchical model is proposed in this paper. Our triaxial hierarchical model is developed by focusing on hierarchies among nested event pattern queries with an event concept hierarchy, thereby allowing us to identify the relationships among the expressions and sub-expressions of the queries extensively. We devise a cost-based heuristic by means of the triaxial hierarchical model to find an optimised query execution plan in terms of the costs of both the operators and the communications between them. According to the triaxial hierarchical model, we can also calculate how to reuse the results of the common sub-expressions in multiple queries. By integrating the optimised query execution plan with the reuse schemes, a multi-query optimisation strategy is developed to accomplish efficient processing of multiple nested event pattern queries. We present empirical studies in which the performance of multi-query optimisation strategy was examined under various stream input rates and workloads. Specifically, the workloads of pattern queries can be used for supporting monitoring patients' conditions. On the other hand, experiments with varying input rates of streams can correspond to changes of the numbers of patients that a system should manage, whereas burst input rates can correspond to changes of rushes of patients to be taken care of. The experimental results have shown that, in Workload 1, our proposal can improve about 4 and 2 times throughput comparing with the relative works, respectively; in Workload 2, our proposal can improve about 3 and 2 times throughput comparing with the relative works, respectively; in Workload 3, our proposal can improve about 6 times throughput comparing with the relative work. The experimental results demonstrated that our proposal was able to process complex queries efficiently which can support health information systems and further decision-making. Copyright © 2016 Elsevier B.V. All rights reserved.
Scale-invariant feature extraction of neural network and renormalization group flow
NASA Astrophysics Data System (ADS)
Iso, Satoshi; Shiba, Shotaro; Yokoo, Sumito
2018-05-01
Theoretical understanding of how a deep neural network (DNN) extracts features from input images is still unclear, but it is widely believed that the extraction is performed hierarchically through a process of coarse graining. It reminds us of the basic renormalization group (RG) concept in statistical physics. In order to explore possible relations between DNN and RG, we use the restricted Boltzmann machine (RBM) applied to an Ising model and construct a flow of model parameters (in particular, temperature) generated by the RBM. We show that the unsupervised RBM trained by spin configurations at various temperatures from T =0 to T =6 generates a flow along which the temperature approaches the critical value Tc=2.2 7 . This behavior is the opposite of the typical RG flow of the Ising model. By analyzing various properties of the weight matrices of the trained RBM, we discuss why it flows towards Tc and how the RBM learns to extract features of spin configurations.
NASA Astrophysics Data System (ADS)
Schlueter-Kuck, Kristy L.; Dabiri, John O.
2017-09-01
We present a method for identifying the coherent structures associated with individual Lagrangian flow trajectories even where only sparse particle trajectory data are available. The method, based on techniques in spectral graph theory, uses the Coherent Structure Coloring vector and associated eigenvectors to analyze the distance in higher-dimensional eigenspace between a selected reference trajectory and other tracer trajectories in the flow. By analyzing this distance metric in a hierarchical clustering, the coherent structure of which the reference particle is a member can be identified. This algorithm is proven successful in identifying coherent structures of varying complexities in canonical unsteady flows. Additionally, the method is able to assess the relative coherence of the associated structure in comparison to the surrounding flow. Although the method is demonstrated here in the context of fluid flow kinematics, the generality of the approach allows for its potential application to other unsupervised clustering problems in dynamical systems such as neuronal activity, gene expression, or social networks.
Functional model of biological neural networks.
Lo, James Ting-Ho
2010-12-01
A functional model of biological neural networks, called temporal hierarchical probabilistic associative memory (THPAM), is proposed in this paper. THPAM comprises functional models of dendritic trees for encoding inputs to neurons, a first type of neuron for generating spike trains, a second type of neuron for generating graded signals to modulate neurons of the first type, supervised and unsupervised Hebbian learning mechanisms for easy learning and retrieving, an arrangement of dendritic trees for maximizing generalization, hardwiring for rotation-translation-scaling invariance, and feedback connections with different delay durations for neurons to make full use of present and past informations generated by neurons in the same and higher layers. These functional models and their processing operations have many functions of biological neural networks that have not been achieved by other models in the open literature and provide logically coherent answers to many long-standing neuroscientific questions. However, biological justifications of these functional models and their processing operations are required for THPAM to qualify as a macroscopic model (or low-order approximate) of biological neural networks.
Griffin, William A.; Li, Xun
2016-01-01
Sequential affect dynamics generated during the interaction of intimate dyads, such as married couples, are associated with a cascade of effects—some good and some bad—on each partner, close family members, and other social contacts. Although the effects are well documented, the probabilistic structures associated with micro-social processes connected to the varied outcomes remain enigmatic. Using extant data we developed a method of classifying and subsequently generating couple dynamics using a Hierarchical Dirichlet Process Hidden semi-Markov Model (HDP-HSMM). Our findings indicate that several key aspects of existing models of marital interaction are inadequate: affect state emissions and their durations, along with the expected variability differences between distressed and nondistressed couples are present but highly nuanced; and most surprisingly, heterogeneity among highly satisfied couples necessitate that they be divided into subgroups. We review how this unsupervised learning technique generates plausible dyadic sequences that are sensitive to relationship quality and provide a natural mechanism for computational models of behavioral and affective micro-social processes. PMID:27187319
Jeong, Jeong-Won; Shin, Dae C; Do, Synho; Marmarelis, Vasilis Z
2006-08-01
This paper presents a novel segmentation methodology for automated classification and differentiation of soft tissues using multiband data obtained with the newly developed system of high-resolution ultrasonic transmission tomography (HUTT) for imaging biological organs. This methodology extends and combines two existing approaches: the L-level set active contour (AC) segmentation approach and the agglomerative hierarchical kappa-means approach for unsupervised clustering (UC). To prevent the trapping of the current iterative minimization AC algorithm in a local minimum, we introduce a multiresolution approach that applies the level set functions at successively increasing resolutions of the image data. The resulting AC clusters are subsequently rearranged by the UC algorithm that seeks the optimal set of clusters yielding the minimum within-cluster distances in the feature space. The presented results from Monte Carlo simulations and experimental animal-tissue data demonstrate that the proposed methodology outperforms other existing methods without depending on heuristic parameters and provides a reliable means for soft tissue differentiation in HUTT images.
Anomaly Detection of Electromyographic Signals.
Ijaz, Ahsan; Choi, Jongeun
2018-04-01
In this paper, we provide a robust framework to detect anomalous electromyographic (EMG) signals and identify contamination types. As a first step for feature selection, optimally selected Lawton wavelets transform is applied. Robust principal component analysis (rPCA) is then performed on these wavelet coefficients to obtain features in a lower dimension. The rPCA based features are used for constructing a self-organizing map (SOM). Finally, hierarchical clustering is applied on the SOM that separates anomalous signals residing in the smaller clusters and breaks them into logical units for contamination identification. The proposed methodology is tested using synthetic and real world EMG signals. The synthetic EMG signals are generated using a heteroscedastic process mimicking desired experimental setups. A sub-part of these synthetic signals is introduced with anomalies. These results are followed with real EMG signals introduced with synthetic anomalies. Finally, a heterogeneous real world data set is used with known quality issues under an unsupervised setting. The framework provides recall of 90% (± 3.3) and precision of 99%(±0.4).
Hierarchical Letters in ASD: High Stimulus Variability under Different Attentional Modes
ERIC Educational Resources Information Center
Van der Hallen, Ruth; Vanmarcke, Steven; Noens, Ilse; Wagemans, Johan
2017-01-01
Studies using hierarchical patterns to test global precedence and local-global interference in individuals with ASD have produced mixed results. The current study focused on stimulus variability and locational uncertainty, while using different attentional modes. Two groups of 44 children with and without ASD completed a divided attention task as…
Bao, Rong-Rong; Zhang, Cheng-Yi; Zhang, Xiu-Juan; Ou, Xue-Mei; Lee, Chun-Sing; Jie, Jian-Sheng; Zhang, Xiao-Hong
2013-06-26
The controlled growth and alignment of one-dimensional organic nanostructures at well-defined locations considerably hinders the integration of nanostructures for electronic and optoelectronic applications. Here, we demonstrate a simple process to achieve the growth, alignment, and hierarchical patterning of organic nanowires on substrates with controlled patterns of surface wettability. The first-level pattern is confined by the substrate patterns of wettability. Organic nanostructures are preferentially grown on solvent wettable regions. The second-level pattern is the patterning of aligned organic nanowires deposited by controlling the shape and movement of the solution contact lines during evaporation on the wettable regions. This process is controlled by the cover-hat-controlled method or vertical evaportation method. Therefore, various new patterns of organic nanostructures can be obtained by combing these two levels of patterns. This simple method proves to be a general approach that can be applied to other organic nanostructure systems. Using the as-prepared patterned nanowire arrays, an optoelectronic device (photodetector) is easily fabricated. Hence, the proposed simple, large-scale, low-cost method of preparing patterns of highly ordered organic nanostructures has high potential applications in various electronic and optoelectronic devices.
BicPAMS: software for biological data analysis with pattern-based biclustering.
Henriques, Rui; Ferreira, Francisco L; Madeira, Sara C
2017-02-02
Biclustering has been largely applied for the unsupervised analysis of biological data, being recognised today as a key technique to discover putative modules in both expression data (subsets of genes correlated in subsets of conditions) and network data (groups of coherently interconnected biological entities). However, given its computational complexity, only recent breakthroughs on pattern-based biclustering enabled efficient searches without the restrictions that state-of-the-art biclustering algorithms place on the structure and homogeneity of biclusters. As a result, pattern-based biclustering provides the unprecedented opportunity to discover non-trivial yet meaningful biological modules with putative functions, whose coherency and tolerance to noise can be tuned and made problem-specific. To enable the effective use of pattern-based biclustering by the scientific community, we developed BicPAMS (Biclustering based on PAttern Mining Software), a software that: 1) makes available state-of-the-art pattern-based biclustering algorithms (BicPAM (Henriques and Madeira, Alg Mol Biol 9:27, 2014), BicNET (Henriques and Madeira, Alg Mol Biol 11:23, 2016), BicSPAM (Henriques and Madeira, BMC Bioinforma 15:130, 2014), BiC2PAM (Henriques and Madeira, Alg Mol Biol 11:1-30, 2016), BiP (Henriques and Madeira, IEEE/ACM Trans Comput Biol Bioinforma, 2015), DeBi (Serin and Vingron, AMB 6:1-12, 2011) and BiModule (Okada et al., IPSJ Trans Bioinf 48(SIG5):39-48, 2007)); 2) consistently integrates their dispersed contributions; 3) further explores additional accuracy and efficiency gains; and 4) makes available graphical and application programming interfaces. Results on both synthetic and real data confirm the relevance of BicPAMS for biological data analysis, highlighting its essential role for the discovery of putative modules with non-trivial yet biologically significant functions from expression and network data. BicPAMS is the first biclustering tool offering the possibility to: 1) parametrically customize the structure, coherency and quality of biclusters; 2) analyze large-scale biological networks; and 3) tackle the restrictive assumptions placed by state-of-the-art biclustering algorithms. These contributions are shown to be key for an adequate, complete and user-assisted unsupervised analysis of biological data. BicPAMS and its tutorial available in http://www.bicpams.com .
Selmansberger, Martin; Braselmann, Herbert; Hess, Julia; Bogdanova, Tetiana; Abend, Michael; Tronko, Mykola; Brenner, Alina; Zitzelsberger, Horst; Unger, Kristian
2015-01-01
One of the major consequences of the 1986 Chernobyl reactor accident was a dramatic increase in papillary thyroid carcinoma (PTC) incidence, predominantly in patients exposed to the radioiodine fallout at young age. The present study is the first on genomic copy number alterations (CNAs) of PTCs of the Ukrainian–American cohort (UkrAm) generated by array comparative genomic hybridization (aCGH). Unsupervised hierarchical clustering of CNA profiles revealed a significant enrichment of a subgroup of patients with female gender, long latency (>17 years) and negative lymph node status. Further, we identified single CNAs that were significantly associated with latency, gender, radiation dose and BRAF V600E mutation status. Multivariate analysis revealed no interactions but additive effects of parameters gender, latency and dose on CNAs. The previously identified radiation-associated gain of the chromosomal bands 7q11.22-11.23 was present in 29% of cases. Moreover, comparison of our radiation-associated PTC data set with the TCGA data set on sporadic PTCs revealed altered copy numbers of the tumor driver genes NF2 and CHEK2. Further, we integrated the CNA data with transcriptomic data that were available on a subset of the herein analyzed cohort and did not find statistically significant associations between the two molecular layers. However, applying hierarchical clustering on a ‘BRAF-like/RAS-like’ transcriptome signature split the cases into four groups, one of which containing all BRAF-positive cases validating the signature in an independent data set. PMID:26320103
Cannistraci, Carlo Vittorio; Ravasi, Timothy; Montevecchi, Franco Maria; Ideker, Trey; Alessio, Massimo
2010-09-15
Nonlinear small datasets, which are characterized by low numbers of samples and very high numbers of measures, occur frequently in computational biology, and pose problems in their investigation. Unsupervised hybrid-two-phase (H2P) procedures-specifically dimension reduction (DR), coupled with clustering-provide valuable assistance, not only for unsupervised data classification, but also for visualization of the patterns hidden in high-dimensional feature space. 'Minimum Curvilinearity' (MC) is a principle that-for small datasets-suggests the approximation of curvilinear sample distances in the feature space by pair-wise distances over their minimum spanning tree (MST), and thus avoids the introduction of any tuning parameter. MC is used to design two novel forms of nonlinear machine learning (NML): Minimum Curvilinear embedding (MCE) for DR, and Minimum Curvilinear affinity propagation (MCAP) for clustering. Compared with several other unsupervised and supervised algorithms, MCE and MCAP, whether individually or combined in H2P, overcome the limits of classical approaches. High performance was attained in the visualization and classification of: (i) pain patients (proteomic measurements) in peripheral neuropathy; (ii) human organ tissues (genomic transcription factor measurements) on the basis of their embryological origin. MC provides a valuable framework to estimate nonlinear distances in small datasets. Its extension to large datasets is prefigured for novel NMLs. Classification of neuropathic pain by proteomic profiles offers new insights for future molecular and systems biology characterization of pain. Improvements in tissue embryological classification refine results obtained in an earlier study, and suggest a possible reinterpretation of skin attribution as mesodermal. https://sites.google.com/site/carlovittoriocannistraci/home.
Spike timing analysis in neural networks with unsupervised synaptic plasticity
NASA Astrophysics Data System (ADS)
Mizusaki, B. E. P.; Agnes, E. J.; Brunnet, L. G.; Erichsen, R., Jr.
2013-01-01
The synaptic plasticity rules that sculpt a neural network architecture are key elements to understand cortical processing, as they may explain the emergence of stable, functional activity, while avoiding runaway excitation. For an associative memory framework, they should be built in a way as to enable the network to reproduce a robust spatio-temporal trajectory in response to an external stimulus. Still, how these rules may be implemented in recurrent networks and the way they relate to their capacity of pattern recognition remains unclear. We studied the effects of three phenomenological unsupervised rules in sparsely connected recurrent networks for associative memory: spike-timing-dependent-plasticity, short-term-plasticity and an homeostatic scaling. The system stability is monitored during the learning process of the network, as the mean firing rate converges to a value determined by the homeostatic scaling. Afterwards, it is possible to measure the recovery efficiency of the activity following each initial stimulus. This is evaluated by a measure of the correlation between spike fire timings, and we analysed the full memory separation capacity and limitations of this system.
On the unsupervised analysis of domain-specific Chinese texts
Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.
2016-01-01
With the growing availability of digitized text data both publicly and privately, there is a great need for effective computational tools to automatically extract information from texts. Because the Chinese language differs most significantly from alphabet-based languages in not specifying word boundaries, most existing Chinese text-mining methods require a prespecified vocabulary and/or a large relevant training corpus, which may not be available in some applications. We introduce an unsupervised method, top-down word discovery and segmentation (TopWORDS), for simultaneously discovering and segmenting words and phrases from large volumes of unstructured Chinese texts, and propose ways to order discovered words and conduct higher-level context analyses. TopWORDS is particularly useful for mining online and domain-specific texts where the underlying vocabulary is unknown or the texts of interest differ significantly from available training corpora. When outputs from TopWORDS are fed into context analysis tools such as topic modeling, word embedding, and association pattern finding, the results are as good as or better than that from using outputs of a supervised segmentation method. PMID:27185919
Ptitsyn, Andrey; Hulver, Matthew; Cefalu, William; York, David; Smith, Steven R
2006-12-19
Classification of large volumes of data produced in a microarray experiment allows for the extraction of important clues as to the nature of a disease. Using multi-dimensional unsupervised FOREL (FORmal ELement) algorithm we have re-analyzed three public datasets of skeletal muscle gene expression in connection with insulin resistance and type 2 diabetes (DM2). Our analysis revealed the major line of variation between expression profiles of normal, insulin resistant, and diabetic skeletal muscle. A cluster of most "metabolically sound" samples occupied one end of this line. The distance along this line coincided with the classic markers of diabetes risk, namely obesity and insulin resistance, but did not follow the accepted clinical diagnosis of DM2 as defined by the presence or absence of hyperglycemia. Genes implicated in this expression pattern are those controlling skeletal muscle fiber type and glycolytic metabolism. Additionally myoglobin and hemoglobin were upregulated and ribosomal genes deregulated in insulin resistant patients. Our findings are concordant with the changes seen in skeletal muscle with altitude hypoxia. This suggests that hypoxia and shift to glycolytic metabolism may also drive insulin resistance.
NASA Astrophysics Data System (ADS)
masini, nicola; Lasaponara, Rosa
2013-04-01
The papers deals with the use of VHR satellite multitemporal data set to extract cultural landscape changes in the roman site of Grumentum Grumentum is an ancient town, 50 km south of Potenza, located near the roman road of Via Herculea which connected the Venusia, in the north est of Basilicata, with Heraclea in the Ionian coast. The first settlement date back to the 6th century BC. It was resettled by the Romans in the 3rd century BC. Its urban fabric which evidences a long history from the Republican age to late Antiquity (III BC-V AD) is composed of the typical urban pattern of cardi and decumani. Its excavated ruins include a large amphitheatre, a theatre, the thermae, the Forum and some temples. There are many techniques nowadays available to capture and record differences in two or more images. In this paper we focus and apply the two main approaches which can be distinguished into : (i) unsupervised and (ii) supervised change detection methods. Unsupervised change detection methods are generally based on the transformation of the two multispectral images in to a single band or multiband image which are further analyzed to identify changes Unsupervised change detection techniques are generally based on three basic steps (i) the preprocessing step, (ii) a pixel-by-pixel comparison is performed, (iii). Identification of changes according to the magnitude an direction (positive /negative). Unsupervised change detection are generally based on the transformation of the two multispectral images into a single band or multiband image which are further analyzed to identify changes. Than the separation between changed and unchanged classes is obtained from the magnitude of the resulting spectral change vectors by means of empirical or theoretical well founded approaches Supervised change detection methods are generally based on supervised classification methods, which require the availability of a suitable training set for the learning process of the classifiers. Unsupervised change detection techniques are generally based on three basic steps (i) the preprocessing step, (ii) supervised classification is performed on the single dates or on the map obtained as the difference of two dates, (iii). Identification of changes according to the magnitude an direction (positive /negative). Supervised change detection are generally based on supervised classification methods, which require the availability of a suitable training set for the learning process of the classifiers, therefore these algorithms require a preliminary knowledge necessary: (i) to generate representative parameters for each class of interest; and (ii) to carry out the training stage Advantages and disadvantages of the supervised and unsupervised approaches are discuss. Finally results from the the satellite multitemporal dataset was also integrated with aerial photos from historical archive in order to expand the time window of the investigation and capture landscape changes occurred from the Agrarian Reform, in the 50s, up today.
Hierarchical species distribution models
Hefley, Trevor J.; Hooten, Mevin B.
2016-01-01
Determining the distribution pattern of a species is important to increase scientific knowledge, inform management decisions, and conserve biodiversity. To infer spatial and temporal patterns, species distribution models have been developed for use with many sampling designs and types of data. Recently, it has been shown that count, presence-absence, and presence-only data can be conceptualized as arising from a point process distribution. Therefore, it is important to understand properties of the point process distribution. We examine how the hierarchical species distribution modeling framework has been used to incorporate a wide array of regression and theory-based components while accounting for the data collection process and making use of auxiliary information. The hierarchical modeling framework allows us to demonstrate how several commonly used species distribution models can be derived from the point process distribution, highlight areas of potential overlap between different models, and suggest areas where further research is needed.
The dynamics of human-induced land cover change in miombo ecosystems of southern Africa
NASA Astrophysics Data System (ADS)
Jaiteh, Malanding Sambou
Understanding human-induced land cover change in the miombo require the consistent, geographically-referenced, data on temporal land cover characteristics as well as biophysical and socioeconomic drivers of land use, the major cause of land cover change. The overall goal of this research to examine the applications of high-resolution satellite remote sensing data in studying the dynamics of human-induced land cover change in the miombo. Specific objectives are to: (1) evaluate the applications of computer-assisted classification of Landsat Thematic Mapper (TM) data for land cover mapping in the miombo and (2) analyze spatial and temporal patterns of landscape change locations in the miombo. Stepwise Thematic Classification, STC (a hybrid supervised-unsupervised classification) procedure for classifying Landsat TM data was developed and tested using Landsat TM data. Classification accuracy results were compared to those from supervised and unsupervised classification. The STC provided the highest classification accuracy i.e., 83.9% correspondence between classified and referenced data compared to 44.2% and 34.5% for unsupervised and supervised classification respectively. Improvements in the classification process can be attributed to thematic stratification of the image data into spectrally homogenous (thematic) groups and step-by-step classification of the groups using supervised or unsupervised classification techniques. Supervised classification failed to classify 18% of the scene evidence that training data used did not adequately represent all of the variability in the data. Application of the procedure in drier miombo produced overall classification accuracy of 63%. This is much lower than that of wetter miombo. The results clearly demonstrate that digital classification of Landsat TM can be successfully implemented in the miombo without intensive fieldwork. Spatial characteristics of land cover change in agricultural and forested landscapes in central Malawi were analyzed for the period 1984 to 1995 spatial pattern analysis methods. Shifting cultivation areas, Agriculture in forested landscape, experienced highest rate of woodland cover fragmentation with mean patch size of closed woodland cover decreasing from 20ha to 7.5ha. Permanent bare (cropland and settlement) in intensive agricultural matrix landscapes increased 52% largely through the conversion of fallow areas. Protected National Park area remained fairly unchanged although closed woodland area increased by 4%, mainly from regeneration of open woodland. This study provided evidence that changes in spatial characteristics in the miombo differ with landscape. Land use change (i.e. conversion to cropland) is the primary driving force behind changes in landscape spatial patterns. Also, results revealed that exclusion of intense human use (i.e. cultivation and woodcutting) through regulations and/or fencing increased both closed woodland area (through regeneration of open woodland) and overall connectivity in the landscape. Spatial characteristics of land cover change were analyzed at locations in Malawi (wetter miombo) and Zimbabwe (drier miombo). Results indicate land cover dynamics differ both between and within case study sites. In communal areas in the Kasungu scene, land cover change is dominated by woodland fragmentation to open vegetation. Change in private commercial lands was dominantly expansion of bare (settlement and cropland) areas primarily at the expense of open vegetation (fallow land).
Dronova, Iryna; Spotswood, Erica N.; Suding, Katharine N.
2017-01-01
Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead (Elymus caput-medusae) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44–100% of test medusahead samples were matched by its classified extents from different methods, while 63–83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some “spillover” effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study’s framework to inform and constrain the candidate vegetation classes in heterogeneous locations. PMID:28611806
Dronova, Iryna; Spotswood, Erica N; Suding, Katharine N
2017-01-01
Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead ( Elymus caput-medusae ) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44-100% of test medusahead samples were matched by its classified extents from different methods, while 63-83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some "spillover" effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study's framework to inform and constrain the candidate vegetation classes in heterogeneous locations.
HOTS: A Hierarchy of Event-Based Time-Surfaces for Pattern Recognition.
Lagorce, Xavier; Orchard, Garrick; Galluppi, Francesco; Shi, Bertram E; Benosman, Ryad B
2017-07-01
This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.This paper describes novel event-based spatio-temporal features called time-surfaces and how they can be used to create a hierarchical event-based pattern recognition architecture. Unlike existing hierarchical architectures for pattern recognition, the presented model relies on a time oriented approach to extract spatio-temporal features from the asynchronously acquired dynamics of a visual scene. These dynamics are acquired using biologically inspired frameless asynchronous event-driven vision sensors. Similarly to cortical structures, subsequent layers in our hierarchy extract increasingly abstract features using increasingly large spatio-temporal windows. The central concept is to use the rich temporal information provided by events to create contexts in the form of time-surfaces which represent the recent temporal activity within a local spatial neighborhood. We demonstrate that this concept can robustly be used at all stages of an event-based hierarchical model. First layer feature units operate on groups of pixels, while subsequent layer feature units operate on the output of lower level feature units. We report results on a previously published 36 class character recognition task and a four class canonical dynamic card pip task, achieving near 100 percent accuracy on each. We introduce a new seven class moving face recognition task, achieving 79 percent accuracy.
Davidson, Ben; Stavnes, Helene Tuft; Holth, Arild; Chen, Xu; Yang, Yanqin; Shih, Ie-Ming; Wang, Tian-Li
2011-01-01
Abstract Ovarian/primary peritoneal carcinoma and breast carcinoma are the gynaecological cancers that most frequently involve the serosal cavities. With the objective of improving on the limited diagnostic panel currently available for the differential diagnosis of these two malignancies, as well as to define tumour-specific biological targets, we compared their global gene expression patterns. Gene expression profiles of 10 serous ovarian/peritoneal and eight ductal breast carcinoma effusions were analysed using the HumanRef-8 BeadChip from Illumina. Differentially expressed candidate genes were validated using quantitative real-time PCR and immunohistochemistry. Unsupervised hierarchical clustering using all 54,675 genes in the array separated ovarian from breast carcinoma samples. We identified 288 unique probes that were significantly differentially expressed in the two cancers by greater than 3.5-fold, of which 81 and 207 were overexpressed in breast and ovarian/peritoneal carcinoma, respectively. SAM analysis identified 1078 differentially expressed probes with false discovery rate less than 0.05. Genes overexpressed in breast carcinoma included TFF1, TFF3, FOXA1, CA12, GATA3, SDC1, PITX1, TH, EHFD1, EFEMP1, TOB1 and KLF2. Genes overexpressed in ovarian/peritoneal carcinoma included SPON1, RBP1, MFGE8, TM4SF12, MMP7, KLK5/6/7, FOLR1/3, PAX8, APOL2 and NRCAM. The differential expression of 14 genes was validated by quantitative real-time PCR, and differences in 5 gene products were confirmed by immunohistochemistry. Expression profiling distinguishes ovarian/peritoneal carcinoma from breast carcinoma and identifies genes that are differentially expressed in these two tumour types. The molecular signatures unique to these cancers may facilitate their differential diagnosis and may provide a molecular basis for therapeutic target discovery. PMID:20132413
Advanced methods in NDE using machine learning approaches
NASA Astrophysics Data System (ADS)
Wunderlich, Christian; Tschöpe, Constanze; Duckhorn, Frank
2018-04-01
Machine learning (ML) methods and algorithms have been applied recently with great success in quality control and predictive maintenance. Its goal to build new and/or leverage existing algorithms to learn from training data and give accurate predictions, or to find patterns, particularly with new and unseen similar data, fits perfectly to Non-Destructive Evaluation. The advantages of ML in NDE are obvious in such tasks as pattern recognition in acoustic signals or automated processing of images from X-ray, Ultrasonics or optical methods. Fraunhofer IKTS is using machine learning algorithms in acoustic signal analysis. The approach had been applied to such a variety of tasks in quality assessment. The principal approach is based on acoustic signal processing with a primary and secondary analysis step followed by a cognitive system to create model data. Already in the second analysis steps unsupervised learning algorithms as principal component analysis are used to simplify data structures. In the cognitive part of the software further unsupervised and supervised learning algorithms will be trained. Later the sensor signals from unknown samples can be recognized and classified automatically by the algorithms trained before. Recently the IKTS team was able to transfer the software for signal processing and pattern recognition to a small printed circuit board (PCB). Still, algorithms will be trained on an ordinary PC; however, trained algorithms run on the Digital Signal Processor and the FPGA chip. The identical approach will be used for pattern recognition in image analysis of OCT pictures. Some key requirements have to be fulfilled, however. A sufficiently large set of training data, a high signal-to-noise ratio, and an optimized and exact fixation of components are required. The automated testing can be done subsequently by the machine. By integrating the test data of many components along the value chain further optimization including lifetime and durability prediction based on big data becomes possible, even if components are used in different versions or configurations. This is the promise behind German Industry 4.0.
ERIC Educational Resources Information Center
Hagger, Martin S.; Biddle, Stuart J. H.; John Wang, C. K.
2005-01-01
This study tests the generalizability of the factor pattern, structural parameters, and latent mean structure of a multidimensional, hierarchical model of physical self-concept in adolescents across gender and grade. A children's version of the Physical Self-Perception Profile (C-PSPP) was administered to seventh-, eighth- and ninth-grade high…
Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki
2015-04-30
Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. Two principal component analysis (PCA)-based FE, specifically, variational Bayes PCA (VBPCA) was extended to perform unsupervised FE, and together with conventional PCA (CPCA)-based unsupervised FE, were tested as sample classification independent unsupervised FE methods. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data, and a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs were identified that show aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.
NASA Astrophysics Data System (ADS)
Unglert, K.; Radić, V.; Jellinek, A. M.
2016-06-01
Variations in the spectral content of volcano seismicity related to changes in volcanic activity are commonly identified manually in spectrograms. However, long time series of monitoring data at volcano observatories require tools to facilitate automated and rapid processing. Techniques such as self-organizing maps (SOM) and principal component analysis (PCA) can help to quickly and automatically identify important patterns related to impending eruptions. For the first time, we evaluate the performance of SOM and PCA on synthetic volcano seismic spectra constructed from observations during two well-studied eruptions at Klauea Volcano, Hawai'i, that include features observed in many volcanic settings. In particular, our objective is to test which of the techniques can best retrieve a set of three spectral patterns that we used to compose a synthetic spectrogram. We find that, without a priori knowledge of the given set of patterns, neither SOM nor PCA can directly recover the spectra. We thus test hierarchical clustering, a commonly used method, to investigate whether clustering in the space of the principal components and on the SOM, respectively, can retrieve the known patterns. Our clustering method applied to the SOM fails to detect the correct number and shape of the known input spectra. In contrast, clustering of the data reconstructed by the first three PCA modes reproduces these patterns and their occurrence in time more consistently. This result suggests that PCA in combination with hierarchical clustering is a powerful practical tool for automated identification of characteristic patterns in volcano seismic spectra. Our results indicate that, in contrast to PCA, common clustering algorithms may not be ideal to group patterns on the SOM and that it is crucial to evaluate the performance of these tools on a control dataset prior to their application to real data.
Involvement of Working Memory in College Students' Sequential Pattern Learning and Performance
ERIC Educational Resources Information Center
Kundey, Shannon M. A.; De Los Reyes, Andres; Rowan, James D.; Lee, Bern; Delise, Justin; Molina, Sabrina; Cogdill, Lindsay
2013-01-01
When learning highly organized sequential patterns of information, humans and nonhuman animals learn rules regarding the hierarchical structures of these sequences. In three experiments, we explored the role of working memory in college students' sequential pattern learning and performance in a computerized task involving a sequential…
Wang, Meng; Liu, Qian; Zhang, Haoran; Wang, Chuang; Wang, Lei; Xiang, Bingxi; Fan, Yongtao; Guo, Chuan Fei; Ruan, Shuangchen
2017-08-30
Directional water collection has stimulated a great deal of interest because of its potential applications in the field of microfluidics, liquid transportation, fog harvesting, and so forth. There have been some bio or bioinspired structures for directional water collection, from one-dimensional spider silk to two-dimensional star-like patterns to three-dimensional Nepenthes alata. Here we present a simple way for the accurate design and highly controllable driving of tiny droplets: by laser direct writing of hierarchical patterns with modified wettability and desired geometry on a superhydrophobic film, the patterned film can precisely and directionally drive tiny water droplets and dramatically improve the efficiency of water collection with a factor of ∼36 compared with the original superhydrophobic film. Such a patterned film might be an ideal platform for water collection from humid air and for planar microfluidics without tunnels.
Microimaging FT-IR of oral cavity tumours. Part III: Cells, inoculated tissues and human tissues
NASA Astrophysics Data System (ADS)
Conti, C.; Ferraris, P.; Giorgini, E.; Pieramici, T.; Possati, L.; Rocchetti, R.; Rubini, C.; Sabbatini, S.; Tosi, G.; Mariggiò, M. A.; Lo Muzio, L.
2007-05-01
The biochemistry of healthy and tumour cell cultures, inoculated tissues and oral cavity tissues have been studied by FT-IR Microscopy with the aim to relate spectral patterns with microbiological and histopathological findings. 'Supervised' and 'unsupervised' procedures of data handling afforded a satisfactory degree of accordance between spectroscopic and the other two techniques. In particular, changes in frequency and intensity of proteins, connective and nucleic acids vibrational modes as well as the visualization of biochemical single wave number or band ratio images, allowed an evaluation of the pathological changes. The spectroscopic patterns of inoculated tissues resulted quite similar to human tissues; differences of both types of sections with cellular lines could be explained by the influence of the environment.
ERIC Educational Resources Information Center
Evans, Carolyn
2012-01-01
Although social stratification usually calls to mind the hierarchical ranking of individuals, sociology often broadly considers it the ranking of any social objects. The Treiman Socio-Economic Index (SEI), for example, provides a quantitative assessment of the hierarchical ranking of occupations. This dissertation considers the hierarchical…
The Role of Prototype Learning in Hierarchical Models of Vision
ERIC Educational Resources Information Center
Thomure, Michael David
2014-01-01
I conduct a study of learning in HMAX-like models, which are hierarchical models of visual processing in biological vision systems. Such models compute a new representation for an image based on the similarity of image sub-parts to a number of specific patterns, called prototypes. Despite being a central piece of the overall model, the issue of…
Tsakpinoglou, Florence; Poulin, François
2017-10-01
Best friends exert a substantial influence on rising alcohol and marijuana use during adolescence. Two mechanisms occurring within friendship - friend pressure and unsupervised co-deviancy - may partially capture the way friends influence one another. The current study aims to: (1) examine the psychometric properties of a new instrument designed to assess pressure from a youth's best friend and unsupervised co-deviancy; (2) investigate the relative contribution of these processes to alcohol and marijuana use; and (3) determine whether gender moderates these associations. Data were collected through self-report questionnaires completed by 294 Canadian youths (62% female) across two time points (ages 15-16). Principal component analysis yielded a two-factor solution corresponding to friend pressure and unsupervised co-deviancy. Logistic regressions subsequently showed that unsupervised co-deviancy was predictive of an increase in marijuana use one year later. Neither process predicted an increase in alcohol use. Results did not differ as a function of gender. Copyright © 2017 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.
1981-01-01
Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.
Magnetic domain pattern in hierarchically twinned epitaxial Ni-Mn-Ga films.
Diestel, Anett; Neu, Volker; Backen, Anja; Schultz, Ludwig; Fähler, Sebastian
2013-07-03
Magnetic shape memory alloys exhibit a hierarchically twinned microstructure, which has been examined thoroughly in epitaxial Ni-Mn-Ga films. Here we analyze the consequences of this 'twin within twins' microstructure on the magnetic domain pattern. Atomic and magnetic force microscopy are used to probe the correlation between the martensitic microstructure and magnetic domains. We examine the consequences of different twin boundary orientations with respect to the substrate normal as well as variant boundaries between differently aligned twinned laminates. A detailed micromagnetic analysis is given which describes the influence of the finite film thickness on the formation of magnetic band domains in these multiferroic materials.
Biological hierarchies and the nature of extinction.
Congreve, Curtis R; Falk, Amanda R; Lamsdell, James C
2018-05-01
Hierarchy theory recognises that ecological and evolutionary units occur in a nested and interconnected hierarchical system, with cascading effects occurring between hierarchical levels. Different biological disciplines have routinely come into conflict over the primacy of different forcing mechanisms behind evolutionary and ecological change. These disconnects arise partly from differences in perspective (with some researchers favouring ecological forcing mechanisms while others favour developmental/historical mechanisms), as well as differences in the temporal framework in which workers operate. In particular, long-term palaeontological data often show that large-scale (macro) patterns of evolution are predominantly dictated by shifts in the abiotic environment, while short-term (micro) modern biological studies stress the importance of biotic interactions. We propose that thinking about ecological and evolutionary interactions in a hierarchical framework is a fruitful way to resolve these conflicts. Hierarchy theory suggests that changes occurring at lower hierarchical levels can have unexpected, complex effects at higher scales due to emergent interactions between simple systems. In this way, patterns occurring on short- and long-term time scales are equally valid, as changes that are driven from lower levels will manifest in different forms at higher levels. We propose that the dual hierarchy framework fits well with our current understanding of evolutionary and ecological theory. Furthermore, we describe how this framework can be used to understand major extinction events better. Multi-generational attritional loss of reproductive fitness (MALF) has recently been proposed as the primary mechanism behind extinction events, whereby extinction is explainable solely through processes that result in extirpation of populations through a shutdown of reproduction. While not necessarily explicit, the push to explain extinction through solely population-level dynamics could be used to suggest that environmentally mediated patterns of extinction or slowed speciation across geological time are largely artefacts of poor preservation or a coarse temporal scale. We demonstrate how MALF fits into a hierarchical framework, showing that MALF can be a primary forcing mechanism at lower scales that still results in differential survivorship patterns at the species and clade level which vary depending upon the initial environmental forcing mechanism. Thus, even if MALF is the primary mechanism of extinction across all mass extinction events, the primary environmental cause of these events will still affect the system and result in differential responses. Therefore, patterns at both temporal scales are relevant. © 2017 Cambridge Philosophical Society.
Kamali, Tahereh; Stashuk, Daniel
2016-10-01
Robust and accurate segmentation of brain white matter (WM) fiber bundles assists in diagnosing and assessing progression or remission of neuropsychiatric diseases such as schizophrenia, autism and depression. Supervised segmentation methods are infeasible in most applications since generating gold standards is too costly. Hence, there is a growing interest in designing unsupervised methods. However, most conventional unsupervised methods require the number of clusters be known in advance which is not possible in most applications. The purpose of this study is to design an unsupervised segmentation algorithm for brain white matter fiber bundles which can automatically segment fiber bundles using intrinsic diffusion tensor imaging data information without considering any prior information or assumption about data distributions. Here, a new density based clustering algorithm called neighborhood distance entropy consistency (NDEC), is proposed which discovers natural clusters within data by simultaneously utilizing both local and global density information. The performance of NDEC is compared with other state of the art clustering algorithms including chameleon, spectral clustering, DBSCAN and k-means using Johns Hopkins University publicly available diffusion tensor imaging data. The performance of NDEC and other employed clustering algorithms were evaluated using dice ratio as an external evaluation criteria and density based clustering validation (DBCV) index as an internal evaluation metric. Across all employed clustering algorithms, NDEC obtained the highest average dice ratio (0.94) and DBCV value (0.71). NDEC can find clusters with arbitrary shapes and densities and consequently can be used for WM fiber bundle segmentation where there is no distinct boundary between various bundles. NDEC may also be used as an effective tool in other pattern recognition and medical diagnostic systems in which discovering natural clusters within data is a necessity. Copyright © 2016 Elsevier B.V. All rights reserved.
Patterned Diblock Co-Polymer Thin Films as Templates for Advanced Anisotropic Metal Nanostructures.
Roth, Stephan V; Santoro, Gonzalo; Risch, Johannes F H; Yu, Shun; Schwartzkopf, Matthias; Boese, Torsten; Döhrmann, Ralph; Zhang, Peng; Besner, Bastian; Bremer, Philipp; Rukser, Dieter; Rübhausen, Michael A; Terrill, Nick J; Staniec, Paul A; Yao, Yuan; Metwalli, Ezzeldin; Müller-Buschbaum, Peter
2015-06-17
We demonstrate glancing-angle deposition of gold on a nanostructured diblock copolymer, namely polystyrene-block-poly(methyl methacrylate) thin film. Exploiting the selective wetting of gold on the polystyrene block, we are able to fabricate directional hierarchical structures. We prove the asymmetric growth of the gold nanoparticles and are able to extract the different growth laws by in situ scattering methods. The optical anisotropy of these hierarchical hybrid materials is further probed by angular resolved spectroscopic methods. This approach enables us to tailor functional hierarchical layers in nanodevices, such as nanoantennae arrays, organic photovoltaics, and sensor electronics.
Novel approaches are needed for discovery of targeted therapies for non-small-cell lung cancer (NSCLC) that are specific to certain patients. Whole genome RNAi screening of lung cancer cell lines provides an ideal source for determining candidate drug targets. Unsupervised learning algorithms uncovered patterns of differential vulnerability across lung cancer cell lines to loss of functionally related genes. Such genetic vulnerabilities represent candidate targets for therapy and are found to be involved in splicing, translation and protein folding.
Chartier, Sylvain; Proulx, Robert
2005-11-01
This paper presents a new unsupervised attractor neural network, which, contrary to optimal linear associative memory models, is able to develop nonbipolar attractors as well as bipolar attractors. Moreover, the model is able to develop less spurious attractors and has a better recall performance under random noise than any other Hopfield type neural network. Those performances are obtained by a simple Hebbian/anti-Hebbian online learning rule that directly incorporates feedback from a specific nonlinear transmission rule. Several computer simulations show the model's distinguishing properties.
Analytical aspects of plant metabolite profiling platforms: current standings and future aims.
Seger, Christoph; Sturm, Sonja
2007-02-01
Over the past years, metabolic profiling has been established as a comprehensive systems biology tool. Mass spectrometry or NMR spectroscopy-based technology platforms combined with unsupervised or supervised multivariate statistical methodologies allow a deep insight into the complex metabolite patterns of plant-derived samples. Within this review, we provide a thorough introduction to the analytical hard- and software requirements of metabolic profiling platforms. Methodological limitations are addressed, and the metabolic profiling workflow is exemplified by summarizing recent applications ranging from model systems to more applied topics.
Using Machine Learning Techniques in the Analysis of Oceanographic Data
NASA Astrophysics Data System (ADS)
Falcinelli, K. E.; Abuomar, S.
2017-12-01
Acoustic Doppler Current Profilers (ADCPs) are oceanographic tools capable of collecting large amounts of current profile data. Using unsupervised machine learning techniques such as principal component analysis, fuzzy c-means clustering, and self-organizing maps, patterns and trends in an ADCP dataset are found. Cluster validity algorithms such as visual assessment of cluster tendency and clustering index are used to determine the optimal number of clusters in the ADCP dataset. These techniques prove to be useful in analysis of ADCP data and demonstrate potential for future use in other oceanographic applications.
ERIC Educational Resources Information Center
Dagne, Getachew A.; Brown, C. Hendricks; Howe, George W.
2007-01-01
This article presents new methods for modeling the strength of association between multiple behaviors in a behavioral sequence, particularly those involving substantively important interaction patterns. Modeling and identifying such interaction patterns becomes more complex when behaviors are assigned to more than two categories, as is the case…
Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara
2017-08-02
This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Columbia, Java, Trinidad, and Sao Tomè) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveal informative patterns of similarities and differences and identify characteristic compounds related to sample origin and manufacturing step.
A Dictionary Approach to Electron Backscatter Diffraction Indexing.
Chen, Yu H; Park, Se Un; Wei, Dennis; Newstadt, Greg; Jackson, Michael A; Simmons, Jeff P; De Graef, Marc; Hero, Alfred O
2015-06-01
We propose a framework for indexing of grain and subgrain structures in electron backscatter diffraction patterns of polycrystalline materials. We discretize the domain of a dynamical forward model onto a dense grid of orientations, producing a dictionary of patterns. For each measured pattern, we identify the most similar patterns in the dictionary, and identify boundaries, detect anomalies, and index crystal orientations. The statistical distribution of these closest matches is used in an unsupervised binary decision tree (DT) classifier to identify grain boundaries and anomalous regions. The DT classifies a pattern as an anomaly if it has an abnormally low similarity to any pattern in the dictionary. It classifies a pixel as being near a grain boundary if the highly ranked patterns in the dictionary differ significantly over the pixel's neighborhood. Indexing is accomplished by computing the mean orientation of the closest matches to each pattern. The mean orientation is estimated using a maximum likelihood approach that models the orientation distribution as a mixture of Von Mises-Fisher distributions over the quaternionic three sphere. The proposed dictionary matching approach permits segmentation, anomaly detection, and indexing to be performed in a unified manner with the additional benefit of uncertainty quantification.
Use of space-time models to investigate the stability of patterns of disease.
Abellan, Juan Jose; Richardson, Sylvia; Best, Nicky
2008-08-01
The use of Bayesian hierarchical spatial models has become widespread in disease mapping and ecologic studies of health-environment associations. In this type of study, the data are typically aggregated over an extensive time period, thus neglecting the time dimension. The output of purely spatial disease mapping studies is therefore the average spatial pattern of risk over the period analyzed, but the results do not inform about, for example, whether a high average risk was sustained over time or changed over time. We investigated how including the time dimension in disease-mapping models strengthens the epidemiologic interpretation of the overall pattern of risk. We discuss a class of Bayesian hierarchical models that simultaneously characterize and estimate the stable spatial and temporal patterns as well as departures from these stable components. We show how useful rules for classifying areas as stable can be constructed based on the posterior distribution of the space-time interactions. We carry out a simulation study to investigate the sensitivity and specificity of the decision rules we propose, and we illustrate our approach in a case study of congenital anomalies in England. Our results confirm that extending hierarchical disease-mapping models to models that simultaneously consider space and time leads to a number of benefits in terms of interpretation and potential for detection of localized excesses.
NASA Astrophysics Data System (ADS)
von Fischer, J. C.; Koyama, A.; Johnson, N. G.; Webb, C. T.
2015-12-01
Scaling problems abound in biogeochemistry. At the finest scale, soil microbes experience habitats and environmental changes that affect the chemical transformations of interest. We collect the DNA of these organisms from sites across landscapes and note differences in who is there, and we seek to evaluate why group membership changes in space (biogeography) and why activity rates change over time (physiology). The goal of efforts at finer scales is often to better predict patterns at larger scales. We conducted such a hierarchical examination of methane uptake in the Great Plains grasslands of North America, gathering data from 22 plots at 8 field locations, scattered from South Dakota to New Mexico and Colorado to Kansas. Our work provides insight into methanotroph biogeochemistry at all of these scales. For example, we found that methane uptake rates vary mostly due to the methanotroph activity, and less so due to diffusivity. A combination of field and lab observations reveal that methanotroph communities differ in their sensitivity to soil moisture and to ammonium (an inhibitor of methanotrophy). Examination of methanotroph community composition reveals tantalizing patterns in composition, dominance and richness across sites, that appears to be structured by patterns of precipitation and soil texture. We anticipate that greater synthesis of these hierarchical findings will paint a richer picture of methanotroph life and enable improved prediction of methane uptake at regional scales.
Age-Related Change in Shifting Attention between Global and Local Levels of Hierarchical Stimuli
ERIC Educational Resources Information Center
Huizinga, Mariette; Burack, Jacob A.; Van der Molen, Maurits W.
2010-01-01
The focus of this study was the developmental pattern of the ability to shift attention between global and local levels of hierarchical stimuli. Children aged 7 years and 11 years and 21-year-old adults were administered a task (two experiments) that allowed for the examination of 1) the direction of attention to global or local stimulus levels;…
Liu, Gui-Song; Guo, Hao-Song; Pan, Tao; Wang, Ji-Hua; Cao, Gan
2014-10-01
Based on Savitzky-Golay (SG) smoothing screening, principal component analysis (PCA) combined with separately supervised linear discriminant analysis (LDA) and unsupervised hierarchical clustering analysis (HCA) were used for non-destructive visible and near-infrared (Vis-NIR) detection for breed screening of transgenic sugarcane. A random and stability-dependent framework of calibration, prediction, and validation was proposed. A total of 456 samples of sugarcane leaves planting in the elongating stage were collected from the field, which was composed of 306 transgenic (positive) samples containing Bt and Bar gene and 150 non-transgenic (negative) samples. A total of 156 samples (negative 50 and positive 106) were randomly selected as the validation set; the remaining samples (negative 100 and positive 200, a total of 300 samples) were used as the modeling set, and then the modeling set was subdivided into calibration (negative 50 and positive 100, a total of 150 samples) and prediction sets (negative 50 and positive 100, a total of 150 samples) for 50 times. The number of SG smoothing points was ex- panded, while some modes of higher derivative were removed because of small absolute value, and a total of 264 smoothing modes were used for screening. The pairwise combinations of first three principal components were used, and then the optimal combination of principal components was selected according to the model effect. Based on all divisions of calibration and prediction sets and all SG smoothing modes, the SG-PCA-LDA and SG-PCA-HCA models were established, the model parameters were optimized based on the average prediction effect for all divisions to produce modeling stability. Finally, the model validation was performed by validation set. With SG smoothing, the modeling accuracy and stability of PCA-LDA, PCA-HCA were signif- icantly improved. For the optimal SG-PCA-LDA model, the recognition rate of positive and negative validation samples were 94.3%, 96.0%; and were 92.5%, 98.0% for the optimal SG-PCA-LDA model, respectively. Vis-NIR spectro- scopic pattern recognition combined with SG smoothing could be used for accurate recognition of transgenic sugarcane leaves, and provided a convenient screening method for transgenic sugarcane breeding.
NASA Technical Reports Server (NTRS)
Jung, Jinha; Pasolli, Edoardo; Prasad, Saurabh; Tilton, James C.; Crawford, Melba M.
2014-01-01
Acquiring current, accurate land-use information is critical for monitoring and understanding the impact of anthropogenic activities on natural environments.Remote sensing technologies are of increasing importance because of their capability to acquire information for large areas in a timely manner, enabling decision makers to be more effective in complex environments. Although optical imagery has demonstrated to be successful for land cover classification, active sensors, such as light detection and ranging (LiDAR), have distinct capabilities that can be exploited to improve classification results. However, utilization of LiDAR data for land cover classification has not been fully exploited. Moreover, spatial-spectral classification has recently gained significant attention since classification accuracy can be improved by extracting additional information from the neighboring pixels. Although spatial information has been widely used for spectral data, less attention has been given to LiDARdata. In this work, a new framework for land cover classification using discrete return LiDAR data is proposed. Pseudo-waveforms are generated from the LiDAR data and processed by hierarchical segmentation. Spatial featuresare extracted in a region-based way using a new unsupervised strategy for multiple pruning of the segmentation hierarchy. The proposed framework is validated experimentally on a real dataset acquired in an urban area. Better classification results are exhibited by the proposed framework compared to the cases in which basic LiDAR products such as digital surface model and intensity image are used. Moreover, the proposed region-based feature extraction strategy results in improved classification accuracies in comparison with a more traditional window-based approach.
Highly efficient classification and identification of human pathogenic bacteria by MALDI-TOF MS.
Hsieh, Sen-Yung; Tseng, Chiao-Li; Lee, Yun-Shien; Kuo, An-Jing; Sun, Chien-Feng; Lin, Yen-Hsiu; Chen, Jen-Kun
2008-02-01
Accurate and rapid identification of pathogenic microorganisms is of critical importance in disease treatment and public health. Conventional work flows are time-consuming, and procedures are multifaceted. MS can be an alternative but is limited by low efficiency for amino acid sequencing as well as low reproducibility for spectrum fingerprinting. We systematically analyzed the feasibility of applying MS for rapid and accurate bacterial identification. Directly applying bacterial colonies without further protein extraction to MALDI-TOF MS analysis revealed rich peak contents and high reproducibility. The MS spectra derived from 57 isolates comprising six human pathogenic bacterial species were analyzed using both unsupervised hierarchical clustering and supervised model construction via the Genetic Algorithm. Hierarchical clustering analysis categorized the spectra into six groups precisely corresponding to the six bacterial species. Precise classification was also maintained in an independently prepared set of bacteria even when the numbers of m/z values were reduced to six. In parallel, classification models were constructed via Genetic Algorithm analysis. A model containing 18 m/z values accurately classified independently prepared bacteria and identified those species originally not used for model construction. Moreover bacteria fewer than 10(4) cells and different species in bacterial mixtures were identified using the classification model approach. In conclusion, the application of MALDI-TOF MS in combination with a suitable model construction provides a highly accurate method for bacterial classification and identification. The approach can identify bacteria with low abundance even in mixed flora, suggesting that a rapid and accurate bacterial identification using MS techniques even before culture can be attained in the near future.
Supervised versus unsupervised categorization: two sides of the same coin?
Pothos, Emmanuel M; Edwards, Darren J; Perlman, Amotz
2011-09-01
Supervised and unsupervised categorization have been studied in separate research traditions. A handful of studies have attempted to explore a possible convergence between the two. The present research builds on these studies, by comparing the unsupervised categorization results of Pothos et al. ( 2011 ; Pothos et al., 2008 ) with the results from two procedures of supervised categorization. In two experiments, we tested 375 participants with nine different stimulus sets and examined the relation between ease of learning of a classification, memory for a classification, and spontaneous preference for a classification. After taking into account the role of the number of category labels (clusters) in supervised learning, we found the three variables to be closely associated with each other. Our results provide encouragement for researchers seeking unified theoretical explanations for supervised and unsupervised categorization, but raise a range of challenging theoretical questions.
Link, W.A.; Sauer, J.R.; Niven, D.K.
2006-01-01
Analysis of Christmas Bird Count (CBC) data is complicated by the need to account for variation in effort on counts and to provide summaries over large geographic regions. We describe a hierarchical model for analysis of population change using CBC data that addresses these needs. The effect of effort is modeled parametrically, with parameter values varying among strata as identically distributed random effects. Year and site effects are modeled hierarchically, accommodating large regional variation in number of samples and precision of estimates. The resulting model is complex, but a Bayesian analysis can be conducted using Markov chain Monte Carlo techniques. We analyze CBC data for American Black Ducks (Anas rubripes), a species of considerable management interest that has historically been monitored using winter surveys. Over the interval 1966-2003, Black Duck populations showed distinct regional patterns of population change. The patterns shown by CBC data are similar to those shown by the Midwinter Waterfowl Inventory for the United States.
A neural network with modular hierarchical learning
NASA Technical Reports Server (NTRS)
Baldi, Pierre F. (Inventor); Toomarian, Nikzad (Inventor)
1994-01-01
This invention provides a new hierarchical approach for supervised neural learning of time dependent trajectories. The modular hierarchical methodology leads to architectures which are more structured than fully interconnected networks. The networks utilize a general feedforward flow of information and sparse recurrent connections to achieve dynamic effects. The advantages include the sparsity of units and connections, the modular organization. A further advantage is that the learning is much more circumscribed learning than in fully interconnected systems. The present invention is embodied by a neural network including a plurality of neural modules each having a pre-established performance capability wherein each neural module has an output outputting present results of the performance capability and an input for changing the present results of the performance capabilitiy. For pattern recognition applications, the performance capability may be an oscillation capability producing a repeating wave pattern as the present results. In the preferred embodiment, each of the plurality of neural modules includes a pre-established capability portion and a performance adjustment portion connected to control the pre-established capability portion.
A new local-global approach for classification.
Peres, R T; Pedreira, C E
2010-09-01
In this paper, we propose a new local-global pattern classification scheme that combines supervised and unsupervised approaches, taking advantage of both, local and global environments. We understand as global methods the ones concerned with the aim of constructing a model for the whole problem space using the totality of the available observations. Local methods focus into sub regions of the space, possibly using an appropriately selected subset of the sample. In the proposed method, the sample is first divided in local cells by using a Vector Quantization unsupervised algorithm, the LBG (Linde-Buzo-Gray). In a second stage, the generated assemblage of much easier problems is locally solved with a scheme inspired by Bayes' rule. Four classification methods were implemented for comparison purposes with the proposed scheme: Learning Vector Quantization (LVQ); Feedforward Neural Networks; Support Vector Machine (SVM) and k-Nearest Neighbors. These four methods and the proposed scheme were implemented in eleven datasets, two controlled experiments, plus nine public available datasets from the UCI repository. The proposed method has shown a quite competitive performance when compared to these classical and largely used classifiers. Our method is simple concerning understanding and implementation and is based on very intuitive concepts. Copyright 2010 Elsevier Ltd. All rights reserved.
Unsupervised learning of structure in spectroscopic cubes
NASA Astrophysics Data System (ADS)
Araya, M.; Mendoza, M.; Solar, M.; Mardones, D.; Bayo, A.
2018-07-01
We consider the problem of analyzing the structure of spectroscopic cubes using unsupervised machine learning techniques. We propose representing the target's signal as a homogeneous set of volumes through an iterative algorithm that separates the structured emission from the background while not overestimating the flux. Besides verifying some basic theoretical properties, the algorithm is designed to be tuned by domain experts, because its parameters have meaningful values in the astronomical context. Nevertheless, we propose a heuristic to automatically estimate the signal-to-noise ratio parameter of the algorithm directly from data. The resulting light-weighted set of samples (≤ 1% compared to the original data) offer several advantages. For instance, it is statistically correct and computationally inexpensive to apply well-established techniques of the pattern recognition and machine learning domains; such as clustering and dimensionality reduction algorithms. We use ALMA science verification data to validate our method, and present examples of the operations that can be performed by using the proposed representation. Even though this approach is focused on providing faster and better analysis tools for the end-user astronomer, it also opens the possibility of content-aware data discovery by applying our algorithm to big data.
NASA Astrophysics Data System (ADS)
Omenzetter, Piotr; de Lautour, Oliver R.
2010-04-01
Developed for studying long, periodic records of various measured quantities, time series analysis methods are inherently suited and offer interesting possibilities for Structural Health Monitoring (SHM) applications. However, their use in SHM can still be regarded as an emerging application and deserves more studies. In this research, Autoregressive (AR) models were used to fit experimental acceleration time histories from two experimental structural systems, a 3- storey bookshelf-type laboratory structure and the ASCE Phase II SHM Benchmark Structure, in healthy and several damaged states. The coefficients of the AR models were chosen as damage sensitive features. Preliminary visual inspection of the large, multidimensional sets of AR coefficients to check the presence of clusters corresponding to different damage severities was achieved using Sammon mapping - an efficient nonlinear data compression technique. Systematic classification of damage into states based on the analysis of the AR coefficients was achieved using two supervised classification techniques: Nearest Neighbor Classification (NNC) and Learning Vector Quantization (LVQ), and one unsupervised technique: Self-organizing Maps (SOM). This paper discusses the performance of AR coefficients as damage sensitive features and compares the efficiency of the three classification techniques using experimental data.
Mastication Evaluation With Unsupervised Learning: Using an Inertial Sensor-Based System.
Lucena, Caroline Vieira; Lacerda, Marcelo; Caldas, Rafael; De Lima Neto, Fernando Buarque; Rativa, Diego
2018-01-01
There is a direct relationship between the prevalence of musculoskeletal disorders of the temporomandibular joint and orofacial disorders. A well-elaborated analysis of the jaw movements provides relevant information for healthcare professionals to conclude their diagnosis. Different approaches have been explored to track jaw movements such that the mastication analysis is getting less subjective; however, all methods are still highly subjective, and the quality of the assessments depends much on the experience of the health professional. In this paper, an accurate and non-invasive method based on a commercial low-cost inertial sensor (MPU6050) to measure jaw movements is proposed. The jaw-movement feature values are compared to the obtained with clinical analysis, showing no statistically significant difference between both methods. Moreover, We propose to use unsupervised paradigm approaches to cluster mastication patterns of healthy subjects and simulated patients with facial trauma. Two techniques were used in this paper to instantiate the method: Kohonen's Self-Organizing Maps and K-Means Clustering. Both algorithms have excellent performances to process jaw-movements data, showing encouraging results and potential to bring a full assessment of the masticatory function. The proposed method can be applied in real-time providing relevant dynamic information for health-care professionals.
Chen, Chien-Chang; Juan, Hung-Hui; Tsai, Meng-Yuan; Lu, Henry Horng-Shing
2018-01-11
By introducing the methods of machine learning into the density functional theory, we made a detour for the construction of the most probable density function, which can be estimated by learning relevant features from the system of interest. Using the properties of universal functional, the vital core of density functional theory, the most probable cluster numbers and the corresponding cluster boundaries in a studying system can be simultaneously and automatically determined and the plausibility is erected on the Hohenberg-Kohn theorems. For the method validation and pragmatic applications, interdisciplinary problems from physical to biological systems were enumerated. The amalgamation of uncharged atomic clusters validated the unsupervised searching process of the cluster numbers and the corresponding cluster boundaries were exhibited likewise. High accurate clustering results of the Fisher's iris dataset showed the feasibility and the flexibility of the proposed scheme. Brain tumor detections from low-dimensional magnetic resonance imaging datasets and segmentations of high-dimensional neural network imageries in the Brainbow system were also used to inspect the method practicality. The experimental results exhibit the successful connection between the physical theory and the machine learning methods and will benefit the clinical diagnoses.
Supervised and Unsupervised Learning Technology in the Study of Rodent Behavior
Gris, Katsiaryna V.; Coutu, Jean-Philippe; Gris, Denis
2017-01-01
Quantifying behavior is a challenge for scientists studying neuroscience, ethology, psychology, pathology, etc. Until now, behavior was mostly considered as qualitative descriptions of postures or labor intensive counting of bouts of individual movements. Many prominent behavioral scientists conducted studies describing postures of mice and rats, depicting step by step eating, grooming, courting, and other behaviors. Automated video assessment technologies permit scientists to quantify daily behavioral patterns/routines, social interactions, and postural changes in an unbiased manner. Here, we extensively reviewed published research on the topic of the structural blocks of behavior and proposed a structure of behavior based on the latest publications. We discuss the importance of defining a clear structure of behavior to allow professionals to write viable algorithms. We presented a discussion of technologies that are used in automated video assessment of behavior in mice and rats. We considered advantages and limitations of supervised and unsupervised learning. We presented the latest scientific discoveries that were made using automated video assessment. In conclusion, we proposed that the automated quantitative approach to evaluating animal behavior is the future of understanding the effect of brain signaling, pathologies, genetic content, and environment on behavior. PMID:28804452
A taxonomy of epithelial human cancer and their metastases
2009-01-01
Background Microarray technology has allowed to molecularly characterize many different cancer sites. This technology has the potential to individualize therapy and to discover new drug targets. However, due to technological differences and issues in standardized sample collection no study has evaluated the molecular profile of epithelial human cancer in a large number of samples and tissues. Additionally, it has not yet been extensively investigated whether metastases resemble their tissue of origin or tissue of destination. Methods We studied the expression profiles of a series of 1566 primary and 178 metastases by unsupervised hierarchical clustering. The clustering profile was subsequently investigated and correlated with clinico-pathological data. Statistical enrichment of clinico-pathological annotations of groups of samples was investigated using Fisher exact test. Gene set enrichment analysis (GSEA) and DAVID functional enrichment analysis were used to investigate the molecular pathways. Kaplan-Meier survival analysis and log-rank tests were used to investigate prognostic significance of gene signatures. Results Large clusters corresponding to breast, gastrointestinal, ovarian and kidney primary tissues emerged from the data. Chromophobe renal cell carcinoma clustered together with follicular differentiated thyroid carcinoma, which supports recent morphological descriptions of thyroid follicular carcinoma-like tumors in the kidney and suggests that they represent a subtype of chromophobe carcinoma. We also found an expression signature identifying primary tumors of squamous cell histology in multiple tissues. Next, a subset of ovarian tumors enriched with endometrioid histology clustered together with endometrium tumors, confirming that they share their etiopathogenesis, which strongly differs from serous ovarian tumors. In addition, the clustering of colon and breast tumors correlated with clinico-pathological characteristics. Moreover, a signature was developed based on our unsupervised clustering of breast tumors and this was predictive for disease-specific survival in three independent studies. Next, the metastases from ovarian, breast, lung and vulva cluster with their tissue of origin while metastases from colon showed a bimodal distribution. A significant part clusters with tissue of origin while the remaining tumors cluster with the tissue of destination. Conclusion Our molecular taxonomy of epithelial human cancer indicates surprising correlations over tissues. This may have a significant impact on the classification of many cancer sites and may guide pathologists, both in research and daily practice. Moreover, these results based on unsupervised analysis yielded a signature predictive of clinical outcome in breast cancer. Additionally, we hypothesize that metastases from gastrointestinal origin either remember their tissue of origin or adapt to the tissue of destination. More specifically, colon metastases in the liver show strong evidence for such a bimodal tissue specific profile. PMID:20017941
Unsupervised automated high throughput phenotyping of RNAi time-lapse movies.
Failmezger, Henrik; Fröhlich, Holger; Tresch, Achim
2013-10-04
Gene perturbation experiments in combination with fluorescence time-lapse cell imaging are a powerful tool in reverse genetics. High content applications require tools for the automated processing of the large amounts of data. These tools include in general several image processing steps, the extraction of morphological descriptors, and the grouping of cells into phenotype classes according to their descriptors. This phenotyping can be applied in a supervised or an unsupervised manner. Unsupervised methods are suitable for the discovery of formerly unknown phenotypes, which are expected to occur in high-throughput RNAi time-lapse screens. We developed an unsupervised phenotyping approach based on Hidden Markov Models (HMMs) with multivariate Gaussian emissions for the detection of knockdown-specific phenotypes in RNAi time-lapse movies. The automated detection of abnormal cell morphologies allows us to assign a phenotypic fingerprint to each gene knockdown. By applying our method to the Mitocheck database, we show that a phenotypic fingerprint is indicative of a gene's function. Our fully unsupervised HMM-based phenotyping is able to automatically identify cell morphologies that are specific for a certain knockdown. Beyond the identification of genes whose knockdown affects cell morphology, phenotypic fingerprints can be used to find modules of functionally related genes.
An Efficient Optimization Method for Solving Unsupervised Data Classification Problems.
Shabanzadeh, Parvaneh; Yusof, Rubiyah
2015-01-01
Unsupervised data classification (or clustering) analysis is one of the most useful tools and a descriptive task in data mining that seeks to classify homogeneous groups of objects based on similarity and is used in many medical disciplines and various applications. In general, there is no single algorithm that is suitable for all types of data, conditions, and applications. Each algorithm has its own advantages, limitations, and deficiencies. Hence, research for novel and effective approaches for unsupervised data classification is still active. In this paper a heuristic algorithm, Biogeography-Based Optimization (BBO) algorithm, was adapted for data clustering problems by modifying the main operators of BBO algorithm, which is inspired from the natural biogeography distribution of different species. Similar to other population-based algorithms, BBO algorithm starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them. To evaluate the performance of the proposed algorithm assessment was carried on six medical and real life datasets and was compared with eight well known and recent unsupervised data classification algorithms. Numerical results demonstrate that the proposed evolutionary optimization algorithm is efficient for unsupervised data classification.
Wang, Wei; Ackland, David C; McClelland, Jodie A; Webster, Kate E; Halgamuge, Saman
2018-01-01
Quantitative gait analysis is an important tool in objective assessment and management of total knee arthroplasty (TKA) patients. Studies evaluating gait patterns in TKA patients have tended to focus on discrete data such as spatiotemporal information, joint range of motion and peak values of kinematics and kinetics, or consider selected principal components of gait waveforms for analysis. These strategies may not have the capacity to capture small variations in gait patterns associated with each joint across an entire gait cycle, and may ultimately limit the accuracy of gait classification. The aim of this study was to develop an automatic feature extraction method to analyse patterns from high-dimensional autocorrelated gait waveforms. A general linear feature extraction framework was proposed and a hierarchical partial least squares method derived for discriminant analysis of multiple gait waveforms. The effectiveness of this strategy was verified using a dataset of joint angle and ground reaction force waveforms from 43 patients after TKA surgery and 31 healthy control subjects. Compared with principal component analysis and partial least squares methods, the hierarchical partial least squares method achieved generally better classification performance on all possible combinations of waveforms, with the highest classification accuracy . The novel hierarchical partial least squares method proposed is capable of capturing virtually all significant differences between TKA patients and the controls, and provides new insights into data visualization. The proposed framework presents a foundation for more rigorous classification of gait, and may ultimately be used to evaluate the effects of interventions such as surgery and rehabilitation.
Brown, Laura J E; Adlam, Tim; Hwang, Faustina; Khadra, Hassan; Maclean, Linda M; Rudd, Bridey; Smith, Tom; Timon, Claire; Williams, Elizabeth A; Astell, Arlene J
2016-08-01
Patterns of cognitive change over micro-longitudinal timescales (i.e., ranging from hours to days) are associated with a wide range of age-related health and functional outcomes. However, practical issues of conducting high-frequency assessments make investigations of micro-longitudinal cognition costly and burdensome to run. One way of addressing this is to develop cognitive assessments that can be performed by older adults, in their own homes, without a researcher being present. Here, we address the question of whether reliable and valid cognitive data can be collected over micro-longitudinal timescales using unsupervised cognitive tests.In study 1, 48 older adults completed two touchscreen cognitive tests, on three occasions, in controlled conditions, alongside a battery of standard tests of cognitive functions. In study 2, 40 older adults completed the same two computerized tasks on multiple occasions, over three separate week-long periods, in their own homes, without a researcher present. Here, the tasks were incorporated into a wider touchscreen system (Novel Assessment of Nutrition and Ageing (NANA)) developed to assess multiple domains of health and behavior. Standard tests of cognitive function were also administered prior to participants using the NANA system.Performance on the two "NANA" cognitive tasks showed convergent validity with, and similar levels of reliability to, the standard cognitive battery in both studies. Completion and accuracy rates were also very high. These results show that reliable and valid cognitive data can be collected from older adults using unsupervised computerized tests, thus affording new opportunities for the investigation of cognitive.
Unsupervised classification of cirrhotic livers using MRI data
NASA Astrophysics Data System (ADS)
Lee, Gobert; Kanematsu, Masayuki; Kato, Hiroki; Kondo, Hiroshi; Zhou, Xiangrong; Hara, Takeshi; Fujita, Hiroshi; Hoshi, Hiroaki
2008-03-01
Cirrhosis of the liver is a chronic disease. It is characterized by the presence of widespread nodules and fibrosis in the liver which results in characteristic texture patterns. Computerized analysis of hepatic texture patterns is usually based on regions-of-interest (ROIs). However, not all ROIs are typical representatives of the disease stage of the liver from which the ROIs originated. This leads to uncertainties in the ROI labels (diseased or non-diseased). On the other hand, supervised classifiers are commonly used in determining the assignment rule. This presents a problem as the training of a supervised classifier requires the correct labels of the ROIs. The main purpose of this paper is to investigate the use of an unsupervised classifier, the k-means clustering, in classifying ROI based data. In addition, a procedure for generating a receiver operating characteristic (ROC) curve depicting the classification performance of k-means clustering is also reported. Hepatic MRI images of 44 patients (16 cirrhotic; 28 non-cirrhotic) are used in this study. The MRI data are derived from gadolinium-enhanced equilibrium phase images. For each patient, 10 ROIs selected by an experienced radiologist and 7 texture features measured on each ROI are included in the MRI data. Results of the k-means classifier are depicted using an ROC curve. The area under the curve (AUC) has a value of 0.704. This is slightly lower than but comparable to that of LDA and ANN classifiers which have values 0.781 and 0.801, respectively. Methods in constructing ROC curve in relation to k-means clustering have not been previously reported in the literature.
Unsupervised learning of digit recognition using spike-timing-dependent plasticity
Diehl, Peter U.; Cook, Matthew
2015-01-01
In order to understand how the mammalian neocortex is performing computations, two things are necessary; we need to have a good understanding of the available neuronal processing units and mechanisms, and we need to gain a better understanding of how those mechanisms are combined to build functioning systems. Therefore, in recent years there is an increasing interest in how spiking neural networks (SNN) can be used to perform complex computations or solve pattern recognition tasks. However, it remains a challenging task to design SNNs which use biologically plausible mechanisms (especially for learning new patterns), since most such SNN architectures rely on training in a rate-based network and subsequent conversion to a SNN. We present a SNN for digit recognition which is based on mechanisms with increased biological plausibility, i.e., conductance-based instead of current-based synapses, spike-timing-dependent plasticity with time-dependent weight change, lateral inhibition, and an adaptive spiking threshold. Unlike most other systems, we do not use a teaching signal and do not present any class labels to the network. Using this unsupervised learning scheme, our architecture achieves 95% accuracy on the MNIST benchmark, which is better than previous SNN implementations without supervision. The fact that we used no domain-specific knowledge points toward the general applicability of our network design. Also, the performance of our network scales well with the number of neurons used and shows similar performance for four different learning rules, indicating robustness of the full combination of mechanisms, which suggests applicability in heterogeneous biological neural networks. PMID:26941637
Bae, Hyoung Won; Rho, Seungsoo; Lee, Hye Sun; Lee, Naeun; Hong, Samin; Seong, Gong Je; Sung, Kyung Rim; Kim, Chan Yun
2014-04-29
To classify medically treated open-angle glaucoma (OAG) by the pattern of progression using hierarchical cluster analysis, and to determine OAG progression characteristics by comparing clusters. Ninety-five eyes of 95 OAG patients who received medical treatment, and who had undergone visual field (VF) testing at least once per year for 5 or more years. OAG was classified into subgroups using hierarchical cluster analysis based on the following five variables: baseline mean deviation (MD), baseline visual field index (VFI), MD slope, VFI slope, and Glaucoma Progression Analysis (GPA) printout. After that, other parameters were compared between clusters. Two clusters were made after a hierarchical cluster analysis. Cluster 1 showed -4.06 ± 2.43 dB baseline MD, 92.58% ± 6.27% baseline VFI, -0.28 ± 0.38 dB per year MD slope, -0.52% ± 0.81% per year VFI slope, and all "no progression" cases in GPA printout, whereas cluster 2 showed -8.68 ± 3.81 baseline MD, 77.54 ± 12.98 baseline VFI, -0.72 ± 0.55 MD slope, -2.22 ± 1.89 VFI slope, and seven "possible" and four "likely" progression cases in GPA printout. There were no significant differences in age, sex, mean IOP, central corneal thickness, and axial length between clusters. However, cluster 2 included more high-tension glaucoma patients and used a greater number of antiglaucoma eye drops significantly compared with cluster 1. Hierarchical cluster analysis of progression patterns divided OAG into slow and fast progression groups, evidenced by assessing the parameters of glaucomatous progression in VF testing. In the fast progression group, the prevalence of high-tension glaucoma was greater and the number of antiglaucoma medications administered was increased versus the slow progression group. Copyright 2014 The Association for Research in Vision and Ophthalmology, Inc.
Amis, Gregory P; Carpenter, Gail A
2010-03-01
Computational models of learning typically train on labeled input patterns (supervised learning), unlabeled input patterns (unsupervised learning), or a combination of the two (semi-supervised learning). In each case input patterns have a fixed number of features throughout training and testing. Human and machine learning contexts present additional opportunities for expanding incomplete knowledge from formal training, via self-directed learning that incorporates features not previously experienced. This article defines a new self-supervised learning paradigm to address these richer learning contexts, introducing a neural network called self-supervised ARTMAP. Self-supervised learning integrates knowledge from a teacher (labeled patterns with some features), knowledge from the environment (unlabeled patterns with more features), and knowledge from internal model activation (self-labeled patterns). Self-supervised ARTMAP learns about novel features from unlabeled patterns without destroying partial knowledge previously acquired from labeled patterns. A category selection function bases system predictions on known features, and distributed network activation scales unlabeled learning to prediction confidence. Slow distributed learning on unlabeled patterns focuses on novel features and confident predictions, defining classification boundaries that were ambiguous in the labeled patterns. Self-supervised ARTMAP improves test accuracy on illustrative low-dimensional problems and on high-dimensional benchmarks. Model code and benchmark data are available from: http://techlab.eu.edu/SSART/. Copyright 2009 Elsevier Ltd. All rights reserved.
Unsupervised chunking based on graph propagation from bilingual corpus.
Zhu, Ling; Wong, Derek F; Chao, Lidia S
2014-01-01
This paper presents a novel approach for unsupervised shallow parsing model trained on the unannotated Chinese text of parallel Chinese-English corpus. In this approach, no information of the Chinese side is applied. The exploitation of graph-based label propagation for bilingual knowledge transfer, along with an application of using the projected labels as features in unsupervised model, contributes to a better performance. The experimental comparisons with the state-of-the-art algorithms show that the proposed approach is able to achieve impressive higher accuracy in terms of F-score.
An unsupervised classification technique for multispectral remote sensing data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Cummings, R. E.
1973-01-01
Description of a two-part clustering technique consisting of (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum-likelihood classification techniques.
Unsupervised classification of earth resources data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Jayroe, R. R., Jr.; Cummings, R. E.
1972-01-01
A new clustering technique is presented. It consists of two parts: (a) a sequential statistical clustering which is essentially a sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by existing supervised maximum liklihood classification technique.
New Materials and Methods for Hierarchically Structured Tissue Scaffolds
2005-01-01
to the fabrication of hierarchically structured scaffolds. In order to achieve this goal, photopolymerizable materials must be developed that are... photopolymerizable materials that can also be selectively chemically modified during the SL part building process. This paper provides an update on our work...which uses a laser to "write" patterns into a vat containing a photopolymerizable resin. The first step in performing SL is generating a computer
NASA Astrophysics Data System (ADS)
Jing, Yingqi
2017-07-01
Exploring the relationships between structural rules and their linearization constraints have been a central issue in formal syntax and linguistic typology [1]. Liu et al. give a historical overview of the investigation of dependency distance minimization (DDM) in various fields, and specify its potential connections with the graphic patterns of syntactic structure and the linear ordering of words and constituents in real sentences [2]. This comment focuses on discussing the relations between dependency distance (DD), hierarchical structure and word order, and advocates further study on the coevolution of these traits in language histories.
Leslie, Toby; Rab, Mohammad Abdur; Ahmadzai, Hayat; Durrani, Naeem; Fayaz, Mohammad; Kolaczinski, Jan; Rowland, Mark
2004-03-01
The only available treatment that can eliminate the latent hypnozoite reservoir of vivax malaria is a 14 d course of primaquine (PQ). A potential problem with long-course chemotherapy is the issue of compliance after clinical symptoms have subsided. The present study, carried out at an Afghan refugee camp in Pakistan, between June 2000 and August 2001, compared 14 d treatment in supervised and unsupervised groups in which compliance was monitored by comparison of relapse rates. Clinical cases recruited by passive case detection were randomised by family to placebo, supervised, or unsupervised groups, and treated with chloroquine (25 mg/kg) over 3 days to eliminate erythrocytic stages. Individuals with glucose-6-phosphate dehydrogenase (G6PD) deficiency were excluded from the trial. Cases allocated to supervision were given directly observed treatment (0.25 mg PQ/kg body weight) once per day for 14 days. Cases allocated to the unsupervised group were provided with 14 PQ doses upon enrollment and strongly advised to complete the course. A total of 595 cases were enrolled. After 9 months of follow up PQ proved equally protective against further episodes of P. vivax in supervised (odds ratio 0.35, 95% CI 0.21-0.57) and unsupervised (odds ratio 0.37, 95% CI 0.23-0.59) groups as compared to placebo. All age groups on supervised or unsupervised treatment showed a similar degree of protection even though the risk of relapse decreased with age. The study showed that a presumed problem of poor compliance may be overcome with simple health messages even when the majority of individuals are illiterate and without formal education. Unsupervised treatment with 14-day PQ when combined with simple instruction can avert a significant amount of the morbidity associated with relapse in populations where G6PD deficiency is either absent or readily diagnosable.
True Zero-Training Brain-Computer Interfacing – An Online Study
Kindermans, Pieter-Jan; Schreuder, Martijn; Schrauwen, Benjamin; Müller, Klaus-Robert; Tangermann, Michael
2014-01-01
Despite several approaches to realize subject-to-subject transfer of pre-trained classifiers, the full performance of a Brain-Computer Interface (BCI) for a novel user can only be reached by presenting the BCI system with data from the novel user. In typical state-of-the-art BCI systems with a supervised classifier, the labeled data is collected during a calibration recording, in which the user is asked to perform a specific task. Based on the known labels of this recording, the BCI's classifier can learn to decode the individual's brain signals. Unfortunately, this calibration recording consumes valuable time. Furthermore, it is unproductive with respect to the final BCI application, e.g. text entry. Therefore, the calibration period must be reduced to a minimum, which is especially important for patients with a limited concentration ability. The main contribution of this manuscript is an online study on unsupervised learning in an auditory event-related potential (ERP) paradigm. Our results demonstrate that the calibration recording can be bypassed by utilizing an unsupervised trained classifier, that is initialized randomly and updated during usage. Initially, the unsupervised classifier tends to make decoding mistakes, as the classifier might not have seen enough data to build a reliable model. Using a constant re-analysis of the previously spelled symbols, these initially misspelled symbols can be rectified posthoc when the classifier has learned to decode the signals. We compare the spelling performance of our unsupervised approach and of the unsupervised posthoc approach to the standard supervised calibration-based dogma for n = 10 healthy users. To assess the learning behavior of our approach, it is unsupervised trained from scratch three times per user. Even with the relatively low SNR of an auditory ERP paradigm, the results show that after a limited number of trials (30 trials), the unsupervised approach performs comparably to a classic supervised model. PMID:25068464
Kauppi, Jukka-Pekka; Martikainen, Kalle; Ruotsalainen, Ulla
2010-12-01
The central purpose of passive signal intercept receivers is to perform automatic categorization of unknown radar signals. Currently, there is an urgent need to develop intelligent classification algorithms for these devices due to emerging complexity of radar waveforms. Especially multifunction radars (MFRs) capable of performing several simultaneous tasks by utilizing complex, dynamically varying scheduled waveforms are a major challenge for automatic pattern classification systems. To assist recognition of complex radar emissions in modern intercept receivers, we have developed a novel method to recognize dynamically varying pulse repetition interval (PRI) modulation patterns emitted by MFRs. We use robust feature extraction and classifier design techniques to assist recognition in unpredictable real-world signal environments. We classify received pulse trains hierarchically which allows unambiguous detection of the subpatterns using a sliding window. Accuracy, robustness and reliability of the technique are demonstrated with extensive simulations using both static and dynamically varying PRI modulation patterns. Copyright © 2010 Elsevier Ltd. All rights reserved.
Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems.
Huang, Lifu; May, Jonathan; Pan, Xiaoman; Ji, Heng; Ren, Xiang; Han, Jiawei; Zhao, Lin; Hendler, James A
2017-03-01
The ability of automatically recognizing and typing entities in natural language without prior knowledge (e.g., predefined entity types) is a major challenge in processing such data. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework by combining symbolic and distributional semantics. We start from learning three types of representations for each entity mention: general semantic representation, specific context representation, and knowledge representation based on knowledge bases. Then we develop a novel joint hierarchical clustering and linking algorithm to type all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on genres (news and discussion forum) show comparable performance with state-of-the-art supervised typing systems trained from a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework.
Gene expression profiling of single cells on large-scale oligonucleotide arrays
Hartmann, Claudia H.; Klein, Christoph A.
2006-01-01
Over the last decade, important insights into the regulation of cellular responses to various stimuli were gained by global gene expression analyses of cell populations. More recently, specific cell functions and underlying regulatory networks of rare cells isolated from their natural environment moved to the center of attention. However, low cell numbers still hinder gene expression profiling of rare ex vivo material in biomedical research. Therefore, we developed a robust method for gene expression profiling of single cells on high-density oligonucleotide arrays with excellent coverage of low abundance transcripts. The protocol was extensively tested with freshly isolated single cells of very low mRNA content including single epithelial, mature and immature dendritic cells and hematopoietic stem cells. Quantitative PCR confirmed that the PCR-based global amplification method did not change the relative ratios of transcript abundance and unsupervised hierarchical cluster analysis revealed that the histogenetic origin of an individual cell is correctly reflected by the gene expression profile. Moreover, the gene expression data from dendritic cells demonstrate that cellular differentiation and pathway activation can be monitored in individual cells. PMID:17071717
Performance Assessment of Kernel Density Clustering for Gene Expression Profile Data
Zeng, Beiyan; Chen, Yiping P.; Smith, Oscar H.
2003-01-01
Kernel density smoothing techniques have been used in classification or supervised learning of gene expression profile (GEP) data, but their applications to clustering or unsupervised learning of those data have not been explored and assessed. Here we report a kernel density clustering method for analysing GEP data and compare its performance with the three most widely-used clustering methods: hierarchical clustering, K-means clustering, and multivariate mixture model-based clustering. Using several methods to measure agreement, between-cluster isolation, and withincluster coherence, such as the Adjusted Rand Index, the Pseudo F test, the r2 test, and the profile plot, we have assessed the effectiveness of kernel density clustering for recovering clusters, and its robustness against noise on clustering both simulated and real GEP data. Our results show that the kernel density clustering method has excellent performance in recovering clusters from simulated data and in grouping large real expression profile data sets into compact and well-isolated clusters, and that it is the most robust clustering method for analysing noisy expression profile data compared to the other three methods assessed. PMID:18629292
Liberal Entity Extraction: Rapid Construction of Fine-Grained Entity Typing Systems
Huang, Lifu; May, Jonathan; Pan, Xiaoman; Ji, Heng; Ren, Xiang; Han, Jiawei; Zhao, Lin; Hendler, James A.
2017-01-01
Abstract The ability of automatically recognizing and typing entities in natural language without prior knowledge (e.g., predefined entity types) is a major challenge in processing such data. Most existing entity typing systems are limited to certain domains, genres, and languages. In this article, we propose a novel unsupervised entity-typing framework by combining symbolic and distributional semantics. We start from learning three types of representations for each entity mention: general semantic representation, specific context representation, and knowledge representation based on knowledge bases. Then we develop a novel joint hierarchical clustering and linking algorithm to type all mentions using these representations. This framework does not rely on any annotated data, predefined typing schema, or handcrafted features; therefore, it can be quickly adapted to a new domain, genre, and/or language. Experiments on genres (news and discussion forum) show comparable performance with state-of-the-art supervised typing systems trained from a large amount of labeled data. Results on various languages (English, Chinese, Japanese, Hausa, and Yoruba) and domains (general and biomedical) demonstrate the portability of our framework. PMID:28328252
Vision Sensor-Based Road Detection for Field Robot Navigation
Lu, Keyu; Li, Jian; An, Xiangjing; He, Hangen
2015-01-01
Road detection is an essential component of field robot navigation systems. Vision sensors play an important role in road detection for their great potential in environmental perception. In this paper, we propose a hierarchical vision sensor-based method for robust road detection in challenging road scenes. More specifically, for a given road image captured by an on-board vision sensor, we introduce a multiple population genetic algorithm (MPGA)-based approach for efficient road vanishing point detection. Superpixel-level seeds are then selected in an unsupervised way using a clustering strategy. Then, according to the GrowCut framework, the seeds proliferate and iteratively try to occupy their neighbors. After convergence, the initial road segment is obtained. Finally, in order to achieve a globally-consistent road segment, the initial road segment is refined using the conditional random field (CRF) framework, which integrates high-level information into road detection. We perform several experiments to evaluate the common performance, scale sensitivity and noise sensitivity of the proposed method. The experimental results demonstrate that the proposed method exhibits high robustness compared to the state of the art. PMID:26610514
Washio, Kana; Oka, Takashi; Abdalkader, Lamia; Muraoka, Michiko; Shimada, Akira; Oda, Megumi; Sato, Hiaki; Takata, Katsuyoshi; Kagami, Yoshitoyo; Shimizu, Norio; Kato, Seiichi; Kimura, Hiroshi; Nishizaki, Kazunori; Yoshino, Tadashi; Tsukahara, Hirokazu
2017-11-01
The human herpes virus, Epstein-Barr virus (EBV), is a known oncogenic virus and plays important roles in life-threatening T/NK-cell lymphoproliferative disorders (T/NK-cell LPD) such as hypersensitivity to mosquito bite (HMB), chronic active EBV infection (CAEBV), and NK/T-cell lymphoma/leukemia. During the clinical courses of HMB and CAEBV, patients frequently develop malignant lymphomas and the diseases passively progress sequentially. In the present study, gene expression of CD16 (-) CD56 (+) -, EBV (+) HMB, CAEBV, NK-lymphoma, and NK-leukemia cell lines, which were established from patients, was analyzed using oligonucleotide microarrays and compared to that of CD56 bright CD16 dim/- NK cells from healthy donors. Principal components analysis showed that CAEBV and NK-lymphoma cells were relatively closely located, indicating that they had similar expression profiles. Unsupervised hierarchal clustering analyses of microarray data and gene ontology analysis revealed specific gene clusters and identified several candidate genes responsible for disease that can be used to discriminate each category of NK-LPD and NK-cell lymphoma/leukemia.
Generating Adaptive Behaviour within a Memory-Prediction Framework
Rawlinson, David; Kowadlo, Gideon
2012-01-01
The Memory-Prediction Framework (MPF) and its Hierarchical-Temporal Memory implementation (HTM) have been widely applied to unsupervised learning problems, for both classification and prediction. To date, there has been no attempt to incorporate MPF/HTM in reinforcement learning or other adaptive systems; that is, to use knowledge embodied within the hierarchy to control a system, or to generate behaviour for an agent. This problem is interesting because the human neocortex is believed to play a vital role in the generation of behaviour, and the MPF is a model of the human neocortex. We propose some simple and biologically-plausible enhancements to the Memory-Prediction Framework. These cause it to explore and interact with an external world, while trying to maximize a continuous, time-varying reward function. All behaviour is generated and controlled within the MPF hierarchy. The hierarchy develops from a random initial configuration by interaction with the world and reinforcement learning only. Among other demonstrations, we show that a 2-node hierarchy can learn to successfully play “rocks, paper, scissors” against a predictable opponent. PMID:22272231
Di Girolamo, Francesco; Masotti, Andrea; Lante, Isabella; Scapaticci, Margherita; Calvano, Cosima Damiana; Zambonin, Carlo; Muraca, Maurizio; Putignani, Lorenza
2015-09-01
Extra virgin olive oil (EVOO) with its nutraceutical characteristics substantially contributes as a major nutrient to the health benefit of the Mediterranean diet. Unfortunately, the adulteration of EVOO with less expensive oils (e.g., peanut and corn oils), has become one of the biggest source of agricultural fraud in the European Union, with important health implications for consumers, mainly due to the introduction of seed oil-derived allergens causing, especially in children, severe food allergy phenomena. In this regard, revealing adulterations of EVOO is of fundamental importance for health care and prevention reasons, especially in children. To this aim, effective analytical methods to assess EVOO purity are necessary. Here, we propose a simple, rapid, robust and very sensitive method for non-specialized mass spectrometric laboratory, based on the matrix-assisted laser desorption/ionization mass spectrometry (MALDI-TOF MS) coupled to unsupervised hierarchical clustering (UHC), principal component (PCA) and Pearson's correlation analyses, to reveal corn oil (CO) adulterations in EVOO at very low levels (down to 0.5%).
Di Girolamo, Francesco; Masotti, Andrea; Lante, Isabella; Scapaticci, Margherita; Calvano, Cosima Damiana; Zambonin, Carlo; Muraca, Maurizio; Putignani, Lorenza
2015-01-01
Extra virgin olive oil (EVOO) with its nutraceutical characteristics substantially contributes as a major nutrient to the health benefit of the Mediterranean diet. Unfortunately, the adulteration of EVOO with less expensive oils (e.g., peanut and corn oils), has become one of the biggest source of agricultural fraud in the European Union, with important health implications for consumers, mainly due to the introduction of seed oil-derived allergens causing, especially in children, severe food allergy phenomena. In this regard, revealing adulterations of EVOO is of fundamental importance for health care and prevention reasons, especially in children. To this aim, effective analytical methods to assess EVOO purity are necessary. Here, we propose a simple, rapid, robust and very sensitive method for non-specialized mass spectrometric laboratory, based on the matrix-assisted laser desorption/ionization mass spectrometry (MALDI-TOF MS) coupled to unsupervised hierarchical clustering (UHC), principal component (PCA) and Pearson’s correlation analyses, to reveal corn oil (CO) adulterations in EVOO at very low levels (down to 0.5%). PMID:26340625
Nucleus segmentation in histology images with hierarchical multilevel thresholding
NASA Astrophysics Data System (ADS)
Ahmady Phoulady, Hady; Goldgof, Dmitry B.; Hall, Lawrence O.; Mouton, Peter R.
2016-03-01
Automatic segmentation of histological images is an important step for increasing throughput while maintaining high accuracy, avoiding variation from subjective bias, and reducing the costs for diagnosing human illnesses such as cancer and Alzheimer's disease. In this paper, we present a novel method for unsupervised segmentation of cell nuclei in stained histology tissue. Following an initial preprocessing step involving color deconvolution and image reconstruction, the segmentation step consists of multilevel thresholding and a series of morphological operations. The only parameter required for the method is the minimum region size, which is set according to the resolution of the image. Hence, the proposed method requires no training sets or parameter learning. Because the algorithm requires no assumptions or a priori information with regard to cell morphology, the automatic approach is generalizable across a wide range of tissues. Evaluation across a dataset consisting of diverse tissues, including breast, liver, gastric mucosa and bone marrow, shows superior performance over four other recent methods on the same dataset in terms of F-measure with precision and recall of 0.929 and 0.886, respectively.
Morphological Feature Extraction for Automatic Registration of Multispectral Images
NASA Technical Reports Server (NTRS)
Plaza, Antonio; LeMoigne, Jacqueline; Netanyahu, Nathan S.
2007-01-01
The task of image registration can be divided into two major components, i.e., the extraction of control points or features from images, and the search among the extracted features for the matching pairs that represent the same feature in the images to be matched. Manual extraction of control features can be subjective and extremely time consuming, and often results in few usable points. On the other hand, automated feature extraction allows using invariant target features such as edges, corners, and line intersections as relevant landmarks for registration purposes. In this paper, we present an extension of a recently developed morphological approach for automatic extraction of landmark chips and corresponding windows in a fully unsupervised manner for the registration of multispectral images. Once a set of chip-window pairs is obtained, a (hierarchical) robust feature matching procedure, based on a multiresolution overcomplete wavelet decomposition scheme, is used for registration purposes. The proposed method is validated on a pair of remotely sensed scenes acquired by the Advanced Land Imager (ALI) multispectral instrument and the Hyperion hyperspectral instrument aboard NASA's Earth Observing-1 satellite.
Hall, L O; Bensaid, A M; Clarke, L P; Velthuizen, R P; Silbiger, M S; Bezdek, J C
1992-01-01
Magnetic resonance (MR) brain section images are segmented and then synthetically colored to give visual representations of the original data with three approaches: the literal and approximate fuzzy c-means unsupervised clustering algorithms, and a supervised computational neural network. Initial clinical results are presented on normal volunteers and selected patients with brain tumors surrounded by edema. Supervised and unsupervised segmentation techniques provide broadly similar results. Unsupervised fuzzy algorithms were visually observed to show better segmentation when compared with raw image data for volunteer studies. For a more complex segmentation problem with tumor/edema or cerebrospinal fluid boundary, where the tissues have similar MR relaxation behavior, inconsistency in rating among experts was observed, with fuzz-c-means approaches being slightly preferred over feedforward cascade correlation results. Various facets of both approaches, such as supervised versus unsupervised learning, time complexity, and utility for the diagnostic process, are compared.
Nicholson, Vaughan Patrick; McKean, Mark; Lowe, John; Fawcett, Christine; Burkett, Brendan
2015-01-01
To determine the effectiveness of unsupervised Nintendo Wii Fit balance training in older adults. Forty-one older adults were recruited from local retirement villages and educational settings to participate in a six-week two-group repeated measures study. The Wii group (n = 19, 75 ± 6 years) undertook 30 min of unsupervised Wii balance gaming three times per week in their retirement village while the comparison group (n = 22, 74 ± 5 years) continued with their usual exercise program. Participants' balance abilities were assessed pre- and postintervention. The Wii Fit group demonstrated significant improvements (P < .05) in timed up-and-go, left single-leg balance, lateral reach (left and right), and gait speed compared with the comparison group. Reported levels of enjoyment following game play increased during the study. Six weeks of unsupervised Wii balance training is an effective modality for improving balance in independent older adults.
Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillips, Lawrence A.; Hodas, Nathan O.
Increasingly, cognitive scientists have demonstrated interest in applying tools from deep learning. One use for deep learning is in language acquisition where it is useful to know if a linguistic phenomenon can be learned through domain-general means. To assess whether unsupervised deep learning is appropriate, we first pose a smaller question: Can unsupervised neural networks apply linguistic rules productively, using them in novel situations. We draw from the literature on determiner/noun productivity by training an unsupervised, autoencoder network measuring its ability to combine nouns with determiners. Our simple autoencoder creates combinations it has not previously encountered, displaying a degree ofmore » overlap similar to actual children. While this preliminary work does not provide conclusive evidence for productivity, it warrants further investigation with more complex models. Further, this work helps lay the foundations for future collaboration between the deep learning and cognitive science communities.« less
Application of hierarchical clustering method to classify of space-time rainfall patterns
NASA Astrophysics Data System (ADS)
Yu, Hwa-Lung; Chang, Tu-Je
2010-05-01
Understanding the local precipitation patterns is essential to the water resources management and flooding mitigation. The precipitation patterns can vary in space and time depending upon the factors from different spatial scales such as local topological changes and macroscopic atmospheric circulation. The spatiotemporal variation of precipitation in Taiwan is significant due to its complex terrain and its location at west pacific and subtropical area, where is the boundary between the pacific ocean and Asia continent with the complex interactions among the climatic processes. This study characterizes local-scale precipitation patterns by classifying the historical space-time precipitation records. We applied the hierarchical ascending clustering method to analyze the precipitation records from 1960 to 2008 at the six rainfall stations located in Lan-yang catchment at the northeast of the island. Our results identify the four primary space-time precipitation types which may result from distinct driving forces from the changes of atmospheric variables and topology at different space-time scales. This study also presents an important application of the statistical downscaling to combine large-scale upper-air circulation with local space-time precipitation patterns.
NASA Astrophysics Data System (ADS)
Wang, Yu; Sun, Qingyang; Xiao, Jianliang
2018-02-01
Highly organized hierarchical surface morphologies possess various intriguing properties that could find important potential applications. In this paper, we demonstrate a facile approach to simultaneously form multiscale hierarchical surface morphologies through sequential wrinkling. This method combines surface wrinkling induced by thermal expansion and mechanical strain on a three-layer structure composed of an aluminum film, a hard Polydimethylsiloxane (PDMS) film, and a soft PDMS substrate. Deposition of the aluminum film on hard PDMS induces biaxial wrinkling due to thermal expansion mismatch, and recovering the prestrain in the soft PDMS substrate leads to wrinkling of the hard PDMS film. In total, three orders of wrinkling patterns form in this process, with wavelength and amplitude spanning 3 orders of magnitude in length scale. By increasing the prestrain in the soft PDMS substrate, a hierarchical wrinkling-folding structure was also obtained. This approach can be easily extended to other thin films for fabrication of multiscale hierarchical surface morphologies with potential applications in different areas.
Self-organizing neural network models for visual pattern recognition.
Fukushima, K
1987-01-01
Two neural network models for visual pattern recognition are discussed. The first model, called a "neocognitron", is a hierarchical multilayered network which has only afferent synaptic connections. It can acquire the ability to recognize patterns by "learning-without-a-teacher": the repeated presentation of a set of training patterns is sufficient, and no information about the categories of the patterns is necessary. The cells of the highest stage eventually become "gnostic cells", whose response shows the final result of the pattern-recognition of the network. Pattern recognition is performed on the basis of similarity in shape between patterns, and is not affected by deformation, nor by changes in size, nor by shifts in the position of the stimulus pattern. The second model has not only afferent but also efferent synaptic connections, and is endowed with the function of selective attention. The afferent and the efferent signals interact with each other in the hierarchical network: the efferent signals, that is, the signals for selective attention, have a facilitating effect on the afferent signals, and at the same time, the afferent signals gate efferent signal flow. When a complex figure, consisting of two patterns or more, is presented to the model, it is segmented into individual patterns, and each pattern is recognized separately. Even if one of the patterns to which the models is paying selective attention is affected by noise or defects, the model can "recall" the complete pattern from which the noise has been eliminated and the defects corrected.
Organization of excitable dynamics in hierarchical biological networks.
Müller-Linow, Mark; Hilgetag, Claus C; Hütt, Marc-Thorsten
2008-09-26
This study investigates the contributions of network topology features to the dynamic behavior of hierarchically organized excitable networks. Representatives of different types of hierarchical networks as well as two biological neural networks are explored with a three-state model of node activation for systematically varying levels of random background network stimulation. The results demonstrate that two principal topological aspects of hierarchical networks, node centrality and network modularity, correlate with the network activity patterns at different levels of spontaneous network activation. The approach also shows that the dynamic behavior of the cerebral cortical systems network in the cat is dominated by the network's modular organization, while the activation behavior of the cellular neuronal network of Caenorhabditis elegans is strongly influenced by hub nodes. These findings indicate the interaction of multiple topological features and dynamic states in the function of complex biological networks.
Hierarchical ferroelectric and ferrotoroidic polarizations coexistent in nano-metamaterials
Shimada, Takahiro; Lich, Le Van; Nagano, Koyo; Wang, Jie; Kitamura, Takayuki
2015-01-01
Tailoring materials to obtain unique, or significantly enhanced material properties through rationally designed structures rather than chemical constituents is principle of metamaterial concept, which leads to the realization of remarkable optical and mechanical properties. Inspired by the recent progress in electromagnetic and mechanical metamaterials, here we introduce the concept of ferroelectric nano-metamaterials, and demonstrate through an experiment in silico with hierarchical nanostructures of ferroelectrics using sophisticated real-space phase-field techniques. This new concept enables variety of unusual and complex yet controllable domain patterns to be achieved, where the coexistence between hierarchical ferroelectric and ferrotoroidic polarizations establishes a new benchmark for exploration of complexity in spontaneous polarization ordering. The concept opens a novel route to effectively tailor domain configurations through the control of internal structure, facilitating access to stabilization and control of complex domain patterns that provide high potential for novel functionalities. A key design parameter to achieve such complex patterns is explored based on the parity of junctions that connect constituent nanostructures. We further highlight the variety of additional functionalities that are potentially obtained from ferroelectric nano-metamaterials, and provide promising perspectives for novel multifunctional devices. This study proposes an entirely new discipline of ferroelectric nano-metamaterials, further driving advances in metamaterials research. PMID:26424484
Differentiation of Students' Reasoning on Linear and Quadratic Geometric Number Patterns
ERIC Educational Resources Information Center
Lin, Fou-Lai; Yang, Kai-Lin
2004-01-01
There are two purposes in this study. One is to compare how 7th and 8th graders reason on linear and quadratic geometric number patterns when they have not learned this kind of tasks in school. The other is to explore the hierarchical relations among the four components of reasoning on geometric number patterns: understanding, generalizing,…
Zong, Chuanyong; Zhao, Yan; Ji, Haipeng; Xie, Jixun; Han, Xue; Wang, Juanjuan; Cao, Yanping; Lu, Conghua; Li, Hongfei; Jiang, Shichun
2016-08-01
Here, a simple combined strategy of surface wrinkling with visible light irradiation to fabricate well tunable hierarchical surface patterns on azo-containing multilayer films is reported. The key to tailor surface patterns is to introduce a photosensitive poly(disperse orange 3) intermediate layer into the film/substrate wrinkling system, in which the modulus decrease is induced by the reversible photoisomerization. The existence of a photoinert top layer prevents the photoisomerization-induced stress release in the intermediate layer to some extent. Consequently, the as-formed wrinkling patterns can be modulated over a large area by light irradiation. Interestingly, in the case of selective exposure, the wrinkle wavelength in the exposed region decreases, while the wrinkles in the unexposed region are evolved into highly oriented wrinkles with the orientation perpendicular to the exposed/unexposed boundary. Compared with traditional single layer-based film/substrate systems, the multilayer system consisting of the photosensitive intermediate layer offers unprecedented advantages in the patterning controllability/universality. As demonstrated here, this simple and versatile strategy can be conveniently extended to functional multilayer systems for the creation of prescribed hierarchical surface patterns with optically tailored microstructures. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Generating region proposals for histopathological whole slide image retrieval.
Ma, Yibing; Jiang, Zhiguo; Zhang, Haopeng; Xie, Fengying; Zheng, Yushan; Shi, Huaqiang; Zhao, Yu; Shi, Jun
2018-06-01
Content-based image retrieval is an effective method for histopathological image analysis. However, given a database of huge whole slide images (WSIs), acquiring appropriate region-of-interests (ROIs) for training is significant and difficult. Moreover, histopathological images can only be annotated by pathologists, resulting in the lack of labeling information. Therefore, it is an important and challenging task to generate ROIs from WSI and retrieve image with few labels. This paper presents a novel unsupervised region proposing method for histopathological WSI based on Selective Search. Specifically, the WSI is over-segmented into regions which are hierarchically merged until the WSI becomes a single region. Nucleus-oriented similarity measures for region mergence and Nucleus-Cytoplasm color space for histopathological image are specially defined to generate accurate region proposals. Additionally, we propose a new semi-supervised hashing method for image retrieval. The semantic features of images are extracted with Latent Dirichlet Allocation and transformed into binary hashing codes with Supervised Hashing. The methods are tested on a large-scale multi-class database of breast histopathological WSIs. The results demonstrate that for one WSI, our region proposing method can generate 7.3 thousand contoured regions which fit well with 95.8% of the ROIs annotated by pathologists. The proposed hashing method can retrieve a query image among 136 thousand images in 0.29 s and reach precision of 91% with only 10% of images labeled. The unsupervised region proposing method can generate regions as predictions of lesions in histopathological WSI. The region proposals can also serve as the training samples to train machine-learning models for image retrieval. The proposed hashing method can achieve fast and precise image retrieval with small amount of labels. Furthermore, the proposed methods can be potentially applied in online computer-aided-diagnosis systems. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Hui; Yu, Jun-Ling; Yu, Le-An; Sun, Jie
2014-05-01
Case-based reasoning (CBR) is one of the main forecasting methods in business forecasting, which performs well in prediction and holds the ability of giving explanations for the results. In business failure prediction (BFP), the number of failed enterprises is relatively small, compared with the number of non-failed ones. However, the loss is huge when an enterprise fails. Therefore, it is necessary to develop methods (trained on imbalanced samples) which forecast well for this small proportion of failed enterprises and performs accurately on total accuracy meanwhile. Commonly used methods constructed on the assumption of balanced samples do not perform well in predicting minority samples on imbalanced samples consisting of the minority/failed enterprises and the majority/non-failed ones. This article develops a new method called clustering-based CBR (CBCBR), which integrates clustering analysis, an unsupervised process, with CBR, a supervised process, to enhance the efficiency of retrieving information from both minority and majority in CBR. In CBCBR, various case classes are firstly generated through hierarchical clustering inside stored experienced cases, and class centres are calculated out by integrating cases information in the same clustered class. When predicting the label of a target case, its nearest clustered case class is firstly retrieved by ranking similarities between the target case and each clustered case class centre. Then, nearest neighbours of the target case in the determined clustered case class are retrieved. Finally, labels of the nearest experienced cases are used in prediction. In the empirical experiment with two imbalanced samples from China, the performance of CBCBR was compared with the classical CBR, a support vector machine, a logistic regression and a multi-variant discriminate analysis. The results show that compared with the other four methods, CBCBR performed significantly better in terms of sensitivity for identifying the minority samples and generated high total accuracy meanwhile. The proposed approach makes CBR useful in imbalanced forecasting.
NASA Astrophysics Data System (ADS)
Spinner, Marlene; Kovalev, Alexander; Gorb, Stanislav N.; Westhoff, Guido
2013-05-01
The West African Gaboon viper (Bitis rhinoceros) is a master of camouflage due to its colouration pattern. Its skin is geometrically patterned and features black spots that purport an exceptional spatial depth due to their velvety surface texture. Our study shades light on micromorphology, optical characteristics and principles behind such a velvet black appearance. We revealed a unique hierarchical pattern of leaf-like microstructures striated with nanoridges on the snake scales that coincides with the distribution of black colouration. Velvet black sites demonstrate four times lower reflectance and higher absorbance than other scales in the UV - near IR spectral range. The combination of surface structures impeding reflectance and absorbing dark pigments, deposited in the skin material, provides reflecting less than 11% of the light reflected by a polytetrafluoroethylene diffuse reflectance standard in any direction. A view-angle independent black structural colour in snakes is reported here for the first time.
Spinner, Marlene; Kovalev, Alexander; Gorb, Stanislav N; Westhoff, Guido
2013-01-01
The West African Gaboon viper (Bitis rhinoceros) is a master of camouflage due to its colouration pattern. Its skin is geometrically patterned and features black spots that purport an exceptional spatial depth due to their velvety surface texture. Our study shades light on micromorphology, optical characteristics and principles behind such a velvet black appearance. We revealed a unique hierarchical pattern of leaf-like microstructures striated with nanoridges on the snake scales that coincides with the distribution of black colouration. Velvet black sites demonstrate four times lower reflectance and higher absorbance than other scales in the UV-near IR spectral range. The combination of surface structures impeding reflectance and absorbing dark pigments, deposited in the skin material, provides reflecting less than 11% of the light reflected by a polytetrafluoroethylene diffuse reflectance standard in any direction. A view-angle independent black structural colour in snakes is reported here for the first time.
NASA Technical Reports Server (NTRS)
Shahshahani, Behzad M.; Landgrebe, David A.
1992-01-01
The effect of additional unlabeled samples in improving the supervised learning process is studied in this paper. Three learning processes. supervised, unsupervised, and combined supervised-unsupervised, are compared by studying the asymptotic behavior of the estimates obtained under each process. Upper and lower bounds on the asymptotic covariance matrices are derived. It is shown that under a normal mixture density assumption for the probability density function of the feature space, the combined supervised-unsupervised learning is always superior to the supervised learning in achieving better estimates. Experimental results are provided to verify the theoretical concepts.
A nontransferring dry adhesive with hierarchical polymer nanohairs.
Jeong, Hoon Eui; Lee, Jin-Kwan; Kim, Hong Nam; Moon, Sang Heup; Suh, Kahp Y
2009-04-07
We present a simple yet robust method for fabricating angled, hierarchically patterned high-aspect-ratio polymer nanohairs to generate directionally sensitive dry adhesives. The slanted polymeric nanostructures were molded from an etched polySi substrate containing slanted nanoholes. An angled etching technique was developed to fabricate slanted nanoholes with flat tips by inserting an etch-stop layer of silicon dioxide. This unique etching method was equipped with a Faraday cage system to control the ion-incident angles in the conventional plasma etching system. The polymeric nanohairs were fabricated with tailored leaning angles, sizes, tip shapes, and hierarchical structures. As a result of controlled leaning angle and bulged flat top of the nanohairs, the replicated, slanted nanohairs showed excellent directional adhesion, exhibiting strong shear attachment (approximately 26 N/cm(2) in maximum) in the angled direction and easy detachment (approximately 2.2 N/cm(2)) in the opposite direction, with a hysteresis value of approximately 10. In addition to single scale nanohairs, monolithic, micro-nanoscale combined hierarchical hairs were also fabricated by using a 2-step UV-assisted molding technique. These hierarchical nanoscale patterns maintained their adhesive force even on a rough surface (roughness <20 microm) because of an increase in the contact area by the enhanced height of hierarchy, whereas simple nanohairs lost their adhesion strength. To demonstrate the potential applications of the adhesive patch, the dry adhesive was used to transport a large-area glass (47.5 x 37.5 cm(2), second-generation TFT-LCD glass), which could replace the current electrostatic transport/holding system with further optimization.
A nontransferring dry adhesive with hierarchical polymer nanohairs
Jeong, Hoon Eui; Lee, Jin-Kwan; Kim, Hong Nam; Moon, Sang Heup; Suh, Kahp Y.
2009-01-01
We present a simple yet robust method for fabricating angled, hierarchically patterned high-aspect-ratio polymer nanohairs to generate directionally sensitive dry adhesives. The slanted polymeric nanostructures were molded from an etched polySi substrate containing slanted nanoholes. An angled etching technique was developed to fabricate slanted nanoholes with flat tips by inserting an etch-stop layer of silicon dioxide. This unique etching method was equipped with a Faraday cage system to control the ion-incident angles in the conventional plasma etching system. The polymeric nanohairs were fabricated with tailored leaning angles, sizes, tip shapes, and hierarchical structures. As a result of controlled leaning angle and bulged flat top of the nanohairs, the replicated, slanted nanohairs showed excellent directional adhesion, exhibiting strong shear attachment (≈26 N/cm2 in maximum) in the angled direction and easy detachment (≈2.2 N/cm2) in the opposite direction, with a hysteresis value of ≈10. In addition to single scale nanohairs, monolithic, micro-nanoscale combined hierarchical hairs were also fabricated by using a 2-step UV-assisted molding technique. These hierarchical nanoscale patterns maintained their adhesive force even on a rough surface (roughness <20 μm) because of an increase in the contact area by the enhanced height of hierarchy, whereas simple nanohairs lost their adhesion strength. To demonstrate the potential applications of the adhesive patch, the dry adhesive was used to transport a large-area glass (47.5 × 37.5 cm2, second-generation TFT-LCD glass), which could replace the current electrostatic transport/holding system with further optimization. PMID:19304801
ERIC Educational Resources Information Center
Chen, June L.; Sung, Connie; Pi, Sukyeong
2015-01-01
Young adults with autism spectrum disorders (ASD) often experience employment difficulties. Using Rehabilitation Service Administration data (RSA-911), this study investigated the service patterns and factors related to the employment outcomes of individuals with ASD in different age groups. Hierarchical logistic regression analyses were conducted…
Active Commuting Patterns at a Large, Midwestern College Campus
ERIC Educational Resources Information Center
Bopp, Melissa; Kaczynski, Andrew; Wittman, Pamela
2011-01-01
Objective: To understand patterns and influences on active commuting (AC) behavior. Participants: Students and faculty/staff at a university campus. Methods: In April-May 2008, respondents answered an online survey about mode of travel to campus and influences on commuting decisions. Hierarchical regression analyses predicted variance in walking…
Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T
2010-01-01
Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
Dyslexic Participants Show Intact Spontaneous Categorization Processes
ERIC Educational Resources Information Center
Nikolopoulos, Dimitris S.; Pothos, Emmanuel M.
2009-01-01
We examine the performance of dyslexic participants on an unsupervised categorization task against that of matched non-dyslexic control participants. Unsupervised categorization is a cognitive process critical for conceptual development. Existing research in dyslexia has emphasized perceptual tasks and supervised categorization tasks (for which…
NASA Technical Reports Server (NTRS)
Alexander, June; Corwin, Edward; Lloyd, David; Logar, Antonette; Welch, Ronald
1996-01-01
This research focuses on a new neural network scene classification technique. The task is to identify scene elements in Advanced Very High Resolution Radiometry (AVHRR) data from three scene types: polar, desert and smoke from biomass burning in South America (smoke). The ultimate goal of this research is to design and implement a computer system which will identify the clouds present on a whole-Earth satellite view as a means of tracking global climate changes. Previous research has reported results for rule-based systems (Tovinkere et at 1992, 1993) for standard back propagation (Watters et at. 1993) and for a hierarchical approach (Corwin et al 1994) for polar data. This research uses a hierarchical neural network with don't care conditions and applies this technique to complex scenes. A hierarchical neural network consists of a switching network and a collection of leaf networks. The idea of the hierarchical neural network is that it is a simpler task to classify a certain pattern from a subset of patterns than it is to classify a pattern from the entire set. Therefore, the first task is to cluster the classes into groups. The switching, or decision network, performs an initial classification by selecting a leaf network. The leaf networks contain a reduced set of similar classes, and it is in the various leaf networks that the actual classification takes place. The grouping of classes in the various leaf networks is determined by applying an iterative clustering algorithm. Several clustering algorithms were investigated, but due to the size of the data sets, the exhaustive search algorithms were eliminated. A heuristic approach using a confusion matrix from a lightly trained neural network provided the basis for the clustering algorithm. Once the clusters have been identified, the hierarchical network can be trained. The approach of using don't care nodes results from the difficulty in generating extremely complex surfaces in order to separate one class from all of the others. This approach finds pairwise separating surfaces and forms the more complex separating surface from combinations of simpler surfaces. This technique both reduces training time and improves accuracy over the previously reported results. Accuracies of 97.47%, 95.70%, and 99.05% were achieved for the polar, desert and smoke data sets.
Housing and sexual health among street-involved youth.
Kumar, Maya M; Nisenbaum, Rosane; Barozzino, Tony; Sgro, Michael; Bonifacio, Herbert J; Maguire, Jonathon L
2015-10-01
Street-involved youth (SIY) carry a disproportionate burden of sexually transmitted diseases (STD). Studies among adults suggest that improving housing stability may be an effective primary prevention strategy for improving sexual health. Housing options available to SIY offer varying degrees of stability and adult supervision. This study investigated whether housing options offering more stability and adult supervision are associated with fewer STD and related risk behaviors among SIY. A cross-sectional study was performed using public health survey and laboratory data collected from Toronto SIY in 2010. Three exposure categories were defined a priori based on housing situation: (1) stable and supervised housing, (2) stable and unsupervised housing, and (3) unstable and unsupervised housing. Multivariate logistic regression was used to test the association between housing category and current or recent STD. Secondary analyses were performed using the following secondary outcomes: blood-borne infection, recent binge-drinking, and recent high-risk sexual behavior. The final analysis included 184 SIY. Of these, 28.8 % had a current or recent STD. Housing situation was stable and supervised for 12.5 %, stable and unsupervised for 46.2 %, and unstable and unsupervised for 41.3 %. Compared to stable and supervised housing, there was no significant association between current or recent STD among stable and unsupervised housing or unstable and unsupervised housing. There was no significant association between housing category and risk of blood-borne infection, binge-drinking, or high-risk sexual behavior. Although we did not demonstrate a significant association between stable and supervised housing and lower STD risk, our incorporation of both housing stability and adult supervision into a priori defined exposure groups may inform future studies of housing-related prevention strategies among SIY. Multi-modal interventions beyond housing alone may also be required to prevent sexual morbidity among these vulnerable youth.
Out-of-School Time and Adolescent Substance Use.
Lee, Kenneth T H; Vandell, Deborah Lowe
2015-11-01
High levels of adolescent substance use are linked to lower academic achievement, reduced schooling, and delinquency. We assess four types of out-of-school time (OST) contexts--unsupervised time with peers, sports, organized activities, and paid employment--in relation to tobacco, alcohol, and marijuana use at the end of high school. Other research has examined these OST contexts in isolation, limiting efforts to disentangle potentially confounded relations. Longitudinal data from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development (N = 766) examined associations between different OST contexts during high school and substance use at the end of high school. Unsupervised time with peers increased the odds of tobacco, alcohol, and marijuana use, whereas sports increased the odds of alcohol use and decreased the odds of marijuana use. Paid employment increased the odds of tobacco and alcohol use. Unsupervised time with peers predicted increased amounts of tobacco, alcohol, and marijuana use, whereas sports predicted decreased amounts of tobacco and marijuana use and increased amounts of alcohol use at the end of high school. Although unsupervised time with peers, sports, and paid employment were differentially linked to the odds of substance use, only unsupervised time with peers and sports were significantly associated with the amounts of tobacco, alcohol, and marijuana use at the end of high school. These findings underscore the value of considering OST contexts in relation to strategies to promote adolescent health. Reducing unsupervised time with peers and increasing sports participation may have positive impacts on reducing substance use. Copyright © 2015 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Hübner, David; Verhoeven, Thibault; Schmid, Konstantin; Müller, Klaus-Robert; Tangermann, Michael; Kindermans, Pieter-Jan
2017-01-01
Using traditional approaches, a brain-computer interface (BCI) requires the collection of calibration data for new subjects prior to online use. Calibration time can be reduced or eliminated e.g., by subject-to-subject transfer of a pre-trained classifier or unsupervised adaptive classification methods which learn from scratch and adapt over time. While such heuristics work well in practice, none of them can provide theoretical guarantees. Our objective is to modify an event-related potential (ERP) paradigm to work in unison with the machine learning decoder, and thus to achieve a reliable unsupervised calibrationless decoding with a guarantee to recover the true class means. We introduce learning from label proportions (LLP) to the BCI community as a new unsupervised, and easy-to-implement classification approach for ERP-based BCIs. The LLP estimates the mean target and non-target responses based on known proportions of these two classes in different groups of the data. We present a visual ERP speller to meet the requirements of LLP. For evaluation, we ran simulations on artificially created data sets and conducted an online BCI study with 13 subjects performing a copy-spelling task. Theoretical considerations show that LLP is guaranteed to minimize the loss function similar to a corresponding supervised classifier. LLP performed well in simulations and in the online application, where 84.5% of characters were spelled correctly on average without prior calibration. The continuously adapting LLP classifier is the first unsupervised decoder for ERP BCIs guaranteed to find the optimal decoder. This makes it an ideal solution to avoid tedious calibration sessions. Additionally, LLP works on complementary principles compared to existing unsupervised methods, opening the door for their further enhancement when combined with LLP.
Verhoeven, Thibault; Schmid, Konstantin; Müller, Klaus-Robert; Tangermann, Michael; Kindermans, Pieter-Jan
2017-01-01
Objective Using traditional approaches, a brain-computer interface (BCI) requires the collection of calibration data for new subjects prior to online use. Calibration time can be reduced or eliminated e.g., by subject-to-subject transfer of a pre-trained classifier or unsupervised adaptive classification methods which learn from scratch and adapt over time. While such heuristics work well in practice, none of them can provide theoretical guarantees. Our objective is to modify an event-related potential (ERP) paradigm to work in unison with the machine learning decoder, and thus to achieve a reliable unsupervised calibrationless decoding with a guarantee to recover the true class means. Method We introduce learning from label proportions (LLP) to the BCI community as a new unsupervised, and easy-to-implement classification approach for ERP-based BCIs. The LLP estimates the mean target and non-target responses based on known proportions of these two classes in different groups of the data. We present a visual ERP speller to meet the requirements of LLP. For evaluation, we ran simulations on artificially created data sets and conducted an online BCI study with 13 subjects performing a copy-spelling task. Results Theoretical considerations show that LLP is guaranteed to minimize the loss function similar to a corresponding supervised classifier. LLP performed well in simulations and in the online application, where 84.5% of characters were spelled correctly on average without prior calibration. Significance The continuously adapting LLP classifier is the first unsupervised decoder for ERP BCIs guaranteed to find the optimal decoder. This makes it an ideal solution to avoid tedious calibration sessions. Additionally, LLP works on complementary principles compared to existing unsupervised methods, opening the door for their further enhancement when combined with LLP. PMID:28407016
BORAWSKI, ELAINE A.; IEVERS-LANDIS, CAROLYN E.; LOVEGREEN, LOREN D.; TRAPL, ERIKA S.
2010-01-01
Purpose To compare two different parenting practices (parental monitoring and negotiated unsupervised time) and perceived parental trust in the reporting of health risk behaviors among adolescents. Methods Data were derived from 692 adolescents in 9th and 10th grades (X̄ = 15.7 years) enrolled in health education classes in six urban high schools. Students completed a self-administered paper-based survey that assessed adolescents’ perceptions of the degree to which their parents monitor their whereabouts, are permitted to negotiate unsupervised time with their friends and trust them to make decisions. Using gender-specific multivariate logistic regression analyses, we examined the relative importance of parental monitoring, negotiated unsupervised time with peers, and parental trust in predicting reported sexual activity, sex-related protective actions (e.g., condom use, carrying protection) and substance use (alcohol, tobacco, and marijuana). Results For males and females, increased negotiated unsupervised time was strongly associated with increased risk behavior (e.g., sexual activity, alcohol and marijuana use) but also sex-related protective actions. In males, high parental monitoring was associated with less alcohol use and consistent condom use. Parental monitoring had no affect on female behavior. Perceived parental trust served as a protective factor against sexual activity, tobacco, and marijuana use in females, and alcohol use in males. Conclusions Although monitoring is an important practice for parents of older adolescents, managing their behavior through negotiation of unsupervised time may have mixed results leading to increased experimentation with sexuality and substances, but perhaps in a more responsible way. Trust established between an adolescent female and her parents continues to be a strong deterrent for risky behaviors but appears to have little effect on behaviors of adolescent males. PMID:12890596
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development on clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alterative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm.
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development on clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alterative to the traditional K-means algorithm in single-particle cryo-EM analysis.
Hierarchical Representation Learning for Kinship Verification.
Kohli, Naman; Vatsa, Mayank; Singh, Richa; Noore, Afzel; Majumdar, Angshul
2017-01-01
Kinship verification has a number of applications such as organizing large collections of images and recognizing resemblances among humans. In this paper, first, a human study is conducted to understand the capabilities of human mind and to identify the discriminatory areas of a face that facilitate kinship-cues. The visual stimuli presented to the participants determine their ability to recognize kin relationship using the whole face as well as specific facial regions. The effect of participant gender and age and kin-relation pair of the stimulus is analyzed using quantitative measures such as accuracy, discriminability index d' , and perceptual information entropy. Utilizing the information obtained from the human study, a hierarchical kinship verification via representation learning (KVRL) framework is utilized to learn the representation of different face regions in an unsupervised manner. We propose a novel approach for feature representation termed as filtered contractive deep belief networks (fcDBN). The proposed feature representation encodes relational information present in images using filters and contractive regularization penalty. A compact representation of facial images of kin is extracted as an output from the learned model and a multi-layer neural network is utilized to verify the kin accurately. A new WVU kinship database is created, which consists of multiple images per subject to facilitate kinship verification. The results show that the proposed deep learning framework (KVRL-fcDBN) yields the state-of-the-art kinship verification accuracy on the WVU kinship database and on four existing benchmark data sets. Furthermore, kinship information is used as a soft biometric modality to boost the performance of face verification via product of likelihood ratio and support vector machine based approaches. Using the proposed KVRL-fcDBN framework, an improvement of over 20% is observed in the performance of face verification.
Simultaneous growth of self-patterned carbon nanotube forests with dual height scales
NASA Astrophysics Data System (ADS)
Sam, Ebru Devrim; Kucukayan-Dogu, Gokce; Baykal, Beril; Dalkilic, Zeynep; Rana, Kuldeep; Bengu, Erman
2012-05-01
In this study, we report on a unique, one-step fabrication technique enabling the simultaneous synthesis of vertically aligned multi-walled carbon nanotubes (VA-MWCNTs) with dual height scales through alcohol catalyzed chemical vapor deposition (ACCVD). Regions of VA-MWCNTs with different heights were well separated from each other leading to a self-patterning on the surface. We devised a unique layer-by-layer process for application of catalyst and inhibitor precursors on oxidized Si (100) surfaces before the ACCVD step to achieve a hierarchical arrangement. Patterning could be controlled by adjusting the molarity and application sequence of precursors. Contact angle measurements on these self-patterned surfaces indicated that manipulation of these hierarchical arrays resulted in a wide range of hydrophobic behavior changing from that of a sticky rose petal to a lotus leaf.In this study, we report on a unique, one-step fabrication technique enabling the simultaneous synthesis of vertically aligned multi-walled carbon nanotubes (VA-MWCNTs) with dual height scales through alcohol catalyzed chemical vapor deposition (ACCVD). Regions of VA-MWCNTs with different heights were well separated from each other leading to a self-patterning on the surface. We devised a unique layer-by-layer process for application of catalyst and inhibitor precursors on oxidized Si (100) surfaces before the ACCVD step to achieve a hierarchical arrangement. Patterning could be controlled by adjusting the molarity and application sequence of precursors. Contact angle measurements on these self-patterned surfaces indicated that manipulation of these hierarchical arrays resulted in a wide range of hydrophobic behavior changing from that of a sticky rose petal to a lotus leaf. Electronic supplementary information (ESI) available: Fig. S1; AFM image of the Co-O layer which was first dried at 40 °C and then oxidized at 200 °C. Fig. S2; graph relative to the area of CNT islands for different catalyst configurations. Fig. S3; representative XPS spectra of (a) Si 2p, (b) Al 2p, (c) Fe 2p and (d) Co 2p for a reduced Al/Fe/Al/Co (20/20/20/20) catalyst film (grey line in all figures shows the peak backgrounds and orange line shows the curve fitted). Contact angle movies, Video S1 and Video S2, of Al/Fe/Al/Co samples 40/20/20/20 and 20/40/20/20, respectively. See DOI: 10.1039/c2nr30258f
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm.
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-05-21
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen's temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home.
The neural network classification of false killer whale (Pseudorca crassidens) vocalizations.
Murray, S O; Mercado, E; Roitblat, H L
1998-12-01
This study reports the use of unsupervised, self-organizing neural network to categorize the repertoire of false killer whale vocalizations. Self-organizing networks are capable of detecting patterns in their input and partitioning those patterns into categories without requiring that the number or types of categories be predefined. The inputs for the neural networks were two-dimensional characterization of false killer whale vocalization, where each vocalization was characterized by a sequence of short-time measurements of duty cycle and peak frequency. The first neural network used competitive learning, where units in a competitive layer distributed themselves to recognize frequently presented input vectors. This network resulted in classes representing typical patterns in the vocalizations. The second network was a Kohonen feature map which organized the outputs topologically, providing a graphical organization of pattern relationships. The networks performed well as measured by (1) the average correlation between the input vectors and the weight vectors for each category, and (2) the ability of the networks to classify novel vocalizations. The techniques used in this study could easily be applied to other species and facilitate the development of objective, comprehensive repertoire models.
Unsupervised iterative detection of land mines in highly cluttered environments.
Batman, Sinan; Goutsias, John
2003-01-01
An unsupervised iterative scheme is proposed for land mine detection in heavily cluttered scenes. This scheme is based on iterating hybrid multispectral filters that consist of a decorrelating linear transform coupled with a nonlinear morphological detector. Detections extracted from the first pass are used to improve results in subsequent iterations. The procedure stops after a predetermined number of iterations. The proposed scheme addresses several weaknesses associated with previous adaptations of morphological approaches to land mine detection. Improvement in detection performance, robustness with respect to clutter inhomogeneities, a completely unsupervised operation, and computational efficiency are the main highlights of the method. Experimental results reveal excellent performance.
Bio-Inspired Microsystem for Robust Genetic Assay Recognition
Lue, Jaw-Chyng; Fang, Wai-Chi
2008-01-01
A compact integrated system-on-chip (SoC) architecture solution for robust, real-time, and on-site genetic analysis has been proposed. This microsystem solution is noise-tolerable and suitable for analyzing the weak fluorescence patterns from a PCR prepared dual-labeled DNA microchip assay. In the architecture, a preceding VLSI differential logarithm microchip is designed for effectively computing the logarithm of the normalized input fluorescence signals. A posterior VLSI artificial neural network (ANN) processor chip is used for analyzing the processed signals from the differential logarithm stage. A single-channel logarithmic circuit was fabricated and characterized. A prototype ANN chip with unsupervised winner-take-all (WTA) function was designed, fabricated, and tested. An ANN learning algorithm using a novel sigmoid-logarithmic transfer function based on the supervised backpropagation (BP) algorithm is proposed for robustly recognizing low-intensity patterns. Our results show that the trained new ANN can recognize low-fluorescence patterns better than an ANN using the conventional sigmoid function. PMID:18566679
Social networking patterns/hazards among teenagers.
Machold, C; Judge, G; Mavrinac, A; Elliott, J; Murphy, A M; Roche, E
2012-05-01
Social Networking Sites (SNSs) have grown substantially, posing new hazards to teenagers. This study aimed to determine general patterns of Internet usage among Irish teenagers aged 11-16 years, and to identify potential hazards, including; bullying, inappropriate contact, overuse, addiction and invasion of users' privacy. A cross-sectional study design was employed to survey students at three Irish secondary schools, with a sample of 474 completing a questionnaire. 202 (44%) (n = 460) accessed the Internet using a shared home computer. Two hours or less were spent online daily by 285(62%), of whom 450 (98%) were unsupervised. 306 (72%) (n = 425) reported frequent usage of SNSs, 403 (95%) of whom were Facebook users. 42 (10%) males and 51 (12%) females experienced bullying online, while 114 (27%) reported inappropriate contact from others. Concerning overuse and the risk of addiction, 140 (33%) felt they accessed SNSs too often. These patterns among Irish teenagers suggest that SNS usage poses significant dangers, which are going largely unaddressed.
A New Approach for Mining Order-Preserving Submatrices Based on All Common Subsequences.
Xue, Yun; Liao, Zhengling; Li, Meihang; Luo, Jie; Kuang, Qiuhua; Hu, Xiaohui; Li, Tiechen
2015-01-01
Order-preserving submatrices (OPSMs) have been applied in many fields, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems, as an important unsupervised learning model. Unfortunately, most existing methods are heuristic algorithms which are unable to reveal OPSMs entirely in NP-complete problem. In particular, deep OPSMs, corresponding to long patterns with few supporting sequences, incur explosive computational costs and are completely pruned by most popular methods. In this paper, we propose an exact method to discover all OPSMs based on frequent sequential pattern mining. First, an existing algorithm was adjusted to disclose all common subsequence (ACS) between every two row sequences, and therefore all deep OPSMs will not be missed. Then, an improved data structure for prefix tree was used to store and traverse ACS, and Apriori principle was employed to efficiently mine the frequent sequential pattern. Finally, experiments were implemented on gene and synthetic datasets. Results demonstrated the effectiveness and efficiency of this method.
GibbsCluster: unsupervised clustering and alignment of peptide sequences.
Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten
2017-07-03
Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version incorporating insertion and deletions accounting for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Learning spatially coherent properties of the visual world in connectionist networks
NASA Astrophysics Data System (ADS)
Becker, Suzanna; Hinton, Geoffrey E.
1991-10-01
In the unsupervised learning paradigm, a network of neuron-like units is presented with an ensemble of input patterns from a structured environment, such as the visual world, and learns to represent the regularities in that input. The major goal in developing unsupervised learning algorithms is to find objective functions that characterize the quality of the network's representation without explicitly specifying the desired outputs of any of the units. The sort of objective functions considered cause a unit to become tuned to spatially coherent features of visual images (such as texture, depth, shading, and surface orientation), by learning to predict the outputs of other units which have spatially adjacent receptive fields. Simulations show that using an information-theoretic algorithm called IMAX, a network can be trained to represent depth by observing random dot stereograms of surfaces with continuously varying disparities. Once a layer of depth-tuned units has developed, subsequent layers are trained to perform surface interpolation of curved surfaces, by learning to predict the depth of one image region based on depth measurements in surrounding regions. An extension of the basic model allows a population of competing neurons to learn a distributed code for disparity, which naturally gives rise to a representation of discontinuities.
Vajda, Szilárd; Rangoni, Yves; Cecotti, Hubert
2015-01-01
For training supervised classifiers to recognize different patterns, large data collections with accurate labels are necessary. In this paper, we propose a generic, semi-automatic labeling technique for large handwritten character collections. In order to speed up the creation of a large scale ground truth, the method combines unsupervised clustering and minimal expert knowledge. To exploit the potential discriminant complementarities across features, each character is projected into five different feature spaces. After clustering the images in each feature space, the human expert labels the cluster centers. Each data point inherits the label of its cluster’s center. A majority (or unanimity) vote decides the label of each character image. The amount of human involvement (labeling) is strictly controlled by the number of clusters – produced by the chosen clustering approach. To test the efficiency of the proposed approach, we have compared, and evaluated three state-of-the art clustering methods (k-means, self-organizing maps, and growing neural gas) on the MNIST digit data set, and a Lampung Indonesian character data set, respectively. Considering a k-nn classifier, we show that labeling manually only 1.3% (MNIST), and 3.2% (Lampung) of the training data, provides the same range of performance than a completely labeled data set would. PMID:25870463
Clustervision: Visual Supervision of Unsupervised Clustering.
Kwon, Bum Chul; Eysenbach, Ben; Verma, Janu; Ng, Kenney; De Filippi, Christopher; Stewart, Walter F; Perer, Adam
2018-01-01
Clustering, the process of grouping together similar items into distinct partitions, is a common type of unsupervised machine learning that can be useful for summarizing and aggregating complex multi-dimensional data. However, data can be clustered in many ways, and there exist a large body of algorithms designed to reveal different patterns. While having access to a wide variety of algorithms is helpful, in practice, it is quite difficult for data scientists to choose and parameterize algorithms to get the clustering results relevant for their dataset and analytical tasks. To alleviate this problem, we built Clustervision, a visual analytics tool that helps ensure data scientists find the right clustering among the large amount of techniques and parameters available. Our system clusters data using a variety of clustering techniques and parameters and then ranks clustering results utilizing five quality metrics. In addition, users can guide the system to produce more relevant results by providing task-relevant constraints on the data. Our visual user interface allows users to find high quality clustering results, explore the clusters using several coordinated visualization techniques, and select the cluster result that best suits their task. We demonstrate this novel approach using a case study with a team of researchers in the medical domain and showcase that our system empowers users to choose an effective representation of their complex data.
Mastication Evaluation With Unsupervised Learning: Using an Inertial Sensor-Based System
Lucena, Caroline Vieira; Lacerda, Marcelo; Caldas, Rafael; De Lima Neto, Fernando Buarque
2018-01-01
There is a direct relationship between the prevalence of musculoskeletal disorders of the temporomandibular joint and orofacial disorders. A well-elaborated analysis of the jaw movements provides relevant information for healthcare professionals to conclude their diagnosis. Different approaches have been explored to track jaw movements such that the mastication analysis is getting less subjective; however, all methods are still highly subjective, and the quality of the assessments depends much on the experience of the health professional. In this paper, an accurate and non-invasive method based on a commercial low-cost inertial sensor (MPU6050) to measure jaw movements is proposed. The jaw-movement feature values are compared to the obtained with clinical analysis, showing no statistically significant difference between both methods. Moreover, We propose to use unsupervised paradigm approaches to cluster mastication patterns of healthy subjects and simulated patients with facial trauma. Two techniques were used in this paper to instantiate the method: Kohonen’s Self-Organizing Maps and K-Means Clustering. Both algorithms have excellent performances to process jaw-movements data, showing encouraging results and potential to bring a full assessment of the masticatory function. The proposed method can be applied in real-time providing relevant dynamic information for health-care professionals. PMID:29651365
Vibration control of building structures using self-organizing and self-learning neural networks
NASA Astrophysics Data System (ADS)
Madan, Alok
2005-11-01
Past research in artificial intelligence establishes that artificial neural networks (ANN) are effective and efficient computational processors for performing a variety of tasks including pattern recognition, classification, associative recall, combinatorial problem solving, adaptive control, multi-sensor data fusion, noise filtering and data compression, modelling and forecasting. The paper presents a potentially feasible approach for training ANN in active control of earthquake-induced vibrations in building structures without the aid of teacher signals (i.e. target control forces). A counter-propagation neural network is trained to output the control forces that are required to reduce the structural vibrations in the absence of any feedback on the correctness of the output control forces (i.e. without any information on the errors in output activations of the network). The present study shows that, in principle, the counter-propagation network (CPN) can learn from the control environment to compute the required control forces without the supervision of a teacher (unsupervised learning). Simulated case studies are presented to demonstrate the feasibility of implementing the unsupervised learning approach in ANN for effective vibration control of structures under the influence of earthquake ground motions. The proposed learning methodology obviates the need for developing a mathematical model of structural dynamics or training a separate neural network to emulate the structural response for implementation in practice.
Saeed, Isaam; Tang, Sen-Lin; Halgamuge, Saman K.
2012-01-01
An approach to infer the unknown microbial population structure within a metagenome is to cluster nucleotide sequences based on common patterns in base composition, otherwise referred to as binning. When functional roles are assigned to the identified populations, a deeper understanding of microbial communities can be attained, more so than gene-centric approaches that explore overall functionality. In this study, we propose an unsupervised, model-based binning method with two clustering tiers, which uses a novel transformation of the oligonucleotide frequency-derived error gradient and GC content to generate coarse groups at the first tier of clustering; and tetranucleotide frequency to refine these groups at the secondary clustering tier. The proposed method has a demonstrated improvement over PhyloPythia, S-GSOM, TACOA and TaxSOM on all three benchmarks that were used for evaluation in this study. The proposed method is then applied to a pyrosequenced metagenomic library of mud volcano sediment sampled in southwestern Taiwan, with the inferred population structure validated against complementary sequencing of 16S ribosomal RNA marker genes. Finally, the proposed method was further validated against four publicly available metagenomes, including a highly complex Antarctic whale-fall bone sample, which was previously assumed to be too complex for binning prior to functional analysis. PMID:22180538
Rathleff, C R; Bandholm, T; Spaich, E G; Jorgensen, M; Andreasen, J
2017-01-01
Frailty is a serious condition frequently present in geriatric inpatients that potentially causes serious adverse events. Strength training is acknowledged as a means of preventing or delaying frailty and loss of function in these patients. However, limited hospital resources challenge the amount of supervised training, and unsupervised training could possibly supplement supervised training thereby increasing the total exercise dose during admission. A new valid and reliable technology, the BandCizer, objectively measures the exact training dosage performed. The purpose was to investigate feasibility and acceptability of an unsupervised progressive strength training intervention monitored by BandCizer for frail geriatric inpatients. This feasibility trial included 15 frail inpatients at a geriatric ward. At hospitalization, the patients were prescribed two elastic band exercises to be performed unsupervised once daily. A BandCizer Datalogger enabling measurement of the number of sets, repetitions, and time-under-tension was attached to the elastic band. The patients were instructed in performing strength training: 3 sets of 10 repetitions (10-12 repetition maximum (RM)) with a separation of 2-min pauses and a time-under-tension of 8 s. The feasibility criterion for the unsupervised progressive exercises was that 33% of the recommended number of sets would be performed by at least 30% of patients. In addition, patients and staff were interviewed about their experiences with the intervention. Four (27%) out of 15 patients completed 33% of the recommended number of sets. For the total sample, the average percent of performed sets was 23% and for those who actually trained ( n = 12) 26%. Patients and staff expressed a general positive attitude towards the unsupervised training as an addition to the supervised training sessions. However, barriers were also described-especially constant interruptions. Based on the predefined criterion for feasibility, the unsupervised training was not feasible, although the criterion was almost met. The patients and staff mainly expressed positive attitudes towards the unsupervised training. As even a small training dosage has been shown to improve the physical performance of geriatric inpatients, the proposed intervention might be relevant if the interruptions are decreased in future large-scale trials and if the adherence is increased. ClinicalTrials.gov: NCT02702557, February 29, 2016. Data Protection Agency: 2016-42, February 25, 2016. Ethics Committee: No registration needed, December 8, 2015 (e-mail correspondence).
Use Hierarchical Storage and Analysis to Exploit Intrinsic Parallelism
NASA Astrophysics Data System (ADS)
Zender, C. S.; Wang, W.; Vicente, P.
2013-12-01
Big Data is an ugly name for the scientific opportunities and challenges created by the growing wealth of geoscience data. How to weave large, disparate datasets together to best reveal their underlying properties, to exploit their strengths and minimize their weaknesses, to continually aggregate more information than the world knew yesterday and less than we will learn tomorrow? Data analytics techniques (statistics, data mining, machine learning, etc.) can accelerate pattern recognition and discovery. However, often researchers must, prior to analysis, organize multiple related datasets into a coherent framework. Hierarchical organization permits entire dataset to be stored in nested groups that reflect their intrinsic relationships and similarities. Hierarchical data can be simpler and faster to analyze by coding operators to automatically parallelize processes over isomorphic storage units, i.e., groups. The newest generation of netCDF Operators (NCO) embody this hierarchical approach, while still supporting traditional analysis approaches. We will use NCO to demonstrate the trade-offs involved in processing a prototypical Big Data application (analysis of CMIP5 datasets) using hierarchical and traditional analysis approaches.
Disturbance patterns in a socio-ecological system at multiple scales
G. Zurlini; Kurt H. Riitters; N. Zaccarelli; I. Petrosillo; K.B. Jones; L. Rossi
2006-01-01
Ecological systems with hierarchical organization and non-equilibrium dynamics require multiple-scale analyses to comprehend how a system is structured and to formulate hypotheses about regulatory mechanisms. Characteristic scales in real landscapes are determined by, or at least reflect, the spatial patterns and scales of constraining human interactions with the...
NASA Astrophysics Data System (ADS)
An, Soyoung; Choi, Woochul; Paik, Se-Bum
2015-11-01
Understanding the mechanism of information processing in the human brain remains a unique challenge because the nonlinear interactions between the neurons in the network are extremely complex and because controlling every relevant parameter during an experiment is difficult. Therefore, a simulation using simplified computational models may be an effective approach. In the present study, we developed a general model of neural networks that can simulate nonlinear activity patterns in the hierarchical structure of a neural network system. To test our model, we first examined whether our simulation could match the previously-observed nonlinear features of neural activity patterns. Next, we performed a psychophysics experiment for a simple visual working memory task to evaluate whether the model could predict the performance of human subjects. Our studies show that the model is capable of reproducing the relationship between memory load and performance and may contribute, in part, to our understanding of how the structure of neural circuits can determine the nonlinear neural activity patterns in the human brain.
NASA Astrophysics Data System (ADS)
Keyport, Ren N.; Oommen, Thomas; Martha, Tapas R.; Sajinkumar, K. S.; Gierke, John S.
2018-02-01
A comparative analysis of landslides detected by pixel-based and object-oriented analysis (OOA) methods was performed using very high-resolution (VHR) remotely sensed aerial images for the San Juan La Laguna, Guatemala, which witnessed widespread devastation during the 2005 Hurricane Stan. A 3-band orthophoto of 0.5 m spatial resolution together with a 115 field-based landslide inventory were used for the analysis. A binary reference was assigned with a zero value for landslide and unity for non-landslide pixels. The pixel-based analysis was performed using unsupervised classification, which resulted in 11 different trial classes. Detection of landslides using OOA includes 2-step K-means clustering to eliminate regions based on brightness; elimination of false positives using object properties such as rectangular fit, compactness, length/width ratio, mean difference of objects, and slope angle. Both overall accuracy and F-score for OOA methods outperformed pixel-based unsupervised classification methods in both landslide and non-landslide classes. The overall accuracy for OOA and pixel-based unsupervised classification was 96.5% and 94.3%, respectively, whereas the best F-score for landslide identification for OOA and pixel-based unsupervised methods: were 84.3% and 77.9%, respectively.Results indicate that the OOA is able to identify the majority of landslides with a few false positive when compared to pixel-based unsupervised classification.
A hierarchical model for estimating change in American Woodcock populations
Sauer, J.R.; Link, W.A.; Kendall, W.L.; Kelley, J.R.; Niven, D.K.
2008-01-01
The Singing-Ground Survey (SGS) is a primary source of information on population change for American woodcock (Scolopax minor). We analyzed the SGS using a hierarchical log-linear model and compared the estimates of change and annual indices of abundance to a route regression analysis of SGS data. We also grouped SGS routes into Bird Conservation Regions (BCRs) and estimated population change and annual indices using BCRs within states and provinces as strata. Based on the hierarchical model?based estimates, we concluded that woodcock populations were declining in North America between 1968 and 2006 (trend = -0.9%/yr, 95% credible interval: -1.2, -0.5). Singing-Ground Survey results are generally similar between analytical approaches, but the hierarchical model has several important advantages over the route regression. Hierarchical models better accommodate changes in survey efficiency over time and space by treating strata, years, and observers as random effects in the context of a log-linear model, providing trend estimates that are derived directly from the annual indices. We also conducted a hierarchical model analysis of woodcock data from the Christmas Bird Count and the North American Breeding Bird Survey. All surveys showed general consistency in patterns of population change, but the SGS had the shortest credible intervals. We suggest that population management and conservation planning for woodcock involving interpretation of the SGS use estimates provided by the hierarchical model.
Multiple Coordination Patterns in Infant and Adult Vocalizations
Abney, Drew H.; Warlaumont, Anne S.; Oller, D. Kimbrough; Wallot, Sebastian; Kello, Christopher T.
2017-01-01
The study of vocal coordination between infants and adults has led to important insights into the development of social, cognitive, emotional and linguistic abilities. We used an automatic system to identify vocalizations produced by infants and adults over the course of the day for fifteen infants studied longitudinally during the first two years of life. We measured three different types of vocal coordination: coincidence-based, rate-based, and cluster-based. Coincidence-based and rate-based coordination are established measures in the developmental literature. Cluster-based coordination is new and measures the strength of matching in the degree to which vocalization events occur in hierarchically nested clusters. We investigated whether various coordination patterns differ as a function of vocalization type, whether different coordination patterns provide unique information about the dynamics of vocal interaction, and how the various coordination patterns each relate to infant age. All vocal coordination patterns displayed greater coordination for infant speech-related vocalizations, adults adapted the hierarchical clustering of their vocalizations to match that of infants, and each of the three coordination patterns had unique associations with infant age. Altogether, our results indicate that vocal coordination between infants and adults is multifaceted, suggesting a complex relationship between vocal coordination and the development of vocal communication. PMID:29375276
Jin, Shengyu; Wang, Yixiu; Motlag, Maithilee; Gao, Shengjie; Xu, Jin; Nian, Qiong; Wu, Wenzhuo; Cheng, Gary J
2018-03-01
Ongoing efforts in triboelectric nanogenerators (TENGs) focus on enhancing power generation, but obstacles concerning the economical and cost-effective production of TENGs continue to prevail. Micro-/nanostructure engineering of polymer surfaces has been dominantly utilized for boosting the contact triboelectrification, with deposited metal electrodes for collecting the scavenged energy. Nevertheless, this state-of-the-art approach is limited by the vague potential for producing 3D hierarchical surface structures with conformable coverage of high-quality metal. Laser-shock imprinting (LSI) is emerging as a potentially scalable approach for directly surface patterning of a wide range of metals with 3D nanoscale structures by design, benefiting from the ultrahigh-strain-rate forming process. Here, a TENG device is demonstrated with LSI-processed biomimetic hierarchically structured metal electrodes for efficient harvesting of water-drop energy in the environment. Mimicking and transferring hierarchical microstructures from natural templates, such as leaves, into these water-TENG devices is effective regarding repelling water drops from the device surface, since surface hydrophobicity from these biomicrostructures maximizes the TENG output. Among various leaves' microstructures, hierarchical microstructures from dried bamboo leaves are preferable regarding maximizing power output, which is attributed to their unique structures, containing both dense nanostructures and microscale features, compared with other types of leaves. Also, the triboelectric output is significantly improved by closely mimicking the hydrophobic nature of the leaves in the LSI-processed metal surface after functionalizing it with low-surface-energy self-assembled-monolayers. The approach opens doors to new manufacturable TENG technologies for economically feasible and ecologically friendly production of functional devices with directly patterned 3D biomimic metallic surfaces in energy, electronics, and sensor applications. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Feature Extraction Using an Unsupervised Neural Network
1991-05-03
with this neural netowrk is given and its connection to exploratory projection pursuit methods is established. DD I 2 P JA d 73 EDITIONj Of I NOV 6s...IS OBSOLETE $IN 0102- LF- 014- 6601 SECURITY CLASSIFICATION OF THIS PAGE (When Daoes Enlered) Feature Extraction using an Unsupervised Neural Network
An Unsupervised Method for Uncovering Morphological Chains (Open Access, Publisher’s Version)
2015-03-08
Consortium. Marco Baroni, Johannes Matiasek, and Harald Trost. 2002. Unsupervised discovery of morphologically re- lated words based on orthographic and...Better word representations with re- cursive neural networks for morphology. In CoNLL, Sofia, Bulgaria. Mohamed Maamouri, Ann Bies, Hubert Jin, and Tim
Hierarchically ordered carbon tubes
NASA Astrophysics Data System (ADS)
Pan, Zheng-Wei; Zhu, Hao-Guo; Zhang, Zong-Tao; Im, Hee-Jung; Dai, Sheng; Beach, David B.; Lowndes, Douglas H.
2003-04-01
Micropatterns of hierarchically ordered carbon tubes (i.e., ordered carbon microtubes composed of aligned carbon nanotubes) were grown on a film-like iron/silica substrate consisting of ring-like catalyst patterns. The substrates were prepared by a combined technique, in which the sol-gel method was used to prepare catalyst film and transmission electron microscope grids were used as a shadow mask. In comparison with other techniques that involve sophisticated lithography, this approach represents a simple and low-cost way to the micropatterning of aligned carbon nanotubes.
The inner formal structure of the H-T-P drawings: an exploratory study.
Vass, Z
1998-08-01
The study describes some interrelated patterns of traits of the House-Tree-Person (H-T-P) drawings with the instruments of hierarchical cluster analysis. First, according to the literature 1 7 formal or structural aspects of the projective drawings were collected, after which a detailed manual for coding was compiled. Second, the interrater reliability and the consistency of this manual was tested. Third, the hierarchical cluster structure of the reliable and consistent formal aspects was analysed. Results are: (a) a psychometrically tested coding manual of the investigated formal-structural aspects, each of them illustrated with drawings that showed the highest interrater agreement; and (b) the hierarchic cluster structure of the formal aspects of the H-T-P drawings of "normal" adults.
Fabrication of hierarchically structured superhydrophobic PDMS surfaces by Cu and CuO casting
NASA Astrophysics Data System (ADS)
Migliaccio, Christopher P.; Lazarus, Nathan
2015-10-01
Poly(dimethylsiloxane) (PDMS) films decorated with hierarchically structured pillars are cast from large area copper and copper oxide negative molds. The molds are fabricated using a single patterning step and electroplating. The process of casting structured PDMS films is simpler and cheaper than alternatives based on deep reactive ion etching or laser roughening of bulk silicone. Texture imparted to the pillars from the mold walls renders the PDMS films superhydrophobic, with the contact angle/hysteresis of the most non-wetting surfaces measuring 164°/9° and 158°/10° for surfaces with and without application of a low surface energy coating. The usefulness of patterned PDMS films as a "self-cleaning" solar cell module covering is demonstrated and other applications are discussed.
Nee, Derek Evan; Brown, Joshua W.
2013-01-01
Recent theories propose that the prefrontal cortex (PFC) is organized in a hierarchical fashion with more abstract, higher level information represented in anterior regions and more concrete, lower level information represented in posterior regions. This hierarchical organization affords flexible adjustments of action plans based on the context. Computational models suggest that such hierarchical organization in the PFC is achieved through interactions with the basal ganglia (BG) wherein the BG gate relevant contexts into the PFC. Here, we tested this proposal using functional magnetic resonance imaging (fMRI). Participants were scanned while updating working memory (WM) with 2 levels of hierarchical contexts. Consistent with PFC abstraction proposals, higher level context updates involved anterior portions of the PFC (BA 46), whereas lower level context updates involved posterior portions of the PFC (BA 6). Computational models were only partially supported as the BG were sensitive to higher, but not lower level context updates. The posterior parietal cortex (PPC) showed the opposite pattern. Analyses examining changes in functional connectivity confirmed dissociable roles of the anterior PFC–BG during higher level context updates and posterior PFC–PPC during lower level context updates. These results suggest that hierarchical contexts are organized by distinct frontal–striatal and frontal–parietal networks. PMID:22798339
Evaluating, Comparing, and Interpreting Protein Domain Hierarchies
2014-01-01
Abstract Arranging protein domain sequences hierarchically into evolutionarily divergent subgroups is important for investigating evolutionary history, for speeding up web-based similarity searches, for identifying sequence determinants of protein function, and for genome annotation. However, whether or not a particular hierarchy is optimal is often unclear, and independently constructed hierarchies for the same domain can often differ significantly. This article describes methods for statistically evaluating specific aspects of a hierarchy, for probing the criteria underlying its construction and for direct comparisons between hierarchies. Information theoretical notions are used to quantify the contributions of specific hierarchical features to the underlying statistical model. Such features include subhierarchies, sequence subgroups, individual sequences, and subgroup-associated signature patterns. Underlying properties are graphically displayed in plots of each specific feature's contributions, in heat maps of pattern residue conservation, in “contrast alignments,” and through cross-mapping of subgroups between hierarchies. Together, these approaches provide a deeper understanding of protein domain functional divergence, reveal uncertainties caused by inconsistent patterns of sequence conservation, and help resolve conflicts between competing hierarchies. PMID:24559108
Kim, Jeong; Kim, Sun Il; Cho, Seong-Ho; Hwang, Sungwoo; Lee, Young Hee; Hur, Jaehyun
2015-11-01
We report on new fabrication methods for a transparent, hierarchical, and patterned electrode comprised of either carbon nanotubes or zinc oxide nanorods. Vertically aligned carbon nanotubes or zinc oxide nanorod arrays were fabricated by either chemical vapor deposition or hydrothermal growth, in combination with photolithography. A transparent conductive graphene layer or zinc oxide seed layer was employed as the transparent electrode. On the patterned surface defined using photoresist, the vertically grown carbon nanotubes or zinc oxides could produce a concentrated electric field under applied DC voltage. This periodic electric field was used to align liquid crystal molecules in localized areas within the optical cell, effectively modulating the refractive index. Depending on the material and morphology of these patterned electrodes, the diffraction efficiency presented different behavior. From this study, we established the relationship between the hierarchical structure of the different electrodes and their efficiency for modulating the refractive index. We believe that this study will pave a new path for future optoelectronic applications.
Hinckley, Christopher A; Alaynick, William A; Gallarda, Benjamin W; Hayashi, Marito; Hilde, Kathryn L; Driscoll, Shawn P; Dekker, Joseph D; Tucker, Haley O; Sharpee, Tatyana O; Pfaff, Samuel L
2015-09-02
The coordination of multi-muscle movements originates in the circuitry that regulates the firing patterns of spinal motorneurons. Sensory neurons rely on the musculotopic organization of motorneurons to establish orderly connections, prompting us to examine whether the intraspinal circuitry that coordinates motor activity likewise uses cell position as an internal wiring reference. We generated a motorneuron-specific GCaMP6f mouse line and employed two-photon imaging to monitor the activity of lumbar motorneurons. We show that the central pattern generator neural network coordinately drives rhythmic columnar-specific motorneuron bursts at distinct phases of the locomotor cycle. Using multiple genetic strategies to perturb the subtype identity and orderly position of motorneurons, we found that neurons retained their rhythmic activity-but cell position was decoupled from the normal phasing pattern underlying flexion and extension. These findings suggest a hierarchical basis of motor circuit formation that relies on increasingly stringent matching of neuronal identity and position. Copyright © 2015 Elsevier Inc. All rights reserved.
DNA methylation-based reclassification of olfactory neuroblastoma.
Capper, David; Engel, Nils W; Stichel, Damian; Lechner, Matt; Glöss, Stefanie; Schmid, Simone; Koelsche, Christian; Schrimpf, Daniel; Niesen, Judith; Wefers, Annika K; Jones, David T W; Sill, Martin; Weigert, Oliver; Ligon, Keith L; Olar, Adriana; Koch, Arend; Forster, Martin; Moran, Sebastian; Tirado, Oscar M; Sáinz-Japeado, Miguel; Mora, Jaume; Esteller, Manel; Alonso, Javier; Del Muro, Xavier Garcia; Paulus, Werner; Felsberg, Jörg; Reifenberger, Guido; Glatzel, Markus; Frank, Stephan; Monoranu, Camelia M; Lund, Valerie J; von Deimling, Andreas; Pfister, Stefan; Buslei, Rolf; Ribbat-Idel, Julika; Perner, Sven; Gudziol, Volker; Meinhardt, Matthias; Schüller, Ulrich
2018-05-05
Olfactory neuroblastoma/esthesioneuroblastoma (ONB) is an uncommon neuroectodermal neoplasm thought to arise from the olfactory epithelium. Little is known about its molecular pathogenesis. For this study, a retrospective cohort of n = 66 tumor samples with the institutional diagnosis of ONB was analyzed by immunohistochemistry, genome-wide DNA methylation profiling, copy number analysis, and in a subset, next-generation panel sequencing of 560 tumor-associated genes. DNA methylation profiles were compared to those of relevant differential diagnoses of ONB. Unsupervised hierarchical clustering analysis of DNA methylation data revealed four subgroups among institutionally diagnosed ONB. The largest group (n = 42, 64%, Core ONB) presented with classical ONB histology and no overlap with other classes upon methylation profiling-based t-distributed stochastic neighbor embedding (t-SNE) analysis. A second DNA methylation group (n = 7, 11%) with CpG island methylator phenotype (CIMP) consisted of cases with strong expression of cytokeratin, no or scarce chromogranin A expression and IDH2 hotspot mutation in all cases. T-SNE analysis clustered these cases together with sinonasal carcinoma with IDH2 mutation. Four cases (6%) formed a small group characterized by an overall high level of DNA methylation, but without CIMP. The fourth group consisted of 13 cases that had heterogeneous DNA methylation profiles and strong cytokeratin expression in most cases. In t-SNE analysis, these cases mostly grouped among sinonasal adenocarcinoma, squamous cell carcinoma, and undifferentiated carcinoma. Copy number analysis indicated highly recurrent chromosomal changes among Core ONB with a high frequency of combined loss of chromosome 1-4, 8-10, and 12. NGS sequencing did not reveal highly recurrent mutations in ONB, with the only recurrently mutated genes being TP53 and DNMT3A. In conclusion, we demonstrate that institutionally diagnosed ONB are a heterogeneous group of tumors. Expression of cytokeratin, chromogranin A, the mutational status of IDH2 as well as DNA methylation patterns may greatly aid in the precise classification of ONB.
Choudhury, Javed Hussain; Ghosh, Sankar Kumar
2015-01-01
Background Epigenetic and genetic alteration plays a major role to the development of head and neck squamous cell carcinoma (HNSCC). Consumption of tobacco (smoking/chewing) and human papilloma virus (HPV) are also associated with an increase the risk of HNSCC. Promoter hypermethylation of the tumor suppression genes is related with transcriptional inactivation and loss of gene expression. We investigated epigenetic alteration (promoter methylation of tumor-related genes/loci) in tumor tissues in the context of genetic alteration, viral infection, and tobacco exposure and survival status. Methodology The study included 116 tissue samples (71 tumor and 45 normal tissues) from the Northeast Indian population. Methylation specific polymerase chain reaction (MSP) was used to determine the methylation status of 10 tumor-related genes/loci (p16, DAPK, RASSF1, BRAC1, GSTP1, ECAD, MLH1, MINT1, MINT2 and MINT31). Polymorphisms of CYP1A1, GST (M1 & T1), XRCC1and XRCC2 genes were studied by using polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP) and multiplex-PCR respectively. Principal Findings Unsupervised hierarchical clustering analysis based on methylation pattern had identified two tumor clusters, which significantly differ by CpG island methylator phenotype (CIMP), tobacco, GSTM1, CYP1A1, HPV and survival status. Analyzing methylation of genes/loci individually, we have found significant higher methylation of DAPK, RASSF1, p16 and MINT31genes (P = 0.031, 0.013, 0.031 and 0.015 respectively) in HPV (+) cases compared to HPV (-). Furthermore, a CIMP-high and Cluster-1 characteristic was also associated with poor survival. Conclusions Promoter methylation profiles reflecting a correlation with tobacco, HPV, survival status and genetic alteration and may act as a marker to determine subtypes and patient outcome in HNSCC. PMID:26098903
BiomeNet: A Bayesian Model for Inference of Metabolic Divergence among Microbial Communities
Chipman, Hugh; Gu, Hong; Bielawski, Joseph P.
2014-01-01
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group. Through this framework, the model can capture nested structures within the data. BiomeNet is unique in modeling each metagenome sample as a mixture of complex metabolic systems (metabosystems). The metabosystems are composed of mixtures of tightly connected metabolic subnetworks. BiomeNet differs from other unsupervised methods by allowing researchers to discriminate groups of samples through the metabolic patterns it discovers in the data, and by providing a framework for interpreting them. We describe a collapsed Gibbs sampler for inference of the mixture weights under BiomeNet, and we use simulation to validate the inference algorithm. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among inflammatory bowel disease (IBD) patients. Based on the discriminatory subnetworks for this metabosystem, we inferred that the community is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and interfere with human uptake of an antioxidant connected to IBD. Because this metabosystem has a greater capacity to exploit host-associated glycans, we speculate that IBD-associated communities might arise from opportunist growth of bacteria that can circumvent the host's nutrient-based mechanism for bacterial partner selection. PMID:25412107
Using Machine Learning for Advanced Anomaly Detection and Classification
NASA Astrophysics Data System (ADS)
Lane, B.; Poole, M.; Camp, M.; Murray-Krezan, J.
2016-09-01
Machine Learning (ML) techniques have successfully been used in a wide variety of applications to automatically detect and potentially classify changes in activity, or a series of activities by utilizing large amounts data, sometimes even seemingly-unrelated data. The amount of data being collected, processed, and stored in the Space Situational Awareness (SSA) domain has grown at an exponential rate and is now better suited for ML. This paper describes development of advanced algorithms to deliver significant improvements in characterization of deep space objects and indication and warning (I&W) using a global network of telescopes that are collecting photometric data on a multitude of space-based objects. The Phase II Air Force Research Laboratory (AFRL) Small Business Innovative Research (SBIR) project Autonomous Characterization Algorithms for Change Detection and Characterization (ACDC), contracted to ExoAnalytic Solutions Inc. is providing the ability to detect and identify photometric signature changes due to potential space object changes (e.g. stability, tumble rate, aspect ratio), and correlate observed changes to potential behavioral changes using a variety of techniques, including supervised learning. Furthermore, these algorithms run in real-time on data being collected and processed by the ExoAnalytic Space Operations Center (EspOC), providing timely alerts and warnings while dynamically creating collection requirements to the EspOC for the algorithms that generate higher fidelity I&W. This paper will discuss the recently implemented ACDC algorithms, including the general design approach and results to date. The usage of supervised algorithms, such as Support Vector Machines, Neural Networks, k-Nearest Neighbors, etc., and unsupervised algorithms, for example k-means, Principle Component Analysis, Hierarchical Clustering, etc., and the implementations of these algorithms is explored. Results of applying these algorithms to EspOC data both in an off-line "pattern of life" analysis as well as using the algorithms on-line in real-time, meaning as data is collected, will be presented. Finally, future work in applying ML for SSA will be discussed.
Spatial scaling of non-native fish richness across the United States
Qinfeng Guo; Julian D. Olden
2014-01-01
A major goal and challenge of invasion ecology is to describe and interpret spatial and temporal patterns of species invasions. Here, we examined fish invasion patterns at four spatially structured and hierarchically nested scales across the contiguous United States (i.e., from large to small: region, basin, watershed, and sub-watershed). All spatial relationships in...
ERIC Educational Resources Information Center
Butz, Martin V.; Herbort, Oliver; Hoffmann, Joachim
2007-01-01
Autonomously developing organisms face several challenges when learning reaching movements. First, motor control is learned unsupervised or self-supervised. Second, knowledge of sensorimotor contingencies is acquired in contexts in which action consequences unfold in time. Third, motor redundancies must be resolved. To solve all 3 of these…
Bilingual Lexical Interactions in an Unsupervised Neural Network Model
ERIC Educational Resources Information Center
Zhao, Xiaowei; Li, Ping
2010-01-01
In this paper we present an unsupervised neural network model of bilingual lexical development and interaction. We focus on how the representational structures of the bilingual lexicons can emerge, develop, and interact with each other as a function of the learning history. The results show that: (1) distinct representations for the two lexicons…
Subtyping of Toddlers with ASD Based on Patterns of Social Attention Deficits
2014-10-01
AWARD NUMBER: W81XWH-13-1-0179 TITLE: Subtyping of Toddlers with ASD Based on Patterns of Social Attention Deficits PRINCIPAL INVESTIGATOR...TITLE AND SUBTITLE Subtyping of Toddlers with ASD Based on Patterns of Social Attention Deficits 5a. CONTRACT NUMBER 5b. GRANT NUMBER W81XWH-13-1-0179...tracking data. 15. SUBJECT TERMS ASD, subgrouping, toddlers , heterogeneity, eye-tracking, visual attention, dyadic orienting, hierarchical
Hadley, Wendy; Houck, Christopher D; Barker, David; Senocak, Natali
2015-06-01
The purpose of this study was to examine the moderating influence of parental monitoring (e.g., unsupervised time with opposite sex peers) and adolescent emotional competence on sexual behaviors, among a sample of at-risk early adolescents. This study included 376 seventh-grade adolescents (age, 12-14 years) with behavioral or emotional difficulties. Questionnaires were completed on private laptop computers and assessed adolescent Emotional Competence (including Regulation and Negativity/Lability), Unsupervised Time, and a range of Sexual Behaviors. Generalized linear models were used to evaluate the independent and combined influence of Emotional Competency and Unsupervised Time on adolescent report of Sexual Behaviors. Analyses were stratified by gender to account for the notable gender differences in the targeted moderators and outcome variables. Findings indicated that more unsupervised time was a risk factor for all youth but was influenced by an adolescent's ability to regulate their emotions. Specifically, for males and females, poorer Emotion Regulation was associated with having engaged in a greater variety of Sexual Behaviors. However, lower Negativity/Lability and >1× per week Unsupervised Time were associated with a higher number of sexual behaviors among females only. Based on the findings of this study, a lack of parental supervision seems to be particularly problematic for both male and female adolescents with poor emotion regulation abilities. It may be important to impact both emotion regulation abilities and increase parental knowledge and skills associated with effective monitoring to reduce risk-taking for these youth.
A Novel Unsupervised Segmentation Quality Evaluation Method for Remote Sensing Images
Tang, Yunwei; Jing, Linhai; Ding, Haifeng
2017-01-01
The segmentation of a high spatial resolution remote sensing image is a critical step in geographic object-based image analysis (GEOBIA). Evaluating the performance of segmentation without ground truth data, i.e., unsupervised evaluation, is important for the comparison of segmentation algorithms and the automatic selection of optimal parameters. This unsupervised strategy currently faces several challenges in practice, such as difficulties in designing effective indicators and limitations of the spectral values in the feature representation. This study proposes a novel unsupervised evaluation method to quantitatively measure the quality of segmentation results to overcome these problems. In this method, multiple spectral and spatial features of images are first extracted simultaneously and then integrated into a feature set to improve the quality of the feature representation of ground objects. The indicators designed for spatial stratified heterogeneity and spatial autocorrelation are included to estimate the properties of the segments in this integrated feature set. These two indicators are then combined into a global assessment metric as the final quality score. The trade-offs of the combined indicators are accounted for using a strategy based on the Mahalanobis distance, which can be exhibited geometrically. The method is tested on two segmentation algorithms and three testing images. The proposed method is compared with two existing unsupervised methods and a supervised method to confirm its capabilities. Through comparison and visual analysis, the results verified the effectiveness of the proposed method and demonstrated the reliability and improvements of this method with respect to other methods. PMID:29064416
Developmental Self-Construction and -Configuration of Functional Neocortical Neuronal Networks
Bauer, Roman; Zubler, Frédéric; Pfister, Sabina; Hauri, Andreas; Pfeiffer, Michael; Muir, Dylan R.; Douglas, Rodney J.
2014-01-01
The prenatal development of neural circuits must provide sufficient configuration to support at least a set of core postnatal behaviors. Although knowledge of various genetic and cellular aspects of development is accumulating rapidly, there is less systematic understanding of how these various processes play together in order to construct such functional networks. Here we make some steps toward such understanding by demonstrating through detailed simulations how a competitive co-operative (‘winner-take-all’, WTA) network architecture can arise by development from a single precursor cell. This precursor is granted a simplified gene regulatory network that directs cell mitosis, differentiation, migration, neurite outgrowth and synaptogenesis. Once initial axonal connection patterns are established, their synaptic weights undergo homeostatic unsupervised learning that is shaped by wave-like input patterns. We demonstrate how this autonomous genetically directed developmental sequence can give rise to self-calibrated WTA networks, and compare our simulation results with biological data. PMID:25474693
Color normalization of histology slides using graph regularized sparse NMF
NASA Astrophysics Data System (ADS)
Sha, Lingdao; Schonfeld, Dan; Sethi, Amit
2017-03-01
Computer based automatic medical image processing and quantification are becoming popular in digital pathology. However, preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To re- duce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised color normalization methods have advantages of time and cost efficient and universal applicability. Most of the unsupervised color normaliza- tion methods for histology are based on stain separation. Based on the fact that stain concentration cannot be negative and different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization method like PCA, ICA, NMF and SNMF fail to consider important information about sparse manifolds that its pixels occupy, which could potentially result in loss of texture information during color normalization. Manifold learning methods like Graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in keeping connected texture information. To utilized the texture information, we construct a nearest neighbor graph between pixels within a spatial area of an image based on their distances using heat kernal in lαβ space. The representation of a pixel in the stain density space is constrained to follow the feature distance of the pixel to pixels in the neighborhood graph. Utilizing color matrix transfer method with the stain concentrations found using our GSNMF method, the color normalization performance was also better than existing methods.
Unsupervised discovery of information structure in biomedical documents.
Kiela, Douwe; Guo, Yufan; Stenius, Ulla; Korhonen, Anna
2015-04-01
Information structure (IS) analysis is a text mining technique, which classifies text in biomedical articles into categories that capture different types of information, such as objectives, methods, results and conclusions of research. It is a highly useful technique that can support a range of Biomedical Text Mining tasks and can help readers of biomedical literature find information of interest faster, accelerating the highly time-consuming process of literature review. Several approaches to IS analysis have been presented in the past, with promising results in real-world biomedical tasks. However, all existing approaches, even weakly supervised ones, require several hundreds of hand-annotated training sentences specific to the domain in question. Because biomedicine is subject to considerable domain variation, such annotations are expensive to obtain. This makes the application of IS analysis across biomedical domains difficult. In this article, we investigate an unsupervised approach to IS analysis and evaluate the performance of several unsupervised methods on a large corpus of biomedical abstracts collected from PubMed. Our best unsupervised algorithm (multilevel-weighted graph clustering algorithm) performs very well on the task, obtaining over 0.70 F scores for most IS categories when applied to well-known IS schemes. This level of performance is close to that of lightly supervised IS methods and has proven sufficient to aid a range of practical tasks. Thus, using an unsupervised approach, IS could be applied to support a wide range of tasks across sub-domains of biomedicine. We also demonstrate that unsupervised learning brings novel insights into IS of biomedical literature and discovers information categories that are not present in any of the existing IS schemes. The annotated corpus and software are available at http://www.cl.cam.ac.uk/∼dk427/bio14info.html. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Smart, Daniel J; Gill, Nicholas D
2013-03-01
The aims of the study were to determine if a supervised off-season conditioning program enhanced gains in physical characteristics compared with the same program performed in an unsupervised manner and to establish the persistence of the physical changes after a 6-month unsupervised competition period. Forty-four provincial representative adolescent rugby union players (age, mean ± SD, 15.3 ± 1.3 years) participated in a 15-week off-season conditioning program either under supervision from an experienced strength and conditioning coach or unsupervised. Measures of body composition, strength, vertical jump, speed, and anaerobic and aerobic running performance were taken, before, immediately after, and 6 months after the conditioning. Post conditioning program the supervised group had greater improvements in all strength measures than the unsupervised group, with small, moderate and large differences between the groups\\x{2019} changes for chin-ups (9.1%; ± 11.6%), bench-press (16.9%; ± 11.7%) and box-squat (50.4%; ± 20.9%) estimated 1RM respectively. Both groups showed trivial increases in mass; however increases in fat free mass were small and trivial for supervised and unsupervised players respectively. Strength declined in the supervised group while the unsupervised group had small increases during the competition phase, resulting in only a small difference between the long-term changes in box-squat 1RM (15.9%; ± 13.2%). The supervised group had further small increases in fat free mass resulting in a small difference (2.4%; ± 2.7%) in the long-term changes. The postconditioning differences between the 2 groups may have been a result of increased adherence and the attainment of higher training loads during supervised training. The lack of differences in strength after the competition period indicates that supervision should be maintained to reduce substantial decrements in performance.
Hierarchical singleton-type recurrent neural fuzzy networks for noisy speech recognition.
Juang, Chia-Feng; Chiou, Chyi-Tian; Lai, Chun-Lung
2007-05-01
This paper proposes noisy speech recognition using hierarchical singleton-type recurrent neural fuzzy networks (HSRNFNs). The proposed HSRNFN is a hierarchical connection of two singleton-type recurrent neural fuzzy networks (SRNFNs), where one is used for noise filtering and the other for recognition. The SRNFN is constructed by recurrent fuzzy if-then rules with fuzzy singletons in the consequences, and their recurrent properties make them suitable for processing speech patterns with temporal characteristics. In n words recognition, n SRNFNs are created for modeling n words, where each SRNFN receives the current frame feature and predicts the next one of its modeling word. The prediction error of each SRNFN is used as recognition criterion. In filtering, one SRNFN is created, and each SRNFN recognizer is connected to the same SRNFN filter, which filters noisy speech patterns in the feature domain before feeding them to the SRNFN recognizer. Experiments with Mandarin word recognition under different types of noise are performed. Other recognizers, including multilayer perceptron (MLP), time-delay neural networks (TDNNs), and hidden Markov models (HMMs), are also tested and compared. These experiments and comparisons demonstrate good results with HSRNFN for noisy speech recognition tasks.
Retrieval Capabilities of Hierarchical Networks: From Dyson to Hopfield
NASA Astrophysics Data System (ADS)
Agliari, Elena; Barra, Adriano; Galluzzi, Andrea; Guerra, Francesco; Tantari, Daniele; Tavani, Flavia
2015-01-01
We consider statistical-mechanics models for spin systems built on hierarchical structures, which provide a simple example of non-mean-field framework. We show that the coupling decay with spin distance can give rise to peculiar features and phase diagrams much richer than their mean-field counterpart. In particular, we consider the Dyson model, mimicking ferromagnetism in lattices, and we prove the existence of a number of metastabilities, beyond the ordered state, which become stable in the thermodynamic limit. Such a feature is retained when the hierarchical structure is coupled with the Hebb rule for learning, hence mimicking the modular architecture of neurons, and gives rise to an associative network able to perform single pattern retrieval as well as multiple-pattern retrieval, depending crucially on the external stimuli and on the rate of interaction decay with distance; however, those emergent multitasking features reduce the network capacity with respect to the mean-field counterpart. The analysis is accomplished through statistical mechanics, Markov chain theory, signal-to-noise ratio technique, and numerical simulations in full consistency. Our results shed light on the biological complexity shown by real networks, and suggest future directions for understanding more realistic models.
Hierarchical ultrathin alumina membrane for the fabrication of unique nanodot arrays
NASA Astrophysics Data System (ADS)
Wang, Yuyang; Wang, Yi; Wang, Hailong; Wang, Xinnan; Cong, Ming; Xu, Weiqing; Xu, Shuping
2016-01-01
Ultrathin alumina membranes (UTAMs) as evaporation masks have been a powerful tool for the fabrication of high-density nanodot arrays and have received much attention in magnetic memory devices, photovoltaics, and nanoplasmonics. In this paper, we report the fabrication of a hierarchical ultrathin alumina membrane (HUTAM) with highly ordered submicro/nanoscale channels and its application as an evaporation mask for the realization of unique non-hexagonal nanodot arrays dependent on the geometrical features of the HUTAM. This is the first report of a UTAM with a hierarchical geometry, breaking the stereotype that only limited sets of nanopatterns can be realized using the UTAM method (with typical inter-pore distance of 100 nm). The fabrication of a HUTAM is discussed in detail. An improved, longer wet etching time than previously reported is found to effectively remove the barrier layer and widen the pores of a HUTAM. A growth sustainability issue brought about by pre-patterning is discussed. Spectral comparison was made to distinguish the UTAM nanodots and HUTAM nanodots. Our results can be an inspiration for more sophisticated applications of pre-patterned anodized aluminum oxide in photocatalysis, photovoltaics, and nanoplasmonics.
Architecture of the parallel hierarchical network for fast image recognition
NASA Astrophysics Data System (ADS)
Timchenko, Leonid; Wójcik, Waldemar; Kokriatskaia, Natalia; Kutaev, Yuriy; Ivasyuk, Igor; Kotyra, Andrzej; Smailova, Saule
2016-09-01
Multistage integration of visual information in the brain allows humans to respond quickly to most significant stimuli while maintaining their ability to recognize small details in the image. Implementation of this principle in technical systems can lead to more efficient processing procedures. The multistage approach to image processing includes main types of cortical multistage convergence. The input images are mapped into a flexible hierarchy that reflects complexity of image data. Procedures of the temporal image decomposition and hierarchy formation are described in mathematical expressions. The multistage system highlights spatial regularities, which are passed through a number of transformational levels to generate a coded representation of the image that encapsulates a structure on different hierarchical levels in the image. At each processing stage a single output result is computed to allow a quick response of the system. The result is presented as an activity pattern, which can be compared with previously computed patterns on the basis of the closest match. With regard to the forecasting method, its idea lies in the following. In the results synchronization block, network-processed data arrive to the database where a sample of most correlated data is drawn using service parameters of the parallel-hierarchical network.
Testolin, Alberto; Zorzi, Marco
2016-01-01
Connectionist models can be characterized within the more general framework of probabilistic graphical models, which allow to efficiently describe complex statistical distributions involving a large number of interacting variables. This integration allows building more realistic computational models of cognitive functions, which more faithfully reflect the underlying neural mechanisms at the same time providing a useful bridge to higher-level descriptions in terms of Bayesian computations. Here we discuss a powerful class of graphical models that can be implemented as stochastic, generative neural networks. These models overcome many limitations associated with classic connectionist models, for example by exploiting unsupervised learning in hierarchical architectures (deep networks) and by taking into account top-down, predictive processing supported by feedback loops. We review some recent cognitive models based on generative networks, and we point out promising research directions to investigate neuropsychological disorders within this approach. Though further efforts are required in order to fill the gap between structured Bayesian models and more realistic, biophysical models of neuronal dynamics, we argue that generative neural networks have the potential to bridge these levels of analysis, thereby improving our understanding of the neural bases of cognition and of pathologies caused by brain damage. PMID:27468262
Lahiri, A; Roy, Abhijit Guha; Sheet, Debdoot; Biswas, Prabir Kumar
2016-08-01
Automated segmentation of retinal blood vessels in label-free fundus images entails a pivotal role in computed aided diagnosis of ophthalmic pathologies, viz., diabetic retinopathy, hypertensive disorders and cardiovascular diseases. The challenge remains active in medical image analysis research due to varied distribution of blood vessels, which manifest variations in their dimensions of physical appearance against a noisy background. In this paper we formulate the segmentation challenge as a classification task. Specifically, we employ unsupervised hierarchical feature learning using ensemble of two level of sparsely trained denoised stacked autoencoder. First level training with bootstrap samples ensures decoupling and second level ensemble formed by different network architectures ensures architectural revision. We show that ensemble training of auto-encoders fosters diversity in learning dictionary of visual kernels for vessel segmentation. SoftMax classifier is used for fine tuning each member autoencoder and multiple strategies are explored for 2-level fusion of ensemble members. On DRIVE dataset, we achieve maximum average accuracy of 95.33% with an impressively low standard deviation of 0.003 and Kappa agreement coefficient of 0.708. Comparison with other major algorithms substantiates the high efficacy of our model.
Floris, Patrick; McGillicuddy, Nicola; Albrecht, Simone; Morrissey, Brian; Kaisermayer, Christian; Lindeberg, Anna; Bones, Jonathan
2017-09-19
An untargeted LC-MS/MS platform was implemented for monitoring variations in CHO cell culture media upon exposure to high temperature short time (HTST) treatment, a commonly used viral clearance upstream strategy. Chemically defined (CD) and hydrolysate-supplemented media formulations were not visibly altered by the treatment. The absence of solute precipitation effects during media treatment and very modest shifts in pH values observed indicated sufficient compatibility of the formulations evaluated with the HTST-processing conditions. Unsupervised chemometric analysis of LC-MS/MS data, however, revealed clear separation of HTST-treated samples from untreated counterparts as observed from analysis of principal components and hierarchical clustering sample grouping. An increased presence of Maillard products in HTST-treated formulations contributed to the observed differences which included organic acids, observed particularly in chemically defined formulations, and furans, pyridines, pyrazines, and pyrrolidines which were determined in hydrolysate-supplemented formulations. The presence of Maillard products in media did not affect cell culture performance with similar growth and viability profiles observed for CHO-K1 and CHO-DP12 cells when cultured using both HTST-treated and untreated media formulations.
Learning a Dictionary of Shape Epitomes with Applications to Image Labeling
Chen, Liang-Chieh; Papandreou, George; Yuille, Alan L.
2015-01-01
The first main contribution of this paper is a novel method for representing images based on a dictionary of shape epitomes. These shape epitomes represent the local edge structure of the image and include hidden variables to encode shift and rotations. They are learnt in an unsupervised manner from groundtruth edges. This dictionary is compact but is also able to capture the typical shapes of edges in natural images. In this paper, we illustrate the shape epitomes by applying them to the image labeling task. In other work, described in the supplementary material, we apply them to edge detection and image modeling. We apply shape epitomes to image labeling by using Conditional Random Field (CRF) Models. They are alternatives to the superpixel or pixel representations used in most CRFs. In our approach, the shape of an image patch is encoded by a shape epitome from the dictionary. Unlike the superpixel representation, our method avoids making early decisions which cannot be reversed. Our resulting hierarchical CRFs efficiently capture both local and global class co-occurrence properties. We demonstrate its quantitative and qualitative properties of our approach with image labeling experiments on two standard datasets: MSRC-21 and Stanford Background. PMID:26321886
Convolutional neural network features based change detection in satellite images
NASA Astrophysics Data System (ADS)
Mohammed El Amin, Arabi; Liu, Qingjie; Wang, Yunhong
2016-07-01
With the popular use of high resolution remote sensing (HRRS) satellite images, a huge research efforts have been placed on change detection (CD) problem. An effective feature selection method can significantly boost the final result. While hand-designed features have proven difficulties to design features that effectively capture high and mid-level representations, the recent developments in machine learning (Deep Learning) omit this problem by learning hierarchical representation in an unsupervised manner directly from data without human intervention. In this letter, we propose approaching the change detection problem from a feature learning perspective. A novel deep Convolutional Neural Networks (CNN) features based HR satellite images change detection method is proposed. The main guideline is to produce a change detection map directly from two images using a pretrained CNN. This method can omit the limited performance of hand-crafted features. Firstly, CNN features are extracted through different convolutional layers. Then, a concatenation step is evaluated after an normalization step, resulting in a unique higher dimensional feature map. Finally, a change map was computed using pixel-wise Euclidean distance. Our method has been validated on real bitemporal HRRS satellite images according to qualitative and quantitative analyses. The results obtained confirm the interest of the proposed method.
Deeply learnt hashing forests for content based image retrieval in prostate MR images
NASA Astrophysics Data System (ADS)
Shah, Amit; Conjeti, Sailesh; Navab, Nassir; Katouzian, Amin
2016-03-01
Deluge in the size and heterogeneity of medical image databases necessitates the need for content based retrieval systems for their efficient organization. In this paper, we propose such a system to retrieve prostate MR images which share similarities in appearance and content with a query image. We introduce deeply learnt hashing forests (DL-HF) for this image retrieval task. DL-HF effectively leverages the semantic descriptiveness of deep learnt Convolutional Neural Networks. This is used in conjunction with hashing forests which are unsupervised random forests. DL-HF hierarchically parses the deep-learnt feature space to encode subspaces with compact binary code words. We propose a similarity preserving feature descriptor called Parts Histogram which is derived from DL-HF. Correlation defined on this descriptor is used as a similarity metric for retrieval from the database. Validations on publicly available multi-center prostate MR image database established the validity of the proposed approach. The proposed method is fully-automated without any user-interaction and is not dependent on any external image standardization like image normalization and registration. This image retrieval method is generalizable and is well-suited for retrieval in heterogeneous databases other imaging modalities and anatomies.
Testolin, Alberto; Zorzi, Marco
2016-01-01
Connectionist models can be characterized within the more general framework of probabilistic graphical models, which allow to efficiently describe complex statistical distributions involving a large number of interacting variables. This integration allows building more realistic computational models of cognitive functions, which more faithfully reflect the underlying neural mechanisms at the same time providing a useful bridge to higher-level descriptions in terms of Bayesian computations. Here we discuss a powerful class of graphical models that can be implemented as stochastic, generative neural networks. These models overcome many limitations associated with classic connectionist models, for example by exploiting unsupervised learning in hierarchical architectures (deep networks) and by taking into account top-down, predictive processing supported by feedback loops. We review some recent cognitive models based on generative networks, and we point out promising research directions to investigate neuropsychological disorders within this approach. Though further efforts are required in order to fill the gap between structured Bayesian models and more realistic, biophysical models of neuronal dynamics, we argue that generative neural networks have the potential to bridge these levels of analysis, thereby improving our understanding of the neural bases of cognition and of pathologies caused by brain damage.
NASA Astrophysics Data System (ADS)
Hu, Xiaoye; Zheng, Peng; Meng, Guowen; Huang, Qing; Zhu, Chuhong; Han, Fangming; Huang, Zhulin; Li, Zhongbo; Wang, Zhaoming; Wu, Nianqiang
2016-09-01
An ordered array of hierarchically-structured core-nanosphere@space-layer@shell-nanoparticles has been fabricated for surface-enhanced Raman scattering (SERS) detection. To fabricate this hierarchically-structured chip, a long-range ordered array of Au/Ag-nanospheres is first patterned in the nano-bowls on the planar surface of ordered nanoporous anodic titanium oxide template. A ultra-thin alumina middle space-layer is then conformally coated on the Au/Ag-nanospheres, and Ag-nanoparticles are finally deposited on the surface of the alumina space-layer to form an ordered array of Au/Ag-nanosphere@Al2O3-layer@Ag-nanoparticles. Finite-difference time-domain simulation shows that SERS hot spots are created between the neighboring Ag-nanoparticles. The ordered array of hierarchical nanostructures is used as the SERS-substrate for a trial detection of methyl parathion (a pesticide) in water and a limit of detection of 1 nM is reached, indicating its promising potential in rapid monitoring of organic pollutants in aquatic environment.
AA9int: SNP Interaction Pattern Search Using Non-Hierarchical Additive Model Set.
Lin, Hui-Yi; Huang, Po-Yu; Chen, Dung-Tsa; Tung, Heng-Yuan; Sellers, Thomas A; Pow-Sang, Julio; Eeles, Rosalind; Easton, Doug; Kote-Jarai, Zsofia; Amin Al Olama, Ali; Benlloch, Sara; Muir, Kenneth; Giles, Graham G; Wiklund, Fredrik; Gronberg, Henrik; Haiman, Christopher A; Schleutker, Johanna; Nordestgaard, Børge G; Travis, Ruth C; Hamdy, Freddie; Neal, David E; Pashayan, Nora; Khaw, Kay-Tee; Stanford, Janet L; Blot, William J; Thibodeau, Stephen N; Maier, Christiane; Kibel, Adam S; Cybulski, Cezary; Cannon-Albright, Lisa; Brenner, Hermann; Kaneva, Radka; Batra, Jyotsna; Teixeira, Manuel R; Pandha, Hardev; Lu, Yong-Jie; Park, Jong Y
2018-06-07
The use of single nucleotide polymorphism (SNP) interactions to predict complex diseases is getting more attention during the past decade, but related statistical methods are still immature. We previously proposed the SNP Interaction Pattern Identifier (SIPI) approach to evaluate 45 SNP interaction patterns/patterns. SIPI is statistically powerful but suffers from a large computation burden. For large-scale studies, it is necessary to use a powerful and computation-efficient method. The objective of this study is to develop an evidence-based mini-version of SIPI as the screening tool or solitary use and to evaluate the impact of inheritance mode and model structure on detecting SNP-SNP interactions. We tested two candidate approaches: the 'Five-Full' and 'AA9int' method. The Five-Full approach is composed of the five full interaction models considering three inheritance modes (additive, dominant and recessive). The AA9int approach is composed of nine interaction models by considering non-hierarchical model structure and the additive mode. Our simulation results show that AA9int has similar statistical power compared to SIPI and is superior to the Five-Full approach, and the impact of the non-hierarchical model structure is greater than that of the inheritance mode in detecting SNP-SNP interactions. In summary, it is recommended that AA9int is a powerful tool to be used either alone or as the screening stage of a two-stage approach (AA9int+SIPI) for detecting SNP-SNP interactions in large-scale studies. The 'AA9int' and 'parAA9int' functions (standard and parallel computing version) are added in the SIPI R package, which is freely available at https://linhuiyi.github.io/LinHY_Software/. hlin1@lsuhsc.edu. Supplementary data are available at Bioinformatics online.
ERIC Educational Resources Information Center
Amershi, Saleema; Conati, Cristina
2009-01-01
In this paper, we present a data-based user modeling framework that uses both unsupervised and supervised classification to build student models for exploratory learning environments. We apply the framework to build student models for two different learning environments and using two different data sources (logged interface and eye-tracking data).…
Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation
ERIC Educational Resources Information Center
Hinton, Geoffrey; Osindero, Simon; Welling, Max; Teh, Yee-Whye
2006-01-01
We describe a way of modeling high-dimensional data vectors by using an unsupervised, nonlinear, multilayer neural network in which the activity of each neuron-like unit makes an additive contribution to a global energy score that indicates how surprised the network is by the data vector. The connection weights that determine how the activity of…
ERIC Educational Resources Information Center
Protopapas, Athanassios; Skaloumbakas, Christos; Bali, Persefoni
2008-01-01
After reviewing past efforts related to computer-based reading disability (RD) assessment, we present a fully automated screening battery that evaluates critical skills relevant for RD diagnosis designed for unsupervised application in the Greek educational system. Psychometric validation in 301 children, 8-10 years old (grades 3 and 4; including…
Unsupervised classification of remote multispectral sensing data
NASA Technical Reports Server (NTRS)
Su, M. Y.
1972-01-01
The new unsupervised classification technique for classifying multispectral remote sensing data which can be either from the multispectral scanner or digitized color-separation aerial photographs consists of two parts: (a) a sequential statistical clustering which is a one-pass sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. Applications of the technique using an IBM-7094 computer on multispectral data sets over Purdue's Flight Line C-1 and the Yellowstone National Park test site have been accomplished. Comparisons between the classification maps by the unsupervised technique and the supervised maximum liklihood technique indicate that the classification accuracies are in agreement.
NASA Astrophysics Data System (ADS)
Serb, Alexander; Bill, Johannes; Khiat, Ali; Berdan, Radu; Legenstein, Robert; Prodromakis, Themis
2016-09-01
In an increasingly data-rich world the need for developing computing systems that cannot only process, but ideally also interpret big data is becoming continuously more pressing. Brain-inspired concepts have shown great promise towards addressing this need. Here we demonstrate unsupervised learning in a probabilistic neural network that utilizes metal-oxide memristive devices as multi-state synapses. Our approach can be exploited for processing unlabelled data and can adapt to time-varying clusters that underlie incoming data by supporting the capability of reversible unsupervised learning. The potential of this work is showcased through the demonstration of successful learning in the presence of corrupted input data and probabilistic neurons, thus paving the way towards robust big-data processors.
Classification of earth terrain using polarimetric synthetic aperture radar images
NASA Technical Reports Server (NTRS)
Lim, H. H.; Swartz, A. A.; Yueh, H. A.; Kong, J. A.; Shin, R. T.; Van Zyl, J. J.
1989-01-01
Supervised and unsupervised classification techniques are developed and used to classify the earth terrain components from SAR polarimetric images of San Francisco Bay and Traverse City, Michigan. The supervised techniques include the Bayes classifiers, normalized polarimetric classification, and simple feature classification using discriminates such as the absolute and normalized magnitude response of individual receiver channel returns and the phase difference between receiver channels. An algorithm is developed as an unsupervised technique which classifies terrain elements based on the relationship between the orientation angle and the handedness of the transmitting and receiving polariation states. It is found that supervised classification produces the best results when accurate classifier training data are used, while unsupervised classification may be applied when training data are not available.
Light-Induced Buckles Localized by Polymeric Inks Printed on Bilayer Films.
Park, Sungjune; Nallainathan, Umaash; Mondal, Kunal; Sen, Pratik; Dickey, Michael D
2018-04-16
Buckling instabilities generate microscale features in thin films in a facile manner. Buckles can form, for example, by heating a metal/polymer film stack on a rigid substrate. Thermal expansion differences of the individual layers generate compressive stress that causes the metal to buckle over the entire surface. The ability to dictate and confine the location of buckle formation can enable patterns with more than one length scale, including hierarchical patterns. Here, sacrificial "ink" patterned on top of the film stack localizes the buckles via two mechanisms. First, stiff inks suppress buckles such that only the non-inked regions buckle in response to infrared light. The metal in the non-inked regions absorbs the infrared light and thus gets sufficiently hot to induce buckles. Second, soft inks that absorb light get hot faster than the non-inked regions and promote buckling when exposed to visible light. The exposed metal in the non-inked regions reflects the light and thus never get sufficiently hot to induce buckles. This second method works on glass substrates, but not silicon substrates, due to the superior thermal insulation of glass. The patterned ink can be removed, leaving behind hierarchical patterns consisting of regions of buckles among non-buckled regions. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Davis, Tyson C; Bang, Jae Jin; Brooks, Jacob T; McMillan, David G; Claridge, Shelley A
2018-01-30
Noncovalent monolayer chemistries are often used to functionalize 2D materials. Nanoscopic ligand ordering has been widely demonstrated (e.g., lying-down lamellar phases of functional alkanes); however, combining this control with micro- and macroscopic patterning for practical applications remains a significant challenge. A few reports have demonstrated that standing phase Langmuir films on water can be converted into nanoscopic lying-down molecular domains on 2D substrates (e.g., graphite), using horizontal dipping (Langmuir-Schaefer, LS, transfer). Molecular patterns are known to form at scales up to millimeters in Langmuir films, suggesting the possibility of transforming such structures into functional patterns on 2D materials. However, to our knowledge, this approach has not been investigated, and the rules governing LS conversion are not well understood. In part, this is because the conversion process is mechanistically very different from classic LS transfer of standing phases; challenges also arise due to the need to characterize structure in noncovalently adsorbed ligand layers <0.5 nm thick, at scales ranging from millimeters to nanometers. Here, we show that scanning electron microscopy enables diynoic acid lying-down phases to be imaged across this range of scales; using this structural information, we establish conditions for LS conversion to create hierarchical microscopic and nanoscopic functional patterns. Such control opens the door to tailoring noncovalent surface chemistry of 2D materials to pattern local interactions with the environment.
Kolasa, Jurek; Allen, Craig R.; Sendzimir, Jan; Stow, Craig A.
2012-01-01
Interaction between habitat and species is central in ecology. Habitat structure may be conceived as being hierarchical, where larger, more diverse, portions or categories contain smaller, more homogeneous portions. When this conceptualization is combined with the observation that species have different abilities to relate to portions of the habitat that differ in their characteristics, a number of known patterns can be derived and new patterns hypothesized. We propose a quantitative form of this habitat–species relationship by considering species abundance to be a function of habitat specialization, habitat fragmentation, amount of habitat, and adult body mass. The model reproduces and explains patterns such as variation in rank–abundance curves, greater variation and extinction probabilities of habitat specialists, discontinuities in traits (abundance, ecological range, pattern of variation, body size) among species sharing a community or area, and triangular distribution of body sizes, among others. The model has affinities to Holling's textural discontinuity hypothesis and metacommunity theory but differs from both by offering a more general perspective. In support of the model, we illustrate its general potential to capture and explain several empirical observations that historically have been treated independently.
Quark Yukawa pattern from spontaneous breaking of flavour SU(3) 3
NASA Astrophysics Data System (ADS)
Nardi, Enrico
2015-10-01
A SU(3)Q × SU(3)u × SU(3)d invariant scalar potential breaking spontaneously the quark flavour symmetry can explain the Standard Model flavour puzzle. The approximate alignment in flavour space of the vacuum expectation values of the up and down 'Yukawa fields' results as a dynamical effect. The observed quark mixing angles, the weak CP violating phase, and hierarchical quark masses can be all reproduced at the cost of introducing additional (auxiliary) scalar multiplets, but without the need of introducing hierarchical parameters.
Velocity and Hierarchical Spread of Epidemic Outbreaks in Scale-Free Networks
NASA Astrophysics Data System (ADS)
Barthélemy, Marc; Barrat, Alain; Pastor-Satorras, Romualdo; Vespignani, Alessandro
2004-04-01
We study the effect of the connectivity pattern of complex networks on the propagation dynamics of epidemics. The growth time scale of outbreaks is inversely proportional to the network degree fluctuations, signaling that epidemics spread almost instantaneously in networks with scale-free degree distributions. This feature is associated with an epidemic propagation that follows a precise hierarchical dynamics. Once the highly connected hubs are reached, the infection pervades the network in a progressive cascade across smaller degree classes. The present results are relevant for the development of adaptive containment strategies.
Supramolecular structure of polymer binders and composites: targeted control based on the hierarchy
NASA Astrophysics Data System (ADS)
Matveeva, Larisa; Belentsov, Yuri
2017-10-01
The article discusses the problem of targeted control over properties by modifying the supramolecular structure of polymer binders and composites based on their hierarchy. Control over the structure formation of polymers and introduction of modifying additives should be tailored to the specific hierarchical structural levels. Characteristics of polymer materials are associated with structural defects, which also display a hierarchical pattern. Classification of structural defects in polymers is presented. The primary structural level (nano level) of supramolecular formations is of great importance to the reinforcement and regulation of strength characteristics.
The RDoC initiative and the structure of psychopathology.
Krueger, Robert F; DeYoung, Colin G
2016-03-01
The NIMH Research Domain Criteria (RDoC) project represents a welcome effort to circumvent the limitations of psychiatric categories as phenotypes for psychopathology research. Here, we describe the hierarchical and dimensional structure of phenotypic psychopathology and illustrate how this structure provides phenotypes suitable for RDoC research on neural correlates of psychopathology. A hierarchical and dimensional approach to psychopathology phenotypes holds great promise for delineating connections between neuroscience constructs and the patterns of affect, cognition, and behavior that constitute manifest psychopathology. © 2016 Society for Psychophysiological Research.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ramanathan, Arvind; Pullum, Laura L.; Hobson, Tanner C.
Here, we describe a data-driven unsupervised machine learning approach to extract geo-temporal co-occurrence patterns of asthma and the flu from large-scale electronic healthcare reimbursement claims (eHRC) datasets. Specifically, we examine the eHRC data from 2009 to 2010 pandemic H1N1 influenza season and analyze whether different geographic regions within the United States (US) showed an increase in co-occurrence patterns of the flu and asthma. Our analyses reveal that the temporal patterns extracted from the eHRC data show a distinct lag time between the peak incidence of the asthma and the flu. While the increased occurrence of asthma contributed to increased flumore » incidence during the pandemic, this co-occurrence is predominant for female patients. The geo-temporal patterns reveal that the co-occurrence of the flu and asthma are typically concentrated within the south-east US. Further, in agreement with previous studies, large urban areas (such as New York, Miami, and Los Angeles) exhibit co-occurrence patterns that suggest a peak incidence of asthma and flu significantly early in the spring and winter seasons. Together, our data-analytic approach, integrated within the Oak Ridge Bio-surveillance Toolkit platform, demonstrates how eHRC data can provide novel insights into co-occurring disease patterns.« less
Ramanathan, Arvind; Pullum, Laura L.; Hobson, Tanner C.; ...
2015-08-03
Here, we describe a data-driven unsupervised machine learning approach to extract geo-temporal co-occurrence patterns of asthma and the flu from large-scale electronic healthcare reimbursement claims (eHRC) datasets. Specifically, we examine the eHRC data from 2009 to 2010 pandemic H1N1 influenza season and analyze whether different geographic regions within the United States (US) showed an increase in co-occurrence patterns of the flu and asthma. Our analyses reveal that the temporal patterns extracted from the eHRC data show a distinct lag time between the peak incidence of the asthma and the flu. While the increased occurrence of asthma contributed to increased flumore » incidence during the pandemic, this co-occurrence is predominant for female patients. The geo-temporal patterns reveal that the co-occurrence of the flu and asthma are typically concentrated within the south-east US. Further, in agreement with previous studies, large urban areas (such as New York, Miami, and Los Angeles) exhibit co-occurrence patterns that suggest a peak incidence of asthma and flu significantly early in the spring and winter seasons. Together, our data-analytic approach, integrated within the Oak Ridge Bio-surveillance Toolkit platform, demonstrates how eHRC data can provide novel insights into co-occurring disease patterns.« less
Truppa, Valentina; Carducci, Paola; De Simone, Diego Antonio; Bisazza, Angelo; De Lillo, Carlo
2017-03-01
In the last two decades, comparative research has addressed the issue of how the global and local levels of structure of visual stimuli are processed by different species, using Navon-type hierarchical figures, i.e. smaller local elements that form larger global configurations. Determining whether or not the variety of procedures adopted to test different species with hierarchical figures are equivalent is of crucial importance to ensure comparability of results. Among non-human species, global/local processing has been extensively studied in tufted capuchin monkeys using matching-to-sample tasks with hierarchical patterns. Local dominance has emerged consistently in these New World primates. In the present study, we assessed capuchins' processing of hierarchical stimuli with a method frequently adopted in studies of global/local processing in non-primate species: the conflict-choice task. Different from the matching-to-sample procedure, this task involved processing local and global information retained in long-term memory. Capuchins were trained to discriminate between consistent hierarchical stimuli (similar global and local shape) and then tested with inconsistent hierarchical stimuli (different global and local shapes). We found that capuchins preferred the hierarchical stimuli featuring the correct local elements rather than those with the correct global configuration. This finding confirms that capuchins' local dominance, typically observed using matching-to-sample procedures, is also expressed as a local preference in the conflict-choice task. Our study adds to the growing body of comparative studies on visual grouping functions by demonstrating that the methods most frequently used in the literature on global/local processing produce analogous results irrespective of extent of the involvement of memory processes.
Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V; Robles, Montserrat; Aparici, F; Martí-Bonmatí, L; García-Gómez, Juan M
2015-01-01
Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most of brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach comparable results than the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Considering the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as structured classification algorithms we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation.
Kim, Eun-Young; Kim, Suhn-Yeop; Oh, Duck-Won
2012-02-01
To investigate the effect of supervised and unsupervised pelvic floor muscle exercises utilizing trunk stabilization for treating postpartum urinary incontinence and to compare the outcomes. Randomized, single-blind controlled study. Outpatient rehabilitation hospital. Eighteen subjects with postpartum urinary incontinence. Subjects were randomized to either a supervised training group with verbal instruction from a physiotherapist, or an unsupervised training group after undergoing a supervised demonstration session. Bristol Female Lower Urinary Tract Symptom questionnaire (urinary symptoms and quality of life) and vaginal function test (maximal vaginal squeeze pressure and holding time) using a perineometer. The change values for urinary symptoms (-27.22 ± 6.20 versus -18.22 ± 5.49), quality of life (-5.33 ± 2.96 versus -1.78 ± 3.93), total score (-32.56 ± 8.17 versus -20.00 ± 6.67), maximal vaginal squeeze pressure (18.96 ± 9.08 versus 2.67 ± 3.64 mmHg), and holding time (11.32 ± 3.17 versus 5.72 ± 2.29 seconds) were more improved in the supervised group than in the unsupervised group (P < 0.05). In the supervised group, significant differences were found for all variables between pre- and post-test values (P < 0.01), whereas the unsupervised group showed significant differences for urinary symptom score, total score and holding time between the pre- and post-test results (P < 0.05). These findings suggest that exercising the pelvic floor muscles by utilizing trunk stabilization under physiotherapist supervision may be beneficial for the management of postpartum urinary incontinence.
Hyperspectral Imaging Using Flexible Endoscopy for Laryngeal Cancer Detection
Regeling, Bianca; Thies, Boris; Gerstner, Andreas O. H.; Westermann, Stephan; Müller, Nina A.; Bendix, Jörg; Laffers, Wiebke
2016-01-01
Hyperspectral imaging (HSI) is increasingly gaining acceptance in the medical field. Up until now, HSI has been used in conjunction with rigid endoscopy to detect cancer in vivo. The logical next step is to pair HSI with flexible endoscopy, since it improves access to hard-to-reach areas. While the flexible endoscope’s fiber optic cables provide the advantage of flexibility, they also introduce an interfering honeycomb-like pattern onto images. Due to the substantial impact this pattern has on locating cancerous tissue, it must be removed before the HS data can be further processed. Thereby, the loss of information is to minimize avoiding the suppression of small-area variations of pixel values. We have developed a system that uses flexible endoscopy to record HS cubes of the larynx and designed a special filtering technique to remove the honeycomb-like pattern with minimal loss of information. We have confirmed its feasibility by comparing it to conventional filtering techniques using an objective metric and by applying unsupervised and supervised classifications to raw and pre-processed HS cubes. Compared to conventional techniques, our method successfully removes the honeycomb-like pattern and considerably improves classification performance, while preserving image details. PMID:27529255
User Activity Recognition in Smart Homes Using Pattern Clustering Applied to Temporal ANN Algorithm
Bourobou, Serge Thomas Mickala; Yoo, Younghwan
2015-01-01
This paper discusses the possibility of recognizing and predicting user activities in the IoT (Internet of Things) based smart environment. The activity recognition is usually done through two steps: activity pattern clustering and activity type decision. Although many related works have been suggested, they had some limited performance because they focused only on one part between the two steps. This paper tries to find the best combination of a pattern clustering method and an activity decision algorithm among various existing works. For the first step, in order to classify so varied and complex user activities, we use a relevant and efficient unsupervised learning method called the K-pattern clustering algorithm. In the second step, the training of smart environment for recognizing and predicting user activities inside his/her personal space is done by utilizing the artificial neural network based on the Allen’s temporal relations. The experimental results show that our combined method provides the higher recognition accuracy for various activities, as compared with other data mining classification algorithms. Furthermore, it is more appropriate for a dynamic environment like an IoT based smart home. PMID:26007738
Image-based spectroscopy for environmental monitoring
NASA Astrophysics Data System (ADS)
Bachmakov, Eduard; Molina, Carolyn; Wynne, Rosalind
2014-03-01
An image-processing algorithm for use with a nano-featured spectrometer chemical agent detection configuration is presented. The spectrometer chip acquired from Nano-Optic DevicesTM can reduce the size of the spectrometer down to a coin. The nanospectrometer chip was aligned with a 635nm laser source, objective lenses, and a CCD camera. The images from a nanospectrometer chip were collected and compared to reference spectra. Random background noise contributions were isolated and removed from the diffraction pattern image analysis via a threshold filter. Results are provided for the image-based detection of the diffraction pattern produced by the nanospectrometer. The featured PCF spectrometer has the potential to measure optical absorption spectra in order to detect trace amounts of contaminants. MATLAB tools allow for implementation of intelligent, automatic detection of the relevant sub-patterns in the diffraction patterns and subsequent extraction of the parameters using region-detection algorithms such as the generalized Hough transform, which detects specific shapes within the image. This transform is a method for detecting curves by exploiting the duality between points on a curve and parameters of that curve. By employing this imageprocessing technique, future sensor systems will benefit from new applications such as unsupervised environmental monitoring of air or water quality.
Hyperspectral Imaging Using Flexible Endoscopy for Laryngeal Cancer Detection.
Regeling, Bianca; Thies, Boris; Gerstner, Andreas O H; Westermann, Stephan; Müller, Nina A; Bendix, Jörg; Laffers, Wiebke
2016-08-13
Hyperspectral imaging (HSI) is increasingly gaining acceptance in the medical field. Up until now, HSI has been used in conjunction with rigid endoscopy to detect cancer in vivo. The logical next step is to pair HSI with flexible endoscopy, since it improves access to hard-to-reach areas. While the flexible endoscope's fiber optic cables provide the advantage of flexibility, they also introduce an interfering honeycomb-like pattern onto images. Due to the substantial impact this pattern has on locating cancerous tissue, it must be removed before the HS data can be further processed. Thereby, the loss of information is to minimize avoiding the suppression of small-area variations of pixel values. We have developed a system that uses flexible endoscopy to record HS cubes of the larynx and designed a special filtering technique to remove the honeycomb-like pattern with minimal loss of information. We have confirmed its feasibility by comparing it to conventional filtering techniques using an objective metric and by applying unsupervised and supervised classifications to raw and pre-processed HS cubes. Compared to conventional techniques, our method successfully removes the honeycomb-like pattern and considerably improves classification performance, while preserving image details.
Unsupervised Ensemble Anomaly Detection Using Time-Periodic Packet Sampling
NASA Astrophysics Data System (ADS)
Uchida, Masato; Nawata, Shuichi; Gu, Yu; Tsuru, Masato; Oie, Yuji
We propose an anomaly detection method for finding patterns in network traffic that do not conform to legitimate (i.e., normal) behavior. The proposed method trains a baseline model describing the normal behavior of network traffic without using manually labeled traffic data. The trained baseline model is used as the basis for comparison with the audit network traffic. This anomaly detection works in an unsupervised manner through the use of time-periodic packet sampling, which is used in a manner that differs from its intended purpose — the lossy nature of packet sampling is used to extract normal packets from the unlabeled original traffic data. Evaluation using actual traffic traces showed that the proposed method has false positive and false negative rates in the detection of anomalies regarding TCP SYN packets comparable to those of a conventional method that uses manually labeled traffic data to train the baseline model. Performance variation due to the probabilistic nature of sampled traffic data is mitigated by using ensemble anomaly detection that collectively exploits multiple baseline models in parallel. Alarm sensitivity is adjusted for the intended use by using maximum- and minimum-based anomaly detection that effectively take advantage of the performance variations among the multiple baseline models. Testing using actual traffic traces showed that the proposed anomaly detection method performs as well as one using manually labeled traffic data and better than one using randomly sampled (unlabeled) traffic data.
Kastberger, G; Kranner, G
2000-02-01
Viscovery SOMine is a software tool for advanced analysis and monitoring of numerical data sets. It was developed for professional use in business, industry, and science and to support dependency analysis, deviation detection, unsupervised clustering, nonlinear regression, data association, pattern recognition, and animated monitoring. Based on the concept of self-organizing maps (SOMs), it employs a robust variant of unsupervised neural networks--namely, Kohonen's Batch-SOM, which is further enhanced with a new scaling technique for speeding up the learning process. This tool provides a powerful means by which to analyze complex data sets without prior statistical knowledge. The data representation contained in the trained SOM is systematically converted to be used in a spectrum of visualization techniques, such as evaluating dependencies between components, investigating geometric properties of the data distribution, searching for clusters, or monitoring new data. We have used this software tool to analyze and visualize multiple influences of the ocellar system on free-flight behavior in giant honeybees. Occlusion of ocelli will affect orienting reactivities in relation to flight target, level of disturbance, and position of the bee in the flight chamber; it will induce phototaxis and make orienting imprecise and dependent on motivational settings. Ocelli permit the adjustment of orienting strategies to environmental demands by enforcing abilities such as centering or flight kinetics and by providing independent control of posture and flight course.
NASA Astrophysics Data System (ADS)
Bellón, Beatriz; Bégué, Agnès; Lo Seen, Danny; Lebourgeois, Valentine; Evangelista, Balbino Antônio; Simões, Margareth; Demonte Ferraz, Rodrigo Peçanha
2018-06-01
Cropping systems' maps at fine scale over large areas provide key information for further agricultural production and environmental impact assessments, and thus represent a valuable tool for effective land-use planning. There is, therefore, a growing interest in mapping cropping systems in an operational manner over large areas, and remote sensing approaches based on vegetation index time series analysis have proven to be an efficient tool. However, supervised pixel-based approaches are commonly adopted, requiring resource consuming field campaigns to gather training data. In this paper, we present a new object-based unsupervised classification approach tested on an annual MODIS 16-day composite Normalized Difference Vegetation Index time series and a Landsat 8 mosaic of the State of Tocantins, Brazil, for the 2014-2015 growing season. Two variants of the approach are compared: an hyperclustering approach, and a landscape-clustering approach involving a previous stratification of the study area into landscape units on which the clustering is then performed. The main cropping systems of Tocantins, characterized by the crop types and cropping patterns, were efficiently mapped with the landscape-clustering approach. Results show that stratification prior to clustering significantly improves the classification accuracies for underrepresented and sparsely distributed cropping systems. This study illustrates the potential of unsupervised classification for large area cropping systems' mapping and contributes to the development of generic tools for supporting large-scale agricultural monitoring across regions.
ERIC Educational Resources Information Center
Siennick, Sonja E.; Osgood, D. Wayne
2012-01-01
Companions are central to explanations of the risky nature of unstructured and unsupervised socializing, yet we know little about whom adolescents are with when hanging out. We examine predictors of how often friendship dyads hang out via multilevel analyses of longitudinal friendship-level data on over 5,000 middle schoolers. Adolescents hang out…
Teacher and learner: Supervised and unsupervised learning in communities.
Shafto, Michael G; Seifert, Colleen M
2015-01-01
How far can teaching methods go to enhance learning? Optimal methods of teaching have been considered in research on supervised and unsupervised learning. Locally optimal methods are usually hybrids of teaching and self-directed approaches. The costs and benefits of specific methods have been shown to depend on the structure of the learning task, the learners, the teachers, and the environment.
Shan, Ying; Sawhney, Harpreet S; Kumar, Rakesh
2008-04-01
This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle or different vehicle(s). We employ a novel measurement vector that consists of three independent edge-based measures and their associated robust measures computed from a pair of aligned vehicle edge maps. The weight of each measure is determined by an unsupervised learning algorithm that optimally separates the same-different classes in the combined measurement space. This is achieved with a weak classification algorithm that automatically collects representative samples from same-different classes, followed by a more discriminative classifier based on Fisher' s Linear Discriminants and Gibbs Sampling. The robustness of the match measures and the use of unsupervised discriminant analysis in the classification ensures that the proposed method performs consistently in the presence of missing/false features, temporally and spatially changing illumination conditions, and systematic misalignment caused by different camera configurations. Extensive experiments based on real data of over 200 vehicles at different times of day demonstrate promising results.
Penalized unsupervised learning with outliers
Witten, Daniela M.
2013-01-01
We consider the problem of performing unsupervised learning in the presence of outliers – that is, observations that do not come from the same distribution as the rest of the data. It is known that in this setting, standard approaches for unsupervised learning can yield unsatisfactory results. For instance, in the presence of severe outliers, K-means clustering will often assign each outlier to its own cluster, or alternatively may yield distorted clusters in order to accommodate the outliers. In this paper, we take a new approach to extending existing unsupervised learning techniques to accommodate outliers. Our approach is an extension of a recent proposal for outlier detection in the regression setting. We allow each observation to take on an “error” term, and we penalize the errors using a group lasso penalty in order to encourage most of the observations’ errors to exactly equal zero. We show that this approach can be used in order to develop extensions of K-means clustering and principal components analysis that result in accurate outlier detection, as well as improved performance in the presence of outliers. These methods are illustrated in a simulation study and on two gene expression data sets, and connections with M-estimation are explored. PMID:23875057
A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data.
Goldstein, Markus; Uchida, Seiichi
2016-01-01
Anomaly detection is the process of identifying unexpected items or events in datasets, which differ from the norm. In contrast to standard classification tasks, anomaly detection is often applied on unlabeled data, taking only the internal structure of the dataset into account. This challenge is known as unsupervised anomaly detection and is addressed in many practical applications, for example in network intrusion detection, fraud detection as well as in the life science and medical domain. Dozens of algorithms have been proposed in this area, but unfortunately the research community still lacks a comparative universal evaluation as well as common publicly available datasets. These shortcomings are addressed in this study, where 19 different unsupervised anomaly detection algorithms are evaluated on 10 different datasets from multiple application domains. By publishing the source code and the datasets, this paper aims to be a new well-funded basis for unsupervised anomaly detection research. Additionally, this evaluation reveals the strengths and weaknesses of the different approaches for the first time. Besides the anomaly detection performance, computational effort, the impact of parameter settings as well as the global/local anomaly detection behavior is outlined. As a conclusion, we give an advise on algorithm selection for typical real-world tasks.
Edwards, Darren J; Wood, Rodger
2016-01-01
This study explored over-selectivity (executive dysfunction) using a standard unsupervised categorization task. Over-selectivity has been demonstrated using supervised categorization procedures (where training is given); however, little has been done in the way of unsupervised categorization (without training). A standard unsupervised categorization task was used to assess levels of over-selectivity in a traumatic brain injury (TBI) population. Individuals with TBI were selected from the Tertiary Traumatic Brain Injury Clinic at Swansea University and were asked to categorize two-dimensional items (pictures on cards), into groups that they felt were most intuitive, and without any learning (feedback from experimenter). This was compared against categories made by a control group for the same task. The findings of this study demonstrate that individuals with TBI had deficits for both easy and difficult categorization sets, as indicated by a larger amount of one-dimensional sorting compared to control participants. Deficits were significantly greater for the easy condition. The implications of these findings are discussed in the context of over-selectivity, and the processes that underlie this deficit. Also, the implications for using this procedure as a screening measure for over-selectivity in TBI are discussed.
Segmentation of fluorescence microscopy cell images using unsupervised mining.
Du, Xian; Dua, Sumeet
2010-05-28
The accurate measurement of cell and nuclei contours are critical for the sensitive and specific detection of changes in normal cells in several medical informatics disciplines. Within microscopy, this task is facilitated using fluorescence cell stains, and segmentation is often the first step in such approaches. Due to the complex nature of cell issues and problems inherent to microscopy, unsupervised mining approaches of clustering can be incorporated in the segmentation of cells. In this study, we have developed and evaluated the performance of multiple unsupervised data mining techniques in cell image segmentation. We adapt four distinctive, yet complementary, methods for unsupervised learning, including those based on k-means clustering, EM, Otsu's threshold, and GMAC. Validation measures are defined, and the performance of the techniques is evaluated both quantitatively and qualitatively using synthetic and recently published real data. Experimental results demonstrate that k-means, Otsu's threshold, and GMAC perform similarly, and have more precise segmentation results than EM. We report that EM has higher recall values and lower precision results from under-segmentation due to its Gaussian model assumption. We also demonstrate that these methods need spatial information to segment complex real cell images with a high degree of efficacy, as expected in many medical informatics applications.
Lacroix, André; Hortobágyi, Tibor; Beurskens, Rainer; Granacher, Urs
2017-11-01
Balance and resistance training can improve healthy older adults' balance and muscle strength. Delivering such exercise programs at home without supervision may facilitate participation for older adults because they do not have to leave their homes. To date, no systematic literature analysis has been conducted to determine if supervision affects the effectiveness of these programs to improve healthy older adults' balance and muscle strength/power. The objective of this systematic review and meta-analysis was to quantify the effectiveness of supervised vs. unsupervised balance and/or resistance training programs on measures of balance and muscle strength/power in healthy older adults. In addition, the impact of supervision on training-induced adaptive processes was evaluated in the form of dose-response relationships by analyzing randomized controlled trials that compared supervised with unsupervised trials. A computerized systematic literature search was performed in the electronic databases PubMed, Web of Science, and SportDiscus to detect articles examining the role of supervision in balance and/or resistance training in older adults. The initially identified 6041 articles were systematically screened. Studies were included if they examined balance and/or resistance training in adults aged ≥65 years with no relevant diseases and registered at least one behavioral balance (e.g., time during single leg stance) and/or muscle strength/power outcome (e.g., time for 5-Times-Chair-Rise-Test). Finally, 11 studies were eligible for inclusion in this meta-analysis. Weighted mean standardized mean differences between subjects (SMD bs ) of supervised vs. unsupervised balance/resistance training studies were calculated. The included studies were coded for the following variables: number of participants, sex, age, number and type of interventions, type of balance/strength tests, and change (%) from pre- to post-intervention values. Additionally, we coded training according to the following modalities: period, frequency, volume, modalities of supervision (i.e., number of supervised/unsupervised sessions within the supervised or unsupervised training groups, respectively). Heterogeneity was computed using I 2 and χ 2 statistics. The methodological quality of the included studies was evaluated using the Physiotherapy Evidence Database scale. Our analyses revealed that in older adults, supervised balance/resistance training was superior compared with unsupervised balance/resistance training in improving measures of static steady-state balance (mean SMD bs = 0.28, p = 0.39), dynamic steady-state balance (mean SMD bs = 0.35, p = 0.02), proactive balance (mean SMD bs = 0.24, p = 0.05), balance test batteries (mean SMD bs = 0.53, p = 0.02), and measures of muscle strength/power (mean SMD bs = 0.51, p = 0.04). Regarding the examined dose-response relationships, our analyses showed that a number of 10-29 additional supervised sessions in the supervised training groups compared with the unsupervised training groups resulted in the largest effects for static steady-state balance (mean SMD bs = 0.35), dynamic steady-state balance (mean SMD bs = 0.37), and muscle strength/power (mean SMD bs = 1.12). Further, ≥30 additional supervised sessions in the supervised training groups were needed to produce the largest effects on proactive balance (mean SMD bs = 0.30) and balance test batteries (mean SMD bs = 0.77). Effects in favor of supervised programs were larger for studies that did not include any supervised sessions in their unsupervised programs (mean SMD bs : 0.28-1.24) compared with studies that implemented a few supervised sessions in their unsupervised programs (e.g., three supervised sessions throughout the entire intervention program; SMD bs : -0.06 to 0.41). The present findings have to be interpreted with caution because of the low number of eligible studies and the moderate methodological quality of the included studies, which is indicated by a median Physiotherapy Evidence Database scale score of 5. Furthermore, we indirectly compared dose-response relationships across studies and not from single controlled studies. Our analyses suggest that supervised balance and/or resistance training improved measures of balance and muscle strength/power to a greater extent than unsupervised programs in older adults. Owing to the small number of available studies, we were unable to establish a clear dose-response relationship with regard to the impact of supervision. However, the positive effects of supervised training are particularly prominent when compared with completely unsupervised training programs. It is therefore recommended to include supervised sessions (i.e., two out of three sessions/week) in balance/resistance training programs to effectively improve balance and muscle strength/power in older adults.
Sereshti, Hassan; Poursorkh, Zahra; Aliakbarzadeh, Ghazaleh; Zarre, Shahin; Ataolahi, Sahar
2018-01-15
Quality of saffron, a valuable food additive, could considerably affect the consumers' health. In this work, a novel preprocessing strategy for image analysis of saffron thin layer chromatographic (TLC) patterns was introduced. This includes performing a series of image pre-processing techniques on TLC images such as compression, inversion, elimination of general baseline (using asymmetric least squares (AsLS)), removing spots shift and concavity (by correlation optimization warping (COW)), and finally conversion to RGB chromatograms. Subsequently, an unsupervised multivariate data analysis including principal component analysis (PCA) and k-means clustering was utilized to investigate the soil salinity effect, as a cultivation parameter, on saffron TLC patterns. This method was used as a rapid and simple technique to obtain the chemical fingerprints of saffron TLC images. Finally, the separated TLC spots were chemically identified using high-performance liquid chromatography-diode array detection (HPLC-DAD). Accordingly, the saffron quality from different areas of Iran was evaluated and classified. Copyright © 2017 Elsevier Ltd. All rights reserved.
A New MI-Based Visualization Aided Validation Index for Mining Big Longitudinal Web Trial Data
Zhang, Zhaoyang; Fang, Hua; Wang, Honggang
2016-01-01
Web-delivered clinical trials generate big complex data. To help untangle the heterogeneity of treatment effects, unsupervised learning methods have been widely applied. However, identifying valid patterns is a priority but challenging issue for these methods. This paper, built upon our previous research on multiple imputation (MI)-based fuzzy clustering and validation, proposes a new MI-based Visualization-aided validation index (MIVOOS) to determine the optimal number of clusters for big incomplete longitudinal Web-trial data with inflated zeros. Different from a recently developed fuzzy clustering validation index, MIVOOS uses a more suitable overlap and separation measures for Web-trial data but does not depend on the choice of fuzzifiers as the widely used Xie and Beni (XB) index. Through optimizing the view angles of 3-D projections using Sammon mapping, the optimal 2-D projection-guided MIVOOS is obtained to better visualize and verify the patterns in conjunction with trajectory patterns. Compared with XB and VOS, our newly proposed MIVOOS shows its robustness in validating big Web-trial data under different missing data mechanisms using real and simulated Web-trial data. PMID:27482473
Greene, Kathryn; Banerjee, Smita C
2009-04-01
This study explored the association between unsupervised time with peers and adolescent smoking behavior both directly and indirectly through interaction with delinquent peers, social expectancies about cigarette smoking, and cigarette offers from peers. A cross-sectional survey was used for the study and included 248 male and female middle school students. Results of structural equation modeling revealed that unsupervised time with peers is associated indirectly with adolescent smoking behavior through the mediation of association with delinquent peers, social expectancies about cigarette smoking, and cigarette offers from peers. Interventions designed to motivate adolescents without adult supervision to associate more with friends who engage in prosocial activities may eventually reduce adolescent smoking. Further implications for structured supervised time for students outside of school time are discussed.
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.; Bensaid, Amine M.; Clarke, Laurence P.; Velthuizen, Robert P.; Silbiger, Martin S.; Bezdek, James C.
1992-01-01
Magnetic resonance (MR) brain section images are segmented and then synthetically colored to give visual representations of the original data with three approaches: the literal and approximate fuzzy c-means unsupervised clustering algorithms and a supervised computational neural network, a dynamic multilayered perception trained with the cascade correlation learning algorithm. Initial clinical results are presented on both normal volunteers and selected patients with brain tumors surrounded by edema. Supervised and unsupervised segmentation techniques provide broadly similar results. Unsupervised fuzzy algorithms were visually observed to show better segmentation when compared with raw image data for volunteer studies. However, for a more complex segmentation problem with tumor/edema or cerebrospinal fluid boundary, where the tissues have similar MR relaxation behavior, inconsistency in rating among experts was observed.
NASA Astrophysics Data System (ADS)
Wu, Hua; Briscoe, Wuge H.
2018-04-01
We report polycrystalline residual patterns with dendritic micromorphologies upon fast evaporation of a mixed-solvent sessile drop containing reactive ZnO nanoparticles. The molecular and particulate species generated in situ upon evaporative drying collude with and modify the Marangoni solvent flows and Bénard-Marangoni instabilities, as they undergo self-assembly and self-organization under conditions far from equilibrium, leading to the ultimate hierarchical central cellular patterns surrounded by a peripheral coffee ring upon drying.
Property Specification Patterns for intelligence building software
NASA Astrophysics Data System (ADS)
Chun, Seungsu
2018-03-01
In this paper, through the property specification pattern research for Modal MU(μ) logical aspects present a single framework based on the pattern of intelligence building software. In this study, broken down by state property specification pattern classification of Dwyer (S) and action (A) and was subdivided into it again strong (A) and weaknesses (E). Through these means based on a hierarchical pattern classification of the property specification pattern analysis of logical aspects Mu(μ) was applied to the pattern classification of the examples used in the actual model checker. As a result, not only can a more accurate classification than the existing classification systems were easy to create and understand the attributes specified.
Hierarchical Spatio-temporal Visual Analysis of Cluster Evolution in Electrocorticography Data
Murugesan, Sugeerth; Bouchard, Kristofer; Chang, Edward; ...
2016-10-02
Here, we present ECoG ClusterFlow, a novel interactive visual analysis tool for the exploration of high-resolution Electrocorticography (ECoG) data. Our system detects and visualizes dynamic high-level structures, such as communities, using the time-varying spatial connectivity network derived from the high-resolution ECoG data. ECoG ClusterFlow provides a multi-scale visualization of the spatio-temporal patterns underlying the time-varying communities using two views: 1) an overview summarizing the evolution of clusters over time and 2) a hierarchical glyph-based technique that uses data aggregation and small multiples techniques to visualize the propagation of clusters in their spatial domain. ECoG ClusterFlow makes it possible 1) tomore » compare the spatio-temporal evolution patterns across various time intervals, 2) to compare the temporal information at varying levels of granularity, and 3) to investigate the evolution of spatial patterns without occluding the spatial context information. Lastly, we present case studies done in collaboration with neuroscientists on our team for both simulated and real epileptic seizure data aimed at evaluating the effectiveness of our approach.« less
Zhou, Jianhong; Li, Bo; Han, Yong; Zhao, Lingzhou
2016-07-01
Advanced titanium based bone implant with fast established, rigid and stable osseointegration is stringently needed in clinic. Here the hierarchical micropore/nanorod-patterned strontium doped hydroxyapatite (Ca9Sr1(PO4)6(OH)2, Sr1-HA) coatings (MNRs) with different interrod spacings varying from about 300 to 33nm were developed. MNRs showed dramatically differential biological performance closely related to the interrod spacing. Compared to micropore/nanogranule-patterned Sr1-HA coating (MNG), MNRs with an interrod spacing of larger than 137nm resulted in inhibited in vitro mesenchymal stem cell functions and in vivo osseointegration, while those of smaller than 96nm gave rise to dramatically enhanced the biological effect, especially those of mean 67nm displayed the best effect. The differential biological effect of MNRs was related to their modulation on the focal adhesion mediated mechanotransduction. These results suggest that MNRs with a mean interrod spacing of 67nm may give rise to an advanced implant of improved clinical performance. Copyright © 2016 Elsevier Inc. All rights reserved.
Systems view on spatial planning and perception based on invariants in agent-environment dynamics
Mettler, Bérénice; Kong, Zhaodan; Li, Bin; Andersh, Jonathan
2015-01-01
Modeling agile and versatile spatial behavior remains a challenging task, due to the intricate coupling of planning, control, and perceptual processes. Previous results have shown that humans plan and organize their guidance behavior by exploiting patterns in the interactions between agent or organism and the environment. These patterns, described under the concept of Interaction Patterns (IPs), capture invariants arising from equivalences and symmetries in the interaction with the environment, as well as effects arising from intrinsic properties of human control and guidance processes, such as perceptual guidance mechanisms. The paper takes a systems' perspective, considering the IP as a unit of organization, and builds on its properties to present a hierarchical model that delineates the planning, control, and perceptual processes and their integration. The model's planning process is further elaborated by showing that the IP can be abstracted, using spatial time-to-go functions. The perceptual processes are elaborated from the hierarchical model. The paper provides experimental support for the model's ability to predict the spatial organization of behavior and the perceptual processes. PMID:25628524
Adaptive fuzzy leader clustering of complex data sets in pattern recognition
NASA Technical Reports Server (NTRS)
Newton, Scott C.; Pemmaraju, Surya; Mitra, Sunanda
1992-01-01
A modular, unsupervised neural network architecture for clustering and classification of complex data sets is presented. The adaptive fuzzy leader clustering (AFLC) architecture is a hybrid neural-fuzzy system that learns on-line in a stable and efficient manner. The initial classification is performed in two stages: a simple competitive stage and a distance metric comparison stage. The cluster prototypes are then incrementally updated by relocating the centroid positions from fuzzy C-means system equations for the centroids and the membership values. The AFLC algorithm is applied to the Anderson Iris data and laser-luminescent fingerprint image data. It is concluded that the AFLC algorithm successfully classifies features extracted from real data, discrete or continuous.
ERIC Educational Resources Information Center
Lewkowicz, David J.
2003-01-01
Three experiments examined 4- to 10-month-olds' perception of audio-visual (A-V) temporal synchrony cues in the presence or absence of rhythmic pattern cues. Results established that infants of all ages could discriminate between two different audio-visual rhythmic events. Only 10-month-olds detected a desynchronization of the auditory and visual…
Hierarchical content-based image retrieval by dynamic indexing and guided search
NASA Astrophysics Data System (ADS)
You, Jane; Cheung, King H.; Liu, James; Guo, Linong
2003-12-01
This paper presents a new approach to content-based image retrieval by using dynamic indexing and guided search in a hierarchical structure, and extending data mining and data warehousing techniques. The proposed algorithms include: a wavelet-based scheme for multiple image feature extraction, the extension of a conventional data warehouse and an image database to an image data warehouse for dynamic image indexing, an image data schema for hierarchical image representation and dynamic image indexing, a statistically based feature selection scheme to achieve flexible similarity measures, and a feature component code to facilitate query processing and guide the search for the best matching. A series of case studies are reported, which include a wavelet-based image color hierarchy, classification of satellite images, tropical cyclone pattern recognition, and personal identification using multi-level palmprint and face features.
Diuk, Carlos; Tsai, Karin; Wallis, Jonathan; Botvinick, Matthew; Niv, Yael
2013-03-27
Studies suggest that dopaminergic neurons report a unitary, global reward prediction error signal. However, learning in complex real-life tasks, in particular tasks that show hierarchical structure, requires multiple prediction errors that may coincide in time. We used functional neuroimaging to measure prediction error signals in humans performing such a hierarchical task involving simultaneous, uncorrelated prediction errors. Analysis of signals in a priori anatomical regions of interest in the ventral striatum and the ventral tegmental area indeed evidenced two simultaneous, but separable, prediction error signals corresponding to the two levels of hierarchy in the task. This result suggests that suitably designed tasks may reveal a more intricate pattern of firing in dopaminergic neurons. Moreover, the need for downstream separation of these signals implies possible limitations on the number of different task levels that we can learn about simultaneously.
Complexity and dynamics of topological and community structure in complex networks
NASA Astrophysics Data System (ADS)
Berec, Vesna
2017-07-01
Complexity is highly susceptible to variations in the network dynamics, reflected on its underlying architecture where topological organization of cohesive subsets into clusters, system's modular structure and resulting hierarchical patterns, are cross-linked with functional dynamics of the system. Here we study connection between hierarchical topological scales of the simplicial complexes and the organization of functional clusters - communities in complex networks. The analysis reveals the full dynamics of different combinatorial structures of q-th-dimensional simplicial complexes and their Laplacian spectra, presenting spectral properties of resulting symmetric and positive semidefinite matrices. The emergence of system's collective behavior from inhomogeneous statistical distribution is induced by hierarchically ordered topological structure, which is mapped to simplicial complex where local interactions between the nodes clustered into subcomplexes generate flow of information that characterizes complexity and dynamics of the full system.
Lv, Tong; Cheng, Zhongjun; Zhang, Dongjie; Zhang, Enshuang; Zhao, Qianlong; Liu, Yuyan; Jiang, Lei
2016-09-21
Recently, superhydrophobic surfaces with tunable wettability have aroused much attention. Noticeably, almost all present smart performances rely on the variation of surface chemistry on static micro/nanostructure, to obtain a surface with dynamically tunable micro/nanostructure, especially that can memorize and keep different micro/nanostructures and related wettabilities, is still a challenge. Herein, by creating micro/nanostructured arrays on shape memory polymer, a superhydrophobic surface that has shape memory ability in changing and recovering its hierarchical structures and related wettabilities was reported. Meanwhile, the surface was successfully used in the rewritable functional chip for droplet storage by designing microstructure-dependent patterns, which breaks through current research that structure patterns cannot be reprogrammed. This article advances a superhydrophobic surface with shape memory hierarchical structure and the application in rewritable functional chip, which could start some fresh ideas for the development of smart superhydrophobic surface.
Principles of Temporal Processing Across the Cortical Hierarchy.
Himberger, Kevin D; Chien, Hsiang-Yun; Honey, Christopher J
2018-05-02
The world is richly structured on multiple spatiotemporal scales. In order to represent spatial structure, many machine-learning models repeat a set of basic operations at each layer of a hierarchical architecture. These iterated spatial operations - including pooling, normalization and pattern completion - enable these systems to recognize and predict spatial structure, while robust to changes in the spatial scale, contrast and noisiness of the input signal. Because our brains also process temporal information that is rich and occurs across multiple time scales, might the brain employ an analogous set of operations for temporal information processing? Here we define a candidate set of temporal operations, and we review evidence that they are implemented in the mammalian cerebral cortex in a hierarchical manner. We conclude that multiple consecutive stages of cortical processing can be understood to perform temporal pooling, temporal normalization and temporal pattern completion. Copyright © 2018 The Authors. Published by Elsevier Ltd.. All rights reserved.
ERIC Educational Resources Information Center
Snyder, Robin M.
2015-01-01
The field of topic modeling has become increasingly important over the past few years. Topic modeling is an unsupervised machine learning way to organize text (or image or DNA, etc.) information such that related pieces of text can be identified. This paper/session will present/discuss the current state of topic modeling, why it is important, and…
ERIC Educational Resources Information Center
Ladyshewsky, Richard K.
2015-01-01
This research explores differences in multiple choice test (MCT) scores in a cohort of post-graduate students enrolled in a management and leadership course. A total of 250 students completed the MCT in either a supervised in-class paper and pencil test or an unsupervised online test. The only statistically significant difference between the nine…
Exploiting Secondary Sources for Unsupervised Record Linkage
2004-01-01
paper, we present an extension to Apollo’s active learning component to Report Documentation Page Form ApprovedOMB No. 0704-0188 Public reporting...Sources address the issue of user involvement. Using secondary sources, a system can autonomously answer questions posed by its active learning component...over, we present how Apollo utilizes the identified sec- ondary sources in an unsupervised active learning pro- cess. Apollo’s learning algorithm
Belgiu, Mariana; Dr Guţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Brown, Justin C; Ko, Emily M; Schmitz, Kathryn H
2015-02-01
The health benefits of exercise increase in dose-response fashion among cancer survivors. However, it is unclear how to identify cancer survivors who may require a pre-exercise evaluation before they progress from the common recommendation of walking to unsupervised moderate- to vigorous-intensity exercise. To clarify how to identify cancer survivors who should undergo a pre-exercise evaluation before they progress from the common recommendation of walking to unsupervised moderate- to vigorous-intensity exercise. Electronic survey. Forty-seven (n = 47) experts in the field of exercise physiology, rehabilitation medicine, and cancer survivorship. Not applicable. We synthesized peer-reviewed guidelines for exercise and cancer survivorship and identified 82 health factors that may warrant a pre-exercise evaluation before a survivor engages in unsupervised moderate- to vigorous-intensity exercise. The 82 health factors were classified into 3 domains: (1) clinical health factors; (2) comorbidity and device health factors; and (3) medications. We surveyed a sample of experts asking them to identify which of the 82 health factors among cancer survivors would indicate the need for a pre-exercise evaluation before they engaged in moderate- to vigorous-intensity exercise. The response rate to our survey was 75% (n = 47). Across the 3 domains of health factors, acute symptoms, comorbidities, and medications related to cardiovascular disease were agreed on to indicate a pre-exercise evaluation for survivors before they engaged in unsupervised moderate- to vigorous-intensity exercise. Other health factors in the survey included hematologic, musculoskeletal, systemic, gastrointestinal, pulmonary, and neurological symptoms and comorbidities. Eighteen experts (38%) said it was difficult to provide absolute answers because no 2 patients are alike, and their decisions are made on a case-by-case basis. The results from this expert survey will help to identify which cancer survivors should undergo a pre-exercise evaluation before they engage in unsupervised moderate- to vigorous-intensity exercise. Copyright © 2015 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Ylioja, T.; Slone, D.H.; Ayres, M.P.
2005-01-01
The impacts on forests of tree-killing bark beetles can depend on the species composition of potential host trees. Host susceptibility might be an intrinsic property of tree species, or it might depend on spatial patterning of alternative host species. We compared the susceptibility of loblolly pine (Pinus taeda) and Virginia pine (P. virginiana) to southern pine beetle (Dendroctonus frontalis) at two hierarchical levels of geographic scale: within beetle infestations in heterospecific stands (extent ranging from 0.28 to 0.65 ha), and across a forest landscape (extent 72,500 ha) that was dominated by monospecific stands. In the former, beetles preferentially attacked Virginia pine (tree mortality = 65-100% in Virginia pine versus 0-66% in loblolly pine), but in the latter, loblolly stands were more susceptible than Virginia stands. This hierarchical transition in host susceptibility was predicted from knowledge of (1) a behavioral preference of beetles for attacking loblolly versus Virginia pine, (2) a negative correlation between preference and performance, and (3) a mismatch in the domain of scale between demographics and host selection by individuals. There is value for forest management in understanding the processes that can produce hierarchical transitions in ecological patterns. Copyright ?? 2005 by the Society of American Foresters.
Phonological Concept Learning.
Moreton, Elliott; Pater, Joe; Pertsova, Katya
2017-01-01
Linguistic and non-linguistic pattern learning have been studied separately, but we argue for a comparative approach. Analogous inductive problems arise in phonological and visual pattern learning. Evidence from three experiments shows that human learners can solve them in analogous ways, and that human performance in both cases can be captured by the same models. We test GMECCS (Gradual Maximum Entropy with a Conjunctive Constraint Schema), an implementation of the Configural Cue Model (Gluck & Bower, ) in a Maximum Entropy phonotactic-learning framework (Goldwater & Johnson, ; Hayes & Wilson, ) with a single free parameter, against the alternative hypothesis that learners seek featurally simple algebraic rules ("rule-seeking"). We study the full typology of patterns introduced by Shepard, Hovland, and Jenkins () ("SHJ"), instantiated as both phonotactic patterns and visual analogs, using unsupervised training. Unlike SHJ, Experiments 1 and 2 found that both phonotactic and visual patterns that depended on fewer features could be more difficult than those that depended on more features, as predicted by GMECCS but not by rule-seeking. GMECCS also correctly predicted performance differences between stimulus subclasses within each pattern. A third experiment tried supervised training (which can facilitate rule-seeking in visual learning) to elicit simple rule-seeking phonotactic learning, but cue-based behavior persisted. We conclude that similar cue-based cognitive processes are available for phonological and visual concept learning, and hence that studying either kind of learning can lead to significant insights about the other. Copyright © 2015 Cognitive Science Society, Inc.