Fast max-margin clustering for unsupervised word sense disambiguation in biomedical texts
Duan, Weisi; Song, Min; Yates, Alexander
2009-01-01
Background We aim to solve the problem of determining word senses for ambiguous biomedical terms with minimal human effort. Methods We build a fully automated system for Word Sense Disambiguation by designing a system that does not require manually-constructed external resources or manually-labeled training examples except for a single ambiguous word. The system uses a novel and efficient graph-based algorithm to cluster words into groups that have the same meaning. Our algorithm follows the principle of finding a maximum margin between clusters, determining a split of the data that maximizes the minimum distance between pairs of data points belonging to two different clusters. Results On a test set of 21 ambiguous keywords from PubMed abstracts, our system has an average accuracy of 78%, outperforming a state-of-the-art unsupervised system by 2% and a baseline technique by 23%. On a standard data set from the National Library of Medicine, our system outperforms the baseline by 6% and comes within 5% of the accuracy of a supervised system. Conclusion Our system is a novel, state-of-the-art technique for efficiently finding word sense clusters, and does not require training data or human effort for each new word to be disambiguated. PMID:19344480
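The max-min objective described in this abstract (a split that maximizes the minimum distance between points in different clusters) can be illustrated for the two-cluster case with a standard technique: run Kruskal's minimum-spanning-tree merging and stop when two components remain. This is a minimal sketch under stated assumptions, not the authors' graph-based algorithm. Euclidean distance, the function name `max_margin_split`, and the toy points are illustrative choices that do not come from the paper.

```python
import math
from itertools import combinations


def max_margin_split(points):
    """Split points into two clusters so that the smallest distance between
    points in different clusters is as large as possible.

    Assumption: this uses Kruskal-style single-linkage merging, a standard
    way to reach the max-min objective; it is not the paper's algorithm.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise distances, shortest first.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    components = n
    for _, i, j in edges:
        if components == 2:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    root = find(0)
    left = [k for k in range(n) if find(k) == root]
    right = [k for k in range(n) if find(k) != root]
    return left, right


# Hypothetical toy data: two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11)]
left_cluster, right_cluster = max_margin_split(toy_points)
```

The all-pairs edge list makes this sketch quadratic in the number of points, so it shows the objective rather than the efficiency the abstract claims.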
An experimental study of graph connectivity for unsupervised word sense disambiguation.
Navigli, Roberto; Lapata, Mirella
2010-04-01
Word sense disambiguation (WSD), the task of identifying the intended meanings (senses) of words in context, has been a long-standing research objective for natural language processing. In this paper, we are concerned with graph-based algorithms for large-scale WSD. Under this framework, finding the right sense for a given word amounts to identifying the most "important" node among the set of graph nodes representing its senses. We introduce a graph-based WSD algorithm which has few parameters and does not require sense-annotated data for training. Using this algorithm, we investigate several measures of graph connectivity with the aim of identifying those best suited for WSD. We also examine how the chosen lexicon and its connectivity influence WSD performance. We report results on standard data sets and show that our graph-based approach performs comparably to the state of the art.
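The idea of picking the most "important" sense node can be sketched with the simplest connectivity measure, node degree. The abstract does not list the measures it compares, so degree is used here only as an assumed example; the sense labels, the toy graph, and the function name `best_sense_by_degree` are hypothetical.

```python
def best_sense_by_degree(graph, candidate_senses):
    """Return the candidate sense whose node has the most neighbours.

    graph maps each node to a collection of neighbouring nodes.
    Ties are resolved in favour of the first candidate listed.
    """
    return max(candidate_senses, key=lambda sense: len(graph.get(sense, ())))


# Hypothetical sense graph built around the context of one sentence.
toy_graph = {
    "bank#finance": {"money#1", "loan#1"},
    "bank#river": {"water#1"},
    "money#1": {"bank#finance"},
    "loan#1": {"bank#finance"},
    "water#1": {"bank#river"},
}
chosen_sense = best_sense_by_degree(toy_graph, ["bank#finance", "bank#river"])
```

Other connectivity measures would replace only the `key` function, which is why a single graph can be reused to compare them.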
Experiments in automatic word class and word sense identification for information retrieval
DOE Office of Scientific and Technical Information (OSTI.GOV)
Gauch, S.; Futrelle, R.P.
Automatic identification of related words and automatic detection of word senses are two long-standing goals of researchers in natural language processing. Word class information and word sense identification may enhance the performance of information retrieval systems. Large online corpora and increased computational capabilities make new techniques based on corpus linguistics feasible. Corpus-based analysis is especially needed for corpora from specialized fields for which no electronic dictionaries or thesauri exist. The methods described here use a combination of mutual information and word context to establish word similarities. Then, unsupervised classification is done using clustering in the word space, identifying word classes without pretagging. We also describe an extension of the method to handle the difficult problems of disambiguation and of determining part-of-speech and semantic information for low-frequency words. The method is powerful enough to produce high-quality results on a small corpus of 200,000 words from abstracts in a field of molecular biology.
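The mutual-information step mentioned in this abstract can be illustrated with pointwise mutual information (PMI) over adjacent word pairs. This is a sketch under assumptions: the abstract does not give the exact formulation, and the adjacent-pair context, the function name `pmi_table`, and the toy token list are illustrative only.

```python
import math
from collections import Counter


def pmi_table(tokens):
    """Compute log2(p(x, y) / (p(x) * p(y))) for every adjacent pair (x, y).

    Assumption: "context" is reduced to the immediately following word.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    return {
        (x, y): math.log2(
            (count / n_bi) / ((unigrams[x] / n_uni) * (unigrams[y] / n_uni))
        )
        for (x, y), count in bigrams.items()
    }


# Hypothetical toy corpus.
toy_pmi = pmi_table("a b a b c".split())
```

Words could then be compared by the similarity of their PMI profiles before clustering, which is one plausible reading of "mutual information and word context".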
Chasin, Rachel; Rumshisky, Anna; Uzuner, Ozlem; Szolovits, Peter
2014-01-01
Objective To evaluate state-of-the-art unsupervised methods on the word sense disambiguation (WSD) task in the clinical domain. In particular, to compare graph-based approaches relying on a clinical knowledge base with bottom-up topic-modeling-based approaches. We investigate several enhancements to the topic-modeling techniques that use domain-specific knowledge sources. Materials and methods The graph-based methods use variations of PageRank and distance-based similarity metrics, operating over the Unified Medical Language System (UMLS). Topic-modeling methods use unlabeled data from the Multiparameter Intelligent Monitoring in Intensive Care (MIMIC II) database to derive models for each ambiguous word. We investigate the impact of using different linguistic features for topic models, including UMLS-based and syntactic features. We use a sense-tagged clinical dataset from the Mayo Clinic for evaluation. Results The topic-modeling methods achieve 66.9% accuracy on a subset of the Mayo Clinic's data, while the graph-based methods only reach the 40–50% range, with a most-frequent-sense baseline of 56.5%. Features derived from the UMLS semantic type and concept hierarchies do not produce a gain over bag-of-words features in the topic models, but identifying phrases from UMLS and using syntax does help. Discussion Although topic models outperform graph-based methods, semantic features derived from the UMLS prove too noisy to improve performance beyond bag-of-words. Conclusions Topic modeling for WSD provides superior results in the clinical domain; however, integration of knowledge remains to be effectively exploited. PMID:24441986
Harmony Search Algorithm for Word Sense Disambiguation.
Abed, Saad Adnan; Tiun, Sabrina; Omar, Nazlia
2015-01-01
Word Sense Disambiguation (WSD) is the task of determining which sense of an ambiguous word (word with multiple meanings) is chosen in a particular use of that word, by considering its context. A sentence is considered ambiguous if it contains ambiguous word(s). Practically, any sentence that has been classified as ambiguous usually has multiple interpretations, but just one of them presents the correct interpretation. We propose an unsupervised method that exploits knowledge-based approaches for word sense disambiguation using the Harmony Search Algorithm (HSA) based on a Stanford dependencies generator (HSDG). The role of the dependency generator is to parse sentences to obtain their dependency relations, whereas the goal of using the HSA is to maximize the overall semantic similarity of the set of parsed words. HSA invokes a combination of semantic similarity and relatedness measurements, i.e., Jiang and Conrath (jcn) and an adapted Lesk algorithm, to perform the HSA fitness function. Our proposed method was evaluated on benchmark datasets and yielded results comparable to the state-of-the-art WSD methods. In order to evaluate the effectiveness of the dependency generator, we perform the same methodology without the parser, but with a window of words. The empirical results demonstrate that the proposed method is able to produce effective solutions for most instances of the datasets used. PMID:26422368
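The Lesk-style relatedness that feeds the fitness function can be illustrated with the simplified Lesk heuristic, which counts words shared between the context and each sense gloss. This is only a sketch: it is neither the adapted Lesk variant nor the harmony search used in the paper, and the glosses, sense labels, and function name `simplified_lesk` are hypothetical.

```python
def simplified_lesk(context_words, sense_glosses):
    """Return the sense whose gloss shares the most words with the context.

    Assumption: whitespace tokenization and plain word overlap, with no
    stop-word filtering or gloss expansion.
    """
    context = {word.lower() for word in context_words}

    def overlap(sense):
        return len(context & set(sense_glosses[sense].lower().split()))

    return max(sense_glosses, key=overlap)


# Hypothetical glosses for two senses of "bank".
toy_glosses = {
    "bank#finance": "an institution that accepts deposits and lends money",
    "bank#river": "sloping land beside a body of water",
}
lesk_choice = simplified_lesk("she deposits money at the bank".split(), toy_glosses)
```

A search procedure such as HSA would score whole assignments of senses to all words in a sentence rather than one word at a time, using a measure like this as one term of the score.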
LEARNING SEMANTICS-ENHANCED LANGUAGE MODELS APPLIED TO UNSUPERVISED WSD
DOE Office of Scientific and Technical Information (OSTI.GOV)
VERSPOOR, KARIN; LIN, SHOU-DE
An N-gram language model aims at capturing statistical syntactic word order information from corpora. Although the concept of language models has been applied extensively to handle a variety of NLP problems with reasonable success, the standard model does not incorporate semantic information, and consequently limits its applicability to semantic problems such as word sense disambiguation. We propose a framework that integrates semantic information into the language model schema, allowing a system to exploit both syntactic and semantic information to address NLP problems. Furthermore, acknowledging the limited availability of semantically annotated data, we discuss how the proposed model can be learned without annotated training examples. Finally, we report on a case study showing how the semantics-enhanced language model can be applied to unsupervised word sense disambiguation with promising results.
Co-occurrence graphs for word sense disambiguation in the biomedical domain.
Duque, Andres; Stevenson, Mark; Martinez-Romo, Juan; Araujo, Lourdes
2018-05-01
Word sense disambiguation is a key step for many natural language processing tasks (e.g. summarization, text classification, relation extraction) and presents a challenge to any system that aims to process documents from the biomedical domain. In this paper, we present a new graph-based unsupervised technique to address this problem. The knowledge base used in this work is a graph built with co-occurrence information from medical concepts found in scientific abstracts, and hence adapted to the specific domain. Unlike other unsupervised approaches based on static graphs such as UMLS, in this work the knowledge base takes the context of the ambiguous terms into account. Abstracts downloaded from PubMed are used for building the graph and disambiguation is performed using the personalized PageRank algorithm. Evaluation is carried out over two test datasets widely explored in the literature. Different parameters of the system are also evaluated to test robustness and scalability. Results show that the system is able to outperform state-of-the-art knowledge-based systems, obtaining more than 10% of accuracy improvement in some cases, while only requiring minimal external resources. Copyright © 2018 Elsevier B.V. All rights reserved.
Disambiguating ambiguous biomedical terms in biomedical narrative text: an unsupervised method.
Liu, H; Lussier, Y A; Friedman, C
2001-08-01
With the growing use of Natural Language Processing (NLP) techniques for information extraction and concept indexing in the biomedical domain, a method that quickly and efficiently assigns the correct sense of an ambiguous biomedical term in a given context is needed concurrently. The current status of word sense disambiguation (WSD) in the biomedical domain is that handcrafted rules are used based on contextual material. The disadvantages of this approach are (i) generating WSD rules manually is a time-consuming and tedious task, (ii) maintenance of rule sets becomes increasingly difficult over time, and (iii) handcrafted rules are often incomplete and perform poorly in new domains comprised of specialized vocabularies and different genres of text. This paper presents a two-phase unsupervised method to build a WSD classifier for an ambiguous biomedical term W. The first phase automatically creates a sense-tagged corpus for W, and the second phase derives a classifier for W using the derived sense-tagged corpus as a training set. A formative experiment was performed, which demonstrated that classifiers trained on the derived sense-tagged corpora achieved an overall accuracy of about 97%, with greater than 90% accuracy for each individual ambiguous term.
Acquiring Information from Wider Scope to Improve Event Extraction
2012-05-01
solve all the problems might be hard or even impossible: Word sense disambiguation is already a hard NLP task, and normalizing different expressions...blindfolded woman seen being shot in the head by a hooded militant on a video obtained but not aired by the Arab television station Al-Jazeera. She...imbalance Why are we interested in unsupervised topic features? There is a problem that arises in the evaluation of almost all the tasks in NLP , concerning
An Unsupervised Method for Uncovering Morphological Chains (Open Access, Publisher’s Version)
2015-03-08
Consortium. Marco Baroni, Johannes Matiasek, and Harald Trost. 2002. Unsupervised discovery of morphologically related words based on orthographic and...Better word representations with recursive neural networks for morphology. In CoNLL, Sofia, Bulgaria. Mohamed Maamouri, Ann Bies, Hubert Jin, and Tim
Graph-based word sense disambiguation of biomedical documents.
Agirre, Eneko; Soroa, Aitor; Stevenson, Mark
2010-11-15
Word Sense Disambiguation (WSD), automatically identifying the meaning of ambiguous words in context, is an important stage of text processing. This article presents a graph-based approach to WSD in the biomedical domain. The method is unsupervised and does not require any labeled training data. It makes use of knowledge from the Unified Medical Language System (UMLS) Metathesaurus which is represented as a graph. A state-of-the-art algorithm, Personalized PageRank, is used to perform WSD. When evaluated on the NLM-WSD dataset, the algorithm outperforms other methods that rely on the UMLS Metathesaurus alone. The WSD system is open source licensed and available from http://ixa2.si.ehu.es/ukb/. The UMLS, MetaMap program and NLM-WSD corpus are available from the National Library of Medicine https://www.nlm.nih.gov/research/umls/, http://mmtx.nlm.nih.gov and http://wsd.nlm.nih.gov. Software to convert the NLM-WSD corpus into a format that can be used by our WSD system is available from http://www.dcs.shef.ac.uk/∼marks/biomedical_wsd under open source license.
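Personalized PageRank, named in this abstract, can be sketched as a power iteration whose teleport step returns only to the context (seed) nodes, so that senses well connected to the context accumulate rank. This is a minimal illustration with assumed values: the damping factor 0.85, the iteration count, the toy graph, and the function name `personalized_pagerank` are not taken from the paper or from the UKB software.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration for PageRank with teleportation restricted to seeds.

    graph maps every node to a list of neighbours; every neighbour must
    also appear as a key of graph.
    """
    nodes = list(graph)
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            neighbours = graph[n]
            if not neighbours:
                # A node without neighbours returns its mass to the seeds.
                for m in nodes:
                    new_rank[m] += damping * rank[n] * teleport[m]
                continue
            share = damping * rank[n] / len(neighbours)
            for m in neighbours:
                new_rank[m] += share
        rank = new_rank
    return rank


# Hypothetical concept graph; "virus" and "fever" are the context words.
toy_concepts = {
    "cold#illness": ["virus", "fever"],
    "cold#temperature": ["weather"],
    "virus": ["cold#illness", "fever"],
    "fever": ["cold#illness", "virus"],
    "weather": ["cold#temperature"],
}
toy_rank = personalized_pagerank(toy_concepts, seeds={"virus", "fever"})
best_cold_sense = max(["cold#illness", "cold#temperature"], key=toy_rank.get)
```

In this toy graph the temperature sense is unreachable from the seeds, so it receives no rank at all; in a real knowledge graph the difference between senses would be one of degree.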
On the unsupervised analysis of domain-specific Chinese texts
Deng, Ke; Bol, Peter K.; Li, Kate J.; Liu, Jun S.
2016-01-01
With the growing availability of digitized text data both publicly and privately, there is a great need for effective computational tools to automatically extract information from texts. Because the Chinese language differs most significantly from alphabet-based languages in not specifying word boundaries, most existing Chinese text-mining methods require a prespecified vocabulary and/or a large relevant training corpus, which may not be available in some applications. We introduce an unsupervised method, top-down word discovery and segmentation (TopWORDS), for simultaneously discovering and segmenting words and phrases from large volumes of unstructured Chinese texts, and propose ways to order discovered words and conduct higher-level context analyses. TopWORDS is particularly useful for mining online and domain-specific texts where the underlying vocabulary is unknown or the texts of interest differ significantly from available training corpora. When outputs from TopWORDS are fed into context analysis tools such as topic modeling, word embedding, and association pattern finding, the results are as good as or better than that from using outputs of a supervised segmentation method. PMID:27185919
Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network.
Wu, Yonghui; Jiang, Min; Lei, Jianbo; Xu, Hua
2015-01-01
Rapid growth in electronic health records (EHRs) use has led to an unprecedented expansion of available clinical data in electronic formats. However, much of the important healthcare information is locked in the narrative documents. Therefore Natural Language Processing (NLP) technologies, e.g., Named Entity Recognition that identifies boundaries and types of entities, have been extensively studied to unlock important clinical information in free text. In this study, we investigated a novel deep learning method to recognize clinical entities in Chinese clinical documents using the minimal feature engineering approach. We developed a deep neural network (DNN) to generate word embeddings from a large unlabeled corpus through unsupervised learning and another DNN for the NER task. The experiment results showed that the DNN with word embeddings trained from the large unlabeled corpus outperformed the state-of-the-art CRF model in the minimal feature engineering setting, achieving the highest F1-score of 0.9280. Further analysis showed that word embeddings derived through unsupervised learning from a large unlabeled corpus remarkably improved the DNN with randomized embedding, denoting the usefulness of unsupervised feature learning.
Natural-Annotation-based Unsupervised Construction of Korean-Chinese Domain Dictionary
NASA Astrophysics Data System (ADS)
Liu, Wuying; Wang, Lin
2018-03-01
The large-scale bilingual parallel resource is significant to statistical learning and deep learning in natural language processing. This paper addresses the automatic construction issue of the Korean-Chinese domain dictionary, and presents a novel unsupervised construction method based on the natural annotation in the raw corpus. We firstly extract all Korean-Chinese word pairs from Korean texts according to natural annotations, secondly transform the traditional Chinese characters into the simplified ones, and finally distill out a bilingual domain dictionary after retrieving the simplified Chinese words in an extra Chinese domain dictionary. The experimental results show that our method can automatically build multiple Korean-Chinese domain dictionaries efficiently.
Unsupervised classification of remote multispectral sensing data
NASA Technical Reports Server (NTRS)
Su, M. Y.
1972-01-01
The new unsupervised classification technique for classifying multispectral remote sensing data which can be either from the multispectral scanner or digitized color-separation aerial photographs consists of two parts: (a) a sequential statistical clustering which is a one-pass sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. Applications of the technique using an IBM-7094 computer on multispectral data sets over Purdue's Flight Line C-1 and the Yellowstone National Park test site have been accomplished. Comparisons between the classification maps by the unsupervised technique and the supervised maximum likelihood technique indicate that the classification accuracies are in agreement.
Higgins, Irina; Stringer, Simon; Schnupp, Jan
2017-01-01
The nature of the code used in the auditory cortex to represent complex auditory stimuli, such as naturally spoken words, remains a matter of debate. Here we argue that such representations are encoded by stable spatio-temporal patterns of firing within cell assemblies known as polychronous groups, or PGs. We develop a physiologically grounded, unsupervised spiking neural network model of the auditory brain with local, biologically realistic, spike-time dependent plasticity (STDP) learning, and show that the plastic cortical layers of the network develop PGs which convey substantially more information about the speaker independent identity of two naturally spoken word stimuli than does rate encoding that ignores the precise spike timings. We furthermore demonstrate that such informative PGs can only develop if the input spatio-temporal spike patterns to the plastic cortical areas of the model are relatively stable. PMID:28797034
High-Dimensional Semantic Space Accounts of Priming
ERIC Educational Resources Information Center
Jones, Michael N.; Kintsch, Walter; Mewhort, Douglas J. K.
2006-01-01
A broad range of priming data has been used to explore the structure of semantic memory and to test between models of word representation. In this paper, we examine the computational mechanisms required to learn distributed semantic representations for words directly from unsupervised experience with language. To best account for the variety of…
An unsupervised classification technique for multispectral remote sensing data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Cummings, R. E.
1973-01-01
Description of a two-part clustering technique consisting of (a) a sequential statistical clustering, which is essentially a sequential variance analysis, and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy by the unsupervised technique is found to be comparable to that by traditional supervised maximum-likelihood classification techniques.
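The two-part scheme described here (a one-pass sequential clustering whose output seeds an iterative K-means refinement) can be sketched as follows. This is an illustration under assumptions: a fixed distance threshold stands in for the sequential variance analysis, plain K-means stands in for the "generalized" variant, and the function names and toy samples are hypothetical.

```python
import math


def sequential_seed(points, threshold):
    """One pass over the data: join the nearest centre if it is within
    threshold, otherwise start a new cluster (assumed stand-in for the
    sequential variance analysis)."""
    centers, counts = [], []
    for p in points:
        if centers:
            k = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            if math.dist(p, centers[k]) <= threshold:
                counts[k] += 1
                # Running-mean update of the chosen centre.
                centers[k] = tuple(
                    c + (x - c) / counts[k] for c, x in zip(centers[k], p)
                )
                continue
        centers.append(tuple(p))
        counts.append(1)
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest centre."""
    return [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in points
    ]


def kmeans_refine(points, centers, iterations=10):
    """Plain K-means iterations starting from the supplied centres."""
    for _ in range(iterations):
        labels = assign(points, centers)
        updated = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(xs) / len(members) for xs in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centre
        centers = updated
    return centers, assign(points, centers)


# Hypothetical two-band samples.
toy_samples = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.1, 0.2)]
seed_centers = sequential_seed(toy_samples, threshold=1.0)
final_centers, final_labels = kmeans_refine(toy_samples, seed_centers)
```

Seeding K-means from a single sequential pass fixes both the number of clusters and their starting positions from the data, so no cluster count has to be supplied in advance.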
NASA Astrophysics Data System (ADS)
Su, Tengfei
2018-04-01
In this paper, an unsupervised evaluation scheme for remote sensing image segmentation is developed. Based on a method called under- and over-segmentation aware (UOA), the new approach is improved by overcoming the defect in the part of estimating over-segmentation error. Two cases of such error-prone defect are listed, and edge strength is employed to devise a solution to this issue. Two subsets of high resolution remote sensing images were used to test the proposed algorithm, and the experimental results indicate its superior performance, which is attributed to its improved OSE detection model.
Mehryary, Farrokh; Kaewphan, Suwisa; Hakala, Kai; Ginter, Filip
2016-01-01
Biomedical event extraction is one of the key tasks in biomedical text mining, supporting various applications such as database curation and hypothesis generation. Several systems, some of which have been applied at a large scale, have been introduced to solve this task. Past studies have shown that the identification of the phrases describing biological processes, also known as trigger detection, is a crucial part of event extraction, and notable overall performance gains can be obtained by solely focusing on this sub-task. In this paper we propose a novel approach for filtering falsely identified triggers from large-scale event databases, thus improving the quality of knowledge extraction. Our method relies on state-of-the-art word embeddings, event statistics gathered from the whole biomedical literature, and both supervised and unsupervised machine learning techniques. We focus on EVEX, an event database covering the whole PubMed and PubMed Central Open Access literature containing more than 40 million extracted events. The most frequent EVEX trigger words are hierarchically clustered, and the resulting cluster tree is pruned to identify words that can never act as triggers regardless of their context. For rarely occurring trigger words we introduce a supervised approach trained on the combination of trigger word classification produced by the unsupervised clustering method and manual annotation. The method is evaluated on the official test set of BioNLP Shared Task on Event Extraction. The evaluation shows that the method can be used to improve the performance of the state-of-the-art event extraction systems. This successful effort also translates into removing 1,338,075 potentially incorrect events from EVEX, thus greatly improving the quality of the data. The method is not solely bound to the EVEX resource and can thus be used to improve the quality of any event extraction system or database.
The data and source code for this work are available at: http://bionlp-www.utu.fi/trigger-clustering/.
MARTA GANs: Unsupervised Representation Learning for Remote Sensing Image Classification
NASA Astrophysics Data System (ADS)
Lin, Daoyu; Fu, Kun; Wang, Yang; Xu, Guangluan; Sun, Xian
2017-11-01
With the development of deep learning, supervised learning has frequently been adopted to classify remotely sensed images using convolutional networks (CNNs). However, due to the limited amount of labeled data available, supervised learning is often difficult to carry out. Therefore, we proposed an unsupervised model called multiple-layer feature-matching generative adversarial networks (MARTA GANs) to learn a representation using only unlabeled data. MARTA GANs consists of both a generative model $G$ and a discriminative model $D$. We treat $D$ as a feature extractor. To fit the complex properties of remote sensing data, we use a fusion layer to merge the mid-level and global features. $G$ can produce numerous images that are similar to the training data; therefore, $D$ can learn better representations of remotely sensed images using the training data provided by $G$. The classification results on two widely used remote sensing image databases show that the proposed method significantly improves the classification performance compared with other state-of-the-art methods.
A Novel Unsupervised Segmentation Quality Evaluation Method for Remote Sensing Images
Tang, Yunwei; Jing, Linhai; Ding, Haifeng
2017-01-01
The segmentation of a high spatial resolution remote sensing image is a critical step in geographic object-based image analysis (GEOBIA). Evaluating the performance of segmentation without ground truth data, i.e., unsupervised evaluation, is important for the comparison of segmentation algorithms and the automatic selection of optimal parameters. This unsupervised strategy currently faces several challenges in practice, such as difficulties in designing effective indicators and limitations of the spectral values in the feature representation. This study proposes a novel unsupervised evaluation method to quantitatively measure the quality of segmentation results to overcome these problems. In this method, multiple spectral and spatial features of images are first extracted simultaneously and then integrated into a feature set to improve the quality of the feature representation of ground objects. The indicators designed for spatial stratified heterogeneity and spatial autocorrelation are included to estimate the properties of the segments in this integrated feature set. These two indicators are then combined into a global assessment metric as the final quality score. The trade-offs of the combined indicators are accounted for using a strategy based on the Mahalanobis distance, which can be exhibited geometrically. The method is tested on two segmentation algorithms and three testing images. The proposed method is compared with two existing unsupervised methods and a supervised method to confirm its capabilities. Through comparison and visual analysis, the results verified the effectiveness of the proposed method and demonstrated the reliability and improvements of this method with respect to other methods. PMID:29064416
NASA Astrophysics Data System (ADS)
Jansen, Peter A.; Watter, Scott
2012-03-01
Connectionist language modelling typically has difficulty with syntactic systematicity, or the ability to generalise language learning to untrained sentences. This work develops an unsupervised connectionist model of infant grammar learning. Following the semantic bootstrapping hypothesis, the network distils word category using a developmentally plausible infant-scale database of grounded sensorimotor conceptual representations, as well as a biologically plausible semantic co-occurrence activation function. The network then uses this knowledge to acquire an early benchmark clausal grammar using correlational learning, and further acquires separate conceptual and grammatical category representations. The network displays strongly systematic behaviour indicative of the general acquisition of the combinatorial systematicity present in the grounded infant-scale language stream, outperforms previous contemporary models that contain primarily noun and verb word categories, and successfully generalises broadly to novel untrained sensorimotor grounded sentences composed of unfamiliar nouns and verbs. Limitations as well as implications to later grammar learning are discussed.
Incorporating linguistic knowledge for learning distributed word representations.
Wang, Yan; Liu, Zhiyuan; Sun, Maosong
2015-01-01
Combined with neural language models, distributed word representations achieve significant advantages in computational linguistics and text mining. Most existing models estimate distributed word vectors from large-scale data in an unsupervised fashion, but they do not take rich linguistic knowledge into consideration. Linguistic knowledge can be represented as either link-based knowledge or preference-based knowledge, and we propose knowledge regularized word representation models (KRWR) to incorporate this prior knowledge for learning distributed word representations. Experiment results demonstrate that our estimated word representation achieves better performance in the task of semantic relatedness ranking. This indicates that our methods can efficiently encode both prior knowledge from knowledge bases and statistical knowledge from large-scale text corpora into a unified word representation model, which will benefit many tasks in text mining. PMID:25874581
The long past and short history of the vocabulary of Anglophone psychology.
Benjafield, John G
2012-02-01
How do particular words come to be part of the vocabulary of Anglophone psychology? The present study sampled 600 words with psychological senses from the Oxford English Dictionary, which not only gives the number of senses for each word but also the date and author for the earliest known occurrence of each sense. Analogous information for the same words was taken from PsycINFO. One can distinguish between words for which their psychological sense is the first to occur in the history of the written language (primary psychological words) and words for which their psychological sense only emerges after one or more other senses have become established in the written language (secondary psychological words). To use a distinction made famous by Ebbinghaus, secondary psychological words have both a past and a history in psychology, while primary psychological words only have a history. Secondary psychological words have more connections to other words and occur more frequently in PsycINFO than do primary psychological words. For secondary psychological words, it is possible to trace a process of metaphoric polysemy that provides a basis for the eventual occurrence of the psychological sense of a word. Some primary psychological words are now developing secondary, nonpsychological senses, showing that they are subject to the same metaphoric process as are any other words.
Zagoris, Konstantinos; Pratikakis, Ioannis; Gatos, Basilis
2017-05-03
Word spotting strategies employed in historical handwritten documents face many challenges due to variation in the writing style and intense degradation. In this paper, a new method that permits effective word spotting in handwritten documents is presented. It relies upon document-oriented local features, which take into account information around representative keypoints, as well as a matching process that incorporates spatial context in a local proximity search without using any training data. Experimental results on four historical handwritten datasets for two different scenarios (segmentation-based and segmentation-free) using standard evaluation measures show the improved performance achieved by the proposed methodology.
NASA Technical Reports Server (NTRS)
Odenyo, V. A. O.
1975-01-01
Remote sensing data on computer-compatible tapes of LANDSAT 1 multispectral scanner imagery were analyzed to generate a land use map of the City of Virginia Beach. All four bands were used in both the supervised and unsupervised approaches with the LAYSYS software system. Color IR imagery of a U-2 flight of the same area was also digitized and two sample areas were analyzed via the unsupervised approach. The relationships between the mapped land use and the soils of the area were investigated. A land use land cover map at a scale of 1:24,000 was obtained from the supervised analysis of LANDSAT 1 data. It was concluded that machine analysis of remote sensing data to produce land use maps was feasible; that the LAYSYS software system was usable for this purpose; and that the machine analysis was capable of extracting detailed information from the relatively small scale LANDSAT data in a much shorter time without compromising accuracy.
Algorithms in the historical emergence of word senses.
Ramiro, Christian; Srinivasan, Mahesh; Malt, Barbara C; Xu, Yang
2018-03-06
Human language relies on a finite lexicon to express a potentially infinite set of ideas. A key result of this tension is that words acquire novel senses over time. However, the cognitive processes that underlie the historical emergence of new word senses are poorly understood. Here, we present a computational framework that formalizes competing views of how new senses of a word might emerge by attaching to existing senses of the word. We test the ability of the models to predict the temporal order in which the senses of individual words have emerged, using an historical lexicon of English spanning the past millennium. Our findings suggest that word senses emerge in predictable ways, following an historical path that reflects cognitive efficiency, predominantly through a process of nearest-neighbor chaining. Our work contributes a formal account of the generative processes that underlie lexical evolution.
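The nearest-neighbor chaining idea named in the abstract above can be illustrated with a minimal sketch. It assumes each sense is a point in a vector space and that every new sense attaches to whichever already-emerged sense is closest; the function name `chaining_order`, the sense labels `s1` to `s4`, and the 2-D coordinates are hypothetical, and this greedy toy is not the authors' probabilistic model.

```python
import math

def chaining_order(vectors, first):
    """Predict an emergence order by greedy nearest-neighbor chaining.

    vectors: dict mapping a sense label to its coordinate tuple (toy data).
    first:   label of the sense assumed to have emerged first.
    """
    emerged = [first]
    remaining = set(vectors) - {first}
    while remaining:
        # Choose the unseen sense closest to ANY sense that has already emerged.
        nxt = min(
            remaining,
            key=lambda s: min(math.dist(vectors[s], vectors[e]) for e in emerged),
        )
        emerged.append(nxt)
        remaining.remove(nxt)
    return emerged

# Hypothetical sense vectors for one word; s1 is taken as the earliest sense.
senses = {
    "s1": (0.0, 0.0),
    "s2": (1.0, 0.0),
    "s3": (2.0, 0.5),
    "s4": (0.0, 3.0),
}
order = chaining_order(senses, "s1")
print(order)
```

In this toy data `s3` is reached through `s2` rather than directly from `s1`, which is the chaining behavior: a sense can attach to a recent neighbor even when it lies far from the original sense.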
Schouten, Kim; van der Weijde, Onne; Frasincar, Flavius; Dekker, Rommert
2018-04-01
Using online consumer reviews as electronic word of mouth to assist purchase-decision making has become increasingly popular. The Web provides an extensive source of consumer reviews, but one can hardly read all reviews to obtain a fair evaluation of a product or service. A text processing framework that can summarize reviews would therefore be desirable. A subtask to be performed by such a framework would be to find the general aspect categories addressed in review sentences, for which this paper presents two methods. In contrast to most existing approaches, the first method presented is an unsupervised method that applies association rule mining on co-occurrence frequency data obtained from a corpus to find these aspect categories. While not on par with state-of-the-art supervised methods, the proposed unsupervised method performs better than several simple baselines, a similar but supervised method, and a supervised baseline, with an F1-score of 67%. The second method is a supervised variant that outperforms existing methods with an F1-score of 84%.
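As a rough illustration of association rules over co-occurrence counts, the sketch below computes the confidence of a rule a -> b as the share of sentences containing a that also contain b. The function name `rule_confidence` and the four-sentence corpus are hypothetical, and the abstract does not say how the authors score or threshold their rules, so this is only one plausible reading.

```python
from collections import Counter
from itertools import combinations

def rule_confidence(sentences):
    """Return {(a, b): confidence of rule a -> b} from sentence co-occurrence."""
    single = Counter()  # number of sentences containing each word
    pair = Counter()    # number of sentences containing both words of a pair
    for tokens in sentences:
        words = set(tokens)  # count each word once per sentence
        single.update(words)
        for a, b in combinations(sorted(words), 2):
            pair[(a, b)] += 1
            pair[(b, a)] += 1
    # confidence(a -> b) = count(a and b) / count(a)
    return {(a, b): n / single[a] for (a, b), n in pair.items()}

# Toy review sentences; "food" and "service" stand in for aspect categories.
corpus = [
    ["pizza", "food", "tasty"],
    ["pizza", "food", "cold"],
    ["waiter", "service", "slow"],
    ["pizza", "service"],
]
conf = rule_confidence(corpus)
print(round(conf[("pizza", "food")], 3))
print(conf[("waiter", "service")])
```

A sentence containing "pizza" could then be assigned the category "food" whenever the rule's confidence exceeds a chosen threshold; that threshold is a free parameter of this sketch.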
Exploring supervised and unsupervised methods to detect topics in biomedical text
Lee, Minsuk; Wang, Weiqing; Yu, Hong
2006-01-01
Background Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on information content. Topic detection will benefit many other natural language processing tasks including information retrieval, text summarization and question answering; and is a necessary step towards the building of an information system that provides an efficient way for biologists to seek information from an ocean of literature. Results We have explored the methods of Topic Spotting, a task of text categorization that applies the supervised machine-learning technique naïve Bayes to automatically assign a document to one or more predefined topics; and Topic Clustering, which applies unsupervised hierarchical clustering algorithms to aggregate documents into clusters such that each cluster represents a topic. We have applied our methods to detect topics of more than fifteen thousand articles that represent over sixteen thousand entries in the Online Mendelian Inheritance in Man (OMIM) database. We have explored bag of words as the features. Additionally, we have explored semantic features; namely, the Medical Subject Headings (MeSH) that are assigned to the MEDLINE records, and the Unified Medical Language System (UMLS) semantic types that correspond to the MeSH terms, in addition to bag of words, to facilitate the tasks of topic detection. Our results indicate that incorporating the MeSH terms and the UMLS semantic types as additional features enhances the performance of topic detection, and that naïve Bayes has the highest accuracy, 66.4%, for predicting the topic of an OMIM article as one of the twenty-five topics. Conclusion Our results indicate that the supervised topic spotting methods outperformed the unsupervised topic clustering; on the other hand, the unsupervised topic clustering methods have the advantages of being robust and applicable in real world settings. PMID:16539745
Class imbalance in unsupervised change detection - A diagnostic analysis from urban remote sensing
NASA Astrophysics Data System (ADS)
Leichtle, Tobias; Geiß, Christian; Lakes, Tobia; Taubenböck, Hannes
2017-08-01
Automatic monitoring of changes on the Earth's surface is an intrinsic capability and simultaneously a persistent methodological challenge in remote sensing, especially regarding imagery with very-high spatial resolution (VHR) and complex urban environments. In order to enable a high level of automatization, the change detection problem is solved in an unsupervised way to alleviate efforts associated with collection of properly encoded prior knowledge. In this context, this paper systematically investigates the nature and effects of class distribution and class imbalance in an unsupervised binary change detection application based on VHR imagery over urban areas. For this purpose, a diagnostic framework for sensitivity analysis of a large range of possible degrees of class imbalance is presented, which is of particular importance with respect to unsupervised approaches where the content of images and thus the occurrence and the distribution of classes are generally unknown a priori. Furthermore, this framework can serve as a general technique to evaluate model transferability in any two-class classification problem. The applied change detection approach is based on object-based difference features calculated from VHR imagery and subsequent unsupervised two-class clustering using k-means, genetic k-means and self-organizing map (SOM) clustering. The results from two test sites with different structural characteristics of the built environment demonstrated that classification performance is generally worse in imbalanced class distribution settings while best results were reached in balanced or close to balanced situations. Regarding suitable accuracy measures for evaluating model performance in imbalanced settings, this study revealed that the Kappa statistics show significant response to class distribution while the true skill statistic was widely insensitive to imbalanced classes. 
In general, the genetic k-means clustering algorithm achieved the most robust results with respect to class imbalance while the SOM clustering exhibited a distinct optimization towards a balanced distribution of classes.
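The contrast between Kappa and the true skill statistic reported above can be checked with a short worked example. It uses the standard definitions of Cohen's kappa and of the true skill statistic (sensitivity + specificity - 1); the function name `kappa_and_tss` and the two confusion matrices are hypothetical and are not taken from the study.

```python
def kappa_and_tss(tp, fn, fp, tn):
    """Return (Cohen's kappa, true skill statistic) for a 2x2 confusion matrix."""
    n = tp + fn + fp + tn
    observed = (tp + tn) / n
    # Chance agreement computed from the reference and predicted marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    tss = tp / (tp + fn) + tn / (tn + fp) - 1  # sensitivity + specificity - 1
    return kappa, tss

# Same detector quality (90% sensitivity, 90% specificity) in both cases;
# only the share of the "change" class differs (50% versus 5%).
balanced = kappa_and_tss(tp=450, fn=50, fp=50, tn=450)
imbalanced = kappa_and_tss(tp=45, fn=5, fp=95, tn=855)
print(balanced)
print(imbalanced)
```

With per-class accuracy held fixed, the true skill statistic is the same in both settings, while kappa drops sharply in the imbalanced one because chance agreement depends on the class marginals.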
A cascaded neuro-computational model for spoken word recognition
NASA Astrophysics Data System (ADS)
Hoya, Tetsuya; van Leeuwen, Cees
2010-03-01
In human speech recognition, words are analysed at both pre-lexical (i.e., sub-word) and lexical (word) levels. The aim of this paper is to propose a constructive neuro-computational model that incorporates both these levels as cascaded layers of pre-lexical and lexical units. The layered structure enables the system to handle the variability of real speech input. Within the model, receptive fields of the pre-lexical layer consist of radial basis functions; the lexical layer is composed of units that perform pattern matching between their internal template and a series of labels, corresponding to the winning receptive fields in the pre-lexical layer. The model adapts through self-tuning of all units, in combination with the formation of a connectivity structure through unsupervised (first layer) and supervised (higher layers) network growth. Simulation studies show that the model can achieve a level of performance in spoken word recognition similar to that of a benchmark approach using hidden Markov models, while enabling parallel access to word candidates in lexical decision making.
Wave Scattering and Sensing Strategies in Intermittent Terrestrial Environments
2008-01-01
objects and signal coherence (a measure of signal randomness, which usually determines the sensing system performance) is strongly degraded...3.1 What are Quasi-Wavelets? Until this point, the objects in the cascades have not been explicitly described. We now associate them with wavelet, or...unsupervised classification scheme used the intensity of the lidar returns to map the material types. 4.2 Seismic Measurement Procedure Thirty-six
The Mental Representation of Polysemy across Word Classes
Lopukhina, Anastasiya; Laurinavichyute, Anna; Lopukhin, Konstantin; Dragoy, Olga
2018-01-01
Experimental studies on polysemy have come to contradictory conclusions on whether words with multiple senses are stored as separate or shared mental representations. The present study examined the semantic relatedness and semantic similarity of literal and non-literal (metonymic and metaphorical) senses of three word classes: nouns, verbs, and adjectives. Two methods were used: a psycholinguistic experiment and a distributional analysis of corpus data. In the experiment, participants were presented with 6–12 short phrases containing a polysemous word in literal, metonymic, or metaphorical senses and were asked to classify them so that phrases with the same perceived sense were grouped together. To investigate the impact of professional background on their decisions, participants were controlled for linguistic vs. non-linguistic education. For nouns and verbs, all participants preferred to group together phrases with literal and metonymic senses, but not any other pairs of senses. For adjectives, two pairs of senses were often grouped together: literal with metonymic, and metonymic with metaphorical. Participants with a linguistic background were more accurate than participants with non-linguistic backgrounds, although both groups shared principal patterns of sense classification. For the distributional analysis of corpus data, we used a semantic vector approach to quantify the similarity of phrases with literal, metonymic, and metaphorical senses in the corpora. We found that phrases with literal and metonymic senses had the highest degree of similarity for the three word classes, and that metonymic and metaphorical senses of adjectives had the highest degree of similarity among all word classes. These findings are in line with the experimental results. Overall, the results suggest that the mental representation of a polysemous word depends on its word class. 
In nouns and verbs, literal and metonymic senses are stored together, while metaphorical senses are stored separately; in adjectives, metonymic senses significantly overlap with both literal and metaphorical senses. PMID:29515502
Sanfilippo, Antonio P [Richland, WA; Tratz, Stephen C [Richland, WA; Gregory, Michelle L [Richland, WA; Chappell, Alan R [Seattle, WA; Whitney, Paul D [Richland, WA; Posse, Christian [Seattle, WA; Baddeley, Robert L [Richland, WA; Hohimer, Ryan E [West Richland, WA
2011-10-11
Methods of defining ontologies, word disambiguation methods, computer systems, and articles of manufacture are described according to some aspects. In one aspect, a word disambiguation method includes accessing textual content to be disambiguated, wherein the textual content comprises a plurality of words individually comprising a plurality of word senses, for an individual word of the textual content, identifying one of the word senses of the word as indicative of the meaning of the word in the textual content, for the individual word, selecting one of a plurality of event classes of a lexical database ontology using the identified word sense of the individual word, and for the individual word, associating the selected one of the event classes with the textual content to provide disambiguation of a meaning of the individual word in the textual content.
NASA Astrophysics Data System (ADS)
Keyport, Ren N.; Oommen, Thomas; Martha, Tapas R.; Sajinkumar, K. S.; Gierke, John S.
2018-02-01
A comparative analysis of landslides detected by pixel-based and object-oriented analysis (OOA) methods was performed using very high-resolution (VHR) remotely sensed aerial images of San Juan La Laguna, Guatemala, which witnessed widespread devastation during Hurricane Stan in 2005. A 3-band orthophoto of 0.5 m spatial resolution together with a field-based inventory of 115 landslides was used for the analysis. A binary reference was assigned with a zero value for landslide and unity for non-landslide pixels. The pixel-based analysis was performed using unsupervised classification, which resulted in 11 different trial classes. Detection of landslides using OOA included 2-step K-means clustering to eliminate regions based on brightness, followed by elimination of false positives using object properties such as rectangular fit, compactness, length/width ratio, mean difference of objects, and slope angle. OOA outperformed pixel-based unsupervised classification in both overall accuracy and F-score, for both landslide and non-landslide classes. The overall accuracy for OOA and pixel-based unsupervised classification was 96.5% and 94.3%, respectively, whereas the best F-scores for landslide identification for OOA and pixel-based unsupervised methods were 84.3% and 77.9%, respectively. Results indicate that OOA is able to identify the majority of landslides with few false positives when compared to pixel-based unsupervised classification.
Shifting senses in lexical semantic development
Rabagliati, Hugh; Marcus, Gary F.; Pylkkänen, Liina
2010-01-01
Most words are associated with multiple senses. A DVD can be round (when describing a disc), and a DVD can be an hour long (when describing a movie), and in each case DVD means something different. The possible senses of a word are often predictable, and also constrained, as words cannot take just any meaning: for example, although a movie can be an hour long, it cannot sensibly be described as round (unlike a DVD). Learning the scope and limits of word meaning is vital for the comprehension of natural language, but poses a potentially difficult learnability problem for children. By testing what senses children are willing to assign to a variety of words, we demonstrate that, in comprehension, the problem is solved using a productive learning strategy. Children are perfectly capable of assigning different senses to a word; indeed they are essentially adult-like at assigning licensed meanings. But difficulties arise in determining which senses are assignable: children systematically overestimate the possible senses of a word, allowing meanings that adults rule unlicensed (e.g., taking round movie to refer to a disc). By contrast, this strategy does not extend to production, in which children use licensed, but not unlicensed, senses. Children’s productive comprehension strategy suggests an early emerging facility for using context in sense resolution (a difficult task for natural language processing algorithms), but leaves an intriguing question as to the mechanisms children use to learn a restricted, adult-like set of senses. PMID:20638655
NASA Technical Reports Server (NTRS)
Dasarathy, B. V.
1976-01-01
An algorithm is proposed for dimensionality reduction in the context of clustering techniques based on histogram analysis. The approach is based on an evaluation of the hills and valleys in the unidimensional histograms along the different features and provides an economical means of assessing the significance of the features in a nonparametric unsupervised data environment. The method has relevance to remote sensing applications.
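One way to picture the hills-and-valleys evaluation described above is to count the modes of each feature's one-dimensional histogram and treat features with more than one hill as candidates for separating clusters. The sketch below is an assumption-laden illustration: the function name `count_hills`, the plateau handling, and the example histograms are hypothetical, and the abstract does not give the actual significance criterion.

```python
def count_hills(hist):
    """Count local maxima ("hills") in a list of histogram bin counts."""
    # Collapse flat runs so a plateau is treated as a single value.
    runs = [h for i, h in enumerate(hist) if i == 0 or h != hist[i - 1]]
    # Pad with zeros so hills touching either end of the histogram are counted.
    padded = [0] + runs + [0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i - 1] < padded[i] > padded[i + 1]
    )

# Hypothetical bin counts for two features of the same data set.
feature_a = [1, 4, 9, 4, 1, 0, 2, 8, 3, 1]  # two hills separated by a valley
feature_b = [1, 3, 6, 9, 6, 3, 1]           # a single hill
print(count_hills(feature_a), count_hills(feature_b))
```

Under this reading, `feature_a` would be kept because its valley suggests at least two groups along that axis, while `feature_b` carries no such evidence on its own.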
Yu, Zhiguo; Nguyen, Thang; Dhombres, Ferdinand; Johnson, Todd; Bodenreider, Olivier
2018-01-01
Extracting and understanding information, themes and relationships from large collections of documents is an important task for biomedical researchers. Latent Dirichlet Allocation is an unsupervised topic modeling technique using the bag-of-words assumption that has been applied extensively to unveil hidden thematic information within large sets of documents. In this paper, we added MeSH descriptors to the bag-of-words assumption to generate ‘hybrid topics’, which are mixed vectors of words and descriptors. We evaluated this approach on the quality and interpretability of topics in both a general corpus and a specialized corpus. Our results demonstrated that the coherence of ‘hybrid topics’ is higher than that of regular bag-of-words topics in the specialized corpus. We also found that the proportion of topics that are not associated with MeSH descriptors is higher in the specialized corpus than in the general corpus. PMID:29295179
NASA Astrophysics Data System (ADS)
Nahari, R. V.; Alfita, R.
2018-01-01
Remote sensing technology has been widely used in geographic information systems to obtain data more quickly, accurately and affordably. One advantage of remote sensing imagery (satellite imagery) is that it can be used to analyze land cover and land use. The satellite image data used in this study were images from the Landsat 8 satellite combined with data from the Municipality of Malang government. The satellite image was taken in July 2016. The method used in this study was unsupervised classification. Based on the analysis of the satellite images and field observations, 29% of the land in the Municipality of Malang was plantation, 22% was rice field, 12% was residential area, 10% was land with shrubs, and the remaining 2% was water (lake/reservoir). A shortcoming of the method was that 25% of the land in the area was unidentified because it was covered by cloud. Future research should include cloud removal processing to minimize the unidentified area.
Jimeno Yepes, Antonio
2017-09-01
Word sense disambiguation helps identify the proper sense of ambiguous words in text. Large terminologies such as the UMLS Metathesaurus contain ambiguities, so highly effective disambiguation methods are required. Supervised learning algorithms are one of the approaches used to perform disambiguation. Features extracted from the context of an ambiguous word are used to identify the proper sense of such a word. The type of features has an impact on machine learning methods and thus affects disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word, and we have also explored more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and also allow the use of recurrent neural network classifiers based on Long Short-Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state-of-the-art performance with a macro accuracy of 95.97 on the MSH WSD data set.
Evidential analysis of difference images for change detection of multitemporal remote sensing images
NASA Astrophysics Data System (ADS)
Chen, Yin; Peng, Lijuan; Cremers, Armin B.
2018-03-01
In this article, we develop two methods for unsupervised change detection in multitemporal remote sensing images based on Dempster-Shafer's theory of evidence (DST). In most unsupervised change detection methods, the probability of difference image is assumed to be characterized by mixture models, whose parameters are estimated by the expectation maximization (EM) method. However, the main drawback of the EM method is that it does not consider spatial contextual information, which may entail rather noisy detection results with numerous spurious alarms. To remedy this, we firstly develop an evidence theory based EM method (EEM) which incorporates spatial contextual information in EM by iteratively fusing the belief assignments of neighboring pixels to the central pixel. Secondly, an evidential labeling method in the sense of maximizing a posteriori probability (MAP) is proposed in order to further enhance the detection result. It first uses the parameters estimated by EEM to initialize the class labels of a difference image. Then it iteratively fuses class conditional information and spatial contextual information, and updates labels and class parameters. Finally it converges to a fixed state which gives the detection result. A simulated image set and two real remote sensing data sets are used to evaluate the two evidential change detection methods. Experimental results show that the new evidential methods are comparable to other prevalent methods in terms of total error rate.
Infrared vehicle recognition using unsupervised feature learning based on K-feature
NASA Astrophysics Data System (ADS)
Lin, Jin; Tan, Yihua; Xia, Haijiao; Tian, Jinwen
2018-02-01
Because of the complex battlefield environment, it is difficult to establish a complete knowledge base in practical applications of vehicle recognition algorithms. Infrared vehicle recognition, which plays an important role in remote sensing, has always been difficult and challenging. In this paper we propose a new unsupervised feature learning method based on K-feature to recognize vehicles in infrared images. First, we apply a saliency-based target detection algorithm to the initial image. Then, unsupervised feature learning based on K-feature, which is generated by a K-means clustering algorithm that extracts features by learning a visual dictionary from a large number of unlabeled samples, is used to suppress false alarms and improve accuracy. Finally, the vehicle recognition result is completed by post-processing. Extensive experiments demonstrate that the proposed method achieves satisfactory recognition effectiveness and robustness for vehicle recognition in infrared images under complex backgrounds, and that it also improves reliability.
Application of diffusion maps to identify human factors of self-reported anomalies in aviation.
Andrzejczak, Chris; Karwowski, Waldemar; Mikusinski, Piotr
2012-01-01
A study was conducted to investigate which factors lead pilots to submit voluntary anomaly reports regarding their flight performance. Diffusion Maps (DM) were selected as the method of choice for performing dimensionality reduction on text records for this study. Diffusion Maps have seen successful use in other domains such as image classification and pattern recognition. High-dimensionality data in the form of narrative text reports from the NASA Aviation Safety Reporting System (ASRS) were clustered and categorized by way of dimensionality reduction. Supervised analyses were performed to create a baseline document clustering system. Dimensionality reduction techniques identified concepts or keywords within records, and allowed the creation of a framework for an unsupervised document classification system. The unsupervised clustering algorithm performed similarly to the supervised methods outlined in the study. The dimensionality reduction was performed on 100 of the most commonly occurring words within 126,000 text records describing commercial aviation incidents. This study demonstrates that unsupervised machine clustering and organization of incident reports is possible based on unbiased inputs. Findings from this study reinforced traditional views on what factors contribute to civil aviation anomalies; however, new associations between previously unrelated factors and conditions were also found.
Yang, Yang; Saleemi, Imran; Shah, Mubarak
2013-07-01
This paper proposes a novel representation of articulated human actions, gestures, and facial expressions. The main goals of the proposed approach are: 1) to enable recognition using very few examples, i.e., one or k-shot learning, and 2) meaningful organization of unlabeled datasets by unsupervised clustering. Our proposed representation is obtained by automatically discovering high-level subactions or motion primitives, by hierarchical clustering of observed optical flow in four-dimensional, spatial, and motion flow space. The completely unsupervised proposed method, in contrast to state-of-the-art representations like bag of video words, provides a meaningful representation conducive to visual interpretation and textual labeling. Each primitive action depicts an atomic subaction, like directional motion of limb or torso, and is represented by a mixture of four-dimensional Gaussian distributions. For one-shot and k-shot learning, the sequence of primitive labels discovered in a test video is labeled using KL divergence, and can then be represented as a string and matched against similar strings of training videos. The same sequence can also be collapsed into a histogram of primitives or be used to learn a Hidden Markov model to represent classes. We have performed extensive experiments on recognition by one and k-shot learning as well as unsupervised action clustering on six human actions and gesture datasets, a composite dataset, and a database of facial expressions. These experiments confirm the validity and discriminative nature of the proposed representation.
Application of remote sensing techniques for identification of irrigated crop lands in Arizona
NASA Technical Reports Server (NTRS)
Billings, H. A.
1981-01-01
Satellite imagery was used in a project developed to demonstrate remote sensing methods of determining irrigated acreage in Arizona. The Maricopa water district, west of Phoenix, was chosen as the test area. Band ratioing and unsupervised categorization were used to perform the inventory. For both techniques the irrigation district boundaries and section lines were digitized and calculated and displayed by section. Both estimation techniques were quite accurate in estimating irrigated acreage in the 1979 growing season.
A Semantic Lexicon-Based Approach for Sense Disambiguation and Its WWW Application
NASA Astrophysics Data System (ADS)
di Lecce, Vincenzo; Calabrese, Marco; Soldo, Domenico
This work proposes a basic framework for resolving sense disambiguation through the use of Semantic Lexicon, a machine readable dictionary managing both word senses and lexico-semantic relations. More specifically, polysemous ambiguity characterizing Web documents is discussed. The adopted Semantic Lexicon is WordNet, a lexical knowledge-base of English words widely adopted in many research studies referring to knowledge discovery. The proposed approach extends recent works on knowledge discovery by focusing on the sense disambiguation aspect. By exploiting the structure of WordNet database, lexico-semantic features are used to resolve the inherent sense ambiguity of written text with particular reference to HTML resources. The obtained results may be extended to generic hypertextual repositories as well. Experiments show that polysemy reduction can be used to hint about the meaning of specific senses in given contexts.
Polysemy in Sentence Comprehension: Effects of Meaning Dominance
Foraker, Stephani; Murphy, Gregory L.
2012-01-01
Words like church are polysemous, having two related senses (a building and an organization). Three experiments investigated how polysemous senses are represented and processed during sentence comprehension. On one view, readers retrieve an underspecified, core meaning, which is later specified more fully with contextual information. On another view, readers retrieve one or more specific senses. In a reading task, context that was neutral or biased towards a particular sense preceded a polysemous word. Disambiguating material consistent with only one sense followed, in a second sentence (Experiment 1) or the same sentence (Experiments 2 & 3). Reading the disambiguating material was faster when it was consistent with that context, and dominant senses were committed to more strongly than subordinate senses. Critically, following neutral context, the continuation was read more quickly when it selected the dominant sense, and the degree of sense dominance partially explained the reading time advantage. Similarity of the senses also affected reading times. Across experiments, we found that sense selection may not be completed immediately following a polysemous word but is completed at a sentence boundary. Overall, the results suggest that readers select an individual sense when reading a polysemous word, rather than a core meaning. PMID:23185103
NASA Astrophysics Data System (ADS)
Hortos, William S.
2008-04-01
In previous work by the author, effective persistent and pervasive sensing for recognition and tracking of battlefield targets was seen to be achieved using intelligent algorithms implemented by distributed mobile agents over a composite system of unmanned aerial vehicles (UAVs) for persistence and a wireless network of unattended ground sensors for pervasive coverage of the mission environment. While simulated performance results for the supervised algorithms of the composite system are shown to provide satisfactory target recognition over relatively brief periods of system operation, this performance can degrade by as much as 50% as target dynamics in the environment evolve beyond the period of system operation in which the training data are representative. To overcome this limitation, this paper applies the distributed approach using mobile agents to the network of ground-based wireless sensors alone, without the UAV subsystem, to provide persistent as well as pervasive sensing for target recognition and tracking. The supervised algorithms used in the earlier work are supplanted by unsupervised routines, including competitive-learning neural networks (CLNNs) and new versions of support vector machines (SVMs) for characterization of an unknown target environment. To capture the same physical phenomena from battlefield targets as the composite system, the suite of ground-based sensors can be expanded to include imaging and video capabilities. The spatial density of deployed sensor nodes is increased to allow more precise ground-based location and tracking of detected targets by active nodes. The "swarm" mobile agents enabling WSN intelligence are organized in three processing stages: detection, recognition and sustained tracking of ground targets.
Features formed from the compressed sensor data are down-selected according to an information-theoretic algorithm that reduces redundancy within the feature set, reducing the dimension of samples used in the target recognition and tracking routines. Target tracking is based on simplified versions of Kalman filtration. Accuracy of recognition and tracking of implemented versions of the proposed suite of unsupervised algorithms is somewhat degraded from the ideal. Target recognition and tracking by supervised routines and by unsupervised SVM and CLNN routines in the ground-based WSN is evaluated in simulations using published system values and sensor data from vehicular targets in ground-surveillance scenarios. Results are compared with previously published performance for the system of the ground-based sensor network (GSN) and UAV swarm.
Finding Meaning: Sense Inventories for Improved Word Sense Disambiguation
ERIC Educational Resources Information Center
Brown, Susan Windisch
2010-01-01
The deep semantic understanding necessary for complex natural language processing tasks, such as automatic question-answering or text summarization, would benefit from highly accurate word sense disambiguation (WSD). This dissertation investigates what makes an appropriate and effective sense inventory for WSD. Drawing on theories and…
Learning Semantic Tags from Big Data for Clinical Text Representation.
Li, Yanpeng; Liu, Hongfang
2015-01-01
In clinical text mining, one of the biggest challenges is to represent medical terminologies and n-gram terms in sparse medical reports using either supervised or unsupervised methods. Addressing this issue, we propose a novel method for word and n-gram representation at the semantic level. We first represent each word by its distance to a set of reference features calculated by a reference distance estimator (RDE) learned from labeled and unlabeled data, and then generate new features using simple techniques of discretization, random sampling and merging. The new features are a set of binary rules that can be interpreted as semantic tags derived from words and n-grams. We show that the new features significantly outperform classical bag-of-words and n-grams in the task of heart disease risk factor extraction in the i2b2 2014 challenge. It is promising to see that semantic tags can be used to replace the original text entirely with even better prediction performance, as well as to derive new rules beyond the lexical level.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Tsukada, M.; Sato, K.
2013-07-01
This paper presents an unsupervised learning-based object category formation and recognition method for mobile robot vision. Our method has the following features: detection of feature points and description of features using a scale-invariant feature transform (SIFT), selection of target feature points using one class support vector machines (OC-SVMs), generation of visual words using self-organizing maps (SOMs), formation of labels using adaptive resonance theory 2 (ART-2), and creation and classification of categories on a category map of counter propagation networks (CPNs) for visualizing spatial relations between categories. Classification results of dynamic images using time-series images obtained using two different-size robots and according to movements respectively demonstrate that our method can visualize spatial relations of categories while maintaining time-series characteristics. Moreover, we emphasize the effectiveness of our method for category formation of appearance changes of objects.
Parametric embedding for class visualization.
Iwata, Tomoharu; Saito, Kazumi; Ueda, Naonori; Stromsten, Sean; Griffiths, Thomas L; Tenenbaum, Joshua B
2007-09-01
We propose a new method, parametric embedding (PE), that embeds objects with the class structure into a low-dimensional visualization space. PE takes as input a set of class conditional probabilities for given data points and tries to preserve the structure in an embedding space by minimizing a sum of Kullback-Leibler divergences, under the assumption that samples are generated by a gaussian mixture with equal covariances in the embedding space. PE has many potential uses depending on the source of the input data, providing insight into the classifier's behavior in supervised, semisupervised, and unsupervised settings. The PE algorithm has a computational advantage over conventional embedding methods based on pairwise object relations since its complexity scales with the product of the number of objects and the number of classes. We demonstrate PE by visualizing supervised categorization of Web pages, semisupervised categorization of digits, and the relations of words and latent topics found by an unsupervised algorithm, latent Dirichlet allocation.
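The objective described in the abstract above (a sum of Kullback-Leibler divergences between the input class posteriors and the posteriors implied by a Gaussian mixture in the embedding space) can be sketched as follows. This is a hedged illustration under stated assumptions, not the published algorithm: unit (identity) covariance is assumed for every class, and the function names and the example coordinates are hypothetical. Only the evaluation of the objective is shown, not its minimization.

```python
import math


def embedding_posteriors(point, centers, priors):
    """p(c | r) proportional to prior_c * exp(-0.5 * ||r - center_c||^2).

    Assumes equal, identity covariance for every class in the embedding space.
    """
    logits = [math.log(p) - 0.5 * sum((a - b) ** 2 for a, b in zip(point, c))
              for c, p in zip(centers, priors)]
    top = max(logits)                       # subtract the max for stability
    weights = [math.exp(v - top) for v in logits]
    norm = sum(weights)
    return [w / norm for w in weights]


def pe_objective(input_posteriors, points, centers, priors):
    """Sum over objects of KL(input posterior || embedding-space posterior)."""
    total = 0.0
    for p_in, r in zip(input_posteriors, points):
        q = embedding_posteriors(r, centers, priors)
        total += sum(p * math.log(p / qc) for p, qc in zip(p_in, q) if p > 0)
    return total


# Hypothetical two-class example in a two-dimensional embedding space.
centers = [(-1.0, 0.0), (1.0, 0.0)]
priors = [0.5, 0.5]
cost = pe_objective([[0.9, 0.1]], [(0.0, 0.0)], centers, priors)
```

Each evaluation touches every (object, class) pair once, which is consistent with the abstract's remark that the cost scales with the product of the number of objects and the number of classes.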
The Unsupervised Acquisition of a Lexicon from Continuous Speech.
1995-11-01
Communication, 2(1):57-89, 1982. [42] J. Ziv and A. Lempel. Compression of individual sequences by variable rate coding. IEEE Transactions on...parameters of the compression algorithm, in a never-ending attempt to identify and eliminate the predictable. They lead us to a class of grammars in...the first 10 sentences of the test set, previously unseen by the algorithm. Vertical bars indicate word boundaries. 7.1 Text Compression and Language
Knowledge-Based Topic Model for Unsupervised Object Discovery and Localization.
Niu, Zhenxing; Hua, Gang; Wang, Le; Gao, Xinbo
Unsupervised object discovery and localization aims to discover dominant object classes and localize all object instances in a given image collection without any supervision. Previous work has attempted to tackle this problem with vanilla topic models, such as latent Dirichlet allocation (LDA). However, those methods exploit no prior knowledge about the given image collection to facilitate object discovery. Moreover, the topic models used in those methods suffer from a topic coherence issue: some inferred topics do not have a clear meaning, which limits the final performance of object discovery. In this paper, prior knowledge in the form of so-called must-links is exploited from Web images on the Internet. Furthermore, a novel knowledge-based topic model, called LDA with mixture of Dirichlet trees, is proposed to incorporate the must-links into topic modeling for object discovery. In particular, to better deal with the polysemy of visual words, the must-link is redefined so that one must-link constrains only one or some topic(s) instead of all topics, which leads to significantly improved topic coherence. Moreover, the must-links are built and grouped with respect to specific object classes; thus the must-links in our approach are semantic-specific, which allows discriminative prior knowledge from Web images to be exploited more efficiently. Extensive experiments validated the efficiency of our proposed approach on several data sets. It is shown that our method significantly improves topic coherence and outperforms the unsupervised methods for object discovery and localization.
In addition, compared with discriminative methods, the naturally existing object classes in the given image collection can be subtly discovered, which makes our approach well suited for realistic applications of unsupervised object discovery.
Processing and Representation of Ambiguous Words in Chinese Reading: Evidence from Eye Movements.
Shen, Wei; Li, Xingshan
2016-01-01
In the current study, we used eye tracking to investigate whether senses of polysemous words and meanings of homonymous words are represented and processed similarly or differently in Chinese reading. Readers read sentences containing target words that were either homonymous or polysemous. The contexts of text preceding the target words were manipulated to bias the participants toward reading the ambiguous words according to their dominant, subordinate, or neutral meanings. Similarly, disambiguating regions following the target words were also manipulated to favor either the dominant or subordinate meanings of ambiguous words. The results showed that there were similar eye movement patterns when Chinese participants read sentences containing homonymous and polysemous words. The study also found that participants took longer to read the target word and the disambiguating text following it when the prior context and disambiguating regions favored divergent meanings rather than the same meaning. These results suggested that homonymy and polysemy are represented similarly in the mental lexicon when a particular meaning (sense) is fully specified by disambiguating information. Furthermore, multiple meanings (senses) are represented as separate entries in the mental lexicon.
The 100-Mile Curriculum: Place as an Educative Construct
ERIC Educational Resources Information Center
Giesbrecht, Sheila
2008-01-01
Over the last decades, the ways in which children experience and understand their worlds have radically altered. In still-recent times, children were part of communities; they played in wild places and had unsupervised experiences. Today, the lives of children are increasingly fragmented, solitary, and removed from a sense of place. Children come…
Percha, Bethany; Altman, Russ B.
2013-01-01
The biomedical literature presents a uniquely challenging text mining problem. Sentences are long and complex, the subject matter is highly specialized with a distinct vocabulary, and producing annotated training data for this domain is time consuming and expensive. In this environment, unsupervised text mining methods that do not rely on annotated training data are valuable. Here we investigate the use of random indexing, an automated method for producing vector-space semantic representations of words from large, unlabeled corpora, to address the problem of term normalization in sentences describing drugs and genes. We show that random indexing produces similarity scores that capture some of the structure of PHARE, a manually curated ontology of pharmacogenomics concepts. We further show that random indexing can be used to identify likely word candidates for inclusion in the ontology, and can help localize these new labels among classes and roles within the ontology. PMID:24551397
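Random indexing, as used in the abstract above, builds word vectors by giving every word a sparse random "index vector" and summing the index vectors of its neighbours into a "context vector". The sketch below illustrates that general technique under assumed settings; the dimensionality, number of nonzero entries, window size, and the toy sentences are all hypothetical and are not taken from the paper.

```python
import math
import random


def make_index_vector(rng, dim, nonzeros):
    """Sparse ternary vector: a few random positions set to +1 or -1."""
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzeros):
        vec[pos] = rng.choice((-1, 1))
    return vec


def random_indexing(sentences, dim=64, nonzeros=4, window=2, seed=0):
    """Return a context vector per word, summed from neighbours' index vectors."""
    rng = random.Random(seed)
    index, context = {}, {}
    for tokens in sentences:
        for tok in tokens:
            if tok not in index:
                index[tok] = make_index_vector(rng, dim, nonzeros)
                context[tok] = [0] * dim
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    neighbour = index[tokens[j]]
                    ctx = context[tok]
                    for d in range(dim):
                        ctx[d] += neighbour[d]
    return context


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy sentences: two drug names that occur in identical contexts.
sentences = [["warfarin", "inhibits", "vkorc1"],
             ["coumadin", "inhibits", "vkorc1"]]
vectors = random_indexing(sentences)
similarity = cosine(vectors["warfarin"], vectors["coumadin"])
```

Words that share contexts end up with similar context vectors, which is the property term normalization relies on; on a real corpus the dimensionality would normally be far larger than in this toy setting.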
Font adaptive word indexing of modern printed documents.
Marinai, Simone; Marino, Emanuele; Soda, Giovanni
2006-08-01
We propose an approach for the word-level indexing of modern printed documents which are difficult to recognize using current OCR engines. By means of word-level indexing, it is possible to retrieve the position of words in a document, enabling queries involving proximity of terms. Web search engines implement this kind of indexing, allowing users to retrieve Web pages on the basis of their textual content. Nowadays, digital libraries hold collections of digitized documents that can be retrieved either by browsing the document images or relying on appropriate metadata assembled by domain experts. Word indexing tools would therefore increase the access to these collections. The proposed system is designed to index homogeneous document collections by automatically adapting to different languages and font styles without relying on OCR engines for character recognition. The approach is based on three main ideas: the use of Self Organizing Maps (SOM) to perform unsupervised character clustering, the definition of one suitable vector-based word representation whose size depends on the word aspect-ratio, and the run-time alignment of the query word with indexed words to deal with broken and touching characters. The most appropriate applications are for processing modern printed documents (17th to 19th centuries) where current OCR engines are less accurate. Our experimental analysis addresses six data sets containing documents ranging from books of the 17th century to contemporary journals.
NASA Astrophysics Data System (ADS)
Martinez Vicente, V.; Simis, S. G. H.; Alegre, R.; Land, P. E.; Groom, S. B.
2013-09-01
Un-supervised hyperspectral remote-sensing reflectance data (<15 km from the shore) were collected from a moving research vessel. Two different processing methods were compared. The results were similar to concurrent Aqua-MODIS and Suomi-NPP-VIIRS satellite data.
Juan-Albarracín, Javier; Fuster-Garcia, Elies; Manjón, José V.; Robles, Montserrat; Aparici, F.; Martí-Bonmatí, L.; García-Gómez, Juan M.
2015-01-01
Automatic brain tumour segmentation has become a key component for the future of brain tumour treatment. Currently, most brain tumour segmentation approaches arise from the supervised learning standpoint, which requires a labelled training dataset from which to infer the models of the classes. The performance of these models is directly determined by the size and quality of the training corpus, whose retrieval becomes a tedious and time-consuming task. On the other hand, unsupervised approaches avoid these limitations but often do not reach results comparable to those of the supervised methods. In this sense, we propose an automated unsupervised method for brain tumour segmentation based on anatomical Magnetic Resonance (MR) images. Four unsupervised classification algorithms, grouped by their structured or non-structured condition, were evaluated within our pipeline. Among the non-structured algorithms, we evaluated K-means, Fuzzy K-means and Gaussian Mixture Model (GMM), whereas as a structured classification algorithm we evaluated Gaussian Hidden Markov Random Field (GHMRF). An automated postprocess based on a statistical approach supported by tissue probability maps is proposed to automatically identify the tumour classes after the segmentations. We evaluated our brain tumour segmentation method with the public BRAin Tumor Segmentation (BRATS) 2013 Test and Leaderboard datasets. Our approach based on the GMM model improves the results obtained by most of the supervised methods evaluated with the Leaderboard set and reaches the second position in the ranking. Our variant based on the GHMRF achieves the first position in the Test ranking of the unsupervised approaches and the seventh position in the general Test ranking, which confirms the method as a viable alternative for brain tumour segmentation. PMID:25978453
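Of the non-structured algorithms named in the abstract above, the Gaussian Mixture Model is the one reported to perform best on the Leaderboard set. As a rough illustration of how such a model is fitted without labels, here is a minimal expectation-maximization sketch for one-dimensional intensities. It is an assumption-laden toy, not the paper's pipeline: real MR data are multi-channel, and the initial means, iteration count, variance floor, and sample values are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, means, iterations=50, min_var=1e-6):
    """Fit a 1-D Gaussian mixture with EM, starting from the given means."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in values:
            joint = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(joint) or 1e-300   # guard against total underflow
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances.
        for c in range(k):
            mass = sum(r[c] for r in resp)
            if mass == 0:
                continue                   # leave an empty component unchanged
            weights[c] = mass / len(values)
            means[c] = sum(r[c] * x for r, x in zip(resp, values)) / mass
            spread = sum(r[c] * (x - means[c]) ** 2
                         for r, x in zip(resp, values)) / mass
            variances[c] = max(min_var, spread)
    return weights, means, variances


# Hypothetical intensities drawn from two well-separated tissue classes.
intensities = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]
weights, means, variances = fit_gmm_1d(intensities, means=[1.0, 4.0])
```

Assigning each value to the component with the highest responsibility yields an unsupervised segmentation; identifying which component corresponds to tumour is a separate step, which the abstract handles with tissue probability maps.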
Information properties of morphologically complex words modulate brain activity during word reading
Hultén, Annika; Lehtonen, Minna; Lagus, Krista; Salmelin, Riitta
2018-01-01
Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet, current understanding of the operations linked to those various stages is mainly descriptive in nature. Approaches developed in the field of computational linguistics may offer a more quantitative approach for understanding brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well‐defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses to word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito‐temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole‐word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages. PMID:29524274
Information properties of morphologically complex words modulate brain activity during word reading.
Hakala, Tero; Hultén, Annika; Lehtonen, Minna; Lagus, Krista; Salmelin, Riitta
2018-06-01
Neuroimaging studies of the reading process point to functionally distinct stages in word recognition. Yet, current understanding of the operations linked to those various stages is mainly descriptive in nature. Approaches developed in the field of computational linguistics may offer a more quantitative approach for understanding brain dynamics. Our aim was to evaluate whether a statistical model of morphology, with well-defined computational principles, can capture the neural dynamics of reading, using the concept of surprisal from information theory as the common measure. The Morfessor model, created for unsupervised discovery of morphemes, is based on the minimum description length principle and attempts to find optimal units of representation for complex words. In a word recognition task, we correlated brain responses to word surprisal values derived from Morfessor and from other psycholinguistic variables that have been linked with various levels of linguistic abstraction. The magnetoencephalography data analysis focused on spatially, temporally and functionally distinct components of cortical activation observed in reading tasks. The early occipital and occipito-temporal responses were correlated with parameters relating to visual complexity and orthographic properties, whereas the later bilateral superior temporal activation was correlated with whole-word based and morphological models. The results show that the word processing costs estimated by the statistical Morfessor model are relevant for brain dynamics of reading during late processing stages. © 2018 The Authors Human Brain Mapping Published by Wiley Periodicals, Inc.
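Surprisal, the common measure used in the abstract above, is the negative log probability of an event: rarer units carry more bits. The sketch below computes it from relative frequencies. This is only an illustration of the information-theoretic definition; it does not reproduce how Morfessor assigns probabilities, and the unit inventory in the example is hypothetical.

```python
import math
from collections import Counter


def surprisal_bits(probability):
    """Surprisal in bits: -log2 p."""
    return -math.log2(probability)


def unigram_surprisals(units):
    """Surprisal of each unit under a relative-frequency (unigram) estimate."""
    counts = Counter(units)
    total = sum(counts.values())
    return {unit: surprisal_bits(count / total)
            for unit, count in counts.items()}


# Hypothetical stream of word or morph units.
units = ["talo", "talo", "ssa", "kin"]
costs = unigram_surprisals(units)
```

In this toy stream the most frequent unit receives the lowest surprisal, which is the sense in which surprisal can serve as an estimate of processing cost.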
A Fast Implementation of the ISOCLUS Algorithm
NASA Technical Reports Server (NTRS)
Memarsadeghi, Nargess; Mount, David M.; Netanyahu, Nathan S.; LeMoigne, Jacqueline
2003-01-01
Unsupervised clustering is a fundamental tool in numerous image processing and remote sensing applications. For example, unsupervised clustering is often used to obtain vegetation maps of an area of interest. This approach is useful when reliable training data are either scarce or expensive, and when relatively little a priori information about the data is available. Unsupervised clustering methods play a significant role in the pursuit of unsupervised classification. One of the most popular and widely used clustering schemes for remote sensing applications is the ISOCLUS algorithm, which is based on the ISODATA method. The algorithm is given a set of n data points (or samples) in d-dimensional space, an integer k indicating the initial number of clusters, and a number of additional parameters. The general goal is to compute a set of cluster centers in d-space. Although there is no specific optimization criterion, the algorithm is similar in spirit to the well known k-means clustering method in which the objective is to minimize the average squared distance of each point to its nearest center, called the average distortion. One significant feature of ISOCLUS over k-means is that clusters may be merged or split, and so the final number of clusters may be different from the number k supplied as part of the input. This algorithm will be described later in this paper. The ISOCLUS algorithm can run very slowly, particularly on large data sets. Given its wide use in remote sensing, its efficient computation is an important goal. We have developed a fast implementation of the ISOCLUS algorithm. Our improvement is based on a recent acceleration to the k-means algorithm, the filtering algorithm, by Kanungo et al. They showed that, by storing the data in a kd-tree, it was possible to significantly reduce the running time of k-means. We have adapted this method for the ISOCLUS algorithm.
For technical reasons, which are explained later, it is necessary to make a minor modification to the ISOCLUS specification. We provide empirical evidence, on both synthetic and Landsat image data sets, that our algorithm's performance is essentially the same as that of ISOCLUS, but with significantly lower running times. We show that our algorithm runs from 3 to 30 times faster than a straightforward implementation of ISOCLUS. Our adaptation of the filtering algorithm involves the efficient computation of a number of cluster statistics that are needed for ISOCLUS, but not for k-means.
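The abstract above defines the k-means objective as the average squared distance of each point to its nearest center (the average distortion). The following minimal sketch shows that quantity and one plain Lloyd-style update step. It is a straightforward brute-force illustration, not the kd-tree filtering algorithm and not ISOCLUS itself (no merging or splitting of clusters); the function names and sample points are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def assign(points, centers):
    """Index of the nearest center for every point (brute force)."""
    return [min(range(len(centers)),
                key=lambda c: squared_distance(p, centers[c]))
            for p in points]


def average_distortion(points, centers):
    """Average squared distance of each point to its nearest center."""
    return sum(min(squared_distance(p, c) for c in centers)
               for p in points) / len(points)


def lloyd_step(points, centers):
    """Move every center to the mean of the points assigned to it."""
    labels = assign(points, centers)
    new_centers = []
    for c, old in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centers.append(old)        # keep an empty cluster's center
            continue
        new_centers.append(tuple(sum(m[d] for m in members) / len(members)
                                 for d in range(len(old))))
    return new_centers


# Hypothetical two-dimensional samples forming two groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(1.0, 1.0), (8.0, 1.0)]
before = average_distortion(points, centers)
centers = lloyd_step(points, centers)
after = average_distortion(points, centers)
```

The brute-force `assign` compares every point with every center, which is the cost that a kd-tree based approach such as the filtering algorithm is meant to reduce.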
A lexicon based method to search for extreme opinions
Gamallo, Pablo
2018-01-01
Studies in sentiment analysis and opinion mining have been focused on many aspects related to opinions, namely polarity classification by making use of positive, negative or neutral values. However, most studies have overlooked the identification of extreme opinions (most negative and most positive opinions) in spite of their vast significance in many applications. We use an unsupervised approach to search for extreme opinions, which is based on the automatic construction of a new lexicon containing the most negative and most positive words. PMID:29799867
A lexicon based method to search for extreme opinions.
Almatarneh, Sattam; Gamallo, Pablo
2018-01-01
Studies in sentiment analysis and opinion mining have been focused on many aspects related to opinions, namely polarity classification by making use of positive, negative or neutral values. However, most studies have overlooked the identification of extreme opinions (most negative and most positive opinions) in spite of their vast significance in many applications. We use an unsupervised approach to search for extreme opinions, which is based on the automatic construction of a new lexicon containing the most negative and most positive words.
Tailoring vocabularies for NLP in sub-domains: a method to detect unused word sense.
Figueroa, Rosa L; Zeng-Treitler, Qing; Goryachev, Sergey; Wiechmann, Eduardo P
2009-11-14
We developed a method to help tailor a comprehensive vocabulary system (e.g. the UMLS) for a sub-domain (e.g. clinical reports) in support of natural language processing (NLP). The method detects unused senses in a sub-domain by comparing the relational neighborhood of a word/term in the vocabulary with the semantic neighborhood of the word/term in the sub-domain. The semantic neighborhood of the word/term in the sub-domain is determined using latent semantic analysis (LSA). We trained and tested the unused sense detection on two clinical text corpora: one contains discharge summaries and the other outpatient visit notes. We were able to detect unused senses with precision from 79% to 87%, recall from 48% to 74%, and an area under the receiver operating characteristic curve (AUC) of 72% to 87%.
NASA Technical Reports Server (NTRS)
LeMoigne, Jacqueline; Laporte, Nadine; Netanyahu, Nathan S.; Zukor, Dorothy (Technical Monitor)
2001-01-01
The characterization and the mapping of land cover/land use of forest areas, such as the Central African rainforest, is a very complex task. This complexity is mainly due to the extent of such areas and, as a consequence, to the lack of full and continuous cloud-free coverage of those large regions by one single remote sensing instrument. In order to provide improved vegetation maps of Central Africa and to develop forest monitoring techniques for applications at the local and regional scales, we propose to utilize multi-sensor remote sensing observations coupled with in-situ data. Fusion and clustering of multi-sensor data are the first steps towards the development of such a forest monitoring system. In this paper, we will describe some preliminary experiments involving the fusion of SAR and Landsat image data of the Lope Reserve in Gabon. Similarly to previous fusion studies, our fusion method is wavelet-based. The fusion provides a new image data set which contains more detailed texture features and preserves the large homogeneous regions that are observed by the Thematic Mapper sensor. The fusion step is followed by unsupervised clustering and provides a vegetation map of the area.
Geological applications of machine learning on hyperspectral remote sensing data
NASA Astrophysics Data System (ADS)
Tse, C. H.; Li, Yi-liang; Lam, Edmund Y.
2015-02-01
The CRISM imaging spectrometer orbiting Mars has been producing a vast amount of data in the visible to infrared wavelengths in the form of hyperspectral data cubes. These data, compared with those obtained from previous remote sensing techniques, yield an unprecedented level of detailed spectral resolution in addition to an ever-increasing level of spatial information. A major challenge brought about by the data is the burden of processing and interpreting these datasets and extracting the relevant information from them. This research approaches the challenge by exploring machine learning methods, especially unsupervised learning, to achieve cluster density estimation and classification, and ultimately to devise an efficient means of identifying minerals. A set of software tools has been constructed in Python to access and experiment with CRISM hyperspectral cubes selected from two specific Mars locations. A machine learning pipeline is proposed and unsupervised learning methods were applied to pre-processed datasets. The resulting data clusters are compared with the published ASTER spectral library and browse data products from the Planetary Data System (PDS). The results demonstrate that this approach is capable of processing the huge amount of hyperspectral data and potentially providing guidance to scientists for more detailed studies.
Modeling language and cognition with deep unsupervised learning: a tutorial overview
Zorzi, Marco; Testolin, Alberto; Stoianov, Ivilin P.
2013-01-01
Deep unsupervised learning in stochastic recurrent neural networks with many layers of hidden units is a recent breakthrough in neural computation research. These networks build a hierarchy of progressively more complex distributed representations of the sensory data by fitting a hierarchical generative model. In this article we discuss the theoretical foundations of this approach and we review key issues related to training, testing and analysis of deep networks for modeling language and cognitive processing. The classic letter and word perception problem of McClelland and Rumelhart (1981) is used as a tutorial example to illustrate how structured and abstract representations may emerge from deep generative learning. We argue that the focus on deep architectures and generative (rather than discriminative) learning represents a crucial step forward for the connectionist modeling enterprise, because it offers a more plausible model of cortical learning as well as a way to bridge the gap between emergentist connectionist models and structured Bayesian models of cognition. PMID:23970869
Integration of multispectral satellite and hyperspectral field data for aquatic macrophyte studies
NASA Astrophysics Data System (ADS)
John, C. M.; Kavya, N.
2014-11-01
Aquatic macrophytes (AM) can serve as useful indicators of water pollution along the littoral zones. The spectral signatures of various AM were investigated to determine whether species could be discriminated by remote sensing. In this study, spectral readings of the identified AM communities were taken using the ASD Fieldspec® Hand Held spectro-radiometer in the wavelength range of 325-1075 nm. The collected specific reflectance spectra were applied to spaceborne multispectral remote sensing data from Worldview-2, acquired on 26th March 2011. The dimensionality of the spectro-radiometric data was reduced using principal components analysis (PCA). Of the PCA axes generated, the first explained 93.472% of the variance of the spectra. Spectral derivative analysis was used to identify the wavelengths at which the difference in reflectance is greatest. The identified wavelengths are 510, 690, 720, 756, 806, 885, 907 and 923 nm. The outputs of PCA and derivative analysis were applied to Worldview-2 satellite data for spectral subsetting. Unsupervised classification was used to classify the AM species using the different spectral subsets, and the accuracy of the results was assessed and compared. The overall accuracy of unsupervised classification using the band combinations Red-Edge, Green, Coastal blue & Red-edge, Yellow, Blue is 100%. The band combinations NIR-1, Green, Coastal blue & NIR-1, Yellow, Blue yielded an accuracy of 82.35%. Existing vegetation indices and new hyperspectral indices for the different types of AM communities were computed. Overall, results of this study suggest that high spectral and spatial resolution images provide useful information for natural resource managers, especially with regard to the location identification and distribution mapping of macrophyte species and their communities.
NASA Astrophysics Data System (ADS)
Bhardwaj, Kaushal; Patra, Swarnajyoti
2018-04-01
Inclusion of spatial information along with spectral features plays a significant role in the classification of remote sensing images. Attribute profiles have already proved their ability to represent spatial information. In order to incorporate proper spatial information, multiple attributes are required, and for each attribute large profiles need to be constructed by varying the filter parameter values within a wide range. Thus, the constructed profiles that represent spectral-spatial information of a hyperspectral image have a huge dimension, which leads to the Hughes phenomenon and increases the computational burden. To mitigate these problems, this work presents an unsupervised feature selection technique that selects a subset of filtered images from the constructed high-dimensional multi-attribute profile that is sufficiently informative to discriminate well among classes. In this regard the proposed technique exploits genetic algorithms (GAs). The fitness function of the GAs is defined in an unsupervised way with the help of mutual information. The effectiveness of the proposed technique is assessed using a one-against-all support vector machine classifier. The experiments conducted on three hyperspectral data sets show the robustness of the proposed method in terms of computation time and classification accuracy.
NASA Technical Reports Server (NTRS)
1974-01-01
The present work gathers together numerous papers describing the use of remote sensing technology for mapping, monitoring, and management of earth resources and man's environment. Studies using various types of sensing equipment are described, including multispectral scanners, radar imagery, spectrometers, lidar, and aerial photography, and both manual and computer-aided data processing techniques are described. Some of the topics covered include: estimation of population density in Tokyo districts from ERTS-1 data, a clustering algorithm for unsupervised crop classification, passive microwave sensing of moist soils, interactive computer processing for land use planning, the use of remote sensing to delineate floodplains, moisture detection from Skylab, scanning thermal plumes, electrically scanning microwave radiometers, oil slick detection by X-band synthetic aperture radar, and the use of space photos for search of oil and gas fields. Individual items are announced in this issue.
NASA Astrophysics Data System (ADS)
Gjaja, Marin N.
1997-11-01
Neural networks for supervised and unsupervised learning are developed and applied to problems in remote sensing, continuous map learning, and speech perception. Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART networks synthesize fuzzy logic and neural networks, and supervised ARTMAP networks incorporate ART modules for prediction and classification. New ART and ARTMAP methods resulting from analyses of data structure, parameter specification, and category selection are developed. Architectural modifications providing flexibility for a variety of applications are also introduced and explored. A new methodology for automatic mapping from Landsat Thematic Mapper (TM) and terrain data, based on fuzzy ARTMAP, is developed. System capabilities are tested on a challenging remote sensing problem, prediction of vegetation classes in the Cleveland National Forest from spectral and terrain features. After training at the pixel level, performance is tested at the stand level, using sites not seen during training. Results are compared to those of maximum likelihood classifiers, back propagation neural networks, and K-nearest neighbor algorithms. Best performance is obtained using a hybrid system based on a convex combination of fuzzy ARTMAP and maximum likelihood predictions. This work forms the foundation for additional studies exploring fuzzy ARTMAP's capability to estimate class mixture composition for non-homogeneous sites. Exploratory simulations apply ARTMAP to the problem of learning continuous multidimensional mappings. A novel system architecture retains basic ARTMAP properties of incremental and fast learning in an on-line setting while adding components to solve this class of problems. The perceptual magnet effect is a language-specific phenomenon arising early in infant speech development that is characterized by a warping of speech sound perception. 
An unsupervised neural network model is proposed that embodies two principal hypotheses supported by experimental data--that sensory experience guides language-specific development of an auditory neural map and that a population vector can predict psychological phenomena based on map cell activities. Model simulations show how a nonuniform distribution of map cell firing preferences can develop from language-specific input and give rise to the magnet effect.
Innovation Engine for Blog Spaces
2011-09-01
...Architecture for mining Wikipedia as a sense-annotated corpus...are mined from a corpus by dictionary learning, and the representation is computed by sparse coding (Sec. 5.5). The topics can be embedded into a...intend to determine the exact sense of a word whose surface form is unknown. This generalizes the original word sense disambiguation problem since we
Chen, Jinying; Yu, Hong
2017-04-01
Allowing patients to access their own electronic health record (EHR) notes through online patient portals has the potential to improve patient-centered care. However, EHR notes contain abundant medical jargon that can be difficult for patients to comprehend. One way to help patients is to reduce information overload and help them focus on medical terms that matter most to them. Targeted education can then be developed to improve patient EHR comprehension and the quality of care. The aim of this work was to develop FIT (Finding Important Terms for patients), an unsupervised natural language processing (NLP) system that ranks medical terms in EHR notes based on their importance to patients. We built FIT on a new unsupervised ensemble ranking model derived from the biased random walk algorithm to combine heterogeneous information resources for ranking candidate terms from each EHR note. Specifically, FIT integrates four single views (rankers) for term importance: patient use of medical concepts, document-level term salience, word co-occurrence based term relatedness, and topic coherence. It also incorporates partial information of term importance as conveyed by terms' unfamiliarity levels and semantic types. We evaluated FIT on 90 expert-annotated EHR notes and used the four single-view rankers as baselines. In addition, we implemented three benchmark unsupervised ensemble ranking methods as strong baselines. FIT achieved 0.885 AUC-ROC for ranking candidate terms from EHR notes to identify important terms. When including term identification, the performance of FIT for identifying important terms from EHR notes was 0.813 AUC-ROC. Both performance scores significantly exceeded the corresponding scores from the four single rankers (P<0.001). FIT also outperformed the three ensemble rankers for most metrics. Its performance is relatively insensitive to its parameter. FIT can automatically identify EHR terms important to patients. 
It may help develop future interventions to improve quality of care. By using unsupervised learning as well as a robust and flexible framework for information fusion, FIT can be readily applied to other domains and applications. Copyright © 2017 Elsevier Inc. All rights reserved.
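The ranking idea in the abstract above can be illustrated with a generic biased random walk (random walk with restart) over a term graph. This is only a minimal sketch under stated assumptions, not FIT's ensemble model: the function name, the toy graph, the edge weights, and the bias values are all hypothetical, and a single bias vector stands in for the several information sources the abstract describes.

```python
def biased_random_walk(graph, bias, damping=0.85, iterations=100):
    """Rank nodes by a random walk that restarts according to `bias`.

    graph: dict mapping node -> dict of neighbour -> edge weight.
    bias:  dict mapping node -> non-negative prior importance (hypothetical).
    """
    nodes = list(graph)
    total_bias = sum(bias.get(n, 0.0) for n in nodes)
    restart = {n: bias.get(n, 0.0) / total_bias for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Every node first receives its share of the restart probability.
        updated = {n: (1.0 - damping) * restart[n] for n in nodes}
        for node, neighbours in graph.items():
            out_weight = sum(neighbours.values())
            if out_weight == 0:
                # Dangling node: hand its mass back via the restart vector.
                for n in nodes:
                    updated[n] += damping * rank[node] * restart[n]
                continue
            for neighbour, weight in neighbours.items():
                updated[neighbour] += damping * rank[node] * weight / out_weight
        rank = updated
    return rank


# Hypothetical co-occurrence graph of three terms from one note.
toy_graph = {
    "metastasis": {"biopsy": 1.0},
    "biopsy": {"metastasis": 1.0, "followup": 1.0},
    "followup": {"biopsy": 1.0},
}
toy_bias = {"metastasis": 3.0, "biopsy": 1.0, "followup": 1.0}
ranks = biased_random_walk(toy_graph, toy_bias)
```

In this toy graph the two end terms are structurally symmetric, so any difference in their final scores comes from the bias vector alone.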
Deep generative learning of location-invariant visual word recognition.
Di Bono, Maria Grazia; Zorzi, Marco
2013-01-01
It is widely believed that orthographic processing implies an approximate, flexible coding of letter position, as shown by relative-position and transposition priming effects in visual word recognition. These findings have inspired alternative proposals about the representation of letter position, ranging from noisy coding across the ordinal positions to relative position coding based on open bigrams. This debate can be cast within the broader problem of learning location-invariant representations of written words, that is, a coding scheme abstracting the identity and position of letters (and combinations of letters) from their eye-centered (i.e., retinal) locations. We asked whether location-invariance would emerge from deep unsupervised learning on letter strings and what type of intermediate coding would emerge in the resulting hierarchical generative model. We trained a deep network with three hidden layers on an artificial dataset of letter strings presented at five possible retinal locations. Though word-level information (i.e., word identity) was never provided to the network during training, linear decoding from the activity of the deepest hidden layer yielded near-perfect accuracy in location-invariant word recognition. Conversely, decoding from lower layers yielded a large number of transposition errors. Analyses of emergent internal representations showed that word selectivity and location invariance increased as a function of layer depth. Word-tuning and location-invariance were found at the level of single neurons, but there was no evidence for bigram coding. Finally, the distributed internal representation of words at the deepest layer showed higher similarity to the representation elicited by the two exterior letters than by other combinations of two contiguous letters, in agreement with the hypothesis that word edges have special status. 
These results reveal that the efficient coding of written words-which was the model's learning objective-is largely based on letter-level information.
Benjafield, John G
2016-05-01
The digital humanities are being applied with increasing frequency to the analysis of historically important texts. In this study, the methods of G. K. Zipf are used to explore the digital history of the vocabulary of psychology. Zipf studied a great many phenomena, from word frequencies to city sizes, showing that they tend to have a characteristic distribution in which there are a few cases that occur very frequently and many more cases that occur very infrequently. We find that the number of new words and word senses that writers contribute to the vocabulary of psychology has such a Zipfian distribution. Moreover, those who make the most contributions, such as William James, tend also to invent new metaphorical senses of words rather than new words. By contrast, those who make the fewest contributions tend to invent entirely new words. The use of metaphor makes a text easier for a reader to understand. While the use of new words requires more effort on the part of the reader, it may lead to more precise understanding than does metaphor. On average, new words and word senses become a part of psychology's vocabulary in the time leading up to World War I, suggesting that psychology was "finding its language" (Danziger, 1997) during this period. (c) 2016 APA, all rights reserved.
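A Zipfian distribution of the kind described above shows up as an approximately straight line when frequency is plotted against rank on log-log axes. The sketch below is a minimal illustration, assuming a hypothetical list with one author name per contributed word; the function names and the numbers are illustrative and are not taken from the study.

```python
import math
from collections import Counter


def rank_frequency(items):
    """Return the counts of each distinct item, largest first."""
    return sorted(Counter(items).values(), reverse=True)


def loglog_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(frequencies) + 1)]
    ys = [math.log(freq) for freq in frequencies]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Hypothetical counts that follow frequency = 120 / rank exactly.
ideal_counts = [120, 60, 40, 30, 24]
slope = loglog_slope(ideal_counts)
```

A slope near -1 is the classic Zipf pattern; real contribution counts would only approximate it.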
Use of Common-Sense Knowledge, Language and Reality in Mathematical Word Problem Solving
ERIC Educational Resources Information Center
Sepeng, Percy
2014-01-01
The study reported in this article sought to explore and observe how grade 9 learners solve real-wor(l)d problems (a) without real context and (b) without real meaning. Learners' abilities to make sense of the decontextualised word problems set in the real world were investigated with regard to learners' use of common sense in relation to problem…
NASA Astrophysics Data System (ADS)
Szu, Harold H.; Buss, James R.; Kopriva, Ivica
2004-04-01
We proposed a physics approach to solve a physical inverse problem, namely to choose the unique equilibrium solution (at the minimum free energy H = E - T_0 S, which includes the Wiener, l.m.s. E, and ICA, Max S, as special cases). The "unsupervised classification" presumes that the required information must be learned and derived directly and solely from the data alone, consistent with the classical Duda-Hart ATR definition of "unlabelled data". Such a truly unsupervised methodology is presented for space-variant image processing for a single pixel in the real-world cases of remote sensing, early tumor detection and SARS. The indeterminacy of the multiple solutions of the inverse problem is regulated or selected by means of the absolute minimum of isothermal free energy as the ground truth of the local equilibrium condition at the single-pixel footprint.
Remote sensing of environmental impact of land use activities
NASA Technical Reports Server (NTRS)
Paul, C. K.
1977-01-01
The capability to monitor land cover, associated in the past with aerial film cameras and radar systems, was discussed in regard to aircraft and spacecraft multispectral scanning sensors. A proposed thematic mapper with greater spectral and spatial resolutions for the fourth LANDSAT is expected to usher in new environmental monitoring capability. In addition, continuing improvements in image classification by supervised and unsupervised computer techniques are being operationally verified for discriminating environmental impacts of human activities on the land. The benefits of employing remote sensing for this discrimination were shown to far outweigh the incremental costs of converting to an aircraft-satellite multistage system.
Assessing Hurricane Katrina Damage to the Mississippi Gulf Coast Using IKONOS Imagery
NASA Technical Reports Server (NTRS)
Spruce, Joseph; McKellip, Rodney
2006-01-01
Hurricane Katrina hit southeastern Louisiana and the Mississippi Gulf Coast as a Category 3 hurricane with storm surges as high as 9 m. Katrina devastated several coastal towns by destroying or severely damaging hundreds of homes. Several Federal agencies are assessing storm impacts and assisting recovery using high-spatial-resolution remotely sensed data from satellite and airborne platforms. High-quality IKONOS satellite imagery was collected on September 2, 2005, over southwestern Mississippi. Pan-sharpened IKONOS multispectral data and ERDAS IMAGINE software were used to classify post-storm land cover for coastal Hancock and Harrison Counties. This classification included a storm debris category of interest to FEMA for disaster mitigation. The classification resulted from combining traditional unsupervised and supervised classification techniques. Higher spatial resolution aerial and handheld photography were used as reference data. Results suggest that traditional classification techniques and IKONOS data can map wood-dominated storm debris in open areas if relevant training areas are used to develop the unsupervised classification signatures. IKONOS data also enabled other hurricane damage assessment, such as flood-deposited mud on lawns and vegetation foliage loss from the storm. IKONOS data has also aided regional Katrina vegetation damage surveys from multidate Land Remote Sensing Satellite and Moderate Resolution Imaging Spectroradiometer data.
NASA Astrophysics Data System (ADS)
Luo, Chang; Wang, Jie; Feng, Gang; Xu, Suhui; Wang, Shiqiang
2017-10-01
Deep convolutional neural networks (CNNs) have been widely used to obtain high-level representation in various computer vision tasks. However, for remote scene classification, there are not sufficient images to train a very deep CNN from scratch. From two viewpoints of generalization power, we propose two promising kinds of deep CNNs for remote scenes and try to find whether deep CNNs need to be deep for remote scene classification. First, we transfer successful pretrained deep CNNs to remote scenes based on the theory that depth of CNNs brings the generalization power by learning available hypothesis for finite data samples. Second, according to the opposite viewpoint that generalization power of deep CNNs comes from massive memorization and shallow CNNs with enough neural nodes have perfect finite sample expressivity, we design a lightweight deep CNN (LDCNN) for remote scene classification. With five well-known pretrained deep CNNs, experimental results on two independent remote-sensing datasets demonstrate that transferred deep CNNs can achieve state-of-the-art results in an unsupervised setting. However, because of its shallow architecture, LDCNN cannot obtain satisfactory performance, whether in an unsupervised, semisupervised, or supervised setting. CNNs really need depth to obtain general features for remote scenes. This paper also provides a baseline for applying deep CNNs to other remote sensing tasks.
NASA Astrophysics Data System (ADS)
Belfatti, Monica A.
Recently developed common core standards echo calls by educators for ensuring that upper elementary students become proficient readers of informational texts. Informational texts have been theorized as causing difficulty for students because they contain linguistic and visual features different from more familiar narrative genres (Lemke, 2004). It has been argued that learning to read informational texts, particularly those with science subject matter, requires making sense of words, images, and the relationships among them (Pappas, 2006). Yet, conspicuously absent in the research are empirical studies documenting ways students make use of textual resources to build textual and conceptual understandings during classroom literacy instruction. This 10-month practitioner research study was designed to investigate the ways a group of ethnically and linguistically diverse fourth graders in one metropolitan school made sense of science information books during dialogically organized literature discussions. In this nontraditional instructional context, I wondered whether and how young students might make use of science informational text features, both words and images, in the midst of collaborative textual and conceptual inquiry. Drawing on methods of constructivist grounded theory and classroom discourse analysis, I analyzed student and teacher talk in 25 discussions of earth and life science books. Digital voice recordings and transcriptions served as the main data sources for this study. I found that, without teacher prompts or mandates to do so, fourth graders raised a wide range of textual and conceptual inquiries about words, images, scientific figures, and phenomena. In addition, my analysis yielded a typology of ways students constructed relationships between words and images within and across page openings of the information books read for their sense-making endeavors. 
The diversity of constructed word-image relationships aided students in raising, exploring, and contesting textual and conceptual ideas. Moreover, through their joint inquiries, students marshaled and evaluated a rich array of resources. Students' sense-making of information books was not contained by the words and images alone; it involved a situated, complex process of making sense of multiple texts, discourses, and epistemologies. These findings suggest educators, theorists, and policy makers reconsider acontextual, linear, hierarchical models for developing elementary students as sense-makers of nonfiction.
Cluster Method Analysis of K. S. C. Image
NASA Technical Reports Server (NTRS)
Rodriguez, Joe, Jr.; Desai, M.
1997-01-01
Information obtained from satellite-based systems has moved to the forefront as a method in the identification of many land cover types. Identification of different land features through remote sensing is an effective tool for regional and global assessment of geometric characteristics. Classification data acquired from remote sensing images have a wide variety of applications. In particular, analysis of remote sensing images has special applications in the classification of various types of vegetation. Results obtained from classification studies of a particular area or region serve towards a greater understanding of what parameters (ecological, temporal, etc.) affect the region being analyzed. In this paper, we make a distinction between both types of classification approaches, although focus is given to the unsupervised classification method using 1987 Thematic Mapper (TM) images of Kennedy Space Center.
Unsupervised Medical Entity Recognition and Linking in Chinese Online Medical Text
Gan, Liang; Cheng, Mian; Wu, Quanyuan
2018-01-01
Online medical text is full of references to medical entities (MEs), which are valuable in many applications, including medical knowledge base (KB) construction, decision support systems, and the treatment of diseases. However, the diverse and ambiguous nature of the surface forms gives rise to great difficulty in ME identification. Many existing solutions have focused on supervised approaches, which are often task-dependent. In other words, applying them to different kinds of corpora or identifying new entity categories requires major effort in data annotation and feature definition. In this paper, we propose unMERL, an unsupervised framework for recognizing and linking medical entities mentioned in Chinese online medical text. For ME recognition, unMERL first exploits a knowledge-driven approach to extract candidate entities from free text. Then, the categories of the candidate entities are determined using a distributed semantic-based approach. For ME linking, we propose a collaborative inference approach which takes full advantage of heterogeneous entity knowledge and unstructured information in the KB. Experimental results on real corpora demonstrate significant benefits compared to recent approaches with respect to both ME recognition and linking. PMID:29849994
Xuan, Junyu; Lu, Jie; Zhang, Guangquan; Luo, Xiangfeng
2015-12-01
Graph mining has been a popular research area because of its numerous application scenarios. Many unstructured and structured data can be represented as graphs, such as documents, chemical molecular structures, and images. However, an issue with current research on graphs is that it cannot adequately discover the topics hidden in graph-structured data, which can be beneficial for both the unsupervised learning and supervised learning of the graphs. Although topic models have proved to be very successful in discovering latent topics, the standard topic models cannot be directly applied to graph-structured data due to the "bag-of-words" assumption. In this paper, an innovative graph topic model (GTM) is proposed to address this issue, which uses Bernoulli distributions to model the edges between nodes in a graph. It can, therefore, make the edges in a graph contribute to latent topic discovery and further improve the accuracy of the supervised and unsupervised learning of graphs. The experimental results on two different types of graph datasets show that the proposed GTM outperforms the latent Dirichlet allocation on classification by using the unveiled topics of these two models to represent graphs.
Fluid Lensing based Machine Learning for Augmenting Earth Science Coral Datasets
NASA Astrophysics Data System (ADS)
Li, A.; Instrella, R.; Chirayath, V.
2016-12-01
Recently, there has been increased interest in monitoring the effects of climate change upon the world's marine ecosystems, particularly coral reefs. These delicate ecosystems are especially threatened due to their sensitivity to ocean warming and acidification, leading to unprecedented levels of coral bleaching and die-off in recent years. However, current global aquatic remote sensing datasets are unable to quantify changes in marine ecosystems at spatial and temporal scales relevant to their growth. In this project, we employ various supervised and unsupervised machine learning algorithms to augment existing datasets from NASA's Earth Observing System (EOS), using high resolution airborne imagery. This method utilizes NASA's ongoing airborne campaigns as well as its spaceborne assets to collect remote sensing data over these afflicted regions, and employs Fluid Lensing algorithms to resolve optical distortions caused by the fluid surface, producing cm-scale resolution imagery of these diverse ecosystems from airborne platforms. Support Vector Machines (SVMs) and K-means clustering methods were applied to satellite imagery at 0.5m resolution, producing segmented maps classifying coral based on percent cover and morphology. Compared to a previous study using multidimensional maximum a posteriori (MAP) estimation to separate these features in high resolution airborne datasets, SVMs are able to achieve above 75% accuracy when augmented with existing MAP estimates, while unsupervised methods such as K-means achieve roughly 68% accuracy, verified by manually segmented reference data provided by a marine biologist. This effort thus has broad applications for coastal remote sensing, by helping marine biologists quantify behavioral trends spanning large areas and over longer timescales, and to assess the health of coral reefs worldwide.
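The unsupervised step mentioned above, K-means clustering, alternates between assigning each sample to its nearest centroid and moving each centroid to the mean of its members. The following is a minimal pure-Python sketch on hypothetical 2-D feature vectors; it is not the authors' pipeline, and the naive initialisation (the first k points) and the sample values are assumptions chosen for readability.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]


def kmeans(points, k, iterations=20):
    # Naive initialisation (an assumption): the first k points are the centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return assign(points, centroids), centroids


# Hypothetical two-band pixel values forming two well-separated groups.
pixels = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
labels, centroids = kmeans(pixels, k=2)
```

Real imagery would need per-pixel feature vectors with more bands and a better initialisation, for example several random restarts.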
Thematic Journeys. Words that I Own
ERIC Educational Resources Information Center
Zingher, Gary
2005-01-01
Children may feel a sense of ownership when they learn a new vocabulary word that genuinely excites them--a dynamic word, a poetic word, a word with a delicious sound or interesting meaning. Right away, they like to try out these words, experiment with them, incorporate them into the speaking and writing, and impress others with their mastery.…
Cross-Language Opinion Lexicon Extraction Using Mutual-Reinforcement Label Propagation
Lin, Zheng; Tan, Songbo; Liu, Yue; Cheng, Xueqi; Xu, Xueke
2013-01-01
There is a growing interest in automatically building opinion lexicon from sources such as product reviews. Most of these methods depend on abundant external resources such as WordNet, which limits the applicability of these methods. Unsupervised or semi-supervised learning provides an optional solution to multilingual opinion lexicon extraction. However, the datasets are imbalanced in different languages. For some languages, the high-quality corpora are scarce or hard to obtain, which limits the research progress. To solve the above problems, we explore a mutual-reinforcement label propagation framework. First, for each language, a label propagation algorithm is applied to a word relation graph, and then a bilingual dictionary is used as a bridge to transfer information between two languages. A key advantage of this model is its ability to make two languages learn from each other and boost each other. The experimental results show that the proposed approach outperforms baseline significantly. PMID:24260190
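The propagation step described above can be illustrated with a minimal sketch. It is not the authors' mutual-reinforcement algorithm: it assumes a single weighted graph in which bilingual dictionary entries are modelled as ordinary cross-language edges, and the toy words, weights, and the function name `propagate_labels` are all hypothetical.

```python
def propagate_labels(graph, seeds, iterations=20):
    """Spread polarity scores from seed words over a weighted word graph.

    graph: dict mapping word -> {neighbour: edge weight}
    seeds: dict mapping word -> polarity in [-1, 1] (kept fixed)
    """
    scores = {word: seeds.get(word, 0.0) for word in graph}
    for _ in range(iterations):
        updated = {}
        for word, neighbours in graph.items():
            if word in seeds:
                updated[word] = seeds[word]  # seed labels stay clamped
                continue
            total = sum(neighbours.values())
            # Weighted average of the neighbours' current scores.
            updated[word] = (
                sum(w * scores[n] for n, w in neighbours.items()) / total
                if total else 0.0
            )
        scores = updated
    return scores


# Toy English/Spanish graph; the en-es edges stand in for dictionary links.
graph = {
    "good": {"great": 1.0, "bueno": 1.0},
    "great": {"good": 1.0},
    "bad": {"awful": 1.0, "malo": 1.0},
    "awful": {"bad": 1.0},
    "bueno": {"good": 1.0, "excelente": 1.0},
    "excelente": {"bueno": 1.0},
    "malo": {"bad": 1.0, "pésimo": 1.0},
    "pésimo": {"malo": 1.0},
}
seeds = {"good": 1.0, "bad": -1.0}
scores = propagate_labels(graph, seeds)
```

In this toy setup only English seeds are given, yet the Spanish words acquire polarity through the dictionary edges, which is the intuition behind using the dictionary as a bridge.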
Carving up Word Meaning: Portioning and Grinding
ERIC Educational Resources Information Center
Frisson, S.; Frazier, L.
2005-01-01
Two eye-tracking experiments investigated the processing of mass nouns used as count nouns and count nouns used as mass nouns. Following Copestake and Briscoe (1995), the basic or underived sense of a word was treated as the input to a derivational rule (''grinding'' or ''portioning'') which produced the derived sense as output. It was…
Improving Acoustic Models by Watching Television
NASA Technical Reports Server (NTRS)
Witbrock, Michael J.; Hauptmann, Alexander G.
1998-01-01
Obtaining sufficient labelled training data is a persistent difficulty for speech recognition research. Although well transcribed data is expensive to produce, there is a constant stream of challenging speech data and poor transcription broadcast as closed-captioned television. We describe a reliable unsupervised method for identifying accurately transcribed sections of these broadcasts, and show how these segments can be used to train a recognition system. Starting from acoustic models trained on the Wall Street Journal database, a single iteration of our training method reduced the word error rate on an independent broadcast television news test set from 62.2% to 59.5%.
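For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between reference and hypothesis divided by the reference length. The sketch below assumes that standard definition (the abstract does not spell out its scoring setup) and a non-empty reference; the helper names are hypothetical. It also shows the relative size of the reported improvement.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the reference prefix so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before, after):
    """Fraction by which an error rate dropped, relative to its old value."""
    return (before - after) / before


# The reported drop from 62.2% to 59.5% is about 4.3% relative.
gain = relative_reduction(62.2, 59.5)
```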
Word List for a Spelling Program.
ERIC Educational Resources Information Center
Smith, Carl B.
What logic should educators use in choosing words for students to learn to spell? Common sense provides the answer: students should learn to spell the words they use in writing. What these words are has been a subject of concern since the beginning of this century. Dozens of word frequency lists have been developed over the years, based primarily…
Bringing a Class to Its Senses.
ERIC Educational Resources Information Center
Eichenberg, Mary Ann
1965-01-01
Students can be taught to create vivid, colorful descriptions. To train their senses and sharpen their word choices and images, they can be asked to (1) list specific adjectives to describe such an image-producing word as "ocean," (2) substitute sharply-etched verbs for general ones in a given sentence, (3) record day-to-day observations in a…
NASA Astrophysics Data System (ADS)
Abdullahi, Sahra; Schardt, Mathias; Pretzsch, Hans
2017-05-01
Forest structure at stand level plays a key role for sustainable forest management, since the biodiversity, productivity, growth and stability of the forest can be positively influenced by managing its structural diversity. In contrast to field-based measurements, remote sensing techniques offer a cost-efficient opportunity to collect area-wide information about forest stand structure with high spatial and temporal resolution. Especially Interferometric Synthetic Aperture Radar (InSAR), which facilitates worldwide acquisition of 3D information independent from weather conditions and illumination, is convenient to capture forest stand structure. This study proposes an unsupervised two-stage clustering approach for forest structure classification based on height information derived from interferometric X-band SAR data which was performed in complex temperate forest stands of Traunstein forest (South Germany). In particular, a four dimensional input data set composed of first-order height statistics was non-linearly projected on a two-dimensional Self-Organizing Map, spatially ordered according to similarity (based on the Euclidean distance) in the first stage and classified using the k-means algorithm in the second stage. The study demonstrated that X-band InSAR data exhibits considerable capabilities for forest structure classification. Moreover, the unsupervised classification approach achieved meaningful and reasonable results by means of comparison to aerial imagery and LiDAR data.
Pupillary Responses to Words That Convey a Sense of Brightness or Darkness
Mathôt, Sebastiaan; Grainger, Jonathan; Strijkers, Kristof
2017-01-01
Theories about embodiment of language hold that when you process a word’s meaning, you automatically simulate associated sensory input (e.g., perception of brightness when you process lamp) and prepare associated actions (e.g., finger movements when you process typing). To test this latter prediction, we measured pupillary responses to single words that conveyed a sense of brightness (e.g., day) or darkness (e.g., night) or were neutral (e.g., house). We found that pupils were largest for words conveying darkness, of intermediate size for neutral words, and smallest for words conveying brightness. This pattern was found for both visually presented and spoken words, which suggests that it was due to the words’ meanings, rather than to visual or auditory properties of the stimuli. Our findings suggest that word meaning is sufficient to trigger a pupillary response, even when this response is not imposed by the experimental task, and even when this response is beyond voluntary control. PMID:28613135
NASA Astrophysics Data System (ADS)
Pal, Alok Ranjan; Saha, Diganta; Dash, Niladri Sekhar; Pal, Antara
2018-05-01
An attempt is made in this paper to report how a supervised methodology has been adopted for the task of word sense disambiguation in Bangla with necessary modifications. At the initial stage, the Naïve Bayes probabilistic model, adopted as a baseline method for sense classification, yields a moderate result of 81% accuracy when applied to a database of the 19 (nineteen) most frequently used ambiguous Bangla words. On an experimental basis, the baseline method is modified with two extensions: (a) inclusion of a lemmatization process into the system, and (b) bootstrapping of the operational process. As a result, the accuracy of the method improves slightly, to 84%, which is a positive signal for the whole process of disambiguation as it opens scope for further modification of the existing method for better results. The data sets used for this experiment include the Bangla POS tagged corpus obtained from the Indian Languages Corpora Initiative, and the Bangla WordNet, an online sense inventory developed at the Indian Statistical Institute, Kolkata. The paper also reports on the challenges and pitfalls of the work that have been closely observed and addressed to achieve the expected level of accuracy.
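A Naïve Bayes sense classifier of the kind used as the baseline can be sketched as follows. This is an illustration only: it assumes bag-of-words context features with add-one smoothing, uses a hypothetical English toy example rather than the Bangla data, and does not include the lemmatization or bootstrapping extensions.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """examples: iterable of (sense, list of context words)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab


def classify(model, context):
    """Return the sense maximising log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, count in sense_counts.items():
        score = math.log(count / total)
        # Add-one smoothing keeps unseen context words from zeroing a sense.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical training contexts for the ambiguous word "bank".
model = train([
    ("finance", ["money", "loan", "account"]),
    ("finance", ["deposit", "money"]),
    ("river", ["water", "shore", "fish"]),
])
```

With this toy model, `classify(model, ["money", "deposit"])` selects the financial sense and `classify(model, ["water", "fish"])` selects the river sense.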
Radiation area monitor device and method
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vencelj, Matjaz; Stowe, Ashley C.; Petrovic, Toni
A radiation area monitor device/method, utilizing: a radiation sensor having a directional radiation sensing capability; a rotation mechanism operable for selectively rotating the radiation sensor such that the directional radiation sensing capability selectively sweeps an area of interest; and a processor operable for analyzing and storing a radiation fingerprint acquired by the radiation sensor as the directional radiation sensing capability selectively sweeps the area of interest. Optionally, the radiation sensor includes a gamma and/or neutron radiation sensor. The device/method selectively operates in: a first supervised mode during which a baseline radiation fingerprint is acquired by the radiation sensor; and a second unsupervised mode during which a subsequent radiation fingerprint is acquired by the radiation sensor, wherein the subsequent radiation fingerprint is compared to the baseline radiation fingerprint and, if a predetermined difference threshold is exceeded, an alert is issued.
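The comparison in the unsupervised mode can be pictured with a small sketch. The record does not say how a fingerprint is represented or how the difference is measured, so the following assumes a fingerprint is a list of count rates per sweep angle and the difference is the largest absolute deviation; both choices and all names are hypothetical.

```python
def fingerprint_difference(baseline, current):
    """Largest absolute count-rate deviation across sweep angles."""
    if len(baseline) != len(current):
        raise ValueError("fingerprints must cover the same sweep angles")
    return max(abs(c - b) for b, c in zip(baseline, current))


def should_alert(baseline, current, threshold):
    """True when the new sweep deviates from the baseline by more than threshold."""
    return fingerprint_difference(baseline, current) > threshold


baseline = [10, 12, 11, 9]   # acquired in the supervised mode
later = [10, 13, 25, 9]      # acquired in the unsupervised mode
alert = should_alert(baseline, later, threshold=5)
```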
Phrasing in the speech and reading of the hearing impaired.
Gregory, J F
1986-08-01
The study reported here explored a partial explanation for the fourth-grade "bottleneck" in literacy advancement by hearing-impaired students. Speech samples from 21 deaf subjects were rated for degree of evident phrasal quality. Likewise, reading comprehension scores for each student were obtained under four reading conditions: reading in whole sentences, in phrases, in fragmented word groups, and in single words. Degree of rated speech phrasality was found to relate significantly and positively to correct recall answers to questions based upon silent reading of passages typed in meaningful word groups (but not when the passages were typed in whole sentences, fragmented word groups, or in single words). The results were taken to suggest that--whereas staccato-speaking deaf students may lack a sense of the phrase altogether--phrasal-speaking deaf youngsters fail to independently apply their phrase sense in the normal reading situation. Thus, both types of deaf youngsters have difficulty affecting the transition to phrase reading that is common for hearing students at or about the fourth-grade level. Finally, I argue that this phrase sense can be instilled in hearing-impaired students and that they can be trained to use it in reading.
Tense and aspect in word problems about motion: diagram, gesture, and the felt experience of time
NASA Astrophysics Data System (ADS)
de Freitas, Elizabeth; Zolkower, Betina
2015-09-01
Word problems about motion contain various conjugated verb forms. As students and teachers grapple with such word problems, they jointly operationalize diagrams, gestures, and language. Drawing on findings from a 3-year research project examining the social semiotics of classroom interaction, we show how teachers and students use gesture and diagram to make sense of complex verb forms in such word problems. We focus on the grammatical category of "aspect" for how it broadens the concept of verb tense. Aspect conveys duration and completion or frequency of an event. The aspect of a verb defines its temporal flow (or lack thereof) and the location of a vantage point for making sense of this durational process.
Age-Related Evolution Patterns in Online Handwriting
2016-01-01
Characterizing age from handwriting (HW) has important applications, as it is key to distinguishing normal HW evolution with age from abnormal HW change, potentially triggered by neurodegenerative decline. We propose, in this work, an original approach for online HW style characterization based on a two-level clustering scheme. The first level generates writer-independent word clusters from raw spatial-dynamic HW information. At the second level, each writer's words are converted into a Bag of Prototype Words that is augmented by an interword stability measure. This two-level HW style representation is input to an unsupervised learning technique, aiming at uncovering HW style categories and their correlation with age. To assess the effectiveness of our approach, we propose information theoretic measures to quantify the gain on age information from each clustering layer. We have carried out extensive experiments on a large public online HW database, augmented by HW samples acquired at Broca Hospital in Paris from people mostly between 60 and 85 years old. Unlike previous works claiming that there is only one pattern of HW change with age, our study reveals three major aging HW styles, one specific to aged people and the two others shared by other age groups. PMID:27752277
The potential of latent semantic analysis for machine grading of clinical case summaries.
Kintsch, Walter
2002-02-01
This paper introduces latent semantic analysis (LSA), a machine learning method for representing the meaning of words, sentences, and texts. LSA induces a high-dimensional semantic space from reading a very large amount of texts. The meaning of words and texts can be represented as vectors in this space and hence can be compared automatically and objectively. A generative theory of the mental lexicon based on LSA is described. The word vectors LSA constructs are context free, and each word, irrespective of how many meanings or senses it has, is represented by a single vector. However, when a word is used in different contexts, context appropriate word senses emerge. Several applications of LSA to educational software are described, involving the ability of LSA to quickly compare the content of texts, such as an essay written by a student and a target essay. An LSA-based software tool is sketched for machine grading of clinical case summaries written by medical students.
ERIC Educational Resources Information Center
Cleary, Anne M.; Claxton, Alexander B.
2015-01-01
This study shows that the presence of a tip-of-the-tongue (TOT) state--the sense that a word is in memory when its retrieval fails--is used as a heuristic for inferring that an inaccessible word has characteristics that are consistent with greater word perceptibility. When reporting a TOT state, people judged an unretrieved word as more likely to…
Reading as Active Sensing: A Computational Model of Gaze Planning in Word Recognition
Ferro, Marcello; Ognibene, Dimitri; Pezzulo, Giovanni; Pirrelli, Vito
2010-01-01
We offer a computational model of gaze planning during reading that consists of two main components: a lexical representation network, acquiring lexical representations from input texts (a subset of the Italian CHILDES database), and a gaze planner, designed to recognize written words by mapping strings of characters onto lexical representations. The model implements an active sensing strategy that selects which characters of the input string are to be fixated, depending on the predictions dynamically made by the lexical representation network. We analyze the developmental trajectory of the system in performing the word recognition task as a function of both increasing lexical competence, and correspondingly increasing lexical prediction ability. We conclude by discussing how our approach can be scaled up in the context of an active sensing strategy applied to a robotic setting. PMID:20577589
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stevens, K; Huang, T; Buttler, D
We present the C-Cat Wordnet package, an open source library for using and modifying Wordnet. The package includes four key features: an API for modifying Synsets, implementations of standard similarity metrics, implementations of well known Word Sense Disambiguation algorithms, and an implementation of the Castanet algorithm. The library is easily extensible and usable in many runtime environments. We demonstrate its use on two standard Word Sense Disambiguation tasks and apply the Castanet algorithm to a corpus.
Raghuram, Jayaram; Miller, David J.; Kesidis, George
2014-01-01
We propose a method for detecting anomalous domain names, with focus on algorithmically generated domain names which are frequently associated with malicious activities such as fast flux service networks, particularly for bot networks (or botnets), malware, and phishing. Our method is based on learning a (null hypothesis) probability model based on a large set of domain names that have been white listed by some reliable authority. Since these names are mostly assigned by humans, they are pronounceable, and tend to have a distribution of characters, words, word lengths, and number of words that are typical of some language (mostly English), and often consist of words drawn from a known lexicon. On the other hand, in the present day scenario, algorithmically generated domain names typically have distributions that are quite different from that of human-created domain names. We propose a fully generative model for the probability distribution of benign (white listed) domain names which can be used in an anomaly detection setting for identifying putative algorithmically generated domain names. Unlike other methods, our approach can make detections without considering any additional (latency producing) information sources, often used to detect fast flux activity. Experiments on a publicly available, large data set of domain names associated with fast flux service networks show encouraging results, relative to several baseline methods, with higher detection rates and low false positive rates. PMID:25685511
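The idea of scoring a name under a model of benign names can be sketched with a much simpler stand-in than the generative model the abstract describes. The following assumes a character bigram model with add-one smoothing trained on a tiny hypothetical whitelist; low average log-likelihood then flags a name as a candidate anomaly. It ignores the word, word-length, and lexicon components mentioned above.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"


def train_bigram(names):
    """Count character transitions, with ^ and $ as start and end markers."""
    pair_counts, context_counts = Counter(), Counter()
    for name in names:
        padded = "^" + name + "$"
        for a, b in zip(padded, padded[1:]):
            pair_counts[a, b] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def avg_log_likelihood(model, name):
    """Average smoothed log-probability per character transition."""
    pair_counts, context_counts = model
    padded = "^" + name + "$"
    size = len(ALPHABET) + 1  # possible next symbols: alphabet plus end marker
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        total += math.log((pair_counts[a, b] + 1) / (context_counts[a] + size))
    return total / (len(padded) - 1)


# Hypothetical whitelist; a real one would hold a very large set of names.
whitelist = ["google", "facebook", "wikipedia", "amazon", "youtube",
             "twitter", "linkedin", "weather", "bookstore", "goodfood"]
model = train_bigram(whitelist)
humanlike = avg_log_likelihood(model, "goodbook")
random_like = avg_log_likelihood(model, "xqzvkjwq")
```

A detector built this way would compare the score against a threshold chosen on held-out benign names; here `humanlike` comes out higher than `random_like`.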
Blind image quality assessment via probabilistic latent semantic analysis.
Yang, Xichen; Sun, Quansen; Wang, Tianshu
2016-01-01
We propose a blind image quality assessment that is highly unsupervised and training free. The new method is based on the hypothesis that the effect caused by distortion can be expressed by certain latent characteristics. Combined with probabilistic latent semantic analysis, the latent characteristics can be discovered by applying a topic model over a visual word dictionary. Four distortion-affected features are extracted to form the visual words in the dictionary: (1) the block-based local histogram; (2) the block-based local mean value; (3) the mean value of contrast within a block; (4) the variance of contrast within a block. Based on the dictionary, the latent topics in the images can be discovered. The discrepancy between the frequency of the topics in an unfamiliar image and a large number of pristine images is applied to measure the image quality. Experimental results for four open databases show that the newly proposed method correlates well with human subjective judgments of diversely distorted images.
Language Model Combination and Adaptation Using Weighted Finite State Transducers
NASA Technical Reports Server (NTRS)
Liu, X.; Gales, M. J. F.; Hieronymus, J. L.; Woodland, P. C.
2010-01-01
In speech recognition systems, language models (LMs) are often constructed by training and combining multiple n-gram models. They can be used either to represent different genres or tasks found in diverse text sources, or to capture stochastic properties of different linguistic symbol sequences, for example, syllables and words. Unsupervised LM adaptation may also be used to further improve robustness to varying styles or tasks. When using these techniques, extensive software changes are often required. In this paper an alternative and more general approach based on weighted finite state transducers (WFSTs) is investigated for LM combination and adaptation. As it is entirely based on well-defined WFST operations, minimal change to decoding tools is needed. A wide range of LM combination configurations can be flexibly supported. An efficient on-the-fly WFST decoding algorithm is also proposed. Significant error rate gains of 7.3% relative were obtained on a state-of-the-art broadcast audio recognition task using a history dependently adapted multi-level LM modelling both syllable and word sequences.
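As background on what combining n-gram models can mean, the sketch below shows plain linear interpolation of component model probabilities. It is not the WFST-based method of the paper; the dictionary representation, the weights, and the function name `interpolate` are assumptions made for illustration.

```python
def interpolate(models, weights, word, history):
    """Linearly interpolate P(word | history) across component models.

    models: list of dicts mapping (history tuple, word) -> probability
    weights: interpolation weights, one per model, summing to 1
    """
    if len(models) != len(weights):
        raise ValueError("need one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("interpolation weights must sum to 1")
    return sum(
        weight * model.get((history, word), 0.0)
        for model, weight in zip(models, weights)
    )


# Hypothetical bigram probabilities from two genre-specific models.
news_lm = {(("the",), "market"): 0.20}
chat_lm = {(("the",), "market"): 0.05}
p = interpolate([news_lm, chat_lm], [0.7, 0.3], "market", ("the",))
```

Adaptation in this simplified picture amounts to re-estimating the weights on data from the target style or task.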
Feature-level sentiment analysis by using comparative domain corpora
NASA Astrophysics Data System (ADS)
Quan, Changqin; Ren, Fuji
2016-06-01
Feature-level sentiment analysis (SA) is able to provide more fine-grained SA on certain opinion targets and has a wider range of applications on E-business. This study proposes an approach based on comparative domain corpora for feature-level SA. The proposed approach makes use of word associations for domain-specific feature extraction. First, we assign a similarity score for each candidate feature to denote its similarity extent to a domain. Then we identify domain features based on their similarity scores on different comparative domain corpora. After that, dependency grammar and a general sentiment lexicon are applied to extract and expand feature-oriented opinion words. Lastly, the semantic orientation of a domain-specific feature is determined based on the feature-oriented opinion lexicons. In evaluation, we compare the proposed method with several state-of-the-art methods (including unsupervised and semi-supervised) using a standard product review test collection. The experimental results demonstrate the effectiveness of using comparative domain corpora.
Unsupervised Feature Selection Based on the Morisita Index for Hyperspectral Images
NASA Astrophysics Data System (ADS)
Golay, Jean; Kanevski, Mikhail
2017-04-01
Hyperspectral sensors are capable of acquiring images with hundreds of narrow and contiguous spectral bands. Compared with traditional multispectral imagery, the use of hyperspectral images allows better performance in discriminating between land-cover classes, but it also results in large redundancy and high computational data processing. To alleviate such issues, unsupervised feature selection techniques for redundancy minimization can be implemented. Their goal is to select the smallest subset of features (or bands) in such a way that all the information content of a data set is preserved as much as possible. The present research deals with the application to hyperspectral images of a recently introduced technique of unsupervised feature selection: the Morisita-Based filter for Redundancy Minimization (MBRM). MBRM is based on the (multipoint) Morisita index of clustering and on the Morisita estimator of Intrinsic Dimension (ID). The fundamental idea of the technique is to retain only the bands which contribute to increasing the ID of an image. In this way, redundant bands are disregarded, since they have no impact on the ID. Besides, MBRM has several advantages over benchmark techniques: in addition to its ability to deal with large data sets, it can capture highly-nonlinear dependences and its implementation is straightforward in any programming environment. Experimental results on freely available hyperspectral images show the good effectiveness of MBRM in remote sensing data processing. Comparisons with benchmark techniques are carried out and random forests are used to assess the performance of MBRM in reducing the data dimensionality without loss of relevant information.
Derwent's Doors: Creative Acts
ERIC Educational Resources Information Center
Gillen, Julia
2007-01-01
Children's early word learning is not usually considered creative in the same sense as artistic productions of later life. Yet early word learning is a creative response to the intrinsic instability of word meaning. As the child acts to participate in her community, she strives for intersubjectivity, manifest in neologisms and under- and…
Doesn't Everyone Have Rights to a Learner's Permit?
ERIC Educational Resources Information Center
Gehrig, Jody; Bazzanella, Mary Beth; Hilton, Cheri; Nassar, Nance; Peterson, Carol; White, Nancy
2009-01-01
The word "rights" often conjures up emotions and images and a sense of entitlement. The word "rights" might be preceded with other words such as "civil," "constitutional," or "human." Merriam-Webster defines a right as "something to which one has a just claim: as the power or privilege to…
Generating quality word sense disambiguation test sets based on MeSH indexing.
Fan, Jung-Wei; Friedman, Carol
2009-11-14
Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning, and is a critical step in biomedical natural language processing, as interpretation of information in text can be correct only if the meanings of their component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms being topical of the text. Preliminary results were promising and revealed important issues to be addressed in biomedical WSD research. We also suggest that, by cross-validating with 2 or 3 annotators, the method should be able to efficiently generate quality WSD test sets. Online supplement is available at: http://www.dbmi.columbia.edu/~juf7002/AMIA09.
Watermarking techniques for electronic delivery of remote sensing images
NASA Astrophysics Data System (ADS)
Barni, Mauro; Bartolini, Franco; Magli, Enrico; Olmo, Gabriella
2002-09-01
Earth observation missions have recently attracted a growing interest, mainly due to the large number of possible applications capable of exploiting remotely sensed data and images. Along with the increase of market potential, the need arises for the protection of the image products. Such a need is a very crucial one, because the Internet and other public/private networks have become preferred means of data exchange. A critical issue arising when dealing with digital image distribution is copyright protection. Such a problem has been largely addressed by resorting to watermarking technology. A question that obviously arises is whether the requirements imposed by remote sensing imagery are compatible with existing watermarking techniques. On the basis of these motivations, the contribution of this work is twofold: assessment of the requirements imposed by remote sensing applications on watermark-based copyright protection, and modification of two well-established digital watermarking techniques to meet such constraints. More specifically, the concept of near-lossless watermarking is introduced and two possible algorithms matching such a requirement are presented. Experimental results are shown to measure the impact of watermark introduction on a typical remote sensing application, i.e., unsupervised image classification.
Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy
Alexopoulou, Dimitra; Andreopoulos, Bill; Dietze, Heiko; Doms, Andreas; Gandon, Fabien; Hakenberg, Jörg; Khelif, Khaled; Schroeder, Michael; Wächter, Thomas
2009-01-01
Background Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge to automated methods, which identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Another useful source of information for disambiguation is metadata. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively. Results The 'Closest Sense' method assumes that the ontology defines multiple senses of the term. It computes the shortest path of co-occurring terms in the document to one of these senses. The 'Term Cooc' method defines a log-odds ratio for co-occurring terms including co-occurrences inferred from the ontology structure. The 'MetaData' approach trains a classifier on metadata. It does not require any ontology, but requires training data, which the other methods do not. To evaluate these approaches we defined a manually curated training corpus of 2600 documents for seven ambiguous terms from the Gene Ontology and MeSH. All approaches over all conditions achieve 80% success rate on average. The 'MetaData' approach performed best with 96%, when trained on high-quality data. Its performance deteriorates as quality of the training data decreases. The 'Term Cooc' approach performs better on Gene Ontology (92% success) than on MeSH (73% success) as MeSH is not a strict is-a/part-of, but rather a loose is-related-to hierarchy. The 'Closest Sense' approach achieves on average 80% success rate. Conclusion Metadata is valuable for disambiguation, but requires high quality training data. Closest Sense requires no training, but a large, consistently modelled ontology, which are two opposing conditions. Term Cooc achieves greater than 90% success given a consistently modelled ontology. Overall, the results show that well structured ontologies can play a very important role to improve disambiguation. Availability The three benchmark datasets created for the purpose of disambiguation are available in Additional file 1. PMID:19159460
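The 'Closest Sense' idea can be sketched as a breadth-first search over the ontology graph. This is an illustration under assumptions, not the authors' implementation: the ontology is treated as an undirected graph, the sense with the smallest distance to any co-occurring term wins, and the toy terms and function names are hypothetical.

```python
from collections import deque


def shortest_path_length(graph, start, goal):
    """Number of edges on a shortest path, or None if goal is unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


def closest_sense(graph, senses, context_terms):
    """Pick the sense nearest to any term co-occurring in the document."""
    best_sense, best_dist = None, None
    for sense in senses:
        dists = [shortest_path_length(graph, t, sense) for t in context_terms]
        reachable = [d for d in dists if d is not None]
        if reachable and (best_dist is None or min(reachable) < best_dist):
            best_sense, best_dist = sense, min(reachable)
    return best_sense


# Toy ontology with two senses of "development".
edges = [
    ("biological process", "development (biology)"),
    ("development (biology)", "embryo development"),
    ("software", "development (software)"),
    ("development (software)", "programming"),
]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

sense = closest_sense(
    graph,
    ["development (biology)", "development (software)"],
    ["embryo development"],
)
```

The sketch also makes the abstract's caveat concrete: if the ontology is small or inconsistently modelled, context terms may be unreachable from every sense and no decision can be made.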
Nikfarjam, Azadeh; Sarker, Abeed; O'Connor, Karen; Ginn, Rachel; Gonzalez, Graciela
2015-05-01
Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media. We introduce ADRMine, a machine learning-based concept extraction system that uses conditional random fields (CRFs). ADRMine utilizes a variety of features, including a novel feature for modeling words' semantic similarities. The similarities are modeled by clustering words based on unsupervised, pretrained word representation vectors (embeddings) generated from unlabeled user posts in social media using a deep learning technique. ADRMine outperforms several strong baseline systems in the ADR extraction task by achieving an F-measure of 0.82. Feature analysis demonstrates that the proposed word cluster features significantly improve extraction performance. It is possible to extract complex medical concepts, with relatively high performance, from informal, user-generated content. Our approach is particularly scalable, suitable for social media mining, as it relies on large volumes of unlabeled data, thus diminishing the need for large, annotated training data sets. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
Semi-automated surface mapping via unsupervised classification
NASA Astrophysics Data System (ADS)
D'Amore, M.; Le Scaon, R.; Helbert, J.; Maturilli, A.
2017-09-01
Due to the increasing volume of data returned from space missions, the human search for correlations and identification of interesting features becomes more and more unfeasible. Statistical extraction of features via machine learning methods will increase the scientific output of remote sensing missions and aid the discovery of yet unknown features hidden in datasets. Those methods exploit algorithms trained on features from multiple instruments, returning classification maps that explore intra-dataset correlation, allowing for the discovery of unknown features. We present two applications, one for Mercury and one for Vesta.
Polysemy Advantage with Abstract but Not Concrete Words
ERIC Educational Resources Information Center
Jager, Bernadet; Cleland, Alexandra A.
2016-01-01
It is a robust finding that ambiguous words are recognized faster than unambiguous words. More recent studies (e.g., Rodd et al. in "J Mem Lang" 46:245-266, 2002) now indicate that this "ambiguity advantage" may in reality be a "polysemy advantage": caused by related senses (polysemy) rather than unrelated meanings…
ERIC Educational Resources Information Center
Brocher, Andreas
2013-01-01
Because many words of a language have more than one meaning, readers regularly need to disambiguate words during sentence comprehension. Using priming, eye-tracking, and event-related brain potentials, this thesis tested whether readers differently disambiguate words with semantically related meanings like "wire" and "cone,"…
Duality of Mathematical Thinking When Making Sense of Simple Word Problems: Theoretical Essay
ERIC Educational Resources Information Center
Polotskaia, Elena; Savard, Annie; Freiman, Viktor
2015-01-01
This essay proposes a reflection on the learning difficulties and teaching approaches associated with arithmetic word problem solving. We question the development of word problem solving skills in the early grades of elementary school. We are trying to revive the discussion because first, the knowledge in question--reversibility of arithmetic…
NASA Astrophysics Data System (ADS)
Falco, N.; Pedersen, G. B. M.; Vilmunandardóttir, O. K.; Belart, J. M. M. C.; Sigurmundsson, F. S.; Benediktsson, J. A.
2016-12-01
The project "Environmental Mapping and Monitoring of Iceland by Remote Sensing (EMMIRS)" aims at providing fast and reliable mapping and monitoring techniques on a big spatial scale with a high temporal resolution of the Icelandic landscape. Such mapping and monitoring will be crucial to both mitigate and understand the scale of processes and their often complex interlinked feedback mechanisms. In the EMMIRS project, the Hekla volcano area is one of the main sites under study, where the volcanic eruptions, extreme weather and human activities had an extensive impact on the landscape degradation. The development of innovative remote sensing approaches to compute earth observation variables as automatically as possible is one of the main tasks of the EMMIRS project. Furthermore, a temporal remote sensing archive is created and composed of images acquired by different sensors (Landsat, RapidEye, ASTER and SPOT5). Moreover, historical aerial stereo photos allowed decadal reconstruction of the landscape by reconstruction of digital elevation models. Here, we propose a novel architecture for automatic unsupervised change detection analysis able to ingest multi-source data in order to detect landscape changes in the Hekla area. The change detection analysis is based on multi-scale analysis, which allows the identification of changes at different levels of abstraction, from pixel-level to region-level. For this purpose, operators defined in the mathematical morphology framework are implemented to model the contextual information, represented by the neighbour system of a pixel, allowing the identification of changes related to both geometrical and spectral domains. An automatic radiometric normalization strategy is also implemented as a pre-processing step, aiming at minimizing the effect of different acquisition conditions.
The proposed architecture is tested on multi-temporal data sets acquired over different time periods coinciding with the last three eruptions (1980-1981, 1991, 2000) that occurred on Hekla volcano. The results reveal emplacement of new lava flows and the initial vegetation succession, providing insightful information on the evolution of vegetation in such an environment. Shadow and snow patch changes are resolved in post-processing by exploiting the available spectral information.
NASA Technical Reports Server (NTRS)
Srivastava, Ashok; McIntosh, Dawn; Castle, Pat; Pontikakis, Manos; Diev, Vesselin; Zane-Ulman, Brett; Turkov, Eugene; Akella, Ram; Xu, Zuobing; Kumaresan, Sakthi Preethi
2006-01-01
This viewgraph document describes the data mining system developed at NASA Ames. Many NASA programs have large numbers (and types) of problem reports. These free text reports are written by a number of different people, thus the emphasis and wording vary considerably. With so much data to sift through, analysts (subject experts) need help identifying any possible safety issues or concerns and confirming that they haven't missed important problems. Unsupervised clustering is the initial step to accomplish this; we think we can go much farther, specifically, identify possible recurring anomalies. Recurring anomalies may be indicators of larger systemic problems. The requirement to identify these anomalies has led to the development of Recurring Anomaly Discovery System (ReADS).
White, Thomas M; Hauan, Michael J
2002-01-01
Web-based data collection has considerable appeal. However, the quality of data collected using such instruments is often questionable. There can be systematic problems with the wording of the surveys, and/or the means with which they are deployed. In unsupervised data collection, there are also concerns about whether subjects understand the questions, and whether they are answering honestly. This paper presents a schema for using client-side timestamps and traces of subjects' paths through instruments to detect problems with the definition of instruments and their deployment. We discuss two large, anonymous, web-based, medical surveys as examples of the utility of this approach.
Young Filipino Students Making Sense of Arithmetic Word Problems in English
ERIC Educational Resources Information Center
Bautista, Debbie; Mulligan, Joanne; Mitchelmore, Michael
2009-01-01
Young Filipino children are expected to solve mathematical word problems in English, a task which they typically encounter only in schools. In this exploratory study, task-based interviews were conducted with seven Filipino children from a public school. The children were asked to read and solve addition and subtraction word problems in English or…
ERIC Educational Resources Information Center
Jenik, Cynthia A.; Hicks, Danny G.
2005-01-01
Because spelling skills are most often assessed in the classroom through the traditional dictation task, it makes sense for a standardized measure to assess spelling in a similar manner. This involves an examiner saying a word, using the target word in a sentence, and repeating the word. The student then writes the target word. Most teachers would…
NASA Astrophysics Data System (ADS)
van der Wal, Daphne; van Dalen, Jeroen; Wielemaker-van den Dool, Annette; Dijkstra, Jasper T.; Ysebaert, Tom
2014-07-01
Intertidal benthic macroalgae are a biological quality indicator in estuaries and coasts. While remote sensing has been applied to quantify the spatial distribution of such macroalgae, it is generally not used for their monitoring. We examined the day-to-day and seasonal dynamics of macroalgal cover on a sandy intertidal flat using visible and near-infrared images from a time-lapse camera mounted on a tower. Benthic algae were identified using supervised, semi-supervised and unsupervised classification techniques, validated with monthly ground-truthing over one year. A supervised classification (based on maximum likelihood, using training areas identified in the field) performed best in discriminating between sediment, benthic diatom films and macroalgae, with highest spectral separability between macroalgae and diatoms in spring/summer. An automated unsupervised classification (based on the Normalised Differential Vegetation Index NDVI) allowed detection of daily changes in macroalgal coverage without the need for calibration. This method showed a bloom of macroalgae (filamentous green algae, Ulva sp.) in summer with > 60% cover, but with pronounced superimposed day-to-day variation in cover. Waves were a major factor in regulating macroalgal cover, but regrowth of the thalli after a summer storm was fast (2 weeks). Images and in situ data demonstrated that the protruding tubes of the polychaete Lanice conchilega facilitated both settlement (anchorage) and survival (resistance to waves) of the macroalgae. Thus, high-frequency, high resolution images revealed the mechanisms for regulating the dynamics in cover of the macroalgae and for their spatial structuring. Ramifications for the mode, timing, frequency and evaluation of monitoring macroalgae by field and remote sensing surveys are discussed.
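The NDVI-based unsupervised classification described in the abstract above can be illustrated with a minimal sketch. The abstract gives neither a threshold nor band values, so the function names, the 0.3 cutoff and the sample reflectances below are assumptions made purely for illustration and are not the authors' settings.

```python
def ndvi(nir, red):
    """Normalised difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on completely dark pixels
    return (nir - red) / total


def macroalgal_cover(pixels, threshold=0.3):
    """Fraction of (nir, red) pixels whose NDVI exceeds the threshold.

    The threshold is a hypothetical value chosen for this sketch.
    """
    if not pixels:
        return 0.0
    vegetated = sum(1 for nir, red in pixels if ndvi(nir, red) > threshold)
    return vegetated / len(pixels)


# Hypothetical reflectance pairs (nir, red) for four pixels.
sample = [(0.50, 0.10), (0.45, 0.12), (0.20, 0.18), (0.15, 0.15)]
cover = macroalgal_cover(sample)
```

Applying the same function to each image in a time series would give a daily cover estimate without training data, which is the sense in which such an index-based approach needs no calibration.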
UNMANNED AERIAL VEHICLE (UAV) HYPERSPECTRAL REMOTE SENSING FOR DRYLAND VEGETATION MONITORING
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nancy F. Glenn; Jessica J. Mitchell; Matthew O. Anderson
2012-06-01
UAV-based hyperspectral remote sensing capabilities developed by the Idaho National Lab and Idaho State University, Boise Center Aerospace Lab, were recently tested via demonstration flights that explored the influence of altitude on geometric error, image mosaicking, and dryland vegetation classification. The test flights successfully acquired usable flightline data capable of supporting classifiable composite images. Unsupervised classification results support vegetation management objectives that rely on mapping shrub cover and distribution patterns. Overall, supervised classifications performed poorly despite spectral separability in the image-derived endmember pixels. Future mapping efforts that leverage ground reference data, ultra-high spatial resolution photos and time series analysis should be able to effectively distinguish native grasses such as Sandberg bluegrass (Poa secunda), from invasives such as burr buttercup (Ranunculus testiculatus) and cheatgrass (Bromus tectorum).
Real time three dimensional sensing system
Gordon, S.J.
1996-12-31
The invention is a three dimensional sensing system which utilizes two flexibly located cameras for receiving and recording visual information with respect to a sensed object illuminated by a series of light planes. Each pixel of each image is converted to a digital word and the words are grouped into stripes, each stripe comprising contiguous pixels. One pixel of each stripe in one image is selected and an epi-polar line of that point is drawn in the other image. The three dimensional coordinate of each selected point is determined by determining the point on said epi-polar line which also lies on a stripe in the second image and which is closest to a known light plane. 7 figs.
Real time three dimensional sensing system
Gordon, Steven J.
1996-01-01
The invention is a three dimensional sensing system which utilizes two flexibly located cameras for receiving and recording visual information with respect to a sensed object illuminated by a series of light planes. Each pixel of each image is converted to a digital word and the words are grouped into stripes, each stripe comprising contiguous pixels. One pixel of each stripe in one image is selected and an epi-polar line of that point is drawn in the other image. The three dimensional coordinate of each selected point is determined by determining the point on said epi-polar line which also lies on a stripe in the second image and which is closest to a known light plane.
Moon, Sungrim; McInnes, Bridget; Melton, Genevieve B
2015-01-01
Although acronyms and abbreviations in clinical text are used widely on a daily basis, relatively little research has focused upon word sense disambiguation (WSD) of acronyms and abbreviations in the healthcare domain. Since clinical notes have distinctive characteristics, it is unclear whether techniques effective for acronym and abbreviation WSD from biomedical literature are sufficient. The authors discuss feature selection for automated techniques and challenges with WSD of acronyms and abbreviations in the clinical domain. There are significant challenges associated with the informal nature of clinical text, such as typographical errors and incomplete sentences; difficulty with insufficient clinical resources, such as clinical sense inventories; and obstacles with privacy and security for conducting research with clinical text. Although we anticipated that using sophisticated techniques, such as biomedical terminologies, semantic types, part-of-speech, and language modeling, would be needed for feature selection with automated machine learning approaches, we found instead that simple techniques, such as bag-of-words, were quite effective in many cases. Factors, such as majority sense prevalence and the degree of separateness between sense meanings, were also important considerations. The first lesson is that a comprehensive understanding of the unique characteristics of clinical text is important for automatic acronym and abbreviation WSD. The second lesson learned is that investigators may find that using simple approaches is an effective starting point for these tasks. Finally, similar to other WSD tasks, an understanding of baseline majority sense rates and separateness between senses is important. Further studies and practical solutions are needed to better address these issues.
Strange Words: Autistic Traits and the Processing of Non-Literal Language.
McKenna, Peter E; Glass, Alexandra; Rajendran, Gnanathusharan; Corley, Martin
2015-11-01
Previous investigations into metonymy comprehension in ASD have confounded metonymy with anaphora, and outcome with process. Here we show how these confounds may be avoided, using data from non-diagnosed participants classified using Autism Quotient. Participants read sentences containing target words with novel or established metonymic senses (e.g., Finland, Vietnam) in literal- or figurative-supporting contexts. Participants took longer to read target words in figurative contexts, especially where the metonymic sense was novel. Importantly, participants with higher AQs took longer still to read novel metonyms. This suggests a focus for further exploration, in terms of potential differences between individuals diagnosed with ASD and their neurotypical counterparts, and more generally in terms of the processes by which comprehension is achieved.
Opinion Summarization of Customer Comments
NASA Astrophysics Data System (ADS)
Fan, Miao; Wu, Guoshi
Web 2.0 technologies have enabled more and more customers to freely comment on different kinds of entities, such as sellers, products and services. The large scale of information poses the need and challenge of automatic summarization. In many cases, each of the user-generated short comments implies the opinions which rate the target entity. In this paper, we aim to mine and to summarize all the customer comments of a product. The algorithm proposed in this research is more reliable on opinion identification because it is unsupervised and the accuracy of the result improves as the number of comments increases. Our research is performed in four steps: (1) mining the frequent aspects of a product that have been commented on by customers; (2) mining the infrequent aspects of a product which have been commented on by customers; (3) identifying opinion words in each comment and deciding whether each opinion word is positive, negative or neutral; (4) summarizing the comments. This paper proposes several novel techniques to perform these tasks. Our experimental results using comments of a number of products sold online demonstrate the effectiveness of the techniques.
A Deep Learning-Based Method for Similar Patient Question Retrieval in Chinese.
Tang, Guo Yu; Ni, Yuan; Xie, Guo Tong; Fan, Xin Li; Shi, Yan Ling
2017-01-01
The online patient question and answering (Q&A) system, either as a website or a mobile application, attracts an increasing number of users in China. Patients will post their questions and the registered doctors then provide the corresponding answers. A large number of questions with answers from doctors are accumulated. Instead of awaiting the response from a doctor, the newly posted question could be quickly answered by finding a semantically equivalent question from the Q&A archive. In this study, we investigated a novel deep learning based method to retrieve the similar patient question in Chinese. An unsupervised learning algorithm using a deep neural network is performed on the corpus to generate the word embedding. The word embedding was then used as the input to a supervised learning algorithm using a designed deep neural network, i.e. the supervised neural attention model (SNA), to predict the similarity between two questions. The experimental results showed that our SNA method achieved P@1 = 77% and P@5 = 84%, which outperformed all other compared methods.
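The abstract above reports P@1 and P@5 without defining them. One common reading, assumed here, is the fraction of queries for which at least one equivalent question appears among the top k retrieved candidates. The sketch below computes that quantity; the function name, the data layout and the toy rankings are hypothetical and do not come from the paper.

```python
def hit_rate_at_k(ranked_results, relevant, k):
    """Fraction of queries with at least one relevant item in the top k.

    ranked_results: dict mapping query id -> list of candidate ids, best first.
    relevant: dict mapping query id -> set of ids judged equivalent.
    """
    if not ranked_results:
        return 0.0
    hits = 0
    for query, ranking in ranked_results.items():
        if any(item in relevant.get(query, set()) for item in ranking[:k]):
            hits += 1
    return hits / len(ranked_results)


# Hypothetical toy data: three queries with ranked candidate questions.
rankings = {
    "q1": ["a", "b", "c"],
    "q2": ["d", "e", "f"],
    "q3": ["g", "h", "i"],
}
gold = {"q1": {"a"}, "q2": {"f"}, "q3": {"z"}}

p_at_1 = hit_rate_at_k(rankings, gold, 1)
p_at_3 = hit_rate_at_k(rankings, gold, 3)
```

Under this reading the score cannot decrease as k grows, which is consistent with the reported P@5 being higher than P@1.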
Asiimwe, Stephen; Oloya, James; Song, Xiao; Whalen, Christopher C
2014-12-01
Unsupervised HIV self-testing (HST) has potential to increase knowledge of HIV status; however, its accuracy is unknown. To estimate the accuracy of unsupervised HST in field settings in Uganda, we performed a non-blinded, randomized controlled, non-inferiority trial of unsupervised compared with supervised HST among selected high HIV risk fisherfolk (22.1 % HIV Prevalence) in three fishing villages in Uganda between July and September 2013. The study enrolled 246 participants and randomized them in a 1:1 ratio to unsupervised HST or provider-supervised HST. In an intent-to-treat analysis, the HST sensitivity was 90 % in the unsupervised arm and 100 % among the provider-supervised, yielding a difference of -10 % (90 % CI -21, 1 %); non-inferiority was not shown. In a per protocol analysis, the difference in sensitivity was -5.6 % (90 % CI -14.4, 3.3 %) and did show non-inferiority. We conclude that unsupervised HST is feasible in rural Africa and may be non-inferior to provider-supervised HST.
A Synergy-Based Optimally Designed Sensing Glove for Functional Grasp Recognition
Ciotti, Simone; Battaglia, Edoardo; Carbonaro, Nicola; Bicchi, Antonio; Tognetti, Alessandro; Bianchi, Matteo
2016-01-01
Achieving accurate and reliable kinematic hand pose reconstructions represents a challenging task. The main reason for this is the complexity of hand biomechanics, where several degrees of freedom are distributed along a continuous deformable structure. Wearable sensing can represent a viable solution to tackle this issue, since it enables a more natural kinematic monitoring. However, the intrinsic accuracy (as well as the number of sensing elements) of wearable hand pose reconstruction (HPR) systems can be severely limited by ergonomics and cost considerations. In this paper, we combined the theoretical foundations of the optimal design of HPR devices based on hand synergy information, i.e., the inter-joint covariation patterns, with textile goniometers based on knitted piezoresistive fabrics (KPF) technology, to develop, for the first time, an optimally-designed under-sensed glove for measuring hand kinematics. We used only five sensors optimally placed on the hand and completed hand pose reconstruction (described according to a kinematic model with 19 degrees of freedom) leveraging upon synergistic information. The reconstructions we obtained from five different subjects were used to implement an unsupervised method for the recognition of eight functional grasps, showing a high degree of accuracy and robustness. PMID:27271621
Pati, Sumati; Maity, A; Banerji, P; Majumder, S B
2014-04-07
In the present work we have grown highly textured, ultra-thin, nano-crystalline zinc oxide thin films using a metal organic chemical vapor deposition technique and addressed their selectivity towards hydrogen, carbon monoxide and methane gas sensing. Structural and microstructural characteristics of the synthesized films were investigated utilizing X-ray diffraction and electron microscopy techniques respectively. Using a dynamic flow gas sensing measurement set up, the sensing characteristics of these films were investigated as a function of gas concentration (10-1660 ppm) and operating temperature (250-380 °C). ZnO thin film sensing elements were found to be sensitive to all of these gases. Thus at a sensor operating temperature of ~300 °C, the response% of the ZnO thin films were ~68, 59, and 52% for hydrogen, carbon monoxide and methane gases respectively. The data matrices extracted from fast Fourier transform (FFT) analyses of the conductance transients were used as input parameters in a linear unsupervised principal component analysis (PCA) pattern recognition technique. We have demonstrated that FFT combined with PCA is an excellent tool for the differentiation of these reducing gases.
Using hyperspectral remote sensing for land cover classification
NASA Astrophysics Data System (ADS)
Zhang, Wendy W.; Sriharan, Shobha
2005-01-01
This project used a hyperspectral data set to classify land cover using remote sensing techniques. Many different earth-sensing satellites, with diverse sensors mounted on sophisticated platforms, are currently in earth orbit. These sensors are designed to cover a wide range of the electromagnetic spectrum and are generating enormous amounts of data that must be processed, stored, and made available to the user community. The Airborne Visible-Infrared Imaging Spectrometer (AVIRIS) collects data in 224 bands that are approximately 9.6 nm wide in contiguous bands between 0.40 and 2.45 µm. Hyperspectral sensors acquire images in many, very narrow, contiguous spectral bands throughout the visible, near-IR, and thermal IR portions of the spectrum. The unsupervised image classification procedure automatically categorizes the pixels in an image into land cover classes or themes. Experiments on using hyperspectral remote sensing for land cover classification were conducted during the 2003 and 2004 NASA Summer Faculty Fellowship Program at Stennis Space Center. Research Systems Inc.'s (RSI) ENVI software package was used in this application framework. In this application, emphasis was placed on: (1) Spectrally oriented classification procedures for land cover mapping, particularly, the supervised surface classification using AVIRIS data; and (2) Identifying data endmembers.
A Synergy-Based Optimally Designed Sensing Glove for Functional Grasp Recognition.
Ciotti, Simone; Battaglia, Edoardo; Carbonaro, Nicola; Bicchi, Antonio; Tognetti, Alessandro; Bianchi, Matteo
2016-06-02
Achieving accurate and reliable kinematic hand pose reconstructions represents a challenging task. The main reason for this is the complexity of hand biomechanics, where several degrees of freedom are distributed along a continuous deformable structure. Wearable sensing can represent a viable solution to tackle this issue, since it enables a more natural kinematic monitoring. However, the intrinsic accuracy (as well as the number of sensing elements) of wearable hand pose reconstruction (HPR) systems can be severely limited by ergonomics and cost considerations. In this paper, we combined the theoretical foundations of the optimal design of HPR devices based on hand synergy information, i.e., the inter-joint covariation patterns, with textile goniometers based on knitted piezoresistive fabrics (KPF) technology, to develop, for the first time, an optimally-designed under-sensed glove for measuring hand kinematics. We used only five sensors optimally placed on the hand and completed hand pose reconstruction (described according to a kinematic model with 19 degrees of freedom) leveraging upon synergistic information. The reconstructions we obtained from five different subjects were used to implement an unsupervised method for the recognition of eight functional grasps, showing a high degree of accuracy and robustness.
A Review of Wetland Remote Sensing.
Guo, Meng; Li, Jing; Sheng, Chunlei; Xu, Jiawei; Wu, Li
2017-04-05
Wetlands are some of the most important ecosystems on Earth. They play a key role in alleviating floods and filtering polluted water and also provide habitats for many plants and animals. Wetlands also interact with climate change. Over the past 50 years, wetlands have been polluted and declined dramatically as land cover has changed in some regions. Remote sensing has been the most useful tool to acquire spatial and temporal information about wetlands. In this paper, seven types of sensors were reviewed: aerial photos, coarse-resolution, medium-resolution, high-resolution, hyperspectral imagery, radar, and Light Detection and Ranging (LiDAR) data. This study also discusses the advantage of each sensor for wetland research. Wetland research themes reviewed in this paper include wetland classification, habitat or biodiversity, biomass estimation, plant leaf chemistry, water quality, mangrove forest, and sea level rise. This study also gives an overview of the methods used in wetland research such as supervised and unsupervised classification and decision tree and object-based classification. Finally, this paper provides some advice on future wetland remote sensing. To our knowledge, this paper is the most comprehensive and detailed review of wetland remote sensing and it will be a good reference for wetland researchers.
A Review of Wetland Remote Sensing
Guo, Meng; Li, Jing; Sheng, Chunlei; Xu, Jiawei; Wu, Li
2017-01-01
Wetlands are some of the most important ecosystems on Earth. They play a key role in alleviating floods and filtering polluted water and also provide habitats for many plants and animals. Wetlands also interact with climate change. Over the past 50 years, wetlands have been polluted and declined dramatically as land cover has changed in some regions. Remote sensing has been the most useful tool to acquire spatial and temporal information about wetlands. In this paper, seven types of sensors were reviewed: aerial photos, coarse-resolution, medium-resolution, high-resolution, hyperspectral imagery, radar, and Light Detection and Ranging (LiDAR) data. This study also discusses the advantage of each sensor for wetland research. Wetland research themes reviewed in this paper include wetland classification, habitat or biodiversity, biomass estimation, plant leaf chemistry, water quality, mangrove forest, and sea level rise. This study also gives an overview of the methods used in wetland research such as supervised and unsupervised classification and decision tree and object-based classification. Finally, this paper provides some advice on future wetland remote sensing. To our knowledge, this paper is the most comprehensive and detailed review of wetland remote sensing and it will be a good reference for wetland researchers. PMID:28379174
Remote sensing of Earth terrain
NASA Technical Reports Server (NTRS)
Kong, Jin AU; Shin, Robert T.; Nghiem, Son V.; Yueh, Herng-Aung; Han, Hsiu C.; Lim, Harold H.; Arnold, David V.
1990-01-01
Remote sensing of earth terrain is examined. The layered random medium model is used to investigate the fully polarimetric scattering of electromagnetic waves from vegetation. The model is used to interpret the measured data for vegetation fields such as rice, wheat, or soybean over water or soil. Accurate calibration of polarimetric radar systems is essential for the polarimetric remote sensing of earth terrain. A polarimetric calibration algorithm using three arbitrary in-scene reflectors is developed. In the interpretation of active and passive microwave remote sensing data from the earth terrain, the random medium model was shown to be quite successful. A multivariate K-distribution is proposed to model the statistics of fully polarimetric radar returns from earth terrain. In the terrain cover classification using the synthetic aperture radar (SAR) images, the applications of the K-distribution model will provide better performance than the conventional Gaussian classifiers. The layered random medium model is used to study the polarimetric response of sea ice. Supervised and unsupervised classification procedures are also developed and applied to synthetic aperture radar polarimetric images in order to identify their various earth terrain components for more than two classes. These classification procedures were applied to San Francisco Bay and Traverse City SAR images.
Low-cost multispectral imaging for remote sensing of lettuce health
NASA Astrophysics Data System (ADS)
Ren, David D. W.; Tripathi, Siddhant; Li, Larry K. B.
2017-01-01
In agricultural remote sensing, unmanned aerial vehicle (UAV) platforms offer many advantages over conventional satellite and full-scale airborne platforms. One of the most important advantages is their ability to capture high spatial resolution images (1-10 cm) on-demand and at different viewing angles. However, UAV platforms typically rely on the use of multiple cameras, which can be costly and difficult to operate. We present the development of a simple low-cost imaging system for remote sensing of crop health and demonstrate it on lettuce (Lactuca sativa) grown in Hong Kong. To identify the optimal vegetation index, we recorded images of both healthy and unhealthy lettuce, and used them as input in an expectation maximization cluster analysis with a Gaussian mixture model. Results from unsupervised and supervised clustering show that, among four widely used vegetation indices, the blue wide-dynamic range vegetation index is the most accurate. This study shows that it is readily possible to design and build a remote sensing system capable of determining the health status of lettuce at a reasonably low cost (
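The expectation maximization cluster analysis with a Gaussian mixture model mentioned in the abstract above can be sketched in one dimension, treating each pixel's vegetation index as a single value. This is only an illustrative sketch under stated assumptions: the two-component restriction, the initialisation at the minimum and maximum, the variance floor and the sample index values are all hypothetical and are not taken from the study.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, iterations=50, var_floor=1e-6):
    """Fit a two-component 1-D Gaussian mixture with plain EM.

    Returns (weights, means, variances). The first component is initialised
    at the minimum value and the second at the maximum (an assumption).
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, var_floor)
    weights = [0.5, 0.5]
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [weights[k] * normal_pdf(v, means[k], variances[k]) for k in range(2)]
            total = sum(dens)
            if total == 0.0:
                resp.append([0.5, 0.5])  # numerical fallback if both densities underflow
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0.0:
                continue  # leave an empty component unchanged
            weights[k] = nk / n
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            variances[k] = max(
                sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk,
                var_floor,
            )
    return weights, means, variances


# Hypothetical vegetation-index values: three low and three high pixels.
index_values = [0.10, 0.12, 0.11, 0.60, 0.62, 0.58]
gmm_weights, gmm_means, gmm_variances = fit_two_component_gmm(index_values)
```

With an index that separates the two groups well, the fitted component means sit close to the group averages, so comparing indices by how cleanly the components separate is one plausible way to rank them.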
Moody, Daniela I.; Brumby, Steven P.; Rowland, Joel C.; ...
2014-12-09
We present results from an ongoing effort to extend neuromimetic machine vision algorithms to multispectral data using adaptive signal processing combined with compressive sensing and machine learning techniques. Our goal is to develop a robust classification methodology that will allow for automated discretization of the landscape into distinct units based on attributes such as vegetation, surface hydrological properties, and topographic/geomorphic characteristics. We use a Hebbian learning rule to build spectral-textural dictionaries that are tailored for classification. We learn our dictionaries from millions of overlapping multispectral image patches and then use a pursuit search to generate classification features. Land cover labels are automatically generated using unsupervised clustering of sparse approximations (CoSA). We demonstrate our method on multispectral WorldView-2 data from a coastal plain ecosystem in Barrow, Alaska. We explore learning from both raw multispectral imagery and normalized band difference indices. We explore a quantitative metric to evaluate the spectral properties of the clusters in order to potentially aid in assigning land cover categories to the cluster labels. In this study, our results suggest CoSA is a promising approach to unsupervised land cover classification in high-resolution satellite imagery.
NASA Astrophysics Data System (ADS)
Shi, Aiye; Wang, Chao; Shen, Shaohong; Huang, Fengchen; Ma, Zhenli
2016-10-01
Chi-squared transform (CST), as a statistical method, can describe the difference degree between vectors. The CST-based methods operate directly on information stored in the difference image and are simple and effective methods for detecting changes in remotely sensed images that have been registered and aligned. However, the technique does not take spatial information into consideration, which leads to much noise in the result of change detection. An improved unsupervised change detection method is proposed based on spatial constraint CST (SCCST) in combination with a Markov random field (MRF) model. First, the mean and variance matrix of the difference image of bitemporal images are estimated by an iterative trimming method. In each iteration, spatial information is injected to reduce scattered changed points (also known as "salt and pepper" noise). To determine the key parameter confidence level in the SCCST method, a pseudotraining dataset is constructed to estimate the optimal value. Then, the result of SCCST, as an initial solution of change detection, is further improved by the MRF model. The experiments on simulated and real multitemporal and multispectral images indicate that the proposed method performs well in comprehensive indices compared with other methods.
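The basic chi-squared transform that the abstract above builds on can be sketched for a two-band difference image: each pixel's difference vector x is scored as (x - m)^T C^{-1} (x - m) and flagged as changed when the score exceeds the chi-squared quantile for the chosen confidence level. The sketch below covers only this plain CST step. It omits the iterative trimming, the spatial constraint and the MRF refinement, and the function names, the two-band restriction and the sample values are assumptions for illustration.

```python
import math


def chi_squared_statistic(diff, mean, cov):
    """Statistic (x - m)^T C^{-1} (x - m) for a two-band difference vector."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx = diff[0] - mean[0]
    dy = diff[1] - mean[1]
    # Inverse of a 2x2 matrix [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def threshold_two_bands(confidence):
    """Chi-squared quantile for 2 degrees of freedom: -2 ln(1 - confidence)."""
    return -2.0 * math.log(1.0 - confidence)


def detect_changes(diff_pixels, mean, cov, confidence=0.99):
    """Label each two-band difference pixel as changed (True) or unchanged."""
    limit = threshold_two_bands(confidence)
    return [chi_squared_statistic(p, mean, cov) > limit for p in diff_pixels]


# Hypothetical difference-image pixels with an assumed mean and covariance.
pixels = [(0.1, -0.2), (5.0, 4.0), (-0.3, 0.4)]
flags = detect_changes(pixels, mean=(0.0, 0.0), cov=((1.0, 0.0), (0.0, 1.0)))
```

Because each pixel is scored independently, isolated noisy pixels can cross the threshold on their own, which is the "salt and pepper" effect the spatial constraint in the abstract is meant to reduce.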
NASA Astrophysics Data System (ADS)
Cooper, L. A.; Ballantyne, A.
2017-12-01
Forest disturbances are critical components of ecosystems. Knowledge of their prevalence and impacts is necessary to accurately describe forest health and ecosystem services through time. While there are currently several methods available to identify and describe forest disturbances, especially those which occur in North America, the process remains inefficient and inaccessible in many parts of the world. Here, we introduce a preliminary approach to streamline and automate both the detection and attribution of forest disturbances. We use the Breaks for Additive Season and Trend (BFAST) algorithm to detect disturbances, in combination with supervised and unsupervised classification algorithms to attribute the detections to disturbance classes. Both spatial and temporal disturbance characteristics are derived and used to automate the disturbance attribution process. The resulting preliminary algorithm is applied to up-scaled (100 m) Landsat data for several different ecosystems in North America, with varying success. Our results indicate that supervised classification is more reliable than unsupervised classification, but that limited training data are required for a region. Future work will improve the algorithm through refining and validating at sites within North America before applying this approach globally.
Nanomaterial-Enabled Dry Electrodes for Electrophysiological Sensing: A Review
NASA Astrophysics Data System (ADS)
Yao, Shanshan; Zhu, Yong
2016-04-01
Long-term, continuous, and unsupervised tracking of physiological data is becoming increasingly attractive for health/wellness monitoring and ailment treatment. Nanomaterials have recently attracted extensive attention as building blocks for flexible/stretchable conductors and are thus promising candidates for electrophysiological electrodes. Here we provide a review on nanomaterial-enabled dry electrodes for electrophysiological sensing, focusing on electrocardiography (ECG). The dry electrodes can be classified into contact surface electrodes, contact-penetrating electrodes, and noncontact capacitive electrodes. Different types of electrodes including their corresponding equivalent electrode-skin interface models and the sources of the noise are first introduced, followed by a review on recent developments of dry ECG electrodes based on various nanomaterials, including metallic nanowires, metallic nanoparticles, carbon nanotubes, and graphene. Their fabrication processes and performances in terms of electrode-skin impedance, signal-to-noise ratio, resistance to motion artifacts, skin compatibility, and long-term stability are discussed.
Sensor Drift Compensation Algorithm based on PDF Distance Minimization
NASA Astrophysics Data System (ADS)
Kim, Namyong; Byun, Hyung-Gi; Persaud, Krishna C.; Huh, Jeung-Soo
2009-05-01
In this paper, a new unsupervised classification algorithm is introduced for the compensation of sensor drift effects in an odor sensing system that uses a conducting polymer sensor array. The proposed method continues updating adaptive Radial Basis Function Network (RBFN) weights in the testing phase by minimizing the Euclidean distance between two Probability Density Functions (PDFs): that of a set of training phase output data and that of a set of testing phase output data. The output in the testing phase using the fixed weights of the RBFN is significantly dispersed and shifted from each target value, due mostly to the sensor drift effect. In the experimental results, the output data of the proposed method are observed to be concentrated significantly closer to their own target values. This indicates that the proposed method can be effectively applied to an improved odor sensing system equipped with sensor drift compensation.
A neuromorphic approach to satellite image understanding
NASA Astrophysics Data System (ADS)
Partsinevelos, Panagiotis; Perakakis, Manolis
2014-05-01
Remote sensing satellite imagery provides high altitude, top viewing aspects of large geographic regions and as such the depicted features are not always easily recognizable. Nevertheless, geoscientists familiar with remote sensing data gradually gain experience and enhance their satellite image interpretation skills. The aim of this study is to devise a novel computational neuro-centered classification approach for feature extraction and image understanding. Object recognition through image processing practices is related to a series of known image/feature based attributes including size, shape, association, texture, etc. The objective of the study is to weight these attribute values towards the enhancement of feature recognition. The key cognitive experimentation concern is to define the point when a user recognizes a feature as it varies in terms of the above mentioned attributes and relate it with their corresponding values. Towards this end, we have set up an experimentation methodology that utilizes cognitive data from brain signals (EEG) and eye gaze data (eye tracking) of subjects watching satellite images of varying attributes; this allows the collection of rich real-time data that will be used for designing the image classifier. Since the data are already labeled by users (using an input device), a first step is to compare the performance of various machine-learning algorithms on the collected data. In the long run, the aim of this work is to investigate the automatic classification of unlabeled images (unsupervised learning) based purely on image attributes. The outcome of this innovative process is twofold. First, in an abundance of remote sensing image datasets we may define the essential image specifications in order to collect the appropriate data for each application and improve processing and resource efficiency. For example, for a fault extraction application at a given scale, a medium-resolution 4-band image may be more effective than costly, multispectral, very high resolution imagery. Second, we attempt to compare the understanding of experienced and non-experienced users in order to indirectly assess the possible limits of purely computational systems. In other words, we seek the conceptual limits of computation versus human cognition concerning feature recognition from satellite imagery. Preliminary results of this pilot study show relations between collected data and differentiation of the image attributes, which indicates that our methodology can lead to important results.
NASA Astrophysics Data System (ADS)
D'Amore, M.; Le Scaon, R.; Helbert, J.; Maturilli, A.
2017-12-01
Machine learning has achieved unprecedented results in high-dimensional data processing tasks with wide applications in various fields. Due to the growing number of complex nonlinear systems that have to be investigated in science and the sheer size of the data now available, ML offers the unique ability to extract knowledge, regardless of the specific application field. Examples are image segmentation, supervised/unsupervised/semi-supervised classification, feature extraction, and data dimensionality analysis/reduction. The MASCS instrument has mapped Mercury's surface in the 400-1145 nm wavelength range during orbital observations by the MESSENGER spacecraft. We have conducted k-means unsupervised hierarchical clustering to identify and characterize spectral units from MASCS observations. The results display a dichotomy: polar and equatorial units, possibly linked to compositional differences or weathering due to irradiation. To explore possible relations between composition and spectral behavior, we have compared the spectral provinces with elemental abundance maps derived from MESSENGER's X-Ray Spectrometer (XRS). For the Vesta application on DAWN Visible and infrared spectrometer (VIR) data, we explored several Machine Learning techniques: image segmentation method, stream algorithm and hierarchical clustering. The algorithm successfully separates the Olivine outcrops around two craters on Vesta's surface [1]. New maps summarizing the spectral and chemical signature of the surface could be automatically produced. We conclude that instead of hand digging in data, scientists could choose a subset of algorithms with well-known features (i.e. efficacy on the particular problem, speed, accuracy) and focus their effort on understanding what the important characteristics of the groups found in the data mean. [1] E Ammannito et al. "Olivine in an unexpected location on Vesta's surface". In: Nature 504.7478 (2013), pp. 122-125.
NASA Astrophysics Data System (ADS)
Madokoro, H.; Yamanashi, A.; Sato, K.
2013-08-01
This paper presents an unsupervised scene classification method for actualizing semantic recognition of indoor scenes. Background and foreground features are respectively extracted using Gist and color scale-invariant feature transform (SIFT) as feature representations based on context. We used hue, saturation, and value SIFT (HSV-SIFT) because of its simple algorithm with low calculation costs. Our method creates bags of features for voting visual words created from both feature descriptors to a two-dimensional histogram. Moreover, our method generates labels as candidates of categories for time-series images while maintaining stability and plasticity together. Automatic labeling of category maps can be realized using labels created using adaptive resonance theory (ART) as teaching signals for counter propagation networks (CPNs). We evaluated our method for semantic scene classification using KTH's image database for robot localization (KTH-IDOL), which is popularly used for robot localization and navigation. The mean classification accuracies of Gist, gray SIFT, one class support vector machines (OC-SVM), position-invariant robust features (PIRF), and our method are, respectively, 39.7, 58.0, 56.0, 63.6, and 79.4%. The result of our method is 15.8% higher than that of PIRF. Moreover, we applied our method for fine classification using our original mobile robot. We obtained mean classification accuracy of 83.2% for six zones.
ERIC Educational Resources Information Center
Chen, Baoguo; Zhou, Huixia; Gao, Yiwen; Dunlap, Susan
2014-01-01
The present study aimed to test the Sense Model of cross-linguistic masked translation priming asymmetry, proposed by Finkbeiner et al. ("J Mem Lang" 51:1-22, 2004), by manipulating the number of senses that bilingual participants associated with words from both languages. Three lexical decision experiments were conducted with…
NASA Astrophysics Data System (ADS)
Betancourt, J. L.; Biondi, F.; Bradford, J. B.; Foster, J. R.; Henebry, G. M.; Post, E.; Koenig, W.; Hoffman, F. M.; de Beurs, K.; Kumar, J.; Hargrove, W. W.; Norman, S. P.; Brooks, B. G.
2016-12-01
Vegetated ecosystems exhibit unique phenological behavior over the course of a year, suggesting that remotely sensed land surface phenology may be useful for characterizing land cover and ecoregions. However, phenology is also strongly influenced by temperature and water stress; insect, fire, and weather disturbances; and climate change over seasonal, interannual, decadal and longer time scales. Normalized difference vegetation index (NDVI), a remotely sensed measure of greenness, provides a useful proxy for land surface phenology. We used NDVI for the conterminous United States (CONUS) derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) every eight days at 250 m resolution for the period 2000-2015 to develop phenological signatures of emergent ecological regimes called phenoregions. We employed a "Big Data" classification approach on a supercomputer, specifically applying an unsupervised data mining technique, to this large collection of NDVI measurements to develop annual maps of phenoregions. This technique produces a prescribed number of prototypical phenological states to which every location belongs in any year. To reduce the impact of short-term disturbances, we derived a single map of the mode of annual phenological states for the CONUS, assigning each map cell to the state with the largest integrated NDVI in cases where multiple states tie for the highest frequency of occurrence. Since the data mining technique is unsupervised, individual phenoregions are not associated with an ecologically understandable label. To add automated supervision to the process, we applied the method of Mapcurves, developed by Hargrove and Hoffman, to associate individual phenoregions with labeled polygons in expert-derived maps of biomes, land cover, and ecoregions.
We will present the phenoregions methodology and resulting maps for the CONUS, describe the "label-stealing" technique for ascribing biome characteristics to phenoregions, and introduce a new polar plotting scheme for processing NDVI data by localized seasonality.
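The modal-state rule described in this abstract (most frequent annual state per map cell, with ties broken by the largest integrated NDVI) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `modal_state`, the per-cell list of yearly state labels, and the `integrated_ndvi` lookup table are hypothetical and are not taken from the authors' implementation.

```python
from collections import Counter


def modal_state(states, integrated_ndvi):
    """Return the most frequent annual state for one map cell.

    states: one state label per year for the cell (hypothetical input layout).
    integrated_ndvi: hypothetical mapping from state label to the integrated
        NDVI of that state; it is consulted only to break ties.
    """
    counts = Counter(states)
    top = max(counts.values())
    # All states that share the highest frequency of occurrence.
    tied = [state for state, count in counts.items() if count == top]
    # Tie-break: prefer the state with the largest integrated NDVI.
    return max(tied, key=lambda state: integrated_ndvi[state])


# Hypothetical example: states 3 and 7 each occur twice, state 7 is greener.
ndvi_by_state = {1: 10.0, 3: 42.5, 7: 55.0}
print(modal_state([3, 3, 7, 7, 1], ndvi_by_state))
```

Taking the mode over years, rather than any single year, is what suppresses short-term disturbances in the resulting map.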
Sanfilippo, Antonio [Richland, WA; Calapristi, Augustin J [West Richland, WA; Crow, Vernon L [Richland, WA; Hetzler, Elizabeth G [Kennewick, WA; Turner, Alan E [Kennewick, WA
2009-12-22
Document clustering methods, document cluster label disambiguation methods, document clustering apparatuses, and articles of manufacture are described. In one aspect, a document clustering method includes providing a document set comprising a plurality of documents, providing a cluster comprising a subset of the documents of the document set, using a plurality of terms of the documents, providing a cluster label indicative of subject matter content of the documents of the cluster, wherein the cluster label comprises a plurality of word senses, and selecting one of the word senses of the cluster label.
Patterns of similarity and difference between the vocabularies of psychology and other subjects.
Benjafield, John G
2014-02-01
The vocabulary of Anglophone psychology is shared with many other subjects. Previous research using the Oxford English Dictionary has shown that the subjects having the most words in common with psychology are biology, chemistry, computing, electricity, law, linguistics, mathematics, medicine, music, pathology, philosophy, and physics. The present study presents a database of the vocabularies of these 12 subjects that is similar to one previously constructed for psychology, enabling the histories of the vocabularies of these subjects to be compared with each other as well as with psychology. All subjects have a majority of word senses that are metaphorical. However, psychology is not among the most metaphorical of subjects, a distinction belonging to computing, linguistics, and mathematics. Indeed, the history of other subjects shows an increasing tendency to recycle old words and give them new, metaphorical meanings. The history of psychology shows an increasing tendency to invent new words rather than metaphorical senses of existing words. These results were discussed in terms of the degree to which psychology's vocabulary remains unsettled in comparison with other subjects. The possibility was raised that the vocabulary of psychology is in a state similar to that of chemistry prior to Lavoisier.
Measurement of negativity bias in personal narratives using corpus-based emotion dictionaries.
Cohen, Shuki J
2011-04-01
This study presents a novel methodology for the measurement of negativity bias using positive and negative dictionaries of emotion words applied to autobiographical narratives. At odds with the cognitive theory of mood dysregulation, previous text-analytical studies have failed to find significant correlation between emotion dictionaries and negative affectivity or dysphoria. In the present study, an a priori list dictionary of emotion words was refined based on the actual use of these words in personal narratives collected from close to 500 college students. Half of the corpus was used to construct, via concordance analysis, the grammatical structures associated with the words in their emotional sense. The second half of the corpus served as a validation corpus. The resulting dictionary ignores words that are not used in their intended emotional sense, including negated emotions, homophones, frozen idioms etc. Correlations of the resulting corpus-based negative and positive emotion dictionaries with self-report measures of negative affectivity were in the expected direction, and were statistically significant, with medium effect size. The potential use of these dictionaries as implicit measures of negativity bias and in the analysis of psychotherapy transcripts is discussed.
Taxonomies, Folksonomies, and Semantics: Establishing Functional Meaning in Navigational Structures
ERIC Educational Resources Information Center
Bacha, Jeffrey A.
2012-01-01
This article argues for the establishment of a usability process that incorporates the study of "words" and "word phrases." It demonstrates how semantically mapping a navigational taxonomy can help the developers of digital environments establish a more focused sense of functional meaning for the users of their digital designs.
Making Sense of Word Senses: The Comprehension of Polysemy Depends on Sense Overlap
ERIC Educational Resources Information Center
Klepousniotou, Ekaterini; Titone, Debra; Romero, Carolina
2008-01-01
Studies of polysemy are few in number and are contradictory. Some have found differences between polysemy and homonymy (L. Frazier & K. Rayner, 1990), and others have found similarities (D. K. Klein & G. Murphy, 2001). The authors investigated this issue using the methods of D. K. Klein and G. Murphy (2001), in whose study participants judged…
Unsupervised laparoscopic appendicectomy by surgical trainees is safe and time-effective.
Wong, Kenneth; Duncan, Tristram; Pearson, Andrew
2007-07-01
Open appendicectomy is the traditional standard treatment for appendicitis. Laparoscopic appendicectomy is perceived as a procedure with greater potential for complications and longer operative times. This paper examines the hypothesis that unsupervised laparoscopic appendicectomy by surgical trainees is a safe and time-effective valid alternative. Medical records, operating theatre records and histopathology reports of all patients undergoing laparoscopic and open appendicectomy over a 15-month period in two hospitals within an area health service were retrospectively reviewed. Data were analysed to compare patient features, pathology findings, operative times, complications, readmissions and mortality between laparoscopic and open groups and between unsupervised surgical trainee operators versus consultant surgeon operators. A total of 143 laparoscopic and 222 open appendicectomies were reviewed. Unsupervised trainees performed 64% of the laparoscopic appendicectomies and 55% of the open appendicectomies. There were no significant differences in complication rates, readmissions, mortality and length of stay between laparoscopic and open appendicectomy groups or between trainee and consultant surgeon operators. Conversion rates (laparoscopic to open approach) were similar for trainees and consultants. Unsupervised senior surgical trainees did not take significantly longer to perform laparoscopic appendicectomy when compared to unsupervised trainee-performed open appendicectomy. Unsupervised laparoscopic appendicectomy by surgical trainees is safe and time-effective.
McCann, Cooper; Repasky, Kevin S.; Morin, Mikindra; ...
2017-05-23
Hyperspectral image analysis has benefited from an array of methods that take advantage of the increased spectral depth compared to multispectral sensors; however, the focus of these developments has been on supervised classification methods. Lack of a priori knowledge regarding land cover characteristics can make unsupervised classification methods preferable under certain circumstances. An unsupervised classification technique is presented in this paper that utilizes physically relevant basis functions to model the reflectance spectra. The fit parameters used to generate the basis functions allow clustering based on spectral characteristics rather than spectral channels and provide both noise and data reduction. Histogram splitting of the fit parameters is then used as a means of producing an unsupervised classification. Unlike current unsupervised classification techniques that rely primarily on Euclidean distance measures to determine similarity, the unsupervised classification technique uses the natural splitting of the fit parameters associated with the basis functions, creating clusters that are similar in terms of physical parameters. This work uses the publicly available data set collected at Indian Pines, Indiana. This data set provides reference data allowing for comparisons of the efficacy of different unsupervised data analysis methods. The unsupervised histogram splitting technique presented in this paper is shown to be better than the standard unsupervised ISODATA clustering technique with an overall accuracy of 34.3/19.0% before merging and 40.9/39.2% after merging. Finally, this improvement is also seen as an improvement of kappa before/after merging of 24.8/30.5 for the histogram splitting technique compared to 15.8/28.5 for ISODATA.
DOE Office of Scientific and Technical Information (OSTI.GOV)
McCann, Cooper; Repasky, Kevin S.; Morin, Mikindra
Hyperspectral image analysis has benefited from an array of methods that take advantage of the increased spectral depth compared to multispectral sensors; however, the focus of these developments has been on supervised classification methods. Lack of a priori knowledge regarding land cover characteristics can make unsupervised classification methods preferable under certain circumstances. An unsupervised classification technique is presented in this paper that utilizes physically relevant basis functions to model the reflectance spectra. The fit parameters used to generate the basis functions allow clustering based on spectral characteristics rather than spectral channels and provide both noise and data reduction. Histogram splitting of the fit parameters is then used as a means of producing an unsupervised classification. Unlike current unsupervised classification techniques that rely primarily on Euclidean distance measures to determine similarity, the unsupervised classification technique uses the natural splitting of the fit parameters associated with the basis functions, creating clusters that are similar in terms of physical parameters. This work uses the publicly available data set collected at Indian Pines, Indiana. This data set provides reference data allowing for comparisons of the efficacy of different unsupervised data analysis methods. The unsupervised histogram splitting technique presented in this paper is shown to be better than the standard unsupervised ISODATA clustering technique with an overall accuracy of 34.3/19.0% before merging and 40.9/39.2% after merging. Finally, this improvement is also seen as an improvement of kappa before/after merging of 24.8/30.5 for the histogram splitting technique compared to 15.8/28.5 for ISODATA.
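The two figures of merit quoted in this abstract, overall accuracy and kappa, are both computed from a confusion matrix of reference classes against assigned classes. The sketch below shows the standard Cohen's kappa calculation; the function name and the example matrix are hypothetical, and it assumes the clusters have already been mapped to reference classes, a step the abstract does not detail. The function returns fractions, whereas the abstract appears to quote percent-scaled values.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j]: number of samples of reference class i assigned to class j.
    """
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(len(confusion))) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical two-class example, not data from the paper.
accuracy, kappa = overall_accuracy_and_kappa([[40, 10], [20, 30]])
print(round(accuracy, 3), round(kappa, 3))
```

Kappa discounts the agreement that would occur by chance, which is why it can rank two methods differently from overall accuracy alone.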
Feger, Mark A; Herb, C Collin; Fraser, John J; Glaviano, Neal; Hertel, Jay
2015-04-01
In competitive sports medicine, supervised rehabilitation is the standard of care; in the general population, unsupervised home exercise is more common. We systematically reviewed randomized, controlled trials comparing outcomes for supervised rehabilitation versus home exercise programs. Supervised rehabilitation programs resulted in (1) less pain and subjective instability, (2) greater gains in ankle strength and joint position sense, and (3) inconclusive results regarding prevention of recurrent ankle sprains. We recommend supervised rehabilitation over home exercise programs owing to the improved short-term patient-recorded evidence with a strength-of-recommendation taxonomy level of evidence of 2B. Copyright © 2015 Elsevier Inc. All rights reserved.
Unsupervised Categorization in a Sample of Children with Autism Spectrum Disorders
ERIC Educational Resources Information Center
Edwards, Darren J.; Perlman, Amotz; Reed, Phil
2012-01-01
Studies of supervised categorization have demonstrated limited categorization performance in participants with autism spectrum disorders (ASD); however, little research has been conducted regarding unsupervised categorization in this population. This study explored unsupervised categorization using two stimulus sets that differed in their…
Unsupervised Deep Hashing With Pseudo Labels for Scalable Image Retrieval.
Zhang, Haofeng; Liu, Li; Long, Yang; Shao, Ling
2018-04-01
In order to achieve efficient similarity searching, hash functions are designed to encode images into low-dimensional binary codes with the constraint that similar features will have a short distance in the projected Hamming space. Recently, deep learning-based methods have become more popular, and outperform traditional non-deep methods. However, without label information, most state-of-the-art unsupervised deep hashing (DH) algorithms suffer from severe performance degradation in unsupervised scenarios. One of the main reasons is that the ad-hoc encoding process cannot properly capture the visual feature distribution. In this paper, we propose a novel unsupervised framework that has two main contributions: 1) we convert the unsupervised DH model into a supervised one by discovering pseudo labels; 2) the framework unifies likelihood maximization, mutual information maximization, and quantization error minimization so that the pseudo labels can maximally preserve the distribution of visual features. Extensive experiments on three popular data sets demonstrate the advantages of the proposed method, which leads to significant performance improvement over the state-of-the-art unsupervised hashing algorithms.
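The retrieval step that hashing methods of this kind rely on, ranking database items by Hamming distance between binary codes, can be illustrated with a short sketch. It covers only the lookup, not the deep model or the pseudo-label discovery described in the abstract; the function names and the 8-bit example codes are hypothetical.

```python
def hamming(a, b):
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query, database):
    """Return database indices sorted by Hamming distance to the query code."""
    return sorted(range(len(database)), key=lambda i: hamming(query, database[i]))


# Hypothetical 8-bit codes; real systems typically use longer codes.
codes = [0b10110010, 0b10110011, 0b01001101, 0b10100010]
print(rank_by_hamming(0b10110010, codes))  # → [0, 1, 3, 2]
```

Because the distance reduces to an XOR followed by a bit count, comparing a query against many stored codes is cheap, which is the efficiency argument for binary codes made in the abstract.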
Of Papers and Pens: Polysemes and Homophones in Lexical (Mis)Selection
ERIC Educational Resources Information Center
Li, Leon; Slevc, L. Robert
2017-01-01
Every word signifies multiple senses. Many studies using comprehension-based measures suggest that polysemes' senses (e.g., "paper" as in "printer paper" or "term paper") share lexical representations, whereas homophones' meanings (e.g., "pen" as in "ballpoint pen" or "pig pen")…
Tian, Moqian; Grill-Spector, Kalanit
2015-01-01
Recognizing objects is difficult because it requires both linking views of an object that can be different and distinguishing objects with similar appearance. Interestingly, people can learn to recognize objects across views in an unsupervised way, without feedback, just from the natural viewing statistics. However, there is intense debate regarding what information during unsupervised learning is used to link among object views. Specifically, researchers argue whether temporal proximity, motion, or spatiotemporal continuity among object views during unsupervised learning is beneficial. Here, we untangled the role of each of these factors in unsupervised learning of novel three-dimensional (3-D) objects. We found that after unsupervised training with 24 object views spanning a 180° view space, participants showed significant improvement in their ability to recognize 3-D objects across rotation. Surprisingly, there was no advantage to unsupervised learning with spatiotemporal continuity or motion information than training with temporal proximity. However, we discovered that when participants were trained with just a third of the views spanning the same view space, unsupervised learning via spatiotemporal continuity yielded significantly better recognition performance on novel views than learning via temporal proximity. These results suggest that while it is possible to obtain view-invariant recognition just from observing many views of an object presented in temporal proximity, spatiotemporal information enhances performance by producing representations with broader view tuning than learning via temporal association. Our findings have important implications for theories of object recognition and for the development of computational algorithms that learn from examples. PMID:26024454
NASA Astrophysics Data System (ADS)
Oommen, T.; Baise, L. G.; Gens, R.; Prakash, A.; Gupta, R. P.
2008-12-01
Seismic liquefaction is the loss of strength of soil due to shaking that leads to various ground failures such as lateral spreading, settlements, tilting, and sand boils. It is important to document these failures after earthquakes to advance our study of when and where liquefaction occurs. The current approach of mapping these failures by field investigation teams suffers from the inaccessibility of some sites immediately after the event, the short life of some of these failures, difficulties in mapping the areal extent of the failure, incomplete coverage, etc. After the 2001 Bhuj earthquake (India), researchers, using the Indian remote sensing satellite, illustrated that satellite remote sensing can provide a synoptic view of the terrain and offer unbiased estimates of liquefaction failures. However, a multisensor (data from different sensors onboard the same or different satellites) and multispectral (data collected in different spectral regions) approach is needed to efficiently document liquefaction incidences and/or their potential occurrence, because a particular satellite may be poorly positioned to image an area shortly after an earthquake. The use of SAR satellite imagery ensures the acquisition of data in all weather conditions, day and night, as well as information complementary to the optical data sets. In this study, we analyze the applicability of various satellites (Landsat, RADARSAT, Terra-MISR, IRS-1C, IRS-1D) in mapping liquefaction failures after the 2001 Bhuj earthquake using Support Vector Data Description (SVDD). SVDD is a kernel-based nonparametric outlier detection algorithm inspired by Support Vector Machines (SVMs), a new generation of learning algorithms based on statistical learning theory. We present the applicability of SVDD for unsupervised change-detection studies (i.e. to identify post-earthquake liquefaction failures).
The liquefaction occurrences identified from the different satellites using SVDD have been compared to the ground truth in terms of documented liquefaction failures by other researchers. We present the applicability and appropriateness of the various satellites and spectral regions for documenting liquefaction related failures. Results illustrate that the SVDD is a promising unsupervised change-detection algorithm, which can help in automating the documentation of earthquake induced liquefaction failures.
Remote Sensing Monitoring of Changes in Soil Salinity: A Case Study in Inner Mongolia, China.
Wu, Jingwei; Vincent, Bernard; Yang, Jinzhong; Bouarfa, Sami; Vidal, Alain
2008-11-07
This study used archived remote sensing images to depict the history of changes in soil salinity in the Hetao Irrigation District in Inner Mongolia, China, with the purpose of linking these changes with land and water management practices and to draw lessons for salinity control. Most data came from LANDSAT satellite images taken in 1973, 1977, 1988, 1991, 1996, 2001, and 2006. In these years salt-affected areas were detected using a normal supervised classification method. Corresponding cropped areas were detected from NDVI (Normalized Difference Vegetation Index) values using an unsupervised method. Field samples and agricultural statistics were used to estimate the accuracy of the classification. Historical data concerning irrigation/drainage and the groundwater table were used to analyze the relation between changes in soil salinity and land and water management practices. Results showed that: (1) the overall accuracy of remote sensing in detecting soil salinity was 90.2%, and in detecting cropped area, 98%; (2) the installation/innovation of the drainage system did help to control salinity; and (3) a low ratio of cropped land helped control salinity in the Hetao Irrigation District. These findings suggest that remote sensing is a useful tool to detect soil salinity and has potential in evaluating and improving land and water management practices.
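For readers unfamiliar with the index used above to detect cropped areas, NDVI is the normalized difference of near-infrared and red reflectance. The following is a minimal sketch of the per-pixel calculation only; the function name and sample reflectance values are hypothetical, and the unsupervised grouping of NDVI values used in the study is not reproduced here.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel.

    nir, red: reflectance in the near-infrared and red bands.
    Returns None when both bands are zero, where the index is undefined.
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical reflectances: dense vegetation reflects strongly in the
# near-infrared and weakly in the red, so its NDVI is close to 1.
print(ndvi(0.5, 0.1))
print(ndvi(0.2, 0.2))
```

Values near zero or below typically indicate bare soil or water, which is why the index separates cropped from uncropped land reasonably well.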
Images, Words, and Narrative Epistemology.
ERIC Educational Resources Information Center
Fleckenstein, Kristie S.
1996-01-01
Reviews work suggesting that imagery and language function in tandem to constitute a sense of being, and that metaphors of sight hold as much formative power as metaphors of word. Describes the limitations of language and the ways in which imagery compensates for that limitation. Discusses narrative of epistemology as a fusion of image and…
A Phenomenographic Study of Youth Conceptualizations of Evil: Order-Words and the Politics of Evil
ERIC Educational Resources Information Center
van Kessel, Cathryn
2017-01-01
Students in secondary social studies examine descriptions of historical events and rhetoric by politicians that utilize the word and concept of evil. The label of evil can evoke specific images, feelings, and thoughts; oversimplify historical and contemporary situations; and decrease students' sense of agency. This phenomenographical study…
NASA Technical Reports Server (NTRS)
1976-01-01
Abstracts related to remote sensing instrumentation and techniques, and to the remote sensing of natural resources are presented by the Technology Application Center at the University of New Mexico. Areas of interest included theory, general surveys, and miscellaneous studies; geology and hydrology; agriculture and forestry; marine sciences; and urban and land use. An alphabetically arranged Author/Key Word index is provided.
Cohen, Trevor; Blatter, Brett; Patel, Vimla
2005-01-01
Certain applications require computer systems to approximate intended human meaning. This is achievable in constrained domains with a finite number of concepts. Areas such as psychiatry, however, draw on concepts from the world-at-large. A knowledge structure with broad scope is required to comprehend such domains. Latent Semantic Analysis (LSA) is an unsupervised corpus-based statistical method that derives quantitative estimates of the similarity between words and documents from their contextual usage statistics. The aim of this research was to evaluate the ability of LSA to derive meaningful associations between concepts relevant to the assessment of dangerousness in psychiatry. An expert reference model of dangerousness was used to guide the construction of a relevant corpus. Derived associations between words in the corpus were evaluated qualitatively. A similarity-based scoring function was used to assign dangerousness categories to discharge summaries. LSA was shown to derive intuitive relationships between concepts and correlated significantly better than random with human categorization of psychiatric discharge summaries according to dangerousness. The use of LSA to derive a simulated knowledge structure can extend the scope of computer systems beyond the boundaries of constrained conceptual domains. PMID:16779020
Pothos, Emmanuel M; Bailey, Todd M
2009-07-01
Naïve observers typically perceive some groupings for a set of stimuli as more intuitive than others. The problem of predicting category intuitiveness has been historically considered the remit of models of unsupervised categorization. In contrast, this article develops a measure of category intuitiveness from one of the most widely supported models of supervised categorization, the generalized context model (GCM). Considering different category assignments for a set of instances, the authors asked how well the GCM can predict the classification of each instance on the basis of all the other instances. The category assignment that results in the smallest prediction error is interpreted as the most intuitive for the GCM-the authors refer to this way of applying the GCM as "unsupervised GCM." The authors systematically compared predictions of category intuitiveness from the unsupervised GCM and two models of unsupervised categorization: the simplicity model and the rational model. The unsupervised GCM compared favorably with the simplicity model and the rational model. This success of the unsupervised GCM illustrates that the distinction between supervised and unsupervised categorization may need to be reconsidered. However, no model emerged as clearly superior, indicating that there is more work to be done in understanding and modeling category intuitiveness.
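The procedure described above (score each candidate category assignment by how well every instance is predicted from all the others, then keep the assignment with the smallest error) can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it assumes one-dimensional stimuli, an exponential-decay similarity with a hypothetical sensitivity parameter `c`, two categories only, and an error defined as 1 minus the predicted probability of the assigned category.

```python
import itertools
import math


def loo_error(points, labels, c=1.0):
    """Summed leave-one-out error: 1 - P(own category | all other items)."""
    error = 0.0
    for i, (x, own) in enumerate(zip(points, labels)):
        same = 0.0
        total = 0.0
        for j, (y, lab) in enumerate(zip(points, labels)):
            if i == j:
                continue  # the item itself is left out
            s = math.exp(-c * abs(x - y))  # exponential-decay similarity
            total += s
            if lab == own:
                same += s
        error += 1.0 - same / total
    return error


def most_intuitive_split(points, c=1.0):
    """Try every two-category assignment and keep the lowest-error one."""
    best_labels, best_error = None, float("inf")
    # Fix the first label to 0 so mirrored assignments are not counted twice.
    for rest in itertools.product((0, 1), repeat=len(points) - 1):
        labels = (0,) + rest
        if 1 not in labels:
            continue  # skip the trivial single-category assignment
        err = loo_error(points, labels, c)
        if err < best_error:
            best_labels, best_error = labels, err
    return best_labels, best_error


# Hypothetical stimuli forming two well-separated groups.
stimuli = [1.0, 1.2, 1.1, 5.0, 5.1, 5.3]
best, err = most_intuitive_split(stimuli)
```

Exhaustive enumeration is only feasible for small stimulus sets, since the number of two-category assignments grows exponentially with the number of items.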
Semantic Factors Predict the Rate of Lexical Replacement of Content Words
Vejdemo, Susanne; Hörberg, Thomas
2016-01-01
The rate of lexical replacement estimates the diachronic stability of word forms on the basis of how frequently a proto-language word is replaced or retained in its daughter languages. Lexical replacement rate has been shown to be highly related to word class and word frequency. In this paper, we argue that content words and function words behave differently with respect to lexical replacement rate, and we show that semantic factors predict the lexical replacement rate of content words. For the 167 content items in the Swadesh list, data was gathered on the features of lexical replacement rate, word class, frequency, age of acquisition, synonyms, arousal, imageability and average mutual information, either from published databases or gathered from corpora and lexica. A linear regression model shows that, in addition to frequency, synonyms, senses and imageability are significantly related to the lexical replacement rate of content words–in particular the number of synonyms that a word has. The model shows no differences in lexical replacement rate between word classes, and outperforms a model with word class and word frequency predictors only. PMID:26820737
Polysemy in Sentence Comprehension: Effects of Meaning Dominance
ERIC Educational Resources Information Center
Foraker, Stephani; Murphy, Gregory L.
2012-01-01
Words like "church" are polysemous, having two related senses (a building and an organization). Three experiments investigated how polysemous senses are represented and processed during sentence comprehension. On one view, readers retrieve an underspecified, core meaning, which is later specified more fully with contextual information. On another…
Marapareddy, Ramakalavathi; Aanstoos, James V.; Younan, Nicolas H.
2016-01-01
Fully polarimetric Synthetic Aperture Radar (polSAR) data analysis has wide applications for terrain and ground cover classification. The dynamics of surface and subsurface water events can lead to slope instability resulting in slough slides on earthen levees. Early detection of these anomalies by a remote sensing approach could save time versus direct assessment. We used L-band Synthetic Aperture Radar (SAR) to screen levees for anomalies. SAR technology, due to its high spatial resolution and soil penetration capability, is a good choice for identifying problematic areas on earthen levees. Using the parameters entropy (H), anisotropy (A), alpha (α), and eigenvalues (λ, λ1, λ2, and λ3), we implemented several unsupervised classification algorithms for the identification of anomalies on the levee. The classification techniques applied are H/α, H/A, A/α, Wishart H/α, Wishart H/A/α, and H/α/λ classification algorithms. In this work, the effectiveness of the algorithms was demonstrated using quad-polarimetric L-band SAR imagery from the NASA Jet Propulsion Laboratory’s (JPL’s) Uninhabited Aerial Vehicle Synthetic Aperture Radar (UAVSAR). The study area is a section of the lower Mississippi River valley in the Southern USA, where earthen flood control levees are maintained by the US Army Corps of Engineers. PMID:27322270
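Two of the parameters named above, entropy (H) and anisotropy (A), are standard functions of the three eigenvalues of the polarimetric coherency matrix. The sketch below computes them from a given eigenvalue triple using the usual definitions (H as base-3 entropy of the normalized eigenvalues, A from the two smaller eigenvalues). The function name and the sample eigenvalues are illustrative assumptions; the alpha angle is omitted because it also requires the eigenvectors.

```python
import math


def entropy_anisotropy(eigenvalues):
    """Return (H, A) for three non-negative eigenvalues."""
    if len(eigenvalues) != 3:
        raise ValueError("expected exactly three eigenvalues")
    lam = sorted(eigenvalues, reverse=True)  # lam[0] >= lam[1] >= lam[2]
    total = sum(lam)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    probs = [v / total for v in lam]
    # Entropy with logarithm base 3, so that H lies in [0, 1].
    h = -sum(p * math.log(p, 3) for p in probs if p > 0)
    denom = lam[1] + lam[2]
    # Anisotropy is set to 0.0 when both minor eigenvalues vanish (assumption).
    a = (lam[1] - lam[2]) / denom if denom > 0 else 0.0
    return h, a


# Hypothetical eigenvalues for a single pixel.
h, a = entropy_anisotropy([4.0, 3.0, 1.0])
```

Classification schemes such as H/α or H/A then partition the plane spanned by two of these parameters into zones, each associated with a scattering type.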
Maximum Margin Clustering of Hyperspectral Data
NASA Astrophysics Data System (ADS)
Niazmardi, S.; Safari, A.; Homayouni, S.
2013-09-01
In recent decades, large margin methods such as Support Vector Machines (SVMs) have been regarded as the state of the art among supervised learning methods for classification of hyperspectral data. However, the results of these algorithms depend mainly on the quality and quantity of available training data. To tackle the problems associated with training data, researchers have put effort into extending the capability of large margin algorithms to unsupervised learning. One recently proposed algorithm is Maximum Margin Clustering (MMC). MMC is an unsupervised SVM algorithm that simultaneously estimates both the labels and the hyperplane parameters. Nevertheless, the optimization of the MMC algorithm is a non-convex problem. Most existing MMC methods rely on reformulating and relaxing the non-convex optimization problem as semi-definite programs (SDP), which are computationally very expensive and can handle only small data sets. Moreover, most of these algorithms perform two-class classification, which cannot be used for classification of remotely sensed data. In this paper, a new MMC algorithm is used that solves the original non-convex problem using the Alternative Optimization method. This algorithm is also extended for multi-class classification and its performance is evaluated. The results show that the proposed algorithm gives acceptable results for hyperspectral data clustering.
An Application of Self-Organizing Map for Multirobot Multigoal Path Planning with Minmax Objective
Faigl, Jan
2016-01-01
In this paper, Self-Organizing Map (SOM) for the Multiple Traveling Salesman Problem (MTSP) with minmax objective is applied to the robotic problem of multigoal path planning in the polygonal domain. The main difficulty of such SOM deployment is determination of collision-free paths among obstacles that is required to evaluate the neuron-city distances in the winner selection phase of unsupervised learning. Moreover, a collision-free path is also needed in the adaptation phase, where neurons are adapted towards the presented input signal (city) to the network. Simple approximations of the shortest path are utilized to address this issue and solve the robotic MTSP by SOM. Suitability of the proposed approximations is verified in the context of cooperative inspection, where cities represent sensing locations that guarantee to “see” the whole robots' workspace. The inspection task formulated as the MTSP-Minmax is solved by the proposed SOM approach and compared with the combinatorial heuristic GENIUS. The results indicate that the proposed approach provides competitive results to GENIUS and support applicability of SOM for robotic multigoal path planning with a group of cooperating mobile robots. The proposed combination of approximate shortest paths with unsupervised learning opens further applications of SOM in the field of robotic planning. PMID:27340395
Na, Kyoung-Sae; Lee, Soyoung Irene; Hong, Hyun Ju; Oh, Myoung-Ja; Bahn, Geon Ho; Ha, Kyunghee; Shin, Yun Mi; Song, Jungeun; Park, Eun Jin; Yoo, Heejung; Kim, Hyunsoo; Kyung, Yun-Mi
2014-06-01
In the last few decades, changing socioeconomic and family structures have increasingly left children alone without adult supervision. Carefully prepared and limited periods of unsupervised time are not harmful for children. However, long unsupervised periods have harmful effects, particularly for those children at high risk for inattention and problem behaviors. In this study, we examined the influence of unsupervised time on behavior problems by studying a sample of elementary school children at high risk for inattention and problem behaviors. The study analyzed data from the Children's Mental Health Promotion Project, which was conducted in collaboration with education, government, and mental health professionals. The child behavior checklist (CBCL) was administered to assess problem behaviors among first- and fourth-grade children. Multivariate logistic regression analysis was used to evaluate the influence of unsupervised time on children's behavior. A total of 3,270 elementary school children (1,340 first-graders and 1,930 fourth-graders) were available for this study; 1,876 of the 3,270 children (57.4%) reportedly spent a significant amount of time unsupervised during the day. Unsupervised time that exceeded more than 2h per day increased the risk of delinquency, aggressive behaviors, and somatic complaints, as well as externalizing and internalizing problems. Carefully planned afterschool programming and care should be provided to children at high risk for inattention and problem behaviors. Also, a more comprehensive approach is needed to identify the possible mechanisms by which unsupervised time aggravates behavior problems in children predisposed for these behaviors. Copyright © 2013 Elsevier Ltd. All rights reserved.
The vocabulary of anglophone psychology in the context of other subjects.
Benjafield, John G
2013-02-01
Anglophone psychology shares its vocabulary with several other subjects. Some of the more obvious subjects that have parts of their vocabulary in common with Anglophone psychology include biology (e.g., dominance), chemistry (e.g., isomorphism), philosophy (e.g., phenomenology), and theology (e.g., mediator). Using data from the Oxford English Dictionary as well as other sources, the present study explored the history of these common vocabularies, with a view to broadening our understanding of the relation between the history of psychology and the histories of other subjects. It turns out that there are at least 156 different subjects that share words with psychology. Those that have the most words in common with psychology are mathematics, biology, physics, medicine, chemistry, philosophy, law, music, linguistics, electricity, pathology, and computing. Words that have senses in other subjects and have their origins in ordinary language are used more frequently as PsycINFO keywords than words that were invented specifically for use in psychology. These and other results are interpreted in terms of the ordinary language roots of the vocabulary of Anglophone psychology and other subjects, the degree to which operational definitions have determined the meaning of the psychological senses of words, the role of the psychologist in interdisciplinary research, and the validity of psychological essentialism.
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2012-11-01
Formosat-2 imagery is high-spatial-resolution (2 meters GSD) remote sensing satellite data that includes one panchromatic band and four multispectral bands (Blue, Green, Red, near-infrared). An essential step in the daily processing of received Formosat-2 images is to estimate the cloud statistic of each image using an Automatic Cloud Coverage Assessment (ACCA) algorithm. The cloud statistic is subsequently recorded as important metadata for the image product catalog. In this paper, we propose an ACCA method with two consecutive stages: pre-processing and post-processing analysis. For pre-processing analysis, unsupervised K-means classification, Sobel's method, a thresholding method, non-cloudy pixel reexamination, and a cross-band filter method are implemented in sequence for cloud statistic determination. For post-processing analysis, the Box-Counting fractal method is implemented. In other words, the cloud statistic is first determined via pre-processing analysis, and the correctness of the cloud statistic for the different spectral bands is then cross-examined qualitatively and quantitatively via post-processing analysis. The selection of an appropriate thresholding method is critical to the result of the ACCA method. Therefore, in this work, we first conduct a series of experiments on clustering-based and spatial thresholding methods, including Otsu's, Local Entropy (LE), Joint Entropy (JE), Global Entropy (GE), and Global Relative Entropy (GRE) methods, for performance comparison. The results show that Otsu's and GE methods both perform better than the others for Formosat-2 images. Additionally, our proposed ACCA method, with Otsu's method selected as the thresholding method, has successfully extracted the cloudy pixels of Formosat-2 images for accurate cloud statistic estimation.
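Otsu's method, which the abstract above selects as its thresholding step, picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The following is a minimal sketch of the textbook algorithm on a flat list of integer pixel values; it is not the authors' ACCA pipeline, and the function name and the toy pixel values are illustrative assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w_low = 0      # number of pixels in the low class so far
    sum_low = 0    # sum of gray levels in the low class so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w_low += hist[t]
        sum_low += t * hist[t]
        if w_low == 0:
            continue  # low class still empty
        w_high = total - w_low
        if w_high == 0:
            break     # high class empty from here on
        mean_low = sum_low / w_low
        mean_high = (sum_all - sum_low) / w_high
        between = w_low * w_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical scene: dark ground pixels and bright cloud pixels.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
cloud_fraction = sum(1 for p in pixels if p > t) / len(pixels)
```

The fraction of pixels above the threshold gives a simple cloud-coverage estimate for a single band under these assumptions.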
Howard, Ian S.; Messum, Piers
2014-01-01
Words are made up of speech sounds. Almost all accounts of child speech development assume that children learn the pronunciation of first language (L1) speech sounds by imitation, most claiming that the child performs some kind of auditory matching to the elements of ambient speech. However, there is evidence to support an alternative account and we investigate the non-imitative child behavior and well-attested caregiver behavior that this account posits using Elija, a computational model of an infant. Through unsupervised active learning, Elija began by discovering motor patterns, which produced sounds. In separate interaction experiments, native speakers of English, French and German then played the role of his caregiver. In their first interactions with Elija, they were allowed to respond to his sounds if they felt this was natural. We analyzed the interactions through phonemic transcriptions of the caregivers' utterances and found that they interpreted his output within the framework of their native languages. Their form of response was almost always a reformulation of Elija's utterance into well-formed sounds of L1. Elija retained those motor patterns to which a caregiver responded and formed associations between his motor pattern and the response it provoked. Thus in a second phase of interaction, he was able to parse input utterances in terms of the caregiver responses he had heard previously, and respond using his associated motor patterns. This capacity enabled the caregivers to teach Elija to pronounce some simple words in their native languages, by his serial imitation of the words' component speech sounds. Overall, our results demonstrate that the natural responses and behaviors of human subjects to infant-like vocalizations can take a computational model from a biologically plausible initial state through to word pronunciation. This provides support for an alternative to current auditory matching hypotheses for how children learn to pronounce. 
PMID:25333740
Impact of Authenticity on Sense Making in Word Problem Solving
ERIC Educational Resources Information Center
Palm, Torulf
2008-01-01
The study presented in this paper seeks to investigate the impact of authenticity on the students' disposition to make necessary real world considerations in their word problem solving. The aim is also to gather information about the extent to which different reasons for the students' behaviors are responsible for not providing solutions that are…
On Lying and Being Lied to: A Linguistic Analysis of Deception in Computer-Mediated Communication
ERIC Educational Resources Information Center
Hancock, Jeffrey T.; Curry, Lauren E.; Goorha, Saurabh; Woodworth, Michael
2008-01-01
This study investigated changes in both the liar's and the conversational partner's linguistic style across truthful and deceptive dyadic communication in a synchronous text-based setting. An analysis of 242 transcripts revealed that liars produced more words, more sense-based words (e.g., seeing, touching), and used fewer self-oriented but more…
"Seeing It on the Screen Isn't Really Seeing It": Reading Problems of Writers Using Word Processing.
ERIC Educational Resources Information Center
Haas, Christina
An observational study examined computer writers' use of hard copy for reading. The study begins with a description, based on interviews, of four kinds of reading problems encountered by writers using word processing; formatting, proofreading, reorganizing, and critical reading ("getting a sense of the text"). Subjects, six freshmen…
Naming Tropes and Schemes in J. K. Rowling's Harry Potter Books
ERIC Educational Resources Information Center
Nilsen, Don L. F.; Nilsen, Alleen Pace
2009-01-01
"Trope" comes from a Greek word meaning "turn." In the rhetorical sense, a trope refers to a "turn" in the way that words are being used to communicate something more than--or different from--a literal or straightforward message. Tropes are part of "deep structure" meanings and include such rhetorical devices as allegories, allusions, euphemisms,…
ERIC Educational Resources Information Center
Reist, Kay
2006-01-01
French Dada artist Marcel Duchamp was one of a group of artists who created art that ridiculed contemporary European culture and traditional art forms. It is said that the movement got its name when one of the artists randomly opened a dictionary and blindly pointed to the word dada. The word made no sense at all but the artists considered it an…
The Significance of Personal Names for Very Young Children
ERIC Educational Resources Information Center
Ostler, Teresa
2014-01-01
Personal names are more than just a sound or word. From the earliest stages of development, names are closely connected to a child's attachment figures and sense of identity. Like words of magic, young children first use names to beckon the parent to them. Experiences with others provide the necessary backdrop for young children to infuse names…
NASA Astrophysics Data System (ADS)
He, Y.; He, Y.
2018-04-01
Urban shanty towns are communities of contiguous old and dilapidated houses with more than 2000 square meters of built-up area or more than 50 households. This study attempts to extract shanty towns in Nanning City using census products and TripleSat satellite images. With 0.8-meter high-resolution remote sensing images, five texture characteristics (energy, contrast, maximum probability, and inverse difference moment) of shanty towns are trained and analyzed through GLCM. In this study, samples of shanty towns are well classified, with 98.2 % producer accuracy for unsupervised classification and 73.2 % correctness for supervised classification. Low-rise and mid-rise residential blocks in Nanning City are classified into 4 different types using k-means clustering and nearest neighbour classification respectively. This study establishes initial texture feature descriptions of different types of residential areas, especially low-rise and mid-rise buildings, which would help city administrators evaluate residential blocks and reconstruct shanty towns.
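The texture characteristics listed above are derived from a gray-level co-occurrence matrix (GLCM), which counts how often a pixel of gray level i has a neighbor of gray level j at a fixed offset. The sketch below builds a normalized GLCM for one offset and computes the four named features. It is an illustration under stated assumptions: the function name, the offset, and the toy images are hypothetical, and energy is taken here as the sum of squared probabilities (some references use its square root).

```python
def glcm_features(image, levels, dx=1, dy=0):
    """Compute texture features from a normalized GLCM.

    image  : list of rows of integer gray levels in range(levels)
    dx, dy : neighbor offset (default: the pixel immediately to the right)
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    n = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                n += 1
    if n == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    p = [[v / n for v in row] for row in counts]

    cells = [(i, j, p[i][j]) for i in range(levels) for j in range(levels)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
        "inverse_difference_moment": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


# Hypothetical 2x2 patches: a flat patch and a checkerboard patch.
flat = glcm_features([[0, 0], [0, 0]], levels=2)
checker = glcm_features([[0, 1], [1, 0]], levels=2)
```

A flat patch yields maximal energy and zero contrast, while the checkerboard yields lower energy and higher contrast, which is the kind of difference a classifier can exploit to separate texture types.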
NASA Technical Reports Server (NTRS)
Blackwell, R. J.
1982-01-01
Remote sensing data analysis of water quality monitoring is evaluated. Data analysis and image processing techniques are applied to LANDSAT remote sensing data to produce an effective operational tool for lake water quality surveying and monitoring. Digital image processing and analysis techniques were designed, developed, tested, and applied to LANDSAT multispectral scanner (MSS) data and conventional surface acquired data. Utilization of these techniques facilitates the surveying and monitoring of large numbers of lakes in an operational manner. Supervised multispectral classification, when used in conjunction with surface acquired water quality indicators, is used to characterize water body trophic status. Unsupervised multispectral classification, when interpreted by lake scientists familiar with a specific water body, yields classifications of equal validity with supervised methods and in a more cost effective manner. Image data base technology is used to great advantage in characterizing other contributing effects to water quality. These effects include drainage basin configuration, terrain slope, soil, precipitation and land cover characteristics.
Change Detection of Remote Sensing Images by Dt-Cwt and Mrf
NASA Astrophysics Data System (ADS)
Ouyang, S.; Fan, K.; Wang, H.; Wang, Z.
2017-05-01
To address the significant loss of high-frequency information during noise reduction and the assumption of pixel independence in change detection of multi-scale remote sensing images, an unsupervised algorithm is proposed that combines the Dual-tree Complex Wavelet Transform (DT-CWT) with a Markov Random Field (MRF) model. The method first performs multi-scale decomposition of the difference image with the DT-CWT and extracts the change characteristics in high-frequency regions using an MRF-based segmentation algorithm. After reconstructing the high-frequency and low-frequency sub-bands of each layer separately, it estimates the final maximum a posteriori (MAP) solution using an iterated conditional modes (ICM) segmentation algorithm based on fuzzy c-means (FCM). Finally, the method fuses the segmentation results of each layer using the proposed fusion rule to obtain the mask of the final change detection result. Experimental results show that the proposed method achieves higher precision and strong robustness.
Opposing Effects of Semantic Diversity in Lexical and Semantic Relatedness Decisions
2015-01-01
Semantic ambiguity has often been divided into 2 forms: homonymy, referring to words with 2 unrelated interpretations (e.g., bark), and polysemy, referring to words associated with a number of varying but semantically linked uses (e.g., twist). Typically, polysemous words are thought of as having a fixed number of discrete definitions, or “senses,” with each use of the word corresponding to one of its senses. In this study, we investigated an alternative conception of polysemy, based on the idea that polysemous variation in meaning is a continuous, graded phenomenon that occurs as a function of contextual variation in word usage. We quantified this contextual variation using semantic diversity (SemD), a corpus-based measure of the degree to which a particular word is used in a diverse set of linguistic contexts. In line with other approaches to polysemy, we found a reaction time (RT) advantage for high SemD words in lexical decision, which occurred for words of both high and low imageability. When participants made semantic relatedness decisions to word pairs, however, responses were slower to high SemD pairs, irrespective of whether these were related or unrelated. Again, this result emerged irrespective of the imageability of the word. The latter result diverges from previous findings using homonyms, in which ambiguity effects have only been found for related word pairs. We argue that participants were slower to respond to high SemD words because their high contextual variability resulted in noisy, underspecified semantic representations that were more difficult to compare with one another. We demonstrated this principle in a connectionist computational model that was trained to activate distributed semantic representations from orthographic inputs. Greater variability in the orthography-to-semantic mappings of high SemD words resulted in a lower degree of similarity for related pairs of this type. At the same time, the representations of high SemD unrelated pairs were less distinct from one another. In addition, the model demonstrated more rapid semantic activation for high SemD words, thought to underpin the processing advantage in lexical decision. These results support the view that polysemous variation in word meaning can be conceptualized in terms of graded variation in distributed semantic representations. PMID:25751041
Distance-dependent processing of pictures and words.
Amit, Elinor; Algom, Daniel; Trope, Yaacov
2009-08-01
A series of 8 experiments investigated the association between pictorial and verbal representations and the psychological distance of the referent objects from the observer. The results showed that people better process pictures that represent proximal objects and words that represent distal objects than pictures that represent distal objects and words that represent proximal objects. These results were obtained with various psychological distance dimensions (spatial, temporal, and social), different tasks (classification and categorization), and different measures (speed of processing and selective attention). The authors argue that differences in the processing of pictures and words emanate from the physical similarity of pictures, but not words, to the referents. Consequently, perceptual analysis is commonly applied to pictures but not to words. Pictures thus impart a sense of closeness to the referent objects and are preferably used to represent such objects, whereas words do not convey proximity and are preferably used to represent distal objects in space, time, and social perspective.
Atherton, Olivia E; Schofield, Thomas J; Sitka, Angela; Conger, Rand D; Robins, Richard W
2016-04-01
Despite widespread speculation about the detrimental effect of unsupervised self-care on adolescent outcomes, little is known about which children are particularly prone to problem behaviors when left at home without adult supervision. The present research used data from a longitudinal study of 674 Mexican-origin children residing in the United States to examine the prospective effect of unsupervised self-care on conduct problems, and the moderating roles of hostile aggression and gender. Results showed that unsupervised self-care was related to increases over time in conduct problems such as lying, stealing, and bullying. However, unsupervised self-care only led to conduct problems for boys and for children with an aggressive temperament. The main and interactive effects held for both mother-reported and observational-rated hostile aggression and after controlling for potential confounds. Copyright © 2016 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
Good initialization model with constrained body structure for scene text recognition
NASA Astrophysics Data System (ADS)
Zhu, Anna; Wang, Guoyou; Dong, Yangbo
2016-09-01
Scene text recognition has gained significant attention in the computer vision community. Character detection and recognition are the premise of text recognition and affect the overall performance to a large extent. We propose a good initialization model for scene character recognition from cropped text regions. We use constrained character body structures with deformable part-based models to detect and recognize characters in various backgrounds. The character body structures are obtained by an unsupervised discriminative clustering approach followed by a statistical model and a self-built minimum spanning tree model. Our method utilizes part appearance and location information, and combines character detection and recognition in cropped text regions. The evaluation results on the benchmark datasets demonstrate that our proposed scheme outperforms the state-of-the-art methods in both scene character recognition and word recognition.
Near ground level sensing for spatial analysis of vegetation
NASA Technical Reports Server (NTRS)
Sauer, Tom; Rasure, John; Gage, Charlie
1991-01-01
Measured changes in vegetation indicate the dynamics of ecological processes and can identify the impacts from disturbances. Traditional methods of vegetation analysis tend to be slow because they are labor intensive; as a result, these methods are often confined to small local area measurements. Scientists need new algorithms and instruments that will allow them to efficiently study environmental dynamics across a range of different spatial scales. A new methodology that addresses this problem is presented. This methodology includes the acquisition, processing, and presentation of near ground level image data and its corresponding spatial characteristics. The systematic approach taken encompasses a feature extraction process, a supervised and unsupervised classification process, and a region labeling process yielding spatial information.
NASA Astrophysics Data System (ADS)
Komachi, Mamoru; Kudo, Taku; Shimbo, Masashi; Matsumoto, Yuji
Bootstrapping has a tendency, called semantic drift, to select instances unrelated to the seed instances as the iteration proceeds. We demonstrate that the semantic drift of Espresso-style bootstrapping has the same root as the topic drift of Kleinberg's HITS, using a simplified graph-based reformulation of bootstrapping. We confirm that two graph-based algorithms, the von Neumann kernels and the regularized Laplacian, can reduce the effect of semantic drift in the task of word sense disambiguation (WSD) on the Senseval-3 English Lexical Sample Task. The proposed algorithms achieve superior performance to Espresso and previous graph-based WSD methods, even though they have fewer parameters and are easy to calibrate.
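As a rough illustration of the regularized Laplacian mentioned in this abstract, the sketch below ranks instances against a seed using the standard kernel form (I + βL)^-1 with L = D - A. The instance-pattern matrix `M`, the similarity `A = M M^T`, the value of `beta`, and the function name are all assumptions made for illustration; the abstract does not give the authors' exact formulation.

```python
import numpy as np

def regularized_laplacian_scores(M, seed, beta=0.1):
    """Score instances against a seed vector with a regularized Laplacian kernel.

    M    : (n_instances, n_patterns) co-occurrence matrix (hypothetical input)
    seed : (n_instances,) indicator vector of seed instances
    beta : diffusion parameter (assumed value, not from the abstract)
    """
    A = M @ M.T                      # instance-instance similarity (assumption)
    D = np.diag(A.sum(axis=1))       # degree matrix
    L = D - A                        # graph Laplacian
    n = A.shape[0]
    # Solve (I + beta * L) x = seed instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) + beta * L, seed)

# Toy data: 4 instances described by 3 patterns; instance 0 is the seed.
M = np.array([[1, 1, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)
seed = np.array([1.0, 0.0, 0.0, 0.0])
scores = regularized_laplacian_scores(M, seed)
ranking = np.argsort(-scores)        # instances ordered by relatedness to the seed
```

Because each row of L sums to zero, the scores for a single seed sum to one, so they can be read as a distribution of relatedness over instances.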
Development of remote sensing based site specific weed management for Midwest mint production
NASA Astrophysics Data System (ADS)
Gumz, Mary Saumur Paulson
Peppermint and spearmint are high value essential oil crops in Indiana, Michigan, and Wisconsin. Although the mints are profitable alternatives to corn and soybeans, mint production efficiency must improve in order to allow industry survival against foreign produced oils and synthetic flavorings. Weed control is the major input cost in mint production and tools to increase efficiency are necessary. Remote sensing-based site-specific weed management offers potential for decreasing weed control costs through simplified weed detection and control from accurate site specific weed and herbicide application maps. This research showed the practicability of remote sensing for weed detection in the mints. Research was designed to compare spectral response curves of field grown mint and weeds, and to use these data to develop spectral vegetation indices for automated weed detection. Viability of remote sensing in mint production was established using unsupervised classification, supervised classification, handheld spectroradiometer readings and spectral vegetation indices (SVIs). Unsupervised classification of multispectral images of peppermint production fields generated crop health maps with 92 and 67% accuracy in meadow and row peppermint, respectively. Supervised classification of multispectral images identified weed infestations with 97% and 85% accuracy for meadow and row peppermint, respectively. Supervised classification showed that peppermint was spectrally distinct from weeds, but the accuracy of these measures was dependent on extensive ground referencing which is impractical and too costly for on-farm use. Handheld spectroradiometer measurements of peppermint, spearmint, and several weeds and crop and weed mixtures were taken over three years from greenhouse grown plants, replicated field plots, and production peppermint and spearmint fields. 
Results showed that mints have greater near infrared (NIR) and lower green reflectance and a steeper red edge slope than all weed species. These distinguishing characteristics were combined to develop narrow band and broadband spectral vegetation indices (SVIs, ratios of NIR/green reflectance), that were effective in differentiating mint from key weed species. Hyperspectral images of production peppermint and spearmint fields were then classified using SVI-based classification. Narrowband and broadband SVIs classified early season peppermint and spearmint with 64 to 100% accuracy compared to 79 to 100% accuracy for supervised classification of multispectral images of the same fields. Broadband SVIs have potential for use as an automated spectral indicator for weeds in the mints since they require minimal ground referencing and can be calculated from multispectral imagery which is cheaper and more readily available than hyperspectral imagery. This research will allow growers to implement remote sensing based site specific weed management in mint resulting in reduced grower input costs and reduced herbicide entry into the environment and will have applications in other specialty and meadow crops.
A Survey of Meaning Discrimination in Selected English/Spanish Dictionaries.
ERIC Educational Resources Information Center
Powers, Michael D.
1985-01-01
Examines the treatment of sense discrimination in eight Spanish/English English/Spanish bilingual dictionaries and one specialized dictionary. Does this by analyzing 30 words that Torrents des Prats determined have at least nine different sense discriminations from English into Spanish. Larousse was found to be far superior to the others. (SED)
New Interpretations of Native American Literature: A Survival Technique.
ERIC Educational Resources Information Center
Buller, Galen
1980-01-01
Uses examples from the work of several Native American authors, including N. Scott Momaday and Leslie Silko, to discuss five unique elements in American Indian literature: reverence for words, dependence on a sense of place, sense of ritual, affirmation of the need for community, and a significantly different world view. (SB)
Automatic Word Sense Disambiguation of Acronyms and Abbreviations in Clinical Texts
ERIC Educational Resources Information Center
Moon, Sungrim
2012-01-01
The use of acronyms and abbreviations is increasing profoundly in the clinical domain in large part due to the greater adoption of electronic health record (EHR) systems and increased electronic documentation within healthcare. A single acronym or abbreviation may have multiple different meanings or senses. Comprehending the proper meaning of an…
Infographics: More than Words Can Say
ERIC Educational Resources Information Center
Krauss, Jane
2012-01-01
Good learning experiences ask students to investigate and make sense of the world. While there are many ways to do this, K-12 curriculum has traditionally skewed toward reading and writing to interpret and express students' sense-making. But there is another way. Infographics represent data and ideas visually, in pictures, engaging more parts of…
Unsupervised universal steganalyzer for high-dimensional steganalytic features
NASA Astrophysics Data System (ADS)
Hou, Xiaodan; Zhang, Tao
2016-11-01
Research on developing steganalytic features has been highly successful. These features are extremely powerful when applied to supervised binary classification problems. However, they are incompatible with unsupervised universal steganalysis because the unsupervised method cannot distinguish embedding distortion from varying levels of noise caused by cover variation. This study attempts to alleviate the problem by introducing similarity retrieval of image statistical properties (SRISP), with the specific aim of mitigating the effect of cover variation on the existing steganalytic features. First, cover images with some statistical properties similar to those of a given test image are searched from a retrieval cover database to establish an aided sample set. Then, unsupervised outlier detection is performed on a test set composed of the given test image and its aided sample set to determine the type (cover or stego) of the given test image. Our proposed framework, called SRISP-aided unsupervised outlier detection, requires no training. Thus, it does not suffer from model mismatch. Compared with prior unsupervised outlier detectors that do not consider SRISP, the proposed framework not only retains the universality but also exhibits superior performance when applied to high-dimensional steganalytic features.
Video mining using combinations of unsupervised and supervised learning techniques
NASA Astrophysics Data System (ADS)
Divakaran, Ajay; Miyahara, Koji; Peker, Kadir A.; Radhakrishnan, Regunathan; Xiong, Ziyou
2003-12-01
We discuss the meaning and significance of the video mining problem, and present our work on some aspects of video mining. A simple definition of video mining is unsupervised discovery of patterns in audio-visual content. Such purely unsupervised discovery is readily applicable to video surveillance as well as to consumer video browsing applications. We interpret video mining as content-adaptive or "blind" content processing, in which the first stage is content characterization and the second stage is event discovery based on the characterization obtained in stage 1. We discuss the target applications and find that purely unsupervised approaches are too computationally complex to be implemented on our product platform. We then describe various combinations of unsupervised and supervised learning techniques that help discover patterns that are useful to the end-user of the application. We target consumer video browsing applications such as commercial message detection, sports highlights extraction, etc. We employ both audio and video features. We find that supervised audio classification combined with unsupervised unusual event discovery enables accurate supervised detection of desired events. Our techniques are computationally simple and robust to common variations in production styles.
Manual for Bilingual Dictionaries. Textbook, Word List A-L, and Word List LL-Z.
ERIC Educational Resources Information Center
Robinson, Dow F.
Volume One of this handbook for the preparation of bilingual dictionaries deals with (1) the purpose and structure of the bilingual dictionary for which this manual is designed; (2) the grammatical form of a main entry; (3) the grammatical designation of vernacular entries; (4) gloss in Spanish and vernacular; (5) sense discriminations; (6)…
The Role of Orthographic Analogies in Reading for Meaning: Evidence from Readers with Dyslexia
ERIC Educational Resources Information Center
Humphrey, Neil; Richard Hanley, J.
2004-01-01
The aim of this experiment was to investigate the use of orthographic analogies in conditions that involved making sense of print (picture-word matching) and pronouncing print (reading aloud) for readers with dyslexia. An adapted version of the classic clue-word paradigm developed by Goswami was used. Participants were 40 readers with dyslexia and…
Hybrid region merging method for segmentation of high-resolution remote sensing images
NASA Astrophysics Data System (ADS)
Zhang, Xueliang; Xiao, Pengfeng; Feng, Xuezhi; Wang, Jiangeng; Wang, Zuo
2014-12-01
Image segmentation remains a challenging problem for object-based image analysis. In this paper, a hybrid region merging (HRM) method is proposed to segment high-resolution remote sensing images. HRM integrates the advantages of global-oriented and local-oriented region merging strategies into a unified framework. The globally most-similar pair of regions is used to determine the starting point of a growing region, which provides an elegant way to avoid the problem of starting point assignment and to enhance the optimization ability for local-oriented region merging. During the region growing procedure, the merging iterations are constrained within the local vicinity, so that the segmentation is accelerated and can reflect the local context, as compared with the global-oriented method. A set of high-resolution remote sensing images is used to test the effectiveness of the HRM method, and three region-based remote sensing image segmentation methods are adopted for comparison, including the hierarchical stepwise optimization (HSWO) method, the local-mutual best region merging (LMM) method, and the multiresolution segmentation (MRS) method embedded in eCognition Developer software. Both the supervised evaluation and visual assessment show that HRM performs better than HSWO and LMM by combining both their advantages. The segmentation results of HRM and MRS are visually comparable, but HRM can describe objects as single regions better than MRS, and the supervised and unsupervised evaluation results further prove the superiority of HRM.
Abstractness and emotionality values for 398 English words.
Guido, Gianluigi; Provenzano, Maria Rosaria
2004-06-01
This study aimed to replicate Vikis-Freibergs' classic study (1976) on the values of vividness for French words. Vividness resulted from the concreteness and the emotionality values of words, here defined, respectively, as referring to something that can be experienced through the senses and that can arouse pleasant or unpleasant emotions. 398 English words were rated on two different scales, Abstractness and Emotionality, by a group of native English speakers and also by a group of Italian subjects who used English as a second language. Results show a low correlation between the concreteness and emotionality ratings, in line with Vikis-Freibergs' previous study of French words (1976). A negative correlation between Abstractness and Emotionality was observed for the British data but a slightly positive correlation for the Italian data.
NASA Astrophysics Data System (ADS)
Hoffman, F. M.; Kumar, J.; Hargrove, W. W.
2013-12-01
Vegetated ecosystems typically exhibit unique phenological behavior over the course of a year, suggesting that remotely sensed land surface phenology may be useful for characterizing land cover and ecoregions. However, phenology is also strongly influenced by temperature and water stress; insect, fire, and storm disturbances; and climate change over seasonal, interannual, decadal and longer time scales. Normalized difference vegetation index (NDVI), a remotely sensed measure of greenness, provides a useful proxy for land surface phenology. We used NDVI for the conterminous United States (CONUS) derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) at 250 m resolution to develop phenological signatures of emergent ecological regimes called phenoregions. By applying an unsupervised, quantitative data mining technique to NDVI measurements for every eight days over the entire MODIS record, annual maps of phenoregions were developed. This technique produces a prescribed number of prototypical phenological states to which every location belongs in any year. To reduce the impact of short-term disturbances, we derived a single map of the mode of annual phenological states for the CONUS, assigning each map cell to the state with the largest integrated NDVI in cases where multiple states tie for the highest frequency. Since the data mining technique is unsupervised, individual phenoregions are not associated with an ecologically understandable label. To add automated supervision to the process, we applied the method of Mapcurves, developed by Hargrove and Hoffman, to associate individual phenoregions with labeled polygons in expert-derived maps of biomes, land cover, and ecoregions. Utilizing spatial overlays with multiple expert-derived maps, this "label-stealing" technique exploits the knowledge contained in a collection of maps to identify biome characteristics of our statistically derived phenoregions. 
Generalized land cover maps were produced by combining phenoregions according to their degree of spatial coincidence with expert-developed land cover or biome regions. Goodness-of-fit maps, which show the strength of the spatial correspondence, were also generated.
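The modal-state map described in this abstract (most frequent annual state per cell, with ties broken by the largest integrated NDVI) can be sketched as follows. This is a minimal illustration, not the authors' code: the array layout, the function name `modal_state`, and the assumption that integrated NDVI is a single value per prototype state are all hypothetical.

```python
import numpy as np

def modal_state(states, integrated_ndvi):
    """Pick the most frequent annual state per map cell.

    states          : (n_years, n_cells) integer state labels per year
    integrated_ndvi : (n_states,) integrated NDVI of each prototype state
                      (assumed to be a per-state quantity)
    Ties for the highest frequency go to the state with the largest
    integrated NDVI.
    """
    n_states = integrated_ndvi.shape[0]
    # counts[s, c] = number of years in which cell c was in state s
    counts = np.stack([(states == s).sum(axis=0) for s in range(n_states)])
    is_top = counts == counts.max(axis=0)
    # Among the tied top states, keep the one with the largest integrated NDVI.
    score = np.where(is_top, integrated_ndvi[:, None], -np.inf)
    return score.argmax(axis=0)

# Toy data: 4 years, 3 cells, 3 states.
states = np.array([[0, 1, 0],
                   [1, 1, 0],
                   [0, 2, 0],
                   [1, 2, 2]])
integrated_ndvi = np.array([3.0, 5.0, 6.0])
mode_map = modal_state(states, integrated_ndvi)
```

In the toy data, cells 0 and 1 each have a two-way tie that the NDVI rule resolves, while cell 2 has a clear majority state.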
ERIC Educational Resources Information Center
Perricone, Christopher
2010-01-01
"Tragedy," both in what the author calls the strict and nuclear ancient Greek sense of the term (which does not imply that tragedy is clearly and distinctly defined, even in ancient Greece) and in the looser, derived sense of the word, has a long and compelling history. It is not only true that tragedy as practice and performance has a…
The focus series: A collection of single-concept remote sensing educational materials
NASA Technical Reports Server (NTRS)
Davis, S. M.
1977-01-01
The FOCUS series is a collection of two-page foldout documents, each consisting of a diagram or photograph and an extended caption of three to four hundred words. The series was developed to present basic remote sensing concepts in a simple, concise way. Issues currently available are collected in this information note.
Acquiring the Impossible: Developmental Stages of Copredication
Murphy, Elliot
2017-01-01
Much is known about the acquisition of phonological competence and lexical categories, but there has been substantially less research into word meaning development. In an attempt to contribute to this debate, a group of 24 children aged 4–11 were asked to define a set of words, as were a group of 12 adult controls. The stimuli included both concrete and abstract words, in particular words exhibiting a rare form of polysemy known as copredication, which permits the simultaneous attribution of concrete and abstract senses to a single nominal, creating an ‘impossible’ entity. The results were used to track the developmental trajectory of copredication, previously unexplored in the language acquisition literature. PMID:28701983
ERIC Educational Resources Information Center
Russell, Dale W.
An obstacle in Natural Language understanding is the existence of lexical gaps, i.e. words or word senses that are not in the lexicon of the system. This thesis describes the implementation of MURRAY, a learning mechanism which infers the properties of a new lexical item from its syntactical environment and infers its meaning based on context and…
River of Words: Discovering a Sense of Place and Self
ERIC Educational Resources Information Center
Baird, Jeffrey Marshall
2006-01-01
One of the most powerful ways middle school students learn about themselves is through writing, and one of the most effective ways for students to learn about themselves is by writing about a place. At Cosgriff, a K-8 Catholic School in Salt Lake City, Utah, teachers have put this idea into practice by participating in the River of Words project.…
ERIC Educational Resources Information Center
Taber, Keith S.; Bricheno, Pat
2009-01-01
The present paper discusses the conceptual demands of an apparently straightforward task set to secondary-level students--completing chemical word equations with a single omitted term. Chemical equations are of considerable importance in chemistry, and school students are expected to learn to be able to write and interpret them. However, it is…
Taguchi, Y-h; Iwadate, Mitsuo; Umeyama, Hideaki
2015-04-30
Feature extraction (FE) is difficult, particularly if there are more features than samples, as small sample numbers often result in biased outcomes or overfitting. Furthermore, multiple sample classes often complicate FE because evaluating performance, which is usual in supervised FE, is generally harder than the two-class problem. Developing sample classification independent unsupervised methods would solve many of these problems. Two principal component analysis (PCA)-based FE, specifically, variational Bayes PCA (VBPCA) was extended to perform unsupervised FE, and together with conventional PCA (CPCA)-based unsupervised FE, were tested as sample classification independent unsupervised FE methods. VBPCA- and CPCA-based unsupervised FE both performed well when applied to simulated data, and a posttraumatic stress disorder (PTSD)-mediated heart disease data set that had multiple categorical class observations in mRNA/microRNA expression of stressed mouse heart. A critical set of PTSD miRNAs/mRNAs were identified that show aberrant expression between treatment and control samples, and significant, negative correlation with one another. Moreover, greater stability and biological feasibility than conventional supervised FE was also demonstrated. Based on the results obtained, in silico drug discovery was performed as translational validation of the methods. Our two proposed unsupervised FE methods (CPCA- and VBPCA-based) worked well on simulated data, and outperformed two conventional supervised FE methods on a real data set. Thus, these two methods have suggested equivalence for FE on categorical multiclass data sets, with potential translational utility for in silico drug discovery.
How meaning similarity influences ambiguous word processing: the current state of the literature
Tokowicz, Natasha
2016-01-01
The majority of words in the English language do not correspond to a single meaning, but rather correspond to two or more unrelated meanings (i.e., are homonyms) or multiple related senses (i.e., are polysemes). It has been proposed that the different types of “semantically-ambiguous words” (i.e., words with more than one meaning) are processed and represented differently in the human mind. Several review papers and books have been written on the subject of semantic ambiguity (e.g., Adriaens, Small, Cottrell, & Tanenhaus, 1988; Burgess & Simpson, 1988; Degani & Tokowicz, 2010; Gorfein, 1989, 2001; Simpson, 1984). However, several more recent studies (e.g., Klein & Murphy, 2001; Klepousniotou, 2002; Klepousniotou & Baum, 2007; Rodd, Gaskell, & Marslen-Wilson, 2002) have investigated the role of the semantic similarity between the multiple meanings of ambiguous words on processing and representation, whereas this was not the emphasis of previous reviews of the literature. In this review, we focus on the current state of the semantic ambiguity literature that examines how different types of ambiguous words influence processing and representation. We analyze the consistent and inconsistent findings reported in the literature and how factors such as semantic similarity, meaning/sense frequency, task, timing, and modality affect ambiguous word processing. We discuss the findings with respect to recent parallel distributed processing (PDP) models of ambiguity processing (Armstrong & Plaut, 2008, 2011; Rodd, Gaskell, & Marslen-Wilson, 2004). Finally, we discuss how experience/instance-based models (e.g., Hintzman, 1986; Reichle & Perfetti, 2003) can inform a comprehensive understanding of semantic ambiguity resolution. PMID:24889119
Adaptive multi-sensor biomimetics for unsupervised submarine hunt (AMBUSH): Early results
NASA Astrophysics Data System (ADS)
Blouin, Stéphane
2014-10-01
Underwater surveillance is inherently difficult because acoustic wave propagation and transmission are limited and unpredictable when targets and sensors move around in the communication-opaque undersea environment. Today's Navy underwater sensors enable the collection of a massive amount of data, often analyzed offline. The Navy of tomorrow will dominate by making sense of that data in real-time. DRDC's AMBUSH project proposes a new undersea-surveillance network paradigm that will enable such a real-time operation. Nature abounds with examples of collaborative tasks taking place despite limited communication and computational capabilities. This publication describes a year's worth of research efforts finding inspiration in Nature's collaborative tasks such as wolves hunting in packs. This project proposes the utilization of a heterogeneous network combining both static and mobile network nodes. The military objective is to enable an unsupervised surveillance capability while maximizing target localization performance and endurance. The scientific objective is to develop the necessary technology to acoustically and passively localize a noise-source of interest in shallow waters. The project fulfills these objectives via distributed computing and adaptation to changing undersea conditions. Specific research interests discussed here relate to approaches for performing: (a) network self-discovery, (b) network connectivity self-assessment, (c) opportunistic network routing, (d) distributed data-aggregation, and (e) simulation of underwater acoustic propagation. We present early results, followed by a discussion of future work.
Geospatiotemporal Data Mining of Remotely Sensed Phenology for Unsupervised Forest Threat Detection
NASA Astrophysics Data System (ADS)
Mills, R. T.; Hoffman, F. M.; Kumar, J.; Vulli, S. S.; Hargrove, W. W.; Spruce, J.
2010-12-01
Hargrove and Hoffman have previously developed and applied a scalable geospatiotemporal data mining approach to define a set of categorical, multivariate classes or states for describing and tracking the behavior of ecosystem properties through time within a multi-dimensional phase or state space. The method employs a standard k-means cluster analysis with enhancements that reduce the number of required comparisons, dramatically accelerating iterative convergence. In support of efforts by the USDA Forest Service to develop a National Early Warning System for Forest Disturbances, we have applied this geospatiotemporal cluster analysis procedure to annual phenology patterns derived from Moderate Resolution Imaging Spectroradiometer (MODIS) Normalized Difference Vegetation Index (NDVI) for unsupervised change detection. We will present initial results from the analysis of seven years of 250-m MODIS NDVI data for the conterminous United States. While determining what constitutes a "normal" phenological pattern for any given location is challenging due to interannual climate variability, a spatially varying climate change trend, and the relatively short record of MODIS NDVI observations, these results demonstrate the utility of the method for detecting significant mortality events, like the progressive damage from mountain pine beetle, and suggest that the technique may be successfully implemented as a key component in an early warning system for identifying forest threats from natural and anthropogenic disturbances at a continental scale.
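To make the clustering step in this abstract concrete, here is a minimal sketch of standard k-means (Lloyd's algorithm) applied to synthetic annual NDVI profiles. It does not include the comparison-reducing enhancements the abstract mentions, and the synthetic profiles, the 46 time steps per year, and the choice of initial centroids are assumptions for illustration only.

```python
import numpy as np

def kmeans(X, centroids, n_iter=50):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = centroids.astype(float).copy()
    for _ in range(n_iter):
        # Squared Euclidean distance from every profile to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centroids.copy()
        for k in range(len(centroids)):
            members = X[labels == k]
            if len(members):          # keep the old centroid if a cluster empties
                new[k] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

# Synthetic annual NDVI profiles (hypothetical): 46 composites per year.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 46)
evergreen = 0.7 + 0.02 * rng.standard_normal((5, 46))                    # flat, high NDVI
deciduous = 0.4 - 0.3 * np.cos(t) + 0.02 * rng.standard_normal((5, 46))  # strong seasonal cycle
X = np.vstack([evergreen, deciduous])

# Initialise from one profile of each kind (an assumption that keeps the demo deterministic).
labels, centroids = kmeans(X, X[[0, -1]])
```

For change detection in the spirit of the abstract, a cell whose label changes from one year to the next would be a candidate for closer inspection; that step is not shown here.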
Change classification in SAR time series: a functional approach
NASA Astrophysics Data System (ADS)
Boldt, Markus; Thiele, Antje; Schulz, Karsten; Hinz, Stefan
2017-10-01
Change detection represents a broad field of research in SAR remote sensing, consisting of many different approaches. Besides the simple recognition of change areas, the analysis of the type, category or class of the change areas is at least as important for creating a comprehensive result. Conventional strategies for change classification are based on supervised or unsupervised landuse / landcover classifications. The main drawback of such approaches is that the quality of the classification result directly depends on the selection of training and reference data. Additionally, supervised processing methods require an experienced operator who capably selects the training samples. This training step is not necessary when using unsupervised strategies, but nevertheless meaningful reference data must be available for identifying the resulting classes. Consequently, an experienced operator is indispensable. In this study, an innovative concept for the classification of changes in SAR time series data is proposed. In view of the drawbacks of traditional strategies given above, it works without any training data. Moreover, the method can be applied by an operator who does not yet have detailed knowledge of the scene. This knowledge is provided by the algorithm. The final step of the procedure, whose main aspect is the iterative optimization of an initial class scheme with respect to the categorized change objects, is the assignment of these objects to the final classes. This assignment step is the subject of this paper.
NASA Astrophysics Data System (ADS)
Bellón, Beatriz; Bégué, Agnès; Lo Seen, Danny; Lebourgeois, Valentine; Evangelista, Balbino Antônio; Simões, Margareth; Demonte Ferraz, Rodrigo Peçanha
2018-06-01
Cropping systems' maps at fine scale over large areas provide key information for further agricultural production and environmental impact assessments, and thus represent a valuable tool for effective land-use planning. There is, therefore, a growing interest in mapping cropping systems in an operational manner over large areas, and remote sensing approaches based on vegetation index time series analysis have proven to be an efficient tool. However, supervised pixel-based approaches are commonly adopted, requiring resource-consuming field campaigns to gather training data. In this paper, we present a new object-based unsupervised classification approach tested on an annual MODIS 16-day composite Normalized Difference Vegetation Index time series and a Landsat 8 mosaic of the State of Tocantins, Brazil, for the 2014-2015 growing season. Two variants of the approach are compared: a hyperclustering approach, and a landscape-clustering approach involving a previous stratification of the study area into landscape units on which the clustering is then performed. The main cropping systems of Tocantins, characterized by the crop types and cropping patterns, were efficiently mapped with the landscape-clustering approach. Results show that stratification prior to clustering significantly improves the classification accuracies for underrepresented and sparsely distributed cropping systems. This study illustrates the potential of unsupervised classification for large area cropping systems' mapping and contributes to the development of generic tools for supporting large-scale agricultural monitoring across regions.
Automatic cloud coverage assessment of Formosat-2 image
NASA Astrophysics Data System (ADS)
Hsu, Kuo-Hsien
2011-11-01
The Formosat-2 satellite is equipped with a high-spatial-resolution (2 m ground sampling distance) remote sensing instrument. It has been operated on a daily-revisit mission orbit by the National Space Organization (NSPO) of Taiwan since May 21, 2004. NSPO also serves as one of the ground receiving stations for daily processing of the received Formosat-2 images. The current cloud coverage assessment of Formosat-2 images in the NSPO Image Processing System generally consists of two major steps. First, an unsupervised K-means method is used to automatically estimate the cloud statistic of a Formosat-2 image. Second, cloud coverage is estimated by manual examination of the image. A more accurate Automatic Cloud Coverage Assessment (ACCA) method, with a good prediction of the cloud statistic, would increase the efficiency of step 2. In this paper, mainly based on the research results from Chang et al., Irish, and Gotoh, we propose a modified Formosat-2 ACCA method that includes pre-processing and post-processing analysis. For pre-processing analysis, the cloud statistic is determined by using unsupervised K-means classification, Sobel's method, Otsu's method, non-cloudy pixel reexamination, and a cross-band filter method. The box-counting fractal method is used as a post-processing tool to double-check the results of the pre-processing analysis and increase the efficiency of manual examination.
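Of the pre-processing steps listed in this abstract, Otsu's method is the easiest to show in isolation. The sketch below implements the textbook version (choose the threshold that maximizes between-class variance of a histogram) on synthetic brightness values; it is not the authors' ACCA pipeline, and the bin count, the synthetic "ground" and "cloud" distributions, and the function name are assumptions.

```python
import numpy as np

def otsu_threshold(values, n_bins=256):
    """Return the bin centre that maximises between-class variance."""
    hist, edges = np.histogram(values, bins=n_bins)
    p = hist / hist.sum()                   # bin probabilities
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(p)                       # weight of the darker class
    w1 = 1.0 - w0                           # weight of the brighter class
    m = np.cumsum(p * centers)              # cumulative first moment
    m_total = m[-1]
    between = np.zeros(n_bins)
    valid = (w0 > 0) & (w1 > 0)             # skip splits with an empty class
    between[valid] = (m_total * w0[valid] - m[valid]) ** 2 / (w0[valid] * w1[valid])
    return centers[between.argmax()]

# Synthetic reflectance values (hypothetical): dark ground plus bright cloud.
rng = np.random.default_rng(1)
ground = rng.normal(0.2, 0.03, 500)
cloud = rng.normal(0.8, 0.03, 300)
pixels = np.concatenate([ground, cloud])

threshold = otsu_threshold(pixels)
cloud_fraction = float((pixels > threshold).mean())   # rough cloud-cover estimate
```

On a real scene the histogram is rarely this cleanly bimodal, which is presumably why the abstract combines Otsu's method with several other checks.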
NELasso: Group-Sparse Modeling for Characterizing Relations Among Named Entities in News Articles.
Tariq, Amara; Karim, Asim; Foroosh, Hassan
2017-10-01
Named entities such as people, locations, and organizations play a vital role in characterizing online content. They often reflect information of interest and are frequently used in search queries. Although named entities can be detected reliably from textual content, extracting relations among them is more challenging, yet useful in various applications (e.g., news recommending systems). In this paper, we present a novel model and system for learning semantic relations among named entities from collections of news articles. We model each named entity occurrence with sparse structured logistic regression, and consider the words (predictors) to be grouped based on background semantics. This sparse group LASSO approach forces the weights of word groups that do not influence the prediction towards zero. The resulting sparse structure is utilized for defining the type and strength of relations. Our unsupervised system yields a named entities' network where each relation is typed, quantified, and characterized in context. These relations are the key to understanding news material over time and customizing newsfeeds for readers. Extensive evaluation of our system on articles from TIME magazine and BBC News shows that the learned relations correlate with static semantic relatedness measures like WLM, and capture the evolving relationships among named entities over time.
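The way a sparse group LASSO penalty drives whole word groups to zero can be seen in its proximal step: element-wise soft-thresholding followed by group-wise shrinkage. The sketch below is a generic illustration with unweighted groups; the penalty weights `lam1` and `lam2`, the toy word groups, and the function name are assumptions, and the abstract does not say which solver the authors use.

```python
import numpy as np

def sparse_group_prox(w, groups, lam1, lam2):
    """Proximal step for lam1 * ||w||_1 + lam2 * sum_g ||w_g||_2.

    w      : (n_features,) weight vector
    groups : list of index lists, one per word group (non-overlapping)
    """
    # Element-wise soft-thresholding (the L1 part).
    w = np.sign(w) * np.maximum(np.abs(w) - lam1, 0.0)
    out = np.zeros_like(w)
    for g in groups:
        norm = np.linalg.norm(w[g])
        if norm > lam2:                     # otherwise the whole group stays at zero
            out[g] = (1.0 - lam2 / norm) * w[g]
    return out

# Toy weights: a weak group (features 0-1) and a strong group (features 2-3).
w = np.array([0.05, -0.10, 2.00, -1.50])
groups = [[0, 1], [2, 3]]
w_new = sparse_group_prox(w, groups, lam1=0.02, lam2=0.2)
```

Here the weak group is removed entirely while the strong group is only shrunk, which matches the abstract's description of weights for uninformative word groups being forced towards zero.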
A topic clustering approach to finding similar questions from large question and answer archives.
Zhang, Wei-Nan; Liu, Ting; Yang, Yang; Cao, Liujuan; Zhang, Yu; Ji, Rongrong
2014-01-01
With the blooming of Web 2.0, Community Question Answering (CQA) services such as Yahoo! Answers (http://answers.yahoo.com), WikiAnswer (http://wiki.answers.com), and Baidu Zhidao (http://zhidao.baidu.com), etc., have emerged as alternatives for knowledge and information acquisition. Over time, a large number of question and answer (Q&A) pairs with high quality devoted by human intelligence have been accumulated as a comprehensive knowledge base. Unlike the search engines, which return long lists of results, searching in the CQA services can obtain the correct answers to the question queries by automatically finding similar questions that have already been answered by other users. Hence, it greatly improves the efficiency of the online information retrieval. However, given a question query, finding the similar and well-answered questions is a non-trivial task. The main challenge is the word mismatch between question query (query) and candidate question for retrieval (question). To investigate this problem, in this study, we capture the word semantic similarity between query and question by introducing the topic modeling approach. We then propose an unsupervised machine-learning approach to finding similar questions on CQA Q&A archives. The experimental results show that our proposed approach significantly outperforms the state-of-the-art methods.
Henriksson, Aron; Kvist, Maria; Dalianis, Hercules; Duneld, Martin
2015-10-01
For the purpose of post-marketing drug safety surveillance, which has traditionally relied on the voluntary reporting of individual cases of adverse drug events (ADEs), other sources of information are now being explored, including electronic health records (EHRs), which give us access to enormous amounts of longitudinal observations of the treatment of patients and their drug use. Adverse drug events, which can be encoded in EHRs with certain diagnosis codes, are, however, heavily underreported. It is therefore important to develop capabilities to process, by means of computational methods, the more unstructured EHR data in the form of clinical notes, where clinicians may describe and reason around suspected ADEs. In this study, we report on the creation of an annotated corpus of Swedish health records for the purpose of learning to identify information pertaining to ADEs present in clinical notes. To this end, three key tasks are tackled: recognizing relevant named entities (disorders, symptoms, drugs), labeling attributes of the recognized entities (negation, speculation, temporality), and relationships between them (indication, adverse drug event). For each of the three tasks, leveraging models of distributional semantics - i.e., unsupervised methods that exploit co-occurrence information to model, typically in vector space, the meaning of words - and, in particular, combinations of such models, is shown to improve the predictive performance. The ability to make use of such unsupervised methods is critical when faced with large amounts of sparse and high-dimensional data, especially in domains where annotated resources are scarce. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
On application of image analysis and natural language processing for music search
NASA Astrophysics Data System (ADS)
Gwardys, Grzegorz
2013-10-01
In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I defined a document as a music track. Each music track is transformed into a spectrogram, so that well-known techniques can be used to extract words from images. I used the SURF operation to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering here is equivalent to building a dictionary, so afterwards I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can run a query in the resulting vector space. The research was done on 16 music tracks for training and 336 for testing, which are split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
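To make the "spectrogram as text document" idea concrete, here is a minimal sketch of the TF-IDF step, assuming each track has already been reduced to a list of cluster ids (the "visual words" produced by k-means). The function name and the particular TF-IDF variant (relative term frequency times log inverse document frequency) are assumptions for illustration; the abstract does not say which weighting was used.

```python
import math
from collections import Counter


def tfidf(documents):
    """Compute TF-IDF weights for documents given as lists of word ids.

    Returns one dict per document mapping word id -> weight, where
    tf = count / document length and idf = ln(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each word once per document

    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return vectors


# Hypothetical example: two tracks described by visual-word ids.
tracks = [[0, 0, 1], [1, 2]]
weights = tfidf(tracks)
```

A word that occurs in every track (id 1 here) receives weight zero, which is why common spectrogram patterns contribute little to the similarity query.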
Finding Specification Pages from the Web
NASA Astrophysics Data System (ADS)
Yoshinaga, Naoki; Torisawa, Kentaro
This paper presents a method of finding a specification page on the Web for a given object (e.g., ``Ch. d'Yquem'') and its class label (e.g., ``wine''). A specification page for an object is a Web page which gives concise attribute-value information about the object (e.g., ``county''-``Sauternes'') in well formatted structures. A simple unsupervised method using layout and symbolic decoration cues was applied to a large number of the Web pages to acquire candidate attributes for each class (e.g., ``county'' for a class ``wine''). We then filter out irrelevant words from the putative attributes through an author-aware scoring function that we called site frequency. We used the acquired attributes to select a representative specification page for a given object from the Web pages retrieved by a normal search engine. Experimental results revealed that our system greatly outperformed the normal search engine in terms of this specification retrieval.
Disentangling the Lexicons of Disaster Response in Twitter
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hodas, Nathan O.; Ver Steeg, Greg; Harrison, Joshua J.
Abstract: People around the world use social media platforms such as Twitter heavily to express their opinion about various aspects of daily life. In the same way social media changes communication in daily life, it also is transforming the way individuals communicate during disasters and emergencies. Emergency officials have come to rely on social media to communicate alerts and updates. How do users communicate risk on social media? We used a novel information-theoretic unsupervised learning tool, CorEx, to extract and classify highly relevant words used by the public on Twitter during known emergencies, such as fires, explosions, and hurricanes. By utilizing the resulting classification strategy, authorities can use the derived language to craft more relevant risk communication to maximize the effectiveness of short-message broadcasts such as the Wireless Emergency Alerts (WEA) service.
NASA Astrophysics Data System (ADS)
Neuman, Yair; Cohen, Yochai; Israeli, Navot; Tamir, Boaz
2018-02-01
The availability of historical textual corpora has led to the study of words' frequency along the historical time line, as representing the public's focus of attention over time. However, the study of the dynamics of words' meaning is still in its infancy. In this paper, we propose a methodology for studying the historical trajectory of words' meaning through Tsallis entropy. First, we present the idea that the meaning of a word may be studied through the entropy of its embedding. Using two historical case studies, we show that this entropy measure is correlated with the intensity with which a word is used. More importantly, we show that using Tsallis entropy with a superadditive entropy index may provide a better estimation of a word's frequency of use than using Shannon entropy. We explain this finding as resulting from an increasing redundancy between the words that comprise the semantic field of the target word and develop a new measure of redundancy between words. Using this measure, which relies on the Tsallis version of the Kullback-Leibler divergence, we show that the evolving meaning of a word involves the dynamics of increasing redundancy between components of its semantic field. The proposed methodology may enrich the toolkit of researchers who study the dynamics of word senses.
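For readers unfamiliar with the quantity, the standard Tsallis entropy of a discrete distribution p with index q is S_q(p) = (1 - sum_i p_i^q) / (q - 1), which reduces to Shannon entropy as q approaches 1. The sketch below computes both and checks the pseudo-additivity rule for independent systems, S_q(A,B) = S_q(A) + S_q(B) + (1 - q) S_q(A) S_q(B), under which q < 1 gives a superadditive entropy. How the paper turns a word embedding into a probability distribution is not stated in the abstract, so the inputs here are plain probability lists and the function names are illustrative.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability outcomes are skipped."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def tsallis_entropy(probs, q):
    """Tsallis entropy with index q; falls back to Shannon entropy at q = 1."""
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(probs)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


# Two independent fair coins and their joint (uniform over four outcomes).
coin = [0.5, 0.5]
joint = [0.25] * 4
q = 0.5  # q < 1: the joint entropy exceeds the sum of the parts

s_coin = tsallis_entropy(coin, q)
s_joint = tsallis_entropy(joint, q)
pseudo_additive = 2 * s_coin + (1 - q) * s_coin * s_coin
```

With q = 0.5 the joint entropy equals the pseudo-additive expression and is strictly larger than twice the single-coin entropy, which is the sense of "superadditive" assumed in this sketch.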
On the clustering of multidimensional pictorial data
NASA Technical Reports Server (NTRS)
Bryant, J. D. (Principal Investigator)
1979-01-01
Obvious approaches to reducing the cost (in computer resources) of applying current clustering techniques to the problem of remote sensing are discussed. The use of spatial information in finding fields and in classifying mixture pixels is examined, and the AMOEBA clustering program is described. Internally a pattern recognition program, AMOEBA appears from the outside to be an unsupervised clustering program. It is fast and automatic. No choices (such as arbitrary thresholds to set split/combine sequences) need be made. The problem of finding the number of clusters is solved automatically. At the conclusion of the program, all points in the scene are classified; however, a provision is included for a reject classification of some points which, within the theoretical framework, cannot rationally be assigned to any cluster.
Mathematical morphology for automated analysis of remotely sensed objects in radar images
NASA Technical Reports Server (NTRS)
Daida, Jason M.; Vesecky, John F.
1991-01-01
A symbiosis of pyramidal segmentation and morphological transformation is described. The pyramidal segmentation portion of the symbiosis has resulted in a low (2.6 percent) misclassification error rate for a one-look simulation. Other simulations indicate lower error rates (1.8 percent for a four-look image). The morphological transformation portion has resulted in meaningful partitions with a minimal loss of fractal boundary information. An unpublished version of Thicken, suitable for watershed transformations of fractal objects, is also presented. It is demonstrated that the proposed symbiosis works with SAR (synthetic aperture radar) images: in this case, a four-look Seasat image of sea ice. It is concluded that the symbiotic forms of both segmentation and morphological transformation seem well suited for unsupervised geophysical analysis.
Hargreaves, Ian S; Pexman, Penny M
2014-05-01
According to several current frameworks, semantic processing involves an early influence of language-based information followed by later influences of object-based information (e.g., situated simulations; Santos, Chaigneau, Simmons, & Barsalou, 2011). In the present study we examined whether these predictions extend to the influence of semantic variables in visual word recognition. We investigated the time course of semantic richness effects in visual word recognition using a signal-to-respond (STR) paradigm fitted to a lexical decision (LDT) and a semantic categorization (SCT) task. We used linear mixed effects to examine the relative contributions of language-based (number of senses, ARC) and object-based (imageability, number of features, body-object interaction ratings) descriptions of semantic richness at four STR durations (75, 100, 200, and 400ms). Results showed an early influence of number of senses and ARC in the SCT. In both LDT and SCT, object-based effects were the last to influence participants' decision latencies. We interpret our results within a framework in which semantic processes are available to influence word recognition as a function of their availability over time, and of their relevance to task-specific demands. Copyright © 2014 Elsevier B.V. All rights reserved.
The embodied mind extended: using words as social tools
Borghi, Anna M.; Scorolli, Claudia; Caligiore, Daniele; Baldassarre, Gianluca; Tummolini, Luca
2013-01-01
The extended mind view and the embodied-grounded view of cognition and language are typically considered as rather independent perspectives. In this paper we propose a possible integration of the two views and support it proposing the idea of “Words As social Tools” (WAT). In this respect, we will propose that words, also due to their social and public character, can be conceived as quasi-external devices that extend our cognition. Moreover, words function like tools in that they enlarge the bodily space of action thus modifying our sense of body. To support our proposal, we review the relevant literature on tool-use and on words as tools and report recent evidence indicating that word use leads to an extension of space close to the body. In addition, we outline a model of the neural processes that may underpin bodily space extension via word use and may reflect possible effects on cognition of the use of words as external means. We also discuss how reconciling the two perspectives can help to overcome the limitations they encounter if considered independently. PMID:23641224
Improving Students’ Sense to Learn Language in Islamic Institution of Coastal Area Indonesia
NASA Astrophysics Data System (ADS)
Kuraedah, St.; Azaliah Mar, Nur; Gunawan, Fahmi
2018-05-01
This research examines ways to develop a sense of love for learning Arabic among students in Islamic higher education in Indonesia. The study is important because Arabic should be a favourite subject among students. In addition, Arabic is the language of the Al-Qur’an, and as such it is not, for Indonesians, a foreign language like other foreign languages. In fact, however, Arabic has become one of the subjects most dreaded by students, especially at the State Islamic Institute of Kendari. Arabic teachers therefore need tips and effort to make Arabic more interesting for students. The results show that one way to increase students’ motivation to learn Arabic is to develop their sense of love for Arabic. Teachers can do this by showing how easy Arabic is and how important it is as a language of religion and science, and by providing tips for learning the language. They can also explain words borrowed from Arabic into Indonesian that are used in daily conversation without speakers realizing it, and show the forms of word derivation in Arabic that can help to enrich Arabic vocabulary. Teachers should tell students that knowing one word in Arabic can lead to several vocabulary items with different meanings.
[Pre-verbality in focusing and the need for self check. An attempt at "focusing check"].
Masui, T; Ikemi, A; Murayama, S
1983-06-01
Though the Focusing process is not entirely non-verbal, in Focusing, careful attention is paid by the Focuser and the Listener to the pre-verbal experiential process. In other words, Focusing involves attending to the felt sense that is not easily expressed in words immediately. Hence, during the process of learning to Focus, the Focusing teacher attempts to communicate the experiences of Focusing to the student which are not easily done by words. Due to such difficulties, the Focusing student may (and quite frequently does) mistake the experiential process in Focusing with other processes. Often, the felt sense can be confused with other phenomena such as "autogenic discharge". Also the Focuser may not stay with the felt sense and drift into "free association" or frequently, certain processes in "meditation" can be confused with Focusing. Therefore, there is a need for a "check" by which the Focusing student can confirm the Focusing experience for himself. For the Focusing student, such a "check" serves not only to confirm the Focusing process, but also an aid to learning Focusing. We will report here a "Focusing Check" which we developed by translating Eugene Gendlin's "Focusing Check" and making several modifications in it so that it will be more understandable to the Japanese. Along with the "Focusing Check" we developed, the authors discuss the need for such a check.
Embree, Lindsay M; Budson, Andrew E; Ally, Brandon A
2012-07-01
Understanding how memory breaks down in the earliest stages of Alzheimer's disease (AD) process has significant implications, both clinically and with respect to intervention development. Previous work has highlighted a robust picture superiority effect in patients with amnestic mild cognitive impairment (aMCI). However, it remains unclear as to how pictures improve memory compared to words in this patient population. In the current study, we utilized receiver operating characteristic (ROC) curves to obtain estimates of familiarity and recollection for pictures and words in patients with aMCI and healthy older controls. Analysis of accuracy shows that even when performance is matched between pictures and words in the healthy control group, patients with aMCI continue to show a significant picture superiority effect. The results of the ROC analysis showed that patients demonstrated significantly impaired recollection and familiarity for words compared controls. In contrast, patients with aMCI demonstrated impaired recollection, but intact familiarity for pictures, compared to controls. Based on previous work from our lab, we speculate that patients can utilize the rich conceptual information provided by pictures to enhance familiarity, and perceptual information may allow for post-retrieval monitoring or verification of the enhanced sense of familiarity. Alternatively, the combination of enhanced conceptual and perceptual fluency of the test item might drive a stronger or more robust sense of familiarity that can be accurately attributed to a studied item. Copyright © 2012 Elsevier Ltd. All rights reserved.
Embree, Lindsay M.; Budson, Andrew E.; Ally, Brandon A.
2012-01-01
Understanding how memory breaks down in the earliest stages of the Alzheimer’s disease (AD) process has significant implications, both clinically and with respect to intervention development. Previous work has highlighted a robust picture superiority effect in patients with amnestic mild cognitive impairment (aMCI). However, it remains unclear as to how pictures improve memory compared to words in this patient population. In the current study, we utilized receiver operating characteristic (ROC) curves to obtain estimates of familiarity and recollection for pictures and words in patients with aMCI and healthy older controls. Analysis of accuracy shows that even when performance is matched between pictures and words in the healthy control group, patients with aMCI continue to show a significant picture superiority effect. The results of the ROC analysis showed that patients demonstrated significantly impaired recollection and familiarity for words compared controls. In contrast, patients with aMCI demonstrated impaired recollection, but intact familiarity for pictures, compared to controls. Based on previous work from our lab, we speculate that patients can utilize the rich conceptual information provided by pictures to enhance familiarity, and perceptual information may allow for post-retrieval monitoring or verification of the enhanced sense of familiarity. Alternatively, the combination of enhanced conceptual and perceptual fluency of the test item might drive a stronger or more robust sense of familiarity that can be accurately attributed to a studied item. PMID:22705441
... about 1 million people in the United States today suffer from aphasia. The type and severity of ... but cannot make sense of the words. (3) Global aphasia results from severe and extensive damage to ...
The Kinesthetic Speaker: Putting Action into Words.
ERIC Educational Resources Information Center
Moran, Nick
2001-01-01
Suggests that the "kinesthetic connection" is missing in today's speeches and presentations. Describes techniques for harnessing kinesthetic power and creating a sense of intimacy with the audience. (JOW)
Tsakpinoglou, Florence; Poulin, François
2017-10-01
Best friends exert a substantial influence on rising alcohol and marijuana use during adolescence. Two mechanisms occurring within friendship - friend pressure and unsupervised co-deviancy - may partially capture the way friends influence one another. The current study aims to: (1) examine the psychometric properties of a new instrument designed to assess pressure from a youth's best friend and unsupervised co-deviancy; (2) investigate the relative contribution of these processes to alcohol and marijuana use; and (3) determine whether gender moderates these associations. Data were collected through self-report questionnaires completed by 294 Canadian youths (62% female) across two time points (ages 15-16). Principal component analysis yielded a two-factor solution corresponding to friend pressure and unsupervised co-deviancy. Logistic regressions subsequently showed that unsupervised co-deviancy was predictive of an increase in marijuana use one year later. Neither process predicted an increase in alcohol use. Results did not differ as a function of gender. Copyright © 2017 The Foundation for Professionals in Services for Adolescents. Published by Elsevier Ltd. All rights reserved.
The affection of boreal forest changes on imbalance of Nature (Invited)
NASA Astrophysics Data System (ADS)
Tana, G.; Tateishi, R.
2013-12-01
Abstract: The balance of nature does not exist and perhaps never has existed [1]. In other words, Mother Nature is not balanced at all: it changes every moment and never returns to a previous condition. Because of this imbalance, the global climate has been changing gradually. To reveal the imbalance of nature, the dynamic changes of the Earth's surface need to be monitored. Forest cover and forest cover change have grown in importance as basic variables for modelling global biogeochemical cycles as well as climate [2]. The boreal area contains one third of the Earth's trees. These trees play a large part in limiting harmful greenhouse gases by absorbing much of the Earth's carbon dioxide (CO2) [3]. The boreal area mainly consists of needleleaf evergreen forest and needleleaf deciduous forest, both of which play important roles in the uptake of CO2. However, because the dormant period of needleleaf evergreen forest is shorter than that of needleleaf deciduous forest, needleleaf evergreen forest makes a greater contribution to the absorption of CO2. Because satellite sensors can observe the Earth continuously, they provide the opportunity to monitor its dynamic changes. In this study, we used MODerate resolution Imaging Spectroradiometer (MODIS) satellite data to monitor the dynamic change during 2003-2012 of the boreal forest area, which mainly consists of needleleaf evergreen forest and needleleaf deciduous forest. MODIS data from three years (2003, 2008 and 2012) were used to detect the areas of forest change. A hybrid change detection method, which combines a threshold method and an unsupervised classification method, was used to detect the changes in forest area.
In the first step, the differences in the Normalized Difference Vegetation Index (NDVI) among the three years were calculated and used to extract the changed areas by the threshold method. In the second step, the unsupervised classification method was used to classify and analyze the changed areas detected in the first step. Finally, the changed areas were validated using the training data collected for the three years. The validation result revealed that the forest in the study area underwent changes in both area and type during 2003-2012. The detailed procedure will be presented in the meeting. References: [1] Elton, C.S. (1930). Animal Ecology and Evolution. New York, Oxford University Press. [2] Potapov, P., Hansen, M. C., Stehman, S. V., Loveland, T. R., Pittman, K. (2008). Combining MODIS and Landsat imagery to estimate and map boreal forest cover loss, Remote Sensing of Environment, 112, 3708-3719. [3] Houghton, R. A. (2003). Why are estimates of the terrestrial carbon balance so different? Global Change Biology, 9, 500-509.
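The first step of the hybrid method can be sketched as follows, using the standard definition NDVI = (NIR - Red) / (NIR + Red) and flagging a pixel as changed when its NDVI differs between two dates by more than a threshold. The threshold value of 0.2, the function names, and the toy reflectance values are hypothetical; the abstract does not report the threshold used.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def changed_pixels(before, after, threshold=0.2):
    """Flag pixels whose absolute NDVI difference exceeds the threshold.

    `before` and `after` are equal-length lists of (nir, red) reflectance
    pairs for the same pixels at two dates. The threshold is an assumed value.
    """
    return [
        abs(ndvi(*a) - ndvi(*b)) > threshold
        for b, a in zip(before, after)
    ]


# Hypothetical two-pixel scene: the first pixel loses vegetation, the second does not.
year_2003 = [(0.5, 0.1), (0.5, 0.1)]
year_2012 = [(0.3, 0.2), (0.5, 0.1)]
mask = changed_pixels(year_2003, year_2012)
```

In the workflow described above, only the pixels flagged in such a mask would be passed on to the unsupervised classification in the second step.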
NASA Technical Reports Server (NTRS)
Brumfield, J. O.; Bloemer, H. H. L.; Campbell, W. J.
1981-01-01
Two unsupervised classification procedures for analyzing Landsat data used to monitor land reclamation in a surface mining area in east central Ohio are compared for agreement with data collected from the corresponding locations on the ground. One procedure is based on a traditional unsupervised-clustering/maximum-likelihood algorithm sequence that assumes spectral groupings in the Landsat data in n-dimensional space; the other is based on a nontraditional unsupervised-clustering/canonical-transformation/clustering algorithm sequence that not only assumes spectral groupings in n-dimensional space but also includes an additional feature-extraction technique. It is found that the nontraditional procedure provides an appreciable improvement in spectral groupings and apparently increases the level of accuracy in the classification of land cover categories.
Cleary, Anne M; Claxton, Alexander B
2015-09-01
This study shows that the presence of a tip-of-the-tongue (TOT) state--the sense that a word is in memory when its retrieval fails--is used as a heuristic for inferring that an inaccessible word has characteristics that are consistent with greater word perceptibility. When reporting a TOT state, people judged an unretrieved word as more likely to have previously appeared darker and clearer (Experiment 1a), and larger (Experiment 1b). They also judged an unretrieved word as more likely to be a high frequency word (Experiment 2). This was not because greater fluency or word perceptibility at encoding led to later TOT states: Increased fluency or perceptibility of a word at encoding did not increase the likelihood of a TOT state for it when its retrieval later failed; moreover, the TOT state was not diagnostic of an unretrieved word's fluency or perceptibility when it was last seen. Results instead suggest that TOT states themselves are used as a heuristic for inferring the likely characteristics of unretrieved words. During the uncertainty of retrieval failure, TOT states are a source of information on which people rely in reasoning about the likely characteristics of the unretrieved information, choosing characteristics that are consistent with greater fluency of processing. (c) 2015 APA, all rights reserved.
Compositional symbol grounding for motor patterns.
Greco, Alberto; Caneva, Claudio
2010-01-01
We developed a new experimental and simulative paradigm to study the establishment of compositional grounded representations for motor patterns. Participants learned to associate nonsense arm motor patterns, performed in three different hand postures, with nonsense words. There were two group conditions: in the first (compositional), each pattern was associated with a two-word (verb-adverb) sentence; in the second (holistic), each pattern was associated with a unique word. Two experiments were performed. In the first, motor pattern recognition and naming were tested in the two conditions. Results showed that verbal compositionality had no role in recognition and that the main source of confusability in this task came from discriminating hand postures. As the naming task proved too difficult, some changes in the learning procedure were implemented in the second experiment. In this experiment, the compositional group achieved better results in naming motor patterns, especially for patterns where hand posture discrimination was relevant. In order to ascertain the differential effects of memory load and of systematic grounding on this result, neural network simulations were also run. After a basic simulation that worked as a good model of subjects' performance, in subsequent simulations the number of stimuli (motor patterns and words) was increased and the systematic association between words and patterns was disrupted, while keeping the same number of words and syntax. Results showed that in both conditions the advantage for the compositional condition significantly increased. These simulations showed that the advantage for this condition may be related more to systematicity than to the mere informational gain. All results are discussed in connection with possible support for the hypothesis of a compositional motor representation and toward a more precise explanation of the factors that make compositional representations work.
Myint, S.W.; Yuan, M.; Cerveny, R.S.; Giri, C.P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. © 2008 by MDPI.
Myint, Soe W.; Yuan, May; Cerveny, Randall S.; Giri, Chandra P.
2008-01-01
Remote sensing techniques have been shown effective for large-scale damage surveys after a hazardous event in both near real-time or post-event analyses. The paper aims to compare accuracy of common imaging processing techniques to detect tornado damage tracks from Landsat TM data. We employed the direct change detection approach using two sets of images acquired before and after the tornado event to produce a principal component composite images and a set of image difference bands. Techniques in the comparison include supervised classification, unsupervised classification, and object-oriented classification approach with a nearest neighbor classifier. Accuracy assessment is based on Kappa coefficient calculated from error matrices which cross tabulate correctly identified cells on the TM image and commission and omission errors in the result. Overall, the Object-oriented Approach exhibits the highest degree of accuracy in tornado damage detection. PCA and Image Differencing methods show comparable outcomes. While selected PCs can improve detection accuracy 5 to 10%, the Object-oriented Approach performs significantly better with 15-20% higher accuracy than the other two techniques. PMID:27879757
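The Kappa coefficient mentioned above can be computed from an error (confusion) matrix as kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement (the diagonal share) and p_e is the agreement expected by chance from the row and column totals. The sketch below uses a made-up 2x2 matrix; the function name and the numbers are illustrative and are not results from the paper.

```python
def kappa_from_error_matrix(matrix):
    """Cohen's Kappa from a square error matrix.

    matrix[i][j] is the number of cells with reference class i that
    were assigned to class j by the classifier.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix)  # row total * column total
        for i in range(n)
    ) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical damage / no-damage error matrix with 100 reference cells.
example = [
    [40, 10],  # reference "damage": 40 correct, 10 omitted
    [5, 45],   # reference "no damage": 5 committed, 45 correct
]
kappa = kappa_from_error_matrix(example)
```

Here the overall accuracy is 0.85 but Kappa is lower, because half of that agreement would be expected by chance with these class totals.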
Building damage assessment using airborne lidar
NASA Astrophysics Data System (ADS)
Axel, Colin; van Aardt, Jan
2017-10-01
The assessment of building damage following a natural disaster is a crucial step in determining the impact of the event itself and gauging reconstruction needs. Automatic methods for deriving damage maps from remotely sensed data are preferred, since they are regarded as being rapid and objective. We propose an algorithm for performing unsupervised building segmentation and damage assessment using airborne light detection and ranging (lidar) data. Local surface properties, including normal vectors and curvature, were used along with region growing to segment individual buildings in lidar point clouds. Damaged building candidates were identified based on rooftop inclination angle, and then damage was assessed using planarity and point height metrics. Validation of the building segmentation and damage assessment techniques were performed using airborne lidar data collected after the Haiti earthquake of 2010. Building segmentation and damage assessment accuracies of 93.8% and 78.9%, respectively, were obtained using lidar point clouds and expert damage assessments of 1953 buildings in heavily damaged regions. We believe this research presents an indication of the utility of airborne lidar remote sensing for increasing the efficiency and speed at which emergency response operations are performed.
Exploiting domain information for Word Sense Disambiguation of medical documents.
Stevenson, Mark; Agirre, Eneko; Soroa, Aitor
2012-01-01
Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency.
Exploiting domain information for Word Sense Disambiguation of medical documents
Agirre, Eneko; Soroa, Aitor
2011-01-01
Objective Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears. Design The authors proposed and implemented several methods to extract lists of key terms associated with Medical Subject Heading terms. These key terms are used to represent the document topic in a knowledge-based WSD system. They are applied both alone and in combination with local context. Measurements A standard measure of accuracy was calculated over the set of target words in the widely used National Library of Medicine WSD dataset. Results and discussion The authors report a significant improvement when combining those key terms with local context, showing that domain information improves the results of a WSD system based on the Unified Medical Language System Metathesaurus alone. The best results were obtained using key terms obtained by relevance feedback and weighted by inverse document frequency. PMID:21900701
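The inverse document frequency weighting of key terms mentioned above can be sketched as follows. This is only an illustration of the textbook IDF formula, log(N / df); the function name `idf_weights` and the toy documents are hypothetical, and the authors' actual term extraction and relevance feedback steps are not reproduced here.

```python
import math


def idf_weights(documents):
    """Return {term: log(N / document_frequency)} for a list of tokenized documents."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        # Count each term once per document.
        for term in set(doc):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


# Hypothetical tokenized documents containing the ambiguous term "cold".
docs = [
    ["cold", "virus"],
    ["cold", "temperature"],
    ["cold", "virus", "fever"],
]
weights = idf_weights(docs)
print(weights["cold"], weights["fever"])
```

A term that occurs in every document gets weight zero, so it contributes nothing to the topic representation, while rarer terms are weighted more heavily.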
NASA Astrophysics Data System (ADS)
Watanabe, W. M.; Candido, A.; Amâncio, M. A.; De Oliveira, M.; Pardo, T. A. S.; Fortes, R. P. M.; Aluísio, S. M.
2010-12-01
This paper presents an approach for assisting low-literacy readers in accessing Web online information. The "Educational FACILITA" tool is a Web content adaptation tool that provides innovative features and follows more intuitive interaction models regarding accessibility concerns. Especially, we propose an interaction model and a Web application that explore the natural language processing tasks of lexical elaboration and named entity labeling for improving Web accessibility. We report on the results obtained from a pilot study on usability analysis carried out with low-literacy users. The preliminary results show that "Educational FACILITA" improves the comprehension of text elements, although the assistance mechanisms might also confuse users when word sense ambiguity is introduced, by gathering, for a complex word, a list of synonyms with multiple meanings. This fact evokes a future solution in which the correct sense for a complex word in a sentence is identified, solving this pervasive characteristic of natural languages. The pilot study also identified that experienced computer users find the tool to be more useful than novice computer users do.
PI2GIS: processing image to geographical information systems, a learning tool for QGIS
NASA Astrophysics Data System (ADS)
Correia, R.; Teodoro, A.; Duarte, L.
2017-10-01
To perform an accurate interpretation of remote sensing images, it is necessary to extract information using different image processing techniques. Nowadays, it has become usual to use image processing plugins to add new capabilities/functionalities integrated in Geographical Information System (GIS) software. The aim of this work was to develop an open source application to automatically process and classify remote sensing images from a set of satellite input data. The application was integrated in a GIS software (QGIS), automating several image processing steps. The use of QGIS for this purpose is justified since it is easy and quick to develop new plugins using the Python language. This plugin is inspired by the Semi-Automatic Classification Plugin (SCP) developed by Luca Congedo. SCP allows the supervised classification of remote sensing images, the calculation of vegetation indices such as NDVI (Normalized Difference Vegetation Index) and EVI (Enhanced Vegetation Index), and other image processing operations. When analysing SCP, it was realized that a set of operations that are very useful in teaching classes of remote sensing and image processing tasks was lacking, such as the visualization of histograms, the application of filters, different image corrections, unsupervised classification and the computation of several environmental indices. The new set of operations included in the PI2GIS plugin can be divided into three groups: pre-processing, processing, and classification procedures. The application was tested on a Landsat 8 OLI image of a northern area of Portugal.
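As a small illustration of the vegetation indices named above, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red). The sketch below assumes reflectance values already extracted from the bands; the function name `ndvi` and the sample values are hypothetical and do not come from the plugin's code.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel of reflectance values."""
    denominator = nir + red
    if denominator == 0:
        # Avoid division by zero for empty (no-data) pixels.
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectances; for Landsat 8 OLI, red is band 4 and NIR is band 5.
pixels = [(0.5, 0.1), (0.2, 0.2), (0.0, 0.0)]
values = [ndvi(nir, red) for nir, red in pixels]
print(values)
```

Values close to 1 suggest dense green vegetation, and values near or below 0 suggest bare soil, water, or built surfaces.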
An emergentist perspective on the origin of number sense
2018-01-01
The finding that human infants and many other animal species are sensitive to numerical quantity has been widely interpreted as evidence for evolved, biologically determined numerical capacities across unrelated species, thereby supporting a ‘nativist’ stance on the origin of number sense. Here, we tackle this issue within the ‘emergentist’ perspective provided by artificial neural network models, and we build on computer simulations to discuss two different approaches to think about the innateness of number sense. The first, illustrated by artificial life simulations, shows that numerical abilities can be supported by domain-specific representations emerging from evolutionary pressure. The second assumes that numerical representations need not be genetically pre-determined but can emerge from the interplay between innate architectural constraints and domain-general learning mechanisms, instantiated in deep learning simulations. We show that deep neural networks endowed with basic visuospatial processing exhibit a remarkable performance in numerosity discrimination before any experience-dependent learning, whereas unsupervised sensory experience with visual sets leads to subsequent improvement of number acuity and reduces the influence of continuous visual cues. The emergent neuronal code for numbers in the model includes both numerosity-sensitive (summation coding) and numerosity-selective response profiles, closely mirroring those found in monkey intraparietal neurons. We conclude that a form of innatism based on architectural and learning biases is a fruitful approach to understanding the origin and development of number sense. This article is part of a discussion meeting issue ‘The origins of numerical abilities’. PMID:29292348
The emotional counting Stroop: a task for assessing emotional interference during brain imaging.
Whalen, Paul J; Bush, George; Shin, Lisa M; Rauch, Scott L
2006-01-01
The emotional counting Stroop (ecStroop) is an emotional variant of the counting Stroop. Both of these tasks require a motor response instead of a spoken response for the purpose of minimizing head movement during functional MRI (fMRI). During this task, subjects report, by button press, the number of words (1-4) that appear on a screen, regardless of word meaning. Neutral word-control trials contain common words (e.g., 'cabinet' written three times), while interference trials contain emotional words (e.g., 'murder' written three times). The degree to which this task represents a true 'Stroop' interference task, in the sense that emotional words will increase motor-response times compared with neutral words, depends upon the subjects of the study and the words that are presented. Much research on the emotional Stroop task demonstrates that interference effects are observed in psychopathological groups in response to words that are specific to their disorder, and in normal subjects when the words are related to current concerns endorsed by them. The ecStroop task described here will produce reaction time-interference effects that are comparable to the traditional color-naming emotional Stroop. This protocol can be completed in approximately 20 min per subject. The protocol described here employs neutral words and emotional words that include general-negative words, as well as words specific to combat-related trauma. However, this protocol is amenable to any emotional word lists.
Supervised versus unsupervised categorization: two sides of the same coin?
Pothos, Emmanuel M; Edwards, Darren J; Perlman, Amotz
2011-09-01
Supervised and unsupervised categorization have been studied in separate research traditions. A handful of studies have attempted to explore a possible convergence between the two. The present research builds on these studies, by comparing the unsupervised categorization results of Pothos et al. (2011; Pothos et al., 2008) with the results from two procedures of supervised categorization. In two experiments, we tested 375 participants with nine different stimulus sets and examined the relation between ease of learning of a classification, memory for a classification, and spontaneous preference for a classification. After taking into account the role of the number of category labels (clusters) in supervised learning, we found the three variables to be closely associated with each other. Our results provide encouragement for researchers seeking unified theoretical explanations for supervised and unsupervised categorization, but raise a range of challenging theoretical questions.
Unsupervised automated high throughput phenotyping of RNAi time-lapse movies.
Failmezger, Henrik; Fröhlich, Holger; Tresch, Achim
2013-10-04
Gene perturbation experiments in combination with fluorescence time-lapse cell imaging are a powerful tool in reverse genetics. High content applications require tools for the automated processing of the large amounts of data. These tools include in general several image processing steps, the extraction of morphological descriptors, and the grouping of cells into phenotype classes according to their descriptors. This phenotyping can be applied in a supervised or an unsupervised manner. Unsupervised methods are suitable for the discovery of formerly unknown phenotypes, which are expected to occur in high-throughput RNAi time-lapse screens. We developed an unsupervised phenotyping approach based on Hidden Markov Models (HMMs) with multivariate Gaussian emissions for the detection of knockdown-specific phenotypes in RNAi time-lapse movies. The automated detection of abnormal cell morphologies allows us to assign a phenotypic fingerprint to each gene knockdown. By applying our method to the Mitocheck database, we show that a phenotypic fingerprint is indicative of a gene's function. Our fully unsupervised HMM-based phenotyping is able to automatically identify cell morphologies that are specific for a certain knockdown. Beyond the identification of genes whose knockdown affects cell morphology, phenotypic fingerprints can be used to find modules of functionally related genes.
Unsupervised learning on scientific ocean drilling datasets from the South China Sea
NASA Astrophysics Data System (ADS)
Tse, Kevin C.; Chiu, Hon-Chim; Tsang, Man-Yin; Li, Yiliang; Lam, Edmund Y.
2018-06-01
Unsupervised learning methods were applied to explore data patterns in multivariate geophysical datasets collected from ocean floor sediment core samples coming from scientific ocean drilling in the South China Sea. Compared to studies on similar datasets, but using supervised learning methods which are designed to make predictions based on sample training data, unsupervised learning methods require no a priori information and focus only on the input data. In this study, popular unsupervised learning methods including K-means, self-organizing maps, hierarchical clustering and random forest were coupled with different distance metrics to form exploratory data clusters. The resulting data clusters were externally validated with lithologic units and geologic time scales assigned to the datasets by conventional methods. Compact and connected data clusters displayed varying degrees of correspondence with existing classification by lithologic units and geologic time scales. K-means and self-organizing maps were observed to perform better with lithologic units while random forest corresponded best with geologic time scales. This study sets a pioneering example of how unsupervised machine learning methods can be used as an automatic processing tool for the increasingly high volume of scientific ocean drilling data.
An Efficient Optimization Method for Solving Unsupervised Data Classification Problems.
Shabanzadeh, Parvaneh; Yusof, Rubiyah
2015-01-01
Unsupervised data classification (or clustering) analysis is one of the most useful tools and a descriptive task in data mining that seeks to classify homogeneous groups of objects based on similarity and is used in many medical disciplines and various applications. In general, there is no single algorithm that is suitable for all types of data, conditions, and applications. Each algorithm has its own advantages, limitations, and deficiencies. Hence, research into novel and effective approaches for unsupervised data classification is still active. In this paper a heuristic algorithm, the Biogeography-Based Optimization (BBO) algorithm, which is inspired by the natural biogeography distribution of different species, was adapted for data clustering problems by modifying its main operators. Similar to other population-based algorithms, the BBO algorithm starts with an initial population of candidate solutions to an optimization problem and an objective function that is calculated for them. To evaluate the performance of the proposed algorithm, an assessment was carried out on six medical and real-life datasets, and the algorithm was compared with eight well-known and recent unsupervised data classification algorithms. Numerical results demonstrate that the proposed evolutionary optimization algorithm is efficient for unsupervised data classification.
Semi-supervised and unsupervised extreme learning machines.
Huang, Gao; Song, Shiji; Gupta, Jatinder N D; Wu, Cheng
2014-12-01
Extreme learning machines (ELMs) have proven to be efficient and effective learning mechanisms for pattern classification and regression. However, ELMs are primarily applied to supervised learning problems. Only a few existing research papers have used ELMs to explore unlabeled data. In this paper, we extend ELMs for both semi-supervised and unsupervised tasks based on the manifold regularization, thus greatly expanding the applicability of ELMs. The key advantages of the proposed algorithms are as follows: 1) both the semi-supervised ELM (SS-ELM) and the unsupervised ELM (US-ELM) exhibit learning capability and computational efficiency of ELMs; 2) both algorithms naturally handle multiclass classification or multicluster clustering; and 3) both algorithms are inductive and can handle unseen data at test time directly. Moreover, it is shown in this paper that all the supervised, semi-supervised, and unsupervised ELMs can actually be put into a unified framework. This provides new perspectives for understanding the mechanism of random feature mapping, which is the key concept in ELM theory. Empirical study on a wide range of data sets demonstrates that the proposed algorithms are competitive with the state-of-the-art semi-supervised or unsupervised learning algorithms in terms of accuracy and efficiency.
Unsupervised chunking based on graph propagation from bilingual corpus.
Zhu, Ling; Wong, Derek F; Chao, Lidia S
2014-01-01
This paper presents a novel approach for an unsupervised shallow parsing model trained on the unannotated Chinese text of a parallel Chinese-English corpus. In this approach, no information of the Chinese side is applied. The exploitation of graph-based label propagation for bilingual knowledge transfer, along with the use of the projected labels as features in the unsupervised model, contributes to better performance. The experimental comparisons with the state-of-the-art algorithms show that the proposed approach is able to achieve impressively higher accuracy in terms of F-score.
Unsupervised classification of earth resources data.
NASA Technical Reports Server (NTRS)
Su, M. Y.; Jayroe, R. R., Jr.; Cummings, R. E.
1972-01-01
A new clustering technique is presented. It consists of two parts: (a) a sequential statistical clustering which is essentially a sequential variance analysis and (b) a generalized K-means clustering. In this composite clustering technique, the output of (a) is a set of initial clusters which are input to (b) for further improvement by an iterative scheme. This unsupervised composite technique was employed for automatic classification of two sets of remote multispectral earth resource observations. The classification accuracy of the unsupervised technique is found to be comparable to that of the existing supervised maximum likelihood classification technique.
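The idea of refining a set of initial clusters by an iterative scheme can be sketched with plain K-means seeded by given centroids. This is an assumption-laden illustration: it uses ordinary Euclidean K-means rather than the report's generalized variant, and the function name `kmeans_refine`, the seed centroids, and the points are all hypothetical.

```python
def kmeans_refine(points, centroids, iterations=10):
    """Refine initial centroids with standard K-means (squared Euclidean distance)."""
    centroids = [tuple(c) for c in centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        # Assignment step: each point joins its nearest centroid.
        for point in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(point, centroid))
                for centroid in centroids
            ]
            clusters[distances.index(min(distances))].append(point)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]
            for k, members in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical two-band observations and seed centroids from a first-pass clustering.
observations = [(0, 0), (0, 2), (10, 10), (10, 12)]
seeds = [(1, 1), (9, 9)]
final_centroids, final_clusters = kmeans_refine(observations, seeds)
print(final_centroids)
```

Seeding from a first-pass clustering, as the abstract describes for its part (a), avoids the dependence on random initialization that plain K-means otherwise has.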
Sustained meaning activation for polysemous but not homonymous words: evidence from EEG.
MacGregor, Lucy J; Bouwsema, Jennifer; Klepousniotou, Ekaterini
2015-02-01
Theoretical linguistic accounts of lexical ambiguity distinguish between homonymy, where words that share a lexical form have unrelated meanings, and polysemy, where the meanings are related. The present study explored the psychological reality of this theoretical assumption by asking whether there is evidence that homonyms and polysemes are represented and processed differently in the brain. We investigated the time-course of meaning activation of different types of ambiguous words using EEG. Homonyms and polysemes were each further subdivided into two: unbalanced homonyms (e.g., "coach") and balanced homonyms (e.g., "match"); metaphorical polysemes (e.g., "mouth") and metonymic polysemes (e.g., "rabbit"). These four types of ambiguous words were presented as primes in a visual single-word priming delayed lexical decision task employing a long ISI (750 ms). Targets were related to one of the meanings of the primes, or were unrelated. ERPs formed relative to the target onset indicated that the theoretical distinction between homonymy and polysemy was reflected in the N400 brain response. For targets following homonymous primes (both unbalanced and balanced), no effects survived at this long ISI indicating that both meanings of the prime had already decayed. On the other hand, for polysemous primes (both metaphorical and metonymic), activation was observed for both dominant and subordinate senses. The observed processing differences between homonymy and polysemy provide evidence in support of differential neuro-cognitive representations for the two types of ambiguity. We argue that the polysemous senses act collaboratively to strengthen the representation, facilitating maintenance, while the competitive nature of homonymous meanings leads to decay. Copyright © 2015 Elsevier Ltd. All rights reserved.
Assigning clinical codes with data-driven concept representation on Dutch clinical free text.
Scheurwegs, Elyne; Luyckx, Kim; Luyten, Léon; Goethals, Bart; Daelemans, Walter
2017-05-01
Clinical codes are used for public reporting purposes, are fundamental to determining public financing for hospitals, and form the basis for reimbursement claims to insurance providers. They are assigned to a patient stay to reflect the diagnosis and performed procedures during that stay. This paper aims to enrich algorithms for automated clinical coding by taking a data-driven approach and by using unsupervised and semi-supervised techniques for the extraction of multi-word expressions that convey a generalisable medical meaning (referred to as concepts). Several methods for extracting concepts from text are compared, two of which are constructed from a large unannotated corpus of clinical free text. A distributional semantic model (i.c. the word2vec skip-gram model) is used to generalize over concepts and retrieve relations between them. These methods are validated on three sets of patient stay data, in the disease areas of urology, cardiology, and gastroenterology. The datasets are in Dutch, which introduces a limitation on available concept definitions from expert-based ontologies (e.g. UMLS). The results show that when expert-based knowledge in ontologies is unavailable, concepts derived from raw clinical texts are a reliable alternative. Both concepts derived from raw clinical texts and concepts derived from expert-created dictionaries outperform a bag-of-words approach in clinical code assignment. Adding features based on tokens that appear in a semantically similar context has a positive influence on predicting diagnostic codes. Furthermore, the experiments indicate that a distributional semantics model can find relations between semantically related concepts in texts but also introduces erroneous and redundant relations, which can undermine clinical coding performance. Copyright © 2017. Published by Elsevier Inc.
Lexicon-enhanced sentiment analysis framework using rule-based classification scheme.
Asghar, Muhammad Zubair; Khan, Aurangzeb; Ahmad, Shakeel; Qasim, Maria; Khan, Imran Ali
2017-01-01
With the rapid increase in social networks and blogs, social media services are increasingly being used by online communities to share their views and experiences about a particular product, policy or event. Due to the economic importance of these reviews, there is a growing trend of writing user reviews to promote a product. Nowadays, users prefer online blogs and review sites when purchasing products. Therefore, user reviews are considered an important source of information in Sentiment Analysis (SA) applications for decision making. In this work, we exploit the wealth of user reviews, available through online forums, to analyze the semantic orientation of words by categorizing them into +ive and -ive classes to identify and classify emoticons, modifiers, general-purpose and domain-specific words expressed in the public's feedback about the products. However, the unsupervised learning approach employed in previous studies is becoming less efficient due to data sparseness and low accuracy caused by the non-consideration of emoticons, modifiers, and domain-specific words, which may result in inaccurate classification of users' reviews. Lexicon-enhanced sentiment analysis based on a rule-based classification scheme is an alternative approach for improving sentiment classification of users' reviews in online communities. In addition to the sentiment terms used in general-purpose sentiment analysis, we integrate emoticons, modifiers and domain-specific terms to analyze the reviews posted in online communities. To test the effectiveness of the proposed method, we considered users' reviews in three domains. The results obtained from different experiments demonstrate that the proposed method overcomes limitations of previous methods and that the performance of the sentiment analysis is improved after considering emoticons, modifiers, negations, and domain-specific terms when compared to baseline methods.
1988-02-13
Peter Kreeft, writing from a Christian point of view, explores what the problem of [Illegible Word] is and how it can be understood. He describes his book as a 'journey' in which he covers 'ten easy answers' to the problem of evil - including atheism, scientism, dualism and satanism.
Literature review of the remote sensing of natural resources. [bibliography
NASA Technical Reports Server (NTRS)
Fears, C. B. (Editor); Inglis, M. H. (Editor)
1977-01-01
Abstracts of 596 documents related to remote sensors or the remote sensing of natural resources by satellite, aircraft, or ground-based stations are presented. Topics covered include general theory, geology and hydrology, agriculture and forestry, marine sciences, urban land use, and instrumentation. Recent documents not yet cited in any of the seven information sources used for the compilation are summarized. An author/key word index is provided.
Attachment anxiety benefits from security priming: Evidence from working memory performance
2018-01-01
The present study investigates the relationship between the attachment dimensions (anxious vs. avoidance) and the cognitive performance of individuals, specifically whether the attachment dimensions would predict the working memory (WM) performance. In the n-back task, reflecting the WM capacity, both attachment related and non-attachment related words were used. Participants were randomly assigned into two groups that received either the secure or the neutral subliminal priming. In the secure priming condition, the aim was to induce sense of security by presenting secure attachment words prior to the n-back task performance. In neutral priming condition, neutral words that did not elicit sense of security were presented. Structural equation modeling revealed divergent patterns for attachment anxiety and avoidance dimensions under the different priming conditions. In neutral priming condition, WM performance declined in terms of capacity in the n-back task for individuals who rated higher levels of attachment anxiety. However in the secure priming condition, WM performance was boosted in the n-back task for individuals who rated higher levels of attachment anxiety. In other words, the subliminal priming of the security led to increased WM capacity of individuals who rated higher levels of attachment anxiety. This effect, however, was not observed for higher levels of attachment avoidance. Results are discussed along the lines of hyperactivation and deactivation strategies of the attachment system. PMID:29522549
"A Camel in the Harbor": Poetry and Prediction.
ERIC Educational Resources Information Center
Kantor, Kenneth J.
1978-01-01
In order to help children appreciate poetic surprises, teachers should select poems which appeal to the predictive sense by means of such devices as rhyming words, thematic twists, and surprise endings. (DD)
Leslie, Toby; Rab, Mohammad Abdur; Ahmadzai, Hayat; Durrani, Naeem; Fayaz, Mohammad; Kolaczinski, Jan; Rowland, Mark
2004-03-01
The only available treatment that can eliminate the latent hypnozoite reservoir of vivax malaria is a 14 d course of primaquine (PQ). A potential problem with long-course chemotherapy is the issue of compliance after clinical symptoms have subsided. The present study, carried out at an Afghan refugee camp in Pakistan, between June 2000 and August 2001, compared 14 d treatment in supervised and unsupervised groups in which compliance was monitored by comparison of relapse rates. Clinical cases recruited by passive case detection were randomised by family to placebo, supervised, or unsupervised groups, and treated with chloroquine (25 mg/kg) over 3 days to eliminate erythrocytic stages. Individuals with glucose-6-phosphate dehydrogenase (G6PD) deficiency were excluded from the trial. Cases allocated to supervision were given directly observed treatment (0.25 mg PQ/kg body weight) once per day for 14 days. Cases allocated to the unsupervised group were provided with 14 PQ doses upon enrollment and strongly advised to complete the course. A total of 595 cases were enrolled. After 9 months of follow up PQ proved equally protective against further episodes of P. vivax in supervised (odds ratio 0.35, 95% CI 0.21-0.57) and unsupervised (odds ratio 0.37, 95% CI 0.23-0.59) groups as compared to placebo. All age groups on supervised or unsupervised treatment showed a similar degree of protection even though the risk of relapse decreased with age. The study showed that a presumed problem of poor compliance may be overcome with simple health messages even when the majority of individuals are illiterate and without formal education. Unsupervised treatment with 14-day PQ when combined with simple instruction can avert a significant amount of the morbidity associated with relapse in populations where G6PD deficiency is either absent or readily diagnosable.
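The odds ratios and 95% confidence intervals reported above can be related to a 2x2 table with the usual log-odds (Woolf) approximation. The sketch below is only an unadjusted illustration: the counts are hypothetical, the function name `odds_ratio_ci` is invented here, and the trial's own estimates may have been adjusted (for example, for randomisation by family), which this sketch does not attempt.

```python
import math


def odds_ratio_ci(events_treated, no_events_treated, events_control, no_events_control, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval on the log scale."""
    odds_ratio = (events_treated * no_events_control) / (no_events_treated * events_control)
    standard_error = math.sqrt(
        1 / events_treated + 1 / no_events_treated
        + 1 / events_control + 1 / no_events_control
    )
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Hypothetical counts: relapse / no relapse in a treated group and a placebo group.
estimate, low, high = odds_ratio_ci(20, 80, 40, 60)
print(round(estimate, 3), round(low, 2), round(high, 2))
```

An interval that lies entirely below 1 indicates fewer relapses in the treated group than in the placebo group at the chosen confidence level.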
True Zero-Training Brain-Computer Interfacing – An Online Study
Kindermans, Pieter-Jan; Schreuder, Martijn; Schrauwen, Benjamin; Müller, Klaus-Robert; Tangermann, Michael
2014-01-01
Despite several approaches to realize subject-to-subject transfer of pre-trained classifiers, the full performance of a Brain-Computer Interface (BCI) for a novel user can only be reached by presenting the BCI system with data from the novel user. In typical state-of-the-art BCI systems with a supervised classifier, the labeled data is collected during a calibration recording, in which the user is asked to perform a specific task. Based on the known labels of this recording, the BCI's classifier can learn to decode the individual's brain signals. Unfortunately, this calibration recording consumes valuable time. Furthermore, it is unproductive with respect to the final BCI application, e.g. text entry. Therefore, the calibration period must be reduced to a minimum, which is especially important for patients with a limited concentration ability. The main contribution of this manuscript is an online study on unsupervised learning in an auditory event-related potential (ERP) paradigm. Our results demonstrate that the calibration recording can be bypassed by utilizing an unsupervised trained classifier that is initialized randomly and updated during usage. Initially, the unsupervised classifier tends to make decoding mistakes, as the classifier might not have seen enough data to build a reliable model. Using a constant re-analysis of the previously spelled symbols, these initially misspelled symbols can be rectified posthoc when the classifier has learned to decode the signals. We compare the spelling performance of our unsupervised approach and of the unsupervised posthoc approach to the standard supervised calibration-based dogma for n = 10 healthy users. To assess the learning behavior of our approach, it is unsupervised trained from scratch three times per user. Even with the relatively low SNR of an auditory ERP paradigm, the results show that after a limited number of trials (30 trials), the unsupervised approach performs comparably to a classic supervised model. PMID:25068464
Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.
George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil
2018-06-01
Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely: bag of visual words model (BoVWs) and AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images, is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with large number of atoms and small patch sizes yield the best representative erythema severity features. Further, random forest (RF) outperforms other classifiers with F1 score 0.71, followed by support vector machine (SVM) and boosting with 0.66 and 0.64 scores, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach with improvement of 9% and 12% over BoVWs and AlexNet based features, respectively. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.
Hall, L O; Bensaid, A M; Clarke, L P; Velthuizen, R P; Silbiger, M S; Bezdek, J C
1992-01-01
Magnetic resonance (MR) brain section images are segmented and then synthetically colored to give visual representations of the original data with three approaches: the literal and approximate fuzzy c-means unsupervised clustering algorithms, and a supervised computational neural network. Initial clinical results are presented on normal volunteers and selected patients with brain tumors surrounded by edema. Supervised and unsupervised segmentation techniques provide broadly similar results. Unsupervised fuzzy algorithms were visually observed to show better segmentation when compared with raw image data for volunteer studies. For a more complex segmentation problem with tumor/edema or cerebrospinal fluid boundary, where the tissues have similar MR relaxation behavior, inconsistency in rating among experts was observed, with fuzzy c-means approaches being slightly preferred over feedforward cascade correlation results. Various facets of both approaches, such as supervised versus unsupervised learning, time complexity, and utility for the diagnostic process, are compared.
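For readers unfamiliar with fuzzy c-means, the following is a minimal sketch of the textbook algorithm on scalar intensities. It is not the authors' implementation: the function name fcm_1d, the evenly spread initial centres, the fuzzifier m = 2, and the eps tolerance are all assumptions made for illustration, and real MR data would use multi-channel feature vectors rather than single values.

```python
def fcm_1d(values, n_clusters=2, m=2.0, n_iter=50, eps=1e-12):
    """Textbook fuzzy c-means on scalar values (illustrative sketch only)."""
    lo, hi = min(values), max(values)
    # Spread the initial centres evenly over the data range (arbitrary choice).
    centres = [lo + (hi - lo) * (i + 0.5) / n_clusters for i in range(n_clusters)]
    memberships = []
    power = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centres]
            nearest = min(range(n_clusters), key=lambda i: dists[i])
            if dists[nearest] < eps:
                # A point sitting on a centre belongs fully to that centre.
                row = [1.0 if i == nearest else 0.0 for i in range(n_clusters)]
            else:
                # Membership update: u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)).
                row = [1.0 / sum((d_i / d_j) ** power for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        # Centre update: data mean weighted by membership ** m.
        new_centres = []
        for i in range(n_clusters):
            weight = sum(u[i] ** m for u in memberships)
            if weight == 0.0:
                new_centres.append(centres[i])  # empty cluster: keep old centre
            else:
                total = sum((u[i] ** m) * x for u, x in zip(memberships, values))
                new_centres.append(total / weight)
        centres = new_centres
    return centres, memberships
```

Each row of the returned memberships gives one value's graded assignment to every cluster, which is what distinguishes the fuzzy variant from hard k-means labels.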
Nicholson, Vaughan Patrick; McKean, Mark; Lowe, John; Fawcett, Christine; Burkett, Brendan
2015-01-01
To determine the effectiveness of unsupervised Nintendo Wii Fit balance training in older adults. Forty-one older adults were recruited from local retirement villages and educational settings to participate in a six-week two-group repeated measures study. The Wii group (n = 19, 75 ± 6 years) undertook 30 min of unsupervised Wii balance gaming three times per week in their retirement village while the comparison group (n = 22, 74 ± 5 years) continued with their usual exercise program. Participants' balance abilities were assessed pre- and postintervention. The Wii Fit group demonstrated significant improvements (P < .05) in timed up-and-go, left single-leg balance, lateral reach (left and right), and gait speed compared with the comparison group. Reported levels of enjoyment following game play increased during the study. Six weeks of unsupervised Wii balance training is an effective modality for improving balance in independent older adults.
Assessing the Linguistic Productivity of Unsupervised Deep Neural Networks
DOE Office of Scientific and Technical Information (OSTI.GOV)
Phillips, Lawrence A.; Hodas, Nathan O.
Increasingly, cognitive scientists have demonstrated interest in applying tools from deep learning. One use for deep learning is in language acquisition where it is useful to know if a linguistic phenomenon can be learned through domain-general means. To assess whether unsupervised deep learning is appropriate, we first pose a smaller question: Can unsupervised neural networks apply linguistic rules productively, using them in novel situations? We draw from the literature on determiner/noun productivity by training an unsupervised autoencoder network and measuring its ability to combine nouns with determiners. Our simple autoencoder creates combinations it has not previously encountered, displaying a degree of overlap similar to actual children. While this preliminary work does not provide conclusive evidence for productivity, it warrants further investigation with more complex models. Further, this work helps lay the foundations for future collaboration between the deep learning and cognitive science communities.
Pedoinformatics Approach to Soil Text Analytics
NASA Astrophysics Data System (ADS)
Furey, J.; Seiter, J.; Davis, A.
2017-12-01
The several extant schema for the classification of soils rely on differing criteria, but the major soil science taxonomies, including the United States Department of Agriculture (USDA) and the international harmonized World Reference Base for Soil Resources systems, are based principally on inferred pedogenic properties. These taxonomies largely result from compiled individual observations of soil morphologies within soil profiles, and the vast majority of this pedologic information is contained in qualitative text descriptions. We present text mining analyses of hundreds of gigabytes of parsed text and other data in the digitally available USDA soil taxonomy documentation, the Soil Survey Geographic (SSURGO) database, and the National Cooperative Soil Survey (NCSS) soil characterization database. These analyses implemented iPython calls to Gensim modules for topic modelling, with latent semantic indexing completed down to the lowest taxon level (soil series) paragraphs. Via a custom extension of the Natural Language Toolkit (NLTK), approximately one percent of the USDA soil series descriptions were used to train a classifier for the remainder of the documents, essentially by treating soil science words as comprising a novel language. While location-specific descriptors at the soil series level are amenable to geomatics methods, unsupervised clustering of the occurrence of other soil science words did not closely follow the usual hierarchy of soil taxa. We present preliminary phrasal analyses that may account for some of these effects.
A Graph-Embedding Approach to Hierarchical Visual Word Mergence.
Wang, Lei; Liu, Lingqiao; Zhou, Luping
2017-02-01
Appropriately merging visual words is an effective dimension reduction method for the bag-of-visual-words model in image classification. The approach of hierarchically merging visual words has been extensively employed, because it gives a fully determined merging hierarchy. Existing supervised hierarchical merging methods take different approaches and realize the merging process with various formulations. In this paper, we propose a unified hierarchical merging approach built upon the graph-embedding framework. Our approach is able to merge visual words for any scenario in which a preferred structure and an undesired structure are defined, and, therefore, can effectively attend to all kinds of requirements for the word-merging process. In terms of computational efficiency, we show that our algorithm can seamlessly integrate a fast search strategy developed in our previous work and, thus, maintain the state-of-the-art merging speed. To the best of our knowledge, the proposed approach is the first one that addresses hierarchical visual word mergence in such a flexible and unified manner. As demonstrated, it can maintain excellent image classification performance even after a significant dimension reduction, and outperform all the existing comparable visual word-merging methods. In a broad sense, our work provides an open platform for applying, evaluating, and developing new criteria for hierarchical word-merging tasks.
ERIC Educational Resources Information Center
Saffioti, Carol Lee
1977-01-01
Exercises in sketching a scene of words, focusing, describing elemental structure (using comparison, contrast, analogy, and antithesis), and sketching and writing about still-life arrangements can heighten students' awareness of sense impressions and lead to improved writing skills. (TJ)
[The Freiburg monosyllable word test in postoperative cochlear implant diagnostics].
Hey, M; Brademann, G; Ambrosch, P
2016-08-01
The Freiburg monosyllable word test represents a central tool of postoperative cochlear implant (CI) diagnostics. The objective of this study is to test the equivalence of different word lists by analysing word comprehension. For patients whose CI has been implanted for more than 5 years, the distribution of suprathreshold speech intelligibility outcomes will also be analysed. In a retrospective data analysis, the word-correct scores of 626 CI users were evaluated using a total of 5211 lists with 20 words each. The analysis of word comprehension within each list shows differences in mean and in the kind of distribution function. There are lists whose mean word recognition differs significantly from the overall mean. The Freiburg monosyllable word test is easy to administer at suprathreshold speech level for CI recipients, and typically has a saturation level above 80 %. The Freiburg monosyllable word test can be performed successfully by the majority of CI patients. The limited balance of the test lists leads to the conclusion that an adaptive test procedure with the Freiburg monosyllable test does not make sense. The Freiburg monosyllable test can be restructured by resorting all words across lists, or by omitting individual words of a test list to increase the reliability of the test. The results show that speech intelligibility in quiet should also be investigated in CI recipients at levels below 70 dB.
NASA Technical Reports Server (NTRS)
Shahshahani, Behzad M.; Landgrebe, David A.
1992-01-01
The effect of additional unlabeled samples in improving the supervised learning process is studied in this paper. Three learning processes, supervised, unsupervised, and combined supervised-unsupervised, are compared by studying the asymptotic behavior of the estimates obtained under each process. Upper and lower bounds on the asymptotic covariance matrices are derived. It is shown that under a normal mixture density assumption for the probability density function of the feature space, the combined supervised-unsupervised learning is always superior to the supervised learning in achieving better estimates. Experimental results are provided to verify the theoretical concepts.
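As a rough illustration of what combined supervised-unsupervised estimation can look like under a normal mixture assumption, the sketch below runs expectation-maximization for two class means, with labeled samples pinned to their class and unlabeled samples weighted by their posterior. It does not reproduce the paper's asymptotic analysis; the function name estimate_means, the one-dimensional features, the unit variance, and the equal class priors are simplifying assumptions.

```python
import math


def estimate_means(labeled, unlabeled, n_iter=100):
    """Estimate two class means from labeled (x, cls) pairs and unlabeled x values.

    Assumes, purely for illustration, a two-component normal mixture with
    unit variance and equal priors; cls is 0 or 1.
    """
    # Supervised starting point: plain class averages of the labeled samples.
    means = []
    for k in (0, 1):
        xs = [x for x, y in labeled if y == k]
        means.append(sum(xs) / len(xs))

    for _ in range(n_iter):
        num = [0.0, 0.0]
        den = [0.0, 0.0]
        # Labeled samples count fully towards their known class.
        for x, y in labeled:
            num[y] += x
            den[y] += 1.0
        # Unlabeled samples count towards both classes, weighted by posterior.
        for x in unlabeled:
            # Log-odds of class 1 versus class 0 for unit-variance normals.
            log_odds = 0.5 * ((x - means[0]) ** 2 - (x - means[1]) ** 2)
            log_odds = max(-50.0, min(50.0, log_odds))  # avoid exp overflow
            r1 = 1.0 / (1.0 + math.exp(-log_odds))
            num[1] += r1 * x
            den[1] += r1
            num[0] += (1.0 - r1) * x
            den[0] += 1.0 - r1
        means = [num[0] / den[0], num[1] / den[1]]
    return means
```

With an empty unlabeled list the loop simply returns the supervised class averages, so the same routine covers the purely supervised case for comparison.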
Wu, Jiayi; Ma, Yong-Bei; Congdon, Charles; Brett, Bevin; Chen, Shuobing; Xu, Yaofang; Ouyang, Qi; Mao, Youdong
2017-01-01
Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes with decreasing signal-to-noise-ratio (SNR) in the image data, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization over a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement on ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization. PMID:28786986
NASA Astrophysics Data System (ADS)
Cruz-Roa, Angel; Arevalo, John; Basavanhally, Ajay; Madabhushi, Anant; González, Fabio
2015-01-01
Learning data representations directly from the data itself is an approach that has shown great success in different pattern recognition problems, outperforming state-of-the-art feature extraction schemes for different tasks in computer vision, speech recognition and natural language processing. Representation learning applies unsupervised and supervised machine learning methods to large amounts of data to find building-blocks that better represent the information in it. Digitized histopathology images represent a very good testbed for representation learning since they involve large amounts of highly complex visual data. This paper presents a comparative evaluation of different supervised and unsupervised representation learning architectures to specifically address open questions on what type of learning architecture (deep or shallow) and what type of learning (unsupervised or supervised) is optimal. In this paper we limit ourselves to addressing these questions in the context of distinguishing between anaplastic and non-anaplastic medulloblastomas from routine haematoxylin and eosin stained images. The unsupervised approaches evaluated were sparse autoencoders and topographic reconstruct independent component analysis, and the supervised approach was convolutional neural networks. Experimental results show that shallow architectures with more neurons are better than deeper architectures without taking into account local space invariances and that topographic constraints provide useful invariant features in scale and rotations for efficient tumor differentiation.
The effects of environmental context on recognition memory and claims of remembering.
Hockley, William E
2008-11-01
Recognition memory for words was tested in same or different contexts using the remember/know response procedure. Context was manipulated by presenting words in different screen colors and locations and by presenting words against real-world photographs. Overall hit and false-alarm rates were higher for tests presented in an old context compared to a new context. This concordant effect was seen in both remember responses and estimates of familiarity. Similar results were found for rearranged pairings of old study contexts and targets, for study contexts that were unique or were repeated with different words, and for new picture contexts that were physically similar to old contexts. Similar results were also found when subjects focused attention on the study words, but a different pattern of results was obtained when subjects explicitly associated the study words with their picture context. The results show that subjective feelings of recollection play a role in the effects of environmental context but are likely based more on a sense of familiarity that is evoked by the context than on explicit associations between targets and their study context.
Zipf's Law for Word Frequencies: Word Forms versus Lemmas in Long Texts.
Corral, Álvaro; Boleda, Gemma; Ferrer-i-Cancho, Ramon
2015-01-01
Zipf's law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf's law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf's law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf's law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation.
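A quick way to get a feel for the exponent discussed above is to regress log frequency on log rank. The sketch below does exactly that and nothing more: it is not the fitting procedure used in the study (which concerns the distribution of frequencies and its low-frequency cut-off), and the name zipf_exponent and the min_count filter, a crude stand-in for a cut-off, are assumptions for illustration.

```python
import math
from collections import Counter


def zipf_exponent(tokens, min_count=1):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped.

    tokens can be word forms or lemmas; types rarer than min_count are ignored.
    """
    counts = sorted(
        (c for c in Counter(tokens).values() if c >= min_count), reverse=True
    )
    if len(counts) < 2:
        raise ValueError("need at least two word types to fit a slope")
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Zipf's law predicts frequency proportional to rank ** (-exponent).
    return -sxy / sxx
```

Running it once on the word forms of a text and once on the lemmatized version gives two exponents that can be compared directly, in the spirit of the word-lemma comparison described in the abstract.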
Dyslexic Participants Show Intact Spontaneous Categorization Processes
ERIC Educational Resources Information Center
Nikolopoulos, Dimitris S.; Pothos, Emmanuel M.
2009-01-01
We examine the performance of dyslexic participants on an unsupervised categorization task against that of matched non-dyslexic control participants. Unsupervised categorization is a cognitive process critical for conceptual development. Existing research in dyslexia has emphasized perceptual tasks and supervised categorization tasks (for which…
ERIC Educational Resources Information Center
Davis, Philip J.
1993-01-01
Argues for a mathematics education that interprets the word "theorem" in a sense that is wide enough to include the visual aspects of mathematical intuition and reasoning. Defines the term "visual theorems" and illustrates the concept using the Marigold of Theodorus. (Author/MDH)
Diraco, Giovanni; Leone, Alessandro; Siciliano, Pietro
2017-11-24
Continuous in-home monitoring of older adults living alone aims to improve their quality of life and independence, by detecting early signs of illness and functional decline or emergency conditions. To meet requirements for technology acceptance by seniors (unobtrusiveness, non-intrusiveness, and privacy-preservation), this study presents and discusses a new smart sensor system for the detection of abnormalities during daily activities, based on ultra-wideband radar providing rich, not privacy-sensitive, information useful for sensing both cardiorespiratory and body movements, regardless of ambient lighting conditions and physical obstructions (through-wall sensing). The radar sensing is a very promising technology, enabling the measurement of vital signs and body movements at a distance, and thus meeting both requirements of unobtrusiveness and accuracy. In particular, impulse-radio ultra-wideband radar has attracted considerable attention in recent years thanks to many properties that make it useful for assisted living purposes. The proposed sensing system, evaluated in meaningful assisted living scenarios by involving 30 participants, exhibited the ability to detect vital signs, to discriminate among dangerous situations and activities of daily living, and to accommodate individual physical characteristics and habits. The reported results show that vital signs can be detected also while carrying out daily activities or after a fall event (post-fall phase), with accuracy varying according to the level of movements, reaching up to 95% and 91% in detecting respiration and heart rates, respectively. Similarly, good results were achieved in fall detection by using the micro-motion signature and unsupervised learning, with sensitivity and specificity greater than 97% and 90%, respectively.
PMID:29186786
Wilson, A; Weinstein, L
1992-01-01
The Russian psychologist Lev Vygotsky proposed an analysis of language, thought, and internalization that has direct relevance to the current concerns of psychoanalysts. Striking methodological and conceptual similarities and useful complementarities with psychoanalysis are discovered when one peers beneath the surface of Vygotskian psychology. Our adaptation of Vygotsky's views expands upon the role Freud assigned to language in the topographic model. We suggest that the analysand's speech offers several windows into the history of the individual, through prosody, tropes, word meaning, and word sense. We particularly emphasize Vygotsky's views on the genesis and utilization of word meanings. The acquisition of word meanings will contain key elements of the internal climate present when the word meaning was forged. Bearing this in mind, crucial theoretical questions follow, such as how psychoanalysis is to understand the unconscious fantasies, identifications, anxieties, and defenses associated with the psychodynamics of language acquisition and later language usage. We propose that the clinical situation is an ideal place to test these hypotheses.
Günther, Fritz; Dudschig, Carolin; Kaup, Barbara
2018-05-01
Theories of embodied cognition assume that concepts are grounded in non-linguistic, sensorimotor experience. In support of this assumption, previous studies have shown that upwards response movements are faster than downwards movements after participants have been presented with words whose referents are typically located in the upper vertical space (and vice versa for downwards responses). This is taken as evidence that processing these words reactivates sensorimotor experiential traces. This congruency effect was also found for novel words, after participants learned these words as labels for novel objects that they encountered either in their upper or lower visual field. While this indicates that direct experience with a word's referent is sufficient to evoke said congruency effects, the present study investigates whether this direct experience is also a necessary condition. To this end, we conducted five experiments in which participants learned novel words from purely linguistic input: Novel words were presented in pairs with real up- or down-words (Experiment 1); they were presented in natural sentences where they replaced these real words (Experiment 2); they were presented as new labels for these real words (Experiment 3); and they were presented as labels for novel combined concepts based on these real words (Experiment 4 and 5). In all five experiments, we did not find any congruency effects elicited by the novel words; however, participants were always able to make correct explicit judgements about the vertical dimension associated to the novel words. These results suggest that direct experience is necessary for reactivating experiential traces, but this reactivation is not a necessary condition for understanding (in the sense of storing and accessing) the corresponding aspects of word meaning. Copyright © 2017 Cognitive Science Society, Inc.
Housing and sexual health among street-involved youth.
Kumar, Maya M; Nisenbaum, Rosane; Barozzino, Tony; Sgro, Michael; Bonifacio, Herbert J; Maguire, Jonathon L
2015-10-01
Street-involved youth (SIY) carry a disproportionate burden of sexually transmitted diseases (STD). Studies among adults suggest that improving housing stability may be an effective primary prevention strategy for improving sexual health. Housing options available to SIY offer varying degrees of stability and adult supervision. This study investigated whether housing options offering more stability and adult supervision are associated with fewer STD and related risk behaviors among SIY. A cross-sectional study was performed using public health survey and laboratory data collected from Toronto SIY in 2010. Three exposure categories were defined a priori based on housing situation: (1) stable and supervised housing, (2) stable and unsupervised housing, and (3) unstable and unsupervised housing. Multivariate logistic regression was used to test the association between housing category and current or recent STD. Secondary analyses were performed using the following secondary outcomes: blood-borne infection, recent binge-drinking, and recent high-risk sexual behavior. The final analysis included 184 SIY. Of these, 28.8 % had a current or recent STD. Housing situation was stable and supervised for 12.5 %, stable and unsupervised for 46.2 %, and unstable and unsupervised for 41.3 %. Compared to stable and supervised housing, there was no significant association between current or recent STD among stable and unsupervised housing or unstable and unsupervised housing. There was no significant association between housing category and risk of blood-borne infection, binge-drinking, or high-risk sexual behavior. Although we did not demonstrate a significant association between stable and supervised housing and lower STD risk, our incorporation of both housing stability and adult supervision into a priori defined exposure groups may inform future studies of housing-related prevention strategies among SIY. 
Multi-modal interventions beyond housing alone may also be required to prevent sexual morbidity among these vulnerable youth.
Out-of-School Time and Adolescent Substance Use.
Lee, Kenneth T H; Vandell, Deborah Lowe
2015-11-01
High levels of adolescent substance use are linked to lower academic achievement, reduced schooling, and delinquency. We assess four types of out-of-school time (OST) contexts--unsupervised time with peers, sports, organized activities, and paid employment--in relation to tobacco, alcohol, and marijuana use at the end of high school. Other research has examined these OST contexts in isolation, limiting efforts to disentangle potentially confounded relations. Longitudinal data from the National Institute of Child Health and Human Development Study of Early Child Care and Youth Development (N = 766) examined associations between different OST contexts during high school and substance use at the end of high school. Unsupervised time with peers increased the odds of tobacco, alcohol, and marijuana use, whereas sports increased the odds of alcohol use and decreased the odds of marijuana use. Paid employment increased the odds of tobacco and alcohol use. Unsupervised time with peers predicted increased amounts of tobacco, alcohol, and marijuana use, whereas sports predicted decreased amounts of tobacco and marijuana use and increased amounts of alcohol use at the end of high school. Although unsupervised time with peers, sports, and paid employment were differentially linked to the odds of substance use, only unsupervised time with peers and sports were significantly associated with the amounts of tobacco, alcohol, and marijuana use at the end of high school. These findings underscore the value of considering OST contexts in relation to strategies to promote adolescent health. Reducing unsupervised time with peers and increasing sports participation may have positive impacts on reducing substance use. Copyright © 2015 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
Hübner, David; Verhoeven, Thibault; Schmid, Konstantin; Müller, Klaus-Robert; Tangermann, Michael; Kindermans, Pieter-Jan
2017-01-01
Using traditional approaches, a brain-computer interface (BCI) requires the collection of calibration data for new subjects prior to online use. Calibration time can be reduced or eliminated e.g., by subject-to-subject transfer of a pre-trained classifier or unsupervised adaptive classification methods which learn from scratch and adapt over time. While such heuristics work well in practice, none of them can provide theoretical guarantees. Our objective is to modify an event-related potential (ERP) paradigm to work in unison with the machine learning decoder, and thus to achieve a reliable unsupervised calibrationless decoding with a guarantee to recover the true class means. We introduce learning from label proportions (LLP) to the BCI community as a new unsupervised, and easy-to-implement classification approach for ERP-based BCIs. The LLP estimates the mean target and non-target responses based on known proportions of these two classes in different groups of the data. We present a visual ERP speller to meet the requirements of LLP. For evaluation, we ran simulations on artificially created data sets and conducted an online BCI study with 13 subjects performing a copy-spelling task. Theoretical considerations show that LLP is guaranteed to minimize the loss function similar to a corresponding supervised classifier. LLP performed well in simulations and in the online application, where 84.5% of characters were spelled correctly on average without prior calibration. The continuously adapting LLP classifier is the first unsupervised decoder for ERP BCIs guaranteed to find the optimal decoder. This makes it an ideal solution to avoid tedious calibration sessions. Additionally, LLP works on complementary principles compared to existing unsupervised methods, opening the door for their further enhancement when combined with LLP.
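The core idea of learning from label proportions, estimating the mean target and non-target responses from groups whose class proportions are known, can be sketched as solving a small linear system. The example below is a simplified illustration rather than the authors' decoder: the function name llp_class_means, the use of exactly two groups, and the plain-list feature vectors are assumptions. If group A contains a fraction p_a of targets and group B a fraction p_b, each group mean is a known mixture of the two class means, and the class means follow by inverting that mixture.

```python
def llp_class_means(group_a, group_b, p_a, p_b):
    """Recover target and non-target mean vectors from two unlabeled groups.

    group_a, group_b: lists of feature vectors (lists of floats).
    p_a, p_b: known fraction of target samples in each group (must differ).
    """
    if p_a == p_b:
        raise ValueError("group proportions must differ to separate the classes")

    def mean_vector(rows):
        n = len(rows)
        return [sum(col) / n for col in zip(*rows)]

    m_a = mean_vector(group_a)
    m_b = mean_vector(group_b)
    det = p_a - p_b
    # Each group mean is p * target + (1 - p) * non_target; invert that system.
    target = [((1 - p_b) * a - (1 - p_a) * b) / det for a, b in zip(m_a, m_b)]
    non_target = [(p_a * b - p_b * a) / det for a, b in zip(m_a, m_b)]
    return target, non_target
```

No individual sample is ever labeled here; only the group proportions are used, which is why the paradigm itself has to be designed so that those proportions are known in advance.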
PMID:28407016
BORAWSKI, ELAINE A.; IEVERS-LANDIS, CAROLYN E.; LOVEGREEN, LOREN D.; TRAPL, ERIKA S.
2010-01-01
Purpose To compare two different parenting practices (parental monitoring and negotiated unsupervised time) and perceived parental trust in the reporting of health risk behaviors among adolescents. Methods Data were derived from 692 adolescents in 9th and 10th grades (X̄ = 15.7 years) enrolled in health education classes in six urban high schools. Students completed a self-administered paper-based survey that assessed adolescents’ perceptions of the degree to which their parents monitor their whereabouts, permit them to negotiate unsupervised time with their friends, and trust them to make decisions. Using gender-specific multivariate logistic regression analyses, we examined the relative importance of parental monitoring, negotiated unsupervised time with peers, and parental trust in predicting reported sexual activity, sex-related protective actions (e.g., condom use, carrying protection) and substance use (alcohol, tobacco, and marijuana). Results For males and females, increased negotiated unsupervised time was strongly associated with increased risk behavior (e.g., sexual activity, alcohol and marijuana use) but also sex-related protective actions. In males, high parental monitoring was associated with less alcohol use and consistent condom use. Parental monitoring had no effect on female behavior. Perceived parental trust served as a protective factor against sexual activity, tobacco, and marijuana use in females, and alcohol use in males. Conclusions Although monitoring is an important practice for parents of older adolescents, managing their behavior through negotiation of unsupervised time may have mixed results leading to increased experimentation with sexuality and substances, but perhaps in a more responsible way. Trust established between an adolescent female and her parents continues to be a strong deterrent for risky behaviors but appears to have little effect on behaviors of adolescent males. PMID:12890596
Unsupervised Cryo-EM Data Clustering through Adaptively Constrained K-Means Algorithm
Xu, Yaofang; Wu, Jiayi; Yin, Chang-Cheng; Mao, Youdong
2016-01-01
In single-particle cryo-electron microscopy (cryo-EM), the K-means clustering algorithm is widely used in unsupervised 2D classification of projection images of biological macromolecules. 3D ab initio reconstruction requires accurate unsupervised classification in order to separate molecular projections of distinct orientations. Due to background noise in single-particle images and uncertainty of molecular orientations, the traditional K-means clustering algorithm may classify images into wrong classes and produce classes with a large variation in membership. Overcoming these limitations requires further development of clustering algorithms for cryo-EM data analysis. We propose a novel unsupervised data clustering method building upon the traditional K-means algorithm. By introducing an adaptive constraint term in the objective function, our algorithm not only avoids a large variation in class sizes but also produces more accurate data clustering. Applications of this approach to both simulated and experimental cryo-EM data demonstrate that our algorithm is a significantly improved alternative to the traditional K-means algorithm in single-particle cryo-EM analysis. PMID:27959895
Open Globe Injury Patient Identification in Warfare Clinical Notes
Apostolova, Emilia; White, Helen A.; Morris, Patty A.; Eliason, David A.; Velez, Tom
2017-01-01
The aim of this study is to utilize the Defense and Veterans Eye Injury and Vision Registry clinical data derived from DoD and VA medical systems, which include documentation of care while in combat, and develop methods for comprehensive and reliable Open Globe Injury (OGI) patient identification. In particular, we focus on the use of free-form clinical notes, since structured data, such as diagnoses or procedure codes, as found in early post-trauma clinical records, may not be a comprehensive and reliable indicator of OGIs. The challenges of the task include low incidence rate (few positive examples), idiosyncratic military ophthalmology vocabulary, extreme brevity of notes, specialized abbreviations, typos and misspellings. We modeled the problem as a text classification task and utilized a combination of supervised learning (SVMs) and word embeddings learnt in an unsupervised manner, achieving a precision of 92.50% and a recall of 89.83%. The described techniques are applicable to patient cohort identification with limited training data and low incidence rate. PMID:29854104
Fuzzy Mathematical Models To Remove Poverty Of Gypsies In Tamilnadu
NASA Astrophysics Data System (ADS)
Chandrasekaran, A. D.; Ramkumar, C.; Siva, E. P.; Balaji, N.
2018-04-01
Many poor people live in our society, and gypsies are among those most deserving of sympathy. Having no permanent place to live, they move from one place to another in order to survive. For this paper we interviewed 895 gypsies in Tamilnadu using a linguistic questionnaire. Because the problems they face in improving their lives involve a great deal of feeling, uncertainty, and unpredictability, we judged it fitting to use fuzzy theory in general and the fuzzy matrix in particular. The fuzzy matrix is the most suitable tool when the data are unsupervised, and it is powerful enough to identify the main development factor of gypsies. This paper has three sections. Section one presents the method of application of the CEFD matrix. In section two, we describe the development factors of gypsies. In section three, we apply these factors to the CEFD matrix and derive our conclusions. Key words: RD matrix, AFD matrix, CEFD matrix.
Norman, Laura M.; Middleton, Barry R.; Wilson, Natalie R.
2018-01-01
Mapping of vegetation types is of great importance to the San Carlos Apache Tribe and their management of forestry and fire fuels. Various remote sensing techniques were applied to classify multitemporal Landsat 8 satellite data, vegetation index, and digital elevation model data. A multitiered unsupervised classification generated over 900 classes that were then recoded to one of the 16 generalized vegetation/land cover classes using the Southwest Regional Gap Analysis Project (SWReGAP) map as a guide. A supervised classification was also run using field data collected in the SWReGAP project and our field campaign. Field data were gathered and accuracy assessments were generated to compare outputs. Our hypothesis was that a resulting map would update and potentially improve upon the vegetation/land cover class distributions of the older SWReGAP map over the 24,000 km² study area. The estimated overall accuracies ranged between 43% and 75%, depending on which method and field dataset were used. The findings demonstrate the complexity of vegetation mapping, the importance of recent, high-quality field data, and the potential for misleading results when insufficient field data are collected.
Using deep learning in image hyper spectral segmentation, classification, and detection
NASA Astrophysics Data System (ADS)
Zhao, Xiuying; Su, Zhenyu
2018-02-01
Recent years have shown that deep learning neural networks are a valuable tool in the field of computer vision. Deep learning methods can be used in remote sensing applications such as land cover classification, detection of vehicles in satellite images, and hyperspectral image classification. This paper addresses the use of deep learning artificial neural networks in satellite image segmentation. Image segmentation plays an important role in image processing. Remote sensing images often show large differences in hue, which results in poor display of the images in a VR environment. Image segmentation is a preprocessing technique applied to the original images that splits each image into parts with different hues so that the color can be unified. Several computational models based on supervised, unsupervised, parametric, and probabilistic region-based image segmentation techniques have been proposed. Recently, deep learning with convolutional neural networks has been widely used to develop efficient and automatic image segmentation models. In this paper, we focus on the study of deep convolutional neural networks and their variants for automatic image segmentation rather than traditional image segmentation strategies.
Multi-Sensor Radiometric Study to Detect Pathologies in Historical Buildings
NASA Astrophysics Data System (ADS)
Del Pozo, S.; Herrero-Pascual, J.; Felipe-García, B.; Hernández-López, D.; Rodríguez-Gonzálvez, P.; González-Aguilera, D.
2015-02-01
This paper presents a comparative study with different remote sensing technologies to recognize pathologies in façades of historical buildings. Building materials deteriorate over the years due to different extrinsic and intrinsic agents, so assessing these diseases in a non-invasive way is crucial to help preserve them. Most of these buildings are extremely valuable and some of them have been declared monuments of cultural interest. Through close range remote sensing techniques, it is possible to study material pathologies in a rigorous way and in a short field campaign. For the investigation two different acquisition systems were applied, active and passive. The terrestrial laser scanner FARO Focus 3D was used as the active sensor, working at the wavelength of 905 nm. As passive sensors, a Nikon D-5000 and a 6-band Mini-MCA multispectral camera (530-801 nm) were applied, covering the visible and near infrared spectral range. This analysis allows assessing the suitability of each sensor, or sensor combination, for pathology detection, addressing the limitations according to the spatial and spectral resolution. Moreover, pathology detection by unsupervised classification methods is addressed in order to evaluate the automation capability of this process.
Unsupervised iterative detection of land mines in highly cluttered environments.
Batman, Sinan; Goutsias, John
2003-01-01
An unsupervised iterative scheme is proposed for land mine detection in heavily cluttered scenes. This scheme is based on iterating hybrid multispectral filters that consist of a decorrelating linear transform coupled with a nonlinear morphological detector. Detections extracted from the first pass are used to improve results in subsequent iterations. The procedure stops after a predetermined number of iterations. The proposed scheme addresses several weaknesses associated with previous adaptations of morphological approaches to land mine detection. Improvement in detection performance, robustness with respect to clutter inhomogeneities, a completely unsupervised operation, and computational efficiency are the main highlights of the method. Experimental results reveal excellent performance.
Netherlands Maritime Institute
ERIC Educational Resources Information Center
Hoefsmit, R. G. A.
1976-01-01
Account of the aims and activities of the Netherlands Maritime Institute provided by the Secretary to the Institute's Board of Directors. The Institute's intent is "to promote maritime activities, including the shipbuilding-shipping relationship, in the broadest sense of the word." (Editor/RK)
Determinants of translation ambiguity
Degani, Tamar; Prior, Anat; Eddington, Chelsea M.; Arêas da Luz Fontes, Ana B.; Tokowicz, Natasha
2016-01-01
Ambiguity in translation is highly prevalent, and has consequences for second-language learning and for bilingual lexical processing. To better understand this phenomenon, the current study compared the determinants of translation ambiguity across four sets of translation norms from English to Spanish, Dutch, German and Hebrew. The number of translations an English word received was correlated across these different languages, and was also correlated with the number of senses the word has in English, demonstrating that translation ambiguity is partially determined by within-language semantic ambiguity. For semantically-ambiguous English words, the probability of the different translations in Spanish and Hebrew was predicted by the meaning-dominance structure in English, beyond the influence of other lexical and semantic factors, for bilinguals translating from their L1, and translating from their L2. These findings are consistent with models postulating direct access to meaning from L2 words for moderately-proficient bilinguals. PMID:27882188
Rathleff, C R; Bandholm, T; Spaich, E G; Jorgensen, M; Andreasen, J
2017-01-01
Frailty is a serious condition frequently present in geriatric inpatients that potentially causes serious adverse events. Strength training is acknowledged as a means of preventing or delaying frailty and loss of function in these patients. However, limited hospital resources challenge the amount of supervised training, and unsupervised training could possibly supplement supervised training, thereby increasing the total exercise dose during admission. A new valid and reliable technology, the BandCizer, objectively measures the exact training dosage performed. The purpose was to investigate feasibility and acceptability of an unsupervised progressive strength training intervention monitored by BandCizer for frail geriatric inpatients. This feasibility trial included 15 frail inpatients at a geriatric ward. At hospitalization, the patients were prescribed two elastic band exercises to be performed unsupervised once daily. A BandCizer Datalogger enabling measurement of the number of sets, repetitions, and time-under-tension was attached to the elastic band. The patients were instructed in performing strength training: 3 sets of 10 repetitions (10-12 repetition maximum (RM)) with a separation of 2-min pauses and a time-under-tension of 8 s. The feasibility criterion for the unsupervised progressive exercises was that 33% of the recommended number of sets would be performed by at least 30% of patients. In addition, patients and staff were interviewed about their experiences with the intervention. Four (27%) out of 15 patients completed 33% of the recommended number of sets. For the total sample, the average percent of performed sets was 23%, and for those who actually trained (n = 12), 26%. Patients and staff expressed a generally positive attitude towards the unsupervised training as an addition to the supervised training sessions. However, barriers were also described, especially constant interruptions.
Based on the predefined criterion for feasibility, the unsupervised training was not feasible, although the criterion was almost met. The patients and staff mainly expressed positive attitudes towards the unsupervised training. As even a small training dosage has been shown to improve the physical performance of geriatric inpatients, the proposed intervention might be relevant if the interruptions are decreased in future large-scale trials and if the adherence is increased. ClinicalTrials.gov: NCT02702557, February 29, 2016. Data Protection Agency: 2016-42, February 25, 2016. Ethics Committee: No registration needed, December 8, 2015 (e-mail correspondence).
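The feasibility arithmetic reported above can be checked with a few lines of code. This is only an illustrative sketch: the function name `criterion_met` is hypothetical, and the only inputs taken from the abstract are the counts (4 of 15 patients reached the 33% dose) and the 30% threshold.

```python
def criterion_met(n_reaching_dose, n_patients, required_share=0.30):
    """Return the share of patients reaching the dose and whether it meets the threshold."""
    share = n_reaching_dose / n_patients
    return share, share >= required_share


# Counts from the abstract: 4 of 15 patients completed 33% of the recommended sets.
share, feasible = criterion_met(4, 15)
print(f"share of patients reaching the dose: {share:.1%}, criterion met: {feasible}")
```

With these counts the share is just under the 30% threshold, which matches the abstract's statement that the criterion was almost met.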
NASA Technical Reports Server (NTRS)
Park, K. Y.; Miller, L. D.
1978-01-01
Computer analysis was applied to single date LANDSAT MSS imagery of a sample coastal area near Seoul, Korea equivalent to a 1:50,000 topographic map. Supervised image processing yielded a test classification map from this sample image containing 12 classes: 5 water depth/sediment classes, 2 shoreline/tidal classes, and 5 coastal land cover classes at a scale of 1:25,000 and with a training set accuracy of 76%. Unsupervised image classification was applied to a subportion of the site analyzed and produced classification maps comparable in results in a spatial sense. The results of this test indicated that it is feasible to produce such quantitative maps for detailed study of dynamic coastal processes given a LANDSAT image data base at sufficiently frequent time intervals.
Yang, Guang; Raschke, Felix; Barrick, Thomas R; Howe, Franklyn A
2015-09-01
To investigate whether nonlinear dimensionality reduction improves unsupervised classification of ¹H MRS brain tumor data compared with a linear method. In vivo single-voxel ¹H magnetic resonance spectroscopy (55 patients) and ¹H magnetic resonance spectroscopy imaging (MRSI) (29 patients) data were acquired from histopathologically diagnosed gliomas. Data reduction using Laplacian eigenmaps (LE) or independent component analysis (ICA) was followed by k-means clustering or agglomerative hierarchical clustering (AHC) for unsupervised learning to assess tumor grade and for tissue type segmentation of MRSI data. An accuracy of 93% in classification of glioma grade II and grade IV, with 100% accuracy in distinguishing tumor and normal spectra, was obtained by LE with unsupervised clustering, but not with the combination of k-means and ICA. With ¹H MRSI data, LE provided a more linear distribution of data for cluster analysis and better cluster stability than ICA. LE combined with k-means or AHC provided 91% accuracy for classifying tumor grade and 100% accuracy for identifying normal tissue voxels. Color-coded visualization of normal brain, tumor core, and infiltration regions was achieved with LE combined with AHC. The LE method is promising for unsupervised clustering to separate brain and tumor tissue with automated color-coding for visualization of ¹H MRSI data after cluster analysis. © 2014 Wiley Periodicals, Inc.
Detection of Tree Crowns Based on Reclassification Using Aerial Images and LIDAR Data
NASA Astrophysics Data System (ADS)
Talebi, S.; Zarea, A.; Sadeghian, S.; Arefi, H.
2013-09-01
In past decades, tree detection using aerial sensors was a focus of many researchers in different fields, including Remote Sensing and Photogrammetry. This paper aims to detect trees in complex city areas using aerial imagery and laser scanning data. Our methodology is a hierarchical unsupervised method consisting of some primitive operations. The method is divided into three sections: the first uses aerial imagery, and the second and third use laser scanner data. In the first section a vegetation cover mask is created in both sunny and shadowed areas. In the second section Rate of Slope Change (RSC) is used to eliminate grasses. In the third section a Digital Terrain Model (DTM) is obtained from LiDAR data. Using the DTM and the Digital Surface Model (DSM), we obtain the Normalized Digital Surface Model (nDSM). Objects lower than a specific height are then eliminated. The three sections yield three result layers, which are combined by a multiplication operation to give the final result layer. This layer is smoothed by morphological operations. The result layer was sent to WG III/4 for evaluation. The evaluation shows that our method ranks well in comparison with other participants' methods in ISPRS WG III/4 when assessed in terms of five indices: area-based completeness, area-based correctness, object-based completeness, object-based correctness, and boundary RMS. Being unsupervised and automatic, the method can be improved and could be integrated with other methods to obtain the best results.
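The layer combination described in this abstract (nDSM from DSM and DTM, a height threshold, then multiplication of the three binary layers) can be sketched on a toy grid. This is an assumption-laden illustration rather than the authors' code: the function name `tree_mask`, the 2 m threshold, and all grid values are hypothetical, and the vegetation and grass-removal layers are simply given as inputs instead of being derived from imagery or Rate of Slope Change.

```python
def tree_mask(vegetation, not_grass, dsm, dtm, min_height):
    """Multiply three binary layers cell by cell.

    vegetation: 1 where the imagery-based mask marks vegetation, else 0.
    not_grass:  1 where the slope-based step kept the cell, else 0.
    dsm, dtm:   surface and terrain heights; nDSM = DSM - DTM.
    min_height: cells with nDSM below this value are eliminated (hypothetical threshold).
    """
    result = []
    for r in range(len(dsm)):
        row = []
        for c in range(len(dsm[r])):
            ndsm = dsm[r][c] - dtm[r][c]
            tall = 1 if ndsm >= min_height else 0
            row.append(vegetation[r][c] * not_grass[r][c] * tall)
        result.append(row)
    return result


# Toy 2 x 3 scene with made-up values (heights in metres).
vegetation = [[1, 1, 0], [1, 0, 1]]
not_grass = [[1, 0, 1], [1, 1, 1]]
dsm = [[110.0, 101.0, 115.0], [100.5, 108.0, 112.0]]
dtm = [[100.0, 100.0, 100.0], [100.0, 100.0, 100.0]]

mask = tree_mask(vegetation, not_grass, dsm, dtm, min_height=2.0)
print(mask)
```

A cell survives only if all three layers agree, which is why multiplication acts as a logical AND here. The morphological smoothing step mentioned in the abstract is not included in the sketch.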
Mapping of Geographically Isolated Wetlands of Western Siberia Using High Resolution Space Images
NASA Astrophysics Data System (ADS)
Dyukarev, E.; Pologova, N.; Dyukarev, A.; Lane, C.; Autrey, B. C.
2014-12-01
Using remote sensing data for the integrated study of natural objects is relevant for investigating difficult-to-access areas of West Siberia. This study focuses on determining the extent and spectral signatures of isolated wetlands within the Ob-Tom Interfluve area using Landsat and Quickbird space images. High-resolution space images were carefully examined and wetlands were manually delineated. Wetlands have clearly visible signs in the high-resolution space images. 567 wetlands were recognized as isolated wetlands, with an area of about 10 000 ha (or 2.5% of the study area). Isolated wetlands with an area of less than 2 ha are the most frequent. Half of the wetlands have an area of less than 6.4 ha. The largest isolated wetland occupies 797 ha, and only 5% have an area of more than 50 ha. The Landsat 7 ETM+ data were used for analysis of vegetation structure and spectral characteristics of wetlands. The masked isolated wetlands image was classified into 12 land cover classes using ISODATA unsupervised classification. The attribution of unsupervised classification results allowed us to clearly recognize 7 types of wetlands: tall, low and sparse ryams (Pine-Shrub-Sphagnum community), open wetlands with shrub, moss or sedge cover, and open water objects. Analysis of spectral profiles for all classes has shown that Landsat spectral bands 4 and 5 have higher variability. These bands allow the wetland classes to be separated clearly. Accuracy assessment of the isolated wetland map shows good agreement with expert field data. The work was supported by grant ISTC № 4079.
Feature Extraction Using an Unsupervised Neural Network
1991-05-03
with this neural network is given and its connection to exploratory projection pursuit methods is established.
An Abstraction-Based Data Model for Information Retrieval
NASA Astrophysics Data System (ADS)
McAllister, Richard A.; Angryk, Rafal A.
Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that takes advantage of WordNet, a lexical database, to generate paths of abstraction, and uses them as the basis for an inverted index structure to be used in the retrieval of documents from an indexed corpus. We present this method as an entrée to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
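The idea of indexing documents by paths of abstraction can be illustrated with a small sketch. It rests on loud assumptions: the dictionary `TOY_HYPERNYMS` is a hand-made stand-in for WordNet (no WordNet API is called), each word has at most one parent, and the names `abstraction_path` and `build_index` are hypothetical. The abstract does not specify the index layout, so this is one plausible reading, not the authors' data model.

```python
from collections import defaultdict

# Hand-made stand-in for a lexical ontology: word -> more abstract parent.
TOY_HYPERNYMS = {
    "dog": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}


def abstraction_path(word):
    """Follow parent links upward, e.g. dog -> canine -> mammal -> animal."""
    path = [word]
    while path[-1] in TOY_HYPERNYMS:
        path.append(TOY_HYPERNYMS[path[-1]])
    return path


def build_index(docs):
    """Inverted index mapping every concept on a word's path to document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for concept in abstraction_path(word):
                index[concept].add(doc_id)
    return index


docs = {1: "the dog barked", 2: "a cat slept", 3: "stock prices fell"}
index = build_index(docs)
print(sorted(index["mammal"]))
```

A query for the abstract concept "mammal" retrieves both the dog and the cat document even though neither contains that word, which is the retrieval benefit the abstract alludes to.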
Gaete, Jorge; Montero-Marin, Jesus; Rojas-Barahona, Cristian A.; Olivares, Esterbina; Araya, Ricardo
2016-01-01
School membership appears to be an important factor in explaining the relationship between students and schools, including school staff. School membership is associated with several school-related outcomes, such as academic performance and expectations. Most studies on school membership have been conducted in developed countries. The Psychological Sense of School Membership (PSSM) scale (18 items: 13 positively worded items, 5 negatively worded items) has been widely used to measure this construct, but no studies regarding its validity and reliability have been conducted in Spanish-speaking Latin American countries. This study investigates the psychometric properties, factor structure and reliability of this scale in a sample of 1250 early adolescents in Chile. Both exploratory and confirmatory factor analyses provide evidence of an excellent fit for a one-factor solution after removing the negatively worded items. The internal consistency of this new abbreviated version was 0.92. The association analyses demonstrated that high school membership was associated with better academic performance, stronger school bonding, a reduced likelihood of school misbehavior, and reduced likelihood of substance use. Analyses showed support for the reliability and validity of the PSSM among Chilean adolescents. PMID:27999554
Zipf’s Law for Word Frequencies: Word Forms versus Lemmas in Long Texts
Corral, Álvaro; Boleda, Gemma; Ferrer-i-Cancho, Ramon
2015-01-01
Zipf’s law is a fundamental paradigm in the statistics of written and spoken natural language as well as in other communication systems. We raise the question of the elementary units for which Zipf’s law should hold in the most natural way, studying its validity for plain word forms and for the corresponding lemma forms. We analyze several long literary texts comprising four languages, with different levels of morphological complexity. In all cases Zipf’s law is fulfilled, in the sense that a power-law distribution of word or lemma frequencies is valid for several orders of magnitude. We investigate the extent to which the word-lemma transformation preserves two parameters of Zipf’s law: the exponent and the low-frequency cut-off. We are not able to demonstrate a strict invariance of the tail, as for a few texts both exponents deviate significantly, but we conclude that the exponents are very similar, despite the remarkable transformation that going from words to lemmas represents, considerably affecting all ranges of frequencies. In contrast, the low-frequency cut-offs are less stable, tending to increase substantially after the transformation. PMID:26158787
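A rough way to see a power law in word counts is to rank the frequencies and fit a straight line in log-log space. The sketch below is only an illustration and is not the fitting procedure of the study, which the abstract does not describe; the function names `rank_frequency` and `loglog_slope` are hypothetical, and the toy token list is constructed so that frequency is exactly proportional to 1/rank.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Word frequencies sorted from most to least frequent."""
    return sorted(Counter(tokens).values(), reverse=True)


def loglog_slope(freqs):
    """Least-squares slope of log(frequency) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic "text": the word of rank r occurs 120 // r times.
tokens = []
for rank, word in enumerate(["a", "b", "c", "d", "e"], start=1):
    tokens += [word] * (120 // rank)

freqs = rank_frequency(tokens)
slope = loglog_slope(freqs)
print(freqs, slope)
```

Running the same two functions once on word forms and once on lemmas of a real text would give a crude comparison of exponents, although a least-squares fit on ranks is known to be a coarse estimator.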
ERIC Educational Resources Information Center
Butz, Martin V.; Herbort, Oliver; Hoffmann, Joachim
2007-01-01
Autonomously developing organisms face several challenges when learning reaching movements. First, motor control is learned unsupervised or self-supervised. Second, knowledge of sensorimotor contingencies is acquired in contexts in which action consequences unfold in time. Third, motor redundancies must be resolved. To solve all 3 of these…
Bilingual Lexical Interactions in an Unsupervised Neural Network Model
ERIC Educational Resources Information Center
Zhao, Xiaowei; Li, Ping
2010-01-01
In this paper we present an unsupervised neural network model of bilingual lexical development and interaction. We focus on how the representational structures of the bilingual lexicons can emerge, develop, and interact with each other as a function of the learning history. The results show that: (1) distinct representations for the two lexicons…
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-01-01
Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle). PMID:28608824
Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi
2017-06-13
Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We present a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle).
Promoting Decimal Number Sense and Representational Fluency
ERIC Educational Resources Information Center
Suh, Jennifer M.; Johnston, Chris; Jamieson, Spencer; Mills, Michelle
2008-01-01
The abstract nature of mathematics requires the communication of mathematical ideas through multiple representations, such as words, symbols, pictures, objects, or actions. Building representational fluency involves using mathematical representations flexibly and being able to interpret and translate among these different models and mathematical…
Hadley, Wendy; Houck, Christopher D; Barker, David; Senocak, Natali
2015-06-01
The purpose of this study was to examine the moderating influence of parental monitoring (e.g., unsupervised time with opposite sex peers) and adolescent emotional competence on sexual behaviors, among a sample of at-risk early adolescents. This study included 376 seventh-grade adolescents (age, 12-14 years) with behavioral or emotional difficulties. Questionnaires were completed on private laptop computers and assessed adolescent Emotional Competence (including Regulation and Negativity/Lability), Unsupervised Time, and a range of Sexual Behaviors. Generalized linear models were used to evaluate the independent and combined influence of Emotional Competency and Unsupervised Time on adolescent report of Sexual Behaviors. Analyses were stratified by gender to account for the notable gender differences in the targeted moderators and outcome variables. Findings indicated that more unsupervised time was a risk factor for all youth but was influenced by an adolescent's ability to regulate their emotions. Specifically, for males and females, poorer Emotion Regulation was associated with having engaged in a greater variety of Sexual Behaviors. However, lower Negativity/Lability and >1× per week Unsupervised Time were associated with a higher number of sexual behaviors among females only. Based on the findings of this study, a lack of parental supervision seems to be particularly problematic for both male and female adolescents with poor emotion regulation abilities. It may be important to impact both emotion regulation abilities and increase parental knowledge and skills associated with effective monitoring to reduce risk-taking for these youth.
The correlation between proprioception and handwriting legibility in children
Hong, So Young; Jung, Nam-Hae; Kim, Kyeong Mi
2016-01-01
[Purpose] This study investigated the association between proprioception, including joint position sense and kinetic sense, and handwriting legibility in healthy children. [Subjects and Methods] Assessment of joint position sense, kinetic sense, and handwriting legibility was conducted for 19 healthy children. Joint position sense was assessed by asking the children to flex their right elbow between 30° to 110° while blindfolded. The range of elbow movement was analyzed with Compact Measuring System 10 for 3D motion Analysis. Kinetic sense was assessed using the Sensory Integration and Praxis Test. The children were directed to write 30 words from the Korean alphabet, and the legibility of their handwriting was scored for form, alignment, space, size, and shape. To analyze the data, descriptive statistics and Spearman correlation analysis were conducted using IBM SPSS Statistics 20.0. [Results] There was significant negative correlation between handwriting legibility and Kinetic sense. A significant correlation between handwriting legibility and Joint position sense was not found. [Conclusion] This study showed that a higher Kinetic sense was associated with better legibility of handwriting. Further work is needed to determine the association of handwriting legibility and speed with Joint position sense of the elbow, wrist, and fingers. PMID:27821948
Color normalization of histology slides using graph regularized sparse NMF
NASA Astrophysics Data System (ADS)
Sha, Lingdao; Schonfeld, Dan; Sethi, Amit
2017-03-01
Computer based automatic medical image processing and quantification are becoming popular in digital pathology. However, preparation of histology slides can vary widely due to differences in staining equipment, procedures and reagents, which can reduce the accuracy of algorithms that analyze their color and texture information. To reduce the unwanted color variations, various supervised and unsupervised color normalization methods have been proposed. Compared with supervised color normalization methods, unsupervised color normalization methods have the advantages of time and cost efficiency and universal applicability. Most of the unsupervised color normalization methods for histology are based on stain separation. Based on the fact that stain concentration cannot be negative and different parts of the tissue absorb different stains, nonnegative matrix factorization (NMF), and in particular its sparse version (SNMF), are good candidates for stain separation. However, most of the existing unsupervised color normalization methods, such as PCA, ICA, NMF and SNMF, fail to consider important information about the sparse manifolds that the pixels occupy, which could potentially result in loss of texture information during color normalization. Manifold learning methods like Graph Laplacian have proven to be very effective in interpreting high-dimensional data. In this paper, we propose a novel unsupervised stain separation method called graph regularized sparse nonnegative matrix factorization (GSNMF). By considering the sparse prior of stain concentration together with manifold information from high-dimensional image data, our method shows better performance in stain color deconvolution than existing unsupervised color deconvolution methods, especially in keeping connected texture information. To utilize the texture information, we construct a nearest neighbor graph between pixels within a spatial area of an image based on their distances using a heat kernel in lαβ space. The representation of a pixel in the stain density space is constrained to follow the feature distance of the pixel to pixels in the neighborhood graph. Utilizing the color matrix transfer method with the stain concentrations found using our GSNMF method, the color normalization performance was also better than existing methods.
Unsupervised discovery of information structure in biomedical documents.
Kiela, Douwe; Guo, Yufan; Stenius, Ulla; Korhonen, Anna
2015-04-01
Information structure (IS) analysis is a text mining technique, which classifies text in biomedical articles into categories that capture different types of information, such as objectives, methods, results and conclusions of research. It is a highly useful technique that can support a range of Biomedical Text Mining tasks and can help readers of biomedical literature find information of interest faster, accelerating the highly time-consuming process of literature review. Several approaches to IS analysis have been presented in the past, with promising results in real-world biomedical tasks. However, all existing approaches, even weakly supervised ones, require several hundreds of hand-annotated training sentences specific to the domain in question. Because biomedicine is subject to considerable domain variation, such annotations are expensive to obtain. This makes the application of IS analysis across biomedical domains difficult. In this article, we investigate an unsupervised approach to IS analysis and evaluate the performance of several unsupervised methods on a large corpus of biomedical abstracts collected from PubMed. Our best unsupervised algorithm (multilevel-weighted graph clustering algorithm) performs very well on the task, obtaining over 0.70 F scores for most IS categories when applied to well-known IS schemes. This level of performance is close to that of lightly supervised IS methods and has proven sufficient to aid a range of practical tasks. Thus, using an unsupervised approach, IS could be applied to support a wide range of tasks across sub-domains of biomedicine. We also demonstrate that unsupervised learning brings novel insights into IS of biomedical literature and discovers information categories that are not present in any of the existing IS schemes. The annotated corpus and software are available at http://www.cl.cam.ac.uk/∼dk427/bio14info.html. © The Author 2014. Published by Oxford University Press. All rights reserved. 
Smart, Daniel J; Gill, Nicholas D
2013-03-01
The aims of the study were to determine if a supervised off-season conditioning program enhanced gains in physical characteristics compared with the same program performed in an unsupervised manner and to establish the persistence of the physical changes after a 6-month unsupervised competition period. Forty-four provincial representative adolescent rugby union players (age, mean ± SD, 15.3 ± 1.3 years) participated in a 15-week off-season conditioning program either under supervision from an experienced strength and conditioning coach or unsupervised. Measures of body composition, strength, vertical jump, speed, and anaerobic and aerobic running performance were taken, before, immediately after, and 6 months after the conditioning. Post conditioning program the supervised group had greater improvements in all strength measures than the unsupervised group, with small, moderate and large differences between the groups' changes for chin-ups (9.1%; ± 11.6%), bench-press (16.9%; ± 11.7%) and box-squat (50.4%; ± 20.9%) estimated 1RM respectively. Both groups showed trivial increases in mass; however increases in fat free mass were small and trivial for supervised and unsupervised players respectively. Strength declined in the supervised group while the unsupervised group had small increases during the competition phase, resulting in only a small difference between the long-term changes in box-squat 1RM (15.9%; ± 13.2%). The supervised group had further small increases in fat free mass resulting in a small difference (2.4%; ± 2.7%) in the long-term changes. The postconditioning differences between the 2 groups may have been a result of increased adherence and the attainment of higher training loads during supervised training. The lack of differences in strength after the competition period indicates that supervision should be maintained to reduce substantial decrements in performance.
Hoffman, Paul; Lambon Ralph, Matthew A; Rogers, Timothy T
2013-09-01
Semantic ambiguity is typically measured by summing the number of senses or dictionary definitions that a word has. Such measures are somewhat subjective and may not adequately capture the full extent of variation in word meaning, particularly for polysemous words that can be used in many different ways, with subtle shifts in meaning. Here, we describe an alternative, computationally derived measure of ambiguity based on the proposal that the meanings of words vary continuously as a function of their contexts. On this view, words that appear in a wide range of contexts on diverse topics are more variable in meaning than those that appear in a restricted set of similar contexts. To quantify this variation, we performed latent semantic analysis on a large text corpus to estimate the semantic similarities of different linguistic contexts. From these estimates, we calculated the degree to which the different contexts associated with a given word vary in their meanings. We term this quantity a word's semantic diversity (SemD). We suggest that this approach provides an objective way of quantifying the subtle, context-dependent variations in word meaning that are often present in language. We demonstrate that SemD is correlated with other measures of ambiguity and contextual variability, as well as with frequency and imageability. We also show that SemD is a strong predictor of performance in semantic judgments in healthy individuals and in patients with semantic deficits, accounting for unique variance beyond that of other predictors. SemD values for over 30,000 English words are provided as supplementary materials.
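The quantity described in this abstract can be illustrated with a small sketch. It assumes that each context in which a word appears has already been mapped to a vector (for example by latent semantic analysis), and it assumes that diversity is scored as the negative log of the mean pairwise cosine similarity between those context vectors. The abstract does not state the formula, so the function name `semantic_diversity`, this definition, and the toy vectors are illustrative assumptions only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def semantic_diversity(context_vectors):
    """Assumed score: -log(mean pairwise cosine) over a word's contexts.

    Requires at least two contexts and a positive mean cosine.
    """
    pairs = list(combinations(context_vectors, 2))
    mean_cos = sum(cosine(u, v) for u, v in pairs) / len(pairs)
    return -math.log(mean_cos)


# Hypothetical context vectors for two words (not taken from any corpus).
narrow_contexts = [[1.0, 0.1], [1.0, 0.2], [0.9, 0.1]]  # similar contexts
broad_contexts = [[1.0, 0.1], [0.2, 1.0], [0.7, 0.7]]   # varied contexts

narrow_score = semantic_diversity(narrow_contexts)
broad_score = semantic_diversity(broad_contexts)
```

Under these assumptions a word whose contexts point in similar directions gets a score near zero, while a word used across dissimilar contexts gets a larger score.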
Tian, Tian; Li, Chang; Xu, Jinkang; Ma, Jiayi
2018-03-18
Detecting urban areas from very high resolution (VHR) remote sensing images plays an important role in the field of Earth observation. The recently-developed deep convolutional neural networks (DCNNs), which can extract rich features from training data automatically, have achieved outstanding performance on many image classification databases. Motivated by this fact, we propose a new urban area detection method based on DCNNs in this paper. The proposed method mainly includes three steps: (i) a visual dictionary is obtained based on the deep features extracted by pre-trained DCNNs; (ii) urban words are learned from labeled images; (iii) the urban regions are detected in a new image based on the nearest dictionary word criterion. The qualitative and quantitative experiments on different datasets demonstrate that the proposed method can obtain a remarkable overall accuracy (OA) and kappa coefficient. Moreover, it can also strike a good balance between the true positive rate (TPR) and false positive rate (FPR).
ERIC Educational Resources Information Center
Amershi, Saleema; Conati, Cristina
2009-01-01
In this paper, we present a data-based user modeling framework that uses both unsupervised and supervised classification to build student models for exploratory learning environments. We apply the framework to build student models for two different learning environments and using two different data sources (logged interface and eye-tracking data).…
Unsupervised Discovery of Nonlinear Structure Using Contrastive Backpropagation
ERIC Educational Resources Information Center
Hinton, Geoffrey; Osindero, Simon; Welling, Max; Teh, Yee-Whye
2006-01-01
We describe a way of modeling high-dimensional data vectors by using an unsupervised, nonlinear, multilayer neural network in which the activity of each neuron-like unit makes an additive contribution to a global energy score that indicates how surprised the network is by the data vector. The connection weights that determine how the activity of…
ERIC Educational Resources Information Center
Protopapas, Athanassios; Skaloumbakas, Christos; Bali, Persefoni
2008-01-01
After reviewing past efforts related to computer-based reading disability (RD) assessment, we present a fully automated screening battery that evaluates critical skills relevant for RD diagnosis designed for unsupervised application in the Greek educational system. Psychometric validation in 301 children, 8-10 years old (grades 3 and 4; including…
NASA Astrophysics Data System (ADS)
Serb, Alexander; Bill, Johannes; Khiat, Ali; Berdan, Radu; Legenstein, Robert; Prodromakis, Themis
2016-09-01
In an increasingly data-rich world the need for developing computing systems that cannot only process, but ideally also interpret big data is becoming continuously more pressing. Brain-inspired concepts have shown great promise towards addressing this need. Here we demonstrate unsupervised learning in a probabilistic neural network that utilizes metal-oxide memristive devices as multi-state synapses. Our approach can be exploited for processing unlabelled data and can adapt to time-varying clusters that underlie incoming data by supporting the capability of reversible unsupervised learning. The potential of this work is showcased through the demonstration of successful learning in the presence of corrupted input data and probabilistic neurons, thus paving the way towards robust big-data processors.
Classification of earth terrain using polarimetric synthetic aperture radar images
NASA Technical Reports Server (NTRS)
Lim, H. H.; Swartz, A. A.; Yueh, H. A.; Kong, J. A.; Shin, R. T.; Van Zyl, J. J.
1989-01-01
Supervised and unsupervised classification techniques are developed and used to classify the earth terrain components from SAR polarimetric images of San Francisco Bay and Traverse City, Michigan. The supervised techniques include the Bayes classifiers, normalized polarimetric classification, and simple feature classification using discriminants such as the absolute and normalized magnitude response of individual receiver channel returns and the phase difference between receiver channels. An algorithm is developed as an unsupervised technique which classifies terrain elements based on the relationship between the orientation angle and the handedness of the transmitting and receiving polarization states. It is found that supervised classification produces the best results when accurate classifier training data are used, while unsupervised classification may be applied when training data are not available.
Spectral analysis of white ash response to emerald ash borer infestations
NASA Astrophysics Data System (ADS)
Calandra, Laura
The emerald ash borer (EAB) (Agrilus planipennis Fairmaire) is an invasive insect that has killed over 50 million ash trees in the US. The goal of this research was to establish a method to identify ash trees infested with EAB using remote sensing techniques at the leaf-level and tree crown level. First, a field-based study at the leaf-level used the range of spectral bands from the WorldView-2 sensor to determine if there was a significant difference between EAB-infested white ash (Fraxinus americana) and healthy leaves. Binary logistic regression models were developed using individual and combinations of wavelengths; the most successful model included 545 and 950 nm bands. The second half of this research employed imagery to identify healthy and EAB-infested trees, comparing pixel- and object-based methods by applying an unsupervised classification approach and a tree crown delineation algorithm, respectively. The pixel-based models attained the highest overall accuracies.
Interdisciplinary education approach to the human science
NASA Astrophysics Data System (ADS)
Szu, Harold; Zheng, Yufeng; Zhang, Nian
2012-06-01
We introduced human sciences as components, and integrated them together as an interdisciplinary endeavor over decades. This year, we built a website to systematically maintain the educational research service. We captured the human sciences in various components in the SPIE proceedings over the last decades, which included: (i) ears & eyes like adaptive wavelets, (ii) brain-like unsupervised learning independent component analysis (ICA); (iii) compressive sampling spatiotemporal sparse information processing, (iv) nanoengineering approach to sensing components, (v) systems biology measurements, and (vi) biomedical wellness applications. To serve the interdisciplinary community better, our approach is for the former recipients to invite the next recipients to deliver their review talks and panel discussions. Since only the former recipients of each component can lead the nomination committees and make the final selections, we also created a leadership award for which any conference attendee may submit a nomination, to be approved by the conference organization committee.
NASA Technical Reports Server (NTRS)
Jensen, John R.; Hodgson, Michael E.; Mackey, Halkard E., Jr.; Krabill, William
1987-01-01
Wetlands in a portion of the Savannah River swamp forest, the Steel Creek Delta, were mapped using April 26, 1985 high-resolution aircraft multispectral scanner (MSS) data. Due to the complex spectral characteristics of the wetland vegetation, it was necessary to implement several techniques in the classification of the MSS imagery of the Steel Creek Delta. In particular, when performing unsupervised classification, an iterative cluster busting technique was used which simplified the cluster labeling process. In addition to the MSS data, light detecting and ranging (LIDAR) data were acquired by National Aeronautics and Space Administration (NASA) personnel along two flightlines over the Steel Creek Delta. These data were registered with the wetland classification map and correlated. Statistical analyses demonstrated that the laser derived canopy height information was significantly correlated with the Steel Creek Delta wetland classes encountered along the profiling transect of the LIDAR data.
NASA Technical Reports Server (NTRS)
Langley, P. G.
1981-01-01
A method of relating different classifications at each stage of a multistage, multiresource inventory using remotely sensed imagery is discussed. A class transformation matrix allowing the conversion of a set of proportions at one stage, to a set of proportions at the subsequent stage through use of a linear model, is described. The technique was tested by applying it to Kershaw County, South Carolina. Unsupervised LANDSAT spectral classifications were correlated with interpretations of land use aerial photography, the correlations employed to estimate land use classifications using the linear model, and the land use proportions used to stratify current annual increment (CAI) field plot data to obtain a total CAI for the county. The estimate differed by 1% from the published figure for land use. Potential sediment loss and a variety of land use classifications were also obtained.
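The linear model mentioned in this abstract can be sketched as a matrix-vector product. The example assumes that each column of the class transformation matrix gives the share of one earlier-stage class (for instance a spectral class) that falls into each later-stage class (for instance a land use class), so every column sums to 1. The function name, class labels, and all numbers are hypothetical and are not taken from the Kershaw County study.

```python
def transform_proportions(matrix, proportions):
    """Convert class proportions at one stage to proportions at the next.

    matrix[i][j] is the assumed share of earlier-stage class j that
    belongs to later-stage class i; proportions[j] is the area share
    of earlier-stage class j.
    """
    return [
        sum(row[j] * proportions[j] for j in range(len(proportions)))
        for row in matrix
    ]


# Hypothetical matrix: 3 spectral classes mapped to 2 land use classes.
class_matrix = [
    [0.9, 0.3, 0.0],  # share of each spectral class that is "forest"
    [0.1, 0.7, 1.0],  # share of each spectral class that is "non-forest"
]
spectral_proportions = [0.5, 0.3, 0.2]

land_use_proportions = transform_proportions(class_matrix, spectral_proportions)
```

Because each column sums to 1, the output proportions sum to 1 whenever the input proportions do, which is a quick sanity check on any such matrix.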
Gender and neural substrates subserving implicit processing of death-related linguistic cues.
Qin, Jungang; Shi, Zhenhao; Ma, Yina; Han, Shihui
2018-02-01
Our recent functional magnetic resonance imaging study revealed decreased activities in the anterior cingulate cortex (ACC) and bilateral insula for women during the implicit processing of death-related linguistic cues. Current work tested whether aforementioned activities are common for women and men and explored potential gender differences. We scanned twenty males while they performed a color-naming task on death-related, negative-valence, and neutral-valence words. Whole-brain analysis showed increased left frontal activity and decreased activities in the ACC and bilateral insula to death-related versus negative-valence words for both men and women. However, relative to women, men showed greater increased activity in the left middle frontal cortex and decreased activity in the right cerebellum to death-related versus negative-valence words. The results suggest, while implicit processing of death-related words is characterized with weakened sense of oneself for both women and men, men may recruit stronger cognitive regulation of emotion than women.
Barraza, Paulo; Chavez, Mario; Rodríguez, Eugenio
2016-01-01
Similar to linguistic stimuli, music can also prime the meaning of a subsequent word. However, it is so far unknown what is the brain dynamics underlying the semantic priming effect induced by music, and its relation to language. To elucidate these issues, we compare the brain oscillatory response to visual words that have been semantically primed either by a musical excerpt or by an auditory sentence. We found that semantic violation between music-word pairs triggers a classical ERP N400, and induces a sustained increase of long-distance theta phase synchrony, along with a transient increase of local gamma activity. Similar results were observed after linguistic semantic violation except for gamma activity, which increased after semantic congruence between sentence-word pairs. Our findings indicate that local gamma activity is a neural marker that signals different ways of semantic processing between music and language, revealing the dynamic and self-organized nature of the semantic processing. Copyright © 2015 Elsevier Inc. All rights reserved.
[Lights, art, science - action!].
Lopes, Thelma
2005-01-01
The article offers some reflections on the main interactions between theater, science, and technology down through the history of theater. Based on our experience at "Science in the Spotlight", part of the Casa de Oswaldo Cruz's Museum of Life, we discuss how these interactions can be part of a science museum's daily activities. We use the word 'science' in its broad sense, encompassing not only the natural but human sciences as well; likewise, we use the word 'technology' as it relates to applied science. Art and science are understood here as creative processes, as ways of representing the world and expressing human knowledge.
Cross domains Arabic named entity recognition system
NASA Astrophysics Data System (ADS)
Al-Ahmari, S. Saad; Abdullatif Al-Johar, B.
2016-07-01
Named Entity Recognition (NER) plays an important role in many Natural Language Processing (NLP) applications such as Information Extraction (IE), Question Answering (QA), Text Clustering, Text Summarization and Word Sense Disambiguation. This paper presents the development and implementation of a domain-independent system to recognize three types of Arabic named entities. The system is based on a set of domain-independent grammar rules along with an Arabic part-of-speech tagger, in addition to gazetteers and lists of trigger words. The experimental results show that the system performed as well as other systems, with better results in some cases on cross-domain corpora.
ERIC Educational Resources Information Center
Larson, Wendy Ann
1990-01-01
Effective student recruitment writing can evoke a sense of what an institution stands for, who the students and faculty are, and what it's like to study on the campus. Clear and accessible information can give prospects the nudge they need to read on--and then apply and enroll. (MLW)
Chinese Orthographic Decomposition and Logographic Structure
ERIC Educational Resources Information Center
Cheng, Chao-Ming; Lin, Shan-Yuan
2013-01-01
"Chinese orthographic decomposition" refers to a sense of uncertainty about the writing of a well-learned Chinese character following a prolonged inspection of the character. This study investigated the decomposition phenomenon in a test situation in which Chinese characters were repeatedly presented in a word context and assessed…
ERIC Educational Resources Information Center
Snyder, Sarah
A booklet for limited English speakers on money management provides information on savings accounts, checking accounts, choosing a bank, and the basics of budgeting. Cartoons, questions about the message in cartoons and narrative passages, checklists on things to consider, and the phonetic pronunciation of key words are presented. Specific topics…
Bright Sneezes and Dark Coughs, Loud Sunlight and Soft Moonlight.
ERIC Educational Resources Information Center
Marks, Lawrence E.
1982-01-01
In a series of four experiments, subjects used scales of loudness, pitch, and brightness to evaluate the meanings of a variety of synesthetic metaphors--expressions in which words or phrases describing experiences proper to one sense modality transfer their meaning to another modality. (Author/PN)
ERIC Educational Resources Information Center
Harper, G. H.
1985-01-01
Argues that the meaning of the word "symbiosis" be standardized and that it should be used in a broad sense. Also criticizes the orthodox teaching of general principles in this subject and recommends that priority be given to continuity, intimacy, and associated adaptations, rather than to the harm/benefit relationship. (Author/JN)
Kim, Eun-Young; Kim, Suhn-Yeop; Oh, Duck-Won
2012-02-01
To investigate the effect of supervised and unsupervised pelvic floor muscle exercises utilizing trunk stabilization for treating postpartum urinary incontinence and to compare the outcomes. Randomized, single-blind controlled study. Outpatient rehabilitation hospital. Eighteen subjects with postpartum urinary incontinence. Subjects were randomized to either a supervised training group with verbal instruction from a physiotherapist, or an unsupervised training group after undergoing a supervised demonstration session. Bristol Female Lower Urinary Tract Symptom questionnaire (urinary symptoms and quality of life) and vaginal function test (maximal vaginal squeeze pressure and holding time) using a perineometer. The change values for urinary symptoms (-27.22 ± 6.20 versus -18.22 ± 5.49), quality of life (-5.33 ± 2.96 versus -1.78 ± 3.93), total score (-32.56 ± 8.17 versus -20.00 ± 6.67), maximal vaginal squeeze pressure (18.96 ± 9.08 versus 2.67 ± 3.64 mmHg), and holding time (11.32 ± 3.17 versus 5.72 ± 2.29 seconds) were more improved in the supervised group than in the unsupervised group (P < 0.05). In the supervised group, significant differences were found for all variables between pre- and post-test values (P < 0.01), whereas the unsupervised group showed significant differences for urinary symptom score, total score and holding time between the pre- and post-test results (P < 0.05). These findings suggest that exercising the pelvic floor muscles by utilizing trunk stabilization under physiotherapist supervision may be beneficial for the management of postpartum urinary incontinence.
The Hands-On Universe: Making Sense of the Universe with All Your Senses
NASA Astrophysics Data System (ADS)
Trotta, R.
2018-02-01
For the past four years, the Hands-On Universe public engagement programme has explored unconventional, interactive and multi-sensorial ways of communicating complex ideas in cosmology and astrophysics to a wide variety of audiences. The programme lead, Roberto Trotta, has reached thousands of people through food-based workshops, art and science collaborations and a book written using only the 1000 most common words in the English language. In this article, Roberto reflects in first person on what has worked well in the programme, and what has not.
2015-08-28
AFRL-RX-WP-JA-2016-0251. Compositional Control of the Mixed Anion Alloys in Gallium-Free InAs/InAsSb Superlattice Materials for Infrared Sensing (Postprint). Contract number FA8650-07-D-5800-0006. proceedings.spiedigitallibrary.org, doi: 10.1117/12.2186188. Abstract: Gallium (Ga)-free InAs/InAsSb superlattices (SLs) are being actively explored for…
ERIC Educational Resources Information Center
Siennick, Sonja E.; Osgood, D. Wayne
2012-01-01
Companions are central to explanations of the risky nature of unstructured and unsupervised socializing, yet we know little about whom adolescents are with when hanging out. We examine predictors of how often friendship dyads hang out via multilevel analyses of longitudinal friendship-level data on over 5,000 middle schoolers. Adolescents hang out…
Teacher and learner: Supervised and unsupervised learning in communities.
Shafto, Michael G; Seifert, Colleen M
2015-01-01
How far can teaching methods go to enhance learning? Optimal methods of teaching have been considered in research on supervised and unsupervised learning. Locally optimal methods are usually hybrids of teaching and self-directed approaches. The costs and benefits of specific methods have been shown to depend on the structure of the learning task, the learners, the teachers, and the environment.
NASA Astrophysics Data System (ADS)
Chen, B.; Chehdi, K.; De Oliveria, E.; Cariou, C.; Charbonnier, B.
2015-10-01
In this paper a new unsupervised top-down hierarchical classification method to partition airborne hyperspectral images is proposed. The unsupervised approach is preferred because the difficulty of area access and the human and financial resources required to obtain ground truth data, constitute serious handicaps especially over large areas which can be covered by airborne or satellite images. The developed classification approach allows i) a successive partitioning of data into several levels or partitions in which the main classes are first identified, ii) an estimation of the number of classes automatically at each level without any end user help, iii) a nonsystematic subdivision of all classes of a partition Pj to form a partition Pj+1, iv) a stable partitioning result of the same data set from one run of the method to another. The proposed approach was validated on synthetic and real hyperspectral images related to the identification of several marine algae species. In addition to highly accurate and consistent results (correct classification rate over 99%), this approach is completely unsupervised. It estimates at each level, the optimal number of classes and the final partition without any end user intervention.
Shan, Ying; Sawhney, Harpreet S; Kumar, Rakesh
2008-04-01
This paper proposes a novel unsupervised algorithm learning discriminative features in the context of matching road vehicles between two non-overlapping cameras. The matching problem is formulated as a same-different classification problem, which aims to compute the probability of vehicle images from two distinct cameras being from the same vehicle or different vehicle(s). We employ a novel measurement vector that consists of three independent edge-based measures and their associated robust measures computed from a pair of aligned vehicle edge maps. The weight of each measure is determined by an unsupervised learning algorithm that optimally separates the same-different classes in the combined measurement space. This is achieved with a weak classification algorithm that automatically collects representative samples from same-different classes, followed by a more discriminative classifier based on Fisher's Linear Discriminants and Gibbs Sampling. The robustness of the match measures and the use of unsupervised discriminant analysis in the classification ensure that the proposed method performs consistently in the presence of missing/false features, temporally and spatially changing illumination conditions, and systematic misalignment caused by different camera configurations. Extensive experiments based on real data of over 200 vehicles at different times of day demonstrate promising results.
NASA Astrophysics Data System (ADS)
Salman, S. S.; Abbas, W. A.
2018-05-01
The goal of the study is to examine resolution enhancement and its effect on classification methods that use spectral band information, with specific and quantitative approaches. This study introduces a method to enhance the resolution of Landsat 8 imagery by combining the spectral bands of 30-meter resolution with the panchromatic band 8 of 15-meter resolution, because multispectral imagery is important for extracting land cover. Classification methods are used to classify several land-cover types recorded in OLI-8 imagery. Data mining methods can be classified as either supervised or unsupervised. In supervised methods there is a particular predefined target, which means that the algorithm learns which values of the target are associated with which values of the predictor sample. The K-nearest neighbors and maximum likelihood algorithms are examined in this work as supervised methods. In unsupervised methods, on the other hand, no sample is identified as the target; the algorithm searches for structure and patterns among all the variables. The Fuzzy C-means clustering method represents the unsupervised methods here. The NDVI vegetation index is used to compare the results of the classification methods; the percentage of dense vegetation given by the maximum likelihood method is the best result.
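The NDVI comparison mentioned in this abstract rests on the standard formula NDVI = (NIR − Red) / (NIR + Red); for Landsat 8 OLI the red and near-infrared bands are bands 4 and 5. The sketch below computes NDVI per pixel and the percentage of pixels above a cut-off. The 0.5 threshold for "dense vegetation", the function names, and the reflectance values are assumptions for illustration; the abstract does not state the threshold the authors used.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def dense_vegetation_percent(nir_band, red_band, threshold=0.5):
    """Percentage of pixels whose NDVI exceeds an assumed threshold."""
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    dense = sum(1 for v in values if v > threshold)
    return 100.0 * dense / len(values)


# Hypothetical surface reflectance for four pixels.
nir_pixels = [0.5, 0.3, 0.2, 0.6]
red_pixels = [0.1, 0.25, 0.2, 0.1]

percent_dense = dense_vegetation_percent(nir_pixels, red_pixels)
```

The same percentage could be computed separately over the pixels that each classifier labels as vegetation, giving a common yardstick for comparing the classification maps.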
Sadeghi, Zahra; Testolin, Alberto
2017-08-01
In humans, efficient recognition of written symbols is thought to rely on a hierarchical processing system, where simple features are progressively combined into more abstract, high-level representations. Here, we present a computational model of Persian character recognition based on deep belief networks, where increasingly more complex visual features emerge in a completely unsupervised manner by fitting a hierarchical generative model to the sensory data. Crucially, high-level internal representations emerging from unsupervised deep learning can be easily read out by a linear classifier, achieving state-of-the-art recognition accuracy. Furthermore, we tested the hypothesis that handwritten digits and letters share many common visual features: A generative model that captures the statistical structure of the letters distribution should therefore also support the recognition of written digits. To this aim, deep networks trained on Persian letters were used to build high-level representations of Persian digits, which were indeed read out with high accuracy. Our simulations show that complex visual features, such as those mediating the identification of Persian symbols, can emerge from unsupervised learning in multilayered neural networks and can support knowledge transfer across related domains.
Penalized unsupervised learning with outliers
Witten, Daniela M.
2013-01-01
We consider the problem of performing unsupervised learning in the presence of outliers – that is, observations that do not come from the same distribution as the rest of the data. It is known that in this setting, standard approaches for unsupervised learning can yield unsatisfactory results. For instance, in the presence of severe outliers, K-means clustering will often assign each outlier to its own cluster, or alternatively may yield distorted clusters in order to accommodate the outliers. In this paper, we take a new approach to extending existing unsupervised learning techniques to accommodate outliers. Our approach is an extension of a recent proposal for outlier detection in the regression setting. We allow each observation to take on an “error” term, and we penalize the errors using a group lasso penalty in order to encourage most of the observations’ errors to exactly equal zero. We show that this approach can be used in order to develop extensions of K-means clustering and principal components analysis that result in accurate outlier detection, as well as improved performance in the presence of outliers. These methods are illustrated in a simulation study and on two gene expression data sets, and connections with M-estimation are explored. PMID:23875057
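The per-observation "error" term described above can be illustrated with a small sketch. This is not the author's implementation: it assumes the objective is the sum of squared distances between each error-corrected point and its cluster centre, plus `lam` times the sum of the Euclidean norms of the errors, and it minimises that objective by simple block coordinate descent. The names `soft_threshold`, `penalized_kmeans`, `lam`, and `n_iter` are hypothetical.

```python
import math


def soft_threshold(residual, lam):
    """Minimise ||residual - e||^2 + lam * ||e||_2 over e (group soft-thresholding)."""
    norm = math.sqrt(sum(v * v for v in residual))
    if norm == 0.0:
        return [0.0] * len(residual)
    # Residuals shorter than lam / 2 get an error of exactly zero.
    scale = max(0.0, 1.0 - lam / (2.0 * norm))
    return [scale * v for v in residual]


def penalized_kmeans(points, init_centers, lam, n_iter=20):
    """Assumed sketch: K-means where each point may carry a penalised error vector."""
    centers = [list(c) for c in init_centers]
    errors = [[0.0] * len(p) for p in points]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Work with error-corrected points x_i - e_i.
        adjusted = [[a - e for a, e in zip(p, err)] for p, err in zip(points, errors)]
        # Step 1: assign each corrected point to its nearest centre.
        for i, q in enumerate(adjusted):
            labels[i] = min(
                range(len(centers)),
                key=lambda k: sum((a - c) ** 2 for a, c in zip(q, centers[k])),
            )
        # Step 2: each centre becomes the mean of its corrected members.
        for k in range(len(centers)):
            members = [q for q, lab in zip(adjusted, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
        # Step 3: update each error by soft-thresholding the raw residual.
        for i, p in enumerate(points):
            residual = [a - c for a, c in zip(p, centers[labels[i]])]
            errors[i] = soft_threshold(residual, lam)
    return labels, centers, errors
```

Under these assumptions, observations whose error vector ends up nonzero are the ones flagged as outliers; a larger `lam` flags fewer points.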
A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data.
Goldstein, Markus; Uchida, Seiichi
2016-01-01
Anomaly detection is the process of identifying unexpected items or events in datasets, which differ from the norm. In contrast to standard classification tasks, anomaly detection is often applied on unlabeled data, taking only the internal structure of the dataset into account. This challenge is known as unsupervised anomaly detection and is addressed in many practical applications, for example in network intrusion detection, fraud detection as well as in the life science and medical domain. Dozens of algorithms have been proposed in this area, but unfortunately the research community still lacks a comparative universal evaluation as well as common publicly available datasets. These shortcomings are addressed in this study, where 19 different unsupervised anomaly detection algorithms are evaluated on 10 different datasets from multiple application domains. By publishing the source code and the datasets, this paper aims to be a new well-founded basis for unsupervised anomaly detection research. Additionally, this evaluation reveals the strengths and weaknesses of the different approaches for the first time. Besides the anomaly detection performance, computational effort, the impact of parameter settings as well as the global/local anomaly detection behavior are outlined. As a conclusion, we give advice on algorithm selection for typical real-world tasks.
Edwards, Darren J; Wood, Rodger
2016-01-01
This study explored over-selectivity (executive dysfunction) using a standard unsupervised categorization task. Over-selectivity has been demonstrated using supervised categorization procedures (where training is given); however, little has been done in the way of unsupervised categorization (without training). A standard unsupervised categorization task was used to assess levels of over-selectivity in a traumatic brain injury (TBI) population. Individuals with TBI were selected from the Tertiary Traumatic Brain Injury Clinic at Swansea University and were asked to categorize two-dimensional items (pictures on cards), into groups that they felt were most intuitive, and without any learning (feedback from experimenter). This was compared against categories made by a control group for the same task. The findings of this study demonstrate that individuals with TBI had deficits for both easy and difficult categorization sets, as indicated by a larger amount of one-dimensional sorting compared to control participants. Deficits were significantly greater for the easy condition. The implications of these findings are discussed in the context of over-selectivity, and the processes that underlie this deficit. Also, the implications for using this procedure as a screening measure for over-selectivity in TBI are discussed.
Accuracy of latent-variable estimation in Bayesian semi-supervised learning.
Yamazaki, Keisuke
2015-09-01
Hierarchical probabilistic models, such as Gaussian mixture models, are widely used for unsupervised learning tasks. These models consist of observable and latent variables, which represent the observable data and the underlying data-generation process, respectively. Unsupervised learning tasks, such as cluster analysis, are regarded as estimations of latent variables based on the observable ones. The estimation of latent variables in semi-supervised learning, where some labels are observed, will be more precise than that in unsupervised, and one of the concerns is to clarify the effect of the labeled data. However, there has not been sufficient theoretical analysis of the accuracy of the estimation of latent variables. In a previous study, a distribution-based error function was formulated, and its asymptotic form was calculated for unsupervised learning with generative models. It has been shown that, for the estimation of latent variables, the Bayes method is more accurate than the maximum-likelihood method. The present paper reveals the asymptotic forms of the error function in Bayesian semi-supervised learning for both discriminative and generative models. The results show that the generative model, which uses all of the given data, performs better when the model is well specified. Copyright © 2015 Elsevier Ltd. All rights reserved.
Segmentation of fluorescence microscopy cell images using unsupervised mining.
Du, Xian; Dua, Sumeet
2010-05-28
The accurate measurement of cell and nuclei contours is critical for the sensitive and specific detection of changes in normal cells in several medical informatics disciplines. Within microscopy, this task is facilitated using fluorescence cell stains, and segmentation is often the first step in such approaches. Due to the complex nature of cell issues and problems inherent to microscopy, unsupervised mining approaches of clustering can be incorporated in the segmentation of cells. In this study, we have developed and evaluated the performance of multiple unsupervised data mining techniques in cell image segmentation. We adapt four distinctive, yet complementary, methods for unsupervised learning, including those based on k-means clustering, EM, Otsu's threshold, and GMAC. Validation measures are defined, and the performance of the techniques is evaluated both quantitatively and qualitatively using synthetic and recently published real data. Experimental results demonstrate that k-means, Otsu's threshold, and GMAC perform similarly, and have more precise segmentation results than EM. We report that EM has higher recall values and lower precision results from under-segmentation due to its Gaussian model assumption. We also demonstrate that these methods need spatial information to segment complex real cell images with a high degree of efficacy, as expected in many medical informatics applications.
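Of the four methods named above, Otsu's threshold is the simplest to show in a few lines. The following is a generic textbook sketch, not the adaptation evaluated in the study: it assumes integer grey levels in the range 0 to `levels - 1` and picks the threshold that maximises the between-class variance. The function name and the `levels` parameter are illustrative.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t maximising between-class variance (class 0: pixels <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0      # number of pixels at or below the candidate threshold
    sum0 = 0    # sum of grey levels at or below the candidate threshold
    best_t, best_score = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is not defined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance w0 * w1 * (mean0 - mean1)^2.
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

A foreground mask for a flattened image would then be `[p > otsu_threshold(pixels) for p in pixels]`. Because the method uses only the histogram, it carries no spatial information, which is consistent with the abstract's remark that complex real images need spatial cues.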
Lacroix, André; Hortobágyi, Tibor; Beurskens, Rainer; Granacher, Urs
2017-11-01
Balance and resistance training can improve healthy older adults' balance and muscle strength. Delivering such exercise programs at home without supervision may facilitate participation for older adults because they do not have to leave their homes. To date, no systematic literature analysis has been conducted to determine if supervision affects the effectiveness of these programs to improve healthy older adults' balance and muscle strength/power. The objective of this systematic review and meta-analysis was to quantify the effectiveness of supervised vs. unsupervised balance and/or resistance training programs on measures of balance and muscle strength/power in healthy older adults. In addition, the impact of supervision on training-induced adaptive processes was evaluated in the form of dose-response relationships by analyzing randomized controlled trials that compared supervised with unsupervised trials. A computerized systematic literature search was performed in the electronic databases PubMed, Web of Science, and SportDiscus to detect articles examining the role of supervision in balance and/or resistance training in older adults. The initially identified 6041 articles were systematically screened. Studies were included if they examined balance and/or resistance training in adults aged ≥65 years with no relevant diseases and registered at least one behavioral balance (e.g., time during single leg stance) and/or muscle strength/power outcome (e.g., time for 5-Times-Chair-Rise-Test). Finally, 11 studies were eligible for inclusion in this meta-analysis. Weighted mean standardized mean differences between subjects (SMDbs) of supervised vs. unsupervised balance/resistance training studies were calculated. The included studies were coded for the following variables: number of participants, sex, age, number and type of interventions, type of balance/strength tests, and change (%) from pre- to post-intervention values.
Additionally, we coded training according to the following modalities: period, frequency, volume, modalities of supervision (i.e., number of supervised/unsupervised sessions within the supervised or unsupervised training groups, respectively). Heterogeneity was computed using I² and χ² statistics. The methodological quality of the included studies was evaluated using the Physiotherapy Evidence Database scale. Our analyses revealed that in older adults, supervised balance/resistance training was superior compared with unsupervised balance/resistance training in improving measures of static steady-state balance (mean SMDbs = 0.28, p = 0.39), dynamic steady-state balance (mean SMDbs = 0.35, p = 0.02), proactive balance (mean SMDbs = 0.24, p = 0.05), balance test batteries (mean SMDbs = 0.53, p = 0.02), and measures of muscle strength/power (mean SMDbs = 0.51, p = 0.04). Regarding the examined dose-response relationships, our analyses showed that a number of 10-29 additional supervised sessions in the supervised training groups compared with the unsupervised training groups resulted in the largest effects for static steady-state balance (mean SMDbs = 0.35), dynamic steady-state balance (mean SMDbs = 0.37), and muscle strength/power (mean SMDbs = 1.12). Further, ≥30 additional supervised sessions in the supervised training groups were needed to produce the largest effects on proactive balance (mean SMDbs = 0.30) and balance test batteries (mean SMDbs = 0.77). Effects in favor of supervised programs were larger for studies that did not include any supervised sessions in their unsupervised programs (mean SMDbs: 0.28-1.24) compared with studies that implemented a few supervised sessions in their unsupervised programs (e.g., three supervised sessions throughout the entire intervention program; SMDbs: -0.06 to 0.41).
The present findings have to be interpreted with caution because of the low number of eligible studies and the moderate methodological quality of the included studies, which is indicated by a median Physiotherapy Evidence Database scale score of 5. Furthermore, we indirectly compared dose-response relationships across studies and not from single controlled studies. Our analyses suggest that supervised balance and/or resistance training improved measures of balance and muscle strength/power to a greater extent than unsupervised programs in older adults. Owing to the small number of available studies, we were unable to establish a clear dose-response relationship with regard to the impact of supervision. However, the positive effects of supervised training are particularly prominent when compared with completely unsupervised training programs. It is therefore recommended to include supervised sessions (i.e., two out of three sessions/week) in balance/resistance training programs to effectively improve balance and muscle strength/power in older adults.
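The weighted mean effect and the heterogeneity statistic mentioned in this abstract can be sketched as follows. The abstract does not state the weighting scheme, so this example assumes a standard fixed-effect, inverse-variance weighting with Cochran's Q and the usual I² formula; the function name and the input numbers are hypothetical and are not data from the review.

```python
def pooled_smd(effects, variances):
    """Inverse-variance weighted mean SMD, Cochran's Q, and I^2 in percent (assumed scheme)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    # Cochran's Q: weighted squared deviations of study effects from the pooled effect.
    q = sum(w * (d - pooled) ** 2 for w, d in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at zero.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical study-level SMDs and their variances, for illustration only.
pooled, q, i2 = pooled_smd([0.2, 0.5, 0.8], [0.04, 0.04, 0.04])
print(round(pooled, 2), round(q, 2), round(i2, 1))
```

With equal variances the pooled value is simply the arithmetic mean of the study effects; unequal variances shift it toward the more precise studies.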
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research.
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-Etriby, Sherif
2016-03-11
Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifiers-that we proposed earlier-improves the word recognition accuracy. Further improvements are achieved, by using a vocabulary of the 50,000 most common Arabic words for error correction.
Synthesis of Common Arabic Handwritings to Aid Optical Character Recognition Research
Dinges, Laslo; Al-Hamadi, Ayoub; Elzobi, Moftah; El-etriby, Sherif
2016-01-01
Document analysis tasks such as pattern recognition, word spotting or segmentation, require comprehensive databases for training and validation. Not only variations in writing style but also the used list of words is of importance in the case that training samples should reflect the input of a specific area of application. However, generation of training samples is expensive in the sense of manpower and time, particularly if complete text pages including complex ground truth are required. This is why there is a lack of such databases, especially for Arabic, the second most popular language. However, Arabic handwriting recognition involves different preprocessing, segmentation and recognition methods. Each requires particular ground truth or samples to enable optimal training and validation, which are often not covered by the currently available databases. To overcome this issue, we propose a system that synthesizes Arabic handwritten words and text pages and generates corresponding detailed ground truth. We use these syntheses to validate a new, segmentation based system that recognizes handwritten Arabic words. We found that a modification of an Active Shape Model based character classifiers—that we proposed earlier—improves the word recognition accuracy. Further improvements are achieved, by using a vocabulary of the 50,000 most common Arabic words for error correction. PMID:26978368
Unsupervised classification of multivariate geostatistical data: Two algorithms
NASA Astrophysics Data System (ADS)
Romary, Thomas; Ors, Fabien; Rivoirard, Jacques; Deraisme, Jacques
2015-12-01
With the increasing development of remote sensing platforms and the evolution of sampling facilities in mining and oil industry, spatial datasets are becoming increasingly large, inform a growing number of variables and cover wider and wider areas. Therefore, it is often necessary to split the domain of study to account for radically different behaviors of the natural phenomenon over the domain and to simplify the subsequent modeling step. The definition of these areas can be seen as a problem of unsupervised classification, or clustering, where we try to divide the domain into homogeneous domains with respect to the values taken by the variables in hand. The application of classical clustering methods, designed for independent observations, does not ensure the spatial coherence of the resulting classes. Image segmentation methods, based on e.g. Markov random fields, are not adapted to irregularly sampled data. Other existing approaches, based on mixtures of Gaussian random functions estimated via the expectation-maximization algorithm, are limited to reasonable sample sizes and a small number of variables. In this work, we propose two algorithms based on adaptations of classical algorithms to multivariate geostatistical data. Both algorithms are model free and can handle large volumes of multivariate, irregularly spaced data. The first one proceeds by agglomerative hierarchical clustering. The spatial coherence is ensured by a proximity condition imposed for two clusters to merge. This proximity condition relies on a graph organizing the data in the coordinates space. The hierarchical algorithm can then be seen as a graph-partitioning algorithm. Following this interpretation, a spatial version of the spectral clustering algorithm is also proposed. The performances of both algorithms are assessed on toy examples and a mining dataset.
NASA Astrophysics Data System (ADS)
Mafanya, Madodomzi; Tsele, Philemon; Botai, Joel; Manyama, Phetole; Swart, Barend; Monate, Thabang
2017-07-01
Invasive alien plants (IAPs) not only pose a serious threat to biodiversity and water resources but also have impacts on human and animal wellbeing. To support decision making in IAPs monitoring, semi-automated image classifiers which are capable of extracting valuable information in remotely sensed data are vital. This study evaluated the mapping accuracies of supervised and unsupervised image classifiers for mapping Harrisia pomanensis (a cactus plant commonly known as the Midnight Lady) using two interlinked evaluation strategies i.e. point and area based accuracy assessment. Results of the point-based accuracy assessment show that with reference to 219 ground control points, the supervised image classifiers (i.e. Maxver and Bhattacharya) mapped H. pomanensis better than the unsupervised image classifiers (i.e. K-mediuns, Euclidian Length and Isoseg). In this regard, user and producer accuracies were 82.4% and 84% respectively for the Maxver classifier. The user and producer accuracies for the Bhattacharya classifier were 90% and 95.7%, respectively. Though the Maxver produced a higher overall accuracy and Kappa estimate than the Bhattacharya classifier, the Maxver Kappa estimate of 0.8305 is not significantly (statistically) greater than the Bhattacharya Kappa estimate of 0.8088 at a 95% confidence interval. The area based accuracy assessment results show that the Bhattacharya classifier estimated the spatial extent of H. pomanensis with an average mapping accuracy of 86.1% whereas the Maxver classifier only gave an average mapping accuracy of 65.2%. Based on these results, the Bhattacharya classifier is therefore recommended for mapping H. pomanensis. These findings will aid in the algorithm choice making for the development of a semi-automated image classification system for mapping IAPs.
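The user's accuracy, producer's accuracy, overall accuracy, and Kappa estimate reported above are all derived from a confusion matrix. The sketch below shows the standard formulas under one assumed convention: rows are the classes assigned by the classifier and columns are the reference (ground) classes. The function name and the example matrix are hypothetical and unrelated to the study's 219 control points; every class is assumed to have at least one mapped and one reference sample.

```python
def accuracy_report(matrix):
    """Overall accuracy, Cohen's kappa, user's and producer's accuracy per class.

    matrix[i][j] = number of samples mapped as class i whose reference class is j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_tot = [sum(row) for row in matrix]                               # mapped totals
    col_tot = [sum(matrix[i][j] for i in range(k)) for j in range(k)]    # reference totals
    diag = [matrix[i][i] for i in range(k)]

    overall = sum(diag) / n
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_tot, col_tot)) / (n * n)
    kappa = (overall - expected) / (1.0 - expected)
    users = [d / r for d, r in zip(diag, row_tot)]       # 1 - commission error
    producers = [d / c for d, c in zip(diag, col_tot)]   # 1 - omission error
    return overall, kappa, users, producers


# Hypothetical two-class example (target species vs. everything else).
overall, kappa, users, producers = accuracy_report([[45, 5], [10, 40]])
print(overall, round(kappa, 3), users, [round(p, 3) for p in producers])
```

User's accuracy answers "how often is a mapped pixel of this class correct", while producer's accuracy answers "how often is a reference pixel of this class found", which is why the two can differ for the same classifier.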
The Livermore Brain: Massive Deep Learning Networks Enabled by High Performance Computing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, Barry Y.
The proliferation of inexpensive sensor technologies like the ubiquitous digital image sensors has resulted in the collection and sharing of vast amounts of unsorted and unexploited raw data. Companies and governments who are able to collect and make sense of large datasets to help them make better decisions more rapidly will have a competitive advantage in the information era. Machine Learning technologies play a critical role for automating the data understanding process; however, to be maximally effective, useful intermediate representations of the data are required. These representations or “features” are transformations of the raw data into a form where patterns are more easily recognized. Recent breakthroughs in Deep Learning have made it possible to learn these features from large amounts of labeled data. The focus of this project is to develop and extend Deep Learning algorithms for learning features from vast amounts of unlabeled data and to develop the HPC neural network training platform to support the training of massive network models. This LDRD project succeeded in developing new unsupervised feature learning algorithms for images and video and created a scalable neural network training toolkit for HPC. Additionally, this LDRD helped create the world’s largest freely-available image and video dataset supporting open multimedia research and used this dataset for training our deep neural networks. This research helped LLNL capture several work-for-others (WFO) projects, attract new talent, and establish collaborations with leading academic and commercial partners. Finally, this project demonstrated the successful training of the largest unsupervised image neural network using HPC resources and helped establish LLNL leadership at the intersection of Machine Learning and HPC research.
Evaluation of eelgrass beds mapping using a high-resolution airborne multispectral scanner
Su, H.; Karna, D.; Fraim, E.; Fitzgerald, M.; Dominguez, R.; Myers, J.S.; Coffland, B.; Handley, L.R.; Mace, T.
2006-01-01
Eelgrass (Zostera marina) can provide vital ecological functions in stabilizing sediments, influencing current dynamics, and contributing significant amounts of biomass to numerous food webs in coastal ecosystems. Mapping eelgrass beds is important for coastal water and nearshore estuarine monitoring, management, and planning. This study demonstrated the possible use of high spatial (approximately 5 m) and temporal (maximum low tide) resolution airborne multispectral scanner on mapping eelgrass beds in Northern Puget Sound, Washington. A combination of supervised and unsupervised classification approaches was performed on the multispectral scanner imagery. A normalized difference vegetation index (NDVI) derived from the red and near-infrared bands and ancillary spatial information were used to extract and mask eelgrass beds and other submerged aquatic vegetation (SAV) in the study area. We evaluated the resulting thematic map (geocoded, classified image) against a conventional aerial photograph interpretation using 260 point locations randomly stratified over five defined classes from the thematic map. We achieved an overall accuracy of 92 percent with 0.92 Kappa Coefficient in the study area. This study demonstrates that the airborne multispectral scanner can be useful for mapping eelgrass beds at a local or regional scale, especially in regions for which optical remote sensing from space is constrained by climatic and tidal conditions. © 2006 American Society for Photogrammetry and Remote Sensing.
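The NDVI used above to extract and mask vegetation has a simple closed form, (NIR - Red) / (NIR + Red). The sketch below applies it per pixel to two small reflectance grids and builds a boolean mask. The function names and the 0.3 cut-off are assumptions for illustration; the abstract does not report the threshold used in the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel; 0.0 when both bands are zero."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Boolean mask of pixels whose NDVI exceeds an assumed, illustrative threshold."""
    return [
        [ndvi(n, r) > threshold for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical 2 x 2 reflectance values for the near-infrared and red bands.
nir_band = [[0.50, 0.20], [0.45, 0.05]]
red_band = [[0.10, 0.20], [0.08, 0.04]]
print(vegetation_mask(nir_band, red_band))
```

Values near 1 indicate dense green vegetation, values near 0 indicate bare surfaces, and negative values are typical of open water, which is why a single cut-off is often enough for a first-pass mask.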
Greene, Kathryn; Banerjee, Smita C
2009-04-01
This study explored the association between unsupervised time with peers and adolescent smoking behavior both directly and indirectly through interaction with delinquent peers, social expectancies about cigarette smoking, and cigarette offers from peers. A cross-sectional survey was used for the study and included 248 male and female middle school students. Results of structural equation modeling revealed that unsupervised time with peers is associated indirectly with adolescent smoking behavior through the mediation of association with delinquent peers, social expectancies about cigarette smoking, and cigarette offers from peers. Interventions designed to motivate adolescents without adult supervision to associate more with friends who engage in prosocial activities may eventually reduce adolescent smoking. Further implications for structured supervised time for students outside of school time are discussed.
NASA Technical Reports Server (NTRS)
Hall, Lawrence O.; Bensaid, Amine M.; Clarke, Laurence P.; Velthuizen, Robert P.; Silbiger, Martin S.; Bezdek, James C.
1992-01-01
Magnetic resonance (MR) brain section images are segmented and then synthetically colored to give visual representations of the original data with three approaches: the literal and approximate fuzzy c-means unsupervised clustering algorithms and a supervised computational neural network, a dynamic multilayered perceptron trained with the cascade correlation learning algorithm. Initial clinical results are presented on both normal volunteers and selected patients with brain tumors surrounded by edema. Supervised and unsupervised segmentation techniques provide broadly similar results. Unsupervised fuzzy algorithms were visually observed to show better segmentation when compared with raw image data for volunteer studies. However, for a more complex segmentation problem with tumor/edema or cerebrospinal fluid boundary, where the tissues have similar MR relaxation behavior, inconsistency in rating among experts was observed.
Making Connections: Elementary Teachers' Construction of Division Word Problems and Representations
ERIC Educational Resources Information Center
Timmerman, Maria A.
2014-01-01
If teachers make few connections among multiple representations of division, supporting students in using representations to develop operation sense demanded by national standards will not occur. Studies have investigated how prospective and practicing teachers use representations to develop knowledge of fraction division. However, few studies…
ERIC Educational Resources Information Center
Hochstetler, Douglas
2012-01-01
In this article I examine the theme of wilderness through the lens of American philosopher Henry Bugbee. His conception of wilderness goes beyond the literal sense of the word to what Mooney (1999) terms "a generous space of listening, mutuality of address and presence" (p. ix). I contend that Bugbee's metaphorical expression of wilderness has…
The Montessori Classroom: A Foundation for Global Citizenship
ERIC Educational Resources Information Center
Leonard, Gerard
2015-01-01
Gerard Leonard maps the child's increasingly global environment and sense of citizenship from elementary to adolescence. For the elementary child, an orientation to the local history and geography of their surroundings provides a framework for understanding geography. In Leonard's words, "We have to know and understand a lot about many…
Technique for improving solid state mosaic images
NASA Technical Reports Server (NTRS)
Saboe, J. M.
1969-01-01
Method identifies and corrects mosaic image faults in solid state visual displays and opto-electronic presentation systems. Composite video signals containing faults due to defective sensing elements are corrected by a memory unit that contains the stored fault pattern and supplies the appropriate fault word to the blanking circuit.
Utility of an automated thermal-based approach for monitoring evapotranspiration
USDA-ARS's Scientific Manuscript database
A very simple remote sensing-based model for water use monitoring is presented. The model acronym DATTUTDUT, (Deriving Atmosphere Turbulent Transport Useful To Dummies Using Temperature) is a Dutch word which loosely translates as “It’s unbelievable that it works”. DATTUTDUT is fully automated and o...
Teaching Free Expression in Word and Example (Commentary).
ERIC Educational Resources Information Center
Merrill, John
1991-01-01
Suggests that the teaching of free expression may be the highest calling of a communications or journalism professor. Argues that freedom must be tempered by a sense of ethics. Calls upon teachers to encourage students to analyze the questions surrounding free expression. Describes techniques for scrutinizing journalistic myths. (SG)
Processing of Irregular Polysemes in Sentence Reading
ERIC Educational Resources Information Center
Brocher, Andreas; Foraker, Stephani; Koenig, Jean-Pierre
2016-01-01
The degree to which meanings are related in memory affects ambiguous word processing. We examined irregular polysemes, which have related senses based on similar or shared features rather than a relational rule, like regular polysemy. We tested to what degree the related meanings of irregular polysemes ("wire") are represented with…
Interlanguage Development and Collocational Clash
ERIC Educational Resources Information Center
Shahheidaripour, Gholamabbass
2000-01-01
Background: Persian English learners committed mistakes and errors which were due to insufficient knowledge of different senses of the words and collocational structures they formed. Purpose: The study reported here was conducted for a thesis submitted in partial fulfillment of the requirements for The Master of Arts degree, School of Graduate…
ERIC Educational Resources Information Center
Jaeger, Cora
2018-01-01
Tracking the depictions of animals in children's literature through history reveals not only what authors think about animals, but also what they think about the human experience and of childhood itself. As the word "animal" can be used both to mark the similarities and the differences between beasts and men, it makes sense then that…
Common Sense Planning for a Computer, or, What's It Worth to You?
ERIC Educational Resources Information Center
Crawford, Walt
1984-01-01
Suggests factors to be considered in planning for the purchase of a microcomputer, including budgets, benefits, costs, and decisions. Major uses of a personal computer are described--word processing, financial analysis, file and database management, programming and computer literacy, education, entertainment, and thrill of high technology. (EJS)
Making Sense of Phonics: The Hows and Whys. Second Edition
ERIC Educational Resources Information Center
Beck, Isabel L.; Beck, Mark E.
2013-01-01
This bestselling book provides indispensable tools and strategies for explicit, systematic phonics instruction in K-3. Teachers learn effective ways to build students' decoding skills by teaching letter-sound relationships, blending, word building, multisyllabic decoding, fluency, and more. The volume is packed with engaging classroom activities,…
Common Sense and Computer Magazines, or, What's the Good Word, Part 1: Periodicals.
ERIC Educational Resources Information Center
Crawford, Walt
1984-01-01
This list of 60 microcomputer magazines encountered at newsstands during September 1984 is broken down by specific computer or software coverage. Reviews for 22 magazines note number of pages, advertisements, reviews, and articles, reviewer's opinions, and recommended use. Eight magazines are recommended for most libraries. (EJS)
Mikhail Bakhtin and "Expressive Discourse."
ERIC Educational Resources Information Center
Ewald, Helen Rothschild
Mikhail Bakhtin's concept of dialogism has applications to rhetoric and composition instruction. Dialogism, sometimes translated as intertextuality, is the term Bakhtin used to designate the relation of one utterance to other utterances. Dialogism is not dialogue in the usual sense of the word; it is the context which informs utterance, and…
Directional gravity sensing in gravitropism.
Morita, Miyo Terao
2010-01-01
Plants can reorient their growth direction by sensing organ tilt relative to the direction of gravity. With respect to gravity sensing in gravitropism, the classic starch statolith hypothesis, i.e., that starch-accumulating amyloplast movement along the gravity vector within gravity-sensing cells (statocytes) is the probable trigger of subsequent intracellular signaling, is widely accepted. Several lines of experimental evidence have demonstrated that starch is important but not essential for gravity sensing and have suggested that it is reasonable to regard plastids (containers of starch) as statoliths. Although the word statolith means sedimented stone, actual amyloplasts are not static but instead possess dynamic movement. Recent studies combining genetic and cell biological approaches, using Arabidopsis thaliana, have demonstrated that amyloplast movement is an intricate process involving vacuolar membrane structures and the actin cytoskeleton. This review covers current knowledge regarding gravity sensing, particularly gravity susception, and the factors modulating the function of amyloplasts for sensing the directional change of gravity. Specific emphasis is placed on the remarkable differences in the cytological properties, developmental origins, tissue locations, and response of statocytes between root and shoot systems. Such an approach reveals a common theme in directional gravity-sensing mechanisms in these two disparate organs.
Craig, Hugh; Berretta, Regina; Moscato, Pablo
2016-01-01
In this study we propose a novel, unsupervised clustering methodology for analyzing large datasets. This new, efficient methodology converts the general clustering problem into the community detection problem in graphs by using the Jensen-Shannon distance, a dissimilarity measure originating in Information Theory. Moreover, we use graph theoretic concepts for the generation and analysis of proximity graphs. Our methodology is based on a newly proposed memetic algorithm (iMA-Net) for discovering clusters of data elements by maximizing the modularity function in proximity graphs of literary works. To test the effectiveness of this general methodology, we apply it to a text corpus dataset, which contains frequencies of approximately 55,114 unique words across all 168 plays written in the Shakespearean era (16th and 17th centuries), to analyze and detect clusters of similar plays. Experimental results and comparison with state-of-the-art clustering methods demonstrate the remarkable performance of our new method for identifying high quality clusters which reflect the commonalities in the literary style of the plays. PMID:27571416
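As a rough illustration of the dissimilarity measure named in this abstract, the Jensen-Shannon distance between two word-frequency vectors can be sketched as below. This is a minimal sketch, not the authors' code: the function name, the base-2 logarithm, and the toy frequency vectors are assumptions made for illustration.

```python
import math

def jensen_shannon_distance(p, q):
    """Jensen-Shannon distance between two non-negative frequency vectors.

    The vectors are normalised to probability distributions first. With a
    base-2 logarithm (an assumption here) the result lies in [0, 1].
    """
    sp, sq = sum(p), sum(q)
    p = [x / sp for x in p]
    q = [x / sq for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Kullback-Leibler divergence; zero-probability terms contribute 0
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    # The distance is the square root of the divergence
    return math.sqrt(max(divergence, 0.0))

# Hypothetical word counts for two plays over a shared four-word vocabulary
play_a = [10, 4, 0, 6]
play_b = [8, 5, 2, 5]
distance = jensen_shannon_distance(play_a, play_b)
```

Pairwise distances of this kind could then serve as edge weights when building a proximity graph over the plays.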
A Self-Organizing Incremental Neural Network based on local distribution learning.
Xing, Youlu; Shi, Xiaofeng; Shen, Furao; Zhou, Ke; Zhao, Jinxi
2016-12-01
In this paper, we propose an unsupervised incremental learning neural network based on local distribution learning, which is called Local Distribution Self-Organizing Incremental Neural Network (LD-SOINN). The LD-SOINN combines the advantages of incremental learning and matrix learning. It can automatically discover suitable nodes to fit the learning data in an incremental way without a priori knowledge such as the structure of the network. The nodes of the network store rich local information regarding the learning data. The adaptive vigilance parameter guarantees that LD-SOINN is able to add new nodes for new knowledge automatically and that the number of nodes will not grow without bound. While the learning process continues, nodes that are close to each other and have similar principal components are merged to obtain a concise local representation, which we call a relaxation data representation. A denoising process based on density is designed to reduce the influence of noise. Experiments show that the LD-SOINN performs well on both artificial and real-world data. Copyright © 2016 Elsevier Ltd. All rights reserved.
A suffix arrays based approach to semantic search in P2P systems
NASA Astrophysics Data System (ADS)
Shi, Qingwei; Zhao, Zheng; Bao, Hu
2007-09-01
Building a semantic search system on top of peer-to-peer (P2P) networks is becoming an attractive and promising alternative scheme because of its scalability, data freshness and search cost. In this paper, we present a Suffix Arrays based algorithm for Semantic Search (SASS) in P2P systems, which generates a distributed construction of Semantic Overlay Networks (SONs) for full-text search in P2P networks. For each node in the P2P network, SASS distributes document indices based on a set of suffix arrays, by which clusters are created depending on words or phrases shared between documents; therefore, the search cost for a given query is decreased by scanning only semantically related documents. In contrast to recently announced SON schemes designed using metadata or predefined classes, SASS is an unsupervised approach for decentralized generation of SONs. SASS is also an incremental, linear-time algorithm, which efficiently handles the problem of node updates in P2P networks. Our simulation results demonstrate that SASS yields high search efficiency in dynamic environments.
Bletzer, Keith V
2015-01-01
Satisfaction surveys are common in the field of health education, as a means of assisting organizations to improve the appropriateness of training materials and the effectiveness of facilitation-presentation. Data can be qualitative, and the analysis of such data often becomes specialized. This technical article aims to reveal whether qualitative survey results can be visualized by presenting them as a Word Cloud. Qualitative materials in the form of written comments on an agency-specific satisfaction survey were coded and quantified. The resulting quantitative data were used to convert comments into "input terms" to generate Word Clouds to increase comprehension and accessibility through visualization of the written responses. A three-tier display incorporated a Word Cloud at the top, followed by the corresponding frequency table, and a textual summary of the qualitative data represented by the Word Cloud imagery. This mixed format reflects the recognition that people vary in which format is most effective for assimilating new information. The combination of visual representation through Word Clouds complemented by quantified qualitative materials is one means of increasing comprehensibility for a range of stakeholders, who might not be familiar with numerical tables or statistical analyses.
Rough Set Based Splitting Criterion for Binary Decision Tree Classifiers
2006-09-26
Alata O. Fernandez-Maloigne C., and Ferrie J.C. (2001). Unsupervised Algorithm for the Segmentation of Three-Dimensional Magnetic Resonance Brain ...instinctual and learned responses in the brain, causing it to make decisions based on patterns in the stimuli. Using this deceptively simple process...2001. [2] Bohn C. (1997). An Incremental Unsupervised Learning Scheme for Function Approximation. In: Proceedings of the 1997 IEEE International
ERIC Educational Resources Information Center
Snyder, Robin M.
2015-01-01
The field of topic modeling has become increasingly important over the past few years. Topic modeling is an unsupervised machine learning way to organize text (or image or DNA, etc.) information such that related pieces of text can be identified. This paper/session will present/discuss the current state of topic modeling, why it is important, and…
ERIC Educational Resources Information Center
Ladyshewsky, Richard K.
2015-01-01
This research explores differences in multiple choice test (MCT) scores in a cohort of post-graduate students enrolled in a management and leadership course. A total of 250 students completed the MCT in either a supervised in-class paper and pencil test or an unsupervised online test. The only statistically significant difference between the nine…
Exploiting Secondary Sources for Unsupervised Record Linkage
2004-01-01
paper, we present an extension to Apollo’s active learning component to...Sources address the issue of user involvement. Using secondary sources, a system can autonomously answer questions posed by its active learning component...over, we present how Apollo utilizes the identified secondary sources in an unsupervised active learning process. Apollo’s learning algorithm
Belgiu, Mariana; Drăguţ, Lucian
2014-10-01
Although multiresolution segmentation (MRS) is a powerful technique for dealing with very high resolution imagery, some of the image objects that it generates do not match the geometries of the target objects, which reduces the classification accuracy. MRS can, however, be guided to produce results that approach the desired object geometry using either supervised or unsupervised approaches. Although some studies have suggested that a supervised approach is preferable, there has been no comparative evaluation of these two approaches. Therefore, in this study, we have compared supervised and unsupervised approaches to MRS. One supervised and two unsupervised segmentation methods were tested on three areas using QuickBird and WorldView-2 satellite imagery. The results were assessed using both segmentation evaluation methods and an accuracy assessment of the resulting building classifications. Thus, differences in the geometries of the image objects and in the potential to achieve satisfactory thematic accuracies were evaluated. The two approaches yielded remarkably similar classification results, with overall accuracies ranging from 82% to 86%. The performance of one of the unsupervised methods was unexpectedly similar to that of the supervised method; they identified almost identical scale parameters as being optimal for segmenting buildings, resulting in very similar geometries for the resulting image objects. The second unsupervised method produced very different image objects from the supervised method, but their classification accuracies were still very similar. The latter result was unexpected because, contrary to previously published findings, it suggests a high degree of independence between the segmentation results and classification accuracy. The results of this study have two important implications. 
The first is that object-based image analysis can be automated without sacrificing classification accuracy, and the second is that the previously accepted idea that classification is dependent on segmentation is challenged by our unexpected results, casting doubt on the value of pursuing 'optimal segmentation'. Our results rather suggest that as long as under-segmentation remains at acceptable levels, imperfections in segmentation can be ruled out, so that a high level of classification accuracy can still be achieved.
Brown, Justin C; Ko, Emily M; Schmitz, Kathryn H
2015-02-01
The health benefits of exercise increase in dose-response fashion among cancer survivors. However, it is unclear how to identify cancer survivors who may require a pre-exercise evaluation before they progress from the common recommendation of walking to unsupervised moderate- to vigorous-intensity exercise. To clarify how to identify cancer survivors who should undergo a pre-exercise evaluation before they progress from the common recommendation of walking to unsupervised moderate- to vigorous-intensity exercise. Electronic survey. Forty-seven (n = 47) experts in the field of exercise physiology, rehabilitation medicine, and cancer survivorship. Not applicable. We synthesized peer-reviewed guidelines for exercise and cancer survivorship and identified 82 health factors that may warrant a pre-exercise evaluation before a survivor engages in unsupervised moderate- to vigorous-intensity exercise. The 82 health factors were classified into 3 domains: (1) clinical health factors; (2) comorbidity and device health factors; and (3) medications. We surveyed a sample of experts asking them to identify which of the 82 health factors among cancer survivors would indicate the need for a pre-exercise evaluation before they engaged in moderate- to vigorous-intensity exercise. The response rate to our survey was 75% (n = 47). Across the 3 domains of health factors, acute symptoms, comorbidities, and medications related to cardiovascular disease were agreed on to indicate a pre-exercise evaluation for survivors before they engaged in unsupervised moderate- to vigorous-intensity exercise. Other health factors in the survey included hematologic, musculoskeletal, systemic, gastrointestinal, pulmonary, and neurological symptoms and comorbidities. Eighteen experts (38%) said it was difficult to provide absolute answers because no 2 patients are alike, and their decisions are made on a case-by-case basis. 
The results from this expert survey will help to identify which cancer survivors should undergo a pre-exercise evaluation before they engage in unsupervised moderate- to vigorous-intensity exercise. Copyright © 2015 American Academy of Physical Medicine and Rehabilitation. Published by Elsevier Inc. All rights reserved.
Enhanced Memory Consolidation Via Automatic Sound Stimulation During Non-REM Sleep.
Leminen, Miika M; Virkkala, Jussi; Saure, Emma; Paajanen, Teemu; Zee, Phyllis C; Santostasi, Giovanni; Hublin, Christer; Müller, Kiti; Porkka-Heiskanen, Tarja; Huotilainen, Minna; Paunio, Tiina
2017-03-01
Slow-wave sleep (SWS) slow waves and sleep spindle activity have been shown to be crucial for memory consolidation. Recently, memory consolidation has been causally facilitated in human participants via auditory stimuli phase-locked to SWS slow waves. Here, we aimed to develop a new acoustic stimulus protocol to facilitate learning and to validate it using different memory tasks. Most importantly, the stimulation setup was automated to be applicable for ambulatory home use. Fifteen healthy participants slept 3 nights in the laboratory. Learning was tested with 4 memory tasks (word pairs, serial finger tapping, picture recognition, and face-name association). Additional questionnaires addressed subjective sleep quality and overnight changes in mood. During the stimulus night, auditory stimuli were adjusted and targeted by an unsupervised algorithm to be phase-locked to the negative peak of slow waves in SWS. During the control night no sounds were presented. Results showed that the sound stimulation increased both slow wave (p = .002) and sleep spindle activity (p < .001). When overnight improvement of memory performance was compared between stimulus and control nights, we found a significant effect in word pair task but not in other memory tasks. The stimulation did not affect sleep structure or subjective sleep quality. We showed that the memory effect of the SWS-targeted individually triggered single-sound stimulation is specific to verbal associative memory. Moreover, the ambulatory and automated sound stimulus setup was promising and allows for a broad range of potential follow-up studies in the future. © Sleep Research Society 2017. Published by Oxford University Press [on behalf of the Sleep Research Society].
Li, Jia; Xia, Changqun; Chen, Xiaowu
2017-10-12
Image-based salient object detection (SOD) has been extensively studied in past decades. However, video-based SOD is much less explored due to the lack of large-scale video datasets within which salient objects are unambiguously defined and annotated. Toward this end, this paper proposes a video-based SOD dataset that consists of 200 videos. In constructing the dataset, we manually annotate all objects and regions over 7,650 uniformly sampled keyframes and collect the eye-tracking data of 23 subjects who free-view all videos. From the user data, we find that salient objects in a video can be defined as objects that consistently pop out throughout the video, and objects with such attributes can be unambiguously annotated by combining manually annotated object/region masks with eye-tracking data of multiple subjects. To the best of our knowledge, it is currently the largest dataset for video-based salient object detection. Based on this dataset, this paper proposes an unsupervised baseline approach for video-based SOD by using saliency-guided stacked autoencoders. In the proposed approach, multiple spatiotemporal saliency cues are first extracted at the pixel, superpixel and object levels. With these saliency cues, stacked autoencoders are constructed in an unsupervised manner that automatically infers a saliency score for each pixel by progressively encoding the high-dimensional saliency cues gathered from the pixel and its spatiotemporal neighbors. In experiments, the proposed unsupervised approach is compared with 31 state-of-the-art models on the proposed dataset and outperforms 30 of them, including 19 image-based classic (unsupervised or non-deep learning) models, six image-based deep learning models, and five video-based unsupervised models. Moreover, benchmarking results show that the proposed dataset is very challenging and has the potential to boost the development of video-based SOD.
Rabiul Islam, Md; Khademul Islam Molla, Md; Nakanishi, Masaki; Tanaka, Toshihisa
2017-04-01
Recently developed effective methods for detecting commands in steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) need calibration for visual stimuli, which costs more time and causes more fatigue prior to use as the number of commands increases. This paper develops a novel unsupervised method based on canonical correlation analysis (CCA) for accurate detection of stimulus frequency. A novel unsupervised technique termed binary subband CCA (BsCCA) is implemented in a multiband approach to enhance the frequency recognition performance of SSVEP. In BsCCA, two subbands are used and a CCA-based correlation coefficient is computed for the individual subbands. In addition, a reduced set of artificial reference signals is used to calculate CCA for the second subband. The SSVEP under analysis is decomposed into multiple subbands and BsCCA is implemented for each one. Then, the overall recognition score is determined by a weighted sum of the canonical correlation coefficients obtained from each band. A 12-class SSVEP dataset (frequency range: 9.25-14.75 Hz with an interval of 0.5 Hz) from ten healthy subjects is used to evaluate the performance of the proposed method. The results suggest that BsCCA significantly improves the performance of SSVEP-based BCI compared to the state-of-the-art methods. The proposed method is an unsupervised approach with an averaged information transfer rate (ITR) of 77.04 bits min⁻¹ across 10 subjects. The maximum individual ITR is 107.55 bits min⁻¹ for the 12-class SSVEP dataset, whereas ITRs of 69.29 and 69.44 bits min⁻¹ are achieved with CCA and NCCA, respectively. The statistical test shows that the proposed unsupervised method significantly improves the performance of the SSVEP-based BCI. It can be used in real-world applications.
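The final scoring step described in this abstract (a weighted sum of per-band canonical correlation coefficients, followed by picking the best candidate frequency) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the weight values, and the correlation values are hypothetical, and the CCA computation that would produce the coefficients is not shown.

```python
def recognize_frequency(subband_corrs, weights):
    """Pick the stimulus frequency with the highest weighted correlation score.

    subband_corrs maps each candidate frequency (Hz) to a list of canonical
    correlation coefficients, one per subband. weights holds one weight per
    subband (hypothetical values; the abstract does not state them).
    """
    scores = {
        freq: sum(w * rho for w, rho in zip(weights, rhos))
        for freq, rhos in subband_corrs.items()
    }
    # The recognised frequency is the candidate with the largest score
    best = max(scores, key=scores.get)
    return best, scores

# Made-up coefficients for three of the candidate frequencies, two subbands each
corrs = {
    9.25: [0.30, 0.20],
    9.75: [0.55, 0.40],
    10.25: [0.35, 0.25],
}
best_freq, all_scores = recognize_frequency(corrs, weights=[1.0, 0.5])
```

In a full pipeline the dictionary would hold one entry per class of the 12-class dataset, with coefficients obtained from CCA against sinusoidal reference signals.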
Davies, Emlyn J.; Buscombe, Daniel D.; Graham, George W.; Nimmo-Smith, W. Alex M.
2015-01-01
Substantial information can be gained from digital in-line holography of marine particles, eliminating depth-of-field and focusing errors associated with standard lens-based imaging methods. However, for the technique to reach its full potential in oceanographic research, fully unsupervised (automated) methods are required for focusing, segmentation, sizing and classification of particles. These computational challenges are the subject of this paper, in which we draw upon data collected using a variety of holographic systems developed at Plymouth University, UK, from a significant range of particle types, sizes and shapes. A new method for noise reduction in reconstructed planes is found to be successful in aiding particle segmentation and sizing. The performance of an automated routine for deriving particle characteristics (and subsequent size distributions) is evaluated against equivalent size metrics obtained by a trained operative measuring grain axes on screen. The unsupervised method is found to be reliable, despite some errors resulting from over-segmentation of particles. A simple unsupervised particle classification system is developed, and is capable of successfully differentiating sand grains, bubbles and diatoms from within the surf-zone. Avoiding miscounting bubbles and biological particles as sand grains enables more accurate estimates of sand concentrations, and is especially important in deployments of particle monitoring instrumentation in aerated water. Perhaps the greatest potential for further development in the computational aspects of particle holography is in the area of unsupervised particle classification. The simple method proposed here provides a foundation upon which further development could lead to reliable identification of more complex particle populations, such as those containing phytoplankton, zooplankton, flocculated cohesive sediments and oil droplets.
Sola, J; Braun, F; Muntane, E; Verjus, C; Bertschi, M; Hugon, F; Manzano, S; Benissa, M; Gervaix, A
2016-08-01
Pneumonia remains the leading cause worldwide of mortality among children under the age of five, with 1.4 million deaths every year. Unfortunately, in low-resource settings, very limited diagnostic support aids are provided to point-of-care practitioners. The current UNICEF/WHO case management algorithm relies on the use of a chronometer to manually count breath rates in pediatric patients: there is thus a major need for more sophisticated tools to diagnose pneumonia that increase the sensitivity and specificity of breath-rate-based algorithms. These tools should be low cost and adapted to practitioners with limited training. In this work, a novel concept of an unsupervised tool for the diagnosis of childhood pneumonia is presented. The concept relies on the automated analysis of respiratory sounds as recorded by a point-of-care electronic stethoscope. By identifying the presence of auscultation sounds at different chest locations, this diagnostic tool is intended to estimate a pneumonia likelihood score. After presenting the overall architecture of an algorithm to estimate pneumonia scores, the importance of a robust unsupervised method to identify inspiratory and expiratory phases of a respiratory cycle is highlighted. Based on data from an on-going study involving pediatric pneumonia patients, a first algorithm to segment respiratory sounds is suggested. The unsupervised algorithm relies on a Mel-frequency filter bank, a two-step Gaussian Mixture Model (GMM) description of data, and a final Hidden Markov Model (HMM) interpretation of inspiratory-expiratory sequences. Finally, illustrative results on the first recruited patients are provided. The presented algorithm opens the door to a new family of unsupervised respiratory sound analyzers that could improve future versions of case management algorithms for the diagnosis of pneumonia in low-resource settings.
NASA Astrophysics Data System (ADS)
Akbar, M. S.; Sarker, M. H.; Sattar, M. A.; Sarwar, G. M.; Rahman, S. M. M.; Rahman, M. M.; Khan, Z. U.
2017-05-01
Cultivation of shrimp, mostly in an unplanned way, has been considered one of the major environmental disasters of Shamnagar. Villagers living near the rivers are mainly involved in fish (shrimp) cultivation, so fertile agricultural land has been converted to shrimp cultivation. Conversion of agricultural land to other uses is a common but acute problem for the land resources of a country like Bangladesh. Conventional methods for collecting this information are relatively costly and time consuming. In contrast, remote sensing satellite observation has a unique capability to provide cost-effective support in compiling the latest information about natural resources. Remote sensing, in conjunction with GIS, has been widely applied and recognized as a powerful and effective tool for detecting land use and land cover changes. RapidEye and Landsat 8 images were used to identify land use and land cover of the area during the period 2008 to 2015. Google images were used to identify the micro-level land use features of the same period. Multi-spectral classifications using unsupervised and supervised classification were performed, and the results were compared based on the field investigation. The study reveals that during the period 2008 to 2015 agricultural practice was reduced from 35 % to 21 % and the shrimp cultivation area increased from 38 % to 50 %. Because of high salinity and salt water intrusion caused by natural disaster, agricultural activities have declined and farmers have converted to other practices; as a result, shrimp farming is gaining popularity in the area.
Unsupervised learning of natural languages
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-01-01
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics. PMID:16087885
Unsupervised learning of natural languages.
Solan, Zach; Horn, David; Ruppin, Eytan; Edelman, Shimon
2005-08-16
We address the problem, fundamental to linguistics, bioinformatics, and certain other disciplines, of using corpora of raw symbolic sequential data to infer underlying rules that govern their production. Given a corpus of strings (such as text, transcribed speech, chromosome or protein sequence data, sheet music, etc.), our unsupervised algorithm recursively distills from it hierarchically structured patterns. The adios (automatic distillation of structure) algorithm relies on a statistical method for pattern extraction and on structured generalization, two processes that have been implicated in language acquisition. It has been evaluated on artificial context-free grammars with thousands of rules, on natural languages as diverse as English and Chinese, and on protein data correlating sequence with function. This unsupervised algorithm is capable of learning complex syntax, generating grammatical novel sentences, and proving useful in other fields that call for structure discovery from raw data, such as bioinformatics.
NASA Astrophysics Data System (ADS)
Masalmah, Yahya M.; Vélez-Reyes, Miguel
2007-04-01
The authors proposed in previous papers the use of the constrained Positive Matrix Factorization (cPMF) to perform unsupervised unmixing of hyperspectral imagery. Two iterative algorithms were proposed to compute the cPMF based on the Gauss-Seidel and penalty approaches to solve optimization problems. Results presented in previous papers have shown the potential of the proposed method to perform unsupervised unmixing in HYPERION and AVIRIS imagery. The performance of iterative methods is highly dependent on the initialization scheme. Good initialization schemes can improve convergence speed, whether or not a global minimum is found, and whether or not spectra with physical relevance are retrieved as endmembers. In this paper, different initializations using random selection, longest norm pixels, and standard endmembers selection routines are studied and compared using simulated and real data.
Training strategy for convolutional neural networks in pedestrian gender classification
NASA Astrophysics Data System (ADS)
Ng, Choon-Boon; Tay, Yong-Haur; Goi, Bok-Min
2017-06-01
In this work, we studied a strategy for training a convolutional neural network in pedestrian gender classification with a limited amount of labeled training data. Unsupervised learning by k-means clustering on pedestrian images was used to learn the filters to initialize the first layer of the network. As a form of pre-training, supervised learning for the related task of pedestrian classification was performed. Finally, the network was fine-tuned for gender classification. We found that this strategy improved the network's generalization ability in gender classification, achieving better test results than random weight initialization and proving slightly more beneficial than merely initializing the first layer filters by unsupervised learning. This shows that unsupervised learning followed by pre-training with pedestrian images is an effective strategy to learn useful features for pedestrian gender classification.
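The idea of learning first-layer filters by k-means can be sketched as follows: flattened image patches are clustered, and each centroid is reshaped into a filter. This is a minimal sketch under assumptions, not the authors' pipeline: patch extraction and any normalisation or whitening are omitted, the patches are tiny made-up 2x2 examples, and the deterministic "first k points" initialisation is chosen only to keep the example reproducible.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on lists of floats; returns the k centroids."""
    # Deterministic initialisation (an assumption): use the first k points
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to the nearest centroid (squared Euclidean)
            nearest = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def to_filter(centroid, height, width):
    """Reshape a flat centroid into a height x width filter."""
    return [centroid[r * width:(r + 1) * width] for r in range(height)]

# Hypothetical flattened 2x2 grayscale patches
patches = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.2],
    [1.0, 1.0, 1.0, 0.8],
]
filters = [to_filter(c, 2, 2) for c in kmeans(patches, k=2)]
```

The resulting filters would be copied into the first convolutional layer before supervised pre-training and fine-tuning.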
ERIC Educational Resources Information Center
Sentse, Miranda; Dijkstra, Jan Kornelis; Lindenberg, Siegwart; Ormel, Johan; Veenstra, Rene
2010-01-01
In a large sample of early adolescents (T2: N = 1023; M age = 13.51; 55.5% girls), the impact of parental protection and unsupervised wandering on adolescents' antisocial behavior 2.5 years later was tested in this TRAILS study; gender and parental knowledge were controlled for. In addition, the level of biological maturation and having antisocial…
FRaC: a feature-modeling approach for semi-supervised and unsupervised anomaly detection.
Noto, Keith; Brodley, Carla; Slonim, Donna
2012-01-01
Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called "normal" instances). Given a training set of only normal data, the semi-supervised anomaly detection task is to identify anomalies in the future. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: Given unlabeled, mostly-normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach for semi-supervised anomaly detection. FRaC is based on using normal instances to build an ensemble of feature models, and then identifying instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to well-known state-of-the-art anomaly detection methods, LOF and one-class support vector machines, and to an existing feature-modeling approach.
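The feature-modeling idea in this abstract (predict each feature from the others using normal data, then flag instances that disagree with those predictions) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the 1-nearest-neighbour predictor, the Gaussian error model fitted to leave-one-out residuals, and the summed negative log-likelihood score.

```python
import math

def predict(train, x, i):
    """Predict feature i of x from the training row nearest in the other features."""
    nearest = min(
        train,
        key=lambda t: sum((t[k] - x[k]) ** 2 for k in range(len(x)) if k != i),
    )
    return nearest[i]

def fit_error_models(normal):
    """Per-feature residual standard deviation from leave-one-out predictions."""
    n, d = len(normal), len(normal[0])
    sigmas = []
    for i in range(d):
        residuals = [
            row[i] - predict(normal[:j] + normal[j + 1:], row, i)
            for j, row in enumerate(normal)
        ]
        var = sum(r * r for r in residuals) / n
        sigmas.append(math.sqrt(var) or 1e-6)  # guard against zero variance
    return sigmas

def anomaly_score(normal, sigmas, x):
    """Sum over features of the Gaussian negative log-likelihood of the error."""
    score = 0.0
    for i, s in enumerate(sigmas):
        err = x[i] - predict(normal, x, i)
        score += 0.5 * math.log(2 * math.pi * s * s) + err * err / (2 * s * s)
    return score

# Toy "normal" data in which the second feature is twice the first
normal_data = [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]
sigmas = fit_error_models(normal_data)
typical = anomaly_score(normal_data, sigmas, (2, 4))
unusual = anomaly_score(normal_data, sigmas, (2, 0))
```

An instance whose features break the learned relationship, such as (2, 0) here, receives a higher score than one that follows it.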
A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data
Goldstein, Markus; Uchida, Seiichi
2016-01-01
Anomaly detection is the process of identifying unexpected items or events in datasets, which differ from the norm. In contrast to standard classification tasks, anomaly detection is often applied on unlabeled data, taking only the internal structure of the dataset into account. This challenge is known as unsupervised anomaly detection and is addressed in many practical applications, for example in network intrusion detection, fraud detection as well as in the life science and medical domain. Dozens of algorithms have been proposed in this area, but unfortunately the research community still lacks a comparative universal evaluation as well as common publicly available datasets. These shortcomings are addressed in this study, where 19 different unsupervised anomaly detection algorithms are evaluated on 10 different datasets from multiple application domains. By publishing the source code and the datasets, this paper aims to be a new well-founded basis for unsupervised anomaly detection research. Additionally, this evaluation reveals the strengths and weaknesses of the different approaches for the first time. Besides the anomaly detection performance, the computational effort, the impact of parameter settings, and the global/local anomaly detection behavior are outlined. In conclusion, we give advice on algorithm selection for typical real-world tasks. PMID:27093601
FRaC: a feature-modeling approach for semi-supervised and unsupervised anomaly detection
Brodley, Carla; Slonim, Donna
2011-01-01
Anomaly detection involves identifying rare data instances (anomalies) that come from a different class or distribution than the majority (which are simply called “normal” instances). Given a training set of only normal data, the semi-supervised anomaly detection task is to identify anomalies in the future. Good solutions to this task have applications in fraud and intrusion detection. The unsupervised anomaly detection task is different: Given unlabeled, mostly-normal data, identify the anomalies among them. Many real-world machine learning tasks, including many fraud and intrusion detection tasks, are unsupervised because it is impractical (or impossible) to verify all of the training data. We recently presented FRaC, a new approach for semi-supervised anomaly detection. FRaC is based on using normal instances to build an ensemble of feature models, and then identifying instances that disagree with those models as anomalous. In this paper, we investigate the behavior of FRaC experimentally and explain why FRaC is so successful. We also show that FRaC is a superior approach for the unsupervised as well as the semi-supervised anomaly detection task, compared to well-known state-of-the-art anomaly detection methods, LOF and one-class support vector machines, and to an existing feature-modeling approach. PMID:22639542
Detection of food intake from swallowing sequences by supervised and unsupervised methods.
Lopez-Meyer, Paulo; Makeyev, Oleksandr; Schuckers, Stephanie; Melanson, Edward L; Neuman, Michael R; Sazonov, Edward
2010-08-01
Studies of food intake and ingestive behavior in free-living conditions most often rely on self-reporting-based methods that can be highly inaccurate. Methods of Monitoring of Ingestive Behavior (MIB) rely on objective measures derived from chewing and swallowing sequences and thus can be used for unbiased study of food intake with free-living conditions. Our previous study demonstrated accurate detection of food intake in simple models relying on observation of both chewing and swallowing. This article investigates methods that achieve comparable accuracy of food intake detection using only the time series of swallows and thus eliminating the need for the chewing sensor. The classification is performed for each individual swallow rather than for previously used time slices and thus will lead to higher accuracy in mass prediction models relying on counts of swallows. Performance of a group model based on a supervised method (SVM) is compared to performance of individual models based on an unsupervised method (K-means) with results indicating better performance of the unsupervised, self-adapting method. Overall, the results demonstrate that highly accurate detection of intake of foods with substantially different physical properties is possible by an unsupervised system that relies on the information provided by the swallowing alone.
Detection of Food Intake from Swallowing Sequences by Supervised and Unsupervised Methods
Lopez-Meyer, Paulo; Makeyev, Oleksandr; Schuckers, Stephanie; Melanson, Edward L.; Neuman, Michael R.; Sazonov, Edward
2010-01-01
Studies of food intake and ingestive behavior in free-living conditions most often rely on self-reporting-based methods that can be highly inaccurate. Methods of Monitoring of Ingestive Behavior (MIB) rely on objective measures derived from chewing and swallowing sequences and thus can be used for unbiased study of food intake with free-living conditions. Our previous study demonstrated accurate detection of food intake in simple models relying on observation of both chewing and swallowing. This article investigates methods that achieve comparable accuracy of food intake detection using only the time series of swallows and thus eliminating the need for the chewing sensor. The classification is performed for each individual swallow rather than for previously used time slices and thus will lead to higher accuracy in mass prediction models relying on counts of swallows. Performance of a group model based on a supervised method (SVM) is compared to performance of individual models based on an unsupervised method (K-means) with results indicating better performance of the unsupervised, self-adapting method. Overall, the results demonstrate that highly accurate detection of intake of foods with substantially different physical properties is possible by an unsupervised system that relies on the information provided by the swallowing alone. PMID:20352335
Necessary Transformation or Safe Permanence? A Philosophical Approach to the Desire for Change
ERIC Educational Resources Information Center
Drouin-Hans, Anne-Marie
2011-01-01
What is proposed is a meditation on the phrase "transformation of the educational system", paying attention to the sense of the words, and showing what the desire for educational change can reveal. After explaining to what extent "educational system" is a quasi-oxymoron, the meaning of "transformation" has to be…
ERIC Educational Resources Information Center
Cobine, Gary R.
Creative writing is not a magical art from magic wands, but an everyday practice in the hands of steady writers. Creative writing calls, above all, for self-discipline. Along with intellectual and emotional stamina, a poetic writer needs sensory awareness. The writer also forms a mysterious sixth sense--intuition. In search of the good words, the…
ERIC Educational Resources Information Center
Speiser, Bob; Walter, Chuck
2011-01-01
This paper explores how models can support productive thinking. For us a model is a "thing", a tool to help make sense of something. We restrict attention to specific models for whole-number multiplication, hence the wording of the title. They support evolving thinking in large measure through the ways their users redesign them. They assume new…
Achieving "Querencia": Integrating a Sense of Place with Disciplined Thinking
ERIC Educational Resources Information Center
Ault, Charles R., Jr.
2008-01-01
The Spanish word "querencia," evocative of how feelings and deepest beliefs attach the self to place (Lopez, 1992), invites the rediscovery of the meaning of equity as a reciprocal relationship between peoples and the landscapes they inhabit. This article begins with an exploration of a concept of "reciprocal equity," cultivated by achieving such…
ERIC Educational Resources Information Center
Schrader, Teri
2009-01-01
The Common Principles have been at the very center of the author's professional practice. When she first read Ted Sizer's writing and learned about the Coalition of Essential Schools, she felt as though he was talking directly to her. Not only did every word of the then nine Common Principles make sense, but after reading Sizer's work, her own…
Activating the Imagination inside the World Language Classroom
ERIC Educational Resources Information Center
Mitchell, Claire
2015-01-01
Imagination, creation, and innovation are three powerful words that present many possibilities in the world language classroom. When learners can see themselves as language users, they take ownership of their learning experience and become more invested in and engaged with the topic being studied. This heightened sense of investment in turn leads…
Tuning Out the World with Noise-Canceling Headphones
ERIC Educational Resources Information Center
McCulloch, Allison W.; Whitehead, Ashley; Lovett, Jennifer N.; Whitley, Blake
2017-01-01
Context is what makes mathematical modeling tasks different from more traditional textbook word problems. Math problems are sometimes stripped of context as they are worked on. For modeling problems, however, context is important for making sense of the mathematics. The task should be brought back to its real-world context as often as possible. In…
ERIC Educational Resources Information Center
Ohler, Jason
2009-01-01
Being literate in a real-world sense means being able to read and write using the media forms of the day, whatever they may be. For centuries, consuming and producing words through reading and writing and, to a lesser extent, listening and speaking were sufficient. But because of inexpensive, easy-to-use, and widely available new tools, literacy…
Initial implementation of The National Map
Roth, K.
2003-01-01
The development of The National Map is "national" in the broadest sense of the word. Although the U.S. Geological Survey is taking the lead, local governments, states, and regions are active and essential partners in the process, contributing, for example, data updates, problem-solving data integration, and map development from multiple data layers.
Understanding Old Words with New Meanings.
ERIC Educational Resources Information Center
Clark, Herbert H.; Gerrig, Richard J.
1983-01-01
Assumptions about comprehension of utterances are challenged in two experiments using as an example the verb phrase "to do a Richard Nixon on a tape" (i.e., erase it). It is argued that creating meanings, as with this phrase, works differently from selecting senses for utterances and that many require a mixture of the two. (MSE)
Office Design: A Study of Environment.
ERIC Educational Resources Information Center
Manning, Peter, Ed.
Reporting upon a study of environment which was based on the design of office buildings and office space, the study forms part of a continuing program of environmental research sponsored by Pilkington Brothers Limited of St. Helens, England. In this report the word 'environment' is used in the sense of the sum of the physical and emotional…
Aesthetics and a Sense of Wonder
ERIC Educational Resources Information Center
Wilson, Ruth A.
2010-01-01
Rachel Carson (1956)--scientist, writer, and environmentalist--states that "A child's world is fresh and new and beautiful, full of wonder and excitement". Many people have heard and been inspired by these words, but may not have a clear idea about what wonder really is. This isn't surprising, because wonder in different contexts can mean…
ERIC Educational Resources Information Center
Norbury, Keith
2013-01-01
Gone are the days when green campus initiatives were a balm to the soul and a drain on the wallet. Today's environmental initiatives are all about saving lots of green--in every sense of the word. The environmental benefits of green campus projects--whether wind turbines or better insulation--are pretty clear. Unfortunately, in today's…
Beyond Tradition: Culture, Symbolism, and Practicality in American Indian Art
ERIC Educational Resources Information Center
Sorensen, Barbara Ellen
2013-01-01
Indigenous people have always created what colonial language labels art. Yet there is no Native word for "art" as defined in a Euro-American sense. Art, as the dominant culture envisions, is mostly ornamental. This is in sharp juxtaposition to a Native perspective, which sees art as integrative, inclusive, practical, and constantly…
Density: A Definition, a Concept, or Both?
ERIC Educational Resources Information Center
Gaides, G. Edward
1989-01-01
Many words which have been treated in the denotative sense are actually connotative in nature. That is to say that citing a definition or stating a fact should not be a learning goal. Rather, a "conceptualization" should be what teachers are striving for. A series of activities dealing with density have been provided for demonstrations or…
Can multilinguality improve Biomedical Word Sense Disambiguation?
Duque, Andres; Martinez-Romo, Juan; Araujo, Lourdes
2016-12-01
Ambiguity in the biomedical domain represents a major issue when performing Natural Language Processing tasks over the huge amount of available information in the field. For this reason, Word Sense Disambiguation is critical for achieving accurate systems able to tackle complex tasks such as information extraction, summarization or document classification. In this work we explore whether multilinguality can help to solve the problem of ambiguity, and the conditions required for a system to improve the results obtained by monolingual approaches. Also, we analyze the best ways to generate those useful multilingual resources, and study different languages and sources of knowledge. The proposed system, based on co-occurrence graphs containing biomedical concepts and textual information, is evaluated on a test dataset frequently used in biomedicine. We can conclude that multilingual resources are able to provide a clear improvement of more than 7% compared to monolingual approaches, for graphs built from a small number of documents. Also, empirical results show that automatically translated resources are a useful source of information for this particular task. Copyright © 2016 Elsevier Inc. All rights reserved.
A Proposed Direction Finding and Polarization Sensing Scheme.
1976-03-01
...SUPPLEMENTARY NOTES. KEY WORDS: Direction-finding; Polarization-sensing; Small...antenna array. ABSTRACT: This report proposes and demonstrates a direction finding
Johnston, Richard; Valentinuzzi, Max E
2016-01-01
A previous "Retrospectroscope" note, published early in 2014, dealt with spirometry: it described many apparatuses used to measure the volume of inhaled and exhaled air that results from breathing [1]. Such machines, when adequately modified, are also able to measure the rate at which work is produced (specifically by an animal or a human being). Metabolism in that sense is the term used by physiologists and physicians, a word that in Greek, metabolismos, means "change" or "overthrow," in the sense of breaking down material, as in burning some stuff.
Hardie, Kim Rachael; Heurlier, Karin
2008-08-01
Multicellular bacterial communities (biofilms) abound in nature, and their successful formation and survival is likely to require cell-cell communication--including quorum sensing--to co-ordinate appropriate gene expression. The only mode of quorum sensing that is shared by both Gram-positive and Gram-negative bacteria involves the production of the signalling molecule autoinducer 2 by LuxS. A survey of the current literature reveals that luxS contributes to biofilm development in some bacteria. However, inconsistencies prevent biofilm development being attributed to the production of AI2 in all cases.
Martínez-Ramos, David
2008-12-01
The Spanish words severo (severe) and severidad (severity) are usually used as synonyms of grave (serious) and gravedad (seriousness), although the Spanish Royal Academy of Language (Real Academia Española [RAE]) specifically recommends not using them in this sense. A retrospective analysis was performed to evaluate the use of the words severo and severidad in Cirugía Española during 2007. All the articles published in Cirugía Española during 2007 were reviewed. The articles in which severo and/or severidad were present were selected. For each article, the month of publication, the type of article, the geographic origin and the exact sentence containing these words were analyzed. Correctness and incorrectness of their use were studied according to the RAE norms. A total of 33 articles were selected. Every month (except for January) had at least 2 articles. Thirty-one of the articles were from Spain whereas 2 were from Hispano-America. Eleven cases were original articles, 7 reviews, 6 case reports, 3 editorials, 3 special articles and 3 letters to the editor. The Spanish words severo and severidad are inadequately used too often in scientific texts. Their use as synonyms of grave, importante or serio, which are incorrect translations of the English word severe, must be avoided.
Temporal Patterns of Happiness and Information in a Global Social Network: Hedonometrics and Twitter
Dodds, Peter Sheridan; Harris, Kameron Decker; Kloumann, Isabel M.; Bliss, Catherine A.; Danforth, Christopher M.
2011-01-01
Individual happiness is a fundamental societal metric. Normally measured through self-report, happiness has often been indirectly characterized and overshadowed by more readily quantifiable economic indicators such as gross domestic product. Here, we examine expressions made on the online, global microblog and social networking service Twitter, uncovering and explaining temporal variations in happiness and information levels over timescales ranging from hours to years. Our data set comprises over 46 billion words contained in nearly 4.6 billion expressions posted over a 33 month span by over 63 million unique users. In measuring happiness, we construct a tunable, real-time, remote-sensing, and non-invasive, text-based hedonometer. In building our metric, made available with this paper, we conducted a survey to obtain happiness evaluations of over 10,000 individual words, representing a tenfold size improvement over similar existing word sets. Rather than being ad hoc, our word list is chosen solely by frequency of usage, and we show how a highly robust and tunable metric can be constructed and defended. PMID:22163266
Dodds, Peter Sheridan; Harris, Kameron Decker; Kloumann, Isabel M; Bliss, Catherine A; Danforth, Christopher M
2011-01-01
Individual happiness is a fundamental societal metric. Normally measured through self-report, happiness has often been indirectly characterized and overshadowed by more readily quantifiable economic indicators such as gross domestic product. Here, we examine expressions made on the online, global microblog and social networking service Twitter, uncovering and explaining temporal variations in happiness and information levels over timescales ranging from hours to years. Our data set comprises over 46 billion words contained in nearly 4.6 billion expressions posted over a 33 month span by over 63 million unique users. In measuring happiness, we construct a tunable, real-time, remote-sensing, and non-invasive, text-based hedonometer. In building our metric, made available with this paper, we conducted a survey to obtain happiness evaluations of over 10,000 individual words, representing a tenfold size improvement over similar existing word sets. Rather than being ad hoc, our word list is chosen solely by frequency of usage, and we show how a highly robust and tunable metric can be constructed and defended.
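As a rough illustration of how a text-based happiness score could be computed from per-word evaluations, the following minimal sketch takes a frequency-weighted average of word scores. The averaging rule, the function name `average_happiness`, and the toy scores are assumptions made for illustration only; they are not taken from the abstract above.

```python
from collections import Counter


def average_happiness(text, word_scores):
    """Frequency-weighted mean happiness of the scored words in `text`.

    `word_scores` maps a word to a happiness value (hypothetical scale).
    Words without a score are ignored. Returns None if no word is scored.
    """
    counts = Counter(w for w in text.lower().split() if w in word_scores)
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(word_scores[w] * c for w, c in counts.items()) / total


# Toy scores, invented for this example.
toy_scores = {"happy": 8.0, "sad": 2.0}
score = average_happiness("Happy happy sad day", toy_scores)
```

Here "day" has no score and is skipped, so the result is the mean of 8.0, 8.0 and 2.0.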
The dynamics of human-induced land cover change in miombo ecosystems of southern Africa
NASA Astrophysics Data System (ADS)
Jaiteh, Malanding Sambou
Understanding human-induced land cover change in the miombo requires consistent, geographically referenced data on temporal land cover characteristics as well as on the biophysical and socioeconomic drivers of land use, the major cause of land cover change. The overall goal of this research is to examine the applications of high-resolution satellite remote sensing data in studying the dynamics of human-induced land cover change in the miombo. Specific objectives are to: (1) evaluate the applications of computer-assisted classification of Landsat Thematic Mapper (TM) data for land cover mapping in the miombo and (2) analyze spatial and temporal patterns of landscape change locations in the miombo. A Stepwise Thematic Classification (STC) procedure, a hybrid supervised-unsupervised approach for classifying Landsat TM data, was developed and tested. Classification accuracy results were compared to those from supervised and unsupervised classification. The STC provided the highest classification accuracy, i.e., 83.9% correspondence between classified and reference data, compared to 44.2% and 34.5% for unsupervised and supervised classification, respectively. Improvements in the classification process can be attributed to thematic stratification of the image data into spectrally homogeneous (thematic) groups and step-by-step classification of the groups using supervised or unsupervised classification techniques. Supervised classification failed to classify 18% of the scene, evidence that the training data used did not adequately represent all of the variability in the data. Application of the procedure in drier miombo produced an overall classification accuracy of 63%. This is much lower than that of wetter miombo. The results clearly demonstrate that digital classification of Landsat TM can be successfully implemented in the miombo without intensive fieldwork.
Spatial characteristics of land cover change in agricultural and forested landscapes in central Malawi were analyzed for the period 1984 to 1995 using spatial pattern analysis methods. Shifting cultivation areas (agriculture in forested landscape) experienced the highest rate of woodland cover fragmentation, with the mean patch size of closed woodland cover decreasing from 20ha to 7.5ha. Permanent bare areas (cropland and settlement) in intensive agricultural matrix landscapes increased 52%, largely through the conversion of fallow areas. The protected National Park area remained fairly unchanged, although closed woodland area increased by 4%, mainly from regeneration of open woodland. This study provided evidence that changes in spatial characteristics in the miombo differ with landscape. Land use change (i.e. conversion to cropland) is the primary driving force behind changes in landscape spatial patterns. Also, results revealed that exclusion of intense human use (i.e. cultivation and woodcutting) through regulations and/or fencing increased both closed woodland area (through regeneration of open woodland) and overall connectivity in the landscape. Spatial characteristics of land cover change were analyzed at locations in Malawi (wetter miombo) and Zimbabwe (drier miombo). Results indicate that land cover dynamics differ both between and within case study sites. In communal areas in the Kasungu scene, land cover change is dominated by woodland fragmentation to open vegetation. Change in private commercial lands was dominated by expansion of bare (settlement and cropland) areas, primarily at the expense of open vegetation (fallow land).
NASA Technical Reports Server (NTRS)
Brooks, Colin; Bourgeau-Chavez, Laura; Endres, Sarah; Battaglia, Michael; Shuchman, Robert
2015-01-01
Primary Goal: Assist with the evaluation and measuring of wetlands hydroperiod at the Plum Brook Station using multi-source remote sensing data as part of a larger effort on projecting climate change-related impacts on the station's wetland ecosystems. MTRI expanded on the multi-source remote sensing capabilities to help estimate and measure hydroperiod and the relative soil moisture of wetlands at NASA's Plum Brook Station. Multi-source remote sensing capabilities are useful in estimating and measuring hydroperiod and relative soil moisture of wetlands. This is important as a changing regional climate has several potential risks for wetland ecosystem function. The year two analysis built on the first year of the project by acquiring and analyzing remote sensing data for additional dates and types of imagery, combined with focused field work. Five deliverables were planned and completed: 1) Show the relative length of hydroperiod using available remote sensing datasets 2) Date-linked table of wetlands extent over time for all feasible non-forested wetlands 3) Utilize LIDAR data to measure topographic height above sea level of all wetlands, wetland to catchment area ratio, slope of wetlands, and other useful variables 4) A demonstration of how analyzed results from multiple remote sensing data sources can help with wetlands vulnerability assessment 5) An MTRI-style report summarizing year 2 results. This report serves as a descriptive summary of our completion of these deliverables. Additionally, two formal meetings were held with Larry Liou and Amanda Sprinzl to provide project updates and receive direction on outputs. These were held on 2/26/15 and 9/17/15 at the Plum Brook Station. Principal Component Analysis (PCA) is a multivariate statistical technique used to identify dominant spatial and temporal backscatter signatures. PCA reduces the information contained in the temporal dataset to the first few new Principal Component (PC) images. 
Some advantages of PCA include the ability to filter out temporal autocorrelation and reduce speckle to the higher order PC images. A PCA was performed using ERDAS Imagine on a time series of PALSAR dates. Hydroperiod maps were created by separating the PALSAR dates into two date ranges, 2006-2008 and 2010, and performing an unsupervised classification on the PCAs.
2015-12-01
group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on...log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity metric. For...differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with cosine correlation as
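For readers unfamiliar with UPGMA, the following is a minimal pure-Python sketch of average-linkage agglomerative clustering with a cosine-based distance (1 minus cosine similarity). It is an illustration under stated assumptions, not the software used in the report: the function names, the stopping rule (a fixed number of clusters), and the toy vectors are all hypothetical, and input vectors are assumed to be nonzero.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; assumes both vectors are nonzero."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def upgma(vectors, n_clusters):
    """Merge the two clusters with the smallest mean pairwise distance
    until only `n_clusters` remain. Returns lists of row indices."""
    clusters = [[i] for i in range(len(vectors))]

    def mean_distance(c1, c2):
        # UPGMA: unweighted average over all cross-cluster pairs.
        total = sum(cosine_distance(vectors[i], vectors[j])
                    for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        _, a, b = min(
            (mean_distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return clusters


# Toy profiles: rows 0-1 point one way, rows 2-3 another.
profiles = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
groups = sorted(sorted(c) for c in upgma(profiles, 2))
```

This naive version recomputes all pairwise averages at every merge, which is fine for small inputs but too slow for large expression matrices.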
Saludes-Rodil, Sergio; Baeyens, Enrique; Rodríguez-Juan, Carlos P
2015-04-29
An unsupervised approach to classify surface defects in wire rod manufacturing is developed in this paper. The defects are extracted from an eddy current signal and classified using a clustering technique that uses the dynamic time warping distance as the dissimilarity measure. The new approach has been successfully tested using industrial data. It is shown that it outperforms other classification alternatives, such as the modified Fourier descriptors.
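The dynamic time warping (DTW) distance mentioned above can be sketched as follows. This is the textbook dynamic-programming formulation with an absolute-difference local cost; the local cost, the absence of a warping window, and the function name `dtw_distance` are assumptions for illustration, since the abstract does not give these details.

```python
def dtw_distance(a, b):
    """DTW distance between two numeric sequences.

    cost[i][j] holds the minimal cumulative cost of aligning
    a[:i] with b[:j]; runs in O(len(a) * len(b)) time.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] matched again (b stalls)
                cost[i][j - 1],      # b[j-1] matched again (a stalls)
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# A stretched copy of a signal aligns at zero cost.
same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])
```

A matrix of such pairwise distances can then be passed to any clustering method that accepts a precomputed dissimilarity measure.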
Nonequilibrium thermodynamics of restricted Boltzmann machines.
Salazar, Domingos S P
2017-08-01
In this work, we analyze the nonequilibrium thermodynamics of a class of neural networks known as restricted Boltzmann machines (RBMs) in the context of unsupervised learning. We show how the network is described as a discrete Markov process and how the detailed balance condition and the Maxwell-Boltzmann equilibrium distribution are sufficient conditions for a complete thermodynamics description, including nonequilibrium fluctuation theorems. Numerical simulations in a fully trained RBM are performed and the heat exchange fluctuation theorem is verified with excellent agreement to the theory. We observe how the contrastive divergence functional, mostly used in unsupervised learning of RBMs, is closely related to nonequilibrium thermodynamic quantities. We also use the framework to interpret the estimation of the partition function of RBMs with the annealed importance sampling method from a thermodynamics standpoint. Finally, we argue that unsupervised learning of RBMs is equivalent to a work protocol in a system driven by the laws of thermodynamics in the absence of labeled data.
Unsupervised real-time speaker identification for daily movies
NASA Astrophysics Data System (ADS)
Li, Ying; Kuo, C.-C. Jay
2002-07-01
The problem of identifying speakers for movie content analysis is addressed in this paper. While most previous work on speaker identification was carried out in a supervised mode using pure audio data, more robust results can be obtained in real-time by integrating knowledge from multiple media sources in an unsupervised mode. In this work, both audio and visual cues are employed and subsequently combined in a probabilistic framework to identify speakers. In particular, audio information is used to identify speakers with a maximum likelihood (ML)-based approach, while visual information is adopted to distinguish speakers by detecting and recognizing their talking faces based on face detection/recognition and mouth tracking techniques. Moreover, to accommodate speakers' acoustic variations over time, we update their models on the fly by adapting to their newly contributed speech data. Encouraging results have been achieved through extensive experiments, which show a promising future for the proposed audiovisual-based unsupervised speaker identification system.
When and where do youths have sex? The potential role of adult supervision.
Cohen, Deborah A; Farley, Thomas A; Taylor, Stephanie N; Martin, David H; Schuster, Mark A
2002-12-01
Interventions to reduce high-risk behaviors such as sex and substance use among youths have focused mainly on promoting abstinence, refusal skills, and negotiation skills, yet the frequency of high-risk behaviors among youths may also be influenced by opportunity, particularly the amount of time during which they are not supervised by adults. In this study, we examined when and where youths have sex and whether there is a relationship between unsupervised time and sex, sexually transmitted diseases (STDs), and substance use. A cross-sectional survey was conducted in 6 public high schools in an urban school district. Participants were 1065 boys and 969 girls from a school-based STD screening program. Ninety-eight percent of students were black, and 79% were in the free or reduced lunch program. Most students reported living with 1 parent only, primarily the mother (52%); only 27% lived in 2-parent families. Sexual activity, substance use, and the prevalence of gonorrhea or chlamydia as determined by a ligase-chain reaction test on a urine sample were measured. Fifty-six percent reported being home without an adult present 4 or more hours per day after school. There was no difference in the number of unsupervised after-school hours between children in 1- and 2-parent families. Fifty-five percent of boys and 41% of girls were participating in or planned to participate in after-school activities during the school year. Boys were more likely than girls to report having had sex for the first time before age 14 (42% vs 9%) and had a greater number of lifetime sex partners (mean: 4.2 vs 2.4 partners). Among the respondents who had had intercourse, 91% said that the last time had been in a home setting, including their own home (37%), their partner's home (43%), and a friend's home (12%), usually after school. 
Boys were more likely than girls to report having had sex in their own homes (43% vs 28%) and less likely than girls to report having had sex in their partner's homes (30% vs 59%). Fifty-six percent of youths who had had intercourse reported that the last time was on a weekday: 18% before 3:00, 17% between 3:00 and 6:00, and 21% after 6:00. There were no gender differences in the day of the week or time of day during which students reported having had intercourse. Youths who were unsupervised for 30 or more hours per week were more likely to be sexually active compared with those who were unsupervised for 5 hours a week or less (80% vs 68%). In addition, for boys, the greater the amount of unsupervised time, the higher the number of lifetime sex partners. Among girls but not among boys, sexual activity was associated with nonparticipation in after-school programs; 71% of those who were not participating in an after-school activity were sexually active compared with 59% of those who were participating. Tobacco and alcohol use were associated with unsupervised time among boys but not among girls. Boys who were unsupervised >5 hours per week after school were twice as likely to have gonorrhea or chlamydial infection as boys who were unsupervised for 5 hours or less. We found that substantial numbers of youths currently spend long periods of time without adult supervision and have limited opportunities to participate in after-school activities. More than half of sexually active youths reported that they had sex at home after school, and, particularly for boys, sexual- and drug-related risks increased as the amount of unsupervised time increased. As youths come of age, parents probably believe that it is appropriate to leave them increasingly on their own, and, accordingly, prevention approaches have concentrated on providing information and motivation for abstinence or safer sex. 
However, given the independent association between the amount of unsupervised time and sexual behaviors (with STD rates suggestive of particularly risky sexual behaviors) and substance use behaviors, it is worth considering increasing youth supervision, if not by parents, then by programs organized at schools organized at school or other community settings. Parents and community members should consider increasing opportunities for supervised activities to determine whether this will reduce risk-taking among youths.
Lacroix, André; Kressig, Reto W; Muehlbauer, Thomas; Gschwind, Yves J; Pfenninger, Barbara; Bruegger, Othmar; Granacher, Urs
2016-01-01
Losses in lower extremity muscle strength/power, muscle mass and deficits in static and particularly dynamic balance due to aging are associated with impaired functional performance and an increased fall risk. It has been shown that the combination of balance and strength training (BST) mitigates these age-related deficits. However, it is unresolved whether supervised versus unsupervised BST is equally effective in improving muscle power and balance in older adults. This study examined the impact of a 12-week BST program followed by 12 weeks of detraining on measures of balance and muscle power in healthy older adults enrolled in supervised (SUP) or unsupervised (UNSUP) training. Sixty-six older adults (men: 25, women: 41; age 73 ± 4 years) were randomly assigned to a SUP group (2/week supervised training, 1/week unsupervised training; n = 22), an UNSUP group (3/week unsupervised training; n = 22) or a passive control group (CON; n = 22). Static (i.e., Romberg Test) and dynamic (i.e., 10-meter walk test) steady-state, proactive (i.e., Timed Up and Go Test, Functional Reach Test), and reactive balance (e.g., Push and Release Test), as well as lower extremity muscle power (i.e., Chair Stand Test; Stair Ascent and Descent Test) were tested before and after the active training phase as well as after detraining. Adherence rates to training were 92% for SUP and 97% for UNSUP. BST resulted in significant group × time interactions. Post hoc analyses showed, among others, significant training-related improvements for the Romberg Test, stride velocity, Timed Up and Go Test, and Chair Stand Test in favor of the SUP group. Following detraining, significantly enhanced performances (compared to baseline) were still present in 13 variables for the SUP group and in 10 variables for the UNSUP group. Twelve weeks of BST proved to be safe (no training-related injuries) and feasible (high attendance rates of >90%). 
Deficits of balance and lower extremity muscle power can be mitigated by BST in healthy older adults. Additionally, supervised as compared to unsupervised BST was more effective. Thus, it is recommended to counteract intrinsic fall risk factors by applying supervised BST programs for older adults. © 2015 The Author(s) Published by S. Karger AG, Basel.
Pant Pai, Nitika; Sharma, Jigyasa; Shivkumar, Sushmita; Pillay, Sabrina; Vadnais, Caroline; Joseph, Lawrence; Dheda, Keertan; Peeling, Rosanna W.
2013-01-01
Background Stigma, discrimination, lack of privacy, and long waiting times partly explain why six out of ten individuals living with HIV do not access facility-based testing. By circumventing these barriers, self-testing offers potential for more people to know their sero-status. Recent approval of an in-home HIV self test in the US has sparked self-testing initiatives, yet data on acceptability, feasibility, and linkages to care are limited. We systematically reviewed evidence on supervised (self-testing and counselling aided by a health care professional) and unsupervised (performed by self-tester with access to phone/internet counselling) self-testing strategies. Methods and Findings Seven databases (Medline [via PubMed], Biosis, PsycINFO, Cinahl, African Medicus, LILACS, and EMBASE) and conference abstracts of six major HIV/sexually transmitted infections conferences were searched from 1st January 2000–30th October 2012. 1,221 citations were identified and 21 studies included for review. Seven studies evaluated an unsupervised strategy and 14 evaluated a supervised strategy. For both strategies, data on acceptability (range: 74%–96%), preference (range: 61%–91%), and partner self-testing (range: 80%–97%) were high. A high specificity (range: 99.8%–100%) was observed for both strategies, while a lower sensitivity was reported in the unsupervised (range: 92.9%–100%; one study) versus supervised (range: 97.4%–97.9%; three studies) strategy. Regarding feasibility of linkage to counselling and care, 96% (n = 102/106) of individuals testing positive for HIV stated they would seek post-test counselling (unsupervised strategy, one study). No extreme adverse events were noted. The majority of data (n = 11,019/12,402 individuals, 89%) were from high-income settings and 71% (n = 15/21) of studies were cross-sectional in design, thus limiting our analysis. 
Conclusions Both supervised and unsupervised testing strategies were highly acceptable, preferred, and more likely to result in partner self-testing. However, no studies evaluated post-test linkage with counselling and treatment outcomes and reporting quality was poor. Thus, controlled trials of high quality from diverse settings are warranted to confirm and extend these findings. PMID:23565066
Dubois, Matthieu; Poeppel, David; Pelli, Denis G
2013-01-01
To understand why human sensitivity for complex objects is so low, we study how word identification combines eye and ear or parts of a word (features, letters, syllables). Our observers identify printed and spoken words presented concurrently or separately. When researchers measure threshold (energy of the faintest visible or audible signal) they may report either sensitivity (one over the human threshold) or efficiency (ratio of the best possible threshold to the human threshold). When the best possible algorithm identifies an object (like a word) in noise, its threshold is independent of how many parts the object has. But, with human observers, efficiency depends on the task. In some tasks, human observers combine parts efficiently, needing hardly more energy to identify an object with more parts. In other tasks, they combine inefficiently, needing energy nearly proportional to the number of parts, over a 60∶1 range. Whether presented to eye or ear, efficiency for detecting a short sinusoid (tone or grating) with few features is a substantial 20%, while efficiency for identifying a word with many features is merely 1%. Why? We show that the low human sensitivity for words is a cost of combining their many parts. We report a dichotomy between inefficient combining of adjacent features and efficient combining across senses. Joining our results with a survey of the cue-combination literature reveals that cues combine efficiently only if they are perceived as aspects of the same object. Observers give different names to adjacent letters in a word, and combine them inefficiently. Observers give the same name to a word's image and sound, and combine them efficiently. The brain's machinery optimally combines only cues that are perceived as originating from the same object. Presumably such cues each find their own way through the brain to arrive at the same object representation.
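The two measures defined in this abstract (sensitivity as one over the human threshold, efficiency as the ratio of the best possible threshold to the human threshold) can be illustrated with a minimal sketch. The function names and the threshold energies below are hypothetical, chosen only so that the ratios match the 20% and 1% figures quoted above; they are not data from the study.

```python
def sensitivity(human_threshold: float) -> float:
    """Sensitivity: one over the human threshold energy."""
    if human_threshold <= 0:
        raise ValueError("threshold energy must be positive")
    return 1.0 / human_threshold


def efficiency(ideal_threshold: float, human_threshold: float) -> float:
    """Efficiency: best possible threshold divided by the human threshold."""
    if ideal_threshold <= 0 or human_threshold <= 0:
        raise ValueError("threshold energies must be positive")
    return ideal_threshold / human_threshold


# Hypothetical threshold energies in arbitrary units (assumed, not measured).
ideal = 2.0                  # threshold of the best possible algorithm
human_short_signal = 10.0    # human threshold for a short sinusoid
human_word = 200.0           # human threshold for a many-feature word

eff_short = efficiency(ideal, human_short_signal)  # 2 / 10
eff_word = efficiency(ideal, human_word)           # 2 / 200
print(f"short signal efficiency: {eff_short:.2f}")
print(f"word efficiency: {eff_word:.2f}")
```

Because efficiency normalizes by the ideal threshold, it separates losses caused by the observer from limits imposed by the stimulus itself, which sensitivity alone cannot do.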
The Role of Mental Imagery in Imaginative and Ecological Teaching
ERIC Educational Resources Information Center
Judson, Gillian
2014-01-01
This article explores how mental imagery evoked from words might enhance the learning of cross-curricular content and how it may help cultivate students' "ecological understanding": that deep sense of connection to a living world and the care and concern to live differently within it. With reference to Elliott Eisner's and Kieran Egan's…
Michael Young's "The Rise of the Meritocracy": A Philosophical Critique
ERIC Educational Resources Information Center
Allen, Ansgar
2011-01-01
This paper examines Michael Young's 1958 dystopia, "The Rise of the Meritocracy". In this book, the word "meritocracy" was coined and used in a pejorative sense. Today, however, meritocracy represents a positive ideal against which we measure the justice of our institutions. This paper argues that, when read in the twenty-first century, Young's…
Podcasting Syndication Services and University Students: Why Don't They Subscribe?
ERIC Educational Resources Information Center
Lee, Mark J. W.; Miller, Charlynn; Newnham, Leon
2009-01-01
Partly owing to the status of podcasting as a buzzword and subject of much recent media attention, educational technology researchers and practitioners have been using the term very loosely. Few studies have examined student perceptions and uptake of "podcasting" in the true sense of the word, whereby a syndication protocol such as Really Simple…
Iranian EFL Teachers' Voices on the Pedagogy of Word and World
ERIC Educational Resources Information Center
Safari, Parvin; Rashidi, Nasser
2015-01-01
Critical pedagogy (CP) with the eventual aim of creating changes in society towards the socially just world rests upon the premise that language learning is understood as a sociopolitical event. Schools and classrooms are not merely seen as the neutral and apolitical sites or oxymoron of transmitting taken-for-granted knowledge and common sense to…
ERIC Educational Resources Information Center
Patterson, Olga
2012-01-01
Domain adaptation of natural language processing systems is challenging because it requires human expertise. While manual effort is effective in creating a high quality knowledge base, it is expensive and time consuming. Clinical text adds another layer of complexity to the task due to privacy and confidentiality restrictions that hinder the…
Clinician or Witness? The Intervener's Relationship with Traumatized Children
ERIC Educational Resources Information Center
Steele, William
2008-01-01
To heal the hurt child, one begins not as a clinician but as a person trying to witness how the child experiences trauma. This requires more than just talking since the child's terrifying memories are stored in the brain's senses and visual imagery, not in rational thoughts and words. The goal is to change these frightening sensory experiences…
Deleuze and Guattari's Language for New Empirical Inquiry
ERIC Educational Resources Information Center
St. Pierre, Elizabeth Adams
2017-01-01
This paper reviews Deleuze's theory of language in "Logic of Sense," and Deleuze and Guattari's theory of language in "A Thousand Plateaus." In the ontology informed by the Stoics described in those books, human being and language do not exist separately but in a mixture of words and things. The author argues that this…
On the Nature of Syntactic Variation: Evidence from Complex Predicates and Complex Word-Formation.
ERIC Educational Resources Information Center
Snyder, William
2001-01-01
Provides evidence from child language acquisition and comparative syntax for existence of a syntactic parameter in the classical sense of Chomsky (1981), with simultaneous effects on syntactic argument structure. Implications are that syntax is subject to points of substantive parametric variation as envisioned in Chomsky, and the time course of…
Moving Past "Right" or "Wrong" toward a Continuum of Young Children's Semantic Knowledge
ERIC Educational Resources Information Center
Christ, Tanya
2011-01-01
Vocabulary development is a critical goal for early childhood education. However, it is difficult for researchers and teachers to determine whether this goal is being met, given the limitations of current assessment tools. These tools tend to view word knowledge dichotomously--as right or wrong. A clear sense of children's depth of semantic…
The Way of the S/Word: Storytelling as Emerging Liminal
ERIC Educational Resources Information Center
Josephs, Caroline
2008-01-01
The paper focuses on oral storytelling and transformation through the significance of the liminal zone as thresholding. Involving the reader-listener in an experiential and performative approach, the article draws on all of the senses, using a wide range of data such as dreams, drawing, writing, as well as the act of (sacred) oral storytelling and…
Human Capital and Its Development in Present-Day Russia
ERIC Educational Resources Information Center
Nureev, R. M.
2010-01-01
In the broad sense of the word, human capital is a specific form of capital that is embodied in people themselves. It consists of the individual's reserve of health, knowledge, skills, abilities, and motivations that enable him to increase his labor productivity and give him an income in the form of wages, salaries, and other income. The structure…
ERIC Educational Resources Information Center
Ashbrook, Peggy
2007-01-01
From children's viewpoints, what they experience in the world is what the world is like--for everyone. "What do others experience with their senses when they are in the same situation?" is a question that young children can explore by collecting data as they use a "feely box," or take a "sensory walk." There are many ways to focus the children's…
The Cultural Preferences of Today's Russian College Students
ERIC Educational Resources Information Center
Andreev, A. L.
2009-01-01
Education rests on the foundation of culture in the broadest sense of that word. How deeply and solidly that foundation has been laid down determines the size and solidity of the building that can be constructed on it. This applies in particular to higher education, which is by no means designed solely to offer just a body of specialized…
Making Sense of Administrative Leadership. The "L" Word in Higher Education. ERIC Digest.
ERIC Educational Resources Information Center
Bensimon, Estela M.; And Others
The digest is based on a full length report (with the same title) on leadership in higher education. The full report provides a definitive review of the literature and institutional practice on the topic. Recent scholars have new ideas challenging traditional notions that organizations are driven by leadership or that the quality of leadership…
ERIC Educational Resources Information Center
Bensimon, Estela M.; And Others
An integration and synthesis of the theoretical literature on leadership with the literature concerning higher education as a social institution is presented. The literature on a conceptual explanation of leadership is reviewed and related directly to higher education and its sociological and organizational uniqueness. The first four of the…
Learning in the Learner's Perspective. I. Some Common-Sense Conceptions. No. 76.
ERIC Educational Resources Information Center
Saljo, Roger
Ninety Swedish teenagers and adults with varying levels of formal education were interviewed about their own learning experiences and techniques. Subjects were then asked what they actually meant by learning. The concept was variously defined as: (1) an increase in knowledge (merely a synonym for the word learning); (2) memorizing; (3) an…
In Defense of Dirty Words: The Case against Judicial Censorship in Oral Interpretation Events.
ERIC Educational Resources Information Center
Kugler, Drew B.
Within the realm of forensic oral interpretation, concern over the use of profanity in presentations has aroused repressive criticism from some judges, who then express their offense by ranking the performance negatively. This judicial opposition is deleterious not only to the precepts of oral interpretation, but also--in a larger sense--to the…
2013-10-01
correct group assignment of samples in unsupervised hierarchical clustering by the Unweighted Pair-Group Method using Arithmetic averages (UPGMA) based on ... centering of log2 transformed MAS5.0 signal values; probe set clustering was performed by the UPGMA method using Cosine correlation as the similarity met... A) The 108 differentially-regulated genes identified were subjected to unsupervised hierarchical clustering analysis using the UPGMA algorithm with
Object-oriented feature-tracking algorithms for SAR images of the marginal ice zone
NASA Technical Reports Server (NTRS)
Daida, Jason; Samadani, Ramin; Vesecky, John F.
1990-01-01
An unsupervised method that chooses and applies the most appropriate tracking algorithm from among different sea-ice tracking algorithms is reported. In contrast to current unsupervised methods, this method chooses and applies an algorithm by partially examining a sequential image pair to draw inferences about what was examined. Based on these inferences the reported method subsequently chooses which algorithm to apply to specific areas of the image pair where that algorithm should work best.
An Example of Unsupervised Networks Kohonen's Self-Organizing Feature Map
NASA Technical Reports Server (NTRS)
Niebur, Dagmar
1995-01-01
Kohonen's self-organizing feature map belongs to a class of unsupervised artificial neural networks commonly referred to as topographic maps. It serves two purposes: the quantization and the dimensionality reduction of data. A short description of its history and its biological context is given. We show that the inherent classification properties of the feature map make it a suitable candidate for solving the classification task in power system areas like load forecasting, fault diagnosis and security assessment.
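The quantization role of a self-organizing map can be sketched in a few lines. The following is a minimal one-dimensional map in plain Python, written under assumed settings (linear decay of learning rate and neighbourhood radius, Gaussian neighbourhood, toy two-cluster data); it is an illustration of the general technique, not the formulation used in the report.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_som(data, n_units=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D Kohonen map on a list of equal-length feature vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise each unit's weight vector from a randomly chosen sample.
    weights = [list(rng.choice(data)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks over time
        for x in rng.sample(data, len(data)):       # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured on the map grid, not in data space.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


# Toy data (assumed for illustration): two groups of 2-D points.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [9.8, 10.0], [10.0, 9.9], [10.1, 10.2]]
weights = train_som(data)
# Quantization: each sample is represented by the index of its closest unit.
codes = [best_matching_unit(weights, x) for x in data]
print(codes)
```

Each sample is replaced by a single unit index, which is the quantization step; the ordering of units along the map is what provides the reduced-dimension view of the data.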
Lopane, Giovanna; Mellone, Sabato; Corzani, Mattia; Chiari, Lorenzo; Cortelli, Pietro; Calandra-Buonaura, Giovanna; Contin, Manuela
2018-06-01
We aimed to assess the intrasubject reproducibility of a technology-based levodopa (LD) therapeutic monitoring protocol administered in supervised versus unsupervised conditions in patients with Parkinson's disease (PD). The study design was pilot, intrasubject, single center, open and prospective. Twenty patients were recruited. Patients performed a standardized monitoring protocol instrumented by an ad hoc embedded platform after their usual first morning LD dose in two different randomized ambulatory sessions: one under a physician's supervision, the other self-administered. The protocol is made up of serial motor and non-motor tests, including alternate finger tapping, Timed Up and Go test, and measurement of blood pressure. Primary motor outcomes included comparisons of intrasubject LD subacute motor response patterns over the 3-h test in the two experimental conditions. Secondary outcomes were the number of intrasession serial test repetitions due to technical or handling errors and patients' satisfaction with the unsupervised LD monitoring protocol. Intrasubject LD motor response patterns were concordant between the two study sessions in all patients but one. Platform handling problems averaged 4% of total planned serial tests for both sessions. Ninety-five percent of patients were satisfied with the self-administered LD monitoring protocol. To our knowledge, this study is the first to explore the potential of unsupervised technology-based objective motor and non-motor tasks to monitor subacute LD dosing effects in PD patients. The results are promising for future telemedicine applications.
Nasiri, Jaber; Naghavi, Mohammad Reza; Kayvanjoo, Amir Hossein; Nasiri, Mojtaba; Ebrahimi, Mansour
2015-03-07
For the first time, prediction accuracies of some supervised and unsupervised algorithms were evaluated in an SSR-based DNA fingerprinting study of a pea collection containing 20 cultivars and 57 wild samples. In general, according to the 10 attribute weighting models, the SSR alleles of PEAPHTAP-2 and PSBLOX13.2-1 were the two most important attributes to generate discrimination among eight different species and subspecies of genus Pisum. In addition, K-Medoids unsupervised clustering run on the Chi squared dataset exhibited the best prediction accuracy (83.12%), while the lowest accuracy (25.97%) was obtained when the K-Means model was run on the FCdb database. Irrespective of some fluctuations, the overall accuracies of tree induction models were significantly high for many algorithms, and the attributes PSBLOX13.2-3 and PEAPHTAP could successfully separate Pisum fulvum accessions and cultivars from the others when two selected decision trees were taken into account. Meanwhile, the other supervised algorithms used exhibited reliable overall accuracies, even though in some rare cases they yielded low accuracies. Our results, altogether, demonstrate promising applications of both supervised and unsupervised algorithms to provide suitable data mining tools for accurate fingerprinting of different species and subspecies of genus Pisum, a fundamental priority task in breeding programs of the crop. Copyright © 2015 Elsevier Ltd. All rights reserved.
Unsupervised classification of variable stars
NASA Astrophysics Data System (ADS)
Valenzuela, Lucas; Pichara, Karim
2018-03-01
During the past 10 years, a considerable amount of effort has been made to develop algorithms for automatic classification of variable stars. That has been primarily achieved by applying machine learning methods to photometric data sets where objects are represented as light curves. Classifiers require training sets to learn the underlying patterns that allow the separation among classes. Unfortunately, building training sets is an expensive process that demands a lot of human effort. Every time data come from new surveys, the only available training instances are the ones that have a cross-match with previously labelled objects, consequently generating insufficient training sets compared with the large amounts of unlabelled sources. In this work, we present an algorithm that performs unsupervised classification of variable stars, relying only on the similarity among light curves. We tackle the unsupervised classification problem by proposing an untraditional approach. Instead of trying to match classes of stars with clusters found by a clustering algorithm, we propose a query-based method where astronomers can find groups of variable stars ranked by similarity. We also develop a fast similarity function specific for light curves, based on a novel data structure that allows scaling the search over the entire data set of unlabelled objects. Experiments show that our unsupervised model achieves high accuracy in the classification of different types of variable stars and that the proposed algorithm scales up to massive amounts of light curves.
Unsupervised Machine Learning for Developing Personalised Behaviour Models Using Activity Data.
Fiorini, Laura; Cavallo, Filippo; Dario, Paolo; Eavis, Alexandra; Caleb-Solly, Praminda
2017-05-04
The goal of this study is to address two major issues that undermine the large scale deployment of smart home sensing solutions in people's homes. These include the costs associated with having to install and maintain a large number of sensors, and the pragmatics of annotating numerous sensor data streams for activity classification. Our aim was therefore to propose a method to describe individual users' behavioural patterns starting from unannotated data analysis of a minimal number of sensors and a "blind" approach for activity recognition. The methodology included processing and analysing sensor data from 17 older adults living in community-based housing to extract activity information at different times of the day. The findings illustrate that 55 days of sensor data from a sensor configuration comprising three sensors, and extracting appropriate features including a "busyness" measure, are adequate to build robust models which can be used for clustering individuals based on their behaviour patterns with a high degree of accuracy (>85%). The obtained clusters can be used to describe individual behaviour over different times of the day. This approach suggests a scalable solution to support optimising the personalisation of care by utilising low-cost sensing and analysis. This approach could be used to track a person's needs over time and fine-tune their care plan on an ongoing basis in a cost-effective manner.
Unsupervised Machine Learning for Developing Personalised Behaviour Models Using Activity Data
Fiorini, Laura; Cavallo, Filippo; Dario, Paolo; Eavis, Alexandra; Caleb-Solly, Praminda
2017-01-01
The goal of this study is to address two major issues that undermine the large scale deployment of smart home sensing solutions in people’s homes. These include the costs associated with having to install and maintain a large number of sensors, and the pragmatics of annotating numerous sensor data streams for activity classification. Our aim was therefore to propose a method to describe individual users’ behavioural patterns starting from unannotated data analysis of a minimal number of sensors and a “blind” approach for activity recognition. The methodology included processing and analysing sensor data from 17 older adults living in community-based housing to extract activity information at different times of the day. The findings illustrate that 55 days of sensor data from a sensor configuration comprising three sensors, and extracting appropriate features including a “busyness” measure, are adequate to build robust models which can be used for clustering individuals based on their behaviour patterns with a high degree of accuracy (>85%). The obtained clusters can be used to describe individual behaviour over different times of the day. This approach suggests a scalable solution to support optimising the personalisation of care by utilising low-cost sensing and analysis. This approach could be used to track a person’s needs over time and fine-tune their care plan on an ongoing basis in a cost-effective manner. PMID:28471405
RADARSAT-2 Polarimetry for Lake Ice Mapping
NASA Astrophysics Data System (ADS)
Pan, Feng; Kang, Kyung-Kuk; Duguay, Claude
2016-04-01
Changes in the ice regime of lakes can be employed to assess long-term climate trends and variability in high latitude regions. Lake ice cover observations are not only useful for climate monitoring, but also for improving ice and weather forecasts using numerical prediction models. In recent years, satellite remote sensing has assumed a greater role in observing lake ice cover for both purposes. Radar remote sensing has become an essential tool for mapping lake ice at high latitudes where cloud cover and polar darkness severely limit ice observations from optical systems. In Canada, there is an emerging interest by government agencies to evaluate the potential of fully polarimetric synthetic aperture radar (SAR) data from RADARSAT-2 (C-band) for lake ice monitoring. In this study, we processed and analyzed the polarization states and scattering mechanisms of fully polarimetric RADARSAT-2 data obtained over Great Bear Lake, Canada, to identify open water and different ice types during the freeze-up and break-up periods. Polarimetric decompositions were employed to separate polarimetric measurements into basic scattering mechanisms. Entropy, anisotropy, and alpha angle were derived to characterize the scattering heterogeneity and mechanisms. Ice classes were then determined based on entropy and alpha angle using the unsupervised Wishart classifier, and the results were evaluated against Landsat 8 imagery. Preliminary results suggest that the RADARSAT-2 polarimetric data offer a strong capability for identifying open water and different lake ice types.
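The entropy, anisotropy, and alpha angle named above are commonly obtained from the eigen-decomposition of the 3x3 polarimetric coherency matrix (the Cloude-Pottier approach). The abstract does not state which formulation was used, so the sketch below is an assumption: it takes the eigenvalues and the magnitude of the first element of each unit eigenvector as already computed, and the function name `h_a_alpha` is hypothetical.

```python
import math


def h_a_alpha(eigenvalues, first_components):
    """Return (entropy H, anisotropy A, mean alpha angle in degrees).

    eigenvalues      : three non-negative eigenvalues of the coherency matrix
    first_components : magnitude of the first element of each unit eigenvector,
                       listed in the same order as the eigenvalues
    """
    # Sort so that lambda1 >= lambda2 >= lambda3, keeping eigenvector data paired.
    pairs = sorted(zip(eigenvalues, first_components), key=lambda p: p[0], reverse=True)
    lam = [max(value, 0.0) for value, _ in pairs]
    total = sum(lam)
    if total <= 0:
        raise ValueError("at least one eigenvalue must be positive")
    probs = [value / total for value in lam]
    # Entropy uses a base-3 logarithm so that H lies in [0, 1].
    entropy = -sum(p * math.log(p, 3) for p in probs if p > 0)
    # Anisotropy compares the two weaker scattering mechanisms.
    denom = lam[1] + lam[2]
    anisotropy = (lam[1] - lam[2]) / denom if denom > 0 else 0.0
    # Mean alpha is the probability-weighted average of arccos(|e_i1|).
    alpha = sum(
        p * math.degrees(math.acos(min(1.0, abs(c))))
        for p, (_, c) in zip(probs, pairs)
    )
    return entropy, anisotropy, alpha


# Single dominant surface-like mechanism: H = 0, alpha = 0 degrees.
print(h_a_alpha([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))
# Three equally strong mechanisms: H = 1, A = 0, alpha = 60 degrees.
print(h_a_alpha([1.0, 1.0, 1.0], [1.0, 0.0, 0.0]))
```

Low entropy with a low alpha angle is usually read as a single dominant surface-type scatterer, whereas entropy near 1 means no mechanism dominates; a classifier such as the Wishart classifier mentioned above can then be seeded from zones of the entropy/alpha plane.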
Prediction during language comprehension: benefits, costs, and ERP components.
Van Petten, Cyma; Luka, Barbara J
2012-02-01
Because context has a robust influence on the processing of subsequent words, the idea that readers and listeners predict upcoming words has attracted research attention, but prediction has fallen in and out of favor as a likely factor in normal comprehension. We note that the common sense of this word includes both benefits for confirmed predictions and costs for disconfirmed predictions. The N400 component of the event-related potential (ERP) reliably indexes the benefits of semantic context. Evidence that the N400 is sensitive to the other half of prediction--a cost for failure--is largely absent from the literature. This raises the possibility that "prediction" is not a good description of what comprehenders do. However, it need not be the case that the benefits and costs of prediction are evident in a single ERP component. Research outside of language processing indicates that late positive components of the ERP are very sensitive to disconfirmed predictions. We review late positive components elicited by words that are potentially more or less predictable from preceding sentence context. This survey suggests that late positive responses to unexpected words are fairly common, but that these consist of two distinct components with different scalp topographies, one associated with semantically incongruent words and one associated with congruent words. We conclude with a discussion of the possible cognitive correlates of these distinct late positivities and their relationships with more thoroughly characterized ERP components, namely the P300, P600 response to syntactic errors, and the "old/new effect" in studies of recognition memory. Copyright © 2011 Elsevier B.V. All rights reserved.
Applicability Assessment of Uavsar Data in Wetland Monitoring: a Case Study of Louisiana Wetland
NASA Astrophysics Data System (ADS)
Zhao, J.; Niu, Y.; Lu, Z.; Yang, J.; Li, P.; Liu, W.
2018-04-01
Wetlands are highly productive and support a wide variety of ecosystem goods and services, so monitoring them is essential. Because of the repeat-pass nature of satellite and airborne platforms, time series of remote sensing data can be obtained to monitor wetlands. UAVSAR is a NASA L-band synthetic aperture radar (SAR) sensor, a compact pod-mounted polarimetric instrument for interferometric repeat-track observations. Moreover, UAVSAR images can accurately map crustal deformations associated with natural hazards, such as volcanoes and earthquakes, and its polarization agility facilitates terrain and land-use classification and change detection. In this paper, multi-temporal UAVSAR data are applied to monitoring wetland change. Using the multi-temporal polarimetric SAR (PolSAR) data, change detection maps are obtained by unsupervised and supervised methods, and the coherence is extracted from the interferometric SAR (InSAR) data to verify the accuracy of the change detection maps. The experimental results show that multi-temporal UAVSAR data are suitable for wetland monitoring.
Detection of macroalgae blooms by complex SAR imagery.
Shen, Hui; Perrie, William; Liu, Qingrong; He, Yijun
2014-01-15
Increased frequency and enhanced damage to the marine environment and to human society caused by green macroalgae blooms demand improved high-resolution early detection methods. Conventional satellite remote sensing methods via spectra radiometers do not work in cloud-covered areas, and therefore cannot meet these demands for operational applications. We present a methodology for green macroalgae bloom detection based on RADARSAT-2 synthetic aperture radar (SAR) images. Green macroalgae patches exhibit different polarimetric characteristics compared to the open ocean surface, in both the amplitude and phase domains of SAR-measured complex radar backscatter returns. In this study, new index factors are defined which have opposite signs in green macroalgae-covered areas, compared to the open water surface. These index factors enable unsupervised detection from SAR images, providing a high-resolution new tool for detection of green macroalgae blooms, which can potentially contribute to a better understanding of the mechanisms related to outbreaks of green macroalgae blooms in coastal areas throughout the world ocean. Crown Copyright © 2013. Published by Elsevier Ltd. All rights reserved.
Automatic Feature Extraction from Planetary Images
NASA Technical Reports Server (NTRS)
Troglio, Giulia; Le Moigne, Jacqueline; Benediktsson, Jon A.; Moser, Gabriele; Serpico, Sebastiano B.
2010-01-01
With the launch of several planetary missions in the last decade, a large number of planetary images have already been acquired and many more will be available for analysis in the coming years. The image data need to be analyzed, preferably by automatic processing techniques because of the huge amount of data. Although many automatic feature extraction methods have been proposed and utilized for Earth remote sensing images, these methods are not always applicable to planetary data that often present low contrast and uneven illumination characteristics. Different methods have already been presented for crater extraction from planetary images, but the detection of other types of planetary features has not been addressed yet. Here, we propose a new unsupervised method for the extraction of different features from the surface of the analyzed planet, based on the combination of several image processing techniques, including a watershed segmentation and the generalized Hough Transform. The method has many applications, among them image registration, and can be applied to arbitrary planetary images.
NASA Astrophysics Data System (ADS)
Sun, Hao; Zou, Huanxin; Zhou, Shilin
2016-03-01
Detection of anomalous targets of various sizes in hyperspectral data has received a lot of attention in reconnaissance and surveillance applications. Many anomaly detectors have been proposed in the literature. However, current methods are susceptible to anomalies in the processing window range and often make critical assumptions about the distribution of the background data. Motivated by the fact that anomaly pixels are often distinctive from their local background, in this letter, we propose a novel hyperspectral anomaly detection framework for real-time remote sensing applications. The proposed framework consists of four major components: sparse feature learning, pyramid grid window selection, joint spatial-spectral collaborative coding, and multi-level divergence fusion. It exploits the collaborative representation difference in the feature space to locate potential anomalies and is totally unsupervised without any prior assumptions. Experimental results on airborne recorded hyperspectral data demonstrate that the proposed method is adaptive to anomalies in a large range of sizes and is well suited for parallel processing.
Involvement of surgical trainees in surgery for colorectal cancer and their effect on outcome.
Borowski, D W; Ratcliffe, A A; Bharathan, B; Gunn, A; Bradburn, D M; Mills, S J; Wilson, R G; Kelly, S B
2008-10-01
Surgical training in the UK is undergoing substantial changes. This study assessed: 1) the training opportunities available to trainees in operations for colorectal cancer, 2) the effect of colorectal specialization on training, and 3) the effect of consultant supervision on anastomotic complications, postoperative stay, operative mortality and 5-year survival. Unadjusted and adjusted comparisons of outcomes were made for unsupervised trainees, supervised trainees and consultants as the primary surgeon in 7411 operated patients included in the Northern Region Colorectal Cancer Audit between 1998 and 2002. Surgery was performed in 656 (8.8%) patients by unsupervised trainees and in 1578 (21.3%) patients by supervised trainees. Unsupervised operations reduced from 182 (12.4%) in 1998 to 82 (6.1%) in 2002 (P < 0.001). Consultants with a colorectal specialist interest were more likely than nonspecialists to be present at surgical resections (OR 1.35, 1.12-1.63, P = 0.001) and to provide supervised training (OR 1.34, 1.17-1.53, P < 0.001). Patients operated on by unsupervised trainees were more often high-risk patients; however, consultant presence was not significantly associated with operative mortality (OR 0.83, 0.63-1.09, P = 0.186) or survival (HR 1.02, 0.92-1.13, P = 0.735) in risk-adjusted analysis. Supervised trainees had a case-mix similar to consultants, with shorter length of hospital stay (11.4 vs 12.4 days, P < 0.001), but similar mortality (OR 0.90, 0.71-1.16, P = 0.418) and survival (HR 0.96, 0.89-1.05, P = 0.378). One third of patients were operated on by trainees, who were more likely to perform supervised resections in colorectal teams. There was no difference in anastomotic leak rates, operative mortality or survival between unsupervised trainees, supervised trainees and consultants when case-mix adjustment was applied. This study would suggest that there is considerable underused training capacity available.
NASA Astrophysics Data System (ADS)
Rabiul Islam, Md; Khademul Islam Molla, Md; Nakanishi, Masaki; Tanaka, Toshihisa
2017-04-01
Objective. Recently developed effective methods for detecting commands in steady-state visual evoked potential (SSVEP)-based brain-computer interfaces (BCIs) need calibration for visual stimuli, which costs more time and causes more fatigue prior to use as the number of commands increases. This paper develops a novel unsupervised method based on canonical correlation analysis (CCA) for accurate detection of stimulus frequency. Approach. A novel unsupervised technique termed binary subband CCA (BsCCA) is implemented in a multiband approach to enhance the frequency recognition performance of SSVEP. In BsCCA, two subbands are used and a CCA-based correlation coefficient is computed for the individual subbands. In addition, a reduced set of artificial reference signals is used to calculate CCA for the second subband. The SSVEP under analysis is decomposed into multiple subbands and BsCCA is implemented for each one. Then, the overall recognition score is determined by a weighted sum of the canonical correlation coefficients obtained from each band. Main results. A 12-class SSVEP dataset (frequency range: 9.25-14.75 Hz with an interval of 0.5 Hz) for ten healthy subjects is used to evaluate the performance of the proposed method. The results suggest that BsCCA significantly improves the performance of SSVEP-based BCI compared to the state-of-the-art methods. The proposed method is an unsupervised approach with an averaged information transfer rate (ITR) of 77.04 bits min⁻¹ across 10 subjects. The maximum individual ITR is 107.55 bits min⁻¹ for the 12-class SSVEP dataset, whereas ITRs of 69.29 and 69.44 bits min⁻¹ are achieved with CCA and NCCA, respectively. Significance. The statistical test shows that the proposed unsupervised method significantly improves the performance of the SSVEP-based BCI. It can be used in real-world applications.
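As a rough illustration of the baseline that the abstract compares against, the following is a minimal sketch of standard CCA-based frequency recognition, not of BsCCA itself. Each candidate stimulus frequency gets a set of sine/cosine reference signals, and the candidate whose references have the largest canonical correlation with the multichannel recording is selected. The sampling rate, window length, number of harmonics, channel count, and the synthetic test signal are all assumptions made for this example; only the 12 candidate frequencies come from the abstract.

```python
import numpy as np

def max_canonical_corr(X, Y):
    """Largest canonical correlation between X and Y (rows are time samples).

    Assumes both matrices have full column rank after centering.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, _ = np.linalg.qr(Xc)  # orthonormal basis for the column space of X
    Qy, _ = np.linalg.qr(Yc)  # orthonormal basis for the column space of Y
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(min(s[0], 1.0))

def reference_signals(freq, n_samples, fs, n_harmonics=2):
    """Sine/cosine references at freq and its harmonics (n_harmonics is assumed)."""
    t = np.arange(n_samples) / fs
    cols = []
    for h in range(1, n_harmonics + 1):
        cols.append(np.sin(2 * np.pi * h * freq * t))
        cols.append(np.cos(2 * np.pi * h * freq * t))
    return np.column_stack(cols)

def detect_frequency(eeg, candidates, fs):
    """Return the candidate frequency with the highest canonical correlation."""
    scores = [max_canonical_corr(eeg, reference_signals(f, eeg.shape[0], fs))
              for f in candidates]
    return float(candidates[int(np.argmax(scores))]), scores

# Synthetic stand-in for a recording (all values here are hypothetical).
fs, n_samples, n_channels, true_freq = 250, 500, 4, 10.25
rng = np.random.default_rng(0)
t = np.arange(n_samples) / fs
phases = rng.uniform(0, np.pi, n_channels)
eeg = np.column_stack([np.sin(2 * np.pi * true_freq * t + p) for p in phases])
eeg = eeg + 0.5 * rng.standard_normal((n_samples, n_channels))

candidates = np.arange(9.25, 15.0, 0.5)  # 9.25-14.75 Hz in 0.5 Hz steps
best, scores = detect_frequency(eeg, candidates, fs)
```

A multiband variant along the lines the abstract describes would band-pass filter the recording into subbands, compute a score like this per subband, and combine them with a weighted sum; the filter design and weights are not given in the abstract and are left out here.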
Anastasiadou, Maria N; Christodoulakis, Manolis; Papathanasiou, Eleftherios S; Papacostas, Savvas S; Mitsis, Georgios D
2017-09-01
This paper proposes supervised and unsupervised algorithms for automatic muscle artifact detection and removal from long-term EEG recordings, which combine canonical correlation analysis (CCA) and wavelets with random forests (RF). The proposed algorithms first perform CCA and continuous wavelet transform of the canonical components to generate a number of features which include component autocorrelation values and wavelet coefficient magnitude values. A subset of the most important features is subsequently selected using RF and labelled observations (supervised case) or synthetic data constructed from the original observations (unsupervised case). The proposed algorithms are evaluated using realistic simulation data as well as 30min epochs of non-invasive EEG recordings obtained from ten patients with epilepsy. We assessed the performance of the proposed algorithms using classification performance and goodness-of-fit values for noisy and noise-free signal windows. In the simulation study, where the ground truth was known, the proposed algorithms yielded almost perfect performance. In the case of experimental data, where expert marking was performed, the results suggest that both the supervised and unsupervised algorithm versions were able to remove artifacts without affecting noise-free channels considerably, outperforming standard CCA, independent component analysis (ICA) and Lagged Auto-Mutual Information Clustering (LAMIC). The proposed algorithms achieved excellent performance for both simulation and experimental data. Importantly, for the first time to our knowledge, we were able to perform entirely unsupervised artifact removal, i.e. without using already marked noisy data segments, achieving performance that is comparable to the supervised case. 
Overall, the results suggest that the proposed algorithms yield significant future potential for improving EEG signal quality in research or clinical settings without the need for marking by expert neurophysiologists, EMG signal recording and user visual inspection. Copyright © 2017 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Hobson, V. R.; Shervais, J. W.
2004-12-01
Developing a method to characterize the physical, chemical and temporal aspects of terrestrial volcanics is a necessary step toward studying volcanics on other planetary bodies. Volcanoes and flows close to populated centers have been studied to varying degree, but remote volcanics remain largely unstudied. Remotely sensed data and derived information can be used to select field sites on Earth and on other planets. Scientists studying volcanics in dangerous areas would benefit from as much advance knowledge of the area as possible before beginning fieldwork. By using satellites and other remote sensing methods, information about the eruptive history can be derived and potentially, the hazard these remote volcanic areas may pose to current and future generations can be estimated. Using Landsat TM, ASTER and other remotely sensed data, the extent and characteristics of lava flows can be examined, but verification and refinement of these methods requires collection of data on the ground. Young lava flows at Craters of the Moon National Park were selected to test methods for remote mapping of recent volcanics. These late Pleistocene to Holocene basalt flows have been mapped to 1:100,000 scale (Kuntz et al, 1988) and have only minor vegetative cover. A range of remotely sensed spectral images were combined to optimize recovery of the mapped flows. Major flow units can be distinguished from each other using unsupervised classification of Landsat TM Bands 1-7, but differentiation of flows within these units presents greater difficulty. Principal component analyses revealed that during the daytime, thermal infrared variations outweigh variations in all other bands. Larger-scale features were observed like edge effects attributable to changes in surface roughness or texture that might occur at flow fronts or at boundaries between flows. 
Using a digitized version of the geologic map, TM and ASTER data for individual flows were isolated and examined for changes with distance from the source vent or fissure. Several flows were selected for further examination in the field, based on accessibility and scientific interest.
Spectral Resolution and Coverage Impact on Advanced Sounder Information Content
NASA Technical Reports Server (NTRS)
Larar, Allen M.; Liu, Xu; Zhou, Daniel K.; Smith, William L.
2010-01-01
Advanced satellite sensors are tasked with improving global measurements of the Earth's atmosphere, clouds, and surface to enable enhancements in weather prediction, climate monitoring capability, and environmental change detection. Achieving such measurement improvements requires instrument system advancements. This presentation focuses on the impact of spectral resolution and coverage changes on remote sensing system information content, with a specific emphasis on thermodynamic state and trace species variables obtainable from advanced atmospheric sounders such as the Infrared Atmospheric Sounding Interferometer (IASI) and Cross-track Infrared Sounder (CrIS) systems on the MetOp and NPP/NPOESS series of satellites. Key words: remote sensing, advanced sounders, information content, IASI, CrIS
An assessment of the effectiveness of a random forest classifier for land-cover classification
NASA Astrophysics Data System (ADS)
Rodriguez-Galiano, V. F.; Ghimire, B.; Rogan, J.; Chica-Olmo, M.; Rigol-Sanchez, J. P.
2012-01-01
Land cover monitoring using remotely sensed data requires robust classification methods which allow for the accurate mapping of complex land cover and land use categories. Random forest (RF) is a powerful machine learning classifier that is relatively unknown in land remote sensing and has not been evaluated thoroughly by the remote sensing community compared to more conventional pattern recognition techniques. Key advantages of RF include its non-parametric nature, high classification accuracy, and capability to determine variable importance. However, the split rules for classification are unknown; therefore RF can be considered a black-box classifier. RF also provides an algorithm for estimating missing values and the flexibility to perform several types of data analysis, including regression, classification, survival analysis, and unsupervised learning. In this paper, the performance of the RF classifier for land cover classification of a complex area is explored. Evaluation was based on several criteria: mapping accuracy and sensitivity to data set size and noise. Landsat-5 Thematic Mapper data captured in European spring and summer were used with auxiliary variables derived from a digital terrain model to classify 14 different land categories in the south of Spain. Results show that the RF algorithm yields accurate land cover classifications, with 92% overall accuracy and a Kappa index of 0.92. RF is robust to training data reduction and noise because significant differences in kappa values were only observed for data reduction and noise addition values greater than 50 and 20%, respectively. Additionally, variables that RF identified as most important for classifying land cover coincided with expectations. A McNemar test indicates an overall better performance of the random forest model over a single decision tree at the 0.00001 significance level.
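The two headline metrics in the abstract above, overall accuracy and the Kappa index, are both computed from a confusion matrix. The following is a minimal sketch of that calculation using the standard definition of Cohen's kappa; the 2x2 matrix is invented for illustration and has nothing to do with the study's 14-class results.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i mapped to class j.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix (100 samples).
example = [[45, 5],
           [10, 40]]
accuracy, kappa = accuracy_and_kappa(example)  # roughly 0.85 and 0.70
```

Kappa discounts the agreement that the class marginals alone would produce, which is why it is usually reported alongside overall accuracy rather than instead of it.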
Embedding Open-domain Common-sense Knowledge from Text
Goodwin, Travis; Harabagiu, Sanda
2017-01-01
Our ability to understand language often relies on common-sense knowledge – background information the speaker can assume is known by the reader. Similarly, our comprehension of the language used in complex domains relies on access to domain-specific knowledge. Capturing common-sense and domain-specific knowledge can be achieved by taking advantage of recent advances in open information extraction (IE) techniques and, more importantly, of knowledge embeddings, which are multi-dimensional representations of concepts and relations. Building a knowledge graph for representing common-sense knowledge in which concepts discerned from noun phrases are cast as vertices and lexicalized relations are cast as edges leads to learning the embeddings of common-sense knowledge accounting for semantic compositionality as well as implied knowledge. Common-sense knowledge is acquired from a vast collection of blogs and books as well as from WordNet. Similarly, medical knowledge is learned from two large sets of electronic health records. The evaluation results of these two forms of knowledge are promising: the same knowledge acquisition methodology based on learning knowledge embeddings works well both for common-sense knowledge and for medical knowledge. Interestingly, the common-sense knowledge that we have acquired was evaluated as being less neutral than the medical knowledge, as it often reflected the opinion of the knowledge utterer. In addition, the acquired medical knowledge was evaluated as more plausible than the common-sense knowledge, reflecting the complexity of acquiring common-sense knowledge due to the pragmatics and economicity of language. PMID:28649676
Persinger, M A; Moulden, J A; Richards, P M
1999-10-01
Analyses of the data from 212 boys and girls, aged 7-14 years, demonstrated a relatively abrupt and permanent decrease in the numbers of errors for dichotic (left ear) word listening and for toe gnosis after the ninth year. This pattern was not observed for right ear errors, finger gnosis, or indices of finger and foot agility. The results are compatible with the hypothesis that the final differentiation of the paracentral lobules and adjacent corpus callosum by the most distal portions of the Anterior Cerebral Artery occurs around 9 or 10 years of age. Implications for the development of the sense of self, enhanced apprehension, and "the sense of a presence" are discussed.
Dzubur, Eldin; Khalil, Carine; Almario, Christopher V; Noah, Benjamin; Minhas, Deeba; Ishimori, Mariko; Arnold, Corey; Park, Yujin; Kay, Jonathan; Weisman, Michael H; Spiegel, Brennan M R
2018-05-21
Few studies have examined ankylosing spondylitis (AS) patients' concerns and perceptions of biologic therapies outside of traditional surveys. In this study, we used social media data to examine AS patients' knowledge, attitudes, and beliefs regarding biologic therapies. We collected posts from 601 social media sites made between 1/1/06-4/26/17. Each post mentioned both an AS keyword and a biologic. To explore themes within the collection in an unsupervised manner, a latent Dirichlet allocation topic model was fit to the dataset. Each discovered topic was represented as a discrete distribution over the words in the collection, similar to a word cloud. The topics were manually reviewed to identify themes, which were confirmed with thematic data analysis. We examined 27,416 social media posts and found 112 themes. The majority of themes (60%, 67/112) focused on discussions surrounding AS treatment. Other themes including psychological impact of AS, reporting of medical literature, and AS disease consequences accounted for the remaining 40% (45/112). Within AS treatment discussions, most topics (54%) involved biologics, and most subthemes (78%) centered on side-effects (e.g., fatigue, allergic reactions), biologic attributes (e.g., dosing, frequency), and concerns with biologic use (e.g., increased cancer risk). Additional implicit patient needs (e.g., support) were identified using qualitative analyses. Social media reveals a dynamic range of themes governing AS patients' experience and choice with biologics. The complexity of selecting among biologics and navigating their risk-benefit profiles suggests merit in creating online tailored decision-tools to support patients' decision-making with AS biologic therapies. This article is protected by copyright. All rights reserved.
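To make concrete what "a discrete distribution over the words in the collection" means for a fitted topic, the following is a minimal sketch of latent Dirichlet allocation trained by collapsed Gibbs sampling on a toy corpus. The abstract does not say which inference method or software the authors used, so the sampler, the hyperparameters (alpha, beta, iteration count), and the tiny invented "posts" are all assumptions for illustration only.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns one word distribution per topic."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]       # tokens per (doc, topic)
    topic_word = [[0] * V for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                     # tokens per topic
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(n_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(zs)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi, k = index[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [(doc_topic[d][t] + alpha)
                           * (topic_word[t][wi] + beta) / (topic_total[t] + V * beta)
                           for t in range(n_topics)]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions; each one sums to 1 over the vocabulary.
    return [{vocab[v]: (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
             for v in range(V)} for k in range(n_topics)]

# Invented toy "posts", already tokenized (hypothetical data).
posts = [
    ["fatigue", "injection", "dose", "fatigue"],
    ["dose", "injection", "schedule", "dose"],
    ["support", "group", "anxiety", "support"],
    ["anxiety", "group", "support", "sleep"],
]
topics = lda_gibbs(posts, n_topics=2)
top_words = [sorted(t, key=t.get, reverse=True)[:3] for t in topics]
```

Listing the highest-probability words of each topic, as `top_words` does, is the usual starting point for the kind of manual review the abstract describes.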
Evaluating topic model interpretability from a primary care physician perspective.
Arnold, Corey W; Oh, Andrea; Chen, Shawn; Speier, William
2016-02-01
Probabilistic topic models provide an unsupervised method for analyzing unstructured text. These models discover semantically coherent combinations of words (topics) that could be integrated in a clinical automatic summarization system for primary care physicians performing chart review. However, the human interpretability of topics discovered from clinical reports is unknown. Our objective is to assess the coherence of topics and their ability to represent the contents of clinical reports from a primary care physician's point of view. Three latent Dirichlet allocation models (50 topics, 100 topics, and 150 topics) were fit to a large collection of clinical reports. Topics were manually evaluated by primary care physicians and graduate students. Wilcoxon Signed-Rank Tests for Paired Samples were used to evaluate differences between different topic models, while differences in performance between students and primary care physicians (PCPs) were tested using Mann-Whitney U tests for each of the tasks. While the 150-topic model produced the best log likelihood, participants were most accurate at identifying words that did not belong in topics learned by the 100-topic model, suggesting that 100 topics provides better relative granularity of discovered semantic themes for the data set used in this study. Models were comparable in their ability to represent the contents of documents. Primary care physicians significantly outperformed students in both tasks. This work establishes a baseline of interpretability for topic models trained with clinical reports, and provides insights on the appropriateness of using topic models for informatics applications. Our results indicate that PCPs find discovered topics more coherent and representative of clinical reports relative to students, warranting further research into their use for automatic summarization. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Evaluating Topic Model Interpretability from a Primary Care Physician Perspective
Arnold, Corey W.; Oh, Andrea; Chen, Shawn; Speier, William
2015-01-01
Background and Objective Probabilistic topic models provide an unsupervised method for analyzing unstructured text. These models discover semantically coherent combinations of words (topics) that could be integrated in a clinical automatic summarization system for primary care physicians performing chart review. However, the human interpretability of topics discovered from clinical reports is unknown. Our objective is to assess the coherence of topics and their ability to represent the contents of clinical reports from a primary care physician’s point of view. Methods Three latent Dirichlet allocation models (50 topics, 100 topics, and 150 topics) were fit to a large collection of clinical reports. Topics were manually evaluated by primary care physicians and graduate students. Wilcoxon Signed-Rank Tests for Paired Samples were used to evaluate differences between different topic models, while differences in performance between students and primary care physicians (PCPs) were tested using Mann-Whitney U tests for each of the tasks. Results While the 150-topic model produced the best log likelihood, participants were most accurate at identifying words that did not belong in topics learned by the 100-topic model, suggesting that 100 topics provides better relative granularity of discovered semantic themes for the data set used in this study. Models were comparable in their ability to represent the contents of documents. Primary care physicians significantly outperformed students in both tasks. Conclusion This work establishes a baseline of interpretability for topic models trained with clinical reports, and provides insights on the appropriateness of using topic models for informatics applications. Our results indicate that PCPs find discovered topics more coherent and representative of clinical reports relative to students, warranting further research into their use for automatic summarization. PMID:26614020
Dubois, Matthieu; Poeppel, David; Pelli, Denis G.
2013-01-01
To understand why human sensitivity for complex objects is so low, we study how word identification combines eye and ear or parts of a word (features, letters, syllables). Our observers identify printed and spoken words presented concurrently or separately. When researchers measure threshold (energy of the faintest visible or audible signal) they may report either sensitivity (one over the human threshold) or efficiency (ratio of the best possible threshold to the human threshold). When the best possible algorithm identifies an object (like a word) in noise, its threshold is independent of how many parts the object has. But, with human observers, efficiency depends on the task. In some tasks, human observers combine parts efficiently, needing hardly more energy to identify an object with more parts. In other tasks, they combine inefficiently, needing energy nearly proportional to the number of parts, over a 60∶1 range. Whether presented to eye or ear, efficiency for detecting a short sinusoid (tone or grating) with few features is a substantial 20%, while efficiency for identifying a word with many features is merely 1%. Why? We show that the low human sensitivity for words is a cost of combining their many parts. We report a dichotomy between inefficient combining of adjacent features and efficient combining across senses. Joining our results with a survey of the cue-combination literature reveals that cues combine efficiently only if they are perceived as aspects of the same object. Observers give different names to adjacent letters in a word, and combine them inefficiently. Observers give the same name to a word’s image and sound, and combine them efficiently. The brain’s machinery optimally combines only cues that are perceived as originating from the same object. Presumably such cues each find their own way through the brain to arrive at the same object representation. PMID:23734220
Age differences in suprathreshold sensory function.
Heft, Marc W; Robinson, Michael E
2014-02-01
While there is general agreement that vision and audition decline with aging, observations for the somatosensory senses and taste are less clear. The purpose of this study was to assess age differences in multimodal sensory perception in healthy, community-dwelling participants. Participants (100 females and 78 males aged 20-89 years) judged the magnitudes of sensations associated with graded levels of thermal, tactile, and taste stimuli in separate testing sessions using a cross-modality matching (CMM) procedure. During each testing session, participants also rated words that describe magnitudes of percepts associated with differing-level sensory stimuli. The words provided contextual anchors for the sensory ratings, and the word-rating task served as a control for the CMM. The mean sensory ratings were used as dependent variables in a MANOVA for each sensory domain, with age and sex as between-subject variables. These analyses were repeated with the grand means for the word ratings as a covariate to control for the rating task. The results of this study suggest that there are modest age differences for somatosensory and taste domains. While the magnitudes of these differences are mediated somewhat by age differences in the rating task, differences in warm temperature, tactile, and salty taste persist.
ERIC Educational Resources Information Center
Paul, Kristina Ayers; Tay, Juliana
2016-01-01
Paideia Seminar is a method for facilitating Socratic discussions about different types of texts, whether they be texts in the literal sense of the word or any other object that represents ideas or values. In this article, we describe how teachers can implement Paideia Seminar to spark deep thinking and rich discussion among early elementary…
The Functional Use of a Mathematical Sign
ERIC Educational Resources Information Center
Berger, Margot
2004-01-01
The question of how a mathematics student at university-level makes sense of a new mathematical sign, presented to her or him in the form of a definition, is a fundamental problem in mathematics education. Using an analogy with Vygotsky's theory (1986, 1994) of how a child learns a new word, I argue that a learner uses a new mathematical sign both…
The PHaVE List: A Pedagogical List of Phrasal Verbs and Their Most Frequent Meaning Senses
ERIC Educational Resources Information Center
Garnier, Mélodie; Schmitt, Norbert
2015-01-01
As researchers and practitioners are becoming more aware of the importance of multi-word items in English, there is little doubt that phrasal verbs deserve teaching attention in the classroom. However, there are thousands of phrasal verbs in English, and so the question for practitioners is which phrasal verbs to focus attention upon. Phrasal verb…
Deafness and Hearing Loss. NICHCY Disability Fact Sheet #3
ERIC Educational Resources Information Center
National Dissemination Center for Children with Disabilities, 2010
2010-01-01
Hearing is one of the five senses. Hearing gives access to sounds in the world--people's voices, their words, a car horn blown in warning or as hello! When a child has a hearing loss, it is cause for immediate attention. That is because language and communication skills develop most rapidly in childhood, especially before the age of 3. When…
ERIC Educational Resources Information Center
Dennett, Daniel C.
2006-01-01
According to surveys, most of the people in the world say that religion is very important in their lives. Many would say that without it, their lives would be meaningless. It is tempting just to take them at their word, to declare that nothing more is to be said-- and to tiptoe away. Who would want to interfere with whatever it is that gives their…
Quantitative Assessment of a Field-Based Course on Integrative Geology, Ecology and Cultural History
ERIC Educational Resources Information Center
Sheppard, Paul R.; Donaldson, Brad A.; Huckleberry, Gary
2010-01-01
A field-based course at the University of Arizona called Sense of Place (SOP) covers the geology, ecology and cultural history of the Tucson area. SOP was quantitatively assessed for pedagogical effectiveness. Students of the Spring 2008 course were given pre- and post-course word association surveys in order to assess awareness and comprehension…
"My Life in the New South Africa": A Youth Perspective.
ERIC Educational Resources Information Center
Leggett, Ted, Ed.; Moller, Valerie, Ed.; Richards, Robin, Ed.
This book gives a unique insight into the thoughts and concerns of South Africans under the age of 30. Young people from across the country participated in a letter-writing contest to give their experiences and opinions, and to reveal their lives, hopes, and ambitions. The book uses their own words as they try to make sense of post-Apartheid South…
Free for All: Open Source Software
ERIC Educational Resources Information Center
Schneider, Karen
2008-01-01
Open source software has become a catchword in libraryland. Yet many remain unclear about open source's benefits--or even what it is. So what is open source software (OSS)? It's software that is free in every sense of the word: free to download, free to use, and free to view or modify. Most OSS is distributed on the Web and one doesn't need to…