Sample records for supervised classifier based

  1. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation.

    PubMed

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-04-01

    Brain magnetic resonance images (MRIs) tissue segmentation is one of the most important parts of the clinical diagnostic tools. Pixel classification methods have been frequently used in the image segmentation with two supervised and unsupervised approaches up to now. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain. Moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to low level of performance. However, semi-supervised learning which uses a few labeled data together with a large amount of unlabeled data causes higher accuracy with less trouble. In this paper, we propose an ensemble semi-supervised frame-work for segmenting of brain magnetic resonance imaging (MRI) tissues that it has been used results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers has a significant role in the performance of this frame-work. Hence, in this paper, we present two semi-supervised algorithms expectation filtering maximization and MCo_Training that are improved versions of semi-supervised methods expectation maximization and Co_Training and increase segmentation accuracy. Afterward, we use these improved classifiers together with graph-based semi-supervised classifier as components of the ensemble frame-work. Experimental results show that performance of segmentation in this approach is higher than both supervised methods and the individual semi-supervised classifiers.

  2. Ensemble Semi-supervised Frame-work for Brain Magnetic Resonance Imaging Tissue Segmentation

    PubMed Central

    Azmi, Reza; Pishgoo, Boshra; Norozi, Narges; Yeganeh, Samira

    2013-01-01

    Brain magnetic resonance images (MRIs) tissue segmentation is one of the most important parts of the clinical diagnostic tools. Pixel classification methods have been frequently used in the image segmentation with two supervised and unsupervised approaches up to now. Supervised segmentation methods lead to high accuracy, but they need a large amount of labeled data, which is hard, expensive, and slow to obtain. Moreover, they cannot use unlabeled data to train classifiers. On the other hand, unsupervised segmentation methods have no prior knowledge and lead to low level of performance. However, semi-supervised learning which uses a few labeled data together with a large amount of unlabeled data causes higher accuracy with less trouble. In this paper, we propose an ensemble semi-supervised frame-work for segmenting of brain magnetic resonance imaging (MRI) tissues that it has been used results of several semi-supervised classifiers simultaneously. Selecting appropriate classifiers has a significant role in the performance of this frame-work. Hence, in this paper, we present two semi-supervised algorithms expectation filtering maximization and MCo_Training that are improved versions of semi-supervised methods expectation maximization and Co_Training and increase segmentation accuracy. Afterward, we use these improved classifiers together with graph-based semi-supervised classifier as components of the ensemble frame-work. Experimental results show that performance of segmentation in this approach is higher than both supervised methods and the individual semi-supervised classifiers. PMID:24098863

  3. Evaluation of Semi-supervised Learning for Classification of Protein Crystallization Imagery.

    PubMed

    Sigdel, Madhav; Dinç, İmren; Dinç, Semih; Sigdel, Madhu S; Pusey, Marc L; Aygün, Ramazan S

    2014-03-01

    In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimum optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high confident predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with the basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, random forest classifier yields the best accuracy with supervised learning for our dataset.

  4. Evaluation of Semi-supervised Learning for Classification of Protein Crystallization Imagery

    PubMed Central

    Sigdel, Madhav; Dinç, İmren; Dinç, Semih; Sigdel, Madhu S.; Pusey, Marc L.; Aygün, Ramazan S.

    2015-01-01

    In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimum optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high confident predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with the basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, random forest classifier yields the best accuracy with supervised learning for our dataset. PMID:25914518

  5. Molecular activity prediction by means of supervised subspace projection based ensembles of classifiers.

    PubMed

    Cerruela García, G; García-Pedrajas, N; Luque Ruiz, I; Gómez-Nieto, M Á

    2018-03-01

    This paper proposes a method for molecular activity prediction in QSAR studies using ensembles of classifiers constructed by means of two supervised subspace projection methods, namely nonparametric discriminant analysis (NDA) and hybrid discriminant analysis (HDA). We studied the performance of the proposed ensembles compared to classical ensemble methods using four molecular datasets and eight different models for the representation of the molecular structure. Using several measures and statistical tests for classifier comparison, we observe that our proposal improves the classification results with respect to classical ensemble methods. Therefore, we show that ensembles constructed using supervised subspace projections offer an effective way of creating classifiers in cheminformatics.

  6. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets.

    PubMed

    Stanescu, Ana; Caragea, Doina

    2015-01-01

    Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework.

  7. An empirical study of ensemble-based semi-supervised learning approaches for imbalanced splice site datasets

    PubMed Central

    2015-01-01

    Background Recent biochemical advances have led to inexpensive, time-efficient production of massive volumes of raw genomic data. Traditional machine learning approaches to genome annotation typically rely on large amounts of labeled data. The process of labeling data can be expensive, as it requires domain knowledge and expert involvement. Semi-supervised learning approaches that can make use of unlabeled data, in addition to small amounts of labeled data, can help reduce the costs associated with labeling. In this context, we focus on the problem of predicting splice sites in a genome using semi-supervised learning approaches. This is a challenging problem, due to the highly imbalanced distribution of the data, i.e., small number of splice sites as compared to the number of non-splice sites. To address this challenge, we propose to use ensembles of semi-supervised classifiers, specifically self-training and co-training classifiers. Results Our experiments on five highly imbalanced splice site datasets, with positive to negative ratios of 1-to-99, showed that the ensemble-based semi-supervised approaches represent a good choice, even when the amount of labeled data consists of less than 1% of all training data. In particular, we found that ensembles of co-training and self-training classifiers that dynamically balance the set of labeled instances during the semi-supervised iterations show improvements over the corresponding supervised ensemble baselines. Conclusions In the presence of limited amounts of labeled data, ensemble-based semi-supervised approaches can successfully leverage the unlabeled data to enhance supervised ensembles learned from highly imbalanced data distributions. Given that such distributions are common for many biological sequence classification problems, our work can be seen as a stepping stone towards more sophisticated ensemble-based approaches to biological sequence annotation in a semi-supervised framework. PMID:26356316

  8. Arabic Supervised Learning Method Using N-Gram

    ERIC Educational Resources Information Center

    Sanan, Majed; Rammal, Mahmoud; Zreik, Khaldoun

    2008-01-01

    Purpose: Recently, classification of Arabic documents is a real problem for juridical centers. In this case, some of the Lebanese official journal documents are classified, and the center has to classify new documents based on these documents. This paper aims to study and explain the useful application of supervised learning method on Arabic texts…

  9. Classification of earth terrain using polarimetric synthetic aperture radar images

    NASA Technical Reports Server (NTRS)

    Lim, H. H.; Swartz, A. A.; Yueh, H. A.; Kong, J. A.; Shin, R. T.; Van Zyl, J. J.

    1989-01-01

    Supervised and unsupervised classification techniques are developed and used to classify the earth terrain components from SAR polarimetric images of San Francisco Bay and Traverse City, Michigan. The supervised techniques include the Bayes classifiers, normalized polarimetric classification, and simple feature classification using discriminates such as the absolute and normalized magnitude response of individual receiver channel returns and the phase difference between receiver channels. An algorithm is developed as an unsupervised technique which classifies terrain elements based on the relationship between the orientation angle and the handedness of the transmitting and receiving polariation states. It is found that supervised classification produces the best results when accurate classifier training data are used, while unsupervised classification may be applied when training data are not available.

  10. Improved semi-supervised online boosting for object tracking

    NASA Astrophysics Data System (ADS)

    Li, Yicui; Qi, Lin; Tan, Shukun

    2016-10-01

    The advantage of an online semi-supervised boosting method which takes object tracking problem as a classification problem, is training a binary classifier from labeled and unlabeled examples. Appropriate object features are selected based on real time changes in the object. However, the online semi-supervised boosting method faces one key problem: The traditional self-training using the classification results to update the classifier itself, often leads to drifting or tracking failure, due to the accumulated error during each update of the tracker. To overcome the disadvantages of semi-supervised online boosting based on object tracking methods, the contribution of this paper is an improved online semi-supervised boosting method, in which the learning process is guided by positive (P) and negative (N) constraints, termed P-N constraints, which restrict the labeling of the unlabeled samples. First, we train the classification by an online semi-supervised boosting. Then, this classification is used to process the next frame. Finally, the classification is analyzed by the P-N constraints, which are used to verify if the labels of unlabeled data assigned by the classifier are in line with the assumptions made about positive and negative samples. The proposed algorithm can effectively improve the discriminative ability of the classifier and significantly alleviate the drifting problem in tracking applications. In the experiments, we demonstrate real-time tracking of our tracker on several challenging test sequences where our tracker outperforms other related on-line tracking methods and achieves promising tracking performance.

  11. Design of partially supervised classifiers for multispectral image data

    NASA Technical Reports Server (NTRS)

    Jeon, Byeungwoo; Landgrebe, David

    1993-01-01

    A partially supervised classification problem is addressed, especially when the class definition and corresponding training samples are provided a priori only for just one particular class. In practical applications of pattern classification techniques, a frequently observed characteristic is the heavy, often nearly impossible requirements on representative prior statistical class characteristics of all classes in a given data set. Considering the effort in both time and man-power required to have a well-defined, exhaustive list of classes with a corresponding representative set of training samples, this 'partially' supervised capability would be very desirable, assuming adequate classifier performance can be obtained. Two different classification algorithms are developed to achieve simplicity in classifier design by reducing the requirement of prior statistical information without sacrificing significant classifying capability. The first one is based on optimal significance testing, where the optimal acceptance probability is estimated directly from the data set. In the second approach, the partially supervised classification is considered as a problem of unsupervised clustering with initially one known cluster or class. A weighted unsupervised clustering procedure is developed to automatically define other classes and estimate their class statistics. The operational simplicity thus realized should make these partially supervised classification schemes very viable tools in pattern classification.

  12. Target discrimination method for SAR images based on semisupervised co-training

    NASA Astrophysics Data System (ADS)

    Wang, Yan; Du, Lan; Dai, Hui

    2018-01-01

    Synthetic aperture radar (SAR) target discrimination is usually performed in a supervised manner. However, supervised methods for SAR target discrimination may need lots of labeled training samples, whose acquirement is costly, time consuming, and sometimes impossible. This paper proposes an SAR target discrimination method based on semisupervised co-training, which utilizes a limited number of labeled samples and an abundant number of unlabeled samples. First, Lincoln features, widely used in SAR target discrimination, are extracted from the training samples and partitioned into two sets according to their physical meanings. Second, two support vector machine classifiers are iteratively co-trained with the extracted two feature sets based on the co-training algorithm. Finally, the trained classifiers are exploited to classify the test data. The experimental results on real SAR images data not only validate the effectiveness of the proposed method compared with the traditional supervised methods, but also demonstrate the superiority of co-training over self-training, which only uses one feature set.

  13. Determining Effects of Non-synonymous SNPs on Protein-Protein Interactions using Supervised and Semi-supervised Learning

    PubMed Central

    Zhao, Nan; Han, Jing Ginger; Shyu, Chi-Ren; Korkin, Dmitry

    2014-01-01

    Single nucleotide polymorphisms (SNPs) are among the most common types of genetic variation in complex genetic disorders. A growing number of studies link the functional role of SNPs with the networks and pathways mediated by the disease-associated genes. For example, many non-synonymous missense SNPs (nsSNPs) have been found near or inside the protein-protein interaction (PPI) interfaces. Determining whether such nsSNP will disrupt or preserve a PPI is a challenging task to address, both experimentally and computationally. Here, we present this task as three related classification problems, and develop a new computational method, called the SNP-IN tool (non-synonymous SNP INteraction effect predictor). Our method predicts the effects of nsSNPs on PPIs, given the interaction's structure. It leverages supervised and semi-supervised feature-based classifiers, including our new Random Forest self-learning protocol. The classifiers are trained based on a dataset of comprehensive mutagenesis studies for 151 PPI complexes, with experimentally determined binding affinities of the mutant and wild-type interactions. Three classification problems were considered: (1) a 2-class problem (strengthening/weakening PPI mutations), (2) another 2-class problem (mutations that disrupt/preserve a PPI), and (3) a 3-class classification (detrimental/neutral/beneficial mutation effects). In total, 11 different supervised and semi-supervised classifiers were trained and assessed resulting in a promising performance, with the weighted f-measure ranging from 0.87 for Problem 1 to 0.70 for the most challenging Problem 3. By integrating prediction results of the 2-class classifiers into the 3-class classifier, we further improved its performance for Problem 3. To demonstrate the utility of SNP-IN tool, it was applied to study the nsSNP-induced rewiring of two disease-centered networks. The accurate and balanced performance of SNP-IN tool makes it readily available to study the rewiring of large-scale protein-protein interaction networks, and can be useful for functional annotation of disease-associated SNPs. SNIP-IN tool is freely accessible as a web-server at http://korkinlab.org/snpintool/. PMID:24784581

  14. Automatic Classification Using Supervised Learning in a Medical Document Filtering Application.

    ERIC Educational Resources Information Center

    Mostafa, J.; Lam, W.

    2000-01-01

    Presents a multilevel model of the information filtering process that permits document classification. Evaluates a document classification approach based on a supervised learning algorithm, measures the accuracy of the algorithm in a neural network that was trained to classify medical documents on cell biology, and discusses filtering…

  15. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species

    PubMed Central

    Galpert, Deborah; del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification. PMID:26605337

  16. An Effective Big Data Supervised Imbalanced Classification Approach for Ortholog Detection in Related Yeast Species.

    PubMed

    Galpert, Deborah; Del Río, Sara; Herrera, Francisco; Ancede-Gallardo, Evys; Antunes, Agostinho; Agüero-Chapin, Guillermin

    2015-01-01

    Orthology detection requires more effective scaling algorithms. In this paper, a set of gene pair features based on similarity measures (alignment scores, sequence length, gene membership to conserved regions, and physicochemical profiles) are combined in a supervised pairwise ortholog detection approach to improve effectiveness considering low ortholog ratios in relation to the possible pairwise comparison between two genomes. In this scenario, big data supervised classifiers managing imbalance between ortholog and nonortholog pair classes allow for an effective scaling solution built from two genomes and extended to other genome pairs. The supervised approach was compared with RBH, RSD, and OMA algorithms by using the following yeast genome pairs: Saccharomyces cerevisiae-Kluyveromyces lactis, Saccharomyces cerevisiae-Candida glabrata, and Saccharomyces cerevisiae-Schizosaccharomyces pombe as benchmark datasets. Because of the large amount of imbalanced data, the building and testing of the supervised model were only possible by using big data supervised classifiers managing imbalance. Evaluation metrics taking low ortholog ratios into account were applied. From the effectiveness perspective, MapReduce Random Oversampling combined with Spark SVM outperformed RBH, RSD, and OMA, probably because of the consideration of gene pair features beyond alignment similarities combined with the advances in big data supervised classification.

  17. Evaluating pixel and object based image classification techniques for mapping plant invasions from UAV derived aerial imagery: Harrisia pomanensis as a case study

    NASA Astrophysics Data System (ADS)

    Mafanya, Madodomzi; Tsele, Philemon; Botai, Joel; Manyama, Phetole; Swart, Barend; Monate, Thabang

    2017-07-01

    Invasive alien plants (IAPs) not only pose a serious threat to biodiversity and water resources but also have impacts on human and animal wellbeing. To support decision making in IAPs monitoring, semi-automated image classifiers which are capable of extracting valuable information in remotely sensed data are vital. This study evaluated the mapping accuracies of supervised and unsupervised image classifiers for mapping Harrisia pomanensis (a cactus plant commonly known as the Midnight Lady) using two interlinked evaluation strategies i.e. point and area based accuracy assessment. Results of the point-based accuracy assessment show that with reference to 219 ground control points, the supervised image classifiers (i.e. Maxver and Bhattacharya) mapped H. pomanensis better than the unsupervised image classifiers (i.e. K-mediuns, Euclidian Length and Isoseg). In this regard, user and producer accuracies were 82.4% and 84% respectively for the Maxver classifier. The user and producer accuracies for the Bhattacharya classifier were 90% and 95.7%, respectively. Though the Maxver produced a higher overall accuracy and Kappa estimate than the Bhattacharya classifier, the Maxver Kappa estimate of 0.8305 is not significantly (statistically) greater than the Bhattacharya Kappa estimate of 0.8088 at a 95% confidence interval. The area based accuracy assessment results show that the Bhattacharya classifier estimated the spatial extent of H. pomanensis with an average mapping accuracy of 86.1% whereas the Maxver classifier only gave an average mapping accuracy of 65.2%. Based on these results, the Bhattacharya classifier is therefore recommended for mapping H. pomanensis. These findings will aid in the algorithm choice making for the development of a semi-automated image classification system for mapping IAPs.

  18. Quasi-Supervised Scoring of Human Sleep in Polysomnograms Using Augmented Input Variables

    PubMed Central

    Yaghouby, Farid; Sunderam, Sridhar

    2015-01-01

    The limitations of manual sleep scoring make computerized methods highly desirable. Scoring errors can arise from human rater uncertainty or inter-rater variability. Sleep scoring algorithms either come as supervised classifiers that need scored samples of each state to be trained, or as unsupervised classifiers that use heuristics or structural clues in unscored data to define states. We propose a quasi-supervised classifier that models observations in an unsupervised manner but mimics a human rater wherever training scores are available. EEG, EMG, and EOG features were extracted in 30s epochs from human-scored polysomnograms recorded from 42 healthy human subjects (18 to 79 years) and archived in an anonymized, publicly accessible database. Hypnograms were modified so that: 1. Some states are scored but not others; 2. Samples of all states are scored but not for transitional epochs; and 3. Two raters with 67% agreement are simulated. A framework for quasi-supervised classification was devised in which unsupervised statistical models—specifically Gaussian mixtures and hidden Markov models—are estimated from unlabeled training data, but the training samples are augmented with variables whose values depend on available scores. Classifiers were fitted to signal features incorporating partial scores, and used to predict scores for complete recordings. Performance was assessed using Cohen's K statistic. The quasi-supervised classifier performed significantly better than an unsupervised model and sometimes as well as a completely supervised model despite receiving only partial scores. The quasi-supervised algorithm addresses the need for classifiers that mimic scoring patterns of human raters while compensating for their limitations. PMID:25679475

  19. Quasi-supervised scoring of human sleep in polysomnograms using augmented input variables.

    PubMed

    Yaghouby, Farid; Sunderam, Sridhar

    2015-04-01

    The limitations of manual sleep scoring make computerized methods highly desirable. Scoring errors can arise from human rater uncertainty or inter-rater variability. Sleep scoring algorithms either come as supervised classifiers that need scored samples of each state to be trained, or as unsupervised classifiers that use heuristics or structural clues in unscored data to define states. We propose a quasi-supervised classifier that models observations in an unsupervised manner but mimics a human rater wherever training scores are available. EEG, EMG, and EOG features were extracted in 30s epochs from human-scored polysomnograms recorded from 42 healthy human subjects (18-79 years) and archived in an anonymized, publicly accessible database. Hypnograms were modified so that: 1. Some states are scored but not others; 2. Samples of all states are scored but not for transitional epochs; and 3. Two raters with 67% agreement are simulated. A framework for quasi-supervised classification was devised in which unsupervised statistical models-specifically Gaussian mixtures and hidden Markov models--are estimated from unlabeled training data, but the training samples are augmented with variables whose values depend on available scores. Classifiers were fitted to signal features incorporating partial scores, and used to predict scores for complete recordings. Performance was assessed using Cohen's Κ statistic. The quasi-supervised classifier performed significantly better than an unsupervised model and sometimes as well as a completely supervised model despite receiving only partial scores. The quasi-supervised algorithm addresses the need for classifiers that mimic scoring patterns of human raters while compensating for their limitations. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Towards Automatic Classification of Exoplanet-Transit-Like Signals: A Case Study on Kepler Mission Data

    NASA Astrophysics Data System (ADS)

    Valizadegan, Hamed; Martin, Rodney; McCauliff, Sean D.; Jenkins, Jon Michael; Catanzarite, Joseph; Oza, Nikunj C.

    2015-08-01

    Building new catalogues of planetary candidates, astrophysical false alarms, and non-transiting phenomena is a challenging task that currently requires a reviewing team of astrophysicists and astronomers. These scientists need to examine more than 100 diagnostic metrics and associated graphics for each candidate exoplanet-transit-like signal to classify it into one of the three classes. Considering that the NASA Explorer Program's TESS mission and ESA's PLATO mission survey even a larger area of space, the classification of their transit-like signals is more time-consuming for human agents and a bottleneck to successfully construct the new catalogues in a timely manner. This encourages building automatic classification tools that can quickly and reliably classify the new signal data from these missions. The standard tool for building automatic classification systems is the supervised machine learning that requires a large set of highly accurate labeled examples in order to build an effective classifier. This requirement cannot be easily met for classifying transit-like signals because not only are existing labeled signals very limited, but also the current labels may not be reliable (because the labeling process is a subjective task). Our experiments with using different supervised classifiers to categorize transit-like signals verifies that the labeled signals are not rich enough to provide the classifier with enough power to generalize well beyond the observed cases (e.g. to unseen or test signals). That motivated us to utilize a new category of learning techniques, so-called semi-supervised learning, that combines the label information from the costly labeled signals, and distribution information from the cheaply available unlabeled signals in order to construct more effective classifiers. Our study on the Kepler Mission data shows that semi-supervised learning can significantly improve the result of multiple base classifiers (e.g. Support Vector Machines, AdaBoost, and Decision Tree) and is a good technique for automatic classification of exoplanet-transit-like signal.

  1. A Label Propagation Approach for Detecting Buried Objects in Handheld GPR Data

    DTIC Science & Technology

    2016-04-17

    regions of interest that correspond to locations with anomalous signatures. Second, a classifier (or an ensemble of classifiers ) is used to assign a...investigated for almost two decades and several classifiers have been developed. Most of these methods are based on the supervised learning paradigm where...labeled target and clutter signatures are needed to train a classifier to discriminate between the two classes. Typically, large and diverse labeled

  2. Semi-supervised vibration-based classification and condition monitoring of compressors

    NASA Astrophysics Data System (ADS)

    Potočnik, Primož; Govekar, Edvard

    2017-09-01

    Semi-supervised vibration-based classification and condition monitoring of the reciprocating compressors installed in refrigeration appliances is proposed in this paper. The method addresses the problem of industrial condition monitoring where prior class definitions are often not available or difficult to obtain from local experts. The proposed method combines feature extraction, principal component analysis, and statistical analysis for the extraction of initial class representatives, and compares the capability of various classification methods, including discriminant analysis (DA), neural networks (NN), support vector machines (SVM), and extreme learning machines (ELM). The use of the method is demonstrated on a case study which was based on industrially acquired vibration measurements of reciprocating compressors during the production of refrigeration appliances. The paper presents a comparative qualitative analysis of the applied classifiers, confirming the good performance of several nonlinear classifiers. If the model parameters are properly selected, then very good classification performance can be obtained from NN trained by Bayesian regularization, SVM and ELM classifiers. The method can be effectively applied for the industrial condition monitoring of compressors.

  3. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning.

    PubMed

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-03-15

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood.

  4. Rapid classification of heavy metal-exposed freshwater bacteria by infrared spectroscopy coupled with chemometrics using supervised method

    NASA Astrophysics Data System (ADS)

    Gurbanov, Rafig; Gozen, Ayse Gul; Severcan, Feride

    2018-01-01

    Rapid, cost-effective, sensitive and accurate methodologies to classify bacteria are still in the process of development. The major drawbacks of standard microbiological, molecular and immunological techniques call for the possible usage of infrared (IR) spectroscopy based supervised chemometric techniques. Previous applications of IR based chemometric methods have demonstrated outstanding findings in the classification of bacteria. Therefore, we have exploited an IR spectroscopy based chemometrics using supervised method namely Soft Independent Modeling of Class Analogy (SIMCA) technique for the first time to classify heavy metal-exposed bacteria to be used in the selection of suitable bacteria to evaluate their potential for environmental cleanup applications. Herein, we present the powerful differentiation and classification of laboratory strains (Escherichia coli and Staphylococcus aureus) and environmental isolates (Gordonia sp. and Microbacterium oxydans) of bacteria exposed to growth inhibitory concentrations of silver (Ag), cadmium (Cd) and lead (Pb). Our results demonstrated that SIMCA was able to differentiate all heavy metal-exposed and control groups from each other with 95% confidence level. Correct identification of randomly chosen test samples in their corresponding groups and high model distances between the classes were also achieved. We report, for the first time, the success of IR spectroscopy coupled with supervised chemometric technique SIMCA in classification of different bacteria under a given treatment.

  5. Comparison Between Supervised and Unsupervised Classifications of Neuronal Cell Types: A Case Study

    PubMed Central

    Guerra, Luis; McGarry, Laura M; Robles, Víctor; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael

    2011-01-01

    In the study of neural circuits, it becomes essential to discern the different neuronal cell types that build the circuit. Traditionally, neuronal cell types have been classified using qualitative descriptors. More recently, several attempts have been made to classify neurons quantitatively, using unsupervised clustering methods. While useful, these algorithms do not take advantage of previous information known to the investigator, which could improve the classification task. For neocortical GABAergic interneurons, the problem to discern among different cell types is particularly difficult and better methods are needed to perform objective classifications. Here we explore the use of supervised classification algorithms to classify neurons based on their morphological features, using a database of 128 pyramidal cells and 199 interneurons from mouse neocortex. To evaluate the performance of different algorithms we used, as a “benchmark,” the test to automatically distinguish between pyramidal cells and interneurons, defining “ground truth” by the presence or absence of an apical dendrite. We compared hierarchical clustering with a battery of different supervised classification algorithms, finding that supervised classifications outperformed hierarchical clustering. In addition, the selection of subsets of distinguishing features enhanced the classification accuracy for both sets of algorithms. The analysis of selected variables indicates that dendritic features were most useful to distinguish pyramidal cells from interneurons when compared with somatic and axonal morphological variables. We conclude that supervised classification algorithms are better matched to the general problem of distinguishing neuronal cell types when some information on these cell groups, in our case being pyramidal or interneuron, is known a priori. As a spin-off of this methodological study, we provide several methods to automatically distinguish neocortical pyramidal cells from interneurons, based on their morphologies. © 2010 Wiley Periodicals, Inc. Develop Neurobiol 71: 71–82, 2011 PMID:21154911

  6. Quality-Related Monitoring and Grading of Granulated Products by Weibull-Distribution Modeling of Visual Images with Semi-Supervised Learning.

    PubMed

    Liu, Jinping; Tang, Zhaohui; Xu, Pengfei; Liu, Wenzhong; Zhang, Jin; Zhu, Jianyong

    2016-06-29

    The topic of online product quality inspection (OPQI) with smart visual sensors is attracting increasing interest in both the academic and industrial communities on account of the natural connection between the visual appearance of products with their underlying qualities. Visual images captured from granulated products (GPs), e.g., cereal products, fabric textiles, are comprised of a large number of independent particles or stochastically stacking locally homogeneous fragments, whose analysis and understanding remains challenging. A method of image statistical modeling-based OPQI for GP quality grading and monitoring by a Weibull distribution(WD) model with a semi-supervised learning classifier is presented. WD-model parameters (WD-MPs) of GP images' spatial structures, obtained with omnidirectional Gaussian derivative filtering (OGDF), which were demonstrated theoretically to obey a specific WD model of integral form, were extracted as the visual features. Then, a co-training-style semi-supervised classifier algorithm, named COSC-Boosting, was exploited for semi-supervised GP quality grading, by integrating two independent classifiers with complementary nature in the face of scarce labeled samples. Effectiveness of the proposed OPQI method was verified and compared in the field of automated rice quality grading with commonly-used methods and showed superior performance, which lays a foundation for the quality control of GP on assembly lines.

  7. Support vector machines

    NASA Technical Reports Server (NTRS)

    Garay, Michael J.; Mazzoni, Dominic; Davies, Roger; Wagstaff, Kiri

    2004-01-01

    Support Vector Machines (SVMs) are a type of supervised learning algorith,, other examples of which are Artificial Neural Networks (ANNs), Decision Trees, and Naive Bayesian Classifiers. Supervised learning algorithms are used to classify objects labled by a 'supervisor' - typically a human 'expert.'.

  8. Voxel-Based Neighborhood for Spatial Shape Pattern Classification of Lidar Point Clouds with Supervised Learning

    PubMed Central

    Plaza-Leiva, Victoria; Gomez-Ruiz, Jose Antonio; Mandow, Anthony; García-Cerezo, Alfonso

    2017-01-01

    Improving the effectiveness of spatial shape features classification from 3D lidar data is very relevant because it is largely used as a fundamental step towards higher level scene understanding challenges of autonomous vehicles and terrestrial robots. In this sense, computing neighborhood for points in dense scans becomes a costly process for both training and classification. This paper proposes a new general framework for implementing and comparing different supervised learning classifiers with a simple voxel-based neighborhood computation where points in each non-overlapping voxel in a regular grid are assigned to the same class by considering features within a support region defined by the voxel itself. The contribution provides offline training and online classification procedures as well as five alternative feature vector definitions based on principal component analysis for scatter, tubular and planar shapes. Moreover, the feasibility of this approach is evaluated by implementing a neural network (NN) method previously proposed by the authors as well as three other supervised learning classifiers found in scene processing methods: support vector machines (SVM), Gaussian processes (GP), and Gaussian mixture models (GMM). A comparative performance analysis is presented using real point clouds from both natural and urban environments and two different 3D rangefinders (a tilting Hokuyo UTM-30LX and a Riegl). Classification performance metrics and processing time measurements confirm the benefits of the NN classifier and the feasibility of voxel-based neighborhood. PMID:28294963

  9. Surveying alignment-free features for Ortholog detection in related yeast proteomes by using supervised big data classifiers.

    PubMed

    Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin

    2018-05-03

    The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Despite several pairwise protein features being combined in a supervised big data approach; they all, to some extent were alignment-based features and the proposed algorithms were evaluated on a unique test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, and built with only alignment-based similarity measures or combined with several alignment-free pairwise protein features showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Just when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management, a higher success rate (98.71%) within the twilight zone could be achieved for a yeast proteome pair that underwent a whole genome duplication. The feature selection study showed that alignment-based features were top-ranked for the best classifiers while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes regarding the classification qualities achieved with just alignment-based similarity measures. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.

  10. True Zero-Training Brain-Computer Interfacing – An Online Study

    PubMed Central

    Kindermans, Pieter-Jan; Schreuder, Martijn; Schrauwen, Benjamin; Müller, Klaus-Robert; Tangermann, Michael

    2014-01-01

    Despite several approaches to realize subject-to-subject transfer of pre-trained classifiers, the full performance of a Brain-Computer Interface (BCI) for a novel user can only be reached by presenting the BCI system with data from the novel user. In typical state-of-the-art BCI systems with a supervised classifier, the labeled data is collected during a calibration recording, in which the user is asked to perform a specific task. Based on the known labels of this recording, the BCI's classifier can learn to decode the individual's brain signals. Unfortunately, this calibration recording consumes valuable time. Furthermore, it is unproductive with respect to the final BCI application, e.g. text entry. Therefore, the calibration period must be reduced to a minimum, which is especially important for patients with a limited concentration ability. The main contribution of this manuscript is an online study on unsupervised learning in an auditory event-related potential (ERP) paradigm. Our results demonstrate that the calibration recording can be bypassed by utilizing an unsupervised trained classifier, that is initialized randomly and updated during usage. Initially, the unsupervised classifier tends to make decoding mistakes, as the classifier might not have seen enough data to build a reliable model. Using a constant re-analysis of the previously spelled symbols, these initially misspelled symbols can be rectified posthoc when the classifier has learned to decode the signals. We compare the spelling performance of our unsupervised approach and of the unsupervised posthoc approach to the standard supervised calibration-based dogma for n = 10 healthy users. To assess the learning behavior of our approach, it is unsupervised trained from scratch three times per user. Even with the relatively low SNR of an auditory ERP paradigm, the results show that after a limited number of trials (30 trials), the unsupervised approach performs comparably to a classic supervised model. PMID:25068464

  11. A review of supervised object-based land-cover image classification

    NASA Astrophysics Data System (ADS)

    Ma, Lei; Li, Manchun; Ma, Xiaoxue; Cheng, Liang; Du, Peijun; Liu, Yongxue

    2017-08-01

    Object-based image classification for land-cover mapping purposes using remote-sensing imagery has attracted significant attention in recent years. Numerous studies conducted over the past decade have investigated a broad array of sensors, feature selection, classifiers, and other factors of interest. However, these research results have not yet been synthesized to provide coherent guidance on the effect of different supervised object-based land-cover classification processes. In this study, we first construct a database with 28 fields using qualitative and quantitative information extracted from 254 experimental cases described in 173 scientific papers. Second, the results of the meta-analysis are reported, including general characteristics of the studies (e.g., the geographic range of relevant institutes, preferred journals) and the relationships between factors of interest (e.g., spatial resolution and study area or optimal segmentation scale, accuracy and number of targeted classes), especially with respect to the classification accuracy of different sensors, segmentation scale, training set size, supervised classifiers, and land-cover types. Third, useful data on supervised object-based image classification are determined from the meta-analysis. For example, we find that supervised object-based classification is currently experiencing rapid advances, while development of the fuzzy technique is limited in the object-based framework. Furthermore, spatial resolution correlates with the optimal segmentation scale and study area, and Random Forest (RF) shows the best performance in object-based classification. The area-based accuracy assessment method can obtain stable classification performance, and indicates a strong correlation between accuracy and training set size, while the accuracy of the point-based method is likely to be unstable due to mixed objects. In addition, the overall accuracy benefits from higher spatial resolution images (e.g., unmanned aerial vehicle) or agricultural sites where it also correlates with the number of targeted classes. More than 95.6% of studies involve an area less than 300 ha, and the spatial resolution of images is predominantly between 0 and 2 m. Furthermore, we identify some methods that may advance supervised object-based image classification. For example, deep learning and type-2 fuzzy techniques may further improve classification accuracy. Lastly, scientists are strongly encouraged to report results of uncertainty studies to further explore the effects of varied factors on supervised object-based image classification.

  12. Supervised Variational Relevance Learning, An Analytic Geometric Feature Selection with Applications to Omic Datasets.

    PubMed

    Boareto, Marcelo; Cesar, Jonatas; Leite, Vitor B P; Caticha, Nestor

    2015-01-01

    We introduce Supervised Variational Relevance Learning (Suvrel), a variational method to determine metric tensors to define distance based similarity in pattern classification, inspired in relevance learning. The variational method is applied to a cost function that penalizes large intraclass distances and favors small interclass distances. We find analytically the metric tensor that minimizes the cost function. Preprocessing the patterns by doing linear transformations using the metric tensor yields a dataset which can be more efficiently classified. We test our methods using publicly available datasets, for some standard classifiers. Among these datasets, two were tested by the MAQC-II project and, even without the use of further preprocessing, our results improve on their performance.

  13. A semi-supervised Support Vector Machine model for predicting the language outcomes following cochlear implantation based on pre-implant brain fMRI imaging.

    PubMed

    Tan, Lirong; Holland, Scott K; Deshpande, Aniruddha K; Chen, Ye; Choo, Daniel I; Lu, Long J

    2015-12-01

    We developed a machine learning model to predict whether or not a cochlear implant (CI) candidate will develop effective language skills within 2 years after the CI surgery by using the pre-implant brain fMRI data from the candidate. The language performance was measured 2 years after the CI surgery by the Clinical Evaluation of Language Fundamentals-Preschool, Second Edition (CELF-P2). Based on the CELF-P2 scores, the CI recipients were designated as either effective or ineffective CI users. For feature extraction from the fMRI data, we constructed contrast maps using the general linear model, and then utilized the Bag-of-Words (BoW) approach that we previously published to convert the contrast maps into feature vectors. We trained both supervised models and semi-supervised models to classify CI users as effective or ineffective. Compared with the conventional feature extraction approach, which used each single voxel as a feature, our BoW approach gave rise to much better performance for the classification of effective versus ineffective CI users. The semi-supervised model with the feature set extracted by the BoW approach from the contrast of speech versus silence achieved a leave-one-out cross-validation AUC as high as 0.97. Recursive feature elimination unexpectedly revealed that two features were sufficient to provide highly accurate classification of effective versus ineffective CI users based on our current dataset. We have validated the hypothesis that pre-implant cortical activation patterns revealed by fMRI during infancy correlate with language performance 2 years after cochlear implantation. The two brain regions highlighted by our classifier are potential biomarkers for the prediction of CI outcomes. Our study also demonstrated the superiority of the semi-supervised model over the supervised model. It is always worthwhile to try a semi-supervised model when unlabeled data are available.

  14. Quality-Related Monitoring and Grading of Granulated Products by Weibull-Distribution Modeling of Visual Images with Semi-Supervised Learning

    PubMed Central

    Liu, Jinping; Tang, Zhaohui; Xu, Pengfei; Liu, Wenzhong; Zhang, Jin; Zhu, Jianyong

    2016-01-01

    The topic of online product quality inspection (OPQI) with smart visual sensors is attracting increasing interest in both the academic and industrial communities on account of the natural connection between the visual appearance of products with their underlying qualities. Visual images captured from granulated products (GPs), e.g., cereal products, fabric textiles, are comprised of a large number of independent particles or stochastically stacking locally homogeneous fragments, whose analysis and understanding remains challenging. A method of image statistical modeling-based OPQI for GP quality grading and monitoring by a Weibull distribution(WD) model with a semi-supervised learning classifier is presented. WD-model parameters (WD-MPs) of GP images’ spatial structures, obtained with omnidirectional Gaussian derivative filtering (OGDF), which were demonstrated theoretically to obey a specific WD model of integral form, were extracted as the visual features. Then, a co-training-style semi-supervised classifier algorithm, named COSC-Boosting, was exploited for semi-supervised GP quality grading, by integrating two independent classifiers with complementary nature in the face of scarce labeled samples. Effectiveness of the proposed OPQI method was verified and compared in the field of automated rice quality grading with commonly-used methods and showed superior performance, which lays a foundation for the quality control of GP on assembly lines. PMID:27367703

  15. Wire connector classification with machine vision and a novel hybrid SVM

    NASA Astrophysics Data System (ADS)

    Chauhan, Vedang; Joshi, Keyur D.; Surgenor, Brian W.

    2018-04-01

    A machine vision-based system has been developed and tested that uses a novel hybrid Support Vector Machine (SVM) in a part inspection application with clear plastic wire connectors. The application required the system to differentiate between 4 different known styles of connectors plus one unknown style, for a total of 5 classes. The requirement to handle an unknown class is what necessitated the hybrid approach. The system was trained with the 4 known classes and tested with 5 classes (the 4 known plus the 1 unknown). The hybrid classification approach used two layers of SVMs: one layer was semi-supervised and the other layer was supervised. The semi-supervised SVM was a special case of unsupervised machine learning that classified test images as one of the 4 known classes (to accept) or as the unknown class (to reject). The supervised SVM classified test images as one of the 4 known classes and consequently would give false positives (FPs). Two methods were tested. The difference between the methods was that the order of the layers was switched. The method with the semi-supervised layer first gave an accuracy of 80% with 20% FPs. The method with the supervised layer first gave an accuracy of 98% with 0% FPs. Further work is being conducted to see if the hybrid approach works with other applications that have an unknown class requirement.

  16. Optimizing area under the ROC curve using semi-supervised learning

    PubMed Central

    Wang, Shijun; Li, Diana; Petrick, Nicholas; Sahiner, Berkman; Linguraru, Marius George; Summers, Ronald M.

    2014-01-01

    Receiver operating characteristic (ROC) analysis is a standard methodology to evaluate the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that utilize only labeled data (i.e., the true class is known for all data) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms hereby referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which utilize unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are considered as optimization constraints. The introduced test samples will cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on the margin maximization theory. The proposed methods SSLROC1 (1-norm) and SSLROC2 (2-norm) were evaluated using 34 (determined by power analysis) randomly selected datasets from the University of California, Irvine machine learning repository. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results.1 PMID:25395692

  17. Optimizing area under the ROC curve using semi-supervised learning.

    PubMed

    Wang, Shijun; Li, Diana; Petrick, Nicholas; Sahiner, Berkman; Linguraru, Marius George; Summers, Ronald M

    2015-01-01

    Receiver operating characteristic (ROC) analysis is a standard methodology to evaluate the performance of a binary classification system. The area under the ROC curve (AUC) is a performance metric that summarizes how well a classifier separates two classes. Traditional AUC optimization techniques are supervised learning methods that utilize only labeled data (i.e., the true class is known for all data) to train the classifiers. In this work, inspired by semi-supervised and transductive learning, we propose two new AUC optimization algorithms hereby referred to as semi-supervised learning receiver operating characteristic (SSLROC) algorithms, which utilize unlabeled test samples in classifier training to maximize AUC. Unlabeled samples are incorporated into the AUC optimization process, and their ranking relationships to labeled positive and negative training samples are considered as optimization constraints. The introduced test samples will cause the learned decision boundary in a multidimensional feature space to adapt not only to the distribution of labeled training data, but also to the distribution of unlabeled test data. We formulate the semi-supervised AUC optimization problem as a semi-definite programming problem based on the margin maximization theory. The proposed methods SSLROC1 (1-norm) and SSLROC2 (2-norm) were evaluated using 34 (determined by power analysis) randomly selected datasets from the University of California, Irvine machine learning repository. Wilcoxon signed rank tests showed that the proposed methods achieved significant improvement compared with state-of-the-art methods. The proposed methods were also applied to a CT colonography dataset for colonic polyp classification and showed promising results.

  18. Classification of cirrhotic liver in Gadolinium-enhanced MR images

    NASA Astrophysics Data System (ADS)

    Lee, Gobert; Uchiyama, Yoshikazu; Zhang, Xuejun; Kanematsu, Masayuki; Zhou, Xiangrong; Hara, Takeshi; Kato, Hiroki; Kondo, Hiroshi; Fujita, Hiroshi; Hoshi, Hiroaki

    2007-03-01

    Cirrhosis of the liver is characterized by the presence of widespread nodules and fibrosis in the liver. The fibrosis and nodules formation causes distortion of the normal liver architecture, resulting in characteristic texture patterns. Texture patterns are commonly analyzed with the use of co-occurrence matrix based features measured on regions-of-interest (ROIs). A classifier is subsequently used for the classification of cirrhotic or non-cirrhotic livers. Problem arises if the classifier employed falls into the category of supervised classifier which is a popular choice. This is because the 'true disease states' of the ROIs are required for the training of the classifier but is, generally, not available. A common approach is to adopt the 'true disease state' of the liver as the 'true disease state' of all ROIs in that liver. This paper investigates the use of a nonsupervised classifier, the k-means clustering method in classifying livers as cirrhotic or non-cirrhotic using unlabelled ROI data. A preliminary result with a sensitivity and specificity of 72% and 60%, respectively, demonstrates the feasibility of using the k-means non-supervised clustering method in generating a characteristic cluster structure that could facilitate the classification of cirrhotic and non-cirrhotic livers.

  19. Classifying forest inventory data into species-based forest community types at broad extents: exploring tradeoffs among supervised and unsupervised approaches

    Treesearch

    Jennifer K. Costanza; Don Faber-Langendoen; John W. Coulston; David N. Wear

    2018-01-01

    Background: Knowledge of the different kinds of tree communities that currently exist can provide a baseline for assessing the ecological attributes of forests and monitoring future changes. Forest inventory data can facilitate the development of this baseline knowledge across broad extents, but they first must be classified into forest...

  20. Joint Sparse Recovery With Semisupervised MUSIC

    NASA Astrophysics Data System (ADS)

    Wen, Zaidao; Hou, Biao; Jiao, Licheng

    2017-05-01

    Discrete multiple signal classification (MUSIC) with its low computational cost and mild condition requirement becomes a significant noniterative algorithm for joint sparse recovery (JSR). However, it fails in rank defective problem caused by coherent or limited amount of multiple measurement vectors (MMVs). In this letter, we provide a novel sight to address this problem by interpreting JSR as a binary classification problem with respect to atoms. Meanwhile, MUSIC essentially constructs a supervised classifier based on the labeled MMVs so that its performance will heavily depend on the quality and quantity of these training samples. From this viewpoint, we develop a semisupervised MUSIC (SS-MUSIC) in the spirit of machine learning, which declares that the insufficient supervised information in the training samples can be compensated from those unlabeled atoms. Instead of constructing a classifier in a fully supervised manner, we iteratively refine a semisupervised classifier by exploiting the labeled MMVs and some reliable unlabeled atoms simultaneously. Through this way, the required conditions and iterations can be greatly relaxed and reduced. Numerical experimental results demonstrate that SS-MUSIC can achieve much better recovery performances than other MUSIC extended algorithms as well as some typical greedy algorithms for JSR in terms of iterations and recovery probability.

  1. QUEST: Eliminating Online Supervised Learning for Efficient Classification Algorithms.

    PubMed

    Zwartjes, Ardjan; Havinga, Paul J M; Smit, Gerard J M; Hurink, Johann L

    2016-10-01

    In this work, we introduce QUEST (QUantile Estimation after Supervised Training), an adaptive classification algorithm for Wireless Sensor Networks (WSNs) that eliminates the necessity for online supervised learning. Online processing is important for many sensor network applications. Transmitting raw sensor data puts high demands on the battery, reducing network life time. By merely transmitting partial results or classifications based on the sampled data, the amount of traffic on the network can be significantly reduced. Such classifications can be made by learning based algorithms using sampled data. An important issue, however, is the training phase of these learning based algorithms. Training a deployed sensor network requires a lot of communication and an impractical amount of human involvement. QUEST is a hybrid algorithm that combines supervised learning in a controlled environment with unsupervised learning on the location of deployment. Using the SITEX02 dataset, we demonstrate that the presented solution works with a performance penalty of less than 10% in 90% of the tests. Under some circumstances, it even outperforms a network of classifiers completely trained with supervised learning. As a result, the need for on-site supervised learning and communication for training is completely eliminated by our solution.

  2. Global Optimization Ensemble Model for Classification Methods

    PubMed Central

    Anwar, Hina; Qamar, Usman; Muzaffar Qureshi, Abdul Wahab

    2014-01-01

    Supervised learning is the process of data mining for deducing rules from training datasets. A broad array of supervised learning algorithms exists, every one of them with its own advantages and drawbacks. There are some basic issues that affect the accuracy of classifier while solving a supervised learning problem, like bias-variance tradeoff, dimensionality of input space, and noise in the input data space. All these problems affect the accuracy of classifier and are the reason that there is no global optimal method for classification. There is not any generalized improvement method that can increase the accuracy of any classifier while addressing all the problems stated above. This paper proposes a global optimization ensemble model for classification methods (GMC) that can improve the overall accuracy for supervised learning problems. The experimental results on various public datasets showed that the proposed model improved the accuracy of the classification models from 1% to 30% depending upon the algorithm complexity. PMID:24883382

  3. A software package for interactive motor unit potential classification using fuzzy k-NN classifier.

    PubMed

    Rasheed, Sarbast; Stashuk, Daniel; Kamel, Mohamed

    2008-01-01

    We present an interactive software package for implementing the supervised classification task during electromyographic (EMG) signal decomposition process using a fuzzy k-NN classifier and utilizing the MATLAB high-level programming language and its interactive environment. The method employs an assertion-based classification that takes into account a combination of motor unit potential (MUP) shapes and two modes of use of motor unit firing pattern information: the passive and the active modes. The developed package consists of several graphical user interfaces used to detect individual MUP waveforms from a raw EMG signal, extract relevant features, and classify the MUPs into motor unit potential trains (MUPTs) using assertion-based classifiers.

  4. Supervised DNA Barcodes species classification: analysis, comparisons and results

    PubMed Central

    2014-01-01

    Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results A software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for handling with success the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333

  5. An online semi-supervised brain-computer interface.

    PubMed

    Gu, Zhenghui; Yu, Zhuliang; Shen, Zhifang; Li, Yuanqing

    2013-09-01

    Practical brain-computer interface (BCI) systems should require only low training effort for the user, and the algorithms used to classify the intent of the user should be computationally efficient. However, due to inter- and intra-subject variations in EEG signal, intermittent training/calibration is often unavoidable. In this paper, we present an online semi-supervised P300 BCI speller system. After a short initial training (around or less than 1 min in our experiments), the system is switched to a mode where the user can input characters through selective attention. In this mode, a self-training least squares support vector machine (LS-SVM) classifier is gradually enhanced in back end with the unlabeled EEG data collected online after every character input. In this way, the classifier is gradually enhanced. Even though the user may experience some errors in input at the beginning due to the small initial training dataset, the accuracy approaches that of fully supervised method in a few minutes. The algorithm based on LS-SVM and its sequential update has low computational complexity; thus, it is suitable for online applications. The effectiveness of the algorithm has been validated through data analysis on BCI Competition III dataset II (P300 speller BCI data). The performance of the online system was evaluated through experimental results on eight healthy subjects, where all of them achieved the spelling accuracy of 85 % or above within an average online semi-supervised learning time of around 3 min.

  6. Land cover mapping after the tsunami event over Nanggroe Aceh Darussalam (NAD) province, Indonesia

    NASA Astrophysics Data System (ADS)

    Lim, H. S.; MatJafri, M. Z.; Abdullah, K.; Alias, A. N.; Mohd. Saleh, N.; Wong, C. J.; Surbakti, M. S.

    2008-03-01

    Remote sensing offers an important means of detecting and analyzing temporal changes occurring in our landscape. This research used remote sensing to quantify land use/land cover changes at the Nanggroe Aceh Darussalam (Nad) province, Indonesia on a regional scale. The objective of this paper is to assess the changed produced from the analysis of Landsat TM data. A Landsat TM image was used to develop land cover classification map for the 27 March 2005. Four supervised classifications techniques (Maximum Likelihood, Minimum Distance-to- Mean, Parallelepiped and Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier) were performed to the satellite image. Training sites and accuracy assessment were needed for supervised classification techniques. The training sites were established using polygons based on the colour image. High detection accuracy (>80%) and overall Kappa (>0.80) were achieved by the Parallelepiped with Maximum Likelihood Classifier Tiebreaker classifier in this study. This preliminary study has produced a promising result. This indicates that land cover mapping can be carried out using remote sensing classification method of the satellite digital imagery.

  7. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach.

    PubMed

    Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C

    2017-12-01

    The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237), and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The convolutional recurrent neural network with neural word embeddings trained-medical subdomain classifier yielded the best performance measurement on iDASH and MGH datasets with area under receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf)-weighting, outperformed other shallow learning classifiers on iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934 respectively. We trained classifiers on one dataset, applied to the other dataset and yielded the threshold of F1 score of 0.7 in classifiers for half of the medical subdomains we studied. Our study shows that a supervised learning-based NLP approach is useful to develop medical subdomain classifiers. The deep learning algorithm with distributed word representation yields better performance yet shallow learning algorithms with the word and concept representation achieves comparable performance with better clinical interpretability. Portable classifiers may also be used across datasets from different institutions.

  8. Biological classification with RNA-Seq data: Can alternatively spliced transcript expression enhance machine learning classifier?

    PubMed

    Johnson, Nathan T; Dhroso, Andi; Hughes, Katelyn J; Korkin, Dmitry

    2018-06-25

    The extent to which the genes are expressed in the cell can be simplistically defined as a function of one or more factors of the environment, lifestyle, and genetics. RNA sequencing (RNA-Seq) is becoming a prevalent approach to quantify gene expression, and is expected to gain better insights to a number of biological and biomedical questions, compared to the DNA microarrays. Most importantly, RNA-Seq allows to quantify expression at the gene and alternative splicing isoform levels. However, leveraging the RNA-Seq data requires development of new data mining and analytics methods. Supervised machine learning methods are commonly used approaches for biological data analysis, and have recently gained attention for their applications to the RNA-Seq data. In this work, we assess the utility of supervised learning methods trained on RNA-Seq data for a diverse range of biological classification tasks. We hypothesize that the isoform-level expression data is more informative for biological classification tasks than the gene-level expression data. Our large-scale assessment is done through utilizing multiple datasets, organisms, lab groups, and RNA-Seq analysis pipelines. Overall, we performed and assessed 61 biological classification problems that leverage three independent RNA-Seq datasets and include over 2,000 samples that come from multiple organisms, lab groups, and RNA-Seq analyses. These 61 problems include predictions of the tissue type, sex, or age of the sample, healthy or cancerous phenotypes and, the pathological tumor stage for the samples from the cancerous tissue. For each classification problem, the performance of three normalization techniques and six machine learning classifiers was explored. We find that for every single classification problem, the isoform-based classifiers outperform or are comparable with gene expression based methods. The top-performing supervised learning techniques reached a near perfect classification accuracy, demonstrating the utility of supervised learning for RNA-Seq based data analysis. Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  9. Semi-supervised protein subcellular localization.

    PubMed

    Xu, Qian; Hu, Derek Hao; Xue, Hong; Yu, Weichuan; Yang, Qiang

    2009-01-30

    Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational method. The location information can indicate key functionalities of proteins. Accurate predictions of subcellular localizations of protein can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine approaches, have already been widely used in the prediction of protein subcellular localization. However, a major drawback of these machine learning-based approaches is that a large amount of data should be labeled in order to let the prediction system learn a classifier of good generalization ability. However, in real world cases, it is laborious, expensive and time-consuming to experimentally determine the subcellular localization of a protein and prepare instances of labeled data. In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use much fewer labeled instances to construct a high quality prediction model. We construct an initial classifier using a small set of labeled examples first, and then use unlabeled instances to refine the classifier for future predictions. Experimental results show that our methods can effectively reduce the workload for labeling data using the unlabeled data. Our method is shown to enhance the state-of-the-art prediction results of SVM classifiers by more than 10%.

  10. Physical Human Activity Recognition Using Wearable Sensors.

    PubMed

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-12-11

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.

  11. Physical Human Activity Recognition Using Wearable Sensors

    PubMed Central

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-01-01

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450

  12. A semi-supervised classification algorithm using the TAD-derived background as training data

    NASA Astrophysics Data System (ADS)

    Fan, Lei; Ambeau, Brittany; Messinger, David W.

    2013-05-01

    In general, spectral image classification algorithms fall into one of two categories: supervised and unsupervised. In unsupervised approaches, the algorithm automatically identifies clusters in the data without a priori information about those clusters (except perhaps the expected number of them). Supervised approaches require an analyst to identify training data to learn the characteristics of the clusters such that they can then classify all other pixels into one of the pre-defined groups. The classification algorithm presented here is a semi-supervised approach based on the Topological Anomaly Detection (TAD) algorithm. The TAD algorithm defines background components based on a mutual k-Nearest Neighbor graph model of the data, along with a spectral connected components analysis. Here, the largest components produced by TAD are used as regions of interest (ROI's),or training data for a supervised classification scheme. By combining those ROI's with a Gaussian Maximum Likelihood (GML) or a Minimum Distance to the Mean (MDM) algorithm, we are able to achieve a semi supervised classification method. We test this classification algorithm against data collected by the HyMAP sensor over the Cooke City, MT area and University of Pavia scene.

  13. Cognitions of Expert Supervisors in Academe: A Concept Mapping Approach

    ERIC Educational Resources Information Center

    Kemer, Gülsah; Borders, L. DiAnne; Willse, John

    2014-01-01

    Eighteen expert supervisors reported their thoughts while preparing for, conducting, and evaluating their supervision sessions. Concept mapping (Kane & Trochim, [Kane, M., 2007]) yielded 195 cognitions classified into 25 cognitive categories organized into 5 supervision areas: conceptualization of supervision, supervisee assessment,…

  14. Classifying galaxy spectra at 0.5 < z < 1 with self-organizing maps

    NASA Astrophysics Data System (ADS)

    Rahmani, S.; Teimoorinia, H.; Barmby, P.

    2018-05-01

    The spectrum of a galaxy contains information about its physical properties. Classifying spectra using templates helps elucidate the nature of a galaxy's energy sources. In this paper, we investigate the use of self-organizing maps in classifying galaxy spectra against templates. We trained semi-supervised self-organizing map networks using a set of templates covering the wavelength range from far ultraviolet to near infrared. The trained networks were used to classify the spectra of a sample of 142 galaxies with 0.5 < z < 1 and the results compared to classifications performed using K-means clustering, a supervised neural network, and chi-squared minimization. Spectra corresponding to quiescent galaxies were more likely to be classified similarly by all methods while starburst spectra showed more variability. Compared to classification using chi-squared minimization or the supervised neural network, the galaxies classed together by the self-organizing map had more similar spectra. The class ordering provided by the one-dimensional self-organizing maps corresponds to an ordering in physical properties, a potentially important feature for the exploration of large datasets.

  15. HClass: Automatic classification tool for health pathologies using artificial intelligence techniques.

    PubMed

    Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya

    2015-01-01

    The classification of subjects' pathologies enables a rigorousness to be applied to the treatment of certain pathologies, as doctors on occasions play with so many variables that they can end up confusing some illnesses with others. Thanks to Machine Learning techniques applied to a health-record database, it is possible to make using our algorithm. hClass contains a non-linear classification of either a supervised, non-supervised or semi-supervised type. The machine is configured using other techniques such as validation of the set to be classified (cross-validation), reduction in features (PCA) and committees for assessing the various classifiers. The tool is easy to use, and the sample matrix and features that one wishes to classify, the number of iterations and the subjects who are going to be used to train the machine all need to be introduced as inputs. As a result, the success rate is shown either via a classifier or via a committee if one has been formed. A 90% success rate is obtained in the ADABoost classifier and 89.7% in the case of a committee (comprising three classifiers) when PCA is applied. This tool can be expanded to allow the user to totally characterise the classifiers by adjusting them to each classification use.

  16. Tensor-based classification of an auditory mobile BCI without a subject-specific calibration phase

    NASA Astrophysics Data System (ADS)

    Zink, Rob; Hunyadi, Borbála; Van Huffel, Sabine; De Vos, Maarten

    2016-04-01

    Objective. One of the major drawbacks in EEG brain-computer interfaces (BCI) is the need for subject-specific training of the classifier. By removing the need for a supervised calibration phase, new users could potentially explore a BCI faster. In this work we aim to remove this subject-specific calibration phase and allow direct classification. Approach. We explore canonical polyadic decompositions and block term decompositions of the EEG. These methods exploit structure in higher dimensional data arrays called tensors. The BCI tensors are constructed by concatenating ERP templates from other subjects to a target and non-target trial and the inherent structure guides a decomposition that allows accurate classification. We illustrate the new method on data from a three-class auditory oddball paradigm. Main results. The presented approach leads to a fast and intuitive classification with accuracies competitive with a supervised and cross-validated LDA approach. Significance. The described methods are a promising new way of classifying BCI data with a forthright link to the original P300 ERP signal over the conventional and widely used supervised approaches.

  17. Tensor-based classification of an auditory mobile BCI without a subject-specific calibration phase.

    PubMed

    Zink, Rob; Hunyadi, Borbála; Huffel, Sabine Van; Vos, Maarten De

    2016-04-01

    One of the major drawbacks in EEG brain-computer interfaces (BCI) is the need for subject-specific training of the classifier. By removing the need for a supervised calibration phase, new users could potentially explore a BCI faster. In this work we aim to remove this subject-specific calibration phase and allow direct classification. We explore canonical polyadic decompositions and block term decompositions of the EEG. These methods exploit structure in higher dimensional data arrays called tensors. The BCI tensors are constructed by concatenating ERP templates from other subjects to a target and non-target trial and the inherent structure guides a decomposition that allows accurate classification. We illustrate the new method on data from a three-class auditory oddball paradigm. The presented approach leads to a fast and intuitive classification with accuracies competitive with a supervised and cross-validated LDA approach. The described methods are a promising new way of classifying BCI data with a forthright link to the original P300 ERP signal over the conventional and widely used supervised approaches.

  18. Unsupervised classification of cirrhotic livers using MRI data

    NASA Astrophysics Data System (ADS)

    Lee, Gobert; Kanematsu, Masayuki; Kato, Hiroki; Kondo, Hiroshi; Zhou, Xiangrong; Hara, Takeshi; Fujita, Hiroshi; Hoshi, Hiroaki

    2008-03-01

    Cirrhosis of the liver is a chronic disease. It is characterized by the presence of widespread nodules and fibrosis in the liver which results in characteristic texture patterns. Computerized analysis of hepatic texture patterns is usually based on regions-of-interest (ROIs). However, not all ROIs are typical representatives of the disease stage of the liver from which the ROIs originated. This leads to uncertainties in the ROI labels (diseased or non-diseased). On the other hand, supervised classifiers are commonly used in determining the assignment rule. This presents a problem as the training of a supervised classifier requires the correct labels of the ROIs. The main purpose of this paper is to investigate the use of an unsupervised classifier, the k-means clustering, in classifying ROI based data. In addition, a procedure for generating a receiver operating characteristic (ROC) curve depicting the classification performance of k-means clustering is also reported. Hepatic MRI images of 44 patients (16 cirrhotic; 28 non-cirrhotic) are used in this study. The MRI data are derived from gadolinium-enhanced equilibrium phase images. For each patient, 10 ROIs selected by an experienced radiologist and 7 texture features measured on each ROI are included in the MRI data. Results of the k-means classifier are depicted using an ROC curve. The area under the curve (AUC) has a value of 0.704. This is slightly lower than but comparable to that of LDA and ANN classifiers which have values 0.781 and 0.801, respectively. Methods in constructing ROC curve in relation to k-means clustering have not been previously reported in the literature.

  19. Multimodal data and machine learning for surgery outcome prediction in complicated cases of mesial temporal lobe epilepsy.

    PubMed

    Memarian, Negar; Kim, Sally; Dewar, Sandra; Engel, Jerome; Staba, Richard J

    2015-09-01

    This study sought to predict postsurgical seizure freedom from pre-operative diagnostic test results and clinical information using a rapid automated approach, based on supervised learning methods in patients with drug-resistant focal seizures suspected to begin in temporal lobe. We applied machine learning, specifically a combination of mutual information-based feature selection and supervised learning classifiers on multimodal data, to predict surgery outcome retrospectively in 20 presurgical patients (13 female; mean age±SD, in years 33±9.7 for females, and 35.3±9.4 for males) who were diagnosed with mesial temporal lobe epilepsy (MTLE) and subsequently underwent standard anteromesial temporal lobectomy. The main advantage of the present work over previous studies is the inclusion of the extent of ipsilateral neocortical gray matter atrophy and spatiotemporal properties of depth electrode-recorded seizures as training features for individual patient surgery planning. A maximum relevance minimum redundancy (mRMR) feature selector identified the following features as the most informative predictors of postsurgical seizure freedom in this study's sample of patients: family history of epilepsy, ictal EEG onset pattern (positive correlation with seizure freedom), MRI-based gray matter thickness reduction in the hemisphere ipsilateral to seizure onset, proportion of seizures that first appeared in ipsilateral amygdala to total seizures, age, epilepsy duration, delay in the spread of ipsilateral ictal discharges from site of onset, gender, and number of electrode contacts at seizure onset (negative correlation with seizure freedom). Using these features in combination with a least square support vector machine (LS-SVM) classifier compared to other commonly used classifiers resulted in very high surgical outcome prediction accuracy (95%). Supervised machine learning using multimodal compared to unimodal data accurately predicted postsurgical outcome in patients with atypical MTLE. Published by Elsevier Ltd.

  20. Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks.

    PubMed

    Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R; Nguyen, Tuan N; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T

    2017-01-01

    This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively.

  1. Improving EEG-Based Driver Fatigue Classification Using Sparse-Deep Belief Networks

    PubMed Central

    Chai, Rifai; Ling, Sai Ho; San, Phyo Phyo; Naik, Ganesh R.; Nguyen, Tuan N.; Tran, Yvonne; Craig, Ashley; Nguyen, Hung T.

    2017-01-01

    This paper presents an improvement of classification performance for electroencephalography (EEG)-based driver fatigue classification between fatigue and alert states with the data collected from 43 participants. The system employs autoregressive (AR) modeling as the features extraction algorithm, and sparse-deep belief networks (sparse-DBN) as the classification algorithm. Compared to other classifiers, sparse-DBN is a semi supervised learning method which combines unsupervised learning for modeling features in the pre-training layer and supervised learning for classification in the following layer. The sparsity in sparse-DBN is achieved with a regularization term that penalizes a deviation of the expected activation of hidden units from a fixed low-level prevents the network from overfitting and is able to learn low-level structures as well as high-level structures. For comparison, the artificial neural networks (ANN), Bayesian neural networks (BNN), and original deep belief networks (DBN) classifiers are used. The classification results show that using AR feature extractor and DBN classifiers, the classification performance achieves an improved classification performance with a of sensitivity of 90.8%, a specificity of 90.4%, an accuracy of 90.6%, and an area under the receiver operating curve (AUROC) of 0.94 compared to ANN (sensitivity at 80.8%, specificity at 77.8%, accuracy at 79.3% with AUC-ROC of 0.83) and BNN classifiers (sensitivity at 84.3%, specificity at 83%, accuracy at 83.6% with AUROC of 0.87). Using the sparse-DBN classifier, the classification performance improved further with sensitivity of 93.9%, a specificity of 92.3%, and an accuracy of 93.1% with AUROC of 0.96. Overall, the sparse-DBN classifier improved accuracy by 13.8, 9.5, and 2.5% over ANN, BNN, and DBN classifiers, respectively. PMID:28326009

  2. Active learning based segmentation of Crohns disease from abdominal MRI.

    PubMed

    Mahapatra, Dwarikanath; Vos, Franciscus M; Buhmann, Joachim M

    2016-05-01

    This paper proposes a novel active learning (AL) framework, and combines it with semi supervised learning (SSL) for segmenting Crohns disease (CD) tissues from abdominal magnetic resonance (MR) images. Robust fully supervised learning (FSL) based classifiers require lots of labeled data of different disease severities. Obtaining such data is time consuming and requires considerable expertise. SSL methods use a few labeled samples, and leverage the information from many unlabeled samples to train an accurate classifier. AL queries labels of most informative samples and maximizes gain from the labeling effort. Our primary contribution is in designing a query strategy that combines novel context information with classification uncertainty and feature similarity. Combining SSL and AL gives a robust segmentation method that: (1) optimally uses few labeled samples and many unlabeled samples; and (2) requires lower training time. Experimental results show our method achieves higher segmentation accuracy than FSL methods with fewer samples and reduced training effort. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  3. Extracting microRNA-gene relations from biomedical literature using distant supervision

    PubMed Central

    Clarke, Luka A.; Couto, Francisco M.

    2017-01-01

    Many biomedical relation extraction approaches are based on supervised machine learning, requiring an annotated corpus. Distant supervision aims at training a classifier by combining a knowledge base with a corpus, reducing the amount of manual effort necessary. This is particularly useful for biomedicine because many databases and ontologies have been made available for many biological processes, while the availability of annotated corpora is still limited. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process due to its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets, and the results were compared with a co-occurrence approach as well as a supervised machine learning algorithm. While supervised learning outperformed on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which there was no training set developed specifically. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that our method can be successfully used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at https://github.com/AndreLamurias/IBRel. PMID:28263989

  4. Extracting microRNA-gene relations from biomedical literature using distant supervision.

    PubMed

    Lamurias, Andre; Clarke, Luka A; Couto, Francisco M

    2017-01-01

    Many biomedical relation extraction approaches are based on supervised machine learning, requiring an annotated corpus. Distant supervision aims at training a classifier by combining a knowledge base with a corpus, reducing the amount of manual effort necessary. This is particularly useful for biomedicine because many databases and ontologies have been made available for many biological processes, while the availability of annotated corpora is still limited. We studied the extraction of microRNA-gene relations from text. MicroRNA regulation is an important biological process due to its close association with human diseases. The proposed method, IBRel, is based on distantly supervised multi-instance learning. We evaluated IBRel on three datasets, and the results were compared with a co-occurrence approach as well as a supervised machine learning algorithm. While supervised learning outperformed on two of those datasets, IBRel obtained an F-score 28.3 percentage points higher on the dataset for which there was no training set developed specifically. To demonstrate the applicability of IBRel, we used it to extract 27 miRNA-gene relations from recently published papers about cystic fibrosis. Our results demonstrate that our method can be successfully used to extract relations from literature about a biological process without an annotated corpus. The source code and data used in this study are available at https://github.com/AndreLamurias/IBRel.

  5. Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments.

    PubMed

    Han, Wenjing; Coutinho, Eduardo; Ruan, Huabin; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan

    2016-01-01

    Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data which is often scarce and leads to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores, and then delivers the candidates with lower scores to human annotators, and those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly less labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human labeled instances is achieved in both of the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances.

  6. Semi-Supervised Active Learning for Sound Classification in Hybrid Learning Environments

    PubMed Central

    Han, Wenjing; Coutinho, Eduardo; Li, Haifeng; Schuller, Björn; Yu, Xiaojie; Zhu, Xuan

    2016-01-01

    Coping with scarcity of labeled data is a common problem in sound classification tasks. Approaches for classifying sounds are commonly based on supervised learning algorithms, which require labeled data which is often scarce and leads to models that do not generalize well. In this paper, we make an efficient combination of confidence-based Active Learning and Self-Training with the aim of minimizing the need for human annotation for sound classification model training. The proposed method pre-processes the instances that are ready for labeling by calculating their classifier confidence scores, and then delivers the candidates with lower scores to human annotators, and those with high scores are automatically labeled by the machine. We demonstrate the feasibility and efficacy of this method in two practical scenarios: pool-based and stream-based processing. Extensive experimental results indicate that our approach requires significantly less labeled instances to reach the same performance in both scenarios compared to Passive Learning, Active Learning and Self-Training. A reduction of 52.2% in human labeled instances is achieved in both of the pool-based and stream-based scenarios on a sound classification task considering 16,930 sound instances. PMID:27627768

  7. Pervasive Sound Sensing: A Weakly Supervised Training Approach.

    PubMed

    Kelly, Daniel; Caulfield, Brian

    2016-01-01

    Modern smartphones present an ideal device for pervasive sensing of human behavior. Microphones have the potential to reveal key information about a person's behavior. However, they have been utilized to a significantly lesser extent than other smartphone sensors in the context of human behavior sensing. We postulate that, in order for microphones to be useful in behavior sensing applications, the analysis techniques must be flexible and allow easy modification of the types of sounds to be sensed. A simplification of the training data collection process could allow a more flexible sound classification framework. We hypothesize that detailed training, a prerequisite for the majority of sound sensing techniques, is not necessary and that a significantly less detailed and time consuming data collection process can be carried out, allowing even a nonexpert to conduct the collection, labeling, and training process. To test this hypothesis, we implement a diverse density-based multiple instance learning framework, to identify a target sound, and a bag trimming algorithm, which, using the target sound, automatically segments weakly labeled sound clips to construct an accurate training set. Experiments reveal that our hypothesis is a valid one and results show that classifiers, trained using the automatically segmented training sets, were able to accurately classify unseen sound samples with accuracies comparable to supervised classifiers, achieving an average F -measure of 0.969 and 0.87 for two weakly supervised datasets.

  8. Fast and robust segmentation of white blood cell images by self-supervised learning.

    PubMed

    Zheng, Xin; Wang, Yong; Wang, Guoyou; Liu, Jianguo

    2018-04-01

    A fast and accurate white blood cell (WBC) segmentation remains a challenging task, as different WBCs vary significantly in color and shape due to cell type differences, staining technique variations and the adhesion between the WBC and red blood cells. In this paper, a self-supervised learning approach, consisting of unsupervised initial segmentation and supervised segmentation refinement, is presented. The first module extracts the overall foreground region from the cell image by K-means clustering, and then generates a coarse WBC region by touching-cell splitting based on concavity analysis. The second module further uses the coarse segmentation result of the first module as automatic labels to actively train a support vector machine (SVM) classifier. Then, the trained SVM classifier is further used to classify each pixel of the image and achieve a more accurate segmentation result. To improve its segmentation accuracy, median color features representing the topological structure and a new weak edge enhancement operator (WEEO) handling fuzzy boundary are introduced. To further reduce its time cost, an efficient cluster sampling strategy is also proposed. We tested the proposed approach with two blood cell image datasets obtained under various imaging and staining conditions. The experiment results show that our approach has a superior performance of accuracy and time cost on both datasets. Copyright © 2018 Elsevier Ltd. All rights reserved.

  9. Detecting Visually Observable Disease Symptoms from Faces.

    PubMed

    Wang, Kuan; Luo, Jiebo

    2016-12-01

    Recent years have witnessed an increasing interest in the application of machine learning to clinical informatics and healthcare systems. A significant amount of research has been done on healthcare systems based on supervised learning. In this study, we present a generalized solution to detect visually observable symptoms on faces using semi-supervised anomaly detection combined with machine vision algorithms. We rely on the disease-related statistical facts to detect abnormalities and classify them into multiple categories to narrow down the possible medical reasons of detecting. Our method is in contrast with most existing approaches, which are limited by the availability of labeled training data required for supervised learning, and therefore offers the major advantage of flagging any unusual and visually observable symptoms.

  10. Partially supervised P300 speller adaptation for eventual stimulus timing optimization: target confidence is superior to error-related potential score as an uncertain label

    NASA Astrophysics Data System (ADS)

    Zeyl, Timothy; Yin, Erwei; Keightley, Michelle; Chau, Tom

    2016-04-01

    Objective. Error-related potentials (ErrPs) have the potential to guide classifier adaptation in BCI spellers, for addressing non-stationary performance as well as for online optimization of system parameters, by providing imperfect or partial labels. However, the usefulness of ErrP-based labels for BCI adaptation has not been established in comparison to other partially supervised methods. Our objective is to make this comparison by retraining a two-step P300 speller on a subset of confident online trials using naïve labels taken from speller output, where confidence is determined either by (i) ErrP scores, (ii) posterior target scores derived from the P300 potential, or (iii) a hybrid of these scores. We further wish to evaluate the ability of partially supervised adaptation and retraining methods to adjust to a new stimulus-onset asynchrony (SOA), a necessary step towards online SOA optimization. Approach. Eleven consenting able-bodied adults attended three online spelling sessions on separate days with feedback in which SOAs were set at 160 ms (sessions 1 and 2) and 80 ms (session 3). A post hoc offline analysis and a simulated online analysis were performed on sessions two and three to compare multiple adaptation methods. Area under the curve (AUC) and symbols spelled per minute (SPM) were the primary outcome measures. Main results. Retraining using supervised labels confirmed improvements of 0.9 percentage points (session 2, p < 0.01) and 1.9 percentage points (session 3, p < 0.05) in AUC using same-day training data over using data from a previous day, which supports classifier adaptation in general. Significance. Using posterior target score alone as a confidence measure resulted in the highest SPM of the partially supervised methods, indicating that ErrPs are not necessary to boost the performance of partially supervised adaptive classification. Partial supervision significantly improved SPM at a novel SOA, showing promise for eventual online SOA optimization.

  11. Classification

    NASA Technical Reports Server (NTRS)

    Oza, Nikunj C.

    2011-01-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. In supervised learning, a set of training examples---examples with known output values---is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate's measurements. This chapter discusses methods to perform machine learning, with examples involving astronomy.

  12. Semi-supervised manifold learning with affinity regularization for Alzheimer's disease identification using positron emission tomography imaging.

    PubMed

    Lu, Shen; Xia, Yong; Cai, Tom Weidong; Feng, David Dagan

    2015-01-01

    Dementia, Alzheimer's disease (AD) in particular is a global problem and big threat to the aging population. An image based computer-aided dementia diagnosis method is needed to providing doctors help during medical image examination. Many machine learning based dementia classification methods using medical imaging have been proposed and most of them achieve accurate results. However, most of these methods make use of supervised learning requiring fully labeled image dataset, which usually is not practical in real clinical environment. Using large amount of unlabeled images can improve the dementia classification performance. In this study we propose a new semi-supervised dementia classification method based on random manifold learning with affinity regularization. Three groups of spatial features are extracted from positron emission tomography (PET) images to construct an unsupervised random forest which is then used to regularize the manifold learning objective function. The proposed method, stat-of-the-art Laplacian support vector machine (LapSVM) and supervised SVM are applied to classify AD and normal controls (NC). The experiment results show that learning with unlabeled images indeed improves the classification performance. And our method outperforms LapSVM on the same dataset.

  13. Classification and analysis of the Rudaki's Area

    NASA Astrophysics Data System (ADS)

    Zambon, F.; De sanctis, M.; Capaccioni, F.; Filacchione, G.; Carli, C.; Ammannito, E.; Frigeri, A.

    2011-12-01

    During the first two MESSENGER flybys the Mercury Dual Imaging System (MDIS) has mapped 90% of the Mercury's surface. An effective way to study the different terrain on planetary surfaces is to apply classification methods. These are based on clustering algorithms and they can be divided in two categories: unsupervised and supervised. The unsupervised classifiers do not require the analyst feedback and the algorithm automatically organizes pixels values into classes. In the supervised method, instead, the analyst must choose the "training area" that define the pixels value of a given class. We applied an unsupervised classifier, ISODATA, to the WAC filter images of the Rudaki's area where several kind of terrain have been identified showing differences in albedo, topography and crater density. ISODATA classifier divides this region in four classes: 1) shadow regions, 2) rough regions, 3) smooth plane, 4) highest reflectance area. ISODATA can not distinguish the high albedo regions from highly reflective illuminated edge of the craters, however the algorithm identify four classes that can be considered different units mainly on the basis of their reflectances at the various wavelengths. Is not possible, instead, to extrapolate compositional information because of the absence of clear spectral features. An additional analysis was made using ISODATA to choose the "training area" for further supervised classifications. These approach would allow, for example, to separate more accurately the edge of the craters from the high reflectance areas and the low reflectance regions from the shadow areas.

  14. Automated labelling of cancer textures in colorectal histopathology slides using quasi-supervised learning.

    PubMed

    Onder, Devrim; Sarioglu, Sulen; Karacali, Bilge

    2013-04-01

    Quasi-supervised learning is a statistical learning algorithm that contrasts two datasets by computing estimate for the posterior probability of each sample in either dataset. This method has not been applied to histopathological images before. The purpose of this study is to evaluate the performance of the method to identify colorectal tissues with or without adenocarcinoma. Light microscopic digital images from histopathological sections were obtained from 30 colorectal radical surgery materials including adenocarcinoma and non-neoplastic regions. The texture features were extracted by using local histograms and co-occurrence matrices. The quasi-supervised learning algorithm operates on two datasets, one containing samples of normal tissues labelled only indirectly, and the other containing an unlabeled collection of samples of both normal and cancer tissues. As such, the algorithm eliminates the need for manually labelled samples of normal and cancer tissues for conventional supervised learning and significantly reduces the expert intervention. Several texture feature vector datasets corresponding to different extraction parameters were tested within the proposed framework. The Independent Component Analysis dimensionality reduction approach was also identified as the one improving the labelling performance evaluated in this series. In this series, the proposed method was applied to the dataset of 22,080 vectors with reduced dimensionality 119 from 132. Regions containing cancer tissue could be identified accurately having false and true positive rates up to 19% and 88% respectively without using manually labelled ground-truth datasets in a quasi-supervised strategy. The resulting labelling performances were compared to that of a conventional powerful supervised classifier using manually labelled ground-truth data. The supervised classifier results were calculated as 3.5% and 95% for the same case. The results in this series in comparison with the benchmark classifier, suggest that quasi-supervised image texture labelling may be a useful method in the analysis and classification of pathological slides but further study is required to improve the results. Copyright © 2013 Elsevier Ltd. All rights reserved.

  15. Semi-supervised classification tool for DubaiSat-2 multispectral imagery

    NASA Astrophysics Data System (ADS)

    Al-Mansoori, Saeed

    2015-10-01

    This paper addresses a semi-supervised classification tool based on a pixel-based approach of the multi-spectral satellite imagery. There are not many studies demonstrating such algorithm for the multispectral images, especially when the image consists of 4 bands (Red, Green, Blue and Near Infrared) as in DubaiSat-2 satellite images. The proposed approach utilizes both unsupervised and supervised classification schemes sequentially to identify four classes in the image, namely, water bodies, vegetation, land (developed and undeveloped areas) and paved areas (i.e. roads). The unsupervised classification concept is applied to identify two classes; water bodies and vegetation, based on a well-known index that uses the distinct wavelengths of visible and near-infrared sunlight that is absorbed and reflected by the plants to identify the classes; this index parameter is called "Normalized Difference Vegetation Index (NDVI)". Afterward, the supervised classification is performed by selecting training homogenous samples for roads and land areas. Here, a precise selection of training samples plays a vital role in the classification accuracy. Post classification is finally performed to enhance the classification accuracy, where the classified image is sieved, clumped and filtered before producing final output. Overall, the supervised classification approach produced higher accuracy than the unsupervised method. This paper shows some current preliminary research results which point out the effectiveness of the proposed technique in a virtual perspective.

  16. Using Computational Text Classification for Qualitative Research and Evaluation in Extension

    ERIC Educational Resources Information Center

    Smith, Justin G.; Tissing, Reid

    2018-01-01

    This article introduces a process for computational text classification that can be used in a variety of qualitative research and evaluation settings. The process leverages supervised machine learning based on an implementation of a multinomial Bayesian classifier. Applied to a community of inquiry framework, the algorithm was used to identify…

  17. New insights into the classification and nomenclature of cortical GABAergic interneurons.

    PubMed

    DeFelipe, Javier; López-Cruz, Pedro L; Benavides-Piccione, Ruth; Bielza, Concha; Larrañaga, Pedro; Anderson, Stewart; Burkhalter, Andreas; Cauli, Bruno; Fairén, Alfonso; Feldmeyer, Dirk; Fishell, Gord; Fitzpatrick, David; Freund, Tamás F; González-Burgos, Guillermo; Hestrin, Shaul; Hill, Sean; Hof, Patrick R; Huang, Josh; Jones, Edward G; Kawaguchi, Yasuo; Kisvárday, Zoltán; Kubota, Yoshiyuki; Lewis, David A; Marín, Oscar; Markram, Henry; McBain, Chris J; Meyer, Hanno S; Monyer, Hannah; Nelson, Sacha B; Rockland, Kathleen; Rossier, Jean; Rubenstein, John L R; Rudy, Bernardo; Scanziani, Massimo; Shepherd, Gordon M; Sherwood, Chet C; Staiger, Jochen F; Tamás, Gábor; Thomson, Alex; Wang, Yun; Yuste, Rafael; Ascoli, Giorgio A

    2013-03-01

    A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts' assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus.

  18. New insights into the classification and nomenclature of cortical GABAergic interneurons

    PubMed Central

    DeFelipe, Javier; López-Cruz, Pedro L.; Benavides-Piccione, Ruth; Bielza, Concha; Larrañaga, Pedro; Anderson, Stewart; Burkhalter, Andreas; Cauli, Bruno; Fairén, Alfonso; Feldmeyer, Dirk; Fishell, Gord; Fitzpatrick, David; Freund, Tamás F.; González-Burgos, Guillermo; Hestrin, Shaul; Hill, Sean; Hof, Patrick R.; Huang, Josh; Jones, Edward G.; Kawaguchi, Yasuo; Kisvárday, Zoltán; Kubota, Yoshiyuki; Lewis, David A.; Marín, Oscar; Markram, Henry; McBain, Chris J.; Meyer, Hanno S.; Monyer, Hannah; Nelson, Sacha B.; Rockland, Kathleen; Rossier, Jean; Rubenstein, John L. R.; Rudy, Bernardo; Scanziani, Massimo; Shepherd, Gordon M.; Sherwood, Chet C.; Staiger, Jochen F.; Tamás, Gábor; Thomson, Alex; Wang, Yun; Yuste, Rafael; Ascoli, Giorgio A.

    2013-01-01

    A systematic classification and accepted nomenclature of neuron types is much needed but is currently lacking. This article describes a possible taxonomical solution for classifying GABAergic interneurons of the cerebral cortex based on a novel, web-based interactive system that allows experts to classify neurons with pre-determined criteria. Using Bayesian analysis and clustering algorithms on the resulting data, we investigated the suitability of several anatomical terms and neuron names for cortical GABAergic interneurons. Moreover, we show that supervised classification models could automatically categorize interneurons in agreement with experts’ assignments. These results demonstrate a practical and objective approach to the naming, characterization and classification of neurons based on community consensus. PMID:23385869

  19. Filtering big data from social media--Building an early warning system for adverse drug reactions.

    PubMed

    Yang, Ming; Kiang, Melody; Shang, Wei

    2015-04-01

    Adverse drug reactions (ADRs) are believed to be a leading cause of death in the world. Pharmacovigilance systems are aimed at early detection of ADRs. With the popularity of social media, Web forums and discussion boards become important sources of data for consumers to share their drug use experience, as a result may provide useful information on drugs and their adverse reactions. In this study, we propose an automated ADR related posts filtering mechanism using text classification methods. In real-life settings, ADR related messages are highly distributed in social media, while non-ADR related messages are unspecific and topically diverse. It is expensive to manually label a large amount of ADR related messages (positive examples) and non-ADR related messages (negative examples) to train classification systems. To mitigate this challenge, we examine the use of a partially supervised learning classification method to automate the process. We propose a novel pharmacovigilance system leveraging a Latent Dirichlet Allocation modeling module and a partially supervised classification approach. We select drugs with more than 500 threads of discussion, and collect all the original posts and comments of these drugs using an automatic Web spidering program as the text corpus. Various classifiers were trained by varying the number of positive examples and the number of topics. The trained classifiers were applied to 3000 posts published over 60 days. Top-ranked posts from each classifier were pooled and the resulting set of 300 posts was reviewed by a domain expert to evaluate the classifiers. Compare to the alternative approaches using supervised learning methods and three general purpose partially supervised learning methods, our approach performs significantly better in terms of precision, recall, and the F measure (the harmonic mean of precision and recall), based on a computational experiment using online discussion threads from Medhelp. Our design provides satisfactory performance in identifying ADR related posts for post-marketing drug surveillance. The overall design of our system also points out a potentially fruitful direction for building other early warning systems that need to filter big data from social media networks. Copyright © 2015 Elsevier Inc. All rights reserved.

  20. Quantum Support Vector Machine for Big Data Classification

    NASA Astrophysics Data System (ADS)

    Rebentrost, Patrick; Mohseni, Masoud; Lloyd, Seth

    2014-09-01

    Supervised machine learning is the classification of new data based on already classified training examples. In this work, we show that the support vector machine, an optimized binary classifier, can be implemented on a quantum computer, with complexity logarithmic in the size of the vectors and the number of training examples. In cases where classical sampling algorithms require polynomial time, an exponential speedup is obtained. At the core of this quantum big data algorithm is a nonsparse matrix exponentiation technique for efficiently performing a matrix inversion of the training data inner-product (kernel) matrix.

  1. Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.

    PubMed

    Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan

    2017-07-01

    Detection and classification of breast lesions using mammographic images are one of the most difficult studies in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied on dense scale invariant feature transform (DSIFT) features. A linear classifier is also simultaneously learned with the dictionary which can effectively classify the sparse representations. Our experimental results show the superior performance of our method compared to existing approaches.

  2. Feature extraction for ultrasonic sensor based defect detection in ceramic components

    NASA Astrophysics Data System (ADS)

    Kesharaju, Manasa; Nagarajah, Romesh

    2014-02-01

    High density silicon carbide materials are commonly used as the ceramic element of hard armour inserts used in traditional body armour systems to reduce their weight, while providing improved hardness, strength and elastic response to stress. Currently, armour ceramic tiles are inspected visually offline using an X-ray technique that is time consuming and very expensive. In addition, from X-rays multiple defects are also misinterpreted as single defects. Therefore, to address these problems the ultrasonic non-destructive approach is being investigated. Ultrasound based inspection would be far more cost effective and reliable as the methodology is applicable for on-line quality control including implementation of accept/reject criteria. This paper describes a recently developed methodology to detect, locate and classify various manufacturing defects in ceramic tiles using sub band coding of ultrasonic test signals. The wavelet transform is applied to the ultrasonic signal and wavelet coefficients in the different frequency bands are extracted and used as input features to an artificial neural network (ANN) for purposes of signal classification. Two different classifiers, using artificial neural networks (supervised) and clustering (un-supervised) are supplied with features selected using Principal Component Analysis(PCA) and their classification performance compared. This investigation establishes experimentally that Principal Component Analysis(PCA) can be effectively used as a feature selection method that provides superior results for classifying various defects in the context of ultrasonic inspection in comparison with the X-ray technique.

  3. Closing the loop: from paper to protein annotation using supervised Gene Ontology classification.

    PubMed

    Gobeill, Julien; Pasche, Emilie; Vishnyakova, Dina; Ruch, Patrick

    2014-01-01

    Gene function curation of the literature with Gene Ontology (GO) concepts is one particularly time-consuming task in genomics, and the help from bioinformatics is highly requested to keep up with the flow of publications. In 2004, the first BioCreative challenge already designed a task of automatic GO concepts assignment from a full text. At this time, results were judged far from reaching the performances required by real curation workflows. In particular, supervised approaches produced the most disappointing results because of lack of training data. Ten years later, the available curation data have massively grown. In 2013, the BioCreative IV GO task revisited the automatic GO assignment task. For this issue, we investigated the power of our supervised classifier, GOCat. GOCat computes similarities between an input text and already curated instances contained in a knowledge base to infer GO concepts. The subtask A consisted in selecting GO evidence sentences for a relevant gene in a full text. For this, we designed a state-of-the-art supervised statistical approach, using a naïve Bayes classifier and the official training set, and obtained fair results. The subtask B consisted in predicting GO concepts from the previous output. For this, we applied GOCat and reached leading results, up to 65% for hierarchical recall in the top 20 outputted concepts. Contrary to previous competitions, machine learning has this time outperformed standard dictionary-based approaches. Thanks to BioCreative IV, we were able to design a complete workflow for curation: given a gene name and a full text, this system is able to select evidence sentences for curation and to deliver highly relevant GO concepts. Contrary to previous competitions, machine learning this time outperformed dictionary-based systems. Observed performances are sufficient for being used in a real semiautomatic curation workflow. GOCat is available at http://eagl.unige.ch/GOCat/. http://eagl.unige.ch/GOCat4FT/. © The Author(s) 2014. Published by Oxford University Press.

  4. A systematic comparison of different object-based classification techniques using high spatial resolution imagery in agricultural environments

    NASA Astrophysics Data System (ADS)

    Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk

    2016-07-01

    Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to obtain a lower accuracy due to the broken objects at fine scales. In contrast to some previous studies, the RF classifiers yielded the best results and the k-nearest neighbor classifier were the worst results, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of training sample analyses indicated that the RF and adaboost. M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.

  5. Applying active learning to supervised word sense disambiguation in MEDLINE.

    PubMed

    Chen, Yukun; Cao, Hongxin; Mei, Qiaozhu; Zheng, Kai; Xu, Hua

    2013-01-01

    This study was to assess whether active learning strategies can be integrated with supervised word sense disambiguation (WSD) methods, thus reducing the number of annotated samples, while keeping or improving the quality of disambiguation models. We developed support vector machine (SVM) classifiers to disambiguate 197 ambiguous terms and abbreviations in the MSH WSD collection. Three different uncertainty sampling-based active learning algorithms were implemented with the SVM classifiers and were compared with a passive learner (PL) based on random sampling. For each ambiguous term and each learning algorithm, a learning curve that plots the accuracy computed from the test set as a function of the number of annotated samples used in the model was generated. The area under the learning curve (ALC) was used as the primary metric for evaluation. Our experiments demonstrated that active learners (ALs) significantly outperformed the PL, showing better performance for 177 out of 197 (89.8%) WSD tasks. Further analysis showed that to achieve an average accuracy of 90%, the PL needed 38 annotated samples, while the ALs needed only 24, a 37% reduction in annotation effort. Moreover, we analyzed cases where active learning algorithms did not achieve superior performance and identified three causes: (1) poor models in the early learning stage; (2) easy WSD cases; and (3) difficult WSD cases, which provide useful insight for future improvements. This study demonstrated that integrating active learning strategies with supervised WSD methods could effectively reduce annotation cost and improve the disambiguation models.

  6. Applying active learning to supervised word sense disambiguation in MEDLINE

    PubMed Central

    Chen, Yukun; Cao, Hongxin; Mei, Qiaozhu; Zheng, Kai; Xu, Hua

    2013-01-01

    Objectives This study was to assess whether active learning strategies can be integrated with supervised word sense disambiguation (WSD) methods, thus reducing the number of annotated samples, while keeping or improving the quality of disambiguation models. Methods We developed support vector machine (SVM) classifiers to disambiguate 197 ambiguous terms and abbreviations in the MSH WSD collection. Three different uncertainty sampling-based active learning algorithms were implemented with the SVM classifiers and were compared with a passive learner (PL) based on random sampling. For each ambiguous term and each learning algorithm, a learning curve that plots the accuracy computed from the test set as a function of the number of annotated samples used in the model was generated. The area under the learning curve (ALC) was used as the primary metric for evaluation. Results Our experiments demonstrated that active learners (ALs) significantly outperformed the PL, showing better performance for 177 out of 197 (89.8%) WSD tasks. Further analysis showed that to achieve an average accuracy of 90%, the PL needed 38 annotated samples, while the ALs needed only 24, a 37% reduction in annotation effort. Moreover, we analyzed cases where active learning algorithms did not achieve superior performance and identified three causes: (1) poor models in the early learning stage; (2) easy WSD cases; and (3) difficult WSD cases, which provide useful insight for future improvements. Conclusions This study demonstrated that integrating active learning strategies with supervised WSD methods could effectively reduce annotation cost and improve the disambiguation models. PMID:23364851

  7. Automatic threshold selection for multi-class open set recognition

    NASA Astrophysics Data System (ADS)

    Scherreik, Matthew; Rigling, Brian

    2017-05-01

    Multi-class open set recognition is the problem of supervised classification with additional unknown classes encountered after a model has been trained. An open set classifer often has two core components. The first component is a base classifier which estimates the most likely class of a given example. The second component consists of open set logic which estimates if the example is truly a member of the candidate class. Such a system is operated in a feed-forward fashion. That is, a candidate label is first estimated by the base classifier, and the true membership of the example to the candidate class is estimated afterward. Previous works have developed an iterative threshold selection algorithm for rejecting examples from classes which were not present at training time. In those studies, a Platt-calibrated SVM was used as the base classifier, and the thresholds were applied to class posterior probabilities for rejection. In this work, we investigate the effectiveness of other base classifiers when paired with the threshold selection algorithm and compare their performance with the original SVM solution.

  8. VHR satellite multitemporal data to extract cultural landscape changes in the roman site of Grumentum

    NASA Astrophysics Data System (ADS)

    masini, nicola; Lasaponara, Rosa

    2013-04-01

    The papers deals with the use of VHR satellite multitemporal data set to extract cultural landscape changes in the roman site of Grumentum Grumentum is an ancient town, 50 km south of Potenza, located near the roman road of Via Herculea which connected the Venusia, in the north est of Basilicata, with Heraclea in the Ionian coast. The first settlement date back to the 6th century BC. It was resettled by the Romans in the 3rd century BC. Its urban fabric which evidences a long history from the Republican age to late Antiquity (III BC-V AD) is composed of the typical urban pattern of cardi and decumani. Its excavated ruins include a large amphitheatre, a theatre, the thermae, the Forum and some temples. There are many techniques nowadays available to capture and record differences in two or more images. In this paper we focus and apply the two main approaches which can be distinguished into : (i) unsupervised and (ii) supervised change detection methods. Unsupervised change detection methods are generally based on the transformation of the two multispectral images in to a single band or multiband image which are further analyzed to identify changes Unsupervised change detection techniques are generally based on three basic steps (i) the preprocessing step, (ii) a pixel-by-pixel comparison is performed, (iii). Identification of changes according to the magnitude an direction (positive /negative). Unsupervised change detection are generally based on the transformation of the two multispectral images into a single band or multiband image which are further analyzed to identify changes. Than the separation between changed and unchanged classes is obtained from the magnitude of the resulting spectral change vectors by means of empirical or theoretical well founded approaches Supervised change detection methods are generally based on supervised classification methods, which require the availability of a suitable training set for the learning process of the classifiers. Unsupervised change detection techniques are generally based on three basic steps (i) the preprocessing step, (ii) supervised classification is performed on the single dates or on the map obtained as the difference of two dates, (iii). Identification of changes according to the magnitude an direction (positive /negative). Supervised change detection are generally based on supervised classification methods, which require the availability of a suitable training set for the learning process of the classifiers, therefore these algorithms require a preliminary knowledge necessary: (i) to generate representative parameters for each class of interest; and (ii) to carry out the training stage Advantages and disadvantages of the supervised and unsupervised approaches are discuss. Finally results from the the satellite multitemporal dataset was also integrated with aerial photos from historical archive in order to expand the time window of the investigation and capture landscape changes occurred from the Agrarian Reform, in the 50s, up today.

  9. Transfer learning improves supervised image segmentation across imaging protocols.

    PubMed

    van Opbroek, Annegreet; Ikram, M Arfan; Vernooij, Meike W; de Bruijne, Marleen

    2015-05-01

    The variation between images obtained with different scanners or different imaging protocols presents a major challenge in automatic segmentation of biomedical images. This variation especially hampers the application of otherwise successful supervised-learning techniques which, in order to perform well, often require a large amount of labeled training data that is exactly representative of the target data. We therefore propose to use transfer learning for image segmentation. Transfer-learning techniques can cope with differences in distributions between training and target data, and therefore may improve performance over supervised learning for segmentation across scanners and scan protocols. We present four transfer classifiers that can train a classification scheme with only a small amount of representative training data, in addition to a larger amount of other training data with slightly different characteristics. The performance of the four transfer classifiers was compared to that of standard supervised classification on two magnetic resonance imaging brain-segmentation tasks with multi-site data: white matter, gray matter, and cerebrospinal fluid segmentation; and white-matter-/MS-lesion segmentation. The experiments showed that when there is only a small amount of representative training data available, transfer learning can greatly outperform common supervised-learning approaches, minimizing classification errors by up to 60%.

  10. Cancer classification through filtering progressive transductive support vector machine based on gene expression data

    NASA Astrophysics Data System (ADS)

    Lu, Xinguo; Chen, Dan

    2017-08-01

    Traditional supervised classifiers neglect a large amount of data which not have sufficient follow-up information, only work with labeled data. Consequently, the small sample size limits the advancement of design appropriate classifier. In this paper, a transductive learning method which combined with the filtering strategy in transductive framework and progressive labeling strategy is addressed. The progressive labeling strategy does not need to consider the distribution of labeled samples to evaluate the distribution of unlabeled samples, can effective solve the problem of evaluate the proportion of positive and negative samples in work set. Our experiment result demonstrate that the proposed technique have great potential in cancer prediction based on gene expression.

  11. An expert fitness diagnosis system based on elastic cloud computing.

    PubMed

    Tseng, Kevin C; Wu, Chia-Chuan

    2014-01-01

    This paper presents an expert diagnosis system based on cloud computing. It classifies a user's fitness level based on supervised machine learning techniques. This system is able to learn and make customized diagnoses according to the user's physiological data, such as age, gender, and body mass index (BMI). In addition, an elastic algorithm based on Poisson distribution is presented to allocate computation resources dynamically. It predicts the required resources in the future according to the exponential moving average of past observations. The experimental results show that Naïve Bayes is the best classifier with the highest accuracy (90.8%) and that the elastic algorithm is able to capture tightly the trend of requests generated from the Internet and thus assign corresponding computation resources to ensure the quality of service.

  12. Geomorphology and depositional subenvironments of Gulf Islands National Seashore, Perdido Key and Santa Rosa Island, Florida

    USGS Publications Warehouse

    Morton, Robert A.; Montgomery, Marilyn C.

    2010-01-01

    The primary mapping procedures were supervised functions within a Geographic Information System (GIS) that were applied to delineate and classify depositional subenvironments and features, collectively referred to as map units. The delineated boundaries of the map units were exported to create one shapefile, and are differentiated by the field "Type" in the associated attribute table. Map units were delineated and classified based on differences in tonal patterns of features in contrast to adjacent features observed on orthophotography. Land elevations from recent lidar surveys served as supplementary data to assist in delineating the map unit boundaries.

  13. [Object-oriented aquatic vegetation extracting approach based on visible vegetation indices.

    PubMed

    Jing, Ran; Deng, Lei; Zhao, Wen Ji; Gong, Zhao Ning

    2016-05-01

    Using the estimation of scale parameters (ESP) image segmentation tool to determine the ideal image segmentation scale, the optimal segmented image was created by the multi-scale segmentation method. Based on the visible vegetation indices derived from mini-UAV imaging data, we chose a set of optimal vegetation indices from a series of visible vegetation indices, and built up a decision tree rule. A membership function was used to automatically classify the study area and an aquatic vegetation map was generated. The results showed the overall accuracy of image classification using the supervised classification was 53.7%, and the overall accuracy of object-oriented image analysis (OBIA) was 91.7%. Compared with pixel-based supervised classification method, the OBIA method improved significantly the image classification result and further increased the accuracy of extracting the aquatic vegetation. The Kappa value of supervised classification was 0.4, and the Kappa value based OBIA was 0.9. The experimental results demonstrated that using visible vegetation indices derived from the mini-UAV data and OBIA method extracting the aquatic vegetation developed in this study was feasible and could be applied in other physically similar areas.

  14. Feasibility study of stain-free classification of cell apoptosis based on diffraction imaging flow cytometry and supervised machine learning techniques.

    PubMed

    Feng, Jingwen; Feng, Tong; Yang, Chengwen; Wang, Wei; Sa, Yu; Feng, Yuanming

    2018-06-01

    This study was to explore the feasibility of prediction and classification of cells in different stages of apoptosis with a stain-free method based on diffraction images and supervised machine learning. Apoptosis was induced in human chronic myelogenous leukemia K562 cells by cis-platinum (DDP). A newly developed technique of polarization diffraction imaging flow cytometry (p-DIFC) was performed to acquire diffraction images of the cells in three different statuses (viable, early apoptotic and late apoptotic/necrotic) after cell separation through fluorescence activated cell sorting with Annexin V-PE and SYTOX® Green double staining. The texture features of the diffraction images were extracted with in-house software based on the Gray-level co-occurrence matrix algorithm to generate datasets for cell classification with supervised machine learning method. Therefore, this new method has been verified in hydrogen peroxide induced apoptosis model of HL-60. Results show that accuracy of higher than 90% was achieved respectively in independent test datasets from each cell type based on logistic regression with ridge estimators, which indicated that p-DIFC system has a great potential in predicting and classifying cells in different stages of apoptosis.

  15. Examining change detection approaches for tropical mangrove monitoring

    USGS Publications Warehouse

    Myint, Soe W.; Franklin, Janet; Buenemann, Michaela; Kim, Won; Giri, Chandra

    2014-01-01

    This study evaluated the effectiveness of different band combinations and classifiers (unsupervised, supervised, object-oriented nearest neighbor, and object-oriented decision rule) for quantifying mangrove forest change using multitemporal Landsat data. A discriminant analysis using spectra of different vegetation types determined that bands 2 (0.52 to 0.6 μm), 5 (1.55 to 1.75 μm), and 7 (2.08 to 2.35 μm) were the most effective bands for differentiating mangrove forests from surrounding land cover types. A ranking of thirty-six change maps, produced by comparing the classification accuracy of twelve change detection approaches, was used. The object-based Nearest Neighbor classifier produced the highest mean overall accuracy (84 percent) regardless of band combinations. The automated decision rule-based approach (mean overall accuracy of 88 percent) as well as a composite of bands 2, 5, and 7 used with the unsupervised classifier and the same composite or all band difference with the object-oriented Nearest Neighbor classifier were the most effective approaches.

  16. Supervised fully polarimetric classification of the Black Forest test site: From MAESTROI to MAC Europe

    NASA Technical Reports Server (NTRS)

    Degrandi, G.; Lavalle, C.; Degroof, H.; Sieber, A.

    1992-01-01

    A study on the performance of a supervised fully polarimetric maximum likelihood classifier for synthetic aperture radar (SAR) data when applied to a specific classification context: forest classification based on age classes and in the presence of a sloping terrain is presented. For the experimental part, the polarimetric AIRSAR data at P, L, and C-band, acquired over the German Black Forest near Freiburg in the frame of the 1989 MAESTRO-1 campaign and the 1991 MAC Europe campaign was used, MAESTRO-1 with an ESA/JRC sponsored campaign, and MAC Europe (Multi-sensor Aircraft Campaign); in both cases the multi-frequency polarimetric JPL Airborne Synthetic Aperture Radar (AIRSAR) radar was flown over a number of European test sites. The study is structured as follows. At first, the general characteristics of the classifier and the dependencies from some parameters, like frequency bands, feature vector, calibration, using test areas lying on a flat terrain are investigated. Once it is determined the optimal conditions for the classifier performance, we then move on to the study of the slope effect. The bulk of this work is performed using the Maestrol data set. Next the classifier performance with the MAC Europe data is considered. The study is divided into two stages: first some of the tests done on the Maestro data are repeated, to highlight the improvements due to the new processing scheme that delivers 16 look data. Second we experiment with multi images classification with two goals: to assess the possibility of using a training set measured from one image to classify areas in different images; and to classify areas on critical slopes using different viewing angles. The main points of the study are listed and some of the results obtained so far are highlighted.

  17. Image-Based Airborne Sensors: A Combined Approach for Spectral Signatures Classification through Deterministic Simulated Annealing

    PubMed Central

    Guijarro, María; Pajares, Gonzalo; Herrera, P. Javier

    2009-01-01

    The increasing technology of high-resolution image airborne sensors, including those on board Unmanned Aerial Vehicles, demands automatic solutions for processing, either on-line or off-line, the huge amountds of image data sensed during the flights. The classification of natural spectral signatures in images is one potential application. The actual tendency in classification is oriented towards the combination of simple classifiers. In this paper we propose a combined strategy based on the Deterministic Simulated Annealing (DSA) framework. The simple classifiers used are the well tested supervised parametric Bayesian estimator and the Fuzzy Clustering. The DSA is an optimization approach, which minimizes an energy function. The main contribution of DSA is its ability to avoid local minima during the optimization process thanks to the annealing scheme. It outperforms simple classifiers used for the combination and some combined strategies, including a scheme based on the fuzzy cognitive maps and an optimization approach based on the Hopfield neural network paradigm. PMID:22399989

  18. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    PubMed Central

    Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

    2009-01-01

    Supervised learning of microarray data is receiving much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, are used as adjunct of clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid this misleading, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependant on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state of art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class selective rejection frameworks. Five genes selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932

  19. Supervised machine learning and active learning in classification of radiology reports.

    PubMed

    Nguyen, Dung H M; Patrick, Jon D

    2014-01-01

    This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry. In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize training production and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney). The reportability classifier performance achieved 98.25% sensitivity and 96.14% specificity on the cancer registry's held-out test set. Up to 92% of training data needed for supervised machine learning can be saved by AL. AL is a promising method for optimizing the supervised training production used in classification of radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly. The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  20. Arrangement and Applying of Movement Patterns in the Cerebellum Based on Semi-supervised Learning.

    PubMed

    Solouki, Saeed; Pooyan, Mohammad

    2016-06-01

    Biological control systems have long been studied as a possible inspiration for the construction of robotic controllers. The cerebellum is known to be involved in the production and learning of smooth, coordinated movements. Therefore, highly regular structure of the cerebellum has been in the core of attention in theoretical and computational modeling. However, most of these models reflect some special features of the cerebellum without regarding the whole motor command computational process. In this paper, we try to make a logical relation between the most significant models of the cerebellum and introduce a new learning strategy to arrange the movement patterns: cerebellar modular arrangement and applying of movement patterns based on semi-supervised learning (CMAPS). We assume here the cerebellum like a big archive of patterns that has an efficient organization to classify and recall them. The main idea is to achieve an optimal use of memory locations by more than just a supervised learning and classification algorithm. Surely, more experimental and physiological researches are needed to confirm our hypothesis.

  1. Comparing the Behavior of Polarimetric SAR Imagery (TerraSAR-X and Radarsat-2) for Automated Sea Ice Classification

    NASA Astrophysics Data System (ADS)

    Ressel, Rudolf; Singha, Suman; Lehner, Susanne

    2016-08-01

    Arctic Sea ice monitoring has attracted increasing attention over the last few decades. Besides the scientific interest in sea ice, the operational aspect of ice charting is becoming more important due to growing navigational possibilities in an increasingly ice free Arctic. For this purpose, satellite borne SAR imagery has become an invaluable tool. In past, mostly single polarimetric datasets were investigated with supervised or unsupervised classification schemes for sea ice investigation. Despite proven sea ice classification achievements on single polarimetric data, a fully automatic, general purpose classifier for single-pol data has not been established due to large variation of sea ice manifestations and incidence angle impact. Recently, through the advent of polarimetric SAR sensors, polarimetric features have moved into the focus of ice classification research. The higher information content four polarimetric channels promises to offer greater insight into sea ice scattering mechanism and overcome some of the shortcomings of single- polarimetric classifiers. Two spatially and temporally coincident pairs of fully polarimetric acquisitions from the TerraSAR-X/TanDEM-X and RADARSAT-2 satellites are investigated. Proposed supervised classification algorithm consists of two steps: The first step comprises a feature extraction, the results of which are ingested into a neural network classifier in the second step. Based on the common coherency and covariance matrix, we extract a number of features and analyze the relevance and redundancy by means of mutual information for the purpose of sea ice classification. Coherency matrix based features which require an eigendecomposition are found to be either of low relevance or redundant to other covariance matrix based features. Among the most useful features for classification are matrix invariant based features (Geometric Intensity, Scattering Diversity, Surface Scattering Fraction).

  2. A Dirichlet process model for classifying and forecasting epidemic curves.

    PubMed

    Nsoesie, Elaine O; Leman, Scotland C; Marathe, Madhav V

    2014-01-09

    A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves. The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997-2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC). We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods' performance was comparable. Although RF requires less computational time compared to the DP model, the algorithm is fully supervised implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial.

  3. Multi-Modal Curriculum Learning for Semi-Supervised Image Classification.

    PubMed

    Gong, Chen; Tao, Dacheng; Maybank, Stephen J; Liu, Wei; Kang, Guoliang; Yang, Jie

    2016-07-01

    Semi-supervised image classification aims to classify a large quantity of unlabeled images by typically harnessing scarce labeled images. Existing semi-supervised methods often suffer from inadequate classification accuracy when encountering difficult yet critical images, such as outliers, because they treat all unlabeled images equally and conduct classifications in an imperfectly ordered sequence. In this paper, we employ the curriculum learning methodology by investigating the difficulty of classifying every unlabeled image. The reliability and the discriminability of these unlabeled images are particularly investigated for evaluating their difficulty. As a result, an optimized image sequence is generated during the iterative propagations, and the unlabeled images are logically classified from simple to difficult. Furthermore, since images are usually characterized by multiple visual feature descriptors, we associate each kind of features with a teacher, and design a multi-modal curriculum learning (MMCL) strategy to integrate the information from different feature modalities. In each propagation, each teacher analyzes the difficulties of the currently unlabeled images from its own modality viewpoint. A consensus is subsequently reached among all the teachers, determining the currently simplest images (i.e., a curriculum), which are to be reliably classified by the multi-modal learner. This well-organized propagation process leveraging multiple teachers and one learner enables our MMCL to outperform five state-of-the-art methods on eight popular image data sets.

  4. Comparative Analysis of Document level Text Classification Algorithms using R

    NASA Astrophysics Data System (ADS)

    Syamala, Maganti; Nalini, N. J., Dr; Maguluri, Lakshamanaphaneendra; Ragupathy, R., Dr.

    2017-08-01

    From the past few decades there has been tremendous volumes of data available in Internet either in structured or unstructured form. Also, there is an exponential growth of information on Internet, so there is an emergent need of text classifiers. Text mining is an interdisciplinary field which draws attention on information retrieval, data mining, machine learning, statistics and computational linguistics. And to handle this situation, a wide range of supervised learning algorithms has been introduced. Among all these K-Nearest Neighbor(KNN) is efficient and simplest classifier in text classification family. But KNN suffers from imbalanced class distribution and noisy term features. So, to cope up with this challenge we use document based centroid dimensionality reduction(CentroidDR) using R Programming. By combining these two text classification techniques, KNN and Centroid classifiers, we propose a scalable and effective flat classifier, called MCenKNN which works well substantially better than CenKNN.

  5. Supervised Learning Applied to Air Traffic Trajectory Classification

    NASA Technical Reports Server (NTRS)

    Bosson, Christabelle S.; Nikoleris, Tasos

    2018-01-01

    Given the recent increase of interest in introducing new vehicle types and missions into the National Airspace System, a transition towards a more autonomous air traffic control system is required in order to enable and handle increased density and complexity. This paper presents an exploratory effort of the needed autonomous capabilities by exploring supervised learning techniques in the context of aircraft trajectories. In particular, it focuses on the application of machine learning algorithms and neural network models to a runway recognition trajectory-classification study. It investigates the applicability and effectiveness of various classifiers using datasets containing trajectory records for a month of air traffic. A feature importance and sensitivity analysis are conducted to challenge the chosen time-based datasets and the ten selected features. The study demonstrates that classification accuracy levels of 90% and above can be reached in less than 40 seconds of training for most machine learning classifiers when one track data point, described by the ten selected features at a particular time step, per trajectory is used as input. It also shows that neural network models can achieve similar accuracy levels but at higher training time costs.

  6. Semi-Supervised Projective Non-Negative Matrix Factorization for Cancer Classification.

    PubMed

    Zhang, Xiang; Guan, Naiyang; Jia, Zhilong; Qiu, Xiaogang; Luo, Zhigang

    2015-01-01

    Advances in DNA microarray technologies have made gene expression profiles a significant candidate in identifying different types of cancers. Traditional learning-based cancer identification methods utilize labeled samples to train a classifier, but they are inconvenient for practical application because labels are quite expensive in the clinical cancer research community. This paper proposes a semi-supervised projective non-negative matrix factorization method (Semi-PNMF) to learn an effective classifier from both labeled and unlabeled samples, thus boosting subsequent cancer classification performance. In particular, Semi-PNMF jointly learns a non-negative subspace from concatenated labeled and unlabeled samples and indicates classes by the positions of the maximum entries of their coefficients. Because Semi-PNMF incorporates statistical information from the large volume of unlabeled samples in the learned subspace, it can learn more representative subspaces and boost classification performance. We developed a multiplicative update rule (MUR) to optimize Semi-PNMF and proved its convergence. The experimental results of cancer classification for two multiclass cancer gene expression profile datasets show that Semi-PNMF outperforms the representative methods.

  7. Remote sensing change detection methods to track deforestation and growth in threatened rainforests in Madre de Dios, Peru

    USGS Publications Warehouse

    Shermeyer, Jacob S.; Haack, Barry N.

    2015-01-01

    Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.

  8. Semi-supervised SVM for individual tree crown species classification

    NASA Astrophysics Data System (ADS)

    Dalponte, Michele; Ene, Liviu Theodor; Marconcini, Mattia; Gobakken, Terje; Næsset, Erik

    2015-12-01

    In this paper a novel semi-supervised SVM classifier is presented, specifically developed for tree species classification at individual tree crown (ITC) level. In ITC tree species classification, all the pixels belonging to an ITC should have the same label. This assumption is used in the learning of the proposed semi-supervised SVM classifier (ITC-S3VM). This method exploits the information contained in the unlabeled ITC samples in order to improve the classification accuracy of a standard SVM. The ITC-S3VM method can be easily implemented using freely available software libraries. The datasets used in this study include hyperspectral imagery and laser scanning data acquired over two boreal forest areas characterized by the presence of three information classes (Pine, Spruce, and Broadleaves). The experimental results quantify the effectiveness of the proposed approach, which provides classification accuracies significantly higher (from 2% to above 27%) than those obtained by the standard supervised SVM and by a state-of-the-art semi-supervised SVM (S3VM). Particularly, by reducing the number of training samples (i.e. from 100% to 25%, and from 100% to 5% for the two datasets, respectively) the proposed method still exhibits results comparable to the ones of a supervised SVM trained with the full available training set. This property of the method makes it particularly suitable for practical forest inventory applications in which collection of in situ information can be very expensive both in terms of cost and time.

  9. Human Activity Recognition by Combining a Small Number of Classifiers.

    PubMed

    Nazabal, Alfredo; Garcia-Moreno, Pablo; Artes-Rodriguez, Antonio; Ghahramani, Zoubin

    2016-09-01

    We consider the problem of daily human activity recognition (HAR) using multiple wireless inertial sensors, and specifically, HAR systems with a very low number of sensors, each one providing an estimation of the performed activities. We propose new Bayesian models to combine the output of the sensors. The models are based on a soft outputs combination of individual classifiers to deal with the small number of sensors. We also incorporate the dynamic nature of human activities as a first-order homogeneous Markov chain. We develop both inductive and transductive inference methods for each model to be employed in supervised and semisupervised situations, respectively. Using different real HAR databases, we compare our classifiers combination models against a single classifier that employs all the signals from the sensors. Our models exhibit consistently a reduction of the error rate and an increase of robustness against sensor failures. Our models also outperform other classifiers combination models that do not consider soft outputs and an Markovian structure of the human activities.

  10. Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

    PubMed Central

    Álvarez, Aitor; Sierra, Basilio; Arruti, Andoni; López-Gil, Juan-Miguel; Garay-Vitoria, Nestor

    2015-01-01

    In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented to deal with speech emotion recognition. The new approach consists of an improvement of a bi-level multi-classifier system known as stacking generalization by means of an integration of an estimation of distribution algorithm (EDA) in the first layer to select the optimal subset from the standard base classifiers. The good performance of the proposed new paradigm was demonstrated over different configurations and datasets. First, several CSS stacking classifiers were constructed on the RekEmozio dataset, using some specific standard base classifiers and a total of 123 spectral, quality and prosodic features computed using in-house feature extraction algorithms. These initial CSS stacking classifiers were compared to other multi-classifier systems and the employed standard classifiers built on the same set of speech features. Then, new CSS stacking classifiers were built on RekEmozio using a different set of both acoustic parameters (extended version of the Geneva Minimalistic Acoustic Parameter Set (eGeMAPS)) and standard classifiers and employing the best meta-classifier of the initial experiments. The performance of these two CSS stacking classifiers was evaluated and compared. Finally, the new paradigm was tested on the well-known Berlin Emotional Speech database. We compared the performance of single, standard stacking and CSS stacking systems using the same parametrization of the second phase. All of the classifications were performed at the categorical level, including the six primary emotions plus the neutral one. PMID:26712757

  11. Deep Question Answering for protein annotation

    PubMed Central

    Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick

    2015-01-01

    Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/ PMID:26384372

  12. Deep Question Answering for protein annotation.

    PubMed

    Gobeill, Julien; Gaudinat, Arnaud; Pasche, Emilie; Vishnyakova, Dina; Gaudet, Pascale; Bairoch, Amos; Ruch, Patrick

    2015-01-01

    Biomedical professionals have access to a huge amount of literature, but when they use a search engine, they often have to deal with too many documents to efficiently find the appropriate information in a reasonable time. In this perspective, question-answering (QA) engines are designed to display answers, which were automatically extracted from the retrieved documents. Standard QA engines in literature process a user question, then retrieve relevant documents and finally extract some possible answers out of these documents using various named-entity recognition processes. In our study, we try to answer complex genomics questions, which can be adequately answered only using Gene Ontology (GO) concepts. Such complex answers cannot be found using state-of-the-art dictionary- and redundancy-based QA engines. We compare the effectiveness of two dictionary-based classifiers for extracting correct GO answers from a large set of 100 retrieved abstracts per question. In the same way, we also investigate the power of GOCat, a GO supervised classifier. GOCat exploits the GOA database to propose GO concepts that were annotated by curators for similar abstracts. This approach is called deep QA, as it adds an original classification step, and exploits curated biological data to infer answers, which are not explicitly mentioned in the retrieved documents. We show that for complex answers such as protein functional descriptions, the redundancy phenomenon has a limited effect. Similarly usual dictionary-based approaches are relatively ineffective. In contrast, we demonstrate how existing curated data, beyond information extraction, can be exploited by a supervised classifier, such as GOCat, to massively improve both the quantity and the quality of the answers with a +100% improvement for both recall and precision. Database URL: http://eagl.unige.ch/DeepQA4PA/. © The Author(s) 2015. Published by Oxford University Press.

  13. Recognition of medication information from discharge summaries using ensembles of classifiers.

    PubMed

    Doan, Son; Collier, Nigel; Xu, Hua; Pham, Hoang Duy; Tu, Minh Phuong

    2012-05-07

    Extraction of clinical information such as medications or problems from clinical text is an important task of clinical natural language processing (NLP). Rule-based methods are often used in clinical NLP systems because they are easy to adapt and customize. Recently, supervised machine learning methods have proven to be effective in clinical NLP as well. However, combining different classifiers to further improve the performance of clinical entity recognition systems has not been investigated extensively. Combining classifiers into an ensemble classifier presents both challenges and opportunities to improve performance in such NLP tasks. We investigated ensemble classifiers that used different voting strategies to combine outputs from three individual classifiers: a rule-based system, a support vector machine (SVM) based system, and a conditional random field (CRF) based system. Three voting methods were proposed and evaluated using the annotated data sets from the 2009 i2b2 NLP challenge: simple majority, local SVM-based voting, and local CRF-based voting. Evaluation on 268 manually annotated discharge summaries from the i2b2 challenge showed that the local CRF-based voting method achieved the best F-score of 90.84% (94.11% Precision, 87.81% Recall) for 10-fold cross-validation. We then compared our systems with the first-ranked system in the challenge by using the same training and test sets. Our system based on majority voting achieved a better F-score of 89.65% (93.91% Precision, 85.76% Recall) than the previously reported F-score of 89.19% (93.78% Precision, 85.03% Recall) by the first-ranked system in the challenge. Our experimental results using the 2009 i2b2 challenge datasets showed that ensemble classifiers that combine individual classifiers into a voting system could achieve better performance than a single classifier in recognizing medication information from clinical text. It suggests that simple strategies that can be easily implemented such as majority voting could have the potential to significantly improve clinical entity recognition.

  14. Multispectral and Panchromatic used Enhancement Resolution and Study Effective Enhancement on Supervised and Unsupervised Classification Land – Cover

    NASA Astrophysics Data System (ADS)

    Salman, S. S.; Abbas, W. A.

    2018-05-01

    The goal of the study is to support analysis Enhancement of Resolution and study effect on classification methods on bands spectral information of specific and quantitative approaches. In this study introduce a method to enhancement resolution Landsat 8 of combining the bands spectral of 30 meters resolution with panchromatic band 8 of 15 meters resolution, because of importance multispectral imagery to extracting land - cover. Classification methods used in this study to classify several lands -covers recorded from OLI- 8 imagery. Two methods of Data mining can be classified as either supervised or unsupervised. In supervised methods, there is a particular predefined target, that means the algorithm learn which values of the target are associated with which values of the predictor sample. K-nearest neighbors and maximum likelihood algorithms examine in this work as supervised methods. In other hand, no sample identified as target in unsupervised methods, the algorithm of data extraction searches for structure and patterns between all the variables, represented by Fuzzy C-mean clustering method as one of the unsupervised methods, NDVI vegetation index used to compare the results of classification method, the percent of dense vegetation in maximum likelihood method give a best results.

  15. Classifying the auditory P300 using mobile EEG recordings without calibration phase.

    PubMed

    Zink, R; Hunyádi, B; Van Huffel, S; De Vos, M

    2015-08-01

    One of the major drawbacks in mobile EEG Brain Computer Interfaces (BCI) is the need for subject specific training data to train a classifier. By removing the need for supervised classification and calibration phase, new users could start immediate interaction with a BCI. We propose a solution to exploit the structural difference by means of canonical polyadic decomposition (CPD) for three-class auditory oddball data without the need for subject-specific information. We achieve this by adding average event-related-potential (ERP) templates to the CPD model. This constitutes a novel similarity measure between single-trial pairs and known-templates, which results in a fast and interpretable classifier. These results have similar accuracy to those of the supervised and cross-validated stepwise LDA approach but without the need for having subject-dependent data. Therefore the described CPD method has a significant practical advantage over the traditional and widely used approach.

  16. Semi-supervised anomaly detection - towards model-independent searches of new physics

    NASA Astrophysics Data System (ADS)

    Kuusela, Mikael; Vatanen, Tommi; Malmi, Eric; Raiko, Tapani; Aaltonen, Timo; Nagai, Yoshikazu

    2012-06-01

    Most classification algorithms used in high energy physics fall under the category of supervised machine learning. Such methods require a training set containing both signal and background events and are prone to classification errors should this training data be systematically inaccurate for example due to the assumed MC model. To complement such model-dependent searches, we propose an algorithm based on semi-supervised anomaly detection techniques, which does not require a MC training sample for the signal data. We first model the background using a multivariate Gaussian mixture model. We then search for deviations from this model by fitting to the observations a mixture of the background model and a number of additional Gaussians. This allows us to perform pattern recognition of any anomalous excess over the background. We show by a comparison to neural network classifiers that such an approach is a lot more robust against misspecification of the signal MC than supervised classification. In cases where there is an unexpected signal, a neural network might fail to correctly identify it, while anomaly detection does not suffer from such a limitation. On the other hand, when there are no systematic errors in the training data, both methods perform comparably.

  17. Bayesian classification for the selection of in vitro human embryos using morphological and clinical data.

    PubMed

    Morales, Dinora Araceli; Bengoetxea, Endika; Larrañaga, Pedro; García, Miguel; Franco, Yosu; Fresnada, Mónica; Merino, Marisa

    2008-05-01

    In vitro fertilization (IVF) is a medically assisted reproduction technique that enables infertile couples to achieve successful pregnancy. Given the uncertainty of the treatment, we propose an intelligent decision support system based on supervised classification by Bayesian classifiers to aid to the selection of the most promising embryos that will form the batch to be transferred to the woman's uterus. The aim of the supervised classification system is to improve overall success rate of each IVF treatment in which a batch of embryos is transferred each time, where the success is achieved when implantation (i.e. pregnancy) is obtained. Due to ethical reasons, different legislative restrictions apply in every country on this technique. In Spain, legislation allows a maximum of three embryos to form each transfer batch. As a result, clinicians prefer to select the embryos by non-invasive embryo examination based on simple methods and observation focused on morphology and dynamics of embryo development after fertilization. This paper proposes the application of Bayesian classifiers to this embryo selection problem in order to provide a decision support system that allows a more accurate selection than with the actual procedures which fully rely on the expertise and experience of embryologists. For this, we propose to take into consideration a reduced subset of feature variables related to embryo morphology and clinical data of patients, and from this data to induce Bayesian classification models. Results obtained applying a filter technique to choose the subset of variables, and the performance of Bayesian classifiers using them, are presented.

  18. Automated prediction of tissue outcome after acute ischemic stroke in computed tomography perfusion images

    NASA Astrophysics Data System (ADS)

    Vos, Pieter C.; Bennink, Edwin; de Jong, Hugo; Velthuis, Birgitta K.; Viergever, Max A.; Dankbaar, Jan Willem

    2015-03-01

    Assessment of the extent of cerebral damage on admission in patients with acute ischemic stroke could play an important role in treatment decision making. Computed tomography perfusion (CTP) imaging can be used to determine the extent of damage. However, clinical application is hindered by differences among vendors and used methodology. As a result, threshold based methods and visual assessment of CTP images has not yet shown to be useful in treatment decision making and predicting clinical outcome. Preliminary results in MR studies have shown the benefit of using supervised classifiers for predicting tissue outcome, but this has not been demonstrated for CTP. We present a novel method for the automatic prediction of tissue outcome by combining multi-parametric CTP images into a tissue outcome probability map. A supervised classification scheme was developed to extract absolute and relative perfusion values from processed CTP images that are summarized by a trained classifier into a likelihood of infarction. Training was performed using follow-up CT scans of 20 acute stroke patients with complete recanalization of the vessel that was occluded on admission. Infarcted regions were annotated by expert neuroradiologists. Multiple classifiers were evaluated in a leave-one-patient-out strategy for their discriminating performance using receiver operating characteristic (ROC) statistics. Results showed that a RandomForest classifier performed optimally with an area under the ROC of 0.90 for discriminating infarct tissue. The obtained results are an improvement over existing thresholding methods and are in line with results found in literature where MR perfusion was used.

  19. Multilabel user classification using the community structure of online networks

    PubMed Central

    Papadopoulos, Symeon; Kompatsiaris, Yiannis

    2017-01-01

    We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user’s graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score. PMID:28278242

  20. Multilabel user classification using the community structure of online networks.

    PubMed

    Rizos, Georgios; Papadopoulos, Symeon; Kompatsiaris, Yiannis

    2017-01-01

    We study the problem of semi-supervised, multi-label user classification of networked data in the online social platform setting. We propose a framework that combines unsupervised community extraction and supervised, community-based feature weighting before training a classifier. We introduce Approximate Regularized Commute-Time Embedding (ARCTE), an algorithm that projects the users of a social graph onto a latent space, but instead of packing the global structure into a matrix of predefined rank, as many spectral and neural representation learning methods do, it extracts local communities for all users in the graph in order to learn a sparse embedding. To this end, we employ an improvement of personalized PageRank algorithms for searching locally in each user's graph structure. Then, we perform supervised community feature weighting in order to boost the importance of highly predictive communities. We assess our method performance on the problem of user classification by performing an extensive comparative study among various recent methods based on graph embeddings. The comparison shows that ARCTE significantly outperforms the competition in almost all cases, achieving up to 35% relative improvement compared to the second best competing method in terms of F1-score.

  1. An Event-Driven Classifier for Spiking Neural Networks Fed with Synthetic or Dynamic Vision Sensor Data.

    PubMed

    Stromatias, Evangelos; Soto, Miguel; Serrano-Gotarredona, Teresa; Linares-Barranco, Bernabé

    2017-01-01

    This paper introduces a novel methodology for training an event-driven classifier within a Spiking Neural Network (SNN) System capable of yielding good classification results when using both synthetic input data and real data captured from Dynamic Vision Sensor (DVS) chips. The proposed supervised method uses the spiking activity provided by an arbitrary topology of prior SNN layers to build histograms and train the classifier in the frame domain using the stochastic gradient descent algorithm. In addition, this approach can cope with leaky integrate-and-fire neuron models within the SNN, a desirable feature for real-world SNN applications, where neural activation must fade away after some time in the absence of inputs. Consequently, this way of building histograms captures the dynamics of spikes immediately before the classifier. We tested our method on the MNIST data set using different synthetic encodings and real DVS sensory data sets such as N-MNIST, MNIST-DVS, and Poker-DVS using the same network topology and feature maps. We demonstrate the effectiveness of our approach by achieving the highest classification accuracy reported on the N-MNIST (97.77%) and Poker-DVS (100%) real DVS data sets to date with a spiking convolutional network. Moreover, by using the proposed method we were able to retrain the output layer of a previously reported spiking neural network and increase its performance by 2%, suggesting that the proposed classifier can be used as the output layer in works where features are extracted using unsupervised spike-based learning methods. In addition, we also analyze SNN performance figures such as total event activity and network latencies, which are relevant for eventual hardware implementations. In summary, the paper aggregates unsupervised-trained SNNs with a supervised-trained SNN classifier, combining and applying them to heterogeneous sets of benchmarks, both synthetic and from real DVS chips.

  2. Naive scoring of human sleep based on a hidden Markov model of the electroencephalogram.

    PubMed

    Yaghouby, Farid; Modur, Pradeep; Sunderam, Sridhar

    2014-01-01

    Clinical sleep scoring involves tedious visual review of overnight polysomnograms by a human expert. Many attempts have been made to automate the process by training computer algorithms such as support vector machines and hidden Markov models (HMMs) to replicate human scoring. Such supervised classifiers are typically trained on scored data and then validated on scored out-of-sample data. Here we describe a methodology based on HMMs for scoring an overnight sleep recording without the benefit of a trained initial model. The number of states in the data is not known a priori and is optimized using a Bayes information criterion. When tested on a 22-subject database, this unsupervised classifier agreed well with human scores (mean of Cohen's kappa > 0.7). The HMM also outperformed other unsupervised classifiers (Gaussian mixture models, k-means, and linkage trees), that are capable of naive classification but do not model dynamics, by a significant margin (p < 0.05).

  3. Unsupervised, Robust Estimation-based Clustering for Multispectral Images

    NASA Technical Reports Server (NTRS)

    Netanyahu, Nathan S.

    1997-01-01

    To prepare for the challenge of handling the archiving and querying of terabyte-sized scientific spatial databases, the NASA Goddard Space Flight Center's Applied Information Sciences Branch (AISB, Code 935) developed a number of characterization algorithms that rely on supervised clustering techniques. The research reported upon here has been aimed at continuing the evolution of some of these supervised techniques, namely the neural network and decision tree-based classifiers, plus extending the approach to incorporating unsupervised clustering algorithms, such as those based on robust estimation (RE) techniques. The algorithms developed under this task should be suited for use by the Intelligent Information Fusion System (IIFS) metadata extraction modules, and as such these algorithms must be fast, robust, and anytime in nature. Finally, so that the planner/schedule module of the IlFS can oversee the use and execution of these algorithms, all information required by the planner/scheduler must be provided to the IIFS development team to ensure the timely integration of these algorithms into the overall system.

  4. Classification of document page images based on visual similarity of layout structures

    NASA Astrophysics Data System (ADS)

    Shin, Christian K.; Doermann, David S.

    1999-12-01

    Searching for documents by their type or genre is a natural way to enhance the effectiveness of document retrieval. The layout of a document contains a significant amount of information that can be used to classify a document's type in the absence of domain specific models. A document type or genre can be defined by the user based primarily on layout structure. Our classification approach is based on 'visual similarity' of the layout structure by building a supervised classifier, given examples of the class. We use image features, such as the percentages of tex and non-text (graphics, image, table, and ruling) content regions, column structures, variations in the point size of fonts, the density of content area, and various statistics on features of connected components which can be derived from class samples without class knowledge. In order to obtain class labels for training samples, we conducted a user relevance test where subjects ranked UW-I document images with respect to the 12 representative images. We implemented our classification scheme using the OC1, a decision tree classifier, and report our findings.

  5. Peculiarities of use of ECOC and AdaBoost based classifiers for thematic processing of hyperspectral data

    NASA Astrophysics Data System (ADS)

    Dementev, A. O.; Dmitriev, E. V.; Kozoderov, V. V.; Egorov, V. D.

    2017-10-01

    Hyperspectral imaging is up-to-date promising technology widely applied for the accurate thematic mapping. The presence of a large number of narrow survey channels allows us to use subtle differences in spectral characteristics of objects and to make a more detailed classification than in the case of using standard multispectral data. The difficulties encountered in the processing of hyperspectral images are usually associated with the redundancy of spectral information which leads to the problem of the curse of dimensionality. Methods currently used for recognizing objects on multispectral and hyperspectral images are usually based on standard base supervised classification algorithms of various complexity. Accuracy of these algorithms can be significantly different depending on considered classification tasks. In this paper we study the performance of ensemble classification methods for the problem of classification of the forest vegetation. Error correcting output codes and boosting are tested on artificial data and real hyperspectral images. It is demonstrates, that boosting gives more significant improvement when used with simple base classifiers. The accuracy in this case in comparable the error correcting output code (ECOC) classifier with Gaussian kernel SVM base algorithm. However the necessity of boosting ECOC with Gaussian kernel SVM is questionable. It is demonstrated, that selected ensemble classifiers allow us to recognize forest species with high enough accuracy which can be compared with ground-based forest inventory data.

  6. Mobile robots traversability awareness based on terrain visual sensory data fusion

    NASA Astrophysics Data System (ADS)

    Shirkhodaie, Amir

    2007-04-01

    In this paper, we have presented methods that significantly improve the robot awareness of its terrain traversability conditions. The terrain traversability awareness is achieved by association of terrain image appearances from different poses and fusion of extracted information from multimodality imaging and range sensor data for localization and clustering environment landmarks. Initially, we describe methods for extraction of salient features of the terrain for the purpose of landmarks registration from two or more images taken from different via points along the trajectory path of the robot. The method of image registration is applied as a means of overlaying (two or more) of the same terrain scene at different viewpoints. The registration geometrically aligns salient landmarks of two images (the reference and sensed images). A Similarity matching techniques is proposed for matching the terrain salient landmarks. Secondly, we present three terrain classifier models based on rule-based, supervised neural network, and fuzzy logic for classification of terrain condition under uncertainty and mapping the robot's terrain perception to apt traversability measures. This paper addresses the technical challenges and navigational skill requirements of mobile robots for traversability path planning in natural terrain environments similar to Mars surface terrains. We have described different methods for detection of salient terrain features based on imaging texture analysis techniques. We have also presented three competing techniques for terrain traversability assessment of mobile robots navigating in unstructured natural terrain environments. These three techniques include: a rule-based terrain classifier, a neural network-based terrain classifier, and a fuzzy-logic terrain classifier. Each proposed terrain classifier divides a region of natural terrain into finite sub-terrain regions and classifies terrain condition exclusively within each sub-terrain region based on terrain spatial and textural cues.

  7. Purification of Training Samples Based on Spectral Feature and Superpixel Segmentation

    NASA Astrophysics Data System (ADS)

    Guan, X.; Qi, W.; He, J.; Wen, Q.; Chen, T.; Wang, Z.

    2018-04-01

    Remote sensing image classification is an effective way to extract information from large volumes of high-spatial resolution remote sensing images. Generally, supervised image classification relies on abundant and high-precision training data, which is often manually interpreted by human experts to provide ground truth for training and evaluating the performance of the classifier. Remote sensing enterprises accumulated lots of manually interpreted products from early lower-spatial resolution remote sensing images by executing their routine research and business programs. However, these manually interpreted products may not match the very high resolution (VHR) image properly because of different dates or spatial resolution of both data, thus, hindering suitability of manually interpreted products in training classification models, or small coverage area of these manually interpreted products. We also face similar problems in our laboratory in 21st Century Aerospace Technology Co. Ltd (short for 21AT). In this work, we propose a method to purify the interpreted product to match newly available VHRI data and provide the best training data for supervised image classifiers in VHR image classification. And results indicate that our proposed method can efficiently purify the input data for future machine learning use.

  8. Supervised neural network classification of pre-sliced cooked pork ham images using quaternionic singular values.

    PubMed

    Valous, Nektarios A; Mendoza, Fernando; Sun, Da-Wen; Allen, Paul

    2010-03-01

    The quaternionic singular value decomposition is a technique to decompose a quaternion matrix (representation of a colour image) into quaternion singular vector and singular value component matrices exposing useful properties. The objective of this study was to use a small portion of uncorrelated singular values, as robust features for the classification of sliced pork ham images, using a supervised artificial neural network classifier. Images were acquired from four qualities of sliced cooked pork ham typically consumed in Ireland (90 slices per quality), having similar appearances. Mahalanobis distances and Pearson product moment correlations were used for feature selection. Six highly discriminating features were used as input to train the neural network. An adaptive feedforward multilayer perceptron classifier was employed to obtain a suitable mapping from the input dataset. The overall correct classification performance for the training, validation and test set were 90.3%, 94.4%, and 86.1%, respectively. The results confirm that the classification performance was satisfactory. Extracting the most informative features led to the recognition of a set of different but visually quite similar textural patterns based on quaternionic singular values. Copyright 2009 Elsevier Ltd. All rights reserved.

  9. Weakly Supervised Segmentation-Aided Classification of Urban Scenes from 3d LIDAR Point Clouds

    NASA Astrophysics Data System (ADS)

    Guinard, S.; Landrieu, L.

    2017-05-01

    We consider the problem of the semantic classification of 3D LiDAR point clouds obtained from urban scenes when the training set is limited. We propose a non-parametric segmentation model for urban scenes composed of anthropic objects of simple shapes, partionning the scene into geometrically-homogeneous segments which size is determined by the local complexity. This segmentation can be integrated into a conditional random field classifier (CRF) in order to capture the high-level structure of the scene. For each cluster, this allows us to aggregate the noisy predictions of a weakly-supervised classifier to produce a higher confidence data term. We demonstrate the improvement provided by our method over two publicly-available large-scale data sets.

  10. Material classification and automatic content enrichment of images using supervised learning and knowledge bases

    NASA Astrophysics Data System (ADS)

    Mallepudi, Sri Abhishikth; Calix, Ricardo A.; Knapp, Gerald M.

    2011-02-01

    In recent years there has been a rapid increase in the size of video and image databases. Effective searching and retrieving of images from these databases is a significant current research area. In particular, there is a growing interest in query capabilities based on semantic image features such as objects, locations, and materials, known as content-based image retrieval. This study investigated mechanisms for identifying materials present in an image. These capabilities provide additional information impacting conditional probabilities about images (e.g. objects made of steel are more likely to be buildings). These capabilities are useful in Building Information Modeling (BIM) and in automatic enrichment of images. I2T methodologies are a way to enrich an image by generating text descriptions based on image analysis. In this work, a learning model is trained to detect certain materials in images. To train the model, an image dataset was constructed containing single material images of bricks, cloth, grass, sand, stones, and wood. For generalization purposes, an additional set of 50 images containing multiple materials (some not used in training) was constructed. Two different supervised learning classification models were investigated: a single multi-class SVM classifier, and multiple binary SVM classifiers (one per material). Image features included Gabor filter parameters for texture, and color histogram data for RGB components. All classification accuracy scores using the SVM-based method were above 85%. The second model helped in gathering more information from the images since it assigned multiple classes to the images. A framework for the I2T methodology is presented.

  11. A Dirichlet process model for classifying and forecasting epidemic curves

    PubMed Central

    2014-01-01

    Background A forecast can be defined as an endeavor to quantitatively estimate a future event or probabilities assigned to a future occurrence. Forecasting stochastic processes such as epidemics is challenging since there are several biological, behavioral, and environmental factors that influence the number of cases observed at each point during an epidemic. However, accurate forecasts of epidemics would impact timely and effective implementation of public health interventions. In this study, we introduce a Dirichlet process (DP) model for classifying and forecasting influenza epidemic curves. Methods The DP model is a nonparametric Bayesian approach that enables the matching of current influenza activity to simulated and historical patterns, identifies epidemic curves different from those observed in the past and enables prediction of the expected epidemic peak time. The method was validated using simulated influenza epidemics from an individual-based model and the accuracy was compared to that of the tree-based classification technique, Random Forest (RF), which has been shown to achieve high accuracy in the early prediction of epidemic curves using a classification approach. We also applied the method to forecasting influenza outbreaks in the United States from 1997–2013 using influenza-like illness (ILI) data from the Centers for Disease Control and Prevention (CDC). Results We made the following observations. First, the DP model performed as well as RF in identifying several of the simulated epidemics. Second, the DP model correctly forecasted the peak time several days in advance for most of the simulated epidemics. Third, the accuracy of identifying epidemics different from those already observed improved with additional data, as expected. Fourth, both methods correctly classified epidemics with higher reproduction numbers (R) with a higher accuracy compared to epidemics with lower R values. Lastly, in the classification of seasonal influenza epidemics based on ILI data from the CDC, the methods’ performance was comparable. Conclusions Although RF requires less computational time compared to the DP model, the algorithm is fully supervised implying that epidemic curves different from those previously observed will always be misclassified. In contrast, the DP model can be unsupervised, semi-supervised or fully supervised. Since both methods have their relative merits, an approach that uses both RF and the DP model could be beneficial. PMID:24405642

  12. A comparison of supervised classification methods for the prediction of substrate type using multibeam acoustic and legacy grain-size data.

    PubMed

    Stephens, David; Diesing, Markus

    2014-01-01

    Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now obvious that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) were tested to assess the benefits of using more sophisticated approaches. The best performing models were tree based methods and Naive Bayes which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. The models that used all input features didn't generally perform well, highlighting the need for some means of feature selection.

  13. Comparison Of Semi-Automatic And Automatic Slick Detection Algorithms For Jiyeh Power Station Oil Spill, Lebanon

    NASA Astrophysics Data System (ADS)

    Osmanoglu, B.; Ozkan, C.; Sunar, F.

    2013-10-01

    After air strikes on July 14 and 15, 2006 the Jiyeh Power Station started leaking oil into the eastern Mediterranean Sea. The power station is located about 30 km south of Beirut and the slick covered about 170 km of coastline threatening the neighboring countries Turkey and Cyprus. Due to the ongoing conflict between Israel and Lebanon, cleaning efforts could not start immediately resulting in 12 000 to 15 000 tons of fuel oil leaking into the sea. In this paper we compare results from automatic and semi-automatic slick detection algorithms. The automatic detection method combines the probabilities calculated for each pixel from each image to obtain a joint probability, minimizing the adverse effects of atmosphere on oil spill detection. The method can readily utilize X-, C- and L-band data where available. Furthermore wind and wave speed observations can be used for a more accurate analysis. For this study, we utilize Envisat ASAR ScanSAR data. A probability map is generated based on the radar backscatter, effect of wind and dampening value. The semi-automatic algorithm is based on supervised classification. As a classifier, Artificial Neural Network Multilayer Perceptron (ANN MLP) classifier is used since it is more flexible and efficient than conventional maximum likelihood classifier for multisource and multi-temporal data. The learning algorithm for ANN MLP is chosen as the Levenberg-Marquardt (LM). Training and test data for supervised classification are composed from the textural information created from SAR images. This approach is semiautomatic because tuning the parameters of classifier and composing training data need a human interaction. We point out the similarities and differences between the two methods and their results as well as underlining their advantages and disadvantages. Due to the lack of ground truth data, we compare obtained results to each other, as well as other published oil slick area assessments.

  14. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection

    PubMed Central

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-01-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices. PMID:25177107

  15. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    PubMed

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices.

  16. Evaluation of Alzheimer's disease by analysis of MR images using Objective Dialectical Classifiers as an alternative to ADC maps.

    PubMed

    Dos Santos, Wellington P; de Assis, Francisco M; de Souza, Ricardo E; Dos Santos Filho, Plinio B

    2008-01-01

    Alzheimer's disease is the most common cause of dementia, yet hard to diagnose precisely without invasive techniques, particularly at the onset of the disease. This work approaches image analysis and classification of synthetic multispectral images composed by diffusion-weighted (DW) magnetic resonance (MR) cerebral images for the evaluation of cerebrospinal fluid area and measuring the advance of Alzheimer's disease. A clinical 1.5 T MR imaging system was used to acquire all images presented. The classification methods are based on Objective Dialectical Classifiers, a new method based on Dialectics as defined in the Philosophy of Praxis. A 2-degree polynomial network with supervised training is used to generate the ground truth image. The classification results are used to improve the usual analysis of the apparent diffusion coefficient map.

  17. Comparing supervised learning techniques on the task of physical activity recognition.

    PubMed

    Dalton, A; OLaighin, G

    2013-01-01

    The objective of this study was to compare the performance of base-level and meta-level classifiers on the task of physical activity recognition. Five wireless kinematic sensors were attached to each subject (n = 25) while they completed a range of basic physical activities in a controlled laboratory setting. Subjects were then asked to carry out similar self-annotated physical activities in a random order and in an unsupervised environment. A combination of time-domain and frequency-domain features were extracted from the sensor data including the first four central moments, zero-crossing rate, average magnitude, sensor cross-correlation, sensor auto-correlation, spectral entropy and dominant frequency components. A reduced feature set was generated using a wrapper subset evaluation technique with a linear forward search and this feature set was employed for classifier comparison. The meta-level classifier AdaBoostM1 with C4.5 Graft as its base-level classifier achieved an overall accuracy of 95%. Equal sized datasets of subject independent data and subject dependent data were used to train this classifier and high recognition rates could be achieved without the need for user specific training. Furthermore, it was found that an accuracy of 88% could be achieved using data from the ankle and wrist sensors only.

  18. Mapping Land Cover Types in Amazon Basin Using 1km JERS-1 Mosaic

    NASA Technical Reports Server (NTRS)

    Saatchi, Sassan S.; Nelson, Bruce; Podest, Erika; Holt, John

    2000-01-01

    In this paper, the 100 meter JERS-1 Amazon mosaic image was used in a new classifier to generate a I km resolution land cover map. The inputs to the classifier were 1 km resolution mean backscatter and seven first order texture measures derived from the 100 m data by using a 10 x 10 independent sampling window. The classification approach included two interdependent stages: 1) a supervised maximum a posteriori Bayesian approach to classify the mean backscatter image into 5 general land cover categories of forest, savannah, inundated, white sand, and anthropogenic vegetation classes, and 2) a texture measure decision rule approach to further discriminate subcategory classes based on taxonomic information and biomass levels. Fourteen classes were successfully separated at 1 km scale. The results were verified by examining the accuracy of the approach by comparison with the IBGE and the AVHRR 1 km resolution land cover maps.

  19. Effective Sequential Classifier Training for SVM-Based Multitemporal Remote Sensing Image Classification

    NASA Astrophysics Data System (ADS)

    Guo, Yiqing; Jia, Xiuping; Paull, David

    2018-06-01

    The explosive availability of remote sensing images has challenged supervised classification algorithms such as Support Vector Machines (SVM), as training samples tend to be highly limited due to the expensive and laborious task of ground truthing. The temporal correlation and spectral similarity between multitemporal images have opened up an opportunity to alleviate this problem. In this study, a SVM-based Sequential Classifier Training (SCT-SVM) approach is proposed for multitemporal remote sensing image classification. The approach leverages the classifiers of previous images to reduce the required number of training samples for the classifier training of an incoming image. For each incoming image, a rough classifier is firstly predicted based on the temporal trend of a set of previous classifiers. The predicted classifier is then fine-tuned into a more accurate position with current training samples. This approach can be applied progressively to sequential image data, with only a small number of training samples being required from each image. Experiments were conducted with Sentinel-2A multitemporal data over an agricultural area in Australia. Results showed that the proposed SCT-SVM achieved better classification accuracies compared with two state-of-the-art model transfer algorithms. When training data are insufficient, the overall classification accuracy of the incoming image was improved from 76.18% to 94.02% with the proposed SCT-SVM, compared with those obtained without the assistance from previous images. These results demonstrate that the leverage of a priori information from previous images can provide advantageous assistance for later images in multitemporal image classification.

  20. Combination of mass spectrometry-based targeted lipidomics and supervised machine learning algorithms in detecting adulterated admixtures of white rice.

    PubMed

    Lim, Dong Kyu; Long, Nguyen Phuoc; Mo, Changyeun; Dong, Ziyuan; Cui, Lingmei; Kim, Giyoung; Kwon, Sung Won

    2017-10-01

    The mixing of extraneous ingredients with original products is a common adulteration practice in food and herbal medicines. In particular, authenticity of white rice and its corresponding blended products has become a key issue in food industry. Accordingly, our current study aimed to develop and evaluate a novel discrimination method by combining targeted lipidomics with powerful supervised learning methods, and eventually introduce a platform to verify the authenticity of white rice. A total of 30 cultivars were collected, and 330 representative samples of white rice from Korea and China as well as seven mixing ratios were examined. Random forests (RF), support vector machines (SVM) with a radial basis function kernel, C5.0, model averaged neural network, and k-nearest neighbor classifiers were used for the classification. We achieved desired results, and the classifiers effectively differentiated white rice from Korea to blended samples with high prediction accuracy for the contamination ratio as low as five percent. In addition, RF and SVM classifiers were generally superior to and more robust than the other techniques. Our approach demonstrated that the relative differences in lysoGPLs can be successfully utilized to detect the adulterated mixing of white rice originating from different countries. In conclusion, the present study introduces a novel and high-throughput platform that can be applied to authenticate adulterated admixtures from original white rice samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition.

    PubMed

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K L

    2017-02-27

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called 'shadow features' are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one is by using wearable sensor and the other is by a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research.

  2. Training Classifiers with Shadow Features for Sensor-Based Human Activity Recognition

    PubMed Central

    Fong, Simon; Song, Wei; Cho, Kyungeun; Wong, Raymond; Wong, Kelvin K. L.

    2017-01-01

    In this paper, a novel training/testing process for building/using a classification model based on human activity recognition (HAR) is proposed. Traditionally, HAR has been accomplished by a classifier that learns the activities of a person by training with skeletal data obtained from a motion sensor, such as Microsoft Kinect. These skeletal data are the spatial coordinates (x, y, z) of different parts of the human body. The numeric information forms time series, temporal records of movement sequences that can be used for training a classifier. In addition to the spatial features that describe current positions in the skeletal data, new features called ‘shadow features’ are used to improve the supervised learning efficacy of the classifier. Shadow features are inferred from the dynamics of body movements, and thereby modelling the underlying momentum of the performed activities. They provide extra dimensions of information for characterising activities in the classification process, and thereby significantly improve the classification accuracy. Two cases of HAR are tested using a classification model trained with shadow features: one is by using wearable sensor and the other is by a Kinect-based remote sensor. Our experiments can demonstrate the advantages of the new method, which will have an impact on human activity detection research. PMID:28264470

  3. Benchmarking protein classification algorithms via supervised cross-validation.

    PubMed

    Kertész-Farkas, Attila; Dhir, Somdutta; Sonego, Paolo; Pacurar, Mircea; Netoteia, Sergiu; Nijveen, Harm; Kuzniar, Arnold; Leunissen, Jack A M; Kocsor, András; Pongor, Sándor

    2008-04-24

    Development and testing of protein classification algorithms are hampered by the fact that the protein universe is characterized by groups vastly different in the number of members, in average protein size, similarity within group, etc. Datasets based on traditional cross-validation (k-fold, leave-one-out, etc.) may not give reliable estimates on how an algorithm will generalize to novel, distantly related subtypes of the known protein classes. Supervised cross-validation, i.e., selection of test and train sets according to the known subtypes within a database has been successfully used earlier in conjunction with the SCOP database. Our goal was to extend this principle to other databases and to design standardized benchmark datasets for protein classification. Hierarchical classification trees of protein categories provide a simple and general framework for designing supervised cross-validation strategies for protein classification. Benchmark datasets can be designed at various levels of the concept hierarchy using a simple graph-theoretic distance. A combination of supervised and random sampling was selected to construct reduced size model datasets, suitable for algorithm comparison. Over 3000 new classification tasks were added to our recently established protein classification benchmark collection that currently includes protein sequence (including protein domains and entire proteins), protein structure and reading frame DNA sequence data. We carried out an extensive evaluation based on various machine-learning algorithms such as nearest neighbor, support vector machines, artificial neural networks, random forests and logistic regression, used in conjunction with comparison algorithms, BLAST, Smith-Waterman, Needleman-Wunsch, as well as 3D comparison methods DALI and PRIDE. The resulting datasets provide lower, and in our opinion more realistic estimates of the classifier performance than do random cross-validation schemes. A combination of supervised and random sampling was used to construct model datasets, suitable for algorithm comparison.

  4. Combining active learning and semi-supervised learning techniques to extract protein interaction sentences.

    PubMed

    Song, Min; Yu, Hwanjo; Han, Wook-Shin

    2011-11-24

    Protein-protein interaction (PPI) extraction has been a focal point of many biomedical research and database curation tools. Both Active Learning and Semi-supervised SVMs have recently been applied to extract PPI automatically. In this paper, we explore combining the AL with the SSL to improve the performance of the PPI task. We propose a novel PPI extraction technique called PPISpotter by combining Deterministic Annealing-based SSL and an AL technique to extract protein-protein interaction. In addition, we extract a comprehensive set of features from MEDLINE records by Natural Language Processing (NLP) techniques, which further improve the SVM classifiers. In our feature selection technique, syntactic, semantic, and lexical properties of text are incorporated into feature selection that boosts the system performance significantly. By conducting experiments with three different PPI corpuses, we show that PPISpotter is superior to the other techniques incorporated into semi-supervised SVMs such as Random Sampling, Clustering, and Transductive SVMs by precision, recall, and F-measure. Our system is a novel, state-of-the-art technique for efficiently extracting protein-protein interaction pairs.

  5. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer

    PubMed Central

    Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant

    2015-01-01

    Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029

  6. [Quantitative classification in catering trade and countermeasures of supervision and management in Hunan Province].

    PubMed

    Liu, Xiulan; Chen, Lizhang; He, Xiang

    2012-02-01

    To analyze the status quo of quantitative classification in Hunan Province catering industry, and to discuss the countermeasures in-depth. According to relevant laws and regulations, and after referring to Daily supervision and quantitative scoring sheet and consulting experts, a checklist of key supervision indicators was made. The implementation of quantitative classification in 10 cities in Hunan Province was studied, and the status quo was analyzed. All the 390 catering units implemented quantitative classified management. The larger the catering enterprise, the higher level of quantitative classification. In addition to cafeterias, the smaller the catering units, the higher point of deduction, and snack bars and beverage stores were the highest. For those quantified and classified as C and D, the point of deduction was higher in the procurement and storage of raw materials, operation processing and other aspects. The quantitative classification of Hunan Province has relatively wide coverage. There are hidden risks in food security in small catering units, snack bars, and beverage stores. The food hygienic condition of Hunan Province needs to be improved.

  7. Maximum margin semi-supervised learning with irrelevant data.

    PubMed

    Yang, Haiqin; Huang, Kaizhu; King, Irwin; Lyu, Michael R

    2015-10-01

    Semi-supervised learning (SSL) is a typical learning paradigms training a model from both labeled and unlabeled data. The traditional SSL models usually assume unlabeled data are relevant to the labeled data, i.e., following the same distributions of the targeted labeled data. In this paper, we address a different, yet formidable scenario in semi-supervised classification, where the unlabeled data may contain irrelevant data to the labeled data. To tackle this problem, we develop a maximum margin model, named tri-class support vector machine (3C-SVM), to utilize the available training data, while seeking a hyperplane for separating the targeted data well. Our 3C-SVM exhibits several characteristics and advantages. First, it does not need any prior knowledge and explicit assumption on the data relatedness. On the contrary, it can relieve the effect of irrelevant unlabeled data based on the logistic principle and maximum entropy principle. That is, 3C-SVM approaches an ideal classifier. This classifier relies heavily on labeled data and is confident on the relevant data lying far away from the decision hyperplane, while maximally ignoring the irrelevant data, which are hardly distinguished. Second, theoretical analysis is provided to prove that in what condition, the irrelevant data can help to seek the hyperplane. Third, 3C-SVM is a generalized model that unifies several popular maximum margin models, including standard SVMs, Semi-supervised SVMs (S(3)VMs), and SVMs learned from the universum (U-SVMs) as its special cases. More importantly, we deploy a concave-convex produce to solve the proposed 3C-SVM, transforming the original mixed integer programming, to a semi-definite programming relaxation, and finally to a sequence of quadratic programming subproblems, which yields the same worst case time complexity as that of S(3)VMs. Finally, we demonstrate the effectiveness and efficiency of our proposed 3C-SVM through systematical experimental comparisons. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Feature extraction based on semi-supervised kernel Marginal Fisher analysis and its application in bearing fault diagnosis

    NASA Astrophysics Data System (ADS)

    Jiang, Li; Xuan, Jianping; Shi, Tielin

    2013-12-01

    Generally, the vibration signals of faulty machinery are non-stationary and nonlinear under complicated operating conditions. Therefore, it is a big challenge for machinery fault diagnosis to extract optimal features for improving classification accuracy. This paper proposes semi-supervised kernel Marginal Fisher analysis (SSKMFA) for feature extraction, which can discover the intrinsic manifold structure of dataset, and simultaneously consider the intra-class compactness and the inter-class separability. Based on SSKMFA, a novel approach to fault diagnosis is put forward and applied to fault recognition of rolling bearings. SSKMFA directly extracts the low-dimensional characteristics from the raw high-dimensional vibration signals, by exploiting the inherent manifold structure of both labeled and unlabeled samples. Subsequently, the optimal low-dimensional features are fed into the simplest K-nearest neighbor (KNN) classifier to recognize different fault categories and severities of bearings. The experimental results demonstrate that the proposed approach improves the fault recognition performance and outperforms the other four feature extraction methods.

  9. Automated object-based classification of rain-induced landslides with VHR multispectral images in Madeira Island

    NASA Astrophysics Data System (ADS)

    Heleno, S.; Matias, M.; Pina, P.; Sousa, A. J.

    2015-09-01

    A method for semi-automatic landslide detection, with the ability to separate source and run-out areas, is presented in this paper. It combines object-based image analysis and a Support Vector Machine classifier on a GeoEye-1 multispectral image, sensed 3 days after the major damaging landslide event that occurred in Madeira island (20 February 2010), with a pre-event LIDAR Digital Elevation Model. The testing is developed in a 15 km2-wide study area, where 95 % of the landslides scars are detected by this supervised approach. The classifier presents a good performance in the delineation of the overall landslide area. In addition, fair results are achieved in the separation of the source from the run-out landslide areas, although in less illuminated slopes this discrimination is less effective than in sunnier east facing-slopes.

  10. Unresolved Galaxy Classifier for ESA/Gaia mission: Support Vector Machines approach

    NASA Astrophysics Data System (ADS)

    Bellas-Velidis, Ioannis; Kontizas, Mary; Dapergolas, Anastasios; Livanou, Evdokia; Kontizas, Evangelos; Karampelas, Antonios

    A software package Unresolved Galaxy Classifier (UGC) is being developed for the ground-based pipeline of ESA's Gaia mission. It aims to provide an automated taxonomic classification and specific parameters estimation analyzing Gaia BP/RP instrument low-dispersion spectra of unresolved galaxies. The UGC algorithm is based on a supervised learning technique, the Support Vector Machines (SVM). The software is implemented in Java as two separate modules. An offline learning module provides functions for SVM-models training. Once trained, the set of models can be repeatedly applied to unknown galaxy spectra by the pipeline's application module. A library of galaxy models synthetic spectra, simulated for the BP/RP instrument, is used to train and test the modules. Science tests show a very good classification performance of UGC and relatively good regression performance, except for some of the parameters. Possible approaches to improve the performance are discussed.

  11. Challenges in discriminating profanity from hate speech

    NASA Astrophysics Data System (ADS)

    Malmasi, Shervin; Zampieri, Marcos

    2018-03-01

    In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes ?-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface ?-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.

  12. An intelligent identification algorithm for the monoclonal picking instrument

    NASA Astrophysics Data System (ADS)

    Yan, Hua; Zhang, Rongfu; Yuan, Xujun; Wang, Qun

    2017-11-01

    The traditional colony selection is mainly operated by manual mode, which takes on low efficiency and strong subjectivity. Therefore, it is important to develop an automatic monoclonal-picking instrument. The critical stage of the automatic monoclonal-picking and intelligent optimal selection is intelligent identification algorithm. An auto-screening algorithm based on Support Vector Machine (SVM) is proposed in this paper, which uses the supervised learning method, which combined with the colony morphological characteristics to classify the colony accurately. Furthermore, through the basic morphological features of the colony, system can figure out a series of morphological parameters step by step. Through the establishment of maximal margin classifier, and based on the analysis of the growth trend of the colony, the selection of the monoclonal colony was carried out. The experimental results showed that the auto-screening algorithm could screen out the regular colony from the other, which meets the requirement of various parameters.

  13. A Hybrid Semi-supervised Classification Scheme for Mining Multisource Geospatial Data

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vatsavai, Raju; Bhaduri, Budhendra L

    2011-01-01

    Supervised learning methods such as Maximum Likelihood (ML) are often used in land cover (thematic) classification of remote sensing imagery. ML classifier relies exclusively on spectral characteristics of thematic classes whose statistical distributions (class conditional probability densities) are often overlapping. The spectral response distributions of thematic classes are dependent on many factors including elevation, soil types, and ecological zones. A second problem with statistical classifiers is the requirement of large number of accurate training samples (10 to 30 |dimensions|), which are often costly and time consuming to acquire over large geographic regions. With the increasing availability of geospatial databases, itmore » is possible to exploit the knowledge derived from these ancillary datasets to improve classification accuracies even when the class distributions are highly overlapping. Likewise newer semi-supervised techniques can be adopted to improve the parameter estimates of statistical model by utilizing a large number of easily available unlabeled training samples. Unfortunately there is no convenient multivariate statistical model that can be employed for mulitsource geospatial databases. In this paper we present a hybrid semi-supervised learning algorithm that effectively exploits freely available unlabeled training samples from multispectral remote sensing images and also incorporates ancillary geospatial databases. We have conducted several experiments on real datasets, and our new hybrid approach shows over 25 to 35% improvement in overall classification accuracy over conventional classification schemes.« less

  14. Supervising community health workers in low-income countries – a review of impact and implementation issues

    PubMed Central

    Hill, Zelee; Dumbaugh, Mari; Benton, Lorna; Källander, Karin; Strachan, Daniel; Asbroek, Augustinus ten; Tibenderana, James; Kirkwood, Betty; Meek, Sylvia

    2014-01-01

    Background Community health workers (CHWs) are an increasingly important component of health systems and programs. Despite the recognized role of supervision in ensuring CHWs are effective, supervision is often weak and under-supported. Little is known about what constitutes adequate supervision and how different supervision strategies influence performance, motivation, and retention. Objective To determine the impact of supervision strategies used in low- and middle-income countries and discuss implementation and feasibility issues with a focus on CHWs. Design A search of peer-reviewed, English language articles evaluating health provider supervision strategies was conducted through November 2013. Included articles evaluated the impact of supervision in low- or middle-income countries using a controlled, pre-/post- or observational design. Implementation and feasibility literature included both peer-reviewed and gray literature. Results A total of 22 impact papers were identified. Papers were from a range of low- and middle-income countries addressing the supervision of a variety of health care providers. We classified interventions as testing supervision frequency, the supportive/facilitative supervision package, supervision mode (peer, group, and community), tools (self-assessment and checklists), focus (quality assurance/problem solving), and training. Outcomes included coverage, performance, and perception of quality but were not uniform across studies. Evidence suggests that improving supervision quality has a greater impact than increasing frequency of supervision alone. Supportive supervision packages, community monitoring, and quality improvement/problem-solving approaches show the most promise; however, evaluation of all strategies was weak. Conclusion Few supervision strategies have been rigorously tested and data on CHW supervision is particularly sparse. This review highlights the diversity of supervision approaches that policy makers have to choose from and, while choices should be context specific, our findings suggest that high-quality supervision that focuses on supportive approaches, community monitoring, and/or quality assurance/problem solving may be most effective. PMID:24815075

  15. Computer-aided diagnosis system: a Bayesian hybrid classification method.

    PubMed

    Calle-Alonso, F; Pérez, C J; Arias-Nicolás, J P; Martín, J

    2013-10-01

    A novel method to classify multi-class biomedical objects is presented. The method is based on a hybrid approach which combines pairwise comparison, Bayesian regression and the k-nearest neighbor technique. It can be applied in a fully automatic way or in a relevance feedback framework. In the latter case, the information obtained from both an expert and the automatic classification is iteratively used to improve the results until a certain accuracy level is achieved, then, the learning process is finished and new classifications can be automatically performed. The method has been applied in two biomedical contexts by following the same cross-validation schemes as in the original studies. The first one refers to cancer diagnosis, leading to an accuracy of 77.35% versus 66.37%, originally obtained. The second one considers the diagnosis of pathologies of the vertebral column. The original method achieves accuracies ranging from 76.5% to 96.7%, and from 82.3% to 97.1% in two different cross-validation schemes. Even with no supervision, the proposed method reaches 96.71% and 97.32% in these two cases. By using a supervised framework the achieved accuracy is 97.74%. Furthermore, all abnormal cases were correctly classified. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  16. Efficient use of unlabeled data for protein sequence classification: a comparative study.

    PubMed

    Kuksa, Pavel; Huang, Pai-Hsi; Pavlovic, Vladimir

    2009-04-29

    Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy if this data is supplemented with protein sequences that lack any class tags-the unlabeled data. In this study, we present a principled and biologically motivated computational framework that more effectively exploits the unlabeled data by only using the sequence regions that are more likely to be biologically relevant for better prediction accuracy. As overly-represented sequences in large uncurated databases may bias the estimation of computational models that rely on unlabeled data, we also propose a method to remove this bias and improve performance of the resulting classifiers. Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases. It outperforms current state-of-the-art methods and exhibits significant reduction in running time. The unlabeled sequences used under the semi-supervised setting resemble the unpolished gemstones; when used as-is, they may carry unnecessary features and hence compromise the classification accuracy but once cut and polished, they improve the accuracy of the classifiers considerably.

  17. Context-aware adaptive spelling in motor imagery BCI

    NASA Astrophysics Data System (ADS)

    Perdikis, S.; Leeb, R.; Millán, J. d. R.

    2016-06-01

    Objective. This work presents a first motor imagery-based, adaptive brain-computer interface (BCI) speller, which is able to exploit application-derived context for improved, simultaneous classifier adaptation and spelling. Online spelling experiments with ten able-bodied users evaluate the ability of our scheme, first, to alleviate non-stationarity of brain signals for restoring the subject’s performances, second, to guide naive users into BCI control avoiding initial offline BCI calibration and, third, to outperform regular unsupervised adaptation. Approach. Our co-adaptive framework combines the BrainTree speller with smooth-batch linear discriminant analysis adaptation. The latter enjoys contextual assistance through BrainTree’s language model to improve online expectation-maximization maximum-likelihood estimation. Main results. Our results verify the possibility to restore single-sample classification and BCI command accuracy, as well as spelling speed for expert users. Most importantly, context-aware adaptation performs significantly better than its unsupervised equivalent and similar to the supervised one. Although no significant differences are found with respect to the state-of-the-art PMean approach, the proposed algorithm is shown to be advantageous for 30% of the users. Significance. We demonstrate the possibility to circumvent supervised BCI recalibration, saving time without compromising the adaptation quality. On the other hand, we show that this type of classifier adaptation is not as efficient for BCI training purposes.

  18. Context-aware adaptive spelling in motor imagery BCI.

    PubMed

    Perdikis, S; Leeb, R; Millán, J D R

    2016-06-01

    This work presents a first motor imagery-based, adaptive brain-computer interface (BCI) speller, which is able to exploit application-derived context for improved, simultaneous classifier adaptation and spelling. Online spelling experiments with ten able-bodied users evaluate the ability of our scheme, first, to alleviate non-stationarity of brain signals for restoring the subject's performances, second, to guide naive users into BCI control avoiding initial offline BCI calibration and, third, to outperform regular unsupervised adaptation. Our co-adaptive framework combines the BrainTree speller with smooth-batch linear discriminant analysis adaptation. The latter enjoys contextual assistance through BrainTree's language model to improve online expectation-maximization maximum-likelihood estimation. Our results verify the possibility to restore single-sample classification and BCI command accuracy, as well as spelling speed for expert users. Most importantly, context-aware adaptation performs significantly better than its unsupervised equivalent and similar to the supervised one. Although no significant differences are found with respect to the state-of-the-art PMean approach, the proposed algorithm is shown to be advantageous for 30% of the users. We demonstrate the possibility to circumvent supervised BCI recalibration, saving time without compromising the adaptation quality. On the other hand, we show that this type of classifier adaptation is not as efficient for BCI training purposes.

  19. Optimized spatio-temporal descriptors for real-time fall detection: comparison of support vector machine and Adaboost-based classification

    NASA Astrophysics Data System (ADS)

    Charfi, Imen; Miteran, Johel; Dubois, Julien; Atri, Mohamed; Tourki, Rached

    2013-10-01

    We propose a supervised approach to detect falls in a home environment using an optimized descriptor adapted to real-time tasks. We introduce a realistic dataset of 222 videos, a new metric allowing evaluation of fall detection performance in a video stream, and an automatically optimized set of spatio-temporal descriptors which fed a supervised classifier. We build the initial spatio-temporal descriptor named STHF using several combinations of transformations of geometrical features (height and width of human body bounding box, the user's trajectory with her/his orientation, projection histograms, and moments of orders 0, 1, and 2). We study the combinations of usual transformations of the features (Fourier transform, wavelet transform, first and second derivatives), and we show experimentally that it is possible to achieve high performance using support vector machine and Adaboost classifiers. Automatic feature selection allows to show that the best tradeoff between classification performance and processing time is obtained by combining the original low-level features with their first derivative. Hence, we evaluate the robustness of the fall detection regarding location changes. We propose a realistic and pragmatic protocol that enables performance to be improved by updating the training in the current location with normal activities records.

  20. Beyond where to how: a machine learning approach for sensing mobility contexts using smartphone sensors.

    PubMed

    Guinness, Robert E

    2015-04-28

    This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real-time or near-real-time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWLand IBk). Applying ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT-algorithm RandomForest exhibited the best overall performance. After a feature selection process for a subset of algorithms, the performance was improved slightly. Furthermore, after tuning the parameters of RandomForest, performance improved to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of central processing unit (CPU) time needed for classification, to provide a rough comparison between the algorithms in terms of battery usage requirements. As a result, the classifiers can be ranked from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, whereas the slowest non-instance-based classifier (NB) required about five-times the amount of CPU time as the fastest classifier (SVM). The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity.

  1. Beyond Where to How: A Machine Learning Approach for Sensing Mobility Contexts Using Smartphone Sensors †

    PubMed Central

    Guinness, Robert E.

    2015-01-01

    This paper presents the results of research on the use of smartphone sensors (namely, GPS and accelerometers), geospatial information (points of interest, such as bus stops and train stations) and machine learning (ML) to sense mobility contexts. Our goal is to develop techniques to continuously and automatically detect a smartphone user's mobility activities, including walking, running, driving and using a bus or train, in real-time or near-real-time (<5 s). We investigated a wide range of supervised learning techniques for classification, including decision trees (DT), support vector machines (SVM), naive Bayes classifiers (NB), Bayesian networks (BN), logistic regression (LR), artificial neural networks (ANN) and several instance-based classifiers (KStar, LWLand IBk). Applying ten-fold cross-validation, the best performers in terms of correct classification rate (i.e., recall) were DT (96.5%), BN (90.9%), LWL (95.5%) and KStar (95.6%). In particular, the DT-algorithm RandomForest exhibited the best overall performance. After a feature selection process for a subset of algorithms, the performance was improved slightly. Furthermore, after tuning the parameters of RandomForest, performance improved to above 97.5%. Lastly, we measured the computational complexity of the classifiers, in terms of central processing unit (CPU) time needed for classification, to provide a rough comparison between the algorithms in terms of battery usage requirements. As a result, the classifiers can be ranked from lowest to highest complexity (i.e., computational cost) as follows: SVM, ANN, LR, BN, DT, NB, IBk, LWL and KStar. The instance-based classifiers take considerably more computational time than the non-instance-based classifiers, whereas the slowest non-instance-based classifier (NB) required about five-times the amount of CPU time as the fastest classifier (SVM). The above results suggest that DT algorithms are excellent candidates for detecting mobility contexts in smartphones, both in terms of performance and computational complexity. PMID:25928060

  2. A Deep Convolutional Neural Network for segmenting and classifying epithelial and stromal regions in histopathological images

    PubMed Central

    Xu, Jun; Luo, Xiaofei; Wang, Guanhao; Gilmore, Hannah; Madabhushi, Anant

    2016-01-01

    Epithelial (EP) and stromal (ST) are two types of tissues in histological images. Automated segmentation or classification of EP and ST tissues is important when developing computerized system for analyzing the tumor microenvironment. In this paper, a Deep Convolutional Neural Networks (DCNN) based feature learning is presented to automatically segment or classify EP and ST regions from digitized tumor tissue microarrays (TMAs). Current approaches are based on handcraft feature representation, such as color, texture, and Local Binary Patterns (LBP) in classifying two regions. Compared to handcrafted feature based approaches, which involve task dependent representation, DCNN is an end-to-end feature extractor that may be directly learned from the raw pixel intensity value of EP and ST tissues in a data driven fashion. These high-level features contribute to the construction of a supervised classifier for discriminating the two types of tissues. In this work we compare DCNN based models with three handcraft feature extraction based approaches on two different datasets which consist of 157 Hematoxylin and Eosin (H&E) stained images of breast cancer and 1376 immunohistological (IHC) stained images of colorectal cancer, respectively. The DCNN based feature learning approach was shown to have a F1 classification score of 85%, 89%, and 100%, accuracy (ACC) of 84%, 88%, and 100%, and Matthews Correlation Coefficient (MCC) of 86%, 77%, and 100% on two H&E stained (NKI and VGH) and IHC stained data, respectively. Our DNN based approach was shown to outperform three handcraft feature extraction based approaches in terms of the classification of EP and ST regions. PMID:28154470

  3. A Deep Convolutional Neural Network for segmenting and classifying epithelial and stromal regions in histopathological images.

    PubMed

    Xu, Jun; Luo, Xiaofei; Wang, Guanhao; Gilmore, Hannah; Madabhushi, Anant

    2016-05-26

    Epithelial (EP) and stromal (ST) are two types of tissues in histological images. Automated segmentation or classification of EP and ST tissues is important when developing computerized system for analyzing the tumor microenvironment. In this paper, a Deep Convolutional Neural Networks (DCNN) based feature learning is presented to automatically segment or classify EP and ST regions from digitized tumor tissue microarrays (TMAs). Current approaches are based on handcraft feature representation, such as color, texture, and Local Binary Patterns (LBP) in classifying two regions. Compared to handcrafted feature based approaches, which involve task dependent representation, DCNN is an end-to-end feature extractor that may be directly learned from the raw pixel intensity value of EP and ST tissues in a data driven fashion. These high-level features contribute to the construction of a supervised classifier for discriminating the two types of tissues. In this work we compare DCNN based models with three handcraft feature extraction based approaches on two different datasets which consist of 157 Hematoxylin and Eosin (H&E) stained images of breast cancer and 1376 immunohistological (IHC) stained images of colorectal cancer, respectively. The DCNN based feature learning approach was shown to have a F1 classification score of 85%, 89%, and 100%, accuracy (ACC) of 84%, 88%, and 100%, and Matthews Correlation Coefficient (MCC) of 86%, 77%, and 100% on two H&E stained (NKI and VGH) and IHC stained data, respectively. Our DNN based approach was shown to outperform three handcraft feature extraction based approaches in terms of the classification of EP and ST regions.

  4. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    NASA Astrophysics Data System (ADS)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited-settings.

  5. Parametric embedding for class visualization.

    PubMed

    Iwata, Tomoharu; Saito, Kazumi; Ueda, Naonori; Stromsten, Sean; Griffiths, Thomas L; Tenenbaum, Joshua B

    2007-09-01

    We propose a new method, parametric embedding (PE), that embeds objects with the class structure into a low-dimensional visualization space. PE takes as input a set of class conditional probabilities for given data points and tries to preserve the structure in an embedding space by minimizing a sum of Kullback-Leibler divergences, under the assumption that samples are generated by a gaussian mixture with equal covariances in the embedding space. PE has many potential uses depending on the source of the input data, providing insight into the classifier's behavior in supervised, semisupervised, and unsupervised settings. The PE algorithm has a computational advantage over conventional embedding methods based on pairwise object relations since its complexity scales with the product of the number of objects and the number of classes. We demonstrate PE by visualizing supervised categorization of Web pages, semisupervised categorization of digits, and the relations of words and latent topics found by an unsupervised algorithm, latent Dirichlet allocation.

  6. Accurate Natural Trail Detection Using a Combination of a Deep Neural Network and Dynamic Programming.

    PubMed

    Adhikari, Shyam Prasad; Yang, Changju; Slot, Krzysztof; Kim, Hyongsuk

    2018-01-10

    This paper presents a vision sensor-based solution to the challenging problem of detecting and following trails in highly unstructured natural environments like forests, rural areas and mountains, using a combination of a deep neural network and dynamic programming. The deep neural network (DNN) concept has recently emerged as a very effective tool for processing vision sensor signals. A patch-based DNN is trained with supervised data to classify fixed-size image patches into "trail" and "non-trail" categories, and reshaped to a fully convolutional architecture to produce trail segmentation map for arbitrary-sized input images. As trail and non-trail patches do not exhibit clearly defined shapes or forms, the patch-based classifier is prone to misclassification, and produces sub-optimal trail segmentation maps. Dynamic programming is introduced to find an optimal trail on the sub-optimal DNN output map. Experimental results showing accurate trail detection for real-world trail datasets captured with a head mounted vision system are presented.

  7. Self-organized network with a supervised training and its comparison with FALVQ in artificial odor recognition system

    NASA Astrophysics Data System (ADS)

    Kusumoputro, Benyamin; Rostiviani, Linda; Saptawijaya, Ari

    2000-07-01

    Artificial odor recognition system is developed in order to mimic the human sensory test in cosmetics, parfum and beverage industries. The developed system however, lacks of ability to recognize the unknown type of odor. To improve the system's capability, a hybrid neural system with a supervised learning paradigm is developed and used as a pattern classifier. In this paper, the performance of the hybrid neural system is investigated, together with that of FALVQ neural system.

  8. Multi-test cervical cancer diagnosis with missing data estimation

    NASA Astrophysics Data System (ADS)

    Xu, Tao; Huang, Xiaolei; Kim, Edward; Long, L. Rodney; Antani, Sameer

    2015-03-01

    Cervical cancer is a leading most common type of cancer for women worldwide. Existing screening programs for cervical cancer suffer from low sensitivity. Using images of the cervix (cervigrams) as an aid in detecting pre-cancerous changes to the cervix has good potential to improve sensitivity and help reduce the number of cervical cancer cases. In this paper, we present a method that utilizes multi-modality information extracted from multiple tests of a patient's visit to classify the patient visit to be either low-risk or high-risk. Our algorithm integrates image features and text features to make a diagnosis. We also present two strategies to estimate the missing values in text features: Image Classifier Supervised Mean Imputation (ICSMI) and Image Classifier Supervised Linear Interpolation (ICSLI). We evaluate our method on a large medical dataset and compare it with several alternative approaches. The results show that the proposed method with ICSLI strategy achieves the best result of 83.03% specificity and 76.36% sensitivity. When higher specificity is desired, our method can achieve 90% specificity with 62.12% sensitivity.

  9. Noninvasive Dissection of Mouse Sleep Using a Piezoelectric Motion Sensor

    PubMed Central

    Yaghouby, Farid; Donohue, Kevin D.; O’Hara, Bruce F.; Sunderam, Sridhar

    2015-01-01

    Background Changes in autonomic control cause regular breathing during NREM sleep to fluctuate during REM. Piezoelectric cage-floor sensors have been used to successfully discriminate sleep and wake states in mice based on signal features related to respiration and other movements. This study presents a classifier for noninvasively classifying REM and NREM using a piezoelectric sensor. New Method Vigilance state was scored manually in 4-second epochs for 24-hour EEG/EMG recordings in twenty mice. An unsupervised classifier clustered piezoelectric signal features quantifying movement and respiration into three states: one active; and two inactive with regular and irregular breathing respectively. These states were hypothesized to correspond to Wake, NREM, and REM respectively. States predicted by the classifier were compared against manual EEG/EMG scores to test this hypothesis. Results Using only piezoelectric signal features, an unsupervised classifier distinguished Wake with high (89% sensitivity, 96% specificity) and REM with moderate (73% sensitivity, 75% specificity) accuracy, but NREM with poor sensitivity (51%) and high specificity (96%). The classifier sometimes confused light NREM sleep—characterized by irregular breathing and moderate delta EEG power—with REM. A supervised classifier improved sensitivities to 90, 81, and 67% and all specificities to over 90% for Wake, NREM, and REM respectively. Comparison with Existing Methods Unlike most actigraphic techniques, which only differentiate sleep from wake, the proposed piezoelectric method further dissects sleep based on breathing regularity into states strongly correlated with REM and NREM. Conclusions This approach could facilitate large-sample screening for genes influencing different sleep traits, besides drug studies or other manipulations. PMID:26582569

  10. Classification of JERS-1 Image Mosaic of Central Africa Using A Supervised Multiscale Classifier of Texture Features

    NASA Technical Reports Server (NTRS)

    Saatchi, Sassan; DeGrandi, Franco; Simard, Marc; Podest, Erika

    1999-01-01

    In this paper, a multiscale approach is introduced to classify the Japanese Research Satellite-1 (JERS-1) mosaic image over the Central African rainforest. A series of texture maps are generated from the 100 m mosaic image at various scales. Using a quadtree model and relating classes at each scale by a Markovian relationship, the multiscale images are classified from course to finer scale. The results are verified at various scales and the evolution of classification is monitored by calculating the error at each stage.

  11. A Machine Learning and Cross-Validation Approach for the Discrimination of Vegetation Physiognomic Types Using Satellite Based Multispectral and Multitemporal Data.

    PubMed

    Sharma, Ram C; Hara, Keitarou; Hirayama, Hidetake

    2017-01-01

    This paper presents the performance and evaluation of a number of machine learning classifiers for the discrimination between the vegetation physiognomic classes using the satellite based time-series of the surface reflectance data. Discrimination of six vegetation physiognomic classes, Evergreen Coniferous Forest, Evergreen Broadleaf Forest, Deciduous Coniferous Forest, Deciduous Broadleaf Forest, Shrubs, and Herbs, was dealt with in the research. Rich-feature data were prepared from time-series of the satellite data for the discrimination and cross-validation of the vegetation physiognomic types using machine learning approach. A set of machine learning experiments comprised of a number of supervised classifiers with different model parameters was conducted to assess how the discrimination of vegetation physiognomic classes varies with classifiers, input features, and ground truth data size. The performance of each experiment was evaluated by using the 10-fold cross-validation method. Experiment using the Random Forests classifier provided highest overall accuracy (0.81) and kappa coefficient (0.78). However, accuracy metrics did not vary much with experiments. Accuracy metrics were found to be very sensitive to input features and size of ground truth data. The results obtained in the research are expected to be useful for improving the vegetation physiognomic mapping in Japan.

  12. Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...

    EPA Pesticide Factsheets

    Background: The U.S. EPA ToxCastTM program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects.Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class

  13. Just-in-time classifiers for recurrent concepts.

    PubMed

    Alippi, Cesare; Boracchi, Giacomo; Roveri, Manuel

    2013-04-01

    Just-in-time (JIT) classifiers operate in evolving environments by classifying instances and reacting to concept drift. In stationary conditions, a JIT classifier improves its accuracy over time by exploiting additional supervised information coming from the field. In nonstationary conditions, however, the classifier reacts as soon as concept drift is detected; the current classification setup is discarded and a suitable one activated to keep the accuracy high. We present a novel generation of JIT classifiers able to deal with recurrent concept drift by means of a practical formalization of the concept representation and the definition of a set of operators working on such representations. The concept-drift detection activity, which is crucial in promptly reacting to changes exactly when needed, is advanced by considering change-detection tests monitoring both inputs and classes distributions.

  14. On the Implementation of a Land Cover Classification System for SAR Images Using Khoros

    NASA Technical Reports Server (NTRS)

    Medina Revera, Edwin J.; Espinosa, Ramon Vasquez

    1997-01-01

    The Synthetic Aperture Radar (SAR) sensor is widely used to record data about the ground under all atmospheric conditions. The SAR acquired images have very good resolution which necessitates the development of a classification system that process the SAR images to extract useful information for different applications. In this work, a complete system for the land cover classification was designed and programmed using the Khoros, a data flow visual language environment, taking full advantages of the polymorphic data services that it provides. Image analysis was applied to SAR images to improve and automate the processes of recognition and classification of the different regions like mountains and lakes. Both unsupervised and supervised classification utilities were used. The unsupervised classification routines included the use of several Classification/Clustering algorithms like the K-means, ISO2, Weighted Minimum Distance, and the Localized Receptive Field (LRF) training/classifier. Different texture analysis approaches such as Invariant Moments, Fractal Dimension and Second Order statistics were implemented for supervised classification of the images. The results and conclusions for SAR image classification using the various unsupervised and supervised procedures are presented based on their accuracy and performance.

  15. A machine learned classifier for RR Lyrae in the VVV survey

    NASA Astrophysics Data System (ADS)

    Elorrieta, Felipe; Eyheramendy, Susana; Jordán, Andrés; Dékány, István; Catelan, Márcio; Angeloni, Rodolfo; Alonso-García, Javier; Contreras-Ramos, Rodrigo; Gran, Felipe; Hajdu, Gergely; Espinoza, Néstor; Saito, Roberto K.; Minniti, Dante

    2016-11-01

    Variable stars of RR Lyrae type are a prime tool with which to obtain distances to old stellar populations in the Milky Way. One of the main aims of the Vista Variables in the Via Lactea (VVV) near-infrared survey is to use them to map the structure of the Galactic Bulge. Owing to the large number of expected sources, this requires an automated mechanism for selecting RR Lyrae, and particularly those of the more easily recognized type ab (I.e., fundamental-mode pulsators), from the 106-107 variables expected in the VVV survey area. In this work we describe a supervised machine-learned classifier constructed for assigning a score to a Ks-band VVV light curve that indicates its likelihood of being ab-type RR Lyrae. We describe the key steps in the construction of the classifier, which were the choice of features, training set, selection of aperture, and family of classifiers. We find that the AdaBoost family of classifiers give consistently the best performance for our problem, and obtain a classifier based on the AdaBoost algorithm that achieves a harmonic mean between false positives and false negatives of ≈7% for typical VVV light-curve sets. This performance is estimated using cross-validation and through the comparison to two independent datasets that were classified by human experts.

  16. Actively learning to distinguish suspicious from innocuous anomalies in a batch of vehicle tracks

    NASA Astrophysics Data System (ADS)

    Qiu, Zhicong; Miller, David J.; Stieber, Brian; Fair, Tim

    2014-06-01

    We investigate the problem of actively learning to distinguish between two sets of anomalous vehicle tracks, innocuous" and suspicious", starting from scratch, without any initial examples of suspicious" and with no prior knowledge of what an operator would deem suspicious. This two-class problem is challenging because it is a priori unknown which track features may characterize the suspicious class. Furthermore, there is inherent imbalance in the sizes of the labeled innocuous" and suspicious" sets, even after some suspicious examples are identified. We present a comprehensive solution wherein a classifier learns to discriminate suspicious from innocuous based on derived p-value track features. Through active learning, our classifier thus learns the types of anomalies on which to base its discrimination. Our solution encompasses: i) judicious choice of kinematic p-value based features conditioned on the road of origin, along with more explicit features that capture unique vehicle behavior (e.g. U-turns); ii) novel semi-supervised learning that exploits information in the unlabeled (test batch) tracks, and iii) evaluation of several classifier models (logistic regression, SVMs). We find that two active labeling streams are necessary in practice in order to have efficient classifier learning while also forwarding (for labeling) the most actionable tracks. Experiments on wide-area motion imagery (WAMI) tracks, extracted via a system developed by Toyon Research Corporation, demonstrate the strong ROC AUC performance of our system, with sparing use of operator-based active labeling.

  17. Application of remote sensing to state and regional problems

    NASA Technical Reports Server (NTRS)

    Miller, W. F. (Principal Investigator); Quattrochi, D. A.; Carter, B. D.; Higgs, G. K.; Solomon, J. L.; Wax, C. L.

    1979-01-01

    The author has identified the following significant results. The Lowndes County data base is essentially complete with 18 primary variables and 16 proximity variables encoded into the geo-information system. The single purpose, decision tree classifier is now operational. Signatures for the thematic extraction of strip mines from LANDSAT Digital data were obtained by employing both supervised and nonsupervised procedures. Dry, blowing sand areas of beach were also identified from the LANDSAT data. The primary procedure was the analysis of analog data on the I2S signal slicer.

  18. OpenCL based machine learning labeling of biomedical datasets

    NASA Astrophysics Data System (ADS)

    Amoros, Oscar; Escalera, Sergio; Puig, Anna

    2011-03-01

    In this paper, we propose a two-stage labeling method of large biomedical datasets through a parallel approach in a single GPU. Diagnostic methods, structures volume measurements, and visualization systems are of major importance for surgery planning, intra-operative imaging and image-guided surgery. In all cases, to provide an automatic and interactive method to label or to tag different structures contained into input data becomes imperative. Several approaches to label or segment biomedical datasets has been proposed to discriminate different anatomical structures in an output tagged dataset. Among existing methods, supervised learning methods for segmentation have been devised to easily analyze biomedical datasets by a non-expert user. However, they still have some problems concerning practical application, such as slow learning and testing speeds. In addition, recent technological developments have led to widespread availability of multi-core CPUs and GPUs, as well as new software languages, such as NVIDIA's CUDA and OpenCL, allowing to apply parallel programming paradigms in conventional personal computers. Adaboost classifier is one of the most widely applied methods for labeling in the Machine Learning community. In a first stage, Adaboost trains a binary classifier from a set of pre-labeled samples described by a set of features. This binary classifier is defined as a weighted combination of weak classifiers. Each weak classifier is a simple decision function estimated on a single feature value. Then, at the testing stage, each weak classifier is independently applied on the features of a set of unlabeled samples. In this work, we propose an alternative representation of the Adaboost binary classifier. We use this proposed representation to define a new GPU-based parallelized Adaboost testing stage using OpenCL. We provide numerical experiments based on large available data sets and we compare our results to CPU-based strategies in terms of time and labeling speeds.

  19. a New Object-Based Framework to Detect Shodows in High-Resolution Satellite Imagery Over Urban Areas

    NASA Astrophysics Data System (ADS)

    Tatar, N.; Saadatseresht, M.; Arefi, H.; Hadavand, A.

    2015-12-01

    In this paper a new object-based framework to detect shadow areas in high resolution satellite images is proposed. To produce shadow map in pixel level state of the art supervised machine learning algorithms are employed. Automatic ground truth generation based on Otsu thresholding on shadow and non-shadow indices is used to train the classifiers. It is followed by segmenting the image scene and create image objects. To detect shadow objects, a majority voting on pixel-based shadow detection result is designed. GeoEye-1 multi-spectral image over an urban area in Qom city of Iran is used in the experiments. Results shows the superiority of our proposed method over traditional pixel-based, visually and quantitatively.

  20. A Large-scale Distributed Indexed Learning Framework for Data that Cannot Fit into Memory

    DTIC Science & Technology

    2015-03-27

    learn a classifier. Integrating three learning techniques (online, semi-supervised and active learning ) together with a selective sampling with minimum communication between the server and the clients solved this problem.

  1. A supervised learning rule for classification of spatiotemporal spike patterns.

    PubMed

    Lilin Guo; Zhenzhong Wang; Adjouadi, Malek

    2016-08-01

    This study introduces a novel supervised algorithm for spiking neurons that take into consideration synapse delays and axonal delays associated with weights. It can be utilized for both classification and association and uses several biologically influenced properties, such as axonal and synaptic delays. This algorithm also takes into consideration spike-timing-dependent plasticity as in Remote Supervised Method (ReSuMe). This paper focuses on the classification aspect alone. Spiked neurons trained according to this proposed learning rule are capable of classifying different categories by the associated sequences of precisely timed spikes. Simulation results have shown that the proposed learning method greatly improves classification accuracy when compared to the Spike Pattern Association Neuron (SPAN) and the Tempotron learning rule.

  2. The New Possibilities from "Big Data" to Overlooked Associations Between Diabetes, Biochemical Parameters, Glucose Control, and Osteoporosis.

    PubMed

    Kruse, Christian

    2018-06-01

    To review current practices and technologies within the scope of "Big Data" that can further our understanding of diabetes mellitus and osteoporosis from large volumes of data. "Big Data" techniques involving supervised machine learning, unsupervised machine learning, and deep learning image analysis are presented with examples of current literature. Supervised machine learning can allow us to better predict diabetes-induced osteoporosis and understand relative predictor importance of diabetes-affected bone tissue. Unsupervised machine learning can allow us to understand patterns in data between diabetic pathophysiology and altered bone metabolism. Image analysis using deep learning can allow us to be less dependent on surrogate predictors and use large volumes of images to classify diabetes-induced osteoporosis and predict future outcomes directly from images. "Big Data" techniques herald new possibilities to understand diabetes-induced osteoporosis and ascertain our current ability to classify, understand, and predict this condition.

  3. Sea ice type maps from Alaska synthetic aperture radar facility imagery: An assessment

    NASA Technical Reports Server (NTRS)

    Fetterer, Florence M.; Gineris, Denise; Kwok, Ronald

    1994-01-01

    Synthetic aperture radar (SAR) imagery received at the Alaskan SAR Facility is routinely and automatically classified on the Geophysical Processor System (GPS) to create ice type maps. We evaluated the wintertime performance of the GPS classification algorithm by comparing ice type percentages from supervised classification with percentages from the algorithm. The root mean square (RMS) difference for multiyear ice is about 6%, while the inconsistency in supervised classification is about 3%. The algorithm separates first-year from multiyear ice well, although it sometimes fails to correctly classify new ice and open water owing to the wide distribution of backscatter for these classes. Our results imply a high degree of accuracy and consistency in the growing archive of multiyear and first-year ice distribution maps. These results have implications for heat and mass balance studies which are furthered by the ability to accurately characterize ice type distributions over a large part of the Arctic.

  4. Supervised Gamma Process Poisson Factorization

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson, Dylan Zachary

    This thesis develops the supervised gamma process Poisson factorization (S- GPPF) framework, a novel supervised topic model for joint modeling of count matrices and document labels. S-GPPF is fully generative and nonparametric: document labels and count matrices are modeled under a uni ed probabilistic framework and the number of latent topics is controlled automatically via a gamma process prior. The framework provides for multi-class classification of documents using a generative max-margin classifier. Several recent data augmentation techniques are leveraged to provide for exact inference using a Gibbs sampling scheme. The first portion of this thesis reviews supervised topic modeling andmore » several key mathematical devices used in the formulation of S-GPPF. The thesis then introduces the S-GPPF generative model and derives the conditional posterior distributions of the latent variables for posterior inference via Gibbs sampling. The S-GPPF is shown to exhibit state-of-the-art performance for joint topic modeling and document classification on a dataset of conference abstracts, beating out competing supervised topic models. The unique properties of S-GPPF along with its competitive performance make it a novel contribution to supervised topic modeling.« less

  5. Word embeddings and recurrent neural networks based on Long-Short Term Memory nodes in supervised biomedical word sense disambiguation.

    PubMed

    Jimeno Yepes, Antonio

    2017-09-01

    Word sense disambiguation helps identifying the proper sense of ambiguous words in text. With large terminologies such as the UMLS Metathesaurus ambiguities appear and highly effective disambiguation methods are required. Supervised learning algorithm methods are used as one of the approaches to perform disambiguation. Features extracted from the context of an ambiguous word are used to identify the proper sense of such a word. The type of features have an impact on machine learning methods, thus affect disambiguation performance. In this work, we have evaluated several types of features derived from the context of the ambiguous word and we have explored as well more global features derived from MEDLINE using word embeddings. Results show that word embeddings improve the performance of more traditional features and allow as well using recurrent neural network classifiers based on Long-Short Term Memory (LSTM) nodes. The combination of unigrams and word embeddings with an SVM sets a new state of the art performance with a macro accuracy of 95.97 in the MSH WSD data set. Copyright © 2017 Elsevier Inc. All rights reserved.

  6. High resolution mapping and classification of oyster habitats in nearshore Louisiana using sidescan sonar

    USGS Publications Warehouse

    Allen, Y.C.; Wilson, C.A.; Roberts, H.H.; Supan, J.

    2005-01-01

    Sidescan sonar holds great promise as a tool to quantitatively depict the distribution and extent of benthic habitats in Louisiana's turbid estuaries. In this study, we describe an effective protocol for acoustic sampling in this environment. We also compared three methods of classification in detail: mean-based thresholding, supervised, and unsupervised techniques to classify sidescan imagery into categories of mud and shell. Classification results were compared to ground truth results using quadrat and dredge sampling. Supervised classification gave the best overall result (kappa = 75%) when compared to quadrat results. Classification accuracy was less robust when compared to all dredge samples (kappa = 21-56%), but increased greatly (90-100%) when only dredge samples taken from acoustically homogeneous areas were considered. Sidescan sonar when combined with ground truth sampling at an appropriate scale can be effectively used to establish an accurate substrate base map for both research applications and shellfish management. The sidescan imagery presented here also provides, for the first time, a detailed presentation of oyster habitat patchiness and scale in a productive oyster growing area.

  7. Nonlinear Deep Kernel Learning for Image Annotation.

    PubMed

    Jiu, Mingyuan; Sahbi, Hichem

    2017-02-08

    Multiple kernel learning (MKL) is a widely used technique for kernel design. Its principle consists in learning, for a given support vector classifier, the most suitable convex (or sparse) linear combination of standard elementary kernels. However, these combinations are shallow and often powerless to capture the actual similarity between highly semantic data, especially for challenging classification tasks such as image annotation. In this paper, we redefine multiple kernels using deep multi-layer networks. In this new contribution, a deep multiple kernel is recursively defined as a multi-layered combination of nonlinear activation functions, each one involves a combination of several elementary or intermediate kernels, and results into a positive semi-definite deep kernel. We propose four different frameworks in order to learn the weights of these networks: supervised, unsupervised, kernel-based semisupervised and Laplacian-based semi-supervised. When plugged into support vector machines (SVMs), the resulting deep kernel networks show clear gain, compared to several shallow kernels for the task of image annotation. Extensive experiments and analysis on the challenging ImageCLEF photo annotation benchmark, the COREL5k database and the Banana dataset validate the effectiveness of the proposed method.

  8. An evaluation of supervised classifiers for indirectly detecting salt-affected areas at irrigation scheme level

    NASA Astrophysics Data System (ADS)

    Muller, Sybrand Jacobus; van Niekerk, Adriaan

    2016-07-01

    Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electro conductivity measurements were investigated using regression modelling (stepwise linear regression, partial least squares regression, curve fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (<0.4 R squared). Better results were achieved using the supervised classifiers, but the algorithms tend to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crops types at different growing stages, coupled with their individual tolerances to saline conditions.

  9. Employing Machine-Learning Methods to Study Young Stellar Objects

    NASA Astrophysics Data System (ADS)

    Moore, Nicholas

    2018-01-01

    Vast amounts of data exist in the astronomical data archives, and yet a large number of sources remain unclassified. We developed a multi-wavelength pipeline to classify infrared sources. The pipeline uses supervised machine learning methods to classify objects into the appropriate categories. The program is fed data that is already classified to train it, and is then applied to unknown catalogues. The primary use for such a pipeline is the rapid classification and cataloging of data that would take a much longer time to classify otherwise. While our primary goal is to study young stellar objects (YSOs), the applications extend beyond the scope of this project. We present preliminary results from our analysis and discuss future applications.

  10. The fragmented nature of tundra landscape

    NASA Astrophysics Data System (ADS)

    Virtanen, Tarmo; Ek, Malin

    2014-04-01

    The vegetation and land cover structure of tundra areas is fragmented when compared to other biomes. Thus, satellite images of high resolution are required for producing land cover classifications, in order to reveal the actual distribution of land cover types across these large and remote areas. We produced and compared different land cover classifications using three satellite images (QuickBird, Aster and Landsat TM5) with different pixel sizes (2.4 m, 15 m and 30 m pixel size, respectively). The study area, in north-eastern European Russia, was visited in July 2007 to obtain ground reference data. The QuickBird image was classified using supervised segmentation techniques, while the Aster and Landsat TM5 images were classified using a pixel-based supervised classification method. The QuickBird classification showed the highest accuracy when tested against field data, while the Aster image was generally more problematic to classify than the Landsat TM5 image. Use of smaller pixel sized images distinguished much greater levels of landscape fragmentation. The overall mean patch sizes in the QuickBird, Aster, and Landsat TM5-classifications were 871 m2, 2141 m2 and 7433 m2, respectively. In the QuickBird classification, the mean patch size of all the tundra and peatland vegetation classes was smaller than one pixel of the Landsat TM5 image. Water bodies and fens in particular occur in the landscape in small or elongated patches, and thus cannot be realistically classified from larger pixel sized images. Land cover patterns vary considerably at such a fine-scale, so that a lot of information is lost if only medium resolution satellite images are used. It is crucial to know the amount and spatial distribution of different vegetation types in arctic landscapes, as carbon dynamics and other climate related physical, geological and biological processes are known to vary greatly between vegetation types.

  11. Visual feature extraction and establishment of visual tags in the intelligent visual internet of things

    NASA Astrophysics Data System (ADS)

    Zhao, Yiqun; Wang, Zhihui

    2015-12-01

    The Internet of things (IOT) is a kind of intelligent networks which can be used to locate, track, identify and supervise people and objects. One of important core technologies of intelligent visual internet of things ( IVIOT) is the intelligent visual tag system. In this paper, a research is done into visual feature extraction and establishment of visual tags of the human face based on ORL face database. Firstly, we use the principal component analysis (PCA) algorithm for face feature extraction, then adopt the support vector machine (SVM) for classifying and face recognition, finally establish a visual tag for face which is already classified. We conducted a experiment focused on a group of people face images, the result show that the proposed algorithm have good performance, and can show the visual tag of objects conveniently.

  12. Classification of neocortical interneurons using affinity propagation.

    PubMed

    Santana, Roberto; McGarry, Laura M; Bielza, Concha; Larrañaga, Pedro; Yuste, Rafael

    2013-01-01

    In spite of over a century of research on cortical circuits, it is still unknown how many classes of cortical neurons exist. In fact, neuronal classification is a difficult problem because it is unclear how to designate a neuronal cell class and what are the best characteristics to define them. Recently, unsupervised classifications using cluster analysis based on morphological, physiological, or molecular characteristics, have provided quantitative and unbiased identification of distinct neuronal subtypes, when applied to selected datasets. However, better and more robust classification methods are needed for increasingly complex and larger datasets. Here, we explored the use of affinity propagation, a recently developed unsupervised classification algorithm imported from machine learning, which gives a representative example or exemplar for each cluster. As a case study, we applied affinity propagation to a test dataset of 337 interneurons belonging to four subtypes, previously identified based on morphological and physiological characteristics. We found that affinity propagation correctly classified most of the neurons in a blind, non-supervised manner. Affinity propagation outperformed Ward's method, a current standard clustering approach, in classifying the neurons into 4 subtypes. Affinity propagation could therefore be used in future studies to validly classify neurons, as a first step to help reverse engineer neural circuits.

  13. Classification of Parkinsonian syndromes from FDG-PET brain data using decision trees with SSM/PCA features.

    PubMed

    Mudali, D; Teune, L K; Renken, R J; Leenders, K L; Roerdink, J B T M

    2015-01-01

    Medical imaging techniques like fluorodeoxyglucose positron emission tomography (FDG-PET) have been used to aid in the differential diagnosis of neurodegenerative brain diseases. In this study, the objective is to classify FDG-PET brain scans of subjects with Parkinsonian syndromes (Parkinson's disease, multiple system atrophy, and progressive supranuclear palsy) compared to healthy controls. The scaled subprofile model/principal component analysis (SSM/PCA) method was applied to FDG-PET brain image data to obtain covariance patterns and corresponding subject scores. The latter were used as features for supervised classification by the C4.5 decision tree method. Leave-one-out cross validation was applied to determine classifier performance. We carried out a comparison with other types of classifiers. The big advantage of decision tree classification is that the results are easy to understand by humans. A visual representation of decision trees strongly supports the interpretation process, which is very important in the context of medical diagnosis. Further improvements are suggested based on enlarging the number of the training data, enhancing the decision tree method by bagging, and adding additional features based on (f)MRI data.

  14. Mapping Phonetic Features for Voice-Driven Sound Synthesis

    NASA Astrophysics Data System (ADS)

    Janer, Jordi; Maestre, Esteban

    In applications where the human voice controls the synthesis of musical instruments sounds, phonetics convey musical information that might be related to the sound of the imitated musical instrument. Our initial hypothesis is that phonetics are user- and instrument-dependent, but they remain constant for a single subject and instrument. We propose a user-adapted system, where mappings from voice features to synthesis parameters depend on how subjects sing musical articulations, i.e. note to note transitions. The system consists of two components. First, a voice signal segmentation module that automatically determines note-to-note transitions. Second, a classifier that determines the type of musical articulation for each transition based on a set of phonetic features. For validating our hypothesis, we run an experiment where subjects imitated real instrument recordings with their voice. Performance recordings consisted of short phrases of saxophone and violin performed in three grades of musical articulation labeled as: staccato, normal, legato. The results of a supervised training classifier (user-dependent) are compared to a classifier based on heuristic rules (user-independent). Finally, from the previous results we show how to control the articulation in a sample-concatenation synthesizer by selecting the most appropriate samples.

  15. Development of a Management Techniques Inventory.

    DTIC Science & Technology

    1977-04-01

    Management Position Description Questionnaire ( Tornow & Pinto, 1976). Two frequently used instruments which do assess manager ’ attitudes toward supervision... Tornow , W. W., & Pinto, P. R. The development of a managerial job taxonomy: A system for describing, classifying, and evaluating executive positions

  16. Classifying GABAergic interneurons with semi-supervised projected model-based clustering.

    PubMed

    Mihaljević, Bojan; Benavides-Piccione, Ruth; Guerra, Luis; DeFelipe, Javier; Larrañaga, Pedro; Bielza, Concha

    2015-09-01

    A recently introduced pragmatic scheme promises to be a useful catalog of interneuron names. We sought to automatically classify digitally reconstructed interneuronal morphologies according to this scheme. Simultaneously, we sought to discover possible subtypes of these types that might emerge during automatic classification (clustering). We also investigated which morphometric properties were most relevant for this classification. A set of 118 digitally reconstructed interneuronal morphologies classified into the common basket (CB), horse-tail (HT), large basket (LB), and Martinotti (MA) interneuron types by 42 of the world's leading neuroscientists, quantified by five simple morphometric properties of the axon and four of the dendrites. We labeled each neuron with the type most commonly assigned to it by the experts. We then removed this class information for each type separately, and applied semi-supervised clustering to those cells (keeping the others' cluster membership fixed), to assess separation from other types and look for the formation of new groups (subtypes). We performed this same experiment unlabeling the cells of two types at a time, and of half the cells of a single type at a time. The clustering model is a finite mixture of Gaussians which we adapted for the estimation of local (per-cluster) feature relevance. We performed the described experiments on three different subsets of the data, formed according to how many experts agreed on type membership: at least 18 experts (the full data set), at least 21 (73 neurons), and at least 26 (47 neurons). Interneurons with more reliable type labels were classified more accurately. We classified HT cells with 100% accuracy, MA cells with 73% accuracy, and CB and LB cells with 56% and 58% accuracy, respectively. We identified three subtypes of the MA type, one subtype of CB and LB types each, and no subtypes of HT (it was a single, homogeneous type). We got maximum (adapted) Silhouette width and ARI values of 1, 0.83, 0.79, and 0.42, when unlabeling the HT, CB, LB, and MA types, respectively, confirming the quality of the formed cluster solutions. The subtypes identified when unlabeling a single type also emerged when unlabeling two types at a time, confirming their validity. Axonal morphometric properties were more relevant that dendritic ones, with the axonal polar histogram length in the [π, 2π) angle interval being particularly useful. The applied semi-supervised clustering method can accurately discriminate among CB, HT, LB, and MA interneuron types while discovering potential subtypes, and is therefore useful for neuronal classification. The discovery of potential subtypes suggests that some of these types are more heterogeneous that previously thought. Finally, axonal variables seem to be more relevant than dendritic ones for distinguishing among the CB, HT, LB, and MA interneuron types. Copyright © 2015 Elsevier B.V. All rights reserved.

  17. Efficient use of unlabeled data for protein sequence classification: a comparative study

    PubMed Central

    Kuksa, Pavel; Huang, Pai-Hsi; Pavlovic, Vladimir

    2009-01-01

    Background Recent studies in computational primary protein sequence analysis have leveraged the power of unlabeled data. For example, predictive models based on string kernels trained on sequences known to belong to particular folds or superfamilies, the so-called labeled data set, can attain significantly improved accuracy if this data is supplemented with protein sequences that lack any class tags–the unlabeled data. In this study, we present a principled and biologically motivated computational framework that more effectively exploits the unlabeled data by only using the sequence regions that are more likely to be biologically relevant for better prediction accuracy. As overly-represented sequences in large uncurated databases may bias the estimation of computational models that rely on unlabeled data, we also propose a method to remove this bias and improve performance of the resulting classifiers. Results Combined with state-of-the-art string kernels, our proposed computational framework achieves very accurate semi-supervised protein remote fold and homology detection on three large unlabeled databases. It outperforms current state-of-the-art methods and exhibits significant reduction in running time. Conclusion The unlabeled sequences used under the semi-supervised setting resemble the unpolished gemstones; when used as-is, they may carry unnecessary features and hence compromise the classification accuracy but once cut and polished, they improve the accuracy of the classifiers considerably. PMID:19426450

  18. Land use and land cover classification for rural residential areas in China using soft-probability cascading of multifeatures

    NASA Astrophysics Data System (ADS)

    Zhang, Bin; Liu, Yueyan; Zhang, Zuyu; Shen, Yonglin

    2017-10-01

    A multifeature soft-probability cascading scheme to solve the problem of land use and land cover (LULC) classification using high-spatial-resolution images to map rural residential areas in China is proposed. The proposed method is used to build midlevel LULC features. Local features are frequently considered as low-level feature descriptors in a midlevel feature learning method. However, spectral and textural features, which are very effective low-level features, are neglected. The acquisition of the dictionary of sparse coding is unsupervised, and this phenomenon reduces the discriminative power of the midlevel feature. Thus, we propose to learn supervised features based on sparse coding, a support vector machine (SVM) classifier, and a conditional random field (CRF) model to utilize the different effective low-level features and improve the discriminability of midlevel feature descriptors. First, three kinds of typical low-level features, namely, dense scale-invariant feature transform, gray-level co-occurrence matrix, and spectral features, are extracted separately. Second, combined with sparse coding and the SVM classifier, the probabilities of the different LULC classes are inferred to build supervised feature descriptors. Finally, the CRF model, which consists of two parts: unary potential and pairwise potential, is employed to construct an LULC classification map. Experimental results show that the proposed classification scheme can achieve impressive performance when the total accuracy reached about 87%.

  19. Semiautomated object-based classification of rain-induced landslides with VHR multispectral images on Madeira Island

    NASA Astrophysics Data System (ADS)

    Heleno, Sandra; Matias, Magda; Pina, Pedro; Sousa, António Jorge

    2016-04-01

    A method for semiautomated landslide detection and mapping, with the ability to separate source and run-out areas, is presented in this paper. It combines object-based image analysis and a support vector machine classifier and is tested using a GeoEye-1 multispectral image, sensed 3 days after a major damaging landslide event that occurred on Madeira Island (20 February 2010), and a pre-event lidar digital terrain model. The testing is developed in a 15 km2 wide study area, where 95 % of the number of landslides scars are detected by this supervised approach. The classifier presents a good performance in the delineation of the overall landslide area, with commission errors below 26 % and omission errors below 24 %. In addition, fair results are achieved in the separation of the source from the run-out landslide areas, although in less illuminated slopes this discrimination is less effective than in sunnier, east-facing slopes.

  20. Development of remote sensing based site specific weed management for Midwest mint production

    NASA Astrophysics Data System (ADS)

    Gumz, Mary Saumur Paulson

    Peppermint and spearmint are high value essential oil crops in Indiana, Michigan, and Wisconsin. Although the mints are profitable alternatives to corn and soybeans, mint production efficiency must improve in order to allow industry survival against foreign produced oils and synthetic flavorings. Weed control is the major input cost in mint production and tools to increase efficiency are necessary. Remote sensing-based site-specific weed management offers potential for decreasing weed control costs through simplified weed detection and control from accurate site specific weed and herbicide application maps. This research showed the practicability of remote sensing for weed detection in the mints. Research was designed to compare spectral response curves of field grown mint and weeds, and to use these data to develop spectral vegetation indices for automated weed detection. Viability of remote sensing in mint production was established using unsupervised classification, supervised classification, handheld spectroradiometer readings and spectral vegetation indices (SVIs). Unsupervised classification of multispectral images of peppermint production fields generated crop health maps with 92 and 67% accuracy in meadow and row peppermint, respectively. Supervised classification of multispectral images identified weed infestations with 97% and 85% accuracy for meadow and row peppermint, respectively. Supervised classification showed that peppermint was spectrally distinct from weeds, but the accuracy of these measures was dependent on extensive ground referencing which is impractical and too costly for on-farm use. Handheld spectroradiometer measurements of peppermint, spearmint, and several weeds and crop and weed mixtures were taken over three years from greenhouse grown plants, replicated field plots, and production peppermint and spearmint fields. Results showed that mints have greater near infrared (NIR) and lower green reflectance and a steeper red edge slope than all weed species. These distinguishing characteristics were combined to develop narrow band and broadband spectral vegetation indices (SVIs, ratios of NIR/green reflectance), that were effective in differentiating mint from key weed species. Hyperspectral images of production peppermint and spearmint fields were then classified using SVI-based classification. Narrowband and broadband SVIs classified early season peppermint and spearmint with 64 to 100% accuracy compared to 79 to 100% accuracy for supervised classification of multispectral images of the same fields. Broadband SVIs have potential for use as an automated spectral indicator for weeds in the mints since they require minimal ground referencing and can be calculated from multispectral imagery which is cheaper and more readily available than hyperspectral imagery. This research will allow growers to implement remote sensing based site specific weed management in mint resulting in reduced grower input costs and reduced herbicide entry into the environment and will have applications in other specialty and meadow crops.

  1. Locating articular cartilage in MR images

    NASA Astrophysics Data System (ADS)

    Folkesson, Jenny; Dam, Erik; Pettersen, Paola; Olsen, Ole F.; Nielsen, Mads; Christiansen, Claus

    2005-04-01

    Accurate computation of the thickness of the articular cartilage is of great importance when diagnosing and monitoring the progress of joint diseases such as osteoarthritis. A fully automated cartilage assessment method is preferable compared to methods using manual interaction in order to avoid inter- and intra-observer variability. As a first step in the cartilage assessment, we present an automatic method for locating articular cartilage in knee MRI using supervised learning. The next step will be to fit a variable shape model to the cartilage, initiated at the location found using the method presented in this paper. From the model, disease markers will be extracted for the quantitative evaluation of the cartilage. The cartilage is located using an ANN-classifier, where every voxel is classified as cartilage or non-cartilage based on prior knowledge of the cartilage structure. The classifier is tested using leave-one-out-evaluation, and we found the average sensitivity and specificity to be 91.0% and 99.4%, respectively. The center of mass calculated from voxels classified as cartilage are similar to the corresponding values calculated from manual segmentations, which confirms that this method can find a good initial position for a shape model.

  2. The effects of pre-processing strategies in sentiment analysis of online movie reviews

    NASA Astrophysics Data System (ADS)

    Zin, Harnani Mat; Mustapha, Norwati; Murad, Masrah Azrifah Azmi; Sharef, Nurfadhlina Mohd

    2017-10-01

    With the ever increasing of internet applications and social networking sites, people nowadays can easily express their feelings towards any products and services. These online reviews act as an important source for further analysis and improved decision making. These reviews are mostly unstructured by nature and thus, need processing like sentiment analysis and classification to provide a meaningful information for future uses. In text analysis tasks, the appropriate selection of words/features will have a huge impact on the effectiveness of the classifier. Thus, this paper explores the effect of the pre-processing strategies in the sentiment analysis of online movie reviews. In this paper, supervised machine learning method was used to classify the reviews. The support vector machine (SVM) with linear and non-linear kernel has been considered as classifier for the classification of the reviews. The performance of the classifier is critically examined based on the results of precision, recall, f-measure, and accuracy. Two different features representations were used which are term frequency and term frequency-inverse document frequency. Results show that the pre-processing strategies give a significant impact on the classification process.

  3. Detecting Parkinsons' symptoms in uncontrolled home environments: a multiple instance learning approach.

    PubMed

    Das, Samarjit; Amoedo, Breogan; De la Torre, Fernando; Hodgins, Jessica

    2012-01-01

    In this paper, we propose to use a weakly supervised machine learning framework for automatic detection of Parkinson's Disease motor symptoms in daily living environments. Our primary goal is to develop a monitoring system capable of being used outside of controlled laboratory settings. Such a system would enable us to track medication cycles at home and provide valuable clinical feedback. Most of the relevant prior works involve supervised learning frameworks (e.g., Support Vector Machines). However, in-home monitoring provides only coarse ground truth information about symptom occurrences, making it very hard to adapt and train supervised learning classifiers for symptom detection. We address this challenge by formulating symptom detection under incomplete ground truth information as a multiple instance learning (MIL) problem. MIL is a weakly supervised learning framework that does not require exact instances of symptom occurrences for training; rather, it learns from approximate time intervals within which a symptom might or might not have occurred on a given day. Once trained, the MIL detector was able to spot symptom-prone time windows on other days and approximately localize the symptom instances. We monitored two Parkinson's disease (PD) patients, each for four days with a set of five triaxial accelerometers and utilized a MIL algorithm based on axis parallel rectangle (APR) fitting in the feature space. We were able to detect subject specific symptoms (e.g. dyskinesia) that conformed with a daily log maintained by the patients.

  4. Robust evaluation of time series classification algorithms for structural health monitoring

    NASA Astrophysics Data System (ADS)

    Harvey, Dustin Y.; Worden, Keith; Todd, Michael D.

    2014-03-01

    Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and mechanical infrastructure through analysis of structural response measurements. The supervised learning methodology for data-driven SHM involves computation of low-dimensional, damage-sensitive features from raw measurement data that are then used in conjunction with machine learning algorithms to detect, classify, and quantify damage states. However, these systems often suffer from performance degradation in real-world applications due to varying operational and environmental conditions. Probabilistic approaches to robust SHM system design suffer from incomplete knowledge of all conditions a system will experience over its lifetime. Info-gap decision theory enables nonprobabilistic evaluation of the robustness of competing models and systems in a variety of decision making applications. Previous work employed info-gap models to handle feature uncertainty when selecting various components of a supervised learning system, namely features from a pre-selected family and classifiers. In this work, the info-gap framework is extended to robust feature design and classifier selection for general time series classification through an efficient, interval arithmetic implementation of an info-gap data model. Experimental results are presented for a damage type classification problem on a ball bearing in a rotating machine. The info-gap framework in conjunction with an evolutionary feature design system allows for fully automated design of a time series classifier to meet performance requirements under maximum allowable uncertainty.

  5. Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm.

    PubMed

    Al-Saffar, Ahmed; Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-Bared, Mohammed

    2018-01-01

    Sentiment analysis techniques are increasingly exploited to categorize the opinion text to one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performances based on the semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned with a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually allotted with a score. In addition, the supervised machine learning approaches and lexicon knowledge method are combined for Malay sentiment classification with evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In experimental results, a wide-range of comparative experiments is conducted on a Malay Reviews Corpus (MRC), and it demonstrates that the feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors, the features, the number of features and the classification approach.

  6. Malay sentiment analysis based on combined classification approaches and Senti-lexicon algorithm

    PubMed Central

    Awang, Suryanti; Tao, Hai; Omar, Nazlia; Al-Saiagh, Wafaa; Al-bared, Mohammed

    2018-01-01

    Sentiment analysis techniques are increasingly exploited to categorize the opinion text to one or more predefined sentiment classes for the creation and automated maintenance of review-aggregation websites. In this paper, a Malay sentiment analysis classification model is proposed to improve classification performances based on the semantic orientation and machine learning approaches. First, a total of 2,478 Malay sentiment-lexicon phrases and words are assigned with a synonym and stored with the help of more than one Malay native speaker, and the polarity is manually allotted with a score. In addition, the supervised machine learning approaches and lexicon knowledge method are combined for Malay sentiment classification with evaluating thirteen features. Finally, three individual classifiers and a combined classifier are used to evaluate the classification accuracy. In experimental results, a wide-range of comparative experiments is conducted on a Malay Reviews Corpus (MRC), and it demonstrates that the feature extraction improves the performance of Malay sentiment analysis based on the combined classification. However, the results depend on three factors, the features, the number of features and the classification approach. PMID:29684036

  7. Conduction Delay Learning Model for Unsupervised and Supervised Classification of Spatio-Temporal Spike Patterns

    PubMed Central

    Matsubara, Takashi

    2017-01-01

    Precise spike timing is considered to play a fundamental role in communications and signal processing in biological neural networks. Understanding the mechanism of spike timing adjustment would deepen our understanding of biological systems and enable advanced engineering applications such as efficient computational architectures. However, the biological mechanisms that adjust and maintain spike timing remain unclear. Existing algorithms adopt a supervised approach, which adjusts the axonal conduction delay and synaptic efficacy until the spike timings approximate the desired timings. This study proposes a spike timing-dependent learning model that adjusts the axonal conduction delay and synaptic efficacy in both unsupervised and supervised manners. The proposed learning algorithm approximates the Expectation-Maximization algorithm, and classifies the input data encoded into spatio-temporal spike patterns. Even in the supervised classification, the algorithm requires no external spikes indicating the desired spike timings unlike existing algorithms. Furthermore, because the algorithm is consistent with biological models and hypotheses found in existing biological studies, it could capture the mechanism underlying biological delay learning. PMID:29209191

  8. Conduction Delay Learning Model for Unsupervised and Supervised Classification of Spatio-Temporal Spike Patterns.

    PubMed

    Matsubara, Takashi

    2017-01-01

    Precise spike timing is considered to play a fundamental role in communications and signal processing in biological neural networks. Understanding the mechanism of spike timing adjustment would deepen our understanding of biological systems and enable advanced engineering applications such as efficient computational architectures. However, the biological mechanisms that adjust and maintain spike timing remain unclear. Existing algorithms adopt a supervised approach, which adjusts the axonal conduction delay and synaptic efficacy until the spike timings approximate the desired timings. This study proposes a spike timing-dependent learning model that adjusts the axonal conduction delay and synaptic efficacy in both unsupervised and supervised manners. The proposed learning algorithm approximates the Expectation-Maximization algorithm, and classifies the input data encoded into spatio-temporal spike patterns. Even in the supervised classification, the algorithm requires no external spikes indicating the desired spike timings unlike existing algorithms. Furthermore, because the algorithm is consistent with biological models and hypotheses found in existing biological studies, it could capture the mechanism underlying biological delay learning.

  9. Active relearning for robust supervised classification of pulmonary emphysema

    NASA Astrophysics Data System (ADS)

    Raghunath, Sushravya; Rajagopalan, Srinivasan; Karwoski, Ronald A.; Bartholmai, Brian J.; Robb, Richard A.

    2012-03-01

    Radiologists are adept at recognizing the appearance of lung parenchymal abnormalities in CT scans. However, the inconsistent differential diagnosis, due to subjective aggregation, mandates supervised classification. Towards optimizing Emphysema classification, we introduce a physician-in-the-loop feedback approach in order to minimize uncertainty in the selected training samples. Using multi-view inductive learning with the training samples, an ensemble of Support Vector Machine (SVM) models, each based on a specific pair-wise dissimilarity metric, was constructed in less than six seconds. In the active relearning phase, the ensemble-expert label conflicts were resolved by an expert. This just-in-time feedback with unoptimized SVMs yielded 15% increase in classification accuracy and 25% reduction in the number of support vectors. The generality of relearning was assessed in the optimized parameter space of six different classifiers across seven dissimilarity metrics. The resultant average accuracy improved to 21%. The co-operative feedback method proposed here could enhance both diagnostic and staging throughput efficiency in chest radiology practice.

  10. Automated Detection of Microaneurysms Using Scale-Adapted Blob Analysis and Semi-Supervised Learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Adal, Kedir M.; Sidebe, Desire; Ali, Sharib

    2014-01-07

    Despite several attempts, automated detection of microaneurysm (MA) from digital fundus images still remains to be an open issue. This is due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs from an image and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are then introduced to characterize these blob regions. A semi-supervised based learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier to detect true MAs. The developed system is built using onlymore » few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the art techniques as well as the applicability of the proposed features to analyze fundus images.« less

  11. FROM2D to 3d Supervised Segmentation and Classification for Cultural Heritage Applications

    NASA Astrophysics Data System (ADS)

    Grilli, E.; Dininno, D.; Petrucci, G.; Remondino, F.

    2018-05-01

    The digital management of architectural heritage information is still a complex problem, as a heritage object requires an integrated representation of various types of information in order to develop appropriate restoration or conservation strategies. Currently, there is extensive research focused on automatic procedures of segmentation and classification of 3D point clouds or meshes, which can accelerate the study of a monument and integrate it with heterogeneous information and attributes, useful to characterize and describe the surveyed object. The aim of this study is to propose an optimal, repeatable and reliable procedure to manage various types of 3D surveying data and associate them with heterogeneous information and attributes to characterize and describe the surveyed object. In particular, this paper presents an approach for classifying 3D heritage models, starting from the segmentation of their textures based on supervised machine learning methods. Experimental results run on three different case studies demonstrate that the proposed approach is effective and with many further potentials.

  12. 40 CFR 171.11 - Federal certification of pesticide applicators in States or on Indian Reservations where there is...

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... applicable, any person who uses or supervises the use of any pesticide classified for restricted use must be... program. The individual may successfully complete a self-study learning program approved by the...

  13. 40 CFR 171.11 - Federal certification of pesticide applicators in States or on Indian Reservations where there is...

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... applicable, any person who uses or supervises the use of any pesticide classified for restricted use must be... program. The individual may successfully complete a self-study learning program approved by the...

  14. 40 CFR 171.11 - Federal certification of pesticide applicators in States or on Indian Reservations where there is...

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... applicable, any person who uses or supervises the use of any pesticide classified for restricted use must be... program. The individual may successfully complete a self-study learning program approved by the...

  15. Deep Learning Representation from Electroencephalography of Early-Stage Creutzfeldt-Jakob Disease and Features for Differentiation from Rapidly Progressive Dementia.

    PubMed

    Morabito, Francesco Carlo; Campolo, Maurizio; Mammone, Nadia; Versaci, Mario; Franceschetti, Silvana; Tagliavini, Fabrizio; Sofia, Vito; Fatuzzo, Daniela; Gambardella, Antonio; Labate, Angelo; Mumoli, Laura; Tripodi, Giovanbattista Gaspare; Gasparini, Sara; Cianci, Vittoria; Sueri, Chiara; Ferlazzo, Edoardo; Aguglia, Umberto

    2017-03-01

    A novel technique of quantitative EEG for differentiating patients with early-stage Creutzfeldt-Jakob disease (CJD) from other forms of rapidly progressive dementia (RPD) is proposed. The discrimination is based on the extraction of suitable features from the time-frequency representation of the EEG signals through continuous wavelet transform (CWT). An average measure of complexity of the EEG signal obtained by permutation entropy (PE) is also included. The dimensionality of the feature space is reduced through a multilayer processing system based on the recently emerged deep learning (DL) concept. The DL processor includes a stacked auto-encoder, trained by unsupervised learning techniques, and a classifier whose parameters are determined in a supervised way by associating the known category labels to the reduced vector of high-level features generated by the previous processing blocks. The supervised learning step is carried out by using either support vector machines (SVM) or multilayer neural networks (MLP-NN). A subset of EEG from patients suffering from Alzheimer's Disease (AD) and healthy controls (HC) is considered for differentiating CJD patients. When fine-tuning the parameters of the global processing system by a supervised learning procedure, the proposed system is able to achieve an average accuracy of 89%, an average sensitivity of 92%, and an average specificity of 89% in differentiating CJD from RPD. Similar results are obtained for CJD versus AD and CJD versus HC.

  16. Classification of spontaneous EEG signals in migraine

    NASA Astrophysics Data System (ADS)

    Bellotti, R.; De Carlo, F.; de Tommaso, M.; Lucente, M.

    2007-08-01

    We set up a classification system able to detect patients affected by migraine without aura, through the analysis of their spontaneous EEG patterns. First, the signals are characterized by means of wavelet-based features, than a supervised neural network is used to classify the multichannel data. For the feature extraction, scale-dependent and scale-independent methods are considered with a variety of wavelet functions. Both the approaches provide very high and almost comparable classification performances. A complete separation of the two groups is obtained when the data are plotted in the plane spanned by two suitable neural outputs.

  17. Applications of remote sensing, volume 3

    NASA Technical Reports Server (NTRS)

    Landgrebe, D. A. (Principal Investigator)

    1977-01-01

    The author has identified the following significant results. Of the four change detection techniques (post classification comparison, delta data, spectral/temporal, and layered spectral temporal), the post classification comparison was selected for further development. This was based upon test performances of the four change detection method, straightforwardness of the procedures, and the output products desired. A standardized modified, supervised classification procedure for analyzing the Texas coastal zone data was compiled. This procedure was developed in order that all quadrangles in the study are would be classified using similar analysis techniques to allow for meaningful comparisons and evaluations of the classifications.

  18. Random forests ensemble classifier trained with data resampling strategy to improve cardiac arrhythmia diagnosis.

    PubMed

    Ozçift, Akin

    2011-05-01

    Supervised classification algorithms are commonly used in the designing of computer-aided diagnosis systems. In this study, we present a resampling strategy based Random Forests (RF) ensemble classifier to improve diagnosis of cardiac arrhythmia. Random forests is an ensemble classifier that consists of many decision trees and outputs the class that is the mode of the class's output by individual trees. In this way, an RF ensemble classifier performs better than a single tree from classification performance point of view. In general, multiclass datasets having unbalanced distribution of sample sizes are difficult to analyze in terms of class discrimination. Cardiac arrhythmia is such a dataset that has multiple classes with small sample sizes and it is therefore adequate to test our resampling based training strategy. The dataset contains 452 samples in fourteen types of arrhythmias and eleven of these classes have sample sizes less than 15. Our diagnosis strategy consists of two parts: (i) a correlation based feature selection algorithm is used to select relevant features from cardiac arrhythmia dataset. (ii) RF machine learning algorithm is used to evaluate the performance of selected features with and without simple random sampling to evaluate the efficiency of proposed training strategy. The resultant accuracy of the classifier is found to be 90.0% and this is a quite high diagnosis performance for cardiac arrhythmia. Furthermore, three case studies, i.e., thyroid, cardiotocography and audiology, are used to benchmark the effectiveness of the proposed method. The results of experiments demonstrated the efficiency of random sampling strategy in training RF ensemble classification algorithm. Copyright © 2011 Elsevier Ltd. All rights reserved.

  19. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.

    PubMed

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.

  20. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    PubMed Central

    Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159

  1. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    USGS Publications Warehouse

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its accuracy of classifying basic behavior classification accuracy of basic behaviors at sampling frequencies as low as 10Hz, the KNN at sampling frequencies as low as 20Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.

  2. Damage and recovery assessment of the Philippines' mangroves following Super Typhoon Haiyan

    USGS Publications Warehouse

    Long, Jordan; Giri, Chandra; Primavera, Jurgene H.; Trivedi, Mandar

    2016-01-01

    We quantified mangrove disturbance resulting from Super Typhoon Haiyan using a remote sensing approach. Mangrove areas were mapped prior to Haiyan using 30 m Landsat imagery and a supervised decision-tree classification. A time sequence of 250 m eMODIS data was used to monitor mangrove condition prior to, and following, Haiyan. Based on differences in eMODIS NDVI observations before and after the storm, we classified mangrove into three damage level categories: minimal, moderate, or severe. Mangrove damage in terms of extent and severity was greatest where Haiyan first made landfall on Eastern Samar and Western Samar provinces and lessened westward corresponding with decreasing storm intensity as Haiyan tracked from east to west across the Visayas region of the Philippines. However, within 18 months following Haiyan, mangrove areas classified as severely, moderately, and minimally damaged decreased by 90%, 81%, and 57%, respectively, indicating mangroves resilience to powerful typhoons.

  3. Neural-network classifiers for automatic real-world aerial image recognition

    NASA Astrophysics Data System (ADS)

    Greenberg, Shlomo; Guterman, Hugo

    1996-08-01

    We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.

  4. Neural-network classifiers for automatic real-world aerial image recognition.

    PubMed

    Greenberg, S; Guterman, H

    1996-08-10

    We describe the application of the multilayer perceptron (MLP) network and a version of the adaptive resonance theory version 2-A (ART 2-A) network to the problem of automatic aerial image recognition (AAIR). The classification of aerial images, independent of their positions and orientations, is required for automatic tracking and target recognition. Invariance is achieved by the use of different invariant feature spaces in combination with supervised and unsupervised neural networks. The performance of neural-network-based classifiers in conjunction with several types of invariant AAIR global features, such as the Fourier-transform space, Zernike moments, central moments, and polar transforms, are examined. The advantages of this approach are discussed. The performance of the MLP network is compared with that of a classical correlator. The MLP neural-network correlator outperformed the binary phase-only filter (BPOF) correlator. It was found that the ART 2-A distinguished itself with its speed and its low number of required training vectors. However, only the MLP classifier was able to deal with a combination of shift and rotation geometric distortions.

  5. A Novel Unsupervised Adaptive Learning Method for Long-Term Electromyography (EMG) Pattern Recognition

    PubMed Central

    Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi

    2017-01-01

    Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We presents a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle). PMID:28608824

  6. A Novel Unsupervised Adaptive Learning Method for Long-Term Electromyography (EMG) Pattern Recognition.

    PubMed

    Huang, Qi; Yang, Dapeng; Jiang, Li; Zhang, Huajie; Liu, Hong; Kotani, Kiyoshi

    2017-06-13

    Performance degradation will be caused by a variety of interfering factors for pattern recognition-based myoelectric control methods in the long term. This paper proposes an adaptive learning method with low computational cost to mitigate the effect in unsupervised adaptive learning scenarios. We presents a particle adaptive classifier (PAC), by constructing a particle adaptive learning strategy and universal incremental least square support vector classifier (LS-SVC). We compared PAC performance with incremental support vector classifier (ISVC) and non-adapting SVC (NSVC) in a long-term pattern recognition task in both unsupervised and supervised adaptive learning scenarios. Retraining time cost and recognition accuracy were compared by validating the classification performance on both simulated and realistic long-term EMG data. The classification results of realistic long-term EMG data showed that the PAC significantly decreased the performance degradation in unsupervised adaptive learning scenarios compared with NSVC (9.03% ± 2.23%, p < 0.05) and ISVC (13.38% ± 2.62%, p = 0.001), and reduced the retraining time cost compared with ISVC (2 ms per updating cycle vs. 50 ms per updating cycle).

  7. Towards harmonized seismic analysis across Europe using supervised machine learning approaches

    NASA Astrophysics Data System (ADS)

    Zaccarelli, Riccardo; Bindi, Dino; Cotton, Fabrice; Strollo, Angelo

    2017-04-01

    In the framework of the Thematic Core Services for Seismology of EPOS-IP (European Plate Observing System-Implementation Phase), a service for disseminating a regionalized logic-tree of ground motions models for Europe is under development. While for the Mediterranean area the large availability of strong motion data qualified and disseminated through the Engineering Strong Motion database (ESM-EPOS), supports the development of both selection criteria and ground motion models, for the low-to-moderate seismic regions of continental Europe the development of ad-hoc models using weak motion recordings of moderate earthquakes is unavoidable. Aim of this work is to present a platform for creating application-oriented earthquake databases by retrieving information from EIDA (European Integrated Data Archive) and applying supervised learning models for earthquake records selection and processing suitable for any specific application of interest. Supervised learning models, i.e. the task of inferring a function from labelled training data, have been extensively used in several fields such as spam detection, speech and image recognition and in general pattern recognition. Their suitability to detect anomalies and perform a semi- to fully- automated filtering on large waveform data set easing the effort of (or replacing) human expertise is therefore straightforward. Being supervised learning algorithms capable of learning from a relatively small training set to predict and categorize unseen data, its advantage when processing large amount of data is crucial. Moreover, their intrinsic ability to make data driven predictions makes them suitable (and preferable) in those cases where explicit algorithms for detection might be unfeasible or too heuristic. In this study, we consider relatively simple statistical classifiers (e.g., Naive Bayes, Logistic Regression, Random Forest, SVMs) where label are assigned to waveform data based on "recognized classes" needed for our use case. These classes might be a simply binary case (e.g., "good for analysis" vs "bad") or more complex one (e.g., "good for analysis" vs "low SNR", "multi-event", "bad coda envelope"). It is important to stress the fact that our approach can be generalized to any use case providing, as in any supervised approach, an adequate training set of labelled data, a feature-set, a statistical classifier, and finally model validation and evaluation. Examples of use cases considered to develop the system prototype are the characterization of the ground motion in low seismic areas; harmonized spectral analysis across Europe for source and attenuation studies; magnitude calibration; coda analysis for attenuation studies.

  8. Probability Based hERG Blocker Classifiers.

    PubMed

    Wang, Zhi; Mussa, Hamse Y; Lowe, Robert; Glen, Robert C; Yan, Aixia

    2012-09-01

    The US Food and Drug Administration (FDA) require in vitro human ether-a-go-go related (hERG) ion channel affinity tests for all drug candidates prior to clinical trials. In this study, probabilistic-based methods were employed to develop prediction models on hERG inhibition prediction, which are different from traditional QSAR models that are mainly based on supervised 'hard point' (HP) classification approaches giving 'yes/no' answers. The obtained models can 'ascertain' whether or not a given set of compounds can block hERG ion channels. The results presented indicate that the proposed probabilistic-based method can be a valuable tool for ranking compounds with respect to their potential cardio-toxicity and will be promising for other toxic property predictions. Copyright © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Daily life activity routine discovery in hemiparetic rehabilitation patients using topic models.

    PubMed

    Seiter, J; Derungs, A; Schuster-Amft, C; Amft, O; Tröster, G

    2015-01-01

    Monitoring natural behavior and activity routines of hemiparetic rehabilitation patients across the day can provide valuable progress information for therapists and patients and contribute to an optimized rehabilitation process. In particular, continuous patient monitoring could add type, frequency and duration of daily life activity routines and hence complement standard clinical scores that are assessed for particular tasks only. Machine learning methods have been applied to infer activity routines from sensor data. However, supervised methods require activity annotations to build recognition models and thus require extensive patient supervision. Discovery methods, including topic models could provide patient routine information and deal with variability in activity and movement performance across patients. Topic models have been used to discover characteristic activity routine patterns of healthy individuals using activity primitives recognized from supervised sensor data. Yet, the applicability of topic models for hemiparetic rehabilitation patients and techniques to derive activity primitives without supervision needs to be addressed. We investigate, 1) whether a topic model-based activity routine discovery framework can infer activity routines of rehabilitation patients from wearable motion sensor data. 2) We compare the performance of our topic model-based activity routine discovery using rule-based and clustering-based activity vocabulary. We analyze the activity routine discovery in a dataset recorded with 11 hemiparetic rehabilitation patients during up to ten full recording days per individual in an ambulatory daycare rehabilitation center using wearable motion sensors attached to both wrists and the non-affected thigh. We introduce and compare rule-based and clustering-based activity vocabulary to process statistical and frequency acceleration features to activity words. Activity words were used for activity routine pattern discovery using topic models based on Latent Dirichlet Allocation. Discovered activity routine patterns were then mapped to six categorized activity routines. Using the rule-based approach, activity routines could be discovered with an average accuracy of 76% across all patients. The rule-based approach outperformed clustering by 10% and showed less confusions for predicted activity routines. Topic models are suitable to discover daily life activity routines in hemiparetic rehabilitation patients without trained classifiers and activity annotations. Activity routines show characteristic patterns regarding activity primitives including body and extremity postures and movement. A patient-independent rule set can be derived. Including expert knowledge supports successful activity routine discovery over completely data-driven clustering.

  10. The Need for Data-Informed Clinical Supervision in Substance Use Disorder Treatment

    PubMed Central

    Ramsey, Alex T.; Baumann, Ana; Silver Wolf, David Patterson; Yan, Yan; Cooper, Ben; Proctor, Enola

    2017-01-01

    Background Effective clinical supervision is necessary for high-quality care in community-based substance use disorder treatment settings, yet little is known about current supervision practices. Some evidence suggests that supervisors and counselors differ in their experiences of clinical supervision; however, the impact of this misalignment on supervision quality is unclear. Clinical information monitoring systems may support supervision in substance use disorder treatment, but the potential use of these tools must first be explored. Aims First, this study examines the extent to which misaligned supervisor-counselor perceptions impact supervision satisfaction and emphasis on evidence-based treatments. This study also reports on formative work to develop a supervision-based clinical dashboard, an electronic information monitoring system and data visualization tool providing real-time clinical information to engage supervisors and counselors in a coordinated and data-informed manner, help align supervisor-counselor perceptions about supervision, and improve supervision effectiveness. Methods Clinical supervisors and frontline counselors (N=165) from five Midwestern agencies providing substance abuse services completed an online survey using Research Electronic Data Capture (REDCap) software, yielding a 75% response rate. Valid quantitative measures of supervision effectiveness were assessed, along with qualitative perceptions of a supervision-based clinical dashboard. Results Through within-dyad analyses, misalignment between supervisor and counselor perceptions of supervision practices was negatively associated with satisfaction of supervision and reported frequency of discussing several important clinical supervision topics, including evidence-based treatments and client rapport. Participants indicated the most useful clinical dashboard functions and reported important benefits and challenges to using the proposed tool. Discussion Clinical supervision tends to be largely an informal and unstructured process in substance abuse treatment, which may compromise the quality of care. Clinical dashboards may be a well-targeted approach to facilitate data-informed clinical supervision in community-based treatment agencies. PMID:28166480

  11. The need for data-informed clinical supervision in substance use disorder treatment.

    PubMed

    Ramsey, Alex T; Baumann, Ana; Patterson Silver Wolf, David; Yan, Yan; Cooper, Ben; Proctor, Enola

    2017-01-01

    Effective clinical supervision is necessary for high-quality care in community-based substance use disorder treatment settings, yet little is known about current supervision practices. Some evidence suggests that supervisors and counselors differ in their experiences of clinical supervision; however, the impact of this misalignment on supervision quality is unclear. Clinical information monitoring systems may support supervision in substance use disorder treatment, but the potential use of these tools must first be explored. First, the current study examines the extent to which misaligned supervisor-counselor perceptions impact supervision satisfaction and emphasis on evidence-based treatments. This study also reports on formative work to develop a supervision-based clinical dashboard, an electronic information monitoring system and data visualization tool providing real-time clinical information to engage supervisors and counselors in a coordinated and data-informed manner, help align supervisor-counselor perceptions about supervision, and improve supervision effectiveness. Clinical supervisors and frontline counselors (N = 165) from five Midwestern agencies providing substance abuse services completed an online survey using Research Electronic Data Capture software, yielding a 75% response rate. Valid quantitative measures of supervision effectiveness were administered, along with qualitative perceptions of a supervision-based clinical dashboard. Through within-dyad analyses, misalignment between supervisor and counselor perceptions of supervision practices was negatively associated with satisfaction of supervision and reported frequency of discussing several important clinical supervision topics, including evidence-based treatments and client rapport. Participants indicated the most useful clinical dashboard functions and reported important benefits and challenges to using the proposed tool. Clinical supervision tends to be largely an informal and unstructured process in substance abuse treatment, which may compromise the quality of care. Clinical dashboards may be a well-targeted approach to facilitate data-informed clinical supervision in community-based treatment agencies.

  12. Comparison of GOES Cloud Classification Algorithms Employing Explicit and Implicit Physics

    NASA Technical Reports Server (NTRS)

    Bankert, Richard L.; Mitrescu, Cristian; Miller, Steven D.; Wade, Robert H.

    2009-01-01

    Cloud-type classification based on multispectral satellite imagery data has been widely researched and demonstrated to be useful for distinguishing a variety of classes using a wide range of methods. The research described here is a comparison of the classifier output from two very different algorithms applied to Geostationary Operational Environmental Satellite (GOES) data over the course of one year. The first algorithm employs spectral channel thresholding and additional physically based tests. The second algorithm was developed through a supervised learning method with characteristic features of expertly labeled image samples used as training data for a 1-nearest-neighbor classification. The latter's ability to identify classes is also based in physics, but those relationships are embedded implicitly within the algorithm. A pixel-to-pixel comparison analysis was done for hourly daytime scenes within a region in the northeastern Pacific Ocean. Considerable agreement was found in this analysis, with many of the mismatches or disagreements providing insight to the strengths and limitations of each classifier. Depending upon user needs, a rule-based or other postprocessing system that combines the output from the two algorithms could provide the most reliable cloud-type classification.

  13. Object-based locust habitat mapping using high-resolution multispectral satellite data in the southern Aral Sea basin

    NASA Astrophysics Data System (ADS)

    Navratil, Peter; Wilps, Hans

    2013-01-01

    Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.

  14. Mercury⊕: An evidential reasoning image classifier

    NASA Astrophysics Data System (ADS)

    Peddle, Derek R.

    1995-12-01

    MERCURY⊕ is a multisource evidential reasoning classification software system based on the Dempster-Shafer theory of evidence. The design and implementation of this software package is described for improving the classification and analysis of multisource digital image data necessary for addressing advanced environmental and geoscience applications. In the remote-sensing context, the approach provides a more appropriate framework for classifying modern, multisource, and ancillary data sets which may contain a large number of disparate variables with different statistical properties, scales of measurement, and levels of error which cannot be handled using conventional Bayesian approaches. The software uses a nonparametric, supervised approach to classification, and provides a more objective and flexible interface to the evidential reasoning framework using a frequency-based method for computing support values from training data. The MERCURY⊕ software package has been implemented efficiently in the C programming language, with extensive use made of dynamic memory allocation procedures and compound linked list and hash-table data structures to optimize the storage and retrieval of evidence in a Knowledge Look-up Table. The software is complete with a full user interface and runs under Unix, Ultrix, VAX/VMS, MS-DOS, and Apple Macintosh operating system. An example of classifying alpine land cover and permafrost active layer depth in northern Canada is presented to illustrate the use and application of these ideas.

  15. Guidelines for clinical supervision in health service psychology.

    PubMed

    2015-01-01

    This document outlines guidelines for supervision of students in health service psychology education and training programs. The goal was to capture optimal performance expectations for psychologists who supervise. It is based on the premises that supervisors (a) strive to achieve competence in the provision of supervision and (b) employ a competency-based, meta-theoretical approach to the supervision process. The Guidelines on Supervision were developed as a resource to inform education and training regarding the implementation of competency-based supervision. The Guidelines on Supervision build on the robust literatures on competency-based education and clinical supervision. They are organized around seven domains: supervisor competence; diversity; relationships; professionalism; assessment/evaluation/feedback; problems of professional competence, and ethical, legal, and regulatory considerations. The Guidelines on Supervision represent the collective effort of a task force convened by the American Psychological Association (APA) Board of Educational Affairs (BEA). PsycINFO Database Record (c) 2015 APA, all rights reserved.

  16. Supervised learning from human performance at the computationally hard problem of optimal traffic signal control on a network of junctions

    PubMed Central

    Box, Simon

    2014-01-01

    Optimal switching of traffic lights on a network of junctions is a computationally intractable problem. In this research, road traffic networks containing signallized junctions are simulated. A computer game interface is used to enable a human ‘player’ to control the traffic light settings on the junctions within the simulation. A supervised learning approach, based on simple neural network classifiers can be used to capture human player's strategies in the game and thus develop a human-trained machine control (HuTMaC) system that approaches human levels of performance. Experiments conducted within the simulation compare the performance of HuTMaC to two well-established traffic-responsive control systems that are widely deployed in the developed world and also to a temporal difference learning-based control method. In all experiments, HuTMaC outperforms the other control methods in terms of average delay and variance over delay. The conclusion is that these results add weight to the suggestion that HuTMaC may be a viable alternative, or supplemental method, to approximate optimization for some practical engineering control problems where the optimal strategy is computationally intractable. PMID:26064570

  17. Supervised learning from human performance at the computationally hard problem of optimal traffic signal control on a network of junctions.

    PubMed

    Box, Simon

    2014-12-01

    Optimal switching of traffic lights on a network of junctions is a computationally intractable problem. In this research, road traffic networks containing signallized junctions are simulated. A computer game interface is used to enable a human 'player' to control the traffic light settings on the junctions within the simulation. A supervised learning approach, based on simple neural network classifiers can be used to capture human player's strategies in the game and thus develop a human-trained machine control (HuTMaC) system that approaches human levels of performance. Experiments conducted within the simulation compare the performance of HuTMaC to two well-established traffic-responsive control systems that are widely deployed in the developed world and also to a temporal difference learning-based control method. In all experiments, HuTMaC outperforms the other control methods in terms of average delay and variance over delay. The conclusion is that these results add weight to the suggestion that HuTMaC may be a viable alternative, or supplemental method, to approximate optimization for some practical engineering control problems where the optimal strategy is computationally intractable.

  18. D Semantic Labeling of ALS Data Based on Domain Adaption by Transferring and Fusing Random Forest Models

    NASA Astrophysics Data System (ADS)

    Wu, J.; Yao, W.; Zhang, J.; Li, Y.

    2018-04-01

    Labeling 3D point cloud data with traditional supervised learning methods requires considerable labelled samples, the collection of which is cost and time expensive. This work focuses on adopting domain adaption concept to transfer existing trained random forest classifiers (based on source domain) to new data scenes (target domain), which aims at reducing the dependence of accurate 3D semantic labeling in point clouds on training samples from the new data scene. Firstly, two random forest classifiers were firstly trained with existing samples previously collected for other data. They were different from each other by using two different decision tree construction algorithms: C4.5 with information gain ratio and CART with Gini index. Secondly, four random forest classifiers adapted to the target domain are derived through transferring each tree in the source random forest models with two types of operations: structure expansion and reduction-SER and structure transfer-STRUT. Finally, points in target domain are labelled by fusing the four newly derived random forest classifiers using weights of evidence based fusion model. To validate our method, experimental analysis was conducted using 3 datasets: one is used as the source domain data (Vaihingen data for 3D Semantic Labelling); another two are used as the target domain data from two cities in China (Jinmen city and Dunhuang city). Overall accuracies of 85.5 % and 83.3 % for 3D labelling were achieved for Jinmen city and Dunhuang city data respectively, with only 1/3 newly labelled samples compared to the cases without domain adaption.

  19. Data integration modeling applied to drill hole planning through semi-supervised learning: A case study from the Dalli Cu-Au porphyry deposit in the central Iran

    NASA Astrophysics Data System (ADS)

    Fatehi, Moslem; Asadi, Hooshang H.

    2017-04-01

    In this study, the application of a transductive support vector machine (TSVM), an innovative semi-supervised learning algorithm, has been proposed for mapping the potential drill targets at a detailed exploration stage. The semi-supervised learning method is a hybrid of supervised and unsupervised learning approach that simultaneously uses both training and non-training data to design a classifier. By using the TSVM algorithm, exploration layers at the Dalli porphyry Cu-Au deposit in the central Iran were integrated to locate the boundary of the Cu-Au mineralization for further drilling. By applying this algorithm on the non-training (unlabeled) and limited training (labeled) Dalli exploration data, the study area was classified in two domains of Cu-Au ore and waste. Then, the results were validated by the earlier block models created, using the available borehole and trench data. In addition to TSVM, the support vector machine (SVM) algorithm was also implemented on the study area for comparison. Thirty percent of the labeled exploration data was used to evaluate the performance of these two algorithms. The results revealed 87 percent correct recognition accuracy for the TSVM algorithm and 82 percent for the SVM algorithm. The deepest inclined borehole, recently drilled in the western part of the Dalli deposit, indicated that the boundary of Cu-Au mineralization, as identified by the TSVM algorithm, was only 15 m off from the actual boundary intersected by this borehole. According to the results of the TSVM algorithm, six new boreholes were suggested for further drilling at the Dalli deposit. This study showed that the TSVM algorithm could be a useful tool for enhancing the mineralization zones and consequently, ensuring a more accurate drill hole planning.

  20. Automatic Classification of Time-variable X-Ray Sources

    NASA Astrophysics Data System (ADS)

    Lo, Kitty K.; Farrell, Sean; Murphy, Tara; Gaensler, B. M.

    2014-05-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, and other multi-wavelength contextual information. The 10 fold cross validation accuracy of the training data is ~97% on a 7 class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7-500250 appears to be the most unusual source in the sample. Its X-ray spectra is suggestive of a ultraluminous X-ray source but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.

  1. An open, multi-vendor, multi-field-strength brain MR dataset and analysis of publicly available skull stripping methods agreement.

    PubMed

    Souza, Roberto; Lucena, Oeslle; Garrafa, Julia; Gobbi, David; Saluzzi, Marina; Appenzeller, Simone; Rittner, Letícia; Frayne, Richard; Lotufo, Roberto

    2018-04-15

    This paper presents an open, multi-vendor, multi-field strength magnetic resonance (MR) T1-weighted volumetric brain imaging dataset, named Calgary-Campinas-359 (CC-359). The dataset is composed of images of older healthy adults (29-80 years) acquired on scanners from three vendors (Siemens, Philips and General Electric) at both 1.5 T and 3 T. CC-359 is comprised of 359 datasets, approximately 60 subjects per vendor and magnetic field strength. The dataset is approximately age and gender balanced, subject to the constraints of the available images. It provides consensus brain extraction masks for all volumes generated using supervised classification. Manual segmentation results for twelve randomly selected subjects performed by an expert are also provided. The CC-359 dataset allows investigation of 1) the influences of both vendor and magnetic field strength on quantitative analysis of brain MR; 2) parameter optimization for automatic segmentation methods; and potentially 3) machine learning classifiers with big data, specifically those based on deep learning methods, as these approaches require a large amount of data. To illustrate the utility of this dataset, we compared to the results of a supervised classifier, the results of eight publicly available skull stripping methods and one publicly available consensus algorithm. A linear mixed effects model analysis indicated that vendor (p-value<0.001) and magnetic field strength (p-value<0.001) have statistically significant impacts on skull stripping results. Copyright © 2017 Elsevier Inc. All rights reserved.

  2. Producing Information for Corine Database by Using Classification Method: a Case Study of Sazlidere Basin, Istanbul

    NASA Astrophysics Data System (ADS)

    Sarıyılmaz, F. B.; Musaoğlu, N.; Uluğtekin, N.

    2017-11-01

    The Sazlidere Basin is located on the European side of Istanbul within the borders of Arnavutkoy and Basaksehir districts. The total area of the basin, which is largely located within the province of Arnavutkoy, is approximately 177 km2. The Sazlidere Basin is faced with intense urbanization pressures and land use / cover change due to the Northern Marmara Motorway, 3rd airport and Channel Istanbul Projects, which are planned to be realized in the Arnavutkoy region. Due to the mentioned projects, intense land use /cover changes occur in the basin. In this study, 2000 and 2012 dated LANDSAT images were supervised classified based on CORINE Land Cover first level to determine the land use/cover classes. As a result, four information classes were identified. These classes are water bodies, forest and semi-natural areas, agricultural areas and artificial surfaces. Accuracy analysis of the images were performed following the classification process. The supervised classified images that have the smallest mapping units 0.09 ha and 0.64 ha were generalized to be compatible with the CORINE Land Cover data. The image pixels have been rearranged by using the thematic pixel aggregation method as the smallest mapping unit is 25 ha. These results were compared with CORINE Land Cover 2000 and CORINE Land Cover 2012, which were obtained by digitizing land cover and land use classes on satellite images. It has been determined that the compared results are compatible with each other in terms of quality and quantity.

  3. Psoriasis image representation using patch-based dictionary learning for erythema severity scoring.

    PubMed

    George, Yasmeen; Aldeen, Mohammad; Garnavi, Rahil

    2018-06-01

    Psoriasis is a chronic skin disease which can be life-threatening. Accurate severity scoring helps dermatologists to decide on the treatment. In this paper, we present a semi-supervised computer-aided system for automatic erythema severity scoring in psoriasis images. Firstly, the unsupervised stage includes a novel image representation method. We construct a dictionary, which is then used in the sparse representation for local feature extraction. To acquire the final image representation vector, an aggregation method is exploited over the local features. Secondly, the supervised phase is where various multi-class machine learning (ML) classifiers are trained for erythema severity scoring. Finally, we compare the proposed system with two popular unsupervised feature extractor methods, namely: bag of visual words model (BoVWs) and AlexNet pretrained model. Root mean square error (RMSE) and F1 score are used as performance measures for the learned dictionaries and the trained ML models, respectively. A psoriasis image set consisting of 676 images, is used in this study. Experimental results demonstrate that the use of the proposed procedure can provide a setup where erythema scoring is accurate and consistent. Also, it is revealed that dictionaries with large number of atoms and small patch sizes yield the best representative erythema severity features. Further, random forest (RF) outperforms other classifiers with F1 score 0.71, followed by support vector machine (SVM) and boosting with 0.66 and 0.64 scores, respectively. Furthermore, the conducted comparative studies confirm the effectiveness of the proposed approach with improvement of 9% and 12% over BoVWs and AlexNet based features, respectively. Crown Copyright © 2018. Published by Elsevier Ltd. All rights reserved.

  4. nRC: non-coding RNA Classifier based on structural features.

    PubMed

    Fiannaca, Antonino; La Rosa, Massimo; La Paglia, Laura; Rizzo, Riccardo; Urso, Alfonso

    2017-01-01

    Non-coding RNA (ncRNA) are small non-coding sequences involved in gene expression regulation of many biological processes and diseases. The recent discovery of a large set of different ncRNAs with biologically relevant roles has opened the way to develop methods able to discriminate between the different ncRNA classes. Moreover, the lack of knowledge about the complete mechanisms in regulative processes, together with the development of high-throughput technologies, has required the help of bioinformatics tools in addressing biologists and clinicians with a deeper comprehension of the functional roles of ncRNAs. In this work, we introduce a new ncRNA classification tool, nRC (non-coding RNA Classifier). Our approach is based on features extraction from the ncRNA secondary structure together with a supervised classification algorithm implementing a deep learning architecture based on convolutional neural networks. We tested our approach for the classification of 13 different ncRNA classes. We obtained classification scores, using the most common statistical measures. In particular, we reach an accuracy and sensitivity score of about 74%. The proposed method outperforms other similar classification methods based on secondary structure features and machine learning algorithms, including the RNAcon tool that, to date, is the reference classifier. nRC tool is freely available as a docker image at https://hub.docker.com/r/tblab/nrc/. The source code of nRC tool is also available at https://github.com/IcarPA-TBlab/nrc.

  5. Geographical classification of apple based on hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Guo, Zhiming; Huang, Wenqian; Chen, Liping; Zhao, Chunjiang; Peng, Yankun

    2013-05-01

    Attribute of apple according to geographical origin is often recognized and appreciated by the consumers. It is usually an important factor to determine the price of a commercial product. Hyperspectral imaging technology and supervised pattern recognition was attempted to discriminate apple according to geographical origins in this work. Hyperspectral images of 207 Fuji apple samples were collected by hyperspectral camera (400-1000nm). Principal component analysis (PCA) was performed on hyperspectral imaging data to determine main efficient wavelength images, and then characteristic variables were extracted by texture analysis based on gray level co-occurrence matrix (GLCM) from dominant waveband image. All characteristic variables were obtained by fusing the data of images in efficient spectra. Support vector machine (SVM) was used to construct the classification model, and showed excellent performance in classification results. The total classification rate had the high classify accuracy of 92.75% in the training set and 89.86% in the prediction sets, respectively. The overall results demonstrated that the hyperspectral imaging technique coupled with SVM classifier can be efficiently utilized to discriminate Fuji apple according to geographical origins.

  6. Improvement of information fusion-based audio steganalysis

    NASA Astrophysics Data System (ADS)

    Kraetzer, Christian; Dittmann, Jana

    2010-01-01

    In the paper we extend an existing information fusion based audio steganalysis approach by three different kinds of evaluations: The first evaluation addresses the so far neglected evaluations on sensor level fusion. Our results show that this fusion removes content dependability while being capable of achieving similar classification rates (especially for the considered global features) if compared to single classifiers on the three exemplarily tested audio data hiding algorithms. The second evaluation enhances the observations on fusion from considering only segmental features to combinations of segmental and global features, with the result of a reduction of the required computational complexity for testing by about two magnitudes while maintaining the same degree of accuracy. The third evaluation tries to build a basis for estimating the plausibility of the introduced steganalysis approach by measuring the sensibility of the models used in supervised classification of steganographic material against typical signal modification operations like de-noising or 128kBit/s MP3 encoding. Our results show that for some of the tested classifiers the probability of false alarms rises dramatically after such modifications.

  7. A deep learning and novelty detection framework for rapid phenotyping in high-content screening

    PubMed Central

    Sommer, Christoph; Hoefler, Rudolf; Samwer, Matthias; Gerlich, Daniel W.

    2017-01-01

    Supervised machine learning is a powerful and widely used method for analyzing high-content screening data. Despite its accuracy, efficiency, and versatility, supervised machine learning has drawbacks, most notably its dependence on a priori knowledge of expected phenotypes and time-consuming classifier training. We provide a solution to these limitations with CellCognition Explorer, a generic novelty detection and deep learning framework. Application to several large-scale screening data sets on nuclear and mitotic cell morphologies demonstrates that CellCognition Explorer enables discovery of rare phenotypes without user training, which has broad implications for improved assay development in high-content screening. PMID:28954863

  8. Psychotherapy-based supervision models in an emerging competency-based era: a commentary.

    PubMed

    Falender, Carol A; Shafranske, Edward P

    2010-03-01

    As psychology engages in a cultural shift to competency-based education and training supervision practice is being transformed to the use of competency frames and the application of benchmark competencies. In this issue, psychotherapy-based models of supervision are conceptualized in a competency framework. This paper reflects on the translation of key components of each psychotherapy-based supervision approach in terms of foundational and functional competencies articulated in the Competencies Benchmarks (Fouad et al., 2009). The commentary concludes with a discussion of implications for supervision practice and identifies directions for future articulation and development, including evidence-based psychotherapy supervision. PsycINFO Database Record (c) 2010 APA, all rights reserved

  9. Neutral face classification using personalized appearance models for fast and robust emotion detection.

    PubMed

    Chiranjeevi, Pojala; Gopalakrishnan, Viswanath; Moogi, Pratibha

    2015-09-01

    Facial expression recognition is one of the open problems in computer vision. Robust neutral face recognition in real time is a major challenge for various supervised learning-based facial expression recognition methods. This is due to the fact that supervised methods cannot accommodate all appearance variability across the faces with respect to race, pose, lighting, facial biases, and so on, in the limited amount of training data. Moreover, processing each and every frame to classify emotions is not required, as user stays neutral for majority of the time in usual applications like video chat or photo album/web browsing. Detecting neutral state at an early stage, thereby bypassing those frames from emotion classification would save the computational power. In this paper, we propose a light-weight neutral versus emotion classification engine, which acts as a pre-processer to the traditional supervised emotion classification approaches. It dynamically learns neutral appearance at key emotion (KE) points using a statistical texture model, constructed by a set of reference neutral frames for each user. The proposed method is made robust to various types of user head motions by accounting for affine distortions based on a statistical texture model. Robustness to dynamic shift of KE points is achieved by evaluating the similarities on a subset of neighborhood patches around each KE point using the prior information regarding the directionality of specific facial action units acting on the respective KE point. The proposed method, as a result, improves emotion recognition (ER) accuracy and simultaneously reduces computational complexity of the ER system, as validated on multiple databases.

  10. Landsat for practical forest type mapping - A test case

    NASA Technical Reports Server (NTRS)

    Bryant, E.; Dodge, A. G., Jr.; Warren, S. D.

    1980-01-01

    Computer classified Landsat maps are compared with a recent conventional inventory of forest lands in northern Maine. Over the 196,000 hectare area mapped, estimates of the areas of softwood, mixed wood and hardwood forest obtained by a supervised classification of the Landsat data and a standard inventory based on aerial photointerpretation, probability proportional to prediction, field sampling and a standard forest measurement program are found to agree to within 5%. The cost of the Landsat maps is estimated to be $0.065/hectare. It is concluded that satellite techniques are worth developing for forest inventories, although they are not yet refined enough to be incorporated into current practical inventories.

  11. Opportunities to Learn Scientific Thinking in Joint Doctoral Supervision

    ERIC Educational Resources Information Center

    Kobayashi, Sofie; Grout, Brian W.; Rump, Camilla Østerberg

    2015-01-01

    Research into doctoral supervision has increased rapidly over the last decades, yet our understanding of how doctoral students learn scientific thinking from supervision is limited. Most studies are based on interviews with little work being reported that is based on observation of actual supervision. While joint supervision has become widely…

  12. Adequate supervision for children and adolescents.

    PubMed

    Anderst, James; Moffatt, Mary

    2014-11-01

    Primary care providers (PCPs) have the opportunity to improve child health and well-being by addressing supervision issues before an injury or exposure has occurred and/or after an injury or exposure has occurred. Appropriate anticipatory guidance on supervision at well-child visits can improve supervision of children, and may prevent future harm. Adequate supervision varies based on the child's development and maturity, and the risks in the child's environment. Consideration should be given to issues as wide ranging as swimming pools, falls, dating violence, and social media. By considering the likelihood of harm and the severity of the potential harm, caregivers may provide adequate supervision by minimizing risks to the child while still allowing the child to take "small" risks as needed for healthy development. Caregivers should initially focus on direct (visual, auditory, and proximity) supervision of the young child. Gradually, supervision needs to be adjusted as the child develops, emphasizing a safe environment and safe social interactions, with graduated independence. PCPs may foster adequate supervision by providing concrete guidance to caregivers. In addition to preventing injury, supervision includes fostering a safe, stable, and nurturing relationship with every child. PCPs should be familiar with age/developmentally based supervision risks, adequate supervision based on those risks, characteristics of neglectful supervision based on age/development, and ways to encourage appropriate supervision throughout childhood. Copyright 2014, SLACK Incorporated.

  13. A Study of Feature Combination for Vehicle Detection Based on Image Processing

    PubMed Central

    2014-01-01

    Video analytics play a critical role in most recent traffic monitoring and driver assistance systems. In this context, the correct detection and classification of surrounding vehicles through image analysis has been the focus of extensive research in the last years. Most of the pieces of work reported for image-based vehicle verification make use of supervised classification approaches and resort to techniques, such as histograms of oriented gradients (HOG), principal component analysis (PCA), and Gabor filters, among others. Unfortunately, existing approaches are lacking in two respects: first, comparison between methods using a common body of work has not been addressed; second, no study of the combination potentiality of popular features for vehicle classification has been reported. In this study the performance of the different techniques is first reviewed and compared using a common public database. Then, the combination capabilities of these techniques are explored and a methodology is presented for the fusion of classifiers built upon them, taking into account also the vehicle pose. The study unveils the limitations of single-feature based classification and makes clear that fusion of classifiers is highly beneficial for vehicle verification. PMID:24672299

  14. Computer-Vision-Assisted Palm Rehabilitation With Supervised Learning.

    PubMed

    Vamsikrishna, K M; Dogra, Debi Prosad; Desarkar, Maunendra Sankar

    2016-05-01

    Physical rehabilitation supported by the computer-assisted-interface is gaining popularity among health-care fraternity. In this paper, we have proposed a computer-vision-assisted contactless methodology to facilitate palm and finger rehabilitation. Leap motion controller has been interfaced with a computing device to record parameters describing 3-D movements of the palm of a user undergoing rehabilitation. We have proposed an interface using Unity3D development platform. Our interface is capable of analyzing intermediate steps of rehabilitation without the help of an expert, and it can provide online feedback to the user. Isolated gestures are classified using linear discriminant analysis (DA) and support vector machines (SVM). Finally, a set of discrete hidden Markov models (HMM) have been used to classify gesture sequence performed during rehabilitation. Experimental validation using a large number of samples collected from healthy volunteers reveals that DA and SVM perform similarly while applied on isolated gesture recognition. We have compared the results of HMM-based sequence classification with CRF-based techniques. Our results confirm that both HMM and CRF perform quite similarly when tested on gesture sequences. The proposed system can be used for home-based palm or finger rehabilitation in the absence of experts.

  15. Generalized query-based active learning to identify differentially methylated regions in DNA.

    PubMed

    Haque, Md Muksitul; Holder, Lawrence B; Skinner, Michael K; Cook, Diane J

    2013-01-01

    Active learning is a supervised learning technique that reduces the number of examples required for building a successful classifier, because it can choose the data it learns from. This technique holds promise for many biological domains in which classified examples are expensive and time-consuming to obtain. Most traditional active learning methods ask very specific queries to the Oracle (e.g., a human expert) to label an unlabeled example. The example may consist of numerous features, many of which are irrelevant. Removing such features will create a shorter query with only relevant features, and it will be easier for the Oracle to answer. We propose a generalized query-based active learning (GQAL) approach that constructs generalized queries based on multiple instances. By constructing appropriately generalized queries, we can achieve higher accuracy compared to traditional active learning methods. We apply our active learning method to find differentially DNA methylated regions (DMRs). DMRs are DNA locations in the genome that are known to be involved in tissue differentiation, epigenetic regulation, and disease. We also apply our method on 13 other data sets and show that our method is better than another popular active learning technique.

  16. Comparison of wheat classification accuracy using different classifiers of the image-100 system

    NASA Technical Reports Server (NTRS)

    Dejesusparada, N. (Principal Investigator); Chen, S. C.; Moreira, M. A.; Delima, A. M.

    1981-01-01

    Classification results using single-cell and multi-cell signature acquisition options, a point-by-point Gaussian maximum-likelihood classifier, and K-means clustering of the Image-100 system are presented. Conclusions reached are that: a better indication of correct classification can be provided by using a test area which contains various cover types of the study area; classification accuracy should be evaluated considering both the percentages of correct classification and error of commission; supervised classification approaches are better than K-means clustering; Gaussian distribution maximum likelihood classifier is better than Single-cell and Multi-cell Signature Acquisition Options of the Image-100 system; and in order to obtain a high classification accuracy in a large and heterogeneous crop area, using Gaussian maximum-likelihood classifier, homogeneous spectral subclasses of the study crop should be created to derive training statistics.

  17. Automatic morphological classification of galaxy images

    PubMed Central

    Shamir, Lior

    2009-01-01

    We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using a manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically to spiral, elliptical and edge-on galaxies with accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594

  18. “We make choices we think are going to save us”: Debate and stance identification for online breast cancer CAM discussions

    PubMed Central

    Zhang, Shaodian; Qiu, Lin; Chen, Frank; Zhang, Weinan; Yu, Yong; Elhadad, Noémie

    2017-01-01

    Patients discuss complementary and alternative medicine (CAM) in online health communities. Sometimes, patients’ conflicting opinions toward CAM-related issues trigger debates in the community. The objectives of this paper are to identify such debates, identify controversial CAM therapies in a popular online breast cancer community, as well as patients’ stances towards them. To scale our analysis, we trained a set of classifiers. We first constructed a supervised classifier based on a long short-term memory neural network (LSTM) stacked over a convolutional neural network (CNN) to detect automatically CAM-related debates from a popular breast cancer forum. Members’ stances in these debates were also identified by a CNN-based classifier. Finally, posts automatically flagged as debates by the classifier were analyzed to explore which specific CAM therapies trigger debates more often than others. Our methods are able to detect CAM debates with F score of 77%, and identify stances with F score of 70%. The debate classifier identified about 1/6 of all CAM-related posts as debate. About 60% of CAM-related debate posts represent the supportive stance toward CAM usage. Qualitative analysis shows that some specific therapies, such as Gerson therapy and usage of laetrile, trigger debates frequently among members of the breast cancer community. This study demonstrates that neural networks can effectively locate debates on usage and effectiveness of controversial CAM therapies, and can help make sense of patients’ opinions on such issues under dispute. As to CAM for breast cancer, perceptions of their effectiveness vary among patients. Many of the specific therapies trigger debates frequently and are worth more exploration in future work. PMID:28967000

  19. "We make choices we think are going to save us": Debate and stance identification for online breast cancer CAM discussions.

    PubMed

    Zhang, Shaodian; Qiu, Lin; Chen, Frank; Zhang, Weinan; Yu, Yong; Elhadad, Noémie

    2017-04-01

    Patients discuss complementary and alternative medicine (CAM) in online health communities. Sometimes, patients' conflicting opinions toward CAM-related issues trigger debates in the community. The objectives of this paper are to identify such debates, identify controversial CAM therapies in a popular online breast cancer community, as well as patients' stances towards them. To scale our analysis, we trained a set of classifiers. We first constructed a supervised classifier based on a long short-term memory neural network (LSTM) stacked over a convolutional neural network (CNN) to detect automatically CAM-related debates from a popular breast cancer forum. Members' stances in these debates were also identified by a CNN-based classifier. Finally, posts automatically flagged as debates by the classifier were analyzed to explore which specific CAM therapies trigger debates more often than others. Our methods are able to detect CAM debates with F score of 77%, and identify stances with F score of 70%. The debate classifier identified about 1/6 of all CAM-related posts as debate. About 60% of CAM-related debate posts represent the supportive stance toward CAM usage. Qualitative analysis shows that some specific therapies, such as Gerson therapy and usage of laetrile, trigger debates frequently among members of the breast cancer community. This study demonstrates that neural networks can effectively locate debates on usage and effectiveness of controversial CAM therapies, and can help make sense of patients' opinions on such issues under dispute. As to CAM for breast cancer, perceptions of their effectiveness vary among patients. Many of the specific therapies trigger debates frequently and are worth more exploration in future work.

  20. New developments in technology-assisted supervision and training: a practical overview.

    PubMed

    Rousmaniere, Tony; Abbass, Allan; Frederickson, Jon

    2014-11-01

    Clinical supervision and training are now widely available online. In this article, three of the most accessible and widely adopted new developments in clinical supervision and training technology are described: Videoconference supervision, cloud-based file sharing software, and clinical outcome tracking software. Partial transcripts from two online supervision sessions are provided as examples of videoconference-based supervision. The benefits and limitations of technology in supervision and training are discussed, with an emphasis on supervision process, ethics, privacy, and security. Recommendations for supervision practice are made, including methods to enhance experiential learning, the supervisory working alliance, and online security. © 2014 Wiley Periodicals, Inc.

  1. Event Recognition Based on Deep Learning in Chinese Texts

    PubMed Central

    Zhang, Yajun; Liu, Zongtian; Zhou, Wen

    2016-01-01

    Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%. PMID:27501231

  2. Event Recognition Based on Deep Learning in Chinese Texts.

    PubMed

    Zhang, Yajun; Liu, Zongtian; Zhou, Wen

    2016-01-01

    Event recognition is the most fundamental and critical task in event-based natural language processing systems. Existing event recognition methods based on rules and shallow neural networks have certain limitations. For example, extracting features using methods based on rules is difficult; methods based on shallow neural networks converge too quickly to a local minimum, resulting in low recognition precision. To address these problems, we propose the Chinese emergency event recognition model based on deep learning (CEERM). Firstly, we use a word segmentation system to segment sentences. According to event elements labeled in the CEC 2.0 corpus, we classify words into five categories: trigger words, participants, objects, time and location. Each word is vectorized according to the following six feature layers: part of speech, dependency grammar, length, location, distance between trigger word and core word and trigger word frequency. We obtain deep semantic features of words by training a feature vector set using a deep belief network (DBN), then analyze those features in order to identify trigger words by means of a back propagation neural network. Extensive testing shows that the CEERM achieves excellent recognition performance, with a maximum F-measure value of 85.17%. Moreover, we propose the dynamic-supervised DBN, which adds supervised fine-tuning to a restricted Boltzmann machine layer by monitoring its training performance. Test analysis reveals that the new DBN improves recognition performance and effectively controls the training time. Although the F-measure increases to 88.11%, the training time increases by only 25.35%.

  3. Automated detection of microaneurysms using scale-adapted blob analysis and semi-supervised learning.

    PubMed

    Adal, Kedir M; Sidibé, Désiré; Ali, Sharib; Chaum, Edward; Karnowski, Thomas P; Mériaudeau, Fabrice

    2014-04-01

    Despite several attempts, automated detection of microaneurysm (MA) from digital fundus images still remains to be an open issue. This is due to the subtle nature of MAs against the surrounding tissues. In this paper, the microaneurysm detection problem is modeled as finding interest regions or blobs from an image and an automatic local-scale selection technique is presented. Several scale-adapted region descriptors are introduced to characterize these blob regions. A semi-supervised based learning approach, which requires few manually annotated learning examples, is also proposed to train a classifier which can detect true MAs. The developed system is built using only few manually labeled and a large number of unlabeled retinal color fundus images. The performance of the overall system is evaluated on Retinopathy Online Challenge (ROC) competition database. A competition performance measure (CPM) of 0.364 shows the competitiveness of the proposed system against state-of-the art techniques as well as the applicability of the proposed features to analyze fundus images. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  4. Application of graph-based semi-supervised learning for development of cyber COP and network intrusion detection

    NASA Astrophysics Data System (ADS)

    Levchuk, Georgiy; Colonna-Romano, John; Eslami, Mohammed

    2017-05-01

    The United States increasingly relies on cyber-physical systems to conduct military and commercial operations. Attacks on these systems have increased dramatically around the globe. The attackers constantly change their methods, making state-of-the-art commercial and military intrusion detection systems ineffective. In this paper, we present a model to identify functional behavior of network devices from netflow traces. Our model includes two innovations. First, we define novel features for a host IP using detection of application graph patterns in IP's host graph constructed from 5-min aggregated packet flows. Second, we present the first application, to the best of our knowledge, of Graph Semi-Supervised Learning (GSSL) to the space of IP behavior classification. Using a cyber-attack dataset collected from NetFlow packet traces, we show that GSSL trained with only 20% of the data achieves higher attack detection rates than Support Vector Machines (SVM) and Naïve Bayes (NB) classifiers trained with 80% of data points. We also show how to improve detection quality by filtering out web browsing data, and conclude with discussion of future research directions.

  5. Exploiting unsupervised and supervised classification for segmentation of the pathological lung in CT

    NASA Astrophysics Data System (ADS)

    Korfiatis, P.; Kalogeropoulou, C.; Daoussis, D.; Petsas, T.; Adonopoulos, A.; Costaridou, L.

    2009-07-01

    Delineation of lung fields in presence of diffuse lung diseases (DLPDs), such as interstitial pneumonias (IP), challenges segmentation algorithms. To deal with IP patterns affecting the lung border an automated image texture classification scheme is proposed. The proposed segmentation scheme is based on supervised texture classification between lung tissue (normal and abnormal) and surrounding tissue (pleura and thoracic wall) in the lung border region. This region is coarsely defined around an initial estimate of lung border, provided by means of Markov Radom Field modeling and morphological operations. Subsequently, a support vector machine classifier was trained to distinguish between the above two classes of tissue, using textural feature of gray scale and wavelet domains. 17 patients diagnosed with IP, secondary to connective tissue diseases were examined. Segmentation performance in terms of overlap was 0.924±0.021, and for shape differentiation mean, rms and maximum distance were 1.663±0.816, 2.334±1.574 and 8.0515±6.549 mm, respectively. An accurate, automated scheme is proposed for segmenting abnormal lung fields in HRC affected by IP

  6. Comparing supervised learning methods for classifying sex, age, context and individual Mudi dogs from barking.

    PubMed

    Larrañaga, Ana; Bielza, Concha; Pongrácz, Péter; Faragó, Tamás; Bálint, Anna; Larrañaga, Pedro

    2015-03-01

    Barking is perhaps the most characteristic form of vocalization in dogs; however, very little is known about its role in the intraspecific communication of this species. Besides the obvious need for ethological research, both in the field and in the laboratory, the possible information content of barks can also be explored by computerized acoustic analyses. This study compares four different supervised learning methods (naive Bayes, classification trees, [Formula: see text]-nearest neighbors and logistic regression) combined with three strategies for selecting variables (all variables, filter and wrapper feature subset selections) to classify Mudi dogs by sex, age, context and individual from their barks. The classification accuracy of the models obtained was estimated by means of [Formula: see text]-fold cross-validation. Percentages of correct classifications were 85.13 % for determining sex, 80.25 % for predicting age (recodified as young, adult and old), 55.50 % for classifying contexts (seven situations) and 67.63 % for recognizing individuals (8 dogs), so the results are encouraging. The best-performing method was [Formula: see text]-nearest neighbors following a wrapper feature selection approach. The results for classifying contexts and recognizing individual dogs were better with this method than they were for other approaches reported in the specialized literature. This is the first time that the sex and age of domestic dogs have been predicted with the help of sound analysis. This study shows that dog barks carry ample information regarding the caller's indexical features. Our computerized analysis provides indirect proof that barks may serve as an important source of information for dogs as well.

  7. Detection and Classification of Pole-Like Objects from Mobile Mapping Data

    NASA Astrophysics Data System (ADS)

    Fukano, K.; Masuda, H.

    2015-08-01

    Laser scanners on a vehicle-based mobile mapping system can capture 3D point-clouds of roads and roadside objects. Since roadside objects have to be maintained periodically, their 3D models are useful for planning maintenance tasks. In our previous work, we proposed a method for detecting cylindrical poles and planar plates in a point-cloud. However, it is often required to further classify pole-like objects into utility poles, streetlights, traffic signals and signs, which are managed by different organizations. In addition, our previous method may fail to extract low pole-like objects, which are often observed in urban residential areas. In this paper, we propose new methods for extracting and classifying pole-like objects. In our method, we robustly extract a wide variety of poles by converting point-clouds into wireframe models and calculating cross-sections between wireframe models and horizontal cutting planes. For classifying pole-like objects, we subdivide a pole-like object into five subsets by extracting poles and planes, and calculate feature values of each subset. Then we apply a supervised machine learning method using feature variables of subsets. In our experiments, our method could achieve excellent results for detection and classification of pole-like objects.

  8. Fuzzy recognition of noncompact musical objects

    NASA Astrophysics Data System (ADS)

    Cristobal Salas, Alfredo; Tchernykh, Andrei

    1997-03-01

    This article describes and compares some techniques to extract attributes from black and white images which contain musical objects. The inertia moment, the central moments and the wavelet transform methods are used to describe the images. Two supervised neural networks are applied to classify the images: backpropagation and fuzzy backpropagation. The results are compared.

  9. Considering Alternate Futures to Classify Off-Task Behavior as Emotion Self-Regulation: A Supervised Learning Approach

    ERIC Educational Resources Information Center

    Sabourin, Jennifer L.; Rowe, Jonathan P.; Mott, Bradford W.; Lester, James C.

    2013-01-01

    Over the past decade, there has been growing interest in real-time assessment of student engagement and motivation during interactions with educational software. Detecting symptoms of disengagement, such as off-task behavior, has shown considerable promise for understanding students' motivational characteristics during learning. In this paper, we…

  10. Bayesian network classifiers for categorizing cortical GABAergic interneurons.

    PubMed

    Mihaljević, Bojan; Benavides-Piccione, Ruth; Bielza, Concha; DeFelipe, Javier; Larrañaga, Pedro

    2015-04-01

    An accepted classification of GABAergic interneurons of the cerebral cortex is a major goal in neuroscience. A recently proposed taxonomy based on patterns of axonal arborization promises to be a pragmatic method for achieving this goal. It involves characterizing interneurons according to five axonal arborization features, called F1-F5, and classifying them into a set of predefined types, most of which are established in the literature. Unfortunately, there is little consensus among expert neuroscientists regarding the morphological definitions of some of the proposed types. While supervised classifiers were able to categorize the interneurons in accordance with experts' assignments, their accuracy was limited because they were trained with disputed labels. Thus, here we automatically classify interneuron subsets with different label reliability thresholds (i.e., such that every cell's label is backed by at least a certain (threshold) number of experts). We quantify the cells with parameters of axonal and dendritic morphologies and, in order to predict the type, also with axonal features F1-F4 provided by the experts. Using Bayesian network classifiers, we accurately characterize and classify the interneurons and identify useful predictor variables. In particular, we discriminate among reliable examples of common basket, horse-tail, large basket, and Martinotti cells with up to 89.52% accuracy, and single out the number of branches at 180 μm from the soma, the convex hull 2D area, and the axonal features F1-F4 as especially useful predictors for distinguishing among these types. These results open up new possibilities for an objective and pragmatic classification of interneurons.

  11. Modeling EEG Waveforms with Semi-Supervised Deep Belief Nets: Fast Classification and Anomaly Measurement

    PubMed Central

    Wulsin, D. F.; Gupta, J. R.; Mani, R.; Blanco, J. A.; Litt, B.

    2011-01-01

    Clinical electroencephalography (EEG) records vast amounts of human complex data yet is still reviewed primarily by human readers. Deep Belief Nets (DBNs) are a relatively new type of multi-layer neural network commonly tested on two-dimensional image data, but are rarely applied to times-series data such as EEG. We apply DBNs in a semi-supervised paradigm to model EEG waveforms for classification and anomaly detection. DBN performance was comparable to standard classifiers on our EEG dataset, and classification time was found to be 1.7 to 103.7 times faster than the other high-performing classifiers. We demonstrate how the unsupervised step of DBN learning produces an autoencoder that can naturally be used in anomaly measurement. We compare the use of raw, unprocessed data—a rarity in automated physiological waveform analysis—to hand-chosen features and find that raw data produces comparable classification and better anomaly measurement performance. These results indicate that DBNs and raw data inputs may be more effective for online automated EEG waveform recognition than other common techniques. PMID:21525569

  12. Land Cover Classification in a Complex Urban-Rural Landscape with Quickbird Imagery

    PubMed Central

    Moran, Emilio Federico.

    2010-01-01

    High spatial resolution images have been increasingly used for urban land use/cover classification, but the high spectral variation within the same land cover, the spectral confusion among different land covers, and the shadow problem often lead to poor classification performance based on the traditional per-pixel spectral-based classification methods. This paper explores approaches to improve urban land cover classification with Quickbird imagery. Traditional per-pixel spectral-based supervised classification, incorporation of textural images and multispectral images, spectral-spatial classifier, and segmentation-based classification are examined in a relatively new developing urban landscape, Lucas do Rio Verde in Mato Grosso State, Brazil. This research shows that use of spatial information during the image classification procedure, either through the integrated use of textural and spectral images or through the use of segmentation-based classification method, can significantly improve land cover classification performance. PMID:21643433

  13. (Machine) learning to do more with less

    NASA Astrophysics Data System (ADS)

    Cohen, Timothy; Freytsis, Marat; Ostdiek, Bryan

    2018-02-01

    Determining the best method for training a machine learning algorithm is critical to maximizing its ability to classify data. In this paper, we compare the standard "fully supervised" approach (which relies on knowledge of event-by-event truth-level labels) with a recent proposal that instead utilizes class ratios as the only discriminating information provided during training. This so-called "weakly supervised" technique has access to less information than the fully supervised method and yet is still able to yield impressive discriminating power. In addition, weak supervision seems particularly well suited to particle physics since quantum mechanics is incompatible with the notion of mapping an individual event onto any single Feynman diagram. We examine the technique in detail — both analytically and numerically — with a focus on the robustness to issues of mischaracterizing the training samples. Weakly supervised networks turn out to be remarkably insensitive to a class of systematic mismodeling. Furthermore, we demonstrate that the event level outputs for weakly versus fully supervised networks are probing different kinematics, even though the numerical quality metrics are essentially identical. This implies that it should be possible to improve the overall classification ability by combining the output from the two types of networks. For concreteness, we apply this technology to a signature of beyond the Standard Model physics to demonstrate that all these impressive features continue to hold in a scenario of relevance to the LHC. Example code is provided on GitHub.

  14. Automatic classification of time-variable X-ray sources

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lo, Kitty K.; Farrell, Sean; Murphy, Tara

    2014-05-01

    To maximize the discovery potential of future synoptic surveys, especially in the field of transient science, it will be necessary to use automatic classification to identify some of the astronomical sources. The data mining technique of supervised classification is suitable for this problem. Here, we present a supervised learning method to automatically classify variable X-ray sources in the Second XMM-Newton Serendipitous Source Catalog (2XMMi-DR2). Random Forest is our classifier of choice since it is one of the most accurate learning algorithms available. Our training set consists of 873 variable sources and their features are derived from time series, spectra, andmore » other multi-wavelength contextual information. The 10 fold cross validation accuracy of the training data is ∼97% on a 7 class data set. We applied the trained classification model to 411 unknown variable 2XMM sources to produce a probabilistically classified catalog. Using the classification margin and the Random Forest derived outlier measure, we identified 12 anomalous sources, of which 2XMM J180658.7–500250 appears to be the most unusual source in the sample. Its X-ray spectra is suggestive of a ultraluminous X-ray source but its variability makes it highly unusual. Machine-learned classification and anomaly detection will facilitate scientific discoveries in the era of all-sky surveys.« less

  15. Automatic Identification of Critical Follow-Up Recommendation Sentences in Radiology Reports

    PubMed Central

    Yetisgen-Yildiz, Meliha; Gunn, Martin L.; Xia, Fei; Payne, Thomas H.

    2011-01-01

    Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports. PMID:22195225

  16. Automatic identification of critical follow-up recommendation sentences in radiology reports.

    PubMed

    Yetisgen-Yildiz, Meliha; Gunn, Martin L; Xia, Fei; Payne, Thomas H

    2011-01-01

    Communication of follow-up recommendations when abnormalities are identified on imaging studies is prone to error. When recommendations are not systematically identified and promptly communicated to referrers, poor patient outcomes can result. Using information technology can improve communication and improve patient safety. In this paper, we describe a text processing approach that uses natural language processing (NLP) and supervised text classification methods to automatically identify critical recommendation sentences in radiology reports. To increase the classification performance we enhanced the simple unigram token representation approach with lexical, semantic, knowledge-base, and structural features. We tested different combinations of those features with the Maximum Entropy (MaxEnt) classification algorithm. Classifiers were trained and tested with a gold standard corpus annotated by a domain expert. We applied 5-fold cross validation and our best performing classifier achieved 95.60% precision, 79.82% recall, 87.0% F-score, and 99.59% classification accuracy in identifying the critical recommendation sentences in radiology reports.

  17. Damage and recovery assessment of the Philippines' mangroves following Super Typhoon Haiyan.

    PubMed

    Long, Jordan; Giri, Chandra; Primavera, Jurgenne; Trivedi, Mandar

    2016-08-30

    We quantified mangrove disturbance resulting from Super Typhoon Haiyan using a remote sensing approach. Mangrove areas were mapped prior to Haiyan using 30m Landsat imagery and a supervised decision-tree classification. A time sequence of 250m eMODIS data was used to monitor mangrove condition prior to, and following, Haiyan. Based on differences in eMODIS NDVI observations before and after the storm, we classified mangrove into three damage level categories: minimal, moderate, or severe. Mangrove damage in terms of extent and severity was greatest where Haiyan first made landfall on Eastern Samar and Western Samar provinces and lessened westward corresponding with decreasing storm intensity as Haiyan tracked from east to west across the Visayas region of the Philippines. However, within 18months following Haiyan, mangrove areas classified as severely, moderately, and minimally damaged decreased by 90%, 81%, and 57%, respectively, indicating mangroves resilience to powerful typhoons. Copyright © 2016 Elsevier Ltd. All rights reserved.

  18. Classifying epileptic EEG signals with delay permutation entropy and Multi-Scale K-means.

    PubMed

    Zhu, Guohun; Li, Yan; Wen, Peng Paul; Wang, Shuaifang

    2015-01-01

    Most epileptic EEG classification algorithms are supervised and require large training datasets, that hinder their use in real time applications. This chapter proposes an unsupervised Multi-Scale K-means (MSK-means) MSK-means algorithm to distinguish epileptic EEG signals and identify epileptic zones. The random initialization of the K-means algorithm can lead to wrong clusters. Based on the characteristics of EEGs, the MSK-means MSK-means algorithm initializes the coarse-scale centroid of a cluster with a suitable scale factor. In this chapter, the MSK-means algorithm is proved theoretically superior to the K-means algorithm on efficiency. In addition, three classifiers: the K-means, MSK-means MSK-means and support vector machine (SVM), are used to identify seizure and localize epileptogenic zone using delay permutation entropy features. The experimental results demonstrate that identifying seizure with the MSK-means algorithm and delay permutation entropy achieves 4. 7 % higher accuracy than that of K-means, and 0. 7 % higher accuracy than that of the SVM.

  19. Urban Shanty Town Recognition Based on High-Resolution Remote Sensing Images and National Geographical Monitoring Features - a Case Study of Nanning City

    NASA Astrophysics Data System (ADS)

    He, Y.; He, Y.

    2018-04-01

    Urban shanty towns are communities that has contiguous old and dilapidated houses with more than 2000 square meters built-up area or more than 50 households. This study makes attempts to extract shanty towns in Nanning City using the product of Census and TripleSat satellite images. With 0.8-meter high-resolution remote sensing images, five texture characteristics (energy, contrast, maximum probability, and inverse difference moment) of shanty towns are trained and analyzed through GLCM. In this study, samples of shanty town are well classified with 98.2 % producer accuracy of unsupervised classification and 73.2 % supervised classification correctness. Low-rise and mid-rise residential blocks in Nanning City are classified into 4 different types by using k-means clustering and nearest neighbour classification respectively. This study initially establish texture feature descriptions of different types of residential areas, especially low-rise and mid-rise buildings, which would help city administrator evaluate residential blocks and reconstruction shanty towns.

  20. Evidence-based Practices Addressed in Community-based Children’s Mental Health Clinical Supervision

    PubMed Central

    Accurso, Erin C.; Taylor, Robin M.; Garland, Ann F.

    2013-01-01

    Context Clinical supervision is the principal method of training for psychotherapeutic practice, however there is virtually no research on supervision practice in community settings. Of particular interest is the role supervision might play in facilitating implementation of evidence-based (EB) care in routine care settings. Objective This study examines the format and functions of clinical supervision sessions in routine care, as well as the extent to which supervision addresses psychotherapeutic practice elements common to EB care for children with disruptive behavior problems, who represent the majority of patients served in publicly-funded routine care settings. Methods Supervisors (n=7) and supervisees (n=12) from four publicly-funded community-based child mental health clinics reported on 130 supervision sessions. Results Supervision sessions were primarily individual in-person meetings lasting one hour. The most common functions included case conceptualization and therapy interventions. Coverage of practice elements common to EB treatments was brief. Discussion Despite the fact that most children presenting to public mental health services are referred for disruptive behavior problems, supervision sessions are infrequently focused on practice elements consistent with EB treatments for this population. Supervision is a promising avenue through which training in EB practices could be supported to improve the quality of care for children in community-based “usual care” clinics. PMID:24761163

  1. Correlation of HIV protease structure with Indinavir resistance: a data mining and neural networks approach

    NASA Astrophysics Data System (ADS)

    Draghici, Sorin; Cumberland, Lonnie T., Jr.; Kovari, Ladislau C.

    2000-04-01

    This paper presents some results of data mining HIV genotypic and structural data. Our aim is to try to relate structural features of HIV enzymes essential to its reproductive abilities to the drug resistance phenomenon. This paper concentrates on the HIV protease enzyme and Indinavir which is one of the FDA approved protease inhibitors. Our starting point was the current list of HIV mutations related to drug resistance. We used the fact that some molecular structures determined through high resolution X-ray crystallography were available for the protease-Indinavir complex. Starting with these structures and the known mutations, we modelled the mutant proteases and studied the pattern of atomic contacts between the protease and the drug. After suitable pre- processing, these patterns have been used as the input of our data mining process. We have used both supervised and unsupervised learning techniques with the aim of understanding the relationship between structural features at a molecular level and resistance to Indinavir. The supervised learning was aimed at predicting IC90 values for arbitrary mutants. The SOFM was aimed at identifying those structural features that are important for drug resistance and discovering a classifier based on such features. We have used validation and cross validation to test the generalization abilities of the learning paradigm we have designed. The straightforward supervised learning was able to learn very successfully but validation results are less than satisfactory. This is due to the insufficient number of patterns in the training set which in turn is due to the scarcity of the available data. The data mining using SOFM was very successful. We have managed to distinguish between resistant and non-resistant mutants using structural features. We have been able to divide all reported HIV mutants into several categories based on their 3- dimensional molecular structures and the pattern of contacts between the mutant protease and Indinavir. Our classifier shows reasonably good prediction performance being able to predict the drug resistance of previously unseen mutants with an accuracy of between 60% and 70%. We believe that this performance can be greatly improved once more data becomes available. The results presented here support the hypothesis that structural features of the molecular structure can be used in antiviral drug treatment selection and drug design.

  2. The Juggling Act of Supervision in Community Mental Health: Implications for Supporting Evidence-Based Treatment.

    PubMed

    Dorsey, Shannon; Pullmann, Michael D; Kerns, Suzanne E U; Jungbluth, Nathaniel; Meza, Rosemary; Thompson, Kelly; Berliner, Lucy

    2017-11-01

    Supervisors are an underutilized resource for supporting evidence-based treatments (EBTs) in community mental health. Little is known about how EBT-trained supervisors use supervision time. Primary aims were to describe supervision (e.g., modality, frequency), examine functions of individual supervision, and examine factors associated with time allocation to supervision functions. Results from 56 supervisors and 207 clinicians from 25 organizations indicate high prevalence of individual supervision, often alongside group and informal supervision. Individual supervision serves a wide range of functions, with substantial variation at the supervisor-level. Implementation climate was the strongest predictor of time allocation to clinical and EBT-relevant functions.

  3. Comparison of several chemometric methods of libraries and classifiers for the analysis of expired drugs based on Raman spectra.

    PubMed

    Gao, Qun; Liu, Yan; Li, Hao; Chen, Hui; Chai, Yifeng; Lu, Feng

    2014-06-01

    Some expired drugs are difficult to detect by conventional means. If they are repackaged and sold back into market, they will constitute a new public health challenge. For the detection of repackaged expired drugs within specification, paracetamol tablet from a manufacturer was used as a model drug in this study for comparison of Raman spectra-based library verification and classification methods. Raman spectra of different batches of paracetamol tablets were collected and a library including standard spectra of unexpired batches of tablets was established. The Raman spectrum of each sample was identified by cosine and correlation with the standard spectrum. The average HQI of the suspicious samples and the standard spectrum were calculated. The optimum threshold values were 0.997 and 0.998 respectively as a result of ROC and four evaluations, for which the accuracy was up to 97%. Three supervised classifiers, PLS-DA, SVM and k-NN, were chosen to establish two-class classification models and compared subsequently. They were used to establish a classification of expired batches and an unexpired batch, and predict the suspect samples. The average accuracy was 90.12%, 96.80% and 89.37% respectively. Different pre-processing techniques were tried to find that first derivative was optimal for methods of libraries and max-min normalization was optimal for that of classifiers. The results obtained from these studies indicated both libraries and classifier methods could detect the expired drugs effectively, and they should be used complementarily in the fast-screening. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Improved characterization of molecular phenotypes in breast lesions using 18F-FDG PET image homogeneity

    NASA Astrophysics Data System (ADS)

    Cao, Kunlin; Bhagalia, Roshni; Sood, Anup; Brogi, Edi; Mellinghoff, Ingo K.; Larson, Steven M.

    2015-03-01

    Positron emission tomography (PET) using uorodeoxyglucose (18F-FDG) is commonly used in the assessment of breast lesions by computing voxel-wise standardized uptake value (SUV) maps. Simple metrics derived from ensemble properties of SUVs within each identified breast lesion are routinely used for disease diagnosis. The maximum SUV within the lesion (SUVmax) is the most popular of these metrics. However these simple metrics are known to be error-prone and are susceptible to image noise. Finding reliable SUV map-based features that correlate to established molecular phenotypes of breast cancer (viz. estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2) expression) will enable non-invasive disease management. This study investigated 36 SUV features based on first and second order statistics, local histograms and texture of segmented lesions to predict ER and PR expression in 51 breast cancer patients. True ER and PR expression was obtained via immunohistochemistry (IHC) of tissue samples from each lesion. A supervised learning, adaptive boosting-support vector machine (AdaBoost-SVM), framework was used to select a subset of features to classify breast lesions into distinct phenotypes. Performance of the trained multi-feature classifier was compared against the baseline single-feature SUVmax classifier using receiver operating characteristic (ROC) curves. Results show that texture features encoding local lesion homogeneity extracted from gray-level co-occurrence matrices are the strongest discriminator of lesion ER expression. In particular, classifiers including these features increased prediction accuracy from 0.75 (baseline) to 0.82 and the area under the ROC curve from 0.64 (baseline) to 0.75.

  5. Automated mineralogy based on micro-energy-dispersive X-ray fluorescence microscopy (µ-EDXRF) applied to plutonic rock thin sections in comparison to a mineral liberation analyzer

    NASA Astrophysics Data System (ADS)

    Nikonow, Wilhelm; Rammlmair, Dieter

    2017-10-01

    Recent developments in the application of micro-energy-dispersive X-ray fluorescence spectrometry mapping (µ-EDXRF) have opened up new opportunities for fast geoscientific analyses. Acquiring spatially resolved spectral and chemical information non-destructively for large samples of up to 20 cm length provides valuable information for geoscientific interpretation. Using supervised classification of the spectral information, mineral distribution maps can be obtained. In this work, thin sections of plutonic rocks are analyzed by µ-EDXRF and classified using the supervised classification algorithm spectral angle mapper (SAM). Based on the mineral distribution maps, it is possible to obtain quantitative mineral information, i.e., to calculate the modal mineralogy, search and locate minerals of interest, and perform image analysis. The results are compared to automated mineralogy obtained from the mineral liberation analyzer (MLA) of a scanning electron microscope (SEM) and show good accordance, revealing variation resulting mostly from the limit of spatial resolution of the µ-EDXRF instrument. Taking into account the little time needed for sample preparation and measurement, this method seems suitable for fast sample overviews with valuable chemical, mineralogical and textural information. Additionally, it enables the researcher to make better and more targeted decisions for subsequent analyses.

  6. Developing a Peer Mentorship Program to Increase Competence in Clinical Supervision in Clinical Psychology Doctoral Training Programs.

    PubMed

    Foxwell, Aleksandra A; Kennard, Beth D; Rodgers, Cynthia; Wolfe, Kristin L; Cassedy, Hannah F; Thomas, Anna

    2017-12-01

    Supervision has recently been recognized as a core competency for clinical psychologists. This recognition of supervision as a distinct competency has evolved in the context of an overall focus on competency-based education and training in health service psychology, and has recently gained momentum. Few clinical psychology doctoral programs offer formal training experiences in providing supervision. A pilot peer mentorship program (PMP) where graduate students were trained in the knowledge and practice of supervision was developed. The focus of the PMP was to develop basic supervision skills in advanced clinical psychology graduate students, as well as to train junior doctoral students in fundamental clinical and practical skills. Advanced doctoral students were matched to junior doctoral students to gain experience in and increase knowledge base in best practices of supervision skills. The 9-month program consisted of monthly mentorship meetings and three training sessions. The results suggested that mentors reported a 30% or more shift from the category of not competent to needs improvement or competent, in the following supervision competencies: theories of supervision, improved skill in supervision modalities, acquired knowledge in supervision, and supervision experience. Furthermore, 50% of the mentors reported that they were not competent in supervision experience at baseline and only 10% reported that they were not competent at the end of the program. Satisfaction data suggested that satisfaction with the program was high, with 75% of participants indicating increased knowledge base in supervision, and 90% indicating that it was a positive addition to their training program. This program was feasible and acceptable and appears to have had a positive impact on the graduate students who participated. Students reported both high satisfaction with the program as well as an increase in knowledge base and experience in supervision skills.

  7. Hierarchical Gene Selection and Genetic Fuzzy System for Cancer Microarray Data Classification

    PubMed Central

    Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

    2015-01-01

    This paper introduces a novel approach to gene selection based on a substantial modification of analytic hierarchy process (AHP). The modified AHP systematically integrates outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal to noise ratio are employed to rank genes. These ranked genes are then considered as inputs for the modified AHP. Additionally, a method that uses fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. Genetic algorithm (GA) is incorporated in-between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional-low-sample nature of microarray data and thus enhance the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection against the single ranking methods. Furthermore, the combination of AHP-FSAM shows a great accuracy in microarray data classification compared to various competing classifiers. The proposed approach therefore is useful for medical practitioners and clinicians as a decision support system that can be implemented in the real medical practice. PMID:25823003

  8. Hierarchical gene selection and genetic fuzzy system for cancer microarray data classification.

    PubMed

    Nguyen, Thanh; Khosravi, Abbas; Creighton, Douglas; Nahavandi, Saeid

    2015-01-01

    This paper introduces a novel approach to gene selection based on a substantial modification of analytic hierarchy process (AHP). The modified AHP systematically integrates outcomes of individual filter methods to select the most informative genes for microarray classification. Five individual ranking methods including t-test, entropy, receiver operating characteristic (ROC) curve, Wilcoxon and signal to noise ratio are employed to rank genes. These ranked genes are then considered as inputs for the modified AHP. Additionally, a method that uses fuzzy standard additive model (FSAM) for cancer classification based on genes selected by AHP is also proposed in this paper. Traditional FSAM learning is a hybrid process comprising unsupervised structure learning and supervised parameter tuning. Genetic algorithm (GA) is incorporated in-between unsupervised and supervised training to optimize the number of fuzzy rules. The integration of GA enables FSAM to deal with the high-dimensional-low-sample nature of microarray data and thus enhance the efficiency of the classification. Experiments are carried out on numerous microarray datasets. Results demonstrate the performance dominance of the AHP-based gene selection against the single ranking methods. Furthermore, the combination of AHP-FSAM shows a great accuracy in microarray data classification compared to various competing classifiers. The proposed approach therefore is useful for medical practitioners and clinicians as a decision support system that can be implemented in the real medical practice.

  9. Sub-pixel image classification for forest types in East Texas

    NASA Astrophysics Data System (ADS)

    Westbrook, Joey

    Sub-pixel classification is the extraction of information about the proportion of individual materials of interest within a pixel. Landcover classification at the sub-pixel scale provides more discrimination than traditional per-pixel multispectral classifiers for pixels where the material of interest is mixed with other materials. It allows for the un-mixing of pixels to show the proportion of each material of interest. The materials of interest for this study are pine, hardwood, mixed forest and non-forest. The goal of this project was to perform a sub-pixel classification, which allows a pixel to have multiple labels, and compare the result to a traditional supervised classification, which allows a pixel to have only one label. The satellite image used was a Landsat 5 Thematic Mapper (TM) scene of the Stephen F. Austin Experimental Forest in Nacogdoches County, Texas and the four cover type classes are pine, hardwood, mixed forest and non-forest. Once classified, a multi-layer raster datasets was created that comprised four raster layers where each layer showed the percentage of that cover type within the pixel area. Percentage cover type maps were then produced and the accuracy of each was assessed using a fuzzy error matrix for the sub-pixel classifications, and the results were compared to the supervised classification in which a traditional error matrix was used. The overall accuracy of the sub-pixel classification using the aerial photo for both training and reference data had the highest (65% overall) out of the three sub-pixel classifications. This was understandable because the analyst can visually observe the cover types actually on the ground for training data and reference data, whereas using the FIA (Forest Inventory and Analysis) plot data, the analyst must assume that an entire pixel contains the exact percentage of a cover type found in a plot. An increase in accuracy was found after reclassifying each sub-pixel classification from nine classes with 10 percent interval each to five classes with 20 percent interval each. When compared to the supervised classification which has a satisfactory overall accuracy of 90%, none of the sub-pixel classification achieved the same level. However, since traditional per-pixel classifiers assign only one label to pixels throughout the landscape while sub-pixel classifications assign multiple labels to each pixel, the traditional 85% accuracy of acceptance for pixel-based classifications should not apply to sub-pixel classifications. More research is needed in order to define the level of accuracy that is deemed acceptable for sub-pixel classifications.

  10. Use of Machine Learning to Identify Children with Autism and Their Motor Abnormalities

    ERIC Educational Resources Information Center

    Crippa, Alessandro; Salvatore, Christian; Perego, Paolo; Forti, Sara; Nobile, Maria; Molteni, Massimo; Castiglioni, Isabella

    2015-01-01

    In the present work, we have undertaken a proof-of-concept study to determine whether a simple upper-limb movement could be useful to accurately classify low-functioning children with autism spectrum disorder (ASD) aged 2-4. To answer this question, we developed a supervised machine-learning method to correctly discriminate 15 preschool children…

  11. The mapping of marsh vegetation using aircraft multispectral scanner data. [in Louisiana

    NASA Technical Reports Server (NTRS)

    Butera, M. K.

    1975-01-01

    A test was conducted to determine if salinity regimes in coastal marshland could be mapped and monitored by the identification and classification of marsh vegetative species from aircraft multispectral scanner data. The data was acquired at 6.1 km (20,000 ft.) on October 2, 1974, over a test area in the coastal marshland of southern Louisiana including fresh, intermediate, brackish, and saline zones. The data was classified by vegetational species using a supervised, spectral pattern recognition procedure. Accuracies of training sites ranged from 67% to 96%. Marsh zones based on free soil water salinity were determined from the species classification to demonstrate a practical use for mapping marsh vegetation.

  12. Multi-label classification of chronically ill patients with bag of words and supervised dimensionality reduction algorithms.

    PubMed

    Bromuri, Stefano; Zufferey, Damien; Hennebert, Jean; Schumacher, Michael

    2014-10-01

    This research is motivated by the issue of classifying illnesses of chronically ill patients for decision support in clinical settings. Our main objective is to propose multi-label classification of multivariate time series contained in medical records of chronically ill patients, by means of quantization methods, such as bag of words (BoW), and multi-label classification algorithms. Our second objective is to compare supervised dimensionality reduction techniques to state-of-the-art multi-label classification algorithms. The hypothesis is that kernel methods and locality preserving projections make such algorithms good candidates to study multi-label medical time series. We combine BoW and supervised dimensionality reduction algorithms to perform multi-label classification on health records of chronically ill patients. The considered algorithms are compared with state-of-the-art multi-label classifiers in two real world datasets. Portavita dataset contains 525 diabetes type 2 (DT2) patients, with co-morbidities of DT2 such as hypertension, dyslipidemia, and microvascular or macrovascular issues. MIMIC II dataset contains 2635 patients affected by thyroid disease, diabetes mellitus, lipoid metabolism disease, fluid electrolyte disease, hypertensive disease, thrombosis, hypotension, chronic obstructive pulmonary disease (COPD), liver disease and kidney disease. The algorithms are evaluated using multi-label evaluation metrics such as hamming loss, one error, coverage, ranking loss, and average precision. Non-linear dimensionality reduction approaches behave well on medical time series quantized using the BoW algorithm, with results comparable to state-of-the-art multi-label classification algorithms. Chaining the projected features has a positive impact on the performance of the algorithm with respect to pure binary relevance approaches. The evaluation highlights the feasibility of representing medical health records using the BoW for multi-label classification tasks. The study also highlights that dimensionality reduction algorithms based on kernel methods, locality preserving projections or both are good candidates to deal with multi-label classification tasks in medical time series with many missing values and high label density. Copyright © 2014 Elsevier Inc. All rights reserved.

  13. Multi-level deep supervised networks for retinal vessel segmentation.

    PubMed

    Mo, Juan; Zhang, Lei

    2017-12-01

    Changes in the appearance of retinal blood vessels are an important indicator for various ophthalmologic and cardiovascular diseases, including diabetes, hypertension, arteriosclerosis, and choroidal neovascularization. Vessel segmentation from retinal images is very challenging because of low blood vessel contrast, intricate vessel topology, and the presence of pathologies such as microaneurysms and hemorrhages. To overcome these challenges, we propose a neural network-based method for vessel segmentation. A deep supervised fully convolutional network is developed by leveraging multi-level hierarchical features of the deep networks. To improve the discriminative capability of features in lower layers of the deep network and guide the gradient back propagation to overcome gradient vanishing, deep supervision with auxiliary classifiers is incorporated in some intermediate layers of the network. Moreover, the transferred knowledge learned from other domains is used to alleviate the issue of insufficient medical training data. The proposed approach does not rely on hand-crafted features and needs no problem-specific preprocessing or postprocessing, which reduces the impact of subjective factors. We evaluate the proposed method on three publicly available databases, the DRIVE, STARE, and CHASE_DB1 databases. Extensive experiments demonstrate that our approach achieves better or comparable performance to state-of-the-art methods with a much faster processing speed, making it suitable for real-world clinical applications. The results of cross-training experiments demonstrate its robustness with respect to the training set. The proposed approach segments retinal vessels accurately with a much faster processing speed and can be easily applied to other biomedical segmentation tasks.

  14. Smart Annotation of Cyclic Data Using Hierarchical Hidden Markov Models.

    PubMed

    Martindale, Christine F; Hoenig, Florian; Strohrmann, Christina; Eskofier, Bjoern M

    2017-10-13

    Cyclic signals are an intrinsic part of daily life, such as human motion and heart activity. The detailed analysis of them is important for clinical applications such as pathological gait analysis and for sports applications such as performance analysis. Labeled training data for algorithms that analyze these cyclic data come at a high annotation cost due to only limited annotations available under laboratory conditions or requiring manual segmentation of the data under less restricted conditions. This paper presents a smart annotation method that reduces this cost of labeling for sensor-based data, which is applicable to data collected outside of strict laboratory conditions. The method uses semi-supervised learning of sections of cyclic data with a known cycle number. A hierarchical hidden Markov model (hHMM) is used, achieving a mean absolute error of 0.041 ± 0.020 s relative to a manually-annotated reference. The resulting model was also used to simultaneously segment and classify continuous, 'in the wild' data, demonstrating the applicability of using hHMM, trained on limited data sections, to label a complete dataset. This technique achieved comparable results to its fully-supervised equivalent. Our semi-supervised method has the significant advantage of reduced annotation cost. Furthermore, it reduces the opportunity for human error in the labeling process normally required for training of segmentation algorithms. It also lowers the annotation cost of training a model capable of continuous monitoring of cycle characteristics such as those employed to analyze the progress of movement disorders or analysis of running technique.

  15. A novel end-to-end classifier using domain transferred deep convolutional neural networks for biomedical images.

    PubMed

    Pang, Shuchao; Yu, Zhezhou; Orgun, Mehmet A

    2017-03-01

    Highly accurate classification of biomedical images is an essential task in the clinical diagnosis of numerous medical diseases identified from those images. Traditional image classification methods combined with hand-crafted image feature descriptors and various classifiers are not able to effectively improve the accuracy rate and meet the high requirements of classification of biomedical images. The same also holds true for artificial neural network models directly trained with limited biomedical images used as training data or directly used as a black box to extract the deep features based on another distant dataset. In this study, we propose a highly reliable and accurate end-to-end classifier for all kinds of biomedical images via deep learning and transfer learning. We first apply domain transferred deep convolutional neural network for building a deep model; and then develop an overall deep learning architecture based on the raw pixels of original biomedical images using supervised training. In our model, we do not need the manual design of the feature space, seek an effective feature vector classifier or segment specific detection object and image patches, which are the main technological difficulties in the adoption of traditional image classification methods. Moreover, we do not need to be concerned with whether there are large training sets of annotated biomedical images, affordable parallel computing resources featuring GPUs or long times to wait for training a perfect deep model, which are the main problems to train deep neural networks for biomedical image classification as observed in recent works. With the utilization of a simple data augmentation method and fast convergence speed, our algorithm can achieve the best accuracy rate and outstanding classification ability for biomedical images. We have evaluated our classifier on several well-known public biomedical datasets and compared it with several state-of-the-art approaches. We propose a robust automated end-to-end classifier for biomedical images based on a domain transferred deep convolutional neural network model that shows a highly reliable and accurate performance which has been confirmed on several public biomedical image datasets. Copyright © 2017 Elsevier Ireland Ltd. All rights reserved.

  16. Identification of species based on DNA barcode using k-mer feature vector and Random forest classifier.

    PubMed

    Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Rao, A R

    2016-11-05

    DNA barcoding is a molecular diagnostic method that allows automated and accurate identification of species based on a short and standardized fragment of DNA. To this end, an attempt has been made in this study to develop a computational approach for identifying the species by comparing its barcode with the barcode sequence of known species present in the reference library. Each barcode sequence was first mapped onto a numeric feature vector based on k-mer frequencies and then Random forest methodology was employed on the transformed dataset for species identification. The proposed approach outperformed similarity-based, tree-based, diagnostic-based approaches and found comparable with existing supervised learning based approaches in terms of species identification success rate, while compared using real and simulated datasets. Based on the proposed approach, an online web interface SPIDBAR has also been developed and made freely available at http://cabgrid.res.in:8080/spidbar/ for species identification by the taxonomists. Copyright © 2016 Elsevier B.V. All rights reserved.

  17. A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network.

    PubMed

    Fiannaca, Antonino; La Rosa, Massimo; Rizzo, Riccardo; Urso, Alfonso

    2015-07-01

    In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification is performed task with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach the accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches the accuracy of 20.9%. Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. Predicting novel microRNA: a comprehensive comparison of machine learning approaches.

    PubMed

    Stegmayer, Georgina; Di Persia, Leandro E; Rubiolo, Mariano; Gerard, Matias; Pividori, Milton; Yones, Cristian; Bugnon, Leandro A; Rodriguez, Tadeo; Raad, Jonathan; Milone, Diego H

    2018-05-23

    The importance of microRNAs (miRNAs) is widely recognized in the community nowadays because these short segments of RNA can play several roles in almost all biological processes. The computational prediction of novel miRNAs involves training a classifier for identifying sequences having the highest chance of being precursors of miRNAs (pre-miRNAs). The big issue with this task is that well-known pre-miRNAs are usually few in comparison with the hundreds of thousands of candidate sequences in a genome, which results in high class imbalance. This imbalance has a strong influence on most standard classifiers, and if not properly addressed in the model and the experiments, not only performance reported can be completely unrealistic but also the classifier will not be able to work properly for pre-miRNA prediction. Besides, another important issue is that for most of the machine learning (ML) approaches already used (supervised methods), it is necessary to have both positive and negative examples. The selection of positive examples is straightforward (well-known pre-miRNAs). However, it is difficult to build a representative set of negative examples because they should be sequences with hairpin structure that do not contain a pre-miRNA. This review provides a comprehensive study and comparative assessment of methods from these two ML approaches for dealing with the prediction of novel pre-miRNAs: supervised and unsupervised training. We present and analyze the ML proposals that have appeared during the past 10 years in literature. They have been compared in several prediction tasks involving two model genomes and increasing imbalance levels. This work provides a review of existing ML approaches for pre-miRNA prediction and fair comparisons of the classifiers with same features and data sets, instead of just a revision of published software tools. The results and the discussion can help the community to select the most adequate bioinformatics approach according to the prediction task at hand. The comparative results obtained suggest that from low to mid-imbalance levels between classes, supervised methods can be the best. However, at very high imbalance levels, closer to real case scenarios, models including unsupervised and deep learning can provide better performance.

  19. Objective coding of content and techniques in workplace-based supervision of an EBT in public mental health.

    PubMed

    Dorsey, Shannon; Kerns, Suzanne E U; Lucid, Leah; Pullmann, Michael D; Harrison, Julie P; Berliner, Lucy; Thompson, Kelly; Deblinger, Esther

    2018-01-24

    Workplace-based clinical supervision as an implementation strategy to support evidence-based treatment (EBT) in public mental health has received limited research attention. A commonly provided infrastructure support, it may offer a relatively cost-neutral implementation strategy for organizations. However, research has not objectively examined workplace-based supervision of EBT and specifically how it might differ from EBT supervision provided in efficacy and effectiveness trials. Data come from a descriptive study of supervision in the context of a state-funded EBT implementation effort. Verbal interactions from audio recordings of 438 supervision sessions between 28 supervisors and 70 clinicians from 17 public mental health organizations (in 23 offices) were objectively coded for presence and intensity coverage of 29 supervision strategies (16 content and 13 technique items), duration, and temporal focus. Random effects mixed models estimated proportion of variance in content and techniques attributable to the supervisor and clinician levels. Interrater reliability among coders was excellent. EBT cases averaged 12.4 min of supervision per session. Intensity of coverage for EBT content varied, with some discussed frequently at medium or high intensity (exposure) and others infrequently discussed or discussed only at low intensity (behavior management; assigning/reviewing client homework). Other than fidelity assessment, supervision techniques common in treatment trials (e.g., reviewing actual practice, behavioral rehearsal) were used rarely or primarily at low intensity. In general, EBT content clustered more at the clinician level; different techniques clustered at either the clinician or supervisor level. Workplace-based clinical supervision may be a feasible implementation strategy for supporting EBT implementation, yet it differs from supervision in treatment trials. Time allotted per case is limited, compressing time for EBT coverage. Techniques that involve observation of clinician skills are rarely used. Workplace-based supervision content appears to be tailored to individual clinicians and driven to some degree by the individual supervisor. Our findings point to areas for intervention to enhance the potential of workplace-based supervision for implementation effectiveness. NCT01800266 , Clinical Trials, Retrospectively Registered (for this descriptive study; registration prior to any intervention [part of phase II RCT, this manuscript is only phase I descriptive results]).

  20. Retinal vessel segmentation using the 2-D Gabor wavelet and supervised classification.

    PubMed

    Soares, João V B; Leandro, Jorge J G; Cesar Júnior, Roberto M; Jelinek, Herbert F; Cree, Michael J

    2006-09-01

    We present a method for automated segmentation of the vasculature in retinal images. The method produces segmentations by classifying each image pixel as vessel or nonvessel, based on the pixel's feature vector. Feature vectors are composed of the pixel's intensity and two-dimensional Gabor wavelet transform responses taken at multiple scales. The Gabor wavelet is capable of tuning to specific frequencies, thus allowing noise filtering and vessel enhancement in a single step. We use a Bayesian classifier with class-conditional probability density functions (likelihoods) described as Gaussian mixtures, yielding a fast classification, while being able to model complex decision surfaces. The probability distributions are estimated based on a training set of labeled pixels obtained from manual segmentations. The method's performance is evaluated on publicly available DRIVE (Staal et al., 2004) and STARE (Hoover et al., 2000) databases of manually labeled images. On the DRIVE database, it achieves an area under the receiver operating characteristic curve of 0.9614, being slightly superior than that presented by state-of-the-art approaches. We are making our implementation available as open source MATLAB scripts for researchers interested in implementation details, evaluation, or development of methods.

  1. A Strengths-Based Approach to Supervised Visitation in Child Welfare

    ERIC Educational Resources Information Center

    Smith, Gabriel Tobin; Shapiro, Valerie B.; Sperry, Rachel Wagner; LeBuffe, Paul A.

    2014-01-01

    This article describes a strengths-based approach to supervised visitation within the child welfare system of the United States. Supervised visitation gives parents accused of abuse or neglect the opportunity to spend time with children temporarily removed from their care. Although supervised visitation has the potential to be a tool for promoting…

  2. Training set optimization and classifier performance in a top-down diabetic retinopathy screening system

    NASA Astrophysics Data System (ADS)

    Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.

    2013-03-01

    Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and k-NN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve gives evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.

  3. Comparison of the Predictive Accuracy of DNA Array-Based Multigene Classifiers across cDNA Arrays and Affymetrix GeneChips

    PubMed Central

    Stec, James; Wang, Jing; Coombes, Kevin; Ayers, Mark; Hoersch, Sebastian; Gold, David L.; Ross, Jeffrey S; Hess, Kenneth R.; Tirrell, Stephen; Linette, Gerald; Hortobagyi, Gabriel N.; Symmans, W. Fraser; Pusztai, Lajos

    2005-01-01

    We examined how well differentially expressed genes and multigene outcome classifiers retain their class-discriminating values when tested on data generated by different transcriptional profiling platforms. RNA from 33 stage I-III breast cancers was hybridized to both Affymetrix GeneChip and Millennium Pharmaceuticals cDNA arrays. Only 30% of all corresponding gene expression measurements on the two platforms had Pearson correlation coefficient r ≥ 0.7 when UniGene was used to match probes. There was substantial variation in correlation between different Affymetrix probe sets matched to the same cDNA probe. When cDNA and Affymetrix probes were matched by basic local alignment tool (BLAST) sequence identity, the correlation increased substantially. We identified 182 genes in the Affymetrix and 45 in the cDNA data (including 17 common genes) that accurately separated 91% of cases in supervised hierarchical clustering in each data set. Cross-platform testing of these informative genes resulted in lower clustering accuracy of 45 and 79%, respectively. Several sets of accurate five-gene classifiers were developed on each platform using linear discriminant analysis. The best 100 classifiers showed average misclassification error rate of 2% on the original data that rose to 19.5% when tested on data from the other platform. Random five-gene classifiers showed misclassification error rate of 33%. We conclude that multigene predictors optimized for one platform lose accuracy when applied to data from another platform due to missing genes and sequence differences in probes that result in differing measurements for the same gene. PMID:16049308

  4. Vision-Based Real-Time Traversable Region Detection for Mobile Robot in the Outdoors.

    PubMed

    Deng, Fucheng; Zhu, Xiaorui; He, Chao

    2017-09-13

    Environment perception is essential for autonomous mobile robots in human-robot coexisting outdoor environments. One of the important tasks for such intelligent robots is to autonomously detect the traversable region in an unstructured 3D real world. The main drawback of most existing methods is that of high computational complexity. Hence, this paper proposes a binocular vision-based, real-time solution for detecting traversable region in the outdoors. In the proposed method, an appearance model based on multivariate Gaussian is quickly constructed from a sample region in the left image adaptively determined by the vanishing point and dominant borders. Then, a fast, self-supervised segmentation scheme is proposed to classify the traversable and non-traversable regions. The proposed method is evaluated on public datasets as well as a real mobile robot. Implementation on the mobile robot has shown its ability in the real-time navigation applications.

  5. Applying machine-learning techniques to Twitter data for automatic hazard-event classification.

    NASA Astrophysics Data System (ADS)

    Filgueira, R.; Bee, E. J.; Diaz-Doce, D.; Poole, J., Sr.; Singh, A.

    2017-12-01

    The constant flow of information offered by tweets provides valuable information about all sorts of events at a high temporal and spatial resolution. Over the past year we have been analyzing in real-time geological hazards/phenomenon, such as earthquakes, volcanic eruptions, landslides, floods or the aurora, as part of the GeoSocial project, by geo-locating tweets filtered by keywords in a web-map. However, not all the filtered tweets are related with hazard/phenomenon events. This work explores two classification techniques for automatic hazard-event categorization based on tweets about the "Aurora". First, tweets were filtered using aurora-related keywords, removing stop words and selecting the ones written in English. For classifying the remaining between "aurora-event" or "no-aurora-event" categories, we compared two state-of-art techniques: Support Vector Machine (SVM) and Deep Convolutional Neural Networks (CNN) algorithms. Both approaches belong to the family of supervised learning algorithms, which make predictions based on labelled training dataset. Therefore, we created a training dataset by tagging 1200 tweets between both categories. The general form of SVM is used to separate two classes by a function (kernel). We compared the performance of four different kernels (Linear Regression, Logistic Regression, Multinomial Naïve Bayesian and Stochastic Gradient Descent) provided by Scikit-Learn library using our training dataset to build the SVM classifier. The results shown that the Logistic Regression (LR) gets the best accuracy (87%). So, we selected the SVM-LR classifier to categorise a large collection of tweets using the "dispel4py" framework.Later, we developed a CNN classifier, where the first layer embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors. Results from the convolutional layer are max-pooled into a long feature vector, which is classified using a softmax layer. The CNN's accuracy is lower (83%) than the SVM-LR, since the algorithm needs a bigger training dataset to increase its accuracy. We used TensorFlow framework for applying CNN classifier to the same collection of tweets.In future we will modify both classifiers to work with other geo-hazards, use larger training datasets and apply them in real-time.

  6. Opportunities and Constraints in Characterizing Landscape Distribution of an Invasive Grass from Very High Resolution Multi-Spectral Imagery

    PubMed Central

    Dronova, Iryna; Spotswood, Erica N.; Suding, Katharine N.

    2017-01-01

    Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead (Elymus caput-medusae) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44–100% of test medusahead samples were matched by its classified extents from different methods, while 63–83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some “spillover” effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study’s framework to inform and constrain the candidate vegetation classes in heterogeneous locations. PMID:28611806

  7. Opportunities and Constraints in Characterizing Landscape Distribution of an Invasive Grass from Very High Resolution Multi-Spectral Imagery.

    PubMed

    Dronova, Iryna; Spotswood, Erica N; Suding, Katharine N

    2017-01-01

    Understanding spatial distributions of invasive plant species at early infestation stages is critical for assessing the dynamics and underlying factors of invasions. Recent progress in very high resolution remote sensing is facilitating this task by providing high spatial detail over whole-site extents that are prohibitive to comprehensive ground surveys. This study assessed the opportunities and constraints to characterize landscape distribution of the invasive grass medusahead ( Elymus caput-medusae ) in a ∼36.8 ha grassland in California, United States from 0.15m-resolution visible/near-infrared aerial imagery at the stage of late spring phenological contrast with dominant grasses. We compared several object-based unsupervised, single-run supervised and hierarchical approaches to classify medusahead using spectral, textural, and contextual variables. Fuzzy accuracy assessment indicated that 44-100% of test medusahead samples were matched by its classified extents from different methods, while 63-83% of test samples classified as medusahead had this class as an acceptable candidate. Main sources of error included spectral similarity between medusahead and other green species and mixing of medusahead with other vegetation at variable densities. Adding texture attributes to spectral variables increased the accuracy of most classification methods, corroborating the informative value of local patterns under limited spectral data. The highest accuracy across different metrics was shown by the supervised single-run support vector machine with seven vegetation classes and Bayesian algorithms with three vegetation classes; however, their medusahead allocations showed some "spillover" effects due to misclassifications with other green vegetation. This issue was addressed by more complex hierarchical approaches, though their final accuracy did not exceed the best single-run methods. However, the comparison of classified medusahead extents with field segments of its patches overlapping with survey transects indicated that most methods tended to miss and/or over-estimate the length of the smallest patches and under-estimate the largest ones due to classification errors. Overall, the study outcomes support the potential of cost-effective, very high-resolution sensing for the site-scale detection of infestation hotspots that can be customized to plant phenological schedules. However, more accurate medusahead patch delineation in mixed-cover grasslands would benefit from testing hyperspectral data and using our study's framework to inform and constrain the candidate vegetation classes in heterogeneous locations.

  8. 40 CFR 171.11 - Federal certification of pesticide applicators in States or on Indian Reservations where there is...

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... license or other State, county, or Tribal identification document issued to the uncertified person to whom... in effect. (a) Applicability. This section applies to persons in any State and on any Indian... applicable, any person who uses or supervises the use of any pesticide classified for restricted use must be...

  9. 40 CFR 171.11 - Federal certification of pesticide applicators in States or on Indian Reservations where there is...

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... license or other State, county, or Tribal identification document issued to the uncertified person to whom... in effect. (a) Applicability. This section applies to persons in any State and on any Indian... applicable, any person who uses or supervises the use of any pesticide classified for restricted use must be...

  10. Instance annotation for multi-instance multi-label learning

    Treesearch

    F. Briggs; X.Z. Fern; R. Raich; Q. Lou

    2013-01-01

    Multi-instance multi-label learning (MIML) is a framework for supervised classification where the objects to be classified are bags of instances associated with multiple labels. For example, an image can be represented as a bag of segments and associated with a list of objects it contains. Prior work on MIML has focused on predicting label sets for previously unseen...

  11. Strength-based Supervision: Frameworks, Current Practice, and Future Directions A Wu-wei Method.

    ERIC Educational Resources Information Center

    Edwards, Jeffrey K.; Chen, Mei-Whei

    1999-01-01

    Discusses a method of counseling supervision similar to the wu-wei practice in Zen and Taoism. Suggests that this strength-based method and an understanding of isomorphy in supervisory relationships are the preferred practice for the supervision of family counselors. States that this model of supervision potentiates the person-of-the-counselor.…

  12. Recapitulation of Ayurveda constitution types by machine learning of phenotypic traits.

    PubMed

    Tiwari, Pradeep; Kutum, Rintu; Sethi, Tavpritesh; Shrivastava, Ankita; Girase, Bhushan; Aggarwal, Shilpi; Patil, Rutuja; Agarwal, Dhiraj; Gautam, Pramod; Agrawal, Anurag; Dash, Debasis; Ghosh, Saurabh; Juvekar, Sanjay; Mukerji, Mitali; Prasher, Bhavana

    2017-01-01

    In Ayurveda system of medicine individuals are classified into seven constitution types, "Prakriti", for assessing disease susceptibility and drug responsiveness. Prakriti evaluation involves clinical examination including questions about physiological and behavioural traits. A need was felt to develop models for accurately predicting Prakriti classes that have been shown to exhibit molecular differences. The present study was carried out on data of phenotypic attributes in 147 healthy individuals of three extreme Prakriti types, from a genetically homogeneous population of Western India. Unsupervised and supervised machine learning approaches were used to infer inherent structure of the data, and for feature selection and building classification models for Prakriti respectively. These models were validated in a North Indian population. Unsupervised clustering led to emergence of three natural clusters corresponding to three extreme Prakriti classes. The supervised modelling approaches could classify individuals, with distinct Prakriti types, in the training and validation sets. This study is the first to demonstrate that Prakriti types are distinct verifiable clusters within a multidimensional space of multiple interrelated phenotypic traits. It also provides a computational framework for predicting Prakriti classes from phenotypic attributes. This approach may be useful in precision medicine for stratification of endophenotypes in healthy and diseased populations.

  13. Classification of ductal carcinoma in situ by gene expression profiling.

    PubMed

    Hannemann, Juliane; Velds, Arno; Halfwerk, Johannes B G; Kreike, Bas; Peterse, Johannes L; van de Vijver, Marc J

    2006-01-01

    Ductal carcinoma in situ (DCIS) is characterised by the intraductal proliferation of malignant epithelial cells. Several histological classification systems have been developed, but assessing the histological type/grade of DCIS lesions is still challenging, making treatment decisions based on these features difficult. To obtain insight in the molecular basis of the development of different types of DCIS and its progression to invasive breast cancer, we have studied differences in gene expression between different types of DCIS and between DCIS and invasive breast carcinomas. Gene expression profiling using microarray analysis has been performed on 40 in situ and 40 invasive breast cancer cases. DCIS cases were classified as well- (n = 6), intermediately (n = 18), and poorly (n = 14) differentiated type. Of the 40 invasive breast cancer samples, five samples were grade I, 11 samples were grade II, and 24 samples were grade III. Using two-dimensional hierarchical clustering, the basal-like type, ERB-B2 type, and the luminal-type tumours originally described for invasive breast cancer could also be identified in DCIS. Using supervised classification, we identified a gene expression classifier of 35 genes, which differed between DCIS and invasive breast cancer; a classifier of 43 genes could be identified separating between well- and poorly differentiated DCIS samples.

  14. Classification of ductal carcinoma in situ by gene expression profiling

    PubMed Central

    Hannemann, Juliane; Velds, Arno; Halfwerk, Johannes BG; Kreike, Bas; Peterse, Johannes L; van de Vijver, Marc J

    2006-01-01

    Introduction Ductal carcinoma in situ (DCIS) is characterised by the intraductal proliferation of malignant epithelial cells. Several histological classification systems have been developed, but assessing the histological type/grade of DCIS lesions is still challenging, making treatment decisions based on these features difficult. To obtain insight in the molecular basis of the development of different types of DCIS and its progression to invasive breast cancer, we have studied differences in gene expression between different types of DCIS and between DCIS and invasive breast carcinomas. Methods Gene expression profiling using microarray analysis has been performed on 40 in situ and 40 invasive breast cancer cases. Results DCIS cases were classified as well- (n = 6), intermediately (n = 18), and poorly (n = 14) differentiated type. Of the 40 invasive breast cancer samples, five samples were grade I, 11 samples were grade II, and 24 samples were grade III. Using two-dimensional hierarchical clustering, the basal-like type, ERB-B2 type, and the luminal-type tumours originally described for invasive breast cancer could also be identified in DCIS. Conclusion Using supervised classification, we identified a gene expression classifier of 35 genes, which differed between DCIS and invasive breast cancer; a classifier of 43 genes could be identified separating between well- and poorly differentiated DCIS samples. PMID:17069663

  15. Application of Metamorphic Testing to Supervised Classifiers

    PubMed Central

    Xie, Xiaoyuan; Ho, Joshua; Kaiser, Gail; Xu, Baowen; Chen, Tsong Yueh

    2010-01-01

    Many applications in the field of scientific computing - such as computational biology, computational linguistics, and others - depend on Machine Learning algorithms to provide important core functionality to support solutions in the particular problem domains. However, it is difficult to test such applications because often there is no “test oracle” to indicate what the correct output should be for arbitrary input. To help address the quality of such software, in this paper we present a technique for testing the implementations of supervised machine learning classification algorithms on which such scientific computing software depends. Our technique is based on an approach called “metamorphic testing”, which has been shown to be effective in such cases. More importantly, we demonstrate that our technique not only serves the purpose of verification, but also can be applied in validation. In addition to presenting our technique, we describe a case study we performed on a real-world machine learning application framework, and discuss how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also discuss how our findings can be of use to other areas outside scientific computing, as well. PMID:21243103

  16. Supervised learning methods for pathological arterial pulse wave differentiation: A SVM and neural networks approach.

    PubMed

    Paiva, Joana S; Cardoso, João; Pereira, Tânia

    2018-01-01

    The main goal of this study was to develop an automatic method based on supervised learning methods, able to distinguish healthy from pathologic arterial pulse wave (APW), and those two from noisy waveforms (non-relevant segments of the signal), from the data acquired during a clinical examination with a novel optical system. The APW dataset analysed was composed by signals acquired in a clinical environment from a total of 213 subjects, including healthy volunteers and non-healthy patients. The signals were parameterised by means of 39pulse features: morphologic, time domain statistics, cross-correlation features, wavelet features. Multiclass Support Vector Machine Recursive Feature Elimination (SVM RFE) method was used to select the most relevant features. A comparative study was performed in order to evaluate the performance of the two classifiers: Support Vector Machine (SVM) and Artificial Neural Network (ANN). SVM achieved a statistically significant better performance for this problem with an average accuracy of 0.9917±0.0024 and a F-Measure of 0.9925±0.0019, in comparison with ANN, which reached the values of 0.9847±0.0032 and 0.9852±0.0031 for Accuracy and F-Measure, respectively. A significant difference was observed between the performances obtained with SVM classifier using a different number of features from the original set available. The comparison between SVM and NN allowed reassert the higher performance of SVM. The results obtained in this study showed the potential of the proposed method to differentiate those three important signal outcomes (healthy, pathologic and noise) and to reduce bias associated with clinical diagnosis of cardiovascular disease using APW. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. An efficient abnormal cervical cell detection system based on multi-instance extreme learning machine

    NASA Astrophysics Data System (ADS)

    Zhao, Lili; Yin, Jianping; Yuan, Lihuan; Liu, Qiang; Li, Kuan; Qiu, Minghui

    2017-07-01

    Automatic detection of abnormal cells from cervical smear images is extremely demanded in annual diagnosis of women's cervical cancer. For this medical cell recognition problem, there are three different feature sections, namely cytology morphology, nuclear chromatin pathology and region intensity. The challenges of this problem come from feature combination s and classification accurately and efficiently. Thus, we propose an efficient abnormal cervical cell detection system based on multi-instance extreme learning machine (MI-ELM) to deal with above two questions in one unified framework. MI-ELM is one of the most promising supervised learning classifiers which can deal with several feature sections and realistic classification problems analytically. Experiment results over Herlev dataset demonstrate that the proposed method outperforms three traditional methods for two-class classification in terms of well accuracy and less time.

  18. Natural image classification driven by human brain activity

    NASA Astrophysics Data System (ADS)

    Zhang, Dai; Peng, Hanyang; Wang, Jinqiao; Tang, Ming; Xue, Rong; Zuo, Zhentao

    2016-03-01

    Natural image classification has been a hot topic in computer vision and pattern recognition research field. Since the performance of an image classification system can be improved by feature selection, many image feature selection methods have been developed. However, the existing supervised feature selection methods are typically driven by the class label information that are identical for different samples from the same class, ignoring with-in class image variability and therefore degrading the feature selection performance. In this study, we propose a novel feature selection method, driven by human brain activity signals collected using fMRI technique when human subjects were viewing natural images of different categories. The fMRI signals associated with subjects viewing different images encode the human perception of natural images, and therefore may capture image variability within- and cross- categories. We then select image features with the guidance of fMRI signals from brain regions with active response to image viewing. Particularly, bag of words features based on GIST descriptor are extracted from natural images for classification, and a sparse regression base feature selection method is adapted to select image features that can best predict fMRI signals. Finally, a classification model is built on the select image features to classify images without fMRI signals. The validation experiments for classifying images from 4 categories of two subjects have demonstrated that our method could achieve much better classification performance than the classifiers built on image feature selected by traditional feature selection methods.

  19. Radiomic modeling of BI-RADS density categories

    NASA Astrophysics Data System (ADS)

    Wei, Jun; Chan, Heang-Ping; Helvie, Mark A.; Roubidoux, Marilyn A.; Zhou, Chuan; Hadjiiski, Lubomir

    2017-03-01

    Screening mammography is the most effective and low-cost method to date for early cancer detection. Mammographic breast density has been shown to be highly correlated with breast cancer risk. We are developing a radiomic model for BI-RADS density categorization on digital mammography (FFDM) with a supervised machine learning approach. With IRB approval, we retrospectively collected 478 FFDMs from 478 women. As a gold standard, breast density was assessed by an MQSA radiologist based on BI-RADS categories. The raw FFDMs were used for computerized density assessment. The raw FFDM first underwent log-transform to approximate the x-ray sensitometric response, followed by multiscale processing to enhance the fibroglandular densities and parenchymal patterns. Three ROIs were automatically identified based on the keypoint distribution, where the keypoints were obtained as the extrema in the image Gaussian scale-space. A total of 73 features, including intensity and texture features that describe the density and the parenchymal pattern, were extracted from each breast. Our BI-RADS density estimator was constructed by using a random forest classifier. We used a 10-fold cross validation resampling approach to estimate the errors. With the random forest classifier, computerized density categories for 412 of the 478 cases agree with radiologist's assessment (weighted kappa = 0.93). The machine learning method with radiomic features as predictors demonstrated a high accuracy in classifying FFDMs into BI-RADS density categories. Further work is underway to improve our system performance as well as to perform an independent testing using a large unseen FFDM set.

  20. GENIE: a hybrid genetic algorithm for feature classification in multispectral images

    NASA Astrophysics Data System (ADS)

    Perkins, Simon J.; Theiler, James P.; Brumby, Steven P.; Harvey, Neal R.; Porter, Reid B.; Szymanski, John J.; Bloch, Jeffrey J.

    2000-10-01

    We consider the problem of pixel-by-pixel classification of a multi- spectral image using supervised learning. Conventional spuervised classification techniques such as maximum likelihood classification and less conventional ones s uch as neural networks, typically base such classifications solely on the spectral components of each pixel. It is easy to see why: the color of a pixel provides a nice, bounded, fixed dimensional space in which these classifiers work well. It is often the case however, that spectral information alone is not sufficient to correctly classify a pixel. Maybe spatial neighborhood information is required as well. Or maybe the raw spectral components do not themselves make for easy classification, but some arithmetic combination of them would. In either of these cases we have the problem of selecting suitable spatial, spectral or spatio-spectral features that allow the classifier to do its job well. The number of all possible such features is extremely large. How can we select a suitable subset? We have developed GENIE, a hybrid learning system that combines a genetic algorithm that searches a space of image processing operations for a set that can produce suitable feature planes, and a more conventional classifier which uses those feature planes to output a final classification. In this paper we show that the use of a hybrid GA provides significant advantages over using either a GA alone or more conventional classification methods alone. We present results using high-resolution IKONOS data, looking for regions of burned forest and for roads.

  1. Supervision of Facilitators in a Multisite Study: Goals, Process, and Outcomes

    PubMed Central

    2010-01-01

    Objective To describe the aims, implementation, and desired outcomes of facilitator supervision for both interventions (treatment and control) in Project Eban and to present the Eban Theoretical Framework for Supervision that guided the facilitators’ supervision. The qualifications and training of supervisors and facilitators are also described. Design This article provides a detailed description of supervision in a multisite behavioral intervention trial. The Eban Theoretical Framework for Supervision is guided by 3 theories: cognitive behavior therapy, the Life-long Model of Supervision, and “Empowering supervisees to empower others: a culturally responsive supervision model.” Methods Supervision is based on the Eban Theoretical Framework for Supervision, which provides guidelines for implementing both interventions using goals, process, and outcomes. Results Because of effective supervision, the interventions were implemented with fidelity to the protocol and were standard across the multiple sites. Conclusions Supervision of facilitators is a crucial aspect of multisite intervention research quality assurance. It provides them with expert advice, optimizes the effectiveness of facilitators, and increases adherence to the protocol across multiple sites. Based on the experience in this trial, some of the challenges that arise when conducting a multisite randomized control trial and how they can be handled by implementing the Eban Theoretical Framework for Supervision are described. PMID:18724192

  2. Integrated Low-Rank-Based Discriminative Feature Learning for Recognition.

    PubMed

    Zhou, Pan; Lin, Zhouchen; Zhang, Chao

    2016-05-01

    Feature learning plays a central role in pattern recognition. In recent years, many representation-based feature learning methods have been proposed and have achieved great success in many applications. However, these methods perform feature learning and subsequent classification in two separate steps, which may not be optimal for recognition tasks. In this paper, we present a supervised low-rank-based approach for learning discriminative features. By integrating latent low-rank representation (LatLRR) with a ridge regression-based classifier, our approach combines feature learning with classification, so that the regulated classification error is minimized. In this way, the extracted features are more discriminative for the recognition tasks. Our approach benefits from a recent discovery on the closed-form solutions to noiseless LatLRR. When there is noise, a robust Principal Component Analysis (PCA)-based denoising step can be added as preprocessing. When the scale of a problem is large, we utilize a fast randomized algorithm to speed up the computation of robust PCA. Extensive experimental results demonstrate the effectiveness and robustness of our method.

  3. A robust multi-kernel change detection framework for detecting leaf beetle defoliation using Landsat 7 ETM+ data

    NASA Astrophysics Data System (ADS)

    Anees, Asim; Aryal, Jagannath; O'Reilly, Małgorzata M.; Gale, Timothy J.; Wardlaw, Tim

    2016-12-01

    A robust non-parametric framework, based on multiple Radial Basic Function (RBF) kernels, is proposed in this study, for detecting land/forest cover changes using Landsat 7 ETM+ images. One of the widely used frameworks is to find change vectors (difference image) and use a supervised classifier to differentiate between change and no-change. The Bayesian Classifiers e.g. Maximum Likelihood Classifier (MLC), Naive Bayes (NB), are widely used probabilistic classifiers which assume parametric models, e.g. Gaussian function, for the class conditional distributions. However, their performance can be limited if the data set deviates from the assumed model. The proposed framework exploits the useful properties of Least Squares Probabilistic Classifier (LSPC) formulation i.e. non-parametric and probabilistic nature, to model class posterior probabilities of the difference image using a linear combination of a large number of Gaussian kernels. To this end, a simple technique, based on 10-fold cross-validation is also proposed for tuning model parameters automatically instead of selecting a (possibly) suboptimal combination from pre-specified lists of values. The proposed framework has been tested and compared with Support Vector Machine (SVM) and NB for detection of defoliation, caused by leaf beetles (Paropsisterna spp.) in Eucalyptus nitens and Eucalyptus globulus plantations of two test areas, in Tasmania, Australia, using raw bands and band combination indices of Landsat 7 ETM+. It was observed that due to multi-kernel non-parametric formulation and probabilistic nature, the LSPC outperforms parametric NB with Gaussian assumption in change detection framework, with Overall Accuracy (OA) ranging from 93.6% (κ = 0.87) to 97.4% (κ = 0.94) against 85.3% (κ = 0.69) to 93.4% (κ = 0.85), and is more robust to changing data distributions. Its performance was comparable to SVM, with added advantages of being probabilistic and capable of handling multi-class problems naturally with its original formulation.

  4. Automated Quality Assessment of Structural Magnetic Resonance Brain Images Based on a Supervised Machine Learning Algorithm.

    PubMed

    Pizarro, Ricardo A; Cheng, Xi; Barnett, Alan; Lemaitre, Herve; Verchinski, Beth A; Goldman, Aaron L; Xiao, Ena; Luo, Qian; Berman, Karen F; Callicott, Joseph H; Weinberger, Daniel R; Mattay, Venkata S

    2016-01-01

    High-resolution three-dimensional magnetic resonance imaging (3D-MRI) is being increasingly used to delineate morphological changes underlying neuropsychiatric disorders. Unfortunately, artifacts frequently compromise the utility of 3D-MRI yielding irreproducible results, from both type I and type II errors. It is therefore critical to screen 3D-MRIs for artifacts before use. Currently, quality assessment involves slice-wise visual inspection of 3D-MRI volumes, a procedure that is both subjective and time consuming. Automating the quality rating of 3D-MRI could improve the efficiency and reproducibility of the procedure. The present study is one of the first efforts to apply a support vector machine (SVM) algorithm in the quality assessment of structural brain images, using global and region of interest (ROI) automated image quality features developed in-house. SVM is a supervised machine-learning algorithm that can predict the category of test datasets based on the knowledge acquired from a learning dataset. The performance (accuracy) of the automated SVM approach was assessed, by comparing the SVM-predicted quality labels to investigator-determined quality labels. The accuracy for classifying 1457 3D-MRI volumes from our database using the SVM approach is around 80%. These results are promising and illustrate the possibility of using SVM as an automated quality assessment tool for 3D-MRI.

  5. Rapid classification of pharmaceutical ingredients with Raman spectroscopy using compressive detection strategy with PLS-DA multivariate filters.

    PubMed

    Cebeci Maltaş, Derya; Kwok, Kaho; Wang, Ping; Taylor, Lynne S; Ben-Amotz, Dor

    2013-06-01

    Identifying pharmaceutical ingredients is a routine procedure required during industrial manufacturing. Here we show that a recently developed Raman compressive detection strategy can be employed to classify various widely used pharmaceutical materials using a hybrid supervised/unsupervised strategy in which only two ingredients are used for training and yet six other ingredients can also be distinguished. More specifically, our liquid crystal spatial light modulator (LC-SLM) based compressive detection instrument is trained using only the active ingredient, tadalafil, and the excipient, lactose, but is tested using these and various other excipients; microcrystalline cellulose, magnesium stearate, titanium (IV) oxide, talc, sodium lauryl sulfate and hydroxypropyl cellulose. Partial least squares discriminant analysis (PLS-DA) is used to generate the compressive detection filters necessary for fast chemical classification. Although the filters used in this study are trained on only lactose and tadalafil, we show that all the pharmaceutical ingredients mentioned above can be differentiated and classified using PLS-DA compressive detection filters with an accumulation time of 10ms per filter. Copyright © 2013 Elsevier B.V. All rights reserved.

  6. Machine learning vortices at the Kosterlitz-Thouless transition

    NASA Astrophysics Data System (ADS)

    Beach, Matthew J. S.; Golubeva, Anna; Melko, Roger G.

    2018-01-01

    Efficient and automated classification of phases from minimally processed data is one goal of machine learning in condensed-matter and statistical physics. Supervised algorithms trained on raw samples of microstates can successfully detect conventional phase transitions via learning a bulk feature such as an order parameter. In this paper, we investigate whether neural networks can learn to classify phases based on topological defects. We address this question on the two-dimensional classical XY model which exhibits a Kosterlitz-Thouless transition. We find significant feature engineering of the raw spin states is required to convincingly claim that features of the vortex configurations are responsible for learning the transition temperature. We further show a single-layer network does not correctly classify the phases of the XY model, while a convolutional network easily performs classification by learning the global magnetization. Finally, we design a deep network capable of learning vortices without feature engineering. We demonstrate the detection of vortices does not necessarily result in the best classification accuracy, especially for lattices of less than approximately 1000 spins. For larger systems, it remains a difficult task to learn vortices.

  7. Heterogeneous data fusion and intelligent techniques embedded in a mobile application for real-time chronic disease management.

    PubMed

    Bellos, Christos; Papadopoulos, Athanassios; Rosso, Roberto; Fotiadis, Dimitrios I

    2011-01-01

    CHRONIOUS system is an integrated platform aiming at the management of chronic disease patients. One of the most important components of the system is a Decision Support System (DSS) that has been developed in a Smart Device (SD). This component decides on patient's current health status by combining several data, which are acquired either by wearable sensors or manually inputted by the patient or retrieved from the specific database. In case no abnormal situation has been tracked, the DSS takes no action and remains deactivated until next abnormal situation pack of data are being acquired or next scheduled data being transmitted. The DSS that has been implemented is an integrated classification system with two parallel classifiers, combining an expert system (rule-based system) and a supervised classifier, such as Support Vector Machines (SVM), Random Forests, artificial Neural Networks (aNN like the Multi-Layer Perceptron), Decision Trees and Naïve Bayes. The above categorized system is useful for providing critical information about the health status of the patient.

  8. Reflective dialogue in clinical supervision: A pilot study involving collaborative review of supervision videos.

    PubMed

    Hill, Hamish R M; Crowe, Trevor P; Gonsalvez, Craig J

    2016-01-01

    To pilot an intervention involving reflective dialogue based on video recordings of clinical supervision. Fourteen participants (seven psychotherapists and their supervisors) completed a reflective practice protocol after viewing a video of their most recent supervision session, then shared their reflections in a second session. Thematic analysis of individual reflections and feedback resulted in the following dominant themes: (1) Increased discussion of supervisee anxiety and the tensions between autonomy and dependence; (2) intentions to alter supervisory roles and practice; (3) identification of and reflection on parallel process (defined as the dynamic transmission of relationship patterns between therapy and supervision); and (4) a range of perceived impacts including improvements in supervisory alliance. The results suggest that reflective dialogue based on supervision videos can play a useful role in psychotherapy supervision, including with relatively inexperienced supervisees. Suggestions are provided for the encouragement of ongoing reflective dialogue in routine supervision practice.

  9. Creating Diverse Ensemble Classifiers to Reduce Supervision

    DTIC Science & Technology

    2005-12-01

    artificial examples. Quite often training with noise improves network generalization (Bishop, 1995; Raviv & Intrator, 1996). Adding noise to training...full training set, as seen by comparing to the to- tal dataset sizes. Hence, improving on the data utilization of DECORATE is a fairly difficult task...prohibitively expensive, except (perhaps) with an incremen- tal learner such as Naive Bayes. Our AFA framework is significantly more efficient because

  10. The Team-Based Internal Supervision System Development for the Primary Schools under the Office of the Basic Education Commission

    ERIC Educational Resources Information Center

    Tubsuli, Nattapong; Julsuwan, Suwat; Tesaputa, Kowat

    2017-01-01

    Internal supervision in the school is currently experiencing various problems. Supervision preparation problems are related to: lacking of supervision plan, lacking of holistic and systematic planning, and lacking of analysis in current conditions or requirements. While supervision operational problems are included: lacking of supervision…

  11. On the evaluation of the fidelity of supervised classifiers in the prediction of chimeric RNAs.

    PubMed

    Beaumeunier, Sacha; Audoux, Jérôme; Boureux, Anthony; Ruffle, Florence; Commes, Thérèse; Philippe, Nicolas; Alves, Ronnie

    2016-01-01

    High-throughput sequencing technology and bioinformatics have identified chimeric RNAs (chRNAs), raising the possibility of chRNAs expressing particularly in diseases can be used as potential biomarkers in both diagnosis and prognosis. The task of discriminating true chRNAs from the false ones poses an interesting Machine Learning (ML) challenge. First of all, the sequencing data may contain false reads due to technical artifacts and during the analysis process, bioinformatics tools may generate false positives due to methodological biases. Moreover, if we succeed to have a proper set of observations (enough sequencing data) about true chRNAs, chances are that the devised model can not be able to generalize beyond it. Like any other machine learning problem, the first big issue is finding the good data to build models. As far as we were concerned, there is no common benchmark data available for chRNAs detection. The definition of a classification baseline is lacking in the related literature too. In this work we are moving towards benchmark data and an evaluation of the fidelity of supervised classifiers in the prediction of chRNAs. We proposed a modelization strategy that can be used to increase the tools performances in context of chRNA classification based on a simulated data generator, that permit to continuously integrate new complex chimeric events. The pipeline incorporated a genome mutation process and simulated RNA-seq data. The reads within distinct depth were aligned and analysed by CRAC that integrates genomic location and local coverage, allowing biological predictions at the read scale. Additionally, these reads were functionally annotated and aggregated to form chRNAs events, making it possible to evaluate ML methods (classifiers) performance in both levels of reads and events. Ensemble learning strategies demonstrated to be more robust to this classification problem, providing an average AUC performance of 95 % (ACC=94 %, Kappa=0.87 %). The resulting classification models were also tested on real RNA-seq data from a set of twenty-seven patients with acute myeloid leukemia (AML).

  12. Supervised Detection of Anomalous Light Curves in Massive Astronomical Catalogs

    NASA Astrophysics Data System (ADS)

    Nun, Isadora; Pichara, Karim; Protopapas, Pavlos; Kim, Dae-Won

    2014-09-01

    The development of synoptic sky surveys has led to a massive amount of data for which resources needed for analysis are beyond human capabilities. In order to process this information and to extract all possible knowledge, machine learning techniques become necessary. Here we present a new methodology to automatically discover unknown variable objects in large astronomical catalogs. With the aim of taking full advantage of all information we have about known objects, our method is based on a supervised algorithm. In particular, we train a random forest classifier using known variability classes of objects and obtain votes for each of the objects in the training set. We then model this voting distribution with a Bayesian network and obtain the joint voting distribution among the training objects. Consequently, an unknown object is considered as an outlier insofar it has a low joint probability. By leaving out one of the classes on the training set, we perform a validity test and show that when the random forest classifier attempts to classify unknown light curves (the class left out), it votes with an unusual distribution among the classes. This rare voting is detected by the Bayesian network and expressed as a low joint probability. Our method is suitable for exploring massive data sets given that the training process is performed offline. We tested our algorithm on 20 million light curves from the MACHO catalog and generated a list of anomalous candidates. After analysis, we divided the candidates into two main classes of outliers: artifacts and intrinsic outliers. Artifacts were principally due to air mass variation, seasonal variation, bad calibration, or instrumental errors and were consequently removed from our outlier list and added to the training set. After retraining, we selected about 4000 objects, which we passed to a post-analysis stage by performing a cross-match with all publicly available catalogs. Within these candidates we identified certain known but rare objects such as eclipsing Cepheids, blue variables, cataclysmic variables, and X-ray sources. For some outliers there was no additional information. Among them we identified three unknown variability types and a few individual outliers that will be followed up in order to perform a deeper analysis.

  13. Supervised learning with decision margins in pools of spiking neurons.

    PubMed

    Le Mouel, Charlotte; Harris, Kenneth D; Yger, Pierre

    2014-10-01

    Learning to categorise sensory inputs by generalising from a few examples whose category is precisely known is a crucial step for the brain to produce appropriate behavioural responses. At the neuronal level, this may be performed by adaptation of synaptic weights under the influence of a training signal, in order to group spiking patterns impinging on the neuron. Here we describe a framework that allows spiking neurons to perform such "supervised learning", using principles similar to the Support Vector Machine, a well-established and robust classifier. Using a hinge-loss error function, we show that requesting a margin similar to that of the SVM improves performance on linearly non-separable problems. Moreover, we show that using pools of neurons to discriminate categories can also increase the performance by sharing the load among neurons.

  14. Continuous measurement of breast tumor hormone receptor expression: a comparison of two computational pathology platforms

    PubMed Central

    Ahern, Thomas P.; Beck, Andrew H.; Rosner, Bernard A.; Glass, Ben; Frieling, Gretchen; Collins, Laura C.; Tamimi, Rulla M.

    2017-01-01

    Background Computational pathology platforms incorporate digital microscopy with sophisticated image analysis to permit rapid, continuous measurement of protein expression. We compared two computational pathology platforms on their measurement of breast tumor estrogen receptor (ER) and progesterone receptor (PR) expression. Methods Breast tumor microarrays from the Nurses’ Health Study were stained for ER (n=592) and PR (n=187). One expert pathologist scored cases as positive if ≥1% of tumor nuclei exhibited stain. ER and PR were then measured with the Definiens Tissue Studio (automated) and Aperio Digital Pathology (user-supervised) platforms. Platform-specific measurements were compared using boxplots, scatter plots, and correlation statistics. Classification of ER and PR positivity by platform-specific measurements was evaluated with areas under receiver operating characteristic curves (AUC) from univariable logistic regression models, using expert pathologist classification as the standard. Results Both platforms showed considerable overlap in continuous measurements of ER and PR between positive and negative groups classified by expert pathologist. Platform-specific measurements were strongly and positively correlated with one another (rho≥0.77). The user-supervised Aperio workflow performed slightly better than the automated Definiens workflow at classifying ER positivity (AUCAperio=0.97; AUCDefiniens=0.90; difference=0.07, 95% CI: 0.05, 0.09) and PR positivity (AUCAperio=0.94; AUCDefiniens=0.87; difference=0.07, 95% CI: 0.03, 0.12). Conclusion Paired hormone receptor expression measurements from two different computational pathology platforms agreed well with one another. The user-supervised workflow yielded better classification accuracy than the automated workflow. Appropriately validated computational pathology algorithms enrich molecular epidemiology studies with continuous protein expression data and may accelerate tumor biomarker discovery. PMID:27729430

  15. Evaluation of Machine Learning Algorithms for Classification of Primary Biological Aerosol using a new UV-LIF spectrometer

    NASA Astrophysics Data System (ADS)

    Ruske, S. T.; Topping, D. O.; Foot, V. E.; Kaye, P. H.; Stanley, W. R.; Morse, A. P.; Crawford, I.; Gallagher, M. W.

    2016-12-01

    Characterisation of bio-aerosols has important implications within Environment and Public Health sectors. Recent developments in Ultra-Violet Light Induced Fluorescence (UV-LIF) detectors such as the Wideband Integrated bio-aerosol Spectrometer (WIBS) and the newly introduced Multiparameter bio-aerosol Spectrometer (MBS) has allowed for the real time collection of fluorescence, size and morphology measurements for the purpose of discriminating between bacteria, fungal Spores and pollen. This new generation of instruments has enabled ever-larger data sets to be compiled with the aim of studying more complex environments, yet the algorithms used for specie classification remain largely invalidated. It is therefore imperative that we validate the performance of different algorithms that can be used for the task of classification, which is the focus of this study. For unsupervised learning we test Hierarchical Agglomerative Clustering with various different linkages. For supervised learning, ten methods were tested; including decision trees, ensemble methods: Random Forests, Gradient Boosting and AdaBoost; two implementations for support vector machines: libsvm and liblinear; Gaussian methods: Gaussian naïve Bayesian, quadratic and linear discriminant analysis and finally the k-nearest neighbours algorithm. The methods were applied to two different data sets measured using a new Multiparameter bio-aerosol Spectrometer. We find that clustering, in general, performs slightly worse than the supervised learning methods correctly classifying, at best, only 72.7 and 91.1 percent for the two data sets. For supervised learning the gradient boosting algorithm was found to be the most effective, on average correctly classifying 88.1 and 97.8 percent of the testing data respectively across the two data sets. We discuss the wider relevance of these results with regards to challenging existing classification in real-world environments.

  16. The Impact of the School Counselor Supervision Model on the Self-Efficacy of School Counselor Site Supervisors

    ERIC Educational Resources Information Center

    Brown, Carleton H.; Olivárez, Artura, Jr.; DeKruyf, Loraine

    2018-01-01

    Supervision is a critical element in the professional identity development of school counselors; however, available school counseling-specific supervision training is lacking. The authors describe a 4-hour supervision workshop based on the School Counselor Supervision Model (SCSM; Luke & Bernard, 2006) attended by 31 school counselors from…

  17. Competency-Based Student-Teacher Supervision

    ERIC Educational Resources Information Center

    Spanjer, R. Allan

    1975-01-01

    This author contends that student-teacher supervision cannot be done effectively in traditional ways. He discusses five myths of supervision and explains a program developed at Portland (Ore.) State University that puts the emphasis where it should be--on the supervising teacher. (Editor)

  18. Invariant-feature-based adaptive automatic target recognition in obscured 3D point clouds

    NASA Astrophysics Data System (ADS)

    Khuon, Timothy; Kershner, Charles; Mattei, Enrico; Alverio, Arnel; Rand, Robert

    2014-06-01

    Target recognition and classification in a 3D point cloud is a non-trivial process due to the nature of the data collected from a sensor system. The signal can be corrupted by noise from the environment, electronic system, A/D converter, etc. Therefore, an adaptive system with a desired tolerance is required to perform classification and recognition optimally. The feature-based pattern recognition algorithm architecture as described below is particularly devised for solving a single-sensor classification non-parametrically. Feature set is extracted from an input point cloud, normalized, and classifier a neural network classifier. For instance, automatic target recognition in an urban area would require different feature sets from one in a dense foliage area. The figure above (see manuscript) illustrates the architecture of the feature based adaptive signature extraction of 3D point cloud including LIDAR, RADAR, and electro-optical data. This network takes a 3D cluster and classifies it into a specific class. The algorithm is a supervised and adaptive classifier with two modes: the training mode and the performing mode. For the training mode, a number of novel patterns are selected from actual or artificial data. A particular 3D cluster is input to the network as shown above for the decision class output. The network consists of three sequential functional modules. The first module is for feature extraction that extracts the input cluster into a set of singular value features or feature vector. Then the feature vector is input into the feature normalization module to normalize and balance it before being fed to the neural net classifier for the classification. The neural net can be trained by actual or artificial novel data until each trained output reaches the declared output within the defined tolerance. In case new novel data is added after the neural net has been learned, the training is then resumed until the neural net has incrementally learned with the new novel data. The associative memory capability of the neural net enables the incremental learning. The back propagation algorithm or support vector machine can be utilized for the classification and recognition.

  19. The fuzzy system classifier using an intelligent mattress

    NASA Astrophysics Data System (ADS)

    Hnatiuc, Mihaela; Caruntu, George; Sarbu, Vasile

    2009-01-01

    Quality sleep enables your body and mind to be efficient during the day. The good rest of subject on bed depends by many factors; one of them can be the physical discomfort. In this paper one presents the medical system to prevent the discomfort and to identify the subject behavior during the sleep. The epochs of the sleep are classified using the fuzzy system with many inputs. The aim of this analysis is to realize an expert system to diagnose the sleep. A good diagnostic is obtained if the patient is supervised along all day. The system has a microsystem to control, command the pressure sensors and the relay for tuning the airbags pressure.

  20. Learning time series for intelligent monitoring

    NASA Technical Reports Server (NTRS)

    Manganaris, Stefanos; Fisher, Doug

    1994-01-01

    We address the problem of classifying time series according to their morphological features in the time domain. In a supervised machine-learning framework, we induce a classification procedure from a set of preclassified examples. For each class, we infer a model that captures its morphological features using Bayesian model induction and the minimum message length approach to assign priors. In the performance task, we classify a time series in one of the learned classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. We report results from experiments in a monitoring domain of interest to NASA.

  1. Evaluation of supervised machine-learning algorithms to distinguish between inflammatory bowel disease and alimentary lymphoma in cats.

    PubMed

    Awaysheh, Abdullah; Wilcke, Jeffrey; Elvinger, François; Rees, Loren; Fan, Weiguo; Zimmerman, Kurt L

    2016-11-01

    Inflammatory bowel disease (IBD) and alimentary lymphoma (ALA) are common gastrointestinal diseases in cats. The very similar clinical signs and histopathologic features of these diseases make the distinction between them diagnostically challenging. We tested the use of supervised machine-learning algorithms to differentiate between the 2 diseases using data generated from noninvasive diagnostic tests. Three prediction models were developed using 3 machine-learning algorithms: naive Bayes, decision trees, and artificial neural networks. The models were trained and tested on data from complete blood count (CBC) and serum chemistry (SC) results for the following 3 groups of client-owned cats: normal, inflammatory bowel disease (IBD), or alimentary lymphoma (ALA). Naive Bayes and artificial neural networks achieved higher classification accuracy (sensitivities of 70.8% and 69.2%, respectively) than the decision tree algorithm (63%, p < 0.0001). The areas under the receiver-operating characteristic curve for classifying cases into the 3 categories was 83% by naive Bayes, 79% by decision tree, and 82% by artificial neural networks. Prediction models using machine learning provided a method for distinguishing between ALA-IBD, ALA-normal, and IBD-normal. The naive Bayes and artificial neural networks classifiers used 10 and 4 of the CBC and SC variables, respectively, to outperform the C4.5 decision tree, which used 5 CBC and SC variables in classifying cats into the 3 classes. These models can provide another noninvasive diagnostic tool to assist clinicians with differentiating between IBD and ALA, and between diseased and nondiseased cats. © 2016 The Author(s).

  2. GOexpress: an R/Bioconductor package for the identification and visualisation of robust gene ontology signatures through supervised learning of gene expression data.

    PubMed

    Rue-Albrecht, Kévin; McGettigan, Paul A; Hernández, Belinda; Nalpas, Nicolas C; Magee, David A; Parnell, Andrew C; Gordon, Stephen V; MacHugh, David E

    2016-03-11

    Identification of gene expression profiles that differentiate experimental groups is critical for discovery and analysis of key molecular pathways and also for selection of robust diagnostic or prognostic biomarkers. While integration of differential expression statistics has been used to refine gene set enrichment analyses, such approaches are typically limited to single gene lists resulting from simple two-group comparisons or time-series analyses. In contrast, functional class scoring and machine learning approaches provide powerful alternative methods to leverage molecular measurements for pathway analyses, and to compare continuous and multi-level categorical factors. We introduce GOexpress, a software package for scoring and summarising the capacity of gene ontology features to simultaneously classify samples from multiple experimental groups. GOexpress integrates normalised gene expression data (e.g., from microarray and RNA-seq experiments) and phenotypic information of individual samples with gene ontology annotations to derive a ranking of genes and gene ontology terms using a supervised learning approach. The default random forest algorithm allows interactions between all experimental factors, and competitive scoring of expressed genes to evaluate their relative importance in classifying predefined groups of samples. GOexpress enables rapid identification and visualisation of ontology-related gene panels that robustly classify groups of samples and supports both categorical (e.g., infection status, treatment) and continuous (e.g., time-series, drug concentrations) experimental factors. The use of standard Bioconductor extension packages and publicly available gene ontology annotations facilitates straightforward integration of GOexpress within existing computational biology pipelines.

  3. Ranking and combining multiple predictors without labeled data

    PubMed Central

    Parisi, Fabio; Strino, Francesco; Nadler, Boaz; Kluger, Yuval

    2014-01-01

    In a broad range of classification and decision-making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier’s accuracy can be assessed using available labeled data, and raises two questions: Given only the predictions of several classifiers over a large set of unlabeled test data, is it possible to (i) reliably rank them and (ii) construct a metaclassifier more accurate than most classifiers in the ensemble? Here we present a spectral approach to address these questions. First, assuming conditional independence between classifiers, we show that the off-diagonal entries of their covariance matrix correspond to a rank-one matrix. Moreover, the classifiers can be ranked using the leading eigenvector of this covariance matrix, because its entries are proportional to their balanced accuracies. Second, via a linear approximation to the maximum likelihood estimator, we derive the Spectral Meta-Learner (SML), an unsupervised ensemble classifier whose weights are equal to these eigenvector entries. On both simulated and real data, SML typically achieves a higher accuracy than most classifiers in the ensemble and can provide a better starting point than majority voting for estimating the maximum likelihood solution. Furthermore, SML is robust to the presence of small malicious groups of classifiers designed to veer the ensemble prediction away from the (unknown) ground truth. PMID:24474744

  4. Local morphologic scale: application to segmenting tumor infiltrating lymphocytes in ovarian cancer TMAs

    NASA Astrophysics Data System (ADS)

    Janowczyk, Andrew; Chandran, Sharat; Feldman, Michael; Madabhushi, Anant

    2011-03-01

    In this paper we present the concept and associated methodological framework for a novel locally adaptive scale notion called local morphological scale (LMS). Broadly speaking, the LMS at every spatial location is defined as the set of spatial locations, with associated morphological descriptors, which characterize the local structure or heterogeneity for the location under consideration. More specifically, the LMS is obtained as the union of all pixels in the polygon obtained by linking the final location of trajectories of particles emanating from the location under consideration, where the path traveled by originating particles is a function of the local gradients and heterogeneity that they encounter along the way. As these particles proceed on their trajectory away from the location under consideration, the velocity of each particle (i.e. do the particles stop, slow down, or simply continue around the object) is modeled using a physics based system. At some time point the particle velocity goes to zero (potentially on account of encountering (a) repeated obstructions, (b) an insurmountable image gradient, or (c) timing out) and comes to a halt. By using a Monte-Carlo sampling technique, LMS is efficiently determined through parallelized computations. LMS is different from previous local scale related formulations in that it is (a) not a locally connected sets of pixels satisfying some pre-defined intensity homogeneity criterion (generalized-scale), nor is it (b) constrained by any prior shape criterion (ball-scale, tensor-scale). Shape descriptors quantifying the morphology of the particle paths are used to define a tensor LMS signature associated with every spatial image location. These features include the number of object collisions per particle, average velocity of a particle, and the length of the individual particle paths. These features can be used in conjunction with a supervised classifier to correctly differentiate between two different object classes based on local structural properties. In this paper, we apply LMS to the specific problem of classifying regions of interest in Ovarian Cancer (OCa) histology images as either tumor or stroma. This approach is used to classify lymphocytes as either tumor infiltrating lymphocytes (TILs) or non-TILs; the presence of TILs having been identified as an important prognostic indicator for disease outcome in patients with OCa. We present preliminary results on the tumor/stroma classification of 11,000 randomly selected locations of interest, across 11 images obtained from 6 patient studies. Using a Probabilistic Boosting Tree (PBT), our supervised classifier yielded an area under the receiver operation characteristic curve (AUC) of 0.8341 +/-0.0059 over 5 runs of randomized cross validation. The average LMS computation time at every spatial location for an image patch comprising 2000 pixels with 24 particles at every location was only 18s.

  5. Classification of Traffic Related Short Texts to Analyse Road Problems in Urban Areas

    NASA Astrophysics Data System (ADS)

    Saldana-Perez, A. M. M.; Moreno-Ibarra, M.; Tores-Ruiz, M.

    2017-09-01

    The Volunteer Geographic Information (VGI) can be used to understand the urban dynamics. In the classification of traffic related short texts to analyze road problems in urban areas, a VGI data analysis is done over a social media's publications, in order to classify traffic events at big cities that modify the movement of vehicles and people through the roads, such as car accidents, traffic and closures. The classification of traffic events described in short texts is done by applying a supervised machine learning algorithm. In the approach users are considered as sensors which describe their surroundings and provide their geographic position at the social network. The posts are treated by a text mining process and classified into five groups. Finally, the classified events are grouped in a data corpus and geo-visualized in the study area, to detect the places with more vehicular problems.

  6. PixelLearn

    NASA Technical Reports Server (NTRS)

    Mazzoni, Dominic; Wagstaff, Kiri; Bornstein, Benjamin; Tang, Nghia; Roden, Joseph

    2006-01-01

    PixelLearn is an integrated user-interface computer program for classifying pixels in scientific images. Heretofore, training a machine-learning algorithm to classify pixels in images has been tedious and difficult. PixelLearn provides a graphical user interface that makes it faster and more intuitive, leading to more interactive exploration of image data sets. PixelLearn also provides image-enhancement controls to make it easier to see subtle details in images. PixelLearn opens images or sets of images in a variety of common scientific file formats and enables the user to interact with several supervised or unsupervised machine-learning pixel-classifying algorithms while the user continues to browse through the images. The machinelearning algorithms in PixelLearn use advanced clustering and classification methods that enable accuracy much higher than is achievable by most other software previously available for this purpose. PixelLearn is written in portable C++ and runs natively on computers running Linux, Windows, or Mac OS X.

  7. A multispectral automatic target recognition application for maritime surveillance, search, and rescue

    NASA Astrophysics Data System (ADS)

    Schoonmaker, Jon; Reed, Scott; Podobna, Yuliya; Vazquez, Jose; Boucher, Cynthia

    2010-04-01

    Due to increased security concerns, the commitment to monitor and maintain security in the maritime environment is increasingly a priority. A country's coast is the most vulnerable area for the incursion of illegal immigrants, terrorists and contraband. This work illustrates the ability of a low-cost, light-weight, multi-spectral, multi-channel imaging system to handle the environment and see under difficult marine conditions. The system and its implemented detecting and tracking technologies should be organic to the maritime homeland security community for search and rescue, fisheries, defense, and law enforcement. It is tailored for airborne and ship based platforms to detect, track and monitor suspected objects (such as semi-submerged targets like marine mammals, vessels in distress, and drug smugglers). In this system, automated detection and tracking technology is used to detect, classify and localize potential threats or objects of interest within the imagery provided by the multi-spectral system. These algorithms process the sensor data in real time, thereby providing immediate feedback when features of interest have been detected. A supervised detection system based on Haar features and Cascade Classifiers is presented and results are provided on real data. The system is shown to be extendable and reusable for a variety of different applications.

  8. Wavelet-based multicomponent denoising on GPU to improve the classification of hyperspectral images

    NASA Astrophysics Data System (ADS)

    Quesada-Barriuso, Pablo; Heras, Dora B.; Argüello, Francisco; Mouriño, J. C.

    2017-10-01

    Supervised classification allows handling a wide range of remote sensing hyperspectral applications. Enhancing the spatial organization of the pixels over the image has proven to be beneficial for the interpretation of the image content, thus increasing the classification accuracy. Denoising in the spatial domain of the image has been shown as a technique that enhances the structures in the image. This paper proposes a multi-component denoising approach in order to increase the classification accuracy when a classification method is applied. It is computed on multicore CPUs and NVIDIA GPUs. The method combines feature extraction based on a 1Ddiscrete wavelet transform (DWT) applied in the spectral dimension followed by an Extended Morphological Profile (EMP) and a classifier (SVM or ELM). The multi-component noise reduction is applied to the EMP just before the classification. The denoising recursively applies a separable 2D DWT after which the number of wavelet coefficients is reduced by using a threshold. Finally, inverse 2D-DWT filters are applied to reconstruct the noise free original component. The computational cost of the classifiers as well as the cost of the whole classification chain is high but it is reduced achieving real-time behavior for some applications through their computation on NVIDIA multi-GPU platforms.

  9. Object recognition through a multi-mode fiber

    NASA Astrophysics Data System (ADS)

    Takagi, Ryosuke; Horisaki, Ryoichi; Tanida, Jun

    2017-04-01

    We present a method of recognizing an object through a multi-mode fiber. A number of speckle patterns transmitted through a multi-mode fiber are provided to a classifier based on machine learning. We experimentally demonstrated binary classification of face and non-face targets based on the method. The measurement process of the experimental setup was random and nonlinear because a multi-mode fiber is a typical strongly scattering medium and any reference light was not used in our setup. Comparisons between three supervised learning methods, support vector machine, adaptive boosting, and neural network, are also provided. All of those learning methods achieved high accuracy rates at about 90% for the classification. The approach presented here can realize a compact and smart optical sensor. It is practically useful for medical applications, such as endoscopy. Also our study indicated a promising utilization of artificial intelligence, which has rapidly progressed, for reducing optical and computational costs in optical sensing systems.

  10. A new feature constituting approach to detection of vocal fold pathology

    NASA Astrophysics Data System (ADS)

    Hariharan, M.; Polat, Kemal; Yaacob, Sazali

    2014-08-01

    In the last two decades, non-invasive methods through acoustic analysis of voice signal have been proved to be excellent and reliable tool to diagnose vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and MAPACI speech pathology database are used. Four different supervised classifiers such as k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network are employed for testing the proposed features. The experimental results uncover that the proposed features give very promising classification accuracy of 100% for both MEEI database and MAPACI speech pathology database.

  11. Machine learning methods in chemoinformatics

    PubMed Central

    Mitchell, John B O

    2014-01-01

    Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. How to cite this article: WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160

  12. A general system for automatic biomedical image segmentation using intensity neighborhoods.

    PubMed

    Chen, Cheng; Ozolek, John A; Wang, Wei; Rohde, Gustavo K

    2011-01-01

    Image segmentation is important with applications to several problems in biology and medicine. While extensively researched, generally, current segmentation methods perform adequately in the applications for which they were designed, but often require extensive modifications or calibrations before being used in a different application. We describe an approach that, with few modifications, can be used in a variety of image segmentation problems. The approach is based on a supervised learning strategy that utilizes intensity neighborhoods to assign each pixel in a test image its correct class based on training data. We describe methods for modeling rotations and variations in scales as well as a subset selection for training the classifiers. We show that the performance of our approach in tissue segmentation tasks in magnetic resonance and histopathology microscopy images, as well as nuclei segmentation from fluorescence microscopy images, is similar to or better than several algorithms specifically designed for each of these applications.

  13. Entanglement-Based Machine Learning on a Quantum Computer

    NASA Astrophysics Data System (ADS)

    Cai, X.-D.; Wu, D.; Su, Z.-E.; Chen, M.-C.; Wang, X.-L.; Li, Li; Liu, N.-L.; Lu, C.-Y.; Pan, J.-W.

    2015-03-01

    Machine learning, a branch of artificial intelligence, learns from previous experience to optimize performance, which is ubiquitous in various fields such as computer sciences, financial analysis, robotics, and bioinformatics. A challenge is that machine learning with the rapidly growing "big data" could become intractable for classical computers. Recently, quantum machine learning algorithms [Lloyd, Mohseni, and Rebentrost, arXiv.1307.0411] were proposed which could offer an exponential speedup over classical algorithms. Here, we report the first experimental entanglement-based classification of two-, four-, and eight-dimensional vectors to different clusters using a small-scale photonic quantum computer, which are then used to implement supervised and unsupervised machine learning. The results demonstrate the working principle of using quantum computers to manipulate and classify high-dimensional vectors, the core mathematical routine in machine learning. The method can, in principle, be scaled to larger numbers of qubits, and may provide a new route to accelerate machine learning.

  14. Detection of sunn pest-damaged wheat samples using visible/near-infrared spectroscopy based on pattern recognition.

    PubMed

    Basati, Zahra; Jamshidi, Bahareh; Rasekh, Mansour; Abbaspour-Gilandeh, Yousef

    2018-05-30

    The presence of sunn pest-damaged grains in wheat mass reduces the quality of flour and bread produced from it. Therefore, it is essential to assess the quality of the samples in collecting and storage centers of wheat and flour mills. In this research, the capability of visible/near-infrared (Vis/NIR) spectroscopy combined with pattern recognition methods was investigated for discrimination of wheat samples with different percentages of sunn pest-damaged. To this end, various samples belonging to five classes (healthy and 5%, 10%, 15% and 20% unhealthy) were analyzed using Vis/NIR spectroscopy (wavelength range of 350-1000 nm) based on both supervised and unsupervised pattern recognition methods. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) as the unsupervised techniques and soft independent modeling of class analogies (SIMCA) and partial least squares-discriminant analysis (PLS-DA) as supervised methods were used. The results showed that Vis/NIR spectra of healthy samples were correctly clustered using both PCA and HCA. Due to the high overlapping between the four unhealthy classes (5%, 10%, 15% and 20%), it was not possible to discriminate all the unhealthy samples in individual classes. However, when considering only the two main categories of healthy and unhealthy, an acceptable degree of separation between the classes can be obtained after classification with supervised pattern recognition methods of SIMCA and PLS-DA. SIMCA based on PCA modeling correctly classified samples in two classes of healthy and unhealthy with classification accuracy of 100%. Moreover, the power of the wavelengths of 839 nm, 918 nm and 995 nm were more than other wavelengths to discriminate two classes of healthy and unhealthy. It was also concluded that PLS-DA provides excellent classification results of healthy and unhealthy samples (R 2  = 0.973 and RMSECV = 0.057). Therefore, Vis/NIR spectroscopy based on pattern recognition techniques can be useful for rapid distinguishing the healthy wheat samples from those damaged by sunn pest in the maintenance and processing centers. Copyright © 2018 Elsevier B.V. All rights reserved.

  15. Prediction of Chemical Respiratory Sensitizers Using GARD, a Novel In Vitro Assay Based on a Genomic Biomarker Signature

    PubMed Central

    Albrekt, Ann-Sofie; Borrebaeck, Carl A. K.; Lindstedt, Malin

    2015-01-01

    Background Repeated exposure to certain low molecular weight (LMW) chemical compounds may result in development of allergic reactions in the skin or in the respiratory tract. In most cases, a certain LMW compound selectively sensitize the skin, giving rise to allergic contact dermatitis (ACD), or the respiratory tract, giving rise to occupational asthma (OA). To limit occurrence of allergic diseases, efforts are currently being made to develop predictive assays that accurately identify chemicals capable of inducing such reactions. However, while a few promising methods for prediction of skin sensitization have been described, to date no validated method, in vitro or in vivo, exists that is able to accurately classify chemicals as respiratory sensitizers. Results Recently, we presented the in vitro based Genomic Allergen Rapid Detection (GARD) assay as a novel testing strategy for classification of skin sensitizing chemicals based on measurement of a genomic biomarker signature. We have expanded the applicability domain of the GARD assay to classify also respiratory sensitizers by identifying a separate biomarker signature containing 389 differentially regulated genes for respiratory sensitizers in comparison to non-respiratory sensitizers. By using an independent data set in combination with supervised machine learning, we validated the assay, showing that the identified genomic biomarker is able to accurately classify respiratory sensitizers. Conclusions We have identified a genomic biomarker signature for classification of respiratory sensitizers. Combining this newly identified biomarker signature with our previously identified biomarker signature for classification of skin sensitizers, we have developed a novel in vitro testing strategy with a potent ability to predict both skin and respiratory sensitization in the same sample. PMID:25760038

  16. From sensor data to animal behaviour: an oystercatcher example.

    PubMed

    Shamoun-Baranes, Judy; Bom, Roeland; van Loon, E Emiel; Ens, Bruno J; Oosterbeek, Kees; Bouten, Willem

    2012-01-01

    Animal-borne sensors enable researchers to remotely track animals, their physiological state and body movements. Accelerometers, for example, have been used in several studies to measure body movement, posture, and energy expenditure, although predominantly in marine animals. In many studies, behaviour is often inferred from expert interpretation of sensor data and not validated with direct observations of the animal. The aim of this study was to derive models that could be used to classify oystercatcher (Haematopus ostralegus) behaviour based on sensor data. We measured the location, speed, and tri-axial acceleration of three oystercatchers using a flexible GPS tracking system and conducted simultaneous visual observations of the behaviour of these birds in their natural environment. We then used these data to develop three supervised classification trees of behaviour and finally applied one of the models to calculate time-activity budgets. The model based on accelerometer data developed to classify three behaviours (fly, terrestrial locomotion, and no movement) was much more accurate (cross-validation error = 0.14) than the model based on GPS-speed alone (cross-validation error = 0.35). The most parsimonious acceleration model designed to classify eight behaviours could distinguish five: fly, forage, body care, stand, and sit (cross-validation error = 0.28); other behaviours that were observed, such as aggression or handling of prey, could not be distinguished. Model limitations and potential improvements are discussed. The workflow design presented in this study can facilitate model development, be adapted to a wide range of species, and together with the appropriate measurements, can foster the study of behaviour and habitat use of free living animals throughout their annual routine.

  17. Fuzzy classifier based support vector regression framework for Poisson ratio determination

    NASA Astrophysics Data System (ADS)

    Asoodeh, Mojtaba; Bagheripour, Parisa

    2013-09-01

    Poisson ratio is considered as one of the most important rock mechanical properties of hydrocarbon reservoirs. Determination of this parameter through laboratory measurement is time, cost, and labor intensive. Furthermore, laboratory measurements do not provide continuous data along the reservoir intervals. Hence, a fast, accurate, and inexpensive way of determining Poisson ratio which produces continuous data over the whole reservoir interval is desirable. For this purpose, support vector regression (SVR) method based on statistical learning theory (SLT) was employed as a supervised learning algorithm to estimate Poisson ratio from conventional well log data. SVR is capable of accurately extracting the implicit knowledge contained in conventional well logs and converting the gained knowledge into Poisson ratio data. Structural risk minimization (SRM) principle which is embedded in the SVR structure in addition to empirical risk minimization (EMR) principle provides a robust model for finding quantitative formulation between conventional well log data and Poisson ratio. Although satisfying results were obtained from an individual SVR model, it had flaws of overestimation in low Poisson ratios and underestimation in high Poisson ratios. These errors were eliminated through implementation of fuzzy classifier based SVR (FCBSVR). The FCBSVR significantly improved accuracy of the final prediction. This strategy was successfully applied to data from carbonate reservoir rocks of an Iranian Oil Field. Results indicated that SVR predicted Poisson ratio values are in good agreement with measured values.

  18. Visual Perception-Based Statistical Modeling of Complex Grain Image for Product Quality Monitoring and Supervision on Assembly Production Line

    PubMed Central

    Chen, Qing; Xu, Pengfei; Liu, Wenzhong

    2016-01-01

    Computer vision as a fast, low-cost, noncontact, and online monitoring technology has been an important tool to inspect product quality, particularly on a large-scale assembly production line. However, the current industrial vision system is far from satisfactory in the intelligent perception of complex grain images, comprising a large number of local homogeneous fragmentations or patches without distinct foreground and background. We attempt to solve this problem based on the statistical modeling of spatial structures of grain images. We present a physical explanation in advance to indicate that the spatial structures of the complex grain images are subject to a representative Weibull distribution according to the theory of sequential fragmentation, which is well known in the continued comminution of ore grinding. To delineate the spatial structure of the grain image, we present a method of multiscale and omnidirectional Gaussian derivative filtering. Then, a product quality classifier based on sparse multikernel–least squares support vector machine is proposed to solve the low-confidence classification problem of imbalanced data distribution. The proposed method is applied on the assembly line of a food-processing enterprise to classify (or identify) automatically the production quality of rice. The experiments on the real application case, compared with the commonly used methods, illustrate the validity of our method. PMID:26986726

  19. Incrementally learning objects by touch: online discriminative and generative models for tactile-based recognition.

    PubMed

    Soh, Harold; Demiris, Yiannis

    2014-01-01

    Human beings not only possess the remarkable ability to distinguish objects through tactile feedback but are further able to improve upon recognition competence through experience. In this work, we explore tactile-based object recognition with learners capable of incremental learning. Using the sparse online infinite Echo-State Gaussian process (OIESGP), we propose and compare two novel discriminative and generative tactile learners that produce probability distributions over objects during object grasping/palpation. To enable iterative improvement, our online methods incorporate training samples as they become available. We also describe incremental unsupervised learning mechanisms, based on novelty scores and extreme value theory, when teacher labels are not available. We present experimental results for both supervised and unsupervised learning tasks using the iCub humanoid, with tactile sensors on its five-fingered anthropomorphic hand, and 10 different object classes. Our classifiers perform comparably to state-of-the-art methods (C4.5 and SVM classifiers) and findings indicate that tactile signals are highly relevant for making accurate object classifications. We also show that accurate "early" classifications are possible using only 20-30 percent of the grasp sequence. For unsupervised learning, our methods generate high quality clusterings relative to the widely-used sequential k-means and self-organising map (SOM), and we present analyses into the differences between the approaches.

  20. Counseling Supervision within a Feminist Framework: Guidelines for Intervention

    ERIC Educational Resources Information Center

    Degges-White, Suzanne E.; Colon, Bonnie R.; Borzumato-Gainey, Christine

    2013-01-01

    Feminist supervision is based on the principles of feminist theory. Goals include sharing responsibility for the supervision process, empowering the supervisee, attending to the contextual assumptions about clients, and analyzing gender roles. This article explores feminist supervision and guidelines for providing counseling supervision…

  1. 48 CFR 32.503-2 - Supervision of progress payments.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 48 Federal Acquisition Regulations System 1 2013-10-01 2013-10-01 false Supervision of progress... GENERAL CONTRACTING REQUIREMENTS CONTRACT FINANCING Progress Payments Based on Costs 32.503-2 Supervision of progress payments. (a) The extent of progress payments supervision, by prepayment review or...

  2. 48 CFR 32.503-2 - Supervision of progress payments.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 48 Federal Acquisition Regulations System 1 2014-10-01 2014-10-01 false Supervision of progress... GENERAL CONTRACTING REQUIREMENTS CONTRACT FINANCING Progress Payments Based on Costs 32.503-2 Supervision of progress payments. (a) The extent of progress payments supervision, by prepayment review or...

  3. 48 CFR 32.503-2 - Supervision of progress payments.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 48 Federal Acquisition Regulations System 1 2012-10-01 2012-10-01 false Supervision of progress... GENERAL CONTRACTING REQUIREMENTS CONTRACT FINANCING Progress Payments Based on Costs 32.503-2 Supervision of progress payments. (a) The extent of progress payments supervision, by prepayment review or...

  4. 48 CFR 32.503-2 - Supervision of progress payments.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 48 Federal Acquisition Regulations System 1 2011-10-01 2011-10-01 false Supervision of progress... GENERAL CONTRACTING REQUIREMENTS CONTRACT FINANCING Progress Payments Based on Costs 32.503-2 Supervision of progress payments. (a) The extent of progress payments supervision, by prepayment review or...

  5. Monitoring Urbanization Processes from Space: Using Landsat Imagery to Detect Built-Up Areas at Scale

    NASA Astrophysics Data System (ADS)

    Goldblatt, R.; You, W.; Hanson, G.; Khandelwal, A. K.

    2016-12-01

    Urbanization is one of the most fundamental trends of the past two centuries and a key force shaping almost all dimensions of modern society. Monitoring the spatial extent of cities and their dynamics be means of remote sensing methods is crucial for many research domains, as well as to city and regional planning and to policy making. Yet the majority of urban research is being done in small scales, due, in part, to computational limitation. With the increasing availability of parallel computing platforms with large storage capacities, such as Google Earth Engine (GEE), researchers can scale up the spatial and the temporal units of analysis and investigate urbanization processes over larger areas and over longer periods of time. In this study we present a methodology that is designed to capture temporal changes in the spatial extent of urban areas at the national level. We utilize a large scale ground-truth dataset containing examples of "built-up" and "not built-up" areas from across India. This dataset, which was collected based on 2016 high-resolution imagery, is used for supervised pixel-based image classification in GEE. We assess different types of classifiers and inputs and demonstrate that with Landsat 8 as the classifier`s input, Random Forest achieves a high accuracy rate of around 87%. Although performance with Landsat 8 as the input exceeds that of Landsat 7, with the addition of several per-pixel computed indices to Landsat 7 - NDVI, NDBI, MNDWI and SAVI - the classifier`s sensitivity improves by around 10%. We use Landsat 7 to detect temporal changes in the extent of urban areas. The classifier is trained with 2016 imagery as the input - for which ground truth data is available - and is used the to detect urban areas over the historical imagery. We demonstrate that this classification produces high quality maps of urban extent over time. We compare the classification result with numerous datasets of urban areas (e.g. MODIS, DMSP-OLS and WorldPop) and show that our classification captures the fine boundaries between built-up areas and various types of land cover thus providing an accurate estimation of the extent of urban areas. The study demonstrates the potential of cloud-based platforms, such as GEE, for monitoring long-term and continuous urbanization processes at scale.

  6. Clinical supervision: the state of the art.

    PubMed

    Falender, Carol A; Shafranske, Edward P

    2014-11-01

    Since the recognition of clinical supervision as a distinct professional competence and a core competence, attention has turned to ensuring supervisor competence and effective supervision practice. In this article, we highlight recent developments and the state of the art in supervision, with particular emphasis on the competency-based approach. We present effective clinical supervision strategies, providing an integrated snapshot of the current status. We close with consideration of current training practices in supervision and challenges. © 2014 Wiley Periodicals, Inc.

  7. Environmental Compliance Assessment System (ECAS). Connecticut Supplement

    DTIC Science & Technology

    1994-09-01

    Applicator - any individual who uses or supervises the use of any restricted-use pesticides or any pesticide on property not owned or rented by the...for entry onto the property by pedes- trians or motor vehicles. " Restricted-Use Pesticide - any pesticide --: pesticide use classified as restricted...personnel making an outdoor application of a pesticide within application requires 100 yd of any property line post a sign at the time of application

  8. Accuracy assessments and areal estimates using two-phase stratified random sampling, cluster plots, and the multivariate composite estimator

    Treesearch

    Raymond L. Czaplewski

    2000-01-01

    Consider the following example of an accuracy assessment. Landsat data are used to build a thematic map of land cover for a multicounty region. The map classifier (e.g., a supervised classification algorithm) assigns each pixel into one category of land cover. The classification system includes 12 different types of forest and land cover: black spruce, balsam fir,...

  9. Semantic Shot Classification in Sports Video

    NASA Astrophysics Data System (ADS)

    Duan, Ling-Yu; Xu, Min; Tian, Qi

    2003-01-01

    In this paper, we present a unified framework for semantic shot classification in sports videos. Unlike previous approaches, which focus on clustering by aggregating shots with similar low-level features, the proposed scheme makes use of domain knowledge of a specific sport to perform a top-down video shot classification, including identification of video shot classes for each sport, and supervised learning and classification of the given sports video with low-level and middle-level features extracted from the sports video. It is observed that for each sport we can predefine a small number of semantic shot classes, about 5~10, which covers 90~95% of sports broadcasting video. With the supervised learning method, we can map the low-level features to middle-level semantic video shot attributes such as dominant object motion (a player), camera motion patterns, and court shape, etc. On the basis of the appropriate fusion of those middle-level shot classes, we classify video shots into the predefined video shot classes, each of which has a clear semantic meaning. The proposed method has been tested over 4 types of sports videos: tennis, basketball, volleyball and soccer. Good classification accuracy of 85~95% has been achieved. With correctly classified sports video shots, further structural and temporal analysis, such as event detection, video skimming, table of content, etc, will be greatly facilitated.

  10. Effects of poverty on academic failure and delinquency in boys: a change and process model approach.

    PubMed

    Pagani, L; Boulerice, B; Vitaro, F; Tremblay, R E

    1999-11-01

    Using data from the Montreal Longitudinal-Experimental Study, we examined the impact of poverty (and its correlate, family configuration status) on academic placement and self-reported delinquency in boys at age 16. We then investigated whether the relation between family economic hardship and antisocial behaviour is direct or indirect by considering the value of parenting practices and academic failure as process variables in the model. Data included official records, and parent, teacher, and self-reports. The temporal intensity of poverty was classified into five categories: never-poor; always-poor; poor-earlier; poor-later; and transitory-poverty. Family configuration status was classified by both temporal characteristics and number of marital transitions: intact-family; short-term-single; long-term-single; short-term-remarried; long-term-remarried; and multiple-marital-transitions. Results revealed that when maternal education and early childhood behaviour were controlled, poverty had an effect on both academic failure and extreme delinquency. This effect was independent of family configuration status. Although they both significantly predicted extreme delinquency on their own, academic failure and parental supervision did not mediate the relationship between poverty and delinquency. Divorce increased the risk of theft and fighting at age 16, regardless of financial hardship. Parental supervision only helped explain the effects of divorce on boys' fighting.

  11. Building an Evidence Base for Effective Supervision Practices: An Analogue Experiment of Supervision to Increase EBT Fidelity.

    PubMed

    Bearman, Sarah Kate; Schneiderman, Robyn L; Zoloth, Emma

    2017-03-01

    Treatments that are efficacious in research trials perform less well under routine conditions; differences in supervision may be one contributing factor. This study compared the effect of supervision using active learning techniques (e.g. role play, corrective feedback) versus "supervision as usual" on therapist cognitive restructuring fidelity, overall CBT competence, and CBT expertise. Forty therapist trainees attended a training workshop and were randomized to supervision condition. Outcomes were assessed using behavioral rehearsals pre- and immediately post-training, and after three supervision meetings. EBT knowledge, attitudes, and fidelity improved for all participants post-training, but only the SUP+ group demonstrated improvement following supervision.

  12. HMMER Cut-off Threshold Tool (HMMERCTTER): Supervised classification of superfamily protein sequences with a reliable cut-off threshold.

    PubMed

    Pagnuco, Inti Anabela; Revuelta, María Victoria; Bondino, Hernán Gabriel; Brun, Marcel; Ten Have, Arjen

    2018-01-01

    Protein superfamilies can be divided into subfamilies of proteins with different functional characteristics. Their sequences can be classified hierarchically, which is part of sequence function assignation. Typically, there are no clear subfamily hallmarks that would allow pattern-based function assignation by which this task is mostly achieved based on the similarity principle. This is hampered by the lack of a score cut-off that is both sensitive and specific. HMMER Cut-off Threshold Tool (HMMERCTTER) adds a reliable cut-off threshold to the popular HMMER. Using a high quality superfamily phylogeny, it clusters a set of training sequences such that the cluster-specific HMMER profiles show cluster or subfamily member detection with 100% precision and recall (P&R), thereby generating a specific threshold as inclusion cut-off. Profiles and thresholds are then used as classifiers to screen a target dataset. Iterative inclusion of novel sequences to groups and the corresponding HMMER profiles results in high sensitivity while specificity is maintained by imposing 100% P&R self detection. In three presented case studies of protein superfamilies, classification of large datasets with 100% precision was achieved with over 95% recall. Limits and caveats are presented and explained. HMMERCTTER is a promising protein superfamily sequence classifier provided high quality training datasets are used. It provides a decision support system that aids in the difficult task of sequence function assignation in the twilight zone of sequence similarity. All relevant data and source codes are available from the Github repository at the following URL: https://github.com/BBCMdP/HMMERCTTER.

  13. HMMER Cut-off Threshold Tool (HMMERCTTER): Supervised classification of superfamily protein sequences with a reliable cut-off threshold

    PubMed Central

    Pagnuco, Inti Anabela; Revuelta, María Victoria; Bondino, Hernán Gabriel; Brun, Marcel

    2018-01-01

    Background Protein superfamilies can be divided into subfamilies of proteins with different functional characteristics. Their sequences can be classified hierarchically, which is part of sequence function assignation. Typically, there are no clear subfamily hallmarks that would allow pattern-based function assignation by which this task is mostly achieved based on the similarity principle. This is hampered by the lack of a score cut-off that is both sensitive and specific. Results HMMER Cut-off Threshold Tool (HMMERCTTER) adds a reliable cut-off threshold to the popular HMMER. Using a high quality superfamily phylogeny, it clusters a set of training sequences such that the cluster-specific HMMER profiles show cluster or subfamily member detection with 100% precision and recall (P&R), thereby generating a specific threshold as inclusion cut-off. Profiles and thresholds are then used as classifiers to screen a target dataset. Iterative inclusion of novel sequences to groups and the corresponding HMMER profiles results in high sensitivity while specificity is maintained by imposing 100% P&R self detection. In three presented case studies of protein superfamilies, classification of large datasets with 100% precision was achieved with over 95% recall. Limits and caveats are presented and explained. Conclusions HMMERCTTER is a promising protein superfamily sequence classifier provided high quality training datasets are used. It provides a decision support system that aids in the difficult task of sequence function assignation in the twilight zone of sequence similarity. All relevant data and source codes are available from the Github repository at the following URL: https://github.com/BBCMdP/HMMERCTTER. PMID:29579071

  14. An AdaBoost Based Approach to Automatic Classification and Detection of Buildings Footprints, Vegetation Areas and Roads from Satellite Images

    NASA Astrophysics Data System (ADS)

    Gonulalan, Cansu

    In recent years, there has been an increasing demand for applications to monitor the targets related to land-use, using remote sensing images. Advances in remote sensing satellites give rise to the research in this area. Many applications ranging from urban growth planning to homeland security have already used the algorithms for automated object recognition from remote sensing imagery. However, they have still problems such as low accuracy on detection of targets, specific algorithms for a specific area etc. In this thesis, we focus on an automatic approach to classify and detect building foot-prints, road networks and vegetation areas. The automatic interpretation of visual data is a comprehensive task in computer vision field. The machine learning approaches improve the capability of classification in an intelligent way. We propose a method, which has high accuracy on detection and classification. The multi class classification is developed for detecting multiple objects. We present an AdaBoost-based approach along with the supervised learning algorithm. The combi- nation of AdaBoost with "Attentional Cascade" is adopted from Viola and Jones [1]. This combination decreases the computation time and gives opportunity to real time applications. For the feature extraction step, our contribution is to combine Haar-like features that include corner, rectangle and Gabor. Among all features, AdaBoost selects only critical features and generates in extremely efficient cascade structured classifier. Finally, we present and evaluate our experimental results. The overall system is tested and high performance of detection is achieved. The precision rate of the final multi-class classifier is over 98%.

  15. Exploring Clinical Supervision as Instrument for Effective Teacher Supervision

    ERIC Educational Resources Information Center

    Ibara, E. C.

    2013-01-01

    This paper examines clinical supervision approaches that have the potential to promote and implement effective teacher supervision in Nigeria. The various approaches have been analysed based on the conceptual framework of instructional supervisory behavior. The findings suggest that a clear distinction can be made between the prescriptive and…

  16. Linking Portfolio Development to Clinical Supervision: A Case Study.

    ERIC Educational Resources Information Center

    Zepeda, Sally J.

    2002-01-01

    Describes a model for portfolio supervision based on the results of a 2-year study of one elementary school's experience in implementing portfolio supervision. Includes four propositions that guided the development of the model. Describes the skills inherent in portfolio supervision. Provides general guidelines for implementation of the portfolio…

  17. Group Supervision Attitudes: Supervisory Practices Fostering Resistance to Adoption of Evidence-Based Practices

    ERIC Educational Resources Information Center

    Brooks, Charles T.; Patterson, David A.; McKiernan, Patrick M.

    2012-01-01

    The focus of this study was to qualitatively evaluate worker's attitudes about clinical supervision. It is believed that poor attitudes toward clinical supervision can create barriers during supervision sessions. Fifty-one participants within a social services organization completed an open-ended questionnaire regarding their clinical supervision…

  18. Supporting Early Childhood Practitioners through Relationship-Based, Reflective Supervision

    ERIC Educational Resources Information Center

    Bernstein, Victor J.; Edwards, Renee C.

    2012-01-01

    Reflective supervision is a relationship-based practice that supports the professional development of early childhood practitioners. Reflective supervision helps practitioners cope with the intense feelings and stress that are generated when working with at-risk children and families. It allows them to focus on the purpose and goals of the program…

  19. Sleep in patients with disorders of consciousness characterized by means of machine learning

    PubMed Central

    Lechinger, Julia; Wislowska, Malgorzata; Blume, Christine; Ott, Peter; Wegenkittl, Stefan; del Giudice, Renata; Heib, Dominik P. J.; Mayer, Helmut A.; Laureys, Steven; Pichler, Gerald; Schabus, Manuel

    2018-01-01

    Sleep has been proposed to indicate preserved residual brain functioning in patients suffering from disorders of consciousness (DOC) after awakening from coma. However, a reliable characterization of sleep patterns in this clinical population continues to be challenging given severely altered brain oscillations, frequent and extended artifacts in clinical recordings and the absence of established staging criteria. In the present study, we try to address these issues and investigate the usefulness of a multivariate machine learning technique based on permutation entropy, a complexity measure. Specifically, we used long-term polysomnography (PSG), along with video recordings in day and night periods in a sample of 23 DOC; 12 patients were diagnosed as Unresponsive Wakefulness Syndrome (UWS) and 11 were diagnosed as Minimally Conscious State (MCS). Eight hour PSG recordings of healthy sleepers (N = 26) were additionally used for training and setting parameters of supervised and unsupervised model, respectively. In DOC, the supervised classification (wake, N1, N2, N3 or REM) was validated using simultaneous videos which identified periods with prolonged eye opening or eye closure.The supervised classification revealed that out of the 23 subjects, 11 patients (5 MCS and 6 UWS) yielded highly accurate classification with an average F1-score of 0.87 representing high overlap between the classifier predicting sleep (i.e. one of the 4 sleep stages) and closed eyes. Furthermore, the unsupervised approach revealed a more complex pattern of sleep-wake stages during the night period in the MCS group, as evidenced by the presence of several distinct clusters. In contrast, in UWS patients no such clustering was found. Altogether, we present a novel data-driven method, based on machine learning that can be used to gain new and unambiguous insights into sleep organization and residual brain functioning of patients with DOC. PMID:29293607

  20. Agile Text Mining for the 2014 i2b2/UTHealth Cardiac Risk Factors Challenge

    PubMed Central

    Cormack, James; Nath, Chinmoy; Milward, David; Raja, Kalpana; Jonnalagadda, Siddhartha R

    2016-01-01

    This paper describes the use of an agile text mining platform (Linguamatics’ Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 Challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system. PMID:26209007

  1. MRI texture features as biomarkers to predict MGMT methylation status in glioblastomas

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Korfiatis, Panagiotis; Kline, Timothy L.; Erickson, Bradley J., E-mail: bje@mayo.edu

    Purpose: Imaging biomarker research focuses on discovering relationships between radiological features and histological findings. In glioblastoma patients, methylation of the O{sup 6}-methylguanine methyltransferase (MGMT) gene promoter is positively correlated with an increased effectiveness of current standard of care. In this paper, the authors investigate texture features as potential imaging biomarkers for capturing the MGMT methylation status of glioblastoma multiforme (GBM) tumors when combined with supervised classification schemes. Methods: A retrospective study of 155 GBM patients with known MGMT methylation status was conducted. Co-occurrence and run length texture features were calculated, and both support vector machines (SVMs) and random forest classifiersmore » were used to predict MGMT methylation status. Results: The best classification system (an SVM-based classifier) had a maximum area under the receiver-operating characteristic (ROC) curve of 0.85 (95% CI: 0.78–0.91) using four texture features (correlation, energy, entropy, and local intensity) originating from the T2-weighted images, yielding at the optimal threshold of the ROC curve, a sensitivity of 0.803 and a specificity of 0.813. Conclusions: Results show that supervised machine learning of MRI texture features can predict MGMT methylation status in preoperative GBM tumors, thus providing a new noninvasive imaging biomarker.« less

  2. Pulsed terahertz imaging of breast cancer in freshly excised murine tumors

    NASA Astrophysics Data System (ADS)

    Bowman, Tyler; Chavez, Tanny; Khan, Kamrul; Wu, Jingxian; Chakraborty, Avishek; Rajaram, Narasimhan; Bailey, Keith; El-Shenawee, Magda

    2018-02-01

    This paper investigates terahertz (THz) imaging and classification of freshly excised murine xenograft breast cancer tumors. These tumors are grown via injection of E0771 breast adenocarcinoma cells into the flank of mice maintained on high-fat diet. Within 1 h of excision, the tumor and adjacent tissues are imaged using a pulsed THz system in the reflection mode. The THz images are classified using a statistical Bayesian mixture model with unsupervised and supervised approaches. Correlation with digitized pathology images is conducted using classification images assigned by a modal class decision rule. The corresponding receiver operating characteristic curves are obtained based on the classification results. A total of 13 tumor samples obtained from 9 tumors are investigated. The results show good correlation of THz images with pathology results in all samples of cancer and fat tissues. For tumor samples of cancer, fat, and muscle tissues, THz images show reasonable correlation with pathology where the primary challenge lies in the overlapping dielectric properties of cancer and muscle tissues. The use of a supervised regression approach shows improvement in the classification images although not consistently in all tissue regions. Advancing THz imaging of breast tumors from mice and the development of accurate statistical models will ultimately progress the technique for the assessment of human breast tumor margins.

  3. Disease gene classification with metagraph representations.

    PubMed

    Kircali Ata, Sezin; Fang, Yuan; Wu, Min; Li, Xiao-Li; Xiao, Xiaokui

    2017-12-01

    Protein-protein interaction (PPI) networks play an important role in studying the functional roles of proteins, including their association with diseases. However, protein interaction networks are not sufficient without the support of additional biological knowledge for proteins such as their molecular functions and biological processes. To complement and enrich PPI networks, we propose to exploit biological properties of individual proteins. More specifically, we integrate keywords describing protein properties into the PPI network, and construct a novel PPI-Keywords (PPIK) network consisting of both proteins and keywords as two different types of nodes. As disease proteins tend to have a similar topological characteristics on the PPIK network, we further propose to represent proteins with metagraphs. Different from a traditional network motif or subgraph, a metagraph can capture a particular topological arrangement involving the interactions/associations between both proteins and keywords. Based on the novel metagraph representations for proteins, we further build classifiers for disease protein classification through supervised learning. Our experiments on three different PPI databases demonstrate that the proposed method consistently improves disease protein prediction across various classifiers, by 15.3% in AUC on average. It outperforms the baselines including the diffusion-based methods (e.g., RWR) and the module-based methods by 13.8-32.9% for overall disease protein prediction. For predicting breast cancer genes, it outperforms RWR, PRINCE and the module-based baselines by 6.6-14.2%. Finally, our predictions also turn out to have better correlations with literature findings from PubMed. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Rock classification based on resistivity patterns in electrical borehole wall images

    NASA Astrophysics Data System (ADS)

    Linek, Margarete; Jungmann, Matthias; Berlage, Thomas; Pechnig, Renate; Clauser, Christoph

    2007-06-01

    Electrical borehole wall images represent grey-level-coded micro-resistivity measurements at the borehole wall. Different scientific methods have been implemented to transform image data into quantitative log curves. We introduce a pattern recognition technique applying texture analysis, which uses second-order statistics based on studying the occurrence of pixel pairs. We calculate so-called Haralick texture features such as contrast, energy, entropy and homogeneity. The supervised classification method is used for assigning characteristic texture features to different rock classes and assessing the discriminative power of these image features. We use classifiers obtained from training intervals to characterize the entire image data set recovered in ODP hole 1203A. This yields a synthetic lithology profile based on computed texture data. We show that Haralick features accurately classify 89.9% of the training intervals. We obtained misclassification for vesicular basaltic rocks. Hence, further image analysis tools are used to improve the classification reliability. We decompose the 2D image signal by the application of wavelet transformation in order to enhance image objects horizontally, diagonally and vertically. The resulting filtered images are used for further texture analysis. This combined classification based on Haralick features and wavelet transformation improved our classification up to a level of 98%. The application of wavelet transformation increases the consistency between standard logging profiles and texture-derived lithology. Texture analysis of borehole wall images offers the potential to facilitate objective analysis of multiple boreholes with the same lithology.

  5. The dynamics of human-induced land cover change in miombo ecosystems of southern Africa

    NASA Astrophysics Data System (ADS)

    Jaiteh, Malanding Sambou

    Understanding human-induced land cover change in the miombo require the consistent, geographically-referenced, data on temporal land cover characteristics as well as biophysical and socioeconomic drivers of land use, the major cause of land cover change. The overall goal of this research to examine the applications of high-resolution satellite remote sensing data in studying the dynamics of human-induced land cover change in the miombo. Specific objectives are to: (1) evaluate the applications of computer-assisted classification of Landsat Thematic Mapper (TM) data for land cover mapping in the miombo and (2) analyze spatial and temporal patterns of landscape change locations in the miombo. Stepwise Thematic Classification, STC (a hybrid supervised-unsupervised classification) procedure for classifying Landsat TM data was developed and tested using Landsat TM data. Classification accuracy results were compared to those from supervised and unsupervised classification. The STC provided the highest classification accuracy i.e., 83.9% correspondence between classified and referenced data compared to 44.2% and 34.5% for unsupervised and supervised classification respectively. Improvements in the classification process can be attributed to thematic stratification of the image data into spectrally homogenous (thematic) groups and step-by-step classification of the groups using supervised or unsupervised classification techniques. Supervised classification failed to classify 18% of the scene evidence that training data used did not adequately represent all of the variability in the data. Application of the procedure in drier miombo produced overall classification accuracy of 63%. This is much lower than that of wetter miombo. The results clearly demonstrate that digital classification of Landsat TM can be successfully implemented in the miombo without intensive fieldwork. Spatial characteristics of land cover change in agricultural and forested landscapes in central Malawi were analyzed for the period 1984 to 1995 spatial pattern analysis methods. Shifting cultivation areas, Agriculture in forested landscape, experienced highest rate of woodland cover fragmentation with mean patch size of closed woodland cover decreasing from 20ha to 7.5ha. Permanent bare (cropland and settlement) in intensive agricultural matrix landscapes increased 52% largely through the conversion of fallow areas. Protected National Park area remained fairly unchanged although closed woodland area increased by 4%, mainly from regeneration of open woodland. This study provided evidence that changes in spatial characteristics in the miombo differ with landscape. Land use change (i.e. conversion to cropland) is the primary driving force behind changes in landscape spatial patterns. Also, results revealed that exclusion of intense human use (i.e. cultivation and woodcutting) through regulations and/or fencing increased both closed woodland area (through regeneration of open woodland) and overall connectivity in the landscape. Spatial characteristics of land cover change were analyzed at locations in Malawi (wetter miombo) and Zimbabwe (drier miombo). Results indicate land cover dynamics differ both between and within case study sites. In communal areas in the Kasungu scene, land cover change is dominated by woodland fragmentation to open vegetation. Change in private commercial lands was dominantly expansion of bare (settlement and cropland) areas primarily at the expense of open vegetation (fallow land).

  6. Zooniverse: Combining Human and Machine Classifiers for the Big Survey Era

    NASA Astrophysics Data System (ADS)

    Fortson, Lucy; Wright, Darryl; Beck, Melanie; Lintott, Chris; Scarlata, Claudia; Dickinson, Hugh; Trouille, Laura; Willi, Marco; Laraia, Michael; Boyer, Amy; Veldhuis, Marten; Zooniverse

    2018-01-01

    Many analyses of astronomical data sets, ranging from morphological classification of galaxies to identification of supernova candidates, have relied on humans to classify data into distinct categories. Crowdsourced galaxy classifications via the Galaxy Zoo project provided a solution that scaled visual classification for extant surveys by harnessing the combined power of thousands of volunteers. However, the much larger data sets anticipated from upcoming surveys will require a different approach. Automated classifiers using supervised machine learning have improved considerably over the past decade but their increasing sophistication comes at the expense of needing ever more training data. Crowdsourced classification by human volunteers is a critical technique for obtaining these training data. But several improvements can be made on this zeroth order solution. Efficiency gains can be achieved by implementing a “cascade filtering” approach whereby the task structure is reduced to a set of binary questions that are more suited to simpler machines while demanding lower cognitive loads for humans.Intelligent subject retirement based on quantitative metrics of volunteer skill and subject label reliability also leads to dramatic improvements in efficiency. We note that human and machine classifiers may retire subjects differently leading to trade-offs in performance space. Drawing on work with several Zooniverse projects including Galaxy Zoo and Supernova Hunter, we will present recent findings from experiments that combine cohorts of human and machine classifiers. We show that the most efficient system results when appropriate subsets of the data are intelligently assigned to each group according to their particular capabilities.With sufficient online training, simple machines can quickly classify “easy” subjects, leaving more difficult (and discovery-oriented) tasks for volunteers. We also find humans achieve higher classification purity while samples produced by machines are typically more complete. These findings set the stage for further investigations, with the ultimate goal of efficiently and accurately labeling the wide range of data classes that will arise from the planned large astronomical surveys.

  7. Evaluating suitability of Pol-SAR (TerraSAR-X, Radarsat-2) for automated sea ice classification

    NASA Astrophysics Data System (ADS)

    Ressel, Rudolf; Singha, Suman; Lehner, Susanne

    2016-05-01

    Satellite borne SAR imagery has become an invaluable tool in the field of sea ice monitoring. Previously, single polarimetric imagery were employed in supervised and unsupervised classification schemes for sea ice investigation, which was preceded by image processing techniques such as segmentation and textural features. Recently, through the advent of polarimetric SAR sensors, investigation of polarimetric features in sea ice has attracted increased attention. While dual-polarimetric data has already been investigated in a number of works, full-polarimetric data has so far not been a major scientific focus. To explore the possibilities of full-polarimetric data and compare the differences in C- and X-bands, we endeavor to analyze in detail an array of datasets, simultaneously acquired, in C-band (RADARSAT-2) and X-band (TerraSAR-X) over ice infested areas. First, we propose an array of polarimetric features (Pauli and lexicographic based). Ancillary data from national ice services, SMOS data and expert judgement were utilized to identify the governing ice regimes. Based on these observations, we then extracted mentioned features. The subsequent supervised classification approach was based on an Artificial Neural Network (ANN). To gain quantitative insight into the quality of the features themselves (and reduce a possible impact of the Hughes phenomenon), we employed mutual information to unearth the relevance and redundancy of features. The results of this information theoretic analysis guided a pruning process regarding the optimal subset of features. In the last step we compared the classified results of all sensors and images, stated respective accuracies and discussed output discrepancies in the cases of simultaneous acquisitions.

  8. Generalized SMO algorithm for SVM-based multitask learning.

    PubMed

    Cai, Feng; Cherkassky, Vladimir

    2012-06-01

    Exploiting additional information to improve traditional inductive learning is an active research area in machine learning. In many supervised-learning applications, training data can be naturally separated into several groups, and incorporating this group information into learning may improve generalization. Recently, Vapnik proposed a general approach to formalizing such problems, known as "learning with structured data" and its support vector machine (SVM) based optimization formulation called SVM+. Liang and Cherkassky showed the connection between SVM+ and multitask learning (MTL) approaches in machine learning, and proposed an SVM-based formulation for MTL called SVM+MTL for classification. Training the SVM+MTL classifier requires the solution of a large quadratic programming optimization problem which scales as O(n(3)) with sample size n. So there is a need to develop computationally efficient algorithms for implementing SVM+MTL. This brief generalizes Platt's sequential minimal optimization (SMO) algorithm to the SVM+MTL setting. Empirical results show that, for typical SVM+MTL problems, the proposed generalized SMO achieves over 100 times speed-up, in comparison with general-purpose optimization routines.

  9. Land cover classification of Landsat 8 satellite data based on Fuzzy Logic approach

    NASA Astrophysics Data System (ADS)

    Taufik, Afirah; Sakinah Syed Ahmad, Sharifah

    2016-06-01

    The aim of this paper is to propose a method to classify the land covers of a satellite image based on fuzzy rule-based system approach. The study uses bands in Landsat 8 and other indices, such as Normalized Difference Water Index (NDWI), Normalized difference built-up index (NDBI) and Normalized Difference Vegetation Index (NDVI) as input for the fuzzy inference system. The selected three indices represent our main three classes called water, built- up land, and vegetation. The combination of the original multispectral bands and selected indices provide more information about the image. The parameter selection of fuzzy membership is performed by using a supervised method known as ANFIS (Adaptive neuro fuzzy inference system) training. The fuzzy system is tested for the classification on the land cover image that covers Klang Valley area. The results showed that the fuzzy system approach is effective and can be explored and implemented for other areas of Landsat data.

  10. Logic Learning Machine and standard supervised methods for Hodgkin's lymphoma prognosis using gene expression data and clinical variables.

    PubMed

    Parodi, Stefano; Manneschi, Chiara; Verda, Damiano; Ferrari, Enrico; Muselli, Marco

    2018-03-01

    This study evaluates the performance of a set of machine learning techniques in predicting the prognosis of Hodgkin's lymphoma using clinical factors and gene expression data. Analysed samples from 130 Hodgkin's lymphoma patients included a small set of clinical variables and more than 54,000 gene features. Machine learning classifiers included three black-box algorithms ( k-nearest neighbour, Artificial Neural Network, and Support Vector Machine) and two methods based on intelligible rules (Decision Tree and the innovative Logic Learning Machine method). Support Vector Machine clearly outperformed any of the other methods. Among the two rule-based algorithms, Logic Learning Machine performed better and identified a set of simple intelligible rules based on a combination of clinical variables and gene expressions. Decision Tree identified a non-coding gene ( XIST) involved in the early phases of X chromosome inactivation that was overexpressed in females and in non-relapsed patients. XIST expression might be responsible for the better prognosis of female Hodgkin's lymphoma patients.

  11. Improving practice in community-based settings: a randomized trial of supervision - study protocol.

    PubMed

    Dorsey, Shannon; Pullmann, Michael D; Deblinger, Esther; Berliner, Lucy; Kerns, Suzanne E; Thompson, Kelly; Unützer, Jürgen; Weisz, John R; Garland, Ann F

    2013-08-10

    Evidence-based treatments for child mental health problems are not consistently available in public mental health settings. Expanding availability requires workforce training. However, research has demonstrated that training alone is not sufficient for changing provider behavior, suggesting that ongoing intervention-specific supervision or consultation is required. Supervision is notably under-investigated, particularly as provided in public mental health. The degree to which supervision in this setting includes 'gold standard' supervision elements from efficacy trials (e.g., session review, model fidelity, outcome monitoring, skill-building) is unknown. The current federally-funded investigation leverages the Washington State Trauma-focused Cognitive Behavioral Therapy Initiative to describe usual supervision practices and test the impact of systematic implementation of gold standard supervision strategies on treatment fidelity and clinical outcomes. The study has two phases. We will conduct an initial descriptive study (Phase I) of supervision practices within public mental health in Washington State followed by a randomized controlled trial of gold standard supervision strategies (Phase II), with randomization at the clinician level (i.e., supervisors provide both conditions). Study participants will be 35 supervisors and 130 clinicians in community mental health centers. We will enroll one child per clinician in Phase I (N = 130) and three children per clinician in Phase II (N = 390). We use a multi-level mixed within- and between-subjects longitudinal design. Audio recordings of supervision and therapy sessions will be collected and coded throughout both phases. Child outcome data will be collected at the beginning of treatment and at three and six months into treatment. This study will provide insight into how supervisors can optimally support clinicians delivering evidence-based treatments. Phase I will provide descriptive information, currently unavailable in the literature, about commonly used supervision strategies in community mental health. The Phase II randomized controlled trial of gold standard supervision strategies is, to our knowledge, the first experimental study of gold standard supervision strategies in community mental health and will yield needed information about how to leverage supervision to improve clinician fidelity and client outcomes. ClinicalTrials.gov NCT01800266.

  12. Improving practice in community-based settings: a randomized trial of supervision – study protocol

    PubMed Central

    2013-01-01

    Background Evidence-based treatments for child mental health problems are not consistently available in public mental health settings. Expanding availability requires workforce training. However, research has demonstrated that training alone is not sufficient for changing provider behavior, suggesting that ongoing intervention-specific supervision or consultation is required. Supervision is notably under-investigated, particularly as provided in public mental health. The degree to which supervision in this setting includes ‘gold standard’ supervision elements from efficacy trials (e.g., session review, model fidelity, outcome monitoring, skill-building) is unknown. The current federally-funded investigation leverages the Washington State Trauma-focused Cognitive Behavioral Therapy Initiative to describe usual supervision practices and test the impact of systematic implementation of gold standard supervision strategies on treatment fidelity and clinical outcomes. Methods/Design The study has two phases. We will conduct an initial descriptive study (Phase I) of supervision practices within public mental health in Washington State followed by a randomized controlled trial of gold standard supervision strategies (Phase II), with randomization at the clinician level (i.e., supervisors provide both conditions). Study participants will be 35 supervisors and 130 clinicians in community mental health centers. We will enroll one child per clinician in Phase I (N = 130) and three children per clinician in Phase II (N = 390). We use a multi-level mixed within- and between-subjects longitudinal design. Audio recordings of supervision and therapy sessions will be collected and coded throughout both phases. Child outcome data will be collected at the beginning of treatment and at three and six months into treatment. Discussion This study will provide insight into how supervisors can optimally support clinicians delivering evidence-based treatments. Phase I will provide descriptive information, currently unavailable in the literature, about commonly used supervision strategies in community mental health. The Phase II randomized controlled trial of gold standard supervision strategies is, to our knowledge, the first experimental study of gold standard supervision strategies in community mental health and will yield needed information about how to leverage supervision to improve clinician fidelity and client outcomes. Trial registration ClinicalTrials.gov NCT01800266 PMID:23937766

  13. Test of spectral/spatial classifier

    NASA Technical Reports Server (NTRS)

    Landgrebe, D. A. (Principal Investigator); Kast, J. L.; Davis, B. J.

    1977-01-01

    The author has identified the following significant results. The supervised ECHO processor (which utilizes class statistics for object identification) successfully exploits the redundancy of states characteristic of sampled imagery of ground scenes to achieve better classification accuracy, reduce the number of classifications required, and reduce the variability of classification results. The nonsupervised ECHO processor (which identifies objects without the benefit of class statistics) successfully reduces the number of classifications required and the variability of the classification results.

  14. Classification of wines according to their production regions with the contained trace elements using laser-induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Tian, Ye; Yan, Chunhua; Zhang, Tianlong; Tang, Hongsheng; Li, Hua; Yu, Jialu; Bernard, Jérôme; Chen, Li; Martin, Serge; Delepine-Gilon, Nicole; Bocková, Jana; Veis, Pavel; Chen, Yanping; Yu, Jin

    2017-09-01

    Laser-induced breakdown spectroscopy (LIBS) has been applied to classify French wines according to their production regions. The use of the surface-assisted (or surface-enhanced) sample preparation method enabled a sub-ppm limit of detection (LOD), which led to the detection and identification of at least 22 metal and nonmetal elements in a typical wine sample including majors, minors and traces. An ensemble of 29 bottles of French wines, either red or white wines, from five production regions, Alsace, Bourgogne, Beaujolais, Bordeaux and Languedoc, was analyzed together with a wine from California, considered as an outlier. A non-supervised classification model based on principal component analysis (PCA) was first developed for the classification. The results showed a limited separation power of the model, which however allowed, in a step by step approach, to understand the physical reasons behind each step of sample separation and especially to observe the influence of the matrix effect in the sample classification. A supervised classification model was then developed based on random forest (RF), which is in addition a nonlinear algorithm. The obtained classification results were satisfactory with, when the parameters of the model were optimized, a classification accuracy of 100% for the tested samples. We especially discuss in the paper, the effect of spectrum normalization with an internal reference, the choice of input variables for the classification models and the optimization of parameters for the developed classification models.

  15. Mapping raised bogs with an iterative one-class classification approach

    NASA Astrophysics Data System (ADS)

    Mack, Benjamin; Roscher, Ribana; Stenzel, Stefanie; Feilhauer, Hannes; Schmidtlein, Sebastian; Waske, Björn

    2016-10-01

    Land use and land cover maps are one of the most commonly used remote sensing products. In many applications the user only requires a map of one particular class of interest, e.g. a specific vegetation type or an invasive species. One-class classifiers are appealing alternatives to common supervised classifiers because they can be trained with labeled training data of the class of interest only. However, training an accurate one-class classification (OCC) model is challenging, particularly when facing a large image, a small class and few training samples. To tackle these problems we propose an iterative OCC approach. The presented approach uses a biased Support Vector Machine as core classifier. In an iterative pre-classification step a large part of the pixels not belonging to the class of interest is classified. The remaining data is classified by a final classifier with a novel model and threshold selection approach. The specific objective of our study is the classification of raised bogs in a study site in southeast Germany, using multi-seasonal RapidEye data and a small number of training sample. Results demonstrate that the iterative OCC outperforms other state of the art one-class classifiers and approaches for model selection. The study highlights the potential of the proposed approach for an efficient and improved mapping of small classes such as raised bogs. Overall the proposed approach constitutes a feasible approach and useful modification of a regular one-class classifier.

  16. The Effectiveness and Cost of Clinical Supervision for Motivational Interviewing: A Randomized Controlled Trial

    PubMed Central

    Martino, Steve; Paris, Manuel; Añez, Luis; Nich, Charla; Canning-Ball, Monica; Hunkele, Karen; Olmstead, Todd A.; Carroll, Kathleen M.

    2016-01-01

    The effectiveness of a competency-based supervision approach called Motivational Interviewing Assessment: Supervisory Tools for Enhancing Proficiency (MIA: STEP) was compared to Supervision-As-Usual (SAU) for increasing clinicians’ motivational interviewing (MI) adherence and competence and client retention and primary substance abstinence in a multisite Hybrid Type 2 effectiveness-implementation randomized controlled trial. Participants were 66 clinicians and 450 clients within one of eleven outpatient substance abuse programs. An independent evaluation of audio recorded supervision sessions indicated that MIA: STEP and SAU were highly and comparably discriminable across sites. While clinicians in both supervision conditions improved their MI performance, clinician supervised with MIA: STEP, compared to those in SAU, showed significantly greater increases in the competency in which they used fundamental and advanced MI strategies when using MI across seven intakes through a 16-week follow-up. There were no retention or substance use differences among the clients seen by clinicians in MIA: STEP or SAU. MIA: STEP was substantially more expensive to deliver than SAU. Innovative alternatives to resource-intensive competency-based supervision approaches such as MIA: STEP are needed to promote the implementation of evidence-based practices. PMID:27431042

  17. Active learning-based information structure analysis of full scientific articles and two applications for biomedical literature review.

    PubMed

    Guo, Yufan; Silins, Ilona; Stenius, Ulla; Korhonen, Anna

    2013-06-01

    Techniques that are capable of automatically analyzing the information structure of scientific articles could be highly useful for improving information access to biomedical literature. However, most existing approaches rely on supervised machine learning (ML) and substantial labeled data that are expensive to develop and apply to different sub-fields of biomedicine. Recent research shows that minimal supervision is sufficient for fairly accurate information structure analysis of biomedical abstracts. However, is it realistic for full articles given their high linguistic and informational complexity? We introduce and release a novel corpus of 50 biomedical articles annotated according to the Argumentative Zoning (AZ) scheme, and investigate active learning with one of the most widely used ML models-Support Vector Machines (SVM)-on this corpus. Additionally, we introduce two novel applications that use AZ to support real-life literature review in biomedicine via question answering and summarization. We show that active learning with SVM trained on 500 labeled sentences (6% of the corpus) performs surprisingly well with the accuracy of 82%, just 2% lower than fully supervised learning. In our question answering task, biomedical researchers find relevant information significantly faster from AZ-annotated than unannotated articles. In the summarization task, sentences extracted from particular zones are significantly more similar to gold standard summaries than those extracted from particular sections of full articles. These results demonstrate that active learning of full articles' information structure is indeed realistic and the accuracy is high enough to support real-life literature review in biomedicine. The annotated corpus, our AZ classifier and the two novel applications are available at http://www.cl.cam.ac.uk/yg244/12bioinfo.html

  18. Sensor feature fusion for detecting buried objects

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Clark, G.A.; Sengupta, S.K.; Sherwood, R.J.

    1993-04-01

    Given multiple registered images of the earth`s surface from dual-band sensors, our system fuses information from the sensors to reduce the effects of clutter and improve the ability to detect buried or surface target sites. The sensor suite currently includes two sensors (5 micron and 10 micron wavelengths) and one ground penetrating radar (GPR) of the wide-band pulsed synthetic aperture type. We use a supervised teaming pattern recognition approach to detect metal and plastic land mines buried in soil. The overall process consists of four main parts: Preprocessing, feature extraction, feature selection, and classification. These parts are used in amore » two step process to classify a subimage. Thee first step, referred to as feature selection, determines the features of sub-images which result in the greatest separability among the classes. The second step, image labeling, uses the selected features and the decisions from a pattern classifier to label the regions in the image which are likely to correspond to buried mines. We extract features from the images, and use feature selection algorithms to select only the most important features according to their contribution to correct detections. This allows us to save computational complexity and determine which of the sensors add value to the detection system. The most important features from the various sensors are fused using supervised teaming pattern classifiers (including neural networks). We present results of experiments to detect buried land mines from real data, and evaluate the usefulness of fusing feature information from multiple sensor types, including dual-band infrared and ground penetrating radar. The novelty of the work lies mostly in the combination of the algorithms and their application to the very important and currently unsolved operational problem of detecting buried land mines from an airborne standoff platform.« less

  19. Correlation coefficient based supervised locally linear embedding for pulmonary nodule recognition.

    PubMed

    Wu, Panpan; Xia, Kewen; Yu, Hengyong

    2016-11-01

    Dimensionality reduction techniques are developed to suppress the negative effects of high dimensional feature space of lung CT images on classification performance in computer aided detection (CAD) systems for pulmonary nodule detection. An improved supervised locally linear embedding (SLLE) algorithm is proposed based on the concept of correlation coefficient. The Spearman's rank correlation coefficient is introduced to adjust the distance metric in the SLLE algorithm to ensure that more suitable neighborhood points could be identified, and thus to enhance the discriminating power of embedded data. The proposed Spearman's rank correlation coefficient based SLLE (SC(2)SLLE) is implemented and validated in our pilot CAD system using a clinical dataset collected from the publicly available lung image database consortium and image database resource initiative (LICD-IDRI). Particularly, a representative CAD system for solitary pulmonary nodule detection is designed and implemented. After a sequential medical image processing steps, 64 nodules and 140 non-nodules are extracted, and 34 representative features are calculated. The SC(2)SLLE, as well as SLLE and LLE algorithm, are applied to reduce the dimensionality. Several quantitative measurements are also used to evaluate and compare the performances. Using a 5-fold cross-validation methodology, the proposed algorithm achieves 87.65% accuracy, 79.23% sensitivity, 91.43% specificity, and 8.57% false positive rate, on average. Experimental results indicate that the proposed algorithm outperforms the original locally linear embedding and SLLE coupled with the support vector machine (SVM) classifier. Based on the preliminary results from a limited number of nodules in our dataset, this study demonstrates the great potential to improve the performance of a CAD system for nodule detection using the proposed SC(2)SLLE. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  20. Supervised interpretation of echocardiograms with a psychological model of expert supervision

    NASA Astrophysics Data System (ADS)

    Revankar, Shriram V.; Sher, David B.; Shalin, Valerie L.; Ramamurthy, Maya

    1993-07-01

    We have developed a collaborative scheme that facilitates active human supervision of the binary segmentation of an echocardiogram. The scheme complements the reliability of a human expert with the precision of segmentation algorithms. In the developed system, an expert user compares the computer generated segmentation with the original image in a user friendly graphics environment, and interactively indicates the incorrectly classified regions either by pointing or by circling. The precise boundaries of the indicated regions are computed by studying original image properties at that region, and a human visual attention distribution map obtained from the published psychological and psychophysical research. We use the developed system to extract contours of heart chambers from a sequence of two dimensional echocardiograms. We are currently extending this method to incorporate a richer set of inputs from the human supervisor, to facilitate multi-classification of image regions depending on their functionality. We are integrating into our system the knowledge related constraints that cardiologists use, to improve the capabilities of our existing system. This extension involves developing a psychological model of expert reasoning, functional and relational models of typical views in echocardiograms, and corresponding interface modifications to map the suggested actions to image processing algorithms.

  1. Simple adaptive sparse representation based classification schemes for EEG based brain-computer interface applications.

    PubMed

    Shin, Younghak; Lee, Seungchan; Ahn, Minkyu; Cho, Hohyun; Jun, Sung Chan; Lee, Heung-No

    2015-11-01

    One of the main problems related to electroencephalogram (EEG) based brain-computer interface (BCI) systems is the non-stationarity of the underlying EEG signals. This results in the deterioration of the classification performance during experimental sessions. Therefore, adaptive classification techniques are required for EEG based BCI applications. In this paper, we propose simple adaptive sparse representation based classification (SRC) schemes. Supervised and unsupervised dictionary update techniques for new test data and a dictionary modification method by using the incoherence measure of the training data are investigated. The proposed methods are very simple and additional computation for the re-training of the classifier is not needed. The proposed adaptive SRC schemes are evaluated using two BCI experimental datasets. The proposed methods are assessed by comparing classification results with the conventional SRC and other adaptive classification methods. On the basis of the results, we find that the proposed adaptive schemes show relatively improved classification accuracy as compared to conventional methods without requiring additional computation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Effectiveness of additional supervised exercises compared with conventional treatment alone in patients with acute lateral ankle sprains: systematic review

    PubMed Central

    van Ochten, John; Luijsterburg, Pim A J; van Middelkoop, Marienke; Koes, Bart W; Bierma-Zeinstra, Sita M A

    2010-01-01

    Objective To summarise the effectiveness of adding supervised exercises to conventional treatment compared with conventional treatment alone in patients with acute lateral ankle sprains. Design Systematic review. Data sources Medline, Embase, Cochrane Central Register of Controlled Trials, Cinahl, and reference screening. Study selection Included studies were randomised controlled trials, quasi-randomised controlled trials, or clinical trials. Patients were adolescents or adults with an acute lateral ankle sprain. The treatment options were conventional treatment alone or conventional treatment combined with supervised exercises. Two reviewers independently assessed the risk of bias, and one reviewer extracted data. Because of clinical heterogeneity we analysed the data using a best evidence synthesis. Follow-up was classified as short term (up to two weeks), intermediate (two weeks to three months), and long term (more than three months). Results 11 studies were included. There was limited to moderate evidence to suggest that the addition of supervised exercises to conventional treatment leads to faster and better recovery and a faster return to sport at short term follow-up than conventional treatment alone. In specific populations (athletes, soldiers, and patients with severe injuries) this evidence was restricted to a faster return to work and sport only. There was no strong evidence of effectiveness for any of the outcome measures. Most of the included studies had a high risk of bias, with few having adequate statistical power to detect clinically relevant differences. Conclusion Additional supervised exercises compared with conventional treatment alone have some benefit for recovery and return to sport in patients with ankle sprain, though the evidence is limited or moderate and many studies are subject to bias. PMID:20978065

  3. Application of Deep Learning to Detect Precursors of Tropical Cyclone

    NASA Astrophysics Data System (ADS)

    Matsuoka, D.; Nakano, M.; Sugiyama, D.; Uchida, S.

    2017-12-01

    Tropical cyclones (TCs) affect significant damage to human society. Predicting TC generation as soon as possible is important issue in both academic and social perspectives. In the present work, we investigate the probability of predicting TCs seven days prior using deep neural networks. The training data is produced from 30-year cloud resolving global atmospheric simulation (NICAM) with 14 km horizontal resolution (Kodama et al., 2015). We employed a TCs tracking algorithm (Sugi et al., 2002; Nakano et al., 2015) to NICAM simulation data in order to generate supervised cloud images (horizontal sizes are 800-1,000km). We generate approximately one million images of "TCs (include their precursors)" and "not TCs (low pressure clouds)". We generate ten types of image classifier based on 2-dimensional convolutional neural network, includes four convolutional layers, three pooling layers and two fully connected layers. The final predicted results are obtained by these ensemble mean values. Generated classifiers are applied to untrained global simulation data (four million test images). As a result, we succeeded in predicting the precursors of TCs seven and five days before their formation with a Recall of 88.6% and 89.6% (Precision is 11.4%), respectively.

  4. Generalized multiple kernel learning with data-dependent priors.

    PubMed

    Mao, Qi; Tsang, Ivor W; Gao, Shenghua; Wang, Li

    2015-06-01

    Multiple kernel learning (MKL) and classifier ensemble are two mainstream methods for solving learning problems in which some sets of features/views are more informative than others, or the features/views within a given set are inconsistent. In this paper, we first present a novel probabilistic interpretation of MKL such that maximum entropy discrimination with a noninformative prior over multiple views is equivalent to the formulation of MKL. Instead of using the noninformative prior, we introduce a novel data-dependent prior based on an ensemble of kernel predictors, which enhances the prediction performance of MKL by leveraging the merits of the classifier ensemble. With the proposed probabilistic framework of MKL, we propose a hierarchical Bayesian model to learn the proposed data-dependent prior and classification model simultaneously. The resultant problem is convex and other information (e.g., instances with either missing views or missing labels) can be seamlessly incorporated into the data-dependent priors. Furthermore, a variety of existing MKL models can be recovered under the proposed MKL framework and can be readily extended to incorporate these priors. Extensive experiments demonstrate the benefits of our proposed framework in supervised and semisupervised settings, as well as in tasks with partial correspondence among multiple views.

  5. Oscillatory neural network for pattern recognition: trajectory based classification and supervised learning.

    PubMed

    Miller, Vonda H; Jansen, Ben H

    2008-12-01

    Computer algorithms that match human performance in recognizing written text or spoken conversation remain elusive. The reasons why the human brain far exceeds any existing recognition scheme to date in the ability to generalize and to extract invariant characteristics relevant to category matching are not clear. However, it has been postulated that the dynamic distribution of brain activity (spatiotemporal activation patterns) is the mechanism by which stimuli are encoded and matched to categories. This research focuses on supervised learning using a trajectory based distance metric for category discrimination in an oscillatory neural network model. Classification is accomplished using a trajectory based distance metric. Since the distance metric is differentiable, a supervised learning algorithm based on gradient descent is demonstrated. Classification of spatiotemporal frequency transitions and their relation to a priori assessed categories is shown along with the improved classification results after supervised training. The results indicate that this spatiotemporal representation of stimuli and the associated distance metric is useful for simple pattern recognition tasks and that supervised learning improves classification results.

  6. Classification

    NASA Astrophysics Data System (ADS)

    Oza, Nikunj

    2012-03-01

    A supervised learning task involves constructing a mapping from input data (normally described by several features) to the appropriate outputs. A set of training examples— examples with known output values—is used by a learning algorithm to generate a model. This model is intended to approximate the mapping between the inputs and outputs. This model can be used to generate predicted outputs for inputs that have not been seen before. Within supervised learning, one type of task is a classification learning task, in which each output is one or more classes to which the input belongs. For example, we may have data consisting of observations of sunspots. In a classification learning task, our goal may be to learn to classify sunspots into one of several types. Each example may correspond to one candidate sunspot with various measurements or just an image. A learning algorithm would use the supplied examples to generate a model that approximates the mapping between each supplied set of measurements and the type of sunspot. This model can then be used to classify previously unseen sunspots based on the candidate’s measurements. The generalization performance of a learned model (how closely the target outputs and the model’s predicted outputs agree for patterns that have not been presented to the learning algorithm) would provide an indication of how well the model has learned the desired mapping. More formally, a classification learning algorithm L takes a training set T as its input. The training set consists of |T| examples or instances. It is assumed that there is a probability distribution D from which all training examples are drawn independently—that is, all the training examples are independently and identically distributed (i.i.d.). The ith training example is of the form (x_i, y_i), where x_i is a vector of values of several features and y_i represents the class to be predicted.* In the sunspot classification example given above, each training example would represent one sunspot’s classification (y_i) and the corresponding set of measurements (x_i). The output of a supervised learning algorithm is a model h that approximates the unknown mapping from the inputs to the outputs. In our example, h would map from the sunspot measurements to the type of sunspot. We may have a test set S—a set of examples not used in training that we use to test how well the model h predicts the outputs on new examples. Just as with the examples in T, the examples in S are assumed to be independent and identically distributed (i.i.d.) draws from the distribution D. We measure the error of h on the test set as the proportion of test cases that h misclassifies: 1/|S| Sigma(x,y union S)[I(h(x)!= y)] where I(v) is the indicator function—it returns 1 if v is true and 0 otherwise. In our sunspot classification example, we would identify additional examples of sunspots that were not used in generating the model, and use these to determine how accurate the model is—the fraction of the test samples that the model classifies correctly. An example of a classification model is the decision tree shown in Figure 23.1. We will discuss the decision tree learning algorithm in more detail later—for now, we assume that, given a training set with examples of sunspots, this decision tree is derived. This can be used to classify previously unseen examples of sunpots. For example, if a new sunspot’s inputs indicate that its "Group Length" is in the range 10-15, then the decision tree would classify the sunspot as being of type “E,” whereas if the "Group Length" is "NULL," the "Magnetic Type" is "bipolar," and the "Penumbra" is "rudimentary," then it would be classified as type "C." In this chapter, we will add to the above description of classification problems. We will discuss decision trees and several other classification models. In particular, we will discuss the learning algorithms that generate these classification models, how to use them to classify new examples, and the strengths and weaknesses of these models. We will end with pointers to further reading on classification methods applied to astronomy data.

  7. A Customized Attention-Based Long Short-Term Memory Network for Distant Supervised Relation Extraction.

    PubMed

    He, Dengchao; Zhang, Hongjun; Hao, Wenning; Zhang, Rui; Cheng, Kai

    2017-07-01

    Distant supervision, a widely applied approach in the field of relation extraction can automatically generate large amounts of labeled training corpus with minimal manual effort. However, the labeled training corpus may have many false-positive data, which would hurt the performance of relation extraction. Moreover, in traditional feature-based distant supervised approaches, extraction models adopt human design features with natural language processing. It may also cause poor performance. To address these two shortcomings, we propose a customized attention-based long short-term memory network. Our approach adopts word-level attention to achieve better data representation for relation extraction without manually designed features to perform distant supervision instead of fully supervised relation extraction, and it utilizes instance-level attention to tackle the problem of false-positive data. Experimental results demonstrate that our proposed approach is effective and achieves better performance than traditional methods.

  8. Semi-Supervised Marginal Fisher Analysis for Hyperspectral Image Classification

    NASA Astrophysics Data System (ADS)

    Huang, H.; Liu, J.; Pan, Y.

    2012-07-01

    The problem of learning with both labeled and unlabeled examples arises frequently in Hyperspectral image (HSI) classification. While marginal Fisher analysis is a supervised method, which cannot be directly applied for Semi-supervised classification. In this paper, we proposed a novel method, called semi-supervised marginal Fisher analysis (SSMFA), to process HSI of natural scenes, which uses a combination of semi-supervised learning and manifold learning. In SSMFA, a new difference-based optimization objective function with unlabeled samples has been designed. SSMFA preserves the manifold structure of labeled and unlabeled samples in addition to separating labeled samples in different classes from each other. The semi-supervised method has an analytic form of the globally optimal solution, and it can be computed based on eigen decomposition. Classification experiments with a challenging HSI task demonstrate that this method outperforms current state-of-the-art HSI-classification methods.

  9. Determining urban land uses through building-associated element attributes derived from lidar and aerial photographs

    NASA Astrophysics Data System (ADS)

    Meng, Xuelian

    Urban land-use research is a key component in analyzing the interactions between human activities and environmental change. Researchers have conducted many experiments to classify urban or built-up land, forest, water, agriculture, and other land-use and land-cover types. Separating residential land uses from other land uses within urban areas, however, has proven to be surprisingly troublesome. Although high-resolution images have recently become more available for land-use classification, an increase in spatial resolution does not guarantee improved classification accuracy by traditional classifiers due to the increase of class complexity. This research presents an approach to detect and separate residential land uses on a building scale directly from remotely sensed imagery to enhance urban land-use analysis. Specifically, the proposed methodology applies a multi-directional ground filter to generate a bare ground surface from lidar data, then utilizes a morphology-based building detection algorithm to identify buildings from lidar and aerial photographs, and finally separates residential buildings using a supervised C4.5 decision tree analysis based on the seven selected building land-use indicators. Successful execution of this study produces three independent methods, each corresponding to the steps of the methodology: lidar ground filtering, building detection, and building-based object-oriented land-use classification. Furthermore, this research provides a prototype as one of the few early explorations of building-based land-use analysis and successful separation of more than 85% of residential buildings based on an experiment on an 8.25-km2 study site located in Austin, Texas.

  10. The efficacy of a supervised and a home-based core strengthening programme in adults with poor core stability: a three-arm randomised controlled trial.

    PubMed

    Chuter, V H; de Jonge, X A K Janse; Thompson, B M; Callister, R

    2015-03-01

    Poor core stability is linked to a range of musculoskeletal pathologies and core-strengthening programmes are widely used as treatment. Treatment outcomes, however, are highly variable, which may be related to the method of delivery of core strengthening programmes. We investigated the effect of identical 8 week core strengthening programmes delivered as either supervised or home-based on measures of core stability. Participants with poor core stability were randomised into three groups: supervised (n=26), home-based (n=26) or control (n=26). Primary outcomes were the Sahrmann test and the Star Excursion Balance Test (SEBT) for dynamic core stability and three endurance tests (side-bridge, flexor and Sorensen) for static core stability. The exercise programme was devised and supervised by an exercise physiologist. Analysis of covariance on the change from baseline over the 8 weeks showed that the supervised group performed significantly better on all core stability measures than both the home-based and control group. The home-based group produced significant improvements compared to the control group in all static core stability tests, but not in most of the dynamic core stability tests (Sahrmann test and two out of three directions of the SEBT). Our results support the use of a supervised core-strengthening programme over a home-based programme to maximise improvements in core stability, especially in its dynamic aspects. Based on our findings in healthy individuals with low core stability, further research is recommended on potential therapeutic benefits of supervised core-strengthening programmes for pathologies associated with low core stability. ACTRN12613000233729. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  11. MicroRNA based Pan-Cancer Diagnosis and Treatment Recommendation.

    PubMed

    Cheerla, Nikhil; Gevaert, Olivier

    2017-01-13

    The current state-of-the-art in cancer diagnosis and treatment is not ideal; diagnostic tests are accurate but invasive, and treatments are "one-size fits-all" instead of being personalized. Recently, miRNA's have garnered significant attention as cancer biomarkers, owing to their ease of access (circulating miRNA in the blood) and stability. There have been many studies showing the effectiveness of miRNA data in diagnosing specific cancer types, but few studies explore the role of miRNA in predicting treatment outcome. Here we go a step further, using tissue miRNA and clinical data across 21 cancers from the 'The Cancer Genome Atlas' (TCGA) database. We use machine learning techniques to create an accurate pan-cancer diagnosis system, and a prediction model for treatment outcomes. Finally, using these models, we create a web-based tool that diagnoses cancer and recommends the best treatment options. We achieved 97.2% accuracy for classification using a support vector machine classifier with radial basis. The accuracies improved to 99.9-100% when climbing up the embryonic tree and classifying cancers at different stages. We define the accuracy as the ratio of the total number of instances correctly classified to the total instances. The classifier also performed well, achieving greater than 80% sensitivity for many cancer types on independent validation datasets. Many miRNAs selected by our feature selection algorithm had strong previous associations to various cancers and tumor progression. Then, using miRNA, clinical and treatment data and encoding it in a machine-learning readable format, we built a prognosis predictor model to predict the outcome of treatment with 85% accuracy. We used this model to create a tool that recommends personalized treatment regimens. Both the diagnosis and prognosis model, incorporating semi-supervised learning techniques to improve their accuracies with repeated use, were uploaded online for easy access. Our research is a step towards the final goal of diagnosing cancer and predicting treatment recommendations using non-invasive blood tests.

  12. Testing random forest classification for identifying lava flows and mapping age groups on a single Landsat 8 image

    NASA Astrophysics Data System (ADS)

    Li, Long; Solana, Carmen; Canters, Frank; Kervyn, Matthieu

    2017-10-01

    Mapping lava flows using satellite images is an important application of remote sensing in volcanology. Several volcanoes have been mapped through remote sensing using a wide range of data, from optical to thermal infrared and radar images, using techniques such as manual mapping, supervised/unsupervised classification, and elevation subtraction. So far, spectral-based mapping applications mainly focus on the use of traditional pixel-based classifiers, without much investigation into the added value of object-based approaches and into advantages of using machine learning algorithms. In this study, Nyamuragira, characterized by a series of > 20 overlapping lava flows erupted over the last century, was used as a case study. The random forest classifier was tested to map lava flows based on pixels and objects. Image classification was conducted for the 20 individual flows and for 8 groups of flows of similar age using a Landsat 8 image and a DEM of the volcano, both at 30-meter spatial resolution. Results show that object-based classification produces maps with continuous and homogeneous lava surfaces, in agreement with the physical characteristics of lava flows, while lava flows mapped through the pixel-based classification are heterogeneous and fragmented including much "salt and pepper noise". In terms of accuracy, both pixel-based and object-based classification performs well but the former results in higher accuracies than the latter except for mapping lava flow age groups without using topographic features. It is concluded that despite spectral similarity, lava flows of contrasting age can be well discriminated and mapped by means of image classification. The classification approach demonstrated in this study only requires easily accessible image data and can be applied to other volcanoes as well if there is sufficient information to calibrate the mapping.

  13. Doctoral Supervision in Virtual Spaces: A Review of Research of Web-Based Tools to Develop Collaborative Supervision

    ERIC Educational Resources Information Center

    Maor, Dorit; Ensor, Jason D.; Fraser, Barry J.

    2016-01-01

    Supervision of doctoral students needs to be improved to increase completion rates, reduce attrition rates (estimated to be at 25% or more) and improve quality of research. The current literature review aimed to explore the contribution that technology can make to higher degree research supervision. The articles selected included empirical studies…

  14. Effectiveness of supportive supervision on the consistency of integrated community cases management skills of the health extension workers in 113 districts of Ethiopia.

    PubMed

    Ameha, Agazi; Karim, Ali Mehryar; Erbo, Amano; Ashenafi, Addis; Hailu, Mulu; Hailu, Berhan; Folla, Abebe; Bizuwork, Simret; Betemariam, Wuleta

    2014-10-01

    Consistency in the adherence to integrated Community Case Management (iCCM) protocols for common childhood illnesses provided by Ethiopia's Health Extension Program (HEP) frontline workers. One approach is to provide regular clinical mentoring to the frontline health workers of the HEP at their health posts (HP) through supportive supervision (SS) following the initial training. To Assess the effectiveness of visits to improve the consistency of iCCM skills (CoS) of the HEWs in 113 districts in Ethiopia. We analyzed data from 3,909 supportive supervision visits between January 2011 and June 2013 in 113 districts in Ethiopia. From case assessment registers, a health post was classified as consistent in managing pneumonia, malaria, or diarrhea cases if the disease classification, treatment, and follow-up of the last two cases managed at the health posts were consistent with the protocol. We used regression models to assess the effects of SS on CoS. All HPs (2,368) received at least one supportive supervision visit, 41% received two, and 15% received more than two. During the observation period, HP management consistency in pneumonia, malaria, and diarrhea increased by 3.0, 2.7 and 4.4-fold, respectively. After controlling for secular trend and other factors, significant dose-response relationships were observed between number of SS visits and CoS indicators. The SS visits following the initial training were effective in improving the CoS.

  15. Efficient Multi-Concept Visual Classifier Adaptation in Changing Environments

    DTIC Science & Technology

    2016-09-01

    yet to be discussed in existing supervised multi-concept visual perception systems used in robotics applications.1,5–7 Anno - tation of images is...Autonomous robot navigation in highly populated pedestrian zones. J Field Robotics. 2015;32(4):565–589. 3. Milella A, Reina G, Underwood J . A self...learning framework for statistical ground classification using RADAR and monocular vision. J Field Robotics. 2015;32(1):20–41. 4. Manjanna S, Dudek G

  16. Neural Network Classifies Teleoperation Data

    NASA Technical Reports Server (NTRS)

    Fiorini, Paolo; Giancaspro, Antonio; Losito, Sergio; Pasquariello, Guido

    1994-01-01

    Prototype artificial neural network, implemented in software, identifies phases of telemanipulator tasks in real time by analyzing feedback signals from force sensors on manipulator hand. Prototype is early, subsystem-level product of continuing effort to develop automated system that assists in training and supervising human control operator: provides symbolic feedback (e.g., warnings of impending collisions or evaluations of performance) to operator in real time during successive executions of same task. Also simplifies transition between teleoperation and autonomous modes of telerobotic system.

  17. Seeing It All: Evaluating Supervised Machine Learning Methods for the Classification of Diverse Otariid Behaviours

    PubMed Central

    Slip, David J.; Hocking, David P.; Harcourt, Robert G.

    2016-01-01

    Constructing activity budgets for marine animals when they are at sea and cannot be directly observed is challenging, but recent advances in bio-logging technology offer solutions to this problem. Accelerometers can potentially identify a wide range of behaviours for animals based on unique patterns of acceleration. However, when analysing data derived from accelerometers, there are many statistical techniques available which when applied to different data sets produce different classification accuracies. We investigated a selection of supervised machine learning methods for interpreting behavioural data from captive otariids (fur seals and sea lions). We conducted controlled experiments with 12 seals, where their behaviours were filmed while they were wearing 3-axis accelerometers. From video we identified 26 behaviours that could be grouped into one of four categories (foraging, resting, travelling and grooming) representing key behaviour states for wild seals. We used data from 10 seals to train four predictive classification models: stochastic gradient boosting (GBM), random forests, support vector machine using four different kernels and a baseline model: penalised logistic regression. We then took the best parameters from each model and cross-validated the results on the two seals unseen so far. We also investigated the influence of feature statistics (describing some characteristic of the seal), testing the models both with and without these. Cross-validation accuracies were lower than training accuracy, but the SVM with a polynomial kernel was still able to classify seal behaviour with high accuracy (>70%). Adding feature statistics improved accuracies across all models tested. Most categories of behaviour -resting, grooming and feeding—were all predicted with reasonable accuracy (52–81%) by the SVM while travelling was poorly categorised (31–41%). These results show that model selection is important when classifying behaviour and that by using animal characteristics we can strengthen the overall accuracy. PMID:28002450

  18. Inductive Approaches to Improving Diagnosis and Design for Diagnosability

    NASA Technical Reports Server (NTRS)

    Fisher, Douglas H. (Principal Investigator)

    1995-01-01

    The first research area under this grant addresses the problem of classifying time series according to their morphological features in the time domain. A supervised learning system called CALCHAS, which induces a classification procedure for signatures from preclassified examples, was developed. For each of several signature classes, the system infers a model that captures the class's morphological features using Bayesian model induction and the minimum message length approach to assign priors. After induction, a time series (signature) is classified in one of the classes when there is enough evidence to support that decision. Time series with sufficiently novel features, belonging to classes not present in the training set, are recognized as such. A second area of research assumes two sources of information about a system: a model or domain theory that encodes aspects of the system under study and data from actual system operations over time. A model, when it exists, represents strong prior expectations about how a system will perform. Our work with a diagnostic model of the RCS (Reaction Control System) of the Space Shuttle motivated the development of SIG, a system which combines information from a model (or domain theory) and data. As it tracks RCS behavior, the model computes quantitative and qualitative values. Induction is then performed over the data represented by both the 'raw' features and the model-computed high-level features. Finally, work on clustering for operating mode discovery motivated some important extensions to the clustering strategy we had used. One modification appends an iterative optimization technique onto the clustering system; this optimization strategy appears to be novel in the clustering literature. A second modification improves the noise tolerance of the clustering system. In particular, we adapt resampling-based pruning strategies used by supervised learning systems to the task of simplifying hierarchical clusterings, thus making post-clustering analysis easier.

  19. Supervised novelty detection in brain tissue classification with an application to white matter hyperintensities

    NASA Astrophysics Data System (ADS)

    Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.

    2016-03-01

    Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behavior that occurs in the case of a novelty. This will be evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determine if our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest point. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings. The latter being predominantly vessels, cerebral falx, brain mask errors, choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.

  20. A Model for Art Therapy-Based Supervision for End-of-Life Care Workers in Hong Kong.

    PubMed

    Potash, Jordan S; Chan, Faye; Ho, Andy H Y; Wang, Xiao Lu; Cheng, Carol

    2015-01-01

    End-of-life care workers and volunteers are particularly prone to burnout given the intense emotional and existential nature of their work. Supervision is one important way to provide adequate support that focuses on both professional and personal competencies. The inclusion of art therapy principles and practices within supervision further creates a dynamic platform for sustained self-reflection. A 6-week art therapy-based supervision group provided opportunities for developing emotional awareness, recognizing professional strengths, securing collegial relationships, and reflecting on death-related memories. The structure, rationale, and feedback are discussed.

  1. Confidence-Based Feature Acquisition

    NASA Technical Reports Server (NTRS)

    Wagstaff, Kiri L.; desJardins, Marie; MacGlashan, James

    2010-01-01

    Confidence-based Feature Acquisition (CFA) is a novel, supervised learning method for acquiring missing feature values when there is missing data at both training (learning) and test (deployment) time. To train a machine learning classifier, data is encoded with a series of input features describing each item. In some applications, the training data may have missing values for some of the features, which can be acquired at a given cost. A relevant JPL example is that of the Mars rover exploration in which the features are obtained from a variety of different instruments, with different power consumption and integration time costs. The challenge is to decide which features will lead to increased classification performance and are therefore worth acquiring (paying the cost). To solve this problem, CFA, which is made up of two algorithms (CFA-train and CFA-predict), has been designed to greedily minimize total acquisition cost (during training and testing) while aiming for a specific accuracy level (specified as a confidence threshold). With this method, it is assumed that there is a nonempty subset of features that are free; that is, every instance in the data set includes these features initially for zero cost. It is also assumed that the feature acquisition (FA) cost associated with each feature is known in advance, and that the FA cost for a given feature is the same for all instances. Finally, CFA requires that the base-level classifiers produce not only a classification, but also a confidence (or posterior probability).

  2. Quantitative Outline-based Shape Analysis and Classification of Planetary Craterforms using Supervised Learning Models

    NASA Astrophysics Data System (ADS)

    Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric

    2017-10-01

    The shapes of craterform morphology on planetary surfaces provides rich information about their origins and evolution. While morphologic information provides rich visual clues to geologic processes and properties, the ability to quantitatively communicate this information is less easily accomplished. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphology, are classified with a 100% rate of success in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.

  3. Roof Type Selection Based on Patch-Based Classification Using Deep Learning for High Resolution Satellite Imagery

    NASA Astrophysics Data System (ADS)

    Partovi, T.; Fraundorfer, F.; Azimi, S.; Marmanis, D.; Reinartz, P.

    2017-05-01

    3D building reconstruction from remote sensing image data from satellites is still an active research topic and very valuable for 3D city modelling. The roof model is the most important component to reconstruct the Level of Details 2 (LoD2) for a building in 3D modelling. While the general solution for roof modelling relies on the detailed cues (such as lines, corners and planes) extracted from a Digital Surface Model (DSM), the correct detection of the roof type and its modelling can fail due to low quality of the DSM generated by dense stereo matching. To reduce dependencies of roof modelling on DSMs, the pansharpened satellite images as a rich resource of information are used in addition. In this paper, two strategies are employed for roof type classification. In the first one, building roof types are classified in a state-of-the-art supervised pre-trained convolutional neural network (CNN) framework. In the second strategy, deep features from deep layers of different pre-trained CNN model are extracted and then an RBF kernel using SVM is employed to classify the building roof type. Based on roof complexity of the scene, a roof library including seven types of roofs is defined. A new semi-automatic method is proposed to generate training and test patches of each roof type in the library. Using the pre-trained CNN model does not only decrease the computation time for training significantly but also increases the classification accuracy.

  4. A determination of the optimum time of year for remotely classifying marsh vegetation from LANDSAT multispectral scanner data. [Louisiana

    NASA Technical Reports Server (NTRS)

    Butera, M. K. (Principal Investigator)

    1978-01-01

    The author has identified the following significant results. A technique was used to determine the optimum time for classifying marsh vegetation from computer-processed LANDSAT MSS data. The technique depended on the analysis of data derived from supervised pattern recognition by maximum likelihood theory. A dispersion index, created by the ratio of separability among the class spectral means to variability within the classes, defined the optimum classification time. Data compared from seven LANDSAT passes acquired over the same area of Louisiana marsh indicated that June and September were optimum marsh mapping times to collectively classify Baccharis halimifolia, Spartina patens, Spartina alterniflora, Juncus roemericanus, and Distichlis spicata. The same technique was used to determine the optimum classification time for individual species. April appeared to be the best month to map Juncus roemericanus; May, Spartina alterniflora; June, Baccharis halimifolia; and September, Spartina patens and Distichlis spicata. This information is important, for instance, when a single species is recognized to indicate a particular environmental condition.

  5. Spectral and spatial resolution analysis of multi sensor satellite data for coral reef mapping: Tioman Island, Malaysia

    NASA Astrophysics Data System (ADS)

    Pradhan, Biswajeet; Kabiri, Keivan

    2012-07-01

    This paper describes an assessment of coral reef mapping using multi sensor satellite images such as Landsat ETM, SPOT and IKONOS images for Tioman Island, Malaysia. The study area is known to be one of the best Islands in South East Asia for its unique collection of diversified coral reefs and serves host to thousands of tourists every year. For the coral reef identification, classification and analysis, Landsat ETM, SPOT and IKONOS images were collected processed and classified using hierarchical classification schemes. At first, Decision tree classification method was implemented to separate three main land cover classes i.e. water, rural and vegetation and then maximum likelihood supervised classification method was used to classify these main classes. The accuracy of the classification result is evaluated by a separated test sample set, which is selected based on the fieldwork survey and view interpretation from IKONOS image. Few types of ancillary data in used are: (a) DGPS ground control points; (b) Water quality parameters measured by Hydrolab DS4a; (c) Sea-bed substrates spectrum measured by Unispec and; (d) Landcover observation photos along Tioman island coastal area. The overall accuracy of the final classification result obtained was 92.25% with the kappa coefficient is 0.8940. Key words: Coral reef, Multi-spectral Segmentation, Pixel-Based Classification, Decision Tree, Tioman Island

  6. Text Authorship Identified Using the Dynamics of Word Co-Occurrence Networks

    PubMed Central

    Akimushkin, Camilo; Amancio, Diego Raphael; Oliveira, Osvaldo Novais

    2017-01-01

    Automatic identification of authorship in disputed documents has benefited from complex network theory as this approach does not require human expertise or detailed semantic knowledge. Networks modeling entire books can be used to discriminate texts from different sources and understand network growth mechanisms, but only a few studies have probed the suitability of networks in modeling small chunks of text to grasp stylistic features. In this study, we introduce a methodology based on the dynamics of word co-occurrence networks representing written texts to classify a corpus of 80 texts by 8 authors. The texts were divided into sections with equal number of linguistic tokens, from which time series were created for 12 topological metrics. Since 73% of all series were stationary (ARIMA(p, 0, q)) and the remaining were integrable of first order (ARIMA(p, 1, q)), probability distributions could be obtained for the global network metrics. The metrics exhibit bell-shaped non-Gaussian distributions, and therefore distribution moments were used as learning attributes. With an optimized supervised learning procedure based on a nonlinear transformation performed by Isomap, 71 out of 80 texts were correctly classified using the K-nearest neighbors algorithm, i.e. a remarkable 88.75% author matching success rate was achieved. Hence, purely dynamic fluctuations in network metrics can characterize authorship, thus paving the way for a robust description of large texts in terms of small evolving networks. PMID:28125703

  7. Supervision in primary health care--can it be carried out effectively in developing countries?

    PubMed

    Clements, C John; Streefland, Pieter H; Malau, Clement

    2007-01-01

    There is nothing new about supervision in primary health care service delivery. Supervision was even conducted by the Egyptian pyramid builders. Those supervising have often favoured ridicule and discipline to push individuals and communities to perform their duties. A traditional form of supervision, based on a top-down colonial model, was originally attempted as a tool to improve health service staff performance. This has recently been replaced by a more liberal "supportive supervision". While it is undoubtedly an improvement on the traditional model, we believe that even this version will not succeed to any great extent until there is a better understanding of the human interactions involved in supervision. Tremendous cultural differences exist over the globe regarding the acceptability of this form of management. While it is clear that health services in many countries have benefited from supervision of one sort or another, it is equally clear that in some countries, supervision is not carried out, or when carried out, is done inadequately. In some countries it may be culturally inappropriate, and may even be impossible to carry out supervision at all. We examine this issue with particular reference to immunization and other primary health care services in developing countries. Supported by field observations in Papua New Guinea, we conclude that supervision and its failure should be understood in a social and cultural context, being a far more complex activity than has so far been acknowledged. Social science-based research is needed to enable a third generation of culture-sensitive ideas to be developed that will improve staff performance in the field.

  8. A comparison of graph- and kernel-based -omics data integration algorithms for classifying complex traits.

    PubMed

    Yan, Kang K; Zhao, Hongyu; Pang, Herbert

    2017-12-06

    High-throughput sequencing data are widely collected and analyzed in the study of complex diseases in quest of improving human health. Well-studied algorithms mostly deal with single data source, and cannot fully utilize the potential of these multi-omics data sources. In order to provide a holistic understanding of human health and diseases, it is necessary to integrate multiple data sources. Several algorithms have been proposed so far, however, a comprehensive comparison of data integration algorithms for classification of binary traits is currently lacking. In this paper, we focus on two common classes of integration algorithms, graph-based that depict relationships with subjects denoted by nodes and relationships denoted by edges, and kernel-based that can generate a classifier in feature space. Our paper provides a comprehensive comparison of their performance in terms of various measurements of classification accuracy and computation time. Seven different integration algorithms, including graph-based semi-supervised learning, graph sharpening integration, composite association network, Bayesian network, semi-definite programming-support vector machine (SDP-SVM), relevance vector machine (RVM) and Ada-boost relevance vector machine are compared and evaluated with hypertension and two cancer data sets in our study. In general, kernel-based algorithms create more complex models and require longer computation time, but they tend to perform better than graph-based algorithms. The performance of graph-based algorithms has the advantage of being faster computationally. The empirical results demonstrate that composite association network, relevance vector machine, and Ada-boost RVM are the better performers. We provide recommendations on how to choose an appropriate algorithm for integrating data from multiple sources.

  9. Accurate determination of surface reference data in digital photographs in ice-free surfaces of Maritime Antarctica.

    PubMed

    Pina, Pedro; Vieira, Gonçalo; Bandeira, Lourenço; Mora, Carla

    2016-12-15

    The ice-free areas of Maritime Antarctica show complex mosaics of surface covers, with wide patches of diverse bare soils and rock, together with various vegetation communities dominated by lichens and mosses. The microscale variability is difficult to characterize and quantify, but is essential for ground-truthing and for defining classifiers for large areas using, for example high resolution satellite imagery, or even ultra-high resolution unmanned aerial vehicle (UAV) imagery. The main objective of this paper is to verify the ability and robustness of an automated approach to discriminate the variety of surface types in digital photographs acquired at ground level in ice-free regions of Maritime Antarctica. The proposed method is based on an object-based classification procedure built in two main steps: first, on the automated delineation of homogeneous regions (the objects) of the images through the watershed transform with adequate filtering to avoid an over-segmentation, and second, on labelling each identified object with a supervised decision classifier trained with samples of representative objects of ice-free surface types (bare rock, bare soil, moss and lichen formations). The method is evaluated with images acquired in summer campaigns in Fildes and Barton peninsulas (King George Island, South Shetlands). The best performances for the datasets of the two peninsulas are achieved with a SVM classifier with overall accuracies of about 92% and kappa values around 0.89. The excellent performances allow validating the adequacy of the approach for obtaining accurate surface reference data at the complete pixel scale (sub-metric) of current very high resolution (VHR) satellite images, instead of a common single point sampling. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Learning from label proportions in brain-computer interfaces: Online unsupervised learning with guarantees.

    PubMed

    Hübner, David; Verhoeven, Thibault; Schmid, Konstantin; Müller, Klaus-Robert; Tangermann, Michael; Kindermans, Pieter-Jan

    2017-01-01

    Using traditional approaches, a brain-computer interface (BCI) requires the collection of calibration data for new subjects prior to online use. Calibration time can be reduced or eliminated e.g., by subject-to-subject transfer of a pre-trained classifier or unsupervised adaptive classification methods which learn from scratch and adapt over time. While such heuristics work well in practice, none of them can provide theoretical guarantees. Our objective is to modify an event-related potential (ERP) paradigm to work in unison with the machine learning decoder, and thus to achieve a reliable unsupervised calibrationless decoding with a guarantee to recover the true class means. We introduce learning from label proportions (LLP) to the BCI community as a new unsupervised, and easy-to-implement classification approach for ERP-based BCIs. The LLP estimates the mean target and non-target responses based on known proportions of these two classes in different groups of the data. We present a visual ERP speller to meet the requirements of LLP. For evaluation, we ran simulations on artificially created data sets and conducted an online BCI study with 13 subjects performing a copy-spelling task. Theoretical considerations show that LLP is guaranteed to minimize the loss function similar to a corresponding supervised classifier. LLP performed well in simulations and in the online application, where 84.5% of characters were spelled correctly on average without prior calibration. The continuously adapting LLP classifier is the first unsupervised decoder for ERP BCIs guaranteed to find the optimal decoder. This makes it an ideal solution to avoid tedious calibration sessions. Additionally, LLP works on complementary principles compared to existing unsupervised methods, opening the door for their further enhancement when combined with LLP.

  11. Learning from label proportions in brain-computer interfaces: Online unsupervised learning with guarantees

    PubMed Central

    Verhoeven, Thibault; Schmid, Konstantin; Müller, Klaus-Robert; Tangermann, Michael; Kindermans, Pieter-Jan

    2017-01-01

    Objective Using traditional approaches, a brain-computer interface (BCI) requires the collection of calibration data for new subjects prior to online use. Calibration time can be reduced or eliminated e.g., by subject-to-subject transfer of a pre-trained classifier or unsupervised adaptive classification methods which learn from scratch and adapt over time. While such heuristics work well in practice, none of them can provide theoretical guarantees. Our objective is to modify an event-related potential (ERP) paradigm to work in unison with the machine learning decoder, and thus to achieve a reliable unsupervised calibrationless decoding with a guarantee to recover the true class means. Method We introduce learning from label proportions (LLP) to the BCI community as a new unsupervised, and easy-to-implement classification approach for ERP-based BCIs. The LLP estimates the mean target and non-target responses based on known proportions of these two classes in different groups of the data. We present a visual ERP speller to meet the requirements of LLP. For evaluation, we ran simulations on artificially created data sets and conducted an online BCI study with 13 subjects performing a copy-spelling task. Results Theoretical considerations show that LLP is guaranteed to minimize the loss function similar to a corresponding supervised classifier. LLP performed well in simulations and in the online application, where 84.5% of characters were spelled correctly on average without prior calibration. Significance The continuously adapting LLP classifier is the first unsupervised decoder for ERP BCIs guaranteed to find the optimal decoder. This makes it an ideal solution to avoid tedious calibration sessions. Additionally, LLP works on complementary principles compared to existing unsupervised methods, opening the door for their further enhancement when combined with LLP. PMID:28407016

  12. Agile text mining for the 2014 i2b2/UTHealth Cardiac risk factors challenge.

    PubMed

    Cormack, James; Nath, Chinmoy; Milward, David; Raja, Kalpana; Jonnalagadda, Siddhartha R

    2015-12-01

    This paper describes the use of an agile text mining platform (Linguamatics' Interactive Information Extraction Platform, I2E) to extract document-level cardiac risk factors in patient records as defined in the i2b2/UTHealth 2014 challenge. The approach uses a data-driven rule-based methodology with the addition of a simple supervised classifier. We demonstrate that agile text mining allows for rapid optimization of extraction strategies, while post-processing can leverage annotation guidelines, corpus statistics and logic inferred from the gold standard data. We also show how data imbalance in a training set affects performance. Evaluation of this approach on the test data gave an F-Score of 91.7%, one percent behind the top performing system. Copyright © 2015 Elsevier Inc. All rights reserved.

  13. Fire detection system using random forest classification for image sequences of complex background

    NASA Astrophysics Data System (ADS)

    Kim, Onecue; Kang, Dong-Joong

    2013-06-01

    We present a fire alarm system based on image processing that detects fire accidents in various environments. To reduce false alarms that frequently appeared in earlier systems, we combined image features including color, motion, and blinking information. We specifically define the color conditions of fires in hue, saturation and value, and RGB color space. Fire features are represented as intensity variation, color mean and variance, motion, and image differences. Moreover, blinking fire features are modeled by using crossing patches. We propose an algorithm that classifies patches into fire or nonfire areas by using random forest supervised learning. We design an embedded surveillance device made with acrylonitrile butadiene styrene housing for stable fire detection in outdoor environments. The experimental results show that our algorithm works robustly in complex environments and is able to detect fires in real time.

  14. UNMANNED AERIAL VEHICLE (UAV) HYPERSPECTRAL REMOTE SENSING FOR DRYLAND VEGETATION MONITORING

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nancy F. Glenn; Jessica J. Mitchell; Matthew O. Anderson

    2012-06-01

    UAV-based hyperspectral remote sensing capabilities developed by the Idaho National Lab and Idaho State University, Boise Center Aerospace Lab, were recently tested via demonstration flights that explored the influence of altitude on geometric error, image mosaicking, and dryland vegetation classification. The test flights successfully acquired usable flightline data capable of supporting classifiable composite images. Unsupervised classification results support vegetation management objectives that rely on mapping shrub cover and distribution patterns. Overall, supervised classifications performed poorly despite spectral separability in the image-derived endmember pixels. Future mapping efforts that leverage ground reference data, ultra-high spatial resolution photos and time series analysis shouldmore » be able to effectively distinguish native grasses such as Sandberg bluegrass (Poa secunda), from invasives such as burr buttercup (Ranunculus testiculatus) and cheatgrass (Bromus tectorum).« less

  15. An Efficient Data Partitioning to Improve Classification Performance While Keeping Parameters Interpretable.

    PubMed

    Korjus, Kristjan; Hebart, Martin N; Vicente, Raul

    2016-01-01

    Supervised machine learning methods typically require splitting data into multiple chunks for training, validating, and finally testing classifiers. For finding the best parameters of a classifier, training and validation are usually carried out with cross-validation. This is followed by application of the classifier with optimized parameters to a separate test set for estimating the classifier's generalization performance. With limited data, this separation of test data creates a difficult trade-off between having more statistical power in estimating generalization performance versus choosing better parameters and fitting a better model. We propose a novel approach that we term "Cross-validation and cross-testing" improving this trade-off by re-using test data without biasing classifier performance. The novel approach is validated using simulated data and electrophysiological recordings in humans and rodents. The results demonstrate that the approach has a higher probability of discovering significant results than the standard approach of cross-validation and testing, while maintaining the nominal alpha level. In contrast to nested cross-validation, which is maximally efficient in re-using data, the proposed approach additionally maintains the interpretability of individual parameters. Taken together, we suggest an addition to currently used machine learning approaches which may be particularly useful in cases where model weights do not require interpretation, but parameters do.

  16. Using an intelligent system to aid in tephra layer correlation of the tephra beds of the Mono-Inyo Craters, California

    NASA Astrophysics Data System (ADS)

    Hanson-Hedgecock, S.; Bursik, M.; Rogova, G.

    2008-12-01

    We are developing an intelligent system to correlate tephra layers by using the lithologic and geochemical characteristics of field samples, to aid geologists in interpreting eruption patterns in volcanic fields. Understanding the eruption history of a volcanic field from stratigraphic studies is important for forecasting future eruptive behavior and hazards. The intelligent system is used to define groups of tephra source vents and to correlate tephra layers based on a combination of geochemical data and lithostratigraphic characteristics. The tephra beds of the Mono-Inyo Craters, California, are used to test the ability of the intelligent system for tephra layer correlation. The data processing is performed by a suite of both unsupervised and supervised classifiers, built and combined within the framework of the Dempster-Shafer theory of evidence. We have developed algorithms to calculate isopleth maps of thickness, lithic and pumice size that are used in the processing of the lithostratigraphic data. This spatial information is important in the determination of eruption patterns and is used by an evidential nearest neighbor classifier to correlate tephra layers. Integrating a better isopleth approximation function and expert knowledge about stratigraphic order of the tephra layers into the classifier improves the lithostratigraphic correlation from 56% to 87% of layers correctly identified. Geochemical data for defining groups of tephra sources are processed by a suit of fuzzy k-means classifiers. Improved clustering results of geochemical data are achieved by the fusion of individual clustering results with an evidential combination method. The intelligent system aids correlation by showing matches and disparities between data patterns from different outcrops that may have been overlooked. The intelligent system produces a useful recognition result, while dealing with the uncertainty from sparse data and the imprecise description of layer characteristics.

  17. Supervised classification of aerial imagery and multi-source data fusion for flood assessment

    NASA Astrophysics Data System (ADS)

    Sava, E.; Harding, L.; Cervone, G.

    2015-12-01

    Floods are among the most devastating natural hazards and the ability to produce an accurate and timely flood assessment before, during, and after an event is critical for their mitigation and response. Remote sensing technologies have become the de-facto approach for observing the Earth and its environment. However, satellite remote sensing data are not always available. For these reasons, it is crucial to develop new techniques in order to produce flood assessments during and after an event. Recent advancements in data fusion techniques of remote sensing with near real time heterogeneous datasets have allowed emergency responders to more efficiently extract increasingly precise and relevant knowledge from the available information. This research presents a fusion technique using satellite remote sensing imagery coupled with non-authoritative data such as Civil Air Patrol (CAP) and tweets. A new computational methodology is proposed based on machine learning algorithms to automatically identify water pixels in CAP imagery. Specifically, wavelet transformations are paired with multiple classifiers, run in parallel, to build models discriminating water and non-water regions. The learned classification models are first tested against a set of control cases, and then used to automatically classify each image separately. A measure of uncertainty is computed for each pixel in an image proportional to the number of models classifying the pixel as water. Geo-tagged tweets are continuously harvested and stored on a MongoDB and queried in real time. They are fused with CAP classified data, and with satellite remote sensing derived flood extent results to produce comprehensive flood assessment maps. The final maps are then compared with FEMA generated flood extents to assess their accuracy. The proposed methodology is applied on two test cases, relative to the 2013 floods in Boulder CO, and the 2015 floods in Texas.

  18. Automatic extraction of relations between medical concepts in clinical texts

    PubMed Central

    Harabagiu, Sanda; Roberts, Kirk

    2011-01-01

    Objective A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. Materials and methods A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. Results The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Discussion Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. Conclusion Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available. PMID:21846787

  19. Photometric classification of type Ia supernovae in the SuperNova Legacy Survey with supervised learning

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Möller, A.; Ruhlmann-Kleider, V.; Leloup, C.

    In the era of large astronomical surveys, photometric classification of supernovae (SNe) has become an important research field due to limited spectroscopic resources for candidate follow-up and classification. In this work, we present a method to photometrically classify type Ia supernovae based on machine learning with redshifts that are derived from the SN light-curves. This method is implemented on real data from the SNLS deferred pipeline, a purely photometric pipeline that identifies SNe Ia at high-redshifts (0.2 < z < 1.1). Our method consists of two stages: feature extraction (obtaining the SN redshift from photometry and estimating light-curve shape parameters)more » and machine learning classification. We study the performance of different algorithms such as Random Forest and Boosted Decision Trees. We evaluate the performance using SN simulations and real data from the first 3 years of the Supernova Legacy Survey (SNLS), which contains large spectroscopically and photometrically classified type Ia samples. Using the Area Under the Curve (AUC) metric, where perfect classification is given by 1, we find that our best-performing classifier (Extreme Gradient Boosting Decision Tree) has an AUC of 0.98.We show that it is possible to obtain a large photometrically selected type Ia SN sample with an estimated contamination of less than 5%. When applied to data from the first three years of SNLS, we obtain 529 events. We investigate the differences between classifying simulated SNe, and real SN survey data. In particular, we find that applying a thorough set of selection cuts to the SN sample is essential for good classification. This work demonstrates for the first time the feasibility of machine learning classification in a high- z SN survey with application to real SN data.« less

  20. Classification of octet AB-type binary compounds using dynamical charges: A materials informatics perspective

    DOE PAGES

    Pilania, G.; Gubernatis, J. E.; Lookman, T.

    2015-12-03

    The role of dynamical (or Born effective) charges in classification of octet AB-type binary compounds between four-fold (zincblende/wurtzite crystal structures) and six-fold (rocksalt crystal structure) coordinated systems is discussed. We show that the difference in the dynamical charges of the fourfold and sixfold coordinated structures, in combination with Harrison’s polarity, serves as an excellent feature to classify the coordination of 82 sp–bonded binary octet compounds. We use a support vector machine classifier to estimate the average classification accuracy and the associated variance in our model where a decision boundary is learned in a supervised manner. Lastly, we compare the out-of-samplemore » classification accuracy achieved by our feature pair with those reported previously.« less

  1. Assessment of computer techniques for processing digital LANDSAT MSS data for lithological discrimination of Serra do Ramalho, State of Bahia

    NASA Technical Reports Server (NTRS)

    Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.

    1984-01-01

    Enhancement techniques and thematic classifications were applied to the metasediments of Bambui Super Group (Upper Proterozoic) in the Region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band-ratios with contrast stretch, and color-composites allow lithological discriminations. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images and color composite of linear contrast stretch of these products, show lithological discrimination through tonal gradations. This set of products allows the delineations of several metasedimentary sequences to a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and nonsupervised (K-Means classifier) classification of the limestone sequence, host to fluorite mineralization show satisfactory results.

  2. A nonlinear discriminant algorithm for feature extraction and data classification.

    PubMed

    Santa Cruz, C; Dorronsoro, J R

    1998-01-01

    This paper presents a nonlinear supervised feature extraction algorithm that combines Fisher's criterion function with a preliminary perceptron-like nonlinear projection of vectors in pattern space. Its main motivation is to combine the approximation properties of multilayer perceptrons (MLP's) with the target free nature of Fisher's classical discriminant analysis. In fact, although MLP's provide good classifiers for many problems, there may be some situations, such as unequal class sizes with a high degree of pattern mixing among them, that may make difficult the construction of good MLP classifiers. In these instances, the features extracted by our procedure could be more effective. After the description of its construction and the analysis of its complexity, we will illustrate its use over a synthetic problem with the above characteristics.

  3. Classifying social anxiety disorder using multivoxel pattern analyses of brain function and structure☆

    PubMed Central

    Frick, Andreas; Gingnell, Malin; Marquand, Andre F.; Howner, Katarina; Fischer, Håkan; Kristiansson, Marianne; Williams, Steven C.R.; Fredrikson, Mats; Furmark, Tomas

    2014-01-01

    Functional neuroimaging of social anxiety disorder (SAD) support altered neural activation to threat-provoking stimuli focally in the fear network, while structural differences are distributed over the temporal and frontal cortices as well as limbic structures. Previous neuroimaging studies have investigated the brain at the voxel level using mass-univariate methods which do not enable detection of more complex patterns of activity and structural alterations that may separate SAD from healthy individuals. Support vector machine (SVM) is a supervised machine learning method that capitalizes on brain activation and structural patterns to classify individuals. The aim of this study was to investigate if it is possible to discriminate SAD patients (n = 14) from healthy controls (n = 12) using SVM based on (1) functional magnetic resonance imaging during fearful face processing and (2) regional gray matter volume. Whole brain and region of interest (fear network) SVM analyses were performed for both modalities. For functional scans, significant classifications were obtained both at whole brain level and when restricting the analysis to the fear network while gray matter SVM analyses correctly classified participants only when using the whole brain search volume. These results support that SAD is characterized by aberrant neural activation to affective stimuli in the fear network, while disorder-related alterations in regional gray matter volume are more diffusely distributed over the whole brain. SVM may thus be useful for identifying imaging biomarkers of SAD. PMID:24239689

  4. Supervised retinal vessel segmentation from color fundus images based on matched filtering and AdaBoost classifier.

    PubMed

    Memari, Nogol; Ramli, Abd Rahman; Bin Saripan, M Iqbal; Mashohor, Syamsiah; Moghbel, Mehrdad

    2017-01-01

    The structure and appearance of the blood vessel network in retinal fundus images is an essential part of diagnosing various problems associated with the eyes, such as diabetes and hypertension. In this paper, an automatic retinal vessel segmentation method utilizing matched filter techniques coupled with an AdaBoost classifier is proposed. The fundus image is enhanced using morphological operations, the contrast is increased using contrast limited adaptive histogram equalization (CLAHE) method and the inhomogeneity is corrected using Retinex approach. Then, the blood vessels are enhanced using a combination of B-COSFIRE and Frangi matched filters. From this preprocessed image, different statistical features are computed on a pixel-wise basis and used in an AdaBoost classifier to extract the blood vessel network inside the image. Finally, the segmented images are postprocessed to remove the misclassified pixels and regions. The proposed method was validated using publicly accessible Digital Retinal Images for Vessel Extraction (DRIVE), Structured Analysis of the Retina (STARE) and Child Heart and Health Study in England (CHASE_DB1) datasets commonly used for determining the accuracy of retinal vessel segmentation methods. The accuracy of the proposed segmentation method was comparable to other state of the art methods while being very close to the manual segmentation provided by the second human observer with an average accuracy of 0.972, 0.951 and 0.948 in DRIVE, STARE and CHASE_DB1 datasets, respectively.

  5. "When 'Bad' is 'Good'": Identifying Personal Communication and Sentiment in Drug-Related Tweets.

    PubMed

    Daniulaityte, Raminta; Chen, Lu; Lamy, Francois R; Carlson, Robert G; Thirunarayan, Krishnaprasad; Sheth, Amit

    2016-10-24

    To harness the full potential of social media for epidemiological surveillance of drug abuse trends, the field needs a greater level of automation in processing and analyzing social media content. The objective of the study is to describe the development of supervised machine-learning techniques for the eDrugTrends platform to automatically classify tweets by type/source of communication (personal, official/media, retail) and sentiment (positive, negative, neutral) expressed in cannabis- and synthetic cannabinoid-related tweets. Tweets were collected using Twitter streaming Application Programming Interface and filtered through the eDrugTrends platform using keywords related to cannabis, marijuana edibles, marijuana concentrates, and synthetic cannabinoids. After creating coding rules and assessing intercoder reliability, a manually labeled data set (N=4000) was developed by coding several batches of randomly selected subsets of tweets extracted from the pool of 15,623,869 collected by eDrugTrends (May-November 2015). Out of 4000 tweets, 25% (1000/4000) were used to build source classifiers and 75% (3000/4000) were used for sentiment classifiers. Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM) were used to train the classifiers. Source classification (n=1000) tested Approach 1 that used short URLs, and Approach 2 where URLs were expanded and included into the bag-of-words analysis. For sentiment classification, Approach 1 used all tweets, regardless of their source/type (n=3000), while Approach 2 applied sentiment classification to personal communication tweets only (2633/3000, 88%). Multiclass and binary classification tasks were examined, and machine-learning sentiment classifier performance was compared with Valence Aware Dictionary for sEntiment Reasoning (VADER), a lexicon and rule-based method. The performance of each classifier was assessed using 5-fold cross validation that calculated average F-scores. One-tailed t test was used to determine if differences in F-scores were statistically significant. In multiclass source classification, the use of expanded URLs did not contribute to significant improvement in classifier performance (0.7972 vs 0.8102 for SVM, P=.19). In binary classification, the identification of all source categories improved significantly when unshortened URLs were used, with personal communication tweets benefiting the most (0.8736 vs 0.8200, P<.001). In multiclass sentiment classification Approach 1, SVM (0.6723) performed similarly to NB (0.6683) and LR (0.6703). In Approach 2, SVM (0.7062) did not differ from NB (0.6980, P=.13) or LR (F=0.6931, P=.05), but it was over 40% more accurate than VADER (F=0.5030, P<.001). In multiclass task, improvements in sentiment classification (Approach 2 vs Approach 1) did not reach statistical significance (eg, SVM: 0.7062 vs 0.6723, P=.052). In binary sentiment classification (positive vs negative), Approach 2 (focus on personal communication tweets only) improved classification results, compared with Approach 1, for LR (0.8752 vs 0.8516, P=.04) and SVM (0.8800 vs 0.8557, P=.045). The study provides an example of the use of supervised machine learning methods to categorize cannabis- and synthetic cannabinoid-related tweets with fairly high accuracy. Use of these content analysis tools along with geographic identification capabilities developed by the eDrugTrends platform will provide powerful methods for tracking regional changes in user opinions related to cannabis and synthetic cannabinoids use over time and across different regions.

  6. Continuous measurement of breast tumour hormone receptor expression: a comparison of two computational pathology platforms.

    PubMed

    Ahern, Thomas P; Beck, Andrew H; Rosner, Bernard A; Glass, Ben; Frieling, Gretchen; Collins, Laura C; Tamimi, Rulla M

    2017-05-01

    Computational pathology platforms incorporate digital microscopy with sophisticated image analysis to permit rapid, continuous measurement of protein expression. We compared two computational pathology platforms on their measurement of breast tumour oestrogen receptor (ER) and progesterone receptor (PR) expression. Breast tumour microarrays from the Nurses' Health Study were stained for ER (n=592) and PR (n=187). One expert pathologist scored cases as positive if ≥1% of tumour nuclei exhibited stain. ER and PR were then measured with the Definiens Tissue Studio (automated) and Aperio Digital Pathology (user-supervised) platforms. Platform-specific measurements were compared using boxplots, scatter plots and correlation statistics. Classification of ER and PR positivity by platform-specific measurements was evaluated with areas under receiver operating characteristic curves (AUC) from univariable logistic regression models, using expert pathologist classification as the standard. Both platforms showed considerable overlap in continuous measurements of ER and PR between positive and negative groups classified by expert pathologist. Platform-specific measurements were strongly and positively correlated with one another (r≥0.77). The user-supervised Aperio workflow performed slightly better than the automated Definiens workflow at classifying ER positivity (AUC Aperio =0.97; AUC Definiens =0.90; difference=0.07, 95% CI 0.05 to 0.09) and PR positivity (AUC Aperio =0.94; AUC Definiens =0.87; difference=0.07, 95% CI 0.03 to 0.12). Paired hormone receptor expression measurements from two different computational pathology platforms agreed well with one another. The user-supervised workflow yielded better classification accuracy than the automated workflow. Appropriately validated computational pathology algorithms enrich molecular epidemiology studies with continuous protein expression data and may accelerate tumour biomarker discovery. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/.

  7. The effects of clinical supervision on supervisees and patients in cognitive-behavioral therapy: a study protocol for a systematic review.

    PubMed

    Alfonsson, Sven; Spännargård, Åsa; Parling, Thomas; Andersson, Gerhard; Lundgren, Tobias

    2017-05-11

    Clinical supervision by a senior therapist is a very common practice in psychotherapist training and psychiatric care settings. Though clinical supervision is advocated by most educational and governing institutions, the effects of clinical supervision on the supervisees' competence, e.g., attitudes, behaviors, and skills, as well as on treatment outcomes and other patient variables are debated and largely unknown. Evidence-based practice is advocated in clinical settings but has not yet been fully implemented in educational or clinical training settings. The aim of this systematic review is to synthesize and present the empirical literature regarding effects of clinical supervision in cognitive-behavioral therapy. This study will include a systematic review of the literature to identify studies that have empirically investigated the effects of supervision on supervised psychotherapists and/or the supervisees' patients. A comprehensive search strategy will be conducted to identify published controlled studies indexed in the MEDLINE, EMBASE, PsycINFO, and Cochrane Library databases. Data on supervision outcomes in both psychotherapists and their patients will be extracted, synthesized, and reported. Risk of bias and quality of the included studies will be assessed systematically. This systematic review will rigorously follow established guidelines for systematic reviews in order to summarize and present the evidence base for clinical supervision in cognitive-behavioral therapy and may aid further research and discussion in this area. PROSPERO CRD42016046834.

  8. Improving zero-training brain-computer interfaces by mixing model estimators

    NASA Astrophysics Data System (ADS)

    Verhoeven, T.; Hübner, D.; Tangermann, M.; Müller, K. R.; Dambre, J.; Kindermans, P. J.

    2017-06-01

    Objective. Brain-computer interfaces (BCI) based on event-related potentials (ERP) incorporate a decoder to classify recorded brain signals and subsequently select a control signal that drives a computer application. Standard supervised BCI decoders require a tedious calibration procedure prior to every session. Several unsupervised classification methods have been proposed that tune the decoder during actual use and as such omit this calibration. Each of these methods has its own strengths and weaknesses. Our aim is to improve overall accuracy of ERP-based BCIs without calibration. Approach. We consider two approaches for unsupervised classification of ERP signals. Learning from label proportions (LLP) was recently shown to be guaranteed to converge to a supervised decoder when enough data is available. In contrast, the formerly proposed expectation maximization (EM) based decoding for ERP-BCI does not have this guarantee. However, while this decoder has high variance due to random initialization of its parameters, it obtains a higher accuracy faster than LLP when the initialization is good. We introduce a method to optimally combine these two unsupervised decoding methods, letting one method’s strengths compensate for the weaknesses of the other and vice versa. The new method is compared to the aforementioned methods in a resimulation of an experiment with a visual speller. Main results. Analysis of the experimental results shows that the new method exceeds the performance of the previous unsupervised classification approaches in terms of ERP classification accuracy and symbol selection accuracy during the spelling experiment. Furthermore, the method shows less dependency on random initialization of model parameters and is consequently more reliable. Significance. Improving the accuracy and subsequent reliability of calibrationless BCIs makes these systems more appealing for frequent use.

  9. Classification of asteroid spectra using a neural network

    NASA Technical Reports Server (NTRS)

    Howell, E. S.; Merenyi, E.; Lebofsky, L. A.

    1994-01-01

    The 52-color asteroid survey (Bell et al., 1988) together with the 8-color asteroid survey (Zellner et al., 1985) provide a data set of asteroid spectra spanning 0.3-2.5 micrometers. An artificial neural network clusters these asteroid spectra based on their similarity to each other. We have also trained the neural network with a categorization learning output layer in a supervised mode to associate the established clusters with taxonomic classes. Results of our classification agree with Tholen's classification based on the 8-color data alone. When extending the spectral range using the 52-color survey data, we find that some modification of the Tholen classes is indicated to produce a cleaner, self-consistent set of taxonomic classes. After supervised training using our modified classes, the network correctly classifies both the training examples, and additional spectra into the correct class with an average of 90% accuracy. Our classification supports the separation of the K class from the S class, as suggested by Bell et al. (1987), based on the near-infrared spectrum. We define two end-member subclasses which seem to have compositional significance within the S class: the So class, which is olivine-rich and red, and the Sp class, which is pyroxene-rich and less red. The remaining S-class asteroids have intermediate compositions of both olivine and pyroxene and moderately red continua. The network clustering suggests some additional structure within the E-, M-, and P-class asteroids, even in the absence of albedo information, which is the only discriminant between these in the Tholen classification. New relationships are seen between the C class and related G, B, and F classes. However, in both cases, the number of spectra is too small to interpret or determine the significance of these separations.

  10. Multi-scale textural feature extraction and particle swarm optimization based model selection for false positive reduction in mammography.

    PubMed

    Zyout, Imad; Czajkowska, Joanna; Grzegorzek, Marcin

    2015-12-01

    The high number of false positives and the resulting number of avoidable breast biopsies are the major problems faced by current mammography Computer Aided Detection (CAD) systems. False positive reduction is not only a requirement for mass but also for calcification CAD systems which are currently deployed for clinical use. This paper tackles two problems related to reducing the number of false positives in the detection of all lesions and masses, respectively. Firstly, textural patterns of breast tissue have been analyzed using several multi-scale textural descriptors based on wavelet and gray level co-occurrence matrix. The second problem addressed in this paper is the parameter selection and performance optimization. For this, we adopt a model selection procedure based on Particle Swarm Optimization (PSO) for selecting the most discriminative textural features and for strengthening the generalization capacity of the supervised learning stage based on a Support Vector Machine (SVM) classifier. For evaluating the proposed methods, two sets of suspicious mammogram regions have been used. The first one, obtained from Digital Database for Screening Mammography (DDSM), contains 1494 regions (1000 normal and 494 abnormal samples). The second set of suspicious regions was obtained from database of Mammographic Image Analysis Society (mini-MIAS) and contains 315 (207 normal and 108 abnormal) samples. Results from both datasets demonstrate the efficiency of using PSO based model selection for optimizing both classifier hyper-parameters and parameters, respectively. Furthermore, the obtained results indicate the promising performance of the proposed textural features and more specifically, those based on co-occurrence matrix of wavelet image representation technique. Copyright © 2015 Elsevier Ltd. All rights reserved.

  11. Can art therapy reduce death anxiety and burnout in end-of-life care workers? a quasi-experimental study.

    PubMed

    S Potash, Jordan; Hy Ho, Andy; Chan, Faye; Lu Wang, Xiao; Cheng, Carol

    2014-05-01

    The need for empathy and the difficulties of coping with mortality when caring for the dying and the bereaved can cause psychological, emotional, and spiritual strain. The aim of this study was to examine the effectiveness of art-therapy-based supervision in reducing burnout and death anxiety among end-of-life care workers in Hong Kong. Through a quasi-experimental design, 69 participants enrolled in a 6-week, 18-hour art-therapy-based supervision group, and another 63 enrolled in a 3-day, 18-hour standard skills-based supervision group (n=132). Pre- and post-intervention assessments were carried out with three outcome measures: the Maslach Burnout Inventory-General Survey, the Five Facet Mindfulness Questionnaire, and the Death Attitude Profile-Revised. The data was analysed using paired sample t-tests. Significant reductions in exhaustion and death anxiety and significant increases in emotional awareness were observed for participants in the art-therapy-based supervision group. This study provides preliminary evidence that art-therapy-based supervision for end-of-life care workers can reduce burnout by enhancing emotional awareness and regulation, fostering meaning-making, and promoting reflection on death.

  12. An exploration of undergraduate medical students' satisfaction with faculty support supervision during community placements in Uganda

    PubMed Central

    Mubuuke, AG; Oria, H; Dhabangi, A; Kiguli, S; Sewankambo, NK

    2015-01-01

    Introduction To produce health professionals who are oriented towards addressing community priority health needs, the training in medical schools has been transformed to include a component of community-based training. During this period, students spend a part of their training in the communities they are likely to serve upon graduation. They engage and empower local people in the communities to address their health needs during their placements, and at the same time learn from the people. During the community-based component, students are constantly supervised by faculty from the university to ensure that the intended objectives are achieved. The purpose of the present study was to explore student experiences of support supervision from university faculty during their community-based education, research and service (COBERS placements) and to identify ways in which the student learning can be improved through improved faculty supervision. Methods This was a cross-sectional study involving students at the College of Health Sciences, Makerere University, Uganda, who had a community-based component during their training. Data were collected using both questionnaires and focus group discussions. Quantitative data were analyzed using statistical software and thematic approaches were used for the analysis of qualitative data. Results Most students reported satisfaction with the COBERS supervision; however, junior students were less satisfied with the supervision than the more senior students with more experience of community-based training. Although many supervisors assisted students before departure to COBERS sites, a significant number of supervisors made little follow-up while students were in the community. Incorporating the use of information technology avenues such as emails and skype sessions was suggested as a potential way of enhancing supervision amidst resource constraints without faculty physically visiting the sites. Conclusions Although many students were satisfied with COBERS supervision, there are still some challenges, mostly seen with the more junior students. Using information technology could be a solution to some of these challenges. PMID:26626014

  13. An exploration of undergraduate medical students' satisfaction with faculty support supervision during community placements in Uganda.

    PubMed

    Mubuuke, Aloysius G; Oria, Hussein; Dhabangi, Aggrey; Kiguli, Sarah; Sewankambo, Nelson K

    2015-01-01

    To produce health professionals who are oriented towards addressing community priority health needs, the training in medical schools has been transformed to include a component of community-based training. During this period, students spend a part of their training in the communities they are likely to serve upon graduation. They engage and empower local people in the communities to address their health needs during their placements, and at the same time learn from the people. During the community-based component, students are constantly supervised by faculty from the university to ensure that the intended objectives are achieved. The purpose of the present study was to explore student experiences of support supervision from university faculty during their community-based education, research and service (COBERS placements) and to identify ways in which the student learning can be improved through improved faculty supervision. This was a cross-sectional study involving students at the College of Health Sciences, Makerere University, Uganda, who had a community-based component during their training. Data were collected using both questionnaires and focus group discussions. Quantitative data were analyzed using statistical software and thematic approaches were used for the analysis of qualitative data. Most students reported satisfaction with the COBERS supervision; however, junior students were less satisfied with the supervision than the more senior students with more experience of community-based training. Although many supervisors assisted students before departure to COBERS sites, a significant number of supervisors made little follow-up while students were in the community. Incorporating the use of information technology avenues such as emails and skype sessions was suggested as a potential way of enhancing supervision amidst resource constraints without faculty physically visiting the sites. Although many students were satisfied with COBERS supervision, there are still some challenges, mostly seen with the more junior students. Using information technology could be a solution to some of these challenges.

  14. A tool for classifying individuals with chronic back pain: using multivariate pattern analysis with functional magnetic resonance imaging data.

    PubMed

    Callan, Daniel; Mills, Lloyd; Nott, Connie; England, Robert; England, Shaun

    2014-01-01

    Chronic pain is one of the most prevalent health problems in the world today, yet neurological markers, critical to diagnosis of chronic pain, are still largely unknown. The ability to objectively identify individuals with chronic pain using functional magnetic resonance imaging (fMRI) data is important for the advancement of diagnosis, treatment, and theoretical knowledge of brain processes associated with chronic pain. The purpose of our research is to investigate specific neurological markers that could be used to diagnose individuals experiencing chronic pain by using multivariate pattern analysis with fMRI data. We hypothesize that individuals with chronic pain have different patterns of brain activity in response to induced pain. This pattern can be used to classify the presence or absence of chronic pain. The fMRI experiment consisted of alternating 14 seconds of painful electric stimulation (applied to the lower back) with 14 seconds of rest. We analyzed contrast fMRI images in stimulation versus rest in pain-related brain regions to distinguish between the groups of participants: 1) chronic pain and 2) normal controls. We employed supervised machine learning techniques, specifically sparse logistic regression, to train a classifier based on these contrast images using a leave-one-out cross-validation procedure. We correctly classified 92.3% of the chronic pain group (N = 13) and 92.3% of the normal control group (N = 13) by recognizing multivariate patterns of activity in the somatosensory and inferior parietal cortex. This technique demonstrates that differences in the pattern of brain activity to induced pain can be used as a neurological marker to distinguish between individuals with and without chronic pain. Medical, legal and business professionals have recognized the importance of this research topic and of developing objective measures of chronic pain. This method of data analysis was very successful in correctly classifying each of the two groups.

  15. A Tool for Classifying Individuals with Chronic Back Pain: Using Multivariate Pattern Analysis with Functional Magnetic Resonance Imaging Data

    PubMed Central

    Callan, Daniel; Mills, Lloyd; Nott, Connie; England, Robert; England, Shaun

    2014-01-01

    Chronic pain is one of the most prevalent health problems in the world today, yet neurological markers, critical to diagnosis of chronic pain, are still largely unknown. The ability to objectively identify individuals with chronic pain using functional magnetic resonance imaging (fMRI) data is important for the advancement of diagnosis, treatment, and theoretical knowledge of brain processes associated with chronic pain. The purpose of our research is to investigate specific neurological markers that could be used to diagnose individuals experiencing chronic pain by using multivariate pattern analysis with fMRI data. We hypothesize that individuals with chronic pain have different patterns of brain activity in response to induced pain. This pattern can be used to classify the presence or absence of chronic pain. The fMRI experiment consisted of alternating 14 seconds of painful electric stimulation (applied to the lower back) with 14 seconds of rest. We analyzed contrast fMRI images in stimulation versus rest in pain-related brain regions to distinguish between the groups of participants: 1) chronic pain and 2) normal controls. We employed supervised machine learning techniques, specifically sparse logistic regression, to train a classifier based on these contrast images using a leave-one-out cross-validation procedure. We correctly classified 92.3% of the chronic pain group (N = 13) and 92.3% of the normal control group (N = 13) by recognizing multivariate patterns of activity in the somatosensory and inferior parietal cortex. This technique demonstrates that differences in the pattern of brain activity to induced pain can be used as a neurological marker to distinguish between individuals with and without chronic pain. Medical, legal and business professionals have recognized the importance of this research topic and of developing objective measures of chronic pain. This method of data analysis was very successful in correctly classifying each of the two groups. PMID:24905072

  16. Reliability and validity of assessing subspecialty level of faculty anesthesiologists' supervision of anesthesiology residents.

    PubMed

    De Oliveira, Gildasio S; Dexter, Franklin; Bialek, Jane M; McCarthy, Robert J

    2015-01-01

    Supervision of anesthesiology residents is a major responsibility of faculty (academic) anesthesiologists. Supervision can be evaluated daily for individual anesthesiologists using a 9-question instrument. Faculty anesthesiologists with lesser individual scores contribute to lesser departmental (global) scores. Low (<3, "frequent") department-wide evaluations of supervision are associated with more mistakes with negative consequences to patients. With the long-term aim for residency programs to be evaluated partly based on the quality of their resident supervision, we assessed the 9-item instrument's reliability and validity when used to compare anesthesia programs' rotations nationwide. One thousand five hundred residents in the American Society of Anesthesiologists' directory of anesthesia trainees were randomly selected to be participants. Residents were contacted via e-mail and requested to complete a Web-based survey. Nonrespondents were mailed a paper version of the survey. Internal consistency of the supervision scale was excellent, with Cronbach's α = 0.909 (95% CI, 0.896-0.922, n = 641 respondents). Discriminant validity was found based on absence of rank correlation of supervision score with characteristics of the respondents and programs (all P > 0.10): age, hours worked per week, female, year of anesthesia training, weeks in the current rotation, sequence of survey response, size of residency class, and number of survey respondents from the current rotation and program. Convergent validity was found based on significant positive correlation between supervision score and variables related to safety culture (all P < 0.0001): "Overall perceptions of patient safety," "Teamwork within units," "Nonpunitive response to errors," "Handoffs and transitions," "Feedback and communication about error," "Communication openness," and rotation's "overall grade on patient safety." Convergent validity was found also based on significant negative correlation with variables related to the individual resident's burnout (all P < 0.0001): "I feel burnout from my work," "I have become more callous toward people since I took this job," and numbers of "errors with potential negative consequences to patients [that you have] made and/or witnessed." Usefulness was shown by supervision being predicted by the same 1 variable for each of 3 regression tree criteria: "Teamwork within [the rotation]" (e.g., "When one area in this rotation gets busy, others help out"). Evaluation of the overall quality of supervision of residents by faculty anesthesiologists depends on the reliability and validity of the instrument. Our results show that the 9-item de Oliveira Filho et al. supervision scale can be applied for overall (department, rotation) assessment of anesthesia training programs.

  17. [Quantitative classification-based occupational health management for electroplating enterprises in Baoan District of Shenzhen, China].

    PubMed

    Zhang, Sheng; Huang, Jinsheng; Yang, Baigbing; Lin, Binjie; Xu, Xinyun; Chen, Jinru; Zhao, Zhuandi; Tu, Xiaozhi; Bin, Haihua

    2014-04-01

    To improve the occupational health management levels in electroplating enterprises with quantitative classification measures and to provide a scientific basis for the prevention and control of occupational hazards in electroplating enterprises and the protection of workers' health. A quantitative classification table was created for the occupational health management in electroplating enterprises. The evaluation indicators included 6 items and 27 sub-items, with a total score of 100 points. Forty electroplating enterprises were selected and scored according to the quantitative classification table. These electroplating enterprises were classified into grades A, B, and C based on the scores. Among 40 electroplating enterprises, 11 (27.5%) had scores of >85 points (grade A), 23 (57.5%) had scores of 60∼85 points (grade B), and 6 (15.0%) had scores of <60 points (grade C). Quantitative classification management for electroplating enterprises is a valuable attempt, which is helpful for the supervision and management by the health department and provides an effective method for the self-management of enterprises.

  18. Microcomputer-based classification of environmental data in municipal areas

    NASA Astrophysics Data System (ADS)

    Thiergärtner, H.

    1995-10-01

    Multivariate data-processing methods used in mineral resource identification can be used to classify urban regions. Using elements of expert systems, geographical information systems, as well as known classification and prognosis systems, it is possible to outline a single model that consists of resistant and of temporary parts of a knowledge base including graphical input and output treatment and of resistant and temporary elements of a bank of methods and algorithms. Whereas decision rules created by experts will be stored in expert systems directly, powerful classification rules in form of resistant but latent (implicit) decision algorithms may be implemented in the suggested model. The latent functions will be transformed into temporary explicit decision rules by learning processes depending on the actual task(s), parameter set(s), pixels selection(s), and expert control(s). This takes place both at supervised and nonsupervised classification of multivariately described pixel sets representing municipal subareas. The model is outlined briefly and illustrated by results obtained in a target area covering a part of the city of Berlin (Germany).

  19. A Raman spectroscopy bio-sensor for tissue discrimination in surgical robotics.

    PubMed

    Ashok, Praveen C; Giardini, Mario E; Dholakia, Kishan; Sibbett, Wilson

    2014-01-01

    We report the development of a fiber-based Raman sensor to be used in tumour margin identification during endoluminal robotic surgery. Although this is a generic platform, the sensor we describe was adapted for the ARAKNES (Array of Robots Augmenting the KiNematics of Endoluminal Surgery) robotic platform. On such a platform, the Raman sensor is intended to identify ambiguous tissue margins during robot-assisted surgeries. To maintain sterility of the probe during surgical intervention, a disposable sleeve was specially designed. A straightforward user-compatible interface was implemented where a supervised multivariate classification algorithm was used to classify different tissue types based on specific Raman fingerprints so that it could be used without prior knowledge of spectroscopic data analysis. The protocol avoids inter-patient variability in data and the sensor system is not restricted for use in the classification of a particular tissue type. Representative tissue classification assessments were performed using this system on excised tissue. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  20. Using patient experiences on Dutch social media to supervise health care services: exploratory study.

    PubMed

    van de Belt, Tom H; Engelen, Lucien J L P G; Verhoef, Lise M; van der Weide, Marian J A; Schoonhoven, Lisette; Kool, Rudolf B

    2015-01-15

    Social media has become mainstream and a growing number of people use it to share health care-related experiences, for example on health care rating sites. These users' experiences and ratings on social media seem to be associated with quality of care. Therefore, information shared by citizens on social media could be of additional value for supervising the quality and safety of health care services by regulatory bodies, thereby stimulating participation by consumers. The objective of the study was to identify the added value of social media for two types of supervision by the Dutch Healthcare Inspectorate (DHI), which is the regulatory body charged with supervising the quality and safety of health care services in the Netherlands. These were (1) supervision in response to incidents reported by individuals, and (2) risk-based supervision. We performed an exploratory study in cooperation with the DHI and searched different social media sources such as Twitter, Facebook, and healthcare rating sites to find additional information for these incidents and topics, from five different sectors. Supervision experts determined the added value for each individual result found, making use of pre-developed scales. Searches in social media resulted in relevant information for six of 40 incidents studied and provided relevant additional information in 72 of 116 cases in risk-based supervision of long-term elderly care. The results showed that social media could be used to include the patient's perspective in supervision. However, it appeared that the rating site ZorgkaartNederland was the only source that provided information that was of additional value for the DHI, while other sources such as forums and social networks like Twitter and Facebook did not result in additional information. This information could be of importance for health care inspectorates, particularly for its enforcement by risk-based supervision in care of the elderly. Further research is needed to determine the added value for other health care sectors.

  1. Using Patient Experiences on Dutch Social Media to Supervise Health Care Services: Exploratory Study

    PubMed Central

    Engelen, Lucien JLPG; Verhoef, Lise M; van der Weide, Marian JA; Schoonhoven, Lisette; Kool, Rudolf B

    2015-01-01

    Background Social media has become mainstream and a growing number of people use it to share health care-related experiences, for example on health care rating sites. These users’ experiences and ratings on social media seem to be associated with quality of care. Therefore, information shared by citizens on social media could be of additional value for supervising the quality and safety of health care services by regulatory bodies, thereby stimulating participation by consumers. Objective The objective of the study was to identify the added value of social media for two types of supervision by the Dutch Healthcare Inspectorate (DHI), which is the regulatory body charged with supervising the quality and safety of health care services in the Netherlands. These were (1) supervision in response to incidents reported by individuals, and (2) risk-based supervision. Methods We performed an exploratory study in cooperation with the DHI and searched different social media sources such as Twitter, Facebook, and healthcare rating sites to find additional information for these incidents and topics, from five different sectors. Supervision experts determined the added value for each individual result found, making use of pre-developed scales. Results Searches in social media resulted in relevant information for six of 40 incidents studied and provided relevant additional information in 72 of 116 cases in risk-based supervision of long-term elderly care. Conclusions The results showed that social media could be used to include the patient’s perspective in supervision. However, it appeared that the rating site ZorgkaartNederland was the only source that provided information that was of additional value for the DHI, while other sources such as forums and social networks like Twitter and Facebook did not result in additional information. This information could be of importance for health care inspectorates, particularly for its enforcement by risk-based supervision in care of the elderly. Further research is needed to determine the added value for other health care sectors. PMID:25592481

  2. Application of classification methods for mapping Mercury's surface composition: analysis on Rudaki's Area

    NASA Astrophysics Data System (ADS)

    Zambon, F.; De Sanctis, M. C.; Capaccioni, F.; Filacchione, G.; Carli, C.; Ammanito, E.; Friggeri, A.

    2011-10-01

    During the first two MESSENGER flybys (14th January 2008 and 6th October 2008) the Mercury Dual Imaging System (MDIS) has extended the coverage of the Mercury surface, obtained by Mariner 10 and now we have images of about 90% of the Mercury surface [1]. MDIS is equipped with a Narrow Angle Camera (NAC) and a Wide Angle Camera (WAC). The NAC uses an off-axis reflective design with a 1.5° field of view (FOV) centered at 747 nm. The WAC has a re- fractive design with a 10.5° FOV and 12-position filters that cover a 395-1040 nm spectral range [2]. The color images can be used to infer information on the surface composition and classification meth- ods are an interesting technique for multispectral image analysis which can be applied to the study of the planetary surfaces. Classification methods are based on clustering algorithms and they can be divided in two categories: unsupervised and supervised. The unsupervised classifiers do not require the analyst feedback, and the algorithm automatically organizes pixels values into classes. In the supervised method, instead, the analyst must choose the "training area" that define the pixels value of a given class [3]. Here we will describe the classification in different compositional units of the region near the Rudaki Crater on Mercury.

  3. Supervised Classification of Underwater Optical Imagery for Improved Detection and Characterization of Underwater Military Munitions

    DTIC Science & Technology

    2015-06-01

    Atmospheric Administration (NOAA) at the “ Ordnance Reef” site off of Waianae, Hawaii by divers using a hand-held high -definition video (HDV) camera...for the Ordnance Reef dataset as well, though less dramatic than the Miami data because the 2-D results were high to begin with at Ordnance Reef...generally > 80% accuracy. Discrimination of environments was high for the major seabed types. For example, sand and mixed sand-seagrass were classified with

  4. Semi-supervised learning via regularized boosting working on multiple semi-supervised assumptions.

    PubMed

    Chen, Ke; Wang, Shihai

    2011-01-01

    Semi-supervised learning concerns the problem of learning in the presence of labeled and unlabeled data. Several boosting algorithms have been extended to semi-supervised learning with various strategies. To our knowledge, however, none of them takes all three semi-supervised assumptions, i.e., smoothness, cluster, and manifold assumptions, together into account during boosting learning. In this paper, we propose a novel cost functional consisting of the margin cost on labeled data and the regularization penalty on unlabeled data based on three fundamental semi-supervised assumptions. Thus, minimizing our proposed cost functional with a greedy yet stagewise functional optimization procedure leads to a generic boosting framework for semi-supervised learning. Extensive experiments demonstrate that our algorithm yields favorite results for benchmark and real-world classification tasks in comparison to state-of-the-art semi-supervised learning algorithms, including newly developed boosting algorithms. Finally, we discuss relevant issues and relate our algorithm to the previous work.

  5. Joint influences of individual and work unit abusive supervision on ethical intentions and behaviors: a moderated mediation model.

    PubMed

    Hannah, Sean T; Schaubroeck, John M; Peng, Ann C; Lord, Robert G; Trevino, Linda K; Kozlowski, Steve W J; Avolio, Bruce J; Dimotakis, Nikolaos; Doty, Joseph

    2013-07-01

    We develop and test a model based on social cognitive theory (Bandura, 1991) that links abusive supervision to followers' ethical intentions and behaviors. Results from a sample of 2,572 military members show that abusive supervision was negatively related to followers' moral courage and their identification with the organization's core values. In addition, work unit contexts with varying degrees of abusive supervision, reflected by the average level of abusive supervision reported by unit members, moderated relationships between the level of abusive supervision personally experienced by individuals and both their moral courage and their identification with organizational values. Moral courage and identification with organizational values accounted for the relationship between abusive supervision and followers' ethical intentions and unethical behaviors. These findings suggest that abusive supervision may undermine moral agency and that being personally abused is not required for abusive supervision to negatively influence ethical outcomes. PsycINFO Database Record (c) 2013 APA, all rights reserved.

  6. A Comprehensive Texture Segmentation Framework for Segmentation of Capillary Non-Perfusion Regions in Fundus Fluorescein Angiograms

    PubMed Central

    Zheng, Yalin; Kwong, Man Ting; MacCormick, Ian J. C.; Beare, Nicholas A. V.; Harding, Simon P.

    2014-01-01

    Capillary non-perfusion (CNP) in the retina is a characteristic feature used in the management of a wide range of retinal diseases. There is no well-established computation tool for assessing the extent of CNP. We propose a novel texture segmentation framework to address this problem. This framework comprises three major steps: pre-processing, unsupervised total variation texture segmentation, and supervised segmentation. It employs a state-of-the-art multiphase total variation texture segmentation model which is enhanced by new kernel based region terms. The model can be applied to texture and intensity-based multiphase problems. A supervised segmentation step allows the framework to take expert knowledge into account, an AdaBoost classifier with weighted cost coefficient is chosen to tackle imbalanced data classification problems. To demonstrate its effectiveness, we applied this framework to 48 images from malarial retinopathy and 10 images from ischemic diabetic maculopathy. The performance of segmentation is satisfactory when compared to a reference standard of manual delineations: accuracy, sensitivity and specificity are 89.0%, 73.0%, and 90.8% respectively for the malarial retinopathy dataset and 80.8%, 70.6%, and 82.1% respectively for the diabetic maculopathy dataset. In terms of region-wise analysis, this method achieved an accuracy of 76.3% (45 out of 59 regions) for the malarial retinopathy dataset and 73.9% (17 out of 26 regions) for the diabetic maculopathy dataset. This comprehensive segmentation framework can quantify capillary non-perfusion in retinopathy from two distinct etiologies, and has the potential to be adopted for wider applications. PMID:24747681

  7. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    NASA Astrophysics Data System (ADS)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in biological sciences in order to decrypt mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the amount of image data sets can often be in the hundred thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets usually containing two or more reaction channels. The herein presented image analysis tool is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issue of the image processing task is an automatic cell segmentation which has to be robust and accurate for all different phenotypes and a successive phenotype classification. The cell segmentation is done in two steps by segmenting the cell nuclei first and then using a classifier-enhanced region growing on basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant allowing different staining quality and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied for an RNAi-screen containing three hundred thousand image data sets and the SVM extended version is designed for additional screens.

  8. Enhanced multi-protocol analysis via intelligent supervised embedding (EMPrAvISE): detecting prostate cancer on multi-parametric MRI

    NASA Astrophysics Data System (ADS)

    Viswanath, Satish; Bloch, B. Nicholas; Chappelow, Jonathan; Patel, Pratik; Rofsky, Neil; Lenkinski, Robert; Genega, Elizabeth; Madabhushi, Anant

    2011-03-01

    Currently, there is significant interest in developing methods for quantitative integration of multi-parametric (structural, functional) imaging data with the objective of building automated meta-classifiers to improve disease detection, diagnosis, and prognosis. Such techniques are required to address the differences in dimensionalities and scales of individual protocols, while deriving an integrated multi-parametric data representation which best captures all disease-pertinent information available. In this paper, we present a scheme called Enhanced Multi-Protocol Analysis via Intelligent Supervised Embedding (EMPrAvISE); a powerful, generalizable framework applicable to a variety of domains for multi-parametric data representation and fusion. Our scheme utilizes an ensemble of embeddings (via dimensionality reduction, DR); thereby exploiting the variance amongst multiple uncorrelated embeddings in a manner similar to ensemble classifier schemes (e.g. Bagging, Boosting). We apply this framework to the problem of prostate cancer (CaP) detection on 12 3 Tesla pre-operative in vivo multi-parametric (T2-weighted, Dynamic Contrast Enhanced, and Diffusion-weighted) magnetic resonance imaging (MRI) studies, in turn comprising a total of 39 2D planar MR images. We first align the different imaging protocols via automated image registration, followed by quantification of image attributes from individual protocols. Multiple embeddings are generated from the resultant high-dimensional feature space which are then combined intelligently to yield a single stable solution. Our scheme is employed in conjunction with graph embedding (for DR) and probabilistic boosting trees (PBTs) to detect CaP on multi-parametric MRI. Finally, a probabilistic pairwise Markov Random Field algorithm is used to apply spatial constraints to the result of the PBT classifier, yielding a per-voxel classification of CaP presence. Per-voxel evaluation of detection results against ground truth for CaP extent on MRI (obtained by spatially registering pre-operative MRI with available whole-mount histological specimens) reveals that EMPrAvISE yields a statistically significant improvement (AUC=0.77) over classifiers constructed from individual protocols (AUC=0.62, 0.62, 0.65, for T2w, DCE, DWI respectively) as well as one trained using multi-parametric feature concatenation (AUC=0.67).

  9. Administrative clinical supervision as evaluated by the first-line managers in one health care organization district.

    PubMed

    Sirola-Karvinen, Pirjo; Hyrkäs, Kristiina

    2008-07-01

    The aim of this article is to increase knowledge and understanding of administrative clinical supervision. Administrative clinical supervision is a learning process for leaders that is based on experiences. Only a few studies have focused on administrative clinical supervision. The materials for this study were evaluations collected in 2002-2005 using a clinical supervision evaluation scale (MCSS). The respondents (n = 126) in the study were nursing leaders representing different specialties. The data were analysed statistically. The findings showed that the supervision succeeded very well. The contents of the sessions differed depending on the nurse leader's position. Significant differences were found in the evaluations between specialties and within years of work experience. Clinical supervision was utilized best in the psychiatric and mental health sector. The supervisees' who had long work experience scored the importance and value of clinical supervision as high. Clinical supervision is beneficial for nursing leaders. The experiences were positive and the nursing leaders appreciated the importance and value of clinical supervision. It is important to plan and coordinate a longitudinal evaluation so that clinical supervision for nursing leaders is systematically implemented and continuously developed.

  10. Clinical Supervision of Interns: Understanding the View of Interns and the Potential of ICT to Deliver Supervision for Safer Patient Care.

    PubMed

    Yee, Kwang Chien; Madden, Angela; Nash, Rosie; Connolly, Michael

    2017-01-01

    Clinical communication and clinical supervision of junior healthcare professionals are identified as the two most common preventable factors to reduce medical errors. While multiple strategies have been implemented to improve clinical communication, clinical supervision has not attracted as much attention. This is in part due to the lack of understanding of clinical supervision. Furthermore, there is a lack of exploration of information communication technology (ICT) in assisting the delivery of clinical supervision from the perspective of users (i.e. junior clinicians). This paper presents a study to understand clinical supervision from the perspective of medical and pharmacy interns. The important elements of good clinical supervisors and good clinical supervision have been presented in this paper based on our study. More importantly, our results suggest a distinction between good supervisors and good supervisions. Both these factors impact on patient safety. Through discussion of user requirements of good supervision by users (interns), this paper then explores and presents a conceptual framework to assist in the discussion and design of ICT by healthcare organisations to improve clinical supervision of interns and therefore improve patient safety.

  11. Supervised machine learning algorithms to diagnose stress for vehicle drivers based on physiological sensor signals.

    PubMed

    Barua, Shaibal; Begum, Shahina; Ahmed, Mobyen Uddin

    2015-01-01

    Machine learning algorithms play an important role in computer science research. Recent advancement in sensor data collection in clinical sciences lead to a complex, heterogeneous data processing, and analysis for patient diagnosis and prognosis. Diagnosis and treatment of patients based on manual analysis of these sensor data are difficult and time consuming. Therefore, development of Knowledge-based systems to support clinicians in decision-making is important. However, it is necessary to perform experimental work to compare performances of different machine learning methods to help to select appropriate method for a specific characteristic of data sets. This paper compares classification performance of three popular machine learning methods i.e., case-based reasoning, neutral networks and support vector machine to diagnose stress of vehicle drivers using finger temperature and heart rate variability. The experimental results show that case-based reasoning outperforms other two methods in terms of classification accuracy. Case-based reasoning has achieved 80% and 86% accuracy to classify stress using finger temperature and heart rate variability. On contrary, both neural network and support vector machine have achieved less than 80% accuracy by using both physiological signals.

  12. Relationship between after-school care of adolescents and substance use, risk taking, depressed mood, and academic achievement.

    PubMed

    Richardson, J L; Radziszewska, B; Dent, C W; Flay, B R

    1993-07-01

    To examine the relationship between parental monitoring and six negative behaviors: cigarette, alcohol, and marijuana use; depressed mood; risk taking; and lower academic grades. Survey of 3993 ninth-grade students in six school districts in southern California. The sample consisted of 1930 boys and 2063 girls, self-classified as non-Hispanic white (32%), African-American (13%), Hispanic (46%), or Asian (9%). A relationship was found between unsupervised care after school and susceptibility to cigarette, alcohol, and marijuana use; depressed mood; risk taking; and lower academic grades. Adolescents who were unsupervised at home were slightly more likely to engage in problem behavior than those who were supervised at home. Adolescents at a neighbor's house, at school, or at a job and especially those who "hang out" were most likely to engage in problem behavior. Risk was higher if the parent had an unengaged parenting style. Although girls were less likely than boys to engage in problem behavior when supervised, as supervision decreased they were significantly more likely to have each of these problems. Family structure had little impact on risk. Self-care, especially when it occurs outside of the home, is associated with substance use, risk taking, depressed mood, and lower academic grades.

  13. Redesigning Supervision: Alternative Models for Student Teaching and Field Experiences

    ERIC Educational Resources Information Center

    Rodgers, Adrian; Jenkins, Deborah Bainer

    2010-01-01

    In "Redesigning Supervision", active professionals in teacher education and professional development share research-based, alternative models for restructuring the way pre-service teachers are supervised. The authors examine the methods currently used and discuss how teacher educators have striven to change or renew these procedures. They then…

  14. A Grounded Theory Study of Supervision of Preservice Consultation Training

    ERIC Educational Resources Information Center

    Newman, Daniel S.

    2012-01-01

    The purpose of this study was to explore a university-based supervision process for consultants-in-training (CITs) engaged in a preservice level consultation course with applied practicum experience. The study was approached from a constructivist worldview using a grounded theory methodology. Data consisted of supervision session transcripts,…

  15. The Collaborative Peer Supervision Group Project: A Continuing Education Model to Promote Professional Competence

    ERIC Educational Resources Information Center

    Thomasgard, Michael; Warfield, Janeece

    2005-01-01

    Thomasgard, a physician, and Warfield, a psychologist, describe the multidisciplinary Collaborative Peer Supervision Group Project, originally developed and implemented in Columbus, Ohio. Collaborative Peer Supervision Groups (CPSGs) foster the development of case-based, interdisciplinary, continuing education. CPSGs are designed to improve the…

  16. Inspired by "El Duende": One-Canvas Process Painting in Art Therapy Supervision

    ERIC Educational Resources Information Center

    Miller, Abbe

    2012-01-01

    This article describes an art-based approach to supervision that combines clinical insights with archetypal awareness arising from painting on a single canvas throughout the internship semester. Supervision is comprised of three main components: (a) spontaneous painting, (b) complex reflective processing, and (c) aesthetically focused attention to…

  17. Special Education Teachers' Supervision of Paraeducators: A Quantitative Study of Team Sharedness

    ERIC Educational Resources Information Center

    Panitz, Beth L.

    2012-01-01

    Paraeducators, also known as paraprofessionals or teaching assistants, provide special education services for students with disabilities under the supervision of special education teachers. Despite legal requirements that paraeducators work under the direct supervision of teachers, teacher education programs lack research-based evidence to design…

  18. Predicting Teacher Job Satisfaction Based on Principals' Instructional Supervision Behaviours: A Study of Turkish Teachers

    ERIC Educational Resources Information Center

    Ilgan, Abdurrahman; Parylo, Oksana; Sungu, Hilmi

    2015-01-01

    This quantitative research examined instructional supervision behaviours of school principals as a predictor of teacher job satisfaction through the analysis of Turkish teachers' perceptions of principals' instructional supervision behaviours. There was a statistically significant difference found between the teachers' job satisfaction level and…

  19. Making the Invisible Visible: Understanding Social Processes within Multicultural Internship Supervision

    ERIC Educational Resources Information Center

    Proctor, Sherrie L.; Rogers, Margaret R.

    2013-01-01

    Despite a clear need, few resources exist to guide field-based multicultural internship supervision practices in school psychology. This article draws on literature from counseling and clinical psychology and related disciplines to ground and define multicultural internship supervision within the context of school psychology professional practice.…

  20. Online Lab Books for Supervision of Project Students

    ERIC Educational Resources Information Center

    Badge, J. L.; Badge, R. M.

    2009-01-01

    In this article, the authors report a case study where Blackboard's wiki function was used to create electronic lab books for the supervision of undergraduate students completing laboratory based research projects. This successful experiment in supervision using electronic notebooks provided a searchable record of student work and a permanent…

  1. Action Research as Instructional Supervision: Suggestions for Principals

    ERIC Educational Resources Information Center

    Glanz, Jeffrey

    2005-01-01

    Supervision based on collaboration, participative decision making, and reflective practice is the hallmark of a viable school improvement program that is designed to promote teaching and learning. Action research has gradually emerged as an important form of instructional supervision to engage teachers in reflective practice about their teaching…

  2. Is it possible to strengthen psychiatric nursing staff's clinical supervision? RCT of a meta-supervision intervention.

    PubMed

    Gonge, Henrik; Buus, Niels

    2015-04-01

    To test the effects of a meta-supervision intervention in terms of participation, effectiveness and benefits of clinical supervision of psychiatric nursing staff. Clinical supervision is regarded as a central component in developing mental health nursing practices, but the evidence supporting positive outcomes of clinical supervision in psychiatric nursing is not convincing. The study was designed as a randomized controlled trial. All permanently employed nursing staff members at three general psychiatric wards at a Danish university hospital (n = 83) were allocated to either an intervention group (n = 40) receiving the meta-supervision in addition to attending usual supervision or to a control group (n = 43) attending usual supervision. Self-reported questionnaire measures of clinical supervision effectiveness and benefits were collected at base line in January 2012 and at follow-up completed in February 2013. In addition, a prospective registration of clinical supervision participation was carried out over 3 months subsequent to the intervention. The main result was that it was possible to motivate staff in the intervention group to participate significantly more frequently in sessions of the ongoing supervision compared with the control group. However, more frequent participation was not reflected in the experienced effectiveness of the clinical supervision or in the general formative or restorative benefits. The intervention had a positive effect on individuals or wards already actively engaged in clinical supervision, which suggested that individuals and wards without well-established supervision practices may require more comprehensive interventions targeting individual and organizational barriers to clinical supervision. © 2014 John Wiley & Sons Ltd.

  3. Is supervision necessary? Examining the effects of internet-based CBT training with and without supervision.

    PubMed

    Rakovshik, Sarah G; McManus, Freda; Vazquez-Montes, Maria; Muse, Kate; Ougrin, Dennis

    2016-03-01

    To investigate the effect of Internet-based training (IBT), with and without supervision, on therapists' (N = 61) cognitive-behavioral therapy (CBT) skills in routine clinical practice. Participants were randomized into 3 conditions: (1) Internet-based training with use of a consultation worksheet (IBT-CW); (2) Internet-based training with CBT supervision via Skype (IBT-S); and (3) "delayed-training" controls (DTs), who did not receive the training until all data collection was completed. The IBT participants received access to training over a period of 3 months. CBT skills were evaluated at pre-, mid- and posttraining/wait using assessor competence ratings of recorded therapy sessions. Hierarchical linear analysis revealed that the IBT-S participants had significantly greater CBT competence at posttraining than did IBT-CW and DT participants at both the mid- and posttraining/wait assessment points. There were no significant differences between IBT-CW and the delayed (no)-training DTs. IBT programs that include supervision may be a scalable and effective method of disseminating CBT into routine clinical practice, particularly for populations without ready access to more-traditional "live" methods of training. There was no evidence for a significant effect of IBT without supervision over a nontraining control, suggesting that merely providing access to IBT programs may not be an effective method of disseminating CBT to routine clinical practice. (c) 2016 APA, all rights reserved).

  4. Automatic Detection of Seismocardiogram Sensor Misplacement for Robust Pre-Ejection Period Estimation in Unsupervised Settings.

    PubMed

    Ashouri, Hazar; Inan, Omer T

    2017-06-15

    Seismocardiography (SCG), the measurement of the local chest vibrations due to the movements of blood and the heart, is a non-invasive technique for assessing myocardial contractility via the pre-ejection period (PEP). Recently, SCG-based extraction of PEP has been shown to be an effective means of classifying decompensated from compensated heart failure patients, and thus can be potentially used for monitoring such patients at home. Accurate extraction of PEP from SCG signals hinges on lab-based population data (i.e., regression curves) linking particular time-domain features of the SCG signal to corresponding features from reference standard bulky instruments such as impedance cardiography (ICG). Such regression curves, in the case of SCG, have always been estimated based on the "ideal" positioning of the SCG sensor on the chest. However, in settings such as the home where users may position the SCG measurement hardware on the chest without supervision, it is likely that the sensor will not always be placed exactly on this "ideal" location on the sternum, but rather on other positions on the chest as well. In this study, we show for the first time that the regression curve for estimating PEP from SCG signals differs significantly as the position of the sensor changes. We further devise a method to automatically detect when the sensor is placed in any position other than the desired one in order to avoid inaccurate systolic time interval estimation. Our classification algorithm for this purpose resulted in 0.83 precision and 0.82 recall when classifying whether the sensor is placed in the desired position or not. The classifier was tested with heartbeats taken both at rest, and also during exercise recovery to ensure that waveform changes due to positioning could be accurately discriminated from those due to physiological effects.

  5. [Extracting black soil border in Heilongjiang province based on spectral angle match method].

    PubMed

    Zhang, Xin-Le; Zhang, Shu-Wen; Li, Ying; Liu, Huan-Jun

    2009-04-01

    As soils are generally covered by vegetation most time of a year, the spectral reflectance collected by remote sensing technique is from the mixture of soil and vegetation, so the classification precision based on remote sensing (RS) technique is unsatisfied. Under RS and geographic information systems (GIS) environment and with the help of buffer and overlay analysis methods, land use and soil maps were used to derive regions of interest (ROI) for RS supervised classification, which plus MODIS reflectance products were chosen to extract black soil border, with methods including spectral single match. The results showed that the black soil border in Heilongjiang province can be extracted with soil remote sensing method based on MODIS reflectance products, especially in the north part of black soil zone; the classification precision of spectral angel mapping method is the highest, but the classifying accuracy of other soils can not meet the need, because of vegetation covering and similar spectral characteristics; even for the same soil, black soil, the classifying accuracy has obvious spatial heterogeneity, in the north part of black soil zone in Heilongjiang province it is higher than in the south, which is because of spectral differences; as soil uncovering period in Northeastern China is relatively longer, high temporal resolution make MODIS images get the advantage over soil remote sensing classification; with the help of GIS, extracting ROIs by making the best of auxiliary data can improve the precision of soil classification; with the help of auxiliary information, such as topography and climate, the classification accuracy was enhanced significantly. As there are five main factors determining soil classes, much data of different types, such as DEM, terrain factors, climate (temperature, precipitation, etc.), parent material, vegetation map, and remote sensing images, were introduced to classify soils, so how to choose some of the data and quantify the weights of different data layers needs further study.

  6. Supervised spike-timing-dependent plasticity: a spatiotemporal neuronal learning rule for function approximation and decisions.

    PubMed

    Franosch, Jan-Moritz P; Urban, Sebastian; van Hemmen, J Leo

    2013-12-01

    How can an animal learn from experience? How can it train sensors, such as the auditory or tactile system, based on other sensory input such as the visual system? Supervised spike-timing-dependent plasticity (supervised STDP) is a possible answer. Supervised STDP trains one modality using input from another one as "supervisor." Quite complex time-dependent relationships between the senses can be learned. Here we prove that under very general conditions, supervised STDP converges to a stable configuration of synaptic weights leading to a reconstruction of primary sensory input.

  7. Motions and functional performance after supervised physical therapy program versus home-based program after arthroscopic anterior shoulder stabilization: a randomized clinical trial.

    PubMed

    Ismail, M M; El Shorbagy, K M

    2014-01-01

    To compare the effects of a standardized supervised physical therapy versus a controlled home-based programs on the rate of shoulder motion and functional recovery after arthroscopic anterior shoulder stabilization. Twenty-seven patients (18-35years) underwent arthroscopic anterior shoulder stabilization. Patients were randomized into two groups. A supervised group (n=14) received a rehabilitation program, 3 sessions/week for 24 weeks and a controlled home treated group (n=13) who followed a home-based program for same period. Range of motion (ROM) of the shoulder was assessed 4 times after each phase of rehabilitation and function was assessed after the 3rd and 4th phase of rehabilitation. Both groups achieved a significant progressive increase in all shoulder motions throughout the study period. Patients in the supervised group achieved 92.6% and 94.2% of the contralateral side in abduction and forward elevation respectively. The controlled home-based group achieved 87.1% and 94.7% of abduction and forward elevation respectively. For external rotation, the percentage ROM achieved was 81.1% for the supervised group and 76.4% for the controlled home-based group. For function assessment, the two groups showed a significant improvement. However, the two groups were not significantly different from each other in all measured variables. A controlled home-based physical therapy program is as effective as a supervised program in increasing shoulder range of motion and function after arthroscopic anterior shoulder stabilization. Copyright © 2014 Elsevier Masson SAS. All rights reserved.

  8. Genetic Classification of Populations Using Supervised Learning

    PubMed Central

    Bridges, Michael; Heron, Elizabeth A.; O'Dushlaine, Colm; Segurado, Ricardo; Morris, Derek; Corvin, Aiden; Gill, Michael; Pinto, Carlos

    2011-01-01

    There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case–control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large scale genome wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations. are termed unsupervised. Supervised methods, on the other hand are able to utilise this prior knowledge when it is available. In this paper we demonstrate that in such cases modern supervised approaches are a more appropriate tool for detecting genetic differences between populations. We apply two such methods, (neural networks and support vector machines) to the classification of three populations (two from Scotland and one from Bulgaria). The sensitivity exhibited by both these methods is considerably higher than that attained by principal components analysis and in fact comfortably exceeds a recently conjectured theoretical limit on the sensitivity of unsupervised methods. In particular, our methods can distinguish between the two Scottish populations, where principal components analysis cannot. We suggest, on the basis of our results that a supervised learning approach should be the method of choice when classifying individuals into pre-defined populations, particularly in quality control for large scale genome wide association studies. PMID:21589856

  9. Entity recognition in the biomedical domain using a hybrid approach.

    PubMed

    Basaldella, Marco; Furrer, Lenz; Tasso, Carlo; Rinaldi, Fabio

    2017-11-09

    This article describes a high-recall, high-precision approach for the extraction of biomedical entities from scientific articles. The approach uses a two-stage pipeline, combining a dictionary-based entity recognizer with a machine-learning classifier. First, the OGER entity recognizer, which has a bias towards high recall, annotates the terms that appear in selected domain ontologies. Subsequently, the Distiller framework uses this information as a feature for a machine learning algorithm to select the relevant entities only. For this step, we compare two different supervised machine-learning algorithms: Conditional Random Fields and Neural Networks. In an in-domain evaluation using the CRAFT corpus, we test the performance of the combined systems when recognizing chemicals, cell types, cellular components, biological processes, molecular functions, organisms, proteins, and biological sequences. Our best system combines dictionary-based candidate generation with Neural-Network-based filtering. It achieves an overall precision of 86% at a recall of 60% on the named entity recognition task, and a precision of 51% at a recall of 49% on the concept recognition task. These results are to our knowledge the best reported so far in this particular task.

  10. Computer aided detection of tumor and edema in brain FLAIR magnetic resonance image using ANN

    NASA Astrophysics Data System (ADS)

    Pradhan, Nandita; Sinha, A. K.

    2008-03-01

    This paper presents an efficient region based segmentation technique for detecting pathological tissues (Tumor & Edema) of brain using fluid attenuated inversion recovery (FLAIR) magnetic resonance (MR) images. This work segments FLAIR brain images for normal and pathological tissues based on statistical features and wavelet transform coefficients using k-means algorithm. The image is divided into small blocks of 4×4 pixels. The k-means algorithm is used to cluster the image based on the feature vectors of blocks forming different classes representing different regions in the whole image. With the knowledge of the feature vectors of different segmented regions, supervised technique is used to train Artificial Neural Network using fuzzy back propagation algorithm (FBPA). Segmentation for detecting healthy tissues and tumors has been reported by several researchers by using conventional MRI sequences like T1, T2 and PD weighted sequences. This work successfully presents segmentation of healthy and pathological tissues (both Tumors and Edema) using FLAIR images. At the end pseudo coloring of segmented and classified regions are done for better human visualization.

  11. Supervisee Art-Based Disclosure in "El Duende" Process Painting

    ERIC Educational Resources Information Center

    Robb, Megan; Miller, Abbe

    2017-01-01

    Although art-based supervision often leads to supervisee disclosure, little is known about the experience, process, or contributions of such disclosure. We investigated the phenomenon of supervisee disclosure during "El Duende" Process Painting art-based group supervision using a qualitative study. JoHari's Window was used as a grounding…

  12. Active semi-supervised learning method with hybrid deep belief networks.

    PubMed

    Zhou, Shusen; Chen, Qingcai; Wang, Xiaolong

    2014-01-01

    In this paper, we develop a novel semi-supervised learning algorithm called active hybrid deep belief networks (AHD), to address the semi-supervised sentiment classification problem with deep learning. First, we construct the previous several hidden layers using restricted Boltzmann machines (RBM), which can reduce the dimension and abstract the information of the reviews quickly. Second, we construct the following hidden layers using convolutional restricted Boltzmann machines (CRBM), which can abstract the information of reviews effectively. Third, the constructed deep architecture is fine-tuned by gradient-descent based supervised learning with an exponential loss function. Finally, active learning method is combined based on the proposed deep architecture. We did several experiments on five sentiment classification datasets, and show that AHD is competitive with previous semi-supervised learning algorithm. Experiments are also conducted to verify the effectiveness of our proposed method with different number of labeled reviews and unlabeled reviews respectively.

  13. Safe semi-supervised learning based on weighted likelihood.

    PubMed

    Kawakita, Masanori; Takeuchi, Jun'ichi

    2014-05-01

    We are interested in developing a safe semi-supervised learning that works in any situation. Semi-supervised learning postulates that n(') unlabeled data are available in addition to n labeled data. However, almost all of the previous semi-supervised methods require additional assumptions (not only unlabeled data) to make improvements on supervised learning. If such assumptions are not met, then the methods possibly perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) assumes a very limited situation, i.e., classification, discrete covariates, n(')→∞ and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wide range of situations as long as n≤n('). Further, we give a geometrical interpretation of the proof of safety through the relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n(')

  14. An iterated Laplacian based semi-supervised dimensionality reduction for classification of breast cancer on ultrasound images.

    PubMed

    Liu, Xiao; Shi, Jun; Zhou, Shichong; Lu, Minhua

    2014-01-01

    The dimensionality reduction is an important step in ultrasound image based computer-aided diagnosis (CAD) for breast cancer. A newly proposed l2,1 regularized correntropy algorithm for robust feature selection (CRFS) has achieved good performance for noise corrupted data. Therefore, it has the potential to reduce the dimensions of ultrasound image features. However, in clinical practice, the collection of labeled instances is usually expensive and time costing, while it is relatively easy to acquire the unlabeled or undetermined instances. Therefore, the semi-supervised learning is very suitable for clinical CAD. The iterated Laplacian regularization (Iter-LR) is a new regularization method, which has been proved to outperform the traditional graph Laplacian regularization in semi-supervised classification and ranking. In this study, to augment the classification accuracy of the breast ultrasound CAD based on texture feature, we propose an Iter-LR-based semi-supervised CRFS (Iter-LR-CRFS) algorithm, and then apply it to reduce the feature dimensions of ultrasound images for breast CAD. We compared the Iter-LR-CRFS with LR-CRFS, original supervised CRFS, and principal component analysis. The experimental results indicate that the proposed Iter-LR-CRFS significantly outperforms all other algorithms.

  15. Self-organizing map (SOM) of space acceleration measurement system (SAMS) data.

    PubMed

    Sinha, A; Smith, A D

    1999-01-01

    In this paper, space acceleration measurement system (SAMS) data have been classified using self-organizing map (SOM) networks without any supervision; i.e., no a priori knowledge is assumed regarding input patterns belonging to a certain class. Input patterns are created on the basis of power spectral densities of SAMS data. Results for SAMS data from STS-50 and STS-57 missions are presented. Following issues are discussed in details: impact of number of neurons, global ordering of SOM weight vectors, effectiveness of a SOM in data classification, and effects of shifting time windows in the generation of input patterns. The concept of 'cascade of SOM networks' is also developed and tested. It has been found that a SOM network can successfully classify SAMS data obtained during STS-50 and STS-57 missions.

  16. Self-organizing map (SOM) of space acceleration measurement system (SAMS) data

    NASA Technical Reports Server (NTRS)

    Sinha, A.; Smith, A. D.

    1999-01-01

    In this paper, space acceleration measurement system (SAMS) data have been classified using self-organizing map (SOM) networks without any supervision; i.e., no a priori knowledge is assumed regarding input patterns belonging to a certain class. Input patterns are created on the basis of power spectral densities of SAMS data. Results for SAMS data from STS-50 and STS-57 missions are presented. Following issues are discussed in details: impact of number of neurons, global ordering of SOM weight vectors, effectiveness of a SOM in data classification, and effects of shifting time windows in the generation of input patterns. The concept of 'cascade of SOM networks' is also developed and tested. It has been found that a SOM network can successfully classify SAMS data obtained during STS-50 and STS-57 missions.

  17. Automatic evidence quality prediction to support evidence-based decision making.

    PubMed

    Sarker, Abeed; Mollá, Diego; Paris, Cécile

    2015-06-01

    Evidence-based medicine practice requires practitioners to obtain the best available medical evidence, and appraise the quality of the evidence when making clinical decisions. Primarily due to the plethora of electronically available data from the medical literature, the manual appraisal of the quality of evidence is a time-consuming process. We present a fully automatic approach for predicting the quality of medical evidence in order to aid practitioners at point-of-care. Our approach extracts relevant information from medical article abstracts and utilises data from a specialised corpus to apply supervised machine learning for the prediction of the quality grades. Following an in-depth analysis of the usefulness of features (e.g., publication types of articles), they are extracted from the text via rule-based approaches and from the meta-data associated with the articles, and then applied in the supervised classification model. We propose the use of a highly scalable and portable approach using a sequence of high precision classifiers, and introduce a simple evaluation metric called average error distance (AED) that simplifies the comparison of systems. We also perform elaborate human evaluations to compare the performance of our system against human judgments. We test and evaluate our approaches on a publicly available, specialised, annotated corpus containing 1132 evidence-based recommendations. Our rule-based approach performs exceptionally well at the automatic extraction of publication types of articles, with F-scores of up to 0.99 for high-quality publication types. For evidence quality classification, our approach obtains an accuracy of 63.84% and an AED of 0.271. The human evaluations show that the performance of our system, in terms of AED and accuracy, is comparable to the performance of humans on the same data. The experiments suggest that our structured text classification framework achieves evaluation results comparable to those of human performance. Our overall classification approach and evaluation technique are also highly portable and can be used for various evidence grading scales. Copyright © 2015 Elsevier B.V. All rights reserved.

  18. Facilitating the learning process in design-based learning practices: an investigation of teachers' actions in supervising students

    NASA Astrophysics Data System (ADS)

    Gómez Puente, S. M.; van Eijck, M.; Jochems, W.

    2013-11-01

    Background: In research on design-based learning (DBL), inadequate attention is paid to the role the teacher plays in supervising students in gathering and applying knowledge to design artifacts, systems, and innovative solutions in higher education. Purpose: In this study, we examine whether teacher actions we previously identified in the DBL literature as important in facilitating learning processes and student supervision are present in current DBL engineering practices. Sample: The sample (N=16) consisted of teachers and supervisors in two engineering study programs at a university of technology: mechanical and electrical engineering. We selected randomly teachers from freshman and second-year bachelor DBL projects responsible for student supervision and assessment. Design and method: Interviews with teachers, and interviews and observations of supervisors were used to examine how supervision and facilitation actions are applied according to the DBL framework. Results: Major findings indicate that formulating questions is the most common practice seen in facilitating learning in open-ended engineering design environments. Furthermore, other DBL actions we expected to see based upon the literature were seldom observed in the coaching practices within these two programs. Conclusions: Professionalization of teachers in supervising students need to include methods to scaffold learning by supporting students in reflecting and in providing formative feedback.

  19. Professional Supervision as Storied Experience: Narrative Analysis Findings for Australian-Based Registered Music Therapists.

    PubMed

    Kennelly, Jeanette D; Baker, Felicity A; Daveson, Barbara A

    2017-03-01

    Limited research exists to inform a music therapist's supervision story from their pre-professional training to their practice as a professional. Evidence is needed to understand the complex nature of supervision experiences and their impact on professional practice. This qualitative study explored the supervisory experiences of Australian-based Registered Music Therapists, according to the: 1) themes that characterize their experiences, 2) influences of the supervisor's professional background, 3) outcomes of supervision, and 4) roles of the employer, the professional music therapy association, and the university in supervision standards and practice. Seven professionals were interviewed for this study. Five stages of narrative analysis were used to create their supervision stories: a life course graph, narrative psychological analysis, component story framework and narrative analysis, analysis of narratives, and final integration of the seven narrative summaries. Findings revealed that supervision practice is influenced by a supervisee's personal and professional needs. A range of supervision models or approaches is recommended, including the access of supervisors from different professional backgrounds to support each stage of learning and development. A quality supervisory experience facilitates shifts in awareness and insight, which results in improved or increased skills, confidence, and accountability of practice. Participants' concern about stakeholders included a limited understanding of the role of the supervisor, a lack of clarity about accountability of supervisory practice, and minimal guidelines, which monitor professional competencies. The benefits of supervision in music therapy depend on the quality of the supervision provided, and clarity about the roles of those involved. Research and guidelines are recommended to target these areas. © the American Music Therapy Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com

  20. Derivation of an occupational exposure limit for an inhalation analgesic methoxyflurane (Penthrox(®)).

    PubMed

    Frangos, John; Mikkonen, Antti; Down, Christin

    2016-10-01

    Methoxyflurane (MOF) a haloether, is an inhalation analgesic agent for emergency relief of pain by self administration in conscious patients with trauma and associated pain. It is administered under supervision of personnel trained in its use. As a consequence of supervised use, intermittent occupational exposure can occur. An occupational exposure limit has not been established for methoxyflurane. Human clinical and toxicity data have been reviewed and used to derive an occupational exposure limit (referred to as a maximum exposure level, MEL) according to modern principles. The data set for methoxyflurane is complex given its historical use as anaesthetic. Distinguishing clinical investigations of adverse health effects following high and prolonged exposure during anaesthesia to assess relatively low and intermittent exposure during occupational exposure requires an evidence based approach to the toxicity assessment and determination of a critical effect and point of departure. The principal target organs are the kidney and the central nervous system and there have been rare reports of hepatotoxicity, too. Methoxyflurane is not genotoxic based on in vitro bacterial mutation and in vivo micronucleus tests and it is not classifiable (IARC) as a carcinogenic hazard to humans. The critical effect chosen for development of a MEL is kidney toxicity. The point of departure (POD) was derived from the concentration response relationship for kidney toxicity using the benchmark dose method. A MEL of 15 ppm (expressed as an 8 h time weighted average (TWA)) was derived. The derived MEL is at least 50 times higher than the mean observed TWA (0.23 ppm) for ambulance workers and medical staff involved in supervising use of Penthrox. In typical treatment environments (ambulances and treatment rooms) that meet ventilation requirements the derived MEL is at least 10 times higher than the modelled TWA (1.5 ppm or less) and the estimated short term peak concentrations are within the MEL. The odour threshold for MOF of 0.13-0.19 ppm indicates that the odour is detectable well below the MEL. Given the above considerations the proposed MEL is health protective. Copyright © 2016 The Author(s). Published by Elsevier Inc. All rights reserved.

Top