Sample records for wrapper-based feature selection

  1. Quantum-enhanced feature selection with forward selection and backward elimination

    NASA Astrophysics Data System (ADS)

    He, Zhimin; Li, Lvzhou; Huang, Zhiming; Situ, Haozhen

    2018-07-01

    Feature selection is a well-known preprocessing technique in machine learning, which can remove irrelevant features to improve the generalization capability of a classifier and reduce training and inference time. However, feature selection is time-consuming, particularly for applications that have thousands of features, such as image retrieval, text mining and microarray data analysis. It is therefore crucial to accelerate the feature selection process. We propose a quantum version of wrapper-based feature selection, which converts a classical feature selection algorithm to its quantum counterpart. It is valuable for machine learning on a quantum computer. In this paper, we focus on two popular kinds of feature selection methods, i.e., wrapper-based forward selection and backward elimination. The proposed feature selection algorithm achieves a quadratic speedup over the classical one.
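
    For orientation, the classical baseline that this record refers to can be sketched as follows. This is a minimal illustration of wrapper-based forward selection, not the authors' quantum algorithm. The helper names (`loo_accuracy`, `forward_selection`) and the choice of a 1-nearest-neighbour classifier scored by leave-one-out accuracy are assumptions made only for this example.

```python
from typing import Callable, List, Sequence

Row = Sequence[float]


def loo_accuracy(X: List[Row], y: List[int], feats: Sequence[int]) -> float:
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the feature indices in `feats` (illustrative scorer)."""
    if not feats:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = -1, float("inf")
        for j, xj in enumerate(X):
            if j == i:
                continue  # hold out sample i
            d = sum((xi[f] - xj[f]) ** 2 for f in feats)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def forward_selection(
    X: List[Row],
    y: List[int],
    k: int,
    score: Callable[[List[Row], List[int], Sequence[int]], float] = loo_accuracy,
) -> List[int]:
    """Greedy wrapper: repeatedly add the feature whose inclusion gives
    the best classifier score, until k features are selected."""
    selected: List[int] = []
    remaining = set(range(len(X[0])))
    while remaining and len(selected) < k:
        # One classifier evaluation per remaining candidate feature.
        best = max(sorted(remaining), key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

    Each greedy step evaluates every remaining candidate once, which is the per-step search cost that an accelerated variant would aim to reduce.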

  2. Wrapper-based selection of genetic features in genome-wide association studies through fast matrix operations

    PubMed Central

    2012-01-01

    Background Through the wealth of information contained within them, genome-wide association studies (GWAS) have the potential to provide researchers with a systematic means of associating genetic variants with a wide variety of disease phenotypes. Due to the limitations of approaches that have analyzed single variants one at a time, it has been proposed that the genetic basis of these disorders could be determined through detailed analysis of the genetic variants themselves and in conjunction with one another. The construction of models that account for these subsets of variants requires methodologies that generate predictions based on the total risk of a particular group of polymorphisms. However, due to the excessive number of variants, constructing these types of models has so far been computationally infeasible. Results We have implemented an algorithm, known as greedy RLS, that we use to perform the first known wrapper-based feature selection on the genome-wide level. The running time of greedy RLS grows linearly in the number of training examples, the number of features in the original data set, and the number of selected features. This speed is achieved through computational short-cuts based on matrix calculus. Since the memory consumption in present-day computers can form an even tighter bottleneck than running time, we also developed a space efficient variation of greedy RLS which trades running time for memory. These approaches are then compared to traditional wrapper-based feature selection implementations based on support vector machines (SVM) to reveal the relative speed-up and to assess the feasibility of the new algorithm. As a proof of concept, we apply greedy RLS to the Hypertension – UK National Blood Service WTCCC dataset and select the most predictive variants using 3-fold external cross-validation in less than 26 minutes on a high-end desktop. 
On this dataset, we also show that greedy RLS has a better classification performance on independent test data than a classifier trained using features selected by a statistical p-value-based filter, which is currently the most popular approach for constructing predictive models in GWAS. Conclusions Greedy RLS is the first known implementation of a machine learning based method with the capability to conduct a wrapper-based feature selection on an entire GWAS containing several thousand examples and over 400,000 variants. In our experiments, greedy RLS selected a highly predictive subset of genetic variants in a fraction of the time spent by wrapper-based selection methods used together with SVM classifiers. The proposed algorithms are freely available as part of the RLScore software library at http://users.utu.fi/aatapa/RLScore/. PMID:22551170
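
    The kind of computational shortcut mentioned in this record can be illustrated with the standard leave-one-out identity for regularized least squares. The sketch below covers only the simplest case (a single feature, no intercept) and is not the RLScore implementation; the function names and the regularization parameter `lam` are assumptions for illustration. For that case the held-out residual equals (y_i - w*x_i) / (1 - h_i), where h_i = x_i^2 / (sum_j x_j^2 + lam), so no refitting is needed.

```python
from typing import List


def ridge_loo_residuals(x: List[float], y: List[float], lam: float) -> List[float]:
    """Leave-one-out residuals of one-feature ridge regression (no intercept),
    computed from a single fit via the hat-matrix diagonal. Requires lam > 0."""
    a = sum(v * v for v in x) + lam
    w = sum(xv * yv for xv, yv in zip(x, y)) / a
    return [(yv - w * xv) / (1.0 - xv * xv / a) for xv, yv in zip(x, y)]


def ridge_loo_residuals_naive(x: List[float], y: List[float], lam: float) -> List[float]:
    """Reference version: refit the model n times, once per held-out sample."""
    out = []
    for i in range(len(x)):
        xs, ys = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        w = sum(a * b for a, b in zip(xs, ys)) / (sum(a * a for a in xs) + lam)
        out.append(y[i] - w * x[i])
    return out
```

    The shortcut costs time linear in the number of examples per candidate feature, whereas the naive version is quadratic; this is the sort of saving that makes scoring many candidate features feasible.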

  3. Diagnosis of Chronic Kidney Disease Based on Support Vector Machine by Feature Selection Methods.

    PubMed

    Polat, Huseyin; Danaei Mehr, Homay; Cetin, Aydin

    2017-04-01

    As Chronic Kidney Disease progresses slowly, early detection and effective treatment are the only way to reduce the mortality rate. Machine learning techniques are gaining significance in medical diagnosis because of their classification ability with high accuracy rates. The accuracy of classification algorithms depends on the use of correct feature selection algorithms to reduce the dimension of datasets. In this study, the Support Vector Machine classification algorithm was used to diagnose Chronic Kidney Disease. Two essential types of feature selection methods, namely wrapper and filter approaches, were chosen to reduce the dimension of the Chronic Kidney Disease dataset. In the wrapper approach, the classifier subset evaluator with the greedy stepwise search engine and the wrapper subset evaluator with the Best First search engine were used. In the filter approach, the correlation feature selection subset evaluator with the greedy stepwise search engine and the filtered subset evaluator with the Best First search engine were used. The results showed that the Support Vector Machine classifier using the filtered subset evaluator with the Best First search engine has a higher accuracy rate (98.5%) in the diagnosis of Chronic Kidney Disease than the other selected methods.

  4. A hybrid feature selection method using multiclass SVM for diagnosis of erythemato-squamous disease

    NASA Astrophysics Data System (ADS)

    Maryam; Setiawan, Noor Akhmad; Wahyunggoro, Oyas

    2017-08-01

    Erythemato-squamous disease is a complex diagnostic problem and is difficult to detect in dermatology. Besides that, it is a major cause of skin cancer. Data mining in the medical field helps experts to diagnose precisely, accurately, and inexpensively. In this research, we use a data mining technique to develop a diagnosis model based on multiclass SVM with a novel hybrid feature selection method to diagnose erythemato-squamous disease. Our hybrid feature selection method, named ChiGA (Chi Square and Genetic Algorithm), combines the advantages of filter and wrapper methods to select the optimal feature subset from the original features. Chi square is used as the filter method to remove redundant features, and GA is used as the wrapper method to select the ideal feature subset, with SVM as the classifier. Experiments were performed with 10-fold cross-validation on the erythemato-squamous diseases dataset taken from the University of California Irvine (UCI) machine learning database. The experimental results show that the proposed model based on multiclass SVM with Chi Square and GA yields an optimum feature subset of 18 features with 99.18% accuracy.

  5. An improved wrapper-based feature selection method for machinery fault diagnosis

    PubMed Central

    2017-01-01

    A major issue of machinery fault diagnosis using vibration signals is that it is over-reliant on personnel knowledge and experience in interpreting the signal. Thus, machine learning has been adapted for machinery fault diagnosis. The quantity and quality of the input features, however, influence the fault classification performance. Feature selection plays a vital role in selecting the most representative feature subset for the machine learning algorithm. In contrast, the trade-off relationship between capability when selecting the best feature subset and computational effort is inevitable in the wrapper-based feature selection (WFS) method. This paper proposes an improved WFS technique before integration with a support vector machine (SVM) model classifier as a complete fault diagnosis system for a rolling element bearing case study. The bearing vibration dataset made available by the Case Western Reserve University Bearing Data Centre was executed using the proposed WFS and its performance has been analysed and discussed. The results reveal that the proposed WFS secures the best feature subset with a lower computational effort by eliminating the redundancy of re-evaluation. The proposed WFS has therefore been found to be capable and efficient to carry out feature selection tasks. PMID:29261689
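
    The abstract does not say how re-evaluation is avoided. One common way to avoid scoring the same feature subset twice in a wrapper search is to cache scores by subset, sketched below. The class name `CachedEvaluator` and its interface are hypothetical and are not taken from the paper.

```python
from typing import Callable, Dict, FrozenSet, Iterable


class CachedEvaluator:
    """Wraps a subset-scoring function so that each distinct feature subset
    is scored at most once, regardless of the order of its elements."""

    def __init__(self, score_fn: Callable[[FrozenSet[int]], float]) -> None:
        self._score_fn = score_fn
        self._cache: Dict[FrozenSet[int], float] = {}
        self.evaluations = 0  # number of real (uncached) evaluations

    def __call__(self, subset: Iterable[int]) -> float:
        key = frozenset(subset)  # order-independent cache key
        if key not in self._cache:
            self.evaluations += 1
            self._cache[key] = self._score_fn(key)
        return self._cache[key]
```

    In a wrapper search, `score_fn` would train and validate the classifier (an SVM in this record), so every cache hit saves one full training run.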

  6. Hybrid feature selection for supporting lightweight intrusion detection systems

    NASA Astrophysics Data System (ADS)

    Song, Jianglong; Zhao, Wentao; Liu, Qiang; Wang, Xin

    2017-08-01

    Redundant and irrelevant features not only cause high resource consumption but also degrade the performance of Intrusion Detection Systems (IDS), especially when coping with big data. These features slow down the process of training and testing in network traffic classification. Therefore, a hybrid feature selection approach combining wrapper and filter selection is designed in this paper to build a lightweight intrusion detection system. Two main phases are involved in this method. The first phase conducts a preliminary search for an optimal subset of features, in which chi-square feature selection is utilized. The set of features selected in the first phase is further refined in the second phase in a wrapper manner, in which Random Forest (RF) is used to guide the selection process and retain an optimized set of features. After that, we build an RF-based detection model and make a fair comparison with other approaches. The experimental results on NSL-KDD datasets show that our approach results in higher detection accuracy as well as faster training and testing processes.
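
    The first (filter) phase described here can be sketched as a chi-square ranking of categorical features against the class label. This is a generic illustration under stated assumptions, not the authors' code: the function names and the cutoff `m` are hypothetical, and the second, Random Forest-guided wrapper phase is omitted.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def chi_square(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Pearson chi-square statistic of the feature-by-class contingency table."""
    n = len(labels)
    f_counts = Counter(feature)
    c_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    stat = 0.0
    for f, nf in f_counts.items():
        for c, nc in c_counts.items():
            expected = nf * nc / n  # count expected under independence
            observed = joint.get((f, c), 0)
            stat += (observed - expected) ** 2 / expected
    return stat


def top_m_by_chi_square(
    columns: List[Sequence[Hashable]], labels: Sequence[Hashable], m: int
) -> List[int]:
    """Indices of the m feature columns with the largest chi-square score."""
    scores = [chi_square(col, labels) for col in columns]
    order = sorted(range(len(columns)), key=lambda j: -scores[j])
    return order[:m]
```

    The indices returned by `top_m_by_chi_square` would then form the reduced search space handed to the wrapper phase.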

  7. Enhancement web proxy cache performance using Wrapper Feature Selection methods with NB and J48

    NASA Astrophysics Data System (ADS)

    Mahmoud Al-Qudah, Dua'a.; Funke Olanrewaju, Rashidah; Wong Azman, Amelia

    2017-11-01

    Web proxy caching reduces response time by storing copies of pages between the client and server sides. If requested pages are cached in the proxy, there is no need to access the server. Due to the limited size and high cost of cache compared to other storage, a cache replacement algorithm is used to determine which page to evict when the cache is full. However, conventional replacement algorithms such as Least Recently Used (LRU), First In First Out (FIFO), Least Frequently Used (LFU), Randomized Policy, etc. may discard important pages just before use. Furthermore, conventional algorithms cannot be well optimized, since deciding which page to evict requires some intelligence. Hence, most researchers propose integrating intelligent classifiers with replacement algorithms to improve replacement performance. This research proposes using automated wrapper feature selection methods to choose the best subset of features that are relevant and that influence classifier prediction accuracy. The results show that the wrapper feature selection methods, namely Best First (BFS), Incremental Wrapper Subset Selection (IWSS) embedded NB, and particle swarm optimization (PSO), reduce the number of features and have a good impact on reducing computation time. Using PSO enhances NB classifier accuracy by 1.1%, 0.43% and 0.22% over using NB with all features, using BFS, and using IWSS-embedded NB, respectively. PSO raises J48 accuracy by 0.03%, 1.91% and 0.04% over using the J48 classifier with all features, using IWSS-embedded NB, and using BFS, respectively. IWSS-embedded NB speeds up the NB and J48 classifiers much more than BFS and PSO do: it reduces the computation time of NB by 0.1383 and the computation time of J48 by 2.998.

  8. Feature selection for the classification of traced neurons.

    PubMed

    López-Cabrera, José D; Lorenzo-Ginori, Juan V

    2018-06-01

    The wide availability of computational tools to calculate the properties of traced neurons has produced many descriptors that allow the automated classification of neurons from these reconstructions. This makes it necessary to eliminate irrelevant features and to select the most appropriate among them, in order to improve the quality of the classification obtained. The dataset used contains a total of 318 traced neurons, classified by human experts into 192 GABAergic interneurons and 126 pyramidal cells. The features were extracted by means of the L-measure software, which is one of the most widely used computational tools in neuroinformatics to quantify traced neurons. We review some current feature selection techniques, such as filter, wrapper, embedded and ensemble methods. The stability of the feature selection methods was measured. For the ensemble methods, several aggregation methods based on different metrics were applied to combine the subsets obtained during the feature selection process. The subsets obtained by applying feature selection methods were evaluated using supervised classifiers, among which Random Forest, C4.5, SVM, Naïve Bayes, Knn, Decision Table and the Logistic classifier were used as classification algorithms. Feature selection methods of the filter, embedded, wrapper and ensemble types were compared, and the subsets returned were tested in classification tasks for different classification algorithms. The L-measure features EucDistanceSD, PathDistanceSD, Branch_pathlengthAve, Branch_pathlengthSD and EucDistanceAve were present in more than 60% of the selected subsets, which provides evidence of their importance in the classification of these neurons. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Feature Selection for Chemical Sensor Arrays Using Mutual Information

    PubMed Central

    Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.

    2014-01-01

    We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
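
    As a rough illustration of the filter criterion described here, the mutual information between one discretized feature and the class label can be estimated from co-occurrence counts. This sketch scores single features only and assumes the sensor readings have already been binned; the study itself selects feature sets, and the function name is an assumption.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def mutual_information(feature: Sequence[Hashable], labels: Sequence[Hashable]) -> float:
    """Plug-in estimate of I(feature; label) in bits from empirical counts."""
    n = len(labels)
    f_counts = Counter(feature)
    c_counts = Counter(labels)
    joint = Counter(zip(feature, labels))
    mi = 0.0
    for (f, c), nfc in joint.items():
        # p(f,c) * log2( p(f,c) / (p(f) * p(c)) ), with counts substituted
        mi += (nfc / n) * log2(nfc * n / (f_counts[f] * c_counts[c]))
    return mi
```

    Unlike a wrapper, this score needs no classifier training, which is the source of the computational efficiency claimed for the filter approach.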

  10. Classification of Alzheimer's disease patients with hippocampal shape wrapper-based feature selection and support vector machine

    NASA Astrophysics Data System (ADS)

    Young, Jonathan; Ridgway, Gerard; Leung, Kelvin; Ourselin, Sebastien

    2012-02-01

    It is well known that hippocampal atrophy is a marker of the onset of Alzheimer's disease (AD) and as a result hippocampal volumetry has been used in a number of studies to provide early diagnosis of AD and predict conversion of mild cognitive impairment patients to AD. However, rates of atrophy are not uniform across the hippocampus, making shape analysis a potentially more accurate biomarker. This study examines the hippocampi from 226 healthy controls, 148 AD patients and 330 MCI patients, obtained from T1-weighted structural MRI images from the ADNI database. The hippocampi are anatomically segmented using the MAPS multi-atlas segmentation method, and the resulting binary images are then processed with SPHARM software to decompose their shapes as a weighted sum of spherical harmonic basis functions. The resulting parameterizations are then used as feature vectors in Support Vector Machine (SVM) classification. A wrapper-based feature selection method was used, as this considers the utility of features in discriminating classes in combination, fully exploiting the multivariate nature of the data and optimizing the selected set of features for the type of classifier that is used. The leave-one-out cross-validated accuracy obtained on training data is 88.6% for classifying AD vs controls and 74% for classifying MCI-converters vs MCI-stable with very compact feature sets, showing that this is a highly promising method. There is currently a considerable fall in accuracy on unseen data, indicating that the feature selection is sensitive to the data used; however, feature ensemble methods may overcome this.

  11. Multisensor-based real-time quality monitoring by means of feature extraction, selection and modeling for Al alloy in arc welding

    NASA Astrophysics Data System (ADS)

    Zhang, Zhifen; Chen, Huabin; Xu, Yanling; Zhong, Jiyong; Lv, Na; Chen, Shanben

    2015-08-01

    Multisensory data fusion-based online welding quality monitoring has gained increasing attention in intelligent welding processes. This paper mainly focuses on the automatic detection of typical welding defects for Al alloy in gas tungsten arc welding (GTAW) by means of analyzing arc spectrum, sound and voltage signals. Based on the developed algorithms in the time and frequency domains, 41 feature parameters were successively extracted from these signals to characterize the welding process and seam quality. Then, the proposed feature selection approach, i.e., a hybrid Fisher-based filter and wrapper, was successfully utilized to evaluate the sensitivity of each feature and reduce the feature dimensions. Finally, the optimal feature subset with 19 features was selected to obtain the highest accuracy, i.e., 94.72%, using the established classification model. This study provides a guideline for feature extraction, selection and dynamic modeling based on heterogeneous multisensory data to achieve a reliable online defect detection system in arc welding.

  12. Feature selection methods for big data bioinformatics: A survey from the search perspective.

    PubMed

    Wang, Lipo; Wang, Yaoli; Chang, Qing

    2016-12-01

    This paper surveys main principles of feature selection and their recent applications in big data bioinformatics. Instead of the commonly used categorization into filter, wrapper, and embedded approaches to feature selection, we formulate feature selection as a combinatorial optimization or search problem and categorize feature selection methods into exhaustive search, heuristic search, and hybrid methods, where heuristic search methods may further be categorized into those with or without data-distilled feature ranking measures. Copyright © 2016 Elsevier Inc. All rights reserved.

  13. Feature selection using probabilistic prediction of support vector regression.

    PubMed

    Yang, Jian-Bo; Ong, Chong-Jin

    2011-06-01

    This paper presents a new wrapper-based feature selection method for support vector regression (SVR) using its probabilistic predictions. The method computes the importance of a feature by aggregating the difference, over the feature space, of the conditional density functions of the SVR prediction with and without the feature. As the exact computation of this importance measure is expensive, two approximations are proposed. The effectiveness of the measure using these approximations, in comparison to several other existing feature selection methods for SVR, is evaluated on both artificial and real-world problems. The results of the experiments show that the proposed method generally performs better than, or at least as well as, the existing methods, with a notable advantage when the dataset is sparse.

  14. A proposed framework on hybrid feature selection techniques for handling high dimensional educational data

    NASA Astrophysics Data System (ADS)

    Shahiri, Amirah Mohamed; Husain, Wahidah; Rashid, Nur'Aini Abd

    2017-10-01

    Huge amounts of data in educational datasets may cause problems in producing quality data. Recently, data mining approaches have been increasingly used by educational data mining researchers to analyze data patterns. However, many research studies have concentrated on selecting suitable learning algorithms instead of performing feature selection. As a result, these data suffer from computational complexity and require longer computational time for classification. The main objective of this research is to provide an overview of feature selection techniques that have been used to analyze the most significant features. Then, this research will propose a framework to improve the quality of students' datasets. The proposed framework uses filter- and wrapper-based techniques to support the prediction process in future studies.

  15. Efficient feature selection using a hybrid algorithm for the task of epileptic seizure detection

    NASA Astrophysics Data System (ADS)

    Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline

    2014-07-01

    Feature selection is a very important aspect in the field of machine learning. It entails the search for an optimal subset from a very large data set with a high-dimensional feature space. Apart from eliminating redundant features and reducing computational cost, a good selection of features also leads to higher prediction and classification accuracy. In this paper, an efficient feature selection technique is introduced for the task of epileptic seizure detection. The raw data are electroencephalography (EEG) signals. Using the discrete wavelet transform, the biomedical signals were decomposed into several sets of wavelet coefficients. To reduce the dimension of these wavelet coefficients, a feature selection method that combines the strengths of both filter and wrapper methods is proposed. Principal component analysis (PCA) is used as part of the filter method. As for the wrapper method, the evolutionary harmony search (HS) algorithm is employed. This metaheuristic method aims at finding the best discriminating set of features from the original data. The obtained features were then used as input for an automated classifier, namely wavelet neural networks (WNNs). The WNNs model was trained to perform a binary classification task, that is, to determine whether a given EEG signal was normal or epileptic. For comparison purposes, different sets of features were also used as input. Simulation results showed that the WNNs that used the features chosen by the hybrid algorithm achieved the highest overall classification accuracy.

  16. Prediction of protein-protein interactions based on PseAA composition and hybrid feature selection.

    PubMed

    Liu, Liang; Cai, Yudong; Lu, Wencong; Feng, Kaiyan; Peng, Chunrong; Niu, Bing

    2009-03-06

    Based on pseudo amino acid (PseAA) composition and a novel hybrid feature selection framework, this paper presents a computational system to predict PPIs (protein-protein interactions) using 8796 protein pairs. These pairs are coded by PseAA composition, resulting in 114 features. A hybrid feature selection system, mRMR-KNNs-wrapper, is applied to obtain an optimized feature set by excluding poorly performing and/or redundant features, resulting in 103 remaining features. Using the optimized 103-feature subset, a prediction model is trained and tested in the k-nearest neighbors (KNNs) learning system. This prediction model achieves an overall prediction accuracy of 76.18%, evaluated by a 10-fold cross-validation test, which is 1.46% higher than using the initial 114 features and 6.51% higher than using the 20 features coded by amino acid compositions. The PPIs predictor developed for this research is available for public use at http://chemdata.shu.edu.cn/ppi.

  17. The Cross-Entropy Based Multi-Filter Ensemble Method for Gene Selection.

    PubMed

    Sun, Yingqiang; Lu, Chengbo; Li, Xiaobo

    2018-05-17

    The gene expression profile has the characteristics of a high dimension, low sample, and continuous type, and it is a great challenge to use gene expression profile data for the classification of tumor samples. This paper proposes a cross-entropy based multi-filter ensemble (CEMFE) method for microarray data classification. Firstly, multiple filters are used to select the microarray data in order to obtain a plurality of the pre-selected feature subsets with a different classification ability. The top N genes with the highest rank of each subset are integrated so as to form a new data set. Secondly, the cross-entropy algorithm is used to remove the redundant data in the data set. Finally, the wrapper method, which is based on forward feature selection, is used to select the best feature subset. The experimental results show that the proposed method is more efficient than other gene selection methods and that it can achieve a higher classification accuracy under fewer characteristic genes.

  18. A bootstrap based Neyman-Pearson test for identifying variable importance.

    PubMed

    Ditzler, Gregory; Polikar, Robi; Rosen, Gail

    2015-04-01

    Selection of the most informative features that lead to a small loss on future data is arguably one of the most important steps in classification, data analysis and model selection. Several feature selection (FS) algorithms are available; however, due to noise present in any data set, FS algorithms are typically accompanied by an appropriate cross-validation scheme. In this brief, we propose a statistical hypothesis test derived from the Neyman-Pearson lemma for determining if a feature is statistically relevant. The proposed approach can be applied as a wrapper to any FS algorithm, regardless of the FS criteria used by that algorithm, to determine whether a feature belongs in the relevant set. Perhaps more importantly, this procedure efficiently determines the number of relevant features given an initial starting point. We provide freely available software implementations of the proposed methodology.

  19. Classification of motor imagery tasks for BCI with multiresolution analysis and multiobjective feature selection.

    PubMed

    Ortega, Julio; Asensio-Cubero, Javier; Gan, John Q; Ortiz, Andrés

    2016-07-15

    Brain-computer interfacing (BCI) applications based on the classification of electroencephalographic (EEG) signals require solving high-dimensional pattern classification problems with such a relatively small number of training patterns that curse-of-dimensionality problems usually arise. Multiresolution analysis (MRA) has useful properties for signal analysis in both the temporal and spectral domains, and has been broadly used in the BCI field. However, MRA usually increases the dimensionality of the input data. Therefore, some approaches to feature selection or feature dimensionality reduction should be considered for improving the performance of MRA-based BCI. This paper investigates feature selection in MRA-based frameworks for BCI. Several wrapper approaches to evolutionary multiobjective feature selection are proposed with different structures of classifiers. They are evaluated by comparison with baseline methods using sparse representation of features or without feature selection. The statistical analysis, by applying the Kolmogorov-Smirnov and Kruskal-Wallis tests to the means of the Kappa values evaluated by using the test patterns in each approach, has demonstrated some advantages of the proposed approaches. In comparison with the baseline MRA approach used in previous studies, the proposed evolutionary multiobjective feature selection approaches provide similar or even better classification performance, with a significant reduction in the number of features that need to be computed.

  20. Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals

    PubMed Central

    Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali

    2015-01-01

    In recent years, many research works have been published using speech-related features for speech emotion recognition; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature. PMID:25799141

  1. Decoding grating orientation from microelectrode array recordings in monkey cortical area V4.

    PubMed

    Manyakov, Nikolay V; Van Hulle, Marc M

    2010-04-01

    We propose an invasive brain-machine interface (BMI) that decodes the orientation of a visual grating from spike train recordings made with a 96-microelectrode array chronically implanted into the prelunate gyrus (area V4) of a rhesus monkey. The orientation is decoded irrespective of the grating's spatial frequency. Since pyramidal cells are less prominent in visual areas, compared to (pre)motor areas, the recordings contain spikes with smaller amplitudes, compared to the noise level. Hence, rather than performing spike decoding, feature selection algorithms are applied to extract the required information for the decoder. Two types of feature selection procedures are compared, filter and wrapper. The wrapper is combined with a linear discriminant analysis classifier, and the filter is followed by a radial-basis function support vector machine classifier. In addition, since we have a multiclass classification problem, different methods for combining pairwise classifiers are compared.

  2. Feature Selection based on Machine Learning in MRIs for Hippocampal Segmentation

    NASA Astrophysics Data System (ADS)

    Tangaro, Sabina; Amoroso, Nicola; Brescia, Massimo; Cavuoti, Stefano; Chincarini, Andrea; Errico, Rosangela; Paolo, Inglese; Longo, Giuseppe; Maglietta, Rosalia; Tateo, Andrea; Riccio, Giuseppe; Bellotti, Roberto

    2015-01-01

    Neurodegenerative diseases are frequently associated with structural changes in the brain. Magnetic resonance imaging (MRI) scans can show these variations and therefore can be used as a supportive feature for a number of neurodegenerative diseases. The hippocampus has been known to be a biomarker for Alzheimer disease and other neurological and psychiatric diseases. However, it requires accurate, robust, and reproducible delineation of hippocampal structures. Fully automatic methods usually take a voxel-based approach: for each voxel, a number of local features are calculated. In this paper, we compared four different techniques for feature selection from a set of 315 features extracted for each voxel: (i) a filter method based on the Kolmogorov-Smirnov test; two wrapper methods, respectively, (ii) sequential forward selection and (iii) sequential backward elimination; and (iv) an embedded method based on the Random Forest classifier, using a set of 10 T1-weighted brain MRIs and testing on an independent set of 25 subjects. The resulting segmentations were compared with manual reference labelling. By using only 23 features for each voxel (sequential backward elimination) we obtained performance comparable to the state of the art with respect to the standard tool FreeSurfer.
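
    Sequential backward elimination, the wrapper variant that produced the 23-feature set in this record, can be sketched generically as follows. The function name, the stopping rule (stop when every removal lowers the score), and the caller-supplied `score` callable are assumptions for illustration; the paper's classifier and criterion are not reproduced here.

```python
from typing import Callable, FrozenSet, List


def backward_elimination(
    n_features: int, score: Callable[[FrozenSet[int]], float]
) -> List[int]:
    """Start from all features and repeatedly drop the one whose removal
    hurts the score least, stopping once any removal would lower it."""
    current = set(range(n_features))
    best_score = score(frozenset(current))
    while len(current) > 1:
        # Candidate whose removal leaves the highest-scoring subset.
        candidate = max(
            sorted(current), key=lambda f: score(frozenset(current - {f}))
        )
        cand_score = score(frozenset(current - {candidate}))
        if cand_score < best_score:
            break  # every removal degrades performance
        current.remove(candidate)
        best_score = cand_score
    return sorted(current)
```

    In practice `score` would train and validate the classifier on the given subset, so each pass costs one model evaluation per remaining feature.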

  3. Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems.

    PubMed

    Lê Cao, Kim-Anh; Boitard, Simon; Besse, Philippe

    2011-06-22

    Variable selection on high throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), becomes inevitable to select relevant information and, therefore, to better characterize diseases or assess genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits. A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework. sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.

  4. A Novel Feature Selection Technique for Text Classification Using Naïve Bayes.

    PubMed

    Dey Sarkar, Subhajit; Goswami, Saptarsi; Agarwal, Aman; Aktar, Javed

    2014-01-01

    With the proliferation of unstructured data, text classification or text categorization has found many applications in topic classification, sentiment analysis, authorship identification, spam detection, and so on. There are many classification algorithms available. Naïve Bayes remains one of the oldest and most popular classifiers. Naïve Bayes is simple to implement and also requires less training data. From the literature review, it is found that naïve Bayes performs poorly compared to other classifiers in text classification. As a result, this makes the naïve Bayes classifier unusable in spite of the simplicity and intuitiveness of the model. In this paper, we propose a two-step feature selection method based first on univariate feature selection and then on feature clustering, where we use the univariate feature selection method to reduce the search space and then apply clustering to select relatively independent feature sets. We demonstrate the effectiveness of our method by a thorough evaluation and comparison over 13 datasets. The performance improvement thus achieved makes naïve Bayes comparable or superior to other classifiers. The proposed algorithm is shown to outperform other traditional methods like greedy search based wrapper or CFS.

  5. Image steganalysis using Artificial Bee Colony algorithm

    NASA Astrophysics Data System (ADS)

    Sajedi, Hedieh

    2017-09-01

    Steganography is the science of secure communication where the presence of the communication cannot be detected, while steganalysis is the art of discovering the existence of the secret communication. Processing a huge amount of information usually takes extensive execution time and computational resources. A preprocessing phase is therefore needed to moderate both. In this paper, we propose a new feature-based blind steganalysis method for detecting stego images from the cover (clean) images with JPEG format. In this regard, we present a feature selection technique based on an improved Artificial Bee Colony (ABC). ABC algorithm is inspired by honeybees' social behaviour in their search for perfect food sources. In the proposed method, classifier performance and the dimension of the selected feature vector depend on using wrapper-based methods. The experiments are performed using two large data-sets of JPEG images. Experimental results demonstrate the effectiveness of the proposed steganalysis technique compared to the other existing techniques.

  6. Adversarial Feature Selection Against Evasion Attacks.

    PubMed

    Zhang, Fei; Chan, Patrick P K; Biggio, Battista; Yeung, Daniel S; Roli, Fabio

    2016-03-01

    Pattern recognition and machine learning techniques have been increasingly adopted in adversarial settings such as spam, intrusion, and malware detection, although their security against well-crafted attacks that aim to evade detection by manipulating data at test time has not yet been thoroughly assessed. While previous work has been mainly focused on devising adversary-aware classification algorithms to counter evasion attempts, only few authors have considered the impact of using reduced feature sets on classifier security against the same attacks. An interesting, preliminary result is that classifier security to evasion may be even worsened by the application of feature selection. In this paper, we provide a more detailed investigation of this aspect, shedding some light on the security properties of feature selection against evasion attacks. Inspired by previous work on adversary-aware classifiers, we propose a novel adversary-aware feature selection model that can improve classifier security against evasion attacks, by incorporating specific assumptions on the adversary's data manipulation strategy. We focus on an efficient, wrapper-based implementation of our approach, and experimentally validate its soundness on different application examples, including spam and malware detection.

  7. Computer Based Behavioral Biometric Authentication via Multi-Modal Fusion

    DTIC Science & Technology

    2013-03-01

    the decisions made by each individual modality. Fusion of features is the simple concatenation of feature vectors from multiple modalities to be...of Features BayesNet MDL 330 LibSVM PCA 80 J48 Wrapper Evaluator 11 3.5.3 Ensemble Based Decision Level Fusion. In ensemble learning multiple ...The high fusion percentages validate our hypothesis that by combining features from multiple modalities, classification accuracy can be improved. As

  8. A hybrid gene selection approach for microarray data classification using cellular learning automata and ant colony optimization.

    PubMed

    Vafaee Sharbaf, Fatemeh; Mosafer, Sara; Moattar, Mohammad Hossein

    2016-06-01

    This paper proposes an approach for gene selection in microarray data. The proposed approach consists of a primary filter approach using Fisher criterion which reduces the initial genes and hence the search space and time complexity. Then, a wrapper approach which is based on cellular learning automata (CLA) optimized with ant colony method (ACO) is used to find the set of features which improve the classification accuracy. CLA is applied due to its capability to learn and model complicated relationships. The selected features from the last phase are evaluated using ROC curve, and the smallest feature subset that remains most effective is determined. The classifiers which are evaluated in the proposed framework are K-nearest neighbor, support vector machine, and naïve Bayes. The proposed approach is evaluated on 4 microarray datasets. The evaluations confirm that the proposed approach can find the smallest subset of genes while approaching the maximum accuracy. Copyright © 2016 Elsevier Inc. All rights reserved.
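
    The filter stage described in this abstract can be sketched as a ranking by Fisher score. The abstract does not state the formula, so the sketch below assumes one common two-class form, (mean0 - mean1)^2 / (var0 + var1); the function names and the toy expression values are hypothetical, and the CLA/ACO wrapper stage is not reproduced.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Assumed two-class Fisher criterion: (mean0 - mean1)^2 / (var0 + var1)."""
    a = [v for v, l in zip(values, labels) if l == 0]
    b = [v for v, l in zip(values, labels) if l == 1]
    spread = pvariance(a) + pvariance(b)
    gap = (mean(a) - mean(b)) ** 2
    # A feature with zero within-class variance is treated as perfectly separating.
    return gap / spread if spread > 0 else float("inf")


def top_k_by_fisher(X, y, k):
    """Return the indices of the k highest-scoring features and all scores."""
    columns = list(zip(*X))  # one tuple of values per feature (gene)
    scores = [fisher_score(col, y) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Hypothetical expression matrix: 4 samples (rows) x 3 genes (columns).
X = [[1.0, 5.0, 1.0],
     [2.0, 6.0, 3.0],
     [8.0, 5.0, 4.0],
     [9.0, 6.0, 6.0]]
y = [0, 0, 1, 1]
kept, scores = top_k_by_fisher(X, y, k=2)
print(kept, scores)
```

    Because the filter scores each gene independently of any classifier, it is cheap enough to run on the full gene set before a costlier wrapper search is applied to the survivors.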

  9. Elitist Binary Wolf Search Algorithm for Heuristic Feature Selection in High-Dimensional Bioinformatics Datasets.

    PubMed

    Li, Jinyan; Fong, Simon; Wong, Raymond K; Millham, Richard; Wong, Kelvin K L

    2017-06-28

    Due to the high-dimensional characteristics of dataset, we propose a new method based on the Wolf Search Algorithm (WSA) for optimising the feature selection problem. The proposed approach uses the natural strategy established by Charles Darwin; that is, 'It is not the strongest of the species that survives, but the most adaptable'. This means that in the evolution of a swarm, the elitists are motivated to quickly obtain more and better resources. The memory function helps the proposed method to avoid repeat searches for the worst position in order to enhance the effectiveness of the search, while the binary strategy simplifies the feature selection problem into a similar problem of function optimisation. Furthermore, the wrapper strategy gathers these strengthened wolves with the classifier of extreme learning machine to find a sub-dataset with a reasonable number of features that offers the maximum correctness of global classification models. The experimental results from the six public high-dimensional bioinformatics datasets tested demonstrate that the proposed method can best some of the conventional feature selection methods up to 29% in classification accuracy, and outperform previous WSAs by up to 99.81% in computational time.

  10. A hybrid feature selection algorithm integrating an extreme learning machine for landslide susceptibility modeling of Mt. Woomyeon, South Korea

    NASA Astrophysics Data System (ADS)

    Vasu, Nikhil N.; Lee, Seung-Rae

    2016-06-01

    An ever-increasing trend of extreme rainfall events in South Korea owing to climate change is causing shallow landslides and debris flows in mountains that cover 70% of the total land area of the nation. These catastrophic, gravity-driven processes cost the government several billion KRW (South Korean Won) in losses in addition to fatalities every year. The most common type of landslide observed is the shallow landslide, which occurs at 1-3 m depth, and may mobilize into more catastrophic flow-type landslides. Hence, to predict potential landslide areas, susceptibility maps are developed in a geographical information system (GIS) environment utilizing available morphological, hydrological, geotechnical, and geological data. Landslide susceptibility models were developed using 163 landslide points and an equal number of nonlandslide points in Mt. Woomyeon, Seoul, and 23 landslide conditioning factors. However, because not all of the factors contribute to the determination of the spatial probability for landslide initiation, and a simple filter or wrapper-based approach is not efficient in identifying all of the relevant features, a feedback-loop-based hybrid algorithm was implemented in conjunction with a learning scheme called an extreme learning machine, which is based on a single-layer, feed-forward network. Validation of the constructed susceptibility model was conducted using a testing set of landslide inventory data through a prediction rate curve. The model selected 13 relevant conditioning factors out of the initial 23; and the resulting susceptibility map shows a success rate of 85% and a prediction rate of 89.45%, indicating a good performance, in contrast to the low success and prediction rate of 69.19% and 56.19%, respectively, as obtained using a wrapper technique.

  11. Wrappers for Performance Enhancement and Oblivious Decision Graphs

    DTIC Science & Technology

    1995-09-01

    always select all relevant features. We test different search engines to search the space of feature subsets and introduce compound operators to speed...distinct instances from the original dataset appearing in the test set is thus 0.632m. The 0i accuracy estimate is derived by using bootstrap sample...i for training and the rest of the instances for testing. Given a number b, the number of bootstrap samples, let 0i be the accuracy estimate for
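
    The figure of roughly 0.632 in this excerpt follows from sampling with replacement: when m instances are drawn from a dataset of size m, each instance is included at least once with probability 1 - (1 - 1/m)^m, which approaches 1 - 1/e. The short Python check below is a sketch of that arithmetic only; the function names, the number of trials, and the seed are assumptions and are not taken from the report.

```python
import random


def expected_distinct_fraction(m):
    """Probability that a given instance appears at least once in m draws with replacement."""
    return 1.0 - (1.0 - 1.0 / m) ** m


def simulated_distinct_fraction(m, trials=200, seed=0):
    """Average fraction of distinct instances over repeated bootstrap samples of size m."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        sample = {rng.randrange(m) for _ in range(m)}  # distinct indices drawn
        total += len(sample)
    return total / (trials * m)


analytic = expected_distinct_fraction(1000)
simulated = simulated_distinct_fraction(1000)
print(round(analytic, 3), round(simulated, 3))
```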

  12. An Intervention Study Examining the Effects of Condom Wrapper Graphics and Scent on Condom Use in the Botswana Defence Force

    PubMed Central

    Tran, Bonnie Robin; Thomas, Anne Goldzier; Vaida, Florin; Ditsela, Mooketsi; Phetogo, Robert; Kelapile, David; Haubrich, Richard; Chambers, Christina; Shaffer, Richard

    2014-01-01

    Free condoms provided by the government are often not used by Botswana Defence Force (BDF) personnel due to a perceived unpleasant scent and unattractive wrapper. Formative work with the BDF found that scented condoms and military inspired (camouflage) wrapper graphics were appealing to personnel. A non-randomized intervention study was implemented to determine if condom wrapper graphics and scent improved condom use in the BDF. Four military sites were selected for participation. Two sites in the south received the intervention condom wrapped in a generic wrapper and two sites in the north received the intervention condom wrapped in a military inspired wrapper; intervention condoms were either scented or unscented. 211 male soldiers who ever had sex, aged 18–30 years, and stationed at one of the selected sites consented to participate. Sexual activity and condom use were measured pre- and post-intervention using sexual behavior diaries. A condom use rate (CUR; frequency of protected sex divided by total frequency of sex) was computed for each participant. Mean CURs significantly increased over time (85.7% baseline vs. 94.5% post-intervention). Adjusted odds of condom use over time were higher among participants who received the intervention condom packaged in the military wrapper compared with the generic wrapper. Adjusted odds of condom use were also higher for participants who reported using scented versus unscented condoms. Providing scented condoms and condoms packaged in a military inspired wrapper may help increase condom use and reduce HIV infection among military personnel. PMID:24266459

  13. A three-step approach for the derivation and validation of high-performing predictive models using an operational dataset: congestive heart failure readmission case study.

    PubMed

    AbdelRahman, Samir E; Zhang, Mingyuan; Bray, Bruce E; Kawamoto, Kensaku

    2014-05-27

    The aim of this study was to propose an analytical approach to develop high-performing predictive models for congestive heart failure (CHF) readmission using an operational dataset with incomplete records and changing data over time. Our analytical approach involves three steps: pre-processing, systematic model development, and risk factor analysis. For pre-processing, variables that were absent in >50% of records were removed. Moreover, the dataset was divided into a validation dataset and derivation datasets which were separated into three temporal subsets based on changes to the data over time. For systematic model development, using the different temporal datasets and the remaining explanatory variables, the models were developed by combining the use of various (i) statistical analyses to explore the relationships between the validation and the derivation datasets; (ii) adjustment methods for handling missing values; (iii) classifiers; (iv) feature selection methods; and (v) discretization methods. We then selected the best derivation dataset and the models with the highest predictive performance. For risk factor analysis, factors in the highest-performing predictive models were analyzed and ranked using (i) statistical analyses of the best derivation dataset, (ii) feature rankers, and (iii) a newly developed algorithm to categorize risk factors as being strong, regular, or weak. The analysis dataset consisted of 2,787 CHF hospitalizations at University of Utah Health Care from January 2003 to June 2013. In this study, we used the complete-case analysis and mean-based imputation adjustment methods; the wrapper subset feature selection method; and four ranking strategies based on information gain, gain ratio, symmetrical uncertainty, and wrapper subset feature evaluators. The best-performing models resulted from the use of a complete-case analysis derivation dataset combined with the Class-Attribute Contingency Coefficient discretization method and a voting classifier which averaged the results of multi-nominal logistic regression and voting feature intervals classifiers. Of 42 final model risk factors, discharge disposition, discretized age, and indicators of anemia were the most significant. This model achieved a c-statistic of 86.8%. The proposed three-step analytical approach enhanced predictive model performance for CHF readmissions. It could potentially be leveraged to improve predictive model performance in other areas of clinical medicine.

  14. ROOT.NET: Using ROOT from .NET languages like C# and F#

    NASA Astrophysics Data System (ADS)

    Watts, G.

    2012-12-01

    ROOT.NET provides an interface between Microsoft's Common Language Runtime (CLR) and .NET technology and the ubiquitous particle physics analysis tool, ROOT. ROOT.NET automatically generates a series of efficient wrappers around the ROOT API. Unlike pyROOT, these wrappers are statically typed and so are highly efficient as compared to the Python wrappers. The connection to .NET means that one gains access to the full series of languages developed for the CLR including functional languages like F# (based on OCaml). Many features that make ROOT objects work well in the .NET world are added (properties, IEnumerable interface, LINQ compatibility, etc.). Dynamic languages based on the CLR can be used as well, of course (Python, for example). Additionally it is now possible to access ROOT objects that are unknown to the translation tool. This poster will describe the techniques used to effect this translation, along with performance comparisons, and examples. All described source code is posted on the open source site CodePlex.

  15. Examining Brain Morphometry Associated with Self-Esteem in Young Adults Using Multilevel-ROI-Features-Based Classification Method

    PubMed Central

    Peng, Bo; Lu, Jieru; Saxena, Aditya; Zhou, Zhiyong; Zhang, Tao; Wang, Suhong; Dai, Yakang

    2017-01-01

    Purpose: This study examines self-esteem related brain morphometry on brain magnetic resonance (MR) images using a multilevel-features-based classification method. Method: The multilevel region of interest (ROI) features consist of two types of features: (i) ROI features, which include gray matter volume, white matter volume, cerebrospinal fluid volume, cortical thickness, and cortical surface area, and (ii) similarity features, which are based on similarity calculation of cortical thickness between ROIs. For each feature type, a hybrid feature selection method, comprising filter-based and wrapper-based algorithms, is used to select the most discriminating features. ROI features and similarity features are integrated by using multi-kernel support vector machines (SVMs) with an appropriate weighting factor. Results: The classification performance is improved by using multilevel ROI features with an accuracy of 96.66%, a specificity of 96.62%, and a sensitivity of 95.67%. The most discriminating ROI features that are related to self-esteem are spread over the occipital lobe, frontal lobe, parietal lobe, limbic lobe, temporal lobe, and central region, mainly involving white matter and cortical thickness. The most discriminating similarity features are distributed in both the right and left hemispheres, including the frontal lobe, occipital lobe, limbic lobe, parietal lobe, and central region, which conveys information about structural connections between different brain regions. Conclusion: By using ROI features and similarity features to examine self-esteem related brain morphometry, this paper provides pilot evidence that self-esteem is linked to specific ROIs and structural connections between different brain regions. PMID:28588470

  16. Examining Brain Morphometry Associated with Self-Esteem in Young Adults Using Multilevel-ROI-Features-Based Classification Method.

    PubMed

    Peng, Bo; Lu, Jieru; Saxena, Aditya; Zhou, Zhiyong; Zhang, Tao; Wang, Suhong; Dai, Yakang

    2017-01-01

    Purpose: This study examines self-esteem related brain morphometry on brain magnetic resonance (MR) images using a multilevel-features-based classification method. Method: The multilevel region of interest (ROI) features consist of two types of features: (i) ROI features, which include gray matter volume, white matter volume, cerebrospinal fluid volume, cortical thickness, and cortical surface area, and (ii) similarity features, which are based on similarity calculation of cortical thickness between ROIs. For each feature type, a hybrid feature selection method, comprising filter-based and wrapper-based algorithms, is used to select the most discriminating features. ROI features and similarity features are integrated by using multi-kernel support vector machines (SVMs) with an appropriate weighting factor. Results: The classification performance is improved by using multilevel ROI features with an accuracy of 96.66%, a specificity of 96.62%, and a sensitivity of 95.67%. The most discriminating ROI features that are related to self-esteem are spread over the occipital lobe, frontal lobe, parietal lobe, limbic lobe, temporal lobe, and central region, mainly involving white matter and cortical thickness. The most discriminating similarity features are distributed in both the right and left hemispheres, including the frontal lobe, occipital lobe, limbic lobe, parietal lobe, and central region, which conveys information about structural connections between different brain regions. Conclusion: By using ROI features and similarity features to examine self-esteem related brain morphometry, this paper provides pilot evidence that self-esteem is linked to specific ROIs and structural connections between different brain regions.

  17. A general procedure to generate models for urban environmental-noise pollution using feature selection and machine learning methods.

    PubMed

    Torija, Antonio J; Ruiz, Diego P

    2015-02-01

    The prediction of environmental noise in urban environments requires the solution of a complex and non-linear problem, since there are complex relationships among the multitude of variables involved in the characterization and modelling of environmental noise and environmental-noise magnitudes. Moreover, the inclusion of the great spatial heterogeneity characteristic of urban environments seems to be essential in order to achieve an accurate environmental-noise prediction in cities. This problem is addressed in this paper, where a procedure based on feature-selection techniques and machine-learning regression methods is proposed and applied to this environmental problem. Three machine-learning regression methods, which are considered very robust in solving non-linear problems, are used to estimate the energy-equivalent sound-pressure level descriptor (LAeq). These three methods are: (i) multilayer perceptron (MLP), (ii) sequential minimal optimisation (SMO), and (iii) Gaussian processes for regression (GPR). In addition, because of the high number of input variables involved in environmental-noise modelling and estimation in urban environments, which make LAeq prediction models quite complex and costly in terms of time and resources for application to real situations, three different techniques are used to approach feature selection or data reduction. The feature-selection techniques used are (i) correlation-based feature-subset selection (CFS) and (ii) wrapper for feature-subset selection (WFS); the data-reduction technique is principal-component analysis (PCA). The subsequent analysis leads to a proposal of different schemes, depending on the needs regarding data collection and accuracy. The use of WFS as the feature-selection technique with the implementation of SMO or GPR as regression algorithm provides the best LAeq estimation (R² = 0.94 and mean absolute error (MAE) = 1.14-1.16 dB(A)). Copyright © 2014 Elsevier B.V. All rights reserved.

  18. Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection.

    PubMed

    Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter Je; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong

    2017-11-01

    Feature selection is essential in the medical area; however, its process becomes complicated with the presence of censoring, which is the unique character of survival analysis. Most survival feature selection methods are based on Cox's proportional hazard model, though machine learning classifiers are preferred. They are less employed in survival analysis due to censoring, which prevents them from being applied directly to survival data. Among the few works that employed machine learning classifiers, partial logistic artificial neural network with auto-relevance determination is a well-known method that deals with censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed which presents a solution to high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients collected from two centers were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model, such as Akaike and Bayesian information criteria and least absolute shrinkage and selection operator, in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients' future follow-up plans.
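
    The two ways of combining classifier outputs mentioned in this abstract, simple majority voting and weighted majority voting, can be sketched as follows. The abstract does not say how the survival-metric weights are computed, so the weights below are hypothetical placeholders, as are the function names and the risk labels.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(predictions).most_common(1)[0][0]


def weighted_majority_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight."""
    totals = {}
    for label, weight in zip(predictions, weights):
        totals[label] = totals.get(label, 0.0) + weight
    return max(totals, key=totals.get)


# Hypothetical outputs of three classifiers (e.g. SVM, neural network, K-NN) for one patient.
predictions = ["high", "low", "high"]
# Hypothetical per-classifier weights; the study derives its weights from a survival metric.
weights = [0.2, 0.9, 0.3]

simple = majority_vote(predictions)
weighted = weighted_majority_vote(predictions, weights)
print(simple, weighted)
```

    With these placeholder weights the two rules disagree: two of three classifiers say "high", but the single "low" vote carries more total weight, which shows how weighting can let a more trusted classifier override the count.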

  19. Differential prioritization between relevance and redundancy in correlation-based feature selection techniques for multiclass gene expression data.

    PubMed

    Ooi, Chia Huey; Chetty, Madhu; Teng, Shyh Wei

    2006-06-23

    Due to the large number of genes in a typical microarray dataset, feature selection looks set to play an important role in reducing noise and computational cost in gene expression-based tissue classification while improving accuracy at the same time. Surprisingly, this does not appear to be the case for all multiclass microarray datasets. The reason is that many feature selection techniques applied on microarray datasets are either rank-based and hence do not take into account correlations between genes, or are wrapper-based, which require high computational cost, and often yield difficult-to-reproduce results. In studies where correlations between genes are considered, attempts to establish the merit of the proposed techniques are hampered by evaluation procedures which are less than meticulous, resulting in overly optimistic estimates of accuracy. We present two realistically evaluated correlation-based feature selection techniques which incorporate, in addition to the two existing criteria involved in forming a predictor set (relevance and redundancy), a third criterion called the degree of differential prioritization (DDP). DDP functions as a parameter to strike the balance between relevance and redundancy, providing our techniques with the novel ability to differentially prioritize the optimization of relevance against redundancy (and vice versa). This ability proves useful in producing optimal classification accuracy while using reasonably small predictor set sizes for nine well-known multiclass microarray datasets. For multiclass microarray datasets, especially the GCM and NCI60 datasets, DDP enables our filter-based techniques to produce accuracies better than those reported in previous studies which employed similarly realistic evaluation procedures.

  20. QMMMW: A wrapper for QM/MM simulations with QUANTUM ESPRESSO and LAMMPS

    NASA Astrophysics Data System (ADS)

    Ma, Changru; Martin-Samos, Layla; Fabris, Stefano; Laio, Alessandro; Piccinin, Simone

    2015-10-01

    We present QMMMW, a new program aimed at performing Quantum Mechanics/Molecular Mechanics (QM/MM) molecular dynamics. The package operates as a wrapper that patches PWscf code included in the QUANTUM ESPRESSO distribution and LAMMPS Molecular Dynamics Simulator. It is designed with a paradigm based on three guidelines: (i) minimal amount of modifications on the parent codes, (ii) flexibility and computational efficiency of the communication layer and (iii) accuracy of the Hamiltonian describing the interaction between the QM and MM subsystems. These three features are seldom present simultaneously in other implementations of QMMM. The QMMMW project is hosted by qe-forge at

  1. An OMIC biomarker detection algorithm TriVote and its application in methylomic biomarker detection.

    PubMed

    Xu, Cheng; Liu, Jiamei; Yang, Weifeng; Shu, Yayun; Wei, Zhipeng; Zheng, Weiwei; Feng, Xin; Zhou, Fengfeng

    2018-04-01

    Transcriptomic and methylomic patterns represent two major OMIC data sources impacted by both inheritable genetic information and environmental factors, and have been widely used as disease diagnosis and prognosis biomarkers. Modern transcriptomic and methylomic profiling technologies detect the status of tens of thousands or even millions of probing residues in the human genome, and introduce a major computational challenge for the existing feature selection algorithms. This study proposes a three-step feature selection algorithm, TriVote, to detect a subset of transcriptomic or methylomic residues with highly accurate binary classification performance. TriVote outperforms both filter and wrapper feature selection algorithms with both higher classification accuracy and smaller feature number on 17 transcriptomes and two methylomes. Biological functions of the methylome biomarkers detected by TriVote were discussed for their disease associations. An easy-to-use Python package is also released to facilitate the further applications.

  2. Comparing supervised learning methods for classifying sex, age, context and individual Mudi dogs from barking.

    PubMed

    Larrañaga, Ana; Bielza, Concha; Pongrácz, Péter; Faragó, Tamás; Bálint, Anna; Larrañaga, Pedro

    2015-03-01

    Barking is perhaps the most characteristic form of vocalization in dogs; however, very little is known about its role in the intraspecific communication of this species. Besides the obvious need for ethological research, both in the field and in the laboratory, the possible information content of barks can also be explored by computerized acoustic analyses. This study compares four different supervised learning methods (naive Bayes, classification trees, k-nearest neighbors and logistic regression) combined with three strategies for selecting variables (all variables, filter and wrapper feature subset selections) to classify Mudi dogs by sex, age, context and individual from their barks. The classification accuracy of the models obtained was estimated by means of k-fold cross-validation. Percentages of correct classifications were 85.13% for determining sex, 80.25% for predicting age (recodified as young, adult and old), 55.50% for classifying contexts (seven situations) and 67.63% for recognizing individuals (8 dogs), so the results are encouraging. The best-performing method was k-nearest neighbors following a wrapper feature selection approach. The results for classifying contexts and recognizing individual dogs were better with this method than they were for other approaches reported in the specialized literature. This is the first time that the sex and age of domestic dogs have been predicted with the help of sound analysis. This study shows that dog barks carry ample information regarding the caller's indexical features. Our computerized analysis provides indirect proof that barks may serve as an important source of information for dogs as well.
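
    The cross-validation used in this abstract to estimate classification accuracy partitions the samples into folds and holds each fold out once for testing. The sketch below shows only that index bookkeeping; the number of folds, the interleaved assignment of samples to folds, and the function name are assumptions, since the abstract does not give these details.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists so that each sample is tested exactly once."""
    # Interleaved assignment: sample i goes to fold i % k.
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


# Hypothetical example: 10 bark recordings split into 5 folds.
splits = list(k_fold_indices(10, 5))
for train, test in splits:
    print(test, len(train))
```

    In a wrapper setting, the feature search itself would normally be repeated inside each training split so that the held-out fold never influences which features are chosen.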

  3. pyMOOGi - python wrapper for MOOG

    NASA Astrophysics Data System (ADS)

    Adamow, Monika M.

    2017-06-01

    pyMOOGi is a Python wrapper for MOOG. It allows MOOG to be used in the classical, interactive way, but with all graphics handled by Python libraries. Some MOOG features have been redesigned, such as plotting with the abfind driver. New functions have also been added, such as automatic rescaling of the stellar spectrum for the synth driver. pyMOOGi is an open source project.

  4. A novel approach for dimension reduction of microarray.

    PubMed

    Aziz, Rabia; Verma, C K; Srivastava, Namita

    2017-12-01

    This paper proposes a new hybrid search technique for feature (gene) selection (FS), called ICA+ABC, which uses Independent Component Analysis (ICA) and Artificial Bee Colony (ABC) to select informative genes based on a Naïve Bayes (NB) algorithm. An important trait of this technique is the optimization of the ICA feature vector using ABC. ICA+ABC is a hybrid search algorithm that combines the benefits of the extraction approach, to reduce the size of the data, and the wrapper approach, to optimize the reduced feature vectors. The technique is evaluated on six standard gene expression classification datasets. Extensive experiments were conducted to compare the performance of ICA+ABC with the results obtained from the recently published Minimum Redundancy Maximum Relevance (mRMR)+ABC algorithm for the NB classifier. To check how well ICA+ABC works as a feature selection method with the NB classifier, it was also compared against ICA combined with popular filter techniques and with other bio-inspired algorithms such as Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The results show that ICA+ABC has a significant ability to generate small subsets of genes from the ICA feature vector that significantly improve the classification accuracy of the NB classifier compared to other previously suggested methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Classifying vulnerability to sleep deprivation using baseline measures of psychomotor vigilance.

    PubMed

    Patanaik, Amiya; Kwoh, Chee Keong; Chua, Eric C P; Gooley, Joshua J; Chee, Michael W L

    2015-05-01

    To identify measures derived from baseline psychomotor vigilance task (PVT) performance that can reliably predict vulnerability to sleep deprivation. Subjects underwent total sleep deprivation and completed a 10-min PVT every 1-2 h in a controlled laboratory setting. Participants were categorized as vulnerable or resistant to sleep deprivation, based on a median split of lapses that occurred following sleep deprivation. Standard reaction time, drift diffusion model (DDM), and wavelet metrics were derived from PVT response times collected at baseline. A support vector machine model that incorporated maximum relevance and minimum redundancy feature selection and wrapper-based heuristics was used to classify subjects as vulnerable or resistant using rested data. Two academic sleep laboratories. Independent samples of 135 (69 women, age 18 to 25 y), and 45 (3 women, age 22 to 32 y) healthy adults. In both datasets, DDM measures, number of consecutive reaction times that differ by more than 250 ms, and two wavelet features were selected by the model as features predictive of vulnerability to sleep deprivation. Using the best set of features selected in each dataset, classification accuracy was 77% and 82% using fivefold stratified cross-validation, respectively. Despite differences in experimental conditions across studies, drift diffusion model parameters associated reliably with individual differences in performance during total sleep deprivation. These results demonstrate the utility of drift diffusion modeling of baseline performance in estimating vulnerability to psychomotor vigilance decline following sleep deprivation. © 2015 Associated Professional Sleep Societies, LLC.

  6. Multiclass Classification of Cardiac Arrhythmia Using Improved Feature Selection and SVM Invariants.

    PubMed

    Mustaqeem, Anam; Anwar, Syed Muhammad; Majid, Muahammad

    2018-01-01

    Arrhythmia is considered a life-threatening disease causing serious health issues in patients, when left untreated. An early diagnosis of arrhythmias would be helpful in saving lives. This study is conducted to classify patients into one of the sixteen subclasses, among which one class represents absence of disease and the other fifteen classes represent electrocardiogram records of various subtypes of arrhythmias. The research is carried out on the dataset taken from the University of California at Irvine Machine Learning Data Repository. The dataset contains a large volume of feature dimensions which are reduced using wrapper based feature selection technique. For multiclass classification, support vector machine (SVM) based approaches including one-against-one (OAO), one-against-all (OAA), and error-correction code (ECC) are employed to detect the presence and absence of arrhythmias. The SVM method results are compared with other standard machine learning classifiers using varying parameters and the performance of the classifiers is evaluated using accuracy, kappa statistics, and root mean square error. The results show that OAO method of SVM outperforms all other classifiers by achieving an accuracy rate of 81.11% when used with 80/20 data split and 92.07% using 90/10 data split option.

  7. A Pareto-based Ensemble with Feature and Instance Selection for Learning from Multi-Class Imbalanced Datasets.

    PubMed

    Fernández, Alberto; Carmona, Cristobal José; José Del Jesus, María; Herrera, Francisco

    2017-09-01

    Imbalanced classification is related to those problems that have an uneven distribution among classes. In addition to the former, when instances are located into the overlapped areas, the correct modeling of the problem becomes harder. Current solutions for both issues are often focused on the binary case study, as multi-class datasets require an additional effort to be addressed. In this research, we overcome these problems by carrying out a combination between feature and instance selections. Feature selection will allow simplifying the overlapping areas easing the generation of rules to distinguish among the classes. Selection of instances from all classes will address the imbalance itself by finding the most appropriate class distribution for the learning task, as well as possibly removing noise and difficult borderline examples. For the sake of obtaining an optimal joint set of features and instances, we embedded the searching for both parameters in a Multi-Objective Evolutionary Algorithm, using the C4.5 decision tree as baseline classifier in this wrapper approach. The multi-objective scheme allows taking a double advantage: the search space becomes broader, and we may provide a set of different solutions in order to build an ensemble of classifiers. This proposal has been contrasted versus several state-of-the-art solutions on imbalanced classification showing excellent results in both binary and multi-class problems.

  8. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    PubMed

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.

  9. Automatic design of basin-specific drought indexes for highly regulated water systems

    NASA Astrophysics Data System (ADS)

    Zaniolo, Marta; Giuliani, Matteo; Castelletti, Andrea Francesco; Pulido-Velazquez, Manuel

    2018-04-01

    Socio-economic costs of drought are progressively increasing worldwide due to undergoing alterations of hydro-meteorological regimes induced by climate change. Although drought management is largely studied in the literature, traditional drought indexes often fail at detecting critical events in highly regulated systems, where natural water availability is conditioned by the operation of water infrastructures such as dams, diversions, and pumping wells. Here, ad hoc index formulations are usually adopted based on empirical combinations of several, supposed-to-be significant, hydro-meteorological variables. These customized formulations, however, while effective in the design basin, can hardly be generalized and transferred to different contexts. In this study, we contribute FRIDA (FRamework for Index-based Drought Analysis), a novel framework for the automatic design of basin-customized drought indexes. In contrast to ad hoc empirical approaches, FRIDA is fully automated, generalizable, and portable across different basins. FRIDA builds an index representing a surrogate of the drought conditions of the basin, computed by combining all the relevant available information about the water circulating in the system identified by means of a feature extraction algorithm. We used the Wrapper for Quasi-Equally Informative Subset Selection (W-QEISS), which features a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables, and optimizing relevance and redundancy of the subset. The preferred variable subset is selected among the efficient solutions and used to formulate the final index according to alternative model structures. 
We apply FRIDA to the case study of the Jucar river basin (Spain), a drought-prone and highly regulated Mediterranean water resource system, where an advanced drought management plan relying on the formulation of an ad hoc state index is used for triggering drought management measures. The state index was constructed empirically with a trial-and-error process begun in the 1980s and finalized in 2007, guided by the experts from the Confederación Hidrográfica del Júcar (CHJ). Our results show that the automated variable selection outcomes align with CHJ's 25-year-long empirical refinement. In addition, the resultant FRIDA index outperforms the official State Index in terms of accuracy in reproducing the target variable and cardinality of the selected inputs set.
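
    The idea of keeping only Pareto-efficient variable subsets can be illustrated with a minimal sketch. It considers just two of the objectives named above (wrapper accuracy to maximize, subset size to minimize) and replaces the evolutionary search with a plain filter over a handful of candidates. The candidate names, accuracies, and sizes are invented for illustration and are not taken from W-QEISS or the FRIDA study.

```python
def pareto_front(candidates):
    """Keep candidates that no other candidate dominates.

    Each candidate is (name, accuracy, size). Candidate P dominates Q when P
    is at least as accurate and at most as large as Q, and strictly better
    in at least one of the two objectives.
    """
    front = []
    for name, acc, size in candidates:
        dominated = any(
            (a >= acc and s <= size) and (a > acc or s < size)
            for _, a, s in candidates
        )
        if not dominated:
            front.append((name, acc, size))
    return front


# Hypothetical variable subsets: (label, wrapper accuracy, number of variables).
candidates = [
    ("A", 0.80, 2),
    ("B", 0.85, 4),
    ("C", 0.84, 6),  # dominated by B: less accurate and larger
    ("D", 0.90, 9),
    ("E", 0.80, 3),  # dominated by A: same accuracy, larger
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

    A preferred subset would then be picked from the surviving candidates, which is the step the abstract describes as selecting among the efficient solutions.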

  10. Application of machine learning techniques to analyse the effects of physical exercise in ventricular fibrillation.

    PubMed

    Caravaca, Juan; Soria-Olivas, Emilio; Bataller, Manuel; Serrano, Antonio J; Such-Miquel, Luis; Vila-Francés, Joan; Guerrero, Juan F

    2014-02-01

    This work presents the application of machine learning techniques to analyse the influence of physical exercise on the physiological properties of the heart during ventricular fibrillation. To this end, different kinds of classifiers (linear and neural models) are used to classify between trained and sedentary rabbit hearts. Using those classifiers in combination with a wrapper feature selection algorithm makes it possible to extract knowledge about the most relevant features in the problem. The obtained results show that neural models outperform linear classifiers (better performance indices and a better dimensionality reduction). The most relevant features to describe the benefits of physical exercise are those related to myocardial heterogeneity, mean activation rate and activation complexity. © 2013 Published by Elsevier Ltd.

  11. A review of channel selection algorithms for EEG signal processing

    NASA Astrophysics Data System (ADS)

    Alotaiby, Turky; El-Samie, Fathi E. Abd; Alshebeili, Saleh A.; Ahmad, Ishtiaq

    2015-12-01

    Digital processing of electroencephalography (EEG) signals has now been popularly used in a wide variety of applications such as seizure detection/prediction, motor imagery classification, mental task classification, emotion classification, sleep state classification, and drug effects diagnosis. With the large number of EEG channels acquired, it has become apparent that efficient channel selection algorithms are needed with varying importance from one application to another. The main purpose of the channel selection process is threefold: (i) to reduce the computational complexity of any processing task performed on EEG signals by selecting the relevant channels and hence extracting the features of major importance, (ii) to reduce the amount of overfitting that may arise due to the utilization of unnecessary channels, for the purpose of improving the performance, and (iii) to reduce the setup time in some applications. Signal processing tools such as time-domain analysis, power spectral estimation, and wavelet transform have been used for feature extraction and hence for channel selection in most of channel selection algorithms. In addition, different evaluation approaches such as filtering, wrapper, embedded, hybrid, and human-based techniques have been widely used for the evaluation of the selected subset of channels. In this paper, we survey the recent developments in the field of EEG channel selection methods along with their applications and classify these methods according to the evaluation approach.

  12. WE-E-17A-02: Predictive Modeling of Outcome Following SABR for NSCLC Based On Radiomics of FDG-PET Images

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Li, R; Aguilera, T; Shultz, D

    2014-06-15

    Purpose: This study aims to develop predictive models of patient outcome by extracting advanced imaging features (i.e., Radiomics) from FDG-PET images. Methods: We acquired pre-treatment PET scans for 51 stage I NSCLC patients treated with SABR. We calculated 139 quantitative features from each patient PET image, including 5 morphological features, 8 statistical features, 27 texture features, and 100 features from the intensity-volume histogram. Based on the imaging features, we aim to distinguish between 2 risk groups of patients: those with regional failure or distant metastasis versus those without. We investigated 3 pattern classification algorithms: linear discriminant analysis (LDA), naive Bayes (NB), and logistic regression (LR). To avoid the curse of dimensionality, we performed feature selection by first removing redundant features and then applying sequential forward selection using the wrapper approach. To evaluate the predictive performance, we performed 10-fold cross validation with 1000 random splits of the data and calculated the area under the ROC curve (AUC). Results: Feature selection identified 2 texture features (homogeneity and/or wavelet decompositions) for NB and LR, while for LDA SUVmax and one texture feature (correlation) were identified. All 3 classifiers achieved statistically significant improvements over conventional PET imaging metrics such as tumor volume (AUC = 0.668) and SUVmax (AUC = 0.737). Overall, NB achieved the best predictive performance (AUC = 0.806). This also compares favorably with MTV using the best threshold at an SUV of 11.6 (AUC = 0.746). At a sensitivity of 80%, NB achieved 69% specificity, while SUVmax and tumor volume only had 36% and 47% specificity. Conclusion: Through a systematic analysis of advanced PET imaging features, we are able to build models with improved predictive value over conventional imaging metrics. If validated in a large independent cohort, the proposed techniques could potentially aid in identifying patients who might benefit from adjuvant therapy.
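
    Sequential forward selection with a wrapper, as mentioned in the Methods above, can be sketched as follows. This is only an illustration under stated assumptions: it uses a nearest-centroid classifier scored by leave-one-out accuracy instead of the LDA, NB, and LR classifiers and the cross-validated AUC used in the study, and the eight-sample dataset is made up. The function names are hypothetical.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on `features`."""
    correct = 0
    for i in range(len(X)):
        # Class centroids from all samples except the held-out sample i.
        sums, counts = {}, {}
        for j, (row, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(features))
            for k, f in enumerate(features):
                acc[k] += row[f]
            counts[label] = counts.get(label, 0) + 1

        def sq_dist(label):
            return sum((X[i][f] - sums[label][k] / counts[label]) ** 2
                       for k, f in enumerate(features))

        predicted = min(sorted(sums), key=sq_dist)
        correct += predicted == y[i]
    return correct / len(X)


def forward_selection(X, y, score, max_features):
    """Greedily add the feature that most improves the wrapper score."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        # Score every one-feature extension; ties go to the lowest index.
        cand_score, neg_f = max((score(X, y, selected + [f]), -f) for f in remaining)
        if cand_score <= best:
            break  # no candidate improves the wrapper score
        selected.append(-neg_f)
        remaining.remove(-neg_f)
        best = cand_score
    return selected, best


# Toy data (invented): column 1 separates the classes, columns 0 and 2 are noise.
X = [
    [0.1, 1.0, 0.7], [0.9, 1.2, 0.3], [0.4, 0.9, 0.5], [0.6, 1.1, 0.2],
    [0.5, 2.0, 0.6], [0.2, 2.1, 0.4], [0.8, 1.9, 0.1], [0.3, 2.2, 0.8],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

selected, best = forward_selection(X, y, loo_accuracy, max_features=2)
print(selected, best)
```

    The classifier is retrained for every candidate subset, which is what makes this a wrapper rather than a filter and also what makes it expensive as the number of features grows.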

  13. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection

    PubMed Central

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-01-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices. PMID:25177107

  14. Bias and Stability of Single Variable Classifiers for Feature Ranking and Selection.

    PubMed

    Fakhraei, Shobeir; Soltanian-Zadeh, Hamid; Fotouhi, Farshad

    2014-11-01

    Feature rankings are often used for supervised dimension reduction especially when discriminating power of each feature is of interest, dimensionality of dataset is extremely high, or computational power is limited to perform more complicated methods. In practice, it is recommended to start dimension reduction via simple methods such as feature rankings before applying more complex approaches. Single Variable Classifier (SVC) ranking is a feature ranking based on the predictive performance of a classifier built using only a single feature. While benefiting from capabilities of classifiers, this ranking method is not as computationally intensive as wrappers. In this paper, we report the results of an extensive study on the bias and stability of such feature ranking method. We study whether the classifiers influence the SVC rankings or the discriminative power of features themselves has a dominant impact on the final rankings. We show the common intuition of using the same classifier for feature ranking and final classification does not always result in the best prediction performance. We then study if heterogeneous classifiers ensemble approaches provide more unbiased rankings and if they improve final classification performance. Furthermore, we calculate an empirical prediction performance loss for using the same classifier in SVC feature ranking and final classification from the optimal choices.
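
    A Single Variable Classifier ranking of the kind described above can be sketched in a few lines. The sketch assumes a one-dimensional nearest-class-mean classifier scored by leave-one-out accuracy; the paper studies several classifiers, and this particular choice, the function names, and the toy data are illustrative assumptions only.

```python
def single_feature_loo_accuracy(values, labels):
    """Leave-one-out accuracy of a 1-D nearest-class-mean classifier."""
    hits = 0
    for i, (v, true_label) in enumerate(zip(values, labels)):
        # Class means computed without the held-out sample i.
        means = {}
        for label in set(labels):
            rest = [x for j, (x, l) in enumerate(zip(values, labels))
                    if l == label and j != i]
            means[label] = sum(rest) / len(rest)
        predicted = min(sorted(means), key=lambda label: abs(v - means[label]))
        hits += predicted == true_label
    return hits / len(values)


def svc_ranking(X, y):
    """Rank features by the accuracy of a classifier built on each one alone."""
    scores = [single_feature_loo_accuracy([row[f] for row in X], y)
              for f in range(len(X[0]))]
    ranking = sorted(range(len(scores)), key=lambda f: -scores[f])
    return ranking, scores


# Toy data (invented): column 1 separates the classes, columns 0 and 2 are noise.
X = [
    [0.1, 1.0, 0.7], [0.9, 1.2, 0.3], [0.4, 0.9, 0.5], [0.6, 1.1, 0.2],
    [0.5, 2.0, 0.6], [0.2, 2.1, 0.4], [0.8, 1.9, 0.1], [0.3, 2.2, 0.8],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking, scores = svc_ranking(X, y)
print(ranking, scores)
```

    Each feature is scored once and independently, so the cost grows linearly with the number of features, in contrast to a wrapper that retrains on many candidate subsets.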

  15. A multilevel-ROI-features-based machine learning method for detection of morphometric biomarkers in Parkinson's disease.

    PubMed

    Peng, Bo; Wang, Suhong; Zhou, Zhiyong; Liu, Yan; Tong, Baotong; Zhang, Tao; Dai, Yakang

    2017-06-09

    Machine learning methods have been widely used in recent years for detection of neuroimaging biomarkers in regions of interest (ROIs) and assisting diagnosis of neurodegenerative diseases. The innovation of this study is to use a multilevel-ROI-features-based machine learning method to detect sensitive morphometric biomarkers in Parkinson's disease (PD). Specifically, the low-level ROI features (gray matter volume, cortical thickness, etc.) and high-level correlative features (connectivity between ROIs) are integrated to construct the multilevel ROI features. A filter- and wrapper-based feature selection method and a multi-kernel support vector machine (SVM) are used in the classification algorithm. T1-weighted brain magnetic resonance (MR) images of 69 PD patients and 103 normal controls from the Parkinson's Progression Markers Initiative (PPMI) dataset are included in the study. The machine learning method performs well in classification between PD patients and normal controls with an accuracy of 85.78%, a specificity of 87.79%, and a sensitivity of 87.64%. The most sensitive biomarkers between PD patients and normal controls are mainly distributed in the frontal lobe, parietal lobe, limbic lobe, temporal lobe, and central region. The classification performance of our method with multilevel ROI features is significantly improved compared with other classification methods using single-level features. The proposed method shows promising identification ability for detecting morphometric biomarkers in PD, thus confirming the potential of our method in assisting diagnosis of the disease. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Dynamic adaptive learning for decision-making supporting systems

    NASA Astrophysics Data System (ADS)

    He, Haibo; Cao, Yuan; Chen, Sheng; Desai, Sachi; Hohil, Myron E.

    2008-03-01

    This paper proposes a novel adaptive learning method for data mining in support of decision-making systems. Due to the inherent information ambiguity/uncertainty, high dimensionality and noise in many homeland security and defense applications, such as surveillance, monitoring, net-centric battlefield, and others, it is critical to develop autonomous learning methods that efficiently learn useful information from raw data to help the decision-making process. The proposed method is based on a dynamic learning principle in the feature spaces. Generally speaking, conventional approaches to learning from high-dimensional data sets include various feature extraction (principal component analysis, wavelet transform, and others) and feature selection (embedded approach, wrapper approach, filter approach, and others) methods. However, understanding of adaptive learning from different feature spaces remains very limited. We propose an integrative approach that takes advantage of feature selection and hypothesis ensemble techniques to achieve our goal. Based on the training data distributions, a feature score function is used to measure the importance of different features for learning. Then multiple hypotheses are iteratively developed in different feature spaces according to their learning capabilities. Unlike the pre-set iteration steps in many existing ensemble learning approaches, such as the adaptive boosting (AdaBoost) method, the iterative learning process automatically stops when the intelligent system cannot provide a better understanding than a random guess in that particular subset of feature spaces. Finally, a voting algorithm is used to combine all the decisions from different hypotheses to provide the final prediction results. Simulation analyses of the proposed method on classification of different US military aircraft databases show the effectiveness of this method.

  17. Improved data retrieval from TreeBASE via taxonomic and linguistic data enrichment

    PubMed Central

    Anwar, Nadia; Hunt, Ela

    2009-01-01

    Background TreeBASE, the only data repository for phylogenetic studies, is not being used effectively since it does not meet the taxonomic data retrieval requirements of the systematics community. We show, through an examination of the queries performed on TreeBASE, that data retrieval using taxon names is unsatisfactory. Results We report on a new wrapper supporting taxon queries on TreeBASE by utilising a Taxonomy and Classification Database (TCl-Db) we created. TCl-Db holds merged and consolidated taxonomic names from multiple data sources and can be used to translate hierarchical, vernacular and synonym queries into specific query terms in TreeBASE. The query expansion supported by TCl-Db shows very significant information retrieval quality improvement. The wrapper can be accessed at the URL The methodology we developed is scalable and can be applied to new data, as those become available in the future. Conclusion Significantly improved data retrieval quality is shown for all queries, and additional flexibility is achieved via user-driven taxonomy selection. PMID:19426482

  18. Using learning automata to determine proper subset size in high-dimensional spaces

    NASA Astrophysics Data System (ADS)

    Seyyedi, Seyyed Hossein; Minaei-Bidgoli, Behrouz

    2017-03-01

    In this paper, we offer a new method called FSLA (Finding the best candidate Subset using Learning Automata), which combines the filter and wrapper approaches for feature selection in high-dimensional spaces. Considering the difficulties of dimension reduction in high-dimensional spaces, FSLA's multi-objective functionality is to determine, in an efficient manner, a feature subset that leads to an appropriate tradeoff between the learning algorithm's accuracy and efficiency. First, using an existing weighting function, the feature list is sorted and selected subsets of the list of different sizes are considered. Then, a learning automaton verifies the performance of each subset when it is used as the input space of the learning algorithm and estimates its fitness upon the algorithm's accuracy and the subset size, which determines the algorithm's efficiency. Finally, FSLA introduces the fittest subset as the best choice. We tested FSLA in the framework of text classification. The results confirm its promising performance of attaining the identified goal.
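
    The trade-off between accuracy and subset size described above can be sketched as a simple fitness comparison over prefixes of a ranked feature list. This is not FSLA itself: the learning automaton is replaced here by a direct comparison of every candidate size, and the fitness formula, the `size_weight` parameter, and the accuracy figures are all assumptions made up for illustration.

```python
def best_prefix_size(accuracy_by_size, total_features, size_weight=0.2):
    """Pick the prefix size with the best accuracy/size trade-off.

    `accuracy_by_size` maps a prefix size (top-k features of a ranked list)
    to the accuracy a learner reached with that prefix. The fitness below,
    accuracy minus a penalty proportional to the fraction of features kept,
    is an assumed form, not the one used by FSLA.
    """
    def fitness(size):
        return accuracy_by_size[size] - size_weight * size / total_features

    return max(sorted(accuracy_by_size), key=fitness)


# Hypothetical accuracies for the top-10, top-50, top-200 and all 1000 features.
accuracy_by_size = {10: 0.70, 50: 0.86, 200: 0.88, 1000: 0.89}

best = best_prefix_size(accuracy_by_size, total_features=1000)
print(best)
```

    With these invented numbers the larger subsets gain too little accuracy to justify their size, so a mid-sized prefix wins; a different `size_weight` would shift that balance.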

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aghaei, Faranak; Tan, Maxine; Liu, Hong

    Purpose: To identify a new clinical marker based on quantitative kinetic image features analysis and assess its feasibility to predict tumor response to neoadjuvant chemotherapy. Methods: The authors assembled a dataset involving breast MR images acquired from 68 cancer patients before undergoing neoadjuvant chemotherapy. Among them, 25 patients had complete response (CR) and 43 had partial and nonresponse (NR) to chemotherapy based on the response evaluation criteria in solid tumors. The authors developed a computer-aided detection scheme to segment breast areas and tumors depicted on the breast MR images and computed a total of 39 kinetic image features from both tumor and background parenchymal enhancement regions. The authors then applied and tested two approaches to classify between CR and NR cases. The first one analyzed each individual feature and applied a simple feature fusion method that combines classification results from multiple features. The second approach tested an attribute selected classifier that integrates an artificial neural network (ANN) with a wrapper subset evaluator, which was optimized using a leave-one-case-out validation method. Results: In the pool of 39 features, 10 yielded relatively higher classification performance with the areas under receiver operating characteristic curves (AUCs) ranging from 0.61 to 0.78 to classify between CR and NR cases. Using a feature fusion method, the maximum AUC = 0.85 ± 0.05. Using the ANN-based classifier, AUC value significantly increased to 0.96 ± 0.03 (p < 0.01). Conclusions: This study demonstrated that quantitative analysis of kinetic image features computed from breast MR images acquired prechemotherapy has potential to generate a useful clinical marker in predicting tumor response to chemotherapy.

  20. Pharmacokinetic Tumor Heterogeneity as a Prognostic Biomarker for Classifying Breast Cancer Recurrence Risk.

    PubMed

    Mahrooghy, Majid; Ashraf, Ahmed B; Daye, Dania; McDonald, Elizabeth S; Rosen, Mark; Mies, Carolyn; Feldman, Michael; Kontos, Despina

    2015-06-01

    Heterogeneity in cancer can affect response to therapy and patient prognosis. Histologic measures have classically been used to measure heterogeneity, although a reliable noninvasive measurement is needed both to establish baseline risk of recurrence and monitor response to treatment. Here, we propose using spatiotemporal wavelet kinetic features from dynamic contrast-enhanced magnetic resonance imaging to quantify intratumor heterogeneity in breast cancer. Tumor pixels are first partitioned into homogeneous subregions using pharmacokinetic measures. Heterogeneity wavelet kinetic (HetWave) features are then extracted from these partitions to obtain spatiotemporal patterns of the wavelet coefficients and the contrast agent uptake. The HetWave features are evaluated in terms of their prognostic value using a logistic regression classifier with genetic algorithm wrapper-based feature selection to classify breast cancer recurrence risk as determined by a validated gene expression assay. Receiver operating characteristic analysis and area under the curve (AUC) are computed to assess classifier performance using leave-one-out cross validation. The HetWave features outperform other commonly used features (AUC = 0.88 HetWave versus 0.70 standard features). The combination of HetWave and standard features further increases classifier performance (AUCs 0.94). The rate of the spatial frequency pattern over the pharmacokinetic partitions can provide valuable prognostic information. HetWave could be a powerful feature extraction approach for characterizing tumor heterogeneity, providing valuable prognostic information.

  1. Chemical and genetic wrappers for improved phage and RNA display.

    PubMed

    Lamboy, Jorge A; Tam, Phillip Y; Lee, Lucie S; Jackson, Pilgrim J; Avrantinis, Sara K; Lee, Hye J; Corn, Robert M; Weiss, Gregory A

    2008-11-24

    An Achilles heel inherent to all molecular display formats, background binding between target and display system introduces false positives into screens and selections. For example, the negatively charged surfaces of phage, mRNA, and ribosome display systems bind with unacceptably high nonspecificity to positively charged target molecules, which represent an estimated 35% of proteins in the human proteome. Here we report the first systematic attempt to understand why a broad class of molecular display selections fail, and then solve the underlying problem for both phage and RNA display. Firstly, a genetic strategy was used to introduce a short, charge-neutralizing peptide into the solvent-exposed, negatively charged phage coat. The modified phage (KO7(+)) reduced or eliminated nonspecific binding to the problematic high-pI proteins. In the second, chemical approach, nonspecific interactions were blocked by oligolysine wrappers in the cases of phage and total RNA. For phage display applications, the peptides Lys(n) (where n=16 to 24) emerged as optimal for wrapping the phage. Lys(8), however, provided effective wrappers for RNA binding in assays against the RNA binding protein HIV-1 Vif. The oligolysine peptides blocked nonspecific binding to allow successful selections, screens, and assays with five previously unworkable protein targets.

  2. A Learning-Based Wrapper Method to Correct Systematic Errors in Automatic Image Segmentation: Consistently Improved Performance in Hippocampus, Cortex and Brain Segmentation

    PubMed Central

    Wang, Hongzhi; Das, Sandhitsu R.; Suh, Jung Wook; Altinay, Murat; Pluta, John; Craige, Caryne; Avants, Brian; Yushkevich, Paul A.

    2011-01-01

    We propose a simple but generally applicable approach to improving the accuracy of automatic image segmentation algorithms relative to manual segmentations. The approach is based on the hypothesis that a large fraction of the errors produced by automatic segmentation are systematic, i.e., occur consistently from subject to subject, and serves as a wrapper method around a given host segmentation method. The wrapper method attempts to learn the intensity, spatial and contextual patterns associated with systematic segmentation errors produced by the host method on training data for which manual segmentations are available. The method then attempts to correct such errors in segmentations produced by the host method on new images. One practical use of the proposed wrapper method is to adapt existing segmentation tools, without explicit modification, to imaging data and segmentation protocols that are different from those on which the tools were trained and tuned. An open-source implementation of the proposed wrapper method is provided, and can be applied to a wide range of image segmentation problems. The wrapper method is evaluated with four host brain MRI segmentation methods: hippocampus segmentation using FreeSurfer (Fischl et al., 2002); hippocampus segmentation using multi-atlas label fusion (Artaechevarria et al., 2009); brain extraction using BET (Smith, 2002); and brain tissue segmentation using FAST (Zhang et al., 2001). The wrapper method generates 72%, 14%, 29% and 21% fewer erroneously segmented voxels than the respective host segmentation methods. In the hippocampus segmentation experiment with multi-atlas label fusion as the host method, the average Dice overlap between reference segmentations and segmentations produced by the wrapper method is 0.908 for normal controls and 0.893 for patients with mild cognitive impairment. 
Average Dice overlaps of 0.964, 0.905 and 0.951 are obtained for brain extraction, white matter segmentation and gray matter segmentation, respectively. PMID:21237273
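
For readers unfamiliar with the metric, the Dice overlap quoted in this abstract is conventionally defined as 2|A ∩ B| / (|A| + |B|) for a reference mask A and a predicted mask B. The following is a minimal sketch of that definition, assuming masks are supplied as flat sequences of 0/1 voxel labels; the function name `dice_overlap` is illustrative and is not taken from the authors' open-source implementation.

```python
def dice_overlap(reference, predicted):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(reference) != len(predicted):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for r, p in zip(reference, predicted) if r and p)
    # Total foreground voxels across the two masks.
    total = sum(1 for r in reference if r) + sum(1 for p in predicted if p)
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return 2.0 * intersection / total
```

A value of 1.0 means the two segmentations coincide exactly, and 0.0 means they share no foreground voxels.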

  4. BpWrapper: BioPerl-based sequence and tree utilities for rapid prototyping of bioinformatics pipelines.

    PubMed

    Hernández, Yözen; Bernstein, Rocky; Pagan, Pedro; Vargas, Levy; McCaig, William; Ramrattan, Girish; Akther, Saymon; Larracuente, Amanda; Di, Lia; Vieira, Filipe G; Qiu, Wei-Gang

    2018-03-02

    Automated bioinformatics workflows are more robust, easier to maintain, and yield more reproducible results when built with command-line utilities than with custom-coded scripts. Command-line utilities further benefit bioinformatics developers by relieving them of the need to learn, or to interact directly with, biological software libraries. There is, however, a lack of command-line utilities that leverage popular Open Source biological software toolkits such as BioPerl ( http://bioperl.org ) to make many of the well-designed, robust, and routinely used biological classes available for a wider base of end users. Designed as standard utilities for UNIX-family operating systems, BpWrapper makes functionality of some of the most popular BioPerl modules readily accessible on the command line to novice as well as to experienced bioinformatics practitioners. The initial release of BpWrapper includes four utilities with concise command-line user interfaces, bioseq, bioaln, biotree, and biopop, specialized for manipulation of molecular sequences, sequence alignments, phylogenetic trees, and DNA polymorphisms, respectively. Over a hundred methods are currently available as command-line options and new methods are easily incorporated. Performance of BpWrapper utilities lags that of precompiled utilities while being equivalent to that of other utilities based on BioPerl. BpWrapper has been tested on BioPerl Release 1.6, Perl versions 5.10.1 to 5.25.10, and operating systems including Apple macOS, Microsoft Windows, and GNU/Linux. Release code is available from the Comprehensive Perl Archive Network (CPAN) at https://metacpan.org/pod/Bio::BPWrapper . Source code is available on GitHub at https://github.com/bioperl/p5-bpwrapper . BpWrapper improves on existing sequence utilities by following the design principles of Unix text utilities, such as a concise user interface, extensive command-line options, and standard input/output for serialized operations. 
Further, dozens of novel methods for manipulation of sequences, alignments, and phylogenetic trees, unavailable in existing utilities (e.g., EMBOSS, Newick Utilities, and FAST), are provided. Bioinformaticians should find BpWrapper useful for rapid prototyping of workflows on the command-line without creating custom scripts for comparative genomics and other bioinformatics applications.

  5. DARPA Initiative in Concurrent Engineering (DICE). Phase 2

    DTIC Science & Technology

    1990-07-31

    XS spreadsheet tool; Q-Calc spreadsheet tool; TAE+ outer wrapper for XS; Framemaker-based formal EDN (Electronic Design Notebook); Data...shared global object space and object persistence. Technical Results Module Development XS Integration Environment A prototype of the wrapper concepts...for a spreadsheet integration environment, using an X-Windows based extensible Lotus 1-2-3 emulation called XS, and was (initially) targeted for

  6. Sequence Based Prediction of DNA-Binding Proteins Based on Hybrid Feature Selection Using Random Forest and Gaussian Naïve Bayes

    PubMed Central

    Lou, Wangchao; Wang, Xiaoqing; Chen, Fan; Chen, Yixiao; Jiang, Bo; Zhang, Hua

    2014-01-01

    Developing an efficient method for determination of the DNA-binding proteins, due to their vital roles in gene regulation, is becoming highly desired since it would be invaluable to advance our understanding of protein functions. In this study, we proposed a new method for the prediction of the DNA-binding proteins, by performing the feature rank using random forest and the wrapper-based feature selection using forward best-first search strategy. The features comprise information from primary sequence, predicted secondary structure, predicted relative solvent accessibility, and position specific scoring matrix. The proposed method, called DBPPred, used Gaussian naïve Bayes as the underlying classifier since it outperformed five other classifiers, including decision tree, logistic regression, k-nearest neighbor, support vector machine with polynomial kernel, and support vector machine with radial basis function. As a result, the proposed DBPPred yields the highest average accuracy of 0.791 and average MCC of 0.583 according to the five-fold cross validation with ten runs on the training benchmark dataset PDB594. Subsequently, blind tests on the independent dataset PDB186 by the proposed model trained on the entire PDB594 dataset and by other five existing methods (including iDNA-Prot, DNA-Prot, DNAbinder, DNABIND and DBD-Threader) were performed, resulting in that the proposed DBPPred yielded the highest accuracy of 0.769, MCC of 0.538, and AUC of 0.790. The independent tests performed by the proposed DBPPred on completely a large non-DNA binding protein dataset and two RNA binding protein datasets also showed improved or comparable quality when compared with the relevant prediction methods. Moreover, we observed that majority of the selected features by the proposed method are statistically significantly different between the mean feature values of the DNA-binding and the non DNA-binding proteins. 
All of the experimental results indicate that the proposed DBPPred can be an alternative perspective predictor for large-scale determination of DNA-binding proteins. PMID:24475169
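
The wrapper idea in this abstract (score each candidate feature subset by the performance of the classifier itself) can be illustrated with a small sketch. This is not the DBPPred code: it uses plain greedy forward selection rather than the best-first search the authors describe, and a leave-one-out nearest-centroid classifier stands in for Gaussian naïve Bayes so that the example needs only the standard library. The data set and all names (`forward_select`, `loo_centroid_accuracy`, `X`, `y`) are made up for illustration.

```python
def forward_select(candidates, score, max_features=None):
    """Greedy wrapper-style forward selection.

    candidates: feature indices (e.g. already ordered by a ranking step).
    score: callable mapping a list of feature indices to a number to maximise.
    """
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining and (max_features is None or len(selected) < max_features):
        # Evaluate every one-feature extension of the current subset.
        trials = [(score(selected + [f]), f) for f in remaining]
        top_score, top_feature = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no extension improves the score, so stop
        selected.append(top_feature)
        remaining.remove(top_feature)
        best = top_score
    return selected, best


def loo_centroid_accuracy(X, y, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier on `subset`."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in subset]
        point = [X[i][f] for f in subset]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])),
        )
        correct += predicted == y[i]
    return correct / len(X)


# Toy data: feature 0 separates the classes, features 1 and 2 are noise.
X = [[0.0, 5.1, 1.0], [0.2, 4.9, 0.0], [0.1, 5.0, 1.0],
     [1.0, 5.0, 0.0], [0.9, 5.1, 1.0], [1.1, 4.9, 0.0]]
y = [0, 0, 0, 1, 1, 1]

selected, best = forward_select(
    [0, 1, 2], lambda subset: loo_centroid_accuracy(X, y, subset)
)
```

On this toy data the search keeps only feature 0, because adding either noise feature does not raise the leave-one-out accuracy. In practice the scoring callable would be a cross-validated estimate from whichever classifier is being wrapped.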

  7. A new time-frequency method for identification and classification of ball bearing faults

    NASA Astrophysics Data System (ADS)

    Attoui, Issam; Fergani, Nadir; Boutasseta, Nadir; Oudjani, Brahim; Deliou, Adel

    2017-06-01

    To diagnose faults in ball bearings, one of the most critical components of rotating machinery, this paper presents a time-frequency procedure incorporating a new feature extraction step that combines the classical wavelet packet decomposition energy distribution technique and a new feature extraction technique based on the selection of the most impulsive frequency bands. In the proposed procedure, firstly, as a pre-processing step, the most impulsive frequency bands are selected at different bearing conditions using a combination of the Fast Fourier Transform (FFT) and Short-Frequency Energy (SFE) algorithms. Secondly, once the most impulsive frequency bands are selected, the measured machinery vibration signals are decomposed into different frequency sub-bands using the discrete Wavelet Packet Decomposition (WPD) technique to maximize the detection of their frequency contents, and the most useful sub-bands are then represented in the time-frequency domain using the Short-Time Fourier Transform (STFT) algorithm to determine exactly which frequency components are present in those sub-bands. Once the proposed feature vector is obtained, three feature dimensionality reduction techniques are employed: Linear Discriminant Analysis (LDA), a feedback wrapper method, and Locality Sensitive Discriminant Analysis (LSDA). Lastly, the Adaptive Neuro-Fuzzy Inference System (ANFIS) algorithm is used for instantaneous identification and classification of bearing faults. To evaluate the performance of the proposed method, different testing data sets are applied to the trained ANFIS model, covering healthy and faulty bearings under various load levels, fault severities and rotating speeds. The experimental results demonstrate that the proposed method can serve as an intelligent bearing fault diagnosis system.

  8. PVM Wrapper

    NASA Technical Reports Server (NTRS)

    Katz, Daniel

    2004-01-01

    PVM Wrapper is a software library that makes it possible for code that utilizes the Parallel Virtual Machine (PVM) software library to run using the message-passing interface (MPI) software library, without needing to rewrite the entire code. PVM and MPI are the two most common software libraries used for applications that involve passing of messages among parallel computers. Since about 1996, MPI has been the de facto standard. Codes written when PVM was popular often feature patterns of {"initsend," "pack," "send"} and {"receive," "unpack"} calls. In many cases, these calls are not contiguous and one set of calls may even be spread over multiple subroutines. These characteristics make it difficult to obtain equivalent functionality via a single MPI "send" call. Because PVM Wrapper is written to run with MPI-1.2, some PVM functions are not permitted and must be replaced, a task that requires some programming expertise. The "pvm_spawn" and "pvm_parent" function calls are not replaced, but a programmer can use "mpirun" and knowledge of the ranks of parent and child tasks with supplied macroinstructions to enable execution of codes that use "pvm_spawn" and "pvm_parent."

  9. Development of a two-stage gene selection method that incorporates a novel hybrid approach using the cuckoo optimization algorithm and harmony search for cancer classification.

    PubMed

    Elyasigomari, V; Lee, D A; Screen, H R C; Shaheed, M H

    2017-03-01

    For each cancer type, only a few genes are informative. Due to the so-called 'curse of dimensionality' problem, the gene selection task remains a challenge. To overcome this problem, we propose a two-stage gene selection method called MRMR-COA-HS. In the first stage, the minimum redundancy and maximum relevance (MRMR) feature selection is used to select a subset of relevant genes. The selected genes are then fed into a wrapper setup that combines a new algorithm, COA-HS, using the support vector machine as a classifier. The method was applied to four microarray datasets, and the performance was assessed by the leave one out cross-validation method. Comparative performance assessment of the proposed method with other evolutionary algorithms suggested that the proposed algorithm significantly outperforms other methods in selecting a fewer number of genes while maintaining the highest classification accuracy. The functions of the selected genes were further investigated, and it was confirmed that the selected genes are biologically relevant to each cancer type. Copyright © 2017. Published by Elsevier Inc.

  10. 76 FR 4935 - In the Matter of Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-27

    ... Proclivity Cigarette Paper Wrappers and Products Containing Same; Notice of Investigation AGENCY: U.S... importation of certain reduced ignition proclivity cigarette paper wrappers and products containing same by... reduced ignition proclivity cigarette paper wrappers and products containing same that infringe one or...

  11. Calculations of lattice vibrational mode lifetimes using Jazz: a Python wrapper for LAMMPS

    NASA Astrophysics Data System (ADS)

    Gao, Y.; Wang, H.; Daw, M. S.

    2015-06-01

    Jazz is a new python wrapper for LAMMPS [1], implemented to calculate the lifetimes of vibrational normal modes based on forces as calculated for any interatomic potential available in that package. The anharmonic character of the normal modes is analyzed via the Monte Carlo-based moments approximation as is described in Gao and Daw [2]. It is distributed as open-source software and can be downloaded from the website http://jazz.sourceforge.net/.

  12. Do Exam Wrappers Increase Metacognition and Performance? A Single Course Intervention

    ERIC Educational Resources Information Center

    Soicher, Raechel N.; Gurung, Regan A. R.

    2017-01-01

    Previous research has indicated that an intervention called "exam wrappers" can improve students' metacognition when they are using wrappers in more than one course per academic term. In this study, we tested if exam wrappers would improve students' metacognition and academic performance when used in only one course per academic term. A…

  13. Development of an R-based wrapper code for the computation of hydrochemical predominance diagrams based on PHREEQC modeling outputs

    NASA Astrophysics Data System (ADS)

    Mork, M. W.; Kracht, O.

    2012-04-01

    When investigating stability relations in aquatic solutions or rock-water interactions, the number of dissolved species and mineral phases involved can be overwhelming. To provide an overview of equilibrium relationships and of how chemical elements are distributed between different aqueous ions, complexes, and solids, predominance diagrams are a widely used tool in aquatic chemistry. In the simplest approach, the predominance field boundaries can be calculated based on a set of mass action equations and log K values for the reactions between different species. For example, for the popular redox diagram (pe-pH diagram), half-cell reactions according to Nernst's equation can be used (Garrels & Christ 1965). In such a case, boundaries between different species are "equal-activity" lines. However, for boundaries between solids and dissolved species a specific concentration needs to be stipulated, and the same applies if components other than those displayed in the diagram are involved in the possible reactions. In such a case, the predominance field boundaries depend on the actual concentration values chosen. An alternative approach is to compute predominance diagrams using the full speciation obtained from a geochemical speciation program, which then needs to be coupled with an external wrapper code for appropriate control and data pre- and post-processing. In this way, the distribution of different species can be based on complete chemical analyses obtained from laboratory investigations. We present the results of a student semester project that aimed to develop and test an external wrapper program for the computation of pe-pH diagrams based on modeling outputs obtained with PHREEQC (Parkhurst & Appelo 1999). We chose PHREEQC as the geochemical calculation module for this core task because of its capability to simulate a wide range of equilibrium reactions between water and minerals. 
Due to the intended final users, a free and extensible simulation platform was considered important. The wrapper program was created in the R environment which is freely available under the GNU General Public License (R Development Core Team 2011). The wrapper reads in analytical data in the standard PHREEQC input file format and then iterates over a systematic selection of pe and pH values. These data are transferred to PHREEQC for the calculation of a corresponding set of hydrochemical speciations based on thermodynamic equilibrium. The results of the PHREEQC simulations are subsequently analyzed by a postprocessor function in order to derive a two-dimensional representation of the dominant aquatic species in the pe-pH plane. In this step, the most abundant species at each grid point is identified as the predominant one. To investigate the utility of the program, differences in the speciation of iron were calculated from chemical compositions of water samples from one of our current field sites (Gardermoen / Øvre Romerike aquifer in S-Norway).
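
The grid-and-argmax step described in this abstract can be sketched in a few lines. The sketch below is written in Python rather than R, and the speciation call is a stand-in: `toy_speciate` is a made-up placeholder with no chemical meaning, used where the real wrapper would run PHREEQC and parse its output. All names here are hypothetical.

```python
def predominance_grid(pe_values, ph_values, speciate):
    """Map each (pe, pH) grid point to the name of its most abundant species.

    speciate: callable (pe, pH) -> {species name: concentration}.
    """
    grid = {}
    for pe in pe_values:
        for ph in ph_values:
            concentrations = speciate(pe, ph)
            # The species with the largest concentration is the predominant one.
            grid[(pe, ph)] = max(concentrations, key=concentrations.get)
    return grid


def toy_speciate(pe, ph):
    """Placeholder speciation model (NOT real chemistry, illustration only)."""
    return {"species_A": 10.0 - pe, "species_B": pe + ph}


grid = predominance_grid([0, 5, 10], [0, 7, 14], toy_speciate)
```

Plotting the resulting dictionary as a coloured pe-pH raster would give the two-dimensional predominance diagram; the field boundaries emerge wherever the predominant species changes between neighbouring grid points.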

  15. Feature selection through validation and un-censoring of endovascular repair survival data for predicting the risk of re-intervention.

    PubMed

    Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter J E; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong

    2017-08-03

    The feature selection (FS) process is essential in the medical area as it reduces the effort and time physicians spend measuring unnecessary features. Choosing useful variables is difficult in the presence of censoring, which is the unique characteristic of survival analysis. Most survival FS methods depend on Cox's proportional hazard model; machine learning techniques (MLT) are preferred but not commonly used due to censoring. Techniques that have been proposed to adopt MLT for FS with survival data cannot be used with a high level of censoring. The researchers' previous publications proposed a technique to deal with the high level of censoring. It also used existing FS techniques to reduce dataset dimension. In this paper, however, a new FS technique is proposed and combined with feature transformation and the proposed uncensoring approaches to select a reduced set of features and produce a stable predictive model. The FS technique is based on artificial neural network (ANN) MLT and is designed to deal with highly censored Endovascular Aortic Repair (EVAR) survival data. EVAR datasets were collected from 2004 to 2010 from two vascular centers in order to produce a final stable model. Almost 91% of the patients in them are censored. The proposed approach used a wrapper FS method with ANN to select a reduced subset of features that predict the risk of EVAR re-intervention after 5 years for patients from two different centers located in the United Kingdom, so that it can potentially be applied to cross-center predictions. The proposed model is compared with two popular FS techniques, the Akaike and Bayesian information criteria (AIC, BIC), that are used with Cox's model. The final model outperforms other methods in distinguishing the high- and low-risk groups, as they both have a concordance index and estimated AUC better than Cox's model based on the AIC, BIC, Lasso, and SCAD approaches. These models have p-values lower than 0.05, meaning that patients in different risk groups can be separated significantly and those who would need re-intervention can be correctly predicted. The proposed approach will save the time and effort physicians spend collecting unnecessary variables. The final reduced model was able to predict the long-term risk of aortic complications after EVAR. This predictive model can help clinicians decide patients' future observation plans.

  16. 7 CFR 30.16 - Cigar wrapper.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of Grades § 30.16 Cigar wrapper. A portion of a tobacco leaf forming the outer covering of a cigar. Cigar-wrapper tobacco...

  17. 7 CFR 30.16 - Cigar wrapper.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of Grades § 30.16 Cigar wrapper. A portion of a tobacco leaf forming the outer covering of a cigar. Cigar-wrapper tobacco...

  18. 7 CFR 30.16 - Cigar wrapper.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of Grades § 30.16 Cigar wrapper. A portion of a tobacco leaf forming the outer covering of a cigar. Cigar-wrapper tobacco...

  19. 7 CFR 30.16 - Cigar wrapper.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of Grades § 30.16 Cigar wrapper. A portion of a tobacco leaf forming the outer covering of a cigar. Cigar-wrapper tobacco...

  20. 7 CFR 30.16 - Cigar wrapper.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of Grades § 30.16 Cigar wrapper. A portion of a tobacco leaf forming the outer covering of a cigar. Cigar-wrapper tobacco...

  1. Designing basin-customized combined drought indices via feature extraction

    NASA Astrophysics Data System (ADS)

    Zaniolo, Marta; Giuliani, Matteo; Castelletti, Andrea

    2017-04-01

    The socio-economic costs of drought are progressively increasing worldwide due to the ongoing alteration of hydro-meteorological regimes induced by climate change. Although drought management is largely studied in the literature, most traditional drought indexes fail to detect critical events in highly regulated systems; they generally rely on ad-hoc formulations and cannot be generalized to different contexts. In this study, we contribute a novel framework for the design of a basin-customized drought index. This index represents a surrogate of the state of the basin and is computed by combining the information about the water available in the system to reproduce a representative target variable for the drought condition of the basin (e.g., water deficit). To select the relevant variables and how to combine them, we use an advanced feature extraction algorithm called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS). The W-QEISS algorithm relies on a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables (cardinality) and optimizing relevance and redundancy of the subset. The accuracy objective is evaluated through the calibration of a pre-defined model (i.e., an extreme learning machine) of the water deficit for each candidate subset of variables, with the index selected from the resulting solutions identifying a suitable compromise between accuracy, cardinality, relevance, and redundancy. The proposed methodology is tested in the case study of Lake Como in northern Italy, a regulated lake mainly operated for irrigation supply to four downstream agricultural districts. In the absence of an institutional drought monitoring system, we constructed the combined index using all the hydrological variables from the existing monitoring system as well as the most common drought indicators at multiple time aggregations. The soil moisture deficit in the root zone computed by a distributed-parameter water balance model of the agricultural districts is used as target variable. Numerical results show that our framework succeeds in constructing a combined drought index that reproduces the soil moisture deficit. Moreover, this index provides valuable information for supporting appropriate drought management strategies, including the possibility of directly informing the lake operations about the drought conditions and improving the overall reliability of the irrigation supply system.
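
The notion of Pareto-efficient subsets used in this abstract can be made concrete with a short sketch. It is not the W-QEISS algorithm: it only filters an already-evaluated list of candidates, and it uses two of the four objectives named above (accuracy to be maximised, cardinality to be minimised). The function name and the example scores are invented for illustration.

```python
def pareto_front(solutions):
    """Keep the non-dominated (accuracy, cardinality) pairs.

    Accuracy is maximised and cardinality is minimised.
    """
    def dominates(a, b):
        # a dominates b if it is no worse on both objectives and better on one.
        no_worse = a[0] >= b[0] and a[1] <= b[1]
        strictly_better = a[0] > b[0] or a[1] < b[1]
        return no_worse and strictly_better

    return [s for s in solutions if not any(dominates(o, s) for o in solutions)]


# Hypothetical candidate subsets scored as (accuracy, number of variables).
candidates = [(0.90, 3), (0.80, 2), (0.85, 3), (0.90, 5)]
front = pareto_front(candidates)
```

Here `(0.85, 3)` and `(0.90, 5)` are dropped because `(0.90, 3)` is at least as accurate with no more variables, while `(0.80, 2)` survives as the cheaper, less accurate compromise. Choosing one index from the front is then a separate trade-off decision.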

  2. 77 FR 20844 - Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-06

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-756] Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same Determination to Partially Review the Final Initial... reduced ignition proclivity cigarette paper wrappers and products containing same by reason of...

  3. C2M: Configurable Chemical Middleware

    PubMed Central

    Roosendaal, Hans E.; Geurts, Peter A. T. M.

    2001-01-01

    One of the vexing problems that besets concurrent use of multiple, heterogeneous resources is format multiplicity. C2M aims to equip scientists with a wrapper generator on their desktop. The wrapper generator can build wrappers, that is, converters that convert data from or into different formats, from a high-level description of the formats. The language in which such a high-level description is expressed is easy enough for scientists to be able to write format descriptions at minimal cost. In C2M, wrappers and documentation for human reading are automatically obtained from the same user-supplied specifications. Initial experiments demonstrate that the idea can, indeed, lead to the advent of user-governed wrapper generators. Future research will consolidate the code and extend the approach to a realistic variety of formats. PMID:18628869

  4. Classification of vocal aging using parameters extracted from the glottal signal.

    PubMed

    Forero Mendoza, Leonardo A; Cataldo, Edson; Vellasco, Marley M B R; Silva, Marco A; Apolinário, José A

    2014-09-01

    This article proposes and evaluates a method to classify vocal aging using artificial neural network (ANN) and support vector machine (SVM), using the parameters extracted from the speech signal as inputs. For each recorded speech, from a corpus of male and female speakers of different ages, the corresponding glottal signal is obtained using an inverse filtering algorithm. The Mel Frequency Cepstrum Coefficients (MFCC) also extracted from the voice signal and the features extracted from the glottal signal are supplied to an ANN and an SVM with a previous selection. The selection is performed by a wrapper approach of the most relevant parameters. Three groups are considered for the aging-voice classification: young (aged 15-30 years), adult (aged 31-60 years), and senior (aged 61-90 years). The results are compared using different possibilities: with only the parameters extracted from the glottal signal, with only the MFCC, and with a combination of both. The results demonstrate that the best classification rate is obtained using the glottal signal features, which is a novel result and the main contribution of this article. Copyright © 2014 The Voice Foundation. Published by Elsevier Inc. All rights reserved.

  5. Test Scheduling for Core-Based SOCs Using Genetic Algorithm Based Heuristic Approach

    NASA Astrophysics Data System (ADS)

    Giri, Chandan; Sarkar, Soumojit; Chattopadhyay, Santanu

    This paper presents a Genetic algorithm (GA) based solution to co-optimize test scheduling and wrapper design for core based SOCs. Core testing solutions are generated as a set of wrapper configurations, represented as rectangles with width equal to the number of TAM (Test Access Mechanism) channels and height equal to the corresponding testing time. A locally optimal best-fit heuristic based bin packing algorithm has been used to determine placement of rectangles minimizing the overall test times, whereas, GA has been utilized to generate the sequence of rectangles to be considered for placement. Experimental result on ITC'02 benchmark SOCs shows that the proposed method provides better solutions compared to the recent works reported in the literature.

  6. Gfr Core Neutronics Studies at CEA

    NASA Astrophysics Data System (ADS)

    Bosq, J. C.; Brun-Magaud, V.; Rimpault, G.; Tommasi, J.; Conti, A.; Garnier, J. C.

    2006-04-01

    The Gas-cooled Fast Reactor (GFR) is a high priority in the CEA R&D program on Future Nuclear Energy Systems. After preliminary neutronics and thermo-aeraulic studies, a first He-cooled 2400 MWth core design based on a series of carbide CERCER plates arranged in a hexagonal wrapper was selected. Although GFR subassembly and core design studies are still at an early stage of development, it is nonetheless possible to identify a number of nuclear data needs that could have some impact on the actual design: new materials, decay heat contributors…

  7. 7 CFR 29.2661 - Wrappers (A Group).

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 2 2011-01-01 2011-01-01 false Wrappers (A Group). 29.2661 Section 29.2661... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.2661 Wrappers (A Group). This group consists of leaves usually grown at or above the center portion of the stalk. Cured leaves of this group are elastic and show...

  8. 76 FR 22145 - In the Matter of Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-04-20

    ... INTERNATIONAL TRADE COMMISSION Inv. No. 337-TA-756 In the Matter of Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same; Notice of Commission Determination Not To... cigarette paper wrappers and products containing same by reason of infringement of certain claims of U.S...

  9. 7 CFR 30.41 - Class 6; cigar-wrapper types and groups.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Class 6; cigar-wrapper types and groups. 30.41 Section 30.41 Agriculture Regulations of the Department of Agriculture AGRICULTURAL MARKETING SERVICE... and Groups of Grades § 30.41 Class 6; cigar-wrapper types and groups. (a) Type 61. That type of shade...

  10. 30 CFR 15.32 - Tolerances for weight of explosive, sheath, wrapper, and specific gravity.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ..., wrapper, and specific gravity. 15.32 Section 15.32 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... explosive, sheath, wrapper, and specific gravity. (a) The weight of the explosive, the sheath, and the outer.... (c) The specific gravity of the explosive and sheath shall be within ±7.5 percent of that specified...

  11. 30 CFR 15.32 - Tolerances for weight of explosive, sheath, wrapper, and specific gravity.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ..., wrapper, and specific gravity. 15.32 Section 15.32 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... explosive, sheath, wrapper, and specific gravity. (a) The weight of the explosive, the sheath, and the outer.... (c) The specific gravity of the explosive and sheath shall be within ±7.5 percent of that specified...

  12. 30 CFR 15.32 - Tolerances for weight of explosive, sheath, wrapper, and specific gravity.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ..., wrapper, and specific gravity. 15.32 Section 15.32 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... explosive, sheath, wrapper, and specific gravity. (a) The weight of the explosive, the sheath, and the outer.... (c) The specific gravity of the explosive and sheath shall be within ±7.5 percent of that specified...

  13. 30 CFR 15.32 - Tolerances for weight of explosive, sheath, wrapper, and specific gravity.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ..., wrapper, and specific gravity. 15.32 Section 15.32 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... explosive, sheath, wrapper, and specific gravity. (a) The weight of the explosive, the sheath, and the outer.... (c) The specific gravity of the explosive and sheath shall be within ±7.5 percent of that specified...

  14. 30 CFR 15.32 - Tolerances for weight of explosive, sheath, wrapper, and specific gravity.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ..., wrapper, and specific gravity. 15.32 Section 15.32 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION... explosive, sheath, wrapper, and specific gravity. (a) The weight of the explosive, the sheath, and the outer.... (c) The specific gravity of the explosive and sheath shall be within ±7.5 percent of that specified...

  15. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yuuichi Tooya; Tadahiro Washiya; Kenji Koizumi

    Japan Atomic Energy Agency (JAEA) has been leading a feasibility study on commercialized fast reactor cycle systems in Japan. In this study, we have proposed a new disassembly technology based on a mechanical disassembly system that consists of a mechanical cutting step and a wrapper tube pulling step. In the mechanical disassembly system, a high-durability mechanical tool grinds the wrapper tube (slit-cut (S/C) operation in the circumferential direction), and then the wrapper tube is pulled out and removed from the fuel assembly. Then the fuel pins are cut (crop-cut (C/C) operation at the entrance nozzle side) and the entrance nozzle is removed. The fuel pins are transported to the shearing device in the next process. Fundamental tests were carried out with simulated FBR fuel pins and wrapper tube, and the cutting performance and wrapper tube pulling performance have been confirmed at engineering scale. As a result, we established an efficient disassembly procedure and the fundamental design of the mechanical disassembly system. (authors)

  16. XLWrap - Querying and Integrating Arbitrary Spreadsheets with SPARQL

    NASA Astrophysics Data System (ADS)

    Langegger, Andreas; Wöß, Wolfram

    In this paper a novel approach is presented for generating RDF graphs of arbitrary complexity from various spreadsheet layouts. Currently, none of the available spreadsheet-to-RDF wrappers supports cross tables and tables where data is not aligned in rows. Similar to RDF123, XLWrap is based on template graphs where fragments of triples can be mapped to specific cells of a spreadsheet. Additionally, it features a full expression algebra based on the syntax of OpenOffice Calc and various shift operations, which can be used to repeat similar mappings in order to wrap cross tables including multiple sheets and spreadsheet files. The set of available expression functions includes most of the native functions of OpenOffice Calc and can be easily extended by users of XLWrap.

  17. Automatic Response to Intrusion

    DTIC Science & Technology

    2002-10-01

    Computing Corporation Sidewinder Firewall [18] SRI EMERALD Basic Security Module (BSM) and EMERALD File Transfer Protocol (FTP) Monitors...the same event TCP Wrappers [24] Internet Security Systems RealSecure [31] SRI EMERALD IDIP monitor NAI Labs Generic Software Wrappers Prototype...included EMERALD, NetRadar, NAI Labs UNIX wrappers, ARGuE, MPOG, NetRadar, CyberCop Server, Gauntlet, RealSecure, and the Cyber Command System

  18. Improving permafrost distribution modelling using feature selection algorithms

    NASA Astrophysics Data System (ADS)

    Deluigi, Nicola; Lambiel, Christophe; Kanevski, Mikhail

    2016-04-01

    The availability of an increasing number of spatial data on the occurrence of mountain permafrost allows the employment of machine learning (ML) classification algorithms for modelling the distribution of the phenomenon. One of the major problems when dealing with high-dimensional datasets is the number of input features (variables) involved. Application of ML classification algorithms to this large number of variables leads to the risk of overfitting, with the consequence of a poor generalization/prediction. For this reason, applying feature selection (FS) techniques helps reduce the number of factors required and improves the knowledge on adopted features and their relation with the studied phenomenon. Moreover, taking away irrelevant or redundant variables from the dataset effectively improves the quality of the ML prediction. This research deals with a comparative analysis of permafrost distribution models supported by FS variable importance assessment. The input dataset (dimension = 20-25, 10 m spatial resolution) was constructed using landcover maps, climate data and DEM derived variables (altitude, aspect, slope, terrain curvature, solar radiation, etc.). It was completed with permafrost evidences (geophysical and thermal data and rock glacier inventories) that serve as training permafrost data. The FS algorithms used identified variables that appeared less statistically important for permafrost presence/absence. Three different algorithms were compared: Information Gain (IG), Correlation-based Feature Selection (CFS) and Random Forest (RF). IG is a filter technique that evaluates the worth of a predictor by measuring the information gain with respect to the permafrost presence/absence. Conversely, CFS is a wrapper technique that evaluates the worth of a subset of predictors by considering the individual predictive ability of each variable along with the degree of redundancy between them. Finally, RF is a ML algorithm that performs FS as part of its overall operation. It operates by constructing a large collection of decorrelated classification trees, and then predicts the permafrost occurrence through a majority vote. With the so-called out-of-bag (OOB) error estimate, the classification of permafrost data can be validated and the contribution of each predictor can be assessed. The performances of compared permafrost distribution models (computed on independent testing sets) increased with the application of FS algorithms on the original dataset and irrelevant or redundant variables were removed. As a consequence, the process provided faster and more cost-effective predictors and a better understanding of the underlying structures residing in permafrost data. Our work demonstrates the usefulness of a feature selection step prior to applying a machine learning algorithm. In fact, permafrost predictors could be ranked not only based on their heuristic and subjective importance (expert knowledge), but also based on their statistical relevance in relation to the permafrost distribution.

  19. Electronic nicotine delivery systems: is there a need for regulation?

    PubMed

    Trtchounian, Anna; Talbot, Prue

    2011-01-01

    Electronic nicotine delivery systems (ENDS) purport to deliver nicotine to the lungs of smokers. Five brands of ENDS were evaluated for design features, accuracy and clarity of labelling and quality of instruction manuals and associated print material supplied with products or on manufacturers' websites. ENDS were purchased from online vendors and analysed for various parameters. While the basic design of ENDS was similar across brands, specific design features varied significantly. Fluid contained in cartridge reservoirs readily leaked out of most brands, and it was difficult to assemble or disassemble ENDS without touching nicotine-containing fluid. Two brands had designs that helped lessen this problem. Labelling of cartridges was very poor; labelling of some cartridge wrappers was better than labelling of cartridges. In general, packs of replacement cartridges were better labelled than the wrappers or cartridges, but most packs lacked cartridge content and warning information, and sometimes packs had confusing information. Used cartridges contained fluid, and disposal of nicotine-containing cartridges was not adequately addressed on websites or in manuals. Orders were sometimes filled incorrectly, and safety features did not always function properly. Print and internet material often contained information or made claims for which there is currently no scientific support. Design flaws, lack of adequate labelling and concerns about quality control and health issues indicate that regulators should consider removing ENDS from the market until their safety can be adequately evaluated.

  20. Design and Application of Drought Indexes in Highly Regulated Mediterranean Water Systems

    NASA Astrophysics Data System (ADS)

    Castelletti, A.; Zaniolo, M.; Giuliani, M.

    2017-12-01

    Costs of drought are progressively increasing due to the undergoing alteration of hydro-meteorological regimes induced by climate change. Although drought management is largely studied in the literature, most of the traditional drought indexes fail in detecting critical events in highly regulated systems, which generally rely on ad-hoc formulations and cannot be generalized to different contexts. In this study, we contribute a novel framework for the design of a basin-customized drought index. This index represents a surrogate of the state of the basin and is computed by combining the available information about the water available in the system to reproduce a representative target variable for the drought condition of the basin (e.g., water deficit). To select the relevant variables and combinations thereof, we use an advanced feature extraction algorithm called Wrapper for Quasi Equally Informative Subset Selection (W-QEISS). W-QEISS relies on a multi-objective evolutionary algorithm to find Pareto-efficient subsets of variables by maximizing the wrapper accuracy, minimizing the number of selected variables, and optimizing relevance and redundancy of the subset. The accuracy objective is evaluated through the calibration of an extreme learning machine of the water deficit for each candidate subset of variables, with the index selected from the resulting solutions identifying a suitable compromise between accuracy, cardinality, relevance, and redundancy. The approach is tested on Lake Como, Italy, a regulated lake mainly operated for irrigation supply. In the absence of an institutional drought monitoring system, we constructed the combined index using all the hydrological variables from the existing monitoring system as well as common drought indicators at multiple time aggregations. The soil moisture deficit in the root zone computed by a distributed-parameter water balance model of the agricultural districts is used as target variable. Numerical results show that our combined drought index successfully reproduces the deficit. The index represents valuable information for supporting appropriate drought management strategies, including the possibility of directly informing the lake operations about the drought conditions and improving the overall reliability of the irrigation supply system.

  1. Physical Human Activity Recognition Using Wearable Sensors.

    PubMed

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-12-11

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.
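
    The wrapper idea mentioned in this abstract (scoring candidate feature subsets by the accuracy of a classifier trained on them) can be sketched as follows. This is only a minimal illustration, not the authors' procedure: the abstract's wrapper is based on Random Forest, whereas this sketch swaps in a 1-nearest-neighbour classifier with leave-one-out accuracy and a greedy forward search. The function names and the toy data are hypothetical.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, row in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue  # never use the held-out sample as its own neighbour
            dist = sum((row[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_selection(rows, labels):
    """Greedy wrapper: keep adding the feature that most improves the
    classifier's estimated accuracy; stop when no feature improves it."""
    remaining = set(range(len(rows[0])))
    selected, best_score = [], 0.0
    while remaining:
        candidates = [
            (loo_accuracy(rows, labels, selected + [f]), f)
            for f in sorted(remaining)
        ]
        # max() returns the first best candidate, so ties go to the lowest index
        score, feature = max(candidates, key=lambda c: c[0])
        if score <= best_score:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy data: feature 0 separates the classes, feature 1 misleads.
rows = [
    [1.0, 5.0], [1.2, 1.0], [0.9, 3.0],
    [3.0, 4.9], [3.1, 1.1], [2.9, 3.1],
]
labels = [0, 0, 0, 1, 1, 1]

selected, score = forward_selection(rows, labels)
print(selected, score)
```

    On this toy data the search keeps only feature 0. Each candidate subset costs a full classifier evaluation, which is why wrapper methods are usually slower than filter methods.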

  2. Physical Human Activity Recognition Using Wearable Sensors

    PubMed Central

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-01-01

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450

  3. Distributed Data Integration Infrastructure

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Critchlow, T; Ludaescher, B; Vouk, M

    The Internet is becoming the preferred method for disseminating scientific data from a variety of disciplines. This can result in information overload on the part of the scientists, who are unable to query all of the relevant sources, even if they knew where to find them, what they contained, how to interact with them, and how to interpret the results. A related issue is that keeping up with current trends in information technology often taxes the end-user's expertise and time. Thus instead of benefiting from this information rich environment, scientists become experts on a small number of sources and technologies, use them almost exclusively, and develop a resistance to innovations that can enhance their productivity. Enabling information based scientific advances, in domains such as functional genomics, requires fully utilizing all available information and the latest technologies. In order to address this problem we are developing an end-user centric, domain-sensitive workflow-based infrastructure, shown in Figure 1, that will allow scientists to design complex scientific workflows that reflect the data manipulation required to perform their research without an undue burden. We are taking a three-tiered approach to designing this infrastructure utilizing (1) abstract workflow definition, construction, and automatic deployment, (2) complex agent-based workflow execution and (3) automatic wrapper generation. In order to construct a workflow, the scientist defines an abstract workflow (AWF) in terminology (semantics and context) that is familiar to him/her. This AWF includes all of the data transformations, selections, and analyses required by the scientist, but does not necessarily specify particular data sources. This abstract workflow is then compiled into an executable workflow (EWF, in our case XPDL) that is then evaluated and executed by the workflow engine. This EWF contains references to specific data sources and interfaces capable of performing the desired actions. In order to provide access to the largest number of resources possible, our lowest level utilizes automatic wrapper generation techniques to create information and data wrappers capable of interacting with the complex interfaces typical in scientific analysis. The remainder of this document outlines our work in these three areas, the impact our work has made, and our plans for the future.

  4. A UIMA wrapper for the NCBO annotator.

    PubMed

    Roeder, Christophe; Jonquet, Clement; Shah, Nigam H; Baumgartner, William A; Verspoor, Karin; Hunter, Lawrence

    2010-07-15

    The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator, an ontology-based annotation service, to make it available as a component in UIMA workflows. This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows.

  5. Framework for Integrating Science Data Processing Algorithms Into Process Control Systems

    NASA Technical Reports Server (NTRS)

    Mattmann, Chris A.; Crichton, Daniel J.; Chang, Albert Y.; Foster, Brian M.; Freeborn, Dana J.; Woollard, David M.; Ramirez, Paul M.

    2011-01-01

    A software framework called PCS Task Wrapper is responsible for standardizing the setup, process initiation, execution, and file management tasks surrounding the execution of science data algorithms, which are referred to by NASA as Product Generation Executives (PGEs). PGEs codify a scientific algorithm, some step in the overall scientific process involved in a mission science workflow. The PCS Task Wrapper provides a stable operating environment to the underlying PGE during its execution lifecycle. If the PGE requires a file, or metadata regarding the file, the PCS Task Wrapper is responsible for delivering that information to the PGE in a manner that meets its requirements. If the PGE requires knowledge of upstream or downstream PGEs in a sequence of executions, that information is also made available. Finally, if information regarding disk space, or node information such as CPU availability, etc., is required, the PCS Task Wrapper provides this information to the underlying PGE. After this information is collected, the PGE is executed, and its output Product file and Metadata generation is managed via the PCS Task Wrapper framework. The innovation is responsible for marshalling output Products and Metadata back to a PCS File Management component for use in downstream data processing and pedigree. In support of this, the PCS Task Wrapper leverages the PCS Crawler Framework to ingest (during pipeline processing) the output Product files and Metadata produced by the PGE. The architectural components of the PCS Task Wrapper framework include PGE Task Instance, PGE Config File Builder, Config File Property Adder, Science PGE Config File Writer, and PCS Met file Writer. This innovative framework is really the unifying bridge between the execution of a step in the overall processing pipeline, and the available PCS component services as well as the information that they collectively manage.

  6. Hybrid feature selection algorithm using symmetrical uncertainty and a harmony search algorithm

    NASA Astrophysics Data System (ADS)

    Salameh Shreem, Salam; Abdullah, Salwani; Nazri, Mohd Zakree Ahmad

    2016-04-01

    Microarray technology can be used as an efficient diagnostic system to recognise diseases such as tumours or to discriminate between different types of cancers in normal tissues. This technology has received increasing attention from the bioinformatics community because of its potential in designing powerful decision-making tools for cancer diagnosis. However, the presence of thousands or tens of thousands of genes affects the predictive accuracy of this technology from the perspective of classification. Thus, a key issue in microarray data is identifying or selecting the smallest possible set of genes from the input data that can achieve good predictive accuracy for classification. In this work, we propose a two-stage selection algorithm for gene selection problems in microarray data-sets called the symmetrical uncertainty filter and harmony search algorithm wrapper (SU-HSA). Experimental results show that the SU-HSA is better than HSA in isolation for all data-sets in terms of the accuracy and achieves a lower number of genes on 6 out of 10 instances. Furthermore, the comparison with state-of-the-art methods shows that our proposed approach is able to obtain 5 (out of 10) new best results in terms of the number of selected genes and competitive results in terms of the classification accuracy.
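
    The filter stage named in this abstract relies on symmetrical uncertainty, which is commonly defined as SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)). The sketch below shows that standard definition for discrete variables and uses it to rank features. It is an illustration under stated assumptions, not the authors' implementation: the gene names and data are hypothetical, and the harmony search wrapper stage is not shown.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value between 0 and 1."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant
    h_xy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    mutual_info = h_x + h_y - h_xy   # I(X; Y)
    return 2.0 * mutual_info / (h_x + h_y)


# Hypothetical toy data: discretised expression levels and a class label.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
genes = {
    "gene_a": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the label
    "gene_b": [0, 1, 0, 1, 0, 1, 0, 1],  # partially informative
    "gene_c": [1, 1, 1, 1, 1, 1, 1, 1],  # constant, carries no information
}

# Filter step: rank genes by SU with the class label, best first.
ranking = sorted(
    genes,
    key=lambda g: symmetrical_uncertainty(genes[g], labels),
    reverse=True,
)
print(ranking)
```

    In a two-stage scheme of this kind, only the top-ranked genes from the filter would be passed on to the more expensive wrapper search.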

  7. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system

    PubMed Central

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M.; Hilgart, Mark C.; Stepanov, Sergey; Sanishvili, Ruslan; Becker, Michael; Winter, Graeme; Sauter, Nicholas K.; Smith, Janet L.; Fischetti, Robert F.

    2014-01-01

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce. PMID:25484844

  8. Tightly integrated single- and multi-crystal data collection strategy calculation and parallelized data processing in JBluIce beamline control system

    DOE PAGES

    Pothineni, Sudhir Babu; Venugopalan, Nagarajan; Ogata, Craig M.; ...

    2014-11-18

    The calculation of single- and multi-crystal data collection strategies and a data processing pipeline have been tightly integrated into the macromolecular crystallographic data acquisition and beamline control software JBluIce. Both tasks employ wrapper scripts around existing crystallographic software. JBluIce executes scripts through a distributed resource management system to make efficient use of all available computing resources through parallel processing. The JBluIce single-crystal data collection strategy feature uses a choice of strategy programs to help users rank sample crystals and collect data. The strategy results can be conveniently exported to a data collection run. The JBluIce multi-crystal strategy feature calculates a collection strategy to optimize coverage of reciprocal space in cases where incomplete data are available from previous samples. The JBluIce data processing runs simultaneously with data collection using a choice of data reduction wrappers for integration and scaling of newly collected data, with an option for merging with pre-existing data. Data are processed separately if collected from multiple sites on a crystal or from multiple crystals, then scaled and merged. Results from all strategy and processing calculations are displayed in relevant tabs of JBluIce.

  9. A UIMA wrapper for the NCBO annotator

    PubMed Central

    Roeder, Christophe; Jonquet, Clement; Shah, Nigam H.; Baumgartner, William A.; Verspoor, Karin; Hunter, Lawrence

    2010-01-01

    Summary: The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator—an ontology-based annotation service—to make it available as a component in UIMA workflows. Availability: This wrapper is freely available on the web at http://bionlp-uima.sourceforge.net/ as part of the UIMA tools distribution from the Center for Computational Pharmacology (CCP) at the University of Colorado School of Medicine. It has been implemented in Java for support on Mac OS X, Linux and MS Windows. Contact: chris.roeder@ucdenver.edu PMID:20505005

  10. 7 CFR 30.41 - Class 6; cigar-wrapper types and groups.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... CONTAINER REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types... northern Florida. Groups applicable to types 61 and 62: A—Wrappers. S—Stained. X—Brokes. N—Nondescript, as...

  11. 7 CFR 30.41 - Class 6; cigar-wrapper types and groups.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... CONTAINER REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types... northern Florida. Groups applicable to types 61 and 62: A—Wrappers. S—Stained. X—Brokes. N—Nondescript, as...

  12. 7 CFR 30.41 - Class 6; cigar-wrapper types and groups.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... CONTAINER REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types... northern Florida. Groups applicable to types 61 and 62: A—Wrappers. S—Stained. X—Brokes. N—Nondescript, as...

  13. 7 CFR 30.41 - Class 6; cigar-wrapper types and groups.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... CONTAINER REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types... northern Florida. Groups applicable to types 61 and 62: A—Wrappers. S—Stained. X—Brokes. N—Nondescript, as...

  14. Lead contamination of imported candy wrappers.

    PubMed

    Fuortes, L; Bauer, E

    2000-02-01

    Lead toxicity in a young Hispanic woman from sucking on a terra cotta candy container led to investigating lead contamination in candy packaging materials imported from Mexico. Printed cellophane candy wrappers may present a significant risk for lead exposure.

  15. Unveiling relevant non-motor Parkinson's disease severity symptoms using a machine learning approach.

    PubMed

    Armañanzas, Rubén; Bielza, Concha; Chaudhuri, Kallol Ray; Martinez-Martin, Pablo; Larrañaga, Pedro

    2013-07-01

    Is it possible to predict the severity staging of a Parkinson's disease (PD) patient using scores of non-motor symptoms? This is the kickoff question for a machine learning approach to classify two widely known PD severity indexes using individual tests from a broad set of non-motor PD clinical scales only. The Hoehn & Yahr index and clinical impression of severity index are global measures of PD severity. They constitute the labels to be assigned in two supervised classification problems using only non-motor symptom tests as predictor variables. Such predictors come from a wide range of PD symptoms, such as cognitive impairment, psychiatric complications, autonomic dysfunction or sleep disturbance. The classification was coupled with a feature subset selection task using an advanced evolutionary algorithm, namely an estimation of distribution algorithm. Results show how five different classification paradigms using a wrapper feature selection scheme are capable of predicting each of the class variables with estimated accuracy in the range of 72-92%. In addition, classification into the main three severity categories (mild, moderate and severe) was split into dichotomic problems where binary classifiers perform better and select different subsets of non-motor symptoms. The number of jointly selected symptoms throughout the whole process was low, suggesting a link between the selected non-motor symptoms and the general severity of the disease. Quantitative results are discussed from a medical point of view, reflecting a clear translation to the clinical manifestations of PD. Moreover, results include a brief panel of non-motor symptoms that could help clinical practitioners to identify patients who are at different stages of the disease from a limited set of symptoms, such as hallucinations, fainting, inability to control body sphincters or believing in unlikely facts. Copyright © 2013 Elsevier B.V. All rights reserved.

  16. 30 CFR 15.22 - Tolerances for performance, wrapper, and specific gravity.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... percent of that specified in the approval. (b) The weight of wrapper per 100 grams of explosive shall be within ±2 grams of that specified in the approval. (c) The apparent specific gravity of the explosive...

  17. 30 CFR 15.22 - Tolerances for performance, wrapper, and specific gravity.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... percent of that specified in the approval. (b) The weight of wrapper per 100 grams of explosive shall be within ±2 grams of that specified in the approval. (c) The apparent specific gravity of the explosive...

  18. 30 CFR 15.22 - Tolerances for performance, wrapper, and specific gravity.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... percent of that specified in the approval. (b) The weight of wrapper per 100 grams of explosive shall be within ±2 grams of that specified in the approval. (c) The apparent specific gravity of the explosive...

  19. 30 CFR 15.22 - Tolerances for performance, wrapper, and specific gravity.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... percent of that specified in the approval. (b) The weight of wrapper per 100 grams of explosive shall be within ±2 grams of that specified in the approval. (c) The apparent specific gravity of the explosive...

  20. Umbra (core)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bradley, Jon David; Oppel III, Fred J.; Hart, Brian E.

    Umbra is a flexible simulation framework for complex systems that can be used by itself for modeling, simulation, and analysis, or to create specific applications. It has been applied to many operations, primarily dealing with robotics and system of system simulations. This version, from 4.8 to 4.8.3b, incorporates bug fixes, refactored code, and new managed C++ wrapper code that can be used to bridge new applications written in C# to the C++ libraries. The new managed C++ wrapper code includes (project/directories) BasicSimulation, CSharpUmbraInterpreter, LogFileView, UmbraAboutBox, UmbraControls, UmbraMonitor and UmbraWrapper.

  1. SWMM5 Application Programming Interface and PySWMM: A Python Interfacing Wrapper

    EPA Science Inventory

    In support of the OpenWaterAnalytics open source initiative, the PySWMM project encompasses the development of a Python interfacing wrapper to SWMM5 with parallel ongoing development of the USEPA Stormwater Management Model (SWMM5) application programming interface (API). ...

  2. A modeling paradigm for interdisciplinary water resources modeling: Simple Script Wrappers (SSW)

    NASA Astrophysics Data System (ADS)

    Steward, David R.; Bulatewicz, Tom; Aistrup, Joseph A.; Andresen, Daniel; Bernard, Eric A.; Kulcsar, Laszlo; Peterson, Jeffrey M.; Staggenborg, Scott A.; Welch, Stephen M.

    2014-05-01

    Holistic understanding of a water resources system requires tools capable of model integration. This team has developed an adaptation of the OpenMI (Open Modelling Interface) that allows easy interactions across the data passed between models. Capabilities have been developed to allow programs written in common languages such as MATLAB, Python and Scilab to share their data with other programs and accept other programs' data. We call this interface the Simple Script Wrapper (SSW). An implementation of SSW is shown that integrates groundwater, economic, and agricultural models in the High Plains region of Kansas. Output from these models illustrates the interdisciplinary discovery facilitated through use of SSW implemented models. Reference: Bulatewicz, T., A. Allen, J.M. Peterson, S. Staggenborg, S.M. Welch, and D.R. Steward, The Simple Script Wrapper for OpenMI: Enabling interdisciplinary modeling studies, Environmental Modelling & Software, 39, 283-294, 2013. http://dx.doi.org/10.1016/j.envsoft.2012.07.006 http://code.google.com/p/simple-script-wrapper/

  3. Image acquisition in the Pi-of-the-Sky project

    NASA Astrophysics Data System (ADS)

    Jegier, M.; Nawrocki, K.; Poźniak, K.; Sokołowski, M.

    2006-10-01

    Modern astronomical image acquisition systems dedicated to sky surveys produce a large amount of data in a single measurement session. During one session that lasts a few hours it is possible to get as much as 100 GB of data. This large amount of data needs to be transferred from the camera and processed. This paper presents some aspects of image acquisition in a sky survey image acquisition system. It describes a dedicated USB Linux driver for the first version of the "Pi of The Sky" CCD camera (later versions also have an Ethernet interface) and the test program for the camera together with a driver-wrapper providing core device functionality. Finally, the paper contains a description of an algorithm for matching several images based on image features, i.e. star positions and their brightness.

  4. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters.

    PubMed

    Wang, Chunlin; Lefkowitz, Elliot J

    2004-10-28

    Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. 
Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist.
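
    The query segmentation idea described above can be sketched as follows. This is a minimal illustration, not the SS-Wrapper implementation: segment_queries, toy_search and qs_search are hypothetical names, and toy_search is a stand-in (exact substring matching) for a wrapped tool such as BLAST or HMMPFAM. A real wrapper would launch the external program once per segment, typically on separate cluster nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def segment_queries(queries, n_chunks):
    """Split the query list into nearly equal contiguous segments."""
    n_chunks = max(1, min(n_chunks, len(queries)))
    base, extra = divmod(len(queries), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        size = base + (1 if i < extra else 0)   # spread the remainder evenly
        chunks.append(queries[start:start + size])
        start += size
    return chunks

def toy_search(chunk, database):
    """Hypothetical stand-in for the wrapped search tool (substring match)."""
    return [(q, [name for name, seq in database.items() if q in seq])
            for q in chunk]

def qs_search(queries, database, n_workers=4, search=toy_search):
    """Run `search` on each query segment concurrently and merge the hits."""
    chunks = segment_queries(queries, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in segment order, so output order matches input.
        parts = pool.map(lambda c: search(c, database), chunks)
        return [hit for part in parts for hit in part]
```

    Because the search function is passed in as a parameter and never modified, the same segmentation and merging logic can wrap different tools, which mirrors the design goal stated in the abstract.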

  5. SS-Wrapper: a package of wrapper applications for similarity searches on Linux clusters

    PubMed Central

    Wang, Chunlin; Lefkowitz, Elliot J

    2004-01-01

    Background Large-scale sequence comparison is a powerful tool for biological inference in modern molecular biology. Comparing new sequences to those in annotated databases is a useful source of functional and structural information about these sequences. Using software such as the basic local alignment search tool (BLAST) or HMMPFAM to identify statistically significant matches between newly sequenced segments of genetic material and those in databases is an important task for most molecular biologists. Searching algorithms are intrinsically slow and data-intensive, especially in light of the rapid growth of biological sequence databases due to the emergence of high throughput DNA sequencing techniques. Thus, traditional bioinformatics tools are impractical on PCs and even on dedicated UNIX servers. To take advantage of larger databases and more reliable methods, high performance computation becomes necessary. Results We describe the implementation of SS-Wrapper (Similarity Search Wrapper), a package of wrapper applications that can parallelize similarity search applications on a Linux cluster. Our wrapper utilizes a query segmentation-search (QS-search) approach to parallelize sequence database search applications. It takes into consideration load balancing between each node on the cluster to maximize resource usage. QS-search is designed to wrap many different search tools, such as BLAST and HMMPFAM using the same interface. This implementation does not alter the original program, so newly obtained programs and program updates should be accommodated easily. Benchmark experiments using QS-search to optimize BLAST and HMMPFAM showed that QS-search accelerated the performance of these programs almost linearly in proportion to the number of CPUs used. 
We have also implemented a wrapper that utilizes a database segmentation approach (DS-BLAST) that provides a complementary solution for BLAST searches when the database is too large to fit into the memory of a single node. Conclusions Used together, QS-search and DS-BLAST provide a flexible solution to adapt sequential similarity searching applications in high performance computing environments. Their ease of use and their ability to wrap a variety of database search programs provide an analytical architecture to assist both the seasoned bioinformaticist and the wet-bench biologist. PMID:15511296

  6. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Collins, Benjamin S.

    The Futility package contains the following: 1) Definition of the size of integers and real numbers; 2) A generic Unit test harness; 3) Definitions for some basic extensions to the Fortran language: arbitrary length strings, a parameter list construct, exception handlers, command line processor, timers; 4) Geometry definitions: point, line, plane, box, cylinder, polyhedron; 5) File wrapper functions: standard Fortran input/output files, Fortran binary files, HDF5 files; 6) Parallel wrapper functions: MPI and OpenMP abstraction layers, partitioning algorithms; 7) Math utilities: BLAS, Matrix and Vector definitions, Linear Solver methods and wrappers for other TPLs (PETSc, MKL, etc.), preconditioner classes; 8) Misc: random number generator, water saturation properties, sorting algorithms.

  7. 77 FR 6821 - Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-02-09

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-756] Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same; Request for Statements on the Public Interest AGENCY: U.S. International Trade Commission. ACTION: Notice. SUMMARY: Notice is hereby given that the...

  8. SpiceyPy, a Python Wrapper for SPICE

    NASA Astrophysics Data System (ADS)

    Annex, A.

    2017-06-01

    SpiceyPy is an open source Python wrapper for the NAIF SPICE toolkit. It is available for macOS, Linux, and Windows platforms and for Python versions 2.7.x and 3.x as well as Anaconda. SpiceyPy can be installed by running: “pip install spiceypy.”

  9. 7 CFR 29.3646 - Wrappers (A Group).

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.3646 Wrappers (A Group). This group consists of leaves from the Heavy Leaf and the Thin Leaf groups. Cured leaves of the A group are very elastic, have small..., medium body, open leaf structure, smooth, rich in oil, clear finish, deep color intensity elastic...

  10. 7 CFR 29.3646 - Wrappers (A Group).

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.3646 Wrappers (A Group). This group consists of leaves from the Heavy Leaf and the Thin Leaf groups. Cured leaves of the A group are very elastic, have small..., medium body, open leaf structure, smooth, rich in oil, clear finish, deep color intensity elastic...

  11. 7 CFR 29.3646 - Wrappers (A Group).

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.3646 Wrappers (A Group). This group consists of leaves from the Heavy Leaf and the Thin Leaf groups. Cured leaves of the A group are very elastic, have small..., medium body, open leaf structure, smooth, rich in oil, clear finish, deep color intensity elastic...

  12. 7 CFR 29.3646 - Wrappers (A Group).

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.3646 Wrappers (A Group). This group consists of leaves from the Heavy Leaf and the Thin Leaf groups. Cured leaves of the A group are very elastic, have small..., medium body, open leaf structure, smooth, rich in oil, clear finish, deep color intensity elastic...

  13. 7 CFR 29.3646 - Wrappers (A Group).

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.3646 Wrappers (A Group). This group consists of leaves from the Heavy Leaf and the Thin Leaf groups. Cured leaves of the A group are very elastic, have small..., medium body, open leaf structure, smooth, rich in oil, clear finish, deep color intensity elastic...

  14. 21 CFR 801.437 - User labeling for devices that contain natural rubber.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... User labeling for devices that contain natural rubber. (a) Data in the Medical Device Reporting System... of the device packaging, the outside package, container or wrapper, and the immediate device package... panel of the device packaging, the outside package, container or wrapper, and the immediate device...

  15. 21 CFR 801.437 - User labeling for devices that contain natural rubber.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... User labeling for devices that contain natural rubber. (a) Data in the Medical Device Reporting System... of the device packaging, the outside package, container or wrapper, and the immediate device package... panel of the device packaging, the outside package, container or wrapper, and the immediate device...

  16. 21 CFR 801.437 - User labeling for devices that contain natural rubber.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... User labeling for devices that contain natural rubber. (a) Data in the Medical Device Reporting System... of the device packaging, the outside package, container or wrapper, and the immediate device package... panel of the device packaging, the outside package, container or wrapper, and the immediate device...

  17. 21 CFR 801.437 - User labeling for devices that contain natural rubber.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... User labeling for devices that contain natural rubber. (a) Data in the Medical Device Reporting System... of the device packaging, the outside package, container or wrapper, and the immediate device package... panel of the device packaging, the outside package, container or wrapper, and the immediate device...

  18. 21 CFR 801.437 - User labeling for devices that contain natural rubber.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... User labeling for devices that contain natural rubber. (a) Data in the Medical Device Reporting System... of the device packaging, the outside package, container or wrapper, and the immediate device package... panel of the device packaging, the outside package, container or wrapper, and the immediate device...

  19. Comparing supervised learning techniques on the task of physical activity recognition.

    PubMed

    Dalton, A; OLaighin, G

    2013-01-01

    The objective of this study was to compare the performance of base-level and meta-level classifiers on the task of physical activity recognition. Five wireless kinematic sensors were attached to each subject (n = 25) while they completed a range of basic physical activities in a controlled laboratory setting. Subjects were then asked to carry out similar self-annotated physical activities in a random order and in an unsupervised environment. A combination of time-domain and frequency-domain features was extracted from the sensor data, including the first four central moments, zero-crossing rate, average magnitude, sensor cross-correlation, sensor auto-correlation, spectral entropy and dominant frequency components. A reduced feature set was generated using a wrapper subset evaluation technique with a linear forward search and this feature set was employed for classifier comparison. The meta-level classifier AdaBoostM1 with C4.5 Graft as its base-level classifier achieved an overall accuracy of 95%. Equal-sized datasets of subject-independent data and subject-dependent data were used to train this classifier and high recognition rates could be achieved without the need for user-specific training. Furthermore, it was found that an accuracy of 88% could be achieved using data from the ankle and wrist sensors only.
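
    A wrapper subset evaluation scores each candidate feature subset by the cross-validated accuracy of a classifier trained on it. The sketch below illustrates this with plain greedy forward selection, which is a simpler search than the linear forward search named in the abstract, and with a nearest-centroid classifier as an assumed stand-in for the classifiers used in the study. The names cv_accuracy and forward_select are hypothetical.

```python
def cv_accuracy(rows, labels, cols, n_folds=5):
    """Cross-validated accuracy of a nearest-centroid classifier on `cols`."""
    n, correct = len(rows), 0
    for fold in range(n_folds):
        test = set(range(fold, n, n_folds))     # interleaved folds
        sums, counts = {}, {}
        for i in range(n):
            if i in test:
                continue                        # centroids use training rows only
            acc = sums.setdefault(labels[i], [0.0] * len(cols))
            for k, c in enumerate(cols):
                acc[k] += rows[i][c]
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        centroids = {lab: [s / counts[lab] for s in acc]
                     for lab, acc in sums.items()}
        for i in test:
            pred = min(centroids, key=lambda lab: sum(
                (rows[i][c] - centroids[lab][k]) ** 2
                for k, c in enumerate(cols)))
            correct += pred == labels[i]
    return correct / n

def forward_select(rows, labels, n_features):
    """Greedily add the feature that most improves the wrapper score."""
    selected, best = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        score, j = max((cv_accuracy(rows, labels, selected + [c]), c)
                       for c in remaining)
        if score <= best:
            break                               # no candidate improves accuracy
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best
```

    Each step retrains and re-evaluates the classifier once per remaining feature, which is why wrapper searches become expensive as the number of features grows.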

  20. 30 CFR 15.22 - Tolerances for performance, wrapper, and specific gravity.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... specific gravity. 15.22 Section 15.22 Mineral Resources MINE SAFETY AND HEALTH ADMINISTRATION, DEPARTMENT... performance, wrapper, and specific gravity. (a) The rate of detonation of the explosive shall be within ±15... within ±2 grams of that specified in the approval. (c) The apparent specific gravity of the explosive...

  1. An HL7-CDA wrapper for facilitating semantic interoperability to rule-based Clinical Decision Support Systems.

    PubMed

    Sáez, Carlos; Bresó, Adrián; Vicente, Javier; Robles, Montserrat; García-Gómez, Juan Miguel

    2013-03-01

    The success of Clinical Decision Support Systems (CDSS) greatly depends on their capability of being integrated in Health Information Systems (HIS). Several proposals have been published to date to let CDSS gather patient data from HIS. Some base the CDSS data input on the HL7 reference model; however, they are tailored to specific CDSS or clinical guideline technologies, or do not focus on standardizing the knowledge the CDSS produces. We propose a solution for facilitating semantic interoperability to rule-based CDSS, focusing on standardized input and output documents forming an HL7-CDA wrapper. We define the HL7-CDA restrictions in an HL7-CDA implementation guide. Patient data and rule inference results are mapped respectively to and from the CDSS by means of a binding method based on an XML binding file. As an independent clinical document, the results of a CDSS can present clinical and legal validity. The proposed solution is being applied in a CDSS for providing patient-specific recommendations for the care management of outpatients with diabetes mellitus. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  2. 7 CFR 29.2661 - Wrappers (A Group).

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.2661 Wrappers (A Group). This group consists of leaves usually grown at or above the center portion of the stalk. Cured leaves of this group are elastic and show... color intensity, spready, 90 percent uniform, and 10 percent of leaves not lower than B1 or C1. A2F Fine...

  3. 7 CFR 29.2661 - Wrappers (A Group).

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.2661 Wrappers (A Group). This group consists of leaves usually grown at or above the center portion of the stalk. Cured leaves of this group are elastic and show... color intensity, spready, 90 percent uniform, and 10 percent of leaves not lower than B1 or C1. A2F Fine...

  4. 7 CFR 29.2661 - Wrappers (A Group).

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.2661 Wrappers (A Group). This group consists of leaves usually grown at or above the center portion of the stalk. Cured leaves of this group are elastic and show... color intensity, spready, 90 percent uniform, and 10 percent of leaves not lower than B1 or C1. A2F Fine...

  5. 7 CFR 29.2661 - Wrappers (A Group).

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... REGULATIONS TOBACCO INSPECTION Standards Grades § 29.2661 Wrappers (A Group). This group consists of leaves usually grown at or above the center portion of the stalk. Cured leaves of this group are elastic and show... color intensity, spready, 90 percent uniform, and 10 percent of leaves not lower than B1 or C1. A2F Fine...

  6. An integrated approach for identifying wrongly labelled samples when performing classification in microarray data.

    PubMed

    Leung, Yuk Yee; Chang, Chun Qi; Hung, Yeung Sam

    2012-01-01

    Using a hybrid approach for gene selection and classification is common, as the results obtained are generally better than performing the two tasks independently. Yet, for some microarray datasets, both the classification accuracy and the stability of the gene sets obtained still have room for improvement. This may be due to the presence of samples with wrong class labels (i.e. outliers). Outlier detection algorithms proposed so far are either not suitable for microarray data, or only solve the outlier detection problem on their own. We tackle the outlier detection problem based on a previously proposed Multiple-Filter-Multiple-Wrapper (MFMW) model, which was demonstrated to yield promising results when compared to other hybrid approaches (Leung and Hung, 2010). To incorporate outlier detection and overcome limitations of the existing MFMW model, three new features are introduced in our proposed MFMW-outlier approach: 1) an unbiased external Leave-One-Out Cross-Validation framework is developed to replace internal cross-validation in the previous MFMW model; 2) wrongly labeled samples are identified within the MFMW-outlier model; and 3) a stable set of genes is selected using an L1-norm SVM that removes any redundant genes present. Six binary-class microarray datasets were tested. Compared with outlier detection studies on the same datasets, MFMW-outlier could detect all the outliers found in the original paper (for which the data was provided for analysis), and the genes selected after outlier removal were proven to have biological relevance. We also compared MFMW-outlier with PRAPIV (Zhang et al., 2006) based on the same synthetic datasets. MFMW-outlier gave better average precision and recall values on three different settings. Lastly, artificially flipped microarray datasets were created by removing our detected outliers and flipping some of the remaining samples' labels. Almost all the 'wrong' (artificially flipped) samples were detected, suggesting that MFMW-outlier was sufficiently powerful to detect outliers in high-dimensional microarray datasets.
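
    The point of an external leave-one-out cross-validation is that feature selection is repeated inside every fold, so the held-out sample never influences which features are chosen. The sketch below shows that structure only. It is not the MFMW-outlier method: a single mean-difference filter and a nearest-centroid classifier are assumed stand-ins, labels are assumed to be 0 and 1, and the names top_k_features and external_loocv are hypothetical.

```python
def top_k_features(rows, labels, k):
    """Rank features by the absolute difference of the two class means."""
    n_feat = len(rows[0])

    def class_mean(lab, j):
        vals = [r[j] for r, l in zip(rows, labels) if l == lab]
        return sum(vals) / len(vals)

    gaps = [abs(class_mean(1, j) - class_mean(0, j)) for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: -gaps[j])[:k]

def external_loocv(rows, labels, k=1):
    """Leave-one-out accuracy with feature selection redone in every fold."""
    hits = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        # Selection sees the training samples only, never sample i.
        feats = top_k_features(train_rows, train_labels, k)
        centroids = {}
        for lab in (0, 1):
            members = [r for r, l in zip(train_rows, train_labels) if l == lab]
            centroids[lab] = [sum(r[j] for r in members) / len(members)
                              for j in feats]
        pred = min((0, 1), key=lambda lab: sum(
            (rows[i][j] - c) ** 2 for j, c in zip(feats, centroids[lab])))
        hits += pred == labels[i]
    return hits / len(rows)
```

    Selecting features once on the full dataset and then cross-validating only the classifier would let every held-out sample leak into the selection step, which tends to give optimistic accuracy estimates on high-dimensional data.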

  7. [Protective role of autotypic contacts under cerebellar neural net injury by toxic doses of NO-generative compounds].

    PubMed

    Samosudova, N V; Reutov, V P; Larionova, N P; Chaĭlakhian, L M

    2005-01-01

    In the present work, cerebellar neural net injury was induced by toxic doses of an NO-generative compound (NaNO2). A protective role of glial cells was revealed under these conditions. The present results were compared with those of previous work on the action of high-concentration glutamate on the frog cerebellum (Samosudova et al., 1996). In both cases we observed the appearance of spiral-like structures--"wrappers"--involving several rows of transformed glial processes with smaller width and bridges connecting the inner sides of the rows (autotypic contact). A statistical analysis was made using both previous and present data. We calculated the number and width of rows, and the intervals between bridges, depending on experimental conditions. As the injury increased (stimulation in the presence of NO), the row number in "wrappers" also increased, while the row width and intervals between bridges decreased. The presence of autotypic contacts in glial "wrappers" suggests the involvement of adhesive proteins--cadherins--in their formation. The obtained data suggest that the formation of spiral structures--"wrappers"--may be regarded as a compensatory-adaptive reaction to injury of the cerebellar neural net by glutamate and NO-generative compounds.

  8. An Architecture for Standardized Terminology Services by Wrapping and Integration of Existing Applications

    PubMed Central

    Cornet, Ronald; Prins, Antoon K.

    2003-01-01

    Research on terminology services has resulted in development of applications and definition of standards, but has not yet led to widespread use of (standardized) terminology services in practice. Current terminology services offer functionality both for concept representation and lexical knowledge representation, hampering the possibility of combining the strengths of dedicated (concept and lexical) services. We therefore propose an extensible architecture in which concept-related and lexicon-related components are integrated and made available through a uniform interface. This interface can be extended in order to conform to existing standards, making it possible to use dedicated (third-party) components in a standardized way. As a proof of concept and a reference implementation, a SOAP-based Java implementation of the terminology service is being developed, providing wrappers for Protégé and UMLS Knowledge Source Server. Other systems, such as the Description Logic-based reasoner RACER can be easily integrated by implementation of an appropriate wrapper. PMID:14728158

  9. Thermal-Aware Test Access Mechanism and Wrapper Design Optimization for System-on-Chips

    NASA Astrophysics Data System (ADS)

    Yu, Thomas Edison; Yoneda, Tomokazu; Chakrabarty, Krishnendu; Fujiwara, Hideo

    Rapid advances in semiconductor manufacturing technology have led to higher chip power densities, which places greater emphasis on packaging and temperature control during testing. For system-on-chips, peak power-based scheduling algorithms have been used to optimize tests under specified power constraints. However, imposing power constraints does not always solve the problem of overheating due to the non-uniform distribution of power across the chip. This paper presents a TAM/Wrapper co-design methodology for system-on-chips that ensures thermal safety while still optimizing the test schedule. The method combines a simplified thermal-cost model with a traditional bin-packing algorithm to minimize test time while satisfying temperature constraints. Furthermore, for temperature checking, thermal simulation is done using cycle-accurate power profiles for more realistic results. Experiments show that even a minimal sacrifice in test time can yield a considerable decrease in test temperature as well as the possibility of further lowering temperatures beyond those achieved using traditional power-based test scheduling.

  10. Inexpensive Audio Activities: Earbud-Based Sound Experiments

    ERIC Educational Resources Information Center

    Allen, Joshua; Boucher, Alex; Meggison, Dean; Hruby, Kate; Vesenka, James

    2016-01-01

    Inexpensive alternatives to a number of classic introductory physics sound laboratories are presented including interference phenomena, resonance conditions, and frequency shifts. These can be created using earbuds, economical supplies such as Giant Pixie Stix® wrappers, and free software available for PCs and mobile devices. We describe two…

  11. ClearTK 2.0: Design Patterns for Machine Learning in UIMA

    PubMed Central

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-01-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework. PMID:29104966

  12. ClearTK 2.0: Design Patterns for Machine Learning in UIMA.

    PubMed

    Bethard, Steven; Ogren, Philip; Becker, Lee

    2014-05-01

    ClearTK adds machine learning functionality to the UIMA framework, providing wrappers to popular machine learning libraries, a rich feature extraction library that works across different classifiers, and utilities for applying and evaluating machine learning models. Since its inception in 2008, ClearTK has evolved in response to feedback from developers and the community. This evolution has followed a number of important design principles including: conceptually simple annotator interfaces, readable pipeline descriptions, minimal collection readers, type system agnostic code, modules organized for ease of import, and assisting user comprehension of the complex UIMA framework.

  13. Web-based interactive 2D/3D medical image processing and visualization software.

    PubMed

    Mahmoudi, Seyyed Ehsan; Akhondi-Asl, Alireza; Rahmani, Roohollah; Faghih-Roohi, Shahrooz; Taimouri, Vahid; Sabouri, Ahmad; Soltanian-Zadeh, Hamid

    2010-05-01

    There are many medical image processing software tools available for research and diagnosis purposes. However, most of these tools are available only as local applications. This limits the accessibility of the software to a specific machine, and thus the data and processing power of that application are not available to other workstations. Further, there are operating system and processing power limitations which prevent such applications from running on every type of workstation. By developing web-based tools, it is possible for users to access the medical image processing functionalities wherever the internet is available. In this paper, we introduce a pure web-based, interactive, extendable, 2D and 3D medical image processing and visualization application that requires no client installation. Our software uses a four-layered design consisting of an algorithm layer, web-user-interface layer, server communication layer, and wrapper layer. To match the extensibility of current local medical image processing software, each layer is highly independent of the other layers. A wide range of medical image preprocessing, registration, and segmentation methods are implemented using open source libraries. Desktop-like user interaction is provided by using AJAX technology in the web-user-interface. For the visualization functionality of the software, the VRML standard is used to provide 3D features over the web. Integration of these technologies has allowed implementation of our purely web-based software with high functionality without requiring powerful computational resources on the client side. The user-interface is designed such that the users can select appropriate parameters for practical research and clinical studies. Copyright (c) 2009 Elsevier Ireland Ltd. All rights reserved.

  14. Tree Alignment Based on Needleman-Wunsch Algorithm for Sensor Selection in Smart Homes.

    PubMed

    Chua, Sook-Ling; Foo, Lee Kien

    2017-08-18

    Activity recognition in smart homes aims to infer the particular activities of the inhabitant in order to monitor those activities and identify any abnormalities, especially for those living alone. In order for a smart home to support its inhabitant, the recognition system needs to learn from observations acquired through sensors. One question that often arises is which sensors are useful and how many sensors are required to accurately recognise the inhabitant's activities? Many wrapper methods have been proposed and remain among the most popular evaluators for sensor selection due to their superior accuracy. However, they are prohibitively slow during the evaluation process and risk overfitting due to the extent of the search. Motivated by these drawbacks, this paper attempts to reduce the cost of the evaluation process and the risk of overfitting through tree alignment. The performance of our method is evaluated on two public datasets obtained in two distinct smart home environments.

  15. Ecofriendly Fruit Switches: Graphene Oxide-Based Wrapper for Programmed Fruit Preservative Delivery To Extend Shelf Life.

    PubMed

    Sharma, Sandeep; Biswal, Badal Kumar; Kumari, Divya; Bindra, Pulkit; Kumar, Satish; Stobdan, Tsering; Shanmugam, Vijayakumar

    2018-05-21

    According to Food and Agriculture Organization 2015 report, post-harvest agricultural loss accounts for 20-50% annually; on the other hand, reports about preservatives toxicity are also increasing. Hence, preservative release with response to fruit requirement is desired. In this study, acid synthesized in the overripe fruits was envisaged to cleave acid labile hydrazone to release preservative salicylaldehyde from graphene oxide (GO). To maximize loading and to overcome the challenge of GO reduction by hydrazine, two-step activation with ethylenediamine and 4-nitrophenyl chloroformate respectively, are followed. The final composite shows efficient preservative release with the stimuli of the overripe fruit juice and improves the fruit shelf life. The composite shows less toxicity as compared to the free preservative along with the additional scope to reuse. The composite was vacuum-filtered through a 0.4 μm filter paper, to prepare a robust wrapper for the fruit storage.

  16. A stacking ensemble learning framework for annual river ice breakup dates

    NASA Astrophysics Data System (ADS)

    Sun, Wei; Trevor, Bernard

    2018-06-01

    River ice breakup dates (BDs) are not merely a proxy indicator of climate variability and change, but a direct concern in the management of local ice-caused flooding. A framework of stacking ensemble learning for annual river ice BDs was developed, which included two-level components: member and combining models. The member models described the relations between BD and their affecting indicators; the combining models linked the predicted BD by each member models with the observed BD. Especially, Bayesian regularization back-propagation artificial neural network (BRANN), and adaptive neuro fuzzy inference systems (ANFIS) were employed as both member and combining models. The candidate combining models also included the simple average methods (SAM). The input variables for member models were selected by a hybrid filter and wrapper method. The performances of these models were examined using the leave-one-out cross validation. As the largest unregulated river in Alberta, Canada with ice jams frequently occurring in the vicinity of Fort McMurray, the Athabasca River at Fort McMurray was selected as the study area. The breakup dates and candidate affecting indicators in 1980-2015 were collected. The results showed that, the BRANN member models generally outperformed the ANFIS member models in terms of better performances and simpler structures. The difference between the R and MI rankings of inputs in the optimal member models may imply that the linear correlation based filter method would be feasible to generate a range of candidate inputs for further screening through other wrapper or embedded IVS methods. The SAM and BRANN combining models generally outperformed all member models. The optimal SAM combining model combined two BRANN member models and improved upon them in terms of average squared errors by 14.6% and 18.1% respectively. 
    In this study, for the first time, the stacking ensemble learning was applied to forecasting of river ice breakup dates, which appeared promising for other river ice forecasting problems.
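
    The two-level structure described above (member models combined by a simple average, evaluated with leave-one-out cross validation) can be sketched as follows. This is an illustration under stated assumptions: one-input least-squares lines stand in for the BRANN and ANFIS member models of the study, and all numbers and function names are made up for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def loo_predictions(xs, ys):
    """Leave-one-out predictions of a one-input linear member model."""
    preds = []
    for i in range(len(xs)):
        # Fit on every year except year i, then predict year i.
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a + b * xs[i])
    return preds


def simple_average(member_preds):
    """Simple average combining model: mean of the member predictions per sample."""
    return [sum(p) / len(p) for p in zip(*member_preds)]


def mse(pred, obs):
    """Average squared error between predictions and observations."""
    return sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs)


# Made-up indicators and breakup dates (day of year), NOT data from the study.
indicator_1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
indicator_2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
breakup_day = [110.0, 111.5, 113.0, 113.5, 116.0, 116.5]

members = [loo_predictions(indicator_1, breakup_day),
           loo_predictions(indicator_2, breakup_day)]
combined = simple_average(members)
print([mse(m, breakup_day) for m in members], mse(combined, breakup_day))
```

    Because squared error is convex, the averaged prediction can never have a larger average squared error than the mean of the member errors, which is one reason a simple average is a reasonable baseline combiner.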

  17. Moisture separator reheater with round tube bundle

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Byerley, W. M.

    1984-11-27

    A moisture separator reheater having a central chamber with cylindrical wall portions and a generally round tube bundle, the tube bundle having arcuate plates disposed on each side of the bundle which form a wrapper on each side of the bundle and having a tongue and groove juncture between the wrapper and cylindrical wall portions to provide a seal therebetween and a track for installing and removing the tube bundle from the central chamber.

  18. An OpenMI Implementation of a Water Resources System using Simple Script Wrappers

    NASA Astrophysics Data System (ADS)

    Steward, D. R.; Aistrup, J. A.; Kulcsar, L.; Peterson, J. M.; Welch, S. M.; Andresen, D.; Bernard, E. A.; Staggenborg, S. A.; Bulatewicz, T.

    2013-12-01

    This team has developed an adaptation of the Open Modelling Interface (OpenMI) that utilizes Simple Script Wrappers. Code is made OpenMI compliant through organization within three modules that initialize, perform time steps, and finalize results. A configuration file is prepared that specifies the variables a model expects to receive as input and those it will make available as output. An example is presented for groundwater, economic, and agricultural production models in the High Plains Aquifer region of Kansas. Our models use the programming environments in Scilab and Matlab, along with legacy Fortran code, and our Simple Script Wrappers can also use Python. These models are collectively run within this interdisciplinary framework from initial conditions into the future. It will be shown that, by applying constraints to one model, the impact on changes to the water resources system may be assessed.
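
    The three-module organization described above (initialize, perform time steps, finalize) plus a declaration of expected inputs and outputs can be sketched in Python as below. This is not the OpenMI API or the authors' Simple Script Wrapper code: the class, the `config` dictionary, the `run` driver, and the toy storage equation are all hypothetical stand-ins chosen for illustration.

```python
class GroundwaterModel:
    """Toy model organized as initialize / perform_time_step / finalize."""

    # Stand-in for the configuration file: the variables the model expects
    # to receive as input and those it makes available as output.
    config = {"inputs": ["pumping"], "outputs": ["storage"]}

    def __init__(self, initial_storage=100.0, recharge=2.0):
        self.initial_storage = initial_storage
        self.recharge = recharge

    def initialize(self):
        # Set the initial conditions before the first time step.
        self.storage = self.initial_storage
        self.history = []

    def perform_time_step(self, inputs):
        # Check that every declared input was supplied by the framework.
        missing = [name for name in self.config["inputs"] if name not in inputs]
        if missing:
            raise KeyError(f"missing inputs: {missing}")
        # Hypothetical water balance: storage gains recharge, loses pumping.
        self.storage += self.recharge - inputs["pumping"]
        self.history.append(self.storage)
        return {"storage": self.storage}

    def finalize(self):
        # Hand back the collected results at the end of the run.
        return list(self.history)


def run(model, steps, supply_inputs):
    """Drive one wrapped model from initial conditions through `steps` steps.

    supply_inputs(t, previous_outputs) plays the role of the linked models
    that would provide this model's inputs in a coupled run.
    """
    model.initialize()
    outputs = {}
    for t in range(steps):
        outputs = model.perform_time_step(supply_inputs(t, outputs))
    return model.finalize()


# Constant pumping supplied by a hypothetical upstream (e.g. economic) model.
print(run(GroundwaterModel(), 3, lambda t, prev: {"pumping": 5.0}))
```

    Keeping the same three entry points for every model is what lets a generic driver step several models together without knowing their internals.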

  19. Use of data mining techniques to classify soil CO2 emission induced by crop management in sugarcane field.

    PubMed

    Farhate, Camila Viana Vieira; Souza, Zigomar Menezes de; Oliveira, Stanley Robson de Medeiros; Tavares, Rose Luiza Moraes; Carvalho, João Luís Nunes

    2018-01-01

    Soil CO2 emissions are regarded as one of the largest flows of the global carbon cycle and small changes in their magnitude can have a large effect on the CO2 concentration in the atmosphere. Thus, a better understanding of this attribute would enable the identification of promoters and the development of strategies to mitigate the risks of climate change. Therefore, our study aimed at using data mining techniques to predict the soil CO2 emission induced by crop management in sugarcane areas in Brazil. To do so, we used different variable selection methods (correlation, chi-square, wrapper) and classification (Decision tree, Bayesian models, neural networks, support vector machine, bagging with logistic regression), and finally we tested the efficiency of different approaches through the Receiver Operating Characteristic (ROC) curve. The original dataset consisted of 19 variables (18 independent variables and one dependent (or response) variable). The association between cover crop and minimum tillage are effective strategies to promote the mitigation of soil CO2 emissions, in which the average CO2 emissions are 63 kg ha-1 day-1. The variables soil moisture, soil temperature (Ts), rainfall, pH, and organic carbon were most frequently selected for soil CO2 emission classification using different methods for attribute selection. According to the results of the ROC curve, the best approaches for soil CO2 emission classification were the following: (I)-the Multilayer Perceptron classifier with attribute selection through the wrapper method, that presented rate of false positive of 13,50%, true positive of 94,20% area under the curve (AUC) of 89,90% (II)-the Bagging classifier with logistic regression with attribute selection through the Chi-square method, that presented rate of false positive of 13,50%, true positive of 94,20% AUC of 89,90%. 
    However, the (I) approach stands out in relation to (II) for its higher positive class accuracy (high CO2 emission) and lower computational cost.

  20. Use of data mining techniques to classify soil CO2 emission induced by crop management in sugarcane field

    PubMed Central

    de Souza, Zigomar Menezes; Oliveira, Stanley Robson de Medeiros; Tavares, Rose Luiza Moraes; Carvalho, João Luís Nunes

    2018-01-01

    Soil CO2 emissions are regarded as one of the largest flows of the global carbon cycle and small changes in their magnitude can have a large effect on the CO2 concentration in the atmosphere. Thus, a better understanding of this attribute would enable the identification of promoters and the development of strategies to mitigate the risks of climate change. Therefore, our study aimed at using data mining techniques to predict the soil CO2 emission induced by crop management in sugarcane areas in Brazil. To do so, we used different variable selection methods (correlation, chi-square, wrapper) and classification (Decision tree, Bayesian models, neural networks, support vector machine, bagging with logistic regression), and finally we tested the efficiency of different approaches through the Receiver Operating Characteristic (ROC) curve. The original dataset consisted of 19 variables (18 independent variables and one dependent (or response) variable). The association between cover crop and minimum tillage are effective strategies to promote the mitigation of soil CO2 emissions, in which the average CO2 emissions are 63 kg ha-1 day-1. The variables soil moisture, soil temperature (Ts), rainfall, pH, and organic carbon were most frequently selected for soil CO2 emission classification using different methods for attribute selection. According to the results of the ROC curve, the best approaches for soil CO2 emission classification were the following: (I)–the Multilayer Perceptron classifier with attribute selection through the wrapper method, that presented rate of false positive of 13,50%, true positive of 94,20% area under the curve (AUC) of 89,90% (II)–the Bagging classifier with logistic regression with attribute selection through the Chi-square method, that presented rate of false positive of 13,50%, true positive of 94,20% AUC of 89,90%. 
    However, the (I) approach stands out in relation to (II) for its higher positive class accuracy (high CO2 emission) and lower computational cost. PMID:29513765

  1. Prediction model of potential hepatocarcinogenicity of rat hepatocarcinogens using a large-scale toxicogenomics database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Uehara, Takeki, E-mail: takeki.uehara@shionogi.co.jp; Toxicogenomics Informatics Project, National Institute of Biomedical Innovation, 7-6-8 Asagi, Ibaraki, Osaka 567-0085; Minowa, Yohsuke

    2011-09-15

    The present study was performed to develop a robust gene-based prediction model for early assessment of potential hepatocarcinogenicity of chemicals in rats by using our toxicogenomics database, TG-GATEs (Genomics-Assisted Toxicity Evaluation System developed by the Toxicogenomics Project in Japan). The positive training set consisted of high- or middle-dose groups that received 6 different non-genotoxic hepatocarcinogens during a 28-day period. The negative training set consisted of high- or middle-dose groups of 54 non-carcinogens. Support vector machine combined with wrapper-type gene selection algorithms was used for modeling. Consequently, our best classifier yielded prediction accuracies for hepatocarcinogenicity of 99% sensitivity and 97% specificity in the training data set, and false positive prediction was almost completely eliminated. Pathway analysis of feature genes revealed that the mitogen-activated protein kinase p38- and phosphatidylinositol-3-kinase-centered interactome and the v-myc myelocytomatosis viral oncogene homolog-centered interactome were the 2 most significant networks. The usefulness and robustness of our predictor were further confirmed in an independent validation data set obtained from the public database. Interestingly, similar positive predictions were obtained in several genotoxic hepatocarcinogens as well as non-genotoxic hepatocarcinogens. These results indicate that the expression profiles of our newly selected candidate biomarker genes might be common characteristics in the early stage of carcinogenesis for both genotoxic and non-genotoxic carcinogens in the rat liver. Our toxicogenomic model might be useful for the prospective screening of hepatocarcinogenicity of compounds and prioritization of compounds for carcinogenicity testing. Highlights: > We developed a toxicogenomic model to predict hepatocarcinogenicity of chemicals. > The optimized model consisting of 9 probes had 99% sensitivity and 97% specificity. > This model enables us to detect genotoxic as well as non-genotoxic hepatocarcinogens.

  2. ewrapper: Operationalizing engagement strategies in mHealth

    PubMed Central

    Wagner, Blake; Liu, Elaine; Shaw, Steven D.; Iakovlev, Gleb; Zhou, Linlu; Harrington, Christina; Abowd, Gregory; Yoon, Carolyn; Kumar, Santosh; Murphy, Susan; Spring, Bonnie; Nahum-Shani, Inbal

    2018-01-01

    The advancement of digital technologies particularly in the domain of mobile health (mHealth) holds great promise in the promotion of health behavior. However, keeping users engaged remains a central challenge. This paper proposes a novel approach to address this issue by supplementing existing and future mHealth applications with an engagement wrapper - a collection of engagement strategies integrated into a single, coherent model. The engagement wrapper is operationalized within the format of an ambient display on the lock screen of mobile devices. PMID:29362728

  3. ewrapper: Operationalizing engagement strategies in mHealth.

    PubMed

    Wagner, Blake; Liu, Elaine; Shaw, Steven D; Iakovlev, Gleb; Zhou, Linlu; Harrington, Christina; Abowd, Gregory; Yoon, Carolyn; Kumar, Santosh; Murphy, Susan; Spring, Bonnie; Nahum-Shani, Inbal

    2017-09-01

    The advancement of digital technologies particularly in the domain of mobile health (mHealth) holds great promise in the promotion of health behavior. However, keeping users engaged remains a central challenge. This paper proposes a novel approach to address this issue by supplementing existing and future mHealth applications with an engagement wrapper - a collection of engagement strategies integrated into a single, coherent model. The engagement wrapper is operationalized within the format of an ambient display on the lock screen of mobile devices.

  4. Real-Time Parallel Software Design Case Study: Implementation of the RASSP SAR Benchmark on the Intel Paragon.

    DTIC Science & Technology

    1996-01-01

    Real-Time 19 5 Conclusion 23 List of References 25 ii LIST OF FIGURES FIGURE PAGE 3-1 Test Bench Pseudo Code 7 3-2 Fast Convolution...3-1 shows pseudo - code for a test bench with two application nodes. The outer test bench wrapper consists of three functions: pipeline_init, pipeline...exit_func); Figure 3-1. Test Bench Pseudo Code The application wrapper is contained in the pipeline routine and similarly consists of an

  5. Elevated temperature tensile properties of P9 steel towards ferritic steel wrapper development for sodium cooled fast reactors

    NASA Astrophysics Data System (ADS)

    Choudhary, B. K.; Mathew, M. D.; Isaac Samuel, E.; Christopher, J.; Jayakumar, T.

    2013-11-01

    Tensile deformation and fracture behaviour of the three developmental heats of P9 steel for wrapper applications containing varying silicon in the range 0.24-0.60% have been examined in the temperature range 300-873 K. Yield and ultimate tensile strengths in all the three heats exhibited gradual decrease with increase in temperature from room to intermediate temperatures followed by rapid decrease at high temperatures. A gradual decrease in ductility to a minimum at intermediate temperatures followed by an increase at high temperatures has been observed. The fracture mode remained transgranular ductile. The steel displayed signatures of dynamic strain ageing at intermediate temperatures and dominance of recovery at high temperatures. No significant difference in the strength and ductility values was observed for varying silicon in the range 0.24-0.60% in P9 steel. P9 steel for wrapper application displayed strength and ductility values comparable to those reported in the literature.

  6. DOVIS 2.0: an efficient and easy to use parallel virtual screening tool based on AutoDock 4.0.

    PubMed

    Jiang, Xiaohui; Kumar, Kamal; Hu, Xin; Wallqvist, Anders; Reifman, Jaques

    2008-09-08

    Small-molecule docking is an important tool in studying receptor-ligand interactions and in identifying potential drug candidates. Previously, we developed a software tool (DOVIS) to perform large-scale virtual screening of small molecules in parallel on Linux clusters, using AutoDock 3.05 as the docking engine. DOVIS enables the seamless screening of millions of compounds on high-performance computing platforms. In this paper, we report significant advances in the software implementation of DOVIS 2.0, including enhanced screening capability, improved file system efficiency, and extended usability. To keep DOVIS up-to-date, we upgraded the software's docking engine to the more accurate AutoDock 4.0 code. We developed a new parallelization scheme to improve runtime efficiency and modified the AutoDock code to reduce excessive file operations during large-scale virtual screening jobs. We also implemented an algorithm to output docked ligands in an industry standard format, sd-file format, which can be easily interfaced with other modeling programs. Finally, we constructed a wrapper-script interface to enable automatic rescoring of docked ligands by arbitrarily selected third-party scoring programs. The significance of the new DOVIS 2.0 software compared with the previous version lies in its improved performance and usability. The new version makes the computation highly efficient by automating load balancing, significantly reducing excessive file operations by more than 95%, providing outputs that conform to industry standard sd-file format, and providing a general wrapper-script interface for rescoring of docked ligands. The new DOVIS 2.0 package is freely available to the public under the GNU General Public License.

  7. Temperature variability analysis using wavelets and multiscale entropy in patients with systemic inflammatory response syndrome, sepsis, and septic shock.

    PubMed

    Papaioannou, Vasilios E; Chouvarda, Ioanna G; Maglaveras, Nikos K; Pneumatikos, Ioannis A

    2012-12-12

    Even though temperature is a continuous quantitative variable, its measurement has been considered a snapshot of a process, indicating whether a patient is febrile or afebrile. Recently, other diagnostic techniques have been proposed for the association between different properties of the temperature curve with severity of illness in the Intensive Care Unit (ICU), based on complexity analysis of continuously monitored body temperature. In this study, we tried to assess temperature complexity in patients with systemic inflammation during a suspected ICU-acquired infection, by using wavelets transformation and multiscale entropy of temperature signals, in a cohort of mixed critically ill patients. Twenty-two patients were enrolled in the study. In five, systemic inflammatory response syndrome (SIRS, group 1) developed, 10 had sepsis (group 2), and seven had septic shock (group 3). All temperature curves were studied during the first 24 hours of an inflammatory state. A wavelet transformation was applied, decomposing the signal in different frequency components (scales) that have been found to reflect neurogenic and metabolic inputs on temperature oscillations. Wavelet energy and entropy per different scales associated with complexity in specific frequency bands and multiscale entropy of the whole signal were calculated. Moreover, a clustering technique and a linear discriminant analysis (LDA) were applied for permitting pattern recognition in data sets and assessing diagnostic accuracy of different wavelet features among the three classes of patients. Statistically significant differences were found in wavelet entropy between patients with SIRS and groups 2 and 3, and in specific ultradian bands between SIRS and group 3, with decreased entropy in sepsis. Cluster analysis using wavelet features in specific bands revealed concrete clusters closely related with the groups in focus. 
    LDA after wrapper-based feature selection was able to distinguish SIRS from the two sepsis groups with an accuracy of more than 80%, based on multiparametric patterns of entropy values in the very low frequencies, indicating reduced metabolic inputs on local thermoregulation, probably associated with extensive vasodilatation. We suggest that complexity analysis of temperature signals can assess inherent thermoregulatory dynamics during systemic inflammation and has increased discriminating value in patients with infectious versus noninfectious conditions, probably associated with severity of illness.
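
    Wrapper-based feature selection, as used in the record above, scores candidate feature subsets by the performance of the classifier itself. A minimal sketch of greedy forward selection follows. It rests on assumptions that should be read as such: a nearest-centroid classifier replaces the LDA of the study, leave-one-out accuracy is the score, and the data and function names are invented for the example.

```python
def nearest_centroid_predict(train_X, train_y, x, feats):
    """Predict the label of x using class centroids over the features in feats."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]

    def sq_dist(center):
        return sum((x[f] - c) ** 2 for f, c in zip(feats, center))

    # Sorting the labels makes ties resolve deterministically.
    return min(sorted(centroids), key=lambda label: sq_dist(centroids[label]))


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of the classifier restricted to feats."""
    hits = 0
    for i in range(len(X)):
        pred = nearest_centroid_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats)
        hits += pred == y[i]
    return hits / len(X)


def forward_selection(X, y):
    """Greedy wrapper: repeatedly add the feature that most improves the score."""
    remaining = list(range(len(X[0])))
    selected, best = [], 0.0
    while remaining:
        # The wrapper step: retrain and re-evaluate the classifier per candidate.
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, feat = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best:
            break  # no candidate improves the score, so stop
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best


# Invented data: feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 5.0], [0.2, 1.0], [0.1, 9.0], [1.0, 4.0], [0.9, 8.0], [1.1, 2.0]]
y = [0, 0, 0, 1, 1, 1]
print(forward_selection(X, y))
```

    Each round retrains the classifier once per remaining feature and per left-out sample, which is why wrapper methods become expensive as the number of features grows.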

  8. Temperature variability analysis using wavelets and multiscale entropy in patients with systemic inflammatory response syndrome, sepsis, and septic shock

    PubMed Central

    2012-01-01

    Background Even though temperature is a continuous quantitative variable, its measurement has been considered a snapshot of a process, indicating whether a patient is febrile or afebrile. Recently, other diagnostic techniques have been proposed for the association between different properties of the temperature curve with severity of illness in the Intensive Care Unit (ICU), based on complexity analysis of continuously monitored body temperature. In this study, we tried to assess temperature complexity in patients with systemic inflammation during a suspected ICU-acquired infection, by using wavelets transformation and multiscale entropy of temperature signals, in a cohort of mixed critically ill patients. Methods Twenty-two patients were enrolled in the study. In five, systemic inflammatory response syndrome (SIRS, group 1) developed, 10 had sepsis (group 2), and seven had septic shock (group 3). All temperature curves were studied during the first 24 hours of an inflammatory state. A wavelet transformation was applied, decomposing the signal in different frequency components (scales) that have been found to reflect neurogenic and metabolic inputs on temperature oscillations. Wavelet energy and entropy per different scales associated with complexity in specific frequency bands and multiscale entropy of the whole signal were calculated. Moreover, a clustering technique and a linear discriminant analysis (LDA) were applied for permitting pattern recognition in data sets and assessing diagnostic accuracy of different wavelet features among the three classes of patients. Results Statistically significant differences were found in wavelet entropy between patients with SIRS and groups 2 and 3, and in specific ultradian bands between SIRS and group 3, with decreased entropy in sepsis. Cluster analysis using wavelet features in specific bands revealed concrete clusters closely related with the groups in focus. 
    LDA after wrapper-based feature selection was able to distinguish SIRS from the two sepsis groups with an accuracy of more than 80%, based on multiparametric patterns of entropy values in the very low frequencies, indicating reduced metabolic inputs on local thermoregulation, probably associated with extensive vasodilatation. Conclusions We suggest that complexity analysis of temperature signals can assess inherent thermoregulatory dynamics during systemic inflammation and has increased discriminating value in patients with infectious versus noninfectious conditions, probably associated with severity of illness. PMID:22424316

  9. Design of a fuel element for a lead-cooled fast reactor

    NASA Astrophysics Data System (ADS)

    Sobolev, V.; Malambu, E.; Abderrahim, H. Aït

    2009-03-01

    The options of a lead-cooled fast reactor (LFR) of the fourth generation (GEN-IV) reactor with the electric power of 600 MW are investigated in the ELSY Project. The fuel selection, design and optimization are important steps of the project. Three types of fuel are considered as candidates: highly enriched Pu-U mixed oxide (MOX) fuel for the first core, the MOX containing between 2.5% and 5.0% of the minor actinides (MA) for next core and Pu-U-MA nitride fuel as an advanced option. Reference fuel rods with claddings made of T91 ferrite-martensitic steel and two alternative fuel assembly designs (one uses a closed hexagonal wrapper and the other is an open square variant without wrapper) have been assessed. This study focuses on the core variant with the closed hexagonal fuel assemblies. Based on the neutronic parameters provided by Monte-Carlo modeling with MCNP5 and ALEPH codes, simulations have been carried out to assess the long-term thermal-mechanical behaviour of the hottest fuel rods. A modified version of the fuel performance code FEMAXI-SCK-1, adapted for fast neutron spectrum, new fuels, cladding materials and coolant, was utilized for these calculations. The obtained results show that the fuel rods can withstand more than four effective full power years under the normal operation conditions without pellet-cladding mechanical interaction (PCMI). In a variant with solid fuel pellets, a mild PCMI can appear during the fifth year, however, it remains at an acceptable level up to the end of operation when the peak fuel pellet burnup ∼80 MW d kg-1 of heavy metal (HM) and the maximum clad damage of about 82 displacements per atom (dpa) are reached. Annular pellets permit to delay PCMI for about 1 year. Based on the results of this simulation, further steps are envisioned for the optimization of the fuel rod design, aiming at achieving the fuel burnup of 100 MW d kg-1 of HM.

  10. Recognizing Potential Buprenorphine Medication Misuse: Product Packaging Does Not Degrade With Laundering.

    PubMed

    Gunderson, Erik W

    2015-01-01

    Expanded office-based buprenorphine opioid dependence treatment is associated with medication misuse and diversion consequences. Recurrent early refill requests may indicate misuse or diversion, although further research is needed on how to effectively recognize and address the issue in clinical practice. In the current study, patient report of damaged medication from laundering prompted evaluation of laundering on degradation of buprenorphine-containing product packages and contents. Four buprenorphine product packaging approaches were assessed: 3 buprenorphine/naloxone placebo demonstration products (Suboxone and Bunavail film in foil wrappers and Zubsolv tablet in a blister pack) and Rexam-manufactured Screw-Loc closure pill container filled with a chewable aspirin as a surrogate for generic buprenorphine and buprenorphine/naloxone products. Two experimental laundering conditions, wash machine alone (W) and washer/dryer (W+D), were compared with unlaundered control (C) condition. Standard laundering settings were based on patient presentation. Products from the 2 experimental conditions and the control condition were labeled A, B, or C with counterbalanced assignment prior to visual examination of packaging and contents by the investigator who was blinded to condition. Packaging and contents remained intact for all products across experimental conditions, with only minor cosmetic effects compared with control. The W+D Suboxone film had 1-2 mm curling of the wrapper corners. Zubsolv blister packs had slight paper label fading (W+D > W). Bunavail W+D foil had an indentation outlining the inner film. The W+D bottle tablet had a ˜1 mm nick on one edge. No other differences were noted. After implementing more structured treatment and reviewing the results with the patient, he endorsed fabricating the laundering story to get additional medication. 
    Laundering is an unlikely cause of damaged buprenorphine-containing medication packaged in foil wrappers (Suboxone, Bunavail), blister pack (Zubsolv), or prescription pill bottle (generic buprenorphine or buprenorphine/naloxone products). Patient reports of such damage may indicate medication misuse or diversion.

  11. EEG feature selection method based on decision tree.

    PubMed

    Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun

    2015-01-01

    This paper aims to solve the automated feature selection problem in brain-computer interfaces (BCI). In order to automate the feature selection process, we propose a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II dataset Ia, and the experiment showed encouraging results.

  12. A new method using multiphoton imaging and morphometric analysis for differentiating chromophobe renal cell carcinoma and oncocytoma kidney tumors

    NASA Astrophysics Data System (ADS)

    Wu, Binlin; Mukherjee, Sushmita; Jain, Manu

    2016-03-01

    Distinguishing chromophobe renal cell carcinoma (chRCC) from oncocytoma on hematoxylin and eosin images may be difficult and require time-consuming ancillary procedures. Multiphoton microscopy (MPM), an optical imaging modality, was used to rapidly generate sub-cellular histological resolution images from formalin-fixed unstained tissue sections from chRCC and oncocytoma. Tissues were excited using 780 nm wavelength and emission signals (including second harmonic generation and autofluorescence) were collected in different channels between 390 nm and 650 nm. Granular structure in the cell cytoplasm was observed in both chRCC and oncocytoma. Quantitative morphometric analysis was conducted to distinguish chRCC and oncocytoma. To perform the analysis, cytoplasm and granules in tumor cells were segmented from the images. Their area and fluorescence intensity were found in different channels. Multiple features were measured to quantify the morphological and fluorescence properties. Linear support vector machine (SVM) was used for classification. Re-substitution validation, cross validation and receiver operating characteristic (ROC) curve were implemented to evaluate the efficacy of the SVM classifier. A wrapper feature algorithm was used to select the optimal features which provided the best predictive performance in separating the two tissue types (classes). Statistical measures such as sensitivity, specificity, accuracy and area under curve (AUC) of ROC were calculated to evaluate the efficacy of the classification. A predictive accuracy of over 80% was achieved. This method, if validated on a larger and more diverse sample set, may serve as an automated rapid diagnostic tool to differentiate between chRCC and oncocytoma. An advantage of such automated methods is that they are free from investigator bias and variability.
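
    The evaluation measures named above (sensitivity, specificity, accuracy, and area under the ROC curve) can be computed from labels and classifier scores as in the sketch below. The labels, scores, threshold, and function names are illustrative assumptions and do not come from the study. The AUC is computed through its rank interpretation: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary labels in {0, 1}.

    Assumes both classes are present in y_true.
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels (1 = positive class) and classifier scores.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical 0.5 threshold

print(confusion_metrics(y_true, y_pred), roc_auc(y_true, scores))
```

    Sensitivity, specificity, and accuracy depend on the chosen threshold, whereas the AUC summarizes ranking quality across all thresholds, which is why studies often report both.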

  13. Integrated feature extraction and selection for neuroimage classification

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Shen, Dinggang

    2009-02-01

    Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.

  14. Adaptive semantic tag mining from heterogeneous clinical research texts.

    PubMed

    Hao, T; Weng, C

    2015-01-01

    To develop an adaptive approach to mine frequent semantic tags (FSTs) from heterogeneous clinical research texts. We develop a "plug-n-play" framework that integrates replaceable unsupervised kernel algorithms with formatting, functional, and utility wrappers for FST mining. Temporal information identification and semantic equivalence detection were two example functional wrappers. We first compared this approach's recall and efficiency for mining FSTs from ClinicalTrials.gov to that of a recently published tag-mining algorithm. Then we assessed this approach's adaptability to two other types of clinical research texts: clinical data requests and clinical trial protocols, by comparing the prevalence trends of FSTs across three texts. Our approach increased the average recall and speed by 12.8% and 47.02% respectively upon the baseline when mining FSTs from ClinicalTrials.gov, and maintained an overlap in relevant FSTs with the baseline ranging between 76.9% and 100% for varying FST frequency thresholds. The FSTs saturated when the data size reached 200 documents. Consistent trends in the prevalence of FST were observed across the three texts as the data size or frequency threshold changed. This paper contributes an adaptive tag-mining framework that is scalable and adaptable without sacrificing its recall. This component-based architectural design can be potentially generalizable to improve the adaptability of other clinical text mining methods.

  15. mapview - Interactive viewing of spatial data in R

    NASA Astrophysics Data System (ADS)

    Appelhans, Tim; Detsch, Florian; Reudenbach, Cristoph; Woellauer, Stefan

    2016-04-01

    In this talk we would like to introduce mapview, an R package designed to aid researchers during their work-flow of spatial data analysis. The package was initially developed within the framework of the DFG funded research group "KiLi - Kilimanjaro ecosystems under global change: Linking biodiversity, biotic interactions and biogeochemical ecosystem processes" but has quickly developed into a general purpose spatial data viewer. mapview provides some powerful tools for interactive visualization of standard spatial data in R. It has support for all Spatial*(DataFrame) objects as well as all Raster* objects. It is designed so that one function call - mapview(x) - is all you need to view the data interactively. Adding layers to existing views is very easy and we have taken great care in providing suitable defaults for features such as background maps or coloring but things can be customized flexibly (and permanently) to suit different needs. Even though mapview is for most parts based on the leaflet package, it is far more than just a convenience wrapper around leaflet functionality. mapview provides additional features for handling big data sets (up to several million points) as well as some specialized functionality to view and compare rasters of any size with arbitrary coordinate reference systems. Given that mapview is merely a bridge between R and the underlying leaflet.js javascript library, mapview can be used to produce web-maps by simply providing the path to a designated folder. This talk will be a live demonstration of some of the key features of mapview.

  16. Train axle bearing fault detection using a feature selection scheme based multi-scale morphological filter

    NASA Astrophysics Data System (ADS)

    Li, Yifan; Liang, Xihui; Lin, Jianhui; Chen, Yuejian; Liu, Jianxin

    2018-02-01

    This paper presents a novel signal processing scheme, feature selection based multi-scale morphological filter (MMF), for train axle bearing fault detection. In this scheme, more than 30 feature indicators of vibration signals are calculated for axle bearings with different conditions and the features which can reflect fault characteristics more effectively and representatively are selected using the max-relevance and min-redundancy principle. Then, a filtering scale selection approach for MMF based on feature selection and grey relational analysis is proposed. The feature selection based MMF method is tested on diagnosis of artificially created damages of rolling bearings of railway trains. Experimental results show that the proposed method has a superior performance in extracting fault features of defective train axle bearings. In addition, comparisons are performed with the kurtosis criterion based MMF and the spectral kurtosis criterion based MMF. The proposed feature selection based MMF method outperforms these two methods in detection of train axle bearing faults.

  17. Student Self-evaluation After Nursing Examinations: That's a Wrap.

    PubMed

    Butzlaff, Alice; Gaylle, Debrayh; O'Leary Kelley, Colleen

    2018-04-13

    Examination wrappers are a self-evaluation tool that uses metacognition to help students reflect on test performance. After examinations, rather than focus on points earned, students learn to self-identify study strategies and recognize methods of test preparation. The purpose of the study was to determine if the use of an examination wrapper after each test would encourage students to self-evaluate performance and adjust study strategies. A total of 120 undergraduate nursing students completed self-evaluations after each examination, which were analyzed using content analysis. Three general patterns emerged from student self-evaluation: effective and ineffective study strategies, understanding versus memorization of content, and nurse educator assistance.

  18. SWMM5 Application Programming Interface and PySWMM: A ...

    EPA Pesticide Factsheets

    In support of the OpenWaterAnalytics open source initiative, the PySWMM project encompasses the development of a Python interfacing wrapper to SWMM5 with parallel ongoing development of the USEPA Stormwater Management Model (SWMM5) application programming interface (API). ... The purpose of this work is to increase the utility of the SWMM dll by creating a Toolkit API for accessing its functionality. The utility of the Toolkit is further enhanced with a wrapper to allow access from the Python scripting language. This work is being prosecuted as part of an Open Source development strategy and is being performed by volunteer software developers.

  19. Genetic Particle Swarm Optimization-Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection.

    PubMed

    Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

    2016-07-30

    In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm.

  20. Next Generation Transport Phenomenology Model

    NASA Technical Reports Server (NTRS)

    Strickland, Douglas J.; Knight, Harold; Evans, J. Scott

    2004-01-01

    This report describes the progress made in Quarter 3 of Contract Year 3 on the development of Aeronomy Phenomenology Modeling Tool (APMT), an open-source, component-based, client-server architecture for distributed modeling, analysis, and simulation activities focused on electron and photon transport for general atmospheres. In the past quarter, column emission rate computations were implemented in Java, preexisting Fortran programs for computing synthetic spectra were embedded into APMT through Java wrappers, and work began on a web-based user interface for setting input parameters and running the photoelectron and auroral electron transport models.

  1. The Study on Collaborative Manufacturing Platform Based on Agent

    NASA Astrophysics Data System (ADS)

    Zhang, Xiao-yan; Qu, Zheng-geng

    To follow the trend toward knowledge-intensive collaborative manufacturing development, we describe a multi-agent architecture supporting a knowledge-based collaborative manufacturing development platform. By virtue of the wrapper services and communication capabilities that agents provide, the proposed architecture facilitates the organization and collaboration of multi-disciplinary individuals and tools. By effectively supporting the formal representation, capture, retrieval and reuse of manufacturing knowledge, the generalized knowledge repository based on an ontology library enables engineers to meaningfully exchange information and pass knowledge across boundaries. Intelligent agent technology increases the efficiency and interoperability of traditional KBE systems and provides comprehensive design environments for engineers.

  2. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    NASA Astrophysics Data System (ADS)

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for tasks in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while dramatically affecting the quality of life. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.

  3. Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson’s Disease

    PubMed Central

    Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang

    2017-01-01

    Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for tasks in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while dramatically affecting the quality of life. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883

  4. Attentional Selection Can Be Predicted by Reinforcement Learning of Task-relevant Stimulus Features Weighted by Value-independent Stickiness.

    PubMed

    Balcarras, Matthew; Ardid, Salva; Kaping, Daniel; Everling, Stefan; Womelsdorf, Thilo

    2016-02-01

    Attention includes processes that evaluate stimuli relevance, select the most relevant stimulus against less relevant stimuli, and bias choice behavior toward the selected information. It is not clear how these processes interact. Here, we captured these processes in a reinforcement learning framework applied to a feature-based attention task that required macaques to learn and update the value of stimulus features while ignoring nonrelevant sensory features, locations, and action plans. We found that value-based reinforcement learning mechanisms could account for feature-based attentional selection and choice behavior but required a value-independent stickiness selection process to explain selection errors while at asymptotic behavior. By comparing different reinforcement learning schemes, we found that trial-by-trial selections were best predicted by a model that only represents expected values for the task-relevant feature dimension, with nonrelevant stimulus features and action plans having only a marginal influence on covert selections. These findings show that attentional control subprocesses can be described by (1) the reinforcement learning of feature values within a restricted feature space that excludes irrelevant feature dimensions, (2) a stochastic selection process on feature-specific value representations, and (3) value-independent stickiness toward previous feature selections akin to perseveration in the motor domain. We speculate that these three mechanisms are implemented by distinct but interacting brain circuits and that the proposed formal account of feature-based stimulus selection will be important to understand how attentional subprocesses are implemented in primate brain networks.

  5. Genetic Particle Swarm Optimization–Based Feature Selection for Very-High-Resolution Remotely Sensed Imagery Object Change Detection

    PubMed Central

    Chen, Qiang; Chen, Yunhao; Jiang, Weiguo

    2016-01-01

    In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm. PMID:27483285

  6. ViennaNGS: A toolbox for building efficient next-generation sequencing analysis pipelines

    PubMed Central

    Wolfinger, Michael T.; Fallmann, Jörg; Eggenhofer, Florian; Amman, Fabian

    2015-01-01

    Recent achievements in next-generation sequencing (NGS) technologies lead to a high demand for reuseable software components to easily compile customized analysis workflows for big genomics data. We present ViennaNGS, an integrated collection of Perl modules focused on building efficient pipelines for NGS data processing. It comes with functionality for extracting and converting features from common NGS file formats, computation and evaluation of read mapping statistics, as well as normalization of RNA abundance. Moreover, ViennaNGS provides software components for identification and characterization of splice junctions from RNA-seq data, parsing and condensing sequence motif data, automated construction of Assembly and Track Hubs for the UCSC genome browser, as well as wrapper routines for a set of commonly used NGS command line tools. PMID:26236465

  7. The fate of task-irrelevant visual motion: perceptual load versus feature-based attention.

    PubMed

    Taya, Shuichiro; Adams, Wendy J; Graf, Erich W; Lavie, Nilli

    2009-11-18

    We tested contrasting predictions derived from perceptual load theory and from recent feature-based selection accounts. Observers viewed moving, colored stimuli and performed low or high load tasks associated with one stimulus feature, either color or motion. The resultant motion aftereffect (MAE) was used to evaluate attentional allocation. We found that task-irrelevant visual features received less attention than co-localized task-relevant features of the same objects. Moreover, when color and motion features were co-localized yet perceived to belong to two distinct surfaces, feature-based selection was further increased at the expense of object-based co-selection. Load theory predicts that the MAE for task-irrelevant motion would be reduced with a higher load color task. However, this was not seen for co-localized features; perceptual load only modulated the MAE for task-irrelevant motion when this was spatially separated from the attended color location. Our results suggest that perceptual load effects are mediated by spatial selection and do not generalize to the feature domain. Feature-based selection operates to suppress processing of task-irrelevant, co-localized features, irrespective of perceptual load.

  8. Comparison of Genetic Algorithm, Particle Swarm Optimization and Biogeography-based Optimization for Feature Selection to Classify Clusters of Microcalcifications

    NASA Astrophysics Data System (ADS)

    Khehra, Baljit Singh; Pharwaha, Amar Partap Singh

    2017-04-01

    Ductal carcinoma in situ (DCIS) is one type of breast cancer. Clusters of microcalcifications (MCCs) are symptoms of DCIS that are recognized by mammography. Selection of a robust feature vector is the process of selecting an optimal subset of features from a large number of available features in a given problem domain after the feature extraction and before any classification scheme. Feature selection reduces the feature space, which improves the performance of the classifier and decreases the computational burden imposed by using many features on the classifier. Selection of an optimal subset of features from a large number of available features in a given problem domain is a difficult search problem. For n features, the total number of possible feature subsets is 2^n. Thus, the problem of selecting an optimal subset of features belongs to the category of NP-hard problems. In this paper, an attempt is made to find the optimal subset of MCCs features from all possible subsets of features using genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO). For simulation, a total of 380 benign and malignant MCCs samples have been selected from mammogram images of the DDSM database. A total of 50 features extracted from benign and malignant MCCs samples are used in this study. In these algorithms, the fitness function is the correct classification rate of the classifier. A support vector machine is used as the classifier. From experimental results, it is also observed that the performance of the PSO-based and BBO-based algorithms in selecting an optimal subset of features for classifying MCCs as benign or malignant is better than that of the GA-based algorithm.
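The 2^n growth mentioned in the abstract above is what makes exhaustive wrapper search impractical: for the 50 features used there, 2^50 is 1,125,899,906,842,624 subsets. The following is a minimal Python sketch of exhaustive search for a very small n, not the authors' GA, PSO or BBO code. The names `all_subsets`, `exhaustive_search` and `toy_fitness` are hypothetical, and `toy_fitness` is an assumed stand-in for the classifier's correct classification rate.

```python
from itertools import combinations


def all_subsets(n):
    """Yield every non-empty subset of feature indices 0..n-1 as a tuple."""
    for k in range(1, n + 1):
        yield from combinations(range(n), k)


def exhaustive_search(n, fitness):
    """Evaluate every non-empty subset and return (best_subset, best_score)."""
    best_subset, best_score = None, float("-inf")
    for subset in all_subsets(n):
        score = fitness(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score


def toy_fitness(subset):
    """Assumed stand-in for a classification rate (not from the paper).

    Features 0 and 2 are treated as useful; every other selected
    feature costs a small penalty.
    """
    useful = {0, 2}
    chosen = set(subset)
    return len(useful & chosen) - 0.1 * len(chosen - useful)


if __name__ == "__main__":
    n = 4
    # Number of non-empty subsets is 2**n - 1.
    print(sum(1 for _ in all_subsets(n)), 2 ** n - 1)
    print(exhaustive_search(n, toy_fitness))
```

Heuristic searches such as GA, PSO and BBO replace the full enumeration in `exhaustive_search` with a guided sample of subsets, while the fitness call stays the same.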

  9. Feature Selection Using Information Gain for Improved Structural-Based Alert Correlation

    PubMed Central

    Siraj, Maheyzah Md; Zainal, Anazida; Elshoush, Huwaida Tagelsir; Elhaj, Fatin

    2016-01-01

    Grouping and clustering alerts for intrusion detection based on the similarity of features is referred to as structural-based alert correlation and can discover a list of attack steps. Previous researchers selected different features and data sources manually based on their knowledge and experience, which led to less accurate identification of attack steps and inconsistent clustering accuracy. Furthermore, the existing alert correlation systems deal with a huge amount of data that contains null values, incomplete information, and irrelevant features, causing the analysis of the alerts to be tedious, time-consuming and error-prone. Therefore, this paper focuses on selecting accurate and significant features of alerts that are appropriate to represent the attack steps, thus enhancing the structural-based alert correlation model. A two-tier feature selection method is proposed to obtain the significant features. The first tier aims at ranking the subset of features based on high information gain entropy in decreasing order. The second tier extends additional features with a better discriminative ability than the initially ranked features. Performance analysis results show the significance of the selected features in terms of the clustering accuracy using the DARPA 2000 intrusion detection scenario-specific dataset. PMID:27893821
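The first-tier ranking by information gain described in the abstract above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: features are assumed to be discrete, the second tier is not covered because the abstract does not specify it, and the function names and the toy alert fields (`protocol`, `sensor`) are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """H(labels) minus the weighted entropy of labels within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_by_information_gain(columns, labels):
    """Return (feature_name, gain) pairs sorted by decreasing gain."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy alerts: "protocol" separates the two attack steps, "sensor" does not.
TOY_LABELS = ["scan", "scan", "dos", "dos"]
TOY_COLUMNS = {
    "sensor": ["a", "b", "a", "b"],
    "protocol": ["icmp", "icmp", "tcp", "tcp"],
}

if __name__ == "__main__":
    for name, gain in rank_by_information_gain(TOY_COLUMNS, TOY_LABELS):
        print(name, round(gain, 3))
```

Continuous alert attributes would need to be discretized before this calculation applies, which is a design choice left open here.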

  10. Mutual information criterion for feature selection with application to classification of breast microcalcifications

    NASA Astrophysics Data System (ADS)

    Diamant, Idit; Shalhon, Moran; Goldberger, Jacob; Greenspan, Hayit

    2016-03-01

    Classification of clustered breast microcalcifications into benign and malignant categories is an extremely challenging task for computerized algorithms and expert radiologists alike. In this paper we present a novel method for feature selection based on mutual information (MI) criterion for automatic classification of microcalcifications. We explored the MI based feature selection for various texture features. The proposed method was evaluated on a standardized digital database for screening mammography (DDSM). Experimental results demonstrate the effectiveness and the advantage of using the MI-based feature selection to obtain the most relevant features for the task and thus to provide for improved performance as compared to using all features.

  11. McTwo: a two-step feature selection algorithm based on maximal information coefficient.

    PubMed

    Ge, Ruiquan; Zhou, Manli; Luo, Youxi; Meng, Qinghan; Mai, Guoqin; Ma, Dongli; Wang, Guoqing; Zhou, Fengfeng

    2016-03-23

    High-throughput bio-OMIC technologies are producing high-dimension data from bio-samples at an ever increasing rate, whereas the training sample number in a traditional experiment remains small due to various difficulties. This "large p, small n" paradigm in the area of biomedical "big data" may be at least partly solved by feature selection algorithms, which select only features significantly associated with phenotypes. Feature selection is an NP-hard problem. Due to the exponentially increased time requirement for finding the globally optimal solution, all the existing feature selection algorithms employ heuristic rules to find locally optimal solutions, and their solutions achieve different performances on different datasets. This work describes a feature selection algorithm based on a recently published correlation measurement, Maximal Information Coefficient (MIC). The proposed algorithm, McTwo, aims to select features associated with phenotypes, independently of each other, and achieving high classification performance of the nearest neighbor algorithm. Based on the comparative study of 17 datasets, McTwo performs about as well as or better than existing algorithms, with significantly reduced numbers of selected features. The features selected by McTwo also appear to have particular biomedical relevance to the phenotypes from the literature. McTwo selects a feature subset with very good classification performance, as well as a small feature number. So McTwo may represent a complementary feature selection algorithm for the high-dimensional biomedical datasets.

  12. Max-AUC Feature Selection in Computer-Aided Detection of Polyps in CT Colonography

    PubMed Central

    Xu, Jian-Wu; Suzuki, Kenji

    2014-01-01

    We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level. PMID:24608058
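The idea of growing a feature subset while the AUC keeps improving can be illustrated with a simplified Python sketch. It departs from the method in the abstract above in three labeled ways: it uses plain sequential forward selection without the floating (backtracking) step of SFFS, it stops on a fixed minimum AUC gain (`min_gain`, an assumed parameter) rather than a statistical test, and it scores samples by the sum of the selected features instead of training an SVM. All function names and the toy data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def subset_auc(rows, labels, subset):
    """Assumed stand-in for a classifier: score = sum of the selected features."""
    scores = [sum(row[j] for j in subset) for row in rows]
    return auc(scores, labels)


def forward_select(rows, labels, min_gain=0.01):
    """Greedily add the feature that raises AUC most; stop when the gain is small."""
    n_features = len(rows[0])
    selected, best = [], 0.5  # 0.5 is the chance-level AUC
    while len(selected) < n_features:
        candidates = [j for j in range(n_features) if j not in selected]
        scored = [(subset_auc(rows, labels, selected + [j]), j) for j in candidates]
        top_auc, top_j = max(scored)  # ties go to the larger feature index
        if top_auc - best < min_gain:
            break
        selected.append(top_j)
        best = top_auc
    return selected, best


# Hypothetical toy data: three features, three positives then three negatives.
TOY_ROWS = [[3, 0, 1], [2, 2, 0], [2, 2, 1], [1, 0, 1], [2, 0, 0], [0, 1, 1]]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    print(forward_select(TOY_ROWS, TOY_LABELS))
```

On the toy data the search adds feature 0, then feature 1, and stops because feature 2 adds no AUC. A real wrapper would estimate the AUC on held-out data rather than on the samples used for selection.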

  13. Max-AUC feature selection in computer-aided detection of polyps in CT colonography.

    PubMed

    Xu, Jian-Wu; Suzuki, Kenji

    2014-03-01

    We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level.

  14. Entity- Version 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Hart, Brian; Oppel, Fred; Rigdon, Brian

    2012-09-13

    This package contains classes that capture high-level aspects of characters and vehicles. Vehicles manage seats and riders. Vehicles and characters now can be configured to compose different behaviors and have certain capabilities, by adding them through xml data. These behaviors and capabilities are not included in this package, but instead are part of other packages such as mobility behavior, path planning, sight, sound. Entity is not dependent on these other packages. This package also contains the icons used for Umbra applications Dante Scenario Editor, Dante Tabletop and OpShed. This assertion includes a managed C++ wrapper code (EntityWrapper) to enable C# applications, such as Dante Scenario Editor, Dante Tabletop, and OpShed, to incorporate this library.

  15. Terrain - Umbra Package v. 1.0

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Oppel, Fred; Hart, Brian; Rigdon, James Brian

    This library contains modules that read terrain files (e.g., OpenFlight, Open Scene Graph IVE, GeoTIFF Image) and that read and manage ESRI terrain datasets. All data is stored and managed in Open Scene Graph (OSG). The Terrain system accesses OSG, provides elevation data and access to meta-data such as soil types, and enables linears, areals and buildings to be placed in a terrain. These geometry objects include boxes, point, path, and polygon (region), and sector modules. Utilities have been made available for clamping objects to the terrain and accessing LOS information. This assertion includes a managed C++ wrapper code (TerrainWrapper) to enable C# applications, such as OpShed and UTU, to incorporate this library.

  16. Rough sets and Laplacian score based cost-sensitive feature selection

    PubMed Central

    Yu, Shenglong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884

  17. Rough sets and Laplacian score based cost-sensitive feature selection.

    PubMed

    Yu, Shenglong; Zhao, Hong

    2018-01-01

    Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.

  18. Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wang Xiaojia; Mao Qirong; Zhan Yongzhao

    There are many emotion features. If all these features are employed to recognize emotions, redundant features may exist. Furthermore, the recognition result is unsatisfying and the cost of feature extraction is high. In this paper, a method to select speech emotion features based on the contribution analysis algorithm of a neural network (NN) is presented. The emotion features are selected from the 95 extracted features by using the contribution analysis algorithm of the NN. Cluster analysis is applied to analyze the effectiveness of the selected features, and the time of feature extraction is evaluated. Finally, the 24 selected emotion features are used to recognize six speech emotions. The experiments show that this method can improve the recognition rate and the time of feature extraction.

  19. Improving Classification of Protein Interaction Articles Using Context Similarity-Based Feature Selection.

    PubMed

    Chen, Yifei; Sun, Yuxing; Han, Bing-Qing

    2015-01-01

    Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.

  20. LISA Framework for Enhancing Gravitational Wave Signal Extraction Techniques

    NASA Technical Reports Server (NTRS)

    Thompson, David E.; Thirumalainambi, Rajkumar

    2006-01-01

    This paper describes the development of a Framework for benchmarking and comparing signal-extraction and noise-interference-removal methods that are applicable to interferometric Gravitational Wave detector systems. The primary use is towards comparing signal and noise extraction techniques at LISA frequencies from multiple (possibly confused) gravitational wave sources. The Framework includes extensive hybrid learning/classification algorithms, as well as post-processing regularization methods, and is based on a unique plug-and-play (component) architecture. Published methods for signal extraction and interference removal at LISA Frequencies are being encoded, as well as multiple source noise models, so that the stiffness of GW Sensitivity Space can be explored under each combination of methods. Furthermore, synthetic datasets and source models can be created and imported into the Framework, and specific degraded numerical experiments can be run to test the flexibility of the analysis methods. The Framework also supports use of full current LISA Testbeds, Synthetic data systems, and Simulators already in existence through plug-ins and wrappers, thus preserving those legacy codes and systems intact. Because of the component-based architecture, all selected procedures can be registered or de-registered at run-time, and are completely reusable, reconfigurable, and modular.

  1. Feature Selection in Classification of Eye Movements Using Electrooculography for Activity Recognition

    PubMed Central

    Mala, S.; Latha, K.

    2014-01-01

    Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition. PMID:25574185

  2. Feature selection in classification of eye movements using electrooculography for activity recognition.

    PubMed

    Mala, S; Latha, K

    2014-01-01

    Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.

  3. Green indeed

    NASA Astrophysics Data System (ADS)

    Dennis-Purves, Neil

    2018-04-01

    The plastic around Physics World can be recycled in the carrier bag bin at some supermarkets (Tesco, Sainsbury’s, Morrisons or Waitrose) along with other stretchy plastic wrappers such as bread bags.

  4. Feature selection method based on multi-fractal dimension and harmony search algorithm and its application

    NASA Astrophysics Data System (ADS)

    Zhang, Chen; Ni, Zhiwei; Ni, Liping; Tang, Na

    2016-10-01

    Feature selection is an important method of data preprocessing in data mining. In this paper, a novel feature selection method based on multi-fractal dimension and harmony search algorithm is proposed. Multi-fractal dimension is adopted as the evaluation criterion of feature subset, which can determine the number of selected features. An improved harmony search algorithm is used as the search strategy to improve the efficiency of feature selection. The performance of the proposed method is compared with that of other feature selection algorithms on UCI data-sets. Besides, the proposed method is also used to predict the daily average concentration of PM2.5 in China. Experimental results show that the proposed method can obtain competitive results in terms of both prediction accuracy and the number of selected features.

  5. Mesalamine Rectal

    MedlinePlus

    ... and use your fingers to peel off the plastic wrapper. Try to handle the suppository as little ... to your pharmacist or contact your local garbage/recycling department to learn about take-back programs in ...

  6. Skin lesion computational diagnosis of dermoscopic images: Ensemble models based on input feature manipulation.

    PubMed

    Oliveira, Roberta B; Pereira, Aledir S; Tavares, João Manuel R S

    2017-10-01

    The number of deaths worldwide due to melanoma has risen in recent times, in part because melanoma is the most aggressive type of skin cancer. Computational systems have been developed to assist dermatologists in early diagnosis of skin cancer, or even to monitor skin lesions. However, there still remains a challenge to improve classifiers for the diagnosis of such skin lesions. The main objective of this article is to evaluate different ensemble classification models based on input feature manipulation to diagnose skin lesions. Input feature manipulation processes are based on feature subset selections from shape properties, colour variation and texture analysis to generate diversity for the ensemble models. Three subset selection models are presented here: (1) a subset selection model based on specific feature groups, (2) a correlation-based subset selection model, and (3) a subset selection model based on feature selection algorithms. Each ensemble classification model is generated using an optimum-path forest classifier and integrated with a majority voting strategy. The proposed models were applied on a set of 1104 dermoscopic images using a cross-validation procedure. The best results were obtained by the first ensemble classification model that generates a feature subset ensemble based on specific feature groups. The skin lesion diagnosis computational system achieved 94.3% accuracy, 91.8% sensitivity and 96.7% specificity. The input feature manipulation process based on specific feature subsets generated the greatest diversity for the ensemble classification model with very promising results. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Attentional Selection of Feature Conjunctions Is Accomplished by Parallel and Independent Selection of Single Features.

    PubMed

    Andersen, Søren K; Müller, Matthias M; Hillyard, Steven A

    2015-07-08

    Experiments that study feature-based attention have often examined situations in which selection is based on a single feature (e.g., the color red). However, in more complex situations relevant stimuli may not be set apart from other stimuli by a single defining property but by a specific combination of features. Here, we examined sustained attentional selection of stimuli defined by conjunctions of color and orientation. Human observers attended to one out of four concurrently presented superimposed fields of randomly moving horizontal or vertical bars of red or blue color to detect brief intervals of coherent motion. Selective stimulus processing in early visual cortex was assessed by recordings of steady-state visual evoked potentials (SSVEPs) elicited by each of the flickering fields of stimuli. We directly contrasted attentional selection of single features and feature conjunctions and found that SSVEP amplitudes on conditions in which selection was based on a single feature only (color or orientation) exactly predicted the magnitude of attentional enhancement of SSVEPs when attending to a conjunction of both features. Furthermore, enhanced SSVEP amplitudes elicited by attended stimuli were accompanied by equivalent reductions of SSVEP amplitudes elicited by unattended stimuli in all cases. We conclude that attentional selection of a feature-conjunction stimulus is accomplished by the parallel and independent facilitation of its constituent feature dimensions in early visual cortex. The ability to perceive the world is limited by the brain's processing capacity. Attention affords adaptive behavior by selectively prioritizing processing of relevant stimuli based on their features (location, color, orientation, etc.). 
    We found that attentional mechanisms for selection of different features belonging to the same object operate independently and in parallel: concurrent attentional selection of two stimulus features is simply the sum of attending to each of those features separately. This result is key to understanding attentional selection in complex (natural) scenes, where relevant stimuli are likely to be defined by a combination of stimulus features. Copyright © 2015 the authors.

  8. Biological data integration: wrapping data and tools.

    PubMed

    Lacroix, Zoé

    2002-06-01

    Nowadays scientific data is inevitably digital and stored in a wide variety of formats in heterogeneous systems. Scientists need to access an integrated view of remote or local heterogeneous data sources with advanced data accessing, analyzing, and visualization tools. Building a digital library for scientific data requires accessing and manipulating data extracted from flat files or databases, documents retrieved from the Web as well as data generated by software. We present an approach to wrapping web data sources, databases, flat files, or data generated by tools through a database view mechanism. Generally, a wrapper has two tasks: it first sends a query to the source to retrieve data and, second builds the expected output with respect to the virtual structure. Our wrappers are composed of a retrieval component based on an intermediate object view mechanism called search views mapping the source capabilities to attributes, and an eXtensible Markup Language (XML) engine, respectively, to perform these two tasks. The originality of the approach consists of: 1) a generic view mechanism to access seamlessly data sources with limited capabilities and 2) the ability to wrap data sources as well as the useful specific tools they may provide. Our approach has been developed and demonstrated as part of the multidatabase system supporting queries via uniform object protocol model (OPM) interfaces.

  9. Rotorcraft Optimization Tools: Incorporating Rotorcraft Design Codes into Multi-Disciplinary Design, Analysis, and Optimization

    NASA Technical Reports Server (NTRS)

    Meyn, Larry A.

    2018-01-01

    One of the goals of NASA's Revolutionary Vertical Lift Technology Project (RVLT) is to provide validated tools for multidisciplinary design, analysis and optimization (MDAO) of vertical lift vehicles. As part of this effort, the software package, RotorCraft Optimization Tools (RCOTOOLS), is being developed to facilitate incorporating key rotorcraft conceptual design codes into optimizations using the OpenMDAO multi-disciplinary optimization framework written in Python. RCOTOOLS, also written in Python, currently supports the incorporation of the NASA Design and Analysis of RotorCraft (NDARC) vehicle sizing tool and the Comprehensive Analytical Model of Rotorcraft Aerodynamics and Dynamics II (CAMRAD II) analysis tool into OpenMDAO-driven optimizations. Both of these tools use detailed, file-based inputs and outputs, so RCOTOOLS provides software wrappers to update input files with new design variable values, execute these codes and then extract specific response variable values from the file outputs. These wrappers are designed to be flexible and easy to use. RCOTOOLS also provides several utilities to aid in optimization model development, including Graphical User Interface (GUI) tools for browsing input and output files in order to identify text strings that are used to identify specific variables as optimization input and response variables. This paper provides an overview of RCOTOOLS and its use

  10. Evaluation of Semi-supervised Learning for Classification of Protein Crystallization Imagery.

    PubMed

    Sigdel, Madhav; Dinç, İmren; Dinç, Semih; Sigdel, Madhu S; Pusey, Marc L; Aygün, Ramazan S

    2014-03-01

    In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimum optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high confident predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with the basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, random forest classifier yields the best accuracy with supervised learning for our dataset.
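
    The self-training loop described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: a one-dimensional nearest-centroid classifier stands in for the NB and SMO base classifiers, the confidence measure and the 0.8 threshold are hypothetical choices, and all function names are invented for the example.

```python
# Self-training sketch: a base classifier labels the unlabeled pool and
# only its high-confidence predictions are added to the training set.
# The nearest-centroid base classifier is a stand-in, not the paper's NB/SMO.

def fit_centroids(X, y):
    """Return {label: mean of the 1-D samples carrying that label}."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        sums[label] = sums.get(label, 0.0) + x
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_with_confidence(centroids, x):
    """Predict the nearest centroid's label; needs at least two classes.

    Confidence is d2 / (d1 + d2) for the two nearest centroids, so it is
    0.5 for a tie and approaches 1.0 as x gets closer to the winner.
    """
    dists = sorted((abs(x - c), label) for label, c in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    conf = 0.5 if d1 + d2 == 0 else d2 / (d1 + d2)
    return label, conf

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, max_rounds=10):
    """Grow the labeled set with confident pseudo-labels, then refit."""
    X, y, pool = list(X_lab), list(y_lab), list(X_unlab)
    for _ in range(max_rounds):
        centroids = fit_centroids(X, y)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict_with_confidence(centroids, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new to learn from
        for x, label in confident:
            X.append(x)
            y.append(label)
        pool = [x for x, _ in remaining]
    return fit_centroids(X, y), pool

# Two labeled points, three unlabeled ones; 5.0 is ambiguous and stays unlabeled.
centroids, leftover = self_train([0.0, 10.0], ["a", "b"], [1.0, 9.0, 5.0])
```

    Points that never reach the threshold remain in the pool, which is the mechanism that keeps low-confidence predictions from polluting the training set.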

  11. 39 CFR 233.3 - Mail covers.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... transcription, photograph, photocopy or any other facsimile of the image of the outside cover, envelope, wrapper... Postal Inspection Service to transmit mail cover reports directly to the requesting authority. (j) Review...

  12. 39 CFR 233.3 - Mail covers.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... transcription, photograph, photocopy or any other facsimile of the image of the outside cover, envelope, wrapper... Postal Inspection Service to transmit mail cover reports directly to the requesting authority. (j) Review...

  13. 39 CFR 233.3 - Mail covers.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... transcription, photograph, photocopy or any other facsimile of the image of the outside cover, envelope, wrapper... Postal Inspection Service to transmit mail cover reports directly to the requesting authority. (j) Review...

  14. 39 CFR 233.3 - Mail covers.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... transcription, photograph, photocopy or any other facsimile of the image of the outside cover, envelope, wrapper... Postal Inspection Service to transmit mail cover reports directly to the requesting authority. (j) Review...

  15. 39 CFR 233.3 - Mail covers.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... transcription, photograph, photocopy or any other facsimile of the image of the outside cover, envelope, wrapper... Postal Inspection Service to transmit mail cover reports directly to the requesting authority. (j) Review...

  16. Enhancing the Discrimination Ability of a Gas Sensor Array Based on a Novel Feature Selection and Fusion Framework.

    PubMed

    Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia

    2018-06-12

    In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. Firstly, we put forward an efficient feature selection method based on the separability and the dissimilarity to determine the feature selection order for each type of feature when increasing the dimension of selected feature subsets. Secondly, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for different types of features. Finally, in the process of establishing features fusion, we come up with a classification dominance feature fusion strategy which conducts an effective basic feature. Experimental results on two datasets show that the recognition rates of Database I and Database II achieve 97.5% and 80.11%, respectively, when k = 1 for KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The novel feature selection method proposed in this paper can effectively select feature subsets that are conducive to the classification, while the feature fusion framework can fuse various features which describe the different characteristics of sensor signals, for enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing drift effect.
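
    The classifier setting reported in this abstract (KNN with k = 1 and correlation distance) can be illustrated with a minimal sketch. It assumes the usual definition of correlation distance as one minus the Pearson correlation; the function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def correlation_distance(u, v):
    """Return 1 - Pearson correlation between two equal-length vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    if den == 0:
        raise ValueError("correlation distance is undefined for constant vectors")
    return 1.0 - num / den

def predict_1nn(train_X, train_y, x):
    """Label x with the label of its nearest training vector (k = 1)."""
    nearest = min(range(len(train_X)),
                  key=lambda i: correlation_distance(train_X[i], x))
    return train_y[nearest]

# Toy "sensor responses": the metric compares shape, not magnitude.
train_X = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
train_y = ["rising", "falling"]
pred_a = predict_1nn(train_X, train_y, [10.0, 20.0, 31.0])
pred_b = predict_1nn(train_X, train_y, [5.0, 4.0, 1.0])
```

    Because the distance depends only on the shape of a response vector, a sample scaled by a constant factor keeps the same nearest neighbour.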

  17. A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities.

    PubMed

    Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah

    2018-02-01

    Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is the Fisher score, whose aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of the Fisher criterion were extended for QSAR models to define new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria, which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets: serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the descriptors selected by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
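
    For orientation, the classical Fisher score for classification that this abstract starts from can be sketched as below. This is the standard textbook form (between-class scatter divided by within-class scatter, per feature), not the continuous-activity extension or the combined Fisher and Laplacian criterion proposed in the paper; the function name and toy data are hypothetical.

```python
from collections import defaultdict

def fisher_scores(X, y):
    """Return one classical Fisher score per feature column.

    X is a list of rows (lists of floats) and y the class label of each row.
    A high score means class means are far apart relative to the spread
    inside each class.
    """
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between, within = 0.0, 0.0
        for rows in groups.values():
            values = [r[j] for r in rows]
            class_mean = sum(values) / len(values)
            class_var = sum((v - class_mean) ** 2 for v in values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += len(values) * class_var
        # A feature with zero within-class spread is reported as infinite.
        scores.append(between / within if within > 0 else float("inf"))
    return scores

# Feature 0 separates the two classes; feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.0], [3.2, 2.0]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
```

    Ranking descriptors by this score and keeping the top ones is the usual filter-style use of the criterion.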

  18. A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities

    NASA Astrophysics Data System (ADS)

    Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah

    2018-02-01

    Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is the Fisher score, whose aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of the Fisher criterion were extended for QSAR models to define new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria, which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets: serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the descriptors selected by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.

  19. Object-based selection from spatially-invariant representations: evidence from a feature-report task.

    PubMed

    Matsukura, Michi; Vecera, Shaun P

    2011-02-01

    Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.

  20. Relevance popularity: A term event model based feature selection scheme for text classification.

    PubMed

    Feng, Guozhong; An, Baiguo; Yang, Fengqin; Wang, Han; Zhang, Libiao

    2017-01-01

    Feature selection is a practical approach for improving the performance of text classification methods by optimizing the feature subsets input to classifiers. In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e. the document frequency) is often used. However, the frequency of a given term appearing in each document has not been fully investigated, even though it is a promising feature to produce accurate classifications. In this paper, we propose a new feature selection scheme based on a term event Multinomial naive Bayes probabilistic model. According to the model assumptions, the matching score function, which is based on the prediction probability ratio, can be factorized. Finally, we derive a feature selection measurement for each term after replacing inner parameters by their estimators. On a benchmark English text dataset (20 Newsgroups) and a Chinese text dataset (MPH-20), our numerical experiment results obtained from using two widely used text classifiers (naive Bayes and support vector machine) demonstrate that our method outperformed the representative feature selection methods.
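
    The distinction this abstract draws between document frequency and within-document term frequency can be made concrete with a small sketch. It only counts the two quantities on a toy corpus; it does not implement the paper's relevance popularity measure, and the function name and example tokens are hypothetical.

```python
from collections import Counter

def document_and_term_frequency(docs):
    """Return (document frequency, total term frequency) for tokenized docs.

    Document frequency counts each term at most once per document;
    term frequency counts every occurrence.
    """
    doc_freq, term_freq = Counter(), Counter()
    for tokens in docs:
        term_freq.update(tokens)       # every occurrence
        doc_freq.update(set(tokens))   # presence only
    return doc_freq, term_freq

docs = [
    ["gene", "gene", "protein"],
    ["gene"],
    ["protein", "cell"],
]
doc_freq, term_freq = document_and_term_frequency(docs)
# "gene" and "protein" share a document frequency of 2, but "gene" occurs
# three times overall, information that document frequency alone discards.
```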

  1. Toward optimal feature and time segment selection by divergence method for EEG signals classification.

    PubMed

    Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing

    2018-06-01

    Feature selection plays an important role in the field of EEG signals based motor imagery pattern classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages involved are: lowering the computational burden so as to speed up the learning procedure and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: a broad frequency band filtering and common spatial pattern enhancement as preprocessing, feature extraction by autoregressive model and log-variance, the Kullback-Leibler divergence based optimal feature and time segment selection, and linear discriminant analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signals classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segment, but also show that the proposed method yields relatively better classification results in comparison with other competitive methods. Copyright © 2018 Elsevier Ltd. All rights reserved.
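
    One simple way to rank features by Kullback-Leibler divergence is sketched below. It rests on assumptions that the abstract does not state: each feature is modelled as a univariate Gaussian per class, the two directions of the divergence are summed to make the score symmetric, and only two classes are handled. The paper's actual statistical model may differ, and all names here are hypothetical.

```python
import math

def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two univariate Gaussians given means and variances."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)

def mean_and_variance(values, eps=1e-9):
    """Sample mean and (population) variance, with eps to avoid zero variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var + eps

def rank_features_by_kl(X, y):
    """Return (feature indices sorted best-first, symmetric KL score per feature)."""
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a, b = labels
    scores = []
    for j in range(len(X[0])):
        mean_a, var_a = mean_and_variance([r[j] for r, lab in zip(X, y) if lab == a])
        mean_b, var_b = mean_and_variance([r[j] for r, lab in zip(X, y) if lab == b])
        scores.append(gaussian_kl(mean_a, var_a, mean_b, var_b)
                      + gaussian_kl(mean_b, var_b, mean_a, var_a))
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order, scores

# Feature 1 separates the classes far better than feature 0.
X = [[0.0, 1.0], [0.2, 1.1], [0.1, 3.0], [0.3, 3.1]]
y = [0, 0, 1, 1]
order, kl_scores = rank_features_by_kl(X, y)
```

    The same scoring could be applied per candidate time segment to pick the segment whose features separate the classes best, although that step is not shown here.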

  2. Beam Instrument Development System

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    DOOLITTLE, LAWRENCE; HUANG, GANG; DU, QIANG

    Beam Instrumentation Development System (BIDS) is a collection of common support libraries and modules developed during a series of Low-Level Radio Frequency (LLRF) control and timing/synchronization projects. BIDS includes a collection of Hardware Description Language (HDL) libraries and software libraries. The BIDS can be used for the development of any FPGA-based system, such as LLRF controllers. HDL code in this library is generic and supports common Digital Signal Processing (DSP) functions, FPGA-specific drivers (high-speed serial link wrappers, clock generation, etc.), ADC/DAC drivers, Ethernet MAC implementation, etc.

  3. RM-CLEAN: RM spectra cleaner

    NASA Astrophysics Data System (ADS)

    Heald, George

    2017-08-01

    RM-CLEAN reads in dirty Q and U cubes, generates rmtf based on the frequencies given in an ASCII file, and cleans the RM spectra following the algorithm given by Brentjens (2007). The output cubes contain the clean model components and the CLEANed RM spectra. The input cubes must be reordered with mode=312, and the output cubes will have the same ordering and thus must be reordered after being written to disk. RM-CLEAN runs as a MIRIAD (ascl:1106.007) task and a Python wrapper is included with the code.

  4. 7 CFR 29.2436 - Wrappers (A Group).

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ..., firm, rich in oil, elastic, strong, bright finish, deep color intensity, broad, 95 percent uniform, and..., deep color intensity, broad, 95 percent uniform, and 5 percent injury tolerance. A2D Fine Dark-brown...

  5. 7 CFR 29.2436 - Wrappers (A Group).

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ..., firm, rich in oil, elastic, strong, bright finish, deep color intensity, broad, 95 percent uniform, and..., deep color intensity, broad, 95 percent uniform, and 5 percent injury tolerance. A2D Fine Dark-brown...

  6. 7 CFR 29.2436 - Wrappers (A Group).

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ..., firm, rich in oil, elastic, strong, bright finish, deep color intensity, broad, 95 percent uniform, and..., deep color intensity, broad, 95 percent uniform, and 5 percent injury tolerance. A2D Fine Dark-brown...

  7. 7 CFR 29.2436 - Wrappers (A Group).

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ..., firm, rich in oil, elastic, strong, bright finish, deep color intensity, broad, 95 percent uniform, and..., deep color intensity, broad, 95 percent uniform, and 5 percent injury tolerance. A2D Fine Dark-brown...

  8. 7 CFR 29.2436 - Wrappers (A Group).

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ..., firm, rich in oil, elastic, strong, bright finish, deep color intensity, broad, 95 percent uniform, and..., deep color intensity, broad, 95 percent uniform, and 5 percent injury tolerance. A2D Fine Dark-brown...

  9. Studies into the transfer and migration of phthalate esters from aluminium foil-paper laminates to butter and margarine.

    PubMed

    Page, B D; Lacroix, G M

    1992-01-01

    Retail samples of Canadian butter and margarine wrapped in aluminium foil-paper laminate were found to contain dibutyl, butyl benzyl and/or di-2-ethylhexyl phthalate (DBP, BBP, DEHP) as packaging migrants at levels up to 10.6, 47.8 and 11.9 micrograms/g, respectively. These phthalates were determined by capillary gas chromatography with flame ionization detection (GC-FID) after clean-up of the separated oil by sweep co-distillation. The phthalate esters found in the contacted butter or margarine were also found in the contacting wrappers. They were determined in wrapper extracts by liquid chromatography with diode array detection or by GC-FID. Analysis of unused wrappers showed 76-88% of the total DBP and DEHP to be present on the foil (outer) surface as a component of the protective coating (washcoat). The remainder of the DBP and DEHP was found on the food-contacting paper surface, presumably by transfer from the outer to inner surface during storage in tightly wound rolls, although transfer of phthalate esters, if present in the paper-foil adhesive, cannot be ruled out. Food-contacting surface concentrations of DBP and DEHP were found to be 2.4 to 4.7 and 2.8 to 3.6 micrograms/cm2, respectively. Samples of each packaging component: paper, foil, adhesive, washcoat and inks were analysed for phthalate esters and only the washcoat was found to contain phthalate esters.

  10. Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.

    PubMed

    Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki

    2016-07-01

    We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.

  11. Temporal Correlation Mechanisms and Their Role in Feature Selection: A Single-Unit Study in Primate Somatosensory Cortex

    PubMed Central

    Gomez-Ramirez, Manuel; Trzcinski, Natalie K.; Mihalas, Stefan; Niebur, Ernst

    2014-01-01

    Studies in vision show that attention enhances the firing rates of cells when it is directed towards their preferred stimulus feature. However, it is unknown whether other sensory systems employ this mechanism to mediate feature selection within their modalities. Moreover, whether feature-based attention modulates the correlated activity of a population is unclear. Indeed, temporal correlation codes such as spike-synchrony and spike-count correlations (rsc) are believed to play a role in stimulus selection by increasing the signal and reducing the noise in a population, respectively. Here, we investigate (1) whether feature-based attention biases the correlated activity between neurons when attention is directed towards their common preferred feature, (2) the interplay between spike-synchrony and rsc during feature selection, and (3) whether feature attention effects are common across the visual and tactile systems. Single-unit recordings were made in secondary somatosensory cortex of three non-human primates while animals engaged in tactile feature (orientation and frequency) and visual discrimination tasks. We found that both firing rate and spike-synchrony between neurons with similar feature selectivity were enhanced when attention was directed towards their preferred feature. However, attention effects on spike-synchrony were twice as large as those on firing rate, and had a tighter relationship with behavioral performance. Further, we observed increased rsc when attention was directed towards the visual modality (i.e., away from touch). These data suggest that similar feature selection mechanisms are employed in vision and touch, and that temporal correlation codes such as spike-synchrony play a role in mediating feature selection. We posit that feature-based selection operates by implementing multiple mechanisms that reduce the overall noise levels in the neural population and synchronize activity across subpopulations that encode the relevant features of sensory stimuli. PMID:25423284

  12. Feature selection methods for object-based classification of sub-decimeter resolution digital aerial imagery

    USDA-ARS's Scientific Manuscript database

    Due to the availability of numerous spectral, spatial, and contextual features, the determination of optimal features and class separabilities can be a time consuming process in object-based image analysis (OBIA). While several feature selection methods have been developed to assist OBIA, a robust c...

  13. Constraint programming based biomarker optimization.

    PubMed

    Zhou, Manli; Luo, Youxi; Sun, Guoquan; Mai, Guoqin; Zhou, Fengfeng

    2015-01-01

    Efficient and intuitive characterization of biological big data is becoming a major challenge for modern bio-OMIC based scientists. Interactive visualization and exploration of big data has proven to be one of the successful solutions. Most existing feature selection algorithms do not allow interactive input from users during the optimization process of feature selection. This study addresses the question by fixing a few user-input features in the finally selected feature subset and formulating these user-input features as constraints for a programming model. The proposed algorithm, fsCoP (feature selection based on constrained programming), performs similarly to or much better than the existing feature selection algorithms, even with the constraints from both literature and the existing algorithms. An fsCoP biomarker may be intriguing for further wet lab validation, since it satisfies both the classification optimization function and the biomedical knowledge. fsCoP may also be used for the interactive exploration of bio-OMIC big data by interactively adding user-defined constraints for modeling.

  14. 75 FR 80842 - Notice of Receipt of Complaint; Solicitation of Comments Relating to the Public Interest

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-23

    ... Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same, DN 2774; the... within the United States after importation of certain reduced ignition proclivity cigarette paper...

  15. Natural image classification driven by human brain activity

    NASA Astrophysics Data System (ADS)

    Zhang, Dai; Peng, Hanyang; Wang, Jinqiao; Tang, Ming; Xue, Rong; Zuo, Zhentao

    2016-03-01

    Natural image classification has been a hot topic in computer vision and pattern recognition research. Since the performance of an image classification system can be improved by feature selection, many image feature selection methods have been developed. However, existing supervised feature selection methods are typically driven by class label information that is identical for different samples from the same class, ignoring within-class image variability and therefore degrading feature selection performance. In this study, we propose a novel feature selection method driven by human brain activity signals collected using fMRI while human subjects were viewing natural images of different categories. The fMRI signals associated with subjects viewing different images encode the human perception of natural images, and therefore may capture image variability within and across categories. We then select image features with the guidance of fMRI signals from brain regions with active response to image viewing. In particular, bag-of-words features based on the GIST descriptor are extracted from natural images for classification, and a sparse regression based feature selection method is adapted to select image features that can best predict fMRI signals. Finally, a classification model is built on the selected image features to classify images without fMRI signals. Validation experiments on classifying images from 4 categories for two subjects demonstrate that our method can achieve much better classification performance than classifiers built on image features selected by traditional feature selection methods.

  16. Detection and Modeling of High-Dimensional Thresholds for Fault Detection and Diagnosis

    NASA Technical Reports Server (NTRS)

    He, Yuning

    2015-01-01

    Many Fault Detection and Diagnosis (FDD) systems use discrete models for detection and reasoning. To obtain categorical values like "oil pressure too high", analog sensor values need to be discretized using a suitable threshold. Time series of analog and discrete sensor readings are processed and discretized as they come in. This task is usually performed by the "wrapper code" of the FDD system, together with signal preprocessing and filtering. In practice, selecting the right threshold is very difficult, because it heavily influences the quality of diagnosis. If a threshold causes the alarm to trigger even in nominal situations, false alarms will be the consequence. On the other hand, if the threshold setting does not trigger in case of an off-nominal condition, important alarms might be missed, potentially causing hazardous situations. In this paper, we describe in detail the underlying statistical modeling techniques and algorithm as well as the Bayesian method for selecting the most likely shape and its parameters. Our approach is illustrated by several examples from the aerospace domain.
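
    The trade-off between false alarms and missed alarms described in this abstract can be illustrated with a minimal sketch. It assumes a single scalar threshold on one sensor, which is far simpler than the high-dimensional, Bayesian approach the paper describes; the function names, readings and fault labels below are hypothetical.

```python
def discretize(readings, threshold):
    """Map analog readings to a discrete flag (True means 'too high')."""
    return [value > threshold for value in readings]


def alarm_errors(readings, is_fault, threshold):
    """Return (false_alarms, missed_alarms) for one candidate threshold."""
    flags = discretize(readings, threshold)
    false_alarms = sum(1 for flag, fault in zip(flags, is_fault) if flag and not fault)
    missed_alarms = sum(1 for flag, fault in zip(flags, is_fault) if fault and not flag)
    return false_alarms, missed_alarms


# Hypothetical oil-pressure readings with ground-truth fault labels.
pressure = [40.0, 42.5, 41.0, 55.0, 58.5, 43.0]
fault = [False, False, False, True, True, False]

for candidate in (41.5, 50.0, 57.0):
    # A low threshold raises false alarms; a high one misses real faults.
    print(candidate, alarm_errors(pressure, fault, candidate))
```

    On this toy data the lowest threshold produces false alarms and the highest one misses a fault, which is the sensitivity to threshold choice that the abstract points out.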

  17. A Filter Feature Selection Method Based on MFA Score and Redundancy Excluding and It's Application to Tumor Gene Expression Data Analysis.

    PubMed

    Li, Jiangeng; Su, Lei; Pang, Zenan

    2015-12-01

    Feature selection techniques have been widely applied to tumor gene expression data analysis in recent years. A filter feature selection method named marginal Fisher analysis score (MFA score) which is based on graph embedding has been proposed, and it has been widely used mainly because it is superior to Fisher score. Considering the heavy redundancy in gene expression data, we proposed a new filter feature selection technique in this paper. It is named MFA score+ and is based on MFA score and redundancy excluding. We applied it to an artificial dataset and eight tumor gene expression datasets to select important features and then used support vector machine as the classifier to classify the samples. Compared with MFA score, t test and Fisher score, it achieved higher classification accuracy.
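
    The general idea of combining a filter score with redundancy exclusion can be sketched as follows. This is not the MFA score+ method itself: the sketch swaps in the plain Fisher score (one of the baselines named in the abstract) and uses a Pearson-correlation cutoff as an assumed redundancy test. The function names, the `max_corr` parameter and the toy data are all hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Between-class scatter divided by within-class scatter for one feature."""
    overall = mean(values)
    between = within = 0.0
    for cls in set(labels):
        group = [v for v, y in zip(values, labels) if y == cls]
        between += len(group) * (mean(group) - overall) ** 2
        within += len(group) * pvariance(group)
    return between / within if within > 0 else float("inf")


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    norm = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return cov / norm if norm > 0 else 0.0


def select_features(columns, labels, k, max_corr=0.9):
    """Take features in descending score order, skipping any that are
    strongly correlated with a feature that was already chosen."""
    ranked = sorted(range(len(columns)),
                    key=lambda j: fisher_score(columns[j], labels), reverse=True)
    chosen = []
    for j in ranked:
        if all(abs(pearson(columns[j], columns[s])) <= max_corr for s in chosen):
            chosen.append(j)
        if len(chosen) == k:
            break
    return chosen


# Toy data: feature 1 is a noisier near-copy of feature 0; feature 2 is weak but distinct.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
    [1.0, 1.4, 0.6, 3.0, 3.4, 2.6],
    [0.5, 0.1, 0.9, 0.4, 1.2, 1.1],
]
print(select_features(columns, labels, k=2))
```

    A pure score ranking would pick features 0 and 1 here; the redundancy check drops feature 1 in favour of the less correlated feature 2.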

  18. A comparison of three feature selection methods for object-based classification of sub-decimeter resolution UltraCam-L imagery

    USDA-ARS's Scientific Manuscript database

    The availability of numerous spectral, spatial, and contextual features with object-based image analysis (OBIA) renders the selection of optimal features a time consuming and subjective process. While several feature selection methods have been used in conjunction with OBIA, a robust comparison of th...

  19. Multi-level gene/MiRNA feature selection using deep belief nets and active learning.

    PubMed

    Ibrahim, Rania; Yousri, Noha A; Ismail, Mohamed A; El-Makky, Nagwa M

    2014-01-01

    Selecting the most discriminative genes/miRNAs has been raised as an important task in bioinformatics to enhance disease classifiers and to mitigate the dimensionality curse problem. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability in representing the data in multiple levels of abstraction, allowing for better discrimination between different classes. However, the idea of using deep learning for feature selection is not widely used in the bioinformatics field yet. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension to use the technique for miRNAs is presented by considering the biological relation between miRNAs and genes. Experimental results show that the approach was able to outperform classical feature selection methods in hepatocellular carcinoma (HCC) by 9%, lung cancer by 6% and breast cancer by around 10% in F1-measure. Results also show the enhancement in F1-measure of our approach over recently related work in [1] and [2].

  20. Feature engineering for drug name recognition in biomedical texts: feature conjunction and feature selection.

    PubMed

    Liu, Shengyu; Tang, Buzhou; Chen, Qingcai; Wang, Xiaolong; Fan, Xiaoming

    2015-01-01

    Drug name recognition (DNR) is a critical step for drug information extraction. Machine learning-based methods have been widely used for DNR with various types of features such as part-of-speech, word shape, and dictionary feature. Features used in current machine learning-based methods are usually singleton features which may be due to explosive features and a large number of noisy features when singleton features are combined into conjunction features. However, singleton features that can only capture one linguistic characteristic of a word are not sufficient to describe the information for DNR when multiple characteristics should be considered. In this study, we explore feature conjunction and feature selection for DNR, which have never been reported. We intuitively select 8 types of singleton features and combine them into conjunction features in two ways. Then, Chi-square, mutual information, and information gain are used to mine effective features. Experimental results show that feature conjunction and feature selection can improve the performance of the DNR system with a moderate number of features and our DNR system significantly outperforms the best system in the DDIExtraction 2013 challenge.
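
    As an illustration of one of the filter criteria named in this abstract, the chi-square statistic for a single binary feature against a binary class can be computed from a 2x2 contingency table. This is a minimal sketch of the standard formula only, not the authors' DNR pipeline; the function name and the toy token data are hypothetical.

```python
def chi_square_2x2(feature_present, is_positive):
    """Chi-square statistic of a binary feature versus a binary class label."""
    a = b = c = d = 0
    for present, positive in zip(feature_present, is_positive):
        if present and positive:
            a += 1      # feature present, positive class
        elif present:
            b += 1      # feature present, negative class
        elif positive:
            c += 1      # feature absent, positive class
        else:
            d += 1      # feature absent, negative class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical tokens: does a word-shape feature fire, and is the token a drug name?
shape_fires = [1, 1, 1, 0, 0, 0, 1, 0]
is_drug = [1, 1, 1, 0, 0, 0, 0, 1]
print(chi_square_2x2(shape_fires, is_drug))
```

    Features would then be ranked by this statistic and the top-scoring ones kept; how many to keep is a tuning choice that the sketch does not address.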

  1. Feature Selection for Speech Emotion Recognition in Spanish and Basque: On the Use of Machine Learning to Improve Human-Computer Interaction

    PubMed Central

    Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio

    2014-01-01

    The study of emotions in human–computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to find the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition across all feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) was applied and a comparison between the two is provided. Based on the results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686
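
    A greedy forward wrapper of the kind this abstract calls FSS-Forward can be sketched as below. The sketch assumes a 1-nearest-neighbour classifier scored by leave-one-out accuracy as the wrapped, instance-based learner; the paper's actual classifiers, features and stopping rule are not given in the abstract, so the function names and toy data are hypothetical.

```python
def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the given feature indices."""
    correct = 0
    for i, query in enumerate(rows):
        best_dist, best_label = None, None
        for j, other in enumerate(rows):
            if i == j:
                continue
            dist = sum((query[f] - other[f]) ** 2 for f in features)
            if best_dist is None or dist < best_dist:
                best_dist, best_label = dist, labels[j]
        correct += best_label == labels[i]
    return correct / len(rows)


def forward_selection(rows, labels):
    """Add the single best feature each round; stop when accuracy stops improving."""
    n_features = len(rows[0])
    selected, best_score = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [(loo_accuracy(rows, labels, selected + [f]), f) for f in candidates]
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score


# Toy data: feature 0 separates the classes, features 1 and 2 are misleading.
rows = [
    (0.1, 5.0, 1.0), (0.2, 1.0, 9.0), (0.3, 8.0, 4.0),
    (0.9, 5.1, 1.1), (1.0, 1.1, 9.1), (1.1, 8.1, 4.1),
]
labels = ["calm", "calm", "calm", "angry", "angry", "angry"]
print(forward_selection(rows, labels))
```

    Because every candidate subset is scored by retraining and evaluating the classifier, the cost grows with both the number of features and the number of rounds, which is the usual price of wrapper methods compared with filters.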

  2. Collecting wrappers, labels, and packages to enhance accuracy of food records among children 2-8 years in the Pacific region: Children's Healthy Living Program (CHL).

    PubMed

    Yonemori, Kim M; Ennis, Tui; Novotny, Rachel; Fialkowski, Marie K; Ettienne, Reynolette; Wilkens, Lynne R; Leon Guerrero, Rachael T; Bersamin, Andrea; Coleman, Patricia; Li, Fenfang; Boushey, Carol J

    2017-12-01

    The aim was to describe differences in dietary outcomes based on the provision of food wrappers, labels or packages (WLP) to complement data from dietary records (DR) among children from the US Affiliated Pacific. The WLP were intended to aid food coding. Since WLP can be associated with ultra-processed foods, one might expect differences in sodium, sugar, and other added ingredients to emerge. Dietary intakes of children (2-8 y) in Alaska, Hawai'i, Commonwealth of the Northern Mariana Islands, and Guam were collected using parent/caregiver completed 2-day DR. Parents were encouraged to collect WLP associated with the child's intake. Trained staff entered data from the DRs including the WLP when available using PacTrac3, a web application. Of the 1,868 DRs collected and entered at the time of this report, 498 (27%) included WLP. After adjusting for confounders (sex, age, location, education, food assistance), the DRs with WLP had significantly higher amounts of energy (kcal), total fat, saturated fat, added sugar, and sodium. These results suggest the inclusion of WLP enhanced the dietary intake data. The intake of energy, fat, added sugar and sodium derived from processed foods and foods consumed outside the home was better captured in children who had WLP.

  3. Adaptive runtime for a multiprocessing API

    DOEpatents

    Antao, Samuel F.; Bertolli, Carlo; Eichenberger, Alexandre E.; O'Brien, John K.

    2016-11-15

    A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.

  4. Adaptive runtime for a multiprocessing API

    DOEpatents

    Antao, Samuel F.; Bertolli, Carlo; Eichenberger, Alexandre E.; O'Brien, John K.

    2016-10-11

    A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.

  5. Infrared face recognition based on LBP histogram and KW feature selection

    NASA Astrophysics Data System (ADS)

    Xie, Zhihua

    2014-07-01

    The conventional LBP-based feature, as represented by the local binary pattern (LBP) histogram, still has room for performance improvement. This paper focuses on the dimension reduction of LBP micro-patterns and proposes an improved infrared face recognition method based on LBP histogram representation. To extract locally robust features in infrared face images, LBP is chosen to obtain the composition of micro-patterns of sub-blocks. Based on statistical test theory, a Kruskal-Wallis (KW) feature selection method is proposed to obtain the LBP patterns that are suitable for infrared face recognition. The experimental results show that the combination of LBP and KW feature selection improves the performance of infrared face recognition, and that the proposed method outperforms traditional methods based on the LBP histogram, discrete cosine transform (DCT) or principal component analysis (PCA).
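
    The Kruskal-Wallis statistic that underlies this kind of selection can be sketched in a few lines. The code below computes the standard H statistic for one feature (for example, one LBP histogram bin) across classes, using average ranks for ties but omitting the tie-correction factor; how the paper turns H values into a selected pattern set is not stated in the abstract, so the function names and data are hypothetical.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(values, labels):
    """H = 12 / (n (n + 1)) * sum(R_c^2 / n_c) - 3 (n + 1), without tie correction."""
    n = len(values)
    ranks = average_ranks(values)
    total = 0.0
    for cls in set(labels):
        group = [r for r, y in zip(ranks, labels) if y == cls]
        total += sum(group) ** 2 / len(group)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical bin values for six images of two subjects.
subjects = ["a", "a", "a", "b", "b", "b"]
separating_bin = [1, 2, 3, 4, 5, 6]   # values differ clearly between subjects
mixed_bin = [1, 4, 5, 2, 3, 6]        # values overlap between subjects
print(kruskal_wallis_h(separating_bin, subjects), kruskal_wallis_h(mixed_bin, subjects))
```

    A larger H means the feature's rank distribution differs more between classes, so bins would be kept in descending order of H.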

  6. [Feature extraction for breast cancer data based on geometric algebra theory and feature selection using differential evolution].

    PubMed

    Li, Jing; Hong, Wenxue

    2014-12-01

    Feature extraction and feature selection are important issues in pattern recognition. Based on the geometric algebra representation of vectors, a new feature extraction method using the blade coefficients of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to solve the resulting high-dimensionality issue. Simple linear discriminant analysis was used as the classifier. The 10-fold cross-validation (10 CV) classification result on a public breast cancer biomedical dataset was more than 96% and proved superior to that of the original features and of the traditional feature extraction method.

  7. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    PubMed

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well-known in evaluating classification performance in biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find out disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find real target feature subset due to their lack of effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by a trick of measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can get the minimal balanced error rate with a small amount of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
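
    The per-feature ROC evaluation that this abstract takes as its starting point can be sketched as the area under the curve computed directly from pairwise comparisons. This covers only the individual-feature baseline, not the complementarity measure the authors propose; the function name and values are hypothetical.

```python
def feature_auc(values, labels):
    """AUC of a single feature used as a score: the fraction of
    (positive, negative) pairs ranked correctly, with ties counting half."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical expression levels of one gene in two controls (0) and two cases (1).
expression = [0.1, 0.4, 0.35, 0.8]
status = [0, 0, 1, 1]
print(feature_auc(expression, status))
```

    Ranking features by this value treats each one in isolation, which is why two highly ranked features can still be redundant with each other.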

  8. Emotional textile image classification based on cross-domain convolutional sparse autoencoders with feature selection

    NASA Astrophysics Data System (ADS)

    Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin

    2017-01-01

    We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.

  9. A feature selection approach towards progressive vector transmission over the Internet

    NASA Astrophysics Data System (ADS)

    Miao, Ru; Song, Jia; Feng, Min

    2017-09-01

    WebGIS has been widely applied for visualizing and sharing geospatial information over the Internet. In order to improve the efficiency of client applications, a web-based progressive vector transmission approach is proposed. Important features should be selected and transferred first, and methods for measuring the importance of features should be further considered in progressive transmission. However, studies on progressive transmission of large-volume vector data have mostly focused on map generalization in the field of cartography, and have rarely discussed the quantitative selection of geographic features. This paper applies information theory to measure the feature importance of vector maps. A measurement model for the amount of information of vector features is defined, based upon the amount of information, for dealing with feature selection issues. The measurement model involves a geometry factor, a spatial distribution factor and a thematic attribute factor. Moreover, a real-time transport protocol (RTP)-based progressive transmission method is presented to improve the transmission of vector data. To clearly demonstrate the essential methodology and key techniques, a prototype for web-based progressive vector transmission is presented, and an experiment of progressive selection and transmission for vector features is conducted. The experimental results indicate that our approach clearly improves the performance and end-user experience of delivering and manipulating large vector data over the Internet.

  10. HIV-1 protease cleavage site prediction based on two-stage feature selection method.

    PubMed

    Niu, Bing; Yuan, Xiao-Cheng; Roeper, Preston; Su, Qiang; Peng, Chun-Rong; Yin, Jing-Yuan; Ding, Juan; Li, HaiPeng; Lu, Wen-Cong

    2013-03-01

    Knowledge of the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors. Searching for an accurate, robust, and rapid method to correctly predict the cleavage sites in proteins is crucial when searching for possible HIV inhibitors. In this article, HIV-1 protease specificity was studied using the correlation-based feature subset (CfsSubset) selection method combined with Genetic Algorithms method. Thirty important biochemical features were found based on a jackknife test from the original data set containing 4,248 features. By using the AdaBoost method with the thirty selected features the prediction model yields an accuracy of 96.7% for the jackknife test and 92.1% for an independent set test, with increased accuracy over the original dataset by 6.7% and 77.4%, respectively. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.

  11. 7 CFR 30.36 - Class 1; flue-cured types and groups.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of... extent, in Alabama. Groups applicable to types 11, 12, 13, and 14: A—Wrappers. B—Leaf. H—Smoking Leaf. C...

  12. 7 CFR 30.36 - Class 1; flue-cured types and groups.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of... extent, in Alabama. Groups applicable to types 11, 12, 13, and 14: A—Wrappers. B—Leaf. H—Smoking Leaf. C...

  13. 7 CFR 30.37 - Class 2; fire-cured types and groups.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of.... Groups applicable to types 21, 22, and 23: A—Wrappers. B—Heavy Leaf. C—Thin Leaf. X—Lugs. N—Nondescript...

  14. 7 CFR 30.36 - Class 1; flue-cured types and groups.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of... extent, in Alabama. Groups applicable to types 11, 12, 13, and 14: A—Wrappers. B—Leaf. H—Smoking Leaf. C...

  15. 7 CFR 30.36 - Class 1; flue-cured types and groups.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of... extent, in Alabama. Groups applicable to types 11, 12, 13, and 14: A—Wrappers. B—Leaf. H—Smoking Leaf. C...

  16. 7 CFR 30.36 - Class 1; flue-cured types and groups.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of... extent, in Alabama. Groups applicable to types 11, 12, 13, and 14: A—Wrappers. B—Leaf. H—Smoking Leaf. C...

  17. 7 CFR 30.37 - Class 2; fire-cured types and groups.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of.... Groups applicable to types 21, 22, and 23: A—Wrappers. B—Heavy Leaf. C—Thin Leaf. X—Lugs. N—Nondescript...

  18. 7 CFR 30.37 - Class 2; fire-cured types and groups.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of.... Groups applicable to types 21, 22, and 23: A—Wrappers. B—Heavy Leaf. C—Thin Leaf. X—Lugs. N—Nondescript...

  19. 7 CFR 30.37 - Class 2; fire-cured types and groups.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of.... Groups applicable to types 21, 22, and 23: A—Wrappers. B—Heavy Leaf. C—Thin Leaf. X—Lugs. N—Nondescript...

  20. 7 CFR 30.37 - Class 2; fire-cured types and groups.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS TOBACCO STOCKS AND STANDARDS Classification of Leaf Tobacco Covering Classes, Types and Groups of.... Groups applicable to types 21, 22, and 23: A—Wrappers. B—Heavy Leaf. C—Thin Leaf. X—Lugs. N—Nondescript...

  1. Risk of Salmonellosis from Chicken Parts Prepared from Whole Chickens Sold in Flow Pack Wrappers and Subjected to Temperature Abuse.

    PubMed

    Oscar, T P

    2017-09-01

    The flow pack wrapper is a popular packaging choice for retail sale of whole chickens. However, it may provide a favorable environment for growth and spread of Salmonella within the package, leading to an outbreak of salmonellosis. To investigate this possibility, a process risk model was developed that predicted the risk of salmonellosis from chicken parts prepared from whole chickens sold in flow pack wrappers and subjected to proper storage (6 h at 4°C) or improper storage (72 h at 15°C) before preparation. The model had four unit operations (pathogen events): (i) preparation (contamination), (ii) cooking (death), (iii) serving (cross-contamination), and (iv) consumption (dose-response). Data for prevalence, number, and serotype of Salmonella on chicken parts were obtained by whole sample enrichment, real-time PCR. Improper storage increased (P < 0.05) prevalence of Salmonella on raw chicken parts from 10.6% (17 of 160) to 41.2% (66 of 160) and incidence of cross-contamination of cooked chicken from 10% (4 of 40) to 52.2% (24 of 46). Improper storage also increased (P < 0.05) the number (mean ± standard deviation) of Salmonella from 0.017 ± 0.030 to 3.51 ± 1.34 log per raw chicken part and from 0.048 ± 0.089 to 3.08 ± 1.50 log per cooked chicken part. The predominant serotypes isolated (n = 111) were Typhimurium (34.2%), Typhimurium var 5- (20.7%), Kentucky (12.6%), Enteritidis (11.7%), and Heidelberg (8.1%). When chicken was properly stored before preparation, the model predicted that risk of salmonellosis was low and sporadic with only six cases per 100 simulations of 10^5 chicken parts. However, when 0.1 to 1% of chickens were improperly stored before preparation, the model predicted that salmonellosis would increase (P < 0.05) linearly from a median of 7 (range, 1 to 15) to a median of 72 (range, 52 to 93) cases per 10^5 chicken parts. These results indicated that the flow pack wrapper provided a favorable environment for growth and spread of Salmonella within the package and that even when only a small percentage of packages were subjected to improper storage before preparation, the risk and size of an outbreak of salmonellosis from chicken parts increased significantly.

  2. SNPs selection using support vector regression and genetic algorithms in GWAS

    PubMed Central

    2014-01-01

    Introduction This paper proposes a new methodology to simultaneously select the most relevant SNP markers for the characterization of any measurable phenotype described by a continuous variable, using Support Vector Regression with the Pearson Universal kernel (PUK) as the fitness function of a binary genetic algorithm. The proposed methodology is multi-attribute, considering several markers simultaneously to explain the phenotype, and is based jointly on statistical tools, machine learning and computational intelligence. Results The suggested method has shown potential in simulated database 1, with additive effects only, and in the real database. In this simulated database, with a total of 1,000 markers, of which 7 have a major effect on the phenotype and the other 993 SNPs represent noise, the method identified 21 markers. Of these, 5 are among the 7 relevant SNPs and 16 are false positives. In the real database, initially with 50,752 SNPs, we reduced the set to 3,073 markers, increasing the accuracy of the model. In simulated database 2, with additive effects and interactions (epistasis), the proposed method matched the methodology most commonly used in GWAS. Conclusions The method suggested in this paper demonstrates its effectiveness in explaining the real phenotype (PTA for milk): with the application of the wrapper based on a genetic algorithm and Support Vector Regression with the Pearson Universal kernel, many redundant markers were eliminated, increasing the prediction and accuracy of the model on the real database without quality control filters. The PUK demonstrated that it can replicate the performance of linear and RBF kernels. PMID:25573332
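
    The wrapper described in this abstract, a binary genetic algorithm whose fitness is the error of a regression model trained on the candidate marker subset, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: a leave-one-out k-nearest-neighbour regressor stands in for Support Vector Regression with the Pearson Universal kernel, and the function names (`loo_knn_mse`, `fitness`, `ga_select`), the per-feature penalty, and the synthetic data are all assumptions made for the example.

```python
import random

def loo_knn_mse(X, y, mask, k=3):
    """Leave-one-out MSE of a k-NN regressor using only features with mask == 1."""
    cols = [j for j, m in enumerate(mask) if m]
    if not cols:
        return float("inf")  # an empty subset cannot predict anything
    total = 0.0
    for i in range(len(X)):
        # Squared distances from sample i to every other sample
        dists = sorted(
            (sum((X[i][j] - X[t][j]) ** 2 for j in cols), t)
            for t in range(len(X)) if t != i
        )
        pred = sum(y[t] for _, t in dists[:k]) / k
        total += (y[i] - pred) ** 2
    return total / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Higher is better: low prediction error and few selected features."""
    return -loo_knn_mse(X, y, mask) - penalty * sum(mask)

def ga_select(X, y, pop_size=20, generations=15, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(X[0])
    # Start from the full feature set plus random binary masks
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        scored = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        next_pop = scored[:2]  # elitism: the two best masks survive unchanged
        while len(next_pop) < pop_size:
            a, b = rng.sample(scored[:pop_size // 2], 2)  # parents from top half
            cut = rng.randrange(1, n)                     # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=lambda m: fitness(X, y, m))

# Synthetic data: only features 0 and 1 influence the phenotype
data_rng = random.Random(1)
X = [[data_rng.uniform(-1, 1) for _ in range(6)] for _ in range(40)]
y = [2 * row[0] - row[1] for row in X]
best = ga_select(X, y)
print(best)
```

    Because the initial population contains the full feature set and the best masks are carried over each generation, the returned subset never scores worse than using all features under this fitness.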

  3. Inexpensive Audio Activities: Earbud-based Sound Experiments

    NASA Astrophysics Data System (ADS)

    Allen, Joshua; Boucher, Alex; Meggison, Dean; Hruby, Kate; Vesenka, James

    2016-11-01

    Inexpensive alternatives to a number of classic introductory physics sound laboratories are presented including interference phenomena, resonance conditions, and frequency shifts. These can be created using earbuds, economical supplies such as Giant Pixie Stix® wrappers, and free software available for PCs and mobile devices. We describe two interference laboratories (beat frequency and two-speaker interference) and two resonance laboratories (quarter- and half-wavelength). Lastly, a Doppler laboratory using rotating earbuds is explained. The audio signal captured by all experiments is analyzed on free spectral analysis software and many of the experiments incorporate the unifying theme of measuring the speed of sound in air.

  4. A novel feature extraction approach for microarray data based on multi-algorithm fusion

    PubMed Central

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods to reduce dimension in data mining, with the emergence of high dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Like learning methods, feature extraction has a problem in its generalization ability, namely robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select the features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested against gene expression datasets including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The testing results show that the performance of this algorithm is better than existing solutions. PMID:25780277
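
    One simple way to fuse the output of several feature ranking algorithms is rank aggregation. The abstract does not say which fusion rule the authors use, so the sketch below assumes a Borda count purely for illustration; the function name `borda_fuse` and the gene labels are hypothetical.

```python
def borda_fuse(rankings):
    """Fuse several rankings (each ordered best to worst) with a Borda count."""
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for pos, feat in enumerate(ranking):
            # The top feature of an n-item ranking earns n points, the last earns 1
            scores[feat] = scores.get(feat, 0) + (n - pos)
    # Highest total first; ties broken alphabetically for determinism
    return sorted(scores, key=lambda f: (-scores[f], f))

# Three hypothetical rankers that disagree on the order of four genes
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g3"],
    ["g2", "g3", "g1", "g4"],
]
fused = borda_fuse(rankings)
print(fused)  # ['g2', 'g1', 'g3', 'g4']
```

    A gene that ranks consistently high across algorithms ends up ahead of one that a single algorithm favours, which is the robustness argument the abstract makes for fusion.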

  5. A novel feature extraction approach for microarray data based on multi-algorithm fusion.

    PubMed

    Jiang, Zhu; Xu, Rong

    2015-01-01

    Feature extraction is one of the most important and effective methods to reduce dimension in data mining, with the emergence of high dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes. One is to identify certain disease-related genes. The other is to find a compact set of discriminative genes to build a pattern classifier with reduced complexity and improved generalization capabilities. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are evaluated on an individual basis, generally without considering inter-relationships between features, while set-based feature extraction evaluates features based on their role in a feature set by taking into account dependency between features. Like learning methods, feature extraction has a problem in its generalization ability, namely robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select the features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested against gene expression datasets including Colon cancer data, CNS data, DLBCL data, and Leukemia data. The testing results show that the performance of this algorithm is better than existing solutions.

  6. Spectral Band Selection for Urban Material Classification Using Hyperspectral Libraries

    NASA Astrophysics Data System (ADS)

    Le Bris, A.; Chehata, N.; Briottet, X.; Paparoditis, N.

    2016-06-01

    In urban areas, information concerning very high resolution land cover and especially material maps are necessary for several city modelling or monitoring applications. That is to say, knowledge concerning the roofing materials or the different kinds of ground areas is required. Airborne remote sensing techniques appear to be convenient for providing such information at a large scale. However, results obtained using most traditional processing methods based on usual red-green-blue-near infrared multispectral images remain limited for such applications. A possible way to improve classification results is to enhance the imagery spectral resolution using superspectral or hyperspectral sensors. In this study, it is intended to design a superspectral sensor dedicated to urban materials classification and this work particularly focused on the selection of the optimal spectral band subsets for such a sensor. First, reflectance spectral signatures of urban materials were collected from 7 spectral libraries. Then, spectral optimization was performed using this data set. The band selection workflow included two steps, optimising first the number of spectral bands using an incremental method and then examining several possible optimised band subsets using a stochastic algorithm. The same wrapper relevance criterion relying on a confidence measure of a Random Forests classifier was used at both steps. To cope with the limited number of available spectra for several classes, additional synthetic spectra were generated from the collection of reference spectra: intra-class variability was simulated by multiplying reference spectra by a random coefficient. At the end, selected band subsets were evaluated considering the classification quality reached using an RBF SVM classifier. It was confirmed that a limited band subset was sufficient to classify common urban materials. The important contribution of bands from the Short Wave Infra-Red (SWIR) spectral domain (1000-2400 nm) to material classification was also shown.
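
    The incremental step of such a band selection workflow is commonly implemented as greedy forward selection driven by a wrapper criterion. The sketch below is an illustration under stated assumptions rather than the authors' procedure: leave-one-out 1-nearest-neighbour accuracy replaces the Random Forests confidence measure, and the names `loo_1nn_accuracy` and `forward_select` and the toy reflectance values are invented for the example.

```python
def loo_1nn_accuracy(X, y, cols):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the bands in cols."""
    correct = 0
    for i in range(len(X)):
        nearest = min(
            (t for t in range(len(X)) if t != i),
            key=lambda t: sum((X[i][j] - X[t][j]) ** 2 for j in cols),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def forward_select(X, y, max_bands):
    """Greedily add the band that most improves the wrapper score."""
    selected, history = [], []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_bands:
        best = max(remaining, key=lambda j: loo_1nn_accuracy(X, y, selected + [j]))
        selected.append(best)
        remaining.remove(best)
        history.append(loo_1nn_accuracy(X, y, selected))
    return selected, history

# Toy spectra: three bands, two material classes; only band 1 separates them
X = [
    [0.50, 0.10, 0.3], [0.40, 0.20, 0.9], [0.60, 0.15, 0.5],
    [0.50, 0.90, 0.4], [0.45, 0.80, 0.8], [0.55, 0.85, 0.2],
]
y = [0, 0, 0, 1, 1, 1]
bands, scores = forward_select(X, y, max_bands=2)
print(bands, scores)
```

    Plotting the score history against the number of selected bands is one way to choose how many bands are worth keeping before a second, stochastic search refines the subset.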

  7. System Complexity Reduction via Feature Selection

    ERIC Educational Resources Information Center

    Deng, Houtao

    2011-01-01

    This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree…

  8. IMMAN: free software for information theory-based chemometric analysis.

    PubMed

    Urias, Ricardo W Pino; Barigye, Stephen J; Marrero-Ponce, Yovani; García-Jacas, César R; Valdes-Martiní, José R; Perez-Gimenez, Facundo

    2015-05-01

    The features and theoretical background of a new and free computational program for chemometric analysis denominated IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a remarkably user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches in each case. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. On the other hand, a generalization scheme for the previously defined differential Shannon's entropy is discussed, as well as the introduction of Jeffreys information measure for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty are incorporated to the IMMAN software ( http://mobiosd-hub.com/imman-soft/ ), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Moreover, single parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, as well as comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. 
In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA supervised algorithms. Graphic representation for Shannon's distribution of MD calculating software.

  9. Feature selection for elderly faller classification based on wearable sensors.

    PubMed

    Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D

    2017-05-30

    Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. 
Feature selection provided models with smaller feature sets and improved faller classification compared to faller classification without feature selection. CFS and FCBF provided the best feature subset (one posterior pelvis accelerometer feature) for faller classification. However, better sensitivity was achieved by the second best model based on a Relief-F feature subset with three pressure-sensing insole features and seven head accelerometer features. Feature selection should be considered as an important step in faller classification using wearable sensors.
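
    The gap between accuracy and sensitivity reported here is typical of imbalanced classes (76 non-fallers against 24 fallers). The helper below shows how accuracy, sensitivity, specificity, F1 score, and the Matthews correlation coefficient (MCC) follow from a binary confusion matrix. The counts in the example are hypothetical, chosen only to mimic a 25-sample holdout; they are not taken from the study, and the function assumes every denominator is nonzero.

```python
import math

def binary_metrics(tp, fn, tn, fp):
    """Standard metrics from a binary confusion matrix (positives = fallers)."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }

# Hypothetical holdout: 6 fallers (3 detected), 19 non-fallers (17 correct)
m = binary_metrics(tp=3, fn=3, tn=17, fp=2)
for name, value in m.items():
    print(f"{name}: {value:.2f}")
```

    With these counts accuracy is 0.80 while sensitivity is only 0.50, which is why F1 and MCC are reported alongside accuracy when the faller class is small.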

  10. Status of advanced fuel candidates for Sodium Fast Reactor within the Generation IV International Forum

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    F. Delage; J. Carmack; C. B. Lee

    2013-10-01

    The main challenge for fuels for future Sodium Fast Reactor systems is the development and qualification of a nuclear fuel sub-assembly which meets the Generation IV International Forum goals. The Advanced Fuel project investigates high burn-up minor actinide bearing fuels as well as claddings and wrappers to withstand high neutron doses and temperatures. The R&D outcome of national and collaborative programs has been collected and shared between the AF project members in order to review the capability of sub-assembly material and fuel candidates, to identify the issues and select the viable options. Based on historical experience and knowledge, both oxide and metal fuels emerge as primary options to meet the performance and the reliability goals of Generation IV SFR systems. There is a significant positive experience on carbide fuels but major issues remain to be overcome: strong in-pile swelling, atmosphere required for fabrication as well as Pu and Am losses. The irradiation performance database for nitride fuels is limited with longer term R&D activities still required. The promising core material candidates are Ferritic/Martensitic (F/M) and Oxide Dispersed Strengthened (ODS) steels.

  11. Evaluation of Semi-supervised Learning for Classification of Protein Crystallization Imagery

    PubMed Central

    Sigdel, Madhav; Dinç, İmren; Dinç, Semih; Sigdel, Madhu S.; Pusey, Marc L.; Aygün, Ramazan S.

    2015-01-01

    In this paper, we investigate the performance of two wrapper methods for semi-supervised learning algorithms for classification of protein crystallization images with limited labeled images. Firstly, we evaluate the performance of a semi-supervised approach using self-training with naïve Bayesian (NB) and sequential minimal optimization (SMO) as the base classifiers. The confidence values returned by these classifiers are used to select high-confidence predictions to be used for self-training. Secondly, we analyze the performance of Yet Another Two Stage Idea (YATSI) semi-supervised learning using NB, SMO, multilayer perceptron (MLP), J48 and random forest (RF) classifiers. These results are compared with the basic supervised learning using the same training sets. We perform our experiments on a dataset consisting of 2250 protein crystallization images for different proportions of training and test data. Our results indicate that NB and SMO using both self-training and YATSI semi-supervised approaches improve accuracies with respect to supervised learning. On the other hand, MLP, J48 and RF perform better using basic supervised learning. Overall, the random forest classifier yields the best accuracy with supervised learning for our dataset. PMID:25914518
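
    Self-training wraps any base classifier that returns a confidence value: train on the labeled set, label the unlabeled samples the model is most confident about, add them to the training set, and repeat. The sketch below assumes a one-dimensional nearest-centroid classifier in place of NB or SMO, a two-class problem, and a distance-ratio confidence; the names `centroids`, `predict_with_confidence`, and `self_train`, the 0.75 threshold, and the data are all hypothetical.

```python
def centroids(points, labels):
    """Mean feature value per class."""
    sums, counts = {}, {}
    for x, c in zip(points, labels):
        sums[c] = sums.get(c, 0.0) + x
        counts[c] = counts.get(c, 0) + 1
    return {c: sums[c] / counts[c] for c in sums}

def predict_with_confidence(cents, x):
    """Nearest centroid plus a confidence in [0.5, 1]; assumes at least two classes."""
    ranked = sorted(cents, key=lambda c: abs(x - cents[c]))
    d_best = abs(x - cents[ranked[0]])
    d_next = abs(x - cents[ranked[1]])
    conf = 1.0 if d_best + d_next == 0 else d_next / (d_best + d_next)
    return ranked[0], conf

def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.75, max_rounds=10):
    lx, ly, pool = list(labeled_x), list(labeled_y), list(unlabeled_x)
    for _ in range(max_rounds):
        cents = centroids(lx, ly)
        keep, added = [], False
        for x in pool:
            label, conf = predict_with_confidence(cents, x)
            if conf >= threshold:
                # Confident prediction: treat it as a label from now on
                lx.append(x)
                ly.append(label)
                added = True
            else:
                keep.append(x)
        pool = keep
        if not added or not pool:
            break  # nothing new was learned, or nothing is left to label
    return centroids(lx, ly), pool

model, leftover = self_train(
    [0.0, 10.0], ["clear", "crystal"], [1.0, 2.0, 9.0, 8.5, 5.0]
)
print(model, leftover)
```

    The ambiguous sample at 5.0 never clears the confidence threshold and stays unlabeled, which is the intended behaviour: self-training only helps when the base classifier's confident predictions are mostly right.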

  12. Dynamic publication model for neurophysiology databases.

    PubMed

    Gardner, D; Abato, M; Knuth, K H; DeBellis, R; Erde, S M

    2001-08-29

    We have implemented a pair of database projects, one serving cortical electrophysiology and the other invertebrate neurones and recordings. The design for each combines aspects of two proven schemes for information interchange. The journal article metaphor determined the type, scope, organization and quantity of data to comprise each submission. Sequence databases encouraged intuitive tools for data viewing, capture, and direct submission by authors. Neurophysiology required transcending these models with new datatypes. Time-series, histogram and bivariate datatypes, including illustration-like wrappers, were selected by their utility to the community of investigators. As interpretation of neurophysiological recordings depends on context supplied by metadata attributes, searches are via visual interfaces to sets of controlled-vocabulary metadata trees. Neurones, for example, can be specified by metadata describing functional and anatomical characteristics. Permanence is advanced by data model and data formats largely independent of contemporary technology or implementation, including Java and the XML standard. All user tools, including dynamic data viewers that serve as a virtual oscilloscope, are Java-based, free, multiplatform, and distributed by our application servers to any contemporary networked computer. Copyright is retained by submitters; viewer displays are dynamic and do not violate copyright of related journal figures. Panels of neurophysiologists view and test schemas and tools, enhancing community support.

  13. The Resource, Spring 2002

    DTIC Science & Technology

    2002-01-01

    wrappers to other widely used languages, namely TCL/TK, Java, and Python. VTK is very powerful and covers polygonal models and image processing classes and...follows: Large Data Visualization and Rendering; Information Visualization for Beginners; Rendering and Visualization in Parallel Environments

  14. 76 FR 48902 - Solicitation of Nominations for the United States Department of Labor's Iqbal Masih Award for the...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-09

    ... application wrapper or other documentary evidence or receipt maintained by that office. Applications sent by... Office of Management and Budget (OMB) control number. This collection of information is approved under...

  15. Feature selection for wearable smartphone-based human activity recognition with able bodied, elderly, and stroke patients.

    PubMed

    Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie

    2015-01-01

    Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
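
    Of the three filter methods named above, Correlation-based Feature Selection scores a subset of k features with the merit k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation. The function below is a minimal sketch of that merit only (not the subset search around it); the name `cfs_merit` and the correlation values are assumptions for illustration.

```python
import math

def cfs_merit(feature_class_corrs, feature_feature_corrs):
    """CFS merit of a subset from its pairwise correlations.

    feature_class_corrs: one correlation per feature with the class label.
    feature_feature_corrs: one correlation per pair of features in the subset.
    """
    k = len(feature_class_corrs)
    r_cf = sum(abs(r) for r in feature_class_corrs) / k
    r_ff = (
        sum(abs(r) for r in feature_feature_corrs) / len(feature_feature_corrs)
        if feature_feature_corrs else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

# Same relevance, different redundancy among three features
low_redundancy = cfs_merit([0.6, 0.6, 0.6], [0.2, 0.2, 0.2])
high_redundancy = cfs_merit([0.6, 0.6, 0.6], [0.9, 0.9, 0.9])
print(round(low_redundancy, 3), round(high_redundancy, 3))
```

    The merit rewards features that correlate with the class and penalises features that correlate with each other, and it never consults a classifier, which is what makes the resulting subsets classifier-independent.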

  16. Feature Selection for Wearable Smartphone-Based Human Activity Recognition with Able bodied, Elderly, and Stroke Patients

    PubMed Central

    2015-01-01

    Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations. PMID:25885272

  17. Feature Selection Method Based on Neighborhood Relationships: Applications in EEG Signal Identification and Chinese Character Recognition

    PubMed Central

    Zhao, Yu-Xiang; Chou, Chien-Hsing

    2016-01-01

    In this study, a new feature selection algorithm, the neighborhood-relationship feature selection (NRFS) algorithm, is proposed for identifying rat electroencephalogram signals and recognizing Chinese characters. In these two applications, dependent relationships exist among the feature vectors and their neighboring feature vectors. Therefore, the proposed NRFS algorithm was designed for solving this problem. By applying the NRFS algorithm, unselected feature vectors have a high priority of being added into the feature subset if the neighboring feature vectors have been selected. In addition, selected feature vectors have a high priority of being eliminated if the neighboring feature vectors are not selected. In the experiments conducted in this study, the NRFS algorithm was compared with two feature algorithms. The experimental results indicated that the NRFS algorithm can extract the crucial frequency bands for identifying rat vigilance states and identifying crucial character regions for recognizing Chinese characters. PMID:27314346

  18. Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.

    PubMed

    Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor

    2010-08-01

    Biomarker discovery is a typical application from functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution for the feature selection problem. However, swarm intelligence settings for feature selection fail to select small feature subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm in 11 microarray datasets for brain, leukemia, lung, prostate, and others. We show that the proposed swarm intelligence algorithm successfully increases the classification accuracy and decreases the number of selected features compared to other swarm intelligence methods.

  19. Computational Prediction of Protein Epsilon Lysine Acetylation Sites Based on a Feature Selection Method.

    PubMed

    Gao, JianZhao; Tao, Xue-Wen; Zhao, Jia; Feng, Yuan-Ming; Cai, Yu-Dong; Zhang, Ning

    2017-01-01

    Lysine acetylation, as one type of post-translational modification (PTM), plays key roles in cellular regulation and can be involved in a variety of human diseases. However, it is often costly and time-consuming to use traditional experimental approaches to identify the lysine acetylation sites. Therefore, effective computational methods should be developed to predict the acetylation sites. In this study, we developed a position-specific method for epsilon lysine acetylation site prediction. Sequences of acetylated proteins were retrieved from the UniProt database. Various kinds of features such as position specific scoring matrix (PSSM), amino acid factors (AAF), and disorders were incorporated. A feature selection method based on mRMR (Maximum Relevance Minimum Redundancy) and IFS (Incremental Feature Selection) was employed. Finally, 319 optimal features were selected from a total of 541 features. Using the 319 optimal features to encode peptides, a predictor was constructed based on dagging. As a result, an accuracy of 69.56% with MCC of 0.2792 was achieved. We analyzed the optimal features, which suggested some important factors determining the lysine acetylation sites. We developed a position-specific method for epsilon lysine acetylation site prediction. A set of optimal features was selected. Analysis of the optimal features provided insights into the mechanism of lysine acetylation sites, providing guidance for experimental validation.
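
    The mRMR step orders features by trading relevance to the class against redundancy with features already chosen, and IFS then evaluates the nested prefixes of that order to pick the best-performing one. The sketch below assumes the relevance and redundancy values are already computed (in practice they would be mutual information estimates) and uses the difference form of the criterion; `mrmr_order` and the numbers are hypothetical.

```python
def mrmr_order(relevance, redundancy):
    """Greedy mRMR ordering.

    relevance[j]: relevance of feature j to the class.
    redundancy[j][s]: redundancy between features j and s.
    """
    n = len(relevance)
    order = [max(range(n), key=lambda j: relevance[j])]  # most relevant first
    while len(order) < n:
        rest = [j for j in range(n) if j not in order]

        def score(j):
            # Relevance minus mean redundancy with the features picked so far
            return relevance[j] - sum(redundancy[j][s] for s in order) / len(order)

        order.append(max(rest, key=score))
    return order

relevance = [0.90, 0.85, 0.30]
redundancy = [
    [1.00, 0.95, 0.10],
    [0.95, 1.00, 0.10],
    [0.10, 0.10, 1.00],
]
order = mrmr_order(relevance, redundancy)
# Nested candidate subsets that an IFS step would evaluate one by one
ifs_subsets = [order[: i + 1] for i in range(len(order))]
print(order, ifs_subsets)
```

    Feature 1 is almost as relevant as feature 0 but nearly duplicates it, so the weaker but independent feature 2 is ranked ahead of it.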

  20. Regression-Based Approach For Feature Selection In Classification Issues. Application To Breast Cancer Detection And Recurrence

    NASA Astrophysics Data System (ADS)

    Belciug, Smaranda; Serbanescu, Mircea-Sebastian

    2015-09-01

    Feature selection is considered a key factor in classification/decision problems. It is currently used in designing intelligent decision systems to choose the features which allow the best performance. This paper proposes a regression-based approach to select the most important predictors in order to significantly increase the classification performance. Application to breast cancer detection and recurrence using publicly available datasets demonstrated the efficiency of this technique.

  1. Minimizing the semantic gap in biomedical content-based image retrieval

    NASA Astrophysics Data System (ADS)

    Guan, Haiying; Antani, Sameer; Long, L. Rodney; Thoma, George R.

    2010-03-01

    A major challenge in biomedical Content-Based Image Retrieval (CBIR) is to achieve meaningful mappings that minimize the semantic gap between the high-level biomedical semantic concepts and the low-level visual features in images. This paper presents a comprehensive learning-based scheme toward meeting this challenge and improving retrieval quality. The article presents two algorithms: a learning-based feature selection and fusion algorithm and the Ranking Support Vector Machine (Ranking SVM) algorithm. The feature selection algorithm aims to select 'good' features and fuse them using different similarity measurements to provide a better representation of the high-level concepts with the low-level image features. Ranking SVM is applied to learn the retrieval rank function and associate the selected low-level features with query concepts, given the ground-truth ranking of the training samples. The proposed scheme addresses four major issues in CBIR to improve the retrieval accuracy: image feature extraction, selection and fusion, similarity measurements, the association of the low-level features with high-level concepts, and the generation of the rank function to support high-level semantic image retrieval. It models the relationship between semantic concepts and image features, and enables retrieval at the semantic level. We apply it to the problem of vertebra shape retrieval from a digitized spine x-ray image set collected by the second National Health and Nutrition Examination Survey (NHANES II). The experimental results show an improvement of up to 41.92% in the mean average precision (MAP) over conventional image similarity computation methods.
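The mean average precision (MAP) figure reported above is a standard ranking metric. A minimal sketch of how it is computed follows, assuming each query is given as a list of 0/1 relevance flags in ranked order and that every relevant item appears somewhere in the ranking. The function names are illustrative and not taken from the paper.

```python
def average_precision(ranked_relevance):
    """Average precision for one query; input is 0/1 relevance flags in ranked order."""
    hits, precision_sum = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant position
    return precision_sum / hits if hits else 0.0

def mean_average_precision(queries):
    """Mean of the per-query average precision values."""
    return sum(average_precision(q) for q in queries) / len(queries)

# Two toy queries: relevant items at ranks 1 and 3, and at rank 2.
map_score = mean_average_precision([[1, 0, 1], [0, 1]])
```

Because each relevant item contributes the precision at its own rank, moving a relevant item higher in the ranking always increases the score. This is why a learned rank function such as Ranking SVM can raise MAP without changing which items are retrieved.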

  2. Classification of Medical Datasets Using SVMs with Hybrid Evolutionary Algorithms Based on Endocrine-Based Particle Swarm Optimization and Artificial Bee Colony Algorithms.

    PubMed

    Lin, Kuan-Cheng; Hsieh, Yi-Hsiu

    2015-10-01

    The classification and analysis of data is an important issue in today's research. Selecting a suitable set of features makes it possible to classify an enormous quantity of data quickly and efficiently. Feature selection is generally viewed as a problem of feature subset selection, such as combination optimization problems. Evolutionary algorithms using random search methods have proven highly effective in obtaining solutions to problems of optimization in a diversity of applications. In this study, we developed a hybrid evolutionary algorithm based on endocrine-based particle swarm optimization (EPSO) and artificial bee colony (ABC) algorithms in conjunction with a support vector machine (SVM) for the selection of optimal feature subsets for the classification of datasets. The results of experiments using specific UCI medical datasets demonstrate that the accuracy of the proposed hybrid evolutionary algorithm is superior to that of basic PSO, EPSO and ABC algorithms, with regard to classification accuracy using subsets with a reduced number of features.

  3. Feature selection for neural network based defect classification of ceramic components using high frequency ultrasound.

    PubMed

    Kesharaju, Manasa; Nagarajah, Romesh

    2015-09-01

    The motivation for this research stems from a need for providing a non-destructive testing method capable of detecting and locating any defects and microstructural variations within armour ceramic components before issuing them to the soldiers who rely on them for their survival. The development of an automated ultrasonic inspection based classification system would make possible the checking of each ceramic component and immediately alert the operator about the presence of defects. Generally, in many classification problems a choice of features or dimensionality reduction is significant and simultaneously very difficult, as a substantial computational effort is required to evaluate possible feature subsets. In this research, a combination of artificial neural networks and genetic algorithms is used to optimize the feature subset used in classification of various defects in reaction-sintered silicon carbide ceramic components. Initially, wavelet based feature extraction is implemented from the region of interest. An Artificial Neural Network classifier is employed to evaluate the performance of these features. Genetic Algorithm based feature selection is performed. Principal Component Analysis is a popular technique used for feature selection and is compared with the genetic algorithm based technique in terms of classification accuracy and selection of optimal number of features. The experimental results confirm that features identified by Principal Component Analysis lead to better classification performance (96%) than those identified by the Genetic Algorithm (94%). Copyright © 2015 Elsevier B.V. All rights reserved.
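A genetic-algorithm wrapper of the kind described above searches over binary feature masks and scores each mask with a classifier. The sketch below is a generic illustration, not the authors' implementation. It assumes truncation selection, one-point crossover, bit-flip mutation and two-individual elitism. The `toy_fitness` function is a hypothetical stand-in for the neural-network accuracy used in the paper.

```python
import random

def ga_feature_selection(n_features, fitness, pop_size=20, generations=30,
                         mutation_rate=0.1, seed=0):
    """Search binary feature masks with a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[:2]                    # elitism: carry the two best over unchanged
        parents = population[:pop_size // 2]      # truncation selection
        children = []
        while len(children) < pop_size - len(elite):
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, n_features - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]            # bit-flip mutation
            children.append(child)
        population = elite + children
    best = max(population, key=fitness)
    return best, fitness(best), history

def toy_fitness(mask):
    """Hypothetical score: reward features 0 and 3, penalise subset size."""
    return mask[0] + mask[3] - 0.1 * sum(mask)

best_mask, best_score, history = ga_feature_selection(6, toy_fitness)
```

In a real wrapper, `fitness` would train and validate the classifier on the columns selected by the mask, which is where the substantial computational effort mentioned in the abstract comes from. Elitism guarantees that the best score never decreases from one generation to the next.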

  4. Feature-based and spatial attentional selection in visual working memory.

    PubMed

    Heuer, Anna; Schubö, Anna

    2016-05-01

    The contents of visual working memory (VWM) can be modulated by spatial cues presented during the maintenance interval ("retrocues"). Here, we examined whether attentional selection of representations in VWM can also be based on features. In addition, we investigated whether the mechanisms of feature-based and spatial attention in VWM differ with respect to parallel access to noncontiguous locations. In two experiments, we tested the efficacy of valid retrocues relying on different kinds of information. Specifically, participants were presented with a typical spatial retrocue pointing to two locations, a symbolic spatial retrocue (numbers mapping onto two locations), and two feature-based retrocues: a color retrocue (a blob of the same color as two of the items) and a shape retrocue (an outline of the shape of two of the items). The two cued items were presented at either contiguous or noncontiguous locations. Overall retrocueing benefits, as compared to a neutral condition, were observed for all retrocue types. Whereas feature-based retrocues yielded benefits for cued items presented at both contiguous and noncontiguous locations, spatial retrocues were only effective when the cued items had been presented at contiguous locations. These findings demonstrate that attentional selection and updating in VWM can operate on different kinds of information, allowing for a flexible and efficient use of this limited system. The observation that the representations of items presented at noncontiguous locations could only be reliably selected with feature-based retrocues suggests that feature-based and spatial attentional selection in VWM rely on different mechanisms, as has been shown for attentional orienting in the external world.

  5. 32 CFR 2001.46 - Transmission.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... an attention line. The following exceptions apply: (i) If the classified information is an internal... Class Mail. However, Confidential information shall not be transmitted to government contractor facilities via first class mail. When first class mail is used, the envelope or outer wrapper shall be marked...

  6. 75 FR 38835 - Solicitation of Nominations for the United States Department of Labor's Iqbal Masih Award for the...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-06

    ... of Labor is the date/time stamp of the ILAB/OCFT on the application wrapper or other documentary... currently valid Office of Management and Budget (OMB) control number. This collection of information is...

  7. A Beginning Reading Strategy.

    ERIC Educational Resources Information Center

    Aldridge, Jerry T.; Rust, Debra

    1987-01-01

    First-graders (identified as high-risk for reading difficulties) were taught to read examples of "environmental print" (words on candy wrappers, grocery bags, newspaper advertisements) and were able to identify and write words when logos and supporting detail were removed, indicating that activities using environmental print can…

  8. Integration of digital gross pathology images for enterprise-wide access.

    PubMed

    Amin, Milon; Sharma, Gaurav; Parwani, Anil V; Anderson, Ralph; Kolowitz, Brian J; Piccoli, Anthony; Shrestha, Rasu B; Lauro, Gonzalo Romero; Pantanowitz, Liron

    2012-01-01

    Sharing digital pathology images for enterprise-wide use into a picture archiving and communication system (PACS) is not yet widely adopted. We share our solution and 3-year experience of transmitting such images to an enterprise image server (EIS). Gross pathology images acquired by prosectors were integrated with clinical cases into the laboratory information system's image management module, and stored in JPEG2000 format on a networked image server. Automated daily searches for cases with gross images were used to compile an ASCII text file that was forwarded to a separate institutional Enterprise Digital Imaging and Communications in Medicine (DICOM) Wrapper (EDW) server. Concurrently, an HL7-based image order for these cases was generated, containing the locations of images and patient data, and forwarded to the EDW, which combined data in these locations to generate images with patient data, as required by DICOM standards. The image and data were then "wrapped" according to DICOM standards, transferred to the PACS servers, and made accessible on an institution-wide basis. In total, 26,966 gross images from 9,733 cases were transmitted over the 3-year period from the laboratory information system to the EIS. The average process time for cases with successful automatic uploads (n=9,688) to the EIS was 98 seconds. Only 45 cases (0.5%) failed, requiring manual intervention. Uploaded images were immediately available to institution-wide PACS users. Since inception, user feedback has been positive. Enterprise-wide PACS-based sharing of pathology images is feasible, provides useful services to clinical staff, and utilizes existing information system and telecommunications infrastructure. PACS-shared pathology images, however, require a "DICOM wrapper" for multisystem compatibility.

  9. Automatic migraine classification via feature selection committee and machine learning techniques over imaging and questionnaire data.

    PubMed

    Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos

    2017-04-13

    Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence pain perception. We selected 52 adult subjects for the study, divided into three groups: control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance imaging with diffusion tensor to assess white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gathered data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce features of the first dataset and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform a classification of migraine group. Moreover, we implemented a committee method to improve the classification accuracy based on feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, and from 93 to 94% with boosting. The features that were determined to be most useful for classification are related to pain, analgesics and the left uncinate brain (connected with pain and emotions). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
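The abstract does not spell out how the committee combines its members. One plausible reading, offered purely as an assumption, is a vote in which a feature is kept when enough individual selectors chose it. The sketch below illustrates that reading only. The function name and the `min_votes` threshold are hypothetical.

```python
from collections import Counter

def committee_select(selections, min_votes):
    """Keep features chosen by at least `min_votes` of the individual selectors."""
    # set() ensures a selector cannot vote twice for the same feature
    votes = Counter(feature for chosen in selections for feature in set(chosen))
    return sorted(f for f, v in votes.items() if v >= min_votes)

# Feature indices picked by three hypothetical selectors.
kept = committee_select([[0, 1, 2], [1, 2, 3], [2, 4]], min_votes=2)
```

A vote of this kind tends to discard features that only one method favours, which is one way a committee could be more stable than any single selector.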

  10. Effect of feature-selective attention on neuronal responses in macaque area MT

    PubMed Central

    Chen, X.; Hoffmann, K.-P.; Albright, T. D.

    2012-01-01

    Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color). PMID:22170961

  11. Effect of feature-selective attention on neuronal responses in macaque area MT.

    PubMed

    Chen, X; Hoffmann, K-P; Albright, T D; Thiele, A

    2012-03-01

    Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color).

  12. Improved sparse decomposition based on a smoothed L0 norm using a Laplacian kernel to select features from fMRI data.

    PubMed

    Zhang, Chuncheng; Song, Sutao; Wen, Xiaotong; Yao, Li; Long, Zhiying

    2015-04-30

    Feature selection plays an important role in improving the classification accuracy of multivariate classification techniques in the context of fMRI-based decoding due to the "few samples and large features" nature of functional magnetic resonance imaging (fMRI) data. Recently, several sparse representation methods have been applied to the voxel selection of fMRI data. Despite the low computational efficiency of the sparse representation methods, they still displayed promise for applications that select features from fMRI data. In this study, we proposed the Laplacian smoothed L0 norm (LSL0) approach for feature selection of fMRI data. Based on the fast sparse decomposition using smoothed L0 norm (SL0) (Mohimani, 2007), the LSL0 method used the Laplacian function to approximate the L0 norm of sources. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of LSL0 for sparse source estimation and feature selection. Simulated results indicated that LSL0 produced more accurate source estimation than SL0 at high noise levels. The classification accuracy using voxels that were selected by LSL0 was higher than that by SL0 in both simulated and real fMRI experiments. Moreover, both LSL0 and SL0 showed higher classification accuracy and required less time than ICA and t-test for the fMRI decoding. LSL0 outperformed SL0 in sparse source estimation at high noise levels and in feature selection. Moreover, LSL0 and SL0 showed better performance than ICA and t-test for feature selection. Copyright © 2015 Elsevier B.V. All rights reserved.
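The idea of a smoothed L0 norm can be made concrete with a small sketch. SL0 replaces the discontinuous count of non-zero entries with n minus a sum of Gaussian bumps. The Laplacian variant shown here assumes a kernel of the form exp(-|s|/sigma), which is an assumption about what the abstract calls "the Laplacian function". Function names are illustrative.

```python
from math import exp

def smoothed_l0_gaussian(s, sigma):
    """SL0-style approximation: n - sum(exp(-s_i^2 / (2 * sigma^2)))."""
    return len(s) - sum(exp(-(x * x) / (2 * sigma * sigma)) for x in s)

def smoothed_l0_laplacian(s, sigma):
    """Assumed Laplacian-kernel approximation: n - sum(exp(-|s_i| / sigma))."""
    return len(s) - sum(exp(-abs(x) / sigma) for x in s)

source = [0.0, 0.0, 3.0, -2.0, 0.0]   # true L0 norm is 2
gaussian_small = smoothed_l0_gaussian(source, sigma=0.01)
laplacian_small = smoothed_l0_laplacian(source, sigma=0.01)
laplacian_large = smoothed_l0_laplacian(source, sigma=10.0)
```

As sigma shrinks, both approximations approach the exact count of non-zero entries. With a large sigma the function is smooth and easy to optimise but underestimates the count, which is why methods of this family typically start with a large sigma and decrease it gradually.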

  13. Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

    PubMed

    Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

    2016-04-01

    Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop station; however, due to multiple invocations for a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, new computer software, mpiWrapper, has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface has been implemented to exchange information between nodes. Two specialized threads - one for task management and communication, and another for subtask execution - are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. The mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source code and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper.
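The general pattern described here, running many invocations of an unmodified serial program in parallel and resubmitting failed subtasks, can be sketched on a single machine. The example below is not mpiWrapper and does not use MPI. It assumes a local thread pool in place of cluster nodes and treats a non-zero exit code as the failure that triggers resubmission. All names are hypothetical.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_with_retries(command, max_attempts=3):
    """Run one unmodified command-line program; resubmit it if it exits non-zero."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout.strip(), attempt
    raise RuntimeError(f"command failed after {max_attempts} attempts: {command}")

def run_all(commands, workers=4):
    """Execute independent subtasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_with_retries, commands))

# Each subtask is a separate process; here the "serial program" is a Python one-liner.
commands = [[sys.executable, "-c", f"print({i} * {i})"] for i in range(4)]
results = run_all(commands)
```

Threads are sufficient on the dispatching side because the real work happens in child processes. The design point carried over from the abstract is that the wrapped programs need no source changes, since they are only ever launched as ordinary commands.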

  14. On the use of feature selection to improve the detection of sea oil spills in SAR images

    NASA Astrophysics Data System (ADS)

    Mera, David; Bolon-Canedo, Veronica; Cotos, J. M.; Alonso-Betanzos, Amparo

    2017-03-01

    Fast and effective oil spill detection systems are crucial to ensure a proper response to environmental emergencies caused by hydrocarbon pollution on the ocean's surface. Typically, these systems uncover not only oil spills, but also a high number of look-alikes. Feature extraction is a critical and computationally intensive phase where each detected dark spot is independently examined. Traditionally, detection systems use an arbitrary set of features to discriminate between oil spills and look-alike phenomena. However, Feature Selection (FS) methods based on Machine Learning (ML) have proved to be very useful in real domains for enhancing the generalization capabilities of the classifiers, while discarding the existing irrelevant features. In this work, we present a generic and systematic approach, based on FS methods, for choosing a concise and relevant set of features to improve oil spill detection systems. We have compared five FS methods: Correlation-based feature selection (CFS), Consistency-based filter, Information Gain, ReliefF and Recursive Feature Elimination for Support Vector Machine (SVM-RFE). They were applied on a 141-input vector composed of features from a collection of outstanding studies. Selected features were validated via a Support Vector Machine (SVM) classifier and the results were compared with previous works. Test experiments revealed that the classifier trained with the 6-input feature vector proposed by SVM-RFE achieved the best accuracy and Cohen's kappa coefficient (87.1% and 74.06% respectively). This is a smaller feature combination with similar or even better classification accuracy than previous works. This finding allows the feature extraction phase to be sped up without reducing the classifier accuracy. Experiments also confirmed the significance of the geometrical features, since 75.0% of the different features selected by the applied FS methods as well as 66.67% of the proposed 6-input feature vector belong to this category.
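Cohen's kappa, reported above alongside accuracy, corrects the observed agreement between predicted and true labels for the agreement expected by chance. A minimal sketch of the standard formula follows. The function name and the toy labels are illustrative and do not come from the study.

```python
from collections import Counter

def cohens_kappa(y_true, y_pred):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the marginal label frequencies of both raters.
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)  # undefined when expected == 1

# Toy binary example: 1 = oil spill, 0 = look-alike.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
kappa = cohens_kappa(y_true, y_pred)
```

Kappa is informative here because look-alikes can heavily outnumber true spills, in which case raw accuracy can look high even for a classifier that mostly predicts the majority class.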

  15. Availability of tobacco products associated with use of marijuana cigars (blunts).

    PubMed

    Lipperman-Kreda, Sharon; Lee, Juliet P; Morrison, Chris; Freisthler, Bridget

    2014-01-01

    This study examines factors associated with availability of tobacco products for marijuana cigars (i.e., blunts) in 50 non-contiguous mid-sized California communities. The study is based on data collected in 943 tobacco outlets. Neighborhood demographics, community adult marijuana prevalence, medical marijuana policy and access to medical marijuana dispensaries and delivery services were included. Multilevel logistic regression analyses indicated that compared with small markets, availability of tobacco products associated with use of blunts was significantly higher in convenience stores, smoke/tobacco shops and liquor stores. None of the neighborhood demographics were associated with availability of blunt wrappers and only a small percent of Whites was positively associated with availability of blunt cigars, small cigars or cigarillos at the store. Controlling for outlet type and neighborhood demographics, higher city prevalence of adult marijuana use was associated with greater availability of blunt wrappers. Also, policy that permits medical marijuana dispensaries or private cultivation was positively associated with availability of tobacco products for blunts. Density of medical marijuana dispensaries and delivery services, however, was negatively associated with greater availability of these products at tobacco outlets. Results suggest that availability of tobacco products associated with blunts is similar in neighborhoods with different socioeconomic status and racial and ethnic composition. Results also suggest the important role that community norms that support marijuana use or legalization of medical marijuana and medical marijuana policy may play in increasing availability of tobacco products associated with blunts. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  16. Availability of Tobacco Products Associated with Use of Marijuana Cigars (Blunts)

    PubMed Central

    Juliet, P. Lee; Morrison, Chris; Bridget, Freisthler

    2013-01-01

    Objectives This study examines factors associated with availability of tobacco products for marijuana cigars (i.e., blunts) in 50 non-contiguous mid-sized California communities. Methods The study is based on data collected in 943 tobacco outlets. Neighborhood demographics, community adult marijuana prevalence, medical marijuana policy and access to medical marijuana dispensaries and delivery services were included. Results Multilevel logistic regression analyses indicated that compared with small markets, availability of tobacco products associated with use of blunts was significantly higher in convenience stores, smoke/tobacco shops and liquor stores. None of the neighborhood demographics were associated with availability of blunt wrappers and only a small percent of Whites was positively associated with availability of blunt cigars, small cigars or cigarillos at the store. Controlling for outlet type and neighborhood demographics, higher city prevalence of adult marijuana use was associated with greater availability of blunt wrappers. Also, policy that permits medical marijuana dispensaries or private cultivation was positively associated with availability of tobacco products for blunts. Density of medical marijuana dispensaries and delivery services, however, was negatively associated with greater availability of these products at tobacco outlets. Conclusions Results suggest that availability of tobacco products associated with blunts is similar in neighborhoods with different socioeconomic status and racial and ethnic composition. Results also suggest the important role that community norms that support marijuana use or legalization of medical marijuana and medical marijuana policy may play in increasing availability of tobacco products associated with blunts. PMID:24290366

  17. Multiband tangent space mapping and feature selection for classification of EEG during motor imagery.

    PubMed

    Islam, Md Rabiul; Tanaka, Toshihisa; Molla, Md Khademul Islam

    2018-05-08

    When designing a multiclass motor imagery-based brain-computer interface (MI-BCI), the so-called tangent space mapping (TSM) method, which utilizes the geometric structure of covariance matrices, is an effective technique. This paper aims to introduce a method using TSM for finding accurate operational frequency bands related to the brain activities associated with MI tasks. A multichannel electroencephalogram (EEG) signal is decomposed into multiple subbands, and tangent features are then estimated on each subband. An effective algorithm based on mutual information analysis is implemented to select subbands containing features capable of improving motor imagery classification accuracy. The features thus obtained from the selected subbands are combined to form the feature space. A principal component analysis-based approach is employed to reduce the feature dimension, and the classification is then accomplished by a support vector machine (SVM). Offline analysis demonstrates that the proposed multiband tangent space mapping with subband selection (MTSMS) approach outperforms state-of-the-art methods. It achieves the highest average classification accuracy for all datasets (BCI competition dataset 2a, IIIa, IIIb, and dataset JK-HH1). The increased classification accuracy of MI tasks with the proposed MTSMS approach can yield effective implementation of BCI. The mutual information-based subband selection method is implemented to tune operation frequency bands to represent actual motor imagery tasks.

  18. Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection

    NASA Astrophysics Data System (ADS)

    Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.

    2015-04-01

    SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is utilized next in order to discriminate civilian vessels effectively into three distinct types in COSMO-SkyMed SAR images: cargos, small ships and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape specific vessels. Then, the most informative texture and intensity features are incorporated in order to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on features' statistical characteristics, followed by an exhaustive search over feature sets formed by the most qualified features, is carried out in order to identify the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR data with 2.2m x 2.2m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.

  19. Sandia Text ANaLysis Extensible librarY Server

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    2006-05-11

    This is a server wrapper for STANLEY (Sandia Text ANaLysis Extensible librarY). STANLEY provides capabilities for analyzing, indexing and searching through text. STANLEY Server exposes this capability through a TCP/IP interface allowing third party applications and remote clients to access it.

  20. Total and Compound Formation Cross Sections for Americium Nuclei: Recommendations for Coupled-Channels Calculations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Escher, J. E.

    Calculations for total cross sections and compound-nucleus (CN) formation cross sections for americium isotopes are described, for use in the 2017 NA-22 evaluation effort. The code ECIS 2006 was used in conjunction with Frank Dietrich's wrapper `runtemplate'.

  1. Microbial Penetration of Muslin- and Paper-Wrapped Sterile Packs Stored on Open Shelves and in Closed Cabinets

    PubMed Central

    Standard, Paul G.; Mackel, Don C.; Mallison, G. F.

    1971-01-01

    Microbial penetration of sterile packs was studied using single-wrap (two layers) muslin, double-wrap (four layers) muslin, and two-way crepe paper (single layer) to wrap 20 gauze sponges (2 by 2 inch). These packs were stored in the central sterile supply departments of two hospitals and processed for sterility at predetermined intervals. Microorganisms penetrated single-wrap muslin as early as 3 days and double-wrap muslin and single-wrap two-way crepe paper in 21 to 28 days stored in open shelves. The time required for microbial penetration was at least twice as long when closed cabinets were used. Single-wrap muslin packs stored in sealed, impervious plastic bags remained sterile for at least 9 months. All sterile materials in pervious wrappers should be handled as little as possible and then only with extreme care and caution. Closed cabinets offer more protection than open shelves, and single wrappers are not recommended. Images PMID:5119207

  2. [Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].

    PubMed

    Zhou, Jinzhi; Tang, Xiaofang

    2015-08-01

    To improve classification accuracy when only a small amount of motor imagery training data is available for developing brain-computer interface (BCI) systems, we propose an analysis method that automatically selects characteristic parameters based on correlation coefficient analysis. Using the five sample data sets of dataset IVa from the 2005 BCI Competition, we applied the short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the dimensionality of the raw electroencephalogram, then performed feature extraction based on common spatial pattern (CSP) and classification by linear discriminant analysis (LDA). Simulation results showed that the average classification accuracy was higher with the correlation coefficient feature selection method than without it. Compared with a support vector machine (SVM) feature optimization algorithm, correlation coefficient analysis selected better parameters and improved classification accuracy.
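
    As a rough illustration of correlation-coefficient-based selection, the sketch below ranks features by the absolute Pearson correlation between each feature and the class label and keeps the strongest ones. The abstract does not spell out how its coefficients are computed or thresholded, so correlating against the label, the fixed `keep` count, and the function names are assumptions made for this example; the STFT, CSP and LDA stages are not shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant sequence carries no linear information
    return cov / (sx * sy)


def rank_by_correlation(X, y, keep):
    """Return the indices of the `keep` features most correlated with y,
    together with the absolute correlation of every feature."""
    scores = [abs(pearson([row[f] for row in X], y)) for f in range(len(X[0]))]
    order = sorted(range(len(scores)), key=lambda f: scores[f], reverse=True)
    return order[:keep], scores


# Toy data (invented): labels are +1 / -1 for two imagined movements.
X = [[2.0, 0.1, 1.0],
     [2.0, 0.0, 0.0],
     [0.0, 1.0, 1.0],
     [0.0, 0.9, 0.0]]
y = [1, 1, -1, -1]
selected, scores = rank_by_correlation(X, y, keep=2)
print(selected, [round(s, 3) for s in scores])
```

    Taking the absolute value keeps features that are strongly negatively correlated with the label, since they are as informative as positively correlated ones.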

  3. Neural mechanisms of selective attention in the somatosensory system.

    PubMed

    Gomez-Ramirez, Manuel; Hysaj, Kristjana; Niebur, Ernst

    2016-09-01

    Selective attention allows organisms to extract behaviorally relevant information while ignoring distracting stimuli that compete for the limited resources of their central nervous systems. Attention is highly flexible, and it can be harnessed to select information based on sensory modality, within-modality feature(s), spatial location, object identity, and/or temporal properties. In this review, we discuss the body of work devoted to understanding mechanisms of selective attention in the somatosensory system. In particular, we describe the effects of attention on tactile behavior and corresponding neural activity in somatosensory cortex. Our focus is on neural mechanisms that select tactile stimuli based on their location on the body (somatotopic-based attention) or their sensory feature (feature-based attention). We highlight parallels between selection mechanisms in touch and other sensory systems and discuss several putative neural coding schemes employed by cortical populations to signal the behavioral relevance of sensory inputs. Specifically, we contrast the advantages and disadvantages of using a gain vs. spike-spike correlation code for representing attended sensory stimuli. We favor a neural network model of tactile attention that is composed of frontal, parietal, and subcortical areas that controls somatosensory cells encoding the relevant stimulus features to enable preferential processing throughout the somatosensory hierarchy. Our review is based on data from noninvasive electrophysiological and imaging data in humans as well as single-unit recordings in nonhuman primates. Copyright © 2016 the American Physiological Society.

  4. Neural mechanisms of selective attention in the somatosensory system

    PubMed Central

    Hysaj, Kristjana; Niebur, Ernst

    2016-01-01

    Selective attention allows organisms to extract behaviorally relevant information while ignoring distracting stimuli that compete for the limited resources of their central nervous systems. Attention is highly flexible, and it can be harnessed to select information based on sensory modality, within-modality feature(s), spatial location, object identity, and/or temporal properties. In this review, we discuss the body of work devoted to understanding mechanisms of selective attention in the somatosensory system. In particular, we describe the effects of attention on tactile behavior and corresponding neural activity in somatosensory cortex. Our focus is on neural mechanisms that select tactile stimuli based on their location on the body (somatotopic-based attention) or their sensory feature (feature-based attention). We highlight parallels between selection mechanisms in touch and other sensory systems and discuss several putative neural coding schemes employed by cortical populations to signal the behavioral relevance of sensory inputs. Specifically, we contrast the advantages and disadvantages of using a gain vs. spike-spike correlation code for representing attended sensory stimuli. We favor a neural network model of tactile attention that is composed of frontal, parietal, and subcortical areas that controls somatosensory cells encoding the relevant stimulus features to enable preferential processing throughout the somatosensory hierarchy. Our review is based on data from noninvasive electrophysiological and imaging data in humans as well as single-unit recordings in nonhuman primates. PMID:27334956

  5. Sparse Zero-Sum Games as Stable Functional Feature Selection

    PubMed Central

    Sokolovska, Nataliya; Teytaud, Olivier; Rizkalla, Salwa; Clément, Karine; Zucker, Jean-Daniel

    2015-01-01

    In large-scale systems biology applications, features are structured in hidden functional categories whose predictive power is identical. Feature selection, therefore, can lead not only to a problem with a reduced dimensionality, but also reveal some knowledge on functional classes of variables. In this contribution, we propose a framework based on a sparse zero-sum game which performs a stable functional feature selection. In particular, the approach is based on feature subsets ranking by a thresholding stochastic bandit. We provide a theoretical analysis of the introduced algorithm. We illustrate by experiments on both synthetic and real complex data that the proposed method is competitive from the predictive and stability viewpoints. PMID:26325268

  6. An ant colony optimization based feature selection for web page classification.

    PubMed

    Saraç, Esra; Özel, Selma Ayşe

    2014-01-01

    The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.

  7. Diagnosing and ranking retinopathy disease level using diabetic fundus image recuperation approach.

    PubMed

    Somasundaram, K; Rajendran, P Alli

    2015-01-01

    Retinal fundus images are widely used in diagnosing different types of eye diseases. The existing methods such as Feature Based Macular Edema Detection (FMED) and Optimally Adjusted Morphological Operator (OAMO) effectively detected the presence of exudation in fundus images and identified the true positive ratio of exudates detection, respectively. These mechanically detected exudates did not include more detailed feature selection technique to the system for detection of diabetic retinopathy. To categorize the exudates, Diabetic Fundus Image Recuperation (DFIR) method based on sliding window approach is developed in this work to select the features of optic cup in digital retinal fundus images. The DFIR feature selection uses collection of sliding windows with varying range to obtain the features based on the histogram value using Group Sparsity Nonoverlapping Function. Using support vector model in the second phase, the DFIR method based on Spiral Basis Function effectively ranks the diabetic retinopathy disease level. The ranking of disease level on each candidate set provides a much promising result for developing practically automated and assisted diabetic retinopathy diagnosis system. Experimental work on digital fundus images using the DFIR method performs research on the factors such as sensitivity, ranking efficiency, and feature selection time.

  8. Diagnosing and Ranking Retinopathy Disease Level Using Diabetic Fundus Image Recuperation Approach

    PubMed Central

    Somasundaram, K.; Alli Rajendran, P.

    2015-01-01

    Retinal fundus images are widely used in diagnosing different types of eye diseases. The existing methods such as Feature Based Macular Edema Detection (FMED) and Optimally Adjusted Morphological Operator (OAMO) effectively detected the presence of exudation in fundus images and identified the true positive ratio of exudates detection, respectively. These mechanically detected exudates did not include more detailed feature selection technique to the system for detection of diabetic retinopathy. To categorize the exudates, Diabetic Fundus Image Recuperation (DFIR) method based on sliding window approach is developed in this work to select the features of optic cup in digital retinal fundus images. The DFIR feature selection uses collection of sliding windows with varying range to obtain the features based on the histogram value using Group Sparsity Nonoverlapping Function. Using support vector model in the second phase, the DFIR method based on Spiral Basis Function effectively ranks the diabetic retinopathy disease level. The ranking of disease level on each candidate set provides a much promising result for developing practically automated and assisted diabetic retinopathy diagnosis system. Experimental work on digital fundus images using the DFIR method performs research on the factors such as sensitivity, ranking efficiency, and feature selection time. PMID:25945362

  9. Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

    NASA Astrophysics Data System (ADS)

    Ivanovic, Pavle; Richter, Harald

    2018-01-01

    High-performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI, the standard HPC communication library. Additionally, we transparently modified MPICH to include ivshmem, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a one-sided MPICH communication mechanism, with our own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.

  10. A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.

    PubMed

    Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho

    2014-10-01

    Many studies based on microRNA (miRNA) expression profiles showed a new aspect of cancer classification. Because one characteristic of miRNA expression data is the high dimensionality, feature selection methods have been used to facilitate dimensionality reduction. The feature selection methods have one shortcoming thus far: they just consider the problem of where feature to class is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, human miRNA is considered to be ranked low in traditional feature selection methods and are removed most of the time. In view of the limitation of the miRNA number, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all problems (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose the support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifier to construct cancer classification. Then, we chose Chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs to determine cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy compared with just using high-ranking miRNAs in traditional feature selection methods. Our results demonstrate that the m:n feature subset made a positive impression of low-ranking miRNAs in cancer classification.

  11. A Novel Image Recuperation Approach for Diagnosing and Ranking Retinopathy Disease Level Using Diabetic Fundus Image

    PubMed Central

    2015-01-01

    Retinal fundus images are widely used in diagnosing and providing treatment for several eye diseases. Prior works using retinal fundus images detected the presence of exudation with the aid of publicly available dataset using extensive segmentation process. Though it was proved to be computationally efficient, it failed to create a diabetic retinopathy feature selection system for transparently diagnosing the disease state. Also the diagnosis of diseases did not employ machine learning methods to categorize candidate fundus images into true positive and true negative ratio. Several candidate fundus images did not include more detailed feature selection technique for diabetic retinopathy. To apply machine learning methods and classify the candidate fundus images on the basis of sliding window a method called, Diabetic Fundus Image Recuperation (DFIR) is designed in this paper. The initial phase of DFIR method select the feature of optic cup in digital retinal fundus images based on Sliding Window Approach. With this, the disease state for diabetic retinopathy is assessed. The feature selection in DFIR method uses collection of sliding windows to obtain the features based on the histogram value. The histogram based feature selection with the aid of Group Sparsity Non-overlapping function provides more detailed information of features. Using Support Vector Model in the second phase, the DFIR method based on Spiral Basis Function effectively ranks the diabetic retinopathy diseases. The ranking of disease level for each candidate set provides a much promising result for developing practically automated diabetic retinopathy diagnosis system. Experimental work on digital fundus images using the DFIR method performs research on the factors such as sensitivity, specificity rate, ranking efficiency and feature selection time. PMID:25974230

  12. Identifying Trustworthiness Deficit in Legacy Systems Using the NFR Approach

    DTIC Science & Technology

    2014-01-01

    trustworthy environment. These adaptations can be stated in terms of design modifications and/or implementation mechanisms (for example, wrappers) that will...extensions to the VHSIC Hardware Description Language (VHDL-AMS). He has spent the last 10 years leading research in high performance embedded computing

  13. 21 CFR 607.3 - Definitions.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... process. The term includes packaging, labeling, repackaging or otherwise changing the container, wrapper... neither imported nor offered for import into the United States. (f) Any material change includes but is not limited to any change in the name of the blood product, in the quantity or identity of the active...

  14. 21 CFR 607.3 - Definitions.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... process. The term includes packaging, labeling, repackaging or otherwise changing the container, wrapper... neither imported nor offered for import into the United States. (f) Any material change includes but is not limited to any change in the name of the blood product, in the quantity or identity of the active...

  15. A universal deep learning approach for modeling the flow of patients under different severities.

    PubMed

    Jiang, Shancheng; Chin, Kwai-Sang; Tsui, Kwok L

    2018-02-01

    The Accident and Emergency Department (A&ED) is the frontline for providing emergency care in hospitals. Unfortunately, relative A&ED resources have failed to keep up with continuously increasing demand in recent years, which leads to overcrowding in A&ED. Knowing the fluctuation of patient arrival volume in advance is a significant premise to relieve this pressure. Based on this motivation, the objective of this study is to explore an integrated framework with high accuracy for predicting A&ED patient flow under different triage levels, by combining a novel feature selection process with deep neural networks. Administrative data is collected from an actual A&ED and categorized into five groups based on different triage levels. A genetic algorithm (GA)-based feature selection algorithm is improved and implemented as a pre-processing step for this time-series prediction problem, in order to explore key features affecting patient flow. In our improved GA, a fitness-based crossover is proposed to maintain the joint information of multiple features during the iterative process, instead of the traditional point-based crossover. A deep neural network (DNN) is employed as the prediction model to exploit its universal adaptability and high flexibility. In the model-training process, the learning algorithm is well-configured based on a parallel stochastic gradient descent algorithm. Two effective regularization strategies are integrated in one DNN framework to avoid overfitting. All introduced hyper-parameters are optimized efficiently by grid-search in one pass. As for feature selection, our improved GA-based feature selection algorithm has outperformed a typical GA and four state-of-the-art feature selection algorithms (mRMR, SAFS, VIFR, and CFR). As for the prediction accuracy of the proposed integrated framework, compared with other frequently used statistical models (GLM, seasonal-ARIMA, ARIMAX, and ANN) and modern machine learning models (SVM-RBF, SVM-linear, RF, and R-LASSO), the proposed integrated "DNN-I-GA" framework achieves higher prediction accuracy on both MAPE and RMSE metrics in pairwise comparisons. The contribution of our study is two-fold. Theoretically, the traditional GA-based feature selection process is improved to have fewer hyper-parameters and higher efficiency, and the joint information of multiple features is maintained by the fitness-based crossover operator. The universal property of DNN is further enhanced by merging different regularization strategies. Practically, features selected by our improved GA can be used to acquire an underlying relationship between patient flows and input features. Predictive values are significant indicators of patients' demand and can be used by A&ED managers to plan and allocate resources. The high accuracy achieved by the present framework in different cases enhances the reliability of downstream decision making. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. Tumor recognition in wireless capsule endoscopy images using textural features and SVM-based feature selection.

    PubMed

    Li, Baopu; Meng, Max Q-H

    2012-05-01

    Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.

  17. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model

    PubMed Central

    Tan, Maxine; Pu, Jiantao; Zheng, Bin

    2014-01-01

    Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: An area under the receiver operating characteristic curve (AUC) of 0.805±0.012 was obtained for the classification task. The results also showed that the features most frequently selected by the SFFS-based algorithm in the 10-fold iterations were those related to mass shape, isodensity and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and that these features failed to perform well for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: This comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may be useful as a “second reader” in future clinical practice. PMID:24664267
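
    Sequential forward floating selection, used in this record and the preceding one, alternates a greedy forward step with conditional backward steps that undo earlier choices when a smaller subset turns out better than any seen before at that size. The sketch below is a generic textbook-style version, not the authors' implementation: the name `sffs`, the `score` callback, and the integer-valued `toy_score` are assumptions for illustration. In a real wrapper, `score` would be something like a cross-validated SVM performance computed on the training folds only.

```python
def sffs(n_features, score, target_size):
    """Sequential forward floating selection.
    `score(subset)` returns a criterion to maximize for a list of feature indices."""
    selected = []
    best = {}  # subset size -> (best score seen at that size, subset)
    while len(selected) < target_size:
        # Forward step: add the feature that most improves the criterion.
        remaining = [f for f in range(n_features) if f not in selected]
        add = max(remaining, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        value = score(selected)
        if len(selected) not in best or value > best[len(selected)][0]:
            best[len(selected)] = (value, list(selected))
        # Floating step: drop the least useful feature while doing so beats
        # the best subset previously recorded at the smaller size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            value = score(reduced)
            if value <= best[len(reduced)][0]:
                break
            selected = reduced
            best[len(reduced)] = (value, list(reduced))
    value, subset = best[target_size]
    return sorted(subset), value


def toy_score(subset):
    """Invented criterion: feature 0 is strong alone but redundant with the
    others, while features 1 and 2 are complementary."""
    s = set(subset)
    weights = {0: 50, 1: 40, 2: 40, 3: 30}
    penalties = {(0, 1): 20, (0, 2): 20, (0, 3): 30}
    total = sum(weights[f] for f in s)
    if {1, 2} <= s:
        total += 30
    total -= sum(p for pair, p in penalties.items() if set(pair) <= s)
    return total


print(sffs(4, toy_score, 3))
```

    On this toy criterion, plain forward selection would commit to feature 0 first and end at the subset {0, 1, 2}; the floating step removes feature 0 once features 1 and 2 are both present, which lets the search reach {1, 2, 3}.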

  18. The optional selection of micro-motion feature based on Support Vector Machine

    NASA Astrophysics Data System (ADS)

    Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing

    2017-11-01

    Targets exhibit multiple micro-motion forms, and different micro-motion forms are apt to be modulated, which makes feature extraction and recognition difficult. Aiming at feature extraction for cone-shaped objects with different micro-motion forms, this paper proposes an optimal selection method for micro-motion features based on a support vector machine (SVM). After computing the time-frequency distribution of the radar echoes and comparing the time-frequency spectra of objects with different micro-motion forms, features are extracted based on the differences between the instantaneous frequency variations of the different micro-motions. The extracted features are then evaluated with SVM-based methods to obtain the best features. Finally, the results show that the proposed method is feasible under test conditions with a certain signal-to-noise ratio (SNR).

  19. A ROC-based feature selection method for computer-aided detection and diagnosis

    NASA Astrophysics Data System (ADS)

    Wang, Songyuan; Zhang, Guopeng; Liao, Qimei; Zhang, Junying; Jiao, Chun; Lu, Hongbing

    2014-03-01

    Image-based computer-aided detection and diagnosis (CAD) has been a very active research topic aiming to assist physicians in detecting lesions and classifying them as benign or malignant. However, the datasets fed into a classifier usually suffer from a small number of samples, as well as significantly fewer samples in one class (diseased) than in the other, resulting in suboptimal classifier performance. Identifying the most characterizing features of the observed data for lesion detection is critical to improving the sensitivity and minimizing the false positives of a CAD system. In this study, we propose a novel feature selection method, mR-FAST, that combines the minimal-redundancy-maximal-relevance (mRMR) framework with a selection metric, FAST (feature assessment by sliding thresholds), based on the area under a ROC curve (AUC) generated on optimal simple linear discriminants. With three feature datasets extracted from CAD systems for colon polyps and bladder cancer, we show that the space of candidate features selected by mR-FAST is more characterizing for lesion detection with higher AUC, making it possible to find a compact subset of superior features at low cost.
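
    The relevance side of an AUC-based metric can be illustrated by scoring each feature with the AUC of a classifier that simply thresholds that one feature. The sketch below computes this through the pairwise-comparison (Mann-Whitney) form of the AUC, which is equivalent to sweeping a threshold over the feature values. It is not the mR-FAST method itself: the mRMR redundancy term is omitted, and the names `single_feature_auc` and `rank_features_by_auc` as well as the data are hypothetical.

```python
def single_feature_auc(values, labels):
    """AUC of thresholding one feature (labels: 1 = lesion, 0 = normal).
    Computed as the fraction of (positive, negative) pairs ranked correctly,
    with ties counted as half; orientation-free, so the result is >= 0.5."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    return max(auc, 1.0 - auc)


def rank_features_by_auc(X, y):
    """Return feature indices sorted by decreasing single-feature AUC, plus the AUCs."""
    aucs = [single_feature_auc([row[f] for row in X], y) for f in range(len(X[0]))]
    order = sorted(range(len(aucs)), key=lambda f: -aucs[f])
    return order, aucs


# Toy, class-imbalanced data (invented): 2 positives, 4 negatives.
X = [[0.9, 0.5, 0.1],
     [0.8, 0.1, 0.2],
     [0.1, 0.4, 0.9],
     [0.2, 0.6, 0.8],
     [0.3, 0.2, 0.7],
     [0.4, 0.3, 0.6]]
y = [1, 1, 0, 0, 0, 0]
print(rank_features_by_auc(X, y))
```

    Because the AUC depends only on how positives rank against negatives, it is insensitive to the class imbalance that the abstract identifies as a problem for CAD datasets.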

  20. Analysis of Radarsat-2 Full Polarimetric Data for Forest Mapping

    NASA Astrophysics Data System (ADS)

    Maghsoudi, Yasser

    Forests are a major natural resource of the Earth and control a wide range of environmental processes. Forests comprise a major part of the planet's plant biodiversity and have an important role in the global hydrological and biochemical cycles. Among the numerous potential applications of remote sensing in forestry, forest mapping plays a vital role for characterization of the forest in terms of species. This is particularly true in Canada, where forests occupy 45% of the territory, representing more than 400 million hectares of the total Canadian continental area. In this thesis, the potential of polarimetric SAR (PolSAR) Radarsat-2 data for forest mapping is investigated. This thesis has two principal objectives. The first is to propose algorithms for analyzing the PolSAR image data for forest mapping. A wide range of SAR parameters can be derived from PolSAR data. In order to make full use of the discriminative power offered by all these parameters, two categories of methods are proposed. The methods are based on the concepts of feature selection and classifier ensemble. First, a nonparametric definition of the evaluation function is proposed, giving the methods NFS and CBFS. Second, a fast wrapper algorithm is proposed for the evaluation function in feature selection, giving the methods FWFS and FWCBFS. Finally, to incorporate the neighboring pixels' information in classification, an extension of the FWCBFS method, i.e. CCBFS, is proposed. The second objective of this thesis is to provide a comparison between leaf-on (summer) and leaf-off (fall) season images for forest mapping. Two Radarsat-2 images acquired in fine quad-polarized mode were chosen for this study. The images were collected in leaf-on and leaf-off seasons. We also test the hypothesis that combining the SAR parameters obtained from both images can provide better results than either individual dataset. The rationale for this combination is that every dataset has some parameters which may be useful for forest mapping. To assess the potential of the proposed methods, their performance has been compared with each other and with the baseline classifiers. The baseline methods include the Wishart classifier, which is a commonly used classification method in the PolSAR community, as well as an SVM classifier with the full set of parameters. Experimental results showed a better performance of the leaf-off image compared to that of the leaf-on image for forest mapping. It is also shown that combining leaf-off parameters with leaf-on parameters can significantly improve the classification accuracy. Also, the classification results (in terms of the overall accuracy) compared to the baseline classifiers demonstrate the effectiveness of the proposed nonparametric scheme for forest mapping.

  1. Automatic parameter selection for feature-based multi-sensor image registration

    NASA Astrophysics Data System (ADS)

    DelMarco, Stephen; Tom, Victor; Webb, Helen; Chao, Alan

    2006-05-01

    Accurate image registration is critical for applications such as precision targeting, geo-location, change-detection, surveillance, and remote sensing. However, the increasing volume of image data is exceeding the current capacity of human analysts to perform manual registration. This image data glut necessitates the development of automated approaches to image registration, including algorithm parameter value selection. Proper parameter value selection is crucial to the success of registration techniques. The appropriate algorithm parameters can be highly scene and sensor dependent. Therefore, robust algorithm parameter value selection approaches are a critical component of an end-to-end image registration algorithm. In previous work, we developed a general framework for multisensor image registration which includes feature-based registration approaches. In this work we examine the problem of automated parameter selection. We apply the automated parameter selection approach of Yitzhaky and Peli to select parameters for feature-based registration of multisensor image data. The approach consists of generating multiple feature-detected images by sweeping over parameter combinations and using these images to generate estimated ground truth. The feature-detected images are compared to the estimated ground truth images to generate ROC points associated with each parameter combination. We develop a strategy for selecting the optimal parameter set by choosing the parameter combination corresponding to the optimal ROC point. We present numerical results showing the effectiveness of the approach using registration of collected SAR data to reference EO data.
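
    The sweep-and-compare procedure described here can be sketched in a few lines. The sketch makes several simplifying assumptions that are not taken from the abstract: detection results are flattened binary maps, the estimated ground truth is a plain pixel-wise majority vote over all maps (the Yitzhaky and Peli procedure is more elaborate), and the "optimal" ROC point is taken to be the one closest to the ideal corner (FPR = 0, TPR = 1). The function names and the parameter labels "low", "mid" and "high" are hypothetical.

```python
from math import hypot


def estimate_ground_truth(maps):
    """Pixel-wise majority vote over binary detection maps (assumed estimator)."""
    n = len(maps)
    return [1 if 2 * sum(pixels) > n else 0 for pixels in zip(*maps)]


def roc_point(detected, truth):
    """Return (false positive rate, true positive rate) of one detection map."""
    tp = sum(1 for d, t in zip(detected, truth) if d == 1 and t == 1)
    fp = sum(1 for d, t in zip(detected, truth) if d == 1 and t == 0)
    pos = sum(truth)
    neg = len(truth) - pos
    tpr = tp / pos if pos else 0.0
    fpr = fp / neg if neg else 0.0
    return fpr, tpr


def select_parameters(maps_by_param):
    """Pick the parameter setting whose ROC point lies closest to (0, 1)."""
    truth = estimate_ground_truth(list(maps_by_param.values()))

    def distance_to_ideal(param):
        fpr, tpr = roc_point(maps_by_param[param], truth)
        return hypot(fpr, 1.0 - tpr)

    return min(maps_by_param, key=distance_to_ideal)


# Toy sweep (invented): three detector settings applied to an 8-pixel image.
maps_by_param = {
    "low":  [1, 1, 1, 1, 1, 0, 1, 0],  # permissive setting, many detections
    "mid":  [1, 1, 1, 0, 0, 0, 0, 0],
    "high": [1, 0, 0, 0, 0, 0, 0, 0],  # strict setting, few detections
}
print(select_parameters(maps_by_param))
```

    Other criteria for the optimal ROC point, such as maximizing TPR minus FPR, would slot into `distance_to_ideal` without changing the rest of the sketch.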

  2. 21 CFR 880.6850 - Sterilization wrap.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Sterilization wrap. 880.6850 Section 880.6850 Food... § 880.6850 Sterilization wrap. (a) Identification. A sterilization wrap (pack, sterilization wrapper... sterilized by a health care provider. It is intended to allow sterilization of the enclosed medical device...

  3. Unmanned Ground Vehicle Communications Relays: Lessons Learned

    DTIC Science & Technology

    2014-04-01

    technology, specifically an open-source VPN package, OpenVPN. This technology provides a wrapper around the network messages, providing a plug-and-play...performed in OpenVPN: Set the Maximum Transmission Unit (MTU) to 1600. This is because each VPN endpoint has an MTU of 1500 (the default for Ethernet

  4. 77 FR 34407 - Certain Reduced Ignition Proclivity Cigarette Paper Wrappers and Products Containing Same...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-06-11

    ... INTERNATIONAL TRADE COMMISSION [Investigation No. 337-TA-756] Certain Reduced Ignition Proclivity... of No Violation AGENCY: U.S. International Trade Commission. ACTION: Notice. SUMMARY: Notice is hereby given that the U.S. International Trade Commission has determined to terminate the above-captioned...

  5. 49 CFR 1039.11 - Miscellaneous commodities exemptions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ..., etc. 26 214 Wrapping paper, wrappers or coarse paper. 26 218 Sanitary tissue stock. 26 471 Sanitary... 30 111 Rubber pneumatic tires or parts. 31 ......do Leather or leather products. 32 ......do Clay... 32 952 15 Cinders, clay, shale expanded shale), slate or volcanic (not pumice stone), or haydrite. 33...

  6. 7 CFR 58.426 - Rindless cheese wrapping equipment.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 3 2011-01-01 2011-01-01 false Rindless cheese wrapping equipment. 58.426 Section 58... Service 1 Equipment and Utensils § 58.426 Rindless cheese wrapping equipment. The equipment used to heat seal the wrapper applied to rindless cheese shall have square interior corners, reasonably smooth...

  7. 7 CFR 58.426 - Rindless cheese wrapping equipment.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 3 2010-01-01 2010-01-01 false Rindless cheese wrapping equipment. 58.426 Section 58... Service 1 Equipment and Utensils § 58.426 Rindless cheese wrapping equipment. The equipment used to heat seal the wrapper applied to rindless cheese shall have square interior corners, reasonably smooth...

  8. 25 CFR 307.1 - Penalties.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 25 Indians 2 2010-04-01 2010-04-01 false Penalties. 307.1 Section 307.1 Indians INDIAN ARTS AND CRAFTS BOARD, DEPARTMENT OF THE INTERIOR NAVAJO ALL-WOOL WOVEN FABRICS; USE OF GOVERNMENT CERTIFICATE OF... products, Indian or otherwise, or to any labels, signs, prints, packages, wrappers, or receptacles intended...

  9. 30 CFR 15.7 - Approval marking.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... under the brand or trade name specified in the approval. (b) The wrapper of each cartridge and each case of approved explosives shall be legibly labeled with the following: the brand or trade name, “MSHA... legibly labeled with the following: the brand or trade name, “MSHA Approved Sheathed Explosive Unit”, the...

  10. 46 CFR 59.10-35 - Wrapper plates and back heads.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ..., PRESSURE VESSELS AND APPURTENANCES Welding Repairs to Boilers and Pressure Vessels in -Service § 59.10-35... staybolts where the thickness is approximately the same as the original construction. If welding is employed... plates of combustion chambers outside of stayed areas may be repaired by welding provided the welded...

  11. Managing configuration software of ground software applications with glueware

    NASA Technical Reports Server (NTRS)

    Larsen, B.; Herrera, R.; Sesplaukis, T.; Cheng, L.; Sarrel, M.

    2003-01-01

    This paper reports on a simple, low-cost effort to streamline the configuration of the uplink software tools. Even though the existing ground system consisted of JPL and custom Cassini software rather than COTS, we chose a glueware approach--reintegrating with wrappers and bridges and adding minimal new functionality.

  12. Occupational asthma due to polyethylene shrink wrapping (paper wrapper's asthma).

    PubMed Central

    Gannon, P F; Burge, P S; Benfield, G F

    1992-01-01

    Occupational asthma due to the pyrolysis products of polyvinyl chloride (PVC) produced by shrink wrapping processes has previously been reported. The first case of occupational asthma in a shrink wrap worker using a different plastic, polyethylene, is reported; the association was confirmed by specific bronchial provocation testing. PMID:1440477

  13. A Virtual Laboratory for Digital Signal Processing

    ERIC Educational Resources Information Center

    Dow, Chyi-Ren; Li, Yi-Hsung; Bai, Jin-Yu

    2006-01-01

    This work designs and implements a virtual digital signal processing laboratory, VDSPL. VDSPL consists of four parts: mobile agent execution environments, mobile agents, DSP development software, and DSP experimental platforms. The network capability of VDSPL is created by using mobile agent and wrapper techniques without modifying the source code…

  14. 9 CFR 94.5 - Regulation of certain garbage.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    .... Cooking garbage at an internal temperature of 212 °F for 30 minutes. Stores. The food, supplies, and other... in this section and includes food scraps, table refuse, galley refuse, food wrappers or packaging materials, and other waste material from stores, food preparation areas, passengers' or crews' quarters...

  15. 9 CFR 351.11 - Identification and separation of technical animal fats for certification and materials for use...

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... technical animal fats for certification and materials for use therein; removal of wrappers, etc.; cleaning... AND VOLUNTARY INSPECTION AND CERTIFICATION CERTIFICATION OF TECHNICAL ANIMAL FATS FOR EXPORT Facilities and Operations § 351.11 Identification and separation of technical animal fats for certification...

  16. 9 CFR 351.11 - Identification and separation of technical animal fats for certification and materials for use...

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... technical animal fats for certification and materials for use therein; removal of wrappers, etc.; cleaning... AND VOLUNTARY INSPECTION AND CERTIFICATION CERTIFICATION OF TECHNICAL ANIMAL FATS FOR EXPORT Facilities and Operations § 351.11 Identification and separation of technical animal fats for certification...

  17. 9 CFR 351.11 - Identification and separation of technical animal fats for certification and materials for use...

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... technical animal fats for certification and materials for use therein; removal of wrappers, etc.; cleaning... AND VOLUNTARY INSPECTION AND CERTIFICATION CERTIFICATION OF TECHNICAL ANIMAL FATS FOR EXPORT Facilities and Operations § 351.11 Identification and separation of technical animal fats for certification...

  18. 9 CFR 351.11 - Identification and separation of technical animal fats for certification and materials for use...

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... technical animal fats for certification and materials for use therein; removal of wrappers, etc.; cleaning... AND VOLUNTARY INSPECTION AND CERTIFICATION CERTIFICATION OF TECHNICAL ANIMAL FATS FOR EXPORT Facilities and Operations § 351.11 Identification and separation of technical animal fats for certification...

  19. 9 CFR 351.11 - Identification and separation of technical animal fats for certification and materials for use...

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... technical animal fats for certification and materials for use therein; removal of wrappers, etc.; cleaning... AND VOLUNTARY INSPECTION AND CERTIFICATION CERTIFICATION OF TECHNICAL ANIMAL FATS FOR EXPORT Facilities and Operations § 351.11 Identification and separation of technical animal fats for certification...

  20. 46 CFR 59.10-35 - Wrapper plates and back heads.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ..., PRESSURE VESSELS AND APPURTENANCES Welding Repairs to Boilers and Pressure Vessels in -Service § 59.10-35... staybolts where the thickness is approximately the same as the original construction. If welding is employed... plates of combustion chambers outside of stayed areas may be repaired by welding provided the welded...

  1. 46 CFR 59.10-35 - Wrapper plates and back heads.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ..., PRESSURE VESSELS AND APPURTENANCES Welding Repairs to Boilers and Pressure Vessels in -Service § 59.10-35... staybolts where the thickness is approximately the same as the original construction. If welding is employed... plates of combustion chambers outside of stayed areas may be repaired by welding provided the welded...

  2. 46 CFR 59.10-35 - Wrapper plates and back heads.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ..., PRESSURE VESSELS AND APPURTENANCES Welding Repairs to Boilers and Pressure Vessels in -Service § 59.10-35... staybolts where the thickness is approximately the same as the original construction. If welding is employed... plates of combustion chambers outside of stayed areas may be repaired by welding provided the welded...

  3. 46 CFR 59.10-35 - Wrapper plates and back heads.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ..., PRESSURE VESSELS AND APPURTENANCES Welding Repairs to Boilers and Pressure Vessels in -Service § 59.10-35... staybolts where the thickness is approximately the same as the original construction. If welding is employed... plates of combustion chambers outside of stayed areas may be repaired by welding provided the welded...

  5. 10 CFR 1016.33 - External transmission of documents and material.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... ordinary manner and sealed with tape, the appropriate classification shall be placed on both sides of the... address. (3) The outer envelope or wrapper shall be addressed in the ordinary manner. No classification... clearance or access authorization who have been given written authority by their employers. (2) Confidential...

  6. 10 CFR 1016.33 - External transmission of documents and material.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... ordinary manner and sealed with tape, the appropriate classification shall be placed on both sides of the... address. (3) The outer envelope or wrapper shall be addressed in the ordinary manner. No classification... clearance or access authorization who have been given written authority by their employers. (2) Confidential...

  7. Vibration and acoustic frequency spectra for industrial process modeling using selective fusion multi-condition samples and multi-source features

    NASA Astrophysics Data System (ADS)

    Tang, Jian; Qiao, Junfei; Wu, ZhiWei; Chai, Tianyou; Zhang, Jian; Yu, Wen

    2018-01-01

    Frequency spectral data of mechanical vibration and acoustic signals relate to difficult-to-measure production quality and quantity parameters of complex industrial processes. A selective ensemble (SEN) algorithm can be used to build a soft sensor model of these process parameters by fusing valued information selectively from different perspectives. However, a combination of several optimized ensemble sub-models with SEN cannot guarantee the best prediction model. In this study, we use several techniques to construct mechanical vibration and acoustic frequency spectra of a data-driven industrial process parameter model based on selective fusion multi-condition samples and multi-source features. Multi-layer SEN (MLSEN) strategy is used to simulate the domain expert cognitive process. Genetic algorithm and kernel partial least squares are used to construct the inside-layer SEN sub-model based on each mechanical vibration and acoustic frequency spectral feature subset. Branch-and-bound and adaptive weighted fusion algorithms are integrated to select and combine outputs of the inside-layer SEN sub-models. Then, the outside-layer SEN is constructed. Thus, "sub-sampling training examples"-based and "manipulating input features"-based ensemble construction methods are integrated, thereby realizing the selective information fusion process based on multi-condition history samples and multi-source input features. This novel approach is applied to a laboratory-scale ball mill grinding process. A comparison with other methods indicates that the proposed MLSEN approach effectively models mechanical vibration and acoustic signals.

  8. NetProt: Complex-based Feature Selection.

    PubMed

    Goh, Wilson Wen Bin; Wong, Limsoon

    2017-08-04

    Protein complex-based feature selection (PCBFS) provides unparalleled reproducibility with high phenotypic relevance on proteomics data. Currently, there are five PCBFS paradigms, but not all representative methods have been implemented or made readily available. To allow general users to take advantage of these methods, we developed the R-package NetProt, which provides implementations of representative feature-selection methods. NetProt also provides methods for generating simulated differential data and generating pseudocomplexes for complex-based performance benchmarking. The NetProt open source R package is available for download from https://github.com/gohwils/NetProt/releases/ , and online documentation is available at http://rpubs.com/gohwils/204259 .

  9. Feature-based attentional modulations in the absence of direct visual stimulation.

    PubMed

    Serences, John T; Boynton, Geoffrey M

    2007-07-19

    When faced with a crowded visual scene, observers must selectively attend to behaviorally relevant objects to avoid sensory overload. Often this selection process is guided by prior knowledge of a target-defining feature (e.g., the color red when looking for an apple), which enhances the firing rate of visual neurons that are selective for the attended feature. Here, we used functional magnetic resonance imaging and a pattern classification algorithm to predict the attentional state of human observers as they monitored a visual feature (one of two directions of motion). We find that feature-specific attention effects spread across the visual field, even to regions of the scene that do not contain a stimulus. This spread of feature-based attention to empty regions of space may facilitate the perception of behaviorally relevant stimuli by increasing sensitivity to attended features at all locations in the visual field.

  10. Optimum location of external markers using feature selection algorithms for real‐time tumor tracking in external‐beam radiotherapy: a virtual phantom study

    PubMed Central

    Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-01

    In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
    But for liver tumors, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield the best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358

  11. Optimum location of external markers using feature selection algorithms for real-time tumor tracking in external-beam radiotherapy: a virtual phantom study.

    PubMed

    Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin

    2016-01-08

    In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. 
    But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield the best performance accuracy for selecting optimum markers.

  12. EEG-based mild depressive detection using feature selection methods and classifiers.

    PubMed

    Li, Xiaowei; Hu, Bin; Sun, Shuting; Cai, Hanshu

    2016-11-01

    Depression has become a major health burden worldwide, and effective detection of this disorder is a great challenge that requires the latest technological tools, such as electroencephalography (EEG). This EEG-based research seeks to find the prominent frequency band and brain regions that are most related to mild depression, as well as an optimal combination of classification algorithms and feature selection methods which can be used in future mild depression detection. An experiment based on a facial expression viewing task (Emo_block and Neu_block) was conducted, and EEG data of 37 university students were collected using a 128-channel HydroCel Geodesic Sensor Net (HCGSN). For discriminating mild depressive patients from normal controls, BayesNet (BN), Support Vector Machine (SVM), Logistic Regression (LR), k-nearest neighbor (KNN) and RandomForest (RF) classifiers were used. BestFirst (BF), GreedyStepwise (GSW), GeneticSearch (GS), LinearForwardSelection (LFS) and RankSearch (RS) based on Correlation Features Selection (CFS) were applied for linear and non-linear EEG feature selection. An independent-samples t-test with Bonferroni correction was used to find the significantly discriminant electrodes and features. Data mining results indicate that optimal performance is achieved using a combination of the feature selection method GSW based on CFS and the classifier KNN for the beta frequency band. Accuracies reached 92.00% and 98.00%, and AUC reached 0.957 and 0.997, for Emo_block and Neu_block beta band data respectively. T-test results validate the effectiveness of the features selected by the search method GSW. A simplified EEG system with only the FP1, FP2, F3, O2 and T3 electrodes was also explored with linear features, which yielded accuracies of 91.70% and 96.00% and AUC of 0.952 and 0.972 for Emo_block and Neu_block respectively. Classification results obtained by GSW + KNN are encouraging and better than previously published results. In the spatial distribution of features, we find that the left parietotemporal lobe in the beta EEG frequency band has a greater effect on mild depression detection. Fewer EEG channels (FP1, FP2, F3, O2 and T3) combined with linear features may be good candidates for use in portable systems for mild depression detection. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
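
The combination of a greedy stepwise search with a CFS criterion mentioned in this abstract can be sketched as follows. This is an illustrative sketch, not the study's pipeline: it uses the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), with a forward search that stops when the merit no longer improves. The feature names and correlation values are hypothetical and are supplied precomputed rather than estimated from EEG data.

```python
import math

def cfs_merit(subset, feat_class_corr, feat_feat_corr):
    """CFS merit: high feature-class correlation, low feature-feature correlation."""
    k = len(subset)
    r_cf = sum(feat_class_corr[f] for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(feat_feat_corr[frozenset(p)] for p in pairs) / len(pairs) if pairs else 0.0
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)

def greedy_forward(features, feat_class_corr, feat_feat_corr):
    """Add the feature that raises the merit most; stop when nothing improves it."""
    selected, best = [], 0.0
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        merit, feature = max(
            (cfs_merit(selected + [f], feat_class_corr, feat_feat_corr), f)
            for f in candidates
        )
        if merit <= best:
            break
        selected.append(feature)
        best = merit
    return selected, best

# Hypothetical absolute correlations, for illustration only.
features = ["beta_power", "alpha_power", "theta_power"]
feat_class_corr = {"beta_power": 0.8, "alpha_power": 0.6, "theta_power": 0.5}
feat_feat_corr = {
    frozenset(("beta_power", "alpha_power")): 0.9,
    frozenset(("beta_power", "theta_power")): 0.1,
    frozenset(("alpha_power", "theta_power")): 0.2,
}
selected, merit = greedy_forward(features, feat_class_corr, feat_feat_corr)
print(selected, round(merit, 4))
```

In this toy setting the second feature added is the one least correlated with the first, even though another candidate correlates more strongly with the class, which is the redundancy penalty the merit is designed to express.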

  13. An Ant Colony Optimization Based Feature Selection for Web Page Classification

    PubMed Central

    2014-01-01

    The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods. PMID:25136678

  14. Hypergraph Based Feature Selection Technique for Medical Diagnosis.

    PubMed

    Somu, Nivethitha; Raman, M R Gauthama; Kirthivasan, Kannan; Sriram, V S Shankar

    2016-11-01

    The impact of the internet and information systems across various domains has resulted in substantial generation of multidimensional datasets. The use of data mining and knowledge discovery techniques to extract the original information contained in the multidimensional datasets plays a significant role in exploiting the full benefit they provide. The presence of a large number of features in high-dimensional datasets incurs high computational cost in terms of computing power and time. Hence, feature selection techniques have been commonly used to build robust machine learning models by selecting a subset of relevant features that preserves the maximal information content of the original dataset. In this paper, a novel Rough Set based K - Helly feature selection technique (RSKHT), which hybridizes Rough Set Theory (RST) and the K - Helly property of hypergraph representation, is designed to identify the optimal feature subset or reduct for medical diagnostic applications. Experiments carried out using medical datasets from the UCI repository show that RSKHT outperforms other feature selection techniques with respect to reduct size, classification accuracy and time complexity. The performance of RSKHT has been validated using the WEKA tool, which shows that RSKHT is computationally attractive and flexible on massive datasets.

  15. [Combining speech sample and feature bilateral selection algorithm for classification of Parkinson's disease].

    PubMed

    Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei

    2018-02-01

    Diagnosis of Parkinson's disease (PD) based on speech data has proved to be effective in recent years. However, current research focuses only on feature extraction and classifier design and does not consider instance selection. Earlier research by the authors showed that instance selection can improve classification accuracy. However, no attention has been paid to the relationship between speech samples and features until now. Therefore, a new PD diagnosis algorithm is proposed in this paper that simultaneously selects speech samples and features, based on a relevant feature weighting algorithm and a multiple kernel method, so as to find their synergy effects and thereby improve classification accuracy. Experimental results showed that the proposed algorithm clearly improved classification accuracy. It obtained a mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. In addition, the proposed algorithm detected the synergy effects of speech samples and features, which is valuable for speech marker extraction.

  16. Stabilisation problem in biaxial platform

    NASA Astrophysics Data System (ADS)

    Lindner, Tymoteusz; Rybarczyk, Dominik; Wyrwał, Daniel

    2016-12-01

    The article describes an investigation of the problem of stabilizing a rolling ball on a biaxial platform. The aim of the proposed control system is to stabilize a ball moving on a plane at an equilibrium point. The authors propose a control algorithm based on cascade PID and compare it with another control method. The article presents results on the accuracy of ball stabilization and the influence of the applied filter on the signal waveform. The application that detects the ball position from digital camera images was written using EmguCV, a cross-platform .NET wrapper for the OpenCV image processing library. The authors used a bipolar stepper motor with a dedicated electronic controller. Data are exchanged between the computer and the designed controller using the RS232 standard. The control stand is based on an ATmega series microcontroller.

  17. Image search engine with selective filtering and feature-element-based classification

    NASA Astrophysics Data System (ADS)

    Li, Qing; Zhang, Yujin; Dai, Shengyang

    2001-12-01

    With the growth of the Internet and of storage capability in recent years, images have become a widespread information format on the World Wide Web. However, it has become increasingly hard to search for images of interest, and an effective image search engine for the WWW needs to be developed. In this paper we propose a selective filtering process and a novel approach to image classification based on feature elements, both used in the image search engine we developed for the WWW. First, a selective filtering process is embedded in a general web crawler to filter out meaningless images in GIF format. Two parameters that can be obtained easily are used in the filtering process. Our classification approach first extracts feature elements from images instead of feature vectors. Compared with feature vectors, feature elements can better capture the visual meaning of an image according to the subjective perception of human beings. Unlike traditional image classification methods, our feature-element-based approach does not calculate the distance between two vectors in the feature space but instead tries to find associations between feature elements and the class attribute of the image. Experiments are presented to show the efficiency of the proposed approach.

  18. Influence of time and length size feature selections for human activity sequences recognition.

    PubMed

    Fang, Hongqing; Chen, Long; Srinivasan, Raghavendiran

    2014-01-01

    In this paper, the Viterbi algorithm based on a hidden Markov model is applied to recognize activity sequences from observed sensor events. Alternative selections of sensor-event time feature values and activity length size feature values are tested, and the resulting activity sequence recognition performance of the Viterbi algorithm is evaluated. The results show that selecting larger time feature values for sensor events and/or smaller activity length size feature values yields relatively better activity sequence recognition performance. © 2013 ISA. Published by ISA. All rights reserved.
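
As a reminder of how Viterbi decoding recovers a hidden activity sequence from sensor events, here is a minimal sketch. It is a generic textbook formulation, not the paper's model: the two activities, the two sensor events and all probabilities are hypothetical, and the time and length features studied in the paper are not modeled.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for a sequence of observations."""
    def log(p):
        # Log space avoids numerical underflow on long sequences.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace the best path backwards from the most likely final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path

# Hypothetical two-activity model driven by two motion sensors.
states = ("cooking", "sleeping")
start_p = {"cooking": 0.5, "sleeping": 0.5}
trans_p = {
    "cooking": {"cooking": 0.8, "sleeping": 0.2},
    "sleeping": {"cooking": 0.2, "sleeping": 0.8},
}
emit_p = {
    "cooking": {"kitchen_motion": 0.9, "bedroom_motion": 0.1},
    "sleeping": {"kitchen_motion": 0.1, "bedroom_motion": 0.9},
}
events = ["kitchen_motion", "kitchen_motion", "bedroom_motion", "bedroom_motion"]
path = viterbi(events, states, start_p, trans_p, emit_p)
print(path)
```

Feature choices such as those compared in the paper would enter through the observation symbols and the emission table, which here are reduced to a single sensor label per event.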

  19. The Speed of Feature-Based Attention: Attentional Advantage Is Slow, but Selection Is Fast

    ERIC Educational Resources Information Center

    Huang, Liqiang

    2010-01-01

    When paying attention to a feature (e.g., red), no attentional advantage is gained in perceiving items with this feature in very brief displays. Therefore, feature-based attention seems to be slow. In previous feature-based attention studies, attention has often been measured as the difference in performance in a secondary task. In our recent work…

  20. Application-Dedicated Selection of Filters (ADSF) using covariance maximization and orthogonal projection.

    PubMed

    Hadoux, Xavier; Kumar, Dinesh Kant; Sarossy, Marc G; Roger, Jean-Michel; Gorretta, Nathalie

    2016-05-19

    Visible and near-infrared (Vis-NIR) spectra are generated by the combination of numerous low-resolution features. Spectral variables are thus highly correlated, which can cause problems when selecting the most appropriate ones for a given application. Some decomposition bases, such as Fourier or wavelet, generally help highlight important spectral features but are by nature constrained to have both positive and negative components. In addition to complicating the interpretability of the selected features, this impedes their use in application-dedicated sensors. In this paper we propose a new method for feature selection: Application-Dedicated Selection of Filters (ADSF). This method relaxes the shape constraint by enabling the selection of any type of user-defined custom feature. By considering only relevant features, based on the underlying nature of the data, high regularization of the final model can be obtained, even in the small sample size context often encountered in spectroscopic applications. For larger-scale deployment of application-dedicated sensors, these predefined feature constraints can lead to application-specific optical filters, e.g., lowpass, highpass, bandpass or bandstop filters with positive-only coefficients. In a similar fashion to Partial Least Squares, ADSF successively selects features using covariance maximization and deflates their influence using orthogonal projection in order to optimally tune the selection to the data with limited redundancy. ADSF is well suited for spectroscopic data as it can deal with large numbers of highly correlated variables in supervised learning, even with many correlated responses. Copyright © 2016 Elsevier B.V. All rights reserved.
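
The select-then-deflate loop described in this abstract can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published ADSF algorithm: candidate filters are plain weight vectors over three hypothetical wavelengths, the score is the absolute inner product between filtered spectra and a single centered response (proportional to covariance), and deflation removes the chosen score vector from the spectra by orthogonal projection. All names and numbers are made up.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(values):
    mean = sum(values) / len(values)
    return [x - mean for x in values]

def deflate(X, scores):
    """Project the score vector out of every column of X."""
    norm_sq = dot(scores, scores)
    if norm_sq == 0:
        return X
    n_cols = len(X[0])
    loadings = [dot(scores, [row[j] for row in X]) / norm_sq for j in range(n_cols)]
    return [[row[j] - s * loadings[j] for j in range(n_cols)] for row, s in zip(X, scores)]

def select_filters(X, y, candidates, n_select):
    """Greedily pick filters by covariance with y, deflating X after each pick."""
    columns = [center([row[j] for row in X]) for j in range(len(X[0]))]
    X = [list(row) for row in zip(*columns)]
    y = center(y)
    chosen = []
    for _ in range(n_select):
        best_name, best_cov, best_scores = None, -1.0, None
        for name, weights in candidates.items():
            if name in chosen:
                continue
            scores = [dot(row, weights) for row in X]
            cov = abs(dot(scores, y)) / math.sqrt(dot(weights, weights))
            if cov > best_cov:
                best_name, best_cov, best_scores = name, cov, scores
        chosen.append(best_name)
        X = deflate(X, best_scores)
    return chosen

# Hypothetical spectra: 4 samples x 3 wavelengths; the second band is a scaled copy of the first.
X = [[1.0, 0.5, 1.0], [2.0, 1.0, -1.0], [3.0, 1.5, -1.0], [4.0, 2.0, 1.0]]
y = [1.5, 1.5, 2.5, 4.5]
candidates = {"band0": [1, 0, 0], "band1": [0, 1, 0], "band2": [0, 0, 1]}
chosen = select_filters(X, y, candidates, n_select=2)
print(chosen)
```

After the first pick, the redundant second band carries no remaining signal, so the loop moves on to the third band; this is the "limited redundancy" effect of deflation in miniature.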

  1. Efficient Iris Recognition Based on Optimal Subfeature Selection and Weighted Subregion Fusion

    PubMed Central

    Deng, Ning

    2014-01-01

    In this paper, we propose three discriminative feature selection strategies and a weighted subregion matching method to improve the performance of an iris recognition system. Firstly, we introduce the process of feature extraction and representation based on scale invariant feature transformation (SIFT) in detail. Secondly, three strategies are described: an orientation probability distribution function (OPDF) based strategy to delete redundant feature keypoints, a magnitude probability distribution function (MPDF) based strategy to reduce the dimensionality of the feature elements, and a compound strategy combining OPDF and MPDF to further select the optimal subfeatures. Thirdly, to make matching more effective, this paper proposes a novel matching method based on weighted subregion matching fusion. Particle swarm optimization is used to speed up the search for the weights of the different subregions, and the subregions' matching scores are then weighted to generate the final decision. The experimental results on three public and renowned iris databases (CASIA-V3 Interval, Lamp, and MMU-V1) demonstrate that our proposed methods outperform some of the existing methods in terms of correct recognition rate, equal error rate, and computation complexity. PMID:24683317

  2. Efficient iris recognition based on optimal subfeature selection and weighted subregion fusion.

    PubMed

    Chen, Ying; Liu, Yuanning; Zhu, Xiaodong; He, Fei; Wang, Hongye; Deng, Ning

    2014-01-01

    In this paper, we propose three discriminative feature selection strategies and a weighted subregion matching method to improve the performance of an iris recognition system. Firstly, we introduce the process of feature extraction and representation based on scale invariant feature transformation (SIFT) in detail. Secondly, three strategies are described: an orientation probability distribution function (OPDF) based strategy to delete redundant feature keypoints, a magnitude probability distribution function (MPDF) based strategy to reduce the dimensionality of the feature elements, and a compound strategy combining OPDF and MPDF to further select the optimal subfeatures. Thirdly, to make matching more effective, this paper proposes a novel matching method based on weighted subregion matching fusion. Particle swarm optimization is used to speed up the search for the weights of the different subregions, and the subregions' matching scores are then weighted to generate the final decision. The experimental results on three public and renowned iris databases (CASIA-V3 Interval, Lamp, and MMU-V1) demonstrate that our proposed methods outperform some of the existing methods in terms of correct recognition rate, equal error rate, and computation complexity.

  3. Unbiased feature selection in learning random forests for high-dimensional data.

    PubMed

    Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi

    2015-01-01

    Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.

  4. Adaptive feature selection using v-shaped binary particle swarm optimization.

    PubMed

    Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.
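
    The V-shaped update rule mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes the commonly used V-shaped transfer function |tanh(v)| with a bit-flip update, and it replaces the paper's correlation-information-entropy fitness with a hypothetical toy fitness (per-feature relevance minus a size penalty). All function names and parameter values are made up for illustration.

```python
import math
import random


def v_transfer(velocity):
    """V-shaped transfer function |tanh(v)|: maps a velocity to a flip probability in [0, 1)."""
    return abs(math.tanh(velocity))


def toy_fitness(mask, relevance, penalty=0.1):
    """Hypothetical stand-in fitness: relevance of the selected features minus a size penalty."""
    return sum(r for r, bit in zip(relevance, mask) if bit) - penalty * sum(mask)


def v_bpso(relevance, n_particles=10, n_iter=50, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature masks; returns the best mask found and its fitness."""
    rng = random.Random(seed)
    dim = len(relevance)
    pos = [[rng.randint(0, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [toy_fitness(p, relevance) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                # V-shaped rule: a large |velocity| tends to flip the bit, a small one keeps it.
                if rng.random() < v_transfer(vel[i][d]):
                    pos[i][d] = 1 - pos[i][d]
            fit = toy_fitness(pos[i], relevance)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit
```

    Each particle is a 0/1 mask over the features, so the number of selected features is not fixed in advance; it emerges from the fitness, which matches the abstract's point about avoiding a hard constraint on subset size.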

  5. Adaptive feature selection using v-shaped binary particle swarm optimization

    PubMed Central

    Dong, Hongbin; Zhou, Xiurong

    2017-01-01

    Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers. PMID:28358850

  6. Investigating a memory-based account of negative priming: support for selection-feature mismatch.

    PubMed

    MacDonald, P A; Joordens, S

    2000-08-01

    Using typical and modified negative priming tasks, the selection-feature mismatch account of negative priming was tested. In the modified task, participants performed selections on the basis of a semantic feature (e.g., referent size). This procedure has been shown to enhance negative priming (P. A. MacDonald, S. Joordens, & K. N. Seergobin, 1999). Across 3 experiments, negative priming occurred only when the repeated item mismatched in terms of the feature used as the basis for selections. When the repeated item was congruent on the selection feature across the prime and probe displays, positive priming arose. This pattern of results appeared in both the ignored- and the attended-repetition conditions. Negative priming does not result from previously ignoring an item. These findings strongly support the selection-feature mismatch account of negative priming and refute both the distractor inhibition and the episodic-retrieval explanations.

  7. Compensatory selection for roads over natural linear features by wolves in northern Ontario: Implications for caribou conservation

    PubMed Central

    Patterson, Brent R.; Anderson, Morgan L.; Rodgers, Arthur R.; Vander Vennen, Lucas M.; Fryxell, John M.

    2017-01-01

    Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. 
    This behavioral mechanism, a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors, has negative consequences for the viability of woodland caribou. PMID:29117234

  8. Compensatory selection for roads over natural linear features by wolves in northern Ontario: Implications for caribou conservation.

    PubMed

    Newton, Erica J; Patterson, Brent R; Anderson, Morgan L; Rodgers, Arthur R; Vander Vennen, Lucas M; Fryxell, John M

    2017-01-01

    Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. 
    This behavioral mechanism, a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors, has negative consequences for the viability of woodland caribou.

  9. Feature Selection for Ridge Regression with Provable Guarantees.

    PubMed

    Paul, Saurabh; Drineas, Petros

    2016-04-01

    We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog to ridge regression. The method is unsupervised and gives worst-case guarantees of the generalization power of the classification function after feature selection with respect to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets (a subset of the TechTC-300 data sets) to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.

  10. Direct Observation in the Conduct of Training Impact Analyses

    DTIC Science & Technology

    2000-04-01

    with the point of the pen barely extruding from the wrapper (the tight fit keeps both the pen and chemlight in place). This will put a spot of soft...recommendations for supplying that office: A-2 Snack food and beverages (your next opportunity for a meal may not be predictable) Ample supply of notepads

  11. Rx for low cash yields.

    PubMed

    Tobe, Chris

    2003-10-01

    Certain strategies can offer not-for-profit hospitals potentially greater investment yields while maintaining stability and principal safety. Treasury inflation-indexed securities can offer good returns, low volatility, and inflation protection. "Enhanced cash" strategies offer liquidity and help to preserve capital. Stable value "wrappers" allow hospitals to pursue higher-yielding fixed-income securities without an increase in volatility.

  12. 7 CFR 29.2526 - Group.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 2 2014-01-01 2014-01-01 false Group. 29.2526 Section 29.2526 Agriculture Regulations...-Cured Tobacco (u.s. Types 22, 23, and Foreign Type 96) § 29.2526 Group. A division of a type covering..., or the general quality of the tobacco. Groups in these types are Wrappers (A), Heavy Leaf (B), Thin...

  13. 7 CFR 29.2526 - Group.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 2 2012-01-01 2012-01-01 false Group. 29.2526 Section 29.2526 Agriculture Regulations...-Cured Tobacco (u.s. Types 22, 23, and Foreign Type 96) § 29.2526 Group. A division of a type covering..., or the general quality of the tobacco. Groups in these types are Wrappers (A), Heavy Leaf (B), Thin...

  14. 7 CFR 29.2526 - Group.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 2 2013-01-01 2013-01-01 false Group. 29.2526 Section 29.2526 Agriculture Regulations...-Cured Tobacco (u.s. Types 22, 23, and Foreign Type 96) § 29.2526 Group. A division of a type covering..., or the general quality of the tobacco. Groups in these types are Wrappers (A), Heavy Leaf (B), Thin...

  15. 7 CFR 29.2526 - Group.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 2 2011-01-01 2011-01-01 false Group. 29.2526 Section 29.2526 Agriculture Regulations...-Cured Tobacco (u.s. Types 22, 23, and Foreign Type 96) § 29.2526 Group. A division of a type covering..., or the general quality of the tobacco. Groups in these types are Wrappers (A), Heavy Leaf (B), Thin...

  16. 7 CFR 29.2526 - Group.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Group. 29.2526 Section 29.2526 Agriculture Regulations...-Cured Tobacco (u.s. Types 22, 23, and Foreign Type 96) § 29.2526 Group. A division of a type covering..., or the general quality of the tobacco. Groups in these types are Wrappers (A), Heavy Leaf (B), Thin...

  17. 7 CFR 27.23 - Duplicate sets of samples of cotton.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 2 2011-01-01 2011-01-01 false Duplicate sets of samples of cotton. 27.23 Section 27... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Inspection and Samples § 27.23 Duplicate sets of samples of cotton. The duplicate sets of samples shall be inclosed in wrappers or...

  18. 7 CFR 27.23 - Duplicate sets of samples of cotton.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 2 2010-01-01 2010-01-01 false Duplicate sets of samples of cotton. 27.23 Section 27... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Inspection and Samples § 27.23 Duplicate sets of samples of cotton. The duplicate sets of samples shall be inclosed in wrappers or...

  19. 7 CFR 27.23 - Duplicate sets of samples of cotton.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 2 2014-01-01 2014-01-01 false Duplicate sets of samples of cotton. 27.23 Section 27... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Inspection and Samples § 27.23 Duplicate sets of samples of cotton. The duplicate sets of samples shall be inclosed in wrappers or...

  20. 7 CFR 27.23 - Duplicate sets of samples of cotton.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 2 2013-01-01 2013-01-01 false Duplicate sets of samples of cotton. 27.23 Section 27... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Inspection and Samples § 27.23 Duplicate sets of samples of cotton. The duplicate sets of samples shall be inclosed in wrappers or...

  1. 7 CFR 27.23 - Duplicate sets of samples of cotton.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 2 2012-01-01 2012-01-01 false Duplicate sets of samples of cotton. 27.23 Section 27... REGULATIONS COTTON CLASSIFICATION UNDER COTTON FUTURES LEGISLATION Regulations Inspection and Samples § 27.23 Duplicate sets of samples of cotton. The duplicate sets of samples shall be inclosed in wrappers or...

  2. 77 FR 3745 - Establishment of a One-Year Retention Period for Patent-Related Papers That Have Been Scanned...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-01-25

    ... File Wrapper System or the Supplemental Complex Repository for Examiners AGENCY: United States Patent... (IFW) or the Supplemental Complex Repository for Examiners (SCORE). The USPTO has considered the... Supplemental Complex Repository for Examiners, 76 FR 53667 (August 29, 2011), 1370 Off. Gaz. Pat. Office 211...

  3. 9 CFR 317.13 - Storage and distribution of labels and containers bearing official marks.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... containers bearing official marks. 317.13 Section 317.13 Animals and Animal Products FOOD SAFETY AND... General § 317.13 Storage and distribution of labels and containers bearing official marks. Labels, wrappers, and containers bearing any official marks, with or without the establishment number, may be...

  4. El Habano and the world it has shaped: Cuba, Connecticut, and Indonesia.

    PubMed

    Stubbs, Jean

    2010-01-01

    In the half century since the 1959 Cuban Revolution, El Habano remains the premium cigar the world over; but both before and since 1959, the seed, agricultural and industrial know-how, and human capital have been transplanted to replicate that cigar in a process accentuated by upheavals and out-migration. The focus here is on a little-known facet of the interconnected island and offshore Havana cigar history, linking Cuba with Connecticut and Indonesia: from when tobacco was taken from the Americas to Indonesia and gave rise to the famed Sumatra cigar wrapper leaf; through the rise and demise of its sister shade wrapper in Connecticut, with Cuban and Sumatra seed, ultimately overshadowed by Indonesia; and the resulting challenges facing Cuba today. The article highlights the role of Dutch, U.S., British, and Swedish capital to explain why in 2009 the two major global cigar corporations, British Imperial Tobacco and Swedish Match, were lobbying Washington, respectively, for and against the embargo on Cuba. As the antismoking, antitobacco lobby gains ground internationally, the intriguing final question is whether the future lies with El Habano or smokeless Swedish snus.

  5. Reconfigurable, Intelligently-Adaptive, Communication System, an SDR Platform

    NASA Technical Reports Server (NTRS)

    Roche, Rigoberto

    2016-01-01

    The Space Telecommunications Radio System (STRS) provides a common, consistent framework to abstract the application software from the radio platform hardware. STRS aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. The Glenn Research Center (GRC) team made a software-defined radio (SDR) platform STRS compliant by adding an STRS operating environment and a field programmable gate array (FPGA) wrapper, capable of implementing each of the platform's interfaces, as well as a test waveform to exercise those interfaces. This effort serves to provide a framework toward waveform development on an STRS compliant platform to support future space communication systems for advanced exploration missions. Validated STRS compliant applications provided tested code with extensive documentation to potentially reduce risk, cost and efforts in development of space-deployable SDRs. This paper discusses the advantages of STRS, the integration of STRS onto a Reconfigurable, Intelligently-Adaptive, Communication System (RIACS) SDR platform, the sample waveform, and wrapper development efforts. The paper emphasizes the infusion of the STRS Architecture onto the RIACS platform for potential use in next generation SDRs for advanced exploration missions.

  6. Haptic/graphic rehabilitation: integrating a robot into a virtual environment library and applying it to stroke therapy.

    PubMed

    Sharp, Ian; Patton, James; Listenberger, Molly; Case, Emily

    2011-08-08

    Recent research that tests interactive devices for prolonged therapy practice has revealed new prospects for robotics combined with graphical and other forms of biofeedback. Previous human-robot interactive systems have required different software commands to be implemented for each robot, leading to unnecessary developmental overhead time each time a new system becomes available. For example, when a haptic/graphic virtual reality environment has been coded for one specific robot to provide haptic feedback, that specific robot would not be able to be traded for another robot without recoding the program. However, recent efforts in the open source community have proposed a wrapper class approach that can elicit nearly identical responses regardless of the robot used. The result can lead researchers across the globe to perform similar experiments using shared code. Therefore, modular "switching out" of one robot for another would not affect development time. In this paper, we outline the successful creation and implementation of a wrapper class for one robot into the open-source H3DAPI, which integrates the software commands most commonly used by all robots.

  7. Effective traffic features selection algorithm for cyber-attacks samples

    NASA Astrophysics Data System (ADS)

    Li, Yihong; Liu, Fangzheng; Du, Zhenyu

    2018-05-01

    Building on a study of defense schemes against network attacks, this paper proposes an effective traffic feature selection algorithm based on k-means++ clustering to deal with the high dimensionality of traffic features extracted from cyber-attack samples. First, the algorithm divides the original feature set into an attack traffic feature set and a background traffic feature set by clustering. Then, it calculates the variation in clustering performance after removing a given feature. Finally, it evaluates the degree of distinctiveness of each feature vector according to the result; the effective feature vectors are those whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select the effective features from the extracted original feature set, reducing the dimensionality of the features and thereby the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and has some advantages over other selection algorithms.

  8. A Study for the Feature Selection to Identify GIEMSA-Stained Human Chromosomes Based on Artificial Neural Network

    DTIC Science & Technology

    2001-10-25

    neural network (ANN) has been adopted for the human chromosome classification. It is important to select optimum features for training neural network...Many studies for computer-based chromosome analysis have shown that it is possible to classify chromosomes into 24 subgroups. In addition, artificial

  9. Feature Selection Methods for Zero-Shot Learning of Neural Activity.

    PubMed

    Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R

    2017-01-01

    Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.

  10. Selecting Feature Subsets Based on SVM-RFE and the Overlapping Ratio with Applications in Bioinformatics.

    PubMed

    Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai

    2017-12-26

    Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.

  11. Classification of epileptic EEG signals based on simple random sampling and sequential feature selection.

    PubMed

    Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui

    2016-06-01

    Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer's disease, sleep problems and so on. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, the simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the features extracted by SRS and selected by SFS are forwarded to a least squares support vector machine (LS_SVM) classifier to classify the EEG signals. The experimental results show that the method achieves 99.90%, 99.80% and 100% for classification accuracy, sensitivity and specificity, respectively.
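
    The sequential selection step can be illustrated with a small wrapper-style sketch. The abstract does not state the search direction or the selection criterion, so this assumes a forward search and, in place of the paper's LS_SVM, a hypothetical stand-in criterion: leave-one-out accuracy of a 1-nearest-neighbour classifier. The function names are made up for illustration.

```python
def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the chosen feature columns."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue
            d = sum((xi[f] - xj[f]) ** 2 for f in features)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


def forward_selection(X, y, score=loo_1nn_accuracy):
    """Greedily add the feature that most improves the score; stop when nothing helps."""
    selected, best = [], score(X, y, [])
    remaining = list(range(len(X[0])))
    while remaining:
        trials = [(score(X, y, selected + [f]), f) for f in remaining]
        top_score, top_f = max(trials, key=lambda t: t[0])
        if top_score <= best:
            break  # no remaining feature improves the criterion
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


# Toy data: column 0 separates the two classes, column 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
print(forward_selection(X, y))  # → ([0], 1.0)
```

    Passing a different score callable swaps the criterion without touching the search loop, which is the defining trait of a wrapper method.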

  12. Selecting relevant 3D image features of margin sharpness and texture for lung nodule retrieval.

    PubMed

    Ferreira, José Raniery; de Azevedo-Marques, Paulo Mazzoncini; Oliveira, Marcelo Costa

    2017-03-01

    Lung cancer is the leading cause of cancer-related deaths in the world. Its diagnosis is a challenging task for specialists due to several aspects of the classification of lung nodules. Therefore, it is important to integrate content-based image retrieval methods into the lung nodule classification process, since they are capable of retrieving similar, previously diagnosed cases from databases. However, this mechanism depends on extracting relevant image features in order to obtain high efficiency. The goal of this paper is to select 3D image features of margin sharpness and texture that can be relevant to the retrieval of similar cancerous and benign lung nodules. A total of 48 3D image attributes were extracted from the nodule volume. Border sharpness features were extracted from perpendicular lines drawn over the lesion boundary. Second-order texture features were extracted from a cooccurrence matrix. Relevant features were selected by a correlation-based method and a statistical significance analysis. Retrieval performance was assessed according to the nodule's potential malignancy on the 10 most similar cases and by the parameters of precision and recall. Statistically significant features reduced retrieval performance. The correlation-based method selected 2 margin sharpness attributes and 6 texture attributes and obtained higher precision than all 48 extracted features on similar nodule retrieval. A feature space dimensionality reduction of 83% obtained higher retrieval performance and proved to be a computationally low-cost method of retrieving similar nodules for the diagnosis of lung cancer.

  13. Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification

    PubMed Central

    Wen, Tingxi; Zhang, Zhongnan

    2017-01-01

    Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789

  14. Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification.

    PubMed

    Wen, Tingxi; Zhang, Zhongnan

    2017-05-01

    In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.

  15. Integrating CLIPS applications into heterogeneous distributed systems

    NASA Technical Reports Server (NTRS)

    Adler, Richard M.

    1991-01-01

    SOCIAL is an advanced, object-oriented development tool for integrating intelligent and conventional applications across heterogeneous hardware and software platforms. SOCIAL defines a family of 'wrapper' objects called agents, which incorporate predefined capabilities for distributed communication and control. Developers embed applications within agents and establish interactions between distributed agents via non-intrusive message-based interfaces. This paper describes a predefined SOCIAL agent that is specialized for integrating C Language Integrated Production System (CLIPS)-based applications. The agent's high-level Application Programming Interface supports bidirectional flow of data, knowledge, and commands to other agents, enabling CLIPS applications to initiate interactions autonomously, and respond to requests and results from heterogeneous remote systems. The design and operation of CLIPS agents are illustrated with two distributed applications that integrate CLIPS-based expert systems with other intelligent systems for isolating and mapping problems in the Space Shuttle Launch Processing System at the NASA Kennedy Space Center.

  16. Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex.

    PubMed

    Downer, Joshua D; Rapone, Brittany; Verhein, Jessica; O'Connor, Kevin N; Sutter, Mitchell L

    2017-05-24

    Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations (r_noise) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on r_noise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in r_noise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations (r_noise) in rhesus macaque A1 during task performance. Unlike previous studies showing that the effect of attention on r_noise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning to the distractor feature as well. We suggest that these effects represent an efficient process by which sensory cortex simultaneously enhances relevant information and suppresses irrelevant information. Copyright © 2017 the authors.

  17. Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex

    PubMed Central

    2017-01-01

    Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations (rnoise) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on rnoise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in rnoise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations (rnoise) in rhesus macaque A1 during task performance. 
Unlike previous studies showing that the effect of attention on rnoise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning to the distractor feature as well. We suggest that these effects represent an efficient process by which sensory cortex simultaneously enhances relevant information and suppresses irrelevant information. PMID:28432139

  18. Classification of early-stage non-small cell lung cancer by weighing gene expression profiles with connectivity information.

    PubMed

    Zhang, Ao; Tian, Suyan

    2018-05-01

    Pathway-based feature selection algorithms, which utilize biological information contained in pathways to guide which features/genes should be selected, have evolved quickly and become widespread in the field of bioinformatics. Based on how the pathway information is incorporated, we classify pathway-based feature selection algorithms into three major categories: penalty, stepwise forward, and weighting. Compared to the first two categories, the weighting methods have been underutilized even though they are usually the simplest ones. In this article, we constructed three different connectivity information-based weights for each gene and then conducted feature selection upon the resulting weighted gene expression profiles. Using both simulations and a real-world application, we have demonstrated that when the data-driven connectivity information constructed from the data of the specific disease under study is considered, the resulting weighted gene expression profiles slightly outperform the original expression profiles. In summary, a big challenge faced by the weighting method is how to estimate pathway knowledge-based weights more accurately and precisely. Only once this issue is resolved will wide utilization of the weighting methods be possible. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  19. Greedy feature selection for glycan chromatography data with the generalized Dirichlet distribution

    PubMed Central

    2013-01-01

    Background Glycoproteins are involved in a diverse range of biochemical and biological processes. Changes in protein glycosylation are believed to occur in many diseases, particularly during cancer initiation and progression. The identification of biomarkers for human disease states is becoming increasingly important, as early detection is key to improving survival and recovery rates. To this end, the serum glycome has been proposed as a potential source of biomarkers for different types of cancers. High-throughput hydrophilic interaction liquid chromatography (HILIC) technology for glycan analysis allows for the detailed quantification of the glycan content in human serum. However, the experimental data from this analysis is compositional by nature. Compositional data are subject to a constant-sum constraint, which restricts the sample space to a simplex. Statistical analysis of glycan chromatography datasets should account for their unusual mathematical properties. As the volume of glycan HILIC data being produced increases, there is a considerable need for a framework to support appropriate statistical analysis. Proposed here is a methodology for feature selection in compositional data. The principal objective is to provide a template for the analysis of glycan chromatography data that may be used to identify potential glycan biomarkers. Results A greedy search algorithm, based on the generalized Dirichlet distribution, is carried out over the feature space to search for the set of “grouping variables” that best discriminate between known group structures in the data, modelling the compositional variables using beta distributions. The algorithm is applied to two glycan chromatography datasets. Statistical classification methods are used to test the ability of the selected features to differentiate between known groups in the data. Two well-known methods are used for comparison: correlation-based feature selection (CFS) and recursive partitioning (rpart). 
CFS is a feature selection method, while recursive partitioning is a learning tree algorithm that has been used for feature selection in the past. Conclusions The proposed feature selection method performs well for both glycan chromatography datasets. It is computationally slower, but results in a lower misclassification rate and a higher sensitivity rate than both correlation-based feature selection and the classification tree method. PMID:23651459

  20. Improving lung cancer prognosis assessment by incorporating synthetic minority oversampling technique and score fusion method

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yan, Shiju; Qian, Wei; Guan, Yubao

    2016-06-15

    Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLC patients by integrating oversampling, feature selection, and score fusion techniques and to develop an optimal prediction model. Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation (K-fold cross-validation) method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061, when using the QI and CB based classifiers, respectively. By fusion of the scores generated by the two classifiers, AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifier to yield improved prediction accuracy.
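
    SMOTE balances an imbalanced dataset by creating synthetic minority-class samples on the line segment between a minority sample and one of its nearest minority neighbours. The Python sketch below illustrates only that interpolation step. The function name, the parameters k and seed, and the toy points are assumptions made for illustration; the RBFN classifiers, BestFirst selection and score fusion used in the study are not reproduced here.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create n_synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding the point itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples with two features each.
minority_toy = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]

for point in smote(minority_toy, n_synthetic=3):
    print(point)
```

    Because every synthetic point is a convex combination of two real minority samples, it always stays inside the range spanned by the minority class.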

  1. Feature Selection for Classification of Polar Regions Using a Fuzzy Expert System

    NASA Technical Reports Server (NTRS)

    Penaloza, Mauel A.; Welch, Ronald M.

    1996-01-01

    Labeling, feature selection, and the choice of classifier are critical elements for classification of scenes and for image understanding. This study examines several methods for feature selection in polar regions, including the use of a fuzzy logic-based expert system for further refinement of a set of selected features. Six Advanced Very High Resolution Radiometer (AVHRR) Local Area Coverage (LAC) arctic scenes are classified into nine classes: water, snow / ice, ice cloud, land, thin stratus, stratus over water, cumulus over water, textured snow over water, and snow-covered mountains. Sixty-seven spectral and textural features are computed and analyzed by the feature selection algorithms. The divergence, histogram analysis, and discriminant analysis approaches are intercompared for their effectiveness in feature selection. The fuzzy expert system method is used not only to determine the effectiveness of each approach in classifying polar scenes, but also to further reduce the features into a more optimal set. For each selection method, features are ranked from best to worst, and the best half of the features are selected. Then, rules using these selected features are defined. The results of running the fuzzy expert system with these rules show that the divergence method produces the best set of features: not only does it produce the highest classification accuracy, but it also has the lowest computation requirements. A reduction of the set of features produced by the divergence method using the fuzzy expert system results in an overall classification accuracy of over 95 %. However, this increase in accuracy has a high computation cost.

  2. An effective biometric discretization approach to extract highly discriminative, informative, and privacy-protective binary representation

    NASA Astrophysics Data System (ADS)

    Lim, Meng-Hui; Teoh, Andrew Beng Jin

    2011-12-01

    Biometric discretization derives a binary string for each user based on an ordered set of biometric features. This representative string ought to be discriminative, informative, and privacy protective when it is employed as a cryptographic key in various security applications upon error correction. However, it is commonly believed that satisfying the first and the second criteria simultaneously is not feasible, and a tradeoff between them is always definite. In this article, we propose an effective fixed bit allocation-based discretization approach which involves discriminative feature extraction, discriminative feature selection, unsupervised quantization (quantization that does not utilize class information), and linearly separable subcode (LSSC)-based encoding to fulfill all the ideal properties of a binary representation extracted for cryptographic applications. In addition, we examine a number of discriminative feature-selection measures for discretization and identify the proper way of setting an important feature-selection parameter. Encouraging experimental results vindicate the feasibility of our approach.

  3. Feature Selection for Object-Based Classification of High-Resolution Remote Sensing Images Based on the Combination of a Genetic Algorithm and Tabu Search

    PubMed Central

    Shi, Lei; Wan, Youchuan; Gao, Xianjun

    2018-01-01

    In object-based image analysis of high-resolution images, the number of features can reach hundreds, so it is necessary to perform feature reduction prior to classification. In this paper, a feature selection method based on the combination of a genetic algorithm (GA) and tabu search (TS) is presented. The proposed GATS method aims to reduce the premature convergence of the GA by the use of TS. A prematurity index is first defined to judge the convergence situation during the search. When premature convergence does take place, an improved mutation operator is executed, in which TS is performed on individuals with higher fitness values. As for the other individuals with lower fitness values, mutation with a higher probability is carried out. Experiments using the proposed GATS feature selection method and three other methods, a standard GA, the multistart TS method, and ReliefF, were conducted on WorldView-2 and QuickBird images. The experimental results showed that the proposed method outperforms the other methods in terms of the final classification accuracy. PMID:29581721

  4. A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification

    NASA Astrophysics Data System (ADS)

    Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia

    With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. In order to validate the validity of the proposed method, we compared it with other methods respectively based on information gain and mutual information while support vector machine is adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. Under 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance with F value 83.3% while the candidate features are the words which appear in both positive and negative texts.
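
    The abstract does not state the formula it uses. One common two-class form of Fisher's discriminant ratio for a single feature is F = (mu_pos - mu_neg)^2 / (var_pos + var_neg), where a larger value means the feature separates the classes better. The Python sketch below ranks words by that form. The estimators actually used in the paper may differ, and the function names and word counts are hypothetical.

```python
from statistics import mean, pvariance


def fisher_ratio(pos_values, neg_values):
    """Squared difference of class means divided by the sum of class variances."""
    numerator = (mean(pos_values) - mean(neg_values)) ** 2
    denominator = pvariance(pos_values) + pvariance(neg_values)
    # A feature that is constant within both classes separates them perfectly
    # if the means differ, and carries no information if they do not.
    if denominator == 0:
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def rank_words(pos_counts, neg_counts):
    """Sort words from most to least discriminative."""
    return sorted(
        pos_counts,
        key=lambda word: fisher_ratio(pos_counts[word], neg_counts[word]),
        reverse=True,
    )


# Hypothetical per-document word frequencies in positive and negative reviews.
pos_counts_toy = {"excellent": [3, 2, 4], "the": [5, 6, 5]}
neg_counts_toy = {"excellent": [0, 1, 0], "the": [6, 5, 5]}

print(rank_words(pos_counts_toy, neg_counts_toy))
```

    In this toy example "the" occurs equally often in both classes, so its ratio is zero and it ranks below "excellent".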

  5. Reducing Sweeping Frequencies in Microwave NDT Employing Machine Learning Feature Selection

    PubMed Central

    Moomen, Abdelniser; Ali, Abdulbaset; Ramahi, Omar M.

    2016-01-01

    Nondestructive Testing (NDT) assessment of materials’ health condition is useful for classifying healthy from unhealthy structures or detecting flaws in metallic or dielectric structures. Performing structural health testing for coated/uncoated metallic or dielectric materials with the same testing equipment requires a testing method that can work on both metallics and dielectrics, such as microwave testing. Reducing the complexity and expense associated with current diagnostic practices of microwave NDT of structural health requires an effective and intelligent approach based on feature selection and classification techniques of machine learning. Current microwave NDT methods are in general based on measuring variation in the S-matrix over the entire operating frequency ranges of the sensors. For instance, assessing the health of metallic structures using a microwave sensor depends on the reflection and/or transmission coefficient measurements as a function of the sweeping frequencies of the operating band. The aim of this work is to reduce the number of sweeping frequencies using machine learning feature selection techniques. By treating sweeping frequencies as features, the most important features can be identified, and then only the most influential features (frequencies) are considered when building the microwave NDT equipment. The proposed method of reducing sweeping frequencies was validated experimentally using a waveguide sensor and a metallic plate with different cracks. The investigated feature selection techniques include information gain, gain ratio, relief and chi-squared. The effectiveness of the selected features was validated through performance evaluations of various classification models, namely Nearest Neighbor, Neural Networks, Random Forest, and Support Vector Machine. Results showed good crack classification accuracy rates after employing feature selection algorithms. PMID:27104533
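
    Treating each sweep frequency as a feature means the frequencies can be ranked by a filter score and only the top ones kept. The Python sketch below does this with information gain on discretized readings. It is an illustration under stated assumptions rather than the authors' pipeline: the frequency labels, the "low"/"high" discretization and the toy measurements are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for v, lab in zip(feature_values, labels) if v == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


def rank_frequencies(readings, labels):
    """Sort frequency names from highest to lowest information gain."""
    return sorted(
        readings,
        key=lambda name: information_gain(readings[name], labels),
        reverse=True,
    )


# Hypothetical discretized reflection readings at three sweep frequencies.
labels_toy = ["crack", "crack", "healthy", "healthy"]
readings_toy = {
    "f_a": ["high", "high", "low", "low"],   # separates the classes perfectly
    "f_b": ["low", "high", "low", "high"],   # carries no class information
    "f_c": ["high", "high", "high", "low"],  # partially informative
}

print(rank_frequencies(readings_toy, labels_toy))
```

    Keeping only the first few names returned by rank_frequencies corresponds to sweeping a reduced set of frequencies.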

  6. An audiovisual emotion recognition system

    NASA Astrophysics Data System (ADS)

    Han, Yi; Wang, Guoyin; Yang, Yong; He, Kun

    2007-12-01

    Human emotions could be expressed by many bio-symbols. Speech and facial expression are two of them. They are both regarded as emotional information which is playing an important role in human-computer interaction. Based on our previous studies on emotion recognition, an audiovisual emotion recognition system is developed and represented in this paper. The system is designed for real-time practice, and is guaranteed by some integrated modules. These modules include speech enhancement for eliminating noises, rapid face detection for locating face from background image, example based shape learning for facial feature alignment, and optical flow based tracking algorithm for facial feature tracking. It is known that irrelevant features and high dimensionality of the data can hurt the performance of classifier. Rough set-based feature selection is a good method for dimension reduction. So 13 speech features out of 37 ones and 10 facial features out of 33 ones are selected to represent emotional information, and 52 audiovisual features are selected due to the synchronization when speech and video fused together. The experiment results have demonstrated that this system performs well in real-time practice and has high recognition rate. Our results also show that the work in multimodules fused recognition will become the trend of emotion recognition in the future.

  7. A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.

    PubMed

    Ni, Qianwu; Chen, Lei

    2017-01-01

    Correct prediction of protein structural class is beneficial to the investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, given the variety of available features, it is still a great challenge to select a proper classification algorithm and to extract the essential features for classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physiochemical features were adopted to represent features and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, thirty-eight algorithms were executed on a dataset, in which proteins were represented by features in the set. The predicted classes yielded by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. From the algorithm list, the algorithms were taken one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to other models that adopt only the feature selection procedure or only the algorithm selection procedure. The feature selection and algorithm selection procedures are both helpful for building an ensemble prediction model that can yield a better performance. Copyright © Bentham Science Publishers.
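
    The mRMR step orders features so that each new feature is highly relevant to the class while adding little that the already chosen features provide. The Python sketch below uses the common "difference" criterion, relevance minus mean redundancy, with mutual information computed on discrete values. The abstract does not say which mRMR variant was used, so the criterion, the function names and the toy data are assumptions.

```python
from collections import Counter
from math import log2


def mutual_information(a, b):
    """Mutual information in bits between two equally long discrete sequences."""
    n = len(a)
    count_a, count_b, count_ab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * log2((c / n) / ((count_a[x] / n) * (count_b[y] / n)))
        for (x, y), c in count_ab.items()
    )


def mrmr_order(features, target):
    """Return feature indices ordered by greedy relevance-minus-redundancy."""
    remaining = list(range(len(features)))
    ordered = []
    while remaining:
        def score(f):
            relevance = mutual_information(features[f], target)
            if not ordered:
                return relevance
            redundancy = sum(
                mutual_information(features[f], features[s]) for s in ordered
            ) / len(ordered)
            return relevance - redundancy

        best = max(remaining, key=score)
        ordered.append(best)
        remaining.remove(best)
    return ordered


# Hypothetical data: each inner list holds one feature's values across four samples.
features_toy = [
    [0, 0, 1, 1],  # feature 0
    [0, 0, 1, 1],  # feature 1, an exact copy of feature 0 (fully redundant)
    [0, 1, 0, 1],  # feature 2, independent of feature 0
]
target_toy = [0, 1, 2, 3]  # class is determined jointly by features 0 and 2

print(mrmr_order(features_toy, target_toy))
```

    The duplicated feature is ranked last even though it is as relevant as the others, which is the redundancy penalty at work. Nested feature sets like those described in the abstract are then prefixes of this ordering.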

  8. Oral cancer prognosis based on clinicopathologic and genomic markers using a hybrid of feature selection and machine learning methods

    PubMed Central

    2013-01-01

    Background Machine learning techniques are becoming useful as an alternative approach to conventional medical diagnosis or prognosis as they are good for handling noisy and incomplete data, and significant results can be attained despite a small sample size. Traditionally, clinicians make prognostic decisions based on clinicopathologic markers. However, it is not easy for the most skilful clinician to come out with an accurate prognosis by using these markers alone. Thus, there is a need to use genomic markers to improve the accuracy of prognosis. The main aim of this research is to apply a hybrid of feature selection and machine learning methods in oral cancer prognosis based on the parameters of the correlation of clinicopathologic and genomic markers. Results In the first stage of this research, five feature selection methods have been proposed and experimented on the oral cancer prognosis dataset. In the second stage, the model with the features selected from each feature selection methods are tested on the proposed classifiers. Four types of classifiers are chosen; these are namely, ANFIS, artificial neural network, support vector machine and logistic regression. A k-fold cross-validation is implemented on all types of classifiers due to the small sample size. The hybrid model of ReliefF-GA-ANFIS with 3-input features of drink, invasion and p63 achieved the best accuracy (accuracy = 93.81%; AUC = 0.90) for the oral cancer prognosis. Conclusions The results revealed that the prognosis is superior with the presence of both clinicopathologic and genomic markers. The selected features can be investigated further to validate the potential of becoming as significant prognostic signature in the oral cancer studies. PMID:23725313

  9. A feature-based approach to modeling protein-protein interaction hot spots.

    PubMed

    Cho, Kyu-il; Kim, Dongsup; Lee, Doheon

    2009-05-01

    Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to pi-related interactions, especially pi . . . pi interactions.

  10. Facial recognition using multisensor images based on localized kernel eigen spaces.

    PubMed

    Gundimada, Satyanadh; Asari, Vijayan K

    2009-06-01

    A feature selection technique along with an information fusion procedure for improving the recognition accuracy of a visual and thermal image-based facial recognition system is presented in this paper. A novel modular kernel eigenspaces approach is developed and implemented on the phase congruency feature maps extracted from the visual and thermal images individually. Smaller sub-regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are then projected into higher dimensional spaces using kernel methods. The proposed localized nonlinear feature selection procedure helps to overcome the bottlenecks of illumination variations, partial occlusions, expression variations and variations due to temperature changes that affect the visual and thermal face recognition techniques. AR and Equinox databases are used for experimentation and evaluation of the proposed technique. The proposed feature selection procedure has greatly improved the recognition accuracy for both the visual and thermal images when compared to conventional techniques. Also, a decision level fusion methodology is presented which along with the feature selection procedure has outperformed various other face recognition techniques in terms of recognition accuracy.

  11. Feature Selection Methods for Robust Decoding of Finger Movements in a Non-human Primate

    PubMed Central

    Padmanaban, Subash; Baker, Justin; Greger, Bradley

    2018-01-01

    Objective: The performance of machine learning algorithms used for neural decoding of dexterous tasks may be impeded due to problems arising when dealing with high-dimensional data. The objective of feature selection algorithms is to choose a near-optimal subset of features from the original feature space to improve the performance of the decoding algorithm. The aim of our study was to compare the effects of four feature selection techniques, Wilcoxon signed-rank test, Relative Importance, Principal Component Analysis (PCA), and Mutual Information Maximization, on SVM classification performance for a dexterous decoding task. Approach: A nonhuman primate (NHP) was trained to perform small coordinated movements, similar to typing. An array of microelectrodes was implanted in the hand area of the motor cortex of the NHP and used to record action potentials (AP) during finger movements. A Support Vector Machine (SVM) was used to classify which finger movement the NHP was making based upon AP firing rates. We used the SVM classification to examine the functional parameters of (i) robustness to simulated failure and (ii) longevity of classification. We also compared the effect of using isolated-neuron and multi-unit firing rates as the feature vector supplied to the SVM. Main results: The average decoding accuracy for multi-unit features and single-unit features using Mutual Information Maximization (MIM) across 47 sessions was 96.74 ± 3.5% and 97.65 ± 3.36% respectively. The reduction in decoding accuracy between using 100% of the features and 10% of features based on MIM was 45.56% (from 93.7 to 51.09%) and 4.75% (from 95.32 to 90.79%) for multi-unit and single-unit features respectively. MIM had the best performance compared to the other feature selection methods. Significance: These results suggest improved decoding performance can be achieved by using optimally selected features. The results based on clinically relevant performance metrics also suggest that the decoding algorithm can be made robust by using optimal features and feature selection algorithms. We believe that even a few percent increase in performance is important and improves the decoding accuracy of the machine learning algorithm, potentially increasing the ease of use of a brain machine interface. PMID:29467602
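
    The mutual-information ranking named in the abstract above can be sketched as follows. This is only an illustration, not the authors' pipeline: it assumes the firing rates have already been discretised into bins, and the toy data and the function names `mutual_information` and `rank_features_by_mi` are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features_by_mi(rows, labels):
    """Return (feature indices sorted by decreasing MI with the labels, MI scores)."""
    n_features = len(rows[0])
    scores = [
        mutual_information([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: 4 trials, 3 binned features, 2 movement classes.
rows = [
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
]
labels = [0, 0, 1, 1]
order, scores = rank_features_by_mi(rows, labels)
top_feature = order[0]  # the feature that would be kept first
```

    Keeping the first k entries of `order` gives the reduced feature vector that would then be passed to a classifier such as an SVM.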

  12. Radiomics-based Prognosis Analysis for Non-Small Cell Lung Cancer

    NASA Astrophysics Data System (ADS)

    Zhang, Yucheng; Oikonomou, Anastasia; Wong, Alexander; Haider, Masoom A.; Khalvati, Farzad

    2017-04-01

    Radiomics characterizes tumor phenotypes by extracting large numbers of quantitative features from radiological images. Radiomic features have been shown to provide prognostic value in predicting clinical outcomes in several studies. However, several challenges including feature redundancy, unbalanced data, and small sample sizes have led to relatively low predictive accuracy. In this study, we explore different strategies for overcoming these challenges and improving predictive performance of radiomics-based prognosis for non-small cell lung cancer (NSCLC). CT images of 112 patients (mean age 75 years) with NSCLC who underwent stereotactic body radiotherapy were used to predict recurrence, death, and recurrence-free survival using a comprehensive radiomics analysis. Different feature selection and predictive modeling techniques were used to determine the optimal configuration of prognosis analysis. To address feature redundancy, comprehensive analysis indicated that Random Forest models and Principal Component Analysis were the optimum predictive modeling and feature selection methods, respectively, for achieving high prognosis performance. To address unbalanced data, the Synthetic Minority Over-sampling Technique (SMOTE) was found to significantly increase predictive accuracy. A full analysis of variance showed that data endpoints, feature selection techniques, and classifiers were significant factors in affecting predictive accuracy, suggesting that these factors must be investigated when building radiomics-based predictive models for cancer prognosis.
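
    The over-sampling idea mentioned in the abstract above can be illustrated with a minimal sketch of synthetic minority over-sampling: each new sample is interpolated between a minority sample and one of its nearest minority neighbours. The function name `oversample_minority`, its parameters, and the toy points are assumptions made for illustration; this is not the configuration used in the study.

```python
import random


def oversample_minority(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        t = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + t * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0), (1.2, 3.0)]
new_points = oversample_minority(minority, n_new=5)
```

    Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region already occupied by the minority class.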

  13. Normed kernel function-based fuzzy possibilistic C-means (NKFPCM) algorithm for high-dimensional breast cancer database classification with feature selection is based on Laplacian Score

    NASA Astrophysics Data System (ADS)

    Lestari, A. W.; Rustam, Z.

    2017-07-01

    In the last decade, breast cancer has become a focus of worldwide attention, as the disease is one of the leading causes of death for women. Therefore, correct precautions and treatment are necessary. In previous studies, the Fuzzy Kernel K-Medoid algorithm has been used for multi-class data. This paper proposes an algorithm to classify high-dimensional breast cancer data using Fuzzy Possibilistic C-means (FPCM) and a new method based on clustering analysis using Normed Kernel Function-Based Fuzzy Possibilistic C-Means (NKFPCM). The objective of this paper is to obtain the best accuracy in the classification of breast cancer data. To improve the accuracy of the two methods, the candidate features are evaluated using feature selection based on the Laplacian Score. The results compare the accuracy and running time of FPCM and NKFPCM with and without feature selection.

  14. NEXUS - Resilient Intelligent Middleware

    NASA Astrophysics Data System (ADS)

    Kaveh, N.; Hercock, R. Ghanea

    Service-oriented computing, a composition of distributed-object computing, component-based, and Web-based concepts, is becoming the widespread choice for developing dynamic heterogeneous software assets available as services across a network. One of the major strengths of service-oriented technologies is the high abstraction layer and large granularity level at which software assets are viewed compared to traditional object-oriented technologies. Collaboration through encapsulated and separately defined service interfaces creates a service-oriented environment, whereby multiple services can be linked together through their interfaces to compose a functional system. This approach enables better integration of legacy and non-legacy services, via wrapper interfaces, and allows for service composition at a more abstract level especially in cases such as vertical market stacks. The heterogeneous nature of service-oriented technologies and the granularity of their software components makes them a suitable computing model in the pervasive domain.

  15. JWST Wavefront Control Toolbox

    NASA Technical Reports Server (NTRS)

    Shin, Shahram Ron; Aronstein, David L.

    2011-01-01

    A Matlab-based toolbox has been developed for the wavefront control and optimization of segmented optical surfaces to correct for possible misalignments of the James Webb Space Telescope (JWST) using influence functions. The toolbox employs both iterative and non-iterative methods to converge to an optimal solution by minimizing the cost function. The toolbox can be used for either constrained or unconstrained optimization. The control process involves 1 to 7 degrees-of-freedom perturbations per segment of the primary mirror in addition to the 5 degrees of freedom of the secondary mirror. The toolbox consists of a series of Matlab/Simulink functions and modules, developed based on a "wrapper" approach, that handles the interface and data flow between existing commercial optical modeling software packages such as Zemax and Code V. The limitations of the algorithm are dictated by the constraints of the moving parts in the mirrors.

  16. Identity Recognition Algorithm Using Improved Gabor Feature Selection of Gait Energy Image

    NASA Astrophysics Data System (ADS)

    Chao, LIANG; Ling-yao, JIA; Dong-cheng, SHI

    2017-01-01

    This paper describes an effective gait recognition approach based on Gabor features of the gait energy image. Kernel Fisher analysis combined with a kernel matrix is proposed to select dominant features. A nearest neighbor classifier based on whitened cosine distance is used to discriminate different gait patterns. The proposed approach is tested on the CASIA and USF gait databases. The results show that our approach outperforms other state-of-the-art gait recognition approaches in terms of recognition accuracy and robustness.

  17. Combined texture feature analysis of segmentation and classification of benign and malignant tumour CT slices.

    PubMed

    Padma, A; Sukanesh, R

    2013-01-01

    A computer software system is designed for the segmentation and classification of benign and malignant tumour slices in brain computed tomography (CT) images. This paper presents a method to find and select both the dominant run length and co-occurrence texture features of the region of interest (ROI) of the tumour region of each slice to be segmented by Fuzzy c means clustering (FCM) and to evaluate the performance of support vector machine (SVM)-based classifiers in classifying benign and malignant tumour slices. Two hundred and six tumour confirmed CT slices are considered in this study. A total of 17 texture features are extracted by a feature extraction procedure, and six features are selected using Principal Component Analysis (PCA). The SVM-based classifier is constructed with the selected features, and the segmentation results are compared with ground truth (target) labelled by an experienced radiologist. Quantitative analysis between ground truth and segmented tumour is presented in terms of segmentation accuracy, segmentation error and overlap similarity measures such as the Jaccard index. The classification performance of the SVM-based classifier with the same selected features is also evaluated using a 10-fold cross-validation method. The results indicate that some newly found texture features make an important contribution to classifying benign and malignant tumour slices efficiently and accurately with less computational time. The experimental results showed that the proposed system is able to achieve the highest segmentation and classification accuracy as measured by the Jaccard index, sensitivity and specificity.
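
    The Jaccard index used above as an overlap measure is the size of the intersection of the segmented and ground-truth regions divided by the size of their union. A minimal sketch follows, assuming each region is given as a set of pixel coordinates; the function name `jaccard_index` and the toy masks are hypothetical.

```python
def jaccard_index(segmented, truth):
    """Jaccard overlap |A ∩ B| / |A ∪ B| between two sets of pixel coordinates."""
    a, b = set(segmented), set(truth)
    union = a | b
    if not union:
        # Two empty regions are treated as a perfect match (a convention).
        return 1.0
    return len(a & b) / len(union)


# Hypothetical 3-pixel regions sharing two pixels.
segmented = {(0, 0), (0, 1), (1, 1)}
truth = {(0, 1), (1, 1), (1, 2)}
overlap = jaccard_index(segmented, truth)  # 2 shared pixels / 4 pixels in the union
```

    A value of 1 means the segmentation matches the radiologist's labelling exactly, and 0 means the two regions do not overlap at all.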

  18. In Plain, Brown Wrapper. Teacher's Guide [and] Student Material.

    ERIC Educational Resources Information Center

    Estes, Cynthia

    This document provides teaching guidelines and student material for a unit intended for use in sixth grade consumer, home economics, or language arts programs. Time allotment is six hours of classroom time spread over a two to three week period. The objectives of this capsule are to help students understand the purpose of advertising, locate the…

  19. 76 FR 53667 - Establishing a One-Year Retention Period for Patent-Related Papers That Have Been Scanned Into...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-29

    ... System or the Supplemental Complex Repository for Examiners AGENCY: United States Patent and Trademark... been scanned into the Image File Wrapper system (IFW) or the Supplemental Complex Repository for..., the USPTO had fully deployed SCORE, a data repository system designed to augment IFW with the capture...

  20. 11. 4TH FLOOR, HOTEL SOAP LINE No. 6 TO NORTHEAST, ...

    Library of Congress Historic Buildings Survey, Historic Engineering Record, Historic Landscapes Survey

    11. 4TH FLOOR, HOTEL SOAP LINE No. 6 TO NORTHEAST, WITH WRAPPER (LEFT), PRESS (CENTER), AND CUTTER (RIGHT, BEHIND CHUTE); BUCKET CONVEYOR AT RIGHT MOVED WASTE FROM PRESS TO 5TH FLOOR FOR RE-MANUFACTURE - Colgate & Company Jersey City Plant, Building No. B-14, 54-58 Grand Street, Jersey City, Hudson County, NJ

  1. Perceptual quality estimation of H.264/AVC videos using reduced-reference and no-reference models

    NASA Astrophysics Data System (ADS)

    Shahid, Muhammad; Pandremmenou, Katerina; Kondi, Lisimachos P.; Rossholm, Andreas; Lövström, Benny

    2016-09-01

    Reduced-reference (RR) and no-reference (NR) models for video quality estimation, using features that account for the impact of coding artifacts, spatio-temporal complexity, and packet losses, are proposed. The purpose of this study is to analyze a number of potentially quality-relevant features in order to select the most suitable set of features for building the desired models. The proposed sets of features have not been used in the literature and some of the features are used for the first time in this study. The features are employed by the least absolute shrinkage and selection operator (LASSO), which selects only the most influential of them toward perceptual quality. For comparison, we apply feature selection in the complete feature sets and ridge regression on the reduced sets. The models are validated using a database of H.264/AVC encoded videos that were subjectively assessed for quality in an ITU-T compliant laboratory. We infer that just two features selected by RR LASSO and two bitstream-based features selected by NR LASSO are able to estimate perceptual quality with high accuracy, higher than that of ridge, which uses more features. The comparisons with competing works and two full-reference metrics also verify the superiority of our models.
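
    The way LASSO performs selection, by shrinking some coefficients exactly to zero, can be illustrated with a small coordinate-descent sketch. It assumes the objective (1/(2n))·||y − Xw||² + alpha·||w||₁ and uses hypothetical toy data; the function names and the value of `alpha` are illustrative and are not taken from the study.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimise (1/(2n))*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw for the initial w = 0
    for _ in range(n_iter):
        for j in range(p):
            col = [X[i][j] for i in range(n)]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue
            # Correlation of feature j with the residual that ignores feature j
            rho = sum(col[i] * (residual[i] + col[i] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / z
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                w[j] = new_w
    return w


# Hypothetical data: three mutually orthogonal features; y depends mostly on feature 0.
X = [
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
]
y = [2.05, 1.95, -1.95, -2.05]
w = lasso_coordinate_descent(X, y, alpha=0.1)
selected = [j for j, coef in enumerate(w) if coef != 0.0]
```

    Ridge regression, by contrast, shrinks coefficients without setting them to zero, which is why it does not by itself reduce the feature set.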

  2. Feature Selection Methods for Zero-Shot Learning of Neural Activity

    PubMed Central

    Caceres, Carlos A.; Roos, Matthew J.; Rupp, Kyle M.; Milsap, Griffin; Crone, Nathan E.; Wolmetz, Michael E.; Ratto, Christopher R.

    2017-01-01

    Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy. PMID:28690513

  3. Web-based three-dimensional geo-referenced visualization

    NASA Astrophysics Data System (ADS)

    Lin, Hui; Gong, Jianhua; Wang, Freeman

    1999-12-01

    This paper addresses several approaches to implementing web-based, three-dimensional (3-D), geo-referenced visualization. The discussion focuses on the relationship between multi-dimensional data sets and applications, as well as the thick/thin client and heavy/light server structure. Two models of data sets are addressed in this paper. One is the use of traditional 3-D data format such as 3-D Studio Max, Open Inventor 2.0, Vis5D and OBJ. The other is modelled by a web-based language such as VRML. Also, traditional languages such as C and C++, as well as web-based programming tools such as Java, Java3D and ActiveX, can be used for developing applications. The strengths and weaknesses of each approach are elaborated. Four practical solutions for using VRML and Java, Java and Java3D, VRML and ActiveX and Java wrapper classes (Java and C/C++), to develop applications are presented for web-based, real-time interactive and explorative visualization.

  4. News video story segmentation method using fusion of audio-visual features

    NASA Astrophysics Data System (ADS)

    Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang

    2007-11-01

    News story segmentation is an important aspect of news video analysis. This paper presents a method for news video story segmentation. Different from prior works, which are based on visual feature transforms, the proposed technique uses audio features as a baseline and fuses visual features with them to refine the results. First, it selects silence clips as audio feature candidate points, and selects shot boundaries and anchor shots as two kinds of visual feature candidate points. Then it takes the audio feature candidates as cues and develops different fusion methods, which use the diverse types of visual candidates to refine the audio candidates, to obtain story boundaries. Experimental results show that this method has high efficiency and adaptability to different kinds of news video.

  5. An opinion formation based binary optimization approach for feature selection

    NASA Astrophysics Data System (ADS)

    Hamedmoghadam, Homayoun; Jalili, Mahdi; Yu, Xinghuo

    2018-02-01

    This paper proposes a novel optimization method based on opinion formation in complex network systems. The proposed optimization technique mimics the human-human interaction mechanism based on a mathematical model derived from the social sciences. Our method encodes a subset of selected features as the opinion of an artificial agent and simulates the opinion formation process among a population of agents to solve the feature selection problem. The agents interact using an underlying interaction network structure and reach consensus in their opinions, while finding better solutions to the problem. A number of mechanisms are employed to avoid getting trapped in local minima. We compare the performance of the proposed method with a number of classical population-based optimization methods and a state-of-the-art opinion formation based method. Our experiments on a number of high-dimensional datasets show that the proposed algorithm outperforms the others.

  6. Text-Based Conferencing: Features vs. Functionality

    ERIC Educational Resources Information Center

    Anderson, Lynn; McCarthy, Cathy

    2005-01-01

    This report examines three text-based conferencing products: "WowBB", "Invision Power Board", and "vBulletin". Their selection was prompted by a feature-by-feature comparison of the same products on the "WowBB" website. The comparison chart painted a misleading impression of "WowBB's" features in relation to the other two products; so the…

  7. Unsupervised Feature Selection Based on the Morisita Index for Hyperspectral Images

    NASA Astrophysics Data System (ADS)

    Golay, Jean; Kanevski, Mikhail

    2017-04-01

    Hyperspectral sensors are capable of acquiring images with hundreds of narrow and contiguous spectral bands. Compared with traditional multispectral imagery, the use of hyperspectral images allows better performance in discriminating between land-cover classes, but it also results in large redundancy and high computational data processing. To alleviate such issues, unsupervised feature selection techniques for redundancy minimization can be implemented. Their goal is to select the smallest subset of features (or bands) in such a way that all the information content of a data set is preserved as much as possible. The present research deals with the application to hyperspectral images of a recently introduced technique of unsupervised feature selection: the Morisita-Based filter for Redundancy Minimization (MBRM). MBRM is based on the (multipoint) Morisita index of clustering and on the Morisita estimator of Intrinsic Dimension (ID). The fundamental idea of the technique is to retain only the bands which contribute to increasing the ID of an image. In this way, redundant bands are disregarded, since they have no impact on the ID. Besides, MBRM has several advantages over benchmark techniques: in addition to its ability to deal with large data sets, it can capture highly-nonlinear dependences and its implementation is straightforward in any programming environment. Experimental results on freely available hyperspectral images show the good effectiveness of MBRM in remote sensing data processing. Comparisons with benchmark techniques are carried out and random forests are used to assess the performance of MBRM in reducing the data dimensionality without loss of relevant information. References [1] C. Traina Jr., A.J.M. Traina, L. Wu, C. Faloutsos, Fast feature selection using fractal dimension, in: Proceedings of the XV Brazilian Symposium on Databases, SBBD, pp. 158-171, 2000. [2] J. Golay, M. Kanevski, A new estimator of intrinsic dimension based on the multipoint Morisita index, Pattern Recognition 48(12), pp. 4070-4081, 2015. [3] J. Golay, M. Kanevski, Unsupervised feature selection based on the Morisita estimator of intrinsic dimension, arXiv:1608.05581, 2016.

  8. Development and selection of Asian-specific humeral implants based on statistical atlas: toward planning minimally invasive surgery.

    PubMed

    Wu, K; Daruwalla, Z J; Wong, K L; Murphy, D; Ren, H

    2015-08-01

    The commercial humeral implants based on the Western population are currently not entirely compatible with Asian patients, due to differences in bone size, shape and structure. Surgeons may have to compromise or use different implants that are less conforming, which may cause complications as well as inconvenience in positioning the implant. The construction of Asian humerus atlases of different clusters has therefore been proposed to address this problem and to facilitate planning minimally invasive surgical procedures [6,31]. According to the features of the atlases, new implants could be designed specifically for different patients. Furthermore, an automatic implant selection algorithm has been proposed as well in order to reduce the complications caused by implant and bone mismatch. Prior to the design of the implant, data clustering and extraction of the relevant features were carried out on the datasets of each gender. The fuzzy C-means clustering method is explored in this paper. Besides, two new schemes of implant selection procedures, namely the Procrustes analysis-based scheme and the group average distance-based scheme, were proposed to better search the database for matching implants for incoming patients. Neither algorithm has previously been used in this area, yet both turn out to have excellent performance in implant selection. Additionally, algorithms to calculate the matching scores between various implants and the patient data are proposed in this paper to assist the implant selection procedure. The results obtained have indicated the feasibility of the proposed development and selection scheme. The 16 sets of male data were divided into two clusters with 8 and 8 subjects, respectively, and the 11 female datasets were also divided into two clusters with 5 and 6 subjects, respectively. Based on the features of each cluster, the implants designed by the proposed algorithm fit very well on their reference humeri and the proposed implant selection procedure allows for a scenario of treating a patient with merely a preoperative anatomical model in order to correctly select the implant that has the best fit. Based on the leave-one-out validation, it can be concluded that both the PA-based method and GAD-based method are able to achieve excellent performance when dealing with the problem of implant selection. The accuracy and average execution time for the PA-based method were 100 % and 0.132 s, respectively, while those of the GAD-based method were 100 % and 0.058 s. Therefore, the GAD-based method outperformed the PA-based method in terms of execution speed. The primary contributions of this paper include the proposal of methods for development of Asian-, gender- and cluster-specific implants based on shape features and selection of the best fit implants for future patients according to their features. To the best of our knowledge, this is the first work that proposes implant design and selection for Asian patients automatically based on features extracted from cluster-specific statistical atlases.

  9. Discriminative spatial-frequency-temporal feature extraction and classification of motor imagery EEG: An sparse regression and Weighted Naïve Bayesian Classifier-based approach.

    PubMed

    Miao, Minmin; Zeng, Hong; Wang, Aimin; Zhao, Changsen; Liu, Feixiang

    2017-02-15

    Common spatial pattern (CSP) is most widely used in motor imagery based brain-computer interface (BCI) systems. In the conventional CSP algorithm, pairs of the eigenvectors corresponding to both extreme eigenvalues are selected to construct the optimal spatial filter. In addition, an appropriate selection of subject-specific time segments and frequency bands plays an important role in its successful application. This study proposes to optimize spatial-frequency-temporal patterns for discriminative feature extraction. Spatial optimization is implemented by channel selection and finding discriminative spatial filters adaptively on each time-frequency segment. A novel Discernibility of Feature Sets (DFS) criterion is designed for spatial filter optimization. Besides, discriminative features located in multiple time-frequency segments are selected automatically by the proposed sparse time-frequency segment common spatial pattern (STFSCSP) method, which exploits sparse regression for significant feature selection. Finally, a weight determined by the sparse coefficient is assigned to each selected CSP feature and we propose a Weighted Naïve Bayesian Classifier (WNBC) for classification. Experimental results on two public EEG datasets demonstrate that optimizing spatial-frequency-temporal patterns in a data-driven manner for discriminative feature extraction greatly improves the classification performance. The proposed method gives significantly better classification accuracies in comparison with several competing methods in the literature. The proposed approach is a promising candidate for future BCI systems. Copyright © 2016 Elsevier B.V. All rights reserved.

  10. Fuzzy feature selection based on interval type-2 fuzzy sets

    NASA Astrophysics Data System (ADS)

    Cherif, Sahar; Baklouti, Nesrine; Alimi, Adel; Snasel, Vaclav

    2017-03-01

    When dealing with real-world data, noise, complexity, dimensionality, uncertainty and irrelevance can lead to low performance and insignificant judgment. Fuzzy logic is a powerful tool for controlling conflicting attributes which can have similar effects and close meanings. In this paper, an interval type-2 fuzzy feature selection is presented as a new approach for removing irrelevant features and reducing complexity. We demonstrate how feature selection can be combined with interval type-2 fuzzy logic to keep significant features and hence reduce time complexity. The proposed method is compared with some other approaches. The results show that the number of attributes is proportionally small.

  11. PrAS: Prediction of amidation sites using multiple feature extraction.

    PubMed

    Wang, Tong; Zheng, Wei; Wuyun, Qiqige; Wu, Zhenfeng; Ruan, Jishou; Hu, Gang; Gao, Jianzhao

    2017-02-01

    Amidation plays an important role in a variety of pathological processes and serious diseases like neural dysfunction and hypertension. However, identification of protein amidation sites through traditional experimental methods is time consuming and expensive. In this paper, we propose a novel predictor for Prediction of Amidation Sites (PrAS), which is the first software package for academic users. The method incorporates four representative feature types: position-based features, physicochemical and biochemical property features, predicted structure-based features and evolutionary information features. A novel feature selection method, positive contribution feature selection, is proposed to optimize features. PrAS achieved AUC of 0.96, accuracy of 92.1%, sensitivity of 81.2%, specificity of 94.9% and MCC of 0.76 on the independent test set. PrAS is freely available at https://sourceforge.net/p/praspkg. Copyright © 2016 Elsevier Ltd. All rights reserved.
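
    The performance measures quoted above (accuracy, sensitivity, specificity and the Matthews correlation coefficient, MCC) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name `binary_metrics` and the counts are hypothetical and are not the PrAS test-set counts.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally reported as 0 when any marginal count is zero
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts: 10 positives (8 found), 20 negatives (18 rejected).
metrics = binary_metrics(tp=8, fp=2, tn=18, fn=2)
```

    On unbalanced test sets such as site-prediction data, MCC is often more informative than accuracy because it uses all four counts.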

  12. The role of lightness, hue and saturation in feature-based visual attention.

    PubMed

    Stuart, Geoffrey W; Barsdell, Wendy N; Day, Ross H

    2014-03-01

    Visual attention is used to select part of the visual array for higher-level processing. Visual selection can be based on spatial location, but it has also been demonstrated that multiple locations can be selected simultaneously on the basis of a visual feature such as color. One task that has been used to demonstrate feature-based attention is the judgement of the symmetry of simple four-color displays. In a typical task, when symmetry is violated, four squares on either side of the display do not match. When four colors are involved, symmetry judgements are made more quickly than when only two of the four colors are involved. This indicates that symmetry judgements are made one color at a time. Previous studies have confounded lightness, hue, and saturation when defining the colors used in such displays. In three experiments, symmetry was defined by lightness alone, lightness plus hue, or by hue or saturation alone, with lightness levels randomised. The difference between judgements of two- and four-color asymmetry was maintained, showing that hue and saturation can provide the sole basis for feature-based attentional selection. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.

  13. A Genetic-Based Feature Selection Approach in the Identification of Left/Right Hand Motor Imagery for a Brain-Computer Interface

    PubMed Central

    Yaacoub, Charles; Mhanna, Georges; Rihana, Sandy

    2017-01-01

    Electroencephalography is a non-invasive measure of the brain electrical activity generated by millions of neurons. Feature extraction in electroencephalography analysis is a core issue that may lead to accurate brain mental state classification. This paper presents a new feature selection method that improves left/right hand movement identification of a motor imagery brain-computer interface, based on genetic algorithms and artificial neural networks used as classifiers. Raw electroencephalography signals are first preprocessed using appropriate filtering. Feature extraction is carried out afterwards, based on spectral and temporal signal components, and thus a feature vector is constructed. As various features might be inaccurate and mislead the classifier, thus degrading the overall system performance, the proposed approach identifies a subset of features from a large feature space, such that the classifier error rate is reduced. Experimental results show that the proposed method is able to reduce the number of features to as low as 0.5% (i.e., the number of ignored features can reach 99.5%) while improving the accuracy, sensitivity, specificity, and precision of the classifier. PMID:28124985
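
    A genetic-algorithm wrapper of the kind described above can be sketched as follows. Each individual is a boolean mask over the features, and its fitness is the accuracy of a classifier trained on the masked features. This sketch makes several assumptions that differ from the paper: a nearest-centroid classifier stands in for the artificial neural network, accuracy is measured on the training data, a small per-feature penalty is added, and all names, parameters and data are hypothetical.

```python
import random


def nearest_centroid_accuracy(rows, labels, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    centroids = {}
    for c in set(labels):
        members = [r for r, l in zip(rows, labels) if l == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in idx]
    correct = 0
    for r, l in zip(rows, labels):
        pred = min(
            centroids,
            key=lambda c: sum((r[j] - m) ** 2 for j, m in zip(idx, centroids[c])),
        )
        correct += pred == l
    return correct / len(rows)


def fitness(rows, labels, mask, penalty=0.01):
    """Accuracy minus a small cost per selected feature (an assumed trade-off)."""
    return nearest_centroid_accuracy(rows, labels, mask) - penalty * sum(mask)


def ga_select(rows, labels, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve boolean feature masks; assumes at least two features."""
    rng = random.Random(seed)
    n = len(rows[0])

    def score(mask):
        return fitness(rows, labels, mask)

    # Start from the all-features mask plus random masks.
    pop = [[True] * n]
    pop += [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = [pop[0][:]]  # elitism: the best mask survives unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)  # tournament selection
            b = max(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=score)


# Hypothetical data: feature 0 separates the classes, the others are noise.
rows = [
    (0.1, 5, 3, 7), (0.2, 1, 8, 2), (0.0, 9, 4, 4),
    (0.9, 4, 2, 6), (1.0, 2, 9, 3), (0.8, 8, 5, 5),
]
labels = [0, 0, 0, 1, 1, 1]
best_mask = ga_select(rows, labels)
```

    Because the best mask is always carried over, the returned mask never scores worse than using all features under this fitness function.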

  14. A Genetic-Based Feature Selection Approach in the Identification of Left/Right Hand Motor Imagery for a Brain-Computer Interface.

    PubMed

    Yaacoub, Charles; Mhanna, Georges; Rihana, Sandy

    2017-01-23

    Electroencephalography is a non-invasive measure of the brain electrical activity generated by millions of neurons. Feature extraction in electroencephalography analysis is a core issue that may lead to accurate brain mental state classification. This paper presents a new feature selection method that improves left/right hand movement identification of a motor imagery brain-computer interface, based on genetic algorithms and artificial neural networks used as classifiers. Raw electroencephalography signals are first preprocessed using appropriate filtering. Feature extraction is carried out afterwards, based on spectral and temporal signal components, and thus a feature vector is constructed. As various features might be inaccurate and mislead the classifier, thus degrading the overall system performance, the proposed approach identifies a subset of features from a large feature space, such that the classifier error rate is reduced. Experimental results show that the proposed method is able to reduce the number of features to as low as 0.5% (i.e., the number of ignored features can reach 99.5%) while improving the accuracy, sensitivity, specificity, and precision of the classifier.

  15. Less is more: Avoiding the LIBS dimensionality curse through judicious feature selection for explosive detection.

    PubMed

    Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R; Barman, Ishan; Kumar Gundawar, Manoj

    2015-08-19

    Despite its intrinsic advantages, translation of laser induced breakdown spectroscopy for material identification has been often impeded by the lack of robustness of developed classification models, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts in establishing the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the 'curse of dimensionality' have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometrics classifiers, based on feature selection through prerequisite knowledge of the sample composition and genetic algorithm, respectively. While the full spectral input results in a classification rate of ca. 92%, selection of only the carbon to hydrogen spectral window results in near identical performance. Importantly, the genetic algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete filter based detector thereby enabling a cheaper and compact system more amenable to field operations.

  16. Less is more: Avoiding the LIBS dimensionality curse through judicious feature selection for explosive detection

    PubMed Central

    Kumar Myakalwar, Ashwin; Spegazzini, Nicolas; Zhang, Chi; Kumar Anubham, Siva; Dasari, Ramachandra R.; Barman, Ishan; Kumar Gundawar, Manoj

    2015-01-01

    Despite its intrinsic advantages, translation of laser induced breakdown spectroscopy for material identification has been often impeded by the lack of robustness of developed classification models, often due to the presence of spurious correlations. While a number of classifiers exhibiting high discriminatory power have been reported, efforts in establishing the subset of relevant spectral features that enable a fundamental interpretation of the segmentation capability and avoid the ‘curse of dimensionality’ have been lacking. Using LIBS data acquired from a set of secondary explosives, we investigate judicious feature selection approaches and architect two different chemometrics classifiers, based on feature selection through prerequisite knowledge of the sample composition and genetic algorithm, respectively. While the full spectral input results in a classification rate of ca. 92%, selection of only the carbon to hydrogen spectral window results in near identical performance. Importantly, the genetic algorithm-derived classifier shows a statistically significant improvement to ca. 94% accuracy for prospective classification, even though the number of features used is an order of magnitude smaller. Our findings demonstrate the impact of rigorous feature selection in LIBS and also hint at the feasibility of using a discrete filter based detector thereby enabling a cheaper and compact system more amenable to field operations. PMID:26286630

  17. Data Foundry: Data Warehousing and Integration for Scientific Data Management

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Musick, R.; Critchlow, T.; Ganesh, M.

    2000-02-29

    Data warehousing is an approach for managing data from multiple sources by representing them with a single, coherent point of view. Commercial data warehousing products have been produced by companies such as RebBrick, IBM, Brio, Andyne, Ardent, NCR, Information Advantage, Informatica, and others. Other companies have chosen to develop their own in-house data warehousing solution using relational databases, such as those sold by Oracle, IBM, Informix and Sybase. The typical approaches include federated systems, and mediated data warehouses, each of which, to some extent, makes use of a series of source-specific wrapper and mediator layers to integrate the data into a consistent format which is then presented to users as a single virtual data store. These approaches are successful when applied to traditional business data because the data format used by the individual data sources tends to be rather static. Therefore, once a data source has been integrated into a data warehouse, there is relatively little work required to maintain that connection. However, that is not the case for all data sources. Data sources from scientific domains tend to regularly change their data model, format and interface. This is problematic because each change requires the warehouse administrator to update the wrapper, mediator, and warehouse interfaces to properly read, interpret, and represent the modified data source. Furthermore, the data that scientists require to carry out research is continuously changing as their understanding of a research question develops, or as their research objectives evolve. The difficulty and cost of these updates effectively limits the number of sources that can be integrated into a single data warehouse, or makes an approach based on warehousing too expensive to consider.

  18. Raman spectral feature selection using ant colony optimization for breast cancer diagnosis.

    PubMed

    Fallahzadeh, Omid; Dehghani-Bidgoli, Zohreh; Assarian, Mohammad

    2018-06-04

    Pathology as a common diagnostic test of cancer is an invasive, time-consuming, and partially subjective method. Therefore, optical techniques, especially Raman spectroscopy, have attracted the attention of cancer diagnosis researchers. However, as Raman spectra contain numerous peaks involved in molecular bonds of the sample, finding the best features related to cancerous changes can improve the accuracy of diagnosis in this method. The present research attempted to improve the power of Raman-based cancer diagnosis by finding the best Raman features using the ACO algorithm. In the present research, 49 spectra were measured from normal, benign, and cancerous breast tissue samples using a 785-nm micro-Raman system. After preprocessing for removal of noise and background fluorescence, the intensity of 12 important Raman bands of the biological samples was extracted as features of each spectrum. Then, the ACO algorithm was applied to find the optimum features for diagnosis. As the results demonstrated, by selecting five features, the classification accuracy of the normal, benign, and cancerous groups increased by 14% and reached 87.7%. ACO feature selection can improve the diagnostic accuracy of Raman-based diagnostic models. In the present study, features corresponding to ν(C-C) αhelix proline, valine (910-940), νs(C-C) skeletal lipids (1110-1130), and δ(CH2)/δ(CH3) proteins (1445-1460) were selected as the best features in cancer diagnosis.

  19. Spoken language identification based on the enhanced self-adjusting extreme learning machine approach.

    PubMed

    Albadr, Musatafa Abbas Abbood; Tiun, Sabrina; Al-Dhief, Fahad Taha; Sammour, Mahmoud A M

    2018-01-01

    Spoken Language Identification (LID) is the process of determining and classifying natural language from a given content and dataset. Typically, data must be processed to extract useful features to perform LID. The extracting features for LID, based on literature, is a mature process where the standard features for LID have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC), the Gaussian Mixture Model (GMM) and ending with the i-vector based framework. However, the process of learning based on extract features remains to be improved (i.e. optimised) to capture all embedded knowledge on the extracted features. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful to train a single hidden layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) due to the random selection of weights within the input hidden layer. In this study, the ELM is selected as a learning model for LID based on standard feature extraction. One of the optimisation approaches of ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM) is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process is performed incorporating both the Split-Ratio and K-Tournament methods, the improved SA-ELM is named Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). The results are generated based on LID with the datasets created from eight different languages. The results of the study showed excellent superiority relating to the performance of the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) compared with the SA-ELM LID, with ESA-ELM LID achieving an accuracy of 96.25%, as compared to the accuracy of SA-ELM LID of only 95.00%.

  20. Spoken language identification based on the enhanced self-adjusting extreme learning machine approach

    PubMed Central

    Tiun, Sabrina; AL-Dhief, Fahad Taha; Sammour, Mahmoud A. M.

    2018-01-01

    Spoken Language Identification (LID) is the process of determining and classifying natural language from a given content and dataset. Typically, data must be processed to extract useful features to perform LID. The extracting features for LID, based on literature, is a mature process where the standard features for LID have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC), the Gaussian Mixture Model (GMM) and ending with the i-vector based framework. However, the process of learning based on extract features remains to be improved (i.e. optimised) to capture all embedded knowledge on the extracted features. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful to train a single hidden layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) due to the random selection of weights within the input hidden layer. In this study, the ELM is selected as a learning model for LID based on standard feature extraction. One of the optimisation approaches of ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM) is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process is performed incorporating both the Split-Ratio and K-Tournament methods, the improved SA-ELM is named Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). The results are generated based on LID with the datasets created from eight different languages. The results of the study showed excellent superiority relating to the performance of the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) compared with the SA-ELM LID, with ESA-ELM LID achieving an accuracy of 96.25%, as compared to the accuracy of SA-ELM LID of only 95.00%. PMID:29672546

  1. Feature selection gait-based gender classification under different circumstances

    NASA Astrophysics Data System (ADS)

    Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah

    2014-05-01

    This paper proposes a gender classification based on human gait features and investigates the problem of two variations: clothing (wearing coats) and carrying bag condition as addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying wavelet transform. Three different sets of feature are proposed in this method. First, Spatio-temporal distance that is dealing with the distance of different parts of the human body (like feet, knees, hand, Human Height and shoulder) during one gait cycle. The second and third feature sets are constructed from approximation and non-approximation coefficient of human body respectively. To extract these two sets of feature we divided the human body into two parts, upper and lower body part, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize their discriminating significance. Finally k-Nearest Neighbor is applied as a classification method. Experimental results demonstrate that our approach is providing more realistic scenario and relatively better performance compared with the existing approaches.
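
    The Fisher-score ranking mentioned in the abstract above can be illustrated with a minimal sketch. It assumes one common form of the criterion (between-class scatter divided by within-class scatter, computed per feature); the abstract does not state which variant the authors used, and the names fisher_scores and top_k_features are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Return one Fisher score per feature column of X.

    Assumed form: sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    X is a list of rows, y the matching list of class labels.
    """
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall = mean(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        # A feature with zero within-class spread is treated as maximally separable.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(X, y, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 5.0], [0.2, 1.0], [1.0, 4.0], [1.2, 2.0]]
y = [0, 0, 1, 1]
selected = top_k_features(X, y, 1)
```

    The retained columns would then be passed to a classifier such as k-Nearest Neighbor, as the abstract describes.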

  2. A scale space feature based registration technique for fusion of satellite imagery

    NASA Technical Reports Server (NTRS)

    Raghavan, Srini; Cromp, Robert F.; Campbell, William C.

    1997-01-01

    Feature based registration is one of the most reliable methods to register multi-sensor images (both active and passive imagery) since features are often more reliable than intensity or radiometric values. The only situation where a feature based approach will fail is when the scene is completely homogenous or densely textural in which case a combination of feature and intensity based methods may yield better results. In this paper, we present some preliminary results of testing our scale space feature based registration technique, a modified version of feature based method developed earlier for classification of multi-sensor imagery. The proposed approach removes the sensitivity in parameter selection experienced in the earlier version as explained later.

  3. A feature-based approach to modeling protein–protein interaction hot spots

    PubMed Central

    Cho, Kyu-il; Kim, Dongsup; Lee, Doheon

    2009-01-01

    Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π–related interactions, especially π · · · π interactions. PMID:19273533

  4. Choice: 36 band feature selection software with applications to multispectral pattern recognition

    NASA Technical Reports Server (NTRS)

    Jones, W. C.

    1973-01-01

    Feature selection software was developed at the Earth Resources Laboratory that is capable of inputting up to 36 channels and selecting channel subsets according to several criteria based on divergence. One of the criteria used is compatible with the table look-up classifier requirements. The software indicates which channel subset best separates (based on average divergence) each class from all other classes. The software employs an exhaustive search technique, and computer time is not prohibitive. A typical task to select the best 4 of 22 channels for 12 classes takes 9 minutes on a Univac 1108 computer.
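
    The exhaustive search described above can be sketched in a few lines. Choosing 4 of 22 channels means scoring C(22, 4) = 7315 candidate subsets. The scoring function below is a hypothetical stand-in (a sum of made-up per-channel weights), not the average-divergence criterion of the original software.

```python
from itertools import combinations
from math import comb


def best_subset(n_channels, k, score):
    """Score every k-channel subset and return the highest-scoring one."""
    return max(combinations(range(n_channels), k), key=score)


# Hypothetical per-channel weights; (7 * c) % 22 is a permutation of 0..21.
toy_weights = [(7 * c) % 22 for c in range(22)]


def toy_score(subset):
    """Stand-in separability criterion: sum of the channel weights."""
    return sum(toy_weights[c] for c in subset)


n_candidates = comb(22, 4)  # 7315 subsets to evaluate
best = best_subset(22, 4, toy_score)
```

    The search cost grows combinatorially with the number of channels, which is why exhaustive search is only practical for small problems such as this one.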

  5. A stereo remote sensing feature selection method based on artificial bee colony algorithm

    NASA Astrophysics Data System (ADS)

    Yan, Yiming; Liu, Pigang; Zhang, Ye; Su, Nan; Tian, Shu; Gao, Fengjiao; Shen, Yi

    2014-05-01

    To improve the efficiency of stereo information for remote sensing classification, a stereo remote sensing feature selection method based on the artificial bee colony algorithm is proposed in this paper. Remote sensing stereo information can be described by a digital surface model (DSM) and an optical image, which contain information on the three-dimensional structure and the optical characteristics, respectively. Firstly, three-dimensional structure characteristics can be analyzed by 3D-Zernike descriptors (3DZD). However, different 3DZD parameters describe three-dimensional structures of different complexity, so the parameters need to be selected optimally for the various objects on the ground. Secondly, the features representing optical characteristics also need to be optimized. If not properly handled, a stereo feature vector composed of 3DZD and image features will contain a lot of redundant information, which may not improve the classification accuracy and may even have adverse effects. To reduce information redundancy while maintaining or improving the classification accuracy, an optimization framework for this stereo feature selection problem is created, and the artificial bee colony algorithm is introduced to solve this optimization problem. Experimental results show that the proposed method can effectively improve both the computational efficiency and the classification accuracy.

  6. EEG-based recognition of video-induced emotions: selecting subject-independent feature set.

    PubMed

    Kortelainen, Jukka; Seppänen, Tapio

    2013-01-01

    Emotions are fundamental for everyday life affecting our communication, learning, perception, and decision making. Including emotions into the human-computer interaction (HCI) could be seen as a significant step forward offering a great potential for developing advanced future technologies. As the electrical activity of the brain is affected by emotions, the electroencephalogram (EEG) offers an interesting channel to improve the HCI. In this paper, the selection of a subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying a person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In the future, further analysis of the video-induced EEG changes including the topographical differences in the spectral features is needed.
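
    A sequential forward floating search, as named in the abstract above, alternates a forward step (add the most helpful feature) with conditional backward steps (drop a feature when that beats the best subset of the smaller size seen so far). The sketch below is a generic textbook-style version under that description; the function name sffs and the score callback are hypothetical, and in a wrapper setting score would be a cross-validated classification rate.

```python
def sffs(n_features, k, score):
    """Return a k-feature subset found by sequential forward floating search.

    score(subset) takes a list of feature indices and returns a number
    to maximize (assumed here; e.g. a classification rate).
    """
    selected = []
    best_by_size = {}  # subset size -> (score, subset)
    while len(selected) < k:
        # Forward step: add the feature that gives the best enlarged subset.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        current = score(selected)
        size = len(selected)
        if size not in best_by_size or current > best_by_size[size][0]:
            best_by_size[size] = (current, list(selected))
        # Floating step: remove features while that improves on the best
        # subset previously recorded for the smaller size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            reduced_score = score(reduced)
            if reduced_score > best_by_size[len(reduced)][0]:
                selected = reduced
                best_by_size[len(reduced)] = (reduced_score, list(reduced))
            else:
                break
    return best_by_size[k][1]


# Toy usage with a hypothetical additive score.
weights = [1.0, 5.0, 3.0, 2.0]
chosen = sffs(4, 2, lambda subset: sum(weights[f] for f in subset))
```

    With a purely additive score the floating step never fires; it matters when features interact, which is the usual case for classifier-based scores.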

  7. A P2P Botnet detection scheme based on decision tree and adaptive multilayer neural networks.

    PubMed

    Alauthaman, Mohammad; Aslam, Nauman; Zhang, Li; Alasem, Rafe; Hossain, M A

    2018-01-01

    In recent years, Botnets have been adopted as a popular method to carry and spread many malicious codes on the Internet. These malicious codes pave the way to execute many fraudulent activities including spam mail, distributed denial-of-service attacks and click fraud. While many Botnets are set up using centralized communication architecture, the peer-to-peer (P2P) Botnets can adopt a decentralized architecture using an overlay network for exchanging command and control data making their detection even more difficult. This work presents a method of P2P Bot detection based on an adaptive multilayer feed-forward neural network in cooperation with decision trees. A classification and regression tree is applied as a feature selection technique to select relevant features. With these features, a multilayer feed-forward neural network training model is created using a resilient back-propagation learning algorithm. A comparison of feature set selection based on the decision tree, principal component analysis and the ReliefF algorithm indicated that the neural network model with features selection based on decision tree has a better identification accuracy along with lower rates of false positives. The usefulness of the proposed approach is demonstrated by conducting experiments on real network traffic datasets. In these experiments, an average detection rate of 99.08 % with false positive rate of 0.75 % was observed.

  8. SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.

    PubMed

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.

  9. SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier

    PubMed Central

    Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru

    2014-01-01

    Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306

  10. Cluster analysis based on dimensional information with applications to feature selection and classification

    NASA Technical Reports Server (NTRS)

    Eigen, D. J.; Fromm, F. R.; Northouse, R. A.

    1974-01-01

    A new clustering algorithm is presented that is based on dimensional information. The algorithm includes an inherent feature selection criterion, which is discussed. Further, a heuristic method for choosing the proper number of intervals for a frequency distribution histogram, a feature necessary for the algorithm, is presented. The algorithm, although usable as a stand-alone clustering technique, is then utilized as a global approximator. Local clustering techniques and configuration of a global-local scheme are discussed, and finally the complete global-local and feature selector configuration is shown in application to a real-time adaptive classification scheme for the analysis of remote sensed multispectral scanner data.

  11. Collective feature selection to identify crucial epistatic variants.

    PubMed

    Verma, Shefali S; Lucas, Anastasia; Zhang, Xinyuan; Veturi, Yogasudha; Dudek, Scott; Li, Binglan; Li, Ruowang; Urbanowicz, Ryan; Moore, Jason H; Kim, Dokyoon; Ritchie, Marylyn D

    2018-01-01

    Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. 
    Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~ 44,000 samples obtained from Geisinger's MyCode Community Health Initiative (on behalf of DiscovEHR collaboration). In this study, we were able to show that selecting variables using a collective feature selection approach could help in selecting true positive epistatic variables more frequently than applying any single method for feature selection via simulation studies. We were able to demonstrate the effectiveness of collective feature selection along with a comparison of many methods in our simulation analysis. We also applied our method to identify non-linear networks associated with obesity.
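
    The "union" idea in the abstract above can be sketched as follows: each method ranks the variants, a user-defined fraction of the top-ranked variants is kept per method, and the kept sets are merged. The method names, variant names and scores below are invented for illustration, and the rounding rule (ceiling, at least one variant) is an assumption.

```python
import math


def top_fraction(scores, fraction):
    """Keep the best-scoring fraction of variants from one method."""
    n_keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_keep])


def collective_selection(method_scores, fraction):
    """Union of the top fraction of variants selected by each method."""
    selected = set()
    for scores in method_scores.values():
        selected |= top_fraction(scores, fraction)
    return selected


# Hypothetical importance scores from two feature selection methods.
method_scores = {
    "method_a": {"v1": 0.9, "v2": 0.1, "v3": 0.5, "v4": 0.2},
    "method_b": {"v1": 0.2, "v2": 0.8, "v3": 0.3, "v4": 0.1},
}
kept = collective_selection(method_scores, 0.25)
```

    Taking the union rather than the intersection keeps a variant that only one method ranks highly, which matches the abstract's observation that no single method works best in all scenarios.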

  12. Use of Exam Wrappers to Enhance Students' Metacognitive Skills in a Large Introductory Food Science and Human Nutrition Course

    ERIC Educational Resources Information Center

    Gezer-Templeton, P. Gizem; Mayhew, Emily J.; Korte, Debra S.; Schmidt, Shelly J.

    2017-01-01

    Research shows that students struggle to develop higher order thinking skills and effective study strategies during the transition from high school to college. Therefore, in addition to teaching course content, effective instructors should assist students in developing metacognitive skills, that is, the practice of thinking about their thinking.…

  13. 26 CFR 301.7502-1 - Timely mailing of documents and payments treated as timely filing and paying.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... postmark stamped on the envelope or other appropriate wrapper (envelope) in which the document or payment was mailed. Thus, if the envelope that contains the document or payment has a timely postmark, the... apply in determining whether a failure to file a return or pay a tax has continued for an additional...

  14. Reconfigurable, Intelligently-Adaptive, Communication System, an SDR Platform

    NASA Technical Reports Server (NTRS)

    Roche, Rigoberto J.; Shalkhauser, Mary Jo; Hickey, Joseph P.; Briones, Janette C.

    2016-01-01

    The Space Telecommunications Radio System (STRS) provides a common, consistent framework to abstract the application software from the radio platform hardware. STRS aims to reduce the cost and risk of using complex, configurable and reprogrammable radio systems across NASA missions. The NASA Glenn Research Center (GRC) team made a software defined radio (SDR) platform STRS compliant by adding an STRS operating environment and a field programmable gate array (FPGA) wrapper, capable of implementing each of the platform's interfaces, as well as a test waveform to exercise those interfaces. This effort serves to provide a framework toward waveform development onto an STRS compliant platform to support future space communication systems for advanced exploration missions. The use of validated STRS compliant applications provides tested code with extensive documentation to potentially reduce risk, cost and effort in development of space-deployable SDRs. This paper discusses the advantages of STRS, the integration of STRS onto a Reconfigurable, Intelligently-Adaptive, Communication System (RIACS) SDR platform, and the test waveform and wrapper development efforts. The paper emphasizes the infusion of the STRS Architecture onto the RIACS platform for potential use in next generation flight system SDRs for advanced exploration missions.

  15. Selecting multiple features delays perception, but only when targets are horizontally arranged.

    PubMed

    Lo, Shih-Yu

    2017-01-01

    Based on the finding that perception is lagged by attention split on multiple features (Lo et al., 2012), this study investigated how the feature-based lag effect interacts with the target spatial arrangement. Participants were presented with gratings the spatial frequencies of which constantly changed. The task was to monitor two gratings of the same or different colors and report their spatial frequencies right before the stimulus offset. The results showed a perceptual lag wherein the reported value was closer to the physical value some time prior to the stimulus offset. This lag effect was larger when the two gratings were of different colors than when they were the same color. Furthermore, the feature-based lag effect was statistically significant when the two gratings were horizontally arranged but not when they were vertically or diagonally arranged. A model is proposed to explain the effect of target arrangement: When targets are horizontally arranged, selecting an additional feature delays perception. When targets are vertically or diagonally arranged, target selection for the lower field is prioritized. This prioritization on the lower target might prompt observers to only select the lower target and ignore the upper one, and this causes more perceptual errors without delaying perception. © 2017 Elsevier B.V. All rights reserved.

  16. Multiclass feature selection for improved pediatric brain tumor segmentation

    NASA Astrophysics Data System (ADS)

    Ahmed, Shaheen; Iftekharuddin, Khan M.

    2012-03-01

    In our previous work, we showed that fractal-based texture features are effective in detection, segmentation and classification of posterior-fossa (PF) pediatric brain tumor in multimodality MRI. We exploited an information theoretic approach such as Kullback-Leibler Divergence (KLD) for feature selection and ranking different texture features. We further incorporated the feature selection technique with a segmentation method such as Expectation Maximization (EM) for segmentation of tumor (T) and non tumor (NT) tissues. In this work, we extend the two class KLD technique to multiclass for effectively selecting the best features for brain tumor (T), cyst (C) and non tumor (NT). We further obtain segmentation robustness for each tissue type by computing Bayes' posterior probabilities and the corresponding number of pixels for each tissue segment in MRI patient images. We evaluate improved tumor segmentation robustness using different similarity metrics for 5 patients in T1, T2 and FLAIR modalities.
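
    The KLD-based ranking described in this abstract can be illustrated with a minimal two-class sketch. The abstract does not give the authors' estimator or their multiclass extension, so everything here is an assumption: the equal-width histograms, the bin count, the symmetric form of the divergence, and the function names kl_divergence, histogram and rank_features_by_kld are all hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in nats for two discrete distributions.

    eps keeps the ratio finite when q has an empty bin (an illustrative choice).
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)

def histogram(values, bins, lo, hi):
    """Normalized histogram of values over `bins` equal-width bins on [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # clamp the right edge into the last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]

def rank_features_by_kld(class_a, class_b, bins=4):
    """Return feature indices sorted by decreasing symmetric KLD between the two classes."""
    scores = []
    for j in range(len(class_a[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        lo, hi = min(col_a + col_b), max(col_a + col_b)
        if hi == lo:  # constant feature carries no class information
            scores.append((0.0, j))
            continue
        pa = histogram(col_a, bins, lo, hi)
        pb = histogram(col_b, bins, lo, hi)
        scores.append((kl_divergence(pa, pb) + kl_divergence(pb, pa), j))
    return [j for _, j in sorted(scores, key=lambda s: (-s[0], s[1]))]

# Toy data: feature 0 separates the classes, feature 1 does not.
tumor = [[0.10, 5.0], [0.20, 5.1], [0.15, 4.9]]
non_tumor = [[0.90, 5.0], [0.80, 5.1], [0.85, 4.9]]
ranking = rank_features_by_kld(tumor, non_tumor)
```

    A feature whose class-conditional histograms barely overlap receives a large divergence and is ranked first; a feature distributed identically in both classes scores zero.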

  17. Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor

    PubMed Central

    Alamedine, D.; Khalil, M.; Marque, C.

    2013-01-01

    Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The last two methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536

  18. An improved discriminative filter bank selection approach for motor imagery EEG signal classification using mutual information.

    PubMed

    Kumar, Shiu; Sharma, Alok; Tsunoda, Tatsuhiko

    2017-12-28

    Common spatial pattern (CSP) has been an effective technique for feature extraction in electroencephalography (EEG) based brain computer interfaces (BCIs). However, motor imagery EEG signal feature extraction using CSP generally depends on the selection of the frequency bands to a great extent. In this study, we propose a mutual information based frequency band selection approach. The idea of the proposed method is to utilize the information from all the available channels for effectively selecting the most discriminative filter banks. CSP features are extracted from multiple overlapping sub-bands. An additional sub-band has been introduced that covers the wide frequency band (7-30 Hz) and two different types of features are extracted using CSP and common spatio-spectral pattern techniques, respectively. Mutual information is then computed from the extracted features of each of these bands and the top filter banks are selected for further processing. Linear discriminant analysis is applied to the features extracted from each of the filter banks. The scores are fused together, and classification is done using support vector machine. The proposed method is evaluated using BCI Competition III dataset IVa, BCI Competition IV dataset I and BCI Competition IV dataset IIb, and it outperformed all other competing methods achieving the lowest misclassification rate and the highest kappa coefficient on all three datasets. Introducing a wide sub-band and using mutual information for selecting the most discriminative sub-bands, the proposed method shows improvement in motor imagery EEG signal classification.

  19. Semantic point cloud interpretation based on optimal neighborhoods, relevant features and efficient classifiers

    NASA Astrophysics Data System (ADS)

    Weinmann, Martin; Jutzi, Boris; Hinz, Stefan; Mallet, Clément

    2015-07-01

    3D scene analysis in terms of automatically assigning 3D points a respective semantic label has become a topic of great importance in photogrammetry, remote sensing, computer vision and robotics. In this paper, we address the issue of how to increase the distinctiveness of geometric features and select the most relevant ones among these for 3D scene analysis. We present a new, fully automated and versatile framework composed of four components: (i) neighborhood selection, (ii) feature extraction, (iii) feature selection and (iv) classification. For each component, we consider a variety of approaches which allow applicability in terms of simplicity, efficiency and reproducibility, so that end-users can easily apply the different components and do not require expert knowledge in the respective domains. In a detailed evaluation involving 7 neighborhood definitions, 21 geometric features, 7 approaches for feature selection, 10 classifiers and 2 benchmark datasets, we demonstrate that the selection of optimal neighborhoods for individual 3D points significantly improves the results of 3D scene analysis. Additionally, we show that the selection of adequate feature subsets may even further increase the quality of the derived results while significantly reducing both processing time and memory consumption.

  20. The Emotion Recognition System Based on Autoregressive Model and Sequential Forward Feature Selection of Electroencephalogram Signals

    PubMed Central

    Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie

    2014-01-01

    Electroencephalogram (EEG) is one of the useful biological signals to distinguish different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has attracted more attention from researchers, and several feature extraction methods and classifiers have been suggested to recognize emotions from EEG signals. In this research, we introduce an emotion recognition system using an autoregressive (AR) model, sequential forward feature selection (SFS) and a K-nearest neighbor (KNN) classifier using EEG signals during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on Levinson-Durbin's recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods based on the SFS algorithm and the Davies–Bouldin index are used in order to decrease the complexity of computing and redundancy of features; then, three different classifiers, including KNN, quadratic discriminant analysis and linear discriminant analysis, are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals of an available database for emotion analysis using physiological signals, which are recorded from 32 participants during 40 one-minute audio-visual inductions. According to the results, AR features are efficient to recognize emotional states from EEG signals, and KNN performs better than the two other classifiers in discriminating both two and three valence/arousal classes. The results also show that the SFS method improves accuracies by almost 10-15% as compared to Davies–Bouldin based feature selection. The best accuracies are 72.33% and 74.20% for two classes of valence and arousal and 61.10% and 65.16% for three classes, respectively. PMID:25298928
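
    The wrapper idea behind SFS with a nearest-neighbour classifier can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the abstract does not state the value of K, the validation scheme or the stopping rule, so the 1-nearest-neighbour classifier, the leave-one-out accuracy estimate and the stop-when-no-improvement criterion are assumptions, and the function names are hypothetical.

```python
def loo_accuracy_1nn(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier on the chosen features."""
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue  # hold sample i out
            dist = sum((X[i][f] - X[k][f]) ** 2 for f in features)
            if dist < best_dist:
                best_dist, best_label = dist, y[k]
        correct += best_label == y[i]
    return correct / len(X)

def sequential_forward_selection(X, y, max_features):
    """Greedily add the feature that most improves the wrapper score; stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loo_accuracy_1nn(X, y, selected + [f]), f) for f in remaining]
        score, feature = max(scored, key=lambda s: s[0])  # first best candidate wins ties
        if score <= best_score:
            break  # no candidate improves the classifier
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score

# Toy data: feature 1 separates the two classes, feature 0 is misleading.
X = [[0.30, 0.0], [0.90, 0.1], [0.10, 0.2], [0.80, 1.0], [0.20, 1.1], [0.95, 1.2]]
y = [0, 0, 0, 1, 1, 1]
chosen, score = sequential_forward_selection(X, y, max_features=2)
```

    Because every candidate subset is scored by retraining and testing the classifier, the cost grows with the number of features times the number of selection rounds, which is the usual trade-off of wrapper methods against filter criteria such as the Davies–Bouldin index.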

  1. New bandwidth selection criterion for Kernel PCA: approach to dimensionality reduction and classification problems.

    PubMed

    Thomas, Minta; De Brabanter, Kris; De Moor, Bart

    2014-05-10

    DNA microarrays are potentially powerful technology for improving diagnostic classification, treatment selection, and prognostic assessment. The use of this technology to predict cancer outcome has a history of almost a decade. Disease class predictors can be designed for known disease cases and provide diagnostic confirmation or clarify abnormal cases. The main inputs to these class predictors are high-dimensional data with many variables and few observations. Dimensionality reduction of this feature set significantly speeds up the prediction task. Feature selection and feature transformation methods are well known preprocessing steps in the field of bioinformatics. Several prediction tools are available based on these techniques. Studies show that a well tuned Kernel PCA (KPCA) is an efficient preprocessing step for dimensionality reduction, but the available bandwidth selection method for KPCA was computationally expensive. In this paper, we propose a new data-driven bandwidth selection criterion for KPCA, which is related to least squares cross-validation for kernel density estimation. We propose a new prediction model with a well tuned KPCA and Least Squares Support Vector Machine (LS-SVM). We estimate the accuracy of the newly proposed model based on 9 case studies. Then, we compare its performance (in terms of test set Area Under the ROC Curve (AUC) and computational time) with other well known techniques such as whole data set + LS-SVM, PCA + LS-SVM, t-test + LS-SVM, Prediction Analysis of Microarrays (PAM) and Least Absolute Shrinkage and Selection Operator (Lasso). Finally, we assess the performance of the proposed strategy with an existing KPCA parameter tuning algorithm by means of two additional case studies. We propose, evaluate, and compare several mathematical/statistical techniques, which apply feature transformation/selection for subsequent classification, and consider its application in medical diagnostics.
Both feature selection and feature transformation perform well on classification tasks. Due to the dynamic selection property of feature selection, it is hard to define significant features for the classifier, which predicts classes of future samples. Moreover, the proposed strategy enjoys a distinctive advantage with its relatively lesser time complexity.

  2. Intentional attention switching in dichotic listening: exploring the efficiency of nonspatial and spatial selection.

    PubMed

    Lawo, Vera; Fels, Janina; Oberem, Josefa; Koch, Iring

    2014-10-01

    Using an auditory variant of task switching, we examined the ability to intentionally switch attention in a dichotic-listening task. In our study, participants responded selectively to one of two simultaneously presented auditory number words (spoken by a female and a male, one for each ear) by categorizing its numerical magnitude. The mapping of gender (female vs. male) and ear (left vs. right) was unpredictable. The to-be-attended feature for gender or ear, respectively, was indicated by a visual selection cue prior to auditory stimulus onset. In Experiment 1, explicitly cued switches of the relevant feature dimension (e.g., from gender to ear) and switches of the relevant feature within a dimension (e.g., from male to female) occurred in an unpredictable manner. We found large performance costs when the relevant feature switched, but switches of the relevant feature dimension incurred only small additional costs. The feature-switch costs were larger in ear-relevant than in gender-relevant trials. In Experiment 2, we replicated these findings using a simplified design (i.e., only within-dimension switches with blocked dimensions). In Experiment 3, we examined preparation effects by manipulating the cueing interval and found a preparation benefit only when ear was cued. Together, our data suggest that the large part of attentional switch costs arises from reconfiguration at the level of relevant auditory features (e.g., left vs. right) rather than feature dimensions (ear vs. gender). Additionally, our findings suggest that ear-based target selection benefits more from preparation time (i.e., time to direct attention to one ear) than gender-based target selection.

  3. Differential privacy-based evaporative cooling feature selection and classification with relief-F and random forests.

    PubMed

    Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A

    2017-09-15

    Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations p≫n , these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy preserving classification that also prevents overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of Evaporative Cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. 
Code available at http://insilico.utulsa.edu/software/privateEC. Supplementary data are available at Bioinformatics online.

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng

    An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. Feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method and the dimension of feature space was reduced to 12. Classification of Chinese liquors was performed by using back propagation artificial neural network (BP-ANN), linear discrimination analysis (LDA), and a multi-linear classifier. The classification rate of the multi-linear classifier was 97.22%, which was higher than LDA and BP-ANN. Finally the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier and classification rate was 98.75% and 100%, respectively.

  5. Wrapping Python around MODFLOW/MT3DMS based groundwater models

    NASA Astrophysics Data System (ADS)

    Post, V.

    2008-12-01

    Numerical models that simulate groundwater flow and solute transport require a great amount of input data that is often organized into different files. A large proportion of the input data consists of spatially-distributed model parameters. The model output consists of a variety of data such as heads, fluxes and concentrations. Typically all files have different formats. Consequently, preparing input and managing output is a complex and error-prone task. Proprietary software tools are available that facilitate the preparation of input files and analysis of model outcomes. The use of such software may be limited if it does not support all the features of the groundwater model or when the costs of such tools are prohibitive. Therefore a Python library was developed that contains routines to generate input files and process output files of MODFLOW/MT3DMS based models. The library is freely available and has an open structure so that the routines can be customized and linked into other scripts and libraries. The current set of functions supports the generation of input files for MODFLOW and MT3DMS, including the capability to read spatially-distributed input parameters (e.g. hydraulic conductivity) from PNG files. Both ASCII and binary output files can be read efficiently allowing for visualization of, for example, solute concentration patterns in contour plots with superimposed flow vectors using matplotlib. Series of contour plots are then easily saved as an animation. The subroutines can also be used within scripts to calculate derived quantities such as the mass of a solute within a particular region of the model domain. Using Python as a wrapper around groundwater models provides an efficient and flexible way of processing input and output data, which is not constrained by limitations of third-party products.

  6. Software architecture and design of the web services facilitating climate model diagnostic analysis

    NASA Astrophysics Data System (ADS)

    Pan, L.; Lee, S.; Zhang, J.; Tang, B.; Zhai, C.; Jiang, J. H.; Wang, W.; Bao, Q.; Qi, M.; Kubar, T. L.; Teixeira, J.

    2015-12-01

    Climate model diagnostic analysis is a computationally- and data-intensive task because it involves multiple numerical model outputs and satellite observation data that can both be high resolution. We have built an online tool that facilitates this process. The tool is called Climate Model Diagnostic Analyzer (CMDA). It employs the web service technology and provides a web-based user interface. The benefits of these choices include: (1) No installation of any software other than a browser, hence it is platform compatible; (2) Co-location of computation and big data on the server side, and small results and plots to be downloaded on the client side, hence high data efficiency; (3) multi-threaded implementation to achieve parallel performance on multi-core servers; and (4) cloud deployment so each user has a dedicated virtual machine. In this presentation, we will focus on the computer science aspects of this tool, namely the architectural design, the infrastructure of the web services, the implementation of the web-based user interface, the mechanism of provenance collection, the approach to virtualization, and the Amazon Cloud deployment. As an example, we will describe our methodology to transform an existing science application code into a web service using a Python wrapper interface and Python web service frameworks (i.e., Flask, Gunicorn, and Tornado). Another example is the use of Docker, a light-weight virtualization container, to distribute and deploy CMDA onto an Amazon EC2 instance. CMDA has been successfully used in the 2014 Summer School hosted by the JPL Center for Climate Science. Students gave generally positive feedback and we will report their comments. An enhanced version of CMDA with several new features, some requested by the 2014 students, will be used in the 2015 Summer School soon.

  7. A comparative analysis of swarm intelligence techniques for feature selection in cancer classification.

    PubMed

    Gunavathi, Chellamuthu; Premalatha, Kandasamy

    2014-01-01

    Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.
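
    The filter stage that precedes the swarm search, ranking genes and keeping the top m, can be sketched as below. The abstract does not spell out its SNR formula, so this sketch assumes the commonly used two-class definition |mean1 - mean0| / (std1 + std0); the function names snr_score and top_m_genes are hypothetical, and the swarm intelligence search over the retained genes is not shown.

```python
from statistics import mean, stdev

def snr_score(pos, neg):
    """Signal-to-noise ratio of one gene between two classes (assumed definition)."""
    denom = stdev(pos) + stdev(neg)
    return abs(mean(pos) - mean(neg)) / denom if denom > 0 else 0.0

def top_m_genes(X, y, m):
    """Return the indices of the m genes with the highest SNR, ties broken by index."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append((snr_score(pos, neg), g))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [g for _, g in scores[:m]]

# Toy expression matrix: rows are samples, columns are genes.
X = [[1.0, 5.0, 0.2], [1.2, 4.0, 0.1], [0.9, 6.0, 0.3],
     [3.0, 5.5, 0.2], [3.1, 4.5, 0.1], [2.9, 5.0, 0.3]]
y = [0, 0, 0, 1, 1, 1]
best_gene = top_m_genes(X, y, m=1)
best_two = top_m_genes(X, y, m=2)
```

    Restricting the subsequent search to the top-m genes shrinks the space that PSO, CS, SFL or SFLLF must explore, at the risk of discarding genes that are only informative in combination.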

  8. Overcoming intratumoural heterogeneity for reproducible molecular risk stratification: a case study in advanced kidney cancer.

    PubMed

    Lubbock, Alexander L R; Stewart, Grant D; O'Mahony, Fiach C; Laird, Alexander; Mullen, Peter; O'Donnell, Marie; Powles, Thomas; Harrison, David J; Overton, Ian M

    2017-06-26

    Metastatic clear cell renal cell cancer (mccRCC) portends a poor prognosis and urgently requires better clinical tools for prognostication as well as for prediction of response to treatment. Considerable investment in molecular risk stratification has sought to overcome the performance ceiling encountered by methods restricted to traditional clinical parameters. However, replication of results has proven challenging, and intratumoural heterogeneity (ITH) may confound attempts at tissue-based stratification. We investigated the influence of confounding ITH on the performance of a novel molecular prognostic model, enabled by pathologist-guided multiregion sampling (n = 183) of geographically separated mccRCC cohorts from the SuMR trial (development, n = 22) and the SCOTRRCC study (validation, n = 22). Tumour protein levels quantified by reverse phase protein array (RPPA) were investigated alongside clinical variables. Regularised wrapper selection identified features for Cox multivariate analysis with overall survival as the primary endpoint. The optimal subset of variables in the final stratification model consisted of N-cadherin, EPCAM, Age, mTOR (NEAT). Risk groups from NEAT had a markedly different prognosis in the validation cohort (log-rank p = 7.62 × 10⁻⁷; hazard ratio (HR) 37.9, 95% confidence interval 4.1-353.8) and 2-year survival rates (accuracy = 82%, Matthews correlation coefficient = 0.62). Comparisons with established clinico-pathological scores suggest favourable performance for NEAT (Net reclassification improvement 7.1% vs International Metastatic Database Consortium score, 25.4% vs Memorial Sloan Kettering Cancer Center score). Limitations include the relatively small cohorts and associated wide confidence intervals on predictive performance. Our multiregion sampling approach enabled investigation of NEAT validation when limiting the number of samples analysed per tumour, which significantly degraded performance.
Indeed, sample selection could change risk group assignment for 64% of patients, and prognostication with one sample per patient performed only slightly better than random expectation (median logHR = 0.109). Low grade tissue was associated with 3.5-fold greater variation in predicted risk than high grade (p = 0.044). This case study in mccRCC quantitatively demonstrates the critical importance of tumour sampling for the success of molecular biomarker studies research where ITH is a factor. The NEAT model shows promise for mccRCC prognostication and warrants follow-up in larger cohorts. Our work evidences actionable parameters to guide sample collection (tumour coverage, size, grade) to inform the development of reproducible molecular risk stratification methods.
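
    For readers unfamiliar with the Matthews correlation coefficient reported alongside accuracy in this abstract, a short sketch of both metrics follows. The formula is the standard one for a binary confusion matrix; the counts used in the example are invented for illustration and are not the study's data.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0  # 0.0 when a row or column is empty

def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified cases."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
mcc = matthews_corrcoef(tp=9, tn=9, fp=3, fn=3)
acc = accuracy(tp=9, tn=9, fp=3, fn=3)
```

    Unlike accuracy, the coefficient stays near zero for a classifier that merely predicts the majority class, which makes it a useful companion metric for small or unbalanced cohorts.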

  9. Comparability of Computer-Based and Paper-Based Science Assessments

    ERIC Educational Resources Information Center

    Herrmann-Abell, Cari F.; Hardcastle, Joseph; DeBoer, George E.

    2018-01-01

    We compared students' performance on a paper-based test (PBT) and three computer-based tests (CBTs). The three computer-based tests used different test navigation and answer selection features, allowing us to examine how these features affect student performance. The study sample consisted of 9,698 fourth through twelfth grade students from across…

  10. Feature-based attention to unconscious shapes and colors.

    PubMed

    Schmidt, Filipp; Schmidt, Thomas

    2010-08-01

    Two experiments employed feature-based attention to modulate the impact of completely masked primes on subsequent pointing responses. Participants processed a color cue to select a pair of possible pointing targets out of multiple targets on the basis of their color, and then pointed to the one of those two targets with a prespecified shape. All target pairs were preceded by prime pairs triggering either the correct or the opposite response. The time interval between cue and primes was varied to modulate the time course of feature-based attentional selection. In a second experiment, the roles of color and shape were switched. Pointing trajectories showed large priming effects that were amplified by feature-based attention, indicating that attention modulated the earliest phases of motor output. Priming effects as well as their attentional modulation occurred even though participants remained unable to identify the primes, indicating distinct processes underlying visual awareness, attention, and response control.

  11. A method for fast selecting feature wavelengths from the spectral information of crop nitrogen

    USDA-ARS's Scientific Manuscript database

    Research on a method for fast selecting feature wavelengths from the nitrogen spectral information is necessary, which can determine the nitrogen content of crops. Based on the uniformity of uniform design, this paper proposed an improved particle swarm optimization (PSO) method. The method can ch...

  12. Medical X-ray Image Hierarchical Classification Using a Merging and Splitting Scheme in Feature Space.

    PubMed

    Fesharaki, Nooshin Jafari; Pourghassem, Hossein

    2013-07-01

    Due to the daily mass production and the widespread variation of medical X-ray images, it is necessary to classify these for searching and retrieving purposes, especially for content-based medical image retrieval systems. In this paper, a medical X-ray image hierarchical classification structure based on a novel merging and splitting scheme and using shape and texture features is proposed. In the first level of the proposed structure, to improve the classification performance, similar classes with regard to shape contents are grouped based on merging measures and shape features into the general overlapped classes. In the next levels of this structure, the overlapped classes are split into smaller classes based on the classification performance of a combination of shape and texture features or texture features only. Ultimately, in the last levels, this procedure is continued until all the classes are formed separately. Moreover, to optimize the feature vector in the proposed structure, we use an orthogonal forward selection algorithm according to the Mahalanobis class separability measure as a feature selection and reduction algorithm. In other words, according to the complexity and inter-class distance of each class, a sub-space of the feature space is selected in each level and then a supervised merging and splitting scheme is applied to form the hierarchical classification. The proposed structure is evaluated on a database consisting of 2158 medical X-ray images of 18 classes (IMAGECLEF 2005 database) and an accuracy rate of 93.6% in the last level of the hierarchical structure for an 18-class classification problem is obtained.

  13. Feature and Score Fusion Based Multiple Classifier Selection for Iris Recognition

    PubMed Central

    Islam, Md. Rabiul

    2014-01-01

    The aim of this work is to propose a new feature and score fusion based iris recognition approach where voting method on Multiple Classifier Selection technique has been applied. Four Discrete Hidden Markov Model classifiers output, that is, left iris based unimodal system, right iris based unimodal system, left-right iris feature fusion based multimodal system, and left-right iris likelihood ratio score fusion based multimodal system, is combined using voting method to achieve the final recognition result. CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, recognition accuracy of the proposed system has been compared with existing N hamming distance score fusion approach proposed by Ma et al., log-likelihood ratio score fusion approach proposed by Schmid et al., and single level feature fusion approach proposed by Hollingsworth et al. PMID:25114676
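
    The decision-level combination described here, four classifier outputs merged by voting, can be sketched in a few lines. The abstract does not describe the exact voting rule or how ties are resolved, so plain majority voting with ties going to the earliest-listed classifier is an assumption, and the identity labels and the function name majority_vote are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by most classifiers.

    Ties are resolved in favour of the label that appears first in `predictions`,
    i.e. the earliest-listed classifier (an illustrative tie-breaking rule).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for label in predictions:
        if counts[label] == top:
            return label

# Hypothetical outputs of the four systems for one probe image, in order:
# left iris, right iris, feature-fusion system, score-fusion system.
decision = majority_vote(["subject_17", "subject_04", "subject_17", "subject_17"])
tie_decision = majority_vote(["subject_04", "subject_17", "subject_17", "subject_04"])
```

    With an even number of voters, ties are possible, so the ordering of the classifiers (or a confidence-weighted vote) effectively decides the outcome in those cases.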

  14. Feature and score fusion based multiple classifier selection for iris recognition.

    PubMed

    Islam, Md Rabiul

    2014-01-01

    The aim of this work is to propose a new feature and score fusion based iris recognition approach where voting method on Multiple Classifier Selection technique has been applied. Four Discrete Hidden Markov Model classifiers output, that is, left iris based unimodal system, right iris based unimodal system, left-right iris feature fusion based multimodal system, and left-right iris likelihood ratio score fusion based multimodal system, is combined using voting method to achieve the final recognition result. CASIA-IrisV4 database has been used to measure the performance of the proposed system with various dimensions. Experimental results show the versatility of the proposed system of four different classifiers with various dimensions. Finally, recognition accuracy of the proposed system has been compared with existing N hamming distance score fusion approach proposed by Ma et al., log-likelihood ratio score fusion approach proposed by Schmid et al., and single level feature fusion approach proposed by Hollingsworth et al.

  15. Application of machine learning on brain cancer multiclass classification

    NASA Astrophysics Data System (ADS)

    Panca, V.; Rustam, Z.

    2017-07-01

    Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a small number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on support vector machine recursive feature elimination (SVM-RFE) principle which is improved to solve multiclass classification, called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds a single optimum hyperplane, the main objective of Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71.4% of the overall test data correctly, using 100 and 1000 genes selected from multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.

  16. 75 FR 42605 - Increase in Tax Rates on Tobacco Products and Cigarette Papers and Tubes; Floor Stocks Tax on...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-07-22

    ... than large cigars described in 26 U.S.C. 5701(a)(2)), and on cigarette papers and tubes, held for sale... tobacco for making cigars and tobacco for use as wrappers for cigars, effective April 1, 2009. The..., independent public health foundation, commented that TTB's large cigar reporting rules should be amended to...

  17. Information Assurance Study

    DTIC Science & Technology

    1998-01-01

    usually written up by Logistics or Maintenance (4790 is the Maintenance “ Bible ”). If need be, and if resources are available, one could collect all...Public domain) SATAN (System Administration Tool for Analyzing Networks) (Public Domain) STAT ( Security Test and Analysis Tool) (Harris Corporation...Service-Filtering Tools 1. TCP/IP wrapper program • Tools to Scan Hosts for Known Vulnerabilities 1. ISS (Internet Security Scanner) 2. SATAN (Security

  18. Optical-Correlator Neural Network Based On Neocognitron

    NASA Technical Reports Server (NTRS)

    Chao, Tien-Hsin; Stoner, William W.

    1994-01-01

    Multichannel optical correlator implements shift-invariant, high-discrimination pattern-recognizing neural network based on paradigm of neocognitron. Selected as basic building block of this neural network because invariance under shifts is inherent advantage of Fourier optics included in optical correlators in general. Neocognitron is conceptual electronic neural-network model for recognition of visual patterns. Multilayer processing achieved by iteratively feeding back output of feature correlator to input spatial light modulator and updating Fourier filters. Neural network trained by use of characteristic features extracted from target images. Multichannel implementation enables parallel processing of large number of selected features.

  19. Feature selection and classification of multiparametric medical images using bagging and SVM

    NASA Astrophysics Data System (ADS)

    Fan, Yong; Resnick, Susan M.; Davatzikos, Christos

    2008-03-01

    This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.
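
    The tuning criterion mentioned above, the area under the ROC curve estimated on the samples left out of each bootstrap draw, can be sketched in a few lines. The function names are hypothetical, and the pairwise AUC formula is a standard equivalent of the ROC area, not necessarily the estimator used by the authors.

```python
import random

def bootstrap_split(n, rng):
    """Draw n indices with replacement; indices never drawn form the left-out set."""
    in_bag = [rng.randrange(n) for _ in range(n)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n) if i not in drawn]
    return in_bag, out_of_bag

def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs, counting ties as half."""
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

rng = random.Random(0)  # fixed seed so the split is reproducible
in_bag, out_of_bag = bootstrap_split(10, rng)
print(len(in_bag), sorted(out_of_bag))

# Hypothetical classifier scores on four left-out samples
print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))  # prints 0.75
```

    A base classifier would be trained on the `in_bag` samples and scored on the `out_of_bag` ones; repeating this over many bootstrap draws gives the AUC estimate against which candidate parameter values are compared.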

  20. Mammogram classification scheme using 2D-discrete wavelet and local binary pattern for detection of breast cancer

    NASA Astrophysics Data System (ADS)

    Adi Putra, Januar

    2018-04-01

    In this paper, we propose a new mammogram classification scheme to classify breast tissues as normal or abnormal. A feature matrix is generated by applying Local Binary Pattern to all the detail coefficients from the 2D-DWT of the region of interest (ROI) of a mammogram. Feature selection is done by selecting the relevant features that affect the classification. Feature selection is used to reduce the dimensionality of the data and to discard features that are not relevant; in this paper the F-test and T-test are applied to the extracted feature dataset to reduce it and select the relevant features. The best features are used in a Neural Network classifier for classification. In this research we use the MIAS and DDSM databases. In addition to the suggested scheme, competing schemes are also simulated for comparative analysis. It is observed that the proposed scheme performs better with respect to accuracy, specificity and sensitivity. Based on experiments, the proposed scheme can reach an accuracy as high as 92.71%, while the lowest accuracy obtained is 77.08%.
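
    For readers unfamiliar with the Local Binary Pattern operator named in this abstract, the basic 3x3 form is sketched below. The abstract does not say which LBP variant or bit ordering the author used, so the clockwise ordering and the "greater than or equal" threshold are assumptions; the patch values are made up.

```python
def lbp_code(patch):
    """8-bit Local Binary Pattern code for the centre of a 3x3 patch.

    Neighbours are visited clockwise from the top-left; a neighbour at least
    as bright as the centre contributes a 1 bit. (The bit order is a
    convention chosen for this sketch.)
    """
    center = patch[1][1]
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:
            code |= 1 << bit
    return code

patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
print(lbp_code(patch))  # prints 241
```

    Applied to every position of a coefficient sub-band, the codes are typically summarized as a histogram, which then serves as the texture feature vector.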

  1. Python as a federation tool for GENESIS 3.0.

    PubMed

    Cornelis, Hugo; Rodriguez, Armando L; Coop, Allan D; Bower, James M

    2012-01-01

    The GENESIS simulation platform was one of the first broad-scale modeling systems in computational biology to encourage modelers to develop and share model features and components. Supported by a large developer community, it participated in innovative simulator technologies such as benchmarking, parallelization, and declarative model specification and was the first neural simulator to define bindings for the Python scripting language. An important feature of the latest version of GENESIS is that it decomposes into self-contained software components complying with the Computational Biology Initiative federated software architecture. This architecture allows separate scripting bindings to be defined for different necessary components of the simulator, e.g., the mathematical solvers and graphical user interface. Python is a scripting language that provides rich sets of freely available open source libraries. With clean dynamic object-oriented designs, they produce highly readable code and are widely employed in specialized areas of software component integration. We employ a simplified wrapper and interface generator to examine an application programming interface and make it available to a given scripting language. This allows independent software components to be 'glued' together and connected to external libraries and applications from user-defined Python or Perl scripts. We illustrate our approach with three examples of Python scripting. (1) Generate and run a simple single-compartment model neuron connected to a stand-alone mathematical solver. (2) Interface a mathematical solver with GENESIS 3.0 to explore a neuron morphology from either an interactive command-line or graphical user interface. 
(3) Apply scripting bindings to connect the GENESIS 3.0 simulator to external graphical libraries and an open source three dimensional content creation suite that supports visualization of models based on electron microscopy and their conversion to computational models. Employed in this way, the stand-alone software components of the GENESIS 3.0 simulator provide a framework for progressive federated software development in computational neuroscience.

  2. Python as a Federation Tool for GENESIS 3.0

    PubMed Central

    Cornelis, Hugo; Rodriguez, Armando L.; Coop, Allan D.; Bower, James M.

    2012-01-01

    The GENESIS simulation platform was one of the first broad-scale modeling systems in computational biology to encourage modelers to develop and share model features and components. Supported by a large developer community, it participated in innovative simulator technologies such as benchmarking, parallelization, and declarative model specification and was the first neural simulator to define bindings for the Python scripting language. An important feature of the latest version of GENESIS is that it decomposes into self-contained software components complying with the Computational Biology Initiative federated software architecture. This architecture allows separate scripting bindings to be defined for different necessary components of the simulator, e.g., the mathematical solvers and graphical user interface. Python is a scripting language that provides rich sets of freely available open source libraries. With clean dynamic object-oriented designs, they produce highly readable code and are widely employed in specialized areas of software component integration. We employ a simplified wrapper and interface generator to examine an application programming interface and make it available to a given scripting language. This allows independent software components to be ‘glued’ together and connected to external libraries and applications from user-defined Python or Perl scripts. We illustrate our approach with three examples of Python scripting. (1) Generate and run a simple single-compartment model neuron connected to a stand-alone mathematical solver. (2) Interface a mathematical solver with GENESIS 3.0 to explore a neuron morphology from either an interactive command-line or graphical user interface. 
(3) Apply scripting bindings to connect the GENESIS 3.0 simulator to external graphical libraries and an open source three dimensional content creation suite that supports visualization of models based on electron microscopy and their conversion to computational models. Employed in this way, the stand-alone software components of the GENESIS 3.0 simulator provide a framework for progressive federated software development in computational neuroscience. PMID:22276101

  3. Object-based attentional selection modulates anticipatory alpha oscillations

    PubMed Central

    Knakker, Balázs; Weiss, Béla; Vidnyánszky, Zoltán

    2015-01-01

    Visual cortical alpha oscillations are involved in attentional gating of incoming visual information. It has been shown that spatial and feature-based attentional selection result in increased alpha oscillations over the cortical regions representing sensory input originating from the unattended visual field and task-irrelevant visual features, respectively. However, whether attentional gating in the case of object based selection is also associated with alpha oscillations has not been investigated before. Here we measured anticipatory electroencephalography (EEG) alpha oscillations while participants were cued to attend to foveal face or word stimuli, the processing of which is known to have right and left hemispheric lateralization, respectively. The results revealed that in the case of simultaneously displayed, overlapping face and word stimuli, attending to the words led to increased power of parieto-occipital alpha oscillations over the right hemisphere as compared to when faces were attended. This object category-specific modulation of the hemispheric lateralization of anticipatory alpha oscillations was maintained during sustained attentional selection of sequentially presented face and word stimuli. These results imply that in the case of object-based attentional selection—similarly to spatial and feature-based attention—gating of visual information processing might involve visual cortical alpha oscillations. PMID:25628554

  4. Attention to Color Sharpens Neural Population Tuning via Feedback Processing in the Human Visual Cortex Hierarchy.

    PubMed

    Bartsch, Mandy V; Loewe, Kristian; Merkel, Christian; Heinze, Hans-Jochen; Schoenfeld, Mircea A; Tsotsos, John K; Hopf, Jens-Max

    2017-10-25

    Attention can facilitate the selection of elementary object features such as color, orientation, or motion. This is referred to as feature-based attention and it is commonly attributed to a modulation of the gain and tuning of feature-selective units in visual cortex. Although gain mechanisms are well characterized, little is known about the cortical processes underlying the sharpening of feature selectivity. Here, we show with high-resolution magnetoencephalography in human observers (men and women) that sharpened selectivity for a particular color arises from feedback processing in the human visual cortex hierarchy. To assess color selectivity, we analyze the response to a color probe that varies in color distance from an attended color target. We find that attention causes an initial gain enhancement in anterior ventral extrastriate cortex that is coarsely selective for the target color and transitions within ∼100 ms into a sharper tuned profile in more posterior ventral occipital cortex. We conclude that attention sharpens selectivity over time by attenuating the response at lower levels of the cortical hierarchy to color values neighboring the target in color space. These observations support computational models proposing that attention tunes feature selectivity in visual cortex through backward-propagating attenuation of units less tuned to the target. SIGNIFICANCE STATEMENT Whether searching for your car, a particular item of clothing, or just obeying traffic lights, in everyday life, we must select items based on color. But how does attention allow us to select a specific color? Here, we use high spatiotemporal resolution neuromagnetic recordings to examine how color selectivity emerges in the human brain. We find that color selectivity evolves as a coarse to fine process from higher to lower levels within the visual cortex hierarchy. 
Our observations support computational models proposing that feature selectivity increases over time by attenuating the responses of less-selective cells in lower-level brain areas. These data emphasize that color perception involves multiple areas across a hierarchy of regions, interacting with each other in a complex, recursive manner.

  5. A computational approach to compare regression modelling strategies in prediction research.

    PubMed

    Pajouheshnia, Romin; Pestman, Wiebe R; Teerenstra, Steven; Groenwold, Rolf H H

    2016-08-25

    It is often unclear which approach to fit, assess and adjust a model will yield the most accurate prediction model. We present an extension of an approach for comparing modelling strategies in linear regression to the setting of logistic regression and demonstrate its application in clinical prediction research. A framework for comparing logistic regression modelling strategies by their likelihoods was formulated using a wrapper approach. Five different strategies for modelling, including simple shrinkage methods, were compared in four empirical data sets to illustrate the concept of a priori strategy comparison. Simulations were performed in both randomly generated data and empirical data to investigate the influence of data characteristics on strategy performance. We applied the comparison framework in a case study setting. Optimal strategies were selected based on the results of a priori comparisons in a clinical data set and the performance of models built according to each strategy was assessed using the Brier score and calibration plots. The performance of modelling strategies was highly dependent on the characteristics of the development data in both linear and logistic regression settings. A priori comparisons in four empirical data sets found that no strategy consistently outperformed the others. The percentage of times that a model adjustment strategy outperformed a logistic model ranged from 3.9 to 94.9 %, depending on the strategy and data set. However, in our case study setting the a priori selection of optimal methods did not result in detectable improvement in model performance when assessed in an external data set. The performance of prediction modelling strategies is a data-dependent process and can be highly variable between data sets within the same clinical domain. A priori strategy comparison can be used to determine an optimal logistic regression modelling strategy for a given data set before selecting a final modelling approach.
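
    The Brier score used above to assess model performance is the mean squared difference between predicted probabilities and observed binary outcomes; lower values are better. A minimal sketch, with made-up predictions:

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("inputs must have the same length")
    return sum((p - o) ** 2 for p, o in zip(probabilities, outcomes)) / len(outcomes)

# Three hypothetical patients: predicted risk versus observed outcome
print(brier_score([0.9, 0.2, 0.6], [1, 0, 1]))  # approximately 0.07
```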

  6. Automated texture-based identification of ovarian cancer in confocal microendoscope images

    NASA Astrophysics Data System (ADS)

    Srivastava, Saurabh; Rodriguez, Jeffrey J.; Rouse, Andrew R.; Brewer, Molly A.; Gmitro, Arthur F.

    2005-03-01

    The fluorescence confocal microendoscope provides high-resolution, in-vivo imaging of cellular pathology during optical biopsy. There are indications that the examination of human ovaries with this instrument has diagnostic implications for the early detection of ovarian cancer. The purpose of this study was to develop a computer-aided system to facilitate the identification of ovarian cancer from digital images captured with the confocal microendoscope system. To achieve this goal, we modeled the cellular-level structure present in these images as texture and extracted features based on first-order statistics, spatial gray-level dependence matrices, and spatial-frequency content. Selection of the best features for classification was performed using traditional feature selection techniques including stepwise discriminant analysis, forward sequential search, a non-parametric method, principal component analysis, and a heuristic technique that combines the results of these methods. The best set of features selected was used for classification, and performance of various machine classifiers was compared by analyzing the areas under their receiver operating characteristic curves. The results show that it is possible to automatically identify patients with ovarian cancer based on texture features extracted from confocal microendoscope images and that the machine performance is superior to that of the human observer.
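
    Of the selection techniques listed above, forward sequential search is the simplest to illustrate: starting from an empty set, it repeatedly adds the single feature that most improves a scoring function. The sketch below is generic; the feature names and the additive toy scoring function are hypothetical stand-ins for texture features and for a classifier's ROC area.

```python
def forward_selection(candidates, evaluate, max_features):
    """Greedy forward search: add the feature that most improves `evaluate`.

    `evaluate` maps a list of feature names to a score (higher is better);
    the search stops early when no candidate improves the current score.
    """
    selected, best_score = [], float("-inf")
    while len(selected) < max_features:
        trials = [(evaluate(selected + [f]), f) for f in candidates if f not in selected]
        if not trials:
            break
        score, feature = max(trials)
        if score <= best_score:
            break
        selected.append(feature)
        best_score = score
    return selected, best_score

# Hypothetical per-feature gains standing in for a classifier's validation score
gains = {"contrast": 0.20, "entropy": 0.15, "mean": 0.05, "noise": -0.02}

def toy_evaluate(subset):
    return 0.5 + sum(gains[f] for f in subset)

selected, score = forward_selection(list(gains), toy_evaluate, max_features=4)
print(selected)  # prints ['contrast', 'entropy', 'mean']
```

    The search stops after three features here because adding "noise" would lower the score.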

  7. Evolutionary Algorithm Based Feature Optimization for Multi-Channel EEG Classification.

    PubMed

    Wang, Yubo; Veluvolu, Kalyana C

    2017-01-01

    Most BCI systems that rely on EEG signals employ Fourier based methods for time-frequency decomposition for feature extraction. The band-limited multiple Fourier linear combiner is well-suited for such band-limited signals due to its real-time applicability. Despite the improved performance of these techniques in two channel settings, their application to multiple-channel EEG is challenging and not straightforward. As more channels are available, a spatial filter will be required to eliminate the noise and preserve the required useful information. Moreover, multiple-channel EEG also adds high dimensionality to the frequency feature space. Feature selection will be required to stabilize the performance of the classifier. In this paper, we develop a new method based on Evolutionary Algorithm (EA) to solve these two problems simultaneously. The real-valued EA encodes both the spatial filter estimates and the feature selection into its solution and optimizes it with respect to the classification error. Three Fourier based designs are tested in this paper. Our results show that the combination of Fourier based method with covariance matrix adaptation evolution strategy (CMA-ES) has the best overall performance.

  8. What automated age estimation of hand and wrist MRI data tells us about skeletal maturation in male adolescents.

    PubMed

    Urschler, Martin; Grassegger, Sabine; Štern, Darko

    2015-01-01

    Age estimation of individuals is important in human biology and has various medical and forensic applications. Recent interest in MR-based methods aims to investigate alternatives for established methods involving ionising radiation. Automatic, software-based methods additionally promise improved estimation objectivity. This study investigates how informative automatically selected image features are regarding their ability to discriminate age, by exploring a recently proposed software-based age estimation method for MR images of the left hand and wrist. One hundred and two MR datasets of left hand images are used to evaluate the age estimation performance of the method, which consists of bone and epiphyseal gap volume localisation, computation of one age regression model per bone mapping image features to age, and fusion of individual bone age predictions into a final age estimate. Quantitative results of the software-based method show an age estimation performance with a mean absolute difference of 0.85 years (SD = 0.58 years) to chronological age, as determined by a cross-validation experiment. Qualitatively, it is demonstrated how feature selection works and which image features of skeletal maturation are automatically chosen to model the non-linear regression function. Feasibility of automatic age estimation based on MRI data is shown and selected image features are found to be informative for describing anatomical changes during physical maturation in male adolescents.

  9. Random forest feature selection approach for image segmentation

    NASA Astrophysics Data System (ADS)

    Lefkovits, László; Lefkovits, Szidónia; Emerich, Simina; Vaida, Mircea Florin

    2017-03-01

    In the field of image segmentation, discriminative models have shown promising performance. Generally, every such model begins with the extraction of numerous features from annotated images. Most authors create their discriminative model by using many features without using any selection criteria. A more reliable model can be built by using a framework that selects the important variables, from the point of view of the classification, and eliminates the unimportant ones. In this article we present a framework for feature selection and data dimensionality reduction. The methodology is built around the random forest (RF) algorithm and its variable importance evaluation. In order to deal with datasets so large as to be practically unmanageable, we propose an algorithm based on RF that reduces the dimension of the database by eliminating irrelevant features. Furthermore, this framework is applied to optimize our discriminative model for brain tumor segmentation.

  10. FSR: feature set reduction for scalable and accurate multi-class cancer subtype classification based on copy number.

    PubMed

    Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam

    2012-01-15

    Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.

  11. Ensemble Feature Learning of Genomic Data Using Support Vector Machine

    PubMed Central

    Anaissi, Ali; Goyal, Madhu; Catchpoole, Daniel R.; Braytee, Ali; Kennedy, Paul J.

    2016-01-01

    The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention but mostly on classification not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of RFE algorithm. The rationale behind this is, building ensemble SVM models using randomly drawn bootstrap samples from the training set, will produce different feature rankings which will be subsequently aggregated as one feature ranking. As a result, the decision for elimination of features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over random forest based approach. The selected genes by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD) which reveals significant clusters with the selected data. PMID:27304923
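
    The aggregation of per-model feature rankings described above can be sketched as follows. The abstract does not specify the aggregation rule, so averaging rank positions is an assumption, as are the gene names and the requirement that every ranking lists the same features.

```python
def aggregate_rankings(rankings):
    """Combine several rankings (best feature first) by summed rank position.

    Assumes every ranking contains the same features; ties are broken
    alphabetically so the result is deterministic.
    """
    totals = {}
    for ranking in rankings:
        for position, feature in enumerate(ranking):
            totals[feature] = totals.get(feature, 0) + position
    # A lower total position means the feature ranked higher on average
    return sorted(totals, key=lambda f: (totals[f], f))

# Hypothetical rankings from three models trained on different bootstrap samples
rankings = [
    ["g1", "g2", "g3", "g4"],
    ["g2", "g1", "g4", "g3"],
    ["g1", "g3", "g2", "g4"],
]
print(aggregate_rankings(rankings))  # prints ['g1', 'g2', 'g3', 'g4']
```

    In a backward elimination scheme, the features at the tail of the aggregated ranking would be removed before the ensemble is retrained.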

  12. Object-based target templates guide attention during visual search.

    PubMed

    Berggren, Nick; Eimer, Martin

    2018-05-03

    During visual search, attention is believed to be controlled in a strictly feature-based fashion, without any guidance by object-based target representations. To challenge this received view, we measured electrophysiological markers of attentional selection (N2pc component) and working memory (sustained posterior contralateral negativity; SPCN) in search tasks where two possible targets were defined by feature conjunctions (e.g., blue circles and green squares). Critically, some search displays also contained nontargets with two target features (incorrect conjunction objects, e.g., blue squares). Because feature-based guidance cannot distinguish these objects from targets, any selective bias for targets will reflect object-based attentional control. In Experiment 1, where search displays always contained only one object with target-matching features, targets and incorrect conjunction objects elicited identical N2pc and SPCN components, demonstrating that attentional guidance was entirely feature-based. In Experiment 2, where targets and incorrect conjunction objects could appear in the same display, clear evidence for object-based attentional control was found. The target N2pc became larger than the N2pc to incorrect conjunction objects from 250 ms poststimulus, and only targets elicited SPCN components. This demonstrates that after an initial feature-based guidance phase, object-based templates are activated when they are required to distinguish target and nontarget objects. These templates modulate visual processing and control access to working memory, and their activation may coincide with the start of feature integration processes. Results also suggest that while multiple feature templates can be activated concurrently, only a single object-based target template can guide attention at any given time.

  13. Optimizing classification performance in an object-based very-high-resolution land use-land cover urban application

    NASA Astrophysics Data System (ADS)

    Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore

    2017-10-01

    This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers. The results demonstrate that the accuracies of the SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.

  14. A primitive study of voxel feature generation by multiple stacked denoising autoencoders for detecting cerebral aneurysms on MRA

    NASA Astrophysics Data System (ADS)

    Nemoto, Mitsutaka; Hayashi, Naoto; Hanaoka, Shouhei; Nomura, Yukihiro; Miki, Soichiro; Yoshikawa, Takeharu; Ohtomo, Kuni

    2016-03-01

    The purpose of this study is to evaluate the feasibility of a novel feature generation method, which is based on multiple deep neural networks (DNNs) with boosting, for computer-assisted detection (CADe). It is hard and time-consuming to optimize the hyperparameters for DNNs such as the stacked denoising autoencoder (SdA). The proposed method allows using SdA based features without the burden of the hyperparameter setting. The proposed method was evaluated by an application for detecting cerebral aneurysms on magnetic resonance angiogram (MRA). A baseline CADe process included four components: scaling, candidate area limitation, candidate detection, and candidate classification. The proposed feature generation method was applied to extract the optimal features for candidate classification. The proposed method only required setting the range of the hyperparameters for SdA. The optimal feature set was selected from a large quantity of SdA based features by multiple SdAs, each of which was trained using a different hyperparameter set. The feature selection was performed through the AdaBoost ensemble learning method. Training of the baseline CADe process and the proposed feature generation was performed with 200 MRA cases, and the evaluation was performed with 100 MRA cases. The proposed method successfully provided SdA based features by just setting the range of some hyperparameters for SdA. The CADe process using both previous voxel features and SdA based features had the best performance, with an area under the ROC curve of 0.838 and an ANODE score of 0.312. The results showed that the proposed method was effective in the application for detecting cerebral aneurysms on MRA.

  15. Differential diagnosis of CT focal liver lesions using texture features, feature selection and ensemble driven classifiers.

    PubMed

    Mougiakakou, Stavroula G; Valavanis, Ioannis K; Nikita, Alexandra; Nikita, Konstantina S

    2007-09-01

    The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed. A number of regions of interest (ROIs) corresponding to C1-C4 have been defined by experienced radiologists in non-enhanced liver CT images. For each ROI, five distinct sets of texture features were extracted using first order statistics, spatial gray level dependence matrix, gray level difference method, Laws' texture energy measures, and fractal dimension measurements. Two different ECs were constructed and compared. The first one consisted of five multilayer perceptron neural networks (NNs), each using as input one of the computed texture feature sets or its reduced version after genetic algorithm-based feature selection. The second EC comprised five different primary classifiers, namely one multilayer perceptron NN, one probabilistic NN, and three k-nearest neighbor classifiers, each fed with the combination of the five texture feature sets or their reduced versions. The final decision of each EC was extracted by using appropriate voting schemes, while bootstrap re-sampling was utilized in order to estimate the generalization ability of the CAD architectures based on the available relatively small-sized data set. The best mean classification accuracy (84.96%) is achieved by the second EC using a fused feature set, and the weighted voting scheme. The fused feature set was obtained after appropriate feature selection applied to specific subsets of the original feature set. 
The comparative assessment of the various CAD architectures shows that combining three types of classifiers with a voting scheme, fed with identical feature sets obtained after appropriate feature selection and fusion, may result in an accurate system able to assist differential diagnosis of focal liver lesions from non-enhanced CT images.
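
    The abstract above reports that a weighted voting scheme gave the best ensemble accuracy but does not say how the weights were set. The following is a minimal sketch of weighted voting in general, not the authors' implementation; the function name `weighted_vote` and the example weights are hypothetical (in practice they might come from each classifier's validation accuracy).

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers have the largest total weight.

    predictions: one predicted class label per classifier.
    weights: one non-negative weight per classifier (hypothetical values here).
    Ties go to the label that appears first in `predictions`.
    """
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per prediction")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # max() returns the first maximal key in insertion order.
    return max(totals, key=totals.get)


# Five classifiers vote on one ROI; labels follow the C1-C4 naming of the abstract.
votes = ["C3", "C4", "C4", "C3", "C3"]
weights = [0.9, 0.8, 0.7, 0.3, 0.2]  # illustrative only
print(weighted_vote(votes, weights))
```

    In this toy case a plain majority vote would pick C3 (three votes to two), while the weighted vote picks C4 because the two classifiers backing it carry more total weight.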

  16. Selection of features within and without objects: effects of gestalt appearance and object-based instruction on behavior and event-related brain potentials.

    PubMed

    Verleger, Rolf; Groen, Margriet; Heide, Wolfgang; Sobieralska, Kinga; Jaśkowski, Piotr

    2008-05-01

    We studied how physical and instructed embedding of features in gestalts affects perceptual selection. Four ovals on the horizontal midline were either unconnected or pairwise connected by circles, forming ears of left and right heads (gestalts). Relevant to responding was the position of one colored oval, either within its pair or relative to fixation ("object-based" or "fixation-based" instruction). Responses were faster under fixation- than object-based instruction, less so with gestalts. Previously reported increases of N1 when evoked by features within objects were replicated for fixation-based instruction only. There was no effect of instruction on N2pc. However P1 increased under the adequate instruction, object-based for gestalts, fixation-based for unconnected items, which presumably indicated how foci of attention were set by expecting specific stimuli under instructions that specified how to bind these stimuli to objects.

  17. Geospatial Analytics in Retail Site Selection and Sales Prediction.

    PubMed

    Ting, Choo-Yee; Ho, Chiung Ching; Yee, Hui Jia; Matsah, Wan Razali

    2018-03-01

    Studies have shown that certain features from geography, demography, trade area, and environment can play a vital role in retail site selection, largely due to the impact they exert on retail performance. Although the relevant features could be elicited by domain experts, determining the optimal feature set can be an intractable and labor-intensive exercise. The challenges center around (1) how to determine features that are important to a particular retail business and (2) how to estimate retail sales performance given a new location. The challenges become apparent when the features vary across time. In this light, this study proposed a nonintervening approach by employing feature selection algorithms and subsequently sales prediction through similarity-based methods. The results of prediction were validated by domain experts. In this study, data sets from different sources were transformed and aggregated before an analytics data set that is ready for analysis purpose could be obtained. The data sets included data about feature location, population count, property type, education status, and monthly sales from 96 branches of a telecommunication company in Malaysia. The findings suggested that (1) optimal retail performance can only be achieved through fulfillment of specific location features together with the surrounding trade area characteristics and (2) a similarity-based method can provide a solution to retail sales prediction.

  18. Exploring the QSAR's predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets.

    PubMed

    Martínez-Santiago, O; Marrero-Ponce, Y; Vivas-Reyes, R; Rivera-Borroto, O M; Hurtado, E; Treto-Suarez, M A; Ramos, Y; Vergara-Murillo, F; Orozco-Ugarriza, M E; Martínez-López, Y

    2017-05-01

    Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. However, when all families were used in combination, the results achieved were quantitatively higher than those reported by other authors in similar experiments. Comparisons with respect to external correlation coefficients (q^2_ext) revealed that the models based on GDIs possess superior predictive ability in seven of the eight datasets analysed, outperforming methodologies based on similar or more complex techniques and confirming the good predictive power of the obtained models. For the q^2_ext values, the non-parametric comparison revealed significantly different results to those reported so far, which demonstrated that the models based on DIVATI's indices presented the best global performance and yielded significantly better predictions than the 12 0-3D QSAR procedures used in the comparison. Therefore, GDIs are suitable for structure codification of the molecules and constitute a good alternative to build QSARs for the prediction of physicochemical, biological and environmental endpoints.
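
    The abstract above compares models by an external correlation coefficient but does not give its formula. As a hedged illustration, the sketch below computes one commonly used definition of external q^2, namely 1 minus the ratio of the test-set prediction error sum of squares to the test-set deviation from the training-set mean. Whether the paper uses exactly this variant is an assumption; the function name `q2_ext` and the numbers are hypothetical.

```python
def q2_ext(y_train, y_test, y_pred):
    """External q^2 under one common definition (an assumption, see lead-in).

    q^2_ext = 1 - sum((y_test - y_pred)^2) / sum((y_test - mean(y_train))^2)
    """
    if len(y_test) != len(y_pred):
        raise ValueError("y_test and y_pred must have the same length")
    mean_train = sum(y_train) / len(y_train)
    # Prediction error sum of squares on the external (test) set.
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test, y_pred))
    # Spread of the test observations around the training-set mean.
    ss_ext = sum((obs - mean_train) ** 2 for obs in y_test)
    return 1.0 - press / ss_ext


# Illustrative numbers only, not data from the paper.
y_train = [1.0, 2.0, 3.0]
y_test = [1.0, 3.0]
print(q2_ext(y_train, y_test, [1.0, 3.0]))  # perfect predictions
print(q2_ext(y_train, y_test, [2.0, 2.0]))  # always predicting the training mean
```

    Under this definition a model that only predicts the training mean scores 0, and a perfect model scores 1, which is why values are usually read as "improvement over the mean".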

  19. Developing a new case based computer-aided detection scheme and an adaptive cueing method to improve performance in detecting mammographic lesions

    PubMed Central

    Tan, Maxine; Aghaei, Faranak; Wang, Yunzhi; Zheng, Bin

    2017-01-01

    The purpose of this study is to evaluate a new method to improve performance of computer-aided detection (CAD) schemes of screening mammograms with two approaches. In the first approach, we developed a new case based CAD scheme using a set of optimally selected global mammographic density, texture, spiculation, and structural similarity features computed from all four full-field digital mammography (FFDM) images of the craniocaudal (CC) and mediolateral oblique (MLO) views by using a modified fast and accurate sequential floating forward selection feature selection algorithm. Selected features were then applied to a “scoring fusion” artificial neural network (ANN) classification scheme to produce a final case based risk score. In the second approach, we combined the case based risk score with the conventional lesion based scores of a conventional lesion based CAD scheme using a new adaptive cueing method that is integrated with the case based risk scores. We evaluated our methods using a ten-fold cross-validation scheme on 924 cases (476 cancer and 448 recalled or negative), whereby each case had all four images from the CC and MLO views. The area under the receiver operating characteristic curve was AUC = 0.793±0.015 and the odds ratio monotonically increased from 1 to 37.21 as CAD-generated case based detection scores increased. Using the new adaptive cueing method, the region based and case based sensitivities of the conventional CAD scheme at a false positive rate of 0.71 per image increased by 2.4% and 0.8%, respectively. The study demonstrated that supplementary information can be derived by computing global mammographic density image features to improve CAD-cueing performance on the suspicious mammographic lesions. PMID:27997380

  20. A Feature Selection Algorithm to Compute Gene Centric Methylation from Probe Level Methylation Data.

    PubMed

    Baur, Brittany; Bozdag, Serdar

    2016-01-01

    DNA methylation is an important epigenetic event that affects gene expression during development and various diseases such as cancer. Understanding the mechanism of action of DNA methylation is important for downstream analysis. In the Illumina Infinium HumanMethylation 450K array, there are tens of probes associated with each gene. Given methylation intensities of all these probes, it is necessary to compute which of these probes are most representative of the gene centric methylation level. In this study, we developed a feature selection algorithm based on sequential forward selection that utilized different classification methods to compute gene centric DNA methylation using probe level DNA methylation data. We compared our algorithm to other feature selection algorithms such as support vector machines with recursive feature elimination, genetic algorithms and ReliefF. We evaluated all methods based on the predictive power of selected probes on their mRNA expression levels and found that a K-Nearest Neighbors classification using the sequential forward selection algorithm performed better than other algorithms based on all metrics. We also observed that transcriptional activities of certain genes were more sensitive to DNA methylation changes than transcriptional activities of other genes. Our algorithm was able to predict the expression of those genes with high accuracy using only DNA methylation data. Our results also showed that those DNA methylation-sensitive genes were enriched in Gene Ontology terms related to the regulation of various biological processes.
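
    The abstract above pairs sequential forward selection with a K-Nearest Neighbors classifier. The following is a generic, dependency-free sketch of that wrapper idea, not the authors' algorithm: it greedily adds the feature that most improves leave-one-out k-NN accuracy. The function names, the leave-one-out criterion, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def loo_knn_accuracy(X, y, features, k=1):
    """Leave-one-out accuracy of k-NN using only the columns in `features`."""
    correct = 0
    for i, xi in enumerate(X):
        # Squared Euclidean distance to every other sample on the chosen features.
        dists = sorted(
            (sum((xi[f] - xj[f]) ** 2 for f in features), j)
            for j, xj in enumerate(X)
            if j != i
        )
        neighbours = [y[j] for _, j in dists[:k]]
        predicted = Counter(neighbours).most_common(1)[0][0]
        correct += predicted == y[i]
    return correct / len(X)


def sequential_forward_selection(X, y, n_select, score=loo_knn_accuracy):
    """Greedily add the feature that gives the best wrapper score at each step."""
    selected = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining, key=lambda f: score(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data: column 0 separates the classes, column 1 is misleading noise.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 4.9], [1.1, 1.1], [1.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(X, y, n_select=1))
```

    Each step retrains and rescores the classifier once per remaining feature, so the cost grows with the number of candidate features times the number selected; that cost is the usual trade-off of wrapper methods against cheaper filter methods.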
