Integrated feature extraction and selection for neuroimage classification
NASA Astrophysics Data System (ADS)
Fan, Yong; Shen, Dinggang
2009-02-01
Feature extraction and selection are of great importance in neuroimage classification for identifying informative features and reducing feature dimensionality, which are generally implemented as two separate steps. This paper presents an integrated feature extraction and selection algorithm with two iterative steps: constrained subspace learning based feature extraction and support vector machine (SVM) based feature selection. The subspace learning based feature extraction focuses on the brain regions with higher possibility of being affected by the disease under study, while the possibility of brain regions being affected by disease is estimated by the SVM based feature selection, in conjunction with SVM classification. This algorithm can not only take into account the inter-correlation among different brain regions, but also overcome the limitation of traditional subspace learning based feature extraction methods. To achieve robust performance and optimal selection of parameters involved in feature extraction, selection, and classification, a bootstrapping strategy is used to generate multiple versions of training and testing sets for parameter optimization, according to the classification performance measured by the area under the ROC (receiver operating characteristic) curve. The integrated feature extraction and selection method is applied to a structural MR image based Alzheimer's disease (AD) study with 98 non-demented and 100 demented subjects. Cross-validation results indicate that the proposed algorithm can improve performance of the traditional subspace learning based classification.
Effect of finite sample size on feature selection and classification: a simulation study.
Way, Ted W; Sahiner, Berkman; Hadjiiski, Lubomir M; Chan, Heang-Ping
2010-02-01
The small number of samples available for training and testing is often the limiting factor in finding the most effective features and designing an optimal computer-aided diagnosis (CAD) system. Training on a limited set of samples introduces bias and variance in the performance of a CAD system relative to that trained with an infinite sample size. In this work, the authors conducted a simulation study to evaluate the performances of various combinations of classifiers and feature selection techniques and their dependence on the class distribution, dimensionality, and the training sample size. The understanding of these relationships will facilitate development of effective CAD systems under the constraint of limited available samples. Three feature selection techniques, the stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve Az. The mean Az values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for other conditions. PCA was comparable to or better than SFS and SFFS for LDA at small samples sizes, but inferior for SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available. None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
Hollingworth, Andrew; Hwang, Seongmin
2013-10-19
We examined the conditions under which a feature value in visual working memory (VWM) recruits visual attention to matching stimuli. Previous work has suggested that VWM supports two qualitatively different states of representation: an active state that interacts with perceptual selection and a passive (or accessory) state that does not. An alternative hypothesis is that VWM supports a single form of representation, with the precision of feature memory controlling whether or not the representation interacts with perceptual selection. The results of three experiments supported the dual-state hypothesis. We established conditions under which participants retained a relatively precise representation of a parcticular colour. If the colour was immediately task relevant, it reliably recruited attention to matching stimuli. However, if the colour was not immediately task relevant, it failed to interact with perceptual selection. Feature maintenance in VWM is not necessarily equivalent with feature-based attentional selection.
Collective feature selection to identify crucial epistatic variants.
Verma, Shefali S; Lucas, Anastasia; Zhang, Xinyuan; Veturi, Yogasudha; Dudek, Scott; Li, Binglan; Li, Ruowang; Urbanowicz, Ryan; Moore, Jason H; Kim, Dokyoon; Ritchie, Marylyn D
2018-01-01
Machine learning methods have gained popularity and practicality in identifying linear and non-linear effects of variants associated with complex disease/traits. Detection of epistatic interactions still remains a challenge due to the large number of features and relatively small sample size as input, thus leading to the so-called "short fat data" problem. The efficiency of machine learning methods can be increased by limiting the number of input features. Thus, it is very important to perform variable selection before searching for epistasis. Many methods have been evaluated and proposed to perform feature selection, but no single method works best in all scenarios. We demonstrate this by conducting two separate simulation analyses to evaluate the proposed collective feature selection approach. Through our simulation study we propose a collective feature selection approach to select features that are in the "union" of the best performing methods. We explored various parametric, non-parametric, and data mining approaches to perform feature selection. We choose our top performing methods to select the union of the resulting variables based on a user-defined percentage of variants selected from each method to take to downstream analysis. Our simulation analysis shows that non-parametric data mining approaches, such as MDR, may work best under one simulation criteria for the high effect size (penetrance) datasets, while non-parametric methods designed for feature selection, such as Ranger and Gradient boosting, work best under other simulation criteria. Thus, using a collective approach proves to be more beneficial for selecting variables with epistatic effects also in low effect size datasets and different genetic architectures. Following this, we applied our proposed collective feature selection approach to select the top 1% of variables to identify potential interacting variables associated with Body Mass Index (BMI) in ~ 44,000 samples obtained from Geisinger's MyCode Community Health Initiative (on behalf of DiscovEHR collaboration). In this study, we were able to show that selecting variables using a collective feature selection approach could help in selecting true positive epistatic variables more frequently than applying any single method for feature selection via simulation studies. We were able to demonstrate the effectiveness of collective feature selection along with a comparison of many methods in our simulation analysis. We also applied our method to identify non-linear networks associated with obesity.
Select Features in "Sibelius 6" for Music Educators
ERIC Educational Resources Information Center
Thompson, Douglas Earl
2012-01-01
Select features of interest to music educators are spotlighted in "Sibelius 6" (version 6.2.0 build 88) and presented under three headings: (a) Worksheets, (2) Plug-Ins, and (3) Potpourri. Guidance is given for accessing these features, and the commentary suggests their application in the music education classroom. (Contains 3 figures.)
Hollingworth, Andrew; Hwang, Seongmin
2013-01-01
We examined the conditions under which a feature value in visual working memory (VWM) recruits visual attention to matching stimuli. Previous work has suggested that VWM supports two qualitatively different states of representation: an active state that interacts with perceptual selection and a passive (or accessory) state that does not. An alternative hypothesis is that VWM supports a single form of representation, with the precision of feature memory controlling whether or not the representation interacts with perceptual selection. The results of three experiments supported the dual-state hypothesis. We established conditions under which participants retained a relatively precise representation of a parcticular colour. If the colour was immediately task relevant, it reliably recruited attention to matching stimuli. However, if the colour was not immediately task relevant, it failed to interact with perceptual selection. Feature maintenance in VWM is not necessarily equivalent with feature-based attentional selection. PMID:24018723
Natural image statistics and low-complexity feature selection.
Vasconcelos, Manuela; Vasconcelos, Nuno
2009-02-01
Low-complexity feature selection is analyzed in the context of visual recognition. It is hypothesized that high-order dependences of bandpass features contain little information for discrimination of natural images. This hypothesis is characterized formally by the introduction of the concepts of conjunctive interference and decomposability order of a feature set. Necessary and sufficient conditions for the feasibility of low-complexity feature selection are then derived in terms of these concepts. It is shown that the intrinsic complexity of feature selection is determined by the decomposability order of the feature set and not its dimension. Feature selection algorithms are then derived for all levels of complexity and are shown to be approximated by existing information-theoretic methods, which they consistently outperform. The new algorithms are also used to objectively test the hypothesis of low decomposability order through comparison of classification performance. It is shown that, for image classification, the gain of modeling feature dependencies has strongly diminishing returns: best results are obtained under the assumption of decomposability order 1. This suggests a generic law for bandpass features extracted from natural images: that the effect, on the dependence of any two features, of observing any other feature is constant across image classes.
Local Feature Selection for Data Classification.
Armanfard, Narges; Reilly, James P; Komeili, Majid
2016-06-01
Typical feature selection methods choose an optimal global feature subset that is applied over all regions of the sample space. In contrast, in this paper we propose a novel localized feature selection (LFS) approach whereby each region of the sample space is associated with its own distinct optimized feature set, which may vary both in membership and size across the sample space. This allows the feature set to optimally adapt to local variations in the sample space. An associated method for measuring the similarities of a query datum to each of the respective classes is also proposed. The proposed method makes no assumptions about the underlying structure of the samples; hence the method is insensitive to the distribution of the data over the sample space. The method is efficiently formulated as a linear programming optimization problem. Furthermore, we demonstrate the method is robust against the over-fitting problem. Experimental results on eleven synthetic and real-world data sets demonstrate the viability of the formulation and the effectiveness of the proposed algorithm. In addition we show several examples where localized feature selection produces better results than a global feature selection method.
Wang, Hailing; Ip, Chengteng; Fu, Shimin; Sun, Pei
2017-05-01
Face recognition theories suggest that our brains process invariant (e.g., gender) and changeable (e.g., emotion) facial dimensions separately. To investigate whether these two dimensions are processed in different time courses, we analyzed the selection negativity (SN, an event-related potential component reflecting attentional modulation) elicited by face gender and emotion during a feature selective attention task. Participants were instructed to attend to a combination of face emotion and gender attributes in Experiment 1 (bi-dimensional task) and to either face emotion or gender in Experiment 2 (uni-dimensional task). The results revealed that face emotion did not elicit a substantial SN, whereas face gender consistently generated a substantial SN in both experiments. These results suggest that face gender is more sensitive to feature-selective attention and that face emotion is encoded relatively automatically on SN, implying the existence of different underlying processing mechanisms for invariant and changeable facial dimensions. Copyright © 2017 Elsevier Ltd. All rights reserved.
Max-AUC Feature Selection in Computer-Aided Detection of Polyps in CT Colonography
Xu, Jian-Wu; Suzuki, Kenji
2014-01-01
We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level. PMID:24608058
Max-AUC feature selection in computer-aided detection of polyps in CT colonography.
Xu, Jian-Wu; Suzuki, Kenji
2014-03-01
We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level.
Discriminative least squares regression for multiclass classification and feature selection.
Xiang, Shiming; Nie, Feiping; Meng, Gaofeng; Pan, Chunhong; Zhang, Changshui
2012-11-01
This paper presents a framework of discriminative least squares regression (LSR) for multiclass classification and feature selection. The core idea is to enlarge the distance between different classes under the conceptual framework of LSR. First, a technique called ε-dragging is introduced to force the regression targets of different classes moving along opposite directions such that the distances between classes can be enlarged. Then, the ε-draggings are integrated into the LSR model for multiclass classification. Our learning framework, referred to as discriminative LSR, has a compact model form, where there is no need to train two-class machines that are independent of each other. With its compact form, this model can be naturally extended for feature selection. This goal is achieved in terms of L2,1 norm of matrix, generating a sparse learning model for feature selection. The model for multiclass classification and its extension for feature selection are finally solved elegantly and efficiently. Experimental evaluation over a range of benchmark datasets indicates the validity of our method.
Spatial and Feature-Based Attention in a Layered Cortical Microcircuit Model
Wagatsuma, Nobuhiko; Potjans, Tobias C.; Diesmann, Markus; Sakai, Ko; Fukai, Tomoki
2013-01-01
Directing attention to the spatial location or the distinguishing feature of a visual object modulates neuronal responses in the visual cortex and the stimulus discriminability of subjects. However, the spatial and feature-based modes of attention differently influence visual processing by changing the tuning properties of neurons. Intriguingly, neurons' tuning curves are modulated similarly across different visual areas under both these modes of attention. Here, we explored the mechanism underlying the effects of these two modes of visual attention on the orientation selectivity of visual cortical neurons. To do this, we developed a layered microcircuit model. This model describes multiple orientation-specific microcircuits sharing their receptive fields and consisting of layers 2/3, 4, 5, and 6. These microcircuits represent a functional grouping of cortical neurons and mutually interact via lateral inhibition and excitatory connections between groups with similar selectivity. The individual microcircuits receive bottom-up visual stimuli and top-down attention in different layers. A crucial assumption of the model is that feature-based attention activates orientation-specific microcircuits for the relevant feature selectively, whereas spatial attention activates all microcircuits homogeneously, irrespective of their orientation selectivity. Consequently, our model simultaneously accounts for the multiplicative scaling of neuronal responses in spatial attention and the additive modulations of orientation tuning curves in feature-based attention, which have been observed widely in various visual cortical areas. Simulations of the model predict contrasting differences between excitatory and inhibitory neurons in the two modes of attentional modulations. Furthermore, the model replicates the modulation of the psychophysical discriminability of visual stimuli in the presence of external noise. Our layered model with a biologically suggested laminar structure describes the basic circuit mechanism underlying the attention-mode specific modulations of neuronal responses and visual perception. PMID:24324628
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2017-01-01
Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Accident-related autopsy reports were obtained from one of the largest hospital in Kuala Lumpur. These reports belong to nine different accident-related causes of death. Master feature vector was prepared by extracting features from the collected autopsy reports by using unigram with lexical categorization. This master feature vector was used to detect cause of death [according to internal classification of disease version 10 (ICD-10) classification system] through five automated feature selection schemes, proposed expert-driven approach, five subset sizes of features, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under ROC curve. Four baselines were used to compare the results with the proposed system. Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measure approaching (85% to 90%) for most metrics by using a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in the overall accuracy compared with the existing techniques and four baselines. The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports. The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports.
Mujtaba, Ghulam; Shuib, Liyana; Raj, Ram Gopal; Rajandram, Retnagowri; Shaikh, Khairunisa; Al-Garadi, Mohammed Ali
2017-01-01
Objectives Widespread implementation of electronic databases has improved the accessibility of plaintext clinical information for supplementary use. Numerous machine learning techniques, such as supervised machine learning approaches or ontology-based approaches, have been employed to obtain useful information from plaintext clinical data. This study proposes an automatic multi-class classification system to predict accident-related causes of death from plaintext autopsy reports through expert-driven feature selection with supervised automatic text classification decision models. Methods Accident-related autopsy reports were obtained from one of the largest hospital in Kuala Lumpur. These reports belong to nine different accident-related causes of death. Master feature vector was prepared by extracting features from the collected autopsy reports by using unigram with lexical categorization. This master feature vector was used to detect cause of death [according to internal classification of disease version 10 (ICD-10) classification system] through five automated feature selection schemes, proposed expert-driven approach, five subset sizes of features, and five machine learning classifiers. Model performance was evaluated using precisionM, recallM, F-measureM, accuracy, and area under ROC curve. Four baselines were used to compare the results with the proposed system. Results Random forest and J48 decision models parameterized using expert-driven feature selection yielded the highest evaluation measure approaching (85% to 90%) for most metrics by using a feature subset size of 30. The proposed system also showed approximately 14% to 16% improvement in the overall accuracy compared with the existing techniques and four baselines. Conclusion The proposed system is feasible and practical to use for automatic classification of ICD-10-related cause of death from autopsy reports. The proposed system assists pathologists to accurately and rapidly determine underlying cause of death based on autopsy findings. Furthermore, the proposed expert-driven feature selection approach and the findings are generally applicable to other kinds of plaintext clinical reports. PMID:28166263
A universal deep learning approach for modeling the flow of patients under different severities.
Jiang, Shancheng; Chin, Kwai-Sang; Tsui, Kwok L
2018-02-01
The Accident and Emergency Department (A&ED) is the frontline for providing emergency care in hospitals. Unfortunately, relative A&ED resources have failed to keep up with continuously increasing demand in recent years, which leads to overcrowding in A&ED. Knowing the fluctuation of patient arrival volume in advance is a significant premise to relieve this pressure. Based on this motivation, the objective of this study is to explore an integrated framework with high accuracy for predicting A&ED patient flow under different triage levels, by combining a novel feature selection process with deep neural networks. Administrative data is collected from an actual A&ED and categorized into five groups based on different triage levels. A genetic algorithm (GA)-based feature selection algorithm is improved and implemented as a pre-processing step for this time-series prediction problem, in order to explore key features affecting patient flow. In our improved GA, a fitness-based crossover is proposed to maintain the joint information of multiple features during iterative process, instead of traditional point-based crossover. Deep neural networks (DNN) is employed as the prediction model to utilize their universal adaptability and high flexibility. In the model-training process, the learning algorithm is well-configured based on a parallel stochastic gradient descent algorithm. Two effective regularization strategies are integrated in one DNN framework to avoid overfitting. All introduced hyper-parameters are optimized efficiently by grid-search in one pass. As for feature selection, our improved GA-based feature selection algorithm has outperformed a typical GA and four state-of-the-art feature selection algorithms (mRMR, SAFS, VIFR, and CFR). As for the prediction accuracy of proposed integrated framework, compared with other frequently used statistical models (GLM, seasonal-ARIMA, ARIMAX, and ANN) and modern machine models (SVM-RBF, SVM-linear, RF, and R-LASSO), the proposed integrated "DNN-I-GA" framework achieves higher prediction accuracy on both MAPE and RMSE metrics in pairwise comparisons. The contribution of our study is two-fold. Theoretically, the traditional GA-based feature selection process is improved to have less hyper-parameters and higher efficiency, and the joint information of multiple features is maintained by fitness-based crossover operator. The universal property of DNN is further enhanced by merging different regularization strategies. Practically, features selected by our improved GA can be used to acquire an underlying relationship between patient flows and input features. Predictive values are significant indicators of patients' demand and can be used by A&ED managers to make resource planning and allocation. High accuracy achieved by the present framework in different cases enhances the reliability of downstream decision makings. Copyright © 2017 Elsevier B.V. All rights reserved.
MacLean, Mary H; Giesbrecht, Barry
2015-07-01
Task-relevant and physically salient features influence visual selective attention. In the present study, we investigated the influence of task-irrelevant and physically nonsalient reward-associated features on visual selective attention. Two hypotheses were tested: One predicts that the effects of target-defining task-relevant and task-irrelevant features interact to modulate visual selection; the other predicts that visual selection is determined by the independent combination of relevant and irrelevant feature effects. These alternatives were tested using a visual search task that contained multiple targets, placing a high demand on the need for selectivity, and that was data-limited and required unspeeded responses, emphasizing early perceptual selection processes. One week prior to the visual search task, participants completed a training task in which they learned to associate particular colors with a specific reward value. In the search task, the reward-associated colors were presented surrounding targets and distractors, but were neither physically salient nor task-relevant. In two experiments, the irrelevant reward-associated features influenced performance, but only when they were presented in a task-relevant location. The costs induced by the irrelevant reward-associated features were greater when they oriented attention to a target than to a distractor. In a third experiment, we examined the effects of selection history in the absence of reward history and found that the interaction between task relevance and selection history differed, relative to when the features had previously been associated with reward. The results indicate that under conditions that demand highly efficient perceptual selection, physically nonsalient task-irrelevant and task-relevant factors interact to influence visual selective attention.
Verleger, Rolf; Groen, Margriet; Heide, Wolfgang; Sobieralska, Kinga; Jaśkowski, Piotr
2008-05-01
We studied how physical and instructed embedding of features in gestalts affects perceptual selection. Four ovals on the horizontal midline were either unconnected or pairwise connected by circles, forming ears of left and right heads (gestalts). Relevant to responding was the position of one colored oval, either within its pair or relative to fixation ("object-based" or "fixation-based" instruction). Responses were faster under fixation- than object-based instruction, less so with gestalts. Previously reported increases of N1 when evoked by features within objects were replicated for fixation-based instruction only. There was no effect of instruction on N2pc. However P1 increased under the adequate instruction, object-based for gestalts, fixation-based for unconnected items, which presumably indicated how foci of attention were set by expecting specific stimuli under instructions that specified how to bind these stimuli to objects.
Oculomotor selection underlies feature retention in visual working memory.
Hanning, Nina M; Jonikaitis, Donatas; Deubel, Heiner; Szinte, Martin
2016-02-01
Oculomotor selection, spatial task relevance, and visual working memory (WM) are described as three processes highly intertwined and sustained by similar cortical structures. However, because task-relevant locations always constitute potential saccade targets, no study so far has been able to distinguish between oculomotor selection and spatial task relevance. We designed an experiment that allowed us to dissociate in humans the contribution of task relevance, oculomotor selection, and oculomotor execution to the retention of feature representations in WM. We report that task relevance and oculomotor selection lead to dissociable effects on feature WM maintenance. In a first task, in which an object's location was encoded as a saccade target, its feature representations were successfully maintained in WM, whereas they declined at nonsaccade target locations. Likewise, we observed a similar WM benefit at the target of saccades that were prepared but never executed. In a second task, when an object's location was marked as task relevant but constituted a nonsaccade target (a location to avoid), feature representations maintained at that location did not benefit. Combined, our results demonstrate that oculomotor selection is consistently associated with WM, whereas task relevance is not. This provides evidence for an overlapping circuitry serving saccade target selection and feature-based WM that can be dissociated from processes encoding task-relevant locations. Copyright © 2016 the American Physiological Society.
Daffner, Kirk R; Zhuravleva, Tatyana Y; Sun, Xue; Tarbi, Elise C; Haring, Anna E; Rentz, Dorene M; Holcomb, Phillip J
2012-02-01
Numerous studies have demonstrated that selective attention to color is associated with a larger neural response under attend than ignore conditions, but have not addressed whether this difference reflects enhanced activity under attend or suppressed activity under ignore. In this study, a color-neutral condition was included, which presented stimuli physically identical to those under attend and ignore conditions, but in which color was not task relevant. Attention to color did not modulate the early sensory-evoked P1 and N1 components. Traditional ERP markers of early selection (the anterior Selection Positivity and posterior Selection Negativity) did not differ between the attend and neutral conditions, arguing against a mechanism of enhanced activity. However, there were markedly reduced responses under the ignore relative to the neutral condition, consistent with the view that early selection mechanisms reflect suppression of neural activity under the ignore condition. Copyright © 2011 Elsevier B.V. All rights reserved.
Adaptive feature selection using v-shaped binary particle swarm optimization.
Teng, Xuyang; Dong, Hongbin; Zhou, Xiurong
2017-01-01
Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers.
Adaptive feature selection using v-shaped binary particle swarm optimization
Dong, Hongbin; Zhou, Xiurong
2017-01-01
Feature selection is an important preprocessing method in machine learning and data mining. This process can be used not only to reduce the amount of data to be analyzed but also to build models with stronger interpretability based on fewer features. Traditional feature selection methods evaluate the dependency and redundancy of features separately, which leads to a lack of measurement of their combined effect. Moreover, a greedy search considers only the optimization of the current round and thus cannot be a global search. To evaluate the combined effect of different subsets in the entire feature space, an adaptive feature selection method based on V-shaped binary particle swarm optimization is proposed. In this method, the fitness function is constructed using the correlation information entropy. Feature subsets are regarded as individuals in a population, and the feature space is searched using V-shaped binary particle swarm optimization. The above procedure overcomes the hard constraint on the number of features, enables the combined evaluation of each subset as a whole, and improves the search ability of conventional binary particle swarm optimization. The proposed algorithm is an adaptive method with respect to the number of feature subsets. The experimental results show the advantages of optimizing the feature subsets using the V-shaped transfer function and confirm the effectiveness and efficiency of the feature subsets obtained under different classifiers. PMID:28358850
Hadoux, Xavier; Kumar, Dinesh Kant; Sarossy, Marc G; Roger, Jean-Michel; Gorretta, Nathalie
2016-05-19
Visible and near-infrared (Vis-NIR) spectra are generated by the combination of numerous low resolution features. Spectral variables are thus highly correlated, which can cause problems for selecting the most appropriate ones for a given application. Some decomposition bases such as Fourier or wavelet generally help highlighting spectral features that are important, but are by nature constraint to have both positive and negative components. Thus, in addition to complicating the selected features interpretability, it impedes their use for application-dedicated sensors. In this paper we have proposed a new method for feature selection: Application-Dedicated Selection of Filters (ADSF). This method relaxes the shape constraint by enabling the selection of any type of user defined custom features. By considering only relevant features, based on the underlying nature of the data, high regularization of the final model can be obtained, even in the small sample size context often encountered in spectroscopic applications. For larger scale deployment of application-dedicated sensors, these predefined feature constraints can lead to application specific optical filters, e.g., lowpass, highpass, bandpass or bandstop filters with positive only coefficients. In a similar fashion to Partial Least Squares, ADSF successively selects features using covariance maximization and deflates their influences using orthogonal projection in order to optimally tune the selection to the data with limited redundancy. ADSF is well suited for spectroscopic data as it can deal with large numbers of highly correlated variables in supervised learning, even with many correlated responses. Copyright © 2016 Elsevier B.V. All rights reserved.
Fizzy: feature subset selection for metagenomics.
Ditzler, Gregory; Morrison, J Calvin; Lan, Yemin; Rosen, Gail L
2015-11-04
Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α- & β-diversity. Feature subset selection--a sub-field of machine learning--can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high-level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate between age groups in the human gut microbiome. We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tools capabilities on publicly available datasets. We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.
Fizzy. Feature subset selection for metagenomics
Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin; ...
2015-11-04
Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α– & β–diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high-level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate betweenmore » age groups in the human gut microbiome. Results: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tools capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.« less
NASA Astrophysics Data System (ADS)
Wang, Xiao; Burghardt, Dirk
2018-05-01
This paper presents a new strategy for the generalization of discrete area features by using stroke grouping method and polarization transportation selection. The mentioned stroke is constructed on derive of the refined proximity graph of area features, and the refinement is under the control of four constraints to meet different grouping requirements. The area features which belong to the same stroke are detected into the same group. The stroke-based strategy decomposes the generalization process into two sub-processes by judging whether the area features related to strokes or not. For the area features which belong to the same one stroke, they normally present a linear like pat-tern, and in order to preserve this kind of pattern, typification is chosen as the operator to implement the generalization work. For the remaining area features which are not related by strokes, they are still distributed randomly and discretely, and the selection is chosen to conduct the generalization operation. For the purpose of retaining their original distribution characteristic, a Polarization Transportation (PT) method is introduced to implement the selection operation. Buildings and lakes are selected as the representatives of artificial area feature and natural area feature respectively to take the experiments. The generalized results indicate that by adopting this proposed strategy, the original distribution characteristics of building and lake data can be preserved, and the visual perception is pre-served as before.
Wu, Haifeng; Sun, Tao; Wang, Jingjing; Li, Xia; Wang, Wei; Huo, Da; Lv, Pingxin; He, Wen; Wang, Keyang; Guo, Xiuhua
2013-08-01
The objective of this study was to investigate the method of the combination of radiological and textural features for the differentiation of malignant from benign solitary pulmonary nodules by computed tomography. Features including 13 gray level co-occurrence matrix textural features and 12 radiological features were extracted from 2,117 CT slices, which came from 202 (116 malignant and 86 benign) patients. Lasso-type regularization to a nonlinear regression model was applied to select predictive features and a BP artificial neural network was used to build the diagnostic model. Eight radiological and two textural features were obtained after the Lasso-type regularization procedure. Twelve radiological features alone could reach an area under the ROC curve (AUC) of 0.84 in differentiating between malignant and benign lesions. The 10 selected characters improved the AUC to 0.91. The evaluation results showed that the method of selecting radiological and textural features appears to yield more effective in the distinction of malignant from benign solitary pulmonary nodules by computed tomography.
Effect of processing on structural features of anodic aluminum oxides
NASA Astrophysics Data System (ADS)
Erdogan, Pembe; Birol, Yucel
2012-09-01
Morphological features of the anodic aluminum oxide (AAO) templates fabricated by electrochemical oxidation under different processing conditions were investigated. The selection of the polishing parameters does not appear to be critical as long as the aluminum substrate is polished adequately prior to the anodization process. AAO layers with a highly ordered pore distribution are obtained after anodizing in 0.6 M oxalic acid at 20 °C under 40 V for 5 minutes suggesting that the desired pore features are attained once an oxide layer develops on the surface. While the pore features are not affected much, the thickness of the AAO template increases with increasing anodization treatment time. Pore features are better and the AAO growth rate is higher at 20 °C than at 5 °C; higher under 45 V than under 40 V; higher with 0.6 M than with 0.3 M oxalic acid.
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak.
Donoho, David; Jin, Jiashun
2008-09-30
In important application fields today-genomics and proteomics are examples-selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, ..., p, let pi(i) denote the two-sided P-value associated with the ith feature Z-score and pi((i)) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p - pi((i)))/sqrt{i/p(1-i/p)}. We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT.
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
Donoho, David; Jin, Jiashun
2008-01-01
In important application fields today—genomics and proteomics are examples—selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For i = 1, 2, …, p, let πi denote the two-sided P-value associated with the ith feature Z-score and π(i) denote the ith order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective (i/p − π(i))/i/p(1−i/p). We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT. PMID:18815365
Das, Dev Kumar; Ghosh, Madhumala; Pal, Mallika; Maiti, Asok K; Chakraborty, Chandan
2013-02-01
The aim of this paper is to address the development of computer assisted malaria parasite characterization and classification using machine learning approach based on light microscopic images of peripheral blood smears. In doing this, microscopic image acquisition from stained slides, illumination correction and noise reduction, erythrocyte segmentation, feature extraction, feature selection and finally classification of different stages of malaria (Plasmodium vivax and Plasmodium falciparum) have been investigated. The erythrocytes are segmented using marker controlled watershed transformation and subsequently total ninety six features describing shape-size and texture of erythrocytes are extracted in respect to the parasitemia infected versus non-infected cells. Ninety four features are found to be statistically significant in discriminating six classes. Here a feature selection-cum-classification scheme has been devised by combining F-statistic, statistical learning techniques i.e., Bayesian learning and support vector machine (SVM) in order to provide the higher classification accuracy using best set of discriminating features. Results show that Bayesian approach provides the highest accuracy i.e., 84% for malaria classification by selecting 19 most significant features while SVM provides highest accuracy i.e., 83.5% with 9 most significant features. Finally, the performance of these two classifiers under feature selection framework has been compared toward malaria parasite classification. Copyright © 2012 Elsevier Ltd. All rights reserved.
A ROC-based feature selection method for computer-aided detection and diagnosis
NASA Astrophysics Data System (ADS)
Wang, Songyuan; Zhang, Guopeng; Liao, Qimei; Zhang, Junying; Jiao, Chun; Lu, Hongbing
2014-03-01
Image-based computer-aided detection and diagnosis (CAD) has been a very active research topic aiming to assist physicians to detect lesions and distinguish them from benign to malignant. However, the datasets fed into a classifier usually suffer from small number of samples, as well as significantly less samples available in one class (have a disease) than the other, resulting in the classifier's suboptimal performance. How to identifying the most characterizing features of the observed data for lesion detection is critical to improve the sensitivity and minimize false positives of a CAD system. In this study, we propose a novel feature selection method mR-FAST that combines the minimal-redundancymaximal relevance (mRMR) framework with a selection metric FAST (feature assessment by sliding thresholds) based on the area under a ROC curve (AUC) generated on optimal simple linear discriminants. With three feature datasets extracted from CAD systems for colon polyps and bladder cancer, we show that the space of candidate features selected by mR-FAST is more characterizing for lesion detection with higher AUC, enabling to find a compact subset of superior features at low cost.
Yu, Sheng; Liao, Katherine P; Shaw, Stanley Y; Gainer, Vivian S; Churchill, Susanne E; Szolovits, Peter; Murphy, Shawn N; Kohane, Isaac S; Cai, Tianxi
2015-09-01
Analysis of narrative (text) data from electronic health records (EHRs) can improve population-scale phenotyping for clinical and genetic research. Currently, selection of text features for phenotyping algorithms is slow and laborious, requiring extensive and iterative involvement by domain experts. This paper introduces a method to develop phenotyping algorithms in an unbiased manner by automatically extracting and selecting informative features, which can be comparable to expert-curated ones in classification accuracy. Comprehensive medical concepts were collected from publicly available knowledge sources in an automated, unbiased fashion. Natural language processing (NLP) revealed the occurrence patterns of these concepts in EHR narrative notes, which enabled selection of informative features for phenotype classification. When combined with additional codified features, a penalized logistic regression model was trained to classify the target phenotype. The authors applied our method to develop algorithms to identify patients with rheumatoid arthritis and coronary artery disease cases among those with rheumatoid arthritis from a large multi-institutional EHR. The area under the receiver operating characteristic curves (AUC) for classifying RA and CAD using models trained with automated features were 0.951 and 0.929, respectively, compared to the AUCs of 0.938 and 0.929 by models trained with expert-curated features. Models trained with NLP text features selected through an unbiased, automated procedure achieved comparable or slightly higher accuracy than those trained with expert-curated features. The majority of the selected model features were interpretable. The proposed automated feature extraction method, generating highly accurate phenotyping algorithms with improved efficiency, is a significant step toward high-throughput phenotyping. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Local curvature analysis for classifying breast tumors: Preliminary analysis in dedicated breast CT
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, Juhun, E-mail: leej15@upmc.edu; Nishikawa, Robert M.; Reiser, Ingrid
2015-09-15
Purpose: The purpose of this study is to measure the effectiveness of local curvature measures as novel image features for classifying breast tumors. Methods: A total of 119 breast lesions from 104 noncontrast dedicated breast computed tomography images of women were used in this study. Volumetric segmentation was done using a seed-based segmentation algorithm and then a triangulated surface was extracted from the resulting segmentation. Total, mean, and Gaussian curvatures were then computed. Normalized curvatures were used as classification features. In addition, traditional image features were also extracted and a forward feature selection scheme was used to select the optimalmore » feature set. Logistic regression was used as a classifier and leave-one-out cross-validation was utilized to evaluate the classification performances of the features. The area under the receiver operating characteristic curve (AUC, area under curve) was used as a figure of merit. Results: Among curvature measures, the normalized total curvature (C{sub T}) showed the best classification performance (AUC of 0.74), while the others showed no classification power individually. Five traditional image features (two shape, two margin, and one texture descriptors) were selected via the feature selection scheme and its resulting classifier achieved an AUC of 0.83. Among those five features, the radial gradient index (RGI), which is a margin descriptor, showed the best classification performance (AUC of 0.73). A classifier combining RGI and C{sub T} yielded an AUC of 0.81, which showed similar performance (i.e., no statistically significant difference) to the classifier with the above five traditional image features. Additional comparisons in AUC values between classifiers using different combinations of traditional image features and C{sub T} were conducted. The results showed that C{sub T} was able to replace the other four image features for the classification task. Conclusions: The normalized curvature measure contains useful information in classifying breast tumors. Using this, one can reduce the number of features in a classifier, which may result in more robust classifiers for different datasets.« less
Stabilizing l1-norm prediction models by supervised feature grouping.
Kamkar, Iman; Gupta, Sunil Kumar; Phung, Dinh; Venkatesh, Svetha
2016-02-01
Emerging Electronic Medical Records (EMRs) have reformed the modern healthcare. These records have great potential to be used for building clinical prediction models. However, a problem in using them is their high dimensionality. Since a lot of information may not be relevant for prediction, the underlying complexity of the prediction models may not be high. A popular way to deal with this problem is to employ feature selection. Lasso and l1-norm based feature selection methods have shown promising results. But, in presence of correlated features, these methods select features that change considerably with small changes in data. This prevents clinicians to obtain a stable feature set, which is crucial for clinical decision making. Grouping correlated variables together can improve the stability of feature selection, however, such grouping is usually not known and needs to be estimated for optimal performance. Addressing this problem, we propose a new model that can simultaneously learn the grouping of correlated features and perform stable feature selection. We formulate the model as a constrained optimization problem and provide an efficient solution with guaranteed convergence. Our experiments with both synthetic and real-world datasets show that the proposed model is significantly more stable than Lasso and many existing state-of-the-art shrinkage and classification methods. We further show that in terms of prediction performance, the proposed method consistently outperforms Lasso and other baselines. Our model can be used for selecting stable risk factors for a variety of healthcare problems, so it can assist clinicians toward accurate decision making. Copyright © 2015 Elsevier Inc. All rights reserved.
The optional selection of micro-motion feature based on Support Vector Machine
NASA Astrophysics Data System (ADS)
Li, Bo; Ren, Hongmei; Xiao, Zhi-he; Sheng, Jing
2017-11-01
Micro-motion form of target is multiple, different micro-motion forms are apt to be modulated, which makes it difficult for feature extraction and recognition. Aiming at feature extraction of cone-shaped objects with different micro-motion forms, this paper proposes the best selection method of micro-motion feature based on support vector machine. After the time-frequency distribution of radar echoes, comparing the time-frequency spectrum of objects with different micro-motion forms, features are extracted based on the differences between the instantaneous frequency variations of different micro-motions. According to the methods based on SVM (Support Vector Machine) features are extracted, then the best features are acquired. Finally, the result shows the method proposed in this paper is feasible under the test condition of certain signal-to-noise ratio(SNR).
Combined rule extraction and feature elimination in supervised classification.
Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E
2012-09-01
There are a vast number of biology related research problems involving a combination of multiple sources of data to achieve a better understanding of the underlying problems. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
NASA Astrophysics Data System (ADS)
Mishra, Aashwin; Iaccarino, Gianluca
2017-11-01
In spite of their deficiencies, RANS models represent the workhorse for industrial investigations into turbulent flows. In this context, it is essential to provide diagnostic measures to assess the quality of RANS predictions. To this end, the primary step is to identify feature importances amongst massive sets of potentially descriptive and discriminative flow features. This aids the physical interpretability of the resultant discrepancy model and its extensibility to similar problems. Recent investigations have utilized approaches such as Random Forests, Support Vector Machines and the Least Absolute Shrinkage and Selection Operator for feature selection. With examples, we exhibit how such methods may not be suitable for turbulent flow datasets. The underlying rationale, such as the correlation bias and the required conditions for the success of penalized algorithms, are discussed with illustrative examples. Finally, we provide alternate approaches using convex combinations of regularized regression approaches and randomized sub-sampling in combination with feature selection algorithms, to infer model structure from data. This research was supported by the Defense Advanced Research Projects Agency under the Enabling Quantification of Uncertainty in Physical Systems (EQUiPS) project (technical monitor: Dr Fariba Fahroo).
Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2012-04-01
The objective of this paper is to provide an improved technique, which can assist oncopathologists in correct screening of oral precancerous conditions specially oral submucous fibrosis (OSF) with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibres segmentation, its textural feature extraction and selection, screening perfomance enhancement under Gaussian transformation and finally classification. In this study, collagen fibres are segmented on R,G,B color channels using back-probagation neural network from 60 normal and 59 OSF histological images followed by histogram specification for reducing the stain intensity variation. Henceforth, textural features of collgen area are extracted using fractal approaches viz., differential box counting and brownian motion curve . Feature selection is done using Kullback-Leibler (KL) divergence criterion and the screening performance is evaluated based on various statistical tests to conform Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers viz., Bayesian classification and support vector machines (SVM) to classify normal and OSF. It is observed that SVM with linear kernel function provides better classification accuracy (91.64%) as compared to Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves Bayesian classifier's performance from 80.69% to 90.75%. Results are here studied and discussed.
Optical-Correlator Neural Network Based On Neocognitron
NASA Technical Reports Server (NTRS)
Chao, Tien-Hsin; Stoner, William W.
1994-01-01
Multichannel optical correlator implements shift-invariant, high-discrimination pattern-recognizing neural network based on paradigm of neocognitron. Selected as basic building block of this neural network because invariance under shifts is inherent advantage of Fourier optics included in optical correlators in general. Neocognitron is conceptual electronic neural-network model for recognition of visual patterns. Multilayer processing achieved by iteratively feeding back output of feature correlator to input spatial light modulator and updating Fourier filters. Neural network trained by use of characteristic features extracted from target images. Multichannel implementation enables parallel processing of large number of selected features.
Magnetic field feature extraction and selection for indoor location estimation.
Galván-Tejada, Carlos E; García-Vázquez, Juan Pablo; Brena, Ramon F
2014-06-20
User indoor positioning has been under constant improvement especially with the availability of new sensors integrated into the modern mobile devices, which allows us to exploit not only infrastructures made for everyday use, such as WiFi, but also natural infrastructure, as is the case of natural magnetic field. In this paper we present an extension and improvement of our current indoor localization model based on the feature extraction of 46 magnetic field signal features. The extension adds a feature selection phase to our methodology, which is performed through Genetic Algorithm (GA) with the aim of optimizing the fitness of our current model. In addition, we present an evaluation of the final model in two different scenarios: home and office building. The results indicate that performing a feature selection process allows us to reduce the number of signal features of the model from 46 to 5 regardless the scenario and room location distribution. Further, we verified that reducing the number of features increases the probability of our estimator correctly detecting the user's location (sensitivity) and its capacity to detect false positives (specificity) in both scenarios.
Tan, Maxine; Pu, Jiantao; Zheng, Bin
2014-01-01
Purpose: Improving radiologists’ performance in classification between malignant and benign breast lesions is important to increase cancer detection sensitivity and reduce false-positive recalls. For this purpose, developing computer-aided diagnosis (CAD) schemes has been attracting research interest in recent years. In this study, we investigated a new feature selection method for the task of breast mass classification. Methods: We initially computed 181 image features based on mass shape, spiculation, contrast, presence of fat or calcifications, texture, isodensity, and other morphological features. From this large image feature pool, we used a sequential forward floating selection (SFFS)-based feature selection method to select relevant features, and analyzed their performance using a support vector machine (SVM) model trained for the classification task. On a database of 600 benign and 600 malignant mass regions of interest (ROIs), we performed the study using a ten-fold cross-validation method. Feature selection and optimization of the SVM parameters were conducted on the training subsets only. Results: The area under the receiver operating characteristic curve (AUC) = 0.805±0.012 was obtained for the classification task. The results also showed that the most frequently-selected features by the SFFS-based algorithm in 10-fold iterations were those related to mass shape, isodensity and presence of fat, which are consistent with the image features frequently used by radiologists in the clinical environment for mass classification. The study also indicated that accurately computing mass spiculation features from the projection mammograms was difficult, and failed to perform well for the mass classification task due to tissue overlap within the benign mass regions. Conclusions: In conclusion, this comprehensive feature analysis study provided new and valuable information for optimizing computerized mass classification schemes that may have potential to be useful as a “second reader” in future clinical practice. PMID:24664267
ERIC Educational Resources Information Center
Hommuk, Karita; Bachmann, Talis
2009-01-01
The problem of feature binding has been examined under conditions of distributed attention or with spatially dispersed stimuli. We studied binding by asking whether selective attention to a feature of a masked object enables perceptual access to the other features of that object using conditions in which spatial attention was directed at a single…
The Cross-Entropy Based Multi-Filter Ensemble Method for Gene Selection.
Sun, Yingqiang; Lu, Chengbo; Li, Xiaobo
2018-05-17
The gene expression profile has the characteristics of a high dimension, low sample, and continuous type, and it is a great challenge to use gene expression profile data for the classification of tumor samples. This paper proposes a cross-entropy based multi-filter ensemble (CEMFE) method for microarray data classification. Firstly, multiple filters are used to select the microarray data in order to obtain a plurality of the pre-selected feature subsets with a different classification ability. The top N genes with the highest rank of each subset are integrated so as to form a new data set. Secondly, the cross-entropy algorithm is used to remove the redundant data in the data set. Finally, the wrapper method, which is based on forward feature selection, is used to select the best feature subset. The experimental results show that the proposed method is more efficient than other gene selection methods and that it can achieve a higher classification accuracy under fewer characteristic genes.
Early melanoma diagnosis with mobile imaging.
Do, Thanh-Toan; Zhou, Yiren; Zheng, Haitian; Cheung, Ngai-Man; Koh, Dawn
2014-01-01
We research a mobile imaging system for early diagnosis of melanoma. Different from previous work, we focus on smartphone-captured images, and propose a detection system that runs entirely on the smartphone. Smartphone-captured images taken under loosely-controlled conditions introduce new challenges for melanoma detection, while processing performed on the smartphone is subject to computation and memory constraints. To address these challenges, we propose to localize the skin lesion by combining fast skin detection and fusion of two fast segmentation results. We propose new features to capture color variation and border irregularity which are useful for smartphone-captured images. We also propose a new feature selection criterion to select a small set of good features used in the final lightweight system. Our evaluation confirms the effectiveness of proposed algorithms and features. In addition, we present our system prototype which computes selected visual features from a user-captured skin lesion image, and analyzes them to estimate the likelihood of malignance, all on an off-the-shelf smartphone.
Wang, Kung-Jeng; Makond, Bunjira; Wang, Kung-Min
2013-11-09
Breast cancer is one of the most critical cancers and is a major cause of cancer death among women. It is essential to know the survivability of the patients in order to ease the decision making process regarding medical treatment and financial preparation. Recently, the breast cancer data sets have been imbalanced (i.e., the number of survival patients outnumbers the number of non-survival patients) whereas the standard classifiers are not applicable for the imbalanced data sets. The methods to improve survivability prognosis of breast cancer need for study. Two well-known five-year prognosis models/classifiers [i.e., logistic regression (LR) and decision tree (DT)] are constructed by combining synthetic minority over-sampling technique (SMOTE), cost-sensitive classifier technique (CSC), under-sampling, bagging, and boosting. The feature selection method is used to select relevant variables, while the pruning technique is applied to obtain low information-burden models. These methods are applied on data obtained from the Surveillance, Epidemiology, and End Results database. The improvements of survivability prognosis of breast cancer are investigated based on the experimental results. Experimental results confirm that the DT and LR models combined with SMOTE, CSC, and under-sampling generate higher predictive performance consecutively than the original ones. Most of the time, DT and LR models combined with SMOTE and CSC use less informative burden/features when a feature selection method and a pruning technique are applied. LR is found to have better statistical power than DT in predicting five-year survivability. CSC is superior to SMOTE, under-sampling, bagging, and boosting to improve the prognostic performance of DT and LR.
2013-01-01
Background Breast cancer is one of the most critical cancers and is a major cause of cancer death among women. It is essential to know the survivability of the patients in order to ease the decision making process regarding medical treatment and financial preparation. Recently, the breast cancer data sets have been imbalanced (i.e., the number of survival patients outnumbers the number of non-survival patients) whereas the standard classifiers are not applicable for the imbalanced data sets. The methods to improve survivability prognosis of breast cancer need for study. Methods Two well-known five-year prognosis models/classifiers [i.e., logistic regression (LR) and decision tree (DT)] are constructed by combining synthetic minority over-sampling technique (SMOTE) ,cost-sensitive classifier technique (CSC), under-sampling, bagging, and boosting. The feature selection method is used to select relevant variables, while the pruning technique is applied to obtain low information-burden models. These methods are applied on data obtained from the Surveillance, Epidemiology, and End Results database. The improvements of survivability prognosis of breast cancer are investigated based on the experimental results. Results Experimental results confirm that the DT and LR models combined with SMOTE, CSC, and under-sampling generate higher predictive performance consecutively than the original ones. Most of the time, DT and LR models combined with SMOTE and CSC use less informative burden/features when a feature selection method and a pruning technique are applied. Conclusions LR is found to have better statistical power than DT in predicting five-year survivability. CSC is superior to SMOTE, under-sampling, bagging, and boosting to improve the prognostic performance of DT and LR. PMID:24207108
Putting Parameters in Their Proper Place
ERIC Educational Resources Information Center
Montrul, Silvina; Yoon, James
2009-01-01
Seeing the logical problem of second language acquisition as that of primarily selecting and re-assembling bundles of features anew, Lardiere proposes to dispense with the deductive learning approach and its broad range of consequences subsumed under the concept of parameters. While we agree that feature assembly captures more precisely the…
Liang, Ja-Der; Ping, Xiao-Ou; Tseng, Yi-Ju; Huang, Guan-Tarn; Lai, Feipei; Yang, Pei-Ming
2014-12-01
Recurrence of hepatocellular carcinoma (HCC) is an important issue despite effective treatments with tumor eradication. Identification of patients who are at high risk for recurrence may provide more efficacious screening and detection of tumor recurrence. The aim of this study was to develop recurrence predictive models for HCC patients who received radiofrequency ablation (RFA) treatment. From January 2007 to December 2009, 83 newly diagnosed HCC patients receiving RFA as their first treatment were enrolled. Five feature selection methods including genetic algorithm (GA), simulated annealing (SA) algorithm, random forests (RF) and hybrid methods (GA+RF and SA+RF) were utilized for selecting an important subset of features from a total of 16 clinical features. These feature selection methods were combined with support vector machine (SVM) for developing predictive models with better performance. Five-fold cross-validation was used to train and test SVM models. The developed SVM-based predictive models with hybrid feature selection methods and 5-fold cross-validation had averages of the sensitivity, specificity, accuracy, positive predictive value, negative predictive value, and area under the ROC curve as 67%, 86%, 82%, 69%, 90%, and 0.69, respectively. The SVM derived predictive model can provide suggestive high-risk recurrent patients, who should be closely followed up after complete RFA treatment. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
ERIC Educational Resources Information Center
Schmiemann, Philipp; Nehm, Ross H.; Tornabene, Robyn E.
2017-01-01
Understanding how situational features of assessment tasks impact reasoning is important for many educational pursuits, notably the selection of curricular examples to illustrate phenomena, the design of formative and summative assessment items, and determination of whether instruction has fostered the development of abstract schemas divorced from…
Thomas, Minta; De Brabanter, Kris; De Moor, Bart
2014-05-10
DNA microarrays are potentially powerful technology for improving diagnostic classification, treatment selection, and prognostic assessment. The use of this technology to predict cancer outcome has a history of almost a decade. Disease class predictors can be designed for known disease cases and provide diagnostic confirmation or clarify abnormal cases. The main input to this class predictors are high dimensional data with many variables and few observations. Dimensionality reduction of these features set significantly speeds up the prediction task. Feature selection and feature transformation methods are well known preprocessing steps in the field of bioinformatics. Several prediction tools are available based on these techniques. Studies show that a well tuned Kernel PCA (KPCA) is an efficient preprocessing step for dimensionality reduction, but the available bandwidth selection method for KPCA was computationally expensive. In this paper, we propose a new data-driven bandwidth selection criterion for KPCA, which is related to least squares cross-validation for kernel density estimation. We propose a new prediction model with a well tuned KPCA and Least Squares Support Vector Machine (LS-SVM). We estimate the accuracy of the newly proposed model based on 9 case studies. Then, we compare its performances (in terms of test set Area Under the ROC Curve (AUC) and computational time) with other well known techniques such as whole data set + LS-SVM, PCA + LS-SVM, t-test + LS-SVM, Prediction Analysis of Microarrays (PAM) and Least Absolute Shrinkage and Selection Operator (Lasso). Finally, we assess the performance of the proposed strategy with an existing KPCA parameter tuning algorithm by means of two additional case studies. We propose, evaluate, and compare several mathematical/statistical techniques, which apply feature transformation/selection for subsequent classification, and consider its application in medical diagnostics. Both feature selection and feature transformation perform well on classification tasks. Due to the dynamic selection property of feature selection, it is hard to define significant features for the classifier, which predicts classes of future samples. Moreover, the proposed strategy enjoys a distinctive advantage with its relatively lesser time complexity.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ditzler, Gregory; Morrison, J. Calvin; Lan, Yemin
Background: Some of the current software tools for comparative metagenomics provide ecologists with the ability to investigate and explore bacterial communities using α– & β–diversity. Feature subset selection – a sub-field of machine learning – can also provide a unique insight into the differences between metagenomic or 16S phenotypes. In particular, feature subset selection methods can obtain the operational taxonomic units (OTUs), or functional features, that have a high-level of influence on the condition being studied. For example, in a previous study we have used information-theoretic feature selection to understand the differences between protein family abundances that best discriminate betweenmore » age groups in the human gut microbiome. Results: We have developed a new Python command line tool, which is compatible with the widely adopted BIOM format, for microbial ecologists that implements information-theoretic subset selection methods for biological data formats. We demonstrate the software tools capabilities on publicly available datasets. Conclusions: We have made the software implementation of Fizzy available to the public under the GNU GPL license. The standalone implementation can be found at http://github.com/EESI/Fizzy.« less
Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Wismüller, Axel
2015-01-01
Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns.
Exploration of complex visual feature spaces for object perception
Leeds, Daniel D.; Pyles, John A.; Tarr, Michael J.
2014-01-01
The mid- and high-level visual properties supporting object perception in the ventral visual pathway are poorly understood. In the absence of well-specified theory, many groups have adopted a data-driven approach in which they progressively interrogate neural units to establish each unit's selectivity. Such methods are challenging in that they require search through a wide space of feature models and stimuli using a limited number of samples. To more rapidly identify higher-level features underlying human cortical object perception, we implemented a novel functional magnetic resonance imaging method in which visual stimuli are selected in real-time based on BOLD responses to recently shown stimuli. This work was inspired by earlier primate physiology work, in which neural selectivity for mid-level features in IT was characterized using a simple parametric approach (Hung et al., 2012). To extend such work to human neuroimaging, we used natural and synthetic object stimuli embedded in feature spaces constructed on the basis of the complex visual properties of the objects themselves. During fMRI scanning, we employed a real-time search method to control continuous stimulus selection within each image space. This search was designed to maximize neural responses across a pre-determined 1 cm3 brain region within ventral cortex. To assess the value of this method for understanding object encoding, we examined both the behavior of the method itself and the complex visual properties the method identified as reliably activating selected brain regions. We observed: (1) Regions selective for both holistic and component object features and for a variety of surface properties; (2) Object stimulus pairs near one another in feature space that produce responses at the opposite extremes of the measured activity range. Together, these results suggest that real-time fMRI methods may yield more widely informative measures of selectivity within the broad classes of visual features associated with cortical object representation. PMID:25309408
A Feature Selection Method Based on Fisher's Discriminant Ratio for Text Sentiment Classification
NASA Astrophysics Data System (ADS)
Wang, Suge; Li, Deyu; Wei, Yingjie; Li, Hongxia
With the rapid growth of e-commerce, product reviews on the Web have become an important information source for customers' decision making when they intend to buy some product. As the reviews are often too many for customers to go through, how to automatically classify them into different sentiment orientation categories (i.e. positive/negative) has become a research problem. In this paper, based on Fisher's discriminant ratio, an effective feature selection method is proposed for product review text sentiment classification. In order to validate the validity of the proposed method, we compared it with other methods respectively based on information gain and mutual information while support vector machine is adopted as the classifier. In this paper, 6 subexperiments are conducted by combining different feature selection methods with 2 kinds of candidate feature sets. Under 1006 review documents of cars, the experimental results indicate that the Fisher's discriminant ratio based on word frequency estimation has the best performance with F value 83.3% while the candidate features are the words which appear in both positive and negative texts.
Slippery liquid-infused porous surfaces having improved stability
DOE Office of Scientific and Technical Information (OSTI.GOV)
Aizenberg, Joanna; Vogel, Nicolas
Methods and articles disclosed herein relate to liquid repellant surfaces having selective wetting and transport properties. An article having a repellant surface includes a substrate comprising surface features with re-entrant curvature and an immobilized layer of lubricating liquid wetting over the surface features. The surface features with re-entrant curvature can be designed to provide high repellency even after failure or removal of the immobilized layer of lubricating liquid under certain operating conditions.
Bartsch, Mandy V; Loewe, Kristian; Merkel, Christian; Heinze, Hans-Jochen; Schoenfeld, Mircea A; Tsotsos, John K; Hopf, Jens-Max
2017-10-25
Attention can facilitate the selection of elementary object features such as color, orientation, or motion. This is referred to as feature-based attention and it is commonly attributed to a modulation of the gain and tuning of feature-selective units in visual cortex. Although gain mechanisms are well characterized, little is known about the cortical processes underlying the sharpening of feature selectivity. Here, we show with high-resolution magnetoencephalography in human observers (men and women) that sharpened selectivity for a particular color arises from feedback processing in the human visual cortex hierarchy. To assess color selectivity, we analyze the response to a color probe that varies in color distance from an attended color target. We find that attention causes an initial gain enhancement in anterior ventral extrastriate cortex that is coarsely selective for the target color and transitions within ∼100 ms into a sharper tuned profile in more posterior ventral occipital cortex. We conclude that attention sharpens selectivity over time by attenuating the response at lower levels of the cortical hierarchy to color values neighboring the target in color space. These observations support computational models proposing that attention tunes feature selectivity in visual cortex through backward-propagating attenuation of units less tuned to the target. SIGNIFICANCE STATEMENT Whether searching for your car, a particular item of clothing, or just obeying traffic lights, in everyday life, we must select items based on color. But how does attention allow us to select a specific color? Here, we use high spatiotemporal resolution neuromagnetic recordings to examine how color selectivity emerges in the human brain. We find that color selectivity evolves as a coarse to fine process from higher to lower levels within the visual cortex hierarchy. Our observations support computational models proposing that feature selectivity increases over time by attenuating the responses of less-selective cells in lower-level brain areas. These data emphasize that color perception involves multiple areas across a hierarchy of regions, interacting with each other in a complex, recursive manner. Copyright © 2017 the authors 0270-6474/17/3710346-12$15.00/0.
Genetic architecture and balancing selection: the life and death of differentiated variants.
Llaurens, Violaine; Whibley, Annabel; Joron, Mathieu
2017-05-01
Balancing selection describes any form of natural selection, which results in the persistence of multiple variants of a trait at intermediate frequencies within populations. By offering up a snapshot of multiple co-occurring functional variants and their interactions, systems under balancing selection can reveal the evolutionary mechanisms favouring the emergence and persistence of adaptive variation in natural populations. We here focus on the mechanisms by which several functional variants for a given trait can arise, a process typically requiring multiple epistatic mutations. We highlight how balancing selection can favour specific features in the genetic architecture and review the evolutionary and molecular mechanisms shaping this architecture. First, balancing selection affects the number of loci underlying differentiated traits and their respective effects. Control by one or few loci favours the persistence of differentiated functional variants by limiting intergenic recombination, or its impact, and may sometimes lead to the evolution of supergenes. Chromosomal rearrangements, particularly inversions, preventing adaptive combinations from being dissociated are increasingly being noted as features of such systems. Similarly, due to the frequency of heterozygotes maintained by balancing selection, dominance may be a key property of adaptive variants. High heterozygosity and limited recombination also influence associated genetic load, as linked recessive deleterious mutations may be sheltered. The capture of deleterious elements in a locus under balancing selection may reinforce polymorphism by further promoting heterozygotes. Finally, according to recent genomewide scans, balanced polymorphism might be more pervasive than generally thought. We stress the need for both functional and ecological studies to characterize the evolutionary mechanisms operating in these systems. © 2017 John Wiley & Sons Ltd.
Feature-based attentional modulation increases with stimulus separation in divided-attention tasks.
Sally, Sharon L; Vidnyánsky, Zoltán; Papathomas, Thomas V
2009-01-01
Attention modifies our visual experience by selecting certain aspects of a scene for further processing. It is therefore important to understand factors that govern the deployment of selective attention over the visual field. Both location and feature-specific mechanisms of attention have been identified and their modulatory effects can interact at a neural level (Treue and Martinez-Trujillo, 1999). The effects of spatial parameters on feature-based attentional modulation were examined for the feature dimensions of orientation, motion and color using three divided-attention tasks. Subjects performed concurrent discriminations of two briefly presented targets (Gabor patches) to the left and right of a central fixation point at eccentricities of +/-2.5 degrees , 5 degrees , 10 degrees and 15 degrees in the horizontal plane. Gabors were size-scaled to maintain consistent single-task performance across eccentricities. For all feature dimensions, the data show a linear increase in the attentional effects with target separation. In a control experiment, Gabors were presented on an isoeccentric viewing arc at 10 degrees and 15 degrees at the closest spatial separation (+/-2.5 degrees ) of the main experiment. Under these conditions, the effects of feature-based attentional effects were largely eliminated. Our results are consistent with the hypothesis that feature-based attention prioritizes the processing of attended features. Feature-based attentional mechanisms may have helped direct the attentional focus to the appropriate target locations at greater separations, whereas similar assistance may not have been necessary at closer target spacings. The results of the present study specify conditions under which dual-task performance benefits from sharing similar target features and may therefore help elucidate the processes by which feature-based attention operates.
Nagarajan, Mahesh B.; Coan, Paola; Huber, Markus B.; Diemoz, Paul C.; Wismüller, Axel
2015-01-01
Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns. PMID:25710875
An opinion formation based binary optimization approach for feature selection
NASA Astrophysics Data System (ADS)
Hamedmoghadam, Homayoun; Jalili, Mahdi; Yu, Xinghuo
2018-02-01
This paper proposed a novel optimization method based on opinion formation in complex network systems. The proposed optimization technique mimics human-human interaction mechanism based on a mathematical model derived from social sciences. Our method encodes a subset of selected features to the opinion of an artificial agent and simulates the opinion formation process among a population of agents to solve the feature selection problem. The agents interact using an underlying interaction network structure and get into consensus in their opinions, while finding better solutions to the problem. A number of mechanisms are employed to avoid getting trapped in local minima. We compare the performance of the proposed method with a number of classical population-based optimization methods and a state-of-the-art opinion formation based method. Our experiments on a number of high dimensional datasets reveal outperformance of the proposed algorithm over others.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Honorio, J.; Goldstein, R.; Honorio, J.
We propose a simple, well grounded classification technique which is suited for group classification on brain fMRI data sets that have high dimensionality, small number of subjects, high noise level, high subject variability, imperfect registration and capture subtle cognitive effects. We propose threshold-split region as a new feature selection method and majority voteas the classification technique. Our method does not require a predefined set of regions of interest. We use average acros ssessions, only one feature perexperimental condition, feature independence assumption, and simple classifiers. The seeming counter-intuitive approach of using a simple design is supported by signal processing and statisticalmore » theory. Experimental results in two block design data sets that capture brain function under distinct monetary rewards for cocaine addicted and control subjects, show that our method exhibits increased generalization accuracy compared to commonly used feature selection and classification techniques.« less
Zawbaa, Hossam M; Szlȩk, Jakub; Grosan, Crina; Jachowicz, Renata; Mendyk, Aleksander
2016-01-01
Poly-lactide-co-glycolide (PLGA) is a copolymer of lactic and glycolic acid. Drug release from PLGA microspheres depends not only on polymer properties but also on drug type, particle size, morphology of microspheres, release conditions, etc. Selecting a subset of relevant properties for PLGA is a challenging machine learning task as there are over three hundred features to consider. In this work, we formulate the selection of critical attributes for PLGA as a multiobjective optimization problem with the aim of minimizing the error of predicting the dissolution profile while reducing the number of attributes selected. Four bio-inspired optimization algorithms: antlion optimization, binary version of antlion optimization, grey wolf optimization, and social spider optimization are used to select the optimal feature set for predicting the dissolution profile of PLGA. Besides these, LASSO algorithm is also used for comparisons. Selection of crucial variables is performed under the assumption that both predictability and model simplicity are of equal importance to the final result. During the feature selection process, a set of input variables is employed to find minimum generalization error across different predictive models and their settings/architectures. The methodology is evaluated using predictive modeling for which various tools are chosen, such as Cubist, random forests, artificial neural networks (monotonic MLP, deep learning MLP), multivariate adaptive regression splines, classification and regression tree, and hybrid systems of fuzzy logic and evolutionary computations (fugeR). The experimental results are compared with the results reported by Szlȩk. We obtain a normalized root mean square error (NRMSE) of 15.97% versus 15.4%, and the number of selected input features is smaller, nine versus eleven.
Zawbaa, Hossam M.; Szlȩk, Jakub; Grosan, Crina; Jachowicz, Renata; Mendyk, Aleksander
2016-01-01
Poly-lactide-co-glycolide (PLGA) is a copolymer of lactic and glycolic acid. Drug release from PLGA microspheres depends not only on polymer properties but also on drug type, particle size, morphology of microspheres, release conditions, etc. Selecting a subset of relevant properties for PLGA is a challenging machine learning task as there are over three hundred features to consider. In this work, we formulate the selection of critical attributes for PLGA as a multiobjective optimization problem with the aim of minimizing the error of predicting the dissolution profile while reducing the number of attributes selected. Four bio-inspired optimization algorithms: antlion optimization, binary version of antlion optimization, grey wolf optimization, and social spider optimization are used to select the optimal feature set for predicting the dissolution profile of PLGA. Besides these, LASSO algorithm is also used for comparisons. Selection of crucial variables is performed under the assumption that both predictability and model simplicity are of equal importance to the final result. During the feature selection process, a set of input variables is employed to find minimum generalization error across different predictive models and their settings/architectures. The methodology is evaluated using predictive modeling for which various tools are chosen, such as Cubist, random forests, artificial neural networks (monotonic MLP, deep learning MLP), multivariate adaptive regression splines, classification and regression tree, and hybrid systems of fuzzy logic and evolutionary computations (fugeR). The experimental results are compared with the results reported by Szlȩk. We obtain a normalized root mean square error (NRMSE) of 15.97% versus 15.4%, and the number of selected input features is smaller, nine versus eleven. PMID:27315205
A PdAg bimetallic nanocatalyst for selective reductive amination of nitroarenes.
Li, Linsen; Niu, Zhiqiang; Cai, Shuangfei; Zhi, Yun; Li, Hao; Rong, Hongpan; Liu, Lichen; Liu, Lei; He, Wei; Li, Yadong
2013-08-07
Herein we have identified an optimal catalyst, Pd1Ag1.7, for the tandem reductive amination between nitroarenes and aldehydes (selectivity > 93%). Key to the success is the ability to control the compositions of the investigational Pd1-xAgx (x = 0-1) catalysts, as well as the clear composition dependent activity/selectivity trend observed in this study. This catalyst features a wide substrate scope, excellent recyclability, activity and selectivity under ambient conditions.
Zhang, Ao; Tian, Suyan
2018-05-01
Pathway-based feature selection algorithms, which utilize biological information contained in pathways to guide which features/genes should be selected, have evolved quickly and become widespread in the field of bioinformatics. Based on how the pathway information is incorporated, we classify pathway-based feature selection algorithms into three major categories-penalty, stepwise forward, and weighting. Compared to the first two categories, the weighting methods have been underutilized even though they are usually the simplest ones. In this article, we constructed three different genes' connectivity information-based weights for each gene and then conducted feature selection upon the resulting weighted gene expression profiles. Using both simulations and a real-world application, we have demonstrated that when the data-driven connectivity information constructed from the data of specific disease under study is considered, the resulting weighted gene expression profiles slightly outperform the original expression profiles. In summary, a big challenge faced by the weighting method is how to estimate pathway knowledge-based weights more accurately and precisely. Only until the issue is conquered successfully will wide utilization of the weighting methods be impossible. © 2017 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
ERIC Educational Resources Information Center
Andrews, Lucy S.; Watson, Derrick G.; Humphreys, Glyn W.; Braithwaite, Jason J.
2011-01-01
Evidence for inhibitory processes in visual search comes from studies using preview conditions, where responses to new targets are delayed if they carry a featural attribute belonging to the old distractor items that are currently being ignored--the negative carry-over effect (Braithwaite, Humphreys, & Hodsoll, 2003). We examined whether…
Allen, Cerisse E; Beldade, Patrícia; Zwaan, Bas J; Brakefield, Paul M
2008-03-26
There is spectacular morphological diversity in nature but lineages typically display a limited range of phenotypes. Because developmental processes generate the phenotypic variation that fuels natural selection, they are a likely source of evolutionary biases, facilitating some changes and limiting others. Although shifts in developmental regulation are associated with morphological differences between taxa, it is unclear how underlying mechanisms affect the rate and direction of evolutionary change within populations under selection. Here we focus on two ecologically relevant features of butterfly wing color patterns, eyespot size and color composition, which are similarly and strongly correlated across the serially repeated eyespots. Though these two characters show similar patterns of standing variation and covariation within a population, they differ in key features of their underlying development. We targeted pairs of eyespots with artificial selection for coordinated (concerted selection) versus independent (antagonistic selection) change in their color composition and size and compared evolutionary responses of the two color pattern characters. The two characters respond to selection in strikingly different ways despite initially similar patterns of variation in all directions present in the starting population. Size (determined by local properties of a diffusing inductive signal) evolves flexibly in all selected directions. However, color composition (determined by a tissue-level response to the signal concentration gradient) evolves only in the direction of coordinated change. There was no independent evolutionary change in the color composition of two eyespots in response to antagonistic selection. Moreover, these differences in the directions of short-term evolutionary change in eyespot size and color composition within a single species are consistent with the observed wing pattern diversity in the genus. Both characters respond rapidly to selection for coordinated change, but there are striking differences in their response to selection for antagonistic, independent change across eyespots. While many additional factors may contribute to both short- and long-term evolutionary response, we argue that the compartmentalization of developmental processes can influence the diversification of serial repeats such as butterfly eyespots, even under strong selection.
Lemaire, Patrick; Brun, Fleur
2014-10-01
Ageing results in the tendency of older adults to repeat the same strategy across consecutive problems more often than young adults, even when such strategy perseveration is not appropriate. Here, we examined how these age-related differences in strategy perseveration are modulated by response-stimulus intervals and problem characteristics. We asked participants to select the best strategy while accomplishing a computational estimation task (i.e., provide approximate sums to two-digit addition problems like 38 + 74). We found that participants repeated the same strategy across consecutive problems more often when the duration between their response and next problem display was short (300 ms) than when it was long (1300 ms). We also found more strategy perseverations in older than in young adults under short Response-Stimulus Intervals, but not under long Response-Stimulus Intervals. Finally, age-related differences in strategy perseveration decreased when problem features helped participants to select the best strategy. These modulations of age-related differences in strategy perseveration by response-stimulus intervals and characteristics of target problems are important for furthering our understanding of mechanisms underlying strategy perseveration and, more generally, ageing effects on strategy selection.
Feature selection gait-based gender classification under different circumstances
NASA Astrophysics Data System (ADS)
Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah
2014-05-01
This paper proposes a gender classification based on human gait features and investigates the problem of two variations: clothing (wearing coats) and carrying bag condition as addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying wavelet transform. Three different sets of feature are proposed in this method. First, Spatio-temporal distance that is dealing with the distance of different parts of the human body (like feet, knees, hand, Human Height and shoulder) during one gait cycle. The second and third feature sets are constructed from approximation and non-approximation coefficient of human body respectively. To extract these two sets of feature we divided the human body into two parts, upper and lower body part, based on the golden ratio proportion. In this paper, we have adopted a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize their discriminating significance. Finally k-Nearest Neighbor is applied as a classification method. Experimental results demonstrate that our approach is providing more realistic scenario and relatively better performance compared with the existing approaches.
Automated texture-based identification of ovarian cancer in confocal microendoscope images
NASA Astrophysics Data System (ADS)
Srivastava, Saurabh; Rodriguez, Jeffrey J.; Rouse, Andrew R.; Brewer, Molly A.; Gmitro, Arthur F.
2005-03-01
The fluorescence confocal microendoscope provides high-resolution, in-vivo imaging of cellular pathology during optical biopsy. There are indications that the examination of human ovaries with this instrument has diagnostic implications for the early detection of ovarian cancer. The purpose of this study was to develop a computer-aided system to facilitate the identification of ovarian cancer from digital images captured with the confocal microendoscope system. To achieve this goal, we modeled the cellular-level structure present in these images as texture and extracted features based on first-order statistics, spatial gray-level dependence matrices, and spatial-frequency content. Selection of the best features for classification was performed using traditional feature selection techniques including stepwise discriminant analysis, forward sequential search, a non-parametric method, principal component analysis, and a heuristic technique that combines the results of these methods. The best set of features selected was used for classification, and performance of various machine classifiers was compared by analyzing the areas under their receiver operating characteristic curves. The results show that it is possible to automatically identify patients with ovarian cancer based on texture features extracted from confocal microendoscope images and that the machine performance is superior to that of the human observer.
Sustained selective attention to competing amplitude-modulations in human auditory cortex.
Riecke, Lars; Scharke, Wolfgang; Valente, Giancarlo; Gutschalk, Alexander
2014-01-01
Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control.
Sustained Selective Attention to Competing Amplitude-Modulations in Human Auditory Cortex
Riecke, Lars; Scharke, Wolfgang; Valente, Giancarlo; Gutschalk, Alexander
2014-01-01
Auditory selective attention plays an essential role for identifying sounds of interest in a scene, but the neural underpinnings are still incompletely understood. Recent findings demonstrate that neural activity that is time-locked to a particular amplitude-modulation (AM) is enhanced in the auditory cortex when the modulated stream of sounds is selectively attended to under sensory competition with other streams. However, the target sounds used in the previous studies differed not only in their AM, but also in other sound features, such as carrier frequency or location. Thus, it remains uncertain whether the observed enhancements reflect AM-selective attention. The present study aims at dissociating the effect of AM frequency on response enhancement in auditory cortex by using an ongoing auditory stimulus that contains two competing targets differing exclusively in their AM frequency. Electroencephalography results showed a sustained response enhancement for auditory attention compared to visual attention, but not for AM-selective attention (attended AM frequency vs. ignored AM frequency). In contrast, the response to the ignored AM frequency was enhanced, although a brief trend toward response enhancement occurred during the initial 15 s. Together with the previous findings, these observations indicate that selective enhancement of attended AMs in auditory cortex is adaptive under sustained AM-selective attention. This finding has implications for our understanding of cortical mechanisms for feature-based attentional gain control. PMID:25259525
Incipient Fault Detection for Rolling Element Bearings under Varying Speed Conditions.
Xue, Lang; Li, Naipeng; Lei, Yaguo; Li, Ningbo
2017-06-20
Varying speed conditions bring a huge challenge to incipient fault detection of rolling element bearings because both the change of speed and faults could lead to the amplitude fluctuation of vibration signals. Effective detection methods need to be developed to eliminate the influence of speed variation. This paper proposes an incipient fault detection method for bearings under varying speed conditions. Firstly, relative residual (RR) features are extracted, which are insensitive to the varying speed conditions and are able to reflect the degradation trend of bearings. Then, a health indicator named selected negative log-likelihood probability (SNLLP) is constructed to fuse a feature set including RR features and non-dimensional features. Finally, based on the constructed SNLLP health indicator, a novel alarm trigger mechanism is designed to detect the incipient fault. The proposed method is demonstrated using vibration signals from bearing tests and industrial wind turbines. The results verify the effectiveness of the proposed method for incipient fault detection of rolling element bearings under varying speed conditions.
Incipient Fault Detection for Rolling Element Bearings under Varying Speed Conditions
Xue, Lang; Li, Naipeng; Lei, Yaguo; Li, Ningbo
2017-01-01
Varying speed conditions bring a huge challenge to incipient fault detection of rolling element bearings because both the change of speed and faults could lead to the amplitude fluctuation of vibration signals. Effective detection methods need to be developed to eliminate the influence of speed variation. This paper proposes an incipient fault detection method for bearings under varying speed conditions. Firstly, relative residual (RR) features are extracted, which are insensitive to the varying speed conditions and are able to reflect the degradation trend of bearings. Then, a health indicator named selected negative log-likelihood probability (SNLLP) is constructed to fuse a feature set including RR features and non-dimensional features. Finally, based on the constructed SNLLP health indicator, a novel alarm trigger mechanism is designed to detect the incipient fault. The proposed method is demonstrated using vibration signals from bearing tests and industrial wind turbines. The results verify the effectiveness of the proposed method for incipient fault detection of rolling element bearings under varying speed conditions. PMID:28773035
Feature-based attention to unconscious shapes and colors.
Schmidt, Filipp; Schmidt, Thomas
2010-08-01
Two experiments employed feature-based attention to modulate the impact of completely masked primes on subsequent pointing responses. Participants processed a color cue to select a pair of possible pointing targets out of multiple targets on the basis of their color, and then pointed to the one of those two targets with a prespecified shape. All target pairs were preceded by prime pairs triggering either the correct or the opposite response. The time interval between cue and primes was varied to modulate the time course of feature-based attentional selection. In a second experiment, the roles of color and shape were switched. Pointing trajectories showed large priming effects that were amplified by feature-based attention, indicating that attention modulated the earliest phases of motor output. Priming effects as well as their attentional modulation occurred even though participants remained unable to identify the primes, indicating distinct processes underlying visual awareness, attention, and response control.
Chen, Xi; Kopsaftopoulos, Fotis; Wu, Qi; Ren, He; Chang, Fu-Kuo
2018-04-29
In this work, a data-driven approach for identifying the flight state of a self-sensing wing structure with an embedded multi-functional sensing network is proposed. The flight state is characterized by the structural vibration signals recorded from a series of wind tunnel experiments under varying angles of attack and airspeeds. A large feature pool is created by extracting potential features from the signals covering the time domain, the frequency domain as well as the information domain. Special emphasis is given to feature selection in which a novel filter method is developed based on the combination of a modified distance evaluation algorithm and a variance inflation factor. Machine learning algorithms are then employed to establish the mapping relationship from the feature space to the practical state space. Results from two case studies demonstrate the high identification accuracy and the effectiveness of the model complexity reduction via the proposed method, thus providing new perspectives of self-awareness towards the next generation of intelligent air vehicles.
Park, Eunjeong; Chang, Hyuk-Jae; Nam, Hyo Suk
2017-04-18
The pronator drift test (PDT), a neurological examination, is widely used in clinics to measure motor weakness of stroke patients. The aim of this study was to develop a PDT tool with machine learning classifiers to detect stroke symptoms based on quantification of proximal arm weakness using inertial sensors and signal processing. We extracted features of drift and pronation from accelerometer signals of wearable devices on the inner wrists of 16 stroke patients and 10 healthy controls. Signal processing and feature selection approach were applied to discriminate PDT features used to classify stroke patients. A series of machine learning techniques, namely support vector machine (SVM), radial basis function network (RBFN), and random forest (RF), were implemented to discriminate stroke patients from controls with leave-one-out cross-validation. Signal processing by the PDT tool extracted a total of 12 PDT features from sensors. Feature selection abstracted the major attributes from the 12 PDT features to elucidate the dominant characteristics of proximal weakness of stroke patients using machine learning classification. Our proposed PDT classifiers had an area under the receiver operating characteristic curve (AUC) of .806 (SVM), .769 (RBFN), and .900 (RF) without feature selection, and feature selection improves the AUCs to .913 (SVM), .956 (RBFN), and .975 (RF), representing an average performance enhancement of 15.3%. Sensors and machine learning methods can reliably detect stroke signs and quantify proximal arm weakness. Our proposed solution will facilitate pervasive monitoring of stroke patients. ©Eunjeong Park, Hyuk-Jae Chang, Hyo Suk Nam. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 18.04.2017.
Deep Learning for Population Genetic Inference.
Sheehan, Sara; Song, Yun S
2016-03-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme.
Deep Learning for Population Genetic Inference
Sheehan, Sara; Song, Yun S.
2016-01-01
Given genomic variation data from multiple individuals, computing the likelihood of complex population genetic models is often infeasible. To circumvent this problem, we introduce a novel likelihood-free inference framework by applying deep learning, a powerful modern technique in machine learning. Deep learning makes use of multilayer neural networks to learn a feature-based function from the input (e.g., hundreds of correlated summary statistics of data) to the output (e.g., population genetic parameters of interest). We demonstrate that deep learning can be effectively employed for population genetic inference and learning informative features of data. As a concrete application, we focus on the challenging problem of jointly inferring natural selection and demography (in the form of a population size change history). Our method is able to separate the global nature of demography from the local nature of selection, without sequential steps for these two factors. Studying demography and selection jointly is motivated by Drosophila, where pervasive selection confounds demographic analysis. We apply our method to 197 African Drosophila melanogaster genomes from Zambia to infer both their overall demography, and regions of their genome under selection. We find many regions of the genome that have experienced hard sweeps, and fewer under selection on standing variation (soft sweep) or balancing selection. Interestingly, we find that soft sweeps and balancing selection occur more frequently closer to the centromere of each chromosome. In addition, our demographic inference suggests that previously estimated bottlenecks for African Drosophila melanogaster are too extreme. PMID:27018908
Information Assurance Tasks Supporting the Processing of Electronic Records Archives
2007-03-01
3 Table 2. OpenVPN evaluation results...........................................................................................10 iv 1...operation of necessary security features and compare the network performance under OpenVPN (openvpn.net) operation with the network performance under no...VPN operation (non-VPN) in a gigabit network environment. The reason for selecting OpenVPN product was based on the previous findings of Khanvilkar
Inferring the Mode of Selection from the Transient Response to Demographic Perturbations
NASA Astrophysics Data System (ADS)
Balick, Daniel; Do, Ron; Reich, David; Sunyaev, Shamil
2014-03-01
Despite substantial recent progress in theoretical population genetics, most models work under the assumption of a constant population size. Deviations from fixed population sizes are ubiquitous in natural populations, many of which experience population bottlenecks and re-expansions. The non-equilibrium dynamics introduced by a large perturbation in population size are generally viewed as a confounding factor. In the present work, we take advantage of the transient response to a population bottleneck to infer features of the mode of selection and the distribution of selective effects. We develop an analytic framework and a corresponding statistical test that qualitatively differentiates between alleles under additive and those under recessive or more general epistatic selection. This statistic can be used to bound the joint distribution of selective effects and dominance effects in any diploid sexual organism. We apply this technique to human population genetic data, and severely restrict the space of allowed selective coefficients in humans. Additionally, one can test a set of functionally or medically relevant alleles for the primary mode of selection, or determine the local regional variation in dominance coefficients along the genome.
NASA Astrophysics Data System (ADS)
Wang, Dong
2016-03-01
Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertize to explain gear fault signatures which is usually not easy to be achieved by ordinary users. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. The previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of redundant statistical features is conducted to obtain new significant statistical features. At last, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models including linear discriminant analysis, quadratic discriminant analysis, classification and regression tree and naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection of K-nearest neighbors are thoroughly investigated.
NASA Astrophysics Data System (ADS)
Wei, Jun; Zhou, Chuan; Chan, Heang-Ping; Chughtai, Aamer; Agarwal, Prachi; Kuriakose, Jean; Hadjiiski, Lubomir; Patel, Smita; Kazerooni, Ella
2015-03-01
We are developing a computer-aided detection system to assist radiologists in detection of non-calcified plaques (NCPs) in coronary CT angiograms (cCTA). In this study, we performed quantitative analysis of arterial flow properties in each vessel branch and extracted flow information to differentiate the presence and absence of stenosis in a vessel segment. Under rest conditions, blood flow in a single vessel branch was assumed to follow Poiseuille's law. For a uniform pressure distribution, two quantitative flow features, the normalized arterial compliance per unit length (Cu) and the normalized volumetric flow (Q) along the vessel centerline, were calculated based on the parabolic Poiseuille solution. The flow features were evaluated for a two-class classification task to differentiate NCP candidates obtained by prescreening as true NCPs and false positives (FPs) in cCTA. For evaluation, a data set of 83 cCTA scans was retrospectively collected from 83 patient files with IRB approval. A total of 118 NCPs were identified by experienced cardiothoracic radiologists. The correlation between the two flow features was 0.32. The discriminatory ability of the flow features evaluated as the area under the ROC curve (AUC) was 0.65 for Cu and 0.63 for Q in comparison with AUCs of 0.56-0.69 from our previous luminal features. With stepwise LDA feature selection, volumetric flow (Q) was selected in addition to three other luminal features. With FROC analysis, the test results indicated a reduction of the FP rates to 3.14, 1.98, and 1.32 FPs/scan at sensitivities of 90%, 80%, and 70%, respectively. The study indicated that quantitative blood flow analysis has the potential to provide useful features for the detection of NCPs in cCTA.
McLaren, Christine E.; Chen, Wen-Pin; Nie, Ke; Su, Min-Ying
2009-01-01
Rationale and Objectives Dynamic contrast enhanced MRI (DCE-MRI) is a clinical imaging modality for detection and diagnosis of breast lesions. Analytical methods were compared for diagnostic feature selection and performance of lesion classification to differentiate between malignant and benign lesions in patients. Materials and Methods The study included 43 malignant and 28 benign histologically-proven lesions. Eight morphological parameters, ten gray level co-occurrence matrices (GLCM) texture features, and fourteen Laws’ texture features were obtained using automated lesion segmentation and quantitative feature extraction. Artificial neural network (ANN) and logistic regression analysis were compared for selection of the best predictors of malignant lesions among the normalized features. Results Using ANN, the final four selected features were compactness, energy, homogeneity, and Law_LS, with area under the receiver operating characteristic curve (AUC) = 0.82, and accuracy = 0.76. The diagnostic performance of these 4-features computed on the basis of logistic regression yielded AUC = 0.80 (95% CI, 0.688 to 0.905), similar to that of ANN. The analysis also shows that the odds of a malignant lesion decreased by 48% (95% CI, 25% to 92%) for every increase of 1 SD in the Law_LS feature, adjusted for differences in compactness, energy, and homogeneity. Using logistic regression with z-score transformation, a model comprised of compactness, NRL entropy, and gray level sum average was selected, and it had the highest overall accuracy of 0.75 among all models, with AUC = 0.77 (95% CI, 0.660 to 0.880). When logistic modeling of transformations using the Box-Cox method was performed, the most parsimonious model with predictors, compactness and Law_LS, had an AUC of 0.79 (95% CI, 0.672 to 0.898). Conclusion The diagnostic performance of models selected by ANN and logistic regression was similar. The analytic methods were found to be roughly equivalent in terms of predictive ability when a small number of variables were chosen. The robust ANN methodology utilizes a sophisticated non-linear model, while logistic regression analysis provides insightful information to enhance interpretation of the model features. PMID:19409817
Tew, Min Wei; Nachtegaal, Maarten; Janousch, Markus; Huthwelker, Thomas; van Bokhoven, Jeroen A
2012-04-28
The catalytically active phase of silica-supported palladium catalysts in the selective and non-selective hydrogenation of 1-pentyne was determined using in situ X-ray absorption spectroscopy at the Pd K and L(3) edges. Upon exposure to alkyne, a palladium carbide-like phase rapidly forms, which prevents hydrogen to diffuse into the bulk of the nano-sized particles. Both selective and non-selective hydrogenation occur over carbided particles. The palladium carbide-like phase is stable under reaction conditions and only partially decomposes under high hydrogen partial pressure. Non-selective hydrogenation to pentane is not indicative of hydride formation. The palladium carbide phase was detected in the EXAFS analysis and the K edge XANES showed representative features. This journal is © the Owner Societies 2012
Phylogenetic divergence of cell biological features
2018-01-01
Most cellular features have a range of states, but understanding the mechanisms responsible for interspecific divergence is a challenge for evolutionary cell biology. Models are developed for the distribution of mean phenotypes likely to evolve under the joint forces of mutation and genetic drift in the face of constant selection pressures. Mean phenotypes will deviate from optimal states to a degree depending on the effective population size, potentially leading to substantial divergence in the absence of diversifying selection. The steady-state distribution for the mean can even be bimodal, with one domain being largely driven by selection and the other by mutation pressure, leading to the illusion of phenotypic shifts being induced by movement among alternative adaptive domains. These results raise questions as to whether lineage-specific selective pressures are necessary to account for interspecific divergence, providing a possible platform for the establishment of null models for the evolution of cell-biological traits. PMID:29927740
DOE Office of Scientific and Technical Information (OSTI.GOV)
Klement, Rainer J., E-mail: rainer_klement@gmx.de; Department of Radiotherapy and Radiation Oncology, Leopoldina Hospital, Schweinfurt; Allgäuer, Michael
2014-03-01
Background: Several prognostic factors for local tumor control probability (TCP) after stereotactic body radiation therapy (SBRT) for early stage non-small cell lung cancer (NSCLC) have been described, but no attempts have been undertaken to explore whether a nonlinear combination of potential factors might synergistically improve the prediction of local control. Methods and Materials: We investigated a support vector machine (SVM) for predicting TCP in a cohort of 399 patients treated at 13 German and Austrian institutions. Among 7 potential input features for the SVM we selected those most important on the basis of forward feature selection, thereby evaluating classifier performancemore » by using 10-fold cross-validation and computing the area under the ROC curve (AUC). The final SVM classifier was built by repeating the feature selection 10 times with different splitting of the data for cross-validation and finally choosing only those features that were selected at least 5 out of 10 times. It was compared with a multivariate logistic model that was built by forward feature selection. Results: Local failure occurred in 12% of patients. Biologically effective dose (BED) at the isocenter (BED{sub ISO}) was the strongest predictor of TCP in the logistic model and also the most frequently selected input feature for the SVM. A bivariate logistic function of BED{sub ISO} and the pulmonary function indicator forced expiratory volume in 1 second (FEV1) yielded the best description of the data but resulted in a significantly smaller AUC than the final SVM classifier with the input features BED{sub ISO}, age, baseline Karnofsky index, and FEV1 (0.696 ± 0.040 vs 0.789 ± 0.001, P<.03). The final SVM resulted in sensitivity and specificity of 67.0% ± 0.5% and 78.7% ± 0.3%, respectively. Conclusions: These results confirm that machine learning techniques like SVMs can be successfully applied to predict treatment outcome after SBRT. Improvements over traditional TCP modeling are expected through a nonlinear combination of multiple features, eventually helping in the task of personalized treatment planning.« less
Klement, Rainer J; Allgäuer, Michael; Appold, Steffen; Dieckmann, Karin; Ernst, Iris; Ganswindt, Ute; Holy, Richard; Nestle, Ursula; Nevinny-Stickel, Meinhard; Semrau, Sabine; Sterzing, Florian; Wittig, Andrea; Andratschke, Nicolaus; Guckenberger, Matthias
2014-03-01
Several prognostic factors for local tumor control probability (TCP) after stereotactic body radiation therapy (SBRT) for early stage non-small cell lung cancer (NSCLC) have been described, but no attempts have been undertaken to explore whether a nonlinear combination of potential factors might synergistically improve the prediction of local control. We investigated a support vector machine (SVM) for predicting TCP in a cohort of 399 patients treated at 13 German and Austrian institutions. Among 7 potential input features for the SVM we selected those most important on the basis of forward feature selection, thereby evaluating classifier performance by using 10-fold cross-validation and computing the area under the ROC curve (AUC). The final SVM classifier was built by repeating the feature selection 10 times with different splitting of the data for cross-validation and finally choosing only those features that were selected at least 5 out of 10 times. It was compared with a multivariate logistic model that was built by forward feature selection. Local failure occurred in 12% of patients. Biologically effective dose (BED) at the isocenter (BED(ISO)) was the strongest predictor of TCP in the logistic model and also the most frequently selected input feature for the SVM. A bivariate logistic function of BED(ISO) and the pulmonary function indicator forced expiratory volume in 1 second (FEV1) yielded the best description of the data but resulted in a significantly smaller AUC than the final SVM classifier with the input features BED(ISO), age, baseline Karnofsky index, and FEV1 (0.696 ± 0.040 vs 0.789 ± 0.001, P<.03). The final SVM resulted in sensitivity and specificity of 67.0% ± 0.5% and 78.7% ± 0.3%, respectively. These results confirm that machine learning techniques like SVMs can be successfully applied to predict treatment outcome after SBRT. Improvements over traditional TCP modeling are expected through a nonlinear combination of multiple features, eventually helping in the task of personalized treatment planning. Copyright © 2014 Elsevier Inc. All rights reserved.
Computer-aided Classification of Mammographic Masses Using Visually Sensitive Image Features
Wang, Yunzhi; Aghaei, Faranak; Zarafshani, Ali; Qiu, Yuchen; Qian, Wei; Zheng, Bin
2017-01-01
Purpose To develop a new computer-aided diagnosis (CAD) scheme that computes visually sensitive image features routinely used by radiologists to develop a machine learning classifier and distinguish between the malignant and benign breast masses detected from digital mammograms. Methods An image dataset including 301 breast masses was retrospectively selected. From each segmented mass region, we computed image features that mimic five categories of visually sensitive features routinely used by radiologists in reading mammograms. We then selected five optimal features in the five feature categories and applied logistic regression models for classification. A new CAD interface was also designed to show lesion segmentation, computed feature values and classification score. Results Areas under ROC curves (AUC) were 0.786±0.026 and 0.758±0.027 when to classify mass regions depicting on two view images, respectively. By fusing classification scores computed from two regions, AUC increased to 0.806±0.025. Conclusion This study demonstrated a new approach to develop CAD scheme based on 5 visually sensitive image features. Combining with a “visual aid” interface, CAD results may be much more easily explainable to the observers and increase their confidence to consider CAD generated classification results than using other conventional CAD approaches, which involve many complicated and visually insensitive texture features. PMID:27911353
Feature selection and classification of multiparametric medical images using bagging and SVM
NASA Astrophysics Data System (ADS)
Fan, Yong; Resnick, Susan M.; Davatzikos, Christos
2008-03-01
This paper presents a framework for brain classification based on multi-parametric medical images. This method takes advantage of multi-parametric imaging to provide a set of discriminative features for classifier construction by using a regional feature extraction method which takes into account joint correlations among different image parameters; in the experiments herein, MRI and PET images of the brain are used. Support vector machine classifiers are then trained based on the most discriminative features selected from the feature set. To facilitate robust classification and optimal selection of parameters involved in classification, in view of the well-known "curse of dimensionality", base classifiers are constructed in a bagging (bootstrap aggregating) framework for building an ensemble classifier and the classification parameters of these base classifiers are optimized by means of maximizing the area under the ROC (receiver operating characteristic) curve estimated from their prediction performance on left-out samples of bootstrap sampling. This classification system is tested on a sex classification problem, where it yields over 90% classification rates for unseen subjects. The proposed classification method is also compared with other commonly used classification algorithms, with favorable results. These results illustrate that the methods built upon information jointly extracted from multi-parametric images have the potential to perform individual classification with high sensitivity and specificity.
Text Classification for Assisting Moderators in Online Health Communities
Huh, Jina; Yetisgen-Yildiz, Meliha; Pratt, Wanda
2013-01-01
Objectives Patients increasingly visit online health communities to get help on managing health. The large scale of these online communities makes it impossible for the moderators to engage in all conversations; yet, some conversations need their expertise. Our work explores low-cost text classification methods to this new domain of determining whether a thread in an online health forum needs moderators’ help. Methods We employed a binary classifier on WebMD’s online diabetes community data. To train the classifier, we considered three feature types: (1) word unigram, (2) sentiment analysis features, and (3) thread length. We applied feature selection methods based on χ2 statistics and under sampling to account for unbalanced data. We then performed a qualitative error analysis to investigate the appropriateness of the gold standard. Results Using sentiment analysis features, feature selection methods, and balanced training data increased the AUC value up to 0.75 and the F1-score up to 0.54 compared to the baseline of using word unigrams with no feature selection methods on unbalanced data (0.65 AUC and 0.40 F1-score). The error analysis uncovered additional reasons for why moderators respond to patients’ posts. Discussion We showed how feature selection methods and balanced training data can improve the overall classification performance. We present implications of weighing precision versus recall for assisting moderators of online health communities. Our error analysis uncovered social, legal, and ethical issues around addressing community members’ needs. We also note challenges in producing a gold standard, and discuss potential solutions for addressing these challenges. Conclusion Social media environments provide popular venues in which patients gain health-related information. Our work contributes to understanding scalable solutions for providing moderators’ expertise in these large-scale, social media environments. PMID:24025513
Marafino, Ben J; Boscardin, W John; Dudley, R Adams
2015-04-01
Sparsity is often a desirable property of statistical models, and various feature selection methods exist so as to yield sparser and interpretable models. However, their application to biomedical text classification, particularly to mortality risk stratification among intensive care unit (ICU) patients, has not been thoroughly studied. To develop and characterize sparse classifiers based on the free text of nursing notes in order to predict ICU mortality risk and to discover text features most strongly associated with mortality. We selected nursing notes from the first 24h of ICU admission for 25,826 adult ICU patients from the MIMIC-II database. We then developed a pair of stochastic gradient descent-based classifiers with elastic-net regularization. We also studied the performance-sparsity tradeoffs of both classifiers as their regularization parameters were varied. The best-performing classifier achieved a 10-fold cross-validated AUC of 0.897 under the log loss function and full L2 regularization, while full L1 regularization used just 0.00025% of candidate input features and resulted in an AUC of 0.889. Using the log loss (range of AUCs 0.889-0.897) yielded better performance compared to the hinge loss (0.850-0.876), but the latter yielded even sparser models. Most features selected by both classifiers appear clinically relevant and correspond to predictors already present in existing ICU mortality models. The sparser classifiers were also able to discover a number of informative - albeit nonclinical - features. The elastic-net-regularized classifiers perform reasonably well and are capable of reducing the number of features required by over a thousandfold, with only a modest impact on performance. Copyright © 2015 Elsevier Inc. All rights reserved.
A new approach to develop computer-aided detection schemes of digital mammograms
NASA Astrophysics Data System (ADS)
Tan, Maxine; Qian, Wei; Pu, Jiantao; Liu, Hong; Zheng, Bin
2015-06-01
The purpose of this study is to develop a new global mammographic image feature analysis based computer-aided detection (CAD) scheme and evaluate its performance in detecting positive screening mammography examinations. A dataset that includes images acquired from 1896 full-field digital mammography (FFDM) screening examinations was used in this study. Among them, 812 cases were positive for cancer and 1084 were negative or benign. After segmenting the breast area, a computerized scheme was applied to compute 92 global mammographic tissue density based features on each of four mammograms of the craniocaudal (CC) and mediolateral oblique (MLO) views. After adding three existing popular risk factors (woman’s age, subjectively rated mammographic density, and family breast cancer history) into the initial feature pool, we applied a sequential forward floating selection feature selection algorithm to select relevant features from the bilateral CC and MLO view images separately. The selected CC and MLO view image features were used to train two artificial neural networks (ANNs). The results were then fused by a third ANN to build a two-stage classifier to predict the likelihood of the FFDM screening examination being positive. CAD performance was tested using a ten-fold cross-validation method. The computed area under the receiver operating characteristic curve was AUC = 0.779 ± 0.025 and the odds ratio monotonically increased from 1 to 31.55 as CAD-generated detection scores increased. The study demonstrated that this new global image feature based CAD scheme had a relatively higher discriminatory power to cue the FFDM examinations with high risk of being positive, which may provide a new CAD-cueing method to assist radiologists in reading and interpreting screening mammograms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yan, Shiju; Qian, Wei; Guan, Yubao
2016-06-15
Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLS patients by integrating oversampling, feature selection, and score fusion techniques and develop an optimal prediction model. Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3-yr disease-free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer-aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initiallymore » computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers and also tested fusion methods to combine QI and CB based prediction results. Results: Using a leave-one-case-out cross-validation (K-fold cross-validation) method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061, when using the QI and CB based classifiers, respectively. By fusion of the scores generated by the two classifiers, AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%. Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled RBFN based classifier to yield improved prediction accuracy.« less
Quantum-enhanced feature selection with forward selection and backward elimination
NASA Astrophysics Data System (ADS)
He, Zhimin; Li, Lvzhou; Huang, Zhiming; Situ, Haozhen
2018-07-01
Feature selection is a well-known preprocessing technique in machine learning, which can remove irrelevant features to improve the generalization capability of a classifier and reduce training and inference time. However, feature selection is time-consuming, particularly for the applications those have thousands of features, such as image retrieval, text mining and microarray data analysis. It is crucial to accelerate the feature selection process. We propose a quantum version of wrapper-based feature selection, which converts a classical feature selection to its quantum counterpart. It is valuable for machine learning on quantum computer. In this paper, we focus on two popular kinds of feature selection methods, i.e., wrapper-based forward selection and backward elimination. The proposed feature selection algorithm can quadratically accelerate the classical one.
NASA Technical Reports Server (NTRS)
Bishop, Janice L.; Pieters, Carle M.
1995-01-01
Infrared reflectance spectra of carefully selected Mars soil analog materials have been measured under low atmospheric pressures and temperatures. Chemically altered montmorillonites containing ferrihydrite and hydrated ferric sulfate complexes are examined, as well as synthetic ferrihydrate and a palagonitic soil from Haleakala, Maui. Reflectance spectra of these analog materials exhibit subtle visible to near-infrared features, which are indicative of nanophase ferric oxides or oxyhydroxides and are similar to features observed in the spectra of the bright regions of Mars. Infrared reflectance spectra of these analogs include hydration features due to structural OH, bound H2O and adsorbed H2O. The spectal character of these hydration features is highly dependent on the sample environment and on the nature of the H2O/OH in the analogs. The behavior of the hydration features near 1.9 micrometers, 2.2 micrometers, 2.7 micrometers, 3 micrometers, and 6 micrometers are reported here in spetra measured under Marslike atmospheric environment. In spectra of these analogs measured under dry Earth atmospheric conditions the 1.9-micrometer band depth is 8-17%; this band is much stonger under moist conditions. Under Marslike atmospheric conditions the 1.9-micrometer feature is broad and barely discernible (1-3% band depth) in spectra of the ferrihydrite and palagonitic soil samples. In comparable spectra of the ferric sulfate-bearing montmorillonite the 1.9-micrometer feature is also broad, but stronger (6% band depth). In the low atmospheric pressure and temperature spectra of the ferrihydrite-bearing montmorillonite this feature is sharper than the other analogs and relatively stronger (6% band depth). Although the intensity of the 3- micrometer band is weaker in spectra of each of the analogs when measured under Marslike conditions, the 3-micromter band remains a dominant feature and is especially broad in spectra of the ferrihydrite and palagonitic soil. The structural OH features observed in these materials at 2.2-2.3 micrometers and 2.27 micrometers remain largely unaffected by the environmental conditions. A shift in the Christiansen feature towards shorter wavelengths has also been observed with decreasing atmospheric pressure and temperature in the midinfrared spectra of these samples.
NASA Technical Reports Server (NTRS)
Bishop, Janice L.; Pieters, Carle M.
1995-01-01
Infrared reflectance spectra of carefully selected Mars soil analog materials have been measured under low atmospheric pressures and temperatures. Chemically altered montmorillonites containing ferrihydrite and hydrated ferric sulfate complexes are examined, as well as synthetic ferrihydrite and a palagonitic soil from Haleakala, Maui. Reflectance spectra of these analog materials exhibit subtle visible to near-infrared features, which are indicative of nanophase ferric oxides or oxyhydroxides and are similar to features observed in the spectra of the bright regions of Mars. Infrared reflectance spectra of these analogs include hydration features due to structural OH, bound H2O, and adsorbed H2O. The spectral character of these hydration features is highly dependent on the sample environment and on the nature of the H2O/OH in the analogs. The behavior of the hydration features near 1.9 micron, 2.2 micron, 2.7 micron, 3 micron, and 6 microns are reported here in spectra measured under a Marslike atmospheric environment. In spectra of these analogs measured under dry Earth atmospheric conditions the 1.9-micron band depth is 8-17%; this band is much stronger under moist conditions. Under Marslike atmospheric conditions the 1.9-micron feature is broad and barely discernible (1-3% band depth) in spectra of the ferrihydrite and palagonitic soil samples. In comparable spectra of the ferric sulfate-bearing montmorillonite the 1.9-micron feature is also broad, but stronger (6% band depth). In the low atmospheric pressure and temperature spectra of the ferrihydrite-bearing montmorillonite this feature is sharper than the other analogs and relatively stronger (6% band depth). Although the intensity of the 3-micron band is weaker in spectra of each of the analogs when measured under Marslike conditions, the 3-micron band remains a dominant feature and is especially broad in spectra of the ferrihydrite and palagonitic soil. The structural OH features observed in these materials at 2.2-2.3 micron and 2.75 microns remain largely unaffected by the environmental conditions. A shift in the Christiansen feature towards shorter wavelengths has also been observed with decreasing atmospheric pressure and temperature in the midinfrared spectra of these samples.
Specific excitatory connectivity for feature integration in mouse primary visual cortex
Molina-Luna, Patricia; Roth, Morgane M.
2017-01-01
Local excitatory connections in mouse primary visual cortex (V1) are stronger and more prevalent between neurons that share similar functional response features. However, the details of how functional rules for local connectivity shape neuronal responses in V1 remain unknown. We hypothesised that complex responses to visual stimuli may arise as a consequence of rules for selective excitatory connectivity within the local network in the superficial layers of mouse V1. In mouse V1 many neurons respond to overlapping grating stimuli (plaid stimuli) with highly selective and facilitatory responses, which are not simply predicted by responses to single gratings presented alone. This complexity is surprising, since excitatory neurons in V1 are considered to be mainly tuned to single preferred orientations. Here we examined the consequences for visual processing of two alternative connectivity schemes: in the first case, local connections are aligned with visual properties inherited from feedforward input (a ‘like-to-like’ scheme specifically connecting neurons that share similar preferred orientations); in the second case, local connections group neurons into excitatory subnetworks that combine and amplify multiple feedforward visual properties (a ‘feature binding’ scheme). By comparing predictions from large scale computational models with in vivo recordings of visual representations in mouse V1, we found that responses to plaid stimuli were best explained by assuming feature binding connectivity. Unlike under the like-to-like scheme, selective amplification within feature-binding excitatory subnetworks replicated experimentally observed facilitatory responses to plaid stimuli; explained selective plaid responses not predicted by grating selectivity; and was consistent with broad anatomical selectivity observed in mouse V1. Our results show that visual feature binding can occur through local recurrent mechanisms without requiring feedforward convergence, and that such a mechanism is consistent with visual responses and cortical anatomy in mouse V1. PMID:29240769
Using multiscale texture and density features for near-term breast cancer risk analysis
Sun, Wenqing; Tseng, Tzu-Liang (Bill); Qian, Wei; Zhang, Jianying; Saltzstein, Edward C.; Zheng, Bin; Lure, Fleming; Yu, Hui; Zhou, Shi
2015-01-01
Purpose: To help improve efficacy of screening mammography by eventually establishing a new optimal personalized screening paradigm, the authors investigated the potential of using the quantitative multiscale texture and density feature analysis of digital mammograms to predict near-term breast cancer risk. Methods: The authors’ dataset includes digital mammograms acquired from 340 women. Among them, 141 were positive and 199 were negative/benign cases. The negative digital mammograms acquired from the “prior” screening examinations were used in the study. Based on the intensity value distributions, five subregions at different scales were extracted from each mammogram. Five groups of features, including density and texture features, were developed and calculated on every one of the subregions. Sequential forward floating selection was used to search for the effective combinations. Using the selected features, a support vector machine (SVM) was optimized using a tenfold validation method to predict the risk of each woman having image-detectable cancer in the next sequential mammography screening. The area under the receiver operating characteristic curve (AUC) was used as the performance assessment index. Results: From a total number of 765 features computed from multiscale subregions, an optimal feature set of 12 features was selected. Applying this feature set, a SVM classifier yielded performance of AUC = 0.729 ± 0.021. The positive predictive value was 0.657 (92 of 140) and the negative predictive value was 0.755 (151 of 200). Conclusions: The study results demonstrated a moderately high positive association between risk prediction scores generated by the quantitative multiscale mammographic image feature analysis and the actual risk of a woman having an image-detectable breast cancer in the next subsequent examinations. PMID:26127038
Decorrelation of the true and estimated classifier errors in high-dimensional settings.
Hanczar, Blaise; Hua, Jianping; Dougherty, Edward R
2007-01-01
The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity which refers to the precision of error estimation is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular, the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated, so that natural questions arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three error estimators commonly used (leave-one-out cross-validation, k-fold cross-validation, and .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) known-feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We will observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection or using all features, with the better correlation between the latter two showing no general trend, but differing for different models.
NASA Astrophysics Data System (ADS)
Aad, G.; Abajyan, T.; Abbott, B.; Abdallah, J.; Khalek, S. Abdel; Abdinov, O.; Aben, R.; Abi, B.; Abolins, M.; AbouZeid, O. S.; Abramowicz, H.; Abreu, H.; Abulaiti, Y.; Acharya, B. S.; Adamczyk, L.; Adams, D. L.; Addy, T. N.; Adelman, J.; Adomeit, S.; Adye, T.; Agatonovic-Jovin, T.; Aguilar-Saavedra, J. A.; Agustoni, M.; Ahlen, S. P.; Ahmadov, F.; Aielli, G.; Åkesson, T. P. A.; Akimoto, G.; Akimov, A. V.; Albert, J.; Albrand, S.; Verzini, M. J. Alconada; Aleksa, M.; Aleksandrov, I. N.; Alexa, C.; Alexander, G.; Alexandre, G.; Alexopoulos, T.; Alhroob, M.; Alimonti, G.; Alio, L.; Alison, J.; Allbrooke, B. M. M.; Allison, L. J.; Allport, P. P.; Allwood-Spiers, S. E.; Almond, J.; Aloisio, A.; Alon, R.; Alonso, A.; Alonso, F.; Alpigiani, C.; Altheimer, A.; Gonzalez, B. Alvarez; Alviggi, M. G.; Amako, K.; Coutinho, Y. Amaral; Amelung, C.; Amidei, D.; Ammosov, V. V.; Santos, S. P. Amor Dos; Amorim, A.; Amoroso, S.; Amram, N.; Amundsen, G.; Anastopoulos, C.; Ancu, L. S.; Andari, N.; Andeen, T.; Anders, C. F.; Anders, G.; Anderson, K. J.; Andreazza, A.; Andrei, V.; Anduaga, X. S.; Angelidakis, S.; Anger, P.; Angerami, A.; Anghinolfi, F.; Anisenkov, A. V.; Anjos, N.; Annovi, A.; Antonaki, A.; Antonelli, M.; Antonov, A.; Antos, J.; Anulli, F.; Aoki, M.; Bella, L. Aperio; Apolle, R.; Arabidze, G.; Aracena, I.; Arai, Y.; Araque, J. P.; Arce, A. T. H.; Arguin, J.-F.; Argyropoulos, S.; Arik, M.; Armbruster, A. J.; Arnaez, O.; Arnal, V.; Arslan, O.; Artamonov, A.; Artoni, G.; Asai, S.; Asbah, N.; Ashkenazi, A.; Ask, S.; Åsman, B.; Asquith, L.; Assamagan, K.; Astalos, R.; Atkinson, M.; Atlay, N. B.; Auerbach, B.; Auge, E.; Augsten, K.; Aurousseau, M.; Avolio, G.; Azuelos, G.; Azuma, Y.; Baak, M. A.; Bacci, C.; Bach, A. M.; Bachacou, H.; Bachas, K.; Backes, M.; Backhaus, M.; Mayes, J. Backus; Badescu, E.; Bagiacchi, P.; Bagnaia, P.; Bai, Y.; Bailey, D. C.; Bain, T.; Baines, J. T.; Baker, O. K.; Baker, S.; Balek, P.; Balli, F.; Banas, E.; Banerjee, Sw.; Bangert, A.; Bannoura, A. A. E.; Bansal, V.; Bansil, H. S.; Barak, L.; Baranov, S. P.; Barber, T.; Barberio, E. L.; Barberis, D.; Barbero, M.; Barillari, T.; Barisonzi, M.; Barklow, T.; Barlow, N.; Barnett, B. M.; Barnett, R. M.; Barnovska, Z.; Baroncelli, A.; Barone, G.; Barr, A. J.; Barreiro, F.; da Costa, J. Barreiro Guimarães; Bartoldus, R.; Barton, A. E.; Bartos, P.; Bartsch, V.; Bassalat, A.; Basye, A.; Bates, R. L.; Batkova, L.; Batley, J. R.; Battistin, M.; Bauer, F.; Bawa, H. S.; Beau, T.; Beauchemin, P. H.; Beccherle, R.; Bechtle, P.; Beck, H. P.; Becker, K.; Becker, S.; Beckingham, M.; Becot, C.; Beddall, A. J.; Beddall, A.; Bedikian, S.; Bednyakov, V. A.; Bee, C. P.; Beemster, L. J.; Beermann, T. A.; Begel, M.; Behr, K.; Belanger-Champagne, C.; Bell, P. J.; Bell, W. H.; Bella, G.; Bellagamba, L.; Bellerive, A.; Bellomo, M.; Belloni, A.; Belotskiy, K.; Beltramello, O.; Benary, O.; Benchekroun, D.; Bendtz, K.; Benekos, N.; Benhammou, Y.; Noccioli, E. Benhar; Garcia, J. A. Benitez; Benjamin, D. P.; Bensinger, J. R.; Benslama, K.; Bentvelsen, S.; Berge, D.; Kuutmann, E. Bergeaas; Berger, N.; Berghaus, F.; Berglund, E.; Beringer, J.; Bernard, C.; Bernat, P.; Bernius, C.; Bernlochner, F. U.; Berry, T.; Berta, P.; Bertella, C.; Bertolucci, F.; Besana, M. I.; Besjes, G. J.; Bessidskaia, O.; Besson, N.; Betancourt, C.; Bethke, S.; Bhimji, W.; Bianchi, R. M.; Bianchini, L.; Bianco, M.; Biebel, O.; Bieniek, S. P.; Bierwagen, K.; Biesiada, J.; Biglietti, M.; De Mendizabal, J. Bilbao; Bilokon, H.; Bindi, M.; Binet, S.; Bingul, A.; Bini, C.; Black, C. W.; Black, J. E.; Black, K. M.; Blackburn, D.; Blair, R. E.; Blanchard, J.-B.; Blazek, T.; Bloch, I.; Blocker, C.; Blum, W.; Blumenschein, U.; Bobbink, G. J.; Bobrovnikov, V. S.; Bocchetta, S. S.; Bocci, A.; Boddy, C. R.; Boehler, M.; Boek, J.; Boek, T. T.; Bogaerts, J. A.; Bogdanchikov, A. G.; Bogouch, A.; Bohm, C.; Bohm, J.; Boisvert, V.; Bold, T.; Boldea, V.; Boldyrev, A. S.; Bolnet, N. M.; Bomben, M.; Bona, M.; Boonekamp, M.; Borisov, A.; Borissov, G.; Borri, M.; Borroni, S.; Bortfeldt, J.; Bortolotto, V.; Bos, K.; Boscherini, D.; Bosman, M.; Boterenbrood, H.; Boudreau, J.; Bouffard, J.; Bouhova-Thacker, E. V.; Boumediene, D.; Bourdarios, C.; Bousson, N.; Boutouil, S.; Boveia, A.; Boyd, J.; Boyko, I. R.; Bozovic-Jelisavcic, I.; Bracinik, J.; Branchini, P.; Brandt, A.; Brandt, G.; Brandt, O.; Bratzler, U.; Brau, B.; Brau, J. E.; Braun, H. M.; Brazzale, S. F.; Brelier, B.; Brendlinger, K.; Brennan, A. J.; Brenner, R.; Bressler, S.; Bristow, K.; Bristow, T. M.; Britton, D.; Brochu, F. M.; Brock, I.; Brock, R.; Bromberg, C.; Bronner, J.; Brooijmans, G.; Brooks, T.; Brooks, W. K.; Brosamer, J.; Brost, E.; Brown, G.; Brown, J.; Renstrom, P. A. Bruckman de; Bruncko, D.; Bruneliere, R.; Brunet, S.; Bruni, A.; Bruni, G.; Bruschi, M.; Bryngemark, L.; Buanes, T.; Buat, Q.; Bucci, F.; Buchholz, P.; Buckingham, R. M.; Buckley, A. G.; Buda, S. I.; Budagov, I. A.; Buehrer, F.; Bugge, L.; Bugge, M. K.; Bulekov, O.; Bundock, A. C.; Burckhart, H.; Burdin, S.; Burghgrave, B.; Burke, S.; Burmeister, I.; Busato, E.; Büscher, V.; Bussey, P.; Buszello, C. P.; Butler, B.; Butler, J. M.; Butt, A. I.; Buttar, C. M.; Butterworth, J. M.; Butti, P.; Buttinger, W.; Buzatu, A.; Byszewski, M.; Urbán, S. Cabrera; Caforio, D.; Cakir, O.; Calafiura, P.; Calderini, G.; Calfayan, P.; Calkins, R.; Caloba, L. P.; Calvet, D.; Calvet, S.; Toro, R. Camacho; Camarda, S.; Cameron, D.; Caminada, L. M.; Armadans, R. Caminal; Campana, S.; Campanelli, M.; Campoverde, A.; Canale, V.; Canepa, A.; Cantero, J.; Cantrill, R.; Cao, T.; Garrido, M. D. M. Capeans; Caprini, I.; Caprini, M.; Capua, M.; Caputo, R.; Cardarelli, R.; Carli, T.; Carlino, G.; Carminati, L.; Caron, S.; Carquin, E.; Carrillo-Montoya, G. D.; Carter, J. R.; Carvalho, J.; Casadei, D.; Casado, M. P.; Castaneda-Miranda, E.; Castelli, A.; Gimenez, V. Castillo; Castro, N. F.; Catastini, P.; Catinaccio, A.; Catmore, J. R.; Cattai, A.; Cattani, G.; Caughron, S.; Cavaliere, V.; Cavalli, D.; Cavalli-Sforza, M.; Cavasinni, V.; Ceradini, F.; Cerio, B.; Cerny, K.; Cerqueira, A. S.; Cerri, A.; Cerrito, L.; Cerutti, F.; Cerv, M.; Cervelli, A.; Cetin, S. A.; Chafaq, A.; Chakraborty, D.; Chalupkova, I.; Chan, K.; Chang, P.; Chapleau, B.; Chapman, J. D.; Charfeddine, D.; Charlton, D. G.; Chau, C. C.; Barajas, C. A. Chavez; Cheatham, S.; Chegwidden, A.; Chekanov, S.; Chekulaev, S. V.; Chelkov, G. A.; Chelstowska, M. A.; Chen, C.; Chen, H.; Chen, K.; Chen, L.; Chen, S.; Chen, X.; Chen, Y.; Cheng, H. C.; Cheng, Y.; Cheplakov, A.; El Moursli, R. Cherkaoui; Chernyatin, V.; Cheu, E.; Chevalier, L.; Chiarella, V.; Chiefari, G.; Childers, J. T.; Chilingarov, A.; Chiodini, G.; Chisholm, A. S.; Chislett, R. T.; Chitan, A.; Chizhov, M. V.; Chouridou, S.; Chow, B. K. B.; Christidi, I. A.; Chromek-Burckhart, D.; Chu, M. L.; Chudoba, J.; Chytka, L.; Ciapetti, G.; Ciftci, A. K.; Ciftci, R.; Cinca, D.; Cindro, V.; Ciocio, A.; Cirkovic, P.; Citron, Z. H.; Citterio, M.; Ciubancan, M.; Clark, A.; Clark, P. J.; Clarke, R. N.; Cleland, W.; Clemens, J. C.; Clement, B.; Clement, C.; Coadou, Y.; Cobal, M.; Coccaro, A.; Cochran, J.; Coffey, L.; Cogan, J. G.; Coggeshall, J.; Cole, B.; Cole, S.; Colijn, A. P.; Collins-Tooth, C.; Collot, J.; Colombo, T.; Colon, G.; Compostella, G.; Muiño, P. Conde; Coniavitis, E.; Conidi, M. C.; Connell, S. H.; Connelly, I. A.; Consonni, S. M.; Consorti, V.; Constantinescu, S.; Conta, C.; Conti, G.; Conventi, F.; Cooke, M.; Cooper, B. D.; Cooper-Sarkar, A. M.; Cooper-Smith, N. J.; Copic, K.; Cornelissen, T.; Corradi, M.; Corriveau, F.; Corso-Radu, A.; Cortes-Gonzalez, A.; Cortiana, G.; Costa, G.; Costa, M. J.; Costanzo, D.; Côté, D.; Cottin, G.; Cowan, G.; Cox, B. E.; Cranmer, K.; Cree, G.; Crépé-Renaudin, S.; Crescioli, F.; Ortuzar, M. Crispin; Cristinziani, M.; Crosetti, G.; Cuciuc, C.-M.; Donszelmann, T. Cuhadar; Cummings, J.; Curatolo, M.; Cuthbert, C.; Czirr, H.; Czodrowski, P.; Czyczula, Z.; D'Auria, S.; D'Onofrio, M.; Da Cunha Sargedas De Sousa, M. J.; Da Via, C.; Dabrowski, W.; Dafinca, A.; Dai, T.; Dale, O.; Dallaire, F.; Dallapiccola, C.; Dam, M.; Daniells, A. C.; Hoffmann, M. Dano; Dao, V.; Darbo, G.; Darlea, G. L.; Darmora, S.; Dassoulas, J. A.; Davey, W.; David, C.; Davidek, T.; Davies, E.; Davies, M.; Davignon, O.; Davison, A. R.; Davison, P.; Davygora, Y.; Dawe, E.; Dawson, I.; Daya-Ishmukhametova, R. K.; De, K.; de Asmundis, R.; De Castro, S.; De Cecco, S.; de Graat, J.; De Groot, N.; de Jong, P.; De La Taille, C.; De la Torre, H.; De Lorenzi, F.; De Nooij, L.; De Pedis, D.; De Salvo, A.; De Sanctis, U.; De Santo, A.; De Vivie De Regie, J. B.; De Zorzi, G.; Dearnaley, W. J.; Debbe, R.; Debenedetti, C.; Dechenaux, B.; Dedovich, D. V.; Degenhardt, J.; Deigaard, I.; Del Peso, J.; Del Prete, T.; Deliot, F.; Deliyergiyev, M.; Dell'Acqua, A.; Dell'Asta, L.; Dell'Orso, M.; Della Pietra, M.; della Volpe, D.; Delmastro, M.; Delsart, P. A.; Deluca, C.; Demers, S.; Demichev, M.; Demilly, A.; Denisov, S. P.; Derendarz, D.; Derkaoui, J. E.; Derue, F.; Dervan, P.; Desch, K.; Deterre, C.; Deviveiros, P. O.; Dewhurst, A.; Dhaliwal, S.; Di Ciaccio, A.; Di Ciaccio, L.; Di Domenico, A.; Di Donato, C.; Di Girolamo, A.; Di Girolamo, B.; Di Mattia, A.; Di Micco, B.; Di Nardo, R.; Di Simone, A.; Di Sipio, R.; Di Valentino, D.; Diaz, M. A.; Diehl, E. B.; Dietrich, J.; Dietzsch, T. A.; Diglio, S.; Dimitrievska, A.; Dingfelder, J.; Dionisi, C.; Dita, P.; Dita, S.; Dittus, F.; Djama, F.; Djobava, T.; do Vale, M. A. B.; Do Valle Wemans, A.; Doan, T. K. O.; Dobos, D.; Dobson, E.; Doglioni, C.; Doherty, T.; Dohmae, T.; Dolejsi, J.; Dolezal, Z.; Dolgoshein, B. A.; Donadelli, M.; Donati, S.; Dondero, P.; Donini, J.; Dopke, J.; Doria, A.; Dova, M. T.; Doyle, A. T.; Dris, M.; Dubbert, J.; Dube, S.; Dubreuil, E.; Duchovni, E.; Duckeck, G.; Ducu, O. A.; Duda, D.; Dudarev, A.; Dudziak, F.; Duflot, L.; Duguid, L.; Dührssen, M.; Dunford, M.; Yildiz, H. Duran; Düren, M.; Durglishvili, A.; Dwuznik, M.; Dyndal, M.; Ebke, J.; Edson, W.; Edwards, N. C.; Ehrenfeld, W.; Eifert, T.; Eigen, G.; Einsweiler, K.; Ekelof, T.; El Kacimi, M.; Ellert, M.; Elles, S.; Ellinghaus, F.; Ellis, N.; Elmsheuser, J.; Elsing, M.; Emeliyanov, D.; Enari, Y.; Endner, O. C.; Endo, M.; Engelmann, R.; Erdmann, J.; Ereditato, A.; Eriksson, D.; Ernis, G.; Ernst, J.; Ernst, M.; Ernwein, J.; Errede, D.; Errede, S.; Ertel, E.; Escalier, M.; Esch, H.; Escobar, C.; Esposito, B.; Etienvre, A. I.; Etzion, E.; Evans, H.; Fabbri, L.; Facini, G.; Fakhrutdinov, R. M.; Falciano, S.; Faltova, J.; Fang, Y.; Fanti, M.; Farbin, A.; Farilla, A.; Farooque, T.; Farrell, S.; Farrington, S. M.; Farthouat, P.; Fassi, F.; Fassnacht, P.; Fassouliotis, D.; Favareto, A.; Fayard, L.; Federic, P.; Fedin, O. L.; Fedorko, W.; Fehling-Kaschek, M.; Feigl, S.; Feligioni, L.; Feng, C.; Feng, E. J.; Feng, H.; Fenyuk, A. B.; Perez, S. Fernandez; Fernando, W.; Ferrag, S.; Ferrando, J.; Ferrara, V.; Ferrari, A.; Ferrari, P.; Ferrari, R.; Ferreira de Lima, D. E.; Ferrer, A.; Ferrere, D.; Ferretti, C.; Parodi, A. Ferretto; Fiascaris, M.; Fiedler, F.; Filipčič, A.; Filipuzzi, M.; Filthaut, F.; Fincke-Keeler, M.; Finelli, K. D.; Fiolhais, M. C. N.; Fiorini, L.; Firan, A.; Fischer, J.; Fisher, M. J.; Fisher, W. C.; Fitzgerald, E. A.; Flechl, M.; Fleck, I.; Fleischmann, P.; Fleischmann, S.; Fletcher, G. T.; Fletcher, G.; Flick, T.; Floderus, A.; Castillo, L. R. Flores; Bustos, A. C. Florez; Flowerdew, M. J.; Formica, A.; Forti, A.; Fortin, D.; Fournier, D.; Fox, H.; Fracchia, S.; Francavilla, P.; Franchini, M.; Franchino, S.; Francis, D.; Franklin, M.; Franz, S.; Fraternali, M.; French, S. T.; Friedrich, C.; Friedrich, F.; Froidevaux, D.; Frost, J. A.; Fukunaga, C.; Torregrosa, E. Fullana; Fulsom, B. G.; Fuster, J.; Gabaldon, C.; Gabizon, O.; Gabrielli, A.; Gabrielli, A.; Gadatsch, S.; Gadomski, S.; Gagliardi, G.; Gagnon, P.; Galea, C.; Galhardo, B.; Gallas, E. J.; Gallo, V.; Gallop, B. J.; Gallus, P.; Galster, G.; Gan, K. K.; Gandrajula, R. P.; Gao, J.; Gao, Y. S.; Walls, F. M. Garay; Garberson, F.; García, C.; Navarro, J. E. García; Garcia-Sciveres, M.; Gardner, R. W.; Garelli, N.; Garonne, V.; Gatti, C.; Gaudio, G.; Gaur, B.; Gauthier, L.; Gauzzi, P.; Gavrilenko, I. L.; Gay, C.; Gaycken, G.; Gazis, E. N.; Ge, P.; Gecse, Z.; Gee, C. N. P.; Geerts, D. A. A.; Geich-Gimbel, Ch.; Gellerstedt, K.; Gemme, C.; Gemmell, A.; Genest, M. H.; Gentile, S.; George, M.; George, S.; Gerbaudo, D.; Gershon, A.; Ghazlane, H.; Ghodbane, N.; Giacobbe, B.; Giagu, S.; Giangiobbe, V.; Giannetti, P.; Gianotti, F.; Gibbard, B.; Gibson, S. M.; Gilchriese, M.; Gillam, T. P. S.; Gillberg, D.; Gingrich, D. M.; Giokaris, N.; Giordani, M. P.; Giordano, R.; Giorgi, F. M.; Giraud, P. F.; Giugni, D.; Giuliani, C.; Giulini, M.; Gjelsten, B. K.; Gkialas, I.; Gladilin, L. K.; Glasman, C.; Glatzer, J.; Glaysher, P. C. F.; Glazov, A.; Glonti, G. L.; Goblirsch-Kolb, M.; Goddard, J. R.; Godfrey, J.; Godlewski, J.; Goeringer, C.; Goldfarb, S.; Golling, T.; Golubkov, D.; Gomes, A.; Fajardo, L. S. Gomez; Gonçalo, R.; Da Costa, J. Goncalves Pinto Firmino; Gonella, L.; de la Hoz, S. González; Parra, G. Gonzalez; Silva, M. L. Gonzalez; Gonzalez-Sevilla, S.; Goossens, L.; Gorbounov, P. A.; Gordon, H. A.; Gorelov, I.; Gorini, B.; Gorini, E.; Gorišek, A.; Gornicki, E.; Goshaw, A. T.; Gössling, C.; Gostkin, M. I.; Gouighri, M.; Goujdami, D.; Goulette, M. P.; Goussiou, A. G.; Goy, C.; Gozpinar, S.; Grabas, H. M. X.; Graber, L.; Grabowska-Bold, I.; Grafström, P.; Grahn, K.-J.; Gramling, J.; Gramstad, E.; Grancagnolo, F.; Grancagnolo, S.; Grassi, V.; Gratchev, V.; Gray, H. M.; Graziani, E.; Grebenyuk, O. G.; Greenwood, Z. D.; Gregersen, K.; Gregor, I. M.; Grenier, P.; Griffiths, J.; Grillo, A. A.; Grimm, K.; Grinstein, S.; Gris, Ph.; Grishkevich, Y. V.; Grivaz, J.-F.; Grohs, J. P.; Grohsjean, A.; Gross, E.; Grosse-Knetter, J.; Grossi, G. C.; Groth-Jensen, J.; Grout, Z. J.; Grybel, K.; Guan, L.; Guescini, F.; Guest, D.; Gueta, O.; Guicheney, C.; Guido, E.; Guillemin, T.; Guindon, S.; Gul, U.; Gumpert, C.; Gunther, J.; Guo, J.; Gupta, S.; Gutierrez, P.; Ortiz, N. G. Gutierrez; Gutschow, C.; Guttman, N.; Guyot, C.; Gwenlan, C.; Gwilliam, C. B.; Haas, A.; Haber, C.; Hadavand, H. K.; Haddad, N.; Haefner, P.; Hageböck, S.; Hajduk, Z.; Hakobyan, H.; Haleem, M.; Hall, D.; Halladjian, G.; Hamacher, K.; Hamal, P.; Hamano, K.; Hamer, M.; Hamilton, A.; Hamilton, S.; Hamnett, P. G.; Han, L.; Hanagaki, K.; Hanawa, K.; Hance, M.; Hanke, P.; Hansen, J. B.; Hansen, J. D.; Hansen, P. H.; Hara, K.; Hard, A. S.; Harenberg, T.; Harkusha, S.; Harper, D.; Harrington, R. D.; Harris, O. M.; Harrison, P. F.; Hartjes, F.; Harvey, A.; Hasegawa, S.; Hasegawa, Y.; Hasib, A.; Hassani, S.; Haug, S.; Hauschild, M.; Hauser, R.; Havranek, M.; Hawkes, C. M.; Hawkings, R. J.; Hawkins, A. D.; Hayashi, T.; Hayden, D.; Hays, C. P.; Hayward, H. S.; Haywood, S. J.; Head, S. J.; Heck, T.; Hedberg, V.; Heelan, L.; Heim, S.; Heim, T.; Heinemann, B.; Heinrich, L.; Heisterkamp, S.; Hejbal, J.; Helary, L.; Heller, C.; Heller, M.; Hellman, S.; Hellmich, D.; Helsens, C.; Henderson, J.; Henderson, R. C. W.; Hengler, C.; Henrichs, A.; Correia, A. M. Henriques; Henrot-Versille, S.; Hensel, C.; Herbert, G. H.; Jiménez, Y. Hernández; Herrberg-Schubert, R.; Herten, G.; Hertenberger, R.; Hervas, L.; Hesketh, G. G.; Hessey, N. P.; Hickling, R.; Higón-Rodriguez, E.; Hill, J. C.; Hiller, K. H.; Hillert, S.; Hillier, S. J.; Hinchliffe, I.; Hines, E.; Hirose, M.; Hirschbuehl, D.; Hobbs, J.; Hod, N.; Hodgkinson, M. C.; Hodgson, P.; Hoecker, A.; Hoeferkamp, M. R.; Hoffman, J.; Hoffmann, D.; Hofmann, J. I.; Hohlfeld, M.; Holmes, T. R.; Hong, T. M.; van Huysduynen, L. Hooft; Hostachy, J.-Y.; Hou, S.; Hoummada, A.; Howard, J.; Howarth, J.; Hrabovsky, M.; Hristova, I.; Hrivnac, J.; Hryn'ova, T.; Hsu, P. J.; Hsu, S.-C.; Hu, D.; Hu, X.; Huang, Y.; Hubacek, Z.; Hubaut, F.; Huegging, F.; Huffman, T. B.; Hughes, E. W.; Hughes, G.; Huhtinen, M.; Hülsing, T. A.; Hurwitz, M.; Huseynov, N.; Huston, J.; Huth, J.; Iacobucci, G.; Iakovidis, G.; Ibragimov, I.; Iconomidou-Fayard, L.; Ideal, E.; Iengo, P.; Igonkina, O.; Iizawa, T.; Ikegami, Y.; Ikematsu, K.; Ikeno, M.; Iliadis, D.; Ilic, N.; Inamaru, Y.; Ince, T.; Ioannou, P.; Iodice, M.; Iordanidou, K.; Ippolito, V.; Quiles, A. Irles; Isaksson, C.; Ishino, M.; Ishitsuka, M.; Ishmukhametov, R.; Issever, C.; Istin, S.; Ponce, J. M. Iturbe; Ivashin, A. V.; Iwanski, W.; Iwasaki, H.; Izen, J. M.; Izzo, V.; Jackson, B.; Jackson, J. N.; Jackson, M.; Jackson, P.; Jaekel, M. R.; Jain, V.; Jakobs, K.; Jakobsen, S.; Jakoubek, T.; Jakubek, J.; Jamin, D. O.; Jana, D. K.; Jansen, E.; Jansen, H.; Janssen, J.; Janus, M.; Jarlskog, G.; Javůrek, T.; Jeanty, L.; Jeng, G.-Y.; Jennens, D.; Jenni, P.; Jentzsch, J.; Jeske, C.; Jézéquel, S.; Ji, H.; Ji, W.; Jia, J.; Jiang, Y.; Belenguer, M. Jimenez; Jin, S.; Jinaru, A.; Jinnouchi, O.; Joergensen, M. D.; Johansson, K. E.; Johansson, P.; Johns, K. A.; Jon-And, K.; Jones, G.; Jones, R. W. L.; Jones, T. J.; Jongmanns, J.; Jorge, P. M.; Joshi, K. D.; Jovicevic, J.; Ju, X.; Jung, C. A.; Jungst, R. M.; Jussel, P.; Rozas, A. Juste; Kaci, M.; Kaczmarska, A.; Kado, M.; Kagan, H.; Kagan, M.; Kajomovitz, E.; Kama, S.; Kanaya, N.; Kaneda, M.; Kaneti, S.; Kanno, T.; Kantserov, V. A.; Kanzaki, J.; Kaplan, B.; Kapliy, A.; Kar, D.; Karakostas, K.; Karastathis, N.; Karnevskiy, M.; Karpov, S. N.; Karthik, K.; Kartvelishvili, V.; Karyukhin, A. N.; Kashif, L.; Kasieczka, G.; Kass, R. D.; Kastanas, A.; Kataoka, Y.; Katre, A.; Katzy, J.; Kaushik, V.; Kawagoe, K.; Kawamoto, T.; Kawamura, G.; Kazama, S.; Kazanin, V. F.; Kazarinov, M. Y.; Keeler, R.; Kehoe, R.; Keil, M.; Keller, J. S.; Keoshkerian, H.; Kepka, O.; Kerševan, B. P.; Kersten, S.; Kessoku, K.; Keung, J.; Khalil-zada, F.; Khandanyan, H.; Khanov, A.; Khodinov, A.; Khomich, A.; Khoo, T. J.; Khoriauli, G.; Khoroshilov, A.; Khovanskiy, V.; Khramov, E.; Khubua, J.; Kim, H. Y.; Kim, H.; Kim, S. H.; Kimura, N.; Kind, O.; King, B. T.; King, M.; King, R. S. B.; King, S. B.; Kirk, J.; Kiryunin, A. E.; Kishimoto, T.; Kisielewska, D.; Kiss, F.; Kitamura, T.; Kittelmann, T.; Kiuchi, K.; Kladiva, E.; Klein, M.; Klein, U.; Kleinknecht, K.; Klimek, P.; Klimentov, A.; Klingenberg, R.; Klinger, J. A.; Klinkby, E. B.; Klioutchnikova, T.; Klok, P. F.; Kluge, E.-E.; Kluit, P.; Kluth, S.; Kneringer, E.; Knoops, E. B. F. G.; Knue, A.; Kobayashi, T.; Kobel, M.; Kocian, M.; Kodys, P.; Koevesarki, P.; Koffas, T.; Koffeman, E.; Kogan, L. A.; Kohlmann, S.; Kohout, Z.; Kohriki, T.; Koi, T.; Kolanoski, H.; Koletsou, I.; Koll, J.; Komar, A. A.; Komori, Y.; Kondo, T.; Köneke, K.; König, A. C.; König, S.; Kono, T.; Konoplich, R.; Konstantinidis, N.; Kopeliansky, R.; Koperny, S.; Köpke, L.; Kopp, A. K.; Korcyl, K.; Kordas, K.; Korn, A.; Korol, A. A.; Korolkov, I.; Korolkova, E. V.; Korotkov, V. A.; Kortner, O.; Kortner, S.; Kostyukhin, V. V.; Kotov, V. M.; Kotwal, A.; Kourkoumelis, C.; Kouskoura, V.; Koutsman, A.; Kowalewski, R.; Kowalski, T. Z.; Kozanecki, W.; Kozhin, A. S.; Kral, V.; Kramarenko, V. A.; Kramberger, G.; Krasnopevtsev, D.; Krasny, M. W.; Krasznahorkay, A.; Kraus, J. K.; Kravchenko, A.; Kreiss, S.; Kretz, M.; Kretzschmar, J.; Kreutzfeldt, K.; Krieger, P.; Kroeninger, K.; Kroha, H.; Kroll, J.; Kroseberg, J.; Krstic, J.; Kruchonak, U.; Krüger, H.; Kruker, T.; Krumnack, N.; Krumshteyn, Z. V.; Kruse, A.; Kruse, M. C.; Kruskal, M.; Kubota, T.; Kuday, S.; Kuehn, S.; Kugel, A.; Kuhl, A.; Kuhl, T.; Kukhtin, V.; Kulchitsky, Y.; Kuleshov, S.; Kuna, M.; Kunkle, J.; Kupco, A.; Kurashige, H.; Kurochkin, Y. A.; Kurumida, R.; Kus, V.; Kuwertz, E. S.; Kuze, M.; Kvita, J.; La Rosa, A.; La Rotonda, L.; Labarga, L.; Lacasta, C.; Lacava, F.; Lacey, J.; Lacker, H.; Lacour, D.; Lacuesta, V. R.; Ladygin, E.; Lafaye, R.; Laforge, B.; Lagouri, T.; Lai, S.; Laier, H.; Lambourne, L.; Lammers, S.; Lampen, C. L.; Lampl, W.; Lançon, E.; Landgraf, U.; Landon, M. P. J.; Lang, V. S.; Lange, C.; Lankford, A. J.; Lanni, F.; Lantzsch, K.; Laplace, S.; Lapoire, C.; Laporte, J. F.; Lari, T.; Lassnig, M.; Laurelli, P.; Lavorini, V.; Lavrijsen, W.; Law, A. T.; Laycock, P.; Le, B. T.; Le Dortz, O.; Le Guirriec, E.; Le Menedeu, E.; LeCompte, T.; Ledroit-Guillon, F.; Lee, C. A.; Lee, H.; Lee, J. S. H.; Lee, S. C.; Lee, L.; Lefebvre, G.; Lefebvre, M.; Legger, F.; Leggett, C.; Lehan, A.; Lehmacher, M.; Miotto, G. Lehmann; Lei, X.; Leister, A. G.; Leite, M. A. L.; Leitner, R.; Lellouch, D.; Lemmer, B.; Leney, K. J. C.; Lenz, T.; Lenzen, G.; Lenzi, B.; Leone, R.; Leonhardt, K.; Leontsinis, S.; Leroy, C.; Lester, C. G.; Lester, C. M.; Levêque, J.; Levin, D.; Levinson, L. J.; Levy, M.; Lewis, A.; Lewis, G. H.; Leyko, A. M.; Leyton, M.; Li, B.; Li, B.; Li, H.; Li, H. L.; Li, S.; Li, X.; Li, Y.; Liang, Z.; Liao, H.; Liberti, B.; Lichard, P.; Lie, K.; Liebal, J.; Liebig, W.; Limbach, C.; Limosani, A.; Limper, M.; Lin, S. C.; Linde, F.; Lindquist, B. E.; Linnemann, J. T.; Lipeles, E.; Lipniacka, A.; Lisovyi, M.; Liss, T. M.; Lissauer, D.; Lister, A.; Litke, A. M.; Liu, B.; Liu, D.; Liu, J. B.; Liu, K.; Liu, L.; Liu, M.; Liu, M.; Liu, Y.; Livan, M.; Livermore, S. S. A.; Lleres, A.; Merino, J. Llorente; Lloyd, S. L.; Lo Sterzo, F.; Lobodzinska, E.; Loch, P.; Lockman, W. S.; Loddenkoetter, T.; Loebinger, F. K.; Loevschall-Jensen, A. E.; Loginov, A.; Loh, C. W.; Lohse, T.; Lohwasser, K.; Lokajicek, M.; Lombardo, V. P.; Long, J. D.; Long, R. E.; Lopes, L.; Mateos, D. Lopez; Paredes, B. Lopez; Lorenz, J.; Martinez, N. Lorenzo; Losada, M.; Loscutoff, P.; Losty, M. J.; Lou, X.; Lounis, A.; Love, J.; Love, P. A.; Lowe, A. J.; Lu, F.; Lubatti, H. J.; Luci, C.; Lucotte, A.; Luehring, F.; Lukas, W.; Luminari, L.; Lundberg, O.; Lund-Jensen, B.; Lungwitz, M.; Lynn, D.; Lysak, R.; Lytken, E.; Ma, H.; Ma, L. L.; Maccarrone, G.; Macchiolo, A.; Maček, B.; Miguens, J. Machado; Macina, D.; Madaffari, D.; Madar, R.; Maddocks, H. J.; Mader, W. F.; Madsen, A.; Maeno, M.; Maeno, T.; Magradze, E.; Mahboubi, K.; Mahlstedt, J.; Mahmoud, S.; Maiani, C.; Maidantchik, C.; Maio, A.; Majewski, S.; Makida, Y.; Makovec, N.; Mal, P.; Malaescu, B.; Malecki, Pa.; Maleev, V. P.; Malek, F.; Mallik, U.; Malon, D.; Malone, C.; Maltezos, S.; Malyshev, V. M.; Malyukov, S.; Mamuzic, J.; Mandelli, B.; Mandelli, L.; Mandić, I.; Mandrysch, R.; Maneira, J.; Manfredini, A.; de Andrade Filho, L. Manhaes; Ramos, J. A. Manjarres; Mann, A.; Manning, P. M.; Manousakis-Katsikakis, A.; Mansoulie, B.; Mantifel, R.; Mapelli, L.; March, L.; Marchand, J. F.; Marchese, F.; Marchiori, G.; Marcisovsky, M.; Marino, C. P.; Marques, C. N.; Marroquim, F.; Marsden, S. P.; Marshall, Z.; Marti, L. F.; Marti-Garcia, S.; Martin, B.; Martin, B.; Martin, T. A.; Martin, V. J.; dit Latour, B. Martin; Martinez, H.; Martinez, M.; Martin-Haugh, S.; Martyniuk, A. C.; Marx, M.; Marzano, F.; Marzin, A.; Masetti, L.; Mashimo, T.; Mashinistov, R.; Masik, J.; Maslennikov, A. L.; Massa, I.; Massol, N.; Mastrandrea, P.; Mastroberardino, A.; Masubuchi, T.; Matsunaga, H.; Matsushita, T.; Mättig, P.; Mättig, S.; Mattmann, J.; Maurer, J.; Maxfield, S. J.; Maximov, D. A.; Mazini, R.; Mazzaferro, L.; Mc Goldrick, G.; Mc Kee, S. P.; McCarn, A.; McCarthy, R. L.; McCarthy, T. G.; McCubbin, N. A.; McFarlane, K. W.; Mcfayden, J. A.; Mchedlidze, G.; Mclaughlan, T.; McMahon, S. J.; McPherson, R. A.; Meade, A.; Mechnich, J.; Medinnis, M.; Meehan, S.; Meera-Lebbai, R.; Mehlhase, S.; Mehta, A.; Meier, K.; Meineck, C.; Meirose, B.; Melachrinos, C.; Garcia, B. R. Mellado; Meloni, F.; Navas, L. Mendoza; Mengarelli, A.; Menke, S.; Meoni, E.; Mercurio, K. M.; Mergelmeyer, S.; Meric, N.; Mermod, P.; Merola, L.; Meroni, C.; Merritt, F. S.; Merritt, H.; Messina, A.; Metcalfe, J.; Mete, A. S.; Meyer, C.; Meyer, C.; Meyer, J.-P.; Meyer, J.; Middleton, R. P.; Migas, S.; Mijović, L.; Mikenberg, G.; Mikestikova, M.; Mikuž, M.; Miller, D. W.; Mills, C.; Milov, A.; Milstead, D. A.; Milstein, D.; Minaenko, A. A.; Moya, M. Miñano; Minashvili, I. A.; Mincer, A. I.; Mindur, B.; Mineev, M.; Ming, Y.; Mir, L. M.; Mirabelli, G.; Mitani, T.; Mitrevski, J.; Mitsou, V. A.; Mitsui, S.; Miucci, A.; Miyagawa, P. S.; Mjörnmark, J. U.; Moa, T.; Mochizuki, K.; Moeller, V.; Mohapatra, S.; Mohr, W.; Molander, S.; Moles-Valls, R.; Mönig, K.; Monini, C.; Monk, J.; Monnier, E.; Berlingen, J. Montejo; Monticelli, F.; Monzani, S.; Moore, R. W.; Herrera, C. Mora; Moraes, A.; Morange, N.; Morel, J.; Moreno, D.; Llácer, M. Moreno; Morettini, P.; Morgenstern, M.; Morii, M.; Moritz, S.; Morley, A. K.; Mornacchi, G.; Morris, J. D.; Morvaj, L.; Moser, H. G.; Mosidze, M.; Moss, J.; Mount, R.; Mountricha, E.; Mouraviev, S. V.; Moyse, E. J. W.; Muanza, S.; Mudd, R. D.; Mueller, F.; Mueller, J.; Mueller, K.; Mueller, T.; Mueller, T.; Muenstermann, D.; Munwes, Y.; Quijada, J. A. Murillo; Murray, W. J.; Musto, E.; Myagkov, A. G.; Myska, M.; Nackenhorst, O.; Nadal, J.; Nagai, K.; Nagai, R.; Nagai, Y.; Nagano, K.; Nagarkar, A.; Nagasaka, Y.; Nagel, M.; Nairz, A. M.; Nakahama, Y.; Nakamura, K.; Nakamura, T.; Nakano, I.; Namasivayam, H.; Nanava, G.; Narayan, R.; Nattermann, T.; Naumann, T.; Navarro, G.; Nayyar, R.; Neal, H. A.; Nechaeva, P. Yu.; Neep, T. J.; Negri, A.; Negri, G.; Negrini, M.; Nektarijevic, S.; Nelson, A.; Nelson, T. K.; Nemecek, S.; Nemethy, P.; Nepomuceno, A. A.; Nessi, M.; Neubauer, M. S.; Neumann, M.; Neves, R. M.; Nevski, P.; Newman, P. R.; Nguyen, D. H.; Nickerson, R. B.; Nicolaidou, R.; Nicquevert, B.; Nielsen, J.; Nikiforou, N.; Nikiforov, A.; Nikolaenko, V.; Nikolic-Audit, I.; Nikolics, K.; Nikolopoulos, K.; Nilsson, P.; Ninomiya, Y.; Nisati, A.; Nisius, R.; Nobe, T.; Nodulman, L.; Nomachi, M.; Nomidis, I.; Norberg, S.; Nordberg, M.; Nowak, S.; Nozaki, M.; Nozka, L.; Ntekas, K.; Hanninger, G. Nunes; Nunnemann, T.; Nurse, E.; Nuti, F.; O'Brien, B. J.; O'Grady, F.; O'Neil, D. C.; O'Shea, V.; Oakham, F. G.; Oberlack, H.; Obermann, T.; Ocariz, J.; Ochi, A.; Ochoa, M. I.; Oda, S.; Odaka, S.; Ogren, H.; Oh, A.; Oh, S. H.; Ohm, C. C.; Ohman, H.; Ohshima, T.; Okamura, W.; Okawa, H.; Okumura, Y.; Okuyama, T.; Olariu, A.; Olchevski, A. G.; Pino, S. A. Olivares; Damazio, D. Oliveira; Garcia, E. Oliver; Olivito, D.; Olszewski, A.; Olszowska, J.; Onofre, A.; Onyisi, P. U. E.; Oram, C. J.; Oreglia, M. J.; Oren, Y.; Orestano, D.; Orlando, N.; Barrera, C. Oropeza; Orr, R. S.; Osculati, B.; Ospanov, R.; y Garzon, G. Otero; Otono, H.; Ouchrif, M.; Ouellette, E. A.; Ould-Saada, F.; Ouraou, A.; Oussoren, K. P.; Ouyang, Q.; Ovcharova, A.; Owen, M.; Ozcan, V. E.; Ozturk, N.; Pachal, K.; Pages, A. Pacheco; Aranda, C. Padilla; Pagáčová, M.; Griso, S. Pagan; Paganis, E.; Pahl, C.; Paige, F.; Pais, P.; Pajchel, K.; Palacino, G.; Palestini, S.; Pallin, D.; Palma, A.; Palmer, J. D.; Pan, Y. B.; Panagiotopoulou, E.; Vazquez, J. G. Panduro; Pani, P.; Panikashvili, N.; Panitkin, S.; Pantea, D.; Paolozzi, L.; Papadopoulou, Th. D.; Papageorgiou, K.; Paramonov, A.; Hernandez, D. Paredes; Parker, M. A.; Parodi, F.; Parsons, J. A.; Parzefall, U.; Pasqualucci, E.; Passaggio, S.; Passeri, A.; Pastore, F.; Pastore, Fr.; Pásztor, G.; Pataraia, S.; Patel, N. D.; Pater, J. R.; Patricelli, S.; Pauly, T.; Pearce, J.; Pedersen, M.; Lopez, S. Pedraza; Pedro, R.; Peleganchuk, S. V.; Pelikan, D.; Peng, H.; Penning, B.; Penwell, J.; Perepelitsa, D. V.; Codina, E. Perez; García-Estañ, M. T. Pérez; Reale, V. Perez; Perini, L.; Pernegger, H.; Perrino, R.; Peschke, R.; Peshekhonov, V. D.; Peters, K.; Peters, R. F. Y.; Petersen, B. A.; Petersen, J.; Petersen, T. C.; Petit, E.; Petridis, A.; Petridou, C.; Petrolo, E.; Petrucci, F.; Petteni, M.; Pettersson, N. E.; Pezoa, R.; Phillips, P. W.; Piacquadio, G.; Pianori, E.; Picazio, A.; Piccaro, E.; Piccinini, M.; Piec, S. M.; Piegaia, R.; Pignotti, D. T.; Pilcher, J. E.; Pilkington, A. D.; Pina, J.; Pinamonti, M.; Pinder, A.; Pinfold, J. L.; Pingel, A.; Pinto, B.; Pires, S.; Pizio, C.; Pleier, M.-A.; Pleskot, V.; Plotnikova, E.; Plucinski, P.; Poddar, S.; Podlyski, F.; Poettgen, R.; Poggioli, L.; Pohl, D.; Pohl, M.; Polesello, G.; Policicchio, A.; Polifka, R.; Polini, A.; Pollard, C. S.; Polychronakos, V.; Pommès, K.; Pontecorvo, L.; Pope, B. G.; Popeneciu, G. A.; Popovic, D. S.; Poppleton, A.; Bueso, X. Portell; Pospelov, G. E.; Pospisil, S.; Potamianos, K.; Potrap, I. N.; Potter, C. J.; Potter, C. T.; Poulard, G.; Poveda, J.; Pozdnyakov, V.; Prabhu, R.; Pralavorio, P.; Pranko, A.; Prasad, S.; Pravahan, R.; Prell, S.; Price, D.; Price, J.; Price, L. E.; Prieur, D.; Primavera, M.; Proissl, M.; Prokofiev, K.; Prokoshin, F.; Protopapadaki, E.; Protopopescu, S.; Proudfoot, J.; Przybycien, M.; Przysiezniak, H.; Ptacek, E.; Pueschel, E.; Puldon, D.; Purohit, M.; Puzo, P.; Pylypchenko, Y.; Qian, J.; Qin, G.; Quadt, A.; Quarrie, D. R.; Quayle, W. B.; Quilty, D.; Qureshi, A.; Radeka, V.; Radescu, V.; Radhakrishnan, S. K.; Radloff, P.; Rados, P.; Ragusa, F.; Rahal, G.; Rajagopalan, S.; Rammensee, M.; Rammes, M.; Randle-Conde, A. S.; Rangel-Smith, C.; Rao, K.; Rauscher, F.; Rave, T. C.; Ravenscroft, T.; Raymond, M.; Read, A. L.; Rebuzzi, D. M.; Redelbach, A.; Redlinger, G.; Reece, R.; Reeves, K.; Rehnisch, L.; Reinsch, A.; Reisin, H.; Relich, M.; Rembser, C.; Ren, Z. L.; Renaud, A.; Rescigno, M.; Resconi, S.; Rezanova, O. L.; Reznicek, P.; Rezvani, R.; Richter, R.; Ridel, M.; Rieck, P.; Rijssenbeek, M.; Rimoldi, A.; Rinaldi, L.; Ritsch, E.; Riu, I.; Rizatdinova, F.; Rizvi, E.; Robertson, S. H.; Robichaud-Veronneau, A.; Robinson, D.; Robinson, J. E. M.; Robson, A.; Roda, C.; Rodrigues, L.; Roe, S.; Røhne, O.; Rolli, S.; Romaniouk, A.; Romano, M.; Romeo, G.; Adam, E. Romero; Rompotis, N.; Roos, L.; Ros, E.; Rosati, S.; Rosbach, K.; Rose, A.; Rose, M.; Rosendahl, P. L.; Rosenthal, O.; Rossetti, V.; Rossi, E.; Rossi, L. P.; Rosten, R.; Rotaru, M.; Roth, I.; Rothberg, J.; Rousseau, D.; Royon, C. R.; Rozanov, A.; Rozen, Y.; Ruan, X.; Rubbo, F.; Rubinskiy, I.; Rud, V. I.; Rudolph, C.; Rudolph, M. S.; Rühr, F.; Ruiz-Martinez, A.; Rurikova, Z.; Rusakovich, N. A.; Ruschke, A.; Rutherfoord, J. P.; Ruthmann, N.; Ryabov, Y. F.; Rybar, M.; Rybkin, G.; Ryder, N. C.; Saavedra, A. F.; Sacerdoti, S.; Saddique, A.; Sadeh, I.; Sadrozinski, H. F.-W.; Sadykov, R.; Tehrani, F. Safai; Sakamoto, H.; Sakurai, Y.; Salamanna, G.; Salamon, A.; Saleem, M.; Salek, D.; De Bruin, P. H. Sales; Salihagic, D.; Salnikov, A.; Salt, J.; Ferrando, B. M. Salvachua; Salvatore, D.; Salvatore, F.; Salvucci, A.; Salzburger, A.; Sampsonidis, D.; Sanchez, A.; Sánchez, J.; Martinez, V. Sanchez; Sandaker, H.; Sander, H. G.; Sanders, M. P.; Sandhoff, M.; Sandoval, T.; Sandoval, C.; Sandstroem, R.; Sankey, D. P. C.; Sansoni, A.; Santoni, C.; Santonico, R.; Santos, H.; Castillo, I. Santoyo; Sapp, K.; Sapronov, A.; Saraiva, J. G.; Sarrazin, B.; Sartisohn, G.; Sasaki, O.; Sasaki, Y.; Sauvage, G.; Sauvan, E.; Savard, P.; Savu, D. O.; Sawyer, C.; Sawyer, L.; Saxon, D. H.; Saxon, J.; Sbarra, C.; Sbrizzi, A.; Scanlon, T.; Scannicchio, D. A.; Scarcella, M.; Schaarschmidt, J.; Schacht, P.; Schaefer, D.; Schaefer, R.; Schaelicke, A.; Schaepe, S.; Schaetzel, S.; Schäfer, U.; Schaffer, A. C.; Schaile, D.; Schamberger, R. D.; Scharf, V.; Schegelsky, V. A.; Scheirich, D.; Schernau, M.; Scherzer, M. I.; Schiavi, C.; Schieck, J.; Schillo, C.; Schioppa, M.; Schlenker, S.; Schmidt, E.; Schmieden, K.; Schmitt, C.; Schmitt, C.; Schmitt, S.; Schneider, B.; Schnellbach, Y. J.; Schnoor, U.; Schoeffel, L.; Schoening, A.; Schoenrock, B. D.; Schorlemmer, A. L. S.; Schott, M.; Schouten, D.; Schovancova, J.; Schramm, S.; Schreyer, M.; Schroeder, C.; Schuh, N.; Schultens, M. J.; Schultz-Coulon, H.-C.; Schulz, H.; Schumacher, M.; Schumm, B. A.; Schune, Ph.; Schwartzman, A.; Schwegler, Ph.; Schwemling, Ph.; Schwienhorst, R.; Schwindling, J.; Schwindt, T.; Schwoerer, M.; Sciacca, F. G.; Scifo, E.; Sciolla, G.; Scott, W. G.; Scuri, F.; Scutti, F.; Searcy, J.; Sedov, G.; Sedykh, E.; Seidel, S. C.; Seiden, A.; Seifert, F.; Seixas, J. M.; Sekhniaidze, G.; Sekula, S. J.; Selbach, K. E.; Seliverstov, D. M.; Sellers, G.; Semprini-Cesari, N.; Serfon, C.; Serin, L.; Serkin, L.; Serre, T.; Seuster, R.; Severini, H.; Sforza, F.; Sfyrla, A.; Shabalina, E.; Shamim, M.; Shan, L. Y.; Shank, J. T.; Shao, Q. T.; Shapiro, M.; Shatalov, P. B.; Shaw, K.; Sherwood, P.; Shimizu, S.; Shimmin, C. O.; Shimojima, M.; Shiyakova, M.; Shmeleva, A.; Shochet, M. J.; Short, D.; Shrestha, S.; Shulga, E.; Shupe, M. A.; Shushkevich, S.; Sicho, P.; Sidorov, D.; Sidoti, A.; Siegert, F.; Sijacki, Dj.; Silbert, O.; Silva, J.; Silver, Y.; Silverstein, D.; Silverstein, S. B.; Simak, V.; Simard, O.; Simic, Lj.; Simion, S.; Simioni, E.; Simmons, B.; Simoniello, R.; Simonyan, M.; Sinervo, P.; Sinev, N. B.; Sipica, V.; Siragusa, G.; Sircar, A.; Sisakyan, A. N.; Sivoklokov, S. Yu.; Sjölin, J.; Sjursen, T. B.; Skinnari, L. A.; Skottowe, H. P.; Skovpen, K. Yu.; Skubic, P.; Slater, M.; Slavicek, T.; Sliwa, K.; Smakhtin, V.; Smart, B. H.; Smestad, L.; Smirnov, S. Yu.; Smirnov, Y.; Smirnova, L. N.; Smirnova, O.; Smith, K. M.; Smizanska, M.; Smolek, K.; Snesarev, A. A.; Snidero, G.; Snyder, S.; Sobie, R.; Socher, F.; Soffer, A.; Soh, D. A.; Solans, C. A.; Solar, M.; Solc, J.; Soldatov, E. Yu.; Soldevila, U.; Camillocci, E. Solfaroli; Solodkov, A. A.; Solovyanov, O. V.; Solovyev, V.; Sommer, P.; Song, H. Y.; Soni, N.; Sood, A.; Sopko, B.; Sopko, V.; Sorin, V.; Sosebee, M.; Soualah, R.; Soueid, P.; Soukharev, A. M.; South, D.; Spagnolo, S.; Spanò, F.; Spearman, W. R.; Spighi, R.; Spigo, G.; Spousta, M.; Spreitzer, T.; Spurlock, B.; St. Denis, R. D.; Staerz, S.; Stahlman, J.; Stamen, R.; Stanecka, E.; Stanek, R. W.; Stanescu, C.; Stanescu-Bellu, M.; Stanitzki, M. M.; Stapnes, S.; Starchenko, E. A.; Stark, J.; Staroba, P.; Starovoitov, P.; Staszewski, R.; Stavina, P.; Steele, G.; Steinberg, P.; Stelzer, B.; Stelzer, H. J.; Stelzer-Chilton, O.; Stenzel, H.; Stern, S.; Stewart, G. A.; Stillings, J. A.; Stockton, M. C.; Stoebe, M.; Stoerig, K.; Stoicea, G.; Stolte, P.; Stonjek, S.; Stradling, A. R.; Straessner, A.; Strandberg, J.; Strandberg, S.; Strandlie, A.; Strauss, E.; Strauss, M.; Strizenec, P.; Ströhmer, R.; Strom, D. M.; Stroynowski, R.; Stucci, S. A.; Stugu, B.; Styles, N. A.; Su, D.; Su, J.; Subramania, H. S.; Subramaniam, R.; Succurro, A.; Sugaya, Y.; Suhr, C.; Suk, M.; Sulin, V. V.; Sultansoy, S.; Sumida, T.; Sun, X.; Sundermann, J. E.; Suruliz, K.; Susinno, G.; Sutton, M. R.; Suzuki, Y.; Svatos, M.; Swedish, S.; Swiatlowski, M.; Sykora, I.; Sykora, T.; Ta, D.; Tackmann, K.; Taenzer, J.; Taffard, A.; Tafirout, R.; Taiblum, N.; Takahashi, Y.; Takai, H.; Takashima, R.; Takeda, H.; Takeshita, T.; Takubo, Y.; Talby, M.; Talyshev, A. A.; Tam, J. Y. C.; Tamsett, M. C.; Tan, K. G.; Tanaka, J.; Tanaka, R.; Tanaka, S.; Tanaka, S.; Tanasijczuk, A. J.; Tani, K.; Tannoury, N.; Tapprogge, S.; Tarem, S.; Tarrade, F.; Tartarelli, G. F.; Tas, P.; Tasevsky, M.; Tashiro, T.; Tassi, E.; Delgado, A. Tavares; Tayalati, Y.; Taylor, C.; Taylor, F. E.; Taylor, G. N.; Taylor, W.; Teischinger, F. A.; Castanheira, M. Teixeira Dias; Teixeira-Dias, P.; Temming, K. K.; Kate, H. Ten; Teng, P. K.; Terada, S.; Terashi, K.; Terron, J.; Terzo, S.; Testa, M.; Teuscher, R. J.; Therhaag, J.; Theveneaux-Pelzer, T.; Thoma, S.; Thomas, J. P.; Thomas-Wilsker, J.; Thompson, E. N.; Thompson, P. D.; Thompson, P. D.; Thompson, A. S.; Thomsen, L. A.; Thomson, E.; Thomson, M.; Thong, W. M.; Thun, R. P.; Tian, F.; Tibbetts, M. J.; Tikhomirov, V. O.; Tikhonov, Yu. A.; Timoshenko, S.; Tiouchichine, E.; Tipton, P.; Tisserant, S.; Todorov, T.; Todorova-Nova, S.; Toggerson, B.; Tojo, J.; Tokár, S.; Tokushuku, K.; Tollefson, K.; Tomlinson, L.; Tomoto, M.; Tompkins, L.; Toms, K.; Topilin, N. D.; Torrence, E.; Torres, H.; Pastor, E. Torró; Toth, J.; Touchard, F.; Tovey, D. R.; Tran, H. L.; Trefzger, T.; Tremblet, L.; Tricoli, A.; Trigger, I. M.; Trincaz-Duvoid, S.; Tripiana, M. F.; Triplett, N.; Trischuk, W.; Trocmé, B.; Troncon, C.; Trottier-McDonald, M.; Trovatelli, M.; True, P.; Trzebinski, M.; Trzupek, A.; Tsarouchas, C.; Tseng, J. C.-L.; Tsiareshka, P. V.; Tsionou, D.; Tsipolitis, G.; Tsirintanis, N.; Tsiskaridze, S.; Tsiskaridze, V.; Tskhadadze, E. G.; Tsukerman, I. I.; Tsulaia, V.; Tsuno, S.; Tsybychev, D.; Tua, A.; Tudorache, A.; Tudorache, V.; Tuna, A. N.; Tupputi, S. A.; Turchikhin, S.; Turecek, D.; Cakir, I. Turk; Turra, R.; Tuts, P. M.; Tykhonov, A.; Tylmad, M.; Tyndel, M.; Uchida, K.; Ueda, I.; Ueno, R.; Ughetto, M.; Ugland, M.; Uhlenbrock, M.; Ukegawa, F.; Unal, G.; Undrus, A.; Unel, G.; Ungaro, F. C.; Unno, Y.; Urbaniec, D.; Urquijo, P.; Usai, G.; Usanova, A.; Vacavant, L.; Vacek, V.; Vachon, B.; Valencic, N.; Valentinetti, S.; Valero, A.; Valery, L.; Valkar, S.; Gallego, E. Valladolid; Vallecorsa, S.; Ferrer, J. A. Valls; Van Der Deijl, P. C.; van der Geer, R.; van der Graaf, H.; Van Der Leeuw, R.; van der Ster, D.; Eldik, N. van; van Gemmeren, P.; Van Nieuwkoop, J.; van Vulpen, I.; van Woerden, M. C.; Vanadia, M.; Vandelli, W.; Vaniachine, A.; Vankov, P.; Vannucci, F.; Vardanyan, G.; Vari, R.; Varnes, E. W.; Varol, T.; Varouchas, D.; Vartapetian, A.; Varvell, K. E.; Vazeille, F.; Schroeder, T. Vazquez; Veatch, J.; Veloso, F.; Veneziano, S.; Ventura, A.; Ventura, D.; Venturi, M.; Venturi, N.; Venturini, A.; Vercesi, V.; Verducci, M.; Verkerke, W.; Vermeulen, J. C.; Vest, A.; Vetterli, M. C.; Viazlo, O.; Vichou, I.; Vickey, T.; Boeriu, O. E. Vickey; Viehhauser, G. H. A.; Viel, S.; Vigne, R.; Villa, M.; Perez, M. Villaplana; Vilucchi, E.; Vincter, M. G.; Vinogradov, V. B.; Virzi, J.; Vitells, O.; Vivarelli, I.; Vaque, F. Vives; Vlachos, S.; Vladoiu, D.; Vlasak, M.; Vogel, A.; Vokac, P.; Volpi, G.; Volpi, M.; von der Schmitt, H.; Radziewski, H. von; von Toerne, E.; Vorobel, V.; Vos, M.; Voss, R.; Vossebeld, J. H.; Vranjes, N.; Milosavljevic, M. Vranjes; Vrba, V.; Vreeswijk, M.; Anh, T. Vu; Vuillermet, R.; Vukotic, I.; Vykydal, Z.; Wagner, P.; Wagner, W.; Wahrmund, S.; Wakabayashi, J.; Walder, J.; Walker, R.; Walkowiak, W.; Wall, R.; Waller, P.; Walsh, B.; Wang, C.; Wang, C.; Wang, F.; Wang, H.; Wang, H.; Wang, J.; Wang, J.; Wang, K.; Wang, R.; Wang, S. M.; Wang, T.; Wang, X.; Warburton, A.; Ward, C. P.; Wardrope, D. R.; Warsinsky, M.; Washbrook, A.; Wasicki, C.; Watanabe, I.; Watkins, P. M.; Watson, A. T.; Watson, I. J.; Watson, M. F.; Watts, G.; Watts, S.; Waugh, B. M.; Webb, S.; Weber, M. S.; Weber, S. W.; Webster, J. S.; Weidberg, A. R.; Weigell, P.; Weinert, B.; Weingarten, J.; Weiser, C.; Weits, H.; Wells, P. S.; Wenaus, T.; Wendland, D.; Weng, Z.; Wengler, T.; Wenig, S.; Wermes, N.; Werner, M.; Werner, P.; Wessels, M.; Wetter, J.; Whalen, K.; White, A.; White, M. J.; White, R.; White, S.; Whiteson, D.; Wicke, D.; Wickens, F. J.; Wiedenmann, W.; Wielers, M.; Wienemann, P.; Wiglesworth, C.; Wiik-Fuchs, L. A. M.; Wijeratne, P. A.; Wildauer, A.; Wildt, M. A.; Wilkens, H. G.; Will, J. Z.; Williams, H. H.; Williams, S.; Willis, C.; Willocq, S.; Wilson, A.; Wilson, J. A.; Wingerter-Seez, I.; Winkelmann, S.; Winklmeier, F.; Wittgen, M.; Wittig, T.; Wittkowski, J.; Wollstadt, S. J.; Wolter, M. W.; Wolters, H.; Wosiek, B. K.; Wotschack, J.; Woudstra, M. J.; Wozniak, K. W.; Wright, M.; Wu, S. L.; Wu, X.; Wu, Y.; Wulf, E.; Wyatt, T. R.; Wynne, B. M.; Xella, S.; Xiao, M.; Xu, D.; Xu, L.; Yabsley, B.; Yacoob, S.; Yamada, M.; Yamaguchi, H.; Yamaguchi, Y.; Yamamoto, A.; Yamamoto, K.; Yamamoto, S.; Yamamura, T.; Yamanaka, T.; Yamauchi, K.; Yamazaki, Y.; Yan, Z.; Yang, H.; Yang, H.; Yang, U. K.; Yang, Y.; Yanush, S.; Yao, L.; Yao, W.-M.; Yasu, Y.; Yatsenko, E.; Wong, K. H. Yau; Ye, J.; Ye, S.; Yen, A. L.; Yildirim, E.; Yilmaz, M.; Yoosoofmiya, R.; Yorita, K.; Yoshida, R.; Yoshihara, K.; Young, C.; Young, C. J. S.; Youssef, S.; Yu, D. R.; Yu, J.; Yu, J. M.; Yu, J.; Yuan, L.; Yurkewicz, A.; Zabinski, B.; Zaidan, R.; Zaitsev, A. M.; Zaman, A.; Zambito, S.; Zanello, L.; Zanzi, D.; Zaytsev, A.; Zeitnitz, C.; Zeman, M.; Zemla, A.; Zengel, K.; Zenin, O.; Ženiš, T.; Zerwas, D.; della Porta, G. Zevi; Zhang, D.; Zhang, F.; Zhang, H.; Zhang, J.; Zhang, L.; Zhang, X.; Zhang, Z.; Zhao, Z.; Zhemchugov, A.; Zhong, J.; Zhou, B.; Zhou, L.; Zhou, N.; Zhu, C. G.; Zhu, H.; Zhu, J.; Zhu, Y.; Zhuang, X.; Zibell, A.; Zieminska, D.; Zimine, N. I.; Zimmermann, C.; Zimmermann, R.; Zimmermann, S.; Zimmermann, S.; Zinonos, Z.; Ziolkowski, M.; Zitoun, R.; Zobernig, G.; Zoccoli, A.; zur Nedden, M.; Zurzolo, G.; Zutshi, V.; Zwalinski, L.
2014-08-01
Distributions sensitive to the underlying event in QCD jet events have been measured with the ATLAS detector at the LHC, based on of proton-proton collision data collected at a centre-of-mass energy of 7 . Charged-particle mean and densities of all-particle and charged-particle multiplicity and have been measured in regions azimuthally transverse to the hardest jet in each event. These are presented both as one-dimensional distributions and with their mean values as functions of the leading-jet transverse momentum from 20 to 800 . The correlation of charged-particle mean with charged-particle multiplicity is also studied, and the densities include the forward rapidity region; these features provide extra data constraints for Monte Carlo modelling of colour reconnection and beam-remnant effects respectively. For the first time, underlying event observables have been computed separately for inclusive jet and exclusive dijet event selections, allowing more detailed study of the interplay of multiple partonic scattering and QCD radiation contributions to the underlying event. Comparisons to the predictions of different Monte Carlo models show a need for further model tuning, but the standard approach is found to generally reproduce the features of the underlying event in both types of event selection.
Feature Selection for Chemical Sensor Arrays Using Mutual Information
Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.
2014-01-01
We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
High Dimensional Classification Using Features Annealed Independence Rules.
Fan, Jianqing; Fan, Yingying
2008-01-01
Classification using high-dimensional features arises frequently in many contemporary statistical studies such as tumor classification using microarray or other high-throughput data. The impact of dimensionality on classifications is largely poorly understood. In a seminal paper, Bickel and Levina (2004) show that the Fisher discriminant performs poorly due to diverging spectra and they propose to use the independence rule to overcome the problem. We first demonstrate that even for the independence classification rule, classification using all the features can be as bad as the random guessing due to noise accumulation in estimating population centroids in high-dimensional feature space. In fact, we demonstrate further that almost all linear discriminants can perform as bad as the random guessing. Thus, it is paramountly important to select a subset of important features for high-dimensional classification, resulting in Features Annealed Independence Rules (FAIR). The conditions under which all the important features can be selected by the two-sample t-statistic are established. The choice of the optimal number of features, or equivalently, the threshold value of the test statistics are proposed based on an upper bound of the classification error. Simulation studies and real data analysis support our theoretical results and demonstrate convincingly the advantage of our new classification procedure.
An explorative childhood pneumonia analysis based on ultrasonic imaging texture features
NASA Astrophysics Data System (ADS)
Zenteno, Omar; Diaz, Kristians; Lavarello, Roberto; Zimic, Mirko; Correa, Malena; Mayta, Holger; Anticona, Cynthia; Pajuelo, Monica; Oberhelman, Richard; Checkley, William; Gilman, Robert H.; Figueroa, Dante; Castañeda, Benjamín.
2015-12-01
According to World Health Organization, pneumonia is the respiratory disease with the highest pediatric mortality rate accounting for 15% of all deaths of children under 5 years old worldwide. The diagnosis of pneumonia is commonly made by clinical criteria with support from ancillary studies and also laboratory findings. Chest imaging is commonly done with chest X-rays and occasionally with a chest CT scan. Lung ultrasound is a promising alternative for chest imaging; however, interpretation is subjective and requires adequate training. In the present work, a two-class classification algorithm based on four Gray-level co-occurrence matrix texture features (i.e., Contrast, Correlation, Energy and Homogeneity) extracted from lung ultrasound images from children aged between six months and five years is presented. Ultrasound data was collected using a L14-5/38 linear transducer. The data consisted of 22 positive- and 68 negative-diagnosed B-mode cine-loops selected by a medical expert and captured in the facilities of the Instituto Nacional de Salud del Niño (Lima, Peru), for a total number of 90 videos obtained from twelve children diagnosed with pneumonia. The classification capacity of each feature was explored independently and the optimal threshold was selected by a receiver operator characteristic (ROC) curve analysis. In addition, a principal component analysis was performed to evaluate the combined performance of all the features. Contrast and correlation resulted the two more significant features. The classification performance of these two features by principal components was evaluated. The results revealed 82% sensitivity, 76% specificity, 78% accuracy and 0.85 area under the ROC.
NASA Astrophysics Data System (ADS)
Cary, Theodore W.; Cwanger, Alyssa; Venkatesh, Santosh S.; Conant, Emily F.; Sehgal, Chandra M.
2012-03-01
This study compares the performance of two proven but very different machine learners, Naïve Bayes and logistic regression, for differentiating malignant and benign breast masses using ultrasound imaging. Ultrasound images of 266 masses were analyzed quantitatively for shape, echogenicity, margin characteristics, and texture features. These features along with patient age, race, and mammographic BI-RADS category were used to train Naïve Bayes and logistic regression classifiers to diagnose lesions as malignant or benign. ROC analysis was performed using all of the features and using only a subset that maximized information gain. Performance was determined by the area under the ROC curve, Az, obtained from leave-one-out cross validation. Naïve Bayes showed significant variation (Az 0.733 +/- 0.035 to 0.840 +/- 0.029, P < 0.002) with the choice of features, but the performance of logistic regression was relatively unchanged under feature selection (Az 0.839 +/- 0.029 to 0.859 +/- 0.028, P = 0.605). Out of 34 features, a subset of 6 gave the highest information gain: brightness difference, margin sharpness, depth-to-width, mammographic BI-RADs, age, and race. The probabilities of malignancy determined by Naïve Bayes and logistic regression after feature selection showed significant correlation (R2= 0.87, P < 0.0001). The diagnostic performance of Naïve Bayes and logistic regression can be comparable, but logistic regression is more robust. Since probability of malignancy cannot be measured directly, high correlation between the probabilities derived from two basic but dissimilar models increases confidence in the predictive power of machine learning models for characterizing solid breast masses on ultrasound.
Predicting axillary lymph node metastasis from kinetic statistics of DCE-MRI breast images
NASA Astrophysics Data System (ADS)
Ashraf, Ahmed B.; Lin, Lilie; Gavenonis, Sara C.; Mies, Carolyn; Xanthopoulos, Eric; Kontos, Despina
2012-03-01
The presence of axillary lymph node metastases is the most important prognostic factor in breast cancer and can influence the selection of adjuvant therapy, both chemotherapy and radiotherapy. In this work we present a set of kinetic statistics derived from DCE-MRI for predicting axillary node status. Breast DCE-MRI images from 69 women with known nodal status were analyzed retrospectively under HIPAA and IRB approval. Axillary lymph nodes were positive in 12 patients while 57 patients had no axillary lymph node involvement. Kinetic curves for each pixel were computed and a pixel-wise map of time-to-peak (TTP) was obtained. Pixels were first partitioned according to the similarity of their kinetic behavior, based on TTP values. For every kinetic curve, the following pixel-wise features were computed: peak enhancement (PE), wash-in-slope (WIS), wash-out-slope (WOS). Partition-wise statistics for every feature map were calculated, resulting in a total of 21 kinetic statistic features. ANOVA analysis was done to select features that differ significantly between node positive and node negative women. Using the computed kinetic statistic features a leave-one-out SVM classifier was learned that performs with AUC=0.77 under the ROC curve, outperforming the conventional kinetic measures, including maximum peak enhancement (MPE) and signal enhancement ratio (SER), (AUCs of 0.61 and 0.57 respectively). These findings suggest that our DCE-MRI kinetic statistic features can be used to improve the prediction of axillary node status in breast cancer patients. Such features could ultimately be used as imaging biomarkers to guide personalized treatment choices for women diagnosed with breast cancer.
Efficient Feature Selection and Classification of Protein Sequence Data in Bioinformatics
Faye, Ibrahima; Samir, Brahim Belhaouari; Md Said, Abas
2014-01-01
Bioinformatics has been an emerging area of research for the last three decades. The ultimate aims of bioinformatics were to store and manage the biological data, and develop and analyze computational tools to enhance their understanding. The size of data accumulated under various sequencing projects is increasing exponentially, which presents difficulties for the experimental methods. To reduce the gap between newly sequenced protein and proteins with known functions, many computational techniques involving classification and clustering algorithms were proposed in the past. The classification of protein sequences into existing superfamilies is helpful in predicting the structure and function of large amount of newly discovered proteins. The existing classification results are unsatisfactory due to a huge size of features obtained through various feature encoding methods. In this work, a statistical metric-based feature selection technique has been proposed in order to reduce the size of the extracted feature vector. The proposed method of protein classification shows significant improvement in terms of performance measure metrics: accuracy, sensitivity, specificity, recall, F-measure, and so forth. PMID:25045727
Chen, Xi; Wu, Qi; Ren, He; Chang, Fu-Kuo
2018-01-01
In this work, a data-driven approach for identifying the flight state of a self-sensing wing structure with an embedded multi-functional sensing network is proposed. The flight state is characterized by the structural vibration signals recorded from a series of wind tunnel experiments under varying angles of attack and airspeeds. A large feature pool is created by extracting potential features from the signals covering the time domain, the frequency domain as well as the information domain. Special emphasis is given to feature selection in which a novel filter method is developed based on the combination of a modified distance evaluation algorithm and a variance inflation factor. Machine learning algorithms are then employed to establish the mapping relationship from the feature space to the practical state space. Results from two case studies demonstrate the high identification accuracy and the effectiveness of the model complexity reduction via the proposed method, thus providing new perspectives of self-awareness towards the next generation of intelligent air vehicles. PMID:29710832
Rough sets and Laplacian score based cost-sensitive feature selection
Yu, Shenglong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884
Rough sets and Laplacian score based cost-sensitive feature selection.
Yu, Shenglong; Zhao, Hong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of "good" features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms.
ANALYSIS OF SAMPLING TECHNIQUES FOR IMBALANCED DATA: AN N=648 ADNI STUDY
Dubey, Rashmi; Zhou, Jiayu; Wang, Yalin; Thompson, Paul M.; Ye, Jieping
2013-01-01
Many neuroimaging applications deal with imbalanced imaging data. For example, in Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset, the mild cognitive impairment (MCI) cases eligible for the study are nearly two times the Alzheimer’s disease (AD) patients for structural magnetic resonance imaging (MRI) modality and six times the control cases for proteomics modality. Constructing an accurate classifier from imbalanced data is a challenging task. Traditional classifiers that aim to maximize the overall prediction accuracy tend to classify all data into the majority class. In this paper, we study an ensemble system of feature selection and data sampling for the class imbalance problem. We systematically analyze various sampling techniques by examining the efficacy of different rates and types of undersampling, oversampling, and a combination of over and under sampling approaches. We thoroughly examine six widely used feature selection algorithms to identify significant biomarkers and thereby reduce the complexity of the data. The efficacy of the ensemble techniques is evaluated using two different classifiers including Random Forest and Support Vector Machines based on classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity measures. Our extensive experimental results show that for various problem settings in ADNI, (1). a balanced training set obtained with K-Medoids technique based undersampling gives the best overall performance among different data sampling techniques and no sampling approach; and (2). sparse logistic regression with stability selection achieves competitive performance among various feature selection algorithms. Comprehensive experiments with various settings show that our proposed ensemble model of multiple undersampled datasets yields stable and promising results. PMID:24176869
Selectivity in Genetic Association with Sub-classified Migraine in Women
Chasman, Daniel I.; Anttila, Verneri; Buring, Julie E.; Ridker, Paul M.; Schürks, Markus; Kurth, Tobias
2014-01-01
Migraine can be sub-classified not only according to presence of migraine aura (MA) or absence of migraine aura (MO), but also by additional features accompanying migraine attacks, e.g. photophobia, phonophobia, nausea, etc. all of which are formally recognized by the International Classification of Headache Disorders. It remains unclear how aura status and the other migraine features may be related to underlying migraine pathophysiology. Recent genome-wide association studies (GWAS) have identified 12 independent loci at which single nucleotide polymorphisms (SNPs) are associated with migraine. Using a likelihood framework, we explored the selective association of these SNPs with migraine, sub-classified according to aura status and the other features in a large population-based cohort of women including 3,003 active migraineurs and 18,108 free of migraine. Five loci met stringent significance for association with migraine, among which four were selective for sub-classified migraine, including rs11172113 (LRP1) for MO. The number of loci associated with migraine increased to 11 at suggestive significance thresholds, including five additional selective associations for MO but none for MA. No two SNPs showed similar patterns of selective association with migraine characteristics. At one extreme, SNPs rs6790925 (near TGFBR2) and rs2274316 (MEF2D) were not associated with migraine overall, MA, or MO but were selective for migraine sub-classified by the presence of one or more of the additional migraine features. In contrast, SNP rs7577262 (TRPM8) was associated with migraine overall and showed little or no selectivity for any of the migraine characteristics. The results emphasize the multivalent nature of migraine pathophysiology and suggest that a complete understanding of the genetic influence on migraine may benefit from analyses that stratify migraine according to both aura status and the additional diagnostic features used for clinical characterization of migraine. PMID:24852292
Toward a model for lexical access based on acoustic landmarks and distinctive features
NASA Astrophysics Data System (ADS)
Stevens, Kenneth N.
2002-04-01
This article describes a model in which the acoustic speech signal is processed to yield a discrete representation of the speech stream in terms of a sequence of segments, each of which is described by a set (or bundle) of binary distinctive features. These distinctive features specify the phonemic contrasts that are used in the language, such that a change in the value of a feature can potentially generate a new word. This model is a part of a more general model that derives a word sequence from this feature representation, the words being represented in a lexicon by sequences of feature bundles. The processing of the signal proceeds in three steps: (1) Detection of peaks, valleys, and discontinuities in particular frequency ranges of the signal leads to identification of acoustic landmarks. The type of landmark provides evidence for a subset of distinctive features called articulator-free features (e.g., [vowel], [consonant], [continuant]). (2) Acoustic parameters are derived from the signal near the landmarks to provide evidence for the actions of particular articulators, and acoustic cues are extracted by sampling selected attributes of these parameters in these regions. The selection of cues that are extracted depends on the type of landmark and on the environment in which it occurs. (3) The cues obtained in step (2) are combined, taking context into account, to provide estimates of ``articulator-bound'' features associated with each landmark (e.g., [lips], [high], [nasal]). These articulator-bound features, combined with the articulator-free features in (1), constitute the sequence of feature bundles that forms the output of the model. Examples of cues that are used, and justification for this selection, are given, as well as examples of the process of inferring the underlying features for a segment when there is variability in the signal due to enhancement gestures (recruited by a speaker to make a contrast more salient) or due to overlap of gestures from neighboring segments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, R; Aguilera, T; Shultz, D
2014-06-15
Purpose: This study aims to develop predictive models of patient outcome by extracting advanced imaging features (i.e., Radiomics) from FDG-PET images. Methods: We acquired pre-treatment PET scans for 51 stage I NSCLC patients treated with SABR. We calculated 139 quantitative features from each patient PET image, including 5 morphological features, 8 statistical features, 27 texture features, and 100 features from the intensity-volume histogram. Based on the imaging features, we aim to distinguish between 2 risk groups of patients: those with regional failure or distant metastasis versus those without. We investigated 3 pattern classification algorithms: linear discriminant analysis (LDA), naive Bayesmore » (NB), and logistic regression (LR). To avoid the curse of dimensionality, we performed feature selection by first removing redundant features and then applying sequential forward selection using the wrapper approach. To evaluate the predictive performance, we performed 10-fold cross validation with 1000 random splits of the data and calculated the area under the ROC curve (AUC). Results: Feature selection identified 2 texture features (homogeneity and/or wavelet decompositions) for NB and LR, while for LDA SUVmax and one texture feature (correlation) were identified. All 3 classifiers achieved statistically significant improvements over conventional PET imaging metrics such as tumor volume (AUC = 0.668) and SUVmax (AUC = 0.737). Overall, NB achieved the best predictive performance (AUC = 0.806). This also compares favorably with MTV using the best threshold at an SUV of 11.6 (AUC = 0.746). At a sensitivity of 80%, NB achieved 69% specificity, while SUVmax and tumor volume only had 36% and 47% specificity. Conclusion: Through a systematic analysis of advanced PET imaging features, we are able to build models with improved predictive value over conventional imaging metrics. If validated in a large independent cohort, the proposed techniques could potentially aid in identifying patients who might benefit from adjuvant therapy.« less
Evolution and selection of river networks: Statics, dynamics, and complexity
Rinaldo, Andrea; Rigon, Riccardo; Banavar, Jayanth R.; Maritan, Amos; Rodriguez-Iturbe, Ignacio
2014-01-01
Moving from the exact result that drainage network configurations minimizing total energy dissipation are stationary solutions of the general equation describing landscape evolution, we review the static properties and the dynamic origins of the scale-invariant structure of optimal river patterns. Optimal channel networks (OCNs) are feasible optimal configurations of a spanning network mimicking landscape evolution and network selection through imperfect searches for dynamically accessible states. OCNs are spanning loopless configurations, however, only under precise physical requirements that arise under the constraints imposed by river dynamics—every spanning tree is exactly a local minimum of total energy dissipation. It is remarkable that dynamically accessible configurations, the local optima, stabilize into diverse metastable forms that are nevertheless characterized by universal statistical features. Such universal features explain very well the statistics of, and the linkages among, the scaling features measured for fluvial landforms across a broad range of scales regardless of geology, exposed lithology, vegetation, or climate, and differ significantly from those of the ground state, known exactly. Results are provided on the emergence of criticality through adaptative evolution and on the yet-unexplored range of applications of the OCN concept. PMID:24550264
Visuomotor Transformations Underlying Hunting Behavior in Zebrafish
Bianco, Isaac H.; Engert, Florian
2015-01-01
Summary Visuomotor circuits filter visual information and determine whether or not to engage downstream motor modules to produce behavioral outputs. However, the circuit mechanisms that mediate and link perception of salient stimuli to execution of an adaptive response are poorly understood. We combined a virtual hunting assay for tethered larval zebrafish with two-photon functional calcium imaging to simultaneously monitor neuronal activity in the optic tectum during naturalistic behavior. Hunting responses showed mixed selectivity for combinations of visual features, specifically stimulus size, speed, and contrast polarity. We identified a subset of tectal neurons with similar highly selective tuning, which show non-linear mixed selectivity for visual features and are likely to mediate the perceptual recognition of prey. By comparing neural dynamics in the optic tectum during response versus non-response trials, we discovered premotor population activity that specifically preceded initiation of hunting behavior and exhibited anatomical localization that correlated with motor variables. In summary, the optic tectum contains non-linear mixed selectivity neurons that are likely to mediate reliable detection of ethologically relevant sensory stimuli. Recruitment of small tectal assemblies appears to link perception to action by providing the premotor commands that release hunting responses. These findings allow us to propose a model circuit for the visuomotor transformations underlying a natural behavior. PMID:25754638
Visuomotor transformations underlying hunting behavior in zebrafish.
Bianco, Isaac H; Engert, Florian
2015-03-30
Visuomotor circuits filter visual information and determine whether or not to engage downstream motor modules to produce behavioral outputs. However, the circuit mechanisms that mediate and link perception of salient stimuli to execution of an adaptive response are poorly understood. We combined a virtual hunting assay for tethered larval zebrafish with two-photon functional calcium imaging to simultaneously monitor neuronal activity in the optic tectum during naturalistic behavior. Hunting responses showed mixed selectivity for combinations of visual features, specifically stimulus size, speed, and contrast polarity. We identified a subset of tectal neurons with similar highly selective tuning, which show non-linear mixed selectivity for visual features and are likely to mediate the perceptual recognition of prey. By comparing neural dynamics in the optic tectum during response versus non-response trials, we discovered premotor population activity that specifically preceded initiation of hunting behavior and exhibited anatomical localization that correlated with motor variables. In summary, the optic tectum contains non-linear mixed selectivity neurons that are likely to mediate reliable detection of ethologically relevant sensory stimuli. Recruitment of small tectal assemblies appears to link perception to action by providing the premotor commands that release hunting responses. These findings allow us to propose a model circuit for the visuomotor transformations underlying a natural behavior. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
Thin Cloud Detection Method by Linear Combination Model of Cloud Image
NASA Astrophysics Data System (ADS)
Liu, L.; Li, J.; Wang, Y.; Xiao, Y.; Zhang, W.; Zhang, S.
2018-04-01
The existing cloud detection methods in photogrammetry often extract the image features from remote sensing images directly, and then use them to classify images into cloud or other things. But when the cloud is thin and small, these methods will be inaccurate. In this paper, a linear combination model of cloud images is proposed, by using this model, the underlying surface information of remote sensing images can be removed. So the cloud detection result can become more accurate. Firstly, the automatic cloud detection program in this paper uses the linear combination model to split the cloud information and surface information in the transparent cloud images, then uses different image features to recognize the cloud parts. In consideration of the computational efficiency, AdaBoost Classifier was introduced to combine the different features to establish a cloud classifier. AdaBoost Classifier can select the most effective features from many normal features, so the calculation time is largely reduced. Finally, we selected a cloud detection method based on tree structure and a multiple feature detection method using SVM classifier to compare with the proposed method, the experimental data shows that the proposed cloud detection program in this paper has high accuracy and fast calculation speed.
Object-based attention underlies the rehearsal of feature binding in visual working memory.
Shen, Mowei; Huang, Xiang; Gao, Zaifeng
2015-04-01
Feature binding is a core concept in many research fields, including the study of working memory (WM). Over the past decade, it has been debated whether keeping the feature binding in visual WM consumes more visual attention than the constituent single features. Previous studies have only explored the contribution of domain-general attention or space-based attention in the binding process; no study so far has explored the role of object-based attention in retaining binding in visual WM. We hypothesized that object-based attention underlay the mechanism of rehearsing feature binding in visual WM. Therefore, during the maintenance phase of a visual WM task, we inserted a secondary mental rotation (Experiments 1-3), transparent motion (Experiment 4), or an object-based feature report task (Experiment 5) to consume the object-based attention available for binding. In line with the prediction of the object-based attention hypothesis, Experiments 1-5 revealed a more significant impairment for binding than for constituent single features. However, this selective binding impairment was not observed when inserting a space-based visual search task (Experiment 6). We conclude that object-based attention underlies the rehearsal of binding representation in visual WM. (c) 2015 APA, all rights reserved.
Discriminative region extraction and feature selection based on the combination of SURF and saliency
NASA Astrophysics Data System (ADS)
Deng, Li; Wang, Chunhong; Rao, Changhui
2011-08-01
The objective of this paper is to provide a possible optimization on salient region algorithm, which is extensively used in recognizing and learning object categories. Salient region algorithm owns the superiority of intra-class tolerance, global score of features and automatically prominent scale selection under certain range. However, the major limitation behaves on performance, and that is what we attempt to improve. By reducing the number of pixels involved in saliency calculation, it can be accelerated. We use interest points detected by fast-Hessian, the detector of SURF, as the candidate feature for saliency operation, rather than the whole set in image. This implementation is thereby called Saliency based Optimization over SURF (SOSU for short). Experiment shows that bringing in of such a fast detector significantly speeds up the algorithm. Meanwhile, Robustness of intra-class diversity ensures object recognition accuracy.
Unconscious semantic activation depends on feature-specific attention allocation.
Spruyt, Adriaan; De Houwer, Jan; Everaert, Tom; Hermans, Dirk
2012-01-01
We examined whether semantic activation by subliminally presented stimuli is dependent upon the extent to which participants assign attention to specific semantic stimulus features and stimulus dimensions. Participants pronounced visible target words that were preceded by briefly presented, masked prime words. Both affective and non-affective semantic congruence of the prime-target pairs were manipulated under conditions that either promoted selective attention for affective stimulus information or selective attention for non-affective semantic stimulus information. In line with our predictions, results showed that affective congruence had a clear impact on word pronunciation latencies only if participants were encouraged to assign attention to the affective stimulus dimension. In contrast, non-affective semantic relatedness of the prime-target pairs produced no priming at all. Our findings are consistent with the hypothesis that unconscious activation of (affective) semantic information is modulated by feature-specific attention allocation. Copyright © 2011 Elsevier B.V. All rights reserved.
Robust evaluation of time series classification algorithms for structural health monitoring
NASA Astrophysics Data System (ADS)
Harvey, Dustin Y.; Worden, Keith; Todd, Michael D.
2014-03-01
Structural health monitoring (SHM) systems provide real-time damage and performance information for civil, aerospace, and mechanical infrastructure through analysis of structural response measurements. The supervised learning methodology for data-driven SHM involves computation of low-dimensional, damage-sensitive features from raw measurement data that are then used in conjunction with machine learning algorithms to detect, classify, and quantify damage states. However, these systems often suffer from performance degradation in real-world applications due to varying operational and environmental conditions. Probabilistic approaches to robust SHM system design suffer from incomplete knowledge of all conditions a system will experience over its lifetime. Info-gap decision theory enables nonprobabilistic evaluation of the robustness of competing models and systems in a variety of decision making applications. Previous work employed info-gap models to handle feature uncertainty when selecting various components of a supervised learning system, namely features from a pre-selected family and classifiers. In this work, the info-gap framework is extended to robust feature design and classifier selection for general time series classification through an efficient, interval arithmetic implementation of an info-gap data model. Experimental results are presented for a damage type classification problem on a ball bearing in a rotating machine. The info-gap framework in conjunction with an evolutionary feature design system allows for fully automated design of a time series classifier to meet performance requirements under maximum allowable uncertainty.
Comparative analysis of feature extraction methods in satellite imagery
NASA Astrophysics Data System (ADS)
Karim, Shahid; Zhang, Ye; Asif, Muhammad Rizwan; Ali, Saad
2017-10-01
Feature extraction techniques are extensively being used in satellite imagery and getting impressive attention for remote sensing applications. The state-of-the-art feature extraction methods are appropriate according to the categories and structures of the objects to be detected. Based on distinctive computations of each feature extraction method, different types of images are selected to evaluate the performance of the methods, such as binary robust invariant scalable keypoints (BRISK), scale-invariant feature transform, speeded-up robust features (SURF), features from accelerated segment test (FAST), histogram of oriented gradients, and local binary patterns. Total computational time is calculated to evaluate the speed of each feature extraction method. The extracted features are counted under shadow regions and preprocessed shadow regions to compare the functioning of each method. We have studied the combination of SURF with FAST and BRISK individually and found very promising results with an increased number of features and less computational time. Finally, feature matching is conferred for all methods.
Kasthurirathne, Suranga N; Dixon, Brian E; Gichoya, Judy; Xu, Huiping; Xia, Yuni; Mamlin, Burke; Grannis, Shaun J
2016-04-01
Increased adoption of electronic health records has resulted in increased availability of free text clinical data for secondary use. A variety of approaches to obtain actionable information from unstructured free text data exist. These approaches are resource intensive, inherently complex and rely on structured clinical data and dictionary-based approaches. We sought to evaluate the potential to obtain actionable information from free text pathology reports using routinely available tools and approaches that do not depend on dictionary-based approaches. We obtained pathology reports from a large health information exchange and evaluated the capacity to detect cancer cases from these reports using 3 non-dictionary feature selection approaches, 4 feature subset sizes, and 5 clinical decision models: simple logistic regression, naïve bayes, k-nearest neighbor, random forest, and J48 decision tree. The performance of each decision model was evaluated using sensitivity, specificity, accuracy, positive predictive value, and area under the receiver operating characteristics (ROC) curve. Decision models parameterized using automated, informed, and manual feature selection approaches yielded similar results. Furthermore, non-dictionary classification approaches identified cancer cases present in free text reports with evaluation measures approaching and exceeding 80-90% for most metrics. Our methods are feasible and practical approaches for extracting substantial information value from free text medical data, and the results suggest that these methods can perform on par, if not better, than existing dictionary-based approaches. Given that public health agencies are often under-resourced and lack the technical capacity for more complex methodologies, these results represent potentially significant value to the public health field. Copyright © 2016 Elsevier Inc. All rights reserved.
Online feature selection with streaming features.
Wu, Xindong; Yu, Kui; Ding, Wei; Wang, Hao; Zhu, Xingquan
2013-05-01
We propose a new online feature selection framework for applications with streaming features where the knowledge of the full feature space is unknown in advance. We define streaming features as features that flow in one by one over time whereas the number of training examples remains fixed. This is in contrast with traditional online learning methods that only deal with sequentially added observations, with little attention being paid to streaming features. The critical challenges for Online Streaming Feature Selection (OSFS) include 1) the continuous growth of feature volumes over time, 2) a large feature space, possibly of unknown or infinite size, and 3) the unavailability of the entire feature set before learning starts. In the paper, we present a novel Online Streaming Feature Selection method to select strongly relevant and nonredundant features on the fly. An efficient Fast-OSFS algorithm is proposed to improve feature selection performance. The proposed algorithms are evaluated extensively on high-dimensional datasets and also with a real-world case study on impact crater detection. Experimental results demonstrate that the algorithms achieve better compactness and higher prediction accuracy than existing streaming feature selection algorithms.
Kopp, Bruno; Wessel, Karl
2010-05-01
In the present study, event-related potentials (ERPs) were recorded to investigate cognitive processes related to the partial transmission of information from stimulus recognition to response preparation. Participants classified two-dimensional visual stimuli with dimensions size and form. One feature combination was designated as the go-target, whereas the other three feature combinations served as no-go distractors. Size discriminability was manipulated across three experimental conditions. N2c and P3a amplitudes were enhanced in response to those distractors that shared the feature from the faster dimension with the target. Moreover, N2c and P3a amplitudes showed a crossover effect: Size distractors evoked more pronounced ERPs under high size discriminability, but form distractors elicited enhanced ERPs under low size discriminability. These results suggest that partial perceptual-motor transmission of information is accompanied by acts of cognitive control and by shifts of attention between the sources of conflicting information. Selection negativity findings imply adaptive allocation of visual feature-based attention across the two stimulus dimensions.
Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia
2018-06-12
In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. Firstly, we put forward an efficient feature selection method based on the separability and the dissimilarity to determine the feature selection order for each type of feature when increasing the dimension of selected feature subsets. Secondly, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for different types of features. Finally, in the process of establishing features fusion, we come up with a classification dominance feature fusion strategy which conducts an effective basic feature. Experimental results on two datasets show that the recognition rates of Database I and Database II achieve 97.5% and 80.11%, respectively, when k = 1 for KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The novel feature selection method proposed in this paper can effectively select feature subsets that are conducive to the classification, while the feature fusion framework can fuse various features which describe the different characteristics of sensor signals, for enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing drift effect.
Fukunaga-Koontz transform based dimensionality reduction for hyperspectral imagery
NASA Astrophysics Data System (ADS)
Ochilov, S.; Alam, M. S.; Bal, A.
2006-05-01
Fukunaga-Koontz Transform based technique offers some attractive properties for desired class oriented dimensionality reduction in hyperspectral imagery. In FKT, feature selection is performed by transforming into a new space where feature classes have complimentary eigenvectors. Dimensionality reduction technique based on these complimentary eigenvector analysis can be described under two classes, desired class and background clutter, such that each basis function best represent one class while carrying the least amount of information from the second class. By selecting a few eigenvectors which are most relevant to desired class, one can reduce the dimension of hyperspectral cube. Since the FKT based technique reduces data size, it provides significant advantages for near real time detection applications in hyperspectral imagery. Furthermore, the eigenvector selection approach significantly reduces computation burden via the dimensionality reduction processes. The performance of the proposed dimensionality reduction algorithm has been tested using real-world hyperspectral dataset.
Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention.
Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei
2016-01-13
An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features.
Aad, G.; Abajyan, T.; Abbott, B.; ...
2014-08-12
Distributions sensitive to the underlying event in QCD jet events have been measured with the ATLAS detector at the LHC, based on 37 pb -1 of proton–proton collision data collected at a centre-of-mass energy of 7 TeV. Charged-particle mean p T and densities of all-particle E T and charged-particle multiplicity and p T have been measured in regions azimuthally transverse to the hardest jet in each event. These are presented both as one-dimensional distributions and with their mean values as functions of the leading-jet transverse momentum from 20 to 800 GeV. The correlation of charged-particle mean p T with charged-particlemore » multiplicity is also studied, and the E T densities include the forward rapidity region; these features provide extra data constraints for Monte Carlo modelling of colour reconnection and beam-remnant effects respectively. For the first time, underlying event observables have been computed separately for inclusive jet and exclusive dijet event selections, allowing more detailed study of the interplay of multiple partonic scattering and QCD radiation contributions to the underlying event. Comparisons to the predictions of different Monte Carlo models show a need for further model tuning, but the standard approach is found to generally reproduce the features of the underlying event in both types of event selection.« less
Double dissociation of semantic categories in Alzheimer's disease.
Gonnerman, L M; Andersen, E S; Devlin, J T; Kempler, D; Seidenberg, M S
1997-04-01
Data that demonstrate distinct patterns of semantic impairment in Alzheimer's disease (AD) are presented. Findings suggest that while groups of mild-moderate patients may not display category specific impairments, some individual patients do show selective impairment of either natural kinds or artifacts. We present a model of semantic organization in which category specific impairments arise from damage to distributed features underlying different types of categories. We incorporate the crucial notions of intercorrelations and distinguishing features, allowing us to demonstrate (1) how category specific impairments can result from widespread damage and (2) how selective deficits in AD reflect different points in the progression of impairment. The different patterns of impairment arise from an interaction between the nature of the semantic categories and the progression of damage.
EEG feature selection method based on decision tree.
Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun
2015-01-01
This paper aims to solve automated feature selection problem in brain computer interface (BCI). In order to automate feature selection process, we proposed a novel EEG feature selection method based on decision tree (DT). During the electroencephalogram (EEG) signal processing, a feature extraction method based on principle component analysis (PCA) was used, and the selection process based on decision tree was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier named support vector machine (SVM) was chosen. In order to test the validity of the proposed method, we applied the EEG feature selection method based on decision tree to BCI Competition II datasets Ia, and the experiment showed encouraging results.
Patterson, Brent R.; Anderson, Morgan L.; Rodgers, Arthur R.; Vander Vennen, Lucas M.; Fryxell, John M.
2017-01-01
Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism–a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors–has negative consequences for the viability of woodland caribou. PMID:29117234
Newton, Erica J; Patterson, Brent R; Anderson, Morgan L; Rodgers, Arthur R; Vander Vennen, Lucas M; Fryxell, John M
2017-01-01
Woodland caribou (Rangifer tarandus caribou) in Ontario are a threatened species that have experienced a substantial retraction of their historic range. Part of their decline has been attributed to increasing densities of anthropogenic linear features such as trails, roads, railways, and hydro lines. These features have been shown to increase the search efficiency and kill rate of wolves. However, it is unclear whether selection for anthropogenic linear features is additive or compensatory to selection for natural (water) linear features which may also be used for travel. We studied the selection of water and anthropogenic linear features by 52 resident wolves (Canis lupus x lycaon) over four years across three study areas in northern Ontario that varied in degrees of forestry activity and human disturbance. We used Euclidean distance-based resource selection functions (mixed-effects logistic regression) at the seasonal range scale with random coefficients for distance to water linear features, primary/secondary roads/railways, and hydro lines, and tertiary roads to estimate the strength of selection for each linear feature and for several habitat types, while accounting for availability of each feature. Next, we investigated the trade-off between selection for anthropogenic and water linear features. Wolves selected both anthropogenic and water linear features; selection for anthropogenic features was stronger than for water during the rendezvous season. Selection for anthropogenic linear features increased with increasing density of these features on the landscape, while selection for natural linear features declined, indicating compensatory selection of anthropogenic linear features. These results have implications for woodland caribou conservation. Prey encounter rates between wolves and caribou seem to be strongly influenced by increasing linear feature densities. This behavioral mechanism-a compensatory functional response to anthropogenic linear feature density resulting in decreased use of natural travel corridors-has negative consequences for the viability of woodland caribou.
McTwo: a two-step feature selection algorithm based on maximal information coefficient.
Ge, Ruiquan; Zhou, Manli; Luo, Youxi; Meng, Qinghan; Mai, Guoqin; Ma, Dongli; Wang, Guoqing; Zhou, Fengfeng
2016-03-23
High-throughput bio-OMIC technologies are producing high-dimension data from bio-samples at an ever increasing rate, whereas the training sample number in a traditional experiment remains small due to various difficulties. This "large p, small n" paradigm in the area of biomedical "big data" may be at least partly solved by feature selection algorithms, which select only features significantly associated with phenotypes. Feature selection is an NP-hard problem. Due to the exponentially increased time requirement for finding the globally optimal solution, all the existing feature selection algorithms employ heuristic rules to find locally optimal solutions, and their solutions achieve different performances on different datasets. This work describes a feature selection algorithm based on a recently published correlation measurement, Maximal Information Coefficient (MIC). The proposed algorithm, McTwo, aims to select features associated with phenotypes, independently of each other, and achieving high classification performance of the nearest neighbor algorithm. Based on the comparative study of 17 datasets, McTwo performs about as well as or better than existing algorithms, with significantly reduced numbers of selected features. The features selected by McTwo also appear to have particular biomedical relevance to the phenotypes from the literature. McTwo selects a feature subset with very good classification performance, as well as a small feature number. So McTwo may represent a complementary feature selection algorithm for the high-dimensional biomedical datasets.
NASA Astrophysics Data System (ADS)
Nemoto, Mitsutaka; Hayashi, Naoto; Hanaoka, Shouhei; Nomura, Yukihiro; Miki, Soichiro; Yoshikawa, Takeharu; Ohtomo, Kuni
2016-03-01
The purpose of this study is to evaluate the feasibility of a novel feature generation, which is based on multiple deep neural networks (DNNs) with boosting, for computer-assisted detection (CADe). It is hard and time-consuming to optimize the hyperparameters for DNNs such as stacked denoising autoencoder (SdA). The proposed method allows using SdA based features without the burden of the hyperparameter setting. The proposed method was evaluated by an application for detecting cerebral aneurysms on magnetic resonance angiogram (MRA). A baseline CADe process included four components; scaling, candidate area limitation, candidate detection, and candidate classification. Proposed feature generation method was applied to extract the optimal features for candidate classification. Proposed method only required setting range of the hyperparameters for SdA. The optimal feature set was selected from a large quantity of SdA based features by multiple SdAs, each of which was trained using different hyperparameter set. The feature selection was operated through ada-boost ensemble learning method. Training of the baseline CADe process and proposed feature generation were operated with 200 MRA cases, and the evaluation was performed with 100 MRA cases. Proposed method successfully provided SdA based features just setting the range of some hyperparameters for SdA. The CADe process by using both previous voxel features and SdA based features had the best performance with 0.838 of an area under ROC curve and 0.312 of ANODE score. The results showed that proposed method was effective in the application for detecting cerebral aneurysms on MRA.
Balcarras, Matthew; Ardid, Salva; Kaping, Daniel; Everling, Stefan; Womelsdorf, Thilo
2016-02-01
Attention includes processes that evaluate stimuli relevance, select the most relevant stimulus against less relevant stimuli, and bias choice behavior toward the selected information. It is not clear how these processes interact. Here, we captured these processes in a reinforcement learning framework applied to a feature-based attention task that required macaques to learn and update the value of stimulus features while ignoring nonrelevant sensory features, locations, and action plans. We found that value-based reinforcement learning mechanisms could account for feature-based attentional selection and choice behavior but required a value-independent stickiness selection process to explain selection errors while at asymptotic behavior. By comparing different reinforcement learning schemes, we found that trial-by-trial selections were best predicted by a model that only represents expected values for the task-relevant feature dimension, with nonrelevant stimulus features and action plans having only a marginal influence on covert selections. These findings show that attentional control subprocesses can be described by (1) the reinforcement learning of feature values within a restricted feature space that excludes irrelevant feature dimensions, (2) a stochastic selection process on feature-specific value representations, and (3) value-independent stickiness toward previous feature selections akin to perseveration in the motor domain. We speculate that these three mechanisms are implemented by distinct but interacting brain circuits and that the proposed formal account of feature-based stimulus selection will be important to understand how attentional subprocesses are implemented in primate brain networks.
Colomer Granero, Adrián; Fuentes-Hurtado, Félix; Naranjo Ornedo, Valery; Guixeres Provinciale, Jaime; Ausín, Jose M.; Alcañiz Raya, Mariano
2016-01-01
This work focuses on finding the most discriminatory or representative features that allow to classify commercials according to negative, neutral and positive effectiveness based on the Ace Score index. For this purpose, an experiment involving forty-seven participants was carried out. In this experiment electroencephalography (EEG), electrocardiography (ECG), Galvanic Skin Response (GSR) and respiration data were acquired while subjects were watching a 30-min audiovisual content. This content was composed by a submarine documentary and nine commercials (one of them the ad under evaluation). After the signal pre-processing, four sets of features were extracted from the physiological signals using different state-of-the-art metrics. These features computed in time and frequency domains are the inputs to several basic and advanced classifiers. An average of 89.76% of the instances was correctly classified according to the Ace Score index. The best results were obtained by a classifier consisting of a combination between AdaBoost and Random Forest with automatic selection of features. The selected features were those extracted from GSR and HRV signals. These results are promising in the audiovisual content evaluation field by means of physiological signal processing. PMID:27471462
Colomer Granero, Adrián; Fuentes-Hurtado, Félix; Naranjo Ornedo, Valery; Guixeres Provinciale, Jaime; Ausín, Jose M; Alcañiz Raya, Mariano
2016-01-01
This work focuses on finding the most discriminatory or representative features that allow to classify commercials according to negative, neutral and positive effectiveness based on the Ace Score index. For this purpose, an experiment involving forty-seven participants was carried out. In this experiment electroencephalography (EEG), electrocardiography (ECG), Galvanic Skin Response (GSR) and respiration data were acquired while subjects were watching a 30-min audiovisual content. This content was composed by a submarine documentary and nine commercials (one of them the ad under evaluation). After the signal pre-processing, four sets of features were extracted from the physiological signals using different state-of-the-art metrics. These features computed in time and frequency domains are the inputs to several basic and advanced classifiers. An average of 89.76% of the instances was correctly classified according to the Ace Score index. The best results were obtained by a classifier consisting of a combination between AdaBoost and Random Forest with automatic selection of features. The selected features were those extracted from GSR and HRV signals. These results are promising in the audiovisual content evaluation field by means of physiological signal processing.
The effect of feature selection methods on computer-aided detection of masses in mammograms
NASA Astrophysics Data System (ADS)
Hupse, Rianne; Karssemeijer, Nico
2010-05-01
In computer-aided diagnosis (CAD) research, feature selection methods are often used to improve generalization performance of classifiers and shorten computation times. In an application that detects malignant masses in mammograms, we investigated the effect of using a selection criterion that is similar to the final performance measure we are optimizing, namely the mean sensitivity of the system in a predefined range of the free-response receiver operating characteristics (FROC). To obtain the generalization performance of the selected feature subsets, a cross validation procedure was performed on a dataset containing 351 abnormal and 7879 normal regions, each region providing a set of 71 mass features. The same number of noise features, not containing any information, were added to investigate the ability of the feature selection algorithms to distinguish between useful and non-useful features. It was found that significantly higher performances were obtained using feature sets selected by the general test statistic Wilks' lambda than using feature sets selected by the more specific FROC measure. Feature selection leads to better performance when compared to a system in which all features were used.
Gao, Dashan; Vasconcelos, Nuno
2009-01-01
A decision-theoretic formulation of visual saliency, first proposed for top-down processing (object recognition) (Gao & Vasconcelos, 2005a), is extended to the problem of bottom-up saliency. Under this formulation, optimality is defined in the minimum probability of error sense, under a constraint of computational parsimony. The saliency of the visual features at a given location of the visual field is defined as the power of those features to discriminate between the stimulus at the location and a null hypothesis. For bottom-up saliency, this is the set of visual features that surround the location under consideration. Discrimination is defined in an information-theoretic sense and the optimal saliency detector derived for a class of stimuli that complies with known statistical properties of natural images. It is shown that under the assumption that saliency is driven by linear filtering, the optimal detector consists of what is usually referred to as the standard architecture of V1: a cascade of linear filtering, divisive normalization, rectification, and spatial pooling. The optimal detector is also shown to replicate the fundamental properties of the psychophysics of saliency: stimulus pop-out, saliency asymmetries for stimulus presence versus absence, disregard of feature conjunctions, and Weber's law. Finally, it is shown that the optimal saliency architecture can be applied to the solution of generic inference problems. In particular, for the class of stimuli studied, it performs the three fundamental operations of statistical inference: assessment of probabilities, implementation of Bayes decision rule, and feature selection.
Detection of stress/anxiety state from EEG features during video watching.
Giannakakis, Giorgos; Grigoriadis, Dimitris; Tsiknakis, Manolis
2015-01-01
This paper studies the effect of stress/anxiety states on EEG signals during video sessions. The levels of arousal and valence that are induced to each subject while watching each video are self rated. These levels are mapped in stress and relaxed states and subjects that fufill criteria of adequate anxiety/stress scale were chosen leading to a subset of 18 subjects. Then, temporal, spectral and non linear EEG features are evaluated for being able to represent accurately states under investigation. Feature selection schemes choose the most significant of them in order to provide increased discrimination ability between relaxed and anxiety/stress states.
The genetical theory of social behaviour
Lehmann, Laurent; Rousset, François
2014-01-01
We survey the population genetic basis of social evolution, using a logically consistent set of arguments to cover a wide range of biological scenarios. We start by reconsidering Hamilton's (Hamilton 1964 J. Theoret. Biol. 7, 1–16 (doi:10.1016/0022-5193(64)90038-4)) results for selection on a social trait under the assumptions of additive gene action, weak selection and constant environment and demography. This yields a prediction for the direction of allele frequency change in terms of phenotypic costs and benefits and genealogical concepts of relatedness, which holds for any frequency of the trait in the population, and provides the foundation for further developments and extensions. We then allow for any type of gene interaction within and between individuals, strong selection and fluctuating environments and demography, which may depend on the evolving trait itself. We reach three conclusions pertaining to selection on social behaviours under broad conditions. (i) Selection can be understood by focusing on a one-generation change in mean allele frequency, a computation which underpins the utility of reproductive value weights; (ii) in large populations under the assumptions of additive gene action and weak selection, this change is of constant sign for any allele frequency and is predicted by a phenotypic selection gradient; (iii) under the assumptions of trait substitution sequences, such phenotypic selection gradients suffice to characterize long-term multi-dimensional stochastic evolution, with almost no knowledge about the genetic details underlying the coevolving traits. Having such simple results about the effect of selection regardless of population structure and type of social interactions can help to delineate the common features of distinct biological processes. Finally, we clarify some persistent divergences within social evolution theory, with respect to exactness, synergies, maximization, dynamic sufficiency and the role of genetic arguments. PMID:24686929
The genetical theory of social behaviour.
Lehmann, Laurent; Rousset, François
2014-05-19
We survey the population genetic basis of social evolution, using a logically consistent set of arguments to cover a wide range of biological scenarios. We start by reconsidering Hamilton's (Hamilton 1964 J. Theoret. Biol. 7, 1-16 (doi:10.1016/0022-5193(64)90038-4)) results for selection on a social trait under the assumptions of additive gene action, weak selection and constant environment and demography. This yields a prediction for the direction of allele frequency change in terms of phenotypic costs and benefits and genealogical concepts of relatedness, which holds for any frequency of the trait in the population, and provides the foundation for further developments and extensions. We then allow for any type of gene interaction within and between individuals, strong selection and fluctuating environments and demography, which may depend on the evolving trait itself. We reach three conclusions pertaining to selection on social behaviours under broad conditions. (i) Selection can be understood by focusing on a one-generation change in mean allele frequency, a computation which underpins the utility of reproductive value weights; (ii) in large populations under the assumptions of additive gene action and weak selection, this change is of constant sign for any allele frequency and is predicted by a phenotypic selection gradient; (iii) under the assumptions of trait substitution sequences, such phenotypic selection gradients suffice to characterize long-term multi-dimensional stochastic evolution, with almost no knowledge about the genetic details underlying the coevolving traits. Having such simple results about the effect of selection regardless of population structure and type of social interactions can help to delineate the common features of distinct biological processes. Finally, we clarify some persistent divergences within social evolution theory, with respect to exactness, synergies, maximization, dynamic sufficiency and the role of genetic arguments.
Speech Emotion Feature Selection Method Based on Contribution Analysis Algorithm of Neural Network
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang Xiaojia; Mao Qirong; Zhan Yongzhao
There are many emotion features. If all these features are employed to recognize emotions, redundant features may be existed. Furthermore, recognition result is unsatisfying and the cost of feature extraction is high. In this paper, a method to select speech emotion features based on contribution analysis algorithm of NN is presented. The emotion features are selected by using contribution analysis algorithm of NN from the 95 extracted features. Cluster analysis is applied to analyze the effectiveness for the features selected, and the time of feature extraction is evaluated. Finally, 24 emotion features selected are used to recognize six speech emotions.more » The experiments show that this method can improve the recognition rate and the time of feature extraction.« less
Zhao, Yu-Xiang; Chou, Chien-Hsing
2016-01-01
In this study, a new feature selection algorithm, the neighborhood-relationship feature selection (NRFS) algorithm, is proposed for identifying rat electroencephalogram signals and recognizing Chinese characters. In these two applications, dependent relationships exist among the feature vectors and their neighboring feature vectors. Therefore, the proposed NRFS algorithm was designed for solving this problem. By applying the NRFS algorithm, unselected feature vectors have a high priority of being added into the feature subset if the neighboring feature vectors have been selected. In addition, selected feature vectors have a high priority of being eliminated if the neighboring feature vectors are not selected. In the experiments conducted in this study, the NRFS algorithm was compared with two feature algorithms. The experimental results indicated that the NRFS algorithm can extract the crucial frequency bands for identifying rat vigilance states and identifying crucial character regions for recognizing Chinese characters. PMID:27314346
Selective Audiovisual Semantic Integration Enabled by Feature-Selective Attention
Li, Yuanqing; Long, Jinyi; Huang, Biao; Yu, Tianyou; Wu, Wei; Li, Peijun; Fang, Fang; Sun, Pei
2016-01-01
An audiovisual object may contain multiple semantic features, such as the gender and emotional features of the speaker. Feature-selective attention and audiovisual semantic integration are two brain functions involved in the recognition of audiovisual objects. Humans often selectively attend to one or several features while ignoring the other features of an audiovisual object. Meanwhile, the human brain integrates semantic information from the visual and auditory modalities. However, how these two brain functions correlate with each other remains to be elucidated. In this functional magnetic resonance imaging (fMRI) study, we explored the neural mechanism by which feature-selective attention modulates audiovisual semantic integration. During the fMRI experiment, the subjects were presented with visual-only, auditory-only, or audiovisual dynamical facial stimuli and performed several feature-selective attention tasks. Our results revealed that a distribution of areas, including heteromodal areas and brain areas encoding attended features, may be involved in audiovisual semantic integration. Through feature-selective attention, the human brain may selectively integrate audiovisual semantic information from attended features by enhancing functional connectivity and thus regulating information flows from heteromodal areas to brain areas encoding the attended features. PMID:26759193
Ataer-Cansizoglu, E; Kalpathy-Cramer, J; You, S; Keck, K; Erdogmus, D; Chiang, M F
2015-01-01
Inter-expert variability in image-based clinical diagnosis has been demonstrated in many diseases including retinopathy of prematurity (ROP), which is a disease affecting low birth weight infants and is a major cause of childhood blindness. In order to better understand the underlying causes of variability among experts, we propose a method to quantify the variability of expert decisions and analyze the relationship between expert diagnoses and features computed from the images. Identification of these features is relevant for development of computer-based decision support systems and educational systems in ROP, and these methods may be applicable to other diseases where inter-expert variability is observed. The experiments were carried out on a dataset of 34 retinal images, each with diagnoses provided independently by 22 experts. Analysis was performed using concepts of Mutual Information (MI) and Kernel Density Estimation. A large set of structural features (a total of 66) were extracted from retinal images. Feature selection was utilized to identify the most important features that correlated to actual clinical decisions by the 22 study experts. The best three features for each observer were selected by an exhaustive search on all possible feature subsets and considering joint MI as a relevance criterion. We also compared our results with the results of Cohen's Kappa [36] as an inter-rater reliability measure. The results demonstrate that a group of observers (17 among 22) decide consistently with each other. Mean and second central moment of arteriolar tortuosity is among the reasons of disagreement between this group and the rest of the observers, meaning that the group of experts consider amount of tortuosity as well as the variation of tortuosity in the image. Given a set of image-based features, the proposed analysis method can identify critical image-based features that lead to expert agreement and disagreement in diagnosis of ROP. Although tree-based features and various statistics such as central moment are not popular in the literature, our results suggest that they are important for diagnosis.
Natural image classification driven by human brain activity
NASA Astrophysics Data System (ADS)
Zhang, Dai; Peng, Hanyang; Wang, Jinqiao; Tang, Ming; Xue, Rong; Zuo, Zhentao
2016-03-01
Natural image classification has been a hot topic in computer vision and pattern recognition research field. Since the performance of an image classification system can be improved by feature selection, many image feature selection methods have been developed. However, the existing supervised feature selection methods are typically driven by the class label information that are identical for different samples from the same class, ignoring with-in class image variability and therefore degrading the feature selection performance. In this study, we propose a novel feature selection method, driven by human brain activity signals collected using fMRI technique when human subjects were viewing natural images of different categories. The fMRI signals associated with subjects viewing different images encode the human perception of natural images, and therefore may capture image variability within- and cross- categories. We then select image features with the guidance of fMRI signals from brain regions with active response to image viewing. Particularly, bag of words features based on GIST descriptor are extracted from natural images for classification, and a sparse regression base feature selection method is adapted to select image features that can best predict fMRI signals. Finally, a classification model is built on the select image features to classify images without fMRI signals. The validation experiments for classifying images from 4 categories of two subjects have demonstrated that our method could achieve much better classification performance than the classifiers built on image feature selected by traditional feature selection methods.
EFS: an ensemble feature selection tool implemented as R-package and web-application.
Neumann, Ursula; Genze, Nikita; Heider, Dominik
2017-01-01
Feature selection methods aim at identifying a subset of features that improve the prediction performance of subsequent classification models and thereby also simplify their interpretability. Preceding studies demonstrated that single feature selection methods can have specific biases, whereas an ensemble feature selection has the advantage to alleviate and compensate for these biases. The software EFS (Ensemble Feature Selection) makes use of multiple feature selection methods and combines their normalized outputs to a quantitative ensemble importance. Currently, eight different feature selection methods have been integrated in EFS, which can be used separately or combined in an ensemble. EFS identifies relevant features while compensating specific biases of single methods due to an ensemble approach. Thereby, EFS can improve the prediction accuracy and interpretability in subsequent binary classification models. EFS can be downloaded as an R-package from CRAN or used via a web application at http://EFS.heiderlab.de.
Feature selection methods for big data bioinformatics: A survey from the search perspective.
Wang, Lipo; Wang, Yaoli; Chang, Qing
2016-12-01
This paper surveys main principles of feature selection and their recent applications in big data bioinformatics. Instead of the commonly used categorization into filter, wrapper, and embedded approaches to feature selection, we formulate feature selection as a combinatorial optimization or search problem and categorize feature selection methods into exhaustive search, heuristic search, and hybrid methods, where heuristic search methods may further be categorized into those with or without data-distilled feature ranking measures. Copyright © 2016 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Chen; Ni, Zhiwei; Ni, Liping; Tang, Na
2016-10-01
Feature selection is an important method of data preprocessing in data mining. In this paper, a novel feature selection method based on multi-fractal dimension and harmony search algorithm is proposed. Multi-fractal dimension is adopted as the evaluation criterion of feature subset, which can determine the number of selected features. An improved harmony search algorithm is used as the search strategy to improve the efficiency of feature selection. The performance of the proposed method is compared with that of other feature selection algorithms on UCI data-sets. Besides, the proposed method is also used to predict the daily average concentration of PM2.5 in China. Experimental results show that the proposed method can obtain competitive results in terms of both prediction accuracy and the number of selected features.
Ozcift, Akin
2012-08-01
Parkinson disease (PD) is an age-related deterioration of certain nerve systems, which affects movement, balance, and muscle control of clients. PD is one of the common diseases which affect 1% of people older than 60 years. A new classification scheme based on support vector machine (SVM) selected features to train rotation forest (RF) ensemble classifiers is presented for improving diagnosis of PD. The dataset contains records of voice measurements from 31 people, 23 with PD and each record in the dataset is defined with 22 features. The diagnosis model first makes use of a linear SVM to select ten most relevant features from 22. As a second step of the classification model, six different classifiers are trained with the subset of features. Subsequently, at the third step, the accuracies of classifiers are improved by the utilization of RF ensemble classification strategy. The results of the experiments are evaluated using three metrics; classification accuracy (ACC), Kappa Error (KE) and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). Performance measures of two base classifiers, i.e. KStar and IBk, demonstrated an apparent increase in PD diagnosis accuracy compared to similar studies in literature. After all, application of RF ensemble classification scheme improved PD diagnosis in 5 of 6 classifiers significantly. We, numerically, obtained about 97% accuracy in RF ensemble of IBk (a K-Nearest Neighbor variant) algorithm, which is a quite high performance for Parkinson disease diagnosis.
NASA Astrophysics Data System (ADS)
Pietrzyk, Mariusz W.; Donovan, Tim; Brennan, Patrick C.; Dix, Alan; Manning, David J.
2011-03-01
Aim: To optimize automated classification of radiological errors during lung nodule detection from chest radiographs (CxR) using a support vector machine (SVM) run on the spatial frequency features extracted from the local background of selected regions. Background: The majority of the unreported pulmonary nodules are visually detected but not recognized; shown by the prolonged dwell time values at false-negative regions. Similarly, overestimated nodule locations are capturing substantial amounts of foveal attention. Spatial frequency properties of selected local backgrounds are correlated with human observer responses either in terms of accuracy in indicating abnormality position or in the precision of visual sampling the medical images. Methods: Seven radiologists participated in the eye tracking experiments conducted under conditions of pulmonary nodule detection from a set of 20 postero-anterior CxR. The most dwelled locations have been identified and subjected to spatial frequency (SF) analysis. The image-based features of selected ROI were extracted with un-decimated Wavelet Packet Transform. An analysis of variance was run to select SF features and a SVM schema was implemented to classify False-Negative and False-Positive from all ROI. Results: A relative high overall accuracy was obtained for each individually developed Wavelet-SVM algorithm, with over 90% average correct ratio for errors recognition from all prolonged dwell locations. Conclusion: The preliminary results show that combined eye-tracking and image-based features can be used for automated detection of radiological error with SVM. The work is still in progress and not all analytical procedures have been completed, which might have an effect on the specificity of the algorithm.
NASA Astrophysics Data System (ADS)
Tang, Zhenchao; Liu, Zhenyu; Zhang, Xiaoyan; Shi, Yanjie; Wang, Shou; Fang, Mengjie; Sun, Yingshi; Dong, Enqing; Tian, Jie
2018-02-01
The Locally advanced rectal cancer (LARC) patients were routinely treated with neoadjuvant chemoradiotherapy (CRT) firstly and received total excision afterwards. While, the LARC patients might relieve to T1N0M0/T0N0M0 stage after the CRT, which would enable the patients be qualified for local excision. However, accurate pathological TNM stage could only be obtained by the pathological examination after surgery. We aimed to conduct a Radiomics analysis of Diffusion weighted Imaging (DWI) data to identify the patients in T1N0M0/T0N0M0 stages before surgery, in hope of providing clinical surgery decision support. 223 routinely treated LARC patients in Beijing Cancer Hospital were enrolled in current study. DWI data and clinical characteristics were collected after CRT. According to the pathological TNM stage, the patients of T1N0M0 and T0N0M0 stages were labelled as 1 and the other patients were labelled as 0. The first 123 patients in chronological order were used as training set, and the rest patients as validation set. 563 image features extracted from the DWI data and clinical characteristics were used as features. Two-sample T test was conducted to pre-select the top 50% discriminating features. Least absolute shrinkage and selection operator (Lasso)-Logistic regression model was conducted to further select features and construct the classification model. Based on the 14 selected image features, the area under the Receiver Operating Characteristic (ROC) curve (AUC) of 0.8781, classification Accuracy (ACC) of 0.8432 were achieved in the training set. In the validation set, AUC of 0.8707, ACC (ACC) of 0.84 were observed.
NASA Astrophysics Data System (ADS)
Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang
2017-01-01
Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods.
Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang
2017-01-01
Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for the tasks, in which the performance is heavily dependent on the feature selection techniques, like the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders, which progresses slowly while affects the quality of life dramatically. In this paper, we use the data acquired from multi-modal neuroimaging data to diagnose PD by investigating the brain regions, known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions, specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883
NASA Astrophysics Data System (ADS)
Song, Bowen; Zhang, Guopeng; Wang, Huafeng; Zhu, Wei; Liang, Zhengrong
2013-02-01
Various types of features, e.g., geometric features, texture features, projection features etc., have been introduced for polyp detection and differentiation tasks via computer aided detection and diagnosis (CAD) for computed tomography colonography (CTC). Although these features together cover more information of the data, some of them are statistically highly-related to others, which made the feature set redundant and burdened the computation task of CAD. In this paper, we proposed a new dimension reduction method which combines hierarchical clustering and principal component analysis (PCA) for false positives (FPs) reduction task. First, we group all the features based on their similarity using hierarchical clustering, and then PCA is employed within each group. Different numbers of principal components are selected from each group to form the final feature set. Support vector machine is used to perform the classification. The results show that when three principal components were chosen from each group we can achieve an area under the curve of receiver operating characteristics of 0.905, which is as high as the original dataset. Meanwhile, the computation time is reduced by 70% and the feature set size is reduce by 77%. It can be concluded that the proposed method captures the most important information of the feature set and the classification accuracy is not affected after the dimension reduction. The result is promising and further investigation, such as automatically threshold setting, are worthwhile and are under progress.
Protein quality control in the early secretory pathway
Anelli, Tiziana; Sitia, Roberto
2008-01-01
Eukaryotic cells are able to discriminate between native and non-native polypeptides, selectively transporting the former to their final destinations. Secretory proteins are scrutinized at the endoplasmic reticulum (ER)–Golgi interface. Recent findings reveal novel features of the underlying molecular mechanisms, with several chaperone networks cooperating in assisting the maturation of complex proteins and being selectively induced to match changing synthetic demands. ‘Public' and ‘private' chaperones, some of which enriched in specializes subregions, operate for most or selected substrates, respectively. Moreover, sequential checkpoints are distributed along the early secretory pathway, allowing efficiency and fidelity in protein secretion. PMID:18216874
NASA Astrophysics Data System (ADS)
Lederman, Dror; Zheng, Bin; Wang, Xingwei; Wang, Xiao Hui; Gur, David
2011-03-01
We have developed a multi-probe resonance-frequency electrical impedance spectroscope (REIS) system to detect breast abnormalities. Based on assessing asymmetry in REIS signals acquired between left and right breasts, we developed several machine learning classifiers to classify younger women (i.e., under 50YO) into two groups of having high and low risk for developing breast cancer. In this study, we investigated a new method to optimize performance based on the area under a selected partial receiver operating characteristic (ROC) curve when optimizing an artificial neural network (ANN), and tested whether it could improve classification performance. From an ongoing prospective study, we selected a dataset of 174 cases for whom we have both REIS signals and diagnostic status verification. The dataset includes 66 "positive" cases recommended for biopsy due to detection of highly suspicious breast lesions and 108 "negative" cases determined by imaging based examinations. A set of REIS-based feature differences, extracted from the two breasts using a mirror-matched approach, was computed and constituted an initial feature pool. Using a leave-one-case-out cross-validation method, we applied a genetic algorithm (GA) to train the ANN with an optimal subset of features. Two optimization criteria were separately used in GA optimization, namely the area under the entire ROC curve (AUC) and the partial area under the ROC curve, up to a predetermined threshold (i.e., 90% specificity). The results showed that although the ANN optimized using the entire AUC yielded higher overall performance (AUC = 0.83 versus 0.76), the ANN optimized using the partial ROC area criterion achieved substantially higher operational performance (i.e., increasing sensitivity level from 28% to 48% at 95% specificity and/ or from 48% to 58% at 90% specificity).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Song, J; Pollom, E; Durkee, B
2015-06-15
Purpose: To predict response to radiation treatment using computational FDG-PET and CT images in locally advanced head and neck cancer (HNC). Methods: 68 patients with State III-IVB HNC treated with chemoradiation were included in this retrospective study. For each patient, we analyzed primary tumor and lymph nodes on PET and CT scans acquired both prior to and during radiation treatment, which led to 8 combinations of image datasets. From each image set, we extracted high-throughput, radiomic features of the following types: statistical, morphological, textural, histogram, and wavelet, resulting in a total of 437 features. We then performed unsupervised redundancy removalmore » and stability test on these features. To avoid over-fitting, we trained a logistic regression model with simultaneous feature selection based on least absolute shrinkage and selection operator (LASSO). To objectively evaluate the prediction ability, we performed 5-fold cross validation (CV) with 50 random repeats of stratified bootstrapping. Feature selection and model training was solely conducted on the training set and independently validated on the holdout test set. Receiver operating characteristic (ROC) curve of the pooled Result and the area under the ROC curve (AUC) was calculated as figure of merit. Results: For predicting local-regional recurrence, our model built on pre-treatment PET of lymph nodes achieved the best performance (AUC=0.762) on 5-fold CV, which compared favorably with node volume and SUVmax (AUC=0.704 and 0.449, p<0.001). Wavelet coefficients turned out to be the most predictive features. Prediction of distant recurrence showed a similar trend, in which pre-treatment PET features of lymph nodes had the highest AUC of 0.705. Conclusion: The radiomics approach identified novel imaging features that are predictive to radiation treatment response. If prospectively validated in larger cohorts, they could aid in risk-adaptive treatment of HNC.« less
NASA Astrophysics Data System (ADS)
Jelinek, Herbert F.; Cree, Michael J.; Leandro, Jorge J. G.; Soares, João V. B.; Cesar, Roberto M.; Luckie, A.
2007-05-01
Proliferative diabetic retinopathy can lead to blindness. However, early recognition allows appropriate, timely intervention. Fluorescein-labeled retinal blood vessels of 27 digital images were automatically segmented using the Gabor wavelet transform and classified using traditional features such as area, perimeter, and an additional five morphological features based on the derivatives-of-Gaussian wavelet-derived data. Discriminant analysis indicated that traditional features do not detect early proliferative retinopathy. The best single feature for discrimination was the wavelet curvature with an area under the curve (AUC) of 0.76. Linear discriminant analysis with a selection of six features achieved an AUC of 0.90 (0.73-0.97, 95% confidence interval). The wavelet method was able to segment retinal blood vessels and classify the images according to the presence or absence of proliferative retinopathy.
Classification of pulmonary nodules in lung CT images using shape and texture features
NASA Astrophysics Data System (ADS)
Dhara, Ashis Kumar; Mukhopadhyay, Sudipta; Dutta, Anirvan; Garg, Mandeep; Khandelwal, Niranjan; Kumar, Prafulla
2016-03-01
Differentiation of malignant and benign pulmonary nodules is important for prognosis of lung cancer. In this paper, benign and malignant nodules are classified using support vector machine. Several shape-based and texture-based features are used to represent the pulmonary nodules in the feature space. A semi-automated technique is used for nodule segmentation. Relevant features are selected for efficient representation of nodules in the feature space. The proposed scheme and the competing technique are evaluated on a data set of 542 nodules of Lung Image Database Consortium and Image Database Resource Initiative. The nodules with composite rank of malignancy "1","2" are considered as benign and "4","5" are considered as malignant. Area under the receiver operating characteristics curve is 0:9465 for the proposed method. The proposed method outperforms the competing technique.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Carlson, J.J.; Bouchard, A.M.; Osbourn, G.C.
Future generation automated human biometric identification and verification will require multiple features/sensors together with internal and external information sources to achieve high performance, accuracy, and reliability in uncontrolled environments. The primary objective of the proposed research is to develop a theoretical and practical basis for identifying and verifying people using standoff biometric features that can be obtained with minimal inconvenience during the verification process. The basic problem involves selecting sensors and discovering features that provide sufficient information to reliably verify a person`s identity under the uncertainties caused by measurement errors and tactics of uncooperative subjects. A system was developed formore » discovering hand, face, ear, and voice features and fusing them to verify the identity of people. The system obtains its robustness and reliability by fusing many coarse and easily measured features into a near minimal probability of error decision algorithm.« less
Compact cancer biomarkers discovery using a swarm intelligence feature selection algorithm.
Martinez, Emmanuel; Alvarez, Mario Moises; Trevino, Victor
2010-08-01
Biomarker discovery is a typical application from functional genomics. Due to the large number of genes studied simultaneously in microarray data, feature selection is a key step. Swarm intelligence has emerged as a solution for the feature selection problem. However, swarm intelligence settings for feature selection fail to select small features subsets. We have proposed a swarm intelligence feature selection algorithm based on the initialization and update of only a subset of particles in the swarm. In this study, we tested our algorithm in 11 microarray datasets for brain, leukemia, lung, prostate, and others. We show that the proposed swarm intelligence algorithm successfully increase the classification accuracy and decrease the number of selected features compared to other swarm intelligence methods. Copyright © 2010 Elsevier Ltd. All rights reserved.
Giacomini, M; Luft, H S; Robinson, J C
1995-01-01
This paper surveys recent health care reform debates and empirical evidence regarding the potential role for risk adjusters in addressing the problem of competitive risk segmentation under capitated financing. We discuss features of health plan markets affecting risk selection, methodological considerations in measuring it, and alternative approaches to financial correction for risk differentials. The appropriate approach to assessing risk differences between health plans depends upon the nature of market risk selection allowed under a given reform scenario. Because per capita costs depend on a health plan's population risk, efficiency, and quality of service, risk adjustment will most strongly promote efficiency in environments with commensurately strong incentives for quality care.
A real negative selection algorithm with evolutionary preference for anomaly detection
NASA Astrophysics Data System (ADS)
Yang, Tao; Chen, Wen; Li, Tao
2017-04-01
Traditional real negative selection algorithms (RNSAs) adopt the estimated coverage (c0) as the algorithm termination threshold, and generate detectors randomly. With increasing dimensions, the data samples could reside in the low-dimensional subspace, so that the traditional detectors cannot effectively distinguish these samples. Furthermore, in high-dimensional feature space, c0 cannot exactly reflect the detectors set coverage rate for the nonself space, and it could lead the algorithm to be terminated unexpectedly when the number of detectors is insufficient. These shortcomings make the traditional RNSAs to perform poorly in high-dimensional feature space. Based upon "evolutionary preference" theory in immunology, this paper presents a real negative selection algorithm with evolutionary preference (RNSAP). RNSAP utilizes the "unknown nonself space", "low-dimensional target subspace" and "known nonself feature" as the evolutionary preference to guide the generation of detectors, thus ensuring the detectors can cover the nonself space more effectively. Besides, RNSAP uses redundancy to replace c0 as the termination threshold, in this way RNSAP can generate adequate detectors under a proper convergence rate. The theoretical analysis and experimental result demonstrate that, compared to the classical RNSA (V-detector), RNSAP can achieve a higher detection rate, but with less detectors and computing cost.
The Cutplane - A tool for interactive solid modeling
NASA Technical Reports Server (NTRS)
Edwards, Laurence; Kessler, William; Leifer, Larry
1988-01-01
A geometric modeling system which incorporates a new concept for intuitively and unambiguously specifying and manipulating points or features in three dimensional space is presented. The central concept, the Cutplane, consists of a plane that moves through space under control of a mouse or similar input device. The intersection of the plane and any object is highlighted, and only this highlighted section can be selected for manipulation. Selection is accomplished with a crosshair that is constrained to remain within the plane, so that the relationship between the crosshair and the feature of interest is immediately evident. Although the idea of a section view is not new, previously it has been used as a way to reveal hidden structure, not as a means of manipulating objects or indicating spatial position, as is proposed here.
NASA Astrophysics Data System (ADS)
Gao, Yi; Zhu, Liangjia; Norton, Isaiah; Agar, Nathalie Y. R.; Tannenbaum, Allen
2014-03-01
Desorption electrospray ionization mass spectrometry (DESI-MS) provides a highly sensitive imaging technique for differentiating normal and cancerous tissue at the molecular level. This can be very useful, especially under intra-operative conditions where the surgeon has to make crucial decision about the tumor boundary. In such situations, the time it takes for imaging and data analysis becomes a critical factor. Therefore, in this work we utilize compressive sensing to perform the sparse sampling of the tissue, which halves the scanning time. Furthermore, sparse feature selection is performed, which not only reduces the dimension of data from about 104 to less than 50, and thus significantly shortens the analysis time. This procedure also identifies biochemically important molecules for further pathological analysis. The methods are validated on brain and breast tumor data sets.
Sturtz, Timothy M.; Adar, Sara D.; Gould, Timothy; Larson, Timothy V.
2016-01-01
PM10-2.5 mass and trace element concentrations were measured in Winston-Salem, Chicago, and St. Paul at up to 60 sites per city during two different seasons in 2010. Positive Matrix Factorization (PMF) was used to explore the underlying sources of variability. Information on previously reported PM10-2.5 tire and brake wear profiles was used to constrain these features in PMF by prior specification of selected species ratios. We also modified PMF to allow for combining the measurements from all three cities into a single model while preserving city-specific soil features. Relatively minor differences were observed between model predictions with and without the prior ratio constraints, increasing confidence in our ability to identify separate brake wear and tire wear features. Brake wear, tire wear, fertilized soil, and re-suspended soil were found to be important sources of copper, zinc, phosphorus, and silicon respectively across all three urban areas. PMID:27468256
Anomaly Detection of Electromyographic Signals.
Ijaz, Ahsan; Choi, Jongeun
2018-04-01
In this paper, we provide a robust framework to detect anomalous electromyographic (EMG) signals and identify contamination types. As a first step for feature selection, optimally selected Lawton wavelets transform is applied. Robust principal component analysis (rPCA) is then performed on these wavelet coefficients to obtain features in a lower dimension. The rPCA based features are used for constructing a self-organizing map (SOM). Finally, hierarchical clustering is applied on the SOM that separates anomalous signals residing in the smaller clusters and breaks them into logical units for contamination identification. The proposed methodology is tested using synthetic and real world EMG signals. The synthetic EMG signals are generated using a heteroscedastic process mimicking desired experimental setups. A sub-part of these synthetic signals is introduced with anomalies. These results are followed with real EMG signals introduced with synthetic anomalies. Finally, a heterogeneous real world data set is used with known quality issues under an unsupervised setting. The framework provides recall of 90% (± 3.3) and precision of 99%(±0.4).
NASA Astrophysics Data System (ADS)
Sturtz, Timothy M.; Adar, Sara D.; Gould, Timothy; Larson, Timothy V.
2014-02-01
PM10-2.5 mass and trace element concentrations were measured in Winston-Salem, Chicago, and St. Paul at up to 60 sites per city during two different seasons in 2010. Positive Matrix Factorization (PMF) was used to explore the underlying sources of variability. Information on previously reported PM10-2.5 tire and brake wear profiles was used to constrain these features in PMF by prior specification of selected species ratios. We also modified PMF to allow for combining the measurements from all three cities into a single model while preserving city-specific soil features. Relatively minor differences were observed between model predictions with and without the prior ratio constraints, increasing confidence in our ability to identify separate brake wear and tire wear features. Brake wear, tire wear, fertilized soil, and resuspended soil were found to be important sources of copper, zinc, phosphorus, and silicon, respectively, across all three urban areas.
Multi-task feature selection in microarray data by binary integer programming.
Lan, Liang; Vucetic, Slobodan
2013-12-20
A major challenge in microarray classification is that the number of features is typically orders of magnitude larger than the number of examples. In this paper, we propose a novel feature filter algorithm to select the feature subset with maximal discriminative power and minimal redundancy by solving a quadratic objective function with binary integer constraints. To improve the computational efficiency, the binary integer constraints are relaxed and a low-rank approximation to the quadratic term is applied. The proposed feature selection algorithm was extended to solve multi-task microarray classification problems. We compared the single-task version of the proposed feature selection algorithm with 9 existing feature selection methods on 4 benchmark microarray data sets. The empirical results show that the proposed method achieved the most accurate predictions overall. We also evaluated the multi-task version of the proposed algorithm on 8 multi-task microarray datasets. The multi-task feature selection algorithm resulted in significantly higher accuracy than when using the single-task feature selection methods.
Dzialak, Matthew R.; Olson, Chad V.; Harju, Seth M.; Webb, Stephen L.; Mudd, James P.; Winstead, Jeffrey B.; Hayden-Wing, L.D.
2011-01-01
Background Balancing animal conservation and human use of the landscape is an ongoing scientific and practical challenge throughout the world. We investigated reproductive success in female greater sage-grouse (Centrocercus urophasianus) relative to seasonal patterns of resource selection, with the larger goal of developing a spatially-explicit framework for managing human activity and sage-grouse conservation at the landscape level. Methodology/Principal Findings We integrated field-observation, Global Positioning Systems telemetry, and statistical modeling to quantify the spatial pattern of occurrence and risk during nesting and brood-rearing. We linked occurrence and risk models to provide spatially-explicit indices of habitat-performance relationships. As part of the analysis, we offer novel biological information on resource selection during egg-laying, incubation, and night. The spatial pattern of occurrence during all reproductive phases was driven largely by selection or avoidance of terrain features and vegetation, with little variation explained by anthropogenic features. Specifically, sage-grouse consistently avoided rough terrain, selected for moderate shrub cover at the patch level (within 90 m2), and selected for mesic habitat in mid and late brood-rearing phases. In contrast, risk of nest and brood failure was structured by proximity to anthropogenic features including natural gas wells and human-created mesic areas, as well as vegetation features such as shrub cover. Conclusions/Significance Risk in this and perhaps other human-modified landscapes is a top-down (i.e., human-mediated) process that would most effectively be minimized by developing a better understanding of specific mechanisms (e.g., predator subsidization) driving observed patterns, and using habitat-performance indices such as those developed herein for spatially-explicit guidance of conservation intervention. Working under the hypothesis that industrial activity structures risk by enhancing predator abundance or effectiveness, we offer specific recommendations for maintaining high-performance habitat and reducing low-performance habitat, particularly relative to the nesting phase, by managing key high-risk anthropogenic features such as industrial infrastructure and water developments. PMID:22022587
Tyagi, Neelam; Sutton, Elizabeth; Hunt, Margie; Zhang, Jing; Oh, Jung Hun; Apte, Aditya; Mechalakos, James; Wilgucki, Molly; Gelb, Emily; Mehrara, Babak; Matros, Evan; Ho, Alice
2017-02-01
Capsular contracture (CC) is a serious complication in patients receiving implant-based reconstruction for breast cancer. Currently, no objective methods are available for assessing CC. The goal of the present study was to identify image-based surrogates of CC using magnetic resonance imaging (MRI). We analyzed a retrospective data set of 50 patients who had undergone both a diagnostic MRI scan and a plastic surgeon's evaluation of the CC score (Baker's score) within a 6-month period after mastectomy and reconstructive surgery. The MRI scans were assessed for morphologic shape features of the implant and histogram features of the pectoralis muscle. The shape features, such as roundness, eccentricity, solidity, extent, and ratio length for the implant, were compared with the Baker score. For the pectoralis muscle, the muscle width and median, skewness, and kurtosis of the intensity were compared with the Baker score. Univariate analysis (UVA) using a Wilcoxon rank-sum test and multivariate analysis with the least absolute shrinkage and selection operator logistic regression was performed to determine significant differences in these features between the patient groups categorized according to their Baker's scores. UVA showed statistically significant differences between grade 1 and grade ≥2 for morphologic shape features and histogram features, except for volume and skewness. Only eccentricity, ratio length, and volume were borderline significant in differentiating grade ≤2 and grade ≥3. Features with P<.1 on UVA were used in the multivariate least absolute shrinkage and selection operator logistic regression analysis. Multivariate analysis showed a good level of predictive power for grade 1 versus grade ≥2 CC (area under the receiver operating characteristic curve 0.78, sensitivity 0.78, and specificity 0.82) and for grade ≤2 versus grade ≥3 CC (area under the receiver operating characteristic curve 0.75, sensitivity 0.75, and specificity 0.79). The morphologic shape features described on MR images were associated with the severity of CC. MRI has the potential to further improve the diagnostic ability of the Baker score in breast cancer patients who undergo implant reconstruction. Copyright © 2016 Elsevier Inc. All rights reserved.
Andersen, Søren K; Müller, Matthias M; Hillyard, Steven A
2015-07-08
Experiments that study feature-based attention have often examined situations in which selection is based on a single feature (e.g., the color red). However, in more complex situations relevant stimuli may not be set apart from other stimuli by a single defining property but by a specific combination of features. Here, we examined sustained attentional selection of stimuli defined by conjunctions of color and orientation. Human observers attended to one out of four concurrently presented superimposed fields of randomly moving horizontal or vertical bars of red or blue color to detect brief intervals of coherent motion. Selective stimulus processing in early visual cortex was assessed by recordings of steady-state visual evoked potentials (SSVEPs) elicited by each of the flickering fields of stimuli. We directly contrasted attentional selection of single features and feature conjunctions and found that SSVEP amplitudes on conditions in which selection was based on a single feature only (color or orientation) exactly predicted the magnitude of attentional enhancement of SSVEPs when attending to a conjunction of both features. Furthermore, enhanced SSVEP amplitudes elicited by attended stimuli were accompanied by equivalent reductions of SSVEP amplitudes elicited by unattended stimuli in all cases. We conclude that attentional selection of a feature-conjunction stimulus is accomplished by the parallel and independent facilitation of its constituent feature dimensions in early visual cortex. The ability to perceive the world is limited by the brain's processing capacity. Attention affords adaptive behavior by selectively prioritizing processing of relevant stimuli based on their features (location, color, orientation, etc.). We found that attentional mechanisms for selection of different features belonging to the same object operate independently and in parallel: concurrent attentional selection of two stimulus features is simply the sum of attending to each of those features separately. This result is key to understanding attentional selection in complex (natural) scenes, where relevant stimuli are likely to be defined by a combination of stimulus features. Copyright © 2015 the authors 0270-6474/15/359912-08$15.00/0.
AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.
Sun, Lei; Wang, Jun; Wei, Jinmao
2017-03-14
The Receiver Operator Characteristic (ROC) curve is well-known in evaluating classification performance in biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find out disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find real target feature subset due to their lack of effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by a trick of measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can get the minimal balanced error rate with a small amount of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
Deconstructing continuous flash suppression
Yang, Eunice; Blake, Randolph
2012-01-01
In this paper, we asked to what extent the depth of interocular suppression engendered by continuous flash suppression (CFS) varies depending on spatiotemporal properties of the suppressed stimulus and CFS suppressor. An answer to this question could have implications for interpreting the results in which CFS influences the processing of different categories of stimuli to different extents. In a series of experiments, we measured the selectivity and depth of suppression (i.e., elevation in contrast detection thresholds) as a function of the visual features of the stimulus being suppressed and the stimulus evoking suppression, namely, the popular “Mondrian” CFS stimulus (N. Tsuchiya & C. Koch, 2005). First, we found that CFS differentially suppresses the spatial components of the suppressed stimulus: Observers' sensitivity for stimuli of relatively low spatial frequency or cardinally oriented features was more strongly impaired in comparison to high spatial frequency or obliquely oriented stimuli. Second, we discovered that this feature-selective bias primarily arises from the spatiotemporal structure of the CFS stimulus, particularly within information residing in the low spatial frequency range and within the smooth rather than abrupt luminance changes over time. These results imply that this CFS stimulus operates by selectively attenuating certain classes of low-level signals while leaving others to be potentially encoded during suppression. These findings underscore the importance of considering the contribution of low-level features in stimulus-driven effects that are reported under CFS. PMID:22408039
Deconstructing continuous flash suppression.
Yang, Eunice; Blake, Randolph
2012-03-08
In this paper, we asked to what extent the depth of interocular suppression engendered by continuous flash suppression (CFS) varies depending on spatiotemporal properties of the suppressed stimulus and CFS suppressor. An answer to this question could have implications for interpreting the results in which CFS influences the processing of different categories of stimuli to different extents. In a series of experiments, we measured the selectivity and depth of suppression (i.e., elevation in contrast detection thresholds) as a function of the visual features of the stimulus being suppressed and the stimulus evoking suppression, namely, the popular "Mondrian" CFS stimulus (N. Tsuchiya & C. Koch, 2005). First, we found that CFS differentially suppresses the spatial components of the suppressed stimulus: Observers' sensitivity for stimuli of relatively low spatial frequency or cardinally oriented features was more strongly impaired in comparison to high spatial frequency or obliquely oriented stimuli. Second, we discovered that this feature-selective bias primarily arises from the spatiotemporal structure of the CFS stimulus, particularly within information residing in the low spatial frequency range and within the smooth rather than abrupt luminance changes over time. These results imply that this CFS stimulus operates by selectively attenuating certain classes of low-level signals while leaving others to be potentially encoded during suppression. These findings underscore the importance of considering the contribution of low-level features in stimulus-driven effects that are reported under CFS.
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to investigation on protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, based on various features, it is still a great challenge to select proper classification algorithm and extract essential features to participate in classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. The amino acid compositions and physiochemical features were adopted to represent features and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features in the list one by one. For each feature set, thirtyeight algorithms were executed on a dataset, in which proteins were represented by features in the set. The predicted classes yielded by these algorithms and true class of each protein were collected to construct a dataset, which were analyzed by mRMR method, yielding an algorithm list. From the algorithm list, the algorithm was taken one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using single algorithm and other models that only adopt feature selection procedure or algorithm selection procedure. The feature selection procedure or algorithm selection procedure are really helpful for building an ensemble prediction model that can yield a better performance. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Zhang, Chenwang; Gao, Liuze; Xu, Eugene Yujun
2016-11-01
Spermatogenesis is one of the fundamental processes of sexual reproduction, present in almost all metazoan animals. Like many other reproductive traits, developmental features and traits of spermatogenesis are under strong selective pressure to change, both at morphological and underlying molecular levels. Yet evidence suggests that some fundamental features of spermatogenesis may be ancient and conserved among metazoan species. Identifying the underlying conserved molecular mechanisms could reveal core components of metazoan spermatogenic machinery and provide novel insight into causes of human infertility. Conserved RNA-binding proteins and their interacting RNA network emerge to be a common theme important for animal sperm development. We review research on the recent addition to the RNA family - Long non-coding RNA (lncRNA) and its roles in spermatogenesis in the context of the expanding RNA-protein network. Copyright © 2016 Elsevier Ltd. All rights reserved.
Non-negative matrix factorization in texture feature for classification of dementia with MRI data
NASA Astrophysics Data System (ADS)
Sarwinda, D.; Bustamam, A.; Ardaneswari, G.
2017-07-01
This paper investigates applications of non-negative matrix factorization as feature selection method to select the features from gray level co-occurrence matrix. The proposed approach is used to classify dementia using MRI data. In this study, texture analysis using gray level co-occurrence matrix is done to feature extraction. In the feature extraction process of MRI data, we found seven features from gray level co-occurrence matrix. Non-negative matrix factorization selected three features that influence of all features produced by feature extractions. A Naïve Bayes classifier is adapted to classify dementia, i.e. Alzheimer's disease, Mild Cognitive Impairment (MCI) and normal control. The experimental results show that non-negative factorization as feature selection method able to achieve an accuracy of 96.4% for classification of Alzheimer's and normal control. The proposed method also compared with other features selection methods i.e. Principal Component Analysis (PCA).
NASA Astrophysics Data System (ADS)
Khehra, Baljit Singh; Pharwaha, Amar Partap Singh
2017-04-01
Ductal carcinoma in situ (DCIS) is one type of breast cancer. Clusters of microcalcifications (MCCs) are symptoms of DCIS that are recognized by mammography. Selection of robust features vector is the process of selecting an optimal subset of features from a large number of available features in a given problem domain after the feature extraction and before any classification scheme. Feature selection reduces the feature space that improves the performance of classifier and decreases the computational burden imposed by using many features on classifier. Selection of an optimal subset of features from a large number of available features in a given problem domain is a difficult search problem. For n features, the total numbers of possible subsets of features are 2n. Thus, selection of an optimal subset of features problem belongs to the category of NP-hard problems. In this paper, an attempt is made to find the optimal subset of MCCs features from all possible subsets of features using genetic algorithm (GA), particle swarm optimization (PSO) and biogeography-based optimization (BBO). For simulation, a total of 380 benign and malignant MCCs samples have been selected from mammogram images of DDSM database. A total of 50 features extracted from benign and malignant MCCs samples are used in this study. In these algorithms, fitness function is correct classification rate of classifier. Support vector machine is used as a classifier. From experimental results, it is also observed that the performance of PSO-based and BBO-based algorithms to select an optimal subset of features for classifying MCCs as benign or malignant is better as compared to GA-based algorithm.
Feature Selection for Classification of Polar Regions Using a Fuzzy Expert System
NASA Technical Reports Server (NTRS)
Penaloza, Mauel A.; Welch, Ronald M.
1996-01-01
Labeling, feature selection, and the choice of classifier are critical elements for classification of scenes and for image understanding. This study examines several methods for feature selection in polar regions, including the list, of a fuzzy logic-based expert system for further refinement of a set of selected features. Six Advanced Very High Resolution Radiometer (AVHRR) Local Area Coverage (LAC) arctic scenes are classified into nine classes: water, snow / ice, ice cloud, land, thin stratus, stratus over water, cumulus over water, textured snow over water, and snow-covered mountains. Sixty-seven spectral and textural features are computed and analyzed by the feature selection algorithms. The divergence, histogram analysis, and discriminant analysis approaches are intercompared for their effectiveness in feature selection. The fuzzy expert system method is used not only to determine the effectiveness of each approach in classifying polar scenes, but also to further reduce the features into a more optimal set. For each selection method,features are ranked from best to worst, and the best half of the features are selected. Then, rules using these selected features are defined. The results of running the fuzzy expert system with these rules show that the divergence method produces the best set features, not only does it produce the highest classification accuracy, but also it has the lowest computation requirements. A reduction of the set of features produced by the divergence method using the fuzzy expert system results in an overall classification accuracy of over 95 %. However, this increase of accuracy has a high computation cost.
Du, Feng; Abrams, Richard A
2012-09-01
To avoid sensory overload, people are able to selectively attend to a particular color or direction of motion while ignoring irrelevant stimuli that differ from the desired one. We show here for the first time that it is also possible to selectively attend to a specific line orientation-but with an important caveat: orientations that are perpendicular to the target orientation cannot be suppressed. This effect reflects properties of the neural mechanisms selective for orientation and reveals the extent to which contingent capture is constrained not only by one's top-down goals but also by feature preferences of visual neurons. Copyright © 2012 Elsevier B.V. All rights reserved.
Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L
2013-12-01
The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN=0.89 and SVM=0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN=SVM=0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Unbiased feature selection in learning random forests for high-dimensional data.
Nguyen, Thanh-Tung; Huang, Joshua Zhexue; Nguyen, Thuy Thi
2015-01-01
Random forests (RFs) have been widely used as a powerful classification method. However, with the randomization in both bagging samples and feature selection, the trees in the forest tend to select uninformative features for node splitting. This makes RFs have poor accuracy when working with high-dimensional data. Besides that, RFs have bias in the feature selection process where multivalued features are favored. Aiming at debiasing feature selection in RFs, we propose a new RF algorithm, called xRF, to select good features in learning RFs for high-dimensional data. We first remove the uninformative features using p-value assessment, and the subset of unbiased features is then selected based on some statistical measures. This feature subset is then partitioned into two subsets. A feature weighting sampling technique is used to sample features from these two subsets for building trees. This approach enables one to generate more accurate trees, while allowing one to reduce dimensionality and the amount of data needed for learning RFs. An extensive set of experiments has been conducted on 47 high-dimensional real-world datasets including image datasets. The experimental results have shown that RFs with the proposed approach outperformed the existing random forests in increasing the accuracy and the AUC measures.
Kuo, Pao-Jen; Wu, Shao-Chun; Chien, Peng-Chen; Rau, Cheng-Shyuan; Chen, Yi-Chun; Hsieh, Hsiao-Yun; Hsieh, Ching-Hua
2018-01-01
Objectives This study aimed to build and test the models of machine learning (ML) to predict the mortality of hospitalised motorcycle riders. Setting The study was conducted in a level-1 trauma centre in southern Taiwan. Participants Motorcycle riders who were hospitalised between January 2009 and December 2015 were classified into a training set (n=6306) and test set (n=946). Using the demographic information, injury characteristics and laboratory data of patients, logistic regression (LR), support vector machine (SVM) and decision tree (DT) analyses were performed to determine the mortality of individual motorcycle riders, under different conditions, using all samples or reduced samples, as well as all variables or selected features in the algorithm. Primary and secondary outcome measures The predictive performance of the model was evaluated based on accuracy, sensitivity, specificity and geometric mean, and an analysis of the area under the receiver operating characteristic curves of the two different models was carried out. Results In the training set, both LR and SVM had a significantly higher area under the receiver operating characteristic curve (AUC) than DT. No significant difference was observed in the AUC of LR and SVM, regardless of whether all samples or reduced samples and whether all variables or selected features were used. In the test set, the performance of the SVM model for all samples with selected features was better than that of all other models, with an accuracy of 98.73%, sensitivity of 86.96%, specificity of 99.02%, geometric mean of 92.79% and AUC of 0.9517, in mortality prediction. Conclusion ML can provide a feasible level of accuracy in predicting the mortality of motorcycle riders. Integration of the ML model, particularly the SVM algorithm in the trauma system, may help identify high-risk patients and, therefore, guide appropriate interventions by the clinical staff. PMID:29306885
Seminal quality prediction using data mining methods.
Sahoo, Anoop J; Kumar, Yugal
2014-01-01
Now-a-days, some new classes of diseases have come into existences which are known as lifestyle diseases. The main reasons behind these diseases are changes in the lifestyle of people such as alcohol drinking, smoking, food habits etc. After going through the various lifestyle diseases, it has been found that the fertility rates (sperm quantity) in men has considerably been decreasing in last two decades. Lifestyle factors as well as environmental factors are mainly responsible for the change in the semen quality. The objective of this paper is to identify the lifestyle and environmental features that affects the seminal quality and also fertility rate in man using data mining methods. The five artificial intelligence techniques such as Multilayer perceptron (MLP), Decision Tree (DT), Navie Bayes (Kernel), Support vector machine+Particle swarm optimization (SVM+PSO) and Support vector machine (SVM) have been applied on fertility dataset to evaluate the seminal quality and also to predict the person is either normal or having altered fertility rate. While the eight feature selection techniques such as support vector machine (SVM), neural network (NN), evolutionary logistic regression (LR), support vector machine plus particle swarm optimization (SVM+PSO), principle component analysis (PCA), chi-square test, correlation and T-test methods have been used to identify more relevant features which affect the seminal quality. These techniques are applied on fertility dataset which contains 100 instances with nine attribute with two classes. The experimental result shows that SVM+PSO provides higher accuracy and area under curve (AUC) rate (94% & 0.932) among multi-layer perceptron (MLP) (92% & 0.728), Support Vector Machines (91% & 0.758), Navie Bayes (Kernel) (89% & 0.850) and Decision Tree (89% & 0.735) for some of the seminal parameters. This paper also focuses on the feature selection process i.e. how to select the features which are more important for prediction of fertility rate. In this paper, eight feature selection methods are applied on fertility dataset to find out a set of good features. The investigational results shows that childish diseases (0.079) and high fever features (0.057) has less impact on fertility rate while age (0.8685), season (0.843), surgical intervention (0.7683), alcohol consumption (0.5992), smoking habit (0.575), number of hours spent on setting (0.4366) and accident (0.5973) features have more impact. It is also observed that feature selection methods increase the accuracy of above mentioned techniques (multilayer perceptron 92%, support vector machine 91%, SVM+PSO 94%, Navie Bayes (Kernel) 89% and decision tree 89%) as compared to without feature selection methods (multilayer perceptron 86%, support vector machine 86%, SVM+PSO 85%, Navie Bayes (Kernel) 83% and decision tree 84%) which shows the applicability of feature selection methods in prediction. This paper lightens the application of artificial techniques in medical domain. From this paper, it can be concluded that data mining methods can be used to predict a person with or without disease based on environmental and lifestyle parameters/features rather than undergoing various medical test. In this paper, five data mining techniques are used to predict the fertility rate and among which SVM+PSO provide more accurate results than support vector machine and decision tree.
NASA Astrophysics Data System (ADS)
Li, Yifan; Liang, Xihui; Lin, Jianhui; Chen, Yuejian; Liu, Jianxin
2018-02-01
This paper presents a novel signal processing scheme, feature selection based multi-scale morphological filter (MMF), for train axle bearing fault detection. In this scheme, more than 30 feature indicators of vibration signals are calculated for axle bearings with different conditions and the features which can reflect fault characteristics more effectively and representatively are selected using the max-relevance and min-redundancy principle. Then, a filtering scale selection approach for MMF based on feature selection and grey relational analysis is proposed. The feature selection based MMF method is tested on diagnosis of artificially created damages of rolling bearings of railway trains. Experimental results show that the proposed method has a superior performance in extracting fault features of defective train axle bearings. In addition, comparisons are performed with the kurtosis criterion based MMF and the spectral kurtosis criterion based MMF. The proposed feature selection based MMF method outperforms these two methods in detection of train axle bearing faults.
NASA Astrophysics Data System (ADS)
Ray, Shonket; Keller, Brad M.; Chen, Jinbo; Conant, Emily F.; Kontos, Despina
2016-03-01
This work details a methodology to obtain optimal parameter values for a locally-adaptive texture analysis algorithm that extracts mammographic texture features representative of breast parenchymal complexity for predicting falsepositive (FP) recalls from breast cancer screening with digital mammography. The algorithm has two components: (1) adaptive selection of localized regions of interest (ROIs) and (2) Haralick texture feature extraction via Gray- Level Co-Occurrence Matrices (GLCM). The following parameters were systematically varied: mammographic views used, upper limit of the ROI window size used for adaptive ROI selection, GLCM distance offsets, and gray levels (binning) used for feature extraction. Each iteration per parameter set had logistic regression with stepwise feature selection performed on a clinical screening cohort of 474 non-recalled women and 68 FP recalled women; FP recall prediction was evaluated using area under the curve (AUC) of the receiver operating characteristic (ROC) and associations between the extracted features and FP recall were assessed via odds ratios (OR). A default instance of mediolateral (MLO) view, upper ROI size limit of 143.36 mm (2048 pixels2), GLCM distance offset combination range of 0.07 to 0.84 mm (1 to 12 pixels) and 16 GLCM gray levels was set. The highest ROC performance value of AUC=0.77 [95% confidence intervals: 0.71-0.83] was obtained at three specific instances: the default instance, upper ROI window equal to 17.92 mm (256 pixels2), and gray levels set to 128. The texture feature of sum average was chosen as a statistically significant (p<0.05) predictor and associated with higher odds of FP recall for 12 out of 14 total instances.
Friberg, Anders; Schoonderwaldt, Erwin; Hedblad, Anton; Fabiani, Marco; Elowsson, Anders
2014-10-01
The notion of perceptual features is introduced for describing general music properties based on human perception. This is an attempt at rethinking the concept of features, aiming to approach the underlying human perception mechanisms. Instead of using concepts from music theory such as tones, pitches, and chords, a set of nine features describing overall properties of the music was selected. They were chosen from qualitative measures used in psychology studies and motivated from an ecological approach. The perceptual features were rated in two listening experiments using two different data sets. They were modeled both from symbolic and audio data using different sets of computational features. Ratings of emotional expression were predicted using the perceptual features. The results indicate that (1) at least some of the perceptual features are reliable estimates; (2) emotion ratings could be predicted by a small combination of perceptual features with an explained variance from 75% to 93% for the emotional dimensions activity and valence; (3) the perceptual features could only to a limited extent be modeled using existing audio features. Results clearly indicated that a small number of dedicated features were superior to a "brute force" model using a large number of general audio features.
Engineering nanoscale surface features to sustain microparticle rolling in flow.
Kalasin, Surachate; Santore, Maria M
2015-05-26
Nanoscopic features of channel walls are often engineered to facilitate microfluidic transport, for instance when surface charge enables electro-osmosis or when grooves drive mixing. The dynamic or rolling adhesion of flowing microparticles on a channel wall holds potential to accomplish particle sorting or to selectively transfer reactive species or signals between the wall and flowing particles. Inspired by cell rolling under the direction of adhesion molecules called selectins, we present an engineered platform in which the rolling of flowing microparticles is sustained through the incorporation of entirely synthetic, discrete, nanoscale, attractive features into the nonadhesive (electrostatically repulsive) surface of a flow channel. Focusing on one example or type of nanoscale feature and probing the impact of broad systematic variations in surface feature loading and processing parameters, this study demonstrates how relatively flat, weakly adhesive nanoscale features, positioned with average spacings on the order of tens of nanometers, can produce sustained microparticle rolling. We further demonstrate how the rolling velocity and travel distance depend on flow and surface design. We identify classes of related surfaces that fail to support rolling and present a state space that identifies combinations of surface and processing variables corresponding to transitions between rolling, free particle motion, and arrest. Finally we identify combinations of parameters (surface length scales, particle size, flow rates) where particles can be manipulated with size-selectivity.
Target oriented dimensionality reduction of hyperspectral data by Kernel Fukunaga-Koontz Transform
NASA Astrophysics Data System (ADS)
Binol, Hamidullah; Ochilov, Shuhrat; Alam, Mohammad S.; Bal, Abdullah
2017-02-01
Principal component analysis (PCA) is a popular technique in remote sensing for dimensionality reduction. While PCA is suitable for data compression, it is not necessarily an optimal technique for feature extraction, particularly when the features are exploited in supervised learning applications (Cheriyadat and Bruce, 2003) [1]. Preserving features belonging to the target is very crucial to the performance of target detection/recognition techniques. Fukunaga-Koontz Transform (FKT) based supervised band reduction technique can be used to provide this requirement. FKT achieves feature selection by transforming into a new space in where feature classes have complimentary eigenvectors. Analysis of these eigenvectors under two classes, target and background clutter, can be utilized for target oriented band reduction since each basis functions best represent target class while carrying least information of the background class. By selecting few eigenvectors which are the most relevant to the target class, dimension of hyperspectral data can be reduced and thus, it presents significant advantages for near real time target detection applications. The nonlinear properties of the data can be extracted by kernel approach which provides better target features. Thus, we propose constructing kernel FKT (KFKT) to present target oriented band reduction. The performance of the proposed KFKT based target oriented dimensionality reduction algorithm has been tested employing two real-world hyperspectral data and results have been reported consequently.
Audio-visual synchrony and feature-selective attention co-amplify early visual processing.
Keitel, Christian; Müller, Matthias M
2016-05-01
Our brain relies on neural mechanisms of selective attention and converging sensory processing to efficiently cope with rich and unceasing multisensory inputs. One prominent assumption holds that audio-visual synchrony can act as a strong attractor for spatial attention. Here, we tested for a similar effect of audio-visual synchrony on feature-selective attention. We presented two superimposed Gabor patches that differed in colour and orientation. On each trial, participants were cued to selectively attend to one of the two patches. Over time, spatial frequencies of both patches varied sinusoidally at distinct rates (3.14 and 3.63 Hz), giving rise to pulse-like percepts. A simultaneously presented pure tone carried a frequency modulation at the pulse rate of one of the two visual stimuli to introduce audio-visual synchrony. Pulsed stimulation elicited distinct time-locked oscillatory electrophysiological brain responses. These steady-state responses were quantified in the spectral domain to examine individual stimulus processing under conditions of synchronous versus asynchronous tone presentation and when respective stimuli were attended versus unattended. We found that both, attending to the colour of a stimulus and its synchrony with the tone, enhanced its processing. Moreover, both gain effects combined linearly for attended in-sync stimuli. Our results suggest that audio-visual synchrony can attract attention to specific stimulus features when stimuli overlap in space.
Chen, Qiang; Chen, Yunhao; Jiang, Weiguo
2016-07-30
In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm.
Smart, Otis; Burrell, Lauren
2014-01-01
Pattern classification for intracranial electroencephalogram (iEEG) and functional magnetic resonance imaging (fMRI) signals has furthered epilepsy research toward understanding the origin of epileptic seizures and localizing dysfunctional brain tissue for treatment. Prior research has demonstrated that implicitly selecting features with a genetic programming (GP) algorithm more effectively determined the proper features to discern biomarker and non-biomarker interictal iEEG and fMRI activity than conventional feature selection approaches. However for each the iEEG and fMRI modalities, it is still uncertain whether the stochastic properties of indirect feature selection with a GP yield (a) consistent results within a patient data set and (b) features that are specific or universal across multiple patient data sets. We examined the reproducibility of implicitly selecting features to classify interictal activity using a GP algorithm by performing several selection trials and subsequent frequent itemset mining (FIM) for separate iEEG and fMRI epilepsy patient data. We observed within-subject consistency and across-subject variability with some small similarity for selected features, indicating a clear need for patient-specific features and possible need for patient-specific feature selection or/and classification. For the fMRI, using nearest-neighbor classification and 30 GP generations, we obtained over 60% median sensitivity and over 60% median selectivity. For the iEEG, using nearest-neighbor classification and 30 GP generations, we obtained over 65% median sensitivity and over 65% median selectivity except one patient. PMID:25580059
Hash Bit Selection for Nearest Neighbor Search.
Xianglong Liu; Junfeng He; Shih-Fu Chang
2017-11-01
To overcome the barrier of storage and computation when dealing with gigantic-scale data sets, compact hashing has been studied extensively to approximate the nearest neighbor search. Despite the recent advances, critical design issues remain open in how to select the right features, hashing algorithms, and/or parameter settings. In this paper, we address these by posing an optimal hash bit selection problem, in which an optimal subset of hash bits are selected from a pool of candidate bits generated by different features, algorithms, or parameters. Inspired by the optimization criteria used in existing hashing algorithms, we adopt the bit reliability and their complementarity as the selection criteria that can be carefully tailored for hashing performance in different tasks. Then, the bit selection solution is discovered by finding the best tradeoff between search accuracy and time using a modified dynamic programming method. To further reduce the computational complexity, we employ the pairwise relationship among hash bits to approximate the high-order independence property, and formulate it as an efficient quadratic programming method that is theoretically equivalent to the normalized dominant set problem in a vertex- and edge-weighted graph. Extensive large-scale experiments have been conducted under several important application scenarios of hash techniques, where our bit selection framework can achieve superior performance over both the naive selection methods and the state-of-the-art hashing algorithms, with significant accuracy gains ranging from 10% to 50%, relatively.
The long-term evolution of multilocus traits under frequency-dependent disruptive selection.
van Doorn, G Sander; Dieckmann, Ulf
2006-11-01
Frequency-dependent disruptive selection is widely recognized as an important source of genetic variation. Its evolutionary consequences have been extensively studied using phenotypic evolutionary models, based on quantitative genetics, game theory, or adaptive dynamics. However, the genetic assumptions underlying these approaches are highly idealized and, even worse, predict different consequences of frequency-dependent disruptive selection. Population genetic models, by contrast, enable genotypic evolutionary models, but traditionally assume constant fitness values. Only a minority of these models thus addresses frequency-dependent selection, and only a few of these do so in a multilocus context. An inherent limitation of these remaining studies is that they only investigate the short-term maintenance of genetic variation. Consequently, the long-term evolution of multilocus characters under frequency-dependent disruptive selection remains poorly understood. We aim to bridge this gap between phenotypic and genotypic models by studying a multilocus version of Levene's soft-selection model. Individual-based simulations and deterministic approximations based on adaptive dynamics theory provide insights into the underlying evolutionary dynamics. Our analysis uncovers a general pattern of polymorphism formation and collapse, likely to apply to a wide variety of genetic systems: after convergence to a fitness minimum and the subsequent establishment of genetic polymorphism at multiple loci, genetic variation becomes increasingly concentrated on a few loci, until eventually only a single polymorphic locus remains. This evolutionary process combines features observed in quantitative genetics and adaptive dynamics models, and it can be explained as a consequence of changes in the selection regime that are inherent to frequency-dependent disruptive selection. Our findings demonstrate that the potential of frequency-dependent disruptive selection to maintain polygenic variation is considerably smaller than previously expected.
Method of generating features optimal to a dataset and classifier
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bruillard, Paul J.; Gosink, Luke J.; Jarman, Kenneth D.
A method of generating features optimal to a particular dataset and classifier is disclosed. A dataset of messages is inputted and a classifier is selected. An algebra of features is encoded. Computable features that are capable of describing the dataset from the algebra of features are selected. Irredundant features that are optimal for the classifier and the dataset are selected.
Gomez-Ramirez, Manuel; Trzcinski, Natalie K.; Mihalas, Stefan; Niebur, Ernst
2014-01-01
Studies in vision show that attention enhances the firing rates of cells when it is directed towards their preferred stimulus feature. However, it is unknown whether other sensory systems employ this mechanism to mediate feature selection within their modalities. Moreover, whether feature-based attention modulates the correlated activity of a population is unclear. Indeed, temporal correlation codes such as spike-synchrony and spike-count correlations (rsc) are believed to play a role in stimulus selection by increasing the signal and reducing the noise in a population, respectively. Here, we investigate (1) whether feature-based attention biases the correlated activity between neurons when attention is directed towards their common preferred feature, (2) the interplay between spike-synchrony and rsc during feature selection, and (3) whether feature attention effects are common across the visual and tactile systems. Single-unit recordings were made in secondary somatosensory cortex of three non-human primates while animals engaged in tactile feature (orientation and frequency) and visual discrimination tasks. We found that both firing rate and spike-synchrony between neurons with similar feature selectivity were enhanced when attention was directed towards their preferred feature. However, attention effects on spike-synchrony were twice as large as those on firing rate, and had a tighter relationship with behavioral performance. Further, we observed increased rsc when attention was directed towards the visual modality (i.e., away from touch). These data suggest that similar feature selection mechanisms are employed in vision and touch, and that temporal correlation codes such as spike-synchrony play a role in mediating feature selection. We posit that feature-based selection operates by implementing multiple mechanisms that reduce the overall noise levels in the neural population and synchronize activity across subpopulations that encode the relevant features of sensory stimuli. PMID:25423284
Predictive features of breast cancer on Mexican screening mammography patients
NASA Astrophysics Data System (ADS)
Rodriguez-Rojas, Juan; Garza-Montemayor, Margarita; Trevino-Alvarado, Victor; Tamez-Pena, José Gerardo
2013-02-01
Breast cancer is the most common type of cancer worldwide. In response, breast cancer screening programs are becoming common around the world and public programs now serve millions of women worldwide. These programs are expensive, requiring many specialized radiologists to examine all images. Nevertheless, there is a lack of trained radiologists in many countries as in Mexico, which is a barrier towards decreasing breast cancer mortality, pointing at the need of a triaging system that prioritizes high risk cases for prompt interpretation. Therefore we explored in an image database of Mexican patients whether high risk cases can be distinguished using image features. We collected a set of 200 digital screening mammography cases from a hospital in Mexico, and assigned low or high risk labels according to its BIRADS score. Breast tissue segmentation was performed using an automatic procedure. Image features were obtained considering only the segmented region on each view and comparing the bilateral di erences of the obtained features. Predictive combinations of features were chosen using a genetic algorithms based feature selection procedure. The best model found was able to classify low-risk and high-risk cases with an area under the ROC curve of 0.88 on a 150-fold cross-validation test. The features selected were associated to the differences of signal distribution and tissue shape on bilateral views. The model found can be used to automatically identify high risk cases and trigger the necessary measures to provide prompt treatment.
Padilla-Buritica, Jorge I.; Martinez-Vargas, Juan D.; Castellanos-Dominguez, German
2016-01-01
Lately, research on computational models of emotion had been getting much attention due to their potential for understanding the mechanisms of emotions and their promising broad range of applications that potentially bridge the gap between human and machine interactions. We propose a new method for emotion classification that relies on features extracted from those active brain areas that are most likely related to emotions. To this end, we carry out the selection of spatially compact regions of interest that are computed using the brain neural activity reconstructed from Electroencephalography data. Throughout this study, we consider three representative feature extraction methods widely applied to emotion detection tasks, including Power spectral density, Wavelet, and Hjorth parameters. Further feature selection is carried out using principal component analysis. For validation purpose, these features are used to feed a support vector machine classifier that is trained under the leave-one-out cross-validation strategy. Obtained results on real affective data show that incorporation of the proposed training method in combination with the enhanced spatial resolution provided by the source estimation allows improving the performed accuracy of discrimination in most of the considered emotions, namely: dominance, valence, and liking. PMID:27489541
Feature Selection Methods for Zero-Shot Learning of Neural Activity.
Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
Ultrasound based computer-aided-diagnosis of kidneys for pediatric hydronephrosis
NASA Astrophysics Data System (ADS)
Cerrolaza, Juan J.; Peters, Craig A.; Martin, Aaron D.; Myers, Emmarie; Safdar, Nabile; Linguraru, Marius G.
2014-03-01
Ultrasound is the mainstay of imaging for pediatric hydronephrosis, though its potential as diagnostic tool is limited by its subjective assessment, and lack of correlation with renal function. Therefore, all cases showing signs of hydronephrosis undergo further invasive studies, like diuretic renogram, in order to assess the actual renal function. Under the hypothesis that renal morphology is correlated with renal function, a new ultrasound based computer-aided diagnosis (CAD) tool for pediatric hydronephrosis is presented. From 2D ultrasound, a novel set of morphological features of the renal collecting systems and the parenchyma, is automatically extracted using image analysis techniques. From the original set of features, including size, geometric and curvature descriptors, a subset of ten features are selected as predictive variables, combining a feature selection technique and area under the curve filtering. Using the washout half time (T1/2) as indicative of renal obstruction, two groups are defined. Those cases whose T1/2 is above 30 minutes are considered to be severe, while the rest would be in the safety zone, where diuretic renography could be avoided. Two different classification techniques are evaluated (logistic regression, and support vector machines). Adjusting the probability decision thresholds to operate at the point of maximum sensitivity, i.e., preventing any severe case be misclassified, specificities of 53%, and 75% are achieved, for the logistic regression and the support vector machine classifier, respectively. The proposed CAD system allows to establish a link between non-invasive non-ionizing imaging techniques and renal function, limiting the need for invasive and ionizing diuretic renography.
Correlative feature analysis of FFDM images
NASA Astrophysics Data System (ADS)
Yuan, Yading; Giger, Maryellen L.; Li, Hui; Sennett, Charlene
2008-03-01
Identifying the corresponding image pair of a lesion is an essential step for combining information from different views of the lesion to improve the diagnostic ability of both radiologists and CAD systems. Because of the non-rigidity of the breasts and the 2D projective property of mammograms, this task is not trivial. In this study, we present a computerized framework that differentiates the corresponding images from different views of a lesion from non-corresponding ones. A dual-stage segmentation method, which employs an initial radial gradient index(RGI) based segmentation and an active contour model, was initially applied to extract mass lesions from the surrounding tissues. Then various lesion features were automatically extracted from each of the two views of each lesion to quantify the characteristics of margin, shape, size, texture and context of the lesion, as well as its distance to nipple. We employed a two-step method to select an effective subset of features, and combined it with a BANN to obtain a discriminant score, which yielded an estimate of the probability that the two images are of the same physical lesion. ROC analysis was used to evaluate the performance of the individual features and the selected feature subset in the task of distinguishing between corresponding and non-corresponding pairs. By using a FFDM database with 124 corresponding image pairs and 35 non-corresponding pairs, the distance feature yielded an AUC (area under the ROC curve) of 0.8 with leave-one-out evaluation by lesion, and the feature subset, which includes distance feature, lesion size and lesion contrast, yielded an AUC of 0.86. The improvement by using multiple features was statistically significant as compared to single feature performance. (p<0.001)
StochKit2: software for discrete stochastic simulation of biochemical systems with events.
Sanft, Kevin R; Wu, Sheng; Roh, Min; Fu, Jin; Lim, Rone Kwei; Petzold, Linda R
2011-09-01
StochKit2 is the first major upgrade of the popular StochKit stochastic simulation software package. StochKit2 provides highly efficient implementations of several variants of Gillespie's stochastic simulation algorithm (SSA), and tau-leaping with automatic step size selection. StochKit2 features include automatic selection of the optimal SSA method based on model properties, event handling, and automatic parallelism on multicore architectures. The underlying structure of the code has been completely updated to provide a flexible framework for extending its functionality. StochKit2 runs on Linux/Unix, Mac OS X and Windows. It is freely available under GPL version 3 and can be downloaded from http://sourceforge.net/projects/stochkit/. petzold@engineering.ucsb.edu.
Feature Migration in Time: Reflection of Selective Attention on Speech Errors
ERIC Educational Resources Information Center
Nozari, Nazbanou; Dell, Gary S.
2012-01-01
This article describes an initial study of the effect of focused attention on phonological speech errors. In 3 experiments, participants recited 4-word tongue twisters and focused attention on 1 (or none) of the words. The attended word was singled out differently in each experiment; participants were under instructions to avoid errors on the…
Liu, Shengyu; Tang, Buzhou; Chen, Qingcai; Wang, Xiaolong; Fan, Xiaoming
2015-01-01
Drug name recognition (DNR) is a critical step for drug information extraction. Machine learning-based methods have been widely used for DNR with various types of features such as part-of-speech, word shape, and dictionary feature. Features used in current machine learning-based methods are usually singleton features which may be due to explosive features and a large number of noisy features when singleton features are combined into conjunction features. However, singleton features that can only capture one linguistic characteristic of a word are not sufficient to describe the information for DNR when multiple characteristics should be considered. In this study, we explore feature conjunction and feature selection for DNR, which have never been reported. We intuitively select 8 types of singleton features and combine them into conjunction features in two ways. Then, Chi-square, mutual information, and information gain are used to mine effective features. Experimental results show that feature conjunction and feature selection can improve the performance of the DNR system with a moderate number of features and our DNR system significantly outperforms the best system in the DDIExtraction 2013 challenge.
Effect of feature-selective attention on neuronal responses in macaque area MT
Chen, X.; Hoffmann, K.-P.; Albright, T. D.
2012-01-01
Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color). PMID:22170961
Effect of feature-selective attention on neuronal responses in macaque area MT.
Chen, X; Hoffmann, K-P; Albright, T D; Thiele, A
2012-03-01
Attention influences visual processing in striate and extrastriate cortex, which has been extensively studied for spatial-, object-, and feature-based attention. Most studies exploring neural signatures of feature-based attention have trained animals to attend to an object identified by a certain feature and ignore objects/displays identified by a different feature. Little is known about the effects of feature-selective attention, where subjects attend to one stimulus feature domain (e.g., color) of an object while features from different domains (e.g., direction of motion) of the same object are ignored. To study this type of feature-selective attention in area MT in the middle temporal sulcus, we trained macaque monkeys to either attend to and report the direction of motion of a moving sine wave grating (a feature for which MT neurons display strong selectivity) or attend to and report its color (a feature for which MT neurons have very limited selectivity). We hypothesized that neurons would upregulate their firing rate during attend-direction conditions compared with attend-color conditions. We found that feature-selective attention significantly affected 22% of MT neurons. Contrary to our hypothesis, these neurons did not necessarily increase firing rate when animals attended to direction of motion but fell into one of two classes. In one class, attention to color increased the gain of stimulus-induced responses compared with attend-direction conditions. The other class displayed the opposite effects. Feature-selective activity modulations occurred earlier in neurons modulated by attention to color compared with neurons modulated by attention to motion direction. Thus feature-selective attention influences neuronal processing in macaque area MT but often exhibited a mismatch between the preferred stimulus dimension (direction of motion) and the preferred attention dimension (attention to color).
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification that suggests the utility of the embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
Association between mammogram density and background parenchymal enhancement of breast MRI
NASA Astrophysics Data System (ADS)
Aghaei, Faranak; Danala, Gopichandh; Wang, Yunzhi; Zarafshani, Ali; Qian, Wei; Liu, Hong; Zheng, Bin
2018-02-01
Breast density has been widely considered as an important risk factor for breast cancer. The purpose of this study is to examine the association between mammogram density results and background parenchymal enhancement (BPE) of breast MRI. A dataset involving breast MR images was acquired from 65 high-risk women. Based on mammography density (BIRADS) results, the dataset was divided into two groups of low and high breast density cases. The Low-Density group has 15 cases with mammographic density (BIRADS 1 and 2), while the High-density group includes 50 cases, which were rated by radiologists as mammographic density BIRADS 3 and 4. A computer-aided detection (CAD) scheme was applied to segment and register breast regions depicted on sequential images of breast MRI scans. CAD scheme computed 20 global BPE features from the entire two breast regions, separately from the left and right breast region, as well as from the bilateral difference between left and right breast regions. An image feature selection method namely, CFS method, was applied to remove the most redundant features and select optimal features from the initial feature pool. Then, a logistic regression classifier was built using the optimal features to predict the mammogram density from the BPE features. Using a leave-one-case-out validation method, the classifier yields the accuracy of 82% and area under ROC curve, AUC=0.81+/-0.09. Also, the box-plot based analysis shows a negative association between mammogram density results and BPE features in the MRI images. This study demonstrated a negative association between mammogram density and BPE of breast MRI images.
Jeyasingh, Suganthi; Veluchamy, Malathi
2017-05-01
Early diagnosis of breast cancer is essential to save lives of patients. Usually, medical datasets include a large variety of data that can lead to confusion during diagnosis. The Knowledge Discovery on Database (KDD) process helps to improve efficiency. It requires elimination of inappropriate and repeated data from the dataset before final diagnosis. This can be done using any of the feature selection algorithms available in data mining. Feature selection is considered as a vital step to increase the classification accuracy. This paper proposes a Modified Bat Algorithm (MBA) for feature selection to eliminate irrelevant features from an original dataset. The Bat algorithm was modified using simple random sampling to select the random instances from the dataset. Ranking was with the global best features to recognize the predominant features available in the dataset. The selected features are used to train a Random Forest (RF) classification algorithm. The MBA feature selection algorithm enhanced the classification accuracy of RF in identifying the occurrence of breast cancer. The Wisconsin Diagnosis Breast Cancer Dataset (WDBC) was used for estimating the performance analysis of the proposed MBA feature selection algorithm. The proposed algorithm achieved better performance in terms of Kappa statistic, Mathew’s Correlation Coefficient, Precision, F-measure, Recall, Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Relative Absolute Error (RAE) and Root Relative Squared Error (RRSE). Creative Commons Attribution License
Chen, Qiang; Chen, Yunhao; Jiang, Weiguo
2016-01-01
In the field of multiple features Object-Based Change Detection (OBCD) for very-high-resolution remotely sensed images, image objects have abundant features and feature selection affects the precision and efficiency of OBCD. Through object-based image analysis, this paper proposes a Genetic Particle Swarm Optimization (GPSO)-based feature selection algorithm to solve the optimization problem of feature selection in multiple features OBCD. We select the Ratio of Mean to Variance (RMV) as the fitness function of GPSO, and apply the proposed algorithm to the object-based hybrid multivariate alternative detection model. Two experiment cases on Worldview-2/3 images confirm that GPSO can significantly improve the speed of convergence, and effectively avoid the problem of premature convergence, relative to other feature selection algorithms. According to the accuracy evaluation of OBCD, GPSO is superior at overall accuracy (84.17% and 83.59%) and Kappa coefficient (0.6771 and 0.6314) than other algorithms. Moreover, the sensitivity analysis results show that the proposed algorithm is not easily influenced by the initial parameters, but the number of features to be selected and the size of the particle swarm would affect the algorithm. The comparison experiment results reveal that RMV is more suitable than other functions as the fitness function of GPSO-based feature selection algorithm. PMID:27483285
Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing
2018-06-01
Feature selection plays an important role in the field of EEG signals based motor imagery pattern classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages involved are: lowering the computational burden so as to speed up the learning procedure and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: a broad frequency band filtering and common spatial pattern enhancement as preprocessing, features extraction by autoregressive model and log-variance, the Kullback-Leibler divergence based optimal feature and time segment selection and linear discriminate analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signals classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segment, but also show that the proposed method yields relatively better classification results in comparison with other competitive methods. Copyright © 2018 Elsevier Ltd. All rights reserved.
Intraspecific differences in bacterial responses to modelled reduced gravity
NASA Technical Reports Server (NTRS)
Baker, P. W.; Leff, L. G.
2005-01-01
AIMS: Bacteria are important residents of water systems, including those of space stations which feature specific environmental conditions, such as lowered effects of gravity. The purpose of this study was to compare responses with modelled reduced gravity of space station, water system bacterial isolates with other isolates of the same species. METHODS AND RESULTS: Bacterial isolates, Stenotrophomonas paucimobilis and Acinetobacter radioresistens, originally recovered from the water supply aboard the International Space Station (ISS) were grown in nutrient broth under modelled reduced gravity. Their growth was compared with type strains S. paucimobilis ATCC 10829 and A. radioresistens ATCC 49000. Acinetobacter radioresistens ATCC 49000 and the two ISS isolates showed similar growth profiles under modelled reduced gravity compared with normal gravity, whereas S. paucimobilis ATCC 10829 was negatively affected by modelled reduced gravity. CONCLUSIONS: These results suggest that microgravity might have selected for bacteria that were able to thrive under this unusual condition. These responses, coupled with impacts of other features (such as radiation resistance and ability to persist under very oligotrophic conditions), may contribute to the success of these water system bacteria. SIGNIFICANCE AND IMPACT OF THE STUDY: Water quality is a significant factor in many environments including the ISS. Efforts to remove microbial contaminants are likely to be complicated by the features of these bacteria which allow them to persist under the extreme conditions of the systems.
Sentiment analysis of feature ranking methods for classification accuracy
NASA Astrophysics Data System (ADS)
Joseph, Shashank; Mugauri, Calvin; Sumathy, S.
2017-11-01
Text pre-processing and feature selection are important and critical steps in text mining. Text pre-processing of large volumes of datasets is a difficult task as unstructured raw data is converted into structured format. Traditional methods of processing and weighing took much time and were less accurate. To overcome this challenge, feature ranking techniques have been devised. A feature set from text preprocessing is fed as input for feature selection. Feature selection helps improve text classification accuracy. Of the three feature selection categories available, the filter category will be the focus. Five feature ranking methods namely: document frequency, standard deviation information gain, CHI-SQUARE, and weighted-log likelihood -ratio is analyzed.
Ortiz-Ramón, Rafael; Larroza, Andrés; Ruiz-España, Silvia; Arana, Estanislao; Moratal, David
2018-05-14
To examine the capability of MRI texture analysis to differentiate the primary site of origin of brain metastases following a radiomics approach. Sixty-seven untreated brain metastases (BM) were found in 3D T1-weighted MRI of 38 patients with cancer: 27 from lung cancer, 23 from melanoma and 17 from breast cancer. These lesions were segmented in 2D and 3D to compare the discriminative power of 2D and 3D texture features. The images were quantized using different number of gray-levels to test the influence of quantization. Forty-three rotation-invariant texture features were examined. Feature selection and random forest classification were implemented within a nested cross-validation structure. Classification was evaluated with the area under receiver operating characteristic curve (AUC) considering two strategies: multiclass and one-versus-one. In the multiclass approach, 3D texture features were more discriminative than 2D features. The best results were achieved for images quantized with 32 gray-levels (AUC = 0.873 ± 0.064) using the top four features provided by the feature selection method based on the p-value. In the one-versus-one approach, high accuracy was obtained when differentiating lung cancer BM from breast cancer BM (four features, AUC = 0.963 ± 0.054) and melanoma BM (eight features, AUC = 0.936 ± 0.070) using the optimal dataset (3D features, 32 gray-levels). Classification of breast cancer and melanoma BM was unsatisfactory (AUC = 0.607 ± 0.180). Volumetric MRI texture features can be useful to differentiate brain metastases from different primary cancers after quantizing the images with the proper number of gray-levels. • Texture analysis is a promising source of biomarkers for classifying brain neoplasms. • MRI texture features of brain metastases could help identifying the primary cancer. • Volumetric texture features are more discriminative than traditional 2D texture features.
Bertin, Angeline; Gouin, Nicolas; Baumel, Alex; Gianoli, Ernesto; Serratosa, Juan; Osorio, Rodomiro; Manel, Stephanie
2017-01-01
Positive species-genetic diversity correlations (SGDCs) are often thought to result from the parallel influence of neutral processes on genetic and species diversity. Yet, confounding effects of non-neutral mechanisms have not been explored. Here, we investigate the impact of non-neutral genetic diversity on SGDCs in high Andean wetlands. We compare correlations between plant species diversity and genetic diversity (GD) calculated with and without loci potentially under selection (outlier loci). The study system includes 2188 specimens from five species (three common aquatic macroinvertebrate and two dominant plant species) that were genotyped for 396 amplified fragment length polymorphism loci. We also appraise the importance of neutral processes on SGDCs by investigating the influence of habitat fragmentation features. Significant positive SGDCs were detected for all five species (mean SGDC = 0.52 ± 0.05). While only a few outlier loci were detected in each species, they resulted in significant decreases in GD and in SGDCs. This supports the hypothesis that neutral processes drive species-genetic diversity relationships in high Andean wetlands. Unexpectedly, the effects on genetic diversity GD of the habitat fragmentation characteristics in this study increased with the presence of outlier loci in two species. Overall, our results reveal pitfalls in using habitat features to infer processes driving SGDCs and show that a few loci potentially under selection are enough to cause a significant downward bias in SGDC. Investigating confounding effects of outlier loci thus represents a useful approach to evidence the contribution of neutral processes on species-genetic diversity relationships. © 2016 John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Diamant, Idit; Shalhon, Moran; Goldberger, Jacob; Greenspan, Hayit
2016-03-01
Classification of clustered breast microcalcifications into benign and malignant categories is an extremely challenging task for computerized algorithms and expert radiologists alike. In this paper we present a novel method for feature selection based on mutual information (MI) criterion for automatic classification of microcalcifications. We explored the MI based feature selection for various texture features. The proposed method was evaluated on a standardized digital database for screening mammography (DDSM). Experimental results demonstrate the effectiveness and the advantage of using the MI-based feature selection to obtain the most relevant features for the task and thus to provide for improved performance as compared to using all features.
Using environmental heterogeneity to plan for sea-level rise.
Hunter, Elizabeth A; Nibbelink, Nathan P
2017-12-01
Environmental heterogeneity is increasingly being used to select conservation areas that will provide for future biodiversity under a variety of climate scenarios. This approach, termed conserving nature's stage (CNS), assumes environmental features respond to climate change more slowly than biological communities, but will CNS be effective if the stage were to change as rapidly as the climate? We tested the effectiveness of using CNS to select sites in salt marshes for conservation in coastal Georgia (U.S.A.), where environmental features will change rapidly as sea level rises. We calculated species diversity based on distributions of 7 bird species with a variety of niches in Georgia salt marshes. Environmental heterogeneity was assessed across six landscape gradients (e.g., elevation, salinity, and patch area). We used 2 approaches to select sites with high environmental heterogeneity: site complementarity (environmental diversity [ED]) and local environmental heterogeneity (environmental richness [ER]). Sites selected based on ER predicted present-day species diversity better than randomly selected sites (up to an 8.1% improvement), were resilient to areal loss from SLR (1.0% average areal loss by 2050 compared with 0.9% loss of randomly selected sites), and provided habitat to a threatened species (0.63 average occupancy compared with 0.6 average occupancy of randomly selected sites). Sites selected based on ED predicted species diversity no better or worse than random and were not resilient to SLR (2.9% average areal loss by 2050). Despite the discrepancy between the 2 approaches, CNS is a viable strategy for conservation site selection in salt marshes because the ER approach was successful. It has potential for application in other coastal areas where SLR will affect environmental features, but its performance may depend on the magnitude of geological changes caused by SLR. Our results indicate that conservation planners that had heretofore excluded low-lying coasts from CNS planning could include coastal ecosystems in regional conservation strategies. © 2017 Society for Conservation Biology.
Using neuronal populations to study the mechanisms underlying spatial and feature attention
Cohen, Marlene R.; Maunsell, John H.R.
2012-01-01
Summary Visual attention affects both perception and neuronal responses. Whether the same neuronal mechanisms mediate spatial attention, which improves perception of attended locations, and non-spatial forms of attention has been a subject of considerable debate. Spatial and feature attention have similar effects on individual neurons. Because visual cortex is retinotopically organized, however, spatial attention can co-modulate local neuronal populations, while feature attention generally requires more selective modulation. We compared the effects of feature and spatial attention on local and spatially separated populations by recording simultaneously from dozens of neurons in both hemispheres of V4. Feature and spatial attention affect the activity of local populations similarly, modulating both firing rates and correlations between pairs of nearby neurons. However, while spatial attention appears to act on local populations, feature attention is coordinated across hemispheres. Our results are consistent with a unified attentional mechanism that can modulate the responses of arbitrary subgroups of neurons. PMID:21689604
Feature Selection Methods for Zero-Shot Learning of Neural Activity
Caceres, Carlos A.; Roos, Matthew J.; Rupp, Kyle M.; Milsap, Griffin; Crone, Nathan E.; Wolmetz, Michael E.; Ratto, Christopher R.
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception; A novel feature/attribute correlation approach was able to achieve those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy. PMID:28690513
Development of Vision Based Multiview Gait Recognition System with MMUGait Database
Ng, Hu; Tan, Wooi-Haw; Tong, Hau-Lee
2014-01-01
This paper describes the acquisition setup and development of a new gait database, MMUGait. This database consists of 82 subjects walking under normal condition and 19 subjects walking with 11 covariate factors, which were captured under two views. This paper also proposes a multiview model-based gait recognition system with joint detection approach that performs well under different walking trajectories and covariate factors, which include self-occluded or external occluded silhouettes. In the proposed system, the process begins by enhancing the human silhouette to remove the artifacts. Next, the width and height of the body are obtained. Subsequently, the joint angular trajectories are determined once the body joints are automatically detected. Lastly, crotch height and step-size of the walking subject are determined. The extracted features are smoothened by Gaussian filter to eliminate the effect of outliers. The extracted features are normalized with linear scaling, which is followed by feature selection prior to the classification process. The classification experiments carried out on MMUGait database were benchmarked against the SOTON Small DB from University of Southampton. Results showed correct classification rate above 90% for all the databases. The proposed approach is found to outperform other approaches on SOTON Small DB in most cases. PMID:25143972
Decision-making dynamics in parasitoids of Drosophila.
Thiel, Andra; Hoffmeister, Thomas S
2009-01-01
Drosophilids and their associated parasitoids live in environments that vary in resource availability and quality within and between generations. The use of information to adapt behavior to the current environment is a key feature under such circumstances and Drosophila parasitic wasps are excellent model systems to study learning and information use. They are among the few parasitoid model species that have been tested in a wide array of situations. Moreover, several related species have been tested under similar conditions, allowing the analysis of within and between species variability, the effect of natural selection in a typical environment, the current physiological status, and previous experience of the individual. This holds for host habitat and host location as well as for host choice and search time allocation. Here, we review patterns of learning and memory, of information use and updating mechanisms, and we point out that information use itself is under strong selective pressure and thus, optimized by parasitic wasps.
Expectation and Surprise Determine Neural Population Responses in the Ventral Visual Stream
Egner, Tobias; Monti, Jim M.; Summerfield, Christopher
2014-01-01
Visual cortex is traditionally viewed as a hierarchy of neural feature detectors, with neural population responses being driven by bottom-up stimulus features. Conversely, “predictive coding” models propose that each stage of the visual hierarchy harbors two computationally distinct classes of processing unit: representational units that encode the conditional probability of a stimulus and provide predictions to the next lower level; and error units that encode the mismatch between predictions and bottom-up evidence, and forward prediction error to the next higher level. Predictive coding therefore suggests that neural population responses in category-selective visual regions, like the fusiform face area (FFA), reflect a summation of activity related to prediction (“face expectation”) and prediction error (“face surprise”), rather than a homogenous feature detection response. We tested the rival hypotheses of the feature detection and predictive coding models by collecting functional magnetic resonance imaging data from the FFA while independently varying both stimulus features (faces vs houses) and subjects’ perceptual expectations regarding those features (low vs medium vs high face expectation). The effects of stimulus and expectation factors interacted, whereby FFA activity elicited by face and house stimuli was indistinguishable under high face expectation and maximally differentiated under low face expectation. Using computational modeling, we show that these data can be explained by predictive coding but not by feature detection models, even when the latter are augmented with attentional mechanisms. Thus, population responses in the ventral visual stream appear to be determined by feature expectation and surprise rather than by stimulus features per se. PMID:21147999
Identification of features in indexed data and equipment therefore
Jarman, Kristin H [Richland, WA; Daly, Don Simone [Richland, WA; Anderson, Kevin K [Richland, WA; Wahl, Karen L [Richland, WA
2002-04-02
Embodiments of the present invention provide methods of identifying a feature in an indexed dataset. Such embodiments encompass selecting an initial subset of indices, the initial subset of indices being encompassed by an initial window-of-interest and comprising at least one beginning index and at least one ending index; computing an intensity weighted measure of dispersion for the subset of indices using a subset of responses corresponding to the subset of indices; and comparing the intensity weighted measure of dispersion to a dispersion critical value determined from an expected value of the intensity weighted measure of dispersion under a null hypothesis of no transient feature present. Embodiments of the present invention also encompass equipment configured to perform the methods of the present invention.
Learning partial differential equations via data discovery and sparse optimization
NASA Astrophysics Data System (ADS)
Schaeffer, Hayden
2017-01-01
We investigate the problem of learning an evolution equation directly from some given data. This work develops a learning algorithm to identify the terms in the underlying partial differential equations and to approximate the coefficients of the terms only using data. The algorithm uses sparse optimization in order to perform feature selection and parameter estimation. The features are data driven in the sense that they are constructed using nonlinear algebraic equations on the spatial derivatives of the data. Several numerical experiments show the proposed method's robustness to data noise and size, its ability to capture the true features of the data, and its capability of performing additional analytics. Examples include shock equations, pattern formation, fluid flow and turbulence, and oscillatory convection.
Data survey on the effect of product features on competitive advantage of selected firms in Nigeria.
Olokundun, Maxwell; Iyiola, Oladele; Ibidunni, Stephen; Falola, Hezekiah; Salau, Odunayo; Amaihian, Augusta; Peter, Fred; Borishade, Taiye
2018-06-01
The main objective of this study was to present a data article that investigates the effect product features on firm's competitive advantage. Few studies have examined how the features of a product could help in driving the competitive advantage of a firm. Descriptive research method was used. Statistical Package for Social Sciences (SPSS 22) was engaged for analysis of one hundred and fifty (150) valid questionnaire which were completed by small business owners registered under small and medium scale enterprises development of Nigeria (SMEDAN). Stratified and simple random sampling techniques were employed; reliability and validity procedures were also confirmed. The field data set is made publicly available to enable critical or extended analysis.
Learning partial differential equations via data discovery and sparse optimization.
Schaeffer, Hayden
2017-01-01
We investigate the problem of learning an evolution equation directly from some given data. This work develops a learning algorithm to identify the terms in the underlying partial differential equations and to approximate the coefficients of the terms only using data. The algorithm uses sparse optimization in order to perform feature selection and parameter estimation. The features are data driven in the sense that they are constructed using nonlinear algebraic equations on the spatial derivatives of the data. Several numerical experiments show the proposed method's robustness to data noise and size, its ability to capture the true features of the data, and its capability of performing additional analytics. Examples include shock equations, pattern formation, fluid flow and turbulence, and oscillatory convection.
Learning partial differential equations via data discovery and sparse optimization
2017-01-01
We investigate the problem of learning an evolution equation directly from some given data. This work develops a learning algorithm to identify the terms in the underlying partial differential equations and to approximate the coefficients of the terms only using data. The algorithm uses sparse optimization in order to perform feature selection and parameter estimation. The features are data driven in the sense that they are constructed using nonlinear algebraic equations on the spatial derivatives of the data. Several numerical experiments show the proposed method's robustness to data noise and size, its ability to capture the true features of the data, and its capability of performing additional analytics. Examples include shock equations, pattern formation, fluid flow and turbulence, and oscillatory convection. PMID:28265183
Dong, Zuoli; Zhang, Naiqian; Li, Chun; Wang, Haiyun; Fang, Yun; Wang, Jun; Zheng, Xiaoqi
2015-06-30
An enduring challenge in personalized medicine is to select right drug for individual patients. Testing drugs on patients in large clinical trials is one way to assess their efficacy and toxicity, but it is impractical to test hundreds of drugs currently under development. Therefore the preclinical prediction model is highly expected as it enables prediction of drug response to hundreds of cell lines in parallel. Recently, two large-scale pharmacogenomic studies screened multiple anticancer drugs on over 1000 cell lines in an effort to elucidate the response mechanism of anticancer drugs. To this aim, we here used gene expression features and drug sensitivity data in Cancer Cell Line Encyclopedia (CCLE) to build a predictor based on Support Vector Machine (SVM) and a recursive feature selection tool. Robustness of our model was validated by cross-validation and an independent dataset, the Cancer Genome Project (CGP). Our model achieved good cross validation performance for most drugs in the Cancer Cell Line Encyclopedia (≥80% accuracy for 10 drugs, ≥75% accuracy for 19 drugs). Independent tests on eleven common drugs between CCLE and CGP achieved satisfactory performance for three of them, i.e., AZD6244, Erlotinib and PD-0325901, using expression levels of only twelve, six and seven genes, respectively. These results suggest that drug response could be effectively predicted from genomic features. Our model could be applied to predict drug response for some certain drugs and potentially play a complementary role in personalized medicine.
[Natural history of flowers and gravity].
Yamashita, Masamichi; Tomita-Yokotani, Kaori; Nakamura, Teruko
2004-06-01
Many flowers have coevolved with their pollinator animals. Gravity has been one of selection pressure for the evolution of flowers. Gravity rules morphology and other features of flowers in many aspects. Pair matching between the flower and its specific pollinator is one of factors that determine the fitness of both sides. Evolution of flower morphology and its molecular basis are reviewed briefly. Anemophilous flowers are also under the influence of gravity. Shape and other features of entomophilous flowers have been highly diversed. Gravitropic response and its mechanism are summarized. Recent findings on gravitropism and phototropism of pistils and stamens are presented in this article.
Enhancing the Performance of LibSVM Classifier by Kernel F-Score Feature Selection
NASA Astrophysics Data System (ADS)
Sarojini, Balakrishnan; Ramaraj, Narayanasamy; Nickolas, Savarimuthu
Medical Data mining is the search for relationships and patterns within the medical datasets that could provide useful knowledge for effective clinical decisions. The inclusion of irrelevant, redundant and noisy features in the process model results in poor predictive accuracy. Much research work in data mining has gone into improving the predictive accuracy of the classifiers by applying the techniques of feature selection. Feature selection in medical data mining is appreciable as the diagnosis of the disease could be done in this patient-care activity with minimum number of significant features. The objective of this work is to show that selecting the more significant features would improve the performance of the classifier. We empirically evaluate the classification effectiveness of LibSVM classifier on the reduced feature subset of diabetes dataset. The evaluations suggest that the feature subset selected improves the predictive accuracy of the classifier and reduce false negatives and false positives.
The fate of task-irrelevant visual motion: perceptual load versus feature-based attention.
Taya, Shuichiro; Adams, Wendy J; Graf, Erich W; Lavie, Nilli
2009-11-18
We tested contrasting predictions derived from perceptual load theory and from recent feature-based selection accounts. Observers viewed moving, colored stimuli and performed low or high load tasks associated with one stimulus feature, either color or motion. The resultant motion aftereffect (MAE) was used to evaluate attentional allocation. We found that task-irrelevant visual features received less attention than co-localized task-relevant features of the same objects. Moreover, when color and motion features were co-localized yet perceived to belong to two distinct surfaces, feature-based selection was further increased at the expense of object-based co-selection. Load theory predicts that the MAE for task-irrelevant motion would be reduced with a higher load color task. However, this was not seen for co-localized features; perceptual load only modulated the MAE for task-irrelevant motion when this was spatially separated from the attended color location. Our results suggest that perceptual load effects are mediated by spatial selection and do not generalize to the feature domain. Feature-based selection operates to suppress processing of task-irrelevant, co-localized features, irrespective of perceptual load.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rios Velazquez, E; Parmar, C; Narayan, V
Purpose: To compare the complementary value of quantitative radiomic features to that of radiologist-annotated semantic features in predicting EGFR mutations in lung adenocarcinomas. Methods: Pre-operative CT images of 258 lung adenocarcinoma patients were available. Tumors were segmented using the sing-click ensemble segmentation algorithm. A set of radiomic features was extracted using 3D-Slicer. Test-retest reproducibility and unsupervised dimensionality reduction were applied to select a subset of reproducible and independent radiomic features. Twenty semantic annotations were scored by an expert radiologist, describing the tumor, surrounding tissue and associated findings. Minimum-redundancy-maximum-relevance (MRMR) was used to identify the most informative radiomic and semantic featuresmore » in 172 patients (training-set, temporal split). Radiomic, semantic and combined radiomic-semantic logistic regression models to predict EGFR mutations were evaluated in and independent validation dataset of 86 patients using the area under the receiver operating curve (AUC). Results: EGFR mutations were found in 77/172 (45%) and 39/86 (45%) of the training and validation sets, respectively. Univariate AUCs showed a similar range for both feature types: radiomics median AUC = 0.57 (range: 0.50 – 0.62); semantic median AUC = 0.53 (range: 0.50 – 0.64, Wilcoxon p = 0.55). After MRMR feature selection, the best-performing radiomic, semantic, and radiomic-semantic logistic regression models, for EGFR mutations, showed a validation AUC of 0.56 (p = 0.29), 0.63 (p = 0.063) and 0.67 (p = 0.004), respectively. Conclusion: Quantitative volumetric and textural Radiomic features complement the qualitative and semi-quantitative radiologist annotations. The prognostic value of informative qualitative semantic features such as cavitation and lobulation is increased with the addition of quantitative textural features from the tumor region.« less
Classification Influence of Features on Given Emotions and Its Application in Feature Selection
NASA Astrophysics Data System (ADS)
Xing, Yin; Chen, Chuang; Liu, Li-Long
2018-04-01
In order to solve the problem that there is a large amount of redundant data in high-dimensional speech emotion features, we analyze deeply the extracted speech emotion features and select better features. Firstly, a given emotion is classified by each feature. Secondly, the recognition rate is ranked in descending order. Then, the optimal threshold of features is determined by rate criterion. Finally, the better features are obtained. When applied in Berlin and Chinese emotional data set, the experimental results show that the feature selection method outperforms the other traditional methods.
Mala, S.; Latha, K.
2014-01-01
Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition. PMID:25574185
Mala, S; Latha, K
2014-01-01
Activity recognition is needed in different requisition, for example, reconnaissance system, patient monitoring, and human-computer interfaces. Feature selection plays an important role in activity recognition, data mining, and machine learning. In selecting subset of features, an efficient evolutionary algorithm Differential Evolution (DE), a very efficient optimizer, is used for finding informative features from eye movements using electrooculography (EOG). Many researchers use EOG signals in human-computer interactions with various computational intelligence methods to analyze eye movements. The proposed system involves analysis of EOG signals using clearness based features, minimum redundancy maximum relevance features, and Differential Evolution based features. This work concentrates more on the feature selection algorithm based on DE in order to improve the classification for faultless activity recognition.
Object-based attention: strength of object representation and attentional guidance.
Shomstein, Sarah; Behrmann, Marlene
2008-01-01
Two or more features belonging to a single object are identified more quickly and more accurately than are features belonging to different objects--a finding attributed to sensory enhancement of all features belonging to an attended or selected object. However, several recent studies have suggested that this "single-object advantage" may be a product of probabilistic and configural strategic prioritizations rather than of object-based perceptual enhancement per se, challenging the underlying mechanism that is thought to give rise to object-based attention. In the present article, we further explore constraints on the mechanisms of object-based selection by examining the contribution of the strength of object representations to the single-object advantage. We manipulated factors such as exposure duration (i.e., preview time) and salience of configuration (i.e., objects). Varying preview time changes the magnitude of the object-based effect, so that if there is ample time to establish an object representation (i.e., preview time of 1,000 msec), then both probability and configuration (i.e., objects) guide attentional selection. If, however, insufficient time is provided to establish a robust object-based representation, then only probabilities guide attentional selection. Interestingly, at a short preview time of 200 msec, when the two objects were sufficiently different from each other (i.e., different colors), both configuration and probability guided attention selection. These results suggest that object-based effects can be explained both in terms of strength of object representations (established at longer exposure durations and by pictorial cues) and probabilistic contingencies in the visual environment.
Buri, Luigi; Hassan, Cesare; Bersani, Gianluca; Anti, Marcello; Bianco, Maria Antonietta; Cipolletta, Livio; Di Giulio, Emilio; Di Matteo, Giovanni; Familiari, Luigi; Ficano, Leonardo; Loriga, Pietro; Morini, Sergio; Pietropaolo, Vincenzo; Zambelli, Alessandro; Grossi, Enzo; Intraligi, Marco; Buscema, Massimo
2010-06-01
Selecting patients appropriately for upper endoscopy (EGD) is crucial for efficient use of endoscopy. The objective of this study was to compare different clinical strategies and statistical methods to select patients for EGD, namely appropriateness guidelines, age and/or alarm features, and multivariate and artificial neural network (ANN) models. A nationwide, multicenter, prospective study was undertaken in which consecutive patients referred for EGD during a 1-month period were enrolled. Before EGD, the endoscopist assessed referral appropriateness according to the American Society for Gastrointestinal Endoscopy (ASGE) guidelines, also collecting clinical and demographic variables. Outcomes of the study were detection of relevant findings and new diagnosis of malignancy at EGD. The accuracy of the following clinical strategies and predictive rules was compared: (i) ASGE appropriateness guidelines (indicated vs. not indicated), (ii) simplified rule (>or=45 years or alarm features vs. <45 years without alarm features), (iii) logistic regression model, and (iv) ANN models. A total of 8,252 patients were enrolled in 57 centers. Overall, 3,803 (46%) relevant findings and 132 (1.6%) new malignancies were detected. Sensitivity, specificity, and area under the receiver-operating characteristic curve (AUC) of the simplified rule were similar to that of the ASGE guidelines for both relevant findings (82%/26%/0.55 vs. 88%/27%/0.52) and cancer (97%/22%/0.58 vs. 98%/20%/0.58). Both logistic regression and ANN models seemed to be substantially more accurate in predicting new cases of malignancy, with an AUC of 0.82 and 0.87, respectively. A simple predictive rule based on age and alarm features is similarly effective to the more complex ASGE guidelines in selecting patients for EGD. Regression and ANN models may be useful in identifying a relatively small subgroup of patients at higher risk of cancer.
Chromatin Landscapes of Retroviral and Transposon Integration Profiles
Badhai, Jitendra; Rust, Alistair G.; Rad, Roland; Hilkens, John; Berns, Anton; van Lohuizen, Maarten; Wessels, Lodewyk F. A.; de Ridder, Jeroen
2014-01-01
The ability of retroviruses and transposons to insert their genetic material into host DNA makes them widely used tools in molecular biology, cancer research and gene therapy. However, these systems have biases that may strongly affect research outcomes. To address this issue, we generated very large datasets consisting of to unselected integrations in the mouse genome for the Sleeping Beauty (SB) and piggyBac (PB) transposons, and the Mouse Mammary Tumor Virus (MMTV). We analyzed (epi)genomic features to generate bias maps at both local and genome-wide scales. MMTV showed a remarkably uniform distribution of integrations across the genome. More distinct preferences were observed for the two transposons, with PB showing remarkable resemblance to bias profiles of the Murine Leukemia Virus. Furthermore, we present a model where target site selection is directed at multiple scales. At a large scale, target site selection is similar across systems, and defined by domain-oriented features, namely expression of proximal genes, proximity to CpG islands and to genic features, chromatin compaction and replication timing. Notable differences between the systems are mainly observed at smaller scales, and are directed by a diverse range of features. To study the effect of these biases on integration sites occupied under selective pressure, we turned to insertional mutagenesis (IM) screens. In IM screens, putative cancer genes are identified by finding frequently targeted genomic regions, or Common Integration Sites (CISs). Within three recently completed IM screens, we identified 7%–33% putative false positive CISs, which are likely not the result of the oncogenic selection process. Moreover, results indicate that PB, compared to SB, is more suited to tag oncogenes. PMID:24721906
Tan, Maxine; Aghaei, Faranak; Wang, Yunzhi; Zheng, Bin
2017-01-01
The purpose of this study is to evaluate a new method to improve performance of computer-aided detection (CAD) schemes of screening mammograms with two approaches. In the first approach, we developed a new case based CAD scheme using a set of optimally selected global mammographic density, texture, spiculation, and structural similarity features computed from all four full-field digital mammography (FFDM) images of the craniocaudal (CC) and mediolateral oblique (MLO) views by using a modified fast and accurate sequential floating forward selection feature selection algorithm. Selected features were then applied to a “scoring fusion” artificial neural network (ANN) classification scheme to produce a final case based risk score. In the second approach, we combined the case based risk score with the conventional lesion based scores of a conventional lesion based CAD scheme using a new adaptive cueing method that is integrated with the case based risk scores. We evaluated our methods using a ten-fold cross-validation scheme on 924 cases (476 cancer and 448 recalled or negative), whereby each case had all four images from the CC and MLO views. The area under the receiver operating characteristic curve was AUC = 0.793±0.015 and the odds ratio monotonically increased from 1 to 37.21 as CAD-generated case based detection scores increased. Using the new adaptive cueing method, the region based and case based sensitivities of the conventional CAD scheme at a false positive rate of 0.71 per image increased by 2.4% and 0.8%, respectively. The study demonstrated that supplementary information can be derived by computing global mammographic density image features to improve CAD-cueing performance on the suspicious mammographic lesions. PMID:27997380
An enhanced digital line graph design
Guptill, Stephen C.
1990-01-01
In response to increasing information demands on its digital cartographic data, the U.S. Geological Survey has designed an enhanced version of the Digital Line Graph, termed Digital Line Graph - Enhanced (DLG-E). In the DLG-E model, the phenomena represented by geographic and cartographic data are termed entities. Entities represent individual phenomena in the real world. A feature is an abstraction of a set of entities, with the feature description encompassing only selected properties of the entities (typically the properties that have been portrayed cartographically on a map). Buildings, bridges, roads, streams, grasslands, and counties are examples of features. A feature instance, that is, one occurrence of a feature, is described in the digital environment by feature objects and spatial objects. A feature object identifies a feature instance and its nonlocational attributes. Nontopological relationships are associated with feature objects. The locational aspects of the feature instance are represented by spatial objects. Four spatial objects (points, nodes, chains, and polygons) and their topological relationships are defined. To link the locational and nonlocational aspects of the feature instance, a given feature object is associated with (or is composed of) a set of spatial objects. These objects, attributes, and relationships are the components of the DLG-E data model. To establish a domain of features for DLG-E, an approach using a set of classes, or views, of spatial entities was adopted. The five views that were developed are cover, division, ecosystem, geoposition, and morphology. The views are exclusive; each view is a self-contained analytical approach to the entire range of world features. Because each view is independent of the others, a single point on the surface of the Earth can be represented under multiple views. Under the five views, over 200 features were identified and defined. This set constitutes an initial domain of DLG-E features.
Perceptual quality estimation of H.264/AVC videos using reduced-reference and no-reference models
NASA Astrophysics Data System (ADS)
Shahid, Muhammad; Pandremmenou, Katerina; Kondi, Lisimachos P.; Rossholm, Andreas; Lövström, Benny
2016-09-01
Reduced-reference (RR) and no-reference (NR) models for video quality estimation, using features that account for the impact of coding artifacts, spatio-temporal complexity, and packet losses, are proposed. The purpose of this study is to analyze a number of potentially quality-relevant features in order to select the most suitable set of features for building the desired models. The proposed sets of features have not been used in the literature and some of the features are used for the first time in this study. The features are employed by the least absolute shrinkage and selection operator (LASSO), which selects only the most influential of them toward perceptual quality. For comparison, we apply feature selection in the complete feature sets and ridge regression on the reduced sets. The models are validated using a database of H.264/AVC encoded videos that were subjectively assessed for quality in an ITU-T compliant laboratory. We infer that just two features selected by RR LASSO and two bitstream-based features selected by NR LASSO are able to estimate perceptual quality with high accuracy, higher than that of ridge, which uses more features. The comparisons with competing works and two full-reference metrics also verify the superiority of our models.
Feature Grouping and Selection Over an Undirected Graph.
Yang, Sen; Yuan, Lei; Lai, Ying-Cheng; Shen, Xiaotong; Wonka, Peter; Ye, Jieping
2012-01-01
High-dimensional regression/classification continues to be an important and challenging problem, especially when features are highly correlated. Feature selection, combined with additional structure information on the features has been considered to be promising in promoting regression/classification performance. Graph-guided fused lasso (GFlasso) has recently been proposed to facilitate feature selection and graph structure exploitation, when features exhibit certain graph structures. However, the formulation in GFlasso relies on pairwise sample correlations to perform feature grouping, which could introduce additional estimation bias. In this paper, we propose three new feature grouping and selection methods to resolve this issue. The first method employs a convex function to penalize the pairwise l ∞ norm of connected regression/classification coefficients, achieving simultaneous feature grouping and selection. The second method improves the first one by utilizing a non-convex function to reduce the estimation bias. The third one is the extension of the second method using a truncated l 1 regularization to further reduce the estimation bias. The proposed methods combine feature grouping and feature selection to enhance estimation accuracy. We employ the alternating direction method of multipliers (ADMM) and difference of convex functions (DC) programming to solve the proposed formulations. Our experimental results on synthetic data and two real datasets demonstrate the effectiveness of the proposed methods.
Cyclic Solvent Vapor Annealing for Rapid, Robust Vertical Orientation of Features in BCP Thin Films
NASA Astrophysics Data System (ADS)
Paradiso, Sean; Delaney, Kris; Fredrickson, Glenn
2015-03-01
Methods for reliably controlling block copolymer self assembly have seen much attention over the past decade as new applications for nanostructured thin films emerge in the fields of nanopatterning and lithography. While solvent assisted annealing techniques are established as flexible and simple methods for achieving long range order, solvent annealing alone exhibits a very weak thermodynamic driving force for vertically orienting domains with respect to the free surface. To address the desire for oriented features, we have investigated a cyclic solvent vapor annealing (CSVA) approach that combines the mobility benefits of solvent annealing with selective stress experienced by structures oriented parallel to the free surface as the film is repeatedly swollen with solvent and dried. Using dynamical self-consistent field theory (DSCFT) calculations, we establish the conditions under which the method significantly outperforms both static and cyclic thermal annealing and implicate the orientation selection as a consequence of the swelling/deswelling process. Our results suggest that CSVA may prove to be a potent method for the rapid formation of highly ordered, vertically oriented features in block copolymer thin films.
CheS-Mapper - Chemical Space Mapping and Visualization in 3D.
Gütlein, Martin; Karwath, Andreas; Kramer, Stefan
2012-03-17
Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects. To that respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and consequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity, by selecting which features to employ in the process. The tool can use and calculate different kind of features, like structural fragments as well as quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which aids the chemist to better understand patterns and regularities and relate the observations to established scientific knowledge. As a final function, the tool can also be used to select and export specific subsets of a given dataset for further analysis.
CheS-Mapper - Chemical Space Mapping and Visualization in 3D
2012-01-01
Analyzing chemical datasets is a challenging task for scientific researchers in the field of chemoinformatics. It is important, yet difficult to understand the relationship between the structure of chemical compounds, their physico-chemical properties, and biological or toxic effects. To that respect, visualization tools can help to better comprehend the underlying correlations. Our recently developed 3D molecular viewer CheS-Mapper (Chemical Space Mapper) divides large datasets into clusters of similar compounds and consequently arranges them in 3D space, such that their spatial proximity reflects their similarity. The user can indirectly determine similarity, by selecting which features to employ in the process. The tool can use and calculate different kind of features, like structural fragments as well as quantitative chemical descriptors. These features can be highlighted within CheS-Mapper, which aids the chemist to better understand patterns and regularities and relate the observations to established scientific knowledge. As a final function, the tool can also be used to select and export specific subsets of a given dataset for further analysis. PMID:22424447
Development and evaluation of an ambulatory stress monitor based on wearable sensors.
Choi, Jongyoon; Ahmed, Beena; Gutierrez-Osuna, Ricardo
2012-03-01
Chronic stress is endemic to modern society. However, as it is unfeasible for physicians to continuously monitor stress levels, its diagnosis is nontrivial. Wireless body sensor networks offer opportunities to ubiquitously detect and monitor mental stress levels, enabling improved diagnosis, and early treatment. This article describes the development of a wearable sensor platform to monitor a number of physiological correlates of mental stress. We discuss tradeoffs in both system design and sensor selection to balance information content and wearability. Using experimental signals collected from the wearable sensor, we describe a selected number of physiological features that show good correlation with mental stress. In particular, we propose a new spectral feature that estimates the balance of the autonomic nervous system by combining information from the power spectral density of respiration and heart rate variability. We validate the effectiveness of our approach on a binary discrimination problem when subjects are placed under two psychophysiological conditions: mental stress and relaxation. When used in a logistic regression model, our feature set is able to discriminate between these two mental states with a success rate of 81% across subjects. © 2012 IEEE
Schizophrenia with prominent catatonic features: A selective review.
Ungvari, Gabor S; Gerevich, Jozsef; Takács, Rozália; Gazdag, Gábor
2017-08-14
A widely accepted consensus holds that a variety of motor symptoms subsumed under the term 'catatonia' have been an integral part of the symptomatology of schizophrenia since 1896, when Kraepelin proposed the concept of dementia praecox (schizophrenia). Until recently, psychiatric classifications included catatonic schizophrenia mainly through tradition, without compelling evidence of its validity as a schizophrenia subtype. This selective review briefly summarizes the history, psychopathology, demographic and epidemiological data, and treatment options for schizophrenia with prominent catatonic features. Although most catatonic signs and symptoms are easy to observe and measure, the lack of conceptual clarity of catatonia and consensus about the threshold and criteria for its diagnosis have hampered our understanding of how catatonia contributes to the pathophysiology of schizophrenic psychoses. Diverse study samples and methodologies have further hindered research on schizophrenia with prominent catatonic features. A focus on the motor aspects of broadly defined schizophrenia using modern methods of detecting and quantifying catatonic signs and symptoms coupled with sophisticated neuroimaging techniques offers a new approach to research in this long-overlooked field. Copyright © 2017. Published by Elsevier B.V.
Abbasian Ardakani, Ali; Gharbali, Akbar; Mohammadi, Afshin
2015-01-01
The aim of this study was to evaluate computer aided diagnosis (CAD) system with texture analysis (TA) to improve radiologists' accuracy in identification of thyroid nodules as malignant or benign. A total of 70 cases (26 benign and 44 malignant) were analyzed in this study. We extracted up to 270 statistical texture features as a descriptor for each selected region of interests (ROIs) in three normalization schemes (default, 3s and 1%-99%). Then features by the lowest probability of classification error and average correlation coefficients (POE+ACC), and Fisher coefficient (Fisher) eliminated to 10 best and most effective features. These features were analyzed under standard and nonstandard states. For TA of the thyroid nodules, Principle Component Analysis (PCA), Linear Discriminant Analysis (LDA) and Non-Linear Discriminant Analysis (NDA) were applied. First Nearest-Neighbour (1-NN) classifier was performed for the features resulting from PCA and LDA. NDA features were classified by artificial neural network (A-NN). Receiver operating characteristic (ROC) curve analysis was used for examining the performance of TA methods. The best results were driven in 1-99% normalization with features extracted by POE+ACC algorithm and analyzed by NDA with the area under the ROC curve ( Az) of 0.9722 which correspond to sensitivity of 94.45%, specificity of 100%, and accuracy of 97.14%. Our results indicate that TA is a reliable method, can provide useful information help radiologist in detection and classification of benign and malignant thyroid nodules.
A completely automated CAD system for mass detection in a large mammographic database.
Bellotti, R; De Carlo, F; Tangaro, S; Gargano, G; Maggipinto, G; Castellano, M; Massafra, R; Cascio, D; Fauci, F; Magro, R; Raso, G; Lauria, A; Forni, G; Bagnasco, S; Cerello, P; Zanon, E; Cheran, S C; Lopez Torres, E; Bottigli, U; Masala, G L; Oliva, P; Retico, A; Fantacci, M E; Cataldo, R; De Mitri, I; De Nunzio, G
2006-08-01
Mass localization plays a crucial role in computer-aided detection (CAD) systems for the classification of suspicious regions in mammograms. In this article we present a completely automated classification system for the detection of masses in digitized mammographic images. The tool system we discuss consists in three processing levels: (a) Image segmentation for the localization of regions of interest (ROIs). This step relies on an iterative dynamical threshold algorithm able to select iso-intensity closed contours around gray level maxima of the mammogram. (b) ROI characterization by means of textural features computed from the gray tone spatial dependence matrix (GTSDM), containing second-order spatial statistics information on the pixel gray level intensity. As the images under study were recorded in different centers and with different machine settings, eight GTSDM features were selected so as to be invariant under monotonic transformation. In this way, the images do not need to be normalized, as the adopted features depend on the texture only, rather than on the gray tone levels, too. (c) ROI classification by means of a neural network, with supervision provided by the radiologist's diagnosis. The CAD system was evaluated on a large database of 3369 mammographic images [2307 negative, 1062 pathological (or positive), containing at least one confirmed mass, as diagnosed by an expert radiologist]. To assess the performance of the system, receiver operating characteristic (ROC) and free-response ROC analysis were employed. The area under the ROC curve was found to be Az = 0.783 +/- 0.008 for the ROI-based classification. When evaluating the accuracy of the CAD against the radiologist-drawn boundaries, 4.23 false positives per image are found at 80% of mass sensitivity.
Breast cancer Ki67 expression preoperative discrimination by DCE-MRI radiomics features
NASA Astrophysics Data System (ADS)
Ma, Wenjuan; Ji, Yu; Qin, Zhuanping; Guo, Xinpeng; Jian, Xiqi; Liu, Peifang
2018-02-01
To investigate whether quantitative radiomics features extracted from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) are associated with Ki67 expression of breast cancer. In this institutional review board approved retrospective study, we collected 377 cases Chinese women who were diagnosed with invasive breast cancer in 2015. This cohort included 53 low-Ki67 expression (Ki67 proliferation index less than 14%) and 324 cases with high-Ki67 expression (Ki67 proliferation index more than 14%). A binary-classification of low- vs. high- Ki67 expression was performed. A set of 52 quantitative radiomics features, including morphological, gray scale statistic, and texture features, were extracted from the segmented lesion area. Three most common machine learning classification methods, including Naive Bayes, k-Nearest Neighbor and support vector machine with Gaussian kernel, were employed for the classification and the least absolute shrink age and selection operator (LASSO) method was used to select most predictive features set for the classifiers. Classification performance was evaluated by the area under receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity. The model that used Naive Bayes classification method achieved the best performance than the other two methods, yielding 0.773 AUC value, 0.757 accuracy, 0.777 sensitivity and 0.769 specificity. Our study showed that quantitative radiomics imaging features of breast tumor extracted from DCE-MRI are associated with breast cancer Ki67 expression. Future larger studies are needed in order to further evaluate the findings.
Effective traffic features selection algorithm for cyber-attacks samples
NASA Astrophysics Data System (ADS)
Li, Yihong; Liu, Fangzheng; Du, Zhenyu
2018-05-01
By studying the defense scheme of Network attacks, this paper propose an effective traffic features selection algorithm based on k-means++ clustering to deal with the problem of high dimensionality of traffic features which extracted from cyber-attacks samples. Firstly, this algorithm divide the original feature set into attack traffic feature set and background traffic feature set by the clustering. Then, we calculates the variation of clustering performance after removing a certain feature. Finally, evaluating the degree of distinctiveness of the feature vector according to the result. Among them, the effective feature vector is whose degree of distinctiveness exceeds the set threshold. The purpose of this paper is to select out the effective features from the extracted original feature set. In this way, it can reduce the dimensionality of the features so as to reduce the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and it has some advantages over other selection algorithms.
Occupant traffic estimation through structural vibration sensing
NASA Astrophysics Data System (ADS)
Pan, Shijia; Mirshekari, Mostafa; Zhang, Pei; Noh, Hae Young
2016-04-01
The number of people passing through different indoor areas is useful in various smart structure applications, including occupancy-based building energy/space management, marketing research, security, etc. Existing approaches to estimate occupant traffic include vision-, sound-, and radio-based (mobile) sensing methods, which have placement limitations (e.g., requirement of line-of-sight, quiet environment, carrying a device all the time). Such limitations make these direct sensing approaches difficult to deploy and maintain. An indirect approach using geophones to measure floor vibration induced by footsteps can be utilized. However, the main challenge lies in distinguishing multiple simultaneous walkers by developing features that can effectively represent the number of mixed signals and characterize the selected features under different traffic conditions. This paper presents a method to monitor multiple persons. Once the vibration signals are obtained, features are extracted to describe the overlapping vibration signals induced by multiple footsteps, which are used for occupancy traffic estimation. In particular, we focus on analysis of the efficiency and limitations of the four selected key features when used for estimating various traffic conditions. We characterize these features with signals collected from controlled impulse load tests as well as from multiple people walking through a real-world sensing area. In our experiments, the system achieves the mean estimation error of +/-0.2 people for different occupant traffic conditions (from one to four) using k-nearest neighbor classifier.
Relevance popularity: A term event model based feature selection scheme for text classification.
Feng, Guozhong; An, Baiguo; Yang, Fengqin; Wang, Han; Zhang, Libiao
2017-01-01
Feature selection is a practical approach for improving the performance of text classification methods by optimizing the feature subsets input to classifiers. In traditional feature selection methods such as information gain and chi-square, the number of documents that contain a particular term (i.e. the document frequency) is often used. However, the frequency of a given term appearing in each document has not been fully investigated, even though it is a promising feature to produce accurate classifications. In this paper, we propose a new feature selection scheme based on a term event Multinomial naive Bayes probabilistic model. According to the model assumptions, the matching score function, which is based on the prediction probability ratio, can be factorized. Finally, we derive a feature selection measurement for each term after replacing inner parameters by their estimators. On a benchmark English text datasets (20 Newsgroups) and a Chinese text dataset (MPH-20), our numerical experiment results obtained from using two widely used text classifiers (naive Bayes and support vector machine) demonstrate that our method outperformed the representative feature selection methods.
Hybrid feature selection for supporting lightweight intrusion detection systems
NASA Astrophysics Data System (ADS)
Song, Jianglong; Zhao, Wentao; Liu, Qiang; Wang, Xin
2017-08-01
Redundant and irrelevant features not only cause high resource consumption but also degrade the performance of Intrusion Detection Systems (IDS), especially when coping with big data. These features slow down the process of training and testing in network traffic classification. Therefore, a hybrid feature selection approach in combination with wrapper and filter selection is designed in this paper to build a lightweight intrusion detection system. Two main phases are involved in this method. The first phase conducts a preliminary search for an optimal subset of features, in which the chi-square feature selection is utilized. The selected set of features from the previous phase is further refined in the second phase in a wrapper manner, in which the Random Forest(RF) is used to guide the selection process and retain an optimized set of features. After that, we build an RF-based detection model and make a fair comparison with other approaches. The experimental results on NSL-KDD datasets show that our approach results are in higher detection accuracy as well as faster training and testing processes.
Ghayab, Hadi Ratham Al; Li, Yan; Abdulla, Shahab; Diykh, Mohammed; Wan, Xiangkui
2016-06-01
Electroencephalogram (EEG) signals are used broadly in the medical fields. The main applications of EEG signals are the diagnosis and treatment of diseases such as epilepsy, Alzheimer, sleep problems and so on. This paper presents a new method which extracts and selects features from multi-channel EEG signals. This research focuses on three main points. Firstly, simple random sampling (SRS) technique is used to extract features from the time domain of EEG signals. Secondly, the sequential feature selection (SFS) algorithm is applied to select the key features and to reduce the dimensionality of the data. Finally, the selected features are forwarded to a least square support vector machine (LS_SVM) classifier to classify the EEG signals. The LS_SVM classifier classified the features which are extracted and selected from the SRS and the SFS. The experimental results show that the method achieves 99.90, 99.80 and 100 % for classification accuracy, sensitivity and specificity, respectively.
NASA Astrophysics Data System (ADS)
Park, Keun; Lee, Sang-Ik
2010-03-01
High-frequency induction is an efficient, non-contact means of heating the surface of an injection mold through electromagnetic induction. Because the procedure allows for the rapid heating and cooling of mold surfaces, it has been recently applied to the injection molding of thin-walled parts or micro/nano-structures. The present study proposes a localized heating method involving the selective use of mold materials to enhance the heating efficiency of high-frequency induction heating. For localized induction heating, a composite injection mold of ferromagnetic material and paramagnetic material is used. The feasibility of the proposed heating method is investigated through numerical analyses in terms of its heating efficiency for localized mold surfaces and in terms of the structural safety of the composite mold. The moldability of high aspect ratio micro-features is then experimentally compared under a variety of induction heating conditions.
Qi, Miao; Wang, Ting; Yi, Yugen; Gao, Na; Kong, Jun; Wang, Jianzhong
2017-04-01
Feature selection has been regarded as an effective tool to help researchers understand the generating process of data. For mining the synthesis mechanism of microporous AlPOs, this paper proposes a novel feature selection method by joint l 2,1 norm and Fisher discrimination constraints (JNFDC). In order to obtain more effective feature subset, the proposed method can be achieved in two steps. The first step is to rank the features according to sparse and discriminative constraints. The second step is to establish predictive model with the ranked features, and select the most significant features in the light of the contribution of improving the predictive accuracy. To the best of our knowledge, JNFDC is the first work which employs the sparse representation theory to explore the synthesis mechanism of six kinds of pore rings. Numerical simulations demonstrate that our proposed method can select significant features affecting the specified structural property and improve the predictive accuracy. Moreover, comparison results show that JNFDC can obtain better predictive performances than some other state-of-the-art feature selection methods. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
An object-based visual attention model for robotic applications.
Yu, Yuanlong; Mann, George K I; Gosine, Raymond G
2010-10-01
By extending integrated competition hypothesis, this paper presents an object-based visual attention model, which selects one object of interest using low-dimensional features, resulting that visual perception starts from a fast attentional selection procedure. The proposed attention model involves seven modules: learning of object representations stored in a long-term memory (LTM), preattentive processing, top-down biasing, bottom-up competition, mediation between top-down and bottom-up ways, generation of saliency maps, and perceptual completion processing. It works in two phases: learning phase and attending phase. In the learning phase, the corresponding object representation is trained statistically when one object is attended. A dual-coding object representation consisting of local and global codings is proposed. Intensity, color, and orientation features are used to build the local coding, and a contour feature is employed to constitute the global coding. In the attending phase, the model preattentively segments the visual field into discrete proto-objects using Gestalt rules at first. If a task-specific object is given, the model recalls the corresponding representation from LTM and deduces the task-relevant feature(s) to evaluate top-down biases. The mediation between automatic bottom-up competition and conscious top-down biasing is then performed to yield a location-based saliency map. By combination of location-based saliency within each proto-object, the proto-object-based saliency is evaluated. The most salient proto-object is selected for attention, and it is finally put into the perceptual completion processing module to yield a complete object region. This model has been applied into distinct tasks of robots: detection of task-specific stationary and moving objects. Experimental results under different conditions are shown to validate this model.
Investigating a memory-based account of negative priming: support for selection-feature mismatch.
MacDonald, P A; Joordens, S
2000-08-01
Using typical and modified negative priming tasks, the selection-feature mismatch account of negative priming was tested. In the modified task, participants performed selections on the basis of a semantic feature (e.g., referent size). This procedure has been shown to enhance negative priming (P. A. MacDonald, S. Joordens, & K. N. Seergobin, 1999). Across 3 experiments, negative priming occurred only when the repeated item mismatched in terms of the feature used as the basis for selections. When the repeated item was congruent on the selection feature across the prime and probe displays, positive priming arose. This pattern of results appeared in both the ignored- and the attended-repetition conditions. Negative priming does not result from previously ignoring an item. These findings strongly support the selection-feature mismatch account of negative priming and refute both the distractor inhibition and the episodic-retrieval explanations.
High-Si content BARC for dual-BARC systems such as trilayer patterning
NASA Astrophysics Data System (ADS)
Kennedy, Joseph; Xie, Song-Yuan; Wu, Ze-Yu; Katsanes, Ron; Flanigan, Kyle; Lee, Kevin; Slezak, Mark; Liu, Zhi; Lin, Shang-Ho
2009-03-01
This work discusses the requirements and performance of Honeywell's middle layer material, UVAS, for tri-layer patterning. UVAS is a high Si content polymer synthesized directly from Si containing starting monomer components. The monomers are selected to produce a film that meets the requirements as a middle layer for tri-layer patterning (TLP) and gives us a level of flexibility to adjust the properties of the film to meet the customer's specific photoresist and patterning requirements. Results of simulations of the substrate reflectance versus numerical aperture, UVAS thickness, and under layer film are presented. ArF photoresist line profiles and process latitude versus UVAS bake at temperatures as low as 150ºC are presented and discussed. Immersion lithographic patterning of ArF photoresist line space and contact hole features will be presented. A sequence of SEM images detailing the plasma etch transfer of line space photoresist features through the middle and under layer films comprising the TLP film stack will be presented. Excellent etch selectivity between the UVAS and the organic under layer film exists as no edge erosion or faceting is observed as a result of the etch process. A detailed study of the impact of a PGMEA solvent photoresist rework process on the lithographic process window of a TLP film stack was performed with the results indicating that no degradation to the UVAS film occurs.
Fatigue Analyses Under Constant- and Variable-Amplitude Loading Using Small-Crack Theory
NASA Technical Reports Server (NTRS)
Newman, J. C., Jr.; Phillips, E. P.; Everett, R. A., Jr.
1999-01-01
Studies on the growth of small cracks have led to the observation that fatigue life of many engineering materials is primarily "crack growth" from micro-structural features, such as inclusion particles, voids, slip-bands or from manufacturing defects. This paper reviews the capabilities of a plasticity-induced crack-closure model to predict fatigue lives of metallic materials using "small-crack theory" under various loading conditions. Constraint factors, to account for three-dimensional effects, were selected to correlate large-crack growth rate data as a function of the effective stress-intensity factor range (delta-Keff) under constant-amplitude loading. Modifications to the delta-Keff-rate relations in the near-threshold regime were needed to fit measured small-crack growth rate behavior. The model was then used to calculate small-and large-crack growth rates, and to predict total fatigue lives, for notched and un-notched specimens under constant-amplitude and spectrum loading. Fatigue lives were predicted using crack-growth relations and micro-structural features like those that initiated cracks in the fatigue specimens for most of the materials analyzed. Results from the tests and analyses agreed well.
Ordinal feature selection for iris and palmprint recognition.
Sun, Zhenan; Wang, Libin; Tan, Tieniu
2014-09-01
Ordinal measures have been demonstrated as an effective feature representation model for iris and palmprint recognition. However, ordinal measures are a general concept of image analysis and numerous variants with different parameter settings, such as location, scale, orientation, and so on, can be derived to construct a huge feature space. This paper proposes a novel optimization formulation for ordinal feature selection with successful applications to both iris and palmprint recognition. The objective function of the proposed feature selection method has two parts, i.e., misclassification error of intra and interclass matching samples and weighted sparsity of ordinal feature descriptors. Therefore, the feature selection aims to achieve an accurate and sparse representation of ordinal measures. And, the optimization subjects to a number of linear inequality constraints, which require that all intra and interclass matching pairs are well separated with a large margin. Ordinal feature selection is formulated as a linear programming (LP) problem so that a solution can be efficiently obtained even on a large-scale feature pool and training database. Extensive experimental results demonstrate that the proposed LP formulation is advantageous over existing feature selection methods, such as mRMR, ReliefF, Boosting, and Lasso for biometric recognition, reporting state-of-the-art accuracy on CASIA and PolyU databases.
NASA Astrophysics Data System (ADS)
Jafari, Mehrnoosh; Minaei, Saeid; Safaie, Naser; Torkamani-Azar, Farah
2016-05-01
Spatial and temporal changes in surface temperature of infected and non-infected rose plant (Rosa hybrida cv. 'Angelina') leaves were visualized using digital infrared thermography. Infected areas exhibited a presymptomatic decrease in leaf temperature up to 2.3 °C. In this study, two experiments were conducted: one in the greenhouse (semi-controlled ambient conditions) and the other, in a growth chamber (controlled ambient conditions). Effect of drought stress and darkness on the thermal images were also studied in this research. It was found that thermal histograms of the infected leaves closely follow a standard normal distribution. They have a skewness near zero, kurtosis under 3, standard deviation larger than 0.6, and a Maximum Temperature Difference (MTD) more than 4. For each thermal histogram, central tendency, variability, and parameters of the best fitted Standard Normal and Laplace distributions were estimated. To classify healthy and infected leaves, feature selection was conducted and the best extracted thermal features with the largest linguistic hedge values were chosen. Among those features independent of absolute temperature measurement, MTD, SD, skewness, R2l, kurtosis and bn were selected. Then, a neuro-fuzzy classifier was trained to recognize the healthy leaves from the infected ones. The k-means clustering method was utilized to obtain the initial parameters and the fuzzy "if-then" rules. Best estimation rates of 92.55% and 92.3% were achieved in training and testing the classifier with 8 clusters. Results showed that drought stress had an adverse effect on the classification of healthy leaves. More healthy leaves under drought stress condition were classified as infected causing PPV and Specificity index values to decrease, accordingly. Image acquisition in the dark had no significant effect on the classification performance.
Predicting conversion from MCI to AD using resting-state fMRI, graph theoretical approach and SVM.
Hojjati, Seyed Hani; Ebrahimzadeh, Ata; Khazaee, Ali; Babajani-Feremi, Abbas
2017-04-15
We investigated identifying patients with mild cognitive impairment (MCI) who progress to Alzheimer's disease (AD), MCI converter (MCI-C), from those with MCI who do not progress to AD, MCI non-converter (MCI-NC), based on resting-state fMRI (rs-fMRI). Graph theory and machine learning approach were utilized to predict progress of patients with MCI to AD using rs-fMRI. Eighteen MCI converts (average age 73.6 years; 11 male) and 62 age-matched MCI non-converters (average age 73.0 years, 28 male) were included in this study. We trained and tested a support vector machine (SVM) to classify MCI-C from MCI-NC using features constructed based on the local and global graph measures. A novel feature selection algorithm was developed and utilized to select an optimal subset of features. Using subset of optimal features in SVM, we classified MCI-C from MCI-NC with an accuracy, sensitivity, specificity, and the area under the receiver operating characteristic (ROC) curve of 91.4%, 83.24%, 90.1%, and 0.95, respectively. Furthermore, results of our statistical analyses were used to identify the affected brain regions in AD. To the best of our knowledge, this is the first study that combines the graph measures (constructed based on rs-fMRI) with machine learning approach and accurately classify MCI-C from MCI-NC. Results of this study demonstrate potential of the proposed approach for early AD diagnosis and demonstrate capability of rs-fMRI to predict conversion from MCI to AD by identifying affected brain regions underlying this conversion. Copyright © 2017 Elsevier B.V. All rights reserved.
Economic indicators selection for crime rates forecasting using cooperative feature selection
NASA Astrophysics Data System (ADS)
Alwee, Razana; Shamsuddin, Siti Mariyam Hj; Salleh Sallehuddin, Roselina
2013-04-01
Features selection in multivariate forecasting model is very important to ensure that the model is accurate. The purpose of this study is to apply the Cooperative Feature Selection method for features selection. The features are economic indicators that will be used in crime rate forecasting model. The Cooperative Feature Selection combines grey relational analysis and artificial neural network to establish a cooperative model that can rank and select the significant economic indicators. Grey relational analysis is used to select the best data series to represent each economic indicator and is also used to rank the economic indicators according to its importance to the crime rate. After that, the artificial neural network is used to select the significant economic indicators for forecasting the crime rates. In this study, we used economic indicators of unemployment rate, consumer price index, gross domestic product and consumer sentiment index, as well as data rates of property crime and violent crime for the United States. Levenberg-Marquardt neural network is used in this study. From our experiments, we found that consumer price index is an important economic indicator that has a significant influence on the violent crime rate. While for property crime rate, the gross domestic product, unemployment rate and consumer price index are the influential economic indicators. The Cooperative Feature Selection is also found to produce smaller errors as compared to Multiple Linear Regression in forecasting property and violent crime rates.
Feature Selection for Ridge Regression with Provable Guarantees.
Paul, Saurabh; Drineas, Petros
2016-04-01
We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog to ridge regression. The method is unsupervised and gives worst-case guarantees of the generalization power of the classification function after feature selection with respect to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets; a subset of TechTC-300 data sets, to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.
Chen, Yifei; Sun, Yuxing; Han, Bing-Qing
2015-01-01
Protein interaction article classification is a text classification task in the biological domain to determine which articles describe protein-protein interactions. Since the feature space in text classification is high-dimensional, feature selection is widely used for reducing the dimensionality of features to speed up computation without sacrificing classification performance. Many existing feature selection methods are based on the statistical measure of document frequency and term frequency. One potential drawback of these methods is that they treat features separately. Hence, first we design a similarity measure between the context information to take word cooccurrences and phrase chunks around the features into account. Then we introduce the similarity of context information to the importance measure of the features to substitute the document and term frequency. Hence we propose new context similarity-based feature selection methods. Their performance is evaluated on two protein interaction article collections and compared against the frequency-based methods. The experimental results reveal that the context similarity-based methods perform better in terms of the F1 measure and the dimension reduction rate. Benefiting from the context information surrounding the features, the proposed methods can select distinctive features effectively for protein interaction article classification.
Feature Selection Using Information Gain for Improved Structural-Based Alert Correlation
Siraj, Maheyzah Md; Zainal, Anazida; Elshoush, Huwaida Tagelsir; Elhaj, Fatin
2016-01-01
Grouping and clustering alerts for intrusion detection based on the similarity of features is referred to as structurally base alert correlation and can discover a list of attack steps. Previous researchers selected different features and data sources manually based on their knowledge and experience, which lead to the less accurate identification of attack steps and inconsistent performance of clustering accuracy. Furthermore, the existing alert correlation systems deal with a huge amount of data that contains null values, incomplete information, and irrelevant features causing the analysis of the alerts to be tedious, time-consuming and error-prone. Therefore, this paper focuses on selecting accurate and significant features of alerts that are appropriate to represent the attack steps, thus, enhancing the structural-based alert correlation model. A two-tier feature selection method is proposed to obtain the significant features. The first tier aims at ranking the subset of features based on high information gain entropy in decreasing order. The second tier extends additional features with a better discriminative ability than the initially ranked features. Performance analysis results show the significance of the selected features in terms of the clustering accuracy using 2000 DARPA intrusion detection scenario-specific dataset. PMID:27893821
A New Direction of Cancer Classification: Positive Effect of Low-Ranking MicroRNAs.
Li, Feifei; Piao, Minghao; Piao, Yongjun; Li, Meijing; Ryu, Keun Ho
2014-10-01
Many studies based on microRNA (miRNA) expression profiles showed a new aspect of cancer classification. Because one characteristic of miRNA expression data is the high dimensionality, feature selection methods have been used to facilitate dimensionality reduction. The feature selection methods have one shortcoming thus far: they just consider the problem of where feature to class is 1:1 or n:1. However, because one miRNA may influence more than one type of cancer, human miRNA is considered to be ranked low in traditional feature selection methods and are removed most of the time. In view of the limitation of the miRNA number, low-ranking miRNAs are also important to cancer classification. We considered both high- and low-ranking features to cover all problems (1:1, n:1, 1:n, and m:n) in cancer classification. First, we used the correlation-based feature selection method to select the high-ranking miRNAs, and chose the support vector machine, Bayes network, decision tree, k-nearest-neighbor, and logistic classifier to construct cancer classification. Then, we chose Chi-square test, information gain, gain ratio, and Pearson's correlation feature selection methods to build the m:n feature subset, and used the selected miRNAs to determine cancer classification. The low-ranking miRNA expression profiles achieved higher classification accuracy compared with just using high-ranking miRNAs in traditional feature selection methods. Our results demonstrate that the m:n feature subset made a positive impression of low-ranking miRNAs in cancer classification.
Automated frame selection process for high-resolution microendoscopy
NASA Astrophysics Data System (ADS)
Ishijima, Ayumu; Schwarz, Richard A.; Shin, Dongsuk; Mondrik, Sharon; Vigneswaran, Nadarajah; Gillenwater, Ann M.; Anandasabapathy, Sharmila; Richards-Kortum, Rebecca
2015-04-01
We developed an automated frame selection algorithm for high-resolution microendoscopy video sequences. The algorithm rapidly selects a representative frame with minimal motion artifact from a short video sequence, enabling fully automated image analysis at the point-of-care. The algorithm was evaluated by quantitative comparison of diagnostically relevant image features and diagnostic classification results obtained using automated frame selection versus manual frame selection. A data set consisting of video sequences collected in vivo from 100 oral sites and 167 esophageal sites was used in the analysis. The area under the receiver operating characteristic curve was 0.78 (automated selection) versus 0.82 (manual selection) for oral sites, and 0.93 (automated selection) versus 0.92 (manual selection) for esophageal sites. The implementation of fully automated high-resolution microendoscopy at the point-of-care has the potential to reduce the number of biopsies needed for accurate diagnosis of precancer and cancer in low-resource settings where there may be limited infrastructure and personnel for standard histologic analysis.
Artificial bee colony algorithm for single-trial electroencephalogram analysis.
Hsu, Wei-Yen; Hu, Ya-Ping
2015-04-01
In this study, we propose an analysis system combined with feature selection to further improve the classification accuracy of single-trial electroencephalogram (EEG) data. Acquiring event-related brain potential data from the sensorimotor cortices, the system comprises artifact and background noise removal, feature extraction, feature selection, and feature classification. First, the artifacts and background noise are removed automatically by means of independent component analysis and surface Laplacian filter, respectively. Several potential features, such as band power, autoregressive model, and coherence and phase-locking value, are then extracted for subsequent classification. Next, artificial bee colony (ABC) algorithm is used to select features from the aforementioned feature combination. Finally, selected subfeatures are classified by support vector machine. Comparing with and without artifact removal and feature selection, using a genetic algorithm on single-trial EEG data for 6 subjects, the results indicate that the proposed system is promising and suitable for brain-computer interface applications. © EEG and Clinical Neuroscience Society (ECNS) 2014.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, X; Zhou, Z; Thomas, K
Purpose: The goal of this work is to investigate the use of contrast enhanced computed tomographic (CT) features for the prediction of mutations of BAP1, PBRM1, and VHL genes in renal cell carcinoma (RCC). Methods: For this study, we used two patient databases with renal cell carcinoma (RCC). The first one consisted of 33 patients from our institution (UT Southwestern Medical Center, UTSW). The second one consisted of 24 patients from the Cancer Imaging Archive (TCIA), where each patient is connected by a unique identi?er to the tissue samples from the Cancer Genome Atlas (TCGA). From the contrast enhanced CTmore » image of each patient, tumor contour was first delineated by a physician. Geometry, intensity, and texture features were extracted from the delineated tumor. Based on UTSW dataset, we completed feature selection and trained a support vector machine (SVM) classifier to predict mutations of BAP1, PBRM1 and VHL genes. We then used TCIA-TCGA dataset to validate the predictive model build upon UTSW dataset. Results: The prediction accuracy of gene expression of TCIA-TCGA patients was 0.83 (20 of 24), 0.83 (20 of 24), and 0.75 (18 of 24) for BAP1, PBRM1, and VHL respectively. For BAP1 gene, texture feature was the most prominent feature type. For PBRM1 gene, intensity feature was the most prominent. For VHL gene, geometry, intensity, and texture features were all important. Conclusion: Using our feature selection strategy and models, we achieved predictive accuracy over 0.75 for all three genes under the condition of using patient data from one institution for training and data from other institutions for testing. These results suggest that radiogenomics can be used to aid in prognosis and used as convenient surrogates for expensive and time consuming gene assay procedures.« less
Liu, Tongtong; Ge, Xifeng; Yu, Jinhua; Guo, Yi; Wang, Yuanyuan; Wang, Wenping; Cui, Ligang
2018-06-21
B-mode ultrasound (B-US) and strain elastography ultrasound (SE-US) images have a potential to distinguish thyroid tumor with different lymph node (LN) status. The purpose of our study is to investigate whether the application of multi-modality images including B-US and SE-US can improve the discriminability of thyroid tumor with LN metastasis based on a radiomics approach. Ultrasound (US) images including B-US and SE-US images of 75 papillary thyroid carcinoma (PTC) cases were retrospectively collected. A radiomics approach was developed in this study to estimate LNs status of PTC patients. The approach included image segmentation, quantitative feature extraction, feature selection and classification. Three feature sets were extracted from B-US, SE-US, and multi-modality containing B-US and SE-US. They were used to evaluate the contribution of different modalities. A total of 684 radiomics features have been extracted in our study. We used sparse representation coefficient-based feature selection method with 10-bootstrap to reduce the dimension of feature sets. Support vector machine with leave-one-out cross-validation was used to build the model for estimating LN status. Using features extracted from both B-US and SE-US, the radiomics-based model produced an area under the receiver operating characteristic curve (AUC) [Formula: see text] 0.90, accuracy (ACC) [Formula: see text] 0.85, sensitivity (SENS) [Formula: see text] 0.77 and specificity (SPEC) [Formula: see text] 0.88, which was better than using features extracted from B-US or SE-US separately. Multi-modality images provided more information in radiomics study. Combining use of B-US and SE-US could improve the LN metastasis estimation accuracy for PTC patients.
Spectral feature design in high dimensional multispectral data
NASA Technical Reports Server (NTRS)
Chen, Chih-Chien Thomas; Landgrebe, David A.
1988-01-01
The High resolution Imaging Spectrometer (HIRIS) is designed to acquire images simultaneously in 192 spectral bands in the 0.4 to 2.5 micrometers wavelength region. It will make possible the collection of essentially continuous reflectance spectra at a spectral resolution sufficient to extract significantly enhanced amounts of information from return signals as compared to existing systems. The advantages of such high dimensional data come at a cost of increased system and data complexity. For example, since the finer the spectral resolution, the higher the data rate, it becomes impractical to design the sensor to be operated continuously. It is essential to find new ways to preprocess the data which reduce the data rate while at the same time maintaining the information content of the high dimensional signal produced. Four spectral feature design techniques are developed from the Weighted Karhunen-Loeve Transforms: (1) non-overlapping band feature selection algorithm; (2) overlapping band feature selection algorithm; (3) Walsh function approach; and (4) infinite clipped optimal function approach. The infinite clipped optimal function approach is chosen since the features are easiest to find and their classification performance is the best. After the preprocessed data has been received at the ground station, canonical analysis is further used to find the best set of features under the criterion that maximal class separability is achieved. Both 100 dimensional vegetation data and 200 dimensional soil data were used to test the spectral feature design system. It was shown that the infinite clipped versions of the first 16 optimal features had excellent classification performance. The overall probability of correct classification is over 90 percent while providing for a reduced downlink data rate by a factor of 10.
Handwriting: Feature Correlation Analysis for Biometric Hashes
NASA Astrophysics Data System (ADS)
Vielhauer, Claus; Steinmetz, Ralf
2004-12-01
In the application domain of electronic commerce, biometric authentication can provide one possible solution for the key management problem. Besides server-based approaches, methods of deriving digital keys directly from biometric measures appear to be advantageous. In this paper, we analyze one of our recently published specific algorithms of this category based on behavioral biometrics of handwriting, the biometric hash. Our interest is to investigate to which degree each of the underlying feature parameters contributes to the overall intrapersonal stability and interpersonal value space. We will briefly discuss related work in feature evaluation and introduce a new methodology based on three components: the intrapersonal scatter (deviation), the interpersonal entropy, and the correlation between both measures. Evaluation of the technique is presented based on two data sets of different size. The method presented will allow determination of effects of parameterization of the biometric system, estimation of value space boundaries, and comparison with other feature selection approaches.
iPcc: a novel feature extraction method for accurate disease class discovery and prediction
Ren, Xianwen; Wang, Yong; Zhang, Xiang-Sun; Jin, Qi
2013-01-01
Gene expression profiling has gradually become a routine procedure for disease diagnosis and classification. In the past decade, many computational methods have been proposed, resulting in great improvements on various levels, including feature selection and algorithms for classification and clustering. In this study, we present iPcc, a novel method from the feature extraction perspective to further propel gene expression profiling technologies from bench to bedside. We define ‘correlation feature space’ for samples based on the gene expression profiles by iterative employment of Pearson’s correlation coefficient. Numerical experiments on both simulated and real gene expression data sets demonstrate that iPcc can greatly highlight the latent patterns underlying noisy gene expression data and thus greatly improve the robustness and accuracy of the algorithms currently available for disease diagnosis and classification based on gene expression profiles. PMID:23761440
Automatic staging of bladder cancer on CT urography
NASA Astrophysics Data System (ADS)
Garapati, Sankeerth S.; Hadjiiski, Lubomir M.; Cha, Kenny H.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.; Weizer, Alon; Alva, Ajjai; Paramagul, Chintana; Wei, Jun; Zhou, Chuan
2016-03-01
Correct staging of bladder cancer is crucial for the decision of neoadjuvant chemotherapy treatment and minimizing the risk of under- or over-treatment. Subjectivity and variability of clinicians in utilizing available diagnostic information may lead to inaccuracy in staging bladder cancer. An objective decision support system that merges the information in a predictive model based on statistical outcomes of previous cases and machine learning may assist clinicians in making more accurate and consistent staging assessments. In this study, we developed a preliminary method to stage bladder cancer. With IRB approval, 42 bladder cancer cases with CTU scans were collected from patient files. The cases were classified into two classes based on pathological stage T2, which is the decision threshold for neoadjuvant chemotherapy treatment (i.e. for stage >=T2) clinically. There were 21 cancers below stage T2 and 21 cancers at stage T2 or above. All 42 lesions were automatically segmented using our auto-initialized cascaded level sets (AI-CALS) method. Morphological features were extracted, which were selected and merged by linear discriminant analysis (LDA) classifier. A leave-one-case-out resampling scheme was used to train and test the classifier using the 42 lesions. The classification accuracy was quantified using the area under the ROC curve (Az). The average training Az was 0.97 and the test Az was 0.85. The classifier consistently selected the lesion volume, a gray level feature and a contrast feature. This predictive model shows promise for assisting in assessing the bladder cancer stage.
Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.
Hsu, Wei-Yen
2013-12-01
In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applying event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove the electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, selected sub-features are classified by support vector machine (SVM). Compared with without artifact elimination, feature selection using a genetic algorithm (GA) and feature classification with Fisher's linear discriminant (FLD) on MI data from two data sets for eight subjects, the results indicate that the proposed method is promising in brain-computer interface (BCI) applications.
Nankali, Saber; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-01
In external‐beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation‐based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two “Genetic” and “Ranker” searching procedures. The performance of these algorithms has been evaluated using four‐dimensional extended cardiac‐torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro‐fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F‐test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation‐based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers. PACS numbers: 87.55.km, 87.56.Fc PMID:26894358
Nankali, Saber; Torshabi, Ahmad Esmaili; Miandoab, Payam Samadi; Baghizadeh, Amin
2016-01-08
In external-beam radiotherapy, using external markers is one of the most reliable tools to predict tumor position, in clinical applications. The main challenge in this approach is tumor motion tracking with highest accuracy that depends heavily on external markers location, and this issue is the objective of this study. Four commercially available feature selection algorithms entitled 1) Correlation-based Feature Selection, 2) Classifier, 3) Principal Components, and 4) Relief were proposed to find optimum location of external markers in combination with two "Genetic" and "Ranker" searching procedures. The performance of these algorithms has been evaluated using four-dimensional extended cardiac-torso anthropomorphic phantom. Six tumors in lung, three tumors in liver, and 49 points on the thorax surface were taken into account to simulate internal and external motions, respectively. The root mean square error of an adaptive neuro-fuzzy inference system (ANFIS) as prediction model was considered as metric for quantitatively evaluating the performance of proposed feature selection algorithms. To do this, the thorax surface region was divided into nine smaller segments and predefined tumors motion was predicted by ANFIS using external motion data of given markers at each small segment, separately. Our comparative results showed that all feature selection algorithms can reasonably select specific external markers from those segments where the root mean square error of the ANFIS model is minimum. Moreover, the performance accuracy of proposed feature selection algorithms was compared, separately. For this, each tumor motion was predicted using motion data of those external markers selected by each feature selection algorithm. Duncan statistical test, followed by F-test, on final results reflected that all proposed feature selection algorithms have the same performance accuracy for lung tumors. But for liver tumors, a correlation-based feature selection algorithm, in combination with a genetic search algorithm, proved to yield best performance accuracy for selecting optimum markers.
Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya; Gomez-Beldarrain, Marian; Fernandez-Ruanova, Begonya; Garcia-Monco, Juan Carlos
2017-04-13
Feature selection methods are commonly used to identify subsets of relevant features to facilitate the construction of models for classification, yet little is known about how feature selection methods perform in diffusion tensor images (DTIs). In this study, feature selection and machine learning classification methods were tested for the purpose of automating diagnosis of migraines using both DTIs and questionnaire answers related to emotion and cognition - factors that influence of pain perceptions. We select 52 adult subjects for the study divided into three groups: control group (15), subjects with sporadic migraine (19) and subjects with chronic migraine and medication overuse (18). These subjects underwent magnetic resonance with diffusion tensor to see white matter pathway integrity of the regions of interest involved in pain and emotion. The tests also gather data about pathology. The DTI images and test results were then introduced into feature selection algorithms (Gradient Tree Boosting, L1-based, Random Forest and Univariate) to reduce features of the first dataset and classification algorithms (SVM (Support Vector Machine), Boosting (Adaboost) and Naive Bayes) to perform a classification of migraine group. Moreover we implement a committee method to improve the classification accuracy based on feature selection algorithms. When classifying the migraine group, the greatest improvements in accuracy were made using the proposed committee-based feature selection method. Using this approach, the accuracy of classification into three types improved from 67 to 93% when using the Naive Bayes classifier, from 90 to 95% with the support vector machine classifier, 93 to 94% in boosting. The features that were determined to be most useful for classification included are related with the pain, analgesics and left uncinate brain (connected with the pain and emotions). The proposed feature selection committee method improved the performance of migraine diagnosis classifiers compared to individual feature selection methods, producing a robust system that achieved over 90% accuracy in all classifiers. The results suggest that the proposed methods can be used to support specialists in the classification of migraines in patients undergoing magnetic resonance imaging.
NASA Astrophysics Data System (ADS)
Nicolis, John S.; Katsikas, Anastassis A.
Collective parameters such as the Zipf's law-like statistics, the Transinformation, the Block Entropy and the Markovian character are compared for natural, genetic, musical and artificially generated long texts from generating partitions (alphabets) on homogeneous as well as on multifractal chaotic maps. It appears that minimal requirements for a language at the syntactical level such as memory, selectivity of few keywords and broken symmetry in one dimension (polarity) are more or less met by dynamically iterating simple maps or flows e.g. very simple chaotic hardware. The same selectivity is observed at the semantic level where the aim refers to partitioning a set of enviromental impinging stimuli onto coexisting attractors-categories. Under the regime of pattern recognition and classification, few key features of a pattern or few categories claim the lion's share of the information stored in this pattern and practically, only these key features are persistently scanned by the cognitive processor. A multifractal attractor model can in principle explain this high selectivity, both at the syntactical and the semantic levels.
Joint Feature Selection and Classification for Multilabel Learning.
Huang, Jun; Li, Guorong; Huang, Qingming; Wu, Xindong
2018-03-01
Multilabel learning deals with examples having multiple class labels simultaneously. It has been applied to a variety of applications, such as text categorization and image annotation. A large number of algorithms have been proposed for multilabel learning, most of which concentrate on multilabel classification problems and only a few of them are feature selection algorithms. Current multilabel classification models are mainly built on a single data representation composed of all the features which are shared by all the class labels. Since each class label might be decided by some specific features of its own, and the problems of classification and feature selection are often addressed independently, in this paper, we propose a novel method which can perform joint feature selection and classification for multilabel learning, named JFSC. Different from many existing methods, JFSC learns both shared features and label-specific features by considering pairwise label correlations, and builds the multilabel classifier on the learned low-dimensional data representations simultaneously. A comparative study with state-of-the-art approaches manifests a competitive performance of our proposed method both in classification and feature selection for multilabel learning.
A hybrid feature selection method using multiclass SVM for diagnosis of erythemato-squamous disease
NASA Astrophysics Data System (ADS)
Maryam, Setiawan, Noor Akhmad; Wahyunggoro, Oyas
2017-08-01
The diagnosis of erythemato-squamous disease is a complex problem and difficult to detect in dermatology. Besides that, it is a major cause of skin cancer. Data mining implementation in the medical field helps expert to diagnose precisely, accurately, and inexpensively. In this research, we use data mining technique to developed a diagnosis model based on multiclass SVM with a novel hybrid feature selection method to diagnose erythemato-squamous disease. Our hybrid feature selection method, named ChiGA (Chi Square and Genetic Algorithm), uses the advantages from filter and wrapper methods to select the optimal feature subset from original feature. Chi square used as filter method to remove redundant features and GA as wrapper method to select the ideal feature subset with SVM used as classifier. Experiment performed with 10 fold cross validation on erythemato-squamous diseases dataset taken from University of California Irvine (UCI) machine learning database. The experimental result shows that the proposed model based multiclass SVM with Chi Square and GA can give an optimum feature subset. There are 18 optimum features with 99.18% accuracy.
Selective attention to temporal features on nested time scales.
Henry, Molly J; Herrmann, Björn; Obleser, Jonas
2015-02-01
Meaningful auditory stimuli such as speech and music often vary simultaneously along multiple time scales. Thus, listeners must selectively attend to, and selectively ignore, separate but intertwined temporal features. The current study aimed to identify and characterize the neural network specifically involved in this feature-selective attention to time. We used a novel paradigm where listeners judged either the duration or modulation rate of auditory stimuli, and in which the stimulation, working memory demands, response requirements, and task difficulty were held constant. A first analysis identified all brain regions where individual brain activation patterns were correlated with individual behavioral performance patterns, which thus supported temporal judgments generically. A second analysis then isolated those brain regions that specifically regulated selective attention to temporal features: Neural responses in a bilateral fronto-parietal network including insular cortex and basal ganglia decreased with degree of change of the attended temporal feature. Critically, response patterns in these regions were inverted when the task required selectively ignoring this feature. The results demonstrate how the neural analysis of complex acoustic stimuli with multiple temporal features depends on a fronto-parietal network that simultaneously regulates the selective gain for attended and ignored temporal features. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Asadabadi, Ebrahim Barzegari; Abdolmaleki, Parviz; Barkooie, Seyyed Mohsen Hosseini; Jahandideh, Samad; Rezaei, Mohammad Ali
2009-12-01
Regarding the great potential of dual binding site inhibitors of acetylcholinesterase as the future potent drugs of Alzheimer's disease, this study was devoted to extraction of the most effective structural features of these inhibitors from among a large number of quantitative descriptors. To do this, we adopted a unique approach in quantitative structure-activity relationships. An efficient feature selection method was emphasized in such an approach, using the confirmative results of different routine and novel feature selection methods. The proposed methods generated quite consistent results ensuring the effectiveness of the selected structural features.
Molina, D.; Pérez-Beteta, J.; Martínez-González, A.; Velásquez, C.; Martino, J.; Luque, B.; Revert, A.; Herruzo, I.; Arana, E.; Pérez-García, V. M.
2017-01-01
Abstract Introduction: Textural analysis refers to a variety of mathematical methods used to quantify the spatial variations in grey levels within images. In brain tumors, textural features have a great potential as imaging biomarkers having been shown to correlate with survival, tumor grade, tumor type, etc. However, these measures should be reproducible under dynamic range and matrix size changes for their clinical use. Our aim is to study this robustness in brain tumors with 3D magnetic resonance imaging, not previously reported in the literature. Materials and methods: 3D T1-weighted images of 20 patients with glioblastoma (64.80 ± 9.12 years-old) obtained from a 3T scanner were analyzed. Tumors were segmented using an in-house semi-automatic 3D procedure. A set of 16 3D textural features of the most common types (co-occurrence and run-length matrices) were selected, providing regional (run-length based measures) and local information (co-ocurrence matrices) on the tumor heterogeneity. Feature robustness was assessed by means of the coefficient of variation (CV) under both dynamic range (16, 32 and 64 gray levels) and/or matrix size (256x256 and 432x432) changes. Results: None of the textural features considered were robust under dynamic range changes. The textural co-occurrence matrix feature Entropy was the only textural feature robust (CV < 10%) under spatial resolution changes. Conclusions: In general, textural measures of three-dimensional brain tumor images are neither robust under dynamic range nor under matrix size changes. Thus, it becomes mandatory to fix standards for image rescaling after acquisition before the textural features are computed if they are to be used as imaging biomarkers. For T1-weighted images a dynamic range of 16 grey levels and a matrix size of 256x256 (and isotropic voxel) is found to provide reliable and comparable results and is feasible with current MRI scanners. The implications of this work go beyond the specific tumor type and MRI sequence studied here and pose the need for standardization in textural feature calculation of oncological images. FUNDING: James S. Mc. Donnell Foundation (USA) 21st Century Science Initiative in Mathematical and Complex Systems Approaches for Brain Cancer [Collaborative award 220020450 and planning grant 220020420], MINECO/FEDER [MTM2015-71200-R], JCCM [PEII-2014-031-P].
Top-down modulation: Bridging selective attention and working memory
Gazzaley, Adam; Nobre, Anna C.
2012-01-01
Selective attention, the ability to focus our cognitive resources on information relevant to our goals, influences working memory (WM) performance. Indeed, attention and working memory are increasingly viewed as overlapping constructs. Here, we review recent evidence from human neurophysiological studies demonstrating that top-down modulation serves as a common neural mechanism underlying these two cognitive operations. The core features include activity modulation in stimulus-selective sensory cortices with concurrent engagement of prefrontal and parietal control regions that function as sources of top-down signals. Notably, top-down modulation is engaged during both stimulus-present and stimulus-absent stages of WM tasks, i.e., expectation of an ensuing stimulus to be remembered, selection and encoding of stimuli, maintenance of relevant information in mind and memory retrieval. PMID:22209601
Diffusion Tensor Image Registration Using Hybrid Connectivity and Tensor Features
Wang, Qian; Yap, Pew-Thian; Wu, Guorong; Shen, Dinggang
2014-01-01
Most existing diffusion tensor imaging (DTI) registration methods estimate structural correspondences based on voxelwise matching of tensors. The rich connectivity information that is given by DTI, however, is often neglected. In this article, we propose to integrate complementary information given by connectivity features and tensor features for improved registration accuracy. To utilize connectivity information, we place multiple anchors representing different brain anatomies in the image space, and define the connectivity features for each voxel as the geodesic distances from all anchors to the voxel under consideration. The geodesic distance, which is computed in relation to the tensor field, encapsulates information of brain connectivity. We also extract tensor features for every voxel to reflect the local statistics of tensors in its neighborhood. We then combine both connectivity features and tensor features for registration of tensor images. From the images, landmarks are selected automatically and their correspondences are determined based on their connectivity and tensor feature vectors. The deformation field that deforms one tensor image to the other is iteratively estimated and optimized according to the landmarks and their associated correspondences. Experimental results show that, by using connectivity features and tensor features simultaneously, registration accuracy is increased substantially compared with the cases using either type of features alone. PMID:24293159
NASA Astrophysics Data System (ADS)
Sun, Wenqing; Tseng, Tzu-Liang B.; Zheng, Bin; Zhang, Jianying; Qian, Wei
2015-03-01
A novel breast cancer risk analysis approach is proposed for enhancing performance of computerized breast cancer risk analysis using bilateral mammograms. Based on the intensity of breast area, five different sub-regions were acquired from one mammogram, and bilateral features were extracted from every sub-region. Our dataset includes 180 bilateral mammograms from 180 women who underwent routine screening examinations, all interpreted as negative and not recalled by the radiologists during the original screening procedures. A computerized breast cancer risk analysis scheme using four image processing modules, including sub-region segmentation, bilateral feature extraction, feature selection, and classification was designed to detect and compute image feature asymmetry between the left and right breasts imaged on the mammograms. The highest computed area under the curve (AUC) is 0.763 ± 0.021 when applying the multiple sub-region features to our testing dataset. The positive predictive value and the negative predictive value were 0.60 and 0.73, respectively. The study demonstrates that (1) features extracted from multiple sub-regions can improve the performance of our scheme compared to using features from whole breast area only; (2) a classifier using asymmetry bilateral features can effectively predict breast cancer risk; (3) incorporating texture and morphological features with density features can boost the classification accuracy.
Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah
2018-02-01
Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
NASA Astrophysics Data System (ADS)
Valizade Hasanloei, Mohammad Amin; Sheikhpour, Razieh; Sarram, Mehdi Agha; Sheikhpour, Elnaz; Sharifi, Hamdollah
2018-02-01
Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR based drug design to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is Fisher score which aim is to minimize the within-class distance and maximize the between-class distance. In this study, the properties of Fisher criterion were extended for QSAR models to define the new distance metrics based on the continuous activity values of compounds with known activities. Then, a semi-supervised feature selection method was proposed based on the combination of Fisher and Laplacian criteria which exploits both compounds with known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed semi-supervised feature selection method in selecting the relevant descriptors, we applied the method and other feature selection methods on three QSAR data sets such as serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors and phenol compounds. The results demonstrated that the QSAR models built on the selected descriptors by the proposed semi-supervised method have better performance than other models. This indicates the efficiency of the proposed method in selecting the relevant descriptors using the compounds with known and unknown activities. The results of this study showed that the compounds with known and unknown activities can be helpful to improve the performance of the combined Fisher and Laplacian based feature selection methods.
Multi-level gene/MiRNA feature selection using deep belief nets and active learning.
Ibrahim, Rania; Yousri, Noha A; Ismail, Mohamed A; El-Makky, Nagwa M
2014-01-01
Selecting the most discriminative genes/miRNAs has been raised as an important task in bioinformatics to enhance disease classifiers and to mitigate the dimensionality curse problem. Original feature selection methods choose genes/miRNAs based on their individual features regardless of how they perform together. Considering group features instead of individual ones provides a better view for selecting the most informative genes/miRNAs. Recently, deep learning has proven its ability in representing the data in multiple levels of abstraction, allowing for better discrimination between different classes. However, the idea of using deep learning for feature selection is not widely used in the bioinformatics field yet. In this paper, a novel multi-level feature selection approach named MLFS is proposed for selecting genes/miRNAs based on expression profiles. The approach is based on both deep and active learning. Moreover, an extension to use the technique for miRNAs is presented by considering the biological relation between miRNAs and genes. Experimental results show that the approach was able to outperform classical feature selection methods in hepatocellular carcinoma (HCC) by 9%, lung cancer by 6% and breast cancer by around 10% in F1-measure. Results also show the enhancement in F1-measure of our approach over recently related work in [1] and [2].
NASA Astrophysics Data System (ADS)
An, Le; Adeli, Ehsan; Liu, Mingxia; Zhang, Jun; Lee, Seong-Whan; Shen, Dinggang
2017-03-01
Classification is one of the most important tasks in machine learning. Due to feature redundancy or outliers in samples, using all available data for training a classifier may be suboptimal. For example, the Alzheimer’s disease (AD) is correlated with certain brain regions or single nucleotide polymorphisms (SNPs), and identification of relevant features is critical for computer-aided diagnosis. Many existing methods first select features from structural magnetic resonance imaging (MRI) or SNPs and then use those features to build the classifier. However, with the presence of many redundant features, the most discriminative features are difficult to be identified in a single step. Thus, we formulate a hierarchical feature and sample selection framework to gradually select informative features and discard ambiguous samples in multiple steps for improved classifier learning. To positively guide the data manifold preservation process, we utilize both labeled and unlabeled data during training, making our method semi-supervised. For validation, we conduct experiments on AD diagnosis by selecting mutually informative features from both MRI and SNP, and using the most discriminative samples for training. The superior classification results demonstrate the effectiveness of our approach, as compared with the rivals.
Tsai, Richard Tzong-Han; Sung, Cheng-Lung; Dai, Hong-Jie; Hung, Hsieh-Chuan; Sung, Ting-Yi; Hsu, Wen-Lian
2006-12-18
Biomedical named entity recognition (Bio-NER) is a challenging problem because, in general, biomedical named entities of the same category (e.g., proteins and genes) do not follow one standard nomenclature. They have many irregularities and sometimes appear in ambiguous contexts. In recent years, machine-learning (ML) approaches have become increasingly common and now represent the cutting edge of Bio-NER technology. This paper addresses three problems faced by ML-based Bio-NER systems. First, most ML approaches usually employ singleton features that comprise one linguistic property (e.g., the current word is capitalized) and at least one class tag (e.g., B-protein, the beginning of a protein name). However, such features may be insufficient in cases where multiple properties must be considered. Adding conjunction features that contain multiple properties can be beneficial, but it would be infeasible to include all conjunction features in an NER model since memory resources are limited and some features are ineffective. To resolve the problem, we use a sequential forward search algorithm to select an effective set of features. Second, variations in the numerical parts of biomedical terms (e.g., "2" in the biomedical term IL2) cause data sparseness and generate many redundant features. In this case, we apply numerical normalization, which solves the problem by replacing all numerals in a term with one representative numeral to help classify named entities. Third, the assignment of NE tags does not depend solely on the target word's closest neighbors, but may depend on words outside the context window (e.g., a context window of five consists of the current word plus two preceding and two subsequent words). We use global patterns generated by the Smith-Waterman local alignment algorithm to identify such structures and modify the results of our ML-based tagger. This is called pattern-based post-processing. To develop our ML-based Bio-NER system, we employ conditional random fields, which have performed effectively in several well-known tasks, as our underlying ML model. Adding selected conjunction features, applying numerical normalization, and employing pattern-based post-processing improve the F-scores by 1.67%, 1.04%, and 0.57%, respectively. The combined increase of 3.28% yields a total score of 72.98%, which is better than the baseline system that only uses singleton features. We demonstrate the benefits of using the sequential forward search algorithm to select effective conjunction feature groups. In addition, we show that numerical normalization can effectively reduce the number of redundant and unseen features. Furthermore, the Smith-Waterman local alignment algorithm can help ML-based Bio-NER deal with difficult cases that need longer context windows.
An improved wrapper-based feature selection method for machinery fault diagnosis
2017-01-01
A major issue of machinery fault diagnosis using vibration signals is that it is over-reliant on personnel knowledge and experience in interpreting the signal. Thus, machine learning has been adapted for machinery fault diagnosis. The quantity and quality of the input features, however, influence the fault classification performance. Feature selection plays a vital role in selecting the most representative feature subset for the machine learning algorithm. In contrast, the trade-off relationship between capability when selecting the best feature subset and computational effort is inevitable in the wrapper-based feature selection (WFS) method. This paper proposes an improved WFS technique before integration with a support vector machine (SVM) model classifier as a complete fault diagnosis system for a rolling element bearing case study. The bearing vibration dataset made available by the Case Western Reserve University Bearing Data Centre was executed using the proposed WFS and its performance has been analysed and discussed. The results reveal that the proposed WFS secures the best feature subset with a lower computational effort by eliminating the redundancy of re-evaluation. The proposed WFS has therefore been found to be capable and efficient to carry out feature selection tasks. PMID:29261689
Asymmetric bagging and feature selection for activities prediction of drug molecules.
Li, Guo-Zheng; Meng, Hao-Hua; Lu, Wen-Cong; Yang, Jack Y; Yang, Mary Qu
2008-05-28
Activities of drug molecules can be predicted by QSAR (quantitative structure activity relationship) models, which overcomes the disadvantages of high cost and long cycle by employing the traditional experimental method. With the fact that the number of drug molecules with positive activity is rather fewer than that of negatives, it is important to predict molecular activities considering such an unbalanced situation. Here, asymmetric bagging and feature selection are introduced into the problem and asymmetric bagging of support vector machines (asBagging) is proposed on predicting drug activities to treat the unbalanced problem. At the same time, the features extracted from the structures of drug molecules affect prediction accuracy of QSAR models. Therefore, a novel algorithm named PRIFEAB is proposed, which applies an embedded feature selection method to remove redundant and irrelevant features for asBagging. Numerical experimental results on a data set of molecular activities show that asBagging improve the AUC and sensitivity values of molecular activities and PRIFEAB with feature selection further helps to improve the prediction ability. Asymmetric bagging can help to improve prediction accuracy of activities of drug molecules, which can be furthermore improved by performing feature selection to select relevant features from the drug molecules data sets.
System Complexity Reduction via Feature Selection
ERIC Educational Resources Information Center
Deng, Houtao
2011-01-01
This dissertation transforms a set of system complexity reduction problems to feature selection problems. Three systems are considered: classification based on association rules, network structure learning, and time series classification. Furthermore, two variable importance measures are proposed to reduce the feature selection bias in tree…
Ortega, Julio; Asensio-Cubero, Javier; Gan, John Q; Ortiz, Andrés
2016-07-15
Brain-computer interfacing (BCI) applications based on the classification of electroencephalographic (EEG) signals require solving high-dimensional pattern classification problems with such a relatively small number of training patterns that curse of dimensionality problems usually arise. Multiresolution analysis (MRA) has useful properties for signal analysis in both temporal and spectral analysis, and has been broadly used in the BCI field. However, MRA usually increases the dimensionality of the input data. Therefore, some approaches to feature selection or feature dimensionality reduction should be considered for improving the performance of the MRA based BCI. This paper investigates feature selection in the MRA-based frameworks for BCI. Several wrapper approaches to evolutionary multiobjective feature selection are proposed with different structures of classifiers. They are evaluated by comparing with baseline methods using sparse representation of features or without feature selection. The statistical analysis, by applying the Kolmogorov-Smirnoff and Kruskal-Wallis tests to the means of the Kappa values evaluated by using the test patterns in each approach, has demonstrated some advantages of the proposed approaches. In comparison with the baseline MRA approach used in previous studies, the proposed evolutionary multiobjective feature selection approaches provide similar or even better classification performances, with significant reduction in the number of features that need to be computed.
Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio
2014-01-01
Study of emotions in human–computer interaction is a growing research area. This paper shows an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish Languages using different methods for feature selection. RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to seek for the most relevant feature subset. The three phases approach was selected to check the validity of the proposed approach. Achieved results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm in automatic emotion recognition, with all different feature sets, obtaining a mean of 80,05% emotion recognition rate in Basque and a 74,82% in Spanish. In order to check the goodness of the proposed process, a greedy searching approach (FSS-Forward) has been applied and a comparison between them is provided. Based on achieved results, a set of most relevant non-speaker dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686
NASA Technical Reports Server (NTRS)
Oken, S.; Skoumal, D. E.; Straayer, J. W.
1974-01-01
The development of metal structures reinforced with filamentary composites as a weight saving feature of the space shuttle components is discussed. A frame was selected for study that was representative of the type of construction used in the bulk frames of the orbiter vehicle. Theoretical and experimental investigations were conducted. Component tests were performed to evaluate the critical details used in the designs and to provide credibility to the weight saving results. A model frame was constructed of the reinforced metal material to provide a final evaluation of the construction under realistic load conditions.
Song, Yun S; Steinrücken, Matthias
2012-03-01
The transition density function of the Wright-Fisher diffusion describes the evolution of population-wide allele frequencies over time. This function has important practical applications in population genetics, but finding an explicit formula under a general diploid selection model has remained a difficult open problem. In this article, we develop a new computational method to tackle this classic problem. Specifically, our method explicitly finds the eigenvalues and eigenfunctions of the diffusion generator associated with the Wright-Fisher diffusion with recurrent mutation and arbitrary diploid selection, thus allowing one to obtain an accurate spectral representation of the transition density function. Simplicity is one of the appealing features of our approach. Although our derivation involves somewhat advanced mathematical concepts, the resulting algorithm is quite simple and efficient, only involving standard linear algebra. Furthermore, unlike previous approaches based on perturbation, which is applicable only when the population-scaled selection coefficient is small, our method is nonperturbative and is valid for a broad range of parameter values. As a by-product of our work, we obtain the rate of convergence to the stationary distribution under mutation-selection balance.
Song, Yun S.; Steinrücken, Matthias
2012-01-01
The transition density function of the Wright–Fisher diffusion describes the evolution of population-wide allele frequencies over time. This function has important practical applications in population genetics, but finding an explicit formula under a general diploid selection model has remained a difficult open problem. In this article, we develop a new computational method to tackle this classic problem. Specifically, our method explicitly finds the eigenvalues and eigenfunctions of the diffusion generator associated with the Wright–Fisher diffusion with recurrent mutation and arbitrary diploid selection, thus allowing one to obtain an accurate spectral representation of the transition density function. Simplicity is one of the appealing features of our approach. Although our derivation involves somewhat advanced mathematical concepts, the resulting algorithm is quite simple and efficient, only involving standard linear algebra. Furthermore, unlike previous approaches based on perturbation, which is applicable only when the population-scaled selection coefficient is small, our method is nonperturbative and is valid for a broad range of parameter values. As a by-product of our work, we obtain the rate of convergence to the stationary distribution under mutation–selection balance. PMID:22209899
The impact of feature selection on one and two-class classification performance for plant microRNAs.
Khalifa, Waleed; Yousef, Malik; Saçar Demirci, Müşerref Duygu; Allmer, Jens
2016-01-01
MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. It ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC where they act as recognition keys to aid in regulation of target mRNAs. It is involved to determine miRNAs experimentally and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although, in general, two-class classification (TCC) is used in the field; because negative examples are hard to come by, one-class classification (OCC) has been tried for pre-miRNA detection. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC where the best feature selection method achieved an average accuracy of 95.6%, thereby being ∼29% better than the worst method which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection and its largest performance gap is ∼13% which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.
Learning Compositional Shape Models of Multiple Distance Metrics by Information Projection.
Luo, Ping; Lin, Liang; Liu, Xiaobai
2016-07-01
This paper presents a novel compositional contour-based shape model by incorporating multiple distance metrics to account for varying shape distortions or deformations. Our approach contains two key steps: 1) contour feature generation and 2) generative model pursuit. For each category, we first densely sample an ensemble of local prototype contour segments from a few positive shape examples and describe each segment using three different types of distance metrics. These metrics are diverse and complementary with each other to capture various shape deformations. We regard the parameterized contour segment plus an additive residual ϵ as a basic subspace, namely, ϵ -ball, in the sense that it represents local shape variance under the certain distance metric. Using these ϵ -balls as features, we then propose a generative learning algorithm to pursue the compositional shape model, which greedily selects the most representative features under the information projection principle. In experiments, we evaluate our model on several public challenging data sets, and demonstrate that the integration of multiple shape distance metrics is capable of dealing various shape deformations, articulations, and background clutter, hence boosting system performance.
Global risk of big earthquakes has not recently increased.
Shearer, Peter M; Stark, Philip B
2012-01-17
The recent elevated rate of large earthquakes has fueled concern that the underlying global rate of earthquake activity has increased, which would have important implications for assessments of seismic hazard and our understanding of how faults interact. We examine the timing of large (magnitude M≥7) earthquakes from 1900 to the present, after removing local clustering related to aftershocks. The global rate of M≥8 earthquakes has been at a record high roughly since 2004, but rates have been almost as high before, and the rate of smaller earthquakes is close to its historical average. Some features of the global catalog are improbable in retrospect, but so are some features of most random sequences--if the features are selected after looking at the data. For a variety of magnitude cutoffs and three statistical tests, the global catalog, with local clusters removed, is not distinguishable from a homogeneous Poisson process. Moreover, no plausible physical mechanism predicts real changes in the underlying global rate of large events. Together these facts suggest that the global risk of large earthquakes is no higher today than it has been in the past.
Global risk of big earthquakes has not recently increased
Shearer, Peter M.; Stark, Philip B.
2012-01-01
The recent elevated rate of large earthquakes has fueled concern that the underlying global rate of earthquake activity has increased, which would have important implications for assessments of seismic hazard and our understanding of how faults interact. We examine the timing of large (magnitude M≥7) earthquakes from 1900 to the present, after removing local clustering related to aftershocks. The global rate of M≥8 earthquakes has been at a record high roughly since 2004, but rates have been almost as high before, and the rate of smaller earthquakes is close to its historical average. Some features of the global catalog are improbable in retrospect, but so are some features of most random sequences—if the features are selected after looking at the data. For a variety of magnitude cutoffs and three statistical tests, the global catalog, with local clusters removed, is not distinguishable from a homogeneous Poisson process. Moreover, no plausible physical mechanism predicts real changes in the underlying global rate of large events. Together these facts suggest that the global risk of large earthquakes is no higher today than it has been in the past. PMID:22184228
NASA Astrophysics Data System (ADS)
Starkey, Andrew; Usman Ahmad, Aliyu; Hamdoun, Hassan
2017-10-01
This paper investigates the application of a novel method for classification called Feature Weighted Self Organizing Map (FWSOM) that analyses the topology information of a converged standard Self Organizing Map (SOM) to automatically guide the selection of important inputs during training for improved classification of data with redundant inputs, examined against two traditional approaches namely neural networks and Support Vector Machines (SVM) for the classification of EEG data as presented in previous work. In particular, the novel method looks to identify the features that are important for classification automatically, and in this way the important features can be used to improve the diagnostic ability of any of the above methods. The paper presents the results and shows how the automated identification of the important features successfully identified the important features in the dataset and how this results in an improvement of the classification results for all methods apart from linear discriminatory methods which cannot separate the underlying nonlinear relationship in the data. The FWSOM in addition to achieving higher classification accuracy has given insights into what features are important in the classification of each class (left and right-hand movements), and these are corroborated by already published work in this area.
High temperature arc-track resistant aerospace insulation
NASA Technical Reports Server (NTRS)
Dorogy, William
1994-01-01
The topics are presented in viewgraph form and include the following: high temperature aerospace insulation; Foster-Miller approach to develop a 300 C rated, arc-track resistant aerospace insulation; advantages and disadvantages of key structural features; summary goals and achievements of the phase 1 program; performance goals for selected materials; materials under evaluation; molecular structures of candidate polymers; candidate polymer properties; film properties; and a detailed program plan.
Kitano, Hiroaki
2004-11-01
Robustness is a ubiquitously observed property of biological systems. It is considered to be a fundamental feature of complex evolvable systems. It is attained by several underlying principles that are universal to both biological organisms and sophisticated engineering systems. Robustness facilitates evolvability and robust traits are often selected by evolution. Such a mutually beneficial process is made possible by specific architectural features observed in robust systems. But there are trade-offs between robustness, fragility, performance and resource demands, which explain system behaviour, including the patterns of failure. Insights into inherent properties of robust systems will provide us with a better understanding of complex diseases and a guiding principle for therapy design.
Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam
2012-01-15
Feature selection is a key concept in machine learning for microarray datasets, where features represented by probesets are typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions in regards to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.
Selective processing of multiple features in the human brain: effects of feature type and salience.
McGinnis, E Menton; Keil, Andreas
2011-02-09
Identifying targets in a stream of items at a given constant spatial location relies on selection of aspects such as color, shape, or texture. Such attended (target) features of a stimulus elicit a negative-going event-related brain potential (ERP), termed Selection Negativity (SN), which has been used as an index of selective feature processing. In two experiments, participants viewed a series of Gabor patches in which targets were defined as a specific combination of color, orientation, and shape. Distracters were composed of different combinations of color, orientation, and shape of the target stimulus. This design allows comparisons of items with and without specific target features. Consistent with previous ERP research, SN deflections extended between 160-300 ms. Data from the subsequent P3 component (300-450 ms post-stimulus) were also examined, and were regarded as an index of target processing. In Experiment A, predominant effects of target color on SN and P3 amplitudes were found, along with smaller ERP differences in response to variations of orientation and shape. Manipulating color to be less salient while enhancing the saliency of the orientation of the Gabor patch (Experiment B) led to delayed color selection and enhanced orientation selection. Topographical analyses suggested that the location of SN on the scalp reliably varies with the nature of the to-be-attended feature. No interference of non-target features on the SN was observed. These results suggest that target feature selection operates by means of electrocortical facilitation of feature-specific sensory processes, and that selective electrocortical facilitation is more effective when stimulus saliency is heightened.
Feature selection for the classification of traced neurons.
López-Cabrera, José D; Lorenzo-Ginori, Juan V
2018-06-01
The great availability of computational tools to calculate the properties of traced neurons leads to the existence of many descriptors which allow the automated classification of neurons from these reconstructions. This situation determines the necessity to eliminate irrelevant features as well as making a selection of the most appropriate among them, in order to improve the quality of the classification obtained. The dataset used contains a total of 318 traced neurons, classified by human experts in 192 GABAergic interneurons and 126 pyramidal cells. The features were extracted by means of the L-measure software, which is one of the most used computational tools in neuroinformatics to quantify traced neurons. We review some current feature selection techniques as filter, wrapper, embedded and ensemble methods. The stability of the feature selection methods was measured. For the ensemble methods, several aggregation methods based on different metrics were applied to combine the subsets obtained during the feature selection process. The subsets obtained applying feature selection methods were evaluated using supervised classifiers, among which Random Forest, C4.5, SVM, Naïve Bayes, Knn, Decision Table and the Logistic classifier were used as classification algorithms. Feature selection methods of types filter, embedded, wrappers and ensembles were compared and the subsets returned were tested in classification tasks for different classification algorithms. L-measure features EucDistanceSD, PathDistanceSD, Branch_pathlengthAve, Branch_pathlengthSD and EucDistanceAve were present in more than 60% of the selected subsets which provides evidence about their importance in the classification of this neurons. Copyright © 2018 Elsevier B.V. All rights reserved.
Dynamic interactions between visual working memory and saccade target selection
Schneegans, Sebastian; Spencer, John P.; Schöner, Gregor; Hwang, Seongmin; Hollingworth, Andrew
2014-01-01
Recent psychophysical experiments have shown that working memory for visual surface features interacts with saccadic motor planning, even in tasks where the saccade target is unambiguously specified by spatial cues. Specifically, a match between a memorized color and the color of either the designated target or a distractor stimulus influences saccade target selection, saccade amplitudes, and latencies in a systematic fashion. To elucidate these effects, we present a dynamic neural field model in combination with new experimental data. The model captures the neural processes underlying visual perception, working memory, and saccade planning relevant to the psychophysical experiment. It consists of a low-level visual sensory representation that interacts with two separate pathways: a spatial pathway implementing spatial attention and saccade generation, and a surface feature pathway implementing color working memory and feature attention. Due to bidirectional coupling between visual working memory and feature attention in the model, the working memory content can indirectly exert an effect on perceptual processing in the low-level sensory representation. This in turn biases saccadic movement planning in the spatial pathway, allowing the model to quantitatively reproduce the observed interaction effects. The continuous coupling between representations in the model also implies that modulation should be bidirectional, and model simulations provide specific predictions for complementary effects of saccade target selection on visual working memory. These predictions were empirically confirmed in a new experiment: Memory for a sample color was biased toward the color of a task-irrelevant saccade target object, demonstrating the bidirectional coupling between visual working memory and perceptual processing. PMID:25228628
Color-selective attention need not be mediated by spatial attention.
Andersen, Søren K; Müller, Matthias M; Hillyard, Steven A
2009-06-08
It is well-established that attention can select stimuli for preferential processing on the basis of non-spatial features such as color, orientation, or direction of motion. Evidence is mixed, however, as to whether feature-selective attention acts by increasing the signal strength of to-be-attended features irrespective of their spatial locations or whether it acts by guiding the spotlight of spatial attention to locations containing the relevant feature. To address this question, we designed a task in which feature-selective attention could not be mediated by spatial selection. Participants observed a display of intermingled dots of two colors, which rapidly and unpredictably changed positions, with the task of detecting brief intervals of reduced luminance of 20% of the dots of one or the other color. Both behavioral indices and electrophysiological measures of steady-state visual evoked potentials showed selectively enhanced processing of the attended-color items. The results demonstrate that feature-selective attention produces a sensory gain enhancement at early levels of the visual cortex that occurs without mediation by spatial attention.
RESIDENTIAL RADON RESISTANT CONSTRUCTION FEATURE SELECTION SYSTEM
The report describes a proposed residential radon resistant construction feature selection system. The features consist of engineered barriers to reduce radon entry and accumulation indoors. The proposed Florida standards require radon resistant features in proportion to regional...
Matsukura, Michi; Vecera, Shaun P
2011-02-01
Attention selects objects as well as locations. When attention selects an object's features, observers identify two features from a single object more accurately than two features from two different objects (object-based effect of attention; e.g., Duncan, Journal of Experimental Psychology: General, 113, 501-517, 1984). Several studies have demonstrated that object-based attention can operate at a late visual processing stage that is independent of objects' spatial information (Awh, Dhaliwal, Christensen, & Matsukura, Psychological Science, 12, 329-334, 2001; Matsukura & Vecera, Psychonomic Bulletin & Review, 16, 529-536, 2009; Vecera, Journal of Experimental Psychology: General, 126, 14-18, 1997; Vecera & Farah, Journal of Experimental Psychology: General, 123, 146-160, 1994). In the present study, we asked two questions regarding this late object-based selection mechanism. In Part I, we investigated how observers' foreknowledge of to-be-reported features allows attention to select objects, as opposed to individual features. Using a feature-report task, a significant object-based effect was observed when to-be-reported features were known in advance but not when this advance knowledge was absent. In Part II, we examined what drives attention to select objects rather than individual features in the absence of observers' foreknowledge of to-be-reported features. Results suggested that, when there was no opportunity for observers to direct their attention to objects that possess to-be-reported features at the time of stimulus presentation, these stimuli must retain strong perceptual cues to establish themselves as separate objects.
Stefan, Sabina; Schorr, Barbara; Lopez-Rolon, Alex; Kolassa, Iris-Tatjana; Shock, Jonathan P; Rosenfelder, Martin; Heck, Suzette; Bender, Andreas
2018-04-17
We applied the following methods to resting-state EEG data from patients with disorders of consciousness (DOC) for consciousness indexing and outcome prediction: microstates, entropy (i.e. approximate, permutation), power in alpha and delta frequency bands, and connectivity (i.e. weighted symbolic mutual information, symbolic transfer entropy, complex network analysis). Patients with unresponsive wakefulness syndrome (UWS) and patients in a minimally conscious state (MCS) were classified into these two categories by fitting and testing a generalised linear model. We aimed subsequently to develop an automated system for outcome prediction in severe DOC by selecting an optimal subset of features using sequential floating forward selection (SFFS). The two outcome categories were defined as UWS or dead, and MCS or emerged from MCS. Percentage of time spent in microstate D in the alpha frequency band performed best at distinguishing MCS from UWS patients. The average clustering coefficient obtained from thresholding beta coherence performed best at predicting outcome. The optimal subset of features selected with SFFS consisted of the frequency of microstate A in the 2-20 Hz frequency band, path length obtained from thresholding alpha coherence, and average path length obtained from thresholding alpha coherence. Combining these features seemed to afford high prediction power. Python and MATLAB toolboxes for the above calculations are freely available under the GNU public license for non-commercial use ( https://qeeg.wordpress.com ).
Chowdhury, Debbrota Paul; Bakshi, Sambit; Guo, Guodong; Sa, Pankaj Kumar
2017-11-27
In this paper, an overall framework has been presented for person verification using ear biometric which uses tunable filter bank as local feature extractor. The tunable filter bank, based on a half-band polynomial of 14th order, extracts distinct features from ear images maintaining its frequency selectivity property. To advocate the applicability of tunable filter bank on ear biometrics, recognition test has been performed on available constrained databases like AMI, WPUT, IITD and unconstrained database like UERC. Experiments have been conducted applying tunable filter based feature extractor on subparts of the ear. Empirical experiments have been conducted with four and six subdivisions of the ear image. Analyzing the experimental results, it has been found that tunable filter moderately succeeds to distinguish ear features at par with the state-of-the-art features used for ear recognition. Accuracies of 70.58%, 67.01%, 81.98%, and 57.75% have been achieved on AMI, WPUT, IITD, and UERC databases through considering Canberra Distance as underlying measure of separation. The performances indicate that tunable filter is a candidate for recognizing human from ear images.
[Several mechanisms of visual gnosis disorders in local brain lesions].
Meerson, Ia A
1981-01-01
The object of the studies were peculiarities of recognizing visual images by patients with local cerebral lesions under conditions of incomplete sets of the image features, disjunction of the latter, distortion of their spatial arrangement, and unusual spatial orientation of the image as a whole. It was found that elimination of even one essential feature sharply hampered the recognition of the image both by healthy individuals (control), and patients with extraoccipital lesions, whereas elimination of several nonessential features only slowed down the process. In distinction from this the difficulties of the recognition of incomplete images by patients with occipital lesions were directly proportional to the number of the eliminated features irrespective of the latters' significance, i.e. these patients were unable to evaluate the hierarchy of the features. The recognition process in these patients were followed the way of scanning individual features. The reaccumulation and summation. The recognition of the fragmental, spatially distorted and unusually oriented images was found to be affected selectively in patients with parietal lobe affections. The patients with occipital lesions recognized such images practically as good as the ordinary ones.
Shan, Juan; Alam, S Kaisar; Garra, Brian; Zhang, Yingtao; Ahmed, Tahira
2016-04-01
This work identifies effective computable features from the Breast Imaging Reporting and Data System (BI-RADS), to develop a computer-aided diagnosis (CAD) system for breast ultrasound. Computerized features corresponding to ultrasound BI-RADs categories were designed and tested using a database of 283 pathology-proven benign and malignant lesions. Features were selected based on classification performance using a "bottom-up" approach for different machine learning methods, including decision tree, artificial neural network, random forest and support vector machine. Using 10-fold cross-validation on the database of 283 cases, the highest area under the receiver operating characteristic (ROC) curve (AUC) was 0.84 from a support vector machine with 77.7% overall accuracy; the highest overall accuracy, 78.5%, was from a random forest with the AUC 0.83. Lesion margin and orientation were optimum features common to all of the different machine learning methods. These features can be used in CAD systems to help distinguish benign from worrisome lesions. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. All rights reserved.
Constraint programming based biomarker optimization.
Zhou, Manli; Luo, Youxi; Sun, Guoquan; Mai, Guoqin; Zhou, Fengfeng
2015-01-01
Efficient and intuitive characterization of biological big data is becoming a major challenge for modern bio-OMIC based scientists. Interactive visualization and exploration of big data is proven to be one of the successful solutions. Most of the existing feature selection algorithms do not allow the interactive inputs from users in the optimizing process of feature selection. This study investigates this question as fixing a few user-input features in the finally selected feature subset and formulates these user-input features as constraints for a programming model. The proposed algorithm, fsCoP (feature selection based on constrained programming), performs well similar to or much better than the existing feature selection algorithms, even with the constraints from both literature and the existing algorithms. An fsCoP biomarker may be intriguing for further wet lab validation, since it satisfies both the classification optimization function and the biomedical knowledge. fsCoP may also be used for the interactive exploration of bio-OMIC big data by interactively adding user-defined constraints for modeling.
Jin, Mingwu; Deng, Weishu
2018-05-15
There is a spectrum of the progression from healthy control (HC) to mild cognitive impairment (MCI) without conversion to Alzheimer's disease (AD), to MCI with conversion to AD (cMCI), and to AD. This study aims to predict the different disease stages using brain structural information provided by magnetic resonance imaging (MRI) data. The neighborhood component analysis (NCA) is applied to select most powerful features for prediction. The ensemble decision tree classifier is built to predict which group the subject belongs to. The best features and model parameters are determined by cross validation of the training data. Our results show that 16 out of a total of 429 features were selected by NCA using 240 training subjects, including MMSE score and structural measures in memory-related regions. The boosting tree model with NCA features can achieve prediction accuracy of 56.25% on 160 test subjects. Principal component analysis (PCA) and sequential feature selection (SFS) are used for feature selection, while support vector machine (SVM) is used for classification. The boosting tree model with NCA features outperforms all other combinations of feature selection and classification methods. The results suggest that NCA be a better feature selection strategy than PCA and SFS for the data used in this study. Ensemble tree classifier with boosting is more powerful than SVM to predict the subject group. However, more advanced feature selection and classification methods or additional measures besides structural MRI may be needed to improve the prediction performance. Copyright © 2018 Elsevier B.V. All rights reserved.
Li, Jing; Hong, Wenxue
2014-12-01
The feature extraction and feature selection are the important issues in pattern recognition. Based on the geometric algebra representation of vector, a new feature extraction method using blade coefficient of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to solve the elevated high dimension issue. The simple linear discriminant analysis was used as the classifier. The result of the 10-fold cross-validation (10 CV) classification of public breast cancer biomedical dataset was more than 96% and proved superior to that of the original features and traditional feature extraction method.
Application of machine learning on brain cancer multiclass classification
NASA Astrophysics Data System (ADS)
Panca, V.; Rustam, Z.
2017-07-01
Classification of brain cancer is a problem of multiclass classification. One approach to solve this problem is by first transforming it into several binary problems. The microarray gene expression dataset has the two main characteristics of medical data: extremely many features (genes) and only a few number of samples. The application of machine learning on microarray gene expression dataset mainly consists of two steps: feature selection and classification. In this paper, the features are selected using a method based on support vector machine recursive feature elimination (SVM-RFE) principle which is improved to solve multiclass classification, called multiple multiclass SVM-RFE. Instead of using only the selected features on a single classifier, this method combines the result of multiple classifiers. The features are divided into subsets and SVM-RFE is used on each subset. Then, the selected features on each subset are put on separate classifiers. This method enhances the feature selection ability of each single SVM-RFE. Twin support vector machine (TWSVM) is used as the method of the classifier to reduce computational complexity. While ordinary SVM finds single optimum hyperplane, the main objective Twin SVM is to find two non-parallel optimum hyperplanes. The experiment on the brain cancer microarray gene expression dataset shows this method could classify 71,4% of the overall test data correctly, using 100 and 1000 genes selected from multiple multiclass SVM-RFE feature selection method. Furthermore, the per class results show that this method could classify data of normal and MD class with 100% accuracy.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhong, H; Wang, J; Shen, L
Purpose: The purpose of this study is to investigate the relationship between computed tomographic (CT) texture features of primary lesions and metastasis-free survival for rectal cancer patients; and to develop a datamining prediction model using texture features. Methods: A total of 220 rectal cancer patients treated with neoadjuvant chemo-radiotherapy (CRT) were enrolled in this study. All patients underwent CT scans before CRT. The primary lesions on the CT images were delineated by two experienced oncologists. The CT images were filtered by Laplacian of Gaussian (LoG) filters with different filter values (1.0–2.5: from fine to coarse). Both filtered and unfiltered imagesmore » were analyzed using Gray-level Co-occurrence Matrix (GLCM) texture analysis with different directions (transversal, sagittal, and coronal). Totally, 270 texture features with different species, directions and filter values were extracted. Texture features were examined with Student’s t-test for selecting predictive features. Principal Component Analysis (PCA) was performed upon the selected features to reduce the feature collinearity. Artificial neural network (ANN) and logistic regression were applied to establish metastasis prediction models. Results: Forty-six of 220 patients developed metastasis with a follow-up time of more than 2 years. Sixtyseven texture features were significantly different in t-test (p<0.05) between patients with and without metastasis, and 12 of them were extremely significant (p<0.001). The Area-under-the-curve (AUC) of ANN was 0.72, and the concordance index (CI) of logistic regression was 0.71. The predictability of ANN was slightly better than logistic regression. Conclusion: CT texture features of primary lesions are related to metastasisfree survival of rectal cancer patients. Both ANN and logistic regression based models can be developed for prediction.« less
SU-F-R-14: PET Based Radiomics to Predict Outcomes in Patients with Hodgkin Lymphoma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, J; Aristophanous, M; Akhtari, M
Purpose: To identify PET-based radiomics features associated with high refractory/relapsed disease risk for Hodgkin lymphoma patients. Methods: A total of 251 Hodgkin lymphoma patients including 19 primary refractory and 9 relapsed patients were investigated. All patients underwent an initial pre-treatment diagnostic FDG PET/CT scan. All cancerous lymph node regions (ROIs) were delineated by an experienced physician based on thresholding each volume of disease in the anatomical regions to SUV>2.5. We extracted 122 image features and evaluated the effect of ROI selection (the largest ROI, the ROI with highest mean SUV, merged ROI, and a single anatomic region [e.g. mediastinum]) onmore » classification accuracy. Random forest was used as a classifier and ROC analysis was used to assess the relationship between selected features and patient’s outcome status. Results: Each patient had between 1 and 9 separate ROIs, with much intra-patient variability in PET features. The best model, which used features from a single anatomic region (the mediastinal ROI, only volumes>5cc: 169 patients with 12 primary refractory) had a classification accuracy of 80.5% for primary refractory disease. The top five features, based on Gini index, consist of shape features (max 3D-diameter and volume) and texture features (correlation and information measure of correlation1&2). In the ROC analysis, sensitivity and specificity of the best model were 0.92 and 0.80, respectively. The area under the ROC (AUC) and the accuracy were 0.86 and 0.86, respectively. The classification accuracy was less than 60% for other ROI models or when ROIs less than 5cc were included. Conclusion: This study showed that PET-based radiomics features from the mediastinal lymph region are associated with primary refractory disease and therefore may play an important role in predicting outcomes in Hodgkin lymphoma patients. These features could be additive beyond baseline tumor and clinical characteristics, and may warrant more aggressive treatment.« less
DemQSAR: predicting human volume of distribution and clearance of drugs
NASA Astrophysics Data System (ADS)
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationship (QSAR/QSPR) correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic for the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VDss) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power. A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VDss and CL is available on the webpage of DemPRED: http://agknapp.chemie.fu-berlin.de/dempred/.
DemQSAR: predicting human volume of distribution and clearance of drugs.
Demir-Kavuk, Ozgur; Bentzien, Jörg; Muegge, Ingo; Knapp, Ernst-Walter
2011-12-01
In silico methods characterizing molecular compounds with respect to pharmacologically relevant properties can accelerate the identification of new drugs and reduce their development costs. Quantitative structure-activity/-property relationship (QSAR/QSPR) correlate structure and physico-chemical properties of molecular compounds with a specific functional activity/property under study. Typically a large number of molecular features are generated for the compounds. In many cases the number of generated features exceeds the number of molecular compounds with known property values that are available for learning. Machine learning methods tend to overfit the training data in such situations, i.e. the method adjusts to very specific features of the training data, which are not characteristic for the considered property. This problem can be alleviated by diminishing the influence of unimportant, redundant or even misleading features. A better strategy is to eliminate such features completely. Ideally, a molecular property can be described by a small number of features that are chemically interpretable. The purpose of the present contribution is to provide a predictive modeling approach, which combines feature generation, feature selection, model building and control of overtraining into a single application called DemQSAR. DemQSAR is used to predict human volume of distribution (VD(ss)) and human clearance (CL). To control overtraining, quadratic and linear regularization terms were employed. A recursive feature selection approach is used to reduce the number of descriptors. The prediction performance is as good as the best predictions reported in the recent literature. The example presented here demonstrates that DemQSAR can generate a model that uses very few features while maintaining high predictive power. A standalone DemQSAR Java application for model building of any user defined property as well as a web interface for the prediction of human VD(ss) and CL is available on the webpage of DemPRED: http://agknapp.chemie.fu-berlin.de/dempred/ .
Decision Variants for the Automatic Determination of Optimal Feature Subset in RF-RFE.
Chen, Qi; Meng, Zhaopeng; Liu, Xinyi; Jin, Qianguo; Su, Ran
2018-06-15
Feature selection, which identifies a set of most informative features from the original feature space, has been widely used to simplify the predictor. Recursive feature elimination (RFE), as one of the most popular feature selection approaches, is effective in data dimension reduction and efficiency increase. A ranking of features, as well as candidate subsets with the corresponding accuracy, is produced through RFE. The subset with highest accuracy (HA) or a preset number of features (PreNum) are often used as the final subset. However, this may lead to a large number of features being selected, or if there is no prior knowledge about this preset number, it is often ambiguous and subjective regarding final subset selection. A proper decision variant is in high demand to automatically determine the optimal subset. In this study, we conduct pioneering work to explore the decision variant after obtaining a list of candidate subsets from RFE. We provide a detailed analysis and comparison of several decision variants to automatically select the optimal feature subset. Random forest (RF)-recursive feature elimination (RF-RFE) algorithm and a voting strategy are introduced. We validated the variants on two totally different molecular biology datasets, one for a toxicogenomic study and the other one for protein sequence analysis. The study provides an automated way to determine the optimal feature subset when using RF-RFE.
Hay, Julia L; Milders, Maarten M; Sahraie, Arash; Niedeggen, Michael
2006-08-01
Recent visual marking studies have shown that the carry-over of distractor inhibition can impair the ability of singletons to capture attention if the singleton and distractors share features. The current study extends this finding to first-order motion targets and distractors, clearly separated in time by a visual cue (the letter X). Target motion discrimination was significantly impaired, a result attributed to the carry-over of distractor inhibition. Increasing the difficulty of cue detection increased the motion target impairment, as distractor inhibition is thought to increase under demanding (high load) conditions in order to maximize selection efficiency. The apparent conflict with studies reporting reduced distractor inhibition under high load conditions was resolved by distinguishing between the effects of "cognitive" and "perceptual" load. ((c) 2006 APA, all rights reserved).
Mining the key predictors for event outbreaks in social networks
NASA Astrophysics Data System (ADS)
Yi, Chengqi; Bao, Yuanyuan; Xue, Yibo
2016-04-01
It will be beneficial to devise a method to predict a so-called event outbreak. Existing works mainly focus on exploring effective methods for improving the accuracy of predictions, while ignoring the underlying causes: What makes event go viral? What factors that significantly influence the prediction of an event outbreak in social networks? In this paper, we proposed a novel definition for an event outbreak, taking into account the structural changes to a network during the propagation of content. In addition, we investigated features that were sensitive to predicting an event outbreak. In order to investigate the universality of these features at different stages of an event, we split the entire lifecycle of an event into 20 equal segments according to the proportion of the propagation time. We extracted 44 features, including features related to content, users, structure, and time, from each segment of the event. Based on these features, we proposed a prediction method using supervised classification algorithms to predict event outbreaks. Experimental results indicate that, as time goes by, our method is highly accurate, with a precision rate ranging from 79% to 97% and a recall rate ranging from 74% to 97%. In addition, after applying a feature-selection algorithm, the top five selected features can considerably improve the accuracy of the prediction. Data-driven experimental results show that the entropy of the eigenvector centrality, the entropy of the PageRank, the standard deviation of the betweenness centrality, the proportion of re-shares without content, and the average path length are the key predictors for an event outbreak. Our findings are especially useful for further exploring the intrinsic characteristics of outbreak prediction.
Influence of time and length size feature selections for human activity sequences recognition.
Fang, Hongqing; Chen, Long; Srinivasan, Raghavendiran
2014-01-01
In this paper, Viterbi algorithm based on a hidden Markov model is applied to recognize activity sequences from observed sensors events. Alternative features selections of time feature values of sensors events and activity length size feature values are tested, respectively, and then the results of activity sequences recognition performances of Viterbi algorithm are evaluated. The results show that the selection of larger time feature values of sensor events and/or smaller activity length size feature values will generate relatively better results on the activity sequences recognition performances. © 2013 ISA Published by ISA All rights reserved.
Adaptive runtime for a multiprocessing API
Antao, Samuel F.; Bertolli, Carlo; Eichenberger, Alexandre E.; O'Brien, John K.
2016-11-15
A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.
Adaptive runtime for a multiprocessing API
Antao, Samuel F.; Bertolli, Carlo; Eichenberger, Alexandre E.; O'Brien, John K.
2016-10-11
A computer-implemented method includes selecting a runtime for executing a program. The runtime includes a first combination of feature implementations, where each feature implementation implements a feature of an application programming interface (API). Execution of the program is monitored, and the execution uses the runtime. Monitor data is generated based on the monitoring. A second combination of feature implementations are selected, by a computer processor, where the selection is based at least in part on the monitor data. The runtime is modified by activating the second combination of feature implementations to replace the first combination of feature implementations.
Chebouba, Lokmane; Boughaci, Dalila; Guziolowski, Carito
2018-06-04
The use of data issued from high throughput technologies in drug target problems is widely widespread during the last decades. This study proposes a meta-heuristic framework using stochastic local search (SLS) combined with random forest (RF) where the aim is to specify the most important genes and proteins leading to the best classification of Acute Myeloid Leukemia (AML) patients. First we use a stochastic local search meta-heuristic as a feature selection technique to select the most significant proteins to be used in the classification task step. Then we apply RF to classify new patients into their corresponding classes. The evaluation technique is to run the RF classifier on the training data to get a model. Then, we apply this model on the test data to find the appropriate class. We use as metrics the balanced accuracy (BAC) and the area under the receiver operating characteristic curve (AUROC) to measure the performance of our model. The proposed method is evaluated on the dataset issued from DREAM 9 challenge. The comparison is done with a pure random forest (without feature selection), and with the two best ranked results of the DREAM 9 challenge. We used three types of data: only clinical data, only proteomics data, and finally clinical and proteomics data combined. The numerical results show that the highest scores are obtained when using clinical data alone, and the lowest is obtained when using proteomics data alone. Further, our method succeeds in finding promising results compared to the methods presented in the DREAM challenge.
Analysis of sampling techniques for imbalanced data: An n = 648 ADNI study.
Dubey, Rashmi; Zhou, Jiayu; Wang, Yalin; Thompson, Paul M; Ye, Jieping
2014-02-15
Many neuroimaging applications deal with imbalanced imaging data. For example, in Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, the mild cognitive impairment (MCI) cases eligible for the study are nearly two times the Alzheimer's disease (AD) patients for structural magnetic resonance imaging (MRI) modality and six times the control cases for proteomics modality. Constructing an accurate classifier from imbalanced data is a challenging task. Traditional classifiers that aim to maximize the overall prediction accuracy tend to classify all data into the majority class. In this paper, we study an ensemble system of feature selection and data sampling for the class imbalance problem. We systematically analyze various sampling techniques by examining the efficacy of different rates and types of undersampling, oversampling, and a combination of over and undersampling approaches. We thoroughly examine six widely used feature selection algorithms to identify significant biomarkers and thereby reduce the complexity of the data. The efficacy of the ensemble techniques is evaluated using two different classifiers including Random Forest and Support Vector Machines based on classification accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, and specificity measures. Our extensive experimental results show that for various problem settings in ADNI, (1) a balanced training set obtained with K-Medoids technique based undersampling gives the best overall performance among different data sampling techniques and no sampling approach; and (2) sparse logistic regression with stability selection achieves competitive performance among various feature selection algorithms. Comprehensive experiments with various settings show that our proposed ensemble model of multiple undersampled datasets yields stable and promising results. © 2013 Elsevier Inc. All rights reserved.
Informative Feature Selection for Object Recognition via Sparse PCA
2011-04-07
constraint on images collected from low-power camera net- works instead of high-end photography is that establishing wide-baseline feature correspondence of...variable selection tool for selecting informative features in the object images captured from low-resolution cam- era sensor networks. Firstly, we...More examples can be found in Figure 4 later. 3. Identifying Informative Features Classical PCA is a well established tool for the analysis of high
SVGenes: a library for rendering genomic features in scalable vector graphic format.
Etherington, Graham J; MacLean, Daniel
2013-08-01
Drawing genomic features in attractive and informative ways is a key task in visualization of genomics data. Scalable Vector Graphics (SVG) format is a modern and flexible open standard that provides advanced features including modular graphic design, advanced web interactivity and animation within a suitable client. SVGs do not suffer from loss of image quality on re-scaling and provide the ability to edit individual elements of a graphic on the whole object level independent of the whole image. These features make SVG a potentially useful format for the preparation of publication quality figures including genomic objects such as genes or sequencing coverage and for web applications that require rich user-interaction with the graphical elements. SVGenes is a Ruby-language library that uses SVG primitives to render typical genomic glyphs through a simple and flexible Ruby interface. The library implements a simple Page object that spaces and contains horizontal Track objects that in turn style, colour and positions features within them. Tracks are the level at which visual information is supplied providing the full styling capability of the SVG standard. Genomic entities like genes, transcripts and histograms are modelled in Glyph objects that are attached to a track and take advantage of SVG primitives to render the genomic features in a track as any of a selection of defined glyphs. The feature model within SVGenes is simple but flexible and not dependent on particular existing gene feature formats meaning graphics for any existing datasets can easily be created without need for conversion. The library is provided as a Ruby Gem from https://rubygems.org/gems/bio-svgenes under the MIT license, and open source code is available at https://github.com/danmaclean/bioruby-svgenes also under the MIT License. dan.maclean@tsl.ac.uk.
Rosenkrantz, Andrew B; Doshi, Ankur M; Ginocchio, Luke A; Aphinyanaphongs, Yindalon
2016-12-01
This study aimed to assess the performance of a text classification machine-learning model in predicting highly cited articles within the recent radiological literature and to identify the model's most influential article features. We downloaded from PubMed the title, abstract, and medical subject heading terms for 10,065 articles published in 25 general radiology journals in 2012 and 2013. Three machine-learning models were applied to predict the top 10% of included articles in terms of the number of citations to the article in 2014 (reflecting the 2-year time window in conventional impact factor calculations). The model having the highest area under the curve was selected to derive a list of article features (words) predicting high citation volume, which was iteratively reduced to identify the smallest possible core feature list maintaining predictive power. Overall themes were qualitatively assigned to the core features. The regularized logistic regression (Bayesian binary regression) model had highest performance, achieving an area under the curve of 0.814 in predicting articles in the top 10% of citation volume. We reduced the initial 14,083 features to 210 features that maintain predictivity. These features corresponded with topics relating to various imaging techniques (eg, diffusion-weighted magnetic resonance imaging, hyperpolarized magnetic resonance imaging, dual-energy computed tomography, computed tomography reconstruction algorithms, tomosynthesis, elastography, and computer-aided diagnosis), particular pathologies (prostate cancer; thyroid nodules; hepatic adenoma, hepatocellular carcinoma, non-alcoholic fatty liver disease), and other topics (radiation dose, electroporation, education, general oncology, gadolinium, statistics). Machine learning can be successfully applied to create specific feature-based models for predicting articles likely to achieve high influence within the radiological literature. Copyright © 2016 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Selective attention to facial emotion in physically abused children.
Pollak, Seth D; Tolley-Schell, Stephanie A
2003-08-01
The ability to allocate attention to emotional cues in the environment is an important feature of adaptive self-regulation. Existing data suggest that physically abused children overattend to angry expressions, but the attentional mechanisms underlying such behavior are unknown. The authors tested 8-11-year-old physically abused children to determine whether they displayed specific information-processing problems in a selective attention paradigm using emotional faces as cues. Physically abused children demonstrated delayed disengagement when angry faces served as invalid cues. Abused children also demonstrated increased attentional benefits on valid angry trials. Results are discussed in terms of the influence of early adverse experience on children's selective attention to threat-related signals as a mechanism in the development of psychopathology.
Overview of field gamma spectrometries based on Si-photomultiplier
NASA Astrophysics Data System (ADS)
Denisov, Viktor; Korotaev, Valery; Titov, Aleksandr; Blokhina, Anastasia; Kleshchenok, Maksim
2017-05-01
Design of optical-electronic devices and systems involves the selection of such technical patterns that under given initial requirements and conditions are optimal according to certain criteria. The original characteristic of the OES for any purpose, defining its most important feature ability is a threshold detection. Based on this property, will be achieved the required functional quality of the device or system. Therefore, the original criteria and optimization methods have to subordinate to the idea of a better detectability. Generally reduces to the problem of optimal selection of the expected (predetermined) signals in the predetermined observation conditions. Thus the main purpose of optimization of the system when calculating its detectability is the choice of circuits and components that provide the most effective selection of a target.
NASA Astrophysics Data System (ADS)
Li, Zuhe; Fan, Yangyu; Liu, Weihua; Yu, Zeqi; Wang, Fengqin
2017-01-01
We aim to apply sparse autoencoder-based unsupervised feature learning to emotional semantic analysis for textile images. To tackle the problem of limited training data, we present a cross-domain feature learning scheme for emotional textile image classification using convolutional autoencoders. We further propose a correlation-analysis-based feature selection method for the weights learned by sparse autoencoders to reduce the number of features extracted from large size images. First, we randomly collect image patches on an unlabeled image dataset in the source domain and learn local features with a sparse autoencoder. We then conduct feature selection according to the correlation between different weight vectors corresponding to the autoencoder's hidden units. We finally adopt a convolutional neural network including a pooling layer to obtain global feature activations of textile images in the target domain and send these global feature vectors into logistic regression models for emotional image classification. The cross-domain unsupervised feature learning method achieves 65% to 78% average accuracy in the cross-validation experiments corresponding to eight emotional categories and performs better than conventional methods. Feature selection can reduce the computational cost of global feature extraction by about 50% while improving classification performance.
NASA Astrophysics Data System (ADS)
Weinmann, Martin; Jutzi, Boris; Hinz, Stefan; Mallet, Clément
2015-07-01
3D scene analysis in terms of automatically assigning 3D points a respective semantic label has become a topic of great importance in photogrammetry, remote sensing, computer vision and robotics. In this paper, we address the issue of how to increase the distinctiveness of geometric features and select the most relevant ones among these for 3D scene analysis. We present a new, fully automated and versatile framework composed of four components: (i) neighborhood selection, (ii) feature extraction, (iii) feature selection and (iv) classification. For each component, we consider a variety of approaches which allow applicability in terms of simplicity, efficiency and reproducibility, so that end-users can easily apply the different components and do not require expert knowledge in the respective domains. In a detailed evaluation involving 7 neighborhood definitions, 21 geometric features, 7 approaches for feature selection, 10 classifiers and 2 benchmark datasets, we demonstrate that the selection of optimal neighborhoods for individual 3D points significantly improves the results of 3D scene analysis. Additionally, we show that the selection of adequate feature subsets may even further increase the quality of the derived results while significantly reducing both processing time and memory consumption.
Web-based newborn screening system for metabolic diseases: machine learning versus clinicians.
Chen, Wei-Hsin; Hsieh, Sheau-Ling; Hsu, Kai-Ping; Chen, Han-Ping; Su, Xing-Yu; Tseng, Yi-Ju; Chien, Yin-Hsiu; Hwu, Wuh-Liang; Lai, Feipei
2013-05-23
A hospital information system (HIS) that integrates screening data and interpretation of the data is routinely requested by hospitals and parents. However, the accuracy of disease classification may be low because of the disease characteristics and the analytes used for classification. The objective of this study is to describe a system that enhanced the neonatal screening system of the Newborn Screening Center at the National Taiwan University Hospital. The system was designed and deployed according to a service-oriented architecture (SOA) framework under the Web services .NET environment. The system consists of sample collection, testing, diagnosis, evaluation, treatment, and follow-up services among collaborating hospitals. To improve the accuracy of newborn screening, machine learning and optimal feature selection mechanisms were investigated for screening newborns for inborn errors of metabolism. The framework of the Newborn Screening Hospital Information System (NSHIS) used the embedded Health Level Seven (HL7) standards for data exchanges among heterogeneous platforms integrated by Web services in the C# language. In this study, machine learning classification was used to predict phenylketonuria (PKU), hypermethioninemia, and 3-methylcrotonyl-CoA-carboxylase (3-MCC) deficiency. The classification methods used 347,312 newborn dried blood samples collected at the Center between 2006 and 2011. Of these, 220 newborns had values over the diagnostic cutoffs (positive cases) and 1557 had values that were over the screening cutoffs but did not meet the diagnostic cutoffs (suspected cases). The original 35 analytes and the manifested features were ranked based on F score, then combinations of the top 20 ranked features were selected as input features to support vector machine (SVM) classifiers to obtain optimal feature sets. These feature sets were tested using 5-fold cross-validation and optimal models were generated. The datasets collected in year 2011 were used as predicting cases. The feature selection strategies were implemented and the optimal markers for PKU, hypermethioninemia, and 3-MCC deficiency were obtained. The results of the machine learning approach were compared with the cutoff scheme. The number of the false positive cases were reduced from 21 to 2 for PKU, from 30 to 10 for hypermethioninemia, and 209 to 46 for 3-MCC deficiency. This SOA Web service-based newborn screening system can accelerate screening procedures effectively and efficiently. An SVM learning methodology for PKU, hypermethioninemia, and 3-MCC deficiency metabolic diseases classification, including optimal feature selection strategies, is presented. By adopting the results of this study, the number of suspected cases could be reduced dramatically.
Web-Based Newborn Screening System for Metabolic Diseases: Machine Learning Versus Clinicians
Chen, Wei-Hsin; Hsu, Kai-Ping; Chen, Han-Ping; Su, Xing-Yu; Tseng, Yi-Ju; Chien, Yin-Hsiu; Hwu, Wuh-Liang; Lai, Feipei
2013-01-01
Background A hospital information system (HIS) that integrates screening data and interpretation of the data is routinely requested by hospitals and parents. However, the accuracy of disease classification may be low because of the disease characteristics and the analytes used for classification. Objective The objective of this study is to describe a system that enhanced the neonatal screening system of the Newborn Screening Center at the National Taiwan University Hospital. The system was designed and deployed according to a service-oriented architecture (SOA) framework under the Web services .NET environment. The system consists of sample collection, testing, diagnosis, evaluation, treatment, and follow-up services among collaborating hospitals. To improve the accuracy of newborn screening, machine learning and optimal feature selection mechanisms were investigated for screening newborns for inborn errors of metabolism. Methods The framework of the Newborn Screening Hospital Information System (NSHIS) used the embedded Health Level Seven (HL7) standards for data exchanges among heterogeneous platforms integrated by Web services in the C# language. In this study, machine learning classification was used to predict phenylketonuria (PKU), hypermethioninemia, and 3-methylcrotonyl-CoA-carboxylase (3-MCC) deficiency. The classification methods used 347,312 newborn dried blood samples collected at the Center between 2006 and 2011. Of these, 220 newborns had values over the diagnostic cutoffs (positive cases) and 1557 had values that were over the screening cutoffs but did not meet the diagnostic cutoffs (suspected cases). The original 35 analytes and the manifested features were ranked based on F score, then combinations of the top 20 ranked features were selected as input features to support vector machine (SVM) classifiers to obtain optimal feature sets. These feature sets were tested using 5-fold cross-validation and optimal models were generated. The datasets collected in year 2011 were used as predicting cases. Results The feature selection strategies were implemented and the optimal markers for PKU, hypermethioninemia, and 3-MCC deficiency were obtained. The results of the machine learning approach were compared with the cutoff scheme. The number of the false positive cases were reduced from 21 to 2 for PKU, from 30 to 10 for hypermethioninemia, and 209 to 46 for 3-MCC deficiency. Conclusions This SOA Web service–based newborn screening system can accelerate screening procedures effectively and efficiently. An SVM learning methodology for PKU, hypermethioninemia, and 3-MCC deficiency metabolic diseases classification, including optimal feature selection strategies, is presented. By adopting the results of this study, the number of suspected cases could be reduced dramatically. PMID:23702487
Efficient feature selection using a hybrid algorithm for the task of epileptic seizure detection
NASA Astrophysics Data System (ADS)
Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline
2014-07-01
Feature selection is a very important aspect in the field of machine learning. It entails the search of an optimal subset from a very large data set with high dimensional feature space. Apart from eliminating redundant features and reducing computational cost, a good selection of feature also leads to higher prediction and classification accuracy. In this paper, an efficient feature selection technique is introduced in the task of epileptic seizure detection. The raw data are electroencephalography (EEG) signals. Using discrete wavelet transform, the biomedical signals were decomposed into several sets of wavelet coefficients. To reduce the dimension of these wavelet coefficients, a feature selection method that combines the strength of both filter and wrapper methods is proposed. Principal component analysis (PCA) is used as part of the filter method. As for wrapper method, the evolutionary harmony search (HS) algorithm is employed. This metaheuristic method aims at finding the best discriminating set of features from the original data. The obtained features were then used as input for an automated classifier, namely wavelet neural networks (WNNs). The WNNs model was trained to perform a binary classification task, that is, to determine whether a given EEG signal was normal or epileptic. For comparison purposes, different sets of features were also used as input. Simulation results showed that the WNNs that used the features chosen by the hybrid algorithm achieved the highest overall classification accuracy.
Parsanezhad, M E; Alborzi, S; Zarei, A; Dehbashi, S; Omrani, G
2001-10-01
To evaluate the clinical features, endocrine and metabolic profiles in clomiphene (CC) responders and non-responders with polycystic ovarian disease (PCOD), and to examine the effects of metformin (MTF) on the above parameters of CC resistance. A prospective clinical trial was undertaken at the infertility division of a university teaching hospital. Forty-one CC responders were selected and their hormonal and clinical features were determined. Forty-one CC-resistant PCOD women were also selected and clinical features; metabolic and hormonal profiles before and after treatment with MTF 1500 mg/day for 6-8 weeks were evaluated. Women who failed to conceive were treated by CC while continuing to take MTF. CC responders had higher insulin levels while non-responders were hyperinsulinemic. Menstrual irregularities improved in 30%. Mean+/-S.D. area under curve of insulin decreased from 297.58+/-191.33 to 206+/-0.1 mIU/ml per min (P=0.005). Only 39.39% ovulated and 24.24% conceived. PCOD is associated with insulin resistance (IR) particularly in CC-resistant women. Insulin resistance and androgen levels are significantly higher in obese patients. MTF therapy improved hyperandrogenemia, IR, and pregnancy rate.
Fuzzy Nonlinear Proximal Support Vector Machine for Land Extraction Based on Remote Sensing Image
Zhong, Xiaomei; Li, Jianping; Dou, Huacheng; Deng, Shijun; Wang, Guofei; Jiang, Yu; Wang, Yongjie; Zhou, Zebing; Wang, Li; Yan, Fei
2013-01-01
Currently, remote sensing technologies were widely employed in the dynamic monitoring of the land. This paper presented an algorithm named fuzzy nonlinear proximal support vector machine (FNPSVM) by basing on ETM+ remote sensing image. This algorithm is applied to extract various types of lands of the city Da’an in northern China. Two multi-category strategies, namely “one-against-one” and “one-against-rest” for this algorithm were described in detail and then compared. A fuzzy membership function was presented to reduce the effects of noises or outliers on the data samples. The approaches of feature extraction, feature selection, and several key parameter settings were also given. Numerous experiments were carried out to evaluate its performances including various accuracies (overall accuracies and kappa coefficient), stability, training speed, and classification speed. The FNPSVM classifier was compared to the other three classifiers including the maximum likelihood classifier (MLC), back propagation neural network (BPN), and the proximal support vector machine (PSVM) under different training conditions. The impacts of the selection of training samples, testing samples and features on the four classifiers were also evaluated in these experiments. PMID:23936016
A study of metaheuristic algorithms for high dimensional feature selection on microarray data
NASA Astrophysics Data System (ADS)
Dankolo, Muhammad Nasiru; Radzi, Nor Haizan Mohamed; Sallehuddin, Roselina; Mustaffa, Noorfa Haszlinna
2017-11-01
Microarray systems enable experts to examine gene profile at molecular level using machine learning algorithms. It increases the potentials of classification and diagnosis of many diseases at gene expression level. Though, numerous difficulties may affect the efficiency of machine learning algorithms which includes vast number of genes features comprised in the original data. Many of these features may be unrelated to the intended analysis. Therefore, feature selection is necessary to be performed in the data pre-processing. Many feature selection algorithms are developed and applied on microarray which including the metaheuristic optimization algorithms. This paper discusses the application of the metaheuristics algorithms for feature selection in microarray dataset. This study reveals that, the algorithms have yield an interesting result with limited resources thereby saving computational expenses of machine learning algorithms.
Delay Differential Equation Models of Normal and Diseased Electrocardiograms
NASA Astrophysics Data System (ADS)
Lainscsek, Claudia; Sejnowski, Terrence J.
Time series analysis with nonlinear delay differential equations (DDEs) is a powerful tool since it reveals spectral as well as nonlinear properties of the underlying dynamical system. Here global DDE models are used to analyze electrocardiography recordings (ECGs) in order to capture distinguishing features for different heart conditions such as normal heart beat, congestive heart failure, and atrial fibrillation. To capture distinguishing features of the different data types the number of terms and delays in the model as well as the order of nonlinearity of the DDE model have to be selected. The DDE structure selection is done in a supervised way by selecting the DDE that best separates different data types. We analyzed 24 h of data from 15 young healthy subjects in normal sinus rhythm (NSR) of 15 congestive heart failure (CHF) patients as well as of 15 subjects suffering from atrial fibrillation (AF) selected from the Physionet database. For the analysis presented here we used 5 min non-overlapping data windows on the raw data without any artifact removal. For classification performance we used the Cohen Kappa coefficient computed directly from the confusion matrix. The overall classification performance of the three groups was around 72-99 % on the 5 min windows for the different approaches. For 2 h data windows the classification for all three groups was above 95%.
Zhang, Chuncheng; Song, Sutao; Wen, Xiaotong; Yao, Li; Long, Zhiying
2015-04-30
Feature selection plays an important role in improving the classification accuracy of multivariate classification techniques in the context of fMRI-based decoding due to the "few samples and large features" nature of functional magnetic resonance imaging (fMRI) data. Recently, several sparse representation methods have been applied to the voxel selection of fMRI data. Despite the low computational efficiency of the sparse representation methods, they still displayed promise for applications that select features from fMRI data. In this study, we proposed the Laplacian smoothed L0 norm (LSL0) approach for feature selection of fMRI data. Based on the fast sparse decomposition using smoothed L0 norm (SL0) (Mohimani, 2007), the LSL0 method used the Laplacian function to approximate the L0 norm of sources. Results of the simulated and real fMRI data demonstrated the feasibility and robustness of LSL0 for the sparse source estimation and feature selection. Simulated results indicated that LSL0 produced more accurate source estimation than SL0 at high noise levels. The classification accuracy using voxels that were selected by LSL0 was higher than that by SL0 in both simulated and real fMRI experiment. Moreover, both LSL0 and SL0 showed higher classification accuracy and required less time than ICA and t-test for the fMRI decoding. LSL0 outperformed SL0 in sparse source estimation at high noise level and in feature selection. Moreover, LSL0 and SL0 showed better performance than ICA and t-test for feature selection. Copyright © 2015 Elsevier B.V. All rights reserved.
Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex.
Downer, Joshua D; Rapone, Brittany; Verhein, Jessica; O'Connor, Kevin N; Sutter, Mitchell L
2017-05-24
Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations ( r noise ) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on r noise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in r noise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations ( r noise ) in rhesus macaque A1 during task performance. Unlike previous studies showing that the effect of attention on r noise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning to the distractor feature as well. We suggest that these effects represent an efficient process by which sensory cortex simultaneously enhances relevant information and suppresses irrelevant information. Copyright © 2017 the authors 0270-6474/17/375378-15$15.00/0.
Feature-Selective Attention Adaptively Shifts Noise Correlations in Primary Auditory Cortex
2017-01-01
Sensory environments often contain an overwhelming amount of information, with both relevant and irrelevant information competing for neural resources. Feature attention mediates this competition by selecting the sensory features needed to form a coherent percept. How attention affects the activity of populations of neurons to support this process is poorly understood because population coding is typically studied through simulations in which one sensory feature is encoded without competition. Therefore, to study the effects of feature attention on population-based neural coding, investigations must be extended to include stimuli with both relevant and irrelevant features. We measured noise correlations (rnoise) within small neural populations in primary auditory cortex while rhesus macaques performed a novel feature-selective attention task. We found that the effect of feature-selective attention on rnoise depended not only on the population tuning to the attended feature, but also on the tuning to the distractor feature. To attempt to explain how these observed effects might support enhanced perceptual performance, we propose an extension of a simple and influential model in which shifts in rnoise can simultaneously enhance the representation of the attended feature while suppressing the distractor. These findings present a novel mechanism by which attention modulates neural populations to support sensory processing in cluttered environments. SIGNIFICANCE STATEMENT Although feature-selective attention constitutes one of the building blocks of listening in natural environments, its neural bases remain obscure. To address this, we developed a novel auditory feature-selective attention task and measured noise correlations (rnoise) in rhesus macaque A1 during task performance. Unlike previous studies showing that the effect of attention on rnoise depends on population tuning to the attended feature, we show that the effect of attention depends on the tuning to the distractor feature as well. We suggest that these effects represent an efficient process by which sensory cortex simultaneously enhances relevant information and suppresses irrelevant information. PMID:28432139
Detecting bursts in the EEG of very and extremely premature infants using a multi-feature approach.
O'Toole, John M; Boylan, Geraldine B; Lloyd, Rhodri O; Goulding, Robert M; Vanhatalo, Sampsa; Stevenson, Nathan J
2017-07-01
To develop a method that segments preterm EEG into bursts and inter-bursts by extracting and combining multiple EEG features. Two EEG experts annotated bursts in individual EEG channels for 36 preterm infants with gestational age < 30 weeks. The feature set included spectral, amplitude, and frequency-weighted energy features. Using a consensus annotation, feature selection removed redundant features and a support vector machine combined features. Area under the receiver operator characteristic (AUC) and Cohen's kappa (κ) evaluated performance within a cross-validation procedure. The proposed channel-independent method improves AUC by 4-5% over existing methods (p < 0.001, n=36), with median (95% confidence interval) AUC of 0.989 (0.973-0.997) and sensitivity-specificity of 95.8-94.4%. Agreement rates between the detector and experts' annotations, κ=0.72 (0.36-0.83) and κ=0.65 (0.32-0.81), are comparable to inter-rater agreement, κ=0.60 (0.21-0.74). Automating the visual identification of bursts in preterm EEG is achievable with a high level of accuracy. Multiple features, combined using a data-driven approach, improves on existing single-feature methods. Copyright © 2017 The Authors. Published by Elsevier Ltd.. All rights reserved.
Computer-aided diagnosis of prostate cancer in the peripheral zone using multiparametric MRI
NASA Astrophysics Data System (ADS)
Niaf, Emilie; Rouvière, Olivier; Mège-Lechevallier, Florence; Bratan, Flavie; Lartizien, Carole
2012-06-01
This study evaluated a computer-assisted diagnosis (CADx) system for determining a likelihood measure of prostate cancer presence in the peripheral zone (PZ) based on multiparametric magnetic resonance (MR) imaging, including T2-weighted, diffusion-weighted and dynamic contrast-enhanced MRI at 1.5 T. Based on a feature set derived from grey-level images, including first-order statistics, Haralick features, gradient features, semi-quantitative and quantitative (pharmacokinetic modelling) dynamic parameters, four kinds of classifiers were trained and compared : nonlinear support vector machine (SVM), linear discriminant analysis, k-nearest neighbours and naïve Bayes classifiers. A set of feature selection methods based on t-test, mutual information and minimum-redundancy-maximum-relevancy criteria were also compared. The aim was to discriminate between the relevant features as well as to create an efficient classifier using these features. The diagnostic performances of these different CADx schemes were evaluated based on a receiver operating characteristic (ROC) curve analysis. The evaluation database consisted of 30 sets of multiparametric MR images acquired from radical prostatectomy patients. Using histologic sections as the gold standard, both cancer and nonmalignant (but suspicious) tissues were annotated in consensus on all MR images by two radiologists, a histopathologist and a researcher. Benign tissue regions of interest (ROIs) were also delineated in the remaining prostate PZ. This resulted in a series of 42 cancer ROIs, 49 benign but suspicious ROIs and 124 nonsuspicious benign ROIs. From the outputs of all evaluated feature selection methods on the test bench, a restrictive set of about 15 highly informative features coming from all MR sequences was discriminated, thus confirming the validity of the multiparametric approach. Quantitative evaluation of the diagnostic performance yielded a maximal area under the ROC curve (AUC) of 0.89 (0.81-0.94) for the discrimination of the malignant versus nonmalignant tissues and 0.82 (0.73-0.90) for the discrimination of the malignant versus suspicious tissues when combining the t-test feature selection approach with a SVM classifier. A preliminary comparison showed that the optimal CADx scheme mimicked, in terms of AUC, the human experts in differentiating malignant from suspicious tissues, thus demonstrating its potential for assisting cancer identification in the PZ.
Deep learning based classification of breast tumors with shear-wave elastography.
Zhang, Qi; Xiao, Yang; Dai, Wei; Suo, Jingfeng; Wang, Congzhi; Shi, Jun; Zheng, Hairong
2016-12-01
This study aims to build a deep learning (DL) architecture for automated extraction of learned-from-data image features from the shear-wave elastography (SWE), and to evaluate the DL architecture in differentiation between benign and malignant breast tumors. We construct a two-layer DL architecture for SWE feature extraction, comprised of the point-wise gated Boltzmann machine (PGBM) and the restricted Boltzmann machine (RBM). The PGBM contains task-relevant and task-irrelevant hidden units, and the task-relevant units are connected to the RBM. Experimental evaluation was performed with five-fold cross validation on a set of 227 SWE images, 135 of benign tumors and 92 of malignant tumors, from 121 patients. The features learned with our DL architecture were compared with the statistical features quantifying image intensity and texture. Results showed that the DL features achieved better classification performance with an accuracy of 93.4%, a sensitivity of 88.6%, a specificity of 97.1%, and an area under the receiver operating characteristic curve of 0.947. The DL-based method integrates feature learning with feature selection on SWE. It may be potentially used in clinical computer-aided diagnosis of breast cancer. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Beguet, Benoit; Guyon, Dominique; Boukir, Samia; Chehata, Nesrine
2014-10-01
The main goal of this study is to design a method to describe the structure of forest stands from Very High Resolution satellite imagery, relying on some typical variables such as crown diameter, tree height, trunk diameter, tree density and tree spacing. The emphasis is placed on the automatization of the process of identification of the most relevant image features for the forest structure retrieval task, exploiting both spectral and spatial information. Our approach is based on linear regressions between the forest structure variables to be estimated and various spectral and Haralick's texture features. The main drawback of this well-known texture representation is the underlying parameters which are extremely difficult to set due to the spatial complexity of the forest structure. To tackle this major issue, an automated feature selection process is proposed which is based on statistical modeling, exploring a wide range of parameter values. It provides texture measures of diverse spatial parameters hence implicitly inducing a multi-scale texture analysis. A new feature selection technique, we called Random PRiF, is proposed. It relies on random sampling in feature space, carefully addresses the multicollinearity issue in multiple-linear regression while ensuring accurate prediction of forest variables. Our automated forest variable estimation scheme was tested on Quickbird and Pléiades panchromatic and multispectral images, acquired at different periods on the maritime pine stands of two sites in South-Western France. It outperforms two well-established variable subset selection techniques. It has been successfully applied to identify the best texture features in modeling the five considered forest structure variables. The RMSE of all predicted forest variables is improved by combining multispectral and panchromatic texture features, with various parameterizations, highlighting the potential of a multi-resolution approach for retrieving forest structure variables from VHR satellite images. Thus an average prediction error of ˜ 1.1 m is expected on crown diameter, ˜ 0.9 m on tree spacing, ˜ 3 m on height and ˜ 0.06 m on diameter at breast height.
Image Understanding by Image-Seeking Adaptive Networks (ISAN).
1987-08-10
our reserch on adaptive neural networks in the visual and sensory-motor cortex of cats. We demonstrate that, under certain conditions, plasticity is...understanding in organisms proceeds directly from adaptively seeking whole images and not via a preliminary analysis of elementary features, followed by object...empirical reserch has always been that ultimately any neural system has to serve behavior and that behavior serves survival. Evolutionary selection makes it
Mirsky, Simcha K; Barnea, Itay; Levi, Mattan; Greenspan, Hayit; Shaked, Natan T
2017-09-01
Currently, the delicate process of selecting sperm cells to be used for in vitro fertilization (IVF) is still based on the subjective, qualitative analysis of experienced clinicians using non-quantitative optical microscopy techniques. In this work, a method was developed for the automated analysis of sperm cells based on the quantitative phase maps acquired through use of interferometric phase microscopy (IPM). Over 1,400 human sperm cells from 8 donors were imaged using IPM, and an algorithm was designed to digitally isolate sperm cell heads from the quantitative phase maps while taking into consideration both the cell 3D morphology and contents, as well as acquire features describing sperm head morphology. A subset of these features was used to train a support vector machine (SVM) classifier to automatically classify sperm of good and bad morphology. The SVM achieves an area under the receiver operating characteristic curve of 88.59% and an area under the precision-recall curve of 88.67%, as well as precisions of 90% or higher. We believe that our automatic analysis can become the basis for objective and automatic sperm cell selection in IVF. © 2017 International Society for Advancement of Cytometry. © 2017 International Society for Advancement of Cytometry.
Age-related decline in bottom-up processing and selective attention in the very old.
Zhuravleva, Tatyana Y; Alperin, Brittany R; Haring, Anna E; Rentz, Dorene M; Holcomb, Philip J; Daffner, Kirk R
2014-06-01
Previous research demonstrating age-related deficits in selective attention have not included old-old adults, an increasingly important group to study. The current investigation compared event-related potentials in 15 young-old (65-79 years old) and 23 old-old (80-99 years old) subjects during a color-selective attention task. Subjects responded to target letters in a specified color (Attend) while ignoring letters in a different color (Ignore) under both low and high loads. There were no group differences in visual acuity, accuracy, reaction time, or latency of early event-related potential components. The old-old group showed a disruption in bottom-up processing, indexed by a substantially diminished posterior N1 (smaller amplitude). They also demonstrated markedly decreased modulation of bottom-up processing based on selected visual features, indexed by the posterior selection negativity (SN), with similar attenuation under both loads. In contrast, there were no group differences in frontally mediated attentional selection, measured by the anterior selection positivity (SP). There was a robust inverse relationship between the size of the SN and SP (the smaller the SN, the larger the SP), which may represent an anteriorly supported compensatory mechanism. In the absence of a decline in top-down modulation indexed by the SP, the diminished SN may reflect age-related degradation of early bottom-up visual processing in old-old adults.
Features and dosimetry of laser-inflicted retina injuries induced by short laser pulses
NASA Astrophysics Data System (ADS)
Pustovalov, Victor K.
1996-04-01
Energy absorption, heat transfer, thermodenaturation under the action of laser radiation pulse on pigmented spherical granules in heterogeneous laminated biotissues are investigated on the base of mathematical simulation. The possibility of selective interaction between short radiation pulses and pigmented retina biotissues is noted which results in the formation of thermodenaturation microregions inside and near the melanosomes. These denaturation microregions can originate in the eye biotissue under laser radiation intensities less than about 2 - 4 times the threshold ones determined ophthalmoscopically. These microdamages can appear without being detected by the standard ophthalmoscopical methods.
Catalyst- and Reagent-free Electrochemical Azole C-H Amination.
Qiu, Youai; Struwe, Julia; Meyer, Tjark H; Oliveira, Joao Carlos Agostinho Carlos Agostinho; Ackermann, Lutz
2018-06-14
Catalyst-, and chemical oxidant-free electrochemical azole C-H aminations were accomplished via cross-dehydrogenative C-H/N-H functionalization. The catalyst-free electrochemical C-H amination proved feasible on azoles with high levels of efficacy and selectivity, avoiding the use of stoichiometric oxidants under ambient conditions. Likewise, the C(sp3)-H nitrogenation proved viable under otherwise identical conditions. The dehydrogenative C-H amination featured ample scope, including cyclic and acyclic aliphatic amines as well as anilines, and employed sustainable electricity as the sole oxidant. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Josef, Noam; Amodio, Piero; Fiorito, Graziano; Shashar, Nadav
2012-01-01
Living under intense predation pressure, octopuses evolved an effective and impressive camouflaging ability that exploits features of their surroundings to enable them to “blend in.” To achieve such background matching, an animal may use general resemblance and reproduce characteristics of its entire surroundings, or it may imitate a specific object in its immediate environment. Using image analysis algorithms, we examined correlations between octopuses and their backgrounds. Field experiments show that when camouflaging, Octopus cyanea and O. vulgaris base their body patterns on selected features of nearby objects rather than attempting to match a large field of view. Such an approach enables the octopus to camouflage in partly occluded environments and to solve the problem of differences in appearance as a function of the viewing inclination of the observer. PMID:22649542
Liquid-Assisted Femtosecond Laser Precision-Machining of Silica.
Cao, Xiao-Wen; Chen, Qi-Dai; Fan, Hua; Zhang, Lei; Juodkazis, Saulius; Sun, Hong-Bo
2018-04-28
We report a systematical study on the liquid assisted femtosecond laser machining of quartz plate in water and under different etching solutions. The ablation features in liquid showed a better structuring quality and improved resolution with 1/3~1/2 smaller features as compared with those made in air. It has been demonstrated that laser induced periodic structures are present to a lesser extent when laser processed in water solutions. The redistribution of oxygen revealed a strong surface modification, which is related to the etching selectivity of laser irradiated regions. Laser ablation in KOH and HF solution showed very different morphology, which relates to the evolution of laser induced plasma on the formation of micro/nano-features in liquid. This work extends laser precision fabrication of hard materials. The mechanism of strong absorption in the regions with permittivity (epsilon) near zero is discussed.
NASA Astrophysics Data System (ADS)
Zhang, Zhifen; Chen, Huabin; Xu, Yanling; Zhong, Jiyong; Lv, Na; Chen, Shanben
2015-08-01
Multisensory data fusion-based online welding quality monitoring has gained increasing attention in intelligent welding process. This paper mainly focuses on the automatic detection of typical welding defect for Al alloy in gas tungsten arc welding (GTAW) by means of analzing arc spectrum, sound and voltage signal. Based on the developed algorithms in time and frequency domain, 41 feature parameters were successively extracted from these signals to characterize the welding process and seam quality. Then, the proposed feature selection approach, i.e., hybrid fisher-based filter and wrapper was successfully utilized to evaluate the sensitivity of each feature and reduce the feature dimensions. Finally, the optimal feature subset with 19 features was selected to obtain the highest accuracy, i.e., 94.72% using established classification model. This study provides a guideline for feature extraction, selection and dynamic modeling based on heterogeneous multisensory data to achieve a reliable online defect detection system in arc welding.
SVM-RFE based feature selection and Taguchi parameters optimization for multiclass SVM classifier.
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W M; Li, R K; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases.
SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier
Huang, Mei-Ling; Hung, Yung-Hsiang; Lee, W. M.; Li, R. K.; Jiang, Bo-Ru
2014-01-01
Recently, support vector machine (SVM) has excellent performance on classification and prediction and is widely used on disease diagnosis or medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for Dermatology and Zoo databases. Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; and the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, Taguchi method was jointly combined with SVM classifier in order to optimize parameters C and γ to increase classification accuracy for multiclass classification. The experimental results show that the classification accuracy can be more than 95% after SVM-RFE feature selection and Taguchi parameter optimization for Dermatology and Zoo databases. PMID:25295306
Frank, Laurence E; Heiser, Willem J
2008-05-01
A set of features is the basis for the network representation of proximity data achieved by feature network models (FNMs). Features are binary variables that characterize the objects in an experiment, with some measure of proximity as response variable. Sometimes features are provided by theory and play an important role in the construction of the experimental conditions. In some research settings, the features are not known a priori. This paper shows how to generate features in this situation and how to select an adequate subset of features that takes into account a good compromise between model fit and model complexity, using a new version of least angle regression that restricts coefficients to be non-negative, called the Positive Lasso. It will be shown that features can be generated efficiently with Gray codes that are naturally linked to the FNMs. The model selection strategy makes use of the fact that FNM can be considered as univariate multiple regression model. A simulation study shows that the proposed strategy leads to satisfactory results if the number of objects is less than or equal to 22. If the number of objects is larger than 22, the number of features selected by our method exceeds the true number of features in some conditions.
Li, Jiangeng; Su, Lei; Pang, Zenan
2015-12-01
Feature selection techniques have been widely applied to tumor gene expression data analysis in recent years. A filter feature selection method named marginal Fisher analysis score (MFA score) which is based on graph embedding has been proposed, and it has been widely used mainly because it is superior to Fisher score. Considering the heavy redundancy in gene expression data, we proposed a new filter feature selection technique in this paper. It is named MFA score+ and is based on MFA score and redundancy excluding. We applied it to an artificial dataset and eight tumor gene expression datasets to select important features and then used support vector machine as the classifier to classify the samples. Compared with MFA score, t test and Fisher score, it achieved higher classification accuracy.
An ant colony optimization based feature selection for web page classification.
Saraç, Esra; Özel, Selma Ayşe
2014-01-01
The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods.
Vessel Classification in Cosmo-Skymed SAR Data Using Hierarchical Feature Selection
NASA Astrophysics Data System (ADS)
Makedonas, A.; Theoharatos, C.; Tsagaris, V.; Anastasopoulos, V.; Costicoglou, S.
2015-04-01
SAR based ship detection and classification are important elements of maritime monitoring applications. Recently, high-resolution SAR data have opened new possibilities to researchers for achieving improved classification results. In this work, a hierarchical vessel classification procedure is presented based on a robust feature extraction and selection scheme that utilizes scale, shape and texture features in a hierarchical way. Initially, different types of feature extraction algorithms are implemented in order to form the utilized feature pool, able to represent the structure, material, orientation and other vessel type characteristics. A two-stage hierarchical feature selection algorithm is utilized next in order to be able to discriminate effectively civilian vessels into three distinct types, in COSMO-SkyMed SAR images: cargos, small ships and tankers. In our analysis, scale and shape features are utilized in order to discriminate smaller types of vessels present in the available SAR data, or shape specific vessels. Then, the most informative texture and intensity features are incorporated in order to be able to better distinguish the civilian types with high accuracy. A feature selection procedure that utilizes heuristic measures based on features' statistical characteristics, followed by an exhaustive research with feature sets formed by the most qualified features is carried out, in order to discriminate the most appropriate combination of features for the final classification. In our analysis, five COSMO-SkyMed SAR data with 2.2m x 2.2m resolution were used to analyse the detailed characteristics of these types of ships. A total of 111 ships with available AIS data were used in the classification process. The experimental results show that this method has good performance in ship classification, with an overall accuracy reaching 83%. Further investigation of additional features and proper feature selection is currently in progress.
Features selection and classification to estimate elbow movements
NASA Astrophysics Data System (ADS)
Rubiano, A.; Ramírez, J. L.; El Korso, M. N.; Jouandeau, N.; Gallimard, L.; Polit, O.
2015-11-01
In this paper, we propose a novel method to estimate the elbow motion, through the features extracted from electromyography (EMG) signals. The features values are normalized and then compared to identify potential relationships between the EMG signal and the kinematic information as angle and angular velocity. We propose and implement a method to select the best set of features, maximizing the distance between the features that correspond to flexion and extension movements. Finally, we test the selected features as inputs to a non-linear support vector machine in the presence of non-idealistic conditions, obtaining an accuracy of 99.79% in the motion estimation results.
Analyses of Fatigue and Fatigue-Crack Growth under Constant- and Variable-Amplitude Loading
NASA Technical Reports Server (NTRS)
Newman, J. C., Jr.
1999-01-01
Studies on the growth of small cracks have led to the observation that fatigue life of many engineering materials is primarily crack growth from micro-structural features, such as inclusion particles, voids, slip-bands or from manufacturing defects. This paper reviews the capabilities of a plasticity-induced crack-closure model to predict fatigue lives of metallic materials using small-crack theory under various loading conditions. Constraint factors, to account for three-dimensional effects, were selected to correlate large-crack growth rate data as a function of the effective stress-intensity factor range (delta K(sub eff)) under constant-amplitude loading. Modifications to the delta K(sub eff)-rate relations in the near-threshold regime were needed to fit measured small-crack growth rate behavior. The model was then used to calculate small- and large-crack growth rates, and to predict total fatigue lives, for notched and un-notched specimens under constant-amplitude and spectrum loading. Fatigue lives were predicted using crack-growth relations and micro-structural features like those that initiated cracks in the fatigue specimens for most of the materials analyzed. Results from the tests and analyses agreed well.
Efficient feature subset selection with probabilistic distance criteria. [pattern recognition
NASA Technical Reports Server (NTRS)
Chittineni, C. B.
1979-01-01
Recursive expressions are derived for efficiently computing the commonly used probabilistic distance measures as a change in the criteria both when a feature is added to and when a feature is deleted from the current feature subset. A combinatorial algorithm for generating all possible r feature combinations from a given set of s features in (s/r) steps with a change of a single feature at each step is presented. These expressions can also be used for both forward and backward sequential feature selection.
Primary progressive aphasia and the evolving neurology of the language network
Mesulam, M.-Marsel; Rogalski, Emily J.; Wieneke, Christina; Hurley, Robert S.; Geula, Changiz; Bigio, Eileen H.; Thompson, Cynthia K.; Weintraub, Sandra
2014-01-01
Primary progressive aphasia (PPA) is caused by selective neurodegeneration of the language-dominant cerebral hemisphere; a language deficit initially arises as the only consequential impairment and remains predominant throughout most of the course of the disease. Agrammatic, logopenic and semantic subtypes, each reflecting a characteristic pattern of language impairment and corresponding anatomical distribution of cortical atrophy, represent the most frequent presentations of PPA. Such associations between clinical features and the sites of atrophy have provided new insights into the neurology of fluency, grammar, word retrieval, and word comprehension, and have necessitated modification of concepts related to the functions of the anterior temporal lobe and Wernicke’s area. The underlying neuropathology of PPA is, most commonly, frontotemporal lobar degeneration in the agrammatic and semantic forms, and Alzheimer disease (AD) pathology in the logopenic form; the AD pathology often displays atypical and asymmetrical anatomical features consistent with the aphasic phenotype. The PPA syndrome reflects complex interactions between disease-specific neuropathological features and patient-specific vulnerability. A better understanding of these interactions might help us to elucidate the biology of the language network and the principles of selective vulnerability in neurodegenerative diseases. We review these aspects of PPA, focusing on advances in our understanding of the clinical features and neuropathology of PPA and what they have taught us about the neural substrates of the language network. PMID:25179257
Primary progressive aphasia and the evolving neurology of the language network.
Mesulam, M-Marsel; Rogalski, Emily J; Wieneke, Christina; Hurley, Robert S; Geula, Changiz; Bigio, Eileen H; Thompson, Cynthia K; Weintraub, Sandra
2014-10-01
Primary progressive aphasia (PPA) is caused by selective neurodegeneration of the language-dominant cerebral hemisphere; a language deficit initially arises as the only consequential impairment and remains predominant throughout most of the course of the disease. Agrammatic, logopenic and semantic subtypes, each reflecting a characteristic pattern of language impairment and corresponding anatomical distribution of cortical atrophy, represent the most frequent presentations of PPA. Such associations between clinical features and the sites of atrophy have provided new insights into the neurology of fluency, grammar, word retrieval, and word comprehension, and have necessitated modification of concepts related to the functions of the anterior temporal lobe and Wernicke's area. The underlying neuropathology of PPA is, most commonly, frontotemporal lobar degeneration in the agrammatic and semantic forms, and Alzheimer disease (AD) pathology in the logopenic form; the AD pathology often displays atypical and asymmetrical anatomical features consistent with the aphasic phenotype. The PPA syndrome reflects complex interactions between disease-specific neuropathological features and patient-specific vulnerability. A better understanding of these interactions might help us to elucidate the biology of the language network and the principles of selective vulnerability in neurodegenerative diseases. We review these aspects of PPA, focusing on advances in our understanding of the clinical features and neuropathology of PPA and what they have taught us about the neural substrates of the language network.
FSMRank: feature selection algorithm for learning to rank.
Lai, Han-Jiang; Pan, Yan; Tang, Yong; Yu, Rong
2013-06-01
In recent years, there has been growing interest in learning to rank. The introduction of feature selection into different learning problems has been proven effective. These facts motivate us to investigate the problem of feature selection for learning to rank. We propose a joint convex optimization formulation which minimizes ranking errors while simultaneously conducting feature selection. This optimization formulation provides a flexible framework in which we can easily incorporate various importance measures and similarity measures of the features. To solve this optimization problem, we use the Nesterov's approach to derive an accelerated gradient algorithm with a fast convergence rate O(1/T(2)). We further develop a generalization bound for the proposed optimization problem using the Rademacher complexities. Extensive experimental evaluations are conducted on the public LETOR benchmark datasets. The results demonstrate that the proposed method shows: 1) significant ranking performance gain compared to several feature selection baselines for ranking, and 2) very competitive performance compared to several state-of-the-art learning-to-rank algorithms.
An ensemble method for extracting adverse drug events from social media.
Liu, Jing; Zhao, Songzheng; Zhang, Xiaodi
2016-06-01
Because adverse drug events (ADEs) are a serious health problem and a leading cause of death, it is of vital importance to identify them correctly and in a timely manner. With the development of Web 2.0, social media has become a large data source for information on ADEs. The objective of this study is to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. We develop a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address high-dimensional features. Then, we evaluate the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles that are generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristics curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, we attain the best AUC values of 82.1%, 73.2%, and 77.0% using the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, we achieve the best AUC values of 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. Our experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which can stay away from the feature sparsity issue, are qualified to address the ADE extraction problem. Combining different individual classifiers using suitable combination methods can further enhance the ADE extraction effectiveness. Copyright © 2016 Elsevier B.V. All rights reserved.
Gao, JianZhao; Tao, Xue-Wen; Zhao, Jia; Feng, Yuan-Ming; Cai, Yu-Dong; Zhang, Ning
2017-01-01
Lysine acetylation, as one type of post-translational modifications (PTM), plays key roles in cellular regulations and can be involved in a variety of human diseases. However, it is often high-cost and time-consuming to use traditional experimental approaches to identify the lysine acetylation sites. Therefore, effective computational methods should be developed to predict the acetylation sites. In this study, we developed a position-specific method for epsilon lysine acetylation site prediction. Sequences of acetylated proteins were retrieved from the UniProt database. Various kinds of features such as position specific scoring matrix (PSSM), amino acid factors (AAF), and disorders were incorporated. A feature selection method based on mRMR (Maximum Relevance Minimum Redundancy) and IFS (Incremental Feature Selection) was employed. Finally, 319 optimal features were selected from total 541 features. Using the 319 optimal features to encode peptides, a predictor was constructed based on dagging. As a result, an accuracy of 69.56% with MCC of 0.2792 was achieved. We analyzed the optimal features, which suggested some important factors determining the lysine acetylation sites. We developed a position-specific method for epsilon lysine acetylation site prediction. A set of optimal features was selected. Analysis of the optimal features provided insights into the mechanism of lysine acetylation sites, providing guidance of experimental validation. Copyright© Bentham Science Publishers; For any queries, please email at epub@benthamscience.org.
Oliveira, Roberta B; Pereira, Aledir S; Tavares, João Manuel R S
2017-10-01
The number of deaths worldwide due to melanoma has risen in recent times, in part because melanoma is the most aggressive type of skin cancer. Computational systems have been developed to assist dermatologists in early diagnosis of skin cancer, or even to monitor skin lesions. However, there still remains a challenge to improve classifiers for the diagnosis of such skin lesions. The main objective of this article is to evaluate different ensemble classification models based on input feature manipulation to diagnose skin lesions. Input feature manipulation processes are based on feature subset selections from shape properties, colour variation and texture analysis to generate diversity for the ensemble models. Three subset selection models are presented here: (1) a subset selection model based on specific feature groups, (2) a correlation-based subset selection model, and (3) a subset selection model based on feature selection algorithms. Each ensemble classification model is generated using an optimum-path forest classifier and integrated with a majority voting strategy. The proposed models were applied on a set of 1104 dermoscopic images using a cross-validation procedure. The best results were obtained by the first ensemble classification model that generates a feature subset ensemble based on specific feature groups. The skin lesion diagnosis computational system achieved 94.3% accuracy, 91.8% sensitivity and 96.7% specificity. The input feature manipulation process based on specific feature subsets generated the greatest diversity for the ensemble classification model with very promising results. Copyright © 2017 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lin, H; Liu, T; Xu, X
Purpose: There are clinical decision challenges to select optimal treatment positions for left-sided breast cancer patients—supine free breathing (FB), supine Deep Inspiration Breath Hold (DIBH) and prone free breathing (prone). Physicians often make the decision based on experiences and trials, which might not always result optimal OAR doses. We herein propose a mathematical model to predict the lowest OAR doses among these three positions, providing a quantitative tool for corresponding clinical decision. Methods: Patients were scanned in FB, DIBH, and prone positions under an IRB approved protocol. Tangential beam plans were generated for each position, and OAR doses were calculated.more » The position with least OAR doses is defined as the optimal position. The following features were extracted from each scan to build the model: heart, ipsilateral lung, breast volume, in-field heart, ipsilateral lung volume, distance between heart and target, laterality of heart, and dose to heart and ipsilateral lung. Principal Components Analysis (PCA) was applied to remove the co-linearity of the input data and also to lower the data dimensionality. Feature selection, another method to reduce dimensionality, was applied as a comparison. Support Vector Machine (SVM) was then used for classification. Thirtyseven patient data were acquired; up to now, five patient plans were available. K-fold cross validation was used to validate the accuracy of the classifier model with small training size. Results: The classification results and K-fold cross validation demonstrated the model is capable of predicting the optimal position for patients. The accuracy of K-fold cross validations has reached 80%. Compared to PCA, feature selection allows causal features of dose to be determined. This provides more clinical insights. Conclusion: The proposed classification system appeared to be feasible. We are generating plans for the rest of the 37 patient images, and more statistically significant results are to be presented.« less
Zhang, Yu; Wu, Jianxin; Cai, Jianfei
2016-05-01
In large-scale visual recognition and image retrieval tasks, feature vectors, such as Fisher vector (FV) or the vector of locally aggregated descriptors (VLAD), have achieved state-of-the-art results. However, the combination of the large numbers of examples and high-dimensional vectors necessitates dimensionality reduction, in order to reduce its storage and CPU costs to a reasonable range. In spite of the popularity of various feature compression methods, this paper shows that the feature (dimension) selection is a better choice for high-dimensional FV/VLAD than the feature (dimension) compression methods, e.g., product quantization. We show that strong correlation among the feature dimensions in the FV and the VLAD may not exist, which renders feature selection a natural choice. We also show that, many dimensions in FV/VLAD are noise. Throwing them away using feature selection is better than compressing them and useful dimensions altogether using feature compression methods. To choose features, we propose an efficient importance sorting algorithm considering both the supervised and unsupervised cases, for visual recognition and image retrieval, respectively. Combining with the 1-bit quantization, feature selection has achieved both higher accuracy and less computational cost than feature compression methods, such as product quantization, on the FV and the VLAD image representations.
Blasting preparation for selective mining of complex structured ore deposition
NASA Astrophysics Data System (ADS)
Marinin, M. A.; Dolzhikov, V. V.
2017-10-01
Technological features of ore mining in the open pit development for processing of complex structured ore deposit of steeply falling occurrence have been considered. The technological schemes of ore bodies mining under different conditions of occurrence, consistency and capacity have been considered and offered in the paper. These technologies permit to reduce losses and dilution, but to increase the completeness and quality of mined ore. A method of subsequent selective excavation of ore bodies has been proposed. The method is based on the complex use of buffer-blasting technology for the muck mass and the principle of trim blasting at ore-rock junctions.
X-ray bright points and He I lambda 10830 dark points
NASA Technical Reports Server (NTRS)
Golub, L.; Harvey, K. L.; Herant, M.; Webb, D. F.
1989-01-01
Using near-simultaneous full disk Solar X-ray images and He I 10830 lambda, spectroheliograms from three recent rocket flights, dark points identified on the He I maps were compared with X-ray bright points identified on the X-ray images. It was found that for the largest and most obvious features there is a strong correlation: most He I dark points correspond to X-ray bright points. However, about 2/3 of the X-ray bright points were not identified on the basis of the helium data alone. Once an X-ray feature is identified it is almost always possible to find an underlying dark patch of enhanced He I absorption which, however, would not a priori have been selected as a dark point. Therefore, the He I dark points, using current selection criteria, cannot be used as a one-to-one proxy for the X-ray data. He I dark points do, however, identify the locations of the stronger X-ray bright points.
X-ray bright points and He I lambda 10830 dark points
NASA Technical Reports Server (NTRS)
Golub, L.; Harvey, K. L.; Herant, M.; Webb, D. F.
1989-01-01
Using near-simultaneous full disk Solar X-ray images and He I 10830 lambda, spectroheliograms from three recent rocket flights, dark points identified on the He I maps were compared with x-ray bright points identified on the X-ray images. It was found that for the largest and most obvious features there is a strong correlation: most He I dark points correspond to X-ray bright points. However, about 2/3 of the X-ray bright points were not identified on the basis of the helium data alone. Once an X-ray feature is identified it is almost always possible to find an underlying dark patch of enhanced He I absorption which, however, would not a priori have been selected as a dark point. Therefore, the He I dark points, using current selection criteria, cannot be used as a one-to-one proxy for the X-ray data. He I dark points do, however, identify the locations of the stronger X-ray bright points.
Automatic MeSH term assignment and quality assessment.
Kim, W.; Aronson, A. R.; Wilbur, W. J.
2001-01-01
For computational purposes documents or other objects are most often represented by a collection of individual attributes that may be strings or numbers. Such attributes are often called features and success in solving a given problem can depend critically on the nature of the features selected to represent documents. Feature selection has received considerable attention in the machine learning literature. In the area of document retrieval we refer to feature selection as indexing. Indexing has not traditionally been evaluated by the same methods used in machine learning feature selection. Here we show how indexing quality may be evaluated in a machine learning setting and apply this methodology to results of the Indexing Initiative at the National Library of Medicine. PMID:11825203
2009 Epigenetics Gordon Research Conference (August 9 - 14, 2009)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeanie Lee
Epigenetics refers to the study of heritable changes in genome function that occur without a change in primary DNA sequence. The 2009 Gordon Conference in Epigenetics will feature discussion of various epigenetic phenomena, emerging understanding of their underlying mechanisms, and the growing appreciation that human, animal, and plant health all depend on proper epigenetic control. Special emphasis will be placed on genome-environment interactions particularly as they relate to human disease. Towards improving knowledge of molecular mechanisms, the conference will feature international leaders studying the roles of higher order chromatin structure, noncoding RNA, repeat elements, nuclear organization, and morphogenic evolution. Traditionalmore » and new model organisms are selected from plants, fungi, and metazoans.« less
Checklist/Guide to Selecting a Small Computer.
ERIC Educational Resources Information Center
Bennett, Wilma E.
This 322-point checklist was designed to help executives make an intelligent choice when selecting a small computer for a business. For ease of use the questions have been divided into ten categories: Display Features, Keyboard Features, Printer Features, Controller Features, Software, Word Processing, Service, Training, Miscellaneous, and Costs.…
USDA-ARS?s Scientific Manuscript database
Due to the availability of numerous spectral, spatial, and contextual features, the determination of optimal features and class separabilities can be a time consuming process in object-based image analysis (OBIA). While several feature selection methods have been developed to assist OBIA, a robust c...
News video story segmentation method using fusion of audio-visual features
NASA Astrophysics Data System (ADS)
Wen, Jun; Wu, Ling-da; Zeng, Pu; Luan, Xi-dao; Xie, Yu-xiang
2007-11-01
News story segmentation is an important aspect for news video analysis. This paper presents a method for news video story segmentation. Different form prior works, which base on visual features transform, the proposed technique uses audio features as baseline and fuses visual features with it to refine the results. At first, it selects silence clips as audio features candidate points, and selects shot boundaries and anchor shots as two kinds of visual features candidate points. Then this paper selects audio feature candidates as cues and develops different fusion method, which effectively using diverse type visual candidates to refine audio candidates, to get story boundaries. Experiment results show that this method has high efficiency and adaptability to different kinds of news video.
Chen, Yun-Hao; Jiang, Jin-Bao; Steven, Michael D; Gong, A-Du; Li, Yi-Fan
2012-07-01
With the global climate warming, reducing greenhouse gas emissions becomes a focused problem for the world. The carbon capture and storage (CCS) techniques could mitigate CO2 into atmosphere, but there is a risk in case that the CO2 leaks from underground. The objective of this paper is to study the chlorophyll contents (SPAD value), relative water contents (RWC) and leaf spectra changing features of beetroot under CO2 leakage stress through field experiment. The result shows that the chlorophyll contents and RWC of beetroot under CO2 leakage stress become lower than the control beetroot', and the leaf reflectance increases in the 550 nm region and decreases in the 680nm region. A new vegetation index (R550/R680) was designed for identifying beetroot under CO2 leakage stress, and the result indicates that the vegetation index R550/R680 could identify the beetroots after CO2 leakage for 7 days. The index has strong sensitivity, stability and identification for monitoring the beetroots under CO2 stress. The result of this paper has very important meaning and application values for selecting spots of CCS project, monitoring and evaluating land-surface ecology under CO2 stress and monitoring the leakage spots by using remote sensing.
Gunavathi, Chellamuthu; Premalatha, Kandasamy
2014-01-01
Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.
A new time-frequency method for identification and classification of ball bearing faults
NASA Astrophysics Data System (ADS)
Attoui, Issam; Fergani, Nadir; Boutasseta, Nadir; Oudjani, Brahim; Deliou, Adel
2017-06-01
In order to fault diagnosis of ball bearing that is one of the most critical components of rotating machinery, this paper presents a time-frequency procedure incorporating a new feature extraction step that combines the classical wavelet packet decomposition energy distribution technique and a new feature extraction technique based on the selection of the most impulsive frequency bands. In the proposed procedure, firstly, as a pre-processing step, the most impulsive frequency bands are selected at different bearing conditions using a combination between Fast-Fourier-Transform FFT and Short-Frequency Energy SFE algorithms. Secondly, once the most impulsive frequency bands are selected, the measured machinery vibration signals are decomposed into different frequency sub-bands by using discrete Wavelet Packet Decomposition WPD technique to maximize the detection of their frequency contents and subsequently the most useful sub-bands are represented in the time-frequency domain by using Short Time Fourier transform STFT algorithm for knowing exactly what the frequency components presented in those frequency sub-bands are. Once the proposed feature vector is obtained, three feature dimensionality reduction techniques are employed using Linear Discriminant Analysis LDA, a feedback wrapper method and Locality Sensitive Discriminant Analysis LSDA. Lastly, the Adaptive Neuro-Fuzzy Inference System ANFIS algorithm is used for instantaneous identification and classification of bearing faults. In order to evaluate the performances of the proposed method, different testing data set to the trained ANFIS model by using different conditions of healthy and faulty bearings under various load levels, fault severities and rotating speed. The conclusion resulting from this paper is highlighted by experimental results which prove that the proposed method can serve as an intelligent bearing fault diagnosis system.
Feature selection for elderly faller classification based on wearable sensors.
Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D
2017-05-30
Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature sets and improved faller classification compared to faller classification without feature selection. CFS and FCBF provided the best feature subset (one posterior pelvis accelerometer feature) for faller classification. However, better sensitivity was achieved by the second best model based on a Relief-F feature subset with three pressure-sensing insole features and seven head accelerometer features. Feature selection should be considered as an important step in faller classification using wearable sensors.
Select Features in "Finale 2011" for Music Educators
ERIC Educational Resources Information Center
Thompson, Douglas Earl
2011-01-01
A feature-laden software program such as "Finale" is an overwhelming tool to master--if one hopes to master many features in a short amount of time. Believing that working with a fewer number of features can be a helpful approach, this article looks at a select number of features in "Finale 2011" of obvious use to music educators. These features…
Ip, Ifan Betina; Bridge, Holly; Parker, Andrew J.
2014-01-01
An important advance in the study of visual attention has been the identification of a non-spatial component of attention that enhances the response to similar features or objects across the visual field. Here we test whether this non-spatial component can co-select individual features that are perceptually bound into a coherent object. We combined human psychophysics and functional magnetic resonance imaging (fMRI) to demonstrate the ability to co-select individual features from perceptually coherent objects. Our study used binocular disparity and visual motion to define disparity structure-from-motion (dSFM) stimuli. Although the spatial attention system induced strong modulations of the fMRI response in visual regions, the non-spatial system’s ability to co-select features of the dSFM stimulus was less pronounced and variable across subjects. Our results demonstrate that feature and global feature attention effects are variable across participants, suggesting that the feature attention system may be limited in its ability to automatically select features within the attended object. Careful comparison of the task design suggests that even minor differences in the perceptual task may be critical in revealing the presence of global feature attention. PMID:24936974
IMMAN: free software for information theory-based chemometric analysis.
Urias, Ricardo W Pino; Barigye, Stephen J; Marrero-Ponce, Yovani; García-Jacas, César R; Valdes-Martiní, José R; Perez-Gimenez, Facundo
2015-05-01
The features and theoretical background of a new and free computational program for chemometric analysis denominated IMMAN (acronym for Information theory-based CheMoMetrics ANalysis) are presented. This is multi-platform software developed in the Java programming language, designed with a remarkably user-friendly graphical interface for the computation of a collection of information-theoretic functions adapted for rank-based unsupervised and supervised feature selection tasks. A total of 20 feature selection parameters are presented, with the unsupervised and supervised frameworks represented by 10 approaches in each case. Several information-theoretic parameters traditionally used as molecular descriptors (MDs) are adapted for use as unsupervised rank-based feature selection methods. On the other hand, a generalization scheme for the previously defined differential Shannon's entropy is discussed, as well as the introduction of Jeffreys information measure for supervised feature selection. Moreover, well-known information-theoretic feature selection parameters, such as information gain, gain ratio, and symmetrical uncertainty are incorporated to the IMMAN software ( http://mobiosd-hub.com/imman-soft/ ), following an equal-interval discretization approach. IMMAN offers data pre-processing functionalities, such as missing values processing, dataset partitioning, and browsing. Moreover, single parameter or ensemble (multi-criteria) ranking options are provided. Consequently, this software is suitable for tasks like dimensionality reduction, feature ranking, as well as comparative diversity analysis of data matrices. Simple examples of applications performed with this program are presented. A comparative study between IMMAN and WEKA feature selection tools using the Arcene dataset was performed, demonstrating similar behavior. In addition, it is revealed that the use of IMMAN unsupervised feature selection methods improves the performance of both IMMAN and WEKA supervised algorithms. Graphic representation for Shannon's distribution of MD calculating software.
Improved pulmonary nodule classification utilizing quantitative lung parenchyma features.
Dilger, Samantha K N; Uthoff, Johanna; Judisch, Alexandra; Hammond, Emily; Mott, Sarah L; Smith, Brian J; Newell, John D; Hoffman, Eric A; Sieren, Jessica C
2015-10-01
Current computer-aided diagnosis (CAD) models for determining pulmonary nodule malignancy characterize nodule shape, density, and border in computed tomography (CT) data. Analyzing the lung parenchyma surrounding the nodule has been minimally explored. We hypothesize that improved nodule classification is achievable by including features quantified from the surrounding lung tissue. To explore this hypothesis, we have developed expanded quantitative CT feature extraction techniques, including volumetric Laws texture energy measures for the parenchyma and nodule, border descriptors using ray-casting and rubber-band straightening, histogram features characterizing densities, and global lung measurements. Using stepwise forward selection and leave-one-case-out cross-validation, a neural network was used for classification. When applied to 50 nodules (22 malignant and 28 benign) from high-resolution CT scans, 52 features (8 nodule, 39 parenchymal, and 5 global) were statistically significant. Nodule-only features yielded an area under the ROC curve of 0.918 (including nodule size) and 0.872 (excluding nodule size). Performance was improved through inclusion of parenchymal (0.938) and global features (0.932). These results show a trend toward increased performance when the parenchyma is included, coupled with the large number of significant parenchymal features that support our hypothesis: the pulmonary parenchyma is influenced differentially by malignant versus benign nodules, assisting CAD-based nodule characterizations.
Improving the performance of univariate control charts for abnormal detection and classification
NASA Astrophysics Data System (ADS)
Yiakopoulos, Christos; Koutsoudaki, Maria; Gryllias, Konstantinos; Antoniadis, Ioannis
2017-03-01
Bearing failures in rotating machinery can cause machine breakdown and economical loss, if no effective actions are taken on time. Therefore, it is of prime importance to detect accurately the presence of faults, especially at their early stage, to prevent sequent damage and reduce costly downtime. The machinery fault diagnosis follows a roadmap of data acquisition, feature extraction and diagnostic decision making, in which mechanical vibration fault feature extraction is the foundation and the key to obtain an accurate diagnostic result. A challenge in this area is the selection of the most sensitive features for various types of fault, especially when the characteristics of failures are difficult to be extracted. Thus, a plethora of complex data-driven fault diagnosis methods are fed by prominent features, which are extracted and reduced through traditional or modern algorithms. Since most of the available datasets are captured during normal operating conditions, the last decade a number of novelty detection methods, able to work when only normal data are available, have been developed. In this study, a hybrid method combining univariate control charts and a feature extraction scheme is introduced focusing towards an abnormal change detection and classification, under the assumption that measurements under normal operating conditions of the machinery are available. The feature extraction method integrates the morphological operators and the Morlet wavelets. The effectiveness of the proposed methodology is validated on two different experimental cases with bearing faults, demonstrating that the proposed approach can improve the fault detection and classification performance of conventional control charts.
Inter-area correlations in the ventral visual pathway reflect feature integration
Freeman, Jeremy; Donner, Tobias H.; Heeger, David J.
2011-01-01
During object perception, the brain integrates simple features into representations of complex objects. A perceptual phenomenon known as visual crowding selectively interferes with this process. Here, we use crowding to characterize a neural correlate of feature integration. Cortical activity was measured with functional magnetic resonance imaging, simultaneously in multiple areas of the ventral visual pathway (V1–V4 and the visual word form area, VWFA, which responds preferentially to familiar letters), while human subjects viewed crowded and uncrowded letters. Temporal correlations between cortical areas were lower for crowded letters than for uncrowded letters, especially between V1 and VWFA. These differences in correlation were retinotopically specific, and persisted when attention was diverted from the letters. But correlation differences were not evident when we substituted the letters with grating patches that were not crowded under our stimulus conditions. We conclude that inter-area correlations reflect feature integration and are disrupted by crowding. We propose that crowding may perturb the transformations between neural representations along the ventral pathway that underlie the integration of features into objects. PMID:21521832
Feature Mining and Health Assessment for Gearboxes Using Run-Up/Coast-Down Signals
Zhao, Ming; Lin, Jing; Miao, Yonghao; Xu, Xiaoqiang
2016-01-01
Vibration signals measured in the run-up/coast-down (R/C) processes usually carry rich information about the health status of machinery. However, a major challenge in R/C signals analysis lies in how to exploit more diagnostic information, and how this information could be properly integrated to achieve a more reliable maintenance decision. Aiming at this problem, a framework of R/C signals analysis is presented for the health assessment of gearbox. In the proposed methodology, we first investigate the data preprocessing and feature selection issues for R/C signals. Based on that, a sparsity-guided feature enhancement scheme is then proposed to extract the weak phase jitter associated with gear defect. In order for an effective feature mining and integration under R/C, a generalized phase demodulation technique is further established to reveal the evolution of modulation feature with operating speed and rotation angle. The experimental results indicate that the proposed methodology could not only detect the presence of gear damage, but also offer a novel insight into the dynamic behavior of gearbox. PMID:27827831
Feature Mining and Health Assessment for Gearboxes Using Run-Up/Coast-Down Signals.
Zhao, Ming; Lin, Jing; Miao, Yonghao; Xu, Xiaoqiang
2016-11-02
Vibration signals measured in the run-up/coast-down (R/C) processes usually carry rich information about the health status of machinery. However, a major challenge in R/C signals analysis lies in how to exploit more diagnostic information, and how this information could be properly integrated to achieve a more reliable maintenance decision. Aiming at this problem, a framework of R/C signals analysis is presented for the health assessment of gearbox. In the proposed methodology, we first investigate the data preprocessing and feature selection issues for R/C signals. Based on that, a sparsity-guided feature enhancement scheme is then proposed to extract the weak phase jitter associated with gear defect. In order for an effective feature mining and integration under R/C, a generalized phase demodulation technique is further established to reveal the evolution of modulation feature with operating speed and rotation angle. The experimental results indicate that the proposed methodology could not only detect the presence of gear damage, but also offer a novel insight into the dynamic behavior of gearbox.
Communication target object recognition for D2D connection with feature size limit
NASA Astrophysics Data System (ADS)
Ok, Jiheon; Kim, Soochang; Kim, Young-hoon; Lee, Chulhee
2015-03-01
Recently, a new concept of device-to-device (D2D) communication, which is called "point-and-link communication" has attracted great attentions due to its intuitive and simple operation. This approach enables user to communicate with target devices without any pre-identification information such as SSIDs, MAC addresses by selecting the target image displayed on the user's own device. In this paper, we present an efficient object matching algorithm that can be applied to look(point)-and-link communications for mobile services. Due to the limited channel bandwidth and low computational power of mobile terminals, the matching algorithm should satisfy low-complexity, low-memory and realtime requirements. To meet these requirements, we propose fast and robust feature extraction by considering the descriptor size and processing time. The proposed algorithm utilizes a HSV color histogram, SIFT (Scale Invariant Feature Transform) features and object aspect ratios. To reduce the descriptor size under 300 bytes, a limited number of SIFT key points were chosen as feature points and histograms were binarized while maintaining required performance. Experimental results show the robustness and the efficiency of the proposed algorithm.
Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong
2012-01-01
Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between varies acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in human. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also biomakers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provide evidence for the specificity of acupoints. Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Comparing with other existing methods, LPFS shows better performance to select a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics.
2012-01-01
Background Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between varies acupoints, remains largely unknown, which hinders its widespread use. Results In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in human. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also biomakers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provide evidence for the specificity of acupoints. Conclusions Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Comparing with other existing methods, LPFS shows better performance to select a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics. PMID:23046877
Mammographic phenotypes of breast cancer risk driven by breast anatomy
NASA Astrophysics Data System (ADS)
Gastounioti, Aimilia; Oustimov, Andrew; Hsieh, Meng-Kang; Pantalone, Lauren; Conant, Emily F.; Kontos, Despina
2017-03-01
Image-derived features of breast parenchymal texture patterns have emerged as promising risk factors for breast cancer, paving the way towards personalized recommendations regarding women's cancer risk evaluation and screening. The main steps to extract texture features of the breast parenchyma are the selection of regions of interest (ROIs) where texture analysis is performed, the texture feature calculation and the texture feature summarization in case of multiple ROIs. In this study, we incorporate breast anatomy in these three key steps by (a) introducing breast anatomical sampling for the definition of ROIs, (b) texture feature calculation aligned with the structure of the breast and (c) weighted texture feature summarization considering the spatial position and the underlying tissue composition of each ROI. We systematically optimize this novel framework for parenchymal tissue characterization in a case-control study with digital mammograms from 424 women. We also compare the proposed approach with a conventional methodology, not considering breast anatomy, recently shown to enhance the case-control discriminatory capacity of parenchymal texture analysis. The case-control classification performance is assessed using elastic-net regression with 5-fold cross validation, where the evaluation measure is the area under the curve (AUC) of the receiver operating characteristic. Upon optimization, the proposed breast-anatomy-driven approach demonstrated a promising case-control classification performance (AUC=0.87). In the same dataset, the performance of conventional texture characterization was found to be significantly lower (AUC=0.80, DeLong's test p-value<0.05). Our results suggest that breast anatomy may further leverage the associations of parenchymal texture features with breast cancer, and may therefore be a valuable addition in pipelines aiming to elucidate quantitative mammographic phenotypes of breast cancer risk.
Angular momentum conservation law in light-front quantum field theory
Chiu, Kelly Yu-Ju; Brodsky, Stanley J.
2017-03-31
We prove the Lorentz invariance of the angular momentum conservation law and the helicity sum rule for relativistic composite systems in the light-front formulation. We explicitly show that j 3, the z -component of the angular momentum remains unchanged under Lorentz transformations generated by the light-front kinematical boost operators. The invariance of j 3 under Lorentz transformations is a feature unique to the front form. Applying the Lorentz invariance of the angular quantum number in the front form, we obtain a selection rule for the orbital angular momentum which can be used to eliminate certain interaction vertices in QED andmore » QCD. We also generalize the selection rule to any renormalizable theory and show that there exists an upper bound on the change of orbital angular momentum in scattering processes at any fixed order in perturbation theory.« less
Angular momentum conservation law in light-front quantum field theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiu, Kelly Yu-Ju; Brodsky, Stanley J.
We prove the Lorentz invariance of the angular momentum conservation law and the helicity sum rule for relativistic composite systems in the light-front formulation. We explicitly show that j 3, the z -component of the angular momentum remains unchanged under Lorentz transformations generated by the light-front kinematical boost operators. The invariance of j 3 under Lorentz transformations is a feature unique to the front form. Applying the Lorentz invariance of the angular quantum number in the front form, we obtain a selection rule for the orbital angular momentum which can be used to eliminate certain interaction vertices in QED andmore » QCD. We also generalize the selection rule to any renormalizable theory and show that there exists an upper bound on the change of orbital angular momentum in scattering processes at any fixed order in perturbation theory.« less
Angular momentum conservation law in light-front quantum field theory
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chiu, Kelly Yu-Ju; Brodsky, Stanley J.
We prove the Lorentz invariance of the angular momentum conservation law and the helicity sum rule for relativistic composite systems in the light-front formulation. We explicitly show that j 3 , the z -component of the angular momentum remains unchanged under Lorentz transformations generated by the light-front kinematical boost operators. The invariance of j 3 under Lorentz transformations is a feature unique to the front form. Applying the Lorentz invariance of the angular quantum number in the front form, we obtain a selection rule for the orbital angular momentum which can be used to eliminate certain interaction vertices in QEDmore » and QCD. We also generalize the selection rule to any renormalizable theory and show that there exists an upper bound on the change of orbital angular momentum in scattering processes at any fixed order in perturbation theory.« less
Naia, Luana; Ferreira, Ildete Luísa; Ferreiro, Elisabete; Rego, A Cristina
2017-02-19
Mitochondria play a relevant role in Ca 2+ buffering, governing energy metabolism and neuronal function. Huntington's disease (HD) and Alzheimer's disease (AD) are two neurodegenerative disorders that, although clinically distinct, share pathological features linked to selective brain damage. These include mitochondrial dysfunction, intracellular Ca 2+ deregulation and mitochondrial Ca 2+ handling deficits. Both diseases are associated with misfolding and aggregation of specific proteins that physically interact with mitochondria and interfere with endoplasmic reticulum (ER)/mitochondria-contact sites. Cumulating evidences indicate that impairment of mitochondrial Ca 2+ homeostasis underlies the susceptibility to selective neuronal death observed in HD and AD; however data obtained with different models and experimental approaches are not always consistent. In this review, we explore the recent literature on deregulation of mitochondrial Ca 2+ handling underlying the interplay between mitochondria and ER in HD and AD-associated neurodegeneration. Copyright © 2016 Elsevier Inc. All rights reserved.
JCDSA: a joint covariate detection tool for survival analysis on tumor expression profiles.
Wu, Yiming; Liu, Yanan; Wang, Yueming; Shi, Yan; Zhao, Xudong
2018-05-29
Survival analysis on tumor expression profiles has always been a key issue for subsequent biological experimental validation. It is crucial how to select features which closely correspond to survival time. Furthermore, it is important how to select features which best discriminate between low-risk and high-risk group of patients. Common features derived from the two aspects may provide variable candidates for prognosis of cancer. Based on the provided two-step feature selection strategy, we develop a joint covariate detection tool for survival analysis on tumor expression profiles. Significant features, which are not only consistent with survival time but also associated with the categories of patients with different survival risks, are chosen. Using the miRNA expression data (Level 3) of 548 patients with glioblastoma multiforme (GBM) as an example, miRNA candidates for prognosis of cancer are selected. The reliability of selected miRNAs using this tool is demonstrated by 100 simulations. Furthermore, It is discovered that significant covariates are not directly composed of individually significant variables. Joint covariate detection provides a viewpoint for selecting variables which are not individually but jointly significant. Besides, it helps to select features which are not only consistent with survival time but also associated with prognosis risk. The software is available at http://bio-nefu.com/resource/jcdsa .
NASA Astrophysics Data System (ADS)
Burns, G.; French, J.
2007-05-01
Spectral calibrations, airglow and possibly auroral contaminations, solar and telluric absorption features and the selection of transition probabilities can all influence rotational temperatures derived from measurements of hydroxyl airglow intensities. Consideration and examples are given of these influences. Measurements and analyses are outlined for data checking that should be undertaken if a hydroxyl airglow data set is to be used to determine climate trends. Multiple spectral calibrations should be conducted throughout the observing period, with regular inter- comparisons to other calibration sources also required. Uncertainties in spectral calibrations should be expressed as a temperature equivalent. Sufficient spectral scans at maximum resolution should be obtained under all extreme observing conditions (at the lowest solar depression angle operated both morning and night, moon and cloud both separately and combined, aurora and under conditions of enhanced atomic oxygen airglow, and under clear sky conditions but with high atmospheric water vapour content) so that an uncertainty for the derived rotational temperatures can be determined for the established data selection criteria. Once the varying emission and absorption features for the hydroxyl region of interest at your site are understood for the observing site, then the spectral resolution of the observing instrument can be reduced to increase temporal resolution with reasonable confidence. This confidence should be tested by investigating the average rotational temperatures derived from all possible line intensity ratios under the extreme observing conditions noted. If a spectral-fitting rotational temperature determination is used, the residuals from the fit should be summed and similarly examined. Hydroxyl measurements provide a cost effective means of monitoring the temperature of the climate-sensitive mesopause region on an almost nightly basis. If care is taken, they provide a valuable data set for investigating climate change.
Hosseini, Seyyed Abed; Khalilzadeh, Mohammad Ali; Naghibi-Sistani, Mohammad Bagher; Homam, Seyyed Mehran
2015-01-01
Background: This paper proposes a new emotional stress assessment system using multi-modal bio-signals. Electroencephalogram (EEG) is the reflection of brain activity and is widely used in clinical diagnosis and biomedical research. Methods: We design an efficient acquisition protocol to acquire the EEG signals in five channels (FP1, FP2, T3, T4 and Pz) and peripheral signals such as blood volume pulse, skin conductance (SC) and respiration, under images induction (calm-neutral and negatively excited) for the participants. The visual stimuli images are selected from the subset International Affective Picture System database. The qualitative and quantitative evaluation of peripheral signals are used to select suitable segments of EEG signals for improving the accuracy of signal labeling according to emotional stress states. After pre-processing, wavelet coefficients, fractal dimension, and Lempel-Ziv complexity are used to extract the features of the EEG signals. The vast number of features leads to the problem of dimensionality, which is solved using the genetic algorithm as a feature selection method. Results: The results show that the average classification accuracy is 89.6% for two categories of emotional stress states using the support vector machine (SVM). Conclusion: This is a great improvement in results compared to other similar researches. We achieve a noticeable improvement of 11.3% in accuracy using SVM classifier, in compared to previous studies. Therefore, a new fusion between EEG and peripheral signals are more robust in comparison to the separate signals. PMID:26622979
Random ensemble learning for EEG classification.
Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid
2018-01-01
Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.
Hosseini, Seyyed Abed; Khalilzadeh, Mohammad Ali; Naghibi-Sistani, Mohammad Bagher; Homam, Seyyed Mehran
2015-07-06
This paper proposes a new emotional stress assessment system using multi-modal bio-signals. Electroencephalogram (EEG) is the reflection of brain activity and is widely used in clinical diagnosis and biomedical research. We design an efficient acquisition protocol to acquire the EEG signals in five channels (FP1, FP2, T3, T4 and Pz) and peripheral signals such as blood volume pulse, skin conductance (SC) and respiration, under images induction (calm-neutral and negatively excited) for the participants. The visual stimuli images are selected from the subset International Affective Picture System database. The qualitative and quantitative evaluation of peripheral signals are used to select suitable segments of EEG signals for improving the accuracy of signal labeling according to emotional stress states. After pre-processing, wavelet coefficients, fractal dimension, and Lempel-Ziv complexity are used to extract the features of the EEG signals. The vast number of features leads to the problem of dimensionality, which is solved using the genetic algorithm as a feature selection method. The results show that the average classification accuracy is 89.6% for two categories of emotional stress states using the support vector machine (SVM). This is a great improvement in results compared to other similar researches. We achieve a noticeable improvement of 11.3% in accuracy using SVM classifier, in compared to previous studies. Therefore, a new fusion between EEG and peripheral signals are more robust in comparison to the separate signals.
Multiclass feature selection for improved pediatric brain tumor segmentation
NASA Astrophysics Data System (ADS)
Ahmed, Shaheen; Iftekharuddin, Khan M.
2012-03-01
In our previous work, we showed that fractal-based texture features are effective in detection, segmentation and classification of posterior-fossa (PF) pediatric brain tumor in multimodality MRI. We exploited an information theoretic approach such as Kullback-Leibler Divergence (KLD) for feature selection and ranking different texture features. We further incorporated the feature selection technique with segmentation method such as Expectation Maximization (EM) for segmentation of tumor T and non tumor (NT) tissues. In this work, we extend the two class KLD technique to multiclass for effectively selecting the best features for brain tumor (T), cyst (C) and non tumor (NT). We further obtain segmentation robustness for each tissue types by computing Bay's posterior probabilities and corresponding number of pixels for each tissue segments in MRI patient images. We evaluate improved tumor segmentation robustness using different similarity metric for 5 patients in T1, T2 and FLAIR modalities.
Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor
Alamedine, D.; Khalil, M.; Marque, C.
2013-01-01
Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The two last methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536
HIV-1 protease cleavage site prediction based on two-stage feature selection method.
Niu, Bing; Yuan, Xiao-Cheng; Roeper, Preston; Su, Qiang; Peng, Chun-Rong; Yin, Jing-Yuan; Ding, Juan; Li, HaiPeng; Lu, Wen-Cong
2013-03-01
Knowledge of the mechanism of HIV protease cleavage specificity is critical to the design of specific and effective HIV inhibitors. Searching for an accurate, robust, and rapid method to correctly predict the cleavage sites in proteins is crucial when searching for possible HIV inhibitors. In this article, HIV-1 protease specificity was studied using the correlation-based feature subset (CfsSubset) selection method combined with Genetic Algorithms method. Thirty important biochemical features were found based on a jackknife test from the original data set containing 4,248 features. By using the AdaBoost method with the thirty selected features the prediction model yields an accuracy of 96.7% for the jackknife test and 92.1% for an independent set test, with increased accuracy over the original dataset by 6.7% and 77.4%, respectively. Our feature selection scheme could be a useful technique for finding effective competitive inhibitors of HIV protease.
Luo, Zhenhua; Liu, Bingwan; Liu, Songtao; Jiang, Zhigang; Halbrook, Richard S
2014-01-01
Human and livestock related disturbances of habitat selection by ungulates are topics of global concern, as they have profound impacts on ungulate survival, population density, fitness, and management; however, differences in ungulate habitat use under different human and livestock densities are not fully understood. Mongolian gazelle (Procapra gutturosa), an endemic ungulate species on the Asia-European steppe, faces varying intensities of human and livestock disturbances in the area around Dalai Lake, China. To investigate how habitat selection strategies vary as disturbance intensity changes, we randomly set 20 transects containing 1486 plots, on which we conducted repeated surveys of 21 ecological factors during the winters in the period of 2005-2008. We aimed to: 1) determine the critical factors underlying habitat selection of the gazelles; 2) determine the gazelles' habitat preferences in this area; 3) determine how habitat selection varies with disturbance intensity and explore the primary underlying mechanism. We used binary-logistic regressions and information theoretic approaches to build best-fit habitat selection models, and calculated resource selection functions. Sixty-six herds, 522 individuals, and 499 tracks were recorded. Our results indicate that snow depth and aboveground biomass are the main factors affecting habitat selection by Mongolian gazelle throughout the district in winter. Thin snow cover and abundant aboveground biomass are preferred. Avoiding disturbance was the primary factor accounting for habitat selection in low disturbance areas, although with increasing human or live-stock-related disturbance, gazelle maintained a reduced distance to the source of the disturbance. Presumably owing to that shift, movement costs were more important as disturbance increased. In addition, Mongolian gazelle selected habitats based on topographical features promoting greater visibility where disturbance was lower. We suggest several management implications of our findings for this ungulate species will contribute to the effective conservation of Mongolian gazelle in the Dalai Lake area.
Accurate Diabetes Risk Stratification Using Machine Learning: Role of Missing Value and Outliers.
Maniruzzaman, Md; Rahman, Md Jahanur; Al-MehediHasan, Md; Suri, Harman S; Abedin, Md Menhazul; El-Baz, Ayman; Suri, Jasjit S
2018-04-10
Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world was diabetic in 2017. It is projected that this will reach nearly 10% by 2045. The major challenge is that when machine learning-based classifiers are applied to such data sets for risk stratification, leads to lower performance. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that missing values or outliers if replaced by a median configuration will yield higher risk stratification accuracy. This ML-based risk stratification is designed, optimized and evaluated, where: (i) the features are extracted and optimized from the six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest) under the hypothesis that both missing values and outliers when replaced by computed medians will improve the risk stratification accuracy. Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that on replacing the missing values and outliers by group median and median values, respectively and further using the combination of random forest feature selection and random forest classification technique yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value and area under the curve as: 92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in literature. The system was validated for its stability and reliability. RF-based model showed the best performance when outliers are replaced by median values.
Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie
2015-01-01
Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations.
2015-01-01
Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, j48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations. PMID:25885272
Yang, Mingxing; Li, Xiumin; Li, Zhibin; Ou, Zhimin; Liu, Ming; Liu, Suhuan; Li, Xuejun; Yang, Shuyu
2013-01-01
DNA microarray analysis is characterized by obtaining a large number of gene variables from a small number of observations. Cluster analysis is widely used to analyze DNA microarray data to make classification and diagnosis of disease. Because there are so many irrelevant and insignificant genes in a dataset, a feature selection approach must be employed in data analysis. The performance of cluster analysis of this high-throughput data depends on whether the feature selection approach chooses the most relevant genes associated with disease classes. Here we proposed a new method using multiple Orthogonal Partial Least Squares-Discriminant Analysis (mOPLS-DA) models and S-plots to select the most relevant genes to conduct three-class disease classification and prediction. We tested our method using Golub's leukemia microarray data. For three classes with subtypes, we proposed hierarchical orthogonal partial least squares-discriminant analysis (OPLS-DA) models and S-plots to select features for two main classes and their subtypes. For three classes in parallel, we employed three OPLS-DA models and S-plots to choose marker genes for each class. The power of feature selection to classify and predict three-class disease was evaluated using cluster analysis. Further, the general performance of our method was tested using four public datasets and compared with those of four other feature selection methods. The results revealed that our method effectively selected the most relevant features for disease classification and prediction, and its performance was better than that of the other methods.
Yin, Jia Yuan; Wan, Xiang; Zhang, Qian; Cui, Tie Jun
2015-07-23
We propose an ultra-wideband polarization-conversion metasurface with polarization selective and incident-angle insensitive characteristics using anchor-shaped units through multiple resonances. The broadband characteristic is optimized by the genetic optimization algorithm, from which the anchor-shaped unit cell generates five resonances, resulting in expansion of the operating frequency range. Owing to the structural feature of the proposed metasurface, only x- and y-polarized incident waves can reach high-efficiency polarization conversions, realizing the polarization-selective property. The proposed metasurface is also insensitive to the angle of incident waves, which indicates a promising future in modern communication systems. We fabricate and measure the proposed metasurface, and both the simulated and measured results show ultra-wide bandwidth for the x- and y-polarized incident waves.
An experimental sample of the field gamma-spectrometer based on solid state Si-photomultiplier
NASA Astrophysics Data System (ADS)
Denisov, Viktor; Korotaev, Valery; Titov, Aleksandr; Blokhina, Anastasia; Kleshchenok, Maksim
2017-05-01
Design of optical-electronic devices and systems involves the selection of such technical patterns that under given initial requirements and conditions are optimal according to certain criteria. The original characteristic of the OES for any purpose, defining its most important feature ability is a threshold detection. Based on this property, will be achieved the required functional quality of the device or system. Therefore, the original criteria and optimization methods have to subordinate to the idea of a better detectability. Generally reduces to the problem of optimal selection of the expected (predetermined) signals in the predetermined observation conditions. Thus the main purpose of optimization of the system when calculating its detectability is the choice of circuits and components that provide the most effective selection of a target.
Yin, Jia Yuan; Wan, Xiang; Zhang, Qian; Cui, Tie Jun
2015-01-01
We propose an ultra-wideband polarization-conversion metasurface with polarization selective and incident-angle insensitive characteristics using anchor-shaped units through multiple resonances. The broadband characteristic is optimized by the genetic optimization algorithm, from which the anchor-shaped unit cell generates five resonances, resulting in expansion of the operating frequency range. Owing to the structural feature of the proposed metasurface, only x- and y-polarized incident waves can reach high-efficiency polarization conversions, realizing the polarization-selective property. The proposed metasurface is also insensitive to the angle of incident waves, which indicates a promising future in modern communication systems. We fabricate and measure the proposed metasurface, and both the simulated and measured results show ultra-wide bandwidth for the x- and y-polarized incident waves. PMID:26202495
A feature selection approach towards progressive vector transmission over the Internet
NASA Astrophysics Data System (ADS)
Miao, Ru; Song, Jia; Feng, Min
2017-09-01
WebGIS has been applied for visualizing and sharing geospatial information popularly over the Internet. In order to improve the efficiency of the client applications, the web-based progressive vector transmission approach is proposed. Important features should be selected and transferred firstly, and the methods for measuring the importance of features should be further considered in the progressive transmission. However, studies on progressive transmission for large-volume vector data have mostly focused on map generalization in the field of cartography, but rarely discussed on the selection of geographic features quantitatively. This paper applies information theory for measuring the feature importance of vector maps. A measurement model for the amount of information of vector features is defined based upon the amount of information for dealing with feature selection issues. The measurement model involves geometry factor, spatial distribution factor and thematic attribute factor. Moreover, a real-time transport protocol (RTP)-based progressive transmission method is then presented to improve the transmission of vector data. To clearly demonstrate the essential methodology and key techniques, a prototype for web-based progressive vector transmission is presented, and an experiment of progressive selection and transmission for vector features is conducted. The experimental results indicate that our approach clearly improves the performance and end-user experience of delivering and manipulating large vector data over the Internet.
A new approach to modeling the influence of image features on fixation selection in scenes
Nuthmann, Antje; Einhäuser, Wolfgang
2015-01-01
Which image characteristics predict where people fixate when memorizing natural images? To answer this question, we introduce a new analysis approach that combines a novel scene-patch analysis with generalized linear mixed models (GLMMs). Our method allows for (1) directly describing the relationship between continuous feature value and fixation probability, and (2) assessing each feature's unique contribution to fixation selection. To demonstrate this method, we estimated the relative contribution of various image features to fixation selection: luminance and luminance contrast (low-level features); edge density (a mid-level feature); visual clutter and image segmentation to approximate local object density in the scene (higher-level features). An additional predictor captured the central bias of fixation. The GLMM results revealed that edge density, clutter, and the number of homogenous segments in a patch can independently predict whether image patches are fixated or not. Importantly, neither luminance nor contrast had an independent effect above and beyond what could be accounted for by the other predictors. Since the parcellation of the scene and the selection of features can be tailored to the specific research question, our approach allows for assessing the interplay of various factors relevant for fixation selection in scenes in a powerful and flexible manner. PMID:25752239
USDA-ARS?s Scientific Manuscript database
The availability of numerous spectral, spatial, and contextual features with object-based image analysis (OBIA) renders the selection of optimal features a time consuming and subjective process. While several feature election methods have been used in conjunction with OBIA, a robust comparison of th...
Processing Dynamic Image Sequences from a Moving Sensor.
1984-02-01
65 Roadsign Image Sequence ..... ................ ... 70 Roadsign Sequence with Redundant Features .. ........ . 79 Roadsign Subimage...Selected Feature Error Values .. ........ 66 2c. Industrial Image Selected Feature Local Search Values. .. .... 67 3ab. Roadsign Image Error Values...72 3c. Roadsign Image Local Search Values ............. 73 4ab. Roadsign Redundant Feature Error Values. ............ 8 4c. Roadsign
NASA Astrophysics Data System (ADS)
Adi Putra, Januar
2018-04-01
In this paper, we propose a new mammogram classification scheme to classify the breast tissues as normal or abnormal. Feature matrix is generated using Local Binary Pattern to all the detailed coefficients from 2D-DWT of the region of interest (ROI) of a mammogram. Feature selection is done by selecting the relevant features that affect the classification. Feature selection is used to reduce the dimensionality of data and features that are not relevant, in this paper the F-test and Ttest will be performed to the results of the feature extraction dataset to reduce and select the relevant feature. The best features are used in a Neural Network classifier for classification. In this research we use MIAS and DDSM database. In addition to the suggested scheme, the competent schemes are also simulated for comparative analysis. It is observed that the proposed scheme has a better say with respect to accuracy, specificity and sensitivity. Based on experiments, the performance of the proposed scheme can produce high accuracy that is 92.71%, while the lowest accuracy obtained is 77.08%.
Wide-undermining neck liposuction: tips and tricks for good results.
Innocenti, Alessandro; Andretto Amodeo, Chiara; Ciancio, Francesco
2014-08-01
Neck rejuvenation is one of the most sought after procedures in the restoration of the facial contour. Numerous techniques to improve the aesthetic outcome and reduce downtime have been described. In our experience, wide undermining and local anesthesia are key to obtaining good results in selected patients who want a quick recovery. This article presents our experience with liposuction of the neck and proposes some tips and tricks to master wide-undermining neck liposuction. From January 2005 to September 2012, a total of 118 patients (34 males, 84 females) underwent neck liposuction. Patient selection was based mainly on age and neck-aging features. The procedure was performed with the patients under local anesthesia. A wide rhomboid-shaped skin undermining of the submandibular and neck area was performed and a very thin fat layer was preserved. Dressing was applied for 3 days. Improvement of the neck's contour was observed in all patients. Redefinition of the cervicomandibular angle and skin redraping of the cervical area occurred in all cases. No further touch-ups were needed. Edema and ecchymosis resolved in a few days. No major complications were observed. Our results show that wide-undermining neck liposuction performed under local anesthesia is an effective and safe procedure. Patient selection based on age and anatomical features was fundamental to obtain impressive improvement of neck contour. This journal requires that authors assign a level of evidence to each article. For a full description of these Evidence-Based Medicine ratings, please refer to the Table of Contents or the online Instructions to Authors www.springer.com/00266 .
Lister, Callum; Arbuckle, Kevin; Jackson, Timothy N W; Debono, Jordan; Zdenek, Christina N; Dashevsky, Daniel; Dunstan, Nathan; Allen, Luke; Hay, Chris; Bush, Brian; Gillett, Amber; Fry, Bryan G
2017-11-01
A paradigm of venom research is adaptive evolution of toxins as part of a predator-prey chemical arms race. This study examined differential co-factor dependence, variations relative to dietary preference, and the impact upon relative neutralisation by antivenom of the procoagulant toxins in the venoms of a clade of Australian snakes. All genera were characterised by venoms rich in factor Xa which act upon endogenous prothrombin. Examination of toxin sequences revealed an extraordinary level of conservation, which indicates that adaptive evolution is not a feature of this toxin type. Consistent with this, the venoms did not display differences on the plasma of different taxa. Examination of the prothrombin target revealed endogenous blood proteins are under extreme negative selection pressure for diversification, this in turn puts a strong negative selection pressure upon the toxins as sequence diversification could result in a drift away from the target. Thus this study reveals that adaptive evolution is not a consistent feature in toxin evolution in cases where the target is under negative selection pressure for diversification. Consistent with this high level of toxin conservation, the antivenom showed extremely high-levels of cross-reactivity. There was however a strong statistical correlation between relative degree of phospholipid-dependence and clotting time, with the least dependent venoms producing faster clotting times than the other venoms even in the presence of phospholipid. The results of this study are not only of interest to evolutionary and ecological disciplines, but also have implications for clinical toxinology. Copyright © 2017 Elsevier Inc. All rights reserved.
Li, Han; Liu, Yashu; Gong, Pinghua; Zhang, Changshui; Ye, Jieping
2014-01-01
Identifying patients with Mild Cognitive Impairment (MCI) who are likely to convert to dementia has recently attracted increasing attention in Alzheimer's disease (AD) research. An accurate prediction of conversion from MCI to AD can aid clinicians to initiate treatments at early stage and monitor their effectiveness. However, existing prediction systems based on the original biosignatures are not satisfactory. In this paper, we propose to fit the prediction models using pairwise biosignature interactions, thus capturing higher-order relationship among biosignatures. Specifically, we employ hierarchical constraints and sparsity regularization to prune the high-dimensional input features. Based on the significant biosignatures and underlying interactions identified, we build classifiers to predict the conversion probability based on the selected features. We further analyze the underlying interaction effects of different biosignatures based on the so-called stable expectation scores. We have used 293 MCI subjects from Alzheimer's Disease Neuroimaging Initiative (ADNI) database that have MRI measurements at the baseline to evaluate the effectiveness of the proposed method. Our proposed method achieves better classification performance than state-of-the-art methods. Moreover, we discover several significant interactions predictive of MCI-to-AD conversion. These results shed light on improving the prediction performance using interaction features. PMID:24416143
Neural basis of processing threatening voices in a crowded auditory world
Mothes-Lasch, Martin; Becker, Michael P. I.; Miltner, Wolfgang H. R.
2016-01-01
In real world situations, we typically listen to voice prosody against a background crowded with auditory stimuli. Voices and background can both contain behaviorally relevant features and both can be selectively in the focus of attention. Adequate responses to threat-related voices under such conditions require that the brain unmixes reciprocally masked features depending on variable cognitive resources. It is unknown which brain systems instantiate the extraction of behaviorally relevant prosodic features under varying combinations of prosody valence, auditory background complexity and attentional focus. Here, we used event-related functional magnetic resonance imaging to investigate the effects of high background sound complexity and attentional focus on brain activation to angry and neutral prosody in humans. Results show that prosody effects in mid superior temporal cortex were gated by background complexity but not attention, while prosody effects in the amygdala and anterior superior temporal cortex were gated by attention but not background complexity, suggesting distinct emotional prosody processing limitations in different regions. Crucially, if attention was focused on the highly complex background, the differential processing of emotional prosody was prevented in all brain regions, suggesting that in a distracting, complex auditory world even threatening voices may go unnoticed. PMID:26884543
Poly(A) code analyses reveal key determinants for tissue-specific mRNA alternative polyadenylation
Weng, Lingjie; Li, Yi; Xie, Xiaohui; Shi, Yongsheng
2016-01-01
mRNA alternative polyadenylation (APA) is a critical mechanism for post-transcriptional gene regulation and is often regulated in a tissue- and/or developmental stage-specific manner. An ultimate goal for the APA field has been to be able to computationally predict APA profiles under different physiological or pathological conditions. As a first step toward this goal, we have assembled a poly(A) code for predicting tissue-specific poly(A) sites (PASs). Based on a compendium of over 600 features that have known or potential roles in PAS selection, we have generated and refined a machine-learning algorithm using multiple high-throughput sequencing-based data sets of tissue-specific and constitutive PASs. This code can predict tissue-specific PASs with >85% accuracy. Importantly, by analyzing the prediction performance based on different RNA features, we found that PAS context, including the distance between alternative PASs and the relative position of a PAS within the gene, is a key feature for determining the susceptibility of a PAS to tissue-specific regulation. Our poly(A) code provides a useful tool for not only predicting tissue-specific APA regulation, but also for studying its underlying molecular mechanisms. PMID:27095026
Nie, Zhi; Vairavan, Srinivasan; Narayan, Vaibhav A; Ye, Jieping; Li, Qingqin S
2018-01-01
Identification of risk factors of treatment resistance may be useful to guide treatment selection, avoid inefficient trial-and-error, and improve major depressive disorder (MDD) care. We extended the work in predictive modeling of treatment resistant depression (TRD) via partition of the data from the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) cohort into a training and a testing dataset. We also included data from a small yet completely independent cohort RIS-INT-93 as an external test dataset. We used features from enrollment and level 1 treatment (up to week 2 response only) of STAR*D to explore the feature space comprehensively and applied machine learning methods to model TRD outcome at level 2. For TRD defined using QIDS-C16 remission criteria, multiple machine learning models were internally cross-validated in the STAR*D training dataset and externally validated in both the STAR*D testing dataset and RIS-INT-93 independent dataset with an area under the receiver operating characteristic curve (AUC) of 0.70-0.78 and 0.72-0.77, respectively. The upper bound for the AUC achievable with the full set of features could be as high as 0.78 in the STAR*D testing dataset. Model developed using top 30 features identified using feature selection technique (k-means clustering followed by χ2 test) achieved an AUC of 0.77 in the STAR*D testing dataset. In addition, the model developed using overlapping features between STAR*D and RIS-INT-93, achieved an AUC of > 0.70 in both the STAR*D testing and RIS-INT-93 datasets. Among all the features explored in STAR*D and RIS-INT-93 datasets, the most important feature was early or initial treatment response or symptom severity at week 2. These results indicate that prediction of TRD prior to undergoing a second round of antidepressant treatment could be feasible even in the absence of biomarker data.
An Ant Colony Optimization Based Feature Selection for Web Page Classification
2014-01-01
The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and accuracy of the classification of web pages. In this study, we used an ant colony optimization (ACO) algorithm to select the best features, and then we applied the well-known C4.5, naive Bayes, and k nearest neighbor classifiers to assign class labels to web pages. We used the WebKB and Conference datasets in our experiments, and we showed that using the ACO for feature selection improves both accuracy and runtime performance of classification. We also showed that the proposed ACO based algorithm can select better features with respect to the well-known information gain and chi square feature selection methods. PMID:25136678
Hypergraph Based Feature Selection Technique for Medical Diagnosis.
Somu, Nivethitha; Raman, M R Gauthama; Kirthivasan, Kannan; Sriram, V S Shankar
2016-11-01
The impact of internet and information systems across various domains have resulted in substantial generation of multidimensional datasets. The use of data mining and knowledge discovery techniques to extract the original information contained in the multidimensional datasets play a significant role in the exploitation of complete benefit provided by them. The presence of large number of features in the high dimensional datasets incurs high computational cost in terms of computing power and time. Hence, feature selection technique has been commonly used to build robust machine learning models to select a subset of relevant features which projects the maximal information content of the original dataset. In this paper, a novel Rough Set based K - Helly feature selection technique (RSKHT) which hybridize Rough Set Theory (RST) and K - Helly property of hypergraph representation had been designed to identify the optimal feature subset or reduct for medical diagnostic applications. Experiments carried out using the medical datasets from the UCI repository proves the dominance of the RSKHT over other feature selection techniques with respect to the reduct size, classification accuracy and time complexity. The performance of the RSKHT had been validated using WEKA tool, which shows that RSKHT had been computationally attractive and flexible over massive datasets.
Neighborhood Structural Similarity Mapping for the Classification of Masses in Mammograms.
Rabidas, Rinku; Midya, Abhishek; Chakraborty, Jayasree
2018-05-01
In this paper, two novel feature extraction methods, using neighborhood structural similarity (NSS), are proposed for the characterization of mammographic masses as benign or malignant. Since gray-level distribution of pixels is different in benign and malignant masses, more regular and homogeneous patterns are visible in benign masses compared to malignant masses; the proposed method exploits the similarity between neighboring regions of masses by designing two new features, namely, NSS-I and NSS-II, which capture global similarity at different scales. Complementary to these global features, uniform local binary patterns are computed to enhance the classification efficiency by combining with the proposed features. The performance of the features are evaluated using the images from the mini-mammographic image analysis society (mini-MIAS) and digital database for screening mammography (DDSM) databases, where a tenfold cross-validation technique is incorporated with Fisher linear discriminant analysis, after selecting the optimal set of features using stepwise logistic regression method. The best area under the receiver operating characteristic curve of 0.98 with an accuracy of is achieved with the mini-MIAS database, while the same for the DDSM database is 0.93 with accuracy .
Reducing Sweeping Frequencies in Microwave NDT Employing Machine Learning Feature Selection
Moomen, Abdelniser; Ali, Abdulbaset; Ramahi, Omar M.
2016-01-01
Nondestructive Testing (NDT) assessment of materials’ health condition is useful for classifying healthy from unhealthy structures or detecting flaws in metallic or dielectric structures. Performing structural health testing for coated/uncoated metallic or dielectric materials with the same testing equipment requires a testing method that can work on metallics and dielectrics such as microwave testing. Reducing complexity and expenses associated with current diagnostic practices of microwave NDT of structural health requires an effective and intelligent approach based on feature selection and classification techniques of machine learning. Current microwave NDT methods in general based on measuring variation in the S-matrix over the entire operating frequency ranges of the sensors. For instance, assessing the health of metallic structures using a microwave sensor depends on the reflection or/and transmission coefficient measurements as a function of the sweeping frequencies of the operating band. The aim of this work is reducing sweeping frequencies using machine learning feature selection techniques. By treating sweeping frequencies as features, the number of top important features can be identified, then only the most influential features (frequencies) are considered when building the microwave NDT equipment. The proposed method of reducing sweeping frequencies was validated experimentally using a waveguide sensor and a metallic plate with different cracks. Among the investigated feature selection techniques are information gain, gain ratio, relief, chi-squared. The effectiveness of the selected features were validated through performance evaluations of various classification models; namely, Nearest Neighbor, Neural Networks, Random Forest, and Support Vector Machine. Results showed good crack classification accuracy rates after employing feature selection algorithms. PMID:27104533
On the use of feature selection to improve the detection of sea oil spills in SAR images
NASA Astrophysics Data System (ADS)
Mera, David; Bolon-Canedo, Veronica; Cotos, J. M.; Alonso-Betanzos, Amparo
2017-03-01
Fast and effective oil spill detection systems are crucial to ensure a proper response to environmental emergencies caused by hydrocarbon pollution on the ocean's surface. Typically, these systems uncover not only oil spills, but also a high number of look-alikes. The feature extraction is a critical and computationally intensive phase where each detected dark spot is independently examined. Traditionally, detection systems use an arbitrary set of features to discriminate between oil spills and look-alikes phenomena. However, Feature Selection (FS) methods based on Machine Learning (ML) have proved to be very useful in real domains for enhancing the generalization capabilities of the classifiers, while discarding the existing irrelevant features. In this work, we present a generic and systematic approach, based on FS methods, for choosing a concise and relevant set of features to improve the oil spill detection systems. We have compared five FS methods: Correlation-based feature selection (CFS), Consistency-based filter, Information Gain, ReliefF and Recursive Feature Elimination for Support Vector Machine (SVM-RFE). They were applied on a 141-input vector composed of features from a collection of outstanding studies. Selected features were validated via a Support Vector Machine (SVM) classifier and the results were compared with previous works. Test experiments revealed that the classifier trained with the 6-input feature vector proposed by SVM-RFE achieved the best accuracy and Cohen's kappa coefficient (87.1% and 74.06% respectively). This is a smaller feature combination with similar or even better classification accuracy than previous works. The presented finding allows to speed up the feature extraction phase without reducing the classifier accuracy. Experiments also confirmed the significance of the geometrical features since 75.0% of the different features selected by the applied FS methods as well as 66.67% of the proposed 6-input feature vector belong to this category.
Lin, Xiaohui; Li, Chao; Zhang, Yanhui; Su, Benzhe; Fan, Meng; Wei, Hai
2017-12-26
Feature selection is an important topic in bioinformatics. Defining informative features from complex high dimensional biological data is critical in disease study, drug development, etc. Support vector machine-recursive feature elimination (SVM-RFE) is an efficient feature selection technique that has shown its power in many applications. It ranks the features according to the recursive feature deletion sequence based on SVM. In this study, we propose a method, SVM-RFE-OA, which combines the classification accuracy rate and the average overlapping ratio of the samples to determine the number of features to be selected from the feature rank of SVM-RFE. Meanwhile, to measure the feature weights more accurately, we propose a modified SVM-RFE-OA (M-SVM-RFE-OA) algorithm that temporally screens out the samples lying in a heavy overlapping area in each iteration. The experiments on the eight public biological datasets show that the discriminative ability of the feature subset could be measured more accurately by combining the classification accuracy rate with the average overlapping degree of the samples compared with using the classification accuracy rate alone, and shielding the samples in the overlapping area made the calculation of the feature weights more stable and accurate. The methods proposed in this study can also be used with other RFE techniques to define potential biomarkers from big biological data.
Hadoop neural network for parallel and distributed feature selection.
Hodge, Victoria J; O'Keefe, Simon; Austin, Jim
2016-06-01
In this paper, we introduce a theoretical basis for a Hadoop-based neural network for parallel and distributed feature selection in Big Data sets. It is underpinned by an associative memory (binary) neural network which is highly amenable to parallel and distributed processing and fits with the Hadoop paradigm. There are many feature selectors described in the literature which all have various strengths and weaknesses. We present the implementation details of five feature selection algorithms constructed using our artificial neural network framework embedded in Hadoop YARN. Hadoop allows parallel and distributed processing. Each feature selector can be divided into subtasks and the subtasks can then be processed in parallel. Multiple feature selectors can also be processed simultaneously (in parallel) allowing multiple feature selectors to be compared. We identify commonalities among the five features selectors. All can be processed in the framework using a single representation and the overall processing can also be greatly reduced by only processing the common aspects of the feature selectors once and propagating these aspects across all five feature selectors as necessary. This allows the best feature selector and the actual features to select to be identified for large and high dimensional data sets through exploiting the efficiency and flexibility of embedding the binary associative-memory neural network in Hadoop. Copyright © 2015 The Authors. Published by Elsevier Ltd.. All rights reserved.
MGRA: Motion Gesture Recognition via Accelerometer.
Hong, Feng; You, Shujuan; Wei, Meiyu; Zhang, Yongtuo; Guo, Zhongwen
2016-04-13
Accelerometers have been widely embedded in most current mobile devices, enabling easy and intuitive operations. This paper proposes a Motion Gesture Recognition system (MGRA) based on accelerometer data only, which is entirely implemented on mobile devices and can provide users with real-time interactions. A robust and unique feature set is enumerated through the time domain, the frequency domain and singular value decomposition analysis using our motion gesture set containing 11,110 traces. The best feature vector for classification is selected, taking both static and mobile scenarios into consideration. MGRA exploits support vector machine as the classifier with the best feature vector. Evaluations confirm that MGRA can accommodate a broad set of gesture variations within each class, including execution time, amplitude and non-gestural movement. Extensive evaluations confirm that MGRA achieves higher accuracy under both static and mobile scenarios and costs less computation time and energy on an LG Nexus 5 than previous methods.
Prediction of protein-protein interactions based on PseAA composition and hybrid feature selection.
Liu, Liang; Cai, Yudong; Lu, Wencong; Feng, Kaiyan; Peng, Chunrong; Niu, Bing
2009-03-06
Based on pseudo amino acid (PseAA) composition and a novel hybrid feature selection frame, this paper presents a computational system to predict the PPIs (protein-protein interactions) using 8796 protein pairs. These pairs are coded by PseAA composition, resulting in 114 features. A hybrid feature selection system, mRMR-KNNs-wrapper, is applied to obtain an optimized feature set by excluding poor-performed and/or redundant features, resulting in 103 remaining features. Using the optimized 103-feature subset, a prediction model is trained and tested in the k-nearest neighbors (KNNs) learning system. This prediction model achieves an overall accurate prediction rate of 76.18%, evaluated by 10-fold cross-validation test, which is 1.46% higher than using the initial 114 features and is 6.51% higher than the 20 features, coded by amino acid compositions. The PPIs predictor, developed for this research, is available for public use at http://chemdata.shu.edu.cn/ppi.
NASA Astrophysics Data System (ADS)
Rahmadani, S.; Dongoran, A.; Zarlis, M.; Zakarias
2018-03-01
This paper discusses the problem of feature selection using genetic algorithms on a dataset for classification problems. The classification model used is the decicion tree (DT), and Naive Bayes. In this paper we will discuss how the Naive Bayes and Decision Tree models to overcome the classification problem in the dataset, where the dataset feature is selectively selected using GA. Then both models compared their performance, whether there is an increase in accuracy or not. From the results obtained shows an increase in accuracy if the feature selection using GA. The proposed model is referred to as GADT (GA-Decision Tree) and GANB (GA-Naive Bayes). The data sets tested in this paper are taken from the UCI Machine Learning repository.
Fuzzy feature selection based on interval type-2 fuzzy sets
NASA Astrophysics Data System (ADS)
Cherif, Sahar; Baklouti, Nesrine; Alimi, Adel; Snasel, Vaclav
2017-03-01
When dealing with real world data; noise, complexity, dimensionality, uncertainty and irrelevance can lead to low performance and insignificant judgment. Fuzzy logic is a powerful tool for controlling conflicting attributes which can have similar effects and close meanings. In this paper, an interval type-2 fuzzy feature selection is presented as a new approach for removing irrelevant features and reducing complexity. We demonstrate how can Feature Selection be joined with Interval Type-2 Fuzzy Logic for keeping significant features and hence reducing time complexity. The proposed method is compared with some other approaches. The results show that the number of attributes is proportionally small.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lee, J; Nishikawa, R; Reiser, I
Purpose: Segmentation quality can affect quantitative image feature analysis. The objective of this study is to examine the relationship between computed tomography (CT) image quality, segmentation performance, and quantitative image feature analysis. Methods: A total of 90 pathology proven breast lesions in 87 dedicated breast CT images were considered. An iterative image reconstruction (IIR) algorithm was used to obtain CT images with different quality. With different combinations of 4 variables in the algorithm, this study obtained a total of 28 different qualities of CT images. Two imaging tasks/objectives were considered: 1) segmentation and 2) classification of the lesion as benignmore » or malignant. Twenty-three image features were extracted after segmentation using a semi-automated algorithm and 5 of them were selected via a feature selection technique. Logistic regression was trained and tested using leave-one-out-cross-validation and its area under the ROC curve (AUC) was recorded. The standard deviation of a homogeneous portion and the gradient of a parenchymal portion of an example breast were used as an estimate of image noise and sharpness. The DICE coefficient was computed using a radiologist’s drawing on the lesion. Mean DICE and AUC were used as performance metrics for each of the 28 reconstructions. The relationship between segmentation and classification performance under different reconstructions were compared. Distributions (median, 95% confidence interval) of DICE and AUC for each reconstruction were also compared. Results: Moderate correlation (Pearson’s rho = 0.43, p-value = 0.02) between DICE and AUC values was found. However, the variation between DICE and AUC values for each reconstruction increased as the image sharpness increased. There was a combination of IIR parameters that resulted in the best segmentation with the worst classification performance. Conclusion: There are certain images that yield better segmentation or classification performance. The best segmentation Result does not necessarily lead to the best classification Result. This work has been supported in part by grants from the NIH R21-EB015053. R Nishikawa is receives royalties form Hologic, Inc.« less
Neural evidence reveals the rapid effects of reward history on selective attention.
MacLean, Mary H; Giesbrecht, Barry
2015-05-05
Selective attention is often framed as being primarily driven by two factors: task-relevance and physical salience. However, factors like selection and reward history, which are neither currently task-relevant nor physically salient, can reliably and persistently influence visual selective attention. The current study investigated the nature of the persistent effects of irrelevant, physically non-salient, reward-associated features. These features affected one of the earliest reliable neural indicators of visual selective attention in humans, the P1 event-related potential, measured one week after the reward associations were learned. However, the effects of reward history were moderated by current task demands. The modulation of visually evoked activity supports the hypothesis that reward history influences the innate salience of reward associated features, such that even when no longer relevant, nor physically salient, these features have a rapid, persistent, and robust effect on early visual selective attention. Copyright © 2015 Elsevier B.V. All rights reserved.
Rouillard, Andrew D; Hurle, Mark R; Agarwal, Pankaj
2018-05-01
Target selection is the first and pivotal step in drug discovery. An incorrect choice may not manifest itself for many years after hundreds of millions of research dollars have been spent. We collected a set of 332 targets that succeeded or failed in phase III clinical trials, and explored whether Omic features describing the target genes could predict clinical success. We obtained features from the recently published comprehensive resource: Harmonizome. Nineteen features appeared to be significantly correlated with phase III clinical trial outcomes, but only 4 passed validation schemes that used bootstrapping or modified permutation tests to assess feature robustness and generalizability while accounting for target class selection bias. We also used classifiers to perform multivariate feature selection and found that classifiers with a single feature performed as well in cross-validation as classifiers with more features (AUROC = 0.57 and AUPR = 0.81). The two predominantly selected features were mean mRNA expression across tissues and standard deviation of expression across tissues, where successful targets tended to have lower mean expression and higher expression variance than failed targets. This finding supports the conventional wisdom that it is favorable for a target to be present in the tissue(s) affected by a disease and absent from other tissues. Overall, our results suggest that it is feasible to construct a model integrating interpretable target features to inform target selection. We anticipate deeper insights and better models in the future, as researchers can reuse the data we have provided to improve methods for handling sample biases and learn more informative features. Code, documentation, and data for this study have been deposited on GitHub at https://github.com/arouillard/omic-features-successful-targets.
An Optimization-Based Method for Feature Ranking in Nonlinear Regression Problems.
Bravi, Luca; Piccialli, Veronica; Sciandrone, Marco
2017-04-01
In this paper, we consider the feature ranking problem, where, given a set of training instances, the task is to associate a score with the features in order to assess their relevance. Feature ranking is a very important tool for decision support systems, and may be used as an auxiliary step of feature selection to reduce the high dimensionality of real-world data. We focus on regression problems by assuming that the process underlying the generated data can be approximated by a continuous function (for instance, a feedforward neural network). We formally state the notion of relevance of a feature by introducing a minimum zero-norm inversion problem of a neural network, which is a nonsmooth, constrained optimization problem. We employ a concave approximation of the zero-norm function, and we define a smooth, global optimization problem to be solved in order to assess the relevance of the features. We present the new feature ranking method based on the solution of instances of the global optimization problem depending on the available training data. Computational experiments on both artificial and real data sets are performed, and point out that the proposed feature ranking method is a valid alternative to existing methods in terms of effectiveness. The obtained results also show that the method is costly in terms of CPU time, and this may be a limitation in the solution of large-dimensional problems.
Mahrooghy, Majid; Ashraf, Ahmed B; Daye, Dania; McDonald, Elizabeth S; Rosen, Mark; Mies, Carolyn; Feldman, Michael; Kontos, Despina
2015-06-01
Heterogeneity in cancer can affect response to therapy and patient prognosis. Histologic measures have classically been used to measure heterogeneity, although a reliable noninvasive measurement is needed both to establish baseline risk of recurrence and monitor response to treatment. Here, we propose using spatiotemporal wavelet kinetic features from dynamic contrast-enhanced magnetic resonance imaging to quantify intratumor heterogeneity in breast cancer. Tumor pixels are first partitioned into homogeneous subregions using pharmacokinetic measures. Heterogeneity wavelet kinetic (HetWave) features are then extracted from these partitions to obtain spatiotemporal patterns of the wavelet coefficients and the contrast agent uptake. The HetWave features are evaluated in terms of their prognostic value using a logistic regression classifier with genetic algorithm wrapper-based feature selection to classify breast cancer recurrence risk as determined by a validated gene expression assay. Receiver operating characteristic analysis and area under the curve (AUC) are computed to assess classifier performance using leave-one-out cross validation. The HetWave features outperform other commonly used features (AUC = 0.88 HetWave versus 0.70 standard features). The combination of HetWave and standard features further increases classifier performance (AUCs 0.94). The rate of the spatial frequency pattern over the pharmacokinetic partitions can provide valuable prognostic information. HetWave could be a powerful feature extraction approach for characterizing tumor heterogeneity, providing valuable prognostic information.
Lou, Wangchao; Wang, Xiaoqing; Chen, Fan; Chen, Yixiao; Jiang, Bo; Zhang, Hua
2014-01-01
Developing an efficient method for determination of the DNA-binding proteins, due to their vital roles in gene regulation, is becoming highly desired since it would be invaluable to advance our understanding of protein functions. In this study, we proposed a new method for the prediction of the DNA-binding proteins, by performing the feature rank using random forest and the wrapper-based feature selection using forward best-first search strategy. The features comprise information from primary sequence, predicted secondary structure, predicted relative solvent accessibility, and position specific scoring matrix. The proposed method, called DBPPred, used Gaussian naïve Bayes as the underlying classifier since it outperformed five other classifiers, including decision tree, logistic regression, k-nearest neighbor, support vector machine with polynomial kernel, and support vector machine with radial basis function. As a result, the proposed DBPPred yields the highest average accuracy of 0.791 and average MCC of 0.583 according to the five-fold cross validation with ten runs on the training benchmark dataset PDB594. Subsequently, blind tests on the independent dataset PDB186 by the proposed model trained on the entire PDB594 dataset and by other five existing methods (including iDNA-Prot, DNA-Prot, DNAbinder, DNABIND and DBD-Threader) were performed, resulting in that the proposed DBPPred yielded the highest accuracy of 0.769, MCC of 0.538, and AUC of 0.790. The independent tests performed by the proposed DBPPred on completely a large non-DNA binding protein dataset and two RNA binding protein datasets also showed improved or comparable quality when compared with the relevant prediction methods. Moreover, we observed that majority of the selected features by the proposed method are statistically significantly different between the mean feature values of the DNA-binding and the non DNA-binding proteins. All of the experimental results indicate that the proposed DBPPred can be an alternative perspective predictor for large-scale determination of DNA-binding proteins. PMID:24475169
Feature selection using probabilistic prediction of support vector regression.
Yang, Jian-Bo; Ong, Chong-Jin
2011-06-01
This paper presents a new wrapper-based feature selection method for support vector regression (SVR) using its probabilistic predictions. The method computes the importance of a feature by aggregating the difference, over the feature space, of the conditional density functions of the SVR prediction with and without the feature. As the exact computation of this importance measure is expensive, two approximations are proposed. The effectiveness of the measure using these approximations, in comparison to several other existing feature selection methods for SVR, is evaluated on both artificial and real-world problems. The result of the experiments show that the proposed method generally performs better than, or at least as well as, the existing methods, with notable advantage when the dataset is sparse.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng
An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. Feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method and the dimension of feature space was reduced to 12. Classification of Chinese liquors was performed by using back propagation artificial neural network (BP-ANN), linear discrimination analysis (LDA), and a multi-linear classifier. The classificationmore » rate of the multi-linear classifier was 97.22%, which was higher than LDA and BP-ANN. Finally the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier and classification rate was 98.75% and 100%, respectively.« less
Tsoi, B; O'Reilly, D; Jegathisawaran, J; Tarride, J-E; Blackhouse, G; Goeree, R
2015-06-17
In constructing or appraising a health economic model, an early consideration is whether the modelling approach selected is appropriate for the given decision problem. Frameworks and taxonomies that distinguish between modelling approaches can help make this decision more systematic and this study aims to identify and compare the decision frameworks proposed to date on this topic area. A systematic review was conducted to identify frameworks from peer-reviewed and grey literature sources. The following databases were searched: OVID Medline and EMBASE; Wiley's Cochrane Library and Health Economic Evaluation Database; PubMed; and ProQuest. Eight decision frameworks were identified, each focused on a different set of modelling approaches and employing a different collection of selection criterion. The selection criteria can be categorized as either: (i) structural features (i.e. technical elements that are factual in nature) or (ii) practical considerations (i.e. context-dependent attributes). The most commonly mentioned structural features were population resolution (i.e. aggregate vs. individual) and interactivity (i.e. static vs. dynamic). Furthermore, understanding the needs of the end-users and stakeholders was frequently incorporated as a criterion within these frameworks. There is presently no universally-accepted framework for selecting an economic modelling approach. Rather, each highlights different criteria that may be of importance when determining whether a modelling approach is appropriate. Further discussion is thus necessary as the modelling approach selected will impact the validity of the underlying economic model and have downstream implications on its efficiency, transparency and relevance to decision-makers.
Purfield, Deirdre C.; McParland, Sinead; Wall, Eamon; Berry, Donagh P.
2017-01-01
Domestication and the subsequent selection of animals for either economic or morphological features can leave a variety of imprints on the genome of a population. Genomic regions subjected to high selective pressures often show reduced genetic diversity and frequent runs of homozygosity (ROH). Therefore, the objective of the present study was to use 42,182 autosomal SNPs to identify genomic regions in 3,191 sheep from six commercial breeds subjected to selection pressure and to quantify the genetic diversity within each breed using ROH. In addition, the historical effective population size of each breed was also estimated and, in conjunction with ROH, was used to elucidate the demographic history of the six breeds. ROH were common in the autosomes of animals in the present study, but the observed breed differences in patterns of ROH length and burden suggested differences in breed effective population size and recent management. ROH provided a sufficient predictor of the pedigree inbreeding coefficient, with an estimated correlation between both measures of 0.62. Genomic regions under putative selection were identified using two complementary algorithms; the fixation index and hapFLK. The identified regions under putative selection included candidate genes associated with skin pigmentation, body size and muscle formation; such characteristics are often sought after in modern-day breeding programs. These regions of selection frequently overlapped with high ROH regions both within and across breeds. Multiple yet uncharacterised genes also resided within putative regions of selection. This further substantiates the need for a more comprehensive annotation of the sheep genome as these uncharacterised genes may contribute to traits of interest in the animal sciences. Despite this, the regions identified as under putative selection in the current study provide an insight into the mechanisms leading to breed differentiation and genetic variation in meat production. PMID:28463982
Modeling the Evolution of Beliefs Using an Attentional Focus Mechanism
Marković, Dimitrije; Gläscher, Jan; Bossaerts, Peter; O’Doherty, John; Kiebel, Stefan J.
2015-01-01
For making decisions in everyday life we often have first to infer the set of environmental features that are relevant for the current task. Here we investigated the computational mechanisms underlying the evolution of beliefs about the relevance of environmental features in a dynamical and noisy environment. For this purpose we designed a probabilistic Wisconsin card sorting task (WCST) with belief solicitation, in which subjects were presented with stimuli composed of multiple visual features. At each moment in time a particular feature was relevant for obtaining reward, and participants had to infer which feature was relevant and report their beliefs accordingly. To test the hypothesis that attentional focus modulates the belief update process, we derived and fitted several probabilistic and non-probabilistic behavioral models, which either incorporate a dynamical model of attentional focus, in the form of a hierarchical winner-take-all neuronal network, or a diffusive model, without attention-like features. We used Bayesian model selection to identify the most likely generative model of subjects’ behavior and found that attention-like features in the behavioral model are essential for explaining subjects’ responses. Furthermore, we demonstrate a method for integrating both connectionist and Bayesian models of decision making within a single framework that allowed us to infer hidden belief processes of human subjects. PMID:26495984
NASA Astrophysics Data System (ADS)
Klomp, Sander; van der Sommen, Fons; Swager, Anne-Fré; Zinger, Svitlana; Schoon, Erik J.; Curvers, Wouter L.; Bergman, Jacques J.; de With, Peter H. N.
2017-03-01
Volumetric Laser Endomicroscopy (VLE) is a promising technique for the detection of early neoplasia in Barrett's Esophagus (BE). VLE generates hundreds of high resolution, grayscale, cross-sectional images of the esophagus. However, at present, classifying these images is a time consuming and cumbersome effort performed by an expert using a clinical prediction model. This paper explores the feasibility of using computer vision techniques to accurately predict the presence of dysplastic tissue in VLE BE images. Our contribution is threefold. First, a benchmarking is performed for widely applied machine learning techniques and feature extraction methods. Second, three new features based on the clinical detection model are proposed, having superior classification accuracy and speed, compared to earlier work. Third, we evaluate automated parameter tuning by applying simple grid search and feature selection methods. The results are evaluated on a clinically validated dataset of 30 dysplastic and 30 non-dysplastic VLE images. Optimal classification accuracy is obtained by applying a support vector machine and using our modified Haralick features and optimal image cropping, obtaining an area under the receiver operating characteristic of 0.95 compared to the clinical prediction model at 0.81. Optimal execution time is achieved using a proposed mean and median feature, which is extracted at least factor 2.5 faster than alternative features with comparable performance.
Computational mechanisms underlying cortical responses to the affordance properties of visual scenes
Epstein, Russell A.
2018-01-01
Biologically inspired deep convolutional neural networks (CNNs), trained for computer vision tasks, have been found to predict cortical responses with remarkable accuracy. However, the internal operations of these models remain poorly understood, and the factors that account for their success are unknown. Here we develop a set of techniques for using CNNs to gain insights into the computational mechanisms underlying cortical responses. We focused on responses in the occipital place area (OPA), a scene-selective region of dorsal occipitoparietal cortex. In a previous study, we showed that fMRI activation patterns in the OPA contain information about the navigational affordances of scenes; that is, information about where one can and cannot move within the immediate environment. We hypothesized that this affordance information could be extracted using a set of purely feedforward computations. To test this idea, we examined a deep CNN with a feedforward architecture that had been previously trained for scene classification. We found that responses in the CNN to scene images were highly predictive of fMRI responses in the OPA. Moreover the CNN accounted for the portion of OPA variance relating to the navigational affordances of scenes. The CNN could thus serve as an image-computable candidate model of affordance-related responses in the OPA. We then ran a series of in silico experiments on this model to gain insights into its internal operations. These analyses showed that the computation of affordance-related features relied heavily on visual information at high-spatial frequencies and cardinal orientations, both of which have previously been identified as low-level stimulus preferences of scene-selective visual cortex. These computations also exhibited a strong preference for information in the lower visual field, which is consistent with known retinotopic biases in the OPA. Visualizations of feature selectivity within the CNN suggested that affordance-based responses encoded features that define the layout of the spatial environment, such as boundary-defining junctions and large extended surfaces. Together, these results map the sensory functions of the OPA onto a fully quantitative model that provides insights into its visual computations. More broadly, they advance integrative techniques for understanding visual cortex across multiple level of analysis: from the identification of cortical sensory functions to the modeling of their underlying algorithms. PMID:29684011
Age-related differences in agenda-driven monitoring of format and task information
Mitchell, Karen J.; Ankudowich, Elizabeth; Durbin, Kelly A.; Greene, Erich J.; Johnson, Marcia K.
2013-01-01
Age-related source memory deficits may arise, in part, from changes in the agenda-driven processes that control what features of events are relevant during remembering. Using fMRI, we compared young and older adults on tests assessing source memory for format (picture, word) or encoding task (self-, other-referential), as well as on old-new recognition. Behaviorally, relative to old-new recognition, older adults showed disproportionate and equivalent deficits on both source tests compared to young adults. At encoding, both age groups showed expected activation associated with format in posterior visual processing areas, and with task in medial prefrontal cortex. At test, the groups showed similar selective, agenda-related activity in these representational areas. There were, however, marked age differences in the activity of control regions in lateral and medial prefrontal cortex and lateral parietal cortex. Results of correlation analyses were consistent with the idea that young adults had greater trial-by-trial agenda-driven modulation of activity (i.e., greater selectivity) than did older adults in representational regions. Thus, under selective remembering conditions where older adults showed clear differential regional activity in representational areas depending on type of test, they also showed evidence of disrupted frontal and parietal function and reduced item-by-item modulation of test-appropriate features. This pattern of results is consistent with an age-related deficit in the engagement of selective reflective attention. PMID:23357375
Li, Baopu; Meng, Max Q-H
2012-05-01
Tumor in digestive tract is a common disease and wireless capsule endoscopy (WCE) is a relatively new technology to examine diseases for digestive tract especially for small intestine. This paper addresses the problem of automatic recognition of tumor for WCE images. Candidate color texture feature that integrates uniform local binary pattern and wavelet is proposed to characterize WCE images. The proposed features are invariant to illumination change and describe multiresolution characteristics of WCE images. Two feature selection approaches based on support vector machine, sequential forward floating selection and recursive feature elimination, are further employed to refine the proposed features for improving the detection accuracy. Extensive experiments validate that the proposed computer-aided diagnosis system achieves a promising tumor recognition accuracy of 92.4% in WCE images on our collected data.
Ahmed, Shaheen; Iftekharuddin, Khan M; Vossough, Arastoo
2011-03-01
Our previous works suggest that fractal texture feature is useful to detect pediatric brain tumor in multimodal MRI. In this study, we systematically investigate efficacy of using several different image features such as intensity, fractal texture, and level-set shape in segmentation of posterior-fossa (PF) tumor for pediatric patients. We explore effectiveness of using four different feature selection and three different segmentation techniques, respectively, to discriminate tumor regions from normal tissue in multimodal brain MRI. We further study the selective fusion of these features for improved PF tumor segmentation. Our result suggests that Kullback-Leibler divergence measure for feature ranking and selection and the expectation maximization algorithm for feature fusion and tumor segmentation offer the best results for the patient data in this study. We show that for T1 and fluid attenuation inversion recovery (FLAIR) MRI modalities, the best PF tumor segmentation is obtained using the texture feature such as multifractional Brownian motion (mBm) while that for T2 MRI is obtained by fusing level-set shape with intensity features. In multimodality fused MRI (T1, T2, and FLAIR), mBm feature offers the best PF tumor segmentation performance. We use different similarity metrics to evaluate quality and robustness of these selected features for PF tumor segmentation in MRI for ten pediatric patients.
Iliyasu, Abdullah M; Fatichah, Chastine
2017-12-19
A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of traditional fuzzy k -nearest neighbours (Fuzzy k -NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles) that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k -NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regards to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k -NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in number cell features, which is crucial for effective cervical cancer detection and diagnosis.
A multimedia guide to spinal cord injury: empowerment through self instruction.
Van Biervliet, A; Gest, T R
1995-01-01
The Spinal Cord Injury (SCI) Project is developing a series of instructional modules on SCI that will be distributed via CD-ROM for patient and family education. The modules are based on an instructional program and patient manual distributed by the Paralyzed Veterans of America. The program includes topics ranging from the anatomy and physiology of spinal cord injuries to legal rights established under the Americans With Disabilities Act. The SCI project expands on the instructional manual by combining digital multimedia techniques with motivational features such as games and personal guides. The user selects a personal guide from among a selection of individuals with spinal cord injuries to guide them through tutorials that include accounts of personal experiences. The guides appear in small video windows at various points throughout the tutorials and give personal insight into the topic at hand. The user can also query the other guides to hear their views on a topic. The user interface incorporates 'seamless access' features, which enable persons with a wide range of disabilities to use the program. Innovative features of these modules are the use of personal instructional guides, motivational games and activities, incorporation of alternative input or access strategies, and the use of high quality, low cost, multimedia production strategies.
In-process and post-process measurements of drill wear for control of the drilling process
NASA Astrophysics Data System (ADS)
Liu, Tien-I.; Liu, George; Gao, Zhiyu
2011-12-01
Optical inspection was used in this research for the post-process measurements of drill wear. A precision toolmakers" microscope was used. Indirect index, cutting force, is used for in-process drill wear measurements. Using in-process measurements to estimate the drill wear for control purpose can decrease the operation cost and enhance the product quality and safety. The challenge is to correlate the in-process cutting force measurements with the post-process optical inspection of drill wear. To find the most important feature, the energy principle was used in this research. It is necessary to select only the cutting force feature which shows the highest sensitivity to drill wear. The best feature selected is the peak of torque in the drilling process. Neuro-fuzzy systems were used for correlation purposes. The Adaptive-Network-Based Fuzzy Inference System (ANFIS) can construct fuzzy rules with membership functions to generate an input-output pair. A 1x6 ANFIS architecture with product of sigmoid membership functions can in-process measure the drill wear with an error as low as 0.15%. This is extremely important for control of the drilling process. Furthermore, the measurement of drill wear was performed under different drilling conditions. This shows that ANFIS has the capability of generalization.
Auditory Attentional Control and Selection during Cocktail Party Listening
Hill, Kevin T.
2010-01-01
In realistic auditory environments, people rely on both attentional control and attentional selection to extract intelligible signals from a cluttered background. We used functional magnetic resonance imaging to examine auditory attention to natural speech under such high processing-load conditions. Participants attended to a single talker in a group of 3, identified by the target talker's pitch or spatial location. A catch-trial design allowed us to distinguish activity due to top-down control of attention versus attentional selection of bottom-up information in both the spatial and spectral (pitch) feature domains. For attentional control, we found a left-dominant fronto-parietal network with a bias toward spatial processing in dorsal precentral sulcus and superior parietal lobule, and a bias toward pitch in inferior frontal gyrus. During selection of the talker, attention modulated activity in left intraparietal sulcus when using talker location and in bilateral but right-dominant superior temporal sulcus when using talker pitch. We argue that these networks represent the sources and targets of selective attention in rich auditory environments. PMID:19574393
Orientation-Selective Retinal Circuits in Vertebrates.
Antinucci, Paride; Hindges, Robert
2018-01-01
Visual information is already processed in the retina before it is transmitted to higher visual centers in the brain. This includes the extraction of salient features from visual scenes, such as motion directionality or contrast, through neurons belonging to distinct neural circuits. Some retinal neurons are tuned to the orientation of elongated visual stimuli. Such 'orientation-selective' neurons are present in the retinae of most, if not all, vertebrate species analyzed to date, with species-specific differences in frequency and degree of tuning. In some cases, orientation-selective neurons have very stereotyped functional and morphological properties suggesting that they represent distinct cell types. In this review, we describe the retinal cell types underlying orientation selectivity found in various vertebrate species, and highlight their commonalities and differences. In addition, we discuss recent studies that revealed the cellular, synaptic and circuit mechanisms at the basis of retinal orientation selectivity. Finally, we outline the significance of these findings in shaping our current understanding of how this fundamental neural computation is implemented in the visual systems of vertebrates.
Minimizing the semantic gap in biomedical content-based image retrieval
NASA Astrophysics Data System (ADS)
Guan, Haiying; Antani, Sameer; Long, L. Rodney; Thoma, George R.
2010-03-01
A major challenge in biomedical Content-Based Image Retrieval (CBIR) is to achieve meaningful mappings that minimize the semantic gap between the high-level biomedical semantic concepts and the low-level visual features in images. This paper presents a comprehensive learning-based scheme toward meeting this challenge and improving retrieval quality. The article presents two algorithms: a learning-based feature selection and fusion algorithm and the Ranking Support Vector Machine (Ranking SVM) algorithm. The feature selection algorithm aims to select 'good' features and fuse them using different similarity measurements to provide a better representation of the high-level concepts with the low-level image features. Ranking SVM is applied to learn the retrieval rank function and associate the selected low-level features with query concepts, given the ground-truth ranking of the training samples. The proposed scheme addresses four major issues in CBIR to improve the retrieval accuracy: image feature extraction, selection and fusion, similarity measurements, the association of the low-level features with high-level concepts, and the generation of the rank function to support high-level semantic image retrieval. It models the relationship between semantic concepts and image features, and enables retrieval at the semantic level. We apply it to the problem of vertebra shape retrieval from a digitized spine x-ray image set collected by the second National Health and Nutrition Examination Survey (NHANES II). The experimental results show an improvement of up to 41.92% in the mean average precision (MAP) over conventional image similarity computation methods.
ERIC Educational Resources Information Center
Eimer, Martin; Kiss, Monika; Nicholas, Susan
2011-01-01
When target-defining features are specified in advance, attentional target selection in visual search is controlled by preparatory top-down task sets. We used ERP measures to study voluntary target selection in the absence of such feature-specific task sets, and to compare it to selection that is guided by advance knowledge about target features.…
NASA Astrophysics Data System (ADS)
Zhongqin, G.; Chen, Y.
2017-12-01
Abstract Quickly identify the spatial distribution of landslides automatically is essential for the prevention, mitigation and assessment of the landslide hazard. It's still a challenging job owing to the complicated characteristics and vague boundary of the landslide areas on the image. The high resolution remote sensing image has multi-scales, complex spatial distribution and abundant features, the object-oriented image classification methods can make full use of the above information and thus effectively detect the landslides after the hazard happened. In this research we present a new semi-supervised workflow, taking advantages of recent object-oriented image analysis and machine learning algorithms to quick locate the different origins of landslides of some areas on the southwest part of China. Besides a sequence of image segmentation, feature selection, object classification and error test, this workflow ensemble the feature selection and classifier selection. The feature this study utilized were normalized difference vegetation index (NDVI) change, textural feature derived from the gray level co-occurrence matrices (GLCM), spectral feature and etc. The improvement of this study shows this algorithm significantly removes some redundant feature and the classifiers get fully used. All these improvements lead to a higher accuracy on the determination of the shape of landslides on the high resolution remote sensing image, in particular the flexibility aimed at different kinds of landslides.
Interictal epileptiform discharge characteristics underlying expert interrater agreement.
Bagheri, Elham; Dauwels, Justin; Dean, Brian C; Waters, Chad G; Westover, M Brandon; Halford, Jonathan J
2017-10-01
The presence of interictal epileptiform discharges (IED) in the electroencephalogram (EEG) is a key finding in the medical workup of a patient with suspected epilepsy. However, inter-rater agreement (IRA) regarding the presence of IED is imperfect, leading to incorrect and delayed diagnoses. An improved understanding of which IED attributes mediate expert IRA might help in developing automatic methods for IED detection able to emulate the abilities of experts. Therefore, using a set of IED scored by a large number of experts, we set out to determine which attributes of IED predict expert agreement regarding the presence of IED. IED were annotated on a 5-point scale by 18 clinical neurophysiologists within 200 30-s EEG segments from recordings of 200 patients. 5538 signal analysis features were extracted from the waveforms, including wavelet coefficients, morphological features, signal energy, nonlinear energy operator response, electrode location, and spectrogram features. Feature selection was performed by applying elastic net regression and support vector regression (SVR) was applied to predict expert opinion, with and without the feature selection procedure and with and without several types of signal normalization. Multiple types of features were useful for predicting expert annotations, but particular types of wavelet features performed best. Local EEG normalization also enhanced best model performance. As the size of the group of EEGers used to train the models was increased, the performance of the models leveled off at a group size of around 11. The features that best predict inter-rater agreement among experts regarding the presence of IED are wavelet features, using locally standardized EEG. Our models for predicting expert opinion based on EEGer's scores perform best with a large group of EEGers (more than 10). By examining a large group of EEG signal analysis features we found that wavelet features with certain wavelet basis functions performed best to identify IEDs. Local normalization also improves predictability, suggesting the importance of IED morphology over amplitude-based features. Although most IED detection studies in the past have used opinion from three or fewer experts, our study suggests a "wisdom of the crowd" effect, such that pooling over a larger number of expert opinions produces a better correlation between expert opinion and objectively quantifiable features of the EEG. Copyright © 2017 International Federation of Clinical Neurophysiology. Published by Elsevier B.V. All rights reserved.
Attention improves encoding of task-relevant features in the human visual cortex
Jehee, Janneke F.M.; Brady, Devin K.; Tong, Frank
2011-01-01
When spatial attention is directed towards a particular stimulus, increased activity is commonly observed in corresponding locations of the visual cortex. Does this attentional increase in activity indicate improved processing of all features contained within the attended stimulus, or might spatial attention selectively enhance the features relevant to the observer’s task? We used fMRI decoding methods to measure the strength of orientation-selective activity patterns in the human visual cortex while subjects performed either an orientation or contrast discrimination task, involving one of two laterally presented gratings. Greater overall BOLD activation with spatial attention was observed in areas V1-V4 for both tasks. However, multivariate pattern analysis revealed that orientation-selective responses were enhanced by attention only when orientation was the task-relevant feature, and not when the grating’s contrast had to be attended. In a second experiment, observers discriminated the orientation or color of a specific lateral grating. Here, orientation-selective responses were enhanced in both tasks but color-selective responses were enhanced only when color was task-relevant. In both experiments, task-specific enhancement of feature-selective activity was not confined to the attended stimulus location, but instead spread to other locations in the visual field, suggesting the concurrent involvement of a global feature-based attentional mechanism. These results suggest that attention can be remarkably selective in its ability to enhance particular task-relevant features, and further reveal that increases in overall BOLD amplitude are not necessarily accompanied by improved processing of stimulus information. PMID:21632942
Attention improves encoding of task-relevant features in the human visual cortex.
Jehee, Janneke F M; Brady, Devin K; Tong, Frank
2011-06-01
When spatial attention is directed toward a particular stimulus, increased activity is commonly observed in corresponding locations of the visual cortex. Does this attentional increase in activity indicate improved processing of all features contained within the attended stimulus, or might spatial attention selectively enhance the features relevant to the observer's task? We used fMRI decoding methods to measure the strength of orientation-selective activity patterns in the human visual cortex while subjects performed either an orientation or contrast discrimination task, involving one of two laterally presented gratings. Greater overall BOLD activation with spatial attention was observed in visual cortical areas V1-V4 for both tasks. However, multivariate pattern analysis revealed that orientation-selective responses were enhanced by attention only when orientation was the task-relevant feature and not when the contrast of the grating had to be attended. In a second experiment, observers discriminated the orientation or color of a specific lateral grating. Here, orientation-selective responses were enhanced in both tasks, but color-selective responses were enhanced only when color was task relevant. In both experiments, task-specific enhancement of feature-selective activity was not confined to the attended stimulus location but instead spread to other locations in the visual field, suggesting the concurrent involvement of a global feature-based attentional mechanism. These results suggest that attention can be remarkably selective in its ability to enhance particular task-relevant features and further reveal that increases in overall BOLD amplitude are not necessarily accompanied by improved processing of stimulus information.
NASA Astrophysics Data System (ADS)
Choi, Jonathan W.; Li, Zhaodong; Black, Charles T.; Sweat, Daniel P.; Wang, Xudong; Gopalan, Padma
2016-06-01
In this work, we demonstrate the use of self-assembled thin films of the cylinder-forming block copolymer poly(4-tert-butylstyrene-block-2-vinylpyridine) to pattern high density features at the 10 nm length scale. This material's large interaction parameter facilitates pattern formation in single-digit nanometer dimensions. This block copolymer's accessible order-disorder transition temperature allows thermal annealing to drive the assembly of ordered 2-vinylpyridine cylinders that can be selectively complexed with the organometallic precursor trimethylaluminum. This unique chemistry converts organic 2-vinylpyridine cylinders into alumina nanowires with diameters ranging from 8 to 11 nm, depending on the copolymer molecular weight. Graphoepitaxy of this block copolymer aligns and registers sub-12 nm diameter nanowires to larger-scale rectangular, curved, and circular features patterned by optical lithography. The alumina nanowires function as a robust hard mask to withstand the conditions required for patterning the underlying silicon by plasma etching. We conclude with a discussion of some of the challenges that arise with using block copolymers for patterning at sub-10 nm feature sizes.In this work, we demonstrate the use of self-assembled thin films of the cylinder-forming block copolymer poly(4-tert-butylstyrene-block-2-vinylpyridine) to pattern high density features at the 10 nm length scale. This material's large interaction parameter facilitates pattern formation in single-digit nanometer dimensions. This block copolymer's accessible order-disorder transition temperature allows thermal annealing to drive the assembly of ordered 2-vinylpyridine cylinders that can be selectively complexed with the organometallic precursor trimethylaluminum. This unique chemistry converts organic 2-vinylpyridine cylinders into alumina nanowires with diameters ranging from 8 to 11 nm, depending on the copolymer molecular weight. Graphoepitaxy of this block copolymer aligns and registers sub-12 nm diameter nanowires to larger-scale rectangular, curved, and circular features patterned by optical lithography. The alumina nanowires function as a robust hard mask to withstand the conditions required for patterning the underlying silicon by plasma etching. We conclude with a discussion of some of the challenges that arise with using block copolymers for patterning at sub-10 nm feature sizes. Electronic supplementary information (ESI) available. See DOI: 10.1039/c6nr01409g
Dynamics of Dark-Fly Genome Under Environmental Selections.
Izutsu, Minako; Toyoda, Atsushi; Fujiyama, Asao; Agata, Kiyokazu; Fuse, Naoyuki
2015-12-04
Environmental adaptation is one of the most fundamental features of organisms. Modern genome science has identified some genes associated with adaptive traits of organisms, and has provided insights into environmental adaptation and evolution. However, how genes contribute to adaptive traits and how traits are selected under an environment in the course of evolution remain mostly unclear. To approach these issues, we utilize "Dark-fly", a Drosophila melanogaster line maintained in constant dark conditions for more than 60 years. Our previous analysis identified 220,000 single nucleotide polymorphisms (SNPs) in the Dark-fly genome, but did not clarify which SNPs of Dark-fly are truly adaptive for living in the dark. We found here that Dark-fly dominated over the wild-type fly in a mixed population under dark conditions, and based on this domination we designed an experiment for genome reselection to identify adaptive genes of Dark-fly. For this experiment, large mixed populations of Dark-fly and the wild-type fly were maintained in light conditions or in dark conditions, and the frequencies of Dark-fly SNPs were compared between these populations across the whole genome. We thereby detected condition-dependent selections toward approximately 6% of the genome. In addition, we observed the time-course trajectory of SNP frequency in the mixed populations through generations 0, 22, and 49, which resulted in notable categorization of the selected SNPs into three types with different combinations of positive and negative selections. Our data provided a list of about 100 strong candidate genes associated with the adaptive traits of Dark-fly. Copyright © 2016 Izutsu et al.
Dynamics of Dark-Fly Genome Under Environmental Selections
Izutsu, Minako; Toyoda, Atsushi; Fujiyama, Asao; Agata, Kiyokazu; Fuse, Naoyuki
2015-01-01
Environmental adaptation is one of the most fundamental features of organisms. Modern genome science has identified some genes associated with adaptive traits of organisms, and has provided insights into environmental adaptation and evolution. However, how genes contribute to adaptive traits and how traits are selected under an environment in the course of evolution remain mostly unclear. To approach these issues, we utilize “Dark-fly”, a Drosophila melanogaster line maintained in constant dark conditions for more than 60 years. Our previous analysis identified 220,000 single nucleotide polymorphisms (SNPs) in the Dark-fly genome, but did not clarify which SNPs of Dark-fly are truly adaptive for living in the dark. We found here that Dark-fly dominated over the wild-type fly in a mixed population under dark conditions, and based on this domination we designed an experiment for genome reselection to identify adaptive genes of Dark-fly. For this experiment, large mixed populations of Dark-fly and the wild-type fly were maintained in light conditions or in dark conditions, and the frequencies of Dark-fly SNPs were compared between these populations across the whole genome. We thereby detected condition-dependent selections toward approximately 6% of the genome. In addition, we observed the time-course trajectory of SNP frequency in the mixed populations through generations 0, 22, and 49, which resulted in notable categorization of the selected SNPs into three types with different combinations of positive and negative selections. Our data provided a list of about 100 strong candidate genes associated with the adaptive traits of Dark-fly. PMID:26637434
Wen, Tingxi; Zhang, Zhongnan
2017-01-01
Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789
Wen, Tingxi; Zhang, Zhongnan
2017-05-01
In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.
Burnham, Bryan R
2018-05-03
During visual search, both top-down factors and bottom-up properties contribute to the guidance of visual attention, but selection history can influence attention independent of bottom-up and top-down factors. For example, priming of pop-out (PoP) is the finding that search for a singleton target is faster when the target and distractor features repeat than when those features trade roles between trials. Studies have suggested that such priming (selection history) effects on pop-out search manifest either early, by biasing the selection of the preceding target feature, or later in processing, by facilitating response and target retrieval processes. The present study was designed to examine the influence of selection history on pop-out search by introducing a speed-accuracy trade-off manipulation in a pop-out search task. Ratcliff diffusion modeling (RDM) was used to examine how selection history influenced both attentional bias and response execution processes. The results support the hypothesis that selection history biases attention toward the preceding target's features on the current trial and also influences selection of the response to the target.
Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter Je; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong
2017-11-01
Feature selection is essential in medical area; however, its process becomes complicated with the presence of censoring which is the unique character of survival analysis. Most survival feature selection methods are based on Cox's proportional hazard model, though machine learning classifiers are preferred. They are less employed in survival analysis due to censoring which prevents them from directly being used to survival data. Among the few work that employed machine learning classifiers, partial logistic artificial neural network with auto-relevance determination is a well-known method that deals with censoring and perform feature selection for survival data. However, it depends on data replication to handle censoring which leads to unbalanced and biased prediction results especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed which presents a solution to high level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on survival metric to construct a multiple classifier system. The new hybrid feature selection process uses multiple classifier system as a wrapper method and merges it with iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients collected from two centers were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model such as Akaike and Bayesian information criterions and least absolute shrinkage and selector operator in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention enabling doctor in selecting patients' future follow-up plan.
Robust Feature Selection Technique using Rank Aggregation.
Sarkar, Chandrima; Cooley, Sarah; Srivastava, Jaideep
2014-01-01
Although feature selection is a well-developed research area, there is an ongoing need to develop methods to make classifiers more efficient. One important challenge is the lack of a universal feature selection technique which produces similar outcomes with all types of classifiers. This is because all feature selection techniques have individual statistical biases while classifiers exploit different statistical properties of data for evaluation. In numerous situations this can put researchers into dilemma as to which feature selection method and a classifiers to choose from a vast range of choices. In this paper, we propose a technique that aggregates the consensus properties of various feature selection methods to develop a more optimal solution. The ensemble nature of our technique makes it more robust across various classifiers. In other words, it is stable towards achieving similar and ideally higher classification accuracy across a wide variety of classifiers. We quantify this concept of robustness with a measure known as the Robustness Index (RI). We perform an extensive empirical evaluation of our technique on eight data sets with different dimensions including Arrythmia, Lung Cancer, Madelon, mfeat-fourier, internet-ads, Leukemia-3c and Embryonal Tumor and a real world data set namely Acute Myeloid Leukemia (AML). We demonstrate not only that our algorithm is more robust, but also that compared to other techniques our algorithm improves the classification accuracy by approximately 3-4% (in data set with less than 500 features) and by more than 5% (in data set with more than 500 features), across a wide range of classifiers.
SEM-induced shrinkage and site-selective modification of single-crystal silicon nanopores
NASA Astrophysics Data System (ADS)
Chen, Qi; Wang, Yifan; Deng, Tao; Liu, Zewen
2017-07-01
Solid-state nanopores with feature sizes around 5 nm play a critical role in bio-sensing fields, especially in single molecule detection and sequencing of DNA, RNA and proteins. In this paper we present a systematic study on shrinkage and site-selective modification of single-crystal silicon nanopores with a conventional scanning electron microscope (SEM). Square nanopores with measurable sizes as small as 8 nm × 8 nm and rectangle nanopores with feature sizes (the smaller one between length and width) down to 5 nm have been obtained, using the SEM-induced shrinkage technique. The analysis of energy dispersive x-ray spectroscopy and the recovery of the pore size and morphology reveal that the grown material along with the edge of the nanopore is the result of deposition of hydrocarbon compounds, without structural damage during the shrinking process. A simplified model for pore shrinkage has been developed based on observation of the cross-sectional morphology of the shrunk nanopore. The main factors impacting on the task of controllably shrinking the nanopores, such as the accelerating voltage, spot size, scanned area of e-beam, and the initial pore size have been discussed. It is found that single-crystal silicon nanopores shrink linearly with time under localized irradiation by SEM e-beam in all cases, and the pore shrinkage rate is inversely proportional to the initial equivalent diameter of the pore under the same e-beam conditions.
How bandwidth selection algorithms impact exploratory data analysis using kernel density estimation.
Harpole, Jared K; Woods, Carol M; Rodebaugh, Thomas L; Levinson, Cheri A; Lenze, Eric J
2014-09-01
Exploratory data analysis (EDA) can reveal important features of underlying distributions, and these features often have an impact on inferences and conclusions drawn from data. Graphical analysis is central to EDA, and graphical representations of distributions often benefit from smoothing. A viable method of estimating and graphing the underlying density in EDA is kernel density estimation (KDE). This article provides an introduction to KDE and examines alternative methods for specifying the smoothing bandwidth in terms of their ability to recover the true density. We also illustrate the comparison and use of KDE methods with 2 empirical examples. Simulations were carried out in which we compared 8 bandwidth selection methods (Sheather-Jones plug-in [SJDP], normal rule of thumb, Silverman's rule of thumb, least squares cross-validation, biased cross-validation, and 3 adaptive kernel estimators) using 5 true density shapes (standard normal, positively skewed, bimodal, skewed bimodal, and standard lognormal) and 9 sample sizes (15, 25, 50, 75, 100, 250, 500, 1,000, 2,000). Results indicate that, overall, SJDP outperformed all methods. However, for smaller sample sizes (25 to 100) either biased cross-validation or Silverman's rule of thumb was recommended, and for larger sample sizes the adaptive kernel estimator with SJDP was recommended. Information is provided about implementing the recommendations in the R computing language. PsycINFO Database Record (c) 2014 APA, all rights reserved.
2015-01-01
Background Investigations into novel biomarkers using omics techniques generate large amounts of data. Due to their size and numbers of attributes, these data are suitable for analysis with machine learning methods. A key component of typical machine learning pipelines for omics data is feature selection, which is used to reduce the raw high-dimensional data into a tractable number of features. Feature selection needs to balance the objective of using as few features as possible, while maintaining high predictive power. This balance is crucial when the goal of data analysis is the identification of highly accurate but small panels of biomarkers with potential clinical utility. In this paper we propose a heuristic for the selection of very small feature subsets, via an iterative feature elimination process that is guided by rule-based machine learning, called RGIFE (Rule-guided Iterative Feature Elimination). We use this heuristic to identify putative biomarkers of osteoarthritis (OA), articular cartilage degradation and synovial inflammation, using both proteomic and transcriptomic datasets. Results and discussion Our RGIFE heuristic increased the classification accuracies achieved for all datasets when no feature selection is used, and performed well in a comparison with other feature selection methods. Using this method the datasets were reduced to a smaller number of genes or proteins, including those known to be relevant to OA, cartilage degradation and joint inflammation. The results have shown the RGIFE feature reduction method to be suitable for analysing both proteomic and transcriptomics data. Methods that generate large ‘omics’ datasets are increasingly being used in the area of rheumatology. Conclusions Feature reduction methods are advantageous for the analysis of omics data in the field of rheumatology, as the applications of such techniques are likely to result in improvements in diagnosis, treatment and drug discovery. PMID:25923811
Sensor feature fusion for detecting buried objects
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clark, G.A.; Sengupta, S.K.; Sherwood, R.J.
1993-04-01
Given multiple registered images of the earth`s surface from dual-band sensors, our system fuses information from the sensors to reduce the effects of clutter and improve the ability to detect buried or surface target sites. The sensor suite currently includes two sensors (5 micron and 10 micron wavelengths) and one ground penetrating radar (GPR) of the wide-band pulsed synthetic aperture type. We use a supervised teaming pattern recognition approach to detect metal and plastic land mines buried in soil. The overall process consists of four main parts: Preprocessing, feature extraction, feature selection, and classification. These parts are used in amore » two step process to classify a subimage. Thee first step, referred to as feature selection, determines the features of sub-images which result in the greatest separability among the classes. The second step, image labeling, uses the selected features and the decisions from a pattern classifier to label the regions in the image which are likely to correspond to buried mines. We extract features from the images, and use feature selection algorithms to select only the most important features according to their contribution to correct detections. This allows us to save computational complexity and determine which of the sensors add value to the detection system. The most important features from the various sensors are fused using supervised teaming pattern classifiers (including neural networks). We present results of experiments to detect buried land mines from real data, and evaluate the usefulness of fusing feature information from multiple sensor types, including dual-band infrared and ground penetrating radar. The novelty of the work lies mostly in the combination of the algorithms and their application to the very important and currently unsolved operational problem of detecting buried land mines from an airborne standoff platform.« less
Contact-free heart rate measurement using multiple video data
NASA Astrophysics Data System (ADS)
Hung, Pang-Chan; Lee, Kual-Zheng; Tsai, Luo-Wei
2013-10-01
In this paper, we propose a contact-free heart rate measurement method by analyzing sequential images of multiple video data. In the proposed method, skin-like pixels are firstly detected from multiple video data for extracting the color features. These color features are synchronized and analyzed by independent component analysis. A representative component is finally selected among these independent component candidates to measure the HR, which achieves under 2% deviation on average compared with a pulse oximeter in the controllable environment. The advantages of the proposed method include: 1) it uses low cost and high accessibility camera device; 2) it eases users' discomfort by utilizing contact-free measurement; and 3) it achieves the low error rate and the high stability by integrating multiple video data.
Viswanathan, Ravi K; Busse, William W
2018-02-01
Although airway inflammation is an intrinsic and key feature of asthma, this response varies in its intensity and translation to clinical characteristics and responsiveness to treatment. The observations that clinical heterogeneity is an important aspect of asthma and a feature that likely dictates and determines responses to treatment in severe asthma, patient responsiveness to medication is incomplete, and risks for exacerbation are increased. The development of biologics, which target selected and specific components of inflammation, has been a promising advance to achieve asthma control in patients with severe disease. This article reviews the current biologics available and under development and how their use has affected asthma and which subpopulations appear to benefit the greatest. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Tracking and Motion Analysis of Crack Propagations in Crystals for Molecular Dynamics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsap, L V; Duchaineau, M; Goldgof, D B
2001-05-14
This paper presents a quantitative analysis for a discovery in molecular dynamics. Recent simulations have shown that velocities of crack propagations in crystals under certain conditions can become supersonic, which is contrary to classical physics. In this research, they present a framework for tracking and motion analysis of crack propagations in crystals. It includes line segment extraction based on Canny edge maps, feature selection based on physical properties, and subsequent tracking of primary and secondary wavefronts. This tracking is completely automated; it runs in real time on three 834-image sequences using forty 250 MHZ processors. Results supporting physical observations aremore » presented in terms of both feature tracking and velocity analysis.« less
Zhang, Xiaoheng; Wang, Lirui; Cao, Yao; Wang, Pin; Zhang, Cheng; Yang, Liuyang; Li, Yongming; Zhang, Yanling; Cheng, Oumei
2018-02-01
Diagnosis of Parkinson's disease (PD) based on speech data has been proved to be an effective way in recent years. However, current researches just care about the feature extraction and classifier design, and do not consider the instance selection. Former research by authors showed that the instance selection can lead to improvement on classification accuracy. However, no attention is paid on the relationship between speech sample and feature until now. Therefore, a new diagnosis algorithm of PD is proposed in this paper by simultaneously selecting speech sample and feature based on relevant feature weighting algorithm and multiple kernel method, so as to find their synergy effects, thereby improving classification accuracy. Experimental results showed that this proposed algorithm obtained apparent improvement on classification accuracy. It can obtain mean classification accuracy of 82.5%, which was 30.5% higher than the relevant algorithm. Besides, the proposed algorithm detected the synergy effects of speech sample and feature, which is valuable for speech marker extraction.
An FPGA-based trigger for the phase II of the MEG experiment
NASA Astrophysics Data System (ADS)
Baldini, A.; Bemporad, C.; Cei, F.; Galli, L.; Grassi, M.; Morsani, F.; Nicolò, D.; Ritt, S.; Venturini, M.
2016-07-01
For the phase II of MEG, we are going to develop a combined trigger and DAQ system. Here we focus on the former side, which operates an on-line reconstruction of detector signals and event selection within 450 μs from event occurrence. Trigger concentrator boards (TCB) are under development to gather data from different crates, each connected to a set of detector channels, to accomplish higher-level algorithms to issue a trigger in the case of a candidate signal event. We describe the major features of the new system, in comparison with phase I, as well as its performances in terms of selection efficiency and background rejection.
NASA Astrophysics Data System (ADS)
Samala, Ravi K.; Chan, Heang-Ping; Hadjiiski, Lubomir; Helvie, Mark A.; Kim, Renaid
2017-03-01
Understanding the key radiogenomic associations for breast cancer between DCE-MRI and micro-RNA expressions is the foundation for the discovery of radiomic features as biomarkers for assessing tumor progression and prognosis. We conducted a study to analyze the radiogenomic associations for breast cancer using the TCGA-TCIA data set. The core idea that tumor etiology is a function of the behavior of miRNAs is used to build the regression models. The associations based on regression are analyzed for three study outcomes: diagnosis, prognosis, and treatment. The diagnosis group consists of miRNAs associated with clinicopathologic features of breast cancer and significant aberration of expression in breast cancer patients. The prognosis group consists of miRNAs which are closely associated with tumor suppression and regulation of cell proliferation and differentiation. The treatment group consists of miRNAs that contribute significantly to the regulation of metastasis thereby having the potential to be part of therapeutic mechanisms. As a first step, important miRNA expressions were identified and their ability to classify the clinical phenotypes based on the study outcomes was evaluated using the area under the ROC curve (AUC) as a figure-of-merit. The key mapping between the selected miRNAs and radiomic features were determined using least absolute shrinkage and selection operator (LASSO) regression analysis within a two-loop leave-one-out cross-validation strategy. These key associations indicated a number of radiomic features from DCE-MRI to be potential biomarkers for the three study outcomes.