Liu, Jingfang; Zhang, Pengzhu; Lu, Yingjie
2014-11-01
User-generated medical messages on the Internet contain extensive information related to adverse drug reactions (ADRs) and are known to be a valuable resource for post-marketing drug surveillance. The aim of this study was to find an effective method to identify messages related to ADRs automatically from online user reviews. We conducted experiments on online user reviews using different feature sets and different classification techniques. First, messages were collected from three communities (allergy, schizophrenia, and pain management), and 3,000 messages were annotated. Second, an n-gram-based feature set and a medical domain-specific feature set were generated. Third, three classification techniques, SVM, C4.5, and Naïve Bayes, were used to perform the classification tasks separately. Finally, we evaluated the performance of each combination of feature set and classification technique by comparing metrics including accuracy and F-measure. In terms of accuracy, the SVM classifier exceeded 0.8, whereas the C4.5 and Naïve Bayes classifiers remained below 0.8; meanwhile, the combined feature set (n-gram-based plus domain-specific features) consistently outperformed either single feature set. In terms of F-measure, the highest value, 0.895, was achieved using the combined feature set with an SVM classifier. In sum, combining both feature sets with an SVM classifier provides an effective method for automatically identifying ADR-related messages in online user reviews.
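A minimal sketch of the kind of pipeline this abstract describes, assuming scikit-learn and a toy ADR lexicon (the lexicon terms, data variables, and hyperparameters are illustrative, not the authors'):

```python
# Sketch: n-gram features plus a medical domain-specific lexicon feature,
# classified with an SVM. The lexicon and data loading are assumptions.
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline, FeatureUnion
from sklearn.svm import LinearSVC

ADR_LEXICON = {"rash", "nausea", "dizziness", "headache", "swelling"}  # toy stand-in

class LexiconCounts(BaseEstimator, TransformerMixin):
    """Counts how many lexicon terms appear in each message."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return np.array([[sum(w in ADR_LEXICON for w in msg.lower().split())]
                         for msg in X])

pipeline = Pipeline([
    ("features", FeatureUnion([
        ("ngrams", TfidfVectorizer(ngram_range=(1, 2), min_df=2)),
        ("lexicon", LexiconCounts()),
    ])),
    ("svm", LinearSVC(C=1.0)),
])

# messages: list of raw review texts; labels: 1 = ADR-related, 0 = not
# pipeline.fit(messages, labels)
```

FeatureUnion concatenates the n-gram matrix with the lexicon counts, mirroring the abstract's finding that the combined feature set outperforms either set alone.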
Inattentional blindness: A combination of a relational set and a feature inhibition set?
Goldstein, Rebecca R; Beck, Melissa R
2016-07-01
Two experiments were conducted to directly test the feature set hypothesis and the relational set hypothesis in an inattentional blindness task. The feature set hypothesis predicts that unexpected objects that match the to-be-attended stimuli will be reported most often. The relational set hypothesis predicts that unexpected objects that match the relationship between the to-be-attended and the to-be-ignored stimuli will be reported most often. Experiment 1 manipulated the luminance of the stimuli. Participants were instructed to monitor the gray letter shapes and to ignore either black or white letter shapes. The unexpected objects that exhibited the luminance relation of the to-be-attended to the to-be-ignored stimuli were reported by participants the most. Experiment 2 manipulated the color of the stimuli. Participants were instructed to monitor the yellower orange or the redder orange letter shapes and to ignore the redder orange or yellower orange letter shapes, respectively. The unexpected objects that exhibited the color relation of the to-be-attended to the to-be-ignored stimuli were reported the most. The results do not support the use of a feature set to accomplish the task and instead support the use of a relational set. In addition, the results point to the concurrent use of multiple attentional sets that are both excitatory and inhibitory.
Feature Selection Methods for Zero-Shot Learning of Neural Activity.
Caceres, Carlos A; Roos, Matthew J; Rupp, Kyle M; Milsap, Griffin; Crone, Nathan E; Wolmetz, Michael E; Ratto, Christopher R
2017-01-01
Dimensionality poses a serious challenge when making predictions from human neuroimaging data. Across imaging modalities, large pools of potential neural features (e.g., responses from particular voxels, electrodes, and temporal windows) have to be related to typically limited sets of stimuli and samples. In recent years, zero-shot prediction models have been introduced for mapping between neural signals and semantic attributes, which allows for classification of stimulus classes not explicitly included in the training set. While choices about feature selection can have a substantial impact when closed-set accuracy, open-set robustness, and runtime are competing design objectives, no systematic study of feature selection for these models has been reported. Instead, a relatively straightforward feature stability approach has been adopted and successfully applied across models and imaging modalities. To characterize the tradeoffs in feature selection for zero-shot learning, we compared correlation-based stability to several other feature selection techniques on comparable data sets from two distinct imaging modalities: functional Magnetic Resonance Imaging and Electrocorticography. While most of the feature selection methods resulted in similar zero-shot prediction accuracies and spatial/spectral patterns of selected features, there was one exception: a novel feature/attribute correlation approach achieved those accuracies with far fewer features, suggesting the potential for simpler prediction models that yield high zero-shot classification accuracy.
New Features for Neuron Classification.
Hernández-Pérez, Leonardo A; Delgado-Castillo, Duniel; Martín-Pérez, Rainer; Orozco-Morales, Rubén; Lorenzo-Ginori, Juan V
2018-04-28
This paper addresses the problem of obtaining new neuron features capable of improving the results of neuron classification. Most studies on neuron classification using morphological features have been based on Euclidean geometry. Here, three one-dimensional (1D) time series are instead derived from the three-dimensional (3D) structure of the neuron, and a spatial time series is then constructed from which the features are calculated. Digitally reconstructed neurons were separated into control and pathological sets, related to three categories of alterations caused by epilepsy, Alzheimer's disease (long and local projections), and ischemia. These neuron sets were then subjected to supervised classification, and the results were compared across three sets of features: morphological features, features obtained from the time series, and a combination of both. The best results were obtained using features from the time series, which outperformed classification using only morphological features, showing higher correct classification rates with differences of 5.15%, 3.75%, and 5.33% for epilepsy and for Alzheimer's disease with long and local projections, respectively. The morphological features were better for the ischemia set, with a difference of 3.05%. Features such as variance, Spearman auto-correlation, partial auto-correlation, mutual information, and local minima and maxima, all derived from the time series, exhibited the best performance. We also compared different feature evaluators, among which ReliefF was ranked best.
Detection of explosive cough events in audio recordings by internal sound analysis.
Rocha, B M; Mendes, L; Couceiro, R; Henriques, J; Carvalho, P; Paiva, R P
2017-07-01
We present a new method for the discrimination of explosive cough events, based on a combination of spectral content descriptors and pitch-related features. After the removal of near-silent segments, a vector of event boundaries is obtained and a proposed set of 9 features is extracted for each event. Two data sets, recorded using electronic stethoscopes and comprising a total of 46 healthy subjects and 13 patients, were employed to evaluate the method. The proposed feature set is compared to three other sets of descriptors: a baseline, a combination of both sets, and an automatic selection of the best 10 features from both sets. The combined feature set yields good results on the cross-validated database, attaining a sensitivity of 92.3±2.3% and a specificity of 84.7±3.3%. Moreover, this feature set seems to generalize well when trained on a small data set of patients with a variety of respiratory and cardiovascular diseases and tested on a bigger data set of mostly healthy subjects: a sensitivity of 93.4% and a specificity of 83.4% are achieved in those conditions. These results demonstrate that complementing the proposed feature set with a baseline set is a promising approach.
Friedman, Lee; Nixon, Mark S; Komogortsev, Oleg V
2017-01-01
We introduce the intraclass correlation coefficient (ICC) to the biometric community as an index of the temporal persistence, or stability, of a single biometric feature. It requires, as input, a feature on an interval or ratio scale that is reasonably normally distributed, and it can only be calculated if each subject is tested on 2 or more occasions. For a biometric system with multiple features available for selection, the ICC can be used to measure the relative stability of each feature. We show, for 14 distinct data sets (1 synthetic, 8 eye-movement-related, 2 gait-related, 2 face-recognition-related, and 1 brain-structure-related), that selecting the most stable features, based on the ICC, generally resulted in the best biometric performance. Analyses based on using only the most stable features produced superior Rank-1-Identification Rate (Rank-1-IR) performance in 12 of 14 databases (p = 0.0065, one-tailed) when compared to other sets of features, including the set of all features. For Equal Error Rate (EER), using a subset of only high-ICC features also produced superior performance in 12 of 14 databases (p = 0.0065, one-tailed). In general, then, for our databases, prescreening potential biometric features and choosing only highly reliable features yields better performance than choosing lower-ICC features or all features combined. We also determined that, as the ICC of a group of features increases, the median of the genuine similarity score distribution increases and the spread of this distribution decreases. There were no statistically significant corresponding relationships for the impostor distributions. We believe that the ICC will find many uses in biometric research. In the case of eye-movement-driven biometrics, the use of reliable features, as measured by the ICC, allowed us to achieve authentication performance with EER = 2.01%, which was not possible before.
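For readers who want to try ICC-based feature screening, here is a sketch of the one-way, single-measure ICC (ICC(1,1)); the paper may use a different ICC variant, so treat this form as illustrative:

```python
# One-way, single-measure ICC as a temporal-persistence index for one feature.
import numpy as np

def icc_1_1(scores):
    """scores: (n_subjects, k_sessions) array of one feature measured repeatedly."""
    n, k = scores.shape
    subj_means = scores.mean(axis=1)
    grand_mean = scores.mean()
    msb = k * np.sum((subj_means - grand_mean) ** 2) / (n - 1)         # between subjects
    msw = np.sum((scores - subj_means[:, None]) ** 2) / (n * (k - 1))  # within subjects
    return (msb - msw) / (msb + (k - 1) * msw)

# Rank features by stability and keep only the most persistent ones:
# iccs = [icc_1_1(feature_matrix[:, :, j]) for j in range(n_features)]
```

Values near 1 indicate that a feature varies far more across subjects than across a subject's sessions, which is what makes it useful for identification.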
Lu, Yingjie
2013-01-01
To facilitate patient involvement in online health communities and to help patients obtain the informational and emotional support they need, this paper proposes a topic identification approach for automatically identifying the topics of health-related messages in an online health community, thus assisting patients in reaching the most relevant messages for their queries efficiently. A feature-based classification framework is presented for automatic topic identification. We first collected messages related to predefined topics from an online health community. We then combined three different types of features (n-gram-based features, domain-specific features, and sentiment features) to build four feature sets for health-related text representation. Finally, three text classification techniques, C4.5, Naïve Bayes, and SVM, were adopted to evaluate our topic classification model. By comparing different feature sets and classification techniques, we found that n-gram-based, domain-specific, and sentiment features were all effective in distinguishing different types of health-related topics, and that feature reduction based on information gain further improved topic classification performance. Among the classification techniques, SVM significantly outperformed C4.5 and Naïve Bayes. The experimental results demonstrate that the proposed approach can identify the topics of online health-related messages efficiently.
Complex Topographic Feature Ontology Patterns
Varanka, Dalia E.; Jerris, Thomas J.
2015-01-01
Semantic ontologies are examined as effective data models for the representation of complex topographic feature types. Complex feature types are viewed as integrated relations between basic features for a basic purpose. In the context of topographic science, such component assemblages are supported by resource systems and found on the local landscape. Ontologies are organized within six thematic modules of a domain ontology called Topography that includes within its sphere basic feature types, resource systems, and landscape types. Context is constructed not only as a spatial and temporal setting, but also as a setting based on environmental processes. Types of spatial relations that exist between components include location, generative processes, and description. An example is offered for the complex feature type ‘mine’. The identification and extraction of complex feature types are an area for future research.
Detecting spam comments on Indonesia’s Instagram posts
NASA Astrophysics Data System (ADS)
Septiandri, Ali Akbar; Wibisono, Okiriza
2017-01-01
In this paper we experimented with several feature sets for detecting spam comments in social media contents authored by Indonesian public figures. We define spam comments as comments which have promotional purposes (e.g. referring other users to products and services) and are thus not related to the content to which the comments are posted. Three sets of features are evaluated for detecting spam: (1) hand-engineered features such as comment length, number of capital letters, and number of emojis, (2) keyword features such as whether the comment contains advertising words or product-related words, and (3) text features, namely, bag-of-words, TF-IDF, and fastText embeddings, each combined with latent semantic analysis. With 24,000 manually-annotated comments scraped from Instagram posts authored by more than 100 Indonesian public figures, we compared the performance of these feature sets and their combinations using 3 popular classification algorithms: Naïve Bayes, SVM, and XGBoost. We find that using all three feature sets (with fastText embeddings for the text features) gave the best F1-score of 0.9601 on a holdout dataset. More interestingly, fastText embeddings combined with hand-engineered features (i.e. without keyword features) yielded a similar F1-score of 0.9523, and McNemar's test found no significant difference between the two results. This result is important as keyword features are largely dependent on the dataset and may not be as generalisable as the other feature sets when applied to new data. For future work, we hope to collect a bigger and more diverse dataset of Indonesian spam comments, improve our model's performance and generalisability, and publish a programming package for others to reliably detect spam comments.
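A sketch of the hand-engineered features named above (comment length, capital letters, emojis); the emoji Unicode ranges and the extra link count are assumptions, not the authors' exact definitions:

```python
# Hand-engineered comment features of the kind listed in the abstract.
import re
import numpy as np

EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF]")  # rough emoji ranges

def hand_features(comment: str) -> np.ndarray:
    return np.array([
        len(comment),                       # comment length
        sum(c.isupper() for c in comment),  # number of capital letters
        len(EMOJI_RE.findall(comment)),     # number of emojis
        comment.count("http"),              # rough link count (assumed extra feature)
    ], dtype=float)

# comments: list of raw comment strings
# X_hand = np.vstack([hand_features(c) for c in comments])
# X_hand can then be concatenated with fastText embeddings before classification.
```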
The interaction of feature and space based orienting within the attention set.
Lim, Ahnate; Sinnett, Scott
2014-01-01
The processing of sensory information relies on interacting mechanisms of sustained attention and attentional capture, both of which operate in space and on object features. While evidence indicates that exogenous attentional capture, a mechanism previously understood to be automatic, can be eliminated while concurrently performing a demanding task, we reframe this phenomenon within the theoretical framework of the "attention set" (Most et al., 2005). Consequently, the specific prediction that cuing effects should reappear when feature dimensions of the cue overlap with those in the attention set (i.e., elements of the demanding task) was empirically tested and confirmed using a dual-task paradigm involving both sustained attention and attentional capture, adapted from Santangelo et al. (2007). Participants were required to either detect a centrally presented target presented in a stream of distractors (the primary task), or respond to a spatially cued target (the secondary task). Importantly, the spatial cue could either share features with the target in the centrally presented primary task, or not share any features. Overall, the findings supported the attention set hypothesis showing that a spatial cuing effect was only observed when the peripheral cue shared a feature with objects that were already in the attention set (i.e., the primary task). However, this finding was accompanied by differential attentional orienting dependent on the different types of objects within the attention set, with feature-based orienting occurring for target-related objects, and additional spatial-based orienting for distractor-related objects.
Shaping Relations: Exploiting Relational Features for Visuospatial Priming
ERIC Educational Resources Information Center
Livins, Katherine A.; Doumas, Leonidas A. A.; Spivey, Michael J.
2016-01-01
Although relational reasoning has been described as a process at the heart of human cognition, the exact character of relational representations remains an open debate. Symbolic-connectionist models of relational cognition suggest that relations are structured representations, but that they are ultimately grounded in feature sets; thus, they…
Working memory for visual features and conjunctions in schizophrenia.
Gold, James M; Wilk, Christopher M; McMahon, Robert P; Buchanan, Robert W; Luck, Steven J
2003-02-01
The visual working memory (WM) storage capacity of patients with schizophrenia was investigated using a change detection paradigm. Participants were presented with 2, 3, 4, or 6 colored bars with testing of both single feature (color, orientation) and feature conjunction conditions. Patients performed significantly worse than controls at all set sizes but demonstrated normal feature binding. Unlike controls, patient WM capacity declined at set size 6 relative to set size 4. Impairments with subcapacity arrays suggest a deficit in task set maintenance: Greater impairment for supercapacity set sizes suggests a deficit in the ability to selectively encode information for WM storage. Thus, the WM impairment in schizophrenia appears to be a consequence of attentional deficits rather than a reduction in storage capacity.
How well does multiple OCR error correction generalize?
NASA Astrophysics Data System (ADS)
Lund, William B.; Ringger, Eric K.; Walker, Daniel D.
2013-12-01
As the digitization of historical documents, such as newspapers, becomes more common, the archive patron's need for accurate digital text from those documents increases. Building on our earlier work, the contributions of this paper are: (1) demonstrating the applicability of novel methods for correcting optical character recognition (OCR) on disparate data sets, including a new synthetic training set; (2) enhancing the correction algorithm with novel features; and (3) assessing the data requirements of the correction learning method. First, we correct errors using conditional random fields (CRF) trained on synthetic training data sets in order to demonstrate the applicability of the methodology to unrelated test sets. Second, we show the strength of lexical features from the training sets on two unrelated test sets, yielding a relative reduction in word error rate (WER) on the test sets of 6.52%. New features capture the recurrence of hypothesis tokens and yield an additional relative reduction in WER of 2.30%. Further, we show that only 2.0% of the full training corpus of over 500,000 feature cases is needed to achieve correction results comparable to those using the entire training corpus, effectively reducing both the complexity of the training process and the learned correction model.
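A sketch of how a synthetic OCR training set might be generated by injecting character-level errors into clean text; the confusion pairs and error rates below are assumptions, not the paper's:

```python
# Generating (clean, corrupted) pairs to train an OCR corrector.
import random

CONFUSIONS = {"e": "c", "c": "e", "m": "rn", "l": "1", "o": "0"}  # typical OCR mix-ups

def corrupt(text: str, error_rate: float = 0.05, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in text:
        r = rng.random()
        if r < error_rate and ch in CONFUSIONS:
            out.append(CONFUSIONS[ch])   # substitution error
        elif r < error_rate * 1.2:
            continue                     # deletion error
        else:
            out.append(ch)
    return "".join(out)

# (clean, corrupt(clean)) pairs can then serve as CRF training examples.
```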
Comparing Pattern Recognition Feature Sets for Sorting Triples in the FIRST Database
NASA Astrophysics Data System (ADS)
Proctor, D. D.
2006-07-01
Pattern recognition techniques have been used with increasing success for coping with the tremendous amounts of data being generated by automated surveys. Usually this process involves the construction of training sets, the typical examples of data with known classifications. Given a feature set, along with the training set, statistical methods can be employed to generate a classifier, which is then applied to process the remaining data. Feature set selection, however, is still an issue. This paper presents techniques developed for accommodating data for which a substantive portion of the training set cannot be classified unambiguously, a typical case for low-resolution data. Significance tests on the sort-ordered, sample-size-normalized vote distribution of an ensemble of decision trees are introduced as a method of evaluating the relative quality of feature sets. The technique is applied to comparing feature sets for sorting a particular radio galaxy morphology, bent-doubles, from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) database. Alternative functional forms for feature sets are also examined. Associated standard deviations provide the means to evaluate the effect of the number of folds, the number of classifiers per fold, and the sample size on the resulting classifications. The technique may also be applied to situations in which, although accurate classifications are available, the feature set is clearly inadequate, but it is nonetheless desired to make the best use of the available information.
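A sketch of comparing two feature sets via the sorted, normalized vote distribution of a decision-tree ensemble; the Kolmogorov-Smirnov test stands in for the paper's significance test, and binary 0/1 labels are assumed:

```python
# Sorted vote fractions from a random forest, compared across feature sets.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.ensemble import RandomForestClassifier

def sorted_vote_fractions(X, y, n_trees=100, seed=0):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed).fit(X, y)
    # With 0/1 labels, the mean over trees is the fraction voting for class 1.
    votes = np.mean([t.predict(X) for t in rf.estimators_], axis=0)
    return np.sort(votes) / len(votes)   # sort-ordered, sample-size-normalized

# d, p = ks_2samp(sorted_vote_fractions(X_a, y), sorted_vote_fractions(X_b, y))
# A significant difference suggests the two feature sets separate the data differently.
```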
A novel feature extraction approach for microarray data based on multi-algorithm fusion.
Jiang, Zhu; Xu, Rong
2015-01-01
Feature extraction is one of the most important and effective methods for reducing dimensionality in data mining, particularly with the emergence of high-dimensional data such as microarray gene expression data. Feature extraction for gene selection mainly serves two purposes: one is to identify certain disease-related genes; the other is to find a compact set of discriminative genes for building a pattern classifier with reduced complexity and improved generalization capability. Depending on the purpose of gene selection, two types of feature extraction algorithms, ranking-based feature extraction and set-based feature extraction, are employed in microarray gene expression data analysis. In ranking-based feature extraction, features are generally evaluated on an individual basis, without considering inter-relationships between features, whereas set-based feature extraction evaluates features based on their role in a feature set, taking into account dependencies between features. Like learning methods, feature extraction faces a problem of generalization ability, that is, robustness. However, the issue of robustness is often overlooked in feature extraction. In order to improve the accuracy and robustness of feature extraction for microarray data, a novel approach based on multi-algorithm fusion is proposed. By fusing different types of feature extraction algorithms to select features from the sample set, the proposed approach is able to improve feature extraction performance. The new approach is tested on gene expression data sets including the Colon cancer, CNS, DLBCL, and Leukemia data. The test results show that the performance of this algorithm is better than that of existing solutions.
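A sketch of one simple fusion of two ranking-based selectors using scikit-learn; the choice of selectors (ANOVA F-score and mutual information) and the average-rank fusion rule are illustrative assumptions:

```python
# Fusing two ranking-based gene selectors by averaging their ranks.
import numpy as np
from scipy.stats import rankdata
from sklearn.feature_selection import f_classif, mutual_info_classif

def fused_top_k(X, y, k=50):
    f_scores, _ = f_classif(X, y)
    mi_scores = mutual_info_classif(X, y, random_state=0)
    # Higher score -> better, so rank descending and average the ranks.
    mean_rank = (rankdata(-f_scores) + rankdata(-mi_scores)) / 2.0
    return np.argsort(mean_rank)[:k]   # indices of the k most agreed-upon genes

# selected = fused_top_k(expression_matrix, class_labels)
```

Averaging ranks rather than raw scores sidesteps the fact that the two selectors produce values on incompatible scales.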
Li, Der-Chiang; Liu, Chiao-Wen; Hu, Susan C
2011-05-01
Medical data sets are usually small and have very high dimensionality. Too many attributes will make the analysis less efficient and will not necessarily increase accuracy, while too few data will decrease the modeling stability. Consequently, the main objective of this study is to extract the optimal subset of features to increase analytical performance when the data set is small. This paper proposes a fuzzy-based non-linear transformation method to extend classification-related information from the original data attribute values for a small data set. Based on the transformed data set, this study applies principal component analysis (PCA) to extract the optimal subset of features. Finally, we use the transformed data with these optimal features as the input for a learning tool, a support vector machine (SVM). Six medical data sets, Pima Indians' diabetes, Wisconsin diagnostic breast cancer, Parkinson disease, echocardiogram, BUPA liver disorders, and bladder cancer cases in Taiwan, are employed to illustrate the approach presented in this paper. This research uses the t-test to evaluate the classification accuracy for a single data set and the Friedman test to show that the proposed method outperforms other methods over the multiple data sets. The experimental results indicate that the proposed method has better classification performance than either PCA or kernel principal component analysis (KPCA) when the data set is small, and suggest creating new purpose-related information to improve the analysis performance. This paper has shown that feature extraction, as a counterpart of feature selection, is important for efficient data analysis. When the data set is small, using the fuzzy-based transformation method presented in this work to increase the information available produces better results than the PCA and KPCA approaches.
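A sketch of the overall transform-then-PCA-then-SVM flow, with a generic Gaussian-membership expansion standing in for the paper's fuzzy transformation (the centres, widths, and component count are assumptions):

```python
# Expand each attribute into fuzzy membership degrees, then PCA + SVM.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def fuzzy_expand(X):
    cols = []
    for j in range(X.shape[1]):
        x = X[:, j]
        s = x.std() + 1e-12
        for c in np.percentile(x, [25, 50, 75]):            # low/medium/high centres
            cols.append(np.exp(-((x - c) ** 2) / (2 * s * s)))  # membership degree
    return np.column_stack(cols)

model = make_pipeline(PCA(n_components=5), SVC(kernel="rbf"))
# model.fit(fuzzy_expand(X_train), y_train)
# accuracy = model.score(fuzzy_expand(X_test), y_test)
```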
Toward real-time performance benchmarks for Ada
NASA Technical Reports Server (NTRS)
Clapp, Russell M.; Duchesneau, Louis; Volz, Richard A.; Mudge, Trevor N.; Schultze, Timothy
1986-01-01
The issue of real-time performance measurement for the Ada programming language through the use of benchmarks is addressed. First, the Ada notion of time is examined and a set of basic measurement techniques is developed. Then a set of Ada language features believed to be important for real-time performance is presented and specific measurement methods are discussed. In addition, other important time-related features which are not explicitly part of the language but are part of the run-time system are also identified and measurement techniques are developed for them. The measurement techniques are applied to the language and run-time system features and the results are presented.
NASA Technical Reports Server (NTRS)
Bradley, D. B.; Cain, J. B., III; Williard, M. W.
1978-01-01
The task was to evaluate the ability of a set of timing/synchronization subsystem features to provide a set of desirable characteristics for the evolving Defense Communications System digital communications network. The set of features related to the approaches by which timing/synchronization information could be disseminated throughout the network and the manner in which this information could be utilized to provide a synchronized network. These features, which could be utilized in a large number of different combinations, included mutual control, directed control, double ended reference links, independence of clock error measurement and correction, phase reference combining, and self organizing.
Feature selection gait-based gender classification under different circumstances
NASA Astrophysics Data System (ADS)
Sabir, Azhin; Al-Jawad, Naseer; Jassim, Sabah
2014-05-01
This paper proposes gender classification based on human gait features and investigates the effect of two variations, clothing (wearing coats) and carrying a bag, in addition to the normal gait sequence. The feature vectors in the proposed system are constructed after applying the wavelet transform. Three different sets of features are proposed in this method. The first, spatio-temporal distance, deals with the distances between different parts of the human body (such as the feet, knees, hands, height, and shoulders) during one gait cycle. The second and third feature sets are constructed from the approximation and non-approximation coefficients of the human body, respectively. To extract these two feature sets, we divided the human body into upper and lower parts based on the golden ratio proportion. In this paper, we adopt a statistical method for constructing the feature vector from the above sets. The dimension of the constructed feature vector is reduced based on the Fisher score as a feature selection method to optimize its discriminating significance. Finally, k-Nearest Neighbor is applied as the classification method. Experimental results demonstrate that our approach provides a more realistic scenario and relatively better performance compared with existing approaches.
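A sketch of Fisher-score feature ranking followed by k-NN, the selection/classification pair named in the abstract; the cutoff of 20 features and k = 5 are assumptions:

```python
# Fisher score: between-class scatter over within-class scatter, per feature.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fisher_scores(X, y):
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    num = np.zeros(X.shape[1])
    den = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        num += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2  # between-class scatter
        den += len(Xc) * Xc.var(axis=0)                         # within-class scatter
    return num / (den + 1e-12)

# top = np.argsort(-fisher_scores(X_train, y_train))[:20]   # keep 20 best features
# knn = KNeighborsClassifier(n_neighbors=5).fit(X_train[:, top], y_train)
# accuracy = knn.score(X_test[:, top], y_test)
```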
Impact of experimental design on PET radiomics in predicting somatic mutation status.
Yip, Stephen S F; Parmar, Chintan; Kim, John; Huynh, Elizabeth; Mak, Raymond H; Aerts, Hugo J W L
2017-12-01
PET-based radiomic features have demonstrated great promise in predicting genetic data. However, various experimental parameters can influence the feature extraction pipeline and hence the extracted features. Here, we investigated how experimental settings affect the performance of radiomic features in predicting somatic mutation status in non-small cell lung cancer (NSCLC) patients. 348 NSCLC patients with somatic mutation testing and diagnostic PET images were included in our analysis. Radiomic feature extraction was analyzed for varying voxel sizes, filters, and bin widths; 66 radiomic features were evaluated. The performance of features in predicting mutation status was assessed using the area under the receiver-operating-characteristic curve (AUC). The influence of experimental parameters on feature predictability was quantified as the relative difference between the minimum and maximum AUC (δ). The large majority of features (n = 56, 85%) were significantly predictive of EGFR mutation status (AUC ≥ 0.61). 29 radiomic features significantly predicted EGFR mutations and were robust to experimental settings, with δOverall < 5%. The overall influence (δOverall) of voxel size, filter, and bin width across all features ranged from 5% to 15%. For all features, none of the experimental designs could discriminate KRAS+ from KRAS- (AUC ≤ 0.56). The predictability of 29 radiomic features was robust to the choice of experimental settings; for all other features, these settings need to be carefully chosen. The combined effect of the investigated processing methods can be substantial and must be considered. Optimized settings that maximize the predictive performance of individual radiomic features should be investigated in the future.
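A sketch of the robustness measure described above, computing a per-feature AUC across settings; normalizing the min-max difference by the maximum AUC is an assumption about the exact definition:

```python
# Sensitivity of one radiomic feature to extraction settings, via AUC spread.
from sklearn.metrics import roc_auc_score

def delta(feature_by_setting, mutation_status):
    """feature_by_setting: dict mapping setting name -> 1D array of feature values."""
    aucs = [roc_auc_score(mutation_status, v) for v in feature_by_setting.values()]
    return (max(aucs) - min(aucs)) / max(aucs)   # relative min-max difference

# A feature with delta(...) < 0.05 would count as robust in the abstract's terms.
```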
Zhang, Jianhua; Yin, Zhong; Wang, Rubin
2017-01-01
This paper develops a cognitive task-load (CTL) classification algorithm and allocation strategy to sustain optimal operator CTL levels over time in safety-critical human-machine integrated systems. An adaptive human-machine system is designed based on a non-linear dynamic CTL classifier, which maps a set of electroencephalogram (EEG) and electrocardiogram (ECG) related features to a few CTL classes. The least-squares support vector machine (LSSVM) is used as the dynamic pattern classifier. A series of electrophysiological and performance data acquisition experiments were performed on seven volunteer participants in a simulated process control task environment. A participant-specific dynamic LSSVM model is constructed to classify the instantaneous CTL into five classes at each time instant. The initial feature set, comprising 56 EEG- and ECG-related features, is reduced to a set of 12 salient features (including 11 EEG-related features) using the locality preserving projection (LPP) technique. An overall correct classification rate of about 80% is achieved for the 5-class CTL classification problem. The predicted CTL is then used to adaptively allocate the number of process control tasks between the operator and a computer-based controller. Simulation results showed that the overall performance of the human-machine system can be improved by using the proposed adaptive automation strategy.
Methods for the Precise Locating and Forming of Arrays of Curved Features into a Workpiece
Gill, David Dennis; Keeler, Gordon A.; Serkland, Darwin K.; Mukherjee, Sayan D.
2008-10-14
Methods for manufacturing high precision arrays of curved features (e.g. lenses) in the surface of a workpiece are described, utilizing orthogonal sets of inter-fitting locating grooves to mate a workpiece to a workpiece holder mounted on the spindle face of a rotating machine tool. The matching inter-fitting groove sets in the workpiece and the chuck allow precise, non-kinematic indexing of the workpiece to locations defined in two orthogonal directions perpendicular to the turning axis of the machine tool. At each location on the workpiece a curved feature can then be machined on-center to create arrays of curved features on the workpiece. The averaging effect of the corresponding sets of inter-fitting grooves provides precise repeatability in determining the relative locations of the centers of each of the curved features in an array.
Fahimi, Fatemeh; Guan, Cuntai; Wooi Boon Goh; Kai Keng Ang; Choon Guan Lim; Tih Shih Lee
2017-07-01
Measuring attention from the electroencephalogram (EEG) has found applications in the treatment of Attention Deficit Hyperactivity Disorder (ADHD). It is of great interest to understand which features in EEG are most representative of attention. Intensive research has been done in the past, and it has been shown that frequency band powers and their ratios are effective features for detecting attention. However, there are still unanswered questions: What features in EEG are most discriminative between attentive and non-attentive states? Are these features common among all subjects, or are they subject-specific and must they be optimized for each subject? Using Mutual Information (MI) to perform subject-specific feature selection on a large data set including 120 ADHD children, we found that besides the theta beta ratio (TBR), which is commonly used in attention detection and neurofeedback, the relative beta power and theta/(alpha+beta) (TBAR) are equally significant and informative for attention detection. Interestingly, we found that the relative theta power (which is also commonly used) may not have sufficient discriminative information in itself (it is informative for only 3.26% of the ADHD children). We also demonstrated that although these features (relative beta power, TBR, and TBAR) are the most important measures for detecting attention on average, different subjects have different sets of most discriminative features.
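A sketch of computing the band-power features named above (relative beta, TBR, TBAR) and scoring them with mutual information; the band edges, sampling rate, and Welch parameters are assumptions:

```python
# Band-power ratios from an EEG epoch, ranked with mutual information.
import numpy as np
from scipy.signal import welch
from sklearn.feature_selection import mutual_info_classif

def band_power(x, fs, lo, hi):
    f, pxx = welch(x, fs=fs, nperseg=fs * 2)
    return pxx[(f >= lo) & (f < hi)].sum()

def attention_features(x, fs=256):
    theta = band_power(x, fs, 4, 8)
    alpha = band_power(x, fs, 8, 13)
    beta = band_power(x, fs, 13, 30)
    total = theta + alpha + beta
    return [beta / total, theta / beta, theta / (alpha + beta)]  # rel. beta, TBR, TBAR

# X = np.array([attention_features(epoch) for epoch in epochs])
# mi = mutual_info_classif(X, labels)   # per-feature informativeness, per subject
```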
NASA Astrophysics Data System (ADS)
Song, Bowen; Zhang, Guopeng; Wang, Huafeng; Zhu, Wei; Liang, Zhengrong
2013-02-01
Various types of features, e.g., geometric features, texture features, projection features, etc., have been introduced for polyp detection and differentiation tasks in computer-aided detection and diagnosis (CAD) for computed tomography colonography (CTC). Although these features together cover more of the information in the data, some of them are statistically highly related to others, which makes the feature set redundant and burdens the computation task of CAD. In this paper, we propose a new dimension reduction method which combines hierarchical clustering and principal component analysis (PCA) for the false positive (FP) reduction task. First, we group all the features based on their similarity using hierarchical clustering, and then PCA is employed within each group. Different numbers of principal components are selected from each group to form the final feature set. A support vector machine is used to perform the classification. The results show that when three principal components are chosen from each group, we achieve an area under the receiver operating characteristic curve of 0.905, as high as with the original feature set. Meanwhile, the computation time is reduced by 70% and the feature set size is reduced by 77%. It can be concluded that the proposed method captures the most important information in the feature set and that classification accuracy is not affected by the dimension reduction. The results are promising, and further investigations, such as automatic threshold setting, are worthwhile and in progress.
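A sketch of the cluster-then-PCA idea using scipy and scikit-learn; the correlation-based distance, group count, and components per group are illustrative choices:

```python
# Group correlated features hierarchically, keep a few PCs per group.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

def cluster_then_pca(X, n_groups=10, comps_per_group=3):
    corr = np.corrcoef(X, rowvar=False)
    dist = 1.0 - np.abs(corr)                       # similar features -> small distance
    Z = linkage(dist[np.triu_indices_from(dist, k=1)], method="average")
    groups = fcluster(Z, t=n_groups, criterion="maxclust")
    reduced = []
    for g in np.unique(groups):
        Xg = X[:, groups == g]
        k = min(comps_per_group, Xg.shape[1])
        reduced.append(PCA(n_components=k).fit_transform(Xg))
    return np.hstack(reduced)

# X_reduced = cluster_then_pca(candidate_features)  # then train the SVM on X_reduced
```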
Mougiakakou, Stavroula G; Valavanis, Ioannis K; Nikita, Alexandra; Nikita, Konstantina S
2007-09-01
The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed. A number of regions of interest (ROIs) corresponding to C1-C4 were defined by experienced radiologists in non-enhanced liver CT images. For each ROI, five distinct sets of texture features were extracted using first-order statistics, the spatial gray level dependence matrix, the gray level difference method, Laws' texture energy measures, and fractal dimension measurements. Two different ECs were constructed and compared. The first consists of five multilayer perceptron neural networks (NNs), each using as input one of the computed texture feature sets or its reduced version after genetic algorithm-based feature selection. The second EC comprised five different primary classifiers, namely one multilayer perceptron NN, one probabilistic NN, and three k-nearest neighbor classifiers, each fed with the combination of the five texture feature sets or their reduced versions. The final decision of each EC was extracted using appropriate voting schemes, while bootstrap re-sampling was utilized in order to estimate the generalization ability of the CAD architectures based on the available, relatively small-sized data set. The best mean classification accuracy (84.96%) is achieved by the second EC using a fused feature set and the weighted voting scheme. The fused feature set was obtained after appropriate feature selection applied to specific subsets of the original feature set. The comparative assessment of the various CAD architectures shows that combining three types of classifiers with a voting scheme, fed with identical feature sets obtained after appropriate feature selection and fusion, may result in an accurate system able to assist differential diagnosis of focal liver lesions from non-enhanced CT images.
Capture by colour: evidence for dimension-specific singleton capture.
Harris, Anthony M; Becker, Stefanie I; Remington, Roger W
2015-10-01
Previous work on attentional capture has shown the attentional system to be quite flexible in the stimulus properties it can be set to respond to. Several different attentional "modes" have been identified. Feature search mode allows attention to be set for specific features of a target (e.g., red). Singleton detection mode sets attention to respond to any discrepant item ("singleton") in the display. Relational search sets attention for the relative properties of the target in relation to the distractors (e.g., redder, larger). Recently, a new attentional mode was proposed that sets attention to respond to any singleton within a particular feature dimension (e.g., colour; Folk & Anderson, 2010). We tested this proposal against the predictions of previously established attentional modes. In a spatial cueing paradigm, participants searched for a colour target that was randomly either red or green. The nature of the attentional control setting was probed by presenting an irrelevant singleton cue prior to the target display and assessing whether it attracted attention. In all experiments, the cues were red, green, blue, or a white stimulus rapidly rotated (motion cue). The results of three experiments support the existence of a "colour singleton set," finding that all colour cues captured attention strongly, while motion cues captured attention only weakly or not at all. Notably, we also found that capture by motion cues in search for colour targets was moderated by their frequency; rare motion cues captured attention (weakly), while frequent motion cues did not.
Electrophysiological evidence for parallel and serial processing during visual search.
Luck, S J; Hillyard, S A
1990-12-01
Event-related potentials were recorded from young adults during a visual search task in order to evaluate parallel and serial models of visual processing in the context of Treisman's feature integration theory. Parallel and serial search strategies were produced by the use of feature-present and feature-absent targets, respectively. In the feature-absent condition, the slopes of the functions relating reaction time and latency of the P3 component to set size were essentially identical, indicating that the longer reaction times observed for larger set sizes can be accounted for solely by changes in stimulus identification and classification time, rather than changes in post-perceptual processing stages. In addition, the amplitude of the P3 wave on target-present trials in this condition increased with set size and was greater when the preceding trial contained a target, whereas P3 activity was minimal on target-absent trials. These effects are consistent with the serial self-terminating search model and appear to contradict parallel processing accounts of attention-demanding visual search performance, at least for a subset of search paradigms. Differences in ERP scalp distributions further suggested that different physiological processes are utilized for the detection of feature presence and absence.
Kim, Eunji; Ivanov, Ivan; Hua, Jianping; Lampe, Johanna W; Hullar, Meredith Aj; Chapkin, Robert S; Dougherty, Edward R
2017-01-01
Ranking feature sets for phenotype classification based on gene expression is a challenging issue in cancer bioinformatics. When the number of samples is small, all feature selection algorithms are known to be unreliable, producing significant error, and error estimators suffer from different degrees of imprecision. The problem is compounded by the fact that the accuracy of classification depends on the manner in which the phenomena are transformed into data by the measurement technology. Because next-generation sequencing technologies amount to a nonlinear transformation of the actual gene or RNA concentrations, they can potentially produce less discriminative data relative to the actual gene expression levels. In this study, we compare the performance of ranking feature sets derived from a model of RNA-Seq data with that of a multivariate normal model of gene concentrations using 3 measures: (1) ranking power, (2) length of extensions, and (3) Bayes features. This is the first model-based study to examine the effectiveness of reporting lists of small feature sets using RNA-Seq data and the effects of different model parameters and error estimators. The results demonstrate that the general trends of the parameter effects on the ranking power of the underlying gene concentrations are preserved in the RNA-Seq data, whereas the power of finding a good feature set becomes weaker when gene concentrations are transformed by the sequencing machine.
Munitions related feature extraction from LIDAR data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, Barry L.
2010-06-01
The characterization of former military munitions ranges is critical in the identification of areas likely to contain residual unexploded ordnance (UXO). Although these ranges are large, often covering tens-of-thousands of acres, the actual target areas represent only a small fraction of the sites. The challenge is that many of these sites do not have records indicating the locations of former target areas. The identification of target areas is critical in the characterization and remediation of these sites. The Strategic Environmental Research and Development Program (SERDP) and Environmental Security Technology Certification Program (ESTCP) of the DoD have been developing and implementing techniques for the efficient characterization of large munitions ranges. As part of this process, high-resolution LIDAR terrain data sets have been collected over several former ranges. These data sets have been shown to contain information relating to former munitions usage at these ranges, specifically terrain cratering due to high-explosives detonations. The location and relative intensity of crater features can provide information critical in reconstructing the usage history of a range, and indicate areas most likely to contain UXO. We have developed an automated procedure using an adaptation of the Circular Hough Transform for the identification of crater features in LIDAR terrain data. The Circular Hough Transform is highly adept at finding circular features (craters) in noisy terrain data sets. This technique has the ability to find features of a specific radius, providing a means of filtering features based on expected scale and providing additional spatial characterization of the identified feature. This method of automated crater identification has been applied to several former munitions ranges with positive results.
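A sketch of crater detection with the Circular Hough Transform, assuming scikit-image; the radius range, edge detector, and peak count are illustrative:

```python
# Candidate crater centres in a LIDAR-derived terrain raster.
import numpy as np
from skimage.feature import canny
from skimage.transform import hough_circle, hough_circle_peaks

def find_craters(dem, radii=np.arange(3, 15)):
    edges = canny(dem, sigma=2.0)                    # edge map of the terrain raster
    accum = hough_circle(edges, radii)               # one accumulator per radius
    strengths, cx, cy, r = hough_circle_peaks(accum, radii, total_num_peaks=200)
    return list(zip(cx, cy, r, strengths))           # (x, y, radius, peak strength)

# Peaks can be filtered by strength and expected crater scale before
# flagging areas likely to contain residual UXO.
```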
An ensemble method for extracting adverse drug events from social media.
Liu, Jing; Zhao, Songzheng; Zhang, Xiaodi
2016-06-01
Because adverse drug events (ADEs) are a serious health problem and a leading cause of death, it is of vital importance to identify them correctly and in a timely manner. With the development of Web 2.0, social media has become a large data source for information on ADEs. The objective of this study is to develop a relation extraction system that uses natural language processing techniques to effectively distinguish between ADEs and non-ADEs in informal text on social media. We develop a feature-based approach that utilizes various lexical, syntactic, and semantic features. Information-gain-based feature selection is performed to address the high-dimensional features. We then evaluate the effectiveness of four well-known kernel-based approaches (i.e., subset tree kernel, tree kernel, shortest dependency path kernel, and all-paths graph kernel) and several ensembles generated by adopting different combination methods (i.e., majority voting, weighted averaging, and stacked generalization). All of the approaches are tested using three data sets: two health-related discussion forums and one general social networking site (i.e., Twitter). When investigating the contribution of each feature subset, the feature-based approach attains the best area under the receiver operating characteristic curve (AUC) values, which are 78.6%, 72.2%, and 79.2% on the three data sets. When individual methods are used, we attain the best AUC values of 82.1%, 73.2%, and 77.0% using the subset tree kernel, shortest dependency path kernel, and feature-based approach on the three data sets, respectively. When using classifier ensembles, we achieve the best AUC values of 84.5%, 77.3%, and 84.5% on the three data sets, outperforming the baselines. Our experimental results indicate that ADE extraction from social media can benefit from feature selection. With respect to the effectiveness of different feature subsets, lexical features and semantic features can enhance the ADE extraction capability. Kernel-based approaches, which avoid the feature sparsity issue, are well suited to the ADE extraction problem. Combining different individual classifiers using suitable combination methods can further enhance ADE extraction effectiveness.
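A sketch of one of the combination methods named above, stacked generalization, using scikit-learn's StackingClassifier; the base learners and meta-learner are illustrative choices, not the paper's exact models:

```python
# Stacked generalization over heterogeneous base learners.
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

stack = StackingClassifier(
    estimators=[
        ("svm", SVC(kernel="rbf", probability=True)),
        ("rf", RandomForestClassifier(n_estimators=200)),
    ],
    final_estimator=LogisticRegression(),   # meta-learner over base predictions
    cv=5,                                   # out-of-fold predictions for the meta-learner
)
# stack.fit(X_train, y_train); AUC via stack.predict_proba on held-out data
```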
NASA Astrophysics Data System (ADS)
Thomaz, Ricardo L.; Carneiro, Pedro C.; Patrocinio, Ana C.
2017-03-01
Breast cancer is the leading cause of death for women in most countries. The high mortality relates mostly to late diagnosis and to the directly proportional relationship between breast density and breast cancer development. Therefore, the correct assessment of breast density is important to provide better screening for higher-risk patients. However, in modern digital mammography the discrimination among breast densities is highly complex due to increased contrast and visual information for all densities. Thus, a computational system for classifying breast density might be a useful tool for aiding medical staff. Several machine-learning algorithms are already capable of classifying a small number of classes with good accuracy. However, the main constraint of machine-learning algorithms relates to the set of features extracted and used for classification. Although well-known feature extraction techniques might provide a good set of features, it is a complex task to select an initial set during the design of a classifier. Thus, we propose feature extraction using a Convolutional Neural Network (CNN) for classifying breast density with a conventional machine-learning classifier. We used 307 mammographic images, downsampled to 260x200 pixels, to train a CNN and extract features from a deep layer. After training, the activations of 8 neurons from a deep fully connected layer are extracted and used as features. These features are then fed forward to a single-hidden-layer neural network that is cross-validated using 10 folds to classify among four classes of breast density. The global accuracy of this method is 98.4%, with only 1.6% misclassification. However, the small set of samples and memory constraints required the reuse of data in both the CNN and the MLP-NN, so overfitting might have influenced the results even though we cross-validated the network. Thus, although we present a promising method for extracting features and classifying breast density, a larger database is still required for evaluating the results.
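A minimal PyTorch sketch of the extract-features-from-a-deep-layer idea; the architecture and layer sizes are assumptions, with only the 8-neuron feature layer taken from the abstract:

```python
# Small CNN whose penultimate fully connected layer doubles as a feature extractor.
import torch
import torch.nn as nn

class DensityCNN(nn.Module):
    def __init__(self, n_features=8, n_classes=4):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.fc_feat = nn.Linear(32 * 4 * 4, n_features)  # the 8-neuron feature layer
        self.fc_out = nn.Linear(n_features, n_classes)

    def forward(self, x, return_features=False):
        h = self.conv(x).flatten(1)
        feats = torch.relu(self.fc_feat(h))
        return feats if return_features else self.fc_out(feats)

# After training end-to-end, extract features for the second-stage classifier:
# with torch.no_grad():
#     feats = model(mammograms, return_features=True)   # (N, 8) input for the MLP
```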
Artificial neural networks for acoustic target recognition
NASA Astrophysics Data System (ADS)
Robertson, James A.; Mossing, John C.; Weber, Bruce A.
1995-04-01
Acoustic sensors can be used to detect, track and identify non-line-of-sight targets passively. Attempts to alter acoustic emissions often result in an undesirable performance degradation. This research project investigates the use of neural networks for differentiating between features extracted from the acoustic signatures of sources. Acoustic data were filtered and digitized using a commercially available analog-to-digital converter. The digital data were transformed to the frequency domain for additional processing using the FFT. Narrowband peak detection algorithms were incorporated to select peaks above a user-defined SNR. These peaks were then used to generate a set of robust features which relate specifically to target components in varying background conditions. The features were then used as input into a backpropagation neural network. A K-means unsupervised clustering algorithm was used to determine the natural clustering of the observations. Comparisons were made between a feature set consisting of the normalized amplitudes of the first 250 frequency bins of the power spectrum and a set of 11 harmonically related features. Initial results indicate that even though some different target types had a tendency to group in the same clusters, the neural network was able to differentiate the targets. Successful identification of acoustic sources under varying operational conditions with high confidence levels was achieved.
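The front end described here (FFT, SNR-thresholded narrowband peak picking, and a small set of harmonically related features) might look roughly like the following sketch; the count of 11 harmonics follows the abstract, while the SNR threshold, window, and toy signal are assumptions.

```python
import numpy as np
from scipy.signal import find_peaks

def harmonic_features(x, fs, snr_db=10.0, n_harmonics=11):
    spec = np.abs(np.fft.rfft(x * np.hanning(len(x)))) ** 2   # power spectrum
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    noise_floor = np.median(spec)
    # narrowband peaks above a user-defined SNR relative to the noise floor
    peaks, _ = find_peaks(spec, height=noise_floor * 10 ** (snr_db / 10))
    if len(peaks) == 0:
        return np.zeros(n_harmonics)
    f0 = freqs[peaks[np.argmax(spec[peaks])]]  # strongest line taken as the fundamental
    # amplitudes at the first n harmonics, normalized to the fundamental
    amps = [spec[np.argmin(np.abs(freqs - k * f0))] for k in range(1, n_harmonics + 1)]
    return np.array(amps) / max(amps[0], 1e-12)

fs = 8000.0
t = np.arange(0, 1, 1 / fs)
sig = sum(np.sin(2 * np.pi * 60 * k * t) / k for k in (1, 2, 3))  # toy harmonic source
print(harmonic_features(sig, fs))
```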
Quality assessment of data discrimination using self-organizing maps.
Mekler, Alexey; Schwarz, Dmitri
2014-10-01
One of the important aspects of the data classification problem lies in making the most appropriate selection of features. The set of variables should be small and, at the same time, should provide reliable discrimination of the classes. A method for evaluating discriminating power that enables comparison between different sets of variables is therefore useful in the search for such a set. A new approach to feature selection is presented. Two methods for evaluating the data discriminating power of a feature set are suggested. Both methods implement self-organizing maps (SOMs) and newly introduced exponents of the degree of data clusterization on the SOM. The first method is based on the comparison of intraclass and interclass distances on the map. The other method evaluates the relative number of a best matching unit's (BMU's) nearest neighbors belonging to the same class. Both methods make it possible to evaluate the discriminating power of a feature set in cases where the set provides nonlinear discrimination of the classes. The algorithms in program code can be downloaded for free at http://mekler.narod.ru/Science/Articles_support.html, as well as the supporting data files. Copyright © 2014 Elsevier Inc. All rights reserved.
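The second evaluation method (class purity among samples mapped near a BMU) can be sketched with the third-party minisom package; the map size, training length, and toy two-class data are assumptions, not the authors' code.

```python
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 4)), rng.normal(3, 1, (50, 4))])
y = np.array([0] * 50 + [1] * 50)

som = MiniSom(8, 8, 4, sigma=1.0, learning_rate=0.5, random_seed=1)
som.train_random(X, 2000)

bmus = np.array([som.winner(x) for x in X])  # best matching unit per sample
same_class, total = 0, 0
for i in range(len(X)):
    # samples whose BMU lies within one map unit of sample i's BMU
    near = np.max(np.abs(bmus - bmus[i]), axis=1) <= 1
    near[i] = False
    same_class += np.sum(y[near] == y[i])
    total += np.sum(near)
print("neighbourhood class purity:", same_class / max(total, 1))
```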
Mirzarezaee, Mitra; Araabi, Babak N; Sadeghi, Mehdi
2010-12-19
It has been understood that biological networks have modular organizations, which are the sources of their observed complexity. Analysis of networks and motifs has shown that two types of hubs, party hubs and date hubs, are responsible for this complexity. Party hubs are local coordinators because of their high co-expression with their partners, whereas date hubs display low co-expression and are assumed to be global connectors. However, there is no consensus on these concepts in the related literature, with different studies reporting their results on different data sets. We investigated whether there is a relation between the biological features of Saccharomyces cerevisiae's proteins and their roles as non-hubs, intermediately connected, party hubs, and date hubs. We propose a classifier that separates these four classes. We extracted different biological characteristics including amino acid sequences, domain contents, repeated domains, functional categories, biological processes, cellular compartments, disordered regions, and position specific scoring matrix from various sources. Several classifiers are examined and the best feature-sets, based on average correct classification rate and correlation coefficients of the results, are selected. We show that fusion of five feature-sets, including domains, Position Specific Scoring Matrix-400, cellular compartments level one, and composition pairs with two and one gaps, provides the best discrimination with an average correct classification rate of 77%. We study a variety of known biological feature-sets of the proteins and show that there is a relation between domains, Position Specific Scoring Matrix-400, cellular compartments level one, composition pairs with two and one gaps of Saccharomyces cerevisiae's proteins, and their roles in the protein interaction network as non-hubs, intermediately connected, party hubs and date hubs. This study also confirms the possibility of predicting non-hubs, party hubs and date hubs from their biological features with acceptable accuracy. If such a hypothesis is correct for other species as well, similar methods can be applied to predict the roles of proteins in those species.
NASA Astrophysics Data System (ADS)
Wang, Shijun; Yao, Jianhua; Petrick, Nicholas A.; Summers, Ronald M.
2009-02-01
Colon cancer is the second leading cause of cancer-related deaths in the United States. Computed tomographic colonography (CTC) combined with a computer aided detection system provides a feasible approach for improving colonic polyp detection and increasing the use of CTC for colon cancer screening. To distinguish true polyps from false positives, various features extracted from polyp candidates have been proposed. Most of these features try to capture the shape information of polyp candidates or neighborhood knowledge about the surrounding structures (fold, colon wall, etc.). In this paper, we propose a new set of shape descriptors for polyp candidates based on statistical curvature information. These features, called histogram of curvature features, are rotation, translation and scale invariant and can be treated as complementing our existing feature set. Then, in order to make full use of the traditional features (defined as group A) and the new features (group B), which are highly heterogeneous, we employed a multiple kernel learning method based on semi-definite programming to identify an optimized classification kernel based on the combined set of features. We performed a leave-one-patient-out test on a CTC dataset which contained scans from 50 patients (with 90 polyp detections in the 6-9 mm range). Experimental results show that a support vector machine (SVM) based on the combined feature set and the semi-definite optimization kernel achieved higher FROC performance compared to SVMs using the two groups of features separately. At a false positive per patient rate of 7, the sensitivity on 6-9 mm polyps using the combined features improved from 0.78 (Group A) and 0.73 (Group B) to 0.82 (p<=0.01).
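The SDP-based multiple kernel learning step is too involved for a short example, but the core idea (an SVM over a weighted combination of kernels built from two heterogeneous feature groups) can be sketched as follows; the fixed weight of 0.5 stands in for what the optimization would learn, and the data are synthetic.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
A = rng.normal(size=(120, 20))    # group A: traditional shape features
B = rng.normal(size=(120, 30))    # group B: histogram-of-curvature features
y = rng.integers(0, 2, size=120)  # polyp vs false positive (toy labels)

w = 0.5                           # kernel weight; the SDP-based MKL would learn this
K = w * rbf_kernel(A) + (1 - w) * rbf_kernel(B)  # combined kernel over both groups

clf = SVC(kernel="precomputed").fit(K, y)
print("training accuracy:", clf.score(K, y))
```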
Separate class true discovery rate degree of association sets for biomarker identification.
Crager, Michael R; Ahmed, Murat
2014-01-01
In 2008, Efron showed that biological features in a high-dimensional study can be divided into classes and a separate false discovery rate (FDR) analysis can be conducted in each class using information from the entire set of features to assess the FDR within each class. We apply this separate class approach to true discovery rate degree of association (TDRDA) set analysis, which is used in clinical-genomic studies to identify sets of biomarkers having strong association with clinical outcome or state while controlling the FDR. Careful choice of classes based on prior information can increase the identification power of the separate class analysis relative to the overall analysis.
NASA Astrophysics Data System (ADS)
Jaferzadeh, Keyvan; Moon, Inkyu
2016-12-01
The classification of erythrocytes plays an important role in the field of hematological diagnosis, specifically blood disorders. Since the biconcave shape of the red blood cell (RBC) is altered during the different stages of hematological disorders, we believe that three-dimensional (3-D) morphological features of erythrocytes provide better classification results than conventional two-dimensional (2-D) features. Therefore, we introduce a set of 3-D features related to the morphological and chemical properties of the RBC profile and evaluate the discrimination power of these features against 2-D features with a neural network classifier. The 3-D features include erythrocyte surface area, volume, average cell thickness, sphericity index, sphericity coefficient and functionality factor, MCH and MCHSD, and two newly introduced features extracted from the ring section of the RBC at the single-cell level. In contrast, the 2-D features are RBC projected surface area, perimeter, radius, elongation, and projected surface area to perimeter ratio. All features are obtained from images visualized by off-axis digital holographic microscopy with a numerical reconstruction algorithm, and four categories of RBC are of interest: biconcave (doughnut shape), flat-disc, stomatocyte, and echinospherocyte. Our experimental results demonstrate that the 3-D features can be more useful in RBC classification than the 2-D features. Finally, we choose the best feature set from the 2-D and 3-D features by a sequential forward feature selection technique, which yields better discrimination results. We believe that the final feature set, evaluated with a neural network classification strategy, can improve RBC classification accuracy.
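Sequential forward feature selection of this kind is available off the shelf; a sketch with scikit-learn, where the pooled 2-D/3-D feature matrix, the four-class labels, and the target of six selected features are illustrative assumptions.

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(160, 15))    # columns would hold volume, sphericity, MCH, perimeter, ...
y = rng.integers(0, 4, size=160)  # four RBC shape categories

# greedily add features that most improve cross-validated accuracy
sfs = SequentialFeatureSelector(
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=500),
    n_features_to_select=6, direction="forward", cv=3)
sfs.fit(X, y)
print("selected feature columns:", np.flatnonzero(sfs.get_support()))
```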
Information based universal feature extraction
NASA Astrophysics Data System (ADS)
Amiri, Mohammad; Brause, Rüdiger
2015-02-01
In many real-world image based pattern recognition tasks, the extraction and usage of task-relevant features are the most crucial part of the diagnosis. In the standard approach, they mostly remain task-specific, although humans who perform such a task always use the same image features, trained in early childhood. It seems that universal feature sets exist, but they have not yet been systematically found. In our contribution, we tried to find those universal image feature sets that are valuable for most image-related tasks. In our approach, we trained a neural network on natural and non-natural images of objects and background, using a Shannon information-based algorithm and learning constraints. The goal was to extract those features that give the most valuable information for the classification of visual objects and hand-written digits. This gives a good start and performance increase for all other image learning tasks, implementing a transfer learning approach. As a result, we found that we could indeed extract features that are valid in all three kinds of tasks.
A hybrid feature selection approach for the early diagnosis of Alzheimer’s disease
NASA Astrophysics Data System (ADS)
Gallego-Jutglà, Esteve; Solé-Casals, Jordi; Vialatte, François-Benoît; Elgendi, Mohamed; Cichocki, Andrzej; Dauwels, Justin
2015-02-01
Objective. Recently, significant advances have been made in the early diagnosis of Alzheimer's disease (AD) from electroencephalography (EEG). However, choosing suitable measures is a challenging task. Among other measures, frequency relative power (RP) and loss of complexity have been used with promising results. In the present study we investigate the early diagnosis of AD using synchrony measures and frequency RP on EEG signals, examining the changes found in different frequency ranges. Approach. We first explore the use of a single feature for computing the classification rate (CR), looking for the best frequency range. Then, we present a multiple feature classification system that outperforms all previous results using a feature selection strategy. These two approaches are tested on two different databases, one containing mild cognitive impairment (MCI) and healthy subjects (patients' age: 71.9 ± 10.2, healthy subjects' age: 71.7 ± 8.3), and the other containing mild AD and healthy subjects (patients' age: 77.6 ± 10.0, healthy subjects' age: 69.4 ± 11.5). Main results. Using a single feature to compute CRs we achieve a performance of 78.33% for the MCI data set and of 97.56% for mild AD. Results are clearly improved using the multiple feature classification, where a CR of 95% is found for the MCI data set using 11 features, and 100% for the mild AD data set using four features. Significance. The new feature selection method described in this work may be a reliable tool that could help to design a realistic system that does not require prior knowledge of a patient's status. With that aim, we explore the standardization of features for the MCI and mild AD data sets with promising results.
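The frequency relative power feature itself is straightforward to compute; a hedged sketch using Welch's method, where the alpha band edges, sampling rate, and toy signal are assumptions rather than the paper's settings.

```python
import numpy as np
from scipy.signal import welch

def relative_power(eeg, fs, band=(8.0, 12.0)):
    # band power divided by total power, from the Welch power spectral density
    f, pxx = welch(eeg, fs=fs, nperseg=min(len(eeg), 2 * int(fs)))
    in_band = (f >= band[0]) & (f < band[1])
    return np.trapz(pxx[in_band], f[in_band]) / np.trapz(pxx, f)

fs = 250.0
t = np.arange(0, 10, 1 / fs)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.random.randn(len(t))  # alpha-dominated toy signal
print("alpha relative power:", relative_power(eeg, fs))
```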
Pippel, Kristina; Meinck, M; Lübke, N
2017-06-01
Mobile geriatric rehabilitation can be provided in the setting of nursing homes, short-term care (STC) facilities and exclusively in private homes. This study analyzed the common features and differences of mobile rehabilitation interventions in various settings. Stratified by setting, 1,879 anonymized mobile geriatric rehabilitation treatments between 2011 and 2014 from 11 participating institutions were analyzed with respect to patient, process and outcome-related features. Significant differences between the settings of nursing home (n = 514, 27 %), STC (n = 167, 9 %) and private homes (n = 1198, 64 %) were evident for mean age (83 years, 83 years and 80 years, respectively), percentage of women (72 %, 64 % and 55 %), degree of dependency on pre-existing care (92 %, 76 % and 64 %), total treatment sessions (TS; 38 TS, 42 TS and 41 TS), treatment duration (54 days, 61 days and 58 days) as well as the Barthel index at the start of rehabilitation (34 points, 39 points and 46 points) and the gain in the Barthel index (15 points, 21 points and 18 points), whereby the gain in the capacity for self-sufficiency was significant in all settings. The setting-specific evaluation of mobile geriatric rehabilitation showed differences for relevant patient, process and outcome-related features. Compared to inpatient rehabilitation, mobile rehabilitation in all settings made an above-average contribution to the rehabilitation of patients with a pre-existing dependency on care. The gains in the capacity for self-sufficiency achieved in all settings support the efficacy of mobile geriatric rehabilitation under the current prerequisites for applicability.
Janet, Jon Paul; Kulik, Heather J
2017-11-22
Machine learning (ML) of quantum mechanical properties shows promise for accelerating chemical discovery. For transition metal chemistry where accurate calculations are computationally costly and available training data sets are small, the molecular representation becomes a critical ingredient in ML model predictive accuracy. We introduce a series of revised autocorrelation functions (RACs) that encode relationships of the heuristic atomic properties (e.g., size, connectivity, and electronegativity) on a molecular graph. We alter the starting point, scope, and nature of the quantities evaluated in standard ACs to make these RACs amenable to inorganic chemistry. On an organic molecule set, we first demonstrate superior standard AC performance to other presently available topological descriptors for ML model training, with mean unsigned errors (MUEs) for atomization energies on set-aside test molecules as low as 6 kcal/mol. For inorganic chemistry, our RACs yield 1 kcal/mol ML MUEs on set-aside test molecules in spin-state splitting in comparison to 15-20× higher errors for feature sets that encode whole-molecule structural information. Systematic feature selection methods including univariate filtering, recursive feature elimination, and direct optimization (e.g., random forest and LASSO) are compared. Random-forest- or LASSO-selected subsets 4-5× smaller than the full RAC set produce sub- to 1 kcal/mol spin-splitting MUEs, with good transferability to metal-ligand bond length prediction (0.004-5 Å MUE) and redox potential on a smaller data set (0.2-0.3 eV MUE). Evaluation of feature selection results across property sets reveals the relative importance of local, electronic descriptors (e.g., electronegativity, atomic number) in spin-splitting and distal, steric effects in redox potential and bond lengths.
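Standard autocorrelation descriptors, the starting point that the RACs revise, are easy to state: for each bond-path distance d, sum the products of an atomic property over all atom pairs at that distance. A sketch on a toy molecular graph with networkx; the tiny chain and its electronegativity values are illustrative, not the paper's data.

```python
import networkx as nx

def autocorrelation(G, prop, max_depth=3):
    # AC_d = sum over atom pairs (i, j) with graph distance d of P_i * P_j
    dist = dict(nx.all_pairs_shortest_path_length(G, cutoff=max_depth))
    ac = [0.0] * (max_depth + 1)
    for i in G:
        for j, d in dist[i].items():
            ac[d] += G.nodes[i][prop] * G.nodes[j][prop]
    return ac

G = nx.Graph([(0, 1), (1, 2), (2, 3)])       # a 4-atom chain as the molecular graph
chi = {0: 2.55, 1: 3.04, 2: 2.55, 3: 3.44}   # Pauling electronegativities (C, N, C, O)
nx.set_node_attributes(G, chi, "chi")
print(autocorrelation(G, "chi"))             # one descriptor per depth 0..3
```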
NASA Astrophysics Data System (ADS)
Jagodziński, Dariusz; Matysiewicz, Mateusz; Neumann, Łukasz; Nowak, Robert M.; Okuniewski, Rafał; Oleszkiewicz, Witold; Cichosz, Paweł
2016-09-01
This contribution introduces a method for detecting cancer pathologies in breast skin temperature distribution images. The use of thermosensitive foils applied to the breast skin allows the creation of thermograms, which display the amount of infrared energy emitted by the breast cells. Significant foci of hyperthermia or inflammation are typical of cancer cells. These foci can be recognized on thermograms as contours, which are areas of higher temperature. Every contour can be converted to a feature set that describes it, using raw, central, Hu, outline, Fourier and colour moments computed from the image pixels. This paper also defines a new way of describing a set of contours through their neighbourhood relations. The contribution moreover introduces a way of ranking and selecting the most relevant features. The authors used a neural network with Gevrey's concept and recursive feature elimination to estimate feature importance.
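The per-contour moment features can be sketched with OpenCV: threshold the thermogram, extract contours, and compute raw, central, and Hu moments for each. The threshold value and the stand-in image are assumptions, and the outline, Fourier, and colour moments are omitted for brevity.

```python
import cv2
import numpy as np

thermogram = np.random.randint(0, 255, (240, 320), dtype=np.uint8)  # stand-in temperature image
_, mask = cv2.threshold(thermogram, 200, 255, cv2.THRESH_BINARY)    # hot (higher-temperature) regions
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

features = []
for c in contours:
    m = cv2.moments(c)             # raw (m00, m10, ...) and central (mu20, ...) moments
    hu = cv2.HuMoments(m).ravel()  # seven rotation-invariant Hu moments
    features.append(np.concatenate(([m["m00"], m["mu20"], m["mu02"]], hu)))
print(len(features), "contours described")
```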
NASA Astrophysics Data System (ADS)
Mazza, F.; Da Silva, M. P.; Le Callet, P.; Heynderickx, I. E. J.
2015-03-01
Multimedia quality assessment has been an important research topic during the last decades. The original focus on artifact visibility has been extended over the years to aspects such as image aesthetics, interestingness and memorability. More recently, Fedorovskaya proposed the concept of 'image psychology': this concept focuses on additional quality dimensions related to human content processing. While these additional dimensions are very valuable in understanding preferences, it is very hard to define, isolate and measure their effect on quality. In this paper we continue our research on face pictures, investigating which image factors influence context perception. We collected perceived fit of a set of images to various content categories. These categories were selected based on current typologies in social networks. Logistic regression was adopted to model category fit based on image features. In this model we used both low-level and high-level features, the latter focusing on complex features related to image content. In order to extract these high-level features, we relied on crowdsourcing, since computer vision algorithms are not yet sufficiently accurate for the features we needed. Our results underline the importance of some high-level content features, e.g., the dress of the portrayed person and the scene setting, in categorizing images.
Bermeitinger, Christina; Wentura, Dirk; Frings, Christian
2011-06-01
"Semantic priming" refers to the phenomenon that people react faster to target words preceded by semantically related rather than semantically unrelated words. We wondered whether momentary mind sets modulate semantic priming for natural versus artifactual categories. We interspersed a category priming task with a second task that required participants to react to either the perceptual or action features of simple geometric shapes. Focusing on perceptual features enhanced semantic priming effects for natural categories, whereas focusing on action features enhanced semantic priming effects for artifactual categories. In fact, significant priming effects emerged only for those categories thought to rely on the features activated by the second task. This result suggests that (a) priming effects depend on momentary mind set and (b) features can be weighted flexibly in concept representations; it is also further evidence for sensory-functional accounts of concept and category representation.
Normalization of relative and incomplete temporal expressions in clinical narratives.
Sun, Weiyi; Rumshisky, Anna; Uzuner, Ozlem
2015-09-01
To improve the normalization of relative and incomplete temporal expressions (RI-TIMEXes) in clinical narratives. We analyzed the RI-TIMEXes in temporally annotated corpora and propose two hypotheses regarding the normalization of RI-TIMEXes in the clinical narrative domain: the anchor point hypothesis and the anchor relation hypothesis. We annotated the RI-TIMEXes in three corpora to study the characteristics of RI-TIMEXes in different domains. This informed the design of our RI-TIMEX normalization system for the clinical domain, which consists of an anchor point classifier, an anchor relation classifier, and a rule-based RI-TIMEX text span parser. We experimented with different feature sets and performed an error analysis for each system component. The annotation confirmed the hypotheses that we can simplify the RI-TIMEX normalization task using two multi-label classifiers. Our system achieves anchor point classification, anchor relation classification, and rule-based parsing accuracy of 74.68%, 87.71%, and 57.2% (82.09% under relaxed matching criteria), respectively, on the held-out test set of the 2012 i2b2 temporal relation challenge. Experiments with feature sets reveal some interesting findings, such as: the verbal tense feature does not inform the anchor relation classification in clinical narratives as much as the tokens near the RI-TIMEX. Error analysis showed that underrepresented anchor point and anchor relation classes are difficult to detect. We formulate the RI-TIMEX normalization problem as a pair of multi-label classification problems. Considering only RI-TIMEX extraction and normalization, the system achieves statistically significant improvement over the RI-TIMEX results of the best systems in the 2012 i2b2 challenge. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Review of "The Policy Framework for Online Charter Schools"
ERIC Educational Resources Information Center
Miron, Gary
2016-01-01
Relative to earlier research, this study from the Center for Reinventing Public Education provides a more in-depth analysis of policy features across the 27 states that allow online charter schools. It presents a well-organized description of policy features and includes a set of policy recommendations that generally, but not always, follow well…
Region 9 NPL Sites (Superfund Sites 2013)
NPL site POINT locations for US EPA Region 9. NPL (National Priorities List) sites are hazardous waste sites that are eligible for extensive long-term cleanup under the Superfund program. Eligibility is determined by a scoring method called the Hazard Ranking System. Sites with high scores are listed on the NPL. The majority of the locations are derived from polygon centroids of digitized site boundaries. The remaining locations were generated from address geocoding and digitizing. The area covered by this data set includes Arizona, California, Nevada, Hawaii, Guam, American Samoa, the Northern Marianas and the Trust Territories. Attributes include NPL status codes, NPL industry type codes and environmental indicators. The related table, NPL_Contaminants, contains information about contaminated media types and chemicals. This is a one-to-many relate and can be related to the feature class using the relationship classes under the Feature Data Set ENVIRO_CONTAMINANT.
Using Activity-Related Behavioural Features towards More Effective Automatic Stress Detection
Giakoumis, Dimitris; Drosou, Anastasios; Cipresso, Pietro; Tzovaras, Dimitrios; Hassapis, George; Gaggioli, Andrea; Riva, Giuseppe
2012-01-01
This paper introduces activity-related behavioural features that can be automatically extracted from a computer system, with the aim to increase the effectiveness of automatic stress detection. The proposed features are based on processing of appropriate video and accelerometer recordings taken from the monitored subjects. For the purposes of the present study, an experiment was conducted that utilized a stress-induction protocol based on the Stroop colour word test. Video, accelerometer and biosignal (Electrocardiogram and Galvanic Skin Response) recordings were collected from nineteen participants. Then, an explorative study was conducted by following a methodology mainly based on spatiotemporal descriptors (Motion History Images) that are extracted from video sequences. A large set of activity-related behavioural features, potentially useful for automatic stress detection, were proposed and examined. Experimental evaluation showed that several of these behavioural features significantly correlate to self-reported stress. Moreover, it was found that the use of the proposed features can significantly enhance the performance of typical automatic stress detection systems, commonly based on biosignal processing. PMID:23028461
NASA Astrophysics Data System (ADS)
Riasati, Vahid R.
2016-05-01
In this work, the data covariance matrix is diagonalized to provide an orthogonal basis set using the eigenvectors of the data. The eigenvector decomposition of the data is transformed and filtered in the transform domain to truncate the data to robust features related to a specified set of targets. These truncated eigenfeatures are then combined and reconstructed for use in a composite filter, and consequently utilized for the automatic target detection of the same class of targets. The results of testing the current technique are evaluated using the peak-correlation and peak-correlation-energy metrics and are presented in this work. The inverse-transformed eigenbases of the current technique may be thought of as an injected sparsity that minimizes the data needed to represent the skeletal structure information associated with the set of targets under consideration.
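A numpy sketch of the diagonalize-truncate-reconstruct step described above; the number of retained eigenvectors and the synthetic data are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))   # rows: flattened target image chips
Xc = X - X.mean(axis=0)

C = np.cov(Xc, rowvar=False)     # data covariance matrix
w, V = np.linalg.eigh(C)         # eigenvalues ascending, eigenvectors orthonormal
V_k = V[:, -8:]                  # retain the 8 strongest eigenvectors as the basis

# reconstruction of the data from the truncated eigenbasis
X_trunc = Xc @ V_k @ V_k.T + X.mean(axis=0)
print("energy retained:", w[-8:].sum() / w.sum())
```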
DOE Office of Scientific and Technical Information (OSTI.GOV)
Potash, Peter J.; Bell, Eric B.; Harrison, Joshua J.
Predictive models for tweet deletion have been a relatively unexplored area of Twitter-related computational research. We first approach the deletion of tweets as a spam detection problem, applying a small set of handcrafted features to improve upon the current state-of-the-art in predicting deleted tweets. Next, we apply our approach to a dataset of deleted tweets that better reflects the current deletion rate. Since tweets are deleted for reasons beyond just the presence of spam, we apply topic modeling and text embeddings in order to capture the semantic content of tweets that can lead to tweet deletion. Our goal is to create an effective model that has a low-dimensional feature space and is also language-independent. A lean model would be computationally advantageous for processing high volumes of Twitter data, which can reach 9,885 tweets per second. Our results show that a small set of spam-related features combined with word topics and character-level text embeddings provides the best F1 when trained with a random forest model. The highest precision on the deleted tweet class is achieved by a modification of paragraph2vec to capture author identity.
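A hedged sketch of that feature recipe: a few handcrafted spam-style indicators concatenated with paragraph2vec (gensim's Doc2Vec) embeddings and fed to a random forest. The tiny corpus, the specific handcrafted features, and all hyperparameters are illustrative, not the authors' setup.

```python
import numpy as np
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.ensemble import RandomForestClassifier

tweets = ["free followers click here", "had a great day at the beach",
          "WIN CASH NOW!!!", "reading a good book tonight"]
deleted = np.array([1, 0, 1, 0])  # toy labels: 1 = later deleted

docs = [TaggedDocument(t.lower().split(), [i]) for i, t in enumerate(tweets)]
d2v = Doc2Vec(docs, vector_size=16, min_count=1, epochs=50)  # paragraph2vec embeddings

def handcrafted(t):  # hypothetical spam-style features
    return [t.count("!"), sum(w.isupper() for w in t.split()), "http" in t]

X = np.hstack([np.array([d2v.dv[i] for i in range(len(tweets))]),
               np.array([handcrafted(t) for t in tweets], dtype=float)])
clf = RandomForestClassifier(n_estimators=50).fit(X, deleted)
print(clf.predict(X))
```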
Fractal analysis of seafloor textures for target detection in synthetic aperture sonar imagery
NASA Astrophysics Data System (ADS)
Nabelek, T.; Keller, J.; Galusha, A.; Zare, A.
2018-04-01
Fractal analysis of an image is a mathematical approach to generating surface-related features from an image or image tile that can be applied to image segmentation and object recognition. In undersea target countermeasures, the targets of interest can appear as anomalies in a variety of contexts, i.e., visually different textures on the seafloor. In this paper, we evaluate the use of fractal dimension as a primary feature, and related characteristics as secondary features, to be extracted from synthetic aperture sonar (SAS) imagery for the purpose of target detection. We develop three separate methods for computing fractal dimension. Tiles with targets are compared to others from the same background textures without targets. The different fractal dimension feature methods are tested with respect to how well they can be used to detect targets vs. false alarms within the same contexts. These features are evaluated for utility using a set of image tiles extracted from a SAS data set generated by the U.S. Navy in conjunction with the Office of Naval Research. We find that all three methods perform well in the classification task, with a fractional Brownian motion model performing the best among the individual methods. We also find that the secondary features are just as useful, if not more so, in classifying false alarms vs. targets. The best classification accuracy overall, in our experimentation, is found when the features from all three methods are combined into a single feature vector.
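The paper's three specific estimators are not spelled out here, but a common baseline for a fractal-dimension feature is box counting; a generic sketch, with the binarization rule and box sizes as assumptions.

```python
import numpy as np

def box_counting_dimension(tile, sizes=(2, 4, 8, 16)):
    binary = tile > np.median(tile)  # binarize the tile around its median intensity
    counts = []
    for s in sizes:
        h, w = binary.shape
        # partition into s x s boxes and count boxes containing foreground
        boxes = binary[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(boxes.any(axis=(1, 3))))
    # slope of log(count) vs log(1/size) estimates the fractal dimension
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return slope

tile = np.random.rand(128, 128)  # stand-in for a SAS image tile
print("fractal dimension estimate:", box_counting_dimension(tile))
```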
Linguistic feature analysis for protein interaction extraction
2009-01-01
Background The rapid growth of the amount of publicly available reports on biomedical experimental results has recently caused a boost of text mining approaches for protein interaction extraction. Most approaches rely implicitly or explicitly on linguistic, i.e., lexical and syntactic, data extracted from text. However, only few attempts have been made to evaluate the contribution of the different feature types. In this work, we contribute to this evaluation by studying the relative importance of deep syntactic features, i.e., grammatical relations, shallow syntactic features (part-of-speech information) and lexical features. For this purpose, we use a recently proposed approach that uses support vector machines with structured kernels. Results Our results reveal that the contribution of the different feature types varies for the different data sets on which the experiments were conducted. The smaller the training corpus compared to the test data, the more important the role of grammatical relations becomes. Moreover, classifiers based on deep syntactic information prove to be more robust on heterogeneous texts where no or only limited common vocabulary is shared. Conclusion Our findings suggest that grammatical relations play an important role in the interaction extraction task. Moreover, the net advantage of adding lexical and shallow syntactic features is small relative to the number of added features. This implies that efficient classifiers can be built by using only a small fraction of the features that are typically used in recent approaches. PMID:19909518
NASA Astrophysics Data System (ADS)
Vijverberg, Koen; Ghafoorian, Mohsen; van Uden, Inge W. M.; de Leeuw, Frank-Erik; Platel, Bram; Heskes, Tom
2016-03-01
Cerebral small vessel disease (SVD) is a disorder frequently found among elderly people and is associated with deterioration in cognitive performance, parkinsonism, and motor and mood impairments. White matter hyperintensities (WMH) as well as lacunes, microbleeds and subcortical brain atrophy are part of the spectrum of image findings related to SVD. Accurate segmentation of WMHs is important for prognosis and diagnosis of multiple neurological disorders such as MS and SVD. Almost all of the published (semi-)automated WMH detection models employ multiple complex hand-crafted features, which require in-depth domain knowledge. In this paper we propose to apply a single-layer network unsupervised feature learning (USFL) method to avoid hand-crafted features and instead automatically learn a more efficient set of features. Experimental results show that a computer-aided detection system with a USFL system outperforms a hand-crafted approach. Moreover, since the two feature sets have complementary properties, a hybrid system that makes use of both hand-crafted and unsupervised learned features shows a significant performance boost compared to each system separately, getting close to the performance of an independent human expert.
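Single-layer unsupervised feature learning is often instantiated as k-means dictionary learning over image patches (in the spirit of Coates et al.); the abstract does not give the exact recipe, so this sketch, including the patch size, dictionary size, and triangle encoding, is an assumption.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans
from sklearn.feature_extraction.image import extract_patches_2d

rng = np.random.default_rng(0)
mr_slice = rng.random((128, 128))  # stand-in for a FLAIR MR slice

patches = extract_patches_2d(mr_slice, (7, 7), max_patches=2000, random_state=0)
P = patches.reshape(len(patches), -1)
P = (P - P.mean(axis=1, keepdims=True)) / (P.std(axis=1, keepdims=True) + 1e-8)

# learn a dictionary of patch prototypes with k-means
kmeans = MiniBatchKMeans(n_clusters=64, random_state=0).fit(P)

# soft "triangle" encoding: how much closer than average each centroid is
d = np.linalg.norm(P[:, None, :] - kmeans.cluster_centers_[None], axis=2)
codes = np.maximum(0.0, d.mean(axis=1, keepdims=True) - d)
print("learned feature codes:", codes.shape)  # (n_patches, 64)
```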
NASA Technical Reports Server (NTRS)
Dekorvin, Andre
1989-01-01
The main purpose is to develop a theory for multiple knowledge systems. A knowledge system could be a sensor or an expert system, but it must specialize in one feature. The problem is that we have an exhaustive list of possible answers to some query (such as what object is it). By collecting different feature values, in principle, it should be possible to give an answer to the query, or at least narrow down the list. Since a sensor, or for that matter an expert system, does not in most cases yield a precise value for the feature, uncertainty must be built into the model. Also, researchers must have a formal mechanism to be able to put the information together. Researchers chose to use the Dempster-Shafer approach to handle the problems mentioned above. Researchers introduce the concept of a state of recognition and point out that there is a relation between receiving updates and defining a set valued Markov Chain. Also, deciding what the value of the next set valued variable is can be phrased in terms of classical decision making theory such as minimizing the maximum regret. Other related problems are examined.
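The Dempster-Shafer combination step at the heart of this model has a compact form: masses from two sources are multiplied over intersecting focal sets and renormalized by the conflict. A plain-Python sketch with hypothetical object classes standing in for the query's answer list.

```python
from itertools import product

def combine(m1, m2):
    # Dempster's rule: multiply masses over intersecting focal sets,
    # accumulate conflict for empty intersections, then renormalize.
    combined, conflict = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + x * y
        else:
            conflict += x * y
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# two "knowledge systems", each specializing in one feature of the same object
m_color = {frozenset({"tank", "truck"}): 0.7, frozenset({"tank", "truck", "jeep"}): 0.3}
m_shape = {frozenset({"tank"}): 0.6, frozenset({"tank", "truck", "jeep"}): 0.4}
print(combine(m_color, m_shape))
```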
A feature-based approach to modeling protein-protein interaction hot spots.
Cho, Kyu-il; Kim, Dongsup; Lee, Doheon
2009-05-01
Identifying features that effectively represent the energetic contribution of an individual interface residue to the interactions between proteins remains problematic. Here, we present several new features and show that they are more effective than conventional features. By combining the proposed features with conventional features, we develop a predictive model for interaction hot spots. Initially, 54 multifaceted features, composed of different levels of information including structure, sequence and molecular interaction information, are quantified. Then, to identify the best subset of features for predicting hot spots, feature selection is performed using a decision tree. Based on the selected features, a predictive model for hot spots is created using support vector machine (SVM) and tested on an independent test set. Our model shows better overall predictive accuracy than previous methods such as the alanine scanning methods Robetta and FOLDEF, and the knowledge-based method KFC. Subsequent analysis yields several findings about hot spots. As expected, hot spots have a larger relative surface area burial and are more hydrophobic than other residues. Unexpectedly, however, residue conservation displays a rather complicated tendency depending on the types of protein complexes, indicating that this feature is not good for identifying hot spots. Of the selected features, the weighted atomic packing density, relative surface area burial and weighted hydrophobicity are the top 3, with the weighted atomic packing density proving to be the most effective feature for predicting hot spots. Notably, we find that hot spots are closely related to π-related interactions, especially π–π interactions.
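The modeling pipeline (decision-tree-driven feature selection followed by an SVM) maps naturally onto scikit-learn; here the 54 features and the labels are random stand-ins for the structural, sequence, and interaction features described above.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 54))    # 54 structure/sequence/interaction features
y = rng.integers(0, 2, size=300)  # hot spot vs non-hot-spot residue

model = make_pipeline(
    SelectFromModel(DecisionTreeClassifier(random_state=0)),  # tree-based feature selection
    SVC(kernel="rbf"),                                        # SVM on the selected subset
)
print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
```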
Wang, Shijun; Yao, Jianhua; Petrick, Nicholas; Summers, Ronald M.
2010-01-01
Colon cancer is the second leading cause of cancer-related deaths in the United States. Computed tomographic colonography (CTC) combined with a computer aided detection system provides a feasible approach for improving colonic polyp detection and increasing the use of CTC for colon cancer screening. To distinguish true polyps from false positives, various features extracted from polyp candidates have been proposed. Most of these traditional features try to capture the shape information of polyp candidates or neighborhood knowledge about the surrounding structures (fold, colon wall, etc.). In this paper, we propose a new set of shape descriptors for polyp candidates based on statistical curvature information. These features, called histograms of curvature features, are rotation, translation and scale invariant and can be treated as complementing the existing feature set. Then, in order to make full use of the traditional geometric features (defined as group A) and the new statistical features (group B), which are highly heterogeneous, we employed a multiple kernel learning method based on semi-definite programming to learn an optimized classification kernel from the two groups of features. We conducted a leave-one-patient-out test on a CTC dataset which contained scans from 66 patients. Experimental results show that a support vector machine (SVM) based on the combined feature set and the semi-definite optimization kernel achieved higher FROC performance compared to SVMs using the two groups of features separately. At a false positive per scan rate of 5, the sensitivity of the SVM using the combined features improved from 0.77 (Group A) and 0.73 (Group B) to 0.83 (p ≤ 0.01). PMID:20953299
ERIC Educational Resources Information Center
Ross, Scott R.; Benning, Stephen D.; Patrick, Christopher J.; Thompson, Angela; Thurston, Amanda
2009-01-01
Psychopathy is a personality disorder that includes interpersonal-affective and antisocial deviance features. The Psychopathic Personality Inventory (PPI) contains two underlying factors (fearless dominance and impulsive antisociality) that may differentially tap these two sets of features. In a mixed-gender sample of undergraduates and prisoners,…
Spatial Relation Predicates in Topographic Feature Semantics
Varanka, Dalia E.; Caro, Holly K.
2013-01-01
Topographic data are designed and widely used for base maps of diverse applications, yet the power of these information sources largely relies on the interpretive skills of map readers and relational database expert users once the data are in map or geographic information system (GIS) form. Advances in geospatial semantic technology offer data model alternatives for explicating concepts and articulating complex data queries and statements. To understand and enrich the vocabulary of topographic feature properties for semantic technology, English language spatial relation predicates were analyzed in three standard topographic feature glossaries. The analytical approach drew from disciplinary concepts in geography, linguistics, and information science. Five major classes of spatial relation predicates were identified from the analysis; representations for most of these are not widely available. The classes are: part-whole (which are commonly modeled throughout semantic and linked-data networks), geometric, processes, human intention, and spatial prepositions. These are commonly found in the ‘real world’ and support the environmental science basis for digital topographical mapping. The spatial relation concepts are based on sets of relation terms presented in this chapter, though these lists are not prescriptive or exhaustive. The results of this study make explicit the concepts forming a broad set of spatial relation expressions, which in turn form the basis for expanding the range of possible queries for topographical data analysis and mapping.
Subsurface failure in spherical bodies. A formation scenario for linear troughs on Vesta’s surface
Stickle, Angela M.; Schultz, P. H.; Crawford, D. A.
2014-10-13
Many asteroids in the Solar System exhibit unusual, linear features on their surface. The Dawn mission recently observed two sets of linear features on the surface of the asteroid 4 Vesta. Geologic observations indicate that these features are related to the two large impact basins at the south pole of Vesta, though no specific mechanism of origin has been determined. Furthermore, the orientation of the features is offset from the center of the basins. Experimental and numerical results reveal that the offset angle is a natural consequence of oblique impacts into a spherical target. We demonstrate that a set of shear planes develops in the subsurface of the body opposite to the point of first contact. Moreover, these subsurface failure zones then propagate to the surface under combined tensile-shear stress fields after the impact to create sets of approximately linear faults on the surface. Comparison between the orientation of damage structures in the laboratory and failure regions within Vesta can be used to constrain impact parameters (e.g., the approximate impact point and likely impact trajectory).
Visual Odometry Based on Structural Matching of Local Invariant Features Using Stereo Camera Sensor
Núñez, Pedro; Vázquez-Martín, Ricardo; Bandera, Antonio
2011-01-01
This paper describes a novel sensor system to estimate the motion of a stereo camera. Local invariant image features are matched between pairs of frames and linked into image trajectories at video rate, providing the so-called visual odometry, i.e., motion estimates from visual input alone. Our proposal conducts two matching sessions: the first one between sets of features associated with the images of the stereo pairs and the second one between sets of features associated with consecutive frames. With respect to previously proposed approaches, the main novelty of this proposal is that both matching stages are conducted by means of a fast matching algorithm which combines absolute and relative feature constraints. Finding the largest-valued set of mutually consistent matches is equivalent to finding the maximum-weighted clique on a graph. The stereo matching makes it possible to represent the scene view as a graph which emerges from the features of the accepted clique. On the other hand, the frame-to-frame matching defines a graph whose vertices are features in 3D space. The efficiency of the approach is increased by minimizing the geometric and algebraic errors to estimate the final displacement of the stereo camera between consecutive acquired frames. The proposed approach has been tested for mobile robotics navigation purposes in real environments and using different features. Experimental results demonstrate the performance of the proposal, which could be applied in both industrial and service robot fields. PMID:22164016
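The reduction to a maximum-weight clique can be sketched directly: build a consistency graph whose nodes are candidate matches and whose edges join pairs that preserve inter-feature distances, then take the largest clique. The networkx solver and the simplified rigidity test below are illustrative, not the paper's algorithm.

```python
import networkx as nx
import numpy as np

rng = np.random.default_rng(2)
prev_pts = rng.normal(size=(8, 3))                     # 3D features in frame t-1
curr_pts = prev_pts + 0.01 * rng.normal(size=(8, 3))   # the same features in frame t
candidates = [(i, i) for i in range(8)] + [(0, 3), (2, 5)]  # match hypotheses, incl. bad ones

G = nx.Graph()
G.add_nodes_from(range(len(candidates)))
for a in range(len(candidates)):
    for b in range(a + 1, len(candidates)):
        (i, j), (k, l) = candidates[a], candidates[b]
        # relative constraint: inter-feature distance preserved between frames
        d_prev = np.linalg.norm(prev_pts[i] - prev_pts[k])
        d_curr = np.linalg.norm(curr_pts[j] - curr_pts[l])
        if abs(d_prev - d_curr) < 0.05:
            G.add_edge(a, b)

clique, _ = nx.max_weight_clique(G, weight=None)  # largest mutually consistent match set
print("accepted matches:", [candidates[n] for n in clique])
```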
Shift-invariant discrete wavelet transform analysis for retinal image classification.
Khademi, April; Krishnan, Sridhar
2007-12-01
This work involves retinal image classification, for which a novel analysis system was developed. From the compressed domain, the proposed scheme extracts textural features from wavelet coefficients, which describe the relative homogeneity of localized areas of the retinal images. Since the discrete wavelet transform (DWT) is shift-variant, a shift-invariant DWT was explored to ensure that a robust feature set was extracted. To combat the small database size, linear discriminant analysis classification was used with the leave-one-out method. 38 normal and 48 abnormal images (exudates, large drusens, fine drusens, choroidal neovascularization, central vein and artery occlusion, histoplasmosis, arteriosclerotic retinopathy, hemi-central retinal vein occlusion and more) were used, and a specificity of 79% and sensitivity of 85.4% were achieved (the average classification rate is 82.2%). The success of the system can be attributed to the highly robust feature set, which included translation-, scale- and semi-rotation-invariant features. Additionally, this technique is database independent since the features were specifically tuned to the pathologies of the human eye.
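A shift-invariant DWT is commonly realized as the stationary (undecimated) wavelet transform; a sketch with PyWavelets, where the wavelet, level, and the energy/entropy homogeneity measures per subband are assumptions rather than the paper's exact feature set.

```python
import numpy as np
import pywt

image = np.random.rand(128, 128)           # stand-in for a retinal image region
coeffs = pywt.swt2(image, "db2", level=2)  # stationary (shift-invariant) 2-D DWT

features = []
for approx, (h, v, d) in coeffs:
    for band in (h, v, d):
        energy = (band ** 2).mean()                 # subband energy
        p = band.ravel() ** 2 / (band ** 2).sum()
        entropy = -(p * np.log(p + 1e-12)).sum()    # subband entropy (homogeneity proxy)
        features.extend([energy, entropy])
print(len(features), "texture features")            # 2 levels x 3 bands x 2 measures
```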
Salekin, Randall T; Lester, Whitney S; Sellers, Mary-Kate
2012-08-01
The purpose of the current study was to examine the effect of a motivational intervention on conduct problem youth with psychopathic features. Specifically, the current study examined conduct problem youths' mental set (or theory) regarding intelligence (entity vs. incremental) upon task performance. We assessed 36 juvenile offenders with psychopathic features and tested whether providing them with two different messages regarding intelligence would affect their functioning on a task related to academic performance. The study employed a MANOVA design with two motivational conditions and three outcomes including fluency, flexibility, and originality. Results showed that youth with psychopathic features who were given a message that intelligence grows over time, were more fluent and flexible than youth who were informed that intelligence is static. There were no significant differences between the groups in terms of originality. The implications of these findings are discussed including the possible benefits of interventions for adolescent offenders with conduct problems and psychopathic features. (PsycINFO Database Record (c) 2012 APA, all rights reserved).
Late summer sea ice segmentation with multi-polarisation SAR features in C- and X-band
NASA Astrophysics Data System (ADS)
Fors, A. S.; Brekke, C.; Doulgeris, A. P.; Eltoft, T.; Renner, A. H. H.; Gerland, S.
2015-09-01
In this study we investigate the potential of sea ice segmentation by C- and X-band multi-polarisation synthetic aperture radar (SAR) features during late summer. Five high-resolution satellite SAR scenes were recorded in the Fram Strait covering iceberg-fast first-year and old sea ice during a week with air temperatures varying around zero degrees Celsius. In situ data consisting of sea ice thickness, surface roughness and aerial photographs were collected during a helicopter flight at the site. Six polarimetric SAR features were extracted for each of the scenes. The ability of the individual SAR features to discriminate between sea ice types, and their temporal consistency, were examined. All SAR features were found to add value to sea ice type discrimination. Relative kurtosis, geometric brightness, cross-polarisation ratio and co-polarisation correlation angle were found to be temporally consistent in the investigated period, while the co-polarisation ratio and co-polarisation correlation magnitude were found to be temporally inconsistent. An automatic feature-based segmentation algorithm was tested both for the full SAR feature set and for a reduced SAR feature set limited to temporally consistent features. In general, the algorithm produces a good late summer sea ice segmentation. Excluding temporally inconsistent SAR features improved the segmentation at air temperatures above zero degrees Celsius.
A feature-based developmental model of the infant brain in structural MRI.
Toews, Matthew; Wells, William M; Zöllei, Lilla
2012-01-01
In this paper, anatomical development is modeled as a collection of distinctive image patterns localized in space and time. A Bayesian posterior probability is defined over a random variable of subject age, conditioned on data in the form of scale-invariant image features. The model is automatically learned from a large set of images exhibiting significant variation, used to discover anatomical structure related to age and development, and fit to new images to predict age. The model is applied to a set of 230 infant structural MRIs of 92 subjects acquired at multiple sites over an age range of 8-590 days. Experiments demonstrate that the model can be used to identify age-related anatomical structure, and to predict the age of new subjects with an average error of 72 days.
2010-01-01
Inventions combine technological features. When features are barely related, burdensomely broad knowledge is required to identify the situations that they share. When features are overly related, burdensomely broad knowledge is required to identify the situations that distinguish them. Thus, according to my first hypothesis, when features are moderately related, the costs of connecting and costs of synthesizing are cumulatively minimized, and the most useful inventions emerge. I also hypothesize that continued experimentation with a specific set of features is likely to lead to the discovery of decreasingly useful inventions; the earlier-identified connections reflect the more common consumer situations. Covering data from all industries, the empirical analysis provides broad support for the first hypothesis. Regressions to test the second hypothesis are inconclusive when examining industry types individually. Yet, this study represents an exploratory investigation, and future research should test refined hypotheses with more sophisticated data, such as that found in literature-based discovery research. PMID:21297855
NASA Astrophysics Data System (ADS)
Rees, S. J.; Jones, Bryan F.
1992-11-01
Once feature extraction has occurred in a processed image, the recognition problem becomes one of defining a set of features which maps sufficiently well onto one of the defined shape/object models to permit a claimed recognition. This process is usually handled by aggregating features until a large enough weighting is obtained to claim membership, or an adequate number of located features are matched to the reference set. A requirement has existed for an operator or measure capable of a more direct assessment of membership/occupancy between feature sets, particularly where the feature sets may be defective representations. Such feature set errors may be caused by noise, by overlapping of objects, and by partial obscuration of features. These problems occur at the point of acquisition: repairing the data would then assume a priori knowledge of the solution. The technique described in this paper offers a set theoretical measure for partial occupancy defined in terms of the set of minimum additions to permit full occupancy and the set of locations of occupancy if such additions are made. As is shown, this technique permits recognition of partial feature sets with quantifiable degrees of uncertainty. A solution to the problems of obscuration and overlapping is therefore available.
Relational and conjunctive binding functions dissociate in short-term memory.
Parra, Mario A; Fabi, Katia; Luzzi, Simona; Cubelli, Roberto; Hernandez Valdez, Maria; Della Sala, Sergio
2015-02-01
Remembering complex events requires binding features within unified objects (conjunctions) and holding associations between objects (relations). Recent studies suggest that the two functions dissociate in long-term memory (LTM). Less is known about their functional organization in short-term memory (STM). The present study investigated this issue in patient AE affected by a stroke which caused damage to brain regions known to be relevant for relational functions both in LTM and in STM (i.e., the hippocampus). The assessment involved a battery of standard neuropsychological tasks and STM binding tasks. One STM binding task (Experiment 1) presented common objects and common colors forming either pairs (relations) or integrated objects (conjunctions). Free recall of relations or conjunctions was assessed. A second STM binding task used random polygons and non-primary colors instead (Experiment 2). Memory was assessed by selecting the features that made up the relations or the conjunctions from a set of single polygons and a set of single colors. The neuropsychological assessment revealed impaired delayed memory in AE. AE's pronounced relational STM binding deficits contrasted with his completely preserved conjunctive binding functions in both Experiments 1 and 2. Only 2.35% and 1.14% of the population were expected to have a discrepancy more extreme than that presented by AE in Experiments 1 and 2, respectively. Processing relations and conjunctions of very elementary nonspatial features in STM led to dissociating performances in AE. These findings may inform current theories of memory decline such as those linked to cognitive aging.
Online tracking of outdoor lighting variations for augmented reality with moving cameras.
Liu, Yanli; Granier, Xavier
2012-04-01
In augmented reality, one of the key tasks in achieving a convincing visual appearance consistency between virtual objects and video scenes is to maintain coherent illumination along the whole sequence. As outdoor illumination is largely dependent on the weather, the lighting condition may change from frame to frame. In this paper, we propose a fully image-based approach for online tracking of outdoor illumination variations from videos captured with moving cameras. Our key idea is to estimate the relative intensities of sunlight and skylight via a sparse set of planar feature points extracted from each frame. To address the inevitable feature misalignments, a set of constraints is introduced to select the most reliable ones. Exploiting the spatial and temporal coherence of illumination, the relative intensities of sunlight and skylight are finally estimated by using an optimization process. We validate our technique on a set of real-life videos and show that the results with our estimations are visually coherent along the video sequences.
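At its core, each frame's estimation can be posed as a small linear least-squares problem in the two unknown intensities; a synthetic numpy sketch, where the albedo/visibility shading model and all coefficients are assumptions rather than the paper's formulation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40                                          # reliable planar feature points
albedo = rng.uniform(0.2, 0.9, n)
sun_vis = rng.integers(0, 2, n).astype(float)   # 1 if the point sees the sun

# synthetic observations: intensity = albedo * (sun * visibility + sky) + noise
true_sun, true_sky = 0.8, 0.3
obs = albedo * (true_sun * sun_vis + true_sky) + 0.01 * rng.normal(size=n)

A = np.column_stack([albedo * sun_vis, albedo])  # unknowns: [sun, sky]
(sun, sky), *_ = np.linalg.lstsq(A, obs, rcond=None)
print(f"estimated sun={sun:.3f}, sky={sky:.3f}")
```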
Automatic machine learning based prediction of cardiovascular events in lung cancer screening data
NASA Astrophysics Data System (ADS)
de Vos, Bob D.; de Jong, Pim A.; Wolterink, Jelmer M.; Vliegenthart, Rozemarijn; Wielingen, Geoffrey V. F.; Viergever, Max A.; Išgum, Ivana
2015-03-01
Calcium burden determined in CT images acquired in lung cancer screening is a strong predictor of cardiovascular events (CVEs). This study investigated whether subjects undergoing such screening who are at risk of a CVE can be identified using automatic image analysis and subject characteristics. Moreover, the study examined whether these individuals can be identified using solely image information, or if a combination of image and subject data is needed. A set of 3559 male subjects undergoing the Dutch-Belgian lung cancer screening trial was included. Low-dose non-ECG-synchronized chest CT images acquired at baseline were analyzed (1834 scanned in the University Medical Center Groningen, 1725 in the University Medical Center Utrecht). Aortic and coronary calcifications were identified using previously developed automatic algorithms. A set of features describing the number, volume and size distribution of the detected calcifications was computed. The age of the participants was extracted from the image headers. Features describing participants' smoking status, smoking history and past CVEs were obtained. CVEs that occurred within three years after the imaging were used as the outcome. Support vector machine classification was performed employing different feature sets, using sets of only image features or a combination of image and subject-related characteristics. Classification based solely on the image features resulted in an area under the ROC curve (Az) of 0.69. A combination of image and subject features resulted in an Az of 0.71. The results demonstrate that subjects undergoing lung cancer screening who are at risk of CVE can be identified using automatic image analysis. Adding subject information slightly improved the performance.
Search asymmetry and eye movements in infants and adults.
Adler, Scott A; Gallego, Pamela
2014-08-01
Search asymmetry is characterized by the detection of a feature-present target amidst feature-absent distractors being efficient and unaffected by the number of distractors, whereas detection of a feature-absent target amidst feature-present distractors is typically inefficient and affected by the number of distractors. Although studies have attempted to investigate this phenomenon with infants (e.g., Adler, Inslicht, Rovee-Collier, & Gerhardstein in Infant Behavioral Development, 21, 253-272, 1998; Colombo, Mitchell, Coldren, & Atwater in Journal of Experimental Psychology: Learning, Memory and Cognition, 19, 98-109, 1990), due to methodological limitations, their findings have been unable to definitively establish the development of visual search mechanisms in infants. The present study assessed eye movements as a means to examine an asymmetry in responding to feature-present versus feature-absent targets in 3-month-olds, relative to adults. Saccade latencies to localize a target (or a distractor, as in the homogeneous conditions) were measured as infants and adults randomly viewed feature-present (R among Ps), feature-absent (P among Rs), and homogeneous (either all Rs or all Ps) arrays at set sizes of 1, 3, 5, and 8. Results indicated that neither infants' nor adults' saccade latencies to localize the target in the feature-present arrays were affected by increasing set sizes, suggesting that localization of the target was efficient. In contrast, saccade latencies to localize the target in the feature-absent arrays increased with increasing set sizes for both infants and adults, suggesting an inefficient localization. These findings indicate that infants exhibit an asymmetry consistent with that found with adults, providing support for functional bottom-up selective attention mechanisms in early infancy.
Rossby-gravity waves in tropical total ozone data
NASA Technical Reports Server (NTRS)
Stanford, J. L.; Ziemke, J. R.
1993-01-01
Evidence for Rossby-gravity waves in tropical data fields produced by the European Center for Medium Range Weather Forecasts (ECMWF) was recently reported. Similar features are observable in fields of total column ozone from the Total Ozone Mapping Spectrometer (TOMS) satellite instrument. The observed features are episodic, have zonal (east-west) wavelengths of 6,000-10,000 km, and oscillate with periods of 5-10 days. In accord with simple linear theory, the modes exhibit westward phase progression and eastward group velocity. The significance of finding Rossby-gravity waves in total ozone fields is that (1) the report of similar features in ECMWF tropical fields is corroborated with an independent data set and (2) the TOMS data set is demonstrated to possess surprising versatility and sensitivity to relatively small-scale tropical phenomena.
A Novel Prediction Method about Single Components of Analog Circuits Based on Complex Field Modeling
Tian, Shulin; Yang, Chenglin
2014-01-01
Little research has addressed failure prediction for analog circuits, and the few existing methods do not tie feature extraction and calculation to circuit analysis, so the fault indicator (FI) is often computed without a rational basis, which degrades prognostic performance. To solve this problem, this paper proposes a novel prediction method for single components of analog circuits based on complex field modeling. Because faults of single components are the most numerous in analog circuits, the method starts from the circuit structure, analyzes the transfer function of the circuit, and builds a model in the complex field. Then, using a parameter scanning model derived from this complex field model, it analyzes the relationship between parameter variation and the degeneration of single components in order to obtain a more reasonable FI feature set. From the obtained FI feature set, it establishes a novel model of the degeneration trend of a circuit's single components. Finally, it uses a particle filter (PF) to update the model parameters and predicts the remaining useful performance (RUP) of the circuit's single components. Because the FI feature set is calculated more reasonably, prediction accuracy is improved to some extent. The conclusions are verified by experiments. PMID:25147853
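A compact sketch of the particle-filter step, assuming a simple exponential degradation model, a Gaussian likelihood, and a synthetic fault-indicator series; the paper's complex-field FI model is more elaborate, so every constant below is an illustrative assumption.

    import numpy as np

    rng = np.random.default_rng(2)
    T, n_p = 60, 500
    fi_obs = np.exp(0.05 * np.arange(T)) + rng.normal(scale=0.05, size=T)  # FI series

    rate = rng.uniform(0.0, 0.2, size=n_p)        # particles over the unknown rate
    w = np.full(n_p, 1.0 / n_p)
    for t, z in enumerate(fi_obs):
        rate += rng.normal(scale=1e-3, size=n_p)  # random walk on the parameter
        w *= np.exp(-0.5 * ((z - np.exp(rate * t)) / 0.05) ** 2)  # Gaussian likelihood
        w = w + 1e-300                            # guard against total underflow
        w /= w.sum()
        if 1.0 / np.sum(w ** 2) < n_p / 2:        # resample when ESS drops
            idx = rng.choice(n_p, size=n_p, p=w)
            rate, w = rate[idx], np.full(n_p, 1.0 / n_p)

    est = np.sum(w * rate)
    threshold = 40.0                              # assumed failure level of the FI
    rup = np.log(threshold) / est - (T - 1)       # steps until FI crosses threshold
    print(f"estimated degeneration rate {est:.3f}, RUP ~ {rup:.1f} steps")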
Pilling, Michael; Gellatly, Angus
2013-07-01
We investigated the influence of dimensional set on report of object feature information using an immediate memory probe task. Participants viewed displays containing up to 36 coloured geometric shapes which were presented for several hundred milliseconds before one item was abruptly occluded by a probe. A cue presented simultaneously with the probe instructed participants to report either the colour or the shape of the probe item. A dimensional set towards the colour or shape of the presented items was induced by manipulating task probability - the relative probability with which the two feature dimensions required report. This was done across two participant groups: one group was given trials with a higher report probability for colour, the other with a higher report probability for shape. Two experiments showed that features were reported most accurately when they were of high task probability, though in both cases the effect was largely driven by the colour dimension. Importantly, the task probability effect did not interact with display set size. This is interpreted as tentative evidence that the manipulation influences feature processing in a global manner and at a stage prior to visual short-term memory. Copyright © 2013 Elsevier B.V. All rights reserved.
Yugandhar, K; Gromiha, M Michael
2014-09-01
Protein-protein interactions are intrinsic to virtually every cellular process. Predicting the binding affinity of protein-protein complexes is one of the challenging problems in computational and molecular biology. In this work, we related sequence features of protein-protein complexes to their binding affinities using machine learning approaches. We set up a database of 185 protein-protein complexes for which the interacting pairs are heterodimers and experimental binding affinities are available. We then developed a set of 610 features from the sequences of the protein complexes and applied the Ranker search method, a combination of an attribute evaluator and a ranking procedure, to select specific features. We analyzed several machine learning algorithms to discriminate protein-protein complexes into high- and low-affinity groups based on their Kd values. Our results showed a 10-fold cross-validation accuracy of 76.1% with a combination of nine features using support vector machines. Further, we observed an accuracy of 83.3% on an independent test set of 30 complexes. We suggest that our method would serve as an effective tool for identifying the interacting partners in protein-protein interaction networks and human-pathogen interactions based on the strength of interactions. © 2014 Wiley Periodicals, Inc.
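The rank-then-classify pipeline can be sketched as follows; scikit-learn's univariate F-score stands in for the Weka-style attribute evaluator plus Ranker combination, and the features and labels are synthetic stand-ins.

    import numpy as np
    from sklearn.feature_selection import f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.default_rng(3)
    X = rng.normal(size=(185, 610))      # 610 sequence features for 185 complexes
    y = (X[:, :5].sum(axis=1) + rng.normal(size=185) > 0).astype(int)  # high/low Kd

    scores, _ = f_classif(X, y)                    # univariate attribute ranking
    top9 = np.argsort(scores)[::-1][:9]            # keep the nine best-ranked features
    acc = cross_val_score(SVC(), X[:, top9], y, cv=10).mean()
    print(f"10-fold CV accuracy with 9 features: {acc:.3f}")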
On-line object feature extraction for multispectral scene representation
NASA Technical Reports Server (NTRS)
Ghassemian, Hassan; Landgrebe, David
1988-01-01
A new on-line unsupervised object-feature extraction method is presented that reduces the complexity and costs associated with the analysis of multispectral image data and with data transmission, storage, archival and distribution. The ambiguity in the object detection process can be reduced if the spatial dependencies that exist among adjacent pixels are intelligently incorporated into the decision-making process. A unity relation that must exist among the pixels of an object is defined. The Automatic Multispectral Image Compaction Algorithm (AMICA) uses the within-object pixel-feature gradient vector as valuable contextual information to construct the object's features, which preserve the class-separability information within the data. For on-line object extraction, the path-hypothesis and the basic mathematical tools for its realization are introduced in terms of a specific similarity measure and adjacency relation. AMICA is applied to several sets of real image data, and the performance and reliability of the features are evaluated.
NASA Astrophysics Data System (ADS)
Hussnain, Zille; Oude Elberink, Sander; Vosselman, George
2016-06-01
In mobile laser scanning systems, the platform's position is measured by GNSS and IMU, which is often not reliable in urban areas. Consequently, the derived Mobile Laser Scanning Point Cloud (MLSPC) lacks the expected positioning reliability and accuracy. Many current solutions are either semi-automatic or unable to achieve pixel-level accuracy. We propose an automatic feature extraction method which utilizes corresponding aerial images as a reference data set. The proposed method comprises three steps: image feature detection, description, and matching between corresponding patches of nadir aerial and MLSPC ortho images. In the data pre-processing step, the MLSPC is patch-wise cropped and converted to ortho images, and each aerial image patch covering the area of the corresponding MLSPC patch is cropped from the aerial image. For feature detection, we implemented an adaptive variant of the Harris operator to automatically detect corner feature points on the vertices of road markings. In the feature description phase, we used the LATCH binary descriptor, which is robust to data from different sensors. For descriptor matching, we developed an outlier filtering technique which exploits the arrangements of relative Euclidean distances and angles between corresponding sets of feature points. We found that the positioning accuracy of the computed correspondence reached pixel-level accuracy, where the image resolution is 12 cm. Furthermore, the developed approach is reliable when enough road markings are available in the data sets. We conclude that, in urban areas, the developed approach can reliably extract the features necessary to improve the MLSPC accuracy to pixel level.
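The outlier-filtering step lends itself to a compact sketch: putative matches are kept only when their relative Euclidean distances and angles agree between the two images. The voting scheme and tolerances below are illustrative assumptions, not the authors' exact formulation.

    import numpy as np

    def filter_matches(pts_a, pts_b, dist_tol=0.05, ang_tol=np.deg2rad(3)):
        """Keep matches whose relative geometry agrees in both images."""
        n = len(pts_a)
        keep = np.zeros(n, dtype=bool)
        for i in range(n):
            votes = 0
            for j in range(n):
                if i == j:
                    continue
                va, vb = pts_a[j] - pts_a[i], pts_b[j] - pts_b[i]
                da, db = np.linalg.norm(va), np.linalg.norm(vb)
                dth = abs(np.arctan2(va[1], va[0]) - np.arctan2(vb[1], vb[0]))
                ok = (abs(da - db) <= dist_tol * max(da, db)
                      and min(dth, 2 * np.pi - dth) <= ang_tol)
                votes += ok
            keep[i] = votes >= (n - 1) // 2        # majority of pairs must agree
        return keep

    rng = np.random.default_rng(4)
    a = rng.uniform(0, 100, size=(20, 2))
    b = a + rng.normal(scale=0.1, size=a.shape)    # good matches: near-identity
    b[3] += 25.0                                   # one gross mismatch
    print("outliers:", np.where(~filter_matches(a, b))[0])   # expected: [3]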
Banerjee, Amit; Misra, Milind; Pai, Deepa; Shih, Liang-Yu; Woodley, Rohan; Lu, Xiang-Jun; Srinivasan, A R; Olson, Wilma K; Davé, Rajesh N; Venanzi, Carol A
2007-01-01
Six rigid-body parameters (Shift, Slide, Rise, Tilt, Roll, Twist) are commonly used to describe the relative displacement and orientation of successive base pairs in a nucleic acid structure. The present work adapts this approach to describe the relative displacement and orientation of any two planes in an arbitrary molecule; specifically, planes which contain important pharmacophore elements. Relevant code from the 3DNA software package (Nucleic Acids Res. 2003, 31, 5108-5121) was generalized to treat molecular fragments other than DNA bases as input for the calculation of the corresponding rigid-body (or "planes") parameters. These parameters were used to construct feature vectors for a fuzzy relational clustering study of over 700 conformations of a flexible analogue of the dopamine reuptake inhibitor, GBR 12909. Several cluster validity measures were used to determine the optimal number of clusters. Translational (Shift, Slide, Rise) rather than rotational (Tilt, Roll, Twist) features dominate clustering based on planes that are relatively far apart, whereas both types of features are important to clustering when the two planes are close together. This approach was able to classify the data set of molecular conformations into groups and to identify representative conformers for use as template conformers in future Comparative Molecular Field Analysis studies of GBR 12909 analogues. The advantage of using the planes parameters, rather than the combination of atomic coordinates and angles between molecular planes used in our previous fuzzy relational clustering of the same data set (J. Chem. Inf. Model. 2005, 45, 610-623), is that the present clustering results are independent of molecular superposition and the technique is able to identify clusters in the molecule considered as a whole. This approach is easily generalizable to any two planes in any molecule.
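A minimal sketch of fuzzy clustering over the six planes parameters with a validity check, assuming the scikit-fuzzy package and synthetic conformer data; the study's fuzzy relational clustering and validity measures differ in detail.

    import numpy as np
    import skfuzzy as fuzz

    rng = np.random.default_rng(5)
    # 700 conformations x 6 rigid-body parameters, as two loose groups.
    params = np.vstack([rng.normal(0, 1, (350, 6)), rng.normal(3, 1, (350, 6))])

    best = None
    for c in range(2, 7):                          # scan candidate cluster counts
        cntr, u, _, _, _, _, fpc = fuzz.cluster.cmeans(
            params.T, c=c, m=2.0, error=1e-5, maxiter=300, seed=0)
        if best is None or fpc > best[1]:          # fuzzy partition coefficient
            best = (c, fpc, u)
    labels = best[2].argmax(axis=0)                # hard assignment per conformer
    print("optimal number of clusters by partition coefficient:", best[0])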
Sweidan, Michelle; Williamson, Margaret; Reeve, James F; Harvey, Ken; O'Neill, Jennifer A; Schattner, Peter; Snowdon, Teri
2010-04-15
Electronic prescribing is increasingly being used in primary care and in hospitals. Studies on the effects of e-prescribing systems have found evidence for both benefit and harm. The aim of this study was to identify features of e-prescribing software systems that support patient safety and quality of care and that are useful to the clinician and the patient, with a focus on improving the quality use of medicines. Software features were identified by a literature review, key informants and an expert group. A modified Delphi process was used with a 12-member multidisciplinary expert group to reach consensus on the expected impact of the features in four domains: patient safety, quality of care, usefulness to the clinician and usefulness to the patient. The setting was electronic prescribing in general practice in Australia. A list of 114 software features was developed. Most of the features relate to the recording and use of patient data, the medication selection process, prescribing decision support, monitoring drug therapy and clinical reports. The expert group rated 78 of the features (68%) as likely to have a high positive impact in at least one domain, 36 features (32%) as medium impact, and none as low or negative impact. Twenty-seven features were rated as high positive impact across 3 or 4 domains including patient safety and quality of care. Ten features were considered "aspirational" because of a lack of agreed standards and/or suitable knowledge bases. This study defines features of e-prescribing software systems that are expected to support safety and quality, especially in relation to prescribing and use of medicines in general practice. The features could be used to develop software standards, and could be adapted if necessary for use in other settings and countries.
Effective traffic features selection algorithm for cyber-attacks samples
NASA Astrophysics Data System (ADS)
Li, Yihong; Liu, Fangzheng; Du, Zhenyu
2018-05-01
To improve defenses against network attacks, this paper proposes an effective traffic feature selection algorithm based on k-means++ clustering to deal with the high dimensionality of the traffic features extracted from cyber-attack samples. First, the algorithm divides the original feature set into an attack-traffic feature set and a background-traffic feature set by clustering. Then, it calculates the change in clustering performance after removing each feature. Finally, the degree of distinctiveness of each feature vector is evaluated from this result; the effective feature vectors are those whose degree of distinctiveness exceeds a set threshold. The purpose of this paper is to select the effective features from the extracted original feature set. In this way, the dimensionality of the features is reduced, and so is the space-time overhead of subsequent detection. The experimental results show that the proposed algorithm is feasible and has some advantages over other selection algorithms.
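The selection idea, clustering with k-means++ and scoring each feature by the change in clustering quality when it is removed, can be sketched with scikit-learn as follows; the silhouette criterion and the threshold are illustrative assumptions.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(6)
    X = np.vstack([rng.normal(0, 1, (200, 10)),    # background traffic samples
                   rng.normal(4, 1, (200, 10))])   # attack traffic samples
    X[:, 5:] = rng.normal(size=(400, 5))           # last five features carry no signal

    def quality(data):
        labels = KMeans(n_clusters=2, init="k-means++", n_init=10,
                        random_state=0).fit_predict(data)
        return silhouette_score(data, labels)

    base = quality(X)
    distinctiveness = np.array([base - quality(np.delete(X, f, axis=1))
                                for f in range(X.shape[1])])
    print("effective features:", np.where(distinctiveness > 0.01)[0])  # assumed threshold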
Mathieson, Luke; Mendes, Alexandre; Marsden, John; Pond, Jeffrey; Moscato, Pablo
2017-01-01
This chapter introduces a new method for knowledge extraction from databases for the purpose of finding a discriminative set of features that is also robust for within-class classification. Our method is generic, and we introduce it here in the field of breast cancer diagnosis from digital mammography data. The mathematical formalism is based on a generalization of the k-Feature Set problem called the (α, β)-k-Feature Set problem, introduced by Cotta and Moscato (J Comput Syst Sci 67(4):686-690, 2003). The method proceeds in two steps: first, an optimal (α, β)-k-feature set of minimum cardinality is identified, and then a set of classification rules using these features is obtained. We obtain the (α, β)-k-feature set in two phases: first, a series of extremely powerful reduction techniques, which do not lose the optimal solution, is employed; second, a metaheuristic search identifies the remaining features to be considered or disregarded. Two algorithms were tested with a public-domain digital mammography dataset composed of 71 malignant and 75 benign cases. Based on the results provided by the algorithms, we obtain classification rules that employ only a subset of these features.
2011-01-01
Background Existing methods of predicting DNA-binding proteins used valuable features of physicochemical properties to design support vector machine (SVM) based classifiers. Generally, the selection of physicochemical properties and the determination of their corresponding feature vectors rely mainly on known properties of the binding mechanism and the experience of designers. However, designers face a troublesome problem: some distinct physicochemical properties have similar vectors representing the 20 amino acids, while some closely related physicochemical properties have dissimilar vectors. Results This study proposes a systematic approach (named Auto-IDPCPs) to automatically identify a set of physicochemical and biochemical properties in the AAindex database to design SVM-based classifiers for predicting and analyzing DNA-binding domains/proteins. Auto-IDPCPs consists of 1) clustering 531 amino acid indices in AAindex into 20 clusters using a fuzzy c-means algorithm, 2) utilizing an efficient genetic-algorithm-based optimization method, IBCGA, to select an informative feature set of size m to represent sequences, and 3) analyzing the selected features to identify related physicochemical properties which may affect the binding mechanism of DNA-binding domains/proteins. The proposed Auto-IDPCPs identified m=22 features of properties belonging to five clusters for predicting DNA-binding domains with a five-fold cross-validation accuracy of 87.12%, which is promising compared with the accuracy of 86.62% of the existing method PSSM-400. For predicting DNA-binding sequences, an accuracy of 75.50% was obtained using m=28 features, where PSSM-400 has an accuracy of 74.22%. Auto-IDPCPs and PSSM-400 have accuracies of 80.73% and 82.81%, respectively, when applied to an independent test data set of DNA-binding domains. Some typical physicochemical properties discovered are hydrophobicity, secondary structure, charge, solvent accessibility, polarity, flexibility, normalized Van der Waals volume, pK (pK-C, pK-N, pK-COOH and pK-a(RCOOH)), etc. Conclusions The proposed approach Auto-IDPCPs would help designers investigate informative physicochemical and biochemical properties by considering both prediction accuracy and analysis of the binding mechanism simultaneously. The approach is also applicable to predicting and analyzing other protein functions from sequences. PMID:21342579
A Feature-based Developmental Model of the Infant Brain in Structural MRI
Toews, Matthew; Wells, William M.; Zöllei, Lilla
2014-01-01
In this paper, anatomical development is modeled as a collection of distinctive image patterns localized in space and time. A Bayesian posterior probability is defined over a random variable of subject age, conditioned on data in the form of scale-invariant image features. The model is automatically learned from a large set of images exhibiting significant variation, used to discover anatomical structure related to age and development, and fit to new images to predict age. The model is applied to a set of 230 infant structural MRIs of 92 subjects acquired at multiple sites over an age range of 8-590 days. Experiments demonstrate that the model can be used to identify age-related anatomical structure, and to predict the age of new subjects with an average error of 72 days. PMID:23286050
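A minimal sketch of the Bayesian idea, assuming independent Gaussian feature likelihoods whose means drift linearly with age and a flat prior over a discretized age grid; the paper's scale-invariant feature model is richer.

    import numpy as np

    rng = np.random.default_rng(7)
    ages = np.linspace(8, 590, 50)                 # discretized age variable (days)
    slopes, offsets = rng.normal(size=20), rng.normal(size=20)

    def sample_features(age):                      # assumed linear drift with age
        return slopes * age / 590 + offsets + rng.normal(scale=0.1, size=20)

    def posterior(feats):                          # flat prior over the age grid
        mu = slopes[None, :] * ages[:, None] / 590 + offsets[None, :]
        loglik = -0.5 * np.sum(((feats - mu) / 0.1) ** 2, axis=1)
        wts = np.exp(loglik - loglik.max())
        return wts / wts.sum()

    p = posterior(sample_features(200.0))
    print("MAP age estimate:", ages[np.argmax(p)], "days")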
Radial sets: interactive visual analysis of large overlapping sets.
Alsallakh, Bilal; Aigner, Wolfgang; Miksch, Silvia; Hauser, Helwig
2013-12-01
In many applications, data tables contain multi-valued attributes that often store the memberships of the table entities in multiple sets, such as which languages a person masters, which skills an applicant documents, or which features a product comes with. With a growing number of entities, the resulting element-set membership matrix becomes very rich in information about how these sets overlap. Many analysis tasks targeted at set-typed data are concerned with these overlaps as salient features of such data. This paper presents Radial Sets, a novel visual technique to analyze set memberships for a large number of elements. Our technique uses frequency-based representations to enable quickly finding and analyzing different kinds of overlaps between the sets, and relating these overlaps to other attributes of the table entities. Furthermore, it enables various interactions to select elements of interest, find out if they are over-represented in specific sets or overlaps, and whether they exhibit a different distribution for a specific attribute compared to the rest of the elements. These interactions allow formulating highly expressive visual queries on the elements in terms of their set memberships and attribute values. As we demonstrate via two usage scenarios, Radial Sets enable revealing and analyzing a multitude of overlapping patterns between large sets, beyond the limits of state-of-the-art techniques.
NASA Astrophysics Data System (ADS)
Marble, Jay A.; Gorman, John D.
1999-08-01
A feature-based approach is taken to reduce the occurrence of false alarms in foliage-penetrating, ultra-wideband, synthetic aperture radar data. A set of 'generic' features is defined based on target size, shape, and pixel intensity. A second set of features is defined that combines the generic features with features based on scattering phenomenology. Each set is combined using a quadratic polynomial discriminant (QPD), and performance is characterized by generating a receiver operating characteristic (ROC) curve. Results show that the feature set containing phenomenological features improves performance against both broadside and end-on targets; the improvement against end-on targets is especially pronounced.
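A quadratic polynomial discriminant can be sketched as a degree-2 polynomial expansion followed by a linear discriminant, scored by ROC; the scikit-learn pipeline and synthetic features below are illustrative assumptions.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures, StandardScaler

    rng = np.random.default_rng(8)
    X = rng.normal(size=(600, 6))                  # size, shape, intensity features
    y = (X[:, 0] ** 2 + X[:, 1] * X[:, 2]
         + rng.normal(scale=0.5, size=600) > 1).astype(int)

    # Degree-2 expansion + linear weights = a quadratic polynomial discriminant.
    qpd = make_pipeline(PolynomialFeatures(degree=2), StandardScaler(),
                        LogisticRegression(max_iter=1000)).fit(X[:400], y[:400])
    scores = qpd.predict_proba(X[400:])[:, 1]      # scores for the ROC curve
    print("ROC AUC:", roc_auc_score(y[400:], scores))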
Large-scale oscillation of structure-related DNA sequence features in human chromosome 21
NASA Astrophysics Data System (ADS)
Li, Wentian; Miramontes, Pedro
2006-08-01
Human chromosome 21 is the only chromosome in the human genome that exhibits an oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (~500 kb near the right telomere). We aim at establishing the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same ~500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and the signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the DNA double helix, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.
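The spectral check behind such a claim can be sketched in a few lines: compute the (G+C) fraction in fixed windows and look for a peak in the power spectrum. The synthetic sequence below has a built-in 500 kb cycle, so the expected output is known.

    import numpy as np

    rng = np.random.default_rng(9)
    window, n_win = 10_000, 3_000                  # 10 kb windows over ~30 Mb
    pos = np.arange(n_win) * window
    gc = (0.41 + 0.02 * np.sin(2 * np.pi * pos / 500_000)
          + rng.normal(scale=0.005, size=n_win))  # windowed (G+C) fraction

    spec = np.abs(np.fft.rfft(gc - gc.mean())) ** 2
    freqs = np.fft.rfftfreq(n_win, d=window)       # cycles per base
    peak = freqs[np.argmax(spec[1:]) + 1]          # skip the zero-frequency bin
    print(f"dominant cycle length: {1 / peak / 1000:.0f} kb")   # -> ~500 kb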
Entropy Based Feature Selection for Fuzzy Set-Valued Information Systems
NASA Astrophysics Data System (ADS)
Ahmed, Waseem; Sufyan Beg, M. M.; Ahmad, Tanvir
2018-06-01
In Set-valued Information Systems (SIS), several objects contain more than one value for some attributes. The tolerance relation used for handling SIS sometimes leads to loss of certain information. To surmount this problem, the fuzzy rough model was introduced. However, in some cases a SIS may contain real or continuous set-values, so the existing fuzzy rough model for handling information systems with fuzzy set-values needs some changes. In this paper, the Fuzzy Set-valued Information System (FSIS) is proposed and a fuzzy similarity relation for FSIS is defined. Yager's relative conditional entropy was studied to find the significance measure of a candidate attribute of a FSIS. Later, using these significance values, three greedy forward algorithms are discussed for finding the reduct and relative reduct of the proposed FSIS. An experiment was conducted on a sample population of a real dataset, and the classification accuracies of the proposed FSIS were compared with those of the existing SIS and single-valued Fuzzy Information Systems, which demonstrated the effectiveness of the proposed FSIS.
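A greedy forward selection driven by an entropy-style significance score can be sketched as follows; scikit-learn's mutual information against the decision attribute stands in for Yager's relative conditional entropy, and the stopping threshold is an assumption.

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    rng = np.random.default_rng(10)
    X = rng.normal(size=(300, 12))
    y = (X[:, 2] + X[:, 7] > 0).astype(int)        # decision attribute

    def significance(feats):
        # Summed marginal mutual information of the candidate subset; a simple
        # additive stand-in for a conditional-entropy-based significance measure.
        return mutual_info_classif(X[:, feats], y, random_state=0).sum()

    remaining, reduct, prev = list(range(X.shape[1])), [], -np.inf
    while remaining:
        gains = [significance(reduct + [f]) for f in remaining]
        best = int(np.argmax(gains))
        if gains[best] <= prev + 1e-2:             # assumed minimum gain: stop
            break
        prev = gains[best]
        reduct.append(remaining.pop(best))
    print("selected reduct:", sorted(reduct))      # expected: [2, 7]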
NASA Astrophysics Data System (ADS)
Chirra, Prathyush; Leo, Patrick; Yim, Michael; Bloch, B. Nicolas; Rastinehad, Ardeshir R.; Purysko, Andrei; Rosen, Mark; Madabhushi, Anant; Viswanath, Satish
2018-02-01
The recent advent of radiomics has enabled the development of prognostic and predictive tools which use routine imaging, but a key question that still remains is how reproducible these features may be across multiple sites and scanners. This is especially relevant for MRI data, where signal intensity values lack a tissue-specific, quantitative meaning and depend on acquisition parameters (magnetic field strength, image resolution, type of receiver coil). In this paper we present the first empirical study of the reproducibility of 5 different radiomic feature families in a multi-site setting; specifically, for characterizing prostate MRI appearance. Our cohort comprised 147 patient T2w MRI datasets from 4 different sites, all of which were first pre-processed to correct for acquisition-related artifacts such as bias field, differing voxel resolutions, and intensity drift (non-standardness). 406 3D voxel-wise radiomic features were extracted and evaluated in a cross-site setting to determine how reproducible they were within a relatively homogeneous non-tumor tissue region, using 2 different measures of reproducibility: the Multivariate Coefficient of Variation and the Instability Score. Our results demonstrated that Haralick features were the most reproducible between all 4 sites. By comparison, Laws features were among the least reproducible between sites, as well as performing highly variably across their entire parameter space. Similarly, the Gabor feature family demonstrated good cross-site reproducibility, but only for certain parameter combinations. These trends indicate that despite extensive pre-processing, only a subset of radiomic features and associated parameters may be reproducible enough for use within radiomics-based machine learning classifier schemes.
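A per-feature coefficient of variation across site means, a simplified stand-in for the multivariate measure used in the study, can be sketched as follows with synthetic site summaries.

    import numpy as np

    rng = np.random.default_rng(11)
    n_feat = 406
    # Per-site mean values of each radiomic feature (synthetic stand-ins).
    site_means = np.array([rng.normal(1.0, 0.05, n_feat) for _ in range(4)])
    site_means[2, 100:150] += rng.normal(scale=0.5, size=50)   # a drifting family

    cv = site_means.std(axis=0) / np.abs(site_means.mean(axis=0))
    stable = np.where(cv < 0.1)[0]                 # assumed stability threshold
    print(f"{stable.size} of {n_feat} features stable across the 4 sites")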
Thomas, Minta; De Brabanter, Kris; De Moor, Bart
2014-05-10
DNA microarrays are a potentially powerful technology for improving diagnostic classification, treatment selection, and prognostic assessment. The use of this technology to predict cancer outcome has a history of almost a decade. Disease class predictors can be designed for known disease cases and provide diagnostic confirmation or clarify abnormal cases. The main input to these class predictors is high-dimensional data with many variables and few observations. Dimensionality reduction of these feature sets significantly speeds up the prediction task. Feature selection and feature transformation methods are well-known preprocessing steps in the field of bioinformatics. Several prediction tools are available based on these techniques. Studies show that a well-tuned Kernel PCA (KPCA) is an efficient preprocessing step for dimensionality reduction, but the available bandwidth selection method for KPCA was computationally expensive. In this paper, we propose a new data-driven bandwidth selection criterion for KPCA, which is related to least squares cross-validation for kernel density estimation. We propose a new prediction model with a well-tuned KPCA and a Least Squares Support Vector Machine (LS-SVM). We estimate the accuracy of the newly proposed model based on 9 case studies. Then, we compare its performance (in terms of test set Area Under the ROC Curve (AUC) and computational time) with other well-known techniques such as whole data set + LS-SVM, PCA + LS-SVM, t-test + LS-SVM, Prediction Analysis of Microarrays (PAM) and Least Absolute Shrinkage and Selection Operator (Lasso). Finally, we assess the performance of the proposed strategy against an existing KPCA parameter tuning algorithm by means of two additional case studies. We propose, evaluate, and compare several mathematical/statistical techniques which apply feature transformation/selection for subsequent classification, and consider their application in medical diagnostics. Both feature selection and feature transformation perform well on classification tasks. Due to the dynamic selection property of feature selection, it is hard to define significant features for the classifier that predicts classes of future samples. Moreover, the proposed strategy enjoys a distinctive advantage with its relatively lower time complexity.
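The tuning idea can be sketched as follows; since the paper's least-squares cross-validation criterion and the LS-SVM are not available in scikit-learn, a plain cross-validated grid over the kernel bandwidth and a linear SVM stand in for them.

    import numpy as np
    from sklearn.decomposition import KernelPCA
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import SVC

    rng = np.random.default_rng(12)
    X = rng.normal(size=(120, 2000))               # many genes, few samples
    y = (X[:, :10].mean(axis=1) + rng.normal(scale=0.05, size=120) > 0).astype(int)

    best = None
    for gamma in (1e-5, 1e-4, 1e-3, 1e-2):         # candidate RBF bandwidths
        model = make_pipeline(KernelPCA(n_components=20, kernel="rbf", gamma=gamma),
                              SVC(kernel="linear"))
        auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
        if best is None or auc > best[0]:
            best = (auc, gamma)
    print(f"selected gamma = {best[1]}, cross-validated AUC = {best[0]:.3f}")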
Issues in Semantic Memory: A Response to Glass and Holyoak. Technical Report No. 101.
ERIC Educational Resources Information Center
Shoben, Edward J.; And Others
Glass and Holyoak (1975) have raised two issues related to the distinction between set-theoretic and network theories of semantic memory, contending that: (a) their version of a network theory, the Marker Search model, is conceptually and empirically superior to the Feature Comparison model version of a set-theoretic theory; and (b) the contrast…
ERIC Educational Resources Information Center
Webster, Amanda A.; Carter, Mark
2013-01-01
Background: One of the most commonly cited rationales for inclusive education is to enable the development of quality relationships with typically developing peers. Relatively few researchers have examined the features of the range of relationships that children with developmental disability form in inclusive school settings. Method: Interviews…
Guidelines for a cancer prevention smartphone application: A mixed-methods study.
Ribeiro, Nuno; Moreira, Luís; Barros, Ana; Almeida, Ana Margarida; Santos-Silva, Filipe
2016-10-01
This study sought to explore the views and experiences of healthy young adults concerning the fundamental features of a cancer prevention smartphone app that seeks behaviour change. Three focus groups were conducted with 16 healthy young adults that explored prior experiences, points of view and opinions about currently available health-related smartphone apps. Then, an online questionnaire was designed and applied to a larger sample of healthy young adults. Focus group and online questionnaire data were analysed and compared. Study results identified behaviour tracking, goal setting, tailored information and use of reminders as the most desired features in a cancer prevention app. Participants highlighted the importance of privacy and were reluctant to share personal health information with other users. The results also point to important dimensions related to usability and perceived usefulness that should be considered for long-term use of health promotion apps. Participants did not consider gamification features important for long-term use of apps. This study allowed the definition of a set of guidelines for the development of a cancer prevention app. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Zollanvari, Amin; Dougherty, Edward R
2016-12-01
In classification, prior knowledge is incorporated in a Bayesian framework by assuming that the feature-label distribution belongs to an uncertainty class of feature-label distributions governed by a prior distribution. A posterior distribution is then derived from the prior and the sample data. An optimal Bayesian classifier (OBC) minimizes the expected misclassification error relative to the posterior distribution. From an application perspective, prior construction is critical. The prior distribution is formed by mapping a set of mathematical relations among the features and labels, the prior knowledge, into a distribution governing the probability mass across the uncertainty class. In this paper, we consider prior knowledge in the form of stochastic differential equations (SDEs). We consider a vector SDE in integral form involving a drift vector and dispersion matrix. Having constructed the prior, we develop the optimal Bayesian classifier between two models and examine, via synthetic experiments, the effects of uncertainty in the drift vector and dispersion matrix. We apply the theory to a set of SDEs for the purpose of differentiating the evolutionary history between two species.
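The integral form referred to above can be written, in standard notation (the paper's exact symbols may differ), as

    \mathbf{X}_t \;=\; \mathbf{X}_0
      \;+\; \int_0^t \boldsymbol{\mu}\!\left(\mathbf{X}_s, s\right)\,\mathrm{d}s
      \;+\; \int_0^t \boldsymbol{\sigma}\!\left(\mathbf{X}_s, s\right)\,\mathrm{d}\mathbf{W}_s

where \boldsymbol{\mu} is the drift vector, \boldsymbol{\sigma} the dispersion matrix, and \mathbf{W} a vector Wiener process; the prior places the uncertainty on \boldsymbol{\mu} and \boldsymbol{\sigma}.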
Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics
Belo, David; Gamboa, Hugo
2017-01-01
The paper presents an accuracy analysis of machine learning approaches applied to cardiac activity. The study evaluates the possibility of diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of degree II-III. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis function, decision trees, and the naive Bayes classifier. Moreover, different methods of feature extraction were analyzed: statistical, spectral, wavelet, and multifractal. In all, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of searching for a set of non-correlated features achieved better results than a feature set based on principal components. PMID:28831239
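The comparison can be sketched with scikit-learn as follows; the 53-feature matrix is a synthetic stand-in, and the regularization on QDA is an assumption needed here because there are more features than samples per class.

    import numpy as np
    from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                               QuadraticDiscriminantAnalysis)
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(13)
    X = rng.normal(size=(70, 53))                  # 53 HRV-derived features
    y = np.array([0] * 30 + [1] * 40)              # healthy vs hypertensive
    X[y == 1, :8] += 0.9                           # a weak group difference

    models = {"LDA": LinearDiscriminantAnalysis(),
              "QDA": QuadraticDiscriminantAnalysis(reg_param=0.5),  # regularized
              "kNN": KNeighborsClassifier(n_neighbors=5),
              "SVM (RBF)": SVC(),
              "decision tree": DecisionTreeClassifier(random_state=0),
              "naive Bayes": GaussianNB()}
    for name, m in models.items():
        print(name, cross_val_score(m, X, y, cv=5).mean().round(3))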
Effects of set-size and lateral masking in visual search.
Põder, Endel
2004-01-01
In the present research, the roles of lateral masking and central processing limitations in visual search were studied. Two search conditions were used: (1) target differed from distractors by presence/absence of a simple feature; (2) target differed by relative position of the same components only. The number of displayed stimuli (set-size) and the distance between neighbouring stimuli were varied as independently as possible in order to measure the effect of both. The effect of distance between stimuli (lateral masking) was found to be similar in both conditions. The effect of set-size was much larger for relative position stimuli. The results support the view that perception of relative position of stimulus components is limited mainly by the capacity of central processing.
categoryCompare, an analytical tool based on feature annotations
Flight, Robert M.; Harrison, Benjamin J.; Mohammad, Fahim; Bunge, Mary B.; Moon, Lawrence D. F.; Petruska, Jeffrey C.; Rouchka, Eric C.
2014-01-01
Assessment of high-throughput omics data initially focuses on relative or raw levels of a particular feature, such as an expression value for a transcript, protein, or metabolite. At a second level, analyses of annotations, including known or predicted functions and associations of each individual feature, attempt to distill biological context. Most currently available comparative- and meta-analysis methods depend on the availability of identical features across data sets, and concentrate on determining features that are differentially expressed across experiments, some of which may be considered "biomarkers." The heterogeneity of measurement platforms and the inherent variability of biological systems confound the search for robust biomarkers indicative of a particular condition. In many instances, however, multiple data sets show involvement of common biological processes or signaling pathways, even though individual features are not commonly measured or differentially expressed between them. We developed a methodology, categoryCompare, for cross-platform and cross-sample comparison of high-throughput data at the annotation level. We assessed the utility of the approach using hypothetical data, as well as determining similarities and differences in the set of processes in two instances: (1) denervated skin vs. denervated muscle, and (2) colon from Crohn's disease vs. colon from ulcerative colitis (UC). The hypothetical data showed that in many cases comparing annotations gave superior results to comparing only at the gene level. Improved analytical results depended as well on the number of genes included in the annotation term, the amount of noise in relation to the number of genes expressing in unenriched annotation categories, and the specific method in which samples are combined. In the skin vs. muscle denervation comparison, the tissues demonstrated markedly different responses. The Crohn's vs. UC comparison showed gross similarities in inflammatory response in the two diseases, with particular processes specific to each disease. PMID:24808906
NASA Astrophysics Data System (ADS)
Sheikhan, Mansour; Abbasnezhad Arabi, Mahdi; Gharavian, Davood
2015-10-01
Artificial neural networks are efficient models in pattern recognition applications, but their performance depends on employing a suitable structure and connection weights. This study used a hybrid method for obtaining the optimal weight set and architecture of a recurrent neural emotion classifier, based on the gravitational search algorithm (GSA) and its binary version (BGSA), respectively. By considering features of the speech signal related to prosody, voice quality, and spectrum, a rich feature set was constructed. To select more efficient features, a fast feature selection method was employed. The performance of the proposed hybrid GSA-BGSA method was compared with that of similar hybrid methods based on the particle swarm optimisation (PSO) algorithm and its binary version, on PSO and the discrete firefly algorithm, and on a hybrid of error back-propagation and a genetic algorithm used for optimisation. Experimental tests on the Berlin emotional database demonstrated the superior performance of the proposed method using a lighter network structure.
World Wide Web Based Image Search Engine Using Text and Image Content Features
NASA Astrophysics Data System (ADS)
Luo, Bo; Wang, Xiaogang; Tang, Xiaoou
2003-01-01
Using both text and image content features, a hybrid image retrieval system for the World Wide Web is developed in this paper. We first use a text-based image meta-search engine to retrieve images from the Web, based on the text information on the image host pages, to provide an initial image set. Because of the high speed and low cost of the text-based approach, we can easily retrieve a broad coverage of images with a high recall rate and a relatively low precision. An image-content-based ordering is then performed on the initial image set. All the images are clustered into different folders based on the image content features. In addition, the images can be re-ranked by the content features according to user feedback. Such a design makes it truly practical to use both text and image content for image retrieval over the Internet. Experimental results confirm the efficiency of the system.
Obermeier, S.F.; Jacobson, R.B.; Smoot, J.P.; Weems, R.E.; Gohn, G.S.; Monroe, J.E.; Powars, D.S.
1990-01-01
Many types of liquefaction-related features (sand blows, fissures, lateral spreads, dikes, and sills) have been induced by earthquakes in coastal South Carolina and in the New Madrid seismic zone in the Central United States. In addition, abundant features of unknown and nonseismic origin are present. Geologic criteria for interpreting an earthquake origin in these areas are illustrated in practical applications; these criteria can be used to determine the origin of liquefaction features in many other geographic and geologic settings. In both coastal South Carolina and the New Madrid seismic zone, the earthquake-induced liquefaction features generally originated in clean sand deposits that contain no or few intercalated silt or clay-rich strata. The local geologic setting is a major influence on both development and surface expression of sand blows. Major factors controlling sand-blow formation include the thickness and physical properties of the deposits above the source sands, and these relationships are illustrated by comparing sand blows found in coastal South Carolina (in marine deposits) with sand blows found in the New Madrid seismic zone (in fluvial deposits). In coastal South Carolina, the surface stratum is typically a thin (about 1 m) soil that is weakly cemented with humate, and the sand blows are expressed as craters surrounded by a thin sheet of sand; in the New Madrid seismic zone the surface stratum generally is a clay-rich deposit ranging in thickness from 2 to 10 m, in which case sand blows characteristically are expressed as sand mounded above the original ground surface. Recognition of the various features described in this paper, and identification of the most probable origin for each, provides a set of important tools for understanding paleoseismicity in areas such as the Central and Eastern United States where faults are not exposed for study and strong seismic activity is infrequent.
NASA Technical Reports Server (NTRS)
Howard, Ayanna; Bayard, David
2006-01-01
Fuzzy Feature Observation Planner for Small Body Proximity Observations (FuzzObserver) is a developmental computer program, to be used along with other software, for autonomous planning of maneuvers of a spacecraft near an asteroid, comet, or other small astronomical body. Selection of terrain features and estimation of the position of the spacecraft relative to these features is an essential part of such planning. FuzzObserver contributes to the selection and estimation by generating recommendations for spacecraft trajectory adjustments to maintain the spacecraft's ability to observe sufficient terrain features for estimating position. The input to FuzzObserver consists of data from terrain images, including sets of data on features acquired during descent toward, or traversal of, a body of interest. The name of this program reflects its use of fuzzy logic to reason about the terrain features represented by the data and extract corresponding trajectory-adjustment rules. Linguistic fuzzy sets and conditional statements enable fuzzy systems to make decisions based on heuristic rule-based knowledge derived by engineering experts. A major advantage of using fuzzy logic is that it involves simple arithmetic calculations that can be performed rapidly enough to be useful for planning within the short times typically available for spacecraft maneuvers.
NASA Astrophysics Data System (ADS)
Beamish, David; White, James C.
2011-01-01
A number of modern, multiparameter, high resolution airborne geophysical surveys (termed HiRES) have been conducted over the past decade across onshore UK. These were undertaken, in part, as a response to the limited resolution of the existing UK national baseline magnetic survey data set acquired in the late 1950s and early 1960s. Modern magnetic survey data, obtained with higher precision and reduced line spacing and elevation, provide an improved data set; however the distinctions between the two available resources, existing and new, are rarely quantified. In this contribution we demonstrate and quantify the improvements that can be anticipated using the new data. The information content of the data sets is examined using a series of modern processing and modelling procedures that provide a full assessment of their resolution capabilities. The framework for the study involves two components. The first relates to the definition of the shallow magnetic structure in relation to an ongoing 1:10 k and 1:50 k geological map revision. The second component relates to the performance of the datasets in defining maps of magnetic basement and assisting with larger scale geological and structural interpretation. One of the smaller HiRES survey areas, the island of Anglesey (Ynys Môn), off the coast of NW Wales, is used to provide a series of comparative studies. The geological setting here is both complex and debated, and cultural interference is prevalent in the low-altitude modern survey data. It is demonstrated that successful processing and interpretation can be carried out on data that have not been systematically corrected (decultured) for non-geological perturbations. Across the survey area a large number of near-surface magnetic features are evident and are dominated by a reversely magnetized Palaeogene dyke swarm that extends offshore. The average depth to the upper surfaces of the dykes is found to be 44 m. The existing baseline data are necessarily limited in resolving features <1 km in scale; however a detailed comparison of the existing and new data reveals the extent to which these quasi-linear features can be resolved and mapped. The precise limitations of the baseline data in terms of detection, location and estimated depth are quantified. The spectral content of both data sets is examined and the longest wavelength information is extracted to estimate the resolution of magnetic basement features in the two data sets. A significant finding is the lack of information in the baseline data set across wavelengths of between 1 and ~10 km. Here the HiRES data provide a detailed mapping of shallow magnetic basement features (1-3 km) that display a relevance to current understanding of the fault-bounded terranes that cross the survey area. Equally, the compact scale of the modern survey does not provide deeper (>3 km to upper surface) assessments of magnetic basement. This further assessment is successfully provided by the larger scale baseline data, which locates and defines a mid-crustal magnetic basement feature, centred beneath the Snowdon Massif, and illustrates that basement of similar character extends beneath much of Anglesey.
Experiments on automatic classification of tissue malignancy in the field of digital pathology
NASA Astrophysics Data System (ADS)
Pereira, J.; Barata, R.; Furtado, Pedro
2017-06-01
Automated analysis of histological images helps diagnose and further classify breast cancer. Totally automated approaches can be used to pinpoint images for further analysis by the medical doctor. But tissue images are especially challenging for either manual or automated approaches, due to mixed patterns and textures, where malignant regions are sometimes difficult to detect unless they are in very advanced stages. Some of the major challenges are related to irregular and very diffuse patterns, as well as the difficulty of defining discriminative features and classifier models. Although the diffuse nature of the tissue also makes it hard to segment images correctly into regions, it is still crucial to take low-level features over individualized regions instead of the whole image, and to select those with the best outcomes. In this paper we report on our experiments building a region classifier with a simple subspace division and a feature selection model that improves results over image-wide and/or limited feature sets. Experimental results show modest accuracy for a set of classifiers applied over the whole image, while the conjunction of image division, per-region extraction of low-level features and feature selection, together with the use of a neural network classifier, achieved the best accuracy for the dataset and settings we used in the experiments. Future work involves deep learning techniques, adding structure semantics, and embedding the approach as a tumor-finding helper in a practical medical imaging application.
Algorithms for Learning Preferences for Sets of Objects
NASA Technical Reports Server (NTRS)
Wagstaff, Kiri L.; desJardins, Marie; Eaton, Eric
2010-01-01
A method is being developed that provides for an artificial-intelligence system to learn a user's preferences for sets of objects and to thereafter automatically select subsets of objects according to those preferences. The method was originally intended to enable automated selection, from among large sets of images acquired by instruments aboard spacecraft, of image subsets considered to be scientifically valuable enough to justify use of limited communication resources for transmission to Earth. The method is also applicable to other sets of objects: examples of sets of objects considered in the development of the method include food menus, radio-station music playlists, and assortments of colored blocks for creating mosaics. The method does not require the user to perform the often-difficult task of quantitatively specifying preferences; instead, the user provides examples of preferred sets of objects. This method goes beyond related prior artificial-intelligence methods for learning which individual items are preferred by the user: this method supports a concept of set-based preferences, which include not only preferences for individual items but also preferences regarding types and degrees of diversity of items in a set. Consideration of diversity in this method involves recognition that members of a set may interact with each other in the sense that when considered together, they may be regarded as being complementary, redundant, or incompatible to various degrees. The effects of such interactions are loosely summarized in the term portfolio effect. The learning method relies on a preference representation language, denoted DD-PREF, to express set-based preferences. In DD-PREF, a preference is represented by a tuple that includes quality (depth) functions to estimate how desired a specific value is, weights for each feature preference, the desired diversity of feature values, and the relative importance of diversity versus depth. The system applies statistical concepts to estimate quantitative measures of the user's preferences from training examples (preferred subsets) specified by the user. Once preferences have been learned, the system uses those preferences to select preferred subsets from new sets. The method was found to be viable when tested in computational experiments on menus, music playlists, and rover images. Contemplated future development efforts include further tests on more diverse sets and development of a sub-method for (a) estimating the parameter that represents the relative importance of diversity versus depth, and (b) incorporating background knowledge about the nature of quality functions, which are special functions that specify depth preferences for features.
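An illustrative encoding of such a set-based preference tuple, depth functions, weights, desired diversity, and a depth-versus-diversity trade-off, is sketched below; the scoring rule is a simplified assumption, not the published DD-PREF semantics.

    from dataclasses import dataclass
    from typing import Callable, Sequence
    import statistics

    @dataclass
    class SetPreference:
        depth_fns: Sequence[Callable[[float], float]]  # desirability of a value
        weights: Sequence[float]                       # importance of each feature
        diversity: Sequence[float]                     # desired spread per feature
        alpha: float                                   # diversity vs. depth trade-off

        def score(self, items: Sequence[Sequence[float]]) -> float:
            total = 0.0
            for f, (fn, w, d) in enumerate(
                    zip(self.depth_fns, self.weights, self.diversity)):
                vals = [item[f] for item in items]
                depth = sum(fn(v) for v in vals) / len(vals)   # average quality
                spread = statistics.pstdev(vals)               # realized diversity
                total += w * ((1 - self.alpha) * depth
                              - self.alpha * abs(spread - d))
            return total

    # Example: prefer bright but varied images (feature 0 = brightness in 0..1).
    pref = SetPreference([lambda v: v], [1.0], [0.3], alpha=0.4)
    print(pref.score([[0.9], [0.4], [0.7]]))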
ERIC Educational Resources Information Center
Tohidian, Iman
2009-01-01
One of those features that set human societies apart from animal societies is the use of language. Language is a vital part of every human culture and is a powerful social tool that we master at an early age. A second feature of humans is our ability to solve complex problems. For centuries philosophers have questioned whether these two abilities…
Vosbergen, Sandra; Peek, Niels; Mulder-Wiggers, Johanna MR; Kemps, Hareld MC; Kraaijenhagen, Roderik A; Jaspers, Monique WM; Lacroix, Joyca PW
2014-01-01
Objective To evaluate patients’ preferences for message features and assess their relationships with health literacy, monitor–blunter coping style, and other patient-dependent characteristics. Methods Patients with coronary heart disease completed an internet-based survey, which assessed health literacy and monitor–blunter coping style, as well as various other patient characteristics such as sociodemographics, disease history, and explicit information preferences. To assess preferences for message features, nine text sets differing in one of nine message features were composed, and participants were asked to state their preferences. Results The survey was completed by 213 patients. For three of the nine text sets, a relationship was found between patient preference and health literacy or monitor–blunter coping style. Patients with low health literacy preferred the text based on patient experience. Patients with a monitoring coping style preferred information on short-term effects of their treatment and mentioning of explicit risks. Various other patient characteristics such as marital status, social support, disease history, and age also showed a strong association. Conclusion Individual differences exist in patients’ preferences for message features, and these preferences relate to patient characteristics such as health literacy and monitor–blunter coping style. PMID:24851044
Impacts of uncertainties in European gridded precipitation observations on regional climate analysis
Prein, Andreas F; Gobiet, Andreas
2017-01-01
Gridded precipitation data sets are frequently used to evaluate climate models or to remove model output biases. Although precipitation data are error prone due to the high spatio-temporal variability of precipitation and due to considerable measurement errors, relatively few attempts have been made to account for observational uncertainty in model evaluation or in bias correction studies. In this study, we compare three types of European daily data sets featuring two Pan-European data sets and a set that combines eight very high-resolution station-based regional data sets. Furthermore, we investigate seven widely used, larger scale global data sets. Our results demonstrate that the differences between these data sets have the same magnitude as precipitation errors found in regional climate models. Therefore, including observational uncertainties is essential for climate studies, climate model evaluation, and statistical post-processing. Following our results, we suggest the following guidelines for regional precipitation assessments. (1) Include multiple observational data sets from different sources (e.g. station, satellite, reanalysis based) to estimate observational uncertainties. (2) Use data sets with high station densities to minimize the effect of precipitation undersampling (may induce about 60% error in data sparse regions). The information content of a gridded data set is mainly related to its underlying station density and not to its grid spacing. (3) Consider undercatch errors of up to 80% in high latitudes and mountainous regions. (4) Analyses of small-scale features and extremes are especially uncertain in gridded data sets. For higher confidence, use climate-mean and larger scale statistics. In conclusion, neglecting observational uncertainties potentially misguides climate model development and can severely affect the results of climate change impact assessments.
Application of quantum-behaved particle swarm optimization to motor imagery EEG classification.
Hsu, Wei-Yen
2013-12-01
In this study, we propose a recognition system for single-trial analysis of motor imagery (MI) electroencephalogram (EEG) data. Applied to event-related brain potential (ERP) data acquired from the sensorimotor cortices, the system chiefly consists of automatic artifact elimination, feature extraction, feature selection, and classification. In addition to the use of independent component analysis, a similarity measure is proposed to further remove electrooculographic (EOG) artifacts automatically. Several potential features, such as wavelet-fractal features, are then extracted for subsequent classification. Next, quantum-behaved particle swarm optimization (QPSO) is used to select features from the feature combination. Finally, the selected sub-features are classified by a support vector machine (SVM). Compared with baselines without artifact elimination, with feature selection using a genetic algorithm (GA), and with feature classification using Fisher's linear discriminant (FLD) on MI data from two data sets covering eight subjects, the results indicate that the proposed method is promising for brain-computer interface (BCI) applications.
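As an illustration of the selection step described above, the following minimal Python sketch pairs a simplified, binary-thresholded QPSO-style particle update with SVM cross-validation accuracy as the fitness function; the data, parameter values, and simplifications are ours, not the paper's.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=30, n_informative=8, random_state=0)

def fitness(mask):
    if not mask.any():                       # an empty feature set is invalid
        return 0.0
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean()

n_particles, n_feat, n_iter, beta = 12, X.shape[1], 20, 0.8
pos = rng.uniform(0, 1, (n_particles, n_feat))          # continuous positions
pbest = pos.copy()
pbest_fit = np.array([fitness(p > 0.5) for p in pos])   # threshold to a binary mask
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iter):
    mbest = pbest.mean(axis=0)                           # mean-best position
    for i in range(n_particles):
        phi = rng.uniform(0, 1, n_feat)
        attractor = phi * pbest[i] + (1 - phi) * gbest   # local attractor
        u = rng.uniform(1e-12, 1, n_feat)
        sign = np.where(rng.uniform(size=n_feat) < 0.5, 1.0, -1.0)
        pos[i] = np.clip(attractor + sign * beta * np.abs(mbest - pos[i]) * np.log(1 / u), 0, 1)
        fit = fitness(pos[i] > 0.5)
        if fit > pbest_fit[i]:
            pbest_fit[i], pbest[i] = fit, pos[i].copy()
    gbest = pbest[pbest_fit.argmax()].copy()

print("selected features:", np.flatnonzero(gbest > 0.5))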
Detecting natural occlusion boundaries using local cues
DiMattina, Christopher; Fox, Sean A.; Lewicki, Michael S.
2012-01-01
Occlusion boundaries and junctions provide important cues for inferring three-dimensional scene organization from two-dimensional images. Although several investigators in machine vision have developed algorithms for detecting occlusions and other edges in natural images, relatively few psychophysics or neurophysiology studies have investigated what features are used by the visual system to detect natural occlusions. In this study, we addressed this question using a psychophysical experiment where subjects discriminated image patches containing occlusions from patches containing surfaces. Image patches were drawn from a novel occlusion database containing labeled occlusion boundaries and textured surfaces in a variety of natural scenes. Consistent with related previous work, we found that relatively large image patches were needed to attain reliable performance, suggesting that human subjects integrate complex information over a large spatial region to detect natural occlusions. By defining machine observers using a set of previously studied features measured from natural occlusions and surfaces, we demonstrate that simple features defined at the spatial scale of the image patch are insufficient to account for human performance in the task. To define machine observers using a more biologically plausible multiscale feature set, we trained standard linear and neural network classifiers on the rectified outputs of a Gabor filter bank applied to the image patches. We found that simple linear classifiers could not match human performance, while a neural network classifier combining filter information across location and spatial scale compared well. These results demonstrate the importance of combining a variety of cues defined at multiple spatial scales for detecting natural occlusions. PMID:23255731
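The multiscale feature set described above can be approximated in a few lines; the sketch below hand-rolls a small Gabor filter bank, rectifies and pools the responses per patch, and trains a linear classifier. The kernel sizes, frequencies, and synthetic patches are illustrative assumptions, not the study's exact configuration.

import numpy as np
from scipy.signal import fftconvolve
from sklearn.linear_model import LogisticRegression

def gabor_kernel(freq, theta, sigma=4.0, size=21):
    # real-valued Gabor: cosine carrier under an isotropic Gaussian envelope
    half = size // 2
    yy, xx = np.mgrid[-half:half + 1, -half:half + 1]
    xr = xx * np.cos(theta) + yy * np.sin(theta)
    return np.exp(-(xx**2 + yy**2) / (2 * sigma**2)) * np.cos(2 * np.pi * freq * xr)

def patch_features(patch, freqs=(0.1, 0.2, 0.4), n_orient=4):
    # rectified filter responses, pooled over location, at several scales/orientations
    feats = []
    for f in freqs:
        for k in range(n_orient):
            resp = fftconvolve(patch, gabor_kernel(f, np.pi * k / n_orient), mode="same")
            feats.append(np.abs(resp).mean())     # rectify, then pool
    return np.array(feats)

rng = np.random.default_rng(1)
patches = rng.normal(size=(100, 32, 32))          # stand-ins for image patches
labels = rng.integers(0, 2, 100)                  # 1 = occlusion, 0 = surface
X = np.stack([patch_features(p) for p in patches])
clf = LogisticRegression(max_iter=1000).fit(X, labels)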
Frisch, Stefan A.; Pisoni, David B.
2012-01-01
Objective Computational simulations were carried out to evaluate the appropriateness of several psycholinguistic theories of spoken word recognition for children who use cochlear implants. These models also investigate the interrelations of commonly used measures of closed-set and open-set tests of speech perception. Design A software simulation of phoneme recognition performance was developed that uses feature identification scores as input. Two simulations of lexical access were developed. In one, early phoneme decisions are used in a lexical search to find the best matching candidate. In the second, phoneme decisions are made only when lexical access occurs. Simulated phoneme and word identification performance was then applied to behavioral data from the Phonetically Balanced Kindergarten test and Lexical Neighborhood Test of open-set word recognition. Simulations of performance were evaluated for children with prelingual sensorineural hearing loss who use cochlear implants with the MPEAK or SPEAK coding strategies. Results Open-set word recognition performance can be successfully predicted using feature identification scores. In addition, we observed no qualitative differences in performance between children using MPEAK and SPEAK, suggesting that both groups of children process spoken words similarly despite differences in input. Word recognition ability was best predicted in the model in which phoneme decisions were delayed until lexical access. Conclusions Closed-set feature identification and open-set word recognition focus on different, but related, levels of language processing. Additional insight for clinical intervention may be achieved by collecting both types of data. The most successful model of performance is consistent with current psycholinguistic theories of spoken word recognition. Thus it appears that the cognitive process of spoken word recognition is fundamentally the same for pediatric cochlear implant users and children and adults with normal hearing. PMID:11132784
Multi-task feature learning by using trace norm regularization
NASA Astrophysics Data System (ADS)
Jiangmei, Zhang; Binfeng, Yu; Haibo, Ji; Wang, Kunpeng
2017-11-01
Multi-task learning can extract the correlation of multiple related machine learning problems to improve performance. This paper considers applying the multi-task learning method to learn a single task. We propose a new learning approach, which employs the mixture of expert model to divide a learning task into several related sub-tasks, and then uses the trace norm regularization to extract common feature representation of these sub-tasks. A nonlinear extension of this approach by using kernel is also provided. Experiments conducted on both simulated and real data sets demonstrate the advantage of the proposed approach.
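The trace-norm term has a simple closed-form proximal operator (soft-thresholding of singular values), which the hedged sketch below uses inside a plain proximal-gradient loop over sub-task losses. The step size and squared-error loss are our assumptions; the paper's mixture-of-experts task split and kernel extension are not shown.

import numpy as np

def prox_trace_norm(W, tau):
    # proximal operator of tau * ||W||_*: soft-threshold the singular values
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def multitask_fit(Xs, ys, lam=0.1, lr=1e-3, n_iter=500):
    # proximal gradient descent; column t of W holds the weights of sub-task t
    d, T = Xs[0].shape[1], len(Xs)
    W = np.zeros((d, T))
    for _ in range(n_iter):
        G = np.zeros_like(W)
        for t, (X, y) in enumerate(zip(Xs, ys)):
            G[:, t] = X.T @ (X @ W[:, t] - y)      # gradient of task t's squared error
        W = prox_trace_norm(W - lr * G, lr * lam)  # low-rank-inducing step
    return W

rng = np.random.default_rng(7)
Xs = [rng.normal(size=(50, 10)) for _ in range(3)]
w_true = rng.normal(size=10)
ys = [X @ w_true + 0.1 * rng.normal(size=50) for X in Xs]  # related sub-tasks
W = multitask_fit(Xs, ys)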
Late-summer sea ice segmentation with multi-polarisation SAR features in C and X band
NASA Astrophysics Data System (ADS)
Fors, Ane S.; Brekke, Camilla; Doulgeris, Anthony P.; Eltoft, Torbjørn; Renner, Angelika H. H.; Gerland, Sebastian
2016-02-01
In this study, we investigate the potential of sea ice segmentation by C- and X-band multi-polarisation synthetic aperture radar (SAR) features during late summer. Five high-resolution satellite SAR scenes were recorded in the Fram Strait covering iceberg-fast first-year and old sea ice during a week with air temperatures varying around 0 °C. Sea ice thickness, surface roughness and aerial photographs were collected during a helicopter flight at the site. Six polarimetric SAR features were extracted for each of the scenes. The ability of the individual SAR features to discriminate between sea ice types and their temporal consistency were examined. All SAR features were found to add value to sea ice type discrimination. Relative kurtosis, geometric brightness, cross-polarisation ratio and co-polarisation correlation angle were found to be temporally consistent in the investigated period, while co-polarisation ratio and co-polarisation correlation magnitude were found to be temporally inconsistent. An automatic feature-based segmentation algorithm was tested both for a full SAR feature set and for a reduced SAR feature set limited to temporally consistent features. In C band, the algorithm produced a good late-summer sea ice segmentation, separating the scenes into segments that could be associated with different sea ice types in the next step. The X-band performance was slightly poorer. Excluding temporally inconsistent SAR features improved the segmentation in one of the X-band scenes.
Sensor-oriented feature usability evaluation in fingerprint segmentation
NASA Astrophysics Data System (ADS)
Li, Ying; Yin, Yilong; Yang, Gongping
2013-06-01
Existing fingerprint segmentation methods usually process fingerprint images captured by different sensors with the same feature or feature set. We propose to improve fingerprint segmentation results in view of an important fact: images from different sensors have different characteristics for segmentation. Feature usability evaluation means evaluating the usability of candidate features to find a personalized feature or feature set for each sensor, thereby improving segmentation performance. The need for feature usability evaluation in fingerprint segmentation is raised and analyzed as a new issue. To address it, we present a decision-tree-based feature-usability evaluation method, which utilizes the C4.5 decision tree algorithm to evaluate and pick the most suitable feature or feature set for fingerprint segmentation from a typical candidate feature set. We apply the novel method to the FVC2002 database of fingerprint images, which were acquired with four different sensors and technologies. Experimental results show that segmentation accuracy is improved, and the time consumed by feature extraction is dramatically reduced with the selected feature(s).
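A rough analogue of the evaluation idea, with scikit-learn's CART-style tree standing in for C4.5 (which scikit-learn does not implement), might look like the following; the entropy criterion, importance ranking, and accuracy-preserving prefix rule are our simplifications.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def rank_features(X, y):
    # rank candidate segmentation features for one sensor's images by the
    # impurity-based importances of an entropy (information-gain) tree
    tree = DecisionTreeClassifier(criterion="entropy", max_depth=5, random_state=0)
    tree.fit(X, y)                 # y: foreground/background labels per image block
    return np.argsort(tree.feature_importances_)[::-1]

def pick_subset(X, y, tol=0.01):
    # keep the shortest importance-ranked prefix whose CV accuracy stays
    # within tol of using all candidate features
    order = rank_features(X, y)
    full = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
    for k in range(1, X.shape[1] + 1):
        cols = order[:k]
        acc = cross_val_score(DecisionTreeClassifier(random_state=0),
                              X[:, cols], y, cv=5).mean()
        if acc >= full - tol:
            return cols
    return order

rng = np.random.default_rng(8)
X = rng.normal(size=(300, 8))                        # stand-in block features
y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)
print("selected feature indices:", pick_subset(X, y))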
National hydrography dataset--linear referencing
Simley, Jeffrey; Doumbouya, Ariel
2012-01-01
Geospatial data normally have a certain set of standard attributes, such as an identification number, the type of feature, and name of the feature. These standard attributes are typically embedded into the default attribute table, which is directly linked to the geospatial features. However, it is impractical to embed too much information because it can create a complex, inflexible, and hard to maintain geospatial dataset. Many scientists prefer to create a modular, or relational, data design where the information about the features is stored and maintained separately, then linked to the geospatial data. For example, information about the water chemistry of a lake can be maintained in a separate file and linked to the lake. A Geographic Information System (GIS) can then relate the water chemistry to the lake and analyze it as one piece of information. For example, the GIS can select all lakes more than 50 acres, with turbidity greater than 1.5 milligrams per liter.
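The closing query translates directly into a relational join; a minimal pandas sketch with made-up lake records follows (the identifiers, names, and values are invented for illustration).

import pandas as pd

# geospatial feature table: one row per lake, keyed by a feature identifier
lakes = pd.DataFrame({
    "feature_id": [101, 102, 103],
    "name": ["Clear Lake", "Mud Lake", "Long Lake"],
    "acres": [220.0, 35.5, 87.0],
})

# separately maintained water-chemistry table, linked by the same key
chemistry = pd.DataFrame({
    "feature_id": [101, 102, 103],
    "turbidity_mg_per_l": [1.8, 2.4, 0.9],
})

# relate the tables, then run the example query from the text:
# lakes larger than 50 acres with turbidity above 1.5 milligrams per liter
joined = lakes.merge(chemistry, on="feature_id")
result = joined[(joined["acres"] > 50) & (joined["turbidity_mg_per_l"] > 1.5)]
print(result[["name", "acres", "turbidity_mg_per_l"]])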
Lamti, Hachem A; Gorce, Philippe; Ben Khelifa, Mohamed Moncef; Alimi, Adel M
2016-12-01
The goal of this study is to investigate the influence of mental fatigue on event-related potential P300 features (maximum peak, minimum amplitude, latency, and period) during virtual wheelchair navigation. For this purpose, an experimental environment was set up based on customizable environmental parameters (luminosity, number of obstacles, and obstacle velocities). A correlation study between P300 features and fatigue ratings was conducted. Finally, the best-correlated features were supplied to three classification algorithms: a multilayer perceptron (MLP), linear discriminant analysis, and a support vector machine. The results showed that the maximum-peak feature over visual and temporal regions, as well as the period feature over frontal, fronto-central, and visual regions, was correlated with mental fatigue levels. On the other hand, the minimum-amplitude and latency features did not show any correlation. Among the classification techniques, MLP showed the best performance, although the differences between techniques were minimal. These findings can help in designing suitable mental-fatigue-based wheelchair control.
Linear and Non-Linear Visual Feature Learning in Rat and Humans
Bossens, Christophe; Op de Beeck, Hans P.
2016-01-01
The visual system processes visual input in a hierarchical manner in order to extract relevant features that can be used in tasks such as invariant object recognition. Although typically investigated in primates, recent work has shown that rats can be trained in a variety of visual object and shape recognition tasks. These studies did not pinpoint the complexity of the features used by these animals. Many tasks might be solved by using a combination of relatively simple features which tend to be correlated. Alternatively, rats might extract complex features or feature combinations which are nonlinear with respect to those simple features. In the present study, we address this question by starting from a small stimulus set for which one stimulus-response mapping involves a simple linear feature to solve the task while another mapping needs a well-defined nonlinear combination of simpler features related to shape symmetry. We verified computationally that the nonlinear task cannot be trivially solved by a simple V1-model. We show how rats are able to solve the linear feature task but are unable to acquire the nonlinear feature. In contrast, humans are able to use the nonlinear feature and are even faster in uncovering this solution as compared to the linear feature. The implications for the computational capabilities of the rat visual system are discussed. PMID:28066201
The effects of TIS and MI on the texture features in ultrasonic fatty liver images
NASA Astrophysics Data System (ADS)
Zhao, Yuan; Cheng, Xinyao; Ding, Mingyue
2017-03-01
Nonalcoholic fatty liver disease (NAFLD) is now prevalent worldwide. Although ultrasound imaging is the common method for diagnosing fatty liver, it cannot detect NAFLD in its early stage and is limited by the diagnostic instruments and other factors. B-scan image feature extraction of the fatty liver can assist doctors in analyzing a patient's condition and enhance the efficiency and accuracy of clinical diagnosis. However, some uncertain factors in ultrasonic diagnosis are often ignored during feature extraction. In this study, a nonalcoholic fatty liver rabbit model was built and liver ultrasound images were collected under different settings of the thermal index of soft tissue (TIS) and the mechanical index (MI). Texture features were then calculated based on the gray-level co-occurrence matrix (GLCM), and the impacts of TIS and MI on these features were analyzed and discussed. Furthermore, the receiver operating characteristic (ROC) curve was used to evaluate whether each feature was effective for given TIS and MI values. The results showed that TIS and MI do affect the features extracted from healthy livers, while the texture features of fatty livers are relatively stable. In addition, setting TIS to 0.3 and MI to 0.9 may be a better choice when using a computer-aided diagnosis (CAD) method for fatty liver recognition.
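For readers who want to reproduce the feature side of this pipeline, the sketch below computes a few GLCM texture properties per region of interest and scores one of them with an ROC analysis; the random stand-in images, quantization level, and chosen properties are illustrative assumptions. (The functions are named greycomatrix/greycoprops in scikit-image versions before 0.19.)

import numpy as np
from skimage.feature import graycomatrix, graycoprops
from sklearn.metrics import roc_auc_score

def glcm_features(roi, levels=32):
    # a few Haralick-style texture properties from one ROI's co-occurrence matrix
    img = (roi // (256 // levels)).astype(np.uint8)          # quantize 0..255 grays
    glcm = graycomatrix(img, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    return {p: graycoprops(glcm, p).mean()
            for p in ("contrast", "homogeneity", "energy", "correlation")}

rng = np.random.default_rng(0)
rois = rng.integers(0, 256, size=(20, 64, 64), dtype=np.uint8)  # stand-in liver ROIs
labels = np.repeat([0, 1], 10)                                   # 1 = fatty liver
contrast = [glcm_features(r)["contrast"] for r in rois]
print("AUC of the contrast feature:", roc_auc_score(labels, contrast))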
CAFÉ-Map: Context Aware Feature Mapping for mining high dimensional biomedical data.
Minhas, Fayyaz Ul Amir Afsar; Asif, Amina; Arif, Muhammad
2016-12-01
Feature selection and ranking is of great importance in the analysis of biomedical data. In addition to reducing the number of features used in classification or other machine learning tasks, it allows us to extract meaningful biological and medical information from a machine learning model. Most existing approaches in this domain do not directly model the fact that the relative importance of features can be different in different regions of the feature space. In this work, we present a context aware feature ranking algorithm called CAFÉ-Map. CAFÉ-Map is a locally linear feature ranking framework that allows recognition of important features in any given region of the feature space or for any individual example. This allows for simultaneous classification and feature ranking in an interpretable manner. We have benchmarked CAFÉ-Map on a number of toy and real world biomedical data sets. Our comparative study with a number of published methods shows that CAFÉ-Map achieves better accuracies on these data sets. The top ranking features obtained through CAFÉ-Map in a gene profiling study correlate very well with the importance of different genes reported in the literature. Furthermore, CAFÉ-Map provides a more in-depth analysis of feature ranking at the level of individual examples. CAFÉ-Map Python code is available at: http://faculty.pieas.edu.pk/fayyaz/software.html#cafemap . The CAFÉ-Map package supports parallelization and sparse data and provides example scripts for classification. This code can be used to reconstruct the results given in this paper. Copyright © 2016 Elsevier Ltd. All rights reserved.
Holmes, Tyson H.; He, Xiao-Song
2016-01-01
Small, wide data sets are commonplace in human immunophenotyping research. As defined here, a small, wide data set is constructed by sampling a small to modest quantity n, 1 < n < 50, of human participants for the purpose of estimating many parameters p, such that n < p < 1,000. We offer a set of prescriptions that are designed to facilitate low-variance (i.e. stable), low-bias, interpretive regression modeling of small, wide data sets. These prescriptions are distinctive in their especially heavy emphasis on minimizing use of out-of-sample information for conducting statistical inference. That allows the working immunologist to proceed without being encumbered by imposed and often untestable statistical assumptions. Problems of unmeasured confounders, confidence-interval coverage, feature selection, and shrinkage/denoising are defined clearly and treated in detail. We propose an extension of an existing nonparametric technique for improved small-sample confidence-interval tail coverage from the univariate case (single immune feature) to the multivariate (many, possibly correlated immune features). An important role for derived features in the immunological interpretation of regression analyses is stressed. Areas of further research are discussed. Presented principles and methods are illustrated through application to a small, wide data set of adults spanning a wide range in ages and multiple immunophenotypes that were assayed before and after immunization with inactivated influenza vaccine (IIV). Our regression modeling prescriptions identify some potentially important topics for future immunological research. 1) Immunologists may wish to distinguish age-related differences in immune features from changes in immune features caused by aging. 2) A form of the bootstrap that employs linear extrapolation may prove to be an invaluable analytic tool because it allows the working immunologist to obtain accurate estimates of the stability of immune parameter estimates with a bare minimum of imposed assumptions. 3) Liberal inclusion of immune features in phenotyping panels can facilitate accurate separation of biological signal of interest from noise. In addition, through a combination of denoising and potentially improved confidence interval coverage, we identify some candidate immune correlates (frequency of cell subset and concentration of cytokine) with B cell response as measured by quantity of IIV-specific IgA antibody-secreting cells and quantity of IIV-specific IgG antibody-secreting cells. PMID:27196789
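As a baseline for the bootstrap idea above, the sketch below shows a standard percentile bootstrap for the slope of one immune feature against age; it is not the linear-extrapolation variant the authors propose, and the simulated data are purely illustrative.

import numpy as np

def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    # percentile-bootstrap CI for a simple regression slope (one immune feature);
    # this is the standard nonparametric bootstrap, not the paper's
    # linear-extrapolation variant
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)               # resample participants
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]
    return np.quantile(slopes, [alpha / 2, 1 - alpha / 2])

rng = np.random.default_rng(1)
age = rng.uniform(20, 80, 20)                     # n = 20 participants (small, wide setting)
igg_asc = 5 + 0.05 * age + rng.normal(0, 1, 20)   # simulated immune readout
print("95% CI for the age slope:", bootstrap_ci(age, igg_asc))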
Feature Selection for Ridge Regression with Provable Guarantees.
Paul, Saurabh; Drineas, Petros
2016-04-01
We introduce single-set spectral sparsification as a deterministic sampling-based feature selection technique for regularized least-squares classification, which is the classification analog to ridge regression. The method is unsupervised and gives worst-case guarantees of the generalization power of the classification function after feature selection with respect to the classification function obtained using all features. We also introduce leverage-score sampling as an unsupervised randomized feature selection method for ridge regression. We provide risk bounds for both single-set spectral sparsification and leverage-score sampling on ridge regression in the fixed design setting and show that the risk in the sampled space is comparable to the risk in the full-feature space. We perform experiments on synthetic and real-world data sets; a subset of TechTC-300 data sets, to support our theory. Experimental results indicate that the proposed methods perform better than the existing feature selection methods.
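The randomized half of the method can be sketched compactly: compute feature leverage scores from the right singular vectors, sample columns accordingly, and solve ridge regression in the sampled space. The data and the choice of sampling without replacement are our assumptions; the deterministic single-set spectral sparsification variant is not shown.

import numpy as np

def leverage_score_feature_sampling(X, n_keep, seed=0):
    # sample feature (column) indices with probability proportional to their
    # leverage scores, i.e. squared row norms of the right singular vectors
    rng = np.random.default_rng(seed)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    k = Vt.shape[0]
    scores = (Vt**2).sum(axis=0) / k          # sums to 1 over features
    return rng.choice(X.shape[1], size=n_keep, replace=False, p=scores)

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 400))
y = X[:, :5] @ rng.normal(size=5) + 0.1 * rng.normal(size=100)
cols = leverage_score_feature_sampling(X, n_keep=50)
Xs = X[:, cols]
lam = 1.0                                      # ridge penalty
w = np.linalg.solve(Xs.T @ Xs + lam * np.eye(Xs.shape[1]), Xs.T @ y)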
Visualization of Penile Suspensory Ligamentous System Based on Visible Human Data Sets
Chen, Xianzhuo; Wu, Yi; Tao, Ling; Yan, Yan; Pang, Jun; Zhang, Shaoxiang; Li, Shirong
2017-01-01
Background The aim of this study was to use a three-dimensional (3D) visualization technology to illustrate and describe the anatomical features of the penile suspensory ligamentous system based on the Visible Human data sets and to explore the suspensory mechanism of the penis for the further improvement of the penis-lengthening surgery. Material/Methods Cross-sectional images retrieved from the first Chinese Visible Human (CVH-1), third Chinese Visible Human (CVH-3), and Visible Human Male (VHM) data sets were used to segment the suspensory ligamentous system and its adjacent structures. The magnetic resonance imaging (MRI) images of this system were studied and compared with those from the Visible Human data sets. The 3D models reconstructed from the Visible Human data sets were used to provide morphological features of the penile suspensory ligamentous system and its related structures. Results The fundiform ligament was a superficial, loose, fibro-fatty tissue which originated from Scarpa’s fascia superiorly and continued to the scrotal septum inferiorly. The suspensory ligament and arcuate pubic ligament were dense fibrous connective tissues which started from the pubic symphysis and terminated by attaching to the tunica albuginea of the corpora cavernosa. Furthermore, the arcuate pubic ligament attached to the inferior rami of the pubis laterally. Conclusions The 3D model based on Visible Human data sets can be used to clarify the anatomical features of the suspensory ligamentous system, thereby contributing to the improvement of penis-lengthening surgery. PMID:28530218
Burger, Birgitta; Thompson, Marc R.; Luck, Geoff; Saarikallio, Suvi; Toiviainen, Petri
2013-01-01
Music makes us move. Several factors can affect the characteristics of such movements, including individual factors or musical features. For this study, we investigated the effect of rhythm- and timbre-related musical features as well as tempo on movement characteristics. Sixty participants were presented with 30 musical stimuli representing different styles of popular music, and instructed to move along with the music. Optical motion capture was used to record participants’ movements. Subsequently, eight movement features and four rhythm- and timbre-related musical features were computationally extracted from the data, while the tempo was assessed in a perceptual experiment. A subsequent correlational analysis revealed that, for instance, clear pulses seemed to be embodied with the whole body, i.e., by using various movement types of different body parts, whereas spectral flux and percussiveness were found to be more distinctly related to certain body parts, such as head and hand movement. A series of ANOVAs with the stimuli being divided into three groups of five stimuli each based on the tempo revealed no significant differences between the groups, suggesting that the tempo of our stimuli set failed to have an effect on the movement features. In general, the results can be linked to the framework of embodied music cognition, as they show that body movements are used to reflect, imitate, and predict musical characteristics. PMID:23641220
Miller, Stephan W.
1981-01-01
A second set of related problems deals with how this format and other representations of spatial entities, such as vector formats for point and line features, can be interrelated for manipulation, retrieval, and analysis by a spatial database management subsystem. Methods have been developed for interrelating areal data sets in the raster format with point and line data in a vector format and these are described.
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan; Liu, Jingjing
2018-01-18
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment depends mainly on artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using principal component analysis (PCA) and partial least squares (PLS). Support vector machine (SVM), random forest (RF), and extreme learning machine (ELM) classifiers were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the SVM model, with an accuracy of 100%, was superior to the RF model, with an accuracy of 98.33-100%, and the ELM model, with an accuracy of 98.01-100%. For level assessment, the R² for the training set was above 0.97 and the R² for the test set was above 0.87. Through comprehensive comparison, the generalization of the ELM model was superior to those of the SVM and RF models. The scoring errors of the three models were 0.0016-0.3494, lower than the 0.5-1.0 error measured by industry-standard experts, meaning these methods achieve higher prediction accuracy for scoring paraffin levels.
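A scikit-learn pipeline reproduces the broad shape of the PCA-plus-classifier route (ELM is not in scikit-learn, so an SVM stands in); the synthetic data and component count below are placeholders, not the paper's settings.

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_classification

# stand-in for e-nose responses: rows = samples, columns = sensor features
X, y = make_classification(n_samples=120, n_features=16, n_classes=4,
                           n_informative=8, random_state=0)

clf = make_pipeline(StandardScaler(), PCA(n_components=5), SVC(kernel="rbf"))
print("CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())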
Bayes classification of interferometric TOPSAR data
NASA Technical Reports Server (NTRS)
Michel, T. R.; Rodriguez, E.; Houshmand, B.; Carande, R.
1995-01-01
We report the Bayes classification of terrain types at different sites using airborne interferometric synthetic aperture radar (INSAR) data. A Gaussian maximum likelihood classifier was applied to multidimensional observations derived from the SAR intensity, the terrain elevation model, and the magnitude of the interferometric correlation. Training sets for forested, urban, agricultural, or bare areas were obtained either by selecting samples with known ground truth, or by k-means clustering of random sets of samples uniformly distributed across all sites and subsequent assignment of these clusters using ground truth. The accuracy of the classifier was used to optimize the discriminating efficiency of the chosen feature set. The most important features include the SAR intensity, a canopy penetration depth model, and the terrain slope. We demonstrate the classifier's performance across sites using a unique set of training classes for the four main terrain categories. The scenes examined include San Francisco (CA) (predominantly urban and water), Mount Adams (WA) (forested with clear cuts), Pasadena (CA) (urban with mountains), and Antioch Hills (CA) (water, swamps, fields). Issues related to the effects of image calibration and the robustness of the classification to calibration errors are explored. The relative performance of single-polarization interferometric data classification is contrasted against classification schemes based on polarimetric SAR data.
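A Gaussian maximum likelihood classifier of this kind fits one multivariate normal per terrain class and assigns each pixel's feature vector to the class with the highest log-likelihood, as in the hedged sketch below; the classes and data are invented for illustration.

import numpy as np
from scipy.stats import multivariate_normal

class GaussianMLClassifier:
    # Gaussian maximum likelihood over per-pixel feature vectors
    # (e.g. SAR intensity, elevation-derived slope, interferometric correlation)
    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.dists_ = [multivariate_normal(X[y == c].mean(axis=0),
                                           np.cov(X[y == c], rowvar=False))
                       for c in self.classes_]
        return self

    def predict(self, X):
        ll = np.column_stack([d.logpdf(X) for d in self.dists_])
        return self.classes_[ll.argmax(axis=1)]

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(3, 1, (50, 3))])
y = np.repeat(["forest", "urban"], 50)
print(GaussianMLClassifier().fit(X, y).predict(X[:3]))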
Discovery of Predicate-Oriented Relations among Named Entities Extracted from Thai Texts
NASA Astrophysics Data System (ADS)
Tongtep, Nattapong; Theeramunkong, Thanaruk
Extracting named entities (NEs) and their relations is more difficult in Thai than in other languages due to several Thai-specific characteristics: no explicit boundaries for words, phrases, and sentences; few case markers and modifier clues; high ambiguity in compound words and serial verbs; and flexible word order. Unlike most previous works, which focused on NE relations of specific actions, such as work_for, live_in, located_in, and kill, this paper proposes a more general type of NE relation, called predicate-oriented relation (PoR), where an extracted action part (verb) is used as a core component to associate related named entities extracted from Thai texts. Lacking a practical parser for the Thai language, we present three types of surface features, i.e., punctuation marks (such as token spaces), entity types, and the number of entities, and then apply five alternative, commonly used learning schemes to investigate their performance on predicate-oriented relation extraction. The experimental results show that our approach achieves F-measures of 97.76%, 99.19%, 95.00% and 93.50% on four different types of predicate-oriented relation (action-location, location-action, action-person and person-action) in crime-related news documents, using a data set of 1,736 entity pairs. The effects of NE extraction techniques, feature sets, and class imbalance on the performance of relation extraction are explored.
Perception of initial obstruent voicing is influenced by gestural organization
Best, Catherine T.; Hallé, Pierre A.
2009-01-01
Cross-language differences in phonetic settings for phonological contrasts of stop voicing have posed a challenge for attempts to relate specific phonological features to specific phonetic details. We probe the phonetic-phonological relationship for voicing contrasts more broadly, analyzing in particular their relevance to nonnative speech perception, from two theoretical perspectives: feature geometry and articulatory phonology. Because these perspectives differ in assumptions about temporal/phasing relationships among features/gestures within syllable onsets, we undertook a cross-language investigation on perception of obstruent (stop, fricative) voicing contrasts in three nonnative onsets that use a common set of features/gestures but with differing time-coupling. Listeners of English and French, which differ in their phonetic settings for word-initial stop voicing distinctions, were tested on perception of three onset types, all nonnative to both English and French, that differ in how initial obstruent voicing is coordinated with a lateral feature/gesture and additional obstruent features/gestures. The targets, listed from least complex to most complex onsets, were: a lateral fricative voicing distinction (Zulu /ɬ/-ɮ/), a laterally-released affricate voicing distinction (Tlingit /tɬ/-/dɮ/), and a coronal stop voicing distinction in stop+/l/ clusters (Hebrew /tl/-/dl/). English and French listeners' performance reflected the differences in their native languages' stop voicing distinctions, compatible with prior perceptual studies on singleton consonant onsets. However, both groups' abilities to perceive voicing as a separable parameter also varied systematically with the structure of the target onsets, supporting the notion that the gestural organization of syllable onsets systematically affects perception of initial voicing distinctions. PMID:20228878
Diagnostic and prognostic histopathology system using morphometric indices
DOE Office of Scientific and Technical Information (OSTI.GOV)
Parvin, Bahram; Chang, Hang; Han, Ju
Determining at least one of a prognosis or a therapy for a patient based on a stained tissue section of the patient. An image of a stained tissue section of a patient is processed by a processing device. A set of features values for a set of cell-based features is extracted from the processed image, and the processed image is associated with a particular cluster of a plurality of clusters based on the set of feature values, where the plurality of clusters is defined with respect to a feature space corresponding to the set of features.
Universal relations for range corrections to Efimov features
Ji, Chen; Braaten, Eric; Phillips, Daniel R.; ...
2015-09-09
In a three-body system of identical bosons interacting through a large S-wave scattering length a, there are several sets of features related to the Efimov effect that are characterized by discrete scale invariance. Effective field theory was recently used to derive universal relations between these Efimov features that include the first-order correction due to a nonzero effective range r_s. We reveal a simple pattern in these range corrections that had not been previously identified. The pattern is explained by the renormalization group for the effective field theory, which implies that the Efimov three-body parameter runs logarithmically with the momentum scale at a rate proportional to r_s/a. The running Efimov parameter also explains the empirical observation that range corrections can be largely taken into account by shifting the Efimov parameter by an adjustable parameter divided by a. Furthermore, the accuracy of universal relations that include first-order range corrections is verified by comparing them with various theoretical calculations using models with nonzero range.
Segmental Rescoring in Text Recognition
2014-02-04
description relates to rescoring text hypotheses in text recognition based on segmental features. Offline printed text and handwriting recognition (OHR) can... Handwriting, College Park, Md., 2006, which is incorporated by reference here. For the set of training images 202, a character modeler 208 receives
Kursawe, Michael A; Zimmer, Hubert D
2015-06-01
We investigated the impact of perceptual processing demands on visual working memory for coloured complex random polygons during change detection. Processing load was assessed by pupil size (Exp. 1) and additionally by slow-wave potentials (Exp. 2). Task difficulty was manipulated by presenting different set sizes (1, 2, 4 items) and by making different features (colour, shape, or both) task-relevant. Memory performance in the colour condition was better than in the shape and both conditions, which did not differ from each other. Pupil dilation and the posterior N1 increased with set size independent of the type of feature. In contrast, slow waves and a posterior P2 component showed set size effects, but only if shape was task-relevant. In the colour condition slow waves did not vary with set size. We suggest that pupil size and N1 indicate different states of attentional effort corresponding to the number of presented items. In contrast, slow waves reflect processes related to encoding and maintenance strategies. The observation that these potentials vary with the type of feature (simple colour versus complex shape) indicates that perceptual complexity already influences encoding and storage, not only the comparison of targets with memory entries at the moment of testing. Copyright © 2015 Elsevier B.V. All rights reserved.
Variable Selection for Road Segmentation in Aerial Images
NASA Astrophysics Data System (ADS)
Warnke, S.; Bulatov, D.
2017-05-01
For the extraction of road pixels from combined image and elevation data, Wegner et al. (2015) proposed classifying superpixels into road and non-road, after which the classification results were refined using minimum cost paths and non-local optimization methods. We believed that the variable set used for classification was to a certain extent suboptimal, because many variables were redundant while several features known to be useful in photogrammetry and remote sensing were missing. This motivated us to implement a variable selection approach which builds a model for classification using portions of the training data and subsets of features, evaluates this model, updates the feature set, and terminates when a stopping criterion is satisfied. The choice of classifier is flexible; however, we tested the approach with Logistic Regression and Random Forests, and tailored the evaluation module to the chosen classifier. To guarantee a fair comparison, we kept the segment-based approach and most of the variables from the related work, but we extended them with additional, mostly higher-level features. Applying these superior features, removing the redundant ones, and using more accurately acquired 3D data allowed us to keep stable, or even reduce, the misclassification error on a challenging dataset.
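One simple instance of the build-evaluate-update loop described above is greedy forward selection with a cross-validated classifier, sketched below; the authors' actual selection and stopping rules may differ, and the toy data are ours.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def greedy_forward_selection(X, y, max_feats=10, tol=1e-3):
    # iteratively grow the feature set; stop when CV accuracy stops improving
    selected, best = [], 0.0
    while len(selected) < max_feats:
        scores = {}
        for j in range(X.shape[1]):
            if j in selected:
                continue
            cols = selected + [j]
            scores[j] = cross_val_score(LogisticRegression(max_iter=1000),
                                        X[:, cols], y, cv=3).mean()
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best + tol:      # stopping criterion
            break
        best = scores[j_best]
        selected.append(j_best)
    return selected, best

rng = np.random.default_rng(9)
X = rng.normal(size=(200, 12))                # stand-in superpixel features
y = (X[:, 2] - X[:, 7] > 0).astype(int)       # stand-in road/non-road labels
print(greedy_forward_selection(X, y))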
Gadd, C S; Baskaran, P; Lobach, D F
1998-01-01
Extensive utilization of point-of-care decision support systems will be largely dependent on the development of user interaction capabilities that make them effective clinical tools in patient care settings. This research identified critical design features of point-of-care decision support systems that are preferred by physicians, through a multi-method formative evaluation of an evolving prototype of an Internet-based clinical decision support system. Clinicians used four versions of the system--each highlighting a different functionality. Surveys and qualitative evaluation methodologies assessed clinicians' perceptions regarding system usability and usefulness. Our analyses identified features that improve perceived usability, such as telegraphic representations of guideline-related information, facile navigation, and a forgiving, flexible interface. Users also preferred features that enhance usefulness and motivate use, such as an encounter documentation tool and the availability of physician instruction and patient education materials. In addition to identifying design features that are relevant to efforts to develop clinical systems for point-of-care decision support, this study demonstrates the value of combining quantitative and qualitative methods of formative evaluation with an iterative system development strategy to implement new information technology in complex clinical settings.
Comparing sociocultural features of cholera in three endemic African settings.
Schaetti, Christian; Sundaram, Neisha; Merten, Sonja; Ali, Said M; Nyambedha, Erick O; Lapika, Bruno; Chaignat, Claire-Lise; Hutubessy, Raymond; Weiss, Mitchell G
2013-09-18
Cholera mainly affects developing countries where safe water supply and sanitation infrastructure are often rudimentary. Sub-Saharan Africa is a cholera hotspot. Effective cholera control requires not only a professional assessment, but also consideration of community-based priorities. The present work compares local sociocultural features of endemic cholera in urban and rural sites from three field studies in southeastern Democratic Republic of Congo (SE-DRC), western Kenya and Zanzibar. A vignette-based semistructured interview was used in 2008 in Zanzibar to study sociocultural features of cholera-related illness among 356 men and women from urban and rural communities. Similar cross-sectional surveys were performed in western Kenya (n = 379) and in SE-DRC (n = 360) in 2010. Systematic comparison across all settings considered the following domains: illness identification; perceived seriousness, potential fatality and past household episodes; illness-related experience; meaning; knowledge of prevention; help-seeking behavior; and perceived vulnerability. Cholera is well known in all three settings and is understood to have a significant impact on people's lives. Its social impact was mainly characterized by financial concerns. Problems with unsafe water, sanitation and dirty environments were the most common perceived causes across settings; nonetheless, non-biomedical explanations were widespread in rural areas of SE-DRC and Zanzibar. Safe food and water and vaccines were prioritized for prevention in SE-DRC. Safe water was prioritized in western Kenya along with sanitation and health education. The latter two were also prioritized in Zanzibar. Use of oral rehydration solutions and rehydration was a top priority everywhere; healthcare facilities were universally reported as a primary source of help. Respondents in SE-DRC and Zanzibar reported cholera as affecting almost everybody without differentiating much for gender, age and class. In contrast, in western Kenya, gender differentiation was pronounced, and children and the poor were regarded as most vulnerable to cholera. This comprehensive review identified common and distinctive features of local understandings of cholera. Classical treatment (that is, rehydration) was highlighted as a priority for control in the three African study settings and is likely to be identified in the region beyond. Findings indicate the value of insight from community studies to guide local program planning for cholera control and elimination.
High-order graph matching based feature selection for Alzheimer's disease identification.
Liu, Feng; Suk, Heung-Il; Wee, Chong-Yaw; Chen, Huafu; Shen, Dinggang
2013-01-01
One of the main limitations of l1-norm feature selection is that it focuses on estimating the target vector for each sample individually, without considering relations with other samples. However, it is believed that the geometrical relations among target vectors in the training set may provide useful information, and it would be natural to expect that the predicted vectors have geometric relations similar to those of the target vectors. To overcome these limitations, we formulate feature selection as a graph-matching problem between a predicted graph and a target graph. In the predicted graph a node is represented by a predicted vector that may describe regional gray matter volume or cortical thickness features, and in the target graph a node is represented by a target vector that includes the class label and clinical scores. In particular, we devise new regularization terms in sparse representation to impose high-order graph matching between the target vectors and the predicted ones. Finally, the selected regional gray matter volume and cortical thickness features are fused in kernel space for classification. Using the ADNI dataset, we evaluate the effectiveness of the proposed method and obtain accuracies of 92.17% and 81.57% in AD and MCI classification, respectively.
Analysis of wheezes using wavelet higher order spectral features.
Taplidou, Styliani A; Hadjileontiadis, Leontios J
2010-07-01
Wheezes are musical breath sounds, which usually imply an existing pulmonary obstruction, such as asthma and chronic obstructive pulmonary disease (COPD). Although many studies have addressed the problem of wheeze detection, a limited number of scientific works has focused on the analysis of wheeze characteristics, and in particular, their time-varying nonlinear characteristics. In this study, an effort is made to reveal and statistically analyze the nonlinear characteristics of wheezes and their evolution over time, as they are reflected in the quadratic phase coupling of their harmonics. To this end, the continuous wavelet transform (CWT) is used in combination with third-order spectra to define the analysis domain, where the nonlinear interactions of the harmonics of wheezes and their time variations are revealed by incorporating the instantaneous wavelet bispectrum and bicoherence, which provide the instantaneous biamplitude and biphase curves. Based on this pool of nonlinear information, a set of 23 features is proposed for the nonlinear analysis of wheezes. Two complementary perspectives, i.e., general and detailed, related to average performance and to localities, respectively, were used in the construction of the feature set, in order to embed both the trends and the local behaviors seen in the nonlinear interaction of the harmonic elements of wheezes over time. The proposed feature set was evaluated on a dataset of wheezes acquired from adult patients with diagnosed asthma and COPD from a lung sound database. The statistical evaluation of the feature set revealed discrimination ability between the two pathologies for all data subgroupings. In particular, when the total breathing cycle was examined, all 23 features but one showed statistically significant differences between the COPD and asthma pathologies, whereas for the subgroupings of inspiratory and expiratory phases, 18 of 23 and 22 of 23 features, respectively, exhibited discriminative power. This paves the way for the use of wavelet higher-order spectral features as an input vector to an efficient classifier. Apparently, this would integrate the intrinsic characteristics of wheezes within computerized diagnostic tools toward their more efficient evaluation.
Anomaly Detection Using an Ensemble of Feature Models
Noto, Keith; Brodley, Carla; Slonim, Donna
2011-01-01
We present a new approach to semi-supervised anomaly detection. Given a set of training examples believed to come from the same distribution or class, the task is to learn a model that will be able to distinguish examples in the future that do not belong to the same class. Traditional approaches typically compare the position of a new data point to the set of “normal” training data points in a chosen representation of the feature space. For some data sets, the normal data may not have discernible positions in feature space, but do have consistent relationships among some features that fail to appear in the anomalous examples. Our approach learns to predict the values of training set features from the values of other features. After we have formed an ensemble of predictors, we apply this ensemble to new data points. To combine the contribution of each predictor in our ensemble, we have developed a novel, information-theoretic anomaly measure that our experimental results show selects against noisy and irrelevant features. Our results on 47 data sets show that for most data sets, this approach significantly improves performance over current state-of-the-art feature space distance and density-based approaches. PMID:22020249
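The core idea, predicting each feature from the others on normal data and scoring new points by their prediction errors, can be sketched as follows; the plain normalized squared-error combination below replaces the paper's information-theoretic anomaly measure, and the toy data are ours.

import numpy as np
from sklearn.ensemble import RandomForestRegressor

class FeatureModelEnsemble:
    # learn to predict each feature from the remaining ones on normal data;
    # score new points by how badly those cross-feature predictions fail
    def fit(self, X):
        self.models_, self.scale_ = [], X.std(axis=0) + 1e-9
        for j in range(X.shape[1]):
            m = RandomForestRegressor(n_estimators=50, random_state=j)
            m.fit(np.delete(X, j, axis=1), X[:, j])
            self.models_.append(m)
        return self

    def score(self, X):
        err = np.empty_like(X, dtype=float)
        for j, m in enumerate(self.models_):
            err[:, j] = X[:, j] - m.predict(np.delete(X, j, axis=1))
        return ((err / self.scale_) ** 2).mean(axis=1)   # higher = more anomalous

rng = np.random.default_rng(4)
train = rng.normal(size=(300, 5))
train[:, 1] = 2 * train[:, 0]                            # a consistent feature relation
det = FeatureModelEnsemble().fit(train)
normal = np.array([[0.1, 0.2, 0.0, 0.0, 0.0]])
broken = np.array([[0.1, 3.0, 0.0, 0.0, 0.0]])           # relation violated
print(det.score(normal), det.score(broken))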
Adam, Asrul; Ibrahim, Zuwairie; Mokhtar, Norrima; Shapiai, Mohd Ibrahim; Mubin, Marizan; Saad, Ismail
2016-01-01
In existing research on electroencephalogram (EEG) signal peak classification, the existing models, such as the Dumpala, Acir, Liu, and Dingle peak models, employ different sets of features. However, none of these models offers good performance across all applications; performance is found to be problem dependent. Therefore, the objective of this study is to combine all the associated features from the existing models before selecting the best combination of features. A new optimization algorithm, namely the angle-modulated simulated Kalman filter (AMSKF), is employed as the feature selector. Also, the neural network with random weights method is utilized in the proposed AMSKF technique as a classifier. In the conducted experiment, 11,781 peak-candidate samples are employed for validation. The samples were collected from three different peak event-related EEG signals of 30 healthy subjects: (1) single eye blink, (2) double eye blink, and (3) eye movement signals. The experimental results show that the proposed AMSKF feature selector is able to find the best combination of features and performs on par with existing related studies of epileptic EEG event classification.
Fast detection of vascular plaque in optical coherence tomography images using a reduced feature set
NASA Astrophysics Data System (ADS)
Prakash, Ammu; Ocana Macias, Mariano; Hewko, Mark; Sowa, Michael; Sherif, Sherif
2018-03-01
Optical coherence tomography (OCT) images are capable of detecting vascular plaque using the full set of 26 Haralick textural features and a standard K-means clustering algorithm. However, using all 26 textural features is computationally expensive and may not be feasible for real-time implementation. In this work, we identified a reduced set of 3 textural features that characterizes vascular plaque and used a generalized fuzzy C-means clustering algorithm. Our work involves three steps: 1) reducing the full set of 26 textural features to a set of 3 using a genetic algorithm (GA) optimization method; 2) implementing an unsupervised generalized clustering algorithm (fuzzy C-means) on the reduced feature space; and 3) validating our results against histology and actual photographic images of vascular plaque. Our results show an excellent match with histology and actual photographic images of vascular tissue. Therefore, our results could provide an efficient pre-clinical tool for the detection of vascular plaque in real-time OCT imaging.
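A plain fuzzy C-means loop (not the paper's generalized variant) over a reduced 3-feature space can be written in a few lines of numpy, alternating membership and centroid updates; the synthetic feature vectors are illustrative.

import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, n_iter=100, seed=0):
    # standard fuzzy C-means: alternate soft-membership and centroid updates
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))            # rows sum to 1
    p = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        W = U ** m
        V = (W.T @ X) / W.sum(axis=0)[:, None]            # weighted centroids
        d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2) + 1e-12
        inv = d ** -p
        U = inv / inv.sum(axis=1, keepdims=True)          # u_ik proportional to d_ik^(-p)
    return U, V

rng = np.random.default_rng(5)
feats = np.vstack([rng.normal(0, 1, (40, 3)),
                   rng.normal(4, 1, (40, 3))])            # reduced 3-feature vectors
U, V = fuzzy_c_means(feats, c=2)
labels = U.argmax(axis=1)                                 # hard assignment per patch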
Cancer survival classification using integrated data sets and intermediate information.
Kim, Shinuk; Park, Taesung; Kon, Mark
2014-09-01
Although numerous studies related to cancer survival have been published, increasing the prediction accuracy of survival classes still remains a challenge. Integration of different data sets, such as microRNA (miRNA) and mRNA, might increase the accuracy of survival class prediction. Therefore, we suggest a machine learning (ML) approach to integrate different data sets, and we developed a novel method based on feature selection with the Cox proportional hazards regression model (FSCOX) to improve the prediction of cancer survival time. FSCOX provides intermediate survival information, which is usually discarded when separating survival into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers: first, the existing ML methods support vector machine (SVM) and random forest (RF); second, a new median-based classifier using FSCOX (FSCOX_median); and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The latter data set included features selected either from the combined miRNA/mRNA profile or independently from the miRNA and mRNA profiles (independent feature selection, IFS). In the ovarian data set, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM on a balanced data set of 22 short-term and 22 long-term survivors. These accuracies are higher than those using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly, in the glioblastoma multiforme data, the accuracy of miRNA/mRNA using IFS was 75.51% (RF), 87.76% (SVM), 85.71% (FSCOX_median), and 85.71% (FSCOX_SVM). These results are higher than those obtained using miRNA expression or mRNA expression alone. In addition, we predict 16 hsa-miR-23b and hsa-miR-27b target genes in the ovarian cancer data sets, obtained by SVM-based feature selection through integration of sequence information and gene expression profiles. Among the approaches used, the integrated miRNA and mRNA data set yielded better results than the individual data sets. The best performance was achieved using the FSCOX_SVM method with independent feature selection, which uses intermediate survival information between short-term and long-term survival times and the combination of the 2 different data sets. The results obtained using the combined data set suggest that there are some strong interactions between miRNA and mRNA features that are not detectable in the individual analyses. Copyright © 2014 Elsevier B.V. All rights reserved.
Brosch, Tom; Tang, Lisa Y W; Youngjin Yoo; Li, David K B; Traboulsee, Anthony; Tam, Roger
2016-05-01
We propose a novel segmentation approach based on deep 3D convolutional encoder networks with shortcut connections and apply it to the segmentation of multiple sclerosis (MS) lesions in magnetic resonance images. Our model is a neural network that consists of two interconnected pathways, a convolutional pathway, which learns increasingly more abstract and higher-level image features, and a deconvolutional pathway, which predicts the final segmentation at the voxel level. The joint training of the feature extraction and prediction pathways allows for the automatic learning of features at different scales that are optimized for accuracy for any given combination of image types and segmentation task. In addition, shortcut connections between the two pathways allow high- and low-level features to be integrated, which enables the segmentation of lesions across a wide range of sizes. We have evaluated our method on two publicly available data sets (MICCAI 2008 and ISBI 2015 challenges) with the results showing that our method performs comparably to the top-ranked state-of-the-art methods, even when only relatively small data sets are available for training. In addition, we have compared our method with five freely available and widely used MS lesion segmentation methods (EMS, LST-LPA, LST-LGA, Lesion-TOADS, and SLS) on a large data set from an MS clinical trial. The results show that our method consistently outperforms these other methods across a wide range of lesion sizes.
From tiger to panda: animal head detection.
Zhang, Weiwei; Sun, Jian; Tang, Xiaoou
2011-06-01
Robust object detection has many important applications in real-world online photo processing. For example, both Google image search and MSN Live image search have integrated a human face detector to retrieve face or portrait photos. Inspired by the success of this face filtering approach, in this paper we focus on another popular online photo category, animals, which is one of the top five categories in the MSN Live image search query log. As a first attempt, we focus on the problem of animal head detection for a set of relatively large land animals that are popular on the internet, such as cat, tiger, panda, fox, and cheetah. First, we propose a new set of gradient-oriented features, Haar of Oriented Gradients (HOOG), to effectively capture the shape and texture features of animal heads. Then, we propose two detection algorithms, namely Bruteforce detection and Deformable detection, to effectively exploit the shape and texture features simultaneously. Experimental results on 14,379 well-labeled animal images validate the superiority of the proposed approach. Additionally, we apply the animal head detector to improve image search results through text-based online photo search result filtering.
10 CFR 60.32 - Conditions of construction authorization.
Code of Federal Regulations, 2010 CFR
2010-01-01
... GEOLOGIC REPOSITORIES Licenses Construction Authorization § 60.32 Conditions of construction authorization... changes to the features of the geologic repository and the procedures authorized. The restrictions that... setting as well as measures related to the design and construction of the geologic repository operations...
Deep Multimodal Distance Metric Learning Using Click Constraints for Image Ranking.
Yu, Jun; Yang, Xiaokang; Gao, Fei; Tao, Dacheng
2017-12-01
How do we retrieve images accurately? And how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers building a novel image search engine. First, it is important to obtain an appropriate description that effectively represents the images. In this paper, multimodal features are considered for describing images. The images' unique properties are reflected by visual features, which are correlated to each other. However, semantic gaps always exist between images' visual features and their semantics. Therefore, we utilize click features to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain an initial distance metric in the different visual spaces, and an MDML method is used to assign optimal weights to the different modalities. Finally, we conduct alternating optimization to train the ranking model, which is used to rank new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model that uses multimodal features, including click features and visual features, in DML. We conducted experiments analyzing the proposed Deep-MDML on two benchmark data sets, and the results validate the effectiveness of the method.
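One ingredient of the pipeline, an autoencoder per visual modality whose bottleneck supplies an initial metric space, can be sketched in PyTorch as follows; the dimensions are illustrative, and the click-driven weighting of modality distances is not shown.

import torch
import torch.nn as nn

class AE(nn.Module):
    def __init__(self, dim_in, dim_z=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(dim_in, 128), nn.ReLU(),
                                 nn.Linear(128, dim_z))
        self.dec = nn.Sequential(nn.Linear(dim_z, 128), nn.ReLU(),
                                 nn.Linear(128, dim_in))

    def forward(self, x):
        z = self.enc(x)
        return self.dec(z), z

def modality_distance(ae, a, b):
    # Distance between two images in one modality's learned bottleneck space.
    _, za = ae(a)
    _, zb = ae(b)
    return torch.norm(za - zb, dim=-1)

# One AE per visual modality; the overall metric would be a weighted sum of
# per-modality distances, with weights learned from click data (not shown).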
Multiclass Continuous Correspondence Learning
NASA Technical Reports Server (NTRS)
Bue, Brian D,; Thompson, David R.
2011-01-01
We extend the Structural Correspondence Learning (SCL) domain adaptation algorithm of Blitzer et al. to the realm of continuous signals. Given a set of labeled examples belonging to a 'source' domain, we select a set of unlabeled examples in a related 'target' domain that play similar roles in both domains. Using these 'pivot' samples, we map both domains into a common feature space, allowing us to adapt a classifier trained on source examples to classify target examples. We show that when between-class distances are relatively preserved across domains, we can automatically select target pivots to bring the domains into correspondence.
van der Kloet, Frans M; Hendriks, Margriet; Hankemeier, Thomas; Reijmers, Theo
2013-11-01
Because of its high sensitivity and specificity, hyphenated mass spectrometry has become the predominant method to detect and quantify metabolites present in bio-samples relevant to all sorts of life science studies. In contrast to targeted methods that are dedicated to specific features, global profiling acquisition methods allow new, unspecified metabolites to be analyzed. The challenge with these so-called untargeted methods is the proper and automated extraction and integration of features that could be of relevance. We propose a new algorithm that enables untargeted integration of samples measured with high-resolution liquid chromatography-mass spectrometry (LC-MS). In contrast to other approaches, limited user interaction is needed, allowing less experienced users to integrate their data as well. The large number of single features found within a sample is combined into a smaller list of compound-related, grouped feature-sets representative of that sample. These feature-sets allow for easier interpretation and identification and, as important, easier matching across samples. We show that the automatically obtained integration results for a set of known target metabolites match those generated with vendor software, but that at least 10 times more feature-sets are extracted as well. We demonstrate our approach using high-resolution LC-MS data acquired for 128 samples on a lipidomics platform. The data were also processed in a targeted manner (with a combination of automatic and manual integration) using vendor software for a set of 174 targets. As our untargeted extraction procedure is run per sample and per mass trace, its implementation is scalable. Because of the generic approach, we envision that this data extraction method will be used in targeted as well as untargeted analyses of many different kinds of TOF-MS data, and even CE-MS, GC-MS, or MRM data. The Matlab package is available for download on request, and efforts are directed toward a user-friendly Windows executable. Copyright © 2013 Elsevier B.V. All rights reserved.
Obermeier, S.F.
1996-01-01
Liquefaction features can be used in many field settings to estimate the recurrence interval and magnitude of strong earthquakes through much of the Holocene. These features include dikes, craters, vented sand, sills, and laterally spreading landslides. The relatively high seismic shaking level required for their formation makes them particularly valuable as records of strong paleo-earthquakes. This state-of-the-art summary for using liquefaction-induced features for paleoseismic interpretation and analysis takes into account both geological and geotechnical engineering perspectives. The driving mechanism for formation of the features is primarily the increased pore-water pressure associated with liquefaction of sand-rich sediment. The role of this mechanism is often supplemented greatly by the direct action of seismic shaking at the ground surface, which strains and breaks the clay-rich cap that lies immediately above the sediment that liquefied. Discussed in the text are the processes involved in formation of the features, as well as their morphology and characteristics in field settings. Whether liquefaction occurs is controlled mainly by sediment grain size, sediment packing, depth to the water table, and strength and duration of seismic shaking. Formation of recognizable features in the field generally requires a low-permeability cap above the sediment that liquefied. Field manifestations are controlled largely by the severity of liquefaction and the thickness and properties of the low-permeability cap. Criteria are presented for determining whether observed sediment deformation in the field originated by seismically induced liquefaction. These criteria have been developed mainly by observing historic effects of liquefaction in varied field settings. The most important criterion is that a seismic liquefaction origin requires widespread, regional development of features around a core area where the effects are most severe. In addition, the features must have a morphology that is consistent with a very sudden application of a large hydraulic force. This article discusses case studies in widely separated and different geological settings: coastal South Carolina, the New Madrid seismic zone, the Wabash Valley seismic zone, and coastal Washington State. These studies encompass most of the range of settings and the types of liquefaction-induced features likely to be encountered anywhere. The case studies describe the observed features and the logic for assigning a seismic liquefaction origin to them. Also discussed are some types of sediment deformations that can be misinterpreted as having a seismic origin. Two independent methods for estimating prehistoric magnitude are discussed briefly. One method is based on determination of the maximum distance from the epicenter over which liquefaction-induced effects have formed. The other method is based on use of geotechnical engineering techniques at sites of marginal liquefaction, in order to bracket the peak accelerations as a function of epicentral distance; these accelerations can then be compared with predictions from seismological models.
Wang, Kun-Ching
2015-01-14
The classification of emotional speech is widely considered in speech-related research on human-computer interaction (HCI). The purpose of this paper is to present a novel feature extraction method based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for the characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the texture properties of the emotional speech spectrogram at multiple resolutions should form a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can discriminate between emotions more clearly than uniform-resolution texture analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied within the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features can also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification performance for real-life emotion recognition in speech.
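The underlying idea, treating the spectrogram as an image and extracting texture statistics, can be illustrated with gray-level co-occurrence features; this sketch uses a single resolution and standard scipy/scikit-image calls, so it simplifies the paper's multi-resolution analysis.

import numpy as np
from scipy.signal import spectrogram
from skimage.feature import graycomatrix, graycoprops

def spectrogram_texture(x, fs=16000, levels=32):
    # Log-power spectrogram quantized to `levels` gray levels, then GLCM
    # texture statistics along two directions.
    _, _, sxx = spectrogram(x, fs=fs)
    img = np.log1p(sxx)
    img = (levels - 1) * (img - img.min()) / (img.max() - img.min() + 1e-12)
    glcm = graycomatrix(img.astype(np.uint8), distances=[1],
                        angles=[0, np.pi / 2], levels=levels)
    return np.hstack([graycoprops(glcm, p).ravel()
                      for p in ("contrast", "homogeneity", "energy")])

# feats = spectrogram_texture(np.random.randn(16000))  # 6 texture values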
Genetic programming approach to evaluate complexity of texture images
NASA Astrophysics Data System (ADS)
Ciocca, Gianluigi; Corchs, Silvia; Gasparini, Francesca
2016-11-01
We adopt genetic programming (GP) to define a measure that can predict complexity perception of texture images. We perform psychophysical experiments on three different datasets to collect data on the perceived complexity. The subjective data are used for training, validation, and testing of the proposed measure. These data are also used to evaluate several candidate measures of texture complexity related to both low-level and high-level image features. We select four of them (namely roughness, number of regions, chroma variance, and memorability) to be combined in a GP framework. This approach allows a nonlinear combination of the measures and can give hints on how the related image features interact in complexity perception. The proposed complexity measure M exhibits Pearson correlation coefficients of 0.890 on the training set, 0.728 on the validation set, and 0.724 on the test set. M outperforms each of the single measures considered. From the statistical analysis of different GP candidate solutions, we found that the roughness measure evaluated on the gray-level image is the most dominant, followed by memorability, the number of regions, and finally the chroma variance.
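A GP-based nonlinear combination of the four measures can be sketched with gplearn's symbolic regressor as a stand-in for the paper's GP framework; the synthetic data and the hyperparameters are assumptions.

import numpy as np
from gplearn.genetic import SymbolicRegressor

# X columns: roughness, number of regions, chroma variance, memorability
# y: mean perceived complexity per image (toy stand-ins here)
X = np.random.rand(100, 4)
y = 0.6 * X[:, 0] + 0.2 * X[:, 3] + 0.1 * X[:, 1] * X[:, 2]

gp = SymbolicRegressor(population_size=500, generations=20,
                       function_set=("add", "sub", "mul", "div"),
                       random_state=0)
gp.fit(X, y)
print(gp._program)  # the evolved nonlinear combination of the four measures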
NASA Astrophysics Data System (ADS)
Villar, Ricardo G.; Pelayo, Jigg L.; Mozo, Ray Mari N.; Salig, James B., Jr.; Bantugan, Jojemar
2016-06-01
Leaning on results derived by the Central Mindanao University Phil-LiDAR 2.B.11 Image Processing Component, this paper presents an application of Light Detection and Ranging (LiDAR) derived products to arriving at a quality land-cover classification, taking a theoretical approach to data analysis in order to minimize the common problems in image classification: misclassification of objects and the non-distinguishable interpretation of pixelated features, which lead to confusion of class objects due to their closely related spectral resemblance; unbalanced saturation of RGB information is a challenge as well. Only low-density LiDAR point cloud data (2 pts/m2) are exploited in this research, yielding essential derived information such as textures and matrices (number of returns, intensity textures, nDSM, etc.) used in establishing the conditions for characteristic selection. A novel approach takes advantage of object-based image analysis and the principle of allometric relations between two or more observables, which are aggregated for each acquired dataset to establish a proportionality function for data partitioning. To separate two or more data sets into distinct regions of a feature space of distributions, non-trivial distribution-fitting computations were employed to formulate the ideal hyperplane. With the distribution computations achieved, allometric relations were evaluated and matched with the necessary rotation, scaling, and transformation techniques to find applicable border conditions. A customized hybrid feature was thus developed and embedded in every object class feature to be used as a classifier, with a hierarchical clustering strategy employed for cross-examining and filtering features. These features are boosted using machine learning algorithms as trainable sets of information for more competent feature detection. The classification produced in this investigation was compared to a classification based on the conventional object-oriented approach using the straightforward functionality of the software eCognition. A compelling rise in efficiency in overall accuracy (74.4% to 93.4%) and kappa index of agreement (70.5% to 91.7%) is noticeable relative to the initial process. Hence, even a low-density LiDAR dataset can generate an exponential increase in classification performance and accuracy.
Detailed Mapping of the Alu Volcano, Ethiopia
NASA Astrophysics Data System (ADS)
Agrain, Guillaume; Buso, Roxane; Carlier, Jean; van Wyk de Vries, Benjamin
2017-04-01
The Alu volcano in the Danakil Depression is interpreted as a forced-fold uplift caused by progressive intrusions of sills or similar tabular intrusions. Alu is in a very isolated and difficult-to-access area, but Google Earth provides high-resolution images that can be used for mapping its structure and volcanic features. We use the imagery to map in as much detail as possible all the morphological features of Alu, which we separate into primary volcanic features and secondary structural features. The mapping has been undertaken by a group of undergraduates, graduates, and researchers. The group has checked and validated the interpretation of each feature mapped. The data set is available as a kmz and has been imported into QGIS. The detailed mapping reveals a complex history of multiple lava fields and fissure eruptions, some of which pre-date uplift, while others occurred during uplift but were subsequently deformed. Similarly, there are cross-cutting structures, and we are able to set up a chronology of events. This shows that uplift grew in an area that was already covered by lavas, that some lava has probably erupted from Alu's flanks, and that most eruptions have been from around the base of Alu. Early in the deformation, thrust faults developed on the lower flanks, similar to those described near the Grosmanaux uplift (van Wyk de Vries et al. 2014). These are cut by the larger faults and by minor fissures. The mapping provides an accessible way of preparing for an eventual field expedition to Alu, while extracting the most from remote sensing data.
Automated method for the systematic interpretation of resonance peaks in spectrum data
Damiano, B.; Wood, R.T.
1997-04-22
A method is described for spectral signature interpretation. The method includes the creation of a mathematical model of a system or process. A neural network training set is then developed based upon the mathematical model, by using the model to generate measurable phenomena of the system or process from model input parameters that correspond to the physical condition of the system or process. The neural network training set is then used to adjust the internal parameters of a neural network. The physical condition of an actual system or process represented by the mathematical model is then monitored by extracting spectral features from measured spectra of the actual process or system. The spectral features are then input into said neural network to determine the physical condition of the system or process represented by the mathematical model. More specifically, the neural network correlates the spectral features (i.e., measurable phenomena) of the actual process or system with the corresponding model input parameters. The model input parameters relate to specific components of the system or process and, consequently, correspond to its physical condition. 1 fig.
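The training-set construction described above, a forward model mapping physical parameters to measurable spectral features that a neural network then inverts, can be sketched as follows; the forward model here is a toy stand-in, not the patent's model.

import numpy as np
from sklearn.neural_network import MLPRegressor

def forward_model(params):
    # Toy stand-in: physical parameters -> resonance-peak features.
    freq, damping = params.T
    return np.column_stack([freq, damping, freq * damping])

params = np.random.uniform(0.1, 1.0, size=(2000, 2))   # model input parameters
features = forward_model(params)                        # measurable phenomena

net = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
net.fit(features, params)  # learn features -> physical condition

# measured = extract_spectral_features(measured_spectrum)  # hypothetical step
# condition = net.predict(measured.reshape(1, -1))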
Automated method for the systematic interpretation of resonance peaks in spectrum data
Damiano, Brian; Wood, Richard T.
1997-01-01
A method for spectral signature interpretation. The method includes the creation of a mathematical model of a system or process. A neural network training set is then developed based upon the mathematical model, by using the model to generate measurable phenomena of the system or process from model input parameters that correspond to the physical condition of the system or process. The neural network training set is then used to adjust the internal parameters of a neural network. The physical condition of an actual system or process represented by the mathematical model is then monitored by extracting spectral features from measured spectra of the actual process or system. The spectral features are then input into said neural network to determine the physical condition of the system or process represented by the mathematical model. More specifically, the neural network correlates the spectral features (i.e., measurable phenomena) of the actual process or system with the corresponding model input parameters. The model input parameters relate to specific components of the system or process and, consequently, correspond to its physical condition.
Nagarajan, Mahesh B; Coan, Paola; Huber, Markus B; Diemoz, Paul C; Wismüller, Axel
2015-01-01
Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns.
Stephens, David; Diesing, Markus
2014-01-01
Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now clear that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets, were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) was tested to assess the benefits of using more sophisticated approaches. The best performing models were tree-based methods and Naive Bayes, which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. The models that used all input features did not generally perform well, highlighting the need for some means of feature selection.
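The comparison protocol can be illustrated with scikit-learn: several classifiers trained on different input feature sets and scored by accuracy and Cohen's kappa on a held-out test set. The feature names and the synthetic data are assumptions; only two of the six classifiers are shown.

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)
cols = ["bathymetry", "backscatter", "slope", "rugosity"]
X = pd.DataFrame(rng.normal(size=(258, 4)), columns=cols)
y = (X["bathymetry"] + 0.5 * X["backscatter"] > 0).astype(int)  # toy labels
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, subset in {"primary": cols[:2], "all": cols}.items():
    for clf in (RandomForestClassifier(random_state=0), GaussianNB()):
        pred = clf.fit(X_tr[subset], y_tr).predict(X_te[subset])
        print(name, type(clf).__name__,
              round(accuracy_score(y_te, pred), 2),
              round(cohen_kappa_score(y_te, pred), 2))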
Feature-Based Morphometry: Discovering Group-related Anatomical Patterns
Toews, Matthew; Wells, William; Collins, D. Louis; Arbel, Tal
2015-01-01
This paper presents feature-based morphometry (FBM), a new, fully data-driven technique for discovering patterns of group-related anatomical structure in volumetric imagery. In contrast to most morphometry methods which assume one-to-one correspondence between subjects, FBM explicitly aims to identify distinctive anatomical patterns that may only be present in subsets of subjects, due to disease or anatomical variability. The image is modeled as a collage of generic, localized image features that need not be present in all subjects. Scale-space theory is applied to analyze image features at the characteristic scale of underlying anatomical structures, instead of at arbitrary scales such as global or voxel-level. A probabilistic model describes features in terms of their appearance, geometry, and relationship to subject groups, and is automatically learned from a set of subject images and group labels. Features resulting from learning correspond to group-related anatomical structures that can potentially be used as image biomarkers of disease or as a basis for computer-aided diagnosis. The relationship between features and groups is quantified by the likelihood of feature occurrence within a specific group vs. the rest of the population, and feature significance is quantified in terms of the false discovery rate. Experiments validate FBM clinically in the analysis of normal (NC) and Alzheimer's (AD) brain images using the freely available OASIS database. FBM automatically identifies known structural differences between NC and AD subjects in a fully data-driven fashion, and an equal error classification rate of 0.80 is achieved for subjects aged 60-80 years exhibiting mild AD (CDR=1). PMID:19853047
The perception of naturalness correlates with low-level visual features of environmental scenes.
Berman, Marc G; Hout, Michael C; Kardan, Omid; Hunter, MaryCarol R; Yourganov, Grigori; Henderson, John M; Hanayik, Taylor; Karimi, Hossein; Jonides, John
2014-01-01
Previous research has shown that interacting with natural environments vs. more urban or built environments can have salubrious psychological effects, such as improvements in attention and memory. Even viewing pictures of nature vs. pictures of built environments can produce similar effects. A major question is: What is it about natural environments that produces these benefits? Problematically, there are many differing qualities between natural and urban environments, making it difficult to narrow down the dimensions of nature that may lead to these benefits. In this study, we set out to uncover visual features that relate to individuals' perceptions of naturalness in images. We quantified naturalness in two ways: first, implicitly using a multidimensional scaling analysis and second, explicitly with direct naturalness ratings. The features most related to perceptions of naturalness were the density of contrast changes in the scene, the density of straight lines, the average color saturation, and the average hue diversity. We then trained a machine-learning algorithm to predict whether a scene was perceived as natural or not based on these low-level visual features, and we could do so with 81% accuracy. As such, we were able to reliably predict subjective perceptions of naturalness with objective low-level visual features. Our results can be used in future studies to determine if these features, which are related to naturalness, may also lead to the benefits attained from interacting with nature.
Karst database development in Minnesota: Design and data assembly
Gao, Y.; Alexander, E.C.; Tipping, R.G.
2005-01-01
The Karst Feature Database (KFD) of Minnesota is a relational GIS-based Database Management System (DBMS). Previous karst feature datasets used inconsistent attributes to describe karst features in different areas of Minnesota. Existing metadata were modified and standardized to represent comprehensive metadata for all the karst features in Minnesota. Microsoft Access 2000 and ArcView 3.2 were used to develop this working database. Existing county and sub-county karst feature datasets have been assembled into the KFD, which is capable of visualizing and analyzing the entire data set. By November 17, 2002, 11,682 karst features were stored in the KFD of Minnesota. Data tables are stored in a Microsoft Access 2000 DBMS and linked to corresponding ArcView applications. The current KFD of Minnesota has been moved from a Windows NT server to a Windows 2000 Citrix server accessible to researchers and planners through networked interfaces. © Springer-Verlag 2005.
Scaling laws for coastal overwash morphology
NASA Astrophysics Data System (ADS)
Lazarus, Eli D.
2016-12-01
Overwash is a physical process of coastal sediment transport driven by storm events and is essential to landscape resilience in low-lying barrier environments. This work establishes a comprehensive set of scaling laws for overwash morphology: unifying quantitative descriptions with which to compare overwash features by their morphological attributes across case examples. Such scaling laws also help relate overwash features to other morphodynamic phenomena. Here morphometric data from a physical experiment are compared with data from natural examples of overwash features. The resulting scaling relationships indicate scale invariance spanning several orders of magnitude. Furthermore, these new relationships for overwash morphology align with classic scaling laws for fluvial drainages and alluvial fans.
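Scaling laws of this kind are typically established by fitting a power law y = a*x^b, i.e., a straight line in log-log space. A minimal sketch with synthetic stand-in data for two overwash morphometrics (the variable names and exponent are illustrative):

import numpy as np

# synthetic stand-in: washover length vs. width following length ~ a * width^b
width = np.logspace(0, 4, 50)
length = 2.5 * width**0.8 * np.exp(np.random.normal(0, 0.1, 50))

# least-squares fit of a straight line in log-log space
b, log_a = np.polyfit(np.log(width), np.log(length), 1)
print(f"length ~ {np.exp(log_a):.2f} * width^{b:.2f}")  # exponent near 0.8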
Machine learning approaches to diagnosis and laterality effects in semantic dementia discourse.
Garrard, Peter; Rentoumi, Vassiliki; Gesierich, Benno; Miller, Bruce; Gorno-Tempini, Maria Luisa
2014-06-01
Advances in automatic text classification have been necessitated by the rapid increase in the availability of digital documents. Machine learning (ML) algorithms can 'learn' from data: for instance a ML system can be trained on a set of features derived from written texts belonging to known categories, and learn to distinguish between them. Such a trained system can then be used to classify unseen texts. In this paper, we explore the potential of the technique to classify transcribed speech samples along clinical dimensions, using vocabulary data alone. We report the accuracy with which two related ML algorithms [naive Bayes Gaussian (NBG) and naive Bayes multinomial (NBM)] categorized picture descriptions produced by: 32 semantic dementia (SD) patients versus 10 healthy, age-matched controls; and SD patients with left- (n = 21) versus right-predominant (n = 11) patterns of temporal lobe atrophy. We used information gain (IG) to identify the vocabulary features that were most informative to each of these two distinctions. In the SD versus control classification task, both algorithms achieved accuracies of greater than 90%. In the right- versus left-temporal lobe predominant classification, NBM achieved a high level of accuracy (88%), but this was achieved by both NBM and NBG when the features used in the training set were restricted to those with high values of IG. The most informative features for the patient versus control task were low frequency content words, generic terms and components of metanarrative statements. For the right versus left task the number of informative lexical features was too small to support any specific inferences. An enriched feature set, including values derived from Quantitative Production Analysis (QPA) may shed further light on this little understood distinction. Copyright © 2013 Elsevier Ltd. All rights reserved.
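The classification protocol, information-gain feature ranking followed by Gaussian and multinomial naive Bayes, can be sketched with scikit-learn, where mutual information plays the role of information gain; the synthetic word-count data are placeholders for the picture-description vocabularies.

import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.naive_bayes import GaussianNB, MultinomialNB
from sklearn.model_selection import cross_val_score

X = np.random.poisson(1.0, size=(42, 300)).astype(float)  # word counts (toy)
y = np.random.randint(0, 2, size=42)                      # SD vs control (toy)

# keep only the vocabulary features with the highest information gain
X_ig = SelectKBest(mutual_info_classif, k=20).fit_transform(X, y)
for clf in (GaussianNB(), MultinomialNB()):
    print(type(clf).__name__, cross_val_score(clf, X_ig, y, cv=5).mean())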
Kupas, Katrin; Ultsch, Alfred; Klebe, Gerhard
2008-05-15
A new method to discover similar substructures in protein binding pockets, independently of sequence and folding patterns or secondary structure elements, is introduced. The solvent-accessible surface of a binding pocket, automatically detected as a depression on the protein surface, is divided into a set of surface patches. Each surface patch is characterized by its shape as well as by its physicochemical characteristics. Wavelets defined on surfaces are used for the description of the shape, as they have the great advantage of allowing a comparison at different resolutions. The number of coefficients used to describe the wavelets can be chosen with respect to the size of the considered data set. The physicochemical characteristics of the patches are described by the assignment of the exposed amino acid residues to one or more of five different properties determinant for molecular recognition. A self-organizing neural network is used to project the high-dimensional feature vectors onto a two-dimensional layer of neurons, called a map. To find similarities between the binding pockets, in both geometrical and physicochemical features, a clustering of the projected feature vectors is performed using an automatic distance- and density-based clustering algorithm. The method was validated with a small training data set of 109 binding cavities originating from a set of enzymes covering 12 different EC numbers. A second test data set of 1378 binding cavities, extracted from enzymes of 13 different EC numbers, was then used to prove the discriminating power of the algorithm and to demonstrate its applicability to large-scale analyses. In all cases, members of the data set with the same EC number were placed into coherent regions on the map, with small distances between them. Different EC numbers are separated by large distances between the feature vectors. A third data set comprising three subfamilies of endopeptidases is used to demonstrate the ability of the algorithm to detect similar substructures between functionally related active sites. The algorithm can also be used to predict the function of novel proteins not considered in the training data set. © 2007 Wiley-Liss, Inc.
Reproducibility of radiomics for deciphering tumor phenotype with imaging
NASA Astrophysics Data System (ADS)
Zhao, Binsheng; Tan, Yongqiang; Tsai, Wei-Yann; Qi, Jing; Xie, Chuanmiao; Lu, Lin; Schwartz, Lawrence H.
2016-03-01
Radiomics (radiogenomics) characterizes tumor phenotypes based on quantitative image features derived from routine radiologic imaging to improve cancer diagnosis, prognosis, prediction and response to therapy. Although radiomic features must be reproducible to qualify as biomarkers for clinical care, little is known about how routine imaging acquisition techniques/parameters affect reproducibility. To begin to fill this knowledge gap, we assessed the reproducibility of a comprehensive, commonly-used set of radiomic features using a unique, same-day repeat computed tomography data set from lung cancer patients. Each scan was reconstructed at 6 imaging settings, varying slice thicknesses (1.25 mm, 2.5 mm and 5 mm) and reconstruction algorithms (sharp, smooth). Reproducibility was assessed using the repeat scans reconstructed at identical imaging setting (6 settings in total). In separate analyses, we explored differences in radiomic features due to different imaging parameters by assessing the agreement of these radiomic features extracted from the repeat scans reconstructed at the same slice thickness but different algorithms (3 settings in total). Our data suggest that radiomic features are reproducible over a wide range of imaging settings. However, smooth and sharp reconstruction algorithms should not be used interchangeably. These findings will raise awareness of the importance of properly setting imaging acquisition parameters in radiomics/radiogenomics research.
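Test-retest agreement in radiomics studies of this kind is often summarized with Lin's concordance correlation coefficient (CCC); whether this paper used exactly that metric is not stated here, so treat the following as a generic sketch of the agreement computation.

import numpy as np

def concordance_ccc(x, y):
    # Lin's CCC between repeat measurements of one radiomic feature.
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# scan1, scan2: the same feature from same-day repeat scans (assumed arrays)
# reproducible = concordance_ccc(scan1, scan2) > 0.90  # threshold is a choice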
A Modeling Approach for Burn Scar Assessment Using Natural Features and Elastic Property
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tsap, L V; Zhang, Y; Goldgof, D B
2004-04-02
A modeling approach is presented for quantitative burn scar assessment. Emphases are given to: (1) constructing a finite element model from natural image features with an adaptive mesh, and (2) quantifying the Young's modulus of scars using the finite element model and the regularization method. A set of natural point features is extracted from the images of burn patients. A Delaunay triangle mesh is then generated that adapts to the point features. A 3D finite element model is built on top of the mesh with the aid of range images providing the depth information. The Young's modulus of scars is quantified with a simplified regularization functional, assuming that knowledge of the scar's geometry is available. The consistency between the Relative Elasticity Index and the physician's rating based on the Vancouver Scale (a relative scale used to rate burn scars) indicates that the proposed modeling approach has high potential for image-based quantitative burn scar assessment.
'Dem DEMs: Comparing Methods of Digital Elevation Model Creation
NASA Astrophysics Data System (ADS)
Rezza, C.; Phillips, C. B.; Cable, M. L.
2017-12-01
Topographic details of Europa's surface yield implications for large-scale processes that occur on the moon, including surface strength, modification, composition, and formation mechanisms for geologic features. In addition, small scale details presented from this data are imperative for future exploration of Europa's surface, such as by a potential Europa Lander mission. A comparison of different methods of Digital Elevation Model (DEM) creation and variations between them can help us quantify the relative accuracy of each model and improve our understanding of Europa's surface. In this work, we used data provided by Phillips et al. (2013, AGU Fall meeting, abs. P34A-1846) and Schenk and Nimmo (2017, in prep.) to compare DEMs that were created using Ames Stereo Pipeline (ASP), SOCET SET, and Paul Schenk's own method. We began by locating areas of the surface with multiple overlapping DEMs, and our initial comparisons were performed near the craters Manannan, Pwyll, and Cilix. For each region, we used ArcGIS to draw profile lines across matching features to determine elevation. Some of the DEMs had vertical or skewed offsets, and thus had to be corrected. The vertical corrections were applied by adding or subtracting the global minimum of the data set to create a common zero-point. The skewed data sets were corrected by rotating the plot so that it had a global slope of zero and then subtracting for a zero-point vertical offset. Once corrections were made, we plotted the three methods on one graph for each profile of each region. Upon analysis, we found relatively good feature correlation between the three methods. The smoothness of a DEM depends on both the input set of images and the stereo processing methods used. In our comparison, the DEMs produced by SOCET SET were less smoothed than those from ASP or Schenk. Height comparisons show that ASP and Schenk's model appear similar, alternating in maximum height. SOCET SET has more topographic variability due to its decreased smoothing, which is borne out by preliminary offset calculations. In the future, we plan to expand upon this preliminary work with more regions of Europa, continue quantifying the height differences and relative accuracy of each method, and generate more DEMs to expand our available comparison regions.
Toews, Matthew; Wells, William M.; Collins, Louis; Arbel, Tal
2013-01-01
This paper presents feature-based morphometry (FBM), a new, fully data-driven technique for identifying group-related differences in volumetric imagery. In contrast to most morphometry methods which assume one-to-one correspondence between all subjects, FBM models images as a collage of distinct, localized image features which may not be present in all subjects. FBM thus explicitly accounts for the case where the same anatomical tissue cannot be reliably identified in all subjects due to disease or anatomical variability. A probabilistic model describes features in terms of their appearance, geometry, and relationship to sub-groups of a population, and is automatically learned from a set of subject images and group labels. Features identified indicate group-related anatomical structure that can potentially be used as disease biomarkers or as a basis for computer-aided diagnosis. Scale-invariant image features are used, which reflect generic, salient patterns in the image. Experiments validate FBM clinically in the analysis of normal (NC) and Alzheimer’s (AD) brain images using the freely available OASIS database. FBM automatically identifies known structural differences between NC and AD subjects in a fully data-driven fashion, and obtains an equal error classification rate of 0.78 on new subjects. PMID:20426102
Recognising discourse causality triggers in the biomedical domain.
Mihăilă, Claudiu; Ananiadou, Sophia
2013-12-01
Current domain-specific information extraction systems represent an important resource for biomedical researchers, who need to process vast amounts of knowledge in a short time. Automatic discourse causality recognition can further reduce their workload by suggesting possible causal connections and aiding in the curation of pathway models. We describe here an approach to the automatic identification of discourse causality triggers in the biomedical domain using machine learning. We create several baselines and experiment with and compare various parameter settings for three algorithms, i.e. Conditional Random Fields (CRF), Support Vector Machines (SVM) and Random Forests (RF). We also evaluate the impact of lexical, syntactic, and semantic features on each of the algorithms, showing that semantics improves the performance in all cases. We test our comprehensive feature set on two corpora containing gold standard annotations of causal relations, and demonstrate the need for more gold standard data. The best performance of 79.35% F-score is achieved by CRFs when using all three feature types.
The prediction of airborne and structure-borne noise potential for a tire
NASA Astrophysics Data System (ADS)
Sakamoto, Nicholas Y.
Tire/pavement interaction noise is a major component of both exterior pass-by noise and vehicle interior noise. The current testing methods for ranking tires from loud to quiet require expensive equipment, multiple tires, and/or long experimental set-up and run times. If a laboratory-based, off-vehicle test could be used to identify the airborne and structure-borne noise potential of a tire from its dynamic characteristics, a relative ranking of a large group of tires could be performed at relatively modest expense. This would provide a smaller sample set of tires for follow-up testing and thus save expense for automobile OEMs. The focus of this research was identifying key noise features from a tire/pavement experiment. These results were compared against a stationary tire test in which the natural response of the tire to a forced input was measured. Since speed was identified as having some effect on the noise, an input function was also developed to allow the tires to be ranked at an appropriate speed. A relative noise model was used on a second sample set of tires to verify whether the ranking could be used against interior vehicle measurements. While overall level analysis of the specified spectrum had mixed success, important noise-generating features were identified, and the methods used could be improved to develop a standard off-vehicle test to predict a tire's noise potential.
Feature Selection for Chemical Sensor Arrays Using Mutual Information
Wang, X. Rosalind; Lizier, Joseph T.; Nowotny, Thomas; Berna, Amalia Z.; Prokopenko, Mikhail; Trowell, Stephen C.
2014-01-01
We address the problem of feature selection for classifying a diverse set of chemicals using an array of metal oxide sensors. Our aim is to evaluate a filter approach to feature selection with reference to previous work, which used a wrapper approach on the same data set, and established best features and upper bounds on classification performance. We selected feature sets that exhibit the maximal mutual information with the identity of the chemicals. The selected features closely match those found to perform well in the previous study using a wrapper approach to conduct an exhaustive search of all permitted feature combinations. By comparing the classification performance of support vector machines (using features selected by mutual information) with the performance observed in the previous study, we found that while our approach does not always give the maximum possible classification performance, it always selects features that achieve classification performance approaching the optimum obtained by exhaustive search. We performed further classification using the selected feature set with some common classifiers and found that, for the selected features, Bayesian Networks gave the best performance. Finally, we compared the observed classification performances with the performance of classifiers using randomly selected features. We found that the selected features consistently outperformed randomly selected features for all tested classifiers. The mutual information filter approach is therefore a computationally efficient method for selecting near optimal features for chemical sensor arrays. PMID:24595058
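A minimal sketch of the filter approach, assuming the sensor responses are discretized before estimating the mutual information each feature shares with the chemical identity; the paper's exact MI estimator may differ.

import numpy as np
from sklearn.metrics import mutual_info_score

def top_mi_features(X, y, k=4, bins=8):
    # Rank features by MI between their binned values and the class labels.
    mis = []
    for j in range(X.shape[1]):
        edges = np.histogram_bin_edges(X[:, j], bins=bins)
        mis.append(mutual_info_score(y, np.digitize(X[:, j], edges)))
    return np.argsort(mis)[::-1][:k]

# X: (samples, sensor responses), y: chemical labels (assumed available)
# selected = top_mi_features(X, y, k=4)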
ERIC Educational Resources Information Center
Kasen, Stephanie; Cohen, Patricia; Chen, Henian
2011-01-01
Hierarchical linear models were used to examine trajectories of impulsivity and capability between ages 10 and 25 in relation to suicide attempt in 770 youths followed longitudinally: intercepts were set at age 17. The impulsivity measure assessed features of urgency (e.g., poor control, quick provocation, and disregard for external constraints);…
Men, Hong; Fu, Songlin; Yang, Jialin; Cheng, Meiqi; Shi, Yan
2018-01-01
Paraffin odor intensity is an important quality indicator when a paraffin inspection is performed. Currently, paraffin odor level assessment is mainly dependent on an artificial sensory evaluation. In this paper, we developed a paraffin odor analysis system to classify and grade four kinds of paraffin samples. The original feature set was optimized using Principal Component Analysis (PCA) and Partial Least Squares (PLS). Support Vector Machine (SVM), Random Forest (RF), and Extreme Learning Machine (ELM) were applied to three different feature data sets for classification and level assessment of paraffin. For classification, the model based on SVM, with an accuracy rate of 100%, was superior to that based on RF, with an accuracy rate of 98.33–100%, and ELM, with an accuracy rate of 98.01–100%. For level assessment, the R2 related to the training set was above 0.97 and the R2 related to the test set was above 0.87. Through comprehensive comparison, the generalization of the model based on ELM was superior to those based on SVM and RF. The scoring errors for the three models were 0.0016–0.3494, lower than the error of 0.5–1.0 measured by industry standard experts, meaning these methods have a higher prediction accuracy for scoring paraffin level. PMID:29346328
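ELM, the best-generalizing model here, is simple enough to sketch directly: a fixed random hidden layer followed by a least-squares readout. This is a generic ELM, not the authors' exact configuration; the hidden-layer size is an assumption.

import numpy as np

class ELM:
    def __init__(self, n_hidden=50, seed=0):
        self.n_hidden, self.rng = n_hidden, np.random.default_rng(seed)

    def fit(self, X, y):
        # Random input weights stay fixed; only the readout is solved for.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = np.tanh(X @ self.W + self.b)
        self.beta, *_ = np.linalg.lstsq(H, y, rcond=None)
        return self

    def predict(self, X):
        return np.tanh(X @ self.W + self.b) @ self.beta

# elm = ELM().fit(X_train, scores_train); pred = elm.predict(X_test)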
Face Alignment via Regressing Local Binary Features.
Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian
2016-03-01
This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and perform a quantitative evaluation of how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% in relative terms. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
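The local-binary-feature idea, tree leaf indices acting as sparse binary codes that a linear model regresses jointly, can be approximated with scikit-learn as below; the forest on pixel-difference features and the Ridge readout are stand-ins for the paper's per-landmark random forests and global linear regression.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import Ridge

X = np.random.randn(500, 32)   # local pixel-difference features (toy)
y = np.random.randn(500, 2)    # x/y offset of one landmark (toy)

forest = RandomForestRegressor(n_estimators=10, max_depth=5, random_state=0)
forest.fit(X, y)
# Each sample's leaf index per tree, one-hot encoded, is its binary code.
codes = OneHotEncoder().fit_transform(forest.apply(X))
linear = Ridge(alpha=1.0).fit(codes, y)  # joint linear regression step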
Du, Yuncheng; Budman, Hector M; Duever, Thomas A
2016-06-01
Accurate automated quantitative analysis of living cells based on fluorescence microscopy images can be very useful for fast evaluation of experimental outcomes and cell culture protocols. In this work, an algorithm is developed for fast differentiation of normal and apoptotic viable Chinese hamster ovary (CHO) cells. For effective segmentation of cell images, a stochastic segmentation algorithm is developed by combining a generalized polynomial chaos expansion with a level set function-based segmentation algorithm. This approach provides a probabilistic description of the segmented cellular regions along the boundary, from which it is possible to calculate morphological changes related to apoptosis, i.e., the curvature and length of a cell's boundary. These features are then used as inputs to a support vector machine (SVM) classifier that is trained to distinguish between normal and apoptotic viable states of CHO cell images. The use of morphological features obtained from the stochastic level set segmentation of cell images in combination with the trained SVM classifier is more efficient in terms of differentiation accuracy as compared with the original deterministic level set method.
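The morphological features described, boundary length and curvature of the segmented contour, can be computed from a binary mask as follows; the stochastic level-set segmentation itself is not shown, and the discrete-curvature formula is one common choice rather than the paper's exact definition.

import numpy as np
from skimage.measure import find_contours

def boundary_features(mask):
    # Longest contour of the segmented cell.
    contour = max(find_contours(mask.astype(float), 0.5), key=len)
    d = np.diff(contour, axis=0)
    length = np.sum(np.hypot(d[:, 0], d[:, 1]))
    # Discrete curvature from first/second differences along the contour.
    dy, dx = np.gradient(contour[:, 0]), np.gradient(contour[:, 1])
    ddy, ddx = np.gradient(dy), np.gradient(dx)
    curv = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2 + 1e-12) ** 1.5
    return length, curv.mean()

# feats = [boundary_features(m) for m in masks]  # -> SVM training inputs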
Selecting Power-Efficient Signal Features for a Low-Power Fall Detector.
Wang, Changhong; Redmond, Stephen J; Lu, Wei; Stevens, Michael C; Lord, Stephen R; Lovell, Nigel H
2017-11-01
Falls are a serious threat to the health of older people. A wearable fall detector can automatically detect the occurrence of a fall and alert a caregiver or an emergency response service so they may deliver immediate assistance, improving the chances of recovering from fall-related injuries. One constraint of such a wearable technology is its limited battery life. Thus, minimization of power consumption is an important design concern, all the while maintaining satisfactory accuracy of the fall detection algorithms implemented on the wearable device. This paper proposes an approach for selecting power-efficient signal features such that the minimum desirable fall detection accuracy is assured. Using data collected in simulated falls, simulated activities of daily living, and real free-living trials, all using young volunteers, the proposed approach selects four features from a set of ten commonly used features, providing a power saving of 75.3%, while limiting the error rate of a binary classification decision tree fall detection algorithm to 7.1%.
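The selection idea can be sketched as a greedy loop that adds the cheapest remaining feature until the error budget is met; the power costs and the error_of evaluator (which would train and test the decision tree on the candidate set) are hypothetical, and the paper's procedure may differ.

def select_power_efficient(features, power_cost, error_of, max_error=0.071):
    # Greedily grow the cheapest feature set that satisfies the error budget.
    chosen, remaining = [], list(features)
    while remaining:
        best = min(remaining, key=lambda f: power_cost[f])
        chosen.append(best)
        remaining.remove(best)
        if error_of(chosen) <= max_error:
            break
    return chosen

# power_cost: e.g. {"accel_mean": 1.0, "fft_energy": 5.2, ...} (hypothetical)
# error_of: trains/evaluates the decision tree on the candidate feature set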
Pargo Chasma and its relationship to global tectonics
NASA Technical Reports Server (NTRS)
Ghail, R. C.
1993-01-01
Pargo Chasma was first identified in Pioneer Venus data as a 10,000 km long lineation extending from Atla Regio in the north and terminating in the plains south of Phoebe Regio. More recent Magellan data have revealed this feature to be one of the longest chains of coronae so far identified on the planet. Stofan et al. have identified 60 coronae and 2 related features associated with this chain; other estimates differ according to the classification scheme adopted; for example, Head et al. identify only 29 coronae but 43 arachnoids in the same region. This highlights one of the major problems associated with the preliminary mapping of the Magellan data: there has been an emphasis on identifying particular features on Venus without a universally accepted scheme to classify those features. Nevertheless, Pargo Chasma is clearly identified as a major tectonic belt of global significance. Together with the Artemis-Atla-Beta tectonic zone and the Beta-Phoebe rift belt, Pargo Chasma defines a region on Venus with an unusually high concentration of tectonic and volcanic features. Thus, an understanding of the processes involved in the formation of Pargo Chasma may lend significant insight into the evolution of the region and the planet as a whole. I have produced a detailed 1:10,000,000 scale map of Pargo Chasma and the surrounding area from preliminary USGS controlled mosaicked image maps of Venus constructed from Magellan data. In view of the problems highlighted above in relation to the efforts already made at identifying particular sets of features, I have mapped the region purely on the basis of the geomorphology visible in the Magellan data, without any attempt at identifying a particular set or class of features. Thus, the map produced distinguishes between areas of different brightness and texture. This has the advantage of highlighting the tectonic fabric of Pargo Chasma and clearly illustrates the close inter-relationship between individual coronae and the surrounding tectonic belts.
Gadd, C. S.; Baskaran, P.; Lobach, D. F.
1998-01-01
Extensive utilization of point-of-care decision support systems will be largely dependent on the development of user interaction capabilities that make them effective clinical tools in patient care settings. This research identified critical design features of point-of-care decision support systems that are preferred by physicians, through a multi-method formative evaluation of an evolving prototype of an Internet-based clinical decision support system. Clinicians used four versions of the system--each highlighting a different functionality. Surveys and qualitative evaluation methodologies assessed clinicians' perceptions regarding system usability and usefulness. Our analyses identified features that improve perceived usability, such as telegraphic representations of guideline-related information, facile navigation, and a forgiving, flexible interface. Users also preferred features that enhance usefulness and motivate use, such as an encounter documentation tool and the availability of physician instruction and patient education materials. In addition to identifying design features that are relevant to efforts to develop clinical systems for point-of-care decision support, this study demonstrates the value of combining quantitative and qualitative methods of formative evaluation with an iterative system development strategy to implement new information technology in complex clinical settings. PMID:9929188
Relational Network for Knowledge Discovery through Heterogeneous Biomedical and Clinical Features
Chen, Huaidong; Chen, Wei; Liu, Chenglin; Zhang, Le; Su, Jing; Zhou, Xiaobo
2016-01-01
Biomedical big data, as a whole, covers numerous features, while each dataset specifically delineates part of them. “Full feature spectrum” knowledge discovery across heterogeneous data sources remains a major challenge. We developed a method called bootstrapping for unified feature association measurement (BUFAM) for pairwise association analysis, and relational dependency network (RDN) modeling for global module detection on features across breast cancer cohorts. Discovered knowledge was cross-validated using data from Wake Forest Baptist Medical Center’s electronic medical records and annotated with BioCarta signaling signatures. The clinical potential of the discovered modules was exhibited by stratifying patients for drug responses. A series of discovered associations provided new insights into breast cancer, such as the effects of patient’s cultural background on preferences for surgical procedure. We also discovered two groups of highly associated features, the HER2 and the ER modules, each of which described how phenotypes were associated with molecular signatures, diagnostic features, and clinical decisions. The discovered “ER module”, which was dominated by cancer immunity, was used as an example for patient stratification and prediction of drug responses to tamoxifen and chemotherapy. BUFAM-derived RDN modeling demonstrated unique ability to discover clinically meaningful and actionable knowledge across highly heterogeneous biomedical big data sets. PMID:27427091
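The bootstrapping component of BUFAM can be illustrated with a generic resampled association test; Spearman correlation stands in for BUFAM's unified measure, and the inputs are assumed to be numpy arrays of one feature pair across patients.

import numpy as np
from scipy.stats import spearmanr

def bootstrap_association(x, y, n_boot=1000, alpha=0.05, seed=0):
    # Resample patients with replacement and collect the association statistic.
    rng = np.random.default_rng(seed)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(x), len(x))
        stats.append(spearmanr(x[idx], y[idx]).correlation)
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    # Keep the pair only if the confidence interval excludes zero.
    return lo, hi, not (lo <= 0.0 <= hi)

# lo, hi, significant = bootstrap_association(feature_a, feature_b)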
Nagarajan, Mahesh B.; Coan, Paola; Huber, Markus B.; Diemoz, Paul C.; Wismüller, Axel
2015-01-01
Phase contrast X-ray computed tomography (PCI-CT) has been demonstrated as a novel imaging technique that can visualize human cartilage with high spatial resolution and soft tissue contrast. Different textural approaches have been previously investigated for characterizing chondrocyte organization on PCI-CT to enable classification of healthy and osteoarthritic cartilage. However, the large size of feature sets extracted in such studies motivates an investigation into algorithmic feature reduction for computing efficient feature representations without compromising their discriminatory power. For this purpose, geometrical feature sets derived from the scaling index method (SIM) were extracted from 1392 volumes of interest (VOI) annotated on PCI-CT images of ex vivo human patellar cartilage specimens. The extracted feature sets were subject to linear and non-linear dimension reduction techniques as well as feature selection based on evaluation of mutual information criteria. The reduced feature set was subsequently used in a machine learning task with support vector regression to classify VOIs as healthy or osteoarthritic; classification performance was evaluated using the area under the receiver-operating characteristic (ROC) curve (AUC). Our results show that the classification performance achieved by 9-D SIM-derived geometric feature sets (AUC: 0.96 ± 0.02) can be maintained with 2-D representations computed from both dimension reduction and feature selection (AUC values as high as 0.97 ± 0.02). Thus, such feature reduction techniques can offer a high degree of compaction to large feature sets extracted from PCI-CT images while maintaining their ability to characterize the underlying chondrocyte patterns. PMID:25710875
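As a rough illustration of the reduce-then-classify pipeline described above, the sketch below pairs mutual-information feature selection (a stand-in for the paper's selection criteria) with an SVM and cross-validated AUC; the data and dimensions are synthetic placeholders:

```python
# Sketch of a feature-reduction-then-classify pipeline with scikit-learn
# stand-ins (mutual-information selection + an SVM); synthetic data only.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=9, n_informative=4,
                           random_state=0)  # stand-in for 9-D SIM features

# Reduce 9-D feature vectors to a 2-D representation before classification.
clf = make_pipeline(StandardScaler(),
                    SelectKBest(mutual_info_classif, k=2),
                    SVC(probability=True))
auc = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"AUC = {auc.mean():.2f} +/- {auc.std():.2f}")
```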
Learning semantic histopathological representation for basal cell carcinoma classification
NASA Astrophysics Data System (ADS)
Gutiérrez, Ricardo; Rueda, Andrea; Romero, Eduardo
2013-03-01
Diagnosis of a histopathology glass slide is a complex process that involves accurate recognition of several structures, their function in the tissue, and their relation to other structures. The way in which the pathologist represents the image content and the relations between those objects yields better and more accurate diagnoses. Therefore, an appropriate semantic representation of the image content will be useful in several analysis tasks such as cancer classification, tissue retrieval and histopathological image analysis, among others. Nevertheless, automatically recognizing those structures and extracting their inner semantic meaning are still very challenging tasks. In this paper we introduce a new semantic representation that describes histopathological concepts in a form suitable for classification. The approach identifies local concepts using a dictionary learning approach, i.e., the algorithm learns the most representative atoms from a set of randomly sampled patches, and then models the spatial relations among them by counting the co-occurrence between atoms while penalizing the spatial distance. The proposed approach was compared with a bag-of-features representation in a tissue classification task. For this purpose, 240 histological microscopical fields of view, 24 per tissue class, were collected. Those images fed one Support Vector Machine classifier per class, using 120 images as the training set and the remaining ones for testing, maintaining the same proportion of each concept in the training and test sets. The classification results, averaged over 100 random partitions of training and test sets, show that our approach is on average almost 6% more sensitive than the bag-of-features representation.
Lefkowith, J B; Di Valerio, R; Norris, J; Glick, G D; Alexander, A L; Jackson, L; Gilkeson, G S
1996-08-01
We recently produced a panel of seven glomerular-binding mAbs from a nephritic MRL-lpr mouse that bind to histones/nucleosomes (group I) or DNA (group II) adherent to glomerular basement membrane. To elucidate the molecular basis of their binding and ontogeny, we sequenced their variable (V) regions, analyzed the apparent somatic mutations, and predicted their three-dimensional structures. There were two clonally related sets (3 of 4 in group I, 3 of 3 in group II), both of the VHJ1558 family, and one mAb of the VH 7183 family. V region somatic mutations within clonally related sets had little effect on glomerular binding and did not appear to be selected for based on glomerular binding. The VH regions were most homologous with those from autoantibodies to histones, DNA, or IgG (i.e., rheumatoid factors); the Vkappa regions, with those from autoantibodies to small nuclear ribonucleoproteins (snRNP). The VH regions also exhibited an unusual VD junction (in the group I clonally related set) and an overall high content of charged amino acids (arginine, aspartic acid) in complementarity-determining regions (CDRs), particularly in CDR3. Molecular modeling studies suggested that the Fv regions of these mAbs converge to form a flat, open surface with a net positive charge. The CDR arginines in group I mAbs appear to be located in Ag contact regions of the binding cleft. In sum, these data suggest that glomerulotropic mAbs are a highly restricted set of Abs with distinctive molecular features that may mediate their binding to glomeruli.
Snoring classified: The Munich-Passau Snore Sound Corpus.
Janott, Christoph; Schmitt, Maximilian; Zhang, Yue; Qian, Kun; Pandit, Vedhas; Zhang, Zixing; Heiser, Clemens; Hohenhorst, Winfried; Herzog, Michael; Hemmert, Werner; Schuller, Björn
2018-03-01
Snoring can be excited in different locations within the upper airways during sleep. It was hypothesised that the excitation locations are correlated with distinct acoustic characteristics of the snoring noise. To verify this hypothesis, a database of snore sounds is developed, labelled with the location of sound excitation. Video and audio recordings taken during drug induced sleep endoscopy (DISE) examinations from three medical centres have been semi-automatically screened for snore events, which subsequently have been classified by ENT experts into four classes based on the VOTE classification. The resulting dataset containing 828 snore events from 219 subjects has been split into Train, Development, and Test sets. An SVM classifier has been trained using low level descriptors (LLDs) related to energy, spectral features, mel frequency cepstral coefficients (MFCC), formants, voicing, harmonic-to-noise ratio (HNR), spectral harmonicity, pitch, and microprosodic features. An unweighted average recall (UAR) of 55.8% could be achieved using the full set of LLDs including formants. The best performing subset is the MFCC-related set of LLDs. A strong difference in performance could be observed between the permutations of train, development, and test partitions, which may be caused by the relatively low number of subjects included in the smaller classes of the strongly unbalanced data set. A database of snoring sounds is presented in which sounds are classified according to their sound excitation location based on objective criteria and verifiable video material. With the database, it could be demonstrated that machine classifiers can distinguish different excitation locations of snoring sounds in the upper airway based on acoustic parameters. Copyright © 2018 Elsevier Ltd. All rights reserved.
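The UAR metric reported above is simply macro-averaged recall, which weights all classes equally regardless of their size; a minimal check with scikit-learn (the labels here are hypothetical):

```python
# Unweighted average recall (UAR) is the macro-averaged recall over classes.
from sklearn.metrics import recall_score

# Hypothetical VOTE-style labels (0=V, 1=O, 2=T, 3=E) for a few snore events.
y_true = [0, 0, 1, 2, 3, 3, 2, 1, 0, 2]
y_pred = [0, 1, 1, 2, 3, 2, 2, 1, 0, 0]
uar = recall_score(y_true, y_pred, average="macro")
print(f"UAR = {uar:.3f}")  # mean of per-class recalls, classes weighted equally
```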
Speech recognition features for EEG signal description in detection of neonatal seizures.
Temko, A; Boylan, G; Marnane, W; Lightbody, G
2010-01-01
In this work, features which are usually employed in automatic speech recognition (ASR) are used for the detection of neonatal seizures in newborn EEG. Three conventional ASR feature sets are compared to the feature set which has been previously developed for this task. The results indicate that the thoroughly-studied spectral envelope based ASR features perform reasonably well on their own. Additionally, the SVM Recursive Feature Elimination routine is applied to all extracted features pooled together. It is shown that ASR features consistently appear among the top-rank features.
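A brief sketch of the SVM Recursive Feature Elimination step mentioned above, using scikit-learn on synthetic data standing in for the pooled EEG/ASR features:

```python
# Sketch of SVM Recursive Feature Elimination over a pooled feature set
# (synthetic data; illustrative only).
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=40, n_informative=6,
                           random_state=0)  # stand-in for pooled EEG/ASR features
svm = SVC(kernel="linear")          # linear kernel exposes coef_ for ranking
rfe = RFE(svm, n_features_to_select=10, step=1).fit(X, y)
top_features = [i for i, kept in enumerate(rfe.support_) if kept]
print("top-ranked feature indices:", top_features)
```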
Automatic feature design for optical character recognition using an evolutionary search procedure.
Stentiford, F W
1985-03-01
An automatic evolutionary search is applied to the problem of feature extraction in an OCR application. A performance measure based on feature independence is used to generate features which do not appear to suffer from peaking effects [17]. Features are extracted from a training set of 30 600 machine printed 34 class alphanumeric characters derived from British mail. Classification results on the training set and a test set of 10 200 characters are reported for an increasing number of features. A 1.01 percent forced decision error rate is obtained on the test data using 316 features. The hardware implementation should be cheap and fast to operate. The performance compares favorably with current low cost OCR page readers.
Wong, Gerard; Leckie, Christopher; Kowalczyk, Adam
2012-01-15
Feature selection is a key concept in machine learning for microarray datasets, where the number of features, represented by probesets, is typically several orders of magnitude larger than the available sample size. Computational tractability is a key challenge for feature selection algorithms in handling very high-dimensional datasets beyond a hundred thousand features, such as in datasets produced on single nucleotide polymorphism microarrays. In this article, we present a novel feature set reduction approach that enables scalable feature selection on datasets with hundreds of thousands of features and beyond. Our approach enables more efficient handling of higher resolution datasets to achieve better disease subtype classification of samples for potentially more accurate diagnosis and prognosis, which allows clinicians to make more informed decisions with regard to patient treatment options. We applied our feature set reduction approach to several publicly available cancer single nucleotide polymorphism (SNP) array datasets and evaluated its performance in terms of its multiclass predictive classification accuracy over different cancer subtypes, its speedup in execution, as well as its scalability with respect to sample size and array resolution. Feature Set Reduction (FSR) was able to reduce the dimensions of an SNP array dataset by more than two orders of magnitude while achieving at least equal, and in most cases superior, predictive classification performance over that achieved on features selected by existing feature selection methods alone. An examination of the biological relevance of frequently selected features from FSR-reduced feature sets revealed strong enrichment in association with cancer. FSR was implemented in MATLAB R2010b and is available at http://ww2.cs.mu.oz.au/~gwong/FSR.
Using Concept Space to Verify Hyponymy in Building a Hyponymy Lexicon
NASA Astrophysics Data System (ADS)
Liu, Lei; Zhang, Sen; Diao, Lu Hong; Yan, Shu Ying; Cao, Cun Gen
Verification of hyponymy relations is a basic problem in knowledge acquisition. We present a method of hyponymy verification based on concept space. Firstly, we define the concept space for a group of candidate hyponymy relations. Secondly, we analyze the concept space and define a set of hyponymy features based on the space structure. Then we use these features to verify the candidate hyponymy relations. Experimental results show that the method can provide adequate verification of hyponymy.
General subspace learning with corrupted training data via graph embedding.
Bao, Bing-Kun; Liu, Guangcan; Hong, Richang; Yan, Shuicheng; Xu, Changsheng
2013-11-01
We address the following subspace learning problem: supposing we are given a set of labeled, corrupted training data points, how to learn the underlying subspace, which contains three components: an intrinsic subspace that captures certain desired properties of a data set, a penalty subspace that fits the undesired properties of the data, and an error container that models the gross corruptions possibly existing in the data. Given a set of data points, these three components can be learned by solving a nuclear norm regularized optimization problem, which is convex and can be efficiently solved in polynomial time. Using the method as a tool, we propose a new discriminant analysis (i.e., supervised subspace learning) algorithm called Corruptions Tolerant Discriminant Analysis (CTDA), in which the intrinsic subspace is used to capture the features with high within-class similarity, the penalty subspace takes the role of modeling the undesired features with high between-class similarity, and the error container takes charge of fitting the possible corruptions in the data. We show that CTDA can well handle the gross corruptions possibly existing in the training data, whereas previous linear discriminant analysis algorithms arguably fail in such a setting. Extensive experiments conducted on two benchmark human face data sets and one object recognition data set show that CTDA outperforms the related algorithms.
Shrivastava, Vimal K; Londhe, Narendra D; Sonawane, Rajendra S; Suri, Jasjit S
2016-04-01
Psoriasis is an autoimmune skin disease with red and scaly plaques on the skin, affecting about 125 million people worldwide. Currently, dermatologists use visual and haptic methods to diagnose disease severity. This does not help them in stratification and risk assessment of the lesion stage and grade. Further, current methods add complexity during the monitoring and follow-up phase. The current diagnostic tools lead to subjectivity in decision making and are unreliable and laborious. This paper presents a first comparative performance study of its kind using a principal component analysis (PCA) based CADx system for psoriasis risk stratification and image classification utilizing: (i) 11 higher order spectra (HOS) features, (ii) 60 texture features, and (iii) 86 color features, and their seven combinations. In aggregate, 540 image samples (270 healthy and 270 diseased) from 30 psoriasis patients of Indian ethnic origin are used in our database. Machine learning using PCA is used for dominant feature selection, and the selected features are fed to a support vector machine (SVM) classifier to obtain optimized performance. Three different protocols are implemented using the three kinds of feature sets. A reliability index of the CADx is computed. Among all feature combinations, the CADx system shows optimal performance of 100% accuracy, 100% sensitivity and specificity when all three sets of features are combined. Further, our experimental results with increasing data size show that all feature combinations yield a high reliability index throughout the PCA cutoffs, except the color feature set and the combination of color and texture feature sets. HOS features are powerful in psoriasis disease classification and stratification. Although all three feature sets (HOS, texture, and color) perform competitively on their own, the machine learning system performs best when they are combined. The system is fully automated, reliable and accurate. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
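A minimal sketch of the PCA-then-SVM stage described above, assuming a 157-dimensional combined feature vector (11 HOS + 60 texture + 86 color) and synthetic data; it is not the authors' full CADx system:

```python
# Sketch of a PCA-for-dominant-features -> SVM stage (synthetic data).
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for 540 samples with combined HOS + texture + color features (157-D).
X, y = make_classification(n_samples=540, n_features=157, n_informative=20,
                           random_state=0)
cadx = make_pipeline(StandardScaler(),
                     PCA(n_components=0.95),  # keep 95% of the variance
                     SVC())
scores = cross_val_score(cadx, X, y, cv=10)
print(f"accuracy = {scores.mean():.3f}")
```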
Permafrost features on Earth and Mars: Similarities, differences
NASA Technical Reports Server (NTRS)
Joens, H. P.
1985-01-01
Typical permafrost features on Earth are polygonal structures, pingos and soli-/gelifluxion features. In areas around the poles and in mountain ranges the precipitation accumulates into inland ice or ice streams. On Mars the same features have been identified: polygonal features cover the larger part of the northern lowlands, probably indicating an ice-wedge/sand-wedge system or desiccation cracks. These features indicate the extent of large mud accumulations, which seem to be related to large outflow events from the chaotic terrains. The shoreline of this mud accumulation is indicated by a special set of relief types. In some areas large pingo-like hills have been identified. In the vicinity of the largest martian volcano, Olympus Mons, the melting of underlying permafrost and/or ground ice led to the downslope sliding of large parts of the primary shield, which formed the aureole around Olympus Mons. Glacier-like features are identified along the escarpment which separates the Southern Uplands from the Northern Lowlands.
NASA Astrophysics Data System (ADS)
Nemoto, Mitsutaka; Hayashi, Naoto; Hanaoka, Shouhei; Nomura, Yukihiro; Miki, Soichiro; Yoshikawa, Takeharu; Ohtomo, Kuni
2016-03-01
The purpose of this study is to evaluate the feasibility of a novel feature generation method, based on multiple deep neural networks (DNNs) with boosting, for computer-assisted detection (CADe). It is hard and time-consuming to optimize the hyperparameters for DNNs such as the stacked denoising autoencoder (SdA). The proposed method allows using SdA-based features without the burden of hyperparameter setting. The proposed method was evaluated in an application for detecting cerebral aneurysms on magnetic resonance angiograms (MRA). A baseline CADe process included four components: scaling, candidate area limitation, candidate detection, and candidate classification. The proposed feature generation method was applied to extract the optimal features for candidate classification, and only required setting the range of the hyperparameters for the SdA. The optimal feature set was selected from a large quantity of SdA-based features by multiple SdAs, each of which was trained using a different hyperparameter set. The feature selection was performed through the AdaBoost ensemble learning method. Training of the baseline CADe process and the proposed feature generation was performed with 200 MRA cases, and the evaluation was performed with 100 MRA cases. The proposed method successfully provided SdA-based features by just setting the range of some hyperparameters for the SdA. The CADe process using both the previous voxel features and the SdA-based features had the best performance, with an area under the ROC curve of 0.838 and an ANODE score of 0.312. The results show that the proposed method was effective in the application for detecting cerebral aneurysms on MRA.
A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification
Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.
2015-01-01
In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT)- and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT- and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's $T^2$ statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898
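For illustration, a rough sketch of the LP-SVD idea: fit linear-prediction coefficients, build the LP filter's impulse-response matrix, and project the signal onto its left singular vectors. The model order and matrix sizes below are assumptions, not the authors' exact settings:

```python
# Rough LP-SVD sketch (illustrative assumptions: order, matrix size, 4 features).
import numpy as np
from scipy.linalg import solve_toeplitz, svd, toeplitz
from scipy.signal import lfilter

def lp_svd_features(x, order=8, n_resp=64, n_feat=4):
    # Autocorrelation method for LP coefficients a[1..order].
    r = np.correlate(x, x, mode="full")[len(x) - 1:]
    a = solve_toeplitz(r[:order], r[1:order + 1])
    # Impulse response of the all-pole synthesis filter 1 / A(z).
    h = lfilter([1.0], np.concatenate(([1.0], -a)), np.eye(1, n_resp)[0])
    H = toeplitz(h, np.zeros(n_resp))        # impulse-response (convolution) matrix
    U, s, Vt = svd(H)
    return U[:, :n_feat].T @ x[:n_resp]      # project onto left singular vectors

rng = np.random.default_rng(0)
eeg_segment = rng.normal(size=256)           # stand-in for an EEG epoch
print(lp_svd_features(eeg_segment))
```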
Wang, Kun-Ching
2015-01-01
The classification of emotional speech is mostly considered in speech-related research on human-computer interaction (HCI). In this paper, the purpose is to present a novel feature extraction based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis for characterization and classification of different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the texture property of the multi-resolution emotional speech spectrogram should be a good feature set for emotion classification in speech. Furthermore, multi-resolution analysis of texture can give a clearer discrimination between emotions than uniform-resolution analysis. In order to provide high accuracy of emotional discrimination, especially in real life, an acoustic activity detection (AAD) algorithm must be applied in the MRTII-based feature extraction. Considering the presence of many blended emotions in real life, this paper makes use of two corpora of naturally-occurring dialogs recorded in real-life call centers. Compared with the traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features can also improve the correct classification rates of the proposed systems across different language databases. Experimental results show that the proposed MRTII-based feature information, inspired by human visual perception of the spectrogram image, can provide significant classification for real-life emotional recognition in speech. PMID:25594590
The Role of Relational Information in Contingent Capture
ERIC Educational Resources Information Center
Becker, Stefanie I.; Folk, Charles L.; Remington, Roger W.
2010-01-01
On the contingent capture account, top-down attentional control settings restrict involuntary attentional capture to items that match the features of the search target. Attention capture is involuntary, but contingent on goals and intentions. The observation that only target-similar items can capture attention has usually been taken to show that…
Examining Classroom Interactions Related to Difference in Students' Science Achievement.
ERIC Educational Resources Information Center
Zady, Madelon F.; Portes, Pedro R.; Ochs, V. Dan
2003-01-01
Examines the cognitive supports that underlie achievement in science using a cultural historical framework and the activity setting (AS) construct with five features: personnel, motivation, scripts, task demands, and beliefs. Reports four emergent phenomena--science activities, the building of learning, meaning in lessons, and the conflict over…
The Broad Autism Phenotype Questionnaire
ERIC Educational Resources Information Center
Hurley, Robert S. E.; Losh, Molly; Parlier, Morgan; Reznick, J. Steven; Piven, Joseph
2007-01-01
The broad autism phenotype (BAP) is a set of personality and language characteristics that reflect the phenotypic expression of the genetic liability to autism, in non-autistic relatives of autistic individuals. These characteristics are milder but qualitatively similar to the defining features of autism. A new instrument designed to measure the…
Verleger, Rolf; Groen, Margriet; Heide, Wolfgang; Sobieralska, Kinga; Jaśkowski, Piotr
2008-05-01
We studied how physical and instructed embedding of features in gestalts affects perceptual selection. Four ovals on the horizontal midline were either unconnected or pairwise connected by circles, forming ears of left and right heads (gestalts). Relevant to responding was the position of one colored oval, either within its pair or relative to fixation ("object-based" or "fixation-based" instruction). Responses were faster under fixation- than object-based instruction, less so with gestalts. Previously reported increases of N1 when evoked by features within objects were replicated for fixation-based instruction only. There was no effect of instruction on N2pc. However, P1 increased under the adequate instruction, object-based for gestalts, fixation-based for unconnected items, which presumably indicated how foci of attention were set by expecting specific stimuli under instructions that specified how to bind these stimuli to objects.
Automatic Detection of Blue-White Veil and Related Structures in Dermoscopy Images
Celebi, M. Emre; Iyatomi, Hitoshi; Stoecker, William V.; Moss, Randy H.; Rabinovitz, Harold S.; Argenziano, Giuseppe; Soyer, H. Peter
2011-01-01
Dermoscopy is a non-invasive skin imaging technique, which permits visualization of features of pigmented melanocytic neoplasms that are not discernable by examination with the naked eye. One of the most important features for the diagnosis of melanoma in dermoscopy images is the blue-white veil (irregular, structureless areas of confluent blue pigmentation with an overlying white “ground-glass” film). In this article, we present a machine learning approach to the detection of blue-white veil and related structures in dermoscopy images. The method involves contextual pixel classification using a decision tree classifier. The percentage of blue-white areas detected in a lesion combined with a simple shape descriptor yielded a sensitivity of 69.35% and a specificity of 89.97% on a set of 545 dermoscopy images. The sensitivity rises to 78.20% for detection of blue veil in those cases where it is a primary feature for melanoma recognition. PMID:18804955
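A small sketch of contextual pixel classification with a decision tree, in the spirit of the veil detector described above; the six per-pixel features and the toy ground truth are invented for illustration and are not the authors' trained model:

```python
# Sketch of contextual pixel classification with a decision tree (toy data).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Per-pixel features: R, G, B plus a local-neighborhood mean per channel.
X = rng.uniform(0, 255, size=(5000, 6))
y = (X[:, 2] > 150) & (X[:, 0] < 120)        # toy "bluish pixel" ground truth
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5).fit(X_tr, y_tr)
pred = tree.predict(X_te)
tp = np.sum(pred & y_te); fn = np.sum(~pred & y_te)
tn = np.sum(~pred & ~y_te); fp = np.sum(pred & ~y_te)
print(f"sensitivity = {tp / (tp + fn):.3f}, specificity = {tn / (tn + fp):.3f}")
```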
Einhäuser, Wolfgang; Nuthmann, Antje
2016-09-01
During natural scene viewing, humans typically attend and fixate selected locations for about 200-400 ms. Two variables characterize such "overt" attention: the probability of a location being fixated, and the fixation's duration. Both variables have been widely researched, but little is known about their relation. We use a two-step approach to investigate the relation between fixation probability and duration. In the first step, we use a large corpus of fixation data. We demonstrate that fixation probability (empirical salience) predicts fixation duration across different observers and tasks. Linear mixed-effects modeling shows that this relation is explained neither by joint dependencies on simple image features (luminance, contrast, edge density) nor by spatial biases (central bias). In the second step, we experimentally manipulate some of these features. We find that fixation probability from the corpus data still predicts fixation duration for this new set of experimental data. This holds even if stimuli are deprived of low-level image features, as long as higher level scene structure remains intact. Together, this shows a robust relation between fixation duration and probability, which does not depend on simple image features. Moreover, the study exemplifies the combination of empirical research on a large corpus of data with targeted experimental manipulations.
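The kind of linear mixed-effects analysis described above can be sketched with statsmodels; the column names and the generative model below are hypothetical, not the authors' exact specification:

```python
# Sketch of a linear mixed-effects model: fixation duration predicted by
# empirical salience and image features, with observers as a grouping factor.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "observer": rng.integers(0, 20, n),            # 20 hypothetical observers
    "salience": rng.uniform(0, 1, n),              # empirical fixation probability
    "luminance": rng.uniform(0, 1, n),
    "contrast": rng.uniform(0, 1, n),
})
df["duration"] = 250 + 80 * df["salience"] + rng.normal(0, 40, n)

model = smf.mixedlm("duration ~ salience + luminance + contrast",
                    df, groups=df["observer"]).fit()
print(model.summary())
```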
The Centre for Speech, Language and the Brain (CSLB) concept property norms.
Devereux, Barry J; Tyler, Lorraine K; Geertzen, Jeroen; Randall, Billi
2014-12-01
Theories of the representation and processing of concepts have been greatly enhanced by models based on information available in semantic property norms. This information relates both to the identity of the features produced in the norms and to their statistical properties. In this article, we introduce a new and large set of property norms that are designed to be a more flexible tool to meet the demands of many different disciplines interested in conceptual knowledge representation, from cognitive psychology to computational linguistics. As well as providing all features listed by 2 or more participants, we also show the considerable linguistic variation that underlies each normalized feature label and the number of participants who generated each variant. Our norms are highly comparable with the largest extant set (McRae, Cree, Seidenberg, & McNorgan, 2005) in terms of the number and distribution of features. In addition, we show how the norms give rise to a coherent category structure. We provide these norms in the hope that the greater detail available in the Centre for Speech, Language and the Brain norms should further promote the development of models of conceptual knowledge. The norms can be downloaded at www.csl.psychol.cam.ac.uk/propertynorms.
Cheng, Tiejun; Li, Qingliang; Wang, Yanli; Bryant, Stephen H
2011-02-28
Aqueous solubility is recognized as a critical parameter in both early- and late-stage drug discovery. Therefore, in silico modeling of solubility has attracted extensive interest in recent years. Most previous studies have been limited to using relatively small data sets with limited diversity, which in turn limits the predictability of derived models. In this work, we present a support vector machines model for the binary classification of solubility by taking advantage of the largest known public data set, which contains over 46 000 compounds with experimental solubility. Our model was optimized in combination with a reduction and recombination feature selection strategy. The best model demonstrated robust performance in both cross-validation and prediction of two independent test sets, indicating it could be a practical tool to select soluble compounds for screening, purchasing, and synthesizing. Moreover, our work may be used for comparative evaluation of solubility classification studies, owing to its use of completely public resources.
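A compact sketch of a cross-validated linear-SVM solubility classifier on synthetic descriptors; the reduction-and-recombination feature selection of the actual study is not reproduced here:

```python
# Sketch of a binary solubility classifier (synthetic descriptors only).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, n_features=100, n_informative=15,
                           weights=[0.7, 0.3], random_state=0)
clf = make_pipeline(StandardScaler(),
                    LinearSVC(class_weight="balanced", max_iter=5000))
print(f"F1 = {cross_val_score(clf, X, y, cv=5, scoring='f1').mean():.3f}")
```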
Perceptual quality estimation of H.264/AVC videos using reduced-reference and no-reference models
NASA Astrophysics Data System (ADS)
Shahid, Muhammad; Pandremmenou, Katerina; Kondi, Lisimachos P.; Rossholm, Andreas; Lövström, Benny
2016-09-01
Reduced-reference (RR) and no-reference (NR) models for video quality estimation, using features that account for the impact of coding artifacts, spatio-temporal complexity, and packet losses, are proposed. The purpose of this study is to analyze a number of potentially quality-relevant features in order to select the most suitable set of features for building the desired models. The proposed sets of features have not been used in the literature and some of the features are used for the first time in this study. The features are employed by the least absolute shrinkage and selection operator (LASSO), which selects only the most influential of them toward perceptual quality. For comparison, we apply feature selection in the complete feature sets and ridge regression on the reduced sets. The models are validated using a database of H.264/AVC encoded videos that were subjectively assessed for quality in an ITU-T compliant laboratory. We infer that just two features selected by RR LASSO and two bitstream-based features selected by NR LASSO are able to estimate perceptual quality with high accuracy, higher than that of ridge, which uses more features. The comparisons with competing works and two full-reference metrics also verify the superiority of our models.
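The contrast drawn above, LASSO's sparse selection versus ridge's dense weighting, can be sketched as follows (synthetic features standing in for the proposed quality-relevant features):

```python
# LASSO feature selection versus ridge regression for quality estimation.
import numpy as np
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 15))                  # 15 candidate quality features
mos = 3 + 0.8 * X[:, 0] - 0.5 * X[:, 3] + rng.normal(0, 0.2, 120)  # toy MOS

Xs = StandardScaler().fit_transform(X)
lasso = LassoCV(cv=5).fit(Xs, mos)
ridge = RidgeCV().fit(Xs, mos)

selected = np.flatnonzero(lasso.coef_)          # LASSO zeroes out weak features
print("LASSO kept features:", selected)
print("ridge uses all", np.count_nonzero(ridge.coef_), "features")
```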
Evidence of tampering in watermark identification
NASA Astrophysics Data System (ADS)
McLauchlan, Lifford; Mehrübeoglu, Mehrübe
2009-08-01
In this work, watermarks are embedded in digital images in the discrete wavelet transform (DWT) domain. Principal component analysis (PCA) is performed on the DWT coefficients. Next, higher order statistics based on the principal components and the eigenvalues are determined for different sets of images. Feature sets are analyzed for different types of attacks in m-dimensional space. The results demonstrate the separability of the features for the tampered digital copies. Different feature sets are studied to determine more effective tamper-evident feature sets. In digital forensics, the probable manipulation(s) or modification(s) performed on the digital information can be identified using the described technique.
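A sketch of that feature pipeline, one-level 2-D DWT, PCA on the detail coefficients, then higher-order statistics of the principal components; the library choices and parameters are assumptions rather than the authors' exact setup:

```python
# DWT -> PCA -> higher-order statistics tamper-feature sketch (toy image).
import numpy as np
import pywt
from scipy.stats import kurtosis, skew
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
img = rng.uniform(0, 1, size=(128, 128))        # stand-in watermarked image

cA, (cH, cV, cD) = pywt.dwt2(img, "haar")       # one-level 2-D DWT
coeffs = np.stack([cH, cV, cD], axis=-1).reshape(-1, 3)

pca = PCA(n_components=3).fit(coeffs)
pcs = pca.transform(coeffs)

# Higher-order statistics per principal component plus the eigenvalues.
features = np.concatenate([skew(pcs, axis=0), kurtosis(pcs, axis=0),
                           pca.explained_variance_])
print("tamper-evidence feature vector:", np.round(features, 3))
```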
NASA Astrophysics Data System (ADS)
Rhodes, Andrew P.; Christian, John A.; Evans, Thomas
2017-12-01
With the availability and popularity of 3D sensors, it is advantageous to re-examine the use of point cloud descriptors for the purpose of pose estimation and spacecraft relative navigation. One popular descriptor is the oriented unique repeatable clustered viewpoint feature histogram (OUR-CVFH).
Ageing and feature binding in visual working memory: The role of presentation time.
Rhodes, Stephen; Parra, Mario A; Logie, Robert H
2016-01-01
A large body of research has clearly demonstrated that healthy ageing is accompanied by an associative memory deficit. Older adults exhibit disproportionately poor performance on memory tasks requiring the retention of associations between items (e.g., pairs of unrelated words). In contrast to this robust deficit, older adults' ability to form and temporarily hold bound representations of an object's surface features, such as colour and shape, appears to be relatively well preserved. However, the findings of one set of experiments suggest that older adults may struggle to form temporary bound representations in visual working memory when given more time to study objects. These findings, though, were based on between-participant comparisons across experimental paradigms. The present study directly assesses the role of presentation time in the ability of younger and older adults to bind shape and colour in visual working memory using a within-participant design. We report new evidence that giving older adults longer to study memory objects does not differentially affect their immediate memory for feature combinations relative to individual features. This is in line with a growing body of research suggesting that there is no age-related impairment in immediate memory for colour-shape binding.
Le, Trang T; Simmons, W Kyle; Misaki, Masaya; Bodurka, Jerzy; White, Bill C; Savitz, Jonathan; McKinney, Brett A
2017-09-15
Classification of individuals into disease or clinical categories from high-dimensional biological data with low prediction error is an important challenge of statistical learning in bioinformatics. Feature selection can improve classification accuracy but must be incorporated carefully into cross-validation to avoid overfitting. Recently, feature selection methods based on differential privacy, such as differentially private random forests and reusable holdout sets, have been proposed. However, for domains such as bioinformatics, where the number of features is much larger than the number of observations (p ≫ n), these differential privacy methods are susceptible to overfitting. We introduce private Evaporative Cooling, a stochastic privacy-preserving machine learning algorithm that uses Relief-F for feature selection and random forest for privacy-preserving classification while also preventing overfitting. We relate the privacy-preserving threshold mechanism to a thermodynamic Maxwell-Boltzmann distribution, where the temperature represents the privacy threshold. We use the thermal statistical physics concept of evaporative cooling of atomic gases to perform backward stepwise privacy-preserving feature selection. On simulated data with main effects and statistical interactions, we compare accuracies on holdout and validation sets for three privacy-preserving methods: the reusable holdout, the reusable holdout with random forest, and private Evaporative Cooling, which uses Relief-F feature selection and random forest classification. In simulations where interactions exist between attributes, private Evaporative Cooling provides higher classification accuracy without overfitting based on an independent validation set. In simulations without interactions, thresholdout with random forest and private Evaporative Cooling give comparable accuracies. We also apply these privacy methods to human brain resting-state fMRI data from a study of major depressive disorder. Code is available at http://insilico.utulsa.edu/software/privateEC. Contact: brett-mckinney@utulsa.edu. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
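As a rough sketch of the Relief-F-plus-random-forest pairing mentioned above (without any privacy mechanism, so this is not private Evaporative Cooling itself):

```python
# Minimal Relief-style feature weighting feeding a random forest.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def relieff_weights(X, y):
    """One-nearest-hit/one-nearest-miss Relief weights (binary labels)."""
    n, p = X.shape
    w = np.zeros(p)
    for i in range(n):
        d = np.abs(X - X[i]).sum(axis=1)     # Manhattan distances to sample i
        d[i] = np.inf                        # exclude the sample itself
        hit = np.argmin(np.where(y == y[i], d, np.inf))
        miss = np.argmin(np.where(y != y[i], d, np.inf))
        w += np.abs(X[i] - X[miss]) - np.abs(X[i] - X[hit])
    return w / n

X, y = make_classification(n_samples=200, n_features=50, n_informative=5,
                           random_state=0)
w = relieff_weights(X, y)
top = np.argsort(w)[::-1][:10]               # keep the 10 best-weighted features
rf = RandomForestClassifier(random_state=0).fit(X[:, top], y)
print("selected features:", sorted(top.tolist()))
```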
Comparing sociocultural features of cholera in three endemic African settings
2013-01-01
Background Cholera mainly affects developing countries where safe water supply and sanitation infrastructure are often rudimentary. Sub-Saharan Africa is a cholera hotspot. Effective cholera control requires not only a professional assessment, but also consideration of community-based priorities. The present work compares local sociocultural features of endemic cholera in urban and rural sites from three field studies in southeastern Democratic Republic of Congo (SE-DRC), western Kenya and Zanzibar. Methods A vignette-based semistructured interview was used in 2008 in Zanzibar to study sociocultural features of cholera-related illness among 356 men and women from urban and rural communities. Similar cross-sectional surveys were performed in western Kenya (n = 379) and in SE-DRC (n = 360) in 2010. Systematic comparison across all settings considered the following domains: illness identification; perceived seriousness, potential fatality and past household episodes; illness-related experience; meaning; knowledge of prevention; help-seeking behavior; and perceived vulnerability. Results Cholera is well known in all three settings and is understood to have a significant impact on people’s lives. Its social impact was mainly characterized by financial concerns. Problems with unsafe water, sanitation and dirty environments were the most common perceived causes across settings; nonetheless, non-biomedical explanations were widespread in rural areas of SE-DRC and Zanzibar. Safe food and water and vaccines were prioritized for prevention in SE-DRC. Safe water was prioritized in western Kenya along with sanitation and health education. The latter two were also prioritized in Zanzibar. Use of oral rehydration solutions and rehydration was a top priority everywhere; healthcare facilities were universally reported as a primary source of help. Respondents in SE-DRC and Zanzibar reported cholera as affecting almost everybody without differentiating much for gender, age and class. In contrast, in western Kenya, gender differentiation was pronounced, and children and the poor were regarded as most vulnerable to cholera. Conclusions This comprehensive review identified common and distinctive features of local understandings of cholera. Classical treatment (that is, rehydration) was highlighted as a priority for control in the three African study settings and is likely to be identified in the region beyond. Findings indicate the value of insight from community studies to guide local program planning for cholera control and elimination. PMID:24047241
Protein structure based prediction of catalytic residues.
Fajardo, J Eduardo; Fiser, Andras
2013-02-22
Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. These 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues over the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be captured more simply and more effectively with the distance-to-GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that, due to rapid changes in the size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases.
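The two features singled out above, distance to the GCM and relative-entropy conservation, are straightforward to compute; a minimal sketch with toy coordinates and frequencies:

```python
# Distance-to-GCM and relative-entropy conservation features (toy inputs).
import numpy as np

def distance_to_gcm(coords):
    """Per-residue distance to the unweighted center of mass."""
    gcm = coords.mean(axis=0)
    return np.linalg.norm(coords - gcm, axis=1)

def relative_entropy(column_freqs, background):
    """Kullback-Leibler divergence of an alignment column vs. background."""
    p, q = np.asarray(column_freqs), np.asarray(background)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / q[mask])))

coords = np.random.default_rng(0).normal(scale=15.0, size=(120, 3))  # toy C-alpha xyz
print("GCM distances (first 5):", np.round(distance_to_gcm(coords)[:5], 1))

column = [0.7, 0.2, 0.1, 0.0]        # toy 4-letter alphabet column frequencies
bg = [0.25, 0.25, 0.25, 0.25]
print("conservation =", round(relative_entropy(column, bg), 3), "bits")
```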
Robust Statistics and Regularization for Feature Extraction and UXO Discrimination
2011-07-01
July 11, 2011. On real data we find that this technique has an improved probability of finding all ordnance in a test data set, relative to previously... many sites. Tests on larger data sets should still be carried out. In previous work we considered a bootstrapping approach to selecting the operating... Marginalizing over $x$, we obtain the probability that the $i$th order statistic in the test data belongs to the $T$ class (Eq. 55): $P(T \mid x_{(i)}) = \int_{-\infty}^{\infty} P(T \mid x)\,p(x$...
Multi-image CAD employing features derived from ipsilateral mammographic views
NASA Astrophysics Data System (ADS)
Good, Walter F.; Zheng, Bin; Chang, Yuan-Hsiang; Wang, Xiao Hui; Maitz, Glenn S.; Gur, David
1999-05-01
On mammograms, certain kinds of features related to masses (e.g., location, texture, degree of spiculation, and integrated density difference) tend to be relatively invariant, or at least predictable, with respect to breast compression. Thus, ipsilateral pairs of mammograms may contain information not available from analyzing single views separately. To demonstrate the feasibility of incorporating multi-view features into a CAD algorithm, `single-image' CAD was applied to each individual image in a set of 60 ipsilateral studies, after which all possible pairs of suspicious regions, consisting of one region from each view, were formed. For these 402 pairs we defined and evaluated `multi-view' features such as: (1) relative position of the centers of the regions; (2) ratio of the lengths of the region projections parallel to the nipple axis lines; (3) ratio of integrated contrast difference; (4) ratio of the sizes of the suspicious regions; and (5) a measure of the relative complexity of the region boundaries. Each pair was identified as either a `true positive/true positive' (T) pair (i.e., two regions which are projections of the same actual mass) or as a falsely associated pair (F). Distributions for each feature were calculated. A Bayesian network was trained and tested to classify pairs of suspicious regions based exclusively on the multi-view features described above. Distributions for all features were significantly different for T versus F pairs, as indicated by likelihood ratios. Performance of the Bayesian network, measured by ROC analysis, indicates a significant ability to distinguish between T pairs and F pairs (Az = 0.82 ± 0.03), using information that is attributed to the multi-view content. This study is the first demonstration that a significant amount of spatial information can be derived from ipsilateral pairs of mammograms.
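A small sketch of how such multi-view pair features might be assembled from two per-view region measurements; the attribute names and values are hypothetical:

```python
# Forming "multi-view" pair features from two suspicious regions (toy values).
import math

def pair_features(region_cc, region_mlo):
    """Combine per-view region measurements into pairwise features."""
    return {
        # Relative position of region centers (offset along nipple axis).
        "center_offset": abs(region_cc["axis_pos"] - region_mlo["axis_pos"]),
        # Ratios of projection length, integrated contrast, and size.
        "length_ratio": region_cc["proj_len"] / region_mlo["proj_len"],
        "contrast_ratio": region_cc["int_contrast"] / region_mlo["int_contrast"],
        "size_ratio": region_cc["area"] / region_mlo["area"],
        # Relative boundary complexity (perimeter vs. equivalent circle).
        "complexity_ratio":
            (region_cc["perimeter"] / (2 * math.sqrt(math.pi * region_cc["area"])))
            / (region_mlo["perimeter"] / (2 * math.sqrt(math.pi * region_mlo["area"]))),
    }

cc = dict(axis_pos=31.0, proj_len=14.0, int_contrast=210.0, area=88.0, perimeter=41.0)
mlo = dict(axis_pos=29.5, proj_len=12.5, int_contrast=190.0, area=80.0, perimeter=37.0)
print(pair_features(cc, mlo))
```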
NASA Astrophysics Data System (ADS)
Chung, Woon-Kwan; Park, Hyong-Hu; Im, In-Chul; Lee, Jae-Seung; Goo, Eun-Hoe; Dong, Kyung-Rae
2012-09-01
This paper proposes a computer-aided diagnosis (CAD) system based on texture feature analysis and statistical wavelet transformation technology to diagnose fatty liver disease with computed tomography (CT) imaging. In the target image, a wavelet transformation was performed for each lesion area to set the region of analysis (ROA, window size: 50 × 50 pixels) and define the texture features of a pixel. Based on the extracted texture feature values, six parameters (average gray level, average contrast, relative smoothness, skewness, uniformity, and entropy) were determined to calculate the recognition rate for a fatty liver. In addition, a multivariate analysis of variance (MANOVA) method was used to perform a discriminant analysis to verify the significance of the extracted texture feature values and the recognition rate for a fatty liver. According to the results, each texture feature value was significant for a comparison of the recognition rate for a fatty liver (p < 0.05). Furthermore, the F-value, which was used as a scale for the difference in recognition rates, was highest for the average gray level, relatively high for the skewness and the entropy, and relatively low for the uniformity, the relative smoothness and the average contrast. The recognition rate for a fatty liver followed the same pattern as the F-value, reaching 100% (average gray level) at the maximum and 80% (average contrast) at the minimum. Therefore, the recognition rate is believed to be a useful clinical value for automatic detection and computer-aided diagnosis (CAD) using texture feature values. Nevertheless, further study on various diseases and individual diseases will be needed in the future.
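The six parameters named above are standard histogram-based texture statistics; a minimal implementation over an ROA window follows (the authors' exact definitions and normalizations may differ slightly):

```python
# Histogram-based texture parameters over a region of analysis (ROA).
import numpy as np

def texture_parameters(roa, levels=256):
    z = np.arange(levels)
    p = np.bincount(roa.ravel(), minlength=levels) / roa.size  # gray-level pmf
    m = float((z * p).sum())                    # average gray level
    var = float(((z - m) ** 2 * p).sum())
    return {
        "average_gray_level": m,
        "average_contrast": np.sqrt(var),       # standard deviation
        "relative_smoothness": 1 - 1 / (1 + var / (levels - 1) ** 2),
        "skewness": float(((z - m) ** 3 * p).sum()) / (var ** 1.5 + 1e-12),
        "uniformity": float((p ** 2).sum()),
        "entropy": float(-(p[p > 0] * np.log2(p[p > 0])).sum()),
    }

roa = np.random.default_rng(0).integers(0, 256, size=(50, 50))  # toy 50x50 ROA
for name, value in texture_parameters(roa).items():
    print(f"{name}: {value:.4f}")
```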
An EEG-Based Person Authentication System with Open-Set Capability Combining Eye Blinking Signals
Wu, Qunjian; Zeng, Ying; Zhang, Chi; Tong, Li; Yan, Bin
2018-01-01
The electroencephalogram (EEG) signal represents a subject’s specific brain activity patterns and is considered as an ideal biometric given its superior forgery prevention. However, the accuracy and stability of the current EEG-based person authentication systems are still unsatisfactory in practical application. In this paper, a multi-task EEG-based person authentication system combining eye blinking is proposed, which can achieve high precision and robustness. Firstly, we design a novel EEG-based biometric evoked paradigm using self- or non-self-face rapid serial visual presentation (RSVP). The designed paradigm could obtain a distinct and stable biometric trait from EEG with a lower time cost. Secondly, the event-related potential (ERP) features and morphological features are extracted from EEG signals and eye blinking signals, respectively. Thirdly, convolutional neural network and back propagation neural network are severally designed to gain the score estimation of EEG features and eye blinking features. Finally, a score fusion technology based on least square method is proposed to get the final estimation score. The performance of multi-task authentication system is improved significantly compared to the system using EEG only, with an increasing average accuracy from 92.4% to 97.6%. Moreover, open-set authentication tests for additional imposters and permanence tests for users are conducted to simulate the practical scenarios, which have never been employed in previous EEG-based person authentication systems. A mean false accepted rate (FAR) of 3.90% and a mean false rejected rate (FRR) of 3.87% are accomplished in open-set authentication tests and permanence tests, respectively, which illustrate the open-set authentication and permanence capability of our systems. PMID:29364848
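The least-squares score fusion step described above can be sketched directly: learn fusion weights on a labeled calibration set, then threshold the fused score (all scores below are synthetic):

```python
# Least-squares fusion of two classifier score streams (toy scores).
import numpy as np

rng = np.random.default_rng(0)
n = 200
labels = rng.integers(0, 2, n).astype(float)    # 1 = genuine user, 0 = imposter
s_eeg = 0.7 * labels + rng.normal(0, 0.3, n)    # toy per-trial EEG scores
s_blink = 0.5 * labels + rng.normal(0, 0.4, n)  # toy eye-blink scores

S = np.column_stack([s_eeg, s_blink, np.ones(n)])   # include a bias term
w, *_ = np.linalg.lstsq(S, labels, rcond=None)       # least-squares fusion weights

fused = S @ w
decisions = fused > 0.5
far = np.mean(decisions[labels == 0])           # false accepted rate
frr = np.mean(~decisions[labels == 1])          # false rejected rate
print(f"weights={np.round(w, 2)}, FAR={far:.3f}, FRR={frr:.3f}")
```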
Bayesian analogy with relational transformations.
Lu, Hongjing; Chen, Dawn; Holyoak, Keith J
2012-07-01
How can humans acquire relational representations that enable analogical inference and other forms of high-level reasoning? Using comparative relations as a model domain, we explore the possibility that bottom-up learning mechanisms applied to objects coded as feature vectors can yield representations of relations sufficient to solve analogy problems. We introduce Bayesian analogy with relational transformations (BART) and apply the model to the task of learning first-order comparative relations (e.g., larger, smaller, fiercer, meeker) from a set of animal pairs. Inputs are coded by vectors of continuous-valued features, based either on human magnitude ratings, normed feature ratings (De Deyne et al., 2008), or outputs of the topics model (Griffiths, Steyvers, & Tenenbaum, 2007). Bootstrapping from empirical priors, the model is able to induce first-order relations represented as probabilistic weight distributions, even when given positive examples only. These learned representations allow classification of novel instantiations of the relations and yield a symbolic distance effect of the sort obtained with both humans and other primates. BART then transforms its learned weight distributions by importance-guided mapping, thereby placing distinct dimensions into correspondence. These transformed representations allow BART to reliably solve 4-term analogies (e.g., larger:smaller::fiercer:meeker), a type of reasoning that is arguably specific to humans. Our results provide a proof-of-concept that structured analogies can be solved with representations induced from unstructured feature vectors by mechanisms that operate in a largely bottom-up fashion. We discuss potential implications for algorithmic and neural models of relational thinking, as well as for the evolution of abstract thought. Copyright 2012 APA, all rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shepard, A; Bednarz, B
Purpose: To develop an ultrasound learning-based tracking algorithm with the potential to provide real-time motion traces of anatomy-based fiducials that may aid in the effective delivery of external beam radiation. Methods: The algorithm was developed in Matlab R2015a and consists of two main stages: reference frame selection, and localized block matching. Immediately following frame acquisition, a normalized cross-correlation (NCC) similarity metric is used to determine a reference frame most similar to the current frame from a series of training set images that were acquired during a pretreatment scan. Segmented features in the reference frame provide the basis for the localized block matching to determine the feature locations in the current frame. The boundary points of the reference frame segmentation are used as the initial locations for the block matching, and NCC is used to find the most similar block in the current frame. The best matched block locations in the current frame comprise the updated feature boundary. The algorithm was tested using five features from two sets of ultrasound patient data obtained from MICCAI 2014 CLUST. Due to the lack of a training set associated with the image sequences, the first 200 frames of the image sets were considered a valid training set for preliminary testing, and tracking was performed over the remaining frames. Results: Tracking of the five vessel features resulted in an average tracking error of 1.21 mm relative to predefined annotations. The average analysis rate was 15.7 FPS, with analysis for one of the two patients reaching real-time speeds. Computations were performed on an i5-3230M at 2.60 GHz. Conclusion: Preliminary tests show tracking errors comparable with similar algorithms at close to real-time speeds. Extension of the work onto a GPU platform has the potential to achieve real-time performance, making tracking for therapy applications a feasible option. This work is partially funded by NIH grant R01CA190298.
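A compact OpenCV sketch of the two-stage scheme: whole-frame NCC for reference-frame selection, then localized NCC block matching around one boundary point. Window sizes and data are illustrative stand-ins, not the authors' settings:

```python
# Two-stage NCC tracking sketch: reference-frame selection + block matching.
import cv2
import numpy as np

rng = np.random.default_rng(0)
training = [rng.uniform(0, 1, (128, 128)).astype(np.float32) for _ in range(5)]
current = training[3] + rng.normal(0, 0.01, (128, 128)).astype(np.float32)

# Stage 1: reference-frame selection by whole-frame NCC.
scores = [cv2.matchTemplate(current, ref, cv2.TM_CCORR_NORMED)[0, 0]
          for ref in training]
ref = training[int(np.argmax(scores))]

# Stage 2: localized block matching for one boundary point (y, x).
y, x, half, search = 64, 64, 8, 12
block = ref[y - half:y + half, x - half:x + half]
region = current[y - half - search:y + half + search,
                 x - half - search:x + half + search]
ncc = cv2.matchTemplate(region, block, cv2.TM_CCORR_NORMED)
dy, dx = np.unravel_index(np.argmax(ncc), ncc.shape)
print("updated point:", (y - search + dy, x - search + dx))
```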
Rough sets and Laplacian score based cost-sensitive feature selection
Yu, Shenglong; Zhao, Hong
2018-01-01
Cost-sensitive feature selection learning is an important preprocessing step in machine learning and data mining. Recently, most existing cost-sensitive feature selection algorithms are heuristic algorithms, which evaluate the importance of each feature individually and select features one by one. Obviously, these algorithms do not consider the relationship among features. In this paper, we propose a new algorithm for minimal cost feature selection called the rough sets and Laplacian score based cost-sensitive feature selection. The importance of each feature is evaluated by both rough sets and Laplacian score. Compared with heuristic algorithms, the proposed algorithm takes into consideration the relationship among features with locality preservation of Laplacian score. We select a feature subset with maximal feature importance and minimal cost when cost is undertaken in parallel, where the cost is given by three different distributions to simulate different applications. Different from existing cost-sensitive feature selection algorithms, our algorithm simultaneously selects out a predetermined number of “good” features. Extensive experimental results show that the approach is efficient and able to effectively obtain the minimum cost subset. In addition, the results of our method are more promising than the results of other cost-sensitive feature selection algorithms. PMID:29912884
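The Laplacian score used above as one of the two importance measures can be computed in a few lines; this sketch omits the rough-sets term and the cost handling, and the neighborhood size and kernel width are assumptions:

```python
# Minimal Laplacian score (smaller score = better locality preservation).
import numpy as np

def laplacian_scores(X, k=5, t=1.0):
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise sq. dists
    W = np.exp(-d2 / t)                                   # heat-kernel weights
    # Keep only k-nearest-neighbor edges (symmetrized).
    nn = np.argsort(d2, axis=1)[:, 1:k + 1]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.arange(n)[:, None], nn] = True
    W *= (mask | mask.T)
    D = np.diag(W.sum(1))
    L = D - W                                             # graph Laplacian
    scores = []
    for f in X.T:
        f_t = f - (f @ W.sum(1)) / W.sum()                # remove D-weighted mean
        scores.append((f_t @ L @ f_t) / (f_t @ D @ f_t))
    return np.array(scores)

X = np.random.default_rng(0).normal(size=(60, 8))
print(np.round(laplacian_scores(X), 3))
```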
Valavanis, Ioannis; Pilalis, Eleftherios; Georgiadis, Panagiotis; Kyrtopoulos, Soterios; Chatziioannou, Aristotelis
2015-01-01
DNA methylation profiling exploits microarray technologies, thus yielding a wealth of high-volume data. Here, an intelligent framework is applied, encompassing epidemiological genome-scale DNA methylation data produced from the Illumina Infinium Human Methylation 450K Bead Chip platform, in an effort to correlate interesting methylation patterns with cancer predisposition and, in particular, breast cancer and B-cell lymphoma. Feature selection and classification are employed in order to select, from an initial set of ~480,000 methylation measurements at CpG sites, predictive cancer epigenetic biomarkers and assess their classification power for discriminating healthy versus cancer-related classes. Feature selection exploits evolutionary algorithms or a graph-theoretic methodology which makes use of the semantics information included in the Gene Ontology (GO) tree. The selected features, corresponding to methylation of CpG sites, attained moderate-to-high classification accuracies when imported to a series of classifiers evaluated by resampling or blindfold validation. The semantics-driven selection revealed sets of CpG sites performing similarly with evolutionary selection in the classification tasks. However, gene enrichment and pathway analysis showed that it additionally provides more descriptive sets of GO terms and KEGG pathways regarding the cancer phenotypes studied here. Results support the expediency of this methodology regarding its application in epidemiological studies. PMID:27600245
Reliability in content analysis: The case of semantic feature norms classification.
Bolognesi, Marianna; Pilgram, Roosmaryn; van den Heerik, Romy
2017-12-01
Semantic feature norms (e.g., STIMULUS: car → RESPONSE:
Contingent attentional capture across multiple feature dimensions in a temporal search task.
Ito, Motohiro; Kawahara, Jun I
2016-01-01
The present study examined whether attention can be flexibly controlled to monitor two different feature dimensions (shape and color) in a temporal search task. Specifically, we investigated the occurrence of contingent attentional capture (i.e., interference from task-relevant distractors) and resulting set reconfiguration (i.e., enhancement of single task-relevant set). If observers can restrict searches to a specific value for each relevant feature dimension independently, the capture and reconfiguration effect should only occur when the single relevant distractor in each dimension appears. Participants identified a target letter surrounded by a non-green square or a non-square green frame. The results revealed contingent attentional capture, as target identification accuracy was lower when the distractor contained a target-defining feature than when it contained a nontarget feature. Resulting set reconfiguration was also obtained in that accuracy was superior when the current target's feature (e.g., shape) corresponded to the defining feature of the present distractor (shape) than when the current target's feature did not match the distractor's feature (color). This enhancement was not due to perceptual priming. The present study demonstrated that the principles of contingent attentional capture and resulting set reconfiguration held even when multiple target feature dimensions were monitored.
Tbahriti, Imad; Chichester, Christine; Lisacek, Frédérique; Ruch, Patrick
2006-06-01
The aim of this study is to investigate the relationships between citations and the scientific argumentation found in abstracts. We design a related article search task and observe how the argumentation can affect the search results. We extracted citation lists from a set of 3200 full-text papers originating from a narrow domain. In parallel, we recovered the corresponding MEDLINE records for analysis of the argumentative moves. Our argumentative model is founded on four classes: PURPOSE, METHODS, RESULTS and CONCLUSION. A Bayesian classifier trained on explicitly structured MEDLINE abstracts generates these argumentative categories. The categories are used to generate four different argumentative indexes. A fifth index contains the complete abstract, together with the title and the list of Medical Subject Headings (MeSH) terms. To appraise the relationship of the moves to the citations, the citation lists were used as the criteria for determining relatedness of articles, establishing a benchmark in which two articles are considered "related" if they share a significant set of co-citations. Our results show that the average precision of queries with the PURPOSE and CONCLUSION features was the highest, while the precision of the RESULTS and METHODS features was relatively low. A linear weighting combination of the moves is proposed, which significantly improves retrieval of related articles.
Analysis of the Westland Data Set
NASA Technical Reports Server (NTRS)
Wen, Fang; Willett, Peter; Deb, Somnath
2001-01-01
The "Westland" set of empirical accelerometer helicopter data with seeded and labeled faults is analyzed with the aim of condition monitoring. The autoregressive (AR) coefficients from a simple linear model encapsulate a great deal of information in a relatively few measurements; and it has also been found that augmentation of these by harmonic and other parameters can improve classification significantly. Several techniques have been explored, among these restricted Coulomb energy (RCE) networks, learning vector quantization (LVQ), Gaussian mixture classifiers and decision trees. A problem with these approaches, and in common with many classification paradigms, is that augmentation of the feature dimension can degrade classification ability. Thus, we also introduce the Bayesian data reduction algorithm (BDRA), which imposes a Dirichlet prior on training data and is thus able to quantify probability of error in an exact manner, such that features may be discarded or coarsened appropriately.
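Extracting AR coefficients as a compact feature vector is a few lines of least squares; a minimal sketch (the model order and the example signal are illustrative, and the harmonic augmentation is not shown):

```python
import numpy as np

def ar_features(x, order=8):
    """Least-squares AR(p) coefficients of a 1-D vibration signal:
    fit x[t] ~ sum_i a_i * x[t - i] and return the a_i as features."""
    X = np.column_stack([x[order - i - 1:len(x) - i - 1] for i in range(order)])
    y = x[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

# One feature vector per accelerometer segment, ready for a classifier
# such as LVQ or a Gaussian mixture classifier.
rng = np.random.default_rng(0)
segment = np.sin(0.2 * np.arange(1024)) + 0.1 * rng.normal(size=1024)
features = ar_features(segment)
```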
Bayesian Analysis of Hmi Images and Comparison to Tsi Variations and MWO Image Observables
NASA Astrophysics Data System (ADS)
Parker, D. G.; Ulrich, R. K.; Beck, J.; Tran, T. V.
2015-12-01
We have previously applied the Bayesian automatic classification system AutoClass to solar magnetogram and intensity images from the 150 Foot Solar Tower at Mount Wilson to identify classes of solar surface features associated with variations in total solar irradiance (TSI) and, using those identifications, modeled TSI time series with improved accuracy (r > 0.96) (Ulrich et al., 2010). AutoClass identifies classes by a two-step process in which it: (1) finds, without human supervision, a set of class definitions based on specified attributes of a sample of the image data pixels, such as magnetic field and intensity in the case of MWO images, and (2) applies the class definitions thus found to new data sets to identify automatically in them the classes found in the sample set. HMI high resolution images capture four observables (magnetic field, continuum intensity, line depth and line width), in contrast to MWO's two observables (magnetic field and intensity). In this study, we apply AutoClass to the HMI observables for images from June 2010 to December 2014 to identify solar surface feature classes. We use contemporaneous TSI measurements to determine whether and how variations in the HMI classes are related to TSI variations and compare the characteristic statistics of the HMI classes to those found from MWO images. We also attempt to derive scale factors between the HMI and MWO magnetic and intensity observables. The ability to categorize automatically surface features in the HMI images holds out the promise of consistent, relatively quick and manageable analysis of the large quantity of data available in these images. Given that the classes found in MWO images using AutoClass have been found to improve modeling of TSI, application of AutoClass to the more complex HMI images should enhance understanding of the physical processes at work in solar surface features and their implications for the solar-terrestrial environment. Ulrich, R.K., Parker, D., Bertello, L. and Boyden, J. 2010, Solar Phys., 261, 11.
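AutoClass is a Bayesian mixture-model classifier; as a rough, hedged analogue of the two-step use described above, a Gaussian mixture can be fit without supervision on a pixel sample and then applied to new images (scikit-learn stands in for AutoClass; the data and component count are placeholders):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Step 1: learn class definitions, unsupervised, from a sample of pixels.
# Columns stand in for the four HMI observables (magnetic field,
# continuum intensity, line depth, line width); the data are placeholders.
sample_pixels = rng.normal(size=(10000, 4))
model = GaussianMixture(n_components=8, covariance_type='full',
                        random_state=0).fit(sample_pixels)

# Step 2: apply the learned definitions to new images,
# assigning every pixel to one of the discovered classes.
new_pixels = rng.normal(size=(5000, 4))
labels = model.predict(new_pixels)
```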
NASA Technical Reports Server (NTRS)
Adler, Robert F.; Curtis, Scott; Huffman, George; Bolvin, David; Nelkin, Eric; Einaudi, Franco (Technical Monitor)
2001-01-01
This paper gives an overview of the analysis of global precipitation over the last few decades and the impact of the new TRMM precipitation observations. The 20+ year, monthly, globally complete precipitation analysis of the World Climate Research Program's (WCRP/GEWEX) Global Precipitation Climatology Project (GPCP) is used to study global and regional variations and trends and is compared to the much shorter TRMM (Tropical Rainfall Measuring Mission) tropical data set. The GPCP data set shows no significant trend in global precipitation over the twenty years, unlike the positive trend in global surface temperatures over the past century. The global trend analysis must be interpreted carefully, however, because the inhomogeneity of the data set makes detecting a small signal very difficult, especially over this relatively short period. The relation of global (and tropical) total precipitation to ENSO events is quantified, with no significant signal when land and ocean are combined. Identifying regional trends in precipitation may be more practical. From 1979 to 2000 the tropics show a pattern of regional rainfall trends that is ENSO-like, with features of both El Nino and La Nina. This feature is related to a possible trend in the frequency of ENSO events (either El Nino or La Nina) over the past 20 years. Monthly anomalies of precipitation are related to ENSO variations, with clear signals extending into middle and high latitudes of both hemispheres. The El Nino and La Nina mean anomalies are near mirror images of each other and, when combined, produce an ENSO signal with significant spatial continuity over large distances. A number of the features are shown to extend into high latitudes. Positive anomalies extend in the Southern Hemisphere (S.H.) from the Pacific southeastward across Chile and Argentina into the south Atlantic Ocean. In the Northern Hemisphere (N.H.) the counterpart feature extends across the southern U.S. and Atlantic Ocean into Europe. Further to the west, a negative anomaly extends southeastward again from the Maritime Continent across the South Pacific and through the Drake Passage. In the Southern Hemisphere an anomaly feature is shown to spiral into the Antarctic land mass. The extremes of ENSO-related anomalies are also examined and indicate that globally, during both El Nino and La Nina, more extremes of precipitation (both wet and dry) occur than during the "neutral" regime, with the El Nino regime showing larger magnitudes. The distribution differs between the globe as a whole and the land areas alone. The recent (1998-present) TRMM observations are compared with the similar period of GPCP analyses, with very good agreement in terms of pattern and generally good agreement in magnitude. However, there are still differences among the individual TRMM products using passive and active microwave techniques, and these need to be resolved before longer-term products such as the GPCP analyses can be validated.
Factors related to HIV-associated neurocognitive impairment differ with age.
Fogel, Gary B; Lamers, Susanna L; Levine, Andrew J; Valdes-Sueiras, Miguel; McGrath, Michael S; Shapshak, Paul; Singer, Elyse J
2015-02-01
Over 50% of HIV-infected (HIV+) persons are expected to be over age 50 by 2015. The pathogenic effects of HIV, particularly in cases of long-term infection, may intersect with those of age-related illnesses and prolonged exposure to combined antiretroviral therapy (cART). One potential outcome is an increased prevalence of neurocognitive impairment in older HIV+ individuals, as well as an altered presentation of HIV-associated neurocognitive disorders (HANDs). In this study, we employed stepwise regression to examine 24 features sometimes associated with HAND in 40 older (55-73 years of age) and 30 younger (32-50 years of age) HIV+, cART-treated participants without significant central nervous system confounds. The features most effective in generating a true assessment of the likelihood of HAND diagnosis differed between the older and younger cohorts: in the younger cohort, features associated with drug abuse were correlated with HAND, whereas in the older cohort, features associated with lipid disorders were mildly associated with HAND. As the HIV-infected population grows and the demographics of the epidemic change, it is increasingly important to re-evaluate features associated with neurocognitive impairment. Here, we have identified features, routinely collected in primary care settings, that provide more accurate diagnostic value than a neurocognitive screening measure among younger and older HIV+ individuals.
Efficient feature selection using a hybrid algorithm for the task of epileptic seizure detection
NASA Astrophysics Data System (ADS)
Lai, Kee Huong; Zainuddin, Zarita; Ong, Pauline
2014-07-01
Feature selection is a very important aspect in the field of machine learning. It entails the search of an optimal subset from a very large data set with high dimensional feature space. Apart from eliminating redundant features and reducing computational cost, a good selection of features also leads to higher prediction and classification accuracy. In this paper, an efficient feature selection technique is introduced in the task of epileptic seizure detection. The raw data are electroencephalography (EEG) signals. Using discrete wavelet transform, the biomedical signals were decomposed into several sets of wavelet coefficients. To reduce the dimension of these wavelet coefficients, a feature selection method that combines the strength of both filter and wrapper methods is proposed. Principal component analysis (PCA) is used as part of the filter method. As for the wrapper method, the evolutionary harmony search (HS) algorithm is employed. This metaheuristic method aims at finding the best discriminating set of features from the original data. The obtained features were then used as input for an automated classifier, namely wavelet neural networks (WNNs). The WNNs model was trained to perform a binary classification task, that is, to determine whether a given EEG signal was normal or epileptic. For comparison purposes, different sets of features were also used as input. Simulation results showed that the WNNs that used the features chosen by the hybrid algorithm achieved the highest overall classification accuracy.
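The front half of that pipeline (wavelet decomposition, summary statistics, PCA filter) can be sketched with PyWavelets and scikit-learn; the wavelet, level, and statistics below are assumptions, and the harmony-search wrapper and WNN classifier are not reproduced:

```python
import numpy as np
import pywt
from sklearn.decomposition import PCA

def wavelet_features(signal, wavelet='db4', level=4):
    """Discrete wavelet decomposition of one EEG segment; simple
    statistics of each coefficient band serve as raw features."""
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    stats = []
    for band in coeffs:
        stats += [band.mean(), band.std(), np.abs(band).max()]
    return np.array(stats)

# Filter stage: PCA reduces the pooled wavelet statistics; the paper's
# harmony-search wrapper and WNN classifier would follow.
rng = np.random.default_rng(0)
X = np.vstack([wavelet_features(rng.normal(size=1024)) for _ in range(100)])
X_reduced = PCA(n_components=5).fit_transform(X)
```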
NASA Astrophysics Data System (ADS)
Shi, Bibo; Hou, Rui; Mazurowski, Maciej A.; Grimm, Lars J.; Ren, Yinhao; Marks, Jeffrey R.; King, Lorraine M.; Maley, Carlo C.; Hwang, E. Shelley; Lo, Joseph Y.
2018-02-01
Purpose: To determine whether domain transfer learning can improve the performance of deep features extracted from digital mammograms using a pre-trained deep convolutional neural network (CNN) in the prediction of occult invasive disease for patients with ductal carcinoma in situ (DCIS) on core needle biopsy. Method: In this study, we collected digital mammography magnification views for 140 patients with DCIS at biopsy, 35 of which were subsequently upstaged to invasive cancer. We utilized a deep CNN model that was pre-trained on two natural image data sets (ImageNet and DTD) and one mammographic data set (INbreast) as the feature extractor, hypothesizing that these data sets are increasingly more similar to our target task and will lead to better representations of deep features to describe DCIS lesions. Through a statistical pooling strategy, three sets of deep features were extracted using the CNNs at different levels of convolutional layers from the lesion areas. A logistic regression classifier was then trained to predict which tumors contain occult invasive disease. The generalization performance was assessed and compared using repeated random sub-sampling validation and receiver operating characteristic (ROC) curve analysis. Result: The best performance of deep features was from the CNN model pre-trained on INbreast, and the proposed classifier using this set of deep features was able to achieve a median classification performance of ROC-AUC equal to 0.75, which is significantly better (p <= 0.05) than the performance of deep features extracted using the ImageNet data set (ROC-AUC = 0.68). Conclusion: Transfer learning is helpful for learning a better representation of deep features, and improves the prediction of occult invasive disease in DCIS.
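The feature-extraction-plus-classifier pattern can be sketched with a stock pre-trained network; everything below (the VGG16 backbone, ImageNet weights, mean pooling, and variable names) is an assumption standing in for the paper's INbreast-pre-trained model:

```python
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression

# Fixed feature extractor: convolutional part of a pre-trained network.
backbone = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).features.eval()

def deep_features(batch):
    """Statistical pooling of a convolutional layer: mean-pool each
    feature map to a single value per channel."""
    with torch.no_grad():
        fmap = backbone(batch)             # (N, C, H, W)
    return fmap.mean(dim=(2, 3)).numpy()   # (N, C)

# With X as lesion patches (N, 3, 224, 224) and y the upstaging labels:
# clf = LogisticRegression(max_iter=1000).fit(deep_features(X), y)
```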
Real-Time Feature Tracking Using Homography
NASA Technical Reports Server (NTRS)
Clouse, Daniel S.; Cheng, Yang; Ansar, Adnan I.; Trotz, David C.; Padgett, Curtis W.
2010-01-01
This software finds feature point correspondences in sequences of images. It is designed for feature matching in aerial imagery. Feature matching is a fundamental step in a number of important image processing operations: calibrating the cameras in a camera array, stabilizing images in aerial movies, geo-registration of images, and generating high-fidelity surface maps from aerial movies. The method uses a Shi-Tomasi corner detector and normalized cross-correlation. This process is likely to result in the production of some mismatches. The feature set is cleaned up using the assumption that there is a large planar patch visible in both images. At high altitude, this assumption is often reasonable. A mathematical transformation, called a homography, is developed that allows us to predict the position in image 2 of any point on the plane in image 1. Any feature pair that is inconsistent with the homography is thrown out. The output of the process is a set of feature pairs, and the homography. The algorithms in this innovation are well known, but the new implementation improves the process in several ways. It runs in real-time at 2 Hz on 64-megapixel imagery. The new Shi-Tomasi corner detector tries to produce the requested number of features by automatically adjusting the minimum distance between found features. The homography-finding code now uses an implementation of the RANSAC algorithm that adjusts the number of iterations automatically to achieve a pre-set probability of missing a set of inliers. The new interface allows the caller to pass in a set of predetermined points in one of the images, which makes it possible to track the same set of points through multiple frames.
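The detect-match-filter pipeline maps directly onto standard OpenCV calls; a hedged sketch (the window size and RANSAC threshold are illustrative, and the real system's windowed search and adaptive RANSAC iteration count are omitted):

```python
import cv2
import numpy as np

def track_features(img1, img2, max_feats=500):
    """Shi-Tomasi corners in img1, NCC template matching into img2,
    then a RANSAC homography to discard pairs off the dominant plane."""
    h, w = img1.shape
    corners = cv2.goodFeaturesToTrack(img1, maxCorners=max_feats,
                                      qualityLevel=0.01, minDistance=10)
    pts1, pts2 = [], []
    for (x, y) in corners.reshape(-1, 2):
        x, y = int(x), int(y)
        if x < 8 or y < 8 or x > w - 9 or y > h - 9:
            continue                       # skip corners too close to the border
        tmpl = img1[y - 8:y + 8, x - 8:x + 8]
        res = cv2.matchTemplate(img2, tmpl, cv2.TM_CCOEFF_NORMED)
        _, _, _, loc = cv2.minMaxLoc(res)  # best NCC response (top-left corner)
        pts1.append([x, y])
        pts2.append([loc[0] + 8, loc[1] + 8])
    pts1, pts2 = np.float32(pts1), np.float32(pts2)
    H, mask = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)
    keep = mask.ravel().astype(bool)       # inliers consistent with the plane
    return pts1[keep], pts2[keep], H
```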
Büssow, Konrad; Hoffmann, Steve; Sievert, Volker
2002-12-19
Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in Fasta or tabulator delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information system (LIMS) with appropriate sequence information.
ERIC Educational Resources Information Center
Eimer, Martin; Kiss, Monika; Nicholas, Susan
2011-01-01
When target-defining features are specified in advance, attentional target selection in visual search is controlled by preparatory top-down task sets. We used ERP measures to study voluntary target selection in the absence of such feature-specific task sets, and to compare it to selection that is guided by advance knowledge about target features.…
NASA Astrophysics Data System (ADS)
Eakins, John P.; Edwards, Jonathan D.; Riley, K. Jonathan; Rosin, Paul L.
2001-01-01
Many different kinds of features have been used as the basis for shape retrieval from image databases. This paper investigates the relative effectiveness of several types of global shape feature, both singly and in combination. The features compared include well-established descriptors such as Fourier coefficients and moment invariants, as well as recently-proposed measures of triangularity and ellipticity. Experiments were conducted within the framework of the ARTISAN shape retrieval system, and retrieval effectiveness assessed on a database of over 10,000 images, using 24 queries and associated ground truth supplied by the UK Patent Office. Our experiments revealed only minor differences in retrieval effectiveness between different measures, suggesting that a wide variety of shape feature combinations can provide adequate discriminating power for effective shape retrieval in multi-component image collections such as trademark registries. Marked differences between measures were observed for some individual queries, suggesting that there could be considerable scope for improving retrieval effectiveness by providing users with an improved framework for searching multi-dimensional feature space.
Classification of large-scale fundus image data sets: a cloud-computing framework.
Roychowdhury, Sohini
2016-08-01
Large medical image data sets with high dimensionality require a substantial amount of computation time for data creation and data processing. This paper presents a novel generalized method that finds optimal image-based feature sets that reduce computational time complexity while maximizing overall classification accuracy for detection of diabetic retinopathy (DR). First, region-based and pixel-based features are extracted from fundus images for classification of DR lesions and vessel-like structures. Next, feature ranking strategies are used to distinguish the optimal classification feature sets. DR lesion and vessel classification accuracies are computed using the boosted decision tree and decision forest classifiers in the Microsoft Azure Machine Learning Studio platform, respectively. For images from the DIARETDB1 data set, 40 of its highest-ranked features are used to classify four DR lesion types with an average classification accuracy of 90.1% in 792 seconds. Also, for classification of red lesion regions and hemorrhages from microaneurysms, accuracies of 85% and 72% are observed, respectively. For images from the STARE data set, 40 high-ranked features can classify minor blood vessels with an accuracy of 83.5% in 326 seconds. Such cloud-based fundus image analysis systems can significantly enhance the borderline classification performances in automated screening systems.
Classification of wet age-related macular degeneration using optical coherence tomographic images
NASA Astrophysics Data System (ADS)
Haq, Anam; Mir, Fouwad Jamil; Yasin, Ubaid Ullah; Khan, Shoab A.
2013-12-01
Wet age-related macular degeneration (wet AMD) is a form of AMD. In order to detect wet AMD we look for pigment epithelium detachment (PED) and the fluid-filled region caused by choroidal neovascularization (CNV). This form of AMD can cause vision loss if not treated in time. In this article we propose an automated system for detection of wet AMD in optical coherence tomographic (OCT) images. The proposed system extracts PED and CNV from OCT images using segmentation and morphological operations, and a detailed feature set is then extracted. These features are passed on to the classifier for classification. Finally, performance measures such as accuracy, sensitivity and specificity are calculated, and the classifier delivering the maximum performance is selected. Our system gives higher performance using SVM as compared to other methods.
Features of Home Environments Associated with Children's School Success.
ERIC Educational Resources Information Center
Martini, Mary
1995-01-01
Examines middle-class child-rearing philosophies and practices and their effect on children's academic success. Suggests that middle-class parenting practices reflect a coherent set of cultural beliefs about the relation of the individual to the group and about the parents' role in bringing children into the group. Suggests that these beliefs…
New Knowledge Derived from Learned Knowledge: Functional-Anatomic Correlates of Stimulus Equivalence
ERIC Educational Resources Information Center
Schlund, Michael W.; Hoehn-Saric, Rudolf; Cataldo, Michael F.
2007-01-01
Forming new knowledge based on knowledge established through prior learning is a central feature of higher cognition that is captured in research on stimulus equivalence (SE). Numerous SE investigations show that reinforcing behavior under control of distinct sets of arbitrary conditional relations gives rise to stimulus control by new, "derived"…
Studying Child Care Subsidies with Secondary Data Sources. Methodological Brief OPRE 2012-54
ERIC Educational Resources Information Center
Ha, Yoonsook; Johnson, Anna D.
2012-01-01
This brief describes four national surveys with data relevant to subsidy-related research and provides a useful set of considerations for subsidy researchers considering use of secondary data. Specifically, this brief describes each of the four datasets reviewed, highlighting unique features of each dataset and providing information on the survey…
Youth Unemployment and Labour Market Transitions in Hungary
ERIC Educational Resources Information Center
Audas, Rick; Berde, Eva; Dolton, Peter
2005-01-01
Unemployment and labour market adjustment have featured prominently in the problems of transitional economies. However, the position of young people and their transitions from school to work in these new market economies has been virtually ignored. This paper examines a new large longitudinal data set relating to young people in Hungary over the…
Boys' Music? School Context and Middle-School Boys' Musical Choices
ERIC Educational Resources Information Center
Bennetts, Kathleen Scott
2013-01-01
This article focusses primarily on the findings relating to the musical participation of boys in one Melbourne school. As part of a project that investigated boys' attitudes and participation at fifty-one schools, several contextual features were identified that set "Balton Boys' High School" apart from other participating schools,…
The Environmental Context of Patient Safety and Medical Errors
ERIC Educational Resources Information Center
Wholey, Douglas; Moscovice, Ira; Hietpas, Terry; Holtzman, Jeremy
2004-01-01
The environmental context of patient safety and medical errors was explored with specific interest in rural settings. Special attention was paid to unique features of rural health care organizations and their environment that relate to the patient safety issue and medical errors (including the distribution of patients, types of adverse events…
Features and Predictors of Problematic Internet Use in Chinese College Students
ERIC Educational Resources Information Center
Huang, R. L.; Lu, Z.; Liu, J. J.; You, Y. M.; Pan, Z. Q.; Wei, Z.; He, Q.; Wang, Z. Z.
2009-01-01
This study was set to investigate the prevalence of problematic internet use (PIU) among college students and the possible factors related to this disorder. About 4400 college students, ranging from freshmen to juniors, from eight different universities in Wuhan, China were surveyed. Young's Diagnostic Questionnaire for Internet Addiction (YDQ)…
Form drag in rivers due to small-scale natural topographic features: 2. Irregular sequences
Kean, J.W.; Smith, J.D.
2006-01-01
The size, shape, and spacing of small-scale topographic features found on the boundaries of natural streams, rivers, and floodplains can be quite variable. Consequently, a procedure for determining the form drag on irregular sequences of different-sized topographic features is essential for calculating near-boundary flows and sediment transport. A method for carrying out such calculations is developed in this paper. This method builds on the work of Kean and Smith (2006), which describes the flow field for the simpler case of a regular sequence of identical topographic features. Both approaches model topographic features as two-dimensional elements with Gaussian-shaped cross sections defined in terms of three parameters. Field measurements of bank topography are used to show that (1) the magnitude of these shape parameters can vary greatly between adjacent topographic features and (2) the variability of these shape parameters follows a lognormal distribution. Simulations using an irregular set of topographic roughness elements show that the drag on an individual element is primarily controlled by the size and shape of the feature immediately upstream and that the spatial average of the boundary shear stress over a large set of randomly ordered elements is relatively insensitive to the sequence of the elements. In addition, a method to transform the topography of irregular surfaces into an equivalently rough surface of regularly spaced, identical topographic elements also is given. The methods described in this paper can be used to improve predictions of flow resistance in rivers as well as quantify bank roughness.
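Because each element is a three-parameter Gaussian and the parameters follow lognormal distributions, an irregular sequence is simple to generate; a sketch with illustrative (not fitted) parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20  # number of topographic elements in the irregular sequence

# Lognormal variability of the three Gaussian shape parameters
# (the means and spreads here are illustrative, not fitted values).
H   = rng.lognormal(np.log(0.10), 0.5, n)   # protrusion height (m)
sig = rng.lognormal(np.log(0.30), 0.5, n)   # streamwise length scale (m)
lam = rng.lognormal(np.log(1.50), 0.5, n)   # spacing between crests (m)

def element(x, x0, height, scale):
    """Gaussian-shaped cross-section of one bank element."""
    return height * np.exp(-((x - x0) ** 2) / (2.0 * scale ** 2))

# Superpose the sequence into a single boundary profile; the form-drag
# calculation on this profile is not reproduced here.
x0 = np.cumsum(lam)
x = np.linspace(0.0, x0[-1], 2000)
profile = sum(element(x, c, h, s) for c, h, s in zip(x0, H, sig))
```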
The effect of feature selection methods on computer-aided detection of masses in mammograms
NASA Astrophysics Data System (ADS)
Hupse, Rianne; Karssemeijer, Nico
2010-05-01
In computer-aided diagnosis (CAD) research, feature selection methods are often used to improve generalization performance of classifiers and shorten computation times. In an application that detects malignant masses in mammograms, we investigated the effect of using a selection criterion that is similar to the final performance measure we are optimizing, namely the mean sensitivity of the system in a predefined range of the free-response receiver operating characteristics (FROC). To obtain the generalization performance of the selected feature subsets, a cross validation procedure was performed on a dataset containing 351 abnormal and 7879 normal regions, each region providing a set of 71 mass features. The same number of noise features, not containing any information, were added to investigate the ability of the feature selection algorithms to distinguish between useful and non-useful features. It was found that significantly higher performances were obtained using feature sets selected by the general test statistic Wilks' lambda than using feature sets selected by the more specific FROC measure. Feature selection leads to better performance when compared to a system in which all features were used.
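Wilks' lambda as a per-feature selection criterion reduces to a within-class to total sum-of-squares ratio; a univariate sketch of the idea (the study's stepwise multivariate selection and the FROC-based criterion are not reproduced):

```python
import numpy as np

def wilks_lambda(f, y):
    """Univariate Wilks' lambda: within-class sum of squares divided by
    total sum of squares; smaller values mean better class separation."""
    total = ((f - f.mean()) ** 2).sum()
    within = sum(((f[y == c] - f[y == c].mean()) ** 2).sum()
                 for c in np.unique(y))
    return within / total

def select_k_best(X, y, k):
    """Rank features by Wilks' lambda and keep the k most discriminative."""
    scores = np.array([wilks_lambda(X[:, j], y) for j in range(X.shape[1])])
    return np.argsort(scores)[:k]
```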
User's manual for the Gaussian windows program
NASA Technical Reports Server (NTRS)
Jaeckel, Louis A.
1992-01-01
'Gaussian Windows' is a method for exploring a set of multivariate data, in order to estimate the shape of the underlying density function. The method can be used to find and describe structural features in the data. The method is described in two earlier papers. I assume that the reader has access to both of these papers, so I will not repeat material from them. The program described herein is written in BASIC and it runs on an IBM PC or PS/2 with the DOS 3.3 operating system. Although the program is slow and has limited memory space, it is adequate for experimenting with the method. Since it is written in BASIC, it is relatively easy to modify. The program and some related files are available on a 3-inch diskette. A listing of the program is also available. This user's manual explains the use of the program. First, it gives a brief tutorial, illustrating some of the program's features with a set of artificial data. Then, it describes the results displayed after the program does a Gaussian window, and it explains each of the items on the various menus.
A Granular Self-Organizing Map for Clustering and Gene Selection in Microarray Data.
Ray, Shubhra Sankar; Ganivada, Avatharam; Pal, Sankar K
2016-09-01
A new granular self-organizing map (GSOM) is developed by integrating the concept of a fuzzy rough set with the SOM. While training the GSOM, the weights of a winning neuron and the neighborhood neurons are updated through a modified learning procedure. The neighborhood is newly defined using the fuzzy rough sets. The clusters (granules) evolved by the GSOM are presented to a decision table as its decision classes. Based on the decision table, a method of gene selection is developed. The effectiveness of the GSOM is shown in both clustering samples and developing an unsupervised fuzzy rough feature selection (UFRFS) method for gene selection in microarray data. While the superior results of the GSOM, as compared with the related clustering methods, are provided in terms of β-index, DB-index, Dunn-index, and fuzzy rough entropy, the genes selected by the UFRFS are not only better in terms of classification accuracy and a feature evaluation index, but also statistically more significant than the related unsupervised methods. The C-codes of the GSOM and UFRFS are available online at http://avatharamg.webs.com/software-code.
Fisher's geometrical model emerges as a property of complex integrated phenotypic networks.
Martin, Guillaume
2014-05-01
Models relating phenotype space to fitness (phenotype-fitness landscapes) have seen important developments recently. They can roughly be divided into mechanistic models (e.g., metabolic networks) and more heuristic models like Fisher's geometrical model. Each has its own drawbacks, but both yield testable predictions on how the context (genomic background or environment) affects the distribution of mutation effects on fitness and thus adaptation. Both have received some empirical validation. This article aims at bridging the gap between these approaches. A derivation of the Fisher model "from first principles" is proposed, where the basic assumptions emerge from a more general model, inspired by mechanistic networks. I start from a general phenotypic network relating unspecified phenotypic traits and fitness. A limited set of qualitative assumptions is then imposed, mostly corresponding to known features of phenotypic networks: a large set of traits is pleiotropically affected by mutations and determines a much smaller set of traits under optimizing selection. Otherwise, the model remains fairly general regarding the phenotypic processes involved or the distribution of mutation effects affecting the network. A statistical treatment and a local approximation close to a fitness optimum yield a landscape that is effectively the isotropic Fisher model or its extension with a single dominant phenotypic direction. The fit of the resulting alternative distributions is illustrated in an empirical data set. These results bear implications on the validity of Fisher's model's assumptions and on which features of mutation fitness effects may vary (or not) across genomic or environmental contexts.
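For concreteness, the isotropic landscape that the network model reduces to near an optimum can be written compactly (the notation below is assumed, not copied from the paper):

```latex
% n traits z under optimizing selection, optimum at the origin:
\[
  m(\mathbf{z}) \;=\; m_{\max} \;-\; \tfrac{1}{2}\,\lVert \mathbf{z} \rVert^{2},
  \qquad \mathbf{z} \in \mathbb{R}^{n}.
\]
% A mutation displacing the phenotype by dz has selection coefficient
\[
  s \;=\; m(\mathbf{z} + d\mathbf{z}) - m(\mathbf{z}),
\]
% so the distribution of s across genomic or environmental contexts follows
% from the distribution of dz and the current position z.
```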
Decomposition and extraction: a new framework for visual classification.
Fang, Yuqiang; Chen, Qiang; Sun, Lin; Dai, Bin; Yan, Shuicheng
2014-08-01
In this paper, we present a novel framework for visual classification based on hierarchical image decomposition and hybrid midlevel feature extraction. Unlike most midlevel feature learning methods, which focus on the process of coding or pooling, we emphasize that the mechanism of image composition also strongly influences the feature extraction. To effectively explore the image content for the feature extraction, we model a multiplicity feature representation mechanism through meaningful hierarchical image decomposition followed by a fusion step. In particular, we first propose a new hierarchical image decomposition approach in which each image is decomposed into a series of hierarchical semantical components, i.e., the structure and texture images. Then, different feature extraction schemes can be adopted to match the decomposed structure and texture processes in a dissociative manner. Here, two schemes are explored to produce property-related feature representations. One is based on a single-stage network over hand-crafted features and the other is based on a multistage network, which can learn features from raw pixels automatically. Finally, those multiple midlevel features are incorporated by solving a multiple kernel learning task. Extensive experiments are conducted on several challenging data sets for visual classification, and experimental results demonstrate the effectiveness of the proposed method.
EEG-based recognition of video-induced emotions: selecting subject-independent feature set.
Kortelainen, Jukka; Seppänen, Tapio
2013-01-01
Emotions are fundamental for everyday life, affecting our communication, learning, perception, and decision making. Incorporating emotions into human-computer interaction (HCI) could be seen as a significant step forward, offering great potential for developing advanced future technologies. Since the electrical activity of the brain is affected by emotions, the electroencephalogram (EEG) offers an interesting channel for improving HCI. In this paper, the selection of a subject-independent feature set for EEG-based emotion recognition is studied. We investigate the effect of different feature sets in classifying a person's arousal and valence while watching videos with emotional content. The classification performance is optimized by applying a sequential forward floating search algorithm for feature selection. The best classification rate (65.1% for arousal and 63.0% for valence) is obtained with a feature set containing power spectral features from the frequency band of 1-32 Hz. The proposed approach substantially improves the classification rate reported in the literature. In future, further analysis of the video-induced EEG changes, including the topographical differences in the spectral features, is needed.
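Power spectral features of the winning kind can be computed per channel with Welch's method; a sketch with an assumed sampling rate and band edges (the floating feature search and the classifier are omitted):

```python
import numpy as np
from scipy.signal import welch

def bandpower_features(eeg, fs=128,
                       bands=((1, 4), (4, 8), (8, 13), (13, 32))):
    """Log band power per channel within 1-32 Hz, computed from a
    Welch power spectral density estimate."""
    feats = []
    for channel in eeg:                      # eeg: (n_channels, n_samples)
        freqs, psd = welch(channel, fs=fs, nperseg=fs * 2)
        for lo, hi in bands:
            mask = (freqs >= lo) & (freqs < hi)
            feats.append(np.log(psd[mask].mean()))
    return np.array(feats)

rng = np.random.default_rng(0)
x = bandpower_features(rng.normal(size=(32, 1280)))   # 32 channels, 10 s
```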
Egmose, Ida; Varni, Giovanna; Cordes, Katharina; Smith-Nielsen, Johanne; Væver, Mette S.; Køppe, Simo; Cohen, David; Chetouani, Mohamed
2017-01-01
Bodily movements are an essential component of social interactions. However, the role of movement in early mother-infant interaction has received little attention in the research literature. The aim of the present study was to investigate the relationship between automatically extracted motion features and interaction quality in mother-infant interactions at 4 and 13 months. The sample consisted of 19 mother-infant dyads at 4 months and 33 mother-infant dyads at 13 months. The coding system Coding Interactive Behavior (CIB) was used for rating the quality of the interactions. Kinetic energy of upper-body, arms and head motion was calculated and used as segmentation in order to extract coarse- and fine-grained motion features. Spearman correlations were conducted between the composites derived from the CIB and the coarse- and fine-grained motion features. At both 4 and 13 months, longer durations of maternal arm motion and infant upper-body motion were associated with more aversive interactions, i.e., more parent-led interactions and more infant negativity. Further, at 4 months, the amount of motion silence was related to more adaptive interactions, i.e., more sensitive and child-led interactions. Analyses of the fine-grained motion features showed that if the mother coordinates her head movements with her infant's head movements, the interaction is rated as more adaptive in terms of less infant negativity and less dyadic negative states. We found more and stronger correlations between the motion features and the interaction qualities at 4 compared to 13 months. These results highlight that motion features are related to the quality of mother-infant interactions. Factors such as infant age and interaction set-up are likely to modify the meaning and importance of different motion features. PMID:29326626
Friberg, Anders; Schoonderwaldt, Erwin; Hedblad, Anton; Fabiani, Marco; Elowsson, Anders
2014-10-01
The notion of perceptual features is introduced for describing general music properties based on human perception. This is an attempt at rethinking the concept of features, aiming to approach the underlying human perception mechanisms. Instead of using concepts from music theory such as tones, pitches, and chords, a set of nine features describing overall properties of the music was selected. They were chosen from qualitative measures used in psychology studies and motivated from an ecological approach. The perceptual features were rated in two listening experiments using two different data sets. They were modeled both from symbolic and audio data using different sets of computational features. Ratings of emotional expression were predicted using the perceptual features. The results indicate that (1) at least some of the perceptual features are reliable estimates; (2) emotion ratings could be predicted by a small combination of perceptual features with an explained variance from 75% to 93% for the emotional dimensions activity and valence; (3) the perceptual features could only to a limited extent be modeled using existing audio features. Results clearly indicated that a small number of dedicated features were superior to a "brute force" model using a large number of general audio features.
Systems and Methods for Correcting Optical Reflectance Measurements
NASA Technical Reports Server (NTRS)
Yang, Ye (Inventor); Shear, Michael A. (Inventor); Soller, Babs R. (Inventor); Soyemi, Olusola O. (Inventor)
2014-01-01
We disclose measurement systems and methods for measuring analytes in target regions of samples that also include features overlying the target regions. The systems include: (a) a light source; (b) a detection system; (c) a set of at least first, second, and third light ports which transmit light from the light source to a sample and receive and direct light reflected from the sample to the detection system, generating a first set of data including information corresponding to both an internal target within the sample and features overlying the internal target, and a second set of data including information corresponding to features overlying the internal target; and (d) a processor configured to remove information characteristic of the overlying features from the first set of data using the first and second sets of data to produce corrected information representing the internal target.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ogden, K; O’Dwyer, R; Bradford, T
Purpose: To reduce differences in features calculated from MRI brain scans acquired at different field strengths with or without Gadolinium contrast. Methods: Brain scans were processed for 111 epilepsy patients to extract hippocampus and thalamus features. Scans were acquired on 1.5 T scanners with Gadolinium contrast (group A), 1.5 T scanners without Gd (group B), and 3.0 T scanners without Gd (group C). A total of 72 features were extracted. Features were extracted from original scans and from scans where the image pixel values were rescaled to the mean of the hippocampi and thalami values. For each data set, cluster analysis was performed on the raw feature set and for feature sets with normalization (conversion to Z scores). Two methods of normalization were used: the first was over all values of a given feature, and the second normalized within the patient group membership. The clustering software was configured to produce 3 clusters. Group fractions in each cluster were calculated. Results: For features calculated from both the non-rescaled and rescaled data, cluster membership was identical for both the non-normalized and normalized data sets. Cluster 1 was comprised entirely of Group A data, Cluster 2 contained data from all three groups, and Cluster 3 contained data from only groups A and B. For the categorically normalized data sets there was a more uniform distribution of group data in the three clusters. A less pronounced effect was seen in the rescaled image data features. Conclusion: Image rescaling and feature renormalization can have a significant effect on the results of clustering analysis. These effects are also likely to influence the results of supervised machine learning algorithms. It may be possible to partly remove the influence of scanner field strength and the presence of Gadolinium-based contrast in feature extraction for radiomics applications.
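The two normalization schemes compared above are one line each; a sketch with placeholder data showing how cluster composition can be checked under global versus within-group z-scoring (all names and sizes are illustrative):

```python
import numpy as np
from scipy.stats import zscore
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(111, 72))           # patients x features (placeholder)
groups = rng.integers(0, 3, size=111)    # acquisition groups A/B/C as 0/1/2

# Global normalization: z-score each feature over all patients.
X_global = zscore(X, axis=0)

# Categorical normalization: z-score each feature within each group.
X_within = np.empty_like(X)
for g in range(3):
    X_within[groups == g] = zscore(X[groups == g], axis=0)

for name, data in (('global', X_global), ('within-group', X_within)):
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(data)
    # Group composition of each cluster reveals scanner/contrast effects.
    for c in range(3):
        print(name, 'cluster', c, np.bincount(groups[labels == c], minlength=3))
```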
A framework for feature extraction from hospital medical data with applications in risk prediction.
Tran, Truyen; Luo, Wei; Phung, Dinh; Gupta, Sunil; Rana, Santu; Kennedy, Richard Lee; Larkins, Ann; Venkatesh, Svetha
2014-12-30
Feature engineering is a time consuming component of predictive modeling. We propose a versatile platform to automatically extract features for risk prediction, based on a pre-defined and extensible entity schema. The extraction is independent of disease type or risk prediction task. We contrast auto-extracted features to baselines generated from the Elixhauser comorbidities. Hospital medical records were transformed to event sequences, to which filters were applied to extract feature sets capturing diversity in temporal scales and data types. The features were evaluated on a readmission prediction task, comparing with baseline feature sets generated from the Elixhauser comorbidities. The prediction model was logistic regression with elastic net regularization. Prediction horizons of 1, 2, 3, 6, 12 months were considered for four diverse diseases: diabetes, COPD, mental disorders and pneumonia, with derivation and validation cohorts defined on non-overlapping data-collection periods. For unplanned readmissions, the auto-extracted feature set using socio-demographic information and medical records outperformed baselines derived from the socio-demographic information and Elixhauser comorbidities, over 20 settings (5 prediction horizons over 4 diseases). In particular, over 30-day prediction, the AUCs are: COPD-baseline: 0.60 (95% CI: 0.57, 0.63), auto-extracted: 0.67 (0.64, 0.70); diabetes-baseline: 0.60 (0.58, 0.63), auto-extracted: 0.67 (0.64, 0.69); mental disorders-baseline: 0.57 (0.54, 0.60), auto-extracted: 0.69 (0.64, 0.70); pneumonia-baseline: 0.61 (0.59, 0.63), auto-extracted: 0.70 (0.67, 0.72). The advantages of auto-extracting standard features from complex medical records, in a disease- and task-agnostic manner, were demonstrated. Auto-extracted features have good predictive power over multiple time horizons. Such feature sets have potential to form the foundation of complex automated analytic tasks.
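The paper's prediction model, elastic-net-regularized logistic regression, is directly available in scikit-learn; a minimal sketch with placeholder data standing in for the auto-extracted feature matrix and readmission labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 200))      # auto-extracted features (placeholder)
y = rng.integers(0, 2, size=500)     # readmission within the horizon

# Elastic net = mixed L1/L2 penalty; the saga solver supports it.
clf = LogisticRegression(penalty='elasticnet', solver='saga',
                         l1_ratio=0.5, C=1.0, max_iter=5000).fit(X, y)
auc = roc_auc_score(y, clf.predict_proba(X)[:, 1])
```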
A Reduced Set of Features for Chronic Kidney Disease Prediction
Misir, Rajesh; Mitra, Malay; Samanta, Ranjit Kumar
2017-01-01
Chronic kidney disease (CKD) is one of the life-threatening diseases. Early detection and proper management are solicited for augmenting survivability. As per the UCI data set, there are 24 attributes for predicting CKD or non-CKD. At least 16 of these attributes require pathological investigations involving more resources, money, time, and uncertainty. The objective of this work is to explore whether we can predict CKD or non-CKD with reasonable accuracy using a smaller number of features. An intelligent system development approach has been used in this study. We applied one important feature selection technique to discover reduced features that explain the data set much better. Two intelligent binary classification techniques have been adopted to validate the reduced feature set. Performance was evaluated in terms of four important classification evaluation parameters. As our results suggest, we may concentrate on those reduced features for identifying CKD, thereby reducing uncertainty, saving time, and reducing costs. PMID:28706750
Ensemble methods with simple features for document zone classification
NASA Astrophysics Data System (ADS)
Obafemi-Ajayi, Tayo; Agam, Gady; Xie, Bingqing
2012-01-01
Document layout analysis is of fundamental importance for document image understanding and information retrieval. It requires the identification of blocks extracted from a document image via features extraction and block classification. In this paper, we focus on the classification of the extracted blocks into five classes: text (machine printed), handwriting, graphics, images, and noise. We propose a new set of features for efficient classifications of these blocks. We present a comparative evaluation of three ensemble based classification algorithms (boosting, bagging, and combined model trees) in addition to other known learning algorithms. Experimental results are demonstrated for a set of 36503 zones extracted from 416 document images which were randomly selected from the tobacco legacy document collection. The results obtained verify the robustness and effectiveness of the proposed set of features in comparison to the commonly used Ocropus recognition features. When used in conjunction with the Ocropus feature set, we further improve the performance of the block classification system to obtain a classification accuracy of 99.21%.
Starbase Data Tables: An ASCII Relational Database for Unix
NASA Astrophysics Data System (ADS)
Roll, John
2011-11-01
Database management is an increasingly important part of astronomical data analysis. Astronomers need easy and convenient ways of storing, editing, filtering, and retrieving data about data. Commercial databases do not provide good solutions for many of the everyday and informal types of database access astronomers need. The Starbase database system with simple data file formatting rules and command line data operators has been created to answer this need. The system includes a complete set of relational and set operators, fast search/index and sorting operators, and many formatting and I/O operators. Special features are included to enhance the usefulness of the database when manipulating astronomical data. The software runs under UNIX, MSDOS and IRAF.
Combined rule extraction and feature elimination in supervised classification.
Liu, Sheng; Patel, Ronak Y; Daga, Pankaj R; Liu, Haining; Fu, Gang; Doerksen, Robert J; Chen, Yixin; Wilkins, Dawn E
2012-09-01
There are a vast number of biology-related research problems that involve combining multiple sources of data to achieve a better understanding of the underlying problem. It is important to select and interpret the most important information from these sources. Thus it will be beneficial to have a good algorithm to simultaneously extract rules and select features for better interpretation of the predictive model. We propose an efficient algorithm, Combined Rule Extraction and Feature Elimination (CRF), based on 1-norm regularized random forests. CRF simultaneously extracts a small number of rules generated by random forests and selects important features. We applied CRF to several drug activity prediction and microarray data sets. CRF is capable of producing performance comparable with state-of-the-art prediction algorithms using a small number of decision rules. Some of the decision rules are biologically significant.
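A simplified, hedged sketch of the CRF idea using scikit-learn: treat each forest leaf as a rule, encode rule activations as binary columns, and let 1-norm regularization keep only a few (the paper's exact formulation and its coupling of rule extraction to feature elimination are not reproduced):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

def rule_activations(forest, X):
    """Binary matrix with one column per leaf: entry (i, j) is 1 when
    sample i falls into leaf j, i.e., satisfies that leaf's rule."""
    leaves = forest.apply(X)                 # (n_samples, n_trees)
    cols = []
    for t in range(leaves.shape[1]):
        for leaf in np.unique(leaves[:, t]):
            cols.append(leaves[:, t] == leaf)
    return np.column_stack(cols).astype(float)

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
y = (X[:, 0] + X[:, 3] > 0).astype(int)      # toy labels

forest = RandomForestClassifier(n_estimators=10, max_depth=3,
                                random_state=0).fit(X, y)
R = rule_activations(forest, X)
# 1-norm regularization keeps only a small set of rules.
sparse = LogisticRegression(penalty='l1', solver='liblinear', C=0.1).fit(R, y)
print('rules kept:', int((sparse.coef_ != 0).sum()))
```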
Different influences on lexical priming for integrative, thematic, and taxonomic relations
Jones, Lara L.; Golonka, Sabrina
2012-01-01
Word pairs may be integrative (i.e., combination of two concepts into one meaningful entity; e.g., fruit—cake), thematically related (i.e., connected in time and place; e.g., party—cake), and/or taxonomically related (i.e., shared features and category co-members; e.g., muffin—cake). Using participant ratings and computational measures, we demonstrated distinct patterns across measures of similarity and co-occurrence, and familiarity for each relational construct in two different item sets. In a standard lexical decision task (LDT) with various delays between prime and target presentation (SOAs), target RTs and priming magnitudes were consistent across the three relations for both item sets. However, across the SOAs, there were distinct patterns among the three relations on some of the underlying measures influencing target word recognition (LSA, Google, and BEAGLE). These distinct patterns suggest different mechanisms of lexical priming and further demonstrate that integrative relations are distinct from thematic and taxonomic relations. PMID:22798950
Arruti, Andoni; Cearreta, Idoia; Álvarez, Aitor; Lazkano, Elena; Sierra, Basilio
2014-01-01
Study of emotions in human–computer interaction is a growing research area. This paper presents an attempt to select the most significant features for emotion recognition in spoken Basque and Spanish using different methods for feature selection. The RekEmozio database was used as the experimental data set. Several Machine Learning paradigms were used for the emotion classification task. Experiments were executed in three phases, using different sets of features as classification variables in each phase. Moreover, feature subset selection was applied at each phase in order to search for the most relevant feature subset. The three-phase approach was selected to check the validity of the proposed approach. The achieved results show that an instance-based learning algorithm using feature subset selection techniques based on evolutionary algorithms is the best Machine Learning paradigm for automatic emotion recognition across all feature sets, obtaining a mean emotion recognition rate of 80.05% in Basque and 74.82% in Spanish. In order to check the goodness of the proposed process, a greedy search approach (FSS-Forward) has been applied and a comparison between them is provided. Based on the achieved results, a set of the most relevant non-speaker-dependent features is proposed for both languages and new perspectives are suggested. PMID:25279686
Classification of ABO3 perovskite solids: a machine learning study
Pilania, G.; Balachandran, P. V.; Gubernatis, J. E.; ...
2015-07-23
Here we explored the use of machine learning methods for classifying whether a particular ABO3 chemistry forms a perovskite or non-perovskite structured solid. Starting with three sets of feature pairs (the tolerance and octahedral factors, the A and B ionic radii relative to the radius of O, and the bond valence distances of the A and B ions from the O atoms), we used machine learning to create a hyper-dimensional partial dependency structure plot using all three feature pairs or any two of them. Doing so increased the accuracy of our predictions by 2–3 percentage points over using any one pair. We also added the Mendeleev numbers of the A and B atoms to this set of feature pairs. Doing this, and using the capabilities of our machine learning algorithm, the gradient tree boosting classifier, enabled us to generate a new type of structure plot that has the simplicity of one based on just the Mendeleev numbers, but with the added advantages of a higher accuracy and a measure of the likelihood of the predicted structure.
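A hedged sketch of this kind of classifier on the first feature pair (placeholder ionic radii and toy labels; the real study used curated radii and experimentally known structures):

    # Gradient tree boosting on the tolerance factor t and octahedral factor mu,
    # computed from Shannon-style ionic radii rA, rB, rO. Radii and labels here
    # are synthetic placeholders.
    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.RandomState(0)
    rA = rng.uniform(0.9, 1.8, 400)          # hypothetical A-site radii (Angstrom)
    rB = rng.uniform(0.4, 1.0, 400)          # hypothetical B-site radii
    rO = 1.40                                # oxygen ionic radius

    t = (rA + rO) / (np.sqrt(2.0) * (rB + rO))   # Goldschmidt tolerance factor
    mu = rB / rO                                 # octahedral factor
    y = ((t > 0.8) & (t < 1.1)).astype(int)      # toy "perovskite vs not" labels

    X = np.column_stack([t, mu])
    clf = GradientBoostingClassifier(random_state=0).fit(X, y)
    print("training accuracy:", clf.score(X, y))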
Estimation of relative effectiveness of phylogenetic programs by machine learning.
Krivozubov, Mikhail; Goebels, Florian; Spirin, Sergei
2014-04-01
Reconstruction of the phylogeny of a protein family from a sequence alignment can produce results of varying quality. Our goal is to predict the quality of phylogeny reconstruction based on features that can be extracted from the input alignment. We used the Fitch-Margoliash (FM) method of phylogeny reconstruction and a random forest as the predictor. For training and testing the predictor, alignments of orthologous series (OS) were used, for which the result of phylogeny reconstruction can be evaluated by comparison with the trees of the corresponding organisms. Our results show that the quality of phylogeny reconstruction can be predicted with more than 80% precision. We also tried to predict which phylogeny reconstruction method, FM or UPGMA, is better for a particular alignment. With the feature set used, among the alignments for which the predictor predicts better performance from UPGMA, 56% really do give a better result with UPGMA. Taking into account that UPGMA performs better for only 34% of the alignments in our testing set, this result shows that it is possible in principle to predict the better phylogeny reconstruction method from features of a sequence alignment.
Automatic extraction of relations between medical concepts in clinical texts
Harabagiu, Sanda; Roberts, Kirk
2011-01-01
Objective: A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records. Materials and methods: A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. Several resources such as Wikipedia, WordNet, General Inquirer, and a relation similarity metric inform the classifier. Results: The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. F1 is defined as 2*Precision*Recall/(Precision+Recall). Alternatively, when concepts and assertions were discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7. Discussion: Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently. Joint classification of relations may further improve the quality of results. Also, joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction. Conclusion: Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. In addition, features based on similarity contribute to a decrease of 1.1% when they are not available. PMID:21846787
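The reported scores can be checked against the stated F1 definition; recomputing from the rounded precision and recall gives approximately 73.6 and 48.4, consistent with the published values once rounding of the inputs is taken into account:

    # Checking the reported scores against the stated F1 definition.
    def f1(precision, recall):
        return 2 * precision * recall / (precision + recall)

    print(f1(72.0, 75.3))   # ~73.6; matches the reported 73.7 up to input rounding
    print(f1(57.6, 41.7))   # ~48.4, the fully automatic setting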
Giant polygons and mounds in the lowlands of Mars: signatures of an ancient ocean?
Oehler, Dorothy Z; Allen, Carlton C
2012-06-01
This paper presents the hypothesis that the well-known giant polygons and bright mounds of the martian lowlands may be related to a common process: a process of fluid expulsion that results from burial of fine-grained sediments beneath a body of water. Specifically, we hypothesize that giant polygons and mounds in Chryse and Acidalia Planitiae are analogous to kilometer-scale polygons and mud volcanoes in terrestrial, marine basins and that the co-occurrence of masses of these features in Chryse and Acidalia may be the signature of sedimentary processes in an ancient martian ocean. We base this hypothesis on recent data from both Earth and Mars. On Earth, 3-D seismic data illustrate kilometer-scale polygons that may be analogous to the giant polygons on Mars. The terrestrial polygons form in fine-grained sediments that have been deposited and buried in passive-margin, marine settings. These polygons are thought to result from compaction/dewatering, and they are commonly associated with fluid expulsion features, such as mud volcanoes. On Mars, in Chryse and Acidalia Planitiae, orbital data demonstrate that giant polygons and mounds have overlapping spatial distributions. There, each set of features occurs within a geological setting that is seemingly analogous to that of the terrestrial, kilometer-scale polygons (broad basin of deposition, predicted fine-grained sediments, and lack of significant horizontal stress). Regionally, the martian polygons and mounds both show a correlation to elevation, as if their formation were related to past water levels. Although these observations are based on older data with incomplete coverage, a similar correlation to elevation has been established in one local area studied in detail with newer higher-resolution data. Further mapping with the latest data sets should more clearly elucidate the relationship(s) of the polygons and mounds to elevation over the entire Chryse-Acidalia region and thereby provide more insight into this hypothesis.
A keyword spotting model using perceptually significant energy features
NASA Astrophysics Data System (ADS)
Umakanthan, Padmalochini
The task of a keyword recognition system is to detect the presence of certain words in a conversation based on the linguistic information present in human speech. Such keyword spotting systems have applications in homeland security, telephone surveillance, and human-computer interfacing. The general procedure of a keyword spotting system involves feature generation and matching. In this work, a new set of features based on the psycho-acoustic masking nature of human speech is proposed. After developing these features, a time-aligned pattern matching process was implemented to locate the keywords in a set of unknown words. A word boundary detection technique based on frame classification using the nonlinear characteristics of speech is also addressed in this work. Validation of this keyword spotting model was done using the widely used cepstral features. The experimental results indicate the viability of using these perceptually significant features as an augmented feature set in keyword spotting.
A random forest model based classification scheme for neonatal amplitude-integrated EEG.
Chen, Weiting; Wang, Yu; Cao, Guitao; Chen, Guoqiang; Gu, Qiufang
2014-01-01
Modern medical advances have greatly increased the survival rate of infants, yet they remain in a higher risk group for neurological problems later in life. For infants with encephalopathy or seizures, identification of the extent of brain injury is clinically challenging. Continuous amplitude-integrated electroencephalography (aEEG) monitoring offers a possibility to directly monitor the brain functional state of newborns over hours, and has seen increasing application in neonatal intensive care units (NICUs). This paper presents a novel combined feature set for aEEG and applies the random forest (RF) method to classify aEEG tracings. To that end, a series of experiments was conducted on 282 aEEG tracing cases (209 normal and 73 abnormal ones). Basic features, statistical features, and segmentation features were extracted from both the tracing as a whole and the segmented recordings, and then combined to form a single feature set. All the features were then sent to a classifier. The significance of each feature, the data segmentation, the optimization of RF parameters, and the problem of imbalanced datasets were examined through experiments. Experiments were also done to evaluate the performance of RF on aEEG signal classification, compared with several other widely used classifiers including SVM-Linear, SVM-RBF, ANN, Decision Tree (DT), Logistic Regression (LR), ML, and LDA. The combined feature set characterizes aEEG signals better than the basic, statistical, and segmentation features taken separately. With the combined feature set, the proposed RF-based aEEG classification system achieved a correct rate of 92.52% and a high F1-score of 95.26%. Among all of the seven classifiers examined in our work, the RF method obtained the highest correct rate, sensitivity, specificity, and F1-score, meaning that RF outperforms all of the other classifiers considered here. The results show that the proposed RF-based aEEG classification system with the combined feature set is efficient and helpful for better detecting brain disorders in newborns.
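A minimal sketch of the combined-feature-set classification (placeholder feature blocks; class_weight='balanced' stands in for the paper's handling of the imbalanced data set):

    # Concatenating the three feature blocks and classifying with a random
    # forest; all feature values here are random placeholders.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score

    rng = np.random.RandomState(0)
    basic = rng.rand(282, 5)                 # placeholder basic features
    stat = rng.rand(282, 8)                  # placeholder statistical features
    seg = rng.rand(282, 6)                   # placeholder segmentation features
    y = rng.randint(0, 2, 282)               # normal vs abnormal tracing

    X = np.hstack([basic, stat, seg])        # the combined feature set
    rf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                                random_state=0)
    print("CV accuracy:", cross_val_score(rf, X, y, cv=5).mean())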
Feature Selection for Classification of Polar Regions Using a Fuzzy Expert System
NASA Technical Reports Server (NTRS)
Penaloza, Mauel A.; Welch, Ronald M.
1996-01-01
Labeling, feature selection, and the choice of classifier are critical elements for classification of scenes and for image understanding. This study examines several methods for feature selection in polar regions, including the use of a fuzzy logic-based expert system for further refinement of a set of selected features. Six Advanced Very High Resolution Radiometer (AVHRR) Local Area Coverage (LAC) arctic scenes are classified into nine classes: water, snow/ice, ice cloud, land, thin stratus, stratus over water, cumulus over water, textured snow over water, and snow-covered mountains. Sixty-seven spectral and textural features are computed and analyzed by the feature selection algorithms. The divergence, histogram analysis, and discriminant analysis approaches are intercompared for their effectiveness in feature selection. The fuzzy expert system method is used not only to determine the effectiveness of each approach in classifying polar scenes, but also to further reduce the features to a more nearly optimal set. For each selection method, features are ranked from best to worst, and the best half of the features are selected. Then, rules using these selected features are defined. The results of running the fuzzy expert system with these rules show that the divergence method produces the best feature set: not only does it yield the highest classification accuracy, it also has the lowest computational requirements. A reduction of the set of features produced by the divergence method using the fuzzy expert system results in an overall classification accuracy of over 95%. However, this increase in accuracy comes at a high computational cost.
Buij, R.; McShea, W.J.; Campbell, P.; Lee, M.E.; Dallmeier, F.; Guimondou, S.; Mackaga, L.; Guisseougou, N.; Mboumba, S.; Hines, J.E.; Nichols, J.D.; Alonso, A.
2007-01-01
The importance of human activity and ecological features in influencing African forest elephant ranging behaviour was investigated in the Rabi-Ndogo corridor of the Gamba Complex of Protected Areas in southwest Gabon. Locations in a wide geographical area with a range of environmental variables were selected for patch-occupancy surveys using elephant dung to assess the seasonal presence and absence of elephants. Patch-occupancy procedures allowed for covariate modelling evaluating hypotheses for both occupancy in relation to human activity and ecological features, and detection probability in relation to vegetation density. The best fitting models for the old and fresh dung data sets indicate that (1) detection probability for elephant dung is negatively related to the relative density of the vegetation, and (2) human activity, such as human presence and infrastructure, is more closely associated with elephant distribution patterns than are ecological features, such as the presence of wetlands and preferred fresh fruit. Our findings emphasize the sensitivity of elephants to human disturbance, in this case infrastructure development associated with gas and oil production. Patch-occupancy methodology offers a viable alternative to current transect protocols for monitoring programs with multiple covariates.
Prediction of Cognitive States During Flight Simulation Using Multimodal Psychophysiological Sensing
NASA Technical Reports Server (NTRS)
Harrivel, Angela R.; Stephens, Chad L.; Milletich, Robert J.; Heinich, Christina M.; Last, Mary Carolyn; Napoli, Nicholas J.; Abraham, Nijo A.; Prinzel, Lawrence J.; Motter, Mark A.; Pope, Alan T.
2017-01-01
The Commercial Aviation Safety Team found that the majority of recent international commercial aviation accidents attributable to loss of control in flight involved flight crew loss of airplane state awareness (ASA), and distraction was involved in all of them. Research on attention-related human performance limiting states (AHPLS), such as channelized attention, diverted attention, startle/surprise, and confirmation bias, has been recommended in a Safety Enhancement (SE) entitled "Training for Attention Management." To accomplish the detection of such cognitive and psychophysiological states, a broad suite of sensors was implemented to simultaneously measure their physiological markers during a high-fidelity flight simulation human subject study. Twenty-four pilot participants were asked to wear the sensors while they performed benchmark tasks and motion-based flight scenarios designed to induce AHPLS. Pattern classification was employed to predict the occurrence of AHPLS during flight simulation also designed to induce those states. Classifier training data were collected during performance of the benchmark tasks. Multimodal classification was performed, using pre-processed electroencephalography, galvanic skin response, electrocardiogram, and respiration signals as input features. Combinations of one, some, or all modalities were used. Extreme gradient boosting, random forest, and two support vector machine classifiers were implemented. The best accuracy for each modality-classifier combination is reported. Results using a select set of features and using the full set of available features are presented. Further, results are presented for training one classifier with the combined features and for training multiple classifiers with features from each modality separately. Using the select set of features and combined training, multistate prediction accuracy averaged 0.64 +/- 0.14 across thirteen participants and was significantly higher than that for the separate training case. These results support the goal of demonstrating simultaneous real-time classification of multiple states using multiple sensing modalities in high-fidelity flight simulators. This detection is intended to support and inform training methods under development to mitigate the loss of ASA and thus reduce accidents and incidents.
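A compact sketch of the combined-versus-separate training comparison (placeholder feature blocks for the four modalities; the study's actual classifiers and preprocessing differ):

    # Per-modality classifiers vs. one classifier on concatenated features;
    # the feature blocks and labels are random placeholders.
    import numpy as np
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    rng = np.random.RandomState(0)
    modalities = {m: rng.rand(240, 10) for m in ["eeg", "gsr", "ecg", "resp"]}
    y = rng.randint(0, 4, 240)                 # four attention-related states

    for name, X in modalities.items():         # separate training per modality
        print(name, cross_val_score(SVC(), X, y, cv=5).mean())

    X_all = np.hstack(list(modalities.values()))   # combined training
    print("combined", cross_val_score(SVC(), X_all, y, cv=5).mean())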
Jochumsen, Mads; Rovsing, Cecilie; Rovsing, Helene; Niazi, Imran Khan; Dremstrup, Kim; Kamavuako, Ernest Nlandu
2017-01-01
Detection of single-trial movement intentions from EEG is paramount for brain-computer interfacing in neurorehabilitation. These movement intentions contain task-related information, and if this is decoded, neurorehabilitation could potentially be optimized. The aim of this study was to classify single-trial movement intentions associated with two levels of force and speed and three different grasp types, using EEG rhythms and components of the movement-related cortical potential (MRCP) as features. The feature importance was used to estimate the encoding of discriminative information. Two data sets were used: 29 healthy subjects executed and imagined different hand movements while EEG was recorded over the contralateral sensorimotor cortex. The following features were extracted: delta, theta, mu/alpha, beta, and gamma rhythms, and the readiness potential, negative slope, and motor potential of the MRCP. Sequential forward selection was performed, and classification was performed using linear discriminant analysis and support vector machines. Limited classification accuracies were obtained from the EEG rhythms and MRCP components: 0.48 ± 0.05 (grasp types), 0.41 ± 0.07 (kinetic profiles, motor execution), and 0.39 ± 0.08 (kinetic profiles, motor imagination). Delta activity contributed the most, but all features provided discriminative information. These findings suggest that information from the entire EEG spectrum is needed to discriminate between task-related parameters from single-trial movement intentions.
Wu, Yu-Tzu; Nash, Paul; Barnes, Linda E; Minett, Thais; Matthews, Fiona E; Jones, Andy; Brayne, Carol
2014-10-22
An association between depressive symptoms and features of the built environment has been reported in the literature. A remaining research challenge is the development of methods to efficiently capture pertinent environmental features in relevant study settings. Visual streetscape images have been used to replace traditional physical audits and directly observe the built environment of communities. The aim of this work is to examine the inter-method reliability of the two audit methods for assessing community environments, with a specific focus on physical features related to mental health. Forty-eight postcodes in urban and rural areas of Cambridgeshire, England were randomly selected from an alphabetical list of streets hosted on a UK property website. The assessment was conducted in July and August 2012 by both physical and visual image audits based on the items in the Residential Environment Assessment Tool (REAT), an observational instrument targeting the micro-scale environmental features related to mental health in UK postcodes. The assessor used Google Street View images and virtually "walked through" the streets to conduct the property- and street-level assessments. Gwet's AC1 coefficients and Bland-Altman plots were used to compare the concordance of the two audits. The results of conducting the REAT by visual image audit generally correspond to those of direct observation. More variation was found in property-level items regarding physical incivilities, with broad limits of agreement, which importantly leads to most of the variation in the overall REAT score. Postcodes in urban areas had lower consistency between the two methods than rural areas. Google Street View has the potential to assess environmental features related to mental health with fair reliability and provides a less resource-intense method of assessing community environments than physical audits.
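For two raters and a binary item, Gwet's AC1 can be computed as below (a standard formulation, stated from memory; the study's own computation may differ in detail, and the counts are hypothetical):

    # Gwet's AC1 for two raters and a binary item. a, b, c, d are the cells of
    # the 2x2 agreement table: both-yes, rater1-only, rater2-only, both-no.
    def gwet_ac1(a, b, c, d):
        n = a + b + c + d
        pa = (a + d) / n                      # observed agreement
        pi = ((a + b) + (a + c)) / (2 * n)    # mean "yes" proportion across raters
        pe = 2 * pi * (1 - pi)                # chance agreement under AC1
        return (pa - pe) / (1 - pe)

    print(gwet_ac1(30, 4, 5, 9))              # hypothetical counts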
Seguin, Rebecca A; Morgan, Emily H; Connor, Leah M; Garner, Jennifer A; King, Abby C; Sheats, Jylana L; Winter, Sandra J; Buman, Matthew P
2015-07-02
A community's built environment can influence health behaviors. Rural populations experience significant health disparities, yet built environment studies in these settings are limited. We used an electronic tablet-based community assessment tool to conduct built environment audits in rural settings. The primary objective of this qualitative study was to evaluate the usefulness of the tool in identifying barriers and facilitators to healthy eating and active living. The second objective was to understand resident perspectives on community features and opportunities for improvement. Participants were recruited from 4 rural communities in New York State. Using the tool, participants completed 2 audits, which consisted of taking pictures and recording audio narratives about community features perceived as assets or barriers to healthy eating and active living. Follow-up focus groups explored the audit experience, the data captured, and opportunities for change. Twenty-four adults (mean age, 69.4 y; standard deviation, 13.2 y), 6 per community, participated in the study. The features most frequently captured for active living related to roads, sidewalks, and walkable destinations. Restaurants, nontraditional food stores, and supermarkets were identified in the food environment in relation to the cost, quality, and selection of healthy foods available. In general, participants found the assessment tool simple and enjoyable to use. An electronic tablet-based tool can be used to assess rural food and physical activity environments and may be useful in identifying and prioritizing resident-led change initiatives. This resident-led assessment approach may also be helpful for informing and evaluating rural community-based interventions.
McEwan, Desmond; Harden, Samantha M; Zumbo, Bruno D; Sylvester, Benjamin D; Kaulius, Megan; Ruissen, Geralyn R; Dowd, A Justine; Beauchamp, Mark R
2016-01-01
Drawing from goal setting theory (Latham & Locke, 1991; Locke & Latham, 2002; Locke et al., 1981), the purpose of this study was to conduct a systematic review and meta-analysis of multi-component goal setting interventions for changing physical activity (PA) behaviour. A literature search returned 41,038 potential articles. Included studies consisted of controlled experimental trials wherein participants in the intervention conditions set PA goals and their PA behaviour was compared to participants in a control group who did not set goals. A meta-analysis was ultimately carried out across 45 articles (comprising 52 interventions, 126 effect sizes, n = 5912) that met eligibility criteria using a random-effects model. Overall, a medium, positive effect (Cohen's d(SE) = .552(.06), 95% CI = .43-.67, Z = 9.03, p < .001) of goal setting interventions in relation to PA behaviour was found. Moderator analyses across 20 variables revealed several noteworthy results with regard to features of the study, sample characteristics, PA goal content, and additional goal-related behaviour change techniques. In conclusion, multi-component goal setting interventions represent an effective method of fostering PA across a diverse range of populations and settings. Implications for effective goal setting interventions are discussed.
Testing Product Generation in Software Product Lines Using Pairwise for Features Coverage
NASA Astrophysics Data System (ADS)
Pérez Lamancha, Beatriz; Polo Usaola, Macario
A Software Product Line (SPL) is "a set of software-intensive systems sharing a common, managed set of features that satisfy the specific needs of a particular market segment or mission and that are developed from a common set of core assets in a prescribed way". Variability is a central concept that permits the generation of different products of the family by reusing core assets. It is captured through features which, for a SPL, define its scope. Features are represented in a feature model, which is later used to generate the products from the line. From the testing point of view, testing all the possible combinations in feature models is not practical because (1) the number of possible combinations (i.e., combinations of features for composing products) may be intractable, and (2) some combinations may contain incompatible features. Thus, this paper addresses the problem by implementing combinatorial testing techniques adapted to the SPL context.
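A minimal greedy sketch of pairwise coverage over a toy feature model (illustrative only; feature names are invented and incompatibility constraints are omitted for brevity):

    # Greedy pairwise (2-way) coverage: keep adding the concrete product that
    # covers the most not-yet-covered pairs of feature values.
    from itertools import combinations, product

    features = {"os": ["linux", "win"], "db": ["mysql", "pg", "none"],
                "gui": ["web", "cli"]}
    names = sorted(features)
    all_products = list(product(*(features[n] for n in names)))

    def pairs_of(prod):
        # all value pairs (over distinct features) exercised by one product
        assign = list(zip(names, prod))
        return {frozenset(p) for p in combinations(assign, 2)}

    uncovered = set()
    for prod in all_products:
        uncovered |= pairs_of(prod)

    suite = []
    while uncovered:
        best = max(all_products, key=lambda prod: len(pairs_of(prod) & uncovered))
        suite.append(dict(zip(names, best)))
        uncovered -= pairs_of(best)

    print(len(suite), "products cover every pair of feature values")
    for p in suite:
        print(p)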
High-level intuitive features (HLIFs) for intuitive skin lesion description.
Amelard, Robert; Glaister, Jeffrey; Wong, Alexander; Clausi, David A
2015-03-01
A set of high-level intuitive features (HLIFs) is proposed to quantitatively describe melanoma in standard camera images. Melanoma is the deadliest form of skin cancer. With rising incidence rates and subjectivity in current clinical detection methods, there is a need for melanoma decision support systems. Feature extraction is a critical step in melanoma decision support systems. Existing feature sets for analyzing standard camera images are composed of low-level features, which exist in high-dimensional feature spaces and limit the system's ability to convey intuitive diagnostic rationale. The proposed HLIFs were designed to model the ABCD criteria commonly used by dermatologists such that each HLIF represents a human-observable characteristic. As such, intuitive diagnostic rationale can be conveyed to the user. Experimental results show that concatenating the proposed HLIFs with a full low-level feature set increased classification accuracy, and that HLIFs were able to separate the data better than low-level features with statistical significance. An example of a graphical interface for providing intuitive rationale is given.
Examining classroom interactions related to difference in students' science achievement
NASA Astrophysics Data System (ADS)
Zady, Madelon F.; Portes, Pedro R.; Ochs, V. Dan
2003-01-01
The current study examines the cognitive supports that underlie achievement in science by using a cultural historical framework (L. S. Vygotsky (1934/1986), Thought and Language, MIT Press, Cambridge, MA) and the activity setting (AS) construct (R. G. Tharp & R. Gallimore (1988), Rousing minds to life: Teaching, learning and schooling in social context, Cambridge University Press, Cambridge, MA) with its five features: personnel, motivations, scripts, task demands, and beliefs. Observations were made of the classrooms of seventh-grade science students, 32 of whom had participated in a prior achievement-related parent-child interaction or home study (P. R. Portes, M. F. Zady, & R. M. Dunham (1998), Journal of Genetic Psychology, 159, 163-178). The results of a quantitative analysis of classroom interaction highlighted two features of the AS: personnel and scripts. The qualitative field analysis generated four emergent phenomena related to the features of the AS that appeared to influence student opportunity for conceptual development. The emergent phenomena were science activities, the building of learning, meaning in lessons, and the conflict over control. Lastly, the results of the two-part classroom study were compared to those of the home science AS of high and low achievers. Mismatches in the AS features in the science classroom may constrain the opportunity to learn. Educational implications are discussed.
Stoecker, William V.; Gupta, Kapil; Stanley, R. Joe; Moss, Randy H.; Shrestha, Bijaya
2011-01-01
Background: Dermoscopy, also known as dermatoscopy or epiluminescence microscopy (ELM), is a non-invasive, in vivo technique which permits visualization of features of pigmented melanocytic neoplasms that are not discernible by examination with the naked eye. One prominent feature useful for melanoma detection in dermoscopy images is the asymmetric blotch (asymmetric structureless area). Method: Using both relative and absolute colors, blotches are detected in this research automatically by using thresholds in the red and green color planes. Several blotch indices are computed, including the scaled distance between the largest blotch centroid and the lesion centroid, the ratio of total blotch area to lesion area, the ratio of largest blotch area to lesion area, the total number of blotches, the size of the largest blotch, and the irregularity of the largest blotch. Results: The effectiveness of the absolute and relative color blotch features was examined for melanoma/benign lesion discrimination over a dermoscopy image set containing 165 melanomas (151 invasive melanomas and 14 melanomas in situ) and 347 benign lesions (124 nevocellular nevi without dysplasia and 223 dysplastic nevi) using a leave-one-out neural network approach. Receiver operating characteristic curve results are shown, highlighting the sensitivity and specificity of melanoma detection. Statistical analysis of the blotch features is also presented. Conclusion: Neural network and statistical analysis showed that the blotch detection method was somewhat more effective using relative color than using absolute color. The relative-color blotch detection method gave a diagnostic accuracy of about 77%. PMID:15998328
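A rough sketch of threshold-based blotch detection and a few of the listed indices (the thresholds are hypothetical, not the paper's trained values; a random array stands in for a real RGB image and lesion mask):

    # Dark-blotch detection via red/green thresholds, then simple indices
    # (blotch count, area ratios, scaled centroid distance).
    import numpy as np
    from scipy import ndimage

    rng = np.random.RandomState(0)
    img = rng.rand(128, 128, 3)                 # placeholder RGB image in [0,1]
    lesion = np.ones((128, 128), dtype=bool)    # placeholder lesion mask

    blotch = lesion & (img[..., 0] < 0.45) & (img[..., 1] < 0.35)  # dark in R and G

    labels, n = ndimage.label(blotch)
    if n:
        areas = ndimage.sum(blotch, labels, index=range(1, n + 1))
        largest = int(np.argmax(areas)) + 1
        c_blotch = np.array(ndimage.center_of_mass(blotch, labels, largest))
        c_lesion = np.array(ndimage.center_of_mass(lesion))
        scale = np.sqrt(lesion.sum())           # lesion size scale
        print("number of blotches:", n)
        print("total blotch area / lesion area:", areas.sum() / lesion.sum())
        print("largest blotch area / lesion area:", areas.max() / lesion.sum())
        print("scaled centroid distance:",
              np.linalg.norm(c_blotch - c_lesion) / scale)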
Cui, Licong; Bodenreider, Olivier; Shi, Jay; Zhang, Guo-Qiang
2018-02-01
We introduce a structural-lexical approach for auditing SNOMED CT using a combination of non-lattice subgraphs of the underlying hierarchical relations and enriched lexical attributes of fully specified concept names. Our goal is to develop a scalable and effective approach that automatically identifies missing hierarchical IS-A relations. Our approach involves 3 stages. In stage 1, all non-lattice subgraphs of SNOMED CT's IS-A hierarchical relations are extracted. In stage 2, lexical attributes of fully specified concept names in such non-lattice subgraphs are extracted. For each concept in a non-lattice subgraph, we enrich its set of attributes with attributes from its ancestor concepts within the non-lattice subgraph. In stage 3, subset inclusion relations between the lexical attribute sets of each pair of concepts in each non-lattice subgraph are compared to existing IS-A relations in SNOMED CT. For concept pairs within each non-lattice subgraph, if a subset relation is identified but an IS-A relation is not present in the SNOMED CT IS-A transitive closure, then a missing IS-A relation is reported. The September 2017 release of SNOMED CT (US edition) was used in this investigation. A total of 14,380 non-lattice subgraphs were extracted, from which we suggested a total of 41,357 missing IS-A relations. For evaluation purposes, 200 non-lattice subgraphs were randomly selected from 996 smaller subgraphs (of size 4, 5, or 6) within the "Clinical Finding" and "Procedure" sub-hierarchies. Two domain experts confirmed 185 (among 223) suggested missing IS-A relations, a precision of 82.96%. Our results demonstrate that analyzing the lexical features of concepts in non-lattice subgraphs is an effective approach for auditing SNOMED CT.
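A toy sketch of the stage-3 check (illustrative concept names and attribute sets, not SNOMED CT data): a concept whose enriched attribute set strictly contains another's, without a corresponding IS-A in the transitive closure, yields a candidate missing relation:

    # Subset-inclusion check within one non-lattice subgraph; the concept
    # names, attribute sets, and closure below are invented for illustration.
    attrs = {
        "appendicitis": {"appendix", "inflammation"},
        "acute appendicitis": {"appendix", "inflammation", "acute"},
        "acute disease": {"acute"},
    }
    isa_closure = {("acute appendicitis", "acute disease")}  # existing IS-A pairs

    for child in attrs:
        for parent in attrs:
            # strict superset of attributes = lexically more specific concept
            if child != parent and attrs[child] > attrs[parent]:
                if (child, parent) not in isa_closure:
                    print(f"candidate missing IS-A: {child} -> {parent}")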
NASA Astrophysics Data System (ADS)
Savastano, Vítor Lamy Mesiano; Schmitt, Renata da Silva; Araújo, Mário Neto Cavalcanti de; Inocêncio, Leonardo Campos
2017-01-01
High-resolution drone-supported mapping and traditional field work were used to refine the hierarchy and kinematics of rift-related faults in the basement rocks and Early Cretaceous mafic dikes onshore of the Campos Basin, SE-Brazil. Two sets of structures were identified. The most significant fault set is NE-SW oriented with predominantly normal displacement. At mesoscale, this fault set is arranged in a rhombic pattern, interpreted here as a breached relay ramp system. The rhombic pattern is a penetrative fabric from the thin-section to regional scale. The second-order set of structures is an E-W/ESE-WNW system of normal faults with sinistral component. These E-W structures are oriented parallel with regional intrabasinal transfer zones associated with the earliest stages of Campos Basin's rift system. The crosscutting relationship between the two fault sets and tholeiitic dikes implies that the NE-SW fault set is the older feature, but remained active until the final stages of rifting in this region as the second-order fault set is older than the tholeiitic dikes. Paleostresses estimated from fault slip inversion method indicated that extension was originally NW-SE, with formation of the E-W transfer, followed by ESE-WNW oblique opening associated with a relay ramp system and related accommodation zones.
Treatment of sleep-disordered breathing with positive airway pressure devices: technology update.
Johnson, Karin Gardner; Johnson, Douglas Clark
2015-01-01
Many types of positive airway pressure (PAP) devices are used to treat sleep-disordered breathing including obstructive sleep apnea, central sleep apnea, and sleep-related hypoventilation. These include continuous PAP, autoadjusting CPAP, bilevel PAP, adaptive servoventilation, and volume-assured pressure support. Noninvasive PAP has significant leak by design, which these devices adjust for in different manners. Algorithms to provide pressure, detect events, and respond to events vary greatly between the types of devices, and vary among the same category between companies and different models by the same company. Many devices include features designed to improve effectiveness and patient comfort. Data collection systems can track compliance, pressure, leak, and efficacy. Understanding how each device works allows the clinician to better select the best device and settings for a given patient. This paper reviews PAP devices, including their algorithms, settings, and features.
Response comment: Carbon sequestration on Mars
Edwards, Christopher; Ehlmann, Bethany L.
2016-01-01
Martian atmospheric pressure has important implications for the past and present habitability of the planet, including the timing and causes of environmental change. The ancient Martian surface is strewn with evidence for early water bound in minerals (e.g., Ehlmann and Edwards, 2014) and recorded in surface features such as large catastrophically created outflow channels (e.g., Carr, 1979), valley networks (Hynek et al., 2010; Irwin et al., 2005), and crater lakes (e.g., Fassett and Head, 2008). Using orbital spectral data sets coupled with geologic maps and a set of numerical spectral analysis models, Edwards and Ehlmann (2015) constrained the amount of atmospheric sequestration in early Martian rocks and found that the majority of this sequestration occurred prior to the formation of the early Hesperian/late Noachian valley networks (Fassett and Head, 2011; Hynek et al., 2010), thus implying the atmosphere was already thin by the time these surface-water-related features were formed.
Using Gaussian windows to explore a multivariate data set
NASA Technical Reports Server (NTRS)
Jaeckel, Louis A.
1991-01-01
In an earlier paper, I recounted an exploratory analysis, using Gaussian windows, of a data set derived from the Infrared Astronomical Satellite. Here, my goals are to develop strategies for finding structural features in a data set in a many-dimensional space, and to find ways to describe the shape of such a data set. After a brief review of Gaussian windows, I describe the current implementation of the method. I give some ways of describing features that we might find in the data, such as clusters and saddle points, and also extended structures such as a 'bar', which is an essentially one-dimensional concentration of data points. I then define a distance function, which I use to determine which data points are 'associated' with a feature. Data points not associated with any feature are called 'outliers'. I then explore the data set, giving the strategies that I used and quantitative descriptions of the features that I found, including clusters, bars, and a saddle point. I tried to use strategies and procedures that could, in principle, be used in any number of dimensions.
ERIC Educational Resources Information Center
Laboratory Design Notes, 1966
1966-01-01
A collection of laboratory design notes to set forth minimum criteria required in the design of basic medical research laboratory buildings. Recommendations contained are primarily concerned with features of design which affect quality of performance and future flexibility of facility systems. Subjects of economy and safety are discussed where…
Education as a Resource of the Information Society
ERIC Educational Resources Information Center
Zborovskii, G. E.; Shuklina, E. A.
2007-01-01
The literature on sociology in this country in the past few years has featured vigorous discussion of matters relating to the modernization of education in Russia. It is the authors' opinion that the examination of this set of problems has been limited primarily to the present day, and not enough attention is being given to the emergence of the…
Participation Patterns among Families Receiving Part C Early Intervention Services
ERIC Educational Resources Information Center
Khetani, Mary Alunkal
2010-01-01
Participation in the natural settings of home and community is one of four major goals for families receiving Part C early intervention services. While participation has been formally recognized as an important service-related outcome, there is a need to build knowledge about its key features to adequately apply the concept in practice. The need…
Building a Framework to Study the Hetero Norm in Praxis
ERIC Educational Resources Information Center
Lundin, Mattias
2011-01-01
To improve equality in schools and to facilitate the identification of oppressive features of the classroom, a framework to indicate the heterosexual norm and its consequences is needed. The purpose of this paper is to construct this framework through a review of literature focusing on the school setting and texts related to equality in Swedish…
Geospatial Analytics in Retail Site Selection and Sales Prediction.
Ting, Choo-Yee; Ho, Chiung Ching; Yee, Hui Jia; Matsah, Wan Razali
2018-03-01
Studies have shown that certain features from geography, demography, trade area, and environment can play a vital role in retail site selection, largely due to the impact they exert on retail performance. Although the relevant features could be elicited by domain experts, determining the optimal feature set can be an intractable and labor-intensive exercise. The challenges center around (1) how to determine the features that are important to a particular retail business and (2) how to estimate retail sales performance given a new location. The challenges become apparent when the features vary across time. In this light, this study proposed a nonintervening approach, employing feature selection algorithms and subsequently predicting sales through similarity-based methods. The results of the prediction were validated by domain experts. In this study, data sets from different sources were transformed and aggregated before an analytics data set ready for analysis could be obtained. The data sets included data about feature location, population count, property type, education status, and monthly sales from 96 branches of a telecommunication company in Malaysia. The findings suggested that (1) optimal retail performance can only be achieved through fulfillment of specific location features together with the surrounding trade area characteristics and (2) similarity-based methods can provide a solution to retail sales prediction.
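A hedged sketch of the two-step idea (placeholder data; the study's actual selection algorithms and similarity measure are not specified here): filter features by association with sales, then predict a candidate site from its most similar existing branches:

    # Feature selection followed by similarity-based (nearest-neighbor) sales
    # prediction; site features and sales are synthetic placeholders.
    import numpy as np
    from sklearn.feature_selection import SelectKBest, f_regression
    from sklearn.neighbors import KNeighborsRegressor
    from sklearn.preprocessing import StandardScaler

    rng = np.random.RandomState(0)
    X = rng.rand(96, 12)                  # site features for 96 existing branches
    sales = X[:, 0] * 100 + rng.rand(96)  # toy monthly sales

    selector = SelectKBest(f_regression, k=4).fit(X, sales)
    Xs = StandardScaler().fit_transform(selector.transform(X))

    knn = KNeighborsRegressor(n_neighbors=5, weights="distance").fit(Xs, sales)
    new_site = Xs[:1]                     # stand-in for a candidate location
    print("predicted sales:", knn.predict(new_site)[0])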
Integration of SAR and DEM data: Geometrical considerations
NASA Technical Reports Server (NTRS)
Kropatsch, Walter G.
1991-01-01
General principles for integrating data from different sources are derived from the experience of registering SAR images with digital elevation model (DEM) data. The integration consists of establishing geometrical relations between the data sets that allow us to accumulate information from both data sets for any given object point (e.g., elevation, slope, backscatter of ground cover, etc.). Since the geometries of the two data sets are completely different, they cannot be compared on a pixel-by-pixel basis. The presented approach detects instances of higher-level features in both data sets independently and performs the matching at the high level. Besides the efficiency of this general strategy, it further allows the integration of additional knowledge sources: world knowledge and sensor characteristics are also useful sources of information. The SAR features layover and shadow can be detected easily in SAR images. An analytical method to find such regions in a DEM additionally needs the parameters of the flight path of the SAR sensor and the range projection model. The generation of the SAR layover and shadow maps is summarized and new extensions to this method are proposed.
Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter
2016-02-16
The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having beta-glucosidase activity and uses thereof
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having swollenin activity and uses thereof
Schoonneveld-Bergmans, Margot Elizabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius
2015-11-04
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having beta-glucosidase activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius
2015-09-01
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having cellobiohydrolase activity and uses thereof
Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter
2015-09-15
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having acetyl xylan esterase activity and uses thereof
Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter
2015-10-20
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having carbohydrate degrading activity and uses thereof
Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius
2015-08-18
The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tamburrini, G.; Termini, S.
1982-01-01
The general thesis underlying the present paper is that there are very strong methodological relations among cybernetics, system science, artificial intelligence, fuzzy sets, and many other related fields. Then, in order to better understand both the achievements and the weak points of all these disciplines, one should look for common features through which to view them in this general frame. We present a brief analysis of the initial program of cybernetics as a case study useful for developing this thesis. Among the points discussed are the problems of interdisciplinarity and of the unity of cybernetics. Some implications of this analysis for a new reading of general system theory and fuzzy sets are briefly outlined at the end of the paper.
Douma, Johanna G; Volkers, Karin M; Engels, Gwenda; Sonneveld, Marieke H; Goossens, Richard H M; Scherder, Erik J A
2017-04-28
Despite the detrimental effects of physical inactivity for older adults, aged residents of residential care settings in particular may spend much time in inactive behavior. This may be partly due to their poorer physical condition; however, there may also be other, setting-related factors that influence the amount of inactivity. The aim of this review was to survey setting-related factors (including the social and physical environment) that may contribute to the amount of older adults' physical inactivity in a wide range of residential care settings (e.g., nursing homes, assisted care facilities). Five databases were systematically searched for eligible studies, using the key words 'inactivity', 'care facilities', and 'older adults', including their synonyms and MeSH terms. Additional studies were selected from references used in articles included from the search. Based on specific eligibility criteria, a total of 12 studies were included. The quality of the included studies was assessed using the Mixed Methods Appraisal Tool (MMAT). Based on studies using different methodologies (e.g., interviews and observations), and of different quality (assessed quality range: 25-100%), we report several aspects related to the physical environment and caregivers. Factors of the physical environment that may be related to physical inactivity included, among others, the environment's compatibility with the abilities of a resident, the presence of equipment, the accessibility, security, comfort, and aesthetics of the environment/corridors, and possibly the presence of some specific areas. Caregiver-related factors included staffing levels, the available time, and the amount and type of care being provided. Inactivity levels in residential care settings may be reduced by improving several features of the physical environment and with the help of caregivers. Intervention studies could be performed to gain more insight into the causal effects of improving setting-related factors on the physical inactivity of aged residents.
Capela, Nicole A; Lemaire, Edward D; Baddour, Natalie
2015-01-01
Human activity recognition (HAR), using wearable sensors, is a growing area with the potential to provide valuable information on patient mobility to rehabilitation specialists. Smartphones with accelerometer and gyroscope sensors are a convenient, minimally invasive, and low-cost approach for mobility monitoring. HAR systems typically pre-process raw signals, segment the signals, and then extract features to be used in a classifier. Feature selection is a crucial step in the process to reduce potentially large data dimensionality and provide viable parameters to enable activity classification. Most HAR systems are customized to an individual research group, including a unique data set, classes, algorithms, and signal features. These data sets are obtained predominantly from able-bodied participants. In this paper, smartphone accelerometer and gyroscope sensor data were collected from populations that can benefit from human activity recognition: able-bodied, elderly, and stroke patients. Data from a consecutive sequence of 41 mobility tasks (18 different tasks) were collected for a total of 44 participants. Seventy-six signal features were calculated and subsets of these features were selected using three filter-based, classifier-independent, feature selection methods (Relief-F, Correlation-based Feature Selection, Fast Correlation Based Filter). The feature subsets were then evaluated using three generic classifiers (Naïve Bayes, Support Vector Machine, J48 Decision Tree). Common features were identified for all three populations, although the stroke population subset had some differences from both able-bodied and elderly sets. Evaluation with the three classifiers showed that the feature subsets produced similar or better accuracies than classification with the entire feature set. Therefore, since these feature subsets are classifier-independent, they should be useful for developing and improving HAR systems across and within populations. PMID:25885272
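A minimal sketch of the filter-then-classify pipeline (placeholder data; Relief-F and FCBF are not available in scikit-learn, so a univariate mutual-information filter stands in as the classifier-independent selection step):

    # Classifier-independent filter selection followed by a generic classifier;
    # signal features and task labels below are random placeholders.
    import numpy as np
    from sklearn.feature_selection import SelectKBest, mutual_info_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline

    rng = np.random.RandomState(0)
    X = rng.rand(440, 76)                # 76 signal features per window
    y = rng.randint(0, 18, 440)          # 18 mobility task classes

    pipe = make_pipeline(SelectKBest(mutual_info_classif, k=20), GaussianNB())
    print("subset accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
    print("full-set accuracy:", cross_val_score(GaussianNB(), X, y, cv=5).mean())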
Hong, Chih-Yuan; Guo, Lan-Yuen; Song, Rong; Nagurka, Mark L; Sung, Jia-Li; Yen, Chen-Wen
2016-08-02
Many methods have been proposed to assess the stability of human postural balance by using a force plate. While most of these approaches characterize postural stability by extracting features from the trajectory of the center of pressure (COP), this work develops stability measures derived from components of the ground reaction force (GRF). In comparison with previous GRF-based approaches that extract stability features from the GRF resultant force, this study proposes three feature sets derived from the correlation patterns among the vertical GRF (VGRF) components. The first and second feature sets quantitatively assess the strength and changing speed of the correlation patterns, respectively. The third feature set is used to quantify the stabilizing effect of the GRF coordination patterns on the COP. In addition to experimentally demonstrating the reliability of the proposed features, their efficacy has also been tested by using them to classify two age groups (18-24 and 65-73 years) in quiet standing. The experimental results show that the proposed features are considerably more sensitive to aging than one of the most effective conventional COP features and two recently proposed center of mass (COM) features. By extracting information from the correlation patterns of the VGRF components, this study proposes three sets of features to assess human postural stability during quiet standing. As demonstrated by the experimental results, the proposed features are not only robust to inter-trial variability but also more accurate than the tested COP and COM features in classifying the older and younger age groups. An additional advantage of the proposed approach is that it reduces the force sensing requirement from 3D to 1D, substantially reducing the cost of the force plate measurement system.
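The correlation-pattern idea above lends itself to a compact sketch: given the 1-D vertical force components as columns of an array, compute windowed pairwise correlations, then summarize their strength and changing speed. The window length, overlap, and synthetic signals are assumptions; the paper's exact feature definitions may differ.

```python
# Sketch: correlation-pattern features among vertical GRF components.
import numpy as np

def corr_pattern_features(vgrf, win=100):
    """Windowed pairwise correlations: mean strength and changing speed."""
    n, k = vgrf.shape
    iu = np.triu_indices(k, 1)
    series = []
    for s in range(0, n - win + 1, win // 2):          # 50% window overlap
        c = np.corrcoef(vgrf[s:s + win].T)[iu]         # pairwise correlations
        series.append(c)
    series = np.array(series)
    strength = np.abs(series).mean()                   # how strong, on average
    speed = np.abs(np.diff(series, axis=0)).mean()     # how fast they change
    return strength, speed

rng = np.random.default_rng(1)
t = np.linspace(0, 30, 3000)                           # 30 s at 100 Hz
base = np.sin(2 * np.pi * 0.3 * t)                     # shared postural sway
vgrf = np.c_[base, 0.8 * base, -base] + 0.2 * rng.standard_normal((3000, 3))
print(corr_pattern_features(vgrf))
```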
Comparison of Feature Selection Techniques in Machine Learning for Anatomical Brain MRI in Dementia.
Tohka, Jussi; Moradi, Elaheh; Huttunen, Heikki
2016-07-01
We present a comparative split-half resampling analysis of various data-driven feature selection and classification methods for the whole brain voxel-based classification analysis of anatomical magnetic resonance images. We compared support vector machines (SVMs), with or without filter-based feature selection, several embedded feature selection methods and stability selection. While comparisons of the accuracy of various classification methods have been reported previously, the variability of the out-of-training sample classification accuracy and the set of selected features due to independent training and test sets have not been previously addressed in a brain imaging context. We studied two classification problems: 1) Alzheimer's disease (AD) vs. normal control (NC) and 2) mild cognitive impairment (MCI) vs. NC classification. In AD vs. NC classification, the variability in the test accuracy due to the subject sample did not vary between different methods and exceeded the variability due to different classifiers. In MCI vs. NC classification, particularly with a large training set, embedded feature selection methods outperformed SVM-based ones, with the difference in the test accuracy exceeding the test accuracy variability due to the subject sample. The filter and embedded methods produced divergent feature patterns for MCI vs. NC classification, which suggests the utility of embedded feature selection for this problem when linked with the good generalization performance. The stability of the feature sets was strongly correlated with the number of features selected, weakly correlated with the stability of classification accuracy, and uncorrelated with the average classification accuracy.
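The split-half resampling design can be sketched as repeated independent train/test splits that record both out-of-sample accuracy and the selected feature set, then summarize accuracy variability and feature-set stability. The univariate filter, linear SVM, and synthetic data below are stand-ins for the voxel-based setting.

```python
# Sketch: split-half resampling of accuracy and feature-set stability.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=200, n_features=500, n_informative=20,
                           random_state=0)
accs, sets = [], []
for seed in range(50):                                  # 50 resampling splits
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5,
                                          random_state=seed, stratify=y)
    sel = SelectKBest(f_classif, k=50).fit(Xtr, ytr)
    clf = LinearSVC(max_iter=5000).fit(sel.transform(Xtr), ytr)
    accs.append(clf.score(sel.transform(Xte), yte))
    sets.append(frozenset(np.flatnonzero(sel.get_support())))

# Pairwise Jaccard overlap of the selected feature sets across splits.
jac = [len(a & b) / len(a | b) for i, a in enumerate(sets) for b in sets[i + 1:]]
print(f"accuracy {np.mean(accs):.3f} +/- {np.std(accs):.3f}, "
      f"feature-set Jaccard stability {np.mean(jac):.3f}")
```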
Rethinking the REAL ID Act and National Identification Cards as a Counterterrorism Tool
2009-12-01
...federal government imposing national identification standards on states are also actively engaged in the debate. Michael Boldin, a 36-year-old Web... on the RIA. Boldin states, "Maine resisted, the government backed off, and soon all of these other states were doing the same thing." Since... that acquires biometric data from an individual, extracts a feature set from the data, compares this feature set against the feature set stored in a...
NASA Astrophysics Data System (ADS)
Strohmeier, Dominik; Kunze, Kristina; Göbel, Klemens; Liebetrau, Judith
2013-01-01
Assessing audiovisual Quality of Experience (QoE) is a key element to ensure quality acceptance of today's multimedia products. The use of descriptive evaluation methods allows evaluating QoE preferences and the underlying QoE features jointly. From our previous evaluations on QoE for mobile 3D video we found that mainly one dimension, video quality, dominates the descriptive models. Large variations of the visual video quality in the tests may be the reason for these findings. A new study was conducted to investigate whether test sets of low QoE are described differently than those of high audiovisual QoE. Reanalysis of previous data sets seems to confirm this hypothesis. Our new study consists of a pre-test and a main test, using the Descriptive Sorted Napping method. Data sets of good-only and bad-only video quality were evaluated separately. The results show that the perception of bad QoE is mainly determined one-dimensionally by visual artifacts, whereas the perception of good quality shows multiple dimensions. Here, mainly semantic-related features of the content and affective descriptors are used by the naïve test participants. The results show that, with increasing QoE of audiovisual systems, content semantics and users' affective involvement will become important for assessing QoE differences.
Aktaruzzaman, M; Migliorini, M; Tenhunen, M; Himanen, S L; Bianchi, A M; Sassi, R
2015-05-01
The work considers automatic sleep stage classification, based on heart rate variability (HRV) analysis, with a focus on the distinction of wakefulness (WAKE) from sleep and rapid eye movement (REM) from non-REM (NREM) sleep. A set of 20 automatically annotated one-night polysomnographic recordings was considered, and artificial neural networks were selected for classification. For each inter-heartbeat (RR) series, besides features previously presented in the literature, we introduced a set of four parameters related to signal regularity. RR series of three different lengths were considered (corresponding to 2, 6, and 10 successive epochs, 30 s each, in the same sleep stage). Two sets of only four features captured 99 % of the data variance in each classification problem, and both of them contained one of the new regularity features proposed. The accuracy of classification for REM versus NREM (68.4 %, 2 epochs; 83.8 %, 10 epochs) was higher than when distinguishing WAKE versus SLEEP (67.6 %, 2 epochs; 71.3 %, 10 epochs). Also, the reliability parameter (Cohen's Kappa) was higher (0.68 and 0.45, respectively). Sleep staging classification based on HRV was still less precise than other staging methods employing a larger variety of signals collected during polysomnographic studies. However, cheap and unobtrusive HRV-only sleep classification proved sufficiently precise for a wide range of applications.
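The four regularity parameters are not spelled out in the abstract, so the sketch below uses sample entropy as one plausible regularity feature for an RR series; the series lengths and the parameters m and r are illustrative assumptions.

```python
# Sketch: a "signal regularity" feature for RR series via sample entropy.
import numpy as np

def sample_entropy(x, m=2, r=0.2):
    """SampEn(m, r): lower values indicate a more regular RR series."""
    x = np.asarray(x, float)
    r *= x.std()                                        # tolerance scaled to SD
    def count(mm):
        templ = np.lib.stride_tricks.sliding_window_view(x, mm)
        c = 0
        for i in range(len(templ) - 1):
            d = np.abs(templ[i + 1:] - templ[i]).max(axis=1)  # Chebyshev dist.
            c += int((d <= r).sum())
        return c
    b, a = count(m), count(m + 1)
    return -np.log(a / b) if a > 0 and b > 0 else np.inf

rng = np.random.default_rng(2)
regular = 0.8 + 0.05 * np.sin(np.arange(300) / 5)       # steady rhythm
noisy = 0.8 + 0.05 * rng.standard_normal(300)           # erratic rhythm
print(sample_entropy(regular), sample_entropy(noisy))   # regular << noisy
```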
Hu, Ze; Zhang, Zhan; Yang, Haiqin; Chen, Qing; Zuo, Decheng
2017-07-01
Recently, online health expert question-answering (HQA) services (systems) have attracted more and more health consumers, who can ask health-related questions anywhere at any time thanks to their convenience and effectiveness. However, the quality of answers in existing HQA systems varies in different situations, so it is important to provide effective tools that automatically determine the quality of the answers. Two main characteristics of HQA systems raise the difficulties of classification: (1) physicians' answers in an HQA system are usually written in short text, which yields a data sparsity issue; (2) HQA systems apply a quality control mechanism, which restrains the wisdom of the crowd, so important information, such as the best answer and the number of users' votes, is missing. To tackle these issues, we prepare the first HQA research data set, labeled by three medical experts over 90 days, and formulate the problem of predicting the quality of answers in the system as a classification task. We not only incorporate the standard textual features of answers, but also introduce a set of unique non-textual features from other modalities, i.e., widely used surface linguistic features and novel social features. A multimodal deep belief network (DBN)-based learning framework is then proposed to learn high-level hidden semantic representations of answers from both textual and non-textual features, while the learned joint representation is fed into popular classifiers to determine the quality of answers. Finally, we conduct extensive experiments to demonstrate the effectiveness of including the non-textual features and of the proposed multimodal deep learning framework. Copyright © 2017 Elsevier Inc. All rights reserved.
An efficient scheme for automatic web pages categorization using the support vector machine
NASA Astrophysics Data System (ADS)
Bhalla, Vinod Kumar; Kumar, Neeraj
2016-07-01
In the past few years, with the evolution of the Internet and related technologies, the number of Internet users has grown exponentially. These users demand access to relevant web pages from the Internet within a fraction of a second. To achieve this goal, an efficient categorization of web page contents is required. Manual categorization of these billions of web pages to achieve high accuracy is a challenging task. Most of the existing techniques reported in the literature are semi-automatic, and a high level of accuracy cannot be achieved using them. To address these issues, this paper proposes an automatic categorization of web pages into domain categories. The proposed scheme is based on the identification of specific and relevant features of the web pages. In the proposed scheme, extraction and evaluation of features are done first, followed by filtering of the feature set for categorization of domain web pages. A feature extraction tool based on the HTML document object model of the web page is developed in the proposed scheme. Feature extraction and weight assignment are based on a collection of domain-specific keywords compiled by considering various domain pages. Moreover, the keyword list is reduced on the basis of the ids of the keywords in the list. Also, stemming of keywords and tag text is done to achieve higher accuracy. An extensive feature set is generated to develop a robust classification technique. The proposed scheme was evaluated using a machine learning method in combination with feature extraction and statistical analysis, using a support vector machine kernel as the classification tool. The results obtained confirm the effectiveness of the proposed scheme in terms of its accuracy in different categories of web pages.
Uppal, Karan; Soltow, Quinlyn A; Strobel, Frederick H; Pittard, W Stephen; Gernert, Kim M; Yu, Tianwei; Jones, Dean P
2013-01-16
Detection of low abundance metabolites is important for de novo mapping of metabolic pathways related to diet, the microbiome or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data in a conservative manner, which tends to preclude detection of low abundance chemicals and chemicals found in small subsets of samples. The present study provides software to enhance such algorithms for feature detection, quality assessment, and annotation. xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules to: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings and data merger to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and shown to more than double the number of features extracted while improving the quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were exclusively detected using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites. xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.
Handwriting: Feature Correlation Analysis for Biometric Hashes
NASA Astrophysics Data System (ADS)
Vielhauer, Claus; Steinmetz, Ralf
2004-12-01
In the application domain of electronic commerce, biometric authentication can provide one possible solution for the key management problem. Besides server-based approaches, methods of deriving digital keys directly from biometric measures appear to be advantageous. In this paper, we analyze one of our recently published specific algorithms of this category based on behavioral biometrics of handwriting, the biometric hash. Our interest is to investigate to which degree each of the underlying feature parameters contributes to the overall intrapersonal stability and interpersonal value space. We will briefly discuss related work in feature evaluation and introduce a new methodology based on three components: the intrapersonal scatter (deviation), the interpersonal entropy, and the correlation between both measures. Evaluation of the technique is presented based on two data sets of different size. The method presented will allow determination of effects of parameterization of the biometric system, estimation of value space boundaries, and comparison with other feature selection approaches.
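The three evaluation components described above, intrapersonal scatter, interpersonal entropy, and the correlation between the two, can be sketched with simple proxies: within-writer standard deviation, entropy of the between-writer distribution of feature means, and the correlation of these two measures across features. The histogram binning and synthetic data are assumptions, not the paper's exact definitions.

```python
# Sketch: per-feature stability/value-space analysis for biometric hash features.
import numpy as np

def evaluate_features(F, subj, bins=10):
    """F: (samples, features) hash feature matrix; subj: subject label per row."""
    subjects = np.unique(subj)
    # Intrapersonal scatter: average within-subject std dev, per feature.
    scatter = np.mean([F[subj == s].std(axis=0) for s in subjects], axis=0)
    # Interpersonal entropy: entropy of each feature's histogram of subject means.
    means = np.array([F[subj == s].mean(axis=0) for s in subjects])
    entropy = np.empty(F.shape[1])
    for j in range(F.shape[1]):
        p, _ = np.histogram(means[:, j], bins=bins)
        p = p[p > 0] / p.sum()
        entropy[j] = -(p * np.log2(p)).sum()
    corr = np.corrcoef(scatter, entropy)[0, 1]          # third component
    return scatter, entropy, corr

rng = np.random.default_rng(3)
subj = np.repeat(np.arange(20), 10)                     # 20 writers, 10 samples each
F = rng.normal(subj[:, None], 1.0, size=(200, 8))       # 8 illustrative features
print(evaluate_features(F, subj)[2])
```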
Analyzing the Language of Therapist Empathy in Motivational Interview based Psychotherapy
Xiao, Bo; Can, Dogan; Georgiou, Panayiotis G.; Atkins, David; Narayanan, Shrikanth S.
2016-01-01
Empathy is an important aspect of social communication, especially in medical and psychotherapy applications. Measures of empathy can offer insights into the quality of therapy. We use an N-gram language model based maximum likelihood strategy to classify empathic versus non-empathic utterances and report the precision and recall of classification for various parameters. High recall is obtained with unigram features, while bigram features achieve the highest F1-score. Based on the utterance level models, a group of lexical features are extracted at the therapy session level. The effectiveness of these features in modeling session level annotator perceptions of empathy is evaluated through correlation with expert-coded session level empathy scores. Our combined feature set achieved a correlation of 0.558 between predicted and expert-coded empathy scores. Results also suggest that the longer term empathy perception process may be more related to isolated empathic salient events. PMID:27602411
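A minimal sketch of utterance-level N-gram classification in this spirit, using multinomial Naive Bayes as the maximum-likelihood n-gram classifier over unigrams and bigrams; the toy utterances and labels are invented for illustration.

```python
# Sketch: unigram+bigram classification of empathic vs. non-empathic utterances.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

utterances = ["that sounds really hard for you", "next question please",
              "i hear how much this hurts", "ok moving on to the form"]
labels = [1, 0, 1, 0]                                   # 1 = empathic

clf = make_pipeline(CountVectorizer(ngram_range=(1, 2)),  # unigrams + bigrams
                    MultinomialNB())
clf.fit(utterances, labels)
print(clf.predict(["that must be hard"]))
```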
Joint Concept Correlation and Feature-Concept Relevance Learning for Multilabel Classification.
Zhao, Xiaowei; Ma, Zhigang; Li, Zhi; Li, Zhihui
2018-02-01
In recent years, multilabel classification has attracted significant attention in multimedia annotation. However, most multilabel classification methods focus only on the inherent correlations existing among multiple labels and concepts and ignore the relevance between features and the target concepts. To obtain more robust multilabel classification results, we propose a new multilabel classification method that aims to capture the correlations among multiple concepts by leveraging hypergraphs, which have been shown to be beneficial for relational learning. Moreover, we consider mining feature-concept relevance, which is often overlooked by many multilabel learning algorithms. To better expose the feature-concept relevance, we impose a sparsity constraint on the proposed method. We compare the proposed method with several other multilabel classification methods and evaluate the classification performance by mean average precision on several data sets. The experimental results show that the proposed method outperforms the state-of-the-art methods.
Yu, Jin; Abidi, Syed Sibte Raza; Artes, Paul; McIntyre, Andy; Heywood, Malcolm
2005-01-01
The availability of modern imaging techniques such as Confocal Scanning Laser Tomography (CSLT) for capturing high-quality optic nerve images offers the potential for developing automatic and objective methods for diagnosing glaucoma. We present a hybrid approach that features the analysis of CSLT images using moment methods to derive abstract image-defining features. The features are then used to train classifiers for automatically distinguishing CSLT images of normal subjects and glaucoma patients. As a first step, in this paper we present investigations in feature subset selection methods for reducing the relatively large input space produced by the moment methods. We use neural networks and support vector machines to determine a subset of moments that offer high classification accuracy. We demonstrate the efficacy of our methods to discriminate between healthy and glaucomatous optic disks based on shape information automatically derived from optic disk topography and reflectance images.
Unsupervised Deep Learning Applied to Breast Density Segmentation and Mammographic Risk Scoring.
Kallenberg, Michiel; Petersen, Kersten; Nielsen, Mads; Ng, Andrew Y; Pengfei Diao; Igel, Christian; Vachon, Celine M; Holland, Katharina; Winkel, Rikke Rass; Karssemeijer, Nico; Lillholm, Martin
2016-05-01
Mammographic risk scoring has commonly been automated by extracting a set of handcrafted features from mammograms and relating the responses directly or indirectly to breast cancer risk. We present a method that learns a feature hierarchy from unlabeled data. When the learned features are used as the input to a simple classifier, two different tasks can be addressed: i) breast density segmentation, and ii) scoring of mammographic texture. The proposed model learns features at multiple scales. To control the model's capacity, a novel sparsity regularizer is introduced that incorporates both lifetime and population sparsity. We evaluated our method on three different clinical datasets. Our state-of-the-art results show that the learned breast density scores have a very strong positive relationship with manual ones, and that the learned texture scores are predictive of breast cancer. The model is easy to apply and generalizes to many other segmentation and scoring problems.
A human performance evaluation of graphic symbol-design features.
Samet, M G; Geiselman, R E; Landee, B M
1982-06-01
Sixteen subjects learned each of two tactical display symbol sets (conventional symbols and iconic symbols) in turn and were then shown a series of graphic displays containing various symbol configurations. For each display, the subject was asked questions corresponding to different behavioral processes relating to symbol use (identification, search, comparison, pattern recognition). The results indicated that: (a) conventional symbols yielded faster pattern-recognition performance than iconic symbols, and iconic symbols did not yield faster identification than conventional symbols, and (b) the portrayal of additional feature information (through the use of perimeter density or vector projection coding) slowed processing of the core symbol information in four tasks, but certain symbol-design features created less perceptual interference and had greater correspondence with the portrayal of specific tactical concepts than others. The results were discussed in terms of the complexities involved in the selection of symbol-design features for use in graphic tactical displays.
Karstic slope "breathing": morpho-structural influence and hazard implications
NASA Astrophysics Data System (ADS)
Devoti, Roberto; Falcucci, Emanuela; Gori, Stefano; Eliana Poli, Maria; Zanferrari, Adriano; Braitenberg, Carla; Fabris, Paolo; Grillo, Barbara; Zuliani, David
2016-04-01
The study refers to the active slope deformation detected by GPS and tiltmeter stations in the Cansiglio karstic plateau, located in the western Carnic Prealps (NE Italy). The observed transient deformation clearly correlates with rainfall: the southernmost border of the plateau reacts instantly to heavy rains, displaying a "back and forth" deformation of up to a few centimeters, with different time constants, demonstrating a response to different catchment volumes. We carried out a field survey along the southern Cansiglio slope to achieve a structural characterization of the relief and to verify the possible relation between structural features and the peculiar geomorphological setting dominated by widespread karstic features. The Cansiglio plateau develops on the frontal ramp anticline of the Cansiglio thrust, an approximately ENE-WSW-trending, SSE-verging, low-angle thrust belonging to the Neogene-Quaternary front of the eastern Southern Alps. The Cansiglio thrust outcrops at the base of the plateau, where it places the Mesozoic carbonates over the Miocene-Quaternary terrigenous succession. Cataclastic limestones largely crop out all along its length. The Cansiglio thrust is bordered by two transfer zones, probably inherited from the Mesozoic paleogeography: the Caneva fault in the west and the Col Longone fault in the east. The carbonatic massif is also characterized by a series of minor reverse faults, dipping steeply roughly northward, and a set of subvertical joints parallel to the axes of the Cansiglio anticline. Other NNW-SSE and NNE-SSW conjugate faults and fractures, perpendicular to the southern Cansiglio slope, are also identified. This structural setting pervasively affects the whole slope and may determine centimetre- to metre-scale rock prisms. Interestingly, along the topmost portion of the slope, some dolines and swallow holes show an incipient coalescence that trends parallel to the massif front and to the deformation zones related to the reverse faults. This alignment of dolines forms a ridge-parallel elongated trench, about 4 km long, which is a typical morpho-structural feature of slopes undergoing large-scale gravitational instability (deep-seated gravitational slope deformations). The trench is interrupted towards the NE by several coalescent slide scarps. Such geomorphic evidence testifies to the occurrence of landslide events (mainly rockslides and rock falls) that sourced from the top portion of the slope, as local collapses of the sector affected by the trench. Our observations, as a whole, suggest that the morpho-structural framework of the south-eastern Cansiglio slope is highly influenced by tectonic features related to the complex tectonic deformation. The structural setting locally favors the nucleation of karstic landforms (dolines, swallow holes and hypokarstic features). Moreover, the widespread tectonic features, combined with the high local relief of the mountain front, promote gravitational instability of the slope and may trigger the collapse of slope sectors as rock fall phenomena. In this perspective, therefore, the continuous rainfall-induced "back and forth" movements of the slope observed in the GPS time series may progressively weaken the slope and render it prone to landsliding.
Liu, Wei; Li, Dong; Zhang, Jiyang; Zhu, Yunping; He, Fuchu
2006-11-27
Measuring each protein's importance in signaling networks helps to identify the crucial proteins in a cellular process, find the fragile portions of the biological system, and further assist in disease therapy. However, there are relatively few methods to evaluate the importance of proteins in signaling networks. We developed a novel network feature, which we call SigFlux, to evaluate the importance of proteins in signal transduction networks, based on the concept of minimal path sets (MPSs). An MPS is a minimal set of nodes that can perform the signal propagation from ligands to target genes or feedback loops. We define SigFlux as the number of MPSs in which each protein is involved. We applied this network feature to the large signal transduction network in the hippocampal CA1 neuron of mice. Significant correlations were simultaneously observed between SigFlux and both the essentiality and the evolutionary rate of genes. Compared with another commonly used network feature, connectivity, SigFlux has a similar or better ability to reflect a protein's essentiality. Further classification according to protein function demonstrates that high-SigFlux, low-connectivity proteins are abundant among receptors and transcription factors, indicating that SigFlux can describe the importance of proteins within the context of the entire network. SigFlux is a useful network feature in signal transduction networks that allows the prediction of the essentiality and conservation of proteins. With this novel network feature, proteins that participate in more pathways or feedback loops within a signaling network are shown to be far more likely to be essential and conserved during evolution than their counterparts.
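The SigFlux idea can be sketched by counting, for each node, the ligand-to-target paths it lies on. Here minimal path sets are approximated by simple (cycle-free) paths in a toy directed network; the published definition also covers feedback loops, so this is an illustrative simplification.

```python
# Sketch: a SigFlux-like score as path participation counts in a toy network.
import networkx as nx
from collections import Counter

G = nx.DiGraph([("ligand", "R1"), ("ligand", "R2"), ("R1", "K1"),
                ("R2", "K1"), ("K1", "TF"), ("R2", "TF"), ("TF", "gene")])

sigflux = Counter()
for path in nx.all_simple_paths(G, "ligand", "gene"):
    for node in path:
        sigflux[node] += 1          # node participates in one more path

print(sigflux.most_common())        # higher score suggests greater importance
```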
Rajbongshi, Nijara; Bora, Kangkana; Nath, Dilip C; Das, Anup K; Mahanta, Lipi B
2018-01-01
Cytological changes in terms of the shape and size of nuclei are some of the common morphometric features used to study breast cancer, and they can be observed by careful screening of fine needle aspiration cytology (FNAC) images. This study attempts to categorize a collection of FNAC microscopic images into benign and malignant classes, based on the family of probability distributions, using some morphometric features of cell nuclei. For this study, the features area, perimeter, eccentricity, compactness, and circularity of cell nuclei were extracted from FNAC images of both benign and malignant samples using an image processing technique. All experiments were performed on a generated FNAC image database containing 564 malignant (cancerous) and 693 benign (noncancerous) cell level images. The extracted set of five features was reduced to three (area, perimeter, and circularity) based on the mean statistic. Finally, the data were fitted to the generalized Pearsonian system of frequency curves, so that the resulting distribution can be used as a statistical model. The Pearsonian system is a family of distributions in which kappa (κ) is the selection criterion, computed as a function of the first four central moments. For the benign group, the kappa (κ) values corresponding to area, perimeter, and circularity were -0.00004, 0.0000, and 0.04155, and for the malignant group they were 1016942, 0.01464, and -0.3213, respectively. Thus, the families of distributions related to these features differ between the benign and malignant groups, and therefore the characterization of their probability curves will also differ.
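The kappa criterion of the Pearson system follows directly from the first four central moments via the standard quantities β1 = μ3²/μ2³ and β2 = μ4/μ2². The sketch below computes it for an illustrative skewed sample, not the paper's FNAC data.

```python
# Sketch: Pearson-system selection criterion kappa from central moments.
import numpy as np

def pearson_kappa(x):
    """Kappa criterion of the Pearson system of frequency curves."""
    x = np.asarray(x, float)
    mu = x.mean()
    m2 = ((x - mu) ** 2).mean()
    m3 = ((x - mu) ** 3).mean()
    m4 = ((x - mu) ** 4).mean()
    b1 = m3 ** 2 / m2 ** 3            # beta_1 (skewness squared)
    b2 = m4 / m2 ** 2                 # beta_2 (kurtosis)
    return (b1 * (b2 + 3) ** 2 /
            (4 * (4 * b2 - 3 * b1) * (2 * b2 - 3 * b1 - 6)))

rng = np.random.default_rng(4)
circ = rng.beta(2.0, 5.0, 5000)       # illustrative "circularity-like" sample
print(pearson_kappa(circ))            # kappa < 0 indicates Pearson Type I
```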
Unger, Jakob; Schuster, Maria; Hecker, Dietmar J; Schick, Bernhard; Lohscheller, Joerg
2013-01-01
Direct observation of vocal fold vibration is indispensable for a clinical diagnosis of voice disorders. Among current imaging techniques, high-speed videoendoscopy constitutes a state-of-the-art method capturing several thousand frames per second of the vocal folds during phonation. Recently, a method for extracting descriptive features from phonovibrograms, a two-dimensional image containing the spatio-temporal pattern of vocal fold dynamics, was presented. The derived features are closely related to a clinically established protocol for functional assessment of pathologic voices. The discriminative power of these features for different pathologic findings and configurations has not been assessed yet. In the current study, a collective of 220 subjects is considered for two- and multi-class problems of healthy and pathologic findings. The performance of the proposed feature set is compared to conventional feature reduction routines and was found to clearly outperform these. As such, the proposed procedure shows great potential for diagnostical issues of vocal fold disorders.
NASA Technical Reports Server (NTRS)
Narasimhan, Sriram; Roychoudhury, Indranil; Balaban, Edward; Saxena, Abhinav
2010-01-01
Model-based diagnosis typically uses analytical redundancy to compare predictions from a model against observations from the system being diagnosed. However, this approach does not work very well when it is not feasible to create analytic relations describing all the observed data, e.g., for vibration data, which is usually sampled at very high rates and requires very detailed finite element models to describe its behavior. In such cases, features (in the time and frequency domains) that contain diagnostic information are extracted from the data. Since this is a computationally intensive process, it is not efficient to extract all the features all the time. In this paper we present an approach that combines the analytic model-based and feature-driven diagnosis approaches. The analytic approach is used to reduce the set of possible faults, and then features are chosen to best distinguish among the remaining faults. We describe an implementation of this approach on the Flyable Electro-mechanical Actuator (FLEA) test bed.
Prediction of essential proteins based on gene expression programming.
Zhong, Jiancheng; Wang, Jianxin; Peng, Wei; Zhang, Zhen; Pan, Yi
2013-01-01
Essential proteins are indispensable for cell survival. Identifying essential proteins is very important for improving our understanding of the way a cell works. There are various types of features related to the essentiality of proteins, and many methods have been proposed to combine some of them to predict essential proteins. However, it is still a big challenge to design an effective method that predicts them by integrating different features and explains how these selected features decide the essentiality of a protein. Gene expression programming (GEP) is a learning algorithm; what it learns specifically is relationships between variables in sets of data, from which it builds models to explain these relationships. In this work, we propose a GEP-based method to predict essential proteins by combining some biological features and topological features. We carry out experiments on S. cerevisiae data. The experimental results show that our method achieves better prediction performance than methods using individual features. Moreover, our method outperforms some machine learning methods and performs as well as a method obtained by combining the outputs of eight machine learning methods. The accuracy of predicting essential proteins can thus be improved by using the GEP method to combine topological and biological features.
Kamath, Padmaja; Fernandez, Alberto; Giralt, Francesc; Rallo, Robert
2015-01-01
Nanoparticles are likely to interact in real-case application scenarios with mixtures of proteins and biomolecules that will adsorb onto their surface, forming the so-called protein corona. Information related to the composition of the protein corona and net cell association was collected from the literature for a library of surface-modified gold and silver nanoparticles. For each protein in the corona, sequence information was extracted and used to calculate physicochemical properties and statistical descriptors. Data cleaning and preprocessing techniques, including statistical analysis and feature selection methods, were applied to remove highly correlated, redundant and non-significant features. A weighting technique was applied to construct specific signatures that represent the corona composition for each nanoparticle. Using this basic set of protein descriptors, a new Protein Corona Structure-Activity Relationship (PCSAR) that relates net cell association with the physicochemical descriptors of the proteins that form the corona was developed and validated. The features that resulted from the feature selection were in line with already published literature, and the computational model constructed on these features had a good accuracy (R(2)LOO=0.76 and R(2)LMO(25%)=0.72) and stability, with the advantage that the fingerprints based on physicochemical descriptors were independent of the specific proteins that form the corona.
Computer-aided diagnosis of melanoma using border and wavelet-based texture analysis.
Garnavi, Rahil; Aldeen, Mohammad; Bailey, James
2012-11-01
This paper presents a novel computer-aided diagnosis system for melanoma. The novelty lies in the optimised selection and integration of features derived from textural, border-based and geometrical properties of the melanoma lesion. The texture features are derived using wavelet decomposition, the border features are derived by constructing a boundary-series model of the lesion border and analysing it in the spatial and frequency domains, and the geometry features are derived from shape indexes. The optimised selection of features is achieved using the Gain-Ratio method, which is shown to be computationally efficient for the melanoma diagnosis application. Classification is done through the use of four classifiers; namely, Support Vector Machine, Random Forest, Logistic Model Tree and Hidden Naive Bayes. The proposed diagnostic system is applied to a set of 289 dermoscopy images (114 malignant, 175 benign) partitioned into train, validation and test image sets. The system achieves an accuracy of 91.26% and an AUC value of 0.937 when 23 features are used. Other important findings include (i) the clear advantage gained in complementing texture with border and geometry features, compared to using texture information only, and (ii) the higher contribution of texture features than border-based features in the optimised feature set.
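The Gain-Ratio criterion used for feature selection above is information gain divided by split information; the sketch below computes it for a quantile-discretized continuous feature. The binning scheme and toy data are illustrative choices.

```python
# Sketch: Gain-Ratio scoring of a continuous feature against class labels.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def gain_ratio(feature, y, bins=4):
    # Discretize the feature at its internal quantiles.
    cuts = np.quantile(feature, np.linspace(0, 1, bins + 1)[1:-1])
    x = np.digitize(feature, cuts)
    h_y = entropy(y)                                   # class entropy
    cond = sum((x == v).mean() * entropy(y[x == v]) for v in np.unique(x))
    split_info = entropy(x)                            # penalty for many splits
    return (h_y - cond) / split_info if split_info > 0 else 0.0

rng = np.random.default_rng(5)
y = rng.integers(0, 2, 300)                            # benign / malignant labels
informative = y + 0.5 * rng.standard_normal(300)       # texture-like feature
noise = rng.standard_normal(300)
print(gain_ratio(informative, y), gain_ratio(noise, y))
```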
Bakal, Gokhan; Talari, Preetham; Kakani, Elijah V; Kavuluru, Ramakanth
2018-06-01
Identifying new potential treatment options for medical conditions that cause human disease burden is a central task of biomedical research. Since all candidate drugs cannot be tested with animal and clinical trials, in vitro approaches are first attempted to identify promising candidates. Likewise, identifying different causal relations between biomedical entities is also critical to understanding biomedical processes. Generally, natural language processing (NLP) and machine learning are used to predict specific relations between any given pair of entities using the distant supervision approach. Our objective is to build high-accuracy supervised predictive models to predict previously unknown treatment and causative relations between biomedical entities based only on semantic graph pattern features extracted from biomedical knowledge graphs. We used 7000 treats and 2918 causes hand-curated relations from the UMLS Metathesaurus to train and test our models. Our graph pattern features are extracted from simple paths connecting biomedical entities in the SemMedDB graph (based on the well-known SemMedDB database made available by the U.S. National Library of Medicine). Using these graph patterns connecting biomedical entities as features of logistic regression and decision tree models, we computed mean performance measures (precision, recall, F-score) over 100 distinct 80-20% train-test splits of the datasets. For all experiments, we used a positive:negative class imbalance of 1:10 in the test set to model relatively more realistic scenarios. Our models predict treats and causes relations with high F-scores of 99% and 90%, respectively. Logistic regression model coefficients also help us identify highly discriminative patterns that have an intuitive interpretation. Through our collaborations with two physician co-authors, we are also able to predict some new plausible relations based on false positives that our models scored highly. Finally, our decision tree models are able to retrieve over 50% of the treatment relations from a recently created external dataset. We employed semantic graph patterns connecting pairs of candidate biomedical entities in a knowledge graph as features to predict treatment/causative relations between them. We provide what we believe is the first evidence of direct prediction of biomedical relations based on graph features. Our work complements lexical pattern based approaches in that the graph patterns can be used as additional features for weakly supervised relation prediction. Copyright © 2018 Elsevier Inc. All rights reserved.
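A hedged sketch of the graph-pattern idea: for each candidate entity pair, the bag of predicate sequences along simple connecting paths becomes the feature vector of a logistic regression. The toy knowledge graph, predicate names, and labels are invented; SemMedDB itself is far larger.

```python
# Sketch: predicate-path patterns as features for relation prediction.
import networkx as nx
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

G = nx.MultiDiGraph()
G.add_edge("aspirin", "inflammation", predicate="INHIBITS")
G.add_edge("inflammation", "pain", predicate="CAUSES")
G.add_edge("aspirin", "pain", predicate="ASSOCIATED_WITH")
G.add_edge("virusX", "feverY", predicate="CAUSES")

def path_patterns(g, src, dst, cutoff=3):
    """Bag of predicate-sequence patterns on simple paths from src to dst."""
    feats = {}
    if src in g and dst in g:
        for ep in nx.all_simple_edge_paths(g, src, dst, cutoff=cutoff):
            pat = "->".join(g.edges[e]["predicate"] for e in ep)
            feats[pat] = feats.get(pat, 0) + 1
    return feats

pairs = [("aspirin", "pain", 1), ("virusX", "feverY", 0), ("aspirin", "feverY", 0)]
vec = DictVectorizer()
X = vec.fit_transform([path_patterns(G, a, b) for a, b, _ in pairs])
clf = LogisticRegression().fit(X, [lab for _, _, lab in pairs])
print(dict(zip(vec.get_feature_names_out(), clf.coef_[0].round(2))))
```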
Galpert, Deborah; Fernández, Alberto; Herrera, Francisco; Antunes, Agostinho; Molina-Ruiz, Reinaldo; Agüero-Chapin, Guillermin
2018-05-03
The development of new ortholog detection algorithms and the improvement of existing ones are of major importance in functional genomics. We have previously introduced a successful supervised pairwise ortholog classification approach implemented in a big data platform that considered several pairwise protein features and the low ortholog pair ratios found between two annotated proteomes (Galpert, D. et al., BioMed Research International, 2015). The supervised models were built and tested using a Saccharomycete yeast benchmark dataset proposed by Salichos and Rokas (2011). Although several pairwise protein features were combined in a supervised big data approach, they all, to some extent, were alignment-based features, and the proposed algorithms were evaluated on a single test set. Here, we aim to evaluate the impact of alignment-free features on the performance of supervised models implemented in the Spark big data platform for pairwise ortholog detection in several related yeast proteomes. The Spark Random Forest and Decision Trees with oversampling and undersampling techniques, built with only alignment-based similarity measures or with these combined with several alignment-free pairwise protein features, showed the highest classification performance for ortholog detection in three yeast proteome pairs. Although such supervised approaches outperformed traditional methods, there were no significant differences between the exclusive use of alignment-based similarity measures and their combination with alignment-free features, even within the twilight zone of the studied proteomes. Only when alignment-based and alignment-free features were combined in Spark Decision Trees with imbalance management could a higher success rate (98.71%) within the twilight zone be achieved, for a yeast proteome pair that underwent a whole genome duplication. The feature selection study showed that alignment-based features were top-ranked for the best classifiers, while the runners-up were alignment-free features related to amino acid composition. The incorporation of alignment-free features in supervised big data models did not significantly improve ortholog detection in yeast proteomes relative to the classification qualities achieved with alignment-based similarity measures alone. However, the similarity of their classification performance to that of traditional ortholog detection methods encourages the evaluation of other alignment-free protein pair descriptors in future research.
An enhanced digital line graph design
Guptill, Stephen C.
1990-01-01
In response to increasing information demands on its digital cartographic data, the U.S. Geological Survey has designed an enhanced version of the Digital Line Graph, termed Digital Line Graph - Enhanced (DLG-E). In the DLG-E model, the phenomena represented by geographic and cartographic data are termed entities. Entities represent individual phenomena in the real world. A feature is an abstraction of a set of entities, with the feature description encompassing only selected properties of the entities (typically the properties that have been portrayed cartographically on a map). Buildings, bridges, roads, streams, grasslands, and counties are examples of features. A feature instance, that is, one occurrence of a feature, is described in the digital environment by feature objects and spatial objects. A feature object identifies a feature instance and its nonlocational attributes. Nontopological relationships are associated with feature objects. The locational aspects of the feature instance are represented by spatial objects. Four spatial objects (points, nodes, chains, and polygons) and their topological relationships are defined. To link the locational and nonlocational aspects of the feature instance, a given feature object is associated with (or is composed of) a set of spatial objects. These objects, attributes, and relationships are the components of the DLG-E data model. To establish a domain of features for DLG-E, an approach using a set of classes, or views, of spatial entities was adopted. The five views that were developed are cover, division, ecosystem, geoposition, and morphology. The views are exclusive; each view is a self-contained analytical approach to the entire range of world features. Because each view is independent of the others, a single point on the surface of the Earth can be represented under multiple views. Under the five views, over 200 features were identified and defined. This set constitutes an initial domain of DLG-E features.
Celluloid devils: a research study of male nurses in feature films.
Stanley, David
2012-11-01
To report a study of how male nurses are portrayed in feature films. It was hypothesized that male nurses are frequently portrayed negatively or stereotypically in the film media, potentially having a negative impact on male nurse recruitment and the public's perception of male nurses. An interpretive, qualitative methodology guided by insights into hegemonic masculinity and structured around a set of collective case studies (films) was used to examine the portrayal of male nurses in feature films made in the Western world from 1900 to 2007. Over 36,000 feature film synopses were reviewed (via CINAHL, ProQuest and relevant movie-specific literature) for the keyword 'nurse' and 'nursing' with an additional search for films from 1900 to 2010 for the word 'male nurse'. Identified films were labelled as 'cases' and analysed collectively to determine key attributes related to men in nursing and explore them for the emergence of concepts and themes related to the image of male nurses in films. A total of 13 relevant cases (feature films) were identified with 12 being made in the USA. Most films portrayed male nurses negatively and in ways opposed to hegemonic masculinity, as effeminate, homosexual, homicidal, corrupt or incompetent. Few film images of male nurses show them in traditional masculine roles or as clinically competent or self-confident professionals. Feature films predominantly portray male nurses negatively. Given the popularity of feature films, there may be negative effects on recruitment and on the public's perception of male nurses. © 2012 Blackwell Publishing Ltd.
Geomorphology of the Iberian Continental Margin
NASA Astrophysics Data System (ADS)
Maestro, Adolfo; López-Martínez, Jerónimo; Llave, Estefanía; Bohoyo, Fernando; Acosta, Juan; Hernández-Molina, F. Javier; Muñoz, Araceli; Jané, Gloria
2013-08-01
The submarine features and processes around the Iberian Peninsula are the result of a complex and diverse geological and oceanographical setting. This paper presents an overview of the seafloor geomorphology of the Iberian Continental Margin and the adjacent abyssal plains. The study covers an area of approximately 2.3 million km2, including a 50 to 400 km wide band adjacent to the coastline. The main morphological characteristics of the seafloor features on the Iberian continental shelf, continental slope, continental rise and the surrounding abyssal plains are described. Individual seafloor features existing on the Iberian Margin have been classified into three main groups according to their origin: tectonic and/or volcanic, depositional and erosional. Major depositional and erosional features around the Iberian Margin developed in late Pleistocene-Holocene times and have been controlled by tectonic movements and eustatic fluctuations. The distribution of the geomorphological features is discussed in relation to their genetic processes and the evolution of the margin. The prevalence of one or several specific processes in certain areas reflects the dominant morphotectonic and oceanographic controlling factors. Sedimentary processes and the resulting depositional products are dominant on the Valencia-Catalán Margin and in the northern part of the Balearic Promontory. Strong tectonic control is observed in the geomorphology of the Betic and the Gulf of Cádiz margins. The role of bottom currents is especially evident throughout the Iberian Margin. The Galicia, Portuguese and Cantabrian margins show a predominance of erosional features and tectonically-controlled linear features related to faults.
The feature-weighted receptive field: an interpretable encoding model for complex feature spaces.
St-Yves, Ghislain; Naselaris, Thomas
2017-06-20
We introduce the feature-weighted receptive field (fwRF), an encoding model designed to balance expressiveness, interpretability and scalability. The fwRF is organized around the notion of a feature map: a transformation of visual stimuli into visual features that preserves the topology of visual space (but not necessarily the native resolution of the stimulus). The key assumption of the fwRF model is that activity in each voxel encodes variation in a spatially localized region across multiple feature maps. This region is fixed for all feature maps; however, the contribution of each feature map to voxel activity is weighted. Thus, the model has two separable sets of parameters: "where" parameters that characterize the location and extent of pooling over visual features, and "what" parameters that characterize tuning to visual features. The "where" parameters are analogous to classical receptive fields, while "what" parameters are analogous to classical tuning functions. By treating these as separable parameters, the fwRF model complexity is independent of the resolution of the underlying feature maps. This makes it possible to estimate models with thousands of high-resolution feature maps from relatively small amounts of data. Once a fwRF model has been estimated from data, spatial pooling and feature tuning can be read off directly with no (or very little) additional post-processing or in-silico experimentation. We describe an optimization algorithm for estimating fwRF models from data acquired during standard visual neuroimaging experiments. We then demonstrate the model's application to two distinct sets of features: Gabor wavelets and features supplied by a deep convolutional neural network. We show that when Gabor feature maps are used, the fwRF model recovers receptive fields and spatial frequency tuning functions consistent with known organizational principles of the visual cortex. We also show that a fwRF model can be used to regress entire deep convolutional networks against brain activity. The ability to use whole networks in a single encoding model yields state-of-the-art prediction accuracy. Our results suggest a wide variety of uses for the feature-weighted receptive field model, from retinotopic mapping with natural scenes, to regressing the activities of whole deep neural networks onto measured brain activity. Copyright © 2017. Published by Elsevier Inc.
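The fwRF forward model reduces to a compact computation: pool each feature map under a shared 2-D Gaussian "where" field, then weight the pooled values with the "what" parameters. Shapes and parameter values below are illustrative assumptions, not the paper's fitted values.

```python
# Sketch: the fwRF forward pass for a single voxel.
import numpy as np

def fwrf_predict(feature_maps, x0, y0, sigma, weights):
    """feature_maps: (K, H, W) stack; returns one voxel's predicted response."""
    K, H, W = feature_maps.shape
    yy, xx = np.mgrid[0:H, 0:W]
    g = np.exp(-((xx - x0) ** 2 + (yy - y0) ** 2) / (2 * sigma ** 2))
    g /= g.sum()                                   # shared pooling field ("where")
    pooled = (feature_maps * g).sum(axis=(1, 2))   # one scalar per feature map
    return weights @ pooled                        # feature tuning ("what")

rng = np.random.default_rng(6)
maps = rng.standard_normal((8, 32, 32))            # e.g., 8 Gabor feature maps
print(fwrf_predict(maps, x0=16, y0=10, sigma=3.0,
                   weights=rng.standard_normal(8)))
```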
AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.
Sun, Lei; Wang, Jun; Wei, Jinmao
2017-03-14
The Receiver Operating Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find the real target feature subset because they lack an effective means to reduce the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by measuring the distances between misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. The experimental results on a broad range of microarray data sets validate that classifiers built on the feature subset selected by our approach can achieve the minimal balanced error rate with a small number of significant features. Compared with other ROC-based feature selection approaches, our new approach selects fewer features and effectively improves the classification performance.
Botly, Leigh C P; De Rosa, Eve
2012-10-01
The visual search task established the feature integration theory of attention in humans and measures visuospatial attentional contributions to feature binding. We recently demonstrated that the neuromodulator acetylcholine (ACh), from the nucleus basalis magnocellularis (NBM), supports the attentional processes required for feature binding using a rat digging-based task. Additional research has demonstrated cholinergic contributions from the NBM to visuospatial attention in rats. Here, we combined these lines of evidence and employed visual search in rats to examine whether cortical cholinergic input supports visuospatial attention specifically for feature binding. We trained 18 male Long-Evans rats to perform visual search using touch screen-equipped operant chambers. Sessions comprised Feature Search (no feature binding required) and Conjunctive Search (feature binding required) trials using multiple stimulus set sizes. Following acquisition of visual search, 8 rats received bilateral NBM lesions using 192 IgG-saporin to selectively reduce cholinergic afferentation of the neocortex, which we hypothesized would selectively disrupt the visuospatial attentional processes needed for efficient conjunctive visual search. As expected, relative to sham-lesioned rats, ACh-NBM-lesioned rats took significantly longer to locate the target stimulus on Conjunctive Search, but not Feature Search trials, thus demonstrating that cholinergic contributions to visuospatial attention are important for feature binding in rats.
Falkowski, Andrzej; Jabłońska, Magdalena
2018-01-01
In this study we followed Tversky's research on features of similarity and extended its application to open sets. Unlike the original closed-set model, in which a feature was shifted between a common and a distinctive set, we investigated how the addition of new features and the deletion of existing features affected similarity judgments. The model was tested empirically in a political context, and we analyzed how positive and negative changes in a candidate's profile affect the similarity of the politician to his or her ideal and opposite counterparts. The results showed a positive-negative asymmetry in comparison judgments, where enhancing negative features (distinctive for an ideal political candidate) had a greater effect on judgments than operations on positive (common) features. However, the effect was not observed for comparisons to a bad politician. Further analyses showed that, in the case of a negative reference point, the relationship between similarity judgments and voting intention was mediated by the affective evaluation of the candidate.
Smith, Lorraine; Alles, Chehani; Lemay, Kate; Reddel, Helen; Saini, Bandana; Bosnic-Anticevich, Sinthia; Emmerton, Lynne; Stewart, Kay; Burton, Debbie; Krass, Ines; Armour, Carol
2013-01-01
Goal setting was investigated as part of an implementation trial of an asthma management service (PAMS) conducted in 96 Australian community pharmacies. Patients and pharmacists identified asthma-related issues of concern to the patient and collaboratively set goals to address these. Although goal setting is commonly integrated into disease state management interventions, the nature of goals and their contribution to goal attainment and health outcomes are not well understood. The aims were to: 1) identify and describe goals set collaboratively between adult patients with asthma and their pharmacist, 2) describe goal specificity and goal achievement, and 3) describe the relationships between specificity, achievement, asthma control and asthma-related quality of life. Measures of goal specificity and goal achievement were developed and applied to patient data records. Goals set were thematically analyzed into goal domains. Proportions of goals set, goals achieved and their specificity were calculated. Correlational and regression analyses were undertaken to determine the relationships between goal specificity, goal achievement, asthma control and asthma-related quality of life. Data were drawn from 498 patient records. Findings showed that patients set a wide range and number of asthma-related goals (N = 1787) and the majority (93%) were either achieved or being worked toward by the end of the study. Goal achievement was positively associated with specific and moderately specific goals, but not non-specific goals. However, on closer inspection, an inconsistent pattern of relationships emerged as a function of goal domain. Findings also showed that goal setting was associated with end-of-study asthma control but not with asthma-related quality of life. Pharmacists can help patients to set achievable and specific asthma management goals, and these have the potential to directly impact health outcomes such as asthma control. Goal specificity appears to be an important feature in the achievement of goals, but other factors may also play a role. Copyright © 2013 Elsevier Inc. All rights reserved.
Effect of finite sample size on feature selection and classification: a simulation study.
Way, Ted W; Sahiner, Berkman; Hadjiiski, Lubomir M; Chan, Heang-Ping
2010-02-01
The small number of samples available for training and testing is often the limiting factor in finding the most effective features and designing an optimal computer-aided diagnosis (CAD) system. Training on a limited set of samples introduces bias and variance in the performance of a CAD system relative to that trained with an infinite sample size. In this work, the authors conducted a simulation study to evaluate the performances of various combinations of classifiers and feature selection techniques and their dependence on the class distribution, dimensionality, and the training sample size. The understanding of these relationships will facilitate development of effective CAD systems under the constraint of limited available samples. Three feature selection techniques, the stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and support vector machine (SVM), were investigated. Samples were drawn from multidimensional feature spaces of multivariate Gaussian distributions with equal or unequal covariance matrices and unequal means, and with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve, Az. The mean Az values obtained by resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was chosen to be 50, 100, and 200. It was found that the relative performance of the different combinations of classifier and feature selection method depends on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for other conditions. PCA was comparable to or better than SFS and SFFS for LDA at small sample sizes, but inferior for SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples was available. None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method, while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
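The core of the simulation design, the optimistic bias of resubstitution relative to hold-out evaluation at small training sample sizes, can be sketched by drawing two Gaussian classes, training an LDA, and comparing the two Az estimates. Dimension, class separation, and sample sizes below are illustrative choices.

```python
# Sketch: resubstitution vs. hold-out Az for LDA on simulated Gaussian classes.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)
d, n_train, n_test = 50, 15, 5000                  # 15 training cases per class
mu = np.full(d, 0.15)                              # class-mean separation

def draw(n):
    X = np.r_[rng.standard_normal((n, d)), rng.standard_normal((n, d)) + mu]
    return X, np.r_[np.zeros(n), np.ones(n)]

Xtr, ytr = draw(n_train)
Xte, yte = draw(n_test)
lda = LinearDiscriminantAnalysis().fit(Xtr, ytr)
az_resub = roc_auc_score(ytr, lda.decision_function(Xtr))   # biased upward
az_hold = roc_auc_score(yte, lda.decision_function(Xte))    # closer to truth
print(f"resubstitution Az={az_resub:.3f}  hold-out Az={az_hold:.3f}")
```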
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background The support vector machine (SVM) has been widely used as an accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM over the linear one. Here, a more effective non-linear SVM using a radial basis function (RBF) kernel is compared with linear SVM. Unlike traditional studies, which focused either on the evaluation of different types of SVM or on voxel selection methods alone, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification, together with voxel selection schemes, in terms of classification accuracy and computation time. Methodology/Principal Findings Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. The overall performances of the voxel selection and classification methods were then compared. Results showed that: (1) voxel selection had an important impact on the classification accuracy of the classifiers: in a relatively low-dimensional feature space, RBF SVM significantly outperformed linear SVM; in a relatively high-dimensional space, linear SVM performed better than its counterpart; (2) considering classification accuracy and computation time together, linear SVM with relatively more voxels as features and RBF SVM with a small set of voxels (after PCA) achieved better accuracy at lower computational cost. Conclusions/Significance The present work provides the first empirical comparison of linear and RBF SVM in the classification of fMRI data combined with voxel selection methods. Based on the findings, if only classification accuracy is of concern, RBF SVM with an appropriately small set of voxels and linear SVM with relatively more voxels are two suggested solutions; if computational time also matters, RBF SVM with a relatively small set of voxels, with part of the principal components kept as features, is the better choice. PMID:21359184
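A hedged sketch of the comparison described above, assuming a trials-by-voxels matrix: a univariate F-score filter stands in for one of the six voxel selection methods, and the data are random placeholders, so the printed accuracies are meaningless except as a template.

```python
# Sketch: linear vs. RBF SVM after voxel selection, across feature-space
# sizes (low- vs. high-dimensional regimes); X is a placeholder for fMRI data.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 2000))          # trials x voxels (placeholder)
y = rng.integers(0, 4, size=120)          # 4-category object labels

for n_voxels in (50, 500, 2000):
    for kernel in ("linear", "rbf"):
        clf = make_pipeline(SelectKBest(f_classif, k=n_voxels),
                            SVC(kernel=kernel, C=1.0))
        acc = cross_val_score(clf, X, y, cv=5).mean()
        print(f"{n_voxels:5d} voxels  {kernel:6s} SVM  acc={acc:.3f}")
```

Putting the selector inside the pipeline ensures the voxel filter is refit within each cross-validation fold, avoiding selection bias.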
Comprehensive Computational Pathological Image Analysis Predicts Lung Cancer Prognosis.
Luo, Xin; Zang, Xiao; Yang, Lin; Huang, Junzhou; Liang, Faming; Rodriguez-Canales, Jaime; Wistuba, Ignacio I; Gazdar, Adi; Xie, Yang; Xiao, Guanghua
2017-03-01
Pathological examination of histopathological slides is a routine clinical procedure for lung cancer diagnosis and prognosis. Although the classification of lung cancer has been updated to become more specific, only a small subset of the total morphological features are taken into consideration. The vast majority of the detailed morphological features of tumor tissues, particularly tumor cells' surrounding microenvironment, are not fully analyzed. The heterogeneity of tumor cells and close interactions between tumor cells and their microenvironments are closely related to tumor development and progression. The goal of this study is to develop morphological feature-based prediction models for the prognosis of patients with lung cancer. We developed objective and quantitative computational approaches to analyze the morphological features of pathological images for patients with non-small cell lung cancer (NSCLC). Tissue pathological images were analyzed for 523 patients with adenocarcinoma (ADC) and 511 patients with squamous cell carcinoma (SCC) from The Cancer Genome Atlas lung cancer cohorts. The features extracted from the pathological images were used to develop statistical models that predict patients' survival outcomes in ADC and SCC, respectively. We extracted 943 morphological features from pathological images of hematoxylin and eosin-stained tissue and identified morphological features that are significantly associated with prognosis in ADC and SCC, respectively. Statistical models based on these extracted features stratified NSCLC patients into high-risk and low-risk groups. The models were developed from training sets and validated in independent testing sets: a predicted high-risk group versus a predicted low-risk group (for patients with ADC: hazard ratio = 2.34, 95% confidence interval: 1.12-4.91, p = 0.024; for patients with SCC: hazard ratio = 2.22, 95% confidence interval: 1.15-4.27, p = 0.017) after adjustment for age, sex, smoking status, and pathologic tumor stage. The results suggest that the quantitative morphological features of tumor pathological images predict prognosis in patients with lung cancer. Copyright © 2016 International Association for the Study of Lung Cancer. Published by Elsevier Inc. All rights reserved.
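An illustrative sketch of the modeling step only, assuming the extracted morphological features have been tabulated per patient alongside survival time and event columns. The lifelines package is a stand-in (the authors' software is not specified here), and the feature names and data are hypothetical.

```python
# Sketch: Cox proportional hazards model on image-derived features,
# followed by median-risk stratification into high-/low-risk groups.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(2)
n = 200
df = pd.DataFrame({
    "feat_nuclear_area": rng.normal(size=n),      # hypothetical features
    "feat_texture_entropy": rng.normal(size=n),
    "age": rng.normal(65, 8, size=n),
    "time": rng.exponential(36, size=n),          # months to event/censoring
    "event": rng.integers(0, 2, size=n),          # 1 = death observed
})

cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
risk = np.asarray(cph.predict_partial_hazard(df)).ravel()
groups = np.where(risk > np.median(risk), "high-risk", "low-risk")
print(cph.summary[["coef", "exp(coef)", "p"]])
print(pd.Series(groups).value_counts())
```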
Anticipating and controlling mask costs within EDA physical design
NASA Astrophysics Data System (ADS)
Rieger, Michael L.; Mayhew, Jeffrey P.; Melvin, Lawrence S.; Lugg, Robert M.; Beale, Daniel F.
2003-08-01
For low k1 lithography, more aggressive OPC is being applied to critical layers, and the number of mask layers with OPC treatments is growing rapidly. The 130 nm process node required, on average, 8 layers containing rules- or model-based OPC. The 90 nm node will have 16 OPC layers, of which 14 layers contain aggressive model-based OPC. This escalation of mask pattern complexity, coupled with the predominant use of vector-scan e-beam (VSB) mask writers, contributes to the rising costs of advanced mask sets. Writing times for OPC layouts are several times longer than for traditional layouts, making mask exposure the single largest cost component for OPC masks. Lower mask yield, another key factor in higher mask costs, is also aggravated by OPC. Historical mask set costs are plotted below. The initial cost of a 90 nm-node mask set will exceed one million dollars. The relative impact of mask cost on chip cost depends on how many total wafers are printed with each mask set. For many foundry chips, where unit production is often well below 1000 wafers, mask costs are larger than wafer processing costs. Further increases in NRE may begin to discourage these suppliers' adoption of 90 nm and smaller nodes. In this paper we will outline several alternatives for reducing mask costs by strategically leveraging dimensional margins. Dimensional specifications for a particular masking layer usually are applied uniformly to all features on that layer. As a practical matter, accuracy requirements on different features in the design may vary widely. Take a polysilicon layer, for example: global tolerance specifications for that layer are driven by the transistor-gate requirements; but these parameters over-specify interconnect feature requirements. By identifying features where dimensional accuracy requirements can be reduced, additional margin can be leveraged to reduce OPC complexity. Mask writing time on VSB tools will drop in nearly direct proportion to reduced shot count. By inspecting masks with reference to feature-dependent margins, instead of uniform specifications, mask yield can be effectively increased, further reducing delivered mask expense.
Li, Shelly-Anne; Jeffs, Lianne; Barwick, Melanie; Stevens, Bonnie
2018-05-05
Organizational contextual features have been recognized as important determinants for implementing evidence-based practices across healthcare settings for over a decade. However, implementation scientists have not reached consensus on which features are most important for implementing evidence-based practices. The aims of this review were to identify the most commonly reported organizational contextual features that influence the implementation of evidence-based practices across healthcare settings, and to describe how these features affect implementation. An integrative review was undertaken following literature searches in CINAHL, MEDLINE, PsycINFO, EMBASE, Web of Science, and Cochrane databases from January 2005 to June 2017. English language, peer-reviewed empirical studies exploring organizational context in at least one implementation initiative within a healthcare setting were included. Quality appraisal of the included studies was performed using the Mixed Methods Appraisal Tool. Inductive content analysis informed data extraction and reduction. The search generated 5152 citations. After removing duplicates and applying eligibility criteria, 36 journal articles were included. The majority (n = 20) of the study designs were qualitative, 11 were quantitative, and 5 used a mixed methods approach. Six main organizational contextual features (organizational culture; leadership; networks and communication; resources; evaluation, monitoring and feedback; and champions) were most commonly reported to influence implementation outcomes in the selected studies across a wide range of healthcare settings. We identified six organizational contextual features that appear to be interrelated and work synergistically to influence the implementation of evidence-based practices within an organization. Organizational contextual features did not influence implementation efforts independently from other features. Rather, features were interrelated and often influenced each other in complex, dynamic ways to effect change. These features corresponded to the constructs in the Consolidated Framework for Implementation Research (CFIR), which supports the use of CFIR as a guiding framework for studies that explore the relationship between organizational context and implementation. Organizational culture was most commonly reported to affect implementation. Leadership exerted influence on the five other features, indicating it may be a moderator or mediator that enhances or impedes the implementation of evidence-based practices. Future research should focus on how organizational features interact to influence implementation effectiveness.
Local Feature Selection for Data Classification.
Armanfard, Narges; Reilly, James P; Komeili, Majid
2016-06-01
Typical feature selection methods choose an optimal global feature subset that is applied over all regions of the sample space. In contrast, in this paper we propose a novel localized feature selection (LFS) approach whereby each region of the sample space is associated with its own distinct optimized feature set, which may vary both in membership and size across the sample space. This allows the feature set to optimally adapt to local variations in the sample space. An associated method for measuring the similarities of a query datum to each of the respective classes is also proposed. The proposed method makes no assumptions about the underlying structure of the samples; hence the method is insensitive to the distribution of the data over the sample space. The method is efficiently formulated as a linear programming optimization problem. Furthermore, we demonstrate the method is robust against the over-fitting problem. Experimental results on eleven synthetic and real-world data sets demonstrate the viability of the formulation and the effectiveness of the proposed algorithm. In addition we show several examples where localized feature selection produces better results than a global feature selection method.
Dimensionality Reduction Through Classifier Ensembles
NASA Technical Reports Server (NTRS)
Oza, Nikunj C.; Tumer, Kagan; Norwig, Peter (Technical Monitor)
1999-01-01
In data mining, one often needs to analyze datasets with a very large number of attributes. Performing machine learning directly on such data sets is often impractical because of extensive run times, excessive complexity of the fitted model (often leading to overfitting), and the well-known "curse of dimensionality." In practice, to avoid such problems, feature selection and/or extraction are often used to reduce data dimensionality prior to the learning step. However, existing feature selection/extraction algorithms either evaluate features by their effectiveness across the entire data set or simply disregard class information altogether (e.g., principal component analysis). Furthermore, feature extraction algorithms such as principal components analysis create new features that are often meaningless to human users. In this article, we present input decimation, a method that provides "feature subsets" that are selected for their ability to discriminate among the classes. These features are subsequently used in ensembles of classifiers, yielding results superior to single classifiers, ensembles that use the full set of features, and ensembles based on principal component analysis on both real and synthetic datasets.
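A minimal sketch of the input-decimation idea under stated assumptions (logistic-regression ensemble members, correlation-based feature ranking, top-20 subsets, and synthetic data): each member sees only the features most correlated with one class's indicator, and the members' probability outputs are averaged.

```python
# Sketch: class-discriminative feature subsets feeding an ensemble.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=100, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

members, subsets = [], []
for c in np.unique(y):
    target = (ytr == c).astype(float)               # one-vs-rest indicator
    corr = np.abs([np.corrcoef(Xtr[:, j], target)[0, 1]
                   for j in range(Xtr.shape[1])])
    keep = np.argsort(corr)[-20:]                   # top-20 features for class c
    subsets.append(keep)
    members.append(LogisticRegression(max_iter=1000).fit(Xtr[:, keep], ytr))

# average the members' class-probability estimates
proba = np.mean([m.predict_proba(Xte[:, s]) for m, s in zip(members, subsets)],
                axis=0)
print("ensemble accuracy:", (proba.argmax(axis=1) == yte).mean())
```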
Predicting protein amidation sites by orchestrating amino acid sequence features
NASA Astrophysics Data System (ADS)
Zhao, Shuqiu; Yu, Hua; Gong, Xiujun
2017-08-01
Amidation is the fourth major category of post-translational modifications, which plays an important role in physiological and pathological processes. Identifying amidation sites can help us understand amidation and recognize the underlying causes of many diseases. However, the traditional experimental methods for identifying amidation sites are often time-consuming and expensive. In this study, we propose a computational method for predicting amidation sites by orchestrating amino acid sequence features. Three kinds of feature extraction methods are used to build a feature vector able to capture not only the physicochemical properties but also position-related information of the amino acids. An extremely randomized trees algorithm is applied to choose the optimal features, removing redundancy and dependence among components of the feature vector in a supervised fashion. Finally, a support vector machine classifier is used to label the amidation sites. When tested on an independent data set, the proposed method performs better than all previous ones, with a prediction accuracy of 0.962, a Matthews correlation coefficient of 0.89, and an area under the curve of 0.964.
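A sketch of the described pipeline, under the assumption that sequence-derived feature vectors have already been computed; the feature dimensions and labels below are random placeholders.

```python
# Sketch: extremely randomized trees rank features in a supervised fashion,
# then an SVM labels candidate amidation sites.
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(3)
X = rng.normal(size=(500, 300))        # placeholder sequence-derived features
y = rng.integers(0, 2, size=500)       # 1 = amidation site, 0 = non-site

pipe = make_pipeline(
    SelectFromModel(ExtraTreesClassifier(n_estimators=200, random_state=0)),
    SVC(kernel="rbf", C=1.0),
)
print("CV accuracy:", cross_val_score(pipe, X, y, cv=5).mean())
```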
ERIC Educational Resources Information Center
Texas State Technical Coll., Waco.
The Machine Tool Advanced Skills Technology (MAST) consortium was formed to address the shortage of skilled workers for the machine tools and metals-related industries. Featuring six of the nation's leading advanced technology centers, the MAST consortium developed, tested, and disseminated industry-specific skill standards and model curricula for…
Natural concepts in a juvenile gorilla (Gorilla gorilla gorilla) at three levels of abstraction.
Vonk, Jennifer; MacDonald, Suzanne E
2002-01-01
The extent to which nonhumans are able to form conceptual versus perceptual discriminations remains a matter of debate. Among the great apes, only chimpanzees have been tested for conceptual understanding, defined as the ability to form discriminations not based solely on simple perceptual features of stimuli, and to transfer this learning to novel stimuli. In the present investigation, a young captive female gorilla was trained at three levels of abstraction (concrete, intermediate, and abstract) involving sets of photographs representing natural categories (e.g., orangutans vs. humans, primates vs. nonprimate animals, animals vs. foods). Within each level of abstraction, when the gorilla had learned to discriminate positive from negative exemplars in one set of photographs, a novel set was introduced. Transfer was defined in terms of high accuracy during the first two sessions with the new stimuli. The gorilla acquired discriminations at all three levels of abstraction but showed unambiguous transfer only with the concrete and abstract stimulus sets. Detailed analyses of response patterns revealed little evidence of control by simple stimulus features. Acquisition and transfer involving abstract stimulus sets suggest a conceptual basis for gorilla categorization. The gorilla's relatively poor performance with intermediate-level discriminations parallels findings with pigeons, and suggests a need to reconsider the role of perceptual information in discriminations thought to indicate conceptual behavior in nonhumans. PMID:12507006
Robust Point Set Matching for Partial Face Recognition.
Weng, Renliang; Lu, Jiwen; Tan, Yap-Peng
2016-03-01
Over the past three decades, a number of face recognition methods have been proposed in computer vision, and most of them use holistic face images for person identification. In many real-world scenarios, especially unconstrained environments, human faces might be occluded by other objects, and it is difficult to obtain fully holistic face images for recognition. To address this, we propose a new partial face recognition approach to recognize persons of interest from their partial faces. Given a gallery image and a probe face patch, we first detect keypoints and extract their local textural features. Then, we propose a robust point set matching method to discriminatively match these two extracted local feature sets, where both the textural information and geometrical information of local features are explicitly used for matching simultaneously. Finally, the similarity of the two faces is computed from the distance between these two aligned feature sets. Experimental results on four public face data sets show the effectiveness of the proposed approach.
Anonymization of electronic medical records for validating genome-wide association studies
Loukides, Grigorios; Gkoulalas-Divanis, Aris; Malin, Bradley
2010-01-01
Genome-wide association studies (GWAS) facilitate the discovery of genotype–phenotype relations from population-based sequence databases, which is an integral facet of personalized medicine. The increasing adoption of electronic medical records allows large amounts of patients’ standardized clinical features to be combined with the genomic sequences of these patients and shared to support validation of GWAS findings and to enable novel discoveries. However, disseminating these data “as is” may lead to patient reidentification when genomic sequences are linked to resources that contain the corresponding patients’ identity information based on standardized clinical features. This work proposes an approach that provably prevents this type of data linkage and furnishes a result that helps support GWAS. Our approach automatically extracts potentially linkable clinical features and modifies them in a way that they can no longer be used to link a genomic sequence to a small number of patients, while preserving the associations between genomic sequences and specific sets of clinical features corresponding to GWAS-related diseases. Extensive experiments with real patient data derived from Vanderbilt University Medical Center verify that our approach generates data that eliminate the threat of individual reidentification, while supporting GWAS validation and clinical case analysis tasks. PMID:20385806
True polar wander on Europa from global-scale small-circle depressions.
Schenk, Paul; Matsuyama, Isamu; Nimmo, Francis
2008-05-15
The tectonic patterns and stress history of Europa are exceedingly complex and many large-scale features remain unexplained. True polar wander, involving reorientation of Europa's floating outer ice shell about the tidal axis with Jupiter, has been proposed as a possible explanation for some of the features. This mechanism is possible if the icy shell is latitudinally variable in thickness and decoupled from the rocky interior. It would impose high stress levels on the shell, leading to predictable fracture patterns. No satisfactory match to global-scale features has hitherto been found for polar wander stress patterns. Here we describe broad arcuate troughs and depressions on Europa that do not fit other proposed stress mechanisms in their current position. Using imaging from three spacecraft, we have mapped two global-scale organized concentric antipodal sets of arcuate troughs up to hundreds of kilometres long and 300 m to approximately 1.5 km deep. An excellent match to these features is found with stresses caused by an episode of approximately 80 degrees true polar wander. These depressions also appear to be geographically related to other large-scale bright and dark lineaments, suggesting that many of Europa's tectonic patterns may also be related to true polar wander.
Lescroart, Mark D.; Stansbury, Dustin E.; Gallant, Jack L.
2015-01-01
Perception of natural visual scenes activates several functional areas in the human brain, including the Parahippocampal Place Area (PPA), Retrosplenial Complex (RSC), and the Occipital Place Area (OPA). It is currently unclear what specific scene-related features are represented in these areas. Previous studies have suggested that PPA, RSC, and/or OPA might represent at least three qualitatively different classes of features: (1) 2D features related to Fourier power; (2) 3D spatial features such as the distance to objects in a scene; or (3) abstract features such as the categories of objects in a scene. To determine which of these hypotheses best describes the visual representation in scene-selective areas, we applied voxel-wise modeling (VM) to BOLD fMRI responses elicited by a set of 1386 images of natural scenes. VM provides an efficient method for testing competing hypotheses by comparing predictions of brain activity based on encoding models that instantiate each hypothesis. Here we evaluated three different encoding models that instantiate each of the three hypotheses listed above. We used linear regression to fit each encoding model to the fMRI data recorded from each voxel, and we evaluated each fit model by estimating the amount of variance it predicted in a withheld portion of the data set. We found that voxel-wise models based on Fourier power or the subjective distance to objects in each scene predicted much of the variance predicted by a model based on object categories. Furthermore, the response variance explained by these three models is largely shared, and the individual models explain little unique variance in responses. Based on an evaluation of previous studies and the data we present here, we conclude that there is currently no good basis to favor any one of the three alternative hypotheses about visual representation in scene-selective areas. We offer suggestions for further studies that may help resolve this issue. PMID:26594164
Protein structure based prediction of catalytic residues
2013-01-01
Background Worldwide structural genomics projects continue to release new protein structures at an unprecedented pace, so far nearly 6000, but only about 60% of these proteins have any sort of functional annotation. Results We explored a range of features that can be used for the prediction of functional residues given a known three-dimensional structure. These features include various centrality measures of nodes in graphs of interacting residues: closeness, betweenness and page-rank centrality. We also analyzed the distance of functional amino acids to the general center of mass (GCM) of the structure, relative solvent accessibility (RSA), and the use of relative entropy as a measure of sequence conservation. From the selected features, neural networks were trained to identify catalytic residues. We found that using distance to the GCM together with amino acid type provides a good discriminant function, when combined independently with sequence conservation. Using an independent test set of 29 annotated protein structures, the method returned 411 of the initial 9262 residues as the most likely to be involved in function. The output 411 residues contain 70 of the annotated 111 catalytic residues. This represents an approximately 14-fold enrichment of catalytic residues in the entire input set (corresponding to a sensitivity of 63% and a precision of 17%), a performance competitive with that of other state-of-the-art methods. Conclusions We found that several of the graph-based measures utilize the same underlying feature of protein structures, which can be more simply and effectively captured with the distance-to-GCM definition. This also has the added advantage of simplicity and easy implementation. Meanwhile, sequence conservation remains by far the most influential feature in identifying functional residues. We also found that due to the rapid changes in size and composition of sequence databases, conservation calculations must be recalibrated for specific reference databases. PMID:23433045
NASA Astrophysics Data System (ADS)
Martin, Y. E.; Johnson, E. A.; Gallaway, J.; Chaikina, O.
2011-12-01
Herein we conduct a followup investigation to an earlier research project in which we developed a numerical model of tree population dynamics, tree throw, and sediment transport associated with the formation of pit-mound features for Hawk Creek watershed, Canadian Rockies (Gallaway et al., 2009). We extend this earlier work by exploring the most appropriate transport relations to simulate the diffusion over time of newly-formed pit-pound features due to tree throw. We combine our earlier model with a landscape development model that can incorporate these diffusive transport relations. Using these combined models, changes in hillslope microtopography over time associated with the formation of pit-mound features and their decay will be investigated. The following ideas have motivated this particular study: (i) Rates of pit-mound degradation remain a source of almost complete speculation, as there is almost no long-term information on process rates. Therefore, we will attempt to tackle the issue of pit-mound degradation in a methodical way that can guide future field studies; (ii) The degree of visible pit-mound topography at any point in time on the landscape is a joint function of the rate of formation of new pit-mound features due to tree death/topple and their magnitude vs. the rate of decay of pit-mound features. An example of one interesting observation that arises is the following: it appears that pit-mound topography is often more pronounced in some eastern North American forests vs. field sites along the eastern slopes of the Canadian Rockies. Why is this the case? Our investigation begins by considering whether pit-mound decay might occur by linear or nonlinear diffusion. What differences might arise depending on which diffusive approach is adopted? What is the magnitude of transport rates associated with these possible forms of transport relations? We explore linear and nonlinear diffusion at varying rates and for different sizes of pit-mound pairs using a numerical modelling approach. Model results suggest that longevity of pit-mound features is dependent on: (i) magnitude/dimensions of initial pit-mound features for forests in different regions; (ii) defining appropriate pit-mound diffusion rates for these different forests (unfortunately, almost no appropriate field observations exist for calibration of these transport relations). In the next stage of this research, we will combine our earlier model of forest disturbance/tree population dynamics, tree throw and pit-mound formation with the numerical model LandMod (Martin, 1998, 2000, 2007); the latter will be used to simulate pit-mound diffusion over time. In this way, we can observe changes in hillslope microtopographic signatures over time that are found in different forest settings.
Detecting Parkinson's disease from sustained phonation and speech signals.
Vaiciukynas, Evaldas; Verikas, Antanas; Gelzinis, Adas; Bacauskiene, Marija
2017-01-01
This study investigates signals from sustained phonation and text-dependent speech modalities for Parkinson's disease screening. Phonation corresponds to the vowel /a/ voicing task and speech to the pronunciation of a short sentence in the Lithuanian language. Signals were recorded through two channels simultaneously, namely, acoustic cardioid (AC) and smart phone (SP) microphones. Additional modalities were obtained by splitting the speech recording into voiced and unvoiced parts. Information in each modality is summarized by 18 well-known audio feature sets. Random forest (RF) is used as a machine learning algorithm, both for individual feature sets and for decision-level fusion. Detection performance is measured by the out-of-bag equal error rate (EER) and the cost of log-likelihood-ratio. The Essentia audio feature set was the best using the AC speech modality and the YAAFE audio feature set was the best using the SP unvoiced modality, achieving EERs of 20.30% and 25.57%, respectively. Fusion of all feature sets and modalities resulted in an EER of 19.27% for the AC and 23.00% for the SP channel. Non-linear projection of an RF-based proximity matrix into the 2D space enriched medical decision support by visualization.
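A sketch of the evaluation idea, assuming a precomputed feature matrix per recording: a random forest scores samples out-of-bag, and the EER is read off the ROC curve where the false positive and false negative rates cross. The data here are synthetic placeholders.

```python
# Sketch: out-of-bag scoring with a random forest, then equal error rate.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve

rng = np.random.default_rng(4)
X = rng.normal(size=(240, 100))            # e.g., one audio feature set
y = rng.integers(0, 2, size=240)           # 1 = Parkinson's, 0 = control

rf = RandomForestClassifier(n_estimators=500, oob_score=True,
                            random_state=0).fit(X, y)
scores = rf.oob_decision_function_[:, 1]   # out-of-bag class probabilities

fpr, tpr, _ = roc_curve(y, scores)
fnr = 1 - tpr
eer = fpr[np.argmin(np.abs(fpr - fnr))]    # point where FPR ~ FNR
print(f"out-of-bag EER ~ {eer:.3f}")
```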
Mack Center on Nonprofit and Public Sector Management in Human Service Organizations
ERIC Educational Resources Information Center
Austin, Michael J.
2018-01-01
This invited set of reflections upon the research carried out under the auspices of a school of social work is part of a series featuring research centers. It reflects 25 years of scholarly work related to both public and nonprofit human service organizations at the only university-based research center in the United States devoted to research on…
ERIC Educational Resources Information Center
Almutairi, Mashal
2013-01-01
The main purpose of this research was to survey the literature about the U.S. education system and synthesize the important conclusions that could be identified as the main features of the education system in general as they relate to student achievement. The criteria were set and the meta-analysis procedures were carefully followed. This process…
ERIC Educational Resources Information Center
Milano, Chloe; Lawless, Aileen; Eades, Elaine
2015-01-01
This account explores the role of action learning during and after an educational programme. We focus on the final stage of a master's programme and the insider research that is a key feature in many UK universities. Researching within one's own organization should lead to individual and organizational learning. However, there is relatively little…
A graph-theoretic approach for inparalog detection.
Tremblay-Savard, Olivier; Swenson, Krister M
2012-01-01
Understanding the history of a gene family that evolves through duplication, speciation, and loss is a fundamental problem in comparative genomics. Features such as function, position, and structural similarity between genes are intimately connected to this history; relationships between genes such as orthology (genes related through a speciation event) or paralogy (genes related through a duplication event) are usually correlated with these features. For example, recent work has shown that in human and mouse there is a strong connection between function and inparalogs, the paralogs that were created since the speciation event separating the human and mouse lineages. Methods exist for detecting inparalogs that either use information from only two species, or consider a set of species but rely on clustering methods. In this paper we present a graph-theoretic approach for finding lower bounds on the number of inparalogs for a given set of species; we pose an edge covering problem on the similarity graph and give an efficient 2/3-approximation as well as a faster heuristic. Since the physical position of inparalogs corresponding to recent speciations is not likely to have changed since the duplication, we also use our predictions to estimate the types of duplications that have occurred in some vertebrates and drosophila.
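The paper's 2/3-approximation is not reproduced here; the following greedy sketch only illustrates the edge-cover intuition on a toy within-genome similarity graph, with hypothetical gene names and weights.

```python
# Greedy sketch of an edge-cover heuristic: take the heaviest similarity
# edges first, keeping an edge whenever it covers a still-uncovered gene.
def greedy_inparalog_cover(edges):
    """edges: list of (similarity, gene_a, gene_b) within one genome."""
    covered, chosen = set(), []
    for sim, a, b in sorted(edges, reverse=True):
        if a not in covered or b not in covered:
            chosen.append((a, b, sim))
            covered.update((a, b))
    return chosen

edges = [(0.9, "g1", "g2"), (0.8, "g2", "g3"), (0.4, "g4", "g5")]
print(greedy_inparalog_cover(edges))
```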
Model-Based Learning of Local Image Features for Unsupervised Texture Segmentation
NASA Astrophysics Data System (ADS)
Kiechle, Martin; Storath, Martin; Weinmann, Andreas; Kleinsteuber, Martin
2018-04-01
Features that capture well the textural patterns of a certain class of images are crucial for the performance of texture segmentation methods. The manual selection of features or designing new ones can be a tedious task. Therefore, it is desirable to automatically adapt the features to a certain image or class of images. Typically, this requires a large set of training images with similar textures and ground truth segmentation. In this work, we propose a framework to learn features for texture segmentation when no such training data is available. The cost function for our learning process is constructed to match a commonly used segmentation model, the piecewise constant Mumford-Shah model. This means that the features are learned such that they provide an approximately piecewise constant feature image with a small jump set. Based on this idea, we develop a two-stage algorithm which first learns suitable convolutional features and then performs a segmentation. We note that the features can be learned from a small set of images, from a single image, or even from image patches. The proposed method achieves a competitive rank in the Prague texture segmentation benchmark, and it is effective for segmenting histological images.
SOLAR FLARE PREDICTION USING SDO/HMI VECTOR MAGNETIC FIELD DATA WITH A MACHINE-LEARNING ALGORITHM
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobra, M. G.; Couvidat, S., E-mail: couvidat@stanford.edu
2015-01-10
We attempt to forecast M- and X-class solar flares using a machine-learning algorithm, called support vector machine (SVM), and four years of data from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager, the first instrument to continuously map the full-disk photospheric vector magnetic field from space. Most flare forecasting efforts described in the literature use either line-of-sight magnetograms or a relatively small number of ground-based vector magnetograms. This is the first time a large data set of vector magnetograms has been used to forecast solar flares. We build a catalog of flaring and non-flaring active regions sampled from a database of 2071 active regions, comprised of 1.5 million active region patches of vector magnetic field data, and characterize each active region by 25 parameters. We then train and test the machine-learning algorithm and we estimate its performances using forecast verification metrics with an emphasis on the true skill statistic (TSS). We obtain relatively high TSS scores and overall predictive abilities. We surmise that this is partly due to fine-tuning the SVM for this purpose and also to an advantageous set of features that can only be calculated from vector magnetic field data. We also apply a feature selection algorithm to determine which of our 25 features are useful for discriminating between flaring and non-flaring active regions and conclude that only a handful are needed for good predictive abilities.
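A sketch of the forecasting setup, assuming a matrix of 25 active-region parameters (placeholders below), with the TSS computed as hit rate minus false-alarm rate.

```python
# Sketch: SVM flare classifier scored with the true skill statistic (TSS).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(5)
X = rng.normal(size=(2071, 25))          # one row per active region
y = rng.integers(0, 2, size=2071)        # 1 = flared (M/X class), 0 = quiet

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC()).fit(Xtr, ytr)
pred = clf.predict(Xte)

tp = np.sum((pred == 1) & (yte == 1)); fn = np.sum((pred == 0) & (yte == 1))
fp = np.sum((pred == 1) & (yte == 0)); tn = np.sum((pred == 0) & (yte == 0))
tss = tp / (tp + fn) - fp / (fp + tn)    # hit rate minus false-alarm rate
print(f"TSS = {tss:.3f}")
```

The TSS is insensitive to class imbalance, which is why it is favored over accuracy for rare-event forecasting like M/X flares.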
Toward the detection of abnormal chest radiographs the way radiologists do it
NASA Astrophysics Data System (ADS)
Alzubaidi, Mohammad; Patel, Ameet; Panchanathan, Sethuraman; Black, John A., Jr.
2011-03-01
Computer Aided Detection (CADe) and Computer Aided Diagnosis (CADx) are relatively recent areas of research that attempt to employ feature extraction, pattern recognition, and machine learning algorithms to aid radiologists in detecting and diagnosing abnormalities in medical images. However, these computational methods are based on the assumption that there are distinct classes of abnormalities, and that each class has some distinguishing features that set it apart from other classes. In practice, abnormalities in chest radiographs tend to be very heterogeneous. The literature suggests that thoracic (chest) radiologists develop their ability to detect abnormalities by developing a sense of what is normal, so that anything that is abnormal attracts their attention. This paper discusses an approach to CADe that is based on a technique called anomaly detection (which aims to detect outliers in data sets) for the purpose of detecting atypical regions in chest radiographs. However, in order to apply anomaly detection to chest radiographs, it is necessary to develop a basis for extracting features from corresponding anatomical locations in different chest radiographs. This paper proposes a method for doing this, and describes how it can be used to support CADe.
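A hedged sketch of the anomaly-detection framing, assuming features have already been extracted from anatomically corresponding regions; IsolationForest is a stand-in detector, not the paper's method, and the data are synthetic.

```python
# Sketch: fit an outlier detector on features from normal radiograph
# regions, then flag atypical regions in a new image.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(6)
normal_regions = rng.normal(size=(1000, 32))   # features from normal scans
new_regions = np.vstack([rng.normal(size=(95, 32)),
                         rng.normal(loc=3.0, size=(5, 32))])  # 5 atypical

detector = IsolationForest(random_state=0).fit(normal_regions)
flags = detector.predict(new_regions)          # -1 = atypical, +1 = typical
print("regions flagged atypical:", int((flags == -1).sum()))
```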
NASA Technical Reports Server (NTRS)
Borsdorf, H.; Nazarov, E. G.; Eiceman, G. A.
2002-01-01
The ionization pathways were determined for sets of isomeric non-polar hydrocarbons (structural isomers, cis/trans isomers) using ion mobility spectrometry and mass spectrometry with different techniques of atmospheric pressure chemical ionization to assess the influence of structural features on ion formation. Depending on the structural features, different ions were observed using mass spectrometry. Unsaturated hydrocarbons formed mostly [M - 1]+ and [(M - 1)2H]+ ions while mainly [M - 3]+ and [(M - 3)H2O]+ ions were found for saturated cis/trans isomers using photoionization and 63Ni ionization. These ionization methods and corona discharge ionization were used for ion mobility measurements of these compounds. Different ions were detected for compounds with different structural features. 63Ni ionization and photoionization provide comparable ions for every set of isomers. The product ions formed can be clearly attributed to the structures identified. However, differences in relative abundance of product ions were found. Although corona discharge ionization permits the most sensitive detection of non-polar hydrocarbons, the spectra detected are complex and differ from those obtained with 63Ni ionization and photoionization. © 2002 American Society for Mass Spectrometry.
Jiao, Yong; Zhang, Yu; Wang, Yu; Wang, Bei; Jin, Jing; Wang, Xingyu
2018-05-01
Multiset canonical correlation analysis (MsetCCA) has been successfully applied to optimize the reference signals by extracting common features from multiple sets of electroencephalogram (EEG) for steady-state visual evoked potential (SSVEP) recognition in brain-computer interface application. To avoid extracting the possible noise components as common features, this study proposes a sophisticated extension of MsetCCA, called the multilayer correlation maximization (MCM) model, for further improving SSVEP recognition accuracy. MCM combines advantages of both CCA and MsetCCA by carrying out three layers of correlation maximization processes. The first layer extracts the stimulus frequency-related information using CCA between EEG samples and sine-cosine reference signals. The second layer learns reference signals by extracting the common features with MsetCCA. The third layer re-optimizes the reference signal set using CCA with sine-cosine reference signals again. An experimental study is implemented to validate the effectiveness of the proposed MCM model in comparison with the standard CCA and MsetCCA algorithms. Superior performance of MCM demonstrates its promising potential for the development of an improved SSVEP-based brain-computer interface.
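A sketch of the first MCM layer only, i.e., the standard CCA step between a multichannel EEG trial and sine-cosine references; the sampling rate, trial shape, harmonic count, and candidate frequencies are all assumptions, and the random "EEG" here will not produce a meaningful recognition.

```python
# Sketch: CCA between an EEG trial and sine-cosine references; the stimulus
# frequency with the highest canonical correlation is the recognized target.
import numpy as np
from sklearn.cross_decomposition import CCA

fs, n_samples, n_channels = 200, 400, 8
t = np.arange(n_samples) / fs
rng = np.random.default_rng(7)
eeg = rng.normal(size=(n_samples, n_channels))   # placeholder EEG trial

def cca_corr(eeg, freq, n_harmonics=2):
    # sine-cosine reference matrix at the fundamental and harmonics
    ref = np.column_stack([f(2 * np.pi * h * freq * t)
                           for h in range(1, n_harmonics + 1)
                           for f in (np.sin, np.cos)])
    u, v = CCA(n_components=1).fit_transform(eeg, ref)
    return np.corrcoef(u[:, 0], v[:, 0])[0, 1]

freqs = [8.0, 10.0, 12.0, 15.0]
print("recognized frequency:", max(freqs, key=lambda f: cca_corr(eeg, f)))
```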
NASA Astrophysics Data System (ADS)
Kuwatani, T.; Toriumi, M.
2009-12-01
Recent advances in methodologies of geophysical observation, such as seismic tomography, the seismic reflection method and the geomagnetic method, provide us with a large amount and a wide variety of data on the physical properties of the crust and upper mantle (e.g. Matsubara et al. (2008)). However, it has still been difficult to specify a rock type and its physical conditions, mainly because (1) available data usually have a lot of error and uncertainty, and (2) physical properties of rocks are greatly affected by fluid and microstructures. The objective interpretation and quantitative evaluation of lithology and fluid-related structure require statistical analyses of integrated geophysical and geological data. Self-Organizing Maps (SOMs) are unsupervised artificial neural networks that map the input space into clusters in a topological form whose organization is related to trends in the input data (Kohonen 2001). SOMs are powerful neural network techniques for classifying and interpreting multiattribute data sets. Results of SOM classifications can be represented as 2D images, called feature maps, which illustrate the complexity and interrelationships among input data sets. Recently, some works have used SOMs to interpret multidimensional, non-linear, and highly noised geophysical data for purposes of geological prediction (e.g. Klose 2006; Tselentis et al. 2007; Bauer et al. 2008). This paper describes the application of SOM to the 3D velocity structure beneath the whole Japan islands (e.g. Matsubara et al. 2008). From the obtained feature maps, we can specify the lithology and qualitatively evaluate the effect of fluid-related structures. Moreover, re-projection of feature maps onto the 3D velocity structures resulted in detailed images of the structures within the plates. The Pacific plate and the Philippine Sea plate subducting beneath the Eurasian plate can be imaged more clearly than in the original P- and S-wave velocity structures. To achieve more precise prediction of lithology and its structure, we will use additional input data sets, such as tomographic images of random velocity fluctuation (Takahashi et al. 2009) and b-value mapping data. Additionally, different kinds of data sets, including experimental and petrological results (e.g. Christensen 1991; Hacker et al. 2003), can be applied to our analyses.
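A sketch of the SOM step, assuming standardized per-node attributes such as Vp, Vs, Vp/Vs and depth; the MiniSom package, the 10 × 10 map size, and the training schedule are assumptions (the abstract does not name an implementation).

```python
# Sketch: train a self-organizing map on velocity-structure attributes and
# assign each grid node to its best-matching unit on the 2D feature map.
import numpy as np
from minisom import MiniSom

rng = np.random.default_rng(8)
# placeholder rows: [Vp, Vs, Vp/Vs, depth] per grid node, standardized
data = rng.normal(size=(5000, 4))

som = MiniSom(10, 10, data.shape[1], sigma=1.5, learning_rate=0.5,
              random_seed=0)
som.train_random(data, 10000)

bmus = np.array([som.winner(x) for x in data])   # (row, col) per grid node
print("nodes assigned to map unit (0, 0):",
      int((bmus == [0, 0]).all(axis=1).sum()))
```

Re-projecting the unit labels back onto the 3D velocity grid then yields the class images the abstract describes.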
Morgan, Emily H.; Connor, Leah M.; Garner, Jennifer A.; King, Abby C.; Sheats, Jylana L.; Winter, Sandra J.; Buman, Matthew P.
2015-01-01
Introduction A community’s built environment can influence health behaviors. Rural populations experience significant health disparities, yet built environment studies in these settings are limited. We used an electronic tablet-based community assessment tool to conduct built environment audits in rural settings. The primary objective of this qualitative study was to evaluate the usefulness of the tool in identifying barriers and facilitators to healthy eating and active living. The second objective was to understand resident perspectives on community features and opportunities for improvement. Methods Participants were recruited from 4 rural communities in New York State. Using the tool, participants completed 2 audits, which consisted of taking pictures and recording audio narratives about community features perceived as assets or barriers to healthy eating and active living. Follow-up focus groups explored the audit experience, data captured, and opportunities for change. Results Twenty-four adults (mean age, 69.4 y [standard deviation, 13.2 y]), 6 per community, participated in the study. The most frequently captured features related to active living were related to roads, sidewalks, and walkable destinations. Restaurants, nontraditional food stores, and supermarkets were identified in the food environment in relation to the cost, quality, and selection of healthy foods available. In general, participants found the assessment tool to be simple and enjoyable to use. Conclusion An electronic tablet–based tool can be used to assess rural food and physical activity environments and may be useful in identifying and prioritizing resident-led change initiatives. This resident-led assessment approach may also be helpful for informing and evaluating rural community-based interventions. PMID:26133645
A set of hypotheses on tribology of mammalian herbivore teeth
NASA Astrophysics Data System (ADS)
Kaiser, Thomas M.; Clauss, Marcus; Schulz-Kornas, Ellen
2016-03-01
Once erupted, mammalian cheek teeth (molars) are continuously worn. Contact of molar surfaces with ingesta and with other teeth contributes to this wear. Microscopic wear features (dental surface texture) change continuously as new wear overprints old texture features. These features have been debated as indicators of diet. The general assumption in relating occlusal textures to diet is that they are independent of masticatory movements and forces. If this assumption is not accepted, one needs to propose that occlusal textures comprise signals not only from the ‘last supper’ but also from masticatory events that represent ecological, species- or taxon-specific adaptations, and that occlusal textures therefore give a rather unspecific, somehow diet-related signal that is functionally inadequately understood. In order to test for mechanical mechanisms of wear, we created a hypothesis matrix that related sampled individuals with six tribological variables. Three variables represent mechanically relevant ingesta properties, and three represent animal-specific characteristics of the masticatory system. Three groups of mammal species (free-ranging Cetartiodactyla and Perissodactyla, free-ranging primates, and artificially fed rabbits) were investigated in terms of their 3D dental surface textures, which were quantified employing ten ISO 25178 surface texture parameters. We first formulated a set of specific predictions based on theoretical reflections on the effects of diet properties and animal characteristics, and subsequently performed discriminant analysis to test which parameters actually followed these predictions. We found that the parameters Vvc, Vmc, Sp and Sq allowed the prediction of both ingesta properties and properties of the masticatory system, if combined with other parameters. Sha, Sda and S5v had little predictive power in our dataset. Spd seemed rather unrelated to ingesta properties, which makes this parameter a suitable indicator of masticatory system properties.
Fabrezi, Marissa; Lobo, Fernando
2009-11-01
Many traits of the skull of ceratophryines are related to the capture of large prey independently of aquatic or terrestrial feeding. Herein, detailed descriptions of the development of hyoid skeleton and the anatomy of muscles responsible for hyoid and tongue movements in Lepidobatrachus laevis and L. llanensis are provided and compared with those of other neobatrachians. The aquatic Lepidobatrachus has special features in its hyoid skeleton that integrates a set of derived features convergent with the conditions observed in non-neobatrachian anurans and morphological novelties (e.g., dorsal dermal hyoid ossification) that deviate from the generalized pattern found in most frogs. Further, reduction of fibers of muscles of buccal floor, reduction or loss of hyoid muscles (m. geniohyoideus rama lateralis, anterior pair of m. petrohyoideus posteriores), small tongue, and simplified tongue muscles are also morphological deviations from the pattern of terrestrial ceratophryines, and other aquatic ceratophryids (e.g., Telmatobius) that seem to be related to feeding underwater. The historical derived features shared with Chacophrys and Ceratophrys involved in megalophagy are conserved in Lepidobatrachus and morphological changes in the hyoglossal apparatus define a unique functional complex among anurans.
Content validity of the DSM-IV borderline and narcissistic personality disorder criteria sets.
Blais, M A; Hilsenroth, M J; Castlebury, F D
1997-01-01
This study sought to empirically evaluate the content validity of the newly revised DSM-IV narcissistic personality disorder (NPD) and borderline personality disorder (BPD) criteria sets. Using the essential features of each disorder as construct definitions, factor analysis was used to determine how adequately the criteria sets covered the constructs. In addition, this empirical investigation sought to: 1) help define the dimensions underlying these polythetic disorders; 2) identify core features of each diagnosis; and 3) highlight the characteristics that may be most useful in diagnosing these two disorders. Ninety-one outpatients meeting DSM-IV criteria for a personality disorder (PD) were identified through a retrospective analysis of chart information. Records of these 91 patients were independently rated on all of the BPD and NPD symptom criteria for the DSM-IV. Acceptable interrater reliability (kappa estimates) was obtained for both presence or absence of a PD and symptom criteria for BPD and NPD. The factor analysis, performed separately for each disorder, identified a three-factor solution for both the DSM-IV BPD and NPD criteria sets. The results of this study provide strong support for the content validity of the NPD criteria set and moderate support for the content validity of the BPD criteria set. Three domains were found to comprise the BPD criteria set, with the essential features of interpersonal and identity instability forming one domain, and impulsivity and affective instability each identified as separate domains. Factor analysis of the NPD criteria set found three factors basically corresponding to the essential features of grandiosity, lack of empathy, and need for admiration. Therefore, the NPD criteria set adequately covers the essential or defining features of the disorder.
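An illustrative sketch of the analytic step: factor-analyzing a patient-by-criterion matrix and inspecting loadings. DSM criteria are binary ratings, so a tetrachoric-correlation approach would be more faithful; this linear factor analysis is a simplification, and the ratings below are random placeholders.

```python
# Sketch: three-factor solution for a 91-patient by 9-criterion matrix.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(9)
ratings = rng.integers(0, 2, size=(91, 9)).astype(float)  # 9 BPD criteria

fa = FactorAnalysis(n_components=3, random_state=0).fit(ratings)
loadings = fa.components_.T                     # criteria x factors
for i, row in enumerate(loadings, start=1):
    print(f"criterion {i}: " + " ".join(f"{v:+.2f}" for v in row))
```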
Predicting a small molecule-kinase interaction map: A machine learning approach
2011-01-01
Background We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful. PMID:21708012
NASA Astrophysics Data System (ADS)
Parker, D. G.; Ulrich, R. K.; Beck, J.
2014-12-01
We have previously applied the Bayesian automatic classification system AutoClass to solar magnetogram and intensity images from the 150 Foot Solar Tower at Mount Wilson to identify classes of solar surface features associated with variations in total solar irradiance (TSI) and, using those identifications, modeled TSI time series with improved accuracy (r > 0.96) (Ulrich et al., 2010). AutoClass identifies classes by a two-step process in which it: (1) finds, without human supervision, a set of class definitions based on specified attributes of a sample of the image data pixels, such as magnetic field and intensity in the case of MWO images, and (2) applies the class definitions thus found to new data sets to identify automatically in them the classes found in the sample set. HMI high resolution images capture four observables (magnetic field, continuum intensity, line depth and line width), in contrast to MWO's two observables (magnetic field and intensity). In this study, we apply AutoClass to the HMI observables for images from May 2010 to June 2014 to identify solar surface feature classes. We use contemporaneous TSI measurements to determine whether and how variations in the HMI classes are related to TSI variations and compare the characteristic statistics of the HMI classes to those found from MWO images. We also attempt to derive scale factors between the HMI and MWO magnetic and intensity observables. The ability to categorize automatically surface features in the HMI images holds out the promise of consistent, relatively quick and manageable analysis of the large quantity of data available in these images. Given that the classes found in MWO images using AutoClass have been found to improve modeling of TSI, application of AutoClass to the more complex HMI images should enhance understanding of the physical processes at work in solar surface features and their implications for the solar-terrestrial environment. Ulrich, R.K., Parker, D., Bertello, L. and Boyden, J. 2010, Solar Phys., 261, 11.
Generalized compliant motion primitive
NASA Technical Reports Server (NTRS)
Backes, Paul G. (Inventor)
1994-01-01
This invention relates to a general primitive for controlling a telerobot with a set of input parameters. The primitive includes a trajectory generator, a teleoperation sensor, a joint limit generator, a force setpoint generator, and a dither function generator, which produce telerobot motion inputs in a common coordinate frame for simultaneous combination in sensor summers. Virtual return spring motion input is provided by a restoration spring subsystem. The novel features of this invention include the use of a single general motion primitive at a remote site to permit shared and supervisory control of the robot manipulator to perform tasks via a remotely transferred input parameter set.
Bessel functions in mass action modeling of memories and remembrances
NASA Astrophysics Data System (ADS)
Freeman, Walter J.; Capolupo, Antonio; Kozma, Robert; Olivares del Campo, Andrés; Vitiello, Giuseppe
2015-10-01
Data from experimental observations of a class of neurological processes (Freeman K-sets) present functional distributions reproducing Bessel function behavior. We model such processes with coupled pairs of damped/amplified oscillators, which provide a time-dependent representation of the Bessel equation. The root loci of poles and zeros conform to solutions of K-sets. Some light is shed on the problem of filling the gap between cellular-level dynamics and brain functional activity. Breakdown of time-reversal symmetry is related to the thermodynamic features of the cortex. This provides a possible mechanism to deduce the lifetime of recorded memory.
Radio-nuclide mixture identification using medium energy resolution detectors
Nelson, Karl Einar
2013-09-17
According to one embodiment, a method for identifying radio-nuclides includes receiving spectral data, extracting a feature set from the spectral data comparable to a plurality of templates in a template library, and using a branch and bound method to determine a probable template match based on the feature set and templates in the template library. In another embodiment, a device for identifying unknown radio-nuclides includes a processor, a multi-channel analyzer, and a memory operatively coupled to the processor, the memory having computer readable code stored thereon. The computer readable code is configured, when executed by the processor, to receive spectral data, to extract a feature set from the spectral data comparable to a plurality of templates in a template library, and to use a branch and bound method to determine a probable template match based on the feature set and templates in the template library.
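A generic branch-and-bound sketch, not the patented method (the actual feature set, bound, and scoring are not described here): it selects at most K library templates whose least-squares mixture best explains an observed spectrum, pruning any subtree whose optimistic residual cannot beat the incumbent. The template library and spectrum are hypothetical three-bin toys.

```python
# Sketch: branch and bound over template subsets for mixture identification.
import numpy as np

def residual(names, cols, spectrum):
    """Least-squares residual of the best mixture of the named templates."""
    if not names:
        return float(spectrum @ spectrum)
    A = np.column_stack([cols[n] for n in names])
    coef, *_ = np.linalg.lstsq(A, spectrum, rcond=None)
    r = spectrum - A @ coef
    return float(r @ r)

def branch_and_bound(templates, spectrum, K=2):
    names = list(templates)
    cols = {n: np.asarray(v, float) for n, v in templates.items()}
    best = [np.inf, ()]  # incumbent: [error, chosen template names]

    def visit(i, chosen):
        # Optimistic bound: residual if every undecided template could still
        # be used. No subset in this subtree can do better, so prune on it.
        if residual(chosen + names[i:], cols, spectrum) >= best[0]:
            return
        err = residual(chosen, cols, spectrum)
        if err < best[0]:
            best[:] = [err, tuple(chosen)]
        if i < len(names) and len(chosen) < K:
            visit(i + 1, chosen + [names[i]])  # branch: include template i
            visit(i + 1, chosen)               # branch: exclude template i

    visit(0, [])
    return best

library = {"Cs137": [0.1, 0.9, 0.2], "Co60": [0.5, 0.1, 0.8],
           "K40": [0.3, 0.3, 0.3]}            # hypothetical templates
print(branch_and_bound(library, np.array([0.6, 1.0, 1.0]), K=2))
```

The bound is valid because adding templates to a least-squares fit can only lower the residual, so the fit using all undecided templates is optimistic for every subset in the subtree.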
Geometry and combinatorics of Julia sets of real quadratic maps
NASA Astrophysics Data System (ADS)
Barnsley, M. F.; Geronimo, J. S.; Harrington, A. N.
1984-10-01
For real λ a correspondence is made between the Julia set B_λ for z → (z − λ)², in the hyperbolic case, and the set of λ-chains λ ± √(λ ± √(λ ± ···)), with the aid of Cremer's theorem. It is shown how a number of features of B_λ can be understood in terms of λ-chains. The structure of B_λ is determined by certain equivalence classes of λ-chains, fixed by orders of visitation of certain real cycles; and the bifurcation history of a given cycle can be conveniently computed via the combinatorics of λ-chains. The functional equations obeyed by attractive cycles are investigated, and their relation to λ-chains is given. The first cascade of period-doubling bifurcations is described from the point of view of the associated Julia sets and λ-chains. Certain "Julia sets" associated with the Feigenbaum function and some theorems of Lanford are discussed.
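The λ-chain description invites a direct numerical experiment (our illustration, not from the paper): the inverse branches of z → (z − λ)² are z → λ ± √z, so randomly iterating them backward from almost any seed accumulates on B_λ, with the sequence of ± signs playing the role of a λ-chain:

```python
# Inverse (backward) iteration for the Julia set of f(z) = (z - lam)^2.
# The preimages of w are lam + sqrt(w) and lam - sqrt(w); random sign
# choices trace out a lam-chain lam +/- sqrt(lam +/- sqrt(...)).
import cmath
import random

def julia_points(lam, n_points=20000, burn_in=50, seed=1.0 + 0.0j):
    random.seed(0)
    z, points = seed, []
    for i in range(n_points + burn_in):
        z = lam + random.choice((1, -1)) * cmath.sqrt(z)
        if i >= burn_in:                  # discard the transient toward B_lam
            points.append(z)
    return points

pts = julia_points(lam=3.0)               # a real lam in the hyperbolic regime
print(pts[:5])
```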
Extracting insights from the shape of complex data using topology
Lum, P. Y.; Singh, G.; Lehman, A.; Ishkanov, T.; Vejdemo-Johansson, M.; Alagappan, M.; Carlsson, J.; Carlsson, G.
2013-01-01
This paper applies topological methods to study complex high dimensional data sets by extracting shapes (patterns) and obtaining insights about them. Our method combines the best features of existing standard methodologies such as principal component and cluster analyses to provide a geometric representation of complex data sets. Through this hybrid method, we often find subgroups in data sets that traditional methodologies fail to find. Our method also permits the analysis of individual data sets as well as the analysis of relationships between related data sets. We illustrate the use of our method by applying it to three very different kinds of data, namely gene expression from breast tumors, voting data from the United States House of Representatives and player performance data from the NBA, in each case finding stratifications of the data which are more refined than those produced by standard methods. PMID:23393618
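The construction described is in the spirit of the Mapper algorithm; a minimal sketch under simplifying assumptions (a one-dimensional PCA filter, a fixed overlapping interval cover, and single-linkage clustering in each preimage) might look like:

```python
# Minimal Mapper-style sketch: filter the data, cover the filter range with
# overlapping intervals, cluster each preimage, and connect clusters that
# share points. Filter, cover, and clustering choices are simplifications.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from sklearn.decomposition import PCA

def mapper(X, n_intervals=8, overlap=0.4, link_dist=1.0):
    f = PCA(n_components=1).fit_transform(X).ravel()   # filter function
    lo, hi = f.min(), f.max()
    width = (hi - lo) / n_intervals
    nodes, edges = [], set()
    for i in range(n_intervals):
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        idx = np.where((f >= a) & (f <= b))[0]
        if len(idx) < 2:
            continue
        labels = fcluster(linkage(X[idx], method="single"),
                          t=link_dist, criterion="distance")
        for lab in np.unique(labels):
            nodes.append(set(idx[labels == lab]))
    for i, a in enumerate(nodes):          # edge iff two nodes share points
        for j in range(i + 1, len(nodes)):
            if a & nodes[j]:
                edges.add((i, j))
    return nodes, edges

X = np.random.default_rng(2).normal(size=(300, 5))
nodes, edges = mapper(X)
print(len(nodes), "nodes,", len(edges), "edges")
```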
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cunliffe, Alexandra; Armato, Samuel G.; Castillo, Richard
2015-04-01
Purpose: To assess the relationship between radiation dose and change in a set of mathematical intensity- and texture-based features and to determine the ability of texture analysis to identify patients who develop radiation pneumonitis (RP). Methods and Materials: A total of 106 patients who received radiation therapy (RT) for esophageal cancer were retrospectively identified under institutional review board approval. For each patient, diagnostic computed tomography (CT) scans were acquired before (0-168 days) and after (5-120 days) RT, and a treatment planning CT scan with an associated dose map was obtained. 32- × 32-pixel regions of interest (ROIs) were randomly identified in the lungs of each pre-RT scan. ROIs were subsequently mapped to the post-RT scan and the planning scan dose map by using deformable image registration. The changes in 20 feature values (ΔFV) between pre- and post-RT scan ROIs were calculated. Regression modeling and analysis of variance were used to test the relationships between ΔFV, mean ROI dose, and development of grade ≥2 RP. Area under the receiver operating characteristic curve (AUC) was calculated to determine each feature's ability to distinguish between patients with and those without RP. A classifier was constructed to determine whether 2- or 3-feature combinations could improve RP distinction. Results: For all 20 features, a significant ΔFV was observed with increasing radiation dose. Twelve features changed significantly for patients with RP. Individual texture features could discriminate between patients with and those without RP with moderate performance (AUCs from 0.49 to 0.78). Using multiple features in a classifier, AUC increased significantly (0.59-0.84). Conclusions: A relationship between dose and change in a set of image-based features was observed. For 12 features, ΔFV was significantly related to RP development. This study demonstrated the ability of radiomics to provide a quantitative, individualized measurement of patient lung tissue reaction to RT and assess RP development.
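The ΔFV-then-AUC pipeline can be illustrated with simplified first-order features and synthetic data standing in for the paper's 20-feature set and patient cohort:

```python
# Sketch of the delta-feature-value (dFV) analysis: compute simple first-order
# features per ROI before and after RT, take the change, and score each
# feature's ability to separate RP from non-RP patients. Data are synthetic.
import numpy as np
from sklearn.metrics import roc_auc_score

def roi_features(roi):
    """First-order intensity features of one 32x32 ROI (illustrative set)."""
    return np.array([roi.mean(), roi.std(),
                     np.percentile(roi, 90) - np.percentile(roi, 10)])

rng = np.random.default_rng(3)
n_patients = 40
has_rp = rng.random(n_patients) < 0.3            # synthetic RP labels

delta_fv = []
for rp in has_rp:
    pre = rng.normal(-800, 60, size=(32, 32))    # HU-like lung intensities
    shift = 120 if rp else 30                    # synthetic post-RT change
    post = pre + rng.normal(shift, 20, size=(32, 32))
    delta_fv.append(roi_features(post) - roi_features(pre))
delta_fv = np.array(delta_fv)

for k, name in enumerate(["mean", "std", "p90-p10"]):
    auc = roc_auc_score(has_rp, delta_fv[:, k])
    print(f"dFV {name}: AUC = {auc:.2f}")
```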
Meares, Russell; Gerull, Friederike; Stevenson, Janine; Korner, Anthony
2011-03-01
To determine which constellation of clinical features constitutes the core of borderline personality disorder (BPD). The criterion of endurance was used to identify the constellation of features which are most basic, or core, in borderline personality disorder. Two sets of constellations of DSM-III features were tested, each consisting of three groupings. The first set of constellations was constructed according to Clarkin's factor analysis; the second was theoretically derived. Broadly speaking, the three groupings concerned 'self', 'emotional regulation', and 'impulse'. Changes of these constellations were charted over one year in a comparison of the effect of treatment by the Conversational Model (n = 29) with treatment as usual (n = 31). In addition, measures of typical depression (Zung) were scored before and after the treatment period. The changes in the constellations were considered in relation to authoritative opinion. The changes in the two sets of constellations were similar. In the treatment as usual (TAU) group, 'self' endured unchanged, while 'emotional regulation' and 'impulse' improved. In the Conversational Model cohort, 'self' improved, 'emotional regulation' improved more greatly than the TAU group, while 'impulse' improved but not more than the treatment as usual group. Depression scores were not particularly associated with any grouping. A group of features including self/identity disturbance, emptiness and fear of abandonment may be at the core of BPD. Correlations between the three groupings and Zung scores favoured the view that the core affect is not typical depression. Rather, the central state may be 'painful incoherence'. It is suggested that the findings have implications for the refinement and elaboration of treatment methods in borderline personality disorder.
VARS-TOOL: A Comprehensive, Efficient, and Robust Sensitivity Analysis Toolbox
NASA Astrophysics Data System (ADS)
Razavi, S.; Sheikholeslami, R.; Haghnegahdar, A.; Esfahbod, B.
2016-12-01
VARS-TOOL is an advanced sensitivity and uncertainty analysis toolbox, applicable to the full range of computer simulation models, including Earth and Environmental Systems Models (EESMs). The toolbox was developed originally around VARS (Variogram Analysis of Response Surfaces), which is a general framework for Global Sensitivity Analysis (GSA) that utilizes the variogram/covariogram concept to characterize the full spectrum of sensitivity-related information, thereby providing a comprehensive set of "global" sensitivity metrics with minimal computational cost. VARS-TOOL is unique in that, with a single sample set (set of simulation model runs), it generates simultaneously three philosophically different families of global sensitivity metrics: (1) variogram-based metrics called IVARS (Integrated Variogram Across a Range of Scales - the VARS approach), (2) variance-based total-order effects (the Sobol approach), and (3) derivative-based elementary effects (the Morris approach). VARS-TOOL also includes two novel features: the first is a sequential sampling algorithm, called Progressive Latin Hypercube Sampling (PLHS), which allows progressively increasing the sample size for GSA while maintaining the required sample distributional properties; the second is a "grouping strategy" that adaptively groups the model parameters based on their sensitivity or functioning to maximize the reliability of GSA results. These features, in conjunction with bootstrapping, enable the user to monitor the stability, robustness, and convergence of GSA as the sample size increases for any given case study. VARS-TOOL has been shown to achieve robust and stable results with sample sizes 1-2 orders of magnitude smaller (fewer model runs) than alternative tools. VARS-TOOL, available in MATLAB and Python, is under continuous development, and new capabilities and features are forthcoming.
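A bare-bones illustration of the variogram idea at the heart of VARS (not the toolbox's implementation; the test model, sample sizes, and wrap-around perturbation are illustrative): the directional variogram of the model response along one parameter, integrated up to a scale H, yields an IVARS-like sensitivity measure.

```python
# Directional variogram of a model response along one parameter, integrated
# up to scale H (an IVARS-like metric). The two-parameter test model is a toy.
import numpy as np

def model(x):
    return np.sin(3 * x[..., 0]) + 0.1 * x[..., 1]

def ivars_like(model, dim, which, h_values, n_base=2000, seed=4):
    rng = np.random.default_rng(seed)
    base = rng.random((n_base, dim))
    y0 = model(base)
    gammas = []
    for h in h_values:
        pert = base.copy()
        pert[:, which] = (pert[:, which] + h) % 1.0          # pairs separated by h
        gammas.append(0.5 * np.mean((model(pert) - y0) ** 2))  # gamma(h)
    return np.trapz(gammas, h_values)                        # integrate up to H

hs = np.linspace(0.01, 0.3, 10)
for p in (0, 1):
    print(f"IVARS-like index for parameter {p}: {ivars_like(model, 2, p, hs):.4f}")
```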
Discovering semantic features in the literature: a foundation for building functional associations
Chagoyen, Monica; Carmona-Saez, Pedro; Shatkay, Hagit; Carazo, Jose M; Pascual-Montano, Alberto
2006-01-01
Background Experimental techniques such as DNA microarray, serial analysis of gene expression (SAGE) and mass spectrometry proteomics, among others, are generating large amounts of data related to genes and proteins at different levels. As in any other experimental approach, it is necessary to analyze these data in the context of previously known information about the biological entities under study. The literature is a particularly valuable source of information for experiment validation and interpretation. Therefore, the development of automated text mining tools to assist in such interpretation is one of the main challenges in current bioinformatics research. Results We present a method to create literature profiles for large sets of genes or proteins based on common semantic features extracted from a corpus of relevant documents. These profiles can be used to establish pair-wise similarities among genes, can be utilized in gene/protein classification, or can even be combined with experimental measurements. Semantic features can be used by researchers to facilitate the understanding of the commonalities indicated by experimental results. Our approach is based on non-negative matrix factorization (NMF), a machine-learning algorithm for data analysis, capable of identifying local patterns that characterize a subset of the data. The literature is thus used to establish putative relationships among subsets of genes or proteins and to provide coherent justification for this clustering into subsets. We demonstrate the utility of the method by applying it to two independent and vastly different sets of genes. Conclusion The presented method can create literature profiles from documents relevant to sets of genes. The representation of genes as additive linear combinations of semantic features allows for the exploration of functional associations as well as for clustering, suggesting a valuable methodology for the validation and interpretation of high-throughput experimental data. PMID:16438716
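The NMF step can be sketched with scikit-learn on a toy corpus (the gene names and document contents below are placeholders, not data from the paper):

```python
# Literature profiles via NMF: factor a TF-IDF term-document matrix so each
# gene-associated document is an additive combination of semantic features.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = {                                    # placeholder gene -> abstract text
    "geneA": "dna repair damage response checkpoint",
    "geneB": "cell cycle checkpoint mitosis spindle",
    "geneC": "glycolysis metabolism atp enzyme",
    "geneD": "enzyme kinase metabolism phosphorylation",
}
tfidf = TfidfVectorizer()
V = tfidf.fit_transform(docs.values())      # documents x terms

nmf = NMF(n_components=2, init="nndsvda", random_state=0)
W = nmf.fit_transform(V)                    # gene profiles over semantic features
H = nmf.components_                         # semantic features over terms

terms = tfidf.get_feature_names_out()
for k, row in enumerate(H):
    top = [terms[i] for i in row.argsort()[::-1][:3]]
    print(f"semantic feature {k}: {top}")
for gene, w in zip(docs, W):
    print(gene, w.round(2))                 # additive combination weights
```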
NASA Technical Reports Server (NTRS)
2004-01-01
KENNEDY SPACE CENTER, FLA. Employees at KSC stroll among several tents featuring vendors' exhibits of safety- and health-related products. The exhibits were part of Spaceport Super Safety and Health Day, which also featured presentations by guest speakers Dr. Pamela Peeke, Navy Com. Stephen E. Iwanowicz, NASA's Dr. Kristine Calderon and Olympic great Bruce Jenner. Vendors' exhibits were set up in the parking areas outside the Vehicle Assembly Building (seen here) and the O&C Building. The annual event was initiated at KSC in 1998 to increase awareness of the importance of safety and health among the government and contractor workforce. The theme for this year's event was Safety and Health: A Winning Combination.
Fernández, Alberto; Carmona, Cristobal José; José Del Jesus, María; Herrera, Francisco
2017-09-01
Imbalanced classification refers to problems with an uneven distribution of instances among classes. When, in addition, instances are located in overlapping areas, correct modeling of the problem becomes harder. Current solutions for both issues are often focused on the binary case study, as multi-class datasets require an additional effort to be addressed. In this research, we overcome these problems by combining feature and instance selection. Feature selection simplifies the overlapping areas, easing the generation of rules to distinguish among the classes. Selection of instances from all classes addresses the imbalance itself by finding the most appropriate class distribution for the learning task, as well as possibly removing noise and difficult borderline examples. To obtain an optimal joint set of features and instances, we embedded the search for both in a Multi-Objective Evolutionary Algorithm, using the C4.5 decision tree as the baseline classifier in this wrapper approach. The multi-objective scheme offers a double advantage: the search space becomes broader, and we may provide a set of different solutions in order to build an ensemble of classifiers. This proposal has been compared against several state-of-the-art solutions on imbalanced classification, showing excellent results in both binary and multi-class problems.
Chen, Peng; Li, Jinyan; Wong, Limsoon; Kuwahara, Hiroyuki; Huang, Jianhua Z; Gao, Xin
2013-08-01
Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time-consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from a set of the 544 properties in AAindex1, an amino acid index database. Each feature was utilized to train a classification model with a novel encoding schema for hot spot prediction by the IBk algorithm, an extension of the K-nearest neighbor algorithm. The combinations of the individual classifiers were explored, and the classifiers that appeared frequently in the top-performing combinations were selected. The hot spot predictor was built on an ensemble of these classifiers and works in a voting manner. Experimental results demonstrated that our method effectively exploits the feature space and allows flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state-of-the-art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx. Copyright © 2013 Wiley Periodicals, Inc.
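The per-feature classifier ensemble can be approximated with scikit-learn (a simplified stand-in for the IBk-based voting ensemble, using synthetic data and one KNN member per feature column):

```python
# Ensemble of K-nearest-neighbor classifiers, one per physicochemical feature,
# combined by majority voting. Data and labels below are synthetic.
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer

rng = np.random.default_rng(8)
X = rng.normal(size=(300, 5))                     # 5 per-residue feature columns
y = (X[:, 0] + 0.5 * X[:, 2] > 0).astype(int)     # synthetic hot spot labels

def column_selector(k):
    """Transformer that keeps only feature column k."""
    return FunctionTransformer(lambda Z, k=k: Z[:, [k]])

members = [(f"knn_f{k}",
            make_pipeline(column_selector(k), KNeighborsClassifier(n_neighbors=5)))
           for k in range(X.shape[1])]
ensemble = VotingClassifier(members, voting="hard").fit(X, y)
print(f"training accuracy: {ensemble.score(X, y):.2f}")
```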
System and method for the detection of anomalies in an image
Prasad, Lakshman; Swaminarayan, Sriram
2013-09-03
Preferred aspects of the present invention can include receiving a digital image at a processor; segmenting the digital image into a hierarchy of feature layers comprising one or more fine-scale features defining a foreground object embedded in one or more coarser-scale features defining a background to the one or more fine-scale features in the segmentation hierarchy; detecting a first fine-scale foreground feature as an anomaly with respect to a first background feature within which it is embedded; and constructing an anomalous feature layer by synthesizing spatially contiguous anomalous fine-scale features. Additional preferred aspects of the present invention can include detecting non-pervasive changes between sets of images in response at least in part to one or more difference images between the sets of images.
Effects of band selection on endmember extraction for forestry applications
NASA Astrophysics Data System (ADS)
Karathanassi, Vassilia; Andreou, Charoula; Andronis, Vassilis; Kolokoussis, Polychronis
2014-10-01
In spectral unmixing theory, data reduction techniques play an important role, as hyperspectral imagery contains an immense amount of data, posing many challenging problems such as data storage, computational efficiency, and the so-called "curse of dimensionality". Feature extraction and feature selection are the two main approaches for dimensionality reduction. Feature extraction techniques reduce the dimensionality of the hyperspectral data by applying transforms to it. Feature selection techniques retain the physical meaning of the data by selecting a set of bands from the input hyperspectral dataset which mainly contain the information needed for spectral unmixing. Although feature selection techniques are well known for their dimensionality reduction potential, they are rarely used in the unmixing process. The majority of the existing state-of-the-art dimensionality reduction methods set criteria on the spectral information derived from the whole wavelength range in order to define the optimum spectral subspace. These criteria are not associated with any particular application but with the data statistics, such as correlation and entropy values. However, each application is associated with specific land cover materials, whose spectral characteristics present variations at specific wavelengths. In forestry, for example, many applications focus on tree leaves, in which specific pigments such as chlorophyll, xanthophyll, etc. determine the wavelengths where tree species, diseases, etc. can be detected. For such applications, when the unmixing process is applied, the tree species, diseases, etc. are considered as the endmembers of interest. This paper focuses on investigating the effects of band selection on endmember extraction by exploiting the information of the vegetation absorbance spectral zones. More precisely, it explores whether endmember extraction can be optimized when specific sets of initial bands related to leaf spectral characteristics are selected. Experiments comprise the application of well-known signal subspace estimation and endmember extraction methods to hyperspectral imagery of a forest area. Evaluation of the extracted endmembers showed that more forest species can be extracted as endmembers using selected bands.
NASA Astrophysics Data System (ADS)
Arimura, Hidetaka; Yoshiura, Takashi; Kumazawa, Seiji; Tanaka, Kazuhiro; Koga, Hiroshi; Mihara, Futoshi; Honda, Hiroshi; Sakai, Shuji; Toyofuku, Fukai; Higashida, Yoshiharu
2008-03-01
Our goal in this study was to develop a computer-aided diagnostic (CAD) method for classification of Alzheimer's disease (AD) using atrophic image features derived from specific anatomical regions in three-dimensional (3-D) T1-weighted magnetic resonance (MR) images. In this study, the specific regions related to the cerebral atrophy of AD were white matter regions, gray matter regions, and CSF regions. Cerebral cortical gray matter regions were determined by extracting the brain and white matter regions with a level-set-based method whose speed function depended on gradient vectors in the original image and pixel values in grown regions. The CSF regions in cerebral sulci and lateral ventricles were extracted by wrapping the brain tightly with a zero level set determined from a level set function. Volumes of the specific regions and the cortical thickness were determined as atrophic image features. Average cortical thickness was calculated in 32 subregions, which were obtained by dividing each brain region. Finally, AD patients were classified by using a support vector machine, which was trained on the image features of AD and non-AD cases. We applied our CAD method to MR images of whole brains obtained from 29 clinically diagnosed AD cases and 25 non-AD cases. As a result, the area under the receiver operating characteristic (ROC) curve obtained by our computerized method was 0.901, based on a leave-one-out test for identification of AD cases among 54 cases including 8 AD patients at early stages. The accuracy for discrimination between the 29 AD patients and 25 non-AD subjects was 0.840, determined at the point where the sensitivity equaled the specificity on the ROC curve. This result shows that our CAD method based on atrophic image features may be promising for detecting AD patients in 3-D MR images.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fan, J; Fan, J; Hu, W
Purpose: To develop a fast automatic algorithm based on two-dimensional kernel density estimation (2D KDE) to predict the dose-volume histogram (DVH), which can be employed for the investigation of radiotherapy quality assurance and automatic treatment planning. Methods: We propose a machine learning method that uses previous treatment plans to predict the DVH. The key to the approach is the framing of the DVH in a probabilistic setting. The training consists of estimating, from the patients in the training set, the joint probability distribution of the dose and the predictive features. The joint distribution provides an estimation of the conditional probability of the dose given the values of the predictive features. For the new patient, the prediction consists of estimating the distribution of the predictive features and marginalizing the conditional probability from the training over this. Integrating the resulting probability distribution for the dose yields an estimation of the DVH. The 2D KDE is implemented to predict the joint probability distribution of the training set and the distribution of the predictive features for the new patient. Two variables, the signed minimal distance from each OAR (organ at risk) voxel to the target boundary and its opening angle with respect to the origin of the voxel coordinate system, are considered as the predictive features to represent the OAR-target spatial relationship. The feasibility of our method has been demonstrated with rectum, breast, and head-and-neck cancer cases by comparing the predicted DVHs with the planned ones. Results: Consistent results were found between these two DVHs for each cancer type, and the average of the relative point-wise differences was about 5%, within the clinically acceptable extent. Conclusion: According to the results of this study, our method can be used to predict clinically acceptable DVHs and has the ability to evaluate the quality and consistency of treatment planning.
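The probabilistic framing described above can be sketched directly with synthetic data, simplified to a single predictive feature (signed distance to the target):

```python
# DVH prediction via kernel density estimation: learn the joint density of
# (distance-to-target, dose) from training voxels, condition on the new
# patient's distances, and integrate to a DVH. All data here are synthetic.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(5)

# Training voxels: dose falls off with signed distance from the target.
dist_train = rng.uniform(-1, 5, 4000)
dose_train = np.clip(60 * np.exp(-np.maximum(dist_train, 0) / 1.5)
                     + rng.normal(0, 3, 4000), 0, None)
joint_kde = gaussian_kde(np.vstack([dist_train, dose_train]))

dose_grid = np.linspace(0, 70, 141)

def predicted_dvh(dist_new):
    """Marginalize p(dose | distance) over the new patient's voxel distances."""
    pdf = np.zeros_like(dose_grid)
    for d in dist_new:
        cond = joint_kde(np.vstack([np.full_like(dose_grid, d), dose_grid]))
        pdf += cond / cond.sum()               # normalized conditional p(dose|d)
    pdf /= len(dist_new)
    cdf = np.cumsum(pdf)
    return 1.0 - cdf + pdf                     # fraction of voxels receiving >= dose

dvh = predicted_dvh(rng.uniform(-1, 5, 100))
print(f"fraction receiving >= 20 dose units: {dvh[dose_grid >= 20][0]:.2f}")
```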
GATOR: Requirements capturing of telephony features
NASA Technical Reports Server (NTRS)
Dankel, Douglas D., II; Walker, Wayne; Schmalz, Mark
1992-01-01
We are developing a natural language-based, requirements gathering system called GATOR (for the GATherer Of Requirements). GATOR assists in the development of more accurate and complete specifications of new telephony features. GATOR interacts with a feature designer who describes a new feature, set of features, or capability to be implemented. The system aids this individual in the specification process by asking for clarifications when potential ambiguities are present, by identifying potential conflicts with other existing features, and by presenting its understanding of the feature to the designer. Through user interaction with a model of the existing telephony feature set, GATOR constructs a formal representation of the new, 'to be implemented' feature. Ultimately GATOR will produce a requirements document and will maintain an internal representation of this feature to aid in future design and specification. This paper consists of three sections that describe (1) the structure of GATOR, (2) POND, GATOR's internal knowledge representation language, and (3) current research issues.
Environmental modeling and recognition for an autonomous land vehicle
NASA Technical Reports Server (NTRS)
Lawton, D. T.; Levitt, T. S.; Mcconnell, C. C.; Nelson, P. C.
1987-01-01
An architecture for object modeling and recognition for an autonomous land vehicle is presented. Examples of objects of interest include terrain features, fields, roads, horizon features, trees, etc. The architecture is organized around a set of data bases for generic object models and perceptual structures, temporary memory for the instantiation of object and relational hypotheses, and a long term memory for storing stable hypotheses that are affixed to the terrain representation. Multiple inference processes operate over these databases. Researchers describe these particular components: the perceptual structure database, the grouping processes that operate over this, schemas, and the long term terrain database. A processing example that matches predictions from the long term terrain model to imagery, extracts significant perceptual structures for consideration as potential landmarks, and extracts a relational structure to update the long term terrain database is given.
GARNET--gene set analysis with exploration of annotation relations.
Rho, Kyoohyoung; Kim, Bumjin; Jang, Youngjun; Lee, Sanghyun; Bae, Taejeong; Seo, Jihae; Seo, Chaehwa; Lee, Jihyun; Kang, Hyunjung; Yu, Ungsik; Kim, Sunghoon; Lee, Sanghyuk; Kim, Wan Kyu
2011-02-15
Gene set analysis is a powerful method of deducing biological meaning for an a priori defined set of genes. Numerous tools have been developed to test statistical enrichment or depletion in specific pathways or gene ontology (GO) terms. Major difficulties for biological interpretation are integrating diverse types of annotation categories and exploring the relationships between annotation terms of similar information. GARNET (Gene Annotation Relationship NEtwork Tools) is an integrative platform for gene set analysis with many novel features. It includes tools for retrieval of genes from annotation databases, statistical analysis and visualization of annotation relationships, and managing gene sets. In an effort to allow access to a full spectrum of amassed biological knowledge, we have integrated a variety of annotation data that include GO, domain, disease, drug, chromosomal location, and custom-defined annotations. Diverse types of molecular networks (pathways, transcription and microRNA regulation, protein-protein interaction) are also included. The pair-wise relationship between annotation gene sets was calculated using kappa statistics. GARNET consists of three modules (gene set manager, gene set analysis, and gene set retrieval) which are tightly integrated to provide virtually automatic analysis for gene sets. A dedicated viewer for the annotation network has been developed to facilitate exploration of the related annotations. GARNET is an integrative platform for diverse types of gene set analysis, where complex relationships among gene annotations can be easily explored with an intuitive network visualization tool (http://garnet.isysbio.org/ or http://ercsb.ewha.ac.kr/garnet/).
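The pairwise kappa computation reduces to a small function (our sketch; the gene universe and annotation sets below are placeholders):

```python
# Cohen's kappa between two annotation gene sets over a common gene universe,
# as a chance-corrected measure of their overlap.
def kappa(set_a, set_b, universe):
    n = len(universe)
    both = len(set_a & set_b)
    only_a = len(set_a - set_b)
    only_b = len(set_b - set_a)
    neither = n - both - only_a - only_b
    observed = (both + neither) / n
    p_yes = (both + only_a) / n * (both + only_b) / n   # both annotate a gene
    p_no = (neither + only_b) / n * (neither + only_a) / n  # neither does
    expected = p_yes + p_no
    return (observed - expected) / (1 - expected)

universe = {f"g{i}" for i in range(1000)}
dna_repair = {f"g{i}" for i in range(0, 120)}            # placeholder sets
damage_resp = {f"g{i}" for i in range(60, 200)}
print(f"kappa = {kappa(dna_repair, damage_resp, universe):.3f}")
```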
Schwämmle, Veit; León, Ileana Rodríguez; Jensen, Ole Nørregaard
2013-09-06
Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
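A compact sketch of a rank product statistic tolerant of missing values (a simplified reading of the adaptation described above, not the authors' implementation): rank features within each replicate, then take the geometric mean over whichever ranks are available.

```python
# Rank products over replicates with missing values: rank features within
# each replicate (NaNs excluded), then take the geometric mean of whatever
# ranks each feature actually has. Smaller RP = more consistently extreme.
import numpy as np

def rank_product(data):
    """data: features x replicates, NaN marks a missing measurement."""
    ranks = np.full_like(data, np.nan)
    for j in range(data.shape[1]):
        obs = ~np.isnan(data[:, j])
        ranks[obs, j] = np.argsort(np.argsort(data[obs, j])) + 1  # 1-based ranks
    obs_mask = ~np.isnan(ranks)
    log_sum = np.log(np.where(obs_mask, ranks, 1.0)).sum(axis=1)  # NaNs add 0
    counts = obs_mask.sum(axis=1)
    return np.where(counts > 0, np.exp(log_sum / np.maximum(counts, 1)), np.inf)

rng = np.random.default_rng(6)
fold_changes = rng.normal(0, 1, size=(100, 3))
fold_changes[:5] -= 3.0                                      # truly changing features
fold_changes[rng.random(fold_changes.shape) < 0.3] = np.nan  # 30% missing
print("top features by rank product:", np.argsort(rank_product(fold_changes))[:5])
```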
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhang, H; Wang, J; Chuong, M
2015-06-15
Purpose: To evaluate the role of mid-treatment and post-treatment FDG-PET/CT in predicting progression-free survival (PFS) and distant metastasis (DM) of anal cancer patients treated with chemoradiotherapy (CRT). Methods: 17 anal cancer patients treated with CRT were retrospectively studied. The median prescription dose was 56 Gy (range, 50–62.5 Gy). All patients underwent FDG-PET/CT scans before and after CRT. 16 of the 17 patients had an additional FDG-PET/CT image at 3–5 weeks into the treatment (denoted as mid-treatment FDG-PET/CT). 750 features were extracted from these three sets of scans, which included both traditional PET/CT measures (SUVmax, SUVpeak, tumor diameters, etc.) and spatial-temporal PET/CT features (which comprehensively quantify a tumor's FDG uptake intensity and distribution, spatial variation (texture), geometric properties, and their temporal changes relative to baseline). 26 clinical parameters (age, gender, TNM stage, histology, GTV dose, etc.) were also analyzed. Advanced analytics were developed, including methods to select an optimal set of predictors and a model selection engine that identifies the most accurate machine learning algorithm for predictive analysis. Results: Comparing the baseline + mid-treatment PET/CT set to the baseline + post-treatment PET/CT set, 14 predictors were selected from each feature group. The same three clinical parameters (tumor size, T stage, and whether 5-FU was held during any cycle of chemotherapy) and two traditional measures (pre-CRT SUVmin and SUVmedian) were selected by both predictor groups. A different mix of spatial-temporal PET/CT features was selected. Using the 14 predictors and Naive Bayes, the mid-treatment PET/CT set achieved 87.5% accuracy (2 PFS patients misclassified; all local recurrence and DM patients correctly classified). The post-treatment PET/CT set achieved 94.0% accuracy (all PFS and DM patients correctly predicted, 1 local recurrence patient misclassified) with a logistic regression, neural network, or support vector machine model. Conclusion: Applying a radiomics approach to either mid-treatment or post-treatment PET/CT could achieve high accuracy in predicting anal cancer treatment outcomes. This work was supported in part by the National Cancer Institute Grant R01CA172638.
Feature engineering for MEDLINE citation categorization with MeSH.
Jimeno Yepes, Antonio Jose; Plaza, Laura; Carrillo-de-Albornoz, Jorge; Mork, James G; Aronson, Alan R
2015-04-08
Research in biomedical text categorization has mostly used the bag-of-words representation. Other more sophisticated representations of text based on syntactic, semantic and argumentative properties have been less studied. In this paper, we evaluate the impact of different text representations of biomedical texts as features for reproducing the MeSH annotations of some of the most frequent MeSH headings. In addition to unigrams and bigrams, these features include noun phrases, citation meta-data, citation structure, and semantic annotation of the citations. Traditional features like unigrams and bigrams exhibit strong performance compared to other feature sets. Little or no improvement is obtained when using meta-data or citation structure. Noun phrases are too sparse and thus have lower performance compared to more traditional features. Conceptual annotation of the texts by MetaMap shows similar performance compared to unigrams, but adding concepts from the UMLS taxonomy does not improve the performance of using only mapped concepts. The combination of all the features performs largely better than any individual feature set considered. In addition, this combination improves the performance of a state-of-the-art MeSH indexer. Concerning the machine learning algorithms, we find that those that are more resilient to class imbalance largely obtain better performance. We conclude that even though traditional features such as unigrams and bigrams have strong performance compared to other features, it is possible to combine them to effectively improve the performance of the bag-of-words representation. We have also found that the combination of the learning algorithm and feature sets has an influence in the overall performance of the system. Moreover, using learning algorithms resilient to class imbalance largely improves performance. However, when using a large set of features, consideration needs to be taken with algorithms due to the risk of over-fitting. Specific combinations of learning algorithms and features for individual MeSH headings could further increase the performance of an indexing system.
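The "combine representations" conclusion maps naturally onto a feature-union pipeline (a generic sketch, not the authors' indexing system; the citations and labels are toy data, and class_weight="balanced" stands in for an imbalance-resilient learner):

```python
# Combining unigram and bigram representations of citations and training an
# imbalance-aware linear classifier, one binary model per MeSH heading.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import FeatureUnion, Pipeline

citations = ["protein binding assay results", "randomized clinical trial of drug",
             "mouse model of protein misfolding", "phase iii trial outcomes"]
has_heading = [0, 1, 0, 1]          # toy labels for one MeSH heading

model = Pipeline([
    ("features", FeatureUnion([
        ("unigrams", TfidfVectorizer(ngram_range=(1, 1))),
        ("bigrams", TfidfVectorizer(ngram_range=(2, 2))),
        # noun phrases, citation meta-data, MetaMap concepts, etc. would be
        # added here as further transformers
    ])),
    # class_weight="balanced" as a simple hedge against class imbalance
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])
model.fit(citations, has_heading)
print(model.predict(["new clinical trial of misfolding drug"]))
```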
Single and Multiple Object Tracking Using a Multi-Feature Joint Sparse Representation.
Hu, Weiming; Li, Wei; Zhang, Xiaoqin; Maybank, Stephen
2015-04-01
In this paper, we propose a tracking algorithm based on a multi-feature joint sparse representation. The templates for the sparse representation can include pixel values, textures, and edges. In the multi-feature joint optimization, noise or occlusion is dealt with using a set of trivial templates. A sparse weight constraint is introduced to dynamically select the relevant templates from the full set of templates. A variance ratio measure is adopted to adaptively adjust the weights of different features. The multi-feature template set is updated adaptively. We further propose an algorithm for tracking multi-objects with occlusion handling based on the multi-feature joint sparse reconstruction. The observation model based on sparse reconstruction automatically focuses on the visible parts of an occluded object by using the information in the trivial templates. The multi-object tracking is simplified into a joint Bayesian inference. The experimental results show the superiority of our algorithm over several state-of-the-art tracking algorithms.
A wavelet-based approach for a continuous analysis of phonovibrograms.
Unger, Jakob; Meyer, Tobias; Doellinger, Michael; Hecker, Dietmar J; Schick, Bernhard; Lohscheller, Joerg
2012-01-01
Recently, endoscopic high-speed laryngoscopy has been established for commercial use and constitutes a state-of-the-art technique for examining vocal fold dynamics. Despite overcoming many limitations of the commonly applied stroboscopy, it has not yet gained widespread clinical application. A major drawback is the missing methodology for extracting valuable features to support visual assessment or computer-aided diagnosis. In this paper a compact and descriptive feature set is presented. The feature extraction routines are based on two-dimensional color graphs called phonovibrograms (PVG). These graphs contain the full spatio-temporal pattern of vocal fold dynamics and are therefore suited to derive features that comprehensively describe the vibration pattern of the vocal folds. Within our approach, clinically relevant characteristics such as glottal closure type, symmetry, and periodicity are quantified in a set of 10 descriptive features. The suitability for classification tasks is shown using a clinical data set comprising 50 healthy and 50 paralytic subjects. A classification accuracy of 93.2% has been achieved.
Falkowski, Andrzej; Jabłońska, Magdalena
2018-01-01
In this study we followed the extension of Tversky's research on features of similarity to its application to open sets. Unlike the original closed-set model, in which a feature was shifted between a common and a distinctive set, we investigated how the addition of new features and the deletion of existing features affected similarity judgments. The model was tested empirically in a political context, and we analyzed how positive and negative changes in a candidate's profile affect the similarity of the politician to his or her ideal and opposite counterpart. The results showed a positive–negative asymmetry in comparison judgments, where enhancing negative features (distinctive for an ideal political candidate) had a greater effect on judgments than operations on positive (common) features. However, the effect was not observed for comparisons to a bad politician. Further analyses showed that in the case of a negative reference point, the relationship between similarity judgments and voting intention was mediated by the affective evaluation of the candidate. PMID:29535663
Intrusion detection using rough set classification.
Zhang, Lian-hua; Zhang, Guan-hua; Zhang, Jie; Bai, Ying-cai
2004-09-01
Recently, machine learning-based intrusion detection approaches have been the subject of extensive research because they can detect both misuse and anomaly. In this paper, rough set classification (RSC), a modern learning algorithm, is used to rank the features extracted for detecting intrusions and to generate intrusion detection models. Feature ranking is a very critical step when building the model. RSC performs feature ranking before generating rules, and converts feature ranking into a minimal hitting set problem, which is addressed using a genetic algorithm (GA). In classical approaches this is done using a Support Vector Machine (SVM) by executing many iterations, each of which removes one useless feature; compared with those methods, our method can avoid many iterations. In addition, a hybrid genetic algorithm is proposed to increase the convergence speed and decrease the training time of RSC. The models generated by RSC take the form of "IF-THEN" rules, which have the advantage of being readily interpretable. Tests and comparison of RSC with SVM on the DARPA benchmark data showed that for Probe and DoS attacks both RSC and SVM yielded highly accurate results (greater than 99% accuracy on the testing set).
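A toy illustration of the GA-for-minimal-hitting-set step (our sketch; the paper's encoding and operators are not specified here): each chromosome is a feature mask, and fitness rewards hitting every discernibility set with as few features as possible.

```python
# Toy genetic algorithm for the minimal hitting set problem: find a small
# set of features that intersects every given discernibility set.
import random

random.seed(0)
N_FEATURES = 12
SETS = [{0, 3}, {3, 5, 7}, {1, 7}, {2, 7, 9}, {4, 7}]   # sets to hit

def fitness(mask):
    chosen = {i for i, bit in enumerate(mask) if bit}
    hits = sum(bool(chosen & s) for s in SETS)
    return hits * N_FEATURES - len(chosen)    # hit all sets first, then shrink

def evolve(pop_size=40, generations=60, p_mut=0.05):
    pop = [[random.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(survivors):
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, N_FEATURES)       # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ (random.random() < p_mut) for bit in child]  # mutation
            children.append(child)
        pop = survivors + children
    best = max(pop, key=fitness)
    return {i for i, bit in enumerate(best) if bit}

print("hitting set found:", evolve())
```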
Zhu, Xiaolei; Mitchell, Julie C
2011-09-01
Hot spots constitute a small fraction of protein-protein interface residues, yet they account for a large fraction of the binding affinity. Based on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. A number of improvements were made in developing these new methods. First, we created a training data set that contained a similar number of hot spot and non-hot spot residues. In addition, we generated 47 different features, and different numbers of features were used to train the models to avoid over-fitting. Finally, two feature combinations were selected: One (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. The two models were built using support vector machines (SVM). The two KFC2 models were then tested on a mixed independent test set, and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.85); however, the false positive rate was somewhat higher than for other models. KFC2b showed the best predictive accuracy for hot spot residues (True Positive Rate: TPR = 0.62) among all methods other than KFC2a, and the False Positive Rate (FPR = 0.15) was comparable with other highly predictive methods. Copyright © 2011 Wiley-Liss, Inc.
Eigenfunctions and Eigenvalues for a Scalar Riemann-Hilbert Problem Associated to Inverse Scattering
NASA Astrophysics Data System (ADS)
Pelinovsky, Dmitry E.; Sulem, Catherine
A complete set of eigenfunctions is introduced within the Riemann-Hilbert formalism for spectral problems associated to some solvable nonlinear evolution equations. In particular, we consider the time-independent and time-dependent Schrödinger problems which are related to the KdV and KPI equations possessing solitons and lumps, respectively. Non-standard scalar products, orthogonality and completeness relations are derived for these problems. The complete set of eigenfunctions is used for perturbation theory and bifurcation analysis of eigenvalues supported by the potentials under perturbations. We classify two different types of bifurcations of new eigenvalues and analyze their characteristic features. One type corresponds to thresholdless generation of solitons in the KdV equation, while the other predicts a threshold for generation of lumps in the KPI equation.
Groundwater sapping channels: Summary of effects of experiments with varied stratigraphy
NASA Technical Reports Server (NTRS)
Kochel, R. Craig; Simmons, David W.
1987-01-01
Experiments in the recirculating flume sapping box have modeled valley formation by groundwater sapping processes in a number of settings. The effects of the following parameters on sapping channel morphology were examined: surface slope; stratigraphic variations in permeability, cohesion, and dip; and the structure of joints and dikes. These kinds of modeling experiments are particularly good for testing concepts; developing a suite of distinctive morphologies and morphometries indicative of sapping; helping to relate process to morphology; and providing data necessary to assess the relative importance of runoff, sapping, and mass wasting processes in channel development. The observations from the flume systems can be used to help interpret features observed in terrestrial and Martian settings where sapping processes are thought to have played an important role in the development of valley networks.
Learning representations for the early detection of sepsis with deep neural networks.
Kam, Hye Jin; Kim, Ha Young
2017-10-01
Sepsis is one of the leading causes of death in intensive care unit patients. Early detection of sepsis is vital because mortality increases as the sepsis stage worsens. This study aimed to develop detection models for the early stage of sepsis using deep learning methodologies, and to compare the feasibility and performance of the new deep learning methodology with those of the regression method with conventional temporal feature extraction. Study group selection adhered to the InSight model. The results of the deep learning-based models and the InSight model were compared. With deep feedforward networks, the areas under the ROC curve (AUC) of the models were 0.887 and 0.915 for the InSight and the new feature sets, respectively. For the model with the combined feature set, the AUC was the same as that of the basic feature set (0.915). For the long short-term memory (LSTM) model, only the basic feature set was applied, and the AUC improved to 0.929 compared with the 0.887 of the InSight model. The contributions of this paper can be summarized in three ways: (i) improved performance without feature extraction using domain knowledge, (ii) verification of the feature extraction capability of deep neural networks through comparison with reference features, and (iii) further improved performance over feedforward neural networks by using long short-term memory, a neural network architecture that can learn sequential patterns. Copyright © 2017 Elsevier Ltd. All rights reserved.
Machine learning for predicting soil classes in three semi-arid landscapes
Brungard, Colby W.; Boettinger, Janis L.; Duniway, Michael C.; Wills, Skye A.; Edwards, Thomas C.
2015-01-01
Mapping the spatial distribution of soil taxonomic classes is important for informing soil use and management decisions. Digital soil mapping (DSM) can quantitatively predict the spatial distribution of soil taxonomic classes. Key components of DSM are the method and the set of environmental covariates used to predict soil classes. Machine learning is a general term for a broad set of statistical modeling techniques. Many different machine learning models have been applied in the literature and there are different approaches for selecting covariates for DSM. However, there is little guidance as to which, if any, machine learning model and covariate set might be optimal for predicting soil classes across different landscapes. Our objective was to compare multiple machine learning models and covariate sets for predicting soil taxonomic classes at three geographically distinct areas in the semi-arid western United States of America (southern New Mexico, southwestern Utah, and northeastern Wyoming). All three areas were the focus of digital soil mapping studies. Sampling sites at each study area were selected using conditioned Latin hypercube sampling (cLHS). We compared models that had been used in other DSM studies, including clustering algorithms, discriminant analysis, multinomial logistic regression, neural networks, tree based methods, and support vector machine classifiers. Tested machine learning models were divided into three groups based on model complexity: simple, moderate, and complex. We also compared environmental covariates derived from digital elevation models and Landsat imagery that were divided into three different sets: 1) covariates selected a priori by soil scientists familiar with each area and used as input into cLHS, 2) the covariates in set 1 plus 113 additional covariates, and 3) covariates selected using recursive feature elimination. Overall, complex models were consistently more accurate than simple or moderately complex models. Random forests (RF) using covariates selected via recursive feature elimination was consistently the most accurate, or was among the most accurate, classifiers between study areas and between covariate sets within each study area. We recommend that for soil taxonomic class prediction, complex models and covariates selected by recursive feature elimination be used. Overall classification accuracy in each study area was largely dependent upon the number of soil taxonomic classes and the frequency distribution of pedon observations between taxonomic classes. Individual subgroup class accuracy was generally dependent upon the number of soil pedon observations in each taxonomic class. The number of soil classes is related to the inherent variability of a given area. The imbalance of soil pedon observations between classes is likely related to cLHS. Imbalanced frequency distributions of soil pedon observations between classes must be addressed to improve model accuracy. Solutions include increasing the number of soil pedon observations in classes with few observations or decreasing the number of classes. Spatial predictions using the most accurate models generally agree with expected soil–landscape relationships. Spatial prediction uncertainty was lowest in areas of relatively low relief for each study area.
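The recommended combination, a complex model plus recursive feature elimination, is straightforward to express (a generic sketch with synthetic covariates, not the study's data):

```python
# Recursive feature elimination wrapped around a random forest, the
# model/covariate-selection combination the study found most accurate.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for environmental covariates and soil taxonomic classes.
X, y = make_classification(n_samples=400, n_features=30, n_informative=8,
                           n_classes=4, n_clusters_per_class=1, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
selector = RFE(rf, n_features_to_select=10, step=2).fit(X, y)

score = cross_val_score(rf, X[:, selector.support_], y, cv=5).mean()
print(f"RF accuracy on RFE-selected covariates: {score:.2f}")
```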
Feature selection for elderly faller classification based on wearable sensors.
Howcroft, Jennifer; Kofman, Jonathan; Lemaire, Edward D
2017-05-30
Wearable sensors can be used to derive numerous gait pattern features for elderly fall risk and faller classification; however, an appropriate feature set is required to avoid high computational costs and the inclusion of irrelevant features. The objectives of this study were to identify and evaluate smaller feature sets for faller classification from large feature sets derived from wearable accelerometer and pressure-sensing insole gait data. A convenience sample of 100 older adults (75.5 ± 6.7 years; 76 non-fallers, 24 fallers based on 6 month retrospective fall occurrence) walked 7.62 m while wearing pressure-sensing insoles and tri-axial accelerometers at the head, pelvis, left and right shanks. Feature selection was performed using correlation-based feature selection (CFS), fast correlation based filter (FCBF), and Relief-F algorithms. Faller classification was performed using multi-layer perceptron neural network, naïve Bayesian, and support vector machine classifiers, with 75:25 single stratified holdout and repeated random sampling. The best performing model was a support vector machine with 78% accuracy, 26% sensitivity, 95% specificity, 0.36 F1 score, and 0.31 MCC and one posterior pelvis accelerometer input feature (left acceleration standard deviation). The second best model achieved better sensitivity (44%) and used a support vector machine with 74% accuracy, 83% specificity, 0.44 F1 score, and 0.29 MCC. This model had ten input features: maximum, mean and standard deviation posterior acceleration; maximum, mean and standard deviation anterior acceleration; mean superior acceleration; and three impulse features. The best multi-sensor model sensitivity (56%) was achieved using posterior pelvis and both shank accelerometers and a naïve Bayesian classifier. The best single-sensor model sensitivity (41%) was achieved using the posterior pelvis accelerometer and a naïve Bayesian classifier. Feature selection provided models with smaller feature sets and improved faller classification compared to faller classification without feature selection. CFS and FCBF provided the best feature subset (one posterior pelvis accelerometer feature) for faller classification. However, better sensitivity was achieved by the second best model based on a Relief-F feature subset with three pressure-sensing insole features and seven head accelerometer features. Feature selection should be considered as an important step in faller classification using wearable sensors.
Relational Agreement Measures for Similarity Searching of Cheminformatic Data Sets.
Rivera-Borroto, Oscar Miguel; García-de la Vega, José Manuel; Marrero-Ponce, Yovani; Grau, Ricardo
2016-01-01
Research on similarity searching of cheminformatic data sets has focused on similarity measures using fingerprints. However, nominal scales are the least informative of all metric scales, increasing the number of tied similarity scores and decreasing the effectiveness of the retrieval engines. Tanimoto's coefficient has been claimed to be the most prominent measure for this task. Nevertheless, this field is far from being exhausted, since the computer science no-free-lunch theorem predicts that "no similarity measure has overall superiority over the population of data sets". We introduce 12 relational agreement (RA) coefficients for seven metric scales, which are integrated within a group fusion-based similarity searching algorithm. These similarity measures are compared to a reference panel of 21 proximity quantifiers over 17 benchmark data sets (MUV), using informative descriptors, a feature selection stage, a suitable performance metric, and powerful comparison tests. In this stage, RA coefficients perform favourably with respect to the state-of-the-art proximity measures. Afterward, the RA-based method outperforms another four nearest neighbor searching algorithms over the same data domains. In a third validation stage, RA measures are successfully applied to the virtual screening of the NCI data set. Finally, we discuss a possible molecular interpretation for these similarity variants.
Grubert, Anna; Eimer, Martin
2016-08-01
To study whether top-down attentional control processes can be set simultaneously for different visual features, we employed a spatial cueing procedure to measure behavioral and electrophysiological markers of task-set contingent attentional capture during search for targets defined by 1 or 2 possible colors (one-color and two-color tasks). Search arrays were preceded by spatially nonpredictive color singleton cues. Behavioral spatial cueing effects indicative of attentional capture were elicited only by target-matching but not by distractor-color cues. However, when search displays contained 1 target-color and 1 distractor-color object among gray nontargets, N2pc components were triggered not only by target-color but also by distractor-color cues both in the one-color and two-color task, demonstrating that task-set nonmatching items attracted attention. When search displays contained 6 items in 6 different colors, so that participants had to adopt a fully feature-specific task set, the N2pc to distractor-color cues was eliminated in both tasks, indicating that nonmatching items were now successfully excluded from attentional processing. These results demonstrate that when observers adopt a feature-specific search mode, attentional task sets can be configured flexibly for multiple features within the same dimension, resulting in the rapid allocation of attention to task-set matching objects only. (PsycINFO Database Record (c) 2016 APA, all rights reserved).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, F; Yang, Y; Young, L
Purpose: Radiomic texture features derived from oncologic PET have recently come under intense investigation in the context of patient stratification and treatment outcome prediction in a variety of cancer types; however, their validity has not yet been examined. This work aims to validate radiomic PET texture metrics through the use of realistic simulations in a ground truth setting. Methods: Simulation of FDG-PET was conducted by applying the Zubal phantom as an attenuation map to the SimSET software package, which employs Monte Carlo techniques to model the physical process of emission imaging. A total of 15 irregularly-shaped lesions featuring heterogeneous activity distributions were simulated. For each simulated lesion, 28 texture features in relation to the intensity histograms (GLIH), grey-level co-occurrence matrices (GLCOM), neighborhood difference matrices (GLNDM), and zone size matrices (GLZSM) were evaluated and compared with their respective values extracted from the ground truth activity map. Results: In reference to the values from the ground truth images, texture parameters appearing on the simulated data varied within a range of 0.73–3026.2% for GLIH-based, 0.02–100.1% for GLCOM-based, 1.11–173.8% for GLNDM-based, and 0.35–66.3% for GLZSM-based features. For the majority of the examined texture metrics (16/28), their values on the simulated data differed significantly from those from the ground truth images (P-values ranging from <0.0001 to 0.04). Features not exhibiting significant differences comprised GLIH-based standard deviation, GLCO-based energy and entropy, GLND-based coarseness and contrast, and GLZS-based low gray-level zone emphasis, high gray-level zone emphasis, short zone low gray-level emphasis, long zone low gray-level emphasis, long zone high gray-level emphasis, and zone size nonuniformity. Conclusion: The extent to which PET imaging disturbs texture appearance is feature-dependent and can be substantial. It is thus advised that the use of PET texture parameters for predictive and prognostic measurements in the oncologic setting awaits further systematic and critical evaluation.
Automatic classification of tissue malignancy for breast carcinoma diagnosis.
Fondón, Irene; Sarmiento, Auxiliadora; García, Ana Isabel; Silvestre, María; Eloy, Catarina; Polónia, António; Aguiar, Paulo
2018-05-01
Breast cancer is the second leading cause of cancer death among women. Its early diagnosis is extremely important to prevent avoidable deaths. However, malignancy assessment of tissue biopsies is complex and dependent on observer subjectivity. Moreover, hematoxylin and eosin (H&E)-stained histological images exhibit a highly variable appearance, even within the same malignancy level. In this paper, we propose a computer-aided diagnosis (CAD) tool for automated malignancy assessment of breast tissue samples based on the processing of histological images. We provide four malignancy levels as the output of the system: normal, benign, in situ and invasive. The method is based on the calculation of three sets of features related to nuclei, colour regions and textures considering local characteristics and global image properties. By taking advantage of well-established image processing techniques, we build a feature vector for each image that serves as an input to an SVM (Support Vector Machine) classifier with a quadratic kernel. The method has been rigorously evaluated, first with a 5-fold cross-validation within an initial set of 120 images, second with an external set of 30 different images and third with images with artefacts included. Accuracy levels range from 75.8% when the 5-fold cross-validation was performed to 75% with the external set of new images and 61.11% when the extremely difficult images were added to the classification experiment. The experimental results indicate that the proposed method is capable of distinguishing between four malignancy levels with high accuracy. Our results are close to those obtained with recent deep learning-based methods. Moreover, it performs better than other state-of-the-art methods based on feature extraction, and it can help improve the CAD of breast cancer. Copyright © 2018 Elsevier Ltd. All rights reserved.
Wang, Jie; Feng, Zuren; Lu, Na; Luo, Jing
2018-06-01
Feature selection plays an important role in the field of EEG signals based motor imagery pattern classification. It is a process that aims to select an optimal feature subset from the original set. Two significant advantages involved are: lowering the computational burden so as to speed up the learning procedure and removing redundant and irrelevant features so as to improve the classification performance. Therefore, feature selection is widely employed in the classification of EEG signals in practical brain-computer interface systems. In this paper, we present a novel statistical model to select the optimal feature subset based on the Kullback-Leibler divergence measure, and automatically select the optimal subject-specific time segment. The proposed method comprises four successive stages: a broad frequency band filtering and common spatial pattern enhancement as preprocessing, feature extraction by autoregressive model and log-variance, the Kullback-Leibler divergence based optimal feature and time segment selection and linear discriminant analysis classification. More importantly, this paper provides a potential framework for combining other feature extraction models and classification algorithms with the proposed method for EEG signals classification. Experiments on single-trial EEG signals from two public competition datasets not only demonstrate that the proposed method is effective in selecting discriminative features and time segments, but also show that the proposed method yields relatively better classification results in comparison with other competitive methods.
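One ingredient of such a method, ranking features by the KL divergence between their class-conditional distributions, can be sketched as follows. This sketch assumes each feature is approximately Gaussian within a class so the divergence has a closed form; the paper's actual statistical model may differ:

```python
# Hedged sketch: symmetric per-feature KL divergence between two classes,
# under a within-class Gaussian assumption (an assumption of this sketch).
import numpy as np

def gauss_kl(mu_p, var_p, mu_q, var_q):
    """KL(N(mu_p, var_p) || N(mu_q, var_q)) in closed form."""
    return 0.5 * (np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0)

def kl_scores(X, y):
    """Symmetric KL divergence of each feature between class 0 and class 1."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, v0 = X0.mean(0), X0.var(0) + 1e-12
    m1, v1 = X1.mean(0), X1.var(0) + 1e-12
    return gauss_kl(m0, v0, m1, v1) + gauss_kl(m1, v1, m0, v0)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16)); y = rng.integers(0, 2, 200)
X[y == 1, 3] += 2.0                           # make feature 3 discriminative
print(np.argsort(kl_scores(X, y))[::-1][:4])  # top-ranked features
```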
Separation of Benign and Malicious Network Events for Accurate Malware Family Classification
2015-09-28
use Kullback-Leibler (KL) divergence [15] to measure the information ... related work in an important aspect concerning the order of events. We use n-grams to capture the order of events, which exposes richer information about ... DISCUSSION Using n-grams on higher level network events helps understand the underlying operation of the malware, and provides a good feature set
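The n-gram construction referred to above is simple to illustrate; the event names here are invented for the example:

```python
# Turning an ordered sequence of network events into n-gram counts
# that preserve the order of events.
from collections import Counter

def ngrams(events, n):
    return Counter(tuple(events[i:i + n]) for i in range(len(events) - n + 1))

trace = ["dns_query", "http_get", "http_post", "dns_query", "http_get"]
print(ngrams(trace, 2))
# Counter({('dns_query', 'http_get'): 2, ('http_get', 'http_post'): 1, ...})
```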
Saund, Eric
2013-10-01
Effective object and scene classification and indexing depend on extraction of informative image features. This paper shows how large families of complex image features in the form of subgraphs can be built out of simpler ones through construction of a graph lattice—a hierarchy of related subgraphs linked in a lattice. Robustness is achieved by matching many overlapping and redundant subgraphs, which allows the use of inexpensive exact graph matching, instead of relying on expensive error-tolerant graph matching to a minimal set of ideal model graphs. Efficiency in exact matching is gained by exploitation of the graph lattice data structure. Additionally, the graph lattice enables methods for adaptively growing a feature space of subgraphs tailored to observed data. We develop the approach in the domain of rectilinear line art, specifically for the practical problem of document forms recognition. We are especially interested in methods that require only one or very few labeled training examples per category. We demonstrate two approaches to using the subgraph features for this purpose. Using a bag-of-words feature vector we achieve essentially single-instance learning on a benchmark forms database, following an unsupervised clustering stage. Further performance gains are achieved on a more difficult dataset using a feature voting method and feature selection procedure.
Kovalenko, Lyudmyla Y; Chaumon, Maximilien; Busch, Niko A
2012-07-01
Semantic processing of verbal and visual stimuli has been investigated in semantic violation or semantic priming paradigms in which a stimulus is either related or unrelated to a previously established semantic context. A hallmark of semantic priming is the N400 event-related potential (ERP)--a deflection of the ERP that is more negative for semantically unrelated target stimuli. The majority of studies investigating the N400 and semantic integration have used verbal material (words or sentences), and standardized stimulus sets with norms for semantic relatedness have been published for verbal but not for visual material. However, semantic processing of visual objects (as opposed to words) is an important issue in research on visual cognition. In this study, we present a set of 800 pairs of semantically related and unrelated visual objects. The images were rated for semantic relatedness by a sample of 132 participants. Furthermore, we analyzed low-level image properties and matched the two semantic categories according to these features. An ERP study confirmed the suitability of this image set for evoking a robust N400 effect of semantic integration. Additionally, using a general linear modeling approach of single-trial data, we also demonstrate that low-level visual image properties and semantic relatedness are in fact only minimally overlapping. The image set is available for download from the authors' website. We expect that the image set will facilitate studies investigating mechanisms of semantic and contextual processing of visual stimuli.
Decorrelation of the true and estimated classifier errors in high-dimensional settings.
Hanczar, Blaise; Hua, Jianping; Dougherty, Edward R
2007-01-01
The aim of many microarray experiments is to build discriminatory diagnosis and prognosis models. Given the huge number of features and the small number of examples, model validity which refers to the precision of error estimation is a critical issue. Previous studies have addressed this issue via the deviation distribution (estimated error minus true error), in particular, the deterioration of cross-validation precision in high-dimensional settings where feature selection is used to mitigate the peaking phenomenon (overfitting). Because classifier design is based upon random samples, both the true and estimated errors are sample-dependent random variables, and one would expect a loss of precision if the estimated and true errors are not well correlated, so that natural questions arise as to the degree of correlation and the manner in which lack of correlation impacts error estimation. We demonstrate the effect of correlation on error precision via a decomposition of the variance of the deviation distribution, observe that the correlation is often severely decreased in high-dimensional settings, and show that the effect of high dimensionality on error estimation tends to result more from its decorrelating effects than from its impact on the variance of the estimated error. We consider the correlation between the true and estimated errors under different experimental conditions using both synthetic and real data, several feature-selection methods, different classification rules, and three error estimators commonly used (leave-one-out cross-validation, k-fold cross-validation, and .632 bootstrap). Moreover, three scenarios are considered: (1) feature selection, (2) known-feature set, and (3) all features. Only the first is of practical interest; however, the other two are needed for comparison purposes. We will observe that the true and estimated errors tend to be much more correlated in the case of a known feature set than with either feature selection or using all features, with the better correlation between the latter two showing no general trend, but differing for different models.
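The decorrelation phenomenon is easy to probe in a toy simulation: repeatedly draw training samples, record the cross-validated error estimate and the "true" error measured on a large held-out set, and correlate the two across repetitions. Dimensions, sample sizes, and the classifier below are illustrative choices, not the paper's exact protocol:

```python
# Toy simulation of the correlation between true and estimated errors.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
d, n_train, n_test, reps = 20, 40, 5000, 200
shift = np.zeros(d); shift[:3] = 1.0          # 3 informative dimensions

true_err, est_err = [], []
for _ in range(reps):
    y_tr = rng.integers(0, 2, n_train)
    X_tr = rng.normal(size=(n_train, d)) + np.outer(y_tr, shift)
    y_te = rng.integers(0, 2, n_test)
    X_te = rng.normal(size=(n_test, d)) + np.outer(y_te, shift)
    clf = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
    true_err.append(1.0 - clf.score(X_te, y_te))           # "true" error
    est_err.append(1.0 - cross_val_score(clf, X_tr, y_tr, cv=5).mean())

print("corr(true, estimated) =", np.corrcoef(true_err, est_err)[0, 1])
```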
Pattern recognition and image processing for environmental monitoring
NASA Astrophysics Data System (ADS)
Siddiqui, Khalid J.; Eastwood, DeLyle
1999-12-01
Pattern recognition (PR) and signal/image processing methods are among the most powerful tools currently available for noninvasively examining spectroscopic and other chemical data for environmental monitoring. Using spectral data, these systems have found a variety of applications employing analytical techniques for chemometrics such as gas chromatography, fluorescence spectroscopy, etc. An advantage of PR approaches is that they make no a priori assumption regarding the structure of the patterns. However, a majority of these systems rely on human judgment for parameter selection and classification. A PR problem is considered as a composite of four subproblems: pattern acquisition, feature extraction, feature selection, and pattern classification. One of the basic issues in PR approaches is to determine and measure the features useful for successful classification. Selection of features that contain the most discriminatory information is important because the cost of pattern classification is directly related to the number of features used in the decision rules. The state of the spectral techniques as applied to environmental monitoring is reviewed. A spectral pattern classification system combining the above components and automatic decision-theoretic approaches for classification is developed. It is shown how such a system can be used for analysis of large data sets, warehousing, and interpretation. In a preliminary test, the classifier was used to classify synchronous UV-vis fluorescence spectra of relatively similar petroleum oils with reasonable success.
Semantic Indexing of Multimedia Content Using Visual, Audio, and Text Cues
NASA Astrophysics Data System (ADS)
Adams, W. H.; Iyengar, Giridharan; Lin, Ching-Yung; Naphade, Milind Ramesh; Neti, Chalapathy; Nock, Harriet J.; Smith, John R.
2003-12-01
We present a learning-based approach to the semantic indexing of multimedia content using cues derived from audio, visual, and text features. We approach the problem by developing a set of statistical models for a predefined lexicon. Novel concepts are then mapped in terms of the concepts in the lexicon. To achieve robust detection of concepts, we exploit features from multiple modalities, namely, audio, video, and text. Concept representations are modeled using Gaussian mixture models (GMM), hidden Markov models (HMM), and support vector machines (SVM). Models such as Bayesian networks and SVMs are used in a late-fusion approach to model concepts that are not explicitly modeled in terms of features. Our experiments indicate promise in the proposed classification and fusion methodologies: our proposed fusion scheme achieves more than 10% relative improvement over the best unimodal concept detector.
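A hedged sketch of the late-fusion step described above: per-modality concept detectors emit confidence scores, and an SVM is trained on the stacked scores to make the final decision. The scores below are synthetic stand-ins for real detector outputs:

```python
# Sketch of late fusion over unimodal detector scores.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
audio_score = rng.uniform(size=n)   # stand-ins for unimodal detector confidences
video_score = rng.uniform(size=n)
text_score = rng.uniform(size=n)
# Synthetic ground truth loosely driven by the three modalities
y = ((0.4 * audio_score + 0.4 * video_score + 0.2 * text_score
      + rng.normal(0, 0.1, n)) > 0.5).astype(int)

X = np.column_stack([audio_score, video_score, text_score])
print("fused detector accuracy:", cross_val_score(SVC(), X, y, cv=5).mean())
```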
Characterizing chaotic melodies in automatic music composition
NASA Astrophysics Data System (ADS)
Coca, Andrés E.; Tost, Gerard O.; Zhao, Liang
2010-09-01
In this paper, we initially present an algorithm for automatic composition of melodies using chaotic dynamical systems. Afterward, we characterize chaotic music in a comprehensive way as comprising three perspectives: musical discrimination, dynamical influence on musical features, and musical perception. With respect to the first perspective, the coherence between generated chaotic melodies (continuous as well as discrete chaotic melodies) and a set of classical reference melodies is characterized by statistical descriptors and melodic measures. The significant differences among the three types of melodies are determined by discriminant analysis. Regarding the second perspective, the influence of dynamical features of chaotic attractors, e.g., Lyapunov exponent, Hurst coefficient, and correlation dimension, on melodic features is determined by canonical correlation analysis. The last perspective is related to perception of originality, complexity, and degree of melodiousness (Euler's gradus suavitatis) of chaotic and classical melodies by nonparametric statistical tests.
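The generative step can be illustrated with a toy chaotic melody: iterate a chaotic map and quantize its orbit onto a pitch scale. The logistic map and the C-major MIDI mapping below are illustrative choices, not the paper's exact scheme:

```python
# Toy chaotic melody generator: logistic-map orbit quantized onto a scale.
import numpy as np

def logistic_melody(n_notes=16, r=3.99, x0=0.123, scale=None):
    scale = scale or [60, 62, 64, 65, 67, 69, 71, 72]  # C major, MIDI numbers
    x, notes = x0, []
    for _ in range(n_notes):
        x = r * x * (1 - x)                  # chaotic logistic map iteration
        notes.append(scale[int(x * len(scale))])
    return notes

print(logistic_melody())
```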
Upper extremity paraesthesia: clinical assessment and reasoning.
Muscolino, Joseph E
2008-07-01
The art of clinical assessment involves an accurate determination of the cause(s) of a patient's symptoms. Given that a set of symptoms can be influenced by many contributing factors and features, assessment needs to differentially evaluate these. Accurate and appropriate treatment depends on differential assessment based on sound clinical reasoning. Many conditions derive from multiple causes demanding evaluation of as many etiological features as can be identified. The case review presented here involves a patient presenting with paraesthesia spreading into her right upper extremity. A complex history, involving her neck and contralateral upper extremity was assessed. The patient was found to have at least seven underlying, predisposing, and etiological, conditions capable of initiating, aggravating, or maintaining the presenting symptoms. Weighing the relative contributions of these often interacting features, and correlating this with the history, helped to identify a successful course of treatment.
Landsat analysis for uranium exploration in Northeast Turkey
Lee, Keenan
1983-01-01
No uranium deposits are known in the Trabzon, Turkey region, and consequently, exploration criteria have not been defined. Nonetheless, by analogy with uranium deposits studied elsewhere, exploration guides are suggested to include dense concentrations of linear features, lineaments -- especially with northwest trend, acidic plutonic rocks, and alteration indicated by limonite. A suite of digitally processed images of a single Landsat scene served as the image base for mapping 3,376 linear features. Analysis of the linear feature data yielded two statistically significant trends, which in turn defined two sets of strong lineaments. Color composite images were used to map acidic plutonic rocks and areas of surficial limonitic materials. The Landsat interpretation yielded a map of these exploration guides that may be used to evaluate relative uranium potential. One area in particular shows a high coincidence of favorable indicators.
Setting conservation targets for sandy beach ecosystems
NASA Astrophysics Data System (ADS)
Harris, Linda; Nel, Ronel; Holness, Stephen; Sink, Kerry; Schoeman, David
2014-10-01
Representative and adequate reserve networks are key to conserving biodiversity. This begs the question, how much of which features need to be placed in protected areas? Setting specifically-derived conservation targets for most ecosystems is common practice; however, this has never been done for sandy beaches. The aims of this paper, therefore, are to propose a methodology for setting conservation targets for sandy beach ecosystems; and to pilot the proposed method using data describing biodiversity patterns and processes from microtidal beaches in South Africa. First, a classification scheme of valued features of beaches is constructed, including: biodiversity features; unique features; and important processes. Second, methodologies for setting targets for each feature under different data-availability scenarios are described. From this framework, targets are set for features characteristic of microtidal beaches in South Africa, as follows. 1) Targets for dune vegetation types were adopted from a previous assessment, and ranged 19-100%. 2) Targets for beach morphodynamic types (habitats) were set using species-area relationships (SARs). These SARs were derived from species richness data from 142 sampling events around the South African coast (extrapolated to total theoretical species richness estimates using previously-established species-accumulation curve relationships), plotted against the area of the beach (calculated from Google Earth imagery). The species-accumulation factor (z) was 0.22, suggesting a baseline habitat target of 27% is required to protect 75% of the species. This baseline target was modified by heuristic principles, based on habitat rarity and threat status, with final values ranging 27-40%. 3) Species targets were fixed at 20%, modified using heuristic principles based on endemism, threat status, and whether or not beaches play an important role in the species' life history, with targets ranging 20-100%. 4) Targets for processes and 5) important assemblages were set at 50%, following other studies. 6) Finally, a target for an outstanding feature (the Alexandria dunefield) was set at 80% because of its national, international and ecological importance. The greatest shortfall in the current target-setting process is in the lack of empirical models describing the key beach processes, from which robust ecological thresholds can be derived. As for many other studies, our results illustrate that the conservation target of 10% for coastal and marine systems proposed by the Convention on Biological Diversity is too low to conserve sandy beaches and their biota.
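The 27% baseline above follows directly from the species-area relationship S = cA^z: to retain a proportion p of species, the required area fraction is p^(1/z), as the short check below shows:

```python
# Baseline habitat target from the species-area relationship S = c * A**z:
# protecting a proportion p of species needs an area fraction of p**(1/z).
z, p = 0.22, 0.75
print(round(p ** (1 / z), 3))   # ~0.27, i.e. the 27% baseline target
```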
MalaCards: an integrated compendium for diseases and their annotation
Rappaport, Noa; Nativ, Noam; Stelzer, Gil; Twik, Michal; Guan-Golan, Yaron; Iny Stein, Tsippi; Bahir, Iris; Belinky, Frida; Morrey, C. Paul; Safran, Marilyn; Lancet, Doron
2013-01-01
Comprehensive disease classification, integration and annotation are crucial for biomedical discovery. At present, disease compilation is incomplete, heterogeneous and often lacking systematic inquiry mechanisms. We introduce MalaCards, an integrated database of human maladies and their annotations, modeled on the architecture and strategy of the GeneCards database of human genes. MalaCards mines and merges 44 data sources to generate a computerized card for each of 16 919 human diseases. Each MalaCard contains disease-specific prioritized annotations, as well as inter-disease connections, empowered by the GeneCards relational database, its searches and GeneDecks set analyses. First, we generate a disease list from 15 ranked sources, using disease-name unification heuristics. Next, we use four schemes to populate MalaCards sections: (i) directly interrogating disease resources, to establish integrated disease names, synonyms, summaries, drugs/therapeutics, clinical features, genetic tests and anatomical context; (ii) searching GeneCards for related publications, and for associated genes with corresponding relevance scores; (iii) analyzing disease-associated gene sets in GeneDecks to yield affiliated pathways, phenotypes, compounds and GO terms, sorted by a composite relevance score and presented with GeneCards links; and (iv) searching within MalaCards itself, e.g. for additional related diseases and anatomical context. The latter forms the basis for the construction of a disease network, based on shared MalaCards annotations, embodying associations based on etiology, clinical features and clinical conditions. This broadly disposed network has a power-law degree distribution, suggesting that this might be an inherent property of such networks. Work in progress includes hierarchical malady classification, ontological mapping and disease set analyses, striving to make MalaCards an even more effective tool for biomedical research. Database URL: http://www.malacards.org/ PMID:23584832
NASA Astrophysics Data System (ADS)
Mohammad, Fatimah; Ansari, Rashid; Shahidi, Mahnaz
2013-03-01
The visibility and continuity of the inner segment outer segment (ISOS) junction layer of the photoreceptors on spectral domain optical coherence tomography images is known to be related to visual acuity in patients with age-related macular degeneration (AMD). Automatic detection and segmentation of lesions and pathologies in retinal images is crucial for the screening, diagnosis, and follow-up of patients with retinal diseases. One of the challenges of using the classical level-set algorithms for segmentation involves the placement of the initial contour. Manually defining the contour or randomly placing it in the image may lead to segmentation of erroneous structures. It is important to be able to automatically define the contour by using information provided by image features. We explored a level-set method which is based on the classical Chan-Vese model and which utilizes image feature information for automatic contour placement for the segmentation of pathologies in fluorescein angiograms and en face retinal images of the ISOS layer. This was accomplished by exploiting a priori knowledge of the shape and intensity distribution, allowing the use of projection profiles to detect the presence of pathologies that are characterized by intensity differences with surrounding areas in retinal images. We first tested our method by applying it to fluorescein angiograms. We then applied our method to en face retinal images of patients with AMD. The experimental results demonstrate that the proposed method provided a quick and improved outcome as compared to the classical Chan-Vese method in which the initial contour is randomly placed, thus indicating the potential to provide a more accurate and detailed view of changes in pathologies due to disease progression and treatment.
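A hedged sketch of the initialization idea: locate a bright pathology from row/column projection profiles and centre the initial level set there before running scikit-image's Chan-Vese implementation. The synthetic image, radius rule, and parameters are illustrative assumptions, not the paper's exact scheme:

```python
# Sketch: projection-profile-driven initial contour for Chan-Vese segmentation.
import numpy as np
from skimage.segmentation import chan_vese

def profile_init(image, frac=0.6):
    """Initial level set: a disk-like indicator centred on the projection peaks."""
    rows, cols = image.sum(axis=1), image.sum(axis=0)
    r0, c0 = int(np.argmax(rows)), int(np.argmax(cols))
    rr, cc = np.ogrid[:image.shape[0], :image.shape[1]]
    radius = frac * min(image.shape) / 4
    return np.where((rr - r0) ** 2 + (cc - c0) ** 2 < radius ** 2, 1.0, -1.0)

rng = np.random.default_rng(0)
img = rng.normal(0.2, 0.05, (128, 128))
img[40:70, 60:95] += 0.6                       # synthetic bright pathology

seg = chan_vese(img, mu=0.25, init_level_set=profile_init(img))
print(seg.shape, seg.dtype, seg.sum())         # boolean mask of the segmented region
```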
Reproducibility and Prognosis of Quantitative Features Extracted from CT Images12
Balagurunathan, Yoganand; Gu, Yuhua; Wang, Hua; Kumar, Virendra; Grove, Olya; Hawkins, Sam; Kim, Jongphil; Goldgof, Dmitry B; Hall, Lawrence O; Gatenby, Robert A; Gillies, Robert J
2014-01-01
We study the reproducibility of quantitative imaging features that are used to describe tumor shape, size, and texture from computed tomography (CT) scans of non-small cell lung cancer (NSCLC). CT images are dependent on various scanning factors. We focus on characterizing image features that are reproducible in the presence of variations due to patient factors and segmentation methods. Thirty-two NSCLC nonenhanced lung CT scans were obtained from the Reference Image Database to Evaluate Response data set. The tumors were segmented using both manual (radiologist expert) and ensemble (software-automated) methods. A set of features (219 three-dimensional and 110 two-dimensional) was computed, and quantitative image features were statistically filtered to identify a subset of reproducible and nonredundant features. The variability in the repeated experiment was measured by the test-retest concordance correlation coefficient (CCC_TreT). The natural range in the features, normalized to variance, was measured by the dynamic range (DR). In this study, there were 29 features across segmentation methods found with CCC_TreT and DR ≥ 0.9 and R^2_Bet ≥ 0.95. These reproducible features were tested for predicting radiologist prognostic score; some texture features (run-length and Laws kernels) had an area under the curve of 0.9. The representative features were tested for their prognostic capabilities using an independent NSCLC data set (59 lung adenocarcinomas), where one of the texture features, run-length gray-level nonuniformity, was statistically significant in separating the samples into survival groups (P ≤ .046). PMID:24772210
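For reference, the test-retest concordance correlation coefficient used above (Lin's CCC) can be computed in a few lines; the paired values below are invented:

```python
# Lin's concordance correlation coefficient for test-retest agreement.
import numpy as np

def ccc(x, y):
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxy = np.cov(x, y, bias=True)[0, 1]
    return 2 * sxy / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

test = [1.0, 2.1, 3.0, 4.2, 5.1]
retest = [1.1, 2.0, 3.2, 4.0, 5.3]
print(round(ccc(test, retest), 3))
```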
How important is vehicle safety in the new vehicle purchase process?
Koppel, Sjaanie; Charlton, Judith; Fildes, Brian; Fitzharris, Michael
2008-05-01
Whilst there has been a significant increase in the amount of consumer interest in the safety performance of privately owned vehicles, the role that it plays in consumers' purchase decisions is poorly understood. The aims of the current study were to determine: how important vehicle safety is in the new vehicle purchase process; what importance consumers place on safety options/features relative to other convenience and comfort features, and how consumers conceptualise vehicle safety. In addition, the study aimed to investigate the key parameters associated with ranking 'vehicle safety' as the most important consideration in the new vehicle purchase. Participants recruited in Sweden and Spain completed a questionnaire about their new vehicle purchase. The findings from the questionnaire indicated that participants ranked safety-related factors (e.g., EuroNCAP (or other) safety ratings) as more important in the new vehicle purchase process than other vehicle factors (e.g., price, reliability etc.). Similarly, participants ranked safety-related features (e.g., advanced braking systems, front passenger airbags etc.) as more important than non-safety-related features (e.g., route navigation systems, air-conditioning etc.). Consistent with previous research, most participants equated vehicle safety with the presence of specific vehicle safety features or technologies rather than vehicle crash safety/test results or crashworthiness. The key parameters associated with ranking 'vehicle safety' as the most important consideration in the new vehicle purchase were: use of EuroNCAP, gender and education level, age, drivers' concern about crash involvement, first vehicle purchase, annual driving distance, person for whom the vehicle was purchased, and traffic infringement history. The findings from this study are important for policy makers, manufacturers and other stakeholders to assist in setting priorities with regard to the promotion and publicity of vehicle safety features for particular consumer groups (such as younger consumers) in order to increase their knowledge regarding vehicle safety and to encourage them to place highest priority on safety in the new vehicle purchase process.
Set of Frequent Word Item sets as Feature Representation for Text with Indonesian Slang
NASA Astrophysics Data System (ADS)
Sa'adillah Maylawati, Dian; Putri Saptawati, G. A.
2017-01-01
Indonesian slang is commonly used in social media. Due to its unstructured syntax, it is difficult to extract features based on Indonesian grammar for text mining. To do so, we propose Set of Frequent Word Item sets (SFWI) as a text representation that is considered a good match for Indonesian slang. Besides, SFWI is able to keep the meaning of Indonesian slang with regard to the order of appearance in the sentence. We use the FP-Growth algorithm, adding a sentence-separation function to the algorithm, to extract the SFWI features. The experiments were done with text data from social media such as Facebook, Twitter, and personal websites. The results of the experiments show that Indonesian slang was more correctly interpreted based on SFWI.
Use of volumetric features for temporal comparison of mass lesions in full field digital mammograms
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bozek, Jelena, E-mail: jelena.bozek@fer.hr; Grgic, Mislav; Kallenberg, Michiel
2014-02-15
Purpose: Temporal comparison of lesions might improve classification between benign and malignant lesions in full-field digital mammograms (FFDM). The authors compare the use of volumetric features for lesion classification, which are computed from dense tissue thickness maps, to the use of mammographic lesion area. Use of dense tissue thickness maps for lesion characterization is advantageous, since it results in lesion features that are invariant to acquisition parameters. Methods: The dataset used in the analysis consisted of 60 temporal mammogram pairs comprising 120 mediolateral oblique or craniocaudal views with a total of 65 lesions, of which 41 were benign and 24 malignant. The authors analyzed the performance of four volumetric features, area, and four other commonly used features obtained from temporal mammogram pairs, current mammograms, and prior mammograms. The authors evaluated the individual performance of all features and of different feature sets. The authors used linear discriminant analysis with leave-one-out cross validation to classify different feature sets. Results: Volumetric features from temporal mammogram pairs achieved the best individual performance, as measured by the area under the receiver operating characteristic curve (A_z value). Volume change (A_z = 0.88) achieved a higher A_z value than projected lesion area change (A_z = 0.78) in the temporal comparison of lesions. Best performance was achieved with a set that consisted of features extracted from the current exam combined with four volumetric features representing changes with respect to the prior mammogram (A_z = 0.90). This was significantly better (p = 0.005) than the performance obtained using features from the current exam only (A_z = 0.77). Conclusions: Volumetric features from temporal mammogram pairs combined with features from the single exam significantly improve discrimination of benign and malignant lesions in FFDM mammograms compared to using only single exam features. In the comparison with prior mammograms, use of volumetric change may lead to better performance than use of lesion area change.
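A sketch of the evaluation scheme described in the Methods (linear discriminant analysis with leave-one-out cross-validation, scored by the area under the ROC curve, A_z); the feature values below are placeholders with injected separability:

```python
# Sketch: LDA + leave-one-out cross-validation scored by A_z (ROC AUC).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(65, 9))            # 65 lesions x 9 features (placeholder)
y = np.array([0] * 41 + [1] * 24)       # 41 benign, 24 malignant
X[y == 1] += 0.5                        # inject some separability

scores = cross_val_predict(LinearDiscriminantAnalysis(), X, y,
                           cv=LeaveOneOut(), method="predict_proba")[:, 1]
print("A_z = %.3f" % roc_auc_score(y, scores))
```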
Fast Semantic Segmentation of 3d Point Clouds with Strongly Varying Density
NASA Astrophysics Data System (ADS)
Hackel, Timo; Wegner, Jan D.; Schindler, Konrad
2016-06-01
We describe an effective and efficient method for point-wise semantic classification of 3D point clouds. The method can handle unstructured and inhomogeneous point clouds such as those derived from static terrestrial LiDAR or photogammetric reconstruction; and it is computationally efficient, making it possible to process point clouds with many millions of points in a matter of minutes. The key issue, both to cope with strong variations in point density and to bring down computation time, turns out to be careful handling of neighborhood relations. By choosing appropriate definitions of a point's (multi-scale) neighborhood, we obtain a feature set that is both expressive and fast to compute. We evaluate our classification method both on benchmark data from a mobile mapping platform and on a variety of large, terrestrial laser scans with greatly varying point density. The proposed feature set outperforms the state of the art with respect to per-point classification accuracy, while at the same time being much faster to compute.
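One common realization of such neighborhood features, eigenvalue-based shape descriptors computed over k-nearest-neighbour covariances, can be sketched as follows; this is a generic construction, and the paper's exact feature set and multi-scale scheme may differ:

```python
# Sketch: eigenvalue-based geometric features (linearity, planarity, scattering)
# from each point's k-nearest-neighbour neighbourhood, via a k-d tree.
import numpy as np
from scipy.spatial import cKDTree

def eigen_features(points, k=20):
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    feats = np.empty((len(points), 3))
    for i, nb in enumerate(idx):
        cov = np.cov(points[nb].T)
        l3, l2, l1 = np.linalg.eigvalsh(cov)   # ascending: l3 <= l2 <= l1
        s = l1 + 1e-12
        feats[i] = [(l1 - l2) / s, (l2 - l3) / s, l3 / s]  # linear/planar/scatter
    return feats

pts = np.random.default_rng(0).normal(size=(1000, 3))
print(eigen_features(pts, k=15)[:3])
```

A multi-scale variant would repeat this for several values of k (or several radii) and concatenate the results.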
Semantic Relations for Problem-Oriented Medical Records
Uzuner, Ozlem; Mailoa, Jonathan; Ryan, Russell; Sibanda, Tawanda
2010-01-01
Summary Objective We describe semantic relation (SR) classification on medical discharge summaries. We focus on relations targeted to the creation of problem-oriented records. Thus, we define relations that involve the medical problems of patients. Methods and Materials We represent patients’ medical problems with their diseases and symptoms. We study the relations of patients’ problems with each other and with concepts that are identified as tests and treatments. We present an SR classifier that studies a corpus of patient records one sentence at a time. For all pairs of concepts that appear in a sentence, this SR classifier determines the relations between them. In doing so, the SR classifier takes advantage of surface, lexical, and syntactic features and uses these features as input to a support vector machine. We apply our SR classifier to two sets of medical discharge summaries, one obtained from the Beth Israel-Deaconess Medical Center (BIDMC), Boston, MA and the other from Partners Healthcare, Boston, MA. Results On the BIDMC corpus, our SR classifier achieves micro-averaged F-measures that range from 74% to 95% on the various relation types. On the Partners corpus, the micro-averaged F-measures on the various relation types range from 68% to 91%. Our experiments show that lexical features (in particular, tokens that occur between candidate concepts, which we refer to as inter-concept tokens) are very informative for relation classification in medical discharge summaries. Using only the inter-concept tokens in the corpus, our SR classifier can recognize 84% of the relations in the BIDMC corpus and 72% of the relations in the Partners corpus. Conclusion These results are promising for semantic indexing of medical records. They imply that we can take advantage of lexical patterns in discharge summaries for relation classification at a sentence level. PMID:20646918
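The inter-concept-token finding can be illustrated with a toy pipeline: extract the tokens between two concept mentions and feed their bag-of-words counts to a linear SVM. The sentences and labels below are invented and not drawn from either corpus:

```python
# Toy sketch of relation classification from inter-concept tokens.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

# (sentence, concept1, concept2, relation label) -- invented examples,
# repeated so the classifier has something to fit
examples = [
    ("the rash improved after hydrocortisone", "rash", "hydrocortisone", "treats"),
    ("chest pain caused by anemia", "chest pain", "anemia", "causes"),
    ("fever unrelated to amoxicillin", "fever", "amoxicillin", "none"),
] * 10

def inter_tokens(sent, c1, c2):
    i, j = sent.index(c1) + len(c1), sent.index(c2)
    return sent[i:j].strip()

X_text = [inter_tokens(s, a, b) for s, a, b, _ in examples]
y = [lab for *_, lab in examples]
X = CountVectorizer().fit_transform(X_text)
print(LinearSVC(dual=False).fit(X, y).score(X, y))
```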
NASA Astrophysics Data System (ADS)
Frisch, Michael J.; Binkley, J. Stephen; Schaefer, Henry F., III
1984-08-01
The relative energies of the stationary points on the FH2 and H2CO nuclear potential energy surfaces relevant to the hydrogen atom abstraction, H2 elimination and 1,2-hydrogen shift reactions have been examined using fourth-order Møller-Plesset perturbation theory and a variety of basis sets. The theoretical absolute zero activation energy for the F+H2→FH+H reaction is in better agreement with experiment than previous theoretical studies, and part of the disagreement between earlier theoretical calculations and experiment is found to result from the use of assumed rather than calculated zero-point vibrational energies. The fourth-order reaction energy for the elimination of hydrogen from formaldehyde is within 2 kcal mol⁻¹ of the experimental value using the largest basis set considered. The qualitative features of the H2CO surface are unchanged by expansion of the basis set beyond the polarized triple-zeta level, but diffuse functions and several sets of polarization functions are found to be necessary for quantitative accuracy in predicted reaction and activation energies. Basis sets and levels of perturbation theory which represent good compromises between computational efficiency and accuracy are recommended.
Robust tumor morphometry in multispectral fluorescence microscopy
NASA Astrophysics Data System (ADS)
Tabesh, Ali; Vengrenyuk, Yevgen; Teverovskiy, Mikhail; Khan, Faisal M.; Sapir, Marina; Powell, Douglas; Mesa-Tejada, Ricardo; Donovan, Michael J.; Fernandez, Gerardo
2009-02-01
Morphological and architectural characteristics of primary tissue compartments, such as epithelial nuclei (EN) and cytoplasm, provide important cues for cancer diagnosis, prognosis, and therapeutic response prediction. We propose two feature sets for the robust quantification of these characteristics in multiplex immunofluorescence (IF) microscopy images of prostate biopsy specimens. To enable feature extraction, EN and cytoplasm regions were first segmented from the IF images. Then, feature sets consisting of the characteristics of the minimum spanning tree (MST) connecting the EN and the fractal dimension (FD) of gland boundaries were obtained from the segmented compartments. We demonstrated the utility of the proposed features in prostate cancer recurrence prediction on a multi-institution cohort of 1027 patients. Univariate analysis revealed that both FD and one of the MST features were highly effective for predicting cancer recurrence (p <= 0.0001). In multivariate analysis, an MST feature was selected for a model incorporating clinical and image features. The model achieved a concordance index (CI) of 0.73 on the validation set, which was significantly higher than the CI of 0.69 for the standard multivariate model based solely on clinical features currently used in clinical practice (p < 0.0001). The contributions of this work are twofold. First, it is the first demonstration of the utility of the proposed features in morphometric analysis of IF images. Second, this is the largest scale study of the efficacy and robustness of the proposed features in prostate cancer prognosis.
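A sketch of the MST feature construction: build the minimum spanning tree over (here synthetic) nuclei centroids and summarize its edge lengths; the study's full feature set would include further MST and fractal-dimension statistics:

```python
# Sketch: minimum spanning tree over nuclei centroids, summarized by edge lengths.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.sparse.csgraph import minimum_spanning_tree

centroids = np.random.default_rng(0).uniform(0, 100, size=(50, 2))  # synthetic nuclei
dist = squareform(pdist(centroids))
mst = minimum_spanning_tree(dist)
edges = mst.data                                 # lengths of the n-1 MST edges
print("mean edge %.2f, std %.2f" % (edges.mean(), edges.std()))
```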
Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification.
Fan, Jianqing; Feng, Yang; Jiang, Jiancheng; Tong, Xin
We propose a high dimensional classification method that involves nonparametric feature augmentation. Knowing that marginal density ratios are the most powerful univariate classifiers, we use the ratio estimates to transform the original feature measurements. Subsequently, penalized logistic regression is invoked, taking as input the newly transformed or augmented features. This procedure trains models equipped with local complexity and global simplicity, thereby avoiding the curse of dimensionality while creating a flexible nonlinear decision boundary. The resulting method is called Feature Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by generalizing the Naive Bayes model, writing the log ratio of joint densities as a linear combination of those of marginal densities. It is related to generalized additive models, but has better interpretability and computability. Risk bounds are developed for FANS. In numerical analysis, FANS is compared with competing methods, so as to provide a guideline on its best application domain. Real data analysis demonstrates that FANS performs very competitively on benchmark email spam and gene expression data sets. Moreover, FANS is implemented by an extremely fast algorithm through parallel computing.
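A compact sketch of the FANS transform under simplifying assumptions (Gaussian KDEs for the marginal densities, no sample splitting, an L1-penalized logistic fit); bandwidths and the penalty strength are illustrative choices:

```python
# Sketch of the FANS idea: replace each feature by its estimated class-conditional
# log marginal density ratio, then fit penalized logistic regression.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5)); y = rng.integers(0, 2, 300)
X[y == 1, 0] = np.abs(X[y == 1, 0])       # a nonlinearly informative feature

def fans_transform(X_tr, y_tr, X_eval, eps=1e-12):
    Z = np.empty_like(X_eval)
    for j in range(X_tr.shape[1]):
        f1 = gaussian_kde(X_tr[y_tr == 1, j])
        f0 = gaussian_kde(X_tr[y_tr == 0, j])
        Z[:, j] = np.log(f1(X_eval[:, j]) + eps) - np.log(f0(X_eval[:, j]) + eps)
    return Z

Z = fans_transform(X, y, X)               # in-sample for brevity; FANS splits samples
clf = LogisticRegression(penalty="l1", solver="liblinear", C=1.0).fit(Z, y)
print("training accuracy: %.3f" % clf.score(Z, y))
```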
The relative importance of external and internal features of facial composites.
Frowd, Charlie; Bruce, Vicki; McIntyre, Alex; Hancock, Peter
2007-02-01
Three experiments are reported that compare the quality of external with internal regions within a set of facial composites using two matching-type tasks. Composites are constructed with the aim of triggering recognition from people familiar with the targets, and past research suggests internal face features dominate representations of familiar faces in memory. However the experiments reported here show that the internal regions of composites are very poorly matched against the faces they purport to represent, while external feature regions alone were matched almost as well as complete composites. In Experiments 1 and 2 the composites used were constructed by participant-witnesses who were unfamiliar with the targets and therefore were predicted to demonstrate a bias towards the external parts of a face. In Experiment 3 we compared witnesses who were familiar or unfamiliar with the target items, but for both groups the external features were much better reproduced in the composites, suggesting it is the process of composite construction itself which is responsible for the poverty of the internal features. Practical implications of these results are discussed.
Transverse parton momenta in single inclusive hadron production in e+ e- annihilation processes
Boglione, M.; Gonzalez-Hernandez, J. O.; Taghavi, R.
2017-06-17
Here, we study the transverse momentum distributions of single inclusive hadron production in e+e- annihilation processes. Although the only available experimental data are scarce and quite old, we find that the fundamental features of transverse momentum dependent (TMD) evolution, historically addressed in Drell–Yan processes and, more recently, in semi-inclusive deep inelastic scattering processes, are visible in e+e- annihilations as well. Interesting effects related to its non-perturbative regime can be observed. We test two different parameterizations for the p⊥ dependence of the cross section: the usual Gaussian distribution and a power-law model. We find the latter to be more appropriate in describing this particular set of experimental data, over a relatively large range of p⊥ values. We use this model to map some of the features of the data within the framework of TMD evolution, and discuss the caveats of this and other possible interpretations, related to the one-dimensional nature of the available experimental data.
Carbohydrate degrading polypeptide and uses thereof
Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter
2015-10-20
The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
A Review of Feature Extraction Software for Microarray Gene Expression Data
Tan, Ching Siang; Ting, Wai Soon; Mohamad, Mohd Saberi; Chan, Weng Howe; Deris, Safaai; Ali Shah, Zuraini
2014-01-01
When gene expression data are too large to be processed, they are transformed into a reduced representation set of genes. Transforming large-scale gene expression data into a set of genes is called feature extraction. If the genes extracted are carefully chosen, this gene set can extract the relevant information from the large-scale gene expression data, allowing further analysis by using this reduced representation instead of the full size data. In this paper, we review numerous software applications that can be used for feature extraction. The software reviewed is mainly for Principal Component Analysis (PCA), Independent Component Analysis (ICA), Partial Least Squares (PLS), and Local Linear Embedding (LLE). A summary and sources of the software are provided in the last section for each feature extraction method. PMID:25250315
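As a minimal example of the feature-extraction step these tools support, PCA with scikit-learn on a gene-expression-shaped matrix (the data here are random placeholders):

```python
# Sketch: PCA as feature extraction on a samples-by-genes matrix.
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 2000))  # samples x genes (placeholder)
Z = PCA(n_components=10).fit_transform(X)              # reduced representation
print(Z.shape)                                         # (100, 10)
```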
A survey of psychology practice in critical-care settings.
Stucky, Kirk; Jutte, Jennifer E; Warren, Ann Marie; Jackson, James C; Merbitz, Nancy
2016-05-01
The aims of this survey study were to (a) examine the frequency of health-service psychology involvement in intensive and critical-care settings; (b) characterize the distinguishing features of these providers; and (c) examine unique or distinguishing features of the hospital setting in which these providers are offering services. χ2 analyses were conducted for group comparisons of health-service psychologists: (a) providing services in critical care versus those with no or limited critical care activity and (b) involved in both critical care and rehabilitation versus those only involved in critical care. A total of 175 surveys met inclusion criteria and were included in the analyses. Psychologists who worked in critical-care settings at least monthly were more likely to be at a Level-1, χ2(1, N = 157) = 9.654, p = .002, or pediatric, χ2(1, N = 158) = 7.081, p = .008, trauma center. Psychologists involved with critical care were more likely to provide services on general medical-surgical units, χ2(1, N = 167) = 45.679, p < .001. A higher proportion of rehabilitation-oriented providers provided intensive care, critical care, and neurointensive care services relative to nonrehabilitation providers. The findings indicate that health-service psychologists are involved in critical-care settings and in various roles. A more broad-based survey of hospitals across the United States would be required to identify how frequently health-service psychologists are consulted and what specific services are most effective, valued, or desired in critical-care settings.
Incipient Fault Detection for Rolling Element Bearings under Varying Speed Conditions.
Xue, Lang; Li, Naipeng; Lei, Yaguo; Li, Ningbo
2017-06-20
Varying speed conditions bring a huge challenge to incipient fault detection of rolling element bearings because both the change of speed and faults could lead to the amplitude fluctuation of vibration signals. Effective detection methods need to be developed to eliminate the influence of speed variation. This paper proposes an incipient fault detection method for bearings under varying speed conditions. Firstly, relative residual (RR) features are extracted, which are insensitive to the varying speed conditions and are able to reflect the degradation trend of bearings. Then, a health indicator named selected negative log-likelihood probability (SNLLP) is constructed to fuse a feature set including RR features and non-dimensional features. Finally, based on the constructed SNLLP health indicator, a novel alarm trigger mechanism is designed to detect the incipient fault. The proposed method is demonstrated using vibration signals from bearing tests and industrial wind turbines. The results verify the effectiveness of the proposed method for incipient fault detection of rolling element bearings under varying speed conditions.
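A hedged sketch of the indicator idea (not the paper's exact SNLLP construction): model the healthy-state feature distribution, use the negative log-likelihood of incoming feature vectors as a health indicator, and raise an alarm when it exceeds a baseline-derived threshold:

```python
# Sketch: negative-log-likelihood health indicator with a simple alarm rule.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
healthy = rng.normal(0, 1, size=(500, 6))          # baseline feature vectors
model = multivariate_normal(mean=healthy.mean(0), cov=np.cov(healthy.T))

nll_base = -model.logpdf(healthy)
threshold = nll_base.mean() + 3 * nll_base.std()   # illustrative 3-sigma rule

incoming = rng.normal(0.8, 1.2, size=(20, 6))      # drifting (degrading) features
alarm = -model.logpdf(incoming) > threshold
print("alarm raised:", alarm.any())
```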
Mining Co-Location Patterns with Clustering Items from Spatial Data Sets
NASA Astrophysics Data System (ADS)
Zhou, G.; Li, Q.; Deng, G.; Yue, T.; Zhou, X.
2018-05-01
The explosive growth of spatial data and widespread use of spatial databases emphasize the need for spatial data mining. Co-location pattern discovery is an important branch of spatial data mining. Spatial co-locations represent subsets of features which are frequently located together in geographic space. However, the appearance of a spatial feature C is often not determined by a single spatial feature A or B but by the two spatial features A and B together; that is to say, where A and B appear together, C often appears. We note that this co-location pattern is different from the traditional co-location pattern. Thus, this paper presents a new concept called clustering items, and this co-location pattern is called co-location patterns with clustering items. Traditional algorithms cannot mine this co-location pattern, so we introduce the related concepts in detail and propose a novel algorithm. This algorithm extends the join-based approach proposed by Huang. Finally, we evaluate the performance of this algorithm.
Espinosa, Alejandro Martínez
2018-01-01
International evidence regarding the relationship between maternal employment and school-age children overweight and obesity shows divergent results. In Mexico, this relationship has not been confirmed by national data set analysis. Consequently, the objective of this article was to evaluate the role of mothers' participation in the labor force in relation to excess body weight in Mexican school-age children (aged 5-11 years). A cross-sectional study was conducted on a sample of 17,418 individuals from the National Health and Nutrition Survey 2012, applying binomial logistic regression models. After controlling for individual, maternal and contextual features, the mothers' participation in the labor force was associated with children's body composition. However, when the household features (living arrangements, household ethnicity, size, food security and socioeconomic status) were incorporated, maternal employment was no longer statistically significant. Household features are crucial factors for understanding the overweight and obesity prevalence levels in Mexican school-age children, despite the mother having a paid job.
A theory of utility conditionals: Paralogical reasoning from decision-theoretic leakage.
Bonnefon, Jean-François
2009-10-01
Many "if p, then q" conditionals have decision-theoretic features, such as antecedents or consequents that relate to the utility functions of various agents. These decision-theoretic features leak into reasoning processes, resulting in various paralogical conclusions. The theory of utility conditionals offers a unified account of the various forms that this phenomenon can take. The theory is built on 2 main components: (1) a representational tool (the utility grid), which summarizes in compact form the decision-theoretic features of a conditional, and (2) a set of folk axioms of decision, which reflect reasoners' beliefs about the way most agents make their decisions. Applying the folk axioms to the utility grid of a conditional allows for the systematic prediction of the paralogical conclusions invited by the utility grid's decision-theoretic features. The theory of utility conditionals significantly extends the scope of current theories of conditional inference and moves reasoning research toward a greater integration with decision-making research.
Driver drowsiness classification using fuzzy wavelet-packet-based feature-extraction algorithm.
Khushaba, Rami N; Kodagoda, Sarath; Lal, Sara; Dissanayake, Gamini
2011-01-01
Driver drowsiness and loss of vigilance are a major cause of road accidents. Monitoring physiological signals while driving provides the possibility of detecting and warning of drowsiness and fatigue. The aim of this paper is to maximize the amount of drowsiness-related information extracted from a set of electroencephalogram (EEG), electrooculogram (EOG), and electrocardiogram (ECG) signals during a simulation driving test. Specifically, we develop an efficient fuzzy mutual-information (MI)-based wavelet packet transform (FMIWPT) feature-extraction method for classifying the driver drowsiness state into one of predefined drowsiness levels. The proposed method estimates the required MI using a novel approach based on fuzzy memberships, providing an accurate information-content estimation measure. The quality of the extracted features was assessed on datasets collected from 31 drivers on a simulation test. The experimental results proved the significance of FMIWPT in extracting features that highly correlate with the different drowsiness levels, achieving a classification accuracy of 95%-97% on average across all subjects.
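A sketch of the pipeline with scikit-learn's mutual-information estimator standing in for the paper's fuzzy-MI measure: wavelet-packet log-energies as features, ranked by MI with the drowsiness label. Signal sizes, the wavelet, and the decomposition level are illustrative assumptions:

```python
# Sketch: wavelet-packet log-energy features ranked by mutual information.
import numpy as np
import pywt
from sklearn.feature_selection import mutual_info_classif

def wp_features(epoch, wavelet="db4", level=3):
    wp = pywt.WaveletPacket(data=epoch, wavelet=wavelet, maxlevel=level)
    return np.array([np.log(np.sum(n.data ** 2) + 1e-12)
                     for n in wp.get_level(level, order="natural")])

rng = np.random.default_rng(0)
epochs = rng.normal(size=(60, 512))     # 60 physiological-signal epochs (placeholder)
labels = rng.integers(0, 3, size=60)    # drowsiness levels (placeholder)
X = np.vstack([wp_features(e) for e in epochs])
print(np.argsort(mutual_info_classif(X, labels))[::-1][:4])  # top-ranked nodes
```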
A Feature and Algorithm Selection Method for Improving the Prediction of Protein Structural Class.
Ni, Qianwu; Chen, Lei
2017-01-01
Correct prediction of protein structural class is beneficial to investigation of protein functions, regulations and interactions. In recent years, several computational methods have been proposed in this regard. However, based on various features, it is still a great challenge to select a proper classification algorithm and extract essential features to participate in classification. In this study, a feature and algorithm selection method was presented for improving the accuracy of protein structural class prediction. Amino acid compositions and physiochemical features were adopted to represent features, and thirty-eight machine learning algorithms collected in Weka were employed. All features were first analyzed by a feature selection method, minimum redundancy maximum relevance (mRMR), producing a feature list. Then, several feature sets were constructed by adding features from the list one by one. For each feature set, the thirty-eight algorithms were executed on a dataset in which proteins were represented by the features in the set. The predicted classes yielded by these algorithms and the true class of each protein were collected to construct a dataset, which was analyzed by the mRMR method, yielding an algorithm list. From the algorithm list, algorithms were taken one by one to build an ensemble prediction model. Finally, we selected the ensemble prediction model with the best performance as the optimal ensemble prediction model. Experimental results indicate that the constructed model is much superior to models using a single algorithm and to models that only adopt the feature selection procedure or the algorithm selection procedure. The feature selection and algorithm selection procedures are really helpful for building an ensemble prediction model that can yield better performance.
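The mRMR-style ranking at the heart of the procedure can be sketched as a greedy loop: pick the feature maximizing relevance (MI with the class) minus mean redundancy (MI with already-selected features). The MI estimators here are scikit-learn's, a stand-in for the paper's exact estimator:

```python
# Sketch: greedy mRMR-style feature ranking with mutual-information estimates.
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr(X, y, n_select):
    relevance = mutual_info_classif(X, y)
    selected = [int(np.argmax(relevance))]
    remaining = set(range(X.shape[1])) - set(selected)
    while len(selected) < n_select:
        best, best_score = None, -np.inf
        for j in remaining:
            redundancy = np.mean([mutual_info_regression(X[:, [j]], X[:, k])[0]
                                  for k in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best, best_score = j, score
        selected.append(best); remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 12)); y = (X[:, 0] + X[:, 1] > 0).astype(int)
print(mrmr(X, y, 4))
```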
Lieffers, Jessica R L; Haresign, Helen; Mehling, Christine; Hanning, Rhona M
2016-09-15
Little is known about the use of goal setting and tracking tools within online programs to support nutrition and physical activity behaviour change. In 2011, Dietitians of Canada added "My Goals," a nutrition and physical activity behaviour goal setting and tracking tool, to their free publicly available self-monitoring website (eaTracker®, http://www.eaTracker.ca/). My Goals allows users to: a) set "ready-made" SMART (Specific, Measurable, Attainable, Realistic, Time-related) goals (choice of n = 87 goals from n = 13 categories) or "write your own" goals, and b) track progress using the "My Goals Tracker." The purpose of this study was to characterize: a) My Goals user demographics, b) types of goals set, and c) My Goals Tracker use. Anonymous data on all goals set using the My Goals feature from December 6, 2012 to April 28, 2014 by users ≥19y from Ontario and Alberta, Canada were obtained. This dataset contained: anonymous self-reported user demographic data, user set goals, and My Goals Tracker use data. Write your own goals were categorized by topic and specificity. Data were summarized using descriptive statistics. Multivariate binary logistic regression was used to determine associations between user demographics and a) goal topic areas and b) My Goals Tracker use. Overall, n = 16,511 goal statements (75.4% ready-made; 24.6% write your own) set by n = 8,067 adult users 19-85y (83.3% female; mean age 41.1 ± 15.0y, mean BMI 28.8 ± 7.6 kg/m²) were included for analysis. Overall, 33.1% of ready-made goals were from the "Managing your Weight" category. Of write your own goal entries, 42.3% were solely distal goals (most related to weight management); 38.6% addressed nutrition behaviour change (16.6% had unspecific general eating goals); 18.1% addressed physical activity behaviour change (47.3% had goals without information on exercise amount and type). Many write your own goals were of poor quality (e.g., non-specific, missing amounts) and possibly unrealistic (e.g., no sugar). Few goals were tracked (<10%). Demographic variables had statistically significant relations with goal topic areas and My Goals Tracker use. eaTracker® users had high interest in goal setting and the My Goals feature; however, self-written goals were often poor quality and goal tracking was rare. Further research is needed to better support users.
Action recognition using mined hierarchical compound features.
Gilbert, Andrew; Illingworth, John; Bowden, Richard
2011-05-01
The field of Action Recognition has seen a large increase in activity in recent years. Much of the progress has been through incorporating ideas from single-frame object recognition and adapting them for temporal-based action recognition. Inspired by the success of interest points in the 2D spatial domain, their 3D (space-time) counterparts typically form the basic components used to describe actions, and in action recognition the features used are often engineered to fire sparsely. This is to ensure that the problem is tractable; however, this can sacrifice recognition accuracy as it cannot be assumed that the optimum features in terms of class discrimination are obtained from this approach. In contrast, we propose to initially use an overcomplete set of simple 2D corners in both space and time. These are grouped spatially and temporally using a hierarchical process, with an increasing search area. At each stage of the hierarchy, the most distinctive and descriptive features are learned efficiently through data mining. This allows large amounts of data to be searched for frequently reoccurring patterns of features. At each level of the hierarchy, the mined compound features become more complex, discriminative, and sparse. This results in fast, accurate recognition with real-time performance on high-resolution video. As the compound features are constructed and selected based upon their ability to discriminate, their speed and accuracy increase at each level of the hierarchy. The approach is tested on four state-of-the-art data sets: the popular KTH data set, to provide a comparison with other state-of-the-art approaches; the Multi-KTH data set, to illustrate performance on simultaneous multi-action classification despite no explicit localization information being provided during training; and the recent Hollywood and Hollywood2 data sets, which provide challenging complex actions taken from commercial movie sequences. For all four data sets, the proposed hierarchical approach outperforms all other methods reported thus far in the literature and can achieve real-time operation.
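The transaction-style mining step can be illustrated with a toy counter over quantized corner IDs; this stands in for the paper's hierarchical compound-feature mining and is not its implementation.

```python
# Toy sketch: treat each interest point's neighbourhood as a "transaction" of
# quantized corner IDs and count frequently reoccurring pairs; pairs that meet
# the support threshold would be promoted to the next hierarchy level.
from collections import Counter
from itertools import combinations

transactions = [
    {"c3", "c7", "c9"}, {"c3", "c7"}, {"c1", "c3", "c7"},
    {"c2", "c5"}, {"c3", "c5", "c7"},
]
min_support = 3
pair_counts = Counter()
for t in transactions:
    for pair in combinations(sorted(t), 2):
        pair_counts[pair] += 1

compound_features = [p for p, n in pair_counts.items() if n >= min_support]
print(compound_features)   # [('c3', 'c7')]
```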
Jozwik, Kamila M.; Kriegeskorte, Nikolaus; Storrs, Katherine R.; Mur, Marieke
2017-01-01
Recent advances in Deep convolutional Neural Networks (DNNs) have enabled unprecedentedly accurate computational models of brain representations, and present an exciting opportunity to model diverse cognitive functions. State-of-the-art DNNs achieve human-level performance on object categorisation, but it is unclear how well they capture human behavior on complex cognitive tasks. Recent reports suggest that DNNs can explain significant variance in one such task, judging object similarity. Here, we extend these findings by replicating them for a rich set of object images, comparing performance across layers within two DNNs of different depths, and examining how the DNNs’ performance compares to that of non-computational “conceptual” models. Human observers performed similarity judgments for a set of 92 images of real-world objects. Representations of the same images were obtained in each of the layers of two DNNs of different depths (8-layer AlexNet and 16-layer VGG-16). To create conceptual models, other human observers generated visual-feature labels (e.g., “eye”) and category labels (e.g., “animal”) for the same image set. Feature labels were divided into parts, colors, textures and contours, while category labels were divided into subordinate, basic, and superordinate categories. We fitted models derived from the features, categories, and from each layer of each DNN to the similarity judgments, using representational similarity analysis to evaluate model performance. In both DNNs, similarity within the last layer explains most of the explainable variance in human similarity judgments. The last layer outperforms almost all feature-based models. Late and mid-level layers outperform some but not all feature-based models. Importantly, categorical models predict similarity judgments significantly better than any DNN layer. Our results provide further evidence for commonalities between DNNs and brain representations. Models derived from visual features other than object parts perform relatively poorly, perhaps because DNNs more comprehensively capture the colors, textures and contours which matter to human object perception. However, categorical models outperform DNNs, suggesting that further work may be needed to bring high-level semantic representations in DNNs closer to those extracted by humans. Modern DNNs explain similarity judgments remarkably well considering they were not trained on this task, and are promising models for many aspects of human cognition. PMID:29062291
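A minimal sketch of the representational similarity analysis used to evaluate the models, assuming SciPy; random arrays stand in for the DNN-layer activations and the human similarity judgments.

```python
# Minimal RSA sketch: build a model RDM from layer activations and correlate
# it with a behavioural RDM of judged dissimilarities.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
layer_activations = rng.normal(size=(92, 4096))   # 92 images x layer units
judged_dissimilarity = rng.random(92 * 91 // 2)   # behavioural RDM (condensed)

model_rdm = pdist(layer_activations, metric="correlation")
rho, _ = spearmanr(model_rdm, judged_dissimilarity)
print(f"model-behaviour RDM correlation: {rho:.3f}")
```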
NASA Astrophysics Data System (ADS)
Tadini, A.; Bisson, M.; Neri, A.; Cioni, R.; Bevilacqua, A.; Aspinall, W. P.
2017-06-01
This study presents new and revised data sets about the spatial distribution of past volcanic vents, eruptive fissures, and regional/local structures of the Somma-Vesuvio volcanic system (Italy). The innovative features of the study are the identification and quantification of important sources of uncertainty affecting interpretations of the data sets. In this regard, the spatial uncertainty of each feature is modeled by an uncertainty area, i.e., a geometric element typically represented by a polygon drawn around points or lines. The new data sets have been assembled as an updatable geodatabase that integrates and complements existing databases for Somma-Vesuvio. The data are organized into 4 data sets and stored as 11 feature classes (points and lines for feature locations and polygons for the associated uncertainty areas), totaling more than 1700 elements. More specifically, volcanic vent and eruptive fissure elements are subdivided into feature classes according to their associated eruptive styles: (i) Plinian and sub-Plinian eruptions (i.e., large- or medium-scale explosive activity); (ii) violent Strombolian and continuous ash emission eruptions (i.e., small-scale explosive activity); and (iii) effusive eruptions (including eruptions from both parasitic vents and eruptive fissures). Regional and local structures (i.e., deep faults) are represented as linear feature classes. To support interpretation of the eruption data, additional data sets are provided for Somma-Vesuvio geological units and caldera morphological features. In the companion paper, the data presented here, and the associated uncertainties, are used to develop a first vent opening probability map for the Somma-Vesuvio caldera, with specific attention focused on large or medium explosive events.
Identification of informative features for predicting proinflammatory potentials of engine exhausts.
Wang, Chia-Chi; Lin, Ying-Chi; Lin, Yuan-Chung; Jhang, Syu-Ruei; Tung, Chun-Wei
2017-08-18
The immunotoxicity of engine exhausts is of high concern to human health due to the increasing prevalence of immune-related diseases. However, the evaluation of immunotoxicity of engine exhausts is currently based on expensive and time-consuming experiments. It is desirable to develop efficient methods for immunotoxicity assessment. To accelerate the development of safe alternative fuels, this study proposed a computational method for identifying informative features for predicting proinflammatory potentials of engine exhausts. A principal component regression (PCR) algorithm was applied to develop prediction models. The informative features were identified by a sequential backward feature elimination (SBFE) algorithm. A total of 19 informative chemical and biological features were successfully identified by SBFE algorithm. The informative features were utilized to develop a computational method named FS-CBM for predicting proinflammatory potentials of engine exhausts. FS-CBM model achieved a high performance with correlation coefficient values of 0.997 and 0.943 obtained from training and independent test sets, respectively. The FS-CBM model was developed for predicting proinflammatory potentials of engine exhausts with a large improvement on prediction performance compared with our previous CBM model. The proposed method could be further applied to construct models for bioactivities of mixtures.
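A minimal sketch, assuming scikit-learn, pairing principal component regression with sequential backward feature elimination; the synthetic regression problem and component count are stand-ins for the exhaust chemistry and bioassay features.

```python
# Sketch of PCR with sequential backward feature elimination (SBFE).
from sklearn.datasets import make_regression
from sklearn.decomposition import PCA
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

X, y = make_regression(n_samples=120, n_features=40, n_informative=19,
                       noise=5.0, random_state=0)

pcr = make_pipeline(PCA(n_components=5), LinearRegression())
sbfe = SequentialFeatureSelector(pcr, n_features_to_select=19,
                                 direction="backward", cv=5)
sbfe.fit(X, y)
print("retained feature indices:", sbfe.get_support(indices=True))
```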
Douglas, Pamela K.; Lau, Edward; Anderson, Ariana; Head, Austin; Kerr, Wesley; Wollner, Margalit; Moyer, Daniel; Li, Wei; Durnhofer, Mike; Bramen, Jennifer; Cohen, Mark S.
2013-01-01
The complex task of assessing the veracity of a statement is thought to activate uniquely distributed brain regions based on whether a subject believes or disbelieves a given assertion. In the current work, we present parallel machine learning methods for predicting a subject's decision response to a given propositional statement based on independent component (IC) features derived from EEG and fMRI data. Our results demonstrate that IC features outperformed features derived from event-related spectral perturbations in any single spectral band, yet were similar in accuracy to features from all spectral bands combined. We compared our diagnostic IC spatial maps with our conventional general linear model (GLM) results, and found that informative ICs had significant spatial overlap with our GLM results, yet also revealed unique regions, such as the amygdala, that were not statistically significant in GLM analyses. Overall, these results suggest that ICs may yield a parsimonious feature set that can be used along with a decision tree structure for interpretation of features used in classifying complex cognitive processes such as belief and disbelief across both fMRI and EEG neuroimaging modalities. PMID:23914164
Voxel classification based airway tree segmentation
NASA Astrophysics Data System (ADS)
Lo, Pechin; de Bruijne, Marleen
2008-03-01
This paper presents a voxel classification based method for segmenting the human airway tree in volumetric computed tomography (CT) images. In contrast to standard methods that use only voxel intensities, our method uses a more complex appearance model based on a set of local image appearance features and Kth nearest neighbor (KNN) classification. The optimal set of features for classification is selected automatically from a large set of features describing the local image structure at several scales. The use of multiple features enables the appearance model to differentiate between airway tree voxels and other voxels of similar intensities in the lung, thus making the segmentation robust to pathologies such as emphysema. The classifier is trained on imperfect segmentations that can easily be obtained using region growing with a manual threshold selection. Experiments show that the proposed method results in a more robust segmentation that can grow into the smaller airway branches without leaking into emphysematous areas, and is able to segment many branches that are not present in the training set.
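A minimal sketch of voxel classification with multi-scale local appearance features and a KNN classifier, assuming SciPy and scikit-learn; the random volume and two-feature bank are stand-ins for the CT data and the paper's larger automatically selected feature set.

```python
# Sketch: per-voxel multi-scale features (smoothed intensity, gradient
# magnitude) classified with K-nearest neighbours.
import numpy as np
from scipy import ndimage
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
volume = rng.normal(size=(32, 32, 32))
labels = (volume < -0.5).astype(int)          # stand-in "airway" mask

feats = []
for scale in (1.0, 2.0, 4.0):                 # several scales
    feats.append(ndimage.gaussian_filter(volume, sigma=scale).ravel())
    feats.append(ndimage.gaussian_gradient_magnitude(volume, sigma=scale).ravel())
X = np.stack(feats, axis=1)
y = labels.ravel()

knn = KNeighborsClassifier(n_neighbors=15).fit(X[::2], y[::2])  # train on half
print("held-out voxel accuracy:", knn.score(X[1::2], y[1::2]))
```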
Search time critically depends on irrelevant subset size in visual search.
Benjamins, Jeroen S; Hooge, Ignace T C; van Elst, Jacco C; Wertheim, Alexander H; Verstraten, Frans A J
2009-02-01
In order for our visual system to deal with the massive amount of sensory input, some of this input is discarded, while other parts are processed [Wolfe, J. M. (1994). Guided search 2.0: a revised model of visual search. Psychonomic Bulletin and Review, 1, 202-238]. From the visual search literature it is unclear how well one set of items can be selected that differs in only one feature from the target (a 1F set), while another set of items can be ignored that differs in two features from the target (a 2F set). We systematically varied the percentage of 2F non-targets to determine the contribution of these non-targets to search behaviour. Increasing the percentage of 2F non-targets that have to be ignored was expected to result in increasingly faster search, since it decreases the size of the 1F set that has to be searched. Observers searched large displays for a target in the 1F set with a variable percentage of 2F non-targets. Interestingly, when the search displays contained 5% 2F non-targets, search times were longer than in the other conditions. This effect of 2F non-targets on performance was independent of set size. An inspection of the saccades revealed that saccade target selection did not contribute to the longer search times in displays with 5% 2F non-targets. The occurrence of longer search times in displays containing 5% 2F non-targets might be attributed to covert processes related to visual analysis of the fixated part of the display. Apparently, visual search performance critically depends on the percentage of irrelevant 2F non-targets.
Pairwise diversity ranking of polychotomous features for ensemble physiological signal classifiers.
Gupta, Lalit; Kota, Srinivas; Molfese, Dennis L; Vaidyanathan, Ravi
2013-06-01
It is well known that fusion classifiers for physiological signal classification with diverse components (classifiers or data sets) outperform those with less diverse components. Determining component diversity, therefore, is of the utmost importance in the design of fusion classifiers that are often employed in clinical diagnostic and numerous other pattern recognition problems. In this article, a new pairwise diversity-based ranking strategy is introduced to select a subset of ensemble components, which when combined will be more diverse than any other component subset of the same size. The strategy is unified in the sense that the components can be classifiers or data sets. Moreover, the classifiers and data sets can be polychotomous. Classifier-fusion and data-fusion systems are formulated based on the diversity-based selection strategy, and the application of the two fusion strategies are demonstrated through the classification of multichannel event-related potentials. It is observed that for both classifier and data fusion, the classification accuracy tends to increase/decrease when the diversity of the component ensemble increases/decreases. For the four sets of 14-channel event-related potentials considered, it is shown that data fusion outperforms classifier fusion. Furthermore, it is demonstrated that the combination of data components that yield the best performance, in a relative sense, can be determined through the diversity-based selection strategy.
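A minimal sketch of the pairwise diversity idea, using the disagreement rate between prediction vectors as the diversity measure (an assumption; the paper's measure may differ) and greedily growing the most diverse component subset.

```python
# Sketch: rank components (classifiers or data sets) by pairwise diversity and
# greedily select the subset whose members disagree with each other most.
import numpy as np

rng = np.random.default_rng(0)
preds = rng.integers(0, 4, size=(6, 200))     # 6 components x 200 decisions

def disagreement(a, b):
    return np.mean(a != b)                    # pairwise diversity measure

chosen = [0]
while len(chosen) < 3:
    rest = [i for i in range(len(preds)) if i not in chosen]
    # pick the component most diverse, on average, from those already chosen
    nxt = max(rest, key=lambda i: np.mean([disagreement(preds[i], preds[j])
                                           for j in chosen]))
    chosen.append(nxt)
print("most diverse 3-component subset:", chosen)
```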
Karst in the United States: a digital map compilation and database
Weary, David J.; Doctor, Daniel H.
2014-01-01
This report describes new digital maps delineating areas of the United States, including Puerto Rico and the U.S. Virgin Islands, having karst or the potential for development of karst and pseudokarst. These maps show areas underlain by soluble rocks and also by volcanic rocks, sedimentary deposits, and permafrost that have potential for karst or pseudokarst development. All 50 States contain rocks with potential for karst development, and about 18 percent of their area is underlain by soluble rocks having karst or the potential for development of karst features. The areas of soluble rocks shown are based primarily on selection from State geologic maps of rock units containing significant amounts of carbonate or evaporite minerals. Areas underlain by soluble rocks are further classified by general climate setting, degree of induration, and degree of exposure. Areas having potential for volcanic pseudokarst are those underlain chiefly by basaltic-flow rocks no older than Miocene in age. Areas with potential for pseudokarst features in sedimentary rocks are in relatively unconsolidated rocks from which pseudokarst features, such as piping caves, have been reported. Areas having potential for development of thermokarst features, mapped exclusively in Alaska, contain permafrost in relatively thick surficial deposits containing ground ice. This report includes a GIS database with links from the map unit polygons to online geologic unit descriptions.
NASA Astrophysics Data System (ADS)
Kakkos, I.; Gkiatis, K.; Bromis, K.; Asvestas, P. A.; Karanasiou, I. S.; Ventouras, E. M.; Matsopoulos, G. K.
2017-11-01
The detection of an error is the cognitive evaluation of an action outcome that is considered undesired or mismatches an expected response. Brain activity during monitoring of correct and incorrect responses elicits Event Related Potentials (ERPs) revealing complex cerebral responses to deviant sensory stimuli. Development of accurate error detection systems is of great importance both concerning practical applications and in investigating the complex neural mechanisms of decision making. In this study, data are used from an audio identification experiment that was implemented with two levels of complexity in order to investigate neurophysiological error processing mechanisms in actors and observers. To examine and analyse the variations of the processing of erroneous sensory information for each level of complexity we employ Support Vector Machines (SVM) classifiers with various learning methods and kernels using characteristic ERP time-windowed features. For dimensionality reduction and to remove redundant features we implement a feature selection framework based on Sequential Forward Selection (SFS). The proposed method provided high accuracy in identifying correct and incorrect responses both for actors and for observers with mean accuracy of 93% and 91% respectively. Additionally, computational time was reduced and the effects of the nesting problem usually occurring in SFS of large feature sets were alleviated.
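A minimal sketch, assuming scikit-learn, of SVM classification with sequential forward selection over time-windowed ERP features; the data dimensions and selected-feature budget are illustrative.

```python
# Sketch: SVM with sequential forward selection (SFS) of ERP features.
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))        # trials x time-windowed ERP features
y = rng.integers(0, 2, size=200)      # correct vs incorrect response

svm = SVC(kernel="rbf")
sfs = SequentialFeatureSelector(svm, n_features_to_select=8,
                                direction="forward", cv=5).fit(X, y)
X_sel = sfs.transform(X)
print("CV accuracy on selected features:",
      cross_val_score(svm, X_sel, y, cv=5).mean())
```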
Statistical universals reveal the structures and functions of human music.
Savage, Patrick E; Brown, Steven; Sakai, Emi; Currie, Thomas E
2015-07-21
Music has been called "the universal language of mankind." Although contemporary theories of music evolution often invoke various musical universals, the existence of such universals has been disputed for decades and has never been empirically demonstrated. Here we combine a music-classification scheme with statistical analyses, including phylogenetic comparative methods, to examine a well-sampled global set of 304 music recordings. Our analyses reveal no absolute universals but strong support for many statistical universals that are consistent across all nine geographic regions sampled. These universals include 18 musical features that are common individually as well as a network of 10 features that are commonly associated with one another. They span not only features related to pitch and rhythm that are often cited as putative universals but also rarely cited domains including performance style and social context. These cross-cultural structural regularities of human music may relate to roles in facilitating group coordination and cohesion, as exemplified by the universal tendency to sing, play percussion instruments, and dance to simple, repetitive music in groups. Our findings highlight the need for scientists studying music evolution to expand the range of musical cultures and musical features under consideration. The statistical universals we identified represent important candidates for future investigation.
Rothman, Linda; Buliung, Ron; Macarthur, Colin; To, Teresa; Howard, Andrew
2014-02-01
The child active transportation literature has focused on walking, with little attention to risk associated with increased traffic exposure. This paper reviews the literature related to built environment correlates of walking and pedestrian injury in children together, to broaden the current conceptualization of walkability to include injury prevention. Two independent searches were conducted focused on walking in children and child pedestrian injury within nine electronic databases until March, 2012. Studies were included which: 1) were quantitative 2) set in motorized countries 3) were either urban or suburban 4) investigated specific built environment risk factors 5) had outcomes of either walking in children and/or child pedestrian roadway collisions (ages 0-12). Built environment features were categorized according to those related to density, land use diversity or roadway design. Results were cross-tabulated to identify how built environment features associate with walking and injury. Fifty walking and 35 child pedestrian injury studies were identified. Only traffic calming and presence of playgrounds/recreation areas were consistently associated with more walking and less pedestrian injury. Several built environment features were associated with more walking, but with increased injury. Many features had inconsistent results or had not been investigated for either outcome. The findings emphasise the importance of incorporating safety into the conversation about creating more walkable cities.
Robust Feature Selection Technique using Rank Aggregation.
Sarkar, Chandrima; Cooley, Sarah; Srivastava, Jaideep
2014-01-01
Although feature selection is a well-developed research area, there is an ongoing need to develop methods to make classifiers more efficient. One important challenge is the lack of a universal feature selection technique which produces similar outcomes with all types of classifiers. This is because all feature selection techniques have individual statistical biases while classifiers exploit different statistical properties of data for evaluation. In numerous situations this can put researchers in a dilemma as to which feature selection method and which classifier to choose from a vast range of options. In this paper, we propose a technique that aggregates the consensus properties of various feature selection methods to develop a more optimal solution. The ensemble nature of our technique makes it more robust across various classifiers. In other words, it is stable towards achieving similar and ideally higher classification accuracy across a wide variety of classifiers. We quantify this concept of robustness with a measure known as the Robustness Index (RI). We perform an extensive empirical evaluation of our technique on eight data sets with different dimensions including Arrhythmia, Lung Cancer, Madelon, mfeat-fourier, internet-ads, Leukemia-3c and Embryonal Tumor and a real-world data set, namely Acute Myeloid Leukemia (AML). We demonstrate not only that our algorithm is more robust, but also that compared to other techniques our algorithm improves the classification accuracy by approximately 3-4% (in data sets with fewer than 500 features) and by more than 5% (in data sets with more than 500 features), across a wide range of classifiers.
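One common way to aggregate rankings from several feature selection methods is a Borda count, sketched below; the paper's aggregation scheme may differ in detail.

```python
# Sketch: Borda-count aggregation of feature rankings from multiple methods.
import numpy as np

# Each row: one method's ranking of 6 features (best first).
rankings = np.array([
    [2, 0, 5, 1, 3, 4],
    [0, 2, 1, 5, 4, 3],
    [2, 1, 0, 3, 5, 4],
])
n = rankings.shape[1]
scores = np.zeros(n)
for r in rankings:
    for pos, feat in enumerate(r):
        scores[feat] += n - pos          # earlier position -> more points

consensus = np.argsort(-scores)
print("consensus feature order:", consensus)
```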
Rowan, L.C.; Trautwein, C.M.; Purdy, T.L.
1990-01-01
This study was undertaken as part of the Conterminous U.S. Mineral Assessment Program (CUSMAP). The purpose of the study was to map linear features on Landsat Multispectral Scanner (MSS) images and a proprietary side-looking airborne radar (SLAR) image mosaic and to determine the spatial relationship between these linear features and the locations of metallic mineral occurrences. The results show a close spatial association of linear features with metallic mineral occurrences in parts of the quadrangle, but in other areas the association is less well defined. Linear features are defined as distinct linear and slightly curvilinear elements mappable on MSS and SLAR images. The features generally represent linear segments of streams, ridges, and terminations of topographic features; however, they may also represent tonal patterns that are related to variations in lithology and vegetation. Most linear features in the Butte quadrangle probably represent underlying structural elements, such as fractures (with and without displacement), dikes, and alignment of fold axes. However, in areas underlain by sedimentary rocks, some of the linear features may reflect bedding traces. This report describes the geologic setting of the Butte quadrangle, the procedures used in mapping and analyzing the linear features, and the results of the study. Relationships of these features to placer and non-metal deposits were not analyzed in this study and are not discussed in this report.
Vision-Based UAV Flight Control and Obstacle Avoidance
2006-01-01
... denoted it by Vb = (Vb1, Vb2, Vb3). Fig. 2 shows the block diagram of the proposed vision-based motion analysis and obstacle avoidance system. ... Structure analysis often involves computation-intensive computer vision tasks, such as feature extraction and geometric modeling. ... First, we extract a set of features from each block. Second, we compute the distance between these two sets of features. In conventional motion ...
Non-specific filtering of beta-distributed data.
Wang, Xinhui; Laird, Peter W; Hinoue, Toshinori; Groshen, Susan; Siegmund, Kimberly D
2014-06-19
Non-specific feature selection is a dimension reduction procedure performed prior to cluster analysis of high dimensional molecular data. Not all measured features are expected to show biological variation, so only the most varying are selected for analysis. In DNA methylation studies, DNA methylation is measured as a proportion, bounded between 0 and 1, with variance a function of the mean. Filtering on standard deviation biases the selection of probes to those with mean values near 0.5. We explore the effect this has on clustering, and develop alternate filter methods that utilize a variance stabilizing transformation for Beta distributed data and do not share this bias. We compared results for 11 different non-specific filters on eight Infinium HumanMethylation data sets, selected to span a variety of biological conditions. We found that for data sets having a small fraction of samples showing abnormal methylation of a subset of normally unmethylated CpGs, a characteristic of the CpG island methylator phenotype in cancer, a novel filter statistic that utilized a variance-stabilizing transformation for Beta distributed data outperformed the common filter of using standard deviation of the DNA methylation proportion, or its log-transformed M-value, in its ability to detect the cancer subtype in a cluster analysis. However, the standard deviation filter always performed among the best for distinguishing subgroups of normal tissue. The novel filter and standard deviation filter tended to favour features in different genome contexts; for the same data set, the novel filter always selected more features from CpG island promoters and the standard deviation filter always selected more features from non-CpG island intergenic regions. Interestingly, despite selecting largely non-overlapping sets of features, the two filters did find sample subsets that overlapped for some real data sets. We found two different filter statistics that tended to prioritize features with different characteristics, each performed well for identifying clusters of cancer and non-cancer tissue, and identifying a cancer CpG island hypermethylation phenotype. Since cluster analysis is for discovery, we would suggest trying both filters on any new data sets, evaluating the overlap of features selected and clusters discovered.
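A minimal sketch comparing a standard-deviation filter on raw methylation proportions with one applied after a variance-stabilizing transformation. The arcsine square root used here is one common VST for proportions and stands in for the paper's Beta-distribution-based statistic, which may differ.

```python
# Sketch: non-specific filtering of Beta-distributed values, with and without
# a variance-stabilizing transformation (VST).
import numpy as np

rng = np.random.default_rng(0)
beta_vals = rng.beta(a=0.5, b=0.5, size=(5000, 60))   # CpGs x samples

sd_raw = beta_vals.std(axis=1)                        # biased toward mean ~0.5
sd_vst = np.arcsin(np.sqrt(beta_vals)).std(axis=1)    # VST before filtering

top_raw = np.argsort(-sd_raw)[:500]
top_vst = np.argsort(-sd_vst)[:500]
overlap = len(set(top_raw) & set(top_vst))
print(f"overlap of top-500 features between filters: {overlap}")
```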
Aerodynamic features of flames in premixed gases
NASA Technical Reports Server (NTRS)
Oppenheim, A. K.
1984-01-01
A variety of experimentally established flame phenomena in premixed gases are interpreted by relating them to basic aerodynamic properties of the flow field. On this basis the essential mechanism of some well known characteristic features of flames stabilized in the wake of a bluff-body or propagating in ducts are revealed. Elementary components of the flame propagation process are shown to be: rotary motion, self-advancement, and expansion. Their consequences are analyzed under a most strict set of idealizations that permit the flow field to be treated as potential in character, while the flame is modelled as a Stefan-like interface capable of exerting a feed-back effect upon the flow field. The results provide an insight into the fundamental fluid-mechanical reasons for the experimentally observed distortions of the flame front, rationalizing in particular its ability to sustain relatively high flow velocities at amazingly low normal burning speeds.
Mattson, S N; Riley, E P; Gramling, L; Delis, D C; Jones, K L
1998-01-01
Fetal alcohol syndrome (FAS) is associated with behavioral and cognitive deficits. However, the majority of children born to alcohol-abusing women do not meet the formal criteria for FAS and it is not known if the cognitive abilities of these children differ from those of children with FAS. Using a set of neuropsychological tests, 3 groups were compared: (a) children with FAS, (b) children without FAS who were born to alcohol-abusing women (the PEA group), and (c) normal controls. The results indicated that, relative to controls, both the FAS and the PEA groups were impaired on tests of language, verbal learning and memory, academic skills, fine-motor speed, and visual-motor integration. These data suggest that heavy prenatal alcohol exposure is related to a consistent pattern of neuropsychological deficits and the degree of these deficits may be independent of the presence of physical features associated with FAS.
A numerical relativity scheme for cosmological simulations
NASA Astrophysics Data System (ADS)
Daverio, David; Dirian, Yves; Mitsou, Ermis
2017-12-01
Cosmological simulations involving the fully covariant gravitational dynamics may prove relevant in understanding relativistic/non-linear features and, therefore, in taking better advantage of the upcoming large scale structure survey data. We propose a new 3 + 1 integration scheme for general relativity in the case where the matter sector contains a minimally-coupled perfect fluid field. The original feature is that we completely eliminate the fluid components through the constraint equations, thus remaining with a set of unconstrained evolution equations for the rest of the fields. This procedure does not constrain the lapse function and shift vector, so it holds in arbitrary gauge and also works for arbitrary equation of state. An important advantage of this scheme is that it allows one to define and pass an adaptation of the robustness test to the cosmological context, at least in the case of pressureless perfect fluid matter, which is the relevant one for late-time cosmology.
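For orientation, the 3+1 constraint equations with perfect-fluid sources take the standard (ADM) form below; this is textbook notation, not necessarily the paper's exact formulation. Here γ_ij is the spatial metric, K_ij the extrinsic curvature, D_j the spatial covariant derivative, and ρ and S^i the energy and momentum densities seen by normal observers, which are the fluid quantities the scheme eliminates through these constraints.

```latex
\mathcal{H} \equiv {}^{(3)}\!R + K^2 - K_{ij}K^{ij} - 16\pi\rho = 0 , \qquad
\mathcal{M}^{i} \equiv D_j\!\left(K^{ij} - \gamma^{ij}K\right) - 8\pi S^{i} = 0 .
```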
Mud Volcanoes in the Martian Lowlands: Potential Windows to Fluid-Rich Samples from Depth
NASA Technical Reports Server (NTRS)
Oehler, Dorothy Z.; Allen, Carlton C.
2009-01-01
The regional setting of the Chryse-Acidalia area augurs well for a fluid-rich subsurface, accumulation of diverse rock types reflecting the wide catchment area, astrobiological prospectivity, and mud volcanism. The latter provides a mechanism for transporting samples from relatively great depth to the surface. Since mud volcanoes are not associated with extreme heat or shock pressures, materials they transport to the surface are likely to be relatively unaltered; thus such materials could contain interpretable remnants of potential martian life (e.g., organic chemical biomarkers, mineral biosignatures, or structural remains) as well as unmetamorphosed rock samples. None of the previous landings on Mars was located in an area with features identified as potential mud volcanoes (Fig. 3), but some of these features may offer targets for future missions aimed at sampling deep fluid-rich strata with potential habitable zones.
Pan, Rui; Wang, Hansheng; Li, Runze
2016-01-01
This paper is concerned with the problem of feature screening for multi-class linear discriminant analysis in an ultrahigh-dimensional setting. We allow the number of classes to be relatively large. As a result, the total number of relevant features is larger than usual. This makes the related classification problem much more challenging than the conventional one, where the number of classes is small (very often two). To solve the problem, we propose a novel pairwise sure independence screening method for linear discriminant analysis with an ultrahigh dimensional predictor. The proposed procedure is directly applicable to the situation with many classes. We further prove that the proposed method is screening consistent. Simulation studies are conducted to assess the finite sample performance of the new procedure. We also demonstrate the proposed methodology via an empirical analysis of a real-life example on handwritten Chinese character recognition. PMID:28127109
Relational particle models: I. Reconciliation with standard classical and quantum theory
NASA Astrophysics Data System (ADS)
Anderson, Edward
2006-04-01
This paper concerns the absolute versus relative motion debate. The Barbour and Bertotti (1982) work may be viewed as an indirectly set up relational formulation of a portion of Newtonian mechanics. I consider further direct formulations of this and argue that the portion in question—universes with zero total angular momentum that are conservative and with kinetic terms that are (homogeneous) quadratic in their velocities—is capable of accommodating a wide range of classical physics phenomena. Furthermore, as I develop in paper II, this relational particle model is a useful toy model for canonical general relativity. I consider what happens if one quantizes relational rather than absolute mechanics, indeed whether the latter is misleading. By exploiting Jacobi coordinates, I show how to access many examples of quantized relational particle models and then interpret these from a relational perspective. By these means, previous suggestions of bad semiclassicality for such models can be eluded. I show how small (particle number) universe relational particle model examples display eigenspectrum truncation, gaps, energy interlocking and counterbalanced total angular momentum. These features mean that these small universe models make interesting toy models for some aspects of closed-universe quantum cosmology. Meanwhile, these features do not compromise the recovery of reality as regards the practicalities of experimentation in a large universe such as our own.
Optimizing Nanoscale Quantitative Optical Imaging of Subfield Scattering Targets
Henn, Mark-Alexander; Barnes, Bryan M.; Zhou, Hui; Sohn, Martin; Silver, Richard M.
2016-01-01
The full 3-D scattered field above finite sets of features has been shown to contain a continuum of spatial frequency information, and with novel optical microscopy techniques and electromagnetic modeling, deep-subwavelength geometrical parameters can be determined. Similarly, by using simulations, scattering geometries and experimental conditions can be established to tailor scattered fields that yield lower parametric uncertainties while decreasing the number of measurements and the area of such finite sets of features. Such optimized conditions are reported through quantitative optical imaging in 193 nm scatterfield microscopy using feature sets up to four times smaller in area than state-of-the-art critical dimension targets. PMID:27805660
Visual Saliency Detection Based on Multiscale Deep CNN Features.
Guanbin Li; Yizhou Yu
2016-11-01
Visual saliency is a fundamental problem in both cognitive and computational sciences, including computer vision. In this paper, we discover that a high-quality visual saliency model can be learned from multiscale features extracted using deep convolutional neural networks (CNNs), which have had many successes in visual recognition tasks. For learning such saliency models, we introduce a neural network architecture, which has fully connected layers on top of CNNs responsible for feature extraction at three different scales. The penultimate layer of our neural network has been confirmed to be a discriminative high-level feature vector for saliency detection, which we call deep contrast feature. To generate a more robust feature, we integrate handcrafted low-level features with our deep contrast feature. To promote further research and evaluation of visual saliency models, we also construct a new large database of 4447 challenging images and their pixelwise saliency annotations. Experimental results demonstrate that our proposed method is capable of achieving the state-of-the-art performance on all public benchmarks, improving the F-measure by 6.12% and 10%, respectively, on the DUT-OMRON data set and our new data set (HKU-IS), and lowering the mean absolute error by 9% and 35.3%, respectively, on these two data sets.
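A minimal sketch, assuming PyTorch, of a model that extracts deep features at three spatial scales and fuses them with fully connected layers; the layer sizes and backbone are illustrative, not the paper's architecture.

```python
# Sketch: shared CNN backbone applied at three scales, fused by FC layers.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiScaleSaliency(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(          # shared feature extractor
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten())
        self.fc = nn.Sequential(nn.Linear(3 * 16 * 16, 64), nn.ReLU(),
                                nn.Linear(64, 1))

    def forward(self, x):
        feats = []
        for scale in (1.0, 0.5, 0.25):          # three spatial scales
            xs = x if scale == 1.0 else F.interpolate(
                x, scale_factor=scale, mode="bilinear", align_corners=False)
            feats.append(self.backbone(xs))
        return torch.sigmoid(self.fc(torch.cat(feats, dim=1)))

saliency = MultiScaleSaliency()(torch.rand(2, 3, 64, 64))
print(saliency.shape)   # one saliency score per input region
```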
Apollo-Soyuz pamphlet no. 5: The earth from orbit. [experimental design]
NASA Technical Reports Server (NTRS)
Page, L. W.; From, T. P.
1977-01-01
Astronaut training in the recognition of various geological features from space is described, as well as the cameras, lenses and film used in experiment MA-136 to measure their effectiveness in photographing earth structural features from orbit. Aerosols that affect climate and weather are discussed in relation to experiment MA-007, which relied on infrared observations of the setting or rising sun, as seen from Apollo, to measure the amount of dust and droplets in the lower 150 km of earth's atmosphere. The line spectra of atomic oxygen and nitrogen and their densities at 22 km above the earth's surface are examined along with experiment MA-059, which measured ultraviolet absorption at that altitude.
Steganalysis using logistic regression
NASA Astrophysics Data System (ADS)
Lubenko, Ivans; Ker, Andrew D.
2011-02-01
We advocate Logistic Regression (LR) as an alternative to the Support Vector Machine (SVM) classifiers commonly used in steganalysis. LR offers more information than traditional SVM methods - it estimates class probabilities as well as providing a simple classification - and can be adapted more easily and efficiently for multiclass problems. Like SVM, LR can be kernelised for nonlinear classification, and it shows comparable classification accuracy to SVM methods. This work is a case study, comparing accuracy and speed of SVM and LR classifiers in detection of LSB Matching and other related spatial-domain image steganography, through the state-of-the-art 686-dimensional SPAM feature set, in three image sets.
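A minimal sketch of the LR-vs-SVM contrast, assuming scikit-learn: logistic regression yields class probabilities where the SVM yields labels only. Random data stand in for the 686-dimensional SPAM features.

```python
# Sketch: logistic regression vs SVM on a high-dimensional feature set.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 686))
y = rng.integers(0, 2, size=400)      # cover vs stego

lr = LogisticRegression(max_iter=1000).fit(X, y)
svm = SVC().fit(X, y)

print("LR class probabilities:", lr.predict_proba(X[:2]))  # extra information
print("SVM labels only:      ", svm.predict(X[:2]))
```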
Roles and Responsibilities in Feature Teams
NASA Astrophysics Data System (ADS)
Eckstein, Jutta
Agile development requires self-organizing teams. The set-up of a (feature) team has to enable self-organization. Special care has to be taken if the project is not only distributed but also large, with more than one feature team involved. In such a setting, every feature team needs a product owner who ensures a continuous focus on business delivery. The product owners collaborate by working together in a virtual team. Each feature team is supported by a coach who ensures the agile process not only within the individual feature team but also across all feature teams. An architect (or, if necessary, a team of architects) takes care that the system is technically sound. In contrast to small co-located projects, large global projects require a project manager who deals with, among other things, internal and especially external politics.
Feature Extraction for Pose Estimation. A Comparison Between Synthetic and Real IR Imagery
1991-12-01
[List-of-figures residue; recoverable captions: determining the orientation of the sensor relative to the target; effects of changing sensor and target parameters (reference object: a T-62 tank facing the viewer, with sensor/target parameters set equal to zero; changing the target parameters produces anomalous results; for these images the field of view (FOV) was not changed); image anomalies from changing the target.]
River meanders and channel size
Williams, G.P.
1986-01-01
This study uses an enlarged data set to (1) compare measured meander geometry to that predicted by the Langbein and Leopold (1966) theory, (2) examine the frequency distribution of the ratio radius of curvature/channel width, and (3) derive 40 empirical equations (31 of which are original) involving meander and channel size features. The data set, part of which comes from publications by other authors, consists of 194 sites from a large variety of physiographic environments in various countries. The Langbein-Leopold sine-generated-curve theory for predicting radius of curvature agrees very well with the field data (78 sites). The ratio radius of curvature/channel width has a modal value in the range of 2 to 3, in accordance with earlier work; about one third of the 79 values is less than 2.0. The 40 empirical relations, most of which include only two variables, involve channel cross-section dimensions (bankfull area, width, and mean depth) and meander features (wavelength, bend length, radius of curvature, and belt width). These relations have very high correlation coefficients, most being in the range of 0.95-0.99. Although channel width traditionally has served as a scale indicator, bankfull cross-sectional area and mean depth also can be used for this purpose. © 1986.
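Relations of this kind are typically derived by least-squares fits in log-log space; a minimal sketch with synthetic data follows (the coefficients below are illustrative, not Williams' published equations).

```python
# Sketch: fitting an empirical power-law meander relation in log-log space.
import numpy as np

rng = np.random.default_rng(0)
width = rng.uniform(5, 200, size=194)                          # channel width (m)
wavelength = 11.0 * width**1.05 * rng.lognormal(0, 0.1, 194)   # synthetic data

slope, intercept = np.polyfit(np.log10(width), np.log10(wavelength), 1)
print(f"wavelength ~ {10**intercept:.2f} * width^{slope:.2f}")
```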
Extending GIS Technology to Study Karst Features of Southeastern Minnesota
NASA Astrophysics Data System (ADS)
Gao, Y.; Tipping, R. G.; Alexander, E. C.; Alexander, S. C.
2001-12-01
This paper summarizes ongoing research on karst feature distribution of southeastern Minnesota. The main goals of this interdisciplinary research are: 1) to look for large-scale patterns in the rate and distribution of sinkhole development; 2) to conduct statistical tests of hypotheses about the formation of sinkholes; 3) to create management tools for land-use managers and planners; and 4) to deliver geomorphic and hydrogeologic criteria for making scientifically valid land-use policies and ethical decisions in karst areas of southeastern Minnesota. Existing county and sub-county karst feature datasets of southeastern Minnesota have been assembled into a large GIS-based database capable of analyzing the entire data set. The central database management system (DBMS) is a relational GIS-based system interacting with three modules: GIS, statistical and hydrogeologic modules. ArcInfo and ArcView were used to generate a series of 2D and 3D maps depicting karst feature distributions in southeastern Minnesota. IRIS Explorer™ was used to produce high-quality 3D maps and animations using data exported from the GIS-based database. Nearest-neighbor analysis has been used to test sinkhole distributions in different topographic and geologic settings. All nearest-neighbor analyses to date indicate that sinkholes in southeastern Minnesota are not evenly distributed in this area (i.e., they tend to be clustered). More detailed statistical methods such as cluster analysis, histograms, probability estimation, correlation and regression have been used to study the spatial distributions of some mapped karst features of southeastern Minnesota. A sinkhole probability map for Goodhue County has been constructed based on sinkhole distribution, bedrock geology, depth to bedrock, GIS buffer analysis and nearest-neighbor analysis. A series of karst features for Winona County including sinkholes, springs, seeps, stream sinks and outcrop has been mapped and entered into the Karst Feature Database of Southeastern Minnesota. The Karst Feature Database of Winona County is being expanded to include all the mapped karst features of southeastern Minnesota. Air photos from the 1930s to the 1990s of the Spring Valley Cavern area in Fillmore County were scanned and geo-referenced into our GIS system. This technology has proved very useful for identifying sinkholes and studying the rate of sinkhole development.
Pham, Thuy T; Moore, Steven T; Lewis, Simon John Geoffrey; Nguyen, Diep N; Dutkiewicz, Eryk; Fuglevand, Andrew J; McEwan, Alistair L; Leong, Philip H W
2017-11-01
Freezing of gait (FoG) is common in Parkinsonian gait and strongly relates to falls. Current clinical FoG assessments are patients' self-report diaries and experts' manual video analysis. Both are subjective and yield moderate reliability. Existing detection algorithms have been predominantly designed in subject-dependent settings. In this paper, we aim to develop an automated, subject-independent FoG detector. After extracting highly relevant features, we apply anomaly detection techniques to detect FoG events. Specifically, feature selection is performed using correlation and clusterability metrics. From a list of 244 feature candidates, 36 candidates were selected using saliency and robustness criteria. We develop an anomaly score detector with adaptive thresholding to identify FoG events. Then, using accuracy metrics, we reduce the feature list to seven candidates. Our novel multichannel freezing index was the most selective across all window sizes, achieving sensitivity (specificity) of (). On the other hand, the freezing index from the vertical axis was the best choice for a single input, achieving sensitivity (specificity) of () for the ankle and () for the back sensors. Our subject-independent method is not only significantly more accurate than those previously reported, but also uses a much smaller window (e.g., versus ) and/or lower tolerance (e.g., versus ).
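A minimal sketch of a spectral freezing-index detector of the kind described. The band edges (~0.5-3 Hz locomotor, ~3-8 Hz freeze), sample rate, and threshold follow common practice in the FoG literature and are assumptions, not the paper's reported settings.

```python
# Sketch: freezing index = power in the "freeze" band over the locomotor band,
# flagged as a FoG event when it exceeds a threshold.
import numpy as np
from scipy.signal import welch

fs = 100.0                                   # sample rate (Hz), assumed
rng = np.random.default_rng(0)
accel = rng.normal(size=int(10 * fs))        # stand-in vertical-axis signal

f, pxx = welch(accel, fs=fs, nperseg=256)
freeze_power = pxx[(f >= 3) & (f < 8)].sum()
loco_power = pxx[(f >= 0.5) & (f < 3)].sum()
freezing_index = freeze_power / loco_power
print("FoG event" if freezing_index > 2.0 else "normal gait", freezing_index)
```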
Single-trial effective brain connectivity patterns enhance discriminability of mental imagery tasks
NASA Astrophysics Data System (ADS)
Rathee, Dheeraj; Cecotti, Hubert; Prasad, Girijesh
2017-10-01
Objective. The majority of the current approaches to connectivity-based brain-computer interface (BCI) systems focus on distinguishing between different motor imagery (MI) tasks. Brain regions associated with MI are anatomically close to each other, hence these BCI systems suffer from low performance. Our objective is to introduce a single-trial connectivity-feature-based BCI system for cognition imagery (CI) tasks, wherein the associated brain regions are located relatively far apart compared to those for MI. Approach. We implemented time-domain partial Granger causality (PGC) for the estimation of the connectivity features in a BCI setting. The proposed hypothesis has been verified with two publicly available datasets involving MI and CI tasks. Main results. The results support the conclusion that connectivity-based features can provide a better performance than a classical signal processing framework based on bandpass features coupled with spatial filtering for CI tasks, including word generation, subtraction, and spatial navigation. These results show for the first time that connectivity features can provide a reliable performance for an imagery-based BCI system. Significance. We show that single-trial connectivity features for mixed imagery tasks (i.e. combination of CI and MI) can outperform the features obtained by the current state-of-the-art method and hence can be successfully applied for BCI applications.
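A minimal sketch of extracting a directed-connectivity feature per trial; plain bivariate Granger causality via statsmodels stands in here for the partial Granger causality used in the paper.

```python
# Sketch: a Granger-causality F statistic as a single-trial connectivity feature.
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y_sig = np.roll(x, 2) + 0.5 * rng.normal(size=500)   # y driven by lagged x

# Tests whether the series in column 1 Granger-causes the series in column 0.
res = grangercausalitytests(np.column_stack([y_sig, x]), maxlag=4, verbose=False)
f_stat = res[2][0]["ssr_ftest"][0]           # F statistic at lag 2 as feature
print(f"connectivity feature (F at lag 2): {f_stat:.2f}")
```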
NASA Astrophysics Data System (ADS)
Näsi, R.; Viljanen, N.; Oliveira, R.; Kaivosoja, J.; Niemeläinen, O.; Hakala, T.; Markelin, L.; Nezami, S.; Suomalainen, J.; Honkavaara, E.
2018-04-01
Light-weight 2D format hyperspectral imagers operable from unmanned aerial vehicles (UAV) have become common in various remote sensing tasks in recent years. Using these technologies, the area of interest is covered by multiple overlapping hypercubes, in other words multiview hyperspectral photogrammetric imagery, and each object point appears in many, even tens of, individual hypercubes. The common practice is to calculate hyperspectral orthomosaics utilizing only the most nadir areas of the images. However, the redundancy of the data gives potential for much more versatile and thorough feature extraction. We investigated various options for extracting spectral features in the grass sward quantity evaluation task. In addition to the various sets of spectral features, we used photogrammetry-based ultra-high density point clouds to extract features describing the canopy 3D structure. A machine learning technique based on the Random Forest algorithm was used to estimate the fresh biomass. Results showed high accuracies for all investigated feature sets. The estimation results using multiview data provided approximately 10 % better results than the most nadir orthophotos. The utilization of the photogrammetric 3D features improved estimation accuracy by approximately 40 % compared to approaches where only spectral features were applied. The best estimation RMSE of 239 kg/ha (6.0 %) was obtained with the multiview anisotropy-corrected data set and the 3D features.
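A minimal sketch, assuming scikit-learn, of Random Forest biomass estimation from concatenated spectral and photogrammetric 3D features; the feature counts and biomass values are synthetic stand-ins for the sward data.

```python
# Sketch: Random Forest regression on combined spectral + 3D structural features.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
spectral = rng.random((150, 36))             # band reflectances per plot
structure = rng.random((150, 4))             # e.g., canopy height percentiles
X = np.hstack([spectral, structure])
biomass = rng.uniform(1000, 8000, 150)       # fresh biomass (kg/ha)

rf = RandomForestRegressor(n_estimators=200, random_state=0)
scores = cross_val_score(rf, X, biomass, cv=5,
                         scoring="neg_root_mean_squared_error")
print(f"CV RMSE: {-scores.mean():.0f} kg/ha")
```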
Jiang, Shaowei; Liao, Jun; Bian, Zichao; Guo, Kaikai; Zhang, Yongbing; Zheng, Guoan
2018-04-01
A whole slide imaging (WSI) system has recently been approved for primary diagnostic use in the US. The image quality and system throughput of WSI is largely determined by the autofocusing process. Traditional approaches acquire multiple images along the optical axis and maximize a figure of merit for autofocusing. Here we explore the use of deep convolutional neural networks (CNNs) to predict the focal position of the acquired image without axial scanning. We investigate the autofocusing performance with three illumination settings: incoherent Köhler illumination, partially coherent illumination with two plane waves, and one-plane-wave illumination. We acquire ~130,000 images with different defocus distances as the training data set. Different defocus distances lead to different spatial features of the captured images. However, solely relying on the spatial information leads to relatively poor performance of the autofocusing process. It is better to extract defocus features from transform domains of the acquired image. For incoherent illumination, the Fourier cutoff frequency is directly related to the defocus distance. Similarly, autocorrelation peaks are directly related to the defocus distance for two-plane-wave illumination. In our implementation, we use the spatial image, the Fourier spectrum, the autocorrelation of the spatial image, and combinations thereof as the inputs for the CNNs. We show that the information from the transform domains can improve the performance and robustness of the autofocusing process. The resulting focusing error is ~0.5 µm, which is within the 0.8-µm depth-of-field range. The reported approach requires little hardware modification for conventional WSI systems and the images can be captured on the fly without focus map surveying. It may find applications in WSI and time-lapse microscopy. The transform- and multi-domain approaches may also provide new insights for developing microscopy-related deep-learning networks. We have made our training and testing data set (~12 GB) open-source for the broad research community.
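A minimal sketch of building the multi-domain CNN inputs described above: the image, its Fourier magnitude spectrum, and its autocorrelation (inverse FFT of the power spectrum, by the Wiener-Khinchin theorem). A random image stands in for a captured WSI tile.

```python
# Sketch: spatial, Fourier, and autocorrelation channels for a defocus CNN.
import numpy as np

img = np.random.default_rng(0).random((256, 256))

spectrum = np.abs(np.fft.fftshift(np.fft.fft2(img)))
power = np.abs(np.fft.fft2(img)) ** 2
autocorr = np.real(np.fft.fftshift(np.fft.ifft2(power)))

# Stack as channels for a CNN that regresses the defocus distance.
cnn_input = np.stack([img, np.log1p(spectrum), autocorr], axis=0)
print(cnn_input.shape)   # (3, 256, 256)
```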
Spectral gene set enrichment (SGSE).
Frost, H Robert; Li, Zhigang; Moore, Jason H
2015-03-03
Gene set testing is typically performed in a supervised context to quantify the association between groups of genes and a clinical phenotype. In many cases, however, a gene set-based interpretation of genomic data is desired in the absence of a phenotype variable. Although methods exist for unsupervised gene set testing, they predominantly compute enrichment relative to clusters of the genomic variables, with performance strongly dependent on the clustering algorithm and number of clusters. We propose a novel method, spectral gene set enrichment (SGSE), for unsupervised competitive testing of the association between gene sets and empirical data sources. SGSE first computes the statistical association between gene sets and principal components (PCs) using our principal component gene set enrichment (PCGSE) method. The overall statistical association between each gene set and the spectral structure of the data is then computed by combining the PC-level p-values using the weighted Z-method with weights set to the PC variance scaled by Tracy-Widom test p-values. Using simulated data, we show that the SGSE algorithm can accurately recover spectral features from noisy data. To illustrate the utility of our method on real data, we demonstrate the superior performance of the SGSE method relative to standard cluster-based techniques for testing the association between MSigDB gene sets and the variance structure of microarray gene expression data. Unsupervised gene set testing can provide important information about the biological signal held in high-dimensional genomic data sets. Because it uses the association between gene sets and sample PCs to generate a measure of unsupervised enrichment, the SGSE method is independent of cluster or network creation algorithms and, most importantly, is able to utilize the statistical significance of PC eigenvalues to ignore elements of the data most likely to represent noise.
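A minimal sketch of the weighted Z-method step: per-PC p-values are combined into one Z score with PC-variance-based weights. The specific scaling of the weights by Tracy-Widom p-values below is an assumption for illustration; the paper's exact weighting may differ.

```python
# Sketch: weighted Stouffer (Z-method) combination of per-PC p-values.
import numpy as np
from scipy.stats import norm

pc_pvals = np.array([0.001, 0.04, 0.30, 0.70])   # gene set vs each PC
pc_var = np.array([0.45, 0.25, 0.05, 0.02])      # variance explained per PC
tw_pvals = np.array([1e-6, 1e-3, 0.2, 0.6])      # PC significance (assumed)

w = pc_var * (1 - tw_pvals)                      # down-weight noise PCs (assumed form)
z = norm.isf(pc_pvals)                           # per-PC Z scores
z_comb = np.dot(w, z) / np.sqrt(np.sum(w**2))    # weighted Stouffer Z
print("combined p-value:", norm.sf(z_comb))
```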
DOE Office of Scientific and Technical Information (OSTI.GOV)
Velazquez, E Rios; Narayan, V; Grossmann, P
2015-06-15
Purpose: To compare the complementary prognostic value of automated Radiomic features to that of radiologist-annotated VASARI features in the TCGA-GBM MRI dataset. Methods: For 96 GBM patients, pre-operative MRI images were obtained from The Cancer Imaging Archive. The abnormal tumor bulks were manually defined on post-contrast T1w images. The contrast-enhancing and necrotic regions were segmented using FAST. From these sub-volumes and the total abnormal tumor bulk, a set of Radiomic features quantifying phenotypic differences based on the tumor intensity, shape and texture, were extracted from the post-contrast T1w images. Minimum-redundancy-maximum-relevance (MRMR) was used to identify the most informative Radiomic, VASARI and combined Radiomic-VASARI features in 70% of the dataset (training-set). Multivariate Cox-proportional hazards models were evaluated in 30% of the dataset (validation-set) using the C-index for OS. A bootstrap procedure was used to assess significance while comparing the C-Indices of the different models. Results: Overall, the Radiomic features showed a moderate correlation with the radiologist-annotated VASARI features (r = −0.37 – 0.49); however, the correlation was stronger for the Tumor Diameter and Proportion of Necrosis VASARI features (r = −0.71 – 0.69). After MRMR feature selection, the best-performing Radiomic, VASARI, and Radiomic-VASARI Cox-PH models showed a validation C-index of 0.56 (p = NS), 0.58 (p = NS) and 0.65 (p = 0.01), respectively. The combined Radiomic-VASARI model C-index was significantly higher than that obtained from either the Radiomic or VASARI model alone (p < 0.001). Conclusion: Quantitative volumetric and textural Radiomic features complement the qualitative and semi-quantitative annotated VASARI feature set. The prognostic value of informative qualitative VASARI features such as Eloquent Brain and Multifocality is increased with the addition of quantitative volumetric and textural features from the contrast-enhancing and necrotic tumor regions. These results should be further evaluated in larger validation cohorts.
Targeted Feature Detection for Data-Dependent Shotgun Proteomics.
Weisser, Hendrik; Choudhary, Jyoti S
2017-08-04
Label-free quantification of shotgun LC-MS/MS data is the prevailing approach in quantitative proteomics but remains computationally nontrivial. The central data analysis step is the detection of peptide-specific signal patterns, called features. Peptide quantification is facilitated by associating signal intensities in features with peptide sequences derived from MS2 spectra; however, missing values due to imperfect feature detection are a common problem. A feature detection approach that directly targets identified peptides (minimizing missing values) but also offers robustness against false-positive features (by assigning meaningful confidence scores) would thus be highly desirable. We developed a new feature detection algorithm within the OpenMS software framework, leveraging ideas and algorithms from the OpenSWATH toolset for DIA/SRM data analysis. Our software, FeatureFinderIdentification ("FFId"), implements a targeted approach to feature detection based on information from identified peptides. This information is encoded in an MS1 assay library, based on which ion chromatogram extraction and detection of feature candidates are carried out. Significantly, when analyzing data from experiments comprising multiple samples, our approach distinguishes between "internal" and "external" (inferred) peptide identifications (IDs) for each sample. On the basis of internal IDs, two sets of positive (true) and negative (decoy) feature candidates are defined. A support vector machine (SVM) classifier is then trained to discriminate between the sets and is subsequently applied to the "uncertain" feature candidates from external IDs, facilitating selection and confidence scoring of the best feature candidate for each peptide. This approach also enables our algorithm to estimate the false discovery rate (FDR) of the feature selection step. We validated FFId based on a public benchmark data set, comprising a yeast cell lysate spiked with protein standards that provide a known ground-truth. The algorithm reached almost complete (>99%) quantification coverage for the full set of peptides identified at 1% FDR (PSM level). Compared with other software solutions for label-free quantification, this is an outstanding result, which was achieved at competitive quantification accuracy and reproducibility across replicates. The FDR for the feature selection was estimated at a low 1.5% on average per sample (3% for features inferred from external peptide IDs). The FFId software is open-source and freely available as part of OpenMS (www.openms.org).
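The classification step described above follows a standard supervised pattern: train on internal-ID-derived true and decoy candidates, then score the uncertain candidates from external IDs. A minimal scikit-learn sketch follows; candidate descriptors and array shapes are invented for illustration, and this is not the OpenMS/FFId code.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
# Invented candidate descriptors (e.g. peak shape, m/z deviation,
# RT deviation); rows are feature candidates.
X_pos = rng.random((200, 5))   # candidates matching internal IDs (true)
X_neg = rng.random((200, 5))   # decoy candidates (false)
X_ext = rng.random((50, 5))    # uncertain candidates from external IDs

X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_neg))])

clf = make_pipeline(StandardScaler(), SVC(probability=True))
clf.fit(X, y)

# Probability-like confidence scores allow picking the best-scoring
# candidate per peptide and estimating an FDR from the decoy scores.
scores = clf.predict_proba(X_ext)[:, 1]
```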
Task representation in individual and joint settings
Prinz, Wolfgang
2015-01-01
This paper outlines a framework for task representation and discusses applications to interference tasks in individual and joint settings. The framework is derived from the Theory of Event Coding (TEC). This theory regards task sets as transient assemblies of event codes in which stimulus and response codes interact and shape each other in particular ways. On the one hand, stimulus and response codes compete with each other within their respective subsets (horizontal interactions). On the other hand, stimulus and response codes cooperate with each other (vertical interactions). Code interactions instantiating competition and cooperation apply to two time scales: on-line performance (i.e., doing the task) and off-line implementation (i.e., setting the task). Interference arises when stimulus and response codes overlap in features that are irrelevant for stimulus identification, but relevant for response selection. To resolve this dilemma, the feature profiles of event codes may become restructured in various ways. The framework is applied to three kinds of interference paradigms. Special emphasis is given to joint settings where tasks are shared between two participants. Major conclusions derived from these applications include: (1) Response competition is the chief driver of interference, and different modes of response competition give rise to different patterns of interference; (2) The type of features in which stimulus and response codes overlap is also a crucial factor, with different types of such features likewise giving rise to different patterns of interference; and (3) Task sets for joint settings conflate intraindividual conflicts between responses (what) with interindividual conflicts between responding agents (whom). Features of response codes may, therefore, not only address responses, but also responding agents (both physically and socially). PMID:26029085
Sub-millisecond closed-loop feedback stimulation between arbitrary sets of individual neurons
Müller, Jan; Bakkum, Douglas J.; Hierlemann, Andreas
2012-01-01
We present a system to artificially correlate the spike timing between sets of arbitrary neurons that were interfaced to a complementary metal–oxide–semiconductor (CMOS) high-density microelectrode array (MEA). The system features a novel reprogrammable and flexible event engine unit to detect arbitrary spatio-temporal patterns of recorded action potentials and is capable of delivering sub-millisecond closed-loop feedback of electrical stimulation upon trigger events in real-time. The relative timing between action potentials of individual neurons as well as the temporal pattern among multiple neurons, or neuronal assemblies, is considered an important factor governing memory and learning in the brain. Artificially changing timings between arbitrary sets of spiking neurons with our system could provide a “knob” to tune information processing in the network. PMID:23335887
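The closed-loop idea can be illustrated in software: watch a spike stream for a spatio-temporal pattern (here, neuron A followed by neuron B within a short window) and emit a stimulation event when it occurs. This toy sketch is entirely illustrative; the actual system is a reprogrammable hardware event engine on a CMOS MEA, and software latencies of this kind would be far above its sub-millisecond figures.

```python
from collections import deque

def event_engine(spike_stream, pattern=("A", "B"), window_ms=5.0):
    """Toy software analogue of the hardware event engine: trigger a
    stimulation event when spikes from the named neurons occur in order
    within `window_ms`. Pattern and timings are invented examples."""
    recent = deque()
    for t_ms, neuron in spike_stream:            # (timestamp, unit) events
        if neuron == pattern[0]:
            recent.append(t_ms)
        elif neuron == pattern[1]:
            while recent and t_ms - recent[0] > window_ms:
                recent.popleft()                 # drop stale A-spikes
            if recent:
                yield ("stimulate", t_ms)        # trigger feedback pulse
                recent.clear()

stream = [(0.1, "A"), (2.0, "C"), (3.5, "B"), (20.0, "B")]
print(list(event_engine(stream)))  # [('stimulate', 3.5)]
```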
Walshe, Catherine
2011-12-01
Complex, incrementally changing, context-dependent and variable palliative care services are difficult to evaluate. Case study research strategies may have potential to contribute to evaluating such complex interventions, and to develop this field of evaluation research. This paper explores definitions of case study (as a unit of study, a process, and a product) and examines the features of case study research strategies which are thought to confer benefits for the evaluation of complex interventions in palliative care settings. Ten features of case study that are thought to be beneficial in evaluating complex interventions in palliative care are discussed, drawing from exemplars of research in this field. Important features are related to a longitudinal approach, triangulation, purposive instance selection, a comprehensive approach, multiple data sources, flexibility, concurrent data collection and analysis, the search for proving-disproving evidence, pattern-matching techniques and an engaging narrative. The limitations of case study approaches are discussed, including the potential for subjectivity and their complex, time-consuming and potentially expensive nature. Case study research strategies have great potential in evaluating complex interventions in palliative care settings. Three key features need to be exploited to develop this field: case selection, longitudinal designs, and the use of rival hypotheses. In particular, case study should be used in situations where there is interplay and interdependency between the intervention and its context, such that it is difficult to define or find relevant comparisons.
NASA Astrophysics Data System (ADS)
Chen, C.; Gong, W.; Hu, Y.; Chen, Y.; Ding, Y.
2017-05-01
The automated detection of buildings in aerial images is a fundamental problem in aerial and satellite image analysis. Recently, thanks to advances in feature description, the region-based CNN model (R-CNN) for object detection has been receiving increasing attention. Despite its excellent performance in object detection, it is problematic to directly leverage the features of the R-CNN model for building detection in a single aerial image. A single aerial image is a vertical view, and buildings possess a significant directional feature; however, in the R-CNN model the direction of the building is ignored and detection results are represented by horizontal rectangles. Such horizontal rectangles cannot describe buildings precisely. To address this problem, we propose a novel model with a key orientation-related feature, namely, Oriented R-CNN (OR-CNN). Our contributions are mainly in the following two aspects: 1) introducing a new oriented layer network for detecting the rotation angle of a building, built on the successful VGG-net R-CNN model; 2) proposing the oriented rectangle to leverage the powerful R-CNN for remote-sensing building detection. In experiments, we establish a complete and brand-new data set for training our Oriented R-CNN model and comprehensively evaluate the proposed method on a publicly available building detection data set. We demonstrate state-of-the-art results compared with previous baseline methods.
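The difference between axis-aligned and oriented detections comes down to adding a rotation angle to the box parameterization. A minimal numpy sketch (parameter names are my own; this is the geometry only, not the OR-CNN network):

```python
import numpy as np

def oriented_box_corners(cx, cy, w, h, theta_deg):
    """Return the 4 corners of an oriented rectangle (cx, cy, w, h, theta).
    An axis-aligned R-CNN box is the special case theta = 0."""
    t = np.deg2rad(theta_deg)
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])              # 2-D rotation matrix
    half = np.array([[-w, -h], [w, -h], [w, h], [-w, h]]) / 2.0
    return half @ R.T + np.array([cx, cy])               # rotate, then shift

# A 20x10 building footprint rotated 30 degrees about its centre
print(oriented_box_corners(100.0, 50.0, 20.0, 10.0, 30.0))
```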
Inefficient conjunction search made efficient by concurrent spoken delivery of target identity.
Reali, Florencia; Spivey, Michael J; Tyler, Melinda J; Terranova, Joseph
2006-08-01
Visual search based on a conjunction of two features typically elicits reaction times that increase linearly as a function of the number of distractors, whereas search based on a single feature is essentially unaffected by set size. These and related findings have often been interpreted as evidence of a serial search stage that follows a parallel search stage. However, a wide range of studies has shown a blending of these two processes. For example, when a spoken instruction identifies the conjunction target concurrently with the visual display, the effect of set size is significantly reduced, suggesting that incremental linguistic processing of the first feature adjective and then the second feature adjective may facilitate something approximating a parallel extraction of objects during search for the target. Here, we extend these results to a variety of experimental designs. First, we replicate the result with a mixed-trials design (ruling out potential strategies associated with the blocked design of the original study). Second, in a mixed-trials experiment, the order of adjective types in the spoken query varies randomly across conditions. In a third experiment, we extend the effect to a triple-conjunction search task. A fourth (control) experiment demonstrates that these effects are not due to an efficient odd-one-out search that ignores the linguistic input. This series of experiments, along with attractor-network simulations of the phenomena, provides further evidence toward understanding linguistically mediated influences in real-time visual search processing.
Electronic Gaming Machine (EGM) Environments: Market Segments and Risk.
Rockloff, Matthew; Moskovsky, Neda; Thorne, Hannah; Browne, Matthew; Bryden, Gabrielle
2017-12-01
This study used a marketing-research paradigm to explore gamblers' attraction to EGMs based on different elements of the environment. A select set of environmental features was sourced from a prior study (Thorne et al. in J Gambl Issues 2016b), and a discrete choice experiment was conducted through an online survey. Using the same dataset first described by Rockloff et al. (EGM Environments that contribute to excess consumption and harm, 2015), a sample of 245 EGM gamblers were sourced from clubs in Victoria, Australia, and 7516 gamblers from an Australian national online survey-panel. Participants' choices amongst sets of hypothetical gambling environments allowed for an estimation of the implied individual-level utilities for each feature (e.g., general sounds, location, etc.). K-means clustering on these utilities identified four unique market segments for EGM gambling, representing four different types of consumers. The segments were named according to their dominant features: Social, Value, High Roller and Internet. We found that the environments orientated towards the Social and Value segments were most conducive to attracting players with relatively few gambling problems, while the High Roller and Internet-focused environments had greater appeal for players with problems and vulnerabilities. This study has generated new insights into the kinds of gambling environments that are most consistent with safe play.
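The segmentation step is a standard clustering of per-respondent utilities. A minimal scikit-learn sketch, assuming a respondents-by-features matrix of estimated utilities (the matrix and feature labels here are invented placeholders, not the study's data):

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Hypothetical individual-level utilities from the discrete choice
# experiment: rows = gamblers, columns = environmental features
# (e.g. general sounds, location, venue type).
utilities = rng.normal(size=(7761, 6))

km = KMeans(n_clusters=4, n_init=10, random_state=0).fit(utilities)
segments = km.labels_            # market segment per respondent
profiles = km.cluster_centers_   # mean utility profile per segment
```

Inspecting `profiles` is what lets each cluster be named by its dominant features, as with the Social, Value, High Roller and Internet segments above.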
NASA Astrophysics Data System (ADS)
Jebri, Fatma; Birol, Florence; Zakardjian, Bruno; Bouffard, Jérome; Sammari, Cherif
2016-07-01
This work is the first study exploiting along-track altimetry data to observe and monitor coastal ocean features over the transition area between the western and eastern Mediterranean Basins. The relative performances of both the AVISO and the X-TRACK research regional altimetric data sets are compared using in situ observations. Both products are cross validated with tide gauge records. The altimeter-derived geostrophic velocities are also compared with observations from a moored Acoustic Doppler Current Profiler. Results indicate the good potential of satellite altimetry to retrieve dynamic features over the area. However, X-TRACK shows a more homogeneous data coverage than AVISO, with longer time series in the 50 km coastal band. The seasonal evolution of the surface circulation is therefore analyzed by conjointly using X-TRACK data and remotely sensed sea surface temperature observations. This combined data set clearly depicts different current regimes and bifurcations, which allows us to propose a new seasonal circulation scheme for the central Mediterranean. The analysis shows variations of the path and temporal behavior of the main circulation features: the Atlantic Tunisian Current, the Atlantic Ionian Stream, the Atlantic Libyan Current, and the Sidra Gyre. The resulting bifurcating veins of these currents are also discussed, and a new current branch is observed for the first time.
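Altimeter-derived geostrophic velocities follow from geostrophic balance: the along-track sea level slope gives the velocity component perpendicular to the track, v = (g/f) dη/ds. A minimal numpy sketch with invented variable names (sign conventions and filtering, which matter in practice, are omitted):

```python
import numpy as np

def cross_track_geostrophic_velocity(sla_m, dist_m, lat_deg):
    """Cross-track geostrophic velocity (m/s) from along-track sea level
    anomalies: v = (g / f) * d(SLA)/ds, with f the Coriolis parameter."""
    g = 9.81                                  # gravity, m s^-2
    omega = 7.2921e-5                         # Earth rotation rate, rad s^-1
    f = 2.0 * omega * np.sin(np.deg2rad(lat_deg))
    return (g / f) * np.gradient(sla_m, dist_m)

# 10 points at ~7 km along-track spacing, synthetic SLA signal
dist = np.arange(10) * 7e3
sla = 0.05 * np.sin(dist / 5e4)
v = cross_track_geostrophic_velocity(sla, dist, lat_deg=36.0)
```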
A model for AGN variability on multiple time-scales
NASA Astrophysics Data System (ADS)
Sartori, Lia F.; Schawinski, Kevin; Trakhtenbrot, Benny; Caplar, Neven; Treister, Ezequiel; Koss, Michael J.; Urry, C. Megan; Zhang, C. E.
2018-05-01
We present a framework to link and describe active galactic nuclei (AGN) variability on a wide range of time-scales, from days to billions of years. In particular, we concentrate on the AGN variability features related to changes in black hole fuelling and accretion rate. In our framework, the variability features observed in different AGN at different time-scales may be explained as realisations of the same underlying statistical properties. In this context, we propose a model to simulate the evolution of AGN light curves with time based on the probability density function (PDF) and power spectral density (PSD) of the Eddington ratio (L/LEdd) distribution. Motivated by general galaxy population properties, we propose that the PDF may be inspired by the L/LEdd distribution function (ERDF), and that a single (or limited number of) ERDF+PSD set may explain all observed variability features. After outlining the framework and the model, we compile a set of variability measurements in terms of structure function (SF) and magnitude difference. We then combine the variability measurements on a SF plot ranging from days to Gyr. The proposed framework enables constraints on the underlying PSD and the ability to link AGN variability on different time-scales, therefore providing new insights into AGN variability and black hole growth phenomena.
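One way to realize such a PDF+PSD model numerically is to draw a light curve with a prescribed PSD (the Timmer & König recipe) and then rank-match its values to the target Eddington-ratio PDF, a simplified one-step version of Emmanoulopoulos-style simulation. The sketch below is illustrative (power-law PSD and log-normal PDF are my own example choices, not the paper's fitted ERDF+PSD set):

```python
import numpy as np

rng = np.random.default_rng(1)
n, dt = 4096, 1.0                      # samples, time step (arbitrary units)

# 1) Timmer & Koenig: Gaussian Fourier amplitudes shaped by the PSD
freqs = np.fft.rfftfreq(n, dt)[1:]     # skip the f = 0 bin
psd = freqs ** -2.0                    # example random-walk-like PSD
re = rng.normal(size=freqs.size) * np.sqrt(psd / 2)
im = rng.normal(size=freqs.size) * np.sqrt(psd / 2)
spectrum = np.concatenate([[0.0], re + 1j * im])
lc_gauss = np.fft.irfft(spectrum, n)   # Gaussian light curve with target PSD

# 2) Rank-match to a target Eddington-ratio PDF (here: log-normal),
#    keeping the temporal ordering (approximate PSD) of step 1.
target = rng.lognormal(mean=-2.0, sigma=1.0, size=n)
ranks = np.argsort(np.argsort(lc_gauss))
lc = np.sort(target)[ranks]            # simulated L/L_Edd light curve
```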
Internal Gravity Waves: Generation and Breaking Mechanisms by Laboratory Experiments
NASA Astrophysics Data System (ADS)
la Forgia, Giovanni; Adduce, Claudia; Falcini, Federico
2016-04-01
Internal gravity waves (IGWs), occurring within estuaries and coastal oceans, are manifest as large-amplitude undulations of the pycnocline. IGWs propagating horizontally in a two-layer stratified fluid are studied, and the breaking of an IGW of depression shoaling upon a uniformly sloping boundary is investigated experimentally. Breaking dynamics beneath the shoaling waves cause both mixing and wave-induced near-bottom vortices that suspend and redistribute the bed material. Laboratory experiments are conducted in a Perspex tank using the standard lock-release method, following the technique described in Sutherland et al. (2013). Each experiment is analysed and the instantaneous pycnocline position is measured, in order to obtain both geometric and kinematic features of the IGW: amplitude, wavelength and celerity. The main IGW features depend on the geometrical parameters that define the initial experimental setting: the density difference between the layers, the total depth, the layer depth ratio, the aspect ratio, and the displacement between the pycnoclines. Relations between the geometric and kinematic IGW features and the initial setting parameters are analysed. The approach of the IGWs toward a uniform slope is investigated in the present experiments. Depending on wave and slope characteristics, different breaking and mixing processes are observed. Sediments are sprinkled on the slope to visualize boundary layer separation, in order to analyse the suspension and redistribution mechanisms due to wave breaking.
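For reference, the linear long-wave phase speed in a two-layer fluid, commonly used to normalize measured celerities in such lock-release experiments, is c0 = sqrt(g' h1 h2 / (h1 + h2)) with reduced gravity g' = g Δρ/ρ. A short sketch (my own illustration, not from the paper):

```python
import numpy as np

def two_layer_phase_speed(h1, h2, rho1, rho2, g=9.81):
    """Linear long-wave speed c0 = sqrt(g' h1 h2 / (h1 + h2)) for an
    interfacial wave between layers of depth h1 (upper) and h2 (lower)."""
    g_prime = g * (rho2 - rho1) / rho2   # reduced gravity
    return np.sqrt(g_prime * h1 * h2 / (h1 + h2))

# Typical laboratory scales: 10 cm over 20 cm, ~2% density difference
print(two_layer_phase_speed(0.10, 0.20, 1000.0, 1020.0))  # ~0.11 m/s
```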
Khaligh-Razavi, Seyed-Mahdi; Henriksson, Linda; Kay, Kendrick; Kriegeskorte, Nikolaus
2017-02-01
Studies of the primate visual system have begun to test a wide range of complex computational object-vision models. Realistic models have many parameters, which in practice cannot be fitted using the limited amounts of brain-activity data typically available. Task performance optimization (e.g. using backpropagation to train neural networks) provides major constraints for fitting parameters and discovering nonlinear representational features appropriate for the task (e.g. object classification). Model representations can be compared to brain representations in terms of the representational dissimilarities they predict for an image set. This method, called representational similarity analysis (RSA), enables us to test the representational feature space as is (fixed RSA) or to fit a linear transformation that mixes the nonlinear model features so as to best explain a cortical area's representational space (mixed RSA). Like voxel/population-receptive-field modelling, mixed RSA uses a training set (different stimuli) to fit one weight per model feature and response channel (voxels here), so as to best predict the response profile across images for each response channel. We analysed response patterns elicited by natural images, which were measured with functional magnetic resonance imaging (fMRI). We found that early visual areas were best accounted for by shallow models, such as a Gabor wavelet pyramid (GWP). The GWP model performed similarly with and without mixing, suggesting that the original features already approximated the representational space, obviating the need for mixing. However, a higher ventral-stream visual representation (lateral occipital region) was best explained by the higher layers of a deep convolutional network and mixing of its feature set was essential for this model to explain the representation. We suspect that mixing was essential because the convolutional network had been trained to discriminate a set of 1000 categories, whose frequencies in the training set did not match their frequencies in natural experience or their behavioural importance. The latter factors might determine the representational prominence of semantic dimensions in higher-level ventral-stream areas. Our results demonstrate the benefits of testing both the specific representational hypothesis expressed by a model's original feature space and the hypothesis space generated by linear transformations of that feature space.
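Fixed RSA reduces to computing representational dissimilarity matrices (RDMs) for model and brain and correlating their upper triangles; mixed RSA additionally fits per-feature weights on a training set. A minimal sketch of the fixed variant (arrays and shapes invented for illustration):

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
model_feats = rng.normal(size=(96, 500))   # images x model features
brain_resps = rng.normal(size=(96, 200))   # images x voxels (fMRI)

# RDM: pairwise correlation distance between image representations;
# pdist returns the condensed upper triangle directly.
rdm_model = pdist(model_feats, metric="correlation")
rdm_brain = pdist(brain_resps, metric="correlation")

# Fixed RSA: rank-correlate the two RDMs as-is (no feature reweighting)
rho, p = spearmanr(rdm_model, rdm_brain)
```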
Emotional recognition from the speech signal for a virtual education agent
NASA Astrophysics Data System (ADS)
Tickle, A.; Raghu, S.; Elshaw, M.
2013-06-01
This paper explores the extraction of features from the speech wave to perform intelligent emotion recognition. A feature extraction tool (openSMILE) was used to obtain a baseline set of 998 acoustic features from a set of emotional speech recordings made with a microphone. The initial features were reduced to the most important ones so that recognition of emotions using a supervised neural network could be performed. Given that the future use of virtual education agents lies in making the agents more interactive, developing agents with the capability to recognise and adapt to the emotional state of humans is an important step.
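A minimal sketch of the reduce-then-classify pipeline with scikit-learn, assuming a precomputed acoustic feature matrix (the 998-dimensional openSMILE output) and emotion labels; the shapes, class count, and selector are my own illustrative choices:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 998))      # openSMILE features per recording
y = rng.integers(0, 6, size=300)     # six emotion classes (invented)

clf = make_pipeline(
    StandardScaler(),
    SelectKBest(f_classif, k=50),    # keep the 50 most informative features
    MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0),
)
clf.fit(X, y)
```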
DOE Office of Scientific and Technical Information (OSTI.GOV)
Honorio, J.; Goldstein, R.; Honorio, J.
We propose a simple, well-grounded classification technique suited for group classification on brain fMRI data sets that have high dimensionality, a small number of subjects, high noise levels, high subject variability, imperfect registration, and subtle cognitive effects. We propose threshold-split region as a new feature selection method and majority vote as the classification technique. Our method does not require a predefined set of regions of interest. We use averages across sessions, only one feature per experimental condition, a feature independence assumption, and simple classifiers. The seemingly counter-intuitive approach of using a simple design is supported by signal processing and statistical theory. Experimental results on two block-design data sets that capture brain function under distinct monetary rewards for cocaine-addicted and control subjects show that our method exhibits increased generalization accuracy compared to commonly used feature selection and classification techniques.
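A toy sketch of the majority-vote idea: each selected feature casts a vote via a simple per-feature threshold rule, and the predicted subject label is the majority across features. Entirely illustrative; the threshold-split region selection itself is not reproduced here, and the thresholds/directions would come from training.

```python
import numpy as np

def majority_vote_predict(X, thresholds, directions):
    """Feature j votes 1 if directions[j] * (X[:, j] - thresholds[j]) > 0,
    else 0; the predicted class is the majority vote across features.
    (Illustrative stand-in; parameters would be learned from training data.)"""
    votes = (X - thresholds) * directions > 0
    return (votes.mean(axis=1) > 0.5).astype(int)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 7))    # subjects x selected features (invented)
thr = np.zeros(7)               # per-feature split points
sgn = np.ones(7)                # per-feature vote direction
print(majority_vote_predict(X, thr, sgn))
```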
NASA Astrophysics Data System (ADS)
Fang, Leyuan; Wang, Chong; Li, Shutao; Yan, Jun; Chen, Xiangdong; Rabbani, Hossein
2017-11-01
We present an automatic method, termed the principal component analysis network with composite kernel (PCANet-CK), for the classification of three-dimensional (3-D) retinal optical coherence tomography (OCT) images. Specifically, the proposed PCANet-CK method first utilizes the PCANet to automatically learn features from each B-scan of the 3-D retinal OCT images. Then, multiple kernels are separately applied to a set of very important features of the B-scans and fused together, which can jointly exploit the correlations among features of the 3-D OCT images. Finally, the fused (composite) kernel is incorporated into an extreme learning machine for OCT image classification. We tested the proposed algorithm on two real 3-D spectral domain OCT (SD-OCT) datasets (of normal subjects and subjects with macular edema and age-related macular degeneration), which demonstrated its effectiveness.
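The composite-kernel step itself is simple to illustrate: compute one kernel per feature group, combine them with convex weights, and feed the result to a kernel classifier. The sketch below substitutes an SVM with a precomputed kernel for the extreme learning machine, and the feature groups, weight, and shapes are invented for illustration:

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

rng = np.random.default_rng(0)
feats_a = rng.normal(size=(80, 64))   # one learned feature group per volume
feats_b = rng.normal(size=(80, 64))   # a second feature group
y = rng.integers(0, 2, size=80)       # volume-level labels (invented)

# Composite kernel: convex combination of per-group RBF kernels
mu = 0.6
K = mu * rbf_kernel(feats_a) + (1 - mu) * rbf_kernel(feats_b)

# Stand-in kernel classifier for the extreme learning machine
clf = SVC(kernel="precomputed").fit(K, y)
train_acc = clf.score(K, y)
```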