An ensemble of dissimilarity based classifiers for Mackerel gender determination
NASA Astrophysics Data System (ADS)
Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.
2014-03-01
Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it according to sex. Colour measurements were performed on gonads extracted from female and male mackerel (fresh and thawed) to obtain differences between the sexes. Several linear and non-linear classifiers, such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA), can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect the sample proximities accurately. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. Diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve on classifiers based on a single dissimilarity.
Robust Combining of Disparate Classifiers Through Order Statistics
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep
2001-01-01
Integrating the outputs of multiple classifiers via combiners or meta-learners has led to substantial improvements in several difficult pattern recognition problems. In this article we investigate a family of combiners based on order statistics, for robust handling of situations where there are large discrepancies in performance of individual classifiers. Based on a mathematical modeling of how the decision boundaries are affected by order statistic combiners, we derive expressions for the reductions in error expected when simple output combination methods based on the median, the maximum and, in general, the ith order statistic are used. Furthermore, we analyze the trim and spread combiners, both based on linear combinations of the ordered classifier outputs, and show that in the presence of uneven classifier performance, they often provide substantial gains over both linear and simple order statistics combiners. Experimental results on both real world data and standard public domain data sets corroborate these findings.
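The combiners named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy outputs, and the reading of the trim combiner as a trimmed mean of the ordered outputs are assumptions made for illustration only.

```python
from statistics import median

def order_statistic_combine(outputs, i):
    """Return the i-th order statistic (1-indexed, ascending) of one class's outputs."""
    return sorted(outputs)[i - 1]

def trimmed_mean_combine(outputs, trim):
    """Average the ordered outputs after dropping `trim` values from each end
    (an assumed reading of a 'trim' combiner)."""
    ordered = sorted(outputs)
    kept = ordered[trim:len(ordered) - trim]
    return sum(kept) / len(kept)

def combine_and_decide(per_classifier_outputs, combiner):
    """per_classifier_outputs[m][c] is classifier m's score for class c.
    Combine the scores class by class, then pick the class with the largest result."""
    n_classes = len(per_classifier_outputs[0])
    combined = [combiner([out[c] for out in per_classifier_outputs])
                for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: combined[c]), combined

# Hypothetical outputs of five classifiers on a two-class problem;
# the last classifier disagrees strongly with the other four.
outputs = [[0.8, 0.2], [0.7, 0.3], [0.75, 0.25], [0.9, 0.1], [0.05, 0.95]]

label_med, scores_med = combine_and_decide(outputs, median)
label_trim, _ = combine_and_decide(outputs, lambda v: trimmed_mean_combine(v, 1))
label_max, _ = combine_and_decide(outputs, lambda v: order_statistic_combine(v, len(v)))
```

On this toy input the median and the trimmed mean both follow the four agreeing classifiers, while the maximum is dominated by the single outlying output.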
Nanni, Loris; Lumini, Alessandra
2009-01-01
The aims of this work are to propose a novel method for building an ensemble of classifiers for peptide classification based on substitution matrices, and to show the importance of selecting a proper set of parameters for the classifiers that make up the ensemble. The HIV-1 protease cleavage site prediction problem is studied here. Results obtained with a blind testing protocol are reported; comparison with other state-of-the-art approaches based on ensembles of classifiers quantifies the performance improvement obtained by the systems proposed in this paper. The simulation based on experimentally determined protease cleavage data has demonstrated the success of these new ensemble algorithms. It is particularly interesting that, even though the HIV-1 protease cleavage site prediction problem is considered linearly separable, we obtain the best performance using an ensemble of non-linear classifiers.
Comparison of Classification Methods for P300 Brain-Computer Interface on Disabled Subjects
Manyakov, Nikolay V.; Chumerin, Nikolay; Combaz, Adrien; Van Hulle, Marc M.
2011-01-01
We report on tests with a mind typing paradigm based on a P300 brain-computer interface (BCI) on a group of amyotrophic lateral sclerosis (ALS), middle cerebral artery (MCA) stroke, and subarachnoid hemorrhage (SAH) patients, suffering from motor and speech disabilities. We investigate the achieved typing accuracy given the individual patient's disorder, and how it correlates with the type of classifier used. We considered 7 types of classifiers, linear as well as nonlinear ones, and found that, overall, one type of linear classifier yielded a higher classification accuracy. In addition to the selection of the classifier, we also suggest and discuss a number of recommendations to be considered when building a P300-based typing system for disabled subjects. PMID:21941530
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jing, Yaqi; Meng, Qinghao, E-mail: qh-meng@tju.edu.cn; Qi, Peifeng
An electronic nose (e-nose) was designed to classify Chinese liquors of the same aroma style. A new method of feature reduction which combined feature selection with feature extraction was proposed. The feature selection method used 8 feature-selection algorithms based on information theory and reduced the dimension of the feature space to 41. Kernel entropy component analysis was introduced into the e-nose system as a feature extraction method, and the dimension of the feature space was reduced to 12. Classification of Chinese liquors was performed using a back propagation artificial neural network (BP-ANN), linear discrimination analysis (LDA), and a multi-linear classifier. The classification rate of the multi-linear classifier was 97.22%, which was higher than that of LDA and BP-ANN. Finally, the classification of Chinese liquors according to their raw materials and geographical origins was performed using the proposed multi-linear classifier, and the classification rates were 98.75% and 100%, respectively.
Andries, Erik; Hagstrom, Thomas; Atlas, Susan R; Willman, Cheryl
2007-02-01
Linear discrimination, from the point of view of numerical linear algebra, can be treated as solving an ill-posed system of linear equations. In order to generate a solution that is robust in the presence of noise, these problems require regularization. Here, we examine the ill-posedness involved in the linear discrimination of cancer gene expression data with respect to outcome and tumor subclasses. We show that a filter factor representation, based upon Singular Value Decomposition, yields insight into the numerical ill-posedness of the hyperplane-based separation when applied to gene expression data. We also show that this representation yields useful diagnostic tools for guiding the selection of classifier parameters, thus leading to improved performance.
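A filter factor representation can be sketched in Python with NumPy. The abstract does not say which regularization the authors use, so the Tikhonov (ridge) filter factors below, the function name, and the random toy data are assumptions chosen only to show the idea of damping small singular values.

```python
import numpy as np

def filtered_solution(X, y, lam):
    """Regularized least-squares weights written with SVD filter factors.

    Assumes Tikhonov regularization: each singular component i is scaled by
    f_i = s_i^2 / (s_i^2 + lam^2), so directions with small singular values
    (the ones that amplify noise) are damped.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    filt = s**2 / (s**2 + lam**2)
    coeffs = filt * (U.T @ y) / s
    return Vt.T @ coeffs, filt

# Hypothetical data: 20 samples, 50 features (more features than samples,
# as is typical for gene expression matrices), labels in {-1, +1}.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = np.sign(rng.normal(size=20))

w, filt = filtered_solution(X, y, lam=1.0)

# Cross-check against the closed-form ridge solution.
w_ridge = np.linalg.solve(X.T @ X + 1.0**2 * np.eye(50), X.T @ y)
```

Inspecting `filt` shows which singular directions are effectively kept (values near 1) or suppressed (values near 0), which is one way such a representation can guide the choice of the regularization parameter.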
Faradji, Farhad; Ward, Rabab K; Birch, Gary E
2009-06-15
The feasibility of having a self-paced brain-computer interface (BCI) based on mental tasks is investigated. The EEG signals of four subjects performing five mental tasks each are used in the design of a 2-state self-paced BCI. The output of the BCI should only be activated when the subject performs a specific mental task and should remain inactive otherwise. For each subject and each task, the feature coefficient and the classifier that yield the best performance are selected, using the autoregressive coefficients as the features. The classifier with a zero false positive rate and the highest true positive rate is selected as the best classifier. The classifiers tested include: linear discriminant analysis, quadratic discriminant analysis, Mahalanobis discriminant analysis, support vector machine, and radial basis function neural network. The results show that: (1) some classifiers obtained the desired zero false positive rate; (2) the linear discriminant analysis classifier does not yield acceptable performance; (3) the quadratic discriminant analysis classifier outperforms the Mahalanobis discriminant analysis classifier and performs almost as well as the radial basis function neural network; and (4) the support vector machine classifier has the highest true positive rates but unfortunately has nonzero false positive rates in most cases.
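The selection rule stated in this abstract (zero false positive rate first, then highest true positive rate) can be written as a short Python sketch. The classifier names and rates below are hypothetical placeholders, not results from the study.

```python
def select_best(candidates):
    """candidates maps a classifier name to (true_positive_rate, false_positive_rate).

    Keep only classifiers with a zero false positive rate and return the one
    with the highest true positive rate, or None if no classifier qualifies.
    """
    zero_fp = {name: tpr for name, (tpr, fpr) in candidates.items() if fpr == 0.0}
    if not zero_fp:
        return None
    return max(zero_fp, key=zero_fp.get)

# Hypothetical (TPR, FPR) pairs for three unnamed classifiers.
candidates = {"clf_a": (0.62, 0.0), "clf_b": (0.81, 0.03), "clf_c": (0.70, 0.0)}
best = select_best(candidates)
```

Here `clf_b` has the highest true positive rate but is excluded because its false positive rate is nonzero.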
Detection of Epileptic Seizure Event and Onset Using EEG
Ahammad, Nabeel; Fathima, Thasneem; Joseph, Paul
2014-01-01
This study proposes a method for automatic detection of epileptic seizure events and onset using wavelet-based features and certain statistical features computed without wavelet decomposition. Normal and epileptic EEG signals were classified using a linear classifier. For seizure event detection, the Bonn University EEG database was used. Three types of EEG signals (recorded from healthy volunteers with eyes open, from epilepsy patients in the epileptogenic zone during a seizure-free interval, and from epilepsy patients during epileptic seizures) were classified. Important features such as energy, entropy, standard deviation, maximum, minimum, and mean at different subbands were computed, and classification was done using a linear classifier. The performance of the classifier was determined in terms of specificity, sensitivity, and accuracy. The overall accuracy was 84.2%. For seizure onset detection, the CHB-MIT scalp EEG database was used. Along with wavelet-based features, interquartile range (IQR) and mean absolute deviation (MAD) computed without wavelet decomposition were extracted. Latency was used to study the performance of seizure onset detection. The classifier gave a sensitivity of 98.5% with an average latency of 1.76 seconds. PMID:24616892
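The two statistical features named in this abstract, IQR and mean absolute deviation, can be computed per window as in the Python sketch below. The window length, step, and toy signal are assumptions; the abstract does not give the segmentation used in the study.

```python
import numpy as np

def iqr(x):
    """Interquartile range: 75th percentile minus 25th percentile."""
    q75, q25 = np.percentile(x, [75, 25])
    return float(q75 - q25)

def mean_abs_dev(x):
    """Mean absolute deviation around the mean."""
    x = np.asarray(x, dtype=float)
    return float(np.mean(np.abs(x - x.mean())))

def window_features(signal, win, step):
    """Slide a window over the signal and return one (IQR, MAD) pair per window."""
    signal = np.asarray(signal, dtype=float)
    feats = []
    for start in range(0, len(signal) - win + 1, step):
        seg = signal[start:start + win]
        feats.append((iqr(seg), mean_abs_dev(seg)))
    return feats

# Toy signal standing in for one EEG channel.
toy = [1, 2, 3, 4, 5, 6, 7, 8]
features_full = window_features(toy, win=8, step=8)
features_half = window_features(toy, win=4, step=4)
```

Feature pairs like these would then be passed to whatever classifier is in use; the sketch covers only the feature computation.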
A Prototype SSVEP Based Real Time BCI Gaming System
Martišius, Ignas; Damaševičius, Robertas
2016-01-01
Although brain-computer interface technology is mainly designed with disabled people in mind, it can also be beneficial to healthy subjects, for example, in gaming or virtual reality systems. In this paper we discuss the typical architecture, paradigms, requirements, and limitations of electroencephalogram-based gaming systems. We have developed a prototype three-class brain-computer interface system, based on the steady state visually evoked potentials paradigm and the Emotiv EPOC headset. An online target shooting game, implemented in the OpenViBE environment, has been used for user feedback. The system utilizes wave atom transform for feature extraction, achieving an average accuracy of 78.2% using linear discriminant analysis classifier, 79.3% using support vector machine classifier with a linear kernel, and 80.5% using a support vector machine classifier with a radial basis function kernel. PMID:27051414
A Novel Locally Linear KNN Method With Applications to Visual Recognition.
Liu, Qingfeng; Liu, Chengjun
2017-09-01
A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.
NASA Astrophysics Data System (ADS)
Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.
2017-07-01
During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogens. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor-intensive and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
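Turning a regression output into a classifier by thresholding, and picking the threshold from ROC points, can be sketched in Python as follows. The abstract does not state the criterion used to pick the optimal threshold; maximizing TPR minus FPR (Youden's J) is an assumption here, and the scores and labels are made-up toy values.

```python
def rates(scores, labels, threshold):
    """Return (TPR, FPR) when every score >= threshold is called contaminated."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    pos = sum(labels)
    neg = len(labels) - pos
    return tp / pos, fp / neg

def best_threshold(scores, labels):
    """Scan the observed scores as candidate thresholds and keep the one that
    maximizes TPR - FPR (assumed criterion, not taken from the abstract)."""
    def youden(t):
        tpr, fpr = rates(scores, labels, t)
        return tpr - fpr
    return max(sorted(set(scores)), key=youden)

# Toy regression outputs (1 = contaminated pixel, 0 = clean pixel).
scores = [0.2, 0.4, 0.9, 1.1, 1.3, 1.6]
labels = [0, 0, 0, 1, 1, 1]

t_opt = best_threshold(scores, labels)
tpr, fpr = rates(scores, labels, t_opt)
```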
Alzheimer's Disease Detection by Pseudo Zernike Moment and Linear Regression Classification.
Wang, Shui-Hua; Du, Sidan; Zhang, Yin; Phillips, Preetha; Wu, Le-Nan; Chen, Xian-Qing; Zhang, Yu-Dong
2017-01-01
This study presents an improved method based on "Gorji et al. Neuroscience. 2015" by introducing a relatively new classifier, linear regression classification. Our method selects one axial slice from a 3D brain image and employs pseudo Zernike moments with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification is used as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches.
Thin Cloud Detection Method by Linear Combination Model of Cloud Image
NASA Astrophysics Data System (ADS)
Liu, L.; Li, J.; Wang, Y.; Xiao, Y.; Zhang, W.; Zhang, S.
2018-04-01
Existing cloud detection methods in photogrammetry often extract image features directly from remote sensing images and then use them to classify images as cloud or non-cloud. When the cloud is thin and small, however, these methods are inaccurate. In this paper, a linear combination model of cloud images is proposed. Using this model, the underlying surface information of remote sensing images can be removed, so the cloud detection result becomes more accurate. First, the automatic cloud detection program uses the linear combination model to separate the cloud information from the surface information in transparent cloud images, and then uses different image features to recognize the cloud parts. For computational efficiency, an AdaBoost classifier was introduced to combine the different features into a cloud classifier. The AdaBoost classifier can select the most effective features from many ordinary features, so the calculation time is greatly reduced. Finally, we compared the proposed method with a cloud detection method based on a tree structure and a multiple-feature detection method using an SVM classifier; the experimental data show that the proposed cloud detection program has high accuracy and fast calculation speed.
Automatic classification of artifactual ICA-components for artifact removal in EEG signals.
Winkler, Irene; Haufe, Stefan; Tangermann, Michael
2011-08-02
Artifacts contained in EEG recordings hamper both, the visual interpretation by experts as well as the algorithmic processing and analysis (e.g. for Brain-Computer Interfaces (BCI) or for Mental State Monitoring). While hand-optimized selection of source components derived from Independent Component Analysis (ICA) to clean EEG data is widespread, the field could greatly profit from automated solutions based on Machine Learning methods. Existing ICA-based removal strategies depend on explicit recordings of an individual's artifacts or have not been shown to reliably identify muscle artifacts. We propose an automatic method for the classification of general artifactual source components. They are estimated by TDSEP, an ICA method that takes temporal correlations into account. The linear classifier is based on an optimized feature subset determined by a Linear Programming Machine (LPM). The subset is composed of features from the frequency-, the spatial- and temporal domain. A subject independent classifier was trained on 640 TDSEP components (reaction time (RT) study, n = 12) that were hand labeled by experts as artifactual or brain sources and tested on 1080 new components of RT data of the same study. Generalization was tested on new data from two studies (auditory Event Related Potential (ERP) paradigm, n = 18; motor imagery BCI paradigm, n = 80) that used data with different channel setups and from new subjects. Based on six features only, the optimized linear classifier performed on level with the inter-expert disagreement (<10% Mean Squared Error (MSE)) on the RT data. On data of the auditory ERP study, the same pre-calculated classifier generalized well and achieved 15% MSE. On data of the motor imagery paradigm, we demonstrate that the discriminant information used for BCI is preserved when removing up to 60% of the most artifactual source components. 
We propose a universal and efficient classifier of ICA components for the subject independent removal of artifacts from EEG data. Based on linear methods, it is applicable for different electrode placements and supports the introspection of results. Trained on expert ratings of large data sets, it is not restricted to the detection of eye- and muscle artifacts. Its performance and generalization ability is demonstrated on data of different EEG studies.
Linear and Order Statistics Combiners for Pattern Classification
NASA Technical Reports Server (NTRS)
Tumer, Kagan; Ghosh, Joydeep; Lau, Sonie (Technical Monitor)
2001-01-01
Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the 'added' error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order statistics based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the i-th order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.
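The factor-of-N claim can be checked numerically with a small Python simulation. The model is an assumption made for illustration: each classifier's boundary offset from the Bayes optimum is drawn as an independent zero-mean Gaussian, and since the abstract states that the added error is proportional to the boundary variance, the ratio of variances stands in for the ratio of added errors.

```python
import numpy as np

def variance_ratio(n_classifiers, n_trials=200_000, seed=0):
    """Ratio of single-classifier boundary variance to averaged-boundary variance.

    Each row holds the boundary offsets of n_classifiers unbiased,
    uncorrelated classifiers (assumed i.i.d. standard normal).
    """
    rng = np.random.default_rng(seed)
    offsets = rng.normal(0.0, 1.0, size=(n_trials, n_classifiers))
    single_var = offsets[:, 0].var()
    averaged_var = offsets.mean(axis=1).var()
    return single_var / averaged_var

ratio = variance_ratio(5)  # expected to be close to 5 under these assumptions
```

If the offsets were correlated, the averaged variance would shrink by less than a factor of N, which matches the abstract's remark that separate expressions are needed for correlated combiners.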
Intelligent query by humming system based on score level fusion of multiple classifiers
NASA Astrophysics Data System (ADS)
Pyo Nam, Gi; Thu Trang Luong, Thi; Ha Nam, Hyun; Ryoung Park, Kang; Park, Sung-Joo
2011-12-01
Recently, the necessity for content-based music retrieval that can return results even if a user does not know information such as the title or singer has increased. Query-by-humming (QBH) systems have been introduced to address this need, as they allow the user to simply hum snatches of the tune to find the right song. Even though there have been many studies on QBH, few have combined multiple classifiers based on various fusion methods. Here we propose a new QBH system based on the score level fusion of multiple classifiers. This research is novel in the following three respects: three local classifiers [quantized binary (QB) code-based linear scaling (LS), pitch-based dynamic time warping (DTW), and LS] are employed; local maximum and minimum point-based LS and pitch distribution feature-based LS are used as global classifiers; and the combination of local and global classifiers based on the score level fusion by the PRODUCT rule is used to achieve enhanced matching accuracy. Experimental results with the 2006 MIREX QBSH and 2009 MIR-QBSH corpus databases show that the performance of the proposed method is better than that of single classifier and other fusion methods.
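Score-level fusion by the PRODUCT rule can be sketched in Python as below. The sketch assumes every classifier already outputs a similarity in [0, 1] for each candidate song (distance-based matchers such as DTW would need converting and normalizing first); the classifier and song names are hypothetical.

```python
from math import prod

def product_rule_fusion(scores_by_classifier):
    """scores_by_classifier maps a classifier name to {candidate: similarity}.

    Multiply each candidate's similarities across all classifiers and
    return the fused scores together with the top-ranked candidate.
    """
    classifiers = list(scores_by_classifier.values())
    candidates = classifiers[0].keys()
    fused = {c: prod(scores[c] for scores in classifiers) for c in candidates}
    return fused, max(fused, key=fused.get)

# Hypothetical normalized similarities for three candidate songs.
scores = {
    "local_matcher_1": {"song_A": 0.9, "song_B": 0.6, "song_C": 0.2},
    "local_matcher_2": {"song_A": 0.5, "song_B": 0.8, "song_C": 0.1},
    "global_matcher": {"song_A": 0.7, "song_B": 0.9, "song_C": 0.3},
}
fused, best = product_rule_fusion(scores)
```

Because the scores are multiplied, a candidate that any single classifier rates very low is penalized heavily, which is why `song_A` loses to `song_B` here despite having the highest individual score.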
Text categorization of biomedical data sets using graph kernels and a controlled vocabulary.
Bleik, Said; Mishra, Meenakshi; Huan, Jun; Song, Min
2013-01-01
Recently, graph representations of text have been showing improved performance over conventional bag-of-words representations in text categorization applications. In this paper, we present a graph-based representation for biomedical articles and use graph kernels to classify those articles into high-level categories. In our representation, common biomedical concepts and semantic relationships are identified with the help of an existing ontology and are used to build a rich graph structure that provides a consistent feature set and preserves additional semantic information that could improve a classifier's performance. We attempt to classify the graphs using both a set-based graph kernel that is capable of dealing with the disconnected nature of the graphs and a simple linear kernel. Finally, we report the results comparing the classification performance of the kernel classifiers to common text-based classifiers.
Steyrl, David; Scherer, Reinhold; Faller, Josef; Müller-Putz, Gernot R
2016-02-01
There is general agreement in the brain-computer interface (BCI) community that although non-linear classifiers can provide better results in some cases, linear classifiers are preferable, particularly because non-linear classifiers often involve a number of parameters that must be carefully chosen. However, new non-linear classifiers have been developed over the last decade. One of them is the random forest (RF) classifier. Although popular in other fields of science, RFs are not common in BCI research. In this work, we address three open questions regarding RFs in sensorimotor rhythm (SMR) BCIs: parametrization, online applicability, and performance compared to regularized linear discriminant analysis (LDA). We found that the performance of RF is constant over a large range of parameter values. We demonstrate - for the first time - that RFs are applicable online in SMR-BCIs. Further, we show in an offline BCI simulation that RFs statistically significantly outperform regularized LDA by about 3%. These results confirm that RFs are practical and convenient non-linear classifiers for SMR-BCIs. Taking into account further properties of RFs, such as independence from feature distributions, maximum margin behavior, multiclass and advanced data mining capabilities, we argue that RFs should be taken into consideration for future BCIs.
Combination of dynamic Bayesian network classifiers for the recognition of degraded characters
NASA Astrophysics Data System (ADS)
Likforman-Sulem, Laurence; Sigelle, Marc
2009-01-01
We investigate in this paper the combination of DBN (Dynamic Bayesian Network) classifiers, either independent or coupled, for the recognition of degraded characters. The independent classifiers are a vertical HMM and a horizontal HMM whose observable outputs are the image columns and the image rows respectively. The coupled classifiers, presented in a previous study, associate the vertical and horizontal observation streams into single DBNs. The scores of the independent and coupled classifiers are then combined linearly at the decision level. We compare the different classifiers -independent, coupled or linearly combined- on two tasks: the recognition of artificially degraded handwritten digits and the recognition of real degraded old printed characters. Our results show that coupled DBNs perform better on degraded characters than the linear combination of independent HMM scores. Our results also show that the best classifier is obtained by linearly combining the scores of the best coupled DBN and the best independent HMM.
Terrill, Philip I; Wilson, Stephen J; Suresh, Sadasivam; Cooper, David M; Dakin, Carolyn
2012-08-01
Previous work has identified that non-linear variables calculated from respiratory data vary between sleep states, and that variables derived from the non-linear analytical tool recurrence quantification analysis (RQA) are accurate infant sleep state discriminators. This study aims to apply these discriminators to automatically classify 30 s epochs of infant sleep as REM, non-REM and wake. Polysomnograms were obtained from 25 healthy infants at 2 weeks, 3, 6 and 12 months of age, and manually sleep staged as wake, REM and non-REM. Inter-breath interval data were extracted from the respiratory inductive plethysmograph, and RQA was applied to calculate radius, determinism and laminarity. Time-series statistic and spectral analysis variables were also calculated. A nested cross-validation method was used to identify the optimal feature subset, and to train and evaluate a linear discriminant analysis-based classifier. The RQA features radius and laminarity were reliably selected. Mean agreement was 79.7, 84.9, 84.0 and 79.2 % at 2 weeks, 3, 6 and 12 months, and the classifier performed better than a comparison classifier not including RQA variables. The performance of this sleep-staging tool compares favourably with inter-human agreement rates, and improves upon previous systems using only respiratory data. Applications include diagnostic screening and population-based sleep research.
Automatic Classification of Artifactual ICA-Components for Artifact Removal in EEG Signals
2011-01-01
Background: Artifacts contained in EEG recordings hamper both, the visual interpretation by experts as well as the algorithmic processing and analysis (e.g. for Brain-Computer Interfaces (BCI) or for Mental State Monitoring). While hand-optimized selection of source components derived from Independent Component Analysis (ICA) to clean EEG data is widespread, the field could greatly profit from automated solutions based on Machine Learning methods. Existing ICA-based removal strategies depend on explicit recordings of an individual's artifacts or have not been shown to reliably identify muscle artifacts. Methods: We propose an automatic method for the classification of general artifactual source components. They are estimated by TDSEP, an ICA method that takes temporal correlations into account. The linear classifier is based on an optimized feature subset determined by a Linear Programming Machine (LPM). The subset is composed of features from the frequency-, the spatial- and temporal domain. A subject independent classifier was trained on 640 TDSEP components (reaction time (RT) study, n = 12) that were hand labeled by experts as artifactual or brain sources and tested on 1080 new components of RT data of the same study. Generalization was tested on new data from two studies (auditory Event Related Potential (ERP) paradigm, n = 18; motor imagery BCI paradigm, n = 80) that used data with different channel setups and from new subjects. Results: Based on six features only, the optimized linear classifier performed on level with the inter-expert disagreement (<10% Mean Squared Error (MSE)) on the RT data. On data of the auditory ERP study, the same pre-calculated classifier generalized well and achieved 15% MSE. On data of the motor imagery paradigm, we demonstrate that the discriminant information used for BCI is preserved when removing up to 60% of the most artifactual source components.
Conclusions: We propose a universal and efficient classifier of ICA components for the subject independent removal of artifacts from EEG data. Based on linear methods, it is applicable for different electrode placements and supports the introspection of results. Trained on expert ratings of large data sets, it is not restricted to the detection of eye- and muscle artifacts. Its performance and generalization ability is demonstrated on data of different EEG studies. PMID:21810266
LBP and SIFT based facial expression recognition
NASA Astrophysics Data System (ADS)
Sumer, Omer; Gunes, Ece O.
2015-02-01
This study compares the performance of local binary patterns (LBP) and scale invariant feature transform (SIFT) with support vector machines (SVM) in automatic classification of discrete facial expressions. Facial expression recognition is a multiclass classification problem, and seven classes are classified: happiness, anger, sadness, disgust, surprise, fear and contempt. Using SIFT feature vectors and a linear SVM, 93.1% mean accuracy is achieved on the CK+ database. On the other hand, the performance of the LBP-based classifier with a linear SVM is reported on SFEW using the strictly person independent (SPI) protocol. Seven-class mean accuracy on SFEW is 59.76%. Experiments on both databases showed that LBP features can be used in a fairly descriptive way if a good localization of facial points and a good partitioning strategy are followed.
NASA Astrophysics Data System (ADS)
Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.
2017-03-01
Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classifiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-one-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
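The Dice coefficient used for evaluation in this abstract is the standard overlap measure 2|A ∩ B| / (|A| + |B|) between a predicted and a reference binary mask. A minimal Python sketch follows; the flattened toy masks are made up, and returning 1.0 when both masks are empty is a convention chosen here, not something the abstract specifies.

```python
def dice(pred, truth):
    """Dice coefficient between two equal-length binary masks (flattened voxel lists)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total

# Toy masks: 1 = voxel predicted / observed as infarcted, 0 = healthy tissue.
predicted = [1, 1, 0, 0, 1]
follow_up = [1, 0, 0, 1, 1]
score = dice(predicted, follow_up)
```

In a leave-one-patient-out setting, a score like this would be computed once per held-out patient and then averaged.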
Overlapped Partitioning for Ensemble Classifiers of P300-Based Brain-Computer Interfaces
Onishi, Akinari; Natsume, Kiyohisa
2014-01-01
A P300-based brain-computer interface (BCI) enables a wide range of people to control devices that improve their quality of life. Ensemble classifiers with naive partitioning were recently applied to the P300-based BCI and these classification performances were assessed. However, they were usually trained on a large amount of training data (e.g., 15300). In this study, we evaluated ensemble linear discriminant analysis (LDA) classifiers with a newly proposed overlapped partitioning method using 900 training data. In addition, the classification performances of the ensemble classifier with naive partitioning and a single LDA classifier were compared. One of three conditions for dimension reduction was applied: the stepwise method, principal component analysis (PCA), or none. The results show that an ensemble stepwise LDA (SWLDA) classifier with overlapped partitioning achieved a better performance than the commonly used single SWLDA classifier and an ensemble SWLDA classifier with naive partitioning. This result implies that the performance of the SWLDA is improved by overlapped partitioning and the ensemble classifier with overlapped partitioning requires less training data than that with naive partitioning. This study contributes towards reducing the required amount of training data and achieving better classification performance. PMID:24695550
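The contrast between naive and overlapped partitioning can be illustrated with a short sketch. The abstract does not say how the overlap is constructed, so the scheme below is an assumption for illustration only: each block is extended by a fixed number of samples borrowed from the following block, wrapping around at the end. Function and parameter names are hypothetical.

```python
def naive_partitions(samples, n_parts):
    """Split samples into n_parts disjoint, equally sized blocks."""
    size = len(samples) // n_parts
    return [samples[i * size:(i + 1) * size] for i in range(n_parts)]


def overlapped_partitions(samples, n_parts, overlap):
    """Like naive_partitions, but each block also takes `overlap` samples
    from the start of the next block (assumed scheme, wraps around)."""
    n = len(samples)
    size = n // n_parts
    return [
        [samples[(i * size + k) % n] for k in range(size + overlap)]
        for i in range(n_parts)
    ]


trials = list(range(12))                      # stand-in for 12 training trials
disjoint = naive_partitions(trials, 3)        # each classifier sees 4 trials
shared = overlapped_partitions(trials, 3, 2)  # each classifier sees 6 trials
```

With the same total amount of data, every ensemble member trained on an overlapped block sees more samples than its naive counterpart, which is one plausible reading of why less training data is needed.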
A bench-top hyperspectral imaging system to classify beef from Nellore cattle based on tenderness
NASA Astrophysics Data System (ADS)
Nubiato, Keni Eduardo Zanoni; Mazon, Madeline Rezende; Antonelo, Daniel Silva; Calkins, Chris R.; Naganathan, Govindarajan Konda; Subbiah, Jeyamkondan; da Luz e Silva, Saulo
2018-03-01
The aim of this study was to evaluate the accuracy of classification of Nellore beef aged for 0, 7, 14, or 21 days and classification based on tenderness and aging period using a bench-top hyperspectral imaging system. A hyperspectral imaging system (λ = 928-2524 nm) was used to collect hyperspectral images of the Longissimus thoracis et lumborum (aging n = 376 and tenderness n = 345) of Nellore cattle. The image processing steps included selection of region of interest, extraction of spectra, and identification and evaluation of selected wavelengths for classification. Six linear discriminant models were developed to classify samples based on tenderness and aging period. The model using the first derivative of partial absorbance spectra (give wavelength range spectra) was able to classify steaks based on the tenderness with an overall accuracy of 89.8%. The model using the first derivative of full absorbance spectra was able to classify steaks based on aging period with an overall accuracy of 84.8%. The results demonstrate that the hyperspectral imaging system may be a viable technology for classifying beef based on tenderness and aging period.
NASA Technical Reports Server (NTRS)
Lin, Qian; Allebach, Jan P.
1990-01-01
An adaptive vector linear minimum mean-squared error (LMMSE) filter for multichannel images with multiplicative noise is presented. It is shown theoretically that the mean-squared error in the filter output is reduced by making use of the correlation between image bands. The vector and conventional scalar LMMSE filters are applied to a three-band SIR-B SAR image, and their performance is compared. Based on a multiplicative noise model, the per-pel maximum likelihood classifier was derived. The authors extend this to the design of sequential and robust classifiers. These classifiers are also applied to the three-band SIR-B SAR image.
A comprehensive simulation study on classification of RNA-Seq data.
Zararsız, Gökmen; Goksuluk, Dincer; Korkmaz, Selcuk; Eldem, Vahap; Zararsiz, Gozde Erturk; Duru, Izzet Parug; Ozturk, Ahmet
2017-01-01
RNA sequencing (RNA-Seq) is a powerful technique for the gene-expression profiling of organisms that uses the capabilities of next-generation sequencing technologies. Developing gene-expression-based classification algorithms is an emerging powerful method for diagnosis, disease classification and monitoring at molecular level, as well as providing potential markers of diseases. Most of the statistical methods proposed for the classification of gene-expression data are either based on a continuous scale (e.g., microarray data) or require a normal distribution assumption. Hence, these methods cannot be directly applied to RNA-Seq data since they violate both data structure and distributional assumptions. However, it is possible to apply these algorithms with appropriate modifications to RNA-Seq data. One way is to develop count-based classifiers, such as Poisson linear discriminant analysis and negative binomial linear discriminant analysis. Another way is to bring the data closer to microarrays and apply microarray-based classifiers. In this study, we compared several classifiers including PLDA with and without power transformation, NBLDA, single SVM, bagging SVM (bagSVM), classification and regression trees (CART), and random forests (RF). We also examined the effect of several parameters such as overdispersion, sample size, number of genes, number of classes, differential-expression rate, and the transformation method on model performances. A comprehensive simulation study is conducted and the results are compared with the results of two miRNA and two mRNA experimental datasets. The results revealed that increasing the sample size and differential-expression rate, and decreasing the dispersion parameter and number of groups, lead to an increase in classification accuracy. Similar to differential-expression studies, the classification of RNA-Seq data requires careful attention when handling data overdispersion.
We conclude that, as a count-based classifier, the power transformed PLDA and, as a microarray-based classifier, vst or rlog transformed RF and SVM classifiers may be a good choice for classification. An R/BIOCONDUCTOR package, MLSeq, is freely available at https://www.bioconductor.org/packages/release/bioc/html/MLSeq.html.
Classification of Multiple Chinese Liquors by Means of a QCM-based E-Nose and MDS-SVM Classifier.
Li, Qiang; Gu, Yu; Jia, Jing
2017-01-30
Chinese liquors are internationally well-known fermentative alcoholic beverages. They have unique flavors attributable to the use of various bacteria and fungi, raw materials, and production processes. Developing a novel, rapid, and reliable method to identify multiple Chinese liquors is of positive significance. This paper presents a pattern recognition system for classifying ten brands of Chinese liquors based on multidimensional scaling (MDS) and support vector machine (SVM) algorithms in a quartz crystal microbalance (QCM)-based electronic nose (e-nose) we designed. We evaluated the comprehensive performance of the MDS-SVM classifier that predicted all ten brands of Chinese liquors individually. The prediction accuracy (98.3%) showed superior performance of the MDS-SVM classifier over the back-propagation artificial neural network (BP-ANN) classifier (93.3%) and moving average-linear discriminant analysis (MA-LDA) classifier (87.6%). The MDS-SVM classifier has reasonable reliability, good fitting and prediction (generalization) performance in classification of the Chinese liquors. Taking both application of the e-nose and validation of the MDS-SVM classifier into account, we have thus created a useful method for the classification of multiple Chinese liquors.
Novel nonlinear knowledge-based mean force potentials based on machine learning.
Dong, Qiwen; Zhou, Shuigeng
2011-01-01
The prediction of 3D structures of proteins from amino acid sequences is one of the most challenging problems in molecular biology. An essential task for solving this problem with coarse-grained models is to deduce effective interaction potentials. The development and evaluation of new energy functions is critical to accurately modeling the properties of biological macromolecules. Knowledge-based mean force potentials are derived from statistical analysis of proteins of known structures. Current knowledge-based potentials are almost all in the form of a weighted linear sum of interaction pairs. In this study, a class of novel nonlinear knowledge-based mean force potentials is presented. The potential parameters are obtained by nonlinear classifiers, instead of relative frequencies of interaction pairs against a reference state or linear classifiers. The support vector machine is used to derive the potential parameters on data sets that contain both native structures and decoy structures. Five knowledge-based mean force Boltzmann-based or linear potentials are introduced and their corresponding nonlinear potentials are implemented. They are the DIH potential (single-body residue-level Boltzmann-based potential), the DFIRE-SCM potential (two-body residue-level Boltzmann-based potential), the FS potential (two-body atom-level Boltzmann-based potential), the HR potential (two-body residue-level linear potential), and the T32S3 potential (two-body atom-level linear potential). Experiments are performed on well-established decoy sets, including the LKF data set, the CASP7 data set, and the Decoys “R” Us data set. The evaluation metrics include the energy Z score and the ability of each potential to discriminate native structures from a set of decoy structures.
Experimental results show that all nonlinear potentials significantly outperform the corresponding Boltzmann-based or linear potentials, and the proposed discriminative framework is effective in developing knowledge-based mean force potentials. The nonlinear potentials can be widely used for ab initio protein structure prediction, model quality assessment, protein docking, and other challenging problems in computational biology.
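The energy Z score mentioned among the evaluation metrics is commonly computed as the native structure's energy minus the mean decoy energy, divided by the standard deviation of the decoy energies. The sketch below follows that common definition. Whether the study used the population or the sample standard deviation is not stated, so the use of `statistics.pstdev` is an assumption, and the energies are made-up illustration values.

```python
import statistics


def energy_z_score(native_energy, decoy_energies):
    """Z score of the native energy relative to the decoy energy distribution.

    A strongly negative value means the potential places the native structure
    well below the decoys. Population standard deviation is assumed.
    """
    mean = statistics.fmean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    if spread == 0:
        raise ValueError("decoy energies must not all be identical")
    return (native_energy - mean) / spread


def ranks_native_first(native_energy, decoy_energies):
    """True when the native structure has lower energy than every decoy."""
    return all(native_energy < e for e in decoy_energies)


# Hypothetical energies in arbitrary units, not values from the paper.
decoys = [-10.0, -12.0, -8.0, -14.0, -6.0]
z = energy_z_score(-20.0, decoys)
```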
Building Reliable Metaclassifiers for Text Learning
2006-05-01
outputs are often poor [Ben00, DP96] but can be improved [Ben00, ZE01, ZE02]. SVM For linear SVMs, we use the Smox toolkit which is based on Platt’s...and implementations are the same as discussed in Section 6.3. The exception is that for an implementation of linear SVMs, we used the Smox toolkit which...is based on Platt’s Sequential Minimal Optimization algorithm [Pla98]. Since Smox is the best base classifier in the experiments below, it is the
Korczowski, L; Congedo, M; Jutten, C
2015-08-01
The classification of electroencephalographic (EEG) data recorded from multiple users simultaneously is an important challenge in the field of Brain-Computer Interface (BCI). In this paper we compare different approaches for classification of single-trials Event-Related Potential (ERP) on two subjects playing a collaborative BCI game. The minimum distance to mean (MDM) classifier in a Riemannian framework is extended to use the diversity of the inter-subjects spatio-temporal statistics (MDM-hyper) or to merge multiple classifiers (MDM-multi). We show that both these classifiers outperform significantly the mean performance of the two users and analogous classifiers based on the step-wise linear discriminant analysis. More importantly, the MDM-multi outperforms the performance of the best player within the pair.
Fall Detection Using Smartphone Audio Features.
Cheffena, Michael
2016-07-01
An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with ANN classifier with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirement for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
Miao, Xinyang; Li, Hao; Bao, Rima; Feng, Chengjing; Wu, Hang; Zhan, Honglei; Li, Yizhang; Zhao, Kun
2017-02-01
Understanding the geological units of a reservoir is essential to the development and management of the resource. In this paper, drill cuttings from several depths from an oilfield were studied using terahertz time domain spectroscopy (THz-TDS). Cluster analysis (CA) and principal component analysis (PCA) were employed to classify and analyze the cuttings. The cuttings were clearly classified based on CA and PCA methods, and the results were in agreement with the lithology. Moreover, calcite and dolomite have stronger absorption of a THz pulse than any other minerals, based on an analysis of the PC1 scores. Quantitative analyses of minor minerals were also realized by building a series of linear and non-linear models between contents and PC2 scores. The results prove THz technology to be a promising means for determining reservoir lithology as well as other properties, which will be a significant supplementary method in oil fields.
2014-01-01
Background: Left bundle branch block (LBBB) and right bundle branch block (RBBB) not only mask electrocardiogram (ECG) changes that reflect diseases but also indicate important underlying pathology. The timely detection of LBBB and RBBB is critical in the treatment of cardiac diseases. Inter-patient heartbeat classification is based on independent training and testing sets to construct and evaluate a heartbeat classification system. Therefore, a heartbeat classification system with a high performance evaluation possesses a strong predictive capability for unknown data. The aim of this study was to propose a method for inter-patient classification of heartbeats to accurately detect LBBB and RBBB from the normal beat (NORM). Methods: This study proposed a heartbeat classification method through a combination of three different types of classifiers: a minimum distance classifier constructed between NORM and LBBB; a weighted linear discriminant classifier between NORM and RBBB based on Bayesian decision making using posterior probabilities; and a linear support vector machine (SVM) between LBBB and RBBB. Each classifier was used with matching features to obtain better classification performance. The final types of the test heartbeats were determined using a majority voting strategy through the combination of class labels from the three classifiers. The optimal parameters for the classifiers were selected using cross-validation on the training set. The effects of different lead configurations on the classification results were assessed, and the performance of these three classifiers was compared for the detection of each pair of heartbeat types. Results: The study results showed that a two-lead configuration exhibited better classification results compared with a single-lead configuration. The construction of a classifier with good performance between each pair of heartbeat types significantly improved the heartbeat classification performance.
The results showed a sensitivity of 91.4% and a positive predictive value of 37.3% for LBBB and a sensitivity of 92.8% and a positive predictive value of 88.8% for RBBB. Conclusions: A multi-classifier ensemble method was proposed based on inter-patient data and demonstrated a satisfactory classification performance. This approach has the potential for application in clinical practice to distinguish LBBB and RBBB from NORM of unknown patients. PMID:24903422
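The majority voting step over three pairwise classifiers can be sketched as follows. Each pairwise classifier contributes one label, and the label with the most votes wins. The abstract does not describe how a three-way disagreement is resolved, so returning an "UNDECIDED" marker here is purely an assumption for illustration.

```python
from collections import Counter


def majority_vote(votes, undecided="UNDECIDED"):
    """Return the most frequent label, or `undecided` when the top count is tied."""
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return undecided  # assumed tie handling, not taken from the paper
    return counts[0][0]


# One vote each from the NORM/LBBB, NORM/RBBB and LBBB/RBBB classifiers.
clear_case = majority_vote(["LBBB", "NORM", "LBBB"])
tied_case = majority_vote(["LBBB", "NORM", "RBBB"])
```

Because every class takes part in exactly two of the three pairwise decisions, a beat can collect at most two votes for any class, and two votes are enough to win.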
Huang, Huifang; Liu, Jie; Zhu, Qiang; Wang, Ruiping; Hu, Guangshu
2014-06-05
Left bundle branch block (LBBB) and right bundle branch block (RBBB) not only mask electrocardiogram (ECG) changes that reflect diseases but also indicate important underlying pathology. The timely detection of LBBB and RBBB is critical in the treatment of cardiac diseases. Inter-patient heartbeat classification is based on independent training and testing sets to construct and evaluate a heartbeat classification system. Therefore, a heartbeat classification system with a high performance evaluation possesses a strong predictive capability for unknown data. The aim of this study was to propose a method for inter-patient classification of heartbeats to accurately detect LBBB and RBBB from the normal beat (NORM). This study proposed a heartbeat classification method through a combination of three different types of classifiers: a minimum distance classifier constructed between NORM and LBBB; a weighted linear discriminant classifier between NORM and RBBB based on Bayesian decision making using posterior probabilities; and a linear support vector machine (SVM) between LBBB and RBBB. Each classifier was used with matching features to obtain better classification performance. The final types of the test heartbeats were determined using a majority voting strategy through the combination of class labels from the three classifiers. The optimal parameters for the classifiers were selected using cross-validation on the training set. The effects of different lead configurations on the classification results were assessed, and the performance of these three classifiers was compared for the detection of each pair of heartbeat types. The study results showed that a two-lead configuration exhibited better classification results compared with a single-lead configuration. The construction of a classifier with good performance between each pair of heartbeat types significantly improved the heartbeat classification performance. 
The results showed a sensitivity of 91.4% and a positive predictive value of 37.3% for LBBB and a sensitivity of 92.8% and a positive predictive value of 88.8% for RBBB. A multi-classifier ensemble method was proposed based on inter-patient data and demonstrated a satisfactory classification performance. This approach has the potential for application in clinical practice to distinguish LBBB and RBBB from NORM of unknown patients.
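The gap between a high sensitivity and a low positive predictive value, as reported for LBBB, is easier to read with the two definitions side by side. The sketch below computes both from confusion counts. The counts are hypothetical and chosen only to show the effect; they are not the study's data.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that the classifier detects."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of positive calls that are actually positive."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for a rare class: most true cases are found,
# but false alarms from the much larger normal class outnumber them.
tp, fn, fp = 90, 10, 150
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
```

When the target class is rare relative to normal beats, even a small false-alarm rate on the majority class can push the positive predictive value far below the sensitivity.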
Integrated Sensing Processor, Phase 2
2005-12-01
performance analysis for several baseline classifiers including neural nets, linear classifiers, and kNN classifiers. Use of CCDR as a preprocessing step...below the level of the benchmark non-linear classifier for this problem ( kNN ). Furthermore, the CCDR preconditioned kNN achieved a 10% improvement over...the benchmark kNN without CCDR. Finally, we found an important connection between intrinsic dimension estimation via entropic graphs and the optimal
Astrand, Elaine; Enel, Pierre; Ibos, Guilhem; Dominey, Peter Ford; Baraduc, Pierre; Ben Hamed, Suliann
2014-01-01
Decoding neuronal information is important in neuroscience, both as a basic means to understand how neuronal activity is related to cerebral function and as a processing stage in driving neuroprosthetic effectors. Here, we compare the readout performance of six commonly used classifiers at decoding two different variables encoded by the spiking activity of the non-human primate frontal eye fields (FEF): the spatial position of a visual cue, and the instructed orientation of the animal's attention. While the first variable is exogenously driven by the environment, the second variable corresponds to the interpretation of the instruction conveyed by the cue; it is endogenously driven and corresponds to the output of internal cognitive operations performed on the visual attributes of the cue. These two variables were decoded using either a regularized optimal linear estimator in its explicit formulation, an optimal linear artificial neural network estimator, a non-linear artificial neural network estimator, a non-linear naïve Bayesian estimator, a non-linear Reservoir recurrent network classifier or a non-linear Support Vector Machine classifier. Our results suggest that endogenous information such as the orientation of attention can be decoded from the FEF with the same accuracy as exogenous visual information. The classifiers did not all behave equally in the face of population size and heterogeneity, the available training and testing trials, the subject's behavior and the temporal structure of the variable of interest. In most situations, the regularized optimal linear estimator and the non-linear Support Vector Machine classifiers outperformed the other tested decoders. PMID:24466019
Deconvolution When Classifying Noisy Data Involving Transformations.
Carroll, Raymond; Delaigle, Aurore; Hall, Peter
2012-09-01
In the present study, we consider the problem of classifying spatial data distorted by a linear transformation or convolution and contaminated by additive random noise. In this setting, we show that classifier performance can be improved if we carefully invert the data before the classifier is applied. However, the inverse transformation is not constructed so as to recover the original signal, and in fact, we show that taking the latter approach is generally inadvisable. We introduce a fully data-driven procedure based on cross-validation, and use several classifiers to illustrate numerical properties of our approach. Theoretical arguments are given in support of our claims. Our procedure is applied to data generated by light detection and ranging (Lidar) technology, where we improve on earlier approaches to classifying aerosols. This article has supplementary materials online.
Multicategory nets of single-layer perceptrons: complexity and sample-size issues.
Raudys, Sarunas; Kybartas, Rimantas; Zavadskas, Edmundas Kazimieras
2010-05-01
The standard cost function of multicategory single-layer perceptrons (SLPs) does not minimize the classification error rate. In order to reduce classification error, it is necessary to: 1) reject the traditional cost function, 2) obtain near-optimal pairwise linear classifiers by specially organized SLP training and optimal stopping, and 3) fuse their decisions properly. To obtain better classification in unbalanced training set situations, we introduce the unbalance correcting term. It was found that fusion based on the Kullback-Leibler (K-L) distance and the Wu-Lin-Weng (WLW) method result in approximately the same performance in situations where sample sizes are relatively small. The explanation for this observation is the theoretically known fact that an excessive minimization of inexact criteria becomes harmful at times. Comprehensive comparative investigations of six real-world pattern recognition (PR) problems demonstrated that SLP-based pairwise classifiers are comparable to, and as often as not outperform, the linear support vector (SV) classifiers in moderate dimensional situations. The colored noise injection used to design pseudovalidation sets proves to be a powerful tool for facilitating finite sample problems in moderate-dimensional PR tasks.
Triacylglycerol stereospecific analysis and linear discriminant analysis for milk speciation.
Blasi, Francesca; Lombardi, Germana; Damiani, Pietro; Simonetti, Maria Stella; Giua, Laura; Cossignani, Lina
2013-05-01
Product authenticity is an important topic in the dairy sector. Dairy products sold for public consumption must be accurately labelled in accordance with the contained milk species. Linear discriminant analysis (LDA), a common chemometric procedure, has been applied to fatty acid% composition to classify pure milk samples (cow, ewe, buffalo, donkey, goat). All original grouped cases were correctly classified, while 90% of cross-validated grouped cases were correctly classified. Another objective of this research was the characterisation of cow-ewe milk mixtures in order to reveal a common fraud in the dairy field, that is, the addition of cow milk to ewe milk. Stereospecific analysis of triacylglycerols (TAG), a method based on chemical-enzymatic procedures coupled with chromatographic techniques, has been carried out to detect fraudulent milk additions, in particular 1, 3, 5% cow milk added to ewe milk. When only TAG composition data were used for the elaboration, 75% of original grouped cases were correctly classified, while all samples were correctly classified when both total and intrapositional TAG data were used. The results of cross validation were also better when TAG stereospecific analysis data were considered as LDA variables. In particular, 100% of cross-validated grouped cases were correctly classified when 5% cow milk mixtures were considered.
An implementation of support vector machine on sentiment classification of movie reviews
NASA Astrophysics Data System (ADS)
Yulietha, I. M.; Faraby, S. A.; Adiwijaya; Widyaningtyas, W. C.
2018-03-01
With technological advances, all information about movies is available on the internet. If this information is processed properly, useful knowledge can be obtained from it. This research proposes to classify the sentiments of movie review documents. It uses the Support Vector Machine (SVM) method because SVM can classify high-dimensional data, which matches the text data used in this research. SVM is a popular machine learning technique for text classification because it learns from a collection of previously classified documents and can provide good results. Across dataset splits, the 90-10 composition gives the best result, 85.6%. Across SVM kernels, the linear kernel with constant 1 gives the best result, 84.9%.
Classification of speech dysfluencies using LPC based parameterization techniques.
Hariharan, M; Chee, Lim Sin; Ai, Ooi Chia; Yaacob, Sazali
2012-06-01
The goal of this paper is to discuss and compare three feature extraction methods: Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Weighted Linear Prediction Cepstral Coefficients (WLPCC) for recognizing the stuttered events. Speech samples from the University College London Archive of Stuttered Speech (UCLASS) were used for our analysis. The stuttered events were identified through manual segmentation and were used for feature extraction. Two simple classifiers, namely k-nearest neighbour (kNN) and Linear Discriminant Analysis (LDA), were employed for speech dysfluencies classification. A conventional validation method was used for testing the reliability of the classifier results. The effects of different frame lengths, percentages of overlapping, values of ã in a first order pre-emphasizer and different orders p were studied. The speech dysfluencies classification accuracy was found to be improved by applying statistical normalization before feature extraction. The experimental investigation showed that LPC, LPCC and WLPCC features can be used for identifying the stuttered events and that WLPCC features slightly outperform LPCC and LPC features.
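Three of the parameters studied above (pre-emphasis coefficient, frame length, and percentage of overlap) belong to the usual front end of LPC-style feature extraction. The sketch below shows a first order pre-emphasizer, y[n] = x[n] - a·x[n-1], followed by overlapping framing. The default coefficient of 0.95 and the function names are assumptions for illustration; the abstract does not report which values performed best.

```python
def pre_emphasize(signal, coeff=0.95):
    """First order pre-emphasis: y[n] = x[n] - coeff * x[n-1].

    The first sample is passed through unchanged. The default coefficient
    is an assumed, commonly used value.
    """
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - coeff * signal[n - 1])
    return out


def frame_signal(signal, frame_length, overlap_fraction):
    """Split a signal into full frames that overlap by the given fraction."""
    step = max(1, round(frame_length * (1.0 - overlap_fraction)))
    last_start = len(signal) - frame_length
    return [signal[i:i + frame_length] for i in range(0, last_start + 1, step)]


emphasized = pre_emphasize([1.0, 1.0, 1.0], coeff=0.5)
frames = frame_signal(list(range(10)), frame_length=4, overlap_fraction=0.5)
```

Each frame would then be windowed and passed to the LPC analysis of order p; that step is omitted here.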
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-01-01
Background: Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Methodology/Principal Findings: Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. Conclusions/Significance: The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods.
Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice. PMID:21359184
Song, Sutao; Zhan, Zhichao; Long, Zhiying; Zhang, Jiacai; Yao, Li
2011-02-16
Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional studies which focused either merely on the evaluation of different types of SVM or the voxel selection methods, we aimed to investigate the overall performance of linear and RBF SVM for fMRI classification together with voxel selection schemes on classification accuracy and time-consuming. Six different voxel selection methods were employed to decide which voxels of fMRI data would be included in SVM classifiers with linear and RBF kernels in classifying 4-category objects. Then the overall performances of voxel selection and classification methods were compared. Results showed that: (1) Voxel selection had an important impact on the classification accuracy of the classifiers: in a relative low dimensional feature space, RBF SVM outperformed linear SVM significantly; in a relative high dimensional space, linear SVM performed better than its counterpart; (2) Considering the classification accuracy and time-consuming holistically, linear SVM with relative more voxels as features and RBF SVM with small set of voxels (after PCA) could achieve the better accuracy and cost shorter time. The present work provides the first empirical result of linear and RBF SVM in classification of fMRI data, combined with voxel selection methods. Based on the findings, if only classification accuracy was concerned, RBF SVM with appropriate small voxels and linear SVM with relative more voxels were two suggested solutions; if users concerned more about the computational time, RBF SVM with relative small set of voxels when part of the principal components were kept as features was a better choice.
NASA Astrophysics Data System (ADS)
Oung, Qi Wei; Nisha Basah, Shafriza; Muthusamy, Hariharan; Vijean, Vikneswaran; Lee, Hoileong
2018-03-01
Parkinson’s disease (PD) is a progressive neurodegenerative disease, known as a motor system syndrome, that is caused by the death of dopamine-generating cells in a region of the human midbrain. PD normally affects people over 60 years of age and at present affects a large part of the worldwide population. Lately, many researchers have shown interest in the connection between PD and speech disorders. Research has revealed that speech signals may be a suitable biomarker for distinguishing people with Parkinson’s (PWP) from healthy subjects. Therefore, early diagnosis of PD through speech signals can be considered for this aim. In this research, the speech data are acquired based on speech behaviour as the biomarker for differentiating PD severity levels (mild and moderate) from healthy subjects. The feature extraction algorithms applied are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Weighted Linear Prediction Cepstral Coefficients (WLPCC). For classification, two types of classifiers are used: k-Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN). The experimental results demonstrated that the PNN and KNN classifiers achieve best average classification performances of 92.63% and 88.56% respectively under 10-fold cross-validation. The suggested techniques therefore have the potential to become promising, high-performing tools for PD detection.
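As a rough illustration of the evaluation protocol mentioned above (k-nearest-neighbour classification scored by 10-fold cross-validation), here is a minimal sketch in plain Python. The two-feature toy data, the choice k=3 and the interleaved fold assignment are assumptions for illustration only; they are not the study's speech features or settings.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean distance)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def k_fold_accuracy(X, y, n_folds=10, k=3):
    """Mean accuracy over n_folds folds; fold f holds out samples f, f + n_folds, ..."""
    accuracies = []
    for fold in range(n_folds):
        test_idx = set(range(fold, len(X), n_folds))
        train_X = [X[i] for i in range(len(X)) if i not in test_idx]
        train_y = [y[i] for i in range(len(X)) if i not in test_idx]
        correct = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / n_folds

# Made-up, well-separated data: "healthy" clusters near 0, "PD" near 5.
X = [(i % 5 * 0.1, i % 3 * 0.1) for i in range(20)] + \
    [(5 + i % 5 * 0.1, 5 + i % 3 * 0.1) for i in range(20)]
y = ["healthy"] * 20 + ["PD"] * 20
print(k_fold_accuracy(X, y))
```

Each sample is held out exactly once, so the reported figure is an average over predictions made on data the classifier did not see during that fold.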
NASA Technical Reports Server (NTRS)
Pelletier, R. E.
1984-01-01
A need exists for digitized information pertaining to linear features such as roads, streams, water bodies and agricultural field boundaries as component parts of a data base. For many areas where this data may not yet exist or is in need of updating, these features may be extracted from remotely sensed digital data. This paper examines two approaches for identifying linear features, one utilizing raw data and the other classified data. Each approach uses a series of data enhancement procedures including derivation of standard deviation values, principal component analysis and filtering procedures using a high-pass window matrix. Just as certain bands better classify different land covers, so too do these bands exhibit high spectral contrast by which boundaries between land covers can be delineated. A few applications for this kind of data are briefly discussed, including its potential in a Universal Soil Loss Equation Model.
Blood Based Biomarkers of Early Onset Breast Cancer
2016-12-01
discretizes the data, and also using logistic elastic net – a form of linear regression - we were unable to build a classifier that could accurately...classifier for differentiating cases from controls off discretized data. The first pass analysis demonstrated a 35 gene signature that differentiated...to the discretized data for mRNA gene signature, the samples used to “train” were also included in the final samples used to “test” the algorithm
Using Neural Networks to Classify Digitized Images of Galaxies
NASA Astrophysics Data System (ADS)
Goderya, S. N.; McGuire, P. C.
2000-12-01
Automated classification of galaxies into Hubble types is of paramount importance to study the large scale structure of the Universe, particularly as survey projects like the Sloan Digital Sky Survey complete their data acquisition of one million galaxies. At present it is not possible to find robust and efficient artificial intelligence based galaxy classifiers. In this study we will summarize progress made in the development of automated galaxy classifiers using neural networks as machine learning tools. We explore the Bayesian linear algorithm, the higher order probabilistic network, the multilayer perceptron neural network and Support Vector Machine Classifier. The performance of any machine classifier is dependent on the quality of the parameters that characterize the different groups of galaxies. Our effort is to develop geometric and invariant moment based parameters as input to the machine classifiers instead of the raw pixel data. Such an approach reduces the dimensionality of the classifier considerably, removes the effects of scaling and rotation, and makes it easier to solve for the unknown parameters in the galaxy classifier. To judge the quality of training and classification we develop the concept of Mathews coefficients for the galaxy classification community. Mathews coefficients are single numbers that quantify classifier performance even with unequal prior probabilities of the classes.
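Assuming the "Mathews coefficients" above refer to what is usually spelled the Matthews correlation coefficient, the binary form can be computed from the four confusion-matrix counts. The sketch below is an illustration under that assumption; the function name and the example counts are made up.

```python
import math

def matthews_coefficient(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix (range -1 to 1)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

# A classifier that labels everything "negative" on a 90/10 split scores 90% accuracy,
# yet its coefficient is 0, which is why the measure suits unequal class priors.
always_negative = matthews_coefficient(tp=0, tn=90, fp=0, fn=10)
perfect = matthews_coefficient(tp=10, tn=90, fp=0, fn=0)
print(always_negative, perfect)
```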
Towards brain-activity-controlled information retrieval: Decoding image relevance from MEG signals.
Kauppi, Jukka-Pekka; Kandemir, Melih; Saarinen, Veli-Matti; Hirvenkari, Lotta; Parkkonen, Lauri; Klami, Arto; Hari, Riitta; Kaski, Samuel
2015-05-15
We hypothesize that brain activity can be used to control future information retrieval systems. To this end, we conducted a feasibility study on predicting the relevance of visual objects from brain activity. We analyze both magnetoencephalographic (MEG) and gaze signals from nine subjects who were viewing image collages, a subset of which was relevant to a predetermined task. We report three findings: i) the relevance of an image a subject looks at can be decoded from MEG signals with performance significantly better than chance, ii) fusion of gaze-based and MEG-based classifiers significantly improves the prediction performance compared to using either signal alone, and iii) non-linear classification of the MEG signals using Gaussian process classifiers outperforms linear classification. These findings break new ground for building brain-activity-based interactive image retrieval systems, as well as for systems utilizing feedback both from brain activity and eye movements. Copyright © 2015 Elsevier Inc. All rights reserved.
2013-01-01
Background: Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities, computer-based training, human-computer interaction, etc. Electrocardiogram (ECG) signals, being an activity of the autonomic nervous system (ANS), reflect the underlying true emotional state of a person. However, the methods developed so far lack accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods: Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers: Bayesian Classifier, Regression Tree, K-nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results: Analysis of Variance (ANOVA) conveyed that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better, with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject independent validation respectively. Conclusions: The results indicate that the combination of non-linear analysis and HOS tends to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine-tuned to develop a real-time system. PMID:23680041
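For readers unfamiliar with the rescaled range approach mentioned above, the following is a minimal textbook-style sketch of estimating a Hurst exponent from R/S statistics. It is not the authors' implementation, it omits the FVS and HOS variants entirely, and the window sizes and the synthetic noise input are assumptions.

```python
import math
import random

def rescaled_range(series):
    """R/S statistic: range of the cumulative mean-adjusted sums over the standard deviation."""
    n = len(series)
    mean = sum(series) / n
    cumulative, running = [], 0.0
    for value in series:
        running += value - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return spread / std

def hurst_exponent(series, window_sizes=(8, 16, 32, 64)):
    """Least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for size in window_sizes:
        chunks = [series[i:i + size] for i in range(0, len(series) - size + 1, size)]
        mean_rs = sum(rescaled_range(c) for c in chunks) / len(chunks)
        xs.append(math.log(size))
        ys.append(math.log(mean_rs))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    slope_num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope_den = sum((x - x_bar) ** 2 for x in xs)
    return slope_num / slope_den

# Synthetic stand-in for a signal: uncorrelated Gaussian noise.
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(256)]
print(hurst_exponent(noise))
```

For uncorrelated noise the estimate is expected to fall roughly around 0.5, although short windows tend to bias it upward; persistent signals give larger values.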
Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients
NASA Astrophysics Data System (ADS)
Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad
2010-12-01
There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.
Why Does Rebalancing Class-Unbalanced Data Improve AUC for Linear Discriminant Analysis?
Xue, Jing-Hao; Hall, Peter
2015-05-01
Many established classifiers fail to identify the minority class when it is much smaller than the majority class. To tackle this problem, researchers often first rebalance the class sizes in the training dataset, through oversampling the minority class or undersampling the majority class, and then use the rebalanced data to train the classifiers. This leads to interesting empirical patterns. In particular, using the rebalanced training data can often improve the area under the receiver operating characteristic curve (AUC) for the original, unbalanced test data. The AUC is a widely-used quantitative measure of classification performance, but the property that it increases with rebalancing has, as yet, no theoretical explanation. In this note, using Gaussian-based linear discriminant analysis (LDA) as the classifier, we demonstrate that, at least for LDA, there is an intrinsic, positive relationship between the rebalancing of class sizes and the improvement of AUC. We show that the largest improvement of AUC is achieved, asymptotically, when the two classes are fully rebalanced to be of equal sizes.
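The two ingredients discussed above, AUC and rebalancing by oversampling, can each be written in a few lines. The sketch below is illustrative only: it computes AUC as the fraction of correctly ordered positive/negative pairs and rebalances by resampling the minority class with replacement. It does not reproduce the paper's LDA analysis, and the function names and scores are made up.

```python
import random

def auc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            wins += 1.0 if p > n else 0.5 if p == n else 0.0
    return wins / (len(pos_scores) * len(neg_scores))

def oversample_minority(minority, majority, seed=0):
    """Resample the minority class with replacement until both classes have equal size."""
    rng = random.Random(seed)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return minority + extra

# Five of the six positive/negative pairs are ordered correctly here.
print(auc([0.9, 0.8, 0.3], [0.5, 0.2]))
print(len(oversample_minority([1, 2], [0] * 10)))
```

Because AUC depends only on how scores are ranked across the two classes, it can be evaluated on the original unbalanced test data even when the classifier was trained on a rebalanced set.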
Evaluation of an enhanced gravity-based fine-coal circuit for high-sulfur coal
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mohanty, M.K.; Samal, A.R.; Palit, A.
One of the main objectives of this study was to evaluate a fine-coal cleaning circuit using an enhanced gravity separator specifically for a high sulfur coal application. The evaluation not only included testing of individual unit operations used for fine-coal classification, cleaning and dewatering, but also included testing of the complete circuit simultaneously. At a scale of nearly 2 t/h, two alternative circuits were evaluated to clean a minus 0.6-mm coal stream utilizing a 150-mm-diameter classifying cyclone, a linear screen having a projected surface area of 0.5 m², an enhanced gravity separator having a bowl diameter of 250 mm and a screen-bowl centrifuge having a bowl diameter of 500 mm. The cleaning and dewatering components of both circuits were the same; however, one circuit used a classifying cyclone whereas the other used a linear screen as the classification device. An industrial size coal spiral was used to clean the 2- x 0.6-mm coal size fraction for each circuit to estimate the performance of a complete fine-coal circuit cleaning a minus 2-mm particle size coal stream. The 'linear screen + enhanced gravity separator + screen-bowl circuit' provided superior sulfur and ash-cleaning performance to the alternative circuit that used a classifying cyclone in place of the linear screen. Based on these test data, it was estimated that the use of the recommended circuit to treat 50 t/h of minus 2-mm size coal having feed ash and sulfur contents of 33.9% and 3.28%, respectively, may produce nearly 28.3 t/h of clean coal with product ash and sulfur contents of 9.15% and 1.61%, respectively.
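The figures quoted in the final sentence imply a mass yield and ash and sulfur rejections that the abstract does not state explicitly. The short calculation below derives them by simple mass balance from the quoted numbers; the variable names and the `rejection` helper are illustrative, and the derived percentages are not values reported by the authors.

```python
feed_rate = 50.0       # t/h of minus 2-mm feed coal
product_rate = 28.3    # t/h of clean coal
feed_ash, feed_sulfur = 0.339, 0.0328        # mass fractions in the feed
product_ash, product_sulfur = 0.0915, 0.0161  # mass fractions in the product

def rejection(feed_rate, feed_grade, product_rate, product_grade):
    """Fraction of a contaminant in the feed that does not report to the clean product."""
    return 1.0 - (product_rate * product_grade) / (feed_rate * feed_grade)

mass_yield = product_rate / feed_rate
ash_rejection = rejection(feed_rate, feed_ash, product_rate, product_ash)
sulfur_rejection = rejection(feed_rate, feed_sulfur, product_rate, product_sulfur)

print(f"yield {mass_yield:.1%}, ash rejection {ash_rejection:.1%}, "
      f"sulfur rejection {sulfur_rejection:.1%}")
```

On these numbers roughly 57% of the feed mass is recovered as product while about 85% of the ash and 72% of the sulfur are removed.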
Kuhlmann, Levin; Manton, Jonathan H; Heyse, Bjorn; Vereecke, Hugo E M; Lipping, Tarmo; Struys, Michel M R F; Liley, David T J
2017-04-01
Tracking brain states with electrophysiological measurements often relies on short-term averages of extracted features, and this may not adequately capture the variability of brain dynamics. The objective is to assess the hypotheses that this can be overcome by tracking distributions of linear models using anesthesia data, and that anesthetic brain state tracking performance of linear models is comparable to that of a high performing depth of anesthesia monitoring feature. Individuals' brain states are classified by comparing the distribution of linear (auto-regressive moving average, ARMA) model parameters estimated from electroencephalographic (EEG) data obtained with a sliding window to distributions of linear model parameters for each brain state. The method is applied to frontal EEG data from 15 subjects undergoing propofol anesthesia and classified by the observer's assessment of alertness/sedation (OAA/S) scale. Classification of the OAA/S score was performed using distributions of either ARMA parameters or the benchmark feature, Higuchi fractal dimension. The highest average testing sensitivity of 59% (chance sensitivity: 17%) was found for ARMA (2,1) models, and Higuchi fractal dimension achieved 52%; however, no statistical difference was observed. For the same ARMA case, there was no statistical difference if medians are used instead of distributions (sensitivity: 56%). The model-based distribution approach is not necessarily more effective than a median/short-term average approach; however, it performs well compared with a distribution approach based on a high performing anesthesia monitoring measure. These techniques hold potential for anesthesia monitoring and may be generally applicable for tracking brain states.
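The quoted chance sensitivity of 17% is consistent with six equally likely classes (1/6 ≈ 0.167), although the abstract does not state the number of classes. As a hedged illustration of how an average per-class sensitivity is obtained from a confusion matrix, here is a minimal sketch; the function name and the two-class example matrix are made up.

```python
def mean_sensitivity(confusion):
    """Average per-class recall; confusion[i][j] counts true class i predicted as class j."""
    recalls = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return sum(recalls) / len(recalls)

# Hypothetical two-class example: recalls of 0.8 and 0.5 average to 0.65.
example = [[8, 2],
           [5, 5]]
print(mean_sensitivity(example))

# Chance level if a classifier guessed uniformly among six classes (an assumption).
chance_six_classes = 1 / 6
print(round(chance_six_classes, 2))
```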
EEG-based mild depressive detection using feature selection methods and classifiers.
Li, Xiaowei; Hu, Bin; Sun, Shuting; Cai, Hanshu
2016-11-01
Depression has become a major health burden worldwide, and effective detection of this disorder is a great challenge that requires the latest technological tools, such as electroencephalography (EEG). This EEG-based research seeks to find the prominent frequency bands and brain regions that are most related to mild depression, as well as an optimal combination of classification algorithms and feature selection methods that can be used in future mild depression detection. An experiment based on a facial expression viewing task (Emo_block and Neu_block) was conducted, and EEG data of 37 university students were collected using a 128-channel HydroCel Geodesic Sensor Net (HCGSN). For discriminating mild depressive patients from normal controls, BayesNet (BN), Support Vector Machine (SVM), Logistic Regression (LR), k-nearest neighbor (KNN) and RandomForest (RF) classifiers were used. BestFirst (BF), GreedyStepwise (GSW), GeneticSearch (GS), LinearForwardSelection (LFS) and RankSearch (RS) based on Correlation Features Selection (CFS) were applied for linear and non-linear EEG feature selection. An independent samples t-test with Bonferroni correction was used to find the significantly discriminant electrodes and features. Data mining results indicate that optimal performance is achieved using a combination of the feature selection method GSW based on CFS and the classifier KNN for the beta frequency band. Accuracies reached 92.00% and 98.00%, and AUC reached 0.957 and 0.997, for Emo_block and Neu_block beta band data respectively. T-test results validate the effectiveness of the features selected by the search method GSW. A simplified EEG system with only the FP1, FP2, F3, O2 and T3 electrodes was also explored with linear features, which yielded accuracies of 91.70% and 96.00% and AUC of 0.952 and 0.972 for Emo_block and Neu_block respectively. Classification results obtained by GSW + KNN are encouraging and better than previously published results. In the spatial distribution of features, we find that the left parietotemporal lobe in the beta EEG frequency band has a greater effect on mild depression detection. Fewer EEG channels (FP1, FP2, F3, O2 and T3) combined with linear features may be good candidates for use in portable systems for mild depression detection. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
The effects of pre-processing strategies in sentiment analysis of online movie reviews
NASA Astrophysics Data System (ADS)
Zin, Harnani Mat; Mustapha, Norwati; Murad, Masrah Azrifah Azmi; Sharef, Nurfadhlina Mohd
2017-10-01
With the ever-increasing number of internet applications and social networking sites, people nowadays can easily express their feelings towards any products and services. These online reviews act as an important source for further analysis and improved decision making. The reviews are mostly unstructured by nature and thus need processing, such as sentiment analysis and classification, to provide meaningful information for future uses. In text analysis tasks, the appropriate selection of words/features has a huge impact on the effectiveness of the classifier. Thus, this paper explores the effect of pre-processing strategies on the sentiment analysis of online movie reviews. In this paper, a supervised machine learning method was used to classify the reviews. The support vector machine (SVM) with linear and non-linear kernels was considered as the classifier for the reviews. The performance of the classifier is critically examined based on precision, recall, f-measure, and accuracy. Two different feature representations were used: term frequency and term frequency-inverse document frequency. Results show that the pre-processing strategies have a significant impact on the classification process.
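The two feature representations named above can be sketched in a few lines. The version below uses relative term frequency and an unsmoothed inverse document frequency, log(N / df); this is one common variant and an assumption here, since the abstract does not say which weighting formula or tokenisation was used. The toy reviews are made up.

```python
import math
from collections import Counter

def term_frequency(tokens):
    """Relative frequency of each term within one tokenised document."""
    counts = Counter(tokens)
    return {term: count / len(tokens) for term, count in counts.items()}

def tf_idf(documents):
    """Weight each term's frequency by log(N / number of documents containing it)."""
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weighted = []
    for doc in documents:
        tf = term_frequency(doc)
        weighted.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return weighted

# Already-tokenised toy reviews; "movie" and "great" each occur in two of three documents.
reviews = [["great", "movie"], ["boring", "movie"], ["great", "acting"]]
weights = tf_idf(reviews)
print(weights[1])
```

In the second review the rarer word "boring" receives a higher weight than "movie", which is the intended effect of the inverse document frequency factor; pre-processing choices such as stop-word removal change which terms reach this step at all.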
Anytime query-tuned kernel machine classifiers via Cholesky factorization
NASA Technical Reports Server (NTRS)
DeCoste, D.
2002-01-01
We recently demonstrated 2- to 64-fold query-time speedups of Support Vector Machine and Kernel Fisher classifiers via a new computational geometry method for anytime output bounds (DeCoste, 2002). This new paper refines our approach in two key ways. First, we introduce a simple linear algebra formulation based on Cholesky factorization, yielding simpler equations and lower computational overhead. Second, this new formulation suggests new methods for achieving additional speedups, including tuning on query samples. We demonstrate effectiveness on benchmark datasets.
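For reference, the Cholesky factorization itself (not the paper's anytime-bound method, which the abstract does not detail) decomposes a symmetric positive definite matrix, such as a kernel matrix, into a lower-triangular factor times its transpose. The following is a minimal plain-Python sketch; the 2x2 matrix is a made-up example.

```python
import math

def cholesky(matrix):
    """Return lower-triangular L with L * L^T == matrix (symmetric positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)  # diagonal entry
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]  # below-diagonal entry
    return lower

# Hypothetical 2x2 kernel (Gram) matrix.
kernel = [[4.0, 2.0],
          [2.0, 5.0]]
factor = cholesky(kernel)
print(factor)
```

Because the factor is built one row at a time, partial results are available before the whole matrix has been processed, which may be part of what makes the factorization attractive for incremental computation.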
Accuracy of automated classification of major depressive disorder as a function of symptom severity.
Ramasubbu, Rajamannar; Brown, Matthew R G; Cortese, Filmeno; Gaxiola, Ismael; Goodyear, Bradley; Greenshaw, Andrew J; Dursun, Serdar M; Greiner, Russell
2016-01-01
Growing evidence documents the potential of machine learning for developing brain based diagnostic methods for major depressive disorder (MDD). As symptom severity may influence brain activity, we investigated whether the severity of MDD affected the accuracies of machine learned MDD-vs-Control diagnostic classifiers. Forty-five medication-free patients with DSM-IV defined MDD and 19 healthy controls participated in the study. Based on depression severity as determined by the Hamilton Rating Scale for Depression (HRSD), MDD patients were sorted into three groups: mild to moderate depression (HRSD 14-19), severe depression (HRSD 20-23), and very severe depression (HRSD ≥ 24). We collected functional magnetic resonance imaging (fMRI) data during both resting-state and an emotional-face matching task. Patients in each of the three severity groups were compared against controls in separate analyses, using either the resting-state or task-based fMRI data. We use each of these six datasets with linear support vector machine (SVM) binary classifiers for identifying individuals as patients or controls. The resting-state fMRI data showed statistically significant classification accuracy only for the very severe depression group (accuracy 66%, p = 0.012 corrected), while mild to moderate (accuracy 58%, p = 1.0 corrected) and severe depression (accuracy 52%, p = 1.0 corrected) were only at chance. With task-based fMRI data, the automated classifier performed at chance in all three severity groups. Binary linear SVM classifiers achieved significant classification of very severe depression with resting-state fMRI, but the contribution of brain measurements may have limited potential in differentiating patients with less severe depression from healthy controls.
Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math.
Raizada, Rajeev D S; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D; Ansari, Daniel; Kuhl, Patricia K
2010-05-15
A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain-behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps such as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain-behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain-behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math
Raizada, Rajeev D.S.; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D.; Ansari, Daniel; Kuhl, Patricia K.
2010-01-01
A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain–behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps such as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain–behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain–behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. PMID:20132896
A Novel Local Learning based Approach With Application to Breast Cancer Diagnosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Songhua; Tourassi, Georgia
2012-01-01
The purpose of this study is to develop and evaluate a novel local learning-based approach for computer-assisted diagnosis of breast cancer. Our new local learning based algorithm, which uses the linear logistic regression method as its base learner, is described. Overall, our algorithm performs its stochastic searching process until the total allowed computing time is used up by our random walk process in identifying the most suitable population subdivision scheme and the corresponding individual base learners. The proposed local learning-based approach was applied for the prediction of breast cancer given 11 mammographic and clinical findings reported by physicians using the BI-RADS lexicon. Our database consisted of 850 patients with biopsy confirmed diagnosis (290 malignant and 560 benign). We also compared the performance of our method with a collection of publicly available state-of-the-art machine learning methods. Predictive performance for all classifiers was evaluated using 10-fold cross validation and Receiver Operating Characteristics (ROC) analysis. Figure 1 reports the performance of 54 machine learning methods implemented in the machine learning toolkit Weka (version 3.0). We introduced a novel local learning-based classifier and compared it with an extensive list of other classifiers for the problem of breast cancer diagnosis. Our experiments show that the algorithm achieves superior prediction performance, outperforming a wide range of other well established machine learning techniques. Our conclusion complements the existing understanding in the machine learning field that local learning may capture complicated, non-linear relationships exhibited by real-world datasets.
Yourganov, Grigori; Schmah, Tanya; Churchill, Nathan W; Berman, Marc G; Grady, Cheryl L; Strother, Stephen C
2014-08-01
The field of fMRI data analysis is rapidly growing in sophistication, particularly in the domain of multivariate pattern classification. However, the interaction between the properties of the analytical model and the parameters of the BOLD signal (e.g. signal magnitude, temporal variance and functional connectivity) is still an open problem. We addressed this problem by evaluating a set of pattern classification algorithms on simulated and experimental block-design fMRI data. The set of classifiers consisted of linear and quadratic discriminants, linear support vector machine, and linear and nonlinear Gaussian naive Bayes classifiers. For linear discriminant, we used two methods of regularization: principal component analysis, and ridge regularization. The classifiers were used (1) to classify the volumes according to the behavioral task that was performed by the subject, and (2) to construct spatial maps that indicated the relative contribution of each voxel to classification. Our evaluation metrics were: (1) accuracy of out-of-sample classification and (2) reproducibility of spatial maps. In simulated data sets, we performed an additional evaluation of spatial maps with ROC analysis. We varied the magnitude, temporal variance and connectivity of simulated fMRI signal and identified the optimal classifier for each simulated environment. Overall, the best performers were linear and quadratic discriminants (operating on principal components of the data matrix) and, in some rare situations, a nonlinear Gaussian naïve Bayes classifier. The results from the simulated data were supported by within-subject analysis of experimental fMRI data, collected in a study of aging. This is the first study that systematically characterizes interactions between analysis model and signal parameters (such as magnitude, variance and correlation) on the performance of pattern classifiers for fMRI. Copyright © 2014 Elsevier Inc. All rights reserved.
SVM and SVM Ensembles in Breast Cancer Prediction
Huang, Min-Wei; Chen, Chih-Wen; Lin, Wei-Chao; Ke, Shih-Wen; Tsai, Chih-Fong
2017-01-01
Breast cancer is an all too common disease in women, making how to effectively predict it an active research problem. A number of statistical and machine learning techniques have been employed to develop various breast cancer prediction models. Among them, support vector machines (SVM) have been shown to outperform many related techniques. To construct the SVM classifier, it is first necessary to decide the kernel function, and different kernel functions can result in different prediction performance. However, there have been very few studies focused on examining the prediction performances of SVM based on different kernel functions. Moreover, it is unknown whether SVM classifier ensembles which have been proposed to improve the performance of single classifiers can outperform single SVM classifiers in terms of breast cancer prediction. Therefore, the aim of this paper is to fully assess the prediction performance of SVM and SVM ensembles over small and large scale breast cancer datasets. The classification accuracy, ROC, F-measure, and computational times of training SVM and SVM ensembles are compared. The experimental results show that linear kernel based SVM ensembles based on the bagging method and RBF kernel based SVM ensembles with the boosting method can be the better choices for a small scale dataset, where feature selection should be performed in the data pre-processing stage. For a large scale dataset, RBF kernel based SVM ensembles based on boosting perform better than the other classifiers. PMID:28060807
Non-linear molecular pattern classification using molecular beacons with multiple targets.
Lee, In-Hee; Lee, Seung Hwan; Park, Tai Hyun; Zhang, Byoung-Tak
2013-12-01
In vitro pattern classification has been highlighted as an important future application of DNA computing. Previous work has demonstrated the feasibility of linear classifiers using DNA-based molecular computing. However, complex tasks require non-linear classification capability. Here we design a molecular beacon that can interact with multiple targets and experimentally show that its fluorescent signals form a complex radial-basis function, enabling it to be used as a building block for non-linear molecular classification in vitro. The proposed method was successfully applied to solving artificial and real-world classification problems: XOR and microRNA expression patterns. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
Linear Classifier with Reject Option for the Detection of Vocal Fold Paralysis and Vocal Fold Edema
NASA Astrophysics Data System (ADS)
Kotropoulos, Constantine; Arce, Gonzalo R.
2009-12-01
Two distinct two-class pattern recognition problems are studied, namely, the detection of male subjects who are diagnosed with vocal fold paralysis against male subjects who are diagnosed as normal and the detection of female subjects who are suffering from vocal fold edema against female subjects who do not suffer from any voice pathology. To do so, utterances of the sustained vowel "ah" are employed from the Massachusetts Eye and Ear Infirmary database of disordered speech. Linear prediction coefficients extracted from the aforementioned utterances are used as features. The receiver operating characteristic curve of the linear classifier, that stems from the Bayes classifier when Gaussian class conditional probability density functions with equal covariance matrices are assumed, is derived. The optimal operating point of the linear classifier is specified with and without reject option. First results using utterances of the "rainbow passage" are also reported for completeness. The reject option is shown to yield statistically significant improvements in the accuracy of detecting the voice pathologies under study.
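The linear rule with a reject option described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes two Gaussian classes with a shared diagonal covariance (a simplification chosen to avoid a matrix inverse), and the function names, the `threshold` parameter, and the posterior-based rejection rule are all hypothetical choices made for the example.

```python
import math

def fit_linear_rule(mu0, mu1, var, prior1=0.5):
    """Weights and bias of the Bayes rule for two Gaussian classes that
    share a diagonal covariance (assumption made for this sketch)."""
    w = [(m1 - m0) / v for m0, m1, v in zip(mu0, mu1, var)]
    # Bias: -0.5 * (mu1' S^-1 mu1 - mu0' S^-1 mu0) + log prior odds
    b = -0.5 * sum((m1 * m1 - m0 * m0) / v for m0, m1, v in zip(mu0, mu1, var))
    b += math.log(prior1 / (1.0 - prior1))
    return w, b

def classify_with_reject(x, w, b, threshold=0.8):
    """Return 0 or 1, or "reject" when neither posterior reaches threshold."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    p1 = 1.0 / (1.0 + math.exp(-score))  # posterior of class 1
    if max(p1, 1.0 - p1) < threshold:
        return "reject"
    return 1 if p1 >= 0.5 else 0

# Hypothetical class means and variances, for illustration only.
w, b = fit_linear_rule(mu0=[0.0, 0.0], mu1=[2.0, 2.0], var=[1.0, 1.0])
decisions = [classify_with_reject(x, w, b) for x in ([0.0, 0.0], [1.0, 1.0], [2.0, 2.0])]
```

Raising `threshold` rejects more samples near the decision boundary, trading coverage for accuracy on the samples that are still classified.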
Predictive models reduce talent development costs in female gymnastics.
Pion, Johan; Hohmann, Andreas; Liu, Tianbiao; Lenoir, Matthieu; Segers, Veerle
2017-04-01
This retrospective study focuses on the comparison of different predictive models based on the results of a talent identification test battery for female gymnasts. We studied to what extent these models have the potential to optimise selection procedures, and at the same time reduce talent development costs in female artistic gymnastics. The dropout rate of 243 female elite gymnasts was investigated, 5 years past talent selection, using linear (discriminant analysis) and non-linear predictive models (Kohonen feature maps and multilayer perceptron). The coaches classified 51.9% of the participants correctly. Discriminant analysis improved the correct classification to 71.6% while the non-linear technique of Kohonen feature maps reached 73.7% correctness. Application of the multilayer perceptron even classified 79.8% of the gymnasts correctly. The combination of different predictive models for talent selection can avoid deselection of high-potential female gymnasts. The selection procedure based upon the different statistical analyses results in a 33.3% decrease in cost because the pool of selected athletes can be reduced to 92 instead of 138 gymnasts (as selected by the coaches). Reduction of the costs allows the limited resources to be fully invested in the high-potential athletes.
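The 33.3% figure follows directly from the two pool sizes quoted in the abstract. The short check below assumes, as the abstract implies but does not state outright, that cost scales in proportion to the number of selected athletes; the function and variable names are illustrative.

```python
def relative_reduction(before, after):
    """Fractional decrease when going from `before` to `after`."""
    return (before - after) / before

coach_pool = 138   # gymnasts selected by the coaches
model_pool = 92    # gymnasts selected by the combined statistical models

reduction = relative_reduction(coach_pool, model_pool)
print(f"{reduction:.1%}")  # prints 33.3%
```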
Geometry-based ensembles: toward a structural characterization of the classification boundary.
Pujol, Oriol; Masip, David
2009-06-01
This paper introduces a novel binary discriminative learning technique based on the approximation of the nonlinear decision boundary by a piecewise linear smooth additive model. The decision border is geometrically defined by means of the characterizing boundary points, that is, points that belong to the optimal boundary under a certain notion of robustness. Based on these points, a set of locally robust linear classifiers is defined and assembled by means of a Tikhonov regularized optimization procedure in an additive model to create a final lambda-smooth decision rule. As a result, a very simple and robust classifier with a strong geometrical meaning and nonlinear behavior is obtained. The simplicity of the method allows its extension to cope with some of today's machine learning challenges, such as online learning, large-scale learning or parallelization, with linear computational complexity. We validate our approach on the UCI database, comparing with several state-of-the-art classification techniques. Finally, we apply our technique in online and large-scale scenarios and in six real-life computer vision and pattern recognition problems: gender recognition based on face images, intravascular ultrasound tissue classification, speed traffic sign detection, Chagas' disease myocardial damage severity detection, old musical scores clef classification, and action recognition using 3D accelerometer data from a wearable device. The results are promising and this paper opens a line of research that deserves further attention.
Multi-view L2-SVM and its multi-view core vector machine.
Huang, Chengquan; Chung, Fu-lai; Wang, Shitong
2016-03-01
In this paper, a novel L2-SVM based classifier Multi-view L2-SVM is proposed to address multi-view classification tasks. The proposed Multi-view L2-SVM classifier does not have any bias in its objective function and hence has the flexibility like μ-SVC in the sense that the number of the yielded support vectors can be controlled by a pre-specified parameter. The proposed Multi-view L2-SVM classifier can make full use of the coherence and the difference of different views through imposing the consensus among multiple views to improve the overall classification performance. Besides, based on the generalized core vector machine GCVM, the proposed Multi-view L2-SVM classifier is extended into its GCVM version MvCVM which can realize its fast training on large scale multi-view datasets, with its asymptotic linear time complexity with the sample size and its space complexity independent of the sample size. Our experimental results demonstrated the effectiveness of the proposed Multi-view L2-SVM classifier for small scale multi-view datasets and the proposed MvCVM classifier for large scale multi-view datasets. Copyright © 2015 Elsevier Ltd. All rights reserved.
Ritchie, J Brendan; Carlson, Thomas A
2016-01-01
A fundamental challenge for cognitive neuroscience is characterizing how the primitives of psychological theory are neurally implemented. Attempts to meet this challenge are a manifestation of what Fechner called "inner" psychophysics: the theory of the precise mapping between mental quantities and the brain. In his own time, inner psychophysics remained an unrealized ambition for Fechner. We suggest that, today, multivariate pattern analysis (MVPA), or neural "decoding," methods provide a promising starting point for developing an inner psychophysics. A cornerstone of these methods are simple linear classifiers applied to neural activity in high-dimensional activation spaces. We describe an approach to inner psychophysics based on the shared architecture of linear classifiers and observers under decision boundary models such as signal detection theory. Under this approach, distance from a decision boundary through activation space, as estimated by linear classifiers, can be used to predict reaction time in accordance with signal detection theory, and distance-to-bound models of reaction time. Our "neural distance-to-bound" approach is potentially quite general, and simple to implement. Furthermore, our recent work on visual object recognition suggests it is empirically viable. We believe the approach constitutes an important step along the path to an inner psychophysics that links mind, brain, and behavior.
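The quantity at the center of the "neural distance-to-bound" idea is the signed distance of an activation pattern from a linear classifier's hyperplane. A minimal sketch is given below; the weights `w`, bias `b`, the example patterns, and the helper `order_slowest_to_fastest` are hypothetical and only illustrate the predicted ordering (patterns closer to the boundary are expected to yield slower responses), not the authors' analysis pipeline.

```python
import math

def signed_distance(x, w, b):
    """Signed Euclidean distance of pattern x from the hyperplane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

def order_slowest_to_fastest(patterns, w, b):
    """Indices of patterns sorted by increasing |distance| to the boundary,
    i.e. from predicted slowest to predicted fastest reaction time."""
    distances = [abs(signed_distance(x, w, b)) for x in patterns]
    return sorted(range(len(patterns)), key=lambda i: distances[i])

# Hypothetical classifier and activation patterns.
w, b = [3.0, 4.0], -5.0
patterns = [[3.0, 4.0], [1.0, 0.5], [0.0, 0.0]]
order = order_slowest_to_fastest(patterns, w, b)
```

In practice the distances would be compared with measured reaction times, for example through a rank correlation; that step is left out here.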
On the design of classifiers for crop inventories
NASA Technical Reports Server (NTRS)
Heydorn, R. P.; Takacs, H. C.
1986-01-01
Crop proportion estimators that use classifications of satellite data to correct, in an additive way, a given estimate acquired from ground observations are discussed. A linear version of these estimators is optimal, in terms of minimum variance, when the regression of the ground observations onto the satellite observations is linear. When this regression is not linear, but the reverse regression (satellite observations onto ground observations) is linear, the estimator is suboptimal but still has certain appealing variance properties. In this paper expressions are derived for those regressions which relate the intercepts and slopes to conditional classification probabilities. These expressions are then used to discuss the question of classifier designs that can lead to low-variance crop proportion estimates. Variance expressions for these estimates in terms of classifier omission and commission errors are also derived.
Naseer, Noman; Hong, Keum-Shik
2013-10-11
This paper presents a study on functional near-infrared spectroscopy (fNIRS) indicating that the hemodynamic responses of the right- and left-wrist motor imageries have distinct patterns that can be classified using a linear classifier for the purpose of developing a brain-computer interface (BCI). Ten healthy participants were instructed to imagine kinesthetically the right- or left-wrist flexion indicated on a computer screen. Signals from the right and left primary motor cortices were acquired simultaneously using a multi-channel continuous-wave fNIRS system. Using two distinct features (the mean and the slope of change in the oxygenated hemoglobin concentration), the linear discriminant analysis classifier was used to classify the right- and left-wrist motor imageries resulting in average classification accuracies of 73.35% and 83.0%, respectively, during the 10s task period. Moreover, when the analysis time was confined to the 2-7s span within the overall 10s task period, the average classification accuracies were improved to 77.56% and 87.28%, respectively. These results demonstrate the feasibility of an fNIRS-based BCI and the enhanced performance of the classifier by removing the initial 2s span and/or the time span after the peak value. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Walker, H. F.
1979-01-01
In many pattern recognition problems, data vectors are classified although one or more of the data vector elements are missing. This problem occurs in remote sensing when the ground is obscured by clouds. Optimal linear discrimination procedures for classifying incomplete data vectors are discussed.
Abnormality detection of mammograms by discriminative dictionary learning on DSIFT descriptors.
Tavakoli, Nasrin; Karimi, Maryam; Nejati, Mansour; Karimi, Nader; Reza Soroushmehr, S M; Samavi, Shadrokh; Najarian, Kayvan
2017-07-01
Detection and classification of breast lesions using mammographic images is one of the most difficult tasks in medical image processing. A number of learning and non-learning methods have been proposed for detecting and classifying these lesions. However, the accuracy of the detection/classification still needs improvement. In this paper we propose a powerful classification method based on sparse learning to diagnose breast cancer in mammograms. For this purpose, a supervised discriminative dictionary learning approach is applied on dense scale invariant feature transform (DSIFT) features. A linear classifier, which can effectively classify the sparse representations, is learned simultaneously with the dictionary. Our experimental results show the superior performance of our method compared to existing approaches.
A face and palmprint recognition approach based on discriminant DCT feature extraction.
Jing, Xiao-Yuan; Zhang, David
2004-12-01
In the field of image processing and recognition, discrete cosine transform (DCT) and linear discrimination are two widely used techniques. Based on them, we present a new face and palmprint recognition approach in this paper. It first uses a two-dimensional separability judgment to select the DCT frequency bands with favorable linear separability. Then from the selected bands, it extracts the linear discriminative features by an improved Fisherface method and performs the classification by the nearest neighbor classifier. We analyze in detail the theoretical advantages of our approach in feature extraction. The experiments on face databases and a palmprint database demonstrate that, compared to the state-of-the-art linear discrimination methods, our approach obtains better classification performance. It can significantly improve the recognition rates for face and palmprint data and effectively reduce the dimension of feature space.
Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.
Chen, Shizhi; Yang, Xiaodong; Tian, Yingli
2015-09-01
A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, with significantly lower computation cost and memory requirement.
Multiobjective GAs, quantitative indices, and pattern classification.
Bandyopadhyay, Sanghamitra; Pal, Sankar K; Aruna, B
2004-10-01
The concept of multiobjective optimization (MOO) has been integrated with variable length chromosomes for the development of a nonparametric genetic classifier which can overcome the problems, like overfitting/overlearning and ignoring smaller classes, as faced by single objective classifiers. The classifier can efficiently approximate any kind of linear and/or nonlinear class boundaries of a data set using an appropriate number of hyperplanes. While designing the classifier the aim is to simultaneously minimize the number of misclassified training points and the number of hyperplanes, and to maximize the product of class wise recognition scores. The concepts of validation set (in addition to training and test sets) and validation functional are introduced in the multiobjective classifier for selecting a solution from a set of nondominated solutions provided by the MOO algorithm. This genetic classifier incorporates elitism and some domain specific constraints in the search process, and is called the CEMOGA-Classifier (constrained elitist multiobjective genetic algorithm based classifier). Two new quantitative indices, namely, the purity and minimal spacing, are developed for evaluating the performance of different MOO techniques. These are used, along with classification accuracy, required number of hyperplanes and the computation time, to compare the CEMOGA-Classifier with other related ones.
Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea
2016-01-01
Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future. PMID:28167896
Chao, Pei-Kuang; Wang, Chun-Li; Chan, Hsiao-Lung
2012-03-01
Predicting response after cardiac resynchronization therapy (CRT) has been a challenge for cardiologists. About 30% of patients selected on the standard selection criteria for CRT do not show response after receiving the treatment. This study aims to build an intelligent classifier to assist in identifying potential CRT responders by speckle-tracking radial strain based on echocardiograms. The echocardiograms analyzed were acquired before CRT from 26 patients who have received CRT. Sequential forward selection was performed on the parameters obtained by peak-strain timing and phase space reconstruction on speckle-tracking radial strain to find an optimal set of features for creating intelligent classifiers. Support vector machines (SVM) with linear, quadratic, and polynomial kernels were tested to build classifiers to identify potential responders and non-responders for CRT by selected features. Based on random sub-sampling validation, the best classification performance is a correct rate of about 95% with 96-97% sensitivity and 93-94% specificity, achieved by applying SVM with a quadratic kernel on a set of 3 parameters. The selected 3 parameters contain both indexes extracted by peak-strain timing and phase space reconstruction. An intelligent classifier with an averaged correct rate, sensitivity and specificity above 90% for assisting in identifying CRT responders is built by speckle-tracking radial strain. The classifier can be applied to provide objective suggestions for patient selection for CRT. Copyright © 2011 Elsevier B.V. All rights reserved.
Machine learning-based methods for prediction of linear B-cell epitopes.
Wang, Hsin-Wei; Pai, Tun-Wen
2014-01-01
B-cell epitope prediction helps immunologists design peptide-based vaccines, diagnostic tests, disease prevention, treatment, and antibody production. In comparison with T-cell epitope prediction, the performance of variable length B-cell epitope prediction is not yet satisfactory. Fortunately, due to increasingly available verified epitope databases, bioinformaticians could adopt machine learning-based algorithms on all curated data to design an improved prediction tool for biomedical researchers. Here, we have reviewed related epitope prediction papers, especially those for linear B-cell epitope prediction. It should be noted that a combination of selected propensity scales and statistics of epitope residues with machine learning-based tools forms a general approach to constructing linear B-cell epitope prediction systems. It is also observed from most of the comparison results that the kernel method of the support vector machine (SVM) classifier outperformed other machine learning-based approaches. Hence, in this chapter, in addition to reviewing recently published papers, we have introduced the fundamentals of B-cell epitopes and SVM techniques. In addition, an example of a linear B-cell prediction system based on physicochemical features and amino acid combinations is illustrated in detail.
Thyroid nodule classification using ultrasound elastography via linear discriminant analysis.
Luo, Si; Kim, Eung-Hun; Dighe, Manjiri; Kim, Yongmin
2011-05-01
The non-surgical diagnosis of thyroid nodules is currently made via a fine needle aspiration (FNA) biopsy. It is estimated that somewhere between 250,000 and 300,000 thyroid FNA biopsies are performed in the United States annually. However, a large percentage (approximately 70%) of these biopsies turn out to be benign. Since the aggressive FNA management of thyroid nodules is costly, quantitative risk assessment and stratification of a nodule's malignancy is of value in triage and more appropriate healthcare resources utilization. In this paper, we introduce a new method for classifying the thyroid nodules based on the ultrasound (US) elastography features. Unlike approaches to assess the stiffness of a thyroid nodule by visually inspecting the pseudo-color pattern in the strain image, we use a classification algorithm to stratify the nodule by using the power spectrum of strain rate waveform extracted from the US elastography image sequence. Pulsation from the carotid artery was used to compress the thyroid nodules. Ultrasound data previously acquired from 98 thyroid nodules were used in this retrospective study to evaluate our classification algorithm. A classifier was developed based on the linear discriminant analysis (LDA) and used to differentiate the thyroid nodules into two types: (I) no FNA (observation-only) and (II) FNA. Using our method, 62 nodules were classified as type I, all of which were benign, while 36 nodules were classified as type II, 16 malignant and 20 benign, resulting in a sensitivity of 100% and specificity of 75.6% in detecting malignant thyroid nodules. This indicates that our triage method based on US elastography has the potential to substantially reduce the number of FNA biopsies (63.3%) by detecting benign nodules and managing them via follow-up observations rather than an FNA biopsy. Published by Elsevier B.V.
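The reported sensitivity, specificity, and biopsy reduction can be re-derived from the nodule counts given in the abstract. The sketch below treats type II (FNA) as the positive prediction and malignancy as the positive class; the function name and the dictionary keys are illustrative, not taken from the paper.

```python
def triage_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and fraction of cases triaged to observation."""
    total = tp + fn + tn + fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "biopsies_avoided": (tn + fn) / total,
    }

# Counts from the abstract: 62 type I nodules (all benign),
# 36 type II nodules (16 malignant, 20 benign).
metrics = triage_metrics(tp=16, fn=0, tn=62, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the specificity is 62/82 and the share of avoided biopsies is 62/98, matching the 75.6% and 63.3% quoted in the abstract.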
Sea Ice Detection Based on an Improved Similarity Measurement Method Using Hyperspectral Data.
Han, Yanling; Li, Jue; Zhang, Yun; Hong, Zhonghua; Wang, Jing
2017-05-15
Hyperspectral remote sensing technology can acquire nearly continuous spectrum information and rich sea ice image information, thus providing an important means of sea ice detection. However, the correlation and redundancy among hyperspectral bands reduce the accuracy of traditional sea ice detection methods. Based on the spectral characteristics of sea ice, this study presents an improved similarity measurement method based on linear prediction (ISMLP) to detect sea ice. First, the first original band with a large amount of information is determined based on mutual information theory. Subsequently, a second original band with the least similarity is chosen by the spectral correlation measuring method. Finally, subsequent bands are selected through the linear prediction method, and a support vector machine classifier model is applied to classify sea ice. In experiments performed on images of Baffin Bay and Bohai Bay, comparative analyses were conducted to compare the proposed method and traditional sea ice detection methods. Our proposed ISMLP method achieved the highest classification accuracies (91.18% and 94.22%) in both experiments. From these results the ISMLP method exhibits better performance overall than other methods and can be effectively applied to hyperspectral sea ice detection. PMID:28505135
Optimal number of features as a function of sample size for various classification rules.
Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R
2005-04-15
Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as resource for those working with small-sample classification. 
For the companion website, please visit http://public.tgen.org/tamu/ofs/. Contact: e-dougherty@ee.tamu.edu
EEG feature selection method based on decision tree.
Duan, Lijuan; Ge, Hui; Ma, Wei; Miao, Jun
2015-01-01
This paper aims to solve the automated feature selection problem in brain-computer interfaces (BCI). To automate the feature selection process, we propose a novel EEG feature selection method based on a decision tree (DT). During electroencephalogram (EEG) signal processing, a feature extraction method based on principal component analysis (PCA) was used, and the decision-tree-based selection process was performed by searching the feature space and automatically selecting optimal features. Considering that EEG signals are a series of non-linear signals, a generalized linear classifier, the support vector machine (SVM), was chosen. To test the validity of the proposed method, we applied it to BCI Competition II dataset Ia, and the experiment showed encouraging results.
Bashir, Saba; Qamar, Usman; Khan, Farhan Hassan
2015-06-01
Conventional clinical decision support systems are based on individual classifiers or simple combinations of these classifiers, which tend to show moderate performance. This research paper presents a novel classifier ensemble framework based on an enhanced bagging approach with a multi-objective weighted voting scheme for prediction and analysis of heart disease. The proposed model overcomes the limitations of conventional performance by utilizing an ensemble of five heterogeneous classifiers: Naïve Bayes, linear regression, quadratic discriminant analysis, instance based learner and support vector machines. Five different datasets are used for experimentation, evaluation and validation. The datasets are obtained from publicly available data repositories. Effectiveness of the proposed ensemble is investigated by comparison of results with several classifiers. Prediction results of the proposed ensemble model are assessed by ten-fold cross validation and ANOVA statistics. The experimental evaluation shows that the proposed framework deals with all types of attributes and achieved high diagnosis accuracy of 84.16%, 93.29% sensitivity, 96.70% specificity, and 82.15% f-measure. The f-ratio higher than f-critical and p value less than 0.05 for 95% confidence interval indicate that the results are extremely statistically significant for most of the datasets.
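The weighted voting step mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the multi-objective weighting scheme, so the weights, labels, and the function name `weighted_vote` below are hypothetical and serve only to show how weighted votes are tallied.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label whose supporting classifiers carry the largest total weight.

    predictions: one class label per base classifier
    weights: one non-negative weight per base classifier (illustrative values)
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per classifier is required")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    # Ties are broken by the label that sorts first, so the result is deterministic.
    return min(totals, key=lambda label: (-totals[label], str(label)))

# Hypothetical votes from five base classifiers for one patient record
votes = ["disease", "healthy", "disease", "healthy", "healthy"]
weights = [0.9, 0.4, 0.8, 0.3, 0.5]  # assumed weights, not taken from the paper
print(weighted_vote(votes, weights))
```

In this toy case the two "disease" votes outweigh the three "healthy" votes, which is the behaviour that separates weighted voting from simple majority voting.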
NASA Astrophysics Data System (ADS)
Tahernezhad-Javazm, Farajollah; Azimirad, Vahid; Shoaran, Maryam
2018-04-01
Objective. Considering the importance and the near-future development of noninvasive brain-machine interface (BMI) systems, this paper presents a comprehensive theoretical-experimental survey on the classification and evolutionary methods for BMI-based systems in which EEG signals are used. Approach. The paper is divided into two main parts. In the first part, a wide range of different types of the base and combinatorial classifiers including boosting and bagging classifiers and evolutionary algorithms are reviewed and investigated. In the second part, these classifiers and evolutionary algorithms are assessed and compared based on two types of relatively widely used BMI systems, sensory motor rhythm-BMI and event-related potentials-BMI. Moreover, in the second part, some of the improved evolutionary algorithms as well as bi-objective algorithms are experimentally assessed and compared. Main results. In this study two databases are used, and cross-validation accuracy (CVA) and stability to data volume (SDV) are considered as the evaluation criteria for the classifiers. According to the experimental results on both databases, regarding the base classifiers, linear discriminant analysis and support vector machines with respect to CVA evaluation metric, and naive Bayes with respect to SDV demonstrated the best performances. Among the combinatorial classifiers, four classifiers, Bagg-DT (bagging decision tree), LogitBoost, and GentleBoost with respect to CVA, and Bagging-LR (bagging logistic regression) and AdaBoost (adaptive boosting) with respect to SDV had the best performances. Finally, regarding the evolutionary algorithms, single-objective invasive weed optimization (IWO) and bi-objective nondominated sorting IWO algorithms demonstrated the best performances. Significance. 
We present a general survey on the base and the combinatorial classification methods for EEG signals (sensory motor rhythm and event-related potentials) as well as their optimization methods through the evolutionary algorithms. In addition, experimental and statistical significance tests are carried out to study the applicability and effectiveness of the reviewed methods.
Underwater target classification using wavelet packets and neural networks.
Azimi-Sadjadi, M R; Yao, D; Huang, Q; Dobeck, G J
2000-01-01
In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets for several aspect angles. Simulation results on ten different noisy realizations and for signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier generated based on these results demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.
Variations in the Intragene Methylation Profiles Hallmark Induced Pluripotency
Druzhkov, Pavel; Zolotykh, Nikolay; Meyerov, Iosif; Alsaedi, Ahmed; Shutova, Maria; Ivanchenko, Mikhail; Zaikin, Alexey
2015-01-01
We demonstrate the potential of differentiating embryonic and induced pluripotent stem cells by the regularized linear and decision tree machine learning classification algorithms, based on a number of intragene methylation measures. The resulting average classification accuracy has been shown to be above 95%, which exceeds earlier results. We propose a constructive and transparent method of feature selection based on classifier accuracy. Enrichment analysis reveals a statistically meaningful presence of stemness group and cancer discriminating genes among the selected best classifying features. These findings motivate further research on the functional consequences of these differences in methylation patterns. The presented approach can be broadly used to discriminate cells of different phenotypes or in different states by their methylation profiles, identify groups of genes constituting multifeature classifiers, and assess enrichment of these groups by the sets of genes with a functionality of interest. PMID:26618180
NASA Technical Reports Server (NTRS)
Scholz, D.; Fuhs, N.; Hixson, M.; Akiyama, T. (Principal Investigator)
1979-01-01
The author has identified the following significant results. Data sets for corn, soybeans, winter wheat, and spring wheat were used to evaluate the following schemes for crop identification: (1) per point Gaussian maximum likelihood classifier; (2) per point sum of normal densities classifiers; (3) per point linear classifier; (4) per point Gaussian maximum likelihood decision tree classifiers; and (5) texture sensitive per field Gaussian maximum likelihood classifier. Test site location and classifier both had significant effects on classification accuracy of small grains; classifiers did not differ significantly in overall accuracy, with the majority of the difference among classifiers being attributed to training method rather than to the classification algorithm applied. The complexity of use and computer costs for the classifiers varied significantly. A linear classification rule which assigns each pixel to the class whose mean is closest in Euclidean distance was the easiest for the analyst and cost the least per classification.
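The minimum-distance rule described in the last sentence of the abstract above (assign each pixel to the class whose mean is closest in Euclidean distance) can be sketched as follows. The two-band pixel values and the function names are made up for illustration and are not data from the study.

```python
import math

def class_means(samples, labels):
    """Compute the mean feature vector of each class from labeled training pixels."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def nearest_mean(pixel, means):
    """Assign the pixel to the class whose mean is closest in Euclidean distance."""
    return min(means, key=lambda y: math.dist(pixel, means[y]))

# Hypothetical two-band reflectance values for training pixels
train = [(10, 40), (12, 42), (30, 20), (32, 22)]
labels = ["corn", "corn", "soybeans", "soybeans"]
means = class_means(train, labels)
print(nearest_mean((14, 38), means))
```

The rule needs only one mean vector per class and one distance per class at prediction time, which is consistent with the abstract's remark that it was the cheapest scheme to run.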
Optical recognition of statistical patterns
NASA Astrophysics Data System (ADS)
Lee, S. H.
1981-12-01
Optical implementation of the Fukunaga-Koontz transform (FKT) and the Least-Squares Linear Mapping Technique (LSLMT) is described. The FKT is a linear transformation which performs image feature extraction for a two-class image classification problem. The LSLMT performs a transform from large dimensional feature space to small dimensional decision space for separating multiple image classes by maximizing the interclass differences while minimizing the intraclass variations. The FKT and the LSLMT were optically implemented by utilizing a coded phase optical processor. The transform was used for classifying birds and fish. After the F-K basis functions were calculated, those most useful for classification were incorporated into a computer generated hologram. The output of the optical processor, consisting of the squared magnitude of the F-K coefficients, was detected by a T.V. camera, digitized, and fed into a micro-computer for classification. A simple linear classifier based on only two F-K coefficients was able to separate the images into two classes, indicating that the F-K transform had chosen good features. Two advantages of optically implementing the FKT and LSLMT are parallel and real time processing.
Optical recognition of statistical patterns
NASA Technical Reports Server (NTRS)
Lee, S. H.
1981-01-01
Optical implementation of the Fukunaga-Koontz transform (FKT) and the Least-Squares Linear Mapping Technique (LSLMT) is described. The FKT is a linear transformation which performs image feature extraction for a two-class image classification problem. The LSLMT performs a transform from large dimensional feature space to small dimensional decision space for separating multiple image classes by maximizing the interclass differences while minimizing the intraclass variations. The FKT and the LSLMT were optically implemented by utilizing a coded phase optical processor. The transform was used for classifying birds and fish. After the F-K basis functions were calculated, those most useful for classification were incorporated into a computer generated hologram. The output of the optical processor, consisting of the squared magnitude of the F-K coefficients, was detected by a T.V. camera, digitized, and fed into a micro-computer for classification. A simple linear classifier based on only two F-K coefficients was able to separate the images into two classes, indicating that the F-K transform had chosen good features. Two advantages of optically implementing the FKT and LSLMT are parallel and real time processing.
Inorganic arsenic is classified as a carcinogen and has been linked to lung and bladder cancer as well as other non-cancerous health effects. Because of these health effects the U.S. EPA has set a Maximum Contaminant Level (MCL) at 10ppb based on a linear extrapolation of risk an...
DeepGene: an advanced cancer type classifier based on deep learning and somatic point mutations.
Yuan, Yuchen; Shi, Yi; Li, Changyang; Kim, Jinman; Cai, Weidong; Han, Zeguang; Feng, David Dagan
2016-12-23
With the developments of DNA sequencing technology, large amounts of sequencing data have become available in recent years and provide unprecedented opportunities for advanced association studies between somatic point mutations and cancer types/subtypes, which may contribute to more accurate somatic point mutation based cancer classification (SMCC). However in existing SMCC methods, issues like high data sparsity, small volume of sample size, and the application of simple linear classifiers, are major obstacles in improving the classification performance. To address the obstacles in existing SMCC studies, we propose DeepGene, an advanced deep neural network (DNN) based classifier, that consists of three steps: firstly, the clustered gene filtering (CGF) concentrates the gene data by mutation occurrence frequency, filtering out the majority of irrelevant genes; secondly, the indexed sparsity reduction (ISR) converts the gene data into indexes of its non-zero elements, thereby significantly suppressing the impact of data sparsity; finally, the data after CGF and ISR is fed into a DNN classifier, which extracts high-level features for accurate classification. Experimental results on our curated TCGA-DeepGene dataset, which is a reformulated subset of the TCGA dataset containing 12 selected types of cancer, show that CGF, ISR and DNN all contribute in improving the overall classification performance. We further compare DeepGene with three widely adopted classifiers and demonstrate that DeepGene has at least 24% performance improvement in terms of testing accuracy. Based on deep learning and somatic point mutation data, we devise DeepGene, an advanced cancer type classifier, which addresses the obstacles in existing SMCC studies. 
Experiments indicate that DeepGene outperforms three widely adopted existing classifiers, which is mainly attributed to its deep learning module that is able to extract the high level features between combinatorial somatic point mutations and cancer types.
Tracing the Geographical Origin of Onions by Strontium Isotope Ratio and Strontium Content.
Hiraoka, Hisaaki; Morita, Sakie; Izawa, Atsunobu; Aoyama, Keisuke; Shin, Ki-Cheol; Nakano, Takanori
2016-01-01
The strontium (Sr) isotope ratio (⁸⁷Sr/⁸⁶Sr) and Sr content were used to trace the geographical origin of onions from Japan and other countries, including China, the United States of America, New Zealand, Australia, and Thailand. The mean ⁸⁷Sr/⁸⁶Sr ratio and Sr content (dry weight basis) for onions from Japan were 0.70751 and 4.6 mg kg⁻¹, respectively, and the values for onions from the other countries were 0.71199 and 12.4 mg kg⁻¹, respectively. Linear discriminant analysis was performed to classify onions produced in Japan from those produced in the other countries based on the Sr data. The discriminant equation derived from linear discriminant analysis was evaluated by 10-fold cross validation. As a result, the origins of 92% of onions were correctly classified between Japan and the other countries.
Applications of Support Vector Machines In Chemo And Bioinformatics
NASA Astrophysics Data System (ADS)
Jayaraman, V. K.; Sundararajan, V.
2010-10-01
Conventional linear & nonlinear tools for classification, regression & data driven modeling are being replaced on a rapid scale by newer techniques & tools based on artificial intelligence and machine learning. While the linear techniques are not applicable for inherently nonlinear problems, newer methods serve as attractive alternatives for solving real life problems. Support Vector Machine (SVM) classifiers are a set of universal feed-forward network based classification algorithms that have been formulated from statistical learning theory and structural risk minimization principle. SVM regression closely follows the classification methodology. In this work recent applications of SVM in Chemo & Bioinformatics will be described with suitable illustrative examples.
Comparison of Different EHG Feature Selection Methods for the Detection of Preterm Labor
Alamedine, D.; Khalil, M.; Marque, C.
2013-01-01
Numerous types of linear and nonlinear features have been extracted from the electrohysterogram (EHG) in order to classify labor and pregnancy contractions. As a result, the number of available features is now very large. The goal of this study is to reduce the number of features by selecting only the relevant ones which are useful for solving the classification problem. This paper presents three methods for feature subset selection that can be applied to choose the best subsets for classifying labor and pregnancy contractions: an algorithm using the Jeffrey divergence (JD) distance, a sequential forward selection (SFS) algorithm, and a binary particle swarm optimization (BPSO) algorithm. The two last methods are based on a classifier and were tested with three types of classifiers. These methods have allowed us to identify common features which are relevant for contraction classification. PMID:24454536
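The sequential forward selection (SFS) idea named in the abstract above can be sketched generically: start from an empty subset and repeatedly add the feature that most improves a score. In the study the score comes from a classifier; here `toy_score`, the feature names, and all numeric values are hypothetical stand-ins so that the sketch runs on its own.

```python
def sequential_forward_selection(features, score, k):
    """Greedily grow a feature subset of size k.

    features: candidate feature names
    score: callable mapping a list of features to a number (higher is better)
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        # Try adding each remaining feature and keep the best-scoring one.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical per-feature usefulness and a redundancy penalty (illustrative only)
usefulness = {"rms": 0.6, "median_freq": 0.5, "entropy": 0.55}
redundant_pairs = [{"rms", "entropy"}]

def toy_score(subset):
    total = sum(usefulness[f] for f in subset)
    for pair in redundant_pairs:
        if pair <= set(subset):
            total -= 0.4  # penalize features that carry overlapping information
    return total

print(sequential_forward_selection(list(usefulness), toy_score, 2))
```

In practice `score` would typically be a cross-validated classifier accuracy, which is what makes SFS a wrapper method: the subset it returns depends on the classifier used to evaluate it.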
Towards exaggerated emphysema stereotypes
NASA Astrophysics Data System (ADS)
Chen, C.; Sørensen, L.; Lauze, F.; Igel, C.; Loog, M.; Feragen, A.; de Bruijne, M.; Nielsen, M.
2012-03-01
Classification is widely used in the context of medical image analysis and in order to illustrate the mechanism of a classifier, we introduce the notion of an exaggerated image stereotype based on training data and trained classifier. The stereotype of some image class of interest should emphasize/exaggerate the characteristic patterns in an image class and visualize the information the employed classifier relies on. This is useful for gaining insight into the classification and serves for comparison with the biological models of disease. In this work, we build exaggerated image stereotypes by optimizing an objective function which consists of a discriminative term based on the classification accuracy, and a generative term based on the class distributions. A gradient descent method based on iterated conditional modes (ICM) is employed for optimization. We use this idea with Fisher's linear discriminant rule and assume a multivariate normal distribution for samples within a class. The proposed framework is applied to computed tomography (CT) images of lung tissue with emphysema. The synthesized stereotypes illustrate the exaggerated patterns of lung tissue with emphysema, which is underpinned by three different quantitative evaluation methods.
A Novel Design of 4-Class BCI Using Two Binary Classifiers and Parallel Mental Tasks
Geng, Tao; Gan, John Q.; Dyson, Matthew; Tsui, Chun SL; Sepulveda, Francisco
2008-01-01
A novel 4-class single-trial brain computer interface (BCI) based on two (rather than four or more) binary linear discriminant analysis (LDA) classifiers is proposed, which is called a “parallel BCI.” Unlike other BCIs where mental tasks are executed and classified in a serial way one after another, the parallel BCI uses properly designed parallel mental tasks that are executed on both sides of the subject body simultaneously, which is the main novelty of the BCI paradigm used in our experiments. Each of the two binary classifiers only classifies the mental tasks executed on one side of the subject body, and the results of the two binary classifiers are combined to give the result of the 4-class BCI. Data was recorded in experiments with both real movement and motor imagery in 3 able-bodied subjects. Artifacts were not detected or removed. Offline analysis has shown that, in some subjects, the parallel BCI can generate a higher accuracy than a conventional 4-class BCI, although both of them have used the same feature selection and classification algorithms. PMID:18584040
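One way to picture how two binary decisions yield a 4-class output is the sketch below. The abstract does not specify the task names or the encoding, so the class labels and the function `combine_binary_decisions` are assumptions made purely for illustration.

```python
def combine_binary_decisions(left, right):
    """Map two binary decisions (0 or 1), one per body side, to a class index 0..3."""
    if left not in (0, 1) or right not in (0, 1):
        raise ValueError("each binary classifier must output 0 or 1")
    return 2 * left + right

# Hypothetical class names; the real mental tasks are not listed in the abstract.
CLASS_NAMES = [
    "left task A + right task A",
    "left task A + right task B",
    "left task B + right task A",
    "left task B + right task B",
]

for left in (0, 1):
    for right in (0, 1):
        print(left, right, CLASS_NAMES[combine_binary_decisions(left, right)])
```

Because each classifier only has to solve a two-class problem, the four joint outcomes are covered without training a dedicated 4-class model.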
Ozcift, Akin
2012-08-01
Parkinson disease (PD) is an age-related deterioration of certain nerve systems, which affects movement, balance, and muscle control of patients. PD is a common disease that affects 1% of people older than 60 years. A new classification scheme based on support vector machine (SVM) selected features to train rotation forest (RF) ensemble classifiers is presented for improving diagnosis of PD. The dataset contains records of voice measurements from 31 people, 23 of them with PD, and each record in the dataset is described by 22 features. The diagnosis model first makes use of a linear SVM to select the ten most relevant features of the 22. As a second step of the classification model, six different classifiers are trained with the subset of features. Subsequently, at the third step, the accuracies of the classifiers are improved by the RF ensemble classification strategy. The results of the experiments are evaluated using three metrics: classification accuracy (ACC), Kappa Error (KE) and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). Performance measures of two base classifiers, i.e. KStar and IBk, demonstrated an apparent increase in PD diagnosis accuracy compared to similar studies in the literature. Overall, application of the RF ensemble classification scheme significantly improved PD diagnosis in 5 of 6 classifiers. Numerically, we obtained about 97% accuracy with the RF ensemble of the IBk (a K-Nearest Neighbor variant) algorithm, which is quite high performance for Parkinson disease diagnosis.
Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C
2017-12-01
The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237), and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The convolutional recurrent neural network with neural word embeddings trained-medical subdomain classifier yielded the best performance measurement on iDASH and MGH datasets with area under receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf)-weighting, outperformed other shallow learning classifiers on iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934 respectively. We trained classifiers on one dataset, applied to the other dataset and yielded the threshold of F1 score of 0.7 in classifiers for half of the medical subdomains we studied. 
Our study shows that a supervised learning-based NLP approach is useful to develop medical subdomain classifiers. The deep learning algorithm with distributed word representation yields better performance yet shallow learning algorithms with the word and concept representation achieves comparable performance with better clinical interpretability. Portable classifiers may also be used across datasets from different institutions.
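The tf-idf weighting of bag-of-words features mentioned in the abstract above can be sketched in a few lines. This uses one common variant (relative term frequency times log(N/df)); the study's exact formula is not stated, libraries often apply smoothed variants, and the token lists below are invented.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    docs: list of non-empty token lists.
    Weight = (term count / document length) * log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))  # count each term once per document
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical tokenized notes (illustrative only)
notes = [["chest", "pain", "ecg"], ["seizure", "eeg", "pain"]]
for vector in tfidf(notes):
    print(vector)
```

A term that appears in every note, such as "pain" here, gets weight zero, so the representation emphasizes terms that help tell subdomains apart.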
A Feature-Free 30-Disease Pathological Brain Detection System by Linear Regression Classifier.
Chen, Yi; Shao, Ying; Yan, Jie; Yuan, Ti-Fei; Qu, Yanwen; Lee, Elizabeth; Wang, Shuihua
2017-01-01
The number of Alzheimer's disease patients is increasing rapidly every year. Scholars tend to use computer vision methods to develop automatic diagnosis systems. In 2015, Gorji et al. proposed a novel method using pseudo Zernike moment. They tested four classifiers: learning vector quantization neural network, pattern recognition neural network trained by Levenberg-Marquardt, by resilient backpropagation, and by scaled conjugate gradient. This study presents an improved method by introducing a relatively new classifier: linear regression classification. Our method selects one axial slice from a 3D brain image and employs pseudo Zernike moment with a maximum order of 15 to extract 256 features from each image. Finally, linear regression classification was harnessed as the classifier. The proposed approach obtains an accuracy of 97.51%, a sensitivity of 96.71%, and a specificity of 97.73%. Our method performs better than Gorji's approach and five other state-of-the-art approaches. Therefore, it can be used to detect Alzheimer's disease. Copyright © Bentham Science Publishers.
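For readers unfamiliar with the three figures reported above, the following sketch shows how accuracy, sensitivity, and specificity are derived from confusion-matrix counts. The counts are hypothetical and do not reproduce the study's results.

```python
def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: diseased cases classified correctly/incorrectly
    tn/fp: healthy cases classified correctly/incorrectly
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),   # fraction of diseased cases detected
        "specificity": tn / (tn + fp),   # fraction of healthy cases cleared
    }

# Hypothetical counts (illustrative only)
print(binary_metrics(tp=45, fn=5, tn=90, fp=10))
```

Reporting all three together matters when classes are imbalanced, since accuracy alone can look high even if one class is mostly misclassified.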
NASA Astrophysics Data System (ADS)
Hashemi, H.; Tax, D. M. J.; Duin, R. P. W.; Javaherian, A.; de Groot, P.
2008-11-01
Seismic object detection is a relatively new field in which 3-D bodies are visualized and spatial relationships between objects of different origins are studied in order to extract geologic information. In this paper, we propose a method for finding an optimal classifier with the help of a statistical feature ranking technique and combining different classifiers. The method, which has general applicability, is demonstrated here on a gas chimney detection problem. First, we evaluate a set of input seismic attributes extracted at locations labeled by a human expert using regularized discriminant analysis (RDA). In order to find the RDA score for each seismic attribute, forward and backward search strategies are used. Subsequently, two non-linear classifiers: multilayer perceptron (MLP) and support vector classifier (SVC) are run on the ranked seismic attributes. Finally, to capitalize on the intrinsic differences between both classifiers, the MLP and SVC results are combined using logical rules of maximum, minimum and mean. The proposed method optimizes the ranked feature space size and yields the lowest classification error in the final combined result. We will show that the logical minimum reveals gas chimneys that exhibit both the softness of MLP and the resolution of SVC classifiers.
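The maximum, minimum, and mean combination rules mentioned in the abstract above can be sketched as element-wise operations on per-location classifier outputs. The score values are invented, and treating both outputs as chimney probabilities in [0, 1] is an assumption of this sketch.

```python
from statistics import mean

RULES = {"max": max, "min": min, "mean": mean}

def combine_outputs(mlp_scores, svc_scores, rule):
    """Combine two classifiers' per-sample scores with a fixed rule."""
    if len(mlp_scores) != len(svc_scores):
        raise ValueError("both classifiers must score the same samples")
    combiner = RULES[rule]
    return [combiner([a, b]) for a, b in zip(mlp_scores, svc_scores)]

# Hypothetical chimney probabilities at three locations (illustrative only)
mlp = [0.9, 0.2, 0.7]
svc = [0.6, 0.4, 0.1]
for rule in ("max", "min", "mean"):
    print(rule, combine_outputs(mlp, svc, rule))
```

The minimum rule only keeps a high score where both classifiers agree, which gives an intuition for why the abstract singles it out as reflecting properties of both models.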
Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin
2018-04-04
The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. 
The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions, and that therefore a data-driven approach such as that provided by the proposed LDA-based method is to be favored.
Using Passive Cavitation Images to Classify High-Intensity Focused Ultrasound Lesions
Haworth, Kevin J.; Salgaonkar, Vasant A.; Corregan, Nicholas M.; Holland, Christy K.; Mast, T. Douglas
2015-01-01
Passive cavitation imaging provides spatially resolved monitoring of cavitation emissions. However the diffraction limit of a linear imaging array results in relatively poor range resolution. Poor range resolution has limited prior analyses of the spatial specificity and sensitivity of passive cavitation imaging for predicting thermal lesion formation. In this study, this limitation is overcome by orienting a linear array orthogonal to the HIFU propagation direction and performing passive imaging. Fourteen lesions were formed in ex vivo bovine liver samples as a result of 1.1 MHz continuous-wave ultrasound exposure. The lesions were classified as focal, “tadpole”, or pre-focal based on their shape and location. Passive cavitation images were beam-formed from emissions at the fundamental, harmonic, ultraharmonic, and inharmonic frequencies with an established algorithm. Using the area under a receiver operator characteristic curve (AUROC), fundamental, harmonic, and ultraharmonic emissions were shown to be significant predictors of lesion formation for all lesion types. For both harmonic and ultraharmonic emissions, pre-focal lesions were classified most successfully (AUROC values of 0.87 and 0.88, respectively), followed by tadpole lesions (AUROC values of 0.77 and 0.64, respectively), and focal lesions (AUROC values of 0.65 and 0.60, respectively). PMID:26051309
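The AUROC values quoted above summarize how well an emission level separates lesion from non-lesion regions. A minimal sketch of the computation, using the equivalent pairwise-ranking definition, is given below; the scores and labels are made up and the function name `auroc` is an assumption of this example.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample scores
    higher than a randomly chosen negative one (ties count as one half).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical emission levels and lesion labels (illustrative only)
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
print(auroc(scores, labels))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported figures of 0.60 to 0.88 should be read.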
Caravaca, Juan; Soria-Olivas, Emilio; Bataller, Manuel; Serrano, Antonio J; Such-Miquel, Luis; Vila-Francés, Joan; Guerrero, Juan F
2014-02-01
This work presents the application of machine learning techniques to analyse the influence of physical exercise on the physiological properties of the heart during ventricular fibrillation. To this end, different kinds of classifiers (linear and neural models) are used to classify between trained and sedentary rabbit hearts. The use of those classifiers in combination with a wrapper feature selection algorithm makes it possible to extract knowledge about the most relevant features in the problem. The obtained results show that neural models outperform linear classifiers (better performance indices and a better dimensionality reduction). The most relevant features to describe the benefits of physical exercise are those related to myocardial heterogeneity, mean activation rate and activation complexity. © 2013 Published by Elsevier Ltd.
Neural CMOS-integrated circuit and its application to data classification.
Göknar, Izzet Cem; Yildiz, Merih; Minaei, Shahram; Deniz, Engin
2012-05-01
Implementation and new applications of a tunable complementary metal-oxide-semiconductor-integrated circuit (CMOS-IC) of a recently proposed classifier core-cell (CC) are presented and tested with two different datasets. With two algorithms (one based on Fisher's linear discriminant analysis and the other based on perceptron learning) used to obtain the CCs' tunable parameters, the Haberman and Iris datasets are classified. The parameters so obtained are used for hard-classification of datasets with a neural network structured circuit. Classification performance and coefficient calculation times for both algorithms are given. The CC has 6-ns response time and 1.8-mW power consumption. The fabrication parameters used for the IC are taken from CMOS AMS 0.35-μm technology.
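Of the two parameter-fitting algorithms named above, perceptron learning is simple enough to sketch directly. The toy points below are invented and linearly separable; they are not the Haberman or Iris data, and the mapping from learned weights to the circuit's tunable parameters is not covered here.

```python
def predict(weights, bias, x):
    """Hard-threshold linear unit: output 1 if w.x + b > 0, else 0."""
    activation = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if activation > 0 else 0

def train_perceptron(samples, labels, lr=1.0, max_epochs=100):
    """Classic perceptron rule: nudge the weights after each misclassified sample."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for x, y in zip(samples, labels):
            delta = y - predict(weights, bias, x)
            if delta != 0:
                weights = [w + lr * delta * v for w, v in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:  # every training sample is classified correctly
            break
    return weights, bias

# Hypothetical, linearly separable 2-D points (illustrative only)
points = [(2, 1), (3, 2), (-1, -1), (-2, 0)]
targets = [1, 1, 0, 0]
w, b = train_perceptron(points, targets)
print([predict(w, b, p) for p in points])
```

The rule is guaranteed to stop only when the classes are linearly separable, which is why `max_epochs` caps the loop for data that are not.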
On Algorithms for Generating Computationally Simple Piecewise Linear Classifiers
1989-05-01
suffers. - Waveform classification, e.g. speech recognition, seismic analysis (i.e. discrimination between earthquakes and nuclear explosions), target...assuming Gaussian distributions (B-G) d) Bayes classifier with probability densities estimated with the k-N-N method (B-kNN) e) The nearest neighbour...range of classifiers are chosen including a fast, easily computable and often used classifier (B-G), reliable and complex classifiers (B-kNN and NNR
Comparing supervised learning techniques on the task of physical activity recognition.
Dalton, A; OLaighin, G
2013-01-01
The objective of this study was to compare the performance of base-level and meta-level classifiers on the task of physical activity recognition. Five wireless kinematic sensors were attached to each subject (n = 25) while they completed a range of basic physical activities in a controlled laboratory setting. Subjects were then asked to carry out similar self-annotated physical activities in a random order and in an unsupervised environment. A combination of time-domain and frequency-domain features was extracted from the sensor data, including the first four central moments, zero-crossing rate, average magnitude, sensor cross-correlation, sensor auto-correlation, spectral entropy and dominant frequency components. A reduced feature set was generated using a wrapper subset evaluation technique with a linear forward search, and this feature set was employed for classifier comparison. The meta-level classifier AdaBoostM1 with C4.5 Graft as its base-level classifier achieved an overall accuracy of 95%. Equal-sized datasets of subject-independent data and subject-dependent data were used to train this classifier, and high recognition rates could be achieved without the need for user-specific training. Furthermore, it was found that an accuracy of 88% could be achieved using data from the ankle and wrist sensors only.
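Some of the time-domain features listed in this abstract are simple to compute. The sketch below shows one plausible way to obtain the zero-crossing rate, the central moments and the average magnitude of a window of samples. The function names and the exact definitions (for example, treating zero as non-negative when counting sign changes) are assumptions, not the authors' implementation.

```python
def zero_crossing_rate(signal):
    """Fraction of consecutive sample pairs whose signs differ (zero counts as non-negative)."""
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(signal) - 1)


def central_moment(signal, order):
    """k-th central moment: the mean of (x - mean)^k over the window."""
    mean = sum(signal) / len(signal)
    return sum((x - mean) ** order for x in signal) / len(signal)


def average_magnitude(signal):
    """Mean absolute value of the samples in the window."""
    return sum(abs(x) for x in signal) / len(signal)


def time_domain_features(signal):
    """Collect the features above into one flat feature vector."""
    moments = [central_moment(signal, k) for k in (1, 2, 3, 4)]
    return moments + [zero_crossing_rate(signal), average_magnitude(signal)]
```

Note that the first central moment is always zero by definition, so in practice the raw mean is often used in its place.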
Data mining for the analysis of hippocampal zones in Alzheimer's disease
NASA Astrophysics Data System (ADS)
Ovando Vázquez, Cesaré M.
2012-02-01
In this work, a methodology to classify people with Alzheimer's Disease (AD), Healthy Controls (HC) and people with Mild Cognitive Impairment (MCI) is presented. This methodology consists of an ensemble of Support Vector Machines (SVM) with the hippocampal boxes (HB) as input data; these hippocampal zones are taken from Magnetic Resonance (MRI) and Positron Emission Tomography (PET) images. Two ways of constructing this ensemble are presented: the first consists of linear SVM models and the second of non-linear SVM models. Results demonstrate that the linear models classify HBs more accurately than the non-linear models between HC and MCI and that there are no differences between HC and AD.
2002-09-01
weather conditions (1999 Christmas storm in Europe , January 2000 snow storm over the eastern coast of the US) can be attributed to the inaccuracies in...over the normal modes of a linearized version of the model equations. These 5 normal modes can be classified (at least for the extratropics ) based
Using color histograms and SPA-LDA to classify bacteria.
de Almeida, Valber Elias; da Costa, Gean Bezerra; de Sousa Fernandes, David Douglas; Gonçalves Dias Diniz, Paulo Henrique; Brandão, Deysiane; de Medeiros, Ana Claudia Dantas; Véras, Germano
2014-09-01
In this work, a new approach is proposed to verify the differentiating characteristics of five bacteria (Escherichia coli, Enterococcus faecalis, Streptococcus salivarius, Streptococcus oralis, and Staphylococcus aureus) by using digital images obtained with a simple webcam and variable selection by the Successive Projections Algorithm associated with Linear Discriminant Analysis (SPA-LDA). In this sense, color histograms in the red-green-blue (RGB), hue-saturation-value (HSV), and grayscale channels and their combinations were used as input data, and statistically evaluated by using different multivariate classifiers (Soft Independent Modeling by Class Analogy (SIMCA), Principal Component Analysis-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA) and Successive Projections Algorithm-Linear Discriminant Analysis (SPA-LDA)). The bacteria strains were cultivated in a nutritive blood agar base layer for 24 h by following the Brazilian Pharmacopoeia, maintaining the status of cell growth and the nature of nutrient solutions under the same conditions. The best result in classification was obtained by using RGB and SPA-LDA, which reached 94 and 100 % of classification accuracy in the training and test sets, respectively. This result is extremely positive from the viewpoint of routine clinical analyses, because it avoids bacterial identification based on phenotypic identification of the causative organism using Gram staining, culture, and biochemical proofs. Therefore, the proposed method presents inherent advantages, promoting a simpler, faster, and low-cost alternative for bacterial identification.
Computer-aided diagnosis of early knee osteoarthritis based on MRI T2 mapping.
Wu, Yixiao; Yang, Ran; Jia, Sen; Li, Zhanjun; Zhou, Zhiyang; Lou, Ting
2014-01-01
This work was aimed at studying the method of computer-aided diagnosis of early knee OA (OA: osteoarthritis). Based on the technique of MRI (MRI: Magnetic Resonance Imaging) T2 Mapping, through computer image processing, feature extraction, calculation and analysis via constructing a classifier, an effective computer-aided diagnosis method for knee OA was created to assist doctors in their accurate, timely and convenient detection of potential risk of OA. In order to evaluate this method, a total of 1380 data from the MRI images of 46 samples of knee joints were collected. These data were then modeled through linear regression on an offline general platform by the use of the ImageJ software, and a map of the physical parameter T2 was reconstructed. After the image processing, the T2 values of ten regions in the WORMS (WORMS: Whole-organ Magnetic Resonance Imaging Score) areas of the articular cartilage were extracted to be used as the eigenvalues in data mining. Then, an RBF (RBF: Radial Basis Function) network classifier was built to classify and identify the collected data. The classifier exhibited a final identification accuracy of 75%, indicating a good result of assisting diagnosis. Since the knee OA classifier constituted by a weights-directly-determined RBF neural network did not require any iteration, our results demonstrated that the optimal weights, appropriate center and variance could be yielded through simple procedures. Furthermore, the accuracy for both the training samples and the testing samples from the normal group could reach 100%. Finally, the classifier was superior both in time efficiency and classification performance to the frequently used classifiers based on iterative learning. Thus it was suitable to be used as an aid to computer-aided diagnosis of early knee OA.
Acoustic Biometric System Based on Preprocessing Techniques and Linear Support Vector Machines
del Val, Lara; Izquierdo-Fuente, Alberto; Villacorta, Juan J.; Raboso, Mariano
2015-01-01
Drawing on the results of an acoustic biometric system based on a MSE classifier, a new biometric system has been implemented. This new system preprocesses acoustic images, extracts several parameters and finally classifies them, based on Support Vector Machine (SVM). The preprocessing techniques used are spatial filtering, segmentation—based on a Gaussian Mixture Model (GMM) to separate the person from the background, masking—to reduce the dimensions of images—and binarization—to reduce the size of each image. An analysis of classification error and a study of the sensitivity of the error versus the computational burden of each implemented algorithm are presented. This allows the selection of the most relevant algorithms, according to the benefits required by the system. A significant improvement of the biometric system has been achieved by reducing the classification error, the computational burden and the storage requirements. PMID:26091392
sEMG-based joint force control for an upper-limb power-assist exoskeleton robot.
Li, Zhijun; Wang, Baocheng; Sun, Fuchun; Yang, Chenguang; Xie, Qing; Zhang, Weidong
2014-05-01
This paper investigates two surface electromyogram (sEMG)-based control strategies developed for a power-assist exoskeleton arm. Different from most of the existing position control approaches, this paper develops force control methods to make the exoskeleton robot behave like humans in order to provide better assistance. The exoskeleton robot is directly attached to a user's body and activated by the sEMG signals of the user's muscles, which reflect the user's motion intention. In the first proposed control method, the forces of an agonist and antagonist muscle pair are estimated, and their difference is used to produce the torque of the corresponding joints. In the second method, linear discriminant analysis-based classifiers are introduced as the indicator of the motion type of the joints. Then, the classifier's outputs together with the estimated force of the corresponding active muscle determine the torque control signals. Different from the conventional approaches, one classifier is assigned to each joint, which decreases the training time and largely simplifies the recognition process. Finally, extensive experiments are conducted to illustrate the effectiveness of the proposed approaches.
Statistical classification of drug incidents due to look-alike sound-alike mix-ups.
Wong, Zoie Shui Yee
2016-06-01
It has been recognised that medication names that look or sound similar are a cause of medication errors. This study builds statistical classifiers for identifying medication incidents due to look-alike sound-alike mix-ups. A total of 227 patient safety incident advisories related to medication were obtained from the Canadian Patient Safety Institute's Global Patient Safety Alerts system. Eight feature selection strategies based on frequent terms, frequent drug terms and constituent terms were performed. Statistical text classifiers based on logistic regression, support vector machines with linear, polynomial, radial-basis and sigmoid kernels and decision tree were trained and tested. The models developed achieved an average accuracy of above 0.8 across all the model settings. The receiver operating characteristic curves indicated the classifiers performed reasonably well. The results obtained in this study suggest that statistical text classification can be a feasible method for identifying medication incidents due to look-alike sound-alike mix-ups based on a database of advisories from Global Patient Safety Alerts. © The Author(s) 2014.
Classification of hadith into positive suggestion, negative suggestion, and information
NASA Astrophysics Data System (ADS)
Faraby, Said Al; Riviera Rachmawati Jasin, Eliza; Kusumaningrum, Andina; Adiwijaya
2018-03-01
As one of the Muslim life guidelines, based on the meaning of its sentence(s), a hadith can be viewed as a suggestion for doing something, or a suggestion for not doing something, or just information without any suggestion. In this paper, we tried to classify the Bahasa translation of hadith into the three categories using a machine learning approach. We tried stemming and stopword removal in preprocessing, and TF-IDF of unigram, bigram, and trigram as the extracted features. As the classifier, we compared SVM and Neural Network. Since the categories are new, in order to have a reference against which to compare the results of these pipelines, we created a baseline classifier using a simple rule-based string matching technique. The rule-based algorithm conditions on the occurrence of words such as “janganlah, sholatlah, and so on” to determine the category. The baseline method achieved an F1-Score of 0.69, while the best F1-Score from the machine learning approach was 0.88, and it was produced by the SVM model with the linear kernel.
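A rule-based baseline of the kind described in this abstract can be sketched in a few lines. Only the cue words "janganlah" and "sholatlah" come from the abstract. The cue lists being this short, the order in which the rules are checked, the label strings and the toy inputs are all assumptions made for illustration.

```python
# Cue words taken from the abstract; a real baseline would use longer lists.
PROHIBITION_CUES = {"janganlah"}   # assumed to signal "do not do this"
COMMAND_CUES = {"sholatlah"}       # assumed to signal "do this"

PUNCTUATION = '.,;:!?"\'()'


def classify_by_rules(text):
    """Return a category by exact matching of cue words in the lowercased text."""
    tokens = {word.strip(PUNCTUATION) for word in text.lower().split()}
    # Assumption: prohibition cues are checked first, so they win on a tie.
    if tokens & PROHIBITION_CUES:
        return "negative suggestion"
    if tokens & COMMAND_CUES:
        return "positive suggestion"
    return "information"
```

Checking prohibitions before commands is a design choice for this sketch: a sentence that contains both cue types is treated as a negative suggestion.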
Two-dimensional shape classification using generalized Fourier representation and neural networks
NASA Astrophysics Data System (ADS)
Chodorowski, Artur; Gustavsson, Tomas; Mattsson, Ulf
2000-04-01
A shape-based classification method is developed based upon the Generalized Fourier Representation (GFR). GFR can be regarded as an extension of traditional polar Fourier descriptors, suitable for description of closed objects, both convex and concave, with or without holes. Explicit relations of GFR coefficients to regular moments, moment invariants and affine moment invariants are given in the paper. The dual linear relation between GFR coefficients and regular moments was used to compare shape features derived from GFR descriptors and Hu's moment invariants. The GFR was then applied to a clinical problem within oral medicine and used to represent the contours of the lesions in the oral cavity. The lesions studied were leukoplakia and different forms of lichenoid reactions. Shape features were extracted from GFR coefficients in order to classify potentially cancerous oral lesions. Alternative classifiers were investigated based on a multilayer perceptron with different architectures and extensions. The overall classification accuracy for recognition of potentially cancerous oral lesions when using a neural network classifier was 85%, while the classification between leukoplakia and reticular lichenoid reactions gave a 96% (5-fold cross-validated) recognition rate.
voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.
Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet
2017-01-01
RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
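The sparsity of nearest shrunken centroid classifiers comes from soft thresholding: each standardized difference between a class centroid and the overall centroid is shrunk toward zero by a threshold, and a gene whose differences become zero in every class drops out of the classifier. The sketch below shows only this plain shrinkage step. It does not include voom's precision weights, and the function names and toy numbers are assumptions.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within delta of zero become zero."""
    magnitude = max(abs(d) - delta, 0.0)
    return magnitude if d >= 0 else -magnitude


def shrink_differences(diffs, delta):
    """Apply soft thresholding to one class's per-gene standardized differences."""
    return [soft_threshold(d, delta) for d in diffs]


def selected_genes(diffs_by_class, delta):
    """Indices of genes that keep a nonzero shrunken difference in at least one class."""
    shrunken = [shrink_differences(diffs, delta) for diffs in diffs_by_class.values()]
    n_genes = len(shrunken[0])
    return [g for g in range(n_genes) if any(row[g] != 0.0 for row in shrunken)]
```

A larger threshold removes more genes, so in practice the threshold is usually chosen by cross-validation.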
A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update
NASA Astrophysics Data System (ADS)
Lotte, F.; Bougrain, L.; Cichocki, A.; Clerc, M.; Congedo, M.; Rakotomamonjy, A.; Yger, F.
2018-06-01
Objective. Most current electroencephalography (EEG)-based brain–computer interfaces (BCIs) are based on machine learning algorithms. There is a large diversity of classifier types that are used in this field, as described in our 2007 review paper. Now, approximately ten years after this review publication, many new algorithms have been developed and tested to classify EEG signals in BCIs. The time is therefore ripe for an updated review of EEG classification algorithms for BCIs. Approach. We surveyed the BCI and machine learning literature from 2007 to 2017 to identify the new classification approaches that have been investigated to design BCIs. We synthesize these studies in order to present such algorithms, to report how they were used for BCIs, what were the outcomes, and to identify their pros and cons. Main results. We found that the recently designed classification algorithms for EEG-based BCIs can be divided into four main categories: adaptive classifiers, matrix and tensor classifiers, transfer learning and deep learning, plus a few other miscellaneous classifiers. Among these, adaptive classifiers were demonstrated to be generally superior to static ones, even with unsupervised adaptation. Transfer learning can also prove useful although the benefits of transfer learning remain unpredictable. Riemannian geometry-based methods have reached state-of-the-art performances on multiple BCI problems and deserve to be explored more thoroughly, along with tensor-based methods. Shrinkage linear discriminant analysis and random forests also appear particularly useful for small training samples settings. On the other hand, deep learning methods have not yet shown convincing improvement over state-of-the-art BCI methods. Significance. This paper provides a comprehensive overview of the modern classification algorithms used in EEG-based BCIs, presents the principles of these methods and guidelines on when and how to use them. 
It also identifies a number of challenges to further advance EEG classification in BCI.
Enhancing atlas based segmentation with multiclass linear classifiers
DOE Office of Scientific and Technical Information (OSTI.GOV)
Sdika, Michaël, E-mail: michael.sdika@creatis.insa-lyon.fr
Purpose: To present a method to enrich atlases for atlas based segmentation. Such enriched atlases can then be used as a single atlas or within a multiatlas framework. Methods: In this paper, machine learning techniques have been used to enhance the atlas based segmentation approach. The enhanced atlas defined in this work is a pair composed of a gray level image alongside an image of multiclass classifiers with one classifier per voxel. Each classifier embeds local information from the whole training dataset that allows for the correction of some systematic errors in the segmentation and accounts for the possible local registration errors. The authors also propose to use these images of classifiers within a multiatlas framework: results produced by a set of such local classifier atlases can be combined using a label fusion method. Results: Experiments have been made on the in vivo images of the IBSR dataset and a comparison has been made with several state-of-the-art methods such as FreeSurfer and the multiatlas nonlocal patch based method of Coupé or Rousseau. These experiments show that their method is competitive with state-of-the-art methods while having a low computational cost. Further enhancement has also been obtained with a multiatlas version of their method. It is also shown that, in this case, nonlocal fusion is unnecessary. The multiatlas fusion can therefore be done efficiently. Conclusions: The single atlas version has similar quality to state-of-the-art multiatlas methods but with the computational cost of a naive single atlas segmentation. The multiatlas version offers an improvement in quality and can be done efficiently without a nonlocal strategy.
de Chazal, Philip; Heneghan, Conor; Sheridan, Elaine; Reilly, Richard; Nolan, Philip; O'Malley, Mark
2003-06-01
A method for the automatic processing of the electrocardiogram (ECG) for the detection of obstructive apnoea is presented. The method screens nighttime single-lead ECG recordings for the presence of major sleep apnoea and provides a minute-by-minute analysis of disordered breathing. A large independently validated database of 70 ECG recordings acquired from normal subjects and subjects with obstructive and mixed sleep apnoea, each of approximately eight hours in duration, was used throughout the study. Thirty-five of these recordings were used for training and 35 retained for independent testing. A wide variety of features based on heartbeat intervals and an ECG-derived respiratory signal were considered. Classifiers based on linear and quadratic discriminants were compared. Feature selection and regularization of classifier parameters were used to optimize classifier performance. Results show that the normal recordings could be separated from the apnoea recordings with a 100% success rate and a minute-by-minute classification accuracy of over 90% is achievable.
Hussain, Lal
2018-06-01
Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. The brain activity of patients suffering from seizures is monitored through the electroencephalogram (EEG) to detect epileptic seizures. The performance of EEG-based epilepsy detection depends on the feature extraction strategy. In this research, we extracted features using a variety of strategies based on time and frequency domain characteristics, nonlinear measures, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers and considering multiple factors. The support vector machine kernels were evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we evaluated different distance metrics, neighbor weights and numbers of neighbors. Similarly, for the decision trees we tuned the parameters based on maximum splits and split criteria, and ensemble classifiers were evaluated based on different ensemble methods and learning rates. Tenfold cross-validation was employed for training/testing, and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. The analysis thus combined diverse feature extraction strategies with robust machine learning classifiers under more advanced tuning options. The support vector machine with a linear kernel and KNN with the city block distance metric gave the highest overall accuracy of 99.5%, which was higher than that obtained using the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, KNN with inverse squared distance weights gave higher performance at different numbers of neighbors. Moreover, in distinguishing postictal heart rate oscillations from epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
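Two of the KNN options named in this abstract, the city block distance metric and inverse squared distance weighting, can be combined as in the following minimal sketch. The function names, the default number of neighbors, the small constant that avoids division by zero and the toy data are assumptions; the authors' toolbox settings are not reproduced here.

```python
from collections import defaultdict


def cityblock(a, b):
    """City block (Manhattan, L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def knn_predict(train, query, k=3, eps=1e-12):
    """Weighted KNN vote. train: list of (features, label) pairs."""
    neighbours = sorted(train, key=lambda item: cityblock(item[0], query))[:k]
    votes = defaultdict(float)
    for features, label in neighbours:
        distance = cityblock(features, query)
        # Inverse squared distance weight; eps guards against a zero distance.
        votes[label] += 1.0 / (distance * distance + eps)
    return max(votes, key=votes.get)


# Hypothetical two-class toy data (not EEG features).
toy_train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((5.0, 5.0), "b"), ((6.0, 5.0), "b")]
```

With inverse squared distance weights, a close neighbor counts far more than a distant one, which makes the vote less sensitive to the exact choice of k.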
Construction accident narrative classification: An evaluation of text mining techniques.
Goh, Yang Miang; Ubeynarayana, C U
2017-11-01
Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
Telephony-based voice pathology assessment using automated speech analysis.
Moran, Rosalyn J; Reilly, Richard B; de Chazal, Philip; Lacy, Peter D
2006-03-01
A system for remotely detecting vocal fold pathologies using telephone-quality speech is presented. The system uses a linear classifier, processing measurements of pitch perturbation, amplitude perturbation and harmonic-to-noise ratio derived from digitized speech recordings. Voice recordings from the Disordered Voice Database Model 4337 system were used to develop and validate the system. Results show that while a sustained phonation, recorded in a controlled environment, can be classified as normal or pathologic with accuracy of 89.1%, telephone-quality speech can be classified as normal or pathologic with an accuracy of 74.2%, using the same scheme. Amplitude perturbation features prove most robust for telephone-quality speech. The pathologic recordings were then subcategorized into four groups, comprising normal, neuromuscular pathologic, physical pathologic and mixed (neuromuscular with physical) pathologic. A separate classifier was developed for classifying the normal group from each pathologic subcategory. Results show that neuromuscular disorders could be detected remotely with an accuracy of 87%, physical abnormalities with an accuracy of 78% and mixed pathology voice with an accuracy of 61%. This study highlights the real possibility for remote detection and diagnosis of voice pathology.
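Pitch and amplitude perturbation are commonly summarized as relative cycle-to-cycle variation (often called jitter and shimmer). The sketch below uses one common definition, the mean absolute difference between consecutive cycles divided by the mean value. Whether the authors used this exact definition is an assumption, as are the function name and the toy numbers.

```python
def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to pitch periods this gives a jitter-style measure;
    applied to peak amplitudes it gives a shimmer-style measure.
    """
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical pitch periods in milliseconds for four consecutive glottal cycles.
periods_ms = [10.0, 12.0, 10.0, 12.0]
jitter = relative_perturbation(periods_ms)
```

A perfectly periodic voice gives a value of zero, and larger values indicate a less stable phonation.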
Hu, Jianfeng
2017-01-01
Driver fatigue has become an important factor in traffic accidents worldwide, and effective detection of driver fatigue has major significance for public health. The proposed method employs entropy measures for feature extraction from a single electroencephalogram (EEG) channel. Four types of entropy measures, sample entropy (SE), fuzzy entropy (FE), approximate entropy (AE), and spectral entropy (PE), were deployed for the analysis of the original EEG signal and compared using ten state-of-the-art classifiers. Results indicate that the optimal single-channel performance is achieved using a combination of channel CP4, feature FE, and classifier Random Forest (RF). The highest accuracy can be up to 96.6%, which is able to meet the needs of real applications. The best combination of channel + features + classifier is subject-specific. In this work, the accuracy with FE as the feature is far greater than the accuracy with the other features. The accuracy using classifier RF is the best, while that of classifier SVM with linear kernel is the worst. Channel selection has a large impact on the accuracy: the performance of the various channels is very different. PMID:28255330
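Of the entropy measures named in this abstract, sample entropy is the easiest to state in code. The sketch below follows the usual definition: count pairs of templates of length m that match within a tolerance r (Chebyshev distance, self-matches excluded), do the same for length m + 1, and take the negative logarithm of the ratio. The default parameters and the treatment of r as an absolute tolerance are assumptions, not the settings used in the paper.

```python
import math


def sample_entropy(x, m=2, r=0.2):
    """Sample entropy of sequence x with embedding dimension m and absolute tolerance r."""
    n = len(x)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                # Chebyshev distance: the largest coordinate-wise difference.
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)      # matching pairs of length m
    a = count_matches(m + 1)  # matching pairs of length m + 1
    if a == 0 or b == 0:
        return float("inf")   # undefined in the limit; reported as infinity here
    return -math.log(a / b)
```

A perfectly regular signal yields a sample entropy of zero, while less predictable signals yield larger values. The tolerance r is often set as a fraction of the signal's standard deviation; that scaling is left to the caller in this sketch.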
Age and gender estimation using Region-SIFT and multi-layered SVM
NASA Astrophysics Data System (ADS)
Kim, Hyunduk; Lee, Sang-Heon; Sohn, Myoung-Kyu; Hwang, Byunghun
2018-04-01
In this paper, we propose an age and gender estimation framework using the region-SIFT feature and a multi-layered SVM classifier. The suggested framework entails three processes. The first step is landmark based face alignment. The second step is the feature extraction step. In this step, we introduce the region-SIFT feature extraction method based on facial landmarks. First, we define sub-regions of the face. We then extract SIFT features from each sub-region. In order to reduce the dimensions of the features we employ Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, we classify age and gender using a multi-layered Support Vector Machine (SVM) for efficient classification. Rather than performing gender estimation and age estimation independently, the use of the multi-layered SVM can improve the classification rate by constructing a classifier that estimates the age according to gender. Moreover, we collect a dataset of face images, called DGIST_C, from the internet. A performance evaluation of the proposed method was performed with the FERET database, CACD database, and DGIST_C database. The experimental results demonstrate that the proposed approach classifies age and performs gender estimation very efficiently and accurately.
Effect of separate sampling on classification accuracy.
Shahrokh Esfahani, Mohammad; Dougherty, Edward R
2014-01-15
Measurements are commonly taken from two phenotypes to build a classifier, where the number of data points from each class is predetermined, not random. In this 'separate sampling' scenario, the data cannot be used to estimate the class prior probabilities. Moreover, predetermined class sizes can severely degrade classifier performance, even for large samples. We employ simulations using both synthetic and real data to show the detrimental effect of separate sampling on a variety of classification rules. We establish propositions related to the effect on the expected classifier error owing to a sampling ratio different from the population class ratio. From these we derive a sample-based minimax sampling ratio and provide an algorithm for approximating it from the data. We also extend to arbitrary distributions the classical population-based Anderson linear discriminant analysis minimax sampling ratio derived from the discriminant form of the Bayes classifier. All the codes for synthetic data and real data examples are written in MATLAB. A function called mmratio, whose output is an approximation of the minimax sampling ratio of a given dataset, is also written in MATLAB. All the codes are available at: http://gsp.tamu.edu/Publications/supplementary/shahrokh13b.
A fresh look at functional link neural network for motor imagery-based brain-computer interface.
Hettiarachchi, Imali T; Babaei, Toktam; Nguyen, Thanh; Lim, Chee P; Nahavandi, Saeid
2018-05-04
Artificial neural networks (ANNs) are among the most widely used classifiers in brain-computer interface (BCI) systems based on noninvasive electroencephalography (EEG) signals. Among the different ANN architectures, the most commonly applied for BCI classifiers is the multilayer perceptron (MLP). When appropriately designed with an optimal number of layers and number of neurons per layer, the ANN can act as a universal approximator. However, due to the low signal-to-noise ratio of EEG signal data, overtraining may become an inherent issue, causing these universal approximators to fail in real-time applications. In this study we introduce a higher order neural network, namely the functional link neural network (FLNN), as a classifier for motor imagery (MI)-based BCI systems, to remedy the drawbacks of the MLP. We compare the proposed method with competing classifiers such as linear decomposition analysis, naïve Bayes, k-nearest neighbours, support vector machine and three MLP architectures. Two multi-class benchmark datasets from the BCI competitions are used. The common spatial pattern algorithm is utilized for feature extraction to build classification models. FLNN reports the highest average Kappa value over multiple subjects for both the BCI competition datasets, under similarly preprocessed data and extracted features. Further, statistical comparison results over multiple subjects show that the proposed FLNN classification method yields the best performance among the competing classifiers. Findings from this study imply that the proposed method, which has less computational complexity compared to the MLP, can be implemented effectively in practical MI-based BCI systems. Copyright © 2018 Elsevier B.V. All rights reserved.
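The Kappa value used for the comparison can be computed from a confusion matrix. The sketch below assumes Cohen's kappa, which the abstract does not name explicitly, and the four-class confusion matrix is invented purely for illustration.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance, given the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical four-class motor imagery result (not data from the study).
example = [
    [18, 3, 2, 1],
    [4, 15, 3, 2],
    [2, 2, 17, 3],
    [1, 3, 4, 16],
]
print(f"kappa = {cohen_kappa(example):.3f}")
```

Unlike raw accuracy, kappa is zero for chance-level agreement regardless of the number of classes, which makes it convenient for multi-class comparisons across subjects.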
Integrated pillar scatterers for speeding up classification of cell holograms.
Lugnan, Alessio; Dambre, Joni; Bienstman, Peter
2017-11-27
The computational power required to classify cell holograms is a major limit to the throughput of label-free cell sorting based on digital holographic microscopy. In this work, a simple integrated photonic stage comprising a collection of silica pillar scatterers is proposed as an effective nonlinear mixing interface between the light scattered by a cell and an image sensor. The light processing provided by the photonic stage allows for the use of a simple linear classifier implemented in the electric domain and applied on a limited number of pixels. A proof-of-concept of the presented machine learning technique, which is based on the extreme learning machine (ELM) paradigm, is provided by the classification results on samples generated by 2D FDTD simulations of cells in a microfluidic channel.
NASA Astrophysics Data System (ADS)
Singla, Neeru; Srivastava, Vishal; Singh Mehta, Dalip
2018-02-01
We report the first fully automated detection of human skin burn injuries in vivo, with the goal of automatic surgical margin assessment based on optical coherence tomography (OCT) images. Our proposed automated procedure entails building a machine-learning-based classifier by extracting quantitative features from normal and burn tissue images recorded by OCT. In this study, 56 samples (28 normal, 28 burned) were imaged by OCT and eight features were extracted. A linear model classifier was trained using 34 samples and 22 samples were used to test the model. Sensitivity of 91.6% and specificity of 90% were obtained. Our results demonstrate the capability of a computer-aided technique for accurately and automatically identifying burn tissue resection margins during surgical treatment.
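The reported rates are consistent with, for instance, 11 of 12 burn samples and 9 of 10 normal samples being correctly identified in the 22-sample test set. That split is an inference made here and is not stated in the abstract; the sketch only shows how the two metrics are computed from such counts.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of positives that were detected
    specificity = tn / (tn + fp)  # fraction of negatives that were rejected
    return sensitivity, specificity

# Assumed test-set split: 12 burn (positive) and 10 normal (negative) samples.
sens, spec = sensitivity_specificity(tp=11, fn=1, tn=9, fp=1)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

With a test set this small, a single additional misclassification shifts either figure by roughly 8 to 10 percentage points.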
Weyhenmeyer, Jonathan; Hernandez, Manuel E; Lainscsek, Claudia; Sejnowski, Terrence J; Poizner, Howard
2014-01-01
Parkinson's disease (PD) is known to lead to marked alterations in cortical-basal ganglia activity that may be amenable to serve as a biomarker for PD diagnosis. Using non-linear delay differential equations (DDE) for classification of PD patients on and off dopaminergic therapy (PD-on, PD-off, respectively) from healthy age-matched controls (CO), we show that 1 second of quasi-resting state clean and raw electroencephalogram (EEG) data can be used to classify CO from PD-on/off based on the area under the receiver operating characteristic curve (AROC). Raw EEG is shown to classify more robustly (AROC=0.59-0.86) than clean EEG data (AROC=0.57-0.72). Decomposition of the raw data into stereotypical and non-stereotypical artifacts provides evidence that increased classification of raw EEG time series originates from muscle artifacts. Thus, non-linear feature extraction and classification of raw EEG data in a low dimensional feature space is a potential biomarker for Parkinson's disease.
Combining Biomarkers Linearly and Nonlinearly for Classification Using the Area Under the ROC Curve
Fong, Youyi; Yin, Shuxin; Huang, Ying
2016-01-01
In biomedical studies, it is often of interest to classify/predict a subject’s disease status based on a variety of biomarker measurements. A commonly used classification criterion is based on AUC - Area under the Receiver Operating Characteristic Curve. Many methods have been proposed to optimize approximated empirical AUC criteria, but there are two limitations to the existing methods. First, most methods are only designed to find the best linear combination of biomarkers, which may not perform well when there is strong nonlinearity in the data. Second, many existing linear combination methods use gradient-based algorithms to find the best marker combination, which often result in sub-optimal local solutions. In this paper, we address these two problems by proposing a new kernel-based AUC optimization method called Ramp AUC (RAUC). This method approximates the empirical AUC loss function with a ramp function, and finds the best combination by a difference of convex functions algorithm. We show that as a linear combination method, RAUC leads to a consistent and asymptotically normal estimator of the linear marker combination when the data is generated from a semiparametric generalized linear model, just as the Smoothed AUC method (SAUC). Through simulation studies and real data examples, we demonstrate that RAUC out-performs SAUC in finding the best linear marker combinations, and can successfully capture nonlinear pattern in the data to achieve better classification performance. We illustrate our method with a dataset from a recent HIV vaccine trial. PMID:27058981
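To make the criterion concrete, the sketch below computes the empirical AUC over all case-control pairs and a ramp-type surrogate of the corresponding loss. The particular ramp used here (linear between 0 and 1) is one common form and is an assumption; the paper's exact definition, its kernel extension, and the difference-of-convex-functions optimizer are not reproduced.

```python
def empirical_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs ranked correctly; ties count one half."""
    total = 0.0
    for s1 in case_scores:
        for s0 in control_scores:
            if s1 > s0:
                total += 1.0
            elif s1 == s0:
                total += 0.5
    return total / (len(case_scores) * len(control_scores))

def ramp(u):
    """Piecewise-linear stand-in for the indicator 1[u <= 0].

    Equals 1 for u <= 0, falls linearly to 0 at u = 1, and is 0 beyond.
    """
    return min(1.0, max(0.0, 1.0 - u))

def ramp_auc_loss(case_scores, control_scores):
    """Average ramp penalty over all (case, control) score differences."""
    penalties = [ramp(s1 - s0) for s1 in case_scores for s0 in control_scores]
    return sum(penalties) / len(penalties)

# Hypothetical marker-combination scores.
cases = [2.0, 3.0, 0.5]
controls = [1.0, 0.0]
print(empirical_auc(cases, controls), ramp_auc_loss(cases, controls))
```

Because this ramp is never smaller than the 0-1 indicator, the surrogate loss upper-bounds one minus the empirical AUC while remaining continuous in the scores.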
Event-shape fluctuations and flow correlations in ultra-relativistic heavy-ion collisions
Jia, Jiangyong
2014-12-01
I review recent measurements of a large set of flow observables associated with event-shape fluctuations and collective expansion in heavy ion collisions. First, these flow observables are classified and experiment methods are introduced. The experimental results for each type of observables are then presented and compared to theoretical calculations. A coherent picture of initial condition and collective flow based on linear and non-linear hydrodynamic responses is derived, which qualitatively describe most experimental results. I discuss new types of fluctuation measurements that can further our understanding of the event-shape fluctuations and collective expansion dynamics.
Texture- and deformability-based surface recognition by tactile image analysis.
Khasnobish, Anwesha; Pal, Monalisa; Tibarewala, D N; Konar, Amit; Pal, Kunal
2016-08-01
Deformability and texture are two unique object characteristics which are essential for appropriate surface recognition by tactile exploration. Tactile sensation is required to be incorporated in artificial arms for rehabilitative and other human-computer interface applications to achieve efficient and human-like manoeuvring. To accomplish this, surface recognition by tactile data analysis is one of the prerequisites. The aim of this work is to develop an effective technique for identifying various surfaces based on deformability and texture by analysing tactile images obtained during dynamic exploration of the item by artificial arms whose gripper is fitted with tactile sensors. Tactile data were acquired while human beings as well as a robot hand fitted with tactile sensors explored the objects. The tactile images are pre-processed, and relevant features are extracted from the tactile images. These features are provided as input to the variants of support vector machine (SVM), linear discriminant analysis and k-nearest neighbour (kNN) for classification. Based on deformability, six household surfaces are recognized from their corresponding tactile images. Moreover, based on texture five surfaces of daily use are classified. The method adopted in the former two cases has also been applied for deformability- and texture-based recognition of four biomembranes, i.e. membranes prepared from biomaterials which can be used for various applications such as drug delivery and implants. Linear SVM performed best for recognizing surface deformability with an accuracy of 83 % in 82.60 ms, whereas kNN classifier recognizes surfaces of daily use having different textures with an accuracy of 89 % in 54.25 ms and SVM with radial basis function kernel recognizes biomembranes with an accuracy of 78 % in 53.35 ms. 
The classifiers are observed to generalize well on the unseen test datasets with very high performance to achieve efficient material recognition based on its deformability and texture.
NASA Astrophysics Data System (ADS)
Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Amirul Abdullah, Muhammad; Hasnun Arif Hassan, Mohd; Khalil, Zubair
2018-04-01
The present study employs a machine learning algorithm, namely support vector machine (SVM), to classify high and low potential archers from a collection of bio-physiological variables trained on different SVMs. 50 youth archers with the average age and standard deviation of (17.0 ±.056) gathered from various archery programmes completed a one end shooting score test. The bio-physiological variables, namely resting heart rate, resting respiratory rate, resting diastolic blood pressure, resting systolic blood pressure, as well as calories intake, were measured prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on the variables assessed. SVM models with linear, quadratic and cubic kernel functions were trained on the aforementioned variables. The k-means analysis clustered the archers into high (HPA) and low potential archers (LPA), respectively. It was demonstrated that the linear SVM exhibited good accuracy, with a classification accuracy of 94% in comparison to the other tested models. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected bio-physiological variables examined.
Classification of cardiovascular tissues using LBP based descriptors and a cascade SVM.
Mazo, Claudia; Alegre, Enrique; Trujillo, Maria
2017-08-01
Histological images have characteristics, such as texture, shape, colour and spatial structure, that permit the differentiation of each fundamental tissue and organ. Texture is one of the most discriminative features. The automatic classification of tissues and organs based on histology images is an open problem, due to the lack of automatic solutions when treating tissues without pathologies. In this paper, we demonstrate that it is possible to automatically classify cardiovascular tissues using texture information and Support Vector Machines (SVM). Additionally, we found that it is feasible to recognise several cardiovascular organs following the same process. The texture of histological images was described using Local Binary Patterns (LBP), LBP Rotation Invariant (LBPri), Haralick features and different concatenations between them, representing in this way its content. Using a SVM with linear kernel, we selected the most appropriate descriptor, which for this problem was a concatenation of LBP and LBPri. Due to the small number of images available, we could not follow an approach based on deep learning, but we selected the classifier that yielded the highest performance by comparing SVM with Random Forest and Linear Discriminant Analysis. Once SVM was selected as the classifier with a higher area under the curve that represents both higher recall and precision, we tuned it evaluating different kernels, finding that a linear SVM allowed us to accurately separate four classes of tissues: (i) cardiac muscle of the heart, (ii) smooth muscle of the muscular artery, (iii) loose connective tissue, and (iv) smooth muscle of the large vein and the elastic artery. The experimental validation was conducted using 3000 blocks of 100 × 100 pixels, with 600 blocks per class, and the classification was assessed using a 10-fold cross-validation. 
Using LBP as the descriptor, concatenated with LBPri and a SVM with linear kernel, the main four classes of tissues were recognised with an AUC higher than 0.98. A polynomial kernel was then used to separate the elastic artery and vein, yielding an AUC in both cases superior to 0.98. Following the proposed approach, it is possible to separate with very high precision (AUC greater than 0.98) the fundamental tissues of the cardiovascular system along with some organs, such as the heart, arteries and veins. Copyright © 2017 Elsevier B.V. All rights reserved.
Acharya, U Rajendra; Sree, S Vinitha; Chattopadhyay, Subhagata; Yu, Wenwei; Ang, Peng Chuan Alvin
2011-06-01
Epilepsy is a common neurological disorder that is characterized by the recurrence of seizures. Electroencephalogram (EEG) signals are widely used to diagnose seizures. Because of the non-linear and dynamic nature of the EEG signals, it is difficult to effectively decipher the subtle changes in these signals by visual inspection and by using linear techniques. Therefore, non-linear methods are being researched to analyze the EEG signals. In this work, we use the recorded EEG signals in Recurrence Plots (RP), and extract Recurrence Quantification Analysis (RQA) parameters from the RP in order to classify the EEG signals into normal, ictal, and interictal classes. A Recurrence Plot (RP) is a graph that shows all the times at which a state of the dynamical system recurs. Studies have reported significantly different RQA parameters for the three classes. However, more studies are needed to develop classifiers that use these promising features and present good classification accuracy in differentiating the three types of EEG segments. Therefore, in this work, we have used ten RQA parameters to quantify the important features in the EEG signals. These features were fed to seven different classifiers: Support Vector Machine (SVM), Gaussian Mixture Model (GMM), Fuzzy Sugeno Classifier, K-Nearest Neighbor (KNN), Naive Bayes Classifier (NBC), Decision Tree (DT), and Radial Basis Probabilistic Neural Network (RBPNN). Our results show that the SVM classifier was able to identify the EEG class with an average efficiency of 95.6%, sensitivity and specificity of 98.9% and 97.8%, respectively.
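A recurrence plot and one basic RQA measure can be sketched as follows. This is a simplified illustration that treats each scalar sample as the system state (no time-delay embedding) and computes only the recurrence rate; the abstract does not list which ten RQA parameters were used, and the threshold `eps` and the toy signal are assumptions.

```python
def recurrence_matrix(signal, eps):
    """Binary recurrence plot: entry (i, j) is 1 when samples i and j lie within eps."""
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]

def recurrence_rate(matrix):
    """Fraction of recurrent points in the plot (main diagonal included)."""
    n = len(matrix)
    return sum(sum(row) for row in matrix) / (n * n)

# Toy periodic signal standing in for an EEG segment.
toy_signal = [0.0, 1.0, 0.0, 1.0]
rp = recurrence_matrix(toy_signal, eps=0.1)
for row in rp:
    print(row)
print("recurrence rate:", recurrence_rate(rp))
```

A strictly periodic signal produces the regular checkerboard seen here, whereas irregular dynamics break the diagonal structures, and it is this kind of difference that RQA parameters quantify.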
Kim, Jongin; Park, Hyeong-jun
2016-01-01
The purpose of this study is to classify EEG data on imagined speech in a single trial. We recorded EEG data while five subjects imagined different vowels, /a/, /e/, /i/, /o/, and /u/. We divided each single trial dataset into thirty segments and extracted features (mean, variance, standard deviation, and skewness) from all segments. To reduce the dimension of the feature vector, we applied a feature selection algorithm based on the sparse regression model. These features were classified using a support vector machine with a radial basis function kernel, an extreme learning machine, and two variants of an extreme learning machine with different kernels. Because each single trial consisted of thirty segments, our algorithm decided the label of the single trial by selecting the most frequent output among the outputs of the thirty segments. As a result, we observed that the extreme learning machine and its variants achieved better classification rates than the support vector machine with a radial basis function kernel and linear discrimination analysis. Thus, our results suggested that EEG responses to imagined speech could be successfully classified in a single trial using an extreme learning machine with a radial basis function and linear kernel. This study with classification of imagined speech might contribute to the development of silent speech BCI systems. PMID:28097128
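The trial-level decision described above, taking the most frequent output among the thirty segment outputs, amounts to a majority vote. The sketch below is a minimal version; how ties were resolved is not stated in the abstract, so the tie behaviour here (first label encountered wins) is an assumption, as are the example predictions.

```python
from collections import Counter

def trial_label(segment_labels):
    """Return the most frequent label among the per-segment predictions.

    Ties are resolved in favour of the label that appears first in the input.
    """
    return Counter(segment_labels).most_common(1)[0][0]

# Hypothetical per-segment classifier outputs for one thirty-segment trial.
segment_predictions = ["a"] * 12 + ["e"] * 10 + ["i"] * 8
print(trial_label(segment_predictions))
```

Voting over segments lets a trial be labelled correctly even when a sizeable share of its individual segments are misclassified.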
Discriminative Learning of Receptive Fields from Responses to Non-Gaussian Stimulus Ensembles
Meyer, Arne F.; Diepenbrock, Jan-Philipp; Happel, Max F. K.; Ohl, Frank W.; Anemüller, Jörn
2014-01-01
Analysis of sensory neurons' processing characteristics requires simultaneous measurement of presented stimuli and concurrent spike responses. The functional transformation from high-dimensional stimulus space to the binary space of spike and non-spike responses is commonly described with linear-nonlinear models, whose linear filter component describes the neuron's receptive field. From a machine learning perspective, this corresponds to the binary classification problem of discriminating spike-eliciting from non-spike-eliciting stimulus examples. The classification-based receptive field (CbRF) estimation method proposed here adapts a linear large-margin classifier to optimally predict experimental stimulus-response data and subsequently interprets learned classifier weights as the neuron's receptive field filter. Computational learning theory provides a theoretical framework for learning from data and guarantees optimality in the sense that the risk of erroneously assigning a spike-eliciting stimulus example to the non-spike class (and vice versa) is minimized. Efficacy of the CbRF method is validated with simulations and for auditory spectro-temporal receptive field (STRF) estimation from experimental recordings in the auditory midbrain of Mongolian gerbils. Acoustic stimulation is performed with frequency-modulated tone complexes that mimic properties of natural stimuli, specifically non-Gaussian amplitude distribution and higher-order correlations. Results demonstrate that the proposed approach successfully identifies correct underlying STRFs, even in cases where second-order methods based on the spike-triggered average (STA) do not. Applied to small data samples, the method is shown to converge on smaller amounts of experimental recordings and with lower estimation variance than the generalized linear model and recent information theoretic methods. 
Thus, CbRF estimation may prove useful for investigation of neuronal processes in response to natural stimuli and in settings where rapid adaptation is induced by experimental design. PMID:24699631
NASA Astrophysics Data System (ADS)
Chernavskaia, Olga; Heuke, Sandro; Vieth, Michael; Friedrich, Oliver; Schürmann, Sebastian; Atreya, Raja; Stallmach, Andreas; Neurath, Markus F.; Waldner, Maximilian; Petersen, Iver; Schmitt, Michael; Bocklitz, Thomas; Popp, Jürgen
2016-07-01
Assessing disease activity is a prerequisite for an adequate treatment of inflammatory bowel diseases (IBD) such as Crohn’s disease and ulcerative colitis. In addition to endoscopic mucosal healing, histologic remission poses a promising end-point of IBD therapy. However, evaluating histological remission harbors the risk for complications due to the acquisition of biopsies and results in a delay of diagnosis because of tissue processing procedures. In this regard, non-linear multimodal imaging techniques might serve as an unparalleled technique that allows the real-time evaluation of microscopic IBD activity in the endoscopy unit. In this study, tissue sections were investigated using the non-linear multimodal microscopy combination of coherent anti-Stokes Raman scattering (CARS), two-photon excited auto fluorescence (TPEF) and second-harmonic generation (SHG). After the measurement a gold-standard assessment of histological indexes was carried out based on a conventional H&E stain. Subsequently, various geometry and intensity related features were extracted from the multimodal images. An optimized feature set was utilized to predict histological index levels based on a linear classifier. Based on the automated prediction, the diagnosis time interval is decreased. Therefore, non-linear multimodal imaging may provide a real-time diagnosis of IBD activity suited to assist clinical decision making within the endoscopy unit.
Diagnosis of multiple sclerosis from EEG signals using nonlinear methods.
Torabi, Ali; Daliri, Mohammad Reza; Sabzposhan, Seyyed Hojjat
2017-12-01
EEG signals carry essential information about the brain and neural diseases. The main purpose of this study is to classify two groups, healthy volunteers and Multiple Sclerosis (MS) patients, using nonlinear features of EEG signals recorded while performing cognitive tasks. EEG signals were recorded while users performed two different attentional tasks. One of the tasks was based on detecting a desired change in color luminance and the other task was based on detecting a desired change in direction of motion. EEG signals were analyzed in two ways: EEG signal analysis without rhythm decomposition and EEG sub-band analysis. After recording and preprocessing, the time delay embedding method was used for state space reconstruction; embedding parameters were determined for the original signals and their sub-bands. Afterwards, nonlinear methods were used in the feature extraction phase. To reduce the feature dimension, scalar feature selection was performed using the T-test and Bhattacharyya criteria. Then, the data were classified using linear support vector machines (SVM) and the k-nearest neighbor (KNN) method. The best combination of the criteria and classifiers was determined for each task by comparing performances. For both tasks, the best results were achieved by using the T-test criterion and the SVM classifier. For the direction-based and the color-luminance-based tasks, maximum classification performances were 93.08 and 79.79% respectively, which were reached by using the optimal set of features. Our results show that the nonlinear dynamic features of EEG signals seem to be useful and effective in MS diagnosis.
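Scalar feature selection with a T-test criterion can be sketched by scoring each feature separately and ranking by the absolute statistic. The version below assumes Welch's unequal-variance t statistic, which the abstract does not specify, and uses made-up feature values.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))

def rank_features(group_a, group_b):
    """Rank feature indices by |t|, most discriminative first.

    group_a and group_b are lists of feature vectors, one per subject.
    """
    n_features = len(group_a[0])
    scores = []
    for k in range(n_features):
        col_a = [row[k] for row in group_a]
        col_b = [row[k] for row in group_b]
        scores.append(abs(welch_t(col_a, col_b)))
    return sorted(range(n_features), key=lambda k: scores[k], reverse=True)

# Hypothetical two-feature data: feature 1 separates the groups, feature 0 does not.
healthy = [[1.0, 5.0], [1.2, 5.1], [0.9, 4.9]]
patients = [[1.1, 8.0], [1.0, 8.2], [1.05, 7.9]]
print(rank_features(healthy, patients))
```

Each feature is scored independently of the others, which is what makes the selection "scalar"; redundancy between highly ranked features is not taken into account.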
Quality grading of Atlantic salmon (Salmo salar) by computer vision.
Misimi, E; Erikson, U; Skavhaug, A
2008-06-01
In this study, we present a promising method of computer vision-based quality grading of whole Atlantic salmon (Salmo salar). Using computer vision, it was possible to differentiate among different quality grades of Atlantic salmon based on the external geometrical information contained in the fish images. Initially, before the image acquisition, the fish were subjectively graded and labeled into grading classes by a qualified human inspector in the processing plant. Prior to classification, the salmon images were segmented into binary images, and then feature extraction was performed on the geometrical parameters of the fish from the grading classes. The classification algorithm was a threshold-based classifier, which was designed using linear discriminant analysis. The performance of the classifier was tested by using the leave-one-out cross-validation method, and the classification results showed a good agreement between the classification done by human inspectors and by the computer vision. The computer vision-based method classified correctly 90% of the salmon from the data set as compared with the classification by human inspector. Overall, it was shown that computer vision can be used as a powerful tool to grade Atlantic salmon into quality grades in a fast and nondestructive manner by a relatively simple classifier algorithm. The low cost of implementation of today's advanced computer vision solutions makes this method feasible for industrial purposes in fish plants as it can replace manual labor, on which grading tasks still rely.
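Leave-one-out cross-validation of a threshold classifier can be sketched as below. The example assumes a single geometrical feature and equal priors with a shared variance, in which case the LDA boundary reduces to the midpoint between the class means; the feature values and function names are hypothetical and not taken from the study.

```python
from statistics import mean

def fit_threshold(samples):
    """Fit a 1-D threshold from (feature, label) pairs with labels 0 and 1.

    Returns the midpoint between class means and whether class 1 lies above it.
    """
    m0 = mean(x for x, y in samples if y == 0)
    m1 = mean(x for x, y in samples if y == 1)
    return 0.5 * (m0 + m1), m1 > m0

def predict(x, threshold, one_is_high):
    """Assign class 1 on the side of the threshold where class 1's mean lies."""
    return int((x > threshold) == one_is_high)

def leave_one_out_accuracy(samples):
    """Hold out each sample in turn, refit on the rest, and score the held-out one."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        training = samples[:i] + samples[i + 1:]
        threshold, one_is_high = fit_threshold(training)
        correct += predict(x, threshold, one_is_high) == y
    return correct / len(samples)

# Hypothetical shape feature for two quality grades (0 and 1).
fish = [(1.0, 0), (1.2, 0), (0.8, 0), (3.0, 1), (3.1, 1), (2.9, 1)]
print(leave_one_out_accuracy(fish))
```

Refitting the threshold for every held-out fish keeps the test sample out of the design step, so the resulting accuracy is not inflated by training on the data being scored.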
Image Statistics and the Representation of Material Properties in the Visual Cortex
Baumgartner, Elisabeth; Gegenfurtner, Karl R.
2016-01-01
We explored perceived material properties (roughness, texturedness, and hardness) with a novel approach that compares perception, image statistics and brain activation, as measured with fMRI. We initially asked participants to rate 84 material images with respect to the above mentioned properties, and then scanned 15 of the participants with fMRI while they viewed the material images. The images were analyzed with a set of image statistics capturing their spatial frequency and texture properties. Linear classifiers were then applied to the image statistics as well as the voxel patterns of visually responsive voxels and early visual areas to discriminate between images with high and low perceptual ratings. Roughness and texturedness could be classified above chance level based on image statistics. Roughness and texturedness could also be classified based on the brain activation patterns in visual cortex, whereas hardness could not. Importantly, the agreement in classification based on image statistics and brain activation was also above chance level. Our results show that information about visual material properties is to a large degree contained in low-level image statistics, and that these image statistics are also partially reflected in brain activity patterns induced by the perception of material images. PMID:27582714
Brownian motion curve-based textural classification and its application in cancer diagnosis.
Mookiah, Muthu Rama Krishnan; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2011-06-01
To develop an automated diagnostic methodology based on textural features of the oral mucosal epithelium to discriminate normal and oral submucous fibrosis (OSF). A total of 83 normal and 29 OSF images from histopathologic sections of the oral mucosa are considered. The proposed diagnostic mechanism consists of two parts: feature extraction using Brownian motion curve (BMC) and design of a suitable classifier. The discrimination ability of the features has been substantiated by statistical tests. An error back-propagation neural network (BPNN) is used to classify OSF vs. normal. In development of an automated oral cancer diagnostic module, BMC has played an important role in characterizing textural features of the oral images. Fisher's linear discriminant analysis yields 100% sensitivity and 85% specificity, whereas BPNN leads to 92.31% sensitivity and 100% specificity, respectively. In addition to intensity and morphology-based features, textural features are also very important, especially in histopathologic diagnosis of oral cancer. In view of this, a set of textural features is extracted using the BMC for the diagnosis of OSF. Finally, a textural classifier is designed using BPNN, which leads to a diagnostic performance with 96.43% accuracy. (Anal Quant
Boareto, Marcelo; Cesar, Jonatas; Leite, Vitor B P; Caticha, Nestor
2015-01-01
We introduce Supervised Variational Relevance Learning (Suvrel), a variational method to determine metric tensors to define distance based similarity in pattern classification, inspired by relevance learning. The variational method is applied to a cost function that penalizes large intraclass distances and favors small interclass distances. We find analytically the metric tensor that minimizes the cost function. Preprocessing the patterns by doing linear transformations using the metric tensor yields a dataset which can be more efficiently classified. We test our methods using publicly available datasets, for some standard classifiers. Among these datasets, two were tested by the MAQC-II project and, even without the use of further preprocessing, our results improve on their performance.
Analysis of longitudinal diffusion-weighted images in healthy and pathological aging: An ADNI study.
Kruggel, Frithjof; Masaki, Fumitaro; Solodkin, Ana
2017-02-15
The widely used framework of voxel-based morphometry for analyzing neuroimages is extended here to model longitudinal imaging data by exchanging the linear model with a linear mixed-effects model. The new approach is employed for analyzing a large longitudinal sample of 756 diffusion-weighted images acquired in 177 subjects of the Alzheimer's Disease Neuroimaging initiative (ADNI). While sample- and group-level results from both approaches are equivalent, the mixed-effect model yields information at the single subject level. Interestingly, the neurobiological relevance of the relevant parameter at the individual level describes specific differences associated with aging. In addition, our approach highlights white matter areas that reliably discriminate between patients with Alzheimer's disease and healthy controls with a predictive power of 0.99 and include the hippocampal alveus, the para-hippocampal white matter, the white matter of the posterior cingulate, and optic tracts. In this context, notably the classifier includes a sub-population of patients with minimal cognitive impairment into the pathological domain. Our classifier offers promising features for an accessible biomarker that predicts the risk of conversion to Alzheimer's disease. Data used in preparation of this article were obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf. Significance statement: This study assesses neuro-degenerative processes in the brain's white matter as revealed by diffusion-weighted imaging, in order to discriminate healthy from pathological aging in a large sample of elderly subjects. 
The analysis of time-series examinations in a linear mixed effects model allowed the discrimination of population-based aging processes from individual determinants. We demonstrate that a simple classifier based on white matter imaging data is able to predict the conversion to Alzheimer's disease with a high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
Optical diagnosis of cervical cancer by higher order spectra and boosting
NASA Astrophysics Data System (ADS)
Pratiher, Sawon; Mukhopadhyay, Sabyasachi; Barman, Ritwik; Pratiher, Souvik; Pradhan, Asima; Ghosh, Nirmalya; Panigrahi, Prasanta K.
2017-03-01
In this contribution, we report the application of higher order statistical moments using decision tree and ensemble based learning methodology for the development of diagnostic algorithms for optical diagnosis of cancer. The classification results were compared to those obtained with independent feature extractors such as linear discriminant analysis (LDA). Using higher order statistics as features for a boosted classifier yields higher specificity and sensitivity while being much faster than other time-frequency domain based methods.
2014-09-18
Converter AES Advance Encryption Standard ANN Artificial Neural Network APS Application Support AUC Area Under the Curve CPA Correlation Power Analysis ...Importance WGN White Gaussian Noise WPAN Wireless Personal Area Networks XEnv Cross-Environment XRx Cross-Receiver xxi ADVANCES IN SCA AND RF-DNA...based tool called KillerBee was released in 2009 that increases the exposure of ZigBee and other IEEE 802.15.4-based Wireless Personal Area Networks
Koua, Dominique; Kuhn-Nentwig, Lucia
2017-01-01
Spider venoms are rich cocktails of bioactive peptides, proteins, and enzymes that have been intensively investigated over the years. In order to provide a better comprehension of that richness, we propose a three-level family classification system for spider venom components. This classification is supported by an exhaustive set of 219 new profile hidden Markov models (HMMs) able to attribute a given peptide to its precise peptide type, family, and group. The proposed classification has the advantages of being totally independent from variable spider taxonomic names and can easily evolve. In addition to the new classifiers, we introduce and demonstrate the efficiency of hmmcompete, a new standalone tool that monitors HMM-based family classification and, after post-processing the result, reports the best classifier when multiple models produce significant scores towards given peptide queries. The combined use of hmmcompete and the new spider venom component-specific classifiers demonstrated 96% sensitivity to properly classify all known spider toxins from the UniProtKB database. These tools are timely regarding the important classification needs caused by the increasing number of peptides and proteins generated by transcriptomic projects. PMID:28786958
Two algorithms for neural-network design and training with application to channel equalization.
Sweatman, C Z; Mulgrew, B; Gibson, G J
1998-01-01
We describe two algorithms for designing and training neural-network classifiers. The first, the linear programming slab algorithm (LPSA), is motivated by the problem of reconstructing digital signals corrupted by passage through a dispersive channel and by additive noise. It constructs a multilayer perceptron (MLP) to separate two disjoint sets by using linear programming methods to identify network parameters. The second, the perceptron learning slab algorithm (PLSA), avoids the computational costs of linear programming by using an error-correction approach to identify parameters. Both algorithms operate in highly constrained parameter spaces and are able to exploit symmetry in the classification problem. Using these algorithms, we develop a number of procedures for the adaptive equalization of a complex linear 4-quadrature amplitude modulation (QAM) channel, and compare their performance in a simulation study. Results are given for both stationary and time-varying channels, the latter based on the COST 207 GSM propagation model.
Anam, Khairul; Al-Jumaily, Adel
2017-01-01
The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigates some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR between amputees and non-amputees is small, with accuracies of 98.55% on amputees and 99.5% on non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.
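The training scheme described above (random input weights, output weights obtained analytically) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the tanh activation, the ridge term, the weight ranges, the toy XOR data and the names `train_elm`, `predict_elm`, `hidden` and `solve` are all assumptions made for the example.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def hidden(x, W, b):
    """Hidden-layer activations (tanh is an assumed choice) for one input."""
    return [math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + bj)
            for w, bj in zip(W, b)]

def train_elm(X, y, n_hidden=20, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    d, n = len(X[0]), len(X)
    # Input weights and biases are drawn at random and never updated.
    W = [[rng.uniform(-2, 2) for _ in range(d)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = [hidden(x, W, b) for x in X]
    # Output weights in closed form: beta = H^T (H H^T + ridge * I)^-1 y
    G = [[sum(p * q for p, q in zip(H[i], H[j])) + (ridge if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(G, y)
    beta = [sum(alpha[i] * H[i][j] for i in range(n)) for j in range(n_hidden)]
    return W, b, beta

def predict_elm(model, x):
    W, b, beta = model
    score = sum(h * w for h, w in zip(hidden(x, W, b), beta))
    return 1 if score >= 0 else -1

# Toy XOR problem (illustrative data, not EMG features).
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [-1, 1, 1, -1]
model = train_elm(X, y)
predictions = [predict_elm(model, x) for x in X]
```

No iterative weight update appears anywhere: the only training cost is one linear solve, which is the source of the speed advantage the abstract refers to.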
Prediction of Potential Hit Song and Musical Genre Using Artificial Neural Networks
NASA Astrophysics Data System (ADS)
Monterola, Christopher; Abundo, Cheryl; Tugaff, Jeric; Venturina, Lorcel Ericka
Accurately quantifying the goodness of music based on the seemingly subjective taste of the public is a multi-million industry. Recording companies can make sound decisions on which songs or artists to prioritize if accurate forecasting is achieved. We extract 56 single-valued musical features (e.g. pitch and tempo) from 380 Original Pilipino Music (OPM) songs (190 are hit songs) released from 2004 to 2006. Based on an effect size criterion which measures a variable's discriminating power, the 20 highest ranked features are fed to a classifier tasked to predict hit songs. We show that regardless of musical genre, a trained feed-forward neural network (NN) can predict potential hit songs with an average accuracy of ΦNN = 81%. The accuracy is about +20% higher than those of standard classifiers such as linear discriminant analysis (LDA, ΦLDA = 61%) and classification and regression trees (CART, ΦCART = 57%). Both LDA and CART are above the proportional chance criterion (PCC, ΦPCC = 50%) but are slightly below the suggested acceptable classifier requirement of 1.25*ΦPCC = 63%. Utilizing a similar procedure, we demonstrate that different genres (ballad, alternative rock or rock) of OPM songs can be automatically classified with near perfect accuracy using LDA or NN but only around 77% using CART.
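The chance-level figures quoted above follow from simple arithmetic. As a hedged sketch, assuming the proportional chance criterion is the sum of squared class proportions (the usual definition), the 380-song sample with 190 hits gives the values below; the function name `proportional_chance` is illustrative only.

```python
def proportional_chance(class_counts):
    """Proportional chance criterion: sum of squared class proportions."""
    total = sum(class_counts)
    return sum((c / total) ** 2 for c in class_counts)

pcc = proportional_chance([190, 190])   # 190 hits, 190 non-hits
threshold = 1.25 * pcc                  # suggested acceptability requirement

# Accuracies as reported in the abstract.
reported = {"NN": 0.81, "LDA": 0.61, "CART": 0.57}
acceptable = {name: acc >= threshold for name, acc in reported.items()}
```

With balanced classes the criterion is 0.5 and the requirement is 0.625, which the abstract rounds to 63%; only the neural network clears it.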
Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...
Background: The U.S. EPA ToxCast™ program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects. Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class
NASA Astrophysics Data System (ADS)
Zargari Khuzani, Abolfazl; Danala, Gopichandh; Heidari, Morteza; Du, Yue; Mashhadi, Najmeh; Qiu, Yuchen; Zheng, Bin
2018-02-01
High recall rates are a major challenge in mammography screening. Thus, developing a computer-aided diagnosis (CAD) scheme to classify between malignant and benign breast lesions can play an important role in improving the efficacy of mammography screening. The objective of this study is to develop and test a unique image feature fusion framework to improve performance in classifying suspicious mass-like breast lesions depicted on mammograms. The image dataset consists of 302 suspicious masses detected on both craniocaudal and mediolateral-oblique view images. Amongst them, 151 were malignant and 151 were benign. The study consists of the following three image processing and feature analysis steps. First, an adaptive region growing segmentation algorithm was used to automatically segment mass regions. Second, a set of 70 image features related to spatial and frequency characteristics of mass regions were initially computed. Third, a generalized linear regression model (GLM) based machine learning classifier combined with a bat optimization algorithm was used to optimally fuse the selected image features based on a predefined performance assessment index. The area under the ROC curve (AUC) was used as the performance assessment index. Applying the CAD scheme to the testing dataset, AUC was 0.75+/-0.04, which was significantly higher than using a single best feature (AUC=0.69+/-0.05) or the classifier with equally weighted features (AUC=0.73+/-0.05). This study demonstrated that, compared with the conventional equal-weighted approach, an unequal-weighted feature fusion approach had the potential to significantly improve accuracy in classifying between malignant and benign breast masses.
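To make the comparison between equal and unequal feature weights concrete, the sketch below fuses two features per case with a weighted sum and scores each weighting by AUC, computed as the fraction of malignant/benign pairs ranked correctly. It is only an illustration: the feature values and weights are invented, the weights are fixed by hand rather than found by a bat optimization algorithm, and `fuse` and `auc` are hypothetical names.

```python
def fuse(features, weights):
    """Weighted-sum fusion of one case's feature values."""
    return sum(w * f for w, f in zip(weights, features))

def auc(scores_pos, scores_neg):
    """AUC as the probability that a positive case outscores a negative one."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))

# Invented two-feature cases, purely for illustration.
malignant = [[0.9, 0.2], [0.6, 0.8], [0.5, 0.5]]
benign = [[0.4, 0.9], [0.3, 0.1], [0.2, 0.6]]

def auc_for(weights):
    pos = [fuse(case, weights) for case in malignant]
    neg = [fuse(case, weights) for case in benign]
    return auc(pos, neg)

auc_equal = auc_for([0.5, 0.5])      # equally weighted features
auc_weighted = auc_for([0.9, 0.1])   # hand-picked unequal weights
```

On this toy data the unequal weighting ranks every malignant case above every benign one, whereas equal weights misrank two of the nine pairs.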
NASA Astrophysics Data System (ADS)
Li, Zheng; Jiang, Yi-han; Duan, Lian; Zhu, Chao-zhe
2017-08-01
Objective. Functional near infra-red spectroscopy (fNIRS) is a promising brain imaging technology for brain-computer interfaces (BCI). Future clinical uses of fNIRS will likely require operation over long time spans, during which neural activation patterns may change. However, current decoders for fNIRS signals are not designed to handle changing activation patterns. The objective of this study is to test via simulations a new adaptive decoder for fNIRS signals, the Gaussian mixture model adaptive classifier (GMMAC). Approach. GMMAC can simultaneously classify and track activation pattern changes without the need for ground-truth labels. This adaptive classifier uses computationally efficient variational Bayesian inference to label new data points and update mixture model parameters, using the previous model parameters as priors. We test GMMAC in simulations in which neural activation patterns change over time and compare to static decoders and unsupervised adaptive linear discriminant analysis classifiers. Main results. Our simulation experiments show GMMAC can accurately decode under time-varying activation patterns: shifts of activation region, expansions of activation region, and combined contractions and shifts of activation region. Furthermore, the experiments show the proposed method can track the changing shape of the activation region. Compared to prior work, GMMAC performed significantly better than the other unsupervised adaptive classifiers on a difficult activation pattern change simulation: 99% versus <54% in two-choice classification accuracy. Significance. We believe GMMAC will be useful for clinical fNIRS-based brain-computer interfaces, including neurofeedback training systems, where operation over long time spans is required.
Pareek, Gyan; Acharya, U Rajendra; Sree, S Vinitha; Swapna, G; Yantri, Ratna; Martis, Roshan Joy; Saba, Luca; Krishnamurthi, Ganapathy; Mallarini, Giorgio; El-Baz, Ayman; Al Ekish, Shadi; Beland, Michael; Suri, Jasjit S
2013-12-01
In this work, we have proposed an on-line computer-aided diagnostic system called "UroImage" that classifies a Transrectal Ultrasound (TRUS) image into cancerous or non-cancerous with the help of non-linear Higher Order Spectra (HOS) features and Discrete Wavelet Transform (DWT) coefficients. The UroImage system consists of an on-line system where five significant features (one DWT-based feature and four HOS-based features) are extracted from the test image. These on-line features are transformed by the classifier parameters obtained using the training dataset to determine the class. We trained and tested six classifiers. The dataset used for evaluation had 144 TRUS images which were split into training and testing sets. Three-fold and ten-fold cross-validation protocols were adopted for training and estimating the accuracy of the classifiers. The ground truth used for training was obtained using the biopsy results. Among the six classifiers, using 10-fold cross-validation technique, Support Vector Machine and Fuzzy Sugeno classifiers presented the best classification accuracy of 97.9% with equally high values for sensitivity, specificity and positive predictive value. Our proposed automated system, which achieved more than 95% values for all the performance measures, can be an adjunct tool to provide an initial diagnosis for the identification of patients with prostate cancer. The technique, however, is limited by the limitations of 2D ultrasound guided biopsy, and we intend to improve our technique by using 3D TRUS images in the future.
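The performance measures named above (accuracy, sensitivity, specificity, positive predictive value) all derive from the four cells of a confusion matrix. A minimal sketch follows; the counts are hypothetical and are not taken from the UroImage study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
    }

# Hypothetical counts for illustration only.
m = diagnostic_metrics(tp=45, fp=2, tn=48, fn=5)
```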
Signal Classification in Fading Channels Using Cyclic Spectral Analysis
2009-07-01
Classifier Design The proposed classifier is designed to classify AM, BFSK, OFDM, DS-CDMA, 4-ASK, 8-ASK, BPSK, QPSK, 8-PSK, 16-PSK, 16-QAM, and 64-QAM...five independent neural networks, each trained to classify a signal as either AM, BFSK, DS-CDMA, or a linear modulation scheme with a real-valued...in an SOF image that resembles those of QAM and PSK signals. Additionally, the DS-CDMA scheme can be thought to look like a BPSK signal. However, due
Point spread function based classification of regions for linear digital tomosynthesis
NASA Astrophysics Data System (ADS)
Israni, Kenny; Avinash, Gopal; Li, Baojun
2007-03-01
In digital tomosynthesis, one of the limitations is the presence of out-of-plane blur due to the limited angle acquisition. The point spread function (PSF) characterizes blur in the imaging volume, and is shift-variant in tomosynthesis. The purpose of this research is to classify the tomosynthesis imaging volume into four different categories based on PSF-driven focus criteria. We considered linear tomosynthesis geometry and simple back projection algorithm for reconstruction. The three-dimensional PSF at every pixel in the imaging volume was determined. Intensity profiles were computed for every pixel by integrating the PSF-weighted intensities contained within the line segment defined by the PSF, at each slice. Classification rules based on these intensity profiles were used to categorize image regions. At background and low-frequency pixels, the derived intensity profiles were flat curves with relatively low and high maximum intensities respectively. At in-focus pixels, the maximum intensity of the profiles coincided with the PSF-weighted intensity of the pixel. At out-of-focus pixels, the PSF-weighted intensity of the pixel was always less than the maximum intensity of the profile. We validated our method using human observer classified regions as gold standard. Based on the computed and manual classifications, the mean sensitivity and specificity of the algorithm were 77+/-8.44% and 91+/-4.13% respectively (t=-0.64, p=0.56, DF=4). Such a classification algorithm may assist in mitigating out-of-focus blur from tomosynthesis image slices.
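The four categories described above can be expressed as a small rule set over a pixel's intensity profile. The sketch below is one possible reading of those rules, not the authors' algorithm: it assumes intensities normalized to [0, 1], and the flatness tolerance `flat_tol`, the background level `low_level` and the function name `classify_pixel` are hypothetical.

```python
def classify_pixel(profile, slice_index, flat_tol=0.05, low_level=0.2):
    """Assign one of four focus categories from an intensity profile.

    profile      PSF-weighted intensities along the PSF line, one per slice
    slice_index  position in the profile of the pixel's own slice
    """
    peak = max(profile)
    if peak - min(profile) <= flat_tol:
        # Flat curve: a low maximum means background, a high one low-frequency.
        return "background" if peak < low_level else "low-frequency"
    if profile[slice_index] == peak:
        # The profile maximum coincides with the pixel's own intensity.
        return "in-focus"
    # The pixel's intensity is below the maximum found in another slice.
    return "out-of-focus"

labels = [
    classify_pixel([0.05, 0.06, 0.05], 1),
    classify_pixel([0.80, 0.82, 0.81], 1),
    classify_pixel([0.20, 0.90, 0.30], 1),
    classify_pixel([0.20, 0.90, 0.30], 0),
]
```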
Zhang, Guoqing; Sun, Huaijiang; Xia, Guiyu; Sun, Quansen
2016-07-07
Sparse representation based classification (SRC) has been developed and shown great potential for real-world application. Based on SRC, Yang et al. [10] devised an SRC steered discriminative projection (SRC-DP) method. However, as a linear algorithm, SRC-DP cannot handle data with a highly nonlinear distribution. Kernel sparse representation-based classifier (KSRC) is a non-linear extension of SRC and can remedy the drawback of SRC. KSRC requires the use of a predetermined kernel function, and selection of the kernel function and its parameters is difficult. Recently, multiple kernel learning for SRC (MKL-SRC) [22] has been proposed to learn a kernel from a set of base kernels. However, MKL-SRC only considers the within-class reconstruction residual while ignoring the between-class relationship when learning the kernel weights. In this paper, we propose a novel multiple kernel sparse representation-based classifier (MKSRC), and then we use it as a criterion to design a multiple kernel sparse representation based orthogonal discriminative projection method (MK-SR-ODP). The proposed algorithm aims at learning a projection matrix and a corresponding kernel from the given base kernels such that in the low dimension subspace the between-class reconstruction residual is maximized and the within-class reconstruction residual is minimized. Furthermore, to achieve a minimum overall loss by performing recognition in the learned low-dimensional subspace, we introduce cost information into the dimensionality reduction method. The solutions for the proposed method can be efficiently found based on the trace ratio optimization method [33]. Extensive experimental results demonstrate the superiority of the proposed algorithm when compared with the state-of-the-art methods.
Aksu, Yaman; Miller, David J; Kesidis, George; Yang, Qing X
2010-05-01
Feature selection for classification in high-dimensional spaces can improve generalization, reduce classifier complexity, and identify important, discriminating feature "markers." For support vector machine (SVM) classification, a widely used technique is recursive feature elimination (RFE). We demonstrate that RFE is not consistent with margin maximization, central to the SVM learning approach. We thus propose explicit margin-based feature elimination (MFE) for SVMs and demonstrate both improved margin and improved generalization, compared with RFE. Moreover, for the case of a nonlinear kernel, we show that RFE assumes that the squared weight vector 2-norm is strictly decreasing as features are eliminated. We demonstrate this is not true for the Gaussian kernel and, consequently, RFE may give poor results in this case. MFE for nonlinear kernels gives better margin and generalization. We also present an extension which achieves further margin gains, by optimizing only two degrees of freedom--the hyperplane's intercept and its squared 2-norm--with the weight vector orientation fixed. We finally introduce an extension that allows margin slackness. We compare against several alternatives, including RFE and a linear programming method that embeds feature selection within the classifier design. On high-dimensional gene microarray data sets, University of California at Irvine (UCI) repository data sets, and Alzheimer's disease brain image data, MFE methods give promising results.
Using passive cavitation images to classify high-intensity focused ultrasound lesions.
Haworth, Kevin J; Salgaonkar, Vasant A; Corregan, Nicholas M; Holland, Christy K; Mast, T Douglas
2015-09-01
Passive cavitation imaging provides spatially resolved monitoring of cavitation emissions. However, the diffraction limit of a linear imaging array results in relatively poor range resolution. Poor range resolution has limited prior analyses of the spatial specificity and sensitivity of passive cavitation imaging in predicting thermal lesion formation. In this study, this limitation is overcome by orienting a linear array orthogonal to the high-intensity focused ultrasound propagation direction and performing passive imaging. Fourteen lesions were formed in ex vivo bovine liver samples as a result of 1.1-MHz continuous-wave ultrasound exposure. The lesions were classified as focal, "tadpole" or pre-focal based on their shape and location. Passive cavitation images were beamformed from emissions at the fundamental, harmonic, ultraharmonic and inharmonic frequencies with an established algorithm. Using the area under a receiver operating characteristic curve (AUROC), fundamental, harmonic and ultraharmonic emissions were found to be significant predictors of lesion formation for all lesion types. For both harmonic and ultraharmonic emissions, pre-focal lesions were classified most successfully (AUROC values of 0.87 and 0.88, respectively), followed by tadpole lesions (AUROC values of 0.77 and 0.64, respectively) and focal lesions (AUROC values of 0.65 and 0.60, respectively). Copyright © 2015 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.
Scoring and staging systems using Cox linear regression modeling and recursive partitioning.
Lee, J W; Um, S H; Lee, J B; Mun, J; Cho, H
2006-01-01
Scoring and staging systems are used to determine the order and class of data according to predictors. Systems used for medical data, such as the Child-Turcotte-Pugh scoring and staging systems for ordering and classifying patients with liver disease, are often derived strictly from physicians' experience and intuition. We construct objective and data-based scoring/staging systems using statistical methods. We consider Cox linear regression modeling and recursive partitioning techniques for censored survival data. In particular, to obtain a target number of stages we propose cross-validation and amalgamation algorithms. We also propose an algorithm for constructing scoring and staging systems by integrating local Cox linear regression models into recursive partitioning, so that we can retain the merits of both methods such as superior predictive accuracy, ease of use, and detection of interactions between predictors. The staging system construction algorithms are compared by cross-validation evaluation of real data. The data-based cross-validation comparison shows that Cox linear regression modeling is somewhat better than recursive partitioning when there are only continuous predictors, while recursive partitioning is better when there are significant categorical predictors. The proposed local Cox linear recursive partitioning has better predictive accuracy than Cox linear modeling and simple recursive partitioning. This study indicates that integrating local linear modeling into recursive partitioning can significantly improve prediction accuracy in constructing scoring and staging systems.
Esserman, Denise A.; Moore, Charity G.; Roth, Mary T.
2009-01-01
Older community dwelling adults often take multiple medications for numerous chronic diseases. Non-adherence to these medications can have a large public health impact. Therefore, the measurement and modeling of medication adherence in the setting of polypharmacy is an important area of research. We apply a variety of different modeling techniques (standard linear regression; weighted linear regression; adjusted linear regression; naïve logistic regression; beta-binomial (BB) regression; generalized estimating equations (GEE)) to binary medication adherence data from a study in a North Carolina based population of older adults, where each medication an individual was taking was classified as adherent or non-adherent. In addition, through simulation we compare these different methods based on Type I error rates, bias, power, empirical 95% coverage, and goodness of fit. We find that estimation and inference using GEE is robust to a wide variety of scenarios and we recommend using this in the setting of polypharmacy when adherence is dichotomously measured for multiple medications per person. PMID:20414358
Locality-preserving sparse representation-based classification in hyperspectral imagery
NASA Astrophysics Data System (ADS)
Gao, Lianru; Yu, Haoyang; Zhang, Bing; Li, Qingting
2016-10-01
This paper proposes to combine locality-preserving projections (LPP) and sparse representation (SR) for hyperspectral image classification. The LPP is first used to reduce the dimensionality of all the training and testing data by finding the optimal linear approximations to the eigenfunctions of the Laplace Beltrami operator on the manifold, where the high-dimensional data lies. Then, SR codes the projected testing pixels as sparse linear combinations of all the training samples to classify the testing pixels by evaluating which class leads to the minimum approximation error. The integration of LPP and SR represents an innovative contribution to the literature. The proposed approach, called locality-preserving SR-based classification, addresses the imbalance between high dimensionality of hyperspectral data and the limited number of training samples. Experimental results on three real hyperspectral data sets demonstrate that the proposed approach outperforms the original counterpart, i.e., SR-based classification.
Shim, You-Shin; Kim, Jong-Chan; Jeong, Seung-Weon
2016-01-01
A simultaneous analytical method for piperine, capsaicin, and dihydrocapsaicin in Korean instant-noodle soup base using HPLC was validated in terms of precision, accuracy, sensitivity, and linearity. The HPLC separation was performed on a reversed-phase C18 column (5 μm particle size, 4.6 mm id, 250 mm length) using a UV detector fixed at 280 nm. The LOD and LOQ of the HPLC analyses ranged from 0.25 to 1.03 mg/kg. The intraday and interday precisions of the individual piperine, capsaicin, and dihydrocapsaicin were <10.55%, and the recovery values ranged from 85.43 to 94.68%. The calibration curves exhibited good linearity (r² = 0.999) within the tested ranges. These results suggest that the analytical method in this study can be used to classify Korean instant noodles based on their levels of spiciness.
Structural vibration-based damage classification of delaminated smart composite laminates
NASA Astrophysics Data System (ADS)
Khan, Asif; Kim, Heung Soo; Sohn, Jung Woo
2018-03-01
Separation along the interfaces of layers (delamination) is a principal mode of failure in laminated composites and its detection is of prime importance for structural integrity of composite materials. In this work, structural vibration response is employed to detect and classify delaminations in piezo-bonded laminated composites. Improved layerwise theory and finite element method are adopted to develop the electromechanically coupled governing equation of a smart composite laminate with and without delaminations. Transient responses of the healthy and damaged structures are obtained through a surface bonded piezoelectric sensor by solving the governing equation in the time domain. Wavelet packet transform (WPT) and linear discriminant analysis (LDA) are employed to extract discriminative features from the structural vibration response of the healthy and delaminated structures. Dendrogram-based support vector machine (DSVM) is used to classify the discriminative features. The confusion matrix of the classification algorithm provided physically consistent results.
Automatic Tortuosity-Based Retinopathy of Prematurity Screening System
NASA Astrophysics Data System (ADS)
Sukkaew, Lassada; Uyyanonvara, Bunyarit; Makhanov, Stanislav S.; Barman, Sarah; Pangputhipong, Pannet
Retinopathy of Prematurity (ROP) is an infant disease characterized by increased dilation and tortuosity of the retinal blood vessels. Automatic tortuosity evaluation from retinal digital images is very useful to assist ophthalmologists in ROP screening and to prevent childhood blindness. This paper proposes a method to automatically classify an image as tortuous or non-tortuous. The process imitates expert ophthalmologists' screening by searching for clearly tortuous vessel segments. First, a skeleton of the retinal blood vessels is extracted from the original infant retinal image using a series of morphological operators. Next, we propose to partition the blood vessels recursively using an adaptive linear interpolation scheme. Finally, the tortuosity is calculated based on the curvature of the resulting vessel segments. The retinal images are then classified into two classes using segments characterized by the highest tortuosity. For an optimal set of training parameters the prediction accuracy is as high as 100%.
Automatic stage identification of Drosophila egg chamber based on DAPI images
Jia, Dongyu; Xu, Qiuping; Xie, Qian; Mio, Washington; Deng, Wu-Min
2016-01-01
The Drosophila egg chamber, whose development is divided into 14 stages, is a well-established model for developmental biology. However, visual stage determination can be a tedious, subjective and time-consuming task prone to errors. Our study presents an objective, reliable and repeatable automated method for quantifying cell features and classifying egg chamber stages based on DAPI images. The proposed approach is composed of two steps: 1) a feature extraction step and 2) a statistical modeling step. The egg chamber features used are egg chamber size, oocyte size, egg chamber ratio and distribution of follicle cells. Methods for determining the onset of the polytene stage and centripetal migration are also discussed. The statistical model uses linear and ordinal regression to explore the stage-feature relationships and classify egg chamber stages. Combined with machine learning, our method has great potential to enable discovery of hidden developmental mechanisms. PMID:26732176
Neural network classification of sweet potato embryos
NASA Astrophysics Data System (ADS)
Molto, Enrique; Harrell, Roy C.
1993-05-01
Somatic embryogenesis is a process that allows for the in vitro propagation of thousands of plants in sub-liter size vessels and has been successfully applied to many significant species. The heterogeneity of maturity and quality of embryos produced with this technique requires sorting to obtain a uniform product. An automated harvester is being developed at the University of Florida to sort embryos in vitro at different stages of maturation in a suspension culture. The system utilizes machine vision to characterize embryo morphology and a fluidic based separation device to isolate embryos associated with a pre-defined, targeted morphology. Two different backpropagation neural networks (BNN) were used to classify embryos based on information extracted from the vision system. One network utilized geometric features such as embryo area, length, and symmetry as inputs. The alternative network utilized polar coordinates of an embryo's perimeter with respect to its centroid as inputs. The performances of both techniques were compared with each other and with an embryo classification method based on linear discriminant analysis (LDA). Similar results were obtained with all three techniques. Classification efficiency was improved by reducing the dimension of the feature vector through a forward stepwise analysis by LDA. In order to enhance the purity of the sample selected as harvestable, a reject to classify option was introduced in the model and analyzed. The best classifier performances (76% overall correct classifications, 75% harvestable objects properly classified, homogeneity improvement ratio 1.5) were obtained using 8 features in a BNN.
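The abstract does not say how the reject option was implemented. One common form, sketched here purely as an assumption, withholds a decision whenever the largest class score falls below a confidence threshold; the threshold value, the class names and the function `classify_with_reject` are all hypothetical.

```python
def classify_with_reject(scores, threshold=0.8):
    """Return the top-scoring class, or "reject" if its score is too low.

    scores     mapping from class name to a score in [0, 1]
    threshold  assumed minimum score required to commit to a class
    """
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else "reject"

confident = classify_with_reject({"harvestable": 0.9, "non-harvestable": 0.1})
uncertain = classify_with_reject({"harvestable": 0.55, "non-harvestable": 0.45})
```

Raising the threshold trades throughput for purity: fewer objects are accepted, but those that are accepted are more likely to be correctly labeled, which matches the stated aim of enhancing the purity of the harvested sample.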
Bisarro Dos Reis, Mariana; Barros-Filho, Mateus Camargo; Marchi, Fábio Albuquerque; Beltrami, Caroline Moraes; Kuasne, Hellen; Pinto, Clóvis Antônio Lopes; Ambatipudi, Srikant; Herceg, Zdenko; Kowalski, Luiz Paulo; Rogatto, Silvia Regina
2017-11-01
Even though the majority of well-differentiated thyroid carcinoma (WDTC) is indolent, a number of cases display an aggressive behavior. Cumulative evidence suggests that the deregulation of DNA methylation has the potential to point out molecular markers associated with worse prognosis. This study aimed to identify a prognostic epigenetic signature in thyroid cancer. Genome-wide DNA methylation assays (450k platform, Illumina) were performed in a cohort of 50 nonneoplastic thyroid tissues (NTs), 17 benign thyroid lesions (BTLs), and 74 thyroid carcinomas (60 papillary, 8 follicular, 2 Hürthle cell, 1 poorly differentiated, and 3 anaplastic). A prognostic classifier for WDTC was developed via diagonal linear discriminant analysis. The results were compared with The Cancer Genome Atlas (TCGA) database. A specific epigenetic profile was detected according to each histological subtype. BTLs and follicular carcinomas showed a greater number of methylated CpG in comparison with NTs, whereas hypomethylation was predominant in papillary and undifferentiated carcinomas. A prognostic classifier based on 21 DNA methylation probes was able to predict poor outcome in patients with WDTC (sensitivity 63%, specificity 92% for internal data; sensitivity 64%, specificity 88% for TCGA data). High-risk score based on the classifier was considered an independent factor of poor outcome (Cox regression, P < 0.001). The methylation profile of thyroid lesions exhibited a specific signature according to the histological subtype. A meaningful algorithm composed of 21 probes was capable of predicting the recurrence in WDTC. Copyright © 2017 Endocrine Society
Chinese Sentence Classification Based on Convolutional Neural Network
NASA Astrophysics Data System (ADS)
Gu, Chengwei; Wu, Ming; Zhang, Chuang
2017-10-01
Sentence classification is one of the significant issues in Natural Language Processing (NLP). Feature extraction is often regarded as the key point for natural language processing. Traditional machine learning approaches, such as the Naive Bayesian model, cannot take high-level features into consideration. A neural network for sentence classification can make use of contextual information to achieve better results in sentence classification tasks. In this paper, we focus on classifying Chinese sentences. Most importantly, we propose a novel Convolutional Neural Network (CNN) architecture for Chinese sentence classification. In particular, whereas most previous methods use a softmax classifier for prediction, we embed a linear support vector machine in place of softmax in the deep neural network model, minimizing a margin-based loss to get a better result. We also use tanh as the activation function instead of ReLU. The CNN model improves the results of Chinese sentence classification tasks. Experimental results on the Chinese news title database validate the effectiveness of our model.
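The abstract above does not state which margin-based loss replaces softmax. As a hedged illustration only, the sketch below shows one common multi-class hinge loss (the form that sums margin violations over all wrong classes); the function name `multiclass_hinge_loss` and the default margin of 1.0 are assumptions of this example, not details taken from the paper.

```python
def multiclass_hinge_loss(scores, target, margin=1.0):
    """Sum of margin violations over all non-target classes.

    `scores` holds one linear output per class for a single sentence;
    `target` is the index of the true class. A wrong class contributes
    only when its score comes within `margin` of the target's score.
    """
    return sum(
        max(0.0, margin + s - scores[target])
        for j, s in enumerate(scores)
        if j != target
    )
```

Unlike softmax cross-entropy, this loss is exactly zero once every wrong class is separated from the true class by at least the margin, so well-classified examples stop contributing gradient.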
NASA Astrophysics Data System (ADS)
Kotan, Muhammed; Öz, Cemil
2017-12-01
We propose an inspection system that uses estimated three-dimensional (3-D) surface characteristics to detect and classify faults, with the aim of improving quality control of frequently used industrial components. Shape from shading (SFS) is one of the basic and classic 3-D shape recovery problems in computer vision. In our application, we developed a system using the Frankot and Chellappa SFS method, which is based on the minimization of the selected basis function. First, the specialized image acquisition system captured the images of the component. To eliminate noise, a wavelet transform was applied to the captured images. Then, estimated gradients were used to obtain depth and surface profiles. Depth information was used to determine and classify the surface defects. A comparison with some linearization-based SFS algorithms is also discussed. The developed system was applied to real products, and the results indicated that SFS approaches are useful and that various types of defects can easily be detected in a short period of time.
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
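The feature extraction described in the abstract above (a piecewise linear representation built bottom-up, then slope and time interval per segment) can be sketched as follows. This is a simplified, univariate illustration under stated assumptions: the function names, the least-squares merge cost, and the `max_error` stopping threshold are choices made for this example, and the paper's multivariate handling is not reproduced.

```python
def line_fit(t, y):
    """Least-squares line through the points; returns (slope, squared error)."""
    n = len(t)
    mean_t = sum(t) / n
    mean_y = sum(y) / n
    s_tt = sum((a - mean_t) ** 2 for a in t)
    s_ty = sum((a - mean_t) * (b - mean_y) for a, b in zip(t, y))
    slope = s_ty / s_tt if s_tt else 0.0
    sse = sum((b - mean_y - slope * (a - mean_t)) ** 2 for a, b in zip(t, y))
    return slope, sse


def bottom_up_segments(t, y, max_error):
    """Merge adjacent segments while the cheapest merge stays within max_error."""
    # Finest segmentation: one segment per pair of neighbouring samples.
    # Segments are inclusive (start, end) index pairs sharing boundary points.
    segments = [(i, i + 1) for i in range(len(t) - 1)]
    while len(segments) > 1:
        costs = []
        for (start, _), (_, end) in zip(segments, segments[1:]):
            costs.append(line_fit(t[start:end + 1], y[start:end + 1])[1])
        k = min(range(len(costs)), key=costs.__getitem__)
        if costs[k] > max_error:
            break  # every remaining merge would exceed the error budget
        segments[k:k + 2] = [(segments[k][0], segments[k + 1][1])]
    return segments


def segment_features(t, y, segments):
    """Return (slope, time interval) for each segment."""
    return [(line_fit(t[s:e + 1], y[s:e + 1])[0], t[e] - t[s]) for s, e in segments]
```

For a signal such as steering angle, the resulting list of (slope, duration) pairs would then be passed to the classifier. Recomputing every merge cost on each pass keeps the sketch short at the price of quadratic running time.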
Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.
Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi
2013-01-01
The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces the fuzzy support vector machine, a learning algorithm based on a combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that the fuzzy support vector machine, applied in combination with filter or wrapper feature selection methods, develops a robust model with higher accuracy than conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule base inferred from the fuzzy support vector machine helps extract biological knowledge from microarray data. The fuzzy support vector machine, as a new classification model with high generalization power, robustness, and good interpretability, seems to be a promising tool for gene expression microarray classification.
NASA Astrophysics Data System (ADS)
Boniecki, P.; Nowakowski, K.; Slosarz, P.; Dach, J.; Pilarski, K.
2012-04-01
The purpose of the project was to identify the degree of organic matter decomposition by means of a neural model based on graphical information derived from image analysis. Empirical data (photographs of compost content at various stages of maturation) were used to generate an optimal neural classifier (Boniecki et al. 2009, Nowakowski et al. 2009). The best classification properties were found in an RBF (Radial Basis Function) artificial neural network, which demonstrates that the process is non-linear.
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
The purpose was to assess the possible utility of machine learning for classifying subjects with and without osteoarthritis using sodium magnetic resonance imaging data. Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
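The count of 12 predictors follows from 2 acquisitions × 2 per-region measurements (MEAN, STD) × 3 summaries (min, max, mean). The sketch below makes that arithmetic concrete. It is illustrative only: the function name, the dictionary keys, the input layout, and the use of the population standard deviation are assumptions, since the abstract does not specify them.

```python
from statistics import mean, pstdev


def subject_predictors(rois_by_acquisition):
    """Build min/max/mean summaries of per-ROI MEAN and STD for one subject.

    `rois_by_acquisition` maps an acquisition name to a list of regions of
    interest, each given as a list of sodium concentration values.
    """
    predictors = {}
    for acq, rois in rois_by_acquisition.items():
        per_roi = {
            "MEAN": [mean(roi) for roi in rois],
            "STD": [pstdev(roi) for roi in rois],  # population SD: an assumption
        }
        for name, values in per_roi.items():
            predictors[f"{acq}:Min[{name}]"] = min(values)
            predictors[f"{acq}:Max[{name}]"] = max(values)
            predictors[f"{acq}:Mean[{name}]"] = mean(values)
    return predictors
```

With two acquisitions in the input, the returned dictionary has exactly 12 entries, one per predictor named in the abstract's notation.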
McDonald, Linda S.; Panozzo, Joseph F.; Salisbury, Phillip A.; Ford, Rebecca
2016-01-01
Field peas (Pisum sativum L.) are generally traded based on seed appearance, which subjectively defines broad market-grades. In this study, we developed an objective Linear Discriminant Analysis (LDA) model to classify market grades of field peas based on seed colour, shape and size traits extracted from digital images. Seeds were imaged in a high-throughput system consisting of a camera and laser positioned over a conveyor belt. Six colour intensity digital images were captured (under 405, 470, 530, 590, 660 and 850nm light) for each seed, and surface height was measured at each pixel by laser. Colour, shape and size traits were compiled across all seed in each sample to determine the median trait values. Defective and non-defective seed samples were used to calibrate and validate the model. Colour components were sufficient to correctly classify all non-defective seed samples into correct market grades. Defective samples required a combination of colour, shape and size traits to achieve 87% and 77% accuracy in market grade classification of calibration and validation sample-sets respectively. Following these results, we used the same colour, shape and size traits to develop an LDA model which correctly classified over 97% of all validation samples as defective or non-defective. PMID:27176469
Zhang, Jianhua; Yin, Zhong; Wang, Rubin
2017-01-01
This paper developed a cognitive task-load (CTL) classification algorithm and allocation strategy to sustain the optimal operator CTL levels over time in safety-critical human-machine integrated systems. An adaptive human-machine system is designed based on a non-linear dynamic CTL classifier, which maps a set of electroencephalogram (EEG) and electrocardiogram (ECG) related features to a few CTL classes. The least-squares support vector machine (LSSVM) is used as dynamic pattern classifier. A series of electrophysiological and performance data acquisition experiments were performed on seven volunteer participants under a simulated process control task environment. The participant-specific dynamic LSSVM model is constructed to classify the instantaneous CTL into five classes at each time instant. The initial feature set, comprising 56 EEG and ECG related features, is reduced to a set of 12 salient features (including 11 EEG-related features) by using the locality preserving projection (LPP) technique. An overall correct classification rate of about 80% is achieved for the 5-class CTL classification problem. Then the predicted CTL is used to adaptively allocate the number of process control tasks between operator and computer-based controller. Simulation results showed that the overall performance of the human-machine system can be improved by using the adaptive automation strategy proposed.
Hwang, Wonjun; Wang, Haitao; Kim, Hyunwoo; Kee, Seok-Cheol; Kim, Junmo
2011-04-01
The authors present a robust face recognition system for large-scale data sets taken under uncontrolled illumination variations. The proposed face recognition system consists of a novel illumination-insensitive preprocessing method, a hybrid Fourier-based facial feature extraction, and a score fusion scheme. First, in the preprocessing stage, a face image is transformed into an illumination-insensitive image, called an "integral normalized gradient image," by normalizing and integrating the smoothed gradients of a facial image. Then, for feature extraction of complementary classifiers, multiple face models based upon hybrid Fourier features are applied. The hybrid Fourier features are extracted from different Fourier domains in different frequency bandwidths, and then each feature is individually classified by linear discriminant analysis. In addition, multiple face models are generated by plural normalized face images that have different eye distances. Finally, to combine scores from multiple complementary classifiers, a log likelihood ratio-based score fusion scheme is applied. The proposed system using the face recognition grand challenge (FRGC) experimental protocols is evaluated; FRGC is a large available data set. Experimental results on the FRGC version 2.0 data sets have shown that the proposed method shows an average of 81.49% verification rate on 2-D face images under various environmental variations such as illumination changes, expression changes, and time elapses.
Geomorphic Flood Area (GFA): a DEM-based tool for flood susceptibility mapping at large scales
NASA Astrophysics Data System (ADS)
Manfreda, S.; Samela, C.; Albano, R.; Sole, A.
2017-12-01
Flood hazard and risk mapping over large areas is a critical issue. Recently, many researchers have been trying to achieve global-scale mapping, encountering several difficulties, above all the lack of data and implementation costs. In data-scarce environments, a preliminary and cost-effective floodplain delineation can be performed using geomorphic methods (e.g., Manfreda et al., 2014). We carried out several years of research on this topic, proposing a morphologic descriptor named Geomorphic Flood Index (GFI) (Samela et al., 2017) and developing a Digital Elevation Model (DEM)-based procedure able to identify flood-susceptible areas. The procedure exhibited high accuracy in several test sites in Europe, the United States and Africa (Manfreda et al., 2015; Samela et al., 2016, 2017) and has recently been implemented in a QGIS plugin named Geomorphic Flood Area (GFA) tool. The tool automatically computes the GFI and turns it into a linear binary classifier capable of detecting flood-prone areas. To train this classifier, an inundation map derived using hydraulic models for a small portion of the basin is required (the minimum is 2% of the river basin's area). In this way, the GFA tool extends the classification of flood-prone areas across the entire basin. We are also defining a simplified procedure for estimating the river depth, which may be helpful in large-scale analyses for approximately evaluating the expected flood damage in the surrounding areas.
References
Manfreda, S., Nardi, F., Samela, C., Grimaldi, S., Taramasso, A. C., Roth, G., & Sole, A. (2014). Investigation on the use of geomorphic approaches for the delineation of flood prone areas. J. Hydrol., 517, 863-876.
Manfreda, S., Samela, C., Gioia, A., Consoli, G., Iacobellis, V., Giuzio, L., & Sole, A. (2016). Flood-prone areas assessment using linear binary classifiers based on flood maps obtained from 1D and 2D hydraulic models. Nat. Hazards, 79(2), 735-754.
Samela, C., Manfreda, S., Paola, F. D., Giugni, M., Sole, A., & Fiorentino, M. (2016). DEM-Based Approaches for the Delineation of Flood-Prone Areas in an Ungauged Basin in Africa. J. Hydrol. Eng., 06015010.
Samela, C., Troy, T. J., & Manfreda, S. (2017a). Geomorphic classifiers for flood-prone areas delineation for data-scarce environments. Adv. Water Resour., 102, 13-28.
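A linear binary classifier on a single index, as described in the GFA abstract above, amounts to choosing one threshold from a small training area and applying it everywhere. The sketch below is a hypothetical illustration and not the plugin's code: the function names are invented, and the calibration objective (minimising false positive rate plus false negative rate against the reference inundation map) is an assumption of this example.

```python
def calibrate_threshold(index_values, flooded):
    """Choose the threshold that minimises FPR + FNR on the training cells.

    `index_values` holds the index (e.g. GFI) per cell and `flooded` the
    reference label per cell from the hydraulic-model inundation map.
    Cells with index >= threshold are classified as flood-prone.
    """
    positives = sum(1 for f in flooded if f)
    negatives = len(flooded) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("training area must contain both classes")
    best_tau, best_cost = None, float("inf")
    for tau in sorted(set(index_values)):
        fp = sum(1 for v, f in zip(index_values, flooded) if v >= tau and not f)
        fn = sum(1 for v, f in zip(index_values, flooded) if v < tau and f)
        cost = fp / negatives + fn / positives
        if cost < best_cost:
            best_tau, best_cost = tau, cost
    return best_tau


def classify(index_values, tau):
    """Apply the calibrated threshold to cells outside the training area."""
    return [v >= tau for v in index_values]
```

Calibrating on a small mapped portion and then calling `classify` on the rest mirrors the workflow in the abstract, where a hydraulic map of part of the basin is extended to the whole basin.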
Akhtar, Naveed; Mian, Ajmal
2017-10-03
We present a principled approach to learn a discriminative dictionary along with a linear classifier for hyperspectral classification. Our approach places Gaussian Process priors over the dictionary to account for the relative smoothness of the natural spectra, whereas the classifier parameters are sampled from multivariate Gaussians. We employ two Beta-Bernoulli processes to jointly infer the dictionary and the classifier. These processes are coupled under the same sets of Bernoulli distributions. In our approach, these distributions signify the frequency of the dictionary atom usage in representing class-specific training spectra, which also makes the dictionary discriminative. Due to the coupling between the dictionary and the classifier, the popularity of the atoms for representing different classes gets encoded into the classifier. This helps in predicting the class labels of test spectra that are first represented over the dictionary by solving a simultaneous sparse optimization problem. The labels of the spectra are predicted by feeding the resulting representations to the classifier. Our approach exploits the nonparametric Bayesian framework to automatically infer the dictionary size--the key parameter in discriminative dictionary learning. Moreover, it also has the desirable property of adaptively learning the association between the dictionary atoms and the class labels by itself. We use Gibbs sampling to infer the posterior probability distributions over the dictionary and the classifier under the proposed model, for which we derive analytical expressions. To establish the effectiveness of our approach, we test it on benchmark hyperspectral images. The classification performance is compared with the state-of-the-art dictionary learning-based classification methods.
Three-dimensional passive sensing photon counting for object classification
NASA Astrophysics Data System (ADS)
Yeom, Seokwon; Javidi, Bahram; Watson, Edward
2007-04-01
In this keynote address, we address three-dimensional (3D) distortion-tolerant object recognition using photon-counting integral imaging (II). A photon-counting linear discriminant analysis (LDA) is discussed for classification of photon-limited images. We develop a compact distortion-tolerant recognition system based on the multiple-perspective imaging of II. Experimental and simulation results have shown that a low level of photons is sufficient to classify out-of-plane rotated objects.
Muthu Rama Krishnan, M; Shah, Pratik; Chakraborty, Chandan; Ray, Ajoy K
2012-04-01
The objective of this paper is to provide an improved technique that can assist oncopathologists in correct screening of oral precancerous conditions, especially oral submucous fibrosis (OSF), with significant accuracy on the basis of collagen fibres in the sub-epithelial connective tissue. The proposed scheme is composed of collagen fibre segmentation, textural feature extraction and selection, screening performance enhancement under Gaussian transformation, and finally classification. In this study, collagen fibres are segmented on R, G, B color channels using a back-propagation neural network from 60 normal and 59 OSF histological images, followed by histogram specification for reducing the stain intensity variation. Textural features of the collagen area are then extracted using fractal approaches, viz., differential box counting and Brownian motion curve. Feature selection is done using the Kullback-Leibler (KL) divergence criterion, and the screening performance is evaluated based on various statistical tests to confirm Gaussian nature. Here, the screening performance is enhanced under Gaussian transformation of the non-Gaussian features using hybrid distribution. Moreover, the routine screening is designed based on two statistical classifiers, viz., Bayesian classification and support vector machines (SVM), to classify normal and OSF. It is observed that SVM with linear kernel function provides better classification accuracy (91.64%) as compared to the Bayesian classifier. The addition of fractal features of collagen under Gaussian transformation improves the Bayesian classifier's performance from 80.69% to 90.75%. Results are here studied and discussed.
A random forest model based classification scheme for neonatal amplitude-integrated EEG.
Chen, Weiting; Wang, Yu; Cao, Guitao; Chen, Guoqiang; Gu, Qiufang
2014-01-01
Modern medical advances have greatly increased the survival rate of infants, while they remain in the higher risk group for neurological problems later in life. For the infants with encephalopathy or seizures, identification of the extent of brain injury is clinically challenging. Continuous amplitude-integrated electroencephalography (aEEG) monitoring offers a possibility to directly monitor the brain functional state of the newborns over hours, and has seen an increasing application in neonatal intensive care units (NICUs). This paper presents a novel combined feature set of aEEG and applies random forest (RF) method to classify aEEG tracings. To that end, a series of experiments were conducted on 282 aEEG tracing cases (209 normal and 73 abnormal ones). Basic features, statistic features and segmentation features were extracted from both the tracing as a whole and the segmented recordings, and then form a combined feature set. All the features were sent to a classifier afterwards. The significance of feature, the data segmentation, the optimization of RF parameters, and the problem of imbalanced datasets were examined through experiments. Experiments were also done to evaluate the performance of RF on aEEG signal classifying, compared with several other widely used classifiers including SVM-Linear, SVM-RBF, ANN, Decision Tree (DT), Logistic Regression(LR), ML, and LDA. The combined feature set can better characterize aEEG signals, compared with basic features, statistic features and segmentation features respectively. With the combined feature set, the proposed RF-based aEEG classification system achieved a correct rate of 92.52% and a high F1-score of 95.26%. Among all of the seven classifiers examined in our work, the RF method got the highest correct rate, sensitivity, specificity, and F1-score, which means that RF outperforms all of the other classifiers considered here. 
The results show that the proposed RF-based aEEG classification system with the combined feature set is efficient and helpful to better detect the brain disorders in newborns.
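The measures reported in the aEEG abstract above (correct rate, sensitivity, specificity, F1-score) all derive from the four cells of a binary confusion matrix. The following is a small sketch of those standard definitions; the function name is hypothetical, and which class counts as "positive" is left to the caller because the abstract does not say.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification measures from confusion-matrix counts.

    Assumes at least one actual positive, one actual negative and one
    predicted positive; otherwise a division by zero occurs.
    """
    sensitivity = tp / (tp + fn)   # recall on the positive class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    correct_rate = (tp + tn) / (tp + fp + fn + tn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "correct_rate": correct_rate,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }
```

On an imbalanced data set such as the one described (209 normal versus 73 abnormal tracings), the correct rate alone can look favourable for a classifier that favours the majority class, which is why sensitivity, specificity and F1 are reported alongside it.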
Arcentales, Andres; Rivera, Patricio; Caminal, Pere; Voss, Andreas; Bayes-Genis, Antonio; Giraldo, Beatriz F
2016-08-01
Changes in the left ventricle function produce alternans in the hemodynamic and electric behavior of the cardiovascular system. A total of 49 cardiomyopathy patients have been studied based on the blood pressure signal (BP), and were classified according to the left ventricular ejection fraction (LVEF) into low risk (LR: LVEF>35%, 17 patients) and high risk (HR: LVEF≤35%, 32 patients) groups. We propose to characterize these patients using linear and nonlinear methods, based on the spectral estimation and the recurrence plot (RP), respectively. From the BP signal, we extracted each systolic time interval (STI), upward systolic slope (BPsl), and the difference between systolic and diastolic BP, defined as pulse pressure (PP). Afterwards, the best subset of parameters was obtained through the sequential feature selection (SFS) method. According to the results, the best classification was obtained using a combination of linear and nonlinear features from STI and PP parameters. For STI, the best combination was obtained considering the frequency peak and the diagonal structures of RP, with an area under the curve (AUC) of 79%. The same results were obtained when comparing PP values. Consequently, the use of combined linear and nonlinear parameters could improve the risk stratification of cardiomyopathy patients.
Segmentation of thalamus from MR images via task-driven dictionary learning
NASA Astrophysics Data System (ADS)
Liu, Luoluo; Glaister, Jeffrey; Sun, Xiaoxia; Carass, Aaron; Tran, Trac D.; Prince, Jerry L.
2016-03-01
Automatic thalamus segmentation is useful to track changes in thalamic volume over time. In this work, we introduce a task-driven dictionary learning framework to find the optimal dictionary given a set of eleven features obtained from T1-weighted MRI and diffusion tensor imaging. In this dictionary learning framework, a linear classifier is designed concurrently to classify voxels as belonging to the thalamus or non-thalamus class. Morphological post-processing is applied to produce the final thalamus segmentation. Due to the uneven size of the training data samples for the non-thalamus and thalamus classes, a non-uniform sampling scheme is proposed to train the classifier to better discriminate between the two classes around the boundary of the thalamus. Experiments are conducted on data collected from 22 subjects with manually delineated ground truth. The experimental results are promising in terms of improvements in the Dice coefficient of the thalamus segmentation over state-of-the-art atlas-based thalamus segmentation algorithms.
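The Dice coefficient used to evaluate the segmentation above is twice the overlap between the predicted and manually delineated voxel sets, divided by the sum of their sizes. The sketch below shows that definition; the function name and the representation of a segmentation as a collection of voxel index tuples are assumptions of this example.

```python
def dice_coefficient(predicted_voxels, truth_voxels):
    """Dice overlap between two segmentations given as voxel index tuples."""
    predicted = set(predicted_voxels)
    truth = set(truth_voxels)
    if not predicted and not truth:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2 * len(predicted & truth) / (len(predicted) + len(truth))
```

A value of 1.0 means the automatic and manual thalamus masks coincide exactly, while 0.0 means they share no voxel.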
Donato, Gianluca; Bartlett, Marian Stewart; Hager, Joseph C.; Ekman, Paul; Sejnowski, Terrence J.
2010-01-01
The Facial Action Coding System (FACS) [23] is an objective method for quantifying facial movement in terms of component actions. This system is widely used in behavioral investigations of emotion, cognitive processes, and social interaction. The coding is presently performed by highly trained human experts. This paper explores and compares techniques for automatically recognizing facial actions in sequences of images. These techniques include analysis of facial motion through estimation of optical flow; holistic spatial analysis, such as principal component analysis, independent component analysis, local feature analysis, and linear discriminant analysis; and methods based on the outputs of local filters, such as Gabor wavelet representations and local principal components. Performance of these systems is compared to naive and expert human subjects. Best performances were obtained using the Gabor wavelet representation and the independent component representation, both of which achieved 96 percent accuracy for classifying 12 facial actions of the upper and lower face. The results provide converging evidence for the importance of using local filters, high spatial frequencies, and statistical independence for classifying facial actions. PMID:21188284
Segmentation of Thalamus from MR images via Task-Driven Dictionary Learning.
Liu, Luoluo; Glaister, Jeffrey; Sun, Xiaoxia; Carass, Aaron; Tran, Trac D; Prince, Jerry L
2016-02-27
Automatic thalamus segmentation is useful to track changes in thalamic volume over time. In this work, we introduce a task-driven dictionary learning framework to find the optimal dictionary given a set of eleven features obtained from T1-weighted MRI and diffusion tensor imaging. In this dictionary learning framework, a linear classifier is designed concurrently to classify voxels as belonging to the thalamus or non-thalamus class. Morphological post-processing is applied to produce the final thalamus segmentation. Due to the uneven size of the training data samples for the non-thalamus and thalamus classes, a non-uniform sampling scheme is proposed to train the classifier to better discriminate between the two classes around the boundary of the thalamus. Experiments are conducted on data collected from 22 subjects with manually delineated ground truth. The experimental results are promising in terms of improvements in the Dice coefficient of the thalamus segmentation over state-of-the-art atlas-based thalamus segmentation algorithms.
Aspect-object alignment with Integer Linear Programming in opinion mining.
Zhao, Yanyan; Qin, Bing; Liu, Ting; Yang, Wei
2015-01-01
Target extraction is an important task in opinion mining. In this task, a complete target consists of an aspect and its corresponding object. However, previous work has always simply regarded the aspect as the target itself and has ignored the important "object" element. Thus, these studies have addressed incomplete targets, which are of limited use for practical applications. This paper proposes a novel and important sentiment analysis task, termed aspect-object alignment, to solve the "object neglect" problem. The objective of this task is to obtain the correct corresponding object for each aspect. We design a two-step framework for this task. We first provide an aspect-object alignment classifier that incorporates three sets of features, namely, the basic, relational, and special target features. However, the objects that are assigned to aspects in a sentence often contradict each other and possess many complicated features that are difficult to incorporate into a classifier. To resolve these conflicts, we impose two types of constraints in the second step: intra-sentence constraints and inter-sentence constraints. These constraints are encoded as linear formulations, and Integer Linear Programming (ILP) is used as an inference procedure to obtain a final global decision that is consistent with the constraints. Experiments on a corpus in the camera domain demonstrate that the three feature sets used in the aspect-object alignment classifier are effective in improving its performance. Moreover, the classifier with ILP inference performs better than the classifier without it, thereby illustrating that the two types of constraints that we impose are beneficial.
de Moraes, Fábio R.; Neshich, Izabella A. P.; Mazoni, Ivan; Yano, Inácio H.; Pereira, José G. C.; Salim, José A.; Jardine, José G.; Neshich, Goran
2014-01-01
Protein-protein interactions are involved in nearly all regulatory processes in the cell and are considered one of the most important issues in molecular biology and pharmaceutical sciences but are still not fully understood. Structural and computational biology contributed greatly to the elucidation of the mechanism of protein interactions. In this paper, we present a collection of the physicochemical and structural characteristics that distinguish interface-forming residues (IFR) from free surface residues (FSR). We formulated a linear discriminative analysis (LDA) classifier to assess whether chosen descriptors from the BlueStar STING database (http://www.cbi.cnptia.embrapa.br/SMS/) are suitable for such a task. Receiver operating characteristic (ROC) analysis indicates that the particular physicochemical and structural descriptors used for building the linear classifier perform much better than a random classifier and in fact, successfully outperform some of the previously published procedures, whose performance indicators were recently compared by other research groups. The results presented here show that the selected set of descriptors can be utilized to predict IFRs, even when homologue proteins are missing (particularly important for orphan proteins where no homologue is available for comparative analysis/indication) or, when certain conformational changes accompany interface formation. The development of amino acid type specific classifiers is shown to increase IFR classification performance. Also, we found that the addition of an amino acid conservation attribute did not improve the classification prediction. This result indicates that the increase in predictive power associated with amino acid conservation is exhausted by adequate use of an extensive list of independent physicochemical and structural parameters that, by themselves, fully describe the nano-environment at protein-protein interfaces. 
The IFR classifier developed in this study is now integrated into the BlueStar STING suite of programs. Consequently, the prediction of protein-protein interfaces for all proteins available in the PDB is possible through the STING_interfaces module, accessible at the following website: (http://www.cbi.cnptia.embrapa.br/SMS/predictions/index.html). PMID:24489849
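The ROC analysis mentioned in this abstract can be illustrated with a minimal sketch. It assumes the linear classifier yields one discriminant score per residue; the scores and labels below are hypothetical, not STING data, and the area under the curve is computed with the rank-sum identity rather than with whatever procedure the authors used.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    The AUC equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one (ties count as half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores: label 1 = interface-forming residue, 0 = free surface residue.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

A random classifier gives an AUC near 0.5, which is the baseline the abstract compares against.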
Miguel-Hurtado, Oscar; Guest, Richard; Stevenage, Sarah V; Neil, Greg J; Black, Sue
2016-01-01
Understanding the relationship between physiological measurements from human subjects and their demographic data is important within both the biometric and forensic domains. In this paper we explore the relationship between measurements of the human hand and a range of demographic features. We assess the ability of linear regression and machine learning classifiers to predict demographics from hand features, thereby providing evidence on both the strength of relationship and the key features underpinning this relationship. Our results show that we are able to predict sex, height, weight and foot size accurately within various data-range bin sizes, with machine learning classification algorithms out-performing linear regression in most situations. In addition, we identify the features used to provide these relationships applicable across multiple applications. PMID:27806075
Huikang Wang; Luzheng Bi; Teng Teng
2017-07-01
This paper proposes a novel electroencephalography (EEG)-based system for detecting a driver's emergency braking intention in brain-controlled driving that remains usable when one electrode falls off. First, whether an electrode has fallen off is determined from the EEG potentials. Then, the missing signals are estimated from the signals collected on the other channels using multivariate linear regression. Finally, a linear decoder is applied to classify driver intentions. Experimental results show that the falling-off discrimination accuracy is 99.63% on average, and that the correlation coefficient and root mean squared error (RMSE) between the estimated and experimental data are 0.90 and 11.43 μV, respectively, on average. When one electrode falls off, the accuracy of the proposed intention prediction method is significantly higher than that of the original method (95.12% vs. 79.11%) and is close to that of the original system under normal conditions with no electrode falling off (95.95%).
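The estimation step described above (reconstructing a missing channel from the remaining ones by multivariate linear regression, then scoring it with correlation and RMSE) can be sketched as follows. This is an illustrative sketch only: the sample values, the two-channel setup and all function names are assumptions, and the paper's actual regression setup is not given in the abstract.

```python
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_linear(X, y):
    """Least-squares weights (intercept first) obtained from the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def rmse(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)) / len(a))


def pearson(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


# Hypothetical samples (microvolts) from two intact channels; the "missing"
# channel is built as an exact linear mix so the fit can be checked by eye.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [2.0 * a - 1.0 * b + 0.5 for a, b in X]
w = fit_linear(X, y)
y_hat = predict(w, X)
```

On real EEG the relation is noisy, so the correlation and RMSE between `y_hat` and the recorded channel would be evaluated on held-out data.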
Linear Legendrian curves in T^3
NASA Astrophysics Data System (ADS)
Ghiggini, Paolo
2006-05-01
Using convex surfaces and Kanda's classification theorem, we classify Legendrian isotopy classes of Legendrian linear curves in all tight contact structures on T^3. Some of the knot types considered in this paper provide new examples of non-transversally simple knot types.
Multivariate detrending of fMRI signal drifts for real-time multiclass pattern classification.
Lee, Dongha; Jang, Changwon; Park, Hae-Jeong
2015-03-01
Signal drift in functional magnetic resonance imaging (fMRI) is an unavoidable artifact that limits classification performance in multi-voxel pattern analysis of fMRI. As conventional methods to reduce signal drift, global demeaning or proportional scaling disregards regional variations of drift, whereas voxel-wise univariate detrending is too sensitive to noisy fluctuations. To overcome these drawbacks, we propose a multivariate real-time detrending method for multiclass classification that involves spatial demeaning at each scan and the recursive detrending of drifts in the classifier outputs driven by a multiclass linear support vector machine. Experiments using binary and multiclass data showed that the linear trend estimation of the classifier output drift for each class (a weighted sum of drifts in the class-specific voxels) was more robust against voxel-wise artifacts that lead to inconsistent spatial patterns and the effect of online processing than voxel-wise detrending. The classification performance of the proposed method was significantly better, especially for multiclass data, than that of voxel-wise linear detrending, global demeaning, and classifier output detrending without demeaning. We concluded that the multivariate approach using classifier output detrending of fMRI signals with spatial demeaning preserves spatial patterns, is less sensitive than conventional methods to sample size, and increases classification performance, which is a useful feature for real-time fMRI classification.
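The two operations named in this abstract, spatial demeaning of each scan and linear detrending of a classifier-output series, can be sketched as below. Note the simplification: the paper describes a recursive, real-time estimate, whereas this sketch fits the trend once over the whole series. The voxel values and classifier outputs are hypothetical.

```python
def spatial_demean(scan):
    """Subtract the across-voxel mean from one scan (one list of voxel values)."""
    m = sum(scan) / len(scan)
    return [v - m for v in scan]


def linear_detrend(series):
    """Remove the least-squares straight line from a time series (batch version)."""
    n = len(series)
    t_mean = (n - 1) / 2.0
    y_mean = sum(series) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(series)]


# Hypothetical scan with four voxels.
scan = [101.0, 99.0, 103.0, 97.0]
demeaned = spatial_demean(scan)

# Hypothetical classifier outputs: an alternating signal riding on a linear drift.
outputs = [0.5 * t + (1.0 if t % 2 else -1.0) for t in range(8)]
detrended = linear_detrend(outputs)
```

Detrending the single classifier output instead of every voxel is what makes the approach multivariate: the drift is estimated on a weighted sum of voxels rather than voxel by voxel.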
Burgansky-Eliash, Zvia; Wollstein, Gadi; Chu, Tianjiao; Ramsey, Joseph D.; Glymour, Clark; Noecker, Robert J.; Ishikawa, Hiroshi; Schuman, Joel S.
2007-01-01
Purpose: Machine-learning classifiers are trained computerized systems with the ability to detect the relationship between multiple input parameters and a diagnosis. The present study investigated whether the use of machine-learning classifiers improves optical coherence tomography (OCT) glaucoma detection. Methods: Forty-seven patients with glaucoma (47 eyes) and 42 healthy subjects (42 eyes) were included in this cross-sectional study. Of the glaucoma patients, 27 had early disease (visual field mean deviation [MD] ≥ −6 dB) and 20 had advanced glaucoma (MD < −6 dB). Machine-learning classifiers were trained to discriminate between glaucomatous and healthy eyes using parameters derived from OCT output. The classifiers were trained with all 38 parameters as well as with only 8 parameters that correlated best with the visual field MD. Five classifiers were tested: linear discriminant analysis, support vector machine, recursive partitioning and regression tree, generalized linear model, and generalized additive model. For the last two classifiers, a backward feature selection was used to find the minimal number of parameters that resulted in the best and most simple prediction. The cross-validated receiver operating characteristic (ROC) curve and accuracies were calculated. Results: The largest area under the ROC curve (AROC) for glaucoma detection was achieved with the support vector machine using eight parameters (0.981). The sensitivity at 80% and 95% specificity was 97.9% and 92.5%, respectively. This classifier also performed best when judged by cross-validated accuracy (0.966). The best classification between early glaucoma and advanced glaucoma was obtained with the generalized additive model using only three parameters (AROC = 0.854). Conclusions: Automated machine classifiers of OCT data might be useful for enhancing the utility of this technology for detecting glaucomatous abnormality. PMID:16249492
A review on classification methods for solving fully fuzzy linear systems
NASA Astrophysics Data System (ADS)
Daud, Wan Suhana Wan; Ahmad, Nazihah; Aziz, Khairu Azlan Abd
2015-12-01
A Fully Fuzzy Linear System (FFLS) arises when fuzzy numbers appear on both sides of a linear system. Such systems matter because many linear systems in mathematics, engineering and finance involve uncertain parameters. Many researchers and practitioners have used the FFLS to model their problems and have applied various methods to solve it. In this paper, we present the outcome of a comprehensive review of the methods used for solving the FFLS. We classify our findings by the type of parameters used in the FFLS, either restricted or unrestricted. We also discuss some of the methods through numerical examples and identify the differences between them. Finally, we summarize all findings in a table. We hope this study will encourage researchers to appreciate these methods and will make it easier for them to choose the right method or to propose a new one for solving the FFLS.
A boosted optimal linear learner for retinal vessel segmentation
NASA Astrophysics Data System (ADS)
Poletti, E.; Grisan, E.
2014-03-01
Ocular fundus images provide important information about retinal degeneration, which may be related to acute pathologies or to early signs of systemic diseases. An automatic and quantitative assessment of vessel morphological features, such as diameters and tortuosity, can improve clinical diagnosis and evaluation of retinopathy. In contrast to available methods, we propose a data-driven approach in which the system learns a set of optimal discriminative convolution kernels (linear learner). The set is progressively built based on an AdaBoost sample weighting scheme, providing seamless integration between linear learner estimation and classification. In order to capture the vessel appearance changes at different scales, the kernels are estimated on a pyramidal decomposition of the training samples. The set is employed as a rotating bank of matched filters, whose response is used by the boosted linear classifier to classify each image pixel into one of the two classes of interest (vessel/background). We tested the approach on fundus images available from the DRIVE dataset. The segmentation achieves an accuracy of 0.94.
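The sample weighting scheme that drives the construction of the kernel set can be illustrated with one round of the standard discrete AdaBoost update. This is a generic sketch, not the authors' implementation: the labels and the weak learner's predictions below are hypothetical, and how the paper couples the weights to kernel estimation is not described in the abstract.

```python
import math


def adaboost_round(weights, y_true, y_pred):
    """One round of discrete AdaBoost with labels in {-1, +1}.

    Returns the learner's vote weight (alpha) and the renormalised sample
    weights, which grow for misclassified samples and shrink for correct ones.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    updated = [w * math.exp(-alpha * t * p) for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Hypothetical pixel labels (+1 vessel, -1 background) and weak-learner output.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, -1, -1, -1, 1]
weights = [0.2] * 5
alpha, new_weights = adaboost_round(weights, y_true, y_pred)
```

After the update the misclassified samples carry half of the total weight, so the next learner is pushed toward the pixels the current one gets wrong.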
Multi-channel linear descriptors for event-related EEG collected in brain computer interface.
Pei, Xiao-mei; Zheng, Chong-xun; Xu, Jin; Bin, Guang-yu; Wang, Hong-wu
2006-03-01
Three multi-channel linear descriptors, i.e. spatial complexity (omega), field power (sigma) and frequency of field changes (phi), were used to investigate event-related EEG data within 8-30 Hz during imagination of left or right hand movement. Studies on the event-related EEG data indicate that a two-channel version of omega, sigma and phi could reflect the antagonistic ERD/ERS patterns over contralateral and ipsilateral areas and also characterize different phases of the changing brain states in the event-related paradigm. Based on the selected two-channel linear descriptors, the left and right hand motor imagery tasks were classified with satisfactory results, which supports the validity of the three linear descriptors omega, sigma and phi for characterizing event-related EEG. The preliminary results show that omega and sigma, together with phi, have good separability for left and right hand motor imagery tasks and could be considered for classifying two classes of EEG patterns in brain computer interface applications.
Wang, Hsin-Wei; Lin, Ya-Chi; Pai, Tun-Wen; Chang, Hao-Teng
2011-01-01
Epitopes are antigenic determinants that are useful because they induce B-cell antibody production and stimulate T-cell activation. Bioinformatics can enable rapid, efficient prediction of potential epitopes. Here, we designed a novel B-cell linear epitope prediction system called LEPS, Linear Epitope Prediction by Propensities and Support Vector Machine, that combined physico-chemical propensity identification and support vector machine (SVM) classification. We tested the LEPS on four datasets: AntiJen, HIV, a newly generated PC, and AHP, a combination of these three datasets. Peptides with globally or locally high physicochemical propensities were first identified as primitive linear epitope (LE) candidates. Then, candidates were classified with the SVM based on the unique features of amino acid segments. This reduced the number of predicted epitopes and enhanced the positive prediction value (PPV). Compared to four other well-known LE prediction systems, the LEPS achieved the highest accuracy (72.52%), specificity (84.22%), PPV (32.07%), and Matthews' correlation coefficient (10.36%).
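The four reported measures (accuracy, specificity, PPV and Matthews' correlation coefficient) all derive from a binary confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and are not taken from the LEPS evaluation.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)  # positive predictive value (precision)
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"accuracy": accuracy, "specificity": specificity, "ppv": ppv, "mcc": mcc}


# Hypothetical counts with far more non-epitope than epitope residues.
m = binary_metrics(tp=30, fp=60, tn=320, fn=40)
```

With strongly imbalanced classes, high specificity can coexist with a modest PPV, because even a small false-positive rate yields many false positives in absolute terms.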
Heart rate variability based on risk stratification for type 2 diabetes mellitus.
Silva-E-Oliveira, Julia; Amélio, Pâmela Marina; Abranches, Isabela Lopes Laguardia; Damasceno, Dênis Derly; Furtado, Fabianne
2017-01-01
This study evaluated heart rate variability among adults with different risk levels for type 2 diabetes mellitus. The risk for type 2 diabetes mellitus was assessed in 130 participants (89 females) using the Finnish Diabetes Risk Score questionnaire, and participants were classified as low risk (n=26), slightly elevated risk (n=41), moderate risk (n=27) or high risk (n=32). To measure heart rate variability, a Polar S810i® heart-rate monitor was used to obtain RR series for each individual at rest for 5 minutes, followed by analysis of linear and nonlinear indexes. The groups at higher risk of type 2 diabetes mellitus had significantly lower linear and nonlinear heart rate variability indexes. Individuals at high risk for type 2 diabetes mellitus have lower heart rate variability.
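The abstract does not name the linear indexes it analysed. As an illustration only, the sketch below computes two widely used time-domain indexes, SDNN and RMSSD, from an RR interval series; treating these as the study's indexes is an assumption, and the RR values are hypothetical.

```python
import math


def sdnn(rr_ms):
    """Sample standard deviation of the RR intervals, in milliseconds."""
    mean = sum(rr_ms) / len(rr_ms)
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (len(rr_ms) - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR interval differences, in milliseconds."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR series (milliseconds) from a short resting recording.
rr = [800.0, 810.0, 790.0, 805.0, 795.0]
overall_variability = sdnn(rr)
beat_to_beat_variability = rmssd(rr)
```

Lower values of either index correspond to reduced heart rate variability.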
A novel approach for fire recognition using hybrid features and manifold learning-based classifier
NASA Astrophysics Data System (ADS)
Zhu, Rong; Hu, Xueying; Tang, Jiajun; Hu, Sheng
2018-03-01
Although image/video based fire recognition has received growing attention, efficient and robust fire detection strategies have rarely been explored. In this paper, we propose a novel approach to automatically identify the flame or smoke regions in an image. It is composed of three stages. (1) Block processing divides an image into several non-overlapping image blocks, and each block is identified as a suspicious fire region or not by using two color models and a color histogram-based similarity matching method in the HSV color space. (2) Because flame and smoke regions have distinctive visual characteristics, two kinds of image features are extracted for fire recognition: local features are obtained with the Scale Invariant Feature Transform (SIFT) descriptor and the Bags of Keypoints (BOK) technique, and texture features are extracted with the Gray Level Co-occurrence Matrices (GLCM) and the Wavelet-based Analysis (WA) methods. (3) A manifold learning-based classifier is constructed on two image manifolds and designed via an improved Globular Neighborhood Locally Linear Embedding (GNLLE) algorithm; the extracted hybrid features are used as input feature vectors to train the classifier, which then decides between fire images and non-fire images. Experiments and comparative analyses with four approaches are conducted on the collected image sets. The results show that the proposed approach is superior to the others in detecting fire, achieving a high recognition accuracy and a low error rate.
Detection of chewing from piezoelectric film sensor signals using ensemble classifiers.
Farooq, Muhammad; Sazonov, Edward
2016-08-01
The selection and use of pattern recognition algorithms is application dependent. In this work, we explored the use of several ensembles of weak classifiers to classify signals captured from a wearable sensor system to detect food intake based on chewing. Three sensor signals (piezoelectric sensor, accelerometer, and hand-to-mouth gesture) were collected from 12 subjects in free-living conditions for 24 hours. Sensor signals were divided into 10-second epochs, and for each epoch a combination of time and frequency domain features was computed. We present a comparison of three different ensemble techniques: boosting (AdaBoost), bootstrap aggregation (bagging) and stacking, each trained with three different weak classifiers (Decision Trees, Linear Discriminant Analysis (LDA) and Logistic Regression). The type of feature normalization used can also impact the classification results. For each ensemble method, three feature normalization options (no normalization, z-score normalization, and min-max normalization) were tested. A 12-fold cross-validation scheme was used to evaluate each model in terms of precision, recall, and accuracy. The best results achieved here show an improvement of about 4% over our previous algorithms.
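The two non-trivial normalization options compared in this abstract can be sketched for a single feature column as follows. The feature values are hypothetical, and the choice of the population standard deviation in the z-score is an assumption, since the abstract does not specify it.

```python
import math


def zscore(column):
    """Centre a feature to mean 0 and scale it to unit (population) standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    if std == 0.0:
        return [0.0 for _ in column]  # constant feature carries no information
    return [(x - mean) / std for x in column]


def minmax(column):
    """Rescale a feature linearly to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical values of one epoch-level feature across six epochs.
feature = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
z = zscore(feature)
mm = minmax(feature)
```

In a cross-validation setting the scaling parameters (mean, standard deviation, minimum, maximum) would normally be estimated on the training folds only and then applied to the held-out fold.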
Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data
Zhao, Xin; Cheung, Leo Wang-Kit
2007-01-01
Background Designing appropriate machine learning methods for identifying genes that have a significant discriminating power for disease outcomes has become more and more important for our understanding of diseases at genomic level. Although many machine learning methods have been developed and applied to the area of microarray gene expression data analysis, the majority of them are based on linear models, which however are not necessarily appropriate for the underlying connection between the target disease and its associated explanatory genes. Linear model based methods usually also bring in false positive significant features more easily. Furthermore, linear model based algorithms often involve calculating the inverse of a matrix that is possibly singular when the number of potentially important genes is relatively large. This leads to problems of numerical instability. To overcome these limitations, a few non-linear methods have recently been introduced to the area. Many of the existing non-linear methods have a couple of critical problems, the model selection problem and the model parameter tuning problem, that remain unsolved or even untouched. In general, a unified framework that allows model parameters of both linear and non-linear models to be easily tuned is always preferred in real-world applications. Kernel-induced learning methods form a class of approaches that show promising potentials to achieve this goal. Results A hierarchical statistical model named kernel-imbedded Gaussian process (KIGP) is developed under a unified Bayesian framework for binary disease classification problems using microarray gene expression data. In particular, based on a probit regression setting, an adaptive algorithm with a cascading structure is designed to find the appropriate kernel, to discover the potentially significant genes, and to make the optimal class prediction accordingly. A Gibbs sampler is built as the core of the algorithm to make Bayesian inferences. 
Simulation studies showed that, even without any knowledge of the underlying generative model, the KIGP performed very close to the theoretical Bayesian bound not only in the case with a linear Bayesian classifier but also in the case with a very non-linear Bayesian classifier. This sheds light on its broader usability to microarray data analysis problems, especially to those that linear methods work awkwardly. The KIGP was also applied to four published microarray datasets, and the results showed that the KIGP performed better than or at least as well as any of the referred state-of-the-art methods did in all of these cases. Conclusion Mathematically built on the kernel-induced feature space concept under a Bayesian framework, the KIGP method presented in this paper provides a unified machine learning approach to explore both the linear and the possibly non-linear underlying relationship between the target features of a given binary disease classification problem and the related explanatory gene expression data. More importantly, it incorporates the model parameter tuning into the framework. The model selection problem is addressed in the form of selecting a proper kernel type. The KIGP method also gives Bayesian probabilistic predictions for disease classification. These properties and features are beneficial to most real-world applications. The algorithm is naturally robust in numerical computation. The simulation studies and the published data studies demonstrated that the proposed KIGP performs satisfactorily and consistently. PMID:17328811
Event Recognition for Contactless Activity Monitoring Using Phase-Modulated Continuous Wave Radar.
Forouzanfar, Mohamad; Mabrouk, Mohamed; Rajan, Sreeraman; Bolic, Miodrag; Dajani, Hilmi R; Groza, Voicu Z
2017-02-01
The use of remote sensing technologies such as radar is gaining popularity as a technique for contactless detection of physiological signals and analysis of human motion. This paper presents a methodology for classifying different events in a collection of phase modulated continuous wave radar returns. The primary application of interest is to monitor inmates, where the presence of human vital signs amidst different interferences needs to be identified. A comprehensive set of features is derived through time and frequency domain analyses of the radar returns. The Bhattacharyya distance is used to preselect the features with the highest class separability as the candidate features for use in the classification process. Uncorrelated linear discriminant analysis is performed to decorrelate, denoise, and reduce the dimension of the candidate feature set. Linear and quadratic Bayesian classifiers are designed to distinguish breathing, different human motions, and nonhuman motions. The performance of these classifiers is evaluated on a pilot dataset of radar returns that contained different events including breathing, stopped breathing, simple human motions, and movement of fan and water. Our proposed pattern classification system achieved accuracies of up to 93% in stationary subject detection, 90% in stop-breathing detection, and 86% in interference detection. Our proposed radar pattern recognition system was able to accurately distinguish the predefined events amidst interferences. Besides inmate monitoring and suicide attempt detection, this work can be extended to other radar applications such as home-based monitoring of elderly people, apnea detection, and home occupancy detection.
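The feature preselection step can be illustrated with the closed-form Bhattacharyya distance between two univariate Gaussians, applied to each feature separately and used to rank features by class separability. Modelling each feature as Gaussian per class is an assumption of this sketch, and the feature names and values are hypothetical.

```python
import math


def mean_var(xs):
    """Sample mean and unbiased sample variance."""
    m = sum(xs) / len(xs)
    return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def bhattacharyya(class_a, class_b):
    """Bhattacharyya distance between two classes, each modelled as a 1-D Gaussian."""
    m1, v1 = mean_var(class_a)
    m2, v2 = mean_var(class_b)
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


# Hypothetical feature values for two event classes (e.g. breathing vs. fan motion).
features = {
    "spectral_peak": ([1.0, 1.2, 0.8, 1.1], [3.0, 3.2, 2.9, 3.1]),
    "mean_power": ([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]),
}
ranked = sorted(features, key=lambda name: bhattacharyya(*features[name]), reverse=True)
```

Features at the top of `ranked` separate the two classes best and would be passed on to the dimension-reduction stage.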
NASA Astrophysics Data System (ADS)
Legara, Erika Fille; Monterola, Christopher; Abundo, Cheryl
2011-01-01
We demonstrate an accurate procedure based on linear discriminant analysis that allows automatic authorship classification of opinion column articles. First, we extract the following stylometric features of 157 column articles from four authors: statistics on high frequency words, number of words per sentence, and number of sentences per paragraph. Then, by systematically ranking these features based on an effect size criterion, we show that we can achieve an average classification accuracy of 93% for the test set. In comparison, frequency size based ranking has an average accuracy of 80%. The highest possible average classification accuracy of our data merely relying on chance is ∼31%. By carrying out sensitivity analysis, we show that the effect size criterion is superior to frequency ranking because there exist low frequency words that significantly contribute to successful author discrimination. Consistent results are seen when the procedure is applied in classifying the undisputed Federalist papers of Alexander Hamilton and James Madison. To the best of our knowledge, this work is the first attempt at classifying opinion column articles, which, by virtue of being shorter in length (as compared to novels or short stories), are more prone to over-fitting issues. The near perfect classification for the longer papers supports this claim. Our results provide an important insight into authorship attribution that has been overlooked in previous studies: that ranking discriminant variables based on word frequency counts is not necessarily an optimal procedure.
Prinyakupt, Jaroonrut; Pluempitiwiriyawej, Charnchai
2015-06-30
Blood smear microscopic images are routinely investigated by haematologists to diagnose most blood diseases. However, the task is quite tedious and time consuming. An automatic detection and classification of white blood cells within such images can accelerate the process tremendously. In this paper we propose a system to locate white blood cells within microscopic blood smear images, segment them into nucleus and cytoplasm regions, extract suitable features and finally, classify them into five types: basophil, eosinophil, neutrophil, lymphocyte and monocyte. Two sets of blood smear images were used in this study's experiments. Dataset 1, collected from Rangsit University, were normal peripheral blood slides under light microscope with 100× magnification; 555 images with 601 white blood cells were captured by a Nikon DS-Fi2 high-definition color camera and saved in JPG format of size 960 × 1,280 pixels at 15 pixels per 1 μm resolution. In dataset 2, 477 cropped white blood cell images were downloaded from CellaVision.com. They are in JPG format of size 360 × 363 pixels. The resolution is estimated to be 10 pixels per 1 μm. The proposed system comprises a pre-processing step, nucleus segmentation, cell segmentation, feature extraction, feature selection and classification. The main concept of the segmentation algorithm employed uses white blood cell's morphological properties and the calibrated size of a real cell relative to image resolution. The segmentation process combined thresholding, morphological operation and ellipse curve fitting. Consequently, several features were extracted from the segmented nucleus and cytoplasm regions. Prominent features were then chosen by a greedy search algorithm called sequential forward selection. Finally, with a set of selected prominent features, both linear and naïve Bayes classifiers were applied for performance comparison. This system was tested on normal peripheral blood smear slide images from two datasets. 
Two sets of comparison were performed: segmentation and classification. The automatically segmented results were compared to the ones obtained manually by a haematologist. It was found that the proposed method is consistent and coherent in both datasets, with dice similarity of 98.9 and 91.6% for average segmented nucleus and cell regions, respectively. Furthermore, the overall correction rate in the classification phase is about 98 and 94% for linear and naïve Bayes models, respectively. The proposed system, based on normal white blood cell morphology and its characteristics, was applied to two different datasets. The results of the calibrated segmentation process on both datasets are fast, robust, efficient and coherent. Meanwhile, the classification of normal white blood cells into five types shows high sensitivity in both linear and naïve Bayes models, with slightly better results in the linear classifier.
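The Dice similarity used to compare automatic and manual segmentations can be sketched on binary masks as follows. The tiny masks are hypothetical and stand in for full-resolution nucleus or cell masks.

```python
def dice(mask_a, mask_b):
    """Dice similarity 2|A ∩ B| / (|A| + |B|) between two equally sized binary masks."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    size = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    return 2.0 * intersection / size if size else 1.0  # two empty masks agree fully


# Hypothetical 2x3 masks: automatic segmentation vs. a manual reference.
automatic = [[1, 1, 0], [0, 1, 0]]
manual = [[1, 0, 0], [0, 1, 1]]
score = dice(automatic, manual)
```

A score of 1 means the two masks coincide exactly, and 0 means they share no foreground pixel.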
Srivastava, Saurabh Kumar; Singh, Sandeep Kumar; Suri, Jasjit S
2018-04-13
A machine learning (ML)-based text classification system has several classifiers. The performance evaluation (PE) of the ML system is typically driven by the training data size and the partition protocols used. Such systems lead to low accuracy because the text classification systems lack the ability to model the input text data in terms of noise characteristics. This research study proposes a concept of misrepresentation ratio (MRR) on input healthcare text data and models the PE criteria for validating the hypothesis. Further, such a novel system provides a platform to amalgamate several attributes of the ML system such as: data size, classifier type, partitioning protocol and percentage MRR. Our comprehensive data analysis consisted of five types of text data sets (TwitterA, WebKB4, Disease, Reuters (R8), and SMS); five kinds of classifiers (support vector machine with linear kernel (SVM-L), MLP-based neural network, AdaBoost, stochastic gradient descent and decision tree); and five types of training protocols (K2, K4, K5, K10 and JK). In decreasing order of MRR, our ML system achieves mean classification accuracies of 70.13 ± 0.15%, 87.34 ± 0.06%, 93.73 ± 0.03%, 94.45 ± 0.03% and 97.83 ± 0.01%, respectively, using all the classifiers and protocols. The corresponding AUC is 0.98 for SMS data using the Multi-Layer Perceptron (MLP) based neural network. Among all the classifiers, the best accuracy, 91.84 ± 0.04%, is achieved by the MLP-based neural network, which is 6% better than previously published results. Further, we observed that the system robustness increases as MRR decreases, as validated by the standard deviations. The overall text system accuracy using all data types, classifiers and protocols is 89%, thereby showing the entire ML system to be novel, robust and unique. The system is also tested for stability and reliability.
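The partition protocols can be sketched as index splits. The reading of the labels is an assumption of this sketch: K2, K4, K5 and K10 are taken to mean 2-, 4-, 5- and 10-fold cross-validation, and JK is taken to mean jackknife (leave-one-out); the abstract does not define them. The function name is hypothetical.

```python
def kfold_indices(n_samples, k):
    """Yield (train, test) index lists for k-fold cross-validation.

    Samples are assigned to folds round-robin; shuffle the data beforehand
    if the original order is not random.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test


n_documents = 10  # hypothetical corpus size
k5_splits = list(kfold_indices(n_documents, 5))
jackknife_splits = list(kfold_indices(n_documents, n_documents))
```

Smaller k leaves less data for training in each split, which is one reason accuracy can vary with the protocol as well as with the data.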
Houshyarifar, Vahid; Chehel Amirani, Mehdi
2016-08-12
In this paper we present a method to predict Sudden Cardiac Arrest (SCA) with higher order spectral (HOS) and linear (Time) features extracted from the heart rate variability (HRV) signal. Predicting the occurrence of SCA is important in order to avoid the probability of Sudden Cardiac Death (SCD). This work addresses the challenge of predicting SCA five minutes before its onset. The method consists of four steps: pre-processing, feature extraction, feature reduction, and classification. In the first step, the QRS complexes are detected from the electrocardiogram (ECG) signal and then the HRV signal is extracted. In the second step, bispectrum features of the HRV signal and time-domain features are obtained. Six features are extracted from the bispectrum and two features from the time domain. In the next step, these features are reduced to one feature by the linear discriminant analysis (LDA) technique. Finally, KNN and support vector machine-based classifiers are used to classify the HRV signals. We used two databases, the MIT/BIH Sudden Cardiac Death (SCD) Database and the Physiobank Normal Sinus Rhythm (NSR) database. In this work we achieved prediction of SCD occurrence six minutes before the SCA with an accuracy of over 91%.
Full-motion video analysis for improved gender classification
NASA Astrophysics Data System (ADS)
Flora, Jeffrey B.; Lochtefeld, Darrell F.; Iftekharuddin, Khan M.
2014-06-01
The ability of computer systems to perform gender classification using the dynamic motion of the human subject has important applications in medicine, human factors, and human-computer interface systems. Previous works in motion analysis have used data from sensors (including gyroscopes, accelerometers, and force plates), radar signatures, and video. However, full-motion video, motion capture, and range data provide a higher-resolution temporal and spatial dataset for the analysis of dynamic motion. Works using motion capture data have been limited by small datasets in a controlled environment. In this paper, we apply machine learning techniques to a new dataset that has a larger number of subjects. Additionally, these subjects move unrestricted through a capture volume, representing a more realistic, less controlled environment. We conclude that existing linear classification methods are insufficient for gender classification on a larger dataset captured in a relatively uncontrolled environment. A method based on a nonlinear support vector machine classifier is proposed to obtain gender classification for the larger dataset. In experimental testing with a dataset consisting of 98 trials (49 subjects, 2 trials per subject), classification rates using leave-one-out cross-validation are improved from 73% using linear discriminant analysis to 88% using the nonlinear support vector machine classifier.
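The leave-one-out protocol mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: a nearest-centroid rule stands in for the LDA and SVM classifiers, and the two-feature toy data are invented. The abstract does not say whether one trial or one subject was held out per round; the sketch holds out a single sample.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Predict the label whose class mean is closest to the sample (stand-in classifier)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    best_label, best_dist = None, float("inf")
    for label, acc in sums.items():
        centroid = [value / counts[label] for value in acc]
        dist = sum((s - c) ** 2 for s, c in zip(sample, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def leave_one_out_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical, well-separated toy data with two classes.
xs = [[1.0, 1.2], [0.8, 1.0], [1.1, 0.9], [3.0, 3.2], [3.1, 2.9], [2.9, 3.0]]
ys = ["a", "a", "a", "b", "b", "b"]
accuracy = leave_one_out_accuracy(xs, ys)
```

With repeated trials per subject, holding out single trials leaves the same subject's other trial in the training set, so a subject-wise hold-out may be the stricter choice.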
Wang, Kun; Jiang, Tianzi; Liang, Meng; Wang, Liang; Tian, Lixia; Zhang, Xinqing; Li, Kuncheng; Liu, Zhening
2006-01-01
In this work, we proposed a discriminative model of Alzheimer's disease (AD) on the basis of multivariate pattern classification and functional magnetic resonance imaging (fMRI). This model used the correlation/anti-correlation coefficients of two intrinsically anti-correlated networks in resting brains, which have been suggested by two recent studies, as the feature of classification. Pseudo-Fisher Linear Discriminative Analysis (pFLDA) was then performed on the feature space and a linear classifier was generated. Using leave-one-out (LOO) cross validation, our results showed a correct classification rate of 83%. We also compared the proposed model with another one based on the whole brain functional connectivity. Our proposed model outperformed the other one significantly, and this implied that the two intrinsically anti-correlated networks may be a more susceptible part of the whole brain network in the early stage of AD.
A neural network approach to cloud classification
NASA Technical Reports Server (NTRS)
Lee, Jonathan; Weger, Ronald C.; Sengupta, Sailes K.; Welch, Ronald M.
1990-01-01
It is shown that, using high-spatial-resolution data, very high cloud classification accuracies can be obtained with a neural network approach. A texture-based neural network classifier using only single-channel visible Landsat MSS imagery achieves an overall cloud identification accuracy of 93 percent. Cirrus can be distinguished from boundary layer cloudiness with an accuracy of 96 percent, without the use of an infrared channel. Stratocumulus is retrieved with an accuracy of 92 percent, cumulus at 90 percent. The use of the neural network does not improve cirrus classification accuracy. Rather, its main effect is in the improved separation between stratocumulus and cumulus cloudiness. While most cloud classification algorithms rely on linear parametric schemes, the present study is based on a nonlinear, nonparametric four-layer neural network approach. A three-layer neural network architecture, the nonparametric K-nearest neighbor approach, and the linear stepwise discriminant analysis procedure are compared. A significant finding is that significantly higher accuracies are attained with the nonparametric approaches using only 20 percent of the database as training data, compared to 67 percent of the database in the linear approach.
Kolchinsky, A; Lourenço, A; Li, L; Rocha, L M
2013-01-01
Drug-drug interaction (DDI) is a major cause of morbidity and mortality. DDI research includes the study of different aspects of drug interactions, from in vitro pharmacology, which deals with drug interaction mechanisms, to pharmaco-epidemiology, which investigates the effects of DDI on drug efficacy and adverse drug reactions. Biomedical literature mining can aid both kinds of approaches by extracting relevant DDI signals from either the published literature or large clinical databases. However, though drug interaction is an ideal area for translational research, the inclusion of literature mining methodologies in DDI workflows is still very preliminary. One area that can benefit from literature mining is the automatic identification of a large number of potential DDIs, whose pharmacological mechanisms and clinical significance can then be studied via in vitro pharmacology and in populo pharmaco-epidemiology. We implemented a set of classifiers for identifying published articles relevant to experimental pharmacokinetic DDI evidence. These documents are important for identifying causal mechanisms behind putative drug-drug interactions, an important step in the extraction of large numbers of potential DDIs. We evaluate performance of several linear classifiers on PubMed abstracts, under different feature transformation and dimensionality reduction methods. In addition, we investigate the performance benefits of including various publicly-available named entity recognition features, as well as a set of internally-developed pharmacokinetic dictionaries. We found that several classifiers performed well in distinguishing relevant and irrelevant abstracts. We found that the combination of unigram and bigram textual features gave better performance than unigram features alone, and also that normalization transforms that adjusted for feature frequency and document length improved classification. 
For some classifiers, such as linear discriminant analysis (LDA), proper dimensionality reduction had a large impact on performance. Finally, the inclusion of NER features and dictionaries was found not to help classification.
Abdelnour, A. Farras; Huppert, Theodore
2009-01-01
Near-infrared spectroscopy is a non-invasive neuroimaging method which uses light to measure changes in cerebral blood oxygenation associated with brain activity. In this work, we demonstrate the ability to record and analyze images of brain activity in real-time using a 16-channel continuous wave optical NIRS system. We propose a novel real-time analysis framework using an adaptive Kalman filter and a state–space model based on a canonical general linear model of brain activity. We show that our adaptive model has the ability to estimate single-trial brain activity events as we apply this method to track and classify experimental data acquired during an alternating bilateral self-paced finger tapping task. PMID:19457389
Law, Tameeka L; Katikaneni, Lakshmi D; Taylor, Sarah N; Korte, Jeffrey E; Ebeling, Myla D; Wagner, Carol L; Newman, Roger B
2012-07-01
Compare customized versus population-based growth curves for identification of small-for-gestational-age (SGA) and body fat percent (BF%) among preterm infants. Prospective cohort study of 204 preterm infants classified as SGA or appropriate-for-gestational-age (AGA) by population-based and customized growth curves. BF% was determined by air-displacement plethysmography. Differences between groups were compared using bivariable and multivariable linear and logistic regression analyses. Customized curves reclassified 30% of the preterm infants as SGA. SGA infants identified by customized method only had significantly lower BF% (13.8 ± 6.0) than the AGA (16.2 ± 6.3, p = 0.02) infants and similar to the SGA infants classified by both methods (14.6 ± 6.7, p = 0.51). Customized growth curves were a significant predictor of BF% (p = 0.02), whereas population-based growth curves were not a significant independent predictor of BF% (p = 0.50) at term corrected gestational age. Customized growth potential improves the differentiation of SGA infants and low BF% compared with a standard population-based growth curve among a cohort of preterm infants.
NASA Astrophysics Data System (ADS)
Yekkehkhany, B.; Safari, A.; Homayouni, S.; Hasanlou, M.
2014-10-01
In this paper, a framework is developed based on Support Vector Machines (SVM) for crop classification using polarimetric features extracted from multi-temporal Synthetic Aperture Radar (SAR) imagery. The multi-temporal integration of data not only improves the overall retrieval accuracy but also provides more reliable estimates with respect to single-date data. Several kernel functions are employed and compared in this study for mapping the input space to a higher-dimensional Hilbert space. These kernel functions include linear, polynomial and Radial Basis Function (RBF) kernels. The method is applied to several UAVSAR L-band SAR images acquired over an agricultural area near Winnipeg, Manitoba, Canada. In this research, the temporal alpha features of the H/A/α decomposition method are used in classification. The experimental tests show that an SVM classifier with an RBF kernel for three dates of data increases the Overall Accuracy (OA) by up to 3% in comparison to using a linear kernel function, and by up to 1% in comparison to a 3rd degree polynomial kernel function.
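The three kernel families compared above have standard textbook forms, sketched below. The parameter names and default values (`gamma`, `coef0`, `degree`) are assumptions for illustration; the abstract reports only that a 3rd degree polynomial was among the kernels tested.

```python
import math


def dot(x, y):
    """Inner product of two equally long feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    """k(x, y) = <x, y>"""
    return dot(x, y)


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """k(x, y) = (gamma * <x, y> + coef0) ** degree"""
    return (gamma * dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """k(x, y) = exp(-gamma * ||x - y||^2)"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-dimensional feature vectors.
u = [1.0, 2.0]
v = [3.0, 4.0]
k_lin = linear_kernel(u, v)
k_poly = polynomial_kernel([1.0, 0.0], [1.0, 0.0])
k_rbf = rbf_kernel([0.0, 0.0], [1.0, 1.0])
```

The RBF kernel depends only on the distance between samples and equals 1 for identical inputs, whereas the linear and polynomial kernels grow with the inner product.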
Balouchestani, Mohammadreza; Krishnan, Sridhar
2014-01-01
Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for the diagnosis and treatment of heart diseases. Clustering and classification of the collected data are essential for detecting concealed information in the P-QRS-T waves of long-term ECG recordings. Currently used algorithms have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from high energy consumption and sampling load. These drawbacks motivated us to develop a novel optimized clustering algorithm that can easily scan large ECG datasets, enabling low-power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by sorting of the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are applied to the proposed algorithm. Our algorithm based on PCA features in combination with the K-NN classifier performs better than the other methods. The proposed algorithm outperforms existing algorithms, increasing classification accuracy by 11%. In addition, the proposed algorithm achieves classification accuracies for the K-NN and PNN classifiers and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75%, respectively.
Rodriguez-Diaz, Eladio; Castanon, David A; Singh, Satish K; Bigio, Irving J
2011-06-01
Optical spectroscopy has shown potential as a real-time, in vivo, diagnostic tool for identifying neoplasia during endoscopy. We present the development of a diagnostic algorithm to classify elastic-scattering spectroscopy (ESS) spectra as either neoplastic or non-neoplastic. The algorithm is based on pattern recognition methods, including ensemble classifiers, in which members of the ensemble are trained on different regions of the ESS spectrum, and misclassification-rejection, where the algorithm identifies and refrains from classifying samples that are at higher risk of being misclassified. These "rejected" samples can be reexamined by simply repositioning the probe to obtain additional optical readings or ultimately by sending the polyp for histopathological assessment, as per standard practice. Prospective validation using separate training and testing sets results in a baseline performance of sensitivity = .83, specificity = .79, using the standard framework of feature extraction (principal component analysis) followed by classification (with linear support vector machines). With the developed algorithm, performance improves to Se ∼ 0.90, Sp ∼ 0.90, at a cost of rejecting 20-33% of the samples. These results are on par with a panel of expert pathologists. For colonoscopic prevention of colorectal cancer, our system could reduce biopsy risk and cost, obviate retrieval of non-neoplastic polyps, decrease procedure time, and improve assessment of cancer risk.
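The misclassification-rejection idea described above can be illustrated with a simple reject band around the decision threshold. This is only a sketch under assumptions: the abstract does not describe the actual rejection rule, and the scores, threshold and margin below are invented.

```python
def classify_with_rejection(score, threshold=0.0, margin=0.5):
    """Refuse to decide when the score lies within `margin` of the threshold."""
    if abs(score - threshold) < margin:
        return "rejected"
    return "neoplastic" if score > threshold else "non-neoplastic"


def summarize(scores, truths, threshold=0.0, margin=0.5):
    """Return (sensitivity, specificity, rejection_rate) over the accepted samples."""
    tp = fn = tn = fp = rejected = 0
    for score, truth in zip(scores, truths):
        decision = classify_with_rejection(score, threshold, margin)
        if decision == "rejected":
            rejected += 1
        elif truth == "neoplastic":
            if decision == "neoplastic":
                tp += 1
            else:
                fn += 1
        else:
            if decision == "non-neoplastic":
                tn += 1
            else:
                fp += 1
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity, rejected / len(scores)


# Hypothetical classifier scores and ground-truth labels.
scores = [1.2, 0.3, -0.2, -0.9, 0.8, -1.5, 0.1, -0.4]
truths = ["neoplastic", "neoplastic", "neoplastic", "non-neoplastic",
          "neoplastic", "non-neoplastic", "non-neoplastic", "non-neoplastic"]
baseline = summarize(scores, truths, margin=0.0)
with_rejection = summarize(scores, truths, margin=0.5)
```

On this toy data, rejecting the four samples nearest the threshold raises both sensitivity and specificity from 0.75 to 1.0 at a rejection rate of 50%, which mirrors the trade-off reported in the abstract.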
Rodriguez-Diaz, Eladio; Castanon, David A.; Singh, Satish K.; Bigio, Irving J.
2011-01-01
Optical spectroscopy has shown potential as a real-time, in vivo, diagnostic tool for identifying neoplasia during endoscopy. We present the development of a diagnostic algorithm to classify elastic-scattering spectroscopy (ESS) spectra as either neoplastic or non-neoplastic. The algorithm is based on pattern recognition methods, including ensemble classifiers, in which members of the ensemble are trained on different regions of the ESS spectrum, and misclassification-rejection, where the algorithm identifies and refrains from classifying samples that are at higher risk of being misclassified. These “rejected” samples can be reexamined by simply repositioning the probe to obtain additional optical readings or ultimately by sending the polyp for histopathological assessment, as per standard practice. Prospective validation using separate training and testing sets results in a baseline performance of sensitivity = .83, specificity = .79, using the standard framework of feature extraction (principal component analysis) followed by classification (with linear support vector machines). With the developed algorithm, performance improves to Se ∼ 0.90, Sp ∼ 0.90, at a cost of rejecting 20–33% of the samples. These results are on par with a panel of expert pathologists. For colonoscopic prevention of colorectal cancer, our system could reduce biopsy risk and cost, obviate retrieval of non-neoplastic polyps, decrease procedure time, and improve assessment of cancer risk. PMID:21721830
Wang, Jinjia; Liu, Yuan
2015-04-01
This paper presents a feature extraction method that combines multivariate empirical mode decomposition (MEMD) with power spectrum features, aimed at the non-stationary electroencephalogram (EEG) or magnetoencephalogram (MEG) signals in brain-computer interface (BCI) systems. First, we used the MEMD algorithm to decompose multichannel brain signals into a series of multi-scale intrinsic mode functions (IMFs), each of which is approximately stationary. Then we extracted the power features from each IMF and reduced them to a lower dimension using principal component analysis (PCA). Finally, we classified the motor imagery tasks with a linear discriminant analysis classifier. Experimental verification showed that the correct recognition rates of the two-class and four-class tasks of BCI competition III and competition IV reached 92.0% and 46.2%, respectively, which were superior to those of the winner of the BCI competition. The experiments showed that the proposed method is reasonably effective and stable, and that it provides a new approach to feature extraction.
NASA Astrophysics Data System (ADS)
Fujiki, Shogoro; Okada, Kei-ichi; Nishio, Shogo; Kitayama, Kanehiro
2016-09-01
We developed a new method to estimate stand ages of secondary vegetation in the Bornean montane zone, where local people conduct traditional shifting cultivation and protected areas are surrounded by patches of recovering secondary vegetation of various ages. Identifying stand ages at the landscape level is critical to improve conservation policies. We combined a high-resolution satellite image (WorldView-2) with time-series Landsat images. We extracted stand ages (the time elapsed since the most recent slash and burn) from a change-detection analysis with Landsat time-series images and superimposed the derived stand ages on the segments classified by object-based image analysis using WorldView-2. We regarded stand ages as a response variable, and object-based metrics as independent variables, to develop regression models that explain stand ages. Subsequently, we classified the vegetation of the target area into six age units and one rubber plantation unit (1-3 yr, 3-5 yr, 5-7 yr, 7-30 yr, 30-50 yr, >50 yr and 'rubber plantation') using regression models and linear discriminant analyses. Validation demonstrated an accuracy of 84.3%. Our approach is particularly effective in classifying highly dynamic pioneer vegetation younger than 7 years into 2-yr intervals, suggesting that rapid changes in vegetation canopies can be detected with high accuracy. The combination of a spectral time-series analysis and object-based metrics based on high-resolution imagery enabled the classification of dynamic vegetation under intensive shifting cultivation and yielded an informative land cover map based on stand ages.
Automatic Organ Segmentation for CT Scans Based on Super-Pixel and Convolutional Neural Networks.
Liu, Xiaoming; Guo, Shuxu; Yang, Bingtao; Ma, Shuzhi; Zhang, Huimao; Li, Jing; Sun, Changjian; Jin, Lanyi; Li, Xueyan; Yang, Qi; Fu, Yu
2018-04-20
Accurate segmentation of a specific organ from computed tomography (CT) scans is a basic and crucial task for accurate diagnosis and treatment. To avoid time-consuming manual optimization and to help physicians distinguish diseases, an automatic organ segmentation framework is presented. The framework uses convolutional neural networks (CNN) to classify pixels. To reduce redundant inputs, simple linear iterative clustering (SLIC) of super-pixels and a support vector machine (SVM) classifier are introduced. To establish a precise organ boundary at the one-pixel level, the pixels need to be classified step by step. First, the SLIC is used to cut an image into grids and extract respective digital signatures. Next, the signature is classified by the SVM, and the rough edges are acquired. Finally, a precise boundary is obtained by the CNN, which is based on patches around each pixel-point. The framework is applied to abdominal CT scans of livers and high-resolution computed tomography (HRCT) scans of lungs. The experimental CT scans are derived from two public datasets (Sliver 07 and a Chinese local dataset). Experimental results show that the proposed method can precisely and efficiently detect the organs. This method consumes 38 s/slice for liver segmentation. The Dice coefficient of the liver segmentation results reaches 97.43%. For lung segmentation, the Dice coefficient is 97.93%. This finding demonstrates that the proposed framework is a favorable method for lung segmentation of HRCT scans.
Automatic staging of bladder cancer on CT urography
NASA Astrophysics Data System (ADS)
Garapati, Sankeerth S.; Hadjiiski, Lubomir M.; Cha, Kenny H.; Chan, Heang-Ping; Caoili, Elaine M.; Cohan, Richard H.; Weizer, Alon; Alva, Ajjai; Paramagul, Chintana; Wei, Jun; Zhou, Chuan
2016-03-01
Correct staging of bladder cancer is crucial for the decision of neoadjuvant chemotherapy treatment and minimizing the risk of under- or over-treatment. Subjectivity and variability of clinicians in utilizing available diagnostic information may lead to inaccuracy in staging bladder cancer. An objective decision support system that merges the information in a predictive model based on statistical outcomes of previous cases and machine learning may assist clinicians in making more accurate and consistent staging assessments. In this study, we developed a preliminary method to stage bladder cancer. With IRB approval, 42 bladder cancer cases with CTU scans were collected from patient files. The cases were classified into two classes based on pathological stage T2, which is the clinical decision threshold for neoadjuvant chemotherapy treatment (i.e. for stage >=T2). There were 21 cancers below stage T2 and 21 cancers at stage T2 or above. All 42 lesions were automatically segmented using our auto-initialized cascaded level sets (AI-CALS) method. Morphological features were extracted, then selected and merged by a linear discriminant analysis (LDA) classifier. A leave-one-case-out resampling scheme was used to train and test the classifier using the 42 lesions. The classification accuracy was quantified using the area under the ROC curve (Az). The average training Az was 0.97 and the test Az was 0.85. The classifier consistently selected the lesion volume, a gray level feature and a contrast feature. This predictive model shows promise for assisting in assessing the bladder cancer stage.
Max-AUC Feature Selection in Computer-Aided Detection of Polyps in CT Colonography
Xu, Jian-Wu; Suzuki, Kenji
2014-01-01
We propose a feature selection method based on a sequential forward floating selection (SFFS) procedure to improve the performance of a classifier in computerized detection of polyps in CT colonography (CTC). The feature selection method is coupled with a nonlinear support vector machine (SVM) classifier. Unlike the conventional linear method based on Wilks' lambda, the proposed method selected the most relevant features that would maximize the area under the receiver operating characteristic curve (AUC), which directly maximizes classification performance, evaluated based on AUC value, in the computer-aided detection (CADe) scheme. We presented two variants of the proposed method with different stopping criteria used in the SFFS procedure. The first variant searched all feature combinations allowed in the SFFS procedure and selected the subsets that maximize the AUC values. The second variant performed a statistical test at each step during the SFFS procedure, and it was terminated if the increase in the AUC value was not statistically significant. The advantage of the second variant is its lower computational cost. To test the performance of the proposed method, we compared it against the popular stepwise feature selection method based on Wilks' lambda for a colonic-polyp database (25 polyps and 2624 nonpolyps). We extracted 75 morphologic, gray-level-based, and texture features from the segmented lesion candidate regions. The two variants of the proposed feature selection method chose 29 and 7 features, respectively. Two SVM classifiers trained with these selected features yielded a 96% by-polyp sensitivity at false-positive (FP) rates of 4.1 and 6.5 per patient, respectively. Experiments showed a significant improvement in the performance of the classifier with the proposed feature selection method over that with the popular stepwise feature selection based on Wilks' lambda that yielded 18.0 FPs per patient at the same sensitivity level. PMID:24608058
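The selection criterion named above, the area under the ROC curve, can be computed directly as the probability that a randomly chosen positive scores higher than a randomly chosen negative. The sketch below shows only this criterion, not the SFFS search itself; the function name and toy scores are illustrative.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier outputs: 1 = polyp, 0 = non-polyp.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [1, 0, 1, 0, 0]
area = auc(scores, labels)
```

Of the six positive-negative pairs here, five are ranked correctly, so the area is 5/6. A selection procedure that maximizes AUC would evaluate this quantity for each candidate feature subset.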
Improving EMG based classification of basic hand movements using EMD.
Sapsanis, Christos; Georgoulas, George; Tzes, Anthony; Lymberopoulos, Dimitrios
2013-01-01
This paper presents a pattern recognition approach for the identification of basic hand movements using surface electromyographic (EMG) data. The EMG signal is decomposed using Empirical Mode Decomposition (EMD) into Intrinsic Mode Functions (IMFs) and subsequently a feature extraction stage takes place. Various combinations of feature subsets are tested using a simple linear classifier for the detection task. Our results suggest that the use of EMD can increase the discrimination ability of the conventional feature sets extracted from the raw EMG signal.
Acquah, Gifty E.; Via, Brian K.; Billor, Nedret; Fasina, Oladiran O.; Eckhardt, Lori G.
2016-01-01
As new markets, technologies and economies evolve in the low carbon bioeconomy, forest logging residue, a largely untapped renewable resource will play a vital role. The feedstock can however be variable depending on plant species and plant part component. This heterogeneity can influence the physical, chemical and thermochemical properties of the material, and thus the final yield and quality of products. Although it is challenging to control compositional variability of a batch of feedstock, it is feasible to monitor this heterogeneity and make the necessary changes in process parameters. Such a system will be a first step towards optimization, quality assurance and cost-effectiveness of processes in the emerging biofuel/chemical industry. The objective of this study was therefore to qualitatively classify forest logging residue made up of different plant parts using both near infrared spectroscopy (NIRS) and Fourier transform infrared spectroscopy (FTIRS) together with linear discriminant analysis (LDA). Forest logging residue harvested from several Pinus taeda (loblolly pine) plantations in Alabama, USA, were classified into three plant part components: clean wood, wood and bark and slash (i.e., limbs and foliage). Five-fold cross-validated linear discriminant functions had classification accuracies of over 96% for both NIRS and FTIRS based models. An extra factor/principal component (PC) was however needed to achieve this in FTIRS modeling. Analysis of factor loadings of both NIR and FTIR spectra showed that, the statistically different amount of cellulose in the three plant part components of logging residue contributed to their initial separation. This study demonstrated that NIR or FTIR spectroscopy coupled with PCA and LDA has the potential to be used as a high throughput tool in classifying the plant part makeup of a batch of forest logging residue feedstock. 
Thus, NIR/FTIR could be employed as a tool to rapidly probe/monitor the variability of forest biomass so that the appropriate online adjustments to parameters can be made in time to ensure process optimization and product quality. PMID:27618901
ERIC Educational Resources Information Center
Moreno, Mario; Harwell, Michael; Guzey, S. Selcen; Phillips, Alison; Moore, Tamara J.
2016-01-01
Hierarchical linear models have become a familiar method for accounting for a hierarchical data structure in studies of science and mathematics achievement. This paper illustrates the use of cross-classified random effects models (CCREMs), which are likely less familiar. The defining characteristic of CCREMs is a hierarchical data structure…
NASA Technical Reports Server (NTRS)
Scholz, D.; Fuhs, N.; Hixson, M.
1979-01-01
The overall objective of this study was to apply and evaluate several of the currently available classification schemes for crop identification. The approaches examined were: (1) a per point Gaussian maximum likelihood classifier, (2) a per point sum of normal densities classifier, (3) a per point linear classifier, (4) a per point Gaussian maximum likelihood decision tree classifier, and (5) a texture sensitive per field Gaussian maximum likelihood classifier. Three agricultural data sets were used in the study: areas from Fayette County, Illinois, and Pottawattamie and Shelby Counties in Iowa. The segments were located in two distinct regions of the Corn Belt to sample variability in soils, climate, and agricultural practices.
Toward Automated Cochlear Implant Fitting Procedures Based on Event-Related Potentials.
Finke, Mareike; Billinger, Martin; Büchner, Andreas
Cochlear implants (CIs) restore hearing to the profoundly deaf by direct electrical stimulation of the auditory nerve. To provide an optimal electrical stimulation pattern the CI must be individually fitted to each CI user. To date, CI fitting is primarily based on subjective feedback from the user. However, not all CI users are able to provide such feedback, for example, small children. This study explores the possibility of using the electroencephalogram (EEG) to objectively determine if CI users are able to hear differences in tones presented to them, which has potential applications in CI fitting or closed loop systems. Deviant and standard stimuli were presented to 12 CI users in an active auditory oddball paradigm. The EEG was recorded in two sessions and classification of the EEG data was performed with shrinkage linear discriminant analysis. Also, the impact of CI artifact removal on classification performance and the possibility to reuse a trained classifier in future sessions were evaluated. Overall, classification performance was above chance level for all participants although performance varied considerably between participants. Also, artifacts were successfully removed from the EEG without impairing classification performance. Finally, reuse of the classifier causes only a small loss in classification performance. Our data provide first evidence that EEG can be automatically classified on a single-trial basis in CI users. Despite the slightly poorer classification performance over sessions, classifier and CI artifact correction appear stable over successive sessions. Thus, classifier and artifact correction weights can be reused without repeating the set-up procedure in every session, which makes the technique easier to apply. With our present data, we can show successful classification of event-related cortical potential patterns in CI users. In the future, this has the potential to objectify and automate parts of CI fitting procedures.
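Shrinkage LDA, as used above, replaces the sample covariance with a regularized estimate before computing the discriminant. A common form pulls the covariance toward a scaled identity matrix; the sketch below shows that step only. The fixed shrinkage value is an assumption for illustration, since the abstract does not say how the shrinkage intensity was chosen.

```python
def shrink_covariance(cov, shrinkage):
    """Return (1 - s) * cov + s * nu * I, where nu is the mean of the diagonal of cov."""
    if not 0.0 <= shrinkage <= 1.0:
        raise ValueError("shrinkage must lie in [0, 1]")
    dim = len(cov)
    mean_variance = sum(cov[i][i] for i in range(dim)) / dim
    return [
        [
            (1.0 - shrinkage) * cov[i][j]
            + (shrinkage * mean_variance if i == j else 0.0)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


# Hypothetical 2x2 sample covariance and an arbitrary shrinkage intensity.
sample_cov = [[4.0, 2.0],
              [2.0, 2.0]]
shrunk = shrink_covariance(sample_cov, 0.5)
```

The off-diagonal terms are halved and the diagonal moves toward the average variance of 3.0, which keeps the estimate well conditioned when trials are few relative to the number of EEG features.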
False alarm reduction by the And-ing of multiple multivariate Gaussian classifiers
NASA Astrophysics Data System (ADS)
Dobeck, Gerald J.; Cobb, J. Tory
2003-09-01
The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. This paper describes a method for training several multivariate Gaussian classifiers such that their And-ing dramatically reduces false alarms while maintaining a high probability of classification. This training approach is referred to as the Focused-Training method. This work extends our 2001-2002 work where the Focused-Training method was used with three other types of classifiers: the Attractor-based K-Nearest Neighbor Neural Network (a type of radial-basis, probabilistic neural network), the Optimal Discrimination Filter Classifier (based on linear discrimination theory), and the Quadratic Penalty Function Support Vector Machine (QPFSVM). Although our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to a wide range of pattern recognition and automatic target recognition (ATR) problems.
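The And-ing idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' Focused-Training method: the diagonal-covariance Gaussians, the log-likelihood-ratio threshold, the class names and the toy feature values are all assumptions made for illustration. A contact is declared a target only if every classifier, each looking at its own feature view, agrees.

```python
import math


def fit_diag_gaussian(samples):
    """Estimate per-feature mean and variance (diagonal covariance, an assumption)."""
    n = len(samples)
    dims = len(samples[0])
    means = [sum(s[d] for s in samples) / n for d in range(dims)]
    variances = [
        max(sum((s[d] - means[d]) ** 2 for s in samples) / n, 1e-9)
        for d in range(dims)
    ]
    return means, variances


def log_likelihood(x, model):
    """Log-density of x under a diagonal Gaussian model."""
    means, variances = model
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
        for xi, m, v in zip(x, means, variances)
    )


class GaussianClassifier:
    """Two-class Gaussian classifier that thresholds a log-likelihood ratio."""

    def __init__(self, target_samples, clutter_samples, threshold=0.0):
        self.target = fit_diag_gaussian(target_samples)
        self.clutter = fit_diag_gaussian(clutter_samples)
        self.threshold = threshold

    def is_target(self, x):
        llr = log_likelihood(x, self.target) - log_likelihood(x, self.clutter)
        return llr > self.threshold


def and_fusion(classifiers, feature_views):
    """Declare a target only if every classifier votes 'target' on its own view."""
    return all(c.is_target(v) for c, v in zip(classifiers, feature_views))


# Toy training data (hypothetical): one feature per view.
clf_a = GaussianClassifier([[4.8], [5.1], [5.3], [4.9]], [[0.1], [-0.2], [0.3], [0.0]])
clf_b = GaussianClassifier([[4.7], [5.2], [5.0], [5.1]], [[0.2], [-0.1], [0.0], [0.1]])

true_target = and_fusion([clf_a, clf_b], [[5.0], [5.0]])
false_alarm = and_fusion([clf_a, clf_b], [[5.0], [0.0]])  # view A fooled, view B is not
```

The design trade-off is the usual one for a logical AND: a false alarm must fool every classifier to survive, while a true target must be accepted by all of them, so each individual classifier needs a high detection rate.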
Heart rate variability (HRV): an indicator of stress
NASA Astrophysics Data System (ADS)
Kaur, Balvinder; Durek, Joseph J.; O'Kane, Barbara L.; Tran, Nhien; Moses, Sophia; Luthra, Megha; Ikonomidou, Vasiliki N.
2014-05-01
Heart rate variability (HRV) can be an important indicator of several conditions that affect the autonomic nervous system, including traumatic brain injury, post-traumatic stress disorder and peripheral neuropathy [3], [4], [10] & [11]. Recent work has shown that some of the HRV features can potentially be used for distinguishing a subject's normal mental state from a stressed one [4], [13] & [14]. In all of these past works, although processing is done in both frequency and time domains, few classification algorithms have been explored for classifying normal from stressed RR-intervals. In this paper we used 30 s intervals from the Electrocardiogram (ECG) time series collected during normal and stressed conditions, produced by means of a modified version of the Trier social stress test, to compute HRV-driven features and subsequently applied a set of classification algorithms to distinguish stressed from normal conditions. To classify RR-intervals, we explored classification algorithms that are commonly used for medical applications, namely 1) logistic regression (LR) [16] and 2) linear discriminant analysis (LDA) [6]. Classification performance for various levels of stress over the entire test was quantified using precision, accuracy, sensitivity and specificity measures. Results from both classifiers were then compared to find an optimal classifier and HRV features for stress detection. This work, performed under an IRB-approved protocol, not only provides a method for developing models and classifiers based on human data, but also provides a foundation for a stress indicator tool based on HRV. Further, these classification tools will not only benefit many civilian applications for detecting stress, but also security and military applications for screening such as: border patrol, stress detection for deception [3],[17], and wounded-warrior triage [12].
NASA Astrophysics Data System (ADS)
Smid, Marek; Costa, Ana; Pebesma, Edzer; Granell, Carlos; Bhattacharya, Devanjan
2016-04-01
Humankind is currently predominantly urban-based, and the majority of the ever-continuing population growth will take place in urban agglomerations. Urban systems are not only major drivers of climate change, but also the impact hot spots. Furthermore, climate change impacts are commonly managed at city scale. Therefore, assessing climate change impacts on urban systems is a very relevant subject of research. Climate and its impacts on all levels (local, meso and global scale) and also the inter-scale dependencies of those processes should be subject to detailed analysis. While global and regional projections of future climate are currently available, local-scale information is lacking. Hence, statistical downscaling methodologies represent a potentially efficient way to help to close this gap. In general, the methodological reviews of downscaling procedures cover the various methods according to their application (e.g. downscaling for the hydrological modelling). Some of the most recent and comprehensive studies, such as the ESSEM COST Action ES1102 (VALUE), use the concept of Perfect Prog and MOS. Other examples of classification schemes of downscaling techniques consider three main categories: linear methods, weather classifications and weather generators. Downscaling and climate modelling represent a multidisciplinary field, where researchers from various backgrounds intersect their efforts, resulting in specific terminology, which may be somewhat confusing. For instance, the Polynomial Regression (also called the Surface Trend Analysis) is a statistical technique. In the context of the spatial interpolation procedures, it is commonly classified as a deterministic technique, and kriging approaches are classified as stochastic.
Furthermore, the terms "statistical" and "stochastic" (frequently used as names of sub-classes in downscaling methodological reviews) are not always considered as synonymous, even though both terms could be seen as identical since they are referring to methods handling input modelling factors as variables with certain probability distributions. In addition, recent development is moving towards multi-step methodologies containing deterministic and stochastic components. This evolution leads to the introduction of new terms like hybrid or semi-stochastic approaches, which makes the efforts to systematically classify downscaling methods into the previously defined categories even more challenging. This work presents a review of statistical downscaling procedures, which classifies the methods in two steps. In the first step, we describe several techniques that produce a single climatic surface based on observations. The methods are classified into two categories using an approximation to the broadest consensual statistical terms: linear and non-linear methods. The second step covers techniques that use simulations to generate alternative surfaces, which correspond to different realizations of the same processes. Those simulations are essential because there is a limited number of real observational data, and such procedures are crucial for modelling extremes. This work emphasises the link between statistical downscaling methods and the research of climate change impacts at city scale.
NASA Astrophysics Data System (ADS)
Jelinek, Herbert F.; Cree, Michael J.; Leandro, Jorge J. G.; Soares, João V. B.; Cesar, Roberto M.; Luckie, A.
2007-05-01
Proliferative diabetic retinopathy can lead to blindness. However, early recognition allows appropriate, timely intervention. Fluorescein-labeled retinal blood vessels of 27 digital images were automatically segmented using the Gabor wavelet transform and classified using traditional features such as area, perimeter, and an additional five morphological features based on the derivatives-of-Gaussian wavelet-derived data. Discriminant analysis indicated that traditional features do not detect early proliferative retinopathy. The best single feature for discrimination was the wavelet curvature with an area under the curve (AUC) of 0.76. Linear discriminant analysis with a selection of six features achieved an AUC of 0.90 (0.73-0.97, 95% confidence interval). The wavelet method was able to segment retinal blood vessels and classify the images according to the presence or absence of proliferative retinopathy.
Beevi, K Sabeena; Nair, Madhu S; Bindu, G R
2016-08-01
The exact measure of mitotic nuclei is a crucial parameter in breast cancer grading and prognosis. This can be achieved by improving the mitotic detection accuracy by careful design of segmentation and classification techniques. In this paper, segmentation of nuclei from breast histopathology images is carried out by a Localized Active Contour Model (LACM) utilizing bio-inspired optimization techniques in the detection stage, in order to handle diffused intensities present along object boundaries. Further, the application of a new optimal machine learning algorithm capable of classifying strongly non-linear data, such as Random Kitchen Sink (RKS), shows improved classification performance. The proposed method has been tested on the Mitosis detection in breast cancer histological images (MITOS) dataset provided for MITOS-ATYPIA CONTEST 2014. The proposed framework achieved 95% recall, 98% precision and 96% F-score.
Research on driver fatigue detection
NASA Astrophysics Data System (ADS)
Zhang, Ting; Chen, Zhong; Ouyang, Chao
2018-03-01
Driver fatigue is one of the main causes of frequent traffic accidents, so a driver fatigue detection system is of great importance in avoiding them. This paper presents a real-time method based on fusion of multiple facial features, including eye closure, yawn and head movement. The eye state is classified as being open or closed by a linear SVM classifier trained using HOG features of the detected eye. The mouth state is determined according to the width-height ratio of the mouth. The head movement is detected by the head pitch angle calculated from facial landmarks. The driver's fatigue state can be inferred by a model trained on the above features. According to experimental results, driver fatigue detection obtains an excellent performance. This indicates that the developed method is valuable for avoiding traffic accidents caused by driver fatigue.
Poynton, Clare B; Chen, Kevin T; Chonde, Daniel B; Izquierdo-Garcia, David; Gollub, Randy L; Gerstner, Elizabeth R; Batchelor, Tracy T; Catana, Ciprian
2014-01-01
We present a new MRI-based attenuation correction (AC) approach for integrated PET/MRI systems that combines both segmentation- and atlas-based methods by incorporating dual-echo ultra-short echo-time (DUTE) and T1-weighted (T1w) MRI data and a probabilistic atlas. Segmented atlases were constructed from CT training data using a leave-one-out framework and combined with T1w, DUTE, and CT data to train a classifier that computes the probability of air/soft tissue/bone at each voxel. This classifier was applied to segment the MRI of the subject of interest and attenuation maps (μ-maps) were generated by assigning specific linear attenuation coefficients (LACs) to each tissue class. The μ-maps generated with this "Atlas-T1w-DUTE" approach were compared to those obtained from DUTE data using a previously proposed method. For validation of the segmentation results, segmented CT μ-maps were considered the "silver standard"; the segmentation accuracy was assessed qualitatively and quantitatively through calculation of the Dice similarity coefficient (DSC). Relative change (RC) maps between the CT and MRI-based attenuation corrected PET volumes were also calculated for a global voxel-wise assessment of the reconstruction results. The μ-maps obtained using the Atlas-T1w-DUTE classifier agreed well with those derived from CT; the mean DSCs for the Atlas-T1w-DUTE-based μ-maps across all subjects were higher than those for DUTE-based μ-maps; the atlas-based μ-maps also showed a lower percentage of misclassified voxels across all subjects. RC maps from the atlas-based technique also demonstrated improvement in the PET data compared to the DUTE method, both globally as well as regionally.
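For readers unfamiliar with the Dice similarity coefficient used above to score segmentation accuracy, a minimal sketch follows. The standard definition is DSC = 2|A ∩ B| / (|A| + |B|); representing each segmentation as a set of voxel coordinates, and the function name, are choices made here for illustration only.

```python
def dice_coefficient(voxels_a, voxels_b):
    """Dice similarity coefficient between two segmentations.

    Each argument is an iterable of voxel coordinates (e.g. (x, y, z) tuples)
    labelled as belonging to the tissue class of interest.
    """
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        # Two empty segmentations agree perfectly (a convention, not a universal rule).
        return 1.0
    return 2.0 * len(a & b) / (len(a) + len(b))


# Toy example: two bone masks that share two of their three voxels.
mri_bone = {(0, 0, 0), (0, 0, 1), (0, 1, 0)}
ct_bone = {(0, 0, 1), (0, 1, 0), (1, 1, 1)}
dsc = dice_coefficient(mri_bone, ct_bone)
```

A value of 1 means the two masks coincide exactly and 0 means they do not overlap at all.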
Adaptive Road Crack Detection System by Pavement Classification
Gavilán, Miguel; Balcones, David; Marcos, Oscar; Llorca, David F.; Sotelo, Miguel A.; Parra, Ignacio; Ocaña, Manuel; Aliseda, Pedro; Yarza, Pedro; Amírola, Alejandro
2011-01-01
This paper presents a road distress detection system involving the phases needed to properly deal with fully automatic road distress assessment. A vehicle equipped with line scan cameras, laser illumination and acquisition HW-SW is used to store the digital images that will be further processed to identify road cracks. Pre-processing is first carried out to both smooth the texture and enhance the linear features. Non-crack feature detection is then applied to mask areas of the images with joints, sealed cracks and white painting, which usually generate false positive cracking. A seed-based approach is proposed to deal with road crack detection, combining Multiple Directional Non-Minimum Suppression (MDNMS) with a symmetry check. Seeds are linked by computing the paths with the lowest cost that meet the symmetry restrictions. The whole detection process involves the use of several parameters. A correct setting becomes essential to get optimal results without manual intervention. A fully automatic approach by means of a linear SVM-based classifier ensemble able to distinguish between up to 10 different types of pavement that appear on Spanish roads is proposed. The optimal feature vector includes different texture-based features. The parameters are then tuned depending on the output provided by the classifier. Regarding non-crack feature detection, results show that the introduction of such a module reduces the impact of false positives due to non-crack features by up to a factor of 2. In addition, the observed performance of the crack detection system is significantly boosted by adapting the parameters to the type of pavement. PMID:22163717
Noise tolerant dendritic lattice associative memories
NASA Astrophysics Data System (ADS)
Ritter, Gerhard X.; Schmalz, Mark S.; Hayden, Eric; Tucker, Marc
2011-09-01
Linear classifiers based on computation over the real numbers R (e.g., with operations of addition and multiplication), denoted by (R, +, x), have been represented extensively in the literature of pattern recognition. However, a different approach to pattern classification involves the use of addition, maximum, and minimum operations over the reals in the algebra (R, +, maximum, minimum). These pattern classifiers, based on lattice algebra, have been shown to exhibit superior information storage capacity, fast training and short convergence times, high pattern classification accuracy, and low computational cost. Such attributes are not always found, for example, in classical neural nets based on the linear inner product. In a special type of lattice associative memory (LAM), called a dendritic LAM or DLAM, it is possible to achieve noise-tolerant pattern classification by varying the design of noise or error acceptance bounds. This paper presents theory and algorithmic approaches for the computation of noise-tolerant lattice associative memories (LAMs) under a variety of input constraints. Of particular interest is the classification of nonergodic data in noise regimes with time-varying statistics. DLAMs, which are a specialization of LAMs derived from concepts of biological neural networks, have successfully been applied to pattern classification from hyperspectral remote sensing data, as well as spatial object recognition from digital imagery. The authors' recent research in the development of DLAMs is overviewed, with experimental results that show utility for a wide variety of pattern classification applications. Performance results are presented in terms of measured computational cost, noise tolerance, classification accuracy, and throughput for a variety of input data and noise levels.
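To make the (R, +, maximum, minimum) algebra above concrete, here is a minimal sketch of a basic auto-associative lattice memory. It is the simple min-memory construction from the lattice associative memory literature, not the dendritic variant (DLAM) the abstract is about, and the function names and toy patterns are assumptions for illustration. The memory is built with subtraction and minimum; recall replaces the usual sum-of-products with a maximum-of-sums.

```python
def build_lattice_memory(patterns):
    """W[i][j] = min over stored patterns x of (x[i] - x[j])."""
    n = len(patterns[0])
    return [
        [min(x[i] - x[j] for x in patterns) for j in range(n)]
        for i in range(n)
    ]


def recall(memory, x):
    """Max-plus product: y[i] = max over j of (W[i][j] + x[j])."""
    return [max(w_ij + x_j for w_ij, x_j in zip(row, x)) for row in memory]


# Two hypothetical integer patterns to store.
stored = [[1, 4, 2], [3, 0, 5]]
W = build_lattice_memory(stored)
recalled = [recall(W, p) for p in stored]
```

For undistorted inputs this memory returns every stored pattern exactly: W[i][j] + x[j] can never exceed x[i] for a stored x, and the diagonal term W[i][i] = 0 attains it.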
Liang, Yujie; Ying, Rendong; Lu, Zhenqi; Liu, Peilin
2014-01-01
In the design phase of sensor arrays during array signal processing, the estimation performance and system cost are largely determined by array aperture size. In this article, we address the problem of joint direction-of-arrival (DOA) estimation with distributed sparse linear arrays (SLAs) and propose an off-grid synchronous approach based on distributed compressed sensing to obtain larger array aperture. We focus on the complex source distribution in the practical applications and classify the sources into common and innovation parts according to whether a signal of source can impinge on all the SLAs or a specific one. For each SLA, we construct a corresponding virtual uniform linear array (ULA) to create the relationship of random linear map between the signals respectively observed by these two arrays. The signal ensembles including the common/innovation sources for different SLAs are abstracted as a joint spatial sparsity model. And we use the minimization of concatenated atomic norm via semidefinite programming to solve the problem of joint DOA estimation. Joint calculation of the signals observed by all the SLAs exploits their redundancy caused by the common sources and decreases the requirement of array size. The numerical results illustrate the advantages of the proposed approach. PMID:25420150
NASA Astrophysics Data System (ADS)
Chandra, Malavika; Scheiman, James; Simeone, Diane; McKenna, Barbara; Purdy, Julianne; Mycek, Mary-Ann
2010-01-01
Pancreatic adenocarcinoma is one of the leading causes of cancer death, in part because of the inability of current diagnostic methods to reliably detect early-stage disease. We present the first assessment of the diagnostic accuracy of algorithms developed for pancreatic tissue classification using data from fiber optic probe-based bimodal optical spectroscopy, a real-time approach that would be compatible with minimally invasive diagnostic procedures for early cancer detection in the pancreas. A total of 96 fluorescence and 96 reflectance spectra are considered from 50 freshly excised tissue sites-including human pancreatic adenocarcinoma, chronic pancreatitis (inflammation), and normal tissues-on nine patients. Classification algorithms using linear discriminant analysis are developed to distinguish among tissues, and leave-one-out cross-validation is employed to assess the classifiers' performance. The spectral areas and ratios classifier (SpARC) algorithm employs a combination of reflectance and fluorescence data and has the best performance, with sensitivity, specificity, negative predictive value, and positive predictive value for correctly identifying adenocarcinoma being 85, 89, 92, and 80%, respectively.
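The four figures of merit quoted above follow directly from a 2x2 confusion matrix. The sketch below shows the standard definitions; the function name and the counts in the example are made up for illustration and are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary diagnostic metrics from confusion-matrix counts.

    tp/fn: diseased sites classified as diseased/healthy.
    tn/fp: healthy sites classified as healthy/diseased.
    """
    return {
        "sensitivity": tp / (tp + fn),  # fraction of diseased sites detected
        "specificity": tn / (tn + fp),  # fraction of healthy sites cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = diagnostic_metrics(tp=8, fp=2, tn=18, fn=2)
```

In a leave-one-out setting such as the one described, each site would be classified by a model trained on all the other sites, and the counts would be accumulated over those held-out predictions.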
Pattern Recognition Approaches for Breast Cancer DCE-MRI Classification: A Systematic Review.
Fusco, Roberta; Sansone, Mario; Filice, Salvatore; Carone, Guglielmo; Amato, Daniela Maria; Sansone, Carlo; Petrillo, Antonella
2016-01-01
We performed a systematic review of several pattern analysis approaches for classifying breast lesions using dynamic, morphological, and textural features in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Several machine learning approaches, namely artificial neural networks (ANN), support vector machines (SVM), linear discriminant analysis (LDA), tree-based classifiers (TC), and Bayesian classifiers (BC), and features used for classification are described. The findings of a systematic review of 26 studies are presented. The sensitivity and specificity are respectively 91 and 83 % for ANN, 85 and 82 % for SVM, 96 and 85 % for LDA, 92 and 87 % for TC, and 82 and 85 % for BC. The sensitivity and specificity are respectively 82 and 74 % for dynamic features, 93 and 60 % for morphological features, 88 and 81 % for textural features, 95 and 86 % for a combination of dynamic and morphological features, and 88 and 84 % for a combination of dynamic, morphological, and other features. LDA and TC have the best performance. A combination of dynamic and morphological features gives the best performance.
Shishkin, Sergei L.; Nuzhdin, Yuri O.; Svirin, Evgeny P.; Trofimov, Alexander G.; Fedorova, Anastasia A.; Kozyrskiy, Bogdan L.; Velichkovsky, Boris M.
2016-01-01
We usually look at an object when we are going to manipulate it. Thus, eye tracking can be used to communicate intended actions. An effective human-machine interface, however, should be able to differentiate intentional and spontaneous eye movements. We report an electroencephalogram (EEG) marker that differentiates gaze fixations used for control from spontaneous fixations involved in visual exploration. Eight healthy participants played a game with their eye movements only. Their gaze-synchronized EEG data (fixation-related potentials, FRPs) were collected during the game's control-on and control-off conditions. A slow negative wave with a maximum in the parietooccipital region was present in each participant's averaged FRPs in the control-on conditions and was absent or had much lower amplitude in the control-off condition. This wave was similar but not identical to stimulus-preceding negativity, a slow negative wave that can be observed during feedback expectation. Classification of intentional vs. spontaneous fixations was based on amplitude features from 13 EEG channels using 300 ms length segments free from electrooculogram contamination (200–500 ms relative to the fixation onset). For the first fixations in the fixation triplets required to make moves in the game, classified against control-off data, a committee of greedy classifiers provided 0.90 ± 0.07 specificity and 0.38 ± 0.14 sensitivity. Similar (slightly lower) results were obtained for the shrinkage Linear Discriminant Analysis (LDA) classifier. The second and third fixations in the triplets were classified at a lower rate. We expect that, with improved feature sets and classifiers, a hybrid dwell-based Eye-Brain-Computer Interface (EBCI) can be built using the FRP difference between the intended and spontaneous fixations.
If this direction of BCI development proves successful, such a multimodal interface may improve the fluency of interaction and can possibly become the basis for a new input device for paralyzed and healthy users, the EBCI “Wish Mouse.” PMID:27917105
Jamzad, Amoon; Setarehdan, Seyed Kamaledin
2014-04-01
The twinkling artifact is an undesired phenomenon within color Doppler sonograms that usually appears at the site of internal calcifications. Since the appearance of the twinkling artifact is correlated with the roughness of the calculi, noninvasive roughness estimation of the internal stones may be considered as a potential twinkling artifact application. This article proposes a novel quantitative approach for measurement and analysis of twinkling artifact data for roughness estimation. A phantom was developed with 7 quantified levels of roughness. The Doppler system was initially calibrated by the proposed procedure to facilitate the analysis. A total of 1050 twinkling artifact images were acquired from the phantom, and 32 novel numerical measures were introduced and computed for each image. The measures were then ranked on the basis of roughness quantification ability using different methods. The performance of the proposed twinkling artifact-based surface roughness quantification method was finally investigated for different combinations of features and classifiers. Eleven features were shown to be the most efficient numerical twinkling artifact measures in roughness characterization. The linear classifier outperformed other methods for twinkling artifact classification. The pixel count measures produced better results than the other categories. The sequential selection method showed higher accuracy than other individual rankings. The best roughness recognition average accuracy of 98.33% was obtained by the first 5 principal components and the linear classifier. The proposed twinkling artifact analysis method could recognize the phantom surface roughness with average accuracy of 98.33%. This method may also be applicable for noninvasive calculi characterization in treatment management.
Reducing the number of reconstructions needed for estimating channelized observer performance
NASA Astrophysics Data System (ADS)
Pineda, Angel R.; Miedema, Hope; Brenner, Melissa; Altaf, Sana
2018-03-01
A challenge for task-based optimization is the time required for each reconstructed image in applications where reconstructions are time consuming. Our goal is to reduce the number of reconstructions needed to estimate the area under the receiver operating characteristic curve (AUC) of the infinitely-trained optimal channelized linear observer. We explore the use of classifiers which either do not invert the channel covariance matrix or do feature selection. We also study the assumption that multiple low contrast signals in the same image of a non-linear reconstruction do not significantly change the estimate of the AUC. We compared the AUC of several classifiers (Hotelling, logistic regression, logistic regression using Firth bias reduction and the least absolute shrinkage and selection operator (LASSO)) with a small number of observations both for normal simulated data and images from a total variation reconstruction in magnetic resonance imaging (MRI). We used 10 Laguerre-Gauss channels and the Mann-Whitney estimator for AUC. For this data, our results show that at small sample sizes feature selection using the LASSO technique can decrease bias of the AUC estimation with increased variance and that for large sample sizes the difference between these classifiers is small. We also compared the use of multiple signals in a single reconstructed image to reduce the number of reconstructions in a total variation reconstruction for accelerated imaging in MRI. We found that AUC estimation using multiple low contrast signals in the same image resulted in similar AUC estimates as doing a single reconstruction per signal leading to a 13x reduction in the number of reconstructions needed.
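The Mann-Whitney estimator of AUC mentioned above has a compact form: it is the fraction of (signal-present, signal-absent) score pairs that the observer ranks correctly, with ties counted as half. The sketch below is a plain quadratic-time version for illustration; the function name and the toy scores are assumptions, not values from the study.

```python
def mann_whitney_auc(signal_scores, noise_scores):
    """Mann-Whitney (Wilcoxon) estimate of the area under the ROC curve.

    signal_scores: observer test statistics for signal-present images.
    noise_scores: observer test statistics for signal-absent images.
    """
    wins = 0.0
    for s in signal_scores:
        for n in noise_scores:
            if s > n:
                wins += 1.0   # correctly ordered pair
            elif s == n:
                wins += 0.5   # tie counts as half
    return wins / (len(signal_scores) * len(noise_scores))


# Hypothetical observer outputs for three images of each class.
auc = mann_whitney_auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.4])
```

Because the estimate depends only on the rank ordering of the scores, any monotonic rescaling of a classifier's output leaves it unchanged, which makes it convenient for comparing observers such as the Hotelling and logistic regression classifiers on the same images.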
NASA Astrophysics Data System (ADS)
Phinyomark, A.; Hu, H.; Phukpattaranont, P.; Limsakul, C.
2012-01-01
The classification of upper-limb movements based on surface electromyography (EMG) signals is an important issue in the control of assistive devices and rehabilitation systems. Increasing the number of EMG channels and features in order to increase the number of control commands can yield a high dimensional feature vector. To cope with the accuracy and computation problems associated with high dimensionality, it is commonplace to apply a processing step that transforms the data to a space of significantly lower dimensions with only a limited loss of useful information. Linear discriminant analysis (LDA) has been successfully applied as an EMG feature projection method. Recently, a number of extended LDA-based algorithms have been proposed, which are more competitive than classical LDA in terms of both classification accuracy and computational costs/times. This paper presents the findings of a comparative study of classical LDA and five extended LDA methods. From a quantitative comparison based on seven multi-feature sets, three extended LDA-based algorithms, consisting of uncorrelated LDA, orthogonal LDA and orthogonal fuzzy neighborhood discriminant analysis, produce better class separability when compared with a baseline system (without feature projection), principal component analysis (PCA), and classical LDA. Based on 7-dimension time domain and time-scale feature vectors, these methods achieved respectively 95.2% and 93.2% classification accuracy by using a linear discriminant classifier.
Linear programming model to develop geodiversity map using utility theory
NASA Astrophysics Data System (ADS)
Sepehr, Adel
2015-04-01
In this article, the classification and mapping of geodiversity based on a quantitative methodology was accomplished using linear programming, the central idea being that geosites and geomorphosites, as the main indicators of geodiversity, can be evaluated by utility theory. A linear programming method was applied for geodiversity mapping over Khorasan-razavi province, located in northeastern Iran. The main criteria for distinguishing geodiversity potential in the study area were rock type (lithology), fault position (tectonic process), karst area (dynamic process), Aeolian landform frequency and surface river forms. These parameters were investigated using thematic maps of geology, topography and geomorphology at scales of 1:100'000, 1:50'000 and 1:250'000 respectively, imagery data from SPOT and ETM+ (Landsat 7), and direct field operations. The geological thematic layer was simplified from the original map using a practical lithologic criterion based on a primary genetic rock classification representing metamorphic, igneous and sedimentary rocks. The geomorphology map was produced using a 30 m DEM extracted from ASTER data, the geology map and Google Earth images. The geology map shows tectonic status, and the geomorphology map indicates dynamic processes and landforms (karst, Aeolian and river). Then, according to the utility theory algorithms, we proposed a linear program to classify the degree of geodiversity in the study area based on geology/morphology parameters. The algorithm consisted of a linear function that maximizes geodiversity subject to constraints in the form of linear equations. The results of this research indicated three classes of geodiversity potential: low, medium and high. Geodiversity potential is satisfactory in the karstic areas and Aeolian landscapes. The use of utility theory also decreased the uncertainty of the evaluations.
Goodson, Summer G; White, Sarah; Stevans, Alicia M; Bhat, Sanjana; Kao, Chia-Yu; Jaworski, Scott; Marlowe, Tamara R; Kohlmeier, Martin; McMillan, Leonard; Zeisel, Steven H; O'Brien, Deborah A
2017-11-01
The ability to accurately monitor alterations in sperm motility is paramount to understanding multiple genetic and biochemical perturbations impacting normal fertilization. Computer-aided sperm analysis (CASA) of human sperm typically reports motile percentage and kinematic parameters at the population level, and uses kinematic gating methods to identify subpopulations such as progressive or hyperactivated sperm. The goal of this study was to develop an automated method that classifies all patterns of human sperm motility during in vitro capacitation following the removal of seminal plasma. We visually classified CASA tracks of 2817 sperm from 18 individuals and used a support vector machine-based decision tree to compute four hyperplanes that separate five classes based on their kinematic parameters. We then developed a web-based program, CASAnova, which applies these equations sequentially to assign a single classification to each motile sperm. Vigorous sperm are classified as progressive, intermediate, or hyperactivated, and nonvigorous sperm as slow or weakly motile. This program correctly classifies sperm motility into one of five classes with an overall accuracy of 89.9%. Application of CASAnova to capacitating sperm populations showed a shift from predominantly linear patterns of motility at initial time points to more vigorous patterns, including hyperactivated motility, as capacitation proceeds. Both intermediate and hyperactivated motility patterns were largely eliminated when sperm were incubated in noncapacitating medium, demonstrating the sensitivity of this method. The five CASAnova classifications are distinctive and reflect kinetic parameters of washed human sperm, providing an accurate, quantitative, and high-throughput method for monitoring alterations in motility. © The Authors 2017. Published by Oxford University Press on behalf of Society for the Study of Reproduction. All rights reserved. 
For permissions, please e-mail: journals.permissions@oup.com.
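Applying a sequence of hyperplanes to reach a single class label might look like the sketch below. The three kinematic features, the hyperplane coefficients and the order of the tests are placeholders chosen for illustration; they are not the published CASAnova equations.

```python
def on_positive_side(weights, bias, x):
    """True when x lies on the positive side of the hyperplane w.x + b = 0."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias > 0


# Each rule: (weights, bias, label assigned when x falls on the negative side).
# Features are assumed to be (average path velocity, straight-line velocity,
# lateral head amplitude); all numbers are hypothetical.
RULES = [
    ((1.0, 0.0, 0.0), -20.0, "weakly motile"),
    ((1.0, 0.0, 0.0), -60.0, "slow"),
    ((0.0, 0.0, 1.0), -5.0, "progressive"),
    ((0.0, 0.0, 1.0), -9.0, "intermediate"),
]


def classify(x):
    """Apply the four hyperplanes in order; the first failed test decides the class."""
    for weights, bias, label in RULES:
        if not on_positive_side(weights, bias, x):
            return label
    return "hyperactivated"
```

Four binary tests applied in sequence are enough to separate five classes, which matches the structure described in the abstract.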
Melillo, Paolo; Jovic, Alan; De Luca, Nicola; Pecchia, Leandro
2015-08-01
Accidental falls are a major problem of later life. Different technologies to predict falls have been investigated, but with limited success, mainly because of low specificity due to a high false positive rate. This Letter presents an automatic classifier based on heart rate variability (HRV) analysis with the goal to identify fallers automatically. HRV was used in this study as it is considered a good estimator of autonomic nervous system (ANS) states, which are responsible, among other things, for human balance control. Nominal 24 h electrocardiogram recordings from 168 cardiac patients (age 72 ± 8 years, 60 female), of which 47 were fallers, were investigated. Linear and nonlinear HRV properties were analysed in 30 min excerpts. Different data mining approaches were adopted and their performances were compared with a subject-based receiver operating characteristic analysis. The best performance was achieved by a hybrid algorithm, RUSBoost, integrated with feature selection method based on principal component analysis, which achieved satisfactory specificity and accuracy (80 and 72%, respectively), but low sensitivity (51%). These results suggested that ANS states causing falls could be reliably detected, but also that not all the falls were due to ANS states.
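The relationship between the three reported rates can be checked with a short calculation. The group sizes (47 fallers, 121 non-fallers) follow from the abstract; the split into true and false counts is an assumption chosen to be consistent with the rounded percentages, not data from the study.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts: 24 of 47 fallers detected, 97 of 121 non-fallers correctly cleared.
sens, spec, acc = rates(tp=24, fn=23, tn=97, fp=24)
```

Because non-fallers outnumber fallers by more than two to one, accuracy sits much closer to specificity than to sensitivity.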
A multiple maximum scatter difference discriminant criterion for facial feature extraction.
Song, Fengxi; Zhang, David; Mei, Dayong; Guo, Zhongwei
2007-12-01
Maximum scatter difference (MSD) discriminant criterion was a recently presented binary discriminant criterion for pattern classification that utilizes the generalized scatter difference rather than the generalized Rayleigh quotient as a class separability measure, thereby avoiding the singularity problem when addressing small-sample-size problems. MSD classifiers based on this criterion have been quite effective on face-recognition tasks, but as they are binary classifiers, they are not as efficient on large-scale classification tasks. To address the problem, this paper generalizes the classification-oriented binary criterion to its multiple counterpart--multiple MSD (MMSD) discriminant criterion for facial feature extraction. The MMSD feature-extraction method, which is based on this novel discriminant criterion, is a new subspace-based feature-extraction method. Unlike most other subspace-based feature-extraction methods, the MMSD computes its discriminant vectors from both the range of the between-class scatter matrix and the null space of the within-class scatter matrix. The MMSD is theoretically elegant and easy to calculate. Extensive experimental studies conducted on the benchmark database, FERET, show that the MMSD out-performs state-of-the-art facial feature-extraction methods such as null space method, direct linear discriminant analysis (LDA), eigenface, Fisherface, and complete LDA.
NASA Astrophysics Data System (ADS)
Costache, G. N.; Gavat, I.
2004-09-01
With the rapid growth in the amount of available digital data (text, audio samples, digital photos and digital movies, all joined in the multimedia domain), the need for classification, recognition and retrieval of this kind of data has become very important. This paper presents a system structure for handling multimedia data from a recognition perspective. The main processing steps applied to the multimedia objects of interest are: first, parameterization by analysis, to obtain a feature-based description forming the parameter vector; second, classification, generally with a hierarchical structure, to make the necessary decisions. For audio signals, both speech and music, the derived perceptual features are the mel-cepstral (MFCC) and the perceptual linear predictive (PLP) coefficients. For images, the derived features are the geometric parameters of the speaker's mouth. The hierarchical classifier generally consists of a clustering stage, based on Kohonen Self-Organizing Maps (SOM), and a final stage, based on a powerful classification algorithm called Support Vector Machines (SVM). The system, in specific variants, is applied with good results to two tasks: the first is bimodal speech recognition, which uses features obtained from the speech signal fused with features obtained from the speaker's image, and the second is music retrieval from a large music database.
Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali
2015-01-01
In recent years, many research works have been published that use speech-related features for speech emotion recognition; however, recent studies show that there is a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features and relative wavelet packet energy and entropy features were extracted from the emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discerning ability of the features and to select the discriminating features, respectively. Three different emotional speech databases were utilized to gauge the proposed method. Extreme learning machine (ELM) was employed to classify the different types of emotions. Different experiments were conducted and the results show that the proposed method significantly improves the speech emotion recognition performance compared to previous works published in the literature. PMID:25799141
Comparison of four approaches to a rock facies classification problem
Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.
2007-01-01
In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.
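Of the approaches compared, k-nearest neighbor is the simplest to illustrate. The sketch below is a generic version and not the study's implementation: the two-feature samples and the facies labels "A" and "B" are invented, whereas the real data have six or seven predictors per sample.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train is a list of (feature_tuple, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training samples: (log reading 1, log reading 2) -> facies label.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.8, 1.1), "A"),
    ((3.0, 3.2), "B"), ((3.1, 2.9), "B"), ((2.9, 3.0), "B"),
]
```

For example, `knn_predict(train, (1.1, 1.0))` votes among the three closest "A" samples. Overlapping classes, as reported in the abstract, are exactly the case where such neighborhood votes become unreliable.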
Janousova, Eva; Schwarz, Daniel; Kasparek, Tomas
2015-06-30
We investigated a combination of three classification algorithms, namely the modified maximum uncertainty linear discriminant analysis (mMLDA), the centroid method, and the average linkage, with three types of features extracted from three-dimensional T1-weighted magnetic resonance (MR) brain images, specifically MR intensities, grey matter densities, and local deformations for distinguishing 49 first episode schizophrenia male patients from 49 healthy male subjects. The feature sets were reduced using intersubject principal component analysis before classification. By combining the classifiers, we were able to obtain slightly improved results when compared with single classifiers. The best classification performance (81.6% accuracy, 75.5% sensitivity, and 87.8% specificity) was significantly better than classification by chance. We also showed that classifiers based on features calculated using more computation-intensive image preprocessing perform better; mMLDA with classification boundary calculated as weighted mean discriminative scores of the groups had improved sensitivity but similar accuracy compared to the original MLDA; reducing a number of eigenvectors during data reduction did not always lead to higher classification accuracy, since noise as well as the signal important for classification were removed. Our findings provide important information for schizophrenia research and may improve accuracy of computer-aided diagnostics of neuropsychiatric diseases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
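One simple way to combine three classifiers is a per-subject majority vote, sketched below. The abstract does not state which combination rule was used, so majority voting is an assumption here, and the prediction lists are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine label lists from several classifiers by per-subject majority vote.

    predictions is a list of equally long label lists, one per classifier.
    """
    combined = []
    for labels in zip(*predictions):
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers for four subjects.
mmlda_out = ["patient", "control", "patient", "control"]
centroid_out = ["patient", "patient", "patient", "control"]
linkage_out = ["control", "control", "patient", "patient"]

combined = majority_vote([mmlda_out, centroid_out, linkage_out])
```

With three voters and two classes there are no ties, so every subject receives exactly one label.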
ERIC Educational Resources Information Center
Al-Balushi, Sulaiman M.; Coll, Richard Kevin
2013-01-01
The current study compared different learners' static and dynamic mental images of unseen scientific species and processes in relation to their spatial ability. Learners were classified into verbal, visual and schematic. Dynamic images were classified into: appearing/disappearing, linear-movement, and rotation. Two types of scientific entities and…
NASA Astrophysics Data System (ADS)
Vítková, Gabriela; Prokeš, Lubomír; Novotný, Karel; Pořízka, Pavel; Novotný, Jan; Všianský, Dalibor; Čelko, Ladislav; Kaiser, Jozef
2014-11-01
From a historical perspective, during archaeological excavation or restoration of buildings and other structures built from bricks, it is important to determine, preferably in situ and in real time, the locality of origin of the bricks. Fast classification of bricks on the basis of Laser-Induced Breakdown Spectroscopy (LIBS) spectra is possible using multivariate statistical methods. A combination of principal component analysis (PCA) and linear discriminant analysis (LDA) was applied in this case. LIBS was used to classify a total of 29 brick samples from 7 different localities. A comparative study using two different LIBS setups, stand-off and table-top, shows that stand-off LIBS has great potential for archaeological in-field measurements.
[Research on the methods for multi-class kernel CSP-based feature extraction].
Wang, Jinjia; Zhang, Lingzhi; Hu, Bei
2012-04-01
To relax the presumption of strictly linear patterns in the common spatial patterns (CSP), we studied the kernel CSP (KCSP). A new multi-class KCSP (MKCSP) approach was proposed in this paper, which combines the kernel approach with the multi-class CSP technique. In this approach, we used kernel spatial patterns for each class against all others, and extracted signal components specific to one condition from EEG data sets of multiple conditions. Then we performed classification using the logistic linear classifier. The brain computer interface (BCI) competition III_3a data were used in the experiment. The experiment showed that this approach could decompose the raw EEG signals into spatial patterns extracted from multi-class single-trial EEG, and could obtain good classification results.
Visual, Algebraic and Mixed Strategies in Visually Presented Linear Programming Problems.
ERIC Educational Resources Information Center
Shama, Gilli; Dreyfus, Tommy
1994-01-01
Identified and classified solution strategies of (n=49) 10th-grade students who were presented with linear programming problems in a predominantly visual setting in the form of a computerized game. Visual strategies were developed more frequently than either algebraic or mixed strategies. Appendix includes questionnaires. (Contains 11 references.)…
Linear relations in microbial reaction systems: a general overview of their origin, form, and use.
Noorman, H J; Heijnen, J J; Ch A M Luyben, K
1991-09-01
In microbial reaction systems, there are a number of linear relations among net conversion rates. These can be very useful in the analysis of experimental data. This article provides a general approach for the formation and application of the linear relations. Two types of system descriptions, one considering the biomass as a black box and the other based on metabolic pathways, are encountered. These are defined in a linear vector and matrix algebra framework. A correct a priori description can be obtained by three useful tests: the independency, consistency, and observability tests. The independency are different. The black box approach provides only conservation relations. They are derived from element, electrical charge, energy, and Gibbs energy balances. The metabolic approach provides, in addition to the conservation relations, metabolic and reaction relations. These result from component, energy, and Gibbs energy balances. Thus it is more attractive to use the metabolic description than the black box approach. A number of different types of linear relations given in the literature are reviewed. They are classified according to the different categories that result from the black box or the metabolic system description. Validation of hypotheses related to metabolic pathways can be supported by experimental validation of the linear metabolic relations. However, definite proof from biochemical evidence remains indispensable.
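An element conservation relation of the black-box kind can be written as a matrix-vector product that must vanish. The sketch below uses complete oxidation of glucose as a stand-in system; this example and the rate values are assumptions for illustration and do not come from the article.

```python
def balance_residuals(element_matrix, rates):
    """Return one residual per element: the sum over compounds of content * net rate.

    A consistent set of net conversion rates gives a residual of zero for every element.
    """
    return [sum(e * r for e, r in zip(row, rates)) for row in element_matrix]


# Columns: glucose (C6H12O6), O2, CO2, H2O. Rows: C, H, O atoms per molecule.
ELEMENTS = [
    [6, 0, 1, 0],   # carbon
    [12, 0, 0, 2],  # hydrogen
    [6, 2, 2, 1],   # oxygen
]

# Net conversion rates (negative = consumed), in arbitrary molar units.
consistent = [-1, -6, 6, 6]
inconsistent = [-1, -6, 5, 6]  # one CO2 "missing", e.g. a measurement error
```

A nonzero residual, as in the second rate vector, flags which element balance is violated, which is how such linear relations help in checking experimental data.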
DOE Office of Scientific and Technical Information (OSTI.GOV)
Patel, Sanjay V.; Jenkins, Mark W.; Hughes, Robert C.
1999-07-19
We demonstrate a "universal solvent sensor" constructed from a small array of carbon/polymer composite chemiresistors that respond to solvents spanning a wide range of Hildebrand solubility parameters. Conductive carbon particles provide electrical continuity in these composite films. When the polymer matrix absorbs solvent vapors, the composite film swells, the average separation between carbon particles increases, and an increase in film resistance results, as some of the conduction pathways are broken. The adverse effects of contact resistance at high solvent concentrations are reported. Solvent vapors including isooctane, ethanol, diisopropylmethylphosphonate (DIMP), and water are correctly identified ("classified") using three chemiresistors, their composite coatings chosen to span the full range of solubility parameters. With the same three sensors, binary mixtures of solvent vapor and water vapor are correctly classified; following classification, two sensors suffice to determine the concentrations of both vapor components. Polyethylene vinylacetate and polyvinyl alcohol (PVA) are two such polymers that are used to classify binary mixtures of DIMP with water vapor; the PVA/carbon-particle-composite films are sensitive to less than 0.25% relative humidity. The Sandia-developed VERI (Visual-Empirical Region of Influence) technique is used as a method of pattern recognition to classify the solvents and mixtures and to distinguish them from water vapor. In many cases, the response of a given composite sensing film to a binary mixture deviates significantly from the sum of the responses to the isolated vapor components at the same concentrations. While these nonlinearities pose significant difficulty for (primarily) linear methods such as principal components analysis, VERI handles both linear and nonlinear data with equal ease. In the present study the maximum speciation accuracy is achieved by an array containing three or four sensor elements, with the addition of more sensors resulting in a measurable accuracy decrease.
Detection of epileptic seizure in EEG signals using linear least squares preprocessing.
Roshan Zamir, Z
2016-09-01
An epileptic seizure is a transient event of abnormal excessive neuronal discharge in the brain. This unwanted event can be obstructed by detection of electrical changes in the brain that happen before the seizure takes place. The automatic detection of seizures is necessary since the visual screening of EEG recordings is a time consuming task and requires experts to improve the diagnosis. Much of the prior research in detection of seizures has been developed based on artificial neural network, genetic programming, and wavelet transforms. Although the highest achieved accuracy for classification is 100%, there are drawbacks, such as the existence of unbalanced datasets and the lack of investigations in performances consistency. To address these, four linear least squares-based preprocessing models are proposed to extract key features of an EEG signal in order to detect seizures. The first two models are newly developed. The original signal (EEG) is approximated by a sinusoidal curve. Its amplitude is formed by a polynomial function and compared with the predeveloped spline function. Different statistical measures, namely classification accuracy, true positive and negative rates, false positive and negative rates and precision, are utilised to assess the performance of the proposed models. These metrics are derived from confusion matrices obtained from classifiers. Different classifiers are used over the original dataset and the set of extracted features. The proposed models significantly reduce the dimension of the classification problem and the computational time while the classification accuracy is improved in most cases. The first and third models are promising feature extraction methods with the classification accuracy of 100%. Logistic, LazyIB1, LazyIB5, and J48 are the best classifiers. Their true positive and negative rates are 1 while false positive and negative rates are 0 and the corresponding precision values are 1. 
Numerical results suggest that these models are robust and efficient for detecting epileptic seizure. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
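A linear least squares fit of a sinusoid can be sketched with nothing more than the 2x2 normal equations. This is a simplification of what the abstract describes: the frequency is assumed known and the amplitude is held constant rather than modelled by a polynomial or spline, and the synthetic signal is invented.

```python
import math


def fit_sinusoid(t, y, omega):
    """Least-squares fit of y ~ a*sin(omega*t) + b*cos(omega*t) for a known omega.

    The model is linear in (a, b), so the normal equations are solved directly.
    """
    s = [math.sin(omega * ti) for ti in t]
    c = [math.cos(omega * ti) for ti in t]
    ss = sum(v * v for v in s)
    cc = sum(v * v for v in c)
    sc = sum(u * v for u, v in zip(s, c))
    sy = sum(u * v for u, v in zip(s, y))
    cy = sum(u * v for u, v in zip(c, y))
    det = ss * cc - sc * sc
    a = (sy * cc - cy * sc) / det
    b = (cy * ss - sy * sc) / det
    return a, b


# Synthetic 5 Hz segment sampled at 100 Hz for 2 s (noise-free, for illustration).
omega = 2 * math.pi * 5
t = [i * 0.01 for i in range(200)]
y = [2.0 * math.sin(omega * ti) + 0.5 * math.cos(omega * ti) for ti in t]
amp_sin, amp_cos = fit_sinusoid(t, y, omega)
```

The two fitted coefficients replace 200 raw samples, which illustrates how this kind of preprocessing reduces the dimension of the classification problem.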
NASA Technical Reports Server (NTRS)
Moreno, J. C.; Goldsmith, S.; Griem, H. R.; Cohen, Leonard; Knauer, J.
1990-01-01
Nonresonance spectral lines of Mg XII and Mg XI emitted by magnesium laser-produced plasmas have been observed in the extreme-vacuum-ultraviolet region and their transitions classified. As many as eight beams of the Omega laser system of the Laboratory for Laser Energetics at the University of Rochester were linearly focused onto magnesium-coated flat targets to produce linear plasma radiation sources from 3 to 6 mm long. The spectra were photographed end-on with a grazing-incidence spectrograph. The identified Mg XII lines are classified as 2s-3p, 2p-3d, 2s-4p, 2p-4d, and 3d-4f transitions. The identified Mg XI lines are classified as 1s2s-1s3p, 1s2p-1s3d, 1s2p-1s4d, 1s3p-1s4d, and 1s3d-1s4f.
Classification of the Correct Quranic Letters Pronunciation of Male and Female Reciters
NASA Astrophysics Data System (ADS)
Khairuddin, Safiah; Ahmad, Salmiah; Embong, Abdul Halim; Nur Wahidah Nik Hashim, Nik; Altamas, Tareq M. K.; Nuratikah Syd Badaruddin, Syarifah; Shahbudin Hassan, Surul
2017-11-01
Recitation of the Holy Quran with the correct Tajweed is essential for every Muslim. Islam has encouraged Quranic education from an early age, as reciting the Quran correctly conveys the correct meaning of the words of Allah. It is important to recite the Quranic verses according to their characteristics (sifaat) and from their points of articulation (makhraj). This paper presents the identification and classification analysis of Quranic letter pronunciation for both male and female reciters, to obtain the unique representation of each letter by male as compared to female expert reciters. Linear Discriminant Analysis (LDA) was used as the classifier, with formants and Power Spectral Density (PSD) as the acoustic features. The results show that the linear classifier with PSD band 1 and band 2 power spectral combinations gives a high percentage of classification accuracy for most of the Quranic letters. It is also shown that pronunciation by male reciters gives better results in the classification of the Quranic letters.
Harmonic wavelet packet transform for on-line system health diagnosis
NASA Astrophysics Data System (ADS)
Yan, Ruqiang; Gao, Robert X.
2004-07-01
This paper presents a new approach to on-line health diagnosis of mechanical systems, based on the wavelet packet transform. Specifically, signals acquired from vibration sensors are decomposed into sub-bands by means of the discrete harmonic wavelet packet transform (DHWPT). Based on the Fisher linear discriminant criterion, features in the selected sub-bands are then used as inputs to three classifiers (Nearest Neighbor rule-based and two Neural Network-based), for system health condition assessment. Experimental results have confirmed that, compared to the conventional approach where statistical parameters from raw signals are used, the presented approach enabled a higher signal-to-noise ratio for more effective and intelligent use of the sensory information, thus leading to more accurate system health diagnosis.
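Ranking sub-bands with a Fisher-type criterion can be sketched for a single feature and two classes. The formula below is one common form of the Fisher ratio, assumed here rather than taken from the paper, and the sub-band energies are made up.

```python
from statistics import mean, pvariance


def fisher_score(class_a, class_b):
    """Fisher ratio for one feature: squared mean gap over the summed class variances."""
    gap = mean(class_a) - mean(class_b)
    return gap * gap / (pvariance(class_a) + pvariance(class_b))


# Hypothetical sub-band energies for healthy and faulty machine states.
bands = {
    "band_1": ([1.0, 1.1, 0.9, 1.0], [2.0, 2.1, 1.9, 2.0]),
    "band_2": ([1.0, 2.0, 0.5, 1.5], [1.2, 1.8, 0.7, 1.4]),
}

scores = {name: fisher_score(healthy, faulty) for name, (healthy, faulty) in bands.items()}
best_band = max(scores, key=scores.get)
```

Sub-bands with a large score separate the two conditions well and would be the ones passed on to the classifiers.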
Hierarchical Rhetorical Sentence Categorization for Scientific Papers
NASA Astrophysics Data System (ADS)
Rachman, G. H.; Khodra, M. L.; Widyantoro, D. H.
2018-03-01
Important information in scientific papers can be composed of rhetorical sentences that are structured into certain categories. To obtain this information, text categorization should be conducted. Previous work on this task has employed word frequency, semantic word similarity, hierarchical classification, and other techniques. This paper therefore presents rhetorical sentence categorization for scientific papers that employs TF-IDF and Word2Vec to capture word frequency and semantic word similarity, together with hierarchical classification. Every experiment is tested with two classifiers, namely Naïve Bayes and linear SVM. The paper shows that a hierarchical classifier is better than a flat classifier with either TF-IDF or Word2Vec, although the improvement is less than 2 percentage points, from 27.82% with the flat classifier to 29.61% with the hierarchical classifier. It also shows that a different learning model can be built for each child category by the hierarchical classifier.
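The TF-IDF weighting mentioned above can be sketched in a few lines. This uses the plain, unsmoothed textbook definition, which is an assumption: the paper does not say which variant was used, and library implementations typically add smoothing and normalization. The example sentences are invented.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    documents is a list of token lists; weight = term frequency * log(N / document frequency).
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weighted = []
    for doc in documents:
        counts = Counter(doc)
        weighted.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weighted


sentences = [
    "we propose a new method".split(),
    "we evaluate the method".split(),
    "results show the improvement".split(),
]
weights = tf_idf(sentences)
```

Terms that occur in fewer sentences, such as "propose", receive a higher weight than terms shared across sentences, such as "we".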
Feature-space-based FMRI analysis using the optimal linear transformation.
Sun, Fengrong; Morris, Drew; Lee, Wayne; Taylor, Margot J; Mills, Travis; Babyn, Paul S
2010-09-01
The optimal linear transformation (OLT), an image analysis technique of feature space, was first presented in the field of MRI. This paper proposes a method of extending OLT from MRI to functional MRI (fMRI) to improve the activation-detection performance over conventional approaches of fMRI analysis. In this method, first, ideal hemodynamic response time series for different stimuli were generated by convolving the theoretical hemodynamic response model with the stimulus timing. Second, constructing hypothetical signature vectors for different activity patterns of interest by virtue of the ideal hemodynamic responses, OLT was used to extract features of fMRI data. The resultant feature space had particular geometric clustering properties. It was then classified into different groups, each pertaining to an activity pattern of interest; the applied signature vector for each group was obtained by averaging. Third, using the applied signature vectors, OLT was applied again to generate fMRI composite images with high SNRs for the desired activity patterns. Simulations and a blocked fMRI experiment were employed for the method to be verified and compared with the general linear model (GLM)-based analysis. The simulation studies and the experimental results indicated the superiority of the proposed method over the GLM-based analysis in detecting brain activities.
Ebrahimi, Farideh; Setarehdan, Seyed-Kamaledin; Ayala-Moyeda, Jose; Nazeran, Homer
2013-10-01
The conventional method for sleep staging is to analyze polysomnograms (PSGs) recorded in a sleep lab. The electroencephalogram (EEG) is one of the most important signals in PSGs but recording and analysis of this signal presents a number of technical challenges, especially at home. Instead, electrocardiograms (ECGs) are much easier to record and may offer an attractive alternative for home sleep monitoring. The heart rate variability (HRV) signal proves suitable for automatic sleep staging. Thirty PSGs from the Sleep Heart Health Study (SHHS) database were used. Three feature sets were extracted from 5- and 0.5-min HRV segments: time-domain features, nonlinear-dynamics features and time-frequency features. The latter was achieved by using empirical mode decomposition (EMD) and discrete wavelet transform (DWT) methods. Normalized energies in important frequency bands of HRV signals were computed using time-frequency methods. ANOVA and t-test were used for statistical evaluations. Automatic sleep staging was based on HRV signal features. The ANOVA followed by a post hoc Bonferroni was used for individual feature assessment. Most features were beneficial for sleep staging. A t-test was used to compare the means of extracted features in 5- and 0.5-min HRV segments. The results showed that the extracted features means were statistically similar for a small number of features. A separability measure showed that time-frequency features, especially EMD features, had larger separation than others. There was not a sizable difference in separability of linear features between 5- and 0.5-min HRV segments but separability of nonlinear features, especially EMD features, decreased in 0.5-min HRV segments. HRV signal features were classified by linear discriminant (LD) and quadratic discriminant (QD) methods. Classification results based on features from 5-min segments surpassed those obtained from 0.5-min segments. 
The best result was obtained from features using 5-min HRV segments classified by the LD classifier. A combination of linear/nonlinear features from HRV signals is effective in automatic sleep staging. Moreover, time-frequency features are more informative than others. In addition, a separability measure and classification results showed that HRV signal features, especially nonlinear features, extracted from 5-min segments are more discriminative than those from 0.5-min segments in automatic sleep staging. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
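Time-domain HRV features are usually simple statistics of the RR-interval series. The abstract does not list which ones were used, so the three below (mean RR, SDNN and RMSSD) are common choices assumed for illustration, computed on an invented series in milliseconds.

```python
import math
from statistics import mean, pstdev


def hrv_time_features(rr_ms):
    """Return mean RR, SDNN and RMSSD for a list of RR intervals in milliseconds.

    SDNN is computed here as the population standard deviation; some tools
    use the sample standard deviation instead.
    """
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return {
        "mean_rr": mean(rr_ms),
        "sdnn": pstdev(rr_ms),
        "rmssd": math.sqrt(mean([d * d for d in diffs])),
    }


features = hrv_time_features([800, 810, 790, 805, 795])
```

Shorter segments contain fewer intervals, so statistics like these become noisier, which is one plausible reason for the weaker 0.5-min results reported above.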
Applying machine-learning techniques to Twitter data for automatic hazard-event classification.
NASA Astrophysics Data System (ADS)
Filgueira, R.; Bee, E. J.; Diaz-Doce, D.; Poole, J., Sr.; Singh, A.
2017-12-01
The constant flow of information offered by tweets provides valuable information about all sorts of events at a high temporal and spatial resolution. Over the past year we have been analyzing geological hazards/phenomena in real time, such as earthquakes, volcanic eruptions, landslides, floods or the aurora, as part of the GeoSocial project, by geo-locating tweets filtered by keywords in a web-map. However, not all the filtered tweets are related to hazard/phenomenon events. This work explores two classification techniques for automatic hazard-event categorization based on tweets about the "Aurora". First, tweets were filtered using aurora-related keywords, removing stop words and selecting the ones written in English. To classify the remaining tweets into the "aurora-event" or "no-aurora-event" categories, we compared two state-of-the-art techniques: Support Vector Machine (SVM) and Deep Convolutional Neural Network (CNN) algorithms. Both approaches belong to the family of supervised learning algorithms, which make predictions based on a labelled training dataset. Therefore, we created a training dataset by tagging 1200 tweets with one of the two categories. The general form of SVM is used to separate two classes by a function (kernel). We compared the performance of four different kernels (Linear Regression, Logistic Regression, Multinomial Naïve Bayesian and Stochastic Gradient Descent) provided by the Scikit-Learn library using our training dataset to build the SVM classifier. The results showed that Logistic Regression (LR) achieves the best accuracy (87%). So, we selected the SVM-LR classifier to categorise a large collection of tweets using the "dispel4py" framework. Later, we developed a CNN classifier, where the first layer embeds words into low-dimensional vectors. The next layer performs convolutions over the embedded word vectors. Results from the convolutional layer are max-pooled into a long feature vector, which is classified using a softmax layer.
The CNN's accuracy is lower (83%) than that of the SVM-LR, since the algorithm needs a bigger training dataset to increase its accuracy. We used the TensorFlow framework to apply the CNN classifier to the same collection of tweets. In future we will modify both classifiers to work with other geo-hazards, use larger training datasets and apply them in real time.
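The filtering step that precedes classification (keyword match, then stop-word removal) can be sketched as below. The keyword and stop-word sets are hypothetical, and the language-detection step mentioned in the abstract is omitted.

```python
import re

# Hypothetical keyword and stop-word lists; the project's actual lists are not given.
KEYWORDS = {"aurora", "northern lights"}
STOP_WORDS = {"the", "a", "is", "in", "of", "and", "to", "i"}


def matches_keywords(text):
    """True when the tweet mentions at least one aurora-related keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in KEYWORDS)


def tokenize(text):
    """Lower-case the tweet, split it into words and drop stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in STOP_WORDS]


tweets = ["I saw the aurora in Iceland", "Traffic is bad tonight"]
kept = [tokenize(tweet) for tweet in tweets if matches_keywords(tweet)]
```

The resulting token lists are what a supervised classifier would then turn into features and label as "aurora-event" or "no-aurora-event".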
NASA Astrophysics Data System (ADS)
Schmalz, M.; Ritter, G.; Key, R.
Accurate and computationally efficient spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications using linear mixing models, signature classification accuracy depends on accurate spectral endmember discrimination [1]. If the endmembers cannot be classified correctly, then the signatures cannot be classified correctly, and object recognition from hyperspectral data will be inaccurate. In practice, the number of endmembers accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present a comparison of emerging technologies for nonimaging spectral signature classification based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3,4] and a neural network technology called Morphological Neural Networks (MNNs) [5]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. The open architecture and programmability of TNE's agreement map processing allows a TNE programmer or user to determine classification accuracy, as well as characterize in detail the signatures for which TNE did not obtain classification matches, and why such mis-matches occurred.
In this study, we will compare TNE and MNN based endmember classification, using performance metrics such as probability of correct classification (Pd) and rate of false detections (Rfa). As proof of principle, we analyze classification of multiple closely spaced signatures from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to Bayesian techniques based on classical neural networks. [1] Winter, M.E. "Fast autonomous spectral end-member determination in hyperspectral data," in Proceedings of the 13th International Conference On Applied Geologic Remote Sensing, Vancouver, B.C., Canada, pp. 337-44 (1999). [2] N. Keshava, "A survey of spectral unmixing algorithms," Lincoln Laboratory Journal 14:55-78 (2003). [3] Key, G., M.S. SCHMALZ, F.M. Caimi, and G.X. Ritter. "Performance analysis of tabular nearest neighbor encoding algorithm for joint compression and ATR", in Proceedings SPIE 3814:115-126 (1999). [4] Schmalz, M.S. and G. Key. "Algorithms for hyperspectral signature classification in unresolved object detection using tabular nearest neighbor encoding" in Proceedings of the 2007 AMOS Conference, Maui HI (2007). [5] Ritter, G.X., G. Urcid, and M.S. Schmalz. "Autonomous single-pass endmember approximation using lattice auto-associative memories", Neurocomputing (Elsevier), accepted (June 2008).
MIDAS: Regionally linear multivariate discriminative statistical mapping.
Varol, Erdem; Sotiras, Aristeidis; Davatzikos, Christos
2018-07-01
Statistical parametric maps formed via voxel-wise mass-univariate tests, such as the general linear model, are commonly used to test hypotheses about regionally specific effects in neuroimaging cross-sectional studies where each subject is represented by a single image. Despite being informative, these techniques remain limited as they ignore multivariate relationships in the data. Most importantly, the commonly employed local Gaussian smoothing, which is important for accounting for registration errors and making the data follow Gaussian distributions, is usually chosen in an ad hoc fashion. Thus, it is often suboptimal for the task of detecting group differences and correlations with non-imaging variables. Information mapping techniques, such as searchlight, which use pattern classifiers to exploit multivariate information and obtain more powerful statistical maps, have become increasingly popular in recent years. However, existing methods may lead to important interpretation errors in practice (i.e., misidentifying a cluster as informative, or failing to detect truly informative voxels), while often being computationally expensive. To address these issues, we introduce a novel efficient multivariate statistical framework for cross-sectional studies, termed MIDAS, seeking highly sensitive and specific voxel-wise brain maps, while leveraging the power of regional discriminant analysis. In MIDAS, locally linear discriminative learning is applied to estimate the pattern that best discriminates between two groups, or predicts a variable of interest. This pattern is equivalent to local filtering by an optimal kernel whose coefficients are the weights of the linear discriminant. By composing information from all neighborhoods that contain a given voxel, MIDAS produces a statistic that collectively reflects the contribution of the voxel to the regional classifiers as well as the discriminative power of the classifiers. 
Critically, MIDAS efficiently assesses the statistical significance of the derived statistic by analytically approximating its null distribution without the need for computationally expensive permutation tests. The proposed framework was extensively validated using simulated atrophy in structural magnetic resonance imaging (MRI) and further tested using data from a task-based functional MRI study as well as a structural MRI study of cognitive performance. The performance of the proposed framework was evaluated against standard voxel-wise general linear models and other information mapping methods. The experimental results showed that MIDAS achieves relatively higher sensitivity and specificity in detecting group differences. Together, our results demonstrate the potential of the proposed approach to efficiently map effects of interest in both structural and functional data. Copyright © 2018. Published by Elsevier Inc.
Adaptive local linear regression with application to printer color management.
Gupta, Maya R; Garcia, Eric K; Chin, Erika
2008-06-01
Local learning methods, such as local linear regression and nearest neighbor classifiers, base estimates on nearby training samples, neighbors. Usually, the number of neighbors used in estimation is fixed to be a global "optimal" value, chosen by cross validation. This paper proposes adapting the number of neighbors used for estimation to the local geometry of the data, without need for cross validation. The term enclosing neighborhood is introduced to describe a set of neighbors whose convex hull contains the test point when possible. It is proven that enclosing neighborhoods yield bounded estimation variance under some assumptions. Three such enclosing neighborhood definitions are presented: natural neighbors, natural neighbors inclusive, and enclosing k-NN. The effectiveness of these neighborhood definitions with local linear regression is tested for estimating lookup tables for color management. Significant improvements in error metrics are shown, indicating that enclosing neighborhoods may be a promising adaptive neighborhood definition for other local learning tasks as well, depending on the density of training samples.
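The enclosing k-NN idea described above can be illustrated with a minimal one-dimensional sketch. It assumes that the enclosing neighborhood is the smallest set of nearest neighbors whose convex hull (an interval in 1-D) contains the test point, and it falls back to all samples when no such set exists. The function names and toy data are hypothetical and are not taken from the paper.

```python
def enclosing_neighbors_1d(xs, x0):
    """Indices of the nearest neighbours, grown until their interval contains x0."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    chosen = []
    for i in order:
        chosen.append(i)
        lo = min(xs[j] for j in chosen)
        hi = max(xs[j] for j in chosen)
        # Stop as soon as the 1-D convex hull [lo, hi] encloses the test point.
        if len(chosen) >= 2 and lo <= x0 <= hi:
            break
    return chosen


def local_linear_estimate(xs, ys, x0):
    """Least-squares line through the enclosing neighbours, evaluated at x0."""
    idx = enclosing_neighbors_1d(xs, x0)
    n = len(idx)
    mx = sum(xs[i] for i in idx) / n
    my = sum(ys[i] for i in idx) / n
    sxx = sum((xs[i] - mx) ** 2 for i in idx)
    if sxx == 0.0:
        return my  # all neighbours share one x value: fall back to the mean
    slope = sum((xs[i] - mx) * (ys[i] - my) for i in idx) / sxx
    return my + slope * (x0 - mx)


# Toy data (hypothetical): y = 2x sampled at irregular positions.
xs = [0.0, 1.0, 2.0, 4.0, 8.0]
ys = [0.0, 2.0, 4.0, 8.0, 16.0]
estimate = local_linear_estimate(xs, ys, 3.0)
```

The number of neighbors is not fixed in advance here: it adapts to where the test point sits relative to the training samples, which is the property the abstract emphasizes.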
Saini, Harsh; Raicar, Gaurav; Dehzangi, Abdollah; Lal, Sunil; Sharma, Alok
2015-12-07
Protein subcellular localization is an important topic in proteomics since it is related to a protein's overall function and helps in the understanding of metabolic pathways and in drug design and discovery. In this paper, a basic approximation technique from natural language processing called the linear interpolation smoothing model is applied to predicting protein subcellular localizations. The proposed approach extracts features from syntactical information in protein sequences to build probabilistic profiles using dependency models, which are used in linear interpolation to determine how likely a sequence is to belong to a particular subcellular location. This technique builds a statistical model based on maximum likelihood. It is able to deal effectively with the high dimensionality that hinders other traditional classifiers such as Support Vector Machines or k-Nearest Neighbours, without sacrificing performance. This approach has been evaluated by predicting subcellular localizations of Gram positive and Gram negative bacterial proteins. Copyright © 2015 Elsevier Ltd. All rights reserved.
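As a rough illustration of linear interpolation smoothing, the sketch below mixes maximum-likelihood bigram and unigram estimates of residue probabilities and assigns a sequence to the class with the highest log-likelihood. It substitutes simple unigram and bigram counts for the paper's dependency models; the mixing weight `lam`, the probability `floor`, the class names, and the toy sequences are all assumptions made for the example.

```python
import math
from collections import Counter


def train_profile(sequences):
    """Count residues, bigram contexts, and bigrams for one class."""
    uni, ctx, bi = Counter(), Counter(), Counter()
    for s in sequences:
        uni.update(s)
        ctx.update(s[:-1])          # residues that have a successor
        bi.update(zip(s, s[1:]))    # adjacent residue pairs
    return uni, ctx, bi


def log_likelihood(seq, profile, lam=0.7, floor=1e-6):
    """Sum of log interpolated probabilities: lam * P(cur | prev) + (1 - lam) * P(cur)."""
    uni, ctx, bi = profile
    total = sum(uni.values())
    score = 0.0
    for prev, cur in zip(seq, seq[1:]):
        p_uni = uni[cur] / total
        p_bi = bi[(prev, cur)] / ctx[prev] if ctx[prev] else 0.0
        # The floor avoids log(0) for residues never seen in this class.
        score += math.log(max(lam * p_bi + (1 - lam) * p_uni, floor))
    return score


# Hypothetical toy training data for two locations.
profiles = {
    "cytoplasm": train_profile(["AAKAAK", "AKAAKA"]),
    "membrane": train_profile(["LLVLLV", "VLLVLL"]),
}
predicted = max(profiles, key=lambda c: log_likelihood("AAKA", profiles[c]))
```

Because the model only stores counts, its size grows with the alphabet rather than with the length of a feature vector, which is one way to read the abstract's remark about high dimensionality.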
Kunimatsu, Akira; Kunimatsu, Natsuko; Yasaka, Koichiro; Akai, Hiroyuki; Kamiya, Kouhei; Watadani, Takeyuki; Mori, Harushi; Abe, Osamu
2018-05-16
Although advanced MRI techniques are increasingly available, imaging differentiation between glioblastoma and primary central nervous system lymphoma (PCNSL) is sometimes confusing. We aimed to evaluate the performance of image classification by support vector machine, a method of traditional machine learning, using texture features computed from contrast-enhanced T1-weighted images. This retrospective study on preoperative brain tumor MRI included 76 consecutive, initially treated patients with glioblastoma (n = 55) or PCNSL (n = 21) from one institution, divided into an independent training group (n = 60: 44 glioblastomas and 16 PCNSLs) and a test group (n = 16: 11 glioblastomas and 5 PCNSLs), separated sequentially by time period. A total set of 67 texture features was computed on routine contrast-enhanced T1-weighted images of the training group, and the top four most discriminating features were selected as input variables to train support vector machine classifiers. These features were then evaluated on the test group with subsequent image classification. The area under the receiver operating characteristic curve on the training data was calculated at 0.99 (95% confidence interval [CI]: 0.96-1.00) for the classifier with a Gaussian kernel and 0.87 (95% CI: 0.77-0.95) for the classifier with a linear kernel. On the test data, both classifiers showed a prediction accuracy of 75% (12/16) on the test images. Although further improvement is needed, our preliminary results suggest that machine learning-based image classification may provide complementary diagnostic information on routine brain MRI.
Online Detection of Driver Fatigue Using Steering Wheel Angles for Real Driving Conditions
Li, Zuojin; Li, Shengbo Eben; Li, Renjie; Cheng, Bo; Shi, Jinliang
2017-01-01
This paper presents an on-line drowsiness detection system for monitoring driver fatigue level under real driving conditions, based on steering wheel angle (SWA) data collected from sensors mounted on the steering lever. The proposed system first extracts approximate entropy (ApEn) features from fixed sliding windows on the real-time steering wheel angle time series. After that, the system linearizes the ApEn feature series through an adaptive piecewise linear fitting using a given deviation. Then, the detection system calculates the warping distance between the linear feature series of the sample data. Finally, the system uses the warping distance to determine the drowsiness state of the driver according to a designed binary decision classifier. The experimental data were collected from 14.68 h of driving under real road conditions, including two fatigue levels: “awake” and “drowsy”. The results show that the proposed system is capable of working online with an average 78.01% accuracy, 29.35% false detections of the “awake” state, and 15.15% false detections of the “drowsy” state. The results also confirm that the proposed method based on the SWA signal is valuable for applications in preventing traffic accidents caused by driver fatigue. PMID:28257094
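The first stage of the pipeline, approximate entropy over a window, can be sketched as follows. This is the standard textbook definition of ApEn with self-matches included, not necessarily the exact variant used by the authors; the embedding dimension `m`, the absolute tolerance `r`, and the toy windows are assumptions (in practice `r` is often scaled by the standard deviation of the window).

```python
import math


def approximate_entropy(x, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a numeric sequence x."""
    def phi(mm):
        n = len(x) - mm + 1
        templates = [x[i:i + mm] for i in range(n)]
        total = 0.0
        for a in templates:
            # Count templates within Chebyshev distance r of a (self-match included).
            matches = sum(
                1 for b in templates
                if max(abs(u - v) for u, v in zip(a, b)) <= r
            )
            total += math.log(matches / n)
        return total / n

    return phi(m) - phi(m + 1)


# Hypothetical windows: a constant signal and a strictly alternating one.
flat = approximate_entropy([1.0] * 30)
periodic = approximate_entropy([0.0, 1.0] * 10)
```

Both toy windows are highly regular, so their ApEn is at or near zero; irregular steering corrections would be expected to produce larger values.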
Raina, Abhay; Hennessy, Ricky; Rains, Michael; Allred, James; Hirshburg, Jason M; Diven, Dayna; Markey, Mia K.
2016-01-01
Background: Traditional metrics for evaluating the severity of psoriasis are subjective, which complicates efforts to measure effective treatments in clinical trials. Methods: We collected images of psoriasis plaques and calibrated the coloration of the images according to an included color card. Features were extracted from the images and used to train a linear discriminant analysis classifier with cross-validation to automatically classify the degree of erythema. The results were tested against numerical scores obtained by a panel of dermatologists using a standard rating system. Results: Quantitative measures of erythema based on the digital color images showed good agreement with subjective assessment of erythema severity (κ = 0.4203). The color calibration process improved the agreement from κ = 0.2364 to κ = 0.4203. Conclusions: We propose a method for the objective measurement of the psoriasis severity parameter of erythema and show that the calibration process improved the results. PMID:26517973
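The agreement statistic κ reported above compares observed agreement with the agreement expected by chance. The sketch below computes unweighted Cohen's kappa for two raters; the abstract does not say which kappa variant was used, so the unweighted form is an assumption, and the scores are made-up toy values.

```python
from collections import Counter


def cohens_kappa(a, b):
    """Unweighted Cohen's kappa between two equally long label sequences."""
    n = len(a)
    p_observed = sum(1 for x, y in zip(a, b) if x == y) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the raters' marginal label frequencies.
    p_expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical erythema scores from a classifier and from a dermatologist panel.
classifier_scores = [0, 0, 1, 1]
panel_scores = [0, 1, 1, 1]
kappa = cohens_kappa(classifier_scores, panel_scores)
```

Here the raters agree on 3 of 4 items (0.75) while chance agreement is 0.5, so kappa is 0.5.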
Großekathöfer, Ulf; Manyakov, Nikolay V.; Mihajlović, Vojkan; Pandina, Gahan; Skalkin, Andrew; Ness, Seth; Bangerter, Abigail; Goodwin, Matthew S.
2017-01-01
A number of recent studies using accelerometer features as input to machine learning classifiers show promising results for automatically detecting stereotypical motor movements (SMM) in individuals with Autism Spectrum Disorder (ASD). However, replicating these results across different types of accelerometers and their position on the body still remains a challenge. We introduce a new set of features in this domain based on recurrence plot and quantification analyses that are orientation invariant and able to capture non-linear dynamics of SMM. Applying these features to an existing published data set containing acceleration data, we achieve up to 9% average increase in accuracy compared to current state-of-the-art published results. Furthermore, we provide evidence that a single torso sensor can automatically detect multiple types of SMM in ASD, and that our approach allows recognition of SMM with high accuracy in individuals when using a person-independent classifier. PMID:28261082
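A small sketch can show why recurrence-based features are orientation invariant: a recurrence plot depends only on distances between samples, and rotating the sensor leaves those distances unchanged. The code below builds a thresholded recurrence matrix from raw 3-axis samples and computes the recurrence rate, one common quantification measure. The threshold `eps`, the toy samples, and the choice to use raw samples without time-delay embedding are assumptions, not details from the paper.

```python
import math


def recurrence_matrix(samples, eps):
    """R[i][j] = 1 when samples i and j lie within Euclidean distance eps."""
    return [[1 if math.dist(p, q) <= eps else 0 for q in samples] for p in samples]


def recurrence_rate(samples, eps):
    """Fraction of recurrent points in the recurrence matrix."""
    r = recurrence_matrix(samples, eps)
    n = len(samples)
    return sum(map(sum, r)) / (n * n)


# Hypothetical 3-axis accelerometer samples with a repeating movement.
samples = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 0.0, 1.0),
           (1.0, 0.0, 1.0), (0.0, 2.0, 1.0)]
# The same samples after rotating the sensor by 90 degrees about the z axis.
rotated = [(-y, x, z) for (x, y, z) in samples]

rate = recurrence_rate(samples, eps=0.5)
rate_rotated = recurrence_rate(rotated, eps=0.5)
```

The two rates are identical, which is the property that lets such features transfer across sensor orientations.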
NASA Astrophysics Data System (ADS)
Karam, Walid; Mokbel, Chafic; Greige, Hanna; Chollet, Gerard
2006-05-01
A GMM based audio visual speaker verification system is described and an Active Appearance Model with a linear speaker transformation system is used to evaluate the robustness of the verification. An Active Appearance Model (AAM) is used to automatically locate and track a speaker's face in a video recording. A Gaussian Mixture Model (GMM) based classifier (BECARS) is used for face verification. GMM training and testing are accomplished on DCT based extracted features of the detected faces. On the audio side, speech features are extracted and used for speaker verification with the GMM based classifier. Fusion of both audio and video modalities for audio visual speaker verification is compared with face verification and speaker verification systems. To improve the robustness of the multimodal biometric identity verification system, an audio visual imposture system is envisioned. It consists of an automatic voice transformation technique that an impostor may use to assume the identity of an authorized client. Features of the transformed voice are then combined with the corresponding appearance features and fed into the GMM based system BECARS for training. An attempt is made to increase the acceptance rate of the impostor and to analyze the robustness of the verification system. Experiments are being conducted on the BANCA database, with a prospect of experimenting on the PDAtabase newly developed within the scope of the SecurePhone project.
About decomposition approach for solving the classification problem
NASA Astrophysics Data System (ADS)
Andrianova, A. A.
2016-11-01
This article describes the application of a decomposition-based algorithm to the binary classification problem of constructing a linear classifier with the Support Vector Machine method. Decomposition reduces the volume of calculations, in particular because it makes it possible to build parallel versions of the algorithm, which is a very important advantage for problems with big data. The results of computational experiments conducted using the decomposition approach are analyzed. The experiments use a known data set for the binary classification problem.
Piezoelectric transformer structural modeling--a review.
Yang, Jiashi
2007-06-01
A review of piezoelectric transformer structural modeling is presented. The operating principle and the basic behavior of piezoelectric transformers as governed by the linear theory of piezoelectricity are shown by a simple, theoretical analysis of a Rosen transformer based on extensional modes of a nonhomogeneous ceramic rod. Various transformers are classified according to their structural shapes, operating modes, and voltage transforming capability. Theoretical and numerical modeling results from the theory of piezoelectricity are reviewed. More advanced modeling of thermal and nonlinear effects is also discussed. The article contains 167 references.
Local Subspace Classifier with Transform-Invariance for Image Classification
NASA Astrophysics Data System (ADS)
Hotta, Seiji
A family of linear subspace classifiers called local subspace classifier (LSC) outperforms the k-nearest neighbor rule (kNN) and conventional subspace classifiers in handwritten digit classification. However, LSC suffers from very high sensitivity to image transformations because it uses projection and Euclidean distances for classification. In this paper, I present a combination of a local subspace classifier (LSC) and a tangent distance (TD) for improving the accuracy of handwritten digit recognition. In this classification rule, we can deal with transform-invariance easily because we are able to use tangent vectors to approximate transformations. However, we cannot use tangent vectors for other types of images such as color images. Hence, kernel LSC (KLSC) is proposed for incorporating transform-invariance into LSC via kernel mapping. The performance of the proposed methods is verified with experiments on handwritten digit and color image classification.
NASA Astrophysics Data System (ADS)
Prasad, S.; Bruce, L. M.
2007-04-01
There is a growing interest in using multiple sources for automatic target recognition (ATR) applications. One approach is to take multiple, independent observations of a phenomenon and perform a feature level or a decision level fusion for ATR. This paper proposes a method to utilize these types of multi-source fusion techniques to exploit hyperspectral data when only a small number of training pixels are available. Conventional hyperspectral image based ATR techniques project the high dimensional reflectance signature onto a lower dimensional subspace using techniques such as Principal Components Analysis (PCA), Fisher's linear discriminant analysis (LDA), subspace LDA and stepwise LDA. While some of these techniques attempt to solve the curse of dimensionality, or small sample size problem, these are not necessarily optimal projections. In this paper, we present a divide and conquer approach to address the small sample size problem. The hyperspectral space is partitioned into contiguous subspaces such that the discriminative information within each subspace is maximized, and the statistical dependence between subspaces is minimized. We then treat each subspace as a separate source in a multi-source multi-classifier setup and test various decision fusion schemes to determine their efficacy. Unlike previous approaches which use correlation between variables for band grouping, we study the efficacy of higher order statistical information (using average mutual information) for a bottom up band grouping. We also propose a confidence measure based decision fusion technique, where the weights associated with various classifiers are based on their confidence in recognizing the training data. To this end, training accuracies of all classifiers are used for weight assignment in the fusion process of test pixels. 
The proposed methods are tested using hyperspectral data with known ground truth, such that the efficacy can be quantitatively measured in terms of target recognition accuracies.
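The confidence-based decision fusion described in this abstract can be sketched as a weighted vote in which each subspace classifier's vote counts in proportion to its training accuracy. The function name, labels, and accuracy values below are hypothetical, and the authors' actual weighting rule may differ from this simple sum.

```python
from collections import defaultdict


def fuse_decisions(labels, train_accuracies):
    """Weighted vote: each classifier's label is weighted by its training accuracy."""
    scores = defaultdict(float)
    for label, accuracy in zip(labels, train_accuracies):
        scores[label] += accuracy
    return max(scores, key=scores.get)


# Hypothetical per-subspace decisions for one test pixel.
decisions = ["target", "background", "background"]
accuracies = [0.95, 0.40, 0.45]
fused = fuse_decisions(decisions, accuracies)
```

In this toy case a plain majority vote would return "background", while the confidence-weighted vote follows the single classifier that was most reliable on the training data.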
Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie
2014-01-01
The electroencephalogram (EEG) is one of the useful biological signals for distinguishing different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has attracted more attention from researchers, and several feature extraction methods and classifiers have been suggested for recognizing emotions from EEG signals. In this research, we introduce an emotion recognition system using an autoregressive (AR) model, sequential forward feature selection (SFS) and a K-nearest neighbor (KNN) classifier applied to EEG signals recorded during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on Levinson-Durbin's recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods based on the SFS algorithm and the Davies–Bouldin index are used in order to decrease the computational complexity and the redundancy of features; then, three different classifiers, namely KNN, quadratic discriminant analysis and linear discriminant analysis, are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals from an available database for emotion analysis using physiological signals, which were recorded from 32 participants during 40 one-minute audio-visual inductions. According to the results, AR features are efficient for recognizing emotional states from EEG signals, and KNN performs better than the two other classifiers in discriminating both two and three valence/arousal classes. The results also show that the SFS method improves accuracies by almost 10-15% as compared to Davies–Bouldin based feature selection. The best accuracies are 72.33% and 74.20% for two classes of valence and arousal and 61.10% and 65.16% for three classes, respectively. PMID:25298928
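Sequential forward selection, as used above, greedily adds whichever feature most improves a subset score. The sketch below is generic: `score` stands in for whatever criterion is used (in the study this would presumably be classifier accuracy), and the toy scoring function with its relevance values and redundancy penalty is entirely hypothetical.

```python
def sequential_forward_selection(n_features, score, k):
    """Greedily pick k feature indices, adding the best-scoring feature each round."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical scoring: per-feature relevance, minus a penalty when the
# two redundant features 1 and 3 are chosen together.
relevance = [0.2, 0.9, 0.5, 0.85]


def toy_score(subset):
    penalty = 0.8 if 1 in subset and 3 in subset else 0.0
    return sum(relevance[f] for f in subset) - penalty


chosen = sequential_forward_selection(4, toy_score, k=2)
```

Feature 3 is individually strong but is skipped in the second round because it is redundant with feature 1, which illustrates how SFS reduces feature redundancy as well as dimensionality.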
Yin, Zhong; Zhang, Jianhua
2014-07-01
Identifying abnormal changes of mental workload (MWL) over time is crucial for preventing accidents due to cognitive overload and inattention of human operators in safety-critical human-machine systems. It is known that various neuroimaging technologies can be used to identify MWL variations. In order to classify MWL into a few discrete levels using representative MWL indicators and small-sized training samples, a novel EEG-based approach combining locally linear embedding (LLE), support vector clustering (SVC) and support vector data description (SVDD) techniques is proposed and evaluated using experimentally measured data. The MWL indicators from different cortical regions are first elicited by using the LLE technique. Then, the SVC approach is used to find the clusters of these MWL indicators and thereby to detect MWL variations. It is shown that the clusters can be interpreted as binary-class MWL. Furthermore, a trained binary SVDD classifier is shown to be capable of detecting slight variations of those indicators. By combining the two schemes, an SVC-SVDD framework is proposed, where the clear-cut (smaller) cluster is detected by SVC first and then a subsequent SVDD model is utilized to divide the overlapped (larger) cluster into two classes. Finally, three-class MWL levels (low, normal and high) can be identified automatically. The experimental data analysis results are compared with those of several existing methods. It has been demonstrated that the proposed framework can lead to acceptable computational accuracy and has the advantages of both unsupervised and supervised training strategies. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Havla, Lukas; Schneider, Moritz J; Thierfelder, Kolja M; Beyer, Sebastian E; Ertl-Wagner, Birgit; Reiser, Maximilian F; Sommer, Wieland H; Dietrich, Olaf
2016-02-01
The purpose of this study was to propose and evaluate a new wavelet-based technique for classification of arterial and venous vessels using time-resolved cerebral CT perfusion data sets. Fourteen consecutive patients (mean age 73 yr, range 17-97) with suspected stroke but no pathology in follow-up MRI were included. A CT perfusion scan with 32 dynamic phases was performed during intravenous bolus contrast-agent application. After rigid-body motion correction, a Paul wavelet (order 1) was used to calculate voxelwise the wavelet power spectrum (WPS) of each attenuation-time course. The angiographic intensity A was defined as the maximum of the WPS, located at the coordinates T (time axis) and W (scale/width axis) within the WPS. Using these three parameters (A, T, W) separately as well as combined by (1) Fisher's linear discriminant analysis (FLDA), (2) logistic regression (LogR) analysis, or (3) support vector machine (SVM) analysis, their potential to classify 18 different arterial and venous vessel segments per subject was evaluated. The best vessel classification was obtained using all three parameters A and T and W [area under the curve (AUC): 0.953 with FLDA and 0.957 with LogR or SVM]. In direct comparison, the wavelet-derived parameters provided performance at least equal to conventional attenuation-time-course parameters. The maximum AUC obtained from the proposed wavelet parameters was slightly (although not statistically significantly) higher than the maximum AUC (0.945) obtained from the conventional parameters. A new method to classify arterial and venous cerebral vessels with high statistical accuracy was introduced based on the time-domain wavelet transform of dynamic CT perfusion data in combination with linear or nonlinear multidimensional classification techniques.
Blood vessel segmentation in color fundus images based on regional and Hessian features.
Shah, Syed Ayaz Ali; Tang, Tong Boon; Faye, Ibrahima; Laude, Augustinus
2017-08-01
To propose a new algorithm of blood vessel segmentation based on regional and Hessian features for image analysis in retinal abnormality diagnosis. Firstly, color fundus images from the publicly available database DRIVE were converted from RGB to grayscale. To enhance the contrast of the dark objects (blood vessels) against the background, the dot product of the grayscale image with itself was generated. To rectify the variation in contrast, we used a 5 × 5 window filter on each pixel. Based on 5 regional features, 1 intensity feature and 2 Hessian features per scale using 9 scales, we extracted a total of 24 features. A linear minimum squared error (LMSE) classifier was trained to classify each pixel into a vessel or non-vessel pixel. The DRIVE dataset provided 20 training and 20 test color fundus images. The proposed algorithm achieves a sensitivity of 72.05% with 94.79% accuracy. Our proposed algorithm achieved higher accuracy (0.9206) at the peripapillary region, where the ocular manifestations in the microvasculature due to glaucoma, central retinal vein occlusion, etc. are most obvious. This supports the proposed algorithm as a strong candidate for automated vessel segmentation.
A ROC-based feature selection method for computer-aided detection and diagnosis
NASA Astrophysics Data System (ADS)
Wang, Songyuan; Zhang, Guopeng; Liao, Qimei; Zhang, Junying; Jiao, Chun; Lu, Hongbing
2014-03-01
Image-based computer-aided detection and diagnosis (CAD) has been a very active research topic aiming to assist physicians in detecting lesions and distinguishing benign from malignant ones. However, the datasets fed into a classifier usually suffer from a small number of samples, as well as significantly fewer samples in one class (having a disease) than in the other, resulting in suboptimal classifier performance. Identifying the most characterizing features of the observed data for lesion detection is critical to improving the sensitivity and minimizing the false positives of a CAD system. In this study, we propose a novel feature selection method, mR-FAST, that combines the minimal-redundancy-maximal-relevance (mRMR) framework with a selection metric FAST (feature assessment by sliding thresholds) based on the area under a ROC curve (AUC) generated on optimal simple linear discriminants. With three feature datasets extracted from CAD systems for colon polyps and bladder cancer, we show that the space of candidate features selected by mR-FAST is more characterizing for lesion detection with higher AUC, making it possible to find a compact subset of superior features at low cost.
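The relevance half of such a method, scoring each feature by the AUC obtained when a threshold slides over its values, can be sketched with the rank-based (Mann-Whitney) form of the AUC. This sketch covers only single-feature AUC scoring and omits the mRMR redundancy term entirely; the function name, the direction-agnostic `max(auc, 1 - auc)` step, and the toy data are assumptions.

```python
def single_feature_auc(values, labels):
    """AUC of one feature used as a score, via pairwise positive/negative comparisons."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    # Each positive/negative pair counts 1 if ranked correctly, 0.5 on a tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    # A feature may separate the classes in either direction.
    return max(auc, 1.0 - auc)


# Hypothetical imbalanced data: 2 lesions (label 1) among 5 candidates.
labels = [1, 1, 0, 0, 0]
informative = [0.9, 0.8, 0.3, 0.85, 0.1]
constant = [0.5, 0.5, 0.5, 0.5, 0.5]

auc_informative = single_feature_auc(informative, labels)
auc_constant = single_feature_auc(constant, labels)
```

Because AUC is computed from rankings of positive against negative samples, it is less distorted by class imbalance than plain accuracy, which matches the motivation given in the abstract.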
Spatial-temporal discriminant analysis for ERP-based brain-computer interface.
Zhang, Yu; Zhou, Guoxu; Zhao, Qibin; Jin, Jing; Wang, Xingyu; Cichocki, Andrzej
2013-03-01
Linear discriminant analysis (LDA) has been widely adopted to classify event-related potentials (ERPs) in brain-computer interfaces (BCIs). Good classification performance of an ERP-based BCI usually requires sufficient data recordings for effective training of the LDA classifier, and hence a long system calibration time, which, however, may reduce the system's practicability and cause user resistance to the BCI system. In this study, we introduce a spatial-temporal discriminant analysis (STDA) for ERP classification. As a multiway extension of LDA, the STDA method tries to maximize the discriminant information between target and nontarget classes by finding two projection matrices from the spatial and temporal dimensions collaboratively, which effectively reduces the feature dimensionality in the discriminant analysis, and hence significantly decreases the number of required training samples. The proposed STDA method was validated with dataset II of the BCI Competition III and a dataset recorded from our own experiments, and compared to state-of-the-art algorithms for ERP classification. Online experiments were additionally implemented for the validation. The superior classification performance when using few training samples shows that STDA is effective in reducing the system calibration time and improving the classification accuracy, thereby enhancing the practicability of ERP-based BCI.
Computer-Vision-Assisted Palm Rehabilitation With Supervised Learning.
Vamsikrishna, K M; Dogra, Debi Prosad; Desarkar, Maunendra Sankar
2016-05-01
Physical rehabilitation supported by a computer-assisted interface is gaining popularity among the health-care fraternity. In this paper, we propose a computer-vision-assisted contactless methodology to facilitate palm and finger rehabilitation. A Leap Motion controller has been interfaced with a computing device to record parameters describing 3-D movements of the palm of a user undergoing rehabilitation. We have built an interface using the Unity3D development platform. Our interface is capable of analyzing intermediate steps of rehabilitation without the help of an expert, and it can provide online feedback to the user. Isolated gestures are classified using linear discriminant analysis (DA) and support vector machines (SVM). Finally, a set of discrete hidden Markov models (HMM) has been used to classify the gesture sequences performed during rehabilitation. Experimental validation using a large number of samples collected from healthy volunteers reveals that DA and SVM perform similarly when applied to isolated gesture recognition. We have compared the results of HMM-based sequence classification with CRF-based techniques. Our results confirm that both HMM and CRF perform quite similarly when tested on gesture sequences. The proposed system can be used for home-based palm or finger rehabilitation in the absence of experts.
A hybrid sensing approach for pure and adulterated honey classification.
Subari, Norazian; Mohamad Saleh, Junita; Md Shakaff, Ali Yeon; Zakaria, Ammar
2012-10-17
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data.
NASA Astrophysics Data System (ADS)
Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.
2012-08-01
False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the usage of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and to evaluate their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in 0.05-1 FPs on normals on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible, outperforming other state-of-the-art approaches for breast mass CADe.
Prediction of in vivo hepatotoxicity effects using in vitro ...
High-throughput in vitro transcriptomics data support molecular understanding of chemical-induced toxicity. Here, we evaluated the utility of such data to predict liver toxicity. First, in vitro gene expression data for 93 genes was generated following exposure of metabolically competent HepaRG cells to 1060 environmental chemicals from the US EPA ToxCast library. The empirical relationship between these data and rat chronic liver endpoints from animal studies in the Toxicity Reference Database (ToxRefDB) was then evaluated using machine learning techniques. Chemicals were classified as positive (242) or negative (135) based on observed hepatic histopathologic effects, and divided into three categories: hypertrophy (183), injury (112) and proliferative lesions (101). Hepatotoxicants were classified on the basis of the bioactivity of 93 genes (descriptors) using six machine learning algorithms: linear discriminant analysis, naïve Bayes, support vector classification, classification and regression trees, k-nearest neighbors, and an ensemble of classifiers. Classification performance was evaluated using 10-fold cross-validation testing, and in-loop, filter-based, feature subset selection. The best balanced accuracy for prediction of hypertrophy, injury and proliferative lesions were 0.81 ± 0.07, 0.79 ± 0.08 and 0.77 ± 0.09, respectively. Gene specific perturbation of xenobiotic metabolism enzymes (CYP7A1/2E1/4A11/1A1/4A22) and transporters (ABCG2, ABCB11, SLC22
Nagarajan, R; Hariharan, M; Satiyan, M
2012-08-01
Developing tools to assist physically disabled and immobilized people through facial expression is a challenging area of research and has attracted many researchers recently. In this paper, luminance stickers based facial expression recognition is proposed. Recognition of facial expression is carried out by employing Discrete Wavelet Transform (DWT) as a feature extraction method. Different wavelet families with their different orders (db1 to db20, Coif1 to Coif 5 and Sym2 to Sym8) are utilized to investigate their performance in recognizing facial expression and to evaluate their computational time. Standard deviation is computed for the coefficients of first level of wavelet decomposition for every order of wavelet family. This standard deviation is used to form a set of feature vectors for classification. In this study, conventional validation and cross validation are performed to evaluate the efficiency of the suggested feature vectors. Three different classifiers namely Artificial Neural Network (ANN), k-Nearest Neighborhood (kNN) and Linear Discriminant Analysis (LDA) are used to classify a set of eight facial expressions. The experimental results demonstrate that the proposed method gives very promising classification accuracies.
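The db1 wavelet is the Haar wavelet, so the feature described above can be illustrated with a few lines of arithmetic. This is a sketch under assumptions: it treats the input as a one-dimensional signal of even length and reports the standard deviation of both the approximation and the detail coefficients, because the abstract does not say which first-level coefficients were used. The function names are hypothetical.

```python
import math
from statistics import pstdev


def haar_level1(signal):
    """One level of the orthonormal Haar (db1) transform on an even-length signal."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / s for a, b in pairs]
    # The sign convention of the detail band differs between libraries;
    # the standard deviation below does not depend on it.
    detail = [(a - b) / s for a, b in pairs]
    return approx, detail


def std_features(signal):
    """Standard deviation of the first-level approximation and detail coefficients."""
    approx, detail = haar_level1(signal)
    return pstdev(approx), pstdev(detail)
```

Higher-order families such as db2 to db20 need longer filters and boundary handling, which a wavelet library would normally provide.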
A Locomotion Intent Prediction System Based on Multi-Sensor Fusion
Chen, Baojun; Zheng, Enhao; Wang, Qining
2014-01-01
Locomotion intent prediction is essential for the control of powered lower-limb prostheses to realize smooth locomotion transitions. In this research, we develop a multi-sensor fusion based locomotion intent prediction system, which can recognize the current locomotion mode and detect locomotion transitions in advance. Seven able-bodied subjects were recruited for this research. Signals from two foot pressure insoles and three inertial measurement units (one on the thigh, one on the shank and the other on the foot) are measured. A two-level recognition strategy with a linear discriminant classifier is used for the recognition. Six kinds of locomotion modes and ten kinds of locomotion transitions are tested in this study. Recognition accuracy during steady locomotion periods (i.e., no locomotion transitions) is 99.71% ± 0.05% for seven able-bodied subjects. During locomotion transition periods, all the transitions are correctly detected and most of them can be detected before transiting to new locomotion modes. No significant deterioration in recognition performance is observed in the five hours after the system is trained, and a small number of experiment trials is required to train reliable classifiers. PMID:25014097
A locomotion intent prediction system based on multi-sensor fusion.
Chen, Baojun; Zheng, Enhao; Wang, Qining
2014-07-10
Locomotion intent prediction is essential for the control of powered lower-limb prostheses to realize smooth locomotion transitions. In this research, we develop a multi-sensor fusion based locomotion intent prediction system, which can recognize the current locomotion mode and detect locomotion transitions in advance. Seven able-bodied subjects were recruited for this research. Signals from two foot pressure insoles and three inertial measurement units (one on the thigh, one on the shank and the other on the foot) are measured. A two-level recognition strategy with a linear discriminant classifier is used for the recognition. Six kinds of locomotion modes and ten kinds of locomotion transitions are tested in this study. Recognition accuracy during steady locomotion periods (i.e., no locomotion transitions) is 99.71% ± 0.05% for seven able-bodied subjects. During locomotion transition periods, all the transitions are correctly detected and most of them can be detected before transiting to new locomotion modes. No significant deterioration in recognition performance is observed in the five hours after the system is trained, and a small number of experiment trials is required to train reliable classifiers.
Gender classification of running subjects using full-body kinematics
NASA Astrophysics Data System (ADS)
Williams, Christina M.; Flora, Jeffrey B.; Iftekharuddin, Khan M.
2016-05-01
This paper proposes novel automated gender classification of subjects while engaged in running activity. The machine learning techniques include preprocessing steps using principal component analysis followed by classification with linear discriminant analysis, and nonlinear support vector machines, and decision-stump with AdaBoost. The dataset consists of 49 subjects (25 males, 24 females, 2 trials each) all equipped with approximately 80 retroreflective markers. The trials are reflective of the subject's entire body moving unrestrained through a capture volume at a self-selected running speed, thus producing highly realistic data. The classification accuracy using leave-one-out cross validation for the 49 subjects is improved from 66.33% using linear discriminant analysis to 86.74% using the nonlinear support vector machine. Results are further improved to 87.76% by means of implementing a nonlinear decision stump with AdaBoost classifier. The experimental findings suggest that the linear classification approaches are inadequate in classifying gender for a large dataset with subjects running in a moderately uninhibited environment.
2011-01-01
Background: Dementia and cognitive impairment associated with aging are a major medical and social concern. Neuropsychological testing is a key element in the diagnostic procedures of Mild Cognitive Impairment (MCI), but presently has limited value in the prediction of progression to dementia. We advance the hypothesis that newer statistical classification methods derived from data mining and machine learning, such as Neural Networks, Support Vector Machines and Random Forests, can improve the accuracy, sensitivity and specificity of predictions obtained from neuropsychological testing. Seven nonparametric classifiers derived from data mining methods (Multilayer Perceptrons Neural Networks, Radial Basis Function Neural Networks, Support Vector Machines, CART, CHAID and QUEST Classification Trees and Random Forests) were compared to three traditional classifiers (Linear Discriminant Analysis, Quadratic Discriminant Analysis and Logistic Regression) in terms of overall classification accuracy, specificity, sensitivity, area under the ROC curve and Press' Q. Model predictors were 10 neuropsychological tests currently used in the diagnosis of dementia. Statistical distributions of classification parameters obtained from a 5-fold cross-validation were compared using Friedman's nonparametric test. Results: Press' Q test showed that all classifiers performed better than chance alone (p < 0.05). Support Vector Machines showed the largest overall classification accuracy (Median (Me) = 0.76) and area under the ROC curve (Me = 0.90). However, this method showed high specificity (Me = 1.0) but low sensitivity (Me = 0.3). Random Forest ranked second in overall accuracy (Me = 0.73) with high area under the ROC curve (Me = 0.73), specificity (Me = 0.73) and sensitivity (Me = 0.64). Linear Discriminant Analysis also showed acceptable overall accuracy (Me = 0.66), with acceptable area under the ROC curve (Me = 0.72), specificity (Me = 0.66) and sensitivity (Me = 0.64).
The remaining classifiers showed overall classification accuracy above a median value of 0.63, but for most of them sensitivity was around or even below a median value of 0.5. Conclusions: When taking into account sensitivity, specificity and overall classification accuracy, Random Forests and Linear Discriminant Analysis rank first among all the classifiers tested in the prediction of dementia using several neuropsychological tests. These methods may be used to improve the accuracy, sensitivity and specificity of dementia predictions from neuropsychological testing. PMID:21849043
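The Press' Q statistic mentioned in this abstract is commonly defined in the discriminant-analysis literature as Q = (N - nK)^2 / (N(K - 1)), where N is the number of cases, n the number classified correctly and K the number of classes, and it is compared against a chi-square critical value with one degree of freedom. The sketch below assumes that definition; the function names and the example counts in the usage note are hypothetical and are not taken from the study.

```python
# Chi-square critical value for 1 degree of freedom at the 0.05 level.
CHI2_CRIT_1DF_05 = 3.841


def press_q(n_total, n_correct, n_classes):
    """Press' Q = (N - n*K)^2 / (N*(K - 1))."""
    if n_total <= 0 or n_classes < 2 or not 0 <= n_correct <= n_total:
        raise ValueError("invalid counts")
    return (n_total - n_correct * n_classes) ** 2 / (n_total * (n_classes - 1))


def differs_from_chance(n_total, n_correct, n_classes):
    """True when Q exceeds the 0.05 critical value.

    Q is symmetric around chance level, so accuracy should also be checked
    to confirm the classifier is better, not worse, than chance.
    """
    return press_q(n_total, n_correct, n_classes) > CHI2_CRIT_1DF_05
```

For example, 76 correct out of 100 two-class cases gives Q = 27.04, well above 3.841, whereas 50 correct out of 100 gives Q = 0.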
Instrumentation for Linear and Nonlinear Optical Device Characterization
2018-01-31
1998. 4. TITLE. Enter title and subtitle with volume number and part number, if applicable. On classified documents, enter the title classification...with security classification regulations, e.g. U, C, S, etc. If this form contains classified information, stamp classification level on the top...hundreds of picoseconds. Figure 3 illustrates example data taken from the oscilloscope. [Figure 3: time axis in ns] Figure 3. (a) A screen shot
Zhang, Jian; Lockhart, Thurmon E.; Soangra, Rahul
2013-01-01
Fatigue in lower extremity musculature is associated with a decline in postural stability and motor performance, and it alters normal walking patterns in human subjects. Automated recognition of lower extremity muscle fatigue condition may be advantageous in early detection of fall and injury risks. Supervised machine learning methods such as Support Vector Machines (SVM) have been previously used for classifying healthy and pathological gait patterns and also for separating old and young gait patterns. In this study we explore the classification potential of SVM in recognizing gait patterns associated with lower extremity muscular fatigue, using an inertial measurement unit. Both kinematic and kinetic gait patterns of 17 participants (29±11 years) were recorded and analyzed in normal and fatigued states of walking. Lower extremities were fatigued by performance of a squatting exercise until the participants reached 60% of their baseline maximal voluntary exertion level. Feature selection methods were used to classify fatigue and no-fatigue conditions based on temporal and frequency information of the signals. Additionally, the influences of three different kernel schemes (i.e., linear, polynomial, and radial basis function) were investigated for SVM classification. The results indicated that lower extremity muscle fatigue condition influenced gait and loading responses. In terms of the SVM classification results, an accuracy of 96% was reached in distinguishing the two gait patterns (fatigue and no-fatigue) within the same subject using the kinematic, time and frequency domain features. It was also found that the linear and RBF kernels were equally good at identifying intra-individual fatigue characteristics. These results suggest that intra-subject fatigue classification using gait patterns from an inertial sensor holds considerable potential in identifying “at-risk” gait due to muscle fatigue. PMID:24081829
NASA Astrophysics Data System (ADS)
Zhao, An; Jin, Ning-de; Ren, Ying-yu; Zhu, Lei; Yang, Xia
2016-01-01
In this article we apply an approach to identify the oil-gas-water three-phase flow patterns in vertical upwards 20 mm inner-diameter pipe based on the conductance fluctuating signals. We use the approach to analyse the signals with long-range correlations by decomposing the signal increment series into magnitude and sign series and extracting their scaling properties. We find that the magnitude series relates to nonlinear properties of the original time series, whereas the sign series relates to the linear properties. The research shows that the oil-gas-water three-phase flows (slug flow, churn flow, bubble flow) can be classified by a combination of scaling exponents of magnitude and sign series. This study provides a new way of characterising linear and nonlinear properties embedded in oil-gas-water three-phase flows.
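The decomposition step described above can be written down directly: the increment series is split into its absolute values (magnitude) and its signs. The sketch below shows only that step; estimating the scaling exponents of the two series (for example by detrended fluctuation analysis) is not included, and the function names are hypothetical.

```python
def increments(series):
    """First differences x[i+1] - x[i] of a time series."""
    return [b - a for a, b in zip(series, series[1:])]


def magnitude_and_sign(series):
    """Split the increment series into a magnitude series and a sign series.

    The sign is +1, -1 or 0, so magnitude[i] * sign[i] recovers increment i.
    """
    inc = increments(series)
    magnitude = [abs(d) for d in inc]
    sign = [(d > 0) - (d < 0) for d in inc]
    return magnitude, sign
```

The scaling analysis is then run separately on each of the two returned series.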
Reddy, Anupama; Growney, Joseph D; Wilson, Nick S; Emery, Caroline M; Johnson, Jennifer A; Ward, Rebecca; Monaco, Kelli A; Korn, Joshua; Monahan, John E; Stump, Mark D; Mapa, Felipa A; Wilson, Christopher J; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J; Myer, Vic E; Ettenberg, Seth A; Schlegel, Robert; Sellers, William R; Huet, Heather A; Lehár, Joseph
2015-01-01
Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response.
Reddy, Anupama; Growney, Joseph D.; Wilson, Nick S.; Emery, Caroline M.; Johnson, Jennifer A.; Ward, Rebecca; Monaco, Kelli A.; Korn, Joshua; Monahan, John E.; Stump, Mark D.; Mapa, Felipa A.; Wilson, Christopher J.; Steiger, Janine; Ledell, Jebediah; Rickles, Richard J.; Myer, Vic E.; Ettenberg, Seth A.; Schlegel, Robert; Sellers, William R.
2015-01-01
Death Receptor 5 (DR5) agonists demonstrate anti-tumor activity in preclinical models but have yet to demonstrate robust clinical responses. A key limitation may be the lack of patient selection strategies to identify those most likely to respond to treatment. To overcome this limitation, we screened a DR5 agonist Nanobody across >600 cell lines representing 21 tumor lineages and assessed molecular features associated with response. High expression of DR5 and Casp8 were significantly associated with sensitivity, but their expression thresholds were difficult to translate due to low dynamic ranges. To address the translational challenge of establishing thresholds of gene expression, we developed a classifier based on ratios of genes that predicted response across lineages. The ratio classifier outperformed the DR5+Casp8 classifier, as well as standard approaches for feature selection and classification using genes, instead of ratios. This classifier was independently validated using 11 primary patient-derived pancreatic xenograft models showing perfect predictions as well as a striking linearity between prediction probability and anti-tumor response. A network analysis of the genes in the ratio classifier captured important biological relationships mediating drug response, specifically identifying key positive and negative regulators of DR5 mediated apoptosis, including DR5, CASP8, BID, cFLIP, XIAP and PEA15. Importantly, the ratio classifier shows translatability across gene expression platforms (from Affymetrix microarrays to RNA-seq) and across model systems (in vitro to in vivo). Our approach of using gene expression ratios presents a robust and novel method for constructing translatable biomarkers of compound response, which can also probe the underlying biology of treatment response. PMID:26378449
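A ratio-based feature can be illustrated as follows. This is a hypothetical sketch, not the classifier from the paper: the gene names GENE_A to GENE_D, the threshold and the majority-vote rule are placeholders chosen for illustration. It does show one property that may help such features transfer between platforms: a ratio of two genes is unchanged when a whole sample is rescaled by a constant factor.

```python
import math


def log_ratio(expr, numerator, denominator, eps=1e-9):
    """log2 of the expression ratio of two genes; eps guards against zeros."""
    return math.log2((expr[numerator] + eps) / (expr[denominator] + eps))


def predict_sensitive(expr, pairs, threshold=0.0):
    """Hypothetical rule: majority vote over pairs whose log-ratio exceeds threshold."""
    votes = sum(log_ratio(expr, a, b) > threshold for a, b in pairs)
    return votes * 2 > len(pairs)


# Hypothetical expression values for one sample.
sample = {"GENE_A": 8.0, "GENE_B": 2.0, "GENE_C": 1.0, "GENE_D": 4.0}
rescaled = {gene: 10.0 * value for gene, value in sample.items()}
```

Calling log_ratio on sample and on rescaled gives practically the same value for any pair, whereas a fixed threshold on a single gene would give different answers.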
O'Neill, William; Penn, Richard; Werner, Michael; Thomas, Justin
2015-06-01
Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class. Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.
Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F
2018-03-01
Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histograms of oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cuts were prepared from each condyle, although images were limited to the head of the mandibular condyle. To extract image features, we first use LBP and then the histogram of oriented gradients. To reduce dimensionality, Singular Value Decomposition (SVD) is applied to the feature-vector matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) analysis to evaluate the hypothesis. The K nearest neighbor classifier achieves very good accuracy (0.9242) together with desirable sensitivity (0.9470) and specificity (0.9015), whereas the other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN was the best classifier in our experiments for distinguishing patients from healthy individuals, with 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.
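The three reported figures are linked through the confusion matrix. The counts below are not given in the abstract; they are one split consistent with the reported values under the assumption that the 132 TMD joints and 132 normal joints were each scored once. The function name is hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Assumed counts: 125 of 132 TMD joints and 119 of 132 normal joints correct.
acc, sens, spec = confusion_metrics(tp=125, fn=7, tn=119, fp=13)
```

With these assumed counts, accuracy is 244/264, sensitivity is 125/132 and specificity is 119/132, which round to the 0.9242, 0.9470 and 0.9015 quoted above.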
Lagrangians and Euler morphisms from connections on the frame bundle
NASA Astrophysics Data System (ADS)
Kurek, J.; Mikulski, W. M.
2011-07-01
We classify all natural operators transforming torsion free classical linear connections ∇ on m-dimensional manifolds M into r-th order Lagrangians λ(∇) and Euler morphisms E(∇) on the linear frame bundle P1M. We also briefly write how this classification result can be generalized on higher order frame bundles PkM instead of P1M.
Role of EEG as Biomarker in the Early Detection and Classification of Dementia
Al-Qazzaz, Noor Kamal; Ali, Sawal Hamid Bin MD.; Ahmad, Siti Anom; Chellappan, Kalaivani; Islam, Md. Shabiul; Escudero, Javier
2014-01-01
The early detection and classification of dementia are important clinical support tasks for medical practitioners in customizing patient treatment programs to better manage the development and progression of these diseases. Efforts are being made to diagnose these neurodegenerative disorders in the early stages. Indeed, early diagnosis helps patients to obtain the maximum treatment benefit before significant mental decline occurs. The use of electroencephalogram as a tool for the detection of changes in brain activities and clinical diagnosis is becoming increasingly popular for its capabilities in quantifying changes in brain degeneration in dementia. This paper reviews the role of electroencephalogram as a biomarker based on signal processing to detect dementia in early stages and classify its severity. The review starts with a discussion of dementia types and cognitive spectrum followed by the presentation of the effective preprocessing denoising to eliminate possible artifacts. It continues with a description of feature extraction by using linear and nonlinear techniques, and it ends with a brief explanation of vast variety of separation techniques to classify EEG signals. This paper also provides an idea from the most popular studies that may help in diagnosing dementia in early stages and classifying through electroencephalogram signal processing and analysis. PMID:25093211
Role of EEG as biomarker in the early detection and classification of dementia.
Al-Qazzaz, Noor Kamal; Ali, Sawal Hamid Bin Md; Ahmad, Siti Anom; Chellappan, Kalaivani; Islam, Md Shabiul; Escudero, Javier
2014-01-01
The early detection and classification of dementia are important clinical support tasks for medical practitioners in customizing patient treatment programs to better manage the development and progression of these diseases. Efforts are being made to diagnose these neurodegenerative disorders in the early stages. Indeed, early diagnosis helps patients to obtain the maximum treatment benefit before significant mental decline occurs. The use of electroencephalogram as a tool for the detection of changes in brain activities and clinical diagnosis is becoming increasingly popular for its capabilities in quantifying changes in brain degeneration in dementia. This paper reviews the role of electroencephalogram as a biomarker based on signal processing to detect dementia in early stages and classify its severity. The review starts with a discussion of dementia types and cognitive spectrum followed by the presentation of the effective preprocessing denoising to eliminate possible artifacts. It continues with a description of feature extraction by using linear and nonlinear techniques, and it ends with a brief explanation of vast variety of separation techniques to classify EEG signals. This paper also provides an idea from the most popular studies that may help in diagnosing dementia in early stages and classifying through electroencephalogram signal processing and analysis.
Graña, M; Termenon, M; Savio, A; Gonzalez-Pinto, A; Echeveste, J; Pérez, J M; Besga, A
2011-09-20
The aim of this paper is to obtain discriminant features from two scalar measures of Diffusion Tensor Imaging (DTI) data, Fractional Anisotropy (FA) and Mean Diffusivity (MD), and to train and test classifiers able to discriminate Alzheimer's Disease (AD) patients from controls on the basis of features extracted from the FA or MD volumes. In this study, a support vector machine (SVM) classifier was trained and tested on FA and MD data. Feature selection is done by computing Pearson's correlation between the FA or MD values at each voxel site across subjects and the indicator variable specifying the subject class. Voxel sites with high absolute correlation are selected for feature extraction. Results are obtained from an on-going study at Hospital de Santiago Apostol collecting anatomical T1-weighted MRI volumes and DTI data from healthy control subjects and AD patients. FA features and a linear SVM classifier achieve perfect accuracy, sensitivity and specificity in several cross-validation studies, supporting the usefulness of DTI-derived features as an image-marker for AD and the feasibility of building Computer Aided Diagnosis systems for AD based on them. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
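The voxel selection rule described above can be sketched in a few lines. This is an illustration under assumptions, not the authors' pipeline: volumes are flattened to plain lists of voxel values, class labels are coded 0 and 1, and a fixed number of top-ranked voxels is kept (the abstract does not state the selection threshold). The function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_voxels(volumes, labels, top_k):
    """Indices of the top_k voxels with the highest absolute correlation to the labels."""
    n_voxels = len(volumes[0])
    scores = [
        abs(pearson([v[j] for v in volumes], labels)) for j in range(n_voxels)
    ]
    ranked = sorted(range(n_voxels), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:top_k])
```

The values at the selected voxel indices would then form the feature vector passed to the classifier.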
Wang, Kun; Bhandari, Vineet; Giuliano, John S.; O′Hern, Corey S.; Shattuck, Mark D.; Kirby, Michael
2014-01-01
Severe pediatric sepsis continues to be associated with high mortality rates in children. Thus, an important area of biomedical research is to identify biomarkers that can classify sepsis severity and outcomes. The complex and heterogeneous nature of sepsis makes the prospect of the classification of sepsis severity using a single biomarker less likely. Instead, we employ machine learning techniques to validate the use of a multiple biomarkers scoring system to determine the severity of sepsis in critically ill children. The study was based on clinical data and plasma samples provided by a tertiary care center's Pediatric Intensive Care Unit (PICU) from a group of 45 patients with varying sepsis severity at the time of admission. Canonical Correlation Analysis with the Forward Selection and Random Forests methods identified a particular set of biomarkers that included Angiopoietin-1 (Ang-1), Angiopoietin-2 (Ang-2), and Bicarbonate (HCO) as having the strongest correlations with sepsis severity. The robustness and effectiveness of these biomarkers for classifying sepsis severity were validated by constructing a linear Support Vector Machine diagnostic classifier. We also show that the concentrations of Ang-1, Ang-2, and HCO enable predictions of the time dependence of sepsis severity in children. PMID:25255212
Helweg, D A; Au, W W; Roitblat, H L; Nachtigall, P E
1996-04-01
The relationships between acoustic features of target echoes and the cognitive representations of the target formed by an echolocating dolphin will influence the ease with which the dolphin can recognize a target. A blindfolded Atlantic bottlenose dolphin (Tursiops truncatus) learned to match aspect-dependent three-dimensional targets (such as a cube) at haphazard orientations, although with some difficulty. This task may have been difficult because aspect-dependent targets produce different echoes at different orientations, which required the dolphin to have some capability for object constancy across changes in echo characteristics. Significant target-related differences in echo amplitude, rms bandwidth, and distributions of interhighlight intervals were observed among echoes collected when the dolphin was performing the task. Targets could be classified using a combination of energy flux density and rms bandwidth by a linear discriminant analysis and a nearest centroid classifier. Neither statistical model could classify targets without amplitude information, but the highest accuracy required spectral information as well. This suggests that the dolphin recognized the targets using a multidimensional representation containing amplitude and spectral information and that dolphins can form stable representations of targets regardless of orientation based on varying sensory properties.
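A nearest centroid classifier of the kind mentioned above assigns an echo to the target whose mean feature vector is closest. The sketch below assumes two features per echo (standing in for energy flux density and rms bandwidth) and Euclidean distance; the target labels and numbers are hypothetical, and in practice the features would usually be put on a common scale first.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Mean feature vector of each class."""
    groups = defaultdict(list)
    for x, y in zip(samples, labels):
        groups[y].append(x)
    return {
        y: [sum(col) / len(rows) for col in zip(*rows)]
        for y, rows in groups.items()
    }


def predict(centroids, x):
    """Label of the centroid with the smallest Euclidean distance to x."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))


# Hypothetical echoes: [energy-like feature, bandwidth-like feature].
echoes = [[1.0, 0.2], [1.2, 0.3], [3.0, 0.9], [3.2, 1.1]]
targets = ["target_a", "target_a", "target_b", "target_b"]
centroids = fit_centroids(echoes, targets)
```

Dropping the first feature from every vector is a simple way to see how much the classification depends on amplitude information.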
Paraxial diffractive elements for space-variant linear transforms
NASA Astrophysics Data System (ADS)
Teiwes, Stephan; Schwarzer, Heiko; Gu, Ben-Yuan
1998-06-01
Optical linear transform architectures bear good potential for future developments of very powerful hybrid vision systems and neural network classifiers. The optical modules of such systems could be used as pre-processors to solve complex linear operations at very high speed in order to simplify the electronic data post-processing. However, the applicability of linear optical architectures is strongly connected with the fundamental question of how to implement a specific linear transform by optical means and under physical limitations. The large majority of publications on this topic focus on the optical implementation of space-invariant transforms by the well-known 4f-setup. Only a few papers deal with approaches to implement selected space-variant transforms. In this paper, we propose a simple algebraic method to design diffractive elements for an optical architecture in order to realize arbitrary space-variant transforms. The design procedure is based on a digital model of scalar, paraxial wave theory and leads to optimal element transmission functions within the model. Its computational and physical limitations are discussed in terms of complexity measures. Finally, the design procedure is demonstrated by some examples. Firstly, diffractive elements for the realization of different rotation operations are computed and, secondly, a Hough transform element is presented. The correct optical functions of the elements are verified in computer simulation experiments.
Muraro, Ana Paula; Souza, Rita Adriana Gomes de; Rodrigues, Paulo Rogério Melo; Ferreira, Márcia Gonçalves; Sichieri, Rosely
2017-01-01
To assess the effect of socioeconomic position (SEP) in childhood and social mobility on linear growth through adolescence in a population-based cohort. Children born in Cuiabá-MT, central-western Brazil, were evaluated during 1994 - 1999. They were first assessed during 1999 - 2000 (0 - 5 years) and again during 2009 - 2011 (10 - 17 years), and their height-for-age was evaluated during these two periods. A wealth index was used to classify the SEP of each child's family as low, medium, or high. Social mobility was categorized as upward mobility or no upward mobility. Linear mixed models were used. We evaluated 1,716 children (71.4% of baseline) after 10 years, and 60.6% of the families showed upward mobility, with a higher percentage among the lowest economic classes. A higher height-for-age was also observed among those from families with a high SEP both in childhood (low SEP= -0.35 z-score; high SEP= 0.15 z-score, p < 0.01) and adolescence (low SEP= -0.01 z-score; high SEP= 0.45 z-score, p < 0.01), whereas upward mobility did not affect their linear growth. Expressive social mobility was observed, but SEP in childhood and social mobility did not greatly influence linear growth through childhood in this central-western Brazilian cohort.
Investigation of Super Learner Methodology on HIV-1 Small Sample: Application on Jaguar Trial Data.
Houssaïni, Allal; Assoumou, Lambert; Marcelin, Anne Geneviève; Molina, Jean Michel; Calvez, Vincent; Flandre, Philippe
2012-01-01
Background. Many statistical models have been tested to predict phenotypic or virological response from genotypic data. A statistical framework called Super Learner has been introduced either to compare different methods/learners (discrete Super Learner) or to combine them in a Super Learner prediction method. Methods. The Jaguar trial is used to apply the Super Learner framework. The Jaguar study is an "add-on" trial comparing the efficacy of adding didanosine to an on-going failing regimen. Our aim was also to investigate the impact of different cross-validation strategies and different loss functions. Four different partitions between training set and validation set were tested with two loss functions. Six statistical methods were compared. We assessed performance by evaluating R² values and accuracy by calculating the rate of patients correctly classified. Results. Our results indicated that the more recent Super Learner methodology of building a new predictor based on a weighted combination of different methods/learners provided good performance. A simple linear model provided similar results to those of this new predictor. A slight discrepancy arises between the two loss functions investigated, and a slight difference also arises between results based on cross-validated risks and results from the full dataset. The Super Learner methodology and the linear model each correctly classified around 80% of patients. The difference between the lowest and highest rates is around 10 percent. The number of mutations retained in different learners also varies from one to 41. Conclusions. The more recent Super Learner methodology combining the predictions of many learners provided good performance on our small dataset.
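The two uses of the Super Learner described above can be sketched in a few lines. This is a simplified illustration, assuming only two learners, a squared-error loss, and cross-validated predictions that are already available as lists; the learner names and numbers are hypothetical, and the full method generalizes the weighting to many learners.

```python
def cv_risk(y, preds):
    """Mean squared error of cross-validated predictions."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)

def convex_weight(y, pred_a, pred_b):
    """Weight w in [0, 1] minimizing the squared error of w*a + (1-w)*b."""
    num = sum((yi - b) * (a - b) for yi, a, b in zip(y, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        return 0.5  # identical learners: any weight gives the same predictor
    return min(1.0, max(0.0, num / den))

# Hypothetical outcomes and cross-validated predictions of two learners.
y = [1.0, 2.0, 3.0, 4.0]
cv_preds = {
    "learner_a": [1.5, 2.5, 3.5, 4.5],  # biased upward
    "learner_b": [0.5, 1.5, 2.5, 3.5],  # biased downward
}

# Discrete Super Learner: keep the single learner with the lowest CV risk.
best = min(cv_preds, key=lambda name: cv_risk(y, cv_preds[name]))

# Combined Super Learner: weighted combination of the learners.
w = convex_weight(y, cv_preds["learner_a"], cv_preds["learner_b"])
combined = [w * a + (1 - w) * b
            for a, b in zip(cv_preds["learner_a"], cv_preds["learner_b"])]
```

In this toy case the two biases cancel, so the weighted combination has a lower cross-validated risk than either learner alone.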
Geomorphic Flood Area (GFA): a QGIS tool for a cost-effective delineation of the floodplains
NASA Astrophysics Data System (ADS)
Samela, Caterina; Albano, Raffaele; Sole, Aurelia; Manfreda, Salvatore
2017-04-01
The importance of delineating flood hazard and risk areas at a global scale has been highlighted for many years. However, its complete achievement regularly encounters practical difficulties, above all the lack of data and implementation costs. In conditions of scarce data availability (e.g. ungauged basins, large-scale analyses), a fast and cost-effective floodplain delineation can be carried out using geomorphic methods (e.g., Manfreda et al., 2011; 2014). In particular, an automatic DEM-based procedure has been implemented in an open-source QGIS plugin named Geomorphic Flood Area - tool (GFA - tool). This tool performs a linear binary classification based on the recently proposed Geomorphic Flood Index (GFI), which exhibited high classification accuracy and reliability in several test sites located in Europe, United States and Africa (Manfreda et al., 2015; Samela et al., 2016, 2017; Samela, 2016). The GFA - tool is designed to make the proposed procedure, which includes a number of operations requiring good geomorphic and GIS competences, available to all users. It allows computing the GFI through terrain analysis, turning it into a binary classifier, and training it on the basis of a standard inundation map derived for a portion of the river basin (a minimum of 2% of the river basin's area is suggested) using detailed methods of analysis (e.g. flood hazard maps produced by emergency management agencies or river basin authorities). Finally, the GFA - tool allows the classification to be extended outside the calibration area to delineate the flood-prone areas across the entire river basin. The full analysis has been implemented in this plugin with a user-friendly interface that should make it easy for all users to apply the approach and produce the desired results. Keywords: flood susceptibility; data scarce environments; geomorphic flood index; linear binary classification; Digital elevation models (DEMs).
References: Manfreda, S., Di Leo, M., Sole, A. (2011). Detection of Flood Prone Areas using Digital Elevation Models, Journal of Hydrologic Engineering, 16(10), 781-790. Manfreda, S., Nardi, F., Samela, C., Grimaldi, S., Taramasso, A. C., Roth, G., & Sole, A. (2014). Investigation on the Use of Geomorphic Approaches for the Delineation of Flood Prone Areas, Journal of Hydrology, 517, 863-876. Manfreda, S., Samela, C., Gioia, A., Consoli, G., Iacobellis, V., Giuzio, L., & Sole, A. (2015). Flood-prone areas assessment using linear binary classifiers based on flood maps obtained from 1D and 2D hydraulic models. Natural Hazards, 79(2), 735-754. Samela, C. (2016). 100-year flood susceptibility maps for the continental U.S. derived with a geomorphic method. University of Basilicata. Dataset. Samela, C., Manfreda, S., Paola, F. D., Giugni, M., Sole, A., & Fiorentino, M. (2016). DEM-Based Approaches for the Delineation of Flood-Prone Areas in an Ungauged Basin in Africa. Journal of Hydrologic Engineering, 21(2), 1-10. Samela, C., Troy, T.J., Manfreda, S. (2017). Geomorphic classifiers for flood-prone areas delineation for data-scarce environments, Advances in Water Resources (under review).
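The calibration step of a linear binary classifier on a single index can be sketched as a threshold search. The code below is not the plugin's implementation: it assumes that cells with an index at or above the threshold are labelled flood-prone and that the threshold is chosen to minimize the sum of the false positive and false negative rates on the calibration area. The index values and flood labels are invented for illustration.

```python
def error_rates(index_values, flooded, threshold):
    """Return (false positive rate, false negative rate) for a given threshold."""
    fp = fn = pos = neg = 0
    for value, is_flooded in zip(index_values, flooded):
        predicted = value >= threshold
        if is_flooded:
            pos += 1
            if not predicted:
                fn += 1
        else:
            neg += 1
            if predicted:
                fp += 1
    return fp / neg, fn / pos

def calibrate_threshold(index_values, flooded):
    """Pick the observed index value that minimizes FPR + FNR."""
    candidates = sorted(set(index_values))
    return min(candidates, key=lambda t: sum(error_rates(index_values, flooded, t)))

# Hypothetical index values for six calibration cells and their reference labels.
gfi = [-1.2, -0.8, -0.5, -0.1, 0.2, 0.6]
reference = [False, False, False, True, True, True]

threshold = calibrate_threshold(gfi, reference)

# The calibrated threshold can then be applied to cells outside the calibration area.
flood_prone = [value >= threshold for value in [-0.9, 0.4]]
```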
NASA Astrophysics Data System (ADS)
Adeli, Ehsan; Wu, Guorong; Saghafi, Behrouz; An, Le; Shi, Feng; Shen, Dinggang
2017-01-01
Feature selection methods usually select the most compact and relevant set of features based on their contribution to a linear regression model. Thus, these features might not be the best for a non-linear classifier. This is especially crucial for tasks in which performance depends heavily on the feature selection technique, such as the diagnosis of neurodegenerative diseases. Parkinson’s disease (PD) is one of the most common neurodegenerative disorders; it progresses slowly while dramatically affecting quality of life. In this paper, we use multi-modal neuroimaging data to diagnose PD by investigating the brain regions known to be affected at the early stages. We propose a joint kernel-based feature selection and classification framework. Unlike conventional feature selection techniques that select features based on their performance in the original input feature space, we select features that best benefit the classification scheme in the kernel space. We further propose kernel functions specifically designed for our non-negative feature types. We use MRI and SPECT data of 538 subjects from the PPMI database, and obtain a diagnosis accuracy of 97.5%, which outperforms all baseline and state-of-the-art methods. PMID:28120883
Li, Jing; Hong, Wenxue
2014-12-01
Feature extraction and feature selection are important issues in pattern recognition. Based on the geometric algebra representation of vectors, a new feature extraction method using the blade coefficients of geometric algebra was proposed in this study. At the same time, an improved differential evolution (DE) feature selection method was proposed to address the resulting increase in dimensionality. Simple linear discriminant analysis was used as the classifier. The 10-fold cross-validation (10 CV) classification accuracy on a public breast cancer biomedical dataset was more than 96%, superior to that obtained with the original features and with a traditional feature extraction method.
NASA Technical Reports Server (NTRS)
Paradella, W. R. (Principal Investigator); Vitorello, I.; Monteiro, M. D.
1984-01-01
Enhancement techniques and thematic classifications were applied to the metasediments of Bambui Super Group (Upper Proterozoic) in the Region of Serra do Ramalho, SW of the state of Bahia. Linear contrast stretch, band-ratios with contrast stretch, and color-composites allow lithological discriminations. The effects of human activities and of vegetation cover mask and limit, in several ways, the lithological discrimination with digital MSS data. Principal component images and color composite of linear contrast stretch of these products, show lithological discrimination through tonal gradations. This set of products allows the delineations of several metasedimentary sequences to a level superior to reconnaissance mapping. Supervised (maximum likelihood classifier) and nonsupervised (K-Means classifier) classification of the limestone sequence, host to fluorite mineralization show satisfactory results.
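For readers unfamiliar with the linear contrast stretch named above, a minimal sketch follows. It assumes an 8-bit output range and a user-chosen input window; the pixel values are invented and the function name is illustrative only.

```python
def linear_stretch(pixels, low, high, out_max=255):
    """Map the window [low, high] linearly onto [0, out_max], clipping values outside it."""
    scale = out_max / (high - low)
    return [round(min(max((p - low) * scale, 0), out_max)) for p in pixels]

# Hypothetical digital numbers from one band of a low-contrast scene.
band = [20, 30, 40, 50, 60, 70]
stretched = linear_stretch(band, low=30, high=60)
```

The same function can be applied to a band ratio (one band divided by another, pixel by pixel) before display, which corresponds to the "band-ratios with contrast stretch" product.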
Real-data comparison of data mining methods in prediction of diabetes in Iran.
Tapak, Lily; Mahjub, Hossein; Hamidi, Omid; Poorolajal, Jalal
2013-09-01
Diabetes is one of the most common non-communicable diseases in developing countries. Early screening and diagnosis play an important role in effective prevention strategies. This study compared two traditional classification methods (logistic regression and Fisher linear discriminant analysis) and four machine-learning classifiers (neural networks, support vector machines, fuzzy c-mean, and random forests) to classify persons with and without diabetes. The data set used in this study included 6,500 subjects from the Iranian national non-communicable diseases risk factors surveillance obtained through a cross-sectional survey. The obtained sample was based on cluster sampling of the Iran population which was conducted in 2005-2009 to assess the prevalence of major non-communicable disease risk factors. Ten risk factors that are commonly associated with diabetes were selected to compare the performance of six classifiers in terms of sensitivity, specificity, total accuracy, and area under the receiver operating characteristic (ROC) curve criteria. Support vector machines showed the highest total accuracy (0.986) as well as area under the ROC (0.979). Also, this method showed high specificity (1.000) and sensitivity (0.820). All other methods produced total accuracy of more than 85%, but for all methods, the sensitivity values were very low (less than 0.350). The results of this study indicate that, in terms of sensitivity, specificity, and overall classification accuracy, the support vector machine model ranks first among all the classifiers tested in the prediction of diabetes. Therefore, this approach is a promising classifier for predicting diabetes, and it should be further investigated for the prediction of other diseases.
Stec, James; Wang, Jing; Coombes, Kevin; Ayers, Mark; Hoersch, Sebastian; Gold, David L.; Ross, Jeffrey S; Hess, Kenneth R.; Tirrell, Stephen; Linette, Gerald; Hortobagyi, Gabriel N.; Symmans, W. Fraser; Pusztai, Lajos
2005-01-01
We examined how well differentially expressed genes and multigene outcome classifiers retain their class-discriminating values when tested on data generated by different transcriptional profiling platforms. RNA from 33 stage I-III breast cancers was hybridized to both Affymetrix GeneChip and Millennium Pharmaceuticals cDNA arrays. Only 30% of all corresponding gene expression measurements on the two platforms had Pearson correlation coefficient r ≥ 0.7 when UniGene was used to match probes. There was substantial variation in correlation between different Affymetrix probe sets matched to the same cDNA probe. When cDNA and Affymetrix probes were matched by basic local alignment tool (BLAST) sequence identity, the correlation increased substantially. We identified 182 genes in the Affymetrix and 45 in the cDNA data (including 17 common genes) that accurately separated 91% of cases in supervised hierarchical clustering in each data set. Cross-platform testing of these informative genes resulted in lower clustering accuracy of 45 and 79%, respectively. Several sets of accurate five-gene classifiers were developed on each platform using linear discriminant analysis. The best 100 classifiers showed average misclassification error rate of 2% on the original data that rose to 19.5% when tested on data from the other platform. Random five-gene classifiers showed misclassification error rate of 33%. We conclude that multigene predictors optimized for one platform lose accuracy when applied to data from another platform due to missing genes and sequence differences in probes that result in differing measurements for the same gene. PMID:16049308
NASA Astrophysics Data System (ADS)
Anees, Asim; Aryal, Jagannath; O'Reilly, Małgorzata M.; Gale, Timothy J.; Wardlaw, Tim
2016-12-01
A robust non-parametric framework, based on multiple Radial Basis Function (RBF) kernels, is proposed in this study for detecting land/forest cover changes using Landsat 7 ETM+ images. One widely used approach is to compute change vectors (a difference image) and use a supervised classifier to differentiate between change and no-change. Bayesian classifiers, e.g. the Maximum Likelihood Classifier (MLC) and Naive Bayes (NB), are widely used probabilistic classifiers that assume parametric models, e.g. a Gaussian function, for the class conditional distributions. However, their performance can be limited if the data set deviates from the assumed model. The proposed framework exploits the useful properties of the Least Squares Probabilistic Classifier (LSPC) formulation, i.e. its non-parametric and probabilistic nature, to model class posterior probabilities of the difference image using a linear combination of a large number of Gaussian kernels. To this end, a simple technique based on 10-fold cross-validation is also proposed for tuning model parameters automatically instead of selecting a (possibly) suboptimal combination from pre-specified lists of values. The proposed framework has been tested and compared with Support Vector Machine (SVM) and NB for detection of defoliation caused by leaf beetles (Paropsisterna spp.) in Eucalyptus nitens and Eucalyptus globulus plantations of two test areas in Tasmania, Australia, using raw bands and band combination indices of Landsat 7 ETM+. It was observed that, due to its multi-kernel non-parametric formulation and probabilistic nature, the LSPC outperforms parametric NB with a Gaussian assumption in the change detection framework, with Overall Accuracy (OA) ranging from 93.6% (κ = 0.87) to 97.4% (κ = 0.94) against 85.3% (κ = 0.69) to 93.4% (κ = 0.85), and is more robust to changing data distributions. Its performance was comparable to SVM, with the added advantages of being probabilistic and capable of handling multi-class problems naturally with its original formulation.
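The two accuracy measures quoted above, Overall Accuracy (OA) and Cohen's kappa (κ), can both be computed from a confusion matrix. The sketch below uses the standard definitions; the change/no-change counts are invented and do not reproduce the study's results.

```python
def overall_accuracy_and_kappa(matrix):
    """matrix[i][j] counts samples of true class i predicted as class j."""
    k = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(k)
    ) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical change / no-change confusion matrix.
confusion = [
    [45, 5],   # true change: 45 detected, 5 missed
    [5, 45],   # true no-change: 5 false alarms, 45 correct
]
oa, kappa = overall_accuracy_and_kappa(confusion)
```

With balanced classes as in this example, chance agreement is 0.5, which is why κ is noticeably lower than OA.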
Poynton, Clare B; Chen, Kevin T; Chonde, Daniel B; Izquierdo-Garcia, David; Gollub, Randy L; Gerstner, Elizabeth R; Batchelor, Tracy T; Catana, Ciprian
2014-01-01
We present a new MRI-based attenuation correction (AC) approach for integrated PET/MRI systems that combines both segmentation- and atlas-based methods by incorporating dual-echo ultra-short echo-time (DUTE) and T1-weighted (T1w) MRI data and a probabilistic atlas. Segmented atlases were constructed from CT training data using a leave-one-out framework and combined with T1w, DUTE, and CT data to train a classifier that computes the probability of air/soft tissue/bone at each voxel. This classifier was applied to segment the MRI of the subject of interest and attenuation maps (μ-maps) were generated by assigning specific linear attenuation coefficients (LACs) to each tissue class. The μ-maps generated with this “Atlas-T1w-DUTE” approach were compared to those obtained from DUTE data using a previously proposed method. For validation of the segmentation results, segmented CT μ-maps were considered the “silver standard”; the segmentation accuracy was assessed qualitatively and quantitatively through calculation of the Dice similarity coefficient (DSC). Relative change (RC) maps between the CT and MRI-based attenuation corrected PET volumes were also calculated for a global voxel-wise assessment of the reconstruction results. The μ-maps obtained using the Atlas-T1w-DUTE classifier agreed well with those derived from CT; the mean DSCs for the Atlas-T1w-DUTE-based μ-maps across all subjects were higher than those for DUTE-based μ-maps; the atlas-based μ-maps also showed a lower percentage of misclassified voxels across all subjects. RC maps from the atlas-based technique also demonstrated improvement in the PET data compared to the DUTE method, both globally and regionally. PMID:24753982
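The Dice similarity coefficient used for validation is defined as twice the overlap of two segmentations divided by the sum of their sizes. A minimal per-tissue sketch follows, assuming each segmentation is given as a flat list of voxel labels; the six-voxel example is invented for illustration.

```python
def dice(labels_a, labels_b, tissue):
    """Dice similarity coefficient of one tissue class between two label lists."""
    mask_a = [label == tissue for label in labels_a]
    mask_b = [label == tissue for label in labels_b]
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    size = sum(mask_a) + sum(mask_b)
    # Convention assumed here: two empty masks count as perfect agreement.
    return 2 * overlap / size if size else 1.0

# Hypothetical voxel labels from a CT reference and an MRI-based segmentation.
ct_labels = ["air", "bone", "bone", "soft", "soft", "soft"]
mr_labels = ["air", "bone", "soft", "soft", "soft", "bone"]

scores = {tissue: dice(ct_labels, mr_labels, tissue)
          for tissue in ("air", "soft", "bone")}
```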
Cerebral 18F-FDG PET in macrophagic myofasciitis: An individual SVM-based approach.
Blanc-Durand, Paul; Van Der Gucht, Axel; Guedj, Eric; Abulizi, Mukedaisi; Aoun-Sebaiti, Mehdi; Lerman, Lionel; Verger, Antoine; Authier, François-Jérôme; Itti, Emmanuel
2017-01-01
Macrophagic myofasciitis (MMF) is an emerging condition with highly specific myopathological alterations. A peculiar spatial pattern of cerebral glucose hypometabolism involving the occipito-temporal cortex and cerebellum has been reported in patients with MMF; however, the full pattern is not systematically present in routine interpretation of scans, and its severity varies with the cognitive profile of patients. The aim was to generate and evaluate a support vector machine (SVM) procedure to classify patients between healthy or MMF 18F-FDG brain profiles. 18F-FDG PET brain images of 119 patients with MMF and 64 healthy subjects were retrospectively analyzed. The whole population was divided into two groups: a training set (100 MMF, 44 healthy subjects) and a testing set (19 MMF, 20 healthy subjects). Dimensionality reduction was performed using a t-map from statistical parametric mapping (SPM) and an SVM with a linear kernel was trained on the training set. To evaluate the performance of the SVM classifier, values of sensitivity (Se), specificity (Sp), positive predictive value (PPV), negative predictive value (NPV) and accuracy (Acc) were calculated. The SPM12 analysis on the training set exhibited the already reported hypometabolism pattern involving occipito-temporal and fronto-parietal cortices, limbic system and cerebellum. The SVM procedure, based on the t-test mask generated from the training set, correctly classified MMF patients of the testing set with the following Se, Sp, PPV, NPV and Acc: 89%, 85%, 85%, 89%, and 87%. We developed an original and individual approach including an SVM to classify patients between healthy or MMF metabolic brain profiles using 18F-FDG-PET. Machine learning algorithms are promising for computer-aided diagnosis but will need further validation in prospective cohorts.
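The five reported measures follow directly from a 2x2 confusion table. The abstract does not give the raw counts; the counts below (17 true positives, 2 false negatives, 17 true negatives, 3 false positives) are an inference that happens to reproduce the reported percentages for a testing set of 19 MMF and 20 healthy subjects, and should be read as an assumption rather than as reported data.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-table counts."""
    return {
        "Se": tp / (tp + fn),    # sensitivity
        "Sp": tn / (tn + fp),    # specificity
        "PPV": tp / (tp + fp),   # positive predictive value
        "NPV": tn / (tn + fn),   # negative predictive value
        "Acc": (tp + tn) / (tp + fn + tn + fp),
    }

# Assumed counts (not stated in the abstract) for 19 MMF and 20 healthy subjects.
metrics = diagnostic_metrics(tp=17, fn=2, tn=17, fp=3)
percent = {name: round(100 * value) for name, value in metrics.items()}
```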
Psoriasis skin biopsy image segmentation using Deep Convolutional Neural Network.
Pal, Anabik; Garain, Utpal; Chandra, Aditi; Chatterjee, Raghunath; Senapati, Swapan
2018-06-01
Development of machine assisted tools for automatic analysis of psoriasis skin biopsy images plays an important role in clinical assistance. Development of an automatic approach for accurate segmentation of psoriasis skin biopsy images is the initial prerequisite for developing such a system. However, the complex cellular structure, the presence of imaging artifacts, and uneven staining variation make the task challenging. This paper presents a pioneering attempt at automatic segmentation of psoriasis skin biopsy images. Several deep neural architectures are tried for segmenting psoriasis skin biopsy images. Deep models are used for classifying the super-pixels generated by Simple Linear Iterative Clustering (SLIC) and the segmentation performance of these architectures is compared with traditional hand-crafted feature based classifiers built on popularly used classifiers like K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Random Forest (RF). A U-shaped Fully Convolutional Neural Network (FCN) is also used in an end-to-end learning fashion where the input is the original color image and the output is the segmentation class map for the skin layers. An annotated real psoriasis skin biopsy image data set of ninety (90) images is developed and used for this research. The segmentation performance is evaluated with two metrics, namely Jaccard's Coefficient (JC) and the Ratio of Correct Pixel Classification (RCPC) accuracy. The experimental results show that the CNN based approaches outperform the traditional hand-crafted feature based classification approaches. The present research shows that a practical system can be developed for machine assisted analysis of psoriasis disease. Copyright © 2018 Elsevier B.V. All rights reserved.
Surgical gesture classification from video and kinematic data.
Zappella, Luca; Béjar, Benjamín; Hager, Gregory; Vidal, René
2013-10-01
Much of the existing work on automatic classification of gestures and skill in robotic surgery is based on dynamic cues (e.g., time to completion, speed, forces, torque) or kinematic data (e.g., robot trajectories and velocities). While videos could be equally or more discriminative (e.g., videos contain semantic information not present in kinematic data), they are typically not used because of the difficulties associated with automatic video interpretation. In this paper, we propose several methods for automatic surgical gesture classification from video data. We assume that the video of a surgical task (e.g., suturing) has been segmented into video clips corresponding to a single gesture (e.g., grabbing the needle, passing the needle) and propose three methods to classify the gesture of each video clip. In the first one, we model each video clip as the output of a linear dynamical system (LDS) and use metrics in the space of LDSs to classify new video clips. In the second one, we use spatio-temporal features extracted from each video clip to learn a dictionary of spatio-temporal words, and use a bag-of-features (BoF) approach to classify new video clips. In the third one, we use multiple kernel learning (MKL) to combine the LDS and BoF approaches. Since the LDS approach is also applicable to kinematic data, we also use MKL to combine both types of data in order to exploit their complementarity. Our experiments on a typical surgical training setup show that methods based on video data perform as well as, if not better than, state-of-the-art approaches based on kinematic data. In turn, the combination of both kinematic and video data outperforms any other algorithm based on one type of data alone. Copyright © 2013 Elsevier B.V. All rights reserved.
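The bag-of-features representation in the second method can be illustrated with a short sketch. It assumes a dictionary of "words" has already been learned (for instance by clustering training features, which is omitted here) and that each clip is described by a list of feature vectors; the two-dimensional features and the three-word dictionary are invented, and real spatio-temporal descriptors have far more dimensions.

```python
import math

def bof_histogram(features, codebook):
    """Normalized histogram of nearest-codeword assignments for one clip."""
    counts = [0] * len(codebook)
    for feature in features:
        # Assign the feature to the closest dictionary word (Euclidean distance).
        nearest = min(range(len(codebook)),
                      key=lambda k: math.dist(feature, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    return [count / total for count in counts]

# Hypothetical dictionary of three words and features from one video clip.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
clip_features = [(0.1, 0.2), (0.9, 1.1), (1.2, 0.8), (4.8, 5.1)]

histogram = bof_histogram(clip_features, codebook)
```

Each clip becomes a fixed-length vector regardless of how many features it contains, which is what lets a standard classifier or kernel compare clips of different durations.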
Sliding Window Generalized Kernel Affine Projection Algorithm Using Projection Mappings
NASA Astrophysics Data System (ADS)
Slavakis, Konstantinos; Theodoridis, Sergios
2008-12-01
Very recently, a solution to the kernel-based online classification problem has been given by the adaptive projected subgradient method (APSM). The developed algorithm can be considered as a generalization of a kernel affine projection algorithm (APA) and the kernel normalized least mean squares (NLMS). Furthermore, sparsification of the resulting kernel series expansion was achieved by imposing a closed ball (convex set) constraint on the norm of the classifiers. This paper presents another sparsification method for the APSM approach to the online classification task by generating a sequence of linear subspaces in a reproducing kernel Hilbert space (RKHS). To cope with the inherent memory limitations of online systems and to embed tracking capabilities to the design, an upper bound on the dimension of the linear subspaces is imposed. The underlying principle of the design is the notion of projection mappings. Classification is performed by metric projection mappings, sparsification is achieved by orthogonal projections, while the online system's memory requirements and tracking are attained by oblique projections. The resulting sparsification scheme shows strong similarities with the classical sliding window adaptive schemes. The proposed design is validated by the adaptive equalization problem of a nonlinear communication channel, and is compared with classical and recent stochastic gradient descent techniques, as well as with the APSM's solution where sparsification is performed by a closed ball constraint on the norm of the classifiers.
Maalek, Reza; Lichti, Derek D; Ruwanpura, Janaka Y
2018-03-08
Automated segmentation of planar and linear features of point clouds acquired from construction sites is essential for the automatic extraction of building construction elements such as columns, beams and slabs. However, many planar and linear segmentation methods use scene-dependent similarity thresholds that may not provide generalizable solutions for all environments. In addition, outliers exist in construction site point clouds due to data artefacts caused by moving objects, occlusions and dust. To address these concerns, a novel method for robust classification and segmentation of planar and linear features is proposed. First, coplanar and collinear points are classified through a robust principal components analysis procedure. The classified points are then grouped using a new robust clustering method, the robust complete linkage method. A robust method is also proposed to extract the points of flat-slab floors and/or ceilings independent of the aforementioned stages to improve computational efficiency. The applicability of the proposed method is evaluated in eight datasets acquired from a complex laboratory environment and two construction sites at the University of Calgary. The precision, recall, and accuracy of the segmentation at both construction sites were 96.8%, 97.7% and 95%, respectively. These results demonstrate the suitability of the proposed method for robust segmentation of planar and linear features of contaminated datasets, such as those collected from construction sites. PMID:29518062
Mori, Hiroki; Okuyama, Yuji; Asada, Minoru
2017-01-01
Chaotic itinerancy is a phenomenon in which the state of a nonlinear dynamical system spontaneously explores and attracts certain states in a state space. From this perspective, the diverse behavior of animals and its spontaneous transitions lead to a complex coupled dynamical system, including a physical body and a brain. Herein, a series of simulations using different types of non-linear oscillator networks (i.e., regular, small-world, scale-free, random) with a musculoskeletal model (i.e., a snake-like robot) as a physical body are conducted to understand how the chaotic itinerancy of bodily behavior emerges from the coupled dynamics between the body and the brain. A behavior analysis (behavior clustering) and network analysis for the classified behavior are then applied. The former consists of feature vector extraction from the motions and classification of the movement patterns that emerged from the coupled dynamics. The network structures behind the classified movement patterns are revealed by estimating the “information networks” different from the given non-linear oscillator networks based on the transfer entropy which finds the information flow among neurons. The experimental results show that: (1) the number of movement patterns and their duration depend on the sensor ratio to control the balance of strength between the body and the brain dynamics and on the type of the given non-linear oscillator networks; and (2) two kinds of information networks are found behind two kinds movement patterns with different durations by utilizing the complex network measures, clustering coefficient and the shortest path length with a negative and a positive relationship with the duration periods of movement patterns. The current results seem promising for a future extension of the method to a more complicated body and environment. Several requirements are also discussed. PMID:28796797
Park, Jihoon; Mori, Hiroki; Okuyama, Yuji; Asada, Minoru
2017-01-01
Chaotic itinerancy is a phenomenon in which the state of a nonlinear dynamical system spontaneously explores and attracts certain states in a state space. From this perspective, the diverse behavior of animals and its spontaneous transitions lead to a complex coupled dynamical system, including a physical body and a brain. Herein, a series of simulations using different types of non-linear oscillator networks (i.e., regular, small-world, scale-free, random) with a musculoskeletal model (i.e., a snake-like robot) as a physical body are conducted to understand how the chaotic itinerancy of bodily behavior emerges from the coupled dynamics between the body and the brain. A behavior analysis (behavior clustering) and a network analysis for the classified behavior are then applied. The former consists of feature vector extraction from the motions and classification of the movement patterns that emerged from the coupled dynamics. The network structures behind the classified movement patterns are revealed by estimating the "information networks", which differ from the given non-linear oscillator networks, based on the transfer entropy, which captures the information flow among neurons. The experimental results show that: (1) the number of movement patterns and their duration depend on the sensor ratio, which controls the balance of strength between the body and the brain dynamics, and on the type of the given non-linear oscillator networks; and (2) two kinds of information networks are found behind two kinds of movement patterns with different durations by utilizing the complex network measures, clustering coefficient and shortest path length, which have a negative and a positive relationship with the duration periods of movement patterns. The current results seem promising for a future extension of the method to a more complicated body and environment. Several requirements are also discussed.
Shahid, Mohammad; Shahzad Cheema, Muhammad; Klenner, Alexander; Younesi, Erfan; Hofmann-Apitius, Martin
2013-03-01
Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context, amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied one such SVM-based approach, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in an overall accuracy of ~80% with 10-fold cross-validation using the 40 top-ranked molecular descriptors selected out of a total of 314 descriptors. Moreover, the SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptor space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding overfitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors. Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
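The elimination loop behind recursive feature elimination can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the scorer `class_mean_gap` is a hypothetical stand-in for the weights of a trained linear SVM, the toy descriptor matrix is invented, and features are assumed to be on comparable scales.

```python
def class_mean_gap(rows, labels, active):
    """Stand-in for linear SVM weights (assumption): gap between the two
    class means for each feature that is still active."""
    pos = [r for r, y in zip(rows, labels) if y == 1]
    neg = [r for r, y in zip(rows, labels) if y == 0]
    return {
        j: sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in active
    }


def recursive_feature_elimination(rows, labels, n_keep, fit_weights=class_mean_gap):
    """Refit on the surviving features and drop the one with the smallest
    squared weight, until only n_keep features remain."""
    active = list(range(len(rows[0])))
    eliminated = []
    while len(active) > n_keep:
        weights = fit_weights(rows, labels, active)
        weakest = min(active, key=lambda j: weights[j] ** 2)
        active.remove(weakest)
        eliminated.append(weakest)
    return active, eliminated


# Invented toy data: 4 samples, 4 descriptors; only descriptor 0 separates well.
rows = [
    [1.0, 0.2, 5.0, 0.1],
    [0.9, 0.1, 5.4, 0.9],
    [0.1, 0.3, 5.0, 0.2],
    [0.0, 0.2, 4.8, 0.8],
]
labels = [1, 1, 0, 0]
kept, dropped = recursive_feature_elimination(rows, labels, n_keep=1)
```

Because `fit_weights` is a parameter, any routine that returns one weight per active feature could be substituted, which is where a real linear SVM fit would go.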
Modeling of Tool-Tissue Interactions for Computer-Based Surgical Simulation: A Literature Review
Misra, Sarthak; Ramesh, K. T.; Okamura, Allison M.
2009-01-01
Surgical simulators present a safe and potentially effective method for surgical training, and can also be used in robot-assisted surgery for pre- and intra-operative planning. Accurate modeling of the interaction between surgical instruments and organs has been recognized as a key requirement in the development of high-fidelity surgical simulators. Researchers have attempted to model tool-tissue interactions in a wide variety of ways, which can be broadly classified as (1) linear elasticity-based, (2) nonlinear (hyperelastic) elasticity-based finite element (FE) methods, and (3) other techniques that are not based on FE methods or continuum mechanics. Realistic modeling of organ deformation requires populating the model with real tissue data (which are difficult to acquire in vivo) and simulating organ response in real time (which is computationally expensive). Further, it is challenging to account for connective tissue supporting the organ, friction, and topological changes resulting from tool-tissue interactions during invasive surgical procedures. Overcoming such obstacles will not only help us to model tool-tissue interactions in real time, but also enable realistic force feedback to the user during surgical simulation. This review paper classifies the existing research on tool-tissue interactions for surgical simulators specifically based on the modeling techniques employed and the kind of surgical operation being simulated, in order to inform and motivate future research on improved tool-tissue interaction models. PMID:20119508
Bonato, Matteo; Papini, Gabriele; Bosio, Andrea; Mohammed, Rahil A.; Bonomi, Alberto G.; Moore, Jonathan P.; Merati, Giampiero; La Torre, Antonio; Kubis, Hans-Peter
2016-01-01
Cardio-respiratory fitness (CRF) is a widespread essential indicator in Sports Science as well as in Sports Medicine. This study aimed to develop and validate a prediction model for CRF based on a 45 second self-test, which can be conducted anywhere. A criterion validity and test-retest study was set up to accomplish our objectives. Data from 81 healthy volunteers (age: 29 ± 8 years, BMI: 24.0 ± 2.9), 18 of whom were female, were used to validate this test against the gold standard. Nineteen volunteers repeated this test twice in order to evaluate its repeatability. CRF estimation models were developed using heart rate (HR) features extracted from the resting, exercise, and recovery phases. The most predictive HR feature was the intercept of the linear equation fitting the HR values during the recovery phase normalized for height² (r² = 0.30). The Ruffier-Dickson Index (RDI), which was originally developed for this squat test, showed a significant negative correlation with CRF (r = -0.40), but explained only 15% of the variability in CRF. A multivariate model based on RDI and sex, age and height increased the explained variability up to 53% with a cross-validation (CV) error of 0.532 L·min⁻¹ and substantial repeatability (ICC = 0.91). The best predictive multivariate model made use of the linear intercept of HR at the beginning of the recovery normalized for height² and age²; this had an adjusted r² = 0.59, a CV error of 0.495 L·min⁻¹ and substantial repeatability (ICC = 0.93). It also had a higher agreement in classifying CRF levels (κ = 0.42) than the RDI-based model (κ = 0.29). In conclusion, this simple 45 s self-test can be used to estimate and classify CRF in healthy individuals with moderate accuracy and large repeatability when HR recovery features are included. PMID:27959935
Sartor, Francesco; Bonato, Matteo; Papini, Gabriele; Bosio, Andrea; Mohammed, Rahil A; Bonomi, Alberto G; Moore, Jonathan P; Merati, Giampiero; La Torre, Antonio; Kubis, Hans-Peter
2016-01-01
Cardio-respiratory fitness (CRF) is a widespread essential indicator in Sports Science as well as in Sports Medicine. This study aimed to develop and validate a prediction model for CRF based on a 45 second self-test, which can be conducted anywhere. A criterion validity and test-retest study was set up to accomplish our objectives. Data from 81 healthy volunteers (age: 29 ± 8 years, BMI: 24.0 ± 2.9), 18 of whom were female, were used to validate this test against the gold standard. Nineteen volunteers repeated this test twice in order to evaluate its repeatability. CRF estimation models were developed using heart rate (HR) features extracted from the resting, exercise, and recovery phases. The most predictive HR feature was the intercept of the linear equation fitting the HR values during the recovery phase normalized for height² (r² = 0.30). The Ruffier-Dickson Index (RDI), which was originally developed for this squat test, showed a significant negative correlation with CRF (r = -0.40), but explained only 15% of the variability in CRF. A multivariate model based on RDI and sex, age and height increased the explained variability up to 53% with a cross-validation (CV) error of 0.532 L·min⁻¹ and substantial repeatability (ICC = 0.91). The best predictive multivariate model made use of the linear intercept of HR at the beginning of the recovery normalized for height² and age²; this had an adjusted r² = 0.59, a CV error of 0.495 L·min⁻¹ and substantial repeatability (ICC = 0.93). It also had a higher agreement in classifying CRF levels (κ = 0.42) than the RDI-based model (κ = 0.29). In conclusion, this simple 45 s self-test can be used to estimate and classify CRF in healthy individuals with moderate accuracy and large repeatability when HR recovery features are included.
Jongin Kim; Boreom Lee
2017-07-01
The classification of neuroimaging data for the diagnosis of Alzheimer's Disease (AD) is one of the main research goals of the neuroscience and clinical fields. In this study, we used an extreme learning machine (ELM) classifier to discriminate AD and mild cognitive impairment (MCI) from normal control (NC). We compared the performance of ELM with that of a linear kernel support vector machine (SVM) on 718 structural MRI images from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The data consisted of normal control, MCI converter (MCI-C), MCI non-converter (MCI-NC), and AD. We employed the SVM-based recursive feature elimination (RFE-SVM) algorithm to find the optimal subset of features. We found that the RFE-SVM feature selection approach in combination with ELM shows classification accuracy superior to that of the linear kernel SVM for structural T1 MRI data.
Ebad-Allah, J; Baldassarre, L; Sing, M; Claessen, R; Brabers, V A M; Kuntscher, C A
2013-01-23
The optical properties of magnetite at room temperature were studied by infrared reflectivity measurements as a function of pressure up to 8 GPa. The optical conductivity spectrum consists of a Drude term, two sharp phonon modes, a far-infrared band at around 600 cm⁻¹ and a pronounced mid-infrared absorption band. With increasing pressure both absorption bands shift to lower frequencies and the phonon modes harden in a linear fashion. Based on the shape of the MIR band, the temperature dependence of the dc transport data, and the occurrence of the far-infrared band in the optical conductivity spectrum, the polaronic coupling strength in magnetite at room temperature should be classified as intermediate. For the lower energy phonon mode an abrupt increase of the linear pressure coefficient occurs at around 6 GPa, which could be attributed to minor alterations of the charge distribution among the different Fe sites.
An improved SRC method based on virtual samples for face recognition
NASA Astrophysics Data System (ADS)
Fu, Lijun; Chen, Deyun; Lin, Kezheng; Li, Ao
2018-07-01
The sparse representation classifier (SRC) performs classification by evaluating which class leads to the minimum representation error. However, in the real world, the number of available training samples is limited and, owing to noise interference, the training samples cannot accurately represent the test sample linearly. Therefore, in this paper, we first produce virtual samples by exploiting the original training samples with the aim of increasing the number of training samples. Then, we take the intra-class difference as a data representation of partial noise, and utilize the intra-class differences and training samples simultaneously to represent the test sample in a linear way according to the theory of the SRC algorithm. Using weighted score-level fusion, the respective representation scores of the virtual samples and the original training samples are fused together to obtain the final classification results. The experimental results on multiple face databases show that our proposed method has a very satisfactory classification performance.
Milenković, Jana; Dalmış, Mehmet Ufuk; Žgajnar, Janez; Platel, Bram
2017-09-01
New ultrafast view-sharing sequences have enabled breast dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) to be performed at high spatial and temporal resolution. The aim of this study is to evaluate the diagnostic potential of textural features that quantify the spatiotemporal changes of the contrast-agent uptake in computer-aided diagnosis of malignant and benign breast lesions imaged with high spatial and temporal resolution DCE-MRI. The proposed approach is based on the textural analysis quantifying the spatial variation of six dynamic features of the early-phase contrast-agent uptake of a lesion's largest cross-sectional area. The textural analysis is performed by means of the second-order gray-level co-occurrence matrix, gray-level run-length matrix and gray-level difference matrix. This yields 35 textural features to quantify the spatial variation of each of the six dynamic features, providing a feature set of 210 features in total. The proposed feature set is evaluated based on receiver operating characteristic (ROC) curve analysis in a cross-validation scheme for random forests (RF) and two support vector machine classifiers, with linear and radial basis function (RBF) kernel. Evaluation is done on a dataset with 154 breast lesions (83 malignant and 71 benign) and compared to a previous approach based on 3D morphological features and the average and standard deviation of the same dynamic features over the entire lesion volume as well as their average for the smaller region of the strongest uptake rate. The area under the ROC curve (AUC) obtained by the proposed approach with the RF classifier was 0.8997, which was significantly higher (P = 0.0198) than the performance achieved by the previous approach (AUC = 0.8704) on the same dataset. 
Similarly, the proposed approach obtained a significantly higher result for both SVM classifiers with RBF (P = 0.0096) and linear kernel (P = 0.0417) obtaining AUC of 0.8876 and 0.8548, respectively, compared to AUC values of previous approach of 0.8562 and 0.8311, respectively. The proposed approach based on 2D textural features quantifying spatiotemporal changes of the contrast-agent uptake significantly outperforms the previous approach based on 3D morphology and dynamic analysis in differentiating the malignant and benign breast lesions, showing its potential to aid clinical decision making. © 2017 American Association of Physicists in Medicine.
Novel layered clustering-based approach for generating ensemble of classifiers.
Rahman, Ashfaqur; Verma, Brijesh
2011-05-01
This paper introduces a novel concept for creating an ensemble of classifiers. The concept is based on generating an ensemble of classifiers through clustering of data at multiple layers. The ensemble classifier model generates a set of alternative clustering of a dataset at different layers by randomly initializing the clustering parameters and trains a set of base classifiers on the patterns at different clusters in different layers. A test pattern is classified by first finding the appropriate cluster at each layer and then using the corresponding base classifier. The decisions obtained at different layers are fused into a final verdict using majority voting. As the base classifiers are trained on overlapping patterns at different layers, the proposed approach achieves diversity among the individual classifiers. Identification of difficult-to-classify patterns through clustering as well as achievement of diversity through layering leads to better classification results as evidenced from the experimental results.
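The test-time procedure described above (find the cluster at each layer, apply that cluster's base classifier, fuse by majority vote) might be sketched like this. The layer structure, cluster centres and constant base classifiers are all invented for illustration; the training of clusterings and base classifiers is not shown.

```python
from collections import Counter


def nearest_centre(centres, x):
    """Index of the cluster centre closest to x (squared Euclidean distance)."""
    return min(
        range(len(centres)),
        key=lambda k: sum((c - v) ** 2 for c, v in zip(centres[k], x)),
    )


def classify(layers, x):
    """layers is a list of (centres, classifiers) pairs, one per layer;
    classifiers[k] is the base classifier trained on cluster k of that layer.
    Each layer casts one vote and the most common label wins."""
    votes = [
        classifiers[nearest_centre(centres, x)](x)
        for centres, classifiers in layers
    ]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical ensemble: three layers, two clusters each, constant classifiers.
layers = [
    ([[0, 0], [10, 10]], [lambda x: "a", lambda x: "b"]),
    ([[0, 10], [10, 0]], [lambda x: "a", lambda x: "b"]),
    ([[5, 5], [20, 20]], [lambda x: "b", lambda x: "a"]),
]
label_low = classify(layers, [1, 2])
label_high = classify(layers, [9, 8])
```

With an even number of layers a tie is possible; this sketch then returns whichever tied label was voted first, which is only one of several reasonable tie-breaking choices.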
A mechatronics platform to study prosthetic hand control using EMG signals.
Geethanjali, P
2016-09-01
In this paper, a low-cost mechatronics platform for the design and development of robotic hands as well as a surface electromyogram (EMG) pattern recognition system is proposed. This paper also explores various EMG classification techniques using a low-cost electronics system in prosthetic hand applications. The proposed platform involves the development of a four-channel EMG signal acquisition system; pattern recognition of acquired EMG signals; and development of a digital controller for a robotic hand. Four-channel surface EMG signals, acquired from ten healthy subjects for six different movements of the hand, were used to analyse pattern recognition in prosthetic hand control. Various time domain features were extracted and grouped into five ensembles to compare the influence of features in feature-selective classifiers (SLR) with widely considered non-feature-selective classifiers, such as neural networks (NN), linear discriminant analysis (LDA) and support vector machines (SVM) applied with different kernels. The results revealed that the average classification accuracy of the SVM with a linear kernel function outperforms that of the other classifiers with feature ensembles, Hudgin's feature set and auto regression (AR) coefficients. However, the slight improvement in classification accuracy of SVM incurs more processing time and memory space in the low-level controller. The Kruskal-Wallis (KW) test also shows that there is no significant difference between the classification performance of SLR with Hudgin's feature set and that of SVM with Hudgin's features along with AR coefficients. In addition, the KW test shows that SLR was found to be better with respect to computation time and memory space, which is vital in a low-level controller. Similar to SVM with a linear kernel function, the other non-feature-selective LDA and NN classifiers also show a slight improvement in performance using twice the features but with the drawback of increased memory space requirement and time. 
This prototype facilitated the study of various issues of pattern recognition and identified an efficient classifier, along with a feature ensemble, in the implementation of EMG controlled prosthetic hands in a laboratory setting at low-cost. This platform may help to motivate and facilitate prosthetic hand research in developing countries.
Incremental Structured Dictionary Learning for Video Sensor-Based Object Tracking
Xue, Ming; Yang, Hua; Zheng, Shibao; Zhou, Yi; Yu, Zhenghua
2014-01-01
To tackle robust object tracking for video sensor-based applications, an online discriminative algorithm based on incremental discriminative structured dictionary learning (IDSDL-VT) is presented. In our framework, a discriminative dictionary combining positive, negative and trivial patches is designed to sparsely represent the overlapped target patches. Then, a local update (LU) strategy is proposed for sparse coefficient learning. To formulate the training and classification process, a multiple linear classifier group based on a K-combined voting (KCV) function is proposed. As the dictionary evolves, the models are also trained to adapt promptly to variations in target appearance. Qualitative and quantitative evaluations on challenging image sequences, compared with state-of-the-art algorithms, demonstrate that the proposed tracking algorithm achieves a more favorable performance. We also illustrate its relay application in visual sensor networks. PMID:24549252
Using linear algebra for protein structural comparison and classification
2009-01-01
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in. PMID:21637532
Using linear algebra for protein structural comparison and classification.
Gomide, Janaína; Melo-Minardi, Raquel; Dos Santos, Marcos Augusto; Neshich, Goran; Meira, Wagner; Lopes, Júlio César; Santoro, Marcelo
2009-07-01
In this article, we describe a novel methodology to extract semantic characteristics from protein structures using linear algebra in order to compose structural signature vectors which may be used efficiently to compare and classify protein structures into fold families. These signatures are built from the pattern of hydrophobic intrachain interactions using Singular Value Decomposition (SVD) and Latent Semantic Indexing (LSI) techniques. Considering proteins as documents and contacts as terms, we have built a retrieval system which is able to find conserved contacts in samples of myoglobin fold family and to retrieve these proteins among proteins of varied folds with precision of up to 80%. The classifier is a web tool available at our laboratory website. Users can search for similar chains from a specific PDB, view and compare their contact maps and browse their structures using a JMol plug-in.
Watari, Ricky; Kobsar, Dylan; Phinyomark, Angkoon; Osis, Sean; Ferber, Reed
2016-10-01
Not all patients with patellofemoral pain exhibit successful outcomes following exercise therapy. Thus, the ability to identify patellofemoral pain subgroups related to treatment response is important for the development of optimal therapeutic strategies to improve rehabilitation outcomes. The purpose of this study was to use baseline running gait kinematic and clinical outcome variables to classify patellofemoral pain patients on treatment response retrospectively. Forty-one individuals with patellofemoral pain who underwent a 6-week exercise intervention program were sub-grouped as treatment Responders (n=28) and Non-responders (n=13) based on self-reported measures of pain and function. Baseline three-dimensional running kinematics and self-reported measures underwent a linear discriminant analysis of the principal components of the variables to retrospectively classify participants based on treatment response. The significance of the discriminant function was verified with a Wilks' lambda test (α=0.05). The model selected 2 gait principal components and had a 78.1% classification accuracy. Overall, Non-responders exhibited greater ankle dorsiflexion, knee abduction and hip flexion during the swing phase and greater ankle inversion during the stance phase, compared to Responders. This is the first study to investigate an objective method that uses baseline kinematic and self-report outcome variables to classify patellofemoral pain treatment outcome. This study represents a significant first step towards a method to help clinicians make evidence-informed decisions regarding optimal treatment strategies for patients with patellofemoral pain. Copyright © 2016 Elsevier Ltd. All rights reserved.
Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter
2010-12-01
Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive and under-investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. The optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble those of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).
NASA Astrophysics Data System (ADS)
Talai, Sahand; Boelmans, Kai; Sedlacik, Jan; Forkert, Nils D.
2017-03-01
Parkinsonian syndromes encompass a spectrum of neurodegenerative diseases, which can be classified into various subtypes. The differentiation of these subtypes is typically conducted based on clinical criteria. Due to the overlap of intra-syndrome symptoms, accurate differential diagnosis based on clinical guidelines remains a challenge, with failure rates of up to 25%. The aim of this study is to present an image-based classification method for patients with Parkinson's disease (PD) and patients with progressive supranuclear palsy (PSP), an atypical variant of PD. To this end, apparent diffusion coefficient (ADC) parameter maps were calculated based on diffusion-tensor magnetic resonance imaging (MRI) datasets. Mean ADC values were determined in 82 brain regions using an atlas-based approach. The extracted mean ADC values for each patient were then used as features for classification using a linear kernel support vector machine classifier. To increase the classification accuracy, a feature selection was performed, which resulted in the top 17 attributes being used as the final input features. A leave-one-out cross-validation based on 56 PD and 21 PSP subjects revealed that the proposed method is capable of differentiating PD and PSP patients with an accuracy of 94.8%. In conclusion, the classification of PD and PSP patients based on ADC features obtained from diffusion MRI datasets is a promising new approach for the differentiation of Parkinsonian syndromes in the broader context of decision support systems.
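Leave-one-out cross-validation holds out each subject once, trains on the rest, and reports the fraction classified correctly; with 56 + 21 = 77 subjects, 94.8% corresponds to 73 correct classifications. The sketch below illustrates the procedure only. It swaps in a nearest-centroid classifier for the linear kernel SVM used in the study, and the feature values are invented rather than ADC measurements.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (assumption, not the study's SVM): assign x to the
    class whose mean feature vector is closest in squared Euclidean distance."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)


def leave_one_out_accuracy(xs, ys, predict=nearest_centroid_predict):
    """Hold out each sample in turn, train on the others, count correct predictions."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Invented two-feature toy data with well separated classes.
xs = [[0.70, 0.80], [0.72, 0.79], [0.69, 0.82],
      [0.90, 1.00], [0.92, 0.98], [0.88, 1.02]]
ys = ["PD", "PD", "PD", "PSP", "PSP", "PSP"]
accuracy = leave_one_out_accuracy(xs, ys)
```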
Perception of olive oils sensory defects using a potentiometric taste device.
Veloso, Ana C A; Silva, Lucas M; Rodrigues, Nuno; Rebello, Ligia P G; Dias, Luís G; Pereira, José A; Peres, António M
2018-01-01
The capability of perceiving olive oils' sensory defects and their intensities plays a key role in olive oil quality grade classification, since olive oils can only be classified as extra-virgin if no defect can be perceived by a trained human sensory panel. Otherwise, olive oils may be classified as virgin or lampante depending on the median intensity of the defect predominantly perceived and on the physicochemical levels. However, sensory analysis is time-consuming and requires an official sensory panel, which can only evaluate a low number of samples per day. In this work, the potential use of an electronic tongue as a taste sensor device to identify the defect predominantly perceived in olive oils was evaluated. The potentiometric profiles recorded showed that intra- and inter-day signal drifts could be neglected (i.e., relative standard deviations lower than 25%). The effect of the analysis day on the overall recorded E-tongue sensor fingerprints was not statistically significant (P-value = 0.5715, for multivariate analysis of variance using Pillai's trace test), whereas the fingerprints differed significantly according to the olive oils' sensory defect (P-value = 0.0084, for multivariate analysis of variance using Pillai's trace test). Thus, a linear discriminant model based on 19 potentiometric signal sensors, selected by the simulated annealing algorithm, could be established to correctly predict the olive oil main sensory defect (fusty, rancid, wet-wood or winey-vinegary) with an average sensitivity of 75 ± 3% and specificity of 73 ± 4% (repeated K-fold cross-validation variant: 4 folds × 10 repeats). Similarly, a linear discriminant model, based on 24 selected sensors, correctly classified 92 ± 3% of the olive oils as virgin or lampante, with an average specificity of 93 ± 3%. 
The overall satisfactory predictive performances strengthen the feasibility of the developed taste sensor device as a complementary methodology for olive oils' defects analysis and subsequent quality grade classification. Furthermore, the capability of identifying the type of sensory defect of an olive oil may allow establishing helpful insights regarding bad practices of olives or olive oils production, harvesting, transport and storage. Copyright © 2017 Elsevier B.V. All rights reserved.
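The repeated K-fold scheme quoted above (4 folds × 10 repeats) and the sensitivity and specificity it averages can be sketched as follows. The sample count, seed and labels are hypothetical, and the one-vs-rest treatment of a single "positive" class is an assumption about how per-class figures might be computed, not a description of the authors' code.

```python
import random


def repeated_kfold(n_samples, n_folds=4, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx); every repeat reshuffles the
    samples and splits them into n_folds disjoint test sets."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        order = list(range(n_samples))
        rng.shuffle(order)
        folds = [order[i::n_folds] for i in range(n_folds)]
        for fold, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in order if i not in held_out]
            yield repeat, fold, sorted(train_idx), sorted(test_idx)


def sensitivity_specificity(y_true, y_pred, positive):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP),
    treating `positive` as the class of interest and all others as negative."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    return sensitivity, specificity


splits = list(repeated_kfold(n_samples=20))  # 20 samples is a made-up count
sens, spec = sensitivity_specificity(
    ["rancid", "rancid", "fusty", "fusty", "rancid"],
    ["rancid", "fusty", "fusty", "rancid", "rancid"],
    positive="rancid",
)
```

Averaging the per-split metrics over all 40 splits and reporting their spread gives figures of the "mean ± deviation" form used in the abstract.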
Chemical data as markers of the geographical origins of sugarcane spirits.
Serafim, F A T; Pereira-Filho, Edenir R; Franco, D W
2016-04-01
In an attempt to classify sugarcane spirits according to their geographic region of origin, chemical data for 24 analytes were evaluated in 50 cachaças produced using a similar procedure in selected regions of Brazil: São Paulo - SP (15), Minas Gerais - MG (11), Rio de Janeiro - RJ (11), Paraíba - PB (9), and Ceará - CE (4). Multivariate analysis was applied to the analytical results, and the predictive abilities of different classification methods were evaluated. Principal component analysis identified five groups, and chemical similarities were observed between MG and SP samples and between RJ and PB samples. CE samples presented a distinct chemical profile. Among the samples, partial least squares discriminant analysis (PLS-DA) classified 50.2% of the samples correctly, K-nearest neighbor (KNN) 86%, and soft independent modeling of class analogy (SIMCA) 56.2%. Therefore, in this proof of concept demonstration, the proposed approach based on chemical data satisfactorily predicted the cachaças' geographic origins. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Technical Reports Server (NTRS)
Alexandrov, Natalia (Technical Monitor); Kuby, Michael; Tierney, Sean; Roberts, Tyler; Upchurch, Christopher
2005-01-01
This report reviews six classes of models that are used for studying transportation network topologies. The report is motivated by two main questions. First, what can the "new science" of complex networks (scale-free, small-world networks) contribute to our understanding of transport network structure, compared to more traditional methods? Second, how can geographic information systems (GIS) contribute to studying transport networks? The report defines terms that can be used to classify different kinds of models by their function, composition, mechanism, spatial and temporal dimensions, certainty, linearity, and resolution. Six broad classes of models for analyzing transport network topologies are then explored: GIS; static graph theory; complex networks; mathematical programming; simulation; and agent-based modeling. Each class of models is defined and classified according to the attributes introduced earlier. The paper identifies some typical types of research questions about network structure that have been addressed by each class of model in the literature.
Mild Depression Detection of College Students: an EEG-Based Solution with Free Viewing Tasks.
Li, Xiaowei; Hu, Bin; Shen, Ji; Xu, Tingting; Retcliffe, Martyn
2015-12-01
Depression is a common mental disorder with growing prevalence; however, current diagnoses of depression are hampered by patient denial, reliance on clinical experience, and subjective biases from self-report. By using a combination of linear and nonlinear EEG features, we aim to develop a more accurate and objective approach to depression detection that supports the process of diagnosis and assists the monitoring of risk factors. Using a kNN classifier on EEG features recorded during a free viewing task, we discriminated depressed from non-depressed subjects with an accuracy of 99.1%, which is, to our knowledge, the highest reported so far. Furthermore, through correlation analysis, the performance on each electrode was compared to assess the feasibility of a depression detection system based on single-channel EEG recording. Combined with wearable EEG collecting devices, our method offers the possibility of a cost-effective, wearable, ubiquitous system for doctors to monitor their patients with depression, and for normal people to understand their mental states in time.
Raina, A; Hennessy, R; Rains, M; Allred, J; Hirshburg, J M; Diven, D G; Markey, M K
2016-08-01
Traditional metrics for evaluating the severity of psoriasis are subjective, which complicates efforts to measure effective treatments in clinical trials. We collected images of psoriasis plaques and calibrated the coloration of the images according to an included color card. Features were extracted from the images and used to train a linear discriminant analysis classifier with cross-validation to automatically classify the degree of erythema. The results were tested against numerical scores obtained by a panel of dermatologists using a standard rating system. Quantitative measures of erythema based on the digital color images showed good agreement with subjective assessment of erythema severity (κ = 0.4203). The color calibration process improved the agreement from κ = 0.2364 to κ = 0.4203. We propose a method for the objective measurement of the psoriasis severity parameter of erythema and show that the calibration process improved the results. © 2015 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
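The agreement statistic κ quoted above can be illustrated with a minimal sketch of unweighted Cohen's kappa. The abstract does not say whether a weighted or unweighted variant was used, so the unweighted form is an assumption here, and the grade lists below are hypothetical values for illustration only, not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical erythema grades (0-3): dermatologist panel vs. classifier.
panel = [0, 1, 1, 2, 2, 3, 3, 1, 0, 2]
classifier = [0, 1, 2, 2, 1, 3, 2, 1, 0, 2]
kappa = cohen_kappa(panel, classifier)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is lower than the plain fraction of matching grades.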
Ogawa, Takeshi; Hirayama, Jun-Ichiro; Gupta, Pankaj; Moriya, Hiroki; Yamaguchi, Shumpei; Ishikawa, Akihiro; Inoue, Yoshihiro; Kawanabe, Motoaki; Ishii, Shin
2015-08-01
Smart houses for elderly or physically challenged people need a method to understand residents' intentions during their daily-living behaviors. To explore a new possibility, we developed a novel brain-machine interface (BMI) system integrated with an experimental smart house, based on a prototype of a wearable near-infrared spectroscopy (NIRS) device, and verified the system in a specific task of controlling the house's equipment with the BMI. We recorded NIRS signals of three participants during typical daily-living actions (DLAs), and classified them with a linear support vector machine. In our off-line analysis, four DLAs were classified at about 70% mean accuracy, significantly above the chance level of 25%, in every participant. In an online demonstration in the real smart house, one participant successfully controlled three target appliances by BMI at 81.3% accuracy. Thus we demonstrated the feasibility of using NIRS-BMI in real smart houses, which may enhance new assistive smart-home technologies.
Color image segmentation with support vector machines: applications to road signs detection.
Cyganek, Bogusław
2008-08-01
In this paper we propose an efficient color segmentation method based on the Support Vector Machine classifier operating in a one-class mode. The method has been developed especially for a road sign recognition system, although it can be used in other applications. The main advantage of the proposed method comes from the fact that the segmentation of characteristic colors is performed not in the original space but in a higher dimensional feature space. In this way a better encapsulation of the data with a linear hypersphere can usually be achieved. Moreover, the classifier does not try to capture the whole distribution of the input data, which is often difficult to achieve. Instead, characteristic data samples, called support vectors, are selected, which allow construction of the tightest hypersphere that encloses the majority of the input data. Classification of a test sample then simply consists of measuring its distance to the centre of the found hypersphere. The experimental results show high accuracy and speed of the proposed method.
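The distance-to-centre decision rule described above can be sketched in a deliberately simplified form. This is not the one-class SVM of the paper: it works directly in the input color space, uses the sample mean as the centre, and picks the radius from a distance quantile. The function names, the `coverage` parameter, and the RGB values are all hypothetical.

```python
import math


def fit_hypersphere(samples, coverage=0.9):
    """Return (centre, radius) of a sphere around the sample mean.

    The radius is the distance at position int(coverage * (n - 1)) in the
    sorted distance list, so the most distant samples are left outside.
    """
    dim = len(samples[0])
    centre = [sum(s[d] for s in samples) / len(samples) for d in range(dim)]
    distances = sorted(math.dist(s, centre) for s in samples)
    index = int(coverage * (len(distances) - 1))
    return centre, distances[index]


def is_inside(point, centre, radius):
    """Classify a test point by its distance to the sphere centre."""
    return math.dist(point, centre) <= radius


# Hypothetical RGB training pixels: four reddish sign pixels and one grey outlier.
red_pixels = [(200, 30, 30), (210, 25, 35), (190, 40, 28), (205, 35, 32), (120, 120, 120)]
centre, radius = fit_hypersphere(red_pixels, coverage=0.8)
```

A kernel-based one-class SVM would apply the same test in the induced feature space, where the enclosing sphere can follow the data more tightly than a sphere in raw RGB.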
Single-cultivar extra virgin olive oil classification using a potentiometric electronic tongue.
Dias, Luís G; Fernandes, Andreia; Veloso, Ana C A; Machado, Adélio A S C; Pereira, José A; Peres, António M
2014-10-01
Label authentication of monovarietal extra virgin olive oils is of great importance. A novel approach based on a potentiometric electronic tongue is proposed to classify oils obtained from single olive cultivars (Portuguese cvs. Cobrançosa, Madural, Verdeal Transmontana; Spanish cvs. Arbequina, Hojiblanca, Picual). A meta-heuristic simulated annealing algorithm was applied to select the most informative sets of sensors to establish predictive linear discriminant models. Olive oils were correctly classified according to olive cultivar (sensitivities greater than 97%) and each Spanish olive oil was satisfactorily discriminated from the Portuguese ones with the exception of cv. Arbequina (sensitivities from 61% to 98%). Also, the discriminant ability was related to the polar compounds contents of olive oils and so, indirectly, with organoleptic properties like bitterness, astringency or pungency. Therefore the proposed E-tongue can be foreseen as a useful auxiliary tool for trained sensory panels for the classification of monovarietal extra virgin olive oils. Copyright © 2014 Elsevier Ltd. All rights reserved.
Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections
Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre
2009-01-01
Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, is used as an adjunct to clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid such misleading outcomes, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932
A paper-based cantilever array sensor: Monitoring volatile organic compounds with naked eye.
Fraiwan, Arwa; Lee, Hankeun; Choi, Seokheun
2016-09-01
Volatile organic compound (VOC) detection is critical for controlling industrial and commercial emissions, environmental monitoring, and public health. Simple, portable, rapid and low-cost VOC sensing platforms offer the benefits of on-site and real-time monitoring anytime and anywhere. The best and most practically useful approaches to monitoring would include equipment-free and power-free detection by the naked eye. In this work, we created a novel, paper-based cantilever sensor array that allows simple and rapid naked-eye VOC detection without the need for power, electronics or readout interface/equipment. This simple VOC detection method was achieved using (i) low-cost paper materials as a substrate and (ii) swellable thin polymers adhered to the paper. Upon exposure to VOCs, the polymer swelling adhered to the paper-based cantilever, inducing mechanical deflection that generated a distinctive composite pattern of the deflection angles for a specific VOC. The angle is directly measured by the naked eye on a 3-D protractor printed on a paper facing the cantilevers. The generated angle patterns are subjected to statistical algorithms (linear discriminant analysis (LDA)) to classify each VOC sample and selectively detect a VOC. We classified four VOC samples with 100% accuracy using LDA. Copyright © 2016 Elsevier B.V. All rights reserved.
Single-trial lie detection using a combined fNIRS-polygraph system
Bhutta, M. Raheel; Hong, Melissa J.; Kim, Yun-Hee; Hong, Keum-Shik
2015-01-01
Deception is a human behavior that many people experience in daily life. It involves complex neuronal activities in addition to several physiological changes in the body. A polygraph, which can measure some of the physiological responses from the body, has been widely employed in lie-detection. Many researchers, however, believe that lie detection can become more precise if the neuronal changes that occur in the process of deception can be isolated and measured. In this study, we combine both measures (i.e., physiological and neuronal changes) for enhanced lie-detection. Specifically, to investigate the deception-related hemodynamic response, functional near-infrared spectroscopy (fNIRS) is applied at the prefrontal cortex besides a commercially available polygraph system. A mock crime scenario with a single-trial stimulus is set up as a deception protocol. The acquired data are classified into “true” and “lie” classes based on the fNIRS-based hemoglobin-concentration changes and polygraph-based physiological signal changes. Linear discriminant analysis is utilized as a classifier. The results indicate that the combined fNIRS-polygraph system delivers much higher classification accuracy than that of a singular system. This study demonstrates a plausible solution toward single-trial lie-detection by combining fNIRS and the polygraph. PMID:26082733
A Hybrid Sensing Approach for Pure and Adulterated Honey Classification
Subari, Norazian; Saleh, Junita Mohamad; Shakaff, Ali Yeon Md; Zakaria, Ammar
2012-01-01
This paper presents a comparison between data from single modality and fusion methods to classify Tualang honey as pure or adulterated using Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA) statistical classification approaches. Ten different brands of certified pure Tualang honey were obtained throughout peninsular Malaysia and Sumatera, Indonesia. Various concentrations of two types of sugar solution (beet and cane sugar) were used in this investigation to create honey samples of 20%, 40%, 60% and 80% adulteration concentrations. Honey data extracted from an electronic nose (e-nose) and Fourier Transform Infrared Spectroscopy (FTIR) were gathered, analyzed and compared based on fusion methods. Visual observation of classification plots revealed that the PCA approach was able to distinguish pure and adulterated honey samples better than the LDA technique. Overall, the validated classification results based on FTIR data (88.0%) gave higher classification accuracy than e-nose data (76.5%) using the LDA technique. Honey classification based on normalized low-level and intermediate-level FTIR and e-nose fusion data scored classification accuracies of 92.2% and 88.7%, respectively, using the Stepwise LDA method. The results suggested that pure and adulterated honey samples were better classified using FTIR and e-nose fusion data than single modality data. PMID:23202033
A synteny-based draft genome sequence of the forage grass Lolium perenne.
Byrne, Stephen L; Nagy, Istvan; Pfeifer, Matthias; Armstead, Ian; Swain, Suresh; Studer, Bruno; Mayer, Klaus; Campbell, Jacqueline D; Czaban, Adrian; Hentrup, Stephan; Panitz, Frank; Bendixen, Christian; Hedegaard, Jakob; Caccamo, Mario; Asp, Torben
2015-11-01
Here we report the draft genome sequence of perennial ryegrass (Lolium perenne), an economically important forage and turf grass species that is widely cultivated in temperate regions worldwide. It is classified along with wheat, barley, oats and Brachypodium distachyon in the Pooideae sub-family of the grass family (Poaceae). Transcriptome data was used to identify 28,455 gene models, and we utilized macro-co-linearity between perennial ryegrass and barley, and synteny within the grass family, to establish a synteny-based linear gene order. The gametophytic self-incompatibility mechanism enables the pistil of a plant to reject self-pollen and therefore promote out-crossing. We have used the sequence assembly to characterize transcriptional changes in the stigma during pollination with both compatible and incompatible pollen. Characterization of the pollen transcriptome identified homologs to pollen allergens from a range of species, many of which were expressed to very high levels in mature pollen grains, and are potentially involved in the self-incompatibility mechanism. The genome sequence provides a valuable resource for future breeding efforts based on genomic prediction, and will accelerate the development of new varieties for more productive grasslands. © 2015 The Authors The Plant Journal © 2015 John Wiley & Sons Ltd.
Diagnosis of Tempromandibular Disorders Using Local Binary Patterns
Haghnegahdar, A.A.; Kolahi, S.; Khojastepour, L.; Tajeripour, F.
2018-01-01
Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histograms of oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and two coronal cuts were prepared from each condyle, with the images limited to the head of the mandibular condyle. To extract image features, we first use LBP and then the histogram of oriented gradients. To reduce dimensionality, Singular Value Decomposition (SVD) is applied to the matrix of feature vectors of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used the Receiver Operating Characteristic (ROC) to evaluate the hypothesis. Results: The K nearest neighbor classifier achieves a very good accuracy (0.9242); moreover, it has desirable sensitivity (0.9470) and specificity (0.9015), whereas the other classifiers have lower accuracy, sensitivity and specificity. Conclusion: We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN was the best classifier in our experiments for distinguishing patients from healthy individuals, with 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages. PMID:29732343
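As a quick check on how the three reported figures relate, the sketch below computes accuracy, sensitivity and specificity from confusion-matrix counts. The abstract does not give the counts; the values used here are hypothetical, chosen only because they are consistent with 132 cases per group and with the reported rates (the true evaluation may have used a different number of images, for example two cuts per condyle).

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true-positive rate among TMD cases
        "specificity": tn / (tn + fp),  # true-negative rate among normal cases
    }


# Hypothetical counts: 132 TMD joints and 132 normal joints.
metrics = binary_metrics(tp=125, fn=7, tn=119, fp=13)
```

With balanced groups, accuracy is simply the mean of sensitivity and specificity, which matches the relationship between the three numbers in the abstract.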
Beneito-Cambra, M; Bernabé-Zafón, V; Herrero-Martínez, J M; Simó-Alfonso, E F; Ramis-Ramos, G
2009-07-15
The enzymes present in raw materials of the cleaning industry (enzyme industrial concentrates) and in household cleaners were isolated by precipitation with acetone and hydrolyzed with HCl. The resulting amino acids were derivatized with o-phthaldialdehyde, and the derivatives were separated by HPLC. The peaks of 14 amino acids were observed using a C18 column and a multi-segmented gradient of acetonitrile-water in the presence of a 5 mM citric/citrate buffer of pH 6.5. Using either normalized peak areas (divided by the sum of the peak areas of the chromatogram) or ratios of pairs of peak areas as predictor variables, linear discriminant analysis models, capable of predicting the enzyme class, including proteases, lipases, amylases and cellulases, were constructed. For this purpose, both enzyme industrial concentrates and detergent bases spiked with them were included in the training set. In all cases, the enzymes of the evaluation set, including industrial concentrates, spiked detergent bases and commercial cleaners were correctly classified with assignment probabilities higher than 99%.
Torija, Antonio J; Ruiz, Diego P
2012-10-01
Road traffic has a heavy impact on the urban sound environment, constituting the main source of noise and widely dominating its spectral composition. In this context, our research investigates the use of recorded sound spectra as input data for the development of real-time short-term road traffic flow estimation models. For this, a series of models based on the use of Multilayer Perceptron Neural Networks, multiple linear regression, and the Fisher linear discriminant were implemented to estimate road traffic flow as well as to classify it according to the composition of heavy vehicles and motorcycles/mopeds. In view of the results, the use of the 50-400 Hz and 1-2.5 kHz frequency ranges as input variables in multilayer perceptron-based models successfully estimated urban road traffic flow with an average percentage of explained variance equal to 86%, while the classification of the urban road traffic flow gave an average success rate of 96.1%. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Donroman, T.; Chesoh, S.; Lim, A.
2018-04-01
This study aimed to investigate the variation patterns of fish fingerling abundance by month, year and sampling site. A monthly data set collected from the Na Thap tidal river of southern Thailand was obtained from June 2005 to October 2015. A square root transformation was employed to bring the fingerling data closer to normality. Factor analysis was applied to cluster the fingerling species, and multiple linear regression was used to examine the association between fingerling density and year, month and site. Factor analysis classified the fingerlings into 3 factors based on salinity preference: saline water, freshwater and ubiquitous species. The results showed a highly statistically significant relation between fingerling density and month, year and site. The densities of saline water and ubiquitous fingerlings showed similar patterns. The downstream site presented the highest fingerling density, whereas almost all freshwater fingerlings occurred upstream. This finding confirmed that factor analysis and the general linear regression method can be used as an effective tool for predicting and monitoring wild fingerling density in order to sustain fish stock management.
Ishido, Masami; Suzuki, Junko
2014-02-01
Exposure to environmental neurotoxic chemicals both in utero and during the early postnatal period can cause neurodevelopmental disorders. To evaluate the disruption of neurodevelopmental programming, we previously established an in vitro neurosphere assay system using rat mesencephalic neural stem cells. Here, we extended the assay system to examine the neurodevelopmental toxicity of the endocrine disruptors butyl benzyl phthalate, di-n-butyl phthalate, dicyclohexyl phthalate, diethyl phthalate, di(2-ethyl hexyl) phthalate, di-n-pentyl phthalate, and dihexyl phthalate at a range of concentrations (0-100 μM). All phthalates tested inhibited cell migration, with a linear or non-linear relationship between migration distance and the logarithm of the phthalate concentration. On the other hand, some, but not all, phthalates decreased the number of proliferating cells. Apoptotic cells were not observed upon phthalate exposure under any of the conditions tested, whereas the dopaminergic toxin rotenone induced significant apoptosis. Thus, we were able to classify phthalate toxicity based on cell migration and cell proliferation using the in vitro neurosphere assay.
Local-global classifier fusion for screening chest radiographs
NASA Astrophysics Data System (ADS)
Ding, Meng; Antani, Sameer; Jaeger, Stefan; Xue, Zhiyun; Candemir, Sema; Kohli, Marc; Thoma, George
2017-03-01
Tuberculosis (TB) is a severe comorbidity of HIV and chest x-ray (CXR) analysis is a necessary step in screening for the infective disease. Automatic analysis of digital CXR images for detecting pulmonary abnormalities is critical for population screening, especially in medical resource constrained developing regions. In this article, we describe steps that improve previously reported performance of NLM's CXR screening algorithms and help advance the state of the art in the field. We propose a local-global classifier fusion method where two complementary classification systems are combined. The local classifier focuses on subtle and partial presentation of the disease leveraging information in radiology reports that roughly indicates locations of the abnormalities. In addition, the global classifier models the dominant spatial structure in the gestalt image using GIST descriptor for the semantic differentiation. Finally, the two complementary classifiers are combined using linear fusion, where the weight of each decision is calculated by the confidence probabilities from the two classifiers. We evaluated our method on three datasets in terms of the area under the Receiver Operating Characteristic (ROC) curve, sensitivity, specificity and accuracy. The evaluation demonstrates the superiority of our proposed local-global fusion method over any single classifier.
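The linear fusion step above can be sketched as a confidence-weighted average of the two classifiers' outputs. The abstract states only that the weights are calculated from the classifiers' confidence probabilities; the specific confidence formula below (distance of each probability from 0.5) and the function name `fuse` are assumptions made for illustration.

```python
def fuse(p_local, p_global):
    """Confidence-weighted linear fusion of two abnormality probabilities.

    Assumed weighting: each classifier's confidence is how far its
    probability lies from the undecided value 0.5, scaled to [0, 1].
    """
    conf_local = abs(p_local - 0.5) * 2.0
    conf_global = abs(p_global - 0.5) * 2.0
    total = conf_local + conf_global
    if total == 0.0:
        return 0.5  # neither classifier expresses a preference
    return (conf_local * p_local + conf_global * p_global) / total


# Hypothetical outputs: the local classifier is confident, the global one is not.
fused = fuse(0.9, 0.6)
```

Under this weighting the more confident classifier dominates the fused score, so a hesitant global decision cannot easily override a strong local finding, and vice versa.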
Kmeans-ICA based automatic method for ocular artifacts removal in a motor imagery classification.
Bou Assi, Elie; Rihana, Sandy; Sawan, Mohamad
2014-01-01
Electroencephalogram (EEG) recordings are used as inputs of a motor imagery based BCI system. Eye blinks contaminate the spectral content of the EEG signals. Independent Component Analysis (ICA) has already proved effective for removing these artifacts, whose frequency band overlaps with that of the EEG of interest. However, existing ICA-based methods use a reference lead such as the ElectroOculoGram (EOG) to identify the ocular artifact components. In this study, artifactual components were identified using adaptive thresholding by means of Kmeans clustering. The denoised EEG signals were fed into a feature extraction algorithm that extracts the band power, the coherence and the phase locking value, and the features were passed to a linear discriminant analysis classifier for motor imagery classification.
Wang, Tao; He, Fuhong; Zhang, Anding; Gu, Lijuan; Wen, Yangmao; Jiang, Weiguo; Shao, Hongbo
2014-01-01
This paper took a subregion in a small watershed gully system at the Beiyanzikou catchment of Qixia, China, as a study area and, using object-oriented image analysis (OBIA), extracted the shoulder lines of gullies from high spatial resolution digital orthophoto map (DOM) aerial photographs. Next, it proposed an accuracy assessment method based on the adjacent distance between the boundary classified by remote sensing and points measured by RTK-GPS along the shoulder lines of the gullies. Finally, the original surface was fitted using linear regression in accordance with the elevation of the two extracted edges of the experimental gullies, named Gully 1 and Gully 2, and the erosion volume was calculated. The results indicate that OBIA can effectively extract information on gullies; the average range difference between points measured in the field along the gully edges and the classified boundary is 0.3166 m, with a variance of 0.2116 m. The erosion area and volume of the two gullies are 2141.6250 m², 5074.1790 m³ and 1316.1250 m², 1591.5784 m³, respectively. The results of the study provide a new method for the quantitative study of small gully erosion.
Ambert, Kyle H; Cohen, Aaron M
2009-01-01
OBJECTIVE: Free-text clinical reports serve as an important part of patient care management and clinical documentation of patient disease and treatment status. Free-text notes are commonplace in medical practice, but remain an under-used source of information for clinical and epidemiological research, as well as personalized medicine. The authors explore the challenges associated with automatically extracting information from clinical reports using their submission to the Integrating Informatics with Biology and the Bedside (i2b2) 2008 Natural Language Processing Obesity Challenge Task. DESIGN: A text mining system for classifying patient comorbidity status, based on the information contained in clinical reports. The approach of the authors incorporates a variety of automated techniques, including hot-spot filtering, negated concept identification, zero-vector filtering, weighting by inverse class-frequency, and error-correcting output codes with linear support vector machines. MEASUREMENTS: Performance was evaluated in terms of the macroaveraged F1 measure. RESULTS: The automated system performed well against manual expert rule-based systems, finishing fifth in the Challenge's intuitive task, and 13th in the textual task. CONCLUSIONS: The system demonstrates that effective comorbidity status classification by an automated system is possible.
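The macroaveraged F1 measure named above can be illustrated with a minimal sketch: F1 is computed separately for each class and the per-class scores are averaged with equal weight, so rare classes count as much as common ones. The label values and the example lists are hypothetical and are not taken from the challenge data.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical comorbidity judgments: Y (present), N (absent), U (unmentioned).
gold = ["Y", "Y", "N", "N", "U", "Y"]
predicted = ["Y", "N", "N", "N", "U", "Y"]
score = macro_f1(gold, predicted)
```

Because every class contributes equally, a system that ignores a rare class is penalized far more under macro-averaging than under plain accuracy.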
Pathological speech signal analysis and classification using empirical mode decomposition.
Kaleem, Muhammad; Ghoraani, Behnaz; Guergachi, Aziz; Krishnan, Sridhar
2013-07-01
Automated classification of normal and pathological speech signals can provide an objective and accurate mechanism for pathological speech diagnosis, and is an active area of research. A large part of this research is based on analysis of acoustic measures extracted from sustained vowels. However, sustained vowels do not reflect real-world attributes of voice as effectively as continuous speech, which can take into account important attributes of speech such as rapid voice onset and termination, changes in voice frequency and amplitude, and sudden discontinuities in speech. This paper presents a methodology based on empirical mode decomposition (EMD) for classification of continuous normal and pathological speech signals obtained from a well-known database. EMD is used to decompose randomly chosen portions of speech signals into intrinsic mode functions, which are then analyzed to extract meaningful temporal and spectral features, including true instantaneous features which can capture discriminative information in signals hidden at local time-scales. A total of six features are extracted, and a linear classifier is used with the feature vector to classify continuous speech portions obtained from a database consisting of 51 normal and 161 pathological speakers. A classification accuracy of 95.7 % is obtained, thus demonstrating the effectiveness of the methodology.
Liu, Chao; Gu, Jinwei
2014-01-01
Classifying raw, unpainted materials--metal, plastic, ceramic, fabric, and so on--is an important yet challenging task for computer vision. Previous works measure subsets of surface spectral reflectance as features for classification. However, acquiring the full spectral reflectance is time consuming and error-prone. In this paper, we propose to use coded illumination to directly measure discriminative features for material classification. Optimal illumination patterns--which we call "discriminative illumination"--are learned from training samples, after projecting to which the spectral reflectance of different materials are maximally separated. This projection is automatically realized by the integration of incident light for surface reflection. While a single discriminative illumination is capable of linear, two-class classification, we show that multiple discriminative illuminations can be used for nonlinear and multiclass classification. We also show theoretically that the proposed method has higher signal-to-noise ratio than previous methods due to light multiplexing. Finally, we construct an LED-based multispectral dome and use the discriminative illumination method for classifying a variety of raw materials, including metal (aluminum, alloy, steel, stainless steel, brass, and copper), plastic, ceramic, fabric, and wood. Experimental results demonstrate its effectiveness.
High-speed potato grading and quality inspection based on a color vision system
NASA Astrophysics Data System (ADS)
Noordam, Jacco C.; Otten, Gerwoud W.; Timmermans, Toine J. M.; van Zwol, Bauke H.
2000-03-01
A high-speed machine vision system for the quality inspection and grading of potatoes has been developed. The vision system grades potatoes on size, shape and external defects such as greening, mechanical damages, rhizoctonia, silver scab, common scab, cracks and growth cracks. A 3-CCD line-scan camera inspects the potatoes in flight as they pass under the camera. The use of mirrors to obtain a 360-degree view of the potato and the lack of product holders guarantee a full view of the potato. To achieve the required capacity of 12 tons/hour, 11 SHARC Digital Signal Processors perform the image processing and classification tasks. The total capacity of the system is about 50 potatoes/sec. The color segmentation procedure uses Linear Discriminant Analysis (LDA) in combination with a Mahalanobis distance classifier to classify the pixels. The procedure for the detection of misshapen potatoes uses a Fourier based shape classification technique. Features such as area, eccentricity and central moments are used to discriminate between similar colored defects. Experiments with red and yellow skin-colored potatoes have shown that the system is robust and consistent in its classification.
Realistic Subsurface Anomaly Discrimination Using Electromagnetic Induction and an SVM Classifier
2010-01-01
proposed by Pasion and Oldenburg [25]: Q(t) = k t^(-β) e^(-γt). (10) Various combinations of these fitting parameters can be used as inputs to classifier... Pasion-Oldenburg parameters k, β, and γ for each anomaly by a direct nonlinear least-squares fit of (10) and by linear (pseudo)inversion of its...combinations of the Pasion-Oldenburg parameters. Combining k and γ yields results similar to those of k and R, as Figure 7 and Table 2 show. Figure 8 and
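One way to read the "linear (pseudo)inversion" mentioned above is to take logarithms of (10), which gives ln Q(t) = ln k - β ln t - γ t, a model that is linear in (ln k, β, γ) and can be solved by ordinary least squares. The excerpt is truncated, so whether the authors used exactly this log-linear form is an assumption; the function names and the synthetic decay data below are hypothetical.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def fit_pasion_oldenburg(times, values):
    """Log-linear least-squares estimate of (k, beta, gamma); needs t > 0, Q > 0."""
    rows = [[1.0, -math.log(t), -t] for t in times]
    targets = [math.log(q) for q in values]
    # Normal equations (A^T A) x = A^T y with x = [ln k, beta, gamma].
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(3)]
    ln_k, beta, gamma = solve_linear_system(ata, aty)
    return math.exp(ln_k), beta, gamma


# Synthetic, noise-free decay generated from known parameters.
times = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
values = [3.0 * t ** -0.7 * math.exp(-0.05 * t) for t in times]
k, beta, gamma = fit_pasion_oldenburg(times, values)
```

The log transform makes the fit cheap and free of starting guesses, but it reweights the data toward late, small-amplitude samples, which is one reason a direct nonlinear fit of (10) can give different parameter estimates on noisy measurements.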
Studies of Sea Ice Thickness and Characteristics from an Arctic Submarine Cruise
1991-01-31
decreasing slope. It is likely that at the smallest lags, the autocovariance is artificially increased because the sonar had a beamwidth of about...region. Class F: Narrow linear lines of very bright (white) return. Class G: The remaining area is 'matrix', a mottled region of mid-grey and white...classified SAR feature map was digitised in the same way as the classified sidescan data. 15.8 SAR Statistics Statistics of the SAR features (A to G) were
Optimizing Support Vector Machine Parameters with Genetic Algorithm for Credit Risk Assessment
NASA Astrophysics Data System (ADS)
Manurung, Jonson; Mawengkang, Herman; Zamzami, Elviawaty
2017-12-01
Support vector machine (SVM) is a popular classification method known to have strong generalization capabilities. SVM can solve both classification and regression problems, using either linear or nonlinear kernels. However, SVM also has a weakness: it is difficult to determine the optimal parameter values. SVM calculates the best linear separator in the input feature space according to the training data. To classify data that are not linearly separable, SVM uses the kernel trick to transform the data into linearly separable data in a higher dimensional feature space. The kernel trick uses various kinds of kernel functions, such as the linear, polynomial, radial basis function (RBF) and sigmoid kernels. Each function has parameters which affect the accuracy of SVM classification. To solve this problem, genetic algorithms are proposed as the search algorithm for the optimal parameter values, thus increasing the classification accuracy of SVM. Data were taken from the UCI machine learning repository: Australian Credit Approval. The results show that the combination of SVM and genetic algorithms is effective in improving classification accuracy. Genetic algorithms have been shown to be effective in systematically finding optimal kernel parameters for SVM, instead of randomly selected kernel parameters. Compared with the accuracies of each kernel (linear: 85.12%, polynomial: 81.76%, RBF: 77.22%, sigmoid: 78.70%), the best accuracy was improved. However, for bigger data sizes, this method is not practical because it takes a lot of time.
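The four kernel families listed above can be written out in a few lines, which makes it clearer what a parameter search has to tune. This is a minimal sketch using common textbook forms; the parameter names `gamma`, `degree` and `coef0` and their default values are assumptions, and the abstract does not state which parameterization the authors used.

```python
import math


def linear_kernel(x, y):
    """Plain dot product: no kernel parameters to tune."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * <x, y> + coef0) ** degree."""
    return (gamma * linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma sets the width of the bump."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def sigmoid_kernel(x, y, gamma=0.1, coef0=0.0):
    """tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * linear_kernel(x, y) + coef0)


# Two hypothetical feature vectors.
u, v = [1.0, 2.0], [3.0, 4.0]
kernel_values = {
    "linear": linear_kernel(u, v),
    "polynomial": polynomial_kernel(u, v, degree=2),
    "rbf": rbf_kernel(u, v),
    "sigmoid": sigmoid_kernel(u, v),
}
```

In a genetic-algorithm search, each candidate solution would encode values such as these kernel parameters together with the SVM regularization constant, and its fitness would be the cross-validated accuracy of the resulting classifier.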
NASA Astrophysics Data System (ADS)
Sun, Yankui; Li, Shan; Sun, Zhongyang
2017-01-01
We propose a framework for automated detection of dry age-related macular degeneration (AMD) and diabetic macular edema (DME) from retina optical coherence tomography (OCT) images, based on sparse coding and dictionary learning. The study aims to improve the classification performance of state-of-the-art methods. First, our method presents a general approach to automatically align and crop retina regions; then it obtains global representations of images by using sparse coding and a spatial pyramid; finally, a multiclass linear support vector machine classifier is employed for classification. We use two datasets to validate our algorithm: the Duke spectral domain OCT (SD-OCT) dataset, consisting of volumetric scans acquired from 45 subjects (15 normal subjects, 15 AMD patients, and 15 DME patients); and a clinical SD-OCT dataset, consisting of 678 OCT retina scans acquired from clinics in Beijing (168, 297, and 213 OCT images for AMD, DME, and normal retinas, respectively). For the former dataset, our classifier correctly identifies 100%, 100%, and 93.33% of the volumes with DME, AMD, and normal subjects, respectively, and thus performs much better than the conventional method; for the latter dataset, our classifier leads to a correct classification rate of 99.67%, 99.67%, and 100.00% for DME, AMD, and normal images, respectively.
Pereira, Anieli G; Abdala, Virginia; Kohlsdorf, Tiana
2015-02-01
Skeletal muscles can be classified as flexors or extensors according to their function, and as dorsal or ventral according to their position. The latter classification evokes their embryological origin from muscle masses initially divided during limb development, and muscles sharing a given position do not necessarily perform the same function. Here, we compare the relative proportions of different fiber types among six limb muscles in the lizard Tropidurus psammonastes. Individual fibers were classified as slow oxidative (SO), fast glycolytic (FG) or fast oxidative-glycolytic (FOG) based on mitochondrial content; muscles were classified according to position and function. Mixed linear models considering one or both effects were compared using likelihood ratio tests. Variation in the proportion of FG and FOG fibers is mainly explained by function (flexor muscles have on average lower proportions of FG and higher proportions of FOG fibers), while variation in SO fibers is better explained by position (they are less abundant in ventral muscles than in those developed from a dorsal muscle mass). Our results clarify the roles of position and function in determining the relative proportions of the various muscle fibers and provide evidence that these factors may differentially affect distinct fiber types. Copyright © 2014. Published by Elsevier GmbH.
NASA Astrophysics Data System (ADS)
Jo, J. A.; Fang, Q.; Papaioannou, T.; Qiao, J. H.; Fishbein, M. C.; Beseth, B.; Dorafshar, A. H.; Reil, T.; Baker, D.; Freischlag, J.; Marcu, L.
2006-02-01
This study introduces new methods of time-resolved laser-induced fluorescence spectroscopy (TR-LIFS) data analysis for tissue characterization. These analytical methods were applied for the detection of atherosclerotic vulnerable plaques. Upon pulsed nitrogen laser (337 nm, 1 ns) excitation, TR-LIFS measurements were obtained from carotid atherosclerotic plaque specimens (57 endarterectomy patients) at 492 distinct areas. The emission was both spectrally (360-600 nm range at 5 nm interval) and temporally (0.3 ns resolution) resolved using a prototype clinically compatible fiber-optic catheter TR-LIFS apparatus. The TR-LIFS measurements were subsequently analyzed using a standard multiexponential deconvolution and a recently introduced Laguerre deconvolution technique. Based on their histopathology, the lesions were classified as early (thin intima), fibrotic (collagen-rich intima), and high-risk (thin cap over necrotic core and/or inflamed intima). Stepwise linear discriminant analysis (SLDA) was applied for lesion classification. Normalized spectral intensity values and Laguerre expansion coefficients (LEC) at discrete emission wavelengths (390, 450, 500 and 550 nm) were used as features for classification. The Laguerre based SLDA classifier provided discrimination of high-risk lesions with high sensitivity (SE>81%) and specificity (SP>95%). Based on these findings, we believe that TR-LIFS information derived from the Laguerre expansion coefficients can provide a valuable additional dimension for the diagnosis of high-risk vulnerable atherosclerotic plaques.
Defect classification in sparsity-based structural health monitoring
NASA Astrophysics Data System (ADS)
Golato, Andrew; Ahmad, Fauzia; Santhanam, Sridhar; Amin, Moeness G.
2017-05-01
Guided waves have gained popularity in structural health monitoring (SHM) due to their ability to inspect large areas with little attenuation, while providing rich interactions with defects. For thin-walled structures, the propagating waves are Lamb waves, which are a complex but well understood type of guided waves. Recent works have cast the defect localization problem of Lamb wave based SHM within the sparse reconstruction framework. These methods make use of a linear model relating the measurements with the scene reflectivity under the assumption of point-like defects. However, most structural defects are not perfect points but tend to assume specific forms, such as surface cracks or internal cracks. Knowledge of the "type" of defects is useful in the assessment phase of SHM. In this paper, we present a dual purpose sparsity-based imaging scheme which, in addition to accurately localizing defects, properly classifies the defects present simultaneously. The proposed approach takes advantage of the bias exhibited by certain types of defects toward a specific Lamb wave mode. For example, some defects strongly interact with the anti-symmetric modes, while others strongly interact with the symmetric modes. We build model based dictionaries for the fundamental symmetric and anti-symmetric wave modes, which are then utilized in unison to properly localize and classify the defects present. Simulated data of surface and internal defects in a thin Aluminum plate are used to validate the proposed scheme.
Rand, R.S.; Clark, R.N.; Livo, K.E.
2011-01-01
The Deepwater Horizon oil spill covered a very large geographical area in the Gulf of Mexico creating potentially serious environmental impacts on both marine life and the coastal shorelines. Knowing the oil's areal extent and thickness as well as denoting different categories of the oil's physical state is important for assessing these impacts. High spectral resolution data in hyperspectral imagery (HSI) sensors such as Airborne Visible and Infrared Imaging Spectrometer (AVIRIS) provide a valuable source of information that can be used for analysis by semi-automatic methods for tracking an oil spill's areal extent, oil thickness, and oil categories. However, the spectral behavior of oil in water is inherently a highly non-linear and variable phenomenon that changes depending on oil thickness and oil/water ratios. For certain oil thicknesses there are well-defined absorption features, whereas for very thin films sometimes there are almost no observable features. Feature-based imaging spectroscopy methods are particularly effective at classifying materials that exhibit specific well-defined spectral absorption features. Statistical methods are effective at classifying materials with spectra that exhibit a considerable amount of variability and that do not necessarily exhibit well-defined spectral absorption features. This study investigates feature-based and statistical methods for analyzing oil spills using hyperspectral imagery. The appropriate use of each approach is investigated and a combined feature-based and statistical method is proposed.
Penn, Richard; Werner, Michael; Thomas, Justin
2015-01-01
Background: Estimation of stochastic process models from data is a common application of time series analysis methods. Such system identification processes are often cast as hypothesis testing exercises whose intent is to estimate model parameters and test them for statistical significance. Ordinary least squares (OLS) regression and the Levenberg-Marquardt algorithm (LMA) have proven invaluable computational tools for models being described by non-homogeneous, linear, stationary, ordinary differential equations. Methods: In this paper we extend stochastic model identification to linear, stationary, partial differential equations in two independent variables (2D) and show that OLS and LMA apply equally well to these systems. The method employs an original nonparametric statistic as a test for the significance of estimated parameters. Results: We show gray scale and color images are special cases of 2D systems satisfying a particular autoregressive partial difference equation which estimates an analogous partial differential equation. Several applications to medical image modeling and classification illustrate the method by correctly classifying demented and normal OLS models of axial magnetic resonance brain scans according to subject Mini Mental State Exam (MMSE) scores. Comparison with 13 image classifiers from the literature indicates our classifier is at least 14 times faster than any of them and has a classification accuracy better than all but one. Conclusions: Our modeling method applies to any linear, stationary, partial differential equation and the method is readily extended to 3D whole-organ systems. Further, in addition to being a robust image classifier, estimated image models offer insights into which parameters carry the most diagnostic image information and thereby suggest finer divisions could be made within a class.
Image models can be estimated in milliseconds which translate to whole-organ models in seconds; such runtimes could make real-time medicine and surgery modeling possible. PMID:26029638
Pavlovich, Matthew J; Dunn, Emily E; Hall, Adam B
2016-05-15
Commercial spices represent an emerging class of fuels for improvised explosives. Being able to classify such spices not only by type but also by brand would represent an important step in developing methods to analytically investigate these explosive compositions. Therefore, a combined ambient mass spectrometric/chemometric approach was developed to quickly and accurately classify commercial spices by brand. Direct analysis in real time mass spectrometry (DART-MS) was used to generate mass spectra for samples of black pepper, cayenne pepper, and turmeric, along with four different brands of cinnamon, all dissolved in methanol. Unsupervised learning techniques showed that the cinnamon samples clustered according to brand. Then, we used supervised machine learning algorithms to build chemometric models with a known training set and classified the brands of an unknown testing set of cinnamon samples. Ten independent runs of five-fold cross-validation showed that the training set error for the best-performing models (i.e., the linear discriminant and neural network models) was lower than 2%. The false-positive percentages for these models were 3% or lower, and the false-negative percentages were lower than 10%. In particular, the linear discriminant model perfectly classified the testing set with 0% error. Repeated iterations of training and testing gave similar results, demonstrating the reproducibility of these models. Chemometric models were able to classify the DART mass spectra of commercial cinnamon samples according to brand, with high specificity and low classification error. This method could easily be generalized to other classes of spices, and it could be applied to authenticating questioned commercial samples of spices or to examining evidence from improvised explosives. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Paul, Subir; Nagesh Kumar, D.
2018-04-01
Hyperspectral (HS) data comprise continuous spectral responses of hundreds of narrow spectral bands with very fine spectral resolution or bandwidth, which offer feature identification and classification with high accuracy. In the present study, a Mutual Information (MI) based Segmented Stacked Autoencoder (S-SAE) approach for spectral-spatial classification of HS data is proposed to reduce the complexity and computational time compared to Stacked Autoencoder (SAE) based feature extraction. A non-parametric dependency measure (MI) is proposed for spectral segmentation of the HS bands, instead of a linear and parametric dependency measure, so that both linear and nonlinear inter-band dependencies are taken into account. Morphological profiles are then created for the segmented spectral features to bring spatial information into the spectral-spatial classification approach. Two non-parametric classifiers, Support Vector Machine (SVM) with Gaussian kernel and Random Forest (RF), are used for classification of the three most widely used HS datasets. Results of the numerical experiments carried out in this study show that SVM with a Gaussian kernel provides better results for the Pavia University and Botswana datasets, whereas RF performs better for the Indian Pines dataset. The experiments performed with the proposed methodology provide encouraging results compared to numerous existing approaches.
Shin, Jaeyoung; Müller, Klaus-R; Hwang, Han-Jeong
2016-01-01
We propose a near-infrared spectroscopy (NIRS)-based brain-computer interface (BCI) that can be operated in the eyes-closed (EC) state. To evaluate the feasibility of NIRS-based EC BCIs, we compared the performance of an eyes-open (EO) BCI paradigm and an EC BCI paradigm with respect to hemodynamic response and classification accuracy. To this end, subjects performed either mental arithmetic or imagined vocalization of the English alphabet as a baseline task with very low cognitive loading. The performance of two linear classifiers was compared, with shrinkage linear discriminant analysis (LDA) showing an advantage. The classification accuracy of the EC paradigm (75.6 ± 7.3%) was observed to be lower than that of the EO paradigm (77.0 ± 9.2%), but the difference was not statistically significant (p = 0.5698). Subjects reported that they found it more comfortable (p = 0.057) and easier (p < 0.05) to perform the EC BCI tasks. The difference in task difficulty may be a cause of the slightly lower classification accuracy of EC data. From the analysis results, we could confirm the feasibility of NIRS-based EC BCIs, which can be a BCI option that may ultimately be of use for patients who cannot keep their eyes open consistently. PMID:27824089
User oriented ERTS-1 images. [vegetation identification in Canada through image enhancement]
NASA Technical Reports Server (NTRS)
Shlien, S.; Goodenough, D.
1974-01-01
Photographic reproductions of ERTS-1 images are capable of displaying only a portion of the total information available from the multispectral scanner. Methods are being developed to generate ERTS-1 images oriented towards special users such as agriculturists, foresters, and hydrologists by applying image enhancement techniques and interactive statistical classification schemes. Spatial boundaries and linear features can be emphasized and delineated using simple filters. Linear and nonlinear transformations can be applied to the spectral data to emphasize certain ground information. An automatic classification scheme was developed to identify particular ground cover classes such as fallow, grain, rapeseed or various vegetation covers. The scheme applies the maximum likelihood decision rule to the spectral information and classifies the ERTS-1 image on a pixel by pixel basis. Preliminary results indicate that the classifier has limited success in distinguishing crops, but is well adapted for identifying different types of vegetation.
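The pixel-by-pixel maximum likelihood decision rule mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each class is modelled as a Gaussian with independent bands (diagonal covariance), which the abstract does not specify, and the four-band training values and function names are invented for the example rather than taken from the report.

```python
import math

def fit_gaussian_classes(samples):
    """Estimate per-band mean and variance for every ground-cover class.

    samples maps a class name to a list of spectral vectors (one per pixel).
    """
    model = {}
    for label, vectors in samples.items():
        n, dims = len(vectors), len(vectors[0])
        means = [sum(v[d] for v in vectors) / n for d in range(dims)]
        # A small floor keeps the variance strictly positive.
        variances = [max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
                     for d in range(dims)]
        model[label] = (means, variances)
    return model

def log_likelihood(pixel, means, variances):
    """Log density of a pixel under an independent-band Gaussian (assumption)."""
    return sum(-0.5 * math.log(2.0 * math.pi * var) - (x - mu) ** 2 / (2.0 * var)
               for x, mu, var in zip(pixel, means, variances))

def classify_pixel(pixel, model):
    """Maximum likelihood rule: choose the class with the highest likelihood."""
    return max(model, key=lambda label: log_likelihood(pixel, *model[label]))

# Invented four-band training pixels, for illustration only.
training = {
    "fallow": [[30, 40, 35, 20], [32, 42, 33, 22], [28, 38, 36, 19]],
    "grain": [[20, 25, 60, 70], [22, 27, 58, 72], [19, 24, 62, 69]],
}
model = fit_gaussian_classes(training)
image = [[21, 26, 59, 71], [31, 40, 34, 21]]
labels = [classify_pixel(p, model) for p in image]
print(labels)
```

Classes whose spectral distributions overlap heavily receive similar likelihoods under such a rule, which is one plausible reading of the limited success in distinguishing individual crops that the abstract reports.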
Classification With Truncated Distance Kernel.
Huang, Xiaolin; Suykens, Johan A K; Wang, Shuning; Hornegger, Joachim; Maier, Andreas
2018-05-01
This brief proposes a truncated distance (TL1) kernel, which results in a classifier that is nonlinear in the global region but is linear in each subregion. With this kernel, the subregion structure can be trained using all the training data and local linear classifiers can be established simultaneously. The TL1 kernel has good adaptiveness to nonlinearity and is suitable for problems which require different nonlinearities in different areas. Though the TL1 kernel is not positive semidefinite, some classical kernel learning methods are still applicable, which means that the TL1 kernel can be directly used in standard toolboxes by replacing the kernel evaluation. In numerical experiments, the TL1 kernel with a pregiven parameter achieves similar or better performance than the radial basis function kernel with the parameter tuned by cross validation, implying that the TL1 kernel is a promising nonlinear kernel for classification tasks.
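A minimal sketch of the kernel evaluation may help here. The abstract does not state the formula, so the definition used below, K(u, v) = max(rho - ||u - v||_1, 0), is an assumption based on the usual reading of "truncated L1 distance"; the function names and the value of `rho` are likewise illustrative.

```python
def tl1_kernel(u, v, rho):
    """Truncated L1-distance kernel (assumed form): max(rho - ||u - v||_1, 0)."""
    l1_distance = sum(abs(a - b) for a, b in zip(u, v))
    return max(rho - l1_distance, 0.0)

def gram_matrix(points, rho):
    """Pairwise kernel values, the form expected by precomputed-kernel solvers."""
    return [[tl1_kernel(a, b, rho) for b in points] for a in points]

points = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
K = gram_matrix(points, rho=3.0)
for row in K:
    print(row)
```

Points whose L1 distance is at least `rho` get a kernel value of zero, so each training point only influences a bounded neighbourhood. A Gram matrix like `K` can be handed to any toolbox that accepts a precomputed kernel, which is one way to read the abstract's remark about replacing the kernel evaluation.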
A Co-Adaptive Brain-Computer Interface for End Users with Severe Motor Impairment
Faller, Josef; Scherer, Reinhold; Costa, Ursula; Opisso, Eloy; Medina, Josep; Müller-Putz, Gernot R.
2014-01-01
Co-adaptive training paradigms for event-related desynchronization (ERD) based brain-computer interfaces (BCI) have proven effective for healthy users. As of yet, it is not clear whether co-adaptive training paradigms can also benefit users with severe motor impairment. The primary goal of our paper was to evaluate a novel cue-guided, co-adaptive BCI training paradigm with severely impaired volunteers. The co-adaptive BCI supports a non-control state, which is an important step toward intuitive, self-paced control. A secondary aim was to have the same participants operate a specifically designed self-paced BCI training paradigm based on the auto-calibrated classifier. The co-adaptive BCI analyzed the electroencephalogram from three bipolar derivations (C3, Cz, and C4) online, while the 22 end users alternately performed right hand movement imagery (MI), left hand MI and relax with eyes open (non-control state). After less than five minutes, the BCI auto-calibrated and proceeded to provide visual feedback for the MI task that could be classified better against the non-control state. The BCI continued to regularly recalibrate. In every calibration step, the system performed trial-based outlier rejection and trained a linear discriminant analysis classifier based on one auto-selected logarithmic band-power feature. In 24 minutes of training, the co-adaptive BCI worked significantly (p = 0.01) better than chance for 18 of 22 end users. The self-paced BCI training paradigm worked significantly (p = 0.01) better than chance in 11 of 20 end users. The presented co-adaptive BCI complements existing approaches in that it supports a non-control state, requires very little setup time, requires no BCI expert and works online based on only two electrodes. The preliminary results from the self-paced BCI paradigm compare favorably to previous studies, and the collected data will allow further improvement of self-paced BCI systems for disabled users. PMID:25014055
Adaptive block online learning target tracking based on super pixel segmentation
NASA Astrophysics Data System (ADS)
Cheng, Yue; Li, Jianzeng
2018-04-01
Video target tracking technology has made great progress thanks to the sustained efforts of earlier researchers, but many problems remain unsolved. This paper proposes a new target tracking algorithm based on image segmentation. First, we divide the selected region using the simple linear iterative clustering (SLIC) algorithm; after that, we partition the area into blocks with an improved density-based spatial clustering of applications with noise (DBSCAN) algorithm. Each sub-block independently trains a classifier and is tracked; the algorithm then ignores the sub-blocks whose tracking failed and reintegrates the remaining sub-blocks into the tracking box to complete the target tracking. The experimental results show that, compared with current mainstream algorithms, our algorithm works effectively under occlusion, rotation change, scale change, and many other problems in target tracking.
Joint source based analysis of multiple brain structures in studying major depressive disorder
NASA Astrophysics Data System (ADS)
Ramezani, Mahdi; Rasoulian, Abtin; Hollenstein, Tom; Harkness, Kate; Johnsrude, Ingrid; Abolmaesumi, Purang
2014-03-01
We propose a joint Source-Based Analysis (jSBA) framework to identify brain structural variations in patients with Major Depressive Disorder (MDD). In this framework, features representing position, orientation and size (i.e. pose), shape, and local tissue composition are extracted. Subsequently, simultaneous analysis of these features within a joint analysis method is performed to generate the basis sources that show significant differences between subjects with MDD and those in healthy control. Moreover, in a cross-validation leave-one-out experiment, we use a Fisher Linear Discriminant (FLD) classifier to identify individuals within the MDD group. Results show that we can classify the MDD subjects with an accuracy of 76% solely based on the information gathered from the joint analysis of pose, shape, and tissue composition in multiple brain structures.
[Electroencephalogram Feature Selection Based on Correlation Coefficient Analysis].
Zhou, Jinzhi; Tang, Xiaofang
2015-08-01
In order to improve classification accuracy with a small amount of motor imagery training data in the development of brain-computer interface (BCI) systems, we proposed a method that automatically selects characteristic parameters based on correlation coefficient analysis. Using the five sample data sets of dataset IVa from the 2005 BCI Competition, we utilized short-time Fourier transform (STFT) and correlation coefficient calculation to reduce the dimensionality of the raw electroencephalogram, then performed feature extraction based on common spatial pattern (CSP) and classification by linear discriminant analysis (LDA). Simulation results showed that the average classification accuracy was higher with the correlation coefficient feature selection method than without it. Compared with the support vector machine (SVM) feature optimization algorithm, correlation coefficient analysis can select better parameters and improve classification accuracy.
NASA Astrophysics Data System (ADS)
Di, Nur Faraidah Muhammad; Satari, Siti Zanariah
2017-05-01
Outlier detection in linear data sets has been studied extensively, but only a small amount of work has been done on outlier detection in circular data. In this study, we propose a method for detecting multiple outliers in circular regression models based on a clustering algorithm. Clustering techniques basically use a distance measure to define the distance between data points. Here, we introduce a similarity distance based on Euclidean distance for the circular model and obtain a cluster tree using the single linkage clustering algorithm. Then, a stopping rule for the cluster tree based on the mean direction and circular standard deviation of the tree height is proposed. We classify the cluster group that exceeds the stopping rule as a potential outlier. Our aim is to demonstrate the effectiveness of the proposed algorithms with the similarity distances in detecting the outliers. The proposed methods are found to perform well and to be applicable to circular regression models.
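The stopping rule in this abstract relies on two circular summary statistics, the mean direction and the circular standard deviation. The sketch below computes both using the standard textbook definitions (mean resultant length R, standard deviation sqrt(-2 ln R)); the abstract does not give the authors' formulas or their threshold, so the rule itself is not implemented and the function name is an assumption.

```python
import math

def circular_mean_and_sd(angles):
    """Mean direction and circular standard deviation of angles in radians."""
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    # Mean resultant length, capped at 1 to absorb rounding error.
    r = min(math.hypot(c, s), 1.0)
    mean_direction = math.atan2(s, c) % (2.0 * math.pi)
    # R = 0 means the directions cancel out completely.
    circular_sd = math.sqrt(-2.0 * math.log(r)) if r > 0.0 else math.inf
    return mean_direction, circular_sd

# Two directions on either side of 0 degrees: the arithmetic mean (180 degrees)
# would be misleading, while the circular mean lies at the 0/360 boundary.
angles = [math.radians(350.0), math.radians(10.0)]
mean_direction, circular_sd = circular_mean_and_sd(angles)
print(math.degrees(mean_direction), circular_sd)
```

The example shows why linear statistics cannot simply be reused on circular data: averaging 350 and 10 degrees arithmetically points in the opposite direction from both observations.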
Proteins QSAR with Markov average electrostatic potentials.
González-Díaz, Humberto; Uriarte, Eugenio
2005-11-15
Classic physicochemical and topological indices have been largely used in small molecules QSAR but less in proteins QSAR. In this study, a Markov model is used to calculate, for the first time, average electrostatic potentials xik for an indirect interaction between amino acids placed at topologic distances k within a given protein backbone. The short-term average stochastic potential xi1 for 53 Arc repressor mutants was used to model the effect of Alanine scanning on thermal stability. The Arc repressor is a model protein of relevance for biochemical studies on bioorganics and medicinal chemistry. A linear discriminant analysis model correctly classified 43 out of 53 (81.1%) proteins according to their thermal stability. More specifically, the model classified 20/28 (71.4%) of proteins with near wild-type stability and 23/25 (92.0%) of proteins with reduced stability. Moreover, predictability in cross-validation procedures was 81.0%. Expansion of the electrostatic potential in the series xi0, xi1, xi2, and xi3 justified the use of the abrupt truncation approach, with overall accuracy >70.0% for xi0 but equal for xi1, xi2, and xi3. The xi1 model compared favorably with others based on D-Fire potential, surface area, volume, partition coefficient, and molar refractivity, which had less than 77.0% accuracy [Ramos de Armas, R.; González-Díaz, H.; Molina, R.; Uriarte, E. Protein Struct. Func. Bioinf. 2004, 56, 715]. The xi1 model also has a more tractable interpretation than others based on Markovian negentropies and stochastic moments. Finally, the model is notably simpler than the two models based on quadratic and linear indices. Both models, reported by Marrero-Ponce et al., use four to five times more descriptors. Introduction of average stochastic potentials may be useful for QSAR applications, since xik has an amenable physical interpretation and is very effective.
Fusion and Gaussian mixture based classifiers for SONAR data
NASA Astrophysics Data System (ADS)
Kotari, Vikas; Chang, KC
2011-06-01
Underwater mines are inexpensive and highly effective weapons. They are difficult to detect and classify. Hence detection and classification of underwater mines is essential for the safety of naval vessels. This necessitates a formulation of highly efficient classifiers and detection techniques. Current techniques primarily focus on signals from one source. Data fusion is known to increase the accuracy of detection and classification. In this paper, we formulated a fusion-based classifier and a Gaussian mixture model (GMM) based classifier for classification of underwater mines. The emphasis has been on sound navigation and ranging (SONAR) signals due to their extensive use in current naval operations. The classifiers have been tested on real SONAR data obtained from University of California Irvine (UCI) repository. The performance of both GMM based classifier and fusion based classifier clearly demonstrate their superior classification accuracy over conventional single source cases and validate our approach.
Discovering Fine-grained Sentiment in Suicide Notes
Wang, Wenbo; Chen, Lu; Tan, Ming; Wang, Shaojun; Sheth, Amit P.
2012-01-01
This paper presents our solution for the i2b2 sentiment classification challenge. Our hybrid system consists of machine learning and rule-based classifiers. For the machine learning classifier, we investigate a variety of lexical, syntactic and knowledge-based features, and show how much these features contribute to the performance of the classifier through experiments. For the rule-based classifier, we propose an algorithm to automatically extract effective syntactic and lexical patterns from training examples. The experimental results show that the rule-based classifier outperforms the baseline machine learning classifier using unigram features. By combining the machine learning classifier and the rule-based classifier, the hybrid system gains a better trade-off between precision and recall, and yields the highest micro-averaged F-measure (0.5038), which is better than the mean (0.4875) and median (0.5027) micro-average F-measures among all participating teams. PMID:22879770
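The micro-averaged F-measure reported in this abstract pools the true positives, false positives, and false negatives over all labels before computing precision and recall. A minimal sketch follows; the label names and counts are invented for illustration and are not the challenge data.

```python
def micro_f_measure(counts):
    """Micro-averaged precision, recall, and F-measure from per-label counts.

    counts maps a label to a dict with keys "tp", "fp", and "fn".
    """
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_measure = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical per-label counts, for illustration only.
counts = {
    "label_a": {"tp": 8, "fp": 2, "fn": 4},
    "label_b": {"tp": 2, "fp": 3, "fn": 1},
}
precision, recall, f_measure = micro_f_measure(counts)
print(precision, recall, f_measure)
```

Because the counts are pooled, frequent labels dominate the score, so a micro-averaged figure mostly reflects performance on the common categories and says less about the rare ones.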
Obrzut, Bogdan; Kusy, Maciej; Semczuk, Andrzej; Obrzut, Marzanna; Kluska, Jacek
2017-12-12
Computational intelligence methods, including non-linear classification algorithms, can be used in medical research and practice as a decision making tool. This study aimed to evaluate the usefulness of artificial intelligence models for 5-year overall survival prediction in patients with cervical cancer treated by radical hysterectomy. The data set was collected from 102 patients with cervical cancer FIGO stage IA2-IIB, that underwent primary surgical treatment. Twenty-three demographic, tumor-related parameters and selected perioperative data of each patient were collected. The simulations involved six computational intelligence methods: the probabilistic neural network (PNN), multilayer perceptron network, gene expression programming classifier, support vector machines algorithm, radial basis function neural network and k-Means algorithm. The prediction ability of the models was determined based on the accuracy, sensitivity, specificity, as well as the area under the receiver operating characteristic curve. The results of the computational intelligence methods were compared with the results of linear regression analysis as a reference model. The best results were obtained by the PNN model. This neural network provided very high prediction ability with an accuracy of 0.892 and sensitivity of 0.975. The area under the receiver operating characteristics curve of PNN was also high, 0.818. The outcomes obtained by other classifiers were markedly worse. The PNN model is an effective tool for predicting 5-year overall survival in cervical cancer patients treated with radical hysterectomy.
Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C
2011-04-01
The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included into this study. They were independently classified by two experienced investigators. The results of such a classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. 
Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF clusters. © Georg Thieme Verlag KG Stuttgart · New York.
NASA Astrophysics Data System (ADS)
Ortolano, Gaetano; Visalli, Roberto; Godard, Gaston; Cirrincione, Rosolino
2018-06-01
We present a new ArcGIS®-based tool developed in the Python programming language for calibrating EDS/WDS X-ray element maps, with the aim of acquiring quantitative information of petrological interest. The calibration procedure is based on a multiple linear regression technique that takes into account interdependence among elements and is constrained by the stoichiometry of minerals. The procedure requires an appropriate number of spot analyses for use as internal standards and provides several test indexes for a rapid check of calibration accuracy. The code is based on an earlier image-processing tool designed primarily for classifying minerals in X-ray element maps; the original Python code has now been enhanced to yield calibrated maps of mineral end-members or the chemical parameters of each classified mineral. The semi-automated procedure can be used to extract a dataset that is automatically stored within queryable tables. As a case study, the software was applied to an amphibolite-facies garnet-bearing micaschist. The calibrated images obtained for both anhydrous (i.e., garnet and plagioclase) and hydrous (i.e., biotite) phases show a good fit with corresponding electron microprobe analyses. This new GIS-based tool package can thus find useful application in petrology and materials science research. Moreover, the huge quantity of data extracted opens new opportunities for the development of a thin-section microchemical database that, using a GIS platform, can be linked with other major global geoscience databases.
A general prediction model for the detection of ADHD and Autism using structural and functional MRI.
Sen, Bhaskar; Borle, Neil C; Greiner, Russell; Brown, Matthew R G
2018-01-01
This work presents a novel method for learning a model that can diagnose Attention Deficit Hyperactivity Disorder (ADHD), as well as Autism, using structural texture and functional connectivity features obtained from 3-dimensional structural magnetic resonance imaging (MRI) and 4-dimensional resting-state functional magnetic resonance imaging (fMRI) scans of subjects. We explore a series of three learners: (1) The LeFMS learner first extracts features from the structural MRI images using the texture-based filters produced by a sparse autoencoder. These filters are then convolved with the original MRI image using an unsupervised convolutional network. The resulting features are used as input to a linear support vector machine (SVM) classifier. (2) The LeFMF learner produces a diagnostic model by first computing spatial non-stationary independent components of the fMRI scans, which it uses to decompose each subject's fMRI scan into the time courses of these common spatial components. These features can then be used with a learner by themselves or in combination with other features to produce the model. Regardless of which approach is used, the final set of features are input to a linear support vector machine (SVM) classifier. (3) Finally, the overall LeFMSF learner uses the combined features obtained from the two feature extraction processes in (1) and (2) above as input to an SVM classifier, achieving an accuracy of 0.673 on the ADHD-200 holdout data and 0.643 on the ABIDE holdout data. Both of these results, obtained with the same LeFMSF framework, are the best known, over all hold-out accuracies on these datasets when only using imaging data-exceeding previously-published results by 0.012 for ADHD and 0.042 for Autism. Our results show that combining multi-modal features can yield good classification accuracy for diagnosis of ADHD and Autism, which is an important step towards computer-aided diagnosis of these psychiatric diseases and perhaps others as well.
Automatic analysis and classification of surface electromyography.
Abou-Chadi, F E; Nashar, A; Saad, M
2001-01-01
In this paper, parametric modeling of surface electromyography (EMG) algorithms that facilitates automatic SEMG feature extraction and artificial neural networks (ANN) are combined for providing an integrated system for the automatic analysis and diagnosis of myopathic disorders. Three paradigms of ANN were investigated: the multilayer backpropagation algorithm, the self-organizing feature map algorithm and a probabilistic neural network model. The performance of the three classifiers was compared with that of the old Fisher linear discriminant (FLD) classifiers. The results have shown that the three ANN models give higher performance. The percentage of correct classification reaches 90%. Poorer diagnostic performance was obtained from the FLD classifier. The system presented here indicates that surface EMG, when properly processed, can be used to provide the physician with a diagnostic assist device.
Federal Register 2010, 2011, 2012, 2013, 2014
2011-01-25
... Avenue, NW., Washington, DC 20230; telephone: (202) 482-1655. Case History With the issuance of the... material and then glued together in a linear fashion. Uncovered innersprings are classified under...
Xi, Xugang; Tang, Minyan; Miran, Seyed M; Luo, Zhizeng
2017-05-27
As an essential subfield of context awareness, activity awareness, especially daily activity monitoring and fall detection, plays a significant role for elderly or frail people who need assistance in their daily activities. This study investigates the feature extraction and pattern recognition of surface electromyography (sEMG), with the purpose of determining the best features and classifiers of sEMG for daily living activities monitoring and fall detection. This is done through a series of experiments. In the experiments, four channels of sEMG signal from wireless, wearable sensors located on lower limbs are recorded from three subjects while they perform seven activities of daily living (ADL). A simulated trip fall scenario is also considered with a custom-made device attached to the ankle. With this experimental setting, 15 feature extraction methods of sEMG, including time, frequency, time/frequency domain and entropy, are analyzed based on class separability and calculation complexity, and five classification methods, each with 15 features, are estimated with respect to the accuracy rate of recognition and calculation complexity for activity monitoring and fall detection. It is shown that a high accuracy rate of recognition and a minimal calculation time for daily activity monitoring and fall detection can be achieved in the current experimental setting. Specifically, the Wilson Amplitude (WAMP) feature performs the best, and the classifier Gaussian Kernel Support Vector Machine (GK-SVM) with Permutation Entropy (PE) or WAMP results in the highest accuracy for activity monitoring with recognition rates of 97.35% and 96.43%. 
For fall detection, the classifier Fuzzy Min-Max Neural Network (FMMNN) has the best sensitivity and specificity at the cost of the longest calculation time, while the classifier Gaussian Kernel Fisher Linear Discriminant Analysis (GK-FDA) with the feature WAMP guarantees a high sensitivity (98.70%) and specificity (98.59%) with a short calculation time (65.586 ms), making it a possible choice for pre-impact fall detection. The thorough quantitative comparison of the features and classifiers in this study supports the feasibility of a wireless, wearable sEMG sensor system for automatic activity monitoring and fall detection.
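The Wilson Amplitude feature named above is commonly defined as the number of times the absolute difference between consecutive samples in a window exceeds a threshold. A minimal sketch under that common definition follows; the threshold and sample values are hypothetical, and the abstract does not state which threshold the authors used.

```python
def wilson_amplitude(window, threshold):
    """Count consecutive-sample differences whose magnitude exceeds threshold."""
    return sum(1 for a, b in zip(window, window[1:]) if abs(b - a) > threshold)


# Hypothetical sEMG window in arbitrary integer units.
window = [0, 30, 10, 90, 85, 20]
wamp = wilson_amplitude(window, threshold=25)
```

The feature costs a single pass over the window, which is consistent with the short calculation times the abstract emphasizes.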
Garcia-Chimeno, Yolanda; Garcia-Zapirain, Begonya
2015-01-01
Classifying subjects' pathologies allows greater rigor in the treatment of certain pathologies, as doctors sometimes deal with so many variables that they can end up confusing some illnesses with others. Thanks to Machine Learning techniques applied to a health-record database, such a classification can be made using our algorithm. hClass contains a non-linear classification of either a supervised, non-supervised or semi-supervised type. The machine is configured using other techniques such as validation of the set to be classified (cross-validation), reduction in features (PCA) and committees for assessing the various classifiers. The tool is easy to use: the inputs are the sample matrix, the features that one wishes to classify, the number of iterations and the subjects who are going to be used to train the machine. As a result, the success rate is shown either via a classifier or via a committee if one has been formed. A 90% success rate is obtained in the AdaBoost classifier and 89.7% in the case of a committee (comprising three classifiers) when PCA is applied. This tool can be expanded to allow the user to totally characterise the classifiers by adjusting them to each classification use.
Christian, Josef; Kröll, Josef; Schwameder, Hermann
2017-06-01
Common summary measures of gait quality such as the Gait Profile Score (GPS) are based on the principle of measuring a distance from the mean pattern of a healthy reference group in a gait pattern vector space. The recently introduced Classifier Oriented Gait Score (COGS) is a pathology specific score that measures this distance in a unique direction, which is indicated by a linear classifier. This approach has potentially improved the discriminatory power to detect subtle changes in gait patterns but does not incorporate a profile of interpretable sub-scores like the GPS. The main aims of this study were to extend the COGS by decomposing it into interpretable sub-scores as realized in the GPS and to compare the discriminative power of the GPS and COGS. Two types of gait impairments were imitated to enable a high level of control of the gait patterns. Imitated impairments were realized by restricting knee extension and inducing leg length discrepancy. The results showed increased discriminatory power of the COGS for differentiating diverse levels of impairment. Comparison of the GPS and COGS sub-scores and their ability to indicate changes in specific variables supports the validity of both scores. The COGS is an overall measure of gait quality with increased power to detect subtle changes in gait patterns and might be well suited for tracing the effect of a therapeutic treatment over time. The newly introduced sub-scores improved the interpretability of the COGS, which is helpful for practical applications. Copyright © 2017 Elsevier B.V. All rights reserved.
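The idea of measuring distance from a healthy mean along the single direction given by a linear classifier can be sketched as a projection onto the classifier's unit weight vector. The per-variable decomposition below is one plausible way to obtain additive sub-scores; it is an assumption made for illustration and is not claimed to be the decomposition used in the study. All numbers are hypothetical.

```python
import math


def directed_score(pattern, healthy_mean, weights):
    """Project (pattern - healthy_mean) onto the unit vector along weights.

    Returns the overall score and the per-variable contributions, which
    sum to the overall score by construction.
    """
    norm = math.sqrt(sum(w * w for w in weights))
    contributions = [
        (x - m) * w / norm for x, m, w in zip(pattern, healthy_mean, weights)
    ]
    return sum(contributions), contributions


# Hypothetical two-variable gait pattern and classifier weights.
score, sub_scores = directed_score(
    pattern=[3.0, 4.0], healthy_mean=[0.0, 0.0], weights=[3.0, 4.0]
)
```

A Euclidean distance such as the one underlying the GPS would weight every direction equally, whereas the projection ignores deviations orthogonal to the classifier direction.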
GBM heterogeneity characterization by radiomic analysis of phenotype anatomical planes
NASA Astrophysics Data System (ADS)
Chaddad, Ahmad; Desrosiers, Christian; Toews, Matthew
2016-03-01
Glioblastoma multiforme (GBM) is the most common malignant primary tumor of the central nervous system, characterized among other traits by rapid metastasis. Three tissue phenotypes closely associated with GBMs, namely, necrosis (N), contrast enhancement (CE), and edema/invasion (E), exhibit characteristic patterns of texture heterogeneity in magnetic resonance images (MRI). In this study, we propose a novel model to characterize GBM tissue phenotypes using gray level co-occurrence matrices (GLCM) in three anatomical planes. The GLCM encodes local image patches in terms of informative, orientation-invariant texture descriptors, which are used here to sub-classify GBM tissue phenotypes. Experiments demonstrate the model on MRI data of 41 GBM patients, obtained from The Cancer Genome Atlas (TCGA). Intensity-based automatic image registration is applied to align corresponding pairs of fixed T1-weighted (T1-WI) post-contrast and fluid attenuated inversion recovery (FLAIR) images. GBM tissue regions are then segmented using the 3D Slicer tool. Texture features are computed from 12 quantifier functions operating on GLCM descriptors that are generated from MRI intensities within segmented GBM tissue regions. Various classifier models are used to evaluate the effectiveness of texture features for discriminating between GBM phenotypes. Results based on T1-WI scans showed a phenotype classification accuracy of over 88.14%, a sensitivity of 85.37% and a specificity of 96.1%, using the linear discriminant analysis (LDA) classifier. This model has the potential to provide important characteristics of tumors, which can be used for the sub-classification of GBM phenotypes.
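A gray level co-occurrence matrix counts how often a pixel of gray level i has a neighbour of gray level j at a fixed offset. The sketch below builds a normalized GLCM for one offset on a small 2D image and derives a single texture quantifier (contrast). It is a simplified illustration: the study works in three anatomical planes with 12 quantifier functions, none of which are reproduced here, and the image values are hypothetical.

```python
def glcm(image, levels, d_row=0, d_col=1):
    """Normalized co-occurrence matrix of a 2D integer image for one offset."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Contrast quantifier: co-occurrence weighted by squared level difference."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Hypothetical 3x3 patch quantized to two gray levels.
patch = [
    [0, 0, 1],
    [0, 1, 1],
    [1, 1, 1],
]
p = glcm(patch, levels=2)  # horizontal neighbour, offset (0, 1)
patch_contrast = contrast(p)
```

Averaging the matrices obtained for several offsets is a common way to reduce orientation dependence.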
Raghu, S; Sriraam, N; Kumar, G Pradeep
2017-02-01
The electroencephalogram (EEG) is considered a fundamental tool for the assessment of neural activity in the brain. In the cognitive neuroscience domain, EEG-based assessment is found to be superior due to its non-invasive ability to detect deep brain structure while exhibiting superior spatial resolution. Especially for studying the neurodynamic behavior of epileptic seizures, EEG recordings reflect the neuronal activity of the brain and thus provide the clinical diagnostic information required by the neurologist. The proposed study makes use of wavelet packet based log and norm entropies with a recurrent Elman neural network (REN) for the automated detection of epileptic seizures. Three conditions, normal, pre-ictal and epileptic EEG recordings, were considered for the proposed study. An adaptive Wiener filter was initially applied to remove the power line noise of 50 Hz from raw EEG recordings. Raw EEGs were segmented into 1 s patterns to ensure stationarity of the signal. A wavelet packet decomposition using the Haar wavelet with five levels was then applied, and two entropies, log and norm, were estimated and fed to the REN classifier to perform binary classification. The non-linear Wilcoxon statistical test was applied to observe the variation in the features under these conditions. The effect of log energy entropy (without wavelets) was also studied. It was found from the simulation results that the wavelet packet log entropy with REN classifier yielded a classification accuracy of 99.70 % for normal-pre-ictal, 99.70 % for normal-epileptic and 99.85 % for pre-ictal-epileptic.
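The feature extraction step described above can be sketched as a Haar wavelet packet decomposition followed by entropy measures on the leaf coefficients. The sketch below assumes the widely used definitions of log energy entropy (sum of log of squared coefficients) and norm entropy (sum of absolute coefficients raised to a power p); the small constant `eps`, the exponent `p` and the test signal are assumptions for illustration, not values from the study.

```python
import math


def haar_split(x):
    """One orthonormal Haar step: return (approximation, detail) halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def wavelet_packet_leaves(x, levels):
    """Split every node at every level, as a wavelet packet tree does."""
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [child for node in nodes for child in haar_split(node)]
    return nodes


def log_energy_entropy(coeffs, eps=1e-12):
    # eps guards against log(0); it is an implementation assumption.
    return sum(math.log(c * c + eps) for c in coeffs)


def norm_entropy(coeffs, p=1.1):
    return sum(abs(c) ** p for c in coeffs)


# Hypothetical 32-sample segment decomposed to two levels (four leaves).
segment = [math.sin(0.3 * k) for k in range(32)]
leaves = wavelet_packet_leaves(segment, levels=2)
features = [(log_energy_entropy(leaf), norm_entropy(leaf)) for leaf in leaves]
```

A five-level decomposition as in the abstract yields 32 leaves, so the segment length must be a multiple of 32 for this simple version.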
NASA Astrophysics Data System (ADS)
Huynh, Benjamin Q.; Antropova, Natasha; Giger, Maryellen L.
2017-03-01
DCE-MRI datasets have a temporal aspect to them, resulting in multiple regions of interest (ROIs) per subject, based on contrast time points. It is unclear how the different contrast time points vary in terms of usefulness for computer-aided diagnosis tasks in conjunction with deep learning methods. We thus sought to compare the different DCE-MRI contrast time points with regard to how well their extracted features predict response to neoadjuvant chemotherapy within a deep convolutional neural network. Our dataset consisted of 561 ROIs from 64 subjects. Each subject was categorized as a non-responder or responder, determined by recurrence-free survival. First, features were extracted from each ROI using a convolutional neural network (CNN) pre-trained on non-medical images. Linear discriminant analysis classifiers were then trained on varying subsets of these features, based on their contrast time points of origin. Leave-one-out cross validation (by subject) was used to assess performance in the task of estimating probability of response to therapy, with area under the ROC curve (AUC) as the metric. The classifier trained on features from strictly the pre-contrast time point performed the best, with an AUC of 0.85 (SD = 0.033). The remaining classifiers resulted in AUCs ranging from 0.71 (SD = 0.028) to 0.82 (SD = 0.027). Overall, we found the pre-contrast time point to be the most effective at predicting response to therapy and that including additional contrast time points moderately reduces variance.
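Because each subject contributes several ROIs, cross validation "by subject" must hold out all ROIs of one subject together; otherwise ROIs from the same subject would appear in both training and test folds. A minimal sketch of such a split follows, with hypothetical subject identifiers.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) per subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical mapping of six ROIs to three subjects.
roi_subjects = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(roi_subjects))
```

Each fold would then train a classifier on the `train` indices and score the held-out ROIs, with the pooled scores used to compute the AUC.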
The effect of combining two echo times in automatic brain tumor classification by MRS.
García-Gómez, Juan M; Tortajada, Salvador; Vidal, César; Julià-Sapé, Margarida; Luts, Jan; Moreno-Torres, Angel; Van Huffel, Sabine; Arús, Carles; Robles, Montserrat
2008-11-01
¹H MRS is becoming an accurate, non-invasive technique for initial examination of brain masses. We investigated whether the combination of single-voxel ¹H MRS at 1.5 T at two different echo times (TEs), short TE (PRESS or STEAM, 20-32 ms) and long TE (PRESS, 135-136 ms), improves the classification of brain tumors over using only one TE. A clinically validated dataset of 50 low-grade meningiomas, 105 aggressive tumors (glioblastoma and metastasis), and 30 low-grade glial tumors (astrocytomas grade II, oligodendrogliomas and oligoastrocytomas) was used to fit predictive models based on the combination of features from short-TE and long-TE spectra. A new approach that combines the two consecutively was used to produce a single data vector from which relevant features of the two TE spectra could be extracted by means of three algorithms: stepwise, reliefF, and principal components analysis. Least squares support vector machines and linear discriminant analysis were applied to fit the pairwise and multiclass classifiers, respectively. Significant differences in performance were found when short-TE, long-TE or both spectra combined were used as input. In our dataset, to discriminate meningiomas, the combination of the two TE acquisitions produced optimal performance. To discriminate aggressive tumors from low-grade glial tumors, the use of short-TE acquisition alone was preferable. The classifier development strategy used here lends itself to automated learning and test performance processes, which may be of use for future web-based multicentric classifier development studies. Copyright (c) 2008 John Wiley & Sons, Ltd.
Belchansky, Gennady I.; Douglas, David C.
2000-01-01
This paper presents methods for classifying Arctic sea ice using both passive and active (2-channel) microwave imagery acquired by the Russian OKEAN 01 polar-orbiting satellite series. Methods and results are compared to sea ice classifications derived from nearly coincident Special Sensor Microwave Imager (SSM/I) and Advanced Very High Resolution Radiometer (AVHRR) image data of the Barents, Kara, and Laptev Seas. The Russian OKEAN 01 satellite data were collected over weekly intervals during October 1995 through December 1997. Methods are presented for calibrating, georeferencing and classifying the raw active radar and passive microwave OKEAN 01 data, and for correcting the OKEAN 01 microwave radiometer calibration wedge based on concurrent 37 GHz horizontal polarization SSM/I brightness temperature data. Sea ice type and ice concentration algorithms utilized OKEAN's two-channel radar and passive microwave data in a linear mixture model based on the measured values of brightness temperature and radar backscatter, together with a priori knowledge about the scattering parameters and natural emissivities of basic sea ice types. OKEAN 01 data and algorithms tended to classify lower concentrations of young or first-year sea ice when concentrations were less than 60%, and to produce higher concentrations of multi-year sea ice when concentrations were greater than 40%, when compared to estimates produced from SSM/I data. Overall, total sea ice concentration maps derived independently from OKEAN 01, SSM/I, and AVHRR satellite imagery were all highly correlated, with uniform biases, and mean differences in total ice concentration of less than four percent (sd<15%).
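A linear mixture model of this kind treats each measured value as a fraction-weighted sum of the signatures of pure surface types, with the fractions summing to one. With two measurements (brightness temperature and radar backscatter) and the sum constraint, three fractions can be solved for directly. The sketch below assumes three surface types and uses entirely hypothetical signature values; it illustrates the algebra only and is not the OKEAN 01 algorithm.

```python
def solve_linear(a, b):
    """Solve a small square system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


# Hypothetical pure-type signatures: open water, first-year ice, multi-year ice.
brightness_temp = [150.0, 250.0, 200.0]  # kelvin
backscatter = [0.01, 0.03, 0.10]         # linear units

# Hypothetical measurement for one pixel.
measured_tb, measured_sigma = 215.0, 0.047

system = [brightness_temp, backscatter, [1.0, 1.0, 1.0]]
fractions = solve_linear(system, [measured_tb, measured_sigma, 1.0])
total_ice_concentration = fractions[1] + fractions[2]
```

With noisy measurements the solved fractions can fall outside [0, 1], so a practical implementation would need to clip or constrain them.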
Ranking and combining multiple predictors without labeled data
Parisi, Fabio; Strino, Francesco; Nadler, Boaz; Kluger, Yuval
2014-01-01
In a broad range of classification and decision-making problems, one is given the advice or predictions of several classifiers, of unknown reliability, over multiple questions or queries. This scenario is different from the standard supervised setting, where each classifier’s accuracy can be assessed using available labeled data, and raises two questions: Given only the predictions of several classifiers over a large set of unlabeled test data, is it possible to (i) reliably rank them and (ii) construct a metaclassifier more accurate than most classifiers in the ensemble? Here we present a spectral approach to address these questions. First, assuming conditional independence between classifiers, we show that the off-diagonal entries of their covariance matrix correspond to a rank-one matrix. Moreover, the classifiers can be ranked using the leading eigenvector of this covariance matrix, because its entries are proportional to their balanced accuracies. Second, via a linear approximation to the maximum likelihood estimator, we derive the Spectral Meta-Learner (SML), an unsupervised ensemble classifier whose weights are equal to these eigenvector entries. On both simulated and real data, SML typically achieves a higher accuracy than most classifiers in the ensemble and can provide a better starting point than majority voting for estimating the maximum likelihood solution. Furthermore, SML is robust to the presence of small malicious groups of classifiers designed to veer the ensemble prediction away from the (unknown) ground truth. PMID:24474744
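The rank-one structure described above can be illustrated on simulated data. The sketch below fits a vector v to the off-diagonal covariance entries (so that q_ij is approximately v_i v_j) by simple coordinate descent, then uses v as voting weights. The coordinate-descent fit is a stand-in chosen for brevity and is not claimed to be the estimator used in the paper; the simulated accuracies, sample size and random seed are hypothetical. True labels are used only to simulate predictions and to score the result.

```python
import random


def covariance(preds):
    """Sample covariance between rows of +/-1 predictions."""
    m, n = len(preds), len(preds[0])
    means = [sum(row) / n for row in preds]
    return [
        [
            sum(a * b for a, b in zip(preds[i], preds[j])) / n
            - means[i] * means[j]
            for j in range(m)
        ]
        for i in range(m)
    ]


def fit_rank_one_offdiag(q, sweeps=100):
    """Find v with q[i][j] ~ v[i] * v[j] for i != j, ignoring the diagonal."""
    m = len(q)
    v = [1.0] * m
    for _ in range(sweeps):
        for i in range(m):
            num = sum(q[i][j] * v[j] for j in range(m) if j != i)
            den = sum(v[j] ** 2 for j in range(m) if j != i)
            if den > 0:
                v[i] = num / den
    return v


def weighted_vote(preds, weights):
    n = len(preds[0])
    return [
        1 if sum(w * row[k] for w, row in zip(weights, preds)) >= 0 else -1
        for k in range(n)
    ]


rng = random.Random(0)
true_accuracies = [0.9, 0.8, 0.7, 0.6, 0.55]  # hypothetical classifiers
labels = [rng.choice([-1, 1]) for _ in range(2000)]
preds = [
    [y if rng.random() < acc else -y for y in labels]
    for acc in true_accuracies
]

weights = fit_rank_one_offdiag(covariance(preds))
ensemble = weighted_vote(preds, weights)
ensemble_accuracy = sum(e == y for e, y in zip(ensemble, labels)) / len(labels)
```

Sorting the classifiers by their entry in `weights` gives the unsupervised ranking; the simulation satisfies the conditional independence assumption by construction, which real ensembles may not.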
Classification of Odours for Mobile Robots Using an Ensemble of Linear Classifiers
NASA Astrophysics Data System (ADS)
Trincavelli, Marco; Coradeschi, Silvia; Loutfi, Amy
2009-05-01
This paper investigates the classification of odours using an electronic nose mounted on a mobile robot. The samples are collected as the robot explores the environment. Under such conditions, the sensor response differs from typical three phase sampling processes. In this paper, we focus particularly on the classification problem and how it is influenced by the movement of the robot. To cope with these influences, an algorithm consisting of an ensemble of classifiers is presented. Experimental results show that this algorithm increases classification performance compared to other traditional classification methods.
Driving behavior recognition using EEG data from a simulated car-following experiment.
Yang, Liu; Ma, Rui; Zhang, H Michael; Guan, Wei; Jiang, Shixiong
2018-07-01
Driving behavior recognition is the foundation of driver assistance systems, with potential applications in automated driving systems. Most prevailing studies have used subjective questionnaire data and objective driving data to classify driving behaviors, while few studies have used physiological signals such as electroencephalography (EEG) to gather data. To bridge this gap, this paper proposes a two-layer learning method for driving behavior recognition using EEG data. A simulated car-following driving experiment was designed and conducted to simultaneously collect data on the driving behaviors and EEG data of drivers. The proposed learning method consists of two layers. In Layer I, two-dimensional driving behavior features representing driving style and stability were selected and extracted from raw driving behavior data using K-means and support vector machine recursive feature elimination. Five groups of driving behaviors were classified based on these two-dimensional driving behavior features. In Layer II, the classification results from Layer I were utilized as inputs to generate a k-Nearest-Neighbor classifier identifying driving behavior groups using EEG data. Using independent component analysis, a fast Fourier transformation, and linear discriminant analysis sequentially, the raw EEG signals were processed to extract two core EEG features. Classifier performance was enhanced using the adaptive synthetic sampling approach. A leave-one-subject-out cross validation was conducted. The results showed that the average classification accuracy for all tested traffic states was 69.5% and the highest accuracy reached 83.5%, suggesting a significant correlation between EEG patterns and car-following behavior. Copyright © 2017 Elsevier Ltd. All rights reserved.
Multi-feature classifiers for burst detection in single EEG channels from preterm infants
NASA Astrophysics Data System (ADS)
Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.
2017-08-01
Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might under-perform with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained on visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The best-performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier in an unlabeled database of 21 infants ⩾36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
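Cohen's kappa, used above to quantify human-automatic agreement, corrects the observed agreement rate for the agreement expected by chance from each rater's label frequencies. A minimal sketch follows; the two label sequences are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two label sequences."""
    n = len(rater_a)
    observed = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(label) / n) * (rater_b.count(label) / n)
        for label in labels
    )
    return (observed - expected) / (1 - expected)


# Hypothetical burst (1) / inter-burst (0) labels per EEG segment.
human = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
automatic = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
kappa = cohens_kappa(human, automatic)
```

Here the raw agreement is 0.8, but kappa is lower because two raters with these label frequencies would already agree about half the time by chance.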
NASA Astrophysics Data System (ADS)
Iyatomi, Hitoshi; Hashimoto, Jun; Yoshii, Fumuhito; Kazama, Toshiki; Kawada, Shuichi; Imai, Yutaka
2014-03-01
Discrimination between Alzheimer's disease and other dementia is clinically significant, but it is often difficult. In this study, we developed classification models among Alzheimer's disease (AD), other dementia (OD) and/or normal subjects (NC) using patient factors and indices obtained by brain perfusion SPECT. SPECT is commonly used to assess cerebral blood flow (CBF) and allows the evaluation of the severity of hypoperfusion by introducing statistical parametric mapping (SPM). We investigated a total of 150 cases (50 cases each for AD, OD, and NC) from Tokai University Hospital, Japan. In each case, we obtained a total of 127 candidate parameters from: (A) 2 patient factors (age and sex), (B) 12 CBF parameters and 113 SPM parameters including (C) 3 from specific volume analysis (SVA), and (D) 110 from voxel-based analysis stereotactic extraction estimation (vbSEE). We built linear classifiers with a statistical stepwise feature selection and evaluated the performance with the leave-one-out cross validation strategy. Our classifiers achieved very high classification performance with a reasonable number of selected parameters. In the most clinically significant discrimination, namely that of AD from OD, our classifier achieved both sensitivity (SE) and specificity (SP) of 96%. In a similar way, our classifiers achieved an SE of 90% and an SP of 98% in AD from NC, as well as an SE of 88% and an SP of 86% in AD from OD and NC cases. Introducing SPM indices such as SVA and vbSEE improved classification performance by around 7-15%. We confirmed that these SPM factors are quite important for diagnosing Alzheimer's disease.
NASA Astrophysics Data System (ADS)
Alfano, R.; Soetemans, D.; Bauman, G. S.; Gibson, E.; Gaed, M.; Moussa, M.; Gomez, J. A.; Chin, J. L.; Pautler, S.; Ward, A. D.
2018-02-01
Multi-parametric MRI (mp-MRI) is becoming a standard in contemporary prostate cancer screening and diagnosis, and has been shown to aid physicians in cancer detection. It offers many advantages over traditional systematic biopsy, which has been shown to have very high clinical false-negative rates of up to 23% at all stages of the disease. However beneficial, mp-MRI is relatively complex to interpret and suffers from inter-observer variability in lesion localization and grading. Computer-aided diagnosis (CAD) systems have been developed as a solution as they have the power to perform deterministic quantitative image analysis. We measured the accuracy of such a system validated using accurately co-registered whole-mount digitized histology. We trained a logistic linear classifier (LOGLC), support vector machine (SVC), k-nearest neighbour (KNN) and random forest classifier (RFC) in a four-part ROI-based experiment against: 1) cancer vs. non-cancer, 2) high-grade (Gleason score ≥4+3) vs. low-grade cancer (Gleason score <4+3), 3) high-grade vs. other tissue components and 4) high-grade vs. benign tissue by selecting the classifier with the highest AUC using 1-10 features from forward feature selection. The CAD model was able to classify malignant vs. benign tissue and detect high-grade cancer with high accuracy. Once fully validated, this work will form the basis for a tool that enhances the radiologist's ability to detect malignancies, potentially improving biopsy guidance, treatment selection, and focal therapy for prostate cancer patients, maximizing the potential for cure and increasing quality of life.
Spectral embedding finds meaningful (relevant) structure in image and microarray data
Higgs, Brandon W; Weller, Jennifer; Solka, Jeffrey L
2006-01-01
Background Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR) method that is unsupervised in that it relies only on the data; projections are calculated in Euclidean or a similar linear space and do not use tuning parameters for optimizing the fit to the data. However, relationships within sets of nonlinear data types, such as biological networks or images, are frequently mis-rendered into a low dimensional space by linear methods. Nonlinear methods, in contrast, attempt to model important aspects of the underlying data structure, often requiring parameter(s) fitting to the data type of interest. In many cases, the optimal parameter values vary when different classification algorithms are applied on the same rendered subspace, making the results of such methods highly dependent upon the type of classifier implemented. Results We present the results of applying the spectral method of Lafon, a nonlinear DR method based on the weighted graph Laplacian, that minimizes the requirements for such parameter optimization for two biological data types. We demonstrate that it is successful in determining implicit ordering of brain slice image data and in classifying separate species in microarray data, as compared to two conventional linear methods and three nonlinear methods (one of which is an alternative spectral method). This spectral implementation is shown to provide more meaningful information, by preserving important relationships, than the methods of DR presented for comparison. Tuning parameter fitting is simple and is a general, rather than data type or experiment specific approach, for the two datasets analyzed here. 
Tuning parameter optimization is minimized in the DR step to each subsequent classification method, enabling the possibility of valid cross-experiment comparisons. Conclusion Results from the spectral method presented here exhibit the desirable properties of preserving meaningful nonlinear relationships in lower dimensional space and requiring minimal parameter fitting, providing a useful algorithm for purposes of visualization and classification across diverse datasets, a common challenge in systems biology. PMID:16483359
Koinzer, Stefan; Hesse, Carola; Caliebe, Amke; Saeger, Mark; Baade, Alexander; Schlott, Kerstin; Brinkmann, Ralf; Roider, Johann
2013-09-01
The rabbit is the most common animal model to study retinal photocoagulation lesions. We present a classification of retinal lesions from rabbits, that is based on optical coherence tomographic (OCT) findings, temperature data, and OCT-follow-up data over 3 months. Four hundred eighty-six photocoagulation lesions (modified Zeiss Visulas® 532 nm CW laser, lesion diameter 133 µm, exposure duration 200 milliseconds or variable, power variable) were analyzed from six eyes of three chinchilla gray rabbits. During the irradiation of each lesion, we used an optoacoustics-based method to measure the retinal temperature profile. Two hours, 1 week, 1 month, and 3 months after the treatment, we obtained fundus color and OCT (Spectralis®) images of each lesion. We classified the lesions according to their OCT morphology and correlated the findings to ophthalmoscopic and OCT lesion diameters, and temperatures. Besides an undetectable lesion class 0, we discerned subthreshold lesions that were invisible on the fundus but detectable in OCT (classes 1 and 2), very mild lesions that were partly visible on the fundus (class 3), and 3 classes of suprathreshold lesions. OCT greatest linear diameters (GLDs) were larger than ophthalmoscopic lesion diameters, both increased for increasing classes, and GLDs decreased over 3 months within each class. Mean peak end temperatures for 200 milliseconds lesions ranged from 61°C in class 2 to 80°C in class 6. The seven step rabbit lesion classifier is distinct from a previously published human lesion classifier. Threshold lesions are generated at comparable temperatures in rabbits and humans, while more intense lesions are created at lower temperatures in rabbits. The OCT lesion classifier could replace routine histology in some studies, and the presented data may be used to estimate lesion end temperatures from OCT images. © 2013 Wiley Periodicals, Inc.
Rohde, Maximilian; Mehari, Fanuel; Klämpfl, Florian; Adler, Werner; Neukam, Friedrich-Wilhelm; Schmidt, Michael; Stelzle, Florian
2017-10-01
Compared to conventional techniques, laser surgery procedures provide a number of advantages, but may be associated with an increased risk of iatrogenic damage to important anatomical structures. The type of tissue ablated in the focus spot is unknown. Laser-Induced Breakdown Spectroscopy (LIBS) has the potential to provide information about the type of material that is being ablated by the laser beam. This may form the basis for tissue-selective laser surgery. In the present study, 7 different porcine tissues (cortical and cancellous bone, nerve, mucosa, enamel, dentine and pulp) from 6 animals were analyzed for their qualitative and semiquantitative molecular composition using LIBS. The data gathered in this way were used to first differentiate between soft and hard tissues using a Calcium-Carbon emission based classifier. The tissues were then further classified using emission-ratio based analysis, principal component analysis (PCA) and linear discriminant analysis (LDA). The relatively higher concentration of Calcium in the hard tissues allows for an accurate first differentiation of soft and hard tissues (100% sensitivity and specificity). The ratio based statistical differentiation approach yields results in the range from 65% (enamel-dentine pair) to 100% (nerve-pulp, cancellous bone-dentine, cancellous bone-enamel pairs) sensitivity and specificity. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
Bahl, Gautam; Cruite, Irene; Wolfson, Tanya; Gamst, Anthony C.; Collins, Julie M.; Chavez, Alyssa D.; Barakat, Fatma; Hassanein, Tarek; Sirlin, Claude B.
2016-01-01
Purpose To demonstrate a proof of concept that quantitative texture feature analysis of double contrast-enhanced magnetic resonance imaging (MRI) can classify fibrosis noninvasively, using histology as a reference standard. Materials and Methods A Health Insurance Portability and Accountability Act (HIPAA)-compliant Institutional Review Board (IRB)-approved retrospective study of 68 patients with diffuse liver disease was performed at a tertiary liver center. All patients underwent double contrast-enhanced MRI, with histopathology-based staging of fibrosis obtained within 12 months of imaging. The MaZda software program was used to compute 279 texture parameters for each image. A statistical regularization technique, generalized linear model (GLM)-path, was used to develop a model based on texture features for dichotomous classification of fibrosis category (F ≤2 vs. F ≥3) of the 68 patients, with histology as the reference standard. The model's performance was assessed and cross-validated. There was no additional validation performed on an independent cohort. Results Cross-validated sensitivity, specificity, and total accuracy of the texture feature model in classifying fibrosis were 91.9%, 83.9%, and 88.2%, respectively. Conclusion This study shows proof of concept that accurate, noninvasive classification of liver fibrosis is possible by applying quantitative texture analysis to double contrast-enhanced MRI. Further studies are needed in independent cohorts of subjects. PMID:22851409
Mighell, Alan D.
2001-01-01
In theory, physical crystals can be represented by idealized mathematical lattices. Under appropriate conditions, these representations can be used for a variety of purposes such as identifying, classifying, and understanding the physical properties of materials. Critical to these applications is the ability to construct a unique representation of the lattice. The vital link that enabled this theory to be realized in practice was provided by the 1970 paper on the determination of reduced cells. This seminal paper led to a mathematical approach to lattice analysis initially based on systematic reduction procedures and the use of standard cells. Subsequently, the process evolved to a matrix approach based on group theory and linear algebra that offered a more abstract and powerful way to look at lattices and their properties. Application of the reduced cell to both database work and laboratory research at NIST was immediately successful. Currently, this cell and/or procedures based on reduction are widely and routinely used by the general scientific community: (i) for calculating standard cells for the reporting of crystalline materials, (ii) for classifying materials, (iii) in crystallographic database work (iv) in routine x-ray and neutron diffractometry, and (v) in general crystallographic research. Especially important is its use in symmetry determination and in identification. The focus herein is on the role of the reduced cell in lattice symmetry determination. PMID:27500059
Centre-based restricted nearest feature plane with angle classifier for face recognition
NASA Astrophysics Data System (ADS)
Tang, Linlin; Lu, Huifen; Zhao, Liang; Li, Zuohua
2017-10-01
An improved classifier based on the nearest feature plane (NFP), called the centre-based restricted nearest feature plane with angle (RNFPA) classifier, is proposed for face recognition problems. The well-known NFP uses the geometrical information of samples to increase the number of training samples, but it increases the computational complexity and also suffers from an inaccuracy problem caused by the extended feature plane. To solve these problems, RNFPA exploits a centre-based feature plane and uses an angle threshold to restrict the extended feature space. By choosing an appropriate angle threshold, RNFPA can improve performance and decrease computational complexity. Experiments on the AT&T face database, AR face database and FERET face database are used to evaluate the proposed classifier. Compared with the original NFP classifier, the nearest feature line (NFL) classifier, the nearest neighbour (NN) classifier and some other improved NFP classifiers, the proposed one achieves competitive performance.
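To make the geometric idea concrete, the sketch below implements the simpler nearest feature line (NFL) rule that the abstract lists as a baseline, not the proposed RNFPA. Each pair of same-class prototypes spans an unbounded line, and a query is assigned to the class owning the closest line. Function names and the toy prototypes are illustrative assumptions.

```python
import math
from itertools import combinations

def feature_line_distance(q, x1, x2):
    """Distance from query q to the (infinite) line through prototypes x1 and x2."""
    d = [b - a for a, b in zip(x1, x2)]
    norm_sq = sum(c * c for c in d)
    if norm_sq == 0.0:
        return math.dist(q, x1)  # degenerate pair: fall back to point distance
    # Position of the orthogonal projection along the line (t < 0 or t > 1
    # means the projection lies outside the segment between the prototypes).
    t = sum((qi - a) * c for qi, a, c in zip(q, x1, d)) / norm_sq
    projection = [a + t * c for a, c in zip(x1, d)]
    return math.dist(q, projection)

def nfl_classify(q, prototypes):
    """prototypes maps class label -> list of feature vectors (two or more per class)."""
    best_label, best_dist = None, float("inf")
    for label, vectors in prototypes.items():
        for x1, x2 in combinations(vectors, 2):
            dist = feature_line_distance(q, x1, x2)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label

# Hypothetical 2-D prototypes standing in for face feature vectors.
prototypes = {"a": [[0.0, 0.0], [1.0, 0.0]], "b": [[0.0, 3.0], [1.0, 3.0]]}
print(nfl_classify([0.5, 1.0], prototypes))
print(feature_line_distance([5.0, 2.0], [-1.0, 0.0], [1.0, 0.0]))
```

The last call shows the extrapolation effect: a query far beyond both prototypes still lies close to their extended line, which is the kind of inaccuracy that restricting the extended feature space is meant to limit.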
NASA Astrophysics Data System (ADS)
Bang, Jeongil; Oak, Jeong-Jung; Park, Yong Ho
2016-01-01
The aim of this study was to characterize the microstructures and mechanical properties of aluminum metal matrix composites (MMCs) prepared by a powder metallurgy method. Consolidation of mixed powder with gas atomized Al-Si/SiCp powder and Al-14Si-2.5Cu-0.5Mg powder by hot pressing was classified according to sintering temperature and sintering time. The sintering condition was optimized using the tensile properties of sintered specimens. The ultimate tensile strength of the optimized sintered specimen was 228 MPa with an elongation of 5.3% in the longitudinal direction. In addition, wear properties and behaviors of the sintered aluminum-based MMCs were analyzed in accordance with vertical load and linear speed. As the linear speed and vertical load increased, the wear behavior changed in the order of oxidation of the Al-Si matrix, formation of a C-rich layer, Fe-alloying to the matrix, and melting of the specimen.
Khanmohammadi, Mohammadreza; Bagheri Garmarudi, Amir; Samani, Simin; Ghasemi, Keyvan; Ashuri, Ahmad
2011-06-01
Attenuated Total Reflectance Fourier Transform Infrared (ATR-FTIR) microspectroscopy was applied for detection of colon cancer according to the spectral features of colon tissues. Supervised classification models can be trained to identify the tissue type based on the spectroscopic fingerprint. A total of 78 colon tissues were used in spectroscopy studies. Major spectral differences were observed in the 1,740-900 cm⁻¹ spectral region. Several chemometric methods such as analysis of variance (ANOVA), cluster analysis (CA) and linear discriminant analysis (LDA) were applied for classification of IR spectra. Utilizing the chemometric techniques, clear and reproducible differences were observed between the spectra of normal and cancer cases, suggesting that infrared microspectroscopy in conjunction with spectral data processing would be useful for diagnostic classification. Using the LDA technique, the spectra were classified into cancer and normal tissue classes with an accuracy of 95.8%. The sensitivity and specificity were 100% and 93.1%, respectively.
A Transform-Based Feature Extraction Approach for Motor Imagery Tasks Classification
Khorshidtalab, Aida; Mesbah, Mostefa; Salami, Momoh J. E.
2015-01-01
In this paper, we present a new motor imagery classification method in the context of electroencephalography (EEG)-based brain–computer interface (BCI). This method uses a signal-dependent orthogonal transform, referred to as linear prediction singular value decomposition (LP-SVD), for feature extraction. The transform defines the mapping as the left singular vectors of the LP coefficient filter impulse response matrix. Using a logistic tree-based model classifier, the extracted features are classified into one of four motor imagery movements. The proposed approach was first benchmarked against two related state-of-the-art feature extraction approaches, namely, discrete cosine transform (DCT) and adaptive autoregressive (AAR)-based methods. By achieving an accuracy of 67.35%, the LP-SVD approach outperformed the other approaches by large margins (25% compared with DCT and 6% compared with AAR-based methods). To further improve the discriminatory capability of the extracted features and reduce the computational complexity, we enlarged the extracted feature subset by incorporating two extra features, namely, the Q- and Hotelling's $T^{2}$ statistics of the transformed EEG, and introduced a new EEG channel selection method. The performance of the EEG classification based on the expanded feature set and channel selection method was compared with that of a number of the state-of-the-art classification methods previously reported with the BCI IIIa competition data set. Our method came second with an average accuracy of 81.38%. PMID:27170898
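For orientation, the two extra statistics are usually defined as follows in multivariate analysis. This is the textbook form for an orthonormal basis, given here as an assumption; the abstract does not state the exact definitions used with LP-SVD. The symbols $x$ (a signal vector), $P_k$ (the $k$ retained orthonormal basis vectors) and $\lambda_i$ (the variance of the $i$-th transformed coefficient) are introduced only for this illustration.

```latex
% Transformed coefficients (scores) of a signal vector x
t = P_k^{\top} x

% Hotelling's T^2: variance-normalized energy inside the retained subspace
T^{2} = \sum_{i=1}^{k} \frac{t_i^{2}}{\lambda_i}

% Q statistic: squared residual left outside the retained subspace
Q = \left\lVert x - P_k P_k^{\top} x \right\rVert^{2}
```

Read this way, $T^{2}$ summarizes how unusual a signal is within the retained subspace, while $Q$ measures how much of it the subspace fails to capture, so the two features are complementary.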
Sparse Substring Pattern Set Discovery Using Linear Programming Boosting
NASA Astrophysics Data System (ADS)
Kashihara, Kazuaki; Hatano, Kohei; Bannai, Hideo; Takeda, Masayuki
In this paper, we consider finding a small set of substring patterns that classifies the given documents well. We formulate the problem as a 1-norm soft margin optimization problem where each dimension corresponds to a substring pattern. Then we solve this problem by using LPBoost and an optimal substring discovery algorithm. Since the problem is a linear program, the resulting solution is likely to be sparse, which is useful for feature selection. We evaluate the proposed method on real data such as movie reviews.
Case base classification on digital mammograms: improving the performance of case base classifier
NASA Astrophysics Data System (ADS)
Raman, Valliappan; Then, H. H.; Sumari, Putra; Venkatesa Mohan, N.
2011-10-01
Breast cancer continues to be a significant public health problem in the world. Early detection is the key to improving breast cancer prognosis. The aim of the research presented here is twofold. The first stage involves machine learning techniques that segment masses in digital mammograms and extract features from them. The second stage is a problem-solving approach that classifies the masses with a performance-based case-base classifier. In this paper we build a case-based classifier in order to diagnose mammographic images. We explain the different methods and behaviors that have been added to the classifier to improve its performance. An initial performance-based classifier with bagging is proposed and has been implemented; it shows an improvement in specificity and sensitivity.
Decoding magnetoencephalographic rhythmic activity using spectrospatial information.
Kauppi, Jukka-Pekka; Parkkonen, Lauri; Hari, Riitta; Hyvärinen, Aapo
2013-12-01
We propose a new data-driven decoding method called Spectral Linear Discriminant Analysis (Spectral LDA) for the analysis of magnetoencephalography (MEG). The method allows investigation of changes in rhythmic neural activity as a result of different stimuli and tasks. The introduced classification model only assumes that each "brain state" can be characterized as a combination of neural sources, each of which shows rhythmic activity at one or several frequency bands. Furthermore, the model allows the oscillation frequencies to be different for each such state. We present decoding results from 9 subjects in a four-category classification problem defined by an experiment involving randomly alternating epochs of auditory, visual and tactile stimuli interspersed with rest periods. The performance of Spectral LDA was very competitive compared with four alternative classifiers based on different assumptions concerning the organization of rhythmic brain activity. In addition, the spectral and spatial patterns extracted automatically on the basis of trained classifiers showed that Spectral LDA offers a novel and interesting way of analyzing spectrospatial oscillatory neural activity across the brain. All the presented classification methods and visualization tools are freely available as a Matlab toolbox. © 2013.
[Study on application of SVM in prediction of coronary heart disease].
Zhu, Yue; Wu, Jianghua; Fang, Ying
2013-12-01
Based on blood pressure, plasma lipid, Glu and UA data from physical examination, a Support Vector Machine (SVM) was applied to distinguish coronary heart disease (CHD) patients from non-CHD individuals in a south China population, to guide further prevention and treatment of the disease. First, the SVM classifier was built using a radial basis kernel function, a linear kernel function and a polynomial kernel function, respectively. Second, the SVM penalty factor C and kernel parameter sigma were optimized by particle swarm optimization (PSO) and then employed to diagnose and predict CHD. Compared with an artificial neural network with the back propagation (BP) model, linear discriminant analysis, logistic regression and a non-optimized SVM, the optimized RBF-SVM model showed superior classification performance, with accuracy, sensitivity and specificity of 94.51%, 92.31% and 96.67%, respectively. It is therefore concluded that SVM could be used as a valid method for assisting diagnosis of CHD.
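The three kernel families named above can be written down compactly. The sketch below uses their common textbook forms; the parameter names and defaults (`degree`, `coef0`, `sigma`) are assumptions for illustration, and the abstract does not give the values used in the study.

```python
import math

def linear_kernel(x, y):
    """Plain inner product <x, y>."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, degree=3, coef0=1.0):
    """(<x, y> + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian radial basis kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Hypothetical feature vectors, not patient data.
u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v, degree=2), rbf_kernel(u, v))
```

The RBF width `sigma` and the SVM penalty factor C are the two quantities a search procedure such as PSO would tune; a small `sigma` makes the kernel very local, while a large one makes it behave almost linearly.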
NASA Astrophysics Data System (ADS)
Li, Shaoxin; Zhang, Yanjiao; Xu, Junfa; Li, Linfang; Zeng, Qiuyao; Lin, Lin; Guo, Zhouyi; Liu, Zhiming; Xiong, Honglian; Liu, Songhao
2014-09-01
This study aims to present a noninvasive prostate cancer screening method using serum surface-enhanced Raman scattering (SERS) and support vector machine (SVM) techniques on peripheral blood samples. SERS measurements are performed with silver nanoparticles on serum samples from 93 prostate cancer patients and 68 healthy volunteers. Three types of kernel functions, namely linear, polynomial, and Gaussian radial basis function (RBF), are employed to build SVM diagnostic models for classifying the measured SERS spectra. To evaluate the performance of the SVM classification models by comparison, the standard multivariate statistical method of principal component analysis (PCA) is also applied to classify the same datasets. The results show that the RBF kernel SVM diagnostic model achieves a diagnostic accuracy of 98.1%, which is superior to the 91.3% obtained with the PCA method. The receiver operating characteristic curves of the diagnostic models further confirm these results. This study demonstrates that label-free serum SERS analysis combined with an SVM diagnostic algorithm has great potential for noninvasive prostate cancer screening.
Ambiguity domain-based identification of altered gait pattern in ALS disorder
NASA Astrophysics Data System (ADS)
Sugavaneswaran, L.; Umapathy, K.; Krishnan, S.
2012-08-01
The onset of a neurological disorder, such as amyotrophic lateral sclerosis (ALS), is so subtle that the symptoms are often overlooked, thereby ruling out the option of early detection of the abnormality. In the case of ALS, over 75% of the affected individuals often experience awkwardness when using their limbs, which alters their gait, i.e. stride and swing intervals. The aim of this work is to suitably represent the non-stationary characteristics of gait (fluctuations in stride and swing intervals) in order to facilitate discrimination between normal and ALS subjects. We define a simple-yet-representative feature vector space by exploiting the ambiguity domain (AD) to achieve efficient classification between healthy and pathological gait stride interval. The stride-to-stride fluctuations and the swing intervals of 16 healthy control and 13 ALS-affected subjects were analyzed. Three features that are representative of the gait signal characteristics were extracted from the AD-space and are fed to linear discriminant analysis and neural network classifiers, respectively. Overall, maximum accuracies of 89.2% (LDA) and 100% (NN) were obtained in classifying the ALS gait.
Distinguishing centrarchid genera by use of lateral line scales
Roberts, N.M.; Rabeni, C.F.; Stanovick, J.S.
2007-01-01
Predator-prey relations involving fishes are often evaluated using scales remaining in gut contents or feces. While several reliable keys help identify North American freshwater fish scales to the family level, none attempt to separate the family Centrarchidae to the genus level. Centrarchidae is of particular concern in the midwestern United States because it contains several popular sport fishes, such as smallmouth bass Micropterus dolomieu, largemouth bass M. salmoides, and rock bass Ambloplites rupestris, as well as less-sought-after species of sunfishes Lepomis spp. and crappies Pomoxis spp. Differentiating sport fish from non-sport fish has important management implications. Morphological characteristics of lateral line scales (n = 1,581) from known centrarchid fishes were analyzed. The variability of measurements within and between genera was examined to select variables that were the most useful in further classifying unknown centrarchid scales. A linear discriminant analysis model was developed using 10 variables. Based on this model, 84.4% of Ambloplites scales, 81.2% of Lepomis scales, and 86.6% of Micropterus scales were classified correctly using a jackknife procedure. © Copyright by the American Fisheries Society 2007.
EEG Responses to Auditory Stimuli for Automatic Affect Recognition
Hettich, Dirk T.; Bolinger, Elaina; Matuz, Tamara; Birbaumer, Niels; Rosenstiel, Wolfgang; Spüler, Martin
2016-01-01
Brain state classification for communication and control has been well established in the area of brain-computer interfaces over the last decades. Recently, the passive and automatic extraction of additional information regarding the psychological state of users from neurophysiological signals has gained increased attention in the interdisciplinary field of affective computing. We investigated how well specific emotional reactions, induced by auditory stimuli, can be detected in EEG recordings. We introduce an auditory emotion induction paradigm based on the International Affective Digitized Sounds 2nd Edition (IADS-2) database that is also suitable for disabled individuals. Stimuli are grouped in three valence categories: unpleasant, neutral, and pleasant. Significant differences in time domain event-related potentials are found in the electroencephalogram (EEG) between unpleasant and neutral, as well as pleasant and neutral conditions over midline electrodes. Time domain data were classified in three binary classification problems using a linear support vector machine (SVM) classifier. We discuss three classification performance measures in the context of affective computing and outline some strategies for conducting and reporting affect classification studies. PMID:27375410
Wen, Tingxi; Zhang, Zhongnan
2017-01-01
In this paper, a genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance to intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in extracting useful features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789
Wang, Tao; He, Fuhong; Zhang, Anding; Gu, Lijuan; Wen, Yangmao; Jiang, Weiguo; Shao, Hongbo
2014-01-01
This paper took a subregion in a small watershed gully system at Beiyanzikou catchment of Qixia, China, as a study area and, using object-orientated image analysis (OBIA), extracted the shoulder line of gullies from high spatial resolution digital orthophoto map (DOM) aerial photographs. Next, it proposed an accuracy assessment method based on the adjacent distance between the boundary classified by remote sensing and points measured by RTK-GPS along the shoulder line of gullies. Finally, the original surface was fitted using linear regression in accordance with the elevation of two extracted edges of experimental gullies, named Gully 1 and Gully 2, and the erosion volume was calculated. The results indicate that OBIA can effectively extract information on gullies; the average range difference between points measured in the field along the edge of gullies and the classified boundary is 0.3166 m, with variance of 0.2116 m. The erosion area and volume of the two gullies are 2141.6250 m², 5074.1790 m³ and 1316.1250 m², 1591.5784 m³, respectively. The results of the study provide a new method for the quantitative study of small gully erosion. PMID:24616626
NASA Astrophysics Data System (ADS)
Wang, Tao; He, Bin
2004-03-01
The recognition of mental states during motor imagery tasks is crucial for EEG-based brain computer interface research. We have developed a new algorithm by means of a frequency decomposition and weighting synthesis strategy for recognizing imagined right- and left-hand movements. A frequency range from 5 to 25 Hz was divided into 20 band bins for each trial, and the corresponding envelopes of filtered EEG signals for each trial were extracted as a measure of instantaneous power at each frequency band. The dimensionality of the feature space was reduced from 200 (corresponding to 2 s) to 3 by down-sampling the envelopes of the feature signals and subsequently applying principal component analysis. The linear discriminant analysis algorithm was then used to classify the features, due to its generalization capability. Each frequency band bin was weighted by a function determined according to the classification accuracy during the training process. The classification algorithm was applied to a dataset of nine human subjects, and achieved a classification success rate of 90% in training and 77% in testing. These promising results suggest that the algorithm can be used in initiating general-purpose mental state recognition based on motor imagery tasks.
Human Detection from a Mobile Robot Using Fusion of Laser and Vision Information
Fotiadis, Efstathios P.; Garzón, Mario; Barrientos, Antonio
2013-01-01
This paper presents a human detection system that can be employed on board a mobile platform for use in autonomous surveillance of large outdoor infrastructures. The prediction is based on the fusion of two detection modules, one for the laser and another for the vision data. In the laser module, a novel feature set that better encapsulates variations due to noise, distance and human pose is proposed. This enhances the generalization of the system, while at the same time, increasing the outdoor performance in comparison with current methods. The vision module uses the combination of the histogram of oriented gradients descriptor and the linear support vector machine classifier. Current approaches use a fixed-size projection to define regions of interest on the image data using the range information from the laser range finder. When applied to small size unmanned ground vehicles, these techniques suffer from misalignment, due to platform vibrations and terrain irregularities. This is effectively addressed in this work by using a novel adaptive projection technique, which is based on a probabilistic formulation of the classifier performance. Finally, a probability calibration step is introduced in order to optimally fuse the information from both modules. Experiments in real world environments demonstrate the robustness of the proposed method. PMID:24008280
Human detection from a mobile robot using fusion of laser and vision information.
Fotiadis, Efstathios P; Garzón, Mario; Barrientos, Antonio
2013-09-04
This paper presents a human detection system that can be employed on board a mobile platform for use in autonomous surveillance of large outdoor infrastructures. The prediction is based on the fusion of two detection modules, one for the laser and another for the vision data. In the laser module, a novel feature set that better encapsulates variations due to noise, distance and human pose is proposed. This enhances the generalization of the system, while at the same time, increasing the outdoor performance in comparison with current methods. The vision module uses the combination of the histogram of oriented gradients descriptor and the linear support vector machine classifier. Current approaches use a fixed-size projection to define regions of interest on the image data using the range information from the laser range finder. When applied to small size unmanned ground vehicles, these techniques suffer from misalignment, due to platform vibrations and terrain irregularities. This is effectively addressed in this work by using a novel adaptive projection technique, which is based on a probabilistic formulation of the classifier performance. Finally, a probability calibration step is introduced in order to optimally fuse the information from both modules. Experiments in real world environments demonstrate the robustness of the proposed method.
Automatic seed selection for segmentation of liver cirrhosis in laparoscopic sequences
NASA Astrophysics Data System (ADS)
Sinha, Rahul; Marcinczak, Jan Marek; Grigat, Rolf-Rainer
2014-03-01
For computer aided diagnosis based on laparoscopic sequences, image segmentation is one of the basic steps which define the success of all further processing. However, many image segmentation algorithms require prior knowledge which is given by interaction with the clinician. We propose an automatic seed selection algorithm for segmentation of liver cirrhosis in laparoscopic sequences which assigns each pixel a probability of being cirrhotic liver tissue or background tissue. Our approach is based on a trained classifier using SIFT and RGB features with PCA. Due to the unique illumination conditions in laparoscopic sequences of the liver, a very low dimensional feature space can be used for classification via logistic regression. The methodology is evaluated on 718 cirrhotic liver and background patches that are taken from laparoscopic sequences of 7 patients. Using a linear classifier we achieve a precision of 91% in a leave-one-patient-out cross-validation. Furthermore, we demonstrate that with logistic probability estimates, seeds with high certainty of being cirrhotic liver tissue can be obtained. For example, our precision of liver seeds increases to 98.5% if only seeds with more than 95% probability of being liver are used. Finally, these automatically selected seeds can be used as priors in Graph Cuts which is demonstrated in this paper.
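The seed selection step described above (keep only pixels whose logistic probability of being liver exceeds a threshold) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the weights, bias, and feature vectors are hypothetical, and the low-dimensional features are assumed to be already computed (for example, after PCA).

```python
import math

def logistic(z):
    """Standard logistic function mapping a score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def select_seeds(features, weights, bias, threshold=0.95):
    """Return (index, probability) for samples whose liver probability exceeds threshold."""
    seeds = []
    for idx, x in enumerate(features):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        p = logistic(score)
        if p > threshold:
            seeds.append((idx, p))
    return seeds

# Hypothetical 2-D features and linear model parameters.
features = [[0.0, 0.0], [5.0, 0.0], [-5.0, 0.0]]
seeds = select_seeds(features, weights=[1.0, 0.0], bias=0.0, threshold=0.95)
```

Raising the threshold trades the number of seeds for certainty, which matches the abstract's observation that precision rises when only seeds above 95% probability are kept.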
Assessment of forward head posture in females: observational and photogrammetry methods.
Salahzadeh, Zahra; Maroufi, Nader; Ahmadi, Amir; Behtash, Hamid; Razmjoo, Arash; Gohari, Mahmoud; Parnianpour, Mohamad
2014-01-01
There are different methods to assess forward head posture (FHP), but the accuracy and discrimination ability of these methods are not clear. Here, we compare three postural angles for FHP assessment and also study the discrimination accuracy of three photogrammetric methods in differentiating groups categorized by an observational method. Seventy-eight healthy female participants (23 ± 2.63 years) were classified into three groups (moderate-severe FHP, slight FHP and non FHP) based on observational postural assessment rules. Three photogrammetric methods (craniovertebral angle, head tilt angle and head position angle) were applied to measure FHP objectively. A one-way ANOVA test showed a significant difference in craniovertebral angle among the three groups (P< 0.05, F=83.07). There was no dramatic difference among the three groups for the head tilt angle and head position angle methods. According to the Linear Discriminant Analysis (LDA) results, the canonical discriminant function (Wilks' lambda) was 0.311 for craniovertebral angle, with 79.5% of cross-validated grouped cases correctly classified. Our results showed that the craniovertebral angle method may discriminate females with moderate-severe FHP and non FHP more accurately than head position angle and head tilt angle. The photogrammetric method had excellent inter- and intra-rater reliability for assessing head and cervical posture.
A novel Bayesian framework for discriminative feature extraction in Brain-Computer Interfaces.
Suk, Heung-Il; Lee, Seong-Whan
2013-02-01
As there has been a paradigm shift in the learning load from a human subject to a computer, machine learning has been considered as a useful tool for Brain-Computer Interfaces (BCIs). In this paper, we propose a novel Bayesian framework for discriminative feature extraction for motor imagery classification in an EEG-based BCI in which the class-discriminative frequency bands and the corresponding spatial filters are optimized by means of the probabilistic and information-theoretic approaches. In our framework, the problem of simultaneous spatiospectral filter optimization is formulated as the estimation of an unknown posterior probability density function (pdf) that represents the probability that a single-trial EEG of predefined mental tasks can be discriminated in a state. In order to estimate the posterior pdf, we propose a particle-based approximation method by extending a factored-sampling technique with a diffusion process. An information-theoretic observation model is also devised to measure discriminative power of features between classes. From the viewpoint of classifier design, the proposed method naturally allows us to construct a spectrally weighted label decision rule by linearly combining the outputs from multiple classifiers. We demonstrate the feasibility and effectiveness of the proposed method by analyzing the results and its success on three public databases.
Wen, Tingxi; Zhang, Zhongnan
2017-05-01
In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.
A travel time forecasting model based on change-point detection method
NASA Astrophysics Data System (ADS)
LI, Shupeng; GUANG, Xiaoping; QIAN, Yongsheng; ZENG, Junwei
2017-06-01
Travel time parameters obtained from road traffic sensor data play an important role in traffic management practice. In this paper, a travel time forecasting model based on change-point detection is proposed for urban road traffic sensor data. A first-order differential operation is used to preprocess the actual loop data; a change-point detection algorithm is designed to classify the sequence of a large number of travel time data items into several patterns; then a travel time forecasting model is established based on the autoregressive integrated moving average (ARIMA) model. In computer simulation, different control parameters are chosen for the adaptive change-point search over the travel time series, which is divided into several sections of similar state. Then a linear weight function is used to fit the travel time sequence and to forecast travel time. The results show that the model has high accuracy in travel time forecasting.
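The preprocessing and segmentation steps named in this abstract can be illustrated with a small sketch. The abstract does not specify the change-point algorithm, so the rule used here (flag a change point wherever the absolute first-order difference exceeds a control threshold) is an assumption chosen for simplicity; the function names and sample series are hypothetical.

```python
def first_difference(series):
    """First-order differencing: d[i] = series[i + 1] - series[i]."""
    return [b - a for a, b in zip(series, series[1:])]

def change_points(series, threshold):
    """Indices at which a new section starts, i.e. where the jump exceeds threshold."""
    diffs = first_difference(series)
    return [i + 1 for i, d in enumerate(diffs) if abs(d) > threshold]

def split_sections(series, threshold):
    """Split the series into consecutive sections of similar state."""
    cuts = [0] + change_points(series, threshold) + [len(series)]
    return [series[a:b] for a, b in zip(cuts, cuts[1:])]

# Hypothetical travel times in seconds with a congested period in the middle.
travel_times = [10, 11, 10, 30, 31, 29, 12, 11]
sections = split_sections(travel_times, threshold=5)
```

The threshold plays the role of the "control parameter" mentioned in the abstract; each resulting section could then be fitted separately by a forecasting model.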
Automatic classification of visual evoked potentials based on wavelet decomposition
NASA Astrophysics Data System (ADS)
Stasiakiewicz, Paweł; Dobrowolski, Andrzej P.; Tomczykiewicz, Kazimierz
2017-04-01
Diagnosis of the part of the visual system that is responsible for conducting the compound action potential is generally based on visual evoked potentials generated by stimulating the eye with an external light source. The condition of the patient's visual pathway is assessed by a set of parameters that describe the characteristic time-domain extremes, called waves. The decision process is complex; therefore the diagnosis depends significantly on the experience of the doctor. The authors developed a procedure, based on wavelet decomposition and linear discriminant analysis, that provides automatic classification of visual evoked potentials. The algorithm assigns an individual case to the normal or pathological class. The proposed classifier has a sensitivity of 96.4% at a 10.4% probability of false alarm in a group of 220 cases, and the area under the ROC curve equals 0.96, which, from the medical point of view, is a very good result.
Beneito-Cambra, Miriam; Herrero-Martínez, José Manuel; Simó-Alfonso, Ernesto F; Ramis-Ramos, Guillermo
2008-11-01
A method for the rapid classification of proteases, lipases, amylases and cellulases used as enhancers in cleaning products, based on precipitation with acetone, hydrolysis with HCl, dilution of the hydrolysates with ethanol, and direct infusion into the electrospray ion source of an ion-trap mass spectrometer, has been developed. The abundances of the [M+H]+ ions of the amino acids, from the hydrolysates of both the enzyme industrial concentrates and the detergent bases spiked with them, were used to construct linear discriminant analysis models capable of distinguishing between the enzyme classes. For this purpose, the variables were normalized as follows: (A) the ion abundance of each amino acid was divided by the sum of the ion abundances of all the amino acids in the corresponding mass spectrum; (B) the ratios of pairs of ion abundances were obtained by dividing the ion abundance of each amino acid by each one of the ion abundances of the other 17 amino acids in the corresponding mass spectrum. Using normalization procedure B, excellent class-resolution between proteases, lipases, amylases and cellulases was achieved. In all cases, enzymes in industrial concentrates and manufactured cleaning products were correctly classified with >98% assignment probability.
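The two normalization procedures described above translate directly into code. The sketch below is illustrative only: the function names are hypothetical, and the three-entry spectrum is toy data standing in for the amino acid ion abundances of one mass spectrum.

```python
def normalize_a(abundances):
    """Procedure A: divide each ion abundance by the sum of all abundances in the spectrum."""
    total = sum(abundances.values())
    return {aa: value / total for aa, value in abundances.items()}

def normalize_b(abundances):
    """Procedure B: ratio of each ion abundance to each of the other abundances."""
    return {
        (num, den): abundances[num] / abundances[den]
        for num in abundances
        for den in abundances
        if num != den
    }

# Toy spectrum (hypothetical values) with three amino acids.
spectrum = {"Ala": 2.0, "Gly": 4.0, "Ser": 2.0}
fractions = normalize_a(spectrum)
ratios = normalize_b(spectrum)
```

With the 18 amino acids implied by the abstract ("the other 17"), procedure B yields 18 × 17 = 306 ordered ratios per spectrum, which then serve as predictor variables for the linear discriminant analysis models.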
Nonlinear Fusion of Multispectral Citrus Fruit Image Data with Information Contents.
Li, Peilin; Lee, Sang-Heon; Hsu, Hung-Yao; Park, Jae-Sam
2017-01-13
The main issue of vision-based automatic harvesting manipulators is the difficulty in the correct fruit identification in the images under natural lighting conditions. Mostly, the solution has been based on a linear combination of color components in the multispectral images. However, the results have not reached a satisfactory level. To overcome this issue, this paper proposes a robust nonlinear fusion method to augment the original color image with the synchronized near infrared image. The two images are fused with Daubechies wavelet transform (DWT) in a multiscale decomposition approach. With DWT, the background noises are reduced and the necessary image features are enhanced by fusing the color contrast of the color components and the homogeneity of the near infrared (NIR) component. The resulting fused color image is classified with a C-means algorithm for reconstruction. The performance of the proposed approach is evaluated with the statistical F measure in comparison to some existing methods using linear combinations of color components. The results show that the fusion of information in different spectral components has the advantage of enhancing the image quality, therefore improving the classification accuracy in citrus fruit identification in natural lighting conditions.
Nonlinear Fusion of Multispectral Citrus Fruit Image Data with Information Contents
Li, Peilin; Lee, Sang-Heon; Hsu, Hung-Yao; Park, Jae-Sam
2017-01-01
The main issue of vision-based automatic harvesting manipulators is the difficulty in the correct fruit identification in the images under natural lighting conditions. Mostly, the solution has been based on a linear combination of color components in the multispectral images. However, the results have not reached a satisfactory level. To overcome this issue, this paper proposes a robust nonlinear fusion method to augment the original color image with the synchronized near infrared image. The two images are fused with Daubechies wavelet transform (DWT) in a multiscale decomposition approach. With DWT, the background noises are reduced and the necessary image features are enhanced by fusing the color contrast of the color components and the homogeneity of the near infrared (NIR) component. The resulting fused color image is classified with a C-means algorithm for reconstruction. The performance of the proposed approach is evaluated with the statistical F measure in comparison to some existing methods using linear combinations of color components. The results show that the fusion of information in different spectral components has the advantage of enhancing the image quality, therefore improving the classification accuracy in citrus fruit identification in natural lighting conditions. PMID:28098797
Bearing Fault Diagnosis Based on Statistical Locally Linear Embedding
Wang, Xiang; Zheng, Yuan; Zhao, Zhenzhou; Wang, Jinping
2015-01-01
Fault diagnosis is essentially a kind of pattern recognition. The measured signal samples usually lie on nonlinear low-dimensional manifolds embedded in the high-dimensional signal space, so implementing feature extraction and dimensionality reduction while improving recognition performance is a crucial task. This paper proposes a novel machinery fault diagnosis approach based on a statistical locally linear embedding (S-LLE) algorithm, which extends LLE by exploiting fault class label information. The approach first extracts the intrinsic manifold features from high-dimensional feature vectors, which are obtained from vibration signals by time-domain, frequency-domain and empirical mode decomposition (EMD) feature extraction, and then translates the complex mode space into a salient low-dimensional feature space using the manifold learning algorithm S-LLE, which outperforms other feature reduction methods such as PCA, LDA and LLE. Finally, pattern classification and fault diagnosis by a classifier are carried out easily and rapidly in the reduced feature space. Rolling bearing fault signals are used to validate the proposed fault diagnosis approach. The results indicate that the proposed approach clearly improves the classification performance of fault pattern recognition and outperforms the other traditional approaches. PMID:26153771
Familial associations with paratuberculosis ELISA results in Texas Longhorn cattle.
Osterstock, Jason B; Fosgate, Geoffrey T; Cohen, Noah D; Derr, James N; Manning, Elizabeth J B; Collins, Michael T; Roussel, Allen J
2008-05-25
The objective of this cross-sectional study was to estimate familial associations with paratuberculosis ELISA status in beef cattle. Texas Longhorn cattle (n=715) greater than 2 years of age were sampled for paratuberculosis testing using ELISA and fecal culture. Diagnostic test results were indicative of substantial numbers of false-positive serological reactions consistent with environmental exposure to non-MAP Mycobacterium spp. Associations between ancestors and paratuberculosis ELISA status of offspring were assessed using conditional logistic regression. The association between ELISA status of the dam and her offspring was assessed using linear mixed-effect models. Significant associations were identified between some ancestors and offspring ELISA status. The odds of being classified as "suspect" or greater based on ELISA results were 4.6 times greater for offspring of dams with similarly increased S:P ratios. A significant positive linear association was also observed between dam and offspring log-transformed S:P ratios. Results indicate that there is familial aggregation of paratuberculosis ELISA results in beef cattle and suggest that genetic selection based on paratuberculosis ELISA status may decrease seroprevalence. However, genetic selection may have minimal effect on paratuberculosis control in herds with exposure to non-MAP Mycobacterium spp.
MO-AB-BRA-10: Cancer Therapy Outcome Prediction Based On Dempster-Shafer Theory and PET Imaging
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lian, C; University of Rouen, QuantIF - EA 4108 LITIS, 76000 Rouen; Li, H
2015-06-15
Purpose: In cancer therapy, utilizing FDG-18 PET image-based features for accurate outcome prediction is challenging because of 1) limited discriminative information within a small number of PET image sets, and 2) fluctuant feature characteristics caused by the inferior spatial resolution and system noise of PET imaging. In this study, we proposed a new Dempster-Shafer theory (DST) based approach, evidential low-dimensional transformation with feature selection (ELT-FS), to accurately predict cancer therapy outcome with both PET imaging features and clinical characteristics. Methods: First, a specific loss function with sparse penalty was developed to learn an adaptive low-rank distance metric for representing the dissimilarity between different patients' feature vectors. By minimizing this loss function, a linear low-dimensional transformation of input features was achieved. Also, imprecise features were excluded simultaneously by applying a l2,1-norm regularization of the learnt dissimilarity metric in the loss function. Finally, the learnt dissimilarity metric was applied in an evidential K-nearest-neighbor (EK-NN) classifier to predict treatment outcome. Results: Twenty-five patients with stage II–III non-small-cell lung cancer and thirty-six patients with esophageal squamous cell carcinomas treated with chemo-radiotherapy were collected. For the two groups of patients, 52 and 29 features, respectively, were utilized. The leave-one-out cross-validation (LOOCV) protocol was used for evaluation. Compared to three existing linear transformation methods (PCA, LDA, NCA), the proposed ELT-FS leads to higher prediction accuracy for the training and testing sets both for lung-cancer patients (100+/−0.0, 88.0+/−33.17) and for esophageal-cancer patients (97.46+/−1.64, 83.33+/−37.8). The ELT-FS also provides superior class separation in both test data sets.
Conclusion: A novel DST-based approach has been proposed to predict cancer treatment outcome using PET image features and clinical characteristics. A specific loss function has been designed for robust accommodation of feature set incertitude and imprecision, facilitating adaptive learning of the dissimilarity metric for the EK-NN classifier.
Correlation coefficient based supervised locally linear embedding for pulmonary nodule recognition.
Wu, Panpan; Xia, Kewen; Yu, Hengyong
2016-11-01
Dimensionality reduction techniques are developed to suppress the negative effects of high dimensional feature space of lung CT images on classification performance in computer aided detection (CAD) systems for pulmonary nodule detection. An improved supervised locally linear embedding (SLLE) algorithm is proposed based on the concept of correlation coefficient. The Spearman's rank correlation coefficient is introduced to adjust the distance metric in the SLLE algorithm to ensure that more suitable neighborhood points can be identified, and thus to enhance the discriminating power of embedded data. The proposed Spearman's rank correlation coefficient based SLLE (SC(2)SLLE) is implemented and validated in our pilot CAD system using a clinical dataset collected from the publicly available lung image database consortium and image database resource initiative (LIDC-IDRI). In particular, a representative CAD system for solitary pulmonary nodule detection is designed and implemented. After a sequence of medical image processing steps, 64 nodules and 140 non-nodules are extracted, and 34 representative features are calculated. The SC(2)SLLE, as well as the SLLE and LLE algorithms, are applied to reduce the dimensionality. Several quantitative measurements are also used to evaluate and compare the performances. Using a 5-fold cross-validation methodology, the proposed algorithm achieves 87.65% accuracy, 79.23% sensitivity, 91.43% specificity, and 8.57% false positive rate, on average. Experimental results indicate that the proposed algorithm outperforms the original locally linear embedding and SLLE coupled with the support vector machine (SVM) classifier. Based on the preliminary results from a limited number of nodules in our dataset, this study demonstrates the great potential to improve the performance of a CAD system for nodule detection using the proposed SC(2)SLLE. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
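Spearman's rank correlation coefficient, the quantity this abstract uses to adjust the neighborhood distance, can be computed as the Pearson correlation of average ranks. The sketch below shows only that computation; how the coefficient enters the SLLE distance metric is not specified in the abstract and is not modeled here. All function names are hypothetical.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    norm_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (norm_x * norm_y)

def spearman(x, y):
    """Spearman's rank correlation between two feature vectors of equal length."""
    return pearson(average_ranks(x), average_ranks(y))

rho_up = spearman([1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0])
rho_down = spearman([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0])
```

Because it depends only on ranks, the coefficient is insensitive to monotone rescaling of individual features, which is one plausible reason to prefer it over a raw Euclidean distance when choosing neighbors.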
Jamieson, Andrew R; Giger, Maryellen L; Drukker, Karen; Li, Hui; Yuan, Yading; Bhooshan, Neha
2010-01-01
In this preliminary study, recently developed unsupervised nonlinear dimension reduction (DR) and data representation techniques were applied to computer-extracted breast lesion feature spaces across three separate imaging modalities: Ultrasound (U.S.) with 1126 cases, dynamic contrast enhanced magnetic resonance imaging with 356 cases, and full-field digital mammography with 245 cases. Two methods for nonlinear DR were explored: Laplacian eigenmaps [M. Belkin and P. Niyogi, "Laplacian eigenmaps for dimensionality reduction and data representation," Neural Comput. 15, 1373-1396 (2003)] and t-distributed stochastic neighbor embedding (t-SNE) [L. van der Maaten and G. Hinton, "Visualizing data using t-SNE," J. Mach. Learn. Res. 9, 2579-2605 (2008)]. These methods attempt to map originally high dimensional feature spaces to more human interpretable lower dimensional spaces while preserving both local and global information. The properties of these methods as applied to breast computer-aided diagnosis (CADx) were evaluated in the context of malignancy classification performance as well as in the visual inspection of the sparseness within the two-dimensional and three-dimensional mappings. Classification performance was estimated by using the reduced dimension mapped feature output as input into both linear and nonlinear classifiers: Markov chain Monte Carlo based Bayesian artificial neural network (MCMC-BANN) and linear discriminant analysis. The new techniques were compared to previously developed breast CADx methodologies, including automatic relevance determination and linear stepwise (LSW) feature selection, as well as a linear DR method based on principal component analysis. Using ROC analysis and 0.632+ bootstrap validation, 95% empirical confidence intervals were computed for each classifier's AUC performance. In the large U.S. data set, sample high performance results include AUC0.632+ = 0.88 with 95% empirical bootstrap interval [0.787;0.895] for 13 ARD selected features and AUC0.632+ = 0.87 with interval [0.817;0.906] for four LSW selected features compared to 4D t-SNE mapping (from the original 81D feature space) giving AUC0.632+ = 0.90 with interval [0.847;0.919], all using the MCMC-BANN. Preliminary results appear to indicate capability for the new methods to match or exceed classification performance of current advanced breast lesion CADx algorithms. While not appropriate as a complete replacement of feature selection in CADx problems, DR techniques offer a complementary approach, which can aid elucidation of additional properties associated with the data. Specifically, the new techniques were shown to possess the added benefit of delivering sparse lower dimensional representations for visual interpretation, revealing intricate data structure of the feature space.
Classification of Self-Driven Mental Tasks from Whole-Brain Activity Patterns
Nawa, Norberto Eiji; Ando, Hiroshi
2014-01-01
During wakefulness, a constant and continuous stream of complex stimuli and self-driven thoughts permeate the human mind. Here, eleven participants were asked to count down numbers and remember negative or positive autobiographical episodes of their personal lives, for 32 seconds at a time, during which they could freely engage in the execution of those tasks. We then examined the possibility of determining from a single whole-brain functional magnetic resonance imaging scan which one of the two mental tasks each participant was performing at a given point in time. Linear support-vector machines were used to build within-participant classifiers and across-participants classifiers. The within-participant classifiers could correctly discriminate scans with an average accuracy as high as 82%, when using data from all individual voxels in the brain. These results demonstrate that it is possible to accurately classify self-driven mental tasks from whole-brain activity patterns recorded in a time interval as short as 2 seconds. PMID:24824899
Hierarchical ensemble of global and local classifiers for face recognition.
Su, Yu; Shan, Shiguang; Chen, Xilin; Gao, Wen
2009-08-01
In the literature of psychophysics and neurophysiology, many studies have shown that both global and local features are crucial for face representation and recognition. This paper proposes a novel face recognition method which exploits both global and local discriminative features. In this method, global features are extracted from the whole face images by keeping the low-frequency coefficients of Fourier transform, which we believe encodes the holistic facial information, such as facial contour. For local feature extraction, Gabor wavelets are exploited considering their biological relevance. After that, Fisher's linear discriminant (FLD) is separately applied to the global Fourier features and each local patch of Gabor features. Thus, multiple FLD classifiers are obtained, each embodying different facial evidences for face recognition. Finally, all these classifiers are combined to form a hierarchical ensemble classifier. We evaluate the proposed method using two large-scale face databases: FERET and FRGC version 2.0. Experiments show that the results of our method are impressively better than the best known results with the same evaluation protocol.
Joint deconvolution and classification with applications to passive acoustic underwater multipath.
Anderson, Hyrum S; Gupta, Maya R
2008-11-01
This paper addresses the problem of classifying signals that have been corrupted by noise and unknown linear time-invariant (LTI) filtering such as multipath, given labeled uncorrupted training signals. A maximum a posteriori approach to the deconvolution and classification is considered, which produces estimates of the desired signal, the unknown channel, and the class label. For cases in which only a class label is needed, the classification accuracy can be improved by not committing to an estimate of the channel or signal. A variant of the quadratic discriminant analysis (QDA) classifier is proposed that probabilistically accounts for the unknown LTI filtering, and which avoids deconvolution. The proposed QDA classifier can work either directly on the signal or on features whose transformation by LTI filtering can be analyzed; as an example a classifier for subband-power features is derived. Results on simulated data and real Bowhead whale vocalizations show that jointly considering deconvolution with classification can dramatically improve classification performance over traditional methods over a range of signal-to-noise ratios.
An ensemble of SVM classifiers based on gene pairs.
Tong, Muchenxuan; Liu, Kun-Hong; Xu, Chungui; Ju, Wenbin
2013-07-01
In this paper, a genetic algorithm (GA) based ensemble support vector machine (SVM) classifier built on gene pairs (GA-ESP) is proposed. The SVMs (base classifiers of the ensemble system) are trained on different informative gene pairs. These gene pairs are selected by the top scoring pair (TSP) criterion. Each of these pairs projects the original microarray expression onto a 2-D space. Extensive permutation of gene pairs may reveal more useful information and potentially lead to an ensemble classifier with satisfactory accuracy and interpretability. GA is further applied to select an optimized combination of base classifiers. The effectiveness of the GA-ESP classifier is evaluated on both binary-class and multi-class datasets. Copyright © 2013 Elsevier Ltd. All rights reserved.
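The top scoring pair (TSP) criterion used here to pick gene pairs can be sketched for the binary case. For a pair of genes (i, j), the score is the absolute difference between the two classes in the frequency with which gene i is expressed below gene j. The data layout (a dict of per-gene expression lists), the 0/1 class labels, and the function names are assumptions made for illustration; the GA-based selection of base classifiers is not shown.

```python
def tsp_score(expr_i, expr_j, labels):
    """|P(X_i < X_j | class 0) - P(X_i < X_j | class 1)| for one gene pair."""
    freq = {}
    for cls in (0, 1):
        members = [k for k, y in enumerate(labels) if y == cls]
        freq[cls] = sum(expr_i[k] < expr_j[k] for k in members) / len(members)
    return abs(freq[0] - freq[1])

def top_pairs(expression, labels, n_pairs):
    """Score every unordered gene pair and return the n_pairs highest as (score, a, b)."""
    genes = sorted(expression)
    scored = [
        (tsp_score(expression[a], expression[b], labels), a, b)
        for pos, a in enumerate(genes)
        for b in genes[pos + 1:]
    ]
    scored.sort(key=lambda item: -item[0])  # stable sort keeps alphabetical order on ties
    return scored[:n_pairs]

# Toy expression matrix (hypothetical): three genes, four samples, two per class.
expression = {
    "g1": [1.0, 1.0, 5.0, 5.0],
    "g2": [2.0, 2.0, 1.0, 1.0],
    "g3": [3.0, 0.0, 3.0, 0.0],
}
labels = [0, 0, 1, 1]
best = top_pairs(expression, labels, n_pairs=1)
```

Each selected pair defines the 2-D space mentioned in the abstract, on which one base SVM of the ensemble would then be trained.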
Catanzaro, Daniele; Schäffer, Alejandro A.; Schwartz, Russell
2016-01-01
Ductal Carcinoma In Situ (DCIS) is a precursor lesion of Invasive Ductal Carcinoma (IDC) of the breast. Investigating its temporal progression could provide fundamental new insights for the development of better diagnostic tools to predict which cases of DCIS will progress to IDC. We investigate the problem of reconstructing a plausible progression from single-cell sampled data of an individual with Synchronous DCIS and IDC. Specifically, by using a number of assumptions derived from the observation of cellular atypia occurring in IDC, we design a possible predictive model using integer linear programming (ILP). Computational experiments carried out on a preexisting data set of 13 patients with simultaneous DCIS and IDC show that the corresponding predicted progression models are classifiable into categories having specific evolutionary characteristics. The approach provides new insights into mechanisms of clonal progression in breast cancers and helps illustrate the power of the ILP approach for similar problems in reconstructing tumor evolution scenarios under complex sets of constraints. PMID:26353381
Catanzaro, Daniele; Shackney, Stanley E; Schaffer, Alejandro A; Schwartz, Russell
2016-01-01
Ductal Carcinoma In Situ (DCIS) is a precursor lesion of Invasive Ductal Carcinoma (IDC) of the breast. Investigating its temporal progression could provide fundamental new insights for the development of better diagnostic tools to predict which cases of DCIS will progress to IDC. We investigate the problem of reconstructing a plausible progression from single-cell sampled data of an individual with synchronous DCIS and IDC. Specifically, by using a number of assumptions derived from the observation of cellular atypia occurring in IDC, we design a possible predictive model using integer linear programming (ILP). Computational experiments carried out on a preexisting data set of 13 patients with simultaneous DCIS and IDC show that the corresponding predicted progression models are classifiable into categories having specific evolutionary characteristics. The approach provides new insights into mechanisms of clonal progression in breast cancers and helps illustrate the power of the ILP approach for similar problems in reconstructing tumor evolution scenarios under complex sets of constraints.
Soft computing-based terrain visual sensing and data fusion for unmanned ground robotic systems
NASA Astrophysics Data System (ADS)
Shirkhodaie, Amir
2006-05-01
In this paper, we have primarily discussed technical challenges and navigational skill requirements of mobile robots for traversability path planning in natural terrain environments similar to Mars surface terrains. We have described different methods for detection of salient terrain features based on imaging texture analysis techniques. We have also presented three competing techniques for terrain traversability assessment of mobile robots navigating in unstructured natural terrain environments. These three techniques include: a rule-based terrain classifier, a neural network-based terrain classifier, and a fuzzy-logic terrain classifier. Each proposed terrain classifier divides a region of natural terrain into finite sub-terrain regions and classifies terrain condition exclusively within each sub-terrain region based on terrain visual clues. The Kalman Filtering technique is applied for aggregative fusion of sub-terrain assessment results. The last two terrain classifiers are shown to have remarkable capability for terrain traversability assessment of natural terrains. We have conducted a comparative performance evaluation of all three terrain classifiers and presented the results in this paper.
Recognition of pornographic web pages by classifying texts and images.
Hu, Weiming; Wu, Ou; Chen, Zhouyao; Fu, Zhouyu; Maybank, Steve
2007-06-01
With the rapid development of the World Wide Web, people benefit more and more from the sharing of information. However, Web pages with obscene, harmful, or illegal content can be easily accessed. It is important to recognize such unsuitable, offensive, or pornographic Web pages. In this paper, a novel framework for recognizing pornographic Web pages is described. A C4.5 decision tree is used to divide Web pages, according to content representations, into continuous text pages, discrete text pages, and image pages. These three categories of Web pages are handled, respectively, by a continuous text classifier, a discrete text classifier, and an algorithm that fuses the results from the image classifier and the discrete text classifier. In the continuous text classifier, statistical and semantic features are used to recognize pornographic texts. In the discrete text classifier, the naive Bayes rule is used to calculate the probability that a discrete text is pornographic. In the image classifier, the object's contour-based features are extracted to recognize pornographic images. In the text and image fusion algorithm, the Bayes theory is used to combine the recognition results from images and texts. Experimental results demonstrate that the continuous text classifier outperforms the traditional keyword-statistics-based classifier, the contour-based image classifier outperforms the traditional skin-region-based image classifier, the results obtained by our fusion algorithm outperform those by either of the individual classifiers, and our framework can be adapted to different categories of Web pages.
Raman spectroscopy for highly accurate estimation of the age of refrigerated porcine muscle
NASA Astrophysics Data System (ADS)
Timinis, Constantinos; Pitris, Costas
2016-03-01
The high water content of meat, combined with all the nutrients it contains, makes it vulnerable to spoilage at all stages of production and storage even when refrigerated at 5 °C. A non-destructive and in situ tool for meat sample testing, which could provide an accurate indication of the storage time of meat, would be very useful for the control of meat quality as well as for consumer safety. The proposed solution is based on Raman spectroscopy which is non-invasive and can be applied in situ. For the purposes of this project, 42 meat samples from 14 animals were obtained and three Raman spectra per sample were collected every two days for two weeks. The spectra were subsequently processed and the sample age was calculated using a set of linear differential equations. In addition, the samples were classified in categories corresponding to the age in 2-day steps (i.e., 0, 2, 4, 6, 8, 10, 12 or 14 days old), using linear discriminant analysis and cross-validation. Contrary to other studies, where the samples were simply grouped into two categories (higher or lower quality, suitable or unsuitable for human consumption, etc.), in this study, the age was predicted with a mean error of ~1 day (20%) or classified, in 2-day steps, with 100% accuracy. Although Raman spectroscopy has been used in the past for the analysis of meat samples, the proposed methodology predicts the sample age far more accurately than any report in the literature.
Battistella, G; Fuertinger, S; Fleysher, L; Ozelius, L J; Simonyan, K
2016-10-01
Spasmodic dysphonia (SD), or laryngeal dystonia, is a task-specific isolated focal dystonia of unknown causes and pathophysiology. Although functional and structural abnormalities have been described in this disorder, the influence of its different clinical phenotypes and genotypes remains scant, making it difficult to explain SD pathophysiology and to identify potential biomarkers. We used a combination of independent component analysis and linear discriminant analysis of resting-state functional magnetic resonance imaging data to investigate brain organization in different SD phenotypes (abductor versus adductor type) and putative genotypes (familial versus sporadic cases) and to characterize neural markers for genotype/phenotype categorization. We found abnormal functional connectivity within sensorimotor and frontoparietal networks in patients with SD compared with healthy individuals as well as phenotype- and genotype-distinct alterations of these networks, involving primary somatosensory, premotor and parietal cortices. The linear discriminant analysis achieved 71% accuracy classifying SD and healthy individuals using connectivity measures in the left inferior parietal and sensorimotor cortices. When categorizing between different forms of SD, the combination of measures from the left inferior parietal, premotor and right sensorimotor cortices achieved 81% discriminatory power between familial and sporadic SD cases, whereas the combination of measures from the right superior parietal, primary somatosensory and premotor cortices led to 71% accuracy in the classification of adductor and abductor SD forms. Our findings present the first effort to identify and categorize isolated focal dystonia based on its brain functional connectivity profile, which may have a potential impact on the future development of biomarkers for this rare disorder. © 2016 EAN.
Battistella, Giovanni; Fuertinger, Stefan; Fleysher, Lazar; Ozelius, Laurie J.; Simonyan, Kristina
2017-01-01
Background Spasmodic dysphonia (SD), or laryngeal dystonia, is a task-specific isolated focal dystonia of unknown causes and pathophysiology. Although functional and structural abnormalities have been described in this disorder, the influence of its different clinical phenotypes and genotypes remains scant, making it difficult to explain SD pathophysiology and to identify potential biomarkers. Methods We used a combination of independent component analysis and linear discriminant analysis of resting-state functional MRI data to investigate brain organization in different SD phenotypes (abductor vs. adductor type) and putative genotypes (familial vs. sporadic cases) and to characterize neural markers for genotype/phenotype categorization. Results We found abnormal functional connectivity within sensorimotor and frontoparietal networks in SD patients compared to healthy individuals as well as phenotype- and genotype-distinct alterations of these networks, involving primary somatosensory, premotor and parietal cortices. The linear discriminant analysis achieved 71% accuracy classifying SD and healthy individuals using connectivity measures in the left inferior parietal and sensorimotor cortex. When categorizing between different forms of SD, the combination of measures from left inferior parietal, premotor and right sensorimotor cortices achieved 81% discriminatory power between familial and sporadic SD cases, whereas the combination of measures from the right superior parietal, primary somatosensory and premotor cortices led to 71% accuracy in the classification of adductor and abductor SD forms. Conclusions Our findings present the first effort to identify and categorize isolated focal dystonia based on its brain functional connectivity profile, which may have a potential impact on the future development of biomarkers for this rare disorder. PMID:27346568
Sandberg, David E; Vena, John E; Weiner, John; Beehler, Gregory P; Swanson, Mya; Meyer-Bahlburg, Heino F L
2003-03-01
Early sex hormone exposure contributes to gender-dimorphic behavioral development in mammals, including humans. Environmental toxicants concentrated in contaminated sport fish can interfere with the actions of sex steroids. This study developed an outcome variable by combining gender-dimorphic behaviors that differentiates boys and girls. Offspring of participants in the New York State Angler Cohort Study (NYSACS) were targeted in a parent-report postal survey. Instruments were selected based on findings of gender differences in the general population. A linear discriminant function model incorporating three gender behavior scales correctly classified the sex of 97.7% of children (252 boys and 234 girls) from a random NYSACS sample. The discriminant function was cross-validated by correctly classifying the sex of 98.4% of children (457 boys and 425 girls) from the remaining NYSACS cases and 97.6% of children (154 boys and 142 girls) from an independent school sample. Within-sex stepwise multiple regression analyses revealed that masculine behavior increased among boys with age and with the number of years of maternal sport fish consumption. In girls, older age and previous live-born siblings were associated with more masculine behavior, whereas feminine behavior increased with the duration of breast feeding. These associations were replicated in an independent sample. A linear discriminant function effectively transformed the binary classification of sex (male-female) to a bipolar continuum of gender (masculinity-femininity). Findings from this study are consistent with the hypothesis that environmental contaminants contribute to shifts in gender-role behavior. Future investigations will need to account for competing explanations of this effect.
MORFOMETRYKA—A NEW WAY OF ESTABLISHING MORPHOLOGICAL CLASSIFICATION OF GALAXIES
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ferrari, F.; Carvalho, R. R. de; Trevisan, M., E-mail: fabricio@ferrari.pro.br
We present an extended morphometric system to automatically classify galaxies from astronomical images. The new system includes the original and modified versions of the CASGM coefficients (Concentration C_1, Asymmetry A_3, and Smoothness S_3), and the new parameters entropy, H, and spirality σ_ψ. The new parameters A_3, S_3, and H discriminate galaxy classes better than A_1, S_1, and G, respectively. The new parameter σ_ψ captures the amount of non-radial pattern in the image and is almost linearly dependent on T-type. Using a sample of spiral and elliptical galaxies from the Galaxy Zoo project as a training set, we employed the Linear Discriminant Analysis (LDA) technique to classify EFIGI (Baillard et al. 4458 galaxies), Nair and Abraham (14,123 galaxies), and SDSS Legacy (779,235 galaxies) samples. The cross-validation test shows that we can achieve an accuracy of more than 90% with our classification scheme. Therefore, we are able to define a plane in the morphometric parameter space that separates the elliptical and spiral classes with a mismatch between classes smaller than 10%. We use the distance to this plane as a morphometric index (M_i) and we show that it follows the human based T-type index very closely. We calculate morphometric index M_i for ∼780k galaxies from SDSS Legacy Survey–DR7. We discuss how M_i correlates with stellar population parameters obtained using the spectra available from SDSS–DR7.
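The idea of using the distance to a separating plane as an index can be sketched as follows. This is only an illustration of the geometry: the weight vector `w` and offset `b` are hypothetical values standing in for a fitted linear discriminant, and the abstract does not say how the authors normalise or sign their index.

```python
import math

def signed_distance(x, w, b):
    """Signed Euclidean distance from point x to the plane w.x + b = 0."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm

# Hypothetical plane in a two-parameter space.
w = [3.0, 4.0]
b = -5.0

d_far = signed_distance([3.0, 4.0], w, b)     # point on the positive side
d_origin = signed_distance([0.0, 0.0], w, b)  # point on the negative side
```

The sign of the distance gives the predicted class, and its magnitude gives a continuous score that can be compared with an ordinal label such as T-type.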
Computer-aided diagnosis of prostate cancer in the peripheral zone using multiparametric MRI
NASA Astrophysics Data System (ADS)
Niaf, Emilie; Rouvière, Olivier; Mège-Lechevallier, Florence; Bratan, Flavie; Lartizien, Carole
2012-06-01
This study evaluated a computer-assisted diagnosis (CADx) system for determining a likelihood measure of prostate cancer presence in the peripheral zone (PZ) based on multiparametric magnetic resonance (MR) imaging, including T2-weighted, diffusion-weighted and dynamic contrast-enhanced MRI at 1.5 T. Based on a feature set derived from grey-level images, including first-order statistics, Haralick features, gradient features, semi-quantitative and quantitative (pharmacokinetic modelling) dynamic parameters, four kinds of classifiers were trained and compared: nonlinear support vector machine (SVM), linear discriminant analysis, k-nearest neighbours and naïve Bayes classifiers. A set of feature selection methods based on t-test, mutual information and minimum-redundancy-maximum-relevancy criteria were also compared. The aim was to discriminate between the relevant features as well as to create an efficient classifier using these features. The diagnostic performances of these different CADx schemes were evaluated based on a receiver operating characteristic (ROC) curve analysis. The evaluation database consisted of 30 sets of multiparametric MR images acquired from radical prostatectomy patients. Using histologic sections as the gold standard, both cancer and nonmalignant (but suspicious) tissues were annotated in consensus on all MR images by two radiologists, a histopathologist and a researcher. Benign tissue regions of interest (ROIs) were also delineated in the remaining prostate PZ. This resulted in a series of 42 cancer ROIs, 49 benign but suspicious ROIs and 124 nonsuspicious benign ROIs. From the outputs of all evaluated feature selection methods on the test bench, a restrictive set of about 15 highly informative features coming from all MR sequences was discriminated, thus confirming the validity of the multiparametric approach.
Quantitative evaluation of the diagnostic performance yielded a maximal area under the ROC curve (AUC) of 0.89 (0.81-0.94) for the discrimination of the malignant versus nonmalignant tissues and 0.82 (0.73-0.90) for the discrimination of the malignant versus suspicious tissues when combining the t-test feature selection approach with a SVM classifier. A preliminary comparison showed that the optimal CADx scheme mimicked, in terms of AUC, the human experts in differentiating malignant from suspicious tissues, thus demonstrating its potential for assisting cancer identification in the PZ.
Optical classification for quality and defect analysis of train brakes
NASA Astrophysics Data System (ADS)
Glock, Stefan; Hausmann, Stefan; Gerke, Sebastian; Warok, Alexander; Spiess, Peter; Witte, Stefan; Lohweg, Volker
2009-06-01
In this paper we present an optical measurement system approach for quality analysis of brakes which are used in high-speed trains. The brakes consist of the so-called brake discs and pads. In a deceleration process the discs will be heated up to 500°C. The quality measure is based on the fact that the heated brake discs should not generate hot spots inside the brake material. Instead, the brake disc should be heated homogeneously by the deceleration. Therefore, it makes sense to analyze the number of hot spots and their relative gradients to create a quality measure for train brakes. In this contribution we present a new approach for a quality measurement system which is based on an image analysis and classification of infra-red based heat images. Brake images which are represented in pseudo-color are first transformed into a linear grayscale space via a hue-saturation-intensity (HSI) space. This transform is necessary for the subsequent gradient analysis, which is based on gray scale gradient filters. Furthermore, different features based on Haralick's measures are generated from the gray scale and gradient images. A subsequent Fuzzy-Pattern-Classifier is used for the classification of good and bad brakes. It has to be pointed out that the classifier returns a score value for each brake which is between 0 and 100% good quality. This guarantees that not only can good and bad brakes be distinguished, but their quality can also be labeled. The results show that all critical thermal patterns of train brakes can be sensed and verified.
Classification of vegetation types in military region
NASA Astrophysics Data System (ADS)
Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose
2015-10-01
In the decision-making process regarding planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in a hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of the present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic selector of features. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logistic Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation density and identification of the main components of the vegetation. It was found that the best classifier in the first approach is Sparse Multinomial Logistic Regression (SMLR). In the second approach, the implemented methodology applied to high resolution images showed that the best performance was achieved by the KNN classifier and PCLDC. Comparing the two approaches reveals a multiscale issue: for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.
Brain tumor image segmentation using kernel dictionary learning.
Jeon Lee; Seung-Jun Kim; Rong Chen; Herskovits, Edward H
2015-08-01
Automated brain tumor image segmentation with high accuracy and reproducibility holds a big potential to enhance the current clinical practice. Dictionary learning (DL) techniques have been applied successfully to various image processing tasks recently. In this work, kernel extensions of the DL approach are adopted. Both reconstructive and discriminative versions of the kernel DL technique are considered, which can efficiently incorporate multi-modal nonlinear feature mappings based on the kernel trick. Our novel discriminative kernel DL formulation allows joint learning of a task-driven kernel-based dictionary and a linear classifier using a K-SVD-type algorithm. The proposed approaches were tested using real brain magnetic resonance (MR) images of patients with high-grade glioma. The obtained preliminary performances are competitive with the state of the art. The discriminative kernel DL approach is seen to reduce computational burden without much sacrifice in performance.
Hybrid Feature Extraction-based Approach for Facial Parts Representation and Recognition
NASA Astrophysics Data System (ADS)
Rouabhia, C.; Tebbikh, H.
2008-06-01
Face recognition is a specialized image processing task which has attracted considerable attention in computer vision. In this article, we develop a new facial recognition system, working from video sequence images, dedicated to identifying persons whose face is partly occluded. This system is based on a hybrid image feature extraction technique called ACPDL2D (Rouabhia et al. 2007), which combines two-dimensional principal component analysis and two-dimensional linear discriminant analysis with a neural network. We performed the feature extraction task on the eyes and the nose images separately; a Multi-Layer Perceptron classifier is then used. Compared to the whole face, the simulation results favor the facial parts in terms of memory capacity and recognition (99.41% for the eyes part, 98.16% for the nose part and 97.25% for the whole face).
Classification of burst and suppression in the neonatal electroencephalogram
NASA Astrophysics Data System (ADS)
Löfhede, J.; Löfgren, N.; Thordstein, M.; Flisberg, A.; Kjellmer, I.; Lindecrantz, K.
2008-12-01
Fisher's linear discriminant (FLD), a feed-forward artificial neural network (ANN) and a support vector machine (SVM) were compared with respect to their ability to distinguish bursts from suppressions in electroencephalograms (EEG) displaying a burst-suppression pattern. Five features extracted from the EEG were used as inputs. The study was based on EEG signals from six full-term infants who had suffered from perinatal asphyxia, and the methods have been trained with reference data classified by an experienced electroencephalographer. The results are summarized as the area under the curve (AUC), derived from receiver operating characteristic (ROC) curves for the three methods. Based on this, the SVM performs slightly better than the others. Testing the three methods with combinations of increasing numbers of the five features shows that the SVM handles the increasing amount of information better than the other methods.
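The area under the ROC curve used to compare the three methods can be computed directly from classifier scores. The sketch below uses the standard rank-based (Mann-Whitney) formulation, in which the AUC is the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The score lists are made-up numbers, not data from the study.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier outputs for "burst" (positive) and "suppression" (negative).
burst_scores = [0.9, 0.8, 0.4]
suppression_scores = [0.7, 0.3, 0.4]
area = auc(burst_scores, suppression_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the measure is convenient for comparing classifiers that output scores on different scales.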
Estimating the mutual information of an EEG-based Brain-Computer Interface.
Schlögl, A; Neuper, C; Pfurtscheller, G
2002-01-01
An EEG-based Brain-Computer Interface (BCI) could be used as an additional communication channel between human thoughts and the environment. The efficacy of such a BCI depends mainly on the transmitted information rate. Shannon's communication theory was used to quantify the information rate of BCI data. For this purpose, experimental EEG data from four BCI experiments was analyzed off-line. Subjects imagined left and right hand movements during EEG recording from the sensorimotor area. Adaptive autoregressive (AAR) parameters were used as features of single trial EEG and classified with linear discriminant analysis. The intra-trial variation as well as the inter-trial variability, the signal-to-noise ratio, the entropy of information, and the information rate were estimated. The entropy difference was used as a measure of the separability of two classes of EEG patterns.
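The link between signal-to-noise ratio and information can be illustrated with Shannon's expression for a Gaussian channel. This is an assumption made for illustration: the abstract states that Shannon's theory was applied but does not give the exact formula, so the sketch below shows the textbook relation rather than the authors' estimator.

```python
import math

def information_bits(snr):
    """Shannon information per sample, in bits, for a Gaussian channel with the given SNR."""
    if snr < 0:
        raise ValueError("SNR must be non-negative")
    return 0.5 * math.log2(1.0 + snr)

bits_at_snr_3 = information_bits(3.0)   # 0.5 * log2(4)
bits_at_snr_0 = information_bits(0.0)   # no signal, no information
```

Under this relation the information grows only logarithmically with SNR, so doubling the separability of the two EEG classes does not double the transmitted information.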
Residential water demand model under block rate pricing: A case study of Beijing, China
NASA Astrophysics Data System (ADS)
Chen, H.; Yang, Z. F.
2009-05-01
In many cities, the mismatch between water supply and water demand has become a critical problem because of worsening water shortages and increasing water demand. A uniform price for residential water cannot promote efficient water allocation. In China, block water pricing will be put into practice in the future, but the outcome of such a regulatory measure is unpredictable without theoretical support. In this paper, residential water is classified by the volume of water usage based on economic rules, and the water in each block is treated as a different kind of good. A model based on the extended linear expenditure system (ELES) is constructed to simulate the relationship between block water price and water demand, which provides theoretical support for decision-makers. Finally, the proposed model is used to simulate residential water demand under block rate pricing in Beijing.
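For orientation, the extended linear expenditure system is usually written in the following textbook form. The symbols are the conventional ones and are assumptions here, since the abstract does not give the authors' notation or how the water blocks enter as separate goods.

```latex
% Textbook ELES demand equation (notation assumed, not taken from the paper):
%   p_i      price of good i (here, one block of residential water)
%   q_i      quantity of good i consumed
%   \gamma_i subsistence (basic) quantity of good i
%   \beta_i  marginal propensity to spend on good i out of income
%   Y        disposable income
\[
  p_i q_i \;=\; p_i \gamma_i \;+\; \beta_i \Bigl( Y - \sum_{j} p_j \gamma_j \Bigr),
  \qquad 0 \le \beta_i, \quad \sum_{i} \beta_i \le 1 .
\]
```

Expenditure on each good is thus a basic-needs term plus a fixed share of the income left after all basic needs are paid for, which makes demand for each block linear in prices and income.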
Pica, Alessia; Moeckli, Raphael; Balmer, Aubin; Beck-Popovic, Maja; Chollet-Rivier, Madeleine; Do, Huu-Phuoc; Weber, Damien C; Munier, Francis L
2011-12-01
To determine the local control and complication rates for children with papillary and/or macular retinoblastoma progressing after chemotherapy and undergoing stereotactic radiotherapy (SRT) with a micromultileaf collimator. Between 2004 and 2008, 11 children (15 eyes) with macular and/or papillary retinoblastoma were treated with SRT. The mean age was 19 months (range, 2-111). Of the 15 eyes, 7, 6, and 2 were classified as International Classification of Intraocular Retinoblastoma Group B, C, and E, respectively. The delivered dose of SRT was 50.4 Gy in 28 fractions using a dedicated micromultileaf collimator linear accelerator. The median follow-up was 20 months (range, 13-39). Local control was achieved in 13 eyes (87%). The actuarial 1- and 2-year local control rates were both 82%. SRT was well tolerated. Late adverse events were reported in 4 patients. Of the 4 patients, 2 had developed focal microangiopathy 20 months after SRT; 1 had developed a transient recurrence of retinal detachment; and 1 had developed bilateral cataracts. No optic neuropathy was observed. Linear accelerator-based SRT for papillary and/or macular retinoblastoma in children resulted in excellent tumor control rates with acceptable toxicity. Additional research regarding SRT and its intrinsic organ-at-risk sparing capability is justified in the framework of prospective trials. Copyright © 2011 Elsevier Inc. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pica, Alessia, E-mail: Alessia.Pica@chuv.ch; Moeckli, Raphael; Balmer, Aubin
2011-12-01
Purpose: To determine the local control and complication rates for children with papillary and/or macular retinoblastoma progressing after chemotherapy and undergoing stereotactic radiotherapy (SRT) with a micromultileaf collimator. Methods and Materials: Between 2004 and 2008, 11 children (15 eyes) with macular and/or papillary retinoblastoma were treated with SRT. The mean age was 19 months (range, 2-111). Of the 15 eyes, 7, 6, and 2 were classified as International Classification of Intraocular Retinoblastoma Group B, C, and E, respectively. The delivered dose of SRT was 50.4 Gy in 28 fractions using a dedicated micromultileaf collimator linear accelerator. Results: The median follow-up was 20 months (range, 13-39). Local control was achieved in 13 eyes (87%). The actuarial 1- and 2-year local control rates were both 82%. SRT was well tolerated. Late adverse events were reported in 4 patients. Of the 4 patients, 2 had developed focal microangiopathy 20 months after SRT; 1 had developed a transient recurrence of retinal detachment; and 1 had developed bilateral cataracts. No optic neuropathy was observed. Conclusions: Linear accelerator-based SRT for papillary and/or macular retinoblastoma in children resulted in excellent tumor control rates with acceptable toxicity. Additional research regarding SRT and its intrinsic organ-at-risk sparing capability is justified in the framework of prospective trials.
Power, Sarah D; Kushki, Azadeh; Chau, Tom
2012-01-01
Near-infrared spectroscopy (NIRS) has been recently investigated for use in noninvasive brain-computer interface (BCI) technologies. Previous studies have demonstrated the ability to classify patterns of neural activation associated with different mental tasks (e.g., mental arithmetic) using NIRS signals. Though these studies represent an important step towards the realization of an NIRS-BCI, there is a paucity of literature regarding the consistency of these responses, and the ability to classify them on a single-trial basis, over multiple sessions. This is important when moving out of an experimental context toward a practical system, where performance must be maintained over longer periods. When considering response consistency across sessions, two questions arise: 1) can the hemodynamic response to the activation task be distinguished from a baseline (or other task) condition, consistently across sessions, and if so, 2) are the spatiotemporal characteristics of the response which best distinguish it from the baseline (or other task) condition consistent across sessions. The answers will have implications for the viability of an NIRS-BCI system, and the design strategies (especially in terms of classifier training protocols) adopted. In this study, we investigated the consistency of classification of a mental arithmetic task and a no-control condition over five experimental sessions. Mixed model linear regression on intrasession classification accuracies indicate that the task and baseline states remain differentiable across multiple sessions, with no significant decrease in accuracy (p = 0.67). Intersession analysis, however, revealed inconsistencies in spatiotemporal response characteristics. Based on these results, we investigated several different practical classifier training protocols, including scenarios in which the training and test data come from 1) different sessions, 2) the same session, and 3) a combination of both. 
Results indicate that when selecting optimal classifier training protocols for NIRS-BCI, a compromise between accuracy and convenience (e.g., in terms of duration/frequency of training data collection) must be considered.
Zheng, Wenjing; Balzer, Laura; van der Laan, Mark; Petersen, Maya
2018-01-30
Binary classification problems are ubiquitous in health and social sciences. In many cases, one wishes to balance two competing optimality considerations for a binary classifier. For instance, in resource-limited settings, a human immunodeficiency virus prevention program based on offering pre-exposure prophylaxis (PrEP) to select high-risk individuals must balance the sensitivity of the binary classifier in detecting future seroconverters (and hence offering them PrEP regimens) with the total number of PrEP regimens that is financially and logistically feasible for the program. In this article, we consider a general class of constrained binary classification problems wherein the objective function and the constraint are both monotonic with respect to a threshold. These include the minimization of the rate of positive predictions subject to a minimum sensitivity, the maximization of sensitivity subject to a maximum rate of positive predictions, and the Neyman-Pearson paradigm, which minimizes the type II error subject to an upper bound on the type I error. We propose an ensemble approach to these binary classification problems based on the Super Learner methodology. This approach linearly combines a user-supplied library of scoring algorithms, with combination weights and a discriminating threshold chosen to minimize the constrained optimality criterion. We then illustrate the application of the proposed classifier to develop an individualized PrEP targeting strategy in a resource-limited setting, with the goal of minimizing the number of PrEP offerings while achieving a minimum required sensitivity. This proof of concept data analysis uses baseline data from the ongoing Sustainable East Africa Research in Community Health study. Copyright © 2017 John Wiley & Sons, Ltd.
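The threshold step of one of these constrained problems, minimizing the rate of positive predictions subject to a minimum sensitivity, can be sketched for a single fixed score. Because both quantities are monotone in the threshold, it is enough to scan thresholds from high to low and stop at the first one that meets the sensitivity constraint. The sketch covers only this step and does not attempt the Super Learner weighting of a library of scoring algorithms; the function name and data are hypothetical.

```python
def pick_threshold(scores, labels, min_sensitivity):
    """Largest threshold t (predict positive when score >= t) whose sensitivity
    on the given data is at least min_sensitivity.

    Returns (threshold, sensitivity, positive_prediction_rate)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    # Lowering t can only raise sensitivity and the positive rate, so the first
    # feasible t in descending order gives the smallest positive rate.
    for t in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        sensitivity = true_pos / n_pos
        if sensitivity >= min_sensitivity:
            positive_rate = sum(1 for s in scores if s >= t) / len(scores)
            return t, sensitivity, positive_rate
    raise ValueError("min_sensitivity cannot be met")

# Hypothetical risk scores and outcomes (1 = future seroconverter).
scores = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]
labels = [1, 0, 1, 1, 0, 0]
threshold, sens, rate = pick_threshold(scores, labels, min_sensitivity=0.6)
```

In practice the threshold would be chosen on cross-validated scores rather than on the training data itself, to avoid an optimistic estimate of the achieved sensitivity.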
Lidar detection of underwater objects using a neuro-SVM-based architecture.
Mitra, Vikramjit; Wang, Chia-Jiu; Banerjee, Satarupa
2006-05-01
This paper presents a neural network architecture using a support vector machine (SVM) as an inference engine (IE) for classification of light detection and ranging (Lidar) data. Lidar data gives a sequence of laser backscatter intensities obtained from laser shots generated from an airborne object at various altitudes above the earth surface. Lidar data is pre-filtered to remove high frequency noise. As the Lidar shots are taken from above the earth surface, the data contains some air backscatter information, which is of no importance for detecting underwater objects. For this reason, the air backscatter information is eliminated from the data and a segment of this data is subsequently selected to extract features for classification. This segment is then encoded using linear predictive coding (LPC) and polynomial approximation. The coefficients thus generated are used as inputs to the two branches of a parallel neural architecture. The decisions obtained from the two branches are vector multiplied and the result is fed to an SVM-based IE that presents the final inference. Two parallel neural architectures using multilayer perceptron (MLP) and hybrid radial basis function (HRBF) are considered in this paper. The proposed structure fits the Lidar data classification task well due to the inherent classification efficiency of neural networks and accurate decision-making capability of SVM. A Bayesian classifier and a quadratic classifier were considered for the Lidar data classification task but they failed to offer high prediction accuracy. Furthermore, a single-layered artificial neural network (ANN) classifier was also considered and it failed to offer good accuracy. The parallel ANN architecture proposed in this paper offers high prediction accuracy (98.9%) and is found to be the most suitable architecture for the proposed task of Lidar data classification.
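As an illustration of the LPC encoding step mentioned in this abstract, the following sketch computes LPC coefficients with the autocorrelation method and the Levinson-Durbin recursion. The abstract does not say which LPC estimator, model order or segment length was used, so those choices, the function name and the toy signal are all assumptions.

```python
def lpc_coefficients(signal, order):
    """Return prediction-error filter coefficients [1, a1, ..., a_order]
    such that signal[t] + a1*signal[t-1] + ... + a_order*signal[t-order]
    has minimum energy (autocorrelation method, Levinson-Durbin recursion).
    """
    n = len(signal)
    # Autocorrelation at lags 0..order.
    r = [sum(signal[t] * signal[t + lag] for t in range(n - lag))
         for lag in range(order + 1)]
    if r[0] == 0:
        raise ValueError("signal has zero energy")
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= (1.0 - k * k)                # remaining prediction error
    return a


# Hypothetical decaying backscatter-like segment: x[t] = 0.5 ** t.
segment = [0.5 ** t for t in range(40)]
print(lpc_coefficients(segment, order=1))
```

For this toy segment each sample is half the previous one, so the first-order coefficient comes out very close to -0.5; the resulting coefficient vector is what would be passed on as classifier input.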
Dess, Brian W; Cardarelli, John; Thomas, Mark J; Stapleton, Jeff; Kroutil, Robert T; Miller, David; Curry, Timothy; Small, Gary W
2018-03-08
A generalized methodology was developed for automating the detection of radioisotopes from gamma-ray spectra collected from an aircraft platform using sodium-iodide detectors. Employing data provided by the U.S. Environmental Protection Agency Airborne Spectral Photometric Environmental Collection Technology (ASPECT) program, multivariate classification models based on nonparametric linear discriminant analysis were developed for application to spectra that were preprocessed through a combination of altitude-based scaling and digital filtering. Training sets of spectra for use in building classification models were assembled from a combination of background spectra collected in the field and synthesized spectra obtained by superimposing laboratory-collected spectra of target radioisotopes onto field backgrounds. This approach eliminated the need for field experimentation with radioactive sources for use in building classification models. Through a bi-Gaussian modeling procedure, the discriminant scores that served as the outputs from the classification models were related to associated confidence levels. This provided an easily interpreted result regarding the presence or absence of the signature of a specific radioisotope in each collected spectrum. Through the use of this approach, classifiers were built for cesium-137 (137Cs) and cobalt-60 (60Co), two radioisotopes that are of interest in airborne radiological monitoring applications. The optimized classifiers were tested with field data collected from a set of six geographically diverse sites, three of which contained either 137Cs, 60Co, or both. When the optimized classification models were applied, the overall percentages of correct classifications for spectra collected at these sites were 99.9 and 97.9% for the 60Co and 137Cs classifiers, respectively. Copyright © 2018 Elsevier Ltd. All rights reserved.
Shi, Weiwei; Bugrim, Andrej; Nikolsky, Yuri; Nikolskya, Tatiana; Brennan, Richard J
2008-01-01
The ideal toxicity biomarker is composed of the properties of prediction (is detected prior to traditional pathological signs of injury), accuracy (high sensitivity and specificity), and mechanistic relationships to the endpoint measured (biological relevance). Gene expression-based toxicity biomarkers ("signatures") have shown good predictive power and accuracy, but are difficult to interpret biologically. We have compared different statistical methods of feature selection with knowledge-based approaches, using GeneGo's database of canonical pathway maps, to generate gene sets for the classification of renal tubule toxicity. The gene set selection algorithms include four univariate analyses: t-statistics, fold-change, B-statistics, and RankProd, and their combination and overlap for the identification of differentially expressed probes. Enrichment analysis following the results of the four univariate analyses, Hotelling T-square test, and, finally, out-of-bag selection, a variant of cross-validation, were used to identify canonical pathway maps (sets of genes coordinately involved in key biological processes) with classification power. Differentially expressed genes identified by the different statistical univariate analyses all generated reasonably performing classifiers of tubule toxicity. Maps identified by enrichment analysis or Hotelling T-square had lower classification power, but highlighted perturbed lipid homeostasis as a common discriminator of nephrotoxic treatments. The out-of-bag method yielded the best functionally integrated classifier. The map "ephrins signaling" performed comparably to a classifier derived using sparse linear programming, a machine learning algorithm, and represents a signaling network specifically involved in renal tubule development and integrity. Such functional descriptors of toxicity promise to better integrate predictive toxicogenomics with mechanistic analysis, facilitating the interpretation and risk assessment of predictive genomic investigations.
Comparing ensemble learning methods based on decision tree classifiers for protein fold recognition.
Bardsiri, Mahshid Khatibi; Eftekhari, Mahdi
2014-01-01
In this paper, several methods for ensemble learning of protein fold recognition based on a decision tree (DT) are compared and contrasted against each other over three datasets taken from the literature. According to previously reported studies, the features of the datasets are divided into groups. Then, for each of these groups, three ensemble classifiers, namely, random forest, rotation forest and AdaBoost.M1, are employed. Also, some fusion methods are introduced for combining the ensemble classifiers obtained in the previous step. After this step, three classifiers are produced based on the combination of classifiers of types random forest, rotation forest and AdaBoost.M1. Finally, the three different classifiers achieved are combined to make an overall classifier. Experimental results show that the overall classifier obtained by the genetic algorithm (GA) weighting fusion method is the best one in comparison to previously applied methods in terms of classification accuracy.
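The weighted fusion step can be sketched as a weighted vote over the labels predicted by the individual ensembles. In the paper the weights are found by a genetic algorithm; here they are simply supplied by hand, and the function name, fold labels and weights are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Fuse one label per classifier into a single label by summing the
    weight of every classifier that voted for it. Ties are broken
    alphabetically so the result is deterministic.
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per prediction is required")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical outputs of random forest, rotation forest and AdaBoost.M1
# for one protein, with hand-picked (not GA-optimized) weights.
votes = ["alpha", "beta", "beta"]
print(weighted_vote(votes, [0.6, 0.25, 0.25]))  # the heavier single vote wins
print(weighted_vote(votes, [1.0, 1.0, 1.0]))    # plain majority vote
```

With equal weights the function reduces to majority voting, which makes it easy to see what a weight search adds on top.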
A fuzzy classifier system for process control
NASA Technical Reports Server (NTRS)
Karr, C. L.; Phillips, J. C.
1994-01-01
A fuzzy classifier system that discovers rules for controlling a mathematical model of a pH titration system was developed by researchers at the U.S. Bureau of Mines (USBM). Fuzzy classifier systems successfully combine the strengths of learning classifier systems and fuzzy logic controllers. Learning classifier systems resemble familiar production rule-based systems, but they represent their IF-THEN rules by strings of characters rather than in the traditional linguistic terms. Fuzzy logic is a tool that allows for the incorporation of abstract concepts into rule-based systems, thereby allowing the rules to resemble the familiar 'rules-of-thumb' commonly used by humans when solving difficult process control and reasoning problems. Like learning classifier systems, fuzzy classifier systems employ a genetic algorithm to explore and sample new rules for manipulating the problem environment. Like fuzzy logic controllers, fuzzy classifier systems encapsulate knowledge in the form of production rules. The results presented in this paper demonstrate the ability of fuzzy classifier systems to generate a fuzzy logic-based process control system.
Classification of Malaysia aromatic rice using multivariate statistical analysis
NASA Astrophysics Data System (ADS)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.
2015-05-01
Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. Consumers prefer these varieties for criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
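The KNN evaluation with Leave-One-Out validation mentioned in this abstract can be sketched as follows. The two-feature readings, the variety labels and k = 3 are hypothetical and are not the paper's data; real e-nose readings would have one feature per sensor, possibly after PCA or LDA projection.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Label a query point by majority vote among its k nearest
    training samples (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_error(samples, k):
    """Hold out each sample in turn, classify it with the remaining ones,
    and return the fraction of held-out samples that were misclassified."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(samples)


# Hypothetical sensor responses for two rice varieties, A and B.
samples = [
    ((0.0, 0.1), "A"), ((0.2, 0.0), "A"), ((0.1, 0.3), "A"),
    ((2.0, 2.1), "B"), ((2.2, 1.9), "B"), ((1.9, 2.3), "B"),
]
print(leave_one_out_error(samples, k=3))
```

Leave-One-Out suits small sample sets like this one because every sample is used for testing exactly once while the classifier still sees all the others.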
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak.
Donoho, David; Jin, Jiashun
2008-09-30
In important application fields today (genomics and proteomics are examples), selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For $i = 1, 2, \ldots, p$, let $\pi_i$ denote the two-sided P-value associated with the $i$th feature Z-score and $\pi_{(i)}$ denote the $i$th order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective $(i/p - \pi_{(i)})/\sqrt{(i/p)(1 - i/p)}$. We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT.
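The threshold rule stated in this abstract can be sketched directly: convert each Z-score to a two-sided P-value, sort the P-values, evaluate the HC objective at every rank, and return the absolute Z-score at the maximizing rank. This sketch scans every rank below p, whereas practical variants usually restrict the range of ranks considered; that simplification, the function name and the toy Z-scores are assumptions.

```python
import math
from statistics import NormalDist


def hc_threshold(z_scores):
    """Return the absolute Z-score whose sorted two-sided P-value maximizes
    (i/p - pi_(i)) / sqrt((i/p) * (1 - i/p)) over ranks i = 1..p-1."""
    p = len(z_scores)
    std_normal = NormalDist()
    # Pair each two-sided P-value with its |Z| and sort by P-value.
    pairs = sorted(
        (2.0 * (1.0 - std_normal.cdf(abs(z))), abs(z)) for z in z_scores
    )
    best_hc, best_abs_z = -math.inf, None
    for i, (p_value, abs_z) in enumerate(pairs, start=1):
        frac = i / p
        if frac >= 1.0:
            continue  # the denominator vanishes at i = p
        hc = (frac - p_value) / math.sqrt(frac * (1.0 - frac))
        if hc > best_hc:
            best_hc, best_abs_z = hc, abs_z
    return best_abs_z


# Hypothetical Z-scores: two strong features among eight near-null ones.
z = [4.0, 3.5, 0.2, -0.1, 0.5, -0.3, 0.1, 0.4, -0.2, 0.0]
threshold = hc_threshold(z)
selected = [value for value in z if abs(value) >= threshold]
print(threshold, selected)
```

Features whose absolute Z-score reaches the returned threshold are the ones kept for the classifier.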
Higher criticism thresholding: Optimal feature selection when useful features are rare and weak
Donoho, David; Jin, Jiashun
2008-01-01
In important application fields today (genomics and proteomics are examples), selecting a small subset of useful features is crucial for success of Linear Classification Analysis. We study feature selection by thresholding of feature Z-scores and introduce a principle of threshold selection, based on the notion of higher criticism (HC). For $i = 1, 2, \ldots, p$, let $\pi_i$ denote the two-sided P-value associated with the $i$th feature Z-score and $\pi_{(i)}$ denote the $i$th order statistic of the collection of P-values. The HC threshold is the absolute Z-score corresponding to the P-value maximizing the HC objective $(i/p - \pi_{(i)})/\sqrt{(i/p)(1 - i/p)}$. We consider a rare/weak (RW) feature model, where the fraction of useful features is small and the useful features are each too weak to be of much use on their own. HC thresholding (HCT) has interesting behavior in this setting, with an intimate link between maximizing the HC objective and minimizing the error rate of the designed classifier, and very different behavior from popular threshold selection procedures such as false discovery rate thresholding (FDRT). In the most challenging RW settings, HCT uses an unconventionally low threshold; this keeps the missed-feature detection rate under better control than FDRT and yields a classifier with improved misclassification performance. Replacing cross-validated threshold selection in the popular Shrunken Centroid classifier with the computationally less expensive and simpler HCT reduces the variance of the selected threshold and the error rate of the constructed classifier. Results on standard real datasets and in asymptotic theory confirm the advantages of HCT. PMID:18815365
DOE Office of Scientific and Technical Information (OSTI.GOV)
Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md
Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. Consumers prefer these varieties for criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks: they require lengthy training, are prone to fatigue as the number of samples increases, and are inconsistent. GC-MS analysis, on the other hand, requires detailed procedures and lengthy analysis and is quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the rice varieties. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.