Sample records for nearest prototype classifiers

  1. A swarm-trained k-nearest prototypes adaptive classifier with automatic feature selection for interval data.

    PubMed

    Silva Filho, Telmo M; Souza, Renata M C R; Prudêncio, Ricardo B C

    2016-08-01

    Some complex data types are capable of modeling data variability and imprecision. These data types are studied in the symbolic data analysis field. One such data type is interval data, which represents ranges of values and is more versatile than classic point data for many domains. This paper proposes a new prototype-based classifier for interval data, trained by a swarm optimization method. Our work has two main contributions: a swarm method which is capable of performing both automatic selection of features and pruning of unused prototypes and a generalized weighted squared Euclidean distance for interval data. By discarding unnecessary features and prototypes, the proposed algorithm deals with typical limitations of prototype-based methods, such as the problem of prototype initialization. The proposed distance is useful for learning classes in interval datasets with different shapes, sizes and structures. When compared to other prototype-based methods, the proposed method achieves lower error rates in both synthetic and real interval datasets. Copyright © 2016 Elsevier Ltd. All rights reserved.
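
    As a rough illustration of the ideas in this abstract, the sketch below classifies an interval-valued sample by its nearest prototype under a weighted squared Euclidean distance. The distance form (squared differences of lower and upper bounds, scaled by one weight per feature) is a common textbook choice and an assumption here, not the paper's generalized distance; the names `interval_distance` and `nearest_prototype` are hypothetical, and the swarm training step is not shown. Setting a weight to zero mimics discarding a feature.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def interval_distance(x: Sequence[Interval], p: Sequence[Interval],
                      weights: Sequence[float]) -> float:
    """Weighted squared Euclidean distance between two interval vectors.

    Assumed form: sum_j w_j * ((x_lo - p_lo)^2 + (x_hi - p_hi)^2).
    """
    total = 0.0
    for (x_lo, x_hi), (p_lo, p_hi), w in zip(x, p, weights):
        total += w * ((x_lo - p_lo) ** 2 + (x_hi - p_hi) ** 2)
    return total


def nearest_prototype(x: Sequence[Interval],
                      prototypes: List[Tuple[str, Sequence[Interval]]],
                      weights: Sequence[float]) -> str:
    """Return the label of the prototype closest to x."""
    label, _ = min(prototypes,
                   key=lambda lp: interval_distance(x, lp[1], weights))
    return label


# Hypothetical toy data: two prototypes described by two interval features.
prototypes = [
    ("cold", [(0.0, 5.0), (30.0, 40.0)]),
    ("warm", [(15.0, 25.0), (50.0, 70.0)]),
]
sample = [(14.0, 22.0), (31.0, 41.0)]

# Only the first feature is active; the second has weight 0 (discarded).
label_first_feature = nearest_prototype(sample, prototypes, [1.0, 0.0])
# Only the second feature is active.
label_second_feature = nearest_prototype(sample, prototypes, [0.0, 1.0])
```

    In this toy case the predicted label depends on which feature is switched on, which is why learning the weights (as the abstract describes) matters for prototype-based methods.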

  2. Generative Models for Similarity-based Classification

    DTIC Science & Technology

    2007-01-01

    NC), local nearest centroid (local NC), k-nearest neighbors (kNN), and condensed nearest neighbors (CNN) are all similarity-based classifiers which...vector machine to the k nearest neighbors of the test sample [80]. The SVM-KNN method was developed to address the robustness and dimensionality...concerns that afflict nearest neighbors and SVMs. Similarly to the nearest-means classifier, the SVM-KNN is a hybrid local and global classifier developed

  3. Frog sound identification using extended k-nearest neighbor classifier

    NASA Astrophysics Data System (ADS)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction methods and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with the Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aim of improving classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample as one of their nearest neighbors. To evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, namely k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation, and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results show that the EKNN classifier exhibits the best accuracy compared to the competing classifiers, KNN, FKNN, KGNN and MKNN, in all cases.
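
    The two-way neighborhood idea in this abstract can be sketched as follows. This is a simplified illustration, not the authors' EKNN decision rule: it pools the k nearest training samples of the query with those training samples that would count the query among their own k nearest neighbors, then takes a plain majority vote. The function name `extended_knn_predict` and the toy data are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def extended_knn_predict(train: List[Tuple[Sequence[float], str]],
                         query: Sequence[float], k: int) -> str:
    n = len(train)

    # 1) Training samples that are among the k nearest neighbors of the query.
    order = sorted(range(n), key=lambda i: euclidean(train[i][0], query))
    voters = set(order[:k])

    # 2) Training samples that would place the query among their own k nearest
    #    neighbors (the query is at least as close as their current k-th one).
    for i in range(n):
        to_others = sorted(euclidean(train[i][0], train[j][0])
                           for j in range(n) if j != i)
        kth = to_others[k - 1] if len(to_others) >= k else math.inf
        if euclidean(train[i][0], query) <= kth:
            voters.add(i)

    # 3) Simplifying assumption: unweighted majority vote over the pooled set.
    votes = Counter(train[i][1] for i in voters)
    return votes.most_common(1)[0][0]


# Hypothetical two-cluster toy data.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
pred_near_a = extended_knn_predict(train, (0.5, 0.5), k=3)
pred_near_b = extended_knn_predict(train, (5.5, 5.5), k=3)
```

    The reverse-neighbor step costs a quadratic number of distance computations in this naive form; a practical implementation would presumably cache the pairwise distances.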

  4. Centre-based restricted nearest feature plane with angle classifier for face recognition

    NASA Astrophysics Data System (ADS)

    Tang, Linlin; Lu, Huifen; Zhao, Liang; Li, Zuohua

    2017-10-01

    An improved classifier based on the nearest feature plane (NFP), called the centre-based restricted nearest feature plane with angle (RNFPA) classifier, is proposed for face recognition problems. The well-known NFP uses the geometrical information of samples to increase the number of training samples, but it increases the computational complexity and also suffers from an inaccuracy problem caused by the extended feature plane. To solve these problems, RNFPA exploits a centre-based feature plane and uses an angle threshold to restrict the extended feature space. By choosing an appropriate angle threshold, RNFPA can improve performance and decrease computational complexity. Experiments on the AT&T, AR and FERET face databases are used to evaluate the proposed classifier. Compared with the original NFP classifier, the nearest feature line (NFL) classifier, the nearest neighbour (NN) classifier and some other improved NFP classifiers, the proposed one achieves competitive performance.

  5. Error minimizing algorithms for nearest neighbor classifiers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Porter, Reid B; Hush, Don; Zimmer, G. Beate

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  6. Finger vein identification using fuzzy-based k-nearest centroid neighbor classifier

    NASA Astrophysics Data System (ADS)

    Rosdi, Bakhtiar Affendi; Jaafar, Haryati; Ramli, Dzati Athiar

    2015-02-01

    In this paper, a new approach for personal identification using finger vein images is presented. Finger vein is an emerging type of biometrics that is attracting the attention of researchers in the biometrics area. Compared to other biometric traits such as face, fingerprint and iris, finger vein is more secure and harder to counterfeit since the features are inside the human body. So far, most researchers have focused on how to extract robust features from the captured vein images, and little research has been conducted on the classification of the extracted features. In this paper, a new classifier called fuzzy-based k-nearest centroid neighbor (FkNCN) is applied to classify the finger vein image. The proposed FkNCN employs a surrounding rule to obtain the k-nearest centroid neighbors based on the spatial distribution of the training images and their distance to the test image. Then, the fuzzy membership function is used to assign the test image to the class that is frequently represented by the k-nearest centroid neighbors. Experimental evaluation using our own database, which was collected from 492 fingers, shows that the proposed FkNCN performs better than the k-nearest neighbor, k-nearest centroid neighbor and fuzzy-based k-nearest neighbor classifiers. This shows that the proposed classifier is able to identify the finger vein image effectively.
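
    The surrounding rule mentioned in this abstract can be sketched as below, following the usual description of k-nearest centroid neighbors: the first neighbor is the plain nearest neighbor, and each later neighbor is the remaining sample whose inclusion brings the centroid of the selected set closest to the query. The voting step uses inverse-distance weights as a stand-in for the fuzzy membership function, which is an assumption and not the paper's formula; all function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def k_nearest_centroid_neighbors(points: List[Sequence[float]],
                                 query: Sequence[float], k: int) -> List[int]:
    """Return indices of the k nearest centroid neighbors of the query."""
    dim = len(query)
    remaining = list(range(len(points)))
    chosen: List[int] = []
    running_sum = [0.0] * dim  # coordinate-wise sum of the chosen points

    for _ in range(min(k, len(points))):
        size = len(chosen) + 1

        def centroid_distance(i: int) -> float:
            # Distance from the query to the centroid of (chosen + candidate i).
            centroid = [(running_sum[d] + points[i][d]) / size
                        for d in range(dim)]
            return dist(centroid, query)

        best = min(remaining, key=centroid_distance)
        chosen.append(best)
        remaining.remove(best)
        running_sum = [running_sum[d] + points[best][d] for d in range(dim)]
    return chosen


def kncn_classify(train: List[Tuple[Sequence[float], str]],
                  query: Sequence[float], k: int) -> str:
    points = [features for features, _ in train]
    votes: Counter = Counter()
    for i in k_nearest_centroid_neighbors(points, query, k):
        # Assumed weighting: closer neighbors contribute more to their class.
        votes[train[i][1]] += 1.0 / (dist(points[i], query) + 1e-9)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: plain 2-NN of the origin would pick indices 0 and 1,
# while the surrounding rule prefers a neighbor on the opposite side.
train = [((1.0, 0.0), "a"), ((1.2, 0.0), "a"), ((-2.0, 0.0), "b")]
neighbors = k_nearest_centroid_neighbors([f for f, _ in train], (0.0, 0.0), 2)
label = kncn_classify(train, (0.0, 0.0), 2)
```

    The toy data is chosen so that the selected neighbors surround the query instead of clustering on one side of it, which is the property the rule is designed to provide.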

  7. Fall Detection System for the Elderly Based on the Classification of Shimmer Sensor Prototype Data

    PubMed Central

    Ahmed, Moiz; Mehmood, Nadeem; Mehmood, Amir; Rizwan, Kashif

    2017-01-01

    Objectives: Falling in the elderly is considered a major cause of death. In recent years, ambient and wireless sensor platforms have been extensively used in developed countries for the detection of falls in the elderly. However, we believe extra efforts are required to address this issue in developing countries, such as Pakistan, where most deaths due to falls are not even reported. Considering this, in this paper, we propose a fall detection system prototype that is based on the classification of real-time Shimmer sensor data. Methods: We first developed a data set, ‘SMotion’, of certain postures that could lead to falls in the elderly by using a body area network of Shimmer sensors and categorized the items in this data set into age and weight groups. We developed a feature selection and classification system using three classifiers, namely, support vector machine (SVM), K-nearest neighbor (KNN), and neural network (NN). Finally, a prototype was fabricated to generate alerts to caregivers, health experts, or emergency services in case of a fall. Results: To evaluate the proposed system, SVM, KNN, and NN were used. The results of this study identified KNN as the most accurate classifier, with a maximum accuracy of 96% for age groups and 93% for weight groups. Conclusions: In this paper, a classification-based fall detection system is proposed. For this purpose, the SMotion data set was developed and categorized into two groups (age and weight groups). The proposed fall detection system for the elderly is implemented through a body area sensor network using third-generation sensors. The evaluation results demonstrate the reasonable performance of the proposed fall detection prototype system in the tested scenarios. PMID:28875049

  8. An Improvement To The k-Nearest Neighbor Classifier For ECG Database

    NASA Astrophysics Data System (ADS)

    Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul

    2018-03-01

    The k nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, the performance of kNN often degrades due to the lack of information on how the samples are distributed. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN concerns the weighting used when assigning the class label. Thus, to address these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of the sample distribution. Then, a surrounding rule is employed to obtain the nearest centroid neighbors based on the distribution of the training samples and their distance to the query point. Subsequently, the fuzzy membership function is employed to assign the query point to the class label which is frequently represented by the nearest centroid neighbors. Experimental studies on electrocardiogram (ECG) signals are carried out in this study. The classification performance is evaluated in two experimental steps, i.e. different values of k and different sizes of feature dimensions. Subsequently, a comparative study of the kNN, kNCN, FkNN and MFkNCN classifiers is conducted to evaluate the performance of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds that of kNN, kNCN and FkNN, with a best classification rate of 96.5%.

  9. Data mining of text as a tool in authorship attribution

    NASA Astrophysics Data System (ADS)

    Visa, Ari J. E.; Toivonen, Jarmo; Autio, Sami; Maekinen, Jarno; Back, Barbro; Vanharanta, Hannu

    2001-03-01

    It is common that text documents are characterized and classified by keywords that the authors assign to them. Visa et al. have developed a new methodology based on prototype matching. The prototype is an interesting document or a part of an extracted, interesting text. This prototype is matched with the document database of the monitored document flow. The new methodology is capable of extracting the meaning of the document to a certain degree. Our claim is that the new methodology is also capable of authenticating the authorship. To verify this claim two tests were designed. The test hypothesis was that the words and the word order in the sentences could authenticate the author. In the first test three authors were selected. The selected authors were William Shakespeare, Edgar Allan Poe, and George Bernard Shaw. Three texts from each author were examined. Every text was used, one by one, as a prototype. The two nearest matches with the prototype were noted. The second test uses the Reuters-21578 financial news database. A group of 25 short financial news reports from five different authors is examined. Our new methodology and the interesting results from the two tests are reported in this paper. In the first test, for Shakespeare and for Poe all cases were successful. For Shaw one text was confused with Poe. In the second test the Reuters-21578 financial news reports were attributed to their authors relatively well. The conclusion is that our text mining methodology seems to be capable of authorship attribution.

  10. Examining change detection approaches for tropical mangrove monitoring

    USGS Publications Warehouse

    Myint, Soe W.; Franklin, Janet; Buenemann, Michaela; Kim, Won; Giri, Chandra

    2014-01-01

    This study evaluated the effectiveness of different band combinations and classifiers (unsupervised, supervised, object-oriented nearest neighbor, and object-oriented decision rule) for quantifying mangrove forest change using multitemporal Landsat data. A discriminant analysis using spectra of different vegetation types determined that bands 2 (0.52 to 0.6 μm), 5 (1.55 to 1.75 μm), and 7 (2.08 to 2.35 μm) were the most effective bands for differentiating mangrove forests from surrounding land cover types. A ranking of thirty-six change maps, produced by comparing the classification accuracy of twelve change detection approaches, was used. The object-based Nearest Neighbor classifier produced the highest mean overall accuracy (84 percent) regardless of band combinations. The automated decision rule-based approach (mean overall accuracy of 88 percent) as well as a composite of bands 2, 5, and 7 used with the unsupervised classifier and the same composite or all band difference with the object-oriented Nearest Neighbor classifier were the most effective approaches.

  11. A survey of supervised machine learning models for mobile-phone based pathogen identification and classification

    NASA Astrophysics Data System (ADS)

    Ceylan Koydemir, Hatice; Feng, Steve; Liang, Kyle; Nadkarni, Rohan; Tseng, Derek; Benien, Parul; Ozcan, Aydogan

    2017-03-01

    Giardia lamblia causes a disease known as giardiasis, which results in diarrhea, abdominal cramps, and bloating. Although conventional pathogen detection methods used in water analysis laboratories offer high sensitivity and specificity, they are time consuming, and need experts to operate bulky equipment and analyze the samples. Here we present a field-portable and cost-effective smartphone-based waterborne pathogen detection platform that can automatically classify Giardia cysts using machine learning. Our platform enables the detection and quantification of Giardia cysts in one hour, including sample collection, labeling, filtration, and automated counting steps. We evaluated the performance of three prototypes using Giardia-spiked water samples from different sources (e.g., reagent-grade, tap, non-potable, and pond water samples). We populated a training database with >30,000 cysts and estimated our detection sensitivity and specificity using 20 different classifier models, including decision trees, nearest neighbor classifiers, support vector machines (SVMs), and ensemble classifiers, and compared their speed of training and classification, as well as predicted accuracies. Among them, cubic SVM, medium Gaussian SVM, and bagged-trees were the most promising classifier types with accuracies of 94.1%, 94.2%, and 95%, respectively; we selected the latter as our preferred classifier for the detection and enumeration of Giardia cysts that are imaged using our mobile-phone fluorescence microscope. Without the need for any experts or microbiologists, this field-portable pathogen detection platform can present a useful tool for water quality monitoring in resource-limited settings.

  12. Colorectal Cancer and Colitis Diagnosis Using Fourier Transform Infrared Spectroscopy and an Improved K-Nearest-Neighbour Classifier.

    PubMed

    Li, Qingbo; Hao, Can; Kang, Xue; Zhang, Jialin; Sun, Xuejun; Wang, Wenbo; Zeng, Haishan

    2017-11-27

    Combining Fourier transform infrared spectroscopy (FTIR) with endoscopy, it is expected that noninvasive, rapid detection of colorectal cancer can be performed in vivo in the future. In this study, Fourier transform infrared spectra were collected from 88 endoscopic biopsy colorectal tissue samples (41 colitis and 47 cancers). A new method, viz., entropy weight local-hyperplane k-nearest-neighbor (EWHK), which is an improved version of K-local hyperplane distance nearest-neighbor (HKNN), is proposed for tissue classification. In order to avoid limiting high dimensions and small values of the nearest neighbor, the new EWHK method calculates feature weights based on information entropy. The average results of the random classification showed that the EWHK classifier for differentiating cancer from colitis samples produced a sensitivity of 81.38% and a specificity of 92.69%.

  13. Weighted Parzen Windows for Pattern Classification

    DTIC Science & Technology

    1994-05-01

    Nearest-Neighbor Rule The k-Nearest-Neighbor (kNN) technique is nonparametric, assuming nothing about the distribution of the data. Stated succinctly...probabilities P(wj | x) from samples." Raudys and Jain [20:255] advance this interpretation by pointing out that the kNN technique can be viewed as the..."Parzen window classifier with a hyper-rectangular window function." As with the Parzen-window technique, the kNN classifier is more accurate as the

  14. Smart BIT/TSMD Integration

    DTIC Science & Technology

    1991-12-01

    user using the ’: knn ’ option in the do-scenario command line). An instance of the K-Nearest Neighbor object is first created and initialized before...Navigation Computer HF High Frequency ILS Instrument Landing System KNN K-Nearest Neighbor LRU Line Replaceable Unit MC Mission Computer MTCA...approaches have been investigated here, K-nearest Neighbors (KNN) and neural networks (NN). Both approaches require that previously classified examples of

  15. AVNM: A Voting based Novel Mathematical Rule for Image Classification.

    PubMed

    Vidyarthi, Ankit; Mittal, Namita

    2016-12-01

    In machine learning, the accuracy of the system depends upon classification result. Classification accuracy plays an imperative role in various domains. Non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Besides its easiness, simplicity and effectiveness characteristics, the main problem associated with KNN classifier is the selection of a number of nearest neighbors i.e. "k" for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by the prescribed problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for unidentified sample. AVNM is free from any initial selection of predefined variable and neighbor selection as found in KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from UCI database and one manually created dataset. The experimental result shows that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimentation results based on confusion matrix accuracy parameter proves higher accuracy value with AVNM rule. The proposed AVNM rule is based on sample space reduction mechanism for identification of an optimal number of nearest neighbor selections. AVNM results in better classification accuracy and minimum error rate as compared with the state-of-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbor selection and improves classification rate for UCI dataset and manually created dataset. 
    Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. An incremental knowledge assimilation system (IKAS) for mine detection

    NASA Astrophysics Data System (ADS)

    Porway, Jake; Raju, Chaitanya; Varadarajan, Karthik Mahesh; Nguyen, Hieu; Yadegar, Joseph

    2010-04-01

    In this paper we present an adaptive incremental learning system for underwater mine detection and classification that utilizes statistical models of seabed texture and an adaptive nearest-neighbor classifier to identify varied underwater targets in many different environments. The first stage of processing uses our Background Adaptive ANomaly detector (BAAN), which identifies statistically likely target regions using Gabor filter responses over the image. Using this information, BAAN classifies the background type and updates its detection using background-specific parameters. To perform classification, a Fully Adaptive Nearest Neighbor (FAAN) determines the best label for each detection. FAAN uses an extremely fast version of Nearest Neighbor to find the most likely label for the target. The classifier perpetually assimilates new and relevant information into its existing knowledge database in an incremental fashion, allowing improved classification accuracy and capturing concept drift in the target classes. Experiments show that the system achieves >90% classification accuracy on underwater mine detection tasks performed on synthesized datasets provided by the Office of Naval Research. We have also demonstrated that the system can incrementally improve its detection accuracy by constantly learning from new samples.

  17. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    PubMed

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
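
    The oversampling step named in this abstract can be sketched as follows. SMOTE, as it is commonly described, creates synthetic minority samples by interpolating between a minority sample and one of its k nearest minority-class neighbors. The code below is a minimal standard-library sketch of that interpolation only; the function name `smote`, its parameters and the toy points are hypothetical and are not taken from the paper, and the downstream KNN classifier is not shown.

```python
import math
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def smote(minority: Sequence[Point], n_new: int,
          k: int = 5, seed: int = 0) -> List[Point]:
    """Generate n_new synthetic points from the minority-class samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    synthetic: List[Point] = []
    for _ in range(n_new):
        # Pick a base sample and rank the other minority samples by distance.
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbors and interpolate towards it.
        mate = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic


# Hypothetical minority class with four samples on the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote(minority, n_new=10, k=2)
```

    Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already occupied by the minority class, which is the intended behavior when rebalancing a training set before fitting a classifier.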

  18. Fuzzy-Rough Nearest Neighbour Classification

    NASA Astrophysics Data System (ADS)

    Jensen, Richard; Cornelis, Chris

    A new fuzzy-rough nearest neighbour (FRNN) classification algorithm is presented in this paper, as an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach. By contrast to the latter, our method uses the nearest neighbours to construct lower and upper approximations of decision classes, and classifies test instances based on their membership to these approximations. In the experimental analysis, we evaluate our approach with both classical fuzzy-rough approximations (based on an implicator and a t-norm), as well as with the recently introduced vaguely quantified rough sets. Preliminary results are very good, and in general FRNN outperforms FRNN-O, as well as the traditional fuzzy nearest neighbour (FNN) algorithm.

  19. Distributed Adaptive Binary Quantization for Fast Nearest Neighbor Search.

    PubMed

    Xianglong Liu; Zhujin Li; Cheng Deng; Dacheng Tao

    2017-11-01

    Hashing has proven to be an attractive technique for fast nearest neighbor search over big data. Compared with projection-based hashing methods, prototype-based ones have greater power to generate discriminative binary codes for data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys fast training that is linear in the number of training data. We further devise a distributed framework for large-scale learning, which can significantly speed up the training of ABQ in the distributed environments that are now widely deployed in many areas. Extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with relative performance gains of up to 58.84%.

  20. Automatic classification and detection of clinically relevant images for diabetic retinopathy

    NASA Astrophysics Data System (ADS)

    Xu, Xinyu; Li, Baoxin

    2008-03-01

    We propose a novel approach to automatic classification of Diabetic Retinopathy (DR) images and retrieval of clinically-relevant DR images from a database. Given a query image, our approach first classifies the image into one of three categories: microaneurysm (MA), neovascularization (NV) and normal, and then it retrieves DR images that are clinically-relevant to the query image from an archival image database. In the classification stage, the query DR images are classified by the Multi-class Multiple-Instance Learning (McMIL) approach, where images are viewed as bags, each of which contains a number of instances corresponding to non-overlapping blocks, and each block is characterized by low-level features including color, texture, histogram of edge directions, and shape. McMIL first learns a collection of instance prototypes for each class that maximizes the Diverse Density function using the Expectation-Maximization algorithm. A nonlinear mapping is then defined using the instance prototypes and maps every bag to a point in a new multi-class bag feature space. Finally a multi-class Support Vector Machine is trained in the multi-class bag feature space. In the retrieval stage, we retrieve images from the archival database that bear the same label as the query image, and that are the top K nearest neighbors of the query image in terms of similarity in the multi-class bag feature space. The classification approach achieves high classification accuracy, and the retrieval of clinically-relevant images not only facilitates utilization of the vast amount of hidden diagnostic knowledge in the database, but also improves the efficiency and accuracy of DR lesion diagnosis and assessment.

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Mason, J.

    CCHDT constructs and classifies various arrangements of hard disks of a single radius placed on the unit square with periodic boundary conditions. Specifically, a given configuration is evolved to the nearest critical point on a smoothed hard disk energy function, and is classified by the adjacency matrix of the canonically labelled contact graph.

  2. Histogram Curve Matching Approaches for Object-based Image Classification of Land Cover and Land Use

    PubMed Central

    Toure, Sory I.; Stow, Douglas A.; Weeks, John R.; Kumar, Sunil

    2013-01-01

    The classification of image-objects is usually done using parametric statistical measures of central tendency and/or dispersion (e.g., mean or standard deviation). The objectives of this study were to analyze digital number histograms of image objects and evaluate classification measures exploiting characteristic signatures of such histograms. Two histogram matching classifiers were evaluated and compared to the standard nearest neighbor to mean classifier. An ADS40 airborne multispectral image of San Diego, California was used for assessing the utility of curve matching classifiers in a geographic object-based image analysis (GEOBIA) approach. The classifications were performed with data sets having 0.5 m, 2.5 m, and 5 m spatial resolutions. Results show that histograms are reliable features for characterizing classes. Also, both histogram matching classifiers consistently performed better than the one based on the standard nearest neighbor to mean rule. The highest classification accuracies were produced with images having 2.5 m spatial resolution. PMID:24403648

  3. An integrated classifier for computer-aided diagnosis of colorectal polyps based on random forest and location index strategies

    NASA Astrophysics Data System (ADS)

    Hu, Yifan; Han, Hao; Zhu, Wei; Li, Lihong; Pickhardt, Perry J.; Liang, Zhengrong

    2016-03-01

    Feature classification plays an important role in differentiation or computer-aided diagnosis (CADx) of suspicious lesions. As a widely used ensemble learning algorithm for classification, random forest (RF) has a distinguished performance for CADx. Our recent study has shown that the location index (LI), which is derived from the well-known kNN (k nearest neighbor) and wkNN (weighted k nearest neighbor) classifier [1], has also a distinguished role in the classification for CADx. Therefore, in this paper, based on the property that the LI will achieve a very high accuracy, we design an algorithm to integrate the LI into RF for improved or higher value of AUC (area under the curve of receiver operating characteristics -- ROC). Experiments were performed by the use of a database of 153 lesions (polyps), including 116 neoplastic lesions and 37 hyperplastic lesions, with comparison to the existing classifiers of RF and wkNN, respectively. A noticeable gain by the proposed integrated classifier was quantified by the AUC measure.

  4. Evaluation of the maximum-likelihood adaptive neural system (MLANS) applications to noncooperative IFF

    NASA Astrophysics Data System (ADS)

    Chernick, Julian A.; Perlovsky, Leonid I.; Tye, David M.

    1994-06-01

    This paper describes applications of maximum likelihood adaptive neural system (MLANS) to the characterization of clutter in IR images and to the identification of targets. The characterization of image clutter is needed to improve target detection and to enhance the ability to compare performance of different algorithms using diverse imagery data. Enhanced unambiguous IFF is important for fratricide reduction while automatic cueing and targeting is becoming an ever increasing part of operations. We utilized MLANS which is a parametric neural network that combines optimal statistical techniques with a model-based approach. This paper shows that MLANS outperforms classical classifiers, the quadratic classifier and the nearest neighbor classifier, because on the one hand it is not limited to the usual Gaussian distribution assumption and can adapt in real time to the image clutter distribution; on the other hand MLANS learns from fewer samples and is more robust than the nearest neighbor classifiers. Future research will address uncooperative IFF using fused IR and MMW data.

  5. Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers

    NASA Astrophysics Data System (ADS)

    Sanal Kumar, K. P.; Bhavani, R., Dr.

    2017-08-01

    Egocentric vision is a distinct, human-centric perspective in computer vision. Recognizing egocentric actions is a challenging task with applications in assisting elderly people, disabled patients and others. In this work, life-logging activity videos are taken as input. Activities are organized into two categories, a top level and a second level. Recognition uses Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory features, which are fused into a single feature. The extracted features are reduced using Principal Component Analysis (PCA) and provided as input to three classifiers: Support Vector Machine (SVM), k nearest neighbor (kNN), and a combination of the two (combined SVMkNN). These classifiers are evaluated, and the combined SVMkNN provided better results than other classifiers in the literature.

  6. Applied algorithm in the liner inspection of solid rocket motors

    NASA Astrophysics Data System (ADS)

    Hoffmann, Luiz Felipe Simões; Bizarria, Francisco Carlos Parquet; Bizarria, José Walter Parquet

    2018-03-01

    In rocket motors, the bonding between the solid propellant and thermal insulation is accomplished by a thin adhesive layer, known as liner. The liner application method involves a complex sequence of tasks, which includes in its final stage, the surface integrity inspection. Nowadays in Brazil, an expert carries out a thorough visual inspection to detect defects on the liner surface that may compromise the propellant interface bonding. Therefore, this paper proposes an algorithm that uses the photometric stereo technique and the K-nearest neighbor (KNN) classifier to assist the expert in the surface inspection. Photometric stereo allows the surface information recovery of the test images, while the KNN method enables image pixels classification into two classes: non-defect and defect. Tests performed on a computer vision based prototype validate the algorithm. The positive results suggest that the algorithm is feasible and when implemented in a real scenario, will be able to help the expert in detecting defective areas on the liner surface.

  7. A biologically plausible computational model for auditory object recognition.

    PubMed

    Larson, Eric; Billimoria, Cyrus P; Sen, Kamal

    2009-01-01

    Object recognition is a task of fundamental importance for sensory systems. Although this problem has been intensively investigated in the visual system, relatively little is known about the recognition of complex auditory objects. Recent work has shown that spike trains from individual sensory neurons can be used to discriminate between and recognize stimuli. Multiple groups have developed spike similarity or dissimilarity metrics to quantify the differences between spike trains. Using a nearest-neighbor approach the spike similarity metrics can be used to classify the stimuli into groups used to evoke the spike trains. The nearest prototype spike train to the tested spike train can then be used to identify the stimulus. However, how biological circuits might perform such computations remains unclear. Elucidating this question would facilitate the experimental search for such circuits in biological systems, as well as the design of artificial circuits that can perform such computations. Here we present a biologically plausible model for discrimination inspired by a spike distance metric using a network of integrate-and-fire model neurons coupled to a decision network. We then apply this model to the birdsong system in the context of song discrimination and recognition. We show that the model circuit is effective at recognizing individual songs, based on experimental input data from field L, the avian primary auditory cortex analog. We also compare the performance and robustness of this model to two alternative models of song discrimination: a model based on coincidence detection and a model based on firing rate.

  8. Gently does it: Humans outperform a software classifier in recognizing subtle, nonstereotypical facial expressions.

    PubMed

    Yitzhak, Neta; Giladi, Nir; Gurevich, Tanya; Messinger, Daniel S; Prince, Emily B; Martin, Katherine; Aviezer, Hillel

    2017-12-01

    According to dominant theories of affect, humans innately and universally express a set of emotions using specific configurations of prototypical facial activity. Accordingly, thousands of studies have tested emotion recognition using sets of highly intense and stereotypical facial expressions, yet their incidence in real life is virtually unknown. In fact, a commonplace experience is that emotions are expressed in subtle and nonprototypical forms. Such facial expressions are at the focus of the current study. In Experiment 1, we present the development and validation of a novel stimulus set consisting of dynamic and subtle emotional facial displays conveyed without constraining expressers to using prototypical configurations. Although these subtle expressions were more challenging to recognize than prototypical dynamic expressions, they were still well recognized by human raters, and perhaps most importantly, they were rated as more ecological and naturalistic than the prototypical expressions. In Experiment 2, we examined the characteristics of subtle versus prototypical expressions by subjecting them to a software classifier, which used prototypical basic emotion criteria. Although the software was highly successful at classifying prototypical expressions, it performed very poorly at classifying the subtle expressions. Further validation was obtained from human expert face coders: Subtle stimuli did not contain many of the key facial movements present in prototypical expressions. Together, these findings suggest that emotions may be successfully conveyed to human viewers using subtle nonprototypical expressions. Although classic prototypical facial expressions are well recognized, they appear less naturalistic and may not capture the richness of everyday emotional communication. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  9. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery

    PubMed Central

    Thanh Noi, Phan; Kappas, Martin

    2017-01-01

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909

  10. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery.

    PubMed

    Thanh Noi, Phan; Kappas, Martin

    2017-12-22

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.

  11. Supervised novelty detection in brain tissue classification with an application to white matter hyperintensities

    NASA Astrophysics Data System (ADS)

    Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.

    2016-03-01

    Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from the abnormal behaviour that occurs in the case of a novelty. This is evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determining whether our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest points. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings, predominantly vessels, cerebral falx, brain mask errors, and choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
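
    The Z-score test described in this record can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a single trained class, Euclidean distance, the mean-distance variant of the measure, and hypothetical function names (mean_knn_distance, novelty_z_scores, upper_tail). The per-class bookkeeping and the median/q1 variants are omitted.

```python
import numpy as np
from math import erfc, sqrt

def mean_knn_distance(points, reference, k, exclude_self=False):
    """Mean Euclidean distance from each point to its k nearest reference points."""
    d = np.linalg.norm(points[:, None, :] - reference[None, :, :], axis=2)
    d.sort(axis=1)
    # When points and reference are the same set, column 0 is the zero
    # distance of each point to itself, so skip it.
    start = 1 if exclude_self else 0
    return d[:, start:start + k].mean(axis=1)

def novelty_z_scores(train, test, k=5):
    """Z-score of each test point's kNN distance against the training distribution."""
    # "Normal behaviour": kNN distances measured inside the training set (assumption).
    base = mean_knn_distance(train, train, k, exclude_self=True)
    mu, sigma = base.mean(), base.std()
    return (mean_knn_distance(test, train, k) - mu) / sigma

def upper_tail(z):
    """One-sided probability that a standard normal variable exceeds z."""
    return 0.5 * erfc(z / sqrt(2))
```

    Under a normal approximation, upper_tail(4.0) is about 3.2e-5, which is consistent with the chance probability of 0.000032 quoted for the Z-score >=4 threshold.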

  12. A 10-Gene Classifier for Indeterminate Thyroid Nodules: Development and Multicenter Accuracy Study

    PubMed Central

    González, Hernán E.; Martínez, José R.; Vargas-Salas, Sergio; Solar, Antonieta; Veliz, Loreto; Cruz, Francisco; Arias, Tatiana; Loyola, Soledad; Horvath, Eleonora; Tala, Hernán; Traipe, Eufrosina; Meneses, Manuel; Marín, Luis; Wohllk, Nelson; Diaz, René E.; Véliz, Jesús; Pineda, Pedro; Arroyo, Patricia; Mena, Natalia; Bracamonte, Milagros; Miranda, Giovanna; Bruce, Elsa

    2017-01-01

    Background: In most of the world, diagnostic surgery remains the most frequent approach for indeterminate thyroid cytology. Although several molecular tests are available for testing in centralized commercial laboratories in the United States, there are no available kits for local laboratory testing. The aim of this study was to develop a prototype in vitro diagnostic (IVD) gene classifier for the further characterization of nodules with an indeterminate thyroid cytology. Methods: In a first stage, the expression of 18 genes was determined by quantitative polymerase chain reaction (qPCR) in a broad histopathological spectrum of 114 fresh-tissue biopsies. Expression data were used to train several classifiers by supervised machine learning approaches. Classifiers were tested in an independent set of 139 samples. In a second stage, the best classifier was chosen as a model to develop a multiplexed-qPCR IVD prototype assay, which was tested in a prospective multicenter cohort of fine-needle aspiration biopsies. Results: In tissue biopsies, the best classifier, using only 10 genes, reached an optimal and consistent performance in the ninefold cross-validated testing set (sensitivity 93% and specificity 81%). In the multicenter cohort of fine-needle aspiration biopsy samples, the 10-gene signature, built into a multiplexed-qPCR IVD prototype, showed an area under the curve of 0.97, a positive predictive value of 78%, and a negative predictive value of 98%. By Bayes' theorem, the IVD prototype is expected to achieve a positive predictive value of 64–82% and a negative predictive value of 97–99% in patients with a cancer prevalence range of 20–40%. Conclusions: A new multiplexed-qPCR IVD prototype is reported that accurately classifies thyroid nodules and may provide a future solution suitable for local reference laboratory testing. PMID:28521616
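
    The Bayes' theorem step mentioned in this record, converting sensitivity and specificity into predictive values at a given prevalence, can be sketched as below. The function name ppv_npv is hypothetical. The record does not state the sensitivity and specificity of the fine-needle aspiration cohort, so this sketch is not expected to reproduce the quoted 64–82% and 97–99% ranges; any inputs are illustrative.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# Illustrative sweep over the 20-40% prevalence range named in the abstract,
# using the tissue-biopsy figures (93% / 81%) only as example inputs.
for prevalence in (0.20, 0.30, 0.40):
    ppv, npv = ppv_npv(0.93, 0.81, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

    The sweep shows the general pattern behind such statements: as prevalence rises, the positive predictive value increases and the negative predictive value decreases.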

  13. Rectangular Array Of Digital Processors For Planning Paths

    NASA Technical Reports Server (NTRS)

    Kemeny, Sabrina E.; Fossum, Eric R.; Nixon, Robert H.

    1993-01-01

    Prototype 24 x 25 rectangular array of asynchronous parallel digital processors rapidly finds best path across two-dimensional field, which could be patch of terrain traversed by robotic or military vehicle. Implemented as single-chip very-large-scale integrated circuit. Excepting processors on edges, each processor communicates with four nearest neighbors along paths representing travel to north, south, east, and west. Each processor contains delay generator in form of 8-bit ripple counter, preset to 1 of 256 possible values. Operation begins with choice of processor representing starting point. Transmits signals to nearest neighbor processors, which retransmits to other neighboring processors, and process repeats until signals propagated across entire field.
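
    In software, the signal propagation described in this record behaves like a shortest-path wavefront over a 4-connected grid. The sketch below is an assumption-laden analogue, not the chip's logic: it assumes the arrival time at a cell is the arrival time at the transmitting neighbor plus the receiving cell's own delay, and that the best path is read out by remembering which neighbor's signal arrived first. The names wavefront and trace_path are hypothetical.

```python
import heapq

def wavefront(delays, start):
    """Earliest signal arrival time at every cell of a 4-connected grid."""
    rows, cols = len(delays), len(delays[0])
    arrival = {start: 0}
    came_from = {}
    heap = [(0, start)]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > arrival[(r, c)]:
            continue  # stale entry: a faster signal already reached this cell
        for dr, dc in ((-1, 0), (1, 0), (0, 1), (0, -1)):  # north, south, east, west
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nt = t + delays[nr][nc]
                if nt < arrival.get((nr, nc), float("inf")):
                    arrival[(nr, nc)] = nt
                    came_from[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nt, (nr, nc)))
    return arrival, came_from

def trace_path(came_from, start, goal):
    """Follow first-arrival links back from goal to start."""
    path = [goal]
    while path[-1] != start:
        path.append(came_from[path[-1]])
    return path[::-1]
```

    With per-cell delays limited to 256 values, as in the 8-bit counters described, each entry of delays would be an integer from 0 to 255.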

  14. An Analysis of Document Category Prediction Responses to Classifier Model Parameter Treatment Permutations within the Software Design Patterns Subject Domain

    ERIC Educational Resources Information Center

    Pankau, Brian L.

    2009-01-01

    This empirical study evaluates the document category prediction effectiveness of Naive Bayes (NB) and K-Nearest Neighbor (KNN) classifier treatments built from different feature selection and machine learning settings and trained and tested against textual corpora of 2300 Gang-Of-Four (GOF) design pattern documents. Analysis of the experiment's…

  15. Nation-Building Modeling and Resource Allocation Via Dynamic Programming

    DTIC Science & Technology

    2014-09-01

    Figure 2. RAND Study Models[59:98,115] (WMA) and used both the k-Nearest Neighbor (KNN) and Nearest Centroid (NC) algorithms to classify future features...The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The methodology was adopted for use under the...analysis feature of the model. 3.7.1 The No Surge Alternative. On the 10th of January 2007, President George W. Bush delivered a speech to the American

  16. Analysis of miRNA expression profile based on SVM algorithm

    NASA Astrophysics Data System (ADS)

    Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian

    2018-05-01

    Based on a miRNA expression profile data set, a new data mining algorithm, tSVM-KNN (t statistic with support vector machine and k-nearest neighbor), is proposed. The idea of the algorithm is as follows: first, feature selection is carried out on the data set using a unified measurement method; second, the SVM-KNN algorithm, which combines a support vector machine (SVM) with k-nearest neighbor (KNN), is used as the classifier. Simulation results show that the SVM-KNN algorithm has better classification ability than SVM or KNN alone. In terms of the number of miRNA "tags" and recognition accuracy, the tSVM-KNN algorithm needs only 5 miRNAs to obtain 96.08% classification accuracy. Compared with similar algorithms, the tSVM-KNN algorithm has obvious advantages.

  17. Minimum Expected Risk Estimation for Near-neighbor Classification

    DTIC Science & Technology

    2006-04-01

    We consider the problems of class probability estimation and classification when using near-neighbor classifiers, such as k-nearest neighbors ( kNN ...estimate for weighted kNN classifiers with different prior information, for a broad class of risk functions. Theory and simulations show how significant...the difference is compared to the standard maximum likelihood weighted kNN estimates. Comparisons are made with uniform weights, symmetric weights

  18. A Proposed Methodology to Classify Frontier Capital Markets

    DTIC Science & Technology

    2011-07-31

    but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project involves basic...machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ), ensemble...Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F. Horizontal

  19. A Proposed Methodology to Classify Frontier Capital Markets

    DTIC Science & Technology

    2011-07-31

    out of charity, but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project...identification, and machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ...Support Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F

  20. Prototype-Incorporated Emotional Neural Network.

    PubMed

    Oyedotun, Oyebade K; Khashman, Adnan

    2017-08-15

    Artificial neural networks (ANNs) aim to simulate the biological neural activities. Interestingly, many "engineering" prospects in ANN have relied on motivations from cognition and psychology studies. So far, two important learning theories that have been subject of active research are the prototype and adaptive learning theories. The learning rules employed for ANNs can be related to adaptive learning theory, where several examples of the different classes in a task are supplied to the network for adjusting internal parameters. Conversely, the prototype-learning theory uses prototypes (representative examples); usually, one prototype per class of the different classes contained in the task. These prototypes are supplied for systematic matching with new examples so that class association can be achieved. In this paper, we propose and implement a novel neural network algorithm based on modifying the emotional neural network (EmNN) model to unify the prototype- and adaptive-learning theories. We refer to our new model as "prototype-incorporated EmNN". Furthermore, we apply the proposed model to two real-life challenging tasks, namely, static hand-gesture recognition and face recognition, and compare the result to those obtained using the popular back-propagation neural network (BPNN), emotional BPNN (EmNN), deep networks, an exemplar classification model, and k-nearest neighbor.

  1. Understanding the Instruments of National Power through a System of Differential Equations in a Counterinsurgency

    DTIC Science & Technology

    2012-03-01

    WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid 27 (a) Coalition and Regional (b) Indigenous Figure 3. RAND Study Models[32:98,115...NC) algorithms to classify future features. The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The...the model. 4.2.1 No Surge. On the 10th of January 2007, President George W. Bush delivered a speech to the American Public outlining a new strategy in

  2. Integrating instance selection, instance weighting, and feature weighting for nearest neighbor classifiers by coevolutionary algorithms.

    PubMed

    Derrac, Joaquín; Triguero, Isaac; Garcia, Salvador; Herrera, Francisco

    2012-10-01

    Cooperative coevolution is a successful trend of evolutionary computation which allows us to define partitions of the domain of a given problem, or to integrate several related techniques into one, by the use of evolutionary algorithms. It is possible to apply it to the development of advanced classification methods, which integrate several machine learning techniques into a single proposal. A novel approach integrating instance selection, instance weighting, and feature weighting into the framework of a coevolutionary model is presented in this paper. We compare it with a wide range of evolutionary and nonevolutionary related methods, in order to show the benefits of the employment of coevolution to apply the techniques considered simultaneously. The results obtained, contrasted through nonparametric statistical tests, show that our proposal outperforms other methods in the comparison, thus becoming a suitable tool in the task of enhancing the nearest neighbor classifier.

  3. Predicting Flavonoid UGT Regioselectivity

    PubMed Central

    Jackson, Rhydon; Knisley, Debra; McIntosh, Cecilia; Pfeiffer, Phillip

    2011-01-01

    Machine learning was applied to a challenging and biologically significant protein classification problem: the prediction of flavonoid UGT acceptor regioselectivity from primary sequence. Novel indices characterizing graphical models of residues were proposed and found to be widely distributed among existing amino acid indices and to cluster residues appropriately. UGT subsequences biochemically linked to regioselectivity were modeled as sets of index sequences. Several learning techniques incorporating these UGT models were compared with classifications based on standard sequence alignment scores. These techniques included an application of time series distance functions to protein classification. Time series distances defined on the index sequences were used in nearest neighbor and support vector machine classifiers. Additionally, Bayesian neural network classifiers were applied to the index sequences. The experiments identified improvements over the nearest neighbor and support vector machine classifications relying on standard alignment similarity scores, as well as strong correlations between specific subsequences and regioselectivities. PMID:21747849

  4. The Use of Fuzzy Set Classification for Pattern Recognition of the Polygraph

    DTIC Science & Technology

    1993-12-01

    actual feature extraction was done, It was decided to use the K-nearest neighbor ( KNN ) the data was preprocessed. The electrocardiogram classifier in...showing heart pulse, and a low frequency not known beforehand, and the KNN classifier does not component showing blood volume. The derivative of...the characteristics of the conventional KNN these six derived signals were detrended and filtered, classification method is that it assigns each

  5. A Novel Locally Linear KNN Method With Applications to Visual Recognition.

    PubMed

    Liu, Qingfeng; Liu, Chengjun

    2017-09-01

    A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.

  6. Comparison of four approaches to a rock facies classification problem

    USGS Publications Warehouse

    Dubois, M.K.; Bohling, Geoffrey C.; Chakrabarti, S.

    2007-01-01

    In this study, seven classifiers based on four different approaches were tested in a rock facies classification problem: classical parametric methods using Bayes' rule, and non-parametric methods using fuzzy logic, k-nearest neighbor, and feed forward-back propagating artificial neural network. Determining the most effective classifier for geologic facies prediction in wells without cores in the Panoma gas field, in Southwest Kansas, was the objective. Study data include 3600 samples with known rock facies class (from core) with each sample having either four or five measured properties (wire-line log curves), and two derived geologic properties (geologic constraining variables). The sample set was divided into two subsets, one for training and one for testing the ability of the trained classifier to correctly assign classes. Artificial neural networks clearly outperformed all other classifiers and are effective tools for this particular classification problem. Classical parametric models were inadequate due to the nature of the predictor variables (high dimensional and not linearly correlated), and feature space of the classes (overlapping). The other non-parametric methods tested, k-nearest neighbor and fuzzy logic, would need considerable improvement to match the neural network effectiveness, but further work, possibly combining certain aspects of the three non-parametric methods, may be justified. © 2006 Elsevier Ltd. All rights reserved.

  7. Nearest Neighbor Algorithms for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Barrios, J. O.

    1972-01-01

    A solution of the discrimination problem is considered by means of the minimum distance classifier, commonly referred to as the nearest neighbor (NN) rule. The NN rule is nonparametric, or distribution free, in the sense that it does not depend on any assumptions about the underlying statistics for its application. The k-NN rule is a procedure that assigns an observation vector z to a category F if most of the k nearby observations x_i are elements of F. The condensed nearest neighbor (CNN) rule may be used to reduce the size of the training set required for categorization. The Bayes risk serves merely as a reference, the limit of excellence beyond which it is not possible to go. The risk of the NN rule is bounded below by the Bayes risk and above by twice the Bayes risk.
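
    The k-NN rule stated in this record, which assigns z to the category held by most of its k nearest observations, can be sketched in a few lines. The function name knn_classify, the Euclidean distance, and the tie-breaking behaviour are assumptions of this illustration; the record specifies none of them.

```python
import math
from collections import Counter

def knn_classify(z, samples, labels, k=3):
    """Assign z the category held by most of its k nearest training samples."""
    # Rank training samples by Euclidean distance to z (assumed metric).
    order = sorted(range(len(samples)), key=lambda i: math.dist(z, samples[i]))
    votes = Counter(labels[i] for i in order[:k])
    # On a tied vote, Counter keeps insertion order, so the category of the
    # nearer sample wins (an arbitrary choice made for this sketch).
    return votes.most_common(1)[0][0]

samples = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "B", "B", "B"]
print(knn_classify((0.2, 0.2), samples, labels, k=3))
```

    Setting k=1 gives the plain NN rule, whose asymptotic risk is the quantity bounded between the Bayes risk and twice the Bayes risk.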

  8. Initial results from a video-laser rangefinder device

    Treesearch

    Neil A. Clark

    2000-01-01

    Three hundred and nine width measurements at various heights to 10 m on a metal light pole were calculated from video images captured with a prototype video-laser rangefinder instrument. Data were captured at distances from 6 to 15 m. The endpoints for the width measurements were manually selected to the nearest pixel from individual video frames. Chi-square...

  9. A Mobile GPS Application: Mosque Tracking with Prayer Time Synchronization

    NASA Astrophysics Data System (ADS)

    Hashim, Rathiah; Ikhmatiar, Mohammad Sibghotulloh; Surip, Miswan; Karmin, Masiri; Herawan, Tutut

    Global Positioning System (GPS) is a popular technology applied in many areas and embedded in many devices, facilitating end-users to navigate effectively to user's intended destination via the best calculated route. The ability of GPS to track precisely according to coordinates of specific locations can be utilized to assist a Muslim traveler visiting or passing an unfamiliar place to find the nearest mosque in order to perform his prayer. However, not many techniques have been proposed for Mosque tracking. This paper presents the development of GPS technology in tracking the nearest mosque using mobile application software embedded with the prayer time's synchronization system on a mobile application. The prototype GPS system developed has been successfully incorporated with a map and several mosque locations.

  10. Jastrow-like ground states for quantum many-body potentials with near-neighbors interactions

    NASA Astrophysics Data System (ADS)

    Baradaran, Marzieh; Carrasco, José A.; Finkel, Federico; González-López, Artemio

    2018-01-01

    We completely solve the problem of classifying all one-dimensional quantum potentials with nearest- and next-to-nearest-neighbors interactions whose ground state is Jastrow-like, i.e., of Jastrow type but depending only on differences of consecutive particles. In particular, we show that these models must necessarily contain a three-body interaction term, as was the case with all previously known examples. We discuss several particular instances of the general solution, including a new hyperbolic potential and a model with elliptic interactions which reduces to the known rational and trigonometric ones in appropriate limits.

  11. Learning with imperfectly labeled patterns

    NASA Technical Reports Server (NTRS)

    Chittineni, C. B.

    1979-01-01

    The problem of learning in pattern recognition using imperfectly labeled patterns is considered. The performance of the Bayes and nearest neighbor classifiers with imperfect labels is discussed using a probabilistic model for the mislabeling of the training patterns. Schemes for training the classifier using both parametric and non parametric techniques are presented. Methods for the correction of imperfect labels were developed. To gain an understanding of the learning process, expressions are derived for success probability as a function of training time for a one dimensional increment error correction classifier with imperfect labels. Feature selection with imperfectly labeled patterns is described.

  12. Thermography based diagnosis of ruptured anterior cruciate ligament (ACL) in canines

    NASA Astrophysics Data System (ADS)

    Lama, Norsang; Umbaugh, Scott E.; Mishra, Deependra; Dahal, Rohini; Marino, Dominic J.; Sackman, Joseph

    2016-09-01

    Anterior cruciate ligament (ACL) rupture in canines is a common orthopedic injury in veterinary medicine. Veterinarians use both imaging and non-imaging methods to diagnose the disease. Common imaging methods such as radiography, computed tomography (CT scan) and magnetic resonance imaging (MRI) have some disadvantages: they are expensive to set up, can involve a high dose of radiation, and are time-consuming. In this paper, we present an alternative diagnostic method based on feature extraction and pattern classification (FEPC) to diagnose abnormal patterns in ACL thermograms. The proposed method was tested on a total of 30 thermograms for each camera view (anterior, lateral and posterior), including 14 disease and 16 non-disease cases provided by Long Island Veterinary Specialists. The normal and abnormal patterns in thermograms are analyzed in two steps: feature extraction and pattern classification. Texture features based on gray level co-occurrence matrices (GLCM), histogram features and spectral features are extracted from the color normalized thermograms, and the computed feature vectors are applied to a Nearest Neighbor (NN) classifier, a K-Nearest Neighbor (KNN) classifier and a Support Vector Machine (SVM) classifier with the leave-one-out validation method. The algorithm gives the best classification success rate of 86.67%, with a sensitivity of 85.71% and a specificity of 87.5%, in ACL rupture detection using the NN classifier for the lateral view and the Norm-RGB-Lum color normalization method. Our results show that the proposed method has the potential to detect ACL rupture in canines.
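
    The three figures reported in this record follow from a single confusion matrix, which the sketch below makes explicit. The counts (12 of 14 disease cases and 14 of 16 non-disease cases classified correctly) are not stated in the abstract; they are inferred here as the only whole-number counts consistent with the reported percentages. The function name diagnostic_metrics is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from confusion counts."""
    sensitivity = tp / (tp + fn)   # fraction of disease cases detected
    specificity = tn / (tn + fp)   # fraction of non-disease cases cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Inferred counts: 14 disease cases (12 detected), 16 non-disease cases (14 cleared).
sens, spec, acc = diagnostic_metrics(tp=12, fn=2, tn=14, fp=2)
print(f"sensitivity={sens:.2%}  specificity={spec:.2%}  accuracy={acc:.2%}")
# prints "sensitivity=85.71%  specificity=87.50%  accuracy=86.67%"
```

    With only 30 cases per view, each misclassified thermogram shifts the accuracy by about 3.3 percentage points, which is worth keeping in mind when comparing classifiers on this data.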

  13. Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.

    PubMed

    Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F

    2018-03-01

    Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.

  14. Large margin nearest neighbor classifiers.

    PubMed

    Domeniconi, Carlotta; Gunopulos, Dimitrios; Peng, Jing

    2005-07-01

    The nearest neighbor technique is a simple and appealing approach to addressing classification problems. It relies on the assumption of locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with a finite number of examples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. The employment of a locally adaptive metric becomes crucial in order to keep class conditional probabilities close to uniform, thereby minimizing the bias of estimates. We propose a technique that computes a locally flexible metric by means of support vector machines (SVMs). The decision function constructed by SVMs is used to determine the most discriminant direction in a neighborhood around the query. Such a direction provides a local feature weighting scheme. We formally show that our method increases the margin in the weighted space where classification takes place. Moreover, our method has the important advantage of online computational efficiency over competing locally adaptive techniques for nearest neighbor classification. We demonstrate the efficacy of our method using both real and simulated data.

  15. Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier

    PubMed Central

    Allen, Victoria W; Shirasu-Hiza, Mimi

    2018-01-01

    Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual control model in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila. PMID:29485401

  16. Brain tissue segmentation in 4D CT using voxel classification

    NASA Astrophysics Data System (ADS)

    van den Boom, R.; Oei, M. T. H.; Lafebre, S.; Oostveen, L. J.; Meijer, F. J. A.; Steens, S. C. A.; Prokop, M.; van Ginneken, B.; Manniesing, R.

    2012-02-01

    A method is proposed to segment anatomical regions of the brain from 4D computed tomography (CT) patient data. The method consists of a three-step voxel classification scheme, each step focusing on structures that are increasingly difficult to segment. The first step classifies air and bone, the second step classifies vessels and the third step classifies white matter, gray matter and cerebrospinal fluid. The time-averaged intensity value and the temporal intensity change value were used as features. In each step, a k-Nearest-Neighbor classifier was used to classify the voxels. Training data was obtained by placing regions of interest in reconstructed 3D image data. The method has been applied to ten cerebral 4D CT patient data sets. A leave-one-out experiment showed consistent and accurate segmentation results.

  17. Machine Learning in Intrusion Detection

    DTIC Science & Technology

    2005-07-01

    machine learning tasks. Anomaly detection provides the core technology for a broad spectrum of security-centric applications. In this dissertation, we examine various aspects of anomaly based intrusion detection in computer security. First, we present a new approach to learn program behavior for intrusion detection. Text categorization techniques are adopted to convert each process to a vector and calculate the similarity between two program activities. Then the k-nearest neighbor classifier is employed to classify program behavior as normal or intrusive. We demonstrate
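
    The pipeline described in this record (turn each process into a vector, measure similarity between program activities, classify with k-nearest neighbors) can be sketched roughly as follows. This is a minimal illustration under assumptions that the excerpt does not confirm: processes are represented as bags of system-call counts, similarity is cosine similarity, and the call names, labels and function names are made up.

```python
import math
from collections import Counter


def to_vector(calls):
    """Represent a process as a bag of system-call counts (assumed encoding)."""
    return Counter(calls)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[key] for key, count in u.items() if key in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def knn_label(query, training, k=3):
    """Majority label among the k training processes most similar to the query."""
    q = to_vector(query)
    ranked = sorted(training, key=lambda item: cosine(q, to_vector(item[0])), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy traces with made-up system-call names and labels.
training = [
    (["open", "read", "read", "close"], "normal"),
    (["open", "read", "write", "close"], "normal"),
    (["open", "read", "close"], "normal"),
    (["exec", "chmod", "exec", "socket"], "intrusive"),
    (["socket", "exec", "chmod"], "intrusive"),
]
print(knn_label(["open", "read", "read", "read", "close"], training))
```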

  18. Detecting the Difficulty Level of Foreign Language Texts

    DTIC Science & Technology

    2010-02-01

    continuous tenses), as well as part-of-speech labels for words. The authors used a k-Nearest Neighbor (kNN) classifier (Cover and Hart, 1967; Mitchell, 1997...anticipate, and influence these situations and to operate in them is found in foreign language speech and text. For this reason, military linguists are...the language model system, LGR is the prediction of one of the grammar-based classifiers, and CkNN is a confidence value of the kNN prediction for the

  19. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    NASA Astrophysics Data System (ADS)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish captured by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on extracted gonads of female and male Mackerel (fresh and thawed) to obtain differences between the sexes. Several linear and nonlinear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect the sample proximities accurately. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve classifiers based on a single dissimilarity.
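
    The idea of combining classifiers that each use a different dissimilarity can be illustrated with a small voting ensemble. This is a sketch under stated assumptions, not the authors' algorithm: the abstract does not name the dissimilarities or the combination rule, so three common distances and a plain majority vote are used here, and the colour values and labels are made up.

```python
import math
from collections import Counter

# Assumed set of dissimilarities; the abstract does not list the ones actually used.
DISSIMILARITIES = {
    "euclidean": lambda a, b: math.dist(a, b),
    "manhattan": lambda a, b: sum(abs(x - y) for x, y in zip(a, b)),
    "chebyshev": lambda a, b: max(abs(x - y) for x, y in zip(a, b)),
}


def knn(train, query, dissimilarity, k):
    """k-NN prediction under one given dissimilarity function."""
    nearest = sorted(train, key=lambda item: dissimilarity(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_predict(train, query, k=3):
    """Majority vote over one k-NN classifier per dissimilarity."""
    votes = Counter(knn(train, query, d, k) for d in DISSIMILARITIES.values())
    return votes.most_common(1)[0][0]


# Made-up two-dimensional colour measurements.
train = [
    ((52.0, 18.0), "female"), ((50.0, 20.0), "female"), ((51.0, 19.0), "female"),
    ((40.0, 8.0), "male"), ((42.0, 10.0), "male"), ((41.0, 9.0), "male"),
]
print(ensemble_predict(train, (50.5, 19.0)))
```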

  20. ESONET LIDO Demonstration Mission: the Iberian Margin node.

    NASA Astrophysics Data System (ADS)

    Embriaco, Davide; André, Michel; Zitellini, Nevio; Esonet Lido Demonstration Mission Team

    2010-05-01

    The Gulf of Cadiz is one of the two test sites chosen for the demonstration of the ESONET - LIDO Demonstration Mission (DM) [1], which will establish a first nucleus of a regional network of multidisciplinary sea floor observatories. The Gulf of Cadiz is a highly populated area, characterized by tsunamigenic sources, which caused the devastating earthquake and tsunamis that struck Lisbon in 1755. The seismic activity is concentrated along a belt going from this region to the Azores and the main tsunamigenic tectonic sources are located near the coastline. In the framework of the EU - NEAREST project [2] the GEOSTAR deep ocean bottom multi-parametric observatory was deployed for a one year mission off cape Saint Vincent at about 3200 m depth. GEOSTAR was equipped with a set of oceanographic, seismic and geophysical sensors and with a new tsunami detector prototype. In November 2009 the GEOSTAR abyssal station equipped with the tsunami prototype was redeployed at the same site on behalf of NEAREST and ESONET - LIDO DM. The system is able to communicate from the ocean bottom to the land station via an acoustic and satellite link. The abyssal station is designed both for long term geophysical and oceanographic observation and for tsunami early warning purposes. The tsunami detection is performed by two different algorithms: a new real time dedicated tsunami detection algorithm which analyses the water pressure data, and a seismic algorithm which triggers on strong events. Examples of geophysical and oceanographic data acquired by the abyssal station during the one year mission will be shown. The development of a new acoustic antenna equipped with a stand alone and autonomous acquisition system will allow the recording of marine mammals and the evaluation of environmental noise. References [1] M. André and The ESONET LIDO Demonstration Mission Team, "Listening to the deep-ocean environment: an ESONET initiative for the real-time monitoring of geohazards and marine ambient noise", EGU General Assembly, Vienna 2-7 May 2010 [2] EU - NEAREST Project web site: http://nearest.bo.ismar.cnr.it/

  1. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography

    PubMed Central

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008

  2. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography.

    PubMed

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present.

  3. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. They often contain a large amount of expression, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
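
    One common way to express k-nearest-neighbor classification in MapReduce terms is sketched below. It is an assumption about how such a classifier could be organised, not the mrKNN design from the paper, which the abstract does not detail. Each map call finds the k nearest candidates inside its own data split; the reduce step merges those candidates and votes among the global k nearest. The function names, toy expression values and labels are hypothetical, and everything runs in one process rather than on Hadoop.

```python
import heapq
import math
from collections import Counter


def map_partition(partition, query, k):
    """Map step: the k nearest (distance, label) pairs within one data split."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def reduce_neighbors(local_results, k):
    """Reduce step: merge per-split candidates, then vote among the global k nearest."""
    merged = heapq.nsmallest(k, (pair for result in local_results for pair in result))
    votes = Counter(label for _, label in merged)
    return votes.most_common(1)[0][0]


def mr_knn(partitions, query, k=3):
    # On a cluster each map call would run on a different node; here they run in a loop.
    return reduce_neighbors([map_partition(p, query, k) for p in partitions], k)


# Made-up two-gene expression profiles split across two partitions.
partitions = [
    [((1.0, 1.2), "ALL"), ((0.9, 1.0), "ALL")],
    [((3.0, 3.1), "AML"), ((1.1, 0.9), "ALL"), ((2.9, 3.3), "AML")],
]
print(mr_knn(partitions, (1.0, 1.0)))
```

    The result equals that of an ordinary k-NN over the full data, because every one of the global k nearest samples is necessarily among the k nearest of its own split.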

  4. Efficient Fingercode Classification

    NASA Astrophysics Data System (ADS)

    Sun, Hong-Wei; Law, Kwok-Yan; Gollmann, Dieter; Chung, Siu-Leung; Li, Jian-Bin; Sun, Jia-Guang

    In this paper, we present an efficient fingerprint classification algorithm which is an essential component in many critical security application systems e. g. systems in the e-government and e-finance domains. Fingerprint identification is one of the most important security requirements in homeland security systems such as personnel screening and anti-money laundering. The problem of fingerprint identification involves searching (matching) the fingerprint of a person against each of the fingerprints of all registered persons. To enhance performance and reliability, a common approach is to reduce the search space by firstly classifying the fingerprints and then performing the search in the respective class. Jain et al. proposed a fingerprint classification algorithm based on a two-stage classifier, which uses a K-nearest neighbor classifier in its first stage. The fingerprint classification algorithm is based on the fingercode representation which is an encoding of fingerprints that has been demonstrated to be an effective fingerprint biometric scheme because of its ability to capture both local and global details in a fingerprint image. We enhance this approach by improving the efficiency of the K-nearest neighbor classifier for fingercode-based fingerprint classification. Our research firstly investigates the various fast search algorithms in vector quantization (VQ) and the potential application in fingerprint classification, and then proposes two efficient algorithms based on the pyramid-based search algorithms in VQ. Experimental results on DB1 of FVC 2004 demonstrate that our algorithms can outperform the full search algorithm and the original pyramid-based search algorithms in terms of computational efficiency without sacrificing accuracy.

  5. Diagnosis of Tempromandibular Disorders Using Local Binary Patterns

    PubMed Central

    Haghnegahdar, A.A.; Kolahi, S.; Khojastepour, L.; Tajeripour, F.

    2018-01-01

    Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cut prepared from each condyle, although images were limited to head of mandibular condyle. In order to extract features of images, first we use LBP and then histogram of oriented gradients. To reduce dimensionality, the linear algebra Singular Value Decomposition (SVD) is applied to the feature vectors matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. Results: K nearest neighbor classifier achieves a very good accuracy (0.9242), moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, when other classifiers have lower accuracy, sensitivity and specificity. Conclusion: We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN has been the best classifier for our experiments in detecting patients from healthy individuals, by 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages. PMID:29732343

  6. Local Subspace Classifier with Transform-Invariance for Image Classification

    NASA Astrophysics Data System (ADS)

    Hotta, Seiji

    A family of linear subspace classifiers called local subspace classifier (LSC) outperforms the k-nearest neighbor rule (kNN) and conventional subspace classifiers in handwritten digit classification. However, LSC suffers very high sensitivity to image transformations because it uses projection and the Euclidean distances for classification. In this paper, I present a combination of a local subspace classifier (LSC) and a tangent distance (TD) for improving accuracy of handwritten digit recognition. In this classification rule, we can deal with transform-invariance easily because we are able to use tangent vectors for approximation of transformations. However, we cannot use tangent vectors in other type of images such as color images. Hence, kernel LSC (KLSC) is proposed for incorporating transform-invariance into LSC via kernel mapping. The performance of the proposed methods is verified with the experiments on handwritten digit and color image classification.

  7. Chapter 6 - Developing the LANDFIRE Vegetation and Biophysical Settings Map Unit Classifications for the LANDFIRE Prototype Project

    Treesearch

    Jennifer L. Long; Melanie Miller; James P. Menakis; Robert E. Keane

    2006-01-01

    The Landscape Fire and Resource Management Planning Tools Prototype Project, or LANDFIRE Prototype Project, required a system for classifying vegetation composition, biophysical settings, and vegetation structure to facilitate the mapping of vegetation and wildland fuel characteristics and the simulation of vegetation dynamics using landscape modeling. We developed...

  8. Realization of the axial next-nearest-neighbor Ising model in U3Al2Ge3

    DOE PAGES

    Fobes, David M.; Lin, Shi-Zeng; Ghimire, Nirmal J.; ...

    2017-11-09

    In this paper, we report small-angle neutron scattering (SANS) measurements and theoretical modeling of U3Al2Ge3. Analysis of the SANS data reveals a phase transition to sinusoidally modulated magnetic order at T_N = 63 K to be second order and a first-order phase transition to ferromagnetic order at T_c = 48 K. Within the sinusoidally modulated magnetic phase (T_c < T < T_N), we uncover a dramatic change, by a factor of 3, in the ordering wave vector as a function of temperature. Finally, these observations all indicate that U3Al2Ge3 is a close realization of the three-dimensional axial next-nearest-neighbor Ising model, a prototypical framework for describing commensurate to incommensurate phase transitions in frustrated magnets.

  9. Realization of the axial next-nearest-neighbor Ising model in U3Al2Ge3

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fobes, David M.; Lin, Shi-Zeng; Ghimire, Nirmal J.

    In this paper, we report small-angle neutron scattering (SANS) measurements and theoretical modeling of U3Al2Ge3. Analysis of the SANS data reveals a phase transition to sinusoidally modulated magnetic order at T_N = 63 K to be second order and a first-order phase transition to ferromagnetic order at T_c = 48 K. Within the sinusoidally modulated magnetic phase (T_c < T < T_N), we uncover a dramatic change, by a factor of 3, in the ordering wave vector as a function of temperature. Finally, these observations all indicate that U3Al2Ge3 is a close realization of the three-dimensional axial next-nearest-neighbor Ising model, a prototypical framework for describing commensurate to incommensurate phase transitions in frustrated magnets.

  10. Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2005-11-25

    The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen.

  11. Detecting epileptic seizure with different feature extracting strategies using robust machine learning classification techniques by applying advance parameter optimization approach.

    PubMed

    Hussain, Lal

    2018-06-01

    Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. Brain activity in patients suffering from seizures is monitored with an electroencephalogram (EEG) to detect epileptic seizures. EEG-based epilepsy detection requires feature extraction strategies. In this research, we extracted features using several strategies based on time- and frequency-domain characteristics, nonlinear measures, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers, considering multiple factors. The support vector machine kernels were evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we varied the distance metric, the neighbor weights and the number of neighbors. Similarly, for decision trees we tuned the parameters based on maximum splits and split criteria, and ensemble classifiers were evaluated based on different ensemble methods and learning rates. For training/testing, tenfold cross-validation was employed, and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis was performed using diverse feature extraction strategies and robust machine learning classifiers with more advanced optimal options. Support Vector Machine with a linear kernel and KNN with the city block distance metric gave the overall highest accuracy of 99.5%, which was higher than that obtained with the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, K-nearest neighbors with inverse squared distance weights gave higher performance at different numbers of neighbors. Moreover, in distinguishing postictal heart rate oscillations from epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
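
    Two of the KNN options named in this abstract, the city block distance metric and inverse squared distance weighting, can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names, the small `eps` guard against division by zero, and the toy feature values are assumptions.

```python
from collections import defaultdict


def city_block(a, b):
    """City block (Manhattan) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def weighted_knn(train, query, k=3, eps=1e-12):
    """k-NN vote in which each neighbor counts with weight 1 / distance^2."""
    neighbors = sorted(train, key=lambda item: city_block(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in neighbors:
        d = city_block(features, query)
        scores[label] += 1.0 / (d * d + eps)   # eps avoids division by zero
    return max(scores, key=scores.get)


# Made-up two-feature samples; one close "seizure" point outweighs two distant "normal" ones.
train = [((0.0, 0.0), "seizure"), ((2.0, 0.0), "normal"), ((0.0, 2.0), "normal")]
print(weighted_knn(train, (0.2, 0.1), k=3))
```

    In this toy case an unweighted majority vote over the same three neighbors would pick "normal", which shows how the weighting option can change a prediction.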

  12. Comparison of Neural Networks and Tabular Nearest Neighbor Encoding for Hyperspectral Signature Classification in Unresolved Object Detection

    NASA Astrophysics Data System (ADS)

    Schmalz, M.; Ritter, G.; Key, R.

    Accurate and computationally efficient spectral signature classification is a crucial step in the nonimaging detection and recognition of spaceborne objects. In classical hyperspectral recognition applications using linear mixing models, signature classification accuracy depends on accurate spectral endmember discrimination [1]. If the endmembers cannot be classified correctly, then the signatures cannot be classified correctly, and object recognition from hyperspectral data will be inaccurate. In practice, the number of endmembers accurately classified often depends linearly on the number of inputs. This can lead to potentially severe classification errors in the presence of noise or densely interleaved signatures. In this paper, we present a comparison of emerging technologies for nonimaging spectral signature classification based on a highly accurate, efficient search engine called Tabular Nearest Neighbor Encoding (TNE) [3,4] and a neural network technology called Morphological Neural Networks (MNNs) [5]. Based on prior results, TNE can optimize its classifier performance to track input nonergodicities, as well as yield measures of confidence or caution for evaluation of classification results. Unlike neural networks, TNE does not have a hidden intermediate data structure (e.g., the neural net weight matrix). Instead, TNE generates and exploits a user-accessible data structure called the agreement map (AM), which can be manipulated by Boolean logic operations to effect accurate classifier refinement algorithms. The open architecture and programmability of TNE's agreement map processing allows a TNE programmer or user to determine classification accuracy, as well as characterize in detail the signatures for which TNE did not obtain classification matches, and why such mis-matches occurred. 
In this study, we will compare TNE and MNN based endmember classification, using performance metrics such as probability of correct classification (Pd) and rate of false detections (Rfa). As proof of principle, we analyze classification of multiple closely spaced signatures from a NASA database of space material signatures. Additional analysis pertains to computational complexity and noise sensitivity, which are superior to Bayesian techniques based on classical neural networks. [1] Winter, M.E. "Fast autonomous spectral end-member determination in hyperspectral data," in Proceedings of the 13th International Conference On Applied Geologic Remote Sensing, Vancouver, B.C., Canada, pp. 337-44 (1999). [2] N. Keshava, "A survey of spectral unmixing algorithms," Lincoln Laboratory Journal 14:55-78 (2003). [3] Key, G., M.S. SCHMALZ, F.M. Caimi, and G.X. Ritter. "Performance analysis of tabular nearest neighbor encoding algorithm for joint compression and ATR", in Proceedings SPIE 3814:115-126 (1999). [4] Schmalz, M.S. and G. Key. "Algorithms for hyperspectral signature classification in unresolved object detection using tabular nearest neighbor encoding" in Proceedings of the 2007 AMOS Conference, Maui HI (2007). [5] Ritter, G.X., G. Urcid, and M.S. Schmalz. "Autonomous single-pass endmember approximation using lattice auto-associative memories", Neurocomputing (Elsevier), accepted (June 2008).

  13. Automatic classification of hyperactive children: comparing multiple artificial intelligence approaches.

    PubMed

    Delavarian, Mona; Towhidkhah, Farzad; Gharibzadeh, Shahriar; Dibajnia, Parvin

    2011-07-12

    Automatic classification of different behavioral disorders with many similarities (e.g. in symptoms) by using an automated approach will help psychiatrists to concentrate on the correct disorder and its treatment as soon as possible, to avoid wasting time on diagnosis, and to increase the accuracy of diagnosis. In this study, we tried to differentiate and classify (diagnose) 306 children with many similar symptoms and different behavioral disorders such as ADHD, depression, anxiety, comorbid depression and anxiety, and conduct disorder with high accuracy. Classification was based on the symptoms and their severity. After examining 16 different available classifiers using "Prtools", we propose the nearest mean classifier as the most accurate classifier, with 96.92% accuracy in this research. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
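
    A nearest mean classifier, the method this abstract found most accurate, keeps one prototype per class (the mean of that class's training samples) and assigns a new sample to the class with the closest prototype. The sketch below is a generic version and not the Prtools routine used in the study; the function names, the Euclidean distance and the toy severity scores are assumptions.

```python
import math
from collections import defaultdict


def fit_class_means(samples):
    """Compute one mean vector (prototype) per class label."""
    sums = {}
    counts = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: tuple(s / counts[label] for s in sums[label]) for label in sums}


def predict(means, features):
    """Assign the label whose class mean is closest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(means[label], features))


# Made-up symptom severity scores (two symptoms per child).
samples = [
    ((3.0, 0.0), "ADHD"), ((2.0, 1.0), "ADHD"),
    ((0.0, 3.0), "anxiety"), ((1.0, 2.0), "anxiety"),
]
means = fit_class_means(samples)
print(means)
print(predict(means, (3.0, 1.0)))
```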

  14. Credit scoring analysis using weighted k nearest neighbor

    NASA Astrophysics Data System (ADS)

    Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.

    2018-05-01

    Credit scoring is a quantitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether the applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience of customers with similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment by considering the use of some kernels. We use credit data from a private bank in Indonesia. The results show that the Gaussian kernel and the rectangular kernel perform better than the other kernels in terms of the percentage of correctly classified cases, which is 82.4% for both.
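
    Kernel-weighted k nearest neighbor voting with the two kernels mentioned in this abstract can be sketched as follows. This is one common formulation and an assumption about the details, which the abstract does not give: distances to the k neighbors are divided by the distance to the (k+1)-th neighbor, passed through the kernel, and summed per class. The function name, toy applicant features and labels are made up.

```python
import math
from collections import defaultdict

KERNELS = {
    # Rectangular kernel: equal weight for every neighbor inside the window.
    "rectangular": lambda u: 0.5 if abs(u) <= 1 else 0.0,
    # Gaussian kernel: weight decays smoothly with scaled distance.
    "gaussian": lambda u: math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi),
}


def wknn(train, query, k=3, kernel="gaussian"):
    """Weighted k-NN: each neighbor votes with kernel(scaled distance)."""
    ranked = sorted((math.dist(x, query), y) for x, y in train)
    neighbors = ranked[:k]
    # Scale by the (k+1)-th neighbor distance, or the farthest neighbor if none exists.
    ref = ranked[k][0] if len(ranked) > k else neighbors[-1][0]
    scale = ref if ref > 0 else 1.0
    scores = defaultdict(float)
    for dist, label in neighbors:
        scores[label] += KERNELS[kernel](dist / scale)
    return max(scores, key=scores.get)


# Made-up applicant features and credit labels.
train = [
    ((1.0, 1.0), "good"), ((1.2, 0.9), "good"),
    ((3.0, 3.0), "bad"), ((3.1, 2.8), "bad"), ((2.9, 3.2), "bad"),
]
for name in KERNELS:
    print(name, wknn(train, (1.1, 1.0), k=3, kernel=name))
```

    With the rectangular kernel every neighbor gets the same weight, so the method reduces to ordinary majority-vote k-NN.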

  15. Lowest cost, nearest term options for a manned Mars mission

    NASA Technical Reports Server (NTRS)

    Sauls, Bob; Mortensen, Michael; Myers, Renee; Guacci, Giovanni; Montes, Fred

    1992-01-01

    This study is part of a NASA/USRA Advanced Design Program project executed for the purpose of examining the requirements of a first manned Mars mission. The mission, classified as a split/sprint mission, has been designed for a crew of six with a total manned trip time of one year.

  16. Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.

    PubMed

    Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu

    2018-04-27

    Many studies have been done on emotion recognition based on multi-channel electroencephalogram (EEG) signals. This paper explores how the emotion recognition accuracy of EEG signals varies with the frequency band and the number of channels. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. Firstly, DEAP default preprocessed data were normalized. Next, EEG signals were divided into four frequency bands using discrete wavelet transform, and entropy and energy were calculated as features for a K-nearest neighbor classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases. The classification accuracy of the gamma frequency band is greater than that of the beta frequency band, followed by the alpha and theta frequency bands. This paper provides a reference on suitable frequency bands and channels for emotion recognition based on EEG.

  17. Fall Detection Using Smartphone Audio Features.

    PubMed

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers: k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN) are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.

  18. E-Nose Vapor Identification Based on Dempster-Shafer Fusion of Multiple Classifiers

    NASA Technical Reports Server (NTRS)

    Li, Winston; Leung, Henry; Kwan, Chiman; Linnell, Bruce R.

    2005-01-01

    Electronic nose (e-nose) vapor identification is an efficient approach to monitor air contaminants in space stations and shuttles in order to ensure the health and safety of astronauts. Data preprocessing (measurement denoising and feature extraction) and pattern classification are important components of an e-nose system. In this paper, a wavelet-based denoising method is applied to filter the noisy sensor measurements. Transient-state features are then extracted from the denoised sensor measurements, and are used to train multiple classifiers such as multi-layer perceptrons (MLP), support vector machines (SVM), k nearest neighbor (KNN), and Parzen classifier. The Dempster-Shafer (DS) technique is used at the end to fuse the results of the multiple classifiers to get the final classification. Experimental analysis based on real vapor data shows that the wavelet denoising method can remove both random noise and outliers successfully, and the classification rate can be improved by using classifier fusion.
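
    The fusion step can be illustrated with Dempster's rule of combination, sketched below for two sources. This is a generic sketch, not the paper's pipeline: how each classifier's output is turned into a mass function is not described in the record, so the mass values and the two vapor names are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions whose keys are frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + pa * pb
        else:
            # Disjoint hypotheses contribute to the conflict mass K.
            conflict += pa * pb
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    # Normalize by 1 - K so that the fused masses sum to one.
    return {h: p / (1.0 - conflict) for h, p in combined.items()}


A = frozenset({"ammonia"})   # hypothetical vapor class
E = frozenset({"ethanol"})   # hypothetical vapor class
EITHER = A | E               # mass left on "undecided"

classifier_1 = {A: 0.6, E: 0.1, EITHER: 0.3}
classifier_2 = {A: 0.5, E: 0.3, EITHER: 0.2}
fused = dempster_combine(classifier_1, classifier_2)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 4))
```

    The final decision here is simply the hypothesis with the largest fused mass; other decision rules (for example, based on belief or plausibility) are possible.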

  19. Comparing supervised learning methods for classifying sex, age, context and individual Mudi dogs from barking.

    PubMed

    Larrañaga, Ana; Bielza, Concha; Pongrácz, Péter; Faragó, Tamás; Bálint, Anna; Larrañaga, Pedro

    2015-03-01

    Barking is perhaps the most characteristic form of vocalization in dogs; however, very little is known about its role in the intraspecific communication of this species. Besides the obvious need for ethological research, both in the field and in the laboratory, the possible information content of barks can also be explored by computerized acoustic analyses. This study compares four different supervised learning methods (naive Bayes, classification trees, k-nearest neighbors and logistic regression) combined with three strategies for selecting variables (all variables, filter and wrapper feature subset selections) to classify Mudi dogs by sex, age, context and individual from their barks. The classification accuracy of the models obtained was estimated by means of k-fold cross-validation. Percentages of correct classifications were 85.13 % for determining sex, 80.25 % for predicting age (recodified as young, adult and old), 55.50 % for classifying contexts (seven situations) and 67.63 % for recognizing individuals (8 dogs), so the results are encouraging. The best-performing method was k-nearest neighbors following a wrapper feature selection approach. The results for classifying contexts and recognizing individual dogs were better with this method than they were for other approaches reported in the specialized literature. This is the first time that the sex and age of domestic dogs have been predicted with the help of sound analysis. This study shows that dog barks carry ample information regarding the caller's indexical features. Our computerized analysis provides indirect proof that barks may serve as an important source of information for dogs as well.

  20. The distance function effect on k-nearest neighbor classification for medical datasets.

    PubMed

    Hu, Li-Yu; Huang, Min-Wei; Ke, Shih-Wen; Tsai, Chih-Fong

    2016-01-01

    K-nearest neighbor (k-NN) classification is a conventional non-parametric classifier, which has been used as the baseline classifier in many pattern classification problems. It is based on measuring the distances between the test data and each of the training data to decide the final classification output. Because the Euclidean distance function is the most widely used distance metric in k-NN, no study has examined the classification performance of k-NN with different distance functions, especially for various medical domain problems. Therefore, the aim of this paper is to investigate whether the distance function can affect the k-NN performance over different medical datasets. Our experiments are based on three different types of medical datasets containing categorical, numerical, and mixed types of data, and four different distance functions (Euclidean, cosine, Chi square, and Minkowski) are each used during k-NN classification. The experimental results show that using the Chi square distance function is the best choice for the three different types of datasets. However, the cosine and Euclidean (and Minkowski) distance functions perform the worst over the mixed type of datasets. In this paper, we demonstrate that the chosen distance function can affect the classification accuracy of the k-NN classifier. For the medical domain datasets including the categorical, numerical, and mixed types of data, k-NN based on the Chi square distance function performs the best.
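
    For reference, the four distance functions compared in this record can be sketched as follows. The Chi square form shown is one common definition for non-negative features and is an assumption, since the record does not give the paper's exact formula; the Minkowski order p=3 is likewise only an illustrative default.

```python
from math import sqrt


def minkowski(x, y, p=3):
    """Minkowski distance of order p (p=1 is Manhattan, p=2 is Euclidean)."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def euclidean(x, y):
    """Euclidean distance, the special case p=2 of the Minkowski distance."""
    return minkowski(x, y, p=2)


def cosine_distance(x, y):
    """One minus the cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norms = sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y))
    return 1.0 - dot / norms


def chi_square(x, y):
    """Assumed form: sum of (a - b)^2 / (a + b) over non-negative features."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)


u, v = [1.0, 0.0], [0.0, 1.0]
for name, fn in [("euclidean", euclidean), ("cosine", cosine_distance),
                 ("chi_square", chi_square), ("minkowski", minkowski)]:
    print(name, fn(u, v))
```

    Any of these functions can replace the distance call inside a k-NN implementation, which is the only change needed to reproduce this kind of comparison.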

  1. The effect of class imbalance on case selection for case-based classifiers: An empirical study in the context of medical decision support

    PubMed Central

    Malof, Jordan M.; Mazurowski, Maciej A.; Tourassi, Georgia D.

    2013-01-01

    Case selection is a useful approach for increasing the efficiency and performance of case-based classifiers. Multiple techniques have been designed to perform case selection. This paper empirically investigates how class imbalance in the available set of training cases can impact the performance of the resulting classifier as well as properties of the selected set. In this study, the experiments are performed using a dataset for the problem of detecting breast masses in screening mammograms. The classification problem was binary and we used a k-nearest neighbor classifier. The classifier’s performance was evaluated using the Receiver Operating Characteristic (ROC) area under the curve (AUC) measure. The experimental results indicate that although class imbalance reduces the performance of the derived classifier and the effectiveness of selection at improving overall classifier performance, case selection can still be beneficial, regardless of the level of class imbalance. PMID:21820273

  2. The diabolo classifier

    PubMed

    Schwenk

    1998-11-15

    We present a new classification architecture based on autoassociative neural networks that are used to learn discriminant models of each class. The proposed architecture has several interesting properties with respect to other model-based classifiers like nearest-neighbors or radial basis functions: it has a low computational complexity and uses a compact distributed representation of the models. The classifier is also well suited for the incorporation of a priori knowledge by means of a problem-specific distance measure. In particular, we will show that tangent distance (Simard, Le Cun, & Denker, 1993) can be used to achieve transformation invariance during learning and recognition. We demonstrate the application of this classifier to optical character recognition, where it has achieved state-of-the-art results on several reference databases. Relations to other models, in particular those based on principal component analysis, are also discussed.

  3. Automatic tissue characterization from ultrasound imagery

    NASA Astrophysics Data System (ADS)

    Kadah, Yasser M.; Farag, Aly A.; Youssef, Abou-Bakr M.; Badawi, Ahmed M.

    1993-08-01

    In this work, feature extraction algorithms are proposed to extract the tissue characterization parameters from liver images. Then the resulting parameter set is further processed to obtain the minimum number of parameters representing the most discriminating pattern space for classification. This preprocessing step was applied to over 120 pathology-investigated cases to obtain the learning data for designing the classifier. The extracted features are divided into independent training and test sets and are used to construct both statistical and neural classifiers. The optimal criteria for these classifiers are set to have minimum error, ease of implementation and learning, and the flexibility for future modifications. Various algorithms for implementing various classification techniques are presented and tested on the data. The best performance was obtained using a single layer tensor model functional link network. Also, the voting k-nearest neighbor classifier provided comparably good diagnostic rates.

  4. Automatic morphological classification of galaxy images

    PubMed Central

    Shamir, Lior

    2009-01-01

    We describe an image analysis supervised learning algorithm that can automatically classify galaxy images. The algorithm is first trained using manually classified images of elliptical, spiral, and edge-on galaxies. A large set of image features is extracted from each image, and the most informative features are selected using Fisher scores. Test images can then be classified using a simple Weighted Nearest Neighbor rule such that the Fisher scores are used as the feature weights. Experimental results show that galaxy images from Galaxy Zoo can be classified automatically to spiral, elliptical and edge-on galaxies with accuracy of ~90% compared to classifications carried out by the author. Full compilable source code of the algorithm is available for free download, and its general-purpose nature makes it suitable for other uses that involve automatic image analysis of celestial objects. PMID:20161594
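
    The idea of using Fisher scores as feature weights in a nearest-neighbor rule can be sketched as below. The score formula (variance of the class means divided by the mean within-class variance) and the weighted squared distance are common textbook variants assumed here for illustration; they are not taken from the published source code, and the two-feature data are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pvariance

EPS = 1e-12  # keeps the ratio finite when a feature has no within-class spread


def fisher_scores(samples, labels):
    """Per-feature ratio of between-class to within-class variance."""
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)
    scores = []
    for f in range(len(samples[0])):
        overall = mean(x[f] for x in samples)
        between = mean(
            (mean(r[f] for r in rows) - overall) ** 2
            for rows in by_class.values()
        )
        within = mean(pvariance([r[f] for r in rows]) for rows in by_class.values())
        scores.append(between / (within + EPS))
    return scores


def weighted_nn_predict(samples, labels, weights, query):
    """Label of the training sample with the smallest weighted squared distance."""
    def wdist(x):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query))
    best = min(range(len(samples)), key=lambda i: wdist(samples[i]))
    return labels[best]


# Hypothetical data: feature 0 separates the classes, feature 1 is noise.
train_x = [[1.0, 10.0], [2.0, 0.0], [8.0, 9.0], [9.0, 1.0]]
train_y = ["spiral", "spiral", "elliptical", "elliptical"]
weights = fisher_scores(train_x, train_y)
query = [4.5, 9.0]
print(weighted_nn_predict(train_x, train_y, weights, query))
print(weighted_nn_predict(train_x, train_y, [1.0, 1.0], query))
```

    In this toy case the noisy feature pulls the unweighted rule toward the wrong class, while the Fisher weights suppress it.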

  5. Chaotic particle swarm optimization with mutation for classification.

    PubMed

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence into a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and it tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is evaluated on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, and particle swarm-classifier, genetic algorithm, and imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms.
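
    The four evaluation measures named in this record follow directly from the binary confusion matrix. The sketch below uses the standard definitions; the counts are hypothetical and are not results from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    total = tp + fp + tn + fn
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        # By convention MCC is reported as 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical confusion-matrix counts for illustration.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

    Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which makes it more informative on imbalanced datasets.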

  6. The Impact of Theoretical Orientation and Training on Preference for Diagnostic Models of Personality Pathology.

    PubMed

    Paggeot, Amy; Nelson, Sharon; Huprich, Steven

    2017-01-01

    The role of theoretical orientation in determining preference for different methods of diagnosis has been largely unexplored. The goal of the present study was to explore ratings of the usefulness of 4 diagnostic methods after applying them to a patient: prototype ratings derived from the SWAP-II, the DSM-5 Section III specific personality disorders, the DSM-5 Section III trait model, and prototype ratings derived from the Psychodynamic Diagnostic Manual (PDM). Three hundred and twenty-nine trainees in APA-accredited doctoral programs and internships rated one of their current patients with each of the 4 diagnostic methods. Individuals who classified their theoretical orientation as "cognitive-behavioral" displayed a significantly greater preference for the proposed DSM-5 personality disorder prototypes when compared to individuals who classified their orientation as "psychodynamic/psychoanalytic," while individuals who considered themselves psychodynamic or psychoanalytic rated the PDM as significantly more useful than those who considered themselves cognitive-behavioral. Individuals who classified their graduate program as a PsyD program were also more likely to rate the DSM-5 Section III and PDM models as more useful diagnostic methods than individuals who classified their graduate program as a PhD program. Implications and future directions will be discussed. © 2017 S. Karger AG, Basel.

  7. Local classifier weighting by quadratic programming.

    PubMed

    Cevikalp, Hakan; Polikar, Robi

    2008-10-01

    It has been widely accepted that the classification accuracy can be improved by combining outputs of multiple classifiers. However, how to combine multiple classifiers with various (potentially conflicting) decisions is still an open problem. A rich collection of classifier combination procedures -- many of which are heuristic in nature -- have been developed for this goal. In this brief, we describe a dynamic approach to combine classifiers that have expertise in different regions of the input space. To this end, we use local classifier accuracy estimates to weight classifier outputs. Specifically, we estimate local recognition accuracies of classifiers near a query sample by utilizing its nearest neighbors, and then use these estimates to find the best weights of classifiers to label the query. The problem is formulated as a convex quadratic optimization problem, which returns optimal nonnegative classifier weights with respect to the chosen objective function, and the weights ensure that locally most accurate classifiers are weighted more heavily for labeling the query sample. Experimental results on several data sets indicate that the proposed weighting scheme outperforms other popular classifier combination schemes, particularly on problems with complex decision boundaries. Hence, the results indicate that local classification-accuracy-based combination techniques are well suited for decision making when the classifiers are trained by focusing on different regions of the input space.

  8. Determination of authenticity of brand perfume using electronic nose prototypes

    NASA Astrophysics Data System (ADS)

    Gebicki, Jacek; Szulczynski, Bartosz; Kaminski, Marian

    2015-12-01

    The paper presents the practical application of an electronic nose technique for fast and efficient discrimination between authentic and fake perfume samples. Two self-built electronic nose prototypes equipped with a set of semiconductor sensors were employed for that purpose. Additionally 10 volunteers took part in the sensory analysis. The following perfumes and their fake counterparts were analysed: Dior—Fahrenheit, Eisenberg—J’ose, YSL—La nuit de L’homme, 7 Loewe and Spice Bomb. The investigations were carried out using the headspace of the aqueous solutions. Data analysis utilized multidimensional techniques: principal component analysis (PCA), linear discriminant analysis (LDA), k-nearest neighbour (k-NN). The results obtained confirmed the legitimacy of the electronic nose technique as an alternative to the sensory analysis as far as the determination of authenticity of perfume is concerned.

  9. Spectrum, symmetries, and dynamics of Heisenberg spin-1/2 chains

    NASA Astrophysics Data System (ADS)

    Joel, Kira; Kollmar, Davida; Santos, Lea

    2013-03-01

    Quantum spin chains are prototype quantum many-body systems. They are employed in the description of various complex physical phenomena. Here we provide an introduction to the subject by focusing on the time evolution of Heisenberg spin-1/2 chains with couplings between nearest-neighbor sites only. We study how the anisotropy parameter and the symmetries of the model affect its time evolution. Our predictions are based on the analysis of the eigenvalues and eigenstates of the system and then confirmed with actual numerical results.

  10. Comparative Analysis of Document level Text Classification Algorithms using R

    NASA Astrophysics Data System (ADS)

    Syamala, Maganti; Nalini, N. J., Dr; Maguluri, Lakshamanaphaneendra; Ragupathy, R., Dr.

    2017-08-01

    Over the past few decades, tremendous volumes of data have become available on the Internet in either structured or unstructured form. Information on the Internet is also growing exponentially, so there is an emerging need for text classifiers. Text mining is an interdisciplinary field which draws on information retrieval, data mining, machine learning, statistics and computational linguistics. To handle this situation, a wide range of supervised learning algorithms has been introduced. Among these, K-Nearest Neighbor (KNN) is an efficient and the simplest classifier in the text classification family. But KNN suffers from imbalanced class distribution and noisy term features. To cope with this challenge, we use document-based centroid dimensionality reduction (CentroidDR) using R programming. By combining these two text classification techniques, KNN and Centroid classifiers, we propose a scalable and effective flat classifier, called MCenKNN, which works substantially better than CenKNN.

  11. A Comparison of Rule-Based, K-Nearest Neighbor, and Neural Net Classifiers for Automated

    Treesearch

    Tai-Hoon Cho; Richard W. Conners; Philip A. Araman

    1991-01-01

    Over the last few years the authors have been involved in research aimed at developing a machine vision system for locating and identifying surface defects on materials. The particular problem being studied involves locating surface defects on hardwood lumber in a species independent manner. Obviously, the accurate location and identification of defects is of paramount...

  12. Autonomous target recognition using remotely sensed surface vibration measurements

    NASA Astrophysics Data System (ADS)

    Geurts, James; Ruck, Dennis W.; Rogers, Steven K.; Oxley, Mark E.; Barr, Dallas N.

    1993-09-01

    The remotely measured surface vibration signatures of tactical military ground vehicles are investigated for use in target classification and identification friend or foe (IFF) systems. The use of remote surface vibration sensing by a laser radar reduces the effects of partial occlusion, concealment, and camouflage experienced by automatic target recognition systems using traditional imagery in a tactical battlefield environment. Linear Predictive Coding (LPC) efficiently represents the vibration signatures and nearest neighbor classifiers exploit the LPC feature set using a variety of distortion metrics. Nearest neighbor classifiers achieve an 88 percent classification rate in an eight class problem, representing a classification performance increase of thirty percent from previous efforts. A novel confidence figure of merit is implemented to attain a 100 percent classification rate with less than 60 percent rejection. The high classification rates are achieved on a target set which would pose significant problems to traditional image-based recognition systems. The targets are presented to the sensor in a variety of aspects and engine speeds at a range of 1 kilometer. The classification rates achieved demonstrate the benefits of using remote vibration measurement in a ground IFF system. The signature modeling and classification system can also be used to identify rotary and fixed-wing targets.

  13. A discrete wavelet based feature extraction and hybrid classification technique for microarray data analysis.

    PubMed

    Bennet, Jaison; Ganaprakasam, Chilambuchelvan Arul; Arputharaj, Kannan

    2014-01-01

    In the past, cancer classification by doctors and radiologists was based on morphological and clinical features and had limited diagnostic ability. The recent arrival of DNA microarray technology has led to the concurrent monitoring of thousands of gene expressions in a single chip, which stimulates progress in cancer classification. In this paper, we have proposed a hybrid approach for microarray data classification based on nearest neighbor (KNN), naive Bayes, and support vector machine (SVM). Feature selection prior to classification plays a vital role, and a feature selection technique which combines discrete wavelet transform (DWT) and moving window technique (MWT) is used. The performance of the proposed method is compared with conventional classifiers such as support vector machine, nearest neighbor, and naive Bayes. Experiments have been conducted on both real and benchmark datasets, and the results indicate that the ensemble approach produces higher classification accuracy than conventional classifiers. This paper serves as an automated system for the classification of cancer and can be applied by doctors in real cases, which serves as a boon to the medical community. This work further reduces the misclassification of cancers, which is highly undesirable in cancer detection.

  14. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

    PubMed Central

    Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

    2017-01-01

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571

  15. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

    PubMed

    Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

    2017-02-14

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.

  16. Comparison of classification methods for voxel-based prediction of acute ischemic stroke outcome following intra-arterial intervention

    NASA Astrophysics Data System (ADS)

    Winder, Anthony J.; Siemonsen, Susanne; Flottmann, Fabian; Fiehler, Jens; Forkert, Nils D.

    2017-03-01

    Voxel-based tissue outcome prediction in acute ischemic stroke patients is highly relevant for both clinical routine and research. Previous research has shown that features extracted from baseline multi-parametric MRI datasets have a high predictive value and can be used for the training of classifiers, which can generate tissue outcome predictions for both intravenous and conservative treatments. However, with the recent advent and popularization of intra-arterial thrombectomy treatment, novel research specifically addressing the utility of predictive classifiers for thrombectomy intervention is necessary for a holistic understanding of current stroke treatment options. The aim of this work was to develop three clinically viable tissue outcome prediction models using approximate nearest-neighbor, generalized linear model, and random decision forest approaches and to evaluate the accuracy of predicting tissue outcome after intra-arterial treatment. Therefore, the three machine learning models were trained, evaluated, and compared using datasets of 42 acute ischemic stroke patients treated with intra-arterial thrombectomy. Classifier training utilized eight voxel-based features extracted from baseline MRI datasets and five global features. Evaluation of classifier-based predictions was performed via comparison to the known tissue outcome, which was determined in follow-up imaging, using the Dice coefficient and leave-one-patient-out cross validation. The random decision forest prediction model led to the best tissue outcome predictions with a mean Dice coefficient of 0.37. The approximate nearest-neighbor and generalized linear model performed equally suboptimally with average Dice coefficients of 0.28 and 0.27 respectively, suggesting that both non-linearity and machine learning are desirable properties of a classifier well-suited to the intra-arterial tissue outcome prediction problem.
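
    The Dice coefficient used for evaluation in this record measures the overlap between a predicted and a reference binary mask. A minimal sketch on flattened voxel masks follows; the masks are hypothetical, and returning 1.0 for two empty masks is a convention chosen here, not something stated in the record.

```python
def dice_coefficient(predicted, reference):
    """Dice = 2 * |P and R| / (|P| + |R|) for two equal-length binary masks."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Convention (assumption): two empty masks count as a perfect match.
    return 2.0 * overlap / total if total else 1.0


# Hypothetical flattened voxel masks: 1 = infarcted tissue, 0 = healthy tissue.
predicted_mask = [1, 1, 0, 0, 1]
follow_up_mask = [1, 0, 0, 1, 1]
print(round(dice_coefficient(predicted_mask, follow_up_mask), 4))
```

    A value of 1 means the two masks coincide and 0 means they do not overlap at all, which puts the reported means of 0.27 to 0.37 in context.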

  17. Classification of emotional states from electrocardiogram signals: a non-linear approach based on hurst

    PubMed Central

    2013-01-01

    Background: Identifying the emotional state is helpful in applications involving patients with autism and other intellectual disabilities, computer-based training, human-computer interaction, etc. Electrocardiogram (ECG) signals, being an activity of the autonomous nervous system (ANS), reflect the underlying true emotional state of a person. However, the performance of various methods developed so far lacks accuracy, and more robust methods need to be developed to identify the emotional pattern associated with ECG signals. Methods: Emotional ECG data was obtained from sixty participants by inducing the six basic emotional states (happiness, sadness, fear, disgust, surprise and neutral) using audio-visual stimuli. The non-linear feature ‘Hurst’ was computed using Rescaled Range Statistics (RRS) and Finite Variance Scaling (FVS) methods. New Hurst features were proposed by combining the existing RRS and FVS methods with Higher Order Statistics (HOS). The features were then classified using four classifiers – Bayesian Classifier, Regression Tree, K-nearest neighbor and Fuzzy K-nearest neighbor. Seventy percent of the features were used for training and thirty percent for testing the algorithm. Results: Analysis of Variance (ANOVA) conveyed that Hurst and the proposed features were statistically significant (p < 0.001). Hurst computed using RRS and FVS methods showed similar classification accuracy. The features obtained by combining FVS and HOS performed better, with a maximum accuracy of 92.87% and 76.45% for classifying the six emotional states using random and subject-independent validation respectively. Conclusions: The results indicate that the combination of non-linear analysis and HOS tends to capture the finer emotional changes that can be seen in healthy ECG data. This work can be further fine-tuned to develop a real-time system. PMID:23680041

  18. [Classification of Children with Attention-Deficit/Hyperactivity Disorder and Typically Developing Children Based on Electroencephalogram Principal Component Analysis and k-Nearest Neighbor].

    PubMed

    Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling

    2016-04-01

    This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using an electroencephalogram signal detection method. Firstly, in our experiments, we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop, and we completed electroencephalogram data preprocessing including filtering, segmentation, removal of artifacts and so on. Secondly, we selected a subset of electroencephalogram electrodes using the principal component analysis (PCA) method, and we collected the common channels of the optimal electrodes whose occurrence rates were more than 90% in each kind of stimulation. We then extracted the latency (200~450 ms) mean amplitude features of the common electrodes. Finally, we used the k-nearest neighbor (KNN) classifier based on Euclidean distance and the support vector machine (SVM) classifier based on a radial basis kernel function to classify. In the experiment, for the same kind of interference control task, the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction times. The N2 emerged in the prefrontal cortex while the P2 appeared in the inferior parietal area for all kinds of stimuli. Meanwhile, the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2 amplitudes compared to typically developing children. KNN resulted in better classification accuracy than the SVM classifier, and the best classification rate was 89.29% in the StI task. The results showed that the electroencephalogram signals differed in the prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task, which provides a scientific basis for the clinical diagnosis of attention-deficit/hyperactivity disorder individuals.
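
    A k-nearest neighbor classifier based on Euclidean distance, as named in the abstract, can be sketched in a few lines. This is a generic illustration, not the authors' code; the function name, the default k and the idea of passing amplitude features as tuples are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

    With hypothetical two-dimensional amplitude features, `knn_predict([(0, 0), (0, 1), (5, 5), (6, 5)], ["TD", "TD", "ADHD", "ADHD"], (0.2, 0.1))` votes among the three closest points, two of which carry the "TD" label.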

  19. An ultra low power feature extraction and classification system for wearable seizure detection.

    PubMed

    Page, Adam; Pramod Tim Oates, Siddharth; Mohsenin, Tinoosh

    2015-01-01

    In this paper we explore the use of a variety of machine learning algorithms for designing a reliable and low-power, multi-channel EEG feature extractor and classifier for predicting seizures from electroencephalographic data (scalp EEG). Different machine learning classifiers including k-nearest neighbor, support vector machines, naïve Bayes, logistic regression, and neural networks are explored with the goal of maximizing detection accuracy while minimizing power, area, and latency. The input to each machine learning classifier is a 198-feature vector containing 9 features for each of the 22 EEG channels obtained over 1-second windows. All classifiers were able to obtain F1 scores over 80% and onset sensitivity of 100% when tested on 10 patients. Among the five different classifiers that were explored, logistic regression (LR) proved to have minimum hardware complexity while providing an average F1 score of 91%. Both ASIC and FPGA implementations of logistic regression are presented and show the smallest area, power consumption, and the lowest latency when compared to the previous work.
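
    Logistic regression inference on the 198-element vector amounts to one weighted sum followed by a sigmoid, which is consistent with the low hardware complexity reported. The sketch below is an illustration only; the function name and the weight and bias arguments are hypothetical, and the abstract does not name the nine per-channel features.

```python
import math

N_CHANNELS = 22
FEATURES_PER_CHANNEL = 9
N_FEATURES = N_CHANNELS * FEATURES_PER_CHANNEL  # 198, as stated in the abstract


def lr_probability(weights, bias, features):
    """Logistic regression inference: sigmoid of one weighted sum."""
    if len(weights) != N_FEATURES or len(features) != N_FEATURES:
        raise ValueError("expected 198 weights and 198 features")
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable sigmoid: avoid exp() overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

    A window would be flagged as a seizure when the returned probability exceeds a chosen threshold; the threshold itself is a design choice not given in the abstract.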

  20. Motion Control of Drives for Prosthetic Hand Using Continuous Myoelectric Signals

    NASA Astrophysics Data System (ADS)

    Purushothaman, Geethanjali; Ray, Kalyan Kumar

    2016-03-01

    In this paper the authors present motion control of a prosthetic hand through continuous myoelectric signal acquisition, classification and actuation of the prosthetic drive. Four-channel continuous electromyogram (EMG) signals, also known as myoelectric signals (MES), are acquired from able-bodied subjects to classify six unique movements of the hand and wrist, viz., hand open (HO), hand close (HC), wrist flexion (WF), wrist extension (WE), ulnar deviation (UD) and radial deviation (RD). The classification technique involves extracting the features/patterns through statistical time domain (TD) parameters/autoregressive (AR) coefficients, which are reduced using principal component analysis (PCA). The reduced statistical TD features and/or AR coefficients are used to classify the signal patterns through k nearest neighbour (kNN) as well as neural network (NN) classifiers, and the performance of the classifiers is compared. The comparison clearly shows that the kNN classifier identifies the hidden intended motion in the myoelectric signals better than the NN classifier. Once the classifier identifies the intended motion, the signal is amplified to actuate the three low-power DC motors to perform the above-mentioned movements.

  1. A Prototype SSVEP Based Real Time BCI Gaming System

    PubMed Central

    Martišius, Ignas

    2016-01-01

    Although brain-computer interface technology is mainly designed with disabled people in mind, it can also be beneficial to healthy subjects, for example, in gaming or virtual reality systems. In this paper we discuss the typical architecture, paradigms, requirements, and limitations of electroencephalogram-based gaming systems. We have developed a prototype three-class brain-computer interface system, based on the steady state visually evoked potentials paradigm and the Emotiv EPOC headset. An online target shooting game, implemented in the OpenViBE environment, has been used for user feedback. The system utilizes wave atom transform for feature extraction, achieving an average accuracy of 78.2% using linear discriminant analysis classifier, 79.3% using support vector machine classifier with a linear kernel, and 80.5% using a support vector machine classifier with a radial basis function kernel. PMID:27051414

  2. MIDAS, prototype Multivariate Interactive Digital Analysis System, phase 1. Volume 1: System description

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.

    1974-01-01

    The MIDAS System is described as a third-generation fast multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turnaround time and significant gains in throughput. The hardware and software are described. The system contains a mini-computer to control the various high-speed processing elements in the data path, and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 200,000 pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation.

  3. A Prototype SSVEP Based Real Time BCI Gaming System.

    PubMed

    Martišius, Ignas; Damaševičius, Robertas

    2016-01-01

    Although brain-computer interface technology is mainly designed with disabled people in mind, it can also be beneficial to healthy subjects, for example, in gaming or virtual reality systems. In this paper we discuss the typical architecture, paradigms, requirements, and limitations of electroencephalogram-based gaming systems. We have developed a prototype three-class brain-computer interface system, based on the steady state visually evoked potentials paradigm and the Emotiv EPOC headset. An online target shooting game, implemented in the OpenViBE environment, has been used for user feedback. The system utilizes wave atom transform for feature extraction, achieving an average accuracy of 78.2% using linear discriminant analysis classifier, 79.3% using support vector machine classifier with a linear kernel, and 80.5% using a support vector machine classifier with a radial basis function kernel.

  4. Determining the Number of Clusters in a Data Set Without Graphical Interpretation

    NASA Technical Reports Server (NTRS)

    Aguirre, Nathan S.; Davies, Misty D.

    2011-01-01

    Cluster analysis is a data mining technique that is meant to simplify the process of classifying data points. The basic clustering process requires an input of data points and the number of clusters wanted. The clustering algorithm will then pick starting C points for the clusters, which can be either random spatial points or random data points. It then assigns each data point to the nearest C point, where "nearest" usually means Euclidean distance, but some algorithms use another criterion. The next step is determining whether the clustering arrangement thus found is within a certain tolerance. If it falls within this tolerance, the process ends. Otherwise the C points are adjusted based on how many data points are in each cluster, and the steps repeat until the algorithm converges,
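
    The loop described above (pick starting points, assign each data point to the nearest one, check a tolerance, adjust, repeat) matches the standard k-means procedure. The sketch below assumes the usual update rule, moving each center to the mean of its assigned points; the function name and parameters are illustrative and not taken from the report.

```python
import math
import random


def k_means(points, k, tolerance=1e-6, max_iterations=100, seed=0):
    """Cluster `points` (a list of tuples) into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # start from k random data points
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centers[i]))
            clusters[nearest].append(point)
        # Update step: move each center to the mean of its cluster.
        new_centers = []
        for center, cluster in zip(centers, clusters):
            if cluster:
                new_centers.append(tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centers.append(center)  # keep an empty cluster's center in place
        # Stop once no center moved farther than the tolerance.
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift <= tolerance:
            break
    return centers, clusters
```

    The number of clusters `k` is still an input here; choosing it automatically is the problem the report's title refers to and is outside this sketch.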

  5. Chaotic Particle Swarm Optimization with Mutation for Classification

    PubMed Central

    Assarzadeh, Zahra; Naghsh-Nilchi, Ahmad Reza

    2015-01-01

    In this paper, a chaotic particle swarm optimization with mutation-based classifier particle swarm optimization is proposed to classify patterns of different classes in the feature space. The introduced mutation operators and chaotic sequences allow us to overcome the problem of early convergence to a local minimum associated with particle swarm optimization algorithms. That is, the mutation operator sharpens the convergence and tunes the best possible solution. Furthermore, to remove the irrelevant data and reduce the dimensionality of medical datasets, a feature selection approach using a binary version of the proposed particle swarm optimization is introduced. In order to demonstrate the effectiveness of our proposed classifier, mutation-based classifier particle swarm optimization, it is tested on three classification datasets, namely Wisconsin diagnostic breast cancer, Wisconsin breast cancer and heart-statlog, with different feature vector dimensions. The proposed algorithm is compared with different classifier algorithms including k-nearest neighbor, as a conventional classifier, and particle swarm-classifier, genetic algorithm, and Imperialist competitive algorithm-classifier, as more sophisticated ones. The performance of each classifier was evaluated by calculating the accuracy, sensitivity, specificity and Matthews's correlation coefficient. The experimental results show that the mutation-based classifier particle swarm optimization unequivocally performs better than all the compared algorithms. PMID:25709937
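
    The four evaluation measures named in the abstract can all be computed from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name and the counts in the usage note are made up for illustration and are not results from the paper.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }
```

    For example, `binary_metrics(tp=40, tn=50, fp=5, fn=5)` gives an accuracy of 0.9; unlike accuracy, the Matthews coefficient stays informative when the two classes are of very different sizes.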

  6. A dysbiosis index to assess microbial changes in fecal samples of dogs with chronic inflammatory enteropathy.

    PubMed

    AlShawaqfeh, M K; Wajid, B; Minamoto, Y; Markel, M; Lidbury, J A; Steiner, J M; Serpedin, E; Suchodolski, J S

    2017-11-01

    Recent studies have identified various bacterial groups that are altered in dogs with chronic inflammatory enteropathies (CE) compared to healthy dogs. The study aim was to use quantitative PCR (qPCR) assays to confirm these findings in a larger number of dogs, and to build a mathematical algorithm to report these microbiota changes as a dysbiosis index (DI). Fecal DNA from 95 healthy dogs and 106 dogs with histologically confirmed CE was analyzed. Samples were grouped into a training set and a validation set. Various mathematical models and combinations of qPCR assays were evaluated to find a model with the highest discriminatory power. The final qPCR panel consisted of eight bacterial groups: total bacteria, Faecalibacterium, Turicibacter, Escherichia coli, Streptococcus, Blautia, Fusobacterium and Clostridium hiranonis. The qPCR-based DI was built based on the nearest centroid classifier, and reports the degree of dysbiosis in a single numerical value that measures the closeness in the l2-norm of the test sample to the mean prototype of each class. A negative DI indicates normobiosis, whereas a positive DI indicates dysbiosis. For a threshold of 0, the DI based on the combined dataset achieved 74% sensitivity and 95% specificity to separate healthy and CE dogs. © FEMS 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
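
    One way to turn a nearest centroid classifier into a signed index is to subtract the two l2 distances, so that a sample closer to the healthy prototype scores below zero. The abstract does not give the exact formula, so the difference used below is an assumption, as are the function names and the toy abundance values in the usage note.

```python
import math


def centroid(samples):
    """Mean prototype of a class: the per-feature average of its samples."""
    return tuple(sum(column) / len(samples) for column in zip(*samples))


def dysbiosis_index(sample, healthy_centroid, disease_centroid):
    """Signed index: negative when the sample is nearer the healthy prototype."""
    to_healthy = math.dist(sample, healthy_centroid)
    to_disease = math.dist(sample, disease_centroid)
    return to_healthy - to_disease
```

    With hypothetical two-feature training data, `centroid([(7.0, 6.0), (7.4, 6.2)])` and `centroid([(4.0, 3.0), (4.4, 3.4)])` give the two prototypes, and a new sample is called dysbiotic when its index is above the threshold of 0.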

  7. Recognition of aspect-dependent three-dimensional objects by an echolocating Atlantic bottlenose dolphin.

    PubMed

    Helweg, D A; Roitblat, H L; Nachtigall, P E; Hautus, M J

    1996-01-01

    We examined the ability of a bottlenose dolphin (Tursiops truncatus) to recognize aspect-dependent objects using echolocation. An aspect-dependent object such as a cube produces acoustically different echoes at different angles relative to the echolocation signal. The dolphin recognized the objects even though the objects were free to rotate and sway. A linear discriminant analysis and nearest centroid classifier could classify the objects using average amplitude, center frequency, and bandwidth of object echoes. The results show that dolphins can use varying acoustic properties to recognize constant objects and suggest that aspect-independent representations may be formed by combining information gleaned from multiple echoes.

  8. Stratified estimation of forest area using satellite imagery, inventory data, and the k-nearest neighbors technique

    Treesearch

    Ronald E. McRoberts; Mark D. Nelson; Daniel G. Wendt

    2002-01-01

    For two large study areas in Minnesota, USA, stratified estimation using classified Landsat Thematic Mapper satellite imagery as the basis for stratification was used to estimate forest area. Measurements of forest inventory plots obtained for a 12-month period in 1998 and 1999 were used as the source of data for within-stratum estimates. These measurements further...

  9. Portable Language-Independent Adaptive Translation from OCR. Phase 1

    DTIC Science & Technology

    2009-04-01

    including brute-force k-Nearest Neighbors (kNN), fast approximate kNN using hashed k-d trees, classification and regression trees, and locality...achieved by refinements in ground-truthing protocols. Recent algorithmic improvements to our approximate kNN classifier using hashed k-D trees allows...recent years discriminative training has been shown to outperform phonetic HMMs estimated using ML for speech recognition. Standard ML estimation

  10. TACOA – Taxonomic classification of environmental genomic fragments using a kernelized nearest neighbor approach

    PubMed Central

    Diaz, Naryttza N; Krause, Lutz; Goesmann, Alexander; Niehaus, Karsten; Nattkemper, Tim W

    2009-01-01

    Background: Metagenomics, or the sequencing and analysis of collective genomes (metagenomes) of microorganisms isolated from an environment, promises direct access to the "unculturable majority". This emerging field offers the potential to lay a solid basis for our understanding of the entire living world. However, taxonomic classification is an essential task in the analysis of metagenomics data sets that is still far from being solved. We present a novel strategy to predict the taxonomic origin of environmental genomic fragments. The proposed classifier combines the idea of the k-nearest neighbor with strategies from kernel-based learning. Results: Our novel strategy was extensively evaluated using the leave-one-out cross validation strategy on fragments of variable length (800 bp – 50 Kbp) from 373 completely sequenced genomes. TACOA is able to classify genomic fragments of length 800 bp and 1 Kbp with high accuracy until rank class. For longer fragments ≥ 3 Kbp accurate predictions are made at even deeper taxonomic ranks (order and genus). Remarkably, TACOA also produces reliable results when the taxonomic origin of a fragment is not represented in the reference set, thus classifying such fragments to their known broader taxonomic class or simply as "unknown". We compared the classification accuracy of TACOA with the latest intrinsic classifier PhyloPythia using 63 recently published complete genomes. For fragments of length 800 bp and 1 Kbp the overall accuracy of TACOA is higher than that obtained by PhyloPythia at all taxonomic ranks. For all fragment lengths, both methods achieved comparably high specificity up to rank class, and low false negative rates were also obtained. Conclusion: An accurate multi-class taxonomic classifier was developed for environmental genomic fragments. TACOA can predict with high reliability the taxonomic origin of genomic fragments as short as 800 bp. The proposed method is transparent, fast and accurate, and the reference set can be easily updated as newly sequenced genomes become available. Moreover, the method proved competitive when compared to the most current classifier PhyloPythia and has the advantage that it can be locally installed and the reference set can be kept up-to-date. PMID:19210774
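
    One possible reading of "k-nearest neighbor combined with kernel-based learning" is a vote in which each neighbor counts according to a kernel of its distance, with an "unknown" answer when no class gathers enough weight. The sketch below is not TACOA's implementation: the Gaussian kernel, the bandwidth, the rejection threshold and all names are assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def kernel_knn(train_vectors, train_labels, query, k=3, bandwidth=1.0, min_weight=0.1):
    """Kernel-weighted k-NN vote that answers "unknown" for far-away queries."""
    neighbors = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )[:k]
    scores = defaultdict(float)
    for distance, label in neighbors:
        # Gaussian kernel: close neighbors weigh almost 1, distant ones almost 0.
        scores[label] += math.exp(-(distance ** 2) / (2 * bandwidth ** 2))
    best_label = max(scores, key=scores.get)
    if scores[best_label] < min_weight:
        return "unknown"  # nothing in the reference set is close enough
    return best_label
```

    In a metagenomic setting the vectors would be some numeric description of each fragment, which this sketch leaves open; the rejection branch mirrors the abstract's point that fragments without a close reference can be reported as "unknown".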

  11. MIDAS, prototype Multivariate Interactive Digital Analysis System, phase 1. Volume 3: Wiring diagrams

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Christenson, D.; Gordon, M.; Kistler, R.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1974-01-01

    The Midas System is a third-generation, fast, multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS Program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in Phase I of the overall program are described. The system contains a mini-computer to control the various high-speed processing elements in the data path and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 2 x 100,000 pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation. The MIDAS construction and wiring diagrams are given.

  12. MIDAS, prototype Multivariate Interactive Digital Analysis System, Phase 1. Volume 2: Diagnostic system

    NASA Technical Reports Server (NTRS)

    Kriegler, F. J.; Christenson, D.; Gordon, M.; Kistler, R.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1974-01-01

    The MIDAS System is a third-generation, fast, multispectral recognition system able to keep pace with the large quantity and high rates of data acquisition from present and projected sensors. A principal objective of the MIDAS Program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in Phase I of the overall program are described. The system contains a mini-computer to control the various high-speed processing elements in the data path and a classifier which implements an all-digital prototype multivariate-Gaussian maximum likelihood decision algorithm operating at 2 x 10^5 pixels/sec. Sufficient hardware was developed to perform signature extraction from computer-compatible tapes, compute classifier coefficients, control the classifier operation, and diagnose operation. Diagnostic programs used to test MIDAS' operations are presented.

  13. K-nearest neighbors based methods for identification of different gear crack levels under different motor speeds and loads: Revisited

    NASA Astrophysics Data System (ADS)

    Wang, Dong

    2016-03-01

    Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertise to explain gear fault signatures, which is usually not easy for ordinary users to achieve. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. The previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of redundant statistical features is conducted to obtain new significant statistical features. Finally, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical features, some other popular statistical models, including linear discriminant analysis, quadratic discriminant analysis, classification and regression tree and naive Bayes classifier, are compared with the developed method. The results show that the developed method has the highest prediction accuracies among these statistical models. Additionally, selection of the number of new significant features and parameter selection of K-nearest neighbors are thoroughly investigated.

  14. Carbon p Electron Ferromagnetism in Silicon Carbide

    PubMed Central

    Wang, Yutian; Liu, Yu; Wang, Gang; Anwand, Wolfgang; Jenkins, Catherine A.; Arenholz, Elke; Munnik, Frans; Gordan, Ovidiu D.; Salvan, Georgeta; Zahn, Dietrich R. T.; Chen, Xiaolong; Gemming, Sibylle; Helm, Manfred; Zhou, Shengqiang

    2015-01-01

    Ferromagnetism can occur in wide-band gap semiconductors as well as in carbon-based materials when specific defects are introduced. It is thus desirable to establish a direct relation between the defects and the resulting ferromagnetism. Here, we contribute to revealing the origin of defect-induced ferromagnetism using SiC as a prototypical example. We show that the long-range ferromagnetic coupling can be attributed to the p electrons of the nearest-neighbor carbon atoms around the V_Si V_C divacancies. Thus, the ferromagnetism is traced down to its microscopic electronic origin. PMID:25758040

  15. Carbon p electron ferromagnetism in silicon carbide

    DOE PAGES

    Wang, Yutian; Liu, Yu; Wang, Gang; ...

    2015-03-11

    Ferromagnetism can occur in wide-band gap semiconductors as well as in carbon-based materials when specific defects are introduced. It is thus desirable to establish a direct relation between the defects and the resulting ferromagnetism. Here, we contribute to revealing the origin of defect-induced ferromagnetism using SiC as a prototypical example. We show that the long-range ferromagnetic coupling can be attributed to the p electrons of the nearest-neighbor carbon atoms around the V_Si V_C divacancies. Thus, the ferromagnetism is traced down to its microscopic electronic origin.

  16. Land cover map for map zones 8 and 9 developed from SAGEMAP, GNN, and SWReGAP: a pilot for NWGAP

    Treesearch

    James S. Kagan; Janet L. Ohmann; Matthew Gregory; Claudine Tobalske

    2008-01-01

    As part of the Northwest Gap Analysis Project, land cover maps were generated for most of eastern Washington and eastern Oregon. The maps were derived from regional SAGEMAP and SWReGAP data sets using decision tree classifiers for nonforest areas, and Gradient Nearest Neighbor imputation modeling for forests and woodlands. The maps integrate data from regional...

  17. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    PubMed Central

    2016-01-01

    According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context, the problem arises of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM slightly ahead. In the process of classification a "shrink" method, based on feature selection, was introduced and applied. At the first stage of classification the results of the "shrink" case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful. PMID:27668260

  18. Second harmonic generation microscopy analysis of extracellular matrix changes in human idiopathic pulmonary fibrosis

    PubMed Central

    Tilbury, Karissa; Hocker, James; Wen, Bruce L.; Sandbo, Nathan; Singh, Vikas; Campagnola, Paul J.

    2014-01-01

    Patients with idiopathic pulmonary fibrosis (IPF) have poor long-term survival as there are limited diagnostic/prognostic tools or successful therapies. Remodeling of the extracellular matrix (ECM) has been implicated in IPF progression; however, the structural consequences on the collagen architecture have not received considerable attention. Here, we demonstrate that second harmonic generation (SHG) and multiphoton fluorescence microscopy can quantitatively differentiate normal and IPF human tissues. For SHG analysis, we developed a classifier based on wavelet transforms, principal component analysis, and a K-nearest-neighbor algorithm to classify the specific alterations of the collagen structure observed in IPF tissues. The resulting ROC curves obtained by varying the numbers of principal components and nearest neighbors yielded accuracies of >95%. In contrast, simpler metrics based on SHG intensity and collagen coverage in the image provided little or no discrimination. We also characterized the change in the elastin/collagen balance by simultaneously measuring the elastin autofluorescence and SHG intensities and found that the IPF tissues were less elastic relative to collagen. This is consistent with known mechanical consequences of the disease. Understanding ECM remodeling in IPF via nonlinear optical microscopy may enhance our ability to differentiate patients with rapid and slow progression and, thus, provide better prognostic information. PMID:25134793

  19. Exploiting Language Models to Classify Events from Twitter

    PubMed Central

    Vo, Duc-Thuan; Hai, Vo Thuan; Ock, Cheol-Young

    2015-01-01

    Classifying events is challenging in Twitter because tweet texts contain a large amount of temporal data with a lot of noise and various kinds of topics. In this paper, we propose a method to classify events from Twitter. We first find the distinguishing terms between tweets in events and measure their similarities with learning language models such as ConceptNet and a latent Dirichlet allocation method for selectional preferences (LDA-SP), which have been widely studied based on large text corpora within computational linguistic relations. The relationships among term words in tweets are discovered by checking them under each model. We then propose a method to compute the similarity between tweets based on the tweets' features, including common term words and relationships among their distinguishing term words. This similarity is explicit and convenient to apply to k-nearest neighbor techniques for classification. We carefully carried out experiments on the Edinburgh Twitter Corpus to show that our method achieves competitive results for classifying events. PMID:26451139

  20. Efficacy Evaluation of Different Wavelet Feature Extraction Methods on Brain MRI Tumor Detection

    NASA Astrophysics Data System (ADS)

    Nabizadeh, Nooshin; John, Nigel; Kubat, Miroslav

    2014-03-01

    Automated Magnetic Resonance Imaging brain tumor detection and segmentation is a challenging task. Among the different available methods, feature-based methods are very dominant. While many feature extraction techniques have been employed, it is still not quite clear which feature extraction methods should be preferred. To help improve the situation, we present the results of a study in which we evaluate the efficiency of using different wavelet transform feature extraction methods in brain MRI abnormality detection. Using T1-weighted brain images, the Discrete Wavelet Transform (DWT), Discrete Wavelet Packet Transform (DWPT), Dual Tree Complex Wavelet Transform (DTCWT), and Complex Morlet Wavelet Transform (CMWT) methods are applied to construct the feature pool. Three classifiers, Support Vector Machine, K Nearest Neighbor, and Sparse Representation-Based Classifier, are applied and compared for classifying the selected features. The results show that DTCWT and CMWT features classified with SVM yield the highest classification accuracy, demonstrating that wavelet transform features are informative in this application.

  1. Machine learning approach to automatic exudate detection in retinal images from diabetic patients

    NASA Astrophysics Data System (ADS)

    Sopharak, Akara; Dailey, Matthew N.; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Tom; Thet Nwe, Khine; Aye Moe, Yin

    2010-01-01

    Exudates are among the preliminary signs of diabetic retinopathy, a major cause of vision loss in diabetic patients. Early detection of exudates could improve patients' chances to avoid blindness. In this paper, we present a series of experiments on feature selection and exudates classification using naive Bayes and support vector machine (SVM) classifiers. We first fit the naive Bayes model to a training set consisting of 15 features extracted from each of 115,867 positive examples of exudate pixels and an equal number of negative examples. We then perform feature selection on the naive Bayes model, repeatedly removing features from the classifier, one by one, until classification performance stops improving. To find the best SVM, we begin with the best feature set from the naive Bayes classifier, and repeatedly add the previously-removed features to the classifier. For each combination of features, we perform a grid search to determine the best combination of hyperparameters ν (tolerance for training errors) and γ (radial basis function width). We compare the best naive Bayes and SVM classifiers to a baseline nearest neighbour (NN) classifier using the best feature sets from both classifiers. We find that the naive Bayes and SVM classifiers perform better than the NN classifier. The overall best sensitivity, specificity, precision, and accuracy are 92.28%, 98.52%, 53.05%, and 98.41%, respectively.
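
    The feature selection step described above, removing features one by one until performance stops improving, is a greedy backward elimination. The sketch below shows that loop in generic form; the function name is illustrative, and the `score` callback stands in for whatever classifier evaluation is used (it is hypothetical here, not the authors' naive Bayes pipeline).

```python
def backward_elimination(features, score):
    """Greedily drop one feature at a time while the score keeps improving."""
    selected = list(features)
    best = score(selected)
    while len(selected) > 1:
        # Score every subset obtained by removing exactly one remaining feature.
        candidates = [
            (score([f for f in selected if f != removed]), removed)
            for removed in selected
        ]
        candidate_score, removed = max(candidates, key=lambda item: item[0])
        if candidate_score <= best:
            break  # no single removal improves the score any further
        best = candidate_score
        selected.remove(removed)
    return selected, best
```

    In practice `score` would train and validate a classifier on the given feature subset and return, for example, its validation accuracy; the reverse procedure in the abstract, adding previously removed features back for the SVM, follows the same pattern with additions instead of removals.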

  2. Emergent biomarker derived from next-generation sequencing to identify pain patients requiring uncommonly high opioid doses

    PubMed Central

    Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J

    2017-01-01

    Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces ‘big data’ exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics. PMID:27139154

  3. Emergent biomarker derived from next-generation sequencing to identify pain patients requiring uncommonly high opioid doses.

    PubMed

    Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J

    2017-10-01

    Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces 'big data' exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics.

  4. A Cultural Resources Inventory of the John Martin Reservoir, Colorado.

    DTIC Science & Technology

    1982-08-31

    Service provides very useful infor- thesis 1.4. The Nearest Neighbor analysis with mation as to potential native plants and animals Z-coordinate cluster...expected relationships between Our analysis is far less specific about the base camps and many internally different artifact staple plant resources. The...artifacts were which had one of the lowest mean annual plant classified for the analysis in terms of their attri- production ratings. Twenty-nine

  5. Automated phenotype pattern recognition of zebrafish for high-throughput screening.

    PubMed

    Schutera, Mark; Dickmeis, Thomas; Mione, Marina; Peravali, Ravindra; Marcato, Daniel; Reischl, Markus; Mikut, Ralf; Pylatiuk, Christian

    2016-07-03

    Over the last years, the zebrafish (Danio rerio) has become a key model organism in genetic and chemical screenings. A growing number of experiments and an expanding interest in zebrafish research makes it increasingly essential to automatize the distribution of embryos and larvae into standard microtiter plates or other sample holders for screening, often according to phenotypical features. Until now, such sorting processes have been carried out by manually handling the larvae and manual feature detection. Here, a prototype platform for image acquisition together with a classification software is presented. Zebrafish embryos and larvae and their features such as pigmentation are detected automatically from the image. Zebrafish of 4 different phenotypes can be classified through pattern recognition at 72 h post fertilization (hpf), allowing the software to classify an embryo into 2 distinct phenotypic classes: wild-type versus variant. The zebrafish phenotypes are classified with an accuracy of 79-99% without any user interaction. A description of the prototype platform and of the algorithms for image processing and pattern recognition is presented.

  6. A Nearest Neighbor Classifier Employing Critical Boundary Vectors for Efficient On-Chip Template Reduction.

    PubMed

    Xia, Wenjun; Mita, Yoshio; Shibata, Tadashi

    2016-05-01

    Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using a field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k-means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k-means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power of that used by the PC.

  7. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    NASA Astrophysics Data System (ADS)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.

  8. Inspection of wear particles in oils by using a fuzzy classifier

    NASA Astrophysics Data System (ADS)

    Hamalainen, Jari J.; Enwald, Petri

    1994-11-01

    The reliability of stand-alone machines and larger production units can be improved by automated condition monitoring. Analysis of wear particles in lubricating or hydraulic oils helps diagnosing the wear states of machine parts. This paper presents a computer vision system for automated classification of wear particles. Digitized images from experiments with a bearing test bench, a hydraulic system with an industrial company, and oil samples from different industrial sources were used for algorithm development and testing. The wear particles were divided into four classes indicating different wear mechanisms: cutting wear, fatigue wear, adhesive wear, and abrasive wear. The results showed that the fuzzy K-nearest neighbor classifier utilized gave the same distribution of wear particles as the classification by a human expert.

  9. A system for tracking and recognizing pedestrian faces using a network of loosely coupled cameras

    NASA Astrophysics Data System (ADS)

    Gagnon, L.; Laliberté, F.; Foucher, S.; Branzan Albu, A.; Laurendeau, D.

    2006-05-01

    A face recognition module has been developed for an intelligent multi-camera video surveillance system. The module can recognize a pedestrian face in terms of six basic emotions and the neutral state. Face and facial features detection (eyes, nasal root, nose and mouth) are first performed using cascades of boosted classifiers. These features are used to normalize the pose and dimension of the face image. Gabor filters are then sampled on a regular grid covering the face image to build a facial feature vector that feeds a nearest neighbor classifier with a cosine distance similarity measure for facial expression interpretation and face model construction. A graphical user interface allows the user to adjust the module parameters.
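
    A nearest neighbor classifier with a cosine distance, as mentioned in this abstract, can be sketched in a few lines. The feature vectors, labels, and function names below are hypothetical, and the handling of zero vectors is a convention chosen for this sketch rather than anything stated in the source.

```python
import math
from typing import List, Sequence, Tuple


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Return 1 - cos(angle between u and v)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 1.0  # assumed convention: a zero vector is treated as orthogonal
    return 1.0 - dot / (norm_u * norm_v)


def nearest_neighbor(
    query: Sequence[float],
    templates: List[Tuple[Sequence[float], str]],
) -> str:
    """Return the label of the stored template closest to the query."""
    _, label = min(templates, key=lambda t: cosine_distance(query, t[0]))
    return label


# Hypothetical 3-D feature vectors standing in for Gabor responses.
templates = [
    ([1.0, 0.0, 0.2], "neutral"),
    ([0.1, 1.0, 0.0], "happy"),
    ([0.0, 0.2, 1.0], "surprise"),
]
print(nearest_neighbor([0.2, 2.0, 0.1], templates))
```

    Because the cosine distance depends only on direction, rescaling the query vector does not change the predicted label.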

  10. Automatic Recognition of Breathing Route During Sleep Using Snoring Sounds

    NASA Astrophysics Data System (ADS)

    Mikami, Tsuyoshi; Kojima, Yohichiro

    This letter classifies snoring sounds into three breathing routes (oral, nasal, and oronasal) with discriminant analysis of the power spectra and k-nearest neighbor method. It is necessary to recognize breathing route during snoring, because oral snoring is a typical symptom of sleep apnea but we cannot know our own breathing and snoring condition during sleep. As a result, about 98.8% classification rate is obtained by using leave-one-out test for performance evaluation.
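
    The leave-one-out test used for evaluation here holds out each sample in turn, classifies it from the remaining samples, and reports the fraction classified correctly. The sketch below pairs it with a plain k-nearest neighbor rule; the one-dimensional toy features, labels, and function names are assumptions for illustration and are unrelated to the snoring data.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def knn_predict(train: List[Sample], query: Sequence[float], k: int) -> str:
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples: List[Sample], k: int = 1) -> float:
    """Classify each sample using all the others and average the hits."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical 1-D features; the last sample lies nearer the other class.
samples = [
    ((0.0,), "oral"),
    ((0.1,), "oral"),
    ((1.0,), "nasal"),
    ((1.1,), "nasal"),
    ((0.7,), "oral"),
]
accuracy = leave_one_out_accuracy(samples, k=1)
print(accuracy)
```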

  11. Heterogeneous Multi-Metric Learning for Multi-Sensor Fusion

    DTIC Science & Technology

    2011-07-01

    distance”. One of the most widely used methods is the k-nearest neighbor ( KNN ) method [4], which labels an input data sample to be the class with majority...despite of its simplicity, it can be an effective candidate and can be easily extended to handle multiple sensors. Distance based method such as KNN relies...Neighbor (LMNN) method [21] which will be briefly reviewed in the sequel. LMNN method tries to learn an optimal metric specifically for KNN classifier. The

  12. Review on CNC-Rapid Prototyping

    NASA Astrophysics Data System (ADS)

    Z, M. Nafis O.; Y, Nafrizuan M.; A, Munira M.; J, Kartina

    2012-09-01

    This article reviewed developments of Computerized Numerical Control (CNC) technology in the rapid prototyping process. Rapid prototyping (RP) can be classified into three major groups: subtractive, additive and virtual. CNC rapid prototyping is grouped under the subtractive category, which involves material removal from a workpiece that is larger than the final part. Richard Wysk established the use of CNC machines for rapid prototyping using sets of 2½-D tool paths from various orientations about a rotary axis to machine parts without refixturing. Since then, there have been a few developments of this process, mainly aimed at optimizing the operation and increasing the process capabilities to match common additive types of RP. These developments include the integration of machining and deposition processes (hybrid RP), adoption of RP on conventional machines and optimization of the CNC rapid prototyping process based on controlled parameters. The article concludes that the CNC rapid prototyping research area has vast room for improvement, as in conventional machining processes. Further developments and findings will enhance the usage of this method and minimize the limitations of the current approach to building a prototype.

  13. The classification of hunger behaviour of Lates Calcarifer through the integration of image processing technique and k-Nearest Neighbour learning algorithm

    NASA Astrophysics Data System (ADS)

    Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.

    2018-04-01

    Fish hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead to the death of the fish and consequently reduce the quantity of fish produced. Moreover, excess food that is not consumed by the fish dissolves in the water and reduces the water quality by lowering the oxygen level. This problem can also lead to the death of the fish or even spur fish diseases. In the present study, a correlation of Barramundi fish-school behaviour with hunger condition through the hybrid data integration of image processing technique is established. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through the k-Nearest Neighbour (k-NN) learning algorithm. Three different variations of the algorithm, namely cosine, cubic and weighted, are assessed on their ability to classify the aforementioned fish hunger behaviour. It was found from the study that the weighted k-NN variation provides the best classification with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining the fish feeding routine.

  14. Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio

    NASA Astrophysics Data System (ADS)

    Nababan, A. A.; Sitompul, O. S.; Tulus

    2018-04-01

    K-Nearest Neighbor (KNN) is a good classifier, but several studies report that its accuracy is still lower than that of other methods. One cause of the low accuracy is that each attribute has the same effect on the classification process, while some less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio, in which the Gain Ratio is used as a parameter to see the correlation between each attribute in the data and also as the basis for weighting each attribute of the dataset. The accuracy of the results is compared to the accuracy of the original KNN method using 10-fold cross-validation with several datasets from the UCI Machine Learning Repository and KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the test results, the proposed method was able to increase the classification accuracy of KNN, where the highest difference in accuracy, obtained on the hayes-roth dataset, is 12.73%, and the lowest difference, obtained on the abalone dataset, is 0.07%. Averaged over all datasets, accuracy increases by 5.33%.
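
    The idea of weighting attributes by gain ratio before a k-nearest neighbor vote can be sketched as follows. This is an illustration under stated assumptions, not the paper's procedure: attributes are taken to be categorical, the distance is a weighted count of mismatching attributes, and the toy data and all function names are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def entropy(values: Sequence[str]) -> float:
    """Shannon entropy (base 2) of a sequence of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(column: Sequence[str], labels: Sequence[str]) -> float:
    """Information gain of the attribute divided by its split information."""
    n = len(labels)
    groups: Dict[str, List[str]] = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(column)
    if split_info == 0.0:
        return 0.0  # the attribute takes a single value and carries no information
    return (entropy(labels) - remainder) / split_info


def weighted_knn(rows, labels, weights, query, k=3) -> str:
    """k-NN vote using a weighted mismatch count as the distance (assumed)."""
    def dist(row):
        return sum(w for w, a, b in zip(weights, row, query) if a != b)
    ranked = sorted(range(len(rows)), key=lambda i: dist(rows[i]))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: the first attribute determines the class, the second is noise.
rows = [("sunny", "x"), ("sunny", "y"), ("rainy", "x"), ("rainy", "y")]
labels = ["play", "play", "stay", "stay"]
weights = [gain_ratio([r[j] for r in rows], labels) for j in range(2)]
prediction = weighted_knn(rows, labels, weights, ("rainy", "x"), k=3)
print(weights, prediction)
```

    On this toy data the noise attribute receives weight zero, so it no longer influences which neighbors are selected.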

  15. Ensemble Methods for Classification of Physical Activities from Wrist Accelerometry.

    PubMed

    Chowdhury, Alok Kumar; Tjondronegoro, Dian; Chandran, Vinod; Trost, Stewart G

    2017-09-01

    To investigate whether the use of ensemble learning algorithms improves physical activity recognition accuracy compared to the single classifier algorithms, and to compare the classification accuracy achieved by three conventional ensemble machine learning methods (bagging, boosting, random forest) and a custom ensemble model comprising four algorithms commonly used for activity recognition (binary decision tree, k nearest neighbor, support vector machine, and neural network). The study used three independent data sets that included wrist-worn accelerometer data. For each data set, a four-step classification framework consisting of data preprocessing, feature extraction, normalization and feature selection, and classifier training and testing was implemented. For the custom ensemble, decisions from the single classifiers were aggregated using three decision fusion methods: weighted majority vote, naïve Bayes combination, and behavior knowledge space combination. Classifiers were cross-validated using leave-one-subject-out cross-validation and compared on the basis of average F1 scores. In all three data sets, ensemble learning methods consistently outperformed the individual classifiers. Among the conventional ensemble methods, random forest models provided consistently high activity recognition; however, the custom ensemble model using weighted majority voting demonstrated the highest classification accuracy in two of the three data sets. Combining multiple individual classifiers using conventional or custom ensemble learning methods can improve activity recognition accuracy from wrist-worn accelerometer data.
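
    Weighted majority voting, one of the decision fusion methods named above, sums a per-classifier weight for each predicted label and returns the label with the largest total. The sketch below is a generic illustration; the activity labels and the weights (which might, for instance, come from validation scores) are hypothetical and not taken from the study.

```python
from collections import defaultdict
from typing import Dict, Sequence


def weighted_majority_vote(
    predictions: Sequence[str],
    weights: Sequence[float],
) -> str:
    """Return the label whose supporting classifiers have the largest total weight."""
    if len(predictions) != len(weights):
        raise ValueError("one weight is required per prediction")
    totals: Dict[str, float] = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


# Four hypothetical base classifiers; the first is trusted far more than the rest.
predictions = ["sit", "run", "run", "run"]
weights = [0.9, 0.2, 0.2, 0.2]
fused = weighted_majority_vote(predictions, weights)
print(fused)
```

    With equal weights the same predictions would yield "run", which shows how the weights can overturn a simple majority.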

  16. Ensemble Clustering Classification compete SVM and One-Class classifiers applied on plant microRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai

    2016-12-22

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.
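
    One plausible reading of a distance defined by "how they co-cluster" is the fraction of base clusterings in which two points fall into different clusters. The sketch below uses that reading inside a k-nearest neighbor vote. The exact definition used by EC-kNN is not given in the abstract, so this distance, the toy cluster assignments, and the function names are assumptions.

```python
from collections import Counter
from typing import Dict, Sequence


def cocluster_distance(i: int, j: int, clusterings: Sequence[Sequence[int]]) -> float:
    """Fraction of base clusterings that separate points i and j (assumed definition)."""
    separated = sum(1 for assignment in clusterings if assignment[i] != assignment[j])
    return separated / len(clusterings)


def ec_knn_predict(
    query_index: int,
    labels: Dict[int, str],
    clusterings: Sequence[Sequence[int]],
    k: int = 3,
) -> str:
    """Vote among the k labeled points that co-cluster most often with the query."""
    ranked = sorted(
        labels,
        key=lambda t: cocluster_distance(query_index, t, clusterings),
    )
    votes = Counter(labels[t] for t in ranked[:k])
    return votes.most_common(1)[0][0]


# Three hypothetical clusterings of five points; point 4 is the unlabeled query.
clusterings = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 1, 1, 1, 1],
]
labels = {0: "neg", 1: "neg", 2: "pos", 3: "pos"}
prediction = ec_knn_predict(4, labels, clusterings, k=3)
print(prediction)
```

    The base clusterings are computed over labeled and unlabeled points together, so the query needs no coordinates of its own at prediction time, only its cluster assignments.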

  17. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai

    2016-12-01

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.

  18. Classification of vegetation types in military region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose

    2015-10-01

    In the decision-making process regarding the planning and execution of military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in a hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of the present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic selector of features. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logistic Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of the vegetation's density and identification of the vegetation's main components. It was found that the best classifier in the first approach is Sparse Multinomial Logistic Regression (SMLR). In the second approach, the implemented methodology applied to high resolution images showed that the best performance was achieved by the KNN classifier and PCLDC. Comparing the two approaches, there is a multiscale issue, in which for different resolutions the best solution to the problem requires different classifiers and the extraction of different features.

  19. Comparisons and Selections of Features and Classifiers for Short Text Classification

    NASA Astrophysics Data System (ADS)

    Wang, Ye; Zhou, Zhi; Jin, Shan; Liu, Debin; Lu, Mi

    2017-10-01

    Short text is considerably different from traditional long text documents due to its shortness and conciseness, which somehow hinders the applications of conventional machine learning and data mining algorithms in short text classification. According to traditional artificial intelligence methods, we divide short text classification into three steps, namely preprocessing, feature selection and classifier comparison. In this paper, we have illustrated step-by-step how we approach our goals. Specifically, in feature selection, we compared the performance and robustness of the four methods of one-hot encoding, tf-idf weighting, word2vec and paragraph2vec, and in the classification part, we deliberately chose and compared Naive Bayes, Logistic Regression, Support Vector Machine, K-nearest Neighbor and Decision Tree as our classifiers. Then, we compared and analysed the classifiers horizontally with each other and vertically with feature selections. Regarding the datasets, we crawled more than 400,000 short text files from Shanghai and Shenzhen Stock Exchanges and manually labeled them into two classes, the big and the small. There are eight labels in the big class, and 59 labels in the small class.

  20. Automatic Cataract Hardness Classification Ex Vivo by Ultrasound Techniques.

    PubMed

    Caixinha, Miguel; Santos, Mário; Santos, Jaime

    2016-04-01

    To demonstrate the feasibility of a new methodology for cataract hardness characterization and automatic classification using ultrasound techniques, different cataract degrees were induced in 210 porcine lenses. A 25-MHz ultrasound transducer was used to obtain acoustical parameters (velocity and attenuation) and backscattering signals. B-Scan and parametric Nakagami images were constructed. Ninety-seven parameters were extracted and subjected to a Principal Component Analysis. Bayes, K-Nearest-Neighbours, Fisher Linear Discriminant and Support Vector Machine (SVM) classifiers were used to automatically classify the different cataract severities. Statistically significant increases with cataract formation were found for velocity, attenuation, mean brightness intensity of the B-Scan images and mean Nakagami m parameter (p < 0.01). The four classifiers showed a good performance for healthy versus cataractous lenses (F-measure ≥ 92.68%), while for initial versus severe cataracts the SVM classifier showed the higher performance (90.62%). The results showed that ultrasound techniques can be used for non-invasive cataract hardness characterization and automatic classification. Copyright © 2016 World Federation for Ultrasound in Medicine & Biology. Published by Elsevier Inc. All rights reserved.

  1. Developing a radiomics framework for classifying non-small cell lung carcinoma subtypes

    NASA Astrophysics Data System (ADS)

    Yu, Dongdong; Zang, Yali; Dong, Di; Zhou, Mu; Gevaert, Olivier; Fang, Mengjie; Shi, Jingyun; Tian, Jie

    2017-03-01

    Patient-targeted treatment of non-small cell lung carcinoma (NSCLC) has been well documented according to the histologic subtypes over the past decade. In parallel, quantitative image biomarkers have recently been highlighted as important diagnostic tools to facilitate histological subtype classification. In this study, we present a radiomics analysis that classifies adenocarcinoma (ADC) and squamous cell carcinoma (SqCC). We extract 52-dimensional, CT-based features (7 statistical features and 45 image texture features) to represent each nodule. We evaluate our approach on a clinical dataset including 324 ADC and 110 SqCC patients with CT image scans. Classification of these features is performed with four different machine-learning classifiers including Support Vector Machines with Radial Basis Function kernel (RBF-SVM), Random forest (RF), K-nearest neighbor (KNN), and RUSBoost algorithms. To improve the classifiers' performance, an optimal feature subset is selected from the original feature set by using an iterative forward inclusion and backward elimination algorithm. Extensive experimental results demonstrate that radiomics features achieve encouraging classification results on both the complete feature set (AUC=0.89) and the optimal feature subset (AUC=0.91).

  2. Unobtrusive Multi-Static Serial LiDAR Imager (UMSLI) First Generation Shape-Matching Based Classifier for 2D Contours

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Cao, Zheng; Ouyang, Bing; Principe, Jose

    A multi-static serial LiDAR system prototype was developed under DE-EE0006787 to detect, classify, and record interactions of marine life with marine hydrokinetic generation equipment. This software implements a shape-matching based classifier algorithm for the underwater automated detection of marine life for that system. In addition to applying shape descriptors, the algorithm also adopts information theoretical learning based affine shape registration, improving point correspondences found by shape descriptors as well as the final similarity measure.

  3. A regularization approach to hydrofacies delineation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Wohlberg, Brendt; Tartakovsky, Daniel

    2009-01-01

    We consider an inverse problem of identifying complex internal structures of composite (geological) materials from sparse measurements of system parameters and system states. Two conceptual frameworks for identifying internal boundaries between constitutive materials in a composite are considered. A sequential approach relies on support vector machines, nearest neighbor classifiers, or geostatistics to reconstruct boundaries from measurements of system parameters and then uses system states data to refine the reconstruction. A joint approach inverts the two data sets simultaneously by employing a regularization approach.

  4. Discovering weighted patterns in intron sequences using self-adaptive harmony search and back-propagation algorithms.

    PubMed

    Huang, Yin-Fu; Wang, Chia-Ming; Liou, Sing-Wu

    2013-01-01

    A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete.

  5. Discovering Weighted Patterns in Intron Sequences Using Self-Adaptive Harmony Search and Back-Propagation Algorithms

    PubMed Central

    Wang, Chia-Ming; Liou, Sing-Wu

    2013-01-01

    A hybrid self-adaptive harmony search and back-propagation mining system was proposed to discover weighted patterns in human intron sequences. By testing the weights under a lazy nearest neighbor classifier, the numerical results revealed the significance of these weighted patterns. Comparing these weighted patterns with the popular intron consensus model, it is clear that the discovered weighted patterns make the originally ambiguous 5SS and 3SS header patterns more specific and concrete. PMID:23737711

  6. Identity Recognition Algorithm Using Improved Gabor Feature Selection of Gait Energy Image

    NASA Astrophysics Data System (ADS)

    Chao, LIANG; Ling-yao, JIA; Dong-cheng, SHI

    2017-01-01

    This paper describes an effective gait recognition approach based on Gabor features of the gait energy image. In this paper, kernel Fisher analysis combined with a kernel matrix is proposed to select dominant features. A nearest neighbor classifier based on whitened cosine distance is used to discriminate different gait patterns. The proposed approach is tested on the CASIA and USF gait databases. The results show that our approach outperforms other state-of-the-art gait recognition approaches in terms of recognition accuracy and robustness.

  7. Machine Learning Methods for Production Cases Analysis

    NASA Astrophysics Data System (ADS)

    Mokrova, Nataliya V.; Mokrov, Alexander M.; Safonova, Alexandra V.; Vishnyakov, Igor V.

    2018-03-01

    An approach to the analysis of events occurring during the production process is proposed. The described machine learning system is able to solve classification tasks related to production control and hazard identification at an early stage. Descriptors of the internal production network data were used for training and testing of the applied models. k-Nearest Neighbors and Random forest methods were used to illustrate and analyze the proposed solution. The quality of the developed classifiers was estimated using standard statistical metrics, such as precision, recall and accuracy.

  8. MIDAS, prototype Multivariate Interactive Digital Analysis System for large area earth resources surveys. Volume 1: System description

    NASA Technical Reports Server (NTRS)

    Christenson, D.; Gordon, M.; Kistler, R.; Kriegler, F.; Lampert, S.; Marshall, R.; Mclaughlin, R.

    1977-01-01

    A third-generation, fast, low cost, multispectral recognition system (MIDAS) able to keep pace with the large quantity and high rates of data acquisition from large regions with present and projected sensors is described. The program can process a complete ERTS frame in forty seconds and provide a color map of sixteen constituent categories in a few minutes. A principal objective of the MIDAS program is to provide a system well interfaced with the human operator and thus to obtain large overall reductions in turn-around time and significant gains in throughput. The hardware and software generated in the overall program are described. The system contains a midi-computer to control the various high speed processing elements in the data path, a preprocessor to condition data, and a classifier which implements an all digital prototype multivariate Gaussian maximum likelihood or a Bayesian decision algorithm. Sufficient software was developed to perform signature extraction, control the preprocessor, compute classifier coefficients, control the classifier operation, operate the color display and printer, and diagnose operation.

  9. Comparative Analysis of Automatic Exudate Detection between Machine Learning and Traditional Approaches

    NASA Astrophysics Data System (ADS)

    Sopharak, Akara; Uyyanonvara, Bunyarit; Barman, Sarah; Williamson, Thomas

    To prevent blindness from diabetic retinopathy, periodic screening and early diagnosis are necessary. Due to the lack of expert ophthalmologists in rural areas, automated early detection of exudates (one of the visible signs of diabetic retinopathy) could help to reduce the incidence of blindness in diabetic patients. Traditional automatic exudate detection methods are based on specific parameter configurations, while machine learning approaches, which seem more flexible, may be computationally costly. A comparative analysis of traditional and machine learning methods for exudate detection, namely mathematical morphology, fuzzy c-means clustering, naive Bayesian classifier, Support Vector Machine and Nearest Neighbor classifier, is presented. Detected exudates are validated against expert ophthalmologists' hand-drawn ground truths. The sensitivity, specificity, precision, accuracy and time complexity of each method are also compared.

  10. Classifying multispectral data by neural networks

    NASA Technical Reports Server (NTRS)

    Telfer, Brian A.; Szu, Harold H.; Kiang, Richard K.

    1993-01-01

    Several energy functions for synthesizing neural networks are tested on 2-D synthetic data and on Landsat-4 Thematic Mapper data. These new energy functions, designed specifically for minimizing misclassification error, in some cases yield significant improvements in classification accuracy over the standard least mean squares energy function. In addition to operating on networks with one output unit per class, a new energy function is tested for binary encoded outputs, which result in smaller network sizes. The Thematic Mapper data (four bands were used) is classified on a single pixel basis, to provide a starting benchmark against which further improvements will be measured. Improvements are underway to make use of both subpixel and superpixel (i.e. contextual or neighborhood) information in the processing. For single pixel classification, the best neural network result is 78.7 percent, compared with 71.7 percent for a classical nearest neighbor classifier. The 78.7 percent result also improves on several earlier neural network results on this data.

  11. Optimizing classification performance in an object-based very-high-resolution land use-land cover urban application

    NASA Astrophysics Data System (ADS)

    Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore

    2017-10-01

    This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers. The results demonstrate that the accuracy of the SVM and KNN classifiers is the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for minimizing model complexity and interpretation.

  12. Emotion detection model of Filipino music

    NASA Astrophysics Data System (ADS)

    Noblejas, Kathleen Alexis; Isidro, Daryl Arvin; Samonte, Mary Jane C.

    2017-02-01

    This research explored the creation of a model to detect emotion from Filipino songs. The emotion model used was based on Paul Ekman's six basic emotions. The songs were classified into the following genres: kundiman, novelty, pop, and rock. The songs were annotated by a group of music experts based on the emotion the song induces in the listener. Musical features of the songs were extracted using jAudio, while the lyric features were extracted using a Bag-of-Words feature representation. The audio and lyric features of the Filipino songs were extracted for classification by the three chosen classifiers: Naïve Bayes, Support Vector Machines, and k-Nearest Neighbors. The goal of the research was to determine which classifier would work best for Filipino music. Evaluation was done by 10-fold cross validation, and accuracy, precision, recall, and F-measure results were compared. Models were also tested with unknown test data to further determine the models' accuracy through the prediction results.

  13. Object Classification in Semi Structured Enviroment Using Forward-Looking Sonar

    PubMed Central

    dos Santos, Matheus; Ribeiro, Pedro Otávio; Núñez, Pedro; Botelho, Silvia

    2017-01-01

    Submarine exploration using robots has been increasing in recent years. The automation of tasks such as monitoring, inspection, and underwater maintenance requires an understanding of the robot’s environment. Object recognition in the scene is becoming a critical issue for these systems. In this work, an underwater object classification pipeline applied to acoustic images acquired by Forward-Looking Sonar (FLS) is studied. The object segmentation combines thresholding, connected-pixel searching and intensity-peak analysis techniques. The object descriptor extracts intensity and geometric features of the detected objects. A comparison between the Support Vector Machine, K-Nearest Neighbors, and Random Trees classifiers is presented. An open-source tool was developed to annotate and classify the objects and evaluate their classification performance. The proposed method efficiently segments and classifies the structures in the scene using a real dataset acquired by an underwater vehicle in a harbor area. Experimental results demonstrate the robustness and accuracy of the method described in this paper. PMID:28961163

  14. Can a Smartphone Diagnose Parkinson Disease? A Deep Neural Network Method and Telediagnosis System Implementation.

    PubMed

    Zhang, Y N

    2017-01-01

    Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking tests, handwriting tests, and MRI diagnostics. In this paper, we propose a machine learning based PD telediagnosis method for smartphones. Classification of PD using speech records is a challenging task because the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and a K nearest neighbor (KNN) classifier. The KNN classifier can produce promising classification results from useful representations learned by the SAE. Empirical results show that the proposed method achieves better performance in all tested cases across classification tasks, demonstrating that machine learning is capable of classifying PD with a level of competence comparable to that of a doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on a browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.

  15. Can a Smartphone Diagnose Parkinson Disease? A Deep Neural Network Method and Telediagnosis System Implementation

    PubMed Central

    2017-01-01

    Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking tests, handwriting tests, and MRI diagnostics. In this paper, we propose a machine learning based PD telediagnosis method for smartphones. Classification of PD using speech records is a challenging task because the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and a K nearest neighbor (KNN) classifier. The KNN classifier can produce promising classification results from useful representations learned by the SAE. Empirical results show that the proposed method achieves better performance in all tested cases across classification tasks, demonstrating that machine learning is capable of classifying PD with a level of competence comparable to that of a doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on a browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547

  16. Neural Network Classifies Teleoperation Data

    NASA Technical Reports Server (NTRS)

    Fiorini, Paolo; Giancaspro, Antonio; Losito, Sergio; Pasquariello, Guido

    1994-01-01

    Prototype artificial neural network, implemented in software, identifies phases of telemanipulator tasks in real time by analyzing feedback signals from force sensors on manipulator hand. Prototype is early, subsystem-level product of continuing effort to develop automated system that assists in training and supervising human control operator: provides symbolic feedback (e.g., warnings of impending collisions or evaluations of performance) to operator in real time during successive executions of same task. Also simplifies transition between teleoperation and autonomous modes of telerobotic system.

  17. A translational platform for prototyping closed-loop neuromodulation systems

    PubMed Central

    Afshar, Pedram; Khambhati, Ankit; Stanslaski, Scott; Carlson, David; Jensen, Randy; Linde, Dave; Dani, Siddharth; Lazarewicz, Maciej; Cong, Peng; Giftakis, Jon; Stypulkowski, Paul; Denison, Tim

    2013-01-01

    While modulating neural activity through stimulation is an effective treatment for neurological diseases such as Parkinson's disease and essential tremor, an opportunity for improving neuromodulation therapy remains in automatically adjusting therapy to continuously optimize patient outcomes. Practical issues associated with achieving this include the paucity of human data related to disease states, poorly validated estimators of patient state, and unknown dynamic mappings of optimal stimulation parameters based on estimated states. To overcome these challenges, we present an investigational platform including: an implanted sensing and stimulation device to collect data and run automated closed-loop algorithms; an external tool to prototype classifier and control-policy algorithms; and real-time telemetry to update the implanted device firmware and monitor its state. The prototyping system was demonstrated in a chronic large animal model studying hippocampal dynamics. We used the platform to find biomarkers of the observed states and transfer functions of different stimulation amplitudes. Data showed that moderate levels of stimulation suppress hippocampal beta activity, while high levels of stimulation produce seizure-like after-discharge activity. The biomarker and transfer function observations were mapped into classifier and control-policy algorithms, which were downloaded to the implanted device to continuously titrate stimulation amplitude for the desired network effect. The platform is designed to be a flexible prototyping tool and could be used to develop improved mechanistic models and automated closed-loop systems for a variety of neurological disorders. PMID:23346048

  18. A translational platform for prototyping closed-loop neuromodulation systems.

    PubMed

    Afshar, Pedram; Khambhati, Ankit; Stanslaski, Scott; Carlson, David; Jensen, Randy; Linde, Dave; Dani, Siddharth; Lazarewicz, Maciej; Cong, Peng; Giftakis, Jon; Stypulkowski, Paul; Denison, Tim

    2012-01-01

    While modulating neural activity through stimulation is an effective treatment for neurological diseases such as Parkinson's disease and essential tremor, an opportunity for improving neuromodulation therapy remains in automatically adjusting therapy to continuously optimize patient outcomes. Practical issues associated with achieving this include the paucity of human data related to disease states, poorly validated estimators of patient state, and unknown dynamic mappings of optimal stimulation parameters based on estimated states. To overcome these challenges, we present an investigational platform including: an implanted sensing and stimulation device to collect data and run automated closed-loop algorithms; an external tool to prototype classifier and control-policy algorithms; and real-time telemetry to update the implanted device firmware and monitor its state. The prototyping system was demonstrated in a chronic large animal model studying hippocampal dynamics. We used the platform to find biomarkers of the observed states and transfer functions of different stimulation amplitudes. Data showed that moderate levels of stimulation suppress hippocampal beta activity, while high levels of stimulation produce seizure-like after-discharge activity. The biomarker and transfer function observations were mapped into classifier and control-policy algorithms, which were downloaded to the implanted device to continuously titrate stimulation amplitude for the desired network effect. The platform is designed to be a flexible prototyping tool and could be used to develop improved mechanistic models and automated closed-loop systems for a variety of neurological disorders.

  19. An Intelligent Weather Station

    PubMed Central

    Mestre, Gonçalo; Ruano, Antonio; Duarte, Helder; Silva, Sergio; Khosravani, Hamid; Pesteh, Shabnam; Ferreira, Pedro M.; Horta, Ricardo

    2015-01-01

    Accurate measurements of global solar radiation, atmospheric temperature and relative humidity, as well as the availability of the predictions of their evolution over time, are important for different areas of applications, such as agriculture, renewable energy and energy management, or thermal comfort in buildings. For this reason, an intelligent, light-weight, self-powered and portable sensor was developed, using a nearest-neighbors (NEN) algorithm and artificial neural network (ANN) models as the time-series predictor mechanisms. The hardware and software design of the implemented prototype are described, as well as the forecasting performance related to the three atmospheric variables, using both approaches, over a prediction horizon of 48-steps-ahead. PMID:26690433

  20. An Intelligent Weather Station.

    PubMed

    Mestre, Gonçalo; Ruano, Antonio; Duarte, Helder; Silva, Sergio; Khosravani, Hamid; Pesteh, Shabnam; Ferreira, Pedro M; Horta, Ricardo

    2015-12-10

    Accurate measurements of global solar radiation, atmospheric temperature and relative humidity, as well as the availability of the predictions of their evolution over time, are important for different areas of applications, such as agriculture, renewable energy and energy management, or thermal comfort in buildings. For this reason, an intelligent, light-weight, self-powered and portable sensor was developed, using a nearest-neighbors (NEN) algorithm and artificial neural network (ANN) models as the time-series predictor mechanisms. The hardware and software design of the implemented prototype are described, as well as the forecasting performance related to the three atmospheric variables, using both approaches, over a prediction horizon of 48-steps-ahead.

  1. Nearest neighbor 3D segmentation with context features

    NASA Astrophysics Data System (ADS)

    Hristova, Evelin; Schulz, Heinrich; Brosch, Tom; Heinrich, Mattias P.; Nickisch, Hannes

    2018-03-01

    Automated and fast multi-label segmentation of medical images is challenging and clinically important. This paper builds upon a supervised machine learning framework that uses training data sets with dense organ annotations and vantage point trees to classify voxels in unseen images based on similarity of binary feature vectors extracted from the data. Without explicit model knowledge, the algorithm is applicable to different modalities and organs, and achieves high accuracy. The method is successfully tested on 70 abdominal CT and 42 pelvic MR images. With respect to ground truth, an average Dice overlap score of 0.76 for the CT segmentation of liver, spleen and kidneys is achieved. The mean score for the MR delineation of bladder, bones, prostate and rectum is 0.65. Additionally, we benchmark several variations of the main components of the method and reduce the computation time by up to 47% without significant loss of accuracy. The segmentation results are - for a nearest neighbor method - surprisingly accurate, robust as well as data and time efficient.

  2. Detection of acute lymphocyte leukemia using k-nearest neighbor algorithm based on shape and histogram features

    NASA Astrophysics Data System (ADS)

    Purwanti, Endah; Calista, Evelyn

    2017-05-01

    Leukemia is a type of cancer which is caused by malignant neoplasms in leukocyte cells. One type of leukemia that can quickly cause death is acute lymphocyte leukemia (ALL). In this study, we propose automatic detection of lymphocyte leukemia through classification of single-cell lymphocyte images obtained from peripheral blood smears. There are two main objectives in this study. The first is to extract cell features. The second is to classify the lymphocyte cells into two classes, namely normal and abnormal lymphocytes. In this study, we use a combination of shape features and histogram features, and the classification algorithm is k-nearest neighbour with k varied over 1, 3, 5, 7, 9, 11, 13, and 15. The best accuracy, sensitivity, and specificity in this study are 90%, 90%, and 90%, obtained from the combined features of area, perimeter, mean, and standard deviation with k=7.
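
    The sweep over odd values of k described above can be sketched in plain Python as follows. The two-dimensional feature vectors and labels are made-up toy data (loosely standing in for area and perimeter), and only k = 1, 3, 5 are swept because the toy training set has just six points; none of this reproduces the study's data or results.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k):
    """Majority vote among the k training points closest to `query`."""
    order = sorted(range(len(train_X)),
                   key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical, well-separated toy features for two classes.
train_X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (3.0, 3.0), (3.2, 2.9), (2.8, 3.1)]
train_y = ["normal"] * 3 + ["abnormal"] * 3
test_X = [(1.1, 1.0), (2.9, 3.0)]
test_y = ["normal", "abnormal"]

# Evaluate accuracy for each odd k, mirroring the sweep in the abstract.
accuracy_by_k = {}
for k in (1, 3, 5):
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    accuracy_by_k[k] = hits / len(test_X)
```

    Odd values of k avoid tied votes in a two-class problem, which is presumably why the study restricts the sweep to odd k.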

  3. RDTC [Restricted Data Transmission Controller] global variable definitions

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Grambihler, A.J.; O'Callaghan, P.B.

    The purpose of the Restricted Data Transmission Controller (RDTC) is to demonstrate a methodology for transmitting data between computers which have different levels of classification. The RDTC does this by logically filtering the data being transmitted between the two computers. This prototype is set up to filter data from the classified computer so that only numeric data is passed to the unclassified computer. The RDTC allows all data from the unclassified computer to be sent to the classified computer. The classified system is referred to as LUA and the unclassified system is referred to as LUB. 9 tabs.

  4. Random ensemble learning for EEG classification.

    PubMed

    Hosseini, Mohammad-Parsa; Pompili, Dario; Elisevich, Kost; Soltanian-Zadeh, Hamid

    2018-01-01

    Real-time detection of seizure activity in epilepsy patients is critical in averting seizure activity and improving patients' quality of life. Accurate evaluation, presurgical assessment, seizure prevention, and emergency alerts all depend on the rapid detection of seizure onset. A new method of feature selection and classification for rapid and precise seizure detection is discussed wherein informative components of electroencephalogram (EEG)-derived data are extracted and an automatic method is presented using infinite independent component analysis (I-ICA) to select independent features. The feature space is divided into subspaces via random selection and multichannel support vector machines (SVMs) are used to classify these subspaces. The result of each classifier is then combined by majority voting to establish the final output. In addition, a random subspace ensemble using a combination of SVM, multilayer perceptron (MLP) neural network and an extended k-nearest neighbors (k-NN), called extended nearest neighbor (ENN), is developed for the EEG and electrocorticography (ECoG) big data problem. To evaluate the solution, a benchmark ECoG of eight patients with temporal and extratemporal epilepsy was implemented in a distributed computing framework as a multitier cloud-computing architecture. Using leave-one-out cross-validation, the accuracy, sensitivity, specificity, and both false positive and false negative ratios of the proposed method were found to be 0.97, 0.98, 0.96, 0.04, and 0.02, respectively. Application of the solution to cases under investigation with ECoG has also been effected to demonstrate its utility. Copyright © 2017 Elsevier B.V. All rights reserved.
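
    The majority-voting step that combines the subspace classifiers can be illustrated with the short sketch below. The label names and the per-classifier outputs are hypothetical; the base classifiers themselves (SVM, MLP, ENN) are not implemented here.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine one label per base classifier into a single label."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three subspace classifiers on four EEG windows.
per_classifier = [
    ["seizure", "normal", "normal", "seizure"],
    ["seizure", "normal", "seizure", "normal"],
    ["normal", "normal", "seizure", "seizure"],
]

# zip(*...) regroups the labels so each tuple holds the votes for one window.
combined = [majority_vote(votes) for votes in zip(*per_classifier)]
```

    With an odd number of voters and two classes, every window has a strict majority, so no tie-breaking rule is needed in this sketch.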

  5. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    PubMed Central

    Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-01-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet’s effects on fish skin. PMID:29596375
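
    The relationship between the correct classification rate and the Kappa coefficient quoted above can be illustrated with a small sketch of Cohen's kappa. The confusion matrix is a hypothetical balanced two-class example chosen to give a CCR of 0.82; it is not the study's actual confusion matrix.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix.

    Rows are true classes, columns are predicted classes.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals per class.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical 2x2 confusion matrix for 100 fish, 50 per diet.
confusion = [[41, 9],
             [9, 41]]
ccr = (41 + 41) / 100
kappa = cohens_kappa(confusion)
```

    For two balanced classes the chance agreement is 0.5, so a CCR of 0.82 corresponds to a kappa of about 0.64, which is in line with the reported pairing of 82% and 0.65.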

  6. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    PubMed

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70%, respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.

  7. Self-Organizing Map Neural Network-Based Nearest Neighbor Position Estimation Scheme for Continuous Crystal PET Detectors

    NASA Astrophysics Data System (ADS)

    Wang, Yonggang; Li, Deng; Lu, Xiaoming; Cheng, Xinyi; Wang, Liwei

    2014-10-01

    Continuous crystal-based positron emission tomography (PET) detectors could be an ideal alternative to current high-resolution pixelated PET detectors if the issues of high-performance γ interaction position estimation and its real-time implementation are solved. Unfortunately, existing position estimators are not very feasible for implementation on a field-programmable gate array (FPGA). In this paper, we propose a new self-organizing map neural network-based nearest neighbor (SOM-NN) positioning scheme aiming not only at providing high performance, but also at being realistic for FPGA implementation. Benefitting from the SOM feature mapping mechanism, the large set of input reference events at each calibration position is approximated by a small set of prototypes, and the computation of the nearest neighbor search for unknown events is largely reduced. Using our experimental data, the scheme was evaluated, optimized and compared with the smoothed k-NN method. The spatial resolutions (full width at half maximum, FWHM) of the two methods, averaged over the center axis of the detector, were 1.87 ± 0.17 mm and 1.92 ± 0.09 mm, respectively. The test results show that the SOM-NN scheme has positioning performance equivalent to that of the smoothed k-NN method, but the amount of computation is only about one-tenth of that of the smoothed k-NN method. In addition, the algorithm structure of the SOM-NN scheme is more feasible for implementation on an FPGA. It has the potential to realize real-time position estimation on an FPGA with a high event-processing throughput.
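
    The idea of replacing the many reference events at each calibration position with a few prototypes, and then searching only those prototypes, can be sketched as below. As a loudly stated simplification, this sketch uses plain k-means (Lloyd's algorithm) rather than a self-organizing map to build the prototypes, and the two-channel "events" and position values are invented toy data.

```python
import math


def kmeans_prototypes(points, k, iterations=10):
    """Plain Lloyd k-means; the first k points seed the prototypes."""
    prototypes = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: math.dist(p, prototypes[j]))
            clusters[nearest].append(p)
        # Move each prototype to its cluster mean; keep it if the cluster is empty.
        prototypes = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else prototypes[j]
            for j, members in enumerate(clusters)
        ]
    return prototypes


def nearest_position(event, prototypes_by_position):
    """Return the calibration position owning the prototype closest to `event`."""
    return min(
        (math.dist(event, proto), position)
        for position, protos in prototypes_by_position.items()
        for proto in protos
    )[1]


# Hypothetical reference events (two signal channels) at two positions in mm.
reference_events = {
    0.0: [(0.9, 0.1), (1.0, 0.0), (1.1, 0.1), (1.0, 0.2)],
    5.0: [(0.1, 0.9), (0.0, 1.0), (0.1, 1.1), (0.2, 1.0)],
}
prototypes = {pos: kmeans_prototypes(events, k=2)
              for pos, events in reference_events.items()}
estimate = nearest_position((0.95, 0.1), prototypes)
```

    Here eight reference events are reduced to four prototypes, so each query needs half as many distance computations; the abstract reports a reduction to roughly one-tenth for the SOM-based scheme on real data.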

  8. Application of texture analysis method for mammogram density classification

    NASA Astrophysics Data System (ADS)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  9. Automated diagnosis of dry eye using infrared thermography images

    NASA Astrophysics Data System (ADS)

    Acharya, U. Rajendra; Tan, Jen Hong; Koh, Joel E. W.; Sudarshan, Vidya K.; Yeo, Sharon; Too, Cheah Loon; Chua, Chua Kuang; Ng, E. Y. K.; Tong, Louis

    2015-07-01

    Dry Eye (DE) is a condition of either decreased tear production or increased tear film evaporation. Prolonged DE damages the cornea, causing corneal scarring, thinning and perforation. There is no single uniform diagnostic test available to date; combinations of diagnostic tests have to be performed to diagnose DE. The current diagnostic methods available are subjective, uncomfortable and invasive. Hence in this paper, we have developed an efficient, fast and non-invasive technique for the automated identification of normal and DE classes using infrared thermography images. The features are extracted using a nonlinear method called Higher Order Spectra (HOS). Features are ranked using a t-test ranking strategy. These ranked features are fed to various classifiers, namely K-Nearest Neighbor (KNN), Naive Bayesian Classifier (NBC), Decision Tree (DT), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM), to select the best classifier using the minimum number of features. Our proposed system is able to identify the DE and normal classes automatically with a classification accuracy of 99.8%, sensitivity of 99.8%, and specificity of 99.8% for the left eye using PNN and KNN classifiers. We have also reported a classification accuracy of 99.8%, sensitivity of 99.9%, and specificity of 99.4% for the right eye using an SVM classifier with a polynomial kernel of order 2.

  10. Using Chou's pseudo amino acid composition based on approximate entropy and an ensemble of AdaBoost classifiers to predict protein subnuclear location.

    PubMed

    Jiang, Xiaoying; Wei, Rong; Zhao, Yanjun; Zhang, Tongliang

    2008-05-01

    Knowledge of subnuclear localization in eukaryotic cells is essential for understanding the life function of the nucleus. Developing prediction methods and tools for protein subnuclear localization has become an important research field in protein science because of the special characteristics of the cell nucleus. In this study, a novel approach is proposed to predict protein subnuclear localization. A protein sample is represented by a Pseudo Amino Acid (PseAA) composition based on the approximate entropy (ApEn) concept, which reflects the complexity of time series. A novel ensemble classifier is designed incorporating three AdaBoost classifiers. The base classifier algorithms in the three AdaBoost classifiers are decision stumps, a fuzzy K nearest neighbors classifier, and radial basis-support vector machines, respectively. Different PseAA compositions are used as input data for the different AdaBoost classifiers in the ensemble. A genetic algorithm is used to optimize the dimension and weight factor of the PseAA composition. Two datasets often used in published works are used to validate the performance of the proposed approach. The results obtained in the Jackknife cross-validation test are higher and more balanced than those of other methods on the same datasets. The promising results indicate that the proposed approach is effective and practical. It might become a useful tool in protein subnuclear localization. The software in Matlab and supplementary materials are available freely by contacting the corresponding author.

  11. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    PubMed

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.

  12. Identification of jasmine flower (Jasminum sp.) based on the shape of the flower using sobel edge and k-nearest neighbour

    NASA Astrophysics Data System (ADS)

    Qur’ania, A.; Sarinah, I.

    2018-03-01

    People often misidentify the type of jasmine by looking only at the white color of the flower, although not all white flowers are jasmine and not all jasmine flowers are white. Some jasmine is yellow, and some is white and purple. The aim of this research is to identify jasmine flowers (Jasminum sp.) from flower images based on the shape of the flower, using Sobel edge detection and k-Nearest Neighbor. Edge detection is used to detect the type of flower from the flower shape; it aims to enhance the appearance of boundaries in a digital image. The k-Nearest Neighbor method is used to assign test objects to the classes of the training objects nearest to them. The data used in this study are three types of jasmine, namely jasmine white (Jasminum sambac), jasmine gambir (Jasminum pubescens), and jasmine japan (Pseuderanthemum reticulatum). Testing on jasmine flower images resized to 50 × 50 pixels, 100 × 100 pixels and 150 × 150 pixels yields an accuracy of 84%. Tests of the k-NN method with the 5, 10 and 15 closest distances gave different accuracy rates: the 5 and 10 closest distances yielded the same accuracy of 84%, while the 15 closest distances resulted in a lower accuracy of 65.2%.
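
    The Sobel step mentioned in this abstract can be sketched as follows. This is a minimal illustration using the standard 3 × 3 Sobel kernels on a grayscale image stored as a list of lists; the function name `sobel_magnitude` and the toy image are assumptions for illustration, not the authors' implementation.

```python
import math

# Standard 3x3 Sobel kernels for horizontal (x) and vertical (y) gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image.

    Border pixels are left at 0.0 because the 3x3 window does not fit there.
    """
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


# Hypothetical toy image: dark left half, bright right half (one vertical edge).
toy = [[0, 0, 10, 10] for _ in range(4)]
edges = sobel_magnitude(toy)
```

    The resulting edge map (or shape features derived from it) would then be passed to a k-nearest-neighbour classifier; how the features are derived from the edges is not specified in the abstract.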

  13. Quantitative diagnosis of bladder cancer by morphometric analysis of HE images

    NASA Astrophysics Data System (ADS)

    Wu, Binlin; Nebylitsa, Samantha V.; Mukherjee, Sushmita; Jain, Manu

    2015-02-01

    In clinical practice, histopathological analysis of biopsied tissue is the main method for bladder cancer diagnosis and prognosis. The diagnosis is performed by a pathologist based on the morphological features in the image of a hematoxylin and eosin (HE) stained tissue sample. This manuscript proposes algorithms to perform morphometric analysis on the HE images, quantify the features in the images, and discriminate bladder cancers with different grades, i.e. high grade and low grade. The nuclei are separated from the background and other types of cells such as red blood cells (RBCs) and immune cells using manual outlining, color deconvolution and image segmentation. A mask of nuclei is generated for each image for quantitative morphometric analysis. The features of the nuclei in the mask image including size, shape, orientation, and their spatial distributions are measured. To quantify local clustering and alignment of nuclei, we propose a 1-nearest-neighbor (1-NN) algorithm which measures nearest neighbor distance and nearest neighbor parallelism. The global distributions of the features are measured using statistics of the proposed parameters. A linear support vector machine (SVM) algorithm is used to classify the high grade and low grade bladder cancers. The results show using a particular group of nuclei such as large ones, and combining multiple parameters can achieve better discrimination. This study shows the proposed approach can potentially help expedite pathological diagnosis by triaging potentially suspicious biopsies.
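
    The nearest-neighbor measurements described in this abstract might be sketched as below. The abstract does not define "parallelism", so this sketch assumes it is the absolute cosine of the orientation difference between a nucleus and its nearest neighbor (1 for parallel, 0 for perpendicular); the function name and the input layout are likewise hypothetical.

```python
import math


def nearest_neighbor_stats(nuclei):
    """For each nucleus (x, y, orientation in radians), return a tuple
    (distance to nearest neighbor, parallelism with that neighbor).

    Parallelism here is |cos(orientation difference)|, an assumed definition.
    """
    stats = []
    for i, (xi, yi, ti) in enumerate(nuclei):
        best = None
        for j, (xj, yj, tj) in enumerate(nuclei):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if best is None or d < best[0]:
                best = (d, abs(math.cos(ti - tj)))
        stats.append(best)
    return stats


# Hypothetical nuclei: two aligned neighbors and one perpendicular outlier.
nuclei = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, math.pi / 2)]
stats = nearest_neighbor_stats(nuclei)
```

    Summary statistics of these per-nucleus values (for example their mean and spread) could then serve as global features for the linear SVM.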

  14. Non-kinetic Targeting Risk Assessment Methodology (NKTRAM)

    DTIC Science & Technology

    2014-07-22

    kinetic inetic Targe J TGTs & E ement of ris s and dama ian / non-co Treaty Orga kinetic enga rce ( IATF ) t of selecting perational and prioritiza c...t e activities. T c engagemen is often inapp of a tank, is and Major S ( IATF ). ssessm opose a non ral Damage / munitions ting Risk As ) staff...prototype stage and is classified. As such, it will not be discussed or identified within this SL. In November 2013, the IATF red teamed the prototype

  15. Automated Classification of Phonological Errors in Aphasic Language

    PubMed Central

    Ahuja, Sanjeev B.; Reggia, James A.; Berndt, Rita S.

    1984-01-01

    Using heuristically-guided state space search, a prototype program has been developed to simulate and classify phonemic errors occurring in the speech of neurologically-impaired patients. Simulations are based on an interchangeable rule/operator set of elementary errors which represent a theory of phonemic processing faults. This work introduces and evaluates a novel approach to error simulation and classification, it provides a prototype simulation tool for neurolinguistic research, and it forms the initial phase of a larger research effort involving computer modelling of neurolinguistic processes.

  16. Classification of biosensor time series using dynamic time warping: applications in screening cancer cells with characteristic biomarkers.

    PubMed

    Rai, Shesh N; Trainor, Patrick J; Khosravi, Farhad; Kloecker, Goetz; Panchapakesan, Balaji

    2016-01-01

    The development of biosensors that produce time series data will facilitate improvements in biomedical diagnostics and in personalized medicine. The time series produced by these devices often contain characteristic features arising from biochemical interactions between the sample and the sensor. To use such characteristic features for determining sample class, similarity-based classifiers can be utilized. However, the construction of such classifiers is complicated by the variability in the time domains of such series, which renders traditional distance metrics such as Euclidean distance ineffective in distinguishing between biological variance and time domain variance. The dynamic time warping (DTW) algorithm is a sequence alignment algorithm that can be used to align two or more series to facilitate quantifying similarity. In this article, we evaluated the performance of DTW distance-based similarity classifiers for classifying time series that mimic electrical signals produced by nanotube biosensors. Simulation studies demonstrated the positive performance of such classifiers in discriminating between time series containing characteristic features that are obscured by noise in the intensity and time domains. We then applied a DTW distance-based k-nearest neighbors classifier to distinguish the presence/absence of a mesenchymal biomarker in cancer cells in buffy coats in a blinded test. Using a train-test approach, we find that the classifier had high sensitivity (90.9%) and specificity (81.8%) in differentiating between EpCAM-positive MCF7 cells spiked in buffy coats and those in plain buffy coats.
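
    A DTW distance-based k-nearest neighbors classifier of the kind described here can be sketched as follows. This is a minimal illustration using the textbook dynamic-programming recurrence with an absolute-difference local cost and no warping window; the function names and the toy series are assumptions, not the authors' code.

```python
from collections import Counter


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a[i-1] also matched to earlier b
                cost[i][j - 1],      # b[j-1] also matched to earlier a
                cost[i - 1][j - 1],  # both series advance
            )
    return cost[n][m]


def knn_predict(train, query, k=1):
    """train: list of (series, label). Majority vote among the k nearest."""
    ranked = sorted(train, key=lambda item: dtw_distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: a peak that is shifted in time still matches "peak".
train = [([0, 0, 5, 0, 0], "peak"), ([0, 0, 0, 0, 0], "flat")]
prediction = knn_predict(train, [0, 0, 0, 5, 0], k=1)
```

    The shifted query has Euclidean distance greater than zero from the "peak" template but a DTW distance of zero, which illustrates why warping helps when variance lies in the time domain.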

  17. Classifying dysmorphic syndromes by using artificial neural network based hierarchical decision tree.

    PubMed

    Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf

    2018-05-01

    Dysmorphic syndromes have different facial malformations. These malformations are significant for an early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define certain features of each syndrome by considering facial malformations and automatically classify Fragile X, Hurler, Prader Willi, Down and Wolf Hirschhorn syndromes and healthy groups. Reference points are marked on the face images, and ratios between the points' distances are used as features. We suggest a neural network based hierarchical decision tree structure to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50, 73 and 86.7% with the k-NN, ANN and hierarchical decision tree methods, respectively. The same images were then shown to a clinical expert, who achieved a recognition rate of 46.7%. We develop an efficient system that recognizes different syndrome types automatically and with high accuracy from simple, non-invasive imaging data, independently of the patient's age, sex and race. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.

  18. Identification of Anisomerous Motor Imagery EEG Signals Based on Complex Algorithms

    PubMed Central

    Zhang, Zhiwen; Duan, Feng; Zhou, Xin; Meng, Zixuan

    2017-01-01

    Motor imagery (MI) electroencephalograph (EEG) signals are widely applied in brain-computer interface (BCI). However, classified MI states are limited, and their classification accuracy rates are low because of the characteristics of nonlinearity and nonstationarity. This study proposes a novel MI pattern recognition system that is based on complex algorithms for classifying MI EEG signals. In electrooculogram (EOG) artifact preprocessing, band-pass filtering is performed to obtain the frequency band of MI-related signals, and then, canonical correlation analysis (CCA) combined with wavelet threshold denoising (WTD) is used for EOG artifact preprocessing. We propose a regularized common spatial pattern (R-CSP) algorithm for EEG feature extraction by incorporating the principle of generic learning. A new classifier combining the K-nearest neighbor (KNN) and support vector machine (SVM) approaches is used to classify four anisomerous states, namely, imaginary movements with the left hand, right foot, and right shoulder and the resting state. The highest classification accuracy rate is 92.5%, and the average classification accuracy rate is 87%. The proposed complex algorithm identification method can significantly improve the identification rate of the minority samples and the overall classification performance. PMID:28874909

  19. Gap Shape Classification using Landscape Indices and Multivariate Statistics

    PubMed Central

    Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

    2016-01-01

    This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks’ lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap. PMID:27901127

  20. Software platform for managing the classification of error- related potentials of observers

    NASA Astrophysics Data System (ADS)

    Asvestas, P.; Ventouras, E.-C.; Kostopoulos, S.; Sidiropoulos, K.; Korfiatis, V.; Korda, A.; Uzunolglu, A.; Karanasiou, I.; Kalatzis, I.; Matsopoulos, G.

    2015-09-01

    Human learning is partly based on observation. Electroencephalographic recordings of subjects who perform acts (actors) or observe actors (observers), contain a negative waveform in the Evoked Potentials (EPs) of the actors that commit errors and of observers who observe the error-committing actors. This waveform is called the Error-Related Negativity (ERN). Its detection has applications in the context of Brain-Computer Interfaces. The present work describes a software system developed for managing EPs of observers, with the aim of classifying them into observations of either correct or incorrect actions. It consists of an integrated platform for the storage, management, processing and classification of EPs recorded during error-observation experiments. The system was developed using C# and the following development tools and frameworks: MySQL, .NET Framework, Entity Framework and Emgu CV, for interfacing with the machine learning library of OpenCV. Up to six features can be computed per EP recording per electrode. The user can select among various feature selection algorithms and then proceed to train one of three types of classifiers: Artificial Neural Networks, Support Vector Machines, k-nearest neighbour. Next the classifier can be used for classifying any EP curve that has been inputted to the database.

  1. Gap Shape Classification using Landscape Indices and Multivariate Statistics.

    PubMed

    Wu, Chih-Da; Cheng, Chi-Chuan; Chang, Che-Chang; Lin, Chinsu; Chang, Kun-Cheng; Chuang, Yung-Chung

    2016-11-30

    This study proposed a novel methodology to classify the shape of gaps using landscape indices and multivariate statistics. Patch-level indices were used to collect the qualified shape and spatial configuration characteristics for canopy gaps in the Lienhuachih Experimental Forest in Taiwan in 1998 and 2002. Non-hierarchical cluster analysis was used to assess the optimal number of gap clusters and canonical discriminant analysis was used to generate the discriminant functions for canopy gap classification. The gaps for the two periods were optimally classified into three categories. In general, gap type 1 had a more complex shape, gap type 2 was more elongated and gap type 3 had the largest gaps that were more regular in shape. The results were evaluated using Wilks' lambda as satisfactory (p < 0.001). The agreement rate of confusion matrices exceeded 96%. Differences in gap characteristics between the classified gap types that were determined using a one-way ANOVA showed a statistical significance in all patch indices (p = 0.00), except for the Euclidean nearest neighbor distance (ENN) in 2002. Taken together, these results demonstrated the feasibility and applicability of the proposed methodology to classify the shape of a gap.

  2. Real-time object-to-features vectorisation via Siamese neural networks

    NASA Astrophysics Data System (ADS)

    Fedorenko, Fedor; Usilin, Sergey

    2017-03-01

    Object-to-features vectorisation is a hard problem to solve for objects that can be hard to distinguish. Siamese and Triplet neural networks are among the more recent tools used for such tasks. However, most of the networks used are very deep and prove hard to compute in the Internet of Things setting. In this paper, a computationally efficient neural network is proposed for real-time object-to-features vectorisation into a Euclidean metric space. We use the L2 distance to reflect feature vector similarity during both training and testing. In this way, the feature vectors we develop can be easily classified using a K-Nearest Neighbours classifier. Such an approach can be used to train networks to vectorise "problematic" objects such as images of human faces and keypoint image patches, for example keypoints on Arctic maps and surrounding marine areas.

  3. ISE-based sensor array system for classification of foodstuffs

    NASA Astrophysics Data System (ADS)

    Ciosek, Patrycja; Sobanski, Tomasz; Augustyniak, Ewa; Wróblewski, Wojciech

    2006-01-01

    A system composed of an array of polymeric membrane ion-selective electrodes and a pattern recognition block—a so-called 'electronic tongue'—was used for the classification of liquid samples: milk, fruit juice and tonic. The task of this system was to automatically recognize a brand of the product. To analyze the measurement set-up responses various non-parametric classifiers such as k-nearest neighbours, a feedforward neural network and a probabilistic neural network were used. In order to enhance the classification ability of the system, standard model solutions of salts were measured (in order to take into account any variation in time of the working parameters of the sensors). This system was capable of recognizing the brand of the products with accuracy ranging from 68% to 100% (in the case of the best classifier).

  4. Just-in-time adaptive classifiers-part II: designing the classifier.

    PubMed

    Alippi, Cesare; Roveri, Manuel

    2008-12-01

    Aging effects, environmental changes, thermal drifts, and soft and hard faults affect physical systems by changing their nature and behavior over time. To cope with a process evolution adaptive solutions must be envisaged to track its dynamics; in this direction, adaptive classifiers are generally designed by assuming the stationary hypothesis for the process generating the data with very few results addressing nonstationary environments. This paper proposes a methodology based on k-nearest neighbor (NN) classifiers for designing adaptive classification systems able to react to changing conditions just-in-time (JIT), i.e., exactly when it is needed. k-NN classifiers have been selected for their computational-free training phase, the possibility to easily estimate the model complexity k and keep under control the computational complexity of the classifier through suitable data reduction mechanisms. A JIT classifier requires a temporal detection of a (possible) process deviation (aspect tackled in a companion paper) followed by an adaptive management of the knowledge base (KB) of the classifier to cope with the process change. The novelty of the proposed approach resides in the general framework supporting the real-time update of the KB of the classification system in response to novel information coming from the process both in stationary conditions (accuracy improvement) and in nonstationary ones (process tracking) and in providing a suitable estimate of k. It is shown that the classification system grants consistency once the change targets the process generating the data in a new stationary state, as it is the case in many real applications.

  5. SVM based colon polyps classifier in a wireless active stereo endoscope.

    PubMed

    Ayoub, J; Granado, B; Mhanna, Y; Romain, O

    2010-01-01

    This work focuses on the recognition of three-dimensional colon polyps captured by an active stereo vision sensor. The detection algorithm consists of an SVM classifier trained on robust feature descriptors. The study is related to Cyclope, a prototype sensor that allows real-time 3D object reconstruction and continues to be optimized technically to improve its classification task of differentiating between hyperplastic and adenomatous polyps. Experimental results were encouraging and show a correct classification rate of approximately 97%. The work contains detailed statistics about the detection rate and the computing complexity. Inspired by the intensity histogram, the work shows a new approach that extracts a set of features based on the depth histogram and combines stereo measurement with SVM classifiers to correctly classify benign and malignant polyps.

  6. Consistent latent position estimation and vertex classification for random dot product graphs.

    PubMed

    Sussman, Daniel L; Tang, Minh; Priebe, Carey E

    2014-01-01

    In this work, we show that using the eigen-decomposition of the adjacency matrix, we can consistently estimate latent positions for random dot product graphs provided the latent positions are i.i.d. from some distribution. If class labels are observed for a number of vertices tending to infinity, then we show that the remaining vertices can be classified with error converging to Bayes optimal using the $k$-nearest-neighbors classification rule. We evaluate the proposed methods on simulated data and a graph derived from Wikipedia.

  7. Texture Analysis and Machine Learning for Detecting Myocardial Infarction in Noncontrast Low-Dose Computed Tomography: Unveiling the Invisible.

    PubMed

    Mannil, Manoj; von Spiczak, Jochen; Manka, Robert; Alkadhi, Hatem

    2018-06-01

    The aim of this study was to test whether texture analysis and machine learning enable the detection of myocardial infarction (MI) on non-contrast-enhanced low radiation dose cardiac computed tomography (CCT) images. In this institutional review board-approved retrospective study, we included non-contrast-enhanced electrocardiography-gated low radiation dose CCT image data (effective dose, 0.5 mSv) acquired for the purpose of calcium scoring of 27 patients with acute MI (9 female patients; mean age, 60 ± 12 years), 30 patients with chronic MI (8 female patients; mean age, 68 ± 13 years), and in 30 subjects (9 female patients; mean age, 44 ± 6 years) without cardiac abnormality, hereafter termed controls. Texture analysis of the left ventricle was performed using free-hand regions of interest, and texture features were classified twice (Model I: controls versus acute MI versus chronic MI; Model II: controls versus acute and chronic MI). For both classifications, 6 commonly used machine learning classifiers were used: decision tree C4.5 (J48), k-nearest neighbors, locally weighted learning, RandomForest, sequential minimal optimization, and an artificial neural network employing deep learning. In addition, 2 blinded, independent readers visually assessed noncontrast CCT images for the presence or absence of MI. In Model I, best classification results were obtained using the k-nearest neighbors classifier (sensitivity, 69%; specificity, 85%; false-positive rate, 0.15). In Model II, the best classification results were found with the locally weighted learning classification (sensitivity, 86%; specificity, 81%; false-positive rate, 0.19) with an area under the curve from receiver operating characteristics analysis of 0.78. In comparison, both readers were not able to identify MI in any of the noncontrast, low radiation dose CCT images. 
    This study indicates that texture analysis and machine learning can detect MI on noncontrast low radiation dose CCT images that is not visible to the radiologist's eye.
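
    The sensitivity, specificity and false-positive rate quoted in this abstract are related by simple confusion-matrix arithmetic; in particular the false-positive rate equals 1 minus the specificity, which matches the reported pairs (85% and 0.15, 81% and 0.19). A minimal sketch, where the function name and the counts are made up for illustration and are not the study's data:

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true-positive rate
        "specificity": tn / (tn + fp),          # true-negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = binary_metrics(tp=40, fp=5, tn=45, fn=10)
```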

  8. A comparison of supervised classification methods for the prediction of substrate type using multibeam acoustic and legacy grain-size data.

    PubMed

    Stephens, David; Diesing, Markus

    2014-01-01

    Detailed seabed substrate maps are increasingly in demand for effective planning and management of marine ecosystems and resources. It has become common to use remotely sensed multibeam echosounder data in the form of bathymetry and acoustic backscatter in conjunction with ground-truth sampling data to inform the mapping of seabed substrates. Whilst, until recently, such data sets have typically been classified by expert interpretation, it is now obvious that more objective, faster and repeatable methods of seabed classification are required. This study compares the performances of a range of supervised classification techniques for predicting substrate type from multibeam echosounder data. The study area is located in the North Sea, off the north-east coast of England. A total of 258 ground-truth samples were classified into four substrate classes. Multibeam bathymetry and backscatter data, and a range of secondary features derived from these datasets were used in this study. Six supervised classification techniques were tested: Classification Trees, Support Vector Machines, k-Nearest Neighbour, Neural Networks, Random Forest and Naive Bayes. Each classifier was trained multiple times using different input features, including i) the two primary features of bathymetry and backscatter, ii) a subset of the features chosen by a feature selection process and iii) all of the input features. The predictive performances of the models were validated using a separate test set of ground-truth samples. The statistical significance of model performances relative to a simple baseline model (Nearest Neighbour predictions on bathymetry and backscatter) were tested to assess the benefits of using more sophisticated approaches. The best performing models were tree based methods and Naive Bayes which achieved accuracies of around 0.8 and kappa coefficients of up to 0.5 on the test set. 
    The models that used all input features generally did not perform well, highlighting the need for some means of feature selection.
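
    The abstract reports both accuracy and the kappa coefficient. Cohen's kappa corrects the observed agreement for the agreement expected by chance, so a model can reach a high accuracy with only a moderate kappa when classes are imbalanced. A minimal sketch follows; the function name and the confusion matrix are invented for illustration and are not taken from the study.

```python
def cohen_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: truth, cols: prediction)."""
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal (the accuracy).
    observed = sum(confusion[i][i] for i in range(size)) / total
    # Chance agreement: product of matching row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(size)
    ) / (total * total)
    return (observed - expected) / (1 - expected)


# Hypothetical imbalanced two-class result: accuracy is 0.8, kappa only 0.375.
example = [[70, 10], [10, 10]]
kappa = cohen_kappa(example)
```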

  9. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients

    NASA Astrophysics Data System (ADS)

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.

  10. voomDDA: discovery of diagnostic biomarkers and classification of RNA-seq data.

    PubMed

    Zararsiz, Gokmen; Goksuluk, Dincer; Klaus, Bernd; Korkmaz, Selcuk; Eldem, Vahap; Karabulut, Erdem; Ozturk, Ahmet

    2017-01-01

    RNA-Seq is a recent and efficient technique that uses the capabilities of next-generation sequencing technology for characterizing and quantifying transcriptomes. One important task using gene-expression data is to identify a small subset of genes that can be used to build diagnostic classifiers particularly for cancer diseases. Microarray based classifiers are not directly applicable to RNA-Seq data due to its discrete nature. Overdispersion is another problem that requires careful modeling of mean and variance relationship of the RNA-Seq data. In this study, we present voomDDA classifiers: variance modeling at the observational level (voom) extensions of the nearest shrunken centroids (NSC) and the diagonal discriminant classifiers. VoomNSC is one of these classifiers and brings voom and NSC approaches together for the purpose of gene-expression based classification. For this purpose, we propose weighted statistics and put these weighted statistics into the NSC algorithm. The VoomNSC is a sparse classifier that models the mean-variance relationship using the voom method and incorporates voom's precision weights into the NSC classifier via weighted statistics. A comprehensive simulation study was designed and four real datasets are used for performance assessment. The overall results indicate that voomNSC performs as the sparsest classifier. It also provides the most accurate results together with power-transformed Poisson linear discriminant analysis, rlog transformed support vector machines and random forests algorithms. In addition to prediction purposes, the voomNSC classifier can be used to identify the potential diagnostic biomarkers for a condition of interest. Through this work, statistical learning methods proposed for microarrays can be reused for RNA-Seq data. An interactive web application is freely available at http://www.biosoft.hacettepe.edu.tr/voomDDA/.
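
    The sparsity of nearest shrunken centroid classifiers comes from soft-thresholding the standardized differences between class centroids and the overall centroid: values are moved toward zero by a threshold and set to zero if they do not exceed it. The sketch below shows only this generic shrinkage step; it does not reproduce voomNSC's weighted statistics, and the function names, the threshold `delta` and the numbers are hypothetical.

```python
def soft_threshold(d, delta):
    """Shrink d toward zero by delta; values within [-delta, delta] become 0."""
    shrunk = abs(d) - delta
    if shrunk <= 0:
        return 0.0
    return shrunk if d > 0 else -shrunk


def selected_genes(class_diffs, delta):
    """Indices of genes whose shrunken difference is nonzero in at least one class.

    class_diffs: one list of standardized differences per class, gene by gene.
    """
    n_genes = len(class_diffs[0])
    return [
        g for g in range(n_genes)
        if any(soft_threshold(diffs[g], delta) != 0.0 for diffs in class_diffs)
    ]


# Hypothetical standardized differences for 4 genes in 2 classes.
diffs = [[2.5, 0.4, -0.2, -1.8], [-2.5, -0.4, 0.2, 1.8]]
kept = selected_genes(diffs, delta=1.0)
```

    Genes whose differences are shrunk to zero in every class drop out of the classification rule, which is why a larger threshold yields a sparser classifier.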

  11. Proposing an adaptive mutation to improve XCSF performance to classify ADHD and BMD patients.

    PubMed

    Sadatnezhad, Khadijeh; Boostani, Reza; Ghanizadeh, Ahmad

    2010-12-01

    There is extensive overlap of clinical symptoms observed among children with bipolar mood disorder (BMD) and those with attention deficit hyperactivity disorder (ADHD). Thus, diagnosis according to clinical symptoms cannot be very accurate. It is therefore desirable to develop quantitative criteria for automatic discrimination between these disorders. This study is aimed at designing an efficient decision maker to accurately classify ADHD and BMD patients by analyzing their electroencephalogram (EEG) signals. In this study, 22 channels of EEGs have been recorded from 21 subjects with ADHD and 22 individuals with BMD. Several informative features, such as fractal dimension, band power and autoregressive coefficients, were extracted from the recorded signals. Considering the multimodal overlapping distribution of the obtained features, linear discriminant analysis (LDA) was used to reduce the input dimension in a more separable space to make it more appropriate for the proposed classifier. A piecewise linear classifier based on the extended classifier system for function approximation (XCSF) was modified by developing an adaptive mutation rate, which was proportional to the genotypic content of best individuals and their fitness in each generation. The proposed operator controlled the trade-off between exploration and exploitation while maintaining the diversity in the classifier's population to avoid premature convergence. To assess the effectiveness of the proposed scheme, the extracted features were applied to support vector machine, LDA, nearest neighbor and XCSF classifiers. To evaluate the method, a noisy environment was simulated with different noise amplitudes. It is shown that the results of the proposed technique are more robust as compared to conventional classifiers. Statistical tests demonstrate that the proposed classifier is a promising method for discriminating between ADHD and BMD patients.

  12. Automated identification of diagnosis and co-morbidity in clinical records.

    PubMed

    Cano, C; Blanco, A; Peshkin, L

    2009-01-01

    Automated understanding of clinical records is a challenging task involving various legal and technical difficulties. Clinical free text is inherently redundant, unstructured, and full of acronyms, abbreviations and domain-specific language which make it challenging to mine automatically. There is much effort in the field focused on creating specialized ontologies, lexicons and heuristics based on expert knowledge of the domain. However, ad-hoc solutions generalize poorly across diseases or diagnoses. This paper presents a successful approach for rapid prototyping of a diagnosis classifier based on a popular computational linguistics platform. The corpus consists of several hundred full-length discharge summaries provided by Partners Healthcare. The goal is to identify a diagnosis and assign co-morbidity. Our approach is based on the rapid implementation of a logistic regression classifier using an existing toolkit: LingPipe (http://alias-i.com/lingpipe). We implement and compare three different classifiers. The baseline approach uses character 5-grams as features. The second approach uses a bag-of-words representation enriched with a small additional set of features. The third approach reduces the feature set to the most informative features according to their information content. The proposed systems achieve high performance (average F-micro 0.92) for the task. We discuss the relative merit of the three classifiers. Supplementary material with detailed results is available at: http://decsai.ugr.es/~ccano/LR/supplementary_material/ We show that our methodology for rapid prototyping of a domain-unaware system is effective for building an accurate classifier for clinical records.

  13. Cooperative light-induced molecular movements of highly ordered azobenzene self-assembled monolayers.

    PubMed

    Pace, Giuseppina; Ferri, Violetta; Grave, Christian; Elbing, Mark; von Hänisch, Carsten; Zharnikov, Michael; Mayor, Marcel; Rampi, Maria Anita; Samorì, Paolo

    2007-06-12

    Photochromic systems can convert light energy into mechanical energy, thus they can be used as building blocks for the fabrication of prototypes of molecular devices that are based on the photomechanical effect. Hitherto a controlled photochromic switch on surfaces has been achieved either on isolated chromophores or within assemblies of randomly arranged molecules. Here we show by scanning tunneling microscopy imaging the photochemical switching of a new terminally thiolated azobiphenyl rigid rod molecule. Interestingly, the switching of entire molecular 2D crystalline domains is observed, which is ruled by the interactions between nearest neighbors. This observation of azobenzene-based systems displaying collective switching might be of interest for applications in high-density data storage.

  14. Hamiltonian thermostats fail to promote heat flow

    NASA Astrophysics Data System (ADS)

    Hoover, Wm. G.; Hoover, Carol G.

    2013-12-01

    Hamiltonian mechanics can be used to constrain temperature simultaneously with energy. We illustrate the interesting situations that develop when two different temperatures are imposed within a composite Hamiltonian system. The model systems we treat are ϕ4 chains, with quartic tethers and quadratic nearest-neighbor Hooke's-law interactions. This model is known to satisfy Fourier's law. Our prototypical problem sandwiches a Newtonian subsystem between hot and cold Hamiltonian reservoir regions. We have characterized four different Hamiltonian reservoir types. There is no tendency for any of these two-temperature Hamiltonian simulations to transfer heat from the hot to the cold degrees of freedom. Evidently steady heat flow simulations require energy sources and sinks, and are therefore incompatible with Hamiltonian mechanics.
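    For orientation, a chain of this kind is often written as follows. This is only a sketch: the symbols (mass m, tether strength \kappa_4, spring constant \kappa_2, displacement q_i of particle i from its lattice site) and the prefactors 1/4 and 1/2 are assumptions chosen for illustration, not values taken from the abstract.

```latex
\[
  H \;=\; \sum_{i=1}^{N} \left[ \frac{p_i^{2}}{2m} + \frac{\kappa_4}{4}\, q_i^{4} \right]
  \;+\; \sum_{i=1}^{N-1} \frac{\kappa_2}{2}\,\bigl(q_{i+1} - q_i\bigr)^{2}
\]
```

    The first sum holds the kinetic energy and the quartic tether of each particle; the second sum holds the quadratic (Hooke's-law) coupling between nearest neighbors.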

  15. Hamiltonian dynamics of thermostated systems: two-temperature heat-conducting phi4 chains.

    PubMed

    Hoover, Wm G; Hoover, Carol G

    2007-04-28

    We consider and compare four Hamiltonian formulations of thermostated mechanics, three of them kinetic, and the other one configurational. Though all four approaches "work" at equilibrium, their application to many-body nonequilibrium simulations can fail to provide a proper flow of heat. All the Hamiltonian formulations considered here are applied to the same prototypical two-temperature "phi4" model of a heat-conducting chain. This model incorporates nearest-neighbor Hooke's-Law interactions plus a quartic tethering potential. Physically correct results, obtained with the isokinetic Gaussian and Nose-Hoover thermostats, are compared with two other Hamiltonian results. The latter results, based on constrained Hamiltonian thermostats, fail to model correctly the flow of heat.

  16. A binary linear programming formulation of the graph edit distance.

    PubMed

    Justice, Derek; Hero, Alfred

    2006-08-01

    A binary linear programming formulation of the graph edit distance for unweighted, undirected graphs with vertex attributes is derived and applied to a graph recognition problem. A general formulation for editing graphs is used to derive a graph edit distance that is proven to be a metric, provided the cost function for individual edit operations is a metric. Then, a binary linear program is developed for computing this graph edit distance, and polynomial time methods for determining upper and lower bounds on the solution of the binary program are derived by applying solution methods for standard linear programming and the assignment problem. A recognition problem of comparing a sample input graph to a database of known prototype graphs in the context of a chemical information system is presented as an application of the new method. The costs associated with various edit operations are chosen by using a minimum normalized variance criterion applied to pairwise distances between nearest neighbors in the database of prototypes. The new metric is shown to perform quite well in comparison to existing metrics when applied to a database of chemical graphs.

  17. Discriminative Hierarchical K-Means Tree for Large-Scale Image Classification.

    PubMed

    Chen, Shizhi; Yang, Xiaodong; Tian, Yingli

    2015-09-01

    A key challenge in large-scale image classification is how to achieve efficiency in terms of both computation and memory without compromising classification accuracy. The learning-based classifiers achieve the state-of-the-art accuracies, but have been criticized for the computational complexity that grows linearly with the number of classes. The nonparametric nearest neighbor (NN)-based classifiers naturally handle large numbers of categories, but incur prohibitively expensive computation and memory costs. In this brief, we present a novel classification scheme, i.e., discriminative hierarchical K-means tree (D-HKTree), which combines the advantages of both learning-based and NN-based classifiers. The complexity of the D-HKTree only grows sublinearly with the number of categories, which is much better than the recent hierarchical support vector machines-based methods. The memory requirement is an order of magnitude less than that of the recent Naïve Bayesian NN-based approaches. The proposed D-HKTree classification scheme is evaluated on several challenging benchmark databases and achieves the state-of-the-art accuracies, with significantly lower computation cost and memory requirement.

  18. Speaker gender identification based on majority vote classifiers

    NASA Astrophysics Data System (ADS)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best-performing classification method or searching for the optimum tuning of a single classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models, including decision tree, discriminant analysis, naive Bayes, support vector machine and k-nearest neighbor, were tested. The three best-performing classifiers among the five contribute by majority voting between their scores. Experiments were performed on three different datasets spoken in three languages (English, German and Arabic) in order to validate the language independence of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
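    The majority-voting step described in this abstract can be sketched as below. The sketch assumes hard label votes from three already-trained classifiers; the abstract speaks of voting "between their scores", so the exact fusion rule may differ. The names `majority_vote`, `fuse`, and the example outputs are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most classifiers for one sample.

    Ties go to the label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def fuse(per_classifier_labels):
    """Fuse the label sequences of several classifiers, sample by sample."""
    return [majority_vote(votes) for votes in zip(*per_classifier_labels)]


# Hypothetical outputs of the three retained classifiers on four utterances.
svm_out = ["male", "female", "female", "male"]
knn_out = ["male", "male", "female", "female"]
tree_out = ["female", "female", "female", "male"]

fused = fuse([svm_out, knn_out, tree_out])
print(fused)
```

    With an odd number of classifiers and two classes, a tie cannot occur, which is one practical reason to keep three voters.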

  19. A Robust and Fast Computation Touchless Palm Print Recognition System Using LHEAT and the IFkNCN Classifier

    PubMed Central

    Jaafar, Haryati; Ibrahim, Salwani; Ramli, Dzati Athiar

    2015-01-01

    Mobile implementation is a current trend in biometric design. This paper proposes a new approach to palm print recognition, in which smart phones are used to capture palm print images at a distance. A touchless system was developed because of public demand for privacy and sanitation. Robust hand tracking, image enhancement, and fast computation processing algorithms are required for effective touchless and mobile-based recognition. In this project, hand tracking and the region of interest (ROI) extraction method were discussed. A sliding neighborhood operation with local histogram equalization, followed by a local adaptive thresholding or LHEAT approach, was proposed in the image enhancement stage to manage low-quality palm print images. To accelerate the recognition process, a new classifier, improved fuzzy-based k nearest centroid neighbor (IFkNCN), was implemented. By removing outliers and reducing the amount of training data, this classifier exhibited faster computation. Our experimental results demonstrate that a touchless palm print system using LHEAT and IFkNCN achieves a promising recognition rate of 98.64%. PMID:26113861

  20. Automated diagnosis of epilepsy using CWT, HOS and texture parameters.

    PubMed

    Acharya, U Rajendra; Yanti, Ratna; Zheng, Jia Wei; Krishnan, M Muthu Rama; Tan, Jen Hong; Martis, Roshan Joy; Lim, Choo Min

    2013-06-01

    Epilepsy is a chronic brain disorder which manifests as recurrent seizures. Electroencephalogram (EEG) signals are generally analyzed to study the characteristics of epileptic seizures. In this work, we propose a method for the automated classification of EEG signals into normal, interictal and ictal classes using Continuous Wavelet Transform (CWT), Higher Order Spectra (HOS) and textures. First the CWT plot was obtained for the EEG signals and then the HOS and texture features were extracted from these plots. Then the statistically significant features were fed to four classifiers namely Decision Tree (DT), K-Nearest Neighbor (KNN), Probabilistic Neural Network (PNN) and Support Vector Machine (SVM) to select the best classifier. We observed that the SVM classifier with Radial Basis Function (RBF) kernel function yielded the best results with an average accuracy of 96%, average sensitivity of 96.9% and average specificity of 97% for 23.6 s duration of EEG data. Our proposed technique can be used as an automatic seizure monitoring software. It can also assist the doctors to cross check the efficacy of their prescribed drugs.

  1. SVM feature selection based rotation forest ensemble classifiers to improve computer-aided diagnosis of Parkinson disease.

    PubMed

    Ozcift, Akin

    2012-08-01

    Parkinson disease (PD) is an age-related deterioration of certain nerve systems, which affects movement, balance, and muscle control of clients. PD is one of the common diseases which affect 1% of people older than 60 years. A new classification scheme based on support vector machine (SVM) selected features to train rotation forest (RF) ensemble classifiers is presented for improving diagnosis of PD. The dataset contains records of voice measurements from 31 people, 23 with PD and each record in the dataset is defined with 22 features. The diagnosis model first makes use of a linear SVM to select ten most relevant features from 22. As a second step of the classification model, six different classifiers are trained with the subset of features. Subsequently, at the third step, the accuracies of classifiers are improved by the utilization of RF ensemble classification strategy. The results of the experiments are evaluated using three metrics; classification accuracy (ACC), Kappa Error (KE) and Area under the Receiver Operating Characteristic (ROC) Curve (AUC). Performance measures of two base classifiers, i.e. KStar and IBk, demonstrated an apparent increase in PD diagnosis accuracy compared to similar studies in literature. After all, application of RF ensemble classification scheme improved PD diagnosis in 5 of 6 classifiers significantly. We, numerically, obtained about 97% accuracy in RF ensemble of IBk (a K-Nearest Neighbor variant) algorithm, which is a quite high performance for Parkinson disease diagnosis.

  2. Dissimilarity representations in lung parenchyma classification

    NASA Astrophysics Data System (ADS)

    Sørensen, Lauge; de Bruijne, Marleen

    2009-02-01

    A good problem representation is important for a pattern recognition system to be successful. The traditional approach to statistical pattern recognition is feature representation. More specifically, objects are represented by a number of features in a feature vector space, and classifiers are built in this representation. This is also the general trend in lung parenchyma classification in computed tomography (CT) images, where the features often are measures on feature histograms. Instead, we propose to build normal density based classifiers in dissimilarity representations for lung parenchyma classification. This allows for the classifiers to work on dissimilarities between objects, which might be a more natural way of representing lung parenchyma. In this context, dissimilarity is defined between CT regions of interest (ROIs). ROIs are represented by their CT attenuation histogram and ROI dissimilarity is defined as a histogram dissimilarity measure between the attenuation histograms. In this setting, the full histograms are utilized according to the chosen histogram dissimilarity measure. We apply this idea to classification of different emphysema patterns as well as normal, healthy tissue. Two dissimilarity representation approaches as well as different histogram dissimilarity measures are considered. The approaches are evaluated on a set of 168 CT ROIs using normal density based classifiers all showing good performance. Compared to using histogram dissimilarity directly as distance in a k nearest neighbor classifier, which achieves a classification accuracy of 92.9%, the best dissimilarity representation based classifier is significantly better with a classification accuracy of 97.0% (p = 0.046).
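    The baseline mentioned at the end of this abstract, a k nearest neighbor classifier that uses a histogram dissimilarity directly as its distance, can be illustrated with a minimal sketch. The L1 histogram distance, the three-bin histograms, and the labels below are assumptions made for illustration; the abstract does not say which dissimilarity measures were used.

```python
from collections import Counter


def l1_dissimilarity(h1, h2):
    """Sum of absolute bin differences between two normalized histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def knn_classify(query, train, k=3, dissimilarity=l1_dissimilarity):
    """Label `query` by a vote among its k least-dissimilar training items.

    `train` is a list of (histogram, label) pairs.
    """
    ranked = sorted(train, key=lambda item: dissimilarity(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical attenuation histograms (three bins each) with labels.
train = [
    ([0.7, 0.2, 0.1], "emphysema"),
    ([0.6, 0.3, 0.1], "emphysema"),
    ([0.1, 0.3, 0.6], "normal"),
    ([0.2, 0.2, 0.6], "normal"),
    ([0.1, 0.2, 0.7], "normal"),
]

print(knn_classify([0.65, 0.25, 0.10], train))
print(knn_classify([0.15, 0.25, 0.60], train))
```

    Because the dissimilarity is passed in as a function, any other histogram measure can be swapped in without touching the voting logic.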

  3. Application of recurrence quantification analysis for the automated identification of epileptic EEG signals.

    PubMed

    Acharya, U Rajendra; Sree, S Vinitha; Chattopadhyay, Subhagata; Yu, Wenwei; Ang, Peng Chuan Alvin

    2011-06-01

    Epilepsy is a common neurological disorder that is characterized by the recurrence of seizures. Electroencephalogram (EEG) signals are widely used to diagnose seizures. Because of the non-linear and dynamic nature of the EEG signals, it is difficult to effectively decipher the subtle changes in these signals by visual inspection and by using linear techniques. Therefore, non-linear methods are being researched to analyze the EEG signals. In this work, we use the recorded EEG signals in Recurrence Plots (RP), and extract Recurrence Quantification Analysis (RQA) parameters from the RP in order to classify the EEG signals into normal, ictal, and interictal classes. Recurrence Plot (RP) is a graph that shows all the times at which a state of the dynamical system recurs. Studies have reported significantly different RQA parameters for the three classes. However, more studies are needed to develop classifiers that use these promising features and present good classification accuracy in differentiating the three types of EEG segments. Therefore, in this work, we have used ten RQA parameters to quantify the important features in the EEG signals. These features were fed to seven different classifiers: Support vector machine (SVM), Gaussian Mixture Model (GMM), Fuzzy Sugeno Classifier, K-Nearest Neighbor (KNN), Naive Bayes Classifier (NBC), Decision Tree (DT), and Radial Basis Probabilistic Neural Network (RBPNN). Our results show that the SVM classifier was able to identify the EEG class with an average efficiency of 95.6%, sensitivity and specificity of 98.9% and 97.8%, respectively.

  4. Detecting falls with wearable sensors using machine learning techniques.

    PubMed

    Özdemir, Ahmet Turan; Barshan, Billur

    2014-06-18

    Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
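    Sensitivity, specificity, and accuracy, the three figures reported in this abstract, follow directly from the confusion counts of a binary fall/ADL decision. The sketch below shows the standard definitions; the function name `binary_metrics` and the toy labels are hypothetical, and the code assumes both classes occur in `y_true`.

```python
def binary_metrics(y_true, y_pred, positive="fall"):
    """Return (sensitivity, specificity, accuracy) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)      # fraction of true falls that were detected
    specificity = tn / (tn + fp)      # fraction of ADLs correctly left unflagged
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Toy labels: four falls and four activities of daily living (ADLs).
y_true = ["fall", "fall", "fall", "adl", "adl", "adl", "adl", "fall"]
y_pred = ["fall", "fall", "adl", "adl", "adl", "fall", "adl", "fall"]

sens, spec, acc = binary_metrics(y_true, y_pred)
print(sens, spec, acc)
```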

  5. Classifying depression patients and normal subjects using machine learning techniques and nonlinear features from EEG signal.

    PubMed

    Hosseinifard, Behshad; Moradi, Mohammad Hassan; Rostami, Reza

    2013-03-01

    Diagnosing depression in the early curable stages is very important and may even save the life of a patient. In this paper, we study nonlinear analysis of EEG signal for discriminating depression patients and normal controls. Forty-five unmedicated depressed patients and 45 normal subjects participated in this study. Power of four EEG bands and four nonlinear features including detrended fluctuation analysis (DFA), Higuchi fractal, correlation dimension and Lyapunov exponent were extracted from EEG signal. For discriminating the two groups, k-nearest neighbor, linear discriminant analysis and logistic regression (LR) are then used as the classifiers. Among the individual nonlinear features, the highest classification accuracy of 83.3% is obtained by correlation dimension with the LR classifier. For further improvement, all nonlinear features are combined and applied to classifiers. A classification accuracy of 90% is achieved by all nonlinear features and LR classifier. In all experiments, genetic algorithm is employed to select the most important features. The proposed technique is compared and contrasted with the other reported methods and it is demonstrated that by combining nonlinear features, the performance is enhanced. This study shows that nonlinear analysis of EEG can be a useful method for discriminating depressed patients and normal subjects. It is suggested that this analysis may be a complementary tool to help psychiatrists for diagnosing depressed patients. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.

  6. Topological side-chain classification of beta-turns: ideal motifs for peptidomimetic development.

    PubMed

    Tran, Tran Trung; McKie, Jim; Meutermans, Wim D F; Bourne, Gregory T; Andrews, Peter R; Smythe, Mark L

    2005-08-01

    Beta-turns are important topological motifs for biological recognition of proteins and peptides. Organic molecules that sample the side chain positions of beta-turns have shown broad binding capacity to multiple different receptors, for example benzodiazepines. Beta-turns have traditionally been classified into various types based on the backbone dihedral angles (phi2, psi2, phi3 and psi3). Indeed, 57-68% of beta-turns are currently classified into 8 different backbone families (Type I, Type II, Type I', Type II', Type VIII, Type VIa1, Type VIa2 and Type VIb and Type IV which represents unclassified beta-turns). Although this classification of beta-turns has been useful, the resulting beta-turn types are not ideal for the design of beta-turn mimetics as they do not reflect topological features of the recognition elements, the side chains. To overcome this, we have extracted beta-turns from a data set of non-homologous and high-resolution protein crystal structures. The side chain positions, as defined by C(alpha)-C(beta) vectors, of these turns have been clustered using the kth nearest neighbor clustering and filtered nearest centroid sorting algorithms. Nine clusters were obtained that cluster 90% of the data, and the average intra-cluster RMSD of the four C(alpha)-C(beta) vectors is 0.36. The nine clusters therefore represent the topology of the side chain scaffold architecture of the vast majority of beta-turns. The mean structures of the nine clusters are useful for the development of beta-turn mimetics and as biological descriptors for focusing combinatorial chemistry towards biologically relevant topological space.

  7. Getting to know the nearest stars: Intermittent radio bursts from Ross 614

    NASA Astrophysics Data System (ADS)

    Winterhalter, Daniel; Knapp, Mary; Bastian, Tim

    2017-04-01

    Radio observations have been used as a search tool for exoplanets since before the confirmed discovery of the first extrasolar planet. To date, there have been no definitive detections of exoplanets in the radio regime. We are engaged in an ongoing blind radio survey of the nearest star systems for exoplanetary radio emission. The goal of this survey is to obtain meaningful upper limits on radio emission from (or modulated by) sub-stellar companions of the nearest stars. Nearby stars are strongly preferred because they suffer the least from the dilution of potential radio signals by distance. Targets are selected by distance and observability (both LOFAR and VLA) only. Other properties of target stars, such as stellar type, are not considered to avoid biasing the search. Five survey targets, Procyon, GJ 1111, GJ 725, Ross 614, and UGPSJ072227.51, have been observed with the VLA telescope L- and S-band receivers. P-band observations are ongoing. Of particular interest, at this time, is our observation of the Ross 614 system. Ross 614 is an M-dwarf binary system at a distance of about 13 ly, with an orbital period of 16.6 years. The binary companions are classified as flare stars because strong radio emission has been detected from the location of the system in previous work. Analyses are in progress to determine if the intermittent bursts are similar to solar-type bursts, and/or if there is any evidence for emissions from sub-stellar companions.

  8. Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection

    NASA Astrophysics Data System (ADS)

    Safi’ie, M. A.; Utami, E.; Fatta, H. A.

    2018-03-01

    Universitas Sebelas Maret has a teaching staff of more than 1500 people, and one of their tasks is to carry out research. On the other hand, funding support for research and service is limited, so submitted proposals for research and community service (P2M) need to be evaluated. At the selection stage, research proposal documents are collected as unstructured data, and the amount of stored data is very large. Extracting the information contained in these documents requires text mining technology. This technology is applied to gain knowledge from the documents by automating information extraction. In this article we apply Latent Dirichlet Allocation (LDA) to the documents as a model in the feature extraction process, to obtain terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on their terms.

  9. Quaternion-Based Signal Analysis for Motor Imagery Classification from Electroencephalographic Signals.

    PubMed

    Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R; Guerra-Hernandez, Erick I; Almanza-Ojeda, Dora L; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J; Ibarra-Manzano, Mario A

    2016-03-05

    Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users were shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states.

  10. Quaternion-Based Signal Analysis for Motor Imagery Classification from Electroencephalographic Signals

    PubMed Central

    Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R.; Guerra-Hernandez, Erick I.; Almanza-Ojeda, Dora L.; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J.; Ibarra-Manzano, Mario A.

    2016-01-01

    Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users were shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states. PMID:26959029

  11. Characterization of 3D Voronoi Tessellation Nearest Neighbor Lipid Shells Provides Atomistic Lipid Disruption Profile of Protein Containing Lipid Membranes

    PubMed Central

    Cheng, Sara Y.; Duong, Hai V.; Compton, Campbell; Vaughn, Mark W.; Nguyen, Hoa; Cheng, Kwan H.

    2015-01-01

    Quantifying protein-induced lipid disruptions at the atomistic level is a challenging problem in membrane biophysics. Here we propose a novel 3D Voronoi tessellation nearest-atom-neighbor shell method to classify and characterize lipid domains into discrete concentric lipid shells surrounding membrane proteins in structurally heterogeneous lipid membranes. This method needs only the coordinates of the system and is independent of force fields and simulation conditions. As a proof-of-principle, we use this multiple lipid shell method to analyze the lipid disruption profiles of three simulated membrane systems: phosphatidylcholine, phosphatidylcholine/cholesterol, and beta-amyloid/phosphatidylcholine/cholesterol. We observed different atomic volume disruption mechanisms due to cholesterol and beta-amyloid. Additionally, several lipid fractional groups and lipid-interfacial water did not converge to their control values with increasing distance or shell order from the protein. This volume divergent behavior was confirmed by bilayer thickness and chain orientational order calculations. Our method can also be used to analyze high-resolution structural experimental data. PMID:25637891

  12. Biclustering Learning of Trading Rules.

    PubMed

    Huang, Qinghua; Wang, Ting; Tao, Dacheng; Li, Xuelong

    2015-10-01

    Technical analysis with numerous indicators and patterns has been regarded as important evidence for making trading decisions in financial markets. However, it is extremely difficult for investors to find useful trading rules based on numerous technical indicators. This paper innovatively proposes the use of biclustering mining to discover effective technical trading patterns that contain a combination of indicators from historical financial data series. This is the first attempt to use biclustering algorithm on trading data. The mined patterns are regarded as trading rules and can be classified as three trading actions (i.e., the buy, the sell, and no-action signals) with respect to the maximum support. A modified K nearest neighborhood (K-NN) method is applied to classification of trading days in the testing period. The proposed method [called biclustering algorithm and the K nearest neighbor (BIC-K-NN)] was implemented on four historical datasets and the average performance was compared with the conventional buy-and-hold strategy and three previously reported intelligent trading systems. Experimental results demonstrate that the proposed trading system outperforms its counterparts and will be useful for investment in various financial markets.

  13. Classifier ensemble based on feature selection and diversity measures for predicting the affinity of A(2B) adenosine receptor antagonists.

    PubMed

    Bonet, Isis; Franco-Montero, Pedro; Rivero, Virginia; Teijeira, Marta; Borges, Fernanda; Uriarte, Eugenio; Morales Helguera, Aliuska

    2013-12-23

    A(2B) adenosine receptor antagonists may be beneficial in treating diseases like asthma, diabetes, diabetic retinopathy, and certain cancers. This has stimulated research for the development of potent ligands for this subtype, based on quantitative structure-affinity relationships. In this work, a new ensemble machine learning algorithm is proposed for classification and prediction of the ligand-binding affinity of A(2B) adenosine receptor antagonists. This algorithm is based on the training of different classifier models with multiple training sets (composed of the same compounds but represented by diverse features). The k-nearest neighbor, decision trees, neural networks, and support vector machines were used as single classifiers. To select the base classifiers for combining into the ensemble, several diversity measures were employed. The final multiclassifier prediction results were computed from the output obtained by using a combination of selected base classifiers output, by utilizing different mathematical functions including the following: majority vote, maximum and average probability. In this work, 10-fold cross- and external validation were used. The strategy led to the following results: i) the single classifiers, together with previous features selections, resulted in good overall accuracy, ii) a comparison between single classifiers, and their combinations in the multiclassifier model, showed that using our ensemble gave a better performance than the single classifier model, and iii) our multiclassifier model performed better than the most widely used multiclassifier models in the literature. The results and statistical analysis demonstrated the supremacy of our multiclassifier approach for predicting the affinity of A(2B) adenosine receptor antagonists, and it can be used to develop other QSAR models.
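    The three fusion functions named in this abstract (majority vote, maximum probability, average probability) can be compared on one sample with a short sketch. The probability values and function names below are hypothetical, and each base classifier is assumed to output one probability per class.

```python
def vote_counts(prob_rows):
    """Count, per class, how many classifiers rank that class highest."""
    counts = [0] * len(prob_rows[0])
    for row in prob_rows:
        counts[row.index(max(row))] += 1
    return counts


def maximum_probability(prob_rows):
    """Per class, keep the largest probability assigned by any classifier."""
    return [max(col) for col in zip(*prob_rows)]


def average_probability(prob_rows):
    """Per class, average the probabilities over all classifiers."""
    n = len(prob_rows)
    return [sum(col) / n for col in zip(*prob_rows)]


def decide(scores):
    """Index of the winning class; ties go to the lowest index."""
    return scores.index(max(scores))


# Three hypothetical base classifiers, two classes (0 = low, 1 = high affinity).
probs = [
    [0.55, 0.45],
    [0.60, 0.40],
    [0.05, 0.95],
]

by_vote = decide(vote_counts(probs))
by_max = decide(maximum_probability(probs))
by_avg = decide(average_probability(probs))
print(by_vote, by_max, by_avg)
```

    In this toy case the majority vote picks class 0 while the two probability-based rules pick class 1, since one confident classifier outweighs two weakly opposed ones. This shows why the choice of fusion function can change the ensemble's decision.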

  14. Consensus Classification Using Non-Optimized Classifiers.

    PubMed

    Brownfield, Brett; Lemos, Tony; Kalivas, John H

    2018-04-03

    Classifying samples into categories is a common problem in analytical chemistry and other fields. Classification is usually based on only one method, but numerous classifiers are available with some being complex, such as neural networks, and others are simple, such as k nearest neighbors. Regardless, most classification schemes require optimization of one or more tuning parameters for best classification accuracy, sensitivity, and specificity. A process not requiring exact selection of tuning parameter values would be useful. To improve classification, several ensemble approaches have been used in past work to combine classification results from multiple optimized single classifiers. The collection of classifications for a particular sample are then combined by a fusion process such as majority vote to form the final classification. Presented in this Article is a method to classify a sample by combining multiple classification methods without specifically classifying the sample by each method, that is, the classification methods are not optimized. The approach is demonstrated on three analytical data sets. The first is a beer authentication set with samples measured on five instruments, allowing fusion of multiple instruments by three ways. The second data set is composed of textile samples from three classes based on Raman spectra. This data set is used to demonstrate the ability to classify simultaneously with different data preprocessing strategies, thereby reducing the need to determine the ideal preprocessing method, a common prerequisite for accurate classification. The third data set contains three wine cultivars for three classes measured at 13 unique chemical and physical variables. In all cases, fusion of nonoptimized classifiers improves classification. Also presented are atypical uses of Procrustes analysis and extended inverted signal correction (EISC) for distinguishing sample similarities to respective classes.

  15. Predicting Classifier Performance with Limited Training Data: Applications to Computer-Aided Diagnosis in Breast and Prostate Cancer

    PubMed Central

    Basavanhally, Ajay; Viswanath, Satish; Madabhushi, Anant

    2015-01-01

    Clinical trials increasingly employ medical imaging data in conjunction with supervised classifiers, where the latter require large amounts of training data to accurately model the system. Yet, a classifier selected at the start of the trial based on smaller and more accessible datasets may yield inaccurate and unstable classification performance. In this paper, we aim to address two common concerns in classifier selection for clinical trials: (1) predicting expected classifier performance for large datasets based on error rates calculated from smaller datasets and (2) the selection of appropriate classifiers based on expected performance for larger datasets. We present a framework for comparative evaluation of classifiers using only limited amounts of training data by using random repeated sampling (RRS) in conjunction with a cross-validation sampling strategy. Extrapolated error rates are subsequently validated via comparison with leave-one-out cross-validation performed on a larger dataset. The ability to predict error rates as dataset size increases is demonstrated on both synthetic data as well as three different computational imaging tasks: detecting cancerous image regions in prostate histopathology, differentiating high and low grade cancer in breast histopathology, and detecting cancerous metavoxels in prostate magnetic resonance spectroscopy. For each task, the relationships between 3 distinct classifiers (k-nearest neighbor, naive Bayes, Support Vector Machine) are explored. Further quantitative evaluation in terms of interquartile range (IQR) suggests that our approach consistently yields error rates with lower variability (mean IQRs of 0.0070, 0.0127, and 0.0140) than a traditional RRS approach (mean IQRs of 0.0297, 0.0779, and 0.305) that does not employ cross-validation sampling for all three datasets. PMID:25993029

  16. Fast Query-Optimized Kernel-Machine Classification

    NASA Technical Reports Server (NTRS)

    Mazzoni, Dominic; DeCoste, Dennis

    2004-01-01

    A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only a small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs.
    Before query time, statistics are gathered about how misleading the output of the k-NN model can be, relative to the outputs of the exact KM for a representative set of examples, for each possible k from 1 to the total number of SVs. From these statistics, upper and lower thresholds are derived for each step k. These thresholds identify output levels for which the particular variant of the k-NN model already leans so strongly positively or negatively that a reversal in sign is unlikely, given the weaker SV neighbors still remaining. At query time, the partial output of each query is incrementally updated, stopping as soon as it exceeds the predetermined statistical thresholds of the current step. For an easy query, stopping can occur as early as step k = 1. For more difficult queries, stopping might not occur until nearly all SVs are touched. A key empirical observation is that this approach can tolerate very approximate nearest-neighbor orderings. In experiments, SVs and queries were projected to a subspace comprising the top few principal-component dimensions and neighbor orderings were computed in that subspace. This approach ensured that the overhead of the nearest-neighbor computations was insignificant, relative to that of the exact KM computation.
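
    The query-time loop described in this abstract can be sketched as follows. This is an illustrative reading, not the published algorithm: the RBF kernel, the support vectors, the signed weights, and the per-step thresholds are all hypothetical values supplied by hand (the abstract derives thresholds from statistics gathered beforehand), and neighbors are ordered by exact Euclidean distance rather than in a principal-component subspace.

```python
import math


def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel value between two points."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def exact_output(query, svs, weights, bias=0.0):
    """Exact kernel-machine output: weighted kernel sum over all support vectors."""
    return bias + sum(w * rbf(query, s) for s, w in zip(svs, weights))


def incremental_classify(query, svs, weights, lower, upper, bias=0.0):
    """Accumulate kernel terms nearest-SV-first; stop once a step threshold is crossed.

    lower[k-1] and upper[k-1] are the thresholds for step k (assumed given).
    Returns (predicted sign, number of support vectors touched).
    """
    order = sorted(range(len(svs)), key=lambda i: math.dist(query, svs[i]))
    partial = bias
    for k, i in enumerate(order, start=1):
        partial += weights[i] * rbf(query, svs[i])
        if partial > upper[k - 1] or partial < lower[k - 1]:
            return (1 if partial > 0 else -1), k
    return (1 if partial >= 0 else -1), len(svs)


# Hypothetical model: three support vectors with signed weights (alpha_i * y_i).
svs = [(0.0, 0.0), (1.0, 0.0), (3.0, 3.0)]
weights = [1.0, -0.5, -1.0]
upper = [0.8, 0.6, 0.0]    # hand-picked thresholds, for illustration only
lower = [-0.8, -0.6, 0.0]

easy = incremental_classify((0.1, 0.0), svs, weights, lower, upper)
hard = incremental_classify((0.6, 0.0), svs, weights, lower, upper)
```

    With these made-up numbers the query close to the first support vector stops after one kernel evaluation, whereas the query lying between two opposing support vectors touches all three before its sign is settled.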

  17. Predicting hepatotoxicity using ToxCast in vitro bioactivity and ...

    EPA Pesticide Factsheets

    Background: The U.S. EPA ToxCast™ program is screening thousands of environmental chemicals for bioactivity using hundreds of high-throughput in vitro assays to build predictive models of toxicity. We represented chemicals based on bioactivity and chemical structure descriptors then used supervised machine learning to predict their hepatotoxic effects. Results: A set of 677 chemicals were represented by 711 in vitro bioactivity descriptors (from ToxCast assays), 4,376 chemical structure descriptors (from QikProp, OpenBabel, PADEL, and PubChem), and three hepatotoxicity categories (from animal studies). Hepatotoxicants were defined by rat liver histopathology observed after chronic chemical testing and grouped into hypertrophy (161), injury (101) and proliferative lesions (99). Classifiers were built using six machine learning algorithms: linear discriminant analysis (LDA), Naïve Bayes (NB), support vector classification (SVM), classification and regression trees (CART), k-nearest neighbors (KNN) and an ensemble of classifiers (ENSMB). Classifiers of hepatotoxicity were built using chemical structure, ToxCast bioactivity, and a hybrid representation. Predictive performance was evaluated using 10-fold cross-validation testing and in-loop, filter-based, feature subset selection. Hybrid classifiers had the best balanced accuracy for predicting hypertrophy (0.78±0.08), injury (0.73±0.10) and proliferative lesions (0.72±0.09). Though chemical and bioactivity class

  18. Accelerometer and Camera-Based Strategy for Improved Human Fall Detection.

    PubMed

    Zerrouki, Nabil; Harrou, Fouzi; Sun, Ying; Houacine, Amrane

    2016-12-01

    In this paper, we address the problem of detecting human falls using anomaly detection. Detection and classification of falls are based on accelerometric data and variations in human silhouette shape. First, we use the exponentially weighted moving average (EWMA) monitoring scheme to detect a potential fall in the accelerometric data. We used an EWMA to identify features that correspond with a particular type of fall, allowing us to classify falls. Only features corresponding with detected falls were used in the classification phase. A benefit of using a subset of the original data to design classification models is that it minimizes training time and simplifies the models. Based on features corresponding to detected falls, we used the support vector machine (SVM) algorithm to distinguish between true falls and fall-like events. We apply this strategy to the publicly available fall detection databases from the University of Rzeszow. Results indicated that our strategy accurately detected and classified fall events, suggesting its potential application to early alert mechanisms in the event of fall situations and its capability for classification of detected falls. Comparison of the classification results using the EWMA-based SVM classifier method with those achieved using three commonly used machine learning classifiers, neural network, K-nearest neighbor and naïve Bayes, proved our model superior.
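
    As a rough illustration of the EWMA monitoring step mentioned above, the sketch below applies the textbook EWMA control chart to a one-dimensional signal. It is not the authors' pipeline: the function name, the smoothing factor, the limit width L, the in-control mean and standard deviation, and the sample signal are all assumptions made for the example.

```python
import math


def ewma_alarms(signal, mu0, sigma0, lam=0.2, L=3.0):
    """Return indices where the EWMA statistic leaves its control limits.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = mu0.
    Limits: mu0 +/- L * sigma0 * sqrt(lam / (2 - lam) * (1 - (1 - lam)^(2t))).
    """
    z = mu0
    alarms = []
    for t, x in enumerate(signal, start=1):
        z = lam * x + (1 - lam) * z
        width = L * sigma0 * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        if abs(z - mu0) > width:
            alarms.append(t - 1)  # report the zero-based sample index
    return alarms


# Hypothetical acceleration magnitudes (in g): quiet activity, then a spike.
signal = [1.0, 1.1, 0.9, 1.0, 3.5, 3.8, 1.0]
alarms = ewma_alarms(signal, mu0=1.0, sigma0=0.1)
```

    The first flagged index marks the onset of the spike. Because the EWMA carries memory, the statistic can stay outside the limits for a few samples after the signal itself has returned to normal, so flagged samples would typically be passed to a classifier (an SVM in the abstract) rather than treated as confirmed falls.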

  19. Solubility classification of airborne uranium products collected at the perimeter of the Allied Chemical Plant, Metropolis, Illinois

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kalkwarf, D.R.

    1980-05-01

    Airborne uranium products were collected at the perimeter of the uranium-conversion plant operated by the Allied Chemical Corporation at Metropolis, Illinois, and the dissolution rates of these products were classified in terms of the ICRP Task Group Lung Model. Assignments were based on measurements of the dissolution half-times exhibited by uranium components of the dust samples as they dissolved in simulated lung fluid at 37 °C. Based on three trials, the dissolution behavior of dust with aerodynamic equivalent diameter (AED) less than 5.5 μm and collected nearest the closest residence to the plant was classified 0.40 D, 0.60 Y. Based on two trials, the dissolution behavior of dust with AED greater than 5.5 μm and collected at this location was classified 0.37 D, 0.63 Y. Based on one trial, the dissolution behavior of dust with AED less than 5.5 μm and collected at a location on the opposite side of the plant was classified 0.68 D, 0.32 Y. There was some evidence for adsorption of dissolved uranium onto other dust components during dissolution, and preliminary dissolution trials are recommended for future samples in order to optimize the fluid replacement schedule.

  20. Automated Sentiment Analysis

    DTIC Science & Technology

    2009-06-01

    questions. Our prototype text classifier uses a “vector similarity” approach. This is a well-known technique introduced by Salton, Wong, and Yang (1975...Loveman & T.M. Davies Jr. (Eds.), Guerrilla warfare. Lincoln, NE: University of Nebraska Press, 1985, 47-69. Salton, G., Wong, A., & Yang, C.S. “A

  1. Category Representation for Classification and Feature Inference

    ERIC Educational Resources Information Center

    Johansen, Mark K.; Kruschke, John K.

    2005-01-01

    This research's purpose was to contrast the representations resulting from learning of the same categories by either classifying instances or inferring instance features. Prior inference learning research, particularly T. Yamauchi and A. B. Markman (1998), has suggested that feature inference learning fosters prototype representation, whereas…

  2. Ontology-based classification of remote sensing images using spectral rules

    NASA Astrophysics Data System (ADS)

    Andrés, Samuel; Arvor, Damien; Mougenot, Isabelle; Libourel, Thérèse; Durieux, Laurent

    2017-05-01

    Earth Observation data is of great interest for a wide spectrum of scientific domain applications. An enhanced access to remote sensing images for "domain" experts thus represents a great advance since it allows users to interpret remote sensing images based on their domain expert knowledge. However, such an advantage can also turn into a major limitation if this knowledge is not formalized and is thus difficult to share with, and be understood by, other users. In this context, knowledge representation techniques such as ontologies should play a major role in the future of remote sensing applications. We implemented an ontology-based prototype to automatically classify Landsat images based on explicit spectral rules. The ontology is designed in a very modular way in order to achieve a generic and versatile representation of concepts we consider of utmost importance in remote sensing. The prototype was tested on four subsets of Landsat images and the results confirmed the potential of ontologies to formalize expert knowledge and classify remote sensing images.

  3. Effects of distance to care and rural or urban residence on receipt of radiation therapy among North Carolina Medicare enrollees with breast cancer.

    PubMed

    Wheeler, Stephanie B; Kuo, Tzy-Mey; Durham, Danielle; Frizzelle, Brian; Reeder-Hayes, Katherine; Meyer, Anne-Marie

    2014-01-01

    Distance to oncology service providers and rurality may affect receipt of guideline-recommended radiation therapy (RT), but the extent to which these factors affect the care of Medicare-insured patients is unknown. Using cancer registry data linked to Medicare claims from the Integrated Cancer Information and Surveillance System (ICISS), we identified all women aged 65 years or older who were diagnosed with stage I, II, or III breast cancer from 2003 through 2005, who had Medicare claims through 2006, and who were clinically eligible for RT. We geocoded the address of each RT service provider's practice location and calculated the travel distance from each patient's residential address to the nearest RT provider. We used ZIP codes to classify each patient's residence as rural or urban according to rural-urban commuting area codes. We used generalized estimating equations models with county-level clustering and interaction terms between distance categories and rural-urban status to estimate the effect of distance to care and rural-urban status on receipt of RT. In urban areas, increasing distance to the nearest RT provider was associated with a lower likelihood of receiving RT (odds ratio [OR] = 0.54; 95% confidence interval [CI], 0.30-0.97) for those living more than 20 miles from the nearest RT provider compared with those living less than 10 miles away. In rural areas, those living within 10-20 miles of the nearest RT provider were more likely to receive RT than those living less than 10 miles away (OR = 1.73; 95% CI, 1.08-2.76). Results may not be generalizable to areas outside North Carolina or to non-Medicare populations. Coordinated outreach programs targeted differently to rural and urban patients may be necessary to improve the quality of oncology care.

  4. A miniature electronic nose system based on an MWNT-polymer microsensor array and a low-power signal-processing chip.

    PubMed

    Chiu, Shih-Wen; Wu, Hsiang-Chiu; Chou, Ting-I; Chen, Hsin; Tang, Kea-Tiong

    2014-06-01

    This article introduces a power-efficient, miniature electronic nose (e-nose) system. The e-nose system primarily comprises two self-developed chips, a multiple-walled carbon nanotube (MWNT)-polymer based microsensor array, and a low-power signal-processing chip. The microsensor array was fabricated on a silicon wafer by using standard photolithography technology. The microsensor array comprised eight interdigitated electrodes surrounded by SU-8 "walls," which restrained the material-solvent liquid in a defined area of 650 × 760 μm². To achieve a reliable sensor-manufacturing process, we used a two-layer deposition method, coating the MWNTs and polymer film as the first and second layers, respectively. The low-power signal-processing chip included array data acquisition circuits and a signal-processing core. The MWNT-polymer microsensor array can directly connect with array data acquisition circuits, which comprise sensor interface circuitry and an analog-to-digital converter; the signal-processing core consists of memory and a microprocessor. The core executes the program, classifying the odor data received from the array data acquisition circuits. The low-power signal-processing chip was designed and fabricated using the Taiwan Semiconductor Manufacturing Company 0.18-μm 1P6M standard complementary metal oxide semiconductor process. The chip consumes only 1.05 mW of power at supply voltages of 1 and 1.8 V for the array data acquisition circuits and the signal-processing core, respectively. The miniature e-nose system, which used a microsensor array, a low-power signal-processing chip, and an embedded k-nearest-neighbor-based pattern recognition algorithm, was developed as a prototype that successfully recognized the complex odors of tincture, sorghum wine, sake, whisky, and vodka.

  5. [Schizophrenia or spiritual crisis? On "raising the kundalini" and its diagnostic classification].

    PubMed

    Hansen, G

    1995-07-31

    Two patients are described who had been diagnosed as schizophrenic, but had actually instead been going through spiritual crises, which in Eastern spiritual tradition are called raising the kundalini. Perhaps this experience is not a disease, but may--especially if not understood by oneself, the nearest relations and the medical profession--cause mental illness. In WHO ICD-10 the experience could be classified as F48.8, disordines neurotici specificati alii. The process falls outside the categories of both normal and psychotic. When allowed to progress to completion this process culminates in deep psychological balance, strength, and maturity.

  6. Robotic situational awareness of actions in human teaming

    NASA Astrophysics Data System (ADS)

    Tahmoush, Dave

    2015-06-01

    When robots can sense and interpret the activities of the people they are working with, they become more of a team member and less of just a piece of equipment. This has motivated work on recognizing human actions using existing robotic sensors like short-range ladar imagers. These produce three-dimensional point cloud movies which can be analyzed for structure and motion information. We skeletonize the human point cloud and apply a physics-based velocity correlation scheme to the resulting joint motions. The twenty actions are then recognized using a nearest-neighbors classifier that achieves good accuracy.

  7. Cooperative light-induced molecular movements of highly ordered azobenzene self-assembled monolayers

    PubMed Central

    Pace, Giuseppina; Ferri, Violetta; Grave, Christian; Elbing, Mark; von Hänisch, Carsten; Zharnikov, Michael; Mayor, Marcel; Rampi, Maria Anita; Samorì, Paolo

    2007-01-01

    Photochromic systems can convert light energy into mechanical energy, thus they can be used as building blocks for the fabrication of prototypes of molecular devices that are based on the photomechanical effect. Hitherto a controlled photochromic switch on surfaces has been achieved either on isolated chromophores or within assemblies of randomly arranged molecules. Here we show by scanning tunneling microscopy imaging the photochemical switching of a new terminally thiolated azobiphenyl rigid rod molecule. Interestingly, the switching of entire molecular 2D crystalline domains is observed, which is ruled by the interactions between nearest neighbors. This observation of azobenzene-based systems displaying collective switching might be of interest for applications in high-density data storage. PMID:17535889

  8. Quest for Relevance.

    ERIC Educational Resources Information Center

    Axelrod, Joseph

    During an analysis of the nature of the curricular-instructional process of US higher education, faculty members were classified into 5 prototypes based on their styles of teaching. The recitation class teacher limits the process of reasoning by students. The content-centered faculty member helps his students to master what "knowledgeable" people…

  9. Is Contrast Enhanced Ultrasonography a useful tool in a beginner's hand? How much can a Computer Assisted Diagnosis prototype help in characterizing the malignancy of focal liver lesions?

    PubMed

    Moga, Tudor Voicu; Popescu, Alina; Sporea, Ioan; Danila, Mirela; David, Ciprian; Gui, Vasile; Iacob, Nicoleta; Miclaus, Gratian; Sirli, Roxana

    2017-08-23

    Contrast enhanced ultrasound (CEUS) improved the characterization of focal liver lesions (FLLs), but is an operator-dependent method. The goal of this paper was to test a computer assisted diagnosis (CAD) prototype and to see its benefit in assisting a beginner in the evaluation of FLLs. Our cohort included 97 good quality CEUS videos [34% hepatocellular carcinomas (HCC), 12.3% hypervascular metastases (HiperM), 11.3% hypovascular metastases (HipoM), 24.7% hemangiomas (HMG), 17.5% focal nodular hyperplasia (FNH)] that were used to develop a CAD prototype based on an algorithm that tested a binary decision based classifier. Two young medical doctors (1 year CEUS experience), two experts and the CAD prototype reevaluated 50 FLL CEUS videos (diagnosis of benign vs. malignant), first blinded to clinical data, in order to evaluate the diagnostic gap beginner vs. expert. The CAD classifier managed a 75.2% overall (benign vs. malignant) correct classification rate. The overall classification rates for the evaluators, before and after clinical data, were: first beginner-78%; 94%; second beginner-82%; 96%; first expert-94%; 100%; second expert-96%; 98%. For both beginners, the malignant vs. benign diagnosis significantly improved after knowing the clinical data (p=0.005; p=0.008). The expert was better than the beginner (p=0.04) and better than the CAD (p=0.001). CAD in addition to the beginner can reach the expert diagnosis. The most frequent lesions misdiagnosed at CEUS were FNH and HCC. The CAD prototype is a good comparison tool for a beginner operator that can be developed to assist the diagnosis. In order to increase the classification rate, the CAD system for FLL in CEUS must integrate the clinical data.

  10. An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms.

    PubMed

    Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L

    2013-12-01

    The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN=0.89 and SVM=0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN=SVM=0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
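
    The evaluation criterion in this abstract, the area under the ROC curve averaged over cross-validation folds, can be sketched in a few lines. The code below is a generic illustration, not the study's procedure: it computes AUC as the fraction of (positive, negative) pairs ranked correctly, with ties counted as one half, and the fold scores and labels are invented.

```python
def auc(scores, labels):
    """Area under the ROC curve as the fraction of correctly ranked pos/neg pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # A pair counts 1 if the positive scores higher, 0.5 if the scores tie.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_cv_auc(fold_results):
    """Average the AUC over folds, each given as a (scores, labels) pair."""
    return sum(auc(s, y) for s, y in fold_results) / len(fold_results)


# Hypothetical classifier scores on two held-out folds (1 = smoker, 0 = healthy).
fold_a = ([0.9, 0.8, 0.4, 0.35, 0.2], [1, 1, 0, 1, 0])
fold_b = ([0.7, 0.6, 0.3, 0.1], [1, 0, 1, 0])

average_auc = mean_cv_auc([fold_a, fold_b])
```

    Averaging per-fold AUCs, as here, is one common convention; pooling all held-out scores before computing a single AUC is another and can give a slightly different number.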

  11. Can-Evo-Ens: Classifier stacking based evolutionary ensemble system for prediction of human breast cancer using amino acid sequences.

    PubMed

    Ali, Safdar; Majid, Abdul

    2015-04-01

    The diagnosis of human breast cancer is an intricate process and specific indicators may produce negative results. In order to avoid misleading results, an accurate and reliable diagnostic system for breast cancer is indispensable. Recently, several interesting machine-learning (ML) approaches have been proposed for prediction of breast cancer. To this end, we developed a novel classifier stacking based evolutionary ensemble system "Can-Evo-Ens" for predicting amino acid sequences associated with breast cancer. In this paper, first, we selected four diverse types of ML algorithms, Naïve Bayes, K-Nearest Neighbor, Support Vector Machines, and Random Forest, as base-level classifiers. These classifiers are trained individually in different feature spaces using physicochemical properties of amino acids. In order to exploit the decision spaces, the preliminary predictions of base-level classifiers are stacked. Genetic programming (GP) is then employed to develop a meta-classifier that optimally combines the predictions of the base classifiers. The most suitable threshold value of the best-evolved predictor is computed using the Particle Swarm Optimization technique. Our experiments have demonstrated the robustness of the Can-Evo-Ens system on an independent validation dataset. The proposed system has achieved the highest value of Area Under Curve (AUC) of ROC Curve of 99.95% for cancer prediction. The comparative results revealed that the proposed approach is better than individual ML approaches and conventional ensemble approaches of AdaBoostM1, Bagging, GentleBoost, and Random Subspace. It is expected that the proposed novel system would have a major impact on the fields of Biomedical, Genomics, Proteomics, Bioinformatics, and Drug Development. Copyright © 2015 Elsevier Inc. All rights reserved.

  12. A comparative study of the svm and k-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals

    PubMed Central

    2014-01-01

    Background: Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database. Results: The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Conclusion: Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database. PMID:24970564

  13. A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals.

    PubMed

    Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian

    2014-06-27

    Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.
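
    The evaluation flow in this abstract (classify feature vectors with K-nn, then summarise results in a confusion matrix) can be illustrated with a small sketch. Nothing here reproduces the study: the two-dimensional points stand in for MFCC feature vectors, only two of the three classes are used, and the value k = 3 and all function names are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def confusion_matrix(true, pred, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true, pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    return sum(matrix[i][i] for i in range(len(matrix))) / sum(map(sum, matrix))


# Hypothetical 2-D stand-ins for MFCC feature vectors.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["normal"] * 3 + ["obstruction"] * 3
test_X = [(0.5, 0.5), (5.5, 5.5), (2, 2)]
test_y = ["normal", "obstruction", "obstruction"]

pred = [knn_predict(train_X, train_y, x) for x in test_X]
cm = confusion_matrix(test_y, pred, ["normal", "obstruction"])
acc = accuracy(cm)
```

    The off-diagonal entry shows which class the misclassified sample was assigned to, which a single accuracy figure hides.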

  14. Evaluation of extreme learning machine for classification of individual and combined finger movements using electromyography on amputees and non-amputees.

    PubMed

    Anam, Khairul; Al-Jumaily, Adel

    2017-01-01

    The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigates some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR between amputees and non-amputees is small, with accuracies of 98.55% on amputees and 99.5% on non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.

  15. Online breakage detection of multitooth tools using classifier ensembles for imbalanced data

    NASA Astrophysics Data System (ADS)

    Bustillo, Andrés; Rodríguez, Juan J.

    2014-12-01

    Cutting tool breakage detection is an important task, due to its economic impact on mass production lines in the automobile industry. This task presents a central limitation: real data-sets are extremely imbalanced because breakage occurs in very few cases compared with normal operation of the cutting process. In this paper, we present an analysis of different data-mining techniques applied to the detection of insert breakage in multitooth tools. The analysis applies only one experimental variable: the electrical power consumption of the tool drive. This restriction profiles real industrial conditions more accurately than other physical variables, such as acoustic or vibration signals, which are not so easily measured. Many efforts have been made to design a method that is able to identify breakages with a high degree of reliability within a short period of time. The solution is based on classifier ensembles for imbalanced data-sets. Classifier ensembles are combinations of classifiers, which in many situations are more accurate than individual classifiers. Six different base classifiers are tested: Decision Trees, Rules, Naïve Bayes, Nearest Neighbour, Multilayer Perceptrons and Logistic Regression. Three different balancing strategies are tested with each of the classifier ensembles and compared to their performance with the original data-set: Synthetic Minority Over-Sampling Technique (SMOTE), undersampling and a combination of SMOTE and undersampling. To identify the most suitable data-mining solution, Receiver Operating Characteristics (ROC) graph and Recall-precision graph are generated and discussed. The performance of logistic regression ensembles on the balanced data-set using the combination of SMOTE and undersampling turned out to be the most suitable technique. 
    Finally, a comparison using industrial performance measures is presented, which concludes that this technique is also more suited to this industrial problem than the other techniques presented in the bibliography.
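
    The SMOTE balancing strategy named in this abstract creates synthetic minority-class samples by interpolating between existing ones. The sketch below is a simplified standard-library illustration of that idea, not the authors' pipeline; the sample values, the function name `smote`, and its parameters are assumptions.

```python
import math
import random

def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding the base itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

# Hypothetical minority-class ("breakage") feature vectors
breakage = [(1.0, 2.0), (1.2, 2.1), (0.9, 1.8), (1.1, 2.3)]
new_samples = smote(breakage, n_synthetic=6)
print(len(new_samples))
```

    Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. Undersampling, the other strategy mentioned, would instead discard majority-class samples.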

  16. The nearest neighbor and next nearest neighbor effects on the thermodynamic and kinetic properties of RNA base pair

    NASA Astrophysics Data System (ADS)

    Wang, Yujie; Wang, Zhen; Wang, Yanli; Liu, Taigang; Zhang, Wenbing

    2018-01-01

    The thermodynamic and kinetic parameters of an RNA base pair with different nearest and next nearest neighbors were obtained through long-time molecular dynamics simulation of the opening-closing switch process of the base pair near its melting temperature. The results indicate that the thermodynamic parameters of the GC base pair depend on the nearest neighbor base pair, while the next nearest neighbor base pair has little effect, which validates the nearest-neighbor model. The closing and opening rates of the GC base pair also showed nearest neighbor dependences. At a certain temperature, the closing and opening rates of the GC pair with nearest neighbor AU are larger than those with the nearest neighbor GC, and the next nearest neighbor plays little role. The free energy landscape of the GC base pair with the nearest neighbor GC is rougher than that with nearest neighbor AU.

  17. A multiple-point spatially weighted k-NN method for object-based classification

    NASA Astrophysics Data System (ADS)

    Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.

    2016-10-01

    Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.

  18. Maintaining the U.S. Army Research, Development and Engineering Command Prototype Integration Facilities

    DTIC Science & Technology

    2014-05-01

    shelters, tents and fabric covers, mechanical aerial delivery parts and components, kitchens, and combat feeding items (see Figure 4). NSRDEC’s PIF is...generic terms and refrain from revealing confidential or classified information. Research hypotheses are as follows: H1: The PIF leadership predicts

  19. The Necessity of Machine Learning and Epistemology in the Development of Categorization Theories: A Case Study in Prototype-Exemplar Debate

    NASA Astrophysics Data System (ADS)

    Gagliardi, Francesco

    In the present paper we discuss some aspects of the development of categorization theories concerning cognitive psychology and machine learning. We consider the thirty-year debate between prototype-theory and exemplar-theory in the studies of cognitive psychology regarding the categorization processes. We propose this debate is ill-posed, because it neglects some theoretical and empirical results of machine learning about the bias-variance theorem and the existence of some instance-based classifiers which can embed models subsuming both prototype and exemplar theories. Moreover, this debate rests on an epistemological error of pursuing a so-called experimentum crucis. Then we present how an interdisciplinary approach, based on synthetic method for cognitive modelling, can be useful to progress both the fields of cognitive psychology and machine learning.

  20. Initial results of the FUSION-X-US prototype combining 3D automated breast ultrasound and digital breast tomosynthesis.

    PubMed

    Schaefgen, Benedikt; Heil, Joerg; Barr, Richard G; Radicke, Marcus; Harcos, Aba; Gomez, Christina; Stieber, Anne; Hennigs, André; von Au, Alexandra; Spratte, Julia; Rauch, Geraldine; Rom, Joachim; Schütz, Florian; Sohn, Christof; Golatta, Michael

    2018-06-01

    To determine the feasibility of a prototype device combining 3D-automated breast ultrasound (ABVS) and digital breast tomosynthesis in a single device to detect and characterize breast lesions. In this prospective feasibility study, the FUSION-X-US prototype was used to perform digital breast tomosynthesis and ABVS in 23 patients with an indication for tomosynthesis based on current guidelines after clinical examination and standard imaging. The ABVS and tomosynthesis images of the prototype were interpreted separately by two blinded experts. The study compares the detection and BI-RADS® scores of breast lesions using only the tomosynthesis and ABVS data from the FUSION-X-US prototype to the results of the complete diagnostic workup. Image acquisition and processing by the prototype was fast and accurate, with some limitations in ultrasound coverage and image quality. In the diagnostic workup, 29 solid lesions (23 benign, including three cases with microcalcifications, and six malignant lesions) were identified. Using the prototype, all malignant lesions were detected and classified as malignant or suspicious by both investigators. Solid breast lesions can be localized accurately and fast by the Fusion-X-US system. Technical improvements of the ultrasound image quality and ultrasound coverage are needed to further study this new device. The prototype combines tomosynthesis and automated 3D-ultrasound (ABVS) in one device. It allows accurate detection of malignant lesions, directly correlating tomosynthesis and ABVS data. The diagnostic evaluation of the prototype-acquired data was interpreter-independent. The prototype provides a time-efficient and technically reliable diagnostic procedure. The combination of tomosynthesis and ABVS is a promising diagnostic approach.

  1. ATM encryption testing

    NASA Astrophysics Data System (ADS)

    Capell, Joyce; Deeth, David

    1996-01-01

    This paper describes why encryption was selected by Lockheed Martin Missiles & Space as the means for securing ATM networks. The ATM encryption testing program is part of an ATM network trial provided by Pacific Bell under the California Research Education Network (CalREN). The problem being addressed is the threat to data security which results when changing from a packet switched network infrastructure to a circuit switched ATM network backbone. As organizations move to high speed cell-based networks, there is a breakdown in the traditional security model which is designed to protect packet switched data networks from external attacks. This is due to the fact that most data security firewalls filter IP packets, restricting inbound and outbound protocols, e.g. ftp. ATM networks, based on cell-switching over virtual circuits, do not support this method for restricting access since the protocol information is not carried by each cell. ATM switches set up multiple virtual connections, thus there is no longer a single point of entry into the internal network. The problem is further complicated by the fact that ATM networks support high speed multi-media applications, including real time video and video teleconferencing which are incompatible with packet switched networks. The ability to restrict access to Lockheed Martin networks in support of both unclassified and classified communications is required before ATM network technology can be fully deployed. The Lockheed Martin CalREN ATM testbed provides the opportunity to test ATM encryption prototypes with actual applications to assess the viability of ATM encryption methodologies prior to installing large scale ATM networks. Two prototype ATM encryptors are being tested: (1) `MILKBUSH' a prototype encryptor developed by NSA for transmission of government classified data over ATM networks, and (2) a prototype ATM encryptor developed by Sandia National Labs in New Mexico, for the encryption of proprietary data.

  2. Recognizing human activities using appearance metric feature and kinematics feature

    NASA Astrophysics Data System (ADS)

    Qian, Huimin; Zhou, Jun; Lu, Xinbiao; Wu, Xinye

    2017-05-01

    The problem of automatically recognizing human activities from videos through the fusion of the two most important cues, appearance metric feature and kinematics feature, is considered. A system of two-dimensional (2-D) Poisson equations is introduced to extract the more discriminative appearance metric feature. Specifically, the moving human blobs are first detected in the video by a background subtraction technique to form a binary image sequence, from which the appearance feature designated as the motion accumulation image and the kinematics feature termed as centroid instantaneous velocity are extracted. Second, 2-D discrete Poisson equations are employed to reinterpret the motion accumulation image to produce a more differentiated Poisson silhouette image, from which the appearance feature vector is created through the dimension reduction technique called bidirectional 2-D principal component analysis, considering the balance between classification accuracy and time consumption. Finally, a cascaded classifier based on the nearest neighbor classifier and two directed acyclic graph support vector machine classifiers, integrated with the fusion of the appearance feature vector and centroid instantaneous velocity vector, is applied to recognize the human activities. Experimental results on the open databases and a homemade one confirm the recognition performance of the proposed algorithm.

  3. Webcam mouse using face and eye tracking in various illumination environments.

    PubMed

    Lin, Yuan-Pin; Chao, Yi-Ping; Lin, Chung-Chih; Chen, Jyh-Horng

    2005-01-01

    Nowadays, due to enhancement of computer performance and popular usage of webcam devices, it has become possible to acquire users' gestures for the human-computer-interface with PC via webcam. However, the effects of illumination variation would dramatically decrease the stability and accuracy of a skin-based face tracking system, especially for a notebook or portable platform. In this study we present an effective illumination recognition technique, combining K-Nearest Neighbor classifier and adaptive skin model, to realize the real-time tracking system. We have demonstrated that the accuracy of face detection based on the KNN classifier is higher than 92% in various illumination environments. In real-time implementation, the system successfully tracks the user's face and eye features at 15 fps under standard notebook platforms. Although the KNN classifier only initiates five environments at the preliminary stage, the system permits users to define and add their favorite environments to KNN for computer access. Eventually, based on this efficient tracking algorithm, we have developed a "Webcam Mouse" system to control the PC cursor using face and eye tracking. Preliminary studies in "point and click" style PC web games also show promising applications in consumer electronic markets in the future.

  4. Automated 3D Phenotype Analysis Using Data Mining

    PubMed Central

    Plyusnin, Ilya; Evans, Alistair R.; Karme, Aleksis; Gionis, Aristides; Jernvall, Jukka

    2008-01-01

    The ability to analyze and classify three-dimensional (3D) biological morphology has lagged behind the analysis of other biological data types such as gene sequences. Here, we introduce the techniques of data mining to the study of 3D biological shapes to bring the analyses of phenomes closer to the efficiency of studying genomes. We compiled five training sets of highly variable morphologies of mammalian teeth from the MorphoBrowser database. Samples were labeled either by dietary class or by conventional dental types (e.g. carnassial, selenodont). We automatically extracted a multitude of topological attributes using Geographic Information Systems (GIS)-like procedures that were then used in several combinations of feature selection schemes and probabilistic classification models to build and optimize classifiers for predicting the labels of the training sets. In terms of classification accuracy, computational time and size of the feature sets used, non-repeated best-first search combined with 1-nearest neighbor classifier was the best approach. However, several other classification models combined with the same searching scheme proved practical. The current study represents a first step in the automatic analysis of 3D phenotypes, which will be increasingly valuable with the future increase in 3D morphology and phenomics databases. PMID:18320060

  5. An Efficient Statistical Computation Technique for Health Care Big Data using R

    NASA Astrophysics Data System (ADS)

    Sushma Rani, N.; Srinivasa Rao, P., Dr; Parimala, P.

    2017-08-01

    Due to changes in living conditions and other factors, many critical health-related problems are arising. Diagnosing a problem at an earlier stage increases the chances of survival and fast recovery, and reduces the recovery time and the cost of treatment. One such medical issue is cancer, and breast cancer has been identified as the second leading cause of cancer death. If detected at an early stage, it can be cured. Once a breast tumor is detected in a patient, it should be classified as cancerous or non-cancerous. The paper therefore uses the k-nearest neighbors (KNN) algorithm, one of the simplest machine learning algorithms and an instance-based learning algorithm, to classify the data. New records are added day to day, which leads to an increase in the data to be classified and tends to become a big data problem. The algorithm is implemented in R, which is the most popular platform applied to machine learning algorithms for statistical computing. Experimentation is conducted by using various classification evaluation metrics on various values of k. The results show that the KNN algorithm performs better than existing models.
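
    The instance-based KNN classification described in this abstract, evaluated for several values of k, can be sketched as follows. The paper's implementation is in R; this illustration uses Python with the standard library only, and the feature vectors, labels, and the function name `knn_predict` are placeholders.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Placeholder two-feature tumor records (assumption: not real clinical data)
train_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_y = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]

# Try several values of k, as the abstract describes
for k in (1, 3, 5):
    print(k, knn_predict(train_X, train_y, (2.9, 3.1), k=k))
```

    The sketch stores every training record and sorts all of them for each query, so the cost per prediction grows with the size of the dataset. That growth is the scaling concern the abstract raises as records accumulate.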

  6. Spatial Statistics for Tumor Cell Counting and Classification

    NASA Astrophysics Data System (ADS)

    Wirjadi, Oliver; Kim, Yoo-Jin; Breuel, Thomas

    To count and classify cells in histological sections is a standard task in histology. One example is the grading of meningiomas, benign tumors of the meninges, which requires assessing the fraction of proliferating cells in an image. As this process is very time consuming when performed manually, automation is required. To address such problems, we propose a novel application of Markov point process methods in computer vision, leading to algorithms for computing the locations of circular objects in images. In contrast to previous algorithms using such spatial statistics methods in image analysis, the present one is fully trainable. This is achieved by combining point process methods with statistical classifiers. Using simulated data, the method proposed in this paper will be shown to be more accurate and more robust to noise than standard image processing methods. On the publicly available SIMCEP benchmark for cell image analysis algorithms, the cell count performance of the present paper is significantly more accurate than results published elsewhere, especially when cells form dense clusters. Furthermore, the proposed system performs as well as a state-of-the-art algorithm for the computer-aided histological grading of meningiomas when combined with a simple k-nearest neighbor classifier for identifying proliferating cells.

  7. Knee X-ray image analysis method for automated detection of Osteoarthritis

    PubMed Central

    Shamir, Lior; Ling, Shari M.; Scott, William W.; Bos, Angelo; Orlov, Nikita; Macura, Tomasz; Eckley, D. Mark; Ferrucci, Luigi; Goldberg, Ilya G.

    2008-01-01

    We describe a method for automated detection of radiographic Osteoarthritis (OA) in knee X-ray images. The detection is based on the Kellgren-Lawrence classification grades, which correspond to the different stages of OA severity. The classifier was built using manually classified X-rays, representing the first four KL grades (normal, doubtful, minimal and moderate). Image analysis is performed by first identifying a set of image content descriptors and image transforms that are informative for the detection of OA in the X-rays, and assigning weights to these image features using Fisher scores. Then, a simple weighted nearest neighbor rule is used in order to predict the KL grade to which a given test X-ray sample belongs. The dataset used in the experiment contained 350 X-ray images classified manually by their KL grades. Experimental results show that moderate OA (KL grade 3) and minimal OA (KL grade 2) can be differentiated from normal cases with accuracy of 91.5% and 80.4%, respectively. Doubtful OA (KL grade 1) was detected automatically with a much lower accuracy of 57%. The source code developed and used in this study is available for free download at www.openmicroscopy.org. PMID:19342330
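
    The combination of Fisher-score feature weights and a weighted nearest neighbor rule can be sketched as below. This is a generic illustration under stated assumptions, not the authors' released code: the Fisher score uses one common definition (spread of class means over mean within-class variance), the distance is a weighted squared Euclidean distance, and the feature values are placeholders.

```python
def fisher_scores(X, y):
    """Per-feature Fisher score: variance of the class means divided by the
    mean within-class variance (one common definition; an assumption here)."""
    classes = sorted(set(y))
    scores = []
    for f in range(len(X[0])):
        groups = [[x[f] for x, label in zip(X, y) if label == c] for c in classes]
        means = [sum(g) / len(g) for g in groups]
        grand = sum(means) / len(means)
        between = sum((m - grand) ** 2 for m in means) / len(means)
        within = sum(
            sum((v - m) ** 2 for v in g) / len(g) for g, m in zip(groups, means)
        ) / len(groups)
        scores.append(between / within if within > 0 else 0.0)
    return scores

def weighted_nn(X, y, weights, query):
    """Return the label of the training sample with the smallest weighted
    squared distance to the query."""
    def dist(a, b):
        return sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b))
    best = min(range(len(X)), key=lambda i: dist(X[i], query))
    return y[best]

# Placeholder data: feature 0 separates the grades, feature 1 is noise
X = [(0.1, 5.0), (0.2, 1.0), (0.9, 4.8), (1.0, 1.2)]
y = ["KL0", "KL0", "KL3", "KL3"]
w = fisher_scores(X, y)
print(w, weighted_nn(X, y, w, (0.6, 1.0)))
```

    With these placeholder values the noisy feature receives a weight near zero, so the weighted rule decides on the informative feature alone. An unweighted rule would be pulled toward whichever sample happens to match the noise.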

  8. Acoustic basis for recognition of aspect-dependent three-dimensional targets by an echolocating bottlenose dolphin.

    PubMed

    Helweg, D A; Au, W W; Roitblat, H L; Nachtigall, P E

    1996-04-01

    The relationships between acoustic features of target echoes and the cognitive representations of the target formed by an echolocating dolphin will influence the ease with which the dolphin can recognize a target. A blindfolded Atlantic bottlenose dolphin (Tursiops truncatus) learned to match aspect-dependent three-dimensional targets (such as a cube) at haphazard orientations, although with some difficulty. This task may have been difficult because aspect-dependent targets produce different echoes at different orientations, which required the dolphin to have some capability for object constancy across changes in echo characteristics. Significant target-related differences in echo amplitude, rms bandwidth, and distributions of interhighlight intervals were observed among echoes collected when the dolphin was performing the task. Targets could be classified using a combination of energy flux density and rms bandwidth by a linear discriminant analysis and a nearest centroid classifier. Neither statistical model could classify targets without amplitude information, but the highest accuracy required spectral information as well. This suggests that the dolphin recognized the targets using a multidimensional representation containing amplitude and spectral information and that dolphins can form stable representations of targets regardless of orientation based on varying sensory properties.
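
    A nearest centroid classifier of the kind mentioned in this abstract assigns a sample to the class whose mean feature vector is closest. The sketch below uses the two features the abstract names (energy flux density and rms bandwidth), but the numeric values, the second target label, and the function names are placeholders.

```python
import math

def fit_centroids(X, y):
    """Average the feature vectors of each class to obtain one centroid per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc) for label, acc in sums.items()}

def predict(centroids, query):
    """Return the label of the centroid nearest to the query (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], query))

# Placeholder echo features: (energy flux density, rms bandwidth), arbitrary units
X = [(1.0, 10.0), (1.2, 12.0), (3.0, 30.0), (3.2, 28.0)]
y = ["cube", "cube", "other_target", "other_target"]

centroids = fit_centroids(X, y)
print(predict(centroids, (1.1, 13.0)))
```

    In practice the two features would probably be standardized first, since otherwise the one with the larger numeric range dominates the distance.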

  9. Discovery and validation of gene classifiers for endocrine-disrupting chemicals in zebrafish (danio rerio)

    PubMed Central

    2012-01-01

    Background Development and application of transcriptomics-based gene classifiers for ecotoxicological applications lag far behind those of biomedical sciences. Many such classifiers discovered thus far lack vigorous statistical and experimental validations. A combination of genetic algorithm/support vector machines and genetic algorithm/K nearest neighbors was used in this study to search for classifiers of endocrine-disrupting chemicals (EDCs) in zebrafish. Searches were conducted on both tissue-specific and tissue-combined datasets, either across the entire transcriptome or within individual transcription factor (TF) networks previously linked to EDC effects. Candidate classifiers were evaluated by gene set enrichment analysis (GSEA) on both the original training data and a dedicated validation dataset. Results Multi-tissue dataset yielded no classifiers. Among the 19 chemical-tissue conditions evaluated, the transcriptome-wide searches yielded classifiers for six of them, each having approximately 20 to 30 gene features unique to a condition. Searches within individual TF networks produced classifiers for 15 chemical-tissue conditions, each containing 100 or fewer top-ranked gene features pooled from those of multiple TF networks and also unique to each condition. For the training dataset, 10 out of 11 classifiers successfully identified the gene expression profiles (GEPs) of their targeted chemical-tissue conditions by GSEA. For the validation dataset, classifiers for prochloraz-ovary and flutamide-ovary also correctly identified the GEPs of corresponding conditions while no classifier could predict the GEP from prochloraz-brain. Conclusions The discrepancies in the performance of these classifiers were attributed in part to varying data complexity among the conditions, as measured to some degree by Fisher’s discriminant ratio statistic. 
    This variation in data complexity could likely be compensated by adjusting sample size for individual chemical-tissue conditions, thus suggesting a need for a preliminary survey of transcriptomic responses before launching a full scale classifier discovery effort. Classifier discovery based on individual TF networks could yield more mechanistically-oriented biomarkers. GSEA proved to be a flexible and effective tool for application of gene classifiers but a similar and more refined algorithm, connectivity mapping, should also be explored. The distribution characteristics of classifiers across tissues, chemicals, and TF networks suggested a differential biological impact among the EDCs on zebrafish transcriptome involving some basic cellular functions. PMID:22849515

  10. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications

    PubMed Central

    Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A.; Anaya, Maribel

    2017-01-01

    Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed. PMID:28230796

  11. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications.

    PubMed

    Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A; Anaya, Maribel

    2017-02-21

    Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed.

  12. Acetobacter fabarum sp. nov., an acetic acid bacterium from a Ghanaian cocoa bean heap fermentation.

    PubMed

    Cleenwerck, Ilse; Gonzalez, Angel; Camu, Nicholas; Engelbeen, Katrien; De Vos, Paul; De Vuyst, Luc

    2008-09-01

    Six acetic acid bacterial isolates, obtained during a study of the microbial diversity of spontaneous fermentations of Ghanaian cocoa beans, were subjected to a polyphasic taxonomic study. (GTG)(5)-PCR fingerprinting grouped the isolates together, but they could not be identified using this method. Phylogenetic analysis based on 16S rRNA gene sequences allocated the isolates to the genus Acetobacter and revealed Acetobacter lovaniensis, Acetobacter ghanensis and Acetobacter syzygii to be nearest neighbours. DNA-DNA hybridizations demonstrated that the isolates belonged to a single novel genospecies that could be differentiated from its phylogenetically nearest neighbours by the following phenotypic characteristics: no production of 2-keto-D-gluconic acid from D-glucose; growth on methanol and D-xylose, but not on maltose, as sole carbon sources; no growth on yeast extract with 30% D-glucose; and weak growth at 37 degrees C. The DNA G+C contents of four selected strains were 56.8-58.0 mol%. The results obtained prove that the isolates should be classified as representatives of a novel Acetobacter species, for which the name Acetobacter fabarum sp. nov. is proposed. The type strain is strain 985(T) (=R-36330(T) =LMG 24244(T) =DSM 19596(T)).

  13. X ray studies of the Hyades cluster

    NASA Technical Reports Server (NTRS)

    Stern, Robert A.

    1993-01-01

    The Hyades cluster occupies a unique position in both the history of astronomy and at the frontiers of contemporary astronomical research. At a distance of only 45 pc, the Hyades is the nearest star cluster in the Galaxy which is localized in the sky: the UMa cluster, which is closer, but much sparser, essentially surrounds the Solar neighborhood. The Hyades is the prototype cluster for distance determination using the 'moving-cluster' method, and thus serves to define the zero-age main sequence from which the cosmic distance scale is essentially bootstrapped. The Hyades age (0.6-0.7 Gyr), nearly 8 times younger than the Sun, guarantees the Hyades critical importance to studies of stellar evolution. The results of a complete survey of the Hyades cluster using the ROSAT All Sky Survey (RASS) are reported.

  14. Multi-Band Received Signal Strength Fingerprinting Based Indoor Location System

    NASA Astrophysics Data System (ADS)

    Sertthin, Chinnapat; Fujii, Takeo; Ohtsuki, Tomoaki; Nakagawa, Masao

    This paper proposes a new multi-band received signal strength (MRSS) fingerprinting based indoor location system, which adds frequency diversity to the conventional single-band received signal strength (RSS) fingerprinting based indoor location system. In the proposed system, the impact of frequency diversity on positioning accuracy is analyzed. The effectiveness of the proposed system is demonstrated experimentally in a non-line-of-sight (NLOS) environment covering an area of 103 m² at Yagami Campus, Keio University. WLAN access points, which simultaneously transmit dual-band signals at 2.4 and 5.2 GHz, are used as transmitters, and a dual-band WLAN receiver is used as the receiver. Signal distances calculated with both the Manhattan and Euclidean metrics were classified by a K-Nearest Neighbor (KNN) classifier to illustrate the performance of the proposed system. The results confirmed that the frequency diversity of the multi-band signal provides an accuracy improvement of over 50% relative to the conventional single-band system.
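
    The fingerprint matching step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the fingerprint values, the room labels, and the function names (manhattan, euclidean, knn_locate) are all hypothetical, and the vote is a plain unweighted majority.

```python
from collections import Counter
import math


def manhattan(a, b):
    """Sum of absolute per-component differences (L1 distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    """Square root of summed squared differences (L2 distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_locate(fingerprints, query, k=3, distance=euclidean):
    """Return the majority location label among the k nearest fingerprints.

    fingerprints: list of (rss_vector, location_label) pairs.
    query: RSS vector measured at the unknown position.
    """
    ranked = sorted(fingerprints, key=lambda fp: distance(fp[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical dual-band fingerprints in dBm, ordered as
# (AP1 at 2.4 GHz, AP2 at 2.4 GHz, AP1 at 5.2 GHz, AP2 at 5.2 GHz).
FINGERPRINTS = [
    ((-40, -70, -45, -78), "room_a"),
    ((-42, -68, -47, -75), "room_a"),
    ((-41, -72, -44, -80), "room_a"),
    ((-70, -40, -77, -46), "room_b"),
    ((-68, -43, -74, -48), "room_b"),
    ((-72, -41, -79, -45), "room_b"),
]

if __name__ == "__main__":
    query = (-43, -69, -46, -77)
    print(knn_locate(FINGERPRINTS, query, k=3, distance=manhattan))
    print(knn_locate(FINGERPRINTS, query, k=3, distance=euclidean))
```

    Concatenating the readings from both bands into one vector is one simple way to let the distance metric use the frequency diversity; a single-band baseline would use only the first two components.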

  15. Chemical data as markers of the geographical origins of sugarcane spirits.

    PubMed

    Serafim, F A T; Pereira-Filho, Edenir R; Franco, D W

    2016-04-01

    In an attempt to classify sugarcane spirits according to their geographic region of origin, chemical data for 24 analytes were evaluated in 50 cachaças produced using a similar procedure in selected regions of Brazil: São Paulo - SP (15), Minas Gerais - MG (11), Rio de Janeiro - RJ (11), Paraíba - PB (9), and Ceará - CE (4). Multivariate analysis was applied to the analytical results, and the predictive abilities of different classification methods were evaluated. Principal component analysis identified five groups, and chemical similarities were observed between MG and SP samples and between RJ and PB samples. CE samples presented a distinct chemical profile. Among the samples, partial least squares discriminant analysis (PLS-DA) classified 50.2% of the samples correctly, K-nearest neighbor (KNN) 86%, and soft independent modeling of class analogy (SIMCA) 56.2%. Therefore, in this proof of concept demonstration, the proposed approach based on chemical data satisfactorily predicted the cachaças' geographic origins. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. White matter lesion extension to automatic brain tissue segmentation on MRI.

    PubMed

    de Boer, Renske; Vrooman, Henri A; van der Lijn, Fedde; Vernooij, Meike W; Ikram, M Arfan; van der Lugt, Aad; Breteler, Monique M B; Niessen, Wiro J

    2009-05-01

    A fully automated brain tissue segmentation method is optimized and extended with white matter lesion segmentation. Cerebrospinal fluid (CSF), gray matter (GM) and white matter (WM) are segmented by an atlas-based k-nearest neighbor classifier on multi-modal magnetic resonance imaging data. This classifier is trained by registering brain atlases to the subject. The resulting GM segmentation is used to automatically find a white matter lesion (WML) threshold in a fluid-attenuated inversion recovery scan. False positive lesions are removed by ensuring that the lesions are within the white matter. The method was visually validated on a set of 209 subjects. No segmentation errors were found in 98% of the brain tissue segmentations and 97% of the WML segmentations. A quantitative evaluation using manual segmentations was performed on a subset of 6 subjects for CSF, GM and WM segmentation and an additional 14 for the WML segmentations. The results indicated that the automatic segmentation accuracy is close to the interobserver variability of manual segmentations.

  17. Method of Menu Selection by Gaze Movement Using AC EOG Signals

    NASA Astrophysics Data System (ADS)

    Kanoh, Shin'ichiro; Futami, Ryoko; Yoshinobu, Tatsuo; Hoshimiya, Nozomu

    A method to detect the direction and the distance of voluntary eye gaze movement from EOG (electrooculogram) signals was proposed and tested. In this method, AC-amplified vertical and horizontal transient EOG signals were classified into 8-class directions and 2-class distances of voluntary eye gaze movements. The horizontal and vertical EOG values during eye gaze movement at each sampling time were treated as a two-dimensional vector, and the center of gravity of the sample vectors whose norms were more than 80% of the maximum norm was used as the feature vector to be classified. Classification using the k-nearest neighbor algorithm showed that the averaged correct detection rates on each subject were 98.9%, 98.7%, 94.4%, respectively. This method can avoid strict EOG-based eye tracking, which requires DC amplification of very small signals. It would be useful for developing robust human interfacing systems based on menu selection for severely paralyzed patients.
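
    The feature extraction rule stated in this abstract (center of gravity of the samples whose norm exceeds 80% of the maximum norm) can be written out as a short sketch. The function name gaze_feature, the sample values, and the handling of an all-zero recording are assumptions made for illustration only.

```python
import math


def gaze_feature(samples, threshold=0.8):
    """Center of gravity of the (horizontal, vertical) EOG samples whose
    norm is strictly greater than threshold * maximum norm.

    samples: list of (horizontal, vertical) pairs, one per sampling time.
    Returns (0.0, 0.0) for an all-zero recording (an assumption of this sketch).
    """
    norms = [math.hypot(h, v) for h, v in samples]
    max_norm = max(norms)
    if max_norm == 0:
        return (0.0, 0.0)
    kept = [s for s, n in zip(samples, norms) if n > threshold * max_norm]
    cx = sum(h for h, _ in kept) / len(kept)
    cy = sum(v for _, v in kept) / len(kept)
    return (cx, cy)


if __name__ == "__main__":
    # Hypothetical transient: small onset samples, then a rightward peak.
    samples = [(0.1, 0.0), (1.0, 0.0), (0.9, 0.1), (0.2, 0.2)]
    print(gaze_feature(samples))
```

    Only the two large samples survive the threshold here, so the feature points along the dominant direction of the movement and ignores the low-amplitude onset.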

  18. Classification of older adults with/without a fall history using machine learning methods.

    PubMed

    Lin Zhang; Ou Ma; Fabre, Jennifer M; Wood, Robert H; Garcia, Stephanie U; Ivey, Kayla M; McCann, Evan D

    2015-01-01

    Falling is a serious problem in an aged society such that assessment of the risk of falls for individuals is imperative for the research and practice of falls prevention. This paper introduces an application of several machine learning methods for training a classifier which is capable of classifying individual older adults into a high risk group and a low risk group (distinguished by whether or not the members of the group have a recent history of falls). Using a 3D motion capture system, significant gait features related to falls risk are extracted. By training these features, classification hypotheses are obtained based on machine learning techniques (K Nearest-neighbour, Naive Bayes, Logistic Regression, Neural Network, and Support Vector Machine). Training and test accuracies with sensitivity and specificity of each of these techniques are assessed. The feature adjustment and tuning of the machine learning algorithms are discussed. The outcome of the study will benefit the prediction and prevention of falls.

  19. Lagrangian methods of cosmic web classification

    NASA Astrophysics Data System (ADS)

    Fisher, J. D.; Faltenbacher, A.; Johnson, M. S. T.

    2016-05-01

    The cosmic web defines the large-scale distribution of matter we see in the Universe today. Classifying the cosmic web into voids, sheets, filaments and nodes allows one to explore structure formation and the role environmental factors have on halo and galaxy properties. While existing studies of cosmic web classification concentrate on grid-based methods, this work explores a Lagrangian approach where the V-web algorithm proposed by Hoffman et al. is implemented with techniques borrowed from smoothed particle hydrodynamics. The Lagrangian approach allows one to classify individual objects (e.g. particles or haloes) based on properties of their nearest neighbours in an adaptive manner. It can be applied directly to a halo sample which dramatically reduces computational cost and potentially allows an application of this classification scheme to observed galaxy samples. Finally, the Lagrangian nature admits a straightforward inclusion of the Hubble flow negating the necessity of a visually defined threshold value which is commonly employed by grid-based classification methods.

  20. Urban Shanty Town Recognition Based on High-Resolution Remote Sensing Images and National Geographical Monitoring Features - a Case Study of Nanning City

    NASA Astrophysics Data System (ADS)

    He, Y.; He, Y.

    2018-04-01

    Urban shanty towns are communities that have contiguous old and dilapidated houses with more than 2000 square meters of built-up area or more than 50 households. This study attempts to extract shanty towns in Nanning City using the product of Census and TripleSat satellite images. With 0.8-meter high-resolution remote sensing images, five texture characteristics (energy, contrast, maximum probability, and inverse difference moment) of shanty towns are trained and analyzed through GLCM. In this study, samples of shanty towns are well classified, with 98.2 % producer accuracy for unsupervised classification and 73.2 % supervised classification correctness. Low-rise and mid-rise residential blocks in Nanning City are classified into 4 different types by using k-means clustering and nearest neighbour classification respectively. This study establishes initial texture feature descriptions of different types of residential areas, especially low-rise and mid-rise buildings, which would help city administrators evaluate residential blocks and reconstruct shanty towns.
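
    The four texture measures named in this abstract have standard definitions on a normalized gray-level co-occurrence matrix (GLCM), sketched below. The tiny test image, the single horizontal offset, the non-symmetric counting, and the function names glcm and texture_features are assumptions of this sketch; "energy" is taken here as the sum of squared probabilities, although some texts use its square root.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy).

    image: 2D list of integer gray levels in range(levels).
    Assumes the image contains at least one valid pixel pair.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]


def texture_features(p):
    """Energy, contrast, maximum probability and inverse difference moment."""
    cells = [(i, j, v) for i, row in enumerate(p) for j, v in enumerate(row)]
    return {
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "max_probability": max(v for _, _, v in cells),
        "inverse_difference_moment": sum(
            v / (1 + (i - j) ** 2) for i, j, v in cells
        ),
    }


if __name__ == "__main__":
    # Hypothetical 3x3 patch quantized to three gray levels.
    patch = [[0, 0, 1],
             [0, 0, 1],
             [2, 2, 2]]
    print(texture_features(glcm(patch, levels=3)))
```

    Feature vectors of this kind, computed per image block, are what a k-means or nearest neighbour step would then group or label.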

  1. Fuzzy support vector machine: an efficient rule-based classification technique for microarrays.

    PubMed

    Hajiloo, Mohsen; Rabiee, Hamid R; Anooshahpour, Mahdi

    2013-01-01

    The abundance of gene expression microarray data has led to the development of machine learning algorithms applicable for tackling disease diagnosis, disease prognosis, and treatment selection problems. However, these algorithms often produce classifiers with weaknesses in terms of accuracy, robustness, and interpretability. This paper introduces fuzzy support vector machine which is a learning algorithm based on combination of fuzzy classifiers and kernel machines for microarray classification. Experimental results on public leukemia, prostate, and colon cancer datasets show that fuzzy support vector machine applied in combination with filter or wrapper feature selection methods develops a robust model with higher accuracy than the conventional microarray classification models such as support vector machine, artificial neural network, decision trees, k nearest neighbors, and diagonal linear discriminant analysis. Furthermore, the interpretable rule-base inferred from fuzzy support vector machine helps extracting biological knowledge from microarray data. Fuzzy support vector machine as a new classification model with high generalization power, robustness, and good interpretability seems to be a promising tool for gene expression microarray classification.

  2. Automated robot-assisted surgical skill evaluation: Predictive analytics approach.

    PubMed

    Fard, Mahtab J; Ameri, Sattar; Darin Ellis, R; Chinnam, Ratna B; Pandya, Abhilash K; Klein, Michael D

    2018-02-01

    Surgical skill assessment has predominantly been a subjective task. Recently, technological advances such as robot-assisted surgery have created great opportunities for objective surgical evaluation. In this paper, we introduce a predictive framework for objective skill assessment based on movement trajectory data. Our aim is to build a classification framework to automatically evaluate the performance of surgeons with different levels of expertise. Eight global movement features are extracted from movement trajectory data captured by a da Vinci robot for surgeons with two levels of expertise - novice and expert. Three classification methods - k-nearest neighbours, logistic regression and support vector machines - are applied. The result shows that the proposed framework can classify surgeons' expertise as novice or expert with an accuracy of 82.3% for knot tying and 89.9% for a suturing task. This study demonstrates and evaluates the ability of machine learning methods to automatically classify expert and novice surgeons using global movement features. Copyright © 2017 John Wiley & Sons, Ltd.

  3. CAVIAR: CLASSIFICATION VIA AGGREGATED REGRESSION AND ITS APPLICATION IN CLASSIFYING OASIS BRAIN DATABASE

    PubMed Central

    Chen, Ting; Rangarajan, Anand; Vemuri, Baba C.

    2010-01-01

    This paper presents a novel classification via aggregated regression algorithm – dubbed CAVIAR – and its application to the OASIS MRI brain image database. The CAVIAR algorithm simultaneously combines a set of weak learners based on the assumption that the weight combination for the final strong hypothesis in CAVIAR depends on both the weak learners and the training data. A regularization scheme using the nearest neighbor method is imposed in the testing stage to avoid overfitting. A closed form solution to the cost function is derived for this algorithm. We use a novel feature – the histogram of the deformation field between the MRI brain scan and the atlas which captures the structural changes in the scan with respect to the atlas brain – and this allows us to automatically discriminate between various classes within OASIS [1] using CAVIAR. We empirically show that CAVIAR significantly increases the performance of the weak classifiers by showcasing the performance of our technique on OASIS. PMID:21151847

  4. CAVIAR: CLASSIFICATION VIA AGGREGATED REGRESSION AND ITS APPLICATION IN CLASSIFYING OASIS BRAIN DATABASE.

    PubMed

    Chen, Ting; Rangarajan, Anand; Vemuri, Baba C

    2010-04-14

    This paper presents a novel classification via aggregated regression algorithm - dubbed CAVIAR - and its application to the OASIS MRI brain image database. The CAVIAR algorithm simultaneously combines a set of weak learners based on the assumption that the weight combination for the final strong hypothesis in CAVIAR depends on both the weak learners and the training data. A regularization scheme using the nearest neighbor method is imposed in the testing stage to avoid overfitting. A closed form solution to the cost function is derived for this algorithm. We use a novel feature - the histogram of the deformation field between the MRI brain scan and the atlas which captures the structural changes in the scan with respect to the atlas brain - and this allows us to automatically discriminate between various classes within OASIS [1] using CAVIAR. We empirically show that CAVIAR significantly increases the performance of the weak classifiers by showcasing the performance of our technique on OASIS.

  5. Classification of speech dysfluencies using LPC based parameterization techniques.

    PubMed

    Hariharan, M; Chee, Lim Sin; Ai, Ooi Chia; Yaacob, Sazali

    2012-06-01

    The goal of this paper is to discuss and compare three feature extraction methods: Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Weighted Linear Prediction Cepstral Coefficients (WLPCC) for recognizing stuttered events. Speech samples from the University College London Archive of Stuttered Speech (UCLASS) were used for our analysis. The stuttered events were identified through manual segmentation and were used for feature extraction. Two simple classifiers, namely k-nearest neighbour (kNN) and Linear Discriminant Analysis (LDA), were employed for speech dysfluency classification. A conventional validation method was used for testing the reliability of the classifier results. The effects of different frame lengths, percentages of overlap, values of α in a first-order pre-emphasizer and different orders p are discussed. The speech dysfluency classification accuracy was found to improve when statistical normalization was applied before feature extraction. The experimental investigation showed that LPC, LPCC and WLPCC features can be used for identifying stuttered events and that WLPCC features slightly outperform LPCC and LPC features.

  6. Breast Cancer Detection with Reduced Feature Set.

    PubMed

    Mert, Ahmet; Kılıç, Niyazi; Bilgili, Erdem; Akan, Aydin

    2015-01-01

    This paper explores the feature reduction properties of independent component analysis (ICA) for a breast cancer decision support system. The Wisconsin diagnostic breast cancer (WDBC) dataset is reduced to a one-dimensional feature vector by computing an independent component (IC). The original data with 30 features and the reduced single feature (IC) are used to evaluate the diagnostic accuracy of classifiers such as k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function neural network (RBFNN), and support vector machine (SVM). The comparison of the proposed classification using the IC with the original feature set is also tested on different validation (5/10-fold cross-validations) and partitioning (20%-40%) methods. The classifiers are evaluated on how effectively they categorize tumors as benign or malignant in terms of specificity, sensitivity, accuracy, F-score, Youden's index, discriminant power, and the receiver operating characteristic (ROC) curve with its criterion values including area under curve (AUC) and 95% confidence interval (CI). This represents an improvement in the diagnostic decision support system, while reducing computational complexity.
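
    Several of the evaluation measures listed in this abstract follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the counts in the example are made up for illustration and are not results from the paper, and the function name diagnostic_metrics is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard binary diagnostic measures from confusion-matrix counts.

    tp/fn: malignant cases predicted malignant/benign.
    tn/fp: benign cases predicted benign/malignant.
    """
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    f_score = 2 * tp / (2 * tp + fp + fn)     # harmonic mean of precision and recall
    youden = sensitivity + specificity - 1    # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
        "youden_index": youden,
    }


if __name__ == "__main__":
    # Hypothetical counts: 50 malignant and 50 benign test cases.
    for name, value in diagnostic_metrics(tp=40, fn=10, tn=45, fp=5).items():
        print(f"{name}: {value:.3f}")
```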

  7. Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection.

    PubMed

    Attallah, Omneya; Karthikesalingam, Alan; Holt, Peter Je; Thompson, Matthew M; Sayers, Rob; Bown, Matthew J; Choke, Eddie C; Ma, Xianghong

    2017-11-01

    Feature selection is essential in the medical area; however, the process becomes complicated in the presence of censoring, which is the unique characteristic of survival analysis. Most survival feature selection methods are based on Cox's proportional hazard model, though machine learning classifiers are preferred. They are less employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that employed machine learning classifiers, the partial logistic artificial neural network with auto-relevance determination is a well-known method that deals with censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data. Other methods cannot deal with high censoring. Therefore, in this article, a new hybrid feature selection method is proposed which presents a solution to high-level censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The new hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce features. Two endovascular aortic repair datasets containing 91% censored patients collected from two centers were used to construct a multicenter study to evaluate the performance of the proposed approach. The results showed the proposed technique outperformed individual classifiers and variable selection methods based on Cox's model, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in p values of the log-rank test, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful in correctly predicting the risk of re-intervention, enabling doctors to select patients' future follow-up plans.
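
    The two combination rules mentioned in this abstract, simple and weighted majority voting, can be illustrated generically. This sketch is not the authors' method: the abstract says their weights are derived from a survival metric but does not give the formula, so the weights below are arbitrary placeholders, and the function name weighted_vote and the labels are hypothetical.

```python
from collections import defaultdict


def weighted_vote(predictions, weights=None):
    """Combine one label per classifier into a single label.

    predictions: list of labels, one from each base classifier.
    weights: optional list of non-negative weights; equal weights
             (simple majority voting) are used when omitted.
    Ties are resolved in favor of the label seen first.
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Hypothetical outputs of SVM, neural network and KNN for one patient.
    votes = ["high_risk", "low_risk", "low_risk"]
    print(weighted_vote(votes))                   # simple majority
    print(weighted_vote(votes, [0.9, 0.3, 0.3]))  # placeholder weights
```

    With equal weights the two low-risk votes win; with the placeholder weights the single heavily weighted classifier overrides them, which is the behavior a performance-based weighting is meant to allow.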

  8. A fresh look at functional link neural network for motor imagery-based brain-computer interface.

    PubMed

    Hettiarachchi, Imali T; Babaei, Toktam; Nguyen, Thanh; Lim, Chee P; Nahavandi, Saeid

    2018-05-04

    Artificial neural networks (ANNs) are among the most widely used classifiers in brain-computer interface (BCI) systems based on noninvasive electroencephalography (EEG) signals. Among the different ANN architectures, the most commonly applied for BCI classifiers is the multilayer perceptron (MLP). When appropriately designed with an optimal number of neuron layers and number of neurons per layer, the ANN can act as a universal approximator. However, due to the low signal-to-noise ratio of EEG signal data, overtraining may become an inherent issue, causing these universal approximators to fail in real-time applications. In this study we introduce a higher order neural network, namely the functional link neural network (FLNN), as a classifier for motor imagery (MI)-based BCI systems, to remedy the drawbacks of the MLP. We compare the proposed method with competing classifiers such as linear decomposition analysis, naïve Bayes, k-nearest neighbours, support vector machine and three MLP architectures. Two multi-class benchmark datasets from the BCI competitions are used. The common spatial pattern algorithm is utilized for feature extraction to build classification models. FLNN reports the highest average Kappa value over multiple subjects for both the BCI competition datasets, under similarly preprocessed data and extracted features. Further, statistical comparison results over multiple subjects show that the proposed FLNN classification method yields the best performance among the competing classifiers. Findings from this study imply that the proposed method, which has less computational complexity compared to the MLP, can be implemented effectively in practical MI-based BCI systems. Copyright © 2018 Elsevier B.V. All rights reserved.

  9. Training set optimization and classifier performance in a top-down diabetic retinopathy screening system

    NASA Astrophysics Data System (ADS)

    Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.

    2013-03-01

    Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and k-NN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve gives evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.

  10. [Terahertz Spectroscopic Identification with Deep Belief Network].

    PubMed

    Ma, Shuai; Shen, Tao; Wang, Rui-qi; Lai, Hua; Yu, Zheng-tao

    2015-12-01

    Feature extraction and classification are the key issues of terahertz spectroscopy identification. Because many materials have no apparent absorption peaks in the terahertz band, it is difficult to extract their terahertz spectral features and identify them. To this end, a novel terahertz spectroscopy identification approach based on a Deep Belief Network (DBN) was studied in this paper, which combines the advantages of the DBN and the K-Nearest Neighbors (KNN) classifier. Firstly, cubic spline interpolation and an S-G filter were used to normalize the terahertz transmission spectra of eight kinds of substances (ATP, Acetylcholine Bromide, Bifenthrin, Buprofezin, Carbazole, Bleomycin, Buckminster and Cylotriphosphazene) in the range of 0.9-6 THz. Secondly, the DBN model was built from two restricted Boltzmann machines (RBMs) and then trained layer by layer using an unsupervised approach. Instead of using handmade features, the DBN was employed to learn suitable features automatically from raw input data. Finally, a KNN classifier was applied to identify the terahertz spectrum. Experimental results show that the features learned by the DBN can identify the terahertz spectra of different substances with a recognition rate of over 90%, which demonstrates that the proposed method can automatically extract the effective features of terahertz spectra. Furthermore, this KNN classifier was compared with others (BP neural network, SOM neural network and RBF neural network). Comparisons showed that the recognition rate of the KNN classifier is better than that of the other three classifiers. Automatically extracting terahertz spectrum features with the DBN can greatly reduce the workload of feature extraction. The proposed method shows a promising future in the application of identifying mass terahertz spectroscopy data.

  11. Do unpaved, low-traffic roads affect bird communities?

    NASA Astrophysics Data System (ADS)

    Mammides, Christos; Kounnamas, Constantinos; Goodale, Eben; Kadis, Costas

    2016-02-01

    Unpaved, low traffic roads are often assumed to have minimal effects on biodiversity. To explore this assertion, we sampled the bird communities in fifteen randomly selected sites in Pafos Forest, Cyprus and used multiple regression to quantify the effects of such roads on the total species richness. Moreover, we classified birds according to their migratory status and their global population trends, and tested each category separately. Besides the total length of unpaved roads, we also tested: a. the site's habitat diversity, b. the coefficient of variation in habitat (patch) size, c. the distance to the nearest agricultural field, and d. the human population size of the nearest village. We measured our variables at six different distances from the bird point-count locations. We found a strong negative relationship between the total bird richness and the total length of unpaved roads. The human population size of the nearest village also had a negative effect. Habitat diversity was positively related to species richness. When the categories were tested, we found that the passage migrants were influenced more by the road network while resident breeders were influenced by habitat diversity. Species with increasing and stable populations were only marginally affected by the variables tested, but the effect of road networks on species with decreasing populations was large. We conclude that unpaved and sporadically used roads can have detrimental effects on the bird communities, especially on vulnerable species. We propose that actions are taken to limit the extent of road networks within protected areas, especially in sites designated for their rich avifauna, such as Pafos Forest, where several of the affected species are species of European and global importance.

  12. Alternative method to validate the seasonal land cover regions of the conterminous United States

    Treesearch

    Zhiliang Zhu; Donald O. Ohlen; Raymond L. Czaplewski; Robert E. Burgan

    1996-01-01

    An accuracy assessment method involving double sampling and the multivariate composite estimator has been used to validate the prototype seasonal land cover characteristics database of the conterminous United States. The database consists of 159 land cover classes, classified using time series of 1990 1-km satellite data and augmented with ancillary data including...

  13. Online myoelectric control of a dexterous hand prosthesis by transradial amputees.

    PubMed

    Cipriani, Christian; Antfolk, Christian; Controzzi, Marco; Lundborg, Göran; Rosen, Birgitta; Carrozza, Maria Chiara; Sebelius, Fredrik

    2011-06-01

    A real-time pattern recognition algorithm based on k-nearest neighbors and lazy learning was used to classify voluntary electromyography (EMG) signals and to simultaneously control movements of a dexterous artificial hand. EMG signals were superficially recorded by eight pairs of electrodes from the stumps of five transradial amputees and forearms of five able-bodied participants and used online to control a robot hand. Seven finger movements (not involving the wrist) were investigated in this study. The first objective was to understand whether and to what extent it is possible to control, continuously and in real time, the finger postures of a prosthetic hand using superficial EMG and a practical classifier, also taking advantage of the direct visual feedback of the moving hand. The second objective was to calculate statistical differences in the performance between participants and groups, thereby assessing the general applicability of the proposed method. The average accuracy of the classifier was 79% for amputees and 89% for able-bodied participants. Statistical analysis of the data revealed a difference in control accuracy based on the aetiology of amputation, type of prostheses regularly used and also between able-bodied participants and amputees. These results are encouraging for the development of noninvasive EMG interfaces for the control of dexterous prostheses.

  14. Feature extraction and classification of clouds in high resolution panchromatic satellite imagery

    NASA Astrophysics Data System (ADS)

    Sharghi, Elan

    The development of sophisticated remote sensing sensors is rapidly increasing, and the vast amount of satellite imagery collected is too large to be analyzed manually by a human image analyst. It has become necessary to develop a tool that automates the job of an image analyst. This tool would need to intelligently detect and classify objects of interest through computer vision algorithms. Existing software called the Rapid Image Exploitation Resource (RAPIER®) was designed by engineers at Space and Naval Warfare Systems Center Pacific (SSC PAC) to perform exactly this function. This software automatically searches for anomalies in the ocean and reports the detections as possible ship objects. However, if the image contains a high percentage of cloud coverage, a high number of false positives are triggered by the clouds. The focus of this thesis is to explore various feature extraction and classification methods to accurately distinguish clouds from ship objects. A texture analysis method, line detection using the Hough transform, and edge detection using wavelets are examined as possible feature extraction methods. The features are then supplied to a K-Nearest Neighbors (KNN) or Support Vector Machine (SVM) classifier. Parameter options for these classifiers are explored and the optimal parameters are determined.

  15. Retinopathy of Prematurity-assist: Novel Software for Detecting Plus Disease

    PubMed Central

    Pour, Elias Khalili; Pourreza, Hamidreza; Zamani, Kambiz Ameli; Mahmoudi, Alireza; Sadeghi, Arash Mir Mohammad; Shadravan, Mahla; Karkhaneh, Reza; Pour, Ramak Rouhi

    2017-01-01

    Purpose: To design software with a novel algorithm, which analyzes the tortuosity and vascular dilatation in fundal images of retinopathy of prematurity (ROP) patients with an acceptable accuracy for detecting plus disease. Methods: Eighty-seven well-focused fundal images taken with RetCam were classified into three groups of plus, non-plus, and pre-plus by agreement between three ROP experts. Automated algorithms in this study were designed based on two methods: the curvature measure and the distance transform for assessment of tortuosity and vascular dilatation, respectively, as two major parameters of plus disease detection. Results: Thirty-eight plus, 12 pre-plus, and 37 non-plus images, which were classified by three experts, were tested by an automated algorithm, and the software evaluated the correct grouping of images in comparison to expert voting with three different classifiers: k-nearest neighbor, support vector machine and multilayer perceptron network. The plus, pre-plus, and non-plus images were analyzed with 72.3%, 83.7%, and 84.4% accuracy, respectively. Conclusions: The new automated algorithm used in this pilot scheme for diagnosis and screening of patients with plus ROP has acceptable accuracy. With more improvements, it may become particularly useful, especially in centers without a skilled person in the ROP field. PMID:29022295

  16. Aesthetic preference recognition of 3D shapes using EEG.

    PubMed

    Chew, Lin Hou; Teo, Jason; Mountstephens, James

    2016-04-01

    Recognition and identification of aesthetic preference is indispensable in industrial design. Humans tend to pursue products with aesthetic values and make buying decisions based on their aesthetic preferences. Neuromarketing exists to understand consumer responses toward marketing stimuli by using imaging techniques and recognition of physiological parameters. Numerous studies have been done to understand the relationship between humans, art and aesthetics. In this paper, we present a novel preference-based measurement of user aesthetics using electroencephalogram (EEG) signals for virtual 3D shapes with motion. The 3D shapes are designed to appear like bracelets, which are generated by using the Gielis superformula. EEG signals were collected by using a medical grade device, the B-Alert X10 from Advanced Brain Monitoring, with a sampling frequency of 256 Hz and resolution of 16 bits. The signals obtained when viewing 3D bracelet shapes were decomposed into alpha, beta, theta, gamma and delta rhythms by using time-frequency analysis, then classified into two classes, namely like and dislike, by using support vector machine and K-nearest neighbors (KNN) classifiers. Classification accuracy of up to 80% was obtained by using KNN with the alpha, theta and delta rhythms as the features extracted from the frontal channels Fz, F3 and F4 to classify the two classes, like and dislike.

  17. A three-parameter model for classifying anurans into four genera based on advertisement calls.

    PubMed

    Gingras, Bruno; Fitch, William Tecumseh

    2013-01-01

    The vocalizations of anurans are innate in structure and may therefore contain indicators of phylogenetic history. Thus, advertisement calls of species which are more closely related phylogenetically are predicted to be more similar than those of distant species. This hypothesis was evaluated by comparing several widely used machine-learning algorithms. Recordings of advertisement calls from 142 species belonging to four genera were analyzed. A logistic regression model, using mean values for dominant frequency, coefficient of variation of root-mean square energy, and spectral flux, correctly classified advertisement calls with regard to genus with an accuracy above 70%. Similar accuracy rates were obtained using these parameters with a support vector machine model, a K-nearest neighbor algorithm, and a multivariate Gaussian distribution classifier, whereas a Gaussian mixture model performed slightly worse. In contrast, models based on mel-frequency cepstral coefficients did not fare as well. Comparable accuracy levels were obtained on out-of-sample recordings from 52 of the 142 original species. The results suggest that a combination of low-level acoustic attributes is sufficient to discriminate efficiently between the vocalizations of these four genera, thus supporting the initial premise and validating the use of high-throughput algorithms on animal vocalizations to evaluate phylogenetic hypotheses.

  18. Classification of sodium MRI data of cartilage using machine learning.

    PubMed

    Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R

    2015-11-01

    To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.

  19. The Performance of Short-Term Heart Rate Variability in the Detection of Congestive Heart Failure

    PubMed Central

    Barros, Allan Kardec; Ohnishi, Noboru

    2016-01-01

    Congestive heart failure (CHF) is a cardiac disease associated with the decreasing capacity of the cardiac output. It has been shown that CHF is the main cause of cardiac death around the world. Some works proposed to discriminate CHF subjects from healthy subjects using either electrocardiogram (ECG) or heart rate variability (HRV) from long-term recordings. In this work, we propose an alternative framework to discriminate CHF from healthy subjects by using HRV short-term intervals based on 256 RR continuous samples. Our framework uses a matching pursuit algorithm based on Gabor functions. From the selected Gabor functions, we derived a set of features that are input into a hybrid framework which uses a genetic algorithm and a k-nearest neighbour classifier to select a subset of features that has the best classification performance. The performance of the framework is analyzed using both the Fantasia and CHF databases from the Physionet archives, which are, respectively, composed of 40 healthy volunteers and 29 subjects. From a set of 16 nonstandard features, the proposed framework reaches an overall accuracy of 100% with five features. Our results suggest that the application of hybrid frameworks whose classifier algorithms are based on genetic algorithms has outperformed well-known classifier methods. PMID:27891509

  20. Classification of CT examinations for COPD visual severity analysis

    NASA Astrophysics Data System (ADS)

    Tan, Jun; Zheng, Bin; Wang, Xingwei; Pu, Jiantao; Gur, David; Sciurba, Frank C.; Leader, J. Ken

    2012-03-01

    In this study we present a computational method for classifying CT examinations into visually assessed emphysema severity categories. The visual severity categories ranged from 0 to 5 and were rated by an experienced radiologist. The six categories were none, trace, mild, moderate, severe and very severe. Lung segmentation was performed for every input image and all image features were extracted from the segmented lung only. We adopted a two-level feature representation method for the classification. Five gray level distribution statistics, six gray level co-occurrence matrix (GLCM) features, and eleven gray level run-length (GLRL) features were computed for the segmented lung depicted in each CT image. Then we used wavelet decomposition to obtain the low- and high-frequency components of the input image, and again extracted from the lung region six GLCM features and eleven GLRL features. Therefore our feature vector length is 56. The CT examinations were classified using the support vector machine (SVM), k-nearest neighbors (KNN) and the traditional threshold (density mask) approach. The SVM classifier had the highest classification performance of all the methods, with an overall sensitivity of 54.4% and a 69.6% sensitivity to discriminate "no" and "trace" visually assessed emphysema. We believe this work may lead to an automated, objective method to categorically classify emphysema severity on CT exams.
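    The feature count quoted in the abstract above can be checked with a short worked sketch. It assumes (the abstract does not say so explicitly) that the six GLCM and eleven GLRL features are recomputed once for each of the two wavelet components, low- and high-frequency; that reading is the one that reproduces the stated length of 56. The function and constant names are illustrative only.

```python
# Feature counts as stated in the abstract.
GRAY_LEVEL_STATS = 5   # gray level distribution statistics
GLCM = 6               # gray level co-occurrence matrix features
GLRL = 11              # gray level run-length features


def feature_vector_length(wavelet_components: int = 2) -> int:
    """Length of the two-level feature vector.

    Assumption: the GLCM and GLRL features are recomputed once per
    wavelet component (low- and high-frequency).
    """
    level_one = GRAY_LEVEL_STATS + GLCM + GLRL      # features of the original image
    level_two = wavelet_components * (GLCM + GLRL)  # features of the wavelet components
    return level_one + level_two


print(feature_vector_length())
```

    Under that assumption the first level contributes 22 features and the second level 34.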

  1. A statistical-textural-features based approach for classification of solid drugs using surface microscopic images.

    PubMed

    Tahir, Fahima; Fahiem, Muhammad Abuzar

    2014-01-01

    The quality of pharmaceutical products plays an important role in the pharmaceutical industry as well as in our lives. Usage of defective tablets can be harmful for patients. In this research we proposed a nondestructive method to identify defective and nondefective tablets using their surface morphology. Three different environmental factors (temperature, humidity and moisture) are analyzed to evaluate the performance of the proposed method. Multiple textural features are extracted from the surface of the defective and nondefective tablets. These textural features are the gray level cooccurrence matrix, run length matrix, histogram, autoregressive model and HAAR wavelet. In total, 281 textural features are extracted from the images. We performed an analysis on all 281 features, the top 15, and the top 2. The top 15 features are selected using three different feature reduction techniques: chi-square, gain ratio and relief-F. In this research we have used three different classifiers (support vector machine, K-nearest neighbors and naïve Bayes) to calculate the accuracies of the proposed method using two experiments, that is, the leave-one-out cross-validation technique and train-test models. We tested each classifier against all selected features and then compared their results. The experiments showed that in most cases SVM performed better than the other two classifiers.

  2. Construction accident narrative classification: An evaluation of text mining techniques.

    PubMed

    Goh, Yang Miang; Ubeynarayana, C U

    2017-11-01

    Learning from past accidents is fundamental to accident prevention. Thus, accident and near miss reporting are encouraged by organizations and regulators. However, for organizations managing large safety databases, the time taken to accurately classify accident and near miss narratives will be very significant. This study aims to evaluate the utility of various text mining classification techniques in classifying 1000 publicly available construction accident narratives obtained from the US OSHA website. The study evaluated six machine learning algorithms, including support vector machine (SVM), linear regression (LR), random forest (RF), k-nearest neighbor (KNN), decision tree (DT) and Naive Bayes (NB), and found that SVM produced the best performance in classifying the test set of 251 cases. Further experimentation with tokenization of the processed text and non-linear SVM was also conducted. In addition, a grid search was conducted on the hyperparameters of the SVM models. It was found that the best performing classifiers were linear SVM with unigram tokenization and radial basis function (RBF) SVM with unigram tokenization. In view of its relative simplicity, the linear SVM is recommended. Across the 11 labels of accident causes or types, the precision of the linear SVM ranged from 0.5 to 1, recall ranged from 0.36 to 0.9 and F1 score was between 0.45 and 0.92. The reasons for misclassification were discussed and suggestions on ways to improve the performance were provided. Copyright © 2017 Elsevier Ltd. All rights reserved.
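    For readers unfamiliar with the per-label metrics reported above, the following minimal sketch shows how precision, recall and F1 are computed for one label from counts of true positives, false positives and false negatives. The counts in the example are hypothetical and are not taken from the study.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Per-label precision, recall and F1 from confusion counts.

    tp: narratives correctly assigned the label
    fp: narratives wrongly assigned the label
    fn: narratives with the label that the classifier missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for a single accident label.
p, r, f1 = precision_recall_f1(tp=9, fp=1, fn=3)
print(round(p, 3), round(r, 3), round(f1, 3))
```

    Because F1 is a harmonic mean, it always lies between the precision and recall of the same label and sits closer to the smaller of the two.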

  3. MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence.

    PubMed

    Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2015-06-15

    Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using 'learning to rank'. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. The software is available upon request. © The Author 2015. Published by Oxford University Press.

  4. MeSHLabeler: improving the accuracy of large-scale MeSH indexing by integrating diverse evidence

    PubMed Central

    Liu, Ke; Peng, Shengwen; Wu, Junqiu; Zhai, Chengxiang; Mamitsuka, Hiroshi; Zhu, Shanfeng

    2015-01-01

    Motivation: Medical Subject Headings (MeSHs) are used by National Library of Medicine (NLM) to index almost all citations in MEDLINE, which greatly facilitates the applications of biomedical information retrieval and text mining. To reduce the time and financial cost of manual annotation, NLM has developed a software package, Medical Text Indexer (MTI), for assisting MeSH annotation, which uses k-nearest neighbors (KNN), pattern matching and indexing rules. Other types of information, such as prediction by MeSH classifiers (trained separately), can also be used for automatic MeSH annotation. However, existing methods cannot effectively integrate multiple evidence for MeSH annotation. Methods: We propose a novel framework, MeSHLabeler, to integrate multiple evidence for accurate MeSH annotation by using ‘learning to rank’. Evidence includes numerous predictions from MeSH classifiers, KNN, pattern matching, MTI and the correlation between different MeSH terms, etc. Each MeSH classifier is trained independently, and thus prediction scores from different classifiers are incomparable. To address this issue, we have developed an effective score normalization procedure to improve the prediction accuracy. Results: MeSHLabeler won the first place in Task 2A of 2014 BioASQ challenge, achieving the Micro F-measure of 0.6248 for 9,040 citations provided by the BioASQ challenge. Note that this accuracy is around 9.15% higher than 0.5724, obtained by MTI. Availability and implementation: The software is available upon request. Contact: zhusf@fudan.edu.cn PMID:26072501

  5. Diagnosis of multiple sclerosis from EEG signals using nonlinear methods.

    PubMed

    Torabi, Ali; Daliri, Mohammad Reza; Sabzposhan, Seyyed Hojjat

    2017-12-01

    EEG signals have essential and important information about the brain and neural diseases. The main purpose of this study is classifying two groups of healthy volunteers and Multiple Sclerosis (MS) patients using nonlinear features of EEG signals while performing cognitive tasks. EEG signals were recorded when users were doing two different attentional tasks. One of the tasks was based on detecting a desired change in color luminance and the other task was based on detecting a desired change in direction of motion. EEG signals were analyzed in two ways: EEG signals analysis without rhythms decomposition and EEG sub-bands analysis. After recording and preprocessing, time delay embedding method was used for state space reconstruction; embedding parameters were determined for original signals and their sub-bands. Afterwards nonlinear methods were used in feature extraction phase. To reduce the feature dimension, scalar feature selections were done by using T-test and Bhattacharyya criteria. Then, the data were classified using linear support vector machines (SVM) and k-nearest neighbor (KNN) method. The best combination of the criteria and classifiers was determined for each task by comparing performances. For both tasks, the best results were achieved by using T-test criterion and SVM classifier. For the direction-based and the color-luminance-based tasks, maximum classification performances were 93.08 and 79.79% respectively which were reached by using optimal set of features. Our results show that the nonlinear dynamic features of EEG signals seem to be useful and effective in MS diseases diagnosis.

  6. Anomaly detection in forward looking infrared imaging using one-class classifiers

    NASA Astrophysics Data System (ADS)

    Popescu, Mihail; Stone, Kevin; Havens, Timothy; Ho, Dominic; Keller, James

    2010-04-01

    In this paper we describe a method for generating cues of possible abnormal objects present in the field of view of an infrared (IR) camera installed on a moving vehicle. The proposed method has two steps. In the first step, for each frame, we generate a set of possible points of interest using a corner detection algorithm. In the second step, the points related to the background are discarded from the point set using a one-class classifier (OCC) trained on features extracted from a local neighborhood of each point. The advantage of using an OCC is that we do not need examples from the "abnormal object" class to train the classifier. Instead, the OCC is trained using corner points from images known to be abnormal object free, i.e., that contain only background scenes. To further reduce the number of false alarms we use a temporal fusion procedure: a region has to be detected as "interesting" in m out of n, m

  7. Evaluation of Classifier Performance for Multiclass Phenotype Discrimination in Untargeted Metabolomics.

    PubMed

    Trainor, Patrick J; DeFilippis, Andrew P; Rai, Shesh N

    2017-06-21

    Statistical classification is a critical component of utilizing metabolomics data for examining the molecular determinants of phenotypes. Despite this, a comprehensive and rigorous evaluation of the accuracy of classification techniques for phenotype discrimination given metabolomics data has not been conducted. We conducted such an evaluation using both simulated and real metabolomics datasets, comparing Partial Least Squares-Discriminant Analysis (PLS-DA), Sparse PLS-DA, Random Forests, Support Vector Machines (SVM), Artificial Neural Network, k-Nearest Neighbors (k-NN), and Naïve Bayes classification techniques for discrimination. We evaluated the techniques on simulated data generated to mimic global untargeted metabolomics data by incorporating realistic block-wise correlation and partial correlation structures for mimicking the correlations and metabolite clustering generated by biological processes. Over the simulation studies, covariance structures, means, and effect sizes were stochastically varied to provide consistent estimates of classifier performance over a wide range of possible scenarios. The effects of the presence of non-normal error distributions, the introduction of biological and technical outliers, unbalanced phenotype allocation, missing values due to abundances below a limit of detection, and the effect of prior-significance filtering (dimension reduction) were evaluated via simulation. In each simulation, classifier parameters, such as the number of hidden nodes in a Neural Network, were optimized by cross-validation to minimize the probability of detecting spurious results due to poorly tuned classifiers. Classifier performance was then evaluated using real metabolomics datasets of varying sample medium, sample size, and experimental design.
    We report that in the most realistic simulation studies that incorporated non-normal error distributions, unbalanced phenotype allocation, outliers, missing values, and dimension reduction, classifier performance (least to greatest error) was ranked as follows: SVM, Random Forest, Naïve Bayes, sPLS-DA, Neural Networks, PLS-DA and k-NN classifiers. When non-normal error distributions were introduced, the performance of PLS-DA and k-NN classifiers deteriorated further relative to the remaining techniques. Over the real datasets, a trend of better performance of SVM and Random Forest classifier performance was observed.

  8. Topological magnons in a one-dimensional itinerant flatband ferromagnet

    NASA Astrophysics Data System (ADS)

    Su, Xiao-Fei; Gu, Zhao-Long; Dong, Zhao-Yang; Li, Jian-Xin

    2018-06-01

    Different from previous scenarios that topological magnons emerge in local spin models, we propose an alternative that itinerant electron magnets can host topological magnons. A one-dimensional Tasaki model with a flatband is considered as the prototype. This model can be viewed as a quarter-filled periodic Anderson model with impurities located in between and hybridizing with the nearest-neighbor conducting electrons, together with a Hubbard repulsion for these electrons. By increasing the Hubbard interaction, the gap between the acoustic and optical magnons closes and reopens while the Berry phase of the acoustic band changes from 0 to π, leading to the occurrence of a topological transition. After this transition, there always exist in-gap edge magnonic modes, which is consistent with the bulk-edge correspondence. The Hubbard interaction-driven transition reveals a new mechanism to realize nontrivial magnon bands.

  9. Impact of distance-based metric learning on classification and visualization model performance and structure-activity landscapes.

    PubMed

    Kireeva, Natalia V; Ovchinnikova, Svetlana I; Kuznetsov, Sergey L; Kazennov, Andrey M; Tsivadze, Aslan Yu

    2014-02-01

    This study concerns the large margin nearest neighbors classifier and its multi-metric extension as efficient approaches for metric learning, which aims to learn an appropriate distance/similarity function for the considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process. In silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities; the impact of metric learning on structure-activity landscapes and predictive performance of developed models has been analyzed; and the learned metric was used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.

  10. Candidate soil indicators for monitoring the progress of constructed wetlands toward a natural state: a statistical approach

    USGS Publications Warehouse

    Stapanian, Martin A.; Adams, Jean V.; Fennessy, M. Siobhan; Mack, John; Micacchion, Mick

    2013-01-01

    A persistent question among ecologists and environmental managers is whether constructed wetlands are structurally or functionally equivalent to naturally occurring wetlands. We examined 19 variables collected from 10 constructed and nine natural emergent wetlands in Ohio, USA. Our primary objective was to identify candidate indicators of wetland class (natural or constructed), based on measurements of soil properties and an index of vegetation integrity, that can be used to track the progress of constructed wetlands toward a natural state. The method of nearest shrunken centroids was used to find a subset of variables that would serve as the best classifiers of wetland class, and error rate was calculated using a five-fold cross-validation procedure. The shrunken differences of percent total organic carbon (% TOC) and percent dry weight of the soil exhibited the greatest distances from the overall centroid. Classification based on these two variables yielded a misclassification rate of 11% based on cross-validation. Our results indicate that % TOC and percent dry weight can be used as candidate indicators of the status of emergent, constructed wetlands in Ohio and for assessing the performance of mitigation. The method of nearest shrunken centroids has excellent potential for further applications in ecology.
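    As an illustration of the nearest shrunken centroids idea named in the record above, here is a minimal pure-Python sketch. It is not the authors' code: it follows one common textbook formulation (standardized centroid differences soft-thresholded by a threshold `delta`, with the median pooled standard deviation as the offset `s0`), the toy measurements are hypothetical, and the five-fold cross-validation used in the study is omitted.

```python
import math
from statistics import median


def fit_shrunken_centroids(X, y, delta):
    """Fit class centroids shrunken toward the overall centroid.

    X: list of feature lists, y: list of class labels,
    delta: soft-threshold applied to standardized centroid differences.
    """
    n, p = len(X), len(X[0])
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    overall = [sum(row[j] for row in X) / n for j in range(p)]
    cents = {k: [sum(r[j] for r in rows) / len(rows) for j in range(p)]
             for k, rows in groups.items()}
    # Pooled within-class standard deviation of each feature.
    s = []
    for j in range(p):
        ss = sum((r[j] - cents[k][j]) ** 2
                 for k, rows in groups.items() for r in rows)
        s.append(math.sqrt(ss / (n - len(groups))))
    s0 = median(s)  # offset that guards against tiny variances
    shrunken = {}
    for k, rows in groups.items():
        m_k = math.sqrt(1.0 / len(rows) - 1.0 / n)
        centroid = []
        for j in range(p):
            scale = m_k * (s[j] + s0)
            d = (cents[k][j] - overall[j]) / scale
            # Soft-thresholding: shrink |d| by delta, truncating at zero.
            d_shrunk = math.copysign(max(abs(d) - delta, 0.0), d)
            centroid.append(overall[j] + scale * d_shrunk)
        shrunken[k] = centroid
    priors = {k: len(rows) / n for k, rows in groups.items()}
    return shrunken, s, s0, priors


def predict(model, x):
    """Assign x to the class with the smallest discriminant score."""
    shrunken, s, s0, priors = model

    def score(k):
        dist = sum(((x[j] - shrunken[k][j]) / (s[j] + s0)) ** 2
                   for j in range(len(x)))
        return dist - 2.0 * math.log(priors[k])

    return min(shrunken, key=score)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 4.8], [0.8, 5.2], [1.1, 5.1],
     [3.0, 5.1], [3.2, 4.9], [2.8, 5.0], [3.1, 5.2]]
y = ["constructed"] * 4 + ["natural"] * 4
model = fit_shrunken_centroids(X, y, delta=1.0)
print(predict(model, [0.9, 5.0]), predict(model, [3.0, 5.0]))
```

    In this toy example the uninformative second feature is shrunk to the overall centroid for both classes, so it no longer influences the classification; that is the variable-selection behaviour the abstract relies on to find candidate indicators.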

  11. Impact of distance-based metric learning on classification and visualization model performance and structure-activity landscapes

    NASA Astrophysics Data System (ADS)

    Kireeva, Natalia V.; Ovchinnikova, Svetlana I.; Kuznetsov, Sergey L.; Kazennov, Andrey M.; Tsivadze, Aslan Yu.

    2014-02-01

    This study concerns the large margin nearest neighbors classifier and its multi-metric extension as efficient approaches for metric learning, which aims to learn an appropriate distance/similarity function for the considered case studies. In recent years, many studies in data mining and pattern recognition have demonstrated that a learned metric can significantly improve the performance in classification, clustering and retrieval tasks. The paper describes application of the metric learning approach to in silico assessment of chemical liabilities. Chemical liabilities, such as adverse effects and toxicity, play a significant role in the drug discovery process. In silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Here, to our knowledge for the first time, distance-based metric learning procedures have been applied for in silico assessment of chemical liabilities; the impact of metric learning on structure-activity landscapes and predictive performance of developed models has been analyzed; and the learned metric was used in support vector machines. The metric learning results have been illustrated using linear and non-linear data visualization techniques in order to indicate how the change of metrics affected nearest neighbors relations and descriptor space.

  12. Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

    PubMed

    Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun

    2016-04-01

    This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR) classifier, k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB) as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in their frequency-dependent THz extinction coefficient as well as in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements related to samples that had thicknesses of 2 mm were classified, then samples at thicknesses of 4 mm, and after that 3 mm, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm.
    The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies. The work establishes a general methodology for assessing the performance of other hyperspectral dataset classifiers on the basis of 2-D cross-correlations in far-infrared spectroscopy or other parts of the electromagnetic spectrum. It also advances the wider proliferation of automated THz imaging systems across new application areas, e.g., biomedical imaging, industrial processing and quality control, where interpretation of hyperspectral images is still under development. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  13. Novel Human Adenovirus Causing Nosocomial Epidemic Keratoconjunctivitis

    PubMed Central

    Ishiko, Hiroaki; Shimada, Yasushi; Konno, Tsunetada; Hayashi, Akio; Ohguchi, Takeshi; Tagawa, Yoshitsugu; Aoki, Koki; Ohno, Shigeaki; Yamazaki, Shudo

    2008-01-01

    In 2000, we encountered cases of nosocomial infections with epidemic keratoconjunctivitis (EKC) at a university hospital in Kobe, in the western part of Japan. Two human adenovirus (HAdV) strains, Kobe-H and Kobe-S, were isolated from patients with nosocomial EKC infection. They were untypeable by existing neutralizing antisera; however, the isolate was neutralized with homologous antisera. We then encountered several cases of EKC due to nosocomial infections in eye clinics in different parts of Japan. A total of 80 HAdVs were isolated from patients with EKC at eight different hospitals. The partial hexon gene sequences of the isolates were determined and compared to those of the prototype strains of 51 serotypes. All isolates had identical partial hexon nucleotide sequences. Phylogenetic analysis classified these isolates into species of HAdV-D. The isolates showed 93.9 to 96.7% nucleotide identity with HAdV-D prototype strains, while all 32 HAdV-D prototype strains ranged from 93.2 to 99.2% identity. The sequences of the loop 2 and fiber knob regions from the representative strain, Kobe-H, were dissimilar in all prototype strains of 51 serotypes. We believe that this virus is a novel serotype of HAdV that causes EKC. PMID:18385435

  14. Classification of EEG Signals Based on Pattern Recognition Approach.

    PubMed

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet-based features, such as multi-resolution decompositions into detailed and approximate coefficients and relative wavelet energy, were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset (128 channels) validated the proposed method by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using the Raven's Advanced Progressive Matrices (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via the SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. For detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz), accuracy rates were 98.57 and 98.39% for SVM and KNN, respectively. Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on a public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.

  15. Classification of EEG Signals Based on Pattern Recognition Approach

    PubMed Central

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. 
    PMID:29209190

  16. False alarm reduction by the And-ing of multiple multivariate Gaussian classifiers

    NASA Astrophysics Data System (ADS)

    Dobeck, Gerald J.; Cobb, J. Tory

    2003-09-01

    The high-resolution sonar is one of the principal sensors used by the Navy to detect and classify sea mines in minehunting operations. For such sonar systems, substantial effort has been devoted to the development of automated detection and classification (D/C) algorithms. These have been spurred by several factors including (1) aids for operators to reduce work overload, (2) more optimal use of all available data, and (3) the introduction of unmanned minehunting systems. The environments where sea mines are typically laid (harbor areas, shipping lanes, and the littorals) give rise to many false alarms caused by natural, biologic, and man-made clutter. The objective of the automated D/C algorithms is to eliminate most of these false alarms while still maintaining a very high probability of mine detection and classification (PdPc). In recent years, the benefits of fusing the outputs of multiple D/C algorithms have been studied. We refer to this as Algorithm Fusion. The results have been remarkable, including reliable robustness to new environments. This paper describes a method for training several multivariate Gaussian classifiers such that their And-ing dramatically reduces false alarms while maintaining a high probability of classification. This training approach is referred to as the Focused-Training method. This work extends our 2001-2002 work where the Focused-Training method was used with three other types of classifiers: the Attractor-based K-Nearest Neighbor Neural Network (a type of radial-basis, probabilistic neural network), the Optimal Discrimination Filter Classifier (based on linear discrimination theory), and the Quadratic Penalty Function Support Vector Machine (QPFSVM). Although our experience has been gained in the area of sea mine detection and classification, the principles described herein are general and can be applied to a wide range of pattern recognition and automatic target recognition (ATR) problems.
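
    The And-ing idea in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' Focused-Training method: the diagonal-covariance Gaussians, the likelihood comparison, and all model parameters and sample points are assumptions chosen only to show how requiring agreement between classifiers suppresses a false alarm that fools one of them.

```python
import math

def diag_gaussian_logpdf(x, mean, var):
    """Log-density of a Gaussian with diagonal covariance (an assumed simplification)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def make_classifier(target, clutter):
    """Return a detector that fires when the target model is more likely than clutter."""
    def classify(x):
        return diag_gaussian_logpdf(x, *target) > diag_gaussian_logpdf(x, *clutter)
    return classify

def and_fusion(classifiers, x):
    """Declare a detection only if every classifier declares one."""
    return all(clf(x) for clf in classifiers)

# Hypothetical (mean, variance) models over two features; not taken from the paper.
clf_a = make_classifier(target=([2.0, 2.0], [1.0, 1.0]), clutter=([0.0, 0.0], [1.0, 1.0]))
clf_b = make_classifier(target=([2.0, 0.0], [1.0, 1.0]), clutter=([0.0, 2.0], [1.0, 1.0]))

mine_like = [2.1, 1.0]      # both classifiers accept this point
clutter_like = [1.5, 2.5]   # fools clf_a, rejected by clf_b

print(and_fusion([clf_a, clf_b], mine_like))
print(and_fusion([clf_a, clf_b], clutter_like))
```

    In this toy setup the fused detector keeps the target-like sample and drops the clutter sample that one classifier alone would have passed.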

  17. Development of an indoor location based service test bed and geographic information system with a wireless sensor network.

    PubMed

    Jan, Shau-Shiun; Hsu, Li-Ta; Tsai, Wen-Ming

    2010-01-01

    In order to provide seamless navigation and positioning services for indoor environments, an indoor location based service (LBS) test bed is developed to integrate the indoor positioning system and the indoor three-dimensional (3D) geographic information system (GIS). A wireless sensor network (WSN) is used in the developed indoor positioning system. Considering the power consumption, in this paper the ZigBee radio is used as the wireless protocol, and the received signal strength (RSS) fingerprinting positioning method is applied as the primary indoor positioning algorithm. The matching processes of the user location include the nearest neighbor (NN) algorithm, the K-weighted nearest neighbors (KWNN) algorithm, and the probabilistic approach. To enhance the positioning accuracy for the dynamic user, the particle filter is used to improve the positioning performance. As part of this research, a 3D indoor GIS is developed to be used with the indoor positioning system. This involved using computer-aided design (CAD) software and the virtual reality markup language (VRML) to implement a prototype indoor LBS test bed. Thus, a rapid and practical procedure for constructing a 3D indoor GIS is proposed, and this GIS is easy for users to update and maintain. The building of the Department of Aeronautics and Astronautics at National Cheng Kung University in Taiwan is used as an example to assess the performance of various algorithms for the indoor positioning system.
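
    As a rough illustration of RSS fingerprint matching with weighted nearest neighbors, the Python sketch below estimates a position from the k closest survey points. The inverse-distance weighting, the function name wknn_position, and the survey values are assumptions for illustration; the abstract does not specify the weighting used by the KWNN algorithm.

```python
import math

def wknn_position(fingerprints, rss, k=3, eps=1e-9):
    """Estimate (x, y) as the inverse-distance-weighted mean of the k closest fingerprints.

    fingerprints: list of ((x, y), [rss per anchor]) pairs from an offline survey.
    rss: RSS vector measured online, in the same anchor order.
    """
    # Rank survey points by Euclidean distance in signal space and keep the k best.
    scored = sorted(
        (math.dist(ref_rss, rss), pos) for pos, ref_rss in fingerprints
    )[:k]
    weights = [1.0 / (d + eps) for d, _ in scored]
    total = sum(weights)
    x = sum(w * pos[0] for w, (_, pos) in zip(weights, scored)) / total
    y = sum(w * pos[1] for w, (_, pos) in zip(weights, scored)) / total
    return x, y

# Hypothetical survey: positions in metres, RSS in dBm from two anchors.
survey = [
    ((0.0, 0.0), [-40.0, -70.0]),
    ((4.0, 0.0), [-70.0, -40.0]),
    ((0.0, 4.0), [-60.0, -60.0]),
    ((4.0, 4.0), [-80.0, -55.0]),
]

print(wknn_position(survey, [-55.0, -62.0], k=3))
print(wknn_position(survey, [-69.0, -41.0], k=1))  # k=1 reduces to plain nearest neighbor
```

    Setting k=1 recovers the plain NN matcher mentioned in the abstract, which makes it easy to compare the two on the same survey data.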

  18. Development of an Indoor Location Based Service Test Bed and Geographic Information System with a Wireless Sensor Network

    PubMed Central

    Jan, Shau-Shiun; Hsu, Li-Ta; Tsai, Wen-Ming

    2010-01-01

    In order to provide seamless navigation and positioning services for indoor environments, an indoor location based service (LBS) test bed is developed to integrate the indoor positioning system and the indoor three-dimensional (3D) geographic information system (GIS). A wireless sensor network (WSN) is used in the developed indoor positioning system. Considering the power consumption, in this paper the ZigBee radio is used as the wireless protocol, and the received signal strength (RSS) fingerprinting positioning method is applied as the primary indoor positioning algorithm. The matching processes of the user location include the nearest neighbor (NN) algorithm, the K-weighted nearest neighbors (KWNN) algorithm, and the probabilistic approach. To enhance the positioning accuracy for the dynamic user, the particle filter is used to improve the positioning performance. As part of this research, a 3D indoor GIS is developed to be used with the indoor positioning system. This involved using computer-aided design (CAD) software and the virtual reality markup language (VRML) to implement a prototype indoor LBS test bed. Thus, a rapid and practical procedure for constructing a 3D indoor GIS is proposed, and this GIS is easy for users to update and maintain. The building of the Department of Aeronautics and Astronautics at National Cheng Kung University in Taiwan is used as an example to assess the performance of various algorithms for the indoor positioning system. PMID:22319282

  19. Active learning for solving the incomplete data problem in facial age classification by the furthest nearest-neighbor criterion.

    PubMed

    Wang, Jian-Gang; Sung, Eric; Yau, Wei-Yun

    2011-07-01

    Facial age classification is an approach to classify face images into one of several predefined age groups. One of the difficulties in applying learning techniques to the age classification problem is the large amount of labeled training data required. Acquiring such training data is very costly in terms of age progress, privacy, human time, and effort. Although unlabeled face images can be obtained easily, it would be expensive to label them manually on a large scale to obtain the ground truth. The frugal selection of the unlabeled data for labeling to quickly reach high classification performance with minimal labeling effort is a challenging problem. In this paper, we present an active learning approach based on an online incremental bilateral two-dimension linear discriminant analysis (IB2DLDA) which initially learns from a small pool of labeled data and then iteratively selects the most informative samples from the unlabeled set to incrementally improve the classifier. Specifically, we propose a novel data selection criterion called the furthest nearest-neighbor (FNN) that generalizes the margin-based uncertainty to the multiclass case and which is easy to compute, so that the proposed active learning algorithm can handle a large number of classes and large data sizes efficiently. Empirical experiments on FG-NET and Morph databases together with a large unlabeled data set for age categorization problems show that the proposed approach can achieve results comparable to, or even better than, those of a conventionally trained active classifier that requires much more labeling effort. Our IB2DLDA-FNN algorithm can achieve similar results much faster than random selection and with fewer samples for age categorization. It also can achieve comparable results with active SVM but is much faster than active SVM in terms of training because kernel methods are not needed. The results on the face recognition database and palmprint/palm vein database showed that our approach can handle problems with a large number of classes. Our contributions in this paper are twofold. First, we proposed the IB2DLDA-FNN, the FNN being our novel idea, as a generic on-line or active learning paradigm. Second, we showed that it can be another viable tool for active learning of facial age range classification.

  20. Multi-spectral brain tissue segmentation using automatically trained k-Nearest-Neighbor classification.

    PubMed

    Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J

    2007-08-01

    Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.

  1. Spectrum of purpura fulminans: report of three classical prototypes and review of management strategies.

    PubMed

    Talwar, Ankur; Kumar, Sharath; Gopal, M G; Nandini, A S

    2012-01-01

    Purpura fulminans is a rare syndrome of intravascular thrombosis and hemorrhagic infarction of the skin that is rapidly progressive and is accompanied by vascular collapse and disseminated intravascular coagulation. It usually occurs in children, but this syndrome has also been noted in adults. The three forms of this disease are classified by the triggering mechanisms. We describe three classical cases of purpura fulminans, one for each of the three classical prototypes, treated at our center and their varied clinical outcomes. We also describe a case of acute infectious purpura fulminans secondary to systemic leptospirosis which, to the best of our knowledge, is the first reported case in the world literature. The various treatment options for purpura fulminans have also been reviewed.

  2. Augmenting the senses: a review on sensor-based learning support.

    PubMed

    Schneider, Jan; Börner, Dirk; van Rosmalen, Peter; Specht, Marcus

    2015-02-11

    In recent years sensor components have been extending classical computer-based support systems in a variety of application domains (sports, health, etc.). In this article we review the use of sensors for the application domain of learning. For that we analyzed 82 sensor-based prototypes exploring their learning support. To study this learning support we classified the prototypes according to Bloom's taxonomy of learning domains and explored how they can be used to assist in the implementation of formative assessment, paying special attention to their use as feedback tools. The analysis leads to current research foci and gaps in the development of sensor-based learning support systems and concludes with a research agenda based on the findings.

  3. Augmenting the Senses: A Review on Sensor-Based Learning Support

    PubMed Central

    Schneider, Jan; Börner, Dirk; van Rosmalen, Peter; Specht, Marcus

    2015-01-01

    In recent years sensor components have been extending classical computer-based support systems in a variety of application domains (sports, health, etc.). In this article we review the use of sensors for the application domain of learning. For that we analyzed 82 sensor-based prototypes exploring their learning support. To study this learning support we classified the prototypes according to Bloom's taxonomy of learning domains and explored how they can be used to assist in the implementation of formative assessment, paying special attention to their use as feedback tools. The analysis leads to current research foci and gaps in the development of sensor-based learning support systems and concludes with a research agenda based on the findings. PMID:25679313

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Deng, Zhiqun; Carlson, Thomas J.; Fu, Tao

    Power extracted from fast moving tidal currents has been identified as a potential commercial-scale source of renewable energy. Device developers and utilities are pursuing deployment of prototype tidal turbines to assess technology viability, site feasibility, and environmental interactions. Deployment of prototype turbines requires permits from a range of regulatory authorities. Ensuring the safety of marine animals, particularly those under protection of the Endangered Species Act of 1973 (ESA) and the Marine Mammal Protection Act of 1972 has emerged as a key regulatory challenge for initial MHK deployments. The greatest perceived risk to marine animals is from strike by the rotating blades of tidal turbines. Development of the marine mammal alert system (MAAS) was undertaken to support monitoring and mitigation requirements for tidal turbine deployments. The prototype system development focused on Southern Resident killer whales (SRKW), an endangered population of killer whales that frequents Puget Sound and is intermittently present in the part of the sound where deployment of prototype tidal turbines is being considered. Passive acoustics were selected as the primary means because of the vocal nature of these animals. The MAAS passive acoustic system consists of a two-stage process involving the use of an energy detector and a spectrogram-based classifier to distinguish between SRKW calls and noise. A prototype consisting of two 2D symmetrical star arrays separated by 20 m center to center was built and evaluated in the waters of Sequim Bay using whale call playback.

  5. Applying data fusion techniques for benthic habitat mapping and monitoring in a coral reef ecosystem

    NASA Astrophysics Data System (ADS)

    Zhang, Caiyun

    2015-06-01

    Accurate mapping and effective monitoring of benthic habitat in the Florida Keys are critical in developing management strategies for this valuable coral reef ecosystem. For this study, a framework was designed for automated benthic habitat mapping by combining multiple data sources (hyperspectral, aerial photography, and bathymetry data) and four contemporary imagery processing techniques (data fusion, Object-based Image Analysis (OBIA), machine learning, and ensemble analysis). In the framework, a 1-m digital aerial photograph was first merged with 17-m hyperspectral imagery and 10-m bathymetry data using a pixel/feature-level fusion strategy. The fused dataset was then preclassified by three machine learning algorithms (Random Forest, Support Vector Machines, and k-Nearest Neighbor). Final object-based habitat maps were produced through ensemble analysis of outcomes from the three classifiers. The framework was tested for classifying group-level (3-class) and code-level (9-class) habitats in a portion of the Florida Keys. Informative and accurate habitat maps were achieved with an overall accuracy of 88.5% and 83.5% for the group-level and code-level classifications, respectively.
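
    One simple way to combine the outcomes of three classifiers is a per-object plurality vote, sketched below in Python. The abstract says only "ensemble analysis", so the voting rule, the tie-break, and the class labels here are assumptions rather than the study's procedure.

```python
from collections import Counter

def majority_vote(*label_maps):
    """Combine per-object labels from several classifiers by plurality vote.

    On a tie, the label from the earliest-listed classifier among the tied
    labels is kept (an assumed tie-break rule).
    """
    fused = []
    for votes in zip(*label_maps):
        counts = Counter(votes)
        best = max(counts.values())
        # Scan votes in classifier order so the tie-break is deterministic.
        winners = [label for label in votes if counts[label] == best]
        fused.append(winners[0])
    return fused

# Hypothetical labels for four image objects from each classifier.
rf = ["coral", "sand", "seagrass", "sand"]
svm = ["coral", "seagrass", "seagrass", "coral"]
knn = ["sand", "sand", "seagrass", "seagrass"]

fused = majority_vote(rf, svm, knn)
print(fused)
```

    The last object shows the tie case: all three classifiers disagree, so the first classifier's label is retained.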

  6. Machine learning algorithms for meteorological event classification in the coastal area using in-situ data

    NASA Astrophysics Data System (ADS)

    Sokolov, Anton; Gengembre, Cyril; Dmitriev, Egor; Delbarre, Hervé

    2017-04-01

    This work considers the classification of local atmospheric meteorological events in a coastal area, such as sea breezes, fogs and storms. In-situ meteorological data such as wind speed and direction, temperature, humidity and turbulence are used as predictors. Local atmospheric events of 2013-2014 in the coastal area of the English Channel at Dunkirk (France) were analysed manually to train classification algorithms. Then, ultrasonic anemometer data and LIDAR wind profiler data were used as predictors. Several algorithms were applied to determine meteorological events from local data, including a decision tree, the nearest neighbour classifier, and a support vector machine. The classification algorithms were compared, and the most important predictors for each event type were determined. It was shown that in more than 80 percent of the cases the machine learning algorithms detect the meteorological class correctly. We expect that this methodology could also be applied to classify events using climatological in-situ data or modelling data. It allows the frequency of each event to be estimated in the context of climate change.

  7. Ensemble LUT classification for degraded document enhancement

    NASA Astrophysics Data System (ADS)

    Obafemi-Ajayi, Tayo; Agam, Gady; Frieder, Ophir

    2008-01-01

    The fast evolution of scanning and computing technologies has led to the creation of large collections of scanned paper documents. Examples of such collections include historical collections, legal depositories, medical archives, and business archives. Moreover, in many situations such as legal litigation and security investigations scanned collections are being used to facilitate systematic exploration of the data. It is almost always the case that scanned documents suffer from some form of degradation. Large degradations make documents hard to read and substantially deteriorate the performance of automated document processing systems. Enhancement of degraded document images is normally performed assuming global degradation models. When the degradation is large, global degradation models do not perform well. In contrast, we propose to estimate local degradation models and use them in enhancing degraded document images. Using a semi-automated enhancement system we have labeled a subset of the Frieder diaries collection. This labeled subset was then used to train an ensemble classifier. The component classifiers are based on lookup tables (LUT) in conjunction with the approximated nearest neighbor algorithm. The resulting algorithm is highly efficient. Experimental evaluation results are provided using the Frieder diaries collection.

  8. Luminance sticker based facial expression recognition using discrete wavelet transform for physically disabled persons.

    PubMed

    Nagarajan, R; Hariharan, M; Satiyan, M

    2012-08-01

    Developing tools to assist physically disabled and immobilized people through facial expression is a challenging area of research and has attracted many researchers recently. In this paper, luminance sticker based facial expression recognition is proposed. Recognition of facial expression is carried out by employing the Discrete Wavelet Transform (DWT) as a feature extraction method. Different wavelet families with their different orders (db1 to db20, Coif1 to Coif5 and Sym2 to Sym8) are utilized to investigate their performance in recognizing facial expression and to evaluate their computational time. Standard deviation is computed for the coefficients of the first level of wavelet decomposition for every order of wavelet family. This standard deviation is used to form a set of feature vectors for classification. In this study, conventional validation and cross validation are performed to evaluate the efficiency of the suggested feature vectors. Three different classifiers, namely Artificial Neural Network (ANN), k-Nearest Neighborhood (kNN) and Linear Discriminant Analysis (LDA), are used to classify a set of eight facial expressions. The experimental results demonstrate that the proposed method gives very promising classification accuracies.
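
    The feature described above (standard deviation of first-level wavelet coefficients) can be sketched for the simplest case, the Haar wavelet (db1), in plain Python. Which coefficient bands the study used, and the input signal itself, are not stated in the abstract; using both the approximation and detail bands, the population standard deviation, and the sample trace below are assumptions.

```python
import math
import statistics

def haar_dwt_level1(signal):
    """One level of the Haar (db1) DWT on an even-length sequence.

    Returns (approximation, detail) coefficient lists.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = 1.0 / math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) * s for a, b in pairs]
    detail = [(a - b) * s for a, b in pairs]
    return approx, detail

def std_features(signal):
    """Feature vector: population standard deviation of each coefficient band."""
    approx, detail = haar_dwt_level1(signal)
    return [statistics.pstdev(approx), statistics.pstdev(detail)]

# Hypothetical marker-position trace; a real input would come from sticker tracking.
trace = [1.0, 1.0, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0]
features = std_features(trace)
print(features)
```

    For higher-order wavelets (db2 and up, Coiflets, Symlets) a wavelet library would normally replace the hand-written Haar step; the feature computation on the resulting coefficients stays the same.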

  9. A comparative analysis of swarm intelligence techniques for feature selection in cancer classification.

    PubMed

    Gunavathi, Chellamuthu; Premalatha, Kandasamy

    2014-01-01

    Feature selection in cancer classification is a central area of research in the field of bioinformatics and is used to select the informative genes from the thousands of genes on a microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of the shuffled frog leaping (SFL) algorithm. SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection, which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from the k-NN classifier through the SFLLF feature selection method outperform PSO, CS, and SFL.

  10. Selected-ion flow-tube mass-spectrometry (SIFT-MS) fingerprinting versus chemical profiling for geographic traceability of Moroccan Argan oils.

    PubMed

    Kharbach, Mourad; Kamal, Rabie; Mansouri, Mohammed Alaoui; Marmouzi, Ilias; Viaene, Johan; Cherrah, Yahia; Alaoui, Katim; Vercammen, Joeri; Bouklouze, Abdelaziz; Vander Heyden, Yvan

    2018-10-15

    This study investigated the effectiveness of SIFT-MS versus chemical profiling, both coupled to multivariate data analysis, to classify 95 Extra Virgin Argan Oils (EVAO), originating from five Moroccan Argan forest locations. The full scan option of SIFT-MS is suitable to indicate the geographic origin of EVAO based on the fingerprints obtained using the three chemical ionization precursors (H3O+, NO+ and O2+). The chemical profiling (including acidity, peroxide value, spectrophotometric indices, fatty acids, tocopherol and sterol composition) was also used for classification. Partial least squares discriminant analysis (PLS-DA), soft independent modeling of class analogy (SIMCA), K-nearest neighbors (KNN), and support vector machines (SVM) were compared. The SIFT-MS data were therefore fed to variable-selection methods to find potential biomarkers for classification. The classification models based either on chemical profiling or SIFT-MS data were able to classify the samples with high accuracy. SIFT-MS was found to be advantageous for rapid geographic classification. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Improving medical diagnosis reliability using Boosted C5.0 decision tree empowered by Particle Swarm Optimization.

    PubMed

    Pashaei, Elnaz; Ozen, Mustafa; Aydin, Nizamettin

    2015-08-01

    Improving the accuracy of supervised classification algorithms in biomedical applications is an active area of research. In this study, we improve the performance of Particle Swarm Optimization (PSO) combined with C4.5 decision tree (PSO+C4.5) classifier by applying Boosted C5.0 decision tree as the fitness function. To evaluate the effectiveness of our proposed method, it is implemented on 1 microarray dataset and 5 different medical data sets obtained from UCI machine learning databases. Moreover, the results of PSO + Boosted C5.0 implementation are compared to eight well-known benchmark classification methods (PSO+C4.5, support vector machine under the kernel of Radial Basis Function, Classification And Regression Tree (CART), C4.5 decision tree, C5.0 decision tree, Boosted C5.0 decision tree, Naive Bayes and Weighted K-Nearest neighbor). Repeated five-fold cross-validation was used to assess the performance of the classifiers. Experimental results show that our proposed method not only improves the performance of PSO+C4.5 but also obtains higher classification accuracy compared to the other classification methods.

  12. Link prediction in multiplex online social networks

    NASA Astrophysics Data System (ADS)

    Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

    2017-02-01

    Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.

  13. Link prediction in multiplex online social networks.

    PubMed

    Jalili, Mahdi; Orouskhani, Yasin; Asgari, Milad; Alipourfard, Nazanin; Perc, Matjaž

    2017-02-01

    Online social networks play a major role in modern societies, and they have shaped the way social relationships evolve. Link prediction in social networks has many potential applications such as recommending new items to users, friendship suggestion and discovering spurious connections. Many real social networks evolve the connections in multiple layers (e.g. multiple social networking platforms). In this article, we study the link prediction problem in multiplex networks. As an example, we consider a multiplex network of Twitter (as a microblogging service) and Foursquare (as a location-based social network). We consider social networks of the same users in these two platforms and develop a meta-path-based algorithm for predicting the links. The connectivity information of the two layers is used to predict the links in Foursquare network. Three classical classifiers (naive Bayes, support vector machines (SVM) and K-nearest neighbour) are used for the classification task. Although the networks are not highly correlated in the layers, our experiments show that including the cross-layer information significantly improves the prediction performance. The SVM classifier results in the best performance with an average accuracy of 89%.

  14. Classification of X-ray sources in the direction of M31

    NASA Astrophysics Data System (ADS)

    Vasilopoulos, G.; Hatzidimitriou, D.; Pietsch, W.

    2012-01-01

    M31 is our nearest spiral galaxy, at a distance of 780 kpc. Identification of X-ray sources in nearby galaxies is important for interpreting the properties of more distant ones, mainly because we can classify nearby sources using both X-ray and optical data, while more distant ones can be classified via X-rays alone. The XMM-Newton Large Project for M31 has produced an abundant sample of about 1900 X-ray sources in the direction of M31. Most of them remain elusive, giving us few signs of their origin. Our goal is to classify these sources using criteria based on properties of already identified ones. In particular we construct candidate lists of high mass X-ray binaries, low mass X-ray binaries, X-ray binaries correlated with globular clusters and AGN based on their X-ray emission and the properties of their optical counterparts, if any. Our main methodology consists of identifying particular loci of X-ray sources on X-ray hardness ratio diagrams and the color magnitude diagrams of their optical counterparts. Finally, we examined the X-ray luminosity function of the X-ray binary populations.

  15. Impact of corpus domain for sentiment classification: An evaluation study using supervised machine learning techniques

    NASA Astrophysics Data System (ADS)

    Karsi, Redouane; Zaim, Mounia; El Alami, Jamila

    2017-07-01

    Thanks to the development of the internet, a large community now has the possibility to communicate and express its opinions and preferences through multiple media such as blogs, forums, social networks and e-commerce sites. It is increasingly clear that opinions published on the web are a very valuable source for decision-making, so a rapidly growing field of research called “sentiment analysis” has emerged to address the problem of automatically determining the polarity (positive, negative, neutral, …) of textual opinions. People expressing themselves in a particular domain often use domain-specific language expressions, so building a classifier that performs well across different domains is a challenging problem. The purpose of this paper is to evaluate the impact of domain for sentiment classification when using machine learning techniques. In our study three popular machine learning techniques, Support Vector Machines (SVM), Naive Bayes and K nearest neighbors (KNN), were applied on datasets collected from different domains. Experimental results show that Support Vector Machines outperform the other classifiers in all domains, achieving at least 74.75% accuracy with a standard deviation of 4.08.

  16. Remote sensing change detection methods to track deforestation and growth in threatened rainforests in Madre de Dios, Peru

    USGS Publications Warehouse

    Shermeyer, Jacob S.; Haack, Barry N.

    2015-01-01

    Two forestry-change detection methods are described, compared, and contrasted for estimating deforestation and growth in threatened forests in southern Peru from 2000 to 2010. The methods used in this study rely on freely available data, including atmospherically corrected Landsat 5 Thematic Mapper and Moderate Resolution Imaging Spectroradiometer (MODIS) vegetation continuous fields (VCF). The two methods include a conventional supervised signature extraction method and a unique self-calibrating method called MODIS VCF guided forest/nonforest (FNF) masking. The process chain for each of these methods includes a threshold classification of MODIS VCF, training data or signature extraction, signature evaluation, k-nearest neighbor classification, analyst-guided reclassification, and postclassification image differencing to generate forest change maps. Comparisons of all methods were based on an accuracy assessment using 500 validation pixels. Results of this accuracy assessment indicate that FNF masking had a 5% higher overall accuracy and was superior to conventional supervised classification when estimating forest change. Both methods succeeded in classifying persistently forested and nonforested areas, and both had limitations when classifying forest change.

  17. Coregistered photoacoustic and ultrasound imaging and classification of ovarian cancer: ex vivo and in vivo studies

    NASA Astrophysics Data System (ADS)

    Salehi, Hassan S.; Li, Hai; Merkulov, Alex; Kumavor, Patrick D.; Vavadi, Hamed; Sanders, Melinda; Kueck, Angela; Brewer, Molly A.; Zhu, Quing

    2016-04-01

    Most ovarian cancers are diagnosed at advanced stages due to the lack of efficacious screening techniques. Photoacoustic tomography (PAT) has a potential to image tumor angiogenesis and detect early neovascular changes of the ovary. We have developed a coregistered PAT and ultrasound (US) prototype system for real-time assessment of ovarian masses. Features extracted from PAT and US angular beams, envelopes, and images were input to a logistic classifier and a support vector machine (SVM) classifier to diagnose ovaries as benign or malignant. A total of 25 excised ovaries of 15 patients were studied and the logistic and SVM classifiers achieved sensitivities of 70.4 and 87.7%, and specificities of 95.6 and 97.9%, respectively. Furthermore, the ovaries of two patients were noninvasively imaged using the PAT/US system before surgical excision. By using five significant features and the logistic classifier, 12 out of 14 images (86% sensitivity) from a malignant ovarian mass and all 17 images (100% specificity) from a benign mass were accurately classified; the SVM correctly classified 10 out of 14 malignant images (71% sensitivity) and all 17 benign images (100% specificity). These initial results demonstrate the clinical potential of the PAT/US technique for ovarian cancer diagnosis.
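
    The in vivo percentages quoted above follow directly from the image counts. The sketch below is illustrative only: it recomputes them from the counts stated in the abstract, and the function names are hypothetical, not taken from the paper.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of malignant images that were flagged as malignant."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """Fraction of benign images that were flagged as benign."""
    return true_negatives / (true_negatives + false_positives)


# Counts from the abstract: 14 malignant images, 17 benign images.
logistic_sens = sensitivity(true_positives=12, false_negatives=2)
svm_sens = sensitivity(true_positives=10, false_negatives=4)
benign_spec = specificity(true_negatives=17, false_positives=0)

print(round(100 * logistic_sens), round(100 * svm_sens), round(100 * benign_spec))
# prints: 86 71 100
```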

  18. omniClassifier: a Desktop Grid Computing System for Big Data Prediction Modeling

    PubMed Central

    Phan, John H.; Kothari, Sonal; Wang, May D.

    2016-01-01

    Robust prediction models are important for numerous science, engineering, and biomedical applications. However, best-practice procedures for optimizing prediction models can be computationally complex, especially when choosing models from among hundreds or thousands of parameter choices. Computational complexity has further increased with the growth of data in these fields, concurrent with the era of “Big Data”. Grid computing is a potential solution to the computational challenges of Big Data. Desktop grid computing, which uses idle CPU cycles of commodity desktop machines, coupled with commercial cloud computing resources can enable research labs to gain easier and more cost effective access to vast computing resources. We have developed omniClassifier, a multi-purpose prediction modeling application that provides researchers with a tool for conducting machine learning research within the guidelines of recommended best-practices. omniClassifier is implemented as a desktop grid computing system using the Berkeley Open Infrastructure for Network Computing (BOINC) middleware. In addition to describing implementation details, we use various gene expression datasets to demonstrate the potential scalability of omniClassifier for efficient and robust Big Data prediction modeling. A prototype of omniClassifier can be accessed at http://omniclassifier.bme.gatech.edu/. PMID:27532062

  19. Processing of Fear and Anger Facial Expressions: The Role of Spatial Frequency

    PubMed Central

    Comfort, William E.; Wang, Meng; Benton, Christopher P.; Zana, Yossi

    2013-01-01

    Spatial frequency (SF) components encode a portion of the affective value expressed in face images. The aim of this study was to estimate the relative weight of specific frequency spectrum bandwidth on the discrimination of anger and fear facial expressions. The general paradigm was a classification of the expression of faces morphed at varying proportions between anger and fear images in which SF adaptation and SF subtraction are expected to shift classification of facial emotion. A series of three experiments was conducted. In Experiment 1 subjects classified morphed face images that were unfiltered or filtered to remove either low (<8 cycles/face), middle (12–28 cycles/face), or high (>32 cycles/face) SF components. In Experiment 2 subjects were adapted to unfiltered or filtered prototypical (non-morphed) fear face images and subsequently classified morphed face images. In Experiment 3 subjects were adapted to unfiltered or filtered prototypical fear face images with the phase component randomized before classifying morphed face images. Removing mid frequency components from the target images shifted classification toward fear. The same shift was observed under adaptation to unfiltered and to low- and middle-range filtered fear images. However, when the phase spectrum of the same adaptation stimuli was randomized, no adaptation effect was observed. These results suggest that medium SF components support the perception of fear more than anger at both low and high levels of processing. They also suggest that the effect at the high-level processing stage is related more to high-level featural and/or configural information than to the low-level frequency spectrum. PMID:23637687

  20. [A graph cuts-based interactive method for segmentation of magnetic resonance images of meningioma].

    PubMed

    Li, Shuan-qiang; Feng, Qian-jin; Chen, Wu-fan; Lin, Ya-zhong

    2011-06-01

    For accurate segmentation of the magnetic resonance (MR) images of meningioma, we propose a novel interactive segmentation method based on graph cuts. High-dimensional image features were extracted, and for each pixel, the probabilities of its belonging to either the tumor or the background region were estimated using a weighted K-nearest neighbor classifier. Based on these probabilities, a new energy function was proposed. Finally, a graph cut optimization framework was used to minimize the energy function. The proposed method was evaluated by application in the segmentation of MR images of meningioma, and the results showed that the method significantly improved the segmentation accuracy compared with the gray level information-based graph cut method.
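
    As a rough illustration of how a weighted K-nearest neighbor classifier can turn labeled samples into per-pixel class probabilities, the sketch below uses inverse-distance weights. The weighting scheme, the function name and the toy feature vectors are assumptions for illustration; the abstract does not specify them.

```python
import math
from typing import Dict, Sequence


def weighted_knn_probability(
    query: Sequence[float],
    samples: Sequence[Sequence[float]],
    labels: Sequence[str],
    k: int = 3,
    eps: float = 1e-9,
) -> Dict[str, float]:
    """Estimate class probabilities from the k nearest labeled samples.

    Each neighbor votes with weight 1 / (distance + eps); this inverse-distance
    weighting is an assumed choice, not the one used in the paper.
    """
    ranked = sorted(
        (math.dist(query, sample), label) for sample, label in zip(samples, labels)
    )
    weights: Dict[str, float] = {}
    for distance, label in ranked[:k]:
        weights[label] = weights.get(label, 0.0) + 1.0 / (distance + eps)
    total = sum(weights.values())
    return {label: weight / total for label, weight in weights.items()}


# Toy 2-D feature vectors standing in for user-marked seed pixels.
seeds = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
seed_labels = ["tumor", "tumor", "background", "background"]
print(weighted_knn_probability((0.0, 0.5), seeds, seed_labels, k=3))
```

    In a graph cut setting, probabilities of this kind would typically feed the per-pixel term of the energy function.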

  1. Hamiltonian identifiability assisted by single-probe measurement

    NASA Astrophysics Data System (ADS)

    Sone, Akira; Cappellaro, Paola; Quantum Engineering Group Team

    2017-04-01

    We study the Hamiltonian identifiability of a many-body spin-1/2 system assisted by the measurement on a single quantum probe based on the eigensystem realization algorithm (ERA) approach employed in. We demonstrate a potential application of Gröbner basis to the identifiability test of the Hamiltonian, and provide the necessary experimental resources, such as the lower bound in the number of the required sampling points, the upper bound in total required evolution time, and thus the total measurement time. Focusing on the examples of the identifiability in the spin chain model with nearest-neighbor interaction, we classify the spin-chain Hamiltonian based on its identifiability, and provide the control protocols to engineer the non-identifiable Hamiltonian to be an identifiable Hamiltonian.

  2. Harmonic wavelet packet transform for on-line system health diagnosis

    NASA Astrophysics Data System (ADS)

    Yan, Ruqiang; Gao, Robert X.

    2004-07-01

    This paper presents a new approach to on-line health diagnosis of mechanical systems, based on the wavelet packet transform. Specifically, signals acquired from vibration sensors are decomposed into sub-bands by means of the discrete harmonic wavelet packet transform (DHWPT). Based on the Fisher linear discriminant criterion, features in the selected sub-bands are then used as inputs to three classifiers (Nearest Neighbor rule-based and two Neural Network-based), for system health condition assessment. Experimental results have confirmed that, compared to the conventional approach where statistical parameters from raw signals are used, the presented approach enabled a higher signal-to-noise ratio for more effective and intelligent use of the sensory information, thus leading to more accurate system health diagnosis.

  3. A systematic comparison of different object-based classification techniques using high spatial resolution imagery in agricultural environments

    NASA Astrophysics Data System (ADS)

    Li, Manchun; Ma, Lei; Blaschke, Thomas; Cheng, Liang; Tiede, Dirk

    2016-07-01

    Geographic Object-Based Image Analysis (GEOBIA) is becoming more prevalent in remote sensing classification, especially for high-resolution imagery. Many supervised classification approaches are applied to objects rather than pixels, and several studies have been conducted to evaluate the performance of such supervised classification techniques in GEOBIA. However, these studies did not systematically investigate all relevant factors affecting the classification (segmentation scale, training set size, feature selection and mixed objects). In this study, statistical methods and visual inspection were used to compare these factors systematically in two agricultural case studies in China. The results indicate that Random Forest (RF) and Support Vector Machines (SVM) are highly suitable for GEOBIA classifications in agricultural areas and confirm the expected general tendency, namely that the overall accuracies decline with increasing segmentation scale. All other investigated methods except for RF and SVM are more prone to obtain a lower accuracy due to the broken objects at fine scales. In contrast to some previous studies, the RF classifiers yielded the best results and the k-nearest neighbor classifier the worst, in most cases. Likewise, the RF and Decision Tree classifiers are the most robust with or without feature selection. The results of training sample analyses indicated that RF and AdaBoost.M1 possess a superior generalization capability, except when dealing with small training sample sizes. Furthermore, the classification accuracies were directly related to the homogeneity/heterogeneity of the segmented objects for all classifiers. Finally, it was suggested that RF should be considered in most cases for agricultural mapping.

  4. Driving behavior recognition using EEG data from a simulated car-following experiment.

    PubMed

    Yang, Liu; Ma, Rui; Zhang, H Michael; Guan, Wei; Jiang, Shixiong

    2018-07-01

    Driving behavior recognition is the foundation of driver assistance systems, with potential applications in automated driving systems. Most prevailing studies have used subjective questionnaire data and objective driving data to classify driving behaviors, while few studies have used physiological signals such as electroencephalography (EEG) to gather data. To bridge this gap, this paper proposes a two-layer learning method for driving behavior recognition using EEG data. A simulated car-following driving experiment was designed and conducted to simultaneously collect data on the driving behaviors and EEG data of drivers. The proposed learning method consists of two layers. In Layer I, two-dimensional driving behavior features representing driving style and stability were selected and extracted from raw driving behavior data using K-means and support vector machine recursive feature elimination. Five groups of driving behaviors were classified based on these two-dimensional driving behavior features. In Layer II, the classification results from Layer I were utilized as inputs to generate a k-Nearest-Neighbor classifier identifying driving behavior groups using EEG data. Using independent component analysis, a fast Fourier transformation, and linear discriminant analysis sequentially, the raw EEG signals were processed to extract two core EEG features. Classifier performance was enhanced using the adaptive synthetic sampling approach. A leave-one-subject-out cross validation was conducted. The results showed that the average classification accuracy for all tested traffic states was 69.5% and the highest accuracy reached 83.5%, suggesting a significant correlation between EEG patterns and car-following behavior. Copyright © 2017 Elsevier Ltd. All rights reserved.
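
    Leave-one-subject-out cross validation, as mentioned above, holds out all samples of one subject per fold so that no subject appears in both training and test data. A minimal sketch of the fold generation follows; the function name and the toy subject IDs are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple


def leave_one_subject_out(
    subject_ids: Sequence[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, subject in enumerate(subject_ids) if subject != held_out]
        test = [i for i, subject in enumerate(subject_ids) if subject == held_out]
        yield held_out, train, test


# One entry per sample, naming the (hypothetical) driver it was recorded from.
sample_subjects = ["a", "a", "b", "c", "c"]
for subject, train_idx, test_idx in leave_one_subject_out(sample_subjects):
    print(subject, train_idx, test_idx)
```

    A classifier would be fitted on the training indices of each fold and scored on the held-out subject, with the per-fold accuracies then averaged.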

  5. Multi-feature classifiers for burst detection in single EEG channels from preterm infants

    NASA Astrophysics Data System (ADS)

    Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.

    2017-08-01

    Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might underperform with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾ 36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained on visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The best-performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier to an unlabeled database of 21 infants ⩾ 36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
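
    Cohen's kappa, the agreement score quoted above, corrects the observed agreement between two raters for the agreement expected by chance. The sketch below shows the standard two-rater formula on made-up labels; the function name and the toy data are illustrative and do not reproduce the study's recordings.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(labels_a: Sequence[Hashable], labels_b: Sequence[Hashable]) -> float:
    """Cohen's kappa: (p_observed - p_expected) / (1 - p_expected)."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


# 1 = burst, 0 = inter-burst; hypothetical human vs. automatic labels.
human = [1, 1, 0, 0]
automatic = [1, 1, 1, 0]
print(cohens_kappa(human, automatic))  # prints 0.5
```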

  6. Protein classification based on text document classification techniques.

    PubMed

    Cheng, Betty Yee Man; Carbonell, Jaime G; Klein-Seetharaman, Judith

    2005-03-01

    The need for accurate, automated protein classification methods continues to increase as advances in biotechnology uncover new proteins. G-protein coupled receptors (GPCRs) are a particularly difficult superfamily of proteins to classify due to extreme diversity among its members. Previous comparisons of BLAST, k-nearest neighbor (k-NN), hidden Markov model (HMM) and support vector machine (SVM) using alignment-based features have suggested that classifiers at the complexity of SVM are needed to attain high accuracy. Here, analogous to document classification, we applied Decision Tree and Naive Bayes classifiers with chi-square feature selection on counts of n-grams (i.e. short peptide sequences of length n) to this classification task. Using the GPCR dataset and evaluation protocol from the previous study, the Naive Bayes classifier attained an accuracy of 93.0 and 92.4% in level I and level II subfamily classification respectively, while SVM has a reported accuracy of 88.4 and 86.3%. This is a 39.7 and 44.5% reduction in residual error for level I and level II subfamily classification, respectively. The Decision Tree, while inferior to SVM, outperforms HMM in both level I and level II subfamily classification. For those GPCR families whose profiles are stored in the Protein FAMilies database of alignments and HMMs (PFAM), our method performs comparably to a search against those profiles. Finally, our method can be generalized to other protein families by applying it to the superfamily of nuclear receptors with 94.5, 97.8 and 93.6% accuracy in family, level I and level II subfamily classification respectively. Copyright 2005 Wiley-Liss, Inc.
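
    The residual-error reductions quoted above can be checked from the reported accuracies: the residual error is 100% minus the accuracy, and the reduction is taken relative to the SVM baseline. The short sketch below uses a hypothetical helper name and only numbers stated in the abstract.

```python
def residual_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Relative reduction (in %) of the error rate with respect to the baseline."""
    baseline_error = 100.0 - baseline_accuracy
    new_error = 100.0 - new_accuracy
    return 100.0 * (baseline_error - new_error) / baseline_error


# Accuracies (in %) from the abstract: SVM baseline vs. Naive Bayes.
level_1 = residual_error_reduction(baseline_accuracy=88.4, new_accuracy=93.0)
level_2 = residual_error_reduction(baseline_accuracy=86.3, new_accuracy=92.4)
print(round(level_1, 1), round(level_2, 1))  # prints 39.7 44.5
```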

  7. Estimating local scaling properties for the classification of interstitial lung disease patterns

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel

    2011-03-01

    Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.

  8. Development of a computer aided diagnosis model for prostate cancer classification on multi-parametric MRI

    NASA Astrophysics Data System (ADS)

    Alfano, R.; Soetemans, D.; Bauman, G. S.; Gibson, E.; Gaed, M.; Moussa, M.; Gomez, J. A.; Chin, J. L.; Pautler, S.; Ward, A. D.

    2018-02-01

    Multi-parametric MRI (mp-MRI) is becoming a standard in contemporary prostate cancer screening and diagnosis, and has been shown to aid physicians in cancer detection. It offers many advantages over traditional systematic biopsy, which has been shown to have very high clinical false-negative rates of up to 23% at all stages of the disease. However beneficial, mp-MRI is relatively complex to interpret and suffers from inter-observer variability in lesion localization and grading. Computer-aided diagnosis (CAD) systems have been developed as a solution as they have the power to perform deterministic quantitative image analysis. We measured the accuracy of such a system validated using accurately co-registered whole-mount digitized histology. We trained a logistic linear classifier (LOGLC), support vector machine (SVC), k-nearest neighbour (KNN) and random forest classifier (RFC) in a four part ROI based experiment against: 1) cancer vs. non-cancer, 2) high-grade (Gleason score ≥4+3) vs. low-grade cancer (Gleason score <4+3), 3) high-grade vs. other tissue components and 4) high-grade vs. benign tissue by selecting the classifier with the highest AUC using 1-10 features from forward feature selection. The CAD model was able to classify malignant vs. benign tissue and detect high-grade cancer with high accuracy. Once fully validated, this work will form the basis for a tool that enhances the radiologist's ability to detect malignancies, potentially improving biopsy guidance, treatment selection, and focal therapy for prostate cancer patients, maximizing the potential for cure and increasing quality of life.

  9. iAMP-2L: a two-level multi-label classifier for identifying antimicrobial peptides and their functional types.

    PubMed

    Xiao, Xuan; Wang, Pu; Lin, Wei-Zhong; Jia, Jian-Hua; Chou, Kuo-Chen

    2013-05-15

    Antimicrobial peptides (AMPs), also called host defense peptides, are an evolutionarily conserved component of the innate immune response and are found among all classes of life. According to their special functions, AMPs are generally classified into ten categories: Antibacterial Peptides, Anticancer/tumor Peptides, Antifungal Peptides, Anti-HIV Peptides, Antiviral Peptides, Antiparasital Peptides, Anti-protist Peptides, AMPs with Chemotactic Activity, Insecticidal Peptides, and Spermicidal Peptides. Given a query peptide, how can we identify whether it is an AMP or non-AMP? If it is, can we identify which functional type or types it belongs to? Particularly, how can we deal with the multi-type problem since an AMP may belong to two or more functional types? To address these problems, which are obviously very important to both basic research and drug development, a multi-label classifier was developed based on the pseudo amino acid composition (PseAAC) and fuzzy K-nearest neighbor (FKNN) algorithm, where the components of PseAAC were featured by incorporating five physicochemical properties. The novel classifier is called iAMP-2L, where "2L" means that it is a 2-level predictor. The 1st-level is to answer the 1st question above, while the 2nd-level is to answer the 2nd and 3rd questions that are beyond the reach of any existing methods in this area. For the convenience of users, a user-friendly web-server for iAMP-2L was established at http://www.jci-bioinfo.cn/iAMP-2L. Copyright © 2013 Elsevier Inc. All rights reserved.

  10. Mediterranean Land Use and Land Cover Classification Assessment Using High Spatial Resolution Data

    NASA Astrophysics Data System (ADS)

    Elhag, Mohamed; Boteva, Silvena

    2016-10-01

    Landscape fragmentation is pronounced in Mediterranean regions and imposes substantial complications on several satellite image classification methods. To some extent, high spatial resolution data were able to overcome such complications. For better classification performance in Land Use Land Cover (LULC) mapping, the current research compares different classification methods for LULC mapping using Sentinel-2 satellite data as a source of high spatial resolution. Both pixel-based and object-based classification algorithms were assessed; the pixel-based approach employs Maximum Likelihood (ML), Artificial Neural Network (ANN), and Support Vector Machine (SVM) algorithms, and the object-based classification uses the Nearest Neighbour (NN) classifier. A Stratified Masking Process (SMP) that integrates a ranking process within the classes based on spectral fluctuation of the sum of the training and testing sites was implemented. An analysis of the overall and individual accuracy of the classification results of all four methods reveals that the SVM classifier was the most efficient overall, distinguishing most of the classes with the highest accuracy. NN dealt successfully with artificial surface classes in general, while agriculture area classes and forest and semi-natural area classes were segregated successfully with SVM. Furthermore, a comparative analysis indicates that the conventional classification method yielded better accuracy results than the SMP method overall with both classifiers used, ML and SVM.

  11. Image-based Analysis of Emotional Facial Expressions in Full Face Transplants.

    PubMed

    Bedeloglu, Merve; Topcu, Çagdas; Akgul, Arzu; Döger, Ela Naz; Sever, Refik; Ozkan, Ozlenen; Ozkan, Omer; Uysal, Hilmi; Polat, Ovunc; Çolak, Omer Halil

    2018-01-20

    This study aims to determine, from photographs, the degree of development in emotional expression of full face transplant patients, so that a rehabilitation process can be planned according to the determined degrees in later work. As envisaged, in full face transplant cases, expressions can be confused or cannot be performed as well as in the healthy control group. For the image-based analysis, a control group of 9 healthy males and 2 full-face transplant patients participated in the study. Appearance-based Gabor Wavelet Transform (GWT) and Local Binary Pattern (LBP) methods were adopted for recognizing neutral and 6 emotional expressions: angry, scared, happy, hate, confused and sad. Feature extraction was carried out using each method separately and both methods combined serially. For the performed expressions, features extracted from the most distinctive zones of the facial area, namely the eye and mouth regions, were used to classify the emotions. The combination of these region features was also used to improve classifier performance. The ability of control subjects and transplant patients to perform emotional expressions was determined with a K-nearest neighbor (KNN) classifier with region-specific and method-specific decision stages. The results were compared with those of the healthy group. It was observed that transplant patients do not reflect some emotional expressions. Also, there were confusions among expressions.

  12. An Event-Triggered Machine Learning Approach for Accelerometer-Based Fall Detection.

    PubMed

    Putra, I Putu Edy Suardiyana; Brusey, James; Gaura, Elena; Vesilo, Rein

    2017-12-22

    The fixed-size non-overlapping sliding window (FNSW) and fixed-size overlapping sliding window (FOSW) approaches are the most commonly used data-segmentation techniques in machine learning-based fall detection using accelerometer sensors. However, these techniques do not segment by fall stages (pre-impact, impact, and post-impact) and thus useful information is lost, which may reduce the detection rate of the classifier. Aligning the segment with the fall stage is difficult, as the segment size varies. We propose an event-triggered machine learning (EvenT-ML) approach that aligns each fall stage so that the characteristic features of the fall stages are more easily recognized. To evaluate our approach, two publicly accessible datasets were used. Classification and regression tree (CART), k-nearest neighbor (k-NN), logistic regression (LR), and the support vector machine (SVM) were used to train the classifiers. EvenT-ML gives classifier F-scores of 98% for a chest-worn sensor and 92% for a waist-worn sensor, and significantly reduces the computational cost compared with the FNSW- and FOSW-based approaches, with reductions of up to 8-fold and 78-fold, respectively. EvenT-ML achieves a significantly better F-score than existing fall detection approaches. These results indicate that aligning feature segments with fall stages significantly increases the detection rate and reduces the computational cost.
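
    The two baseline segmentation schemes named above differ only in the step between consecutive windows. The sketch below illustrates fixed-size windowing with an optional overlap fraction; it is not the EvenT-ML method, and the function name, window size and overlap value are assumptions chosen for illustration.

```python
from typing import List, Sequence


def sliding_windows(
    samples: Sequence[float], size: int, overlap: float = 0.0
) -> List[Sequence[float]]:
    """Split a signal into fixed-size windows.

    overlap=0.0 gives non-overlapping windows (FNSW-style); 0 < overlap < 1
    gives overlapping windows (FOSW-style). Trailing samples that do not fill
    a whole window are dropped.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, int(size * (1.0 - overlap)))
    return [samples[i : i + size] for i in range(0, len(samples) - size + 1, step)]


signal = list(range(10))  # stand-in for accelerometer magnitudes
print(sliding_windows(signal, size=4))               # 2 windows
print(sliding_windows(signal, size=4, overlap=0.5))  # 4 windows
```

    With 50% overlap the same signal yields twice as many windows, which hints at why overlapping segmentation tends to cost more computation per recording.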

  13. Using Bayesian neural networks to classify forest scenes

    NASA Astrophysics Data System (ADS)

    Vehtari, Aki; Heikkonen, Jukka; Lampinen, Jouko; Juujarvi, Jouni

    1998-10-01

    We present results that compare the performance of Bayesian learning methods for neural networks on the task of classifying forest scenes into trees and background. The classification task is demanding due to the texture richness of the trees, occlusions of the forest scene objects and diverse lighting conditions under operation. This makes it difficult to determine which are optimal image features for the classification. A natural way to proceed is to extract many different types of potentially suitable features, and to evaluate their usefulness in later processing stages. One approach to cope with a large number of features is to use Bayesian methods to control the model complexity. Bayesian learning uses a prior on model parameters, combines this with evidence from the training data, and then integrates over the resulting posterior to make predictions. With this method, we can use large networks and many features without fear of overfitting. For this classification task we compare two Bayesian learning methods for multi-layer perceptron (MLP) neural networks: (1) The evidence framework of MacKay uses a Gaussian approximation to the posterior weight distribution and maximizes with respect to hyperparameters. (2) In a Markov Chain Monte Carlo (MCMC) method due to Neal, the posterior distribution of the network parameters is numerically integrated using the MCMC method. As baseline classifiers for comparison we use (3) MLP early stop committee, (4) K-nearest-neighbor and (5) Classification And Regression Tree.

  14. Fast clustering algorithm for large ECG data sets based on CS theory in combination with PCA and K-NN methods.

    PubMed

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2014-01-01

    Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for diagnostic and treatment purposes of heart diseases. Clustering and classification of the collected data are essential for detecting concealed information of P-QRS-T waves in the long-term ECG recording. Currently used algorithms do have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and load of sampling. These drawbacks motivated us to develop a novel optimized clustering algorithm which could easily scan large ECG datasets for establishing low power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by sorting of the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are applied to the proposed algorithm. We show that our algorithm based on PCA features in combination with the K-NN classifier performs better than other methods. The proposed algorithm outperforms existing algorithms by increasing classification accuracy by 11%. In addition, the proposed algorithm achieves classification accuracies for the K-NN and PNN classifiers and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75%, respectively.

  15. Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences.

    PubMed

    Chen, Peng; Li, Jinyan; Wong, Limsoon; Kuwahara, Hiroyuki; Huang, Jianhua Z; Gao, Xin

    2013-08-01

    Hot spot residues of proteins are fundamental interface residues that help proteins perform their functions. Detecting hot spots by experimental methods is costly and time-consuming. Sequential and structural information has been widely used in the computational prediction of hot spots. However, structural information is not always available. In this article, we investigated the problem of identifying hot spots using only physicochemical characteristics extracted from amino acid sequences. We first extracted 132 relatively independent physicochemical features from a set of the 544 properties in AAindex1, an amino acid index database. Each feature was utilized to train a classification model with a novel encoding schema for hot spot prediction by the IBk algorithm, an extension of the K-nearest neighbor algorithm. The combinations of the individual classifiers were explored and the classifiers that appeared frequently in the top performing combinations were selected. The hot spot predictor was built based on an ensemble of these classifiers and to work in a voting manner. Experimental results demonstrated that our method effectively exploited the feature space and allowed flexible weights of features for different queries. On the commonly used hot spot benchmark sets, our method significantly outperformed other machine learning algorithms and state-of-the-art hot spot predictors. The program is available at http://sfb.kaust.edu.sa/pages/software.aspx. Copyright © 2013 Wiley Periodicals, Inc.

  16. Computer-aided Prognosis of Neuroblastoma on Whole-slide Images: Classification of Stromal Development

    PubMed Central

    Sertel, O.; Kong, J.; Shimada, H.; Catalyurek, U.V.; Saltz, J.H.; Gurcan, M.N.

    2009-01-01

    We are developing a computer-aided prognosis system for neuroblastoma (NB), a cancer of the nervous system and one of the most malignant tumors affecting children. Histopathological examination is an important stage for further treatment planning in routine clinical diagnosis of NB. According to the International Neuroblastoma Pathology Classification (the Shimada system), NB patients are classified into favorable and unfavorable histology based on the tissue morphology. In this study, we propose an image analysis system that operates on digitized H&E stained whole-slide NB tissue samples and classifies each slide as either stroma-rich or stroma-poor based on the degree of Schwannian stromal development. Our statistical framework performs the classification based on texture features extracted using co-occurrence statistics and local binary patterns. Due to the high resolution of digitized whole-slide images, we propose a multi-resolution approach that mimics the evaluation of a pathologist such that the image analysis starts from the lowest resolution and switches to higher resolutions when necessary. We employ an offline feature selection step, which determines the most discriminative features at each resolution level during the training step. A modified k-nearest neighbor classifier is used to determine the confidence level of the classification to make the decision at a particular resolution level. The proposed approach was independently tested on 43 whole-slide samples and provided an overall classification accuracy of 88.4%. PMID:20161324
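
    The coarse-to-fine decision rule described in this record can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' classifier: the confidence measure (share of the k neighbor votes won by the top class), the threshold of 0.9, the function names, and the toy feature values are all hypothetical.

```python
from collections import Counter
import math


def knn_vote(train, query, k=3):
    """Return (label, confidence); confidence is the winning vote share (assumed)."""
    # train is a list of (feature_vector, label) pairs.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    label, count = votes.most_common(1)[0]
    return label, count / k


def classify_multires(train_by_level, query_by_level, k=3, threshold=0.9):
    """Start at the coarsest level; move to a finer one while confidence is low."""
    label, conf, level = None, 0.0, -1
    for level, (train, query) in enumerate(zip(train_by_level, query_by_level)):
        label, conf = knn_vote(train, query, k)
        if conf >= threshold:
            break  # confident enough, no finer resolution needed
    return label, conf, level


# Hypothetical texture features: level 0 is the lowest resolution.
train_by_level = [
    [((0.0,), "rich"), ((0.1,), "rich"), ((0.2,), "poor"),
     ((0.9,), "poor"), ((1.0,), "poor")],
    [((0.0, 0.0), "rich"), ((0.1, 0.1), "rich"), ((0.2, 0.1), "rich"),
     ((1.0, 1.0), "poor"), ((0.9, 1.0), "poor")],
]
query_by_level = [(0.12,), (0.1, 0.0)]
result = classify_multires(train_by_level, query_by_level)
```

    In this toy data the lowest resolution gives only a 2-of-3 vote, so the sketch falls through to the next level, where all three neighbors agree.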

  17. Object-based locust habitat mapping using high-resolution multispectral satellite data in the southern Aral Sea basin

    NASA Astrophysics Data System (ADS)

    Navratil, Peter; Wilps, Hans

    2013-01-01

    Three different object-based image classification techniques are applied to high-resolution satellite data for the mapping of the habitats of Asian migratory locust (Locusta migratoria migratoria) in the southern Aral Sea basin, Uzbekistan. A set of panchromatic and multispectral Système Pour l'Observation de la Terre-5 satellite images was spectrally enhanced by normalized difference vegetation index and tasseled cap transformation and segmented into image objects, which were then classified by three different classification approaches: a rule-based hierarchical fuzzy threshold (HFT) classification method was compared to a supervised nearest neighbor classifier and classification tree analysis by the quick, unbiased, efficient statistical trees algorithm. Special emphasis was laid on the discrimination of locust feeding and breeding habitats due to the significance of this discrimination for practical locust control. Field data on vegetation and land cover, collected at the time of satellite image acquisition, was used to evaluate classification accuracy. The results show that a robust HFT classifier outperformed the two automated procedures by 13% overall accuracy. The classification method allowed a reliable discrimination of locust feeding and breeding habitats, which is of significant importance for the application of the resulting data for an economically and environmentally sound control of locust pests because exact spatial knowledge on the habitat types allows a more effective surveying and use of pesticides.

  18. Comparative evaluation of support vector machine classification for computer aided detection of breast masses in mammography

    NASA Astrophysics Data System (ADS)

    Lesniak, J. M.; Hupse, R.; Blanc, R.; Karssemeijer, N.; Székely, G.

    2012-08-01

    False positive (FP) marks represent an obstacle for effective use of computer-aided detection (CADe) of breast masses in mammography. Typically, the problem can be approached either by developing more discriminative features or by employing different classifier designs. In this paper, the usage of support vector machine (SVM) classification for FP reduction in CADe is investigated, presenting a systematic quantitative evaluation against neural networks, k-nearest neighbor classification, linear discriminant analysis and random forests. A large database of 2516 film mammography examinations and 73 input features was used to train the classifiers and evaluate for their performance on correctly diagnosed exams as well as false negatives. Further, classifier robustness was investigated using varying training data and feature sets as input. The evaluation was based on the mean exam sensitivity in 0.05-1 FPs on normals on the free-response receiver operating characteristic curve (FROC), incorporated into a tenfold cross validation framework. It was found that SVM classification using a Gaussian kernel offered significantly increased detection performance (P = 0.0002) compared to the reference methods. Varying training data and input features, SVMs showed improved exploitation of large feature sets. It is concluded that with the SVM-based CADe a significant reduction of FPs is possible outperforming other state-of-the-art approaches for breast mass CADe.

  19. Evaluation of Short-Term Cepstral Based Features for Detection of Parkinson’s Disease Severity Levels through Speech signals

    NASA Astrophysics Data System (ADS)

    Oung, Qi Wei; Nisha Basah, Shafriza; Muthusamy, Hariharan; Vijean, Vikneswaran; Lee, Hoileong

    2018-03-01

    Parkinson’s disease (PD) is one type of progressive neurodegenerative disease known as motor system syndrome, which is due to the death of dopamine-generating cells in a region of the human midbrain. PD normally affects people over 60 years of age, and at present it has affected a large part of the worldwide population. Lately, many researchers have shown interest in the connection between PD and speech disorders. Research has revealed that speech signals may be a suitable biomarker for distinguishing people with Parkinson’s (PWP) from healthy subjects. Therefore, early diagnosis of PD through the speech signals can be considered for this aim. In this research, the speech data are acquired based on speech behaviour as the biomarker for differentiating PD severity levels (mild and moderate) from healthy subjects. Feature extraction algorithms applied are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Weighted Linear Prediction Cepstral Coefficients (WLPCC). For classification, two types of classifiers are used: k-Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN). The experimental results demonstrated that the PNN classifier and the KNN classifier achieve the best average classification performance of 92.63% and 88.56% respectively through 10-fold cross-validation measures. Favourably, the suggested techniques have the potential to become promising new tools for PD detection with strong performance.

  20. Prediction of in vivo hepatotoxicity effects using in vitro ...

    EPA Pesticide Factsheets

    High-throughput in vitro transcriptomics data support molecular understanding of chemical-induced toxicity. Here, we evaluated the utility of such data to predict liver toxicity. First, in vitro gene expression data for 93 genes was generated following exposure of metabolically competent HepaRG cells to 1060 environmental chemicals from the US EPA ToxCast library. The empirical relationship between these data and rat chronic liver endpoints from animal studies in the Toxicity Reference Database (ToxRefDB) was then evaluated using machine learning techniques. Chemicals were classified as positive (242) or negative (135) based on observed hepatic histopathologic effects, and divided into three categories: hypertrophy (183), injury (112) and proliferative lesions (101). Hepatotoxicants were classified on the basis of the bioactivity of 93 genes (descriptors) using six machine learning algorithms: linear discriminant analysis, naïve Bayes, support vector classification, classification and regression trees, k-nearest neighbors, and an ensemble of classifiers. Classification performance was evaluated using 10-fold cross-validation testing, and in-loop, filter-based, feature subset selection. The best balanced accuracy for prediction of hypertrophy, injury and proliferative lesions were 0.81 ± 0.07, 0.79 ± 0.08 and 0.77 ± 0.09, respectively. Gene specific perturbation of xenobiotic metabolism enzymes (CYP7A1/2E1/4A11/1A1/4A22) and transporters (ABCG2, ABCB11, SLC22

  1. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Treesearch

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  2. Three-Dimensional Printing of Vitrification Loop Prototypes for Aquatic Species.

    PubMed

    Tiersch, Nolan J; Childress, William M; Tiersch, Terrence R

    2018-05-16

    Vitrification is a method of cryopreservation that freezes samples rapidly, while forming an amorphous solid ("glass"), typically in small (μL) volumes. The goal of this project was to create, by three-dimensional (3D) printing, open vitrification devices based on an elliptical loop that could be efficiently used and stored. Vitrification efforts can benefit from the application of 3D printing, and to begin integration of this technology, we addressed four main variables: thermoplastic filament type, loop length, loop height, and method of loading. Our objectives were to: (1) design vitrification loops with varied dimensions; (2) print prototype loops for testing; (3) evaluate loading methods for the devices; and (4) classify vitrification responses to multiple device configurations. The various configurations were designed digitally using 3D CAD (Computer Aided Design) software, and prototype devices were produced with MakerBot® 3D printers. The thermoplastic filaments used to produce devices were acrylonitrile butadiene styrene (ABS) and polylactic acid (PLA). Vitrification devices were characterized by the film volumes formed with different methods of loading (pipetting or submersion). Frozen films were classified to determine vitrification quality: zero (opaque, or abundant crystalline ice formation); one (translucent, or partial vitrification), or two (transparent, or substantial vitrification, glass). A published vitrification solution was used to conduct experiments. Loading by pipetting formed frozen films more reliably than by submersion, but submersion yielded fewer filling problems and was more rapid. The loop designs that yielded the highest levels of vitrification enabled rapid transfer of heat, and most often were characterized as being longer and consisting of fewer layers (height). 3D printing can assist standardization of vitrification methods and research, yet can also provide the ability to quickly design and fabricate custom devices when needed.

  3. DARPA counter-sniper program: Phase 1 Acoustic Systems Demonstration results

    NASA Astrophysics Data System (ADS)

    Carapezza, Edward M.; Law, David B.; Csanadi, Christina J.

    1997-02-01

    During October 1995 through May 1996, the Defense Advanced Research Projects Agency sponsored the development of prototype systems that exploit acoustic muzzle blast and ballistic shock wave signatures to accurately predict the location of gunfire events and associated shooter locations using either single or multiple volumetric arrays. The output of these acoustic systems is an estimate of the shooter location and a classification estimate of the caliber of the shooter's weapon. A portable display and control unit provides both graphical and alphanumeric shooter location related information integrated on a two-dimensional digital map of the defended area. The final Phase I Acoustic Systems Demonstration field tests were completed in May. These tests were held at the USMC Base Camp Pendleton Military Operations Urban Training (MOUT) facility. These tests were structured to provide challenging gunfire related scenarios with significant reverberation and multi-path conditions. Special shot geometries and false alarms were included in these tests to probe potential system vulnerabilities and to determine the performance and robustness of the systems. Five prototypes developed by U.S. companies and one Israeli-developed prototype were tested. This analysis quantifies the spatial resolution estimation capability (azimuth, elevation and range) of these prototypes and describes their ability to accurately classify the type of bullet fired in a challenging urban-like setting.

  4. An Online Biosensor for the Protection of Water Supplies

    DTIC Science & Technology

    2015-03-01

    microfluidic device, analyze this data using the embedded classifier algorithm, and transmit the results via encrypted Wi-Fi. Our final deliverable is a...self-contained water sensor prototype. Attached are the Year 1 results that demonstrate our ability to acquire and analyze data in real time to...Seq results for promoter activation in E. coli MG1655 in response to single and multiple toxin exposures at low and high concentrations

  5. Object Detection using the Kinect

    DTIC Science & Technology

    2012-03-01

    Kinect camera and point cloud data from the Kinect’s structured light stereo system (figure 1). We obtain reasonable results using a single prototype...same manner we present in this report. For example, at Willow Garage, Steder uses a 3-D feature he developed to classify objects directly from point...detecting backpacks using the data available from the Kinect sensor. 4 3.1 Point Cloud Filtering Dense point clouds derived from stereo are notoriously

  6. Real Time Intelligent Target Detection and Analysis with Machine Vision

    NASA Technical Reports Server (NTRS)

    Howard, Ayanna; Padgett, Curtis; Brown, Kenneth

    2000-01-01

    We present an algorithm for detecting a specified set of targets for an Automatic Target Recognition (ATR) application. ATR involves processing images for detecting, classifying, and tracking targets embedded in a background scene. We address the problem of discriminating between targets and nontarget objects in a scene by evaluating 40x40 image blocks belonging to an image. Each image block is first projected onto a set of templates specifically designed to separate images of targets embedded in a typical background scene from those background images without targets. These filters are found using directed principal component analysis which maximally separates the two groups. The projected images are then clustered into one of n classes based on a minimum distance to a set of n cluster prototypes. These cluster prototypes have previously been identified using a modified clustering algorithm based on prior sensed data. Each projected image pattern is then fed into the associated cluster's trained neural network for classification. A detailed description of our algorithm will be given in this paper. We outline our methodology for designing the templates, describe our modified clustering algorithm, and provide details on the neural network classifiers. Evaluation of the overall algorithm demonstrates that our detection rates approach 96% with a false positive rate of less than 0.03%.
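
    The minimum-distance assignment step in this record (each projected image block goes to the closest of n cluster prototypes) can be sketched as below. This is an illustration only: Euclidean distance is an assumption, since the abstract does not name the metric, and the function name, prototype coordinates, and block vectors are hypothetical.

```python
import math


def nearest_prototype(vector, prototypes):
    """Return the index of the prototype closest to `vector` (Euclidean, assumed)."""
    return min(range(len(prototypes)),
               key=lambda i: math.dist(vector, prototypes[i]))


# Hypothetical 2-D projections and n = 3 cluster prototypes.
prototypes = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
projected_blocks = [(0.5, -0.2), (4.1, 4.8), (1.0, 4.0)]

# Each block would then be passed to the classifier trained for its cluster.
assignments = [nearest_prototype(v, prototypes) for v in projected_blocks]
```

    The returned index plays the role of the cluster label that selects which per-cluster neural network handles the block.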

  7. A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

    PubMed Central

    Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy

    2014-01-01

    Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered as the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is benefitting from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to invest in the strength of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the proposed model in terms of classification accuracy is desirable, promising, and competitive to the existing state-of-the-art classification models. PMID:25419659

  8. Ising lattices with +/-J second-nearest-neighbor interactions

    NASA Astrophysics Data System (ADS)

    Ramírez-Pastor, A. J.; Nieto, F.; Vogel, E. E.

    1997-06-01

    Second-nearest-neighbor interactions are added to the usual nearest-neighbor Ising Hamiltonian for square lattices in different ways. The starting point is a square lattice where half the nearest-neighbor interactions are ferromagnetic and the other half of the bonds are antiferromagnetic. Then, second-nearest-neighbor interactions can also be assigned randomly or in a variety of causal manners determined by the nearest-neighbor interactions. In the present paper we consider three causal and three random ways of assigning second-nearest-neighbor exchange interactions. Several ground-state properties are then calculated for each of these lattices: energy per bond ɛg, site correlation parameter pg, maximal magnetization μg, and fraction of unfrustrated bonds hg. A set of 500 samples is considered for each size N (number of spins) and array (way of distributing the N spins). The properties of the original lattices with only nearest-neighbor interactions are already known, which allows realizing the effect of the additional interactions. We also include cubic lattices to discuss the distinction between coordination number and dimensionality. Comparison with results for triangular and honeycomb lattices is done at specific points.

  9. Interplay of Socioeconomic Status and Supermarket Distance Is Associated with Excess Obesity Risk: A UK Cross-Sectional Study

    PubMed Central

    Mackenbach, Joreintje D.; Lakerveld, Jeroen; Forouhi, Nita G.; Griffin, Simon J.; Brage, Søren; Wareham, Nicholas J.; Monsivais, Pablo

    2017-01-01

    U.S. policy initiatives have sought to improve health through attracting neighborhood supermarket investment. Little evidence exists to suggest that these policies will be effective, in particular where there are socioeconomic barriers to healthy eating. We measured the independent associations and combined interplay of supermarket access and socioeconomic status with obesity. Using data on 9702 UK adults, we employed adjusted regression analyses to estimate measured BMI (kg/m²), overweight (25 ≤ BMI < 30) and obesity (≥30), across participants’ highest educational attainment (three groups) and tertiles of street network distance (km) from home location to nearest supermarket. Jointly-classified models estimated combined associations of education and supermarket distance, and relative excess risk due to interaction (RERI). Participants farthest away from their nearest supermarket had higher odds of obesity (OR 1.33, 95% CI: 1.11, 1.58), relative to those living closest. Lower education was also associated with higher odds of obesity. Those least-educated and living farthest away had 3.39 (2.46–4.65) times the odds of being obese, compared to those highest-educated and living closest, with an excess obesity risk (RERI = 0.09); results were similar for overweight. Our results suggest that public health can be improved through planning better access to supermarkets, in combination with interventions to address socioeconomic barriers. PMID:29068365

  10. Interplay of Socioeconomic Status and Supermarket Distance Is Associated with Excess Obesity Risk: A UK Cross-Sectional Study.

    PubMed

    Burgoine, Thomas; Mackenbach, Joreintje D; Lakerveld, Jeroen; Forouhi, Nita G; Griffin, Simon J; Brage, Søren; Wareham, Nicholas J; Monsivais, Pablo

    2017-10-25

    U.S. policy initiatives have sought to improve health through attracting neighborhood supermarket investment. Little evidence exists to suggest that these policies will be effective, in particular where there are socioeconomic barriers to healthy eating. We measured the independent associations and combined interplay of supermarket access and socioeconomic status with obesity. Using data on 9702 UK adults, we employed adjusted regression analyses to estimate measured BMI (kg/m²), overweight (25 ≤ BMI < 30) and obesity (≥30), across participants' highest educational attainment (three groups) and tertiles of street network distance (km) from home location to nearest supermarket. Jointly-classified models estimated combined associations of education and supermarket distance, and relative excess risk due to interaction (RERI). Participants farthest away from their nearest supermarket had higher odds of obesity (OR 1.33, 95% CI: 1.11, 1.58), relative to those living closest. Lower education was also associated with higher odds of obesity. Those least-educated and living farthest away had 3.39 (2.46-4.65) times the odds of being obese, compared to those highest-educated and living closest, with an excess obesity risk (RERI = 0.09); results were similar for overweight. Our results suggest that public health can be improved through planning better access to supermarkets, in combination with interventions to address socioeconomic barriers.
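
    For readers unfamiliar with RERI, the commonly used definition on the additive scale is given below, with odds ratios standing in for relative risks. The subscript notation is an assumption introduced here: OR_11 is the joint category (least-educated and living farthest away), OR_10 and OR_01 are the two single-exposure categories, and all three are relative to the reference group (highest-educated and living closest). The abstract reports the joint odds ratio and the RERI but not the single-exposure odds ratios from the jointly-classified model, so no values are substituted.

```latex
\mathrm{RERI} = \mathrm{OR}_{11} - \mathrm{OR}_{10} - \mathrm{OR}_{01} + 1
```

    A value above zero indicates that the joint association exceeds the sum of the two separate associations.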

  11. The application of k-Nearest Neighbour in the identification of high potential archers based on relative psychological coping skills variables

    NASA Astrophysics Data System (ADS)

    Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Muaz Alim, Muhammad; Nasir, Ahmad Fakhri Ab

    2018-04-01

    The present study aims at classifying and predicting high and low potential archers from a collection of psychological coping skills variables trained on different k-Nearest Neighbour (k-NN) kernels. 50 youth archers with the average age and standard deviation of (17.0 ± 0.056), gathered from various archery programmes, completed a one end shooting score test. A psychological coping skills inventory, which evaluates the archers' level of related coping skills, was filled out by the archers prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on the variables assessed. k-NN models, i.e. fine, medium, coarse, cosine, cubic and weighted kernel functions, were trained on the psychological variables. The k-means clustered the archers into high psychologically prepared archers (HPPA) and low psychologically prepared archers (LPPA), respectively. It was demonstrated that the cosine k-NN model exhibited good accuracy and precision throughout the exercise, with an accuracy of 94% and a considerably lower error rate for the prediction of the HPPA and the LPPA as compared to the rest of the models. The findings of this investigation can be valuable to coaches and sports managers to recognise high potential athletes from the selected psychological coping skills variables examined, which would consequently save time and energy during talent identification and development programmes.
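
    A k-NN classifier that ranks neighbors by cosine distance, the variant reported as most accurate in this record, can be sketched as follows. This is a minimal illustration, not the study's model: the three-variable score vectors, the value k = 3, and the function names are hypothetical, and only the HPPA/LPPA labels come from the abstract.

```python
from collections import Counter
import math


def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def cosine_knn(train, query, k=3):
    """Majority label among the k training points with smallest cosine distance."""
    nearest = sorted(train, key=lambda item: cosine_distance(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical coping-skill scores (three variables per archer).
train = [
    ((9, 8, 9), "HPPA"), ((8, 9, 8), "HPPA"), ((7, 7, 8), "HPPA"),
    ((9, 2, 1), "LPPA"), ((8, 1, 2), "LPPA"),
]
prediction = cosine_knn(train, (6, 6, 7))
```

    Cosine distance compares the shape of the score profile rather than its magnitude, so an archer with uniformly scaled scores lands next to archers with the same relative pattern.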

  12. A spin transfer torque magnetoresistance random access memory-based high-density and ultralow-power associative memory for fully data-adaptive nearest neighbor search with current-mode similarity evaluation and time-domain minimum searching

    NASA Astrophysics Data System (ADS)

    Ma, Yitao; Miura, Sadahiko; Honjo, Hiroaki; Ikeda, Shoji; Hanyu, Takahiro; Ohno, Hideo; Endoh, Tetsuo

    2017-04-01

    A high-density nonvolatile associative memory (NV-AM) based on spin transfer torque magnetoresistive random access memory (STT-MRAM), which achieves highly concurrent and ultralow-power nearest neighbor search with full adaptivity of the template data format, has been proposed and fabricated using the 90 nm CMOS/70 nm perpendicular-magnetic-tunnel-junction hybrid process. A truly compact current-mode circuitry is developed to realize flexibly controllable and high-parallel similarity evaluation, which makes the NV-AM adaptable to any dimensionality and component-bit of template data. A compact dual-stage time-domain minimum searching circuit is also developed, which can freely extend the system for more template data by connecting multiple NV-AM cores without additional circuits for integrated processing. Both the embedded STT-MRAM module and the computing circuit modules in this NV-AM chip are synchronously power-gated to completely eliminate standby power and maximally reduce operation power by only activating the currently accessed circuit blocks. The operations of a prototype chip at 40 MHz are demonstrated by measurement. The average operation power is only 130 µW, and the circuit density is less than 11 µm²/bit. Compared with the latest conventional works in both volatile and nonvolatile approaches, more than 31.3% circuit area reductions and 99.2% power improvements are achieved, respectively. Further power performance analyses are discussed, which verify the special superiority of the proposed NV-AM in low-power and large-memory-based VLSIs.

  13. Development of a systematic approach to rapid classification and identification of notoginsenosides and metabolites in rat feces based on liquid chromatography coupled triple time-of-flight mass spectrometry.

    PubMed

    Xing, Rong; Zhou, Lijun; Xie, Lin; Hao, Kun; Rao, Tai; Wang, Qian; Ye, Wei; Fu, Hanxu; Wang, Xinwen; Wang, Guangji; Liang, Yan

    2015-03-31

    The present work contributes to the development of a powerful technical platform to rapidly identify and classify complicated components and metabolites for traditional Chinese medicines. In this process, notoginsenosides, the main active ingredients in Panax notoginseng, were chosen as model compounds. Firstly, the fragmental patterns, diagnostic product ions and neutral loss of each subfamily of notoginsenosides were summarized by collision-induced dissociation analysis of representative authentic standards. Next, in order to maximally cover low-concentration components which could otherwise be missed by the previous diagnostic fragment-ion method using only a single product ion of notoginsenosides, a multiple product ions filtering strategy was proposed and utilized to identify and classify both non-target and target notoginsenosides of P. notoginseng extract (in vitro). With this strategy, 13 protopanaxadiol-type notoginsenosides and 30 protopanaxatriol-type notoginsenosides were efficiently extracted. Then, a neutral loss filtering technique was employed to trace prototype components and metabolites in rats (in vivo), since diagnostic product ions might shift and therefore become unpredictable when metabolic reactions occurred on the mother skeleton of notoginsenosides. After comparing the constituent profiles in vitro with in vivo, 62 drug-related components were identified from rat feces, and these components were classified into 27 prototype compounds and 35 metabolites. Lastly, all the metabolites were successfully correlated to their parent compounds based on the chemicalome-metabolome matching approach which was previously built by our group. This study provided a generally applicable approach to global metabolite identification for the complicated components in complex matrices. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Interaction of Language, Culture and Cognition in Group Dynamics for Understanding the Adversary

    DTIC Science & Technology

    2010-07-01

    is particularly evident in noun class b. described above, which mixes women with what European cultures would classify as “inanimate” entities, as...languages emerge gradually from an ancient -root prototypical language. For example, when modern Italian and modern French emerged (and diverged) from...Rather, CGT relates to what an individual might conceptualize as ingroup and outgroup in a given context. In CGT, two kinds of groups are defined

  15. A Multi-Week Behavioral Sampling Tag for Sound Effects Studies: Design Trade-Offs and Prototype Evaluation

    DTIC Science & Technology

    2013-09-30

    performance of algorithms detecting dives, strokes, clicks, respiration and gait changes. (ii) Calibration errors: Size and power constraints in...acceptance parameters used to detect and classify events. For example, swim stroke detection requires parameters defining the minimum magnitude and the min...and max duration of a stroke. Species dependent parameters can be selected from existing DTAG data but other parameters depend on the size of the

  16. Machine-learning approaches to select Wolf-Rayet candidates

    NASA Astrophysics Data System (ADS)

    Marston, A. P.; Morello, G.; Morris, P.; van Dyk, S.; Mauerhan, J.

    2017-11-01

    The WR stellar population can be distinguished, at least partially, from other stellar populations by broad-band IR colour selection. We present the use of a machine learning classifier to quantitatively improve the selection of Galactic Wolf-Rayet (WR) candidates. These methods are used to separate the other stellar populations which have similar IR colours. We show the results of the classifications obtained by using the 2MASS J, H and K photometric bands, and the Spitzer/IRAC bands at 3.6, 4.5, 5.8 and 8.0μm. The k-Nearest Neighbour method has been used to select Galactic WR candidates for observational follow-up. A few candidates have been spectroscopically observed. Preliminary observations suggest that a detection rate of 50% can easily be achieved.

  17. Compact localized states and flat-band generators in one dimension

    NASA Astrophysics Data System (ADS)

    Maimaiti, Wulayimu; Andreanov, Alexei; Park, Hee Chul; Gendelman, Oleg; Flach, Sergej

    2017-03-01

    Flat bands (FB) are strictly dispersionless bands in the Bloch spectrum of a periodic lattice Hamiltonian, recently observed in a variety of photonic and dissipative condensate networks. FB Hamiltonians are fine-tuned networks, still lacking a comprehensive generating principle. We introduce a FB generator based on local network properties. We classify FB networks through the properties of compact localized states (CLS) which are exact FB eigenstates and occupy U unit cells. We obtain the complete two-parameter FB family of two-band d = 1 networks with nearest unit cell interaction and U = 2. We discover a novel high symmetry sawtooth chain with identical hoppings in a transverse dc field, easily accessible in experiments. Our results pave the way towards a complete description of FBs in networks with more bands and in higher dimensions.

  18. Transportation Modes Classification Using Sensors on Smartphones.

    PubMed

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-08-19

    This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms, namely decision trees, K-nearest neighbor, and support vector machine, to classify the user's transportation and vehicular modes. In the experiments, we discuss and compare the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes.

  19. Transportation Modes Classification Using Sensors on Smartphones

    PubMed Central

    Fang, Shih-Hau; Liao, Hao-Hsiang; Fei, Yu-Xiang; Chen, Kai-Hsiang; Huang, Jen-Wei; Lu, Yu-Ding; Tsao, Yu

    2016-01-01

    This paper investigates the classification of transportation and vehicular modes using big data from smartphone sensors. The three types of sensors used in this paper are the accelerometer, magnetometer, and gyroscope. This study proposes improved features and uses three machine learning algorithms, namely decision trees, K-nearest neighbor, and support vector machine, to classify the user’s transportation and vehicular modes. In the experiments, we discuss and compare the performance from different perspectives, including the accuracy for both modes, the execution time, and the model size. Results show that the proposed features enhance the accuracy; the support vector machine provides the best classification accuracy but requires the longest prediction time. This paper also investigates the vehicle classification mode and compares the results with those of the transportation modes. PMID:27548182

  20. Nonlinear features for product inspection

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.

    1999-03-01

    Classification of real-time X-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work, the MRDF is applied to standard features (rather than iconic data). The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC (receiver operating characteristic) data.

  1. Activity recognition in planetary navigation field tests using classification algorithms applied to accelerometer data.

    PubMed

    Song, Wen; Ade, Carl; Broxterman, Ryan; Barstow, Thomas; Nelson, Thomas; Warren, Steve

    2012-01-01

    Accelerometer data provide useful information about subject activity in many different application scenarios. For this study, single-accelerometer data were acquired from subjects participating in field tests that mimic tasks that astronauts might encounter in reduced gravity environments. The primary goal of this effort was to apply classification algorithms that could identify these tasks based on features present in their corresponding accelerometer data, where the end goal is to establish methods to unobtrusively gauge subject well-being based on sensors that reside in their local environment. In this initial analysis, six different activities that involve leg movement are classified. The k-Nearest Neighbors (kNN) algorithm was found to be the most effective, with an overall classification success rate of 90.8%.

  2. Turbulent chimeras in large semiconductor laser arrays

    PubMed Central

    Shena, J.; Hizanidis, J.; Kovanis, V.; Tsironis, G. P.

    2017-01-01

    Semiconductor laser arrays have been investigated experimentally and theoretically from the viewpoint of temporal and spatial coherence for the past forty years. In this work, we are focusing on a rather novel complex collective behavior, namely chimera states, where synchronized clusters of emitters coexist with unsynchronized ones. For the first time, we find such states exist in large diode arrays based on quantum well gain media with nearest-neighbor interactions. The crucial parameters are the evanescent coupling strength and the relative optical frequency detuning between the emitters of the array. By employing a recently proposed figure of merit for classifying chimera states, we provide quantitative and qualitative evidence for the observed dynamics. The corresponding chimeras are identified as turbulent according to the irregular temporal behavior of the classification measure. PMID:28165053

  3. Turbulent chimeras in large semiconductor laser arrays

    NASA Astrophysics Data System (ADS)

    Shena, J.; Hizanidis, J.; Kovanis, V.; Tsironis, G. P.

    2017-02-01

    Semiconductor laser arrays have been investigated experimentally and theoretically from the viewpoint of temporal and spatial coherence for the past forty years. In this work, we are focusing on a rather novel complex collective behavior, namely chimera states, where synchronized clusters of emitters coexist with unsynchronized ones. For the first time, we find such states exist in large diode arrays based on quantum well gain media with nearest-neighbor interactions. The crucial parameters are the evanescent coupling strength and the relative optical frequency detuning between the emitters of the array. By employing a recently proposed figure of merit for classifying chimera states, we provide quantitative and qualitative evidence for the observed dynamics. The corresponding chimeras are identified as turbulent according to the irregular temporal behavior of the classification measure.

  4. Motion data classification on the basis of dynamic time warping with a cloud point distance measure

    NASA Astrophysics Data System (ADS)

    Switonski, Adam; Josinski, Henryk; Zghidi, Hafedh; Wojciechowski, Konrad

    2016-06-01

    The paper deals with the problem of classifying model-free motion data. A nearest neighbors classifier is proposed that compares motions using the Dynamic Time Warping transform with a cloud point distance measure. The classification utilizes both specific gait features, reflected by the movements of subsequent skeleton joints, and anthropometric data. To validate the proposed approach, the human gait identification challenge problem is considered. A motion capture database containing data of 30 different humans, collected in the Human Motion Laboratory of the Polish-Japanese Academy of Information Technology, is used. The achieved results are satisfactory: the obtained accuracy of human recognition exceeds 90%. Moreover, the applied cloud point distance measure does not depend on the calibration process of the motion capture system, which results in reliable validation.
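
    The idea of a nearest neighbor classifier built on Dynamic Time Warping can be sketched as below. This is a minimal illustration, not the authors' implementation: the abstract does not define the cloud point distance measure, so a plain Euclidean distance between frames stands in for it, and the names dtw_distance and nn_classify are hypothetical.

```python
import math


def dtw_distance(a, b, point_dist):
    """Dynamic time warping cost between two sequences of frames.

    Uses the standard O(len(a) * len(b)) dynamic program with
    match / insertion / deletion steps.
    """
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = point_dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def euclidean(p, q):
    # Placeholder frame distance (assumption); the paper uses a
    # cloud point distance measure that the abstract does not specify.
    return math.dist(p, q)


def nn_classify(query, train_seqs, train_labels, point_dist=euclidean):
    """Return the label of the training sequence with the smallest DTW cost."""
    best = min(range(len(train_seqs)),
               key=lambda i: dtw_distance(query, train_seqs[i], point_dist))
    return train_labels[best]


# Toy data: each frame is a 1-D tuple; labels are made-up subject IDs.
train = [[(0.0,), (1.0,), (2.0,)], [(5.0,), (5.0,), (5.0,)]]
labels = ["subject_a", "subject_b"]
query = [(0.0,), (0.0,), (1.0,), (2.0,)]  # time-stretched copy of train[0]
print(nn_classify(query, train, labels))
```

    Because DTW absorbs differences in timing, the stretched query still matches the first training sequence at zero cost; swapping in another frame distance only requires passing a different point_dist.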

  5. Anomalous quantum heat transport in a one-dimensional harmonic chain with random couplings.

    PubMed

    Yan, Yonghong; Zhao, Hui

    2012-07-11

    We investigate quantum heat transport in a one-dimensional harmonic system with random couplings. In the presence of randomness, phonon modes may normally be classified as ballistic, diffusive or localized. We show that these modes can roughly be characterized by the local nearest-neighbor level spacing distribution, similarly to their electronic counterparts. We also show that the thermal conductance G_th through the system decays rapidly with the system size (G_th ∼ L^(-α)). The exponent α strongly depends on the system size and can change from α < 1 to α > 1 with increasing system size, indicating that the system undergoes a transition from a heat conductor to a heat insulator. This result could be useful in thermal control of low-dimensional systems.

  6. Multispectral and Panchromatic used Enhancement Resolution and Study Effective Enhancement on Supervised and Unsupervised Classification Land – Cover

    NASA Astrophysics Data System (ADS)

    Salman, S. S.; Abbas, W. A.

    2018-05-01

    The goal of the study is to support the analysis of resolution enhancement and to study its effect on classification methods that use spectral band information, through specific and quantitative approaches. We introduce a method that enhances the resolution of Landsat 8 imagery by combining the 30-meter spectral bands with the 15-meter panchromatic band 8, because multispectral imagery is important for extracting land cover. The classification methods are used to classify several land covers recorded in OLI-8 imagery. Data mining methods can be classified as either supervised or unsupervised. In supervised methods there is a particular predefined target, meaning that the algorithm learns which values of the target are associated with which values of the predictor sample. K-nearest neighbors and maximum likelihood algorithms are examined in this work as supervised methods. In unsupervised methods, on the other hand, no sample is identified as the target; the algorithm searches for structure and patterns among all the variables. Fuzzy C-means clustering represents the unsupervised methods here. The NDVI vegetation index is used to compare the results of the classification methods; the percentage of dense vegetation obtained with the maximum likelihood method gives the best results.

  7. Comparison of GOES Cloud Classification Algorithms Employing Explicit and Implicit Physics

    NASA Technical Reports Server (NTRS)

    Bankert, Richard L.; Mitrescu, Cristian; Miller, Steven D.; Wade, Robert H.

    2009-01-01

    Cloud-type classification based on multispectral satellite imagery data has been widely researched and demonstrated to be useful for distinguishing a variety of classes using a wide range of methods. The research described here is a comparison of the classifier output from two very different algorithms applied to Geostationary Operational Environmental Satellite (GOES) data over the course of one year. The first algorithm employs spectral channel thresholding and additional physically based tests. The second algorithm was developed through a supervised learning method with characteristic features of expertly labeled image samples used as training data for a 1-nearest-neighbor classification. The latter's ability to identify classes is also based in physics, but those relationships are embedded implicitly within the algorithm. A pixel-to-pixel comparison analysis was done for hourly daytime scenes within a region in the northeastern Pacific Ocean. Considerable agreement was found in this analysis, with many of the mismatches or disagreements providing insight to the strengths and limitations of each classifier. Depending upon user needs, a rule-based or other postprocessing system that combines the output from the two algorithms could provide the most reliable cloud-type classification.
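
    A pixel-to-pixel comparison of two classifier outputs, as described above, amounts to counting how often the two label maps agree and which label pairs account for the mismatches. The sketch below is only illustrative: it assumes both outputs have been flattened to equal-length lists of class labels, and the function name and cloud-type labels are hypothetical.

```python
from collections import Counter


def pixel_agreement(labels_a, labels_b):
    """Compare two label maps pixel by pixel.

    Returns the fraction of pixels with identical labels and a Counter
    of (label_a, label_b) pairs, which acts as a cross-tabulation.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label maps must cover the same pixels")
    pairs = Counter(zip(labels_a, labels_b))
    agree = sum(count for (a, b), count in pairs.items() if a == b)
    return agree / len(labels_a), pairs


# Toy example with made-up class names for five pixels.
explicit = ["Ci", "Cu", "St", "St", "clear"]
implicit = ["Ci", "St", "St", "St", "clear"]
rate, table = pixel_agreement(explicit, implicit)
print(rate)                 # fraction of matching pixels
print(table[("Cu", "St")])  # pixels called Cu by one and St by the other
```

    The off-diagonal entries of the table are where the two algorithms disagree, which is the part of the comparison that reveals each classifier's strengths and limitations.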

  8. Super-resolution method for face recognition using nonlinear mappings on coherent features.

    PubMed

    Huang, Hua; He, Huiting

    2011-01-01

    Low resolution (LR) of face images significantly decreases the performance of face recognition. To address this problem, we present a super-resolution method that uses nonlinear mappings to infer coherent features that favor higher recognition rates for nearest neighbor (NN) classifiers when recognizing a single LR face image. Canonical correlation analysis is applied to establish the coherent subspaces between the principal component analysis (PCA) based features of high-resolution (HR) and LR face images. Then, a nonlinear mapping between HR/LR features can be built by radial basis functions (RBFs) with lower regression errors in the coherent feature space than in the PCA feature space. Thus, we can compute super-resolved coherent features corresponding to an input LR image according to the trained RBF model efficiently and accurately. Face identity can then be obtained by feeding these super-resolved features to a simple NN classifier. Extensive experiments on the Facial Recognition Technology, University of Manchester Institute of Science and Technology, and Olivetti Research Laboratory databases show that the proposed method outperforms the state-of-the-art face recognition algorithms for a single LR image in terms of both recognition rate and robustness to facial variations of pose and expression.

  9. Deep feature extraction and combination for synthetic aperture radar target classification

    NASA Astrophysics Data System (ADS)

    Amrani, Moussa; Jiang, Feng

    2017-10-01

    Feature extraction has always been a difficult problem in the classification performance of synthetic aperture radar automatic target recognition (SAR-ATR). It is very important to select discriminative features to train a classifier, which is a prerequisite. Inspired by the great success of convolutional neural network (CNN), we address the problem of SAR target classification by proposing a feature extraction method, which takes advantage of exploiting the extracted deep features from CNNs on SAR images to introduce more powerful discriminative features and robust representation ability for them. First, the pretrained VGG-S net is fine-tuned on moving and stationary target acquisition and recognition (MSTAR) public release database. Second, after a simple preprocessing is performed, the fine-tuned network is used as a fixed feature extractor to extract deep features from the processed SAR images. Third, the extracted deep features are fused by using a traditional concatenation and a discriminant correlation analysis algorithm. Finally, for target classification, K-nearest neighbors algorithm based on LogDet divergence-based metric learning triplet constraints is adopted as a baseline classifier. Experiments on MSTAR are conducted, and the classification accuracy results demonstrate that the proposed method outperforms the state-of-the-art methods.

  10. Automatic epileptic seizure detection using scalp EEG and advanced artificial intelligence techniques.

    PubMed

    Fergus, Paul; Hignett, David; Hussain, Abir; Al-Jumeily, Dhiya; Abdel-Aziz, Khaled

    2015-01-01

    The epilepsies are a heterogeneous group of neurological disorders and syndromes characterised by recurrent, involuntary, paroxysmal seizure activity, which is often associated with a clinicoelectrical correlate on the electroencephalogram. The diagnosis of epilepsy is usually made by a neurologist but can be difficult to be made in the early stages. Supporting paraclinical evidence obtained from magnetic resonance imaging and electroencephalography may enable clinicians to make a diagnosis of epilepsy and investigate treatment earlier. However, electroencephalogram capture and interpretation are time consuming and can be expensive due to the need for trained specialists to perform the interpretation. Automated detection of correlates of seizure activity may be a solution. In this paper, we present a supervised machine learning approach that classifies seizure and nonseizure records using an open dataset containing 342 records. Our results show an improvement on existing studies by as much as 10% in most cases with a sensitivity of 93%, specificity of 94%, and area under the curve of 98% with a 6% global error using a k-class nearest neighbour classifier. We propose that such an approach could have clinical applications in the investigation of patients with suspected seizure disorders.

  11. A Step Towards EEG-based Brain Computer Interface for Autism Intervention

    PubMed Central

    Fan, Jing; Wade, Joshua W.; Bian, Dayi; Key, Alexandra P.; Warren, Zachary E.; Mion, Lorraine C.; Sarkar, Nilanjan

    2017-01-01

    Autism Spectrum Disorder (ASD) is a prevalent and costly neurodevelopmental disorder. Individuals with ASD often have deficits in social communication skills as well as adaptive behavior skills related to daily activities. We have recently designed a novel virtual reality (VR) based driving simulator for driving skill training for individuals with ASD. In this paper, we explored the feasibility of detecting engagement level, emotional states, and mental workload during VR-based driving using EEG as a first step towards a potential EEG-based Brain Computer Interface (BCI) for assisting autism intervention. We used spectral features of EEG signals from a 14-channel EEG neuroheadset, together with therapist ratings of behavioral engagement, enjoyment, frustration, boredom, and difficulty to train a group of classification models. Seven classification methods were applied and compared including Bayes network, naïve Bayes, Support Vector Machine (SVM), multilayer perceptron, K-nearest neighbors (KNN), random forest, and J48. The classification results were promising, with over 80% accuracy in classifying engagement and mental workload, and over 75% accuracy in classifying emotional states. Such results may lead to an adaptive closed-loop VR-based skill training system for use in autism intervention. PMID:26737113

  12. Classification of Malaysia aromatic rice using multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-05-01

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance a specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks, such as lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples' odour. The samples were taken from the varieties of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters accordingly. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
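
    Leave-One-Out validation of a KNN classifier, as mentioned above, holds out each sample in turn, classifies it from the remaining samples, and reports the fraction misclassified. The following is a minimal sketch under stated assumptions: Euclidean distance, majority voting, toy two-dimensional data in place of e-nose sensor readings, and hypothetical function and variety names.

```python
import math
from collections import Counter


def knn_predict(x, train_x, train_y, k):
    """Majority vote among the k training points closest to x."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(x, train_x[i]))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_error(xs, ys, k=3):
    """Leave-one-out misclassification rate of a k-NN classifier."""
    errors = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]   # every sample except the i-th
        rest_y = ys[:i] + ys[i + 1:]
        if knn_predict(xs[i], rest_x, rest_y, k) != ys[i]:
            errors += 1
    return errors / len(xs)


# Toy feature vectors for two made-up rice varieties.
xs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
      (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
ys = ["variety_A"] * 3 + ["variety_B"] * 3
print(loo_error(xs, ys, k=3))
```

    With well-separated clusters the held-out sample is always outvoted by its own class, so the leave-one-out error is zero; overlapping clusters would raise it.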

  13. Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy

    PubMed Central

    Zhang, Lina; Zhang, Chengjin; Gao, Rui; Yang, Runtao; Song, Qing

    2016-01-01

    Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by an average of the prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. A Relief combined with IFS (Incremental Feature Selection) method is adopted to obtain optimal features from hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous studies. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we develop a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc. PMID:27662651
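
    The averaging ensemble and the reported metrics can be illustrated with a short sketch. It assumes, hypothetically, that each base classifier outputs a positive-class probability per sample and that the ensemble thresholds the mean at 0.5; the abstract does not state which quantities are averaged. The metric formulas are the standard definitions of sensitivity, specificity, accuracy and MCC, and the numbers are toy values, not results from the paper.

```python
import math


def average_ensemble(prob_lists, threshold=0.5):
    """Average per-sample probabilities over base classifiers, then threshold."""
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    avg = [sum(p[i] for p in prob_lists) / n_models for i in range(n_samples)]
    return [1 if a >= threshold else 0 for a in avg]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "accuracy": (tp + tn) / len(y_true),
        # MCC is conventionally set to 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Three hypothetical base classifiers scoring four samples.
probs = [[0.9, 0.2, 0.6, 0.4],
         [0.8, 0.4, 0.3, 0.1],
         [0.7, 0.3, 0.7, 0.4]]
y_true = [1, 0, 1, 1]
y_pred = average_ensemble(probs)
metrics = binary_metrics(y_true, y_pred)
print(y_pred, metrics)
```

    MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the classes are unbalanced.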

  14. Using an intelligent system to aid in tephra layer correlation of the tephra beds of the Mono-Inyo Craters, California

    NASA Astrophysics Data System (ADS)

    Hanson-Hedgecock, S.; Bursik, M.; Rogova, G.

    2008-12-01

    We are developing an intelligent system to correlate tephra layers by using the lithologic and geochemical characteristics of field samples, to aid geologists in interpreting eruption patterns in volcanic fields. Understanding the eruption history of a volcanic field from stratigraphic studies is important for forecasting future eruptive behavior and hazards. The intelligent system is used to define groups of tephra source vents and to correlate tephra layers based on a combination of geochemical data and lithostratigraphic characteristics. The tephra beds of the Mono-Inyo Craters, California, are used to test the ability of the intelligent system for tephra layer correlation. The data processing is performed by a suite of both unsupervised and supervised classifiers, built and combined within the framework of the Dempster-Shafer theory of evidence. We have developed algorithms to calculate isopleth maps of thickness, lithic and pumice size that are used in the processing of the lithostratigraphic data. This spatial information is important in the determination of eruption patterns and is used by an evidential nearest neighbor classifier to correlate tephra layers. Integrating a better isopleth approximation function and expert knowledge about stratigraphic order of the tephra layers into the classifier improves the lithostratigraphic correlation from 56% to 87% of layers correctly identified. Geochemical data for defining groups of tephra sources are processed by a suite of fuzzy k-means classifiers. Improved clustering results of geochemical data are achieved by the fusion of individual clustering results with an evidential combination method. The intelligent system aids correlation by showing matches and disparities between data patterns from different outcrops that may have been overlooked. The intelligent system produces a useful recognition result, while dealing with the uncertainty from sparse data and the imprecise description of layer characteristics.

  15. Discrimination among spawning aggregations of lake herring from Lake Superior using whole-body morphometric characters

    USGS Publications Warehouse

    Hoff, Michael H.

    2004-01-01

    The lake herring (Coregonus artedi) was one of the most commercially and ecologically valuable Lake Superior fishes, but declined in the second half of the 20th century as the result of overharvest of putatively discrete stocks. No tools were previously available that described lake herring stock structure and accurately classified lake herring to their spawning stocks. The accuracy of discriminating among spawning aggregations was evaluated using whole-body morphometrics based on a truss network. Lake herring were collected from 11 spawning aggregations in Lake Superior and two inland Wisconsin lakes to evaluate morphometrics as a stock discrimination tool. Discriminant function analysis correctly classified 53% of all fish from all spawning aggregations, and fish from all but one aggregation were classified at greater rates than were possible by chance. Discriminant analysis also correctly classified 66% of fish to nearest neighbor groups, which were groups that accounted for the possibility of mixing among the aggregations. Stepwise discriminant analysis showed that posterior body length and depth measurements were among the best discriminators of spawning aggregations. These findings support other evidence that discrete stocks of lake herring exist in Lake Superior, and fishery managers should consider all but one of the spawning aggregations as discrete stocks. Abundance, annual harvest, total annual mortality rate, and exploitation data should be collected from each stock, and surplus production of each stock should be estimated. Prudent management of stock surplus production and exploitation rates will aid in restoration of stocks and will prevent a repeat of the stock collapses that occurred in the middle of the 20th century, when the species was nearly extirpated from the lake.

  16. Automatic classification of pulmonary function in COPD patients using trachea analysis in chest CT scans

    NASA Astrophysics Data System (ADS)

    van Rikxoort, E. M.; de Jong, P. A.; Mets, O. M.; van Ginneken, B.

    2012-03-01

    Chronic Obstructive Pulmonary Disease (COPD) is a chronic lung disease that is characterized by airflow limitation. COPD is clinically diagnosed and monitored using pulmonary function testing (PFT), which measures global inspiration and expiration capabilities of patients and is time-consuming and labor-intensive. It is becoming standard practice to obtain paired inspiration-expiration CT scans of COPD patients. Predicting the PFT results from the CT scans would alleviate the need for PFT testing. It is hypothesized that the change of the trachea during breathing might be an indicator of tracheomalacia in COPD patients and correlate with COPD severity. In this paper, we propose to automatically measure morphological changes in the trachea from paired inspiration and expiration CT scans and investigate the influence on COPD GOLD stage classification. The trachea is automatically segmented and the trachea shape is encoded using the lengths of rays cast from the center of gravity of the trachea. These features are used in a classifier, combined with emphysema scoring, to attempt to classify subjects into their COPD stage. A database of 187 subjects, well distributed over the COPD GOLD stages 0 through 4, was used for this study. The data were randomly divided into a training set and a test set. Using the training scans, a nearest mean classifier was trained to classify the subjects into their correct GOLD stage using either emphysema score, tracheal shape features, or a combination. Combining the proposed trachea shape features with emphysema score, the classification performance into GOLD stages improved by 11% to 51%. In addition, an 80% accuracy was achieved in distinguishing healthy subjects from COPD patients.
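
    A nearest mean classifier, the simplest nearest prototype classifier, stores one mean feature vector per class and assigns a new sample to the class whose mean is closest. The sketch below is a generic illustration with Euclidean distance and toy two-dimensional features; it is not the feature set or code used in the study, and the function names are hypothetical.

```python
import math


def fit_class_means(xs, ys):
    """Compute one mean feature vector (prototype) per class label."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_nearest_mean(x, means):
    """Return the label whose class mean is closest to x."""
    return min(means, key=lambda y: math.dist(x, means[y]))


# Toy training data: two features per subject, labels are made-up stages.
train_x = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
train_y = [0, 0, 3, 3]
means = fit_class_means(train_x, train_y)
print(means)
print(predict_nearest_mean((1.5, 1.0), means))
print(predict_nearest_mean((9.0, 9.0), means))
```

    Only the class means are kept after training, so prediction cost grows with the number of classes rather than with the number of training scans.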

  17. Classification of interstitial lung disease patterns with topological texture features

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel

    2010-03-01

    Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative of the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images was acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue was identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.

  18. Predicting Implantation Outcome of In Vitro Fertilization and Intracytoplasmic Sperm Injection Using Data Mining Techniques.

    PubMed

    Hafiz, Pegah; Nematollahi, Mohtaram; Boostani, Reza; Namavar Jahromi, Bahia

    2017-10-01

    In vitro fertilization (IVF) and intracytoplasmic sperm injection (ICSI) are two important subsets of the assisted reproductive techniques, used for the treatment of infertility. Predicting implantation outcome of IVF/ICSI or the chance of pregnancy is essential for infertile couples, since these treatments are complex and expensive with a low probability of conception. In this cross-sectional study, the data of 486 patients were collected using census method. The IVF/ICSI dataset contains 29 variables along with an identifier for each patient that is either negative or positive. Mean accuracy and mean area under the receiver operating characteristic (ROC) curve are calculated for the classifiers. Sensitivity, specificity, positive and negative predictive values, and likelihood ratios of classifiers are employed as indicators of performance. The state-of-art classifiers which are candidates for this study include support vector machines, recursive partitioning (RPART), random forest (RF), adaptive boosting, and one-nearest neighbor. RF and RPART outperform the other comparable methods. The results revealed the areas under the ROC curve (AUC) as 84.23 and 82.05%, respectively. The importance of IVF/ICSI features was extracted from the output of RPART. Our findings demonstrate that the probability of pregnancy is low for women aged above 38. Classifiers RF and RPART are better at predicting IVF/ICSI cases compared to other decision makers that were tested in our study. Elicited decision rules of RPART determine useful predictive features of IVF/ICSI. Out of 20 factors, the age of woman, number of developed embryos, and serum estradiol level on the day of human chorionic gonadotropin administration are the three best features for such prediction. Copyright© by Royan Institute. All rights reserved.
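
    The performance indicators listed in this abstract all derive from the four cells of a binary confusion matrix. The following Python sketch shows the standard definitions; the function name and the counts are hypothetical and are not taken from the IVF/ICSI dataset.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute common binary-classification indicators from confusion counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value
    npv = tn / (tn + fn)          # negative predictive value
    # Likelihood ratios; guard against division by zero at perfect rates.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
    }


# Hypothetical counts, for illustration only.
print(diagnostic_metrics(tp=40, fp=10, tn=90, fn=10))
```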

  19. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md

    Aromatic rice (Oryza sativa L.) is considered the best-quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified by using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks, such as lengthy training time, proneness to fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the samples' odour. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice proves that the instrument was able to separate the samples into different clusters accordingly. The results of LDA and KNN, with low misclassification error, support the above findings, and we may conclude that the e-nose is successfully applied to the classification of the aromatic rice varieties.
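
    The leave-one-out evaluation of a KNN classifier described in this abstract can be sketched in plain Python as follows. The two-feature points, the labels 'A' and 'B', and the function names are hypothetical stand-ins; real e-nose data would have one feature per sensor response, and the authors' implementation is not given in the abstract.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    neighbours = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_error(xs, ys, k=3):
    """Hold out each sample in turn, train on the rest, and return the error rate."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) != ys[i]:
            errors += 1
    return errors / len(xs)


# Hypothetical, well-separated two-class data (illustrative only).
xs = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
ys = ["A"] * 4 + ["B"] * 4
print(leave_one_out_error(xs, ys, k=3))
```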

  20. Introducing two Random Forest based methods for cloud detection in remote sensing images

    NASA Astrophysics Data System (ADS)

    Ghasemian, Nafiseh; Akhoondzadeh, Mehdi

    2018-07-01

    Cloud detection is a necessary phase in satellite image processing to retrieve atmospheric and lithospheric parameters. Currently, some cloud detection methods based on the Random Forest (RF) model have been proposed, but they do not consider both spectral and textural characteristics of the image. Furthermore, they have not been tested in the presence of snow/ice. In this paper, we introduce two RF-based algorithms, Feature Level Fusion Random Forest (FLFRF) and Decision Level Fusion Random Forest (DLFRF), to incorporate visible, infrared (IR) and thermal spectral and textural features (FLFRF), including Gray Level Co-occurrence Matrix (GLCM) and Robust Extended Local Binary Pattern (RELBP_CI), or visible, IR and thermal classifiers (DLFRF) for highly accurate cloud detection on remote sensing images. FLFRF first fuses visible, IR and thermal features. Thereafter, it uses the RF model to classify pixels into cloud, snow/ice and background or thick cloud, thin cloud and background. DLFRF considers visible, IR and thermal features (both spectral and textural) separately and feeds each set of features to the RF model. Then, it holds the vote matrix of each run of the model. Finally, it fuses the classifiers using the majority vote method. To demonstrate the effectiveness of the proposed algorithms, 10 Terra MODIS and 15 Landsat 8 OLI/TIRS images with different spatial resolutions are used in this paper. Quantitative analyses are based on manually selected ground truth data. Results show that after adding RELBP_CI to the input feature set, cloud detection accuracy improves. Also, the average cloud kappa values of FLFRF and DLFRF on MODIS images (1 and 0.99) are higher than those of other machine learning methods, Linear Discriminant Analysis (LDA), Classification And Regression Tree (CART), K Nearest Neighbor (KNN) and Support Vector Machine (SVM) (0.96). The average snow/ice kappa values of FLFRF and DLFRF on MODIS images (1 and 0.85) are higher than those of other traditional methods. The quantitative values on Landsat 8 images show a similar trend. Consequently, while SVM and K-nearest neighbor show overestimation in predicting cloud and snow/ice pixels, our Random Forest (RF) based models can achieve higher cloud and snow/ice kappa values on MODIS and higher thin cloud, thick cloud and snow/ice kappa values on Landsat 8 images. Our algorithms predict both thin and thick cloud on Landsat 8 images, while the existing cloud detection algorithm, Fmask, cannot discriminate them. Compared to the state-of-the-art methods, our algorithms have acquired higher average cloud and snow/ice kappa values for different spatial resolutions.
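
    The decision-level fusion step described in this abstract (combining per-band classifiers by majority vote) can be illustrated with a short Python sketch. The per-pixel label lists and the function name are hypothetical; this shows only the generic voting idea, not the DLFRF implementation, whose vote-matrix details are not given in the abstract.

```python
from collections import Counter


def majority_vote(per_classifier_labels):
    """Fuse per-pixel labels from several classifiers by majority vote.

    `per_classifier_labels` is a list of equally long label sequences,
    one sequence per classifier. Ties go to the label seen first.
    """
    fused = []
    for votes in zip(*per_classifier_labels):
        fused.append(Counter(votes).most_common(1)[0][0])
    return fused


# Hypothetical outputs of three classifiers for four pixels (illustrative only).
visible = ["cloud", "cloud", "snow", "background"]
infrared = ["cloud", "snow", "snow", "background"]
thermal = ["background", "cloud", "snow", "cloud"]
print(majority_vote([visible, infrared, thermal]))
```

    With an odd number of classifiers and two classes a tie cannot occur; with three or more classes a tie is possible, and the tie-breaking rule above is an arbitrary choice of this sketch.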

  1. Hyperswitch communication network

    NASA Technical Reports Server (NTRS)

    Peterson, J.; Pniel, M.; Upchurch, E.

    1991-01-01

    The Hyperswitch Communication Network (HCN) is a large scale parallel computer prototype being developed at JPL. Commercial versions of the HCN computer are planned. The HCN computer being designed is a message passing multiple instruction multiple data (MIMD) computer, and offers many advantages in price-performance ratio, reliability and availability, and manufacturing over traditional uniprocessors and bus based multiprocessors. The design of the HCN operating system is a uniquely flexible environment that combines both parallel processing and distributed processing. This programming paradigm can achieve a balance among the following competing factors: performance in processing and communications, user friendliness, and fault tolerance. The prototype is being designed to accommodate a maximum of 64 state of the art microprocessors. The HCN is classified as a distributed supercomputer. The HCN system is described, and the performance/cost analysis and other competing factors within the system design are reviewed.

  2. A Compact and Low Cost Electronic Nose for Aroma Detection

    PubMed Central

    Macías, Miguel Macías; Agudo, J. Enrique; Manso, Antonio García; Orellana, Carlos Javier García; Velasco, Horacio Manuel González; Caballero, Ramón Gallardo

    2013-01-01

    This article explains the development of a prototype of a portable and a very low-cost electronic nose based on an mbed microcontroller. Mbeds are a series of ARM microcontroller development boards designed for fast, flexible and rapid prototyping. The electronic nose is comprised of an mbed, an LCD display, two small pumps, two electro-valves and a sensor chamber with four TGS Figaro gas sensors. The performance of the electronic nose has been tested by measuring the ethanol content of wine synthetic matrices and special attention has been paid to the reproducibility and repeatability of the measurements taken on different days. Results show that the electronic nose with a neural network classifier is able to discriminate wine samples with 10, 12 and 14% V/V alcohol content with a classification error of less than 1%. PMID:23698265

  3. Novel computer-aided diagnosis of mesothelioma using nuclear structure of mesothelial cells in effusion cytology specimens

    NASA Astrophysics Data System (ADS)

    Tosun, Akif Burak; Yergiyev, Oleksandr; Kolouri, Soheil; Silverman, Jan F.; Rohde, Gustavo K.

    2014-03-01

    diagnostic standard is a pleural biopsy with subsequent histologic examination of the tissue demonstrating invasion by the tumor. The diagnostic tissue is obtained through thoracoscopy or open thoracotomy, both being highly invasive procedures. Thoracocentesis, or removal of effusion fluid from the pleural space, is a far less invasive procedure that can provide material for cytological examination. However, it is insufficient to definitively confirm or exclude the diagnosis of malignant mesothelioma, since tissue invasion cannot be determined. In this study, we present a computerized method to detect and classify malignant mesothelioma based on the nuclear chromatin distribution from digital images of mesothelial cells in effusion cytology specimens. Our method aims at determining whether a set of nuclei belonging to a patient, obtained from effusion fluid images using image segmentation, is benign or malignant, and has the potential to eliminate the need for tissue biopsy. This method is performed by quantifying chromatin morphology of cells using the optimal transportation (Kantorovich-Wasserstein) metric in combination with the modified Fisher discriminant analysis, a k-nearest neighborhood classification, and a simple voting strategy. Our results show that we can classify the data of 10 different human cases with 100% accuracy after blind cross validation. We conclude that nuclear structure alone contains enough information to classify malignant mesothelioma. We also conclude that the distribution of chromatin seems to be a discriminating feature between nuclei of benign and malignant mesothelioma cells.

  4. Physical Human Activity Recognition Using Wearable Sensors.

    PubMed

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-12-11

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.

  5. Local binary pattern texture-based classification of solid masses in ultrasound breast images

    NASA Astrophysics Data System (ADS)

    Matsumoto, Monica M. S.; Sehgal, Chandra M.; Udupa, Jayaram K.

    2012-03-01

    Breast cancer is one of the leading causes of cancer mortality among women. Ultrasound examination can be used to assess breast masses, complementarily to mammography. Ultrasound images reveal tissue information in its echoic patterns. Therefore, pattern recognition techniques can facilitate classification of lesions and thereby reduce the number of unnecessary biopsies. Our hypothesis was that image texture features on the boundary of a lesion and its vicinity can be used to classify masses. We have used intensity-independent and rotation-invariant texture features, known as Local Binary Patterns (LBP). The classifier selected was K-nearest neighbors. Our breast ultrasound image database consisted of 100 patient images (50 benign and 50 malignant cases). The determination of whether the mass was benign or malignant was done through biopsy and pathology assessment. The training set consisted of sixty images, randomly chosen from the database of 100 patients. The testing set consisted of forty images to be classified. The results with a multi-fold cross validation of 100 iterations produced a robust evaluation. The highest performance was observed for feature LBP with 24 symmetrically distributed neighbors over a circle of radius 3 (LBP24,3) with an accuracy rate of 81.0%. We also investigated an approach with a score of malignancy assigned to the images in the test set. This approach provided an ROC curve with Az of 0.803. The analysis of texture features over the boundary of solid masses showed promise for malignancy classification in ultrasound breast images.

  6. Multi-factorial analysis of class prediction error: estimating optimal number of biomarkers for various classification rules.

    PubMed

    Khondoker, Mizanur R; Bachmann, Till T; Mewissen, Muriel; Dickinson, Paul; Dobrzelecki, Bartosz; Campbell, Colin J; Mount, Andrew R; Walton, Anthony J; Crain, Jason; Schulze, Holger; Giraud, Gerard; Ross, Alan J; Ciani, Ilenia; Ember, Stuart W J; Tlili, Chaker; Terry, Jonathan G; Grant, Eilidh; McDonnell, Nicola; Ghazal, Peter

    2010-12-01

    Machine learning and statistical model based classifiers have increasingly been used with more complex and high dimensional biological data obtained from high-throughput technologies. Understanding the impact of various factors associated with large and complex microarray datasets on the predictive performance of classifiers is computationally intensive, under investigated, yet vital in determining the optimal number of biomarkers for various classification purposes aimed towards improved detection, diagnosis, and therapeutic monitoring of diseases. We investigate the impact of microarray based data characteristics on the predictive performance for various classification rules using simulation studies. Our investigation using Random Forest, Support Vector Machines, Linear Discriminant Analysis and k-Nearest Neighbour shows that the predictive performance of classifiers is strongly influenced by training set size, biological and technical variability, replication, fold change and correlation between biomarkers. Optimal number of biomarkers for a classification problem should therefore be estimated taking account of the impact of all these factors. A database of average generalization errors is built for various combinations of these factors. The database of generalization errors can be used for estimating the optimal number of biomarkers for given levels of predictive accuracy as a function of these factors. Examples show that curves from actual biological data resemble that of simulated data with corresponding levels of data characteristics. An R package optBiomarker implementing the method is freely available for academic use from the Comprehensive R Archive Network (http://www.cran.r-project.org/web/packages/optBiomarker/).

  7. Classification of postural profiles among mouth-breathing children by learning vector quantization.

    PubMed

    Mancini, F; Sousa, F S; Hummel, A D; Falcão, A E J; Yi, L C; Ortolani, C F; Sigulem, D; Pisa, I T

    2011-01-01

    Mouth breathing is a chronic syndrome that may bring about postural changes. Finding characteristic patterns of changes occurring in the complex musculoskeletal system of mouth-breathing children has been a challenge. Learning vector quantization (LVQ) is an artificial neural network model that can be applied for this purpose. The aim of the present study was to apply LVQ to determine the characteristic postural profiles shown by mouth-breathing children, in order to further understand abnormal posture among mouth breathers. Postural training data on 52 children (30 mouth breathers and 22 nose breathers) and postural validation data on 32 children (22 mouth breathers and 10 nose breathers) were used. The performance of LVQ and other classification models was compared in relation to self-organizing maps, back-propagation applied to multilayer perceptrons, Bayesian networks, naive Bayes, J48 decision trees, k, and k-nearest-neighbor classifiers. Classifier accuracy was assessed by means of leave-one-out cross-validation, area under ROC curve (AUC), and inter-rater agreement (Kappa statistics). By using the LVQ model, five postural profiles for mouth-breathing children could be determined. LVQ showed satisfactory results for mouth-breathing and nose-breathing classification: sensitivity and specificity rates of 0.90 and 0.95, respectively, when using the training dataset, and 0.95 and 0.90, respectively, when using the validation dataset. The five postural profiles for mouth-breathing children suggested by LVQ were incorporated into application software for classifying the severity of mouth breathers' abnormal posture.
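
    As background to the learning vector quantization (LVQ) model named in this abstract, the following Python sketch shows one update of the basic LVQ1 rule: the prototype nearest to a training sample moves toward the sample when their labels agree and away from it otherwise. The prototypes, labels, learning rate and function name are hypothetical, and the abstract does not state which LVQ variant the authors used.

```python
import math


def lvq1_step(prototypes, labels, x, y, lr=0.1):
    """Apply one LVQ1 update in place and return the index of the winning prototype."""
    # Find the prototype closest to the sample (Euclidean distance).
    winner = min(range(len(prototypes)), key=lambda i: math.dist(prototypes[i], x))
    # Attract the winner if its label matches the sample's label, repel it otherwise.
    sign = 1.0 if labels[winner] == y else -1.0
    prototypes[winner] = [
        p + sign * lr * (xi - p) for p, xi in zip(prototypes[winner], x)
    ]
    return winner


# Hypothetical two-feature prototypes, one per class (illustrative only).
prototypes = [[0.0, 0.0], [4.0, 4.0]]
labels = ["mouth", "nose"]
lvq1_step(prototypes, labels, x=[1.0, 1.0], y="mouth", lr=0.5)
print(prototypes)
```

    After training, each prototype can be read as a representative profile of its class, which is the sense in which prototype-based models of this kind yield interpretable profiles.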

  8. Physical Human Activity Recognition Using Wearable Sensors

    PubMed Central

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-01-01

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450

  9. Combination of mass spectrometry-based targeted lipidomics and supervised machine learning algorithms in detecting adulterated admixtures of white rice.

    PubMed

    Lim, Dong Kyu; Long, Nguyen Phuoc; Mo, Changyeun; Dong, Ziyuan; Cui, Lingmei; Kim, Giyoung; Kwon, Sung Won

    2017-10-01

    The mixing of extraneous ingredients with original products is a common adulteration practice in food and herbal medicines. In particular, authenticity of white rice and its corresponding blended products has become a key issue in food industry. Accordingly, our current study aimed to develop and evaluate a novel discrimination method by combining targeted lipidomics with powerful supervised learning methods, and eventually introduce a platform to verify the authenticity of white rice. A total of 30 cultivars were collected, and 330 representative samples of white rice from Korea and China as well as seven mixing ratios were examined. Random forests (RF), support vector machines (SVM) with a radial basis function kernel, C5.0, model averaged neural network, and k-nearest neighbor classifiers were used for the classification. We achieved desired results, and the classifiers effectively differentiated white rice from Korea to blended samples with high prediction accuracy for the contamination ratio as low as five percent. In addition, RF and SVM classifiers were generally superior to and more robust than the other techniques. Our approach demonstrated that the relative differences in lysoGPLs can be successfully utilized to detect the adulterated mixing of white rice originating from different countries. In conclusion, the present study introduces a novel and high-throughput platform that can be applied to authenticate adulterated admixtures from original white rice samples. Copyright © 2017 Elsevier Ltd. All rights reserved.

  10. Quantum Correlation in the XY Spin Model with Anisotropic Three-Site Interaction

    NASA Astrophysics Data System (ADS)

    Wang, Yao; Chai, Bing-Bing; Guo, Jin-Liang

    2018-05-01

    We investigate pairwise entanglement and quantum discord (QD) in the XY spin model with anisotropic three-site interaction at zero and finite temperatures. For both the nearest-neighbor spins and the next nearest-neighbor spins, special attention is paid to the dependence of entanglement and QD on the anisotropic parameter δ induced by the next nearest-neighbor spins. We show that the behavior of QD differs in many ways from entanglement under the influences of the anisotropic three-site interaction at finite temperatures. More important, comparing the effects of δ on the entanglement and QD, we find the anisotropic three-site interaction plays an important role in the quantum correlations at zero and finite temperatures. It is found that δ can strengthen the quantum correlation for both the nearest-neighbor spins and the next nearest-neighbor spins, especially for the nearest-neighbor spins at low temperature.

  11. Performing a scatterv operation on a hierarchical tree network optimized for collective operations

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Archer, Charles J; Blocksome, Michael A; Ratterman, Joseph D

    Performing a scatterv operation on a hierarchical tree network optimized for collective operations including receiving, by the scatterv module installed on the node, from a nearest neighbor parent above the node a chunk of data having at least a portion of data for the node; maintaining, by the scatterv module installed on the node, the portion of the data for the node; determining, by the scatterv module installed on the node, whether any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child; and sending, by the scatterv module installed on the node, those portions of data to the nearest neighbor child if any portions of the data are for a particular nearest neighbor child below the node or one or more other nodes below the particular nearest neighbor child.

  12. Gunshot identification system by integration of open source consumer electronics

    NASA Astrophysics Data System (ADS)

    López R., Juan Manuel; Marulanda B., Jose Ignacio

    2014-05-01

    This work presents a prototype of a low-cost gunshot identification system that uses consumer electronics to confirm the occurrence of gunshots and then classify them according to a previously established database. The tool is intended for deployment in urban areas to produce records that support forensics, thereby improving law enforcement, including in developing countries. An analysis of its effectiveness is presented in comparison with theoretical results obtained with numerical simulations.

  13. A prototype system based on visual interactive SDM called VGC

    NASA Astrophysics Data System (ADS)

    Jia, Zelu; Liu, Yaolin; Liu, Yanfang

    2009-10-01

    In many application domains, data is collected and referenced by its geo-spatial location. Spatial data mining, or the discovery of interesting patterns in such databases, is an important capability in the development of database systems. Spatial data mining has recently emerged from a number of real applications, such as real-estate marketing, urban planning, weather forecasting, medical image analysis, and road traffic accident analysis. It demands efficient solutions for many new, expensive, and complicated problems. For spatial data mining of large data sets to be effective, it is also important to include humans in the data exploration process and combine their flexibility, creativity, and general knowledge with the enormous storage capacity and computational power of today's computers. Visual spatial data mining applies human visual perception to the exploration of large data sets. Presenting data in an interactive, graphical form often fosters new insights, encouraging the formation and validation of new hypotheses to the end of better problem-solving and gaining deeper domain knowledge. In this paper, a visual interactive spatial data mining prototype system (visual geo-classify) based on VC++6.0 and MapObject2.0 is designed and developed. The basic spatial data mining algorithms used are decision trees and Bayesian networks, and data classification is realized through training and learning and the integration of the two. The results indicate that it is a practical and extensible visual interactive spatial data mining tool.

  14. Electronic origin of the dependence of hydrogen bond strengths on nearest-neighbor and next-nearest-neighbor hydrogen bonds in polyhedral water clusters (H2O)n, n = 8, 20 and 24

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Iwata, Suehiro; Akase, Dai; Aida, Misako

    2016-01-01

    The relative stability and the characteristics of the hydrogen bond networks in the cubic cages of (H2O)8, dodecahedral cages of (H2O)20, and tetrakaidodecahedral cages of (H2O)24 are studied. The charge-transfer and dispersion interaction terms of every pair of the hydrogen bonds are evaluated by using the perturbation theory based on the locally-projected molecular orbital (LPMO PT). Every water molecule and every hydrogen-bonded pair in polyhedral clusters are classified by the types of the adjacent molecules and hydrogen bonds. The relative binding energies among the polyhedral clusters are grouped by these classifications. The necessary condition for the stable conformers and the rules of the ordering of the relative stability among the isomers are derived from the analysis. The O–O distances and the pair-wise charge-transfer terms are dependent not only on the types of the hydrogen donor and acceptor waters but also on the types of the adjacent waters. This dependence is analyzed with Mulliken’s charge-transfer theory. The work is partially supported by the Grant-in-Aid for Science Research of JSPS (SI, DA, MA). SSX was supported by the US Department of Energy, Office of Science, Office of Basic Energy Sciences, Division of Chemical Sciences, Geosciences and Biosciences. Battelle operates the Pacific Northwest National Laboratory for the US Department of Energy.

  15. On the classification techniques in data mining for microarray data classification

    NASA Astrophysics Data System (ADS)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases; according to WHO data, cancer caused 8.8 million deaths in 2015, and this number will increase every year if the problem is not addressed early. Microarray data have become one of the most popular bases for cancer-identification studies in the health field, since they can be used to examine gene expression levels in particular cell samples, allowing thousands of genes to be analyzed simultaneously. Using data mining techniques, we can classify microarray samples to identify whether or not they indicate cancer. In this paper we discuss research applying several data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, and a simulation of the Random Forest algorithm with dimensionality reduction using Relief. The results of this paper show the performance measure (accuracy) of the classification algorithms (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest). The accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique based on microarray data.

  16. Classification Features of US Images Liver Extracted with Co-occurrence Matrix Using the Nearest Neighbor Algorithm

    NASA Astrophysics Data System (ADS)

    Moldovanu, Simona; Bibicu, Dorin; Moraru, Luminita; Nicolae, Mariana Carmen

    2011-12-01

    The co-occurrence matrix has been applied successfully to the characterization of echographic images because it contains information about the spatial distribution of grey-scale levels in an image. The paper deals with the analysis of pixels in selected regions of interest of an ultrasound (US) image of the liver. The useful information obtained refers to texture features such as entropy, contrast, dissimilarity and correlation, extracted with the co-occurrence matrix. The analyzed US images were grouped into two distinct sets: healthy liver and steatosis (or fatty) liver. These two sets of echographic images of the liver form a database that includes only histologically confirmed cases: 10 images of healthy liver and 10 images of steatosis liver. The healthy subjects are used to compute four textural indices and also serve as the control dataset. We chose to study these diseases because steatosis is the abnormal retention of lipids in cells. The texture features are statistical measures and they can be used to characterize irregularity of tissues. The goal is to extract the information using the Nearest Neighbor classification algorithm. The K-NN algorithm is a powerful tool to classify texture features by grouping them in a training set using healthy liver, on the one hand, and in a holdout set using the texture features of steatosis liver, on the other hand. The results could be used to quantify the texture information and will allow clear discrimination between healthy and steatosis liver.
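
    The four texture features named in this abstract (entropy, contrast, dissimilarity and correlation) have standard definitions over a normalized grey-level co-occurrence matrix. The Python sketch below uses those textbook definitions with a symmetric matrix and a single horizontal offset; the tiny 4-level image, the offset, and the function names are assumptions made for illustration, and the paper's own parameters are not given in the abstract.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Return a normalized, symmetric co-occurrence matrix for offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # count the pair in both directions
                counts[j][i] += 1  # so that the matrix is symmetric
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Compute contrast, dissimilarity, entropy and correlation from a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    dissimilarity = sum(v * abs(i - j) for i, j, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    if var_i > 0 and var_j > 0:
        cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
        correlation = cov / math.sqrt(var_i * var_j)
    else:
        # Undefined for a constant region; this sketch reports 0.0 by convention.
        correlation = 0.0
    return {
        "contrast": contrast,
        "dissimilarity": dissimilarity,
        "entropy": entropy,
        "correlation": correlation,
    }


# Hypothetical 4-level region of interest (illustrative only).
roi = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
print(texture_features(glcm(roi, levels=4)))
```

    The resulting feature vector for each region of interest is the kind of input a nearest-neighbor classifier would then compare by distance.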

  17. Forward collision warning based on kernelized correlation filters

    NASA Astrophysics Data System (ADS)

    Pu, Jinchuan; Liu, Jun; Zhao, Yong

    2017-07-01

    A vehicle detection and tracking system is one of the indispensable means of reducing the occurrence of traffic accidents. The nearest vehicle is the one most likely to cause harm, so this paper focuses on the nearest vehicle in the region of interest (ROI). For such a system, high accuracy, real-time operation and intelligence are the basic requirements. In this paper, we set up a system that combines the advanced KCF tracking algorithm with the HaarAdaBoost detection algorithm. The KCF algorithm reduces computation time and increases speed through cyclic shifts and diagonalization, and so satisfies the real-time requirement. Haar features likewise have the advantage of simple operation and high detection speed. The combination of these two algorithms yields an obvious improvement in the running rate of the system compared with previous work. The detection result of the HaarAdaBoost classifier provides the initial value for the KCF algorithm, which removes the KCF algorithm's need for manual marking of the car in the initial phase and makes the system more intelligent. Haar detection and KCF tracking with Histogram of Oriented Gradient (HOG) features ensure the accuracy of the system. We evaluate the performance of the framework on a self-collected dataset. The experimental results demonstrate that the proposed method is robust and runs in real time. The algorithm adapts effectively to illumination variation and meets the detection and tracking requirements even at night, which is an improvement over previous work.

  18. AVC: Selecting discriminative features on basis of AUC by maximizing variable complementarity.

    PubMed

    Sun, Lei; Wang, Jun; Wei, Jinmao

    2017-03-14

    The Receiver Operator Characteristic (ROC) curve is well known for evaluating classification performance in the biomedical field. Owing to its superiority in dealing with imbalanced and cost-sensitive data, the ROC curve has been exploited as a popular metric to evaluate and find disease-related genes (features). The existing ROC-based feature selection approaches are simple and effective in evaluating individual features. However, these approaches may fail to find the real target feature subset because they lack an effective means of reducing the redundancy between features, which is essential in machine learning. In this paper, we propose to assess feature complementarity by measuring the distances between the misclassified instances and their nearest misses on the dimensions of pairwise features. If a misclassified instance and its nearest miss on one feature dimension are far apart on another feature dimension, the two features are regarded as complementary to each other. Subsequently, we propose a novel filter feature selection approach on the basis of the ROC analysis. The new approach employs an efficient heuristic search strategy to select optimal features with the highest complementarities. The experimental results on a broad range of microarray data sets validate that the classifiers built on the feature subset selected by our approach can achieve the minimal balanced error rate with a small number of significant features. Compared with other ROC-based feature selection approaches, our new approach can select fewer features and effectively improve the classification performance.
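
    The baseline this abstract contrasts itself with, scoring each feature individually by its ROC curve, can be sketched as below. This is not the AVC method itself and does not model complementarity between features; the function names and the 0/1 label encoding are assumptions made for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve for one feature used directly as a score.

    Computed as the probability that a positive sample scores higher than a
    negative one, with ties counted as one half. Labels are assumed to be 0/1.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def rank_features(X, y):
    """Return feature indices ordered from most to least discriminative.

    A feature with AUC near 0 separates the classes as well as one with AUC
    near 1 (the direction is just reversed), so max(a, 1 - a) is used.
    """
    scored = []
    for f in range(len(X[0])):
        a = auc([row[f] for row in X], y)
        scored.append((max(a, 1.0 - a), f))
    return [f for _, f in sorted(scored, key=lambda t: (-t[0], t[1]))]
```

    Because every feature is scored in isolation, two highly redundant features can both rank near the top, which is the limitation the abstract says its complementarity measure is meant to address.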

  19. 51 Eridani and GJ 3305: A 10-15 Myr old Binary Star System at 30 Parsecs

    NASA Astrophysics Data System (ADS)

    Feigelson, E. D.; Lawson, W. A.; Stark, M.; Townsley, L.; Garmire, G. P.

    2006-03-01

    Following the suggestion of Zuckerman and coworkers, we consider the evidence that 51 Eri (spectral type F0) and GJ 3305 (M0), historically classified as unrelated main-sequence stars in the solar neighborhood, are instead a wide physical binary system and members of the young β Pic moving group (BPMG). The BPMG is the nearest (d<~50 pc) of several groups of young stars with ages around 10 Myr that are kinematically convergent with the Oph-Sco-Cen association (OSCA), the nearest OB star association. Combining South African Astronomical Observatory optical photometry, Hobby-Eberly Telescope high-resolution spectroscopy, Chandra X-Ray Observatory data, and Second US Naval Observatory CCD Astrograph Catalog kinematics, we confirm with high confidence that the system is indeed extremely young. GJ 3305 itself exhibits very strong magnetic activity but has rapidly depleted most of its lithium. The 51 Eri/GJ 3305 system is the westernmost known member of the OSCA, lying 110 pc from the main subgroups. The system is similar to the BPMG wide binary HD 172555/CD -64 1208 and the HD 104237 quintet, suggesting that dynamically fragile multiple systems can survive the turbulent environments of their natal giant molecular cloud complexes, while still having high dispersion velocities imparted. Nearby young systems such as these are excellent targets for evolved circumstellar disk and planetary studies, having stellar ages comparable to that of the late phases of planet formation.

  20. Insufficient smoking restrictions in restaurants around junior high schools in Japan.

    PubMed

    Kotani, Kazuhiko; Osaki, Yoneatsu; Kurozawa, Youichi; Kishimoto, Takuji

    2006-12-01

    Control of second-hand smoke (SHS) and adolescent smoking remains a sociomedical concern in Japan. Restaurant smoking restrictions are associated with community social norms affecting adolescent smoking behavior, and their status in areas around junior high schools (JHSs) could be a sign of community practices in regulating SHS for adolescents. To examine whether restaurant smoking restrictions are seen especially in areas around JHSs in Japan, a survey using direct inspection of a total of 163 restaurants (64 restaurants within and 99 outside a 1-km radius of the nearest JHS) was conducted in May 2003 in Yonago city, Japan. We assessed the smoking restriction status of each restaurant and classified the restaurants into 2 groups according to the distance from the nearest JHS. There were only 2 (3.1%) restaurants with 100% non-smoking and 11 (17.2%) with some partial restrictions among the restaurants within a 1-km radius of JHSs. There was 1 (1.0%) restaurant with 100% non-smoking, 3 (3.0%) with complete non-smoking sections and 17 (17.2%) with some partial restrictions among the restaurants outside a 1-km radius of JHSs. Among restaurants with some partial restrictions, restriction methods were considered insufficient. The smoking restriction status was not significantly different between the restaurant groups within and outside a 1-km radius of JHSs. These results suggest that public awareness of and attitudes toward adolescent smoking problems remain low in Japan. Further SHS control actions for adolescents are needed in Japan.

  1. Genetic Programming and Frequent Itemset Mining to Identify Feature Selection Patterns of iEEG and fMRI Epilepsy Data

    PubMed Central

    Smart, Otis; Burrell, Lauren

    2014-01-01

    Pattern classification for intracranial electroencephalogram (iEEG) and functional magnetic resonance imaging (fMRI) signals has furthered epilepsy research toward understanding the origin of epileptic seizures and localizing dysfunctional brain tissue for treatment. Prior research has demonstrated that implicitly selecting features with a genetic programming (GP) algorithm more effectively determined the proper features to discern biomarker and non-biomarker interictal iEEG and fMRI activity than conventional feature selection approaches. However, for each of the iEEG and fMRI modalities, it is still uncertain whether the stochastic properties of indirect feature selection with a GP yield (a) consistent results within a patient data set and (b) features that are specific or universal across multiple patient data sets. We examined the reproducibility of implicitly selecting features to classify interictal activity using a GP algorithm by performing several selection trials and subsequent frequent itemset mining (FIM) for separate iEEG and fMRI epilepsy patient data. We observed within-subject consistency and across-subject variability with some small similarity for selected features, indicating a clear need for patient-specific features and a possible need for patient-specific feature selection and/or classification. For the fMRI, using nearest-neighbor classification and 30 GP generations, we obtained over 60% median sensitivity and over 60% median selectivity. For the iEEG, using nearest-neighbor classification and 30 GP generations, we obtained over 65% median sensitivity and over 65% median selectivity, except for one patient. PMID:25580059

  2. The association between the geography of fast food outlets and childhood obesity rates in Leeds, UK.

    PubMed

    Fraser, Lorna K; Edwards, Kimberley L

    2010-11-01

    To analyse the association between childhood overweight and obesity and the density and proximity of fast food outlets in relation to the child's residential postcode. This was an observational study using individual level height/weight data and geographic information systems methodology. Leeds in West Yorkshire, UK. This area consists of 476 lower super-output areas. Children aged 3-14 years who lived within the Leeds metropolitan boundaries (n=33,594). The number of fast food outlets per area and the distance to the nearest fast food outlet from the child's home address. The weight status of the child: overweight, obese or neither. 27.1% of the children were overweight or obese with 12.6% classified as obese. There is a significant positive correlation (p<0.001) between density of fast food outlets and higher deprivation. A higher density of fast food outlets was significantly associated (p=0.02) with the child being obese (or overweight/obese) in the generalised estimating equation model which also included sex, age and deprivation. No significant association between distance to the nearest fast food outlet and overweight or obese status was found. There is a positive relationship between the density of fast food outlets per area and the obesity status of children in Leeds. There is also a significant association between fast food outlet density and areas of higher deprivation. Copyright © 2010 Elsevier Ltd. All rights reserved.

  3. The Application of Determining Students’ Graduation Status of STMIK Palangkaraya Using K-Nearest Neighbors Method

    NASA Astrophysics Data System (ADS)

    Rusdiana, Lili; Marfuah

    2017-12-01

    The K-Nearest Neighbors method is a classification method that computes distances to find the closest data points. It is used to group data such as students’ graduation status, which is derived from the number of course credits taken, the grade point average (AVG), and the mini-thesis grade. The study was conducted to examine the results of using the K-Nearest Neighbors method in an application for determining students’ graduation status, so that it can be analyzed in terms of the method used, the data, and the application constructed. The aim of this study is to find out the application results of using the K-Nearest Neighbors concept to determine students’ graduation status using data on STMIK Palangkaraya students. The software was developed using Extreme Programming, which suited the study’s goal of finishing the project quickly. The application was created using Microsoft Office Excel 2007 for the training data and Matlab 7 to implement the application. The K-Nearest Neighbors method in the application for determining students’ graduation status achieved a result of 92.5%. It could determine the graduation predicate of the 94 records used, out of 136 initial records before processing, with a maximum of 50 training records. Because the K-Nearest Neighbors method groups data based on the closest value, it was well suited to this study.
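
    The classification step described here, finding the closest training records and taking their label, can be sketched as follows. This is an illustrative sketch only: the original application was built in Matlab, and the function name, the Euclidean distance, the majority vote and the sample records in the usage note are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training records.

    `train` is a list of (feature_tuple, label) pairs. Euclidean distance and
    a plain majority vote are illustrative choices, not taken from the paper.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

    With hypothetical records of (grade point average, mini-thesis grade), `knn_predict([((3.9, 90.0), "honors"), ((3.7, 88.0), "honors"), ((3.0, 75.0), "satisfactory")], (3.8, 89.0), k=3)` votes among the three closest records. Features on very different scales, such as credit counts versus grade averages, would normally be rescaled first so that one of them does not dominate the distance.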

  4. One for You, One for Me: Humans' Unique Turn-Taking Skills.

    PubMed

    Melis, Alicia P; Grocke, Patricia; Kalbitz, Josefine; Tomasello, Michael

    2016-07-01

    Long-term collaborative relationships require that any jointly produced resources be shared in mutually satisfactory ways. Prototypically, this sharing involves partners dividing up simultaneously available resources, but sometimes the collaboration makes a resource available to only one individual, and any sharing of resources must take place across repeated instances over time. Here, we show that beginning at 5 years of age, human children stabilize cooperation in such cases by taking turns across instances of obtaining a resource. In contrast, chimpanzees do not take turns in this way, and so their collaboration tends to disintegrate over time. Alternating turns in obtaining a collaboratively produced resource does not necessarily require a prosocial concern for the other, but rather requires only a strategic judgment that partners need incentives to continue collaborating. These results suggest that human beings are adapted for thinking strategically in ways that sustain long-term cooperative relationships and that are absent in their nearest primate relatives. © The Author(s) 2016.

  5. Stellar Archaeology and Galaxy Genesis: The Need for Large Area Multi-Object Spectrograph on 8 m-Class Telescopes

    NASA Astrophysics Data System (ADS)

    Irwin, Mike J.; Lewis, Geraint F.

    The origin and evolution of galaxies like the Milky Way and M31 remain among the key questions in astrophysics. The galaxies we see today in and around the Local Group are representatives of the general field population of the Universe and have been evolving for the majority of cosmic time. As our nearest neighbour systems they can be studied in far more detail than their distant counterparts and hence provide our best hope for understanding star formation and prototypical galaxy evolution over the lifetime of the Universe [K. Freeman, J. Bland-Hawthorn in Annu. Rev. Astron. Astrophys. 40, 487 (2002)]. Significant observational progress has been made, but we are still a long way from understanding galaxy genesis. To unravel this formative epoch, detailed large area multi-object spectroscopy of spatial, kinematic and chemical structures on 8 m-class telescopes is required, to provide the link between local near-field cosmology and predictions from the high-redshift Universe.

  6. Portable Electronic Nose Based on Electrochemical Sensors for Food Quality Assessment

    PubMed Central

    Dymerski, Tomasz; Gębicki, Jacek; Namieśnik, Jacek

    2017-01-01

    The steady increase in global consumption puts a strain on agriculture and might lead to a decrease in food quality. Currently used techniques of food analysis are often labour-intensive and time-consuming and require extensive sample preparation. For that reason, there is a demand for novel methods that could be used for rapid food quality assessment. A technique based on the use of an array of chemical sensors for holistic analysis of the sample’s headspace is called electronic olfaction. In this article, a prototype of a portable, modular electronic nose intended for food analysis is described. Using the SVM method, it was possible to classify samples of poultry meat based on shelf-life with 100% accuracy, and also samples of rapeseed oil based on the degree of thermal degradation with 100% accuracy. The prototype was also used to detect adulterations of extra virgin olive oil with rapeseed oil with 82% overall accuracy. Due to the modular design, the prototype offers the advantages of solutions targeted for analysis of specific food products, at the same time retaining the flexibility of application. Furthermore, its portability allows the device to be used at different stages of the production and distribution process. PMID:29186754

  7. Antigenic characterization of intermediate adenovirus 14-11 strains associated with upper respiratory illness in a military camp.

    PubMed Central

    Hierholzer, J C; Pumarola, A

    1976-01-01

    An unusual variant of adenovirus (AV) 11 was isolated from throat and rectal swabs from six persons with upper respiratory illness in a Spanish military camp in March 1969. The same strain was serologically related to the upper respiratory illness of seven other men among 25 sample cases studied in detail. After strain purification, the virus was grouped as an AV by standard biological tests; it possessed the usual titers of group-specific hexon antigen but only low hemagglutinin titers (1:4 to 1:8) with erythrocytes from selected rhesus monkeys. The virus gave little reaction in hemagglutination inhibition (HI) tests with antisera to AV 1 through 35, but was neutralized to homologous titers by AV 11 antiserum. Reciprocally, rabbit and guinea pig antisera to the isolates possessed high HI antibody titers to prototype AV 14 and high serum neutralization (SN) antibody titers to prototype AV 11. On this basis, the variants were classified as AV 14-11 intermediates. Sequential serum specimens from the patients with and without positive cultures showed diagnostic rises in HI and SN antibody levels to the AV 14-11 intermediate and to prototype AV 11, but little response to AV 14. PMID:177365

  8. Boosting instance prototypes to detect local dermoscopic features.

    PubMed

    Situ, Ning; Yuan, Xiaojing; Zouridakis, George

    2010-01-01

    Local dermoscopic features are useful in many dermoscopic criteria for skin cancer detection. We address the problem of detecting local dermoscopic features from epiluminescence (ELM) microscopy skin lesion images. We formulate the recognition of local dermoscopic features as a multi-instance learning (MIL) problem. We employ the method of diverse density (DD) and evidence confidence (EC) function to convert MIL to a single-instance learning (SIL) problem. We apply Adaboost to improve the classification performance with support vector machines (SVMs) as the base classifier. We also propose to boost the selection of instance prototypes through changing the data weights in the DD function. We validate the methods on detecting ten local dermoscopic features from a dataset with 360 images. We compare the performance of the MIL approach, its boosting version, and a baseline method without using MIL. Our results show that boosting can provide performance improvement compared to the other two methods.

  9. K-Nearest Neighbor Algorithm Optimization in Text Categorization

    NASA Astrophysics Data System (ADS)

    Chen, Shufeng

    2018-01-01

    The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings, such as a large amount of sample computation and strong dependence on the capacity of the sample library. In this paper, a method of representative sample optimization based on the CURE algorithm is proposed. On this basis, a quick algorithm, QKNN (Quick k-nearest neighbor), is presented to find the k nearest neighbor samples, which greatly reduces the number of similarity calculations. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples, improving the performance of the algorithm.
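
    The general idea of shrinking the sample library to a few representatives before searching can be illustrated with a nearest-prototype sketch. This is not the CURE-based selection or the QKNN search from the paper: as a deliberately simple stand-in, each class is collapsed to its mean vector, and all names below are hypothetical.

```python
import math

def class_prototypes(samples):
    """Collapse each class to one representative: its mean feature vector.

    `samples` is a list of (feature_tuple, label) pairs. Using the class mean
    is an assumption for illustration; the paper selects representatives with
    the CURE clustering algorithm instead.
    """
    groups = {}
    for features, label in samples:
        groups.setdefault(label, []).append(features)
    return [(tuple(sum(col) / len(rows) for col in zip(*rows)), label)
            for label, rows in groups.items()]

def nearest_prototype(prototypes, query):
    """Label of the representative closest to `query` (Euclidean distance)."""
    return min(prototypes, key=lambda p: math.dist(p[0], query))[1]
```

    Classifying a query then costs one distance computation per representative rather than one per stored sample, which is the kind of saving the abstract describes.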

  10. Comparison of Machine Learning Methods for the Arterial Hypertension Diagnostics

    PubMed Central

    Belo, David; Gamboa, Hugo

    2017-01-01

    The paper presents the results of an accuracy analysis of machine learning approaches applied to cardiac activity. The study evaluates the possibility of diagnosing arterial hypertension by means of short-term heart rate variability signals. Two groups were studied: 30 relatively healthy volunteers and 40 patients suffering from arterial hypertension of II-III degree. The following machine learning approaches were studied: linear and quadratic discriminant analysis, k-nearest neighbors, support vector machine with radial basis, decision trees, and naive Bayes classifier. Moreover, different methods of feature extraction are analyzed in the study: statistical, spectral, wavelet, and multifractal. In all, 53 features were investigated. The results show that discriminant analysis achieves the highest classification accuracy. The suggested approach of searching for a noncorrelated feature set achieved higher results than a data set based on the principal components. PMID:28831239

  11. Gesture recognition for smart home applications using portable radar sensors.

    PubMed

    Wan, Qian; Li, Yiran; Li, Changzhi; Pal, Ranadip

    2014-01-01

    In this article, we consider the design of a human gesture recognition system based on pattern recognition of signatures from a portable smart radar sensor. Powered by AAA batteries, the smart radar sensor operates in the 2.4 GHz industrial, scientific and medical (ISM) band. We analyzed the feature space using principal components and application-specific time and frequency domain features extracted from radar signals for two different sets of gestures. We illustrate that a nearest neighbor based classifier can achieve greater than 95% accuracy for multiclass classification using 10-fold cross validation when features are extracted based on magnitude differences and Doppler shifts, as compared to features extracted through orthogonal transformations. The reported results illustrate the potential of intelligent radars integrated with a pattern recognition system for high accuracy smart home and health monitoring purposes.

  12. A face and palmprint recognition approach based on discriminant DCT feature extraction.

    PubMed

    Jing, Xiao-Yuan; Zhang, David

    2004-12-01

    In the field of image processing and recognition, discrete cosine transform (DCT) and linear discrimination are two widely used techniques. Based on them, we present a new face and palmprint recognition approach in this paper. It first uses a two-dimensional separability judgment to select the DCT frequency bands with favorable linear separability. Then, from the selected bands, it extracts the linear discriminative features by an improved Fisherface method and performs the classification with the nearest neighbor classifier. We analyze in detail the theoretical advantages of our approach in feature extraction. The experiments on face databases and a palmprint database demonstrate that, compared to the state-of-the-art linear discrimination methods, our approach obtains better classification performance. It can significantly improve the recognition rates for face and palmprint data and effectively reduce the dimension of the feature space.

  13. Hamiltonian identifiability assisted by a single-probe measurement

    NASA Astrophysics Data System (ADS)

    Sone, Akira; Cappellaro, Paola

    2017-02-01

    We study the Hamiltonian identifiability of a many-body spin-1/2 system assisted by the measurement on a single quantum probe based on the eigensystem realization algorithm approach employed in Zhang and Sarovar, Phys. Rev. Lett. 113, 080401 (2014), 10.1103/PhysRevLett.113.080401. We demonstrate a potential application of Gröbner basis to the identifiability test of the Hamiltonian, and provide the necessary experimental resources, such as the lower bound in the number of the required sampling points, the upper bound in total required evolution time, and thus the total measurement time. Focusing on the examples of the identifiability in the spin-chain model with nearest-neighbor interaction, we classify the spin-chain Hamiltonian based on its identifiability, and provide the control protocols to engineer the nonidentifiable Hamiltonian to be an identifiable Hamiltonian.

  14. A Smartphone App for Families With Preschool-Aged Children in a Public Nutrition Program: Prototype Development and Beta-Testing

    PubMed Central

    Emerson, Janice S; Quirk, Meghan E; Canedo, Juan R; Jones, Jessica L; Vylegzhanina, Violetta; Schmidt, Douglas C; Mulvaney, Shelagh A; Beech, Bettina M; Briley, Chiquita; Harris, Calvin; Husaini, Baqar A

    2017-01-01

    Background The Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) in the United States provides free supplemental food and nutrition education to low-income mothers and children under age 5 years. Childhood obesity prevalence is higher among preschool children in the WIC program compared to other children, and WIC improves dietary quality among low-income children. The Children Eating Well (CHEW) smartphone app was developed in English and Spanish for WIC-participating families with preschool-aged children as a home-based intervention to reinforce WIC nutrition education and help prevent childhood obesity. Objective This paper describes the development and beta-testing of the CHEW smartphone app. The objective of beta-testing was to test the CHEW app prototype with target users, focusing on usage, usability, and perceived barriers and benefits of the app. Methods The goals of the CHEW app were to make the WIC shopping experience easier, maximize WIC benefit redemption, and improve parent snack feeding practices. The CHEW app prototype consisted of WIC Shopping Tools, including a barcode scanner and calculator tools for the cash value voucher for purchasing fruits and vegetables, and nutrition education focused on healthy snacks and beverages, including a Yummy Snack Gallery and Healthy Snacking Tips. Mothers of 63 black and Hispanic WIC-participating children ages 2 to 4 years tested the CHEW app prototype for 3 months and completed follow-up interviews. Results Study participants testing the app for 3 months used the app on average once a week for approximately 4 and a half minutes per session, although substantial variation was observed. Usage of specific features averaged at 1 to 2 times per month for shopping-related activities and 2 to 4 times per month for the snack gallery. 
Mothers classified as users rated the app’s WIC Shopping Tools relatively high on usability and benefits, although variation in scores and qualitative feedback highlighted several barriers that need to be addressed. The Yummy Snack Gallery and Healthy Snacking Tips scored higher on usability than benefits, suggesting that the nutrition education components may have been appealing but too limited in scope and exposure. Qualitative feedback from mothers classified as non-users pointed to several important barriers that could preclude some WIC participants from using the app at all. Conclusions The prototype study successfully demonstrated the feasibility of using the CHEW app prototype with mothers of WIC-enrolled black and Hispanic preschool-aged children, with moderate levels of app usage and moderate to high usability and benefits. Future versions with enhanced shopping tools and expanded nutrition content should be implemented in WIC clinics to evaluate adoption and behavioral outcomes. This study adds to the growing body of research focused on the application of technology-based interventions in the WIC program to promote program retention and childhood obesity prevention. PMID:28768611

  15. Prototype diagnosis of psychiatric syndromes

    PubMed Central

    WESTEN, DREW

    2012-01-01

    The method of diagnosing patients used since the early 1980s in psychiatry, which involves evaluating each of several hundred symptoms for their presence or absence and then applying idiosyncratic rules for combining them for each of several hundred disorders, has led to great advances in research over the last 30 years. However, its problems have become increasingly apparent, particularly for clinical practice. An alternative approach, designed to maximize clinical utility, is prototype matching. Instead of counting symptoms of a disorder and determining whether they cross an arbitrary cutoff, the task of the diagnostician is to gauge the extent to which a patient’s clinical presentation matches a paragraph-length description of the disorder using a simple 5-point scale, from 1 (“little or no match”) to 5 (“very good match”). The result is both a dimensional diagnosis that captures the extent to which the patient “has” the disorder and a categorical diagnosis, with ratings of 4 and 5 corresponding to presence of the disorder and a rating of 3 indicating “subthreshold” or “clinically significant features”. The disorders and criteria woven into the prototypes can be identified empirically, so that the prototypes are both scientifically grounded and clinically useful. 
Prototype diagnosis has a number of advantages: it better captures the way humans naturally classify novel and complex stimuli; is clinically helpful, reliable, and easy to use in everyday practice; facilitates both dimensional and categorical diagnosis and dramatically reduces the number of categories required for classification; allows for clinically richer, empirically derived, and culturally relevant classification; reduces the gap between research criteria and clinical knowledge, by allowing clinicians in training to learn a small set of standardized prototypes and to develop richer mental representations of the disorders over time through clinical experience; and can help resolve the thorny issue of the relation between psychiatric diagnosis and functional impairment. PMID:22294998
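
    The mapping from the 5-point match rating to a categorical diagnosis stated in this abstract can be written out as a small rule. The function name and the label for ratings of 1 and 2 are assumptions; the abstract only specifies the meaning of ratings 3, 4 and 5.

```python
def categorical_diagnosis(rating):
    """Map a 1-5 prototype-match rating to a categorical diagnosis.

    Ratings of 4 and 5 mean the disorder is present and 3 means subthreshold,
    as stated in the abstract. The "absent" label for 1 and 2 is an assumption.
    """
    if rating not in (1, 2, 3, 4, 5):
        raise ValueError("rating must be an integer from 1 to 5")
    if rating >= 4:
        return "present"
    if rating == 3:
        return "subthreshold"
    return "absent"  # assumed label; not named in the abstract
```

    The rating itself serves as the dimensional diagnosis, so a record would typically keep both the number and the derived category.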

  16. Site descriptions: Cypress Creek, Davis Canyon, Deaf Smith, Hanford Reference, Lavender Canyon, Richton Dome, Swisher, Vacherie Dome, Yucca Mountain. Revision

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    NONE

    1986-04-01

    The following information is given about the various sites: location (state and county), terrain, climate, weather, endangered plants and animals; nearest town, population, nearest railway, nearest interstate highway, economy, density within 50 miles, owners, and historical sites. (LM)

  17. The nearest relative in mental health law.

    PubMed

    Andoh, Benjamin; Gogo, Emmanuel

    2004-04-01

    This article considers the concept of the 'nearest relative' in mental health law in England and Wales and argues, inter alia, for its retention in a way that avoids violation of the European Convention on Human Rights and the Human Rights Act 1998. It looks, first, at the meaning of nearest relative and then focuses on his/her role today, including its link with advance directives for mental health care, and on the tension between nearest relatives and approved social workers and the law. The problem exposed by JT v. United Kingdom in relation to the Human Rights Act 1998 and its implications for the future are considered. The impact of the Mental Health Bill (2002) on the nearest relative is discussed and recommendations to improve the present law are then suggested.

  18. Improved nearest codeword search scheme using a tighter kick-out condition

    NASA Astrophysics Data System (ADS)

    Hwang, Kuo-Feng; Chang, Chin-Chen

    2001-09-01

    A tighter kick-out condition is proposed as a faster approach to nearest codeword searches. The proposed scheme finds the same nearest codeword as a full search, but in a much shorter search time. Our scheme first establishes a tighter kick-out condition. Then, the temporal nearest codeword is obtained from the codewords that survive the tighter condition. Finally, the temporal nearest codeword is combined with the query vector to constitute a better kick-out condition. In other words, more codewords can be excluded without actually computing the distances between the bypassed codewords and the query vector. Comparisons with previous work are included to show the benefits of the proposed scheme in terms of search time.
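
    How a kick-out condition lets a search skip distance computations while still returning the full-search result can be sketched as below. The abstract does not state the paper's actual condition, so this sketch substitutes a well-known norm-based bound from the triangle inequality; the function name and the visiting order are assumptions.

```python
import math

def nearest_codeword(codebook, query):
    """Return (index, distance, number_of_distances_computed).

    Kick-out rule used here (an assumption, not the paper's condition): by the
    triangle inequality, | ||c|| - ||q|| | <= dist(c, q), so a codeword whose
    norm gap is at least the best distance found so far cannot be closer.
    """
    q_norm = math.hypot(*query)
    norms = [math.hypot(*c) for c in codebook]  # can be precomputed offline
    # Visit codewords by increasing norm gap so a good candidate appears early.
    order = sorted(range(len(codebook)), key=lambda i: abs(norms[i] - q_norm))
    best_i, best_d, computed = -1, math.inf, 0
    for i in order:
        if abs(norms[i] - q_norm) >= best_d:
            break  # every remaining codeword has an even larger gap
        d = math.dist(codebook[i], query)
        computed += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, computed
```

    The returned distance always equals the full-search minimum, because a codeword is only skipped when the bound proves it cannot improve on the current best. A tighter bound, which is what the abstract proposes, would exclude more codewords than this simple one.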

  19. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection.

    PubMed

    Wang, Yong; Wu, Qiao-Feng; Chen, Chen; Wu, Ling-Yun; Yan, Xian-Zhong; Yu, Shu-Guang; Zhang, Xiang-Sun; Liang, Fan-Rong

    2012-01-01

    Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provides evidence for the specificity of acupoints.
Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics.
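
    The abstract above uses the nearest centroid as the class prototype. As a rough illustration of that building block only (not of the LPFS optimization model, which the abstract does not detail), a plain nearest-centroid classifier can be sketched as follows. Euclidean distance, the function names, and the toy labels are assumptions made for the example.

```python
import numpy as np


def fit_centroids(X, y):
    """Compute one prototype (the mean vector) per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict_nearest_centroid(classes, centroids, X):
    """Assign each row of X to the class whose centroid is closest (Euclidean)."""
    X = np.asarray(X, dtype=float)
    # Pairwise distances, shape (n_samples, n_classes).
    d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[d.argmin(axis=1)]


# Toy usage with made-up two-feature profiles.
classes, centroids = fit_centroids(
    [[0, 0], [0, 2], [10, 10], [10, 12]],
    ["ctrl", "ctrl", "case", "case"],
)
labels = predict_nearest_centroid(classes, centroids, [[1, 1], [9, 9]])
```

    Restricting the columns of X to a selected feature subset before fitting is where a feature-selection method would plug in.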

  20. Revealing metabolite biomarkers for acupuncture treatment by linear programming based feature selection

    PubMed Central

    2012-01-01

    Background Acupuncture has been practiced in China for thousands of years as part of the Traditional Chinese Medicine (TCM) and has gradually been accepted in western countries as an alternative or complementary treatment. However, the underlying mechanism of acupuncture, especially whether there exists any difference between various acupoints, remains largely unknown, which hinders its widespread use. Results In this study, we develop a novel Linear Programming based Feature Selection method (LPFS) to understand the mechanism of acupuncture effect, at molecular level, by revealing the metabolite biomarkers for acupuncture treatment. Specifically, we generate and investigate the high-throughput metabolic profiles of acupuncture treatment at several acupoints in humans. To select the subsets of metabolites that best characterize the acupuncture effect for each meridian point, an optimization model is proposed to identify biomarkers from high-dimensional metabolic data from case and control samples. Importantly, we use nearest centroid as the prototype to simultaneously minimize the number of selected features and the leave-one-out cross validation error of the classifier. We compared the performance of LPFS to several state-of-the-art methods, such as SVM recursive feature elimination (SVM-RFE) and sparse multinomial logistic regression approach (SMLR). We find that our LPFS method tends to reveal a small set of metabolites with small standard deviation and large shifts, which exactly serves our requirement for a good biomarker. Biologically, several metabolite biomarkers for acupuncture treatment are revealed and serve as the candidates for further mechanism investigation. Also, biomarkers derived from five meridian points, Zusanli (ST36), Liangmen (ST21), Juliao (ST3), Yanglingquan (GB34), and Weizhong (BL40), are compared for their similarity and difference, which provides evidence for the specificity of acupoints.
Conclusions Our result demonstrates that metabolic profiling might be a promising method to investigate the molecular mechanism of acupuncture. Compared with other existing methods, LPFS shows better performance in selecting a small set of key molecules. In addition, LPFS is a general methodology and can be applied to other high-dimensional data analysis, for example cancer genomics. PMID:23046877

  1. Characterization of Used Nuclear Fuel with Multivariate Analysis for Process Monitoring

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Dayman, Kenneth J.; Coble, Jamie B.; Orton, Christopher R.

    2014-01-01

    The Multi-Isotope Process (MIP) Monitor combines gamma spectroscopy and multivariate analysis to detect anomalies in various process streams in a nuclear fuel reprocessing system. Measured spectra are compared to models of nominal behavior at each measurement location to detect unexpected changes in system behavior. In order to improve the accuracy and specificity of process monitoring, fuel characterization may be used to more accurately train subsequent models in a full analysis scheme. This paper presents initial development of a reactor-type classifier that is used to select a reactor-specific partial least squares model to predict fuel burnup. Nuclide activities for prototypic used fuel samples were generated in ORIGEN-ARP and used to investigate techniques to characterize used nuclear fuel in terms of reactor type (pressurized or boiling water reactor) and burnup. A variety of reactor type classification algorithms, including k-nearest neighbors, linear and quadratic discriminant analyses, and support vector machines, were evaluated to differentiate used fuel from pressurized and boiling water reactors. Then, reactor type-specific partial least squares models were developed to predict the burnup of the fuel. Using these reactor type-specific models instead of a model trained for all light water reactors improved the accuracy of burnup predictions. The developed classification and prediction models were combined and applied to a large dataset that included eight fuel assembly designs, two of which were not used in training the models, and spanned the range of the initial 235U enrichment, cooling time, and burnup values expected of future commercial used fuel for reprocessing. Error rates were consistent across the range of considered enrichment, cooling time, and burnup values. Average absolute relative errors in burnup predictions for validation data both within and outside the training space were 0.0574% and 0.0597%, respectively.
The errors seen in this work are artificially low, because the models were trained, optimized, and tested on simulated, noise-free data. However, these results indicate that the developed models may generalize well to new data and that the proposed approach constitutes a viable first step in developing a fuel characterization algorithm based on gamma spectra.

  2. Behavioral Modeling for Mental Health using Machine Learning Algorithms.

    PubMed

    Srividya, M; Mohanavalli, S; Bhalaji, N

    2018-04-03

    Mental health is an indicator of emotional, psychological and social well-being of an individual. It determines how an individual thinks, feels and handles situations. Positive mental health helps one to work productively and realize their full potential. Mental health is important at every stage of life, from childhood and adolescence through adulthood. Many factors contribute to mental health problems which lead to mental illness like stress, social anxiety, depression, obsessive compulsive disorder, drug addiction, and personality disorders. It is becoming increasingly important to determine the onset of the mental illness to maintain proper life balance. The nature of machine learning algorithms and Artificial Intelligence (AI) can be fully harnessed for predicting the onset of mental illness. Such applications when implemented in real time will benefit society by serving as a monitoring tool for individuals with deviant behavior. This research work proposes to apply various machine learning algorithms such as support vector machines, decision trees, naïve Bayes classifier, K-nearest neighbor classifier and logistic regression to identify state of mental health in a target group. The responses obtained from the target group for the designed questionnaire were first subject to unsupervised learning techniques. The labels obtained as a result of clustering were validated by computing the Mean Opinion Score. These cluster labels were then used to build classifiers to predict the mental health of an individual. Population from various groups like high school students, college students and working professionals were considered as target groups. The research presents an analysis of applying the aforementioned machine learning algorithms on the target groups and also suggests directions for future work.

  3. An evaluation of supervised classifiers for indirectly detecting salt-affected areas at irrigation scheme level

    NASA Astrophysics Data System (ADS)

    Muller, Sybrand Jacobus; van Niekerk, Adriaan

    2016-07-01

    Soil salinity often leads to reduced crop yield and quality and can render soils barren. Irrigated areas are particularly at risk due to intensive cultivation and secondary salinization caused by waterlogging. Regular monitoring of salt accumulation in irrigation schemes is needed to keep its negative effects under control. The dynamic spatial and temporal characteristics of remote sensing can provide a cost-effective solution for monitoring salt accumulation at irrigation scheme level. This study evaluated a range of pan-fused SPOT-5 derived features (spectral bands, vegetation indices, image textures and image transformations) for classifying salt-affected areas in two distinctly different irrigation schemes in South Africa, namely Vaalharts and Breede River. The relationship between the input features and electrical conductivity measurements was investigated using regression modelling (stepwise linear regression, partial least squares regression, curve fit regression modelling) and supervised classification (maximum likelihood, nearest neighbour, decision tree analysis, support vector machine and random forests). Classification and regression trees and random forest were used to select the most important features for differentiating salt-affected and unaffected areas. The results showed that the regression analyses produced weak models (R squared < 0.4). Better results were achieved using the supervised classifiers, but the algorithms tend to over-estimate salt-affected areas. A key finding was that none of the feature sets or classification algorithms stood out as being superior for monitoring salt accumulation at irrigation scheme level. This was attributed to the large variations in the spectral responses of different crop types at different growing stages, coupled with their individual tolerances to saline conditions.

  4. Machine-learning-based classification of real-time tissue elastography for hepatic fibrosis in patients with chronic hepatitis B.

    PubMed

    Chen, Yang; Luo, Yan; Huang, Wei; Hu, Die; Zheng, Rong-Qin; Cong, Shu-Zhen; Meng, Fan-Kun; Yang, Hong; Lin, Hong-Jun; Sun, Yan; Wang, Xiu-Yan; Wu, Tao; Ren, Jie; Pei, Shu-Fang; Zheng, Ying; He, Yun; Hu, Yu; Yang, Na; Yan, Hongmei

    2017-10-01

    Hepatic fibrosis is a common middle stage of the pathological processes of chronic liver diseases. Clinical intervention during the early stages of hepatic fibrosis can slow the development of liver cirrhosis and reduce the risk of developing liver cancer. Performing a liver biopsy, the gold standard for viral liver disease management, has drawbacks such as invasiveness and a relatively high sampling error rate. Real-time tissue elastography (RTE), one of the most recently developed technologies, might be a promising imaging technology because it is both noninvasive and provides accurate assessments of hepatic fibrosis. However, determining the stage of liver fibrosis from RTE images in a clinic is a challenging task. In this study, in contrast to the previous liver fibrosis index (LFI) method, which predicts the stage of diagnosis using RTE images and multiple regression analysis, we employed four classical classifiers (i.e., Support Vector Machine, Naïve Bayes, Random Forest and K-Nearest Neighbor) to build a decision-support system to improve the hepatitis B stage diagnosis performance. Eleven RTE image features were obtained from 513 subjects who underwent liver biopsies in this multicenter collaborative research. The experimental results showed that the adopted classifiers significantly outperformed the LFI method and that the Random Forest (RF) classifier provided the highest average accuracy among the four machine-learning algorithms. This result suggests that sophisticated machine-learning methods can be powerful tools for evaluating the stage of hepatic fibrosis and show promise for clinical applications. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. Diagnosis of response and non-response to dry eye treatment using infrared thermography images

    NASA Astrophysics Data System (ADS)

    Acharya, U. Rajendra; Tan, Jen Hong; Vidya, S.; Yeo, Sharon; Too, Cheah Loon; Lim, Wei Jie Eugene; Chua, Kuang Chua; Tong, Louis

    2014-11-01

    The dry eye treatment outcome depends on the assessment of clinical relevance of the treatment effect. The potential approach to assess the clinical relevance of the treatment is to identify the symptoms responders and non-responders to the given treatments using the responder analysis. In our work, we have performed the responder analysis to assess the clinical relevance effect of the dry eye treatments namely, hot towel, EyeGiene®, and Blephasteam® twice daily and 12 min session of Lipiflow®. Thermography is performed at week 0 (baseline), at weeks 4 and 12 after treatment. The clinical parameters such as, change in the clinical irritations scores, tear break up time (TBUT), corneal staining and Schirmer's symptoms tests values are used to obtain the responders and non-responders groups. We have obtained the infrared thermography images of dry eye symptoms responders and non-responders to the three types of warming treatments. The energy, kurtosis, skewness, mean, standard deviation, and various entropies namely Shannon, Renyi and Kapur are extracted from responders and non-responders thermograms. The extracted features are ranked based on t-values. These ranked features are fed to the various classifiers to get the highest performance using minimum features. We have used decision tree (DT), K nearest neighbour (KNN), Naive Bayesian (NB) and support vector machine (SVM) to classify the features into responder and non-responder classes. We have obtained an average accuracy of 99.88%, sensitivity of 99.7% and specificity of 100% using KNN classifier using ten-fold cross validation.

  6. Psoriasis skin biopsy image segmentation using Deep Convolutional Neural Network.

    PubMed

    Pal, Anabik; Garain, Utpal; Chandra, Aditi; Chatterjee, Raghunath; Senapati, Swapan

    2018-06-01

    Development of machine assisted tools for automatic analysis of psoriasis skin biopsy image plays an important role in clinical assistance. Development of automatic approach for accurate segmentation of psoriasis skin biopsy image is the initial prerequisite for developing such system. However, the complex cellular structure, presence of imaging artifacts, and uneven staining variation make the task challenging. This paper presents a pioneering attempt for automatic segmentation of psoriasis skin biopsy images. Several deep neural architectures are tried for segmenting psoriasis skin biopsy images. Deep models are used for classifying the super-pixels generated by Simple Linear Iterative Clustering (SLIC) and the segmentation performance of these architectures is compared with the traditional hand-crafted feature based classifiers built on popularly used classifiers like K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Random Forest (RF). A U-shaped Fully Convolutional Neural Network (FCN) is also used in an end to end learning fashion where input is the original color image and the output is the segmentation class map for the skin layers. An annotated real psoriasis skin biopsy image data set of ninety (90) images is developed and used for this research. The segmentation performance is evaluated with two metrics namely, Jaccard's Coefficient (JC) and the Ratio of Correct Pixel Classification (RCPC) accuracy. The experimental results show that the CNN based approaches outperform the traditional hand-crafted feature based classification approaches. The present research shows that a practical system can be developed for machine assisted analysis of psoriasis disease. Copyright © 2018 Elsevier B.V. All rights reserved.

  7. Automated cloud classification using a ground based infra-red camera and texture analysis techniques

    NASA Astrophysics Data System (ADS)

    Rumi, Emal; Kerr, David; Coupland, Jeremy M.; Sandford, Andrew P.; Brettle, Mike J.

    2013-10-01

    Clouds play an important role in influencing the dynamics of local and global weather and climate conditions. Continuous monitoring of clouds is vital for weather forecasting and for air-traffic control. Convective clouds such as Towering Cumulus (TCU) and Cumulonimbus clouds (CB) are associated with thunderstorms, turbulence and atmospheric instability. Human observers periodically report the presence of CB and TCU clouds during operational hours at airports and observatories; however such observations are expensive and time limited. Robust, automatic classification of cloud type using infrared ground-based instrumentation offers the advantage of continuous, real-time (24/7) data capture and the representation of cloud structure in the form of a thermal map, which can greatly help to characterise certain cloud formations. The work presented here utilised a ground based infrared (8-14 μm) imaging device mounted on a pan/tilt unit for capturing high spatial resolution sky images. These images were processed to extract 45 separate textural features using statistical and spatial frequency based analytical techniques. These features were used to train a weighted k-nearest neighbour (KNN) classifier in order to determine cloud type. Ground truth data were obtained by inspection of images captured simultaneously from a visible wavelength colour camera at the same installation, with approximately the same field of view as the infrared device. These images were classified by a trained cloud observer. Results from the KNN classifier gave an encouraging success rate. A Probability of Detection (POD) of up to 90% with a Probability of False Alarm (POFA) as low as 16% was achieved.
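
    The abstract above names a weighted k-nearest neighbour classifier without stating the weighting scheme. The sketch below assumes one common choice, inverse-distance weighting with Euclidean distance, purely for illustration; the function name, the `eps` guard, and the toy labels are hypothetical.

```python
from collections import defaultdict

import numpy as np


def weighted_knn_predict(X_train, y_train, x, k=3, eps=1e-12):
    """Classify x by an inverse-distance-weighted vote of its k nearest neighbours."""
    X_train = np.asarray(X_train, dtype=float)
    dists = np.linalg.norm(X_train - np.asarray(x, dtype=float), axis=1)
    nearest = np.argsort(dists)[:k]
    votes = defaultdict(float)
    for i in nearest:
        # Closer neighbours get larger weights; eps avoids division by zero.
        votes[y_train[i]] += 1.0 / (dists[i] + eps)
    return max(votes, key=votes.get)


# One very close "a" sample outweighs two distant "b" samples,
# although an unweighted 3-NN majority vote would pick "b".
label = weighted_knn_predict([[0.0], [1.0], [1.2]], ["a", "b", "b"], [0.1], k=3)
```

    With real texture features, the columns would normally be scaled to comparable ranges first so that no single feature dominates the distance.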

  8. Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification

    PubMed Central

    Wen, Tingxi; Zhang, Zhongnan

    2017-01-01

    Abstract In this paper, genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance and intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in the extraction of effective features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy. PMID:28489789

  9. Gene-Based Multiclass Cancer Diagnosis with Class-Selective Rejections

    PubMed Central

    Jrad, Nisrine; Grall-Maës, Edith; Beauseroy, Pierre

    2009-01-01

    Supervised learning of microarray data has received much attention in recent years. Multiclass cancer diagnosis, based on selected gene profiles, is used as an adjunct of clinical diagnosis. However, supervised diagnosis may hinder patient care, add expense or confound a result. To avoid such misleading outcomes, a multiclass cancer diagnosis with class-selective rejection is proposed. It rejects some patients from one, some, or all classes in order to ensure a higher reliability while reducing time and expense costs. Moreover, this classifier takes into account asymmetric penalties dependent on each class and on each wrong or partially correct decision. It is based on ν-1-SVM coupled with its regularization path and minimizes a general loss function defined in the class-selective rejection scheme. The state-of-the-art multiclass algorithms can be considered as a particular case of the proposed algorithm where the number of decisions is given by the classes and the loss function is defined by the Bayesian risk. Two experiments are carried out in the Bayesian and the class-selective rejection frameworks. Five gene-selected datasets are used to assess the performance of the proposed method. Results are discussed and accuracies are compared with those computed by the Naive Bayes, Nearest Neighbor, Linear Perceptron, Multilayer Perceptron, and Support Vector Machines classifiers. PMID:19584932

  10. Computer-aided diagnosis system: a Bayesian hybrid classification method.

    PubMed

    Calle-Alonso, F; Pérez, C J; Arias-Nicolás, J P; Martín, J

    2013-10-01

    A novel method to classify multi-class biomedical objects is presented. The method is based on a hybrid approach which combines pairwise comparison, Bayesian regression and the k-nearest neighbor technique. It can be applied in a fully automatic way or in a relevance feedback framework. In the latter case, the information obtained from both an expert and the automatic classification is iteratively used to improve the results until a certain accuracy level is achieved, then, the learning process is finished and new classifications can be automatically performed. The method has been applied in two biomedical contexts by following the same cross-validation schemes as in the original studies. The first one refers to cancer diagnosis, leading to an accuracy of 77.35% versus 66.37%, originally obtained. The second one considers the diagnosis of pathologies of the vertebral column. The original method achieves accuracies ranging from 76.5% to 96.7%, and from 82.3% to 97.1% in two different cross-validation schemes. Even with no supervision, the proposed method reaches 96.71% and 97.32% in these two cases. By using a supervised framework the achieved accuracy is 97.74%. Furthermore, all abnormal cases were correctly classified. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  11. Classification of protein quaternary structure by functional domain composition

    PubMed Central

    Yu, Xiaojing; Wang, Chuan; Li, Yixue

    2006-01-01

    Background The number and the arrangement of subunits that form a protein are referred to as quaternary structure. Quaternary structure is an important protein attribute that is closely related to its function. Proteins with quaternary structure are called oligomeric proteins. Oligomeric proteins are involved in various biological processes, such as metabolism, signal transduction, and chromosome replication. Thus, it is highly desirable to develop some computational methods to automatically classify the quaternary structure of proteins from their sequences. Results To explore this problem, we adopted an approach based on the functional domain composition of proteins. Every protein was represented by a vector calculated from the domains in the PFAM database. The nearest neighbor algorithm (NNA) was used for classifying the quaternary structure of proteins from this information. The jackknife cross-validation test was performed on the non-redundant protein dataset in which the sequence identity was less than 25%. The overall success rate obtained is 75.17%. Additionally, to demonstrate the effectiveness of this method, we predicted the proteins in an independent dataset and achieved an overall success rate of 84.11%. Conclusion Compared with the amino acid composition method and Blast, the results indicate that the domain composition approach may be a more effective and promising high-throughput method in dealing with this complicated problem in bioinformatics. PMID:16584572
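
    The evaluation described above, a nearest neighbor classifier scored with a jackknife (leave-one-out) test, can be sketched in a few lines. The abstract does not state which distance was used on the domain-composition vectors, so Euclidean distance is an assumption here, and the function name is hypothetical.

```python
import numpy as np


def jackknife_1nn_accuracy(X, y):
    """Leave-one-out success rate of a 1-nearest-neighbor classifier."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=2))
    # A held-out sample must not be matched with itself.
    np.fill_diagonal(d, np.inf)
    predicted = y[d.argmin(axis=1)]
    return float((predicted == y).mean())
```

    Each sample is classified by its nearest neighbor among all the others, and the fraction classified correctly is the overall success rate.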

  12. Automatic detection of red lesions in digital color fundus photographs.

    PubMed

    Niemeijer, Meindert; van Ginneken, Bram; Staal, Joes; Suttorp-Schulten, Maria S A; Abràmoff, Michael D

    2005-05-01

    The robust detection of red lesions in digital color fundus photographs is a critical step in the development of automated screening systems for diabetic retinopathy. In this paper, a novel red lesion detection method is presented based on a hybrid approach, combining prior works by Spencer et al. (1996) and Frame et al. (1998) with two important new contributions. The first contribution is a new red lesion candidate detection system based on pixel classification. Using this technique, vasculature and red lesions are separated from the background of the image. After removal of the connected vasculature the remaining objects are considered possible red lesions. Second, an extensive number of new features are added to those proposed by Spencer-Frame. The detected candidate objects are classified using all features and a k-nearest neighbor classifier. An extensive evaluation was performed on a test set composed of images representative of those normally found in a screening set. When determining whether an image contains red lesions the system achieves a sensitivity of 100% at a specificity of 87%. The method is compared with several different automatic systems and is shown to outperform them all. Performance is close to that of a human expert examining the images for the presence of red lesions.

  13. A Novel Feature Level Fusion for Heart Rate Variability Classification Using Correntropy and Cauchy-Schwarz Divergence.

    PubMed

    Goshvarpour, Ateke; Goshvarpour, Atefeh

    2018-04-30

    Heart rate variability (HRV) analysis has become a widely used tool for monitoring pathological and psychological states in medical applications. In a typical classification problem, information fusion is a process whereby the effective combination of the data can achieve a more accurate system. The purpose of this article was to provide an accurate algorithm for classifying HRV signals in various psychological states. Therefore, a novel feature level fusion approach was proposed. First, using the theory of information, two similarity indicators of the signal were extracted, including correntropy and Cauchy-Schwarz divergence. Applying probabilistic neural network (PNN) and k-nearest neighbor (kNN), the performance of each index in the classification of meditators and non-meditators HRV signals was appraised. Then, three fusion rules, including division, product, and weighted sum rules were used to combine the information of both similarity measures. For the first time, we propose an algorithm to define the weights of each feature based on the statistical p-values. The performance of HRV classification using combined features was compared with the non-combined features. Overall, an accuracy of 100% was obtained for discriminating all states. The results showed the strong ability and proficiency of division and weighted sum rules in the improvement of the classifier accuracies.

  14. Enhancing the Discrimination Ability of a Gas Sensor Array Based on a Novel Feature Selection and Fusion Framework.

    PubMed

    Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia

    2018-06-12

    In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. Firstly, we put forward an efficient feature selection method based on the separability and the dissimilarity to determine the feature selection order for each type of feature when increasing the dimension of selected feature subsets. Secondly, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for different types of features. Finally, in the process of establishing features fusion, we come up with a classification dominance feature fusion strategy which conducts an effective basic feature. Experimental results on two datasets show that the recognition rates of Database I and Database II achieve 97.5% and 80.11%, respectively, when k = 1 for KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The novel feature selection method proposed in this paper can effectively select feature subsets that are conducive to the classification, while the feature fusion framework can fuse various features which describe the different characteristics of sensor signals, for enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing drift effect.

  15. Combining PubMed knowledge and EHR data to develop a weighted bayesian network for pancreatic cancer prediction.

    PubMed

    Zhao, Di; Weng, Chunhua

    2011-10-01

    In this paper, we propose a novel method that combines PubMed knowledge and Electronic Health Records to develop a weighted Bayesian Network Inference (BNI) model for pancreatic cancer prediction. We selected 20 common risk factors associated with pancreatic cancer and used PubMed knowledge to weigh the risk factors. A keyword-based algorithm was developed to extract and classify PubMed abstracts into three categories that represented positive, negative, or neutral associations between each risk factor and pancreatic cancer. Then we designed a weighted BNI model by adding the normalized weights into a conventional BNI model. We used this model to extract the EHR values for patients with or without pancreatic cancer, which then enabled us to calculate the prior probabilities for the 20 risk factors in the BNI. The software iDiagnosis was designed to use this weighted BNI model for predicting pancreatic cancer. In an evaluation using a case-control dataset, the weighted BNI model significantly outperformed the conventional BNI and two other classifiers (k-Nearest Neighbor and Support Vector Machine). We conclude that the weighted BNI using PubMed knowledge and EHR data shows remarkable accuracy improvement over existing representative methods for pancreatic cancer prediction. Copyright © 2011 Elsevier Inc. All rights reserved.

  16. Combining PubMed Knowledge and EHR Data to Develop a Weighted Bayesian Network for Pancreatic Cancer Prediction

    PubMed Central

    Zhao, Di; Weng, Chunhua

    2011-01-01

    In this paper, we propose a novel method that combines PubMed knowledge and Electronic Health Records to develop a weighted Bayesian Network Inference (BNI) model for pancreatic cancer prediction. We selected 20 common risk factors associated with pancreatic cancer and used PubMed knowledge to weigh the risk factors. A keyword-based algorithm was developed to extract and classify PubMed abstracts into three categories that represented positive, negative, or neutral associations between each risk factor and pancreatic cancer. Then we designed a weighted BNI model by adding the normalized weights into a conventional BNI model. We used this model to extract the EHR values for patients with or without pancreatic cancer, which then enabled us to calculate the prior probabilities for the 20 risk factors in the BNI. The software iDiagnosis was designed to use this weighted BNI model for predicting pancreatic cancer. In an evaluation using a case-control dataset, the weighted BNI model significantly outperformed the conventional BNI and two other classifiers (k-Nearest Neighbor and Support Vector Machine). We conclude that the weighted BNI using PubMed knowledge and EHR data shows remarkable accuracy improvement over existing representative methods for pancreatic cancer prediction. PMID:21642013

  17. Jersey number detection in sports video for athlete identification

    NASA Astrophysics Data System (ADS)

    Ye, Qixiang; Huang, Qingming; Jiang, Shuqiang; Liu, Yang; Gao, Wen

    2005-07-01

    Athlete identification is important for sport video content analysis since users often care about the video clips with their preferred athletes. In this paper, we propose a method for athlete identification by combining the segmentation, tracking and recognition procedures into a coarse-to-fine scheme for jersey number (digital characters on sport shirt) detection. Firstly, image segmentation is employed to separate the jersey number regions from their background, and size/pipe-like attributes of digital characters are used to filter out candidates. Then, a K-NN (K nearest neighbor) classifier is employed to classify a candidate into a digit in "0-9" or negative. In the recognition procedure, we use the Zernike moment features, which are invariant to rotation and scale, for digital shape recognition. Synthetic training samples with different fonts are used to represent the pattern of digital characters with non-rigid deformation. Once a character candidate is detected, an SSD (smallest square distance)-based tracking procedure is started. The recognition procedure is performed every several frames in the tracking process. After tracking tens of frames, the overall recognition results are combined by a voting procedure to determine whether a candidate is a true jersey number or not. Experiments on several types of sports video show encouraging results.
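
    The final voting step over per-frame recognition results might look like the sketch below. The function name `vote`, the "negative" label string, and the `min_share` acceptance threshold are assumptions made for illustration; the abstract does not specify how votes are counted.

```python
from collections import Counter

def vote(frame_labels, min_share=0.5):
    """Combine per-frame K-NN outputs for one tracked candidate.

    Returns the winning digit label, or None if the candidate is rejected.
    min_share is a hypothetical threshold on the winner's share of votes.
    """
    if not frame_labels:
        return None
    label, count = Counter(frame_labels).most_common(1)[0]
    if label == "negative" or count / len(frame_labels) < min_share:
        return None
    return label

# Hypothetical recognition results collected while tracking one candidate.
decision = vote(["7", "7", "negative", "7", "1"])
```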

  18. Pattern recognition for passive polarimetric data using nonparametric classifiers

    NASA Astrophysics Data System (ADS)

    Thilak, Vimal; Saini, Jatinder; Voelz, David G.; Creusere, Charles D.

    2005-08-01

    Passive polarization based imaging is a useful tool in computer vision and pattern recognition. A passive polarization imaging system forms a polarimetric image from the reflection of ambient light that contains useful information for computer vision tasks such as object detection (classification) and recognition. Applications of polarization based pattern recognition include material classification and automatic shape recognition. In this paper, we present two target detection algorithms for images captured by a passive polarimetric imaging system. The proposed detection algorithms are based on Bayesian decision theory. In these approaches, an object can belong to any one of a given number of classes, and classification involves making decisions that minimize the average probability of making incorrect decisions. This minimum is achieved by assigning an object to the class that maximizes the a posteriori probability. Computing a posteriori probabilities requires estimates of class conditional probability density functions (likelihoods) and prior probabilities. A probabilistic neural network (PNN), which is a nonparametric method that can compute Bayes optimal boundaries, and a k-nearest neighbor (KNN) classifier are used for density estimation and classification. The proposed algorithms are applied to polarimetric image data gathered in the laboratory with a liquid crystal-based system. The experimental results validate the effectiveness of the above algorithms for target detection from polarimetric data.
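
    The decision rule described here (estimate class-conditional densities, weight by priors, pick the maximum a posteriori class) can be sketched with a Gaussian Parzen-window estimate, which is the usual core of a PNN. The smoothing width `sigma`, the class names, and the toy feature vectors are assumptions for illustration and do not come from the paper.

```python
import math

def pnn_classify(train, query, sigma=0.5, priors=None):
    """Assign query to the class with the largest prior-weighted density.

    train maps each class label to a list of feature vectors.
    """
    scores = {}
    for label, samples in train.items():
        # Parzen-window estimate of the class-conditional density. The
        # Gaussian normalising constant is identical for every class
        # (same sigma, same dimension), so it is omitted.
        density = sum(
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, query))
                     / (2.0 * sigma ** 2))
            for x in samples
        ) / len(samples)
        prior = priors[label] if priors else 1.0 / len(train)
        scores[label] = prior * density
    return max(scores, key=scores.get), scores

# Hypothetical two-dimensional polarimetric features.
train = {
    "target": [[1.0, 1.0], [1.2, 0.9]],
    "background": [[-1.0, -1.0], [-0.8, -1.1]],
}
label, scores = pnn_classify(train, [0.9, 1.1])
```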

  19. Primary mass discrimination of high energy cosmic rays using PNN and k-NN methods

    NASA Astrophysics Data System (ADS)

    Rastegarzadeh, G.; Nemati, M.

    2018-02-01

    Probabilistic neural network (PNN) and k-Nearest Neighbors (k-NN) methods are widely used data classification techniques. In this paper, these two methods have been used to classify Extensive Air Shower (EAS) data sets simulated with the CORSIKA code for three primary cosmic rays. The primaries are proton, oxygen and iron nuclei at energies of 100 TeV-10 PeV. This study follows up on investigations into the primary cosmic ray mass sensitive observables. We propose a new approach for measuring the mass sensitive observables of EAS in order to improve the primary mass separation. In this work, the EAS observables have been measured locally rather than in total. The relationship between the number of observables included in the classification methods and the prediction accuracy has also been investigated. We have shown that local measurements and the inclusion of more mass sensitive observables in the classification process can improve the classification quality, and that the muon and electron energy densities can be considered as primary mass sensitive observables in primary mass classification. It must also be noted that this study is performed for the Tehran observation level without considering the details of any particular EAS detection array.

  20. Effective and extensible feature extraction method using genetic algorithm-based frequency-domain feature search for epileptic EEG multiclassification.

    PubMed

    Wen, Tingxi; Zhang, Zhongnan

    2017-05-01

    In this paper, a genetic algorithm-based frequency-domain feature search (GAFDS) method is proposed for the electroencephalogram (EEG) analysis of epilepsy. In this method, frequency-domain features are first searched and then combined with nonlinear features. Subsequently, these features are selected and optimized to classify EEG signals. The extracted features are analyzed experimentally. The features extracted by GAFDS show remarkable independence, and they are superior to the nonlinear features in terms of the ratio of interclass distance to intraclass distance. Moreover, the proposed feature search method can search for features of instantaneous frequency in a signal after Hilbert transformation. The classification results achieved using these features are reasonable; thus, GAFDS exhibits good extensibility. Multiple classical classifiers (i.e., k-nearest neighbor, linear discriminant analysis, decision tree, AdaBoost, multilayer perceptron, and Naïve Bayes) achieve satisfactory classification accuracies by using the features generated by the GAFDS method and the optimized feature selection. The accuracies for 2-classification and 3-classification problems may reach up to 99% and 97%, respectively. Results of several cross-validation experiments illustrate that GAFDS is effective in extracting features for EEG classification. Therefore, the proposed feature selection and optimization model can improve classification accuracy.

  1. 77 FR 66555 - Final Flood Elevation Determinations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-11-06

    .... [caret] Mean Sea Level, rounded to the nearest 0.1 meter. ADDRESSES City of Chiefland Maps are available... American Vertical Datum. Depth in feet above ground. [caret] Mean Sea Level, rounded to the nearest 0.1... feet above ground. [caret] Mean Sea Level, rounded to the nearest 0.1 meter. ADDRESSES City of...

  2. 75 FR 31347 - Proposed Flood Elevation Determinations

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-06-03

    ... Datum. [caret] Mean Sea Level, rounded to the nearest 0.1 meter. ** BFEs to be changed include the... in feet above ground. [caret] Mean Sea Level, rounded to the nearest 0.1 meter. ** BFEs to be changed... in feet above ground. [caret] Mean Sea Level, rounded to the nearest 0.1 meter. ** BFEs to be changed...

  3. 23 CFR 750.704 - Statutory requirements.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ...-traveled way and within 660 feet of the nearest edge of the right-of-way, and those additional signs beyond... nearest edge of the right-of-way within areas adjacent to the Interstate and Federal-aid Primary Systems... the nearest edge of the right-of-way within areas adjacent to the Interstate and Federal-aid Primary...

  4. Relationship between neighbor number and vibrational spectra in disordered colloidal clusters with attractive interactions

    NASA Astrophysics Data System (ADS)

    Yunker, Peter J.; Zhang, Zexin; Gratale, Matthew; Chen, Ke; Yodh, A. G.

    2013-03-01

    We study connections between vibrational spectra and average nearest neighbor number in disordered clusters of colloidal particles with attractive interactions. Measurements of displacement covariances between particles in each cluster permit calculation of the stiffness matrix, which contains effective spring constants linking pairs of particles. From the cluster stiffness matrix, we derive vibrational properties of corresponding "shadow" glassy clusters, with the same geometric configuration and interactions as the "source" cluster but without damping. Here, we investigate the stiffness matrix to elucidate the origin of the correlations between the median frequency of cluster vibrational modes and average number of nearest neighbors in the cluster. We find that the mean confining stiffness of particles in a cluster, i.e., the ensemble-averaged sum of nearest neighbor spring constants, correlates strongly with average nearest neighbor number, and even more strongly with median frequency. Further, we find that the average oscillation frequency of an individual particle is set by the total stiffness of its nearest neighbor bonds; this average frequency increases as the square root of the nearest neighbor bond stiffness, in a manner similar to the simple harmonic oscillator.
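
    The harmonic-oscillator analogy in the last sentence can be written out as follows. The notation is assumed here rather than taken from the paper: m is a particle mass, k_{ij} is the effective spring constant between particle i and neighbor j, and nn(i) is the set of nearest neighbors of particle i.

```latex
\omega = \sqrt{\frac{k}{m}}
\qquad\Longrightarrow\qquad
\bar{\omega}_i \;\propto\; \sqrt{\sum_{j \in \mathrm{nn}(i)} k_{ij}}
```

    The left-hand relation is the textbook frequency of a single mass on a spring; the right-hand one restates the reported square-root scaling of a particle's average frequency with the total stiffness of its nearest-neighbor bonds.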

  5. Unconventional quantum antiferromagnetism with a fourfold symmetry breaking in a spin-1/2 Ising-Heisenberg pentagonal chain

    NASA Astrophysics Data System (ADS)

    Karľová, Katarína; Strečka, Jozef; Lyra, Marcelo L.

    2018-03-01

    The spin-1/2 Ising-Heisenberg pentagonal chain is investigated with use of the star-triangle transformation, which establishes a rigorous mapping equivalence with the effective spin-1/2 Ising zigzag ladder. The investigated model has a rich ground-state phase diagram including two spectacular quantum antiferromagnetic ground states with a fourfold broken symmetry. It is demonstrated that these long-period quantum ground states arise due to a competition between the effective next-nearest-neighbor and nearest-neighbor interactions of the corresponding spin-1/2 Ising zigzag ladder. The concurrence is used to quantify the bipartite entanglement between the nearest-neighbor Heisenberg spin pairs, which are quantum-mechanically entangled in two quantum ground states with or without spontaneously broken symmetry. The pair correlation functions between the nearest-neighbor Heisenberg spins as well as the next-nearest-neighbor and nearest-neighbor Ising spins were investigated with the aim to bring insight into how a relevant short-range order manifests itself at low enough temperatures. It is shown that the specific heat displays temperature dependencies with either one or two separate round maxima.

  6. [Detection of tibial condylar fractures using 3D imaging with a mobile image amplifier (Siemens ISO-C-3D): Comparison with plain films and spiral CT].

    PubMed

    Kotsianos, D; Rock, C; Wirth, S; Linsenmaier, U; Brandl, R; Fischer, T; Euler, E; Mutschler, W; Pfeifer, K J; Reiser, M

    2002-01-01

    To analyze a prototype mobile C-arm 3D image amplifier in the detection and classification of experimental tibial condylar fractures with multiplanar reconstructions (MPR). Human knee specimens (n = 22) with tibial condylar fractures were examined with a prototype C-arm (ISO-C-3D, Siemens AG), plain films (CR) and spiral CT (CT). The motorized C-arm provides fluoroscopic images during a 190-degree orbital rotation, computing a 119 mm data cube. From these 3D data sets, MP reconstructions were obtained. All images were evaluated by four independent readers for the detection and assessment of fracture lines. All fractures were classified according to the Müller AO classification. To confirm the results, the specimens were finally surgically dissected. 97% of the tibial condylar fractures were easily seen and correctly classified according to the Müller AO classification on MP reconstruction of the ISO-C-3D. There is no significant difference between ISO-C and CT in detection and correct classification of fractures, but ISO-C-3D is significantly better than CR. The evaluation of fractures with the ISO-C is better than with plain films alone and comparable to CT scans. The three-dimensional reconstruction of the ISO-C can provide important information which cannot be obtained from plain films. The ISO-C-3D may be useful in planning operative reconstructions and evaluating surgical results in orthopaedic surgery of the limbs.

  7. Spectral properties near the Mott transition in the two-dimensional t-J model with next-nearest-neighbor hopping

    NASA Astrophysics Data System (ADS)

    Kohno, Masanori

    2018-05-01

    The single-particle spectral properties of the two-dimensional t-J model with next-nearest-neighbor hopping are investigated near the Mott transition by using cluster perturbation theory. The spectral features are interpreted by considering the effects of the next-nearest-neighbor hopping on the shift of the spectral-weight distribution of the two-dimensional t-J model. Various anomalous features observed in hole-doped and electron-doped high-temperature cuprate superconductors are collectively explained in the two-dimensional t-J model with next-nearest-neighbor hopping near the Mott transition.

  8. Phase transitions in the antiferromagnetic Ising model on a body-centered cubic lattice with interactions between next-to-nearest neighbors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Murtazaev, A. K.; Ramazanov, M. K.; Kassan-Ogly, F. A.

    2015-01-15

    Phase transitions in the antiferromagnetic Ising model on a body-centered cubic lattice are studied on the basis of the replica algorithm by the Monte Carlo method and histogram analysis taking into account the interaction of next-to-nearest neighbors. The phase diagram of the dependence of the critical temperature on the intensity of interaction of the next-to-nearest neighbors is constructed. It is found that a second-order phase transition is realized in this model in the investigated interval of the intensities of interaction of next-to-nearest neighbors.

  9. Scalable Nearest Neighbor Algorithms for High Dimensional Data.

    PubMed

    Muja, Marius; Lowe, David G

    2014-11-01

    For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

  10. 5 CFR 532.317 - Use of data from the nearest similar area.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS PREVAILING RATE SYSTEMS Determining Rates for Principal Types of Positions § 532.317 Use of data... 5 Administrative Personnel 1 2013-01-01 2013-01-01 false Use of data from the nearest similar area... subpart, analyze and use the acceptable data from the nearest similar wage area together with the data...

  11. 5 CFR 532.317 - Use of data from the nearest similar area.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS PREVAILING RATE SYSTEMS Determining Rates for Principal Types of Positions § 532.317 Use of data... 5 Administrative Personnel 1 2012-01-01 2012-01-01 false Use of data from the nearest similar area... subpart, analyze and use the acceptable data from the nearest similar wage area together with the data...

  12. yaImpute: An R package for kNN imputation

    Treesearch

    Nicholas L. Crookston; Andrew O. Finley

    2008-01-01

    This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus to writing the yaImpute is a growing interest in nearest neighbor...

  13. 40 CFR 63.775 - Reporting requirements.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... source is located in an urban cluster with 10,000 people or more; the distance in miles to the nearest urbanized area boundary if the source is not located in an urban cluster with 10,000 people or more; and the names of the nearest urban cluster with 10,000 people or more and nearest urbanized area. (ii...

  14. Estimating forest attribute parameters for small areas using nearest neighbors techniques

    Treesearch

    Ronald E. McRoberts

    2012-01-01

    Nearest neighbors techniques have become extremely popular, particularly for use with forest inventory data. With these techniques, a population unit prediction is calculated as a linear combination of observations for a selected number of population units in a sample that are most similar, or nearest, in a space of ancillary variables to the population unit requiring...

  15. A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality.

    PubMed

    Wang, Xueyi

    2012-02-08

    The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
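
    The two stages described in this abstract can be sketched as below. This is an illustrative reading of the idea, not the author's implementation: the function names, the plain Lloyd k-means with a fixed iteration count, and the choice to store each cluster's members sorted by decreasing distance to the center are all assumptions. The pruning test relies on the triangle inequality d(q, p) >= d(q, c) - d(p, c) for query q, point p and cluster center c.

```python
import heapq
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_center(p, centers):
    return min(range(len(centers)), key=lambda j: dist(p, centers[j]))

def build(points, n_clusters, n_iter=10, seed=0):
    """Buildup stage: cluster the training points with plain Lloyd k-means."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, n_clusters)]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest_center(p, centers)].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(g) for col in zip(*g)]
    clusters = [[] for _ in centers]
    for p in points:
        j = nearest_center(p, centers)
        clusters[j].append((dist(p, centers[j]), p))
    for members in clusters:
        # Farthest-from-center first, so the lower bound below only grows.
        members.sort(key=lambda t: t[0], reverse=True)
    return centers, clusters

def search(centers, clusters, query, k):
    """Searching stage: exact k nearest neighbors with triangle pruning."""
    heap = []        # max-heap via negated distances: (-distance, point)
    n_computed = 0   # number of query-to-point distances actually evaluated
    d_centers = [dist(query, c) for c in centers]
    for j in sorted(range(len(centers)), key=lambda i: d_centers[i]):
        for d_pc, p in clusters[j]:
            # Lower bound on d(query, p); once it reaches the current k-th
            # best distance, no remaining member of this cluster can help.
            if len(heap) == k and d_centers[j] - d_pc >= -heap[0][0]:
                break
            d = dist(query, p)
            n_computed += 1
            if len(heap) < k:
                heapq.heappush(heap, (-d, p))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, p))
    return sorted((-nd, p) for nd, p in heap), n_computed

# Hypothetical toy data: two well-separated groups of 2-D points.
points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
          [5.0, 5.0], [5.1, 4.9], [4.8, 5.2]]
centers, clusters = build(points, n_clusters=2)
neighbours, n_computed = search(centers, clusters, [0.05, 0.05], k=2)
```

    The returned `n_computed` makes it easy to compare against a brute-force scan, which always evaluates one distance per training point.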

  16. Social aggregation in pea aphids: experiment and random walk modeling.

    PubMed

    Nilsen, Christa; Paige, John; Warner, Olivia; Mayhew, Benjamin; Sutley, Ryan; Lam, Matthew; Bernoff, Andrew J; Topaz, Chad M

    2013-01-01

    From bird flocks to fish schools and ungulate herds to insect swarms, social biological aggregations are found across the natural world. An ongoing challenge in the mathematical modeling of aggregations is to strengthen the connection between models and biological data by quantifying the rules that individuals follow. We model aggregation of the pea aphid, Acyrthosiphon pisum. Specifically, we conduct experiments to track the motion of aphids walking in a featureless circular arena in order to deduce individual-level rules. We observe that each aphid transitions stochastically between a moving and a stationary state. Moving aphids follow a correlated random walk. The probabilities of motion state transitions, as well as the random walk parameters, depend strongly on distance to an aphid's nearest neighbor. For large nearest neighbor distances, when an aphid is essentially isolated, its motion is ballistic with aphids moving faster, turning less, and being less likely to stop. In contrast, for short nearest neighbor distances, aphids move more slowly, turn more, and are more likely to become stationary; this behavior constitutes an aggregation mechanism. From the experimental data, we estimate the state transition probabilities and correlated random walk parameters as a function of nearest neighbor distance. With the individual-level model established, we assess whether it reproduces the macroscopic patterns of movement at the group level. To do so, we consider three distributions, namely distance to nearest neighbor, angle to nearest neighbor, and percentage of population moving at any given time. For each of these three distributions, we compare our experimental data to the output of numerical simulations of our nearest neighbor model, and of a control model in which aphids do not interact socially. Our stochastic, social nearest neighbor model reproduces salient features of the experimental data that are not captured by the control.

  17. Defining greed.

    PubMed

    Seuntjens, Terri G; Zeelenberg, Marcel; Breugelmans, Seger M; van de Ven, Niels

    2015-08-01

    Although greed is both hailed as the motor of economic growth and blamed as the cause of economic crises, very little is known about its psychological underpinnings. Five studies explored lay conceptualizations of greed among U.S. and Dutch participants using a prototype analysis. Study 1 identified features related to greed. Study 2 determined the importance of these features; the most important features were classified as central (e.g., self-interested, never satisfied), whereas less important features were classified as peripheral (e.g., ambition, addiction). Subsequently, we found that, compared to peripheral features, participants recalled central features better (Study 3), faster (Study 4), and these central features were more present in real-life episodes of greed (Study 5). These findings provide a better understanding of the elements that make up the experience of greed and provide insights into how greed can be manipulated and measured in future research. © 2014 The British Psychological Society.

  18. Quantum realization of the nearest-neighbor interpolation method for FRQI and NEQR

    NASA Astrophysics Data System (ADS)

    Sang, Jianzhi; Wang, Shen; Niu, Xiamu

    2016-01-01

    This paper is concerned with the feasibility of the classical nearest-neighbor interpolation based on flexible representation of quantum images (FRQI) and novel enhanced quantum representation (NEQR). Firstly, the feasibility of the classical image nearest-neighbor interpolation for quantum images of FRQI and NEQR is proven. Then, by defining the halving operation and by making use of quantum rotation gates, the concrete quantum circuit of the nearest-neighbor interpolation for FRQI is designed for the first time. Furthermore, quantum circuit of the nearest-neighbor interpolation for NEQR is given. The merit of the proposed NEQR circuit lies in their low complexity, which is achieved by utilizing the halving operation and the quantum oracle operator. Finally, in order to further improve the performance of the former circuits, new interpolation circuits for FRQI and NEQR are presented by using Control-NOT gates instead of a halving operation. Simulation results show the effectiveness of the proposed circuits.
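
    For reference, the classical nearest-neighbor interpolation that these quantum circuits realize can be sketched as follows. Only the classical operation is shown; the FRQI and NEQR circuits are not reproduced. The function name and the floor-based index mapping (one common convention among several) are assumptions.

```python
def nn_resize(image, new_h, new_w):
    """Resize a 2-D list of pixel values by nearest-neighbor sampling.

    Each output pixel copies the source pixel at the floor-scaled index.
    """
    old_h, old_w = len(image), len(image[0])
    out = []
    for r in range(new_h):
        src_r = r * old_h // new_h
        out.append([image[src_r][c * old_w // new_w] for c in range(new_w)])
    return out

# Doubling a 2x2 image repeats each pixel in a 2x2 block.
scaled = nn_resize([[1, 2], [3, 4]], 4, 4)
```

    Halving an image with this mapping keeps every second pixel, which is the classical counterpart of the halving operation mentioned in the abstract.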

  19. Constructing a logical, regular axis topology from an irregular topology

    DOEpatents

    Faraj, Daniel A.

    2014-07-22

    Constructing a logical regular topology from an irregular topology including, for each axial dimension and recursively, for each compute node in a subcommunicator until returning to a first node: adding to a logical line of the axial dimension a neighbor specified in a nearest neighbor list; calling the added compute node; determining, by the called node, whether any neighbor in the node's nearest neighbor list is available to add to the logical line; if a neighbor in the called compute node's nearest neighbor list is available to add to the logical line, adding, by the called compute node to the logical line, any neighbor in the called compute node's nearest neighbor list for the axial dimension not already added to the logical line; and, if no neighbor in the called compute node's nearest neighbor list is available to add to the logical line, returning to the calling compute node.

  20. Constructing a logical, regular axis topology from an irregular topology

    DOEpatents

    Faraj, Daniel A.

    2014-07-01

    Constructing a logical regular topology from an irregular topology including, for each axial dimension and recursively, for each compute node in a subcommunicator until returning to a first node: adding to a logical line of the axial dimension a neighbor specified in a nearest neighbor list; calling the added compute node; determining, by the called node, whether any neighbor in the node's nearest neighbor list is available to add to the logical line; if a neighbor in the called compute node's nearest neighbor list is available to add to the logical line, adding, by the called compute node to the logical line, any neighbor in the called compute node's nearest neighbor list for the axial dimension not already added to the logical line; and, if no neighbor in the called compute node's nearest neighbor list is available to add to the logical line, returning to the calling compute node.

  1. Classification of three-state Hamiltonians solvable by the coordinate Bethe ansatz

    NASA Astrophysics Data System (ADS)

    Crampé, N.; Frappat, L.; Ragoucy, E.

    2013-10-01

    We classify ‘all’ Hamiltonians with rank 1 symmetry and nearest-neighbour interactions, acting on a periodic three-state spin chain, and solvable through (generalization of) the coordinate Bethe ansatz (CBA). In this way we obtain four multi-parametric extensions of the known 19-vertex Hamiltonians (such as Zamolodchikov-Fateev, Izergin-Korepin and Bariev Hamiltonians). Apart from the 19-vertex Hamiltonians, there exist 17-vertex and 14-vertex Hamiltonians that cannot be viewed as subcases of the 19-vertex ones. In the case of 17-vertex Hamiltonians, we get a generalization of the genus 5 special branch found by Martins, plus three new ones. We also get two 14-vertex Hamiltonians. We solve all these Hamiltonians using CBA, and provide their spectrum, eigenfunctions and Bethe equations. Special attention is given to provide the specifications of our multi-parametric Hamiltonians that give back known Hamiltonians.

  2. The HI Content of Galaxies as a Function of Local Density and Large-Scale Environment

    NASA Astrophysics Data System (ADS)

    Thoreen, Henry; Cantwell, Kelly; Maloney, Erin; Cane, Thomas; Brough Morris, Theodore; Flory, Oscar; Raskin, Mark; Crone-Odekon, Mary; ALFALFA Team

    2017-01-01

    We examine the HI content of galaxies as a function of environment, based on a catalogue of 41527 galaxies that are part of the 70% complete Arecibo Legacy Fast-ALFA (ALFALFA) survey. We use nearest-neighbor methods to characterize local environment, and a modified version of the algorithm developed for the Galaxy and Mass Assembly (GAMA) survey to classify large-scale environment as group, filament, tendril, or void. We compare the HI content in these environments using statistics that include both HI detections and the upper limits on detections from ALFALFA. The large size of the sample allows us to statistically compare the HI content in different environments for early-type galaxies as well as late-type galaxies. This work is supported by NSF grants AST-1211005 and AST-1637339, the Skidmore Faculty-Student Summer Research program, and the Schupf Scholars program.

  3. Identifying Dyscalculia Symptoms Related to Magnocellular Reasoning Using Smartphones.

    PubMed

    Knudsen, Greger Siem; Babic, Ankica

    2016-01-01

    This paper presents a study that has developed a mobile software application for assisting diagnosis of learning disabilities in mathematics, called dyscalculia, and measuring correlations between dyscalculia symptoms and magnocellular reasoning. Usually, software aids for dyscalculic individuals are focused on both assisting diagnosis and teaching the material. The software developed in this study, however, maintains a specific focus on the former, and in the process attempts to capture alleged correlations between dyscalculia symptoms and possible underlying causes of the condition. Classification of symptoms is performed by a k-Nearest Neighbor algorithm classifying five parameters that evaluate the user's skills, returning the calculated performance in each category as well as the correlation strength between detected symptoms and magnocellular reasoning abilities. Expert evaluations have found the application to be appropriate and productive for its intended purpose, proving that mobile software is a suitable and valuable tool for assisting dyscalculia diagnosis and identifying root causes of developing the condition.

  4. Blind equalization and automatic modulation classification based on subspace for subcarrier MPSK optical communications

    NASA Astrophysics Data System (ADS)

    Chen, Dan; Guo, Lin-yuan; Wang, Chen-hao; Ke, Xi-zheng

    2017-07-01

    Equalization can compensate for channel distortion caused by channel multipath effects, and effectively improve convergence of the modulation constellation diagram in optical wireless systems. In this paper, the subspace blind equalization algorithm is used to preprocess the M-ary phase shift keying (MPSK) subcarrier modulation signal in the receiver. Mountain clustering is adopted to get the clustering centers of the MPSK modulation constellation diagram, and the modulation order is automatically identified through the k-nearest neighbor (KNN) classifier. The experiment has been done under four different weather conditions. Experimental results show that the convergence of the constellation diagram is improved effectively after using the subspace blind equalization algorithm, which means that the accuracy of modulation recognition is increased. The correct recognition rate of 16PSK can be up to 85% in any of the weather conditions considered in the paper. Meanwhile, the correct recognition rate is highest in cloudy conditions and lowest in heavy rain.

  5. Use of data mining to predict significant factors and benefits of bilateral cochlear implantation.

    PubMed

    Ramos-Miguel, Angel; Perez-Zaballos, Teresa; Perez, Daniel; Falcon, Juan Carlos; Ramos, Angel

    2015-11-01

    Data mining (DM) is a technique used to discover patterns and knowledge in large amounts of data. It uses artificial intelligence, automatic learning, statistics, databases, etc. In this study, DM was successfully used as a predictive tool to assess disyllabic speech test performance in bilateral implanted patients with a success rate above 90%. Sixty bilateral sequentially implanted adult patients were included in the study. The DM algorithms developed found correlations between unilateral medical records and audiological test results and bilateral performance by establishing relevant variables based on two DM techniques: the classifier and the estimation. The nearest neighbor algorithm was implemented in the first case, and linear regression in the second. The results showed that patients with unilateral disyllabic test results below 70% benefited the most from a bilateral implantation. Finally, it was observed that its benefits decrease as the inter-implant time increases.

  6. New nonlinear features for inspection, robotics, and face recognition

    NASA Astrophysics Data System (ADS)

    Casasent, David P.; Talukder, Ashit

    1999-10-01

    Classification of real-time X-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items (pistachio nuts). This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work, the MRDF is applied to standard features (rather than iconic data). The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC (receiver operating characteristic) data. Other applications of these new feature spaces in robotics and face recognition are also noted.

  7. New feature extraction method for classification of agricultural products from x-ray images

    NASA Astrophysics Data System (ADS)

    Talukder, Ashit; Casasent, David P.; Lee, Ha-Woon; Keagy, Pamela M.; Schatzki, Thomas F.

    1999-01-01

    Classification of real-time x-ray images of randomly oriented touching pistachio nuts is discussed. The ultimate objective is the development of a system for automated non-invasive detection of defective product items on a conveyor belt. We discuss the extraction of new features that allow better discrimination between damaged and clean items. This feature extraction and classification stage is the new aspect of this paper; our new maximum representation and discriminating feature (MRDF) extraction method computes nonlinear features that are used as inputs to a new modified k nearest neighbor classifier. In this work the MRDF is applied to standard features. The MRDF is robust to various probability distributions of the input class and is shown to provide good classification and new ROC data.

  8. Circum-Arctic petroleum systems identified using decision-tree chemometrics

    USGS Publications Warehouse

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.; Gautier, D.L.

    2007-01-01

    Source- and age-related biomarker and isotopic data were measured for more than 1000 crude oil samples from wells and seeps collected above approximately 55°N latitude. A unique, multitiered chemometric (multivariate statistical) decision tree was created that allowed automated classification of 31 genetically distinct circum-Arctic oil families based on a training set of 622 oil samples. The method, which we call decision-tree chemometrics, uses principal components analysis and multiple tiers of K-nearest neighbor and SIMCA (soft independent modeling of class analogy) models to classify and assign confidence limits for newly acquired oil samples and source rock extracts. Geochemical data for each oil sample were also used to infer the age, lithology, organic matter input, depositional environment, and identity of its source rock. These results demonstrate the value of large petroleum databases where all samples were analyzed using the same procedures and instrumentation. Copyright © 2007. The American Association of Petroleum Geologists. All rights reserved.

  9. A new feature constituting approach to detection of vocal fold pathology

    NASA Astrophysics Data System (ADS)

    Hariharan, M.; Polat, Kemal; Yaacob, Sazali

    2014-08-01

    In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be excellent and reliable tools for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases are used: the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database. Four supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.

  10. Machine learning methods in chemoinformatics

    PubMed Central

    Mitchell, John B O

    2014-01-01

    Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183 PMID:25285160

  11. An Individual Finger Gesture Recognition System Based on Motion-Intent Analysis Using Mechanomyogram Signal

    PubMed Central

    Ding, Huijun; He, Qing; Zhou, Yongjin; Dan, Guo; Cui, Song

    2017-01-01

    Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation systems, and human–computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are first applied and evaluated for detecting the finger tapping frame. Based on the truncated signals, the wavelet packet transform (WPT) coefficients are calculated and compressed as the features, followed by a feature selection method that is able to improve the performance by optimizing the feature set. Finally, three popular classifiers, naive Bayes (NBC), K-nearest neighbor (KNN), and support vector machine (SVM), are applied and evaluated. Recognition accuracy of up to 94% is achieved. The design and the architecture of the system are presented with full system characterization results. PMID:29167655

  12. Automated segmentation algorithm for detection of changes in vaginal epithelial morphology using optical coherence tomography

    NASA Astrophysics Data System (ADS)

    Chitchian, Shahab; Vincent, Kathleen L.; Vargas, Gracie; Motamedi, Massoud

    2012-11-01

    We have explored the use of optical coherence tomography (OCT) as a noninvasive tool for assessing the toxicity of topical microbicides, products used to prevent HIV, by monitoring the integrity of the vaginal epithelium. A novel feature-based segmentation algorithm using a nearest-neighbor classifier was developed to monitor changes in the morphology of vaginal epithelium. The two-step automated algorithm yielded OCT images with a clearly defined epithelial layer, enabling differentiation of normal and damaged tissue. The algorithm was robust in that it was able to discriminate the epithelial layer from underlying stroma as well as residual microbicide product on the surface. This segmentation technique for OCT images has the potential to be readily adaptable to the clinical setting for noninvasively defining the boundaries of the epithelium, enabling quantifiable assessment of microbicide-induced damage in vaginal tissue.

  13. Evaluating Descriptive Metrics of the Human Cone Mosaic

    PubMed Central

    Cooper, Robert F.; Wilk, Melissa A.; Tarima, Sergey; Carroll, Joseph

    2016-01-01

    Purpose: To evaluate how metrics used to describe the cone mosaic change in response to simulated photoreceptor undersampling (i.e., cell loss or misidentification). Methods: Using an adaptive optics ophthalmoscope, we acquired images of the cone mosaic from the center of fixation to 10° along the temporal, superior, inferior, and nasal meridians in 20 healthy subjects. Regions of interest (n = 1780) were extracted at regular intervals along each meridian. Cone mosaic geometry was assessed using a variety of metrics: density, density recovery profile distance (DRPD), nearest neighbor distance (NND), intercell distance (ICD), farthest neighbor distance (FND), percentage of six-sided Voronoi cells, nearest neighbor regularity (NNR), number of neighbors regularity (NoNR), and Voronoi cell area regularity (VCAR). The “performance” of each metric was evaluated by determining the level of simulated loss necessary to obtain 80% statistical power. Results: Of the metrics assessed, NND and DRPD were the least sensitive to undersampling, classifying mosaics that lost 50% of their coordinates as indistinguishable from normal. The NoNR was the most sensitive, detecting a significant deviation from normal with only a 10% cell loss. Conclusions: The robustness of cone spacing metrics makes them unsuitable for reliably detecting small deviations from normal or for tracking small changes in the mosaic over time. In contrast, regularity metrics are more sensitive to diffuse loss and, therefore, better suited for detecting such changes, provided the fraction of misidentified cells is minimal. Combining metrics with a variety of sensitivities may provide a more complete picture of the integrity of the photoreceptor mosaic. PMID:27273598
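
    As an illustration of two of the metrics named above, the following is a minimal sketch that computes nearest neighbor distance (NND) and a regularity index from cone coordinates. It assumes cone centers are given as (x, y) pairs and that regularity is the mean NND divided by its standard deviation; the abstract does not state the exact definition, so that ratio is an assumption, as are the function names and the toy lattice.

```python
import math
import statistics

def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point (brute force)."""
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists

def nnd_and_regularity(points):
    """Return (mean NND, mean/SD of NND). The ratio is an assumed definition."""
    d = nearest_neighbor_distances(points)
    mean_d = statistics.mean(d)
    sd_d = statistics.pstdev(d)
    regularity = mean_d / sd_d if sd_d > 0 else math.inf
    return mean_d, regularity

# Hypothetical coordinates: a perfect 3 x 3 square lattice with unit spacing.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
mean_nnd, nnr = nnd_and_regularity(grid)

# Simulated undersampling: drop one corner cell and recompute.
thinned = [p for p in grid if p != (0.0, 0.0)]
mean_nnd_thinned, _ = nnd_and_regularity(thinned)
```

    On this toy lattice the mean NND is 1.0 both before and after one cell is dropped, which is in line with the abstract's observation that spacing metrics respond weakly to undersampling.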

  14. Feature weight estimation for gene selection: a local hyperlinear learning approach

    PubMed Central

    2014-01-01

    Background: Modeling high-dimensional data involving thousands of variables is particularly important for gene expression profiling experiments; nevertheless, it remains a challenging task. One of the challenges is to implement an effective method for selecting a small set of relevant genes, buried in high-dimensional irrelevant noise. RELIEF is a popular and widely used approach for feature selection owing to its low computational cost and high accuracy. However, RELIEF-based methods suffer from instability, especially in the presence of noisy and/or high-dimensional outliers. Results: We propose an innovative feature weighting algorithm, called LHR, to select informative genes from highly noisy data. LHR is based on RELIEF for feature weighting using classical margin maximization. The key idea of LHR is to estimate the feature weights through local approximation rather than global measurement, which is typically used in existing methods. The weights obtained by our method are very robust in terms of degradation of noisy features, even those with vast dimensions. To demonstrate the performance of our method, extensive experiments involving classification tests have been carried out on both synthetic and real microarray benchmark datasets by combining the proposed technique with standard classifiers, including the support vector machine (SVM), k-nearest neighbor (KNN), hyperplane k-nearest neighbor (HKNN), linear discriminant analysis (LDA) and naive Bayes (NB). Conclusion: Experiments on both synthetic and real-world datasets demonstrate the superior performance of the proposed feature selection method combined with supervised learning in three aspects: 1) high classification accuracy, 2) excellent robustness to noise and 3) good stability with respect to various classification algorithms. PMID:24625071
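
    For orientation, the following is a minimal sketch of the classic two-class RELIEF weight update that LHR builds on. It is not the LHR algorithm itself. It assumes features are already scaled to [0, 1], uses Manhattan distance, and visits every sample once; the function name and the four-sample dataset are hypothetical.

```python
def relief_weights(X, y):
    """Classic RELIEF: reward features that separate a sample from its
    nearest miss and penalise features that separate it from its nearest hit."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance over all features.
        return sum(abs(p - q) for p, q in zip(a, b))

    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: dist(X[i], X[j]))
        miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            weights[f] += (abs(X[i][f] - X[miss][f])
                           - abs(X[i][f] - X[hit][f])) / n
    return weights

# Hypothetical data: feature 0 tracks the class, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [1.0, 0.2], [0.9, 0.8]]
y = [0, 0, 1, 1]
w = relief_weights(X, y)
```

    Here the informative feature receives a positive weight and the noisy one a negative weight, so ranking by weight would select feature 0.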

  15. Where Do Patients With Cancer in Iowa Receive Radiation Therapy?

    PubMed Central

    Ward, Marcia M.; Ullrich, Fred; Matthews, Kevin; Rushton, Gerard; Tracy, Roger; Goldstein, Michael A.; Bajorin, Dean F.; Kosty, Michael P.; Bruinooge, Suanna S.; Hanley, Amy; Jacobson, Geraldine M.; Lynch, Charles F.

    2014-01-01

    Purpose: Multiple studies have shown survival benefits in patients with cancer treated with radiation therapy, but access to treatment facilities has been found to limit its use. This study was undertaken to examine access issues in Iowa and determine a methodology for conducting a similar national analysis. Patients and Methods: All Iowa residents who received radiation therapy regardless of where they were diagnosed or treated were identified through the Iowa Cancer Registry (ICR). Radiation oncologists were identified through the Iowa Physician Information System (IPIS). Radiation facilities were identified through IPIS and classified using the Commission on Cancer accreditation standard. Results: Between 2004 and 2010, 113,885 invasive cancers in 106,603 patients, 28.5% of whom received radiation treatment, were entered in ICR. Mean and median travel times were 25.8 and 20.1 minutes, respectively, to the nearest facility but 42.4 and 29.1 minutes, respectively, to the patient's chosen treatment facility. Multivariable analysis predicting travel time showed significant relationships for disease site, age, residence location, and facility category. Residents of small and isolated rural towns traveled nearly 3× longer than urban residents to receive radiation therapy, as did patients using certain categories of facilities. Conclusion: Half of Iowa patients could reach their nearest facility in 20 minutes, but instead, they traveled 30 minutes on average to receive treatment. The findings identified certain groups of patients with cancer who chose more distant facilities. However, other groups of patients with cancer, namely those residing in rural areas, had less choice, and some had to travel considerably farther to radiation facilities than urban patients. PMID:24443730

  16. Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds

    PubMed Central

    Chen, Chin-Hsing; Huang, Wen-Tzeng; Tan, Tan-Hsu; Chang, Cheng-Chun; Chang, Yuan-Jen

    2015-01-01

    A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds; however, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician’s subjective diagnostic experience. This study has developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering, to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle, for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications. PMID:26053756
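
    The frame-level classification and the 30% warning rule described above can be sketched as follows. This is an illustration only: the two-dimensional vectors stand in for MFCC-derived cluster centers, and the labels, the choice k = 3, and the function names are assumptions rather than details from the study.

```python
import math
from collections import Counter

def knn_predict(train, labels, x, k=3):
    """Majority label among the k training vectors closest to x."""
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], x))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def should_warn(frame_labels, threshold=0.30):
    """True when the share of abnormal frames exceeds the threshold."""
    abnormal = sum(1 for lab in frame_labels if lab == "abnormal")
    return abnormal / len(frame_labels) > threshold

# Hypothetical reference vectors (stand-ins for K-means centers of MFCC frames).
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["normal"] * 3 + ["abnormal"] * 3

# Hypothetical frames from one recording.
frames = [(0.2, 0.1), (5.1, 5.2), (0.5, 0.5), (5.5, 5.5), (0.1, 0.9)]
predicted = [knn_predict(train, labels, f) for f in frames]
warn = should_warn(predicted)
```

    Two of the five frames are classified as abnormal (40%), so the rule would raise a warning for this recording.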

  17. Differential diagnosis of CT focal liver lesions using texture features, feature selection and ensemble driven classifiers.

    PubMed

    Mougiakakou, Stavroula G; Valavanis, Ioannis K; Nikita, Alexandra; Nikita, Konstantina S

    2007-09-01

    The aim of the present study is to define an optimally performing computer-aided diagnosis (CAD) architecture for the classification of liver tissue from non-enhanced computed tomography (CT) images into normal liver (C1), hepatic cyst (C2), hemangioma (C3), and hepatocellular carcinoma (C4). To this end, various CAD architectures, based on texture features and ensembles of classifiers (ECs), are comparatively assessed. A number of regions of interest (ROIs) corresponding to C1-C4 have been defined by experienced radiologists in non-enhanced liver CT images. For each ROI, five distinct sets of texture features were extracted using first order statistics, spatial gray level dependence matrix, gray level difference method, Laws' texture energy measures, and fractal dimension measurements. Two different ECs were constructed and compared. The first one consists of five multilayer perceptron neural networks (NNs), each using as input one of the computed texture feature sets or its reduced version after genetic algorithm-based feature selection. The second EC comprised five different primary classifiers, namely one multilayer perceptron NN, one probabilistic NN, and three k-nearest neighbor classifiers, each fed with the combination of the five texture feature sets or their reduced versions. The final decision of each EC was extracted by using appropriate voting schemes, while bootstrap re-sampling was utilized in order to estimate the generalization ability of the CAD architectures based on the available relatively small-sized data set. The best mean classification accuracy (84.96%) is achieved by the second EC using a fused feature set and the weighted voting scheme. The fused feature set was obtained after appropriate feature selection applied to specific subsets of the original feature set. The comparative assessment of the various CAD architectures shows that combining three types of classifiers with a voting scheme, fed with identical feature sets obtained after appropriate feature selection and fusion, may result in an accurate system able to assist differential diagnosis of focal liver lesions from non-enhanced CT images.
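
    A weighted voting scheme of the kind mentioned above can be sketched in a few lines. The abstract does not say how the weights were derived, so the weights below (for example, per-classifier validation accuracy) and the function name are purely hypothetical.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Sum each classifier's weight onto the label it predicts and
    return the label with the largest total."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)

# Five hypothetical primary classifiers voting on one ROI.
ensemble_label = weighted_vote(
    ["C4", "C3", "C3", "C4", "C1"],
    [0.90, 0.60, 0.55, 0.70, 0.50],
)

# A case where weighting overrides the plain majority (two votes for C3).
override_label = weighted_vote(["C3", "C3", "C4"], [0.3, 0.3, 0.9])
```

    In the second call the single high-weight classifier outvotes the two low-weight ones, which is the practical difference between weighted and plurality voting.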

  18. Deep learning approach for classifying, detecting and predicting photometric redshifts of quasars in the Sloan Digital Sky Survey stripe 82

    NASA Astrophysics Data System (ADS)

    Pasquet-Itam, J.; Pasquet, J.

    2018-04-01

    We have applied a convolutional neural network (CNN) to classify and detect quasars in the Sloan Digital Sky Survey Stripe 82 and also to predict the photometric redshifts of quasars. The network takes the variability of objects into account by converting light curves into images. The width of the images, noted w, corresponds to the five magnitudes ugriz and the height of the images, noted h, represents the date of the observation. The CNN provides good results since its precision is 0.988 for a recall of 0.90, compared to a precision of 0.985 for the same recall with a random forest classifier. Moreover 175 new quasar candidates are found with the CNN considering a fixed recall of 0.97. The combination of probabilities given by the CNN and the random forest makes good performance even better with a precision of 0.99 for a recall of 0.90. For the redshift predictions, the CNN presents excellent results which are higher than those obtained with a feature extraction step and different classifiers (K-nearest-neighbors, a support vector machine, a random forest and a Gaussian process classifier). Indeed, the accuracy of the CNN within |Δz| < 0.1 can reach 78.09%, within |Δz| < 0.2 reaches 86.15%, within |Δz| < 0.3 reaches 91.2% and the value of root mean square (rms) is 0.359. The performance of the KNN is lower in all three |Δz| regions: its accuracy within |Δz| < 0.1, |Δz| < 0.2, and |Δz| < 0.3 is 73.72%, 82.46%, and 90.09% respectively, and its rms amounts to 0.395. So the CNN successfully reduces the dispersion and the catastrophic redshifts of quasars. This new method is very promising for the future of big databases such as the Large Synoptic Survey Telescope. A table of the candidates is only available at the CDS via anonymous ftp to http://cdsarc.u-strasbg.fr (http://130.79.128.5) or via http://cdsarc.u-strasbg.fr/viz-bin/qcat?J/A+A/611/A97
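
    The redshift figures quoted above (fraction of objects within a |Δz| threshold, and rms) can be computed as in the sketch below. It assumes Δz is the plain difference between predicted and reference redshift; the paper may normalise differently, and the sample values are invented for illustration.

```python
import math

def redshift_metrics(z_true, z_pred, thresholds=(0.1, 0.2, 0.3)):
    """Return ({threshold: fraction with |dz| < threshold}, rms of dz)."""
    dz = [p - t for t, p in zip(z_true, z_pred)]
    n = len(dz)
    within = {th: sum(abs(d) < th for d in dz) / n for th in thresholds}
    rms = math.sqrt(sum(d * d for d in dz) / n)
    return within, rms

# Hypothetical reference and predicted redshifts for four objects.
z_true = [1.0, 2.0, 0.5, 3.0]
z_pred = [1.05, 2.15, 0.5, 2.5]
within, rms = redshift_metrics(z_true, z_pred)
```

    For these four objects the fractions are 0.5, 0.75 and 0.75 for the three thresholds; the single large error of 0.5 dominates the rms, which is how a few catastrophic redshifts inflate that statistic.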

  19. Artificial intelligence techniques applied to the development of a decision–support system for diagnosing celiac disease

    PubMed Central

    Tenório, Josceli Maria; Hummel, Anderson Diniz; Cohrs, Frederico Molina; Sdepanian, Vera Lucia; Pisa, Ivan Torres; de Fátima Marin, Heimar

    2013-01-01

    Background: Celiac disease (CD) is a difficult-to-diagnose condition because of its multiple clinical presentations and symptoms shared with other diseases. Gold-standard diagnostic confirmation of suspected CD is achieved by biopsying the small intestine. Objective: To develop a clinical decision–support system (CDSS) integrated with an automated classifier to recognize CD cases, by selecting from experimental models developed using artificial intelligence techniques. Methods: A web-based system was designed for constructing a retrospective database that included 178 clinical cases for training. Tests were run on 270 automated classifiers available in Weka 3.6.1 using five artificial intelligence techniques, namely decision trees, Bayesian inference, k-nearest neighbor algorithm, support vector machines and artificial neural networks. The parameters evaluated were accuracy, sensitivity, specificity and area under the ROC curve (AUC). AUC was used as a criterion for selecting the CDSS algorithm. A testing database was constructed including 38 clinical CD cases for CDSS evaluation. The diagnoses suggested by CDSS were compared with those made by physicians during patient consultations. Results: The most accurate method during the training phase was the averaged one-dependence estimator (AODE) algorithm (a Bayesian classifier), which showed accuracy 80.0%, sensitivity 0.78, specificity 0.80 and AUC 0.84. This classifier was integrated into the web-based decision–support system. The gold-standard validation of CDSS achieved accuracy of 84.2% and k = 0.68 (p < 0.0001) with good agreement. The same accuracy was achieved in the comparison between the physician’s diagnostic impression and the gold standard k = 0.64 (p < 0.0001). There was moderate agreement between the physician’s diagnostic impression and CDSS k = 0.46 (p = 0.0008). Conclusions: The study results suggest that CDSS could be used to help in diagnosing CD, since the algorithm tested achieved excellent accuracy in differentiating possible positive from negative CD diagnoses. This study may contribute towards the development of a computer-assisted environment to support CD diagnosis. PMID:21917512
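
    The evaluation measures listed above follow directly from a 2 x 2 confusion matrix. The sketch below assumes that the agreement statistic reported as k is Cohen's kappa, which the abstract does not state explicitly; the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from counts."""
    n = tp + fp + fn + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Agreement expected by chance, from the row and column marginals.
    p_yes = ((tp + fp) / n) * ((tp + fn) / n)
    p_no = ((fn + tn) / n) * ((fp + tn) / n)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }

# Hypothetical counts for a classifier compared against a gold standard.
metrics = binary_metrics(tp=20, fp=5, fn=5, tn=20)
```

    With these counts accuracy, sensitivity and specificity are all 0.8 while kappa is about 0.6, showing how kappa discounts the agreement that would be expected by chance.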

  20. Batch Effect Confounding Leads to Strong Bias in Performance Estimates Obtained by Cross-Validation

    PubMed Central

    Delorenzi, Mauro

    2014-01-01

    Background: With the large amount of biological data that is currently publicly available, many investigators combine multiple data sets to increase the sample size and potentially also the power of their analyses. However, technical differences (“batch effects”) as well as differences in sample composition between the data sets may significantly affect the ability to draw generalizable conclusions from such studies. Focus: The current study focuses on the construction of classifiers, and the use of cross-validation to estimate their performance. In particular, we investigate the impact of batch effects and differences in sample composition between batches on the accuracy of the classification performance estimate obtained via cross-validation. The focus on estimation bias is a main difference compared to previous studies, which have mostly focused on the predictive performance and how it relates to the presence of batch effects. Data: We work on simulated data sets. To have realistic intensity distributions, we use real gene expression data as the basis for our simulation. Random samples from this expression matrix are selected and assigned to group 1 (e.g., ‘control’) or group 2 (e.g., ‘treated’). We introduce batch effects and select some features to be differentially expressed between the two groups. We consider several scenarios for our study, most importantly different levels of confounding between groups and batch effects. Methods: We focus on well-known classifiers: logistic regression, Support Vector Machines (SVM), k-nearest neighbors (kNN) and Random Forests (RF). Feature selection is performed with the Wilcoxon test or the lasso. Parameter tuning and feature selection, as well as the estimation of the prediction performance of each classifier, is performed within a nested cross-validation scheme. The estimated classification performance is then compared to what is obtained when applying the classifier to independent data. PMID:24967636
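
    The nested cross-validation scheme mentioned above keeps parameter tuning inside each outer training fold, so the outer test fold never influences the chosen parameter. The sketch below illustrates this with a one-dimensional kNN classifier whose k is tuned in the inner loop. The fold layout, candidate k values, function names and data are all assumptions, and feature selection and batch effects are left out for brevity.

```python
from collections import Counter

def knn_predict(train_x, train_y, x, k):
    """Majority label among the k training values closest to x."""
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return Counter(train_y[i] for i in order[:k]).most_common(1)[0][0]

def folds(n, n_folds):
    """Split positions 0..n-1 into n_folds interleaved folds."""
    return [list(range(f, n, n_folds)) for f in range(n_folds)]

def accuracy(fit_idx, eval_idx, xs, ys, k):
    fit_x = [xs[i] for i in fit_idx]
    fit_y = [ys[i] for i in fit_idx]
    hits = sum(knn_predict(fit_x, fit_y, xs[i], k) == ys[i] for i in eval_idx)
    return hits / len(eval_idx)

def nested_cv(xs, ys, ks=(1, 3), outer=3, inner=2):
    n = len(xs)
    outer_scores = []
    for test_idx in folds(n, outer):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]

        def inner_score(k):
            # Inner loop: score k using outer-training samples only.
            total = 0.0
            for val_pos in folds(len(train_idx), inner):
                val_set = set(val_pos)
                val = [train_idx[p] for p in val_pos]
                fit = [train_idx[p] for p in range(len(train_idx))
                       if p not in val_set]
                total += accuracy(fit, val, xs, ys, k)
            return total / inner

        best_k = max(ks, key=inner_score)
        # Outer loop: evaluate the tuned classifier on the untouched fold.
        outer_scores.append(accuracy(train_idx, test_idx, xs, ys, best_k))
    return sum(outer_scores) / len(outer_scores)

# Hypothetical, well-separated two-group data.
xs = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 5.0, 5.1, 5.2, 5.3, 5.4, 5.5]
ys = [0] * 6 + [1] * 6
estimate = nested_cv(xs, ys)
```

    The value returned is the cross-validated performance estimate; the study's question is how far such an estimate drifts from performance on independent data when batches and groups are confounded.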

  1. Optimal number of features as a function of sample size for various classification rules.

    PubMed

    Hua, Jianping; Xiong, Zixiang; Lowey, James; Suh, Edward; Dougherty, Edward R

    2005-04-15

    Given the joint feature-label distribution, increasing the number of features always results in decreased classification error; however, this is not the case when a classifier is designed via a classification rule from sample data. Typically (but not always), for fixed sample size, the error of a designed classifier decreases and then increases as the number of features grows. The potential downside of using too many features is most critical for small samples, which are commonplace for gene-expression-based classifiers for phenotype discrimination. For fixed sample size and feature-label distribution, the issue is to find an optimal number of features. Since only in rare cases is there a known distribution of the error as a function of the number of features and sample size, this study employs simulation for various feature-label distributions and classification rules, and across a wide range of sample and feature-set sizes. To achieve the desired end, finding the optimal number of features as a function of sample size, it employs massively parallel computation. Seven classifiers are treated: 3-nearest-neighbor, Gaussian kernel, linear support vector machine, polynomial support vector machine, perceptron, regular histogram and linear discriminant analysis. Three Gaussian-based models are considered: linear, nonlinear and bimodal. In addition, real patient data from a large breast-cancer study is considered. To mitigate the combinatorial search for finding optimal feature sets, and to model the situation in which subsets of genes are co-regulated and correlation is internal to these subsets, we assume that the covariance matrix of the features is blocked, with each block corresponding to a group of correlated features. Altogether there are a large number of error surfaces for the many cases. These are provided in full on a companion website, which is meant to serve as a resource for those working with small-sample classification. For the companion website, please visit http://public.tgen.org/tamu/ofs/. Contact: e-dougherty@ee.tamu.edu.

  2. Artificial intelligence techniques applied to the development of a decision-support system for diagnosing celiac disease.

    PubMed

    Tenório, Josceli Maria; Hummel, Anderson Diniz; Cohrs, Frederico Molina; Sdepanian, Vera Lucia; Pisa, Ivan Torres; de Fátima Marin, Heimar

    2011-11-01

    Celiac disease (CD) is a difficult-to-diagnose condition because of its multiple clinical presentations and symptoms shared with other diseases. Gold-standard diagnostic confirmation of suspected CD is achieved by biopsying the small intestine. To develop a clinical decision-support system (CDSS) integrated with an automated classifier to recognize CD cases, by selecting from experimental models developed using intelligence artificial techniques. A web-based system was designed for constructing a retrospective database that included 178 clinical cases for training. Tests were run on 270 automated classifiers available in Weka 3.6.1 using five artificial intelligence techniques, namely decision trees, Bayesian inference, k-nearest neighbor algorithm, support vector machines and artificial neural networks. The parameters evaluated were accuracy, sensitivity, specificity and area under the ROC curve (AUC). AUC was used as a criterion for selecting the CDSS algorithm. A testing database was constructed including 38 clinical CD cases for CDSS evaluation. The diagnoses suggested by CDSS were compared with those made by physicians during patient consultations. The most accurate method during the training phase was the averaged one-dependence estimator (AODE) algorithm (a Bayesian classifier), which showed accuracy 80.0%, sensitivity 0.78, specificity 0.80 and AUC 0.84. This classifier was integrated into the web-based decision-support system. The gold-standard validation of CDSS achieved accuracy of 84.2% and k=0.68 (p<0.0001) with good agreement. The same accuracy was achieved in the comparison between the physician's diagnostic impression and the gold standard k=0. 64 (p<0.0001). There was moderate agreement between the physician's diagnostic impression and CDSS k=0.46 (p=0.0008). 
The study results suggest that CDSS could be used to help in diagnosing CD, since the algorithm tested achieved excellent accuracy in differentiating possible positive from negative CD diagnoses. This study may contribute towards the development of a computer-assisted environment to support CD diagnosis. Copyright © 2011 Elsevier Ireland Ltd. All rights reserved.
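
    The evaluation measures named in this record (accuracy, sensitivity, specificity, and an agreement statistic reported as k) can be sketched from a 2x2 confusion matrix. This is a minimal illustration, not the study's code: it assumes that k denotes Cohen's kappa, the function name binary_metrics is hypothetical, and the counts in the usage line are invented for illustration rather than taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from 2x2 counts.

    Assumes every denominator is non-zero (both classes present and
    chance agreement below 1); no guarding is done in this sketch.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = accuracy
    p_both_positive = ((tp + fp) / total) * ((tp + fn) / total)
    p_both_negative = ((fn + tn) / total) * ((fp + tn) / total)
    p_expected = p_both_positive + p_both_negative
    kappa = (p_observed - p_expected) / (1 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print({name: round(value, 3) for name, value in metrics.items()})
```

    Kappa is lower than raw accuracy whenever part of the agreement could have arisen by chance, which is one reason a record may report both.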

  3. Comparison of machine learning techniques to predict all-cause mortality using fitness data: the Henry Ford ExercIse Testing (FIT) project.

    PubMed

    Sakr, Sherif; Elshawi, Radwa; Ahmed, Amjad M; Qureshi, Waqas T; Brawner, Clinton A; Keteyian, Steven J; Blaha, Michael J; Al-Mallah, Mouaz H

    2017-12-19

    Prior studies have demonstrated that cardiorespiratory fitness (CRF) is a strong marker of cardiovascular health. Machine learning (ML) can enhance the prediction of outcomes through classification techniques that classify the data into predetermined categories. The aim of this study is to present an evaluation and comparison of how machine learning techniques can be applied to medical records of cardiorespiratory fitness and how the various techniques differ in terms of capabilities of predicting medical outcomes (e.g. mortality). We use data of 34,212 patients free of known coronary artery disease or heart failure who underwent clinician-referred exercise treadmill stress testing at Henry Ford Health Systems between 1991 and 2009 and had a complete 10-year follow-up. Seven machine learning classification techniques were evaluated: Decision Tree (DT), Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayesian Classifier (BC), Bayesian Network (BN), K-Nearest Neighbor (KNN) and Random Forest (RF). In order to handle the imbalanced dataset used, the Synthetic Minority Over-Sampling Technique (SMOTE) is used. Two sets of experiments have been conducted, with and without the SMOTE sampling technique. On average over different evaluation metrics, the SVM classifier has shown the lowest performance, while other models like BN, BC and DT performed better. The RF classifier has shown the best performance (AUC = 0.97) among all models trained using the SMOTE sampling. The results show that various ML techniques can vary significantly in terms of their performance for the different evaluation metrics. A more complex ML model also does not necessarily achieve higher prediction accuracy. The prediction performance of all models trained with SMOTE is much better than the performance of models trained without SMOTE.
The study shows the potential of machine learning methods for predicting all-cause mortality using cardiorespiratory fitness data.
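
    The oversampling step mentioned in this record can be illustrated with a short sketch of the basic SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority-class neighbours. This is a simplified sketch under stated assumptions, not the study's implementation. The function name smote_sketch, the default k=5 and the fixed seed are choices made here for illustration.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic points by interpolating between minority samples.

    minority: list of feature vectors (lists of floats), at least two rows.
    Illustrative only: brute-force neighbour search, no categorical features.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position on the segment from base to neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples, for illustration only.
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority_samples, n_synthetic=6, k=2)
print(len(new_points))
```

    Because every synthetic point lies between two existing minority samples, it stays inside the region already occupied by that class. Oversampling is normally applied to the training split only, so that evaluation data contains no synthetic records.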

  4. The Emotion Recognition System Based on Autoregressive Model and Sequential Forward Feature Selection of Electroencephalogram Signals

    PubMed Central

    Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie

    2014-01-01

    Electroencephalogram (EEG) is one of the useful biological signals to distinguish different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has attracted more attention from researchers, and several feature extraction methods and classifiers have been suggested to recognize emotions from EEG signals. In this research, we introduce an emotion recognition system using autoregressive (AR) model, sequential forward feature selection (SFS) and K-nearest neighbor (KNN) classifier using EEG signals during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on Levinson-Durbin's recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods based on SFS algorithm and Davies–Bouldin index are used in order to decrease the complexity of computing and redundancy of features; then, three different classifiers, including KNN, quadratic discriminant analysis and linear discriminant analysis, are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals of the available database for emotion analysis using physiological signals, which are recorded from 32 participants during 40 one-minute audio-visual inductions. According to the results, AR features are efficient to recognize emotional states from EEG signals, and KNN performs better than the two other classifiers in discriminating both two and three valence/arousal classes. The results also show that the SFS method improves accuracies by almost 10-15% as compared to Davies–Bouldin based feature selection. The best accuracies are 72.33% and 74.20% for two classes of valence and arousal and 61.10% and 65.16% for three classes, respectively. PMID:25298928
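
    The combination of sequential forward selection and a KNN classifier described above can be sketched as a greedy loop: starting from an empty set, repeatedly add the feature that most improves classifier accuracy. The sketch below is an illustration under stated assumptions, not the paper's pipeline. It scores candidate subsets by leave-one-out KNN accuracy (the abstract does not state the selection criterion), omits AR feature extraction, and uses hypothetical function names and toy data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training rows closest to query."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(x, y, features, k):
    """Leave-one-out KNN accuracy using only the given feature indices."""
    projected = [[row[f] for f in features] for row in x]
    hits = 0
    for i in range(len(projected)):
        rest_x = projected[:i] + projected[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_x, rest_y, projected[i], k) == y[i]
    return hits / len(projected)


def sequential_forward_selection(x, y, n_select, k=3):
    """Greedily add the feature that gives the best leave-one-out accuracy."""
    selected = []
    remaining = list(range(len(x[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda f: loo_accuracy(x, y, selected + [f], k))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): column 1 separates the classes, column 0 is noise.
features = [[5, 0.0], [1, 0.1], [4, 0.2], [2, 1.0], [5, 1.1], [1, 1.2]]
labels = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(features, labels, n_select=1, k=1))
```

    The greedy search evaluates far fewer subsets than an exhaustive search, at the cost of possibly missing features that are only useful in combination.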

  5. A two-step nearest neighbors algorithm using satellite imagery for predicting forest structure within species composition classes

    Treesearch

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques have been shown to be useful for predicting multiple forest attributes from forest inventory and Landsat satellite image data. However, in regions lacking good digital land cover information, nearest neighbors selected to predict continuous variables such as tree volume must be selected without regard to relevant categorical variables such...

  6. Earthquake Declustering via a Nearest-Neighbor Approach in Space-Time-Magnitude Domain

    NASA Astrophysics Data System (ADS)

    Zaliapin, I. V.; Ben-Zion, Y.

    2016-12-01

    We propose a new method for earthquake declustering based on nearest-neighbor analysis of earthquakes in the space-time-magnitude domain. The nearest-neighbor approach was recently applied to a variety of seismological problems that validate the general utility of the technique and reveal the existence of several different robust types of earthquake clusters. Notably, it was demonstrated that clustering associated with the largest earthquakes is statistically different from that of small-to-medium events. In particular, the characteristic bimodality of the nearest-neighbor distances that helps separate clustered and background events is often violated after the largest earthquakes in their vicinity, which is dominated by triggered events. This prevents using a simple threshold between the two modes of the nearest-neighbor distance distribution for declustering. The current study resolves this problem, hence extending the nearest-neighbor approach to the problem of earthquake declustering. The proposed technique is applied to seismicity of different areas in California (San Jacinto, Coso, Salton Sea, Parkfield, Ventura, Mojave, etc.), as well as to the global seismicity, to demonstrate its stability and efficiency in treating various clustering types. The results are compared with those of alternative declustering methods.

  7. A Smartphone App for Families With Preschool-Aged Children in a Public Nutrition Program: Prototype Development and Beta-Testing.

    PubMed

    Hull, Pamela; Emerson, Janice S; Quirk, Meghan E; Canedo, Juan R; Jones, Jessica L; Vylegzhanina, Violetta; Schmidt, Douglas C; Mulvaney, Shelagh A; Beech, Bettina M; Briley, Chiquita; Harris, Calvin; Husaini, Baqar A

    2017-08-02

    The Special Supplemental Nutrition Program for Women, Infants, and Children (WIC) in the United States provides free supplemental food and nutrition education to low-income mothers and children under age 5 years. Childhood obesity prevalence is higher among preschool children in the WIC program compared to other children, and WIC improves dietary quality among low-income children. The Children Eating Well (CHEW) smartphone app was developed in English and Spanish for WIC-participating families with preschool-aged children as a home-based intervention to reinforce WIC nutrition education and help prevent childhood obesity. This paper describes the development and beta-testing of the CHEW smartphone app. The objective of beta-testing was to test the CHEW app prototype with target users, focusing on usage, usability, and perceived barriers and benefits of the app. The goals of the CHEW app were to make the WIC shopping experience easier, maximize WIC benefit redemption, and improve parent snack feeding practices. The CHEW app prototype consisted of WIC Shopping Tools, including a barcode scanner and calculator tools for the cash value voucher for purchasing fruits and vegetables, and nutrition education focused on healthy snacks and beverages, including a Yummy Snack Gallery and Healthy Snacking Tips. Mothers of 63 black and Hispanic WIC-participating children ages 2 to 4 years tested the CHEW app prototype for 3 months and completed follow-up interviews. Study participants testing the app for 3 months used the app on average once a week for approximately 4 and a half minutes per session, although substantial variation was observed. Usage of specific features averaged at 1 to 2 times per month for shopping-related activities and 2 to 4 times per month for the snack gallery. 
Mothers classified as users rated the app's WIC Shopping Tools relatively high on usability and benefits, although variation in scores and qualitative feedback highlighted several barriers that need to be addressed. The Yummy Snack Gallery and Healthy Snacking Tips scored higher on usability than benefits, suggesting that the nutrition education components may have been appealing but too limited in scope and exposure. Qualitative feedback from mothers classified as non-users pointed to several important barriers that could preclude some WIC participants from using the app at all. The prototype study successfully demonstrated the feasibility of using the CHEW app prototype with mothers of WIC-enrolled black and Hispanic preschool-aged children, with moderate levels of app usage and moderate to high usability and benefits. Future versions with enhanced shopping tools and expanded nutrition content should be implemented in WIC clinics to evaluate adoption and behavioral outcomes. This study adds to the growing body of research focused on the application of technology-based interventions in the WIC program to promote program retention and childhood obesity prevention. ©Pamela Hull, Janice S Emerson, Meghan E Quirk, Juan R Canedo, Jessica L Jones, Violetta Vylegzhanina, Douglas C Schmidt, Shelagh A Mulvaney, Bettina M Beech, Chiquita Briley, Calvin Harris, Baqar A Husaini. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 02.08.2017.

  8. RNA-Seq analysis of isolate- and growth phase-specific differences in the global transcriptomes of enteropathogenic Escherichia coli prototype isolates

    PubMed Central

    Hazen, Tracy H.; Daugherty, Sean C.; Shetty, Amol; Mahurkar, Anup A.; White, Owen; Kaper, James B.; Rasko, David A.

    2015-01-01

    Enteropathogenic Escherichia coli (EPEC) are a leading cause of diarrheal illness among infants in developing countries. E. coli isolates classified as typical EPEC are identified by the presence of the locus of enterocyte effacement (LEE) and the bundle-forming pilus (BFP), and absence of the Shiga-toxin genes, while the atypical EPEC also encode LEE but do not encode BFP or Shiga-toxin. Comparative genomic analyses have demonstrated that EPEC isolates belong to diverse evolutionary lineages and possess lineage- and isolate-specific genomic content. To investigate whether this genomic diversity results in significant differences in global gene expression, we used an RNA sequencing (RNA-Seq) approach to characterize the global transcriptomes of the prototype typical EPEC isolates E2348/69, B171, C581-05, and the prototype atypical EPEC isolate E110019. The global transcriptomes were characterized during laboratory growth in two different media and three different growth phases, as well as during adherence of the EPEC isolates to human cells using in vitro tissue culture assays. Comparison of the global transcriptomes during these conditions was used to identify isolate- and growth phase-specific differences in EPEC gene expression. These analyses resulted in the identification of genes that encode proteins involved in survival and metabolism that were coordinately expressed with virulence factors. These findings demonstrate there are isolate- and growth phase-specific differences in the global transcriptomes of EPEC prototype isolates, and highlight the utility of comparative transcriptomics for identifying additional factors that are directly or indirectly involved in EPEC pathogenesis. PMID:26124752

  9. A chronic generalized bi-directional brain-machine interface.

    PubMed

    Rouse, A G; Stanslaski, S R; Cong, P; Jensen, R M; Afshar, P; Ullestad, D; Gupta, R; Molnar, G F; Moran, D W; Denison, T J

    2011-06-01

    A bi-directional neural interface (NI) system was designed and prototyped by incorporating a novel neural recording and processing subsystem into a commercial neural stimulator architecture. The NI system prototype leverages the system infrastructure from an existing neurostimulator to ensure reliable operation in a chronic implantation environment. In addition to providing predicate therapy capabilities, the device adds key elements to facilitate chronic research, such as four channels of electrocorticogram/local field potential amplification and spectral analysis, a three-axis accelerometer, algorithm processing, event-based data logging, and wireless telemetry for data uploads and algorithm/configuration updates. The custom-integrated micropower sensor and interface circuits facilitate extended operation in a power-limited device. The prototype underwent significant verification testing to ensure reliability, and meets the requirements for a class CF instrument per IEC-60601 protocols. The ability of the device system to process and aid in classifying brain states was preclinically validated using an in vivo non-human primate model for brain control of a computer cursor (i.e. brain-machine interface or BMI). The primate BMI model was chosen for its ability to quantitatively measure signal decoding performance from brain activity that is similar in both amplitude and spectral content to other biomarkers used to detect disease states (e.g. Parkinson's disease). A key goal of this research prototype is to help broaden the clinical scope and acceptance of NI techniques, particularly real-time brain state detection. These techniques have the potential to be generalized beyond motor prosthesis, and are being explored for unmet needs in other neurological conditions such as movement disorders, stroke and epilepsy.

  10. Machine learning-based patient specific prompt-gamma dose monitoring in proton therapy

    NASA Astrophysics Data System (ADS)

    Gueth, P.; Dauvergne, D.; Freud, N.; Létang, J. M.; Ray, C.; Testa, E.; Sarrut, D.

    2013-07-01

    Online dose monitoring in proton therapy is currently being investigated with prompt-gamma (PG) devices. PG emission was shown to be correlated with dose deposition. This relationship is mostly unknown under real conditions. We propose a machine learning approach based on simulations to create optimized treatment-specific classifiers that detect discrepancies between planned and delivered dose. Simulations were performed with the Monte-Carlo platform Gate/Geant4 for a spot-scanning proton therapy treatment and a PG camera prototype currently under investigation. The method first builds a learning set of perturbed situations corresponding to a range of patient translation. This set is then used to train a combined classifier using distal falloff and registered correlation measures. Classifier performances were evaluated using receiver operating characteristic curves and maximum associated specificity and sensitivity. A leave-one-out study showed that it is possible to detect discrepancies of 5 mm with specificity and sensitivity of 85% whereas using only distal falloff decreases the sensitivity down to 77% on the same data set. The proposed method could help to evaluate performance and to optimize the design of PG monitoring devices. It is generic: other learning sets of deviations, other measures and other types of classifiers could be studied to potentially reach better performance. At the moment, the main limitation lies in the computation time needed to perform the simulations.

  11. Monitoring industrial facilities using principles of integration of fiber classifier and local sensor networks

    NASA Astrophysics Data System (ADS)

    Korotaev, Valery V.; Denisov, Victor M.; Rodrigues, Joel J. P. C.; Serikova, Mariya G.; Timofeev, Andrey V.

    2015-05-01

    The paper deals with the creation of integrated monitoring systems. They combine fiber-optic classifiers and local sensor networks. These systems allow for the monitoring of complex industrial objects. Together with adjacent natural objects, they form the so-called geotechnical systems. An integrated monitoring system may include one or more spatially continuous fiber-optic classifiers based on optic fiber and one or more arrays of discrete measurement sensors, which are usually combined in sensor networks. Fiber-optic classifiers are already widely used for the control of hazardous extended objects (oil and gas pipelines, railways, high-rise buildings, etc.). To monitor local objects, discrete measurement sensors are generally used (temperature, pressure, inclinometers, strain gauges, accelerometers, sensors measuring the composition of impurities in the air, and many others). However, monitoring complex geotechnical systems requires the simultaneous use of continuous spatially distributed sensors based on fiber-optic cable and connected local discrete sensor networks. In fact, we are talking about integration of the two monitoring methods. This combination provides an additional way to create intelligent monitoring systems. Modes of operation of intelligent systems can automatically adapt to changing environmental conditions. For this purpose, context data received from one sensor (e.g., optical channel) may be used to change modes of work of other sensors within the same monitoring system. This work also presents experimental results of the prototype of the integrated monitoring system.

  12. Neural networks for simultaneous classification and parameter estimation in musical instrument control

    NASA Astrophysics Data System (ADS)

    Lee, Michael; Freed, Adrian; Wessel, David

    1992-08-01

    In this report we present our tools for prototyping adaptive user interfaces in the context of real-time musical instrument control. Characteristic of most human communication is the simultaneous use of classified events and estimated parameters. We have integrated a neural network object into the MAX language to explore adaptive user interfaces that consider these facets of human communication. By placing the neural processing in the context of a flexible real-time musical programming environment, we can rapidly prototype experiments on applications of adaptive interfaces and learning systems to musical problems. We have trained networks to recognize gestures from a Mathews radio baton, Nintendo Power Glove™, and MIDI keyboard gestural input devices. In one experiment, a network successfully extracted classification and attribute data from gestural contours transduced by a continuous space controller, suggesting their application in the interpretation of conducting gestures and musical instrument control. We discuss network architectures, low-level features extracted for the networks to operate on, training methods, and musical applications of adaptive techniques.

  13. Interactive classification and content-based retrieval of tissue images

    NASA Astrophysics Data System (ADS)

    Aksoy, Selim; Marchisio, Giovanni B.; Tusk, Carsten; Koperski, Krzysztof

    2002-11-01

    We describe a system for interactive classification and retrieval of microscopic tissue images. Our system models tissues in pixel, region and image levels. Pixel level features are generated using unsupervised clustering of color and texture values. Region level features include shape information and statistics of pixel level feature values. Image level features include statistics and spatial relationships of regions. To reduce the gap between low-level features and high-level expert knowledge, we define the concept of prototype regions. The system learns the prototype regions in an image collection using model-based clustering and density estimation. Different tissue types are modeled using spatial relationships of these regions. Spatial relationships are represented by fuzzy membership functions. The system automatically selects significant relationships from training data and builds models which can also be updated using user relevance feedback. A Bayesian framework is used to classify tissues based on these models. Preliminary experiments show that the spatial relationship models we developed provide a flexible and powerful framework for classification and retrieval of tissue images.

  14. Development of Diesel Engine Operated Forklift Truck for Explosive Gas Atmospheres

    NASA Astrophysics Data System (ADS)

    Vishwakarma, Rajendra Kumar; Singh, Arvind Kumar; Ahirwal, Bhagirath; Sinha, Amalendu

    2018-02-01

    For the present study, a prototype diesel engine operated Forklift truck of 2 t capacity is developed for explosive gas atmospheres. The parts of the Forklift truck are assessed against the risk of ignition of the explosive gases, vapors or mists grouped in Gr. IIA and having an ignition temperature of more than 200°C. Identification of possible sources of ignition and their control or prevention is the main objective of this work. The design transformation of a standard Forklift truck into a special Forklift truck is made on a prototype basis. The safety parameters of the improved Forklift truck are discussed in this paper. The specially designed Forklift truck is useful in industries where explosive atmospheres may be present during normal working conditions and the risk of explosion is a concern during handling or transportation of materials. This indigenous diesel engine based Forklift truck for explosive gas atmospheres classified as Zone 1 and Zone 2 areas and gas group IIA has been developed for the first time in India in association with the Industry.

  15. NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.

    PubMed

    Zou, Meng; Liu, Zhaoqi; Zhang, Xiang-Sun; Wang, Yong

    2015-10-15

    In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power using molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size, but high-dimensional genomic profiles or clinical data. Therefore, sophisticated models and algorithms are in pressing need. In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification named Nearest Centroid Classifier for AUC optimization (NCC-AUC). Our method is motivated by the connection between AUC score for classification accuracy evaluation and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem to a binary classification problem. Then an optimization model is formulated to directly maximize AUC and meanwhile minimize the number of selected features to construct a predictor in the nearest centroid classifier framework. NCC-AUC shows strong performance when validated on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). For the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy. It tends to select a multi-biomarker panel with low average redundancy and enriched biological meanings. NCC-AUC also separates low and high risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1 penalized in Cox model). These performance gains of NCC-AUC are quite robust across 5 subtypes of breast cancer. Further, on an independent clinical data set, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and L1-Cox model in grouping patients into high and low risk categories.
In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis. NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. Contact: ywang@amss.ac.cn. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
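
    The two building blocks named in this record, a nearest centroid classifier and the AUC score, can be sketched as follows. This is not the NCC-AUC optimization model itself, which jointly maximizes AUC and minimizes the number of selected features. It only shows how a nearest centroid rule produces a score per sample and how AUC can be computed as the fraction of correctly ordered positive/negative pairs. Function names and data are hypothetical.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def ncc_score(x, centroid_pos, centroid_neg):
    """Positive when x is closer to the positive-class centroid."""
    return math.dist(x, centroid_neg) - math.dist(x, centroid_pos)


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0.
    """
    pos = [s for s, label in zip(scores, labels) if label == 1]
    neg = [s for s, label in zip(scores, labels) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical training data: two features per sample.
high_risk = [[2.0, 2.0], [3.0, 2.5], [2.5, 3.0]]
low_risk = [[0.0, 0.5], [0.5, 0.0], [1.0, 1.0]]
c_pos, c_neg = centroid(high_risk), centroid(low_risk)

samples = high_risk + low_risk
labels = [1, 1, 1, 0, 0, 0]
scores = [ncc_score(s, c_pos, c_neg) for s in samples]
print(auc(scores, labels))
```

    The pairwise form of AUC is what links it to Harrell's concordance index, which likewise counts correctly ordered pairs.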

  16. Nearest private query based on quantum oblivious key distribution

    NASA Astrophysics Data System (ADS)

    Xu, Min; Shi, Run-hua; Luo, Zhen-yu; Peng, Zhen-wan

    2017-12-01

    Nearest private query is a special private query which involves two parties, a user and a data owner, where the user has a private input (e.g., an integer) and the data owner has a private data set, and the user wants to query which element in the owner's private data set is the nearest to his input without either party revealing their private information. In this paper, we first present a quantum protocol for nearest private query, which is based on quantum oblivious key distribution (QOKD). Compared to the related classical protocols, our protocol has the advantages of higher security and better feasibility, so it has better prospects for application.

  17. Fidelity study of superconductivity in extended Hubbard models

    NASA Astrophysics Data System (ADS)

    Plonka, N.; Jia, C. J.; Wang, Y.; Moritz, B.; Devereaux, T. P.

    2015-07-01

    The Hubbard model with local on-site repulsion is generally thought to possess a superconducting ground state for appropriate parameters, but the effects of more realistic long-range Coulomb interactions have not been studied extensively. We study the influence of these interactions on superconductivity by including nearest- and next-nearest-neighbor extended Hubbard interactions in addition to the usual on-site terms. Utilizing numerical exact diagonalization, we analyze the signatures of superconductivity in the ground states through the fidelity metric of quantum information theory. We find that nearest and next-nearest neighbor interactions have thresholds above which they destabilize superconductivity regardless of whether they are attractive or repulsive, seemingly due to competing charge fluctuations.

  18. Algorithms that Defy the Gravity of Learning Curve

    DTIC Science & Technology

    2017-04-28

    three nearest neighbour-based anomaly detectors, i.e., an ensemble of nearest neighbours, a recent nearest neighbour-based ensemble method called iNNE...streams. Note that the change in sample size does not alter the geometrical data characteristics discussed here. 3.1 Experimental Methodology ...need to be answered. 3.6 Comparison with conventional ensemble methods Given the theoretical results, the third aim of this project (i.e., identify the

  19. A Catalog of Visually Classified Galaxies in the Local (z ∼ 0.01) Universe

    NASA Astrophysics Data System (ADS)

    Ann, H. B.; Seo, Mira; Ha, D. K.

    2015-04-01

    The morphological types of 5836 galaxies were classified by a visual inspection of color images using the Sloan Digital Sky Survey Data Release 7 to produce a morphology catalog of a representative sample of local galaxies with z < 0.01. The sample galaxies are almost complete for galaxies brighter than r_pet = 17.77. Our classification system is basically the same as that of the Third Reference Catalog of Bright Galaxies with some simplifications for giant galaxies. On the other hand, we distinguish the fine features of dwarf elliptical (dE)-like galaxies to classify five subtypes: dE, blue-cored dwarf ellipticals, dwarf spheroidals (dSph), blue dwarf ellipticals (dEblue), and dwarf lenticulars (dS0). In addition, we note the presence of nucleation in dE, dSph, and dS0. Elliptical galaxies and lenticular galaxies contribute only ∼1.5 and ∼4.9% of local galaxies, respectively, whereas spirals and irregulars contribute ∼32.1 and ∼42.8%, respectively. The dEblue galaxies, which are a recently discovered population of galaxies, contribute a significant fraction of dwarf galaxies. There seem to be structural differences between dSph and dE galaxies. The dSph galaxies are fainter and bluer with a shallower surface brightness gradient than dE galaxies. They also have a lower fraction of galaxies with small axis ratios (b/a ≲ 0.4) than dE galaxies. The mean projected distance to the nearest neighbor galaxy is ∼260 kpc. About 1% of local galaxies have no neighbors with comparable luminosity within a projected distance of 2 Mpc.

  20. Bag-of-features based medical image retrieval via multiple assignment and visual words weighting.

    PubMed

    Wang, Jingyan; Li, Yongping; Zhang, Ying; Wang, Chao; Xie, Honglan; Chen, Guoling; Gao, Xin

    2011-11-01

    Bag-of-features based approaches have become prominent for image retrieval and image classification tasks in the past decade. Such methods represent an image as a collection of local features, such as image patches and key points with scale invariant feature transform (SIFT) descriptors. To improve the bag-of-features methods, we first model the assignments of local descriptors as contribution functions, and then propose a novel multiple assignment strategy. Assuming the local features can be reconstructed by their neighboring visual words in a vocabulary, reconstruction weights can be solved by quadratic programming. The weights are then used to build contribution functions, resulting in a novel assignment method, called quadratic programming (QP) assignment. We further propose a novel visual word weighting method. The discriminative power of each visual word is analyzed by the sub-similarity function in the bin that corresponds to the visual word. Each sub-similarity function is then treated as a weak classifier. A strong classifier is learned by boosting methods that combine those weak classifiers. The weighting factors of the visual words are learned accordingly. We evaluate the proposed methods on medical image retrieval tasks. The methods are tested on three well-known data sets, i.e., the ImageCLEFmed data set, the 304 CT Set, and the basal-cell carcinoma image set. Experimental results demonstrate that the proposed QP assignment outperforms the traditional nearest neighbor assignment, the multiple assignment, and the soft assignment, whereas the proposed boosting based weighting strategy outperforms the state-of-the-art weighting methods, such as the term frequency weights and the term frequency-inverse document frequency weights.
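
    The baseline assignment schemes that this record compares against can be sketched briefly. In nearest neighbor (hard) assignment each local descriptor adds one count to its closest visual word, while in soft assignment each descriptor spreads a unit of weight over all words. The sketch below is an illustration only and does not implement the proposed QP assignment or the boosting-based weighting. The Gaussian kernel and the sigma parameter are assumptions chosen here as one common form of soft assignment, and all names and data are hypothetical.

```python
import math


def hard_assignment(descriptors, vocabulary):
    """Histogram where each descriptor votes for its nearest visual word."""
    histogram = [0.0] * len(vocabulary)
    for d in descriptors:
        nearest = min(range(len(vocabulary)),
                      key=lambda j: math.dist(d, vocabulary[j]))
        histogram[nearest] += 1.0
    return histogram


def soft_assignment(descriptors, vocabulary, sigma=1.0):
    """Histogram where each descriptor spreads unit weight over all words.

    Weights follow a Gaussian kernel of the descriptor-to-word distance
    (an assumed kernel) and are normalized to sum to 1 per descriptor.
    """
    histogram = [0.0] * len(vocabulary)
    for d in descriptors:
        weights = [math.exp(-math.dist(d, v) ** 2 / (2 * sigma ** 2))
                   for v in vocabulary]
        total = sum(weights)
        for j, w in enumerate(weights):
            histogram[j] += w / total
    return histogram


# Hypothetical 2-D descriptors and a two-word vocabulary.
words = [[0.0, 0.0], [4.0, 0.0]]
local_features = [[1.0, 0.0], [3.0, 0.0], [2.5, 0.0]]
print(hard_assignment(local_features, words))
print([round(h, 3) for h in soft_assignment(local_features, words, sigma=2.0)])
```

    Hard assignment discards how close a descriptor was to the runner-up word, which is the information that multiple and soft assignment schemes try to retain.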
