Sample records for k-nearest neighbour knn

  1. Personalised news filtering and recommendation system using Chi-square statistics-based K-nearest neighbour (χ2SB-KNN) model

    NASA Astrophysics Data System (ADS)

    Adeniyi, D. A.; Wei, Z.; Yang, Y.

    2017-10-01

    The recommendation problem has been extensively studied by researchers in the fields of data mining, databases and information retrieval. This study presents the design and realisation of an automated, personalised news recommendation system based on the Chi-square statistics-based K-nearest neighbour (χ2SB-KNN) model. The proposed χ2SB-KNN model has the potential to overcome computational complexity and information overload problems, and it reduces runtime and speeds up execution through the use of the critical value of the χ2 distribution. The proposed recommendation engine can alleviate scalability challenges through combined online pattern discovery and pattern matching for real-time recommendations. This work also showcases the development of a novel feature selection method referred to as the Data Discretisation-Based feature selection method. This is used for selecting the best features for the proposed χ2SB-KNN algorithm at the preprocessing stage of the classification procedures. The proposed χ2SB-KNN model is implemented as an in-house Java program on an experimental website called OUC newsreaders' website. Finally, we compared the performance of our system with two baseline methods: the traditional Euclidean distance K-nearest neighbour and Naive Bayesian techniques. The results show a significant improvement of our method over the baseline methods studied.

  2. Model-based mean square error estimators for k-nearest neighbour predictions and applications using remotely sensed data for forest inventories

    Treesearch

    Steen Magnussen; Ronald E. McRoberts; Erkki O. Tomppo

    2009-01-01

    New model-based estimators of the uncertainty of pixel-level and areal k-nearest neighbour (knn) predictions of attribute Y from remotely-sensed ancillary data X are presented. Non-parametric functions predict Y from scalar 'Single Index Model' transformations of X. Variance functions generated...

  3. The application of k-Nearest Neighbour in the identification of high potential archers based on relative psychological coping skills variables

    NASA Astrophysics Data System (ADS)

    Taha, Zahari; Muazu Musa, Rabiu; Majeed, Anwar P. P. Abdul; Razali Abdullah, Mohamad; Muaz Alim, Muhammad; Nasir, Ahmad Fakhri Ab

    2018-04-01

    The present study aims at classifying and predicting high and low potential archers from a collection of psychological coping skills variables trained on different k-Nearest Neighbour (k-NN) kernels. 50 youth archers with a mean age and standard deviation of (17.0 ± 0.056), gathered from various archery programmes, completed a one end shooting score test. A psychological coping skills inventory, which evaluates the archers' level of related coping skills, was filled out by the archers prior to their shooting tests. k-means cluster analysis was applied to cluster the archers based on their scores on the variables assessed. k-NN models, i.e. fine, medium, coarse, cosine, cubic and weighted kernel functions, were trained on the psychological variables. The k-means analysis clustered the archers into high psychologically prepared archers (HPPA) and low psychologically prepared archers (LPPA), respectively. It was demonstrated that the cosine k-NN model exhibited good accuracy and precision throughout the exercise, with an accuracy of 94% and a considerably lower error rate for the prediction of the HPPA and the LPPA as compared to the rest of the models. The findings of this investigation can be valuable to coaches and sports managers in recognising high potential athletes from the selected psychological coping skills variables examined, which would consequently save time and energy during talent identification and development programmes.
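The k-NN variants named in this record can be illustrated with a minimal sketch. It assumes, as an interpretation not stated in the abstract, that "cosine" means cosine distance, "cubic" means Minkowski distance with exponent 3, and "weighted" means inverse squared distance voting. The function names are hypothetical and this is not the authors' implementation.

```python
import math
from collections import defaultdict


def cosine_distance(a, b):
    # 1 - cosine similarity; undefined for all-zero vectors (assumed not to occur).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def minkowski_distance(a, b, p=3):
    # p=3 is the assumed meaning of "cubic"; p=2 would give Euclidean distance.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def knn_predict(train, query, k, distance, weighted=False):
    """train is a list of (vector, label) pairs; returns the winning label."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = defaultdict(float)
    for vector, label in neighbours:
        d = distance(vector, query)
        # Weighted voting: closer neighbours count more (small epsilon avoids 1/0).
        votes[label] += 1.0 / (d * d + 1e-12) if weighted else 1.0
    return max(votes, key=votes.get)
```

With the same neighbours, plain and weighted voting can disagree: two distant neighbours of one class outvote a single close neighbour under majority voting, but not under distance weighting.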

  4. A Comparison of the Spatial Linear Model to Nearest Neighbor (k-NN) Methods for Forestry Applications

    Treesearch

    Jay M. Ver Hoef; Hailemariam Temesgen; Sergio Gómez

    2013-01-01

    Forest surveys provide critical information for many diverse interests. Data are often collected from samples, and from these samples, maps of resources and estimates of areal totals or averages are required. In this paper, two approaches for mapping and estimating totals, the spatial linear model (SLM) and k-NN (k-Nearest Neighbor), are compared, theoretically,...

  5. A Fast Exact k-Nearest Neighbors Algorithm for High Dimensional Search Using k-Means Clustering and Triangle Inequality.

    PubMed

    Wang, Xueyi

    2012-02-08

    The k-nearest neighbors (k-NN) algorithm is a widely used machine learning method that finds nearest neighbors of a test object in a feature space. We present a new exact k-NN algorithm called kMkNN (k-Means for k-Nearest Neighbors) that uses k-means clustering and the triangle inequality to accelerate the search for nearest neighbors in a high dimensional space. The kMkNN algorithm has two stages. In the buildup stage, instead of using complex tree structures such as metric trees, kd-trees, or ball-trees, kMkNN uses a simple k-means clustering method to preprocess the training dataset. In the searching stage, given a query object, kMkNN finds nearest training objects starting from the nearest cluster to the query object and uses the triangle inequality to reduce the distance calculations. Experiments show that the performance of kMkNN is surprisingly good compared to the traditional k-NN algorithm and tree-based k-NN algorithms such as kd-trees and ball-trees. On a collection of 20 datasets with up to 10^6 records and 10^4 dimensions, kMkNN shows a 2- to 80-fold reduction of distance calculations and a 2- to 60-fold speedup over the traditional k-NN algorithm for 16 datasets. Furthermore, kMkNN performs significantly better than a kd-tree based k-NN algorithm for all datasets and performs better than a ball-tree based k-NN algorithm for most datasets. The results show that kMkNN is effective for searching nearest neighbors in high dimensional spaces.
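The two stages described in this record can be sketched roughly as follows. This is a simplified illustration of cluster-based pruning with the triangle inequality, not the published kMkNN algorithm: the clustering is a plain Lloyd iteration seeded with the first points, and the pruning rule uses the bound d(q, p) >= |d(q, c) - d(p, c)| for a point p in a cluster with centre c. All function names are hypothetical.

```python
import heapq
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_clusters(points, n_clusters, n_iter=10):
    """Buildup stage: cluster the training points and store each point's
    distance to its own centre."""
    centers = [tuple(p) for p in points[:n_clusters]]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            groups[j].append(p)
        # Recompute centres; an empty cluster keeps its previous centre.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    clusters = [[] for _ in centers]
    for p in points:
        j = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[j].append((dist(p, centers[j]), p))
    return centers, clusters


def knn_search(query, centers, clusters, k):
    """Searching stage: visit clusters nearest-first and skip any point whose
    triangle-inequality lower bound cannot beat the current k-th best."""
    best = []  # max-heap of the k best so far, stored as (-distance, point)
    computed = 0  # number of full distance calculations actually performed
    order = sorted(range(len(centers)), key=lambda i: dist(query, centers[i]))
    for i in order:
        d_qc = dist(query, centers[i])
        for d_pc, p in clusters[i]:
            lower_bound = abs(d_qc - d_pc)
            if len(best) == k and lower_bound >= -best[0][0]:
                continue  # pruned without computing dist(query, p)
            computed += 1
            d = dist(query, p)
            if len(best) < k:
                heapq.heappush(best, (-d, p))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, p))
    return sorted((-neg_d, p) for neg_d, p in best), computed
```

Because a point is skipped only when its lower bound already rules it out, the returned neighbour distances match a brute-force search; the `computed` counter shows how many distance evaluations were avoided.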

  6. A multiple-point spatially weighted k-NN method for object-based classification

    NASA Astrophysics Data System (ADS)

    Tang, Yunwei; Jing, Linhai; Li, Hui; Atkinson, Peter M.

    2016-10-01

    Object-based classification, commonly referred to as object-based image analysis (OBIA), is now commonly regarded as able to produce more appealing classification maps, often of greater accuracy, than pixel-based classification and its application is now widespread. Therefore, improvement of OBIA using spatial techniques is of great interest. In this paper, multiple-point statistics (MPS) is proposed for object-based classification enhancement in the form of a new multiple-point k-nearest neighbour (k-NN) classification method (MPk-NN). The proposed method first utilises a training image derived from a pre-classified map to characterise the spatial correlation between multiple points of land cover classes. The MPS borrows spatial structures from other parts of the training image, and then incorporates this spatial information, in the form of multiple-point probabilities, into the k-NN classifier. Two satellite sensor images with a fine spatial resolution were selected to evaluate the new method. One is an IKONOS image of the Beijing urban area and the other is a WorldView-2 image of the Wolong mountainous area, in China. The images were object-based classified using the MPk-NN method and several alternatives, including the k-NN, the geostatistically weighted k-NN, the Bayesian method, the decision tree classifier (DTC), and the support vector machine classifier (SVM). It was demonstrated that the new spatial weighting based on MPS can achieve greater classification accuracy relative to the alternatives and it is, thus, recommended as appropriate for object-based classification.

  7. The classification of hunger behaviour of Lates Calcarifer through the integration of image processing technique and k-Nearest Neighbour learning algorithm

    NASA Astrophysics Data System (ADS)

    Taha, Z.; Razman, M. A. M.; Ghani, A. S. Abdul; Majeed, A. P. P. Abdul; Musa, R. M.; Adnan, F. A.; Sallehudin, M. F.; Mukai, Y.

    2018-04-01

    Fish hunger behaviour is essential in determining the fish feeding routine, particularly for fish farmers. The inability to provide accurate feeding routines (under-feeding or over-feeding) may lead to the death of the fish and consequently reduce the quantity of fish produced. Moreover, excess food that is not consumed by the fish dissolves in the water and reduces the water quality by lowering the oxygen level. This problem may also lead to the death of the fish or even spur fish diseases. In the present study, a correlation of Barramundi fish-school behaviour with hunger condition through the hybrid data integration of image processing technique is established. The behaviour is clustered with respect to the position of the school size as well as the school density of the fish before feeding, during feeding and after feeding. The clustered fish behaviour is then classified through the k-Nearest Neighbour (k-NN) learning algorithm. Three variations of the algorithm, namely cosine, cubic and weighted, are assessed on their ability to classify the aforementioned fish hunger behaviour. The study found that the weighted k-NN variation provides the best classification, with an accuracy of 86.5%. Therefore, it could be concluded that the proposed integration technique may assist fish farmers in ascertaining the fish feeding routine.

  8. K-Nearest Neighbor Algorithm Optimization in Text Categorization

    NASA Astrophysics Data System (ADS)

    Chen, Shufeng

    2018-01-01

    The K-Nearest Neighbor (KNN) classification algorithm is one of the simplest methods of data mining. It has been widely used in classification, regression and pattern recognition. The traditional KNN method has some shortcomings, such as a large amount of sample computation and a strong dependence on the sample library capacity. In this paper, a method of representative sample optimization based on the CURE algorithm is proposed. On this basis, a quick algorithm, QKNN (Quick k-nearest neighbor), is presented to find the k nearest neighbor samples, which greatly reduces the similarity calculation. The experimental results show that this algorithm can effectively reduce the number of samples and speed up the search for the k nearest neighbor samples, improving the performance of the algorithm.

  9. Fuzzy-Rough Nearest Neighbour Classification

    NASA Astrophysics Data System (ADS)

    Jensen, Richard; Cornelis, Chris

    A new fuzzy-rough nearest neighbour (FRNN) classification algorithm is presented in this paper, as an alternative to Sarkar's fuzzy-rough ownership function (FRNN-O) approach. By contrast to the latter, our method uses the nearest neighbours to construct lower and upper approximations of decision classes, and classifies test instances based on their membership to these approximations. In the experimental analysis, we evaluate our approach with both classical fuzzy-rough approximations (based on an implicator and a t-norm), as well as with the recently introduced vaguely quantified rough sets. Preliminary results are very good, and in general FRNN outperforms FRNN-O, as well as the traditional fuzzy nearest neighbour (FNN) algorithm.

  10. Identification of jasmine flower (Jasminum sp.) based on the shape of the flower using sobel edge and k-nearest neighbour

    NASA Astrophysics Data System (ADS)

    Qur’ania, A.; Sarinah, I.

    2018-03-01

    People often misidentify jasmine by looking only at the white colour of the flower, yet not all white flowers are jasmine and not all jasmine flowers are white. Some jasmine is yellow, and some is white and purple. The aim of this research is to identify jasmine flowers (Jasminum sp.) from the shape of the flower in images, using Sobel edge detection and k-Nearest Neighbor. Edge detection is used to detect the type of flower from the flower shape. Edge detection aims to improve the appearance of the border of a digital image. The k-Nearest Neighbor method is used to classify test objects into the classes whose training objects are their nearest neighbours. The data used in this study are three types of jasmine, namely jasmine white (Jasminum sambac), jasmine gambir (Jasminum pubescens), and jasmine japan (Pseuderanthemum reticulatum). Testing of jasmine flower images resized to 50 × 50 pixels, 100 × 100 pixels and 150 × 150 pixels yields an accuracy of 84%. Tests of the k-NN method using the 5, 10 and 15 closest distances produced different accuracy rates: the 5 and 10 closest distances yielded the same accuracy of 84%, while the 15 closest distances gave a lower accuracy of 65.2%.

  11. An Improvement To The k-Nearest Neighbor Classifier For ECG Database

    NASA Astrophysics Data System (ADS)

    Jaafar, Haryati; Hidayah Ramli, Nur; Nasir, Aimi Salihah Abdul

    2018-03-01

    The k nearest neighbor (kNN) is a non-parametric classifier and has been widely used for pattern classification. However, in practice, kNN often performs poorly due to the lack of information on how the samples are distributed. Moreover, kNN is no longer optimal when the training samples are limited. Another problem observed in kNN concerns the weighting issues in assigning the class label before classification. Thus, to solve these limitations, a new classifier called Mahalanobis fuzzy k-nearest centroid neighbor (MFkNCN) is proposed in this study. Here, a Mahalanobis distance is applied to avoid the imbalance of sample distribution. Then, a surrounding rule is employed to obtain the nearest centroid neighbor based on the distributions of training samples and its distance to the query point. Consequently, the fuzzy membership function is employed to assign the query point to the class label which is most frequently represented by the nearest centroid neighbor. Experimental studies are carried out on electrocardiogram (ECG) signals. The classification performances are evaluated in two experimental steps, i.e. different values of k and different sizes of feature dimensions. Subsequently, a comparative study of the kNN, kNCN, FkNN and MFkNCN classifiers is conducted to evaluate the performance of the proposed classifier. The results show that the performance of MFkNCN consistently exceeds that of kNN, kNCN and FkNN, with a best classification rate of 96.5%.

  12. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features

    PubMed Central

    Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-01-01

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets, including Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin. PMID:29596375

  13. Improving GPU-accelerated adaptive IDW interpolation algorithm using fast kNN search.

    PubMed

    Mei, Gang; Xu, Nengxiong; Xu, Liangliang

    2016-01-01

    This paper presents an efficient parallel Adaptive Inverse Distance Weighting (AIDW) interpolation algorithm on modern Graphics Processing Unit (GPU). The presented algorithm is an improvement of our previous GPU-accelerated AIDW algorithm by adopting fast k-nearest neighbors (kNN) search. In AIDW, it needs to find several nearest neighboring data points for each interpolated point to adaptively determine the power parameter; and then the desired prediction value of the interpolated point is obtained by weighted interpolating using the power parameter. In this work, we develop a fast kNN search approach based on the space-partitioning data structure, even grid, to improve the previous GPU-accelerated AIDW algorithm. The improved algorithm is composed of the stages of kNN search and weighted interpolating. To evaluate the performance of the improved algorithm, we perform five groups of experimental tests. The experimental results indicate: (1) the improved algorithm can achieve a speedup of up to 1017 over the corresponding serial algorithm; (2) the improved algorithm is at least two times faster than our previous GPU-accelerated AIDW algorithm; and (3) the utilization of fast kNN search can significantly improve the computational efficiency of the entire GPU-accelerated AIDW algorithm.
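The interpolation step described in this record, a distance-weighted average over the k nearest data points, can be sketched as below. This is standard inverse distance weighting with a fixed power parameter, not the adaptive variant of the paper, and it uses a brute-force neighbour search on the CPU rather than the grid-based GPU search. The function name and default values are assumptions made for illustration.

```python
import math


def idw_predict(samples, query, k=3, power=2.0):
    """samples is a list of ((x, y), value) pairs; returns the interpolated
    value at query from its k nearest samples."""
    # Brute-force k-nearest-neighbour search, sorted by distance to the query.
    nearest = sorted((math.dist(p, query), v) for p, v in samples)[:k]
    if nearest[0][0] == 0.0:
        return nearest[0][1]  # query coincides with a data point
    # Weight each neighbour by 1 / distance**power and normalise.
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

In the adaptive scheme the power parameter would be chosen per interpolated point from the local density of its neighbours; here it is simply passed in.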

  14. Frog sound identification using extended k-nearest neighbor classifier

    NASA Astrophysics Data System (ADS)

    Mukahar, Nordiana; Affendi Rosdi, Bakhtiar; Athiar Ramli, Dzati; Jaafar, Haryati

    2017-09-01

    Frog sound identification based on vocalization has become important for biological research and environmental monitoring. As a result, different types of feature extraction and classifiers have been employed to evaluate the accuracy of frog sound identification. This paper presents frog sound identification with an Extended k-Nearest Neighbor (EKNN) classifier. The EKNN classifier integrates the nearest neighbors and mutual sharing of neighborhood concepts, with the aim of improving the classification performance. It makes a prediction based on which samples are the nearest neighbors of the testing sample and which samples consider the testing sample as their nearest neighbor. In order to evaluate the classification performance in frog sound identification, the EKNN classifier is compared with competing classifiers, k-Nearest Neighbor (KNN), Fuzzy k-Nearest Neighbor (FKNN), k-General Nearest Neighbor (KGNN) and Mutual k-Nearest Neighbor (MKNN), on the recorded sounds of 15 frog species obtained in Malaysian forest. The recorded sounds have been segmented using Short Time Energy and Short Time Average Zero Crossing Rate (STE+STAZCR), sinusoidal modeling (SM), manual segmentation and the combination of Energy (E) and Zero Crossing Rate (ZCR) (E+ZCR), while the features are extracted by Mel Frequency Cepstrum Coefficient (MFCC). The experimental results have shown that the EKNN classifier exhibits the best performance in terms of accuracy compared to the competing classifiers, KNN, FKNN, KGNN and MKNN, for all cases.

  15. K-nearest neighbor imputation of forest inventory variables in New Hampshire

    Treesearch

    Andrew Lister; Michael Hoppus; Raymond L. Czaplewski

    2005-01-01

    The k-nearest neighbor (kNN) method was used to map stand volume for a mosaic of 4 Landsat scenes covering the state of New Hampshire. Data for gross cubic foot volume and trees per acre were summarized from USDA Forest Service Forest Inventory and Analysis (FIA) plots and used as training for kNN. Six bands of...

  16. Automated identification of Monogeneans using digital image processing and K-nearest neighbour approaches.

    PubMed

    Yousef Kalafi, Elham; Tan, Wooi Boon; Town, Christopher; Dhillon, Sarinder Kaur

    2016-12-22

    Monogeneans are flatworms (Platyhelminthes) that are primarily found on gills and skin of fishes. Monogenean parasites have attachment appendages at their haptoral regions that help them to move about the body surface and feed on skin and gill debris. Haptoral attachment organs consist of sclerotized hard parts such as hooks, anchors and marginal hooks. Monogenean species are differentiated based on their haptoral bars, anchors, marginal hooks, reproductive parts' (male and female copulatory organs) morphological characters and soft anatomical parts. The complex structure of these diagnostic organs and also their overlapping in microscopic digital images are impediments for developing fully automated identification system for monogeneans (LNCS 7666:256-263, 2012), (ISDA; 457-462, 2011), (J Zoolog Syst Evol Res 52(2): 95-99. 2013;). In this study images of hard parts of the haptoral organs such as bars and anchors are used to develop a fully automated identification technique for monogenean species identification by implementing image processing techniques and machine learning methods. Images of four monogenean species namely Sinodiplectanotrema malayanus, Trianchoratus pahangensis, Metahaliotrema mizellei and Metahaliotrema sp. (undescribed) were used to develop an automated technique for identification. K-nearest neighbour (KNN) was applied to classify the monogenean specimens based on the extracted features. 50% of the dataset was used for training and the other 50% was used as testing for system evaluation. Our approach demonstrated overall classification accuracy of 90%. In this study Leave One Out (LOO) cross validation is used for validation of our system and the accuracy is 91.25%. The methods presented in this study facilitate fast and accurate fully automated classification of monogeneans at the species level. In future studies more classes will be included in the model, the time to capture the monogenean images will be reduced and improvements in
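The Leave One Out (LOO) validation mentioned in this record can be sketched for a generic k-NN classifier as follows. This is a minimal illustration with Euclidean distance and majority voting; the feature vectors, labels and function names are hypothetical and unrelated to the monogenean image features used in the study.

```python
import math
from collections import Counter


def knn_label(train, query, k):
    """train is a list of (vector, label) pairs; majority vote of k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def leave_one_out_accuracy(data, k):
    """Hold out each sample in turn, classify it from all the others, and
    return the fraction of held-out samples classified correctly."""
    hits = 0
    for i, (vector, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_label(rest, vector, k) == label
    return hits / len(data)
```

Unlike a single 50/50 split, every sample is used for testing exactly once, which is why the two accuracy figures reported for the same system can differ.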

  17. Latent Dirichlet Allocation (LDA) Model and kNN Algorithm to Classify Research Project Selection

    NASA Astrophysics Data System (ADS)

    Safi’ie, M. A.; Utami, E.; Fatta, H. A.

    2018-03-01

    Universitas Sebelas Maret has a teaching staff of more than 1500 people, and one of its tasks is to carry out research. On the other hand, funding support for research and service is limited, so research and community service (P2M) proposal submissions need to be evaluated. At the selection stage, research proposal documents are collected as unstructured data, and the volume of stored data is very large. Extracting the information contained in these documents requires text mining technology. This technology is applied to gain knowledge from the documents by automating the information extraction. In this article we apply Latent Dirichlet Allocation (LDA) to the documents as a model in the feature extraction process, to obtain terms that represent each document. We then use the k-Nearest Neighbour (kNN) algorithm to classify the documents based on their terms.

  18. Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Lin, Jian; Hamidouche, Khaled; Zheng, Jie

    2015-08-05

    Machine Learning algorithms are benefiting from the continuous improvement of programming models, including MPI, MapReduce and PGAS. k-Nearest Neighbors (k-NN) algorithm is a widely used machine learning algorithm, applied to supervised learning tasks such as classification. Several parallel implementations of k-NN have been proposed in the literature and practice. However, on high-performance computing systems with high-speed interconnects, it is important to further accelerate existing designs of the k-NN algorithm through taking advantage of scalable programming models. To improve the performance of k-NN on large-scale environment with InfiniBand network, this paper proposes several alternative hybrid MPI+OpenSHMEM designs and performs a systemic evaluation and analysis on typical workloads. The hybrid designs leverage the one-sided memory access to better overlap communication with computation than the existing pure MPI design, and propose better schemes for efficient buffer management. The implementation based on k-NN program from MaTEx with MVAPICH2-X (Unified MPI+PGAS Communication Runtime over InfiniBand) shows up to 9.0% time reduction for training KDD Cup 2010 workload over 512 cores, and 27.6% time reduction for small workload with balanced communication and computation. Experiments of running with varied number of cores show that our design can maintain good scalability.

  19. Fast clustering algorithm for large ECG data sets based on CS theory in combination with PCA and K-NN methods.

    PubMed

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2014-01-01

    Long-term recording of Electrocardiogram (ECG) signals plays an important role in health care systems for the diagnosis and treatment of heart diseases. Clustering and classification of the collected data are essential for detecting concealed information in the P-QRS-T waves of long-term ECG recordings. Currently used algorithms have their share of drawbacks: 1) clustering and classification cannot be done in real time; 2) they suffer from huge energy consumption and sampling load. These drawbacks motivated us to develop a novel optimized clustering algorithm that can easily scan large ECG datasets for establishing low power long-term ECG recording. In this paper, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory as a random sampling procedure. Then, two dimensionality reduction methods, Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC), followed by sorting the data using the K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers, are applied to the proposed algorithm. We show that our algorithm based on PCA features in combination with the K-NN classifier performs better than other methods. The proposed algorithm outperforms existing algorithms, increasing classification accuracy by 11%. In addition, the proposed algorithm achieves classification accuracies for the K-NN and PNN classifiers and a Receiver Operating Characteristics (ROC) area of 99.98%, 99.83%, and 99.75% respectively.

  20. Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio

    NASA Astrophysics Data System (ADS)

    Nababan, A. A.; Sitompul, O. S.; Tulus

    2018-04-01

    K-Nearest Neighbor (KNN) is a good classifier, but several studies have found its accuracy to be lower than that of other methods. One cause of the low accuracy is that every attribute has the same effect on the classification process, so less relevant attributes lead to misclassification of new data. In this research, we propose Attribute Weighting Based K-Nearest Neighbor Using Gain Ratio, in which the Gain Ratio serves as a parameter to see the correlation between each attribute in the data and is also used as the basis for weighting each attribute of the dataset. The accuracy of the results is compared to the accuracy obtained from the original KNN method using 10-fold cross-validation with several datasets from the UCI Machine Learning Repository and the KEEL-Dataset Repository, such as abalone, glass identification, haberman, hayes-roth and water quality status. Based on the test results, the proposed method was able to increase the classification accuracy of KNN: the largest difference in accuracy, 12.73%, was obtained on the hayes-roth dataset and the smallest, 0.07%, on the abalone dataset. Averaged over all datasets, accuracy increased by 5.33%.
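The weighting idea in this record can be sketched as follows. The gain ratio below follows the usual definition (information gain divided by split information) for discrete attribute values; how the paper handles continuous attributes, and exactly how the weights enter the distance, are not stated in the abstract, so the weighted Euclidean distance and all function names here are assumptions.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy in bits of a list of discrete values."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(column, labels):
    """Information gain of one attribute column divided by its split info."""
    n = len(labels)
    conditional = 0.0
    for value in set(column):
        subset = [y for x, y in zip(column, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    split_info = entropy(column)
    if split_info == 0.0:
        return 0.0  # attribute takes a single value and carries no information
    return (entropy(labels) - conditional) / split_info


def weighted_distance(a, b, weights):
    """Euclidean distance in which each attribute is scaled by its weight
    (assumed form; a weight of 0 removes the attribute entirely)."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))
```

An attribute that separates the classes perfectly receives a weight of 1, while an attribute unrelated to the class receives 0 and no longer influences which neighbours are selected.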

  1. A comparative study of the SVM and K-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals.

    PubMed

    Palaniappan, Rajkumar; Sundaraj, Kenneth; Sundaraj, Sebastian

    2014-06-27

    Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database. The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database.

  2. A comparative study of the svm and k-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals

    PubMed Central

    2014-01-01

    Background: Pulmonary acoustic parameters extracted from recorded respiratory sounds provide valuable information for the detection of respiratory pathologies. The automated analysis of pulmonary acoustic signals can serve as a differential diagnosis tool for medical professionals, a learning tool for medical students, and a self-management tool for patients. In this context, we intend to evaluate and compare the performance of the support vector machine (SVM) and K-nearest neighbour (K-nn) classifiers in diagnosing respiratory pathologies using respiratory sounds from the R.A.L.E database. Results: The pulmonary acoustic signals used in this study were obtained from the R.A.L.E lung sound database. The pulmonary acoustic signals were manually categorised into three different groups, namely normal, airway obstruction pathology, and parenchymal pathology. The mel-frequency cepstral coefficient (MFCC) features were extracted from the pre-processed pulmonary acoustic signals. The MFCC features were analysed by one-way ANOVA and then fed separately into the SVM and K-nn classifiers. The performances of the classifiers were analysed using the confusion matrix technique. The statistical analysis of the MFCC features using one-way ANOVA showed that the extracted MFCC features are significantly different (p < 0.001). The classification accuracies of the SVM and K-nn classifiers were found to be 92.19% and 98.26%, respectively. Conclusion: Although the data used to train and test the classifiers are limited, the classification accuracies found are satisfactory. The K-nn classifier was better than the SVM classifier for the discrimination of pulmonary acoustic signals from pathological and normal subjects obtained from the RALE database. PMID:24970564

  3. An Examination of Diameter Density Prediction with k-NN and Airborne Lidar

    DOE PAGES

    Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri; ...

    2017-11-16

    While lidar-based forest inventory methods have been widely demonstrated, performances of methods to predict tree diameters with airborne lidar (lidar) are not well understood. One cause for this is that the performance metrics typically used in studies for prediction of diameters can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination (R²) and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distributed across the 800 km² Savannah River Site in South Carolina, USA. In conclusion, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.

  4. An Examination of Diameter Density Prediction with k-NN and Airborne Lidar

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Strunk, Jacob L.; Gould, Peter J.; Packalen, Petteri

    While lidar-based forest inventory methods have been widely demonstrated, performances of methods to predict tree diameters with airborne lidar (lidar) are not well understood. One cause for this is that the performance metrics typically used in studies for prediction of diameters can be difficult to interpret, and may not support comparative inferences between sampling designs and study areas. To help with this problem we propose two indices and use them to evaluate a variety of lidar and k nearest neighbor (k-NN) strategies for prediction of tree diameter distributions. The indices are based on the coefficient of determination (R²) and root mean square deviation (RMSD). Both of the indices are highly interpretable, and the RMSD-based index facilitates comparisons with alternative (non-lidar) inventory strategies, and with projects in other regions. K-NN diameter distribution prediction strategies were examined using auxiliary lidar for 190 training plots distributed across the 800 km² Savannah River Site in South Carolina, USA. In conclusion, we evaluate the performance of k-NN with respect to distance metrics, number of neighbors, predictor sets, and response sets. K-NN and lidar explained 80% of variability in diameters, and Mahalanobis distance with k = 3 neighbors performed best according to a number of criteria.

  5. Spectral identification of melon seeds variety based on k-nearest neighbor and Fisher discriminant analysis

    NASA Astrophysics Data System (ADS)

    Li, Cuiling; Jiang, Kai; Zhao, Xueguan; Fan, Pengfei; Wang, Xiu; Liu, Chuan

    2017-10-01

    Impurity of melon seed varieties reduces melon production and the economic benefits of farmers. This research aimed to adopt spectral technology combined with chemometrics methods to identify melon seed varieties. Melon seeds of the varieties "Yi Te Bai", "Yi Te Jin", "Jing Mi NO.7", "Jing Mi NO.11" and "Yi Li Sha Bai" were used as research samples. A simple spectral system was developed to collect reflective spectral data of melon seeds, including a light source unit, a spectral data acquisition unit and a data processing unit. The detection wavelength range of this system was 200-1100 nm with a spectral resolution of 0.14-7.7 nm. The original reflective spectral data were pre-treated with de-trend (DT), multiple scattering correction (MSC), first derivative (FD), normalization (NOR) and Savitzky-Golay (SG) convolution smoothing methods. Principal Component Analysis (PCA) was adopted to reduce the dimensions of the reflective spectral data and extract principal components. K-nearest neighbour (KNN) and Fisher discriminant analysis (FDA) methods were used to develop discriminant models of melon seed variety based on PCA. Spectral data pretreatments improved the discriminant effects of KNN and FDA. FDA generated better discriminant results than KNN, and both methods produced discriminant accuracies reaching 90.0% for the validation set. The results showed that using spectral technology in combination with KNN and FDA modelling methods to identify melon seed variety was feasible.

  6. Improving the accuracy of k-nearest neighbor using local mean based and distance weight

    NASA Astrophysics Data System (ADS)

    Syaliman, K. U.; Nababan, E. B.; Sitompul, O. S.

    2018-03-01

    In k-nearest neighbor (kNN), the class of new data is normally determined by a simple majority vote, which may ignore the similarities among data and allows a tied (double) majority class that can lead to misclassification. In this research, we propose an approach to resolve the majority vote issues by calculating the distance weight using a combination of local mean based k-nearest neighbor (LMKNN) and distance weight k-nearest neighbor (DWKNN). The accuracy of the results is compared to the accuracy obtained from the original k-NN method using several datasets from the UCI Machine Learning repository, Kaggle and Keel, such as ionosphere, iris, voice genre, lower back pain, and thyroid. In addition, the proposed method is also tested using real data from a public senior high school in the city of Tualang, Indonesia. Results show that the combination of LMKNN and DWKNN was able to increase the classification accuracy of kNN, whereby the average increase in accuracy on test data is 2.45%, with the highest increase in accuracy of 3.71% occurring on the lower back pain symptoms dataset. For the real data, the increase in accuracy is as high as 5.16%.
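
    The tie problem described above can be illustrated with a minimal sketch of distance-weighted voting. This is not the authors' LMKNN/DWKNN combination: the inverse-distance weight 1/(d + eps), the function name and the toy data are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train, query, k, eps=1e-9):
    """Classify `query` by an inverse-distance weighted vote of its k nearest neighbours.

    `train` is a list of (feature_tuple, label) pairs. The 1/(d + eps) weight is an
    illustrative choice, not the weighting defined in the paper.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in neighbours:
        scores[label] += 1.0 / (math.dist(features, query) + eps)
    return max(scores, key=scores.get)

# Hypothetical toy data: with k = 4 a plain majority vote ties 2 vs 2,
# while the weighted vote favours the closer class.
train = [((0.0, 0.0), "a"), ((0.1, 0.0), "a"), ((1.0, 1.0), "b"), ((1.1, 1.0), "b")]
prediction = weighted_knn_predict(train, (0.2, 0.1), k=4)
```

    With the query placed near the "a" points, the two "a" neighbours carry far more weight than the two distant "b" neighbours, so the tie that a plain majority vote would produce does not arise.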

  7. Activity Recognition in Egocentric video using SVM, kNN and Combined SVMkNN Classifiers

    NASA Astrophysics Data System (ADS)

    Sanal Kumar, K. P.; Bhavani, R., Dr.

    2017-08-01

    Egocentric vision is a unique, human-centric perspective in computer vision. The recognition of egocentric actions is a challenging task that helps in assisting elderly people, disabled patients and so on. In this work, life-logging activity videos are taken as input. There are two categories of activity: the top level and the second level. Recognition is done using features such as Histogram of Oriented Gradients (HOG), Motion Boundary Histogram (MBH) and Trajectory. The features are fused together to act as a single feature. The extracted features are reduced using Principal Component Analysis (PCA). The reduced features are provided as input to classifiers such as Support Vector Machine (SVM), k nearest neighbor (kNN), and a combined SVM and kNN classifier (combined SVMkNN). These classifiers are evaluated, and the combined SVMkNN provided better results than other classifiers in the literature.

  8. The distance function effect on k-nearest neighbor classification for medical datasets.

    PubMed

    Hu, Li-Yu; Huang, Min-Wei; Ke, Shih-Wen; Tsai, Chih-Fong

    2016-01-01

    K-nearest neighbor (k-NN) classification is a conventional non-parametric classifier, which has been used as the baseline classifier in many pattern classification problems. It is based on measuring the distances between the test data and each of the training data to decide the final classification output. Because the Euclidean distance function is the most widely used distance metric in k-NN, no study has examined the classification performance of k-NN with different distance functions, especially for various medical domain problems. Therefore, the aim of this paper is to investigate whether the distance function can affect the k-NN performance over different medical datasets. Our experiments are based on three different types of medical datasets containing categorical, numerical, and mixed types of data, and four different distance functions, namely Euclidean, cosine, Chi square, and Minkowski, are used individually during k-NN classification. The experimental results show that using the Chi square distance function is the best choice for the three different types of datasets. However, the cosine and Euclidean (and Minkowski) distance functions perform the worst over the mixed type of datasets. In this paper, we demonstrate that the chosen distance function can affect the classification accuracy of the k-NN classifier. For the medical domain datasets including the categorical, numerical, and mixed types of data, k-NN based on the Chi square distance function performs the best.
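
    The four distance functions named above can be swapped into the same k-NN classifier, as in the following sketch. The chi-square form used here, sum((a - b)^2 / (a + b)) over non-negative features, is one common definition and is an assumption, as are the function names and the Minkowski order p = 3; the abstract does not give the exact formulas used.

```python
import math

def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def minkowski(x, y, p=3):
    # p = 2 reduces to the Euclidean distance; p = 3 is an arbitrary illustrative default.
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def cosine_distance(x, y):
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / norm

def chi_square(x, y):
    # Assumes non-negative features; terms where both values are zero are skipped.
    return sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)

def knn_predict(train, query, k, distance):
    """Majority vote among the k training points closest to `query` under `distance`."""
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    labels = [label for _, label in neighbours]
    # sorted() makes tie-breaking deterministic (alphabetical) rather than arbitrary.
    return max(sorted(set(labels)), key=labels.count)

# Hypothetical toy data with non-negative features.
train = [((1.0, 0.0), "x"), ((2.0, 0.1), "x"), ((3.0, 0.2), "x"),
         ((0.0, 1.0), "y"), ((0.1, 2.0), "y")]
results = {d.__name__: knn_predict(train, (0.5, 3.0), 3, d)
           for d in (euclidean, minkowski, cosine_distance, chi_square)}
```

    Passing the distance as a parameter keeps the classifier unchanged while the metric varies, which mirrors the comparison the study describes.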

  9. Discrimination of soft tissues using laser-induced breakdown spectroscopy in combination with k nearest neighbors (kNN) and support vector machine (SVM) classifiers

    NASA Astrophysics Data System (ADS)

    Li, Xiaohui; Yang, Sibo; Fan, Rongwei; Yu, Xin; Chen, Deying

    2018-06-01

    In this paper, discrimination of soft tissues using laser-induced breakdown spectroscopy (LIBS) in combination with multivariate statistical methods is presented. Fresh pork fat, skin, ham, loin and tenderloin muscle tissues are manually cut into slices and ablated using a 1064 nm pulsed Nd:YAG laser. Discrimination analyses between fat, skin and muscle tissues, and further between highly similar ham, loin and tenderloin muscle tissues, are performed based on the LIBS spectra in combination with multivariate statistical methods, including principal component analysis (PCA), k nearest neighbors (kNN) classification, and support vector machine (SVM) classification. Performances of the discrimination models, including accuracy, sensitivity and specificity, are evaluated using 10-fold cross validation. The classification models are optimized to achieve best discrimination performances. The fat, skin and muscle tissues can be definitely discriminated using both kNN and SVM classifiers, with accuracy of over 99.83%, sensitivity of over 0.995 and specificity of over 0.998. The highly similar ham, loin and tenderloin muscle tissues can also be discriminated with acceptable performances. The best performances are achieved with SVM classifier using Gaussian kernel function, with accuracy of 76.84%, sensitivity of over 0.742 and specificity of over 0.869. The results show that the LIBS technique assisted with multivariate statistical methods could be a powerful tool for online discrimination of soft tissues, even for tissues of high similarity, such as muscles from different parts of the animal body. This technique could be used for discrimination of tissues suffering minor clinical changes, thus may advance the diagnosis of early lesions and abnormalities.
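
    Accuracy, sensitivity and specificity for a multi-class problem such as the one above are usually derived per class from a confusion matrix in a one-vs-rest fashion. The sketch below assumes that convention; the matrix values are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(confusion, cls):
    """One-vs-rest sensitivity and specificity for class index `cls`.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp
    fp = sum(confusion[row][cls] for row in range(n)) - tp
    tn = sum(map(sum, confusion)) - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)

def accuracy(confusion):
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(map(sum, confusion))

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
confusion = [[8, 2, 0],
             [1, 9, 0],
             [0, 0, 10]]
sens, spec = sensitivity_specificity(confusion, 0)
```

    In a 10-fold cross-validation these quantities would typically be computed on each held-out fold, or on the pooled predictions, and then summarised.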

  10. Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting

    NASA Astrophysics Data System (ADS)

    Zhang, Ningning; Lin, Aijing; Shang, Pengjian

    2017-07-01

    In this paper, we propose a new two-stage methodology that combines the ensemble empirical mode decomposition (EEMD) with multidimensional k-nearest neighbor model (MKNN) in order to forecast the closing price and high price of the stocks simultaneously. The modified algorithm of k-nearest neighbors (KNN) has an increasingly wide application in the prediction of all fields. Empirical mode decomposition (EMD) decomposes a nonlinear and non-stationary signal into a series of intrinsic mode functions (IMFs), however, it cannot reveal characteristic information of the signal with much accuracy as a result of mode mixing. So ensemble empirical mode decomposition (EEMD), an improved method of EMD, is presented to resolve the weaknesses of EMD by adding white noise to the original data. With EEMD, the components with true physical meaning can be extracted from the time series. Utilizing the advantage of EEMD and MKNN, the new proposed ensemble empirical mode decomposition combined with multidimensional k-nearest neighbor model (EEMD-MKNN) has high predictive precision for short-term forecasting. Moreover, we extend this methodology to the case of two-dimensions to forecast the closing price and high price of the four stocks (NAS, S&P500, DJI and STI stock indices) at the same time. The results indicate that the proposed EEMD-MKNN model has a higher forecast precision than EMD-KNN, KNN method and ARIMA.

  11. Secure and Efficient k-NN Queries

    PubMed Central

    Asif, Hafiz; Vaidya, Jaideep; Shafiq, Basit; Adam, Nabil

    2017-01-01

    Given the morass of available data, ranking and best match queries are often used to find records of interest. As such, k-NN queries, which give the k closest matches to a query point, are of particular interest, and have many applications. We study this problem in the context of the financial sector, wherein an investment portfolio database is queried for matching portfolios. Given the sensitivity of the information involved, our key contribution is to develop a secure k-NN computation protocol that can enable the computation of k-NN queries in a distributed multi-party environment while taking domain semantics into account. The experimental results show that the proposed protocols are extremely efficient. PMID:29218333
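
    A plain (non-secure, single-party) k-NN query of the kind defined above can be sketched in a few lines. This shows only the query semantics, not the secure multi-party protocol the paper contributes; the Euclidean metric and the record values are assumptions.

```python
import heapq
import math

def knn_query(records, query, k):
    """Return the k records closest to `query`, nearest first (brute-force scan)."""
    return heapq.nsmallest(k, records, key=lambda record: math.dist(record, query))

# Hypothetical two-feature records, e.g. normalised portfolio attributes.
records = [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0), (9.0, 9.0)]
matches = knn_query(records, (0.9, 0.9), k=2)
```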

  12. Applying an efficient K-nearest neighbor search to forest attribute imputation

    Treesearch

    Andrew O. Finley; Ronald E. McRoberts; Alan R. Ek

    2006-01-01

    This paper explores the utility of an efficient nearest neighbor (NN) search algorithm for applications in multi-source kNN forest attribute imputation. The search algorithm reduces the number of distance calculations between a given target vector and each reference vector, thereby, decreasing the time needed to discover the NN subset. Results of five trials show gains...

  13. Geometric k-nearest neighbor estimation of entropy and mutual information

    NASA Astrophysics Data System (ADS)

    Lord, Warren M.; Sun, Jie; Bollt, Erik M.

    2018-03-01

    Nonparametric estimation of mutual information is used in a wide range of scientific problems to quantify dependence between variables. The k-nearest neighbor (knn) methods are consistent, and therefore expected to work well for a large sample size. These methods use geometrically regular local volume elements. This practice allows maximum localization of the volume elements, but can also induce a bias due to a poor description of the local geometry of the underlying probability measure. We introduce a new class of knn estimators that we call geometric knn estimators (g-knn), which use more complex local volume elements to better model the local geometry of the probability measures. As an example of this class of estimators, we develop a g-knn estimator of entropy and mutual information based on elliptical volume elements, capturing the local stretching and compression common to a wide range of dynamical system attractors. A series of numerical examples in which the thickness of the underlying distribution and the sample sizes are varied suggest that local geometry is a source of problems for knn methods such as the Kraskov-Stögbauer-Grassberger estimator when local geometric effects cannot be removed by global preprocessing of the data. The g-knn method performs well despite the manipulation of the local geometry. In addition, the examples suggest that the g-knn estimators can be of particular relevance to applications in which the system is large, but the data size is limited.

  14. 3D Nearest Neighbour Search Using a Clustered Hierarchical Tree Structure

    NASA Astrophysics Data System (ADS)

    Suhaibah, A.; Uznir, U.; Anton, F.; Mioc, D.; Rahman, A. A.

    2016-06-01

    Locating and analysing the location of new stores or outlets is one of the common issues facing retailers and franchisers. The aim is to ensure that newly opened stores are strategically located to attract the highest possible number of customers. Spatial information is used to manage, maintain and analyse these store locations. However, since franchises and chain stores in urban areas operate within high-rise multi-level buildings, a three-dimensional (3D) method is required in order to locate and identify surrounding information, such as the level at which a franchise unit will be located, or whether a franchise unit is located at the best level for visibility. One of the commonly used analyses for retrieving surrounding information is Nearest Neighbour (NN) analysis, which takes a point location and identifies the surrounding neighbours. However, with the immense number of urban datasets, the retrieval and analysis of nearest neighbour information, and their efficiency, will become more complex and crucial. In this paper, we present a technique to retrieve nearest neighbour information in 3D space using a clustered hierarchical tree structure. Based on our findings, the proposed approach showed a substantial improvement in response time compared to existing spatial access methods in databases. The query performance was tested using a dataset consisting of 500,000 point locations of buildings and franchising units. The results are presented in this paper. Another advantage of this structure is that it also offers minimal overlap and coverage among nodes, which can reduce repetitive data entry.

  15. Using genetic algorithms to optimize k-Nearest Neighbors configurations for use with airborne laser scanning data

    Treesearch

    Ronald E. McRoberts; Grant M. Domke; Qi Chen; Erik Næsset; Terje Gobakken

    2016-01-01

    The relatively small sampling intensities used by national forest inventories are often insufficient to produce the desired precision for estimates of population parameters unless the estimation process is augmented with auxiliary information, usually in the form of remotely sensed data. The k-Nearest Neighbors (k-NN) technique is a non-parametric, multivariate approach...

  16. Estimating areal means and variances of forest attributes using the k-Nearest Neighbors technique and satellite imagery

    Treesearch

    Ronald E. McRoberts; Erkki O. Tomppo; Andrew O. Finley; Heikkinen Juha

    2007-01-01

    The k-Nearest Neighbor (k-NN) technique has become extremely popular for a variety of forest inventory mapping and estimation applications. Much of this popularity may be attributed to the non-parametric, multivariate features of the technique, its intuitiveness, and its ease of use. When used with satellite imagery and forest...

  17. Bees do not use nearest-neighbour rules for optimization of multi-location routes.

    PubMed

    Lihoreau, Mathieu; Chittka, Lars; Le Comber, Steven C; Raine, Nigel E

    2012-02-23

    Animals collecting patchily distributed resources are faced with complex multi-location routing problems. Rather than comparing all possible routes, they often find reasonably short solutions by simply moving to the nearest unvisited resources when foraging. Here, we report the travel optimization performance of bumble-bees (Bombus terrestris) foraging in a flight cage containing six artificial flowers arranged such that movements between nearest-neighbour locations would lead to a long suboptimal route. After extensive training (80 foraging bouts and at least 640 flower visits), bees reduced their flight distances and prioritized shortest possible routes, while almost never following nearest-neighbour solutions. We discuss possible strategies used during the establishment of stable multi-location routes (or traplines), and how these could allow bees and other animals to solve complex routing problems through experience, without necessarily requiring a sophisticated cognitive representation of space.
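
    The gap between a nearest-neighbour rule and the shortest possible route can be reproduced on a toy layout. The sketch below is not the experimental arrangement from the study: it uses four hypothetical flowers on a line (the study used six in a flight cage), a round trip from a nest at the origin, and a brute-force search over all visiting orders.

```python
import itertools
import math

def route_length(nest, order):
    """Length of the round trip nest -> flowers in `order` -> nest."""
    stops = [nest, *order, nest]
    return sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))

def nearest_neighbour_route(nest, flowers):
    """Greedy rule: always fly to the closest flower not yet visited."""
    remaining, current, order = list(flowers), nest, []
    while remaining:
        nxt = min(remaining, key=lambda flower: math.dist(current, flower))
        remaining.remove(nxt)
        order.append(nxt)
        current = nxt
    return order

def shortest_route(nest, flowers):
    """Exhaustive search over all visiting orders (feasible only for few flowers)."""
    return min(itertools.permutations(flowers),
               key=lambda order: route_length(nest, order))

# Hypothetical layout chosen so that the greedy rule zig-zags across the nest.
nest = (0.0, 0.0)
flowers = [(1.0, 0.0), (-1.4, 0.0), (3.5, 0.0), (-7.0, 0.0)]
greedy_length = route_length(nest, nearest_neighbour_route(nest, flowers))
optimal_length = route_length(nest, shortest_route(nest, flowers))
```

    On this layout the greedy route has length 25.8 while the shortest round trip has length 21.0, because the nearest-neighbour rule repeatedly crosses the nest instead of sweeping each side once.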

  18. Progress in adapting k-NN methods for forest mapping and estimation using the new annual Forest Inventory and Analysis data

    Treesearch

    Reija Haapanen; Kimmo Lehtinen; Jukka Miettinen; Marvin E. Bauer; Alan R. Ek

    2002-01-01

    The k-nearest neighbor (k-NN) method has been undergoing development and testing for applications with USDA Forest Service Forest Inventory and Analysis (FIA) data in Minnesota since 1997. Research began using the 1987-1990 FIA inventory of the state, the then standard 10-point cluster plots, and Landsat TM imagery. In the past year, research has moved to examine...

  19. Categorizing document by fuzzy C-Means and K-nearest neighbors approach

    NASA Astrophysics Data System (ADS)

    Priandini, Novita; Zaman, Badrus; Purwanti, Endah

    2017-08-01

    Advances in technology have made document categorization important, owing to the growing number of documents. Managing documents by categorizing them is an application of Information Retrieval, because the process involves text mining. Categorization can be done with both the Fuzzy C-Means (FCM) and K-Nearest Neighbors (KNN) methods, and this experiment combines the two. The aim of the experiment is to increase the performance of document categorization. First, FCM is used to cluster the training documents. Second, KNN is used to categorize the test documents until the categorization output is shown. In the experiment, 14 of the 20 test documents were retrieved under their relevant category, while 6 were retrieved under an irrelevant category. The system evaluation shows that both precision and recall are 0.7.
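
    The reported precision and recall of 0.7 are consistent with 14 correct assignments out of 20 if each document receives exactly one category and the scores are micro-averaged; in that setting every wrong assignment counts once as a false positive and once as a false negative. That averaging scheme is an assumption, since the abstract does not state how the scores were computed, and the labels below are hypothetical.

```python
def micro_precision_recall(true_labels, predicted_labels):
    """Micro-averaged precision and recall for single-label categorization."""
    tp = sum(t == p for t, p in zip(true_labels, predicted_labels))
    fp = len(predicted_labels) - tp  # each wrong prediction is a false positive for the predicted class
    fn = len(true_labels) - tp       # and a false negative for the true class
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical labels: 20 test documents, 14 assigned to the correct category.
true_labels = ["sport"] * 10 + ["economy"] * 10
predicted_labels = ["sport"] * 7 + ["economy"] * 3 + ["economy"] * 7 + ["sport"] * 3
precision, recall = micro_precision_recall(true_labels, predicted_labels)
```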

  20. Nearest neighbor-density-based clustering methods for large hyperspectral images

    NASA Astrophysics Data System (ADS)

    Cariou, Claude; Chehdi, Kacem

    2017-10-01

    We address the problem of hyperspectral image (HSI) pixel partitioning using nearest neighbor - density-based (NN-DB) clustering methods. NN-DB methods are able to cluster objects without specifying the number of clusters to be found. Within the NN-DB approach, we focus on deterministic methods, e.g. ModeSeek, knnClust, and GWENN (standing for Graph WatershEd using Nearest Neighbors). These methods only require the availability of a k-nearest neighbor (kNN) graph based on a given distance metric. Recently, a new DB clustering method, called Density Peak Clustering (DPC), has received much attention, and kNN versions of it have quickly followed and shown their efficiency. However, NN-DB methods still suffer from the difficulty of obtaining the kNN graph due to the quadratic complexity with respect to the number of pixels. This is why GWENN was embedded into a multiresolution (MR) scheme to bypass the computation of the full kNN graph over the image pixels. In this communication, we propose to extend the MR-GWENN scheme in three respects. Firstly, similarly to knnClust, the original labeling rule of GWENN is modified to account for local density values, in addition to the labels of previously processed objects. Secondly, we set up a modified NN search procedure within the MR scheme, in order to stabilize the number of clusters found from the coarsest to the finest spatial resolution. Finally, we show that these extensions can be easily adapted to the three other NN-DB methods (ModeSeek, knnClust, knnDPC) for pixel clustering in large HSIs. Experiments are conducted to compare the four NN-DB methods for pixel clustering in HSIs. We show that NN-DB methods can outperform a classical clustering method such as fuzzy c-means (FCM), in terms of classification accuracy, relevance of found clusters, and clustering speed. Finally, we demonstrate the feasibility and evaluate the performances of NN-DB methods on a very large image acquired by our AISA Eagle hyperspectral

  1. Quantum Algorithm for K-Nearest Neighbors Classification Based on the Metric of Hamming Distance

    NASA Astrophysics Data System (ADS)

    Ruan, Yue; Xue, Xiling; Liu, Heng; Tan, Jianing; Li, Xi

    2017-11-01

    The K-nearest neighbors (KNN) algorithm is a common algorithm used for classification, and also a sub-routine in various complicated machine learning tasks. In this paper, we present a quantum algorithm (QKNN) for implementing this algorithm based on the metric of Hamming distance. We put forward a quantum circuit for computing the Hamming distance between a testing sample and each feature vector in the training set. Taking advantage of this method, we realize a good analog of the classical KNN algorithm by setting a distance threshold value t to select the k nearest neighbors. As a result, QKNN achieves O(n³) performance, which depends only on the dimension of the feature vectors, and high classification accuracy, outperforming Lloyd's algorithm (Lloyd et al. 2013) and Wiebe's algorithm (Wiebe et al. 2014).
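
    The idea of selecting neighbours with a Hamming-distance threshold t can be illustrated classically. The sketch below is a classical analogue only and says nothing about the quantum circuit; representing feature vectors as bit strings, the majority vote and the toy data are assumptions.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

def threshold_neighbours_predict(train, query, t):
    """Vote among all training vectors within Hamming distance t of `query`.

    Returns None when no training vector falls inside the threshold.
    """
    votes = Counter(label for bits, label in train if hamming(bits, query) <= t)
    return votes.most_common(1)[0][0] if votes else None

# Hypothetical 4-bit feature vectors.
train = [("0000", "x"), ("0001", "x"), ("1111", "y"), ("1110", "y")]
label = threshold_neighbours_predict(train, "0011", t=2)
```

    Unlike a fixed k, the threshold decides how many neighbours vote, so the neighbour count varies with the query.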

  2. Multi-spectral brain tissue segmentation using automatically trained k-Nearest-Neighbor classification.

    PubMed

    Vrooman, Henri A; Cocosco, Chris A; van der Lijn, Fedde; Stokking, Rik; Ikram, M Arfan; Vernooij, Meike W; Breteler, Monique M B; Niessen, Wiro J

    2007-08-01

    Conventional k-Nearest-Neighbor (kNN) classification, which has been successfully applied to classify brain tissue in MR data, requires training on manually labeled subjects. This manual labeling is a laborious and time-consuming procedure. In this work, a new fully automated brain tissue classification procedure is presented, in which kNN training is automated. This is achieved by non-rigidly registering the MR data with a tissue probability atlas to automatically select training samples, followed by a post-processing step to keep the most reliable samples. The accuracy of the new method was compared to rigid registration-based training and to conventional kNN-based segmentation using training on manually labeled subjects for segmenting gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) in 12 data sets. Furthermore, for all classification methods, the performance was assessed when varying the free parameters. Finally, the robustness of the fully automated procedure was evaluated on 59 subjects. The automated training method using non-rigid registration with a tissue probability atlas was significantly more accurate than rigid registration. For both automated training using non-rigid registration and for the manually trained kNN classifier, the difference with the manual labeling by observers was not significantly larger than inter-observer variability for all tissue types. From the robustness study, it was clear that, given an appropriate brain atlas and optimal parameters, our new fully automated, non-rigid registration-based method gives accurate and robust segmentation results. A similarity index was used for comparison with manually trained kNN. The similarity indices were 0.93, 0.92 and 0.92, for CSF, GM and WM, respectively. It can be concluded that our fully automated method using non-rigid registration may replace manual segmentation, and thus that automated brain tissue segmentation without laborious manual training is feasible.

  3. GPU based cloud system for high-performance arrhythmia detection with parallel k-NN algorithm.

    PubMed

    Tae Joon Jun; Hyun Ji Park; Hyuk Yoo; Young-Hak Kim; Daeyoung Kim

    2016-08-01

    In this paper, we propose a GPU-based Cloud system for high-performance arrhythmia detection. The Pan-Tompkins algorithm is used for QRS detection, and we optimized the beat classification algorithm with K-Nearest Neighbor (K-NN). To support high-performance beat classification on the system, we parallelized the beat classification algorithm with CUDA to execute it on virtualized GPU devices on the Cloud system. The MIT-BIH Arrhythmia database is used for validation of the algorithm. The system achieved a detection rate of about 93.5%, which is comparable to previous research, while our algorithm executes 2.5 times faster than a CPU-only detection algorithm.

  4. Study of parameters of the nearest neighbour shared algorithm on clustering documents

    NASA Astrophysics Data System (ADS)

    Mustika Rukmi, Alvida; Budi Utomo, Daryono; Imro’atus Sholikhah, Neni

    2018-03-01

    Document clustering is one way of automatically managing documents, extracting document topics and quickly filtering information. Preprocessing for document clustering by text mining consists of keyword extraction using Rapid Automatic Keyphrase Extraction (RAKE) and representing each document as a concept vector using Latent Semantic Analysis (LSA). The clustering process is then performed on the preprocessed documents so that documents with similar topics fall in the same cluster. The Shared Nearest Neighbour (SNN) algorithm is a clustering method based on the number of "nearest neighbors" shared. The parameters of the SNN algorithm are: k, the number of nearest neighbor documents; ε, the number of shared nearest neighbor documents; and MinT, the minimum number of similar documents that can form a cluster. The SNN algorithm is characterised by shared 'neighbor' properties. Each cluster is formed by keywords that are shared by the documents. The SNN algorithm allows a cluster to be built on more than one keyword if the frequency of those keywords in the documents is also high. The choice of parameter values in the SNN algorithm affects the document clustering results. A higher value of k increases the number of neighbor documents of each document, so the similarity among neighboring documents is lower, and the accuracy of each cluster is also low. A higher value of ε causes each document to keep only neighbor documents with a high similarity when building a cluster, which also leaves more unclassified documents (noise). A higher value of MinT decreases the number of clusters, since groups of similar documents cannot form a cluster if they number fewer than MinT. The parameters of the SNN algorithm thus determine the performance of the clustering result and the amount of noise (unclustered documents). The Silhouette coefficient shows almost the same result across many experiments, above 0.9, which means that the SNN algorithm works well
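
    The roles of k and ε described above can be sketched as follows: each document has a list of its k nearest neighbours, and two documents are linked when their lists share at least ε entries. This is a simplified illustration under assumed names and data; it omits the MinT step and any further conditions the authors' SNN implementation may apply.

```python
from itertools import combinations

def shared_neighbour_count(knn_lists, i, j):
    """Number of neighbours that documents i and j have in common."""
    return len(set(knn_lists[i]) & set(knn_lists[j]))

def snn_links(knn_lists, eps):
    """Pairs of documents whose k-nearest-neighbour lists share at least `eps` entries."""
    return {
        (i, j)
        for i, j in combinations(sorted(knn_lists), 2)
        if shared_neighbour_count(knn_lists, i, j) >= eps
    }

# Hypothetical k = 3 neighbour lists, keyed by document id.
knn_lists = {0: [1, 2, 3], 1: [0, 2, 3], 4: [5, 6, 3]}
strong_links = snn_links(knn_lists, eps=2)
weak_links = snn_links(knn_lists, eps=1)
```

    Raising eps from 1 to 2 drops the links to document 4, which matches the abstract's observation that a higher ε keeps only highly similar neighbours and leaves more documents unclustered.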

  5. Distributed Computation of the knn Graph for Large High-Dimensional Point Sets

    PubMed Central

    Plaku, Erion; Kavraki, Lydia E.

    2009-01-01

    High-dimensional problems arising from robot motion planning, biology, data mining, and geographic information systems often require the computation of k nearest neighbor (knn) graphs. The knn graph of a data set is obtained by connecting each point to its k closest points. As the research in the above-mentioned fields progressively addresses problems of unprecedented complexity, the demand for computing knn graphs based on arbitrary distance metrics and large high-dimensional data sets increases, exceeding resources available to a single machine. In this work we efficiently distribute the computation of knn graphs for clusters of processors with message passing. Extensions to our distributed framework include the computation of graphs based on other proximity queries, such as approximate knn or range queries. Our experiments show nearly linear speedup with over one hundred processors and indicate that similar speedup can be obtained with several hundred processors. PMID:19847318
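
    The definition of the knn graph given above (connect each point to its k closest points) corresponds to the following single-machine, brute-force sketch. It is not the distributed message-passing framework of the paper; the Euclidean metric and the sample points are assumptions.

```python
import math

def knn_graph(points, k):
    """Map each point index to the indices of its k closest other points.

    Brute force: every pair of points is compared, so the cost grows
    quadratically with the number of points.
    """
    graph = {}
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        graph[i] = others[:k]
    return graph

# Hypothetical points forming two well-separated pairs.
points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
graph = knn_graph(points, k=1)
```

    The quadratic number of distance evaluations in this naive version is the cost that motivates distributing the computation for large data sets.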

  6. Text Categorization Using Weight Adjusted k-Nearest Neighbor Classification

    DTIC Science & Technology

    1999-05-17

    Experimental Results In this section, we compare kNN-mut which uses the weight vector obtained using mutual information as the final weight vector and...WAKNN against kNN, C4.5 [Qui93], RIPPER [Coh95], PEBLS [CS93], Rainbow [McC96], VSM [Low95] on several synthetic and real data sets. VSM is another k...obtained without this option.
           C4.5   RIPPER  PEBLS  Rainbow  kNN   WAKNN
    Syn-1  100.0  100.0   100.0  100.0    77.3  100.0
    Syn-2  67.5   69.5    62.0   50.0     66.0  68.8
    Syn

  7. Multi-color space threshold segmentation and self-learning k-NN algorithm for surge test EUT status identification

    NASA Astrophysics Data System (ADS)

    Huang, Jian; Liu, Gui-xiong

    2016-09-01

    The identification of targets varies in different surge tests. A multi-color space threshold segmentation and self-learning k-nearest neighbor (k-NN) algorithm for identifying the status of equipment under test (EUT) was proposed, because identifying equipment status by feature matching required training new patterns before every test. First, the color space used for segmentation (L*a*b*, hue saturation lightness (HSL), or hue saturation value (HSV)) was selected according to the high-luminance points ratio and white-luminance points ratio of the image. Second, the unknown-class sample S_r was classified by the k-NN algorithm with training set T_z according to a feature vector formed from the number of pixels, eccentricity ratio, compactness ratio, and Euler's numbers. Last, when the classification confidence coefficient equaled k, S_r was added as a sample to the pre-training set T_z'. When T_z' was saturated, the training set T_z was enlarged to T_{z+1} by T_z'. On nine series of illuminant, indicator light, screen, and disturbance samples (21600 frames in total), the algorithm achieved an identification accuracy of 98.65% and selected five groups of samples by itself to enlarge the training set from T_0 to T_5.

  8. Evaluation of normalization methods for cDNA microarray data by k-NN classification

    PubMed Central

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-01-01

    Background Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Results Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. 
    Conclusion Using LOOCV error of k-NNs as the evaluation criterion, three

  9. Evaluation of normalization methods for cDNA microarray data by k-NN classification.

    PubMed

    Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J

    2005-07-26

    Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques which remove either spatial-dependent dye bias (referred later as spatial effect) or intensity-dependent dye bias (referred later as intensity effect) moderately reduce LOOCV classification errors; whereas double-bias-removal techniques which remove both spatial- and intensity effect reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed intensity effect globally and spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error. Using LOOCV error of k-NNs as the evaluation criterion, three double
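
    The evaluation criterion used in this study, the leave-one-out cross-validation error of a k-NN classifier, can be sketched as below. The sketch assumes Euclidean distance and majority voting, and the toy samples are hypothetical stand-ins for normalized expression profiles. It does not reproduce any of the normalization methods.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """Majority label among the k training samples closest to query."""
    nearest = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def loocv_error(samples, k):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i, (x, y) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # training set without sample i
        if knn_predict(rest, x, k) != y:
            errors += 1
    return errors / len(samples)


# Hypothetical one-dimensional samples from two well-separated classes.
samples = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"),
           ((1.0,), "b"), ((1.1,), "b"), ((1.2,), "b")]
error = loocv_error(samples, k=1)
```

    Comparing `loocv_error` on the same samples after different preprocessing steps is the kind of comparison the abstract describes. A lower error suggests that the preprocessing helped the downstream classifier.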

  10. Colorectal Cancer and Colitis Diagnosis Using Fourier Transform Infrared Spectroscopy and an Improved K-Nearest-Neighbour Classifier.

    PubMed

    Li, Qingbo; Hao, Can; Kang, Xue; Zhang, Jialin; Sun, Xuejun; Wang, Wenbo; Zeng, Haishan

    2017-11-27

    Combining Fourier transform infrared spectroscopy (FTIR) with endoscopy, it is expected that noninvasive, rapid detection of colorectal cancer can be performed in vivo in the future. In this study, Fourier transform infrared spectra were collected from 88 endoscopic biopsy colorectal tissue samples (41 colitis and 47 cancers). A new method, viz., entropy weight local-hyperplane k-nearest-neighbor (EWHK), which is an improved version of K-local hyperplane distance nearest-neighbor (HKNN), is proposed for tissue classification. In order to avoid limiting high dimensions and small values of the nearest neighbor, the new EWHK method calculates feature weights based on information entropy. The average results of the random classification showed that the EWHK classifier for differentiating cancer from colitis samples produced a sensitivity of 81.38% and a specificity of 92.69%.

  11. k-Nearest neighbour local linear prediction of scalp EEG activity during intermittent photic stimulation.

    PubMed

    Erla, Silvia; Faes, Luca; Tranquillini, Enzo; Orrico, Daniele; Nollo, Giandomenico

    2011-05-01

    The characterization of the EEG response to photic stimulation (PS) is an important issue with significant clinical relevance. This study aims to quantify and map the complexity of the EEG during PS, where complexity is measured as the degree of unpredictability resulting from local linear prediction. EEG activity was recorded with eyes closed (EC) and eyes open (EO) during resting and PS at 5, 10, and 15 Hz in a group of 30 healthy subjects and in a case-report of a patient suffering from cerebral ischemia. The mean squared prediction error (MSPE) resulting from k-nearest neighbour local linear prediction was calculated in each condition as an index of EEG unpredictability. The linear or nonlinear nature of the system underlying EEG activity was evaluated quantifying MSPE as a function of the neighbourhood size during local linear prediction, and by surrogate data analysis as well. Unpredictability maps were obtained for each subject interpolating MSPE values over a schematic head representation. Results on healthy subjects evidenced: (i) the prevalence of linear mechanisms in the generation of EEG dynamics, (ii) the lower predictability of EO EEG, (iii) the desynchronization of oscillatory mechanisms during PS leading to increased EEG complexity, (iv) the entrainment of alpha rhythm during EC obtained by 10 Hz PS, and (v) differences of EEG predictability among different scalp regions. The ischemic patient showed different MSPE values in healthy and damaged regions. The EEG predictability decreased moving from the early acute stage to a stage of partial recovery. These results suggest that nonlinear prediction can be a useful tool to characterize EEG dynamics during PS protocols, and may consequently constitute a complement of quantitative EEG analysis in clinical applications. Copyright © 2010 IPEM. Published by Elsevier Ltd. All rights reserved.
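
    The idea of measuring MSPE as a function of neighbourhood size can be sketched as follows. The sketch assumes an embedding dimension of 1 and a least-squares line fitted to the k nearest past states, and it uses a logistic-map series as a hypothetical stand-in for a signal. The embedding and fitting details of the study itself are not given in the abstract.

```python
def local_linear_predict(history, x, k):
    """Predict the successor of state x from its k nearest past states.

    history is a list of (state, next_state) pairs.
    """
    nearest = sorted(history, key=lambda pair: abs(pair[0] - x))[:k]
    n = len(nearest)
    mean_s = sum(s for s, _ in nearest) / n
    mean_t = sum(t for _, t in nearest) / n
    var_s = sum((s - mean_s) ** 2 for s, _ in nearest)
    if var_s == 0.0:
        return mean_t  # degenerate neighbourhood: fall back to the local mean
    slope = sum((s - mean_s) * (t - mean_t) for s, t in nearest) / var_s
    return mean_t + slope * (x - mean_s)


def mspe(series, k, n_train):
    """Mean squared one-step prediction error over the tail of the series."""
    history = list(zip(series[:n_train - 1], series[1:n_train]))
    errors = [
        (local_linear_predict(history, series[i], k) - series[i + 1]) ** 2
        for i in range(n_train - 1, len(series) - 1)
    ]
    return sum(errors) / len(errors)


# Hypothetical nonlinear test signal: the logistic map.
series = [0.3]
for _ in range(249):
    series.append(3.9 * series[-1] * (1.0 - series[-1]))

small_k = mspe(series, k=4, n_train=200)    # local fit
large_k = mspe(series, k=199, n_train=200)  # one global linear fit
```

    For this nonlinear signal the error is expected to rise as k approaches the full history, whereas for a linear system it would stay roughly flat. That contrast is the basis of the linear-versus-nonlinear assessment mentioned in the abstract.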

  12. Trans-polyacetylene within the extended tight-binding picture and evidence for next-nearest neighbour hopping from the dispersion of interband transition edges

    NASA Astrophysics Data System (ADS)

    Drechsler, S. L.; Heiner, E.; Osipov, V. A.

    1986-11-01

    The influence of additional non-nearest neighbour hopping processes is investigated in an SSH-like model. The enhanced splitting of absorption peaks due to π-π* interband transitions (deduced from new electron loss data of Fink and Leising /17/) can be explained by a reasonable value of the next-nearest neighbour hopping integral |t_2| ≈ 0.05 t_0.

  13. Optimal Detection Range of RFID Tag for RFID-based Positioning System Using the k-NN Algorithm.

    PubMed

    Han, Soohee; Kim, Junghwan; Park, Choung-Hwan; Yoon, Hee-Cheon; Heo, Joon

    2009-01-01

    Positioning technology to track a moving object is an important and essential component of ubiquitous computing environments and applications. An RFID-based positioning system using the k-nearest neighbor (k-NN) algorithm can determine the position of a moving reader from observed reference data. In this study, the optimal detection range of an RFID-based positioning system was determined on the principle that tag spacing can be derived from the detection range. It was assumed that reference tags without signal strength information are regularly distributed in 1-, 2- and 3-dimensional spaces. The optimal detection range was determined, through analytical and numerical approaches, to be 125% of the tag-spacing distance in 1-dimensional space. Through numerical approaches, the range was 134% in 2-dimensional space, 143% in 3-dimensional space.

  14. Research on cardiovascular disease prediction based on distance metric learning

    NASA Astrophysics Data System (ADS)

    Ni, Zhuang; Liu, Kui; Kang, Guixia

    2018-04-01

    Distance metric learning algorithms have been widely applied to medical diagnosis and have shown their strengths in classification problems. The k-nearest neighbour (KNN) is an efficient method which treats each feature equally. The large margin nearest neighbour classification (LMNN) improves the accuracy of KNN by learning a global distance metric, but it does not consider the locality of data distributions. In this paper, we propose a new distance metric algorithm, named COS-SUBLMNN, that combines the cosine metric with LMNN and pays more attention to local features of the data in order to overcome this shortcoming of LMNN and improve classification accuracy. The proposed methodology is verified on CVD patient vectors derived from real-world medical data. The experimental results show that our method provides higher accuracy than KNN and LMNN, which demonstrates the effectiveness of the CVD risk prediction model based on COS-SUBLMNN.

  15. A nearest-neighbour discretisation of the regularized stokeslet boundary integral equation

    NASA Astrophysics Data System (ADS)

    Smith, David J.

    2018-04-01

    The method of regularized stokeslets is extensively used in biological fluid dynamics due to its conceptual simplicity and meshlessness. This simplicity carries a degree of cost in computational expense and accuracy because the number of degrees of freedom used to discretise the unknown surface traction is generally significantly higher than that required by boundary element methods. We describe a meshless method based on nearest-neighbour interpolation that significantly reduces the number of degrees of freedom required to discretise the unknown traction, increasing the range of problems that can be practically solved, without excessively complicating the task of the modeller. The nearest-neighbour technique is tested against the classical problem of rigid body motion of a sphere immersed in very viscous fluid, then applied to the more complex biophysical problem of calculating the rotational diffusion timescales of a macromolecular structure modelled by three closely-spaced non-slender rods. A heuristic for finding the required density of force and quadrature points by numerical refinement is suggested. Matlab/GNU Octave code for the key steps of the algorithm is provided, which predominantly use basic linear algebra operations, with a full implementation being provided on github. Compared with the standard Nyström discretisation, more accurate and substantially more efficient results can be obtained by de-refining the force discretisation relative to the quadrature discretisation: a cost reduction of over 10 times with improved accuracy is observed. This improvement comes at minimal additional technical complexity. Future avenues to develop the algorithm are then discussed.

  16. Mapping gradients of community composition with nearest-neighbour imputation: extending plot data for landscape analysis

    Treesearch

    Janet L. Ohmann; Matthew J. Gregory; Emilie B. Henderson; Heather M. Roberts

    2011-01-01

    Question: How can nearest-neighbour (NN) imputation be used to develop maps of multiple species and plant communities? Location: Western and central Oregon, USA, but methods are applicable anywhere. Methods: We demonstrate NN imputation by mapping woody plant communities for >100 000 km2 of diverse forests and woodlands. Species abundances on...

  17. A Novel Graph Constructor for Semisupervised Discriminant Analysis: Combined Low-Rank and k-Nearest Neighbor Graph

    PubMed Central

    Pan, Yongke; Niu, Wenjia

    2017-01-01

    Semisupervised Discriminant Analysis (SDA) is a semisupervised dimensionality reduction algorithm, which can easily resolve the out-of-sample problem. Related works usually focus on the geometric relationships of data points, which are not obvious, to enhance the performance of SDA. Unlike these related works, this paper studies regularized graph construction, which is important in graph-based semisupervised learning methods. We propose a novel graph for Semisupervised Discriminant Analysis, called the combined low-rank and k-nearest neighbor (LRKNN) graph. In our LRKNN graph, we map the data to the LR feature space and then adopt kNN to satisfy the algorithmic requirements of SDA. Since the low-rank representation can capture the global structure and the k-nearest neighbor algorithm can maximally preserve the local geometrical structure of the data, the LRKNN graph can significantly improve the performance of SDA. Extensive experiments on several real-world databases show that the proposed LRKNN graph is an efficient graph constructor, which can largely outperform other commonly used baselines. PMID:28316616

  18. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery

    PubMed Central

    Thanh Noi, Phan; Kappas, Martin

    2017-01-01

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km2 within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets. PMID:29271909

  19. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery.

    PubMed

    Thanh Noi, Phan; Kappas, Martin

    2017-12-22

    In previous classification studies, three non-parametric classifiers, Random Forest (RF), k-Nearest Neighbor (kNN), and Support Vector Machine (SVM), were reported as the foremost classifiers at producing high accuracies. However, only a few studies have compared the performances of these classifiers with different training sample sizes for the same remote sensing images, particularly the Sentinel-2 Multispectral Imager (MSI). In this study, we examined and compared the performances of the RF, kNN, and SVM classifiers for land use/cover classification using Sentinel-2 image data. An area of 30 × 30 km² within the Red River Delta of Vietnam with six land use/cover types was classified using 14 different training sample sizes, including balanced and imbalanced, from 50 to over 1250 pixels/class. All classification results showed a high overall accuracy (OA) ranging from 90% to 95%. Among the three classifiers and 14 sub-datasets, SVM produced the highest OA with the least sensitivity to the training sample sizes, followed consecutively by RF and kNN. In relation to the sample size, all three classifiers showed a similar and high OA (over 93.85%) when the training sample size was large enough, i.e., greater than 750 pixels/class or representing an area of approximately 0.25% of the total study area. The high accuracy was achieved with both imbalanced and balanced datasets.

  20. Determining the location and nearest neighbours of aluminium in zeolites with atom probe tomography

    DOE PAGES

    Perea, Daniel E.; Arslan, Ilke; Liu, Jia; ...

    2015-07-02

    Zeolite catalysis is determined by a combination of pore architecture and Brønsted acidity. As Brønsted acid sites are formed by the substitution of AlO4 for SiO4 tetrahedra, it is of utmost importance to have information on the number as well as the location and neighbouring sites of framework aluminium. Unfortunately, such detailed information has not yet been obtained, mainly due to the lack of suitable characterization methods. Here we report, using the powerful atomic-scale analysis technique known as atom probe tomography, the quantitative spatial distribution of individual aluminium atoms, including their three-dimensional extent of segregation. Ultimately, using a nearest-neighbour statistical analysis, we precisely determine the short-range distribution of aluminium over the different T-sites and determine the most probable Al–Al neighbouring distance within parent and steamed ZSM-5 crystals, as well as assess the long-range redistribution of aluminium upon zeolite steaming.

  1. Determining the location and nearest neighbours of aluminium in zeolites with atom probe tomography

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Perea, Daniel E.; Arslan, Ilke; Liu, Jia

    Zeolite catalysis is determined by a combination of pore architecture and Brønsted acidity. As Brønsted acid sites are formed by the substitution of AlO4 for SiO4 tetrahedra, it is of utmost importance to have information on the number as well as the location and neighbouring sites of framework aluminium. Unfortunately, such detailed information has not yet been obtained, mainly due to the lack of suitable characterization methods. Here we report, using the powerful atomic-scale analysis technique known as atom probe tomography, the quantitative spatial distribution of individual aluminium atoms, including their three-dimensional extent of segregation. Ultimately, using a nearest-neighbour statistical analysis, we precisely determine the short-range distribution of aluminium over the different T-sites and determine the most probable Al–Al neighbouring distance within parent and steamed ZSM-5 crystals, as well as assess the long-range redistribution of aluminium upon zeolite steaming.

  2. Statistical analysis for validating ACO-KNN algorithm as feature selection in sentiment analysis

    NASA Astrophysics Data System (ADS)

    Ahmad, Siti Rohaidah; Yusop, Nurhafizah Moziyana Mohd; Bakar, Azuraliza Abu; Yaakub, Mohd Ridzwan

    2017-10-01

    This research paper aims to propose a hybrid of ant colony optimization (ACO) and k-nearest neighbor (KNN) algorithms as feature selections for selecting and choosing relevant features from customer review datasets. Information gain (IG), genetic algorithm (GA), and rough set attribute reduction (RSAR) were used as baseline algorithms in a performance comparison with the proposed algorithm. This paper will also discuss the significance test, which was used to evaluate the performance differences between the ACO-KNN, IG-GA, and IG-RSAR algorithms. This study evaluated the performance of the ACO-KNN algorithm using precision, recall, and F-score, which were validated using the parametric statistical significance tests. The evaluation process has statistically proven that this ACO-KNN algorithm has been significantly improved compared to the baseline algorithms. In addition, the experimental results have proven that the ACO-KNN can be used as a feature selection technique in sentiment analysis to obtain quality, optimal feature subset that can represent the actual data in customer review data.

  3. Mapping ionospheric observations using combined techniques for Europe region

    NASA Astrophysics Data System (ADS)

    Tomasik, Lukasz; Gulyaeva, Tamara; Stanislawska, Iwona; Swiatek, Anna; Pozoga, Mariusz; Dziak-Jankowska, Beata

    A k-nearest neighbours (KNN) algorithm for filling gaps in missing F2-layer critical frequency data is proposed and applied. This method uses TEC data calculated from the EGNOS Vertical Delay Estimate (VDE ≈0.78 TECU) and several GNSS stations, and their spatial correlation with data from selected ionosondes. For mapping purposes, a two-dimensional similarity function in the KNN method was proposed.

  4. Nearest Neighbor Algorithms for Pattern Classification

    NASA Technical Reports Server (NTRS)

    Barrios, J. O.

    1972-01-01

    A solution of the discrimination problem is considered by means of the minimum distance classifier, commonly referred to as the nearest neighbor (NN) rule. The NN rule is nonparametric, or distribution free, in the sense that it does not depend on any assumptions about the underlying statistics for its application. The k-NN rule is a procedure that assigns an observation vector z to a category F if most of the k nearby observations x_i are elements of F. The condensed nearest neighbor (CNN) rule may be used to reduce the size of the training set required categorize The Bayes risk serves merely as a reference, the limit of excellence beyond which it is not possible to go. The NN rule is bounded below by the Bayes risk and above by twice the Bayes risk.
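
    The closing statement about the Bayes risk can be written as an inequality. The form below is the standard large-sample result of Cover and Hart (1967), quoted from the general literature rather than from this record. Here R* denotes the Bayes risk, R the asymptotic risk of the NN rule, and M the number of categories.

```latex
R^{*} \;\le\; R \;\le\; R^{*}\left(2 - \frac{M}{M-1}\,R^{*}\right) \;\le\; 2R^{*}
```

    For M = 2 the middle term reduces to 2R*(1 - R*), so the NN risk can approach twice the Bayes risk only when the Bayes risk is itself small.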

  5. Nearest neighbors by neighborhood counting.

    PubMed

    Wang, Hui

    2006-06-01

    Finding nearest neighbors is a general idea that underlies many artificial intelligence tasks, including machine learning, data mining, natural language understanding, and information retrieval. This idea is explicitly used in the k-nearest neighbors algorithm (kNN), a popular classification method. In this paper, this idea is adopted in the development of a general methodology, neighborhood counting, for devising similarity functions. We turn our focus from neighbors to neighborhoods, a region in the data space covering the data point in question. To measure the similarity between two data points, we consider all neighborhoods that cover both data points. We propose to use the number of such neighborhoods as a measure of similarity. Neighborhood can be defined for different types of data in different ways. Here, we consider one definition of neighborhood for multivariate data and derive a formula for such similarity, called neighborhood counting measure or NCM. NCM was tested experimentally in the framework of kNN. Experiments show that NCM is generally comparable to VDM and its variants, the state-of-the-art distance functions for multivariate data, and, at the same time, is consistently better for relatively large k values. Additionally, NCM consistently outperforms HEOM (a mixture of Euclidean and Hamming distances), the "standard" and most widely used distance function for multivariate data. NCM has a computational complexity in the same order as the standard Euclidean distance function and NCM is task independent and works for numerical and categorical data in a conceptually uniform way. The neighborhood counting methodology is proven sound for multivariate data experimentally. We hope it will work for other types of data.

  6. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

    PubMed

    Dashti, Ali; Komarov, Ivan; D'Souza, Roshan M

    2013-01-01

    This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large high-dimensional data cloud. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible k-NNG generation for a dataset of twenty million images with 15 k dimensionality into the realm of practical possibility.

  7. Simulating ensembles of source water quality using a K-nearest neighbor resampling approach.

    PubMed

    Towler, Erin; Rajagopalan, Balaji; Seidel, Chad; Summers, R Scott

    2009-03-01

    Climatological, geological, and water management factors can cause significant variability in surface water quality. As drinking water quality standards become more stringent, the ability to quantify the variability of source water quality becomes more important for decision-making and planning in water treatment for regulatory compliance. However, paucity of long-term water quality data makes it challenging to apply traditional simulation techniques. To overcome this limitation, we have developed and applied a robust nonparametric K-nearest neighbor (K-nn) bootstrap approach utilizing the United States Environmental Protection Agency's Information Collection Rule (ICR) data. In this technique, first an appropriate "feature vector" is formed from the best available explanatory variables. The nearest neighbors to the feature vector are identified from the ICR data and are resampled using a weight function. Repetition of this results in water quality ensembles, and consequently the distribution and the quantification of the variability. The main strengths of the approach are its flexibility, simplicity, and the ability to use a large amount of spatial data with limited temporal extent to provide water quality ensembles for any given location. We demonstrate this approach by applying it to simulate monthly ensembles of total organic carbon for two utilities in the U.S. with very different watersheds and to alkalinity and bromide at two other U.S. utilities.
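
    The resampling steps described in this record (form a feature vector, find its nearest neighbours, resample them with a weight function, repeat) can be sketched as below. The abstract does not state the weight function, so the 1/rank weights are an assumption of this sketch, as are the function name `knn_bootstrap` and the toy records.

```python
import math
import random


def knn_bootstrap(feature, records, k, n_samples, rng):
    """Resample the responses of the k records nearest to feature.

    records is a list of (feature_vector, response) pairs.
    """
    nearest = sorted(records, key=lambda r: math.dist(r[0], feature))[:k]
    # Assumed weight function: weight 1/rank, so closer neighbours are drawn
    # more often. random.choices normalises the weights internally.
    weights = [1.0 / rank for rank in range(1, len(nearest) + 1)]
    picks = rng.choices(nearest, weights=weights, k=n_samples)
    return [response for _, response in picks]


# Hypothetical (explanatory variable,) -> water quality response records.
records = [((1.0,), 2.0), ((2.0,), 2.5), ((3.0,), 3.1), ((10.0,), 9.0)]
ensemble = knn_bootstrap((2.1,), records, k=3, n_samples=100,
                         rng=random.Random(0))
```

    The returned list is one ensemble. Its spread gives an empirical picture of the variability at the queried feature vector, and the distant record never enters because it is not among the k nearest.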

  8. Conformal Prediction Based on K-Nearest Neighbors for Discrimination of Ginsengs by a Home-Made Electronic Nose

    PubMed Central

    Sun, Xiyang; Miao, Jiacheng; Wang, You; Luo, Zhiyuan; Li, Guang

    2017-01-01

    An estimate of the reliability of predictions in electronic nose applications is essential but has not received enough attention. An algorithmic framework called conformal prediction is introduced in this work for discriminating different kinds of ginsengs with a home-made electronic nose instrument. A nonconformity measure based on k-nearest neighbors (KNN) is implemented separately as the underlying algorithm of conformal prediction. In offline mode, the conformal predictor achieves a classification rate of 84.44% based on 1NN and 80.63% based on 3NN, which is better than that of simple KNN. In addition, it provides an estimate of reliability for each prediction. In online mode, the validity of predictions is guaranteed, which means that the error rate of region predictions never exceeds the significance level set by a user. The potential of this framework for detecting borderline examples and outliers in the application of E-nose is also investigated. The result shows that conformal prediction is a promising framework for the application of electronic nose to make predictions with reliability and validity. PMID:28805721
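
    A region prediction of the kind mentioned here can be sketched as follows. The abstract does not give the nonconformity measure, so the sketch assumes a commonly used KNN form: the summed distance to the k nearest same-label examples divided by the summed distance to the k nearest other-label examples. The data and function names are hypothetical.

```python
import math


def nonconformity(index, examples, k):
    """Ratio of distances to the k nearest same-label vs other-label examples."""
    x, y = examples[index]
    same, other = [], []
    for j, (xj, yj) in enumerate(examples):
        if j != index:
            (same if yj == y else other).append(math.dist(x, xj))
    return sum(sorted(same)[:k]) / sum(sorted(other)[:k])


def p_values(train, x_new, labels, k):
    """Conformal p-value of each candidate label for x_new."""
    result = {}
    for label in labels:
        examples = train + [(x_new, label)]  # tentatively assign the label
        scores = [nonconformity(i, examples, k) for i in range(len(examples))]
        # Share of examples at least as nonconforming as the new one.
        result[label] = sum(s >= scores[-1] for s in scores) / len(scores)
    return result


def region(p, significance):
    """Labels whose p-value exceeds the chosen significance level."""
    return {label for label, value in p.items() if value > significance}


train = [((0.0,), "a"), ((0.2,), "a"), ((0.4,), "a"), ((0.6,), "a"),
         ((5.0,), "b"), ((5.2,), "b"), ((5.4,), "b"), ((5.6,), "b")]
p = p_values(train, (0.3,), labels=["a", "b"], k=1)
```

    At a significance level of 0.2 the region for this query contains only "a". At 0.05 it contains both labels, because a stricter error guarantee is bought with larger prediction regions.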

  9. Dynamics of Nearest-Neighbour Competitions on Graphs

    NASA Astrophysics Data System (ADS)

    Rador, Tonguç

    2017-10-01

    Considering a collection of agents representing the vertices of a graph endowed with integer points, we study the asymptotic dynamics of the rate of increase of their points according to a very simple rule: we randomly pick an edge from the graph, which unambiguously defines two agents; we give a point to the agent with the larger number of points with probability p and to the lagger with probability q, such that p+q=1. The model we present is the most general version of the nearest-neighbour competition model introduced by Ben-Naim, Vazquez and Redner. We show that the model combines aspects of hyperbolic partial differential equations (such as that of a conservation law), graph colouring and hyperplane arrangements. We discuss the properties of the model for general graphs but we confine the in-depth study to d-dimensional tori. We present a detailed study for the ring graph, which includes a chemical potential approximation to calculate all its statistics that gives rather accurate results. The two-dimensional torus, not studied in as much depth as the ring, is shown to possess critical behaviour in that the asymptotic speeds arrange themselves in two-coloured islands separated by borders of three other colours and the size of the islands obeys a power-law distribution. We also show that in the large d limit the d-dimensional torus shows an inverse sine law for the distribution of asymptotic speeds.
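
    The update rule stated at the start of this abstract can be simulated directly on the ring graph. The sketch below follows that rule. The abstract does not say how ties between equal scores are handled, so breaking them with a fair coin flip is an assumption, and the name `simulate_ring` is hypothetical.

```python
import random


def simulate_ring(n_agents, p, n_steps, rng):
    """Return the final points of agents on a ring after n_steps competitions."""
    points = [0] * n_agents
    for _ in range(n_steps):
        i = rng.randrange(n_agents)  # pick the edge (i, i + 1) on the ring
        j = (i + 1) % n_agents
        if points[i] == points[j]:
            # Assumption: ties are broken by a fair coin flip.
            leader, lagger = (i, j) if rng.random() < 0.5 else (j, i)
        elif points[i] > points[j]:
            leader, lagger = i, j
        else:
            leader, lagger = j, i
        # The leader scores with probability p, the lagger with q = 1 - p.
        winner = leader if rng.random() < p else lagger
        points[winner] += 1
    return points


points = simulate_ring(n_agents=20, p=0.8, n_steps=5000, rng=random.Random(1))
```

    Dividing each agent's points by the number of steps gives an empirical estimate of the speeds whose asymptotic distribution the paper studies.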

  10. Attention Recognition in EEG-Based Affective Learning Research Using CFS+KNN Algorithm.

    PubMed

    Hu, Bin; Li, Xiaowei; Sun, Shuting; Ratcliffe, Martyn

    2018-01-01

    The research detailed in this paper focuses on the processing of Electroencephalography (EEG) data to identify attention during the learning process. The identification of affect using our procedures is integrated into a simulated distance learning system that provides feedback to the user with respect to attention and concentration. The authors propose a classification procedure that combines correlation-based feature selection (CFS) and a k-nearest-neighbor (KNN) data mining algorithm. To evaluate the CFS+KNN algorithm, it was tested against the CFS+C4.5 algorithm and other classification algorithms. The classification performance was measured 10 times with different 3-fold cross validation data. The data was derived from 10 subjects while they were attempting to learn material in a simulated distance learning environment. A self-assessment model of self-report was used with a single valence to evaluate attention on 3 levels (high, neutral, low). It was found that CFS+KNN had a much better performance, giving the highest correct classification rate (CCR) of % for the valence dimension divided into three classes.

  11. Comparative Performance Analysis of Support Vector Machine, Random Forest, Logistic Regression and k-Nearest Neighbours in Rainbow Trout (Oncorhynchus Mykiss) Classification Using Image-Based Features.

    PubMed

    Saberioon, Mohammadmehdi; Císař, Petr; Labbé, Laurent; Souček, Pavel; Pelissier, Pablo; Kerneis, Thierry

    2018-03-29

    The main aim of this study was to develop a new objective method for evaluating the impacts of different diets on live fish skin using image-based features. In total, one hundred and sixty rainbow trout (Oncorhynchus mykiss) were fed either a fish-meal based diet (80 fish) or a 100% plant-based diet (80 fish) and photographed using a consumer-grade digital camera. Twenty-three colour features and four texture features were extracted. Four different classification methods were used to evaluate fish diets: Random forest (RF), Support vector machine (SVM), Logistic regression (LR) and k-Nearest neighbours (k-NN). The SVM with radial based kernel provided the best classifier, with a correct classification rate (CCR) of 82% and a Kappa coefficient of 0.65. Although both the LR and RF methods were less accurate than SVM, they achieved good classification with CCRs of 75% and 70% respectively. The k-NN was the least accurate (40%) classification model. Overall, it can be concluded that consumer-grade digital cameras could be employed as a fast, accurate and non-invasive sensor for classifying rainbow trout based on their diets. Furthermore, there was a close association between image-based features and the fish diet received during cultivation. These procedures can be used as non-invasive, accurate and precise approaches for monitoring fish status during cultivation by evaluating the diet's effects on fish skin.

  12. Primary mass discrimination of high energy cosmic rays using PNN and k-NN methods

    NASA Astrophysics Data System (ADS)

    Rastegarzadeh, G.; Nemati, M.

    2018-02-01

    Probabilistic neural network (PNN) and k-Nearest Neighbors (k-NN) methods are widely used data classification techniques. In this paper, these two methods have been used to classify the Extensive Air Shower (EAS) data sets which were simulated using the CORSIKA code for three primary cosmic rays. The primaries are proton, oxygen and iron nuclei at energies of 100 TeV-10 PeV. This study follows earlier investigations into the primary cosmic ray mass sensitive observables. We propose a new approach for measuring the mass sensitive observables of EAS in order to improve the primary mass separation. In this work, the measurement of EAS observables has been performed locally instead of as total measurements. The relationship between the number of observables included in the classification methods and the prediction accuracy has also been investigated. We have shown that local measurements and the inclusion of more mass sensitive observables in the classification processes can improve the classification quality, and that the muon and electron energy density can be considered as primary mass sensitive observables in primary mass classification. It must also be noted that this study is performed for the Tehran observation level without considering the details of any particular EAS detection array.

  13. Privacy Preserving Nearest Neighbor Search

    NASA Astrophysics Data System (ADS)

    Shaneck, Mark; Kim, Yongdae; Kumar, Vipin

    Data mining is frequently obstructed by privacy concerns. In many cases data is distributed, and bringing the data together in one place for analysis is not possible due to privacy laws (e.g. HIPAA) or policies. Privacy preserving data mining techniques have been developed to address this issue by providing mechanisms to mine the data while giving certain privacy guarantees. In this chapter we address the issue of privacy preserving nearest neighbor search, which forms the kernel of many data mining applications. To this end, we present a novel algorithm based on secure multiparty computation primitives to compute the nearest neighbors of records in horizontally distributed data. We show how this algorithm can be used in three important data mining algorithms, namely LOF outlier detection, SNN clustering, and kNN classification. We prove the security of these algorithms under the semi-honest adversarial model, and describe methods that can be used to optimize their performance. Keywords: Privacy Preserving Data Mining, Nearest Neighbor Search, Outlier Detection, Clustering, Classification, Secure Multiparty Computation

  14. Mutual proximity graphs for improved reachability in music recommendation.

    PubMed

    Flexer, Arthur; Stevens, Jeff

    2018-01-01

    This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness.
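
    The graph notions in this abstract can be made concrete with a small sketch. It builds a directed knn graph, counts how often each object appears in other objects' neighbour lists (large counts indicate hubs, zero counts anti-hubs), and derives the mutual knn graph. It does not implement the mutual proximity graphs proposed by the authors; the Euclidean distance and the function names are assumptions.

```python
import numpy as np

def knn_graph(X, k):
    """Boolean adjacency A where A[i, j] is True if j is among the
    k nearest neighbours of i (Euclidean distance assumed)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # an object is not its own neighbour
    nbrs = np.argsort(d, axis=1)[:, :k]
    A = np.zeros((len(X), len(X)), dtype=bool)
    A[np.arange(len(X))[:, None], nbrs] = True
    return A

def k_occurrence(A):
    """How many neighbour lists each object appears in."""
    return A.sum(axis=0)

def mutual_knn_graph(A):
    """Keep an edge only if both objects list each other as neighbours."""
    return A & A.T
```

    In a recommendation setting, objects with a k-occurrence of zero are never reachable through neighbour lists, which is the reachability problem the abstract refers to.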

  15. Predicting acute contact toxicity of pesticides in honeybees (Apis mellifera) through a k-nearest neighbor model.

    PubMed

    Como, F; Carnesecchi, E; Volani, S; Dorne, J L; Richardson, J; Bassan, A; Pavan, M; Benfenati, E

    2017-01-01

    Ecological risk assessment of plant protection products (PPPs) requires an understanding of both the toxicity and the extent of exposure to assess risks for a range of taxa of ecological importance including target and non-target species. Non-target species such as honey bees (Apis mellifera), solitary bees and bumble bees are of utmost importance because of their vital ecological services as pollinators of wild plants and crops. To improve risk assessment of PPPs in bee species, computational models predicting the acute and chronic toxicity of a range of PPPs and contaminants can play a major role in providing structural and physico-chemical properties for the prioritisation of compounds of concern and future risk assessments. Over the last three decades, scientific advisory bodies and the research community have developed toxicological databases and quantitative structure-activity relationship (QSAR) models that are proving invaluable to predict toxicity using historical data and reduce animal testing. This paper describes the development and validation of a k-Nearest Neighbor (k-NN) model using in-house software for the prediction of acute contact toxicity of pesticides on honey bees. Acute contact toxicity data were collected from different sources for 256 pesticides, which were divided into training and test sets. The k-NN models were validated with good prediction, with an accuracy of 70% for all compounds and of 65% for highly toxic compounds, suggesting that they might reliably predict the toxicity of structurally diverse pesticides and could be used to screen and prioritise new pesticides. Copyright © 2016 Elsevier Ltd. All rights reserved.

  16. Mutual proximity graphs for improved reachability in music recommendation

    PubMed Central

    Flexer, Arthur; Stevens, Jeff

    2018-01-01

    This paper is concerned with the impact of hubness, a general problem of machine learning in high-dimensional spaces, on a real-world music recommendation system based on visualisation of a k-nearest neighbour (knn) graph. Due to a problem of measuring distances in high dimensions, hub objects are recommended over and over again while anti-hubs are nonexistent in recommendation lists, resulting in poor reachability of the music catalogue. We present mutual proximity graphs, which are an alternative to knn and mutual knn graphs, and are able to avoid hub vertices having abnormally high connectivity. We show that mutual proximity graphs yield much better graph connectivity resulting in improved reachability compared to knn graphs, mutual knn graphs and mutual knn graphs enhanced with minimum spanning trees, while simultaneously reducing the negative effects of hubness. PMID:29348779

  17. Generative Models for Similarity-based Classification

    DTIC Science & Technology

    2007-01-01

    NC), local nearest centroid (local NC), k-nearest neighbors ( kNN ), and condensed nearest neighbors (CNN) are all similarity-based classifiers which...vector machine to the k nearest neighbors of the test sample [80]. The SVM- KNN method was developed to address the robustness and dimensionality...concerns that afflict nearest neighbors and SVMs. Similarly to the nearest-means classifier, the SVM- KNN is a hybrid local and global classifier developed

  18. Self-Organizing Map Neural Network-Based Nearest Neighbor Position Estimation Scheme for Continuous Crystal PET Detectors

    NASA Astrophysics Data System (ADS)

    Wang, Yonggang; Li, Deng; Lu, Xiaoming; Cheng, Xinyi; Wang, Liwei

    2014-10-01

    Continuous crystal-based positron emission tomography (PET) detectors could be an ideal alternative for current high-resolution pixelated PET detectors if the issues of high performance γ interaction position estimation and its real-time implementation are solved. Unfortunately, existing position estimators are not very feasible for implementation on field-programmable gate array (FPGA). In this paper, we propose a new self-organizing map neural network-based nearest neighbor (SOM-NN) positioning scheme aiming not only at providing high performance, but also at being realistic for FPGA implementation. Benefitting from the SOM feature mapping mechanism, the large set of input reference events at each calibration position is approximated by a small set of prototypes, and the computation of the nearest neighbor searching for unknown events is largely reduced. Using our experimental data, the scheme was evaluated, optimized and compared with the smoothed k-NN method. The spatial resolutions of full-width-at-half-maximum (FWHM) of both methods averaged over the center axis of the detector were obtained as 1.87 ±0.17 mm and 1.92 ±0.09 mm, respectively. The test results show that the SOM-NN scheme has an equivalent positioning performance with the smoothed k-NN method, but the amount of computation is only about one-tenth of the smoothed k-NN method. In addition, the algorithm structure of the SOM-NN scheme is more feasible for implementation on FPGA. It has the potential to realize real-time position estimation on an FPGA with a high-event processing throughput.

  19. Estimating Stand Height and Tree Density in Pinus taeda plantations using in-situ data, airborne LiDAR and k-Nearest Neighbor Imputation.

    PubMed

    Silva, Carlos Alberto; Klauberg, Carine; Hudak, Andrew T; Vierling, Lee A; Liesenberg, Veraldo; Bernett, Luiz G; Scheraiber, Clewerson F; Schoeninger, Emerson R

    2018-01-01

    Accurate forest inventory is of great economic importance to optimize the entire supply chain management in pulp and paper companies. The aim of this study was to estimate stand dominant and mean heights (HD and HM) and tree density (TD) of Pinus taeda plantations located in South Brazil using in-situ measurements, airborne Light Detection and Ranging (LiDAR) data and non-parametric k-nearest neighbor (k-NN) imputation. Forest inventory attributes and LiDAR derived metrics were calculated at 53 regular sample plots and we used imputation models to retrieve the forest attributes at plot and landscape levels. The best LiDAR-derived metrics to predict HD, HM and TD were H99TH, HSD, SKE and HMIN. The imputation model using the selected metrics was more effective for retrieving height than tree density. The model coefficients of determination (adj.R2) and a root mean squared difference (RMSD) for HD, HM and TD were 0.90, 0.94, 0.38m and 6.99, 5.70, 12.92%, respectively. Our results show that LiDAR and k-NN imputation can be used to predict stand heights with high accuracy in Pinus taeda. However, further studies need to be conducted to improve the prediction accuracy of TD and to evaluate and compare the cost of acquisition and processing of LiDAR data against conventional inventory procedures.
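
    As a rough illustration of k-NN imputation in this setting, the sketch below assigns to each target unit the mean attributes of its k nearest reference plots in standardised metric space. The Euclidean distance, the unweighted mean, the value of k and all names are assumptions; the abstract does not state which distance or weighting the study used.

```python
import numpy as np

def knn_impute(ref_metrics, ref_attrs, target_metrics, k=3):
    """Impute attributes (e.g. heights, density) for target units from the
    k reference plots that are nearest in standardised LiDAR-metric space."""
    mu = ref_metrics.mean(axis=0)
    sd = ref_metrics.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant metrics
    R = (ref_metrics - mu) / sd
    T = (target_metrics - mu) / sd
    d = np.linalg.norm(T[:, None, :] - R[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    # Average the measured attributes of the k nearest reference plots.
    return ref_attrs[idx].mean(axis=1)
```

    With k=1 each target simply inherits the attributes of its single most similar plot, which keeps imputed values within the range actually observed in the field.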

  20. Finger vein identification using fuzzy-based k-nearest centroid neighbor classifier

    NASA Astrophysics Data System (ADS)

    Rosdi, Bakhtiar Affendi; Jaafar, Haryati; Ramli, Dzati Athiar

    2015-02-01

    In this paper, a new approach for personal identification using finger vein images is presented. Finger vein is an emerging type of biometrics that attracts the attention of researchers in the biometrics area. Compared to other biometric traits such as face, fingerprint and iris, finger vein is more secure and harder to counterfeit since the features are inside the human body. So far, most researchers have focused on how to extract robust features from the captured vein images. Not much research has been conducted on the classification of the extracted features. In this paper, a new classifier called fuzzy-based k-nearest centroid neighbor (FkNCN) is applied to classify the finger vein image. The proposed FkNCN employs a surrounding rule to obtain the k-nearest centroid neighbors based on the spatial distributions of the training images and their distance to the test image. Then, the fuzzy membership function is utilized to assign the test image to the class which is most frequently represented by the k-nearest centroid neighbors. Experimental evaluation using our own database, which was collected from 492 fingers, shows that the proposed FkNCN has better performance than the k-nearest neighbor, k-nearest-centroid neighbor and fuzzy-based-k-nearest neighbor classifiers. This shows that the proposed classifier is able to identify the finger vein image effectively.

  1. Detection of acute lymphocyte leukemia using k-nearest neighbor algorithm based on shape and histogram features

    NASA Astrophysics Data System (ADS)

    Purwanti, Endah; Calista, Evelyn

    2017-05-01

    Leukemia is a type of cancer which is caused by malignant neoplasms in leukocyte cells. The type of leukemia that can cause death fairly quickly is acute lymphocyte leukemia (ALL). In this study, we propose automatic detection of lymphocyte leukemia through classification of lymphocyte cell images obtained from single cells in peripheral blood smears. There are two main objectives in this study. The first is to extract cell features. The second is to classify the lymphocyte cells into two classes, namely normal and abnormal lymphocytes. In conducting this study, we use a combination of shape features and histogram features, and the classification algorithm is k-nearest neighbour with k varied over 1, 3, 5, 7, 9, 11, 13, and 15. The best levels of accuracy, sensitivity, and specificity in this study are 90%, 90%, and 90%, and they were obtained from the combined features of area-perimeter-mean-standard deviation with k=7.

  2. Smart BIT/TSMD Integration

    DTIC Science & Technology

    1991-12-01

    user using the ’: knn ’ option in the do-scenario command line). An instance of the K-Nearest Neighbor object is first created and initialized before...Navigation Computer HF High Frequency ILS Instrument Landing System KNN K - Nearest Neighbor LRU Line Replaceable Unit MC Mission Computer MTCA...approaches have been investigated here, K-nearest Neighbors ( KNN ) and neural networks (NN). Both approaches require that previously classified examples of

  3. Analysis of miRNA expression profile based on SVM algorithm

    NASA Astrophysics Data System (ADS)

    Ting-ting, Dai; Chang-ji, Shan; Yan-shou, Dong; Yi-duo, Bian

    2018-05-01

    Based on a miRNA expression spectrum data set, a new data mining algorithm, tSVM-KNN (t statistic with support vector machine and k-nearest neighbor), is proposed. The idea of the algorithm is as follows: firstly, feature selection on the data set is carried out by the unified measurement method; secondly, the SVM-KNN algorithm, which combines support vector machine (SVM) and k-nearest neighbor (KNN), is used as the classifier. Simulation results show that the SVM-KNN algorithm has better classification ability than SVM and KNN alone. The tSVM-KNN algorithm needs only 5 miRNAs to obtain 96.08% classification accuracy in terms of the number of miRNA "tags" and recognition accuracy. Compared with similar algorithms, the tSVM-KNN algorithm has obvious advantages.

  4. Fabrication of transparent lead-free KNN glass ceramics by incorporation method

    PubMed Central

    2012-01-01

    The incorporation method was employed to produce potassium sodium niobate [KNN] (K0.5Na0.5NbO3) glass ceramics from the KNN-SiO2 system. This incorporation method combines a simple mixed-oxide technique for producing KNN powder and a conventional melt-quenching technique to form the resulting glass. KNN was calcined at 800°C and subsequently mixed with SiO2 in the KNN:SiO2 ratio of 75:25 (mol%). The successfully produced optically transparent glass was then subjected to a heat treatment schedule at temperatures ranging from 525°C to 575°C for crystallization. All glass ceramics of more than 40% transmittance crystallized into KNN nanocrystals that were rectangular in shape and dispersed well throughout the glass matrix. The crystal size and crystallinity were found to increase with increasing heat treatment temperature, which in turn plays an important role in controlling the properties of the glass ceramics, including physical, optical, and dielectric properties. The transparency of the glass samples decreased with increasing crystal size. The maximum room temperature dielectric constant (εr) was as high as 474 at 10 kHz with an acceptable low loss (tanδ) around 0.02 at 10 kHz. PMID:22340426

  5. An RFID Indoor Positioning Algorithm Based on Bayesian Probability and K-Nearest Neighbor.

    PubMed

    Xu, He; Ding, Ye; Li, Peng; Wang, Ruchuan; Li, Yizhu

    2017-08-05

    The Global Positioning System (GPS) is widely used in outdoor environmental positioning. However, GPS cannot support indoor positioning because there is no signal for positioning in an indoor environment. Nowadays, there are many situations which require indoor positioning, such as searching for a book in a library, looking for luggage in an airport, emergency navigation for fire alarms, robot location, etc. Many technologies, such as ultrasonic, sensors, Bluetooth, WiFi, magnetic field, Radio Frequency Identification (RFID), etc., are used to perform indoor positioning. Compared with other technologies, RFID used in indoor positioning is more cost and energy efficient. The traditional RFID indoor positioning algorithm LANDMARC utilizes a Received Signal Strength (RSS) indicator to track objects. However, the RSS value is easily affected by environmental noise and other interference. In this paper, our purpose is to reduce the location fluctuation and error caused by multipath and environmental interference in LANDMARC. We propose a novel indoor positioning algorithm based on Bayesian probability and K-Nearest Neighbor (BKNN). The experimental results show that the Gaussian filter can filter some abnormal RSS values. The proposed BKNN algorithm has the smallest location error compared with the Gaussian-based algorithm, LANDMARC and an improved KNN algorithm. The average error in location estimation is about 15 cm using our method.
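
    For orientation, the baseline that this work improves on can be sketched as a weighted k-nearest-neighbour estimate over reference tags with known positions, in the style of LANDMARC. The sketch does not include the Gaussian filtering or the Bayesian step of the proposed BKNN algorithm, and the function name, array layout and inverse-square weighting are stated here as assumptions rather than taken from the abstract.

```python
import numpy as np

def landmarc_estimate(tag_rss, ref_rss, ref_xy, k=4):
    """Estimate a tracking tag's position from the k reference tags whose
    RSS vectors (one value per reader) are closest to the tag's RSS vector."""
    # Distance in signal space between the tracking tag and each reference tag.
    E = np.linalg.norm(ref_rss - tag_rss, axis=1)
    nearest = np.argsort(E)[:k]
    # Assumed weighting: closer in signal space means a larger weight (1 / E^2).
    w = 1.0 / np.maximum(E[nearest], 1e-9) ** 2
    w /= w.sum()
    # Weighted average of the known reference-tag coordinates.
    return w @ ref_xy[nearest]
```

    Because the estimate is a convex combination of reference-tag coordinates, noisy RSS readings move it directly, which is the fluctuation the abstract sets out to reduce.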

  6. Multi-site Stochastic Simulation of Daily Streamflow with Markov Chain and KNN Algorithm

    NASA Astrophysics Data System (ADS)

    Mathai, J.; Mujumdar, P.

    2017-12-01

    A key focus of this study is to develop a method which is physically consistent with the hydrologic processes and can capture short-term characteristics of the daily hydrograph as well as the correlation of streamflow in temporal and spatial domains. In complex water resource systems, flow fluctuations at small time intervals require that discretisation be done at small time scales such as daily scales. Also, simultaneous generation of synthetic flows at different sites in the same basin is required. We propose a method to equip water managers with a streamflow generator within a stochastic streamflow simulation framework. The motivation for the proposed method is to generate sequences that extend beyond the variability represented in the historical record of streamflow time series. The method has two steps: in step 1, daily flow is generated independently at each station by a two-state Markov chain, with rising limb increments randomly sampled from a Gamma distribution and the falling limb modelled as exponential recession, and in step 2, the streamflow generated in step 1 is input to a nonparametric K-nearest neighbor (KNN) time series bootstrap resampler. The KNN model, being data driven, does not require assumptions on the dependence structure of the time series. A major limitation of KNN based streamflow generators is that they do not produce new values, but merely reshuffle the historical data to generate realistic streamflow sequences. However, daily flow generated using the Markov chain approach is capable of generating a rich variety of streamflow sequences. Furthermore, the rising and falling limbs of the daily hydrograph represent different physical processes, and hence they need to be modelled individually. Thus, our method combines the strengths of the two approaches. We show the utility of the method and improvement over the traditional KNN by simulating daily streamflow sequences at 7 locations in the Godavari River basin in India.
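
    Step 1 of the method, as described, can be sketched for a single station as follows. All parameter names and values (transition probabilities, Gamma shape and scale, recession constant, initial flow) are hypothetical placeholders, and the KNN bootstrap resampler of step 2 is not shown.

```python
import numpy as np

def simulate_daily_flow(n_days, q0=100.0, p_rise_after_rise=0.5,
                        p_rise_after_fall=0.2, gamma_shape=2.0,
                        gamma_scale=20.0, recession_k=0.1, seed=0):
    """Generate daily flow with a two-state (rising/falling) Markov chain."""
    rng = np.random.default_rng(seed)
    flow = np.empty(n_days)
    flow[0] = q0
    rising = False
    for t in range(1, n_days):
        # The probability of a rising day depends on yesterday's state.
        p_rise = p_rise_after_rise if rising else p_rise_after_fall
        rising = rng.random() < p_rise
        if rising:
            # Rising limb: add an increment sampled from a Gamma distribution.
            flow[t] = flow[t - 1] + rng.gamma(gamma_shape, gamma_scale)
        else:
            # Falling limb: exponential recession of the previous flow.
            flow[t] = flow[t - 1] * np.exp(-recession_k)
    return flow
```

    Unlike a pure resampler, this generator can produce flow values that never occurred in the record, which is the property the abstract contrasts with KNN-only approaches.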

  7. A Robust False Matching Points Detection Method for Remote Sensing Image Registration

    NASA Astrophysics Data System (ADS)

    Shan, X. J.; Tang, P.

    2015-04-01

    Given the influences of illumination, imaging angle, and geometric distortion, among others, false matching points still occur in all image registration algorithms. Therefore, false matching points detection is an important step in remote sensing image registration. Random Sample Consensus (RANSAC) is typically used to detect false matching points. However, the RANSAC method cannot detect all false matching points in some remote sensing images. Therefore, a robust false matching points detection method based on the K-nearest-neighbour (K-NN) graph (KGD) is proposed in this paper to obtain robust and highly accurate results. The KGD method starts with the construction of the K-NN graph in one image. The K-NN graph is first generated from each matching point and its K nearest matching points. A local transformation model for each matching point is then obtained by using its K nearest matching points. The error of each matching point is computed by using its transformation model. Last, the L matching points with the largest errors are identified as false matching points and removed. This process is iterated until all errors are smaller than the given threshold. In addition, the KGD method can be used in combination with other methods, such as RANSAC. Several remote sensing images with different resolutions and terrains are used in the experiment. We evaluate the performance of the KGD method, the RANSAC + KGD method, RANSAC, and Graph Transformation Matching (GTM). The experimental results demonstrate the superior performance of the KGD and RANSAC + KGD methods.
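
    The iterative procedure described above can be sketched as follows. The abstract does not say what form the local transformation model takes, so a least-squares affine fit is assumed here; the function names, the default values of K, L and the threshold, and the stopping guard are likewise illustrative.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine model mapping src points to dst points."""
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M  # shape (3, 2)

def kgd_filter(src, dst, K=6, L=1, threshold=1.0, max_iter=100):
    """Return indices of matches kept after iteratively removing the L
    matches with the largest local-model error."""
    keep = np.arange(len(src))
    for _ in range(max_iter):
        if len(keep) <= K + 1:
            break  # too few matches left to fit local models
        s, d = src[keep], dst[keep]
        # K-NN graph over the matched points in one image.
        dist = np.linalg.norm(s[:, None] - s[None, :], axis=2)
        np.fill_diagonal(dist, np.inf)
        nbrs = np.argsort(dist, axis=1)[:, :K]
        err = np.empty(len(keep))
        for i in range(len(keep)):
            # Local model from the K nearest matches, evaluated at match i.
            M = fit_affine(s[nbrs[i]], d[nbrs[i]])
            pred = np.append(s[i], 1.0) @ M
            err[i] = np.linalg.norm(pred - d[i])
        if err.max() < threshold:
            break  # all errors below the threshold: done
        worst = np.argsort(err)[-L:]
        keep = np.delete(keep, worst)
    return keep
```

    Removing only a few matches per iteration matters because a false match also distorts the local models of its neighbours; refitting after each removal lets those neighbours recover.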

  8. The causes of spatial patterning of mounds of a fungus-cultivating termite: results from nearest-neighbour analysis and ecological studies.

    PubMed

    Korb, Judith; Linsenmair, Karl Eduard

    2001-05-01

    Little is known about processes regulating population dynamics in termites. We investigated the distribution of mound-colonies of the fungus-cultivating termite Macrotermes bellicosus (Smeathman) in two habitats in the Comoé National Park (Côte d'Ivoire) with nearest-neighbour analysis differentiating between different age classes. These results were compared with ecological data on processes influencing population dynamics. High mound densities were recorded in shrub savannah while only a few mounds were found in gallery forest. Mounds were distributed randomly in both habitats when all mounds were considered together, and when inhabited and uninhabited mounds were treated separately. However, distinctive non-random patterns were revealed in the savannah when we distinguished between different age classes. Small, young colonies were aggregated when they coexisted with larger, older colonies, which were more regularly distributed. This indicates that the distribution of older colonies is influenced by intraspecific competition whereas that of younger colonies is influenced by opposing factors that lead to aggregation. This is in accordance with ecological data. Food is a limiting resource for large colonies, while patchily distributed appropriate microclimatic conditions seem to be more important for young colonies. Colonies that had formerly coexisted (i.e. living colonies and recently dead colonies) showed aggregated, random and regular distribution patterns, suggesting several causes of mortality. Colonies that had never had contact with each other were randomly distributed and no specific regulation mechanism was implicated. These results show that different age classes seem to be regulated by different processes and that separation between age classes is necessary to reveal indicative spatial patterns in nearest-neighbour analysis.

  9. The Application of Determining Students’ Graduation Status of STMIK Palangkaraya Using K-Nearest Neighbors Method

    NASA Astrophysics Data System (ADS)

    Rusdiana, Lili; Marfuah

    2017-12-01

    The K-Nearest Neighbors method is a classification method that calculates a value to find the closest in distance. It is used to group a set of data, such as students’ graduation status, obtained from the number of course credits taken, the grade point average (AVG), and the mini-thesis grade. The study was conducted to learn the results of using the K-Nearest Neighbors method in an application for determining students’ graduation status, so that it can be analyzed in terms of the method used, the data, and the application constructed. The aim of this study is to find out the application results of using the K-Nearest Neighbors concept to determine students’ graduation status with the data of STMIK Palangkaraya students. The software was developed using Extreme Programming, since it suited this study's need to finish the project quickly. The application was created using Microsoft Office Excel 2007 for the training data and Matlab 7 to implement the application. The result of the K-Nearest Neighbors method in the application for determining students’ graduation status was 92.5%. It could determine the graduation predicate of 94 data items, taken from the 136 initial data items before processing, with a maximum of 50 training data items. The K-Nearest Neighbors method groups a set of data based on the closest value, so using the K-Nearest Neighbors method suited this study.

  10. On prognostic models, artificial intelligence and censored observations.

    PubMed

    Anand, S S; Hamilton, P W; Hughes, J G; Bell, D A

    2001-03-01

    The development of prognostic models for assisting medical practitioners with decision making is not a trivial task. Models need to possess a number of desirable characteristics and few, if any, current modelling approaches based on statistical or artificial intelligence can produce models that display all these characteristics. The inability of modelling techniques to provide truly useful models has led to interest in these models being purely academic in nature. This in turn has resulted in only a very small percentage of models that have been developed being deployed in practice. On the other hand, new modelling paradigms are being proposed continuously within the machine learning and statistical community and claims, often based on inadequate evaluation, being made on their superiority over traditional modelling methods. We believe that for new modelling approaches to deliver true net benefits over traditional techniques, an evaluation centric approach to their development is essential. In this paper we present such an evaluation centric approach to developing extensions to the basic k-nearest neighbour (k-NN) paradigm. We use standard statistical techniques to enhance the distance metric used and a framework based on evidence theory to obtain a prediction for the target example from the outcome of the retrieved exemplars. We refer to this new k-NN algorithm as Censored k-NN (Ck-NN). This reflects the enhancements made to k-NN that are aimed at providing a means for handling censored observations within k-NN.

  11. Weighted Parzen Windows for Pattern Classification

    DTIC Science & Technology

    1994-05-01

    Nearest-Neighbor Rule The k-Nearest-Neighbor ( kNN ) technique is nonparametric, assuming nothing about the distribution of the data. Stated succinctly...probabilities P(wj | x) from samples." Raudys and Jain [20:255] advance this interpretation by pointing out that the kNN technique can be viewed as the..."Parzen window classifier with a hyper-rectangular window function." As with the Parzen-window technique, the kNN classifier is more accurate as the

  12. Differentiation of AmpC beta-lactamase binders vs. decoys using classification kNN QSAR modeling and application of the QSAR classifier to virtual screening

    NASA Astrophysics Data System (ADS)

    Hsieh, Jui-Hua; Wang, Xiang S.; Teotico, Denise; Golbraikh, Alexander; Tropsha, Alexander

    2008-09-01

    The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed `binding decoys'. We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design. We have applied the k-Nearest Neighbor ( kNN) classification QSAR approach to a dataset of compounds characterized as binders or binding decoys of AmpC beta-lactamase. Models were subjected to rigorous internal and external validation as part of our standard workflow and a special QSAR modeling scheme was employed that took into account the imbalanced ratio of inhibitors to non-binders (1:4) in this dataset. 342 predictive models were obtained with correct classification rate (CCR) for both training and test sets as high as 0.90 or higher. The prediction accuracy was as high as 100% (CCR = 1.00) for the external validation set composed of 10 compounds (5 true binders and 5 decoys) selected randomly from the original dataset. For an additional external set of 50 known non-binders, we have achieved the CCR of 0.87 using very conservative model applicability domain threshold. The validated binary kNN QSAR models were further employed for mining the NCGC AmpC screening dataset (69653 compounds). The consensus prediction of 64 compounds identified as screening hits in the AmpC PubChem assay disagreed with their annotation in PubChem but was in agreement with the results of secondary assays. At the same time, 15 compounds were identified as potential binders contrary to their annotation in PubChem. Five of them were tested experimentally and showed inhibitory activities in millimolar range with the highest binding constant Ki of 135 μM. Our studies suggest that validated QSAR models could complement

  13. Analysis of microarray leukemia data using an efficient MapReduce-based K-nearest-neighbor classifier.

    PubMed

    Kumar, Mukesh; Rath, Nitish Kumar; Rath, Santanu Kumar

    2016-04-01

    Microarray-based gene expression profiling has emerged as an efficient technique for classification, prognosis, diagnosis, and treatment of cancer. Frequent changes in the behavior of this disease generate an enormous volume of data. Microarray data satisfies both the veracity and velocity properties of big data, as it keeps changing with time. Therefore, the analysis of microarray datasets in a small amount of time is essential. These datasets often contain a large amount of expression data, but only a fraction of it comprises genes that are significantly expressed. The precise identification of genes of interest that are responsible for causing cancer is imperative in microarray data analysis. Most existing schemes employ a two-phase process such as feature selection/extraction followed by classification. In this paper, various statistical methods (tests) based on MapReduce are proposed for selecting relevant features. After feature selection, a MapReduce-based K-nearest neighbor (mrKNN) classifier is also employed to classify microarray data. These algorithms are successfully implemented in a Hadoop framework. A comparative analysis is done on these MapReduce-based models using microarray datasets of various dimensions. From the obtained results, it is observed that these models consume much less execution time than conventional models in processing big data. Copyright © 2016 Elsevier Inc. All rights reserved.
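    The abstract does not describe how mrKNN is organised internally, so the following is only a minimal single-process sketch of the general map/reduce pattern for kNN classification, not the authors' Hadoop implementation. Each "mapper" finds the k nearest candidates inside one data split and the "reducer" merges the partial lists before voting. The function names, the two-feature samples and the class labels are all hypothetical.

```python
import heapq
import math
from collections import Counter
from functools import reduce


def map_partition(partition, query, k):
    """Mapper: return the k nearest (distance, label) pairs within one data split."""
    scored = ((math.dist(features, query), label) for features, label in partition)
    return heapq.nsmallest(k, scored)


def merge_candidates(left, right, k):
    """Reducer: keep the k globally nearest candidates from two partial lists."""
    return heapq.nsmallest(k, left + right)


def mr_knn_classify(partitions, query, k=3):
    """Classify `query` by majority vote over the k nearest samples in all splits."""
    partials = [map_partition(p, query, k) for p in partitions]
    top_k = reduce(lambda a, b: merge_candidates(a, b, k), partials)
    votes = Counter(label for _, label in top_k)
    return votes.most_common(1)[0][0]


# Hypothetical expression profiles (two selected genes) split across two partitions.
partitions = [
    [((1.0, 1.0), "class_A"), ((1.2, 0.9), "class_A")],
    [((5.0, 5.0), "class_B"), ((0.9, 1.1), "class_A"), ((5.2, 4.8), "class_B")],
]
label_near_a = mr_knn_classify(partitions, query=(1.0, 1.0), k=3)
label_near_b = mr_knn_classify(partitions, query=(5.0, 5.0), k=3)
```

    Because every mapper returns at most k candidates, the reducer only has to merge short lists regardless of how large each split is, which is what makes the pattern attractive for large datasets.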

  14. Residual uncertainty estimation using instance-based learning with applications to hydrologic forecasting

    NASA Astrophysics Data System (ADS)

    Wani, Omar; Beckers, Joost V. L.; Weerts, Albrecht H.; Solomatine, Dimitri P.

    2017-08-01

    A non-parametric method is applied to quantify residual uncertainty in hydrologic streamflow forecasting. This method acts as a post-processor on deterministic model forecasts and generates a residual uncertainty distribution. Based on instance-based learning, it uses a k nearest-neighbour search for similar historical hydrometeorological conditions to determine uncertainty intervals from a set of historical errors, i.e. discrepancies between past forecast and observation. The performance of this method is assessed using test cases of hydrologic forecasting in two UK rivers: the Severn and Brue. Forecasts in retrospect were made and their uncertainties were estimated using kNN resampling and two alternative uncertainty estimators: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Results show that kNN uncertainty estimation produces accurate and narrow uncertainty intervals with good probability coverage. Analysis also shows that the performance of this technique depends on the choice of search space. Nevertheless, the accuracy and reliability of uncertainty intervals generated using kNN resampling are at least comparable to those produced by QR and UNEEC. It is concluded that kNN uncertainty estimation is an interesting alternative to other post-processors, like QR and UNEEC, for estimating forecast uncertainty. Apart from its concept being simple and well understood, an advantage of this method is that it is relatively easy to implement.
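    The kNN resampling idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the condition vectors, the error sign convention (observation minus forecast), the Euclidean distance and the function name `knn_error_interval` are all hypothetical choices.

```python
import math


def knn_error_interval(history, query, k=3, lower=0.05, upper=0.95):
    """Return (low, high) empirical quantiles of the errors of the k nearest past cases.

    history: list of (condition_vector, forecast_error) pairs.
    query:   condition vector describing the current forecast situation.
    """
    # Rank historical cases by Euclidean distance to the current conditions.
    ranked = sorted(history, key=lambda rec: math.dist(rec[0], query))
    errors = sorted(err for _, err in ranked[:k])

    def quantile(q):
        # Linear interpolation between neighbouring order statistics.
        pos = q * (len(errors) - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        return errors[lo] + (errors[hi] - errors[lo]) * (pos - lo)

    return quantile(lower), quantile(upper)


# Hypothetical history: (rainfall, previous flow) -> past forecast error.
history = [
    ((1.0, 10.0), -0.5),
    ((1.2, 11.0), 0.3),
    ((0.9, 9.5), 0.1),
    ((8.0, 40.0), 5.0),
    ((7.5, 42.0), -4.0),
]
low, high = knn_error_interval(history, query=(1.1, 10.5), k=3, lower=0.0, upper=1.0)
```

    Adding `low` and `high` to the deterministic forecast gives an uncertainty band for it. The abstract's remark that performance depends on the choice of search space corresponds here to which variables are placed in the condition vectors.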

  15. Assessment of various supervised learning algorithms using different performance metrics

    NASA Astrophysics Data System (ADS)

    Susheel Kumar, S. M.; Laxkar, Deepak; Adhikari, Sourav; Vijayarajan, V.

    2017-11-01

    Our work compares the performance of supervised machine learning algorithms on a binary classification task. The algorithms considered are Support Vector Machine (SVM), Decision Tree (DT), K Nearest Neighbour (KNN), Naïve Bayes (NB) and Random Forest (RF). The paper focuses on comparing these algorithms on one binary classification task by analysing metrics such as Accuracy, F-Measure, G-Measure, Precision, Misclassification Rate, False Positive Rate, True Positive Rate, Specificity and Prevalence.
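    All of the listed metrics can be derived from the four cells of a binary confusion matrix. The sketch below uses the usual textbook definitions. The abstract does not define its G-Measure, so the geometric mean of precision and recall is assumed here, and the counts in the example are invented.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (both classes present and predicted).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    tpr = tp / (tp + fn)  # true positive rate, also called recall or sensitivity
    return {
        "accuracy": (tp + tn) / total,
        "misclassification_rate": (fp + fn) / total,
        "precision": precision,
        "true_positive_rate": tpr,
        "false_positive_rate": fp / (fp + tn),
        "specificity": tn / (fp + tn),
        "prevalence": (tp + fn) / total,
        "f_measure": 2 * precision * tpr / (precision + tpr),
        # Assumption: G-measure taken as geometric mean of precision and recall.
        "g_measure": math.sqrt(precision * tpr),
    }


# Hypothetical counts for one classifier on a balanced test set.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

    Accuracy and misclassification rate always sum to one, as do specificity and false positive rate, so the nine metrics carry fewer than nine independent pieces of information.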

  16. Long-term surface EMG monitoring using K-means clustering and compressive sensing

    NASA Astrophysics Data System (ADS)

    Balouchestani, Mohammadreza; Krishnan, Sridhar

    2015-05-01

    In this work, we present an advanced K-means clustering algorithm based on Compressed Sensing (CS) theory in combination with the K-Singular Value Decomposition (K-SVD) method for clustering long-term recordings of surface electromyography (sEMG) signals. Long-term monitoring of sEMG signals aims at recording the electrical activity produced by muscles, which is a very useful procedure for treatment and diagnostic purposes as well as for the detection of various pathologies. The proposed algorithm is examined for three scenarios of sEMG signals: a healthy person (sEMG-Healthy), a patient with myopathy (sEMG-Myopathy), and a patient with neuropathy (sEMG-Neuropathy). The proposed algorithm can easily scan large sEMG datasets of long-term sEMG recording. We test the proposed algorithm with Principal Component Analysis (PCA) and Linear Correlation Coefficient (LCC) dimensionality reduction methods. Then, the output of the proposed algorithm is fed to K-Nearest Neighbours (K-NN) and Probabilistic Neural Network (PNN) classifiers in order to calculate the clustering performance. The proposed algorithm achieves a classification accuracy of 99.22%. This allows a reduction of 17% in Average Classification Error (ACE), 9% in Training Error (TE), and 18% in Root Mean Square Error (RMSE). The proposed algorithm also reduces clustering energy consumption by 14% compared to the existing K-Means clustering algorithm.

  17. A novel method for the detection of R-peaks in ECG based on K-Nearest Neighbors and Particle Swarm Optimization

    NASA Astrophysics Data System (ADS)

    He, Runnan; Wang, Kuanquan; Li, Qince; Yuan, Yongfeng; Zhao, Na; Liu, Yang; Zhang, Henggui

    2017-12-01

    Cardiovascular diseases are associated with high morbidity and mortality. However, it is still a challenge to diagnose them accurately and efficiently. The electrocardiogram (ECG), a bioelectrical signal of the heart, provides crucial information about the dynamical functions of the heart, playing an important role in cardiac diagnosis. As the QRS complex in the ECG is associated with ventricular depolarization, accurate QRS detection is vital for interpreting ECG features. In this paper, we proposed a real-time, accurate, and effective algorithm for QRS detection. In the algorithm, a proposed preprocessor with a band-pass filter was first applied to remove baseline wander and power-line interference from the signal. After denoising, a method combining K-Nearest Neighbor (KNN) and Particle Swarm Optimization (PSO) was used for accurate QRS detection in ECGs with different morphologies. The proposed algorithm was tested and validated using 48 ECG records from the MIT-BIH arrhythmia database (MITDB) and achieved a high averaged detection accuracy, sensitivity and positive predictivity of 99.43, 99.69, and 99.72%, respectively, indicating a notable improvement over extant algorithms reported in the literature.

  18. A Comparative Study with RapidMiner and WEKA Tools over some Classification Techniques for SMS Spam

    NASA Astrophysics Data System (ADS)

    Foozy, Cik Feresa Mohd; Ahmad, Rabiah; Faizal Abdollah, M. A.; Chai Wen, Chuah

    2017-08-01

    SMS spamming is a serious attack that abuses SMS by spreading advertisements in bulk. Unwanted SMS messages containing advertisements disturb users and violate the privacy of mobile users. To overcome these issues, many studies have proposed detecting SMS spam using data mining tools. This paper presents a comparative study of five machine learning techniques, namely Naïve Bayes, K-NN (K-Nearest Neighbour), Decision Tree, Random Forest and Decision Stumps, to compare the accuracy obtained with RapidMiner and WEKA on the SMS Spam dataset from the UCI Machine Learning Repository.

  19. yaImpute: An R package for kNN imputation

    Treesearch

    Nicholas L. Crookston; Andrew O. Finley

    2008-01-01

    This article introduces yaImpute, an R package for nearest neighbor search and imputation. Although nearest neighbor imputation is used in a host of disciplines, the methods implemented in the yaImpute package are tailored to imputation-based forest attribute estimation and mapping. The impetus to writing the yaImpute is a growing interest in nearest neighbor...

  20. Modeling Gas and Gas Hydrate Accumulation in Marine Sediments Using a K-Nearest Neighbor Machine-Learning Technique

    NASA Astrophysics Data System (ADS)

    Wood, W. T.; Runyan, T. E.; Palmsten, M.; Dale, J.; Crawford, C.

    2016-12-01

    Natural Gas (primarily methane) and gas hydrate accumulations require certain bio-geochemical, as well as physical conditions, some of which are poorly sampled and/or poorly understood. We exploit recent advances in the prediction of seafloor porosity and heat flux via machine learning techniques (e.g. Random forests and Bayesian networks) to predict the occurrence of gas and subsequently gas hydrate in marine sediments. The prediction (actually guided interpolation) of key parameters we use in this study is a K-nearest neighbor technique. KNN requires only minimal pre-processing of the data and predictors, and requires minimal run-time input so the results are almost entirely data-driven. Specifically we use new estimates of sedimentation rate and sediment type, along with recently derived compaction modeling to estimate profiles of porosity and age. We combined the compaction with seafloor heat flux to estimate temperature with depth and geologic age, which, with estimates of organic carbon, and models of methanogenesis yield limits on the production of methane. Results include geospatial predictions of gas (and gas hydrate) accumulations, with quantitative estimates of uncertainty. The Generic Earth Modeling System (GEMS) we have developed to derive the machine learning estimates is modular and easily updated with new algorithms or data.

  1. A software package for interactive motor unit potential classification using fuzzy k-NN classifier.

    PubMed

    Rasheed, Sarbast; Stashuk, Daniel; Kamel, Mohamed

    2008-01-01

    We present an interactive software package for implementing the supervised classification task during electromyographic (EMG) signal decomposition process using a fuzzy k-NN classifier and utilizing the MATLAB high-level programming language and its interactive environment. The method employs an assertion-based classification that takes into account a combination of motor unit potential (MUP) shapes and two modes of use of motor unit firing pattern information: the passive and the active modes. The developed package consists of several graphical user interfaces used to detect individual MUP waveforms from a raw EMG signal, extract relevant features, and classify the MUPs into motor unit potential trains (MUPTs) using assertion-based classifiers.

  2. Accounting for Dependence Induced by Weighted KNN Imputation in Paired Samples, Motivated by a Colorectal Cancer Study

    PubMed Central

    Suyundikov, Anvar; Stevens, John R.; Corcoran, Christopher; Herrick, Jennifer; Wolff, Roger K.; Slattery, Martha L.

    2015-01-01

    Missing data can arise in bioinformatics applications for a variety of reasons, and imputation methods are frequently applied to such data. We are motivated by a colorectal cancer study where miRNA expression was measured in paired tumor-normal samples of hundreds of patients, but data for many normal samples were missing due to lack of tissue availability. We compare the precision and power performance of several imputation methods, and draw attention to the statistical dependence induced by K-Nearest Neighbors (KNN) imputation. This imputation-induced dependence has not previously been addressed in the literature. We demonstrate how to account for this dependence, and show through simulation how the choice to ignore or account for this dependence affects both power and type I error rate control. PMID:25849489
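    To make the source of the dependence concrete, the sketch below shows a generic inverse-distance-weighted kNN imputation. It is an illustration under stated assumptions rather than the procedure used in the study: the distance (root mean squared difference over features observed in both rows), the weighting scheme and the toy data are all hypothetical.

```python
import math


def weighted_knn_impute(rows, k=2):
    """Fill None entries with an inverse-distance-weighted mean of the k nearest donor rows."""
    completed = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = []
            for other in rows:
                if other is row or other[j] is None:
                    continue
                # Compare only the features observed in both rows.
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                donors.append((dist, other[j]))
            donors.sort(key=lambda d: d[0])
            nearest = donors[:k]
            if not nearest:
                continue  # no usable donor: leave the value missing
            # A small constant avoids division by zero for identical rows.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            completed[i][j] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
    return completed


# Hypothetical samples; the third feature of the first sample is missing.
rows = [
    [1.0, 2.0, None],
    [1.0, 2.0, 10.0],
    [5.0, 6.0, 50.0],
    [1.0, 2.0, 20.0],
]
filled = weighted_knn_impute(rows, k=2)
```

    The imputed entry is a weighted average of values taken from other samples, so it is no longer statistically independent of them. That is the imputation-induced dependence the abstract draws attention to.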

  3. Portable Language-Independent Adaptive Translation from OCR. Phase 1

    DTIC Science & Technology

    2009-04-01

    including brute-force k-Nearest Neighbors ( kNN ), fast approximate kNN using hashed k-d trees, classification and regression trees, and locality...achieved by refinements in ground-truthing protocols. Recent algorithmic improvements to our approximate kNN classifier using hashed k-D trees allows...recent years discriminative training has been shown to outperform phonetic HMMs estimated using ML for speech recognition. Standard ML estimation

  4. Prediction of human breast and colon cancers from imbalanced data using nearest neighbor and support vector machines.

    PubMed

    Majid, Abdul; Ali, Safdar; Iqbal, Mubashar; Kausar, Nabeela

    2014-03-01

    This study proposes a novel prediction approach for human breast and colon cancers using different feature spaces. The proposed scheme consists of two stages: the preprocessor and the predictor. In the preprocessor stage, the mega-trend diffusion (MTD) technique is employed to increase the samples of the minority class, thereby balancing the dataset. In the predictor stage, machine-learning approaches of K-nearest neighbor (KNN) and support vector machines (SVM) are used to develop hybrid MTD-SVM and MTD-KNN prediction models. MTD-SVM model has provided the best values of accuracy, G-mean and Matthew's correlation coefficient of 96.71%, 96.70% and 71.98% for cancer/non-cancer dataset, breast/non-breast cancer dataset and colon/non-colon cancer dataset, respectively. We found that hybrid MTD-SVM is the best with respect to prediction performance and computational cost. MTD-KNN model has achieved moderately better prediction as compared to hybrid MTD-NB (Naïve Bayes) but at the expense of higher computing cost. MTD-KNN model is faster than MTD-RF (random forest) but its prediction is not better than MTD-RF. To the best of our knowledge, the reported results are the best results, so far, for these datasets. The proposed scheme indicates that the developed models can be used as a tool for the prediction of cancer. This scheme may be useful for study of any sequential information such as protein sequence or any nucleic acid sequence. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.

  5. An Unsupervised kNN Method to Systematically Detect Changes in Protein Localization in High-Throughput Microscopy Images.

    PubMed

    Lu, Alex Xijie; Moses, Alan M

    2016-01-01

    Despite the importance of characterizing genes that exhibit subcellular localization changes between conditions in proteome-wide imaging experiments, many recent studies still rely upon manual evaluation to assess the results of high-throughput imaging experiments. We describe and demonstrate an unsupervised k-nearest neighbours method for the detection of localization changes. Compared to previous classification-based supervised change detection methods, our method is much simpler and faster, and operates directly on the feature space to overcome limitations in needing to manually curate training sets that may not generalize well between screens. In addition, the output of our method is flexible in its utility, generating both a quantitatively ranked list of localization changes that permit user-defined cut-offs, and a vector for each gene describing feature-wise direction and magnitude of localization changes. We demonstrate that our method is effective at the detection of localization changes using the Δrpd3 perturbation in Saccharomyces cerevisiae, where we capture 71.4% of previously known changes within the top 10% of ranked genes, and find at least four new localization changes within the top 1% of ranked genes. The results of our analysis indicate that simple unsupervised methods may be able to identify localization changes in images without laborious manual image labelling steps.

  6. A comparative analysis of swarm intelligence techniques for feature selection in cancer classification.

    PubMed

    Gunavathi, Chellamuthu; Premalatha, Kandasamy

    2014-01-01

    Feature selection in cancer classification is a central area of research in the field of bioinformatics and used to select the informative genes from thousands of genes of the microarray. The genes are ranked based on T-statistics, signal-to-noise ratio (SNR), and F-test values. The swarm intelligence (SI) technique finds the informative genes from the top-m ranked genes. These selected genes are used for classification. In this paper the shuffled frog leaping with Lévy flight (SFLLF) is proposed for feature selection. In SFLLF, the Lévy flight is included to avoid premature convergence of shuffled frog leaping (SFL) algorithm. The SI techniques such as particle swarm optimization (PSO), cuckoo search (CS), SFL, and SFLLF are used for feature selection which identifies informative genes for classification. The k-nearest neighbour (k-NN) technique is used to classify the samples. The proposed work is applied on 10 different benchmark datasets and examined with SI techniques. The experimental results show that the results obtained from k-NN classifier through SFLLF feature selection method outperform PSO, CS, and SFL.

  7. Minimum Expected Risk Estimation for Near-neighbor Classification

    DTIC Science & Technology

    2006-04-01

    We consider the problems of class probability estimation and classification when using near-neighbor classifiers, such as k-nearest neighbors ( kNN ...estimate for weighted kNN classifiers with different prior information, for a broad class of risk functions. Theory and simulations show how significant...the difference is compared to the standard maximum likelihood weighted kNN estimates. Comparisons are made with uniform weights, symmetric weights

  8. On Algorithms for Generating Computationally Simple Piecewise Linear Classifiers

    DTIC Science & Technology

    1989-05-01

    suffers. - Waveform classification, e.g. speech recognition, seismic analysis (i.e. discrimination between earthquakes and nuclear explosions), target...assuming Gaussian distributions (B-G) d) Bayes classifier with probability densities estimated with the k-N-N method (B- kNN ) e) The -arest neighbour...range of classifiers are chosen including a fast, easy computable and often used classifier (B-G), reliable and complex classifiers (B- kNN and NNR

  9. Determination of accuracy of winding deformation method using kNN based classifier used for 3 MVA transformer

    NASA Astrophysics Data System (ADS)

    Ahmed, Mustafa Wasir; Baishya, Manash Jyoti; Sharma, Sasanka Sekhor; Hazarika, Manash

    2018-04-01

    This paper presents a system for detecting faults in the winding, core and on-load tap changer (OLTC) of a power transformer. The accuracy of winding deformation detection is determined using a kNN-based classifier. Winding deformation in a power transformer can be measured using sweep frequency response analysis (SFRA), which can enhance the diagnosis accuracy to a large degree. The results suggest that minor deformation faults can be detected in the frequency range of 1 mHz to 2 MHz. The values of the RCL parameters change when faults occur, and hence the frequency response of the winding changes accordingly. The SFRA data of the tested transformer is compared with a reference trace. The difference between the two graphs indicates faults in the transformer. Deviations between 1 mHz and 1 kHz indicate winding deformation, between 1 kHz and 100 kHz core deformation, and between 100 kHz and 2 MHz OLTC deformation.

  10. Remaining Useful Life Estimation of Insulated Gate Bipolar Transistors (IGBTs) Based on a Novel Volterra k-Nearest Neighbor Optimally Pruned Extreme Learning Machine (VKOPP) Model Using Degradation Data

    PubMed Central

    Mei, Wenjuan; Zeng, Xianping; Yang, Chenglin; Zhou, Xiuyun

    2017-01-01

    The insulated gate bipolar transistor (IGBT) is a kind of excellent performance switching device used widely in power electronic systems. How to estimate the remaining useful life (RUL) of an IGBT to ensure the safety and reliability of the power electronics system is currently a challenging issue in the field of IGBT reliability. The aim of this paper is to develop a prognostic technique for estimating IGBTs’ RUL. There is a need for an efficient prognostic algorithm that is able to support in-situ decision-making. In this paper, a novel prediction model with a complete structure based on optimally pruned extreme learning machine (OPELM) and Volterra series is proposed to track the IGBT’s degradation trace and estimate its RUL; we refer to this model as Volterra k-nearest neighbor OPELM prediction (VKOPP) model. This model uses the minimum entropy rate method and Volterra series to reconstruct phase space for IGBTs’ ageing samples, and a new weight update algorithm, which can effectively reduce the influence of the outliers and noises, is utilized to establish the VKOPP network; then a combination of the k-nearest neighbor method (KNN) and least squares estimation (LSE) method is used to calculate the output weights of OPELM and predict the RUL of the IGBT. The prognostic results show that the proposed approach can predict the RUL of IGBT modules with small error and achieve higher prediction precision and lower time cost than some classic prediction approaches. PMID:29099811

  11. Investigating the Effects of Imputation Methods for Modelling Gene Networks Using a Dynamic Bayesian Network from Gene Expression Data

    PubMed Central

    CHAI, Lian En; LAW, Chow Kuan; MOHAMAD, Mohd Saberi; CHONG, Chuii Khim; CHOON, Yee Wen; DERIS, Safaai; ILLIAS, Rosli Md

    2014-01-01

    Background: Gene expression data often contain missing expression values. Therefore, several imputation methods have been applied to solve the missing values, which include k-nearest neighbour (kNN), local least squares (LLS), and Bayesian principal component analysis (BPCA). However, the effects of these imputation methods on the modelling of gene regulatory networks from gene expression data have rarely been investigated and analysed using a dynamic Bayesian network (DBN). Methods: In the present study, we separately imputed datasets of the Escherichia coli S.O.S. DNA repair pathway and the Saccharomyces cerevisiae cell cycle pathway with kNN, LLS, and BPCA, and subsequently used these to generate gene regulatory networks (GRNs) using a discrete DBN. We made comparisons on the basis of previous studies in order to select the gene network with the least error. Results: We found that BPCA and LLS performed better on larger networks (based on the S. cerevisiae dataset), whereas kNN performed better on smaller networks (based on the E. coli dataset). Conclusion: The results suggest that the performance of each imputation method is dependent on the size of the dataset, and this subsequently affects the modelling of the resultant GRNs using a DBN. In addition, on the basis of these results, a DBN has the capacity to discover potential edges, as well as display interactions, between genes. PMID:24876803

  12. Electromagnetic Induction Spectroscopy for the Detection of Subsurface Targets

    DTIC Science & Technology

    2012-12-01

    curves of the proposed method and that of Fails et al.. For the kNN ROC curve, k = 7. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81...et al. [6] and Ramachandran et al. [7] both demonstrated success in detecting mines using the k-nearest-neighbor ( kNN ) algorithm based on the EMI...error is also included in the feature vector. The kNN labels an unknown target based on the closest targets in a training set. Collins et al. [2] and

  13. K-Nearest Neighbor Estimation of Forest Attributes: Improving Mapping Efficiency

    Treesearch

    Andrew O. Finley; Alan R. Ek; Yun Bai; Marvin E. Bauer

    2005-01-01

    This paper describes our efforts in refining k-nearest neighbor forest attributes classification using U.S. Department of Agriculture Forest Service Forest Inventory and Analysis plot data and Landsat 7 Enhanced Thematic Mapper Plus imagery. The analysis focuses on FIA-defined forest type classification across St. Louis County in northeastern Minnesota. We outline...

  14. [Classification of Children with Attention-Deficit/Hyperactivity Disorder and Typically Developing Children Based on Electroencephalogram Principal Component Analysis and k-Nearest Neighbor].

    PubMed

    Yang, Jiaojiao; Guo, Qian; Li, Wenjie; Wang, Suhong; Zou, Ling

    2016-04-01

    This paper aims to assist the individual clinical diagnosis of children with attention-deficit/hyperactivity disorder using an electroencephalogram signal detection method. Firstly, in our experiments, we obtained and studied the electroencephalogram signals from fourteen attention-deficit/hyperactivity disorder children and sixteen typically developing children during the classic interference control task of Simon-spatial Stroop, and we completed electroencephalogram data preprocessing including filtering, segmentation, removal of artifacts and so on. Secondly, we selected the subset of electroencephalogram electrodes using the principal component analysis (PCA) method, and we collected the common channels of the optimal electrodes whose occurrence rates were more than 90% in each kind of stimulation. We then extracted the latency (200~450 ms) mean amplitude features of the common electrodes. Finally, we used the k-nearest neighbor (KNN) classifier based on Euclidean distance and the support vector machine (SVM) classifier based on radial basis kernel function to classify. In the experiment, for the same kind of interference control task, the attention-deficit/hyperactivity disorder children showed lower correct response rates and longer reaction times. The N2 emerged in the prefrontal cortex while P2 appeared in the inferior parietal area for all kinds of stimuli. Meanwhile, the children with attention-deficit/hyperactivity disorder exhibited markedly reduced N2 and P2 amplitude compared to typically developing children. KNN resulted in better classification accuracy than the SVM classifier, and the best classification rate was 89.29% in the StI task. The results showed that the electroencephalogram signals were different in the brain regions of prefrontal cortex and inferior parietal cortex between attention-deficit/hyperactivity disorder and typically developing children during the interference control task, which provided a scientific basis for the clinical diagnosis of attention

  15. Predicting protein subnuclear location with optimized evidence-theoretic K-nearest classifier and pseudo amino acid composition.

    PubMed

    Shen, Hong-Bin; Chou, Kuo-Chen

    2005-11-25

    The nucleus is the brain of eukaryotic cells that guides the life processes of the cell by issuing key instructions. For in-depth understanding of the biochemical process of the nucleus, the knowledge of localization of nuclear proteins is very important. With the avalanche of protein sequences generated in the post-genomic era, it is highly desired to develop an automated method for fast annotating the subnuclear locations for numerous newly found nuclear protein sequences so as to be able to timely utilize them for basic research and drug discovery. In view of this, a novel approach is developed for predicting the protein subnuclear location. It is featured by introducing a powerful classifier, the optimized evidence-theoretic K-nearest classifier, and using the pseudo amino acid composition [K.C. Chou, PROTEINS: Structure, Function, and Genetics, 43 (2001) 246], which can incorporate a considerable amount of sequence-order effects, to represent protein samples. As a demonstration, identifications were performed for 370 nuclear proteins among the following 9 subnuclear locations: (1) Cajal body, (2) chromatin, (3) heterochromatin, (4) nuclear diffuse, (5) nuclear pore, (6) nuclear speckle, (7) nucleolus, (8) PcG body, and (9) PML body. The overall success rates thus obtained by both the re-substitution test and jackknife cross-validation test are significantly higher than those by existing classifiers on the same working dataset. It is anticipated that the powerful approach may also become a useful high throughput vehicle to bridge the huge gap occurring in the post-genomic era between the number of gene sequences in databases and the number of gene products that have been functionally characterized. The OET-KNN classifier will be available at www.pami.sjtu.edu.cn/people/hbshen.

  16. Fusing Bluetooth Beacon Data with Wi-Fi Radiomaps for Improved Indoor Localization

    PubMed Central

    Kanaris, Loizos; Kokkinis, Akis; Liotta, Antonio; Stavrou, Stavros

    2017-01-01

    Indoor user localization and tracking are instrumental to a broad range of services and applications in the Internet of Things (IoT) and particularly in Body Sensor Networks (BSN) and Ambient Assisted Living (AAL) scenarios. Due to the widespread availability of IEEE 802.11, many localization platforms have been proposed, based on the Wi-Fi Received Signal Strength (RSS) indicator, using algorithms such as K-Nearest Neighbour (KNN), Maximum A Posteriori (MAP) and Minimum Mean Square Error (MMSE). In this paper, we introduce a hybrid method that combines the simplicity (and low cost) of Bluetooth Low Energy (BLE) and the popular 802.11 infrastructure, to improve the accuracy of indoor localization platforms. Building on KNN, we propose a new positioning algorithm (dubbed i-KNN) which is able to filter the initial fingerprint dataset (i.e., the radiomap), after considering the proximity of RSS fingerprints with respect to the BLE devices. In this way, i-KNN provides an optimised small subset of possible user locations, based on which it finally estimates the user position. The proposed methodology achieves fast positioning estimation due to the utilization of a fragment of the initial fingerprint dataset, while at the same time improves positioning accuracy by minimizing any calculation errors. PMID:28394268
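    The two-stage idea behind i-KNN (filter the radiomap using BLE proximity, then run KNN on the remaining Wi-Fi fingerprints) can be sketched as below. The abstract does not give the filtering rule, so this sketch assumes each fingerprint is tagged with its nearest beacon and that the user device reports one detected beacon. The field names, the fallback to the full radiomap and the centroid estimate are likewise assumptions, not the published algorithm.

```python
import math


def iknn_locate(radiomap, wifi_rss, beacon_id, k=3):
    """Estimate an (x, y) position from Wi-Fi RSS after BLE-based radiomap filtering."""
    # Step 1: keep only fingerprints recorded near the detected BLE beacon.
    # Assumption: fall back to the full radiomap if the filter leaves nothing.
    subset = [fp for fp in radiomap if fp["beacon"] == beacon_id] or radiomap
    # Step 2: plain KNN in Wi-Fi RSS space on the reduced set.
    nearest = sorted(subset, key=lambda fp: math.dist(fp["rss"], wifi_rss))[:k]
    # Step 3: the position estimate is the centroid of the k nearest fingerprints.
    x = sum(fp["pos"][0] for fp in nearest) / len(nearest)
    y = sum(fp["pos"][1] for fp in nearest) / len(nearest)
    return x, y


# Hypothetical radiomap: RSS in dBm from two access points, positions in metres.
radiomap = [
    {"pos": (0.0, 0.0), "rss": (-40.0, -70.0), "beacon": "b1"},
    {"pos": (2.0, 0.0), "rss": (-45.0, -65.0), "beacon": "b1"},
    {"pos": (10.0, 10.0), "rss": (-41.0, -69.0), "beacon": "b2"},
    {"pos": (12.0, 10.0), "rss": (-80.0, -40.0), "beacon": "b2"},
]
estimate = iknn_locate(radiomap, wifi_rss=(-41.0, -69.0), beacon_id="b1", k=2)
unfiltered = iknn_locate(radiomap, wifi_rss=(-41.0, -69.0), beacon_id="unknown", k=1)
```

    In this toy radiomap a distant fingerprint near beacon b2 has almost the same Wi-Fi signature as the query. Filtering by beacon removes it before the KNN step, which illustrates how the smaller search set can both speed up the estimate and avoid this kind of mismatch.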

  17. Fusing Bluetooth Beacon Data with Wi-Fi Radiomaps for Improved Indoor Localization.

    PubMed

    Kanaris, Loizos; Kokkinis, Akis; Liotta, Antonio; Stavrou, Stavros

    2017-04-10

    Indoor user localization and tracking are instrumental to a broad range of services and applications in the Internet of Things (IoT) and particularly in Body Sensor Networks (BSN) and Ambient Assisted Living (AAL) scenarios. Due to the widespread availability of IEEE 802.11, many localization platforms have been proposed, based on the Wi-Fi Received Signal Strength (RSS) indicator, using algorithms such as K-Nearest Neighbour (KNN), Maximum A Posteriori (MAP) and Minimum Mean Square Error (MMSE). In this paper, we introduce a hybrid method that combines the simplicity (and low cost) of Bluetooth Low Energy (BLE) and the popular 802.11 infrastructure, to improve the accuracy of indoor localization platforms. Building on KNN, we propose a new positioning algorithm (dubbed i-KNN) which is able to filter the initial fingerprint dataset (i.e., the radiomap), after considering the proximity of RSS fingerprints with respect to the BLE devices. In this way, i-KNN provides an optimised small subset of possible user locations, based on which it finally estimates the user position. The proposed methodology achieves fast positioning estimation due to the utilization of a fragment of the initial fingerprint dataset, while at the same time improves positioning accuracy by minimizing any calculation errors.

  18. A new hybrid method based on fuzzy-artificial immune system and k-nn algorithm for breast cancer diagnosis.

    PubMed

    Sahan, Seral; Polat, Kemal; Kodaz, Halife; Güneş, Salih

    2007-03-01

    The use of machine learning tools in medical diagnosis is increasing gradually. This is mainly because the effectiveness of classification and recognition systems has improved a great deal, helping medical experts to diagnose diseases. One such disease is breast cancer, which is a very common type of cancer among women. As the incidence of this disease has increased significantly in recent years, machine learning applications to this problem have also attracted great attention, alongside medical consideration. This study aims at diagnosing breast cancer with a new hybrid machine learning method. By hybridizing a fuzzy-artificial immune system with the k-nearest neighbour algorithm, a method was obtained to solve this diagnosis problem by classifying the Wisconsin Breast Cancer Dataset (WBCD). This data set is very commonly used in the literature on classification systems for breast cancer diagnosis, and it was used in this study to compare the classification performance of our proposed method with that of other studies. We obtained a classification accuracy of 99.14%, which is the highest one reached so far. The classification accuracy was obtained via 10-fold cross validation. This result is for WBCD, but it indicates that this method can be used confidently for other breast cancer diagnosis problems, too.
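
    The 10-fold cross validation mentioned in this abstract can be sketched as follows. This is a generic illustration with a plain 1-nearest-neighbour classifier on made-up data, not the hybrid fuzzy-AIS method or the WBCD data; the function names are hypothetical, and a real evaluation would normally shuffle or stratify the folds.

```python
import math

def k_fold_indices(n_samples, n_folds=10):
    """Yield (train, test) index lists; every sample lands in exactly one test fold."""
    indices = list(range(n_samples))
    for fold in range(n_folds):
        test = indices[fold::n_folds]          # interleaved assignment, no shuffling
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test

def nearest_neighbour(train_x, train_y, query):
    """1-NN: return the label of the closest training sample."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]

def cross_validated_accuracy(samples, labels, classify, n_folds=10):
    """Fraction of samples classified correctly when each is predicted from the other folds."""
    correct = 0
    for train, test in k_fold_indices(len(samples), n_folds):
        train_x = [samples[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in test:
            correct += classify(train_x, train_y, samples[i]) == labels[i]
    return correct / len(samples)

# Made-up, well-separated one-dimensional data: ten samples per class.
samples = [(float(v),) for v in range(10)] + [(float(v),) for v in range(100, 110)]
labels = [0] * 10 + [1] * 10
accuracy = cross_validated_accuracy(samples, labels, nearest_neighbour)
```

    Because each sample is predicted only from folds that exclude it, the reported accuracy estimates performance on unseen data rather than on the training set.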

  19. Motion Control of Drives for Prosthetic Hand Using Continuous Myoelectric Signals

    NASA Astrophysics Data System (ADS)

    Purushothaman, Geethanjali; Ray, Kalyan Kumar

    2016-03-01

    In this paper the authors present motion control of a prosthetic hand through continuous myoelectric signal acquisition, classification and actuation of the prosthetic drive. Four-channel continuous electromyogram (EMG) signals, also known as myoelectric signals (MES), are acquired from able-bodied subjects to classify six unique movements of the hand and wrist, viz. hand open (HO), hand close (HC), wrist flexion (WF), wrist extension (WE), ulnar deviation (UD) and radial deviation (RD). The classification technique involves extracting features/patterns through statistical time domain (TD) parameters and/or autoregressive (AR) coefficients, which are reduced using principal component analysis (PCA). The reduced statistical TD features and/or AR coefficients are used to classify the signal patterns with a k nearest neighbour (kNN) classifier as well as a neural network (NN) classifier, and the performance of the classifiers is compared. The comparison clearly shows that the kNN classifier is better than the NN classifier at identifying the hidden intended motion in the myoelectric signals. Once the classifier identifies the intended motion, the signal is amplified to actuate three low-power DC motors to perform the above-mentioned movements.

  20. Machine Learning and Computer Vision System for Phenotype Data Acquisition and Analysis in Plants.

    PubMed

    Navarro, Pedro J; Pérez, Fernando; Weiss, Julia; Egea-Cortines, Marcos

    2016-05-05

    Phenomics is a technology-driven approach with a promising future for obtaining unbiased data on biological systems. Image acquisition is relatively simple. However, data handling and analysis are not as developed as the sampling capacities. We present a system based on machine learning (ML) algorithms and computer vision intended to solve automatic phenotype data analysis in plant material. We developed a growth chamber able to accommodate species of various sizes. Night image acquisition requires near-infrared lighting. For the ML process, we tested three different algorithms: k-nearest neighbour (kNN), Naive Bayes Classifier (NBC), and Support Vector Machine (SVM). Each ML algorithm was executed with different kernel functions, and they were trained with raw data and two types of data normalisation. Different metrics were computed to determine the optimal configuration of the machine learning algorithms. We obtained a performance of 99.31% with kNN for RGB images and 99.34% with SVM for NIR images. Our results show that ML techniques can speed up phenomic data analysis. Furthermore, both RGB and NIR images can be segmented successfully but may require different ML algorithms for segmentation.

  1. An expert system based on principal component analysis, artificial immune system and fuzzy k-NN for diagnosis of valvular heart diseases.

    PubMed

    Sengur, Abdulkadir

    2008-03-01

    In the last two decades, the use of artificial intelligence methods in medical analysis has been increasing. This is mainly because the effectiveness of classification and detection systems has improved a great deal, helping medical experts in diagnosis. In this work, we investigate the use of principal component analysis (PCA), artificial immune system (AIS) and fuzzy k-NN to distinguish normal from abnormal heart valves using Doppler heart sounds. The proposed heart valve disorder detection system is composed of three stages. The first stage is pre-processing, in which filtering, normalization and white de-noising were used. The second stage is feature extraction, in which wavelet packet decomposition was used and wavelet entropy values were then taken as features. To reduce the complexity of the system, PCA was used for feature reduction. In the classification stage, AIS and fuzzy k-NN were used. To evaluate the performance of the proposed methodology, a comparative study was carried out using a data set containing 215 samples. The proposed method was validated using sensitivity and specificity; 95.9% sensitivity and 96% specificity were obtained.
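
    For readers unfamiliar with the two validation measures, the short sketch below shows how sensitivity and specificity are computed from true and predicted labels. The labels are made up for illustration and are unrelated to the 215-sample data set of the study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    sensitivity = TP / (TP + FN): share of actual positives that are detected.
    specificity = TN / (TN + FP): share of actual negatives that are cleared.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Made-up example: 1 = abnormal valve, 0 = normal valve.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

    Here three of the four abnormal cases are detected (sensitivity 0.75) and four of the five normal cases are cleared (specificity 0.8).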

  2. A Quantum Hybrid PSO Combined with Fuzzy k-NN Approach to Feature Selection and Cell Classification in Cervical Cancer Detection.

    PubMed

    Iliyasu, Abdullah M; Fatichah, Chastine

    2017-12-19

    A quantum hybrid (QH) intelligent approach that blends the adaptive search capability of the quantum-behaved particle swarm optimisation (QPSO) method with the intuitionistic rationality of the traditional fuzzy k-nearest neighbours (Fuzzy k-NN) algorithm (known simply as the Q-Fuzzy approach) is proposed for efficient feature selection and classification of cells in cervical smeared (CS) images. From an initial multitude of 17 features describing the geometry, colour, and texture of the CS images, the QPSO stage of our proposed technique is used to select the best subset features (i.e., global best particles) that represent a pruned down collection of seven features. Using a dataset of almost 1000 images, performance evaluation of our proposed Q-Fuzzy approach assesses the impact of our feature selection on classification accuracy by way of three experimental scenarios that are compared alongside two other approaches: the All-features (i.e., classification without prior feature selection) and another hybrid technique combining the standard PSO algorithm with the Fuzzy k-NN technique (P-Fuzzy approach). In the first and second scenarios, we further divided the assessment criteria in terms of classification accuracy based on the choice of best features and those in terms of the different categories of the cervical cells. In the third scenario, we introduced new QH hybrid techniques, i.e., QPSO combined with other supervised learning methods, and compared the classification accuracy alongside our proposed Q-Fuzzy approach. Furthermore, we employed statistical approaches to establish qualitative agreement with regard to the feature selection in the experimental scenarios 1 and 3. The synergy between the QPSO and Fuzzy k-NN in the proposed Q-Fuzzy approach improves classification accuracy as manifest in the reduction in the number of cell features, which is crucial for effective cervical cancer detection and diagnosis.

  3. Medical diagnosis of atherosclerosis from Carotid Artery Doppler Signals using principal component analysis (PCA), k-NN based weighting pre-processing and Artificial Immune Recognition System (AIRS).

    PubMed

    Latifoğlu, Fatma; Polat, Kemal; Kara, Sadik; Güneş, Salih

    2008-02-01

    In this study, we proposed a new medical diagnosis system based on principal component analysis (PCA), k-NN based weighting pre-processing, and Artificial Immune Recognition System (AIRS) for diagnosis of atherosclerosis from Carotid Artery Doppler Signals. The suggested system consists of four stages. First, in the feature extraction stage, we have obtained the features related with atherosclerosis disease using Fast Fourier Transformation (FFT) modeling and by calculating of maximum frequency envelope of sonograms. Second, in the dimensionality reduction stage, the 61 features of atherosclerosis disease have been reduced to 4 features using PCA. Third, in the pre-processing stage, we have weighted these 4 features using different values of k in a new weighting scheme based on k-NN based weighting pre-processing. Finally, in the classification stage, AIRS classifier has been used to classify subjects as healthy or having atherosclerosis. Hundred percent of classification accuracy has been obtained by the proposed system using 10-fold cross validation. This success shows that the proposed system is a robust and effective system in diagnosis of atherosclerosis disease.

  4. Unconstrained and contactless hand geometry biometrics.

    PubMed

    de-Santos-Sierra, Alberto; Sánchez-Ávila, Carmen; Del Pozo, Gonzalo Bailador; Guerra-Casanova, Javier

    2011-01-01

    This paper presents a hand biometric system for contact-less, platform-free scenarios, proposing innovative methods in feature extraction, template creation and template matching. The evaluation of the proposed method considers both the use of three contact-less publicly available hand databases, and the comparison of the performance to two competitive pattern recognition techniques existing in the literature: namely support vector machines (SVM) and k-nearest neighbour (k-NN). Results highlight the fact that the proposed method outperforms existing approaches in the literature in terms of computational cost, accuracy in human identification, number of extracted features and number of samples for template creation. The proposed method is a suitable solution for human identification in contact-less scenarios based on hand biometrics, providing a feasible solution to devices with limited hardware requirements like mobile devices.

  5. Unconstrained and Contactless Hand Geometry Biometrics

    PubMed Central

    de-Santos-Sierra, Alberto; Sánchez-Ávila, Carmen; del Pozo, Gonzalo Bailador; Guerra-Casanova, Javier

    2011-01-01

    This paper presents a hand biometric system for contact-less, platform-free scenarios, proposing innovative methods in feature extraction, template creation and template matching. The evaluation of the proposed method considers both the use of three contact-less publicly available hand databases, and the comparison of the performance to two competitive pattern recognition techniques existing in the literature: namely Support Vector Machines (SVM) and k-Nearest Neighbour (k-NN). Results highlight the fact that the proposed method outperforms existing approaches in the literature in terms of computational cost, accuracy in human identification, number of extracted features and number of samples for template creation. The proposed method is a suitable solution for human identification in contact-less scenarios based on hand biometrics, providing a feasible solution to devices with limited hardware requirements like mobile devices. PMID:22346634

  6. Determination of authenticity of brand perfume using electronic nose prototypes

    NASA Astrophysics Data System (ADS)

    Gebicki, Jacek; Szulczynski, Bartosz; Kaminski, Marian

    2015-12-01

    The paper presents the practical application of an electronic nose technique for fast and efficient discrimination between authentic and fake perfume samples. Two self-built electronic nose prototypes equipped with a set of semiconductor sensors were employed for that purpose. Additionally, 10 volunteers took part in the sensory analysis. The following perfumes and their fake counterparts were analysed: Dior - Fahrenheit, Eisenberg - J’ose, YSL - La nuit de L’homme, 7 Loewe and Spice Bomb. The investigations were carried out using the headspace of the aqueous solutions. Data analysis utilized multidimensional techniques: principal component analysis (PCA), linear discriminant analysis (LDA) and k-nearest neighbour (k-NN). The results obtained confirmed the legitimacy of the electronic nose technique as an alternative to the sensory analysis as far as the determination of authenticity of perfume is concerned.

  7. Exploitation of RF-DNA for Device Classification and Verification Using GRLVQI Processing

    DTIC Science & Technology

    2012-12-01

    5 FLD Fisher’s Linear Discriminant . . . . . . . . . . . . . . . . . . . 6 kNN K-Nearest Neighbor...Neighbor (kNN), Support Vector Machine (SVM), and simple cross-correlation techniques [40, 57, 82, 88, 94, 95]. The RF-DNA fingerprinting research in...Expansion and the Discrete Gabor Transform on a Non-Separable Lattice”. 2000 IEEE Int’l Conf on Acoustics, Speech, and Signal Processing (ICASSP00

  8. Credit scoring analysis using weighted k nearest neighbor

    NASA Astrophysics Data System (ADS)

    Mukid, M. A.; Widiharih, T.; Rusgiyono, A.; Prahutama, A.

    2018-05-01

    Credit scoring is a quantitative method to evaluate the credit risk of loan applications. Both statistical methods and artificial intelligence are often used by credit analysts to help them decide whether the applicants are worthy of credit. These methods aim to predict future behavior in terms of credit risk based on past experience of customers with similar characteristics. This paper reviews the weighted k nearest neighbor (WKNN) method for credit assessment by considering the use of some kernels. We use credit data from a private bank in Indonesia. The result shows that the Gaussian kernel and the rectangular kernel perform better, based on the percentage of correctly classified cases, which is 82.4% for each.
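
    A minimal sketch of kernel-weighted k nearest neighbour voting may help to show what the choice of kernel changes. It is not the authors' code or data: the points are made up, and rescaling distances by the distance to the (k+1)-th neighbour is one common convention that is assumed here, since the abstract does not state the scheme used.

```python
import math
from collections import defaultdict

def rectangular(u):
    """Uniform kernel: every neighbour inside the window gets the same weight."""
    return 0.5 if abs(u) <= 1 else 0.0

def gaussian(u):
    """Standard normal density: closer neighbours get larger weights."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)

def weighted_knn(train, query, k, kernel):
    """Classify `query` by a kernel-weighted vote among its k nearest training points.

    Requires len(train) > k, because the (k+1)-th neighbour sets the distance scale.
    """
    ranked = sorted(train, key=lambda item: math.dist(item[0], query))
    scale = math.dist(ranked[k][0], query)   # assumed normalisation, maps distances to [0, 1]
    votes = defaultdict(float)
    for features, label in ranked[:k]:
        votes[label] += kernel(math.dist(features, query) / scale)
    return max(votes, key=votes.get)

# Made-up applicants described by two scaled features, labelled "good" or "bad".
train = [
    ((0.1, 0.0), "good"), ((0.0, 0.1), "good"),
    ((2.9, 0.0), "bad"), ((0.0, 2.9), "bad"), ((-2.9, 0.0), "bad"),
    ((3.0, 0.0), "good"),
]
query = (0.0, 0.0)

by_rectangular = weighted_knn(train, query, k=5, kernel=rectangular)
by_gaussian = weighted_knn(train, query, k=5, kernel=gaussian)
```

    With the rectangular kernel the vote reduces to an unweighted majority (three "bad" against two "good"), whereas the Gaussian kernel lets the two very close "good" neighbours outweigh the three distant "bad" ones. The toy data were constructed to make the kernels disagree; on real data they may well agree, as the equal accuracies reported above suggest.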

  9. A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.

    PubMed

    Khondoker, Mizanur; Dobson, Richard; Skirrow, Caroline; Simmons, Andrew; Stahl, Daniel

    2016-10-01

    Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms. We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features. For a smaller number of correlated features, with the number of features not exceeding approximately half the sample size, LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outplays that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from

  10. Nearest clusters based partial least squares discriminant analysis for the classification of spectral data.

    PubMed

    Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar

    2018-06-07

    Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. AVNM: A Voting based Novel Mathematical Rule for Image Classification.

    PubMed

    Vidyarthi, Ankit; Mittal, Namita

    2016-12-01

    In machine learning, the accuracy of the system depends upon the classification result. Classification accuracy plays an imperative role in various domains. A non-parametric classifier like K-Nearest Neighbor (KNN) is the most widely used classifier for pattern analysis. Despite its ease of use, simplicity and effectiveness, the main problem associated with the KNN classifier is the selection of the number of nearest neighbors, i.e. "k", for computation. At present, it is hard to find the optimal value of "k" using any statistical algorithm, which gives perfect accuracy in terms of low misclassification error rate. Motivated by this problem, a new sample space reduction weighted voting mathematical rule (AVNM) is proposed for classification in machine learning. The proposed AVNM rule is also non-parametric in nature, like KNN. AVNM uses the weighted voting mechanism with sample space reduction to learn and examine the predicted class label for an unidentified sample. AVNM is free from any initial selection of a predefined variable and from the neighbor selection found in the KNN algorithm. The proposed classifier also reduces the effect of outliers. To verify the performance of the proposed AVNM classifier, experiments are made on 10 standard datasets taken from the UCI database and one manually created dataset. The experimental results show that the proposed AVNM rule outperforms the KNN classifier and its variants. Experimental results based on the confusion matrix accuracy parameter show a higher accuracy value with the AVNM rule. The proposed AVNM rule is based on a sample space reduction mechanism for identification of an optimal number of nearest neighbors. AVNM results in better classification accuracy and a lower error rate compared with the state-of-the-art algorithm, KNN, and its variants. The proposed rule automates the selection of nearest neighbors and improves the classification rate for the UCI datasets and the manually created dataset. Copyright © 2016 Elsevier

  12. Short-term Power Load Forecasting Based on Balanced KNN

    NASA Astrophysics Data System (ADS)

    Lv, Xianlong; Cheng, Xingong; YanShuang; Tang, Yan-mei

    2018-03-01

    To improve the accuracy of load forecasting, a short-term load forecasting model based on a balanced KNN algorithm is proposed. According to the load characteristics, the historical data of massive power load are divided into scenes by the K-means algorithm. In view of unbalanced load scenes, the balanced KNN algorithm is proposed to classify the scene accurately. The local weighted linear regression algorithm is used to fit and predict the load. Adopting the Apache Hadoop programming framework of cloud computing, the proposed algorithm model is parallelized and improved to enhance its ability to deal with massive and high-dimensional data. The household electricity consumption data for a residential district are analysed on a 23-node cloud computing cluster, and experimental results show that the load forecasting accuracy and execution time of the proposed model are better than those of the traditional forecasting algorithm.

  13. A Proposed Methodology to Classify Frontier Capital Markets

    DTIC Science & Technology

    2011-07-31

    but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project involves basic...machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ), ensemble...Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F. Horizontal

  14. A Proposed Methodology to Classify Frontier Capital Markets

    DTIC Science & Technology

    2011-07-31

    out of charity, but because it is the surest route to our common good.” -Inaugural Speech by President Barack Obama, Jan 2009 This project...identification, and machine learning. The algorithm consists of a unique binary classifier mechanism that combines three methods: k-Nearest Neighbors ( kNN ...Support Through kNN Ensemble Classification Techniques E. Capital Market Classification Based on Capital Flows and Trading Architecture F

  15. A nearest neighbour approach by genetic distance to the assignment of individual trees to geographic origin.

    PubMed

    Degen, Bernd; Blanc-Jolivet, Céline; Stierand, Katrin; Gillet, Elizabeth

    2017-03-01

    During the past decade, the use of DNA for forensic applications has been extensively implemented for plant and animal species, as well as in humans. Tracing back the geographical origin of an individual usually requires genetic assignment analysis. These approaches are based on reference samples that are grouped into populations or other aggregates and intend to identify the most likely group of origin. Often this grouping does not have a biological but rather a historical or political justification, such as "country of origin". In this paper, we present a new nearest neighbour approach to individual assignment or classification within a given but potentially imperfect grouping of reference samples. This method, which is based on the genetic distance between individuals, functions better in many cases than commonly used methods. We demonstrate the operation of our assignment method using two data sets. One set is simulated for a large number of trees distributed in a 120km by 120km landscape with individual genotypes at 150 SNPs, and the other set comprises experimental data of 1221 individuals of the African tropical tree species Entandrophragma cylindricum (Sapelli) genotyped at 61 SNPs. Judging by the level of correct self-assignment, our approach outperformed the commonly used frequency and Bayesian approaches by 15% for the simulated data set and by 5-7% for the Sapelli data set. Our new approach is less sensitive to overlapping sources of genetic differentiation, such as genetic differences among closely-related species, phylogeographic lineages and isolation by distance, and thus operates better even for suboptimal grouping of individuals. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  16. A new feature constituting approach to detection of vocal fold pathology

    NASA Astrophysics Data System (ADS)

    Hariharan, M.; Polat, Kemal; Yaacob, Sazali

    2014-08-01

    In the last two decades, non-invasive methods based on acoustic analysis of the voice signal have proved to be excellent and reliable tools for diagnosing vocal fold pathologies. This paper proposes a new feature vector based on the wavelet packet transform and singular value decomposition for the detection of vocal fold pathology. k-means clustering based feature weighting is proposed to increase the distinguishing performance of the proposed features. In this work, two databases are used: the Massachusetts Eye and Ear Infirmary (MEEI) voice disorders database and the MAPACI speech pathology database. Four different supervised classifiers, namely k-nearest neighbour (k-NN), least-square support vector machine, probabilistic neural network and general regression neural network, are employed for testing the proposed features. The experimental results show that the proposed features give a very promising classification accuracy of 100% for both the MEEI database and the MAPACI speech pathology database.

  17. The Use of Fuzzy Set Classification for Pattern Recognition of the Polygraph

    DTIC Science & Technology

    1993-12-01

    actual feature extraction was done, It was decided to use the K-nearest neighbor ( KNN ) the data was preprocessed. The electrocardiogram classifier in...showing heart pulse, and a low frequency not known beforehand, and the KNN classifier does not component showing blood volume. The derivative of...the characteristics of the conventional KNN these six derived signals were detrended and filtered, classification method is that it assigns each

  18. Imputed forest structure uncertainty varies across elevational and longitudinal gradients in the western Cascade mountains, Oregon, USA

    Treesearch

    David M. Bell; Matthew J. Gregory; Janet L. Ohmann

    2015-01-01

    Imputation provides a useful method for mapping forest attributes across broad geographic areas based on field plot measurements and Landsat multi-spectral data, but the resulting map products may be of limited use without corresponding analyses of uncertainties in predictions. In the case of k-nearest neighbor (kNN) imputation with k = 1, such as the Gradient Nearest...

  19. One input-class and two input-class classifications for differentiating olive oil from other edible vegetable oils by use of the normal-phase liquid chromatography fingerprint of the methyl-transesterified fraction.

    PubMed

    Jiménez-Carvelo, Ana M; Pérez-Castaño, Estefanía; González-Casado, Antonio; Cuadros-Rodríguez, Luis

    2017-04-15

    A new method for differentiation of olive oil (independently of the quality category) from other vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, palm, linseed, sesame and soybean) has been developed. The analytical procedure for chromatographic fingerprinting of the methyl-transesterified fraction of each vegetable oil, using normal-phase liquid chromatography, is described and the chemometric strategies applied are discussed. Some chemometric methods, such as k-nearest neighbours (kNN), partial least squares-discriminant analysis (PLS-DA), support vector machine classification analysis (SVM-C), and soft independent modelling of class analogies (SIMCA), were applied to build classification models. Performance of the classification was evaluated and ranked using several classification quality metrics. The discriminant analysis, based on the use of one input-class (plus a dummy class), was applied for the first time in this study. Copyright © 2016 Elsevier Ltd. All rights reserved.

  20. Optimized Seizure Detection Algorithm: A Fast Approach for Onset of Epileptic in EEG Signals Using GT Discriminant Analysis and K-NN Classifier

    PubMed Central

    Rezaee, Kh.; Azizi, E.; Haddadnia, J.

    2016-01-01

    Background: Epilepsy is a severe disorder of the central nervous system that predisposes the person to recurrent seizures. Fifty million people worldwide suffer from epilepsy; after Alzheimer’s and stroke, it is the third most widespread nervous disorder. Objective: In this paper, an algorithm to detect the onset of epileptic seizures based on the analysis of brain electrical signals (EEG) is proposed. 844 hours of EEG were recorded from 23 pediatric patients consecutively, with 163 occurrences of seizures. Signals had been collected at Children’s Hospital Boston with a sampling frequency of 256 Hz through 18 channels in order to assess epilepsy surgery. By selecting effective features from seizure and non-seizure signals of each individual and putting them into two categories, the proposed algorithm detects the onset of seizures quickly and with high sensitivity. Method: In this algorithm, L-sec epochs of signals are represented as a third-order tensor in spatial, spectral and temporal spaces by applying the wavelet transform. Then, after applying general tensor discriminant analysis (GTDA) to the tensors and calculating the mapping matrix, feature vectors are extracted. GTDA increases the sensitivity of the algorithm by storing data without deleting them. Finally, K-Nearest neighbors (KNN) is used to classify the selected features. Results: The results of simulating the algorithm on a standard dataset show that the algorithm is capable of detecting 98 percent of seizures with an average delay of 4.7 seconds and an average detection error rate of three errors in 24 hours. Conclusion: Today, the lack of an automated system to detect or predict seizure onset is strongly felt. PMID:27672628

  1. Nation-Building Modeling and Resource Allocation Via Dynamic Programming

    DTIC Science & Technology

    2014-09-01

    Figure 2. RAND Study Models[59:98,115] (WMA) and used both the k-Nearest Neighbor (KNN) and Nearest Centroid (NC) algorithms to classify future features...The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The methodology was adopted for use under the...analysis feature of the model. 3.7.1 The No Surge Alternative. On the 10th of January 2007, President George W. Bush delivered a speech to the American

  2. Understanding the Instruments of National Power through a System of Differential Equations in a Counterinsurgency

    DTIC Science & Technology

    2012-03-01

    WMA) and used both the k-Nearest Neighbor ( KNN ) and Nearest Centroid 27 (a) Coalition and Regional (b) Indigenous Figure 3. RAND Study Models[32:98,115...NC) algorithms to classify future features. The study found that KNN performed better than NC with 85% or greater accuracy in all test cases. The...the model. 4.2.1 No Surge. On the 10th of January 2007, President George W. Bush delivered a speech to the American Public outlining a new strategy in

  3. K-nearest neighbors based methods for identification of different gear crack levels under different motor speeds and loads: Revisited

    NASA Astrophysics Data System (ADS)

    Wang, Dong

    2016-03-01

    Gears are the most commonly used components in mechanical transmission systems. Their failures may cause transmission system breakdown and result in economic loss. Identification of different gear crack levels is important to prevent any unexpected gear failure because gear cracks lead to gear tooth breakage. Signal processing based methods mainly require expertise to explain gear fault signatures, which ordinary users usually cannot easily acquire. In order to automatically identify different gear crack levels, intelligent gear crack identification methods should be developed. The previous case studies experimentally proved that K-nearest neighbors based methods exhibit high prediction accuracies for identification of 3 different gear crack levels under different motor speeds and loads. In this short communication, to further enhance prediction accuracies of existing K-nearest neighbors based methods and extend identification of 3 different gear crack levels to identification of 5 different gear crack levels, redundant statistical features are constructed by using Daubechies 44 (db44) binary wavelet packet transform at different wavelet decomposition levels, prior to the use of a K-nearest neighbors method. The dimensionality of redundant statistical features is 620, which provides richer gear fault signatures. Since many of these statistical features are redundant and highly correlated with each other, dimensionality reduction of redundant statistical features is conducted to obtain new significant statistical features. At last, the K-nearest neighbors method is used to identify 5 different gear crack levels under different motor speeds and loads. A case study including 3 experiments is investigated to demonstrate that the developed method provides higher prediction accuracies than the existing K-nearest neighbors based methods for recognizing different gear crack levels under different motor speeds and loads. Based on the new significant statistical

  4. Nanoscale characterization and local piezoelectric properties of lead-free KNN-LT-LS thin films

    NASA Astrophysics Data System (ADS)

    Abazari, M.; Choi, T.; Cheong, S.-W.; Safari, A.

    2010-01-01

    We report the observation of domain structure and piezoelectric properties of pure and Mn-doped (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.1,Sb0.06)O3 (KNN-LT-LS) thin films on SrTiO3 substrates. Piezoresponse force microscopy reveals that the ferroelectric domain structure in such 500 nm thin films is comprised primarily of 180° domains. This is in accordance with the tetragonal structure of the films, confirmed by relative permittivity measurements and x-ray diffraction patterns. The effective piezoelectric coefficient (d33) of the films was calculated using piezoelectric displacement curves and shown to be ~53 pm V-1 for pure KNN-LT-LS thin films. This value is among the highest values reported for an epitaxial lead-free thin film and shows a great potential for KNN-LT-LS to serve as an alternative to PZT thin films in future applications.

  5. Mapping growing stock volume and forest live biomass: a case study of the Polissya region of Ukraine

    NASA Astrophysics Data System (ADS)

    Bilous, Andrii; Myroniuk, Viktor; Holiaka, Dmytrii; Bilous, Svitlana; See, Linda; Schepaschenko, Dmitry

    2017-10-01

    Forest inventory and biomass mapping are important tasks that require inputs from multiple data sources. In this paper we implement two methods for the Ukrainian region of Polissya: random forest (RF) for tree species prediction and k-nearest neighbors (k-NN) for growing stock volume and biomass mapping. We examined the suitability of the five-band RapidEye satellite image to predict the distribution of six tree species. The accuracy of RF is quite high: ~99% for forest/non-forest mask and 89% for tree species prediction. Our results demonstrate that inclusion of elevation as a predictor variable in the RF model improved the performance of tree species classification. We evaluated different distance metrics for the k-NN method, including Euclidean or Mahalanobis distance, most similar neighbor (MSN), gradient nearest neighbor, and independent component analysis. The MSN with the four nearest neighbors (k = 4) is the most precise (according to the root-mean-square deviation) for predicting forest attributes across the study area. The k-NN method allowed us to estimate growing stock volume with an accuracy of 3 m3 ha-1 and for live biomass of about 2 t ha-1 over the study area.

  6. Three Dimensional Object Recognition Using a Complex Autoregressive Model

    DTIC Science & Technology

    1993-12-01

    3.4.2 Template Matching Algorithm ...................... 3-16 3.4.3 K-Nearest-Neighbor ( KNN ) Techniques ................. 3-25 3.4.4 Hidden Markov Model...Neighbor ( KNN ) Test Results ...................... 4-13 4.2.1 Single-Look 1-NN Testing .......................... 4-14 4.2.2 Multiple-Look 1-NN Testing...4-15 4.2.3 Discussion of KNN Test Results ...................... 4-15 4.3 Hidden Markov Model (HMM) Test Results

  7. Missing portion sizes in FFQ--alternatives to use of standard portions.

    PubMed

    Køster-Rasmussen, Rasmus; Siersma, Volkert; Halldorsson, Thorhallur I; de Fine Olivarius, Niels; Henriksen, Jan E; Heitmann, Berit L

    2015-08-01

    Standard portions or substitution of missing portion sizes with medians may generate bias when quantifying the dietary intake from FFQ. The present study compared four different methods to include portion sizes in FFQ. We evaluated three stochastic methods for imputation of portion sizes based on information about anthropometry, sex, physical activity and age. Energy intakes computed with standard portion sizes, defined as sex-specific medians (median), or with portion sizes estimated with multinomial logistic regression (MLR), 'comparable categories' (Coca) or k-nearest neighbours (KNN) were compared with a reference based on self-reported portion sizes (quantified by a photographic food atlas embedded in the FFQ). The Danish Health Examination Survey 2007-2008. The study included 3728 adults with complete portion size data. Compared with the reference, the root-mean-square errors of the mean daily total energy intake (in kJ) computed with portion sizes estimated by the four methods were (men; women): median (1118; 1061), MLR (1060; 1051), Coca (1230; 1146), KNN (1281; 1181). The equivalent biases (mean error) were (in kJ): median (579; 469), MLR (248; 178), Coca (234; 188), KNN (-340; 218). The methods MLR and Coca provided the best agreement with the reference. The stochastic methods allowed for estimation of meaningful portion sizes by conditioning on information about physiology and they were suitable for multiple imputation. We propose to use MLR or Coca to substitute missing portion size values or when portion sizes need to be included in FFQ without portion size data.
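
    The two agreement measures quoted in this record, bias (mean error) and root-mean-square error, can be illustrated with a minimal sketch. The function name and the toy numbers below are hypothetical and are not taken from the study; the sketch only shows how the two quantities are conventionally computed from paired estimates and reference values.

```python
import math

def bias_and_rmse(estimates, reference):
    """Return (mean error, root-mean-square error) of estimates against a reference."""
    if len(estimates) != len(reference) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    # Signed error of each estimate relative to its reference value.
    errors = [e - r for e, r in zip(estimates, reference)]
    bias = sum(errors) / len(errors)
    rmse = math.sqrt(sum(err ** 2 for err in errors) / len(errors))
    return bias, rmse

# Hypothetical energy intakes in kJ (illustrative values only).
estimated = [110.0, 90.0]
reported = [100.0, 100.0]
bias, rmse = bias_and_rmse(estimated, reported)
```

    In this toy case the errors cancel in the bias but not in the RMSE, which is why a method can show a small bias and still a large RMSE.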

  8. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zhang Yanxia; Ma He; Peng Nanbo

    We apply one of the lazy learning methods, the k-nearest neighbor (kNN) algorithm, to estimate the photometric redshifts of quasars based on various data sets from the Sloan Digital Sky Survey (SDSS), the UKIRT Infrared Deep Sky Survey (UKIDSS), and the Wide-field Infrared Survey Explorer (WISE; the SDSS sample, the SDSS-UKIDSS sample, the SDSS-WISE sample, and the SDSS-UKIDSS-WISE sample). The influence of the k value and different input patterns on the performance of kNN is discussed. kNN performs best when k is different with a special input pattern for a special data set. The best result belongs to the SDSS-UKIDSS-WISE sample. The experimental results generally show that the more information from more bands, the better performance of photometric redshift estimation with kNN. The results also demonstrate that kNN using multiband data can effectively solve the catastrophic failure of photometric redshift estimation, which is met by many machine learning methods. Compared with the performance of various other methods of estimating the photometric redshifts of quasars, kNN based on KD-Tree shows superiority, exhibiting the best accuracy.

  9. Centre-based restricted nearest feature plane with angle classifier for face recognition

    NASA Astrophysics Data System (ADS)

    Tang, Linlin; Lu, Huifen; Zhao, Liang; Li, Zuohua

    2017-10-01

    An improved classifier based on the nearest feature plane (NFP), called the centre-based restricted nearest feature plane with the angle (RNFPA) classifier, is proposed here for face recognition problems. The well-known NFP uses the geometrical information of samples to increase the number of training samples, but it increases the computation complexity and it also has an inaccuracy problem caused by the extended feature plane. To solve the above problems, RNFPA exploits a centre-based feature plane and utilizes a threshold of angle to restrict extended feature space. By choosing the appropriate angle threshold, RNFPA can improve the performance and decrease computation complexity. Experiments on the AT&T face database, AR face database and FERET face database are used to evaluate the proposed classifier. Compared with the original NFP classifier, the nearest feature line (NFL) classifier, the nearest neighbour (NN) classifier and some other improved NFP classifiers, the proposed one achieves competitive performance.

  10. Mapping aboveground woody biomass using forest inventory, remote sensing and geostatistical techniques.

    PubMed

    Yadav, Bechu K V; Nandy, S

    2015-05-01

    Mapping forest biomass is fundamental for estimating CO₂ emissions, and planning and monitoring of forests and ecosystem productivity. The present study attempted to map aboveground woody biomass (AGWB) integrating forest inventory, remote sensing and geostatistical techniques, viz., direct radiometric relationships (DRR), k-nearest neighbours (k-NN) and cokriging (CoK) and to evaluate their accuracy. A part of the Timli Forest Range of Kalsi Soil and Water Conservation Division, Uttarakhand, India was selected for the present study. Stratified random sampling was used to collect biophysical data from 36 sample plots of 0.1 ha (31.62 m × 31.62 m) size. Species-specific volumetric equations were used for calculating volume and multiplied by specific gravity to get biomass. Three forest-type density classes, viz. 10-40, 40-70 and >70% of Shorea robusta forest and four non-forest classes were delineated using on-screen visual interpretation of IRS P6 LISS-III data of December 2012. The volume in different strata of forest-type density ranged from 189.84 to 484.36 m³ ha⁻¹. The total growing stock of the forest was found to be 2,024,652.88 m³. The AGWB ranged from 143 to 421 Mg ha⁻¹. Spectral bands and vegetation indices were used as independent variables and biomass as dependent variable for DRR, k-NN and CoK. After validation and comparison, the k-NN method with Mahalanobis distance (root mean square error (RMSE) = 42.25 Mg ha⁻¹) was found to be the best method, followed by fuzzy distance and Euclidean distance with RMSE of 44.23 and 45.13 Mg ha⁻¹ respectively. DRR was found to be the least accurate method with RMSE of 67.17 Mg ha⁻¹. The study highlighted the potential of integrating forest inventory, remote sensing and geostatistical techniques for forest biomass mapping.

  11. Nearest Neighbor Classification of Stationary Time Series: An Application to Anesthesia Level Classification by EEG Analysis.

    DTIC Science & Technology

    1980-12-05

    classification procedures that are common in speech processing. The anesthesia level classification by EEG time series population screening problem example is in...formance. The use of the KL number type metric in NN rule classification, in a delete-one subj ect ’s EE-at-a-time KL-NN and KL- kNN classification of the...17 individual labeled EEG sample population using KL-NN and KL- kNN rules. The results obtained are shown in Table 1. The entries in the table indicate

  12. Heterogeneous Multi-Metric Learning for Multi-Sensor Fusion

    DTIC Science & Technology

    2011-07-01

    distance”. One of the most widely used methods is the k-nearest neighbor ( KNN ) method [4], which labels an input data sample to be the class with majority...despite of its simplicity, it can be an effective candidate and can be easily extended to handle multiple sensors. Distance based method such as KNN relies...Neighbor (LMNN) method [21] which will be briefly reviewed in the sequel. LMNN method tries to learn an optimal metric specifically for KNN classifier. The

  13. Forecasting municipal solid waste generation using artificial intelligence modelling approaches.

    PubMed

    Abbasi, Maryam; El Hanandeh, Ali

    2016-10-01

    Municipal solid waste (MSW) management is a major concern to local governments to protect human health, the environment and to preserve natural resources. The design and operation of an effective MSW management system requires accurate estimation of future waste generation quantities. The main objective of this study was to develop a model for accurate forecasting of MSW generation that helps waste related organizations to better design and operate effective MSW management systems. Four intelligent system algorithms including support vector machine (SVM), adaptive neuro-fuzzy inference system (ANFIS), artificial neural network (ANN) and k-nearest neighbours (kNN) were tested for their ability to predict monthly waste generation in the Logan City Council region in Queensland, Australia. Results showed artificial intelligence models have good prediction performance and could be successfully applied to establish municipal solid waste forecasting models. Machine learning algorithms can reliably predict monthly MSW generation when trained with waste generation time series. In addition, results suggest that the ANFIS system produced the most accurate forecasts of the peaks while kNN was successful in predicting the monthly averages of waste quantities. Based on the results, the total annual MSW generated in Logan City will reach 9.4×10⁷ kg by 2020 while the peak monthly waste will reach 9.37×10⁶ kg.

  14. Detecting the Difficulty Level of Foreign Language Texts

    DTIC Science & Technology

    2010-02-01

    continuous tenses), as well as part- of- speech labels for words. The authors used a k-Nearest Neighbor ( kNN ) classifier (Cover and Hart, 1967; Mitchell, 1997...anticipate, and influence these situations and to operate in them is found in foreign language speech and text. For this reason, military linguists are...the language model system, LGR is the prediction of one of the grammar-based classifiers, and CkNN is a confidence value of the kNN prediction for the

  15. An automated algorithm for determining photometric redshifts of quasars

    NASA Astrophysics Data System (ADS)

    Wang, Dan; Zhang, Yanxia; Zhao, Yongheng

    2010-07-01

    We employ the k-nearest neighbor algorithm (KNN) for photometric redshift measurement of quasars with the Fifth Data Release (DR5) of the Sloan Digital Sky Survey (SDSS). KNN is an instance-based learning algorithm in which the result of a new instance query is predicted from the closest training samples. The regressor does not fit any model and is based only on memory. Given a query quasar, we find the known quasars (training points) closest to the query point, and its redshift is simply assigned the average of the values of its k nearest neighbors. Three different kinds of colors (PSF, Model or Fiber) and spectral redshifts are used as input parameters, separately. The combination of the three kinds of colors is also taken as input. The experimental results indicate that the best input pattern is PSF + Model + Fiber colors in all experiments. With this pattern, 59.24%, 77.34% and 84.68% of photometric redshifts are obtained within Δz < 0.1, 0.2 and 0.3, respectively. If only one kind of color is used as input, the model colors achieve the best performance. However, when using two kinds of colors, the best result is achieved by PSF + Fiber colors. In addition, the nearest neighbor method (k = 1) shows its superiority compared to KNN (k ≠ 1) for the given sample.
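
    The prediction rule described in this record (average the known values of the k closest training points) can be sketched as follows. This is a minimal brute-force illustration, assuming Euclidean distance in colour space; the function name and the toy data are hypothetical and do not reproduce the authors' pipeline or the SDSS data.

```python
import math

def knn_regress(train_x, train_y, query, k):
    """Predict a value for `query` as the mean target of its k nearest training points."""
    if not 1 <= k <= len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Brute-force Euclidean distance from the query to every training point.
    dists = [(math.dist(x, query), y) for x, y in zip(train_x, train_y)]
    dists.sort(key=lambda pair: pair[0])
    nearest = dists[:k]
    # The prediction is the plain (unweighted) average of the neighbours' targets.
    return sum(y for _, y in nearest) / k

# Hypothetical toy data: two colour indices per object and a known redshift.
colours = [(0.1, 0.2), (0.2, 0.1), (0.9, 1.0), (1.0, 0.9)]
redshifts = [0.5, 0.7, 2.0, 2.2]

estimate_k2 = knn_regress(colours, redshifts, (0.15, 0.15), k=2)
estimate_k1 = knn_regress(colours, redshifts, (1.0, 0.9), k=1)
```

    Setting k = 1 reduces the rule to copying the value of the single closest training point, which is the nearest neighbor case the abstract compares against.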

  16. Performance-driven Multimodality Sensor Fusion

    DTIC Science & Technology

    2012-01-23

    in IEEE Intl Conf. on Acoust., Speech , Signal Processing, (Dallas), Mar. 2010. [10] K. Sricharan, R. Raich, and A. Hero III, “Boundary compensated knn ...nearest neighbor ( kNN ) plug-in estima- tors, we have developed a generally applicable theory that gives analytical closed-form expressions for asymptotic...Co-PI’s Raich and Hero and was published in the IEEE Proc. of 2011 Intl Conf. on Acoustics, Speech , and Signal Processing. 2.4 Dimension estimation in

  17. Influences of the third and fourth nearest neighbouring interactions on the surface anisotropy of face-centred-cubic metals

    NASA Astrophysics Data System (ADS)

    Luo, Yongkun; Qin, Rongshan

    2014-06-01

    The structure and the anisotropic properties of the surfaces of face-centred-cubic (FCC) metals have been studied using the broken-bond model while considering the third and fourth nearest neighbouring (3rd and 4th NN) interactions. The pair potential expressions are obtained using the Rose-Vinet universal potential equation. The model is suitable for calculation of the property of a surface with arbitrary crystallographic orientations and can provide absolute unrelaxed surface energy values using three input parameters, namely the lattice constant, bulk modulus and cohesive energy. These parameters are available for the majority of FCC metals. The numerical results for 7 FCC metals have been obtained and compared with those obtained from ab initio calculations and experimental measurements. Good agreement is observed between the two. Taking into account up to the 4th NN interactions, the overall surface energy anisotropy for FCC metals was found to be between 12% and 16%, and the ratio between the surface energies at (100) and (111) planes was found to be 1.05. These values are less than those reported by conventional calculations but more similar to experimental measurements. It is found that the strength of 3rd and 4th NN interactions differs from one element to another, the Ni and Cu interactions being the most significant while the Au, Pt and Pb interactions are the least significant. This suggests that the polar diagrams of the surface energy of Ni and Cu are different from those of Au, Pt and Pb by showing cusps of the unconventional {110} and high-index {210}, {311} and possibly {135} poles. This provides explanations to the recent experimental observations of the {110}, {210}, {311} and {135} facets in equilibrated Ni and Cu crystallines.

  18. A Machine Learning Approach to Pedestrian Detection for Autonomous Vehicles Using High-Definition 3D Range Data

    PubMed Central

    Navarro, Pedro J.; Fernández, Carlos; Borraz, Raúl; Alonso, Diego

    2016-01-01

    This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work reports an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured in a real traffic scene, which contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%). PMID:28025565

  19. A Machine Learning Approach to Pedestrian Detection for Autonomous Vehicles Using High-Definition 3D Range Data.

    PubMed

    Navarro, Pedro J; Fernández, Carlos; Borraz, Raúl; Alonso, Diego

    2016-12-23

    This article describes an automated sensor-based system to detect pedestrians in an autonomous vehicle application. Although the vehicle is equipped with a broad set of sensors, the article focuses on the processing of the information generated by a Velodyne HDL-64E LIDAR sensor. The cloud of points generated by the sensor (more than 1 million points per revolution) is processed to detect pedestrians, by selecting cubic shapes and applying machine vision and machine learning algorithms to the XY, XZ, and YZ projections of the points contained in the cube. The work reports an exhaustive analysis of the performance of three different machine learning algorithms: k-Nearest Neighbours (kNN), Naïve Bayes classifier (NBC), and Support Vector Machine (SVM). These algorithms have been trained with 1931 samples. The final performance of the method, measured in a real traffic scene, which contained 16 pedestrians and 469 samples of non-pedestrians, shows sensitivity (81.2%), accuracy (96.2%) and specificity (96.8%).
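
    The three figures of merit quoted in this record follow the usual confusion-matrix definitions, sketched below. The function name and the counts are hypothetical and chosen only for illustration; the record does not give the underlying true/false positive counts, so none are assumed here.

```python
def classification_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    # Sensitivity: fraction of actual positives (e.g. pedestrians) that were detected.
    sensitivity = tp / (tp + fn)
    # Specificity: fraction of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp)
    # Accuracy: fraction of all samples classified correctly.
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts: 10 positives (8 found), 90 negatives (85 rejected).
sens, spec, acc = classification_rates(tp=8, fn=2, tn=85, fp=5)
```

    With strongly imbalanced classes, as in the scene described above, accuracy is dominated by the majority class, which is why sensitivity and specificity are reported alongside it.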

  20. Integrated Sensing and Processing (ISP) Phase II: Demonstration and Evaluation for Distributed Sensor Networks and Missile Seeker Systems

    DTIC Science & Technology

    2007-02-28

    Shah, D. Waagen, H. Schmitt, S. Bellofiore, A. Spanias, and D. Cochran, 32nd International Conference on Acoustics, Speech , and Signal Processing...Information Exploitation Office kNN k-Nearest Neighbor LEAN Laplacian Eigenmap Adaptive Neighbor LIP Linear Integer Programming ISP

  1. Diagnosis of diabetes diseases using an Artificial Immune Recognition System2 (AIRS2) with fuzzy K-nearest neighbor.

    PubMed

    Chikh, Mohamed Amine; Saidi, Meryem; Settouti, Nesma

    2012-10-01

    The use of expert systems and artificial intelligence techniques in disease diagnosis has been increasing gradually. Artificial Immune Recognition System (AIRS) is one of the methods used in medical classification problems. AIRS2 is a more efficient version of the AIRS algorithm. In this paper, we used a modified AIRS2 called MAIRS2 where we replace the K-nearest neighbors algorithm with the fuzzy K-nearest neighbors to improve the diagnostic accuracy of diabetes diseases. The diabetes disease dataset used in our work is retrieved from the UCI machine learning repository. The performances of the AIRS2 and MAIRS2 are evaluated regarding classification accuracy, sensitivity and specificity values. The highest classification accuracies obtained when applying the AIRS2 and MAIRS2 using 10-fold cross-validation were 82.69% and 89.10%, respectively.

  2. Development of a Compton camera for prompt-gamma medical imaging

    NASA Astrophysics Data System (ADS)

    Aldawood, S.; Thirolf, P. G.; Miani, A.; Böhmer, M.; Dedes, G.; Gernhäuser, R.; Lang, C.; Liprandi, S.; Maier, L.; Marinšek, T.; Mayerhofer, M.; Schaart, D. R.; Lozano, I. Valencia; Parodi, K.

    2017-11-01

    A Compton camera-based detector system for photon detection from nuclear reactions induced by proton (or heavier ion) beams is under development at LMU Munich, targeting the online range verification of the particle beam in hadron therapy via prompt-gamma imaging. The detector is designed to be capable of reconstructing the photon source origin not only from the Compton scattering kinematics of the primary photon, but also to allow for tracking of the secondary Compton-scattered electrons, thus enabling a γ-source reconstruction also from incompletely absorbed photon events. The Compton camera consists of a monolithic LaBr3:Ce scintillation crystal, read out by a multi-anode PMT acting as absorber, preceded by a stacked array of 6 double-sided silicon strip detectors as scatterers. The detector components have been characterized both under offline and online conditions. The LaBr3:Ce crystal exhibits an excellent time and energy resolution. Using intense collimated 137Cs and 60Co sources, the monolithic scintillator was scanned on a fine 2D grid to generate a reference library of light amplitude distributions that allows for reconstructing the photon interaction position using a k-Nearest Neighbour (k-NN) algorithm. Systematic studies were performed to investigate the performance of the reconstruction algorithm, revealing an improvement of the spatial resolution with increasing photon energy to an optimum value of 3.7(1) mm at 1.33 MeV, achieved with the Categorical Average Pattern (CAP) modification of the k-NN algorithm.

  3. Predictive modelling for startup and investor relationship based on crowdfunding platform data

    NASA Astrophysics Data System (ADS)

    Alamsyah, Andry; Buono Asto Nugroho, Tri

    2018-03-01

    A crowdfunding platform is a place where startups publicly present their ideas in order to get their projects funded. Crowdfunding platforms such as Kickstarter are becoming popular today; they provide an efficient way for startups to get funded without liabilities, and they offer a variety of project categories in which to participate. A safety procedure is in place to ensure a low-risk environment: a startup's promoted project must reach its funding goal, and if it fails to reach the target, no investment activity takes place. This motivates startups to be more active in promoting or disseminating their project ideas, and it also protects investors from losing money. The objective of this study is to predict the success of proposed projects and to map investor trends using a data mining framework. To achieve this objective, we propose three models. The first model predicts whether a project will succeed or fail using K-Nearest Neighbour (KNN). The second model predicts the number of successful projects using an Artificial Neural Network (ANN). The third model maps the trend of investors in investing in projects using the K-Means clustering algorithm. KNN gives 99.04% model accuracy, the best ANN configuration has 16-14-1 neuron layers and a 0.2 learning rate, and K-Means yields 6 clusters as the best separation. The results of these models can help startups or investors make decisions regarding startup investment.

  4. The effect of K and Na excess on the ferroelectric and piezoelectric properties of K0.5Na0.5NbO3 thin films

    NASA Astrophysics Data System (ADS)

    Ahn, C. W.; Y Lee, S.; Lee, H. J.; Ullah, A.; Bae, J. S.; Jeong, E. D.; Choi, J. S.; Park, B. H.; Kim, I. W.

    2009-11-01

    We have fabricated K0.5Na0.5NbO3 (KNN) thin films on Pt substrates by a chemical solution deposition method and investigated the effect of K and Na excess (0-30 mol%) on ferroelectric and piezoelectric properties of KNN thin film. It was found that with increasing K and Na excess in a precursor solution from 0 to 30 mol%, the leakage current and ferroelectric properties were strongly affected. KNN thin film synthesized by using 20 mol% K and Na excess precursor solution exhibited a low leakage current density and well saturated ferroelectric P-E hysteresis loops. Moreover, the optimized KNN thin film had good fatigue resistance and a piezoelectric constant of 40 pm V-1, which is comparable to that of polycrystalline PZT thin films.

  5. Prediction of regulatory gene pairs using dynamic time warping and gene ontology.

    PubMed

    Yang, Andy C; Hsu, Hui-Huang; Lu, Ming-Da; Tseng, Vincent S; Shih, Timothy K

    2014-01-01

    Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
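
    Dynamic time warping, one of the building blocks named in this record, can be illustrated with the textbook dynamic-programming recurrence. This is a sketch of plain DTW with an absolute-difference local cost; it is not the authors' DTW-GO distance, whose combination with Gene Ontology is not specified in the abstract, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

# Two hypothetical expression profiles that differ only by a repeated time point.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([0, 0], [1, 1])
```

    Because a repeated time point can be absorbed by the warping path, the first pair is at distance zero even though the sequences have different lengths, which is the property that makes DTW attractive for time series sampled unevenly.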

  6. Charge exchange between two nearest neighbour ions immersed in a dense plasma

    NASA Astrophysics Data System (ADS)

    Sauvan, P.; Angelo, P.; Derfoul, H.; Leboucher-Dalimier, E.; Devdariani, A.; Calisti, A.; Talin, B.

    1999-04-01

    In dense plasmas the quasimolecular model is relevant to describe the radiative properties: two nearest neighbor ions remain close to each other during a time scale of the order of the emission time. Within the frame of a quasistatic approach it has been shown that hydrogen-like spectral line shapes can exhibit satellite-like features. In this work we present the effect on the line shapes of the dynamical collision between the two ions exchanging transiently their bound electron. This model is suitable for the description of the core, the wings and the red satellite-like features. It is post-processed to the self consistent code (IDEFIX) giving the adiabatic transition energies and the oscillator strengths for the transient molecule immersed in a dense free electron bath. It is shown that the positions of the satellites are insensitive to the dynamics of the ion-ion collision. Results for fluorine Lyβ are presented.

  7. Improving cluster-based missing value estimation of DNA microarray data.

    PubMed

    Brás, Lígia P; Menezes, José C

    2007-06-01

    We present a modification of the weighted K-nearest neighbours imputation method (KNNimpute) for missing values (MVs) estimation in microarray data based on the reuse of estimated data. The method was called iterative KNN imputation (IKNNimpute) as the estimation is performed iteratively using the recently estimated values. The estimation efficiency of IKNNimpute was assessed under different conditions (data type, fraction and structure of missing data) by the normalized root mean squared error (NRMSE) and the correlation coefficients between estimated and true values, and compared with that of other cluster-based estimation methods (KNNimpute and sequential KNN). We further investigated the influence of imputation on the detection of differentially expressed genes using SAM by examining the differentially expressed genes that are lost after MV estimation. The performance measures give consistent results, indicating that the iterative procedure of IKNNimpute can enhance the prediction ability of cluster-based methods in the presence of high missing rates, in non-time series experiments and in data sets comprising both time series and non-time series data, because the information of the genes having MVs is used more efficiently and the iterative procedure allows refining the MV estimates. More importantly, IKNN has a smaller detrimental effect on the detection of differentially expressed genes.
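
    The idea of reusing estimated values across iterations can be sketched as below. This is a simplified illustration, not the published IKNNimpute algorithm: the row-mean starting guess, the inverse-distance weighting, the fixed iteration count and the function name are all assumptions of the sketch, and every row is assumed to have at least one observed value.

```python
import math

def iterative_knn_impute(rows, k=2, iterations=3):
    """Fill None entries in `rows` (equal-length lists) by repeated weighted KNN."""
    missing = [(i, j) for i, r in enumerate(rows) for j, v in enumerate(r) if v is None]
    # Starting guess (assumption of this sketch): mean of the row's observed values.
    filled = []
    for r in rows:
        observed = [v for v in r if v is not None]
        mean = sum(observed) / len(observed)
        filled.append([mean if v is None else v for v in r])
    for _ in range(iterations):
        updated = [r[:] for r in filled]
        for i, j in missing:
            # Distance to every other row over all columns except j,
            # computed on the current estimates (this is the reuse step).
            neighbours = []
            for other, r in enumerate(filled):
                if other == i:
                    continue
                d = math.sqrt(sum((filled[i][c] - r[c]) ** 2
                                  for c in range(len(r)) if c != j))
                neighbours.append((d, r[j]))
            neighbours.sort(key=lambda pair: pair[0])
            nearest = neighbours[:k]
            # Inverse-distance weights; the small constant avoids division by zero.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            updated[i][j] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
        filled = updated
    return filled

# Hypothetical expression matrix: rows are genes, None marks a missing value.
matrix = [[1.0, 2.0, 3.0], [1.0, 2.0, None], [10.0, 20.0, 30.0]]
completed = iterative_knn_impute(matrix, k=1, iterations=2)
```

    The input matrix is left unmodified; only the entries that were originally missing are re-estimated in each pass, while observed values are carried through unchanged.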

  8. Ecological interactions and the Netflix problem.

    PubMed

    Desjardins-Proulx, Philippe; Laigle, Idaline; Poisot, Timothée; Gravel, Dominique

    2017-01-01

    Species interactions are a key component of ecosystems but we generally have an incomplete picture of who-eats-who in a given community. Different techniques have been devised to predict species interactions using theoretical models or abundances. Here, we explore the K nearest neighbour approach, with a special emphasis on recommendation, along with a supervised machine learning technique. Recommenders are algorithms developed for companies like Netflix to predict whether a customer will like a product given the preferences of similar customers. These machine learning techniques are well-suited to study binary ecological interactions since they focus on positive-only data. By removing a prey from a predator, we find that recommenders can guess the missing prey around 50% of the time on the first try, with up to 881 possibilities. Traits do not significantly improve the results for the K nearest neighbour, although a simple test with a supervised learning approach (random forests) shows we can predict interactions with high accuracy using only three traits per species. This result shows that binary interactions can be predicted without regard to the ecological community given only three variables: body mass and two variables for the species' phylogeny. These techniques are complementary, as recommenders can predict interactions in the absence of traits, using only information about other species' interactions, while supervised learning algorithms such as random forests base their predictions on traits only but do not exploit other species' interactions. Further work should focus on developing custom similarity measures specialized for ecology to improve the KNN algorithms and using richer data to capture indirect relationships between species.

  9. Ecological interactions and the Netflix problem

    PubMed Central

    Laigle, Idaline; Poisot, Timothée; Gravel, Dominique

    2017-01-01

    Species interactions are a key component of ecosystems but we generally have an incomplete picture of who-eats-who in a given community. Different techniques have been devised to predict species interactions using theoretical models or abundances. Here, we explore the K nearest neighbour approach, with a special emphasis on recommendation, along with a supervised machine learning technique. Recommenders are algorithms developed for companies like Netflix to predict whether a customer will like a product given the preferences of similar customers. These machine learning techniques are well-suited to study binary ecological interactions since they focus on positive-only data. By removing a prey from a predator, we find that recommenders can guess the missing prey around 50% of the time on the first try, with up to 881 possibilities. Traits do not significantly improve the results for the K nearest neighbour, although a simple test with a supervised learning approach (random forests) shows we can predict interactions with high accuracy using only three traits per species. This result shows that binary interactions can be predicted without regard to the ecological community given only three variables: body mass and two variables for the species’ phylogeny. These techniques are complementary, as recommenders can predict interactions in the absence of traits, using only information about other species’ interactions, while supervised learning algorithms such as random forests base their predictions on traits only but do not exploit other species’ interactions. Further work should focus on developing custom similarity measures specialized for ecology to improve the KNN algorithms and using richer data to capture indirect relationships between species. PMID:28828250

  10. Classification of speech dysfluencies using LPC based parameterization techniques.

    PubMed

    Hariharan, M; Chee, Lim Sin; Ai, Ooi Chia; Yaacob, Sazali

    2012-06-01

    The goal of this paper is to discuss and compare three feature extraction methods, Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC) and Weighted Linear Prediction Cepstral Coefficients (WLPCC), for recognizing stuttered events. Speech samples from the University College London Archive of Stuttered Speech (UCLASS) were used for our analysis. The stuttered events were identified through manual segmentation and were used for feature extraction. Two simple classifiers, k-nearest neighbour (kNN) and Linear Discriminant Analysis (LDA), were employed for speech dysfluency classification. A conventional validation method was used for testing the reliability of the classifier results. The effects of different frame lengths, percentages of overlap, values of α in a first-order pre-emphasizer and different orders p were discussed. The speech dysfluency classification accuracy was found to improve when statistical normalization was applied before feature extraction. The experimental investigation showed that LPC, LPCC and WLPCC features can be used for identifying stuttered events and that WLPCC features slightly outperform LPCC and LPC features.

  11. Scalable Nearest Neighbor Algorithms for High Dimensional Data.

    PubMed

    Muja, Marius; Lowe, David G

    2014-11-01

    For many computer vision and machine learning problems, large training sets are key for good performance. However, the most computationally expensive part of many computer vision and machine learning algorithms consists of finding nearest neighbor matches to high dimensional vectors that represent the training data. We propose new algorithms for approximate nearest neighbor matching and evaluate and compare them with previous algorithms. For matching high dimensional features, we find two algorithms to be the most efficient: the randomized k-d forest and a new algorithm proposed in this paper, the priority search k-means tree. We also propose a new algorithm for matching binary features by searching multiple hierarchical clustering trees and show it outperforms methods typically used in the literature. We show that the optimal nearest neighbor algorithm and its parameters depend on the data set characteristics and describe an automated configuration procedure for finding the best algorithm to search a particular data set. In order to scale to very large data sets that would otherwise not fit in the memory of a single machine, we propose a distributed nearest neighbor matching framework that can be used with any of the algorithms described in the paper. All this research has been released as an open source library called fast library for approximate nearest neighbors (FLANN), which has been incorporated into OpenCV and is now one of the most popular libraries for nearest neighbor matching.

  12. Machine learning search for variable stars

    NASA Astrophysics Data System (ADS)

    Pashchenko, Ilya N.; Sokolovsky, Kirill V.; Gavras, Panagiotis

    2018-04-01

    Photometric variability detection is often considered as a hypothesis testing problem: an object is variable if the null hypothesis that its brightness is constant can be ruled out given the measurements and their uncertainties. The practical applicability of this approach is limited by uncorrected systematic errors. We propose a new variability detection technique sensitive to a wide range of variability types while being robust to outliers and underestimated measurement uncertainties. We consider variability detection as a classification problem that can be approached with machine learning. Logistic Regression (LR), Support Vector Machines (SVM), k Nearest Neighbours (kNN), Neural Nets (NN), Random Forests (RF), and Stochastic Gradient Boosting classifier (SGB) are applied to 18 features (variability indices) quantifying scatter and/or correlation between points in a light curve. We use a subset of Optical Gravitational Lensing Experiment phase two (OGLE-II) Large Magellanic Cloud (LMC) photometry (30 265 light curves) that was searched for variability using traditional methods (168 known variable objects) as the training set and then apply the NN to a new test set of 31 798 OGLE-II LMC light curves. Among 205 candidates selected in the test set, 178 are real variables, while 13 low-amplitude variables are new discoveries. The machine learning classifiers considered are found to be more efficient (select more variables and fewer false candidates) compared to traditional techniques using individual variability indices or their linear combination. The NN, SGB, SVM, and RF show a higher efficiency compared to LR and kNN.

  13. Rule groupings in expert systems using nearest neighbour decision rules, and convex hulls

    NASA Technical Reports Server (NTRS)

    Anastasiadis, Stergios

    1991-01-01

    Expert System shells are lacking in many areas of software engineering. Large rule based systems are not semantically comprehensible, difficult to debug, and impossible to modify or validate. Partitioning a set of rules found in CLIPS (C Language Integrated Production System) into groups of rules which reflect the underlying semantic subdomains of the problem, will address adequately the concerns stated above. Techniques are introduced to structure a CLIPS rule base into groups of rules that inherently have common semantic information. The concepts involved are imported from the field of A.I., Pattern Recognition, and Statistical Inference. Techniques focus on the areas of feature selection, classification, and a criteria of how 'good' the classification technique is, based on Bayesian Decision Theory. A variety of distance metrics are discussed for measuring the 'closeness' of CLIPS rules and various Nearest Neighbor classification algorithms are described based on the above metric.

  14. Fast Query-Optimized Kernel-Machine Classification

    NASA Technical Reports Server (NTRS)

    Mazzoni, Dominic; DeCoste, Dennis

    2004-01-01

    A recently developed algorithm performs kernel-machine classification via incremental approximate nearest support vectors. The algorithm implements support-vector machines (SVMs) at speeds 10 to 100 times those attainable by use of conventional SVM algorithms. The algorithm offers potential benefits for classification of images, recognition of speech, recognition of handwriting, and diverse other applications in which there are requirements to discern patterns in large sets of data. SVMs constitute a subset of kernel machines (KMs), which have become popular as models for machine learning and, more specifically, for automated classification of input data on the basis of labeled training data. While similar in many ways to k-nearest-neighbors (k-NN) models and artificial neural networks (ANNs), SVMs tend to be more accurate. Using representations that scale only linearly in the numbers of training examples, while exploring nonlinear (kernelized) feature spaces that are exponentially larger than the original input dimensionality, KMs elegantly and practically overcome the classic curse of dimensionality. However, the price that one must pay for the power of KMs is that query-time complexity scales linearly with the number of training examples, making KMs often orders of magnitude more computationally expensive than are ANNs, decision trees, and other popular machine learning alternatives. The present algorithm treats an SVM classifier as a special form of a k-NN. The algorithm is based partly on an empirical observation that one can often achieve the same classification as that of an exact KM by using only small fraction of the nearest support vectors (SVs) of a query. The exact KM output is a weighted sum over the kernel values between the query and the SVs. In this algorithm, the KM output is approximated with a k-NN classifier, the output of which is a weighted sum only over the kernel values involving k selected SVs. Before query time, there are gathered

  15. Effects of doping on ferroelectric properties and leakage current behavior of KNN-LT-LS thin films on SrTiO3 substrate

    NASA Astrophysics Data System (ADS)

    Abazari, M.; Safari, A.

    2009-05-01

    We report the effects of Ba, Ti, and Mn dopants on ferroelectric polarization and leakage current of (K0.44Na0.52Li0.04)(Nb0.84Ta0.1Sb0.06)O3 (KNN-LT-LS) thin films deposited by pulsed laser deposition. It is shown that donor dopants such as Ba2+, which increased the resistivity in bulk KNN-LT-LS, had an opposite effect in the thin film. Ti4+ as an acceptor B-site dopant reduces the leakage current by an order of magnitude, while the polarization values showed a slight degradation. Mn4+, however, was found to effectively suppress the leakage current by over two orders of magnitude while enhancing the polarization, with 15 and 23 μC/cm2 remanent and saturated polarization, whose values are ˜70% and 82% of the reported values for bulk composition. This phenomenon has been associated with the dual effect of Mn4+ in KNN-LT-LS thin film, by substituting both A- and B-site cations. A detailed description on how each dopant affects the concentrations of vacancies in the lattice is presented. Mn-doped KNN-LT-LS thin films are shown to be a promising candidate for lead-free thin films and applications.

  16. Iris Recognition Using Feature Extraction of Box Counting Fractal Dimension

    NASA Astrophysics Data System (ADS)

    Khotimah, C.; Juniati, D.

    2018-01-01

    Biometrics is a rapidly growing science. Iris recognition is a biometric modality that captures a photo of the eye pattern. The markings of the iris are so distinctive that they have been proposed as a means of identification instead of fingerprints. Iris recognition was chosen for identification in this research because the iris differs between individuals and is protected by the cornea, so its shape remains fixed. Iris recognition consists of three steps: pre-processing of data, feature extraction, and feature matching. In pre-processing, the Hough transformation is used to locate the iris area and Daugman’s rubber sheet model is used to normalize the iris data set into rectangular blocks. To characterize the iris, the box counting method was used to obtain its fractal dimension. Tests were carried out using k-fold cross-validation with k = 5. Each test used 10 different values of K for K-Nearest Neighbor (KNN). The best iris recognition accuracy, 92.63%, was obtained with K = 3.
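
    The box counting feature mentioned in this abstract can be illustrated with a short sketch. It assumes the image has already been reduced to a set of foreground pixel coordinates and estimates the dimension as the least-squares slope of log N(s) against log(1/s); the function names and box sizes are hypothetical choices, not those of the paper.

```python
import math

def box_count(pixels, size):
    """Number of size-by-size boxes that contain at least one foreground pixel."""
    return len({(x // size, y // size) for x, y in pixels})

def box_counting_dimension(pixels, sizes=(1, 2, 4, 8)):
    """Least-squares slope of log N(s) versus log(1/s) over the given box sizes."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Sanity checks on shapes with known dimension (toy data, not iris images).
filled_square = {(x, y) for x in range(16) for y in range(16)}
straight_line = {(x, 0) for x in range(16)}
dim_square = box_counting_dimension(filled_square)
dim_line = box_counting_dimension(straight_line)
print(dim_square, dim_line)
```

    A filled square and a straight line give slopes close to 2 and 1 respectively; a textured pattern such as a normalized iris block would fall in between, and that value is what would be passed to the KNN matcher.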

  17. On the classification techniques in data mining for microarray data classification

    NASA Astrophysics Data System (ADS)

    Aydadenta, Husna; Adiwijaya

    2018-03-01

    Cancer is one of the deadliest diseases: according to WHO data, 8.8 million deaths were caused by cancer in 2015, and this number will increase every year if the problem is not addressed earlier. Microarray data have become one of the most popular subjects of cancer-identification studies in the health field, since they can be used to examine gene expression levels in particular cell samples and so analyze thousands of genes simultaneously. Using data mining techniques, we can classify microarray samples and thereby identify whether or not they are cancerous. In this paper we discuss research that applies data mining techniques to microarray data, such as Support Vector Machine (SVM), Artificial Neural Network (ANN), Naive Bayes, k-Nearest Neighbor (kNN), and C4.5, together with a simulation of the Random Forest algorithm with dimensionality reduction using Relief. The paper reports the performance (accuracy) of the classification algorithms (SVM, ANN, Naive Bayes, kNN, C4.5, and Random Forest). The results show that the accuracy of the Random Forest algorithm is higher than that of the other classification algorithms (SVM, ANN, Naive Bayes, kNN, and C4.5). It is hoped that this paper can provide some information about the speed, accuracy, performance and computational cost of each data mining classification technique on microarray data.

  18. Comparative Analysis of Document level Text Classification Algorithms using R

    NASA Astrophysics Data System (ADS)

    Syamala, Maganti; Nalini, N. J., Dr; Maguluri, Lakshamanaphaneendra; Ragupathy, R., Dr.

    2017-08-01

    Over the past few decades, tremendous volumes of data have become available on the Internet in structured or unstructured form. Information on the Internet is also growing exponentially, so there is an emerging need for text classifiers. Text mining is an interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics and computational linguistics. To handle this situation, a wide range of supervised learning algorithms has been introduced. Among these, K-Nearest Neighbor (KNN) is an efficient and the simplest classifier in the text classification family. However, KNN suffers from imbalanced class distributions and noisy term features. To cope with this challenge, we use document-based centroid dimensionality reduction (CentroidDR) in R. By combining these two text classification techniques, the KNN and Centroid classifiers, we propose a scalable and effective flat classifier, called MCenKNN, which works substantially better than CenKNN.

  19. Landscape-scale parameterization of a tree-level forest growth model: a k-nearest neighbor imputation approach incorporating LiDAR data

    Treesearch

    Michael J. Falkowski; Andrew T. Hudak; Nicholas L. Crookston; Paul E. Gessler; Edward H. Uebler; Alistair M. S. Smith

    2010-01-01

    Sustainable forest management requires timely, detailed forest inventory data across large areas, which is difficult to obtain via traditional forest inventory techniques. This study evaluated k-nearest neighbor imputation models incorporating LiDAR data to predict tree-level inventory data (individual tree height, diameter at breast height, and...

  20. Evaluation of Short-Term Cepstral Based Features for Detection of Parkinson’s Disease Severity Levels through Speech signals

    NASA Astrophysics Data System (ADS)

    Oung, Qi Wei; Nisha Basah, Shafriza; Muthusamy, Hariharan; Vijean, Vikneswaran; Lee, Hoileong

    2018-03-01

    Parkinson’s disease (PD) is a progressive neurodegenerative disease, known as a motor system syndrome, that is due to the death of dopamine-generating cells in a region of the human midbrain. PD normally affects people over 60 years of age and at present affects a large part of the worldwide population. Lately, many researchers have shown interest in the connection between PD and speech disorders. Research has revealed that speech signals may be a suitable biomarker for distinguishing people with Parkinson’s (PWP) from healthy subjects. Therefore, early diagnosis of PD through speech signals can be considered for this aim. In this research, speech data are acquired based on speech behaviour as the biomarker for differentiating PD severity levels (mild and moderate) from healthy subjects. The feature extraction algorithms applied are Mel Frequency Cepstral Coefficients (MFCC), Linear Predictive Coefficients (LPC), Linear Prediction Cepstral Coefficients (LPCC), and Weighted Linear Prediction Cepstral Coefficients (WLPCC). For classification, two types of classifiers are used: k-Nearest Neighbour (KNN) and Probabilistic Neural Network (PNN). The experimental results demonstrated that the PNN and KNN classifiers achieve the best average classification performance of 92.63% and 88.56%, respectively, under 10-fold cross-validation. The suggested techniques have the potential to become promising new tools for PD detection with high performance.

  1. Using K-Nearest Neighbor Classification to Diagnose Abnormal Lung Sounds

    PubMed Central

    Chen, Chin-Hsing; Huang, Wen-Tzeng; Tan, Tan-Hsu; Chang, Cheng-Chun; Chang, Yuan-Jen

    2015-01-01

    A reported 30% of people worldwide have abnormal lung sounds, including crackles, rhonchi, and wheezes. To date, the traditional stethoscope remains the most popular tool used by physicians to diagnose such abnormal lung sounds, however, many problems arise with the use of a stethoscope, including the effects of environmental noise, the inability to record and store lung sounds for follow-up or tracking, and the physician’s subjective diagnostic experience. This study has developed a digital stethoscope to help physicians overcome these problems when diagnosing abnormal lung sounds. In this digital system, mel-frequency cepstral coefficients (MFCCs) were used to extract the features of lung sounds, and then the K-means algorithm was used for feature clustering, to reduce the amount of data for computation. Finally, the K-nearest neighbor method was used to classify the lung sounds. The proposed system can also be used for home care: if the percentage of abnormal lung sound frames is > 30% of the whole test signal, the system can automatically warn the user to visit a physician for diagnosis. We also used bend sensors together with an amplification circuit, Bluetooth, and a microcontroller to implement a respiration detector. The respiratory signal extracted by the bend sensors can be transmitted to the computer via Bluetooth to calculate the respiratory cycle, for real-time assessment. If an abnormal status is detected, the device will warn the user automatically. Experimental results indicated that the error in respiratory cycles between measured and actual values was only 6.8%, illustrating the potential of our detector for home care applications. PMID:26053756
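
    The final two stages described in this abstract, K-nearest neighbor classification of frames and the 30% warning rule, can be sketched as below. The sketch assumes that feature vectors have already been extracted (MFCC extraction and K-means reduction are omitted), uses Euclidean distance and majority voting, and runs on made-up two-dimensional features; all names and data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors nearest to `query` (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def should_warn(train, frames, k=3, threshold=0.30):
    """True if more than `threshold` of the frames are classified as abnormal."""
    abnormal = sum(knn_predict(train, frame, k) == "abnormal" for frame in frames)
    return abnormal / len(frames) > threshold

# Hypothetical labelled feature vectors standing in for reduced MFCC features.
train = [
    ((0.0, 0.0), "normal"), ((0.0, 1.0), "normal"), ((1.0, 0.0), "normal"),
    ((5.0, 5.0), "abnormal"), ((5.0, 6.0), "abnormal"), ((6.0, 5.0), "abnormal"),
]
mostly_normal = [(0.2, 0.1)] * 8 + [(5.5, 5.2)] * 2   # 20% abnormal frames
often_abnormal = [(0.2, 0.1)] * 6 + [(5.5, 5.2)] * 4  # 40% abnormal frames
print(should_warn(train, mostly_normal), should_warn(train, often_abnormal))
```

    With an odd k and two classes the vote cannot tie; with more classes a tie-breaking rule would have to be chosen explicitly.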

  2. Species distribution models: A comparison of statistical approaches for livestock and disease epidemics.

    PubMed

    Hollings, Tracey; Robinson, Andrew; van Andel, Mary; Jewell, Chris; Burgman, Mark

    2017-01-01

    In livestock industries, reliable up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively due to complex spatial distributions that do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDM) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability for the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best performing models showed very high accuracy. BRTs had the best performance overall, but RF performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi species and single species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation with broad applications in disease risk modelling, biosecurity, policy and planning.

  3. Species distribution models: A comparison of statistical approaches for livestock and disease epidemics

    PubMed Central

    Robinson, Andrew; van Andel, Mary; Jewell, Chris; Burgman, Mark

    2017-01-01

    In livestock industries, reliable up-to-date spatial distribution and abundance records for animals and farms are critical for governments to manage and respond to risks. Yet few, if any, countries can afford to maintain comprehensive, up-to-date agricultural census data. Statistical modelling can be used as a proxy for such data but comparative modelling studies have rarely been undertaken for livestock populations. Widespread species, including livestock, can be difficult to model effectively due to complex spatial distributions that do not respond predictably to environmental gradients. We assessed three machine learning species distribution models (SDM) for their capacity to estimate national-level farm animal population numbers within property boundaries: boosted regression trees (BRT), random forests (RF) and K-nearest neighbour (K-NN). The models were built from a commercial livestock database and environmental and socio-economic predictor data for New Zealand. We used two spatial data stratifications to test (i) support for decision making in an emergency response situation, and (ii) the ability for the models to predict to new geographic regions. The performance of the three model types varied substantially, but the best performing models showed very high accuracy. BRTs had the best performance overall, but RF performed equally well or better in many simulations; RFs were superior at predicting livestock numbers for all but very large commercial farms. K-NN performed poorly relative to both RF and BRT in all simulations. The predictions of both multi species and single species models for farms and within hypothetical quarantine zones were very close to observed data. These models are generally applicable for livestock estimation with broad applications in disease risk modelling, biosecurity, policy and planning. PMID:28837685

  4. Evaluation of extreme learning machine for classification of individual and combined finger movements using electromyography on amputees and non-amputees.

    PubMed

    Anam, Khairul; Al-Jumaily, Adel

    2017-01-01

    The success of myoelectric pattern recognition (M-PR) mostly relies on the features extracted and classifier employed. This paper proposes and evaluates a fast classifier, extreme learning machine (ELM), to classify individual and combined finger movements on amputees and non-amputees. ELM is a single hidden layer feed-forward network (SLFN) that avoids iterative learning by determining input weights randomly and output weights analytically. Therefore, it can accelerate the training time of SLFNs. In addition to the classifier evaluation, this paper evaluates various feature combinations to improve the performance of M-PR and investigate some feature projections to improve the class separability of the features. Different from other studies on the implementation of ELM in the myoelectric controller, this paper presents a complete and thorough investigation of various types of ELMs including the node-based and kernel-based ELM. Furthermore, this paper provides comparisons of ELMs and other well-known classifiers such as linear discriminant analysis (LDA), k-nearest neighbour (kNN), support vector machine (SVM) and least-square SVM (LS-SVM). The experimental results show the most accurate ELM classifier is radial basis function ELM (RBF-ELM). The comparison of RBF-ELM and other well-known classifiers shows that RBF-ELM is as accurate as SVM and LS-SVM but faster than the SVM family; it is superior to LDA and kNN. The experimental results also indicate that the accuracy gap of the M-PR on the amputees and non-amputees is not too much with the accuracy of 98.55% on amputees and 99.5% on the non-amputees using six electromyography (EMG) channels. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.

    PubMed

    Rahman, Shah Atiqur; Huang, Yuxiao; Claassen, Jan; Heintzman, Nathaniel; Kleinberg, Samantha

    2015-12-01

    Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead the measurement of a variable such as blood glucose may depend on its prior values as well as that of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point is missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring) the proposed method has the highest imputation accuracy. This was true for up to half the data being missing and when consecutive missing values are a significant fraction of the overall time series length. Copyright © 2015 Elsevier Inc. All rights reserved.
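
    To make the idea of a time-lagged nearest neighbour imputation concrete, here is a deliberately simplified single-variable sketch. It is not the FLk-NN method itself: it ignores the Fourier component and cross-variable correlations, and the matching rule (compare the value `lag` steps earlier, then average the k best matches) is an assumption chosen for illustration.

```python
def lagged_knn_impute(series, t, lag=1, k=2):
    """Estimate series[t] from the k time points whose lagged value is closest to series[t - lag]."""
    if t - lag < 0 or series[t - lag] is None:
        raise ValueError("lagged value for the target time point is not observed")
    reference = series[t - lag]
    # Time points where both the value and its lagged predecessor are observed.
    candidates = [
        s for s in range(lag, len(series))
        if s != t and series[s] is not None and series[s - lag] is not None
    ]
    if not candidates:
        raise ValueError("no complete time points to learn from")
    nearest = sorted(candidates, key=lambda s: abs(series[s - lag] - reference))[:k]
    return sum(series[s] for s in nearest) / len(nearest)

# Hypothetical periodic signal with one missing reading (None) at index 7.
series = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, None, 3.0]
estimate = lagged_knn_impute(series, t=7, lag=1, k=2)
print(estimate)
```

    The sketch fails when the lagged value is itself missing, which is one of the cases (all data at a time point missing, long consecutive gaps) that the abstract says the combined method is designed to handle.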

  6. An Analysis of Document Category Prediction Responses to Classifier Model Parameter Treatment Permutations within the Software Design Patterns Subject Domain

    ERIC Educational Resources Information Center

    Pankau, Brian L.

    2009-01-01

    This empirical study evaluates the document category prediction effectiveness of Naive Bayes (NB) and K-Nearest Neighbor (KNN) classifier treatments built from different feature selection and machine learning settings and trained and tested against textual corpora of 2300 Gang-Of-Four (GOF) design pattern documents. Analysis of the experiment's…

  7. Potential of near-infrared spectroscopy for quality evaluation of cattle leather.

    PubMed

    Braz, Carlos Eduardo M; Jacinto, Manuel Antonio C; Pereira-Filho, Edenir R; Souza, Gilberto B; Nogueira, Ana Rita A

    2018-05-09

    Models using near-infrared spectroscopy (NIRS) were constructed based on physical-mechanical tests to determine the quality of cattle leather. The following official parameters were used, considering the industry requirements: tensile strength (TS), percentage elongation (%E), tear strength (TT), and double hole tear strength (DHS). Classification models were constructed with the use of k-nearest neighbor (kNN), soft independent modeling of class analogy (SIMCA), and partial least squares-discriminant analysis (PLS-DA). The evaluated figures of merit, accuracy, sensitivity, and specificity presented results between 85% and 93%, and the false alarm rates from 9% to 14%. The model with lowest validation percentage (92%) was kNN, and the highest was PLS-DA (100%). For TS, lower values were obtained, from 52% for kNN and 74% for SIMCA. The other parameters %E, TT, and DHS presented hit rates between 87 and 100%. The abilities of the models were similar, showing they can be used to predict the quality of cattle leather. Copyright © 2018 Elsevier B.V. All rights reserved.

  8. Point-by-point compositional analysis for atom probe tomography.

    PubMed

    Stephenson, Leigh T; Ceguerra, Anna V; Li, Tong; Rojhirunsakool, Tanaporn; Nag, Soumya; Banerjee, Rajarshi; Cairney, Julie M; Ringer, Simon P

    2014-01-01

    This new alternate approach to data processing for analyses that traditionally employed grid-based counting methods is necessary because it removes a user-imposed coordinate system that not only limits an analysis but also may introduce errors. We have modified the widely used "binomial" analysis for APT data by replacing grid-based counting with coordinate-independent nearest neighbour identification, improving the measurements and the statistics obtained, allowing quantitative analysis of smaller datasets, and datasets from non-dilute solid solutions. It also allows better visualisation of compositional fluctuations in the data. Our modifications include: (1) using spherical k-atom blocks identified by each detected atom's first k nearest neighbours; (2) 3D data visualisation of block composition and nearest neighbour anisotropy; and (3) using z-statistics to directly compare experimental and expected composition curves. Similar modifications may be made to other grid-based counting analyses (contingency table, Langer-Bar-on-Miller, sinusoidal model) and could be instrumental in developing novel data visualisation options.

  9. Improved Fuzzy K-Nearest Neighbor Using Modified Particle Swarm Optimization

    NASA Astrophysics Data System (ADS)

    Jamaluddin; Siringoringo, Rimbun

    2017-12-01

    Fuzzy k-Nearest Neighbor (FkNN) is one of the most powerful classification methods. The presence of fuzzy concepts in this method successfully improves its performance on almost all classification issues. The main drawback of FkNN is that its parameters, the number of neighbors (k) and the fuzzy strength (m), are difficult to determine. Both parameters are very sensitive, and no theory or guide indicates what values of ‘m’ and ‘k’ are appropriate, which makes FkNN difficult to control. This study uses Modified Particle Swarm Optimization (MPSO) to determine the best values of ‘k’ and ‘m’. MPSO is based on the Constriction Factor Method, an improvement of PSO intended to avoid local optima. The proposed model was tested on the German Credit Dataset. The test data have been standardized by the UCI Machine Learning Repository and are widely applied to classification problems. Applying MPSO to the determination of FkNN parameters is expected to improve classification performance. The experiments indicate that the proposed model achieves better classification performance than the FkNN model alone: the proposed model has an accuracy of 81%, while the FkNN model has an accuracy of 70%. Finally, the proposed model was compared with two other classification models, Naive Bayes and Decision Tree. The proposed model performs better: Naive Bayes has an accuracy of 75%, and the decision tree model 70%.
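
    The role of the two parameters named in this abstract can be seen in a small sketch of fuzzy k-nearest neighbor classification. It assumes the commonly used inverse-distance weighting with exponent 2/(m - 1) and crisp training labels; the MPSO search over ‘k’ and ‘m’ is not shown, and the function name and toy data are hypothetical.

```python
import math

def fuzzy_knn(train, query, k=3, m=2.0):
    """Return class memberships of `query` from its k nearest neighbours.

    Each neighbour is weighted by 1 / distance ** (2 / (m - 1)); memberships sum to 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzy strength m must be greater than 1")
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    exponent = 2.0 / (m - 1.0)
    # A small floor on the distance avoids division by zero for exact matches.
    weights = [1.0 / max(math.dist(x, query), 1e-12) ** exponent for x, _ in nearest]
    total = sum(weights)
    memberships = {}
    for weight, (_, label) in zip(weights, nearest):
        memberships[label] = memberships.get(label, 0.0) + weight / total
    return memberships

# Hypothetical one-dimensional credit scores with crisp labels.
train = [((0.0,), "good"), ((1.0,), "good"), ((3.0,), "bad")]
memberships = fuzzy_knn(train, (2.0,), k=3, m=2.0)
print(memberships)
```

    Larger values of m flatten the weights toward a plain vote among the k neighbours, while values close to 1 let the single nearest neighbour dominate, which is one way to read the sensitivity the abstract describes.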

  10. Emotion recognition from multichannel EEG signals using K-nearest neighbor classification.

    PubMed

    Li, Mi; Xu, Hongpei; Liu, Xingwang; Lu, Shengfu

    2018-04-27

    Many studies have been done on emotion recognition based on multi-channel electroencephalogram (EEG) signals. This paper explores how emotion recognition accuracy is influenced by the EEG frequency band and the number of channels. We classified the emotional states in the valence and arousal dimensions using different combinations of EEG channels. First, the DEAP default preprocessed data were normalized. Next, EEG signals were divided into four frequency bands using the discrete wavelet transform, and entropy and energy were calculated as features for a K-nearest neighbor classifier. The classification accuracies of the 10, 14, 18 and 32 EEG channels based on the gamma frequency band were 89.54%, 92.28%, 93.72% and 95.70% in the valence dimension and 89.81%, 92.24%, 93.69% and 95.69% in the arousal dimension. As the number of channels increases, the classification accuracy of emotional states also increases. The classification accuracy of the gamma frequency band is greater than that of the beta frequency band, followed by the alpha and theta frequency bands. This paper provides a reference on suitable frequency bands and channels for emotion recognition based on EEG.

  11. Data fusion for food authentication. Combining rare earth elements and trace metals to discriminate "Fava Santorinis" from other yellow split peas using chemometric tools.

    PubMed

    Drivelos, Spiros A; Higgins, Kevin; Kalivas, John H; Haroutounian, Serkos A; Georgiou, Constantinos A

    2014-12-15

    "Fava Santorinis", is a protected designation of origin (PDO) yellow split pea species growing only in the island of Santorini in Greece. Due to its nutritional quality and taste, it has gained a high monetary value. Thus, it is prone to adulteration with other yellow split peas. In order to discriminate "Fava Santorinis" from other yellow split peas, four classification methods utilising rare earth elements (REEs) measured through inductively coupled plasma-mass spectrometry (ICP-MS) are studied. The four classification processes are orthogonal projection analysis (OPA), Mahalanobis distance (MD), partial least squares discriminant analysis (PLS-DA) and k nearest neighbours (KNN). Since it is known that trace elements are often useful to determine geographical origin of food products, we further quantitated for trace elements using ICP-MS. Presented in this paper are results using the four classification processes based on the fusion of the REEs data with the trace element data. Overall, the OPA method was found to perform best with up to 100% accuracy using the fused data. Copyright © 2014 Elsevier Ltd. All rights reserved.

  12. Voice based gender classification using machine learning

    NASA Astrophysics Data System (ADS)

    Raahul, A.; Sapthagiri, R.; Pankaj, K.; Vijayarajan, V.

    2017-11-01

    Gender identification is one of the major problems in speech analysis today. It involves tracing the gender from acoustic data, i.e., pitch, median, frequency, etc. Machine learning gives promising results for classification problems in all research domains. There are several performance metrics for evaluating the algorithms of an area. We present a comparative model for evaluating five different machine learning algorithms based on eight different metrics in gender classification from acoustic data. The aim is to identify gender with five different algorithms: Linear Discriminant Analysis (LDA), K-Nearest Neighbour (KNN), Classification and Regression Trees (CART), Random Forest (RF), and Support Vector Machine (SVM), on the basis of eight different metrics. The main parameter in evaluating any algorithm is its performance. The misclassification rate must be low in classification problems, which means that the accuracy rate must be high. Location and gender of the person have become very crucial in economic markets in the form of AdSense. With this comparative model, we try to assess the different ML algorithms and find the best fit for gender classification of acoustic data.

  13. Modeling Liver-Related Adverse Effects of Drugs Using kNN QSAR Method

    PubMed Central

    Rodgers, Amie D.; Zhu, Hao; Fourches, Dennis; Rusyn, Ivan; Tropsha, Alexander

    2010-01-01

    Adverse effects of drugs (AEDs) continue to be a major cause of drug withdrawals both in development and post-marketing. While liver-related AEDs are a major concern for drug safety, there are few in silico models for predicting human liver toxicity for drug candidates. We have applied the Quantitative Structure Activity Relationship (QSAR) approach to model liver AEDs. In this study, we aimed to construct a QSAR model capable of binary classification (active vs. inactive) of drugs for liver AEDs based on chemical structure. To build QSAR models, we have employed an FDA spontaneous reporting database of human liver AEDs (elevations in activity of serum liver enzymes), which contains data on approximately 500 approved drugs. Approximately 200 compounds with wide clinical data coverage, structural similarity and balanced (40/60) active/inactive ratio were selected for modeling and divided into multiple training/test and external validation sets. QSAR models were developed using the k nearest neighbor method and validated using external datasets. Models with high sensitivity (>73%) and specificity (>94%) for prediction of liver AEDs in external validation sets were developed. To test applicability of the models, three chemical databases (World Drug Index, Prestwick Chemical Library, and Biowisdom Liver Intelligence Module) were screened in silico and the validity of predictions was determined, where possible, by comparing model-based classification with assertions in publicly available literature. Validated QSAR models of liver AEDs based on the data from the FDA spontaneous reporting system can be employed as sensitive and specific predictors of AEDs in pre-clinical screening of drug candidates for potential hepatotoxicity in humans. PMID:20192250

  14. RRAM-based parallel computing architecture using k-nearest neighbor classification for pattern recognition

    NASA Astrophysics Data System (ADS)

    Jiang, Yuning; Kang, Jinfeng; Wang, Xinan

    2017-03-01

    Resistive switching memory (RRAM) is considered as one of the most promising devices for parallel computing solutions that may overcome the von Neumann bottleneck of today’s electronic systems. However, the existing RRAM-based parallel computing architectures suffer from practical problems such as device variations and extra computing circuits. In this work, we propose a novel parallel computing architecture for pattern recognition by implementing k-nearest neighbor classification on metal-oxide RRAM crossbar arrays. Metal-oxide RRAM with gradual RESET behaviors is chosen as both the storage and computing components. The proposed architecture is tested by the MNIST database. High speed (~100 ns per example) and high recognition accuracy (97.05%) are obtained. The influence of several non-ideal device properties is also discussed, and it turns out that the proposed architecture shows great tolerance to device variations. This work paves a new way to achieve RRAM-based parallel computing hardware systems with high performance.

  15. Compositional inhomogeneity and segregation in (K0.5Na0.5)NbO3 ceramics

    DOE PAGES

    Chen, Kepi; Tang, Jing; Chen, Yan

    2016-03-11

    The effects of the calcination temperature of (K0.5Na0.5)NbO3 (KNN) powder on the sintering and piezoelectric properties of KNN ceramics have been investigated in this report. KNN powders are synthesized via the solid-state approach. Scanning electron microscopy and X-ray diffraction characterizations indicate that the incomplete reaction at 700 °C and 750 °C calcination results in the compositional inhomogeneity of the K-rich and Na-rich phases, while the orthorhombic single phase is obtained after calcination at 900 °C. During sintering, the presence of the liquid K-rich phase due to the lower melting point has a significant impact on the densification, the abnormal grain growth and the deteriorated piezoelectric properties. From the standpoint of piezoelectric properties, the optimal calcination temperature is determined to be 800 °C; KNN ceramics calcined at this temperature have piezoelectric constant d33 = 128.3 pC/N, planar electromechanical coupling coefficient kp = 32.2%, mechanical quality factor Qm = 88, and dielectric loss tan δ = 2.1%.

  16. Differential diagnosis of pleural mesothelioma using Logic Learning Machine.

    PubMed

    Parodi, Stefano; Filiberti, Rosa; Marroni, Paola; Libener, Roberta; Ivaldi, Giovanni Paolo; Mussap, Michele; Ferrari, Enrico; Manneschi, Chiara; Montani, Erika; Muselli, Marco

    2015-01-01

    Tumour markers are standard tools for the differential diagnosis of cancer. However, the occurrence of nonspecific symptoms and different malignancies involving the same cancer site may lead to a high proportion of misclassifications. Classification accuracy can be improved by combining information from different markers using standard data mining techniques, like Decision Tree (DT), Artificial Neural Network (ANN), and k-Nearest Neighbour (KNN) classifier. Unfortunately, each method suffers from some unavoidable limitations. DT, in general, tends to show a low classification performance, whereas ANN and KNN produce a "black-box" classification that does not provide biological information useful for clinical purposes. Logic Learning Machine (LLM) is an innovative method of supervised data analysis capable of building classifiers described by a set of intelligible rules including simple conditions in their antecedent part. It is essentially an efficient implementation of the Switching Neural Network model and reaches excellent classification accuracy while keeping the computational demand low. LLM was applied to data from a consecutive cohort of 169 patients admitted for diagnosis to two pulmonary departments in Northern Italy from 2009 to 2011. Patients included 52 malignant pleural mesotheliomas (MPM), 62 pleural metastases (MTX) from other tumours and 55 benign diseases (BD) associated with pleurisies. Concentration of three tumour markers (CEA, CYFRA 21-1 and SMRP) was measured in the pleural fluid of each patient and a cytological examination was also carried out. The performance of LLM and that of three competing methods (DT, KNN and ANN) was assessed by leave-one-out cross-validation. LLM outperformed all other considered methods. Global accuracy was 77.5% for LLM, 72.8% for DT, 54.4% for KNN, and 63.9% for ANN, respectively. In more detail, LLM correctly classified 79% of MPM, 66% of MTX and 89% of BD. The corresponding figures for DT were: MPM = 83%, MTX

  17. Classification of Malaysia aromatic rice using multivariate statistical analysis

    NASA Astrophysics Data System (ADS)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md; Masnan, M. J.; Zakaria, A.; Rahim, N. A.; Omar, O.

    2015-05-01

    Aromatic rice (Oryza sativa L.) is considered the best quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks, such as lengthy training time, fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.
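
    The Leave-One-Out evaluation of KNN mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: Euclidean distance, majority voting, and the function names are choices made here, and real e-nose sensor responses would replace the feature vectors.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # most_common breaks ties in favour of the label encountered first,
    # which here is the label of the nearer neighbour.
    return votes.most_common(1)[0][0]


def leave_one_out_error(samples, k=3):
    """Fraction of samples misclassified when each one is held out in turn."""
    errors = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if knn_predict(rest, features, k) != label:
            errors += 1
    return errors / len(samples)
```

    Each sample is classified by a model that never saw it, so the returned fraction is an estimate of the misclassification error on unseen samples, which suits small datasets where a separate test set is not affordable.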

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Abdullah, A. H.; Adom, A. H.; Shakaff, A. Y. Md

    Aromatic rice (Oryza sativa L.) is considered the best quality premium rice. The varieties are preferred by consumers because of criteria such as shape, colour, distinctive aroma and flavour. The price of aromatic rice is higher than that of ordinary rice because of its special growth requirements, for instance specific climate and soil. Presently, aromatic rice quality is identified using its key elements and isotopic variables. The rice can also be classified via Gas Chromatography Mass Spectrometry (GC-MS) or human sensory panels. However, human sensory panels have significant drawbacks, such as lengthy training time, fatigue as the number of samples increases, and inconsistency. GC-MS analysis techniques, on the other hand, require detailed procedures and lengthy analysis and are quite costly. This paper presents the application of an in-house developed Electronic Nose (e-nose) to classify new aromatic rice varieties. The e-nose is used to classify the variety of aromatic rice based on the odour of the samples. The samples were taken from the variety of rice. The instrument utilizes multivariate statistical data analysis, including Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA) and K-Nearest Neighbours (KNN), to classify the unknown rice samples. The Leave-One-Out (LOO) validation approach is applied to evaluate the ability of KNN to perform recognition and classification of the unspecified samples. Visual observation of the PCA and LDA plots of the rice shows that the instrument was able to separate the samples into different clusters. The low misclassification errors of LDA and KNN support these findings, and we conclude that the e-nose was successfully applied to the classification of the aromatic rice varieties.

  19. BIANCA (Brain Intensity AbNormality Classification Algorithm): A new tool for automated segmentation of white matter hyperintensities.

    PubMed

    Griffanti, Ludovica; Zamboni, Giovanna; Khan, Aamira; Li, Linxin; Bonifacio, Guendalina; Sundaresan, Vaanathi; Schulz, Ursula G; Kuker, Wilhelm; Battaglini, Marco; Rothwell, Peter M; Jenkinson, Mark

    2016-11-01

    Reliable quantification of white matter hyperintensities of presumed vascular origin (WMHs) is increasingly needed, given the presence of these MRI findings in patients with several neurological and vascular disorders, as well as in elderly healthy subjects. We present BIANCA (Brain Intensity AbNormality Classification Algorithm), a fully automated, supervised method for WMH detection, based on the k-nearest neighbour (k-NN) algorithm. Relative to previous k-NN based segmentation methods, BIANCA offers different options for weighting the spatial information, local spatial intensity averaging, and different options for the choice of the number and location of the training points. BIANCA is multimodal and highly flexible so that the user can adapt the tool to their protocol and specific needs. We optimised and validated BIANCA on two datasets with different MRI protocols and patient populations (a "predominantly neurodegenerative" and a "predominantly vascular" cohort). BIANCA was first optimised on a subset of images for each dataset in terms of overlap and volumetric agreement with a manually segmented WMH mask. The correlation between the volumes extracted with BIANCA (using the optimised set of options), the volumes extracted from the manual masks and visual ratings showed that BIANCA is a valid alternative to manual segmentation. The optimised set of options was then applied to the whole cohorts and the resulting WMH volume estimates showed good correlations with visual ratings and with age. Finally, we performed a reproducibility test, to evaluate the robustness of BIANCA, and compared BIANCA performance against existing methods. Our findings suggest that BIANCA, which will be freely available as part of the FSL package, is a reliable method for automated WMH segmentation in large cross-sectional cohort studies. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  20. Exploiting the capabilities of the Sentinel-2 multi spectral instrument for predicting growing stock volume in forest ecosystems

    NASA Astrophysics Data System (ADS)

    Mura, Matteo; Bottalico, Francesca; Giannetti, Francesca; Bertani, Remo; Giannini, Raffaello; Mancini, Marco; Orlandini, Simone; Travaglini, Davide; Chirici, Gherardo

    2018-04-01

    The spatial prediction of growing stock volume is one of the most frequent applications of remote sensing for supporting the sustainable management of forest ecosystems. For this purpose, data from active or passive sensors are used as predictor variables in combination with measurements taken in the field in sampling plots. The Sentinel-2 (S2) satellites are equipped with a Multi Spectral Instrument (MSI) capable of acquiring 13 bands in the visible and infrared domains with a spatial resolution varying between 10 and 60 m. The present study aimed at evaluating the performance of the S2-MSI imagery for estimating the growing stock volume of forest ecosystems. To do so, we used 240 plots measured in two study areas in Italy. The imputation was carried out with eight k-Nearest Neighbours (k-NN) methods available in the open source YaImpute R package. In order to evaluate the S2-MSI performance, we repeated the experimental protocol with two other sets of images acquired by two well-known satellites equipped with multi spectral instruments: Landsat 8 OLI and the RapidEye scanner. We found that S2 worked better than Landsat in 37.5% of the cases and better than RapidEye in 62.5% of the cases. In one study area the best performance was obtained with Landsat OLI (RMSD = 6.84%) and in the other with S2 (RMSD = 22.94%), both with the k-NN system based on a distance matrix calculated with the Random Forest algorithm. The results confirmed that S2 images are suitable for predicting growing stock volume, obtaining good performances (average RMSD for both the test areas of less than 19%).

  1. Ensemble Clustering Classification Applied to Competing SVM and One-Class Classifiers Exemplified by Plant MicroRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbdAllah, Loai

    2016-12-01

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.

  2. Classification Features of US Images Liver Extracted with Co-occurrence Matrix Using the Nearest Neighbor Algorithm

    NASA Astrophysics Data System (ADS)

    Moldovanu, Simona; Bibicu, Dorin; Moraru, Luminita; Nicolae, Mariana Carmen

    2011-12-01

    The co-occurrence matrix has been applied successfully to the characterization of echographic images because it contains information about the spatial distribution of grey-scale levels in an image. The paper deals with the analysis of pixels in selected regions of interest of an ultrasound (US) image of the liver. The useful information obtained refers to texture features such as entropy, contrast, dissimilarity and correlation extracted with the co-occurrence matrix. The analyzed US images were grouped in two distinct sets: healthy liver and steatosis (or fatty) liver. These two sets of echographic images of the liver form a database that includes only histologically confirmed cases: 10 images of healthy liver and 10 images of steatosis liver. The healthy subjects are used to compute the four textural indices and also serve as the control dataset. We chose to study this disease because steatosis is the abnormal retention of lipids in cells. The texture features are statistical measures and they can be used to characterize irregularity of tissues. The goal is to extract the information using the Nearest Neighbor classification algorithm. The K-NN algorithm is a powerful tool for classifying texture features by grouping them into a training set using healthy liver, on the one hand, and a holdout set using the texture features of steatosis liver, on the other hand. The results could be used to quantify the texture information and will allow a clear distinction between healthy and steatosis liver.

  3. Ensemble Clustering Classification compete SVM and One-Class classifiers applied on plant microRNAs Data.

    PubMed

    Yousef, Malik; Khalifa, Waleed; AbedAllah, Loai

    2016-12-22

    The performance of many learning and data mining algorithms depends critically on suitable metrics to assess efficiency over the input space. Learning a suitable metric from examples may, therefore, be the key to successful application of these algorithms. We have demonstrated that the k-nearest neighbor (kNN) classification can be significantly improved by learning a distance metric from labeled examples. The clustering ensemble is used to define the distance between points in respect to how they co-cluster. This distance is then used within the framework of the kNN algorithm to define a classifier named ensemble clustering kNN classifier (EC-kNN). In many instances in our experiments we achieved highest accuracy while SVM failed to perform as well. In this study, we compare the performance of a two-class classifier using EC-kNN with different one-class and two-class classifiers. The comparison was applied to seven different plant microRNA species considering eight feature selection methods. In this study, the averaged results show that EC-kNN outperforms all other methods employed here and previously published results for the same data. In conclusion, this study shows that the chosen classifier shows high performance when the distance metric is carefully chosen.

  4. Fractal dimension to classify the heart sound recordings with KNN and fuzzy c-mean clustering methods

    NASA Astrophysics Data System (ADS)

    Juniati, D.; Khotimah, C.; Wardani, D. E. K.; Budayasa, K.

    2018-01-01

    Heart abnormalities can be detected from heart sounds. A heart sound can be heard directly with a stethoscope or indirectly with a phonocardiograph, a machine that records heart sounds. This paper presents the implementation of fractal dimension theory to classify phonocardiograms into a normal heart sound, a murmur, or an extrasystole. The main algorithm used to calculate the fractal dimension was Higuchi’s algorithm. The classification of phonocardiograms involved two steps: feature extraction and classification. For feature extraction, we used the Discrete Wavelet Transform (DWT) to decompose the heart sound signal into several sub-bands depending on the selected level. After the decomposition process, the signal was processed using the Fast Fourier Transform (FFT) to determine the spectral frequency. The fractal dimension of the FFT output was calculated using Higuchi’s algorithm. The fractal dimensions of all phonocardiograms were classified with KNN and fuzzy c-means clustering methods. The best accuracy obtained was 86.17%, with feature extraction by DWT decomposition at level 3, a kmax value of 50, 5-fold cross validation, and 5 neighbors in the K-NN algorithm. Meanwhile, for fuzzy c-means clustering, the accuracy was 78.56%.
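
    The fractal-dimension feature named in this abstract can be sketched with the standard formulation of Higuchi's algorithm. The function name is hypothetical, the DWT and FFT stages are not reproduced, and the code is a plain illustration of how the kmax parameter enters the computation rather than the authors' implementation.

```python
import math


def higuchi_fd(signal, kmax):
    """Estimate the Higuchi fractal dimension of a 1-D sequence.

    For each delay k = 1..kmax the mean curve length L(k) is computed over
    the k possible start offsets; the estimate is the least-squares slope
    of log L(k) against log(1/k). A constant signal has zero curve length
    and raises ValueError from math.log.
    """
    n = len(signal)
    if not 2 <= kmax < n:
        raise ValueError("kmax must satisfy 2 <= kmax < len(signal)")
    xs, ys = [], []
    for k in range(1, kmax + 1):
        lengths = []
        for m in range(k):
            steps = (n - m - 1) // k  # increments of size k starting at offset m
            if steps < 1:
                continue
            total = sum(
                abs(signal[m + i * k] - signal[m + (i - 1) * k])
                for i in range(1, steps + 1)
            )
            # Higuchi normalisation (n - 1) / (steps * k), then division by k.
            lengths.append(total * (n - 1) / (steps * k) / k)
        xs.append(math.log(1.0 / k))
        ys.append(math.log(sum(lengths) / len(lengths)))
    # Ordinary least-squares slope of ys against xs.
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

    A straight ramp gives a dimension of 1, and more irregular signals give values closer to 2, which is what makes the estimate usable as a single scalar feature for a KNN classifier.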

  5. Probability estimation with machine learning methods for dichotomous and multicategory outcome: theory.

    PubMed

    Kruppa, Jochen; Liu, Yufeng; Biau, Gérard; Kohler, Michael; König, Inke R; Malley, James D; Ziegler, Andreas

    2014-07-01

    Probability estimation for binary and multicategory outcome using logistic and multinomial logistic regression has a long-standing tradition in biostatistics. However, biases may occur if the model is misspecified. In contrast, outcome probabilities for individuals can be estimated consistently with machine learning approaches, including k-nearest neighbors (k-NN), bagged nearest neighbors (b-NN), random forests (RF), and support vector machines (SVM). Because machine learning methods are rarely used by applied biostatisticians, the primary goal of this paper is to explain the concept of probability estimation with these methods and to summarize recent theoretical findings. Probability estimation in k-NN, b-NN, and RF can be embedded into the class of nonparametric regression learning machines; therefore, we start with the construction of nonparametric regression estimates and review results on consistency and rates of convergence. In SVMs, outcome probabilities for individuals are estimated consistently by repeatedly solving classification problems. For SVMs we review classification problem and then dichotomous probability estimation. Next we extend the algorithms for estimating probabilities using k-NN, b-NN, and RF to multicategory outcomes and discuss approaches for the multicategory probability estimation problem using SVM. In simulation studies for dichotomous and multicategory dependent variables we demonstrate the general validity of the machine learning methods and compare it with logistic regression. However, each method fails in at least one simulation scenario. We conclude with a discussion of the failures and give recommendations for selecting and tuning the methods. Applications to real data and example code are provided in a companion article (doi:10.1002/bimj.201300077). © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
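
    The basic idea of k-NN probability estimation summarised above can be sketched as the fraction of the k nearest training points that fall in each class. The function name and the Euclidean distance are assumptions made for this illustration; the companion article cited in the abstract, not this sketch, contains the authors' example code.

```python
import math
from collections import Counter


def knn_class_probabilities(train, query, k, classes):
    """Estimate P(class | query) as the class share among the k nearest points.

    train: list of (feature_vector, label) pairs.
    classes: every label that should appear in the result, including
             labels with an estimated probability of zero.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    counts = Counter(label for _, label in nearest)
    return {c: counts[c] / len(nearest) for c in classes}
```

    The same function covers the dichotomous and the multicategory case, since only the list of classes changes. Small k gives coarse estimates (multiples of 1/k), which is one reason the choice of k matters for probability estimation more than for plain classification.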

  6. Creating Profiles from User Network Behavior

    DTIC Science & Technology

    2013-09-01

    We varied the m-estimate in Naïve Bayes, m for pruning in Learning Tree, and how many k nearest neighbors to select from in KNN, before settling on the...N. Taft, “The cubicle vs. the coffee shop: behavioral modes in enterprise end-users,” in Proc. of the 9th Int. Conf. on Passive and Active Network

  7. Evolution of the composition, structure, and piezoelectric performance of (K1-xNax)NbO3 nanorod arrays with hydrothermal reaction time

    NASA Astrophysics Data System (ADS)

    Jin, Wenchao; Wang, Zhao; Li, Meng; He, Yahua; Hu, Xiaokang; Li, Luying; Gao, Yihua; Hu, Yongming; Gu, Haoshuang; Wang, Xiaolin

    2018-04-01

    Lead-free (K,Na)NbO3 (KNN) nanorod arrays were synthesized with the assistance of a Nb:SrTiO3 single-crystal substrate through the hydrothermal process. The evolutions of the morphology, composition, and structure of the as-synthesized KNN nanorods with the increase in reaction time were investigated. The results confirmed that the increase in reaction time up to 3 h led to the increase in the length and aspect ratio of the well-aligned KNN nanorods. All samples have K-rich orthorhombic crystal structures, while the diffraction peaks shifted towards higher angles. The peak shifts should be attributed to the increase in the Na content in the KNN lattice, which could decrease the lattice parameters owing to the smaller ionic radius of Na+ compared with that of K+. Moreover, the increase in reaction time also resulted in the suppression of oxygen vacancies on the surface of the KNN nanorods. These evolutions of the composition and crystal structure, as well as the decrease in the defect content, lead to great enhancement of the nanorods' piezoelectric response, as their d33 value increased from 19 to 64 pm/V. These results demonstrate the significant impact of reaction time on the hydrothermal growth of high-performance lead-free KNN one-dimensional nanomaterials.

  8. Passive RFID Rotation Dimension Reduction via Aggregation

    NASA Astrophysics Data System (ADS)

    Matthews, Eric

    Radio Frequency IDentification (RFID) has applications in object identification, position, and orientation tracking. RFID technology can be applied in hospitals for patient and equipment tracking, stores and warehouses for product tracking, robots for self-localisation, tracking hazardous materials, or locating any other desired object. Efficient and accurate algorithms that perform localisation are required to extract meaningful data beyond simple identification. A Received Signal Strength Indicator (RSSI) is the strength of a received radio frequency signal used to localise passive and active RFID tags. Many factors affect RSSI such as reflections, tag rotation in 3D space, and obstacles blocking line-of-sight. LANDMARC is a statistical method for estimating tag location based on a target tag's similarity to surrounding reference tags. LANDMARC does not take into account the rotation of the target tag. By either aggregating multiple reference tag positions at various rotations, or by determining a rotation value for a newly read tag, we can perform an expected value calculation based on a comparison to the k-most similar training samples via an algorithm called K-Nearest Neighbours (KNN) more accurately. By choosing the average as the aggregation function, we improve the relative accuracy of single-rotation LANDMARC localisation by 10%, and any-rotation localisation by 20%.
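
    The expected-value calculation over the k most similar training samples can be sketched as follows. This is a simplified illustration, not the thesis's method: it assumes each reference tag is described by a vector of RSSI readings (one per reader) and a known 2-D position, uses Euclidean distance in RSSI space as the similarity measure, and takes an unweighted mean of the neighbours' positions. The aggregation over tag rotations is not reproduced, and all names are hypothetical.

```python
import math


def knn_locate(reference_tags, target_rssi, k=3):
    """Estimate a tag position as the mean position of its k nearest references.

    reference_tags: list of (rssi_vector, (x, y)) for tags at known positions.
    target_rssi: RSSI vector measured for the tag to be localised.
    """
    nearest = sorted(
        reference_tags, key=lambda tag: math.dist(tag[0], target_rssi)
    )[:k]
    x = sum(pos[0] for _, pos in nearest) / len(nearest)
    y = sum(pos[1] for _, pos in nearest) / len(nearest)
    return (x, y)
```

    Aggregating several readings of one reference tag taken at different rotations into a single averaged RSSI vector before calling a function like this is one way to read the dimension reduction named in the title, though the abstract does not spell out the exact procedure.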

  9. Detection of sibutramine in adulterated dietary supplements using attenuated total reflectance-infrared spectroscopy.

    PubMed

    Deconinck, E; Cauwenbergh, T; Bothy, J L; Custers, D; Courselle, P; De Beer, J O

    2014-11-01

    Sibutramine is one of the most frequently occurring adulterants encountered in dietary supplements with slimming as indication. These adulterated dietary supplements often contain a herbal matrix. When customs intercept these kinds of supplements it is almost impossible to discriminate between the legal products and the adulterated ones, due to misleading packaging. Therefore in most cases these products are confiscated and sent to laboratories for analysis. This inherently results in the confiscation of legal, non-adulterated products. Therefore there is a need for easy-to-use equipment and techniques to perform an initial screening of samples. Attenuated total reflectance-infrared (ATR-IR) spectroscopy was evaluated for the detection of sibutramine in adulterated dietary supplements. Data interpretation was performed using different basic chemometric techniques. It was found that ATR-IR combined with k-Nearest Neighbours (k-NN) was able to detect all adulterated dietary supplements in an external test set, with a minimum of false positive results. This means that a small number of legal products will still be confiscated and analyzed in a laboratory to be found negative, but no adulterated samples will pass the initial ATR-IR screening. Copyright © 2014 Elsevier B.V. All rights reserved.

  10. The nearest neighbor and next nearest neighbor effects on the thermodynamic and kinetic properties of RNA base pair

    NASA Astrophysics Data System (ADS)

    Wang, Yujie; Wang, Zhen; Wang, Yanli; Liu, Taigang; Zhang, Wenbing

    2018-01-01

    The thermodynamic and kinetic parameters of an RNA base pair with different nearest and next nearest neighbors were obtained through long-time molecular dynamics simulation of the opening-closing switch process of the base pair near its melting temperature. The results indicate that the thermodynamic parameters of the GC base pair depend on the nearest neighbor base pair, while the next nearest neighbor base pair has little effect, which validates the nearest-neighbor model. The closing and opening rates of the GC base pair also showed nearest neighbor dependence. At a given temperature, the closing and opening rates of the GC pair with nearest neighbor AU are larger than those with nearest neighbor GC, and the next nearest neighbor plays little role. The free energy landscape of the GC base pair with nearest neighbor GC is rougher than that with nearest neighbor AU.

  11. Diagnosis of response and non-response to dry eye treatment using infrared thermography images

    NASA Astrophysics Data System (ADS)

    Acharya, U. Rajendra; Tan, Jen Hong; Vidya, S.; Yeo, Sharon; Too, Cheah Loon; Lim, Wei Jie Eugene; Chua, Kuang Chua; Tong, Louis

    2014-11-01

    The dry eye treatment outcome depends on the assessment of the clinical relevance of the treatment effect. One potential approach to assessing the clinical relevance of the treatment is to identify the symptom responders and non-responders to the given treatments using responder analysis. In our work, we have performed responder analysis to assess the clinical relevance of the dry eye treatments, namely hot towel, EyeGiene®, and Blephasteam® twice daily and a 12 min session of Lipiflow®. Thermography is performed at week 0 (baseline) and at weeks 4 and 12 after treatment. Clinical parameters such as the change in clinical irritation scores, tear break up time (TBUT), corneal staining and Schirmer's test values are used to obtain the responder and non-responder groups. We have obtained the infrared thermography images of dry eye symptom responders and non-responders to the three types of warming treatments. The energy, kurtosis, skewness, mean, standard deviation, and various entropies, namely Shannon, Renyi and Kapoor, are extracted from the responder and non-responder thermograms. The extracted features are ranked based on t-values. These ranked features are fed to various classifiers to get the highest performance using a minimum number of features. We have used decision tree (DT), K nearest neighbour (KNN), Naive Bayesian (NB) and support vector machine (SVM) to classify the features into responder and non-responder classes. We have obtained an average accuracy of 99.88%, sensitivity of 99.7% and specificity of 100% using the KNN classifier with ten-fold cross validation.

  12. Automated cloud classification using a ground based infra-red camera and texture analysis techniques

    NASA Astrophysics Data System (ADS)

    Rumi, Emal; Kerr, David; Coupland, Jeremy M.; Sandford, Andrew P.; Brettle, Mike J.

    2013-10-01

    Clouds play an important role in influencing the dynamics of local and global weather and climate conditions. Continuous monitoring of clouds is vital for weather forecasting and for air-traffic control. Convective clouds such as Towering Cumulus (TCU) and Cumulonimbus clouds (CB) are associated with thunderstorms, turbulence and atmospheric instability. Human observers periodically report the presence of CB and TCU clouds during operational hours at airports and observatories; however such observations are expensive and time limited. Robust, automatic classification of cloud type using infrared ground-based instrumentation offers the advantage of continuous, real-time (24/7) data capture and the representation of cloud structure in the form of a thermal map, which can greatly help to characterise certain cloud formations. The work presented here utilised a ground based infrared (8-14 μm) imaging device mounted on a pan/tilt unit for capturing high spatial resolution sky images. These images were processed to extract 45 separate textural features using statistical and spatial frequency based analytical techniques. These features were used to train a weighted k-nearest neighbour (KNN) classifier in order to determine cloud type. Ground truth data were obtained by inspection of images captured simultaneously from a visible wavelength colour camera at the same installation, with approximately the same field of view as the infrared device. These images were classified by a trained cloud observer. Results from the KNN classifier gave an encouraging success rate. A Probability of Detection (POD) of up to 90% with a Probability of False Alarm (POFA) as low as 16% was achieved.
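
    The weighted k-nearest neighbour step described above can be sketched roughly as follows. The abstract does not say how the neighbours were weighted, so inverse-distance weighting is an assumption here, and the two-element feature vectors, the labels and the function name weighted_knn_predict are purely illustrative (the study used 45 textural features).

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_X, train_y, query, k=3, eps=1e-9):
    """Classify `query` by inverse-distance-weighted voting among its
    k nearest training points (weighting scheme is an assumption)."""
    # Euclidean distance from the query to every training vector.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = defaultdict(float)
    for d, label in dists[:k]:
        votes[label] += 1.0 / (d + eps)  # closer neighbours count more
    return max(votes, key=votes.get)

# Toy "texture" vectors; values and labels are hypothetical.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1), (1.2, 0.8)]
train_y = ["clear", "clear", "CB", "CB", "CB"]
prediction = weighted_knn_predict(train_X, train_y, (0.2, 0.1), k=3)
```

    With weighting, a single very close neighbour can outvote several distant ones, which is the main practical difference from plain majority-vote KNN.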

  13. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Majidpour, Mostafa; Qiu, Charlie; Chu, Peter

    Three algorithms for the forecasting of energy consumption at individual EV charging outlets have been applied to real world data from the UCLA campus. Out of these three algorithms, namely k-Nearest Neighbor (kNN), ARIMA, and Pattern Sequence Forecasting (PSF), kNN with k=1 was the best and PSF was the worst performing algorithm with respect to the SMAPE measure. The advantage of PSF is its increased robustness to noise by substituting the real valued time series with an integer valued one, and the advantage of NN is having the least SMAPE for our data. We propose a Modified PSF algorithm (MPSF) which is a combination of PSF and NN; it could be interpreted as NN on integer valued data or as PSF with considering only the most recent neighbor to produce the output. Some other shortcomings of PSF are also addressed in the MPSF. Results show that MPSF has improved the forecast performance.
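
    As a rough illustration of the two ingredients named above, the sketch below pairs a 1-nearest-neighbour forecast with a SMAPE calculation. Several details are assumptions rather than the authors' method: the SMAPE variant (mean of |a - f| divided by the average of |a| and |f|, in percent), the squared-Euclidean window matching, the window length, and the sample readings.

```python
def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent (0 to 200).
    One common definition; the paper's exact variant is not given."""
    terms = []
    for a, f in zip(actual, forecast):
        denom = (abs(a) + abs(f)) / 2.0
        terms.append(0.0 if denom == 0 else abs(a - f) / denom)
    return 100.0 * sum(terms) / len(terms)

def nn_forecast(history, window=3):
    """1-NN forecast: find the past window most similar to the latest
    one and return the value that followed it."""
    recent = history[-window:]
    best_dist, best_next = float("inf"), None
    # Candidate windows must end before the last value so a successor exists.
    for start in range(len(history) - window):
        candidate = history[start:start + window]
        dist = sum((c - r) ** 2 for c, r in zip(candidate, recent))
        if dist < best_dist:
            best_dist, best_next = dist, history[start + window]
    return best_next

# Hypothetical energy readings (kWh) with a repeating pattern.
history = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0]
prediction = nn_forecast(history, window=2)
error = smape([3.0], [prediction])
```

    Restricting the search to a single nearest window mirrors the k=1 setting reported as best in the abstract; larger k would average the successors of several matching windows.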

  14. Sequential (step-by-step) detection, identification and quantitation of extra virgin olive oil adulteration by chemometric treatment of chromatographic profiles.

    PubMed

    Capote, F Priego; Jiménez, J Ruiz; de Castro, M D Luque

    2007-08-01

    An analytical method for the sequential detection, identification and quantitation of extra virgin olive oil adulteration with four edible vegetable oils--sunflower, corn, peanut and coconut oils--is proposed. The only data required for this method are the results obtained from an analysis of the lipid fraction by gas chromatography-mass spectrometry. A total of 566 samples (pure oils and samples of adulterated olive oil) were used to develop the chemometric models, which were designed to accomplish, step-by-step, the three aims of the method: to detect whether an olive oil sample is adulterated, to identify the type of adulterant used in the fraud, and to determine how much adulterant is in the sample. Qualitative analysis was carried out via two chemometric approaches--soft independent modelling of class analogy (SIMCA) and K nearest neighbours (KNN)--both of which exhibited prediction abilities that were always higher than 91% for adulterant detection and 88% for type of adulterant identification. Quantitative analysis was based on partial least squares regression (PLSR), which yielded R2 values of >0.90 for calibration and validation sets and thus made it possible to determine adulteration with excellent precision according to the Shenk criteria.

  15. Automated analysis of long-term grooming behavior in Drosophila using a k-nearest neighbors classifier

    PubMed Central

    Allen, Victoria W; Shirasu-Hiza, Mimi

    2018-01-01

    Despite being pervasive, the control of programmed grooming is poorly understood. We addressed this gap by developing a high-throughput platform that allows long-term detection of grooming in Drosophila melanogaster. In our method, a k-nearest neighbors algorithm automatically classifies fly behavior and finds grooming events with over 90% accuracy in diverse genotypes. Our data show that flies spend ~13% of their waking time grooming, driven largely by two major internal programs. One of these programs regulates the timing of grooming and involves the core circadian clock components cycle, clock, and period. The second program regulates the duration of grooming and, while dependent on cycle and clock, appears to be independent of period. This emerging dual control model in which one program controls timing and another controls duration, resembles the two-process regulatory model of sleep. Together, our quantitative approach presents the opportunity for further dissection of mechanisms controlling long-term grooming in Drosophila. PMID:29485401

  16. Multisite rainfall downscaling and disaggregation in a tropical urban area

    NASA Astrophysics Data System (ADS)

    Lu, Y.; Qin, X. S.

    2014-02-01

    A systematic downscaling-disaggregation study was conducted over Singapore Island, with an aim to generate high spatial and temporal resolution rainfall data under future climate-change conditions. The study consisted of two major components. The first part was to perform an inter-comparison of various alternatives of downscaling and disaggregation methods based on observed data. This included (i) single-site generalized linear model (GLM) plus K-nearest neighbor (KNN) (S-G-K) vs. multisite GLM (M-G) for spatial downscaling, (ii) HYETOS vs. KNN for single-site disaggregation, and (iii) KNN vs. MuDRain (Multivariate Rainfall Disaggregation tool) for multisite disaggregation. The results revealed that, for multisite downscaling, M-G performs better than S-G-K in covering the observed data with a lower RMSE value; for single-site disaggregation, KNN could better keep the basic statistics (i.e. standard deviation, lag-1 autocorrelation and probability of wet hour) than HYETOS; for multisite disaggregation, MuDRain outperformed KNN in fitting interstation correlations. In the second part of the study, an integrated downscaling-disaggregation framework based on M-G, KNN, and MuDRain was used to generate hourly rainfall at multiple sites. The results indicated that the downscaled and disaggregated rainfall data based on multiple ensembles from HadCM3 for the period from 1980 to 2010 could well cover the observed mean rainfall amount and extreme data, and also reasonably keep the spatial correlations both at daily and hourly timescales. The framework was also used to project future rainfall conditions under HadCM3 SRES A2 and B2 scenarios. It was indicated that the annual rainfall amount could reduce up to 5% at the end of this century, but the rainfall of wet season and extreme hourly rainfall could notably increase.

  17. The nearest neighbor and the bayes error rates.

    PubMed

    Loizou, G; Maybank, S J

    1987-02-01

    The (k, l) nearest neighbor method of pattern classification is compared to the Bayes method. If the two acceptance rates are equal then the asymptotic error rates satisfy the inequalities E_{k,l+1} ≤ E*(λ) ≤ E_{k,l} ≤ dE*(λ), where d is a function of k, l, and the number of pattern classes, and λ is the reject threshold for the Bayes method. An explicit expression for d is given which is optimal in the sense that for some probability distributions E_{k,l} and dE*(λ) are equal.

  18. A Novel Hybrid Classification Model of Genetic Algorithms, Modified k-Nearest Neighbor and Developed Backpropagation Neural Network

    PubMed Central

    Salari, Nader; Shohaimi, Shamarina; Najafi, Farid; Nallappan, Meenakshii; Karishnarajah, Isthrinayagy

    2014-01-01

    Among numerous artificial intelligence approaches, k-Nearest Neighbor algorithms, genetic algorithms, and artificial neural networks are considered the most common and effective methods in classification problems in numerous studies. In the present study, the results of the implementation of a novel hybrid feature selection-classification model using the above mentioned methods are presented. The purpose is to benefit from the synergies obtained from combining these technologies for the development of classification models. Such a combination creates an opportunity to exploit the strengths of each algorithm, and is an approach to make up for their deficiencies. To develop the proposed model, with the aim of obtaining the best array of features, first, feature ranking techniques such as the Fisher's discriminant ratio and class separability criteria were used to prioritize features. Second, the obtained results that included arrays of the top-ranked features were used as the initial population of a genetic algorithm to produce optimum arrays of features. Third, using a modified k-Nearest Neighbor method as well as an improved method of backpropagation neural networks, the classification process was advanced based on optimum arrays of the features selected by genetic algorithms. The performance of the proposed model was compared with thirteen well-known classification models based on seven datasets. Furthermore, the statistical analysis was performed using the Friedman test followed by post-hoc tests. The experimental findings indicated that the novel proposed hybrid model resulted in significantly better classification performance compared with all 13 classification methods. Finally, the performance results of the proposed model were benchmarked against the best ones reported as the state-of-the-art classifiers in terms of classification accuracy for the same data sets. The substantial findings of the comprehensive comparative study revealed that performance of the

  19. Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.

    In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to non-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.

  20. Automatic Classification of Protein Structure Using the Maximum Contact Map Overlap Metric

    DOE PAGES

    Andonov, Rumen; Djidjev, Hristo Nikolov; Klau, Gunnar W.; ...

    2015-10-09

    In this paper, we propose a new distance measure for comparing two protein structures based on their contact map representations. We show that our novel measure, which we refer to as the maximum contact map overlap (max-CMO) metric, satisfies all properties of a metric on the space of protein representations. Having a metric in that space allows one to avoid pairwise comparisons on the entire database and, thus, to significantly accelerate exploring the protein space compared to non-metric spaces. We show on a gold standard superfamily classification benchmark set of 6759 proteins that our exact k-nearest neighbor (k-NN) scheme classifies up to 224 out of 236 queries correctly and on a larger, extended version of the benchmark with 60,850 additional structures, up to 1361 out of 1369 queries. Finally, our k-NN classification thus provides a promising approach for the automatic classification of protein structures based on flexible contact map overlap alignments.

  1. Classification of Breast Cancer Resistant Protein (BCRP) Inhibitors and Non-Inhibitors Using Machine Learning Approaches.

    PubMed

    Belekar, Vilas; Lingineni, Karthik; Garg, Prabha

    2015-01-01

    The breast cancer resistant protein (BCRP) is an important transporter and its inhibitors play an important role in cancer treatment by improving the oral bioavailability as well as blood brain barrier (BBB) permeability of anticancer drugs. In this work, a computational model was developed to predict the compounds as BCRP inhibitors or non-inhibitors. Various machine learning approaches like, support vector machine (SVM), k-nearest neighbor (k-NN) and artificial neural network (ANN) were used to develop the models. The Matthews correlation coefficients (MCC) of developed models using ANN, k-NN and SVM are 0.67, 0.71 and 0.77, and prediction accuracies are 85.2%, 88.3% and 90.8% respectively. The developed models were tested with a test set of 99 compounds and further validated with external set of 98 compounds. Distribution plot analysis and various machine learning models were also developed based on druglikeness descriptors. Applicability domain is used to check the prediction reliability of the new molecules.
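
    For readers unfamiliar with the Matthews correlation coefficient reported above, a minimal sketch of how it is computed from a binary confusion matrix, alongside plain accuracy, is given below. The counts are hypothetical and are not taken from the study.

```python
import math

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical inhibitor / non-inhibitor counts, for illustration only.
score = mcc(tp=45, tn=40, fp=10, fn=5)
acc = accuracy(tp=45, tn=40, fp=10, fn=5)
```

    MCC ranges from -1 to +1 and, unlike accuracy, stays informative when the two classes are of unequal size, which is presumably why both figures are quoted.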

  2. A study on (K, Na) NbO3 based multilayer piezoelectric ceramics micro speaker

    NASA Astrophysics Data System (ADS)

    Gao, Renlong; Chu, Xiangcheng; Huan, Yu; Sun, Yiming; Liu, Jiayi; Wang, Xiaohui; Li, Longtu

    2014-10-01

    A flat panel micro speaker was fabricated from (K, Na) NbO3 (KNN)-based multilayer piezoelectric ceramics by a tape casting and cofiring process using Ag-Pd alloys as an inner electrode. The interface between ceramic and electrode was investigated by scanning electron microscope (SEM) and transmission electron microscope (TEM). The acoustic response was characterized by a standard audio test system. We found that the micro speaker with dimensions of 23 × 27 × 0.6 mm3, using three layers of 30 μm thickness KNN-based ceramic, has a high average sound pressure level (SPL) of 87 dB, between 100 Hz-20 kHz under five voltage. This result was even better than that of lead zirconate titanate (PZT)-based ceramics under the same conditions. The experimental results show that the KNN-based multilayer ceramics could be used as lead free piezoelectric micro speakers.

  3. Diagnosis of Tempromandibular Disorders Using Local Binary Patterns

    PubMed Central

    Haghnegahdar, A.A.; Kolahi, S.; Khojastepour, L.; Tajeripour, F.

    2018-01-01

    Background: Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. Material and Methods: CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cuts were prepared from each condyle, although images were limited to the head of the mandibular condyle. In order to extract features of images, we first use LBP and then histogram of oriented gradients. To reduce dimensionality, Singular Value Decomposition (SVD) is applied to the feature vector matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. Results: The K nearest neighbor classifier achieves a very good accuracy (0.9242); moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, whereas the other classifiers have lower accuracy, sensitivity and specificity. Conclusion: We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN was the best classifier in our experiments for distinguishing patients from healthy individuals, with 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages. PMID:29732343
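
    The three reported rates are mutually consistent with the stated sample of 132 TMD joints and 132 normal joints, as the short check below suggests. The group sizes come from the abstract; the split into 125 correctly and 7 incorrectly classified TMD joints, and 119 correctly and 13 incorrectly classified normal joints, is inferred here from the rates and is not reported by the authors.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of diseased joints detected
    specificity = tn / (tn + fp)   # share of healthy joints cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Assumed counts: 125 + 7 = 132 TMD joints, 119 + 13 = 132 normal joints.
sens, spec, acc = diagnostic_rates(tp=125, fn=7, tn=119, fp=13)
print(f"{sens:.4f} {spec:.4f} {acc:.4f}")  # prints "0.9470 0.9015 0.9242"
```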

  4. Diagnosis of Tempromandibular Disorders Using Local Binary Patterns.

    PubMed

    Haghnegahdar, A A; Kolahi, S; Khojastepour, L; Tajeripour, F

    2018-03-01

    Temporomandibular joint disorder (TMD) might be manifested as structural changes in bone through modification, adaptation or direct destruction. We propose to use Local Binary Pattern (LBP) characteristics and histogram-oriented gradients on the recorded images as a diagnostic tool in TMD assessment. CBCT images of 66 patients (132 joints) with TMD and 66 normal cases (132 joints) were collected and 2 coronal cuts were prepared from each condyle, although images were limited to the head of the mandibular condyle. In order to extract features of images, we first use LBP and then histogram of oriented gradients. To reduce dimensionality, Singular Value Decomposition (SVD) is applied to the feature vector matrix of all images. For evaluation, we used K nearest neighbor (K-NN), Support Vector Machine, Naïve Bayesian and Random Forest classifiers. We used Receiver Operating Characteristic (ROC) to evaluate the hypothesis. The K nearest neighbor classifier achieves a very good accuracy (0.9242); moreover, it has desirable sensitivity (0.9470) and specificity (0.9015) results, whereas the other classifiers have lower accuracy, sensitivity and specificity. We proposed a fully automatic approach to detect TMD using image processing techniques based on local binary patterns and feature extraction. K-NN was the best classifier in our experiments for distinguishing patients from healthy individuals, with 92.42% accuracy, 94.70% sensitivity and 90.15% specificity. The proposed method can help automatically diagnose TMD at its initial stages.

  5. Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait—a cohort study

    PubMed Central

    Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2013-01-01

    Objective We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Design Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Setting Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. Participants 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Outcome measures Incident type 2 diabetes, hypertension and comorbidity. Results Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign ‘high’ risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned ‘low’ risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Conclusions Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case–control models to assess risks. These tools aid

  6. Texture- and deformability-based surface recognition by tactile image analysis.

    PubMed

    Khasnobish, Anwesha; Pal, Monalisa; Tibarewala, D N; Konar, Amit; Pal, Kunal

    2016-08-01

    Deformability and texture are two unique object characteristics which are essential for appropriate surface recognition by tactile exploration. Tactile sensation is required to be incorporated in artificial arms for rehabilitative and other human-computer interface applications to achieve efficient and human-like manoeuvring. To accomplish the same, surface recognition by tactile data analysis is one of the prerequisites. The aim of this work is to develop effective technique for identification of various surfaces based on deformability and texture by analysing tactile images which are obtained during dynamic exploration of the item by artificial arms whose gripper is fitted with tactile sensors. Tactile data have been acquired, while human beings as well as a robot hand fitted with tactile sensors explored the objects. The tactile images are pre-processed, and relevant features are extracted from the tactile images. These features are provided as input to the variants of support vector machine (SVM), linear discriminant analysis and k-nearest neighbour (kNN) for classification. Based on deformability, six household surfaces are recognized from their corresponding tactile images. Moreover, based on texture five surfaces of daily use are classified. The method adopted in the former two cases has also been applied for deformability- and texture-based recognition of four biomembranes, i.e. membranes prepared from biomaterials which can be used for various applications such as drug delivery and implants. Linear SVM performed best for recognizing surface deformability with an accuracy of 83 % in 82.60 ms, whereas kNN classifier recognizes surfaces of daily use having different textures with an accuracy of 89 % in 54.25 ms and SVM with radial basis function kernel recognizes biomembranes with an accuracy of 78 % in 53.35 ms. The classifiers are observed to generalize well on the unseen test datasets with very high performance to achieve efficient material

  7. Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait--a cohort study.

    PubMed

    Farran, Bassam; Channanath, Arshad Mohamed; Behbehani, Kazem; Thanaraj, Thangavel Alphonse

    2013-05-14

    We build classification models and risk assessment tools for diabetes, hypertension and comorbidity using machine-learning algorithms on data from Kuwait. We model the increased proneness in diabetic patients to develop hypertension and vice versa. We ascertain the importance of ethnicity (and natives vs expatriate migrants) and of using regional data in risk assessment. Retrospective cohort study. Four machine-learning techniques were used: logistic regression, k-nearest neighbours (k-NN), multifactor dimensionality reduction and support vector machines. The study uses fivefold cross validation to obtain generalisation accuracies and errors. Kuwait Health Network (KHN) that integrates data from primary health centres and hospitals in Kuwait. 270 172 hospital visitors (of which, 89 858 are diabetic, 58 745 hypertensive and 30 522 comorbid) comprising Kuwaiti natives, Asian and Arab expatriates. Incident type 2 diabetes, hypertension and comorbidity. Classification accuracies of >85% (for diabetes) and >90% (for hypertension) are achieved using only simple non-laboratory-based parameters. Risk assessment tools based on k-NN classification models are able to assign 'high' risk to 75% of diabetic patients and to 94% of hypertensive patients. Only 5% of diabetic patients are seen assigned 'low' risk. Asian-specific models and assessments perform even better. Pathological conditions of diabetes in the general population or in hypertensive population and those of hypertension are modelled. Two-stage aggregate classification models and risk assessment tools, built combining both the component models on diabetes (or on hypertension), perform better than individual models. Data on diabetes, hypertension and comorbidity from the cosmopolitan State of Kuwait are available for the first time. This enabled us to apply four different case-control models to assess risks. These tools aid in the preliminary non-intrusive assessment of the population. 
Ethnicity is seen significant

  8. GTM-Based QSAR Models and Their Applicability Domains.

    PubMed

    Gaspar, H A; Baskin, I I; Marcou, G; Horvath, D; Varnek, A

    2015-06-01

    In this paper we demonstrate that Generative Topographic Mapping (GTM), a machine learning method traditionally used for data visualisation, can be efficiently applied to QSAR modelling using probability distribution functions (PDF) computed in the latent 2-dimensional space. Several different scenarios of the activity assessment were considered: (i) the "activity landscape" approach based on direct use of PDF, (ii) QSAR models involving GTM-generated on descriptors derived from PDF, and, (iii) the k-Nearest Neighbours approach in 2D latent space. Benchmarking calculations were performed on five different datasets: stability constants of metal cations Ca(2+) , Gd(3+) and Lu(3+) complexes with organic ligands in water, aqueous solubility and activity of thrombin inhibitors. It has been shown that the performance of GTM-based regression models is similar to that obtained with some popular machine-learning methods (random forest, k-NN, M5P regression tree and PLS) and ISIDA fragment descriptors. By comparing GTM activity landscapes built both on predicted and experimental activities, we may visually assess the model's performance and identify the areas in the chemical space corresponding to reliable predictions. The applicability domain used in this work is based on data likelihood. Its application has significantly improved the model performances for 4 out of 5 datasets. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  9. Preserved Network Metrics across Translated Texts

    NASA Astrophysics Data System (ADS)

    Cabatbat, Josephine Jill T.; Monsanto, Jica P.; Tapang, Giovanni A.

    2014-09-01

    Co-occurrence language networks based on Bible translations and the Universal Declaration of Human Rights (UDHR) translations in different languages were constructed and compared with random text networks. Among the considered network metrics, the network size, N, the normalized betweenness centrality (BC), and the average k-nearest neighbors, knn, were found to be the most preserved across translations. Moreover, similar frequency distributions of co-occurring network motifs were observed for translated texts networks.
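
    Note that knn in this record is a network metric rather than the classifier. Assuming it denotes the average nearest-neighbour degree (the mean degree of a node's neighbours), which the abstract does not spell out, it could be computed on a toy co-occurrence network along these lines. The linking rule (adjacent words share an edge) and the sample sentence are illustrative choices, not the authors' construction.

```python
from collections import defaultdict

def average_neighbor_degree(edges):
    """Return {node: mean degree of its neighbours} for an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    return {
        n: sum(degree[m] for m in nbrs) / len(nbrs)
        for n, nbrs in adj.items()
    }

# Toy co-occurrence network: consecutive words in the text are linked.
words = "in the beginning was the word".split()
edges = list(zip(words, words[1:]))
knn = average_neighbor_degree(edges)
```

    In this toy graph the hub word "the" has low-degree neighbours, while the rare words attach to the hub, so their knn values are higher than the hub's own.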

  10. Chemometric and multivariate statistical analysis of time-of-flight secondary ion mass spectrometry spectra from complex Cu-Fe sulfides.

    PubMed

    Kalegowda, Yogesh; Harmer, Sarah L

    2012-03-20

    Time-of-flight secondary ion mass spectrometry (TOF-SIMS) spectra of mineral samples are complex, comprised of large mass ranges and many peaks. Consequently, characterization and classification analysis of these systems is challenging. In this study, different chemometric and statistical data evaluation methods, based on monolayer sensitive TOF-SIMS data, have been tested for the characterization and classification of copper-iron sulfide minerals (chalcopyrite, chalcocite, bornite, and pyrite) at different flotation pulp conditions (feed, conditioned feed, and Eh modified). The complex mass spectral data sets were analyzed using the following chemometric and statistical techniques: principal component analysis (PCA); principal component-discriminant functional analysis (PC-DFA); soft independent modeling of class analogy (SIMCA); and k-Nearest Neighbor (k-NN) classification. PCA was found to be an important first step in multivariate analysis, providing insight into both the relative grouping of samples and the elemental/molecular basis for those groupings. For samples exposed to oxidative conditions (at Eh ~430 mV), each technique (PCA, PC-DFA, SIMCA, and k-NN) was found to produce excellent classification. For samples at reductive conditions (at Eh ~ -200 mV SHE), k-NN and SIMCA produced the most accurate classification. Phase identification of particles that contain the same elements but a different crystal structure in a mixed multimetal mineral system has been achieved.

  11. Discrimination of lymphoma using laser-induced breakdown spectroscopy conducted on whole blood samples

    PubMed Central

    Chen, Xue; Li, Xiaohui; Yang, Sibo; Yu, Xin; Liu, Aichun

    2018-01-01

    Lymphoma is a significant cancer that affects the human lymphatic and hematopoietic systems. In this work, discrimination of lymphoma using laser-induced breakdown spectroscopy (LIBS) conducted on whole blood samples is presented. The whole blood samples collected from lymphoma patients and healthy controls are deposited onto standard quantitative filter papers and ablated with a 1064 nm Q-switched Nd:YAG laser. 16 atomic and ionic emission lines of calcium (Ca), iron (Fe), magnesium (Mg), potassium (K) and sodium (Na) are selected to discriminate the cancer disease. Chemometric methods, including principal component analysis (PCA), linear discriminant analysis (LDA) classification, and k nearest neighbor (kNN) classification are used to build the discrimination models. Both LDA and kNN models have achieved very good discrimination performances for lymphoma, with an accuracy of over 99.7%, a sensitivity of over 0.996, and a specificity of over 0.997. These results demonstrate that the whole-blood-based LIBS technique in combination with chemometric methods can serve as a fast, less invasive, and accurate method for detection and discrimination of human malignancies. PMID:29541503

  12. α-K2AgF4: Ferromagnetism induced by the weak superexchange of different eg orbitals from the nearest neighbor Ag ions

    NASA Astrophysics Data System (ADS)

    Zhang, Xiaoli; Zhang, Guoren; Jia, Ting; Zeng, Zhi; Lin, H. Q.

    2016-05-01

    We study the abnormal ferromagnetism in α-K2AgF4, which is structurally very similar to the high-TC parent material La2CuO4. We find that electron correlation is very important in determining the insulating property of α-K2AgF4. The Ag(II) 4d9 ion in the octahedral crystal field has the t2g6 eg3 electron occupation, with the eg x2-y2 orbital fully occupied and the 3z2-r2 orbital partially occupied. The two eg orbitals are very extended, indicating that both of them are active in superexchange. Using the Hubbard model combined with the Nth-order muffin-tin orbital (NMTO) downfolding technique, it is concluded that the exchange interaction between the eg 3z2-r2 and x2-y2 orbitals of the first nearest neighbor Ag ions leads to the anomalous ferromagnetism in α-K2AgF4.

  13. Effect of pasture size on behavioural synchronization and spacing in German Blackface ewes (Ovis aries).

    PubMed

    Hauschildt, Verena; Gerken, Martina

    2016-03-01

    This study aims to assess plot size related changes in spacing and behavioural synchronization in a herd of 14 German Blackface ewes kept on three different pasture sizes: S (126 m²), M (1100 m²), and L (11,200 m²). In direct field observations, behaviour and nearest neighbour distance were recorded individually. Additionally, interindividual and nearest neighbour distances were derived from aerial photographs of the herd taken on plot sizes S and M. Nearest neighbour distances <1 m accounted for more than 60% of observations, and were more frequent on plot size L than on plot sizes S (Z=3.3; p<0.01) and M (Z=3.2; p<0.01). Average interindividual distances were significantly smaller on S (4.89±2.62 m) than on M plots (5.99±3.06 m; t=7.3; p<0.01). Synchronization tended to increase with plot size (K(S)=0.42; K(M)=0.52; K(L)=0.66), but was not accompanied by a concomitant increase in dispersion. Aerial photography proved a valuable tool in the analysis of spacing behaviour as intraindividual repeatability of the derived distances was highly significant (Kendall's W between 0.32 and 0.58; p<0.01). The sheep kept small distances on all plot sizes, thus the high degree of behavioural synchronization might be mainly attributed to the motivation for close proximity to any conspecific. Copyright © 2015 Elsevier B.V. All rights reserved.

  14. Multispectral imaging burn wound tissue classification system: a comparison of test accuracies between several common machine learning algorithms

    NASA Astrophysics Data System (ADS)

    Squiers, John J.; Li, Weizhi; King, Darlene R.; Mo, Weirong; Zhang, Xu; Lu, Yang; Sellke, Eric W.; Fan, Wensheng; DiMaio, J. Michael; Thatcher, Jeffrey E.

    2016-03-01

    The clinical judgment of expert burn surgeons is currently the standard on which diagnostic and therapeutic decision-making regarding burn injuries is based. Multispectral imaging (MSI) has the potential to increase the accuracy of burn depth assessment and the intraoperative identification of viable wound bed during surgical debridement of burn injuries. A highly accurate classification model must be developed using machine-learning techniques in order to translate MSI data into clinically relevant information. An animal burn model was developed to build an MSI training database and to study the burn tissue classification ability of several models trained via common machine-learning algorithms. The algorithms tested, from least to most complex, were: K-nearest neighbors (KNN), decision tree (DT), linear discriminant analysis (LDA), weighted linear discriminant analysis (W-LDA), quadratic discriminant analysis (QDA), ensemble linear discriminant analysis (EN-LDA), ensemble K-nearest neighbors (EN-KNN), and ensemble decision tree (EN-DT). After the ground-truth database of six tissue types (healthy skin, wound bed, blood, hyperemia, partial injury, full injury) was generated by histopathological analysis, we used 10-fold cross validation to compare the algorithms' performances based on their accuracies in classifying data against the ground truth, and each algorithm was tested 100 times. The mean test accuracies of the algorithms were KNN 68.3%, DT 61.5%, LDA 70.5%, W-LDA 68.1%, QDA 68.9%, EN-LDA 56.8%, EN-KNN 49.7%, and EN-DT 36.5%. LDA had the highest test accuracy, reflecting the bias-variance tradeoff over the range of complexities inherent to the algorithms tested. Several algorithms were able to match the current standard in burn tissue classification, the clinical judgment of expert burn surgeons. These results will guide further development of an MSI burn tissue classification system. Given that there are few surgeons and facilities specializing in burn care

  15. Neighbour effects on Erica multiflora (Ericaceae) reproductive performance after clipping

    NASA Astrophysics Data System (ADS)

    Vilà, Montserrat; Terradas, Jaume

    1998-04-01

    The effect of interspecific competition on resprouting and reproductive success and the relationship between above-ground vegetative biomass variability and reproductive biomass variability were analysed during resprouting after clipping. For this purpose, a field experiment was performed by removing neighbours around individuals of Erica multiflora in a Mediterranean shrub community. Removal of neighbours increased the number of sprouts and the above-ground vegetative biomass of target plants. However, it did not decrease plant size variability. Neighbours decreased the likelihood of fruiting and the biomass of fruits. In target plants that had set fruits a simple allometric relationship between above-ground vegetative biomass and the biomass of fruits explained 42% of the variation in fruit biomass. The probability to set fruits at a given plant size was smaller in plants with neighbours than without neighbours. Presence of neighbours also increased the variability of fruit biomass within the population, because 50% of target plants with neighbours did not set fruits. This failure to set fruits may be related to shading, the small size of plants with neighbours, as well as a delay in development.

  16. An integrated analysis for determining the geographical origin of medicinal herbs using ICP-AES/ICP-MS and (1)H NMR analysis.

    PubMed

    Kwon, Yong-Kook; Bong, Yeon-Sik; Lee, Kwang-Sik; Hwang, Geum-Sook

    2014-10-15

    ICP-MS and (1)H NMR are commonly used to determine the geographical origin of food and crops. In this study, data from multielemental analysis performed by ICP-AES/ICP-MS and metabolomic data obtained from (1)H NMR were integrated to improve the reliability of determining the geographical origin of medicinal herbs. Astragalus membranaceus and Paeonia albiflora with different origins in Korea and China were analysed by (1)H NMR and ICP-AES/ICP-MS, and an integrated multivariate analysis was performed to characterise the differences between their origins. Four classification methods were applied: linear discriminant analysis (LDA), k-nearest neighbour classification (KNN), support vector machines (SVM), and partial least squares-discriminant analysis (PLS-DA). Results were compared using leave-one-out cross-validation and external validation. The integration of multielemental and metabolomic data was more suitable for determining geographical origin than the use of each individual data set alone. The integration of the two analytical techniques allowed diverse environmental factors such as climate and geology, to be considered. Our study suggests that an appropriate integration of different types of analytical data is useful for determining the geographical origin of food and crops with a high degree of reliability. Copyright © 2014 Elsevier Ltd. All rights reserved.
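    The leave-one-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch for the KNN case. This is not the authors' pipeline: the two-feature samples, the origin labels "A" and "B", and the function names are all made up for illustration, and Euclidean distance is an assumption.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training samples closest to query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(xs, ys, k=3):
    """Hold out each sample once, fit on the rest, score the held-out prediction."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


# Made-up samples for two hypothetical origins, "A" and "B".
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]
ys = ["A", "A", "A", "B", "B", "B"]
loo_acc = leave_one_out_accuracy(xs, ys, k=1)
```

    Because every sample is held out exactly once, the estimate uses all available data, which is why the scheme suits small sample sets; the cost is one model fit per sample.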

  17. Two-dimensional fingerprinting approach for comparison of complex substances analysed by HPLC-UV and fluorescence detection.

    PubMed

    Ni, Yongnian; Liu, Ying; Kokot, Serge

    2011-02-07

    This work is concerned with the research and development of methodology for analysis of complex mixtures such as pharmaceutical or food samples, which contain many analytes. Variously treated samples (swill washed, fried and scorched) of the Rhizoma atractylodis macrocephalae (RAM) traditional Chinese medicine (TCM) as well as the common substitute, Rhizoma atractylodis (RA) TCM were chosen as examples for analysis. A combined data matrix of chromatographic 2-D HPLC-DAD-FLD (two-dimensional high performance liquid chromatography with diode array and fluorescence detectors) fingerprint profiles was constructed with the use of the HPLC-DAD and HPLC-FLD individual data matrices; the purpose was to collect maximum information and to interpret this complex data with the use of various chemometrics methods e.g. the rank-ordering multi-criteria decision making (MCDM) PROMETHEE and GAIA, K-nearest neighbours (KNN), partial least squares (PLS), back propagation-artificial neural networks (BP-ANN) methods. The chemometrics analysis demonstrated that the combined 2-D HPLC-DAD-FLD data matrix does indeed provide more information and facilitates better performing classification/prediction models for the analysis of such complex samples as the RAM and RA ones noted above. It is suggested that this fingerprint approach is suitable for analysis of other complex, multi-analyte substances.

  18. Optical and Piezoelectric Study of KNN Solid Solutions Co-Doped with La-Mn and Eu-Fe.

    PubMed

    Peña-Jiménez, Jesús-Alejandro; González, Federico; López-Juárez, Rigoberto; Hernández-Alcántara, José-Manuel; Camarillo, Enrique; Murrieta-Sánchez, Héctor; Pardo, Lorena; Villafuerte-Castrejón, María-Elena

    2016-09-28

    The solid-state method was used to synthesize single phase potassium-sodium niobate (KNN) co-doped with the La3+-Mn4+ and Eu3+-Fe3+ ion pairs. Structural determination of all studied solid solutions was accomplished by XRD and Rietveld refinement method. Electron paramagnetic resonance (EPR) studies were performed to determine the oxidation state of paramagnetic centers. Optical spectroscopy measurements, excitation, emission and decay lifetime were carried out for each solid solution. The present study reveals that doping KNN with La3+-Mn4+ and Eu3+-Fe3+ at concentrations of 0.5 mol % and 1 mol %, respectively, improves the ferroelectric and piezoelectric behavior and induces the generation of optical properties in the material for potential applications.

  19. Optical and Piezoelectric Study of KNN Solid Solutions Co-Doped with La-Mn and Eu-Fe

    PubMed Central

    Peña-Jiménez, Jesús-Alejandro; González, Federico; López-Juárez, Rigoberto; Hernández-Alcántara, José-Manuel; Camarillo, Enrique; Murrieta-Sánchez, Héctor; Pardo, Lorena; Villafuerte-Castrejón, María-Elena

    2016-01-01

    The solid-state method was used to synthesize single phase potassium-sodium niobate (KNN) co-doped with the La3+–Mn4+ and Eu3+–Fe3+ ion pairs. Structural determination of all studied solid solutions was accomplished by XRD and Rietveld refinement method. Electron paramagnetic resonance (EPR) studies were performed to determine the oxidation state of paramagnetic centers. Optical spectroscopy measurements, excitation, emission and decay lifetime were carried out for each solid solution. The present study reveals that doping KNN with La3+–Mn4+ and Eu3+–Fe3+ at concentrations of 0.5 mol % and 1 mol %, respectively, improves the ferroelectric and piezoelectric behavior and induces the generation of optical properties in the material for potential applications. PMID:28773925

  20. Classification enhancement for post-stroke dementia using fuzzy neighborhood preserving analysis with QR-decomposition.

    PubMed

    Al-Qazzaz, Noor Kamal; Ali, Sawal; Ahmad, Siti Anom; Escudero, Javier

    2017-07-01

    The aim of the present study was to discriminate the electroencephalogram (EEG) of 5 patients with vascular dementia (VaD), 15 patients with stroke-related mild cognitive impairment (MCI), and 15 control normal subjects during a working memory (WM) task. We used independent component analysis (ICA) and wavelet transform (WT) as a hybrid preprocessing approach for EEG artifact removal. Three different features were extracted from the cleaned EEG signals: spectral entropy (SpecEn), permutation entropy (PerEn) and Tsallis entropy (TsEn). Two classification schemes were applied - support vector machine (SVM) and k-nearest neighbors (kNN) - with fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) as a dimensionality reduction technique. The FNPAQR dimensionality reduction technique increased the SVM classification accuracy from 82.22% to 90.37% and from 82.6% to 86.67% for kNN. These results suggest that FNPAQR consistently improves the discrimination of VaD, MCI patients and control normal subjects and it could be a useful feature selection to help the identification of patients with VaD and MCI.

  1. Spatial clustering of dark matter haloes: secondary bias, neighbour bias, and the influence of massive neighbours on halo properties

    NASA Astrophysics Data System (ADS)

    Salcedo, Andrés N.; Maller, Ariyeh H.; Berlind, Andreas A.; Sinha, Manodeep; McBride, Cameron K.; Behroozi, Peter S.; Wechsler, Risa H.; Weinberg, David H.

    2018-04-01

    We explore the phenomenon commonly known as halo assembly bias, whereby dark matter haloes of the same mass are found to be more or less clustered when a second halo property is considered, for haloes in the mass range 3.7 × 10^11 to 5.0 × 10^13 h^-1 M⊙. Using the Large Suite of Dark Matter Simulations (LasDamas) we consider nine commonly used halo properties and find that a clustering bias exists if haloes are binned by mass or by any other halo property. This secondary bias implies that no single halo property encompasses all the spatial clustering information of the halo population. The mean values of some halo properties depend on their halo's distance to a more massive neighbour. Halo samples selected by having high values of one of these properties therefore inherit a neighbour bias such that they are much more likely to be close to a much more massive neighbour. This neighbour bias largely accounts for the secondary bias seen in haloes binned by mass and split by concentration or age. However, haloes binned by other mass-like properties still show a secondary bias even when the neighbour bias is removed. The secondary bias of haloes selected by their spin behaves differently than that for other halo properties, suggesting that the origin of the spin bias is different than of other secondary biases.

  2. Improvement in synthesis of (K0.5Na0.5)NbO3 powders by Ge4+ acceptor doping

    DOE PAGES

    Zhao, Yajing; Chen, Yan; Chen, Kepi

    2016-11-17

    In this study, the effects of doping with GeO2 on the synthesis temperature, phase structure and morphology of (K0.5Na0.5)NbO3 (KNN) ceramic powders were studied using XRD and SEM. The results show that KNN powders with good crystallinity and compositional homogeneity can be obtained after calcination at up to 900°C for 2 h. Introducing 0.5 mol.% GeO2 into the starting mixture improved the synthesis of the KNN powders and allowed the calcination temperature to be decreased to 800°C, which can be ascribed to the formation of the liquid phase during the synthesis.

  3. A Novel Locally Linear KNN Method With Applications to Visual Recognition.

    PubMed

    Liu, Qingfeng; Liu, Chengjun

    2017-09-01

    A locally linear K Nearest Neighbor (LLK) method is presented in this paper with applications to robust visual recognition. Specifically, the concept of an ideal representation is first presented, which improves upon the traditional sparse representation in many ways. The objective function based on a host of criteria for sparsity, locality, and reconstruction is then optimized to derive a novel representation, which is an approximation to the ideal representation. The novel representation is further processed by two classifiers, namely, an LLK-based classifier and a locally linear nearest mean-based classifier, for visual recognition. The proposed classifiers are shown to connect to the Bayes decision rule for minimum error. Additional new theoretical analysis is presented, such as the nonnegative constraint, the group regularization, and the computational efficiency of the proposed LLK method. New methods such as a shifted power transformation for improving reliability, a coefficients' truncating method for enhancing generalization, and an improved marginal Fisher analysis method for feature extraction are proposed to further improve visual recognition performance. Extensive experiments are implemented to evaluate the proposed LLK method for robust visual recognition. In particular, eight representative data sets are applied for assessing the performance of the LLK method for various visual recognition applications, such as action recognition, scene recognition, object recognition, and face recognition.

  4. An integrated classifier for computer-aided diagnosis of colorectal polyps based on random forest and location index strategies

    NASA Astrophysics Data System (ADS)

    Hu, Yifan; Han, Hao; Zhu, Wei; Li, Lihong; Pickhardt, Perry J.; Liang, Zhengrong

    2016-03-01

    Feature classification plays an important role in differentiation or computer-aided diagnosis (CADx) of suspicious lesions. As a widely used ensemble learning algorithm for classification, random forest (RF) has a distinguished performance for CADx. Our recent study has shown that the location index (LI), which is derived from the well-known kNN (k nearest neighbor) and wkNN (weighted k nearest neighbor) classifier [1], has also a distinguished role in the classification for CADx. Therefore, in this paper, based on the property that the LI will achieve a very high accuracy, we design an algorithm to integrate the LI into RF for improved or higher value of AUC (area under the curve of receiver operating characteristics -- ROC). Experiments were performed by the use of a database of 153 lesions (polyps), including 116 neoplastic lesions and 37 hyperplastic lesions, with comparison to the existing classifiers of RF and wkNN, respectively. A noticeable gain by the proposed integrated classifier was quantified by the AUC measure.
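    The difference between the kNN and wkNN classifiers named in this abstract can be shown with a small sketch. The abstract does not state which weighting the authors use, so inverse-distance weighting is an assumption here, as are the function name, the labels "pos" and "neg", and the one-dimensional toy data.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train_x, train_y, query, k=3, eps=1e-12):
    """Each of the k nearest neighbours votes with weight 1 / (distance + eps)."""
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    scores = defaultdict(float)
    for dist, label in nearest:
        scores[label] += 1.0 / (dist + eps)  # eps avoids division by zero
    return max(scores, key=scores.get)


# Toy data: one close "pos" sample and two distant "neg" samples.
train_x = [(0.0,), (2.9,), (3.0,)]
train_y = ["pos", "neg", "neg"]
wknn_label = weighted_knn_predict(train_x, train_y, (0.5,), k=3)
```

    With k=3 a plain majority vote would return "neg" (two votes against one), whereas the weighted vote favours the single close "pos" sample, which is the behaviour that distinguishes the weighted variant.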

  5. Spatial clustering of dark matter haloes: secondary bias, neighbour bias, and the influence of massive neighbours on halo properties

    DOE PAGES

    Salcedo, Andres N.; Maller, Ariyeh H.; Berlind, Andreas A.; ...

    2018-01-15

    Here, we explore the phenomenon commonly known as halo assembly bias, whereby dark matter haloes of the same mass are found to be more or less clustered when a second halo property is considered, for haloes in the mass range 3.7 × 10^11 to 5.0 × 10^13 h^-1 M⊙. Using the Large Suite of Dark Matter Simulations (LasDamas) we consider nine commonly used halo properties and find that a clustering bias exists if haloes are binned by mass or by any other halo property. This secondary bias implies that no single halo property encompasses all the spatial clustering information of the halo population. The mean values of some halo properties depend on their halo's distance to a more massive neighbour. Halo samples selected by having high values of one of these properties therefore inherit a neighbour bias such that they are much more likely to be close to a much more massive neighbour. This neighbour bias largely accounts for the secondary bias seen in haloes binned by mass and split by concentration or age. However, haloes binned by other mass-like properties still show a secondary bias even when the neighbour bias is removed. The secondary bias of haloes selected by their spin behaves differently than that for other halo properties, suggesting that the origin of the spin bias is different than of other secondary biases.

  6. Spatial clustering of dark matter haloes: secondary bias, neighbour bias, and the influence of massive neighbours on halo properties

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Salcedo, Andres N.; Maller, Ariyeh H.; Berlind, Andreas A.

    Here, we explore the phenomenon commonly known as halo assembly bias, whereby dark matter haloes of the same mass are found to be more or less clustered when a second halo property is considered, for haloes in the mass range 3.7 × 10^11 to 5.0 × 10^13 h^-1 M⊙. Using the Large Suite of Dark Matter Simulations (LasDamas) we consider nine commonly used halo properties and find that a clustering bias exists if haloes are binned by mass or by any other halo property. This secondary bias implies that no single halo property encompasses all the spatial clustering information of the halo population. The mean values of some halo properties depend on their halo's distance to a more massive neighbour. Halo samples selected by having high values of one of these properties therefore inherit a neighbour bias such that they are much more likely to be close to a much more massive neighbour. This neighbour bias largely accounts for the secondary bias seen in haloes binned by mass and split by concentration or age. However, haloes binned by other mass-like properties still show a secondary bias even when the neighbour bias is removed. The secondary bias of haloes selected by their spin behaves differently than that for other halo properties, suggesting that the origin of the spin bias is different than of other secondary biases.

  7. Synergetic interaction between neighbouring platinum monomers in CO2 hydrogenation

    NASA Astrophysics Data System (ADS)

    Li, Hongliang; Wang, Liangbing; Dai, Yizhou; Pu, Zhengtian; Lao, Zhuohan; Chen, Yawei; Wang, Menglin; Zheng, Xusheng; Zhu, Junfa; Zhang, Wenhua; Si, Rui; Ma, Chao; Zeng, Jie

    2018-05-01

    Exploring the interaction between two neighbouring monomers has great potential to significantly raise the performance and deepen the mechanistic understanding of heterogeneous catalysis. Herein, we demonstrate that the synergetic interaction between neighbouring Pt monomers on MoS2 greatly enhanced the CO2 hydrogenation catalytic activity and reduced the activation energy relative to isolated monomers. Neighbouring Pt monomers were achieved by increasing the Pt mass loading up to 7.5% while maintaining the atomic dispersion of Pt. Mechanistic studies reveal that neighbouring Pt monomers not only worked in synergy to vary the reaction barrier, but also underwent distinct reaction paths compared with isolated monomers. Isolated Pt monomers favour the conversion of CO2 into methanol without the formation of formic acid, whereas CO2 is hydrogenated stepwise into formic acid and methanol for neighbouring Pt monomers. The discovery of the synergetic interaction between neighbouring monomers may create a new path for manipulating catalytic properties.

  8. Emotional modelling and classification of a large-scale collection of scene images in a cluster environment

    PubMed Central

    Li, Yanfei; Tian, Yun

    2018-01-01

    The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating

  9. Emotional modelling and classification of a large-scale collection of scene images in a cluster environment.

    PubMed

    Cao, Jianfang; Li, Yanfei; Tian, Yun

    2018-01-01

    The development of network technology and the popularization of image capturing devices have led to a rapid increase in the number of digital images available, and it is becoming increasingly difficult to identify a desired image from among the massive number of possible images. Images usually contain rich semantic information, and people usually understand images at a high semantic level. Therefore, achieving the ability to use advanced technology to identify the emotional semantics contained in images to enable emotional semantic image classification remains an urgent issue in various industries. To this end, this study proposes an improved OCC emotion model that integrates personality and mood factors for emotional modelling to describe the emotional semantic information contained in an image. The proposed classification system integrates the k-Nearest Neighbour (KNN) algorithm with the Support Vector Machine (SVM) algorithm. The MapReduce parallel programming model was used to adapt the KNN-SVM algorithm for parallel implementation in the Hadoop cluster environment, thereby achieving emotional semantic understanding for the classification of a massive collection of images. For training and testing, 70,000 scene images were randomly selected from the SUN Database. The experimental results indicate that users with different personalities show overall consistency in their emotional understanding of the same image. For a training sample size of 50,000, the classification accuracies for different emotional categories targeted at users with different personalities were approximately 95%, and the training time was only 1/5 of that required for the corresponding algorithm with a single-node architecture. Furthermore, the speedup of the system also showed a linearly increasing tendency. Thus, the experiments achieved a good classification effect and can lay a foundation for classification in terms of additional types of emotional image semantics, thereby demonstrating

  10. Bimodal spectroscopic evaluation of ultra violet-irradiated mouse skin inflammatory and precancerous stages: instrumentation, spectral feature extraction/selection and classification (k-NN, LDA and SVM)

    NASA Astrophysics Data System (ADS)

    Díaz-Ayil, G.; Amouroux, M.; Blondel, W. C. P. M.; Bourg-Heckly, G.; Leroux, A.; Guillemin, F.; Granjon, Y.

    2009-07-01

    This paper deals with the development and application of in vivo spatially-resolved bimodal spectroscopy (AutoFluorescence AF and Diffuse Reflectance DR), to discriminate various stages of skin precancer in a preclinical model (UV-irradiated mouse): Compensatory Hyperplasia CH, Atypical Hyperplasia AH and Dysplasia D. A programmable instrumentation was developed for acquiring AF emission spectra using 7 excitation wavelengths: 360, 368, 390, 400, 410, 420 and 430 nm, and DR spectra in the 390-720 nm wavelength range. After various steps of intensity spectra preprocessing (filtering, spectral correction and intensity normalization), several sets of spectral characteristics were extracted and selected based on their discrimination power statistically tested for every pair-wise comparison of histological classes. Data reduction with Principal Components Analysis (PCA) was performed and 3 classification methods were implemented (k-NN, LDA and SVM), in order to compare diagnostic performance of each method. Diagnostic performance was studied and assessed in terms of sensitivity (Se) and specificity (Sp) as a function of the selected features, of the combinations of 3 different inter-fiber distances and of the numbers of principal components, such that: Se and Sp ≈ 100% when discriminating CH vs. others; Sp ≈ 100% and Se > 95% when discriminating Healthy vs. AH or D; Sp ≈ 74% and Se ≈ 63% for AH vs. D.
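    For readers unfamiliar with the Se and Sp figures quoted in this abstract, the sketch below shows how the two quantities are computed for one pair-wise comparison. The label lists are invented for illustration and do not reproduce the study's data; the function name is hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive):
    """Return (Se, Sp) with one class treated as positive and the rest as negative."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = sum(t != positive and p != positive for t, p in pairs)  # true negatives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Invented labels for an AH vs. D comparison, with AH taken as the positive class.
y_true = ["AH", "AH", "AH", "AH", "D", "D", "D", "D"]
y_pred = ["AH", "AH", "AH", "D", "D", "D", "D", "AH"]
se, sp = sensitivity_specificity(y_true, y_pred, positive="AH")
```

    Se is the fraction of true positives that are detected and Sp the fraction of true negatives that are correctly rejected, so the two can differ widely for the same classifier, as the AH vs. D figures in the abstract show.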

  11. Reverse Nearest Neighbor Search on a Protein-Protein Interaction Network to Infer Protein-Disease Associations.

    PubMed

    Suratanee, Apichat; Plaimas, Kitiporn

    2017-01-01

    The associations between proteins and diseases are crucial information for investigating pathological mechanisms. However, the number of known and reliable protein-disease associations is quite small. In this study, an analysis framework to infer associations between proteins and diseases was developed based on a large data set of a human protein-protein interaction network integrating an effective network search, namely, the reverse k-nearest neighbor (RkNN) search. The RkNN search was used to identify an impact of a protein on other proteins. Then, associations between proteins and diseases were inferred statistically. The method using the RkNN search yielded a much higher precision than a random selection, standard nearest neighbor search, or when applying the method to a random protein-protein interaction network. All protein-disease pair candidates were verified by a literature search. Supporting evidence for 596 pairs was identified. In addition, cluster analysis of these candidates revealed 10 promising groups of diseases to be further investigated experimentally. This method can be used to identify novel associations to better understand complex relationships between proteins and diseases.
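    The reverse k-nearest neighbor idea can be sketched as follows: the reverse neighbours of a query node are the nodes that count the query among their own k nearest neighbours. This is a brute-force illustration rather than the authors' framework; the unweighted toy graph, the use of hop count as distance, the tie-breaking by node name, and all function names are assumptions.

```python
from collections import deque


def hop_distances(graph, source):
    """Breadth-first search: number of edges from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def k_nearest(graph, node, k):
    """The k closest other nodes; ties are broken by node name (arbitrary choice)."""
    dist = hop_distances(graph, node)
    others = sorted((d, n) for n, d in dist.items() if n != node)
    return {n for _, n in others[:k]}


def reverse_k_nearest(graph, query, k):
    """Nodes that have query among their own k nearest neighbours."""
    return {n for n in graph if n != query and query in k_nearest(graph, n, k)}


# Toy chain A - B - C - D - E standing in for an interaction network.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
rknn_of_b = reverse_k_nearest(graph, "B", k=1)
```

    In this toy graph the single nearest neighbour of B is A (by the tie-break), yet both A and C have B as their nearest neighbour, which illustrates that the reverse search is not symmetric with the ordinary nearest-neighbour search.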

  12. Surgical outcomes of a civil war in a neighbouring country.

    PubMed

    Akkucuk, Seckin; Aydogan, A; Yetim, I; Ugur, M; Oruc, C; Kilic, E; Paltaci, I; Kaplan, A; Temiz, M

    2016-08-01

    The civil war in Syria began on 15 March 2011, and many of the injured were treated in the neighbouring country of Turkey. This study reports the surgical outcomes of this war, in a tertiary centre in Turkey. 159 patients with civilian war injuries in Syria who were admitted to the General Surgery Department in the Research and Training Hospital of the Medical School of Mustafa Kemal University, Hatay, Turkey, between 2011 and 2012 were analysed regarding the age, sex, injury type, history of previous surgery for the injury, types of abdominal injuries (solid or luminal organ), the status of isolated abdominal injuries or multiple injuries, mortality, length of hospital stay and injury severity scoring. The median age of the patients was 30.05 years (18-66 years). Most of the injuries were gunshot wounds (99 of 116 patients, 85.3%). Primary and previously operated patients were transferred to our clinic in a median time of 6.28±4.44 h and 58.11±44.08 h, respectively. Most of the patients had intestinal injuries; although a limited number of patients with colorectal injuries were treated with primary repair, stoma was the major surgical option due to the gross peritoneal contamination secondary to prolonged transport time. Two women and 21 men died. The major cause of death was multiorgan failure secondary to sepsis (18 patients). In the case of civil war in the bordering countries, it is recommended that precautions are taken, such as transformation of nearby civilian hospitals into military ones and employment of experienced trauma surgeons in these hospitals to provide effective medical care. Damage control procedures can avoid fatalities especially before the lethal triad of physiological demise occurs. Rapid transport of the wounded to the nearest medical centre is the key point in countries neighbouring a civil war. Published by the BMJ Publishing Group Limited.

  13. Activity of the Seyfert galaxy neighbours

    NASA Astrophysics Data System (ADS)

    Koulouridis, E.; Plionis, M.; Chavushyan, V.; Dultzin, D.; Krongold, Y.; Georgantopoulos, I.; León-Tavares, J.

    2013-04-01

    We present a follow-up study of a series of papers concerning the role of close interactions as a possible triggering mechanism of AGN activity. We have already studied the close (≤100 h⁻¹ kpc) and the large-scale (≤1 h⁻¹ Mpc) environment of a local sample of Sy1, Sy2, and bright IRAS galaxies (BIRG) and of their respective control samples. The results led us to the conclusion that a close encounter appears capable of activating a sequence where an absorption line galaxy (ALG) first becomes a starburst, then a Sy2, and finally a Sy1. Here we investigate the activity of neighbouring galaxies of different types of AGN, since both galaxies of an interacting pair should be affected. To this end we present the optical spectroscopy and X-ray imaging of 30 neighbouring galaxies around two local (z ≲ 0.034) samples of 10 Sy1 and 13 Sy2 galaxies. Although this is a pilot study of a small sample, various interesting trends have been discovered that imply physical mechanisms that may lead to different Seyfert types. Based on the optical spectroscopy, we find that more than 70% of all neighbouring galaxies exhibit star forming and/or nuclear activity (namely recent star formation and/or AGN), while an additional X-ray analysis showed that this percentage might be significantly higher. Furthermore, we find a statistically significant correlation, at a 99.9% level, between the value of the neighbour's [OIII]/Hβ ratio and the activity type of the central active galaxy, i.e. the neighbours of Sy2 galaxies are systematically more ionized than the neighbours of Sy1s. This result, in combination with trends found using the Equivalent Width of the Hα emission line and the stellar population synthesis code STARLIGHT, indicates differences in the stellar mass, metallicity, and star formation history between the samples. Our results point towards a link between close galaxy interactions and activity and also provide more clues regarding the possible evolutionary sequence

  14. Emotional State Classification in Virtual Reality Using Wearable Electroencephalography

    NASA Astrophysics Data System (ADS)

    Suhaimi, N. S.; Teo, J.; Mountstephens, J.

    2018-03-01

    This paper presents the classification of emotions from EEG signals. One of the key issues in this research is the lack of mental classification using VR as the medium to stimulate emotion. The approach taken in this research uses K-nearest neighbor (KNN) and Support Vector Machine (SVM) classifiers. Firstly, each participant is required to wear the EEG headset, and their brainwaves are recorded while they are immersed in the VR environment. The data points are then marked if the participants showed any physical signs of emotion or by observing the brainwave pattern. Secondly, the data are trained and tested with the KNN and SVM algorithms. The accuracy achieved by both methods was approximately 82% throughout the brainwave spectrum (α, β, γ, δ, θ). These methods showed promising results and will be further enhanced using other machine learning approaches in VR stimulus.

  15. Extending bicluster analysis to annotate unclassified ORFs and predict novel functional modules using expression data

    PubMed Central

    Bryan, Kenneth; Cunningham, Pádraig

    2008-01-01

    Background: Microarrays have the capacity to measure the expressions of thousands of genes in parallel over many experimental samples. The unsupervised classification technique of bicluster analysis has been employed previously to uncover gene expression correlations over subsets of samples with the aim of providing a more accurate model of the natural gene functional classes. This approach also has the potential to aid functional annotation of unclassified open reading frames (ORFs). Until now this aspect of biclustering has been under-explored. In this work we illustrate how bicluster analysis may be extended into a 'semi-supervised' ORF annotation approach referred to as BALBOA. Results: The efficacy of the BALBOA ORF classification technique is first assessed via cross validation and compared to a multi-class k-Nearest Neighbour (kNN) benchmark across three independent gene expression datasets. BALBOA is then used to assign putative functional annotations to unclassified yeast ORFs. These predictions are evaluated using existing experimental and protein sequence information. Lastly, we employ a related semi-supervised method to predict the presence of novel functional modules within yeast. Conclusion: In this paper we demonstrate how unsupervised classification methods, such as bicluster analysis, may be extended using available annotations to form semi-supervised approaches within the gene expression analysis domain. We show that such methods have the potential to improve upon supervised approaches and shed new light on the functions of unclassified ORFs and their co-regulation. PMID:18831786

  16. Response to displaced neighbours in a territorial songbird with a large repertoire

    NASA Astrophysics Data System (ADS)

    Briefer, Elodie; Aubin, Thierry; Rybak, Fanny

    2009-09-01

    Neighbour recognition allows territory owners to modulate their territorial response according to the threat posed by each neighbour and thus to reduce the costs associated with territorial defence. Individual acoustic recognition of neighbours has been shown in numerous bird species, but few of them had a large repertoire. Here, we tested individual vocal recognition in a songbird with a large repertoire, the skylark Alauda arvensis. We first examined the physical basis for recognition in the song, and we then experimentally tested recognition by playing back songs of adjacent neighbours and strangers. Males showed a lower territorial response to adjacent neighbours than to strangers when we broadcast songs from the shared boundary. However, when we broadcast songs from the opposite boundary, males showed a similar response to neighbours and strangers, indicating a spatial categorisation of adjacent neighbours’ songs. Acoustic analyses revealed that males could potentially use the syntactical arrangement of syllables in sequences to identify the songs of their neighbours. Neighbour interactions in skylarks are thus subtle relationships that can be modulated according to the spatial position of each neighbour.

  17. Bringing Proximate Neighbours into the Study of US Residential Segregation

    PubMed Central

    Friedman, Samantha

    2011-01-01

    The race and ethnicity of neighbours are thought to be critical in shaping household mobility underlying residential segregation. However, studies on this topic have used data at the census-tract level of analysis rather than at the proximate-neighbour level. Using a non-publicly available version of the neighbour-cluster sample within the American Housing Survey, this study incorporates data on the race, ethnicity and socioeconomic characteristics of the proximate neighbours of White, Black and Latino households and examines their impact on household residential satisfaction, out- and in-mobility. Results indicate that proximate-neighbour race and ethnicity matter in influencing endpoints of the mobility process and do not necessarily parallel those at the census-tract level. Implications of these findings are discussed as they relate to the study of residential segregation. PMID:21544258

  18. Ising lattices with +/-J second-nearest-neighbor interactions

    NASA Astrophysics Data System (ADS)

    Ramírez-Pastor, A. J.; Nieto, F.; Vogel, E. E.

    1997-06-01

    Second-nearest-neighbor interactions are added to the usual nearest-neighbor Ising Hamiltonian for square lattices in different ways. The starting point is a square lattice where half the nearest-neighbor interactions are ferromagnetic and the other half of the bonds are antiferromagnetic. Then, second-nearest-neighbor interactions can also be assigned randomly or in a variety of causal manners determined by the nearest-neighbor interactions. In the present paper we consider three causal and three random ways of assigning second-nearest-neighbor exchange interactions. Several ground-state properties are then calculated for each of these lattices: energy per bond ε_g, site correlation parameter p_g, maximal magnetization μ_g, and fraction of unfrustrated bonds h_g. A set of 500 samples is considered for each size N (number of spins) and array (way of distributing the N spins). The properties of the original lattices with only nearest-neighbor interactions are already known, which makes it possible to assess the effect of the additional interactions. We also include cubic lattices to discuss the distinction between coordination number and dimensionality. Comparison with results for triangular and honeycomb lattices is done at specific points.

  19. Acetobacter sicerae sp. nov., isolated from cider and kefir, and identification of species of the genus Acetobacter by dnaK, groEL and rpoB sequence analysis.

    PubMed

    Li, Leilei; Wieme, Anneleen; Spitaels, Freek; Balzarini, Tom; Nunes, Olga C; Manaia, Célia M; Van Landschoot, Anita; De Vuyst, Luc; Cleenwerck, Ilse; Vandamme, Peter

    2014-07-01

    Five acetic acid bacteria isolates, awK9_3, awK9_4 ( = LMG 27543), awK9_5 ( = LMG 28092), awK9_6 and awK9_9, obtained during a study of micro-organisms present in traditionally produced kefir, were grouped on the basis of their MALDI-TOF MS profile with LMG 1530 and LMG 1531(T), two strains currently classified as members of the genus Acetobacter. Phylogenetic analysis based on nearly complete 16S rRNA gene sequences as well as on concatenated partial sequences of the housekeeping genes dnaK, groEL and rpoB indicated that these isolates were representatives of a single novel species together with LMG 1530 and LMG 1531(T) in the genus Acetobacter, with Acetobacter aceti, Acetobacter nitrogenifigens, Acetobacter oeni and Acetobacter estunensis as nearest phylogenetic neighbours. Pairwise similarity of 16S rRNA gene sequences between LMG 1531(T) and the type strains of the above-mentioned species were 99.7%, 99.1%, 98.4% and 98.2%, respectively. DNA-DNA hybridizations confirmed that status, while amplified fragment length polymorphism (AFLP) and random amplified polymorphic DNA (RAPD) data indicated that LMG 1531(T), LMG 1530, LMG 27543 and LMG 28092 represent at least two different strains of the novel species. The major fatty acid of LMG 1531(T) and LMG 27543 was C18 : 1ω7c. The major ubiquinone present was Q-9 and the DNA G+C contents of LMG 1531(T) and LMG 27543 were 58.3 and 56.7 mol%, respectively. The strains were able to grow on D-fructose and D-sorbitol as a single carbon source. They were also able to grow on yeast extract with 30% D-glucose and on standard medium with pH 3.6 or containing 1% NaCl. They had a weak ability to produce acid from d-arabinose. These features enabled their differentiation from their nearest phylogenetic neighbours. The name Acetobacter sicerae sp. nov. is proposed with LMG 1531(T) ( = NCIMB 8941(T)) as the type strain. © 2014 IUMS.

  20. Classification of diabetic retinopathy using fractal dimension analysis of eye fundus image

    NASA Astrophysics Data System (ADS)

    Safitri, Diah Wahyu; Juniati, Dwi

    2017-08-01

    Diabetes Mellitus (DM) is a metabolic disorder in which the pancreas produces inadequate insulin or the body resists insulin action, so that the blood glucose level is high. One of the most common complications of diabetes mellitus is diabetic retinopathy, which can lead to vision problems. Diabetic retinopathy can be recognized by abnormalities in the eye fundus. Those abnormalities are characterized by microaneurysms, hemorrhage, hard exudate, cotton wool spots, and venous changes. Diabetic retinopathy is graded according to the abnormalities in the eye fundus: grade 1 if there is only a microaneurysm in the eye fundus; grade 2 if there are a microaneurysm and a hemorrhage; and grade 3 if there are microaneurysm, hemorrhage, and neovascularization. This study proposes a method for processing eye fundus images to classify diabetic retinopathy using fractal analysis and K-Nearest Neighbor (KNN). The first phase was image segmentation using the green channel, CLAHE, morphological opening, matched filter, masking, and morphological opening of the binary image. After segmentation, the fractal dimension was calculated using the box-counting method, and the fractal dimension values were analyzed to classify diabetic retinopathy. Tests were carried out using k-fold cross validation with k=5. Each test used 10 different values of K for KNN. The accuracy of this method is 89.17% with K=3 or K=4, the best result among the K values tested. Based on these results, it can be concluded that the classification of diabetic retinopathy using fractal analysis and KNN performed well.
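
    The box-counting step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a binary image given as a list of rows and a hypothetical set of box sizes; the function names are illustrative and are not taken from the paper, and the segmentation pipeline (green channel, CLAHE, matched filter) is not reproduced here.

```python
import math


def box_count(pixels, size):
    """Count the boxes of side `size` that contain at least one foreground pixel."""
    return len({(r // size, c // size) for r, c in pixels})


def box_counting_dimension(image, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension as the slope of log N(s) against log(1/s)."""
    # Assumes a non-empty binary image (at least one nonzero pixel).
    pixels = [(r, c) for r, row in enumerate(image) for c, v in enumerate(row) if v]
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(pixels, s)) for s in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope through the log-log points.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Toy binary images (hypothetical, not fundus data).
n = 64
filled_square = [[1] * n for _ in range(n)]
diagonal_line = [[1 if r == c else 0 for c in range(n)] for r in range(n)]

print(round(box_counting_dimension(filled_square), 3))
print(round(box_counting_dimension(diagonal_line), 3))
```

    On these toy inputs the estimate is close to 2 for the filled square and close to 1 for the diagonal line; a segmented vessel image would be expected to fall between the two, and that value could then serve as a KNN feature.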

  1. Current and emerging operational uses of remote sensing in Swedish forestry

    Treesearch

    Hakan Olsson; Mikael Egberth; Jonas Engberg; Johan E.S. Fransson; Tina Granqvist Pahlen; et al.

    2007-01-01

    Satellite remote sensing is being used operationally by Swedish authorities in applications involving, for example, change detection of clear felled areas, use of k-Nearest Neighbour estimates of forest parameters, and post-stratification (in combination with National Forest Inventory plots). For forest management planning of estates, aerial...

  2. The nearest relative in mental health law.

    PubMed

    Andoh, Benjamin; Gogo, Emmanuel

    2004-04-01

    This article considers the concept of the 'nearest relative' in mental health law in England and Wales and argues, inter alia, for its retention in a way that avoids violation of the European Convention on Human Rights and the Human Rights Act 1998. It looks, first, at the meaning of nearest relative and then focuses on his/her role today, including its link with advance directives for mental health care, and on the tension between nearest relatives and approved social workers and the law. The problem exposed by JT v. United Kingdom in relation to the Human Rights Act 1998 and its implications for the future are considered. The impact of the Mental Health Bill (2002) on the nearest relative is discussed and recommendations to improve the present law are then suggested.

  3. Supervised novelty detection in brain tissue classification with an application to white matter hyperintensities

    NASA Astrophysics Data System (ADS)

    Kuijf, Hugo J.; Moeskops, Pim; de Vos, Bob D.; Bouvy, Willem H.; de Bresser, Jeroen; Biessels, Geert Jan; Viergever, Max A.; Vincken, Koen L.

    2016-03-01

    Novelty detection is concerned with identifying test data that differs from the training data of a classifier. In the case of brain MR images, pathology or imaging artefacts are examples of untrained data. In this proof-of-principle study, we measure the behaviour of a classifier during the classification of trained labels (i.e. normal brain tissue). Next, we devise a measure that distinguishes normal classifier behaviour from abnormal behaviour that occurs in the case of a novelty. This is evaluated by training a kNN classifier on normal brain tissue, applying it to images with an untrained pathology (white matter hyperintensities (WMH)), and determining whether our measure is able to identify abnormal classifier behaviour at WMH locations. For our kNN classifier, behaviour is modelled as the mean, median, or q1 distance to the k nearest points. Healthy tissue was trained on 15 images; classifier behaviour was trained/tested on 5 images with leave-one-out cross-validation. For each trained class, we measure the distribution of mean/median/q1 distances to the k nearest points. Next, for each test voxel, we compute its Z-score with respect to the measured distribution of its predicted label. We consider a Z-score >=4 abnormal behaviour of the classifier, having a probability due to chance of 0.000032. Our measure identified >90% of WMH volume and also highlighted other non-trained findings, predominantly vessels, cerebral falx, brain mask errors, and choroid plexus. This measure is generalizable to other classifiers and might help in detecting unexpected findings or novelties by measuring classifier behaviour.
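
    The Z-score rule described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: a single trained class, 2-D toy points instead of voxel features, the mean distance variant only, and calibration by leave-one-out over the reference points themselves. The names `is_novel`, `K` and `THRESHOLD` are hypothetical.

```python
import math
import statistics


def mean_knn_distance(point, reference, k):
    """Mean Euclidean distance from `point` to its k nearest reference points."""
    dists = sorted(math.dist(point, ref) for ref in reference)
    return sum(dists[:k]) / k


def z_score(value, sample):
    """Z-score of `value` relative to the mean and standard deviation of `sample`."""
    return (value - statistics.mean(sample)) / statistics.stdev(sample)


def upper_tail_probability(z):
    """One-sided probability that a standard normal variable exceeds z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


K = 3
THRESHOLD = 4.0  # Z-score cut-off quoted in the abstract

# Toy "normal tissue" feature points on a regular grid (hypothetical data).
reference = [(float(x), float(y)) for x in range(5) for y in range(5)]

# Distribution of normal classifier behaviour, measured leave-one-out.
calibration = [
    mean_knn_distance(p, [q for q in reference if q != p], K) for p in reference
]


def is_novel(point):
    """Flag a point whose kNN distance is at least THRESHOLD deviations above normal."""
    return z_score(mean_knn_distance(point, reference, K), calibration) >= THRESHOLD


print(is_novel((10.0, 10.0)), is_novel((2.5, 2.5)))
print(upper_tail_probability(THRESHOLD))
```

    The one-sided normal tail at Z = 4 evaluates to roughly 3.2e-5, which agrees with the chance probability of 0.000032 quoted above.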

  4. Takahasi Nearest-Neighbour Gas Revisited II: Morse Gases

    NASA Astrophysics Data System (ADS)

    Matsumoto, Akira

    2011-12-01

    Some thermodynamic quantities for the Morse potential are analytically evaluated at an isobaric process. The parameters of Morse gases for 21 substances are obtained by the second virial coefficient data and the spectroscopic data of diatomic molecules. Also some thermodynamic quantities for water are calculated numerically and drawn graphically. The inflexion point of the length L which depends on temperature T and pressure P corresponds physically to a boiling point. L indicates the liquid phase from lower temperature to the inflexion point and the gaseous phase from the inflexion point to higher temperature. The boiling temperatures indicate reasonable values compared with experimental data. The behaviour of L suggests a chance of a first-order phase transition in one dimension.

  5. Reweighting anthropometric data using a nearest neighbour approach.

    PubMed

    Kumar, Kannan Anil; Parkinson, Matthew B

    2018-07-01

    When designing products and environments, detailed data on body size and shape are seldom available for the specific user population. One way to mitigate this issue is to reweight available data such that they provide an accurate estimate of the target population of interest. This is done by assigning a statistical weight to each individual in the reference data, increasing or decreasing their influence on statistical models of the whole. This paper presents a new approach to reweighting these data. Instead of stratified sampling, the proposed method uses a clustering algorithm to identify relationships between the detailed and reference populations using their height, mass, and body mass index (BMI). The newly weighted data are shown to provide more accurate estimates than traditional approaches. The improved accuracy that accompanies this method provides designers with an alternative to data synthesis techniques as they seek appropriate data to guide their design practice. Practitioner Summary: Design practice is best guided by data on body size and shape that accurately represents the target user population. This research presents an alternative to data synthesis (e.g. regression or proportionality constants) for adapting data from one population for use in modelling another.

  6. Breathing Pattern Interpretation as an Alternative and Effective Voice Communication Solution.

    PubMed

    Elsahar, Yasmin; Bouazza-Marouf, Kaddour; Kerr, David; Gaur, Atul; Kaushik, Vipul; Hu, Sijung

    2018-05-15

    Augmentative and alternative communication (AAC) systems tend to rely on the interpretation of purposeful gestures for interaction. Existing AAC methods could be cumbersome and limit the solutions in terms of versatility. The study aims to interpret breathing patterns (BPs) to converse with the outside world by means of a unidirectional microphone and researches breathing-pattern interpretation (BPI) to encode messages in an interactive manner with minimal training. We present BP processing work with (1) output synthesized machine-spoken words (SMSW) along with single-channel Wiener filtering (WF) for signal de-noising, and (2) k-nearest neighbor (k-NN) classification of BPs associated with embedded dynamic time warping (DTW). An approved protocol to collect analogue modulated BP sets belonging to 4 distinct classes with 10 training BPs per class and 5 live BPs per class was implemented with 23 healthy subjects. An 86% accuracy of k-NN classification was obtained with decreasing error rates of 17%, 14%, and 11% for the live classifications of classes 2, 3, and 4, respectively. The results express a systematic reliability of 89% with increased familiarity. The outcomes from the current AAC setup recommend a durable engineering solution directly beneficial to the sufferers.
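
    The combination of k-NN with dynamic time warping named in this abstract can be sketched in a few lines. This is a generic textbook DTW with an absolute-difference cost and a majority vote, not the authors' implementation; the toy patterns and the labels "pulse" and "ramp" are hypothetical.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def knn_classify(query, training, k=1):
    """Majority vote among the k training patterns closest to `query` under DTW."""
    ranked = sorted(training, key=lambda item: dtw_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy amplitude envelopes (hypothetical, not recorded breathing data).
training = [
    ([0, 1, 0], "pulse"),
    ([0, 1, 2, 3], "ramp"),
]
query = [0, 0, 1, 1, 0]  # a time-stretched pulse

print(dtw_distance(query, training[0][0]))
print(knn_classify(query, training))
```

    Because DTW allows samples to be repeated along the alignment, the stretched query matches the short pulse at zero cost, which is the property that makes it suitable for patterns produced at varying speeds.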

  7. Losing Wallets, Retaining Trust? The Relationship Between Ethnic Heterogeneity and Trusting Coethnic and Non-coethnic Neighbours and Non-neighbours to Return a Lost Wallet.

    PubMed

    Tolsma, J; van der Meer, T W G

    2017-01-01

    The constrict claim that ethnic heterogeneity drives down social trust has been empirically tested across the globe. Meta-analyses suggest that neighbourhood ethnic heterogeneity generally undermines ties within the neighbourhood (such as trust in neighbours), but concurrently has an inconsistent or even positive effect on interethnic ties (such as outgroup trust). While the composition of the living environment thus often seems to matter, when and where remain unclear. We contribute to the literature by: (1) scrutinizing the extent to which ethnic heterogeneity drives down trust in coethnic neighbours, non-coethnic neighbours, unknown neighbours and unknown non-neighbours similarly; (2) comparing effects of heterogeneity aggregated to geographical areas that vary in scale and type of boundary; and (3) assessing whether the impact of heterogeneity of the local area depends on the wider geographic context. We test our hypotheses on the Religion in Dutch Society 2011-2012 dataset, supplemented with uniquely detailed GIS-data of Statistics Netherlands. Our dependent variables are four different so-called wallet-items, which we model through spatial and multilevel regression techniques. We demonstrate that both trust in non-coethnic and coethnic neighbours are lower in heterogeneous environments. Trust in people outside the neighbourhood is not affected by local heterogeneity. Measures of heterogeneity aggregated to relatively large scales, such as, administrative municipalities and egohoods with a 4000 m radius, demonstrate the strongest negative relationships with our trust indicators.

  8. A gap-filling model for eddy covariance latent heat flux: Estimating evapotranspiration of a subtropical seasonal evergreen broad-leaved forest as an example

    NASA Astrophysics Data System (ADS)

    Chen, Yi-Ying; Chu, Chia-Ren; Li, Ming-Hsu

    2012-10-01

    Summary: In this paper we present a semi-parametric multivariate gap-filling model for tower-based measurement of latent heat flux (LE). Two statistical techniques, the principal component analysis (PCA) and a nonlinear interpolation approach, were integrated into this LE gap-filling model. The PCA was first used to resolve the multicollinearity relationships among various environmental variables, including radiation, soil moisture deficit, leaf area index, wind speed, etc. Two nonlinear interpolation methods, multiple regressions (MRS) and the K-nearest neighbors (KNNs), were examined with randomly selected flux gaps for both clear sky and nighttime/cloudy data for incorporation into this LE gap-filling model. Experimental results indicated that the KNN interpolation approach is able to provide consistent LE estimations, while MRS overestimates during nighttime/cloudy conditions. Rather than using empirical regression parameters, the KNN approach resolves the nonlinear relationship between the gap-filled LE flux and principal components with adaptive K values under different atmospheric states. The developed LE gap-filling model (PCA with KNN) works with an RMSE of 2.4 W m⁻² (~0.09 mm day⁻¹) at a weekly time scale when 40% artificial flux gaps are added to the original dataset. Annual evapotranspiration at this study site was estimated at 736 mm (1803 MJ) and 728 mm (1785 MJ) for 2008 and 2009, respectively.
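
    The KNN interpolation idea in this abstract can be sketched as a nearest-neighbour average over driver variables. This is a deliberately reduced illustration: the PCA step and the adaptive choice of K are omitted, K is fixed, and the record layout (a tuple of drivers plus a flux value or `None`) is an assumption made for the example.

```python
import math


def knn_fill(records, k=3):
    """Replace missing (None) targets by the mean target of the k nearest complete records."""
    complete = [(x, y) for x, y in records if y is not None]
    filled = []
    for x, y in records:
        if y is None:
            # Rank complete records by Euclidean distance in driver space.
            nearest = sorted(complete, key=lambda item: math.dist(x, item[0]))[:k]
            y = sum(target for _, target in nearest) / len(nearest)
        filled.append((x, y))
    return filled


# Toy records: (driver vector, flux); None marks a gap (hypothetical values).
records = [
    ((0.0,), 10.0),
    ((1.0,), 12.0),
    ((2.0,), None),
    ((3.0,), 16.0),
    ((10.0,), 50.0),
]

filled = knn_fill(records, k=2)
print(filled[2])
```

    In this toy series the gap at driver value 2.0 is filled with the mean of its two nearest complete neighbours; in the setting described above the driver vector would hold principal-component scores rather than a single raw variable.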

  9. Is it possible to predict long-term success with k-NN? Case study of four market indices (FTSE100, DAX, HANGSENG, NASDAQ)

    NASA Astrophysics Data System (ADS)

    Shi, Y.; Gorban, A. N.; Y Yang, T.

    2014-03-01

    This case study tests the possibility of predicting the 'success' (or 'winner') components of four stock & shares market indices over a period of three years from 02-Jul-2009 to 29-Jun-2012. We compare their performance in two time frames: the initial three-month frame at the beginning (02/06/2009-30/09/2009) and the final three-month frame (02/04/2012-29/06/2012). To label the components, the average price ratio between the two time frames is computed and sorted in descending order. The average price ratio is defined as the ratio between the mean prices of the beginning and final time period. The 'winner' components are the top one third of all components in this order, meaning that their mean price in the final time period is relatively higher than in the beginning time period. The 'loser' components are the last one third of all components in the same order, as they have higher mean prices in the beginning time period. We analyse whether there is any information about the winner-loser separation in the initial fragments of the daily closing prices log-returns time series. Leave-One-Out Cross-Validation with the k-NN algorithm is applied to the daily log-returns of the components using a distance and proximity in the experiment. The error analysis shows that for the HANGSENG and DAX indices there are clear signs of the possibility of evaluating the probability of long-term success. The correlation distance matrix histograms and 2-D/3-D elastic maps generated from ViDaExpert show that the 'winner' components are closer to each other and that 'winner'/'loser' components are separable on elastic maps for the HANGSENG and DAX indices, while for the negative possibility indices there is no sign of separation.
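
    The labelling and validation procedure in this abstract can be sketched as below. Several choices are assumptions made for illustration: the ratio is oriented as final-period mean over initial-period mean so that larger values mean winners, the distance is one minus the Pearson correlation, and the component names and return series are invented toy data.

```python
import math
from collections import Counter
from statistics import mean


def average_price_ratio(initial_prices, final_prices):
    """Assumed orientation: final-period mean price over initial-period mean price."""
    return mean(final_prices) / mean(initial_prices)


def label_components(ratios):
    """Top third by ratio are 'winner', bottom third 'loser'; the middle is unlabelled."""
    order = sorted(ratios, key=ratios.get, reverse=True)
    third = len(order) // 3
    labels = {name: "winner" for name in order[:third]}
    labels.update({name: "loser" for name in order[len(order) - third:]})
    return labels


def log_returns(prices):
    """Daily log-returns of a closing-price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def correlation_distance(x, y):
    """One minus the Pearson correlation of two equally long, non-constant series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    norm = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return 1.0 - cov / norm


def loocv_error(series, labels, k=1):
    """Leave-one-out error rate of a k-NN vote over the labelled components."""
    names = list(labels)
    errors = 0
    for held_out in names:
        others = [n for n in names if n != held_out]
        others.sort(key=lambda n: correlation_distance(series[held_out], series[n]))
        votes = Counter(labels[n] for n in others[:k])
        if votes.most_common(1)[0][0] != labels[held_out]:
            errors += 1
    return errors / len(names)


# Toy initial-frame log-returns and price ratios (hypothetical values).
pattern = [0.01, -0.02, 0.03, 0.01, -0.01]
returns = {
    "W1": pattern,
    "W2": [2 * r for r in pattern],
    "W3": [r + 0.001 for r in pattern],
    "L1": [-r for r in pattern],
    "L2": [-2 * r for r in pattern],
    "L3": [-r - 0.001 for r in pattern],
}
ratios = {"W1": 1.5, "W2": 1.4, "W3": 1.3, "L1": 0.8, "L2": 0.7, "L3": 0.6}

labels = label_components(ratios)
print(labels)
print(loocv_error(returns, labels, k=1))
```

    A low leave-one-out error on real data would correspond to the separable case reported for some indices, while an error near one half would correspond to no detectable separation.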

  10. Webcam mouse using face and eye tracking in various illumination environments.

    PubMed

    Lin, Yuan-Pin; Chao, Yi-Ping; Lin, Chung-Chih; Chen, Jyh-Horng

    2005-01-01

    Nowadays, due to enhancement of computer performance and popular usage of webcam devices, it has become possible to acquire users' gestures for the human-computer-interface with PC via webcam. However, the effects of illumination variation would dramatically decrease the stability and accuracy of a skin-based face tracking system, especially for a notebook or portable platform. In this study we present an effective illumination recognition technique, combining a K-Nearest Neighbor classifier and an adaptive skin model, to realize the real-time tracking system. We have demonstrated that the accuracy of face detection based on the KNN classifier is higher than 92% in various illumination environments. In real-time implementation, the system successfully tracks the user's face and eye features at 15 fps on standard notebook platforms. Although the KNN classifier only initiates five environments at the preliminary stage, the system permits users to define and add their favorite environments to KNN for computer access. Eventually, based on this efficient tracking algorithm, we have developed a "Webcam Mouse" system to control the PC cursor using face and eye tracking. Preliminary studies in "point and click" style PC web games also show promising applications in consumer electronic markets in the future.

  11. A comparison between skeleton and bounding box models for falling direction recognition

    NASA Astrophysics Data System (ADS)

    Narupiyakul, Lalita; Srisrisawang, Nitikorn

    2017-12-01

    Falls can lead to serious medical conditions in people of every age. However, in the case of the elderly, the risk of serious injury is much higher. Because one way of preventing serious injury is to treat the fallen person as soon as possible, several works have attempted to implement different algorithms to recognize a fall. Our work compares the performance of two models based on feature extraction: (i) Body joint data (Skeleton Data), which are the joints' positions along 3 axes, and (ii) Bounding box (Box-size Data) covering all body joints. The machine learning algorithms chosen are Decision Tree (DT), Naïve Bayes (NB), K-nearest neighbors (KNN), Linear discriminant analysis (LDA), Voting Classification (VC), and Gradient boosting (GB). The results illustrate that the models trained with Skeleton data perform far better than those trained with Box-size data (with an average accuracy of 94-81% and 80-75%, respectively). KNN shows the best performance in both the Body joint model and the Bounding box model. In conclusion, KNN with the Body joint model performs the best among the others.

  12. Design of a hybrid model for cardiac arrhythmia classification based on Daubechies wavelet transform.

    PubMed

    Rajagopal, Rekha; Ranganathan, Vidhyapriya

    2018-06-05

    Automation in cardiac arrhythmia classification helps medical professionals make accurate decisions about the patient's health. The aim of this work was to design a hybrid classification model to classify cardiac arrhythmias. The design phase of the classification model comprises the following stages: preprocessing of the cardiac signal by eliminating detail coefficients that contain noise, feature extraction through Daubechies wavelet transform, and arrhythmia classification using a collaborative decision from the K nearest neighbor classifier (KNN) and a support vector machine (SVM). The proposed model is able to classify 5 arrhythmia classes as per the ANSI/AAMI EC57: 1998 classification standard. Level 1 of the proposed model involves classification using the KNN and the classifier is trained with examples from all classes. Level 2 involves classification using an SVM and is trained specifically to classify overlapped classes. The final classification of a test heartbeat pertaining to a particular class is done using the proposed KNN/SVM hybrid model. The experimental results demonstrated that the average sensitivity of the proposed model was 92.56%, the average specificity 99.35%, the average positive predictive value 98.13%, the average F-score 94.5%, and the average accuracy 99.78%. The results obtained using the proposed model were compared with the results of discriminant, tree, and KNN classifiers. The proposed model is able to achieve a high classification accuracy.
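
    The evaluation measures reported in this abstract follow standard definitions, which can be written out as a short sketch. The counts used below are hypothetical and do not come from the study; note also that the paper reports per-class averages, so its average F-score need not equal the F-score computed from its average sensitivity and positive predictive value.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard one-vs-rest measures from true/false positive and negative counts."""
    sensitivity = tp / (tp + fn)          # recall: detected share of actual positives
    specificity = tn / (tn + fp)          # rejected share of actual negatives
    ppv = tp / (tp + fp)                  # positive predictive value (precision)
    f_score = 2 * ppv * sensitivity / (ppv + sensitivity)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_score": f_score,
        "accuracy": accuracy,
    }


# Hypothetical counts for one heartbeat class against all others.
metrics = classification_metrics(tp=90, fp=5, tn=895, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

    With heavily imbalanced classes, as in this toy example, accuracy and specificity stay high even when sensitivity is noticeably lower, which is why the abstract reports all five measures.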

  13. Discrimination of stroke-related mild cognitive impairment and vascular dementia using EEG signal analysis.

    PubMed

    Al-Qazzaz, Noor Kamal; Ali, Sawal Hamid Bin Mohd; Ahmad, Siti Anom; Islam, Mohd Shabiul; Escudero, Javier

    2018-01-01

    Stroke survivors are more prone to developing cognitive impairment and dementia. Dementia detection is a challenge for supporting personalized healthcare. This study analyzes the electroencephalogram (EEG) background activity of 5 vascular dementia (VaD) patients, 15 stroke-related patients with mild cognitive impairment (MCI), and 15 control healthy subjects during a working memory (WM) task. The objective of this study is twofold. First, it aims to enhance the discrimination of VaD, stroke-related MCI patients, and control subjects using fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR); second, it aims to extract and investigate the spectral features that characterize the post-stroke dementia patients compared to the control subjects. Nineteen channels were recorded and analyzed using the independent component analysis and wavelet analysis (ICA-WT) denoising technique. Using ANOVA, linear spectral power including relative powers (RP) and power ratio were calculated to test whether the EEG dominant frequencies were slowed down in VaD and stroke-related MCI patients. Non-linear features including permutation entropy (PerEn) and fractal dimension (FD) were used to test the degree of irregularity and complexity, which was significantly lower in patients with VaD and stroke-related MCI than that in control subjects (ANOVA; p < 0.05). This study is the first to use fuzzy neighborhood preserving analysis with QR-decomposition (FNPAQR) dimensionality reduction technique with EEG background activity of dementia patients. The impairment of post-stroke patients was detected using support vector machine (SVM) and k-nearest neighbors (kNN) classifiers. A comparative study has been performed to check the effectiveness of using FNPAQR dimensionality reduction technique with the SVM and kNN classifiers. FNPAQR with SVM and kNN obtained 91.48 and 89.63% accuracy, respectively, whereas without FNPAQR the accuracy was 70 and 67.78% for SVM and kNN

  14. Prediction of mortality after radical cystectomy for bladder cancer by machine learning techniques.

    PubMed

    Wang, Guanjin; Lam, Kin-Man; Deng, Zhaohong; Choi, Kup-Sze

    2015-08-01

    Bladder cancer is a common cancer in genitourinary malignancy. For muscle invasive bladder cancer, surgical removal of the bladder, i.e. radical cystectomy, is in general the definitive treatment which, unfortunately, carries significant morbidities and mortalities. Accurate prediction of the mortality of radical cystectomy is therefore needed. Statistical methods have conventionally been used for this purpose, despite the complex interactions of high-dimensional medical data. Machine learning has emerged as a promising technique for handling high-dimensional data, with increasing application in clinical decision support, e.g. cancer prediction and prognosis. Its ability to reveal the hidden nonlinear interactions and interpretable rules between dependent and independent variables is favorable for constructing models of effective generalization performance. In this paper, seven machine learning methods are utilized to predict the 5-year mortality of radical cystectomy, including back-propagation neural network (BPN), radial basis function (RBFN), extreme learning machine (ELM), regularized ELM (RELM), support vector machine (SVM), naive Bayes (NB) classifier and k-nearest neighbour (KNN), on a clinicopathological dataset of 117 patients of the urology unit of a hospital in Hong Kong. The experimental results indicate that RELM achieved the highest average prediction accuracy of 0.8 at a fast learning speed. The research findings demonstrate the potential of applying machine learning techniques to support clinical decision making. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Children's Services Statistical Neighbour Benchmarking Tool. Practitioner User Guide

    ERIC Educational Resources Information Center

    National Foundation for Educational Research, 2007

    2007-01-01

    Statistical neighbour models provide one method for benchmarking progress. For each local authority (LA), these models designate a number of other LAs deemed to have similar characteristics. These designated LAs are known as statistical neighbours. Any LA may compare its performance (as measured by various indicators) against its statistical…

  16. Vehicle Classification Using an Imbalanced Dataset Based on a Single Magnetic Sensor.

    PubMed

    Xu, Chang; Wang, Yingguan; Bao, Xinghe; Li, Fengrong

    2018-05-24

    This paper aims to improve the accuracy of automatic vehicle classifiers for imbalanced datasets. Classification is made through utilizing a single anisotropic magnetoresistive sensor, with the models of vehicles involved being classified into hatchbacks, sedans, buses, and multi-purpose vehicles (MPVs). Using time domain and frequency domain features in combination with three common classification algorithms in pattern recognition, we develop a novel feature extraction method for vehicle classification. These three common classification algorithms are the k-nearest neighbor, the support vector machine, and the back-propagation neural network. Nevertheless, a problem remains with the original vehicle magnetic dataset collected being imbalanced, and may lead to inaccurate classification results. With this in mind, we propose an approach called SMOTE, which can further boost the performance of classifiers. Experimental results show that the k-nearest neighbor (KNN) classifier with the SMOTE algorithm can reach a classification accuracy of 95.46%, thus minimizing the effect of the imbalance.
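
    The combination of SMOTE oversampling and a kNN classifier described in this record can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names (`k_nearest`, `smote`, `knn_predict`), the parameter defaults, and the use of Euclidean distance are all assumptions. SMOTE creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to `point` (Euclidean distance)."""
    return sorted(candidates, key=lambda c: math.dist(point, c))[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (hypothetical helper).

    Each synthetic sample lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbour = rng.choice(k_nearest(base, others, k))
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


def knn_predict(train, query, k=3):
    """train: list of (features, label); majority vote among the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)
```

    In use, the synthetic samples would be appended to the minority class before the kNN training set is assembled, so that the majority class no longer dominates the neighbourhood vote.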

  17. Large margin nearest neighbor classifiers.

    PubMed

    Domeniconi, Carlotta; Gunopulos, Dimitrios; Peng, Jing

    2005-07-01

    The nearest neighbor technique is a simple and appealing approach to addressing classification problems. It relies on the assumption of locally constant class conditional probabilities. This assumption becomes invalid in high dimensions with a finite number of examples due to the curse of dimensionality. Severe bias can be introduced under these conditions when using the nearest neighbor rule. The employment of a locally adaptive metric becomes crucial in order to keep class conditional probabilities close to uniform, thereby minimizing the bias of estimates. We propose a technique that computes a locally flexible metric by means of support vector machines (SVMs). The decision function constructed by SVMs is used to determine the most discriminant direction in a neighborhood around the query. Such a direction provides a local feature weighting scheme. We formally show that our method increases the margin in the weighted space where classification takes place. Moreover, our method has the important advantage of online computational efficiency over competing locally adaptive techniques for nearest neighbor classification. We demonstrate the efficacy of our method using both real and simulated data.

  18. A Comparative Evaluation of Anomaly Detection Algorithms for Maritime Video Surveillance

    DTIC Science & Technology

    2011-01-01

    of k-means clustering and the k-NN Localized p-value Estimator (KNN-LPE). K-means is a popular distance-based clustering algorithm while KNN-LPE...implemented the sparse cluster identification rule we described in Section 3.1. 2. k-NN Localized p-value Estimator (KNN-LPE): We implemented this using...Average Density (KNN-NAD): This was implemented as described in Section 3.4. Algorithm Parameter Settings The global and local density-based anomaly

  19. Discovery of Nearest Known Brown Dwarf

    NASA Astrophysics Data System (ADS)

    2003-01-01

    Bright Southern Star Epsilon Indi Has Cool, Substellar Companion. Summary: A team of European astronomers has discovered a Brown Dwarf object (a 'failed' star) less than 12 light-years from the Sun. It is the nearest yet known. Now designated Epsilon Indi B, it is a companion to a well-known bright star in the southern sky, Epsilon Indi (now "Epsilon Indi A"), previously thought to be single. The binary system is one of the twenty nearest stellar systems to the Sun. The brown dwarf was discovered from the comparatively rapid motion across the sky which it shares with its brighter companion: the pair move a full lunar diameter in less than 400 years. It was first identified using digitised archival photographic plates from the SuperCOSMOS Sky Surveys (SSS) and confirmed using data from the Two Micron All Sky Survey (2MASS). Follow-up observations with the near-infrared sensitive SOFI instrument on the ESO 3.5-m New Technology Telescope (NTT) at the La Silla Observatory confirmed its nature and have allowed measurements of its physical properties. Epsilon Indi B has a mass just 45 times that of Jupiter, the largest planet in the Solar System, and a surface temperature of only 1000 °C. It belongs to the so-called 'T dwarf' category of objects which straddle the domain between stars and giant planets. Epsilon Indi B is the nearest and brightest T dwarf known. Future studies of the new object promise to provide astronomers with important new clues as to the formation and evolution of these exotic celestial bodies, at the same time yielding interesting insights into the border zone between planets and stars. TINY MOVING NEEDLES IN GIANT HAYSTACKS. Caption: PR Photo 03a/03 shows Epsilon Indi A (the bright star at far right) and its newly discovered brown dwarf companion Epsilon Indi B (circled). The upper image comes from one of the SuperCOSMOS Sky

  20. Fabrication and ferroelectric properties of highly dense lead-free piezoelectric (K0.5Na0.5)NbO3 thick films by aerosol deposition

    NASA Astrophysics Data System (ADS)

    Ryu, Jungho; Choi, Jong-Jin; Hahn, Byung-Dong; Park, Dong-Soo; Yoon, Woon-Ha; Kim, Ki-Hoon

    2007-04-01

    Lead-free piezoelectric thick films of (K0.5Na0.5)NbO3 were fabricated by aerosol-deposition method. The thickness of KNN film was 7.1μm and fully dense films were obtained. The dielectric constants ɛ3T/ɛ0 of the as-deposited and annealed films at 1kHz were 116 and 545, respectively, which are higher than any previously reported values for lead-free piezoelectric thin/thick films, either without or with heat treatment. The ferroelectric properties were improved after annealing and the maximum values of Pr=8.1μC/cm3 and Ec=100kV/cm were achieved. These values are markedly superior to those of sintered KNN ceramic counterparts.

  1. Algorithms that Defy the Gravity of Learning Curve

    DTIC Science & Technology

    2017-04-28

    three nearest neighbour-based anomaly detectors, i.e., an ensemble of nearest neighbours, a recent nearest neighbour-based ensemble method called iNNE...streams. Note that the change in sample size does not alter the geometrical data characteristics discussed here. 3.1 Experimental Methodology ...need to be answered. 3.6 Comparison with conventional ensemble methods Given the theoretical results, the third aim of this project (i.e., identify the

  2. Phase transitions and electrical behavior of lead-free (K0.50Na0.50)NbO3 thin film

    NASA Astrophysics Data System (ADS)

    Wu, Jiagang; Wang, John

    2009-09-01

    Lead-free (K0.50Na0.50)NbO3 (KNN) thin films with a high degree of (100) preferred orientation were deposited on the SrRuO3-buffered SrTiO3(100) substrate by off-axis radio frequency magnetron sputtering. They possess lower phase transition temperatures (To-t ~ 120 °C and Tc ~ 310 °C), as compared to those of KNN bulk ceramic (To-t ~ 190 °C and Tc ~ 400 °C). They also demonstrate enhanced ferroelectric behavior (e.g., 2Pr = 24.1 μC/cm2) and fatigue endurance, together with a lower dielectric loss (tan δ ~ 0.017) and a lower leakage current, as compared to the bulk ceramic counterpart. Oxygen vacancies are shown to be involved in the conduction of the KNN thin film.

  3. 133Cs-NMR study on aligned powder of competing spin chain compound Cs2Cu2Mo3O12

    NASA Astrophysics Data System (ADS)

    Yagi, A.; Matsui, K.; Goto, T.; Hase, M.; Sasaki, T.

    2018-03-01

    The S = 1/2 competing spin chain compound Cs2Cu2Mo3O12 has two dominant exchange interactions, the nearest-neighbour ferromagnetic J1 = -93 K and the second-nearest-neighbour antiferromagnetic J2 = +33 K, and is expected to show the nematic Tomonaga-Luttinger liquid (TLL) state in the high magnetic field region. The recent theoretical study by Sato et al. has shown that in the nematic TLL state the spin fluctuations are expected to be highly anisotropic, that is, their transverse component is suppressed. Our previous NMR study on the present system showed that the dominant contribution to nuclear spin relaxation comes from the longitudinal component. In order to conclude that the transverse component of spin fluctuations is suppressed, knowledge of the hyperfine coupling is indispensable. This article is solely devoted to investigating the hyperfine coupling of the 133Cs-NMR site, to prove that the anisotropic part of the hyperfine coupling, which connects the nuclear spin relaxation with the transverse spin fluctuations, is considerably large, A_an = +770 Oe/μB.

  4. Empirical Mode Decomposition and k-Nearest Embedding Vectors for Timely Analyses of Antibiotic Resistance Trends

    PubMed Central

    Teodoro, Douglas; Lovis, Christian

    2013-01-01

    Background Antibiotic resistance is a major worldwide public health concern. In clinical settings, timely antibiotic resistance information is key for care providers as it allows appropriate targeted treatment or improved empirical treatment when the specific results of the patient are not yet available. Objective To improve antibiotic resistance trend analysis algorithms by building a novel, fully data-driven forecasting method from the combination of trend extraction and machine learning models for enhanced biosurveillance systems. Methods We investigate a robust model for extraction and forecasting of antibiotic resistance trends using a decade of microbiology data. Our method consists of breaking down the resistance time series into independent oscillatory components via the empirical mode decomposition technique. The resulting waveforms describing intrinsic resistance trends serve as the input for the forecasting algorithm. The algorithm applies the delay coordinate embedding theorem together with the k-nearest neighbor framework to project mappings from past events into the future dimension and estimate the resistance levels. Results The algorithms that decompose the resistance time series and filter out high frequency components showed statistically significant performance improvements in comparison with a benchmark random walk model. We present further qualitative use-cases of antibiotic resistance trend extraction, where empirical mode decomposition was applied to highlight the specificities of the resistance trends. Conclusion The decomposition of the raw signal was found not only to yield valuable insight into the resistance evolution, but also to produce novel models of resistance forecasters with boosted prediction performance, which could be utilized as a complementary method in the analysis of antibiotic resistance trends. PMID:23637796
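
    The forecasting step described in this record (delay coordinate embedding combined with a k-nearest neighbour lookup) can be illustrated with the sketch below. It is a simplified assumption-laden example, not the published method: the empirical mode decomposition stage is omitted, and the names `delay_embed` and `knn_forecast`, the embedding dimension `dim`, the delay `tau`, and the plain averaging of successors are all hypothetical choices.

```python
import math


def delay_embed(series, dim, tau=1):
    """Return (delay_vector, next_value) pairs.

    Each delay vector is (x[t-(dim-1)*tau], ..., x[t-tau], x[t]) and is
    paired with the value x[t+1] that followed it in the series.
    """
    pairs = []
    start = (dim - 1) * tau
    for t in range(start, len(series) - 1):
        vector = tuple(series[t - j * tau] for j in range(dim - 1, -1, -1))
        pairs.append((vector, series[t + 1]))
    return pairs


def knn_forecast(series, dim=3, tau=1, k=2):
    """One-step-ahead forecast: average the successors of the k historical
    delay vectors closest to the most recent delay vector."""
    pairs = delay_embed(series, dim, tau)
    last = len(series) - 1
    query = tuple(series[last - j * tau] for j in range(dim - 1, -1, -1))
    nearest = sorted(pairs, key=lambda p: math.dist(p[0], query))[:k]
    return sum(nxt for _, nxt in nearest) / len(nearest)
```

    For a strictly periodic series the most recent delay vector has exact matches in the past, so the forecast reproduces the next value of the cycle; on noisy data, `k` trades variance against responsiveness.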

  5. Optimizing classification performance in an object-based very-high-resolution land use-land cover urban application

    NASA Astrophysics Data System (ADS)

    Georganos, Stefanos; Grippa, Tais; Vanhuysse, Sabine; Lennert, Moritz; Shimoni, Michal; Wolff, Eléonore

    2017-10-01

    This study evaluates the impact of three Feature Selection (FS) algorithms in an Object Based Image Analysis (OBIA) framework for Very-High-Resolution (VHR) Land Use-Land Cover (LULC) classification. The three selected FS algorithms, Correlation Based Selection (CFS), Mean Decrease in Accuracy (MDA) and Random Forest (RF) based Recursive Feature Elimination (RFE), were tested on Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Random Forest (RF) classifiers. The results demonstrate that the accuracies of the SVM and KNN classifiers are the most sensitive to FS. The RF appeared to be more robust to high dimensionality, although a significant increase in accuracy was found by using the RFE method. In terms of classification accuracy, SVM performed the best using FS, followed by RF and KNN. Finally, only a small number of features is needed to achieve the highest performance using each classifier. This study emphasizes the benefits of rigorous FS for maximizing performance, as well as for reducing model complexity and simplifying interpretation.

  6. Applied algorithm in the liner inspection of solid rocket motors

    NASA Astrophysics Data System (ADS)

    Hoffmann, Luiz Felipe Simões; Bizarria, Francisco Carlos Parquet; Bizarria, José Walter Parquet

    2018-03-01

    In rocket motors, the bonding between the solid propellant and thermal insulation is accomplished by a thin adhesive layer, known as liner. The liner application method involves a complex sequence of tasks, which includes in its final stage, the surface integrity inspection. Nowadays in Brazil, an expert carries out a thorough visual inspection to detect defects on the liner surface that may compromise the propellant interface bonding. Therefore, this paper proposes an algorithm that uses the photometric stereo technique and the K-nearest neighbor (KNN) classifier to assist the expert in the surface inspection. Photometric stereo allows the surface information recovery of the test images, while the KNN method enables image pixels classification into two classes: non-defect and defect. Tests performed on a computer vision based prototype validate the algorithm. The positive results suggest that the algorithm is feasible and when implemented in a real scenario, will be able to help the expert in detecting defective areas on the liner surface.

  7. Can a Smartphone Diagnose Parkinson Disease? A Deep Neural Network Method and Telediagnosis System Implementation.

    PubMed

    Zhang, Y N

    2017-01-01

    Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed.

  8. Can a Smartphone Diagnose Parkinson Disease? A Deep Neural Network Method and Telediagnosis System Implementation

    PubMed Central

    2017-01-01

    Parkinson's disease (PD) is primarily diagnosed by clinical examinations, such as walking test, handwriting test, and MRI diagnostic. In this paper, we propose a machine learning based PD telediagnosis method for smartphone. Classification of PD using speech records is a challenging task owing to the fact that the classification accuracy is still lower than doctor-level. Here we demonstrate automatic classification of PD using time frequency features, stacked autoencoders (SAE), and K nearest neighbor (KNN) classifier. KNN classifier can produce promising classification results from useful representations which were learned by SAE. Empirical results show that the proposed method achieves better performance with all tested cases across classification tasks, demonstrating machine learning capable of classifying PD with a level of competence comparable to doctor. It concludes that a smartphone can therefore potentially provide low-cost PD diagnostic care. This paper also gives an implementation on browser/server system and reports the running time cost. Both advantages and disadvantages of the proposed telediagnosis system are discussed. PMID:29075547

  9. Neighbour lists for smoothed particle hydrodynamics on GPUs

    NASA Astrophysics Data System (ADS)

    Winkler, Daniel; Rezavand, Massoud; Rauch, Wolfgang

    2018-04-01

    The efficient iteration of neighbouring particles is a performance critical aspect of any high performance smoothed particle hydrodynamics (SPH) solver. SPH solvers that implement a constant smoothing length generally divide the simulation domain into a uniform grid to reduce the computational complexity of the neighbour search. Based on this method, particle neighbours are either stored per grid cell or for each individual particle, denoted as Verlet list. While the latter approach has significantly higher memory requirements, it has the potential for a significant computational speedup. A theoretical comparison is performed to estimate the potential improvements of the method based on unknown hardware dependent factors. Subsequently, the computational performance of both approaches is empirically evaluated on graphics processing units. It is shown that the speedup differs significantly for different hardware, dimensionality and floating point precision. The Verlet list algorithm is implemented as an alternative to the cell linked list approach in the open-source SPH solver DualSPHysics and provided as a standalone software package.
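
    The two neighbour-storage strategies contrasted in this record can be sketched in a few lines. The example below is a 2D CPU illustration under stated assumptions, not the DualSPHysics GPU code: the grid cell side is taken equal to the cutoff radius `h`, and the function names `build_cell_list` and `build_verlet_list` are hypothetical. The cell list stores particle indices per grid cell; the Verlet list stores, for each particle, the indices of all particles within the cutoff, which costs more memory but avoids rescanning cells on every interaction pass.

```python
import math
from collections import defaultdict


def build_cell_list(points, h):
    """Map each grid cell (side length h) to the indices of the particles in it."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(points):
        cells[(math.floor(x / h), math.floor(y / h))].append(i)
    return cells


def build_verlet_list(points, h):
    """For every particle, list the other particles within distance h.

    Only the 3x3 block of cells around a particle is searched, which is
    sufficient because the cell side equals the cutoff radius.
    """
    cells = build_cell_list(points, h)
    neighbours = []
    for i, (x, y) in enumerate(points):
        cx, cy = math.floor(x / h), math.floor(y / h)
        found = []
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j != i and math.dist(points[i], points[j]) <= h:
                        found.append(j)
        neighbours.append(sorted(found))
    return neighbours
```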

  10. Neighbour tolerance, not suppression, provides competitive advantage to non-native plants.

    PubMed

    Golivets, Marina; Wallin, Kimberly F

    2018-05-01

    High competitive ability has often been invoked as a key determinant of invasion success and ecological impacts of non-native plants. Yet our understanding of the strategies that non-natives use to gain competitive dominance remains limited. Particularly, it remains unknown whether the two non-mutually exclusive competitive strategies, neighbour suppression and neighbour tolerance, are equally important for the competitive advantage of non-native plants. Here, we analyse data from 192 peer-reviewed studies on pairwise plant competition within a Bayesian multilevel meta-analytic framework and show that non-native plants outperform their native counterparts due to high tolerance of competition, as opposed to strong suppressive ability. Competitive tolerance ability of non-native plants was driven by neighbour's origin and was expressed in response to a heterospecific native but not heterospecific non-native neighbour. In contrast to natives, non-native species were not more suppressed by hetero- vs. conspecific neighbours, which was partially due to higher intensity of intraspecific competition among non-natives. Heterogeneity in the data was primarily associated with methodological differences among studies and not with phylogenetic relatedness among species. Altogether, our synthesis demonstrates that non-native plants are competitively distinct from native plants and challenges the common notion that neighbour suppression is the primary strategy for plant invasion success. © 2018 John Wiley & Sons Ltd/CNRS.

  11. Forest/non-forest mapping using inventory data and satellite imagery

    Treesearch

    Ronald E. McRoberts

    2002-01-01

    For two study areas in Minnesota, USA, one heavily forested and one sparsely forested, maps of predicted proportion forest area were created using Landsat Thematic Mapper imagery, forest inventory plot data, and two prediction techniques, logistic regression and a k-Nearest Neighbours technique. The maps were used to increase the precision of forest area estimates by...
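
    A k-Nearest Neighbours prediction of a continuous attribute such as proportion forest area can be sketched as below. This is an illustrative assumption, not the procedure of the record: the function name `knn_predict_proportion`, the unweighted mean, and the Euclidean distance in spectral feature space are all hypothetical choices. Each pixel receives the mean of the observed values of the `k` inventory plots whose spectral features are closest to its own.

```python
import math


def knn_predict_proportion(pixel, plots, k=3):
    """Predict a pixel's forest proportion from inventory plots.

    pixel: spectral feature vector of the pixel.
    plots: list of (spectral_features, observed_forest_proportion).
    Returns the unweighted mean over the k spectrally nearest plots.
    """
    nearest = sorted(plots, key=lambda plot: math.dist(plot[0], pixel))[:k]
    return sum(proportion for _, proportion in nearest) / len(nearest)
```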

  12. Improvement of the piezoelectric properties in (K,Na)NbO3-based lead-free piezoelectric ceramic with two-phase co-existing state

    NASA Astrophysics Data System (ADS)

    Yamada, H.; Matsuoka, T.; Kozuka, H.; Yamazaki, M.; Ohbayashi, K.; Ida, T.

    2015-06-01

    Two phases of (K,Na)NbO3 (KNN) co-exist in a KNN-based composite lead-free piezoelectric ceramic 0.910(K1-xNax)0.86Ca0.04Li0.02Nb0.85O3-δ-0.042K0.85Ti0.85Nb1.15O5-0.036BaZrO3-0.0016Co3O4-0.0025Fe2O3-0.0069ZnO system, over a wide range of Na fractions, where 0.56 ≤ x ≤ 0.75. The crystal systems of the two KNN phases are identified as tetragonal and orthorhombic by analyzing the synchrotron powder X-ray diffraction (XRD) data, high-resolution transmission electron microscopy (HR-TEM), and selected-area electron diffraction (SAD). In the range 0.33 ≤ x ≤ 0.50, the main component of the composite system is found to be single-phase KNN with a tetragonal structure. Granular nanodomains of the orthorhombic phase dispersed in the tetragonal matrix have been identified by HR-TEM and SAD for 0.56 ≤ x ≤ 0.75. Only a trace amount of the orthorhombic phase has been found in the SAD patterns at the composition x = 0.56. However, the number of orthorhombic nanodomains gradually increases with increasing Na content up to x < 0.75, as observed from the HR-TEM images. An abrupt increase and agglomeration of the nanodomains are observed at x = 0.75, where weak diffraction peaks of the orthorhombic phase have also become detectable from the XRD data. The maximum value of the electromechanical coupling coefficient, kp = 0.56, has been observed at the composition x = 0.56.

  13. Integrated Sensing Processor, Phase 2

    DTIC Science & Technology

    2005-12-01

    performance analysis for several baseline classifiers including neural nets, linear classifiers, and kNN classifiers. Use of CCDR as a preprocessing step...below the level of the benchmark non-linear classifier for this problem (kNN). Furthermore, the CCDR preconditioned kNN achieved a 10% improvement over...the benchmark kNN without CCDR. Finally, we found an important connection between intrinsic dimension estimation via entropic graphs and the optimal

  14. Development of a Lead-free Piezoelectric (K,Na)NbO3 Thin Film Deposited on Nickel-based Electrodes

    NASA Astrophysics Data System (ADS)

    Bani Milhim, Alaeddin

    It is desirable to replace noble metals used as electrode materials for piezoelectric thin film with base metals. This will reduce the piezoelectric thin film fabrication cost. A nickel-based layer in conjunction with other protective layers is proposed as a bottom electrode for lead-free piezoelectric KNN thin film. The obtained results do not indicate the oxidation of the nickel-based bottom electrode after the deposition of KNN at 600 °C for 10 hours in the presence of oxygen and/or after annealing the sample at 400 °C for an hour in air. The fabricated KNN thin film was fully characterized in this work. The effective piezoelectric coefficients d33 and d31 were estimated to be 37 pm/V and 17.2 pm/V, respectively, at 100 kV/cm. The piezoelectric properties of the fabricated KNN/Ni/Ti/SiO2/Si are affected by the crystal orientation of the KNN layer, which was preferentially oriented in the (110) direction. Optimization of the deposition parameters of the fabricated KNN/Ni/Ti/SiO2/Si film is expected to further enhance the piezoelectric properties. Two novel systems utilizing the developed KNN piezoelectric thin film are proposed and their performance simulated based on the achieved KNN thin film parameters. The first is a precision automated nanomanipulation system using an AFM as a sensor and piezo-actuated manipulators. Real-time feedback of the particle being manipulated can be achieved using the proposed system. The length of the manipulators needs to be at least 2 mm to be incorporated with a commercial AFM system. To fabricate the required manipulators, a three-step electrochemical etching technique was developed. Tungsten tips combining well-defined conical shape, a length as large as 2 mm, and sharpness with a radius of curvature of around 20 nm were fabricated using the proposed technique. By depositing the KNN thin film on the fabricated manipulator, nanomanipulators with out-of-plane actuation can be produced. Ultrasonic piezoelectric fan array, the

  15. Ultrahigh Piezoelectric Properties in Textured (K,Na)NbO3 -Based Lead-Free Ceramics.

    PubMed

    Li, Peng; Zhai, Jiwei; Shen, Bo; Zhang, Shujun; Li, Xiaolong; Zhu, Fangyuan; Zhang, Xingmin

    2018-02-01

    High-performance lead-free piezoelectric materials are in great demand for next-generation electronic devices to meet the requirement of environmentally sustainable society. Here, ultrahigh piezoelectric properties with piezoelectric coefficients (d33 ≈ 700 pC/N, d33* ≈ 980 pm/V) and planar electromechanical coupling factor (kp ≈ 76%) are achieved in highly textured (K,Na)NbO3 (KNN)-based ceramics. The excellent piezoelectric properties can be explained by the strong anisotropic feature, optimized engineered domain configuration in the textured ceramics, and facilitated polarization rotation induced by the intermediate phase. In addition, the nanodomain structures with decreased domain wall energy and increased domain wall mobility also contribute to the ultrahigh piezoelectric properties. This work not only demonstrates the tremendous potential of KNN-based ceramics to replace lead-based piezoelectrics but also provides a good strategy to design high-performance piezoelectrics by controlling appropriate phase and crystallographic orientation. © 2018 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Classifying dysmorphic syndromes by using artificial neural network based hierarchical decision tree.

    PubMed

    Özdemir, Merve Erkınay; Telatar, Ziya; Eroğul, Osman; Tunca, Yusuf

    2018-05-01

    Dysmorphic syndromes have different facial malformations. These malformations are important for the early diagnosis of dysmorphic syndromes and contain distinctive information for face recognition. In this study we define the characteristic features of each syndrome by considering facial malformations and automatically classify Fragile X, Hurler, Prader Willi, Down, and Wolf Hirschhorn syndromes and healthy groups. Reference points are marked on the face images, and ratios between the distances separating these points are used as features. We suggest a neural network based hierarchical decision tree structure in order to classify the syndrome types. We also implement k-nearest neighbor (k-NN) and artificial neural network (ANN) classifiers to compare classification accuracy with our hierarchical decision tree. The classification accuracy is 50%, 73% and 86.7% with the k-NN, ANN and hierarchical decision tree methods, respectively. The same images were then shown to a clinical expert, who achieved a recognition rate of 46.7%. We develop an efficient system that recognizes different syndrome types automatically and with high accuracy from simple, non-invasive imaging data, independently of the patient's age, sex and race. The promising results indicate that our method can be used for pre-diagnosis of the dysmorphic syndromes by clinical experts.
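
    The feature construction mentioned in this record (ratios between distances of marked reference points) can be illustrated with the sketch below. Which points and which ratios the authors actually used is not stated here, so the exhaustive pairing and the function name `distance_ratio_features` are assumptions made purely for illustration. The useful property of such ratios is that they do not change when the whole face image is rescaled.

```python
import math
from itertools import combinations


def distance_ratio_features(landmarks):
    """Return ratios between all pairs of inter-landmark distances.

    landmarks: list of (x, y) reference points, assumed pairwise distinct
    so that no distance is zero.
    """
    dists = [math.dist(a, b) for a, b in combinations(landmarks, 2)]
    return [d1 / d2 for d1, d2 in combinations(dists, 2)]
```

    Because every feature is a quotient of two lengths, multiplying all landmark coordinates by the same factor leaves the feature vector unchanged, which is one reason ratio features suit images taken at different distances.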

  17. Bamboo Classification Using WorldView-2 Imagery of Giant Panda Habitat in a Large Shaded Area in Wolong, Sichuan Province, China

    PubMed Central

    Tang, Yunwei; Jing, Linhai; Li, Hui; Liu, Qingjie; Yan, Qi; Li, Xiuxia

    2016-01-01

    This study explores the ability of WorldView-2 (WV-2) imagery for bamboo mapping in a mountainous region in Sichuan Province, China. A large area of this place is covered by shadows in the image, and only a few sampled points derived were useful. In order to identify bamboos based on sparse training data, the sample size was expanded according to the reflectance of multispectral bands selected using the principal component analysis (PCA). Then, class separability based on the training data was calculated using a feature space optimization method to select the features for classification. Four regular object-based classification methods were applied based on both sets of training data. The results show that the k-nearest neighbor (k-NN) method produced the greatest accuracy. A geostatistically-weighted k-NN classifier, accounting for the spatial correlation between classes, was then applied to further increase the accuracy. It achieved 82.65% and 93.10% of the producer’s and user’s accuracies respectively for the bamboo class. The canopy densities were estimated to explain the result. This study demonstrates that the WV-2 image can be used to identify small patches of understory bamboos given limited known samples, and the resulting bamboo distribution facilitates the assessments of the habitats of giant pandas. PMID:27879661

  18. Exploring Sampling in the Detection of Multicategory EEG Signals

    PubMed Central

    Siuly, Siuly; Kabir, Enamul; Wang, Hua; Zhang, Yanchun

    2015-01-01

    The paper presents a structure based on sampling and machine learning techniques for the detection of multicategory EEG signals, in which random sampling (RS) and optimal allocation sampling (OS) are explored. In the proposed framework, before the RS and OS schemes are used, the entire EEG signals of each class are partitioned into several groups based on a particular time period. The RS and OS schemes are used in order to obtain representative observations from each group of each category of EEG data. All of the samples selected by RS from the groups of each category are then combined into one set, named the RS set. In a similar way, an OS set is obtained for the OS scheme. Eleven statistical features are then extracted from the RS and OS sets separately. Finally, this study employs three well-known classifiers, k-nearest neighbor (k-NN), multinomial logistic regression with a ridge estimator (MLR), and support vector machine (SVM), to evaluate the performance of the RS and OS feature sets. The experimental outcomes demonstrate that the RS scheme represents the EEG signals well and that k-NN with RS is the optimum choice for detection of multicategory EEG signals. PMID:25977705

  19. Model-based segmentation of abdominal aortic aneurysms in CTA images

    NASA Astrophysics Data System (ADS)

    de Bruijne, Marleen; van Ginneken, Bram; Niessen, Wiro J.; Loog, Marco; Viergever, Max A.

    2003-05-01

    Segmentation of thrombus in abdominal aortic aneurysms is complicated by regions of low boundary contrast and by the presence of many neighboring structures in close proximity to the aneurysm wall. We present an automated method that is similar to the well known Active Shape Models (ASM), combining a three-dimensional shape model with a one-dimensional boundary appearance model. Our contribution is twofold: we developed a non-parametric appearance modeling scheme that effectively deals with a highly varying background, and we propose a way of generalizing models of curvilinear structures from small training sets. In contrast with the conventional ASM approach, the new appearance model trains on both true and false examples of boundary profiles. The probability that a given image profile belongs to the boundary is obtained using k nearest neighbor (kNN) probability density estimation. The performance of this scheme is compared to that of original ASMs, which minimize the Mahalanobis distance to the average true profile in the training set. The generalizability of the shape model is improved by modeling the object's axis deformation independent of its cross-sectional deformation. A leave-one-out experiment was performed on 23 datasets. Segmentation using the kNN appearance model significantly outperformed the original ASM scheme; average volume errors were 5.9% and 46% respectively.
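
One common way to turn a kNN search over true and false examples into a probability is to take the fraction of the k nearest training profiles that are true boundary examples. The sketch below assumes that reading; the function name, the Euclidean distance, and the toy profiles are illustrative assumptions rather than the authors' implementation.

```python
import math


def knn_boundary_probability(profile, true_examples, false_examples, k=3):
    """Fraction of the k nearest training profiles that are true boundary examples."""
    labelled = [(math.dist(profile, p), 1) for p in true_examples]
    labelled += [(math.dist(profile, p), 0) for p in false_examples]
    labelled.sort(key=lambda pair: pair[0])   # nearest first
    nearest = labelled[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Toy two-dimensional "profiles", invented for illustration.
true_examples = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]
false_examples = [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
p_near_true = knn_boundary_probability((0.2, 0.2), true_examples, false_examples)
p_near_false = knn_boundary_probability((5.1, 5.1), true_examples, false_examples)
print(p_near_true, p_near_false)
```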

  20. Automated Identification of Abnormal Adult EEGs

    PubMed Central

    López, S.; Suarez, G.; Jungreis, D.; Obeid, I.; Picone, J.

    2016-01-01

    The interpretation of electroencephalograms (EEGs) is a process that is still dependent on the subjective analysis of the examiners. Though interrater agreement on critical events such as seizures is high, it is much lower on subtler events (e.g., when there are benign variants). The process used by an expert to interpret an EEG is quite subjective and hard to replicate by machine. The performance of machine learning technology is far from human performance. We have been developing an interpretation system, AutoEEG, with a goal of exceeding human performance on this task. In this work, we are focusing on one of the early decisions made in this process – whether an EEG is normal or abnormal. We explore two baseline classification algorithms: k-Nearest Neighbor (kNN) and Random Forest Ensemble Learning (RF). A subset of the TUH EEG Corpus was used to evaluate performance. Principal Components Analysis (PCA) was used to reduce the dimensionality of the data. kNN achieved a 41.8% detection error rate while RF achieved an error rate of 31.7%. These error rates are significantly lower than those obtained by random guessing based on priors (49.5%). The majority of the errors were related to misclassification of normal EEGs. PMID:27195311

  1. An Efficient Statistical Computation Technique for Health Care Big Data using R

    NASA Astrophysics Data System (ADS)

    Sushma Rani, N.; Srinivasa Rao, P., Dr; Parimala, P.

    2017-08-01

    Due to changes in living conditions and other factors, many critical health-related problems are arising. Diagnosing a problem at an early stage increases the chances of survival and fast recovery, and reduces the recovery time and the cost of treatment. One such medical issue is cancer, and breast cancer has been identified as the second leading cause of cancer death. If detected at an early stage, it can be cured. Once a patient is found to have a breast tumor, it should be classified as cancerous or non-cancerous. The paper therefore uses the k-nearest neighbors (KNN) algorithm, one of the simplest machine learning algorithms and an instance-based learning algorithm, to classify the data. New records are added day to day, which leads to an increase in the data to be classified, and this tends to become a big data problem. The algorithm is implemented in R, which is the most popular platform applied to machine learning algorithms for statistical computing. Experimentation is conducted using various classification evaluation metrics on various values of k. The results show that the KNN algorithm outperforms existing models.
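
As a rough illustration of instance-based classification evaluated over several values of k, the following sketch implements a plain majority-vote kNN in Python rather than R. The feature vectors, labels, and helper names are hypothetical and do not reproduce the paper's data or code.

```python
import math
from collections import Counter


def knn_predict(train, query, k):
    """train: list of (features, label); returns the majority label of the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


# Invented two-feature records, for illustration only.
train = [((1, 1), "benign"), ((1, 2), "benign"), ((2, 1), "benign"), ((2, 2), "benign"),
         ((6, 6), "malignant"), ((6, 7), "malignant"), ((7, 6), "malignant"), ((7, 7), "malignant")]
test = [((1.5, 1.5), "benign"), ((6.5, 6.5), "malignant")]

results = {k: accuracy(train, test, k) for k in (1, 3, 5)}
print(results)
```

Odd values of k are used here so that a two-class vote cannot tie.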

  2. Using machine learning algorithms to guide rehabilitation planning for home care clients.

    PubMed

    Zhu, Mu; Zhang, Zhanyang; Hirdes, John P; Stolee, Paul

    2007-12-20

    Targeting older clients for rehabilitation is a clinical challenge and a research priority. We investigate the potential of machine learning algorithms - Support Vector Machine (SVM) and K-Nearest Neighbors (KNN) - to guide rehabilitation planning for home care clients. This study is a secondary analysis of data on 24,724 longer-term clients from eight home care programs in Ontario. Data were collected with the RAI-HC assessment system, in which the Activities of Daily Living Clinical Assessment Protocol (ADLCAP) is used to identify clients with rehabilitation potential. For study purposes, a client is defined as having rehabilitation potential if there was: i) improvement in ADL functioning, or ii) discharge home. SVM and KNN results are compared with those obtained using the ADLCAP. For comparison, the machine learning algorithms use the same functional and health status indicators as the ADLCAP. The KNN and SVM algorithms achieved similar, substantially improved performance over the ADLCAP, although false positive and false negative rates were still fairly high (FP > .18, FN > .34 versus FP > .29, FN > .58 for ADLCAP). Results are used to suggest potential revisions to the ADLCAP. Machine learning algorithms achieved better predictions than the current protocol. Machine learning results are less readily interpretable, but can also be used to guide development of improved clinical protocols.

  3. Context-dependent responses to neighbours and strangers in wild European rabbits (Oryctolagus cuniculus).

    PubMed

    Monclús, Raquel; Saavedra, Irene; de Miguel, Javier

    2014-07-01

    Territorial animals defend their territories against intruders. The level of aggression directed to intruders depends on the familiarity and/or the relative threat they pose, and it could be modified by the context of the interaction. We explored in a wild social mammal, the European rabbit (Oryctolagus cuniculus), whether residents responded more aggressively to strangers or to neighbours (dear enemy or nasty neighbour effects, respectively). We simulated the intrusion of neighbours or strangers in different parts of the territory of wild European rabbits in a suburban area in central Spain. For that, we placed faecal pellets of neighbouring or stranger rabbits in the territory of 5 rabbit colonies. Resident rabbits counter-marked preferably the odour stations with stranger odour, compared to the ones with neighbour odour, and they did not make a difference between neighbour and a non-odour control stimuli. The results suggest that rabbits show a dear enemy effect. However, repeated intrusions escalated the responses of rabbits towards neighbours. The location within the territory or the sex of the stranger did not affect the level of response. We conclude that in rabbits the relative threat posed by the intruder triggers the intensity of the interaction. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Polymers with nearest- and next nearest-neighbor interactions on the Husimi lattice

    NASA Astrophysics Data System (ADS)

    Oliveira, Tiago J.

    2016-04-01

    The exact grand-canonical solution of a generalized interacting self-avoiding walk (ISAW) model, placed on a Husimi lattice built with squares, is presented. In this model, beyond the traditional interaction $\omega_1 = e^{\epsilon_1/k_B T}$ between (nonconsecutive) monomers on nearest-neighbor (NN) sites, an additional energy $\epsilon_2$ is associated to next-NN (NNN) monomers. Three definitions of NNN sites/interactions are considered, where each monomer can have, effectively, at most two, four, or six NNN monomers on the Husimi lattice. The phase diagrams found in all cases have (qualitatively) the same thermodynamic properties: a non-polymerized (NP) and a polymerized (P) phase separated by a critical and a coexistence surface that meet at a tricritical (θ-) line. This θ-line is found even when one of the interactions is repulsive, existing for $\omega_1$ in the range $[0,\infty)$, i.e., for $\epsilon_1/k_B T$ in the range $[-\infty,\infty)$. Thus, counterintuitively, a θ-point exists even for an infinite repulsion between NN monomers ($\omega_1 = 0$), being associated to a coil-‘soft globule’ transition. In the limit of an infinite repulsive force between NNN monomers, however, the coil-globule transition disappears, and only NP-P continuous transition is observed. This particular case, with $\omega_2 = 0$, is also solved exactly on the square lattice, using a transfer matrix calculation where a discontinuous NP-P transition is found. For attractive and repulsive forces between NN and NNN monomers, respectively, the model becomes quite similar to the semiflexible-ISAW one, whose crystalline phase is not observed here, as a consequence of the frustration due to competing NN and NNN forces. The mapping of the phase diagrams in canonical ones is discussed and compared with recent results from Monte Carlo simulations on the square lattice.

  5. Error minimizing algorithms for nearest neighbor classifiers

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Porter, Reid B; Hush, Don; Zimmer, G. Beate

    2011-01-03

    Stack Filters define a large class of discrete nonlinear filters first introduced in image and signal processing for noise removal. In recent years we have suggested their application to classification problems, and investigated their relationship to other types of discrete classifiers such as Decision Trees. In this paper we focus on a continuous domain version of Stack Filter Classifiers which we call Ordered Hypothesis Machines (OHM), and investigate their relationship to Nearest Neighbor classifiers. We show that OHM classifiers provide a novel framework in which to train Nearest Neighbor type classifiers by minimizing empirical error based loss functions. We use the framework to investigate a new cost sensitive loss function that allows us to train a Nearest Neighbor type classifier for low false alarm rate applications. We report results on both synthetic data and real-world image data.

  6. A Semi-parametric Multivariate Gap-filling Model for Eddy Covariance Latent Heat Flux

    NASA Astrophysics Data System (ADS)

    Li, M.; Chen, Y.

    2010-12-01

    Quantitative descriptions of latent heat fluxes are important to study the water and energy exchanges between terrestrial ecosystems and the atmosphere. The eddy covariance approaches have been recognized as the most reliable technique for measuring surface fluxes over time scales ranging from hours to years. However, unfavorable micrometeorological conditions, instrument failures, and applicable measurement limitations may cause inevitable flux gaps in time series data. Development and application of suitable gap-filling techniques are crucial to estimate long term fluxes. In this study, a semi-parametric multivariate gap-filling model was developed to fill latent heat flux gaps for eddy covariance measurements. Our approach combines the advantages of a multivariate statistical analysis (principal component analysis, PCA) and a nonlinear interpolation technique (K-nearest-neighbors, KNN). The PCA method was first used to resolve the multicollinearity relationships among various hydrometeorological factors, such as radiation, soil moisture deficit, LAI, and wind speed. The KNN method was then applied as a nonlinear interpolation tool to estimate the flux gaps as the weighted sum of latent heat fluxes with the K-nearest distances in the PCs’ domain. Two years, 2008 and 2009, of eddy covariance and hydrometeorological data from a subtropical mixed evergreen forest (the Lien-Hua-Chih Site) were collected to calibrate and validate the proposed approach with artificial gaps after standard QC/QA procedures. The optimal K values and weighting factors were determined by the maximum likelihood test. The results for the gap-filled latent heat fluxes show that the developed model successfully preserves energy balances at daily, monthly, and yearly time scales. Annual amounts of evapotranspiration from this study forest were 747 mm and 708 mm for 2008 and 2009, respectively. Nocturnal evapotranspiration was estimated with filled gaps, and the results are comparable with other studies.
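
The KNN interpolation step can be sketched as a weighted average of the fluxes observed at the K records closest to the gap in principal-component space. The abstract states that K and the weighting factors were chosen by a maximum likelihood test; the inverse-distance weights, the function name, and the numbers below are stand-in assumptions, and the PCA step is taken as already done.

```python
import math


def knn_fill(query_pcs, observed, k=3, eps=1e-9):
    """Estimate a missing flux from the k nearest observed records.

    observed: list of (pc_scores, flux); query_pcs: PC scores at the gap.
    Weights are inverse distances (an assumption made for this sketch).
    """
    nearest = sorted(observed, key=lambda rec: math.dist(rec[0], query_pcs))[:k]
    weights = [1.0 / (math.dist(pcs, query_pcs) + eps) for pcs, _ in nearest]
    total = sum(weights)
    return sum(w * flux for w, (_, flux) in zip(weights, nearest)) / total


# Invented PC scores and latent heat fluxes, for illustration only.
observed = [((0.0, 0.0), 100.0), ((1.0, 0.0), 200.0), ((10.0, 10.0), 900.0)]
estimate = knn_fill((0.5, 0.0), observed, k=2)
print(estimate)
```

With the query halfway between the two nearest records, both get the same weight and the estimate is their mean.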

  7. Diagnostic tools for nearest neighbors techniques when used with satellite imagery

    Treesearch

    Ronald E. McRoberts

    2009-01-01

    Nearest neighbors techniques are non-parametric approaches to multivariate prediction that are useful for predicting both continuous and categorical forest attribute variables. Although some assumptions underlying nearest neighbor techniques are common to other prediction techniques such as regression, other assumptions are unique to nearest neighbor techniques....

  8. The nasty neighbour in the striped mouse (Rhabdomys pumilio) steals paternity and elicits aggression.

    PubMed

    Schradin, Carsten; Schneider, Carola; Lindholm, Anna K

    2010-06-23

    Territoriality functions to monopolize access to resources including mates, but is costly in terms of energy and time investment. Some species reduce these costs by being less aggressive towards their neighbours than towards unfamiliar strangers, the so called dear enemy phenomenon. However, in other species individuals are more, not less aggressive towards their neighbours. It has been hypothesised that this is due to the fact that neighbours can impose a greater threat than strangers, but this has not been tested previously. We tested aggression in wild group-living male striped mice in a neutral test arena and demonstrate that breeders are more aggressive than non-breeding philopatrics, and that more aggression occurs during the breeding than during the non-breeding season. Male breeders were significantly more aggressive towards their neighbours than towards strangers, leading to the prediction that neighbours are the most important competitors for paternity. Using a molecular parentage analysis we show that 28% of offspring are sired by neighbouring males and only 7% by strangers. We conclude that in male striped mice the main function of male aggression is defending paternity against their territorial neighbours.

  9. The nasty neighbour in the striped mouse (Rhabdomys pumilio) steals paternity and elicits aggression

    PubMed Central

    2010-01-01

    Background: Territoriality functions to monopolize access to resources including mates, but is costly in terms of energy and time investment. Some species reduce these costs by being less aggressive towards their neighbours than towards unfamiliar strangers, the so called dear enemy phenomenon. However, in other species individuals are more, not less aggressive towards their neighbours. It has been hypothesised that this is due to the fact that neighbours can impose a greater threat than strangers, but this has not been tested previously. Results: We tested aggression in wild group-living male striped mice in a neutral test arena and demonstrate that breeders are more aggressive than non-breeding philopatrics, and that more aggression occurs during the breeding than during the non-breeding season. Male breeders were significantly more aggressive towards their neighbours than towards strangers, leading to the prediction that neighbours are the most important competitors for paternity. Using a molecular parentage analysis we show that 28% of offspring are sired by neighbouring males and only 7% by strangers. Conclusions: We conclude that in male striped mice the main function of male aggression is defending paternity against their territorial neighbours. PMID:20573184

  10. Thermodynamics of alternating spin chains with competing nearest- and next-nearest-neighbor interactions: Ising model

    NASA Astrophysics Data System (ADS)

    Pini, Maria Gloria; Rettori, Angelo

    1993-08-01

    The thermodynamical properties of an alternating spin (S,s) one-dimensional (1D) Ising model with competing nearest- and next-nearest-neighbor interactions are exactly calculated using a transfer-matrix technique. In contrast to the case S=s=1/2, previously investigated by Harada, the alternation of different spins (S≠s) along the chain is found to give rise to two-peaked static structure factors, signaling the coexistence of different short-range-order configurations. The relevance of our calculations with regard to recent experimental data by Gatteschi et al. in quasi-1D molecular magnetic materials, R(hfac)3NITEt (R=Gd, Tb, Dy, Ho, Er, ...), is discussed; hfac is hexafluoro-acetylacetonate and NITEt is 2-Ethyl-4,4,5,5-tetramethyl-4,5-dihydro-1H-imidazolyl-1-oxyl-3-oxide.
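
To illustrate the transfer-matrix idea in its simplest textbook setting, the sketch below treats the uniform spin-1/2 chain with nearest-neighbor coupling only, zero field, and periodic boundaries. This is a deliberate simplification and not the alternating-spin, next-nearest-neighbor model of the paper. With K = J/k_B T, the 2x2 transfer matrix has eigenvalues 2cosh K and 2sinh K, so Z = (2cosh K)^N + (2sinh K)^N, which the code checks against brute-force enumeration.

```python
import math
from itertools import product


def z_transfer_matrix(n, K):
    """Partition function of an n-site Ising ring via the transfer-matrix eigenvalues.

    T = [[e^K, e^-K], [e^-K, e^K]] has eigenvalues 2cosh(K) and 2sinh(K),
    and Z = Tr(T^n) is the sum of their n-th powers.
    """
    return (2.0 * math.cosh(K)) ** n + (2.0 * math.sinh(K)) ** n


def z_brute_force(n, K):
    """Same quantity by summing exp(K * sum_i s_i s_{i+1}) over all 2^n configurations."""
    total = 0.0
    for spins in product((-1, 1), repeat=n):
        bond_sum = sum(spins[i] * spins[(i + 1) % n] for i in range(n))
        total += math.exp(K * bond_sum)
    return total


n_sites, coupling = 6, 0.7   # arbitrary illustrative values
z_tm = z_transfer_matrix(n_sites, coupling)
z_bf = z_brute_force(n_sites, coupling)
print(z_tm, z_bf)
```

The enumeration costs 2^N terms while the eigenvalue formula is constant time, which is the practical appeal of the technique.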

  11. Optimization of internet content filtering-Combined with KNN and OCAT algorithms

    NASA Astrophysics Data System (ADS)

    Guo, Tianze; Wu, Lingjing; Liu, Jiaming

    2018-04-01

    Faced with rampant illegal content on the Internet, the traditional ways of filtering information, keyword recognition and manual screening, are performing increasingly poorly. In response, this paper uses the OCAT algorithm nested with the KNN classification algorithm to construct a corpus training library that can dynamically learn and update, so that the filter corpus keeps pace with the constantly updated illegal content of the network, including text and pictures, and can thus better filter and investigate illegal content and its source. Future research will focus on simplifying the updating of the recognition and comparison algorithms and on optimizing the learning ability of the corpus, in order to improve the efficiency of filtering and save time and resources.

  12. Toward a functional near-infrared spectroscopy-based monitoring of pain assessment for nonverbal patients

    NASA Astrophysics Data System (ADS)

    Fernandez Rojas, Raul; Huang, Xu; Ou, Keng-Liang

    2017-10-01

    Pain diagnosis for nonverbal patients represents a challenge in clinical settings. Neuroimaging methods, such as functional magnetic resonance imaging and functional near-infrared spectroscopy (fNIRS), have shown promising results to assess neuronal function in response to nociception and pain. Recent studies suggest that neuroimaging in conjunction with machine learning models can be used to predict different cognitive tasks. The aim of this study is to expand previous studies by exploring the classification of fNIRS signals (oxyhaemoglobin) according to temperature level (cold and hot) and corresponding pain intensity (low and high) using machine learning models. Toward this aim, we used the quantitative sensory testing to determine pain threshold and pain tolerance to cold and heat in 18 healthy subjects (three females), mean age±standard deviation (31.9±5.5). The classification model is based on the bag-of-words approach, a histogram representation used in document classification based on the frequencies of extracted words and adapted for time series; two learning algorithms were used separately, K-nearest neighbor (K-NN) and support vector machines (SVM). A comparison between two sets of fNIRS channels was also made in the classification task, all 24 channels and 8 channels from the somatosensory region defined as our region of interest (RoI). The results showed that K-NN obtained slightly better results (92.08%) than SVM (91.25%) using the 24 channels; however, the performance slightly dropped using only channels from the RoI with K-NN (91.53%) and SVM (90.83%). These results indicate potential applications of fNIRS in the development of a physiologically based diagnosis of human pain that would benefit vulnerable patients who cannot self-report pain.
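
A bag-of-words representation for a time series can be sketched by discretising the signal into symbols, sliding a window to form "words", and counting word frequencies. The equal-width binning, the word length, and the function name below are assumptions for illustration; the abstract does not specify how the study extracted its words.

```python
from collections import Counter


def bag_of_words(series, n_bins=3, word_len=2):
    """Histogram of fixed-length symbol words extracted from a numeric series."""
    lo, hi = min(series), max(series)
    width = (hi - lo) / n_bins or 1.0          # fall back to 1.0 for a constant series
    # Map each sample to a bin index, clamping the maximum into the last bin.
    symbols = [min(int((x - lo) / width), n_bins - 1) for x in series]
    letters = "abcdefghijklmnopqrstuvwxyz"
    words = ["".join(letters[s] for s in symbols[i:i + word_len])
             for i in range(len(symbols) - word_len + 1)]
    return Counter(words)


# Invented signal values, for illustration only.
signal = [0.0, 0.0, 3.0, 6.0, 6.0, 0.0, 0.0]
bow = bag_of_words(signal)
print(bow)
```

The resulting histogram can then be fed as a fixed-length feature vector to a classifier such as K-NN or SVM.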

  13. Modeling ready biodegradability of fragrance materials.

    PubMed

    Ceriani, Lidia; Papa, Ester; Kovarich, Simona; Boethling, Robert; Gramatica, Paola

    2015-06-01

    In the present study, quantitative structure activity relationships were developed for predicting ready biodegradability of approximately 200 heterogeneous fragrance materials. Two classification methods, classification and regression tree (CART) and k-nearest neighbors (kNN), were applied to perform the modeling. The models were validated with multiple external prediction sets, and the structural applicability domain was verified by the leverage approach. The best models had good sensitivity (internal ≥80%; external ≥68%), specificity (internal ≥80%; external 73%), and overall accuracy (≥75%). Results from the comparison with BIOWIN global models, based on group contribution method, show that specific models developed in the present study perform better in prediction than BIOWIN6, in particular for the correct classification of not readily biodegradable fragrance materials. © 2015 SETAC.

  14. Activity recognition in planetary navigation field tests using classification algorithms applied to accelerometer data.

    PubMed

    Song, Wen; Ade, Carl; Broxterman, Ryan; Barstow, Thomas; Nelson, Thomas; Warren, Steve

    2012-01-01

    Accelerometer data provide useful information about subject activity in many different application scenarios. For this study, single-accelerometer data were acquired from subjects participating in field tests that mimic tasks that astronauts might encounter in reduced gravity environments. The primary goal of this effort was to apply classification algorithms that could identify these tasks based on features present in their corresponding accelerometer data, where the end goal is to establish methods to unobtrusively gauge subject well-being based on sensors that reside in their local environment. In this initial analysis, six different activities that involve leg movement are classified. The k-Nearest Neighbors (kNN) algorithm was found to be the most effective, with an overall classification success rate of 90.8%.

  15. Secure Nearest Neighbor Query on Crowd-Sensing Data

    PubMed Central

    Cheng, Ke; Wang, Liangmin; Zhong, Hong

    2016-01-01

    Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of the crowd-sensing data. On the one hand, sensing data terminals as the data owner are numerous and mistrustful, while, on the other hand, in most cases, the terminals find it difficult to finish many safety operations due to computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed by protocols of secure two-party computation and secure Voronoi diagram algorithm. It not only preserves the data confidentiality and query privacy but also effectively resists the collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between the security and query performance compared to other schemes. PMID:27669253

  16. Secure Nearest Neighbor Query on Crowd-Sensing Data.

    PubMed

    Cheng, Ke; Wang, Liangmin; Zhong, Hong

    2016-09-22

    Nearest neighbor queries are fundamental in location-based services, and secure nearest neighbor queries mainly focus on how to securely and quickly retrieve the nearest neighbor in the outsourced cloud server. However, the previous big data system structure has changed because of the crowd-sensing data. On the one hand, sensing data terminals as the data owner are numerous and mistrustful, while, on the other hand, in most cases, the terminals find it difficult to finish many safety operations due to computation and storage capability constraints. In light of the Multi Owners and Multi Users (MOMU) situation in the crowd-sensing data cloud environment, this paper presents a secure nearest neighbor query scheme based on the proxy server architecture, which is constructed by protocols of secure two-party computation and secure Voronoi diagram algorithm. It not only preserves the data confidentiality and query privacy but also effectively resists the collusion between the cloud server and the data owners or users. Finally, extensive theoretical and experimental evaluations are presented to show that our proposed scheme achieves a superior balance between the security and query performance compared to other schemes.

  17. Dielectric and ferroelectric properties of strain-relieved epitaxial lead-free KNN-LT-LS ferroelectric thin films on SrTiO3 substrates

    NASA Astrophysics Data System (ADS)

    Abazari, M.; Akdoǧan, E. K.; Safari, A.

    2008-05-01

    We report the growth of single-phase (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.10,Sb0.06)O3 thin films on SrRuO3 coated ⟨001⟩ oriented SrTiO3 substrates by using pulsed laser deposition. Films grown at 600°C under low laser fluence exhibit a ⟨001⟩ textured columnar grained nanostructure, which coalesce with increasing deposition temperature, leading to a uniform fully epitaxial highly stoichiometric film at 750°C. However, films deposited at lower temperatures exhibit compositional fluctuations as verified by Rutherford backscattering spectroscopy. The epitaxial films of 400-600nm thickness have a room temperature relative permittivity of ˜750 and a loss tangent of ˜6% at 1kHz. The room temperature remnant polarization of the films is 4μC /cm2, while the saturation polarization is 7.1μC/cm2 at 24kV/cm and the coercive field is ˜7.3kV/cm. The results indicate that approximately 50% of the bulk permittivity and 20% of bulk spontaneous polarization can be retained in submicron epitaxial KNN-LT-LS thin film, respectively. The conductivity of the films remains to be a challenge as evidenced by the high loss tangent, leakage currents, and broad hysteresis loops.

  18. Detecting epileptic seizure with different feature extracting strategies using robust machine learning classification techniques by applying advance parameter optimization approach.

    PubMed

    Hussain, Lal

    2018-06-01

    Epilepsy is a neurological disorder caused by abnormal excitability of neurons in the brain. The brain activity of patients suffering from seizures is monitored through the electroencephalogram (EEG) to detect epileptic seizures. The performance of EEG-based epilepsy detection depends on the feature extraction strategies. In this research, we extracted features using varying strategies based on time and frequency domain characteristics, nonlinear, wavelet-based entropy and a few statistical features. A deeper study was undertaken using novel machine learning classifiers by considering multiple factors. The support vector machine kernels are evaluated based on multiclass kernel and box constraint level. Likewise, for K-nearest neighbors (KNN), we evaluated different distance metrics, neighbor weights, and numbers of neighbors. Similarly, for the decision trees we tuned the parameters based on maximum splits and split criteria, and ensemble classifiers are evaluated based on different ensemble methods and learning rate. For training/testing, tenfold cross-validation was employed, and performance was evaluated in terms of TPR, NPR, PPV, accuracy and AUC. In this research, a deeper analysis was performed using diverse feature extraction strategies and robust machine learning classifiers with more advanced optimal options. Support Vector Machine linear kernel and KNN with city block distance metric give the overall highest accuracy of 99.5%, which was higher than using the default parameters for these classifiers. Moreover, the highest separation (AUC = 0.9991, 0.9990) was obtained at different kernel scales using SVM. Additionally, the K-nearest neighbors with inverse squared distance weight give higher performance at different numbers of neighbors. Moreover, to distinguish the postictal heart rate oscillations from epileptic ictal subjects, the highest performance of 100% was obtained using different machine learning classifiers.
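
Two of the KNN options named above, the city block distance metric and inverse squared distance weighting, can be combined in a small sketch. The helper names, labels, and feature values are hypothetical; this is not the study's code, and the toolbox actually used is not stated in the abstract.

```python
from collections import defaultdict


def cityblock(a, b):
    """City block (Manhattan, L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def weighted_knn_predict(train, query, k=3, eps=1e-12):
    """Each of the k nearest neighbours votes with weight 1 / distance^2."""
    nearest = sorted(train, key=lambda item: cityblock(item[0], query))[:k]
    scores = defaultdict(float)
    for features, label in nearest:
        d = cityblock(features, query)
        scores[label] += 1.0 / (d * d + eps)   # eps avoids division by zero
    return max(scores, key=scores.get)


# Invented feature vectors, for illustration only.
train = [((0, 0), "seizure"), ((4, 0), "normal"), ((5, 0), "normal")]
label = weighted_knn_predict(train, (1, 0), k=3)
print(label)
```

In this toy case an unweighted majority vote over the three neighbours would return "normal", while the distance weighting lets the single very close neighbour decide.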

  19. Evaluation of Data Processing Techniques for Unobtrusive Gait Authentication

    DTIC Science & Technology

    2014-03-01

    scatter plot depicting the performance of kNN , by TER, on all experimental mixtures...30  Table 9.  Mean TER of SVM and kNN performance with different voting parameters...performance on XYZ-axis data. ...........................................................51  Table 19.  kNN and SVM results in back pocket carrying

  20. Evaluation of four supervised learning methods for groundwater spring potential mapping in Khalkhal region (Iran) using GIS-based features

    NASA Astrophysics Data System (ADS)

    Naghibi, Seyed Amir; Moradi Dashtpagerdi, Mostafa

    2017-01-01

    One important tool for water resources management in arid and semi-arid areas is groundwater potential mapping. In this study, four data-mining models including K-nearest neighbor (KNN), linear discriminant analysis (LDA), multivariate adaptive regression splines (MARS), and quadratic discriminant analysis (QDA) were used for groundwater potential mapping to get better and more accurate groundwater potential maps (GPMs). For this purpose, 14 groundwater influence factors were considered, such as altitude, slope angle, slope aspect, plan curvature, profile curvature, slope length, topographic wetness index (TWI), stream power index, distance from rivers, river density, distance from faults, fault density, land use, and lithology. From 842 springs in the study area, in the Khalkhal region of Iran, 70 % (589 springs) were considered for training and 30 % (253 springs) were used as a validation dataset. Then, KNN, LDA, MARS, and QDA models were applied in the R statistical software and the results were mapped as GPMs. Finally, the receiver operating characteristics (ROC) curve was implemented to evaluate the performance of the models. According to the results, the area under the curve of ROCs were calculated as 81.4, 80.5, 79.6, and 79.2 % for MARS, QDA, KNN, and LDA, respectively. So, it can be concluded that the performances of KNN and LDA were acceptable and the performances of MARS and QDA were excellent. Also, the results depicted high contribution of altitude, TWI, slope angle, and fault density, while plan curvature and land use were seen to be the least important factors.
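
The 70/30 split and the area under the ROC curve can both be checked with a few lines. The sketch below computes the AUC as the probability that a randomly chosen positive is scored above a randomly chosen negative (ties count one half); the scores and labels are invented, and the study itself used R, not Python.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Split sizes reported in the abstract: 70 % of 842 springs for training.
n_total = 842
n_train = round(0.7 * n_total)
n_valid = n_total - n_train

# Invented model scores and spring/no-spring labels, for illustration only.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
print(n_train, n_valid, auc)
```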

  1. Automated classification of neurological disorders of gait using spatio-temporal gait parameters.

    PubMed

    Pradhan, Cauchy; Wuehr, Max; Akrami, Farhoud; Neuhaeusser, Maximilian; Huth, Sabrina; Brandt, Thomas; Jahn, Klaus; Schniepp, Roman

    2015-04-01

    Automated pattern recognition systems have been used for accurate identification of neurological conditions as well as the evaluation of the treatment outcomes. This study aims to determine the accuracy of diagnoses of (oto-)neurological gait disorders using different types of automated pattern recognition techniques. Clinically confirmed cases of phobic postural vertigo (N = 30), cerebellar ataxia (N = 30), progressive supranuclear palsy (N = 30), bilateral vestibulopathy (N = 30), as well as healthy subjects (N = 30) were recruited for the study. Eight measurements with 136 variables using a GAITRite® sensor carpet were obtained from each subject. Subjects were randomly divided into two groups (training cases and validation cases). Sensitivity and specificity of k-nearest neighbor (KNN), naive Bayes classifier (NB), artificial neural network (ANN), and support vector machine (SVM) in classifying the validation cases were calculated. ANN and SVM had the highest overall sensitivity with 90.6% and 92.0% respectively, followed by NB (76.0%) and KNN (73.3%). SVM and ANN showed high false negative rates for bilateral vestibulopathy cases (20.0% and 26.0%), while KNN and NB had high false negative rates for progressive supranuclear palsy cases (76.7% and 40.0%). Automated pattern recognition systems are able to identify pathological gait patterns and establish clinical diagnosis with good accuracy. SVM and ANN in particular differentiate gait patterns of several distinct oto-neurological disorders of gait with high sensitivity and specificity compared to KNN and NB. Both SVM and ANN appear to be reliable diagnostic and management tools for disorders of gait. Copyright © 2015 Elsevier Ltd. All rights reserved.

  2. Aided diagnosis methods of breast cancer based on machine learning

    NASA Astrophysics Data System (ADS)

    Zhao, Yue; Wang, Nian; Cui, Xiaoyu

    2017-08-01

    In the field of medicine, quickly and accurately determining whether a patient's tumor is malignant or benign is the key to treatment. In this paper, K-Nearest Neighbor, Linear Discriminant Analysis and Logistic Regression were applied to predict the classification of thyroid, Her-2, PR, ER, Ki67, metastasis and lymph nodes in breast cancer, in order to recognize benign and malignant breast tumors and achieve the purpose of aided diagnosis of breast cancer. The results showed that the highest classification accuracy of LDA was 88.56%, while KNN and Logistic Regression classified better than LDA, with the best accuracy reaching 96.30%.

  3. A comparative analysis of predictive models of morbidity in intensive care unit after cardiac surgery - part II: an illustrative example.

    PubMed

    Cevenini, Gabriele; Barbini, Emanuela; Scolletta, Sabino; Biagioli, Bonizella; Giomarelli, Pierpaolo; Barbini, Paolo

    2007-11-22

    Popular predictive models for estimating morbidity probability after heart surgery are compared critically in a unitary framework. The study is divided into two parts. In the first part modelling techniques and intrinsic strengths and weaknesses of different approaches were discussed from a theoretical point of view. In this second part the performances of the same models are evaluated in an illustrative example. Eight models were developed: Bayes linear and quadratic models, k-nearest neighbour model, logistic regression model, Higgins and direct scoring systems and two feed-forward artificial neural networks with one and two layers. Cardiovascular, respiratory, neurological, renal, infectious and hemorrhagic complications were defined as morbidity. Training and testing sets each of 545 cases were used. The optimal set of predictors was chosen among a collection of 78 preoperative, intraoperative and postoperative variables by a stepwise procedure. Discrimination and calibration were evaluated by the area under the receiver operating characteristic curve and Hosmer-Lemeshow goodness-of-fit test, respectively. Scoring systems and the logistic regression model required the largest set of predictors, while Bayesian and k-nearest neighbour models were much more parsimonious. In testing data, all models showed acceptable discrimination capacities, however the Bayes quadratic model, using only three predictors, provided the best performance. All models showed satisfactory generalization ability: again the Bayes quadratic model exhibited the best generalization, while artificial neural networks and scoring systems gave the worst results. Finally, poor calibration was obtained when using scoring systems, k-nearest neighbour model and artificial neural networks, while Bayes (after recalibration) and logistic regression models gave adequate results. Although all the predictive models showed acceptable discrimination performance in the example considered, the Bayes and

  4. Neighbourly support of people with chronic illness; is it related to neighbourhood social capital?

    PubMed

    Waverijn, Geeke; Heijmans, Monique; Groenewegen, Peter P

    2017-01-01

    The neighbourhood may provide resources for health. It is to date unknown whether people who live in neighbourhoods with more social capital have more access to practical and emotional support by neighbours, or whether this is a resource only available to those who are personally connected to people in their neighbourhood. We investigated whether support by neighbours of people with chronic illness was related to neighbourhood social capital and to individual neighbourhood connections. Furthermore, we investigated whether support received from neighbours by people with chronic illness differed according to demographic and disease characteristics. We collected data on support by neighbours and individual connections to neighbours among 2272 people with chronic illness in 2015. Data on neighbourhood social capital were collected among 69,336 people in 3425 neighbourhoods between May 2011 and September 2012. Neighbourhood social capital was estimated with ecometric measurements. We conducted multilevel regression analyses. People with chronic illness were more likely to receive practical and emotional support from neighbours if they had more individual connections to people in their neighbourhood. People with chronic illness were not more likely to receive practical and emotional support from neighbours if they lived in a neighbourhood with more social capital. People with chronic illness with moderate physical disabilities or with comorbidity, and people with chronic illness who lived together with their partner or children, were more likely to receive support from neighbours. To gain more insight into the benefits of neighbourhood social capital, it is necessary to differentiate between the resources only accessible through individual connections to people in the neighbourhood and resources provided through social capital on the neighbourhood level. Copyright © 2016 Elsevier Ltd. All rights reserved.

  5. Rapid lard identification with portable electronic nose

    NASA Astrophysics Data System (ADS)

    Latief, Marsad; Khorsidtalab, Aida; Saputra, Irwan; Akmeliawati, Rini; Nurashikin, Anis; Jaswir, Irwandi; Witjaksono, Gunawan

    2017-11-01

    Human sensory systems are limited in many different regards, yet they are great sources of inspiration for development of technologies that help humans to overcome their restraints. This paper demonstrates the capability of our developed electronic nose in rapid lard identification. The developed device, known as E-Nose, mimics the technique the human olfactory system uses to identify a particular substance. Lard is a common pig derivative which is often used as a food additive, emulsion or shortening. It is also commonly used as an adulterant or as an alternative for cooking oils, margarine and butter. Its consumption is prohibited for Muslims and Orthodox Jews for religious reasons. A portable, reliable device able to identify lard rapidly can be convenient to users concerned about lard adulteration. The prototype was examined using the K-Nearest Neighbors algorithm (KNN), Support Vector Machine (SVM), Bagged Trees and Simple Tree, and can identify lard with the highest accuracy of 95.6% among three types of fat (lard, chicken and beef) in liquid form over a certain range of temperature using KNN.

  6. Lip reading using neural networks

    NASA Astrophysics Data System (ADS)

    Kalbande, Dhananjay; Mishra, Akassh A.; Patil, Sanjivani; Nirgudkar, Sneha; Patel, Prashant

    2011-10-01

    Computerized lip reading, or speech reading, is concerned with the difficult task of converting a video signal of a speaking person to written text. It has several applications, such as teaching deaf and dumb people to speak and communicate effectively with other people, its crime-fighting potential and its invariance to the acoustic environment. We convert the video of the subject speaking vowels into images, and the images are then selected manually for processing. However, several factors such as fast speech, bad pronunciation, poor illumination, movement of the face, moustaches and beards make lip reading difficult. Contour tracking methods and template matching are used for the extraction of lips from the face. The K Nearest Neighbor algorithm is then used to classify the 'speaking' images and the 'silent' images. The sequence of images is then transformed into segments of utterances. A feature vector is calculated on each frame for all the segments and is stored in the database with a properly labeled class. Character recognition is performed using a modified KNN algorithm which assigns more weight to nearer neighbors. This paper reports the recognition of vowels using KNN algorithms.
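
    A KNN variant that gives nearer neighbors more weight can be sketched as below. The inverse-distance weighting, the function name and the toy data are assumptions for illustration; the abstract does not state which weighting scheme the authors used.

```python
import math
from collections import defaultdict


def weighted_knn_predict(train, query, k=3, eps=1e-9):
    """Classify `query` by an inverse-distance-weighted vote of its k nearest neighbours.

    `train` is a list of (feature_vector, label) pairs.
    The 1/(d + eps) weight is an assumed choice, not the paper's stated scheme.
    """
    ranked = sorted((math.dist(x, query), label) for x, label in train)
    votes = defaultdict(float)
    for d, label in ranked[:k]:
        votes[label] += 1.0 / (d + eps)  # closer neighbours contribute more
    return max(votes, key=votes.get)


# Hypothetical toy data: one close "a" sample and two distant "b" samples.
train = [((0.0, 0.0), "a"), ((1.0, 0.0), "b"), ((1.0, 0.1), "b")]
label = weighted_knn_predict(train, (0.1, 0.0), k=3)
```

    With these points an unweighted majority vote over k = 3 would pick "b" (two votes to one), whereas the weighted vote favours the much closer "a" sample.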

  7. Nearest private query based on quantum oblivious key distribution

    NASA Astrophysics Data System (ADS)

    Xu, Min; Shi, Run-hua; Luo, Zhen-yu; Peng, Zhen-wan

    2017-12-01

    Nearest private query is a special private query which involves two parties, a user and a data owner, where the user has a private input (e.g., an integer) and the data owner has a private data set, and the user wants to query which element in the owner's private data set is the nearest to his input without revealing their respective private information. In this paper, we first present a quantum protocol for nearest private query, which is based on quantum oblivious key distribution (QOKD). Compared to the classical related protocols, our protocol has the advantages of the higher security and the better feasibility, so it has a better prospect of applications.

  8. Nearest Neighbor Interactions Affect the Conformational Distribution in the Unfolded State of Peptides

    NASA Astrophysics Data System (ADS)

    Toal, Siobhan; Schweitzer-Stenner, Reinhard; Rybka, Karin; Schwalbe, Hardol

    2013-03-01

    In order to enable structural predictions of intrinsically disordered proteins (IDPs) the intrinsic conformational propensities of amino acids must be complemented by information on nearest-neighbor interactions. To explore the influence of nearest neighbors on conformational distributions, we performed a joint vibrational (Infrared, Vibrational Circular Dichroism (VCD), polarized Raman) and 2D-NMR study of selected GxyG host-guest peptides: GDyG, GSyG, GxLG, GxVG, where x/y = {A, K, L, V}. D and S (L and V) were chosen at the x (y) position because they have been observed to drastically change the distribution of alanine in xAy tripeptide sequences in truncated coil libraries. The conformationally sensitive amide I' profiles of the respective spectra were analyzed in terms of a statistical ensemble described as a superposition of 2D-Gaussian functions in Ramachandran space representing sub-ensembles of pPII-, β-strand-, helical-, and turn-like conformations. Our analysis and simulation of the amide I' band profiles exploits excitonic coupling between the local amide I' vibrational modes in the tetra-peptides. The resulting distributions reveal that D and S, which themselves have high propensities for turn-structures, strongly affect the conformational distribution of their downstream neighbor. Taken together, our results indicate that Dx and Sx motifs might act as conformational randomizers in proteins, attenuating intrinsic propensities of neighboring residues. Overall, our results show that nearest neighbor interactions contribute significantly to the Gibbs energy landscape of disordered peptides and proteins.

  9. Leakage current behavior in lead-free ferroelectric (K,Na)NbO3-LiTaO3-LiSbO3 thin films

    NASA Astrophysics Data System (ADS)

    Abazari, M.; Safari, A.

    2010-12-01

    Conduction mechanisms in epitaxial (001)-oriented pure and 1 mol % Mn-doped (K0.44,Na0.52,Li0.04)(Nb0.84,Ta0.1,Sb0.06)O3 (KNN-LT-LS) thin films on SrTiO3 substrate were investigated. Temperature dependence of leakage current density was measured as a function of applied electric field in the range of 200-380 K. It was shown that different transport mechanisms dominate in pure and Mn-doped thin films. In pure KNN-LT-LS thin films, Poole-Frenkel emission was found to be responsible for the leakage, while Schottky emission was the dominant mechanism in Mn-doped thin films at higher electric fields. This is a remarkable yet clear indication of the effect of 1 mol % Mn on the resistive behavior of such thin films.

  10. Point process statistics in atom probe tomography.

    PubMed

    Philippe, T; Duguay, S; Grancher, G; Blavette, D

    2013-09-01

    We present a review of spatial point processes as statistical models that we have designed for the analysis and treatment of atom probe tomography (APT) data. As a major advantage, these methods do not require sampling. The mean distance to nearest neighbour is an attractive approach to exhibit a non-random atomic distribution. A χ² test based on distance distributions to nearest neighbour has been developed to detect deviation from randomness. Best-fit methods based on first nearest neighbour distance (1 NN method) and pair correlation function are presented and compared to assess the chemical composition of tiny clusters. Delaunay tessellation for cluster selection has also been illustrated. These statistical tools have been applied to APT experiments on microelectronics materials. Copyright © 2012 Elsevier B.V. All rights reserved.
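
    The mean first-nearest-neighbour distance mentioned in this record can be compared against the value expected for a completely random (homogeneous Poisson) arrangement in three dimensions, Γ(4/3)·(3/(4πρ))^(1/3) for number density ρ. The sketch below is a brute-force illustration with hypothetical function names and toy coordinates; it is not the authors' implementation and ignores edge effects.

```python
import math


def mean_nearest_neighbour_distance(points):
    """Average distance from each point to its closest other point (O(n^2) brute force)."""
    total = 0.0
    for i, p in enumerate(points):
        total += min(math.dist(p, q) for j, q in enumerate(points) if j != i)
    return total / len(points)


def expected_mean_nn_distance_3d(density):
    """Mean 1NN distance of a homogeneous 3D Poisson process with the given number density."""
    return math.gamma(4.0 / 3.0) * (3.0 / (4.0 * math.pi * density)) ** (1.0 / 3.0)


# Hypothetical toy coordinates (arbitrary length units).
observed = mean_nearest_neighbour_distance([(0, 0, 0), (1, 0, 0), (0, 2, 0)])
reference = expected_mean_nn_distance_3d(1.0)
```

    An observed mean well below the random reference would hint at clustering, and one well above it at ordering; a formal decision would need a proper test such as the χ² test the abstract describes.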

  11. Word Recognition is Affected by the Meaning of Orthographic Neighbours: Evidence from Semantic Decision Tasks

    ERIC Educational Resources Information Center

    Boot, Inge; Pecher, Diane

    2008-01-01

    Many models of word recognition predict that neighbours of target words will be activated during word processing. Cascaded models can make the additional prediction that semantic features of those neighbours get activated before the target has been uniquely identified. In two semantic decision tasks neighbours that were congruent (i.e., from the…

  12. Enhancing the Discrimination Ability of a Gas Sensor Array Based on a Novel Feature Selection and Fusion Framework.

    PubMed

    Deng, Changjian; Lv, Kun; Shi, Debo; Yang, Bo; Yu, Song; He, Zhiyi; Yan, Jia

    2018-06-12

    In this paper, a novel feature selection and fusion framework is proposed to enhance the discrimination ability of gas sensor arrays for odor identification. Firstly, we put forward an efficient feature selection method based on the separability and the dissimilarity to determine the feature selection order for each type of feature when increasing the dimension of selected feature subsets. Secondly, the K-nearest neighbor (KNN) classifier is applied to determine the dimensions of the optimal feature subsets for different types of features. Finally, in the process of establishing features fusion, we come up with a classification dominance feature fusion strategy which conducts an effective basic feature. Experimental results on two datasets show that the recognition rates of Database I and Database II achieve 97.5% and 80.11%, respectively, when k = 1 for KNN classifier and the distance metric is correlation distance (COR), which demonstrates the superiority of the proposed feature selection and fusion framework in representing signal features. The novel feature selection method proposed in this paper can effectively select feature subsets that are conducive to the classification, while the feature fusion framework can fuse various features which describe the different characteristics of sensor signals, for enhancing the discrimination ability of gas sensors and, to a certain extent, suppressing drift effect.
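
    The classifier setting reported here (k = 1 with correlation distance) can be sketched as follows. Correlation distance is taken as one minus the Pearson correlation of the two feature vectors; the function names and toy vectors are hypothetical, and the sketch does not reproduce the paper's feature selection or fusion steps.

```python
import math


def correlation_distance(u, v):
    """1 - Pearson correlation of u and v (undefined if either vector is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [a - mu for a in u]
    dv = [b - mv for b in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return 1.0 - num / den


def one_nn_predict(train, query):
    """Return the label of the training vector closest to `query` in correlation distance."""
    _, label = min(train, key=lambda item: correlation_distance(item[0], query))
    return label


# Hypothetical toy signals: the query has the same shape as the "rise" sample.
train = [([1.0, 2.0, 3.0], "rise"), ([3.0, 2.0, 1.0], "fall")]
label = one_nn_predict(train, [10.0, 20.0, 31.0])
```

    Because correlation distance ignores offset and scale, the query matches "rise" even though its magnitudes are ten times larger.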

  13. Analysis of the polymeric fractions of scrap from mobile phones using laser-induced breakdown spectroscopy: chemometric applications for better data interpretation.

    PubMed

    Aquino, Francisco W B; Pereira-Filho, Edenir R

    2015-03-01

    Because of their short life span and high production and consumption rates, mobile phones are one of the contributors to WEEE (waste electrical and electronic equipment) growth in many countries. If incorrectly managed, the hazardous materials used in the assembly of these devices can pollute the environment and pose dangers for workers involved in the recycling of these materials. In this study, 144 polymer fragments originating from 50 broken or obsolete mobile phones were analyzed via laser-induced breakdown spectroscopy (LIBS) without previous treatment. The coated polymers were mainly characterized by the presence of Ag, whereas the uncoated polymers were related to the presence of Al, K, Na, Si and Ti. Classification models were proposed using black and white polymers separately in order to identify the manufacturer and origin using KNN (K-nearest neighbor), SIMCA (Soft Independent Modeling of Class Analogy) and PLS-DA (Partial Least Squares for Discriminant Analysis). For the black polymers the percentage of correct predictions was, on average, 58%, taking into consideration the models for manufacturer and origin identification. In the case of white polymers, the percentage of correct predictions ranged from 72.8% (PLS-DA) to 100% (KNN). Copyright © 2014 Elsevier B.V. All rights reserved.

  14. Structural characteristics of Mg-doped (1-x)(K0.5Na0.5)NbO3-xLiSbO3 lead-free ceramics as revealed by Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Zhu, W. L.; Zhu, J. L.; Meng, Y.; Wang, M. S.; Zhu, B.; Zhu, X. H.; Zhu, J. G.; Xiao, D. Q.; Pezzotti, G.

    2011-12-01

    This paper presents a Raman spectroscopic study of compositional-change-induced structure variation and of the related mechanism of Mg doping in LiSbO3 (LS)-modified (K0.5Na0.5)NbO3 (KNN) ceramics. With increasing LS content from 0 to 0.06, a discontinuous shift towards higher wavenumbers was found for the band position of the A1g(ν1) stretching mode of KNN, accompanied by a clearly nonlinear broadening of this band and a decrease in its intensity. Such morphological changes in the Raman spectrum result from two factors: (i) changes in polarizability/binding strength of the O-Nb-O vibration upon incorporation of Li ions in the KNN perovskitic structure and (ii) a polymorphic phase transition (PPT) from orthorhombic to tetragonal (O → T) phase at x > 0.04. Upon increasing the amount, w, of Mg dopant incorporated into the (1-x)KNN-xLS ceramic structure, the intensity of the Raman bands is enhanced, while the peak position and the full width at half maximum of the A1g(ν1) mode were found to depend clearly on both w and x. Raman characterization revealed that the mechanism of Mg doping is strongly correlated with the concentration of Li in the perovskite structure: Mg2+ ions preferentially replace Li+ ions at low Mg doping but replace K/Na ions at higher Mg doping. The PPT O → T was also found to be altered by the introduction of Mg, and the critical value of LS concentration, xO→T, for the incipient O → T transition in the KNN-xLS-wMT system was strongly dependent on Mg content, with xO→T being roughly equal to 0.04 + 2w for the case of dilute Mg alloying.

  15. Competing growth processes induced by next-nearest-neighbor interactions: Effects on meandering wavelength and stiffness

    NASA Astrophysics Data System (ADS)

    Blel, Sonia; Hamouda, Ajmi BH.; Mahjoub, B.; Einstein, T. L.

    2017-02-01

    In this paper we explore the meandering instability of vicinal steps with a kinetic Monte Carlo (kMC) simulation model including the attractive next-nearest-neighbor (NNN) interactions. kMC simulations show that an increase of the NNN interaction strength leads to a considerable reduction of the meandering wavelength and to a weaker dependence of the wavelength on the deposition rate F. The dependences of the meandering wavelength on the temperature and the deposition rate obtained with simulations are in good quantitative agreement with the experimental result on the meandering instability of Cu(0 2 24) [T. Maroutian et al., Phys. Rev. B 64, 165401 (2001), 10.1103/PhysRevB.64.165401]. The effective step stiffness is found to depend not only on the strength of NNN interactions and the Ehrlich-Schwoebel barrier, but also on F. We argue that attractive NNN interactions intensify the incorporation of adatoms at step edges and enhance step roughening. Competition between NNN and nearest-neighbor interactions results in an alternative form of meandering instability which we call "roughening-limited" growth, rather than attachment-detachment-limited growth that governs the Bales-Zangwill instability. The computed effective wavelength and the effective stiffness behave as $\lambda_{\mathrm{eff}} \sim F^{-q}$ and $\tilde{\beta}_{\mathrm{eff}} \sim F^{-p}$, respectively, with $q \approx p/2$.

  16. On estimation in k-tree sampling

    Treesearch

    Christoph Kleinn; Frantisek Vilcko

    2007-01-01

    The plot design known as k-tree sampling involves taking the k nearest trees from a selected sample point as sample trees. While this plot design is very practical and easily applied in the field for moderate values of k, unbiased estimation remains a problem. In this article, we give a brief introduction to the...
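
    The selection step of k-tree sampling, taking the k trees nearest to a sample point, can be sketched as below. The plug-in density k / (π r_k²), where r_k is the distance to the k-th tree, is included only as a naive illustration; it is an assumption of this sketch and is not presented as the estimator discussed in the article, which notes that unbiased estimation remains a problem. Function names and coordinates are hypothetical.

```python
import math


def k_tree_sample(trees, point, k):
    """Return the k trees closest to the sample point, nearest first (requires k <= len(trees))."""
    return sorted(trees, key=lambda t: math.dist(t, point))[:k]


def naive_density(trees, point, k):
    """Naive plug-in stem density: k trees inside a circle reaching the k-th tree."""
    sample = k_tree_sample(trees, point, k)
    r_k = math.dist(sample[-1], point)  # distance to the k-th nearest tree
    return k / (math.pi * r_k ** 2)


# Hypothetical tree coordinates in metres, with the sample point at the origin.
trees = [(1.0, 0.0), (0.0, 2.0), (3.0, 0.0), (0.0, 4.0)]
density = naive_density(trees, (0.0, 0.0), k=3)
```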

  17. Seeds integrate biological information about conspecific and allospecific neighbours.

    PubMed

    Yamawo, Akira; Mukai, Hiromi

    2017-06-28

    Numerous organisms integrate information from multiple sources and express adaptive behaviours, but how they do so at different developmental stages remains to be identified. Seeds, which are the embryonic stage of plants, need to make decisions about the timing of emergence in response to environmental cues related to survival. We investigated the timing of emergence of Plantago asiatica (Plantaginaceae) seed while manipulating the presence of Trifolium repens seed and the relatedness of neighbouring P. asiatica seed. The relatedness of neighbouring P. asiatica seed and the presence of seeds of T. repens did not on their own influence the timing of P. asiatica emergence. However, when encountering a T. repens seed, a P. asiatica seed emerged faster in the presence of a sibling seed than in the presence of a non-sibling seed. Water extracts of seeds gave the same result. We show that P. asiatica seeds integrate information about the relatedness of neighbouring P. asiatica seeds and the presence of seeds of a different species via water-soluble chemicals and adjust their emergence behaviour in response. These findings suggest the presence of kin-dependent interspecific interactions. © 2017 The Author(s).

  18. Predicting the binding preference of transcription factors to individual DNA k-mers.

    PubMed

    Alleyne, Trevis M; Peña-Castillo, Lourdes; Badis, Gwenael; Talukder, Shaheynoor; Berger, Michael F; Gehrke, Andrew R; Philippakis, Anthony A; Bulyk, Martha L; Morris, Quaid D; Hughes, Timothy R

    2009-04-15

    Recognition of specific DNA sequences is a central mechanism by which transcription factors (TFs) control gene expression. Many TF-binding preferences, however, are unknown or poorly characterized, in part due to the difficulty associated with determining their specificity experimentally, and an incomplete understanding of the mechanisms governing sequence specificity. New techniques that estimate the affinity of TFs to all possible k-mers provide a new opportunity to study DNA-protein interaction mechanisms, and may facilitate inference of binding preferences for members of a given TF family when such information is available for other family members. We employed a new dataset consisting of the relative preferences of mouse homeodomains for all eight-base DNA sequences in order to ask how well we can predict the binding profiles of homeodomains when only their protein sequences are given. We evaluated a panel of standard statistical inference techniques, as well as variations of the protein features considered. Nearest neighbour among functionally important residues emerged among the most effective methods. Our results underscore the complexity of TF-DNA recognition, and suggest a rational approach for future analyses of TF families.

  19. Improved nearest codeword search scheme using a tighter kick-out condition

    NASA Astrophysics Data System (ADS)

    Hwang, Kuo-Feng; Chang, Chin-Chen

    2001-09-01

    Using a tighter kick-out condition as a faster approach to nearest codeword searches is proposed. The proposed scheme finds the nearest codeword that is identical to the one found using a full search. However, using our scheme, the search time is much shorter. Our scheme first establishes a tighter kick-out condition. Then, the temporal nearest codeword can be obtained from the codewords that survive the tighter condition. Finally, the temporal nearest codeword cooperates with the query vector to constitute a better kick-out condition. In other words, more codewords can be excluded without actually computing the distances between the bypassed codewords and the query vector. Comparisons with previous work are included to present the benefits of the proposed scheme in relation to search time.
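
    The general idea of rejecting codewords early while still returning the full-search result can be illustrated with partial distance elimination, a generic early-exit test. This is a swapped-in technique chosen for illustration and is not the tighter kick-out condition proposed by the authors, which the abstract does not specify; the function name and toy codebook are hypothetical.

```python
def nearest_codeword(codebook, query):
    """Return (index, squared_distance) of the codeword nearest to `query`.

    A candidate is kicked out as soon as its running squared distance
    reaches the best distance found so far, so the result matches a full search.
    """
    best_index, best_dist = -1, float("inf")
    for i, codeword in enumerate(codebook):
        partial = 0.0
        for c, q in zip(codeword, query):
            partial += (c - q) ** 2
            if partial >= best_dist:
                break  # cannot beat the current best: skip the remaining dimensions
        else:
            # Loop finished without a break, so this codeword is strictly closer.
            best_index, best_dist = i, partial
    return best_index, best_dist


# Hypothetical toy codebook and query vector.
index, dist_sq = nearest_codeword([(0, 0), (5, 5), (1, 1)], (1, 2))
```

    Here the second codeword is rejected after its first dimension, since (5 - 1)² = 16 already exceeds the best squared distance of 5 found so far.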

  20. Physical Human Activity Recognition Using Wearable Sensors.

    PubMed

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-12-11

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors' placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject.

  1. Can Laws Be a Potential PET Image Texture Analysis Approach for Evaluation of Tumor Heterogeneity and Histopathological Characteristics in NSCLC?

    PubMed

    Karacavus, Seyhan; Yılmaz, Bülent; Tasdemir, Arzu; Kayaaltı, Ömer; Kaya, Eser; İçer, Semra; Ayyıldız, Oguzhan

    2018-04-01

    We investigated the association between the textural features obtained from 18F-FDG images, metabolic parameters (SUVmax, SUVmean, MTV, TLG), and tumor histopathological characteristics (stage and Ki-67 proliferation index) in non-small cell lung cancer (NSCLC). The FDG-PET images of 67 patients with NSCLC were evaluated. MATLAB technical computing language was employed in the extraction of 137 features by using first order statistics (FOS), gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), and Laws' texture filters. Textural features and metabolic parameters were statistically analyzed in terms of good discrimination power between tumor stages, and selected features/parameters were used in the automatic classification by k-nearest neighbors (k-NN) and support vector machines (SVM). We showed that one textural feature (gray-level nonuniformity, GLN) obtained using GLRLM approach and nine textural features using Laws' approach were successful in discriminating all tumor stages, unlike metabolic parameters. There were significant correlations between Ki-67 index and some of the textural features computed using Laws' method (r = 0.6, p = 0.013). In terms of automatic classification of tumor stage, the accuracy was approximately 84% with k-NN classifier (k = 3) and SVM, using selected five features. Texture analysis of FDG-PET images has a potential to be an objective tool to assess tumor histopathological characteristics. The textural features obtained using Laws' approach could be useful in the discrimination of tumor stage.

  2. Physical Human Activity Recognition Using Wearable Sensors

    PubMed Central

    Attal, Ferhat; Mohammed, Samer; Dedabrishvili, Mariam; Chamroukhi, Faicel; Oukhellou, Latifa; Amirat, Yacine

    2015-01-01

    This paper presents a review of different classification techniques used to recognize human activities from wearable inertial sensor data. Three inertial sensor units were used in this study and were worn by healthy subjects at key points of upper/lower body limbs (chest, right thigh and left ankle). Three main steps describe the activity recognition process: sensors’ placement, data pre-processing and data classification. Four supervised classification techniques namely, k-Nearest Neighbor (k-NN), Support Vector Machines (SVM), Gaussian Mixture Models (GMM), and Random Forest (RF) as well as three unsupervised classification techniques namely, k-Means, Gaussian mixture models (GMM) and Hidden Markov Model (HMM), are compared in terms of correct classification rate, F-measure, recall, precision, and specificity. Raw data and extracted features are used separately as inputs of each classifier. The feature selection is performed using a wrapper approach based on the RF algorithm. Based on our experiments, the results obtained show that the k-NN classifier provides the best performance compared to other supervised classification algorithms, whereas the HMM classifier is the one that gives the best results among unsupervised classification algorithms. This comparison highlights which approach gives better performance in both supervised and unsupervised contexts. It should be noted that the obtained results are limited to the context of this study, which concerns the classification of the main daily living human activities using three wearable accelerometers placed at the chest, right shank and left ankle of the subject. PMID:26690450

  3. Compositional Design of Dielectric, Ferroelectric and Piezoelectric Properties of (K, Na)NbO₃ and (Ba, Na)(Ti, Nb)O₃ Based Ceramics Prepared by Different Sintering Routes.

    PubMed

    Eiras, José A; Gerbasi, Rosimeire B Z; Rosso, Jaciele M; Silva, Daniel M; Cótica, Luiz F; Santos, Ivair A; Souza, Camila A; Lente, Manuel H

    2016-03-08

    Lead-free piezoelectric materials are being intensively investigated in order to substitute lead-based ones, commonly used in many different applications. Among the most promising lead-free materials are those with modified NaNbO₃, such as the (K, Na)NbO₃ (KNN) and (Ba, Na)(Ti, Nb)O₃ (BTNN) families. From a ceramic processing point of view, high-density single-phase KNN and BTNN ceramics are very difficult to sinter due to the volatility of the alkaline elements, the narrow sintering temperature range and the anomalous grain growth. In this work, Spark Plasma Sintering (SPS) and high-energy ball milling (HEBM), following heat treatments (calcining and sintering), in oxidative (O₂) atmosphere have been used to prepare single-phase highly densified KNN ("pure" and Cu2+ or Li+ doped), with theoretical densities ρth > 97%, and BTNN ceramics (ρth ~ 90%), respectively. Using BTNN ceramics with a P4mm perovskite-like structure, we showed that by increasing the NaNbO₃ content, the ferroelectric properties change from having a relaxor effect to an almost "normal" ferroelectric character, while the tetragonality and grain size increase and the shear piezoelectric coefficients (k15, g15 and d15) improve. For KNN ceramics, the results reveal that the values for remanent polarization as well as for most of the coercive field are quite similar among all compositions. These facts evidenced that Cu2+ may be incorporated into the A and/or B sites of the perovskite structure, having both hardening and softening effects.

  4. Nearest-neighbor Kitaev exchange blocked by charge order in electron-doped α -RuCl3

    NASA Astrophysics Data System (ADS)

    Koitzsch, A.; Habenicht, C.; Müller, E.; Knupfer, M.; Büchner, B.; Kretschmer, S.; Richter, M.; van den Brink, J.; Börrnert, F.; Nowak, D.; Isaeva, A.; Doert, Th.

    2017-10-01

    A quantum spin liquid might be realized in α-RuCl3, a honeycomb-lattice magnetic material with substantial spin-orbit coupling. Moreover, α-RuCl3 is a Mott insulator, which implies the possibility that novel exotic phases occur upon doping. Here, we study the electronic structure of this material when intercalated with potassium by photoemission spectroscopy, electron energy loss spectroscopy, and density functional theory calculations. We obtain a stable stoichiometry at K0.5RuCl3. This gives rise to a peculiar charge disproportionation into formally Ru2+ (4d6) and Ru3+ (4d5). Every Ru 4d5 site with one hole in the t2g shell is surrounded by nearest neighbors of 4d6 character, where the t2g level is full and magnetically inert. Thus, each type of Ru site forms a triangular lattice, and nearest-neighbor interactions of the original honeycomb are blocked.

  5. ReliefSeq: A Gene-Wise Adaptive-K Nearest-Neighbor Feature Selection Tool for Finding Gene-Gene Interactions and Main Effects in mRNA-Seq Gene Expression Data

    PubMed Central

    McKinney, Brett A.; White, Bill C.; Grill, Diane E.; Li, Peter W.; Kennedy, Richard B.; Poland, Gregory A.; Oberg, Ann L.

    2013-01-01

    Relief-F is a nonparametric, nearest-neighbor machine learning method that has been successfully used to identify relevant variables that may interact in complex multivariate models to explain phenotypic variation. While several tools have been developed for assessing differential expression in sequence-based transcriptomics, the detection of statistical interactions between transcripts has received less attention in the area of RNA-seq analysis. We describe a new extension and assessment of Relief-F for feature selection in RNA-seq data. The ReliefSeq implementation adapts the number of nearest neighbors (k) for each gene to optimize the Relief-F test statistics (importance scores) for finding both main effects and interactions. We compare this gene-wise adaptive-k (gwak) Relief-F method with standard RNA-seq feature selection tools, such as DESeq and edgeR, and with the popular machine learning method Random Forests. We demonstrate performance on a panel of simulated data that have a range of distributional properties reflected in real mRNA-seq data including multiple transcripts with varying sizes of main effects and interaction effects. For simulated main effects, gwak-Relief-F feature selection performs comparably to standard tools DESeq and edgeR for ranking relevant transcripts. For gene-gene interactions, gwak-Relief-F outperforms all comparison methods at ranking relevant genes in all but the highest fold change/highest signal situations where it performs similarly. The gwak-Relief-F algorithm outperforms Random Forests for detecting relevant genes in all simulation experiments. In addition, Relief-F is comparable to the other methods based on computational time. We also apply ReliefSeq to an RNA-Seq study of smallpox vaccine to identify gene expression changes between vaccinia virus-stimulated and unstimulated samples. 
ReliefSeq is an attractive tool for inclusion in the suite of tools used for analysis of mRNA-Seq data; it has power to detect both main

  6. Compositional Design of Dielectric, Ferroelectric and Piezoelectric Properties of (K, Na)NbO3 and (Ba, Na)(Ti, Nb)O3 Based Ceramics Prepared by Different Sintering Routes

    PubMed Central

    Eiras, José A.; Gerbasi, Rosimeire B. Z.; Rosso, Jaciele M.; Silva, Daniel M.; Cótica, Luiz F.; Santos, Ivair A.; Souza, Camila A.; Lente, Manuel H.

    2016-01-01

    Lead free piezoelectric materials are being intensively investigated in order to substitute lead based ones, commonly used in many different applications. Among the most promising lead-free materials are those with modified NaNbO3, such as (K, Na)NbO3 (KNN) and (Ba, Na)(Ti, Nb)O3 (BTNN) families. From a ceramic processing point of view, high density single phase KNN and BTNN ceramics are very difficult to sinter due to the volatility of the alkaline elements, the narrow sintering temperature range and the anomalous grain growth. In this work, Spark Plasma Sintering (SPS) and high-energy ball milling (HEBM), following heat treatments (calcining and sintering), in oxidative (O2) atmosphere have been used to prepare single phase highly densified KNN (“pure” and Cu2+ or Li1+ doped), with theoretical densities ρth > 97% and BTNN ceramics (ρth ~ 90%), respectively. Using BTNN ceramics with a P4mm perovskite-like structure, we showed that by increasing the NaNbO3 content, the ferroelectric properties change from having a relaxor effect to an almost “normal” ferroelectric character, while the tetragonality and grain size increase and the shear piezoelectric coefficients (k15, g15 and d15) improve. For KNN ceramics, the results reveal that the values for remanent polarization as well as for most of the coercive field are quite similar among all compositions. These facts evidenced that Cu2+ may be incorporated into the A and/or B sites of the perovskite structure, having both hardening and softening effects. PMID:28773304

  7. [Terahertz Spectroscopic Identification with Deep Belief Network].

    PubMed

    Ma, Shuai; Shen, Tao; Wang, Rui-qi; Lai, Hua; Yu, Zheng-tao

    2015-12-01

    Feature extraction and classification are the key issues in terahertz spectroscopic identification. Because many materials have no apparent absorption peaks in the terahertz band, it is difficult to extract their terahertz spectral features and identify them. To this end, a novel terahertz spectroscopic identification approach based on a Deep Belief Network (DBN) was studied in this paper, which combines the advantages of the DBN and the K-Nearest Neighbors (KNN) classifier. Firstly, cubic spline interpolation and an S-G filter were used to normalize the terahertz transmission spectra of eight kinds of substances (ATP, Acetylcholine Bromide, Bifenthrin, Buprofezin, Carbazole, Bleomycin, Buckminster and Cyclotriphosphazene) in the range of 0.9-6 THz. Secondly, the DBN model was built from two restricted Boltzmann machines (RBMs) and then trained layer by layer using an unsupervised approach. Instead of using handmade features, the DBN was employed to learn suitable features automatically from raw input data. Finally, a KNN classifier was applied to identify the terahertz spectrum. Experimental results show that the features learned by the DBN can identify the terahertz spectra of different substances with a recognition rate of over 90%, which demonstrates that the proposed method can automatically extract effective features of terahertz spectra. Furthermore, this KNN classifier was compared with others (BP neural network, SOM neural network and RBF neural network). Comparisons showed that the recognition rate of the KNN classifier is better than those of the other three classifiers. Automatically extracting terahertz spectral features with the DBN can greatly reduce the workload of feature extraction. The proposed method shows promise for identifying large volumes of terahertz spectra.

  8. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis

    PubMed Central

    Galván-Tejada, Carlos E.; Zanella-Calzada, Laura A.; Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L.

    2017-01-01

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions. PMID:28216571

  9. Multivariate Feature Selection of Image Descriptors Data for Breast Cancer with Computer-Assisted Diagnosis.

    PubMed

    Galván-Tejada, Carlos E; Zanella-Calzada, Laura A; Galván-Tejada, Jorge I; Celaya-Padilla, José M; Gamboa-Rosales, Hamurabi; Garza-Veloz, Idalia; Martinez-Fierro, Margarita L

    2017-02-14

    Breast cancer is an important global health problem, and the most common type of cancer among women. Late diagnosis significantly decreases the survival rate of the patient; however, using mammography for early detection has been demonstrated to be a very important tool increasing the survival rate. The purpose of this paper is to obtain a multivariate model to classify benign and malignant tumor lesions using a computer-assisted diagnosis with a genetic algorithm in training and test datasets from mammography image features. A multivariate search was conducted to obtain predictive models with different approaches, in order to compare and validate results. The multivariate models were constructed using: Random Forest, Nearest centroid, and K-Nearest Neighbor (K-NN) strategies as cost function in a genetic algorithm applied to the features in the BCDR public databases. Results suggest that the two texture descriptor features obtained in the multivariate model have a similar or better prediction capability to classify the data outcome compared with the multivariate model composed of all the features, according to their fitness value. This model can help to reduce the workload of radiologists and present a second opinion in the classification of tumor lesions.

  10. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography

    PubMed Central

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present. PMID:27213008

  11. Detection of Periodic Leg Movements by Machine Learning Methods Using Polysomnographic Parameters Other Than Leg Electromyography.

    PubMed

    Umut, İlhan; Çentik, Güven

    2016-01-01

    The number of channels used for polysomnographic recording frequently causes difficulties for patients because of the many cables connected. Also, it increases the risk of having troubles during recording process and increases the storage volume. In this study, it is intended to detect periodic leg movement (PLM) in sleep with the use of the channels except leg electromyography (EMG) by analysing polysomnography (PSG) data with digital signal processing (DSP) and machine learning methods. PSG records of 153 patients of different ages and genders with PLM disorder diagnosis were examined retrospectively. A novel software was developed for the analysis of PSG records. The software utilizes the machine learning algorithms, statistical methods, and DSP methods. In order to classify PLM, popular machine learning methods (multilayer perceptron, K-nearest neighbour, and random forests) and logistic regression were used. Comparison of classified results showed that while K-nearest neighbour classification algorithm had higher average classification rate (91.87%) and lower average classification error value (RMSE = 0.2850), multilayer perceptron algorithm had the lowest average classification rate (83.29%) and the highest average classification error value (RMSE = 0.3705). Results showed that PLM can be classified with high accuracy (91.87%) without leg EMG record being present.

  12. Neighbour-die effect on the measurement of wafer-level flip-chip LED dies in production lines

    NASA Astrophysics Data System (ADS)

    Chen, Tengfei; Wan, Zirui; Li, Bin

    2017-11-01

    The light from the side surfaces of the test flip-chip light-emitting diode (FCLED) dies is reflected, refracted or absorbed by neighbour dies during the measurement of wafer-level FCLED dies in production lines. A notable measurement deviation is caused by the neighbour-die effect, which is not considered in current industry practice. In this paper, Monte Carlo ray-tracing simulations are used to study the measurement deviations caused by the neighbour-die effect and extension ratios of the film. The simulation results show that the maximal deviation of radiant flux impinging the photodiode can reach 5.5%, if the die is tested without any neighbour dies, or is surrounded by a set of neighbour dies at an extension ratio of 1.1. Moreover, the dependence between the measurement results and neighbour cases for different extension ratios is also investigated. Then, a modified calibration method is proposed and studied. The proposed technique can be used to improve the calibration and measurement accuracy of the test equipment used for measurement of wafer-level FCLED dies in production lines.

  13. Continuous excitations of the triangular-lattice quantum spin liquid YbMgGaO4

    DOE PAGES

    Paddison, Joseph A. M.; Daum, Marcus; Dun, Zhiling; ...

    2016-12-05

    A quantum spin liquid (QSL) is an exotic state of matter in which electrons’ spins are quantum entangled over long distances, but do not show magnetic order in the zero-temperature limit. The observation of QSL states is a central aim of experimental physics, because they host collective excitations that transcend our knowledge of quantum matter; however, examples in real materials are scarce. We report neutron-scattering experiments on YbMgGaO4, a QSL candidate in which Yb3+ ions with effective spin-1/2 occupy a triangular lattice. Furthermore, our measurements reveal a continuum of magnetic excitations, the essential experimental hallmark of a QSL, at very low temperature (0.06 K). The origin of this peculiar excitation spectrum is a crucial question, because isotropic nearest-neighbour interactions do not yield a QSL ground state on the triangular lattice. Using measurements in the field-polarized state, we identify antiferromagnetic next-nearest-neighbour interactions, spin-space anisotropies and chemical disorder between the magnetic layers as key ingredients in YbMgGaO4.

  14. Empirical Wavelet Transform Based Features for Classification of Parkinson's Disease Severity.

    PubMed

    Oung, Qi Wei; Muthusamy, Hariharan; Basah, Shafriza Nisha; Lee, Hoileong; Vijean, Vikneswaran

    2017-12-29

    Parkinson's disease (PD) is a type of progressive neurodegenerative disorder that has affected a large part of the population till now. Several symptoms of PD include tremor, rigidity, slowness of movements and vocal impairments. In order to develop an effective diagnostic system, a number of algorithms were proposed mainly to distinguish healthy individuals from the ones with PD. However, most of the previous works were conducted based on a binary classification, with the early PD stage and the advanced ones being treated equally. Therefore, in this work, we propose a multiclass classification with three classes of PD severity level (mild, moderate, severe) and healthy control. The focus is to detect and classify PD using signals from wearable motion and audio sensors based on both empirical wavelet transform (EWT) and empirical wavelet packet transform (EWPT) respectively. The EWT/EWPT was applied to decompose both speech and motion data signals up to five levels. Next, several features are extracted after obtaining the instantaneous amplitudes and frequencies from the coefficients of the decomposed signals by applying the Hilbert transform. The performance of the algorithm was analysed using three classifiers - K-nearest neighbour (KNN), probabilistic neural network (PNN) and extreme learning machine (ELM). Experimental results demonstrated that our proposed approach had the ability to differentiate PD from non-PD subjects, including their severity level - with classification accuracies of more than 90% using EWT/EWPT-ELM based on signals from motion and audio sensors respectively. Additionally, classification accuracy of more than 95% was achieved when EWT/EWPT-ELM is applied to signals from integration of both signal's information.

  15. Development of a computer aided diagnosis model for prostate cancer classification on multi-parametric MRI

    NASA Astrophysics Data System (ADS)

    Alfano, R.; Soetemans, D.; Bauman, G. S.; Gibson, E.; Gaed, M.; Moussa, M.; Gomez, J. A.; Chin, J. L.; Pautler, S.; Ward, A. D.

    2018-02-01

    Multi-parametric MRI (mp-MRI) is becoming a standard in contemporary prostate cancer screening and diagnosis, and has been shown to aid physicians in cancer detection. It offers many advantages over traditional systematic biopsy, which has been shown to have very high clinical false-negative rates of up to 23% at all stages of the disease. However beneficial, mp-MRI is relatively complex to interpret and suffers from inter-observer variability in lesion localization and grading. Computer-aided diagnosis (CAD) systems have been developed as a solution as they have the power to perform deterministic quantitative image analysis. We measured the accuracy of such a system validated using accurately co-registered whole-mount digitized histology. We trained a logistic linear classifier (LOGLC), support vector machine (SVC), k-nearest neighbour (KNN) and random forest classifier (RFC) in a four part ROI based experiment against: 1) cancer vs. non-cancer, 2) high-grade (Gleason score ≥4+3) vs. low-grade cancer (Gleason score <4+3), 3) high-grade vs. other tissue components and 4) high-grade vs. benign tissue by selecting the classifier with the highest AUC using 1-10 features from forward feature selection. The CAD model was able to classify malignant vs. benign tissue and detect high-grade cancer with high accuracy. Once fully validated, this work will form the basis for a tool that enhances the radiologist's ability to detect malignancies, potentially improving biopsy guidance, treatment selection, and focal therapy for prostate cancer patients, maximizing the potential for cure and increasing quality of life.

  16. Piezoelectric Properties of LiSbO3-Modified (K0.48Na0.52)NbO3 Lead-Free Ceramics

    NASA Astrophysics Data System (ADS)

    Wu, Jiagang; Wang, Yuanyu; Xiao, Dingquan; Zhu, Jianguo; Yu, Ping; Wu, Lang; Wu, Wenjuan

    2007-11-01

    Lead-free piezoelectric (1-x)(K0.48Na0.52)NbO3-xLiSbO3 [(1-x)KNN-xLS] ceramics were prepared by conventional sintering. A morphotropic phase boundary (MPB) between the orthorhombic and tetragonal phases was identified in the composition range of 0.04KNN-xLS ceramic is a promising lead-free piezoelectric material.

  17. Computing the Edge-Neighbour-Scattering Number of Graphs

    NASA Astrophysics Data System (ADS)

    Wei, Zongtian; Qi, Nannan; Yue, Xiaokui

    2013-11-01

    A set of edges X is subverted from a graph G by removing the closed neighbourhood N[X] from G. We denote the survival subgraph by G/X. An edge-subversion strategy X is called an edge-cut strategy of G if G/X is disconnected, a single vertex, or empty. The edge-neighbour-scattering number of a graph G is defined as ENS(G) = max{ω(G/X)-|X| : X is an edge-cut strategy of G}, where ω(G/X) is the number of components of G/X. This parameter can be used to measure the vulnerability of networks when some edges fail, especially spy networks and virus-infected networks. In this paper, we prove that the problem of computing the edge-neighbour-scattering number of a graph is NP-complete and give some upper and lower bounds for this parameter.

  18. Quantum realization of the nearest neighbor value interpolation method for INEQR

    NASA Astrophysics Data System (ADS)

    Zhou, RiGui; Hu, WenWen; Luo, GaoFeng; Liu, XingAo; Fan, Ping

    2018-07-01

    This paper presents the nearest neighbor value (NNV) interpolation algorithm for the improved novel enhanced quantum representation of digital images (INEQR). It is necessary to use interpolation in image scaling because there is an increase or a decrease in the number of pixels. The difference between the proposed scheme and nearest neighbor interpolation is that the concept applied to estimate the missing pixel value is guided by the nearest value rather than the distance. Firstly, a sequence of quantum operations is predefined, such as cyclic shift transformations and the basic arithmetic operations. Then, the feasibility of the nearest neighbor value interpolation method for quantum images of INEQR is proven using the previously designed quantum operations. Furthermore, a quantum image scaling algorithm in the form of circuits of the NNV interpolation for INEQR is constructed for the first time. The merit of the proposed INEQR circuit lies in its low complexity, which is achieved by utilizing the unique properties of quantum superposition and entanglement. Finally, simulation experiments on different classical (i.e., conventional or non-quantum) images and scaling ratios, run in MATLAB 2014b on a classical computer, demonstrate that the proposed interpolation method achieves higher resolution than nearest neighbor and bilinear interpolation.
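
    For orientation, the classical nearest neighbor interpolation that the abstract uses as a baseline can be sketched as follows. This is not the NNV method or the quantum circuit from the paper; the function name `nearest_neighbor_resize` and the floor-based index mapping are assumptions chosen for illustration.

```python
def nearest_neighbor_resize(image, new_height, new_width):
    """Resize a 2-D grayscale image (a list of rows) by nearest-neighbour sampling.

    Illustrative sketch only: each output pixel copies one source pixel,
    chosen by scaling the output index back into the source grid.
    """
    old_height = len(image)
    old_width = len(image[0])
    resized = []
    for r in range(new_height):
        # Map the output row to a source row (floor of the scaled index).
        src_r = min(old_height - 1, (r * old_height) // new_height)
        row = []
        for c in range(new_width):
            src_c = min(old_width - 1, (c * old_width) // new_width)
            row.append(image[src_r][src_c])
        resized.append(row)
    return resized


# Hypothetical 2x2 image scaled up to 4x4, then back down to 2x2.
image = [[10, 20],
         [30, 40]]
upscaled = nearest_neighbor_resize(image, 4, 4)
downscaled = nearest_neighbor_resize(upscaled, 2, 2)
```

    Because every output pixel is a copy of a single source pixel, upscaling produces blocks of repeated values, which is the blockiness that value-guided or bilinear schemes try to reduce.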

  19. HOW FAR TO THE NEAREST ROAD?

    EPA Science Inventory

    Ecological impacts from roads may be the rule rather than the exception in most watersheds of the conterminous United States. We measured total area, and forestland area located within nine distances of the nearest road of any type in each of 2,108 watersheds nationwide. Overall,...

  20. Detecting falls with wearable sensors using machine learning techniques.

    PubMed

    Özdemir, Ahmet Turan; Barshan, Billur

    2014-06-18

    Falls are a serious public health problem and possibly life threatening for people in fall risk groups. We develop an automated fall detection system with wearable motion sensor units fitted to the subjects' body at six different positions. Each unit comprises three tri-axial devices (accelerometer, gyroscope, and magnetometer/compass). Fourteen volunteers perform a standardized set of movements including 20 voluntary falls and 16 activities of daily living (ADLs), resulting in a large dataset with 2520 trials. To reduce the computational complexity of training and testing the classifiers, we focus on the raw data for each sensor in a 4 s time window around the point of peak total acceleration of the waist sensor, and then perform feature extraction and reduction. Most earlier studies on fall detection employ rule-based approaches that rely on simple thresholding of the sensor outputs. We successfully distinguish falls from ADLs using six machine learning techniques (classifiers): the k-nearest neighbor (k-NN) classifier, least squares method (LSM), support vector machines (SVM), Bayesian decision making (BDM), dynamic time warping (DTW), and artificial neural networks (ANNs). We compare the performance and the computational complexity of the classifiers and achieve the best results with the k-NN classifier and LSM, with sensitivity, specificity, and accuracy all above 99%. These classifiers also have acceptable computational requirements for training and testing. Our approach would be applicable in real-world scenarios where data records of indeterminate length, containing multiple activities in sequence, are recorded.
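
    A minimal sketch of the k-NN classification and the three reported metrics (sensitivity, specificity, accuracy) is given below. The two-dimensional feature vectors, the labels "fall" and "adl", the choice k=3 and the function names are all made up for illustration; the study's actual features, feature reduction and parameter values are not given in the abstract.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query by majority vote among its k nearest training points."""
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def binary_metrics(y_true, y_pred, positive="fall"):
    """Return (sensitivity, specificity, accuracy) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn)   # share of true falls that were detected
    specificity = tn / (tn + fp)   # share of ADLs correctly left unflagged
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical, well-separated toy data (not from the study).
train_X = [(9.0, 8.5), (8.0, 9.5), (9.5, 9.0), (1.0, 1.5), (2.0, 1.0), (1.5, 2.0)]
train_y = ["fall", "fall", "fall", "adl", "adl", "adl"]
test_X = [(8.5, 9.0), (1.2, 1.1), (9.2, 8.8), (2.2, 1.8)]
test_y = ["fall", "adl", "fall", "adl"]

predictions = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
metrics = binary_metrics(test_y, predictions)
```

    With two classes, an odd k avoids tied votes, which is one reason small odd values are a common starting point.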

  1. Local Subspace Classifier with Transform-Invariance for Image Classification

    NASA Astrophysics Data System (ADS)

    Hotta, Seiji

    A family of linear subspace classifiers called local subspace classifier (LSC) outperforms the k-nearest neighbor rule (kNN) and conventional subspace classifiers in handwritten digit classification. However, LSC suffers very high sensitivity to image transformations because it uses projection and the Euclidean distances for classification. In this paper, I present a combination of a local subspace classifier (LSC) and a tangent distance (TD) for improving accuracy of handwritten digit recognition. In this classification rule, we can deal with transform-invariance easily because we are able to use tangent vectors for approximation of transformations. However, we cannot use tangent vectors in other types of images such as color images. Hence, kernel LSC (KLSC) is proposed for incorporating transform-invariance into LSC via kernel mapping. The performance of the proposed methods is verified with the experiments on handwritten digit and color image classification.

  2. An ensemble of dissimilarity based classifiers for Mackerel gender determination

    NASA Astrophysics Data System (ADS)

    Blanco, A.; Rodriguez, R.; Martinez-Maranon, I.

    2014-03-01

    Mackerel is an undervalued fish caught by European fishing vessels. One way to add value to this species is to classify it by sex. Colour measurements were performed on gonads extracted from female and male mackerel (fresh and defrosted) to obtain differences between the sexes. Several linear and non-linear classifiers such as Support Vector Machines (SVM), k Nearest Neighbors (k-NN) or Diagonal Linear Discriminant Analysis (DLDA) can be applied to this problem. However, they are usually based on Euclidean distances that fail to reflect the sample proximities accurately. Classifiers based on non-Euclidean dissimilarities misclassify a different set of patterns. We combine different kinds of dissimilarity-based classifiers. The diversity is induced by considering a set of complementary dissimilarities for each model. The experimental results suggest that our algorithm helps to improve on classifiers based on a single dissimilarity.

  3. Parametric, bootstrap, and jackknife variance estimators for the k-Nearest Neighbors technique with illustrations using forest inventory and satellite image data

    Treesearch

    Ronald E. McRoberts; Steen Magnussen; Erkki O. Tomppo; Gherardo Chirici

    2011-01-01

    Nearest neighbors techniques have been shown to be useful for estimating forest attributes, particularly when used with forest inventory and satellite image data. Published reports of positive results have been truly international in scope. However, for these techniques to be more useful, they must be able to contribute to scientific inference which, for sample-based...

  4. Long-term priming of neighbours biases the word recognition process: evidence from a lexical decision task.

    PubMed

    Wagenmakers, Eric-Jan; Raaijmakers, Jeroen G W

    2006-12-01

    The role of orthographically similar words (i.e., neighbours) in the word recognition process has been studied extensively using short-term priming paradigms (e.g., Colombo, 1986). Here we demonstrate that long-term effects of neighbour priming can also be obtained. Experiment 1 showed that prior study of a neighbour (e.g., TANGO) increased later lexical decision performance for similar words (e.g., MANGO), but decreased performance for similar pseudowords (e.g., LANGO). Experiment 2 replicated this bias effect and showed that the increase in lexical decision performance due to neighbour priming is selectively due to words from a relatively sparse neighbourhood. Explanations of the bias effect in terms of lexical activation and episodic memory retrieval are discussed.

  5. Earthquake Declustering via a Nearest-Neighbor Approach in Space-Time-Magnitude Domain

    NASA Astrophysics Data System (ADS)

    Zaliapin, I. V.; Ben-Zion, Y.

    2016-12-01

    We propose a new method for earthquake declustering based on nearest-neighbor analysis of earthquakes in space-time-magnitude domain. The nearest-neighbor approach was recently applied to a variety of seismological problems that validate the general utility of the technique and reveal the existence of several different robust types of earthquake clusters. Notably, it was demonstrated that clustering associated with the largest earthquakes is statistically different from that of small-to-medium events. In particular, the characteristic bimodality of the nearest-neighbor distances that helps separate clustered and background events is often violated after the largest earthquakes in their vicinity, which is dominated by triggered events. This prevents using a simple threshold between the two modes of the nearest-neighbor distance distribution for declustering. The current study resolves this problem, hence extending the nearest-neighbor approach to the problem of earthquake declustering. The proposed technique is applied to seismicity of different areas in California (San Jacinto, Coso, Salton Sea, Parkfield, Ventura, Mojave, etc.), as well as to the global seismicity, to demonstrate its stability and efficiency in treating various clustering types. The results are compared with those of alternative declustering methods.
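
    The abstract does not state how the nearest-neighbor distance is computed. The sketch below assumes the form commonly used in the nearest-neighbor seismicity literature, eta = dt * r^d * 10^(-b * m_parent) for a later event relative to an earlier one, with illustrative values b = 1.0 and d = 1.6 and flat Cartesian coordinates in km. The function names and the toy catalogue are hypothetical, and the declustering step proposed in the study is not implemented here.

```python
import math


def nn_proximity(parent, child, b=1.0, d=1.6):
    """Space-time-magnitude proximity of `child` to an earlier `parent`.

    Events are (time_years, x_km, y_km, magnitude) tuples. The formula and
    the default b and d are assumptions, not taken from the abstract.
    """
    dt = child[0] - parent[0]
    if dt <= 0:
        return math.inf  # a parent must occur strictly before its child
    r = math.hypot(child[1] - parent[1], child[2] - parent[2])
    return dt * (r ** d) * 10 ** (-b * parent[3])


def nearest_neighbors(events, b=1.0, d=1.6):
    """For each event, return (index of nearest earlier event or None, proximity)."""
    links = []
    for j, child in enumerate(events):
        best_i, best_eta = None, math.inf
        for i, parent in enumerate(events):
            if i == j:
                continue
            eta = nn_proximity(parent, child, b, d)
            if eta < best_eta:
                best_i, best_eta = i, eta
        links.append((best_i, best_eta))
    return links


# Hypothetical three-event catalogue.
events = [
    (0.00, 0.0, 0.0, 6.0),      # large early event
    (0.01, 1.0, 0.0, 3.0),      # shortly after and close by
    (5.00, 200.0, 150.0, 3.5),  # years later and far away
]
links = nearest_neighbors(events)
```

    In this toy catalogue the second event has a far smaller proximity to the first event than the third one does, which is the kind of contrast between clustered and background events that the bimodal distance distribution in the abstract refers to.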

  6. Diagnosis of human malignancies using laser-induced breakdown spectroscopy in combination with chemometric methods

    NASA Astrophysics Data System (ADS)

    Chen, Xue; Li, Xiaohui; Yu, Xin; Chen, Deying; Liu, Aichun

    2018-01-01

    Diagnosis of malignancies is a challenging clinical issue. In this work, we present quick and robust diagnosis and discrimination of lymphoma and multiple myeloma (MM) using laser-induced breakdown spectroscopy (LIBS) conducted on human serum samples, in combination with chemometric methods. The serum samples collected from lymphoma and MM cancer patients and healthy controls were deposited on filter papers and ablated with a pulsed 1064 nm Nd:YAG laser. 24 atomic lines of Ca, Na, K, H, O, and N were selected for malignancy diagnosis. Principal component analysis (PCA), linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), and k nearest neighbors (kNN) classification were applied to build the malignancy diagnosis and discrimination models. The performances of the models were evaluated using 10-fold cross validation. The discrimination accuracy, confusion matrix and receiver operating characteristic (ROC) curves were obtained. The values of area under the ROC curve (AUC), sensitivity and specificity at the cut-points were determined. The kNN model exhibits the best performances with overall discrimination accuracy of 96.0%. Distinct discrimination between malignancies and healthy controls has been achieved with AUC, sensitivity and specificity for healthy controls all approaching 1. For lymphoma, the best discrimination performance values are AUC = 0.990, sensitivity = 0.970 and specificity = 0.956. For MM, the corresponding values are AUC = 0.986, sensitivity = 0.892 and specificity = 0.994. The results show that the serum-LIBS technique can serve as a quick, less invasive and robust method for diagnosis and discrimination of human malignancies.
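
    The 10-fold cross-validation used to evaluate the models can be illustrated with a small pure-Python sketch. The data below are synthetic two-dimensional points rather than the 24 spectral line intensities of the study, a 1-nearest-neighbour rule stands in for the study's kNN model (whose k is not stated in the abstract), and the function names are assumptions.

```python
import math


def k_fold_splits(n_samples, n_folds=10):
    """Yield (train_indices, test_indices); every sample is tested exactly once."""
    indices = list(range(n_samples))
    for fold in range(n_folds):
        test = indices[fold::n_folds]  # every n_folds-th sample, offset by fold
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


def one_nn(train_X, train_y, query):
    """Return the label of the single closest training point."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    return train_y[nearest]


def cross_validated_accuracy(X, y, n_folds=10):
    """Fraction of samples classified correctly while held out of training."""
    correct = 0
    for train, test in k_fold_splits(len(X), n_folds):
        fold_X = [X[i] for i in train]
        fold_y = [y[i] for i in train]
        for i in test:
            correct += one_nn(fold_X, fold_y, X[i]) == y[i]
    return correct / len(X)


# Synthetic, clearly separated groups (illustration only).
X = [(i * 0.1, 0.0) for i in range(10)] + [(5 + i * 0.1, 1.0) for i in range(10)]
y = ["control"] * 10 + ["lymphoma"] * 10
accuracy = cross_validated_accuracy(X, y, n_folds=10)
```

    The sketch splits the samples in their given order without shuffling; in practice the samples are usually shuffled or stratified by class before the folds are formed.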

  7. Intelligent data analysis: the best approach for chronic heart failure (CHF) follow up management.

    PubMed

    Mohammadzadeh, Niloofar; Safdari, Reza; Baraani, Alireza; Mohammadzadeh, Farshid

    2014-08-01

    Intelligent data analysis can prepare and present complex relations between symptoms and diseases and between medical and treatment consequences, and it has a significant role in improving follow-up management of chronic heart failure (CHF) patients, increasing speed and accuracy in diagnosis and treatment, reducing costs, and designing and implementing clinical guidelines. The aim of this article is to describe intelligent data analysis methods for improving patient monitoring in the follow-up and treatment of chronic heart failure patients as the best approach for CHF follow-up management. Minimum data set (MDS) requirements for monitoring and follow-up of CHF patients were designed as a checklist with six main parts. All CHF patients discharged in 2013 from Tehran Heart Center were selected. The MDS for monitoring CHF patient status was collected over 5 months at three different follow-up times. The gathered data were imported into RAPIDMINER 5 software. Modeling was based on decision tree methods such as C4.5, CHAID and ID3, and on the k-Nearest Neighbors algorithm (K-NN) with k=1. The final analysis was based on a voting method. Decision trees and K-NN were evaluated using cross-validation. Creating and using standard terminologies and databases consistent with these terminologies helps to meet the challenges related to data collection from various places and data application in intelligent data analysis. It should be noted that intelligent analysis of health data and intelligent systems can never replace cardiologists. They can only act as a helpful tool for cardiologists' decision making.

  8. Classification of vegetation types in military region

    NASA Astrophysics Data System (ADS)

    Gonçalves, Miguel; Silva, Jose Silvestre; Bioucas-Dias, Jose

    2015-10-01

    In the decision-making process for planning and executing military operations, the terrain is a determining factor. Aerial photographs are a source of vital information for the success of an operation in a hostile region, namely when the cartographic information behind enemy lines is scarce or non-existent. The objective of the present work is the development of a tool capable of processing aerial photos. The methodology implemented starts with feature extraction, followed by the application of an automatic feature selector. The next step, using the k-fold cross validation technique, estimates the input parameters for the following classifiers: Sparse Multinomial Logistic Regression (SMLR), K Nearest Neighbor (KNN), Linear Classifier using Principal Component Expansion on the Joint Data (PCLDC) and Multi-Class Support Vector Machine (MSVM). These classifiers were used in two different studies with distinct objectives: discrimination of vegetation density and identification of the main vegetation components. It was found that the best classifier for the first approach is SMLR. For the second approach, the implemented methodology applied to high resolution images showed that the best performance was achieved by the KNN and PCLDC classifiers. Comparing the two approaches reveals a multiscale issue: for different resolutions, the best solution to the problem requires different classifiers and the extraction of different features.

  9. Analyzing the defect structure of CuO-doped PZT and KNN piezoelectrics from electron paramagnetic resonance.

    PubMed

    Jakes, Peter; Kungl, Hans; Schierholz, Roland; Eichel, Rüdiger-A

    2014-09-01

    The defect structure of copper-doped sodium potassium niobate (KNN) ferroelectrics has been analyzed. In particular, the interplay between the mutually compensating dimeric (Cu(Nb)'''-V(O)··) and trimeric (V(O)··-Cu(Nb)'''-V(O)··)· defect complexes with 180° and non-180° domain walls has been analyzed and compared to the effects from (Cu'' - V(O)··)(x)× dipoles in CuO-doped lead zirconate titanate (PZT). Attempts are made to relate the rearrangement of defect complexes to macroscopic electromechanical properties.

  10. Classification of THz pulse signals using two-dimensional cross-correlation feature extraction and non-linear classifiers.

    PubMed

    Siuly; Yin, Xiaoxia; Hadjiloucas, Sillas; Zhang, Yanchun

    2016-04-01

    This work provides a performance comparison of four different machine learning classifiers: multinomial logistic regression with ridge estimators (MLR), k-nearest neighbours (KNN), support vector machine (SVM) and naïve Bayes (NB), as applied to terahertz (THz) transient time domain sequences associated with pixelated images of different powder samples. Although the six substances considered have similar optical properties, their complex insertion loss at the THz part of the spectrum is significantly different because of differences in their frequency dependent THz extinction coefficient as well as in their refractive index and scattering properties. As scattering can be unquantifiable in many spectroscopic experiments, classification solely on differences in complex insertion loss can be inconclusive. The problem is addressed using two-dimensional (2-D) cross-correlations between background and sample interferograms; these ensure good noise suppression of the datasets and provide a range of statistical features that are subsequently used as inputs to the above classifiers. A cross-validation procedure is adopted to assess the performance of the classifiers. First, the measurements related to samples with thicknesses of 2 mm were classified, then samples with thicknesses of 4 mm and, after that, 3 mm were classified, and the success rate and consistency of each classifier were recorded. In addition, mixtures having thicknesses of 2 and 4 mm as well as mixtures of 2, 3 and 4 mm were presented simultaneously to all classifiers. This approach provided further cross-validation of the classification consistency of each algorithm. The results confirm the superiority in classification accuracy and robustness of the MLR (least accuracy 88.24%) and KNN (least accuracy 90.19%) algorithms, which consistently outperformed the SVM (least accuracy 74.51%) and NB (least accuracy 56.86%) classifiers for the same number of feature vectors across all studies.

  11. Estimating forest attribute parameters for small areas using nearest neighbors techniques

    Treesearch

    Ronald E. McRoberts

    2012-01-01

    Nearest neighbors techniques have become extremely popular, particularly for use with forest inventory data. With these techniques, a population unit prediction is calculated as a linear combination of observations for a selected number of population units in a sample that are most similar, or nearest, in a space of ancillary variables to the population unit requiring...

  12. [Exploration of rapidly determining quality of traditional Chinese medicines by (NIR) spectroscopy based on internet sharing mode].

    PubMed

    Ni, Li-Jun; Luan, Shao-Rong; Zhang, Li-Guo

    2016-10-01

    Because of the numerous varieties of herbal species and active ingredients in traditional Chinese medicine (TCM), traditional methods can hardly satisfy the current requirements for TCM quality determination. The present work proposes an approach for rapid determination of the quality of TCM based on near-infrared (NIR) spectroscopy and an internet sharing mode. A low-cost, portable multi-source composite spectrometer was invented by our group for fast on-site measurement of the spectra of TCM samples. A database could be set up by sharing spectra and quality detection data of TCM samples among TCM enterprises through an internet platform. A novel method, called keeping the same relationship between X and Y space based on K nearest neighbors (KNN-KSR for short), was applied to predict the contents of effective compounds in the samples. In addition, a comparative study between KNN-KSR and partial least squares (PLS) was conducted. Two datasets were used to validate the idea: one comprised 58 Ginkgo Folium samples measured with four near-infrared spectroscopy instruments and two multi-source composite spectrometers; the other comprised 80 corn samples, available online, measured with three NIR instruments. The results show that the KNN-KSR method could obtain more reliable outcomes without spectral correction. However, PLS models transferred to other instruments could hardly achieve better predictive results until spectral calibration was performed. Meanwhile, similar analysis results for total flavonoids and total lactones of Ginkgo Folium samples were achieved on the multi-source composite spectrometers and the near-infrared spectroscopy instruments, and the prediction results of KNN-KSR were better than those of PLS. The idea proposed in the present study urgently needs spectra from more samples and verification by more case studies. Copyright© by the Chinese Pharmaceutical Association.

  13. Mapping forested wetlands in the Great Zhan River Basin through integrating optical, radar, and topographical data classification techniques.

    PubMed

    Na, X D; Zang, S Y; Wu, C S; Li, W L

    2015-11-01

    Knowledge of the spatial extent of forested wetlands is essential to many studies including wetland functioning assessment, greenhouse gas flux estimation, and wildlife suitable habitat identification. For discriminating forested wetlands from their adjacent land cover types, researchers have resorted to image analysis techniques applied to numerous remotely sensed data. Despite some success, there is still no consensus on the optimal approaches for mapping forested wetlands. To address this problem, we examined two machine learning approaches, random forest (RF) and K-nearest neighbor (KNN) algorithms, and applied these two approaches to the framework of pixel-based and object-based classifications. The RF and KNN algorithms were constructed using predictors derived from Landsat 8 imagery, Radarsat-2 advanced synthetic aperture radar (SAR), and topographical indices. The results show that the object-based classifications performed better than per-pixel classifications using the same algorithm (RF) in terms of overall accuracy, and the difference between their kappa coefficients is statistically significant (p<0.01). There were noticeable omissions for forested and herbaceous wetlands in the per-pixel classifications using the RF algorithm. As for the object-based image analysis, there were also statistically significant differences (p<0.01) in the kappa coefficient between results based on the RF and KNN algorithms. The object-based classification using RF provided a more visually adequate distribution of the land cover types of interest, while the object-based classifications using the KNN algorithm showed noticeable commissions for forested wetlands and omissions for agricultural land. This research proves that the object-based classification with RF using optical, radar, and topographical data improved the mapping accuracy of land covers and provided a feasible approach to discriminate the forested wetlands from the other land cover types in forestry areas.

  14. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds.

    PubMed

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M; Bloom, Peter H; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.

  15. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    PubMed Central

    Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data. PMID:28403159

  16. Improved supervised classification of accelerometry data to distinguish behaviors of soaring birds

    USGS Publications Warehouse

    Sur, Maitreyi; Suffredini, Tony; Wessells, Stephen M.; Bloom, Peter H.; Lanzone, Michael J.; Blackshire, Sheldon; Sridhar, Srisarguru; Katzner, Todd

    2017-01-01

    Soaring birds can balance the energetic costs of movement by switching between flapping, soaring and gliding flight. Accelerometers can allow quantification of flight behavior and thus a context to interpret these energetic costs. However, models to interpret accelerometry data are still being developed, rarely trained with supervised datasets, and difficult to apply. We collected accelerometry data at 140 Hz from a trained golden eagle (Aquila chrysaetos) whose flight we recorded with video that we used to characterize behavior. We applied two forms of supervised classifications, random forest (RF) models and K-nearest neighbor (KNN) models. The KNN model was substantially easier to implement than the RF approach but both were highly accurate in classifying basic behaviors such as flapping (85.5% and 83.6% accurate, respectively), soaring (92.8% and 87.6%) and sitting (84.1% and 88.9%) with overall accuracies of 86.6% and 92.3% respectively. More detailed classification schemes, with specific behaviors such as banking and straight flights were well classified only by the KNN model (91.24% accurate; RF = 61.64% accurate). The RF model maintained its classification accuracy for basic behaviors at sampling frequencies as low as 10 Hz, the KNN model at sampling frequencies as low as 20 Hz. Classification of accelerometer data collected from free ranging birds demonstrated a strong dependence of predicted behavior on the type of classification model used. Our analyses demonstrate the consequence of different approaches to classification of accelerometry data, the potential to optimize classification algorithms with validated flight behaviors to improve classification accuracy, ideal sampling frequencies for different classification algorithms, and a number of ways to improve commonly used analytical techniques and best practices for classification of accelerometry data.

  17. Inverted electro-mechanical behaviour induced by the irreversible domain configuration transformation in (K,Na)NbO3-based ceramics

    PubMed Central

    Huan, Yu; Wang, Xiaohui; Koruza, Jurij; Wang, Ke; Webber, Kyle G.; Hao, Yanan; Li, Longtu

    2016-01-01

    Miniaturization of domains to the nanometer scale has been previously reported in many piezoelectrics with two-phase coexistence. Despite the observation of nanoscale domain configuration near the polymorphic phase transition (PPT) region in virgin (K0.5Na0.5)NbO3 (KNN) based ceramics, it remains unclear how this domain state responds to external loads and influences the macroscopic electro-mechanical properties. To this end, the electric-field-induced and stress-induced strain curves of KNN-based ceramics over a wide compositional range across the PPT were characterized. It was found that the coercive field of the virgin samples was highest in the PPT region, which was related to the inhibited domain wall motion due to the presence of nanodomains. However, the coercive field was found to be the lowest in the PPT region after electrical poling. This was related to the irreversible transformation of the nanodomains into micron-sized domains during the poling process. With a similar micron-sized domain configuration for all poled ceramics, the domains in the PPT region move more easily due to the additional polarization vectors. The results demonstrate that the poling process can give rise to the irreversible domain configuration transformation and thus account for the inverted macroscopic piezoelectricity in the PPT region of KNN-based ceramics. PMID:26915972

  18. Quantum realization of the nearest-neighbor interpolation method for FRQI and NEQR

    NASA Astrophysics Data System (ADS)

    Sang, Jianzhi; Wang, Shen; Niu, Xiamu

    2016-01-01

    This paper is concerned with the feasibility of the classical nearest-neighbor interpolation based on flexible representation of quantum images (FRQI) and novel enhanced quantum representation (NEQR). Firstly, the feasibility of the classical image nearest-neighbor interpolation for quantum images of FRQI and NEQR is proven. Then, by defining the halving operation and by making use of quantum rotation gates, the concrete quantum circuit of the nearest-neighbor interpolation for FRQI is designed for the first time. Furthermore, quantum circuit of the nearest-neighbor interpolation for NEQR is given. The merit of the proposed NEQR circuit lies in their low complexity, which is achieved by utilizing the halving operation and the quantum oracle operator. Finally, in order to further improve the performance of the former circuits, new interpolation circuits for FRQI and NEQR are presented by using Control-NOT gates instead of a halving operation. Simulation results show the effectiveness of the proposed circuits.
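
    For orientation, the classical nearest-neighbor interpolation that the quantum circuits above are built to reproduce can be sketched as follows. This is the ordinary classical algorithm only, not the FRQI or NEQR circuit; the function name and the integer index mapping (floor of the scaled coordinate) are assumptions of this sketch.

```python
def nearest_neighbor_resize(image, new_h, new_w):
    """Resize a 2-D list of pixel values by copying the nearest source pixel.

    Output pixel (r, c) takes the value of source pixel
    (floor(r * old_h / new_h), floor(c * old_w / new_w)).
    """
    old_h, old_w = len(image), len(image[0])
    return [
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Doubling a 2x2 image repeats each pixel in a 2x2 block.
small = [[1, 2],
         [3, 4]]
enlarged = nearest_neighbor_resize(small, 4, 4)
for row in enlarged:
    print(row)
```

    Halving the image with the same function simply keeps every second pixel, which is the classical counterpart of the halving operation mentioned in the abstract.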

  19. 5 CFR 532.317 - Use of data from the nearest similar area.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... REGULATIONS PREVAILING RATE SYSTEMS Determining Rates for Principal Types of Positions § 532.317 Use of data... 5 Administrative Personnel 1 2013-01-01 2013-01-01 false Use of data from the nearest similar area... subpart, analyze and use the acceptable data from the nearest similar wage area together with the data...

  20. 5 CFR 532.317 - Use of data from the nearest similar area.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... REGULATIONS PREVAILING RATE SYSTEMS Determining Rates for Principal Types of Positions § 532.317 Use of data... 5 Administrative Personnel 1 2012-01-01 2012-01-01 false Use of data from the nearest similar area... subpart, analyze and use the acceptable data from the nearest similar wage area together with the data...

  1. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Atamanik, E.; Thangadurai, V.

    We report a comparative study of the dielectric properties of solid-state (ceramic method) synthesized NaNbO3 (NN), Na0.75K0.25NbO3 (K25NN), K0.5Na0.5NbO3 (KNN) and some composite materials containing In2O3 and NN or KNN using an AC impedance method. Powder X-ray diffraction (PXRD) was employed to investigate the phase purity. No significant amount of impurity phase was observed for NN, K25NN, and KNN. Substitutions of 10, 15 and 25 mol% In^3+ for Nb^5+ in KNN and NN using solid-state reactions at 1150 °C resulted in composite materials. AC impedance studies of NN, KNN and K25NN in the temperature range of 500-800 °C showed a single semicircle (attributed to the bulk property) in the high-frequency range of 10^3 to 10^6 Hz. The individual contributions from the bulk and grain boundary to the dielectric properties were resolved and quantified from the impedance data. The calculated dielectric values for NN were consistent with those previously reported in the literature. The 10% indium-based KNN composite materials had the lowest dielectric loss, 0.585, and a dielectric constant of 233 at 100 kHz at a temperature of 650 °C.

  2. (100)-Textured KNN-based thick film with enhanced piezoelectric property for intravascular ultrasound imaging

    PubMed Central

    Zhu, Benpeng; Zhang, Zhiqiang; Ma, Teng; Yang, Xiaofei; Li, Yongxiang; Shung, K. Kirk; Zhou, Qifa

    2015-01-01

    Using tape-casting technology, 35 μm free-standing (100)-textured Li doped KNN (KNLN) thick film was prepared by employing NaNbO3 (NN) as template. It exhibited similar piezoelectric behavior to lead containing materials: a longitudinal piezoelectric coefficient (d33) of ∼150 pm/V and an electromechanical coupling coefficient (kt) of 0.44. Based on this thick film, a 52 MHz side-looking miniature transducer with a bandwidth of 61.5% at −6 dB was built for Intravascular ultrasound (IVUS) imaging. In comparison with 40 MHz PMN-PT single crystal transducer, the rabbit aorta image had better resolution and higher noise-to-signal ratio, indicating that lead-free (100)-textured KNLN thick film may be suitable for IVUS (>50 MHz) imaging. PMID:25991874

  3. On the Discriminant Analysis in the 2-Populations Case

    NASA Astrophysics Data System (ADS)

    Rublík, František

    2008-01-01

    The empirical Bayes Gaussian rule, which in the normal case yields good values of the probability of total error, may yield high values of the maximum probability error. From this point of view the presented modified version of the classification rule of Broffitt, Randles and Hogg appears to be superior. The modification included in this paper is termed as a WR method, and the choice of its weights is discussed. The mentioned methods are also compared with the K nearest neighbours classification rule.

  4. Emergent biomarker derived from next-generation sequencing to identify pain patients requiring uncommonly high opioid doses

    PubMed Central

    Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J

    2017-01-01

    Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces ‘big data’ exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics. PMID:27139154

  5. Emergent biomarker derived from next-generation sequencing to identify pain patients requiring uncommonly high opioid doses.

    PubMed

    Kringel, D; Ultsch, A; Zimmermann, M; Jansen, J-P; Ilias, W; Freynhagen, R; Griessinger, N; Kopf, A; Stein, C; Doehring, A; Resch, E; Lötsch, J

    2017-10-01

    Next-generation sequencing (NGS) provides unrestricted access to the genome, but it produces 'big data' exceeding in amount and complexity the classical analytical approaches. We introduce a bioinformatics-based classifying biomarker that uses emergent properties in genetics to separate pain patients requiring extremely high opioid doses from controls. Following precisely calculated selection of the 34 most informative markers in the OPRM1, OPRK1, OPRD1 and SIGMAR1 genes, pattern of genotypes belonging to either patient group could be derived using a k-nearest neighbor (kNN) classifier that provided a diagnostic accuracy of 80.6±4%. This outperformed alternative classifiers such as reportedly functional opioid receptor gene variants or complex biomarkers obtained via multiple regression or decision tree analysis. The accumulation of several genetic variants with only minor functional influences may result in a qualitative consequence affecting complex phenotypes, pointing at emergent properties in genetics.

  6. Fall Detection Using Smartphone Audio Features.

    PubMed

    Cheffena, Michael

    2016-07-01

    An automated fall detection system based on smartphone audio features is developed. The spectrogram, mel frequency cepstral coefficients (MFCCs), linear predictive coding (LPC), and matching pursuit (MP) features of different fall and no-fall sound events are extracted from experimental data. Based on the extracted audio features, four different machine learning classifiers, namely the k-nearest neighbor classifier (k-NN), support vector machine (SVM), least squares method (LSM), and artificial neural network (ANN), are investigated for distinguishing between fall and no-fall events. For each audio feature, the performance of each classifier in terms of sensitivity, specificity, accuracy, and computational complexity is evaluated. The best performance is achieved using spectrogram features with the ANN classifier, with sensitivity, specificity, and accuracy all above 98%. The classifier also has acceptable computational requirements for training and testing. The system is applicable in home environments where the phone is placed in the vicinity of the user.
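
    The three quality measures reported in this abstract follow directly from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the labels are hypothetical (1 = fall, 0 = no-fall) and the function name is an assumption, not part of the described system.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    Assumes both classes occur in y_true; otherwise a ratio is undefined.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)   # share of true falls that were detected
    specificity = tn / (tn + fp)   # share of no-fall events left unflagged
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical labels for eight sound events (made up for illustration).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec, acc = binary_metrics(y_true, y_pred)
print(sens, spec, acc)
```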

  7. How far to the nearest road?

    Treesearch

    Kurt H. Riitters; James D. Wickham

    2003-01-01

    Ecological impacts from roads may be the rule rather than the exception in most of the conterminous United States. We measured the proportion of land area that was located within nine distances from the nearest road of any type, and mapped the results for 164 ecoregions and 2108 watersheds nationwide. Overall, 20% of the total land area was within 127 m of a road, and...

  8. Dielectric properties of (K0.5Na0.5)NbO3-(Bi0.5Li0.5)ZrO3 lead-free ceramics as high-temperature ceramic capacitors

    NASA Astrophysics Data System (ADS)

    Yan, Tianxiang; Han, Feifei; Ren, Shaokai; Ma, Xing; Fang, Liang; Liu, Laijun; Kuang, Xiaojun; Elouadi, Brahim

    2018-04-01

    (1 - x)K0.5Na0.5NbO3-x(Bi0.5Li0.5)ZrO3 (labeled as (1 - x)KNN-xBLZ) lead-free ceramics were fabricated by a solid-state reaction method. The effects of BLZ content on the structure, dielectric properties and relaxation behavior of KNN ceramics were investigated. By combining the X-ray diffraction patterns with the temperature dependence of dielectric properties, an orthorhombic-tetragonal phase coexistence was identified for x = 0.03, a tetragonal phase was determined for x = 0.05, and a single rhombohedral structure occurred at x = 0.08. The 0.92KNN-0.08BLZ ceramic exhibits a high and stable permittivity (1317, ±15% variation) from 55 to 445 °C and low dielectric loss (≤ 6%) from 120 to 400 °C, which is highly attractive for high-temperature capacitors. Activation energies of both high-temperature dielectric relaxation and dc conductivity first increase and then decline with increasing BLZ content, which might be attributed to the lattice distortion and concentration of oxygen vacancies.

  9. Interfering Neighbours: The Impact of Novel Word Learning on the Identification of Visually Similar Words

    ERIC Educational Resources Information Center

    Bowers, Jeffrey S.; Davis, Colin J.; Hanley, Derek A.

    2005-01-01

    We assessed the impact of visual similarity on written word identification by having participants learn new words (e.g. BANARA) that were neighbours of familiar words that previously had no neighbours (e.g. BANANA). Repeated exposure to these new words made it more difficult to semantically categorize the familiar words. There was some evidence of…

  10. Giant electrocaloric and energy storage performance of [(K0.5Na0.5)NbO3](1-x)-[LiSbO3] x nanocrystalline ceramics.

    PubMed

    Kumar, Raju; Singh, Satyendra

    2018-02-16

    Electrocaloric (EC) refrigeration, a technology based on the EC effect, has been accepted as an auspicious route to next-generation refrigeration owing to its high efficiency and compact size. Here, we report the results of our experimental investigations on the electrocaloric response and electrical energy storage properties of lead-free nanocrystalline (1 - x)K0.5Na0.5NbO3-xLiSbO3 (KNN-xLS) ceramics in the range of 0.015 ≤ x ≤ 0.06 by indirect EC measurements. Doping with LiSbO3 effectively lowered both transitions (T_C and T_O-T) of KNN toward room temperature. A maximal EC temperature change, ΔT = 3.33 K, was obtained for the composition with x = 0.03 at 345 K under an external electric field of 40 kV/cm. The higher value of EC responsivity, ζ = 8.32 × 10^-7 K·m/V, is found with a COP of 8.14 and a recoverable energy storage of 0.128 J/cm^3 with 46% efficiency for the composition with x = 0.03. Our investigations show that this material is a very promising candidate for electrocaloric refrigeration and energy storage near room temperature.
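
    The responsivity quoted in this abstract can be cross-checked from the other two reported numbers, assuming the usual definition ζ = ΔT/ΔE (the abstract does not state the definition, so that is an assumption of this sketch). Only the field needs a unit conversion from kV/cm to V/m.

```python
# Values taken from the abstract.
delta_T_kelvin = 3.33          # EC temperature change at x = 0.03
field_kv_per_cm = 40.0         # applied electric field

# 1 kV/cm = 1e3 V / 1e-2 m = 1e5 V/m
field_v_per_m = field_kv_per_cm * 1e5

# Assumed definition of EC responsivity: temperature change per unit field.
zeta = delta_T_kelvin / field_v_per_m   # in K·m/V
print(f"{zeta:.3e} K·m/V")
```

    The result, about 8.3 × 10^-7 K·m/V, agrees with the reported ζ to within rounding.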

  11. RKNNMDA: Ranking-based KNN for MiRNA-Disease Association prediction.

    PubMed

    Chen, Xing; Wu, Qiao-Feng; Yan, Gui-Ying

    2017-07-03

    An accumulating body of verified experimental studies has demonstrated that microRNAs (miRNAs) can be closely related to the development and progression of complex human diseases. Based on the assumption that functionally similar miRNAs may have a strong correlation with phenotypically similar diseases and vice versa, researchers developed various effective computational models which combine heterogeneous biological data sets, including disease similarity networks, miRNA similarity networks, and known disease-miRNA association networks, to identify potential relationships between miRNAs and diseases in biomedical research. Considering the limitations of previous computational studies, we introduced a novel computational method, Ranking-based KNN for miRNA-Disease Association prediction (RKNNMDA), to predict potentially related miRNAs for diseases, and our method obtained an AUC of 0.8221 based on leave-one-out cross validation. In addition, RKNNMDA was applied to 3 kinds of important human cancers for further performance evaluation. The results showed that 96%, 80% and 94% of the top 50 predicted potentially related miRNAs for Colon Neoplasms, Esophageal Neoplasms, and Prostate Neoplasms, respectively, have been confirmed by the experimental literature. Moreover, RKNNMDA could be used to predict potential miRNAs for diseases without any known miRNAs, and it is anticipated that RKNNMDA would be of great use for novel miRNA-disease association identification.

  12. A machine learning approach to quantifying geologic similarities between sites of gas hydrate accumulation

    NASA Astrophysics Data System (ADS)

    Runyan, T. E.; Wood, W. T.; Palmsten, M. L.; Zhang, R.

    2016-12-01

    Gas hydrates, specifically methane hydrates, are sparsely sampled on a global scale, and their accumulation is difficult to predict geospatially. Several attempts have been made at estimating global inventories, and to some extent geospatial distribution, using geospatial extrapolations guided by geophysical and geochemical methods. Our objective is to quantitatively predict the geospatial likelihood of encountering methane hydrates, with uncertainty. Predictions could be incorporated into analyses of drilling hazards as well as climate change. We use global data sets (including water depth, temperature, pressure, TOC, sediment thickness, and heat flow) as parameters to train a k-nearest neighbor (KNN) machine learning technique. The KNN is unsupervised and non-parametric; we do not provide any interpretive influence on the prior probability distribution, so our results are strictly data driven. We have selected as test sites several locations where gas hydrates have been well studied, each with significantly different geologic settings. These include the Blake Ridge (U.S. East Coast), Hydrate Ridge (U.S. West Coast), and the Gulf of Mexico. We then use KNN to quantify similarities between these sites, and determine, via the distance in parameter space, the likelihood and uncertainty of encountering gas hydrate anywhere in the world. Here we are operating under the assumption that the distance in parameter space is proportional to the probability of the occurrence of gas hydrate. We then compare these global similarity maps made from our several test sites to identify the geologic (geophysical, bio-geochemical) parameters best suited for predicting gas hydrate occurrence.
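
    The distance-in-parameter-space idea in the abstract above can be illustrated with a minimal sketch. It assumes Euclidean distance on z-scored features; the function names (standardize, k_nearest) and the toy site values are hypothetical and are not taken from the study.

```python
import math

def standardize(rows):
    """Z-score each column so that no single parameter dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def k_nearest(points, query_index, k):
    """Return the indices of the k points closest to points[query_index]."""
    q = points[query_index]
    dists = [(math.dist(q, p), i) for i, p in enumerate(points) if i != query_index]
    return [i for _, i in sorted(dists)[:k]]

# Hypothetical sites: (water depth in m, seafloor temperature in deg C)
sites = [[2800.0, 3.0], [2750.0, 3.2], [800.0, 4.5], [120.0, 12.0]]
scaled = standardize(sites)
print(k_nearest(scaled, 0, 2))  # [1, 2]
```

    Standardizing first matters because raw water depth (thousands of metres) would otherwise swamp temperature (a few degrees) in the distance.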

  13. An improved method of early diagnosis of smoking-induced respiratory changes using machine learning algorithms.

    PubMed

    Amaral, Jorge L M; Lopes, Agnaldo J; Jansen, José M; Faria, Alvaro C D; Melo, Pedro L

    2013-12-01

    The purpose of this study was to develop an automatic classifier to increase the accuracy of the forced oscillation technique (FOT) for diagnosing early respiratory abnormalities in smoking patients. The data consisted of FOT parameters obtained from 56 volunteers, 28 healthy and 28 smokers with low tobacco consumption. Many supervised learning techniques were investigated, including logistic linear classifiers, k nearest neighbor (KNN), neural networks and support vector machines (SVM). To evaluate performance, the ROC curve of the most accurate parameter was established as baseline. To determine the best input features and classifier parameters, we used genetic algorithms and a 10-fold cross-validation using the average area under the ROC curve (AUC). In the first experiment, the original FOT parameters were used as input. We observed a significant improvement in accuracy (KNN=0.89 and SVM=0.87) compared with the baseline (0.77). The second experiment performed a feature selection on the original FOT parameters. This selection did not cause any significant improvement in accuracy, but it was useful in identifying more adequate FOT parameters. In the third experiment, we performed a feature selection on the cross products of the FOT parameters. This selection resulted in a further increase in AUC (KNN=SVM=0.91), which allows for high diagnostic accuracy. In conclusion, machine learning classifiers can help identify early smoking-induced respiratory alterations. The use of FOT cross products and the search for the best features and classifier parameters can markedly improve the performance of machine learning classifiers. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.
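
    The area under the ROC curve used as the performance measure above can be computed directly from classifier scores. The following is a minimal sketch using the pairwise (rank-based) definition; the function name auc and the example scores are illustrative assumptions, not the authors' code or data.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for two positives and two negatives
print(auc([0.9, 0.4, 0.8, 0.3], [1, 1, 0, 0]))  # 0.75
```

    In a 10-fold cross-validation, this value would be computed on each held-out fold and then averaged.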

  14. Detection of Cardiac Abnormalities from Multilead ECG using Multiscale Phase Alternation Features.

    PubMed

    Tripathy, R K; Dandapat, S

    2016-06-01

    The cardiac activities such as the depolarization and the relaxation of atria and ventricles are observed in electrocardiogram (ECG). The changes in the morphological features of ECG are the symptoms of particular heart pathology. It is a cumbersome task for medical experts to visually identify any subtle changes in the morphological features during 24 hours of ECG recording. Therefore, automated analysis of the ECG signal is needed for accurate detection of cardiac abnormalities. In this paper, a novel method for automated detection of cardiac abnormalities from multilead ECG is proposed. The method uses multiscale phase alternation (PA) features of multilead ECG and two classifiers, k-nearest neighbor (KNN) and fuzzy KNN, for classification of bundle branch block (BBB), myocardial infarction (MI), heart muscle defect (HMD) and healthy control (HC). The dual tree complex wavelet transform (DTCWT) is used to decompose the ECG signal of each lead into complex wavelet coefficients at different scales. The phase of the complex wavelet coefficients is computed and the PA values at each wavelet scale are used as features for detection and classification of cardiac abnormalities. A publicly available multilead ECG database (PTB database) is used for testing of the proposed method. The experimental results show that the proposed multiscale PA features and the fuzzy KNN classifier have better performance for detection of cardiac abnormalities with sensitivity values of 78.12 %, 80.90 % and 94.31 % for BBB, HMD and MI classes. The sensitivity value of the proposed method for the MI class is compared with the state-of-the-art techniques from multilead ECG.

  15. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    NASA Astrophysics Data System (ADS)

    Nishizuka, N.; Sugiura, K.; Kubo, Y.; Den, M.; Watari, S.; Ishii, M.

    2017-02-01

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010-2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ~60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.
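
    The true skill statistic named above as the prediction score is commonly defined as the hit rate minus the false alarm rate. A minimal sketch under that standard definition follows; the function name and the contingency counts are hypothetical and are not values from the study.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = TP / (TP + FN) - FP / (FP + TN).

    Ranges from -1 to 1; 0 means no skill and 1 is a perfect forecast.
    """
    hit_rate = tp / (tp + fn)
    false_alarm_rate = fp / (fp + tn)
    return hit_rate - false_alarm_rate

# Hypothetical contingency table: 90 hits, 10 misses, 5 false alarms, 95 correct nulls
print(round(true_skill_statistic(90, 10, 5, 95), 2))  # 0.85
```

    Because both terms are normalised by their own class totals, the score is not inflated by the large number of non-flaring regions.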

  16. Quantifying Postural Control during Exergaming Using Multivariate Whole-Body Movement Data: A Self-Organizing Maps Approach

    PubMed Central

    van Diest, Mike; Stegenga, Jan; Wörtche, Heinrich J.; Roerdink, Jos B. T. M; Verkerke, Gijsbertus J.; Lamoth, Claudine J. C.

    2015-01-01

    Background: Exergames are becoming an increasingly popular tool for training balance ability, thereby preventing falls in older adults. Automatic, real time, assessment of the user’s balance control offers opportunities in terms of providing targeted feedback and dynamically adjusting the gameplay to the individual user, yet algorithms for quantification of balance control remain to be developed. The aim of the present study was to identify movement patterns, and variability therein, of young and older adults playing a custom-made weight-shifting (ice-skating) exergame. Methods: Twenty older adults and twenty young adults played a weight-shifting exergame under five conditions of varying complexity, while multi-segmental whole-body movement data were captured using Kinect. Movement coordination patterns expressed during gameplay were identified using Self Organizing Maps (SOM), an artificial neural network, and variability in these patterns was quantified by computing Total Trajectory Variability (TTvar). Additionally a k Nearest Neighbor (kNN) classifier was trained to discriminate between young and older adults based on the SOM features. Results: Results showed that TTvar was significantly higher in older adults than in young adults, when playing the exergame under complex task conditions. The kNN classifier showed a classification accuracy of 65.8%. Conclusions: Older adults display more variable sway behavior than young adults, when playing the exergame under complex task conditions. The SOM features characterizing movement patterns expressed during exergaming allow for discriminating between young and older adults with limited accuracy. Our findings contribute to the development of algorithms for quantification of balance ability during home-based exergaming for balance training. PMID:26230655

  17. An Analysis on Sensor Locations of the Human Body for Wearable Fall Detection Devices: Principles and Practice

    PubMed Central

    Özdemir, Ahmet Turan

    2016-01-01

    Wearable devices for fall detection have received attention in academia and industry, because falls are very dangerous, especially for elderly people, and if immediate aid is not provided, it may result in death. However, some predictive devices are not easily worn by elderly people. In this work, a huge dataset, including 2520 tests, is employed to determine the best sensor placement location on the body and to reduce the number of sensor nodes for device ergonomics. During the tests, the volunteer’s movements are recorded with six groups of sensors each with a triaxial (accelerometer, gyroscope and magnetometer) sensor, which is placed tightly on different parts of the body with special straps: head, chest, waist, right-wrist, right-thigh and right-ankle. The accuracy of individual sensor groups with their location is investigated with six machine learning techniques, namely the k-nearest neighbor (k-NN) classifier, Bayesian decision making (BDM), support vector machines (SVM), least squares method (LSM), dynamic time warping (DTW) and artificial neural networks (ANNs). Each technique is applied to single, double, triple, quadruple, quintuple and sextuple sensor configurations. These configurations create 63 different combinations, and for six machine learning techniques, a total of 63 × 6 = 378 combinations is investigated. As a result, the waist region is found to be the most suitable location for sensor placement on the body with 99.96% fall detection sensitivity by using the k-NN classifier, whereas the best sensitivity achieved by the wrist sensor is 97.37%, despite this location being highly preferred for today’s wearable applications. PMID:27463719
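
    The count of 63 sensor configurations quoted above is the number of non-empty subsets of six sensor locations (2^6 - 1). A short sketch that enumerates them is given below; the list and function names are illustrative only.

```python
from itertools import combinations

SENSORS = ["head", "chest", "waist", "right-wrist", "right-thigh", "right-ankle"]
TECHNIQUES = ["k-NN", "BDM", "SVM", "LSM", "DTW", "ANN"]

def sensor_configurations(sensors):
    """All non-empty subsets, from single sensors up to the full set."""
    return [combo
            for size in range(1, len(sensors) + 1)
            for combo in combinations(sensors, size)]

configs = sensor_configurations(SENSORS)
print(len(configs))                    # 63
print(len(configs) * len(TECHNIQUES))  # 378
```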

  18. Training set optimization and classifier performance in a top-down diabetic retinopathy screening system

    NASA Astrophysics Data System (ADS)

    Wigdahl, J.; Agurto, C.; Murray, V.; Barriga, S.; Soliz, P.

    2013-03-01

    Diabetic retinopathy (DR) affects more than 4.4 million Americans age 40 and over. Automatic screening for DR has been shown to be an efficient and cost-effective way to lower the burden on the healthcare system, by triaging diabetic patients and ensuring timely care for those presenting with DR. Several supervised algorithms have been developed to detect pathologies related to DR, but little work has been done in determining the size of the training set that optimizes an algorithm's performance. In this paper we analyze the effect of the training sample size on the performance of a top-down DR screening algorithm for different types of statistical classifiers. Results are based on partial least squares (PLS), support vector machines (SVM), k-nearest neighbor (kNN), and Naïve Bayes classifiers. Our dataset consisted of digital retinal images collected from a total of 745 cases (595 controls, 150 with DR). We varied the number of normal controls in the training set, while keeping the number of DR samples constant, and repeated the procedure 10 times using randomized training sets to avoid bias. Results show increasing performance in terms of area under the ROC curve (AUC) when the number of DR subjects in the training set increased, with similar trends for each of the classifiers. Of these, PLS and kNN had the highest average AUC. Lower standard deviation and a flattening of the AUC curve give evidence that there is a limit to the learning ability of the classifiers and an optimal number of cases to train on.

  19. Solar Flare Prediction Model with Three Machine-learning Algorithms using Ultraviolet Brightening and Vector Magnetograms

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Nishizuka, N.; Kubo, Y.; Den, M.

    We developed a flare prediction model using machine learning, which is optimized to predict the maximum class of flares occurring in the following 24 hr. Machine learning is used to devise algorithms that can learn from and make decisions on a huge amount of data. We used solar observation data during the period 2010–2015, such as vector magnetograms, ultraviolet (UV) emission, and soft X-ray emission taken by the Solar Dynamics Observatory and the Geostationary Operational Environmental Satellite. We detected active regions (ARs) from the full-disk magnetogram, from which ∼60 features were extracted with their time differentials, including magnetic neutral lines, the current helicity, the UV brightening, and the flare history. After standardizing the feature database, we fully shuffled and randomly separated it into two for training and testing. To investigate which algorithm is best for flare prediction, we compared three machine-learning algorithms: the support vector machine, k-nearest neighbors (k-NN), and extremely randomized trees. The prediction score, the true skill statistic, was higher than 0.9 with a fully shuffled data set, which is higher than that for human forecasts. It was found that k-NN has the highest performance among the three algorithms. The ranking of the feature importance showed that previous flare activity is most effective, followed by the length of magnetic neutral lines, the unsigned magnetic flux, the area of UV brightening, and the time differentials of features over 24 hr, all of which are strongly correlated with the flux emergence dynamics in an AR.

  20. Multi-feature classifiers for burst detection in single EEG channels from preterm infants

    NASA Astrophysics Data System (ADS)

    Navarro, X.; Porée, F.; Kuchenbuch, M.; Chavez, M.; Beuchée, Alain; Carrault, G.

    2017-08-01

    Objective. The study of electroencephalographic (EEG) bursts in preterm infants provides valuable information about maturation or prognostication after perinatal asphyxia. Over the last two decades, a number of works proposed algorithms to automatically detect EEG bursts in preterm infants, but they were designed for populations under 35 weeks of post menstrual age (PMA). However, as the brain activity evolves rapidly during postnatal life, these solutions might underperform with increasing PMA. In this work we focused on preterm infants reaching term ages (PMA ⩾ 36 weeks) using multi-feature classification on a single EEG channel. Approach. Five EEG burst detectors relying on different machine learning approaches were compared: logistic regression (LR), linear discriminant analysis (LDA), k-nearest neighbors (kNN), support vector machines (SVM) and thresholding (Th). Classifiers were trained on visually labeled EEG recordings from 14 very preterm infants (born after 28 weeks of gestation) with 36-41 weeks PMA. Main results. The best-performing classifiers reached about 95% accuracy (kNN, SVM and LR) whereas Th obtained 84%. Compared to human-automatic agreements, LR provided the highest scores (Cohen’s kappa = 0.71) using only three EEG features. Applying this classifier in an unlabeled database of 21 infants ⩾ 36 weeks PMA, we found that long EEG bursts and short inter-burst periods are characteristic of infants with the highest PMA and weights. Significance. In view of these results, LR-based burst detection could be a suitable tool to study maturation in monitoring or portable devices using a single EEG channel.
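
    Cohen's kappa, the agreement score reported above, corrects the observed agreement between two label sequences for the agreement expected by chance. The sketch below uses the standard definition; the function name and the toy burst/non-burst labels are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """kappa = (p_observed - p_expected) / (1 - p_expected).

    Undefined (division by zero) when both raters use a single identical label.
    """
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: 1 = burst, 0 = inter-burst, for eight EEG segments
human = [1, 1, 0, 0, 1, 0, 1, 0]
automatic = [1, 1, 0, 0, 1, 0, 0, 1]
print(cohens_kappa(human, automatic))  # 0.5
```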

  1. Quantifying Postural Control during Exergaming Using Multivariate Whole-Body Movement Data: A Self-Organizing Maps Approach.

    PubMed

    van Diest, Mike; Stegenga, Jan; Wörtche, Heinrich J; Roerdink, Jos B T M; Verkerke, Gijsbertus J; Lamoth, Claudine J C

    2015-01-01

    Exergames are becoming an increasingly popular tool for training balance ability, thereby preventing falls in older adults. Automatic, real time, assessment of the user's balance control offers opportunities in terms of providing targeted feedback and dynamically adjusting the gameplay to the individual user, yet algorithms for quantification of balance control remain to be developed. The aim of the present study was to identify movement patterns, and variability therein, of young and older adults playing a custom-made weight-shifting (ice-skating) exergame. Twenty older adults and twenty young adults played a weight-shifting exergame under five conditions of varying complexity, while multi-segmental whole-body movement data were captured using Kinect. Movement coordination patterns expressed during gameplay were identified using Self Organizing Maps (SOM), an artificial neural network, and variability in these patterns was quantified by computing Total Trajectory Variability (TTvar). Additionally a k Nearest Neighbor (kNN) classifier was trained to discriminate between young and older adults based on the SOM features. Results showed that TTvar was significantly higher in older adults than in young adults, when playing the exergame under complex task conditions. The kNN classifier showed a classification accuracy of 65.8%. Older adults display more variable sway behavior than young adults, when playing the exergame under complex task conditions. The SOM features characterizing movement patterns expressed during exergaming allow for discriminating between young and older adults with limited accuracy. Our findings contribute to the development of algorithms for quantification of balance ability during home-based exergaming for balance training.

  2. A MapReduce approach to diminish imbalance parameters for big deoxyribonucleic acid dataset.

    PubMed

    Kamal, Sarwar; Ripon, Shamim Hasnat; Dey, Nilanjan; Ashour, Amira S; Santhi, V

    2016-07-01

    In the age of the information superhighway, big data play a significant role in information processing, extraction, retrieval and management. In computational biology, the continuous challenge is to manage the biological data. Data mining techniques are sometimes inadequate for new space and time requirements. Thus, it is critical to process massive amounts of data to retrieve knowledge. The existing software and automated tools to handle big data sets are not sufficient. As a result, an expandable mining technique that exploits the large storage and processing capability of distributed or parallel processing platforms is essential. In this analysis, a contemporary distributed clustering methodology for imbalanced data reduction using a k-nearest neighbor (K-NN) classification approach has been introduced. The pivotal objective of this work is to represent real training data sets with a reduced number of elements or instances. These reduced data sets will ensure faster data classification and standard storage management with less sensitivity. However, general data reduction methods cannot manage very big data sets. To minimize these difficulties, a MapReduce-oriented framework is designed using various clusters of automated contents, comprising multiple algorithmic approaches. To test the proposed approach, a real DNA (deoxyribonucleic acid) dataset that consists of 90 million pairs has been used. The proposed model reduces the imbalanced data sets from large-scale data sets without loss of accuracy. The results show that the MapReduce-based K-NN classifier provided accurate results for big DNA data. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  3. An Analysis on Sensor Locations of the Human Body for Wearable Fall Detection Devices: Principles and Practice.

    PubMed

    Özdemir, Ahmet Turan

    2016-07-25

    Wearable devices for fall detection have received attention in academia and industry, because falls are very dangerous, especially for elderly people, and if immediate aid is not provided, it may result in death. However, some predictive devices are not easily worn by elderly people. In this work, a huge dataset, including 2520 tests, is employed to determine the best sensor placement location on the body and to reduce the number of sensor nodes for device ergonomics. During the tests, the volunteer's movements are recorded with six groups of sensors each with a triaxial (accelerometer, gyroscope and magnetometer) sensor, which is placed tightly on different parts of the body with special straps: head, chest, waist, right-wrist, right-thigh and right-ankle. The accuracy of individual sensor groups with their location is investigated with six machine learning techniques, namely the k-nearest neighbor (k-NN) classifier, Bayesian decision making (BDM), support vector machines (SVM), least squares method (LSM), dynamic time warping (DTW) and artificial neural networks (ANNs). Each technique is applied to single, double, triple, quadruple, quintuple and sextuple sensor configurations. These configurations create 63 different combinations, and for six machine learning techniques, a total of 63 × 6 = 378 combinations is investigated. As a result, the waist region is found to be the most suitable location for sensor placement on the body with 99.96% fall detection sensitivity by using the k-NN classifier, whereas the best sensitivity achieved by the wrist sensor is 97.37%, despite this location being highly preferred for today's wearable applications.

  4. A genetic algorithm-based framework for wavelength selection on sample categorization.

    PubMed

    Anzanello, Michel J; Yamashita, Gabrielli; Marcelo, Marcelo; Fogliatto, Flávio S; Ortiz, Rafael S; Mariotti, Kristiane; Ferrão, Marco F

    2017-08-01

    In forensic and pharmaceutical scenarios, the application of chemometrics and optimization techniques has unveiled common and peculiar features of seized medicine and drug samples, helping investigative forces to track illegal operations. This paper proposes a novel framework aimed at identifying relevant subsets of attenuated total reflectance Fourier transform infrared (ATR-FTIR) wavelengths for classifying samples into two classes, for example authentic or forged categories in case of medicines, or salt or base form in cocaine analysis. In the first step of the framework, the ATR-FTIR spectra were partitioned into equidistant intervals and the k-nearest neighbour (KNN) classification technique was applied to each interval to insert samples into proper classes. In the next step, selected intervals were refined through the genetic algorithm (GA) by identifying a limited number of wavelengths from the intervals previously selected aimed at maximizing classification accuracy. When applied to Cialis®, Viagra®, and cocaine ATR-FTIR datasets, the proposed method substantially decreased the number of wavelengths needed to categorize, and increased the classification accuracy. From a practical perspective, the proposed method provides investigative forces with valuable information towards monitoring illegal production of drugs and medicines. In addition, focusing on a reduced subset of wavelengths allows the development of portable devices capable of testing the authenticity of samples during police checking events, avoiding the need for later laboratorial analyses and reducing equipment expenses. Theoretically, the proposed GA-based approach yields more refined solutions than the current methods relying on interval approaches, which tend to insert irrelevant wavelengths in the retained intervals. Copyright © 2016 John Wiley & Sons, Ltd.

  5. Location of Nearest Rocky Exoplanet Known

    NASA Image and Video Library

    2015-07-30

    This sky map shows the location of the star HD 219134 (circle), host to the nearest confirmed rocky planet found to date outside of our solar system. The star lies just off the "W" shape of the constellation Cassiopeia and can be seen with the naked eye in dark skies. It actually has multiple planets, none of which are habitable. http://photojournal.jpl.nasa.gov/catalog/PIA19832

  6. Shifting from Empowered Agencies to Empowered People: Neighbours, Inc.

    ERIC Educational Resources Information Center

    Walker, Pam; Cory, Rebecca

    This report describes Neighbours, Inc., a nonprofit organization based in Franklin Park, New Jersey, that offers individualized supports for people with disabilities. In addition to the CEO and the director, the agency employs five advisors. These advisors each work to coordinate support for between five and seven people. Advisors, who typically…

  7. Consensus model for identification of novel PI3K inhibitors in large chemical library.

    PubMed

    Liew, Chin Yee; Ma, Xiao Hua; Yap, Chun Wei

    2010-02-01

    Phosphoinositide 3-kinases (PI3Ks) inhibitors have treatment potential for cancer, diabetes, cardiovascular disease, chronic inflammation and asthma. A consensus model consisting of three base classifiers (AODE, kNN, and SVM) trained with 1,283 positive compounds (PI3K inhibitors), 16 negative compounds (PI3K non-inhibitors) and 64,078 generated putative negatives was developed for predicting compounds with PI3K inhibitory activity of IC50 ≤ 10 μM. The consensus model has an estimated false positive rate of 0.75%. Nine novel potential inhibitors were identified using the consensus model and several of these contain structural features that are consistent with those found to be important for PI3K inhibitory activities. An advantage of the current model is that it does not require knowledge of 3D structural information of the various PI3K isoforms, which is not readily available for all isoforms.

  8. Consensus model for identification of novel PI3K inhibitors in large chemical library

    NASA Astrophysics Data System (ADS)

    Liew, Chin Yee; Ma, Xiao Hua; Yap, Chun Wei

    2010-02-01

    Phosphoinositide 3-kinases (PI3Ks) inhibitors have treatment potential for cancer, diabetes, cardiovascular disease, chronic inflammation and asthma. A consensus model consisting of three base classifiers (AODE, kNN, and SVM) trained with 1,283 positive compounds (PI3K inhibitors), 16 negative compounds (PI3K non-inhibitors) and 64,078 generated putative negatives was developed for predicting compounds with PI3K inhibitory activity of IC50 ≤ 10 μM. The consensus model has an estimated false positive rate of 0.75%. Nine novel potential inhibitors were identified using the consensus model and several of these contain structural features that are consistent with those found to be important for PI3K inhibitory activities. An advantage of the current model is that it does not require knowledge of 3D structural information of the various PI3K isoforms, which is not readily available for all isoforms.

  9. Realization of the axial next-nearest-neighbor Ising model in U3Al2Ge3

    DOE PAGES

    Fobes, David M.; Lin, Shi-Zeng; Ghimire, Nirmal J.; ...

    2017-11-09

    In this paper, we report small-angle neutron scattering (SANS) measurements and theoretical modeling of U3Al2Ge3. Analysis of the SANS data reveals a phase transition to sinusoidally modulated magnetic order at T_N = 63 K to be second order and a first-order phase transition to ferromagnetic order at T_c = 48 K. Within the sinusoidally modulated magnetic phase (T_c < T < T_N), we uncover a dramatic change, by a factor of 3, in the ordering wave vector as a function of temperature. Finally, these observations all indicate that U3Al2Ge3 is a close realization of the three-dimensional axial next-nearest-neighbor Ising model, a prototypical framework for describing commensurate to incommensurate phase transitions in frustrated magnets.

  10. Realization of the axial next-nearest-neighbor Ising model in U3Al2Ge3

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Fobes, David M.; Lin, Shi-Zeng; Ghimire, Nirmal J.

    In this paper, we report small-angle neutron scattering (SANS) measurements and theoretical modeling of U3Al2Ge3. Analysis of the SANS data reveals a phase transition to sinusoidally modulated magnetic order at T_N = 63 K to be second order and a first-order phase transition to ferromagnetic order at T_c = 48 K. Within the sinusoidally modulated magnetic phase (T_c < T < T_N), we uncover a dramatic change, by a factor of 3, in the ordering wave vector as a function of temperature. Finally, these observations all indicate that U3Al2Ge3 is a close realization of the three-dimensional axial next-nearest-neighbor Ising model, a prototypical framework for describing commensurate to incommensurate phase transitions in frustrated magnets.

  11. Nearest-neighbor thermodynamics of deoxyinosine pairs in DNA duplexes

    PubMed Central

    Watkins, Norman E.; SantaLucia, John

    2005-01-01

    Nearest-neighbor thermodynamic parameters of the ‘universal pairing base’ deoxyinosine were determined for the pairs I·C, I·A, I·T, I·G and I·I adjacent to G·C and A·T pairs. Ultraviolet absorbance melting curves were measured and non-linear regression performed on 84 oligonucleotide duplexes with 9 or 12 bp lengths. These data were combined with data for 13 inosine-containing duplexes from the literature. Multiple linear regression was used to solve for the 32 nearest-neighbor unknowns. The parameters predict the Tm for all sequences within 1.2°C on average. The general trend in decreasing stability is I·C > I·A > I·T ≈ I·G > I·I. The stability trend for the base pair 5′ of the I·X pair is G·C > C·G > A·T > T·A. The stability trend for the base pair 3′ of I·X is the same. These trends indicate a complex interplay between H-bonding, nearest-neighbor stacking, and mismatch geometry. A survey of 14 tandem inosine pairs and 8 tandem self-complementary inosine pairs is also provided. These results may be used in the design of degenerate PCR primers and for degenerate microarray probes. PMID:16264087

  12. Comparing K-mer based methods for improved classification of 16S sequences.

    PubMed

    Vinje, Hilde; Liland, Kristian Hovde; Almøy, Trygve; Snipen, Lars

    2015-07-01

    The need for precise and stable taxonomic classification is highly relevant in modern microbiology. Parallel to the explosion in the amount of sequence data accessible, there has also been a shift in focus for classification methods. Previously, alignment-based methods were the most applicable tools. Now, methods based on counting K-mers by sliding windows are the most interesting classification approach with respect to both speed and accuracy. Here, we present a systematic comparison of five different K-mer based classification methods for the 16S rRNA gene. The methods differ from each other both in data usage and modelling strategies. We have based our study on the commonly known and well-used naïve Bayes classifier from the RDP project, and four other methods were implemented and tested on two different data sets, on full-length sequences as well as fragments of typical read-length. The differences in classification error obtained by the methods seemed to be small, but they were stable for both data sets tested. The Preprocessed nearest-neighbour (PLSNN) method performed best for full-length 16S rRNA sequences, significantly better than the naïve Bayes RDP method. On fragmented sequences the naïve Bayes Multinomial method performed best, significantly better than all other methods. For both data sets explored, and on both full-length and fragmented sequences, all five methods reached an error plateau. We conclude that no K-mer based method is universally best for classifying both full-length sequences and fragments (reads). All methods approach an error plateau, indicating that improved training data is needed to improve classification from here. Classification errors occur most frequently for genera with few sequences present. For improving the taxonomy and testing new classification methods, the need for a better and more universal and robust training data set is crucial.
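
    The "counting K-mers by sliding windows" step mentioned in this abstract can be sketched as follows. This is a minimal illustration only, not the implementation used by any of the five compared methods; the function name `kmer_counts` and the example sequence are assumptions made here.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int) -> Counter:
    """Count overlapping K-mers by sliding a window of width k one base at a time.

    Hypothetical helper for illustration; not taken from the RDP classifier
    or from any of the methods compared in the abstract.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # A sequence of length n yields n - k + 1 windows (none if n < k).
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Made-up toy sequence, not real 16S rRNA data.
counts = kmer_counts("ACGTACGT", 3)
print(counts.most_common())
```

    A classifier would then compare such count vectors between a query sequence and the training sequences; how the counts are weighted and modelled is where the compared methods differ.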

  13. Nearest unlike neighbor (NUN): an aid to decision confidence estimation

    NASA Astrophysics Data System (ADS)

    Dasarathy, Belur V.

    1995-09-01

    The concept of nearest unlike neighbor (NUN), proposed and explored previously in the design of nearest neighbor (NN) based decision systems, is further exploited in this study to develop a measure of confidence in the decisions made by NN-based decision systems. This measure of confidence, on the basis of comparison with a user-defined threshold, may be used to determine the acceptability of the decision provided by the NN-based decision system. The concepts, associated methodology, and some illustrative numerical examples using the now classical Iris data to bring out the ease of implementation and effectiveness of the proposed innovations are presented.
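
    One way a confidence measure could be built from the nearest neighbor and the nearest unlike neighbor is sketched below. The abstract does not give the actual formula, so the ratio used here (1 - d_NN / d_NUN) and the names `nn_decision_with_confidence` and `threshold` are assumptions for illustration, not the author's definition.

```python
import math


def nn_decision_with_confidence(train, query):
    """Return (label, confidence) for a 1-NN decision.

    train: list of (feature_tuple, label) pairs.
    The confidence 1 - d_NN / d_NUN is an assumed, illustrative measure:
    it is near 1 when the nearest sample of a different class is much
    farther away than the nearest neighbor, and near 0 when the two
    are almost equally close.
    """
    nearest_features, label = min(train, key=lambda s: math.dist(s[0], query))
    d_nn = math.dist(nearest_features, query)
    unlike = [math.dist(f, query) for f, lab in train if lab != label]
    if not unlike:          # only one class present: nothing contradicts the decision
        return label, 1.0
    d_nun = min(unlike)     # distance to the nearest unlike neighbor (NUN)
    if d_nun == 0.0:        # an unlike sample coincides with the query
        return label, 0.0
    return label, 1.0 - d_nn / d_nun


# Made-up toy data, not the Iris measurements.
train = [((0.0, 0.0), "a"), ((1.0, 0.0), "a"), ((4.0, 0.0), "b")]
label, confidence = nn_decision_with_confidence(train, (2.0, 0.0))
threshold = 0.4             # hypothetical user-defined acceptance threshold
print(label, confidence, confidence >= threshold)
```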

  14. Missing value imputation for gene expression data by tailored nearest neighbors.

    PubMed

    Faisal, Shahla; Tutz, Gerhard

    2017-04-25

    High dimensional data like gene expression and RNA-sequences often contain missing values. The subsequent analysis and results based on these incomplete data can suffer strongly from the presence of these missing values. Several approaches to imputation of missing values in gene expression data have been developed but the task is difficult due to the high dimensionality (number of genes) of the data. Here an imputation procedure is proposed that uses weighted nearest neighbors. Instead of using nearest neighbors defined by a distance that includes all genes, the distance is computed for genes that are apt to contribute to the accuracy of imputed values. The method aims at avoiding the curse of dimensionality, which typically occurs if local methods such as nearest neighbors are applied in high dimensional settings. The proposed weighted nearest neighbors algorithm is compared to existing missing value imputation techniques like mean imputation, KNNimpute and the recently proposed imputation by random forests. We use RNA-sequence and microarray data from studies on human cancer to compare the performance of the methods. The results from simulations as well as real studies show that the weighted distance procedure can successfully handle missing values for high dimensional data structures where the number of predictors is larger than the number of samples. The method typically outperforms the considered competitors.

  15. The Influence of Semantic Neighbours on Visual Word Recognition

    ERIC Educational Resources Information Center

    Yates, Mark

    2012-01-01

    Although it is assumed that semantics is a critical component of visual word recognition, there is still much that we do not understand. One recent way of studying semantic processing has been in terms of semantic neighbourhood (SN) density, and this research has shown that semantic neighbours facilitate lexical decisions. However, it is not clear…

  16. [Searching for Rare Celestial Objects Automatically from Stellar Spectra of the Sloan Digital Sky Survey Data Release Eight].

    PubMed

    Si, Jian-min; Luo, A-li; Wu, Fu-zhao; Wu, Yi-hong

    2015-03-01

    The spectral dataset of Sloan Digital Sky Survey (SDSS) Data Release eight (DR8) contains many valuable rare and unusual objects, such as special white dwarfs (DZ, DQ, DC), carbon stars, white dwarf main-sequence binaries (WDMS) and cataclysmic variable (CV) stars, so searching for rare and unusual celestial objects in massive spectral datasets is highly significant. A novel algorithm based on kernel density estimation and K-nearest neighbors (KNN) is presented and applied to search for rare and unusual celestial objects among 546 383 stellar spectra of SDSS DR8. The densities of the spectra are estimated using Gaussian kernel density estimation; the top 5 000 spectra in descending order of density are selected as rare objects, and the top 300 000 spectra in ascending order of density are selected as normal objects. KNN is then used to classify the remaining objects, and the K nearest neighbors of the 5 000 rare spectra are also selected as rare objects. In total, 21 193 spectra are selected as initial rare spectra, which include erroneous spectra caused by deletion, reddening or bad calibration, spectra consisting of different physically unrelated components, planetary nebulae, QSOs, special white dwarfs (DZ, DQ, DC), carbon stars, white dwarf main-sequence binaries (WDMS), cataclysmic variable (CV) stars and so on. Cross-identification with SIMBAD, NED, ADS and the major literature shows that three DZ white dwarfs, one WDMS, two CVs with G-type companion stars, three CV candidates, six DC white dwarfs, one DC white dwarf candidate and one BL Lacertae (BL Lac) candidate are new findings. We also found one special DA white dwarf with emission lines of the Ca II triplet and Mg I, and one unknown object whose spectrum resembles a late M star with emission lines and whose image looks like a galaxy or nebula.

  17. Unleashing the Full Sustainable Potential of Thick Films of Lead-Free Potassium Sodium Niobate (K0.5Na0.5NbO3) by Aqueous Electrophoretic Deposition.

    PubMed

    Mahajan, Amit; Pinho, Rui; Dolhen, Morgane; Costa, M Elisabete; Vilarinho, Paula M

    2016-05-31

    A current challenge for the fabrication of functional oxide-based devices is related with the need of environmental and sustainable materials and processes. By considering both lead-free ferroelectrics of potassium sodium niobate (K0.5Na0.5NbO3, KNN) and aqueous-based electrophoretic deposition here we demonstrate that an eco-friendly aqueous solution-based process can be used to produce KNN thick coatings with improved electromechanical performance. KNN thick films on platinum substrates with thickness varying between 10 and 15 μm have a dielectric permittivity of 495, dielectric losses of 0.08 at 1 MHz, and a piezoelectric coefficient d33 of ∼70 pC/N. At TC these films display a relative permittivity of 2166 and loss tangent of 0.11 at 1 MHz. A comparison of the physical properties between these films and their bulk ceramics counterparts demonstrates the impact of the aqueous-based electrophoretic deposition (EPD) technique for the preparation of lead-free ferroelectric thick films. This opens the door to the possible development of high-performance, lead-free piezoelectric thick films by a sustainable low-cost process, expanding the applicability of lead-free piezoelectrics.

  18. [Galaxy/quasar classification based on nearest neighbor method].

    PubMed

    Li, Xiang-Ru; Lu, Yu; Zhou, Jian-Ming; Wang, Yong-Jun

    2011-09-01

    With the wide application of high-quality CCDs in celestial spectrum imagery and the implementation of many large sky survey programs (e.g., Sloan Digital Sky Survey (SDSS), Two-degree-Field Galaxy Redshift Survey (2dF), Spectroscopic Survey Telescope (SST), Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) program and Large Synoptic Survey Telescope (LSST) program, etc.), celestial observational data are arriving in a torrent. Therefore, to utilize them effectively and fully, research on automated processing methods for celestial data is imperative. In the present work, we investigated how to recognize galaxies and quasars from spectra based on the nearest neighbor method. Galaxies and quasars are extragalactic objects; they are far away from Earth, and their spectra are usually contaminated by various kinds of noise. Recognizing these two types of spectra is therefore a typical problem in automatic spectral classification. Furthermore, the method used, nearest neighbor, is one of the most typical, classic and mature algorithms in pattern recognition and data mining, and is often used as a benchmark when developing novel algorithms. Regarding practical applicability, it is shown that the recognition ratio of the nearest neighbor method (NN) is comparable to the best results reported in the literature based on more complicated methods. The advantage of NN is that it does not need to be trained, which is useful for incremental learning and parallel computation in mass spectral data processing. In conclusion, the results of this work are helpful for studying the spectral classification of galaxies and quasars.

  19. Floor Identification with Commercial Smartphones in Wifi-Based Indoor Localization System

    NASA Astrophysics Data System (ADS)

    Ai, H. J.; Liu, M. Y.; Shi, Y. M.; Zhao, J. Q.

    2016-06-01

    In this paper, we utilize novel sensors built into commercial smart devices to propose a scheme which can identify floors with high accuracy and efficiency. This scheme can be divided into two modules: floor identification and floor change detection. The floor identification module starts at the initial phase of positioning and is responsible for determining on which floor the positioning starts. We have evaluated two methods to identify the initial floor, based on K-Nearest Neighbors (KNN) and BP Neural Network, respectively. In order to improve the performance of the KNN algorithm, we proposed a novel method based on weighting signal strength, which can identify floors robustly and quickly. The floor change detection module turns on after entering the continuous positioning procedure. In this module, sensors (such as accelerometer and barometer) of smart devices are used to determine whether the user is going up or down stairs or taking an elevator. This method fuses different kinds of sensor data and can adapt to various motion patterns of users. We conducted our experiment with a mobile client on an Android phone (Nexus 5) in a four-floor building with an open area between the second and third floors. The results demonstrate that our scheme can achieve an overall accuracy of 99% in identifying floors and 97% in detecting floor changes.

  20. Automated diagnosis of dry eye using infrared thermography images

    NASA Astrophysics Data System (ADS)

    Acharya, U. Rajendra; Tan, Jen Hong; Koh, Joel E. W.; Sudarshan, Vidya K.; Yeo, Sharon; Too, Cheah Loon; Chua, Chua Kuang; Ng, E. Y. K.; Tong, Louis

    2015-07-01

    Dry Eye (DE) is a condition of either decreased tear production or increased tear film evaporation. Prolonged DE damages the cornea, causing corneal scarring, thinning and perforation. There is no single uniform diagnostic test available to date; combinations of diagnostic tests have to be performed to diagnose DE. The current diagnostic methods available are subjective, uncomfortable and invasive. Hence in this paper, we have developed an efficient, fast and non-invasive technique for the automated identification of normal and DE classes using infrared thermography images. The features are extracted using a nonlinear method called Higher Order Spectra (HOS). Features are ranked using a t-test ranking strategy. These ranked features are fed to various classifiers, namely K-Nearest Neighbor (KNN), Naive Bayesian Classifier (NBC), Decision Tree (DT), Probabilistic Neural Network (PNN), and Support Vector Machine (SVM), to select the best classifier using a minimum number of features. Our proposed system is able to identify the DE and normal classes automatically with classification accuracy of 99.8%, sensitivity of 99.8%, and specificity of 99.8% for the left eye using PNN and KNN classifiers. We also report classification accuracy of 99.8%, sensitivity of 99.9%, and specificity of 99.4% for the right eye using an SVM classifier with a polynomial kernel of order 2.

  1. Probabilistic brain tissue segmentation in neonatal magnetic resonance imaging.

    PubMed

    Anbeek, Petronella; Vincken, Koen L; Groenendaal, Floris; Koeman, Annemieke; van Osch, Matthias J P; van der Grond, Jeroen

    2008-02-01

    A fully automated method has been developed for segmentation of four different structures in the neonatal brain: white matter (WM), central gray matter (CEGM), cortical gray matter (COGM), and cerebrospinal fluid (CSF). The segmentation algorithm is based on information from T2-weighted (T2-w) and inversion recovery (IR) scans. The method uses a K nearest neighbor (KNN) classification technique with features derived from spatial information and voxel intensities. Probabilistic segmentations of each tissue type were generated. By applying thresholds on these probability maps, binary segmentations were obtained. These final segmentations were evaluated by comparison with a gold standard. The sensitivity, specificity, and Dice similarity index (SI) were calculated for quantitative validation of the results. High sensitivity and specificity with respect to the gold standard were reached: sensitivity >0.82 and specificity >0.9 for all tissue types. Tissue volumes were calculated from the binary and probabilistic segmentations. The probabilistic segmentation volumes of all tissue types accurately estimated the gold standard volumes. The KNN approach offers valuable ways for neonatal brain segmentation. The probabilistic outcomes provide a useful tool for accurate volume measurements. The described method is based on routine diagnostic magnetic resonance imaging (MRI) and is suitable for large population studies.
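
    The step from KNN classification to probabilistic and then binary segmentations can be sketched as follows. This is a simplified illustration under stated assumptions: the class probability is taken as the fraction of the k nearest training samples carrying each label, the feature vectors and the threshold value are made up, and the function name `knn_probabilities` is hypothetical. The abstract does not specify the features, k, or thresholds actually used.

```python
import math
from collections import Counter


def knn_probabilities(train, query, k):
    """Estimate class probabilities as label fractions among the k nearest samples.

    train: list of (feature_tuple, label) pairs, e.g. per-voxel features.
    Assumed, simplified scheme for illustration only.
    """
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return {label: n / len(neighbours) for label, n in votes.items()}


# Made-up feature vectors standing in for per-voxel intensity/spatial features.
train = [((0, 0), "CSF"), ((1, 0), "CSF"), ((5, 0), "WM"), ((6, 0), "WM"), ((3, 0), "WM")]
probs = knn_probabilities(train, (2, 0), k=3)

# Thresholding the probability map gives a binary segmentation for one tissue type.
threshold = 0.5             # hypothetical threshold
is_csf = probs.get("CSF", 0.0) > threshold
print(probs, is_csf)
```

    Applying this to every voxel yields one probability map per tissue type; summing the probabilities over voxels, rather than counting thresholded voxels, is one way a probabilistic volume could be obtained.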

  2. Monte Carlo study of a ferrimagnetic mixed-spin (2, 5/2) system with the nearest and next-nearest neighbors exchange couplings

    NASA Astrophysics Data System (ADS)

    Bi, Jiang-lin; Wang, Wei; Li, Qi

    2017-07-01

    In this paper, the effects of the next-nearest neighbors exchange couplings on the magnetic and thermal properties of the ferrimagnetic mixed-spin (2, 5/2) Ising model on a 3D honeycomb lattice have been investigated by Monte Carlo simulation. In particular, the influences of the exchange couplings (Ja, Jb, Jan) and the single-ion anisotropy (Da) on the phase diagrams, the total magnetization, the sublattice magnetization, the total susceptibility, the internal energy and the specific heat have been discussed in detail. The results clearly show that the system can exhibit critical and compensation behavior in the presence of the next-nearest neighbors exchange coupling. A great variety of M curves, such as N-, Q-, P- and L-types, have been discovered, owing to the competition between the exchange coupling and the temperature. Our results are in excellent agreement with other theoretical and experimental works.

  3. RKNNMDA: Ranking-based KNN for MiRNA-Disease Association prediction

    PubMed Central

    Chen, Xing; Yan, Gui-Ying

    2017-01-01

    Cumulative verified experimental studies have demonstrated that microRNAs (miRNAs) could be closely related to the development and progression of human complex diseases. Based on the assumption that functionally similar miRNAs may have a strong correlation with phenotypically similar diseases and vice versa, researchers have developed various effective computational models which combine heterogeneous biological data sets, including the disease similarity network, the miRNA similarity network, and the known disease-miRNA association network, to identify potential relationships between miRNAs and diseases in biomedical research. Considering the limitations of previous computational studies, we introduced a novel computational method of Ranking-based KNN for miRNA-Disease Association prediction (RKNNMDA) to predict potential related miRNAs for diseases, and our method obtained an AUC of 0.8221 based on leave-one-out cross validation. In addition, RKNNMDA was applied to 3 kinds of important human cancers for further performance evaluation. The results showed that 96%, 80% and 94% of the predicted top 50 potential related miRNAs for Colon Neoplasms, Esophageal Neoplasms, and Prostate Neoplasms, respectively, have been confirmed by the experimental literature. Moreover, RKNNMDA could be used to predict potential miRNAs for diseases without any known miRNAs, and it is anticipated that RKNNMDA would be of great use for novel miRNA-disease association identification. PMID:28421868

  4. A triboelectric motion sensor in wearable body sensor network for human activity recognition.

    PubMed

    Hui Huang; Xian Li; Ye Sun

    2016-08-01

    The goal of this study is to design a novel triboelectric motion sensor in a wearable body sensor network for human activity recognition. Physical activity recognition is widely used in well-being management, medical diagnosis and rehabilitation. As an alternative to traditional accelerometers, we design a novel wearable sensor system based on triboelectrification. The triboelectric motion sensor can be easily attached to the human body and collects motion signals caused by physical activities. Experiments are conducted to collect data for five common activities: sitting and standing, walking, climbing upstairs, downstairs, and running. The k-Nearest Neighbor (kNN) clustering algorithm is adopted to recognize these activities and validate the feasibility of this new approach. The results show that our system can perform physical activity recognition with a success rate over 80% for walking, sitting and standing. The triboelectric structure can also be used as an energy harvester for motion harvesting due to its high output voltage in random low-frequency motion.

  5. Multi-texture local ternary pattern for face recognition

    NASA Astrophysics Data System (ADS)

    Essa, Almabrok; Asari, Vijayan

    2017-05-01

    In the imagery and pattern analysis domain, a variety of descriptors have been proposed and employed for different computer vision applications like face detection and recognition. Many of them are affected by conditions during the image acquisition process, such as variations in illumination and the presence of noise, because they rely entirely on the image intensity values to encode the image information. To overcome these problems, a novel technique named Multi-Texture Local Ternary Pattern (MTLTP) is proposed in this paper. MTLTP combines the edges and corners based on the local ternary pattern strategy to extract the local texture features of the input image. It then returns a spatial histogram feature vector, which is the descriptor for each image that we use to recognize a human being. Experimental results using a k-nearest neighbors classifier (k-NN) on two publicly available datasets justify our algorithm for efficient face recognition in the presence of extreme variations of illumination/lighting environments and slight variation of pose conditions.

  6. Classification of acoustic emission signals using wavelets and Random Forests : Application to localized corrosion

    NASA Astrophysics Data System (ADS)

    Morizet, N.; Godin, N.; Tang, J.; Maillet, E.; Fregonese, M.; Normand, B.

    2016-03-01

    This paper aims to propose a novel approach to classify acoustic emission (AE) signals deriving from corrosion experiments, even if embedded into a noisy environment. To validate this new methodology, synthetic data are first used throughout an in-depth analysis, comparing Random Forests (RF) to the k-Nearest Neighbor (k-NN) algorithm. Moreover, a new evaluation tool called the alter-class matrix (ACM) is introduced to simulate different degrees of uncertainty on labeled data for supervised classification. Then, tests on real cases involving noise and crevice corrosion are conducted, by preprocessing the waveforms including wavelet denoising and extracting a rich set of features as input of the RF algorithm. To this end, a software called RF-CAM has been developed. Results show that this approach is very efficient on ground truth data and is also very promising on real data, especially for its reliability, performance and speed, which are serious criteria for the chemical industry.

  7. Intra-regional classification of grape seeds produced in Mendoza province (Argentina) by multi-elemental analysis and chemometrics tools.

    PubMed

    Canizo, Brenda V; Escudero, Leticia B; Pérez, María B; Pellerano, Roberto G; Wuilloud, Rodolfo G

    2018-03-01

    The feasibility of the application of chemometric techniques associated with multi-element analysis for the classification of grape seeds according to their provenance vineyard soil was investigated. Grape seed samples from different localities of Mendoza province (Argentina) were evaluated. Inductively coupled plasma mass spectrometry (ICP-MS) was used for the determination of twenty-nine elements (Ag, As, Ce, Co, Cs, Cu, Eu, Fe, Ga, Gd, La, Lu, Mn, Mo, Nb, Nd, Ni, Pr, Rb, Sm, Te, Ti, Tl, Tm, U, V, Y, Zn and Zr). Once the analytical data were collected, supervised pattern recognition techniques such as linear discriminant analysis (LDA), partial least square discriminant analysis (PLS-DA), k-nearest neighbors (k-NN), support vector machine (SVM) and Random Forest (RF) were applied to construct classification/discrimination rules. The results indicated that nonlinear methods, RF and SVM, perform best with up to 98% and 93% accuracy rate, respectively, and therefore are excellent tools for classification of grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Continuous statistical modelling for rapid detection of adulteration of extra virgin olive oil using mid infrared and Raman spectroscopic data.

    PubMed

    Georgouli, Konstantia; Martinez Del Rincon, Jesus; Koidis, Anastasios

    2017-02-15

    The main objective of this work was to develop a novel dimensionality reduction technique as a part of an integrated pattern recognition solution capable of identifying adulterants such as hazelnut oil in extra virgin olive oil at low percentages based on spectroscopic chemical fingerprints. A novel Continuous Locality Preserving Projections (CLPP) technique is proposed which allows the modelling of the continuous nature of the produced in-house admixtures as data series instead of discrete points. The maintenance of the continuous structure of the data manifold enables the better visualisation of this examined classification problem and facilitates the more accurate utilisation of the manifold for detecting the adulterants. The performance of the proposed technique is validated with two different spectroscopic techniques (Raman and Fourier transform infrared, FT-IR). In all cases studied, CLPP accompanied by k-Nearest Neighbors (kNN) algorithm was found to outperform any other state-of-the-art pattern recognition techniques. Copyright © 2016 Elsevier Ltd. All rights reserved.

  9. Breast Cancer Detection with Reduced Feature Set.

    PubMed

    Mert, Ahmet; Kılıç, Niyazi; Bilgili, Erdem; Akan, Aydin

    2015-01-01

    This paper explores the feature reduction properties of independent component analysis (ICA) in a breast cancer decision support system. The Wisconsin diagnostic breast cancer (WDBC) dataset is reduced to a one-dimensional feature vector by computing an independent component (IC). The original data with 30 features and the reduced single feature (IC) are used to evaluate the diagnostic accuracy of classifiers such as k-nearest neighbor (k-NN), artificial neural network (ANN), radial basis function neural network (RBFNN), and support vector machine (SVM). The proposed classification using the IC is also compared with the original feature set under different validation (5/10-fold cross-validation) and partitioning (20%-40%) methods. The classifiers are evaluated on how effectively they categorize tumors as benign or malignant in terms of specificity, sensitivity, accuracy, F-score, Youden's index, discriminant power, and the receiver operating characteristic (ROC) curve with its criterion values, including area under curve (AUC) and 95% confidence interval (CI). This represents an improvement in diagnostic decision support systems, while reducing computational complexity.
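
    Several of the evaluation measures listed in this abstract follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard textbook definitions; the function name `binary_metrics` and the counts in the example are made up for illustration and are not results from the paper.

```python
def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard measures from confusion-matrix counts.

    tp/fn: malignant cases classified correctly/incorrectly,
    tn/fp: benign cases classified correctly/incorrectly.
    Assumes every denominator is non-zero.
    """
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    precision = tp / (tp + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    youden = sensitivity + specificity - 1  # Youden's index J
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
        "youden": youden,
    }


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
print(binary_metrics(tp=90, fn=10, tn=80, fp=20))
```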

  10. The different neighbours around Type-1 and Type-2 active galactic nuclei

    NASA Astrophysics Data System (ADS)

    Villarroel, Beatriz; Korn, Andreas J.

    2014-06-01

    One of the most intriguing open issues in galaxy evolution is the structure and evolution of active galactic nuclei (AGN) that emit intense light believed to come from an accretion disk near a super massive black hole. To understand the zoo of different AGN classes, it has been suggested that all AGN are the same type of object viewed from different angles. This model, called AGN unification, has been successful in predicting, for example, the existence of hidden broad optical lines in the spectrum of many narrow-line AGN. But this model is not unchallenged and it is debatable whether more than viewing angle separates the so-called Type-1 and Type-2 AGN. Here we report the first large-scale study that finds strong differences in the galaxy neighbours to Type-1 and Type-2 AGN with data from the Sloan Digital Sky Survey (SDSS) Data Release 7 (DR7) and Galaxy Zoo. We find strong differences in the colour and AGN activity of the neighbours to Type-1 and Type-2 AGN and in how the fraction of AGN residing in spiral hosts changes depending on the presence or not of a neighbour. These findings suggest that an evolutionary link between the two major AGN types might exist.

  11. Compositional dependence of phase structure and electrical properties in (K0.42Na0.58)NbO3-LiSbO3 lead-free ceramics

    NASA Astrophysics Data System (ADS)

    Wu, Jiagang; Xiao, Dingquan; Wang, Yuanyu; Zhu, Jianguo; Yu, Ping; Jiang, Yihang

    2007-12-01

    (1-x)(K0.42Na0.58)NbO3-xLiSbO3 [(1-x)KNN-xLS] lead-free piezoelectric ceramics were prepared by the conventional mixed oxide method. The compositional dependence of the phase structure and the electrical properties of the ceramics were studied. A morphotropic phase boundary (MPB) between the orthorhombic and tetragonal phases was identified in the composition range of 0.04kV/cm) and possess low dielectric loss (<2%) at 10 and 100 kHz at high temperature (250-400 °C). The low dielectric loss at high temperature is very important for high-temperature application of the ceramics. The related mechanism of the enhanced electrical properties of the ceramics was also discussed. These results show that (1-x)KNN-xLS (x =0.05) ceramic is a promising lead-free piezoelectric material.

  12. The Milky Way's Tiny but Tough Galactic Neighbour

    NASA Astrophysics Data System (ADS)

    2009-10-01

    Today ESO announces the release of a stunning new image of one of our nearest galactic neighbours, Barnard's Galaxy, also known as NGC 6822. The galaxy contains regions of rich star formation and curious nebulae, such as the bubble clearly visible in the upper left of this remarkable vista. Astronomers classify NGC 6822 as an irregular dwarf galaxy because of its odd shape and relatively diminutive size by galactic standards. The strange shapes of these cosmic misfits help researchers understand how galaxies interact, evolve and occasionally "cannibalise" each other, leaving behind radiant, star-filled scraps. In the new ESO image, Barnard's Galaxy glows beneath a sea of foreground stars in the direction of the constellation of Sagittarius (the Archer). At the relatively close distance of about 1.6 million light-years, Barnard's Galaxy is a member of the Local Group, the archipelago of galaxies that includes our home, the Milky Way. The nickname of NGC 6822 comes from its discoverer, the American astronomer Edward Emerson Barnard, who first spied this visually elusive cosmic islet using a 125-millimetre aperture refractor in 1884. Astronomers obtained this latest portrait using the Wide Field Imager (WFI) attached to the 2.2-metre MPG/ESO telescope at ESO's La Silla Observatory in northern Chile. Even though Barnard's Galaxy lacks the majestic spiral arms and glowing, central bulge that grace its big galactic neighbours, the Milky Way, the Andromeda and the Triangulum galaxies, this dwarf galaxy has no shortage of stellar splendour and pyrotechnics. Reddish nebulae in this image reveal regions of active star formation, where young, hot stars heat up nearby gas clouds. Also prominent in the upper left of this new image is a striking bubble-shaped nebula. At the nebula's centre, a clutch of massive, scorching stars send waves of matter smashing into the surrounding interstellar material, generating a glowing structure that appears ring-like from our perspective

  13. Using artificial intelligence to bring evidence-based medicine a step closer to making the individual difference.

    PubMed

    Sissons, B; Gray, W A; Bater, A; Morrey, D

    2007-03-01

    The vision of evidence-based medicine is that of experienced clinicians systematically using the best research evidence to meet the individual patient's needs. This vision remains distant from clinical reality, as no complete methodology exists to apply objective, population-based research evidence to the needs of an individual real-world patient. We describe an approach, based on techniques from machine learning, to bridge this gap between evidence and individual patients in oncology. We examine existing proposals for tackling this gap and the relative benefits and challenges of our proposed, k-nearest-neighbour-based, approach.

  14. Measures of galaxy environment - I. What is 'environment'?

    NASA Astrophysics Data System (ADS)

    Muldrew, Stuart I.; Croton, Darren J.; Skibba, Ramin A.; Pearce, Frazer R.; Ann, Hong Bae; Baldry, Ivan K.; Brough, Sarah; Choi, Yun-Young; Conselice, Christopher J.; Cowan, Nicolas B.; Gallazzi, Anna; Gray, Meghan E.; Grützbauch, Ruth; Li, I.-Hui; Park, Changbom; Pilipenko, Sergey V.; Podgorzec, Bret J.; Robotham, Aaron S. G.; Wilman, David J.; Yang, Xiaohu; Zhang, Youcai; Zibetti, Stefano

    2012-01-01

    The influence of a galaxy's environment on its evolution has been studied and compared extensively in the literature, although differing techniques are often used to define environment. Most methods fall into two broad groups: those that use nearest neighbours to probe the underlying density field and those that use fixed apertures. The differences between the two inhibit a clean comparison between analyses and leave open the possibility that, even with the same data, different properties are actually being measured. In this work, we apply 20 published environment definitions to a common mock galaxy catalogue constrained to look like the local Universe. We find that nearest-neighbour-based measures best probe the internal densities of high-mass haloes, while at low masses the interhalo separation dominates and acts to smooth out local density variations. The resulting correlation also shows that nearest-neighbour galaxy environment is largely independent of dark matter halo mass. Conversely, aperture-based methods that probe superhalo scales accurately identify high-density regions corresponding to high-mass haloes. Both methods show how galaxies in dense environments tend to be redder, with the exception of the largest apertures, but these are the strongest at recovering the background dark matter environment. We also warn against using photometric redshifts to define environment in all but the densest regions. When considering environment, there are two regimes: the 'local environment' internal to a halo best measured with nearest neighbour and 'large-scale environment' external to a halo best measured with apertures. This leads to the conclusion that there is no universal environment measure and the most suitable method depends on the scale being probed.
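
    The two families of environment measures contrasted above can be sketched in a few lines. This is a minimal illustration, not one of the 20 published definitions: the surface-density form n / (pi * r_n^2), the function names and the toy coordinates are all assumptions made for the example.

```python
import math

def nth_neighbour_density(points, index, n):
    """Surface density around points[index] from the distance to its n-th nearest neighbour.

    Assumed form: n / (pi * r_n^2), a common choice for projected densities.
    """
    x0, y0 = points[index]
    dists = sorted(
        math.hypot(x - x0, y - y0)
        for i, (x, y) in enumerate(points)
        if i != index
    )
    r_n = dists[n - 1]  # distance to the n-th nearest neighbour
    return n / (math.pi * r_n ** 2)

def aperture_count(points, index, radius):
    """Number of other points inside a fixed circular aperture around points[index]."""
    x0, y0 = points[index]
    return sum(
        1
        for i, (x, y) in enumerate(points)
        if i != index and math.hypot(x - x0, y - y0) <= radius
    )

# Toy 2-D positions (hypothetical, arbitrary units).
galaxies = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0), (10.0, 10.0)]
sigma_2 = nth_neighbour_density(galaxies, 0, 2)  # adapts its scale to the local density
count_5 = aperture_count(galaxies, 0, 5.0)       # probes one fixed scale everywhere
```

    The nearest-neighbour measure shrinks its probing radius in dense regions, while the aperture keeps the same scale everywhere, which is the distinction the abstract draws between local and large-scale environment.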

  15. Prediction of microsleeps using pairwise joint entropy and mutual information between EEG channels.

    PubMed

    Baseer, Abdul; Weddell, Stephen J; Jones, Richard D

    2017-07-01

    Microsleeps are involuntary and brief instances of complete loss of responsiveness, typically of 0.5-15 s duration. They adversely affect performance in extended attention-driven jobs and can be fatal. Our aim was to predict microsleeps from 16 channel EEG signals. Two information theoretic concepts - pairwise joint entropy and mutual information - were independently used to continuously extract features from EEG signals. k-nearest neighbor (kNN) with k = 3 was used to calculate both joint entropy and mutual information. Highly correlated features were discarded and the rest were ranked using Fisher score followed by an average of 3-fold cross-validation area under the curve of the receiver operating characteristic (AUC ROC). Leave-one-out method (LOOM) was performed to test the performance of the microsleep prediction system on independent data. The best prediction for 0.25 s ahead was AUC ROC, sensitivity, precision, geometric mean (GM), and φ of 0.93, 0.68, 0.33, 0.75, and 0.38 respectively with joint entropy using a single linear discriminant analysis (LDA) classifier.

  16. Analysis of Optimal Transport Route Determination of Oil Palm Fresh Fruit Bunches from Plantation to Processing Factory

    NASA Astrophysics Data System (ADS)

    Tarigan, U.; Sidabutar, R. F.; Tarigan, U. P. P.; Chen, A.

    2018-04-01

    Manufacturers that produce crude palm oil (CPO) and kernels from oil palm fresh fruit bunches (FFB) harvested from their own plantations generally face problems transporting FFB from plantation to factory: the distance travelled by the carrier trucks often varies because transport instructions are non-specific. This research was conducted to determine the optimal transportation route in terms of distance, time and number of routes. The route was determined using the Nearest Neighbours and Clarke & Wright Savings methods. For harvest area I, the Nearest Neighbours method gave a distance of 200.78 km, while the Clarke & Wright Savings method gave 214.09 km. For harvest area II, the Nearest Neighbours method gave 264.37 km and the Clarke & Wright Savings method gave a total distance of 264.33 km. Comparing the time needed for all FFB transport activities with the drivers' working time allowed the number of conveyances to be reduced from 8 units to 5 units. Fuel efficiency also improved by 0.8%.
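
    The Nearest Neighbours route construction named above can be sketched as a greedy heuristic: from the current location, always drive to the closest unvisited stop, then return to the start. This is a generic sketch under assumed straight-line distances; the coordinates and the names `factory` and `blocks` are hypothetical and are not taken from the study.

```python
import math

def nearest_neighbour_route(depot, stops):
    """Greedy route: start at the depot, repeatedly visit the closest unvisited stop,
    then return to the depot. Returns the visiting order and the total length."""
    route = [depot]
    remaining = list(stops)
    current = depot
    total = 0.0
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        total += math.dist(current, nxt)
        remaining.remove(nxt)
        route.append(nxt)
        current = nxt
    total += math.dist(current, depot)  # close the tour
    route.append(depot)
    return route, total

# Hypothetical factory and harvest-block coordinates in km.
factory = (0.0, 0.0)
blocks = [(0.0, 3.0), (4.0, 3.0), (4.0, 0.0)]
route, length_km = nearest_neighbour_route(factory, blocks)
```

    The heuristic is fast but not guaranteed to be optimal, which is consistent with the abstract reporting that the two methods trade places between harvest areas.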

  17. On the consistency between nearest-neighbor peridynamic discretizations and discretized classical elasticity models

    DOE PAGES

    Seleson, Pablo; Du, Qiang; Parks, Michael L.

    2016-08-16

    The peridynamic theory of solid mechanics is a nonlocal reformulation of the classical continuum mechanics theory. At the continuum level, it has been demonstrated that classical (local) elasticity is a special case of peridynamics. Such a connection between these theories has not been extensively explored at the discrete level. This paper investigates the consistency between nearest-neighbor discretizations of linear elastic peridynamic models and finite difference discretizations of the Navier–Cauchy equation of classical elasticity. While nearest-neighbor discretizations in peridynamics have been numerically observed to present grid-dependent crack paths or spurious microcracks, this paper focuses on a different, analytical aspect of such discretizations. We demonstrate that, even in the absence of cracks, such discretizations may be problematic unless a proper selection of weights is used. Specifically, we demonstrate that using the standard meshfree approach in peridynamics, nearest-neighbor discretizations do not reduce, in general, to discretizations of corresponding classical models. We study nodal-based quadratures for the discretization of peridynamic models, and we derive quadrature weights that result in consistency between nearest-neighbor discretizations of peridynamic models and discretized classical models. The quadrature weights that lead to such consistency are, however, model-/discretization-dependent. We motivate the choice of those quadrature weights through a quadratic approximation of displacement fields. The stability of nearest-neighbor peridynamic schemes is demonstrated through a Fourier mode analysis. Finally, an approach based on a normalization of peridynamic constitutive constants at the discrete level is explored. This approach results in the desired consistency for one-dimensional models, but does not work in higher dimensions. The results of the work presented in this paper suggest that even though nearest

  18. Three Transits for the Price of One: Super-Earth Transits of the Nearest Planetary System Discovered By Kepler/K2

    NASA Astrophysics Data System (ADS)

    Redfield, Seth; Niraula, Prajwal; Hedges, Christina; Crossfield, Ian; Kreidberg, Laura; Greene, Tom; Rodriguez, Joey; Vanderburg, Andrew; Laughlin, Gregory; Millholland, Sarah; Wang, Songhu; Cochran, William; Livingston, John; Gandolfi, Davide; Guenther, Eike; Fridlund, Malcolm; Korth, Judith

    2018-05-01

    We propose primary transit observations of three Super-Earth planets in the newly discovered planetary system around a bright, nearby star, GJ 9827. We recently announced the detection of three super-Earth planets in 1:3:5 commensurability, the inner planet, GJ 9827 b having a period of 1.2 days. This is the nearest planetary system that Kepler or K2 has found, at 30 pc, and given its brightness is one of the top systems for follow-up characterization. This system presents a unique opportunity to acquire three planetary transits for the price of one. There are several opportunities in the Spitzer visibility windows to obtain all three transits in a short period of time. We propose 3.6 micron observations of all three Super-Earth transits in a single 18-hour observation window. The proximity to a 1:3:5 resonance is intriguing from a dynamical standpoint as well. Indeed, anomalous transit timing offsets have been measured for planet d in Hubble observations that suffer from partial phase coverage. The short cadence and extended coverage of Spitzer is essential to provide a firm determination of the ephemerides and characterize any transit timing variations. Constraining these orbital parameters is critical for follow-up observations from space and ground-based telescopes. Due to the brightness of the host star, this planetary system is likely to be extensively observed in the years to come. Indeed, our team has acquired observations of the planets orbiting GJ9827 with Hubble in the ultraviolet and infrared. The proposed observations will provide infrared atmospheric measurements and firm orbital characterization which is critical for planning and designing future observations, in particular atmospheric characterization with JWST.

  19. A model for adatom structures

    NASA Astrophysics Data System (ADS)

    Kappus, W.

    1981-06-01

    A model concerning adatom structures is proposed. Attractive nearest neighbour interactions, which may be of electronic nature, lead to 2-dimensional condensation. Every pair bond causes an elastic dipole. The elastic dipoles interact via substrate strains with an anisotropic s^-3 power law. Different types of adatoms or sites are permitted, and many-body effects result from the assumptions. Electric dipole interactions of adatoms are included for comparison. The model is applied to the W(110) surface and compared with superstructures experimentally found in the W(110)-0 system. It is found that an additional next-nearest neighbour interaction is still lacking.

  20. Fall Detection System for the Elderly Based on the Classification of Shimmer Sensor Prototype Data

    PubMed Central

    Ahmed, Moiz; Mehmood, Nadeem; Mehmood, Amir; Rizwan, Kashif

    2017-01-01

    Objectives Falling in the elderly is considered a major cause of death. In recent years, ambient and wireless sensor platforms have been extensively used in developed countries for the detection of falls in the elderly. However, we believe extra efforts are required to address this issue in developing countries, such as Pakistan, where most deaths due to falls are not even reported. Considering this, in this paper, we propose a fall detection system prototype that is based on the classification of real-time Shimmer sensor data. Methods We first developed a data set, ‘SMotion’, of certain postures that could lead to falls in the elderly by using a body area network of Shimmer sensors and categorized the items in this data set into age and weight groups. We developed a feature selection and classification system using three classifiers, namely, support vector machine (SVM), K-nearest neighbor (KNN), and neural network (NN). Finally, a prototype was fabricated to generate alerts to caregivers, health experts, or emergency services in case of a fall. Results To evaluate the proposed system, SVM, KNN, and NN were used. The results of this study identified KNN as the most accurate classifier with maximum accuracy of 96% for age groups and 93% for weight groups. Conclusions In this paper, a classification-based fall detection system is proposed. For this purpose, the SMotion data set was developed and categorized into two groups (age and weight groups). The proposed fall detection system for the elderly is implemented through a body area sensor network using third-generation sensors. The evaluation results demonstrate the reasonable performance of the proposed fall detection prototype system in the tested scenarios. PMID:28875049
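
    For readers unfamiliar with the KNN classifier that performed best here, a minimal sketch follows. It assumes Euclidean distance and a simple majority vote; the two features, their values and the labels are invented for illustration and do not come from the SMotion data set.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical features: (acceleration magnitude in g, tilt angle in degrees).
train_x = [(1.0, 5.0), (1.1, 8.0), (0.9, 6.0), (2.8, 70.0), (3.1, 85.0), (2.9, 80.0)]
train_y = ["normal", "normal", "normal", "fall", "fall", "fall"]

label = knn_predict(train_x, train_y, (3.0, 75.0), k=3)
```

    In practice the features would usually be rescaled first, since an unscaled Euclidean distance lets the feature with the largest numeric range dominate the vote.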

  1. Prediction and identification of sequences coding for orphan enzymes using genomic and metagenomic neighbours

    PubMed Central

    Yamada, Takuji; Waller, Alison S; Raes, Jeroen; Zelezniak, Aleksej; Perchat, Nadia; Perret, Alain; Salanoubat, Marcel; Patil, Kiran R; Weissenbach, Jean; Bork, Peer

    2012-01-01

    Despite the current wealth of sequencing data, one-third of all biochemically characterized metabolic enzymes lack a corresponding gene or protein sequence, and as such can be considered orphan enzymes. They represent a major gap between our molecular and biochemical knowledge, and consequently are not amenable to modern systemic analyses. As 555 of these orphan enzymes have metabolic pathway neighbours, we developed a global framework that utilizes the pathway and (meta)genomic neighbour information to assign candidate sequences to orphan enzymes. For 131 orphan enzymes (37% of those for which (meta)genomic neighbours are available), we associate sequences to them using scoring parameters with an estimated accuracy of 70%, implying functional annotation of 16 345 gene sequences in numerous (meta)genomes. As a case in point, two of these candidate sequences were experimentally validated to encode the predicted activity. In addition, we augmented the currently available genome-scale metabolic models with these new sequence–function associations and were able to expand the models by on average 8%, with a considerable change in the flux connectivity patterns and improved essentiality prediction. PMID:22569339

  2. Estimating local scaling properties for the classification of interstitial lung disease patterns

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Nagarajan, Mahesh B.; Leinsinger, Gerda; Ray, Lawrence A.; Wismueller, Axel

    2011-03-01

    Local scaling properties of texture regions were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honeycombing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and the estimation of local scaling properties with Scaling Index Method (SIM). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions including the Bonferroni correction. The best classification results were obtained by the set of SIM features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers with the highest accuracy (94.1%, 93.7%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced texture features using local scaling properties can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.

  3. Classification of interstitial lung disease patterns with topological texture features

    NASA Astrophysics Data System (ADS)

    Huber, Markus B.; Nagarajan, Mahesh; Leinsinger, Gerda; Ray, Lawrence A.; Wismüller, Axel

    2010-03-01

    Topological texture features were compared in their ability to classify morphological patterns known as 'honeycombing' that are considered indicative for the presence of fibrotic interstitial lung diseases in high-resolution computed tomography (HRCT) images. For 14 patients with known occurrence of honey-combing, a stack of 70 axial, lung kernel reconstructed images were acquired from HRCT chest exams. A set of 241 regions of interest of both healthy and pathological (89) lung tissue were identified by an experienced radiologist. Texture features were extracted using six properties calculated from gray-level co-occurrence matrices (GLCM), Minkowski Dimensions (MDs), and three Minkowski Functionals (MFs, e.g. MF.euler). A k-nearest-neighbor (k-NN) classifier and a Multilayer Radial Basis Functions Network (RBFN) were optimized in a 10-fold cross-validation for each texture vector, and the classification accuracy was calculated on independent test sets as a quantitative measure of automated tissue characterization. A Wilcoxon signed-rank test was used to compare two accuracy distributions and the significance thresholds were adjusted for multiple comparisons by the Bonferroni correction. The best classification results were obtained by the MF features, which performed significantly better than all the standard GLCM and MD features (p < 0.005) for both classifiers. The highest accuracy was found for MF.euler (97.5%, 96.6%; for the k-NN and RBFN classifier, respectively). The best standard texture features were the GLCM features 'homogeneity' (91.8%, 87.2%) and 'absolute value' (90.2%, 88.5%). The results indicate that advanced topological texture features can provide superior classification performance in computer-assisted diagnosis of interstitial lung diseases when compared to standard texture analysis methods.
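
    The Bonferroni adjustment mentioned above amounts to dividing the significance level by the number of comparisons. A minimal sketch, with made-up p-values and a hypothetical helper name:

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value as significant at the Bonferroni-adjusted threshold alpha / m."""
    m = len(p_values)
    threshold = alpha / m
    return [p < threshold for p in p_values], threshold

# Illustrative p-values only; they are not results from the study.
flags, threshold = bonferroni_significant([0.001, 0.004, 0.03, 0.2], alpha=0.05)
```

    With four comparisons the per-test threshold drops from 0.05 to 0.0125, so a p-value of 0.03 no longer counts as significant.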

  4. QSAR modeling of human serum protein binding with several modeling techniques utilizing structure-information representation.

    PubMed

    Votano, Joseph R; Parham, Marc; Hall, L Mark; Hall, Lowell H; Kier, Lemont B; Oloff, Scott; Tropsha, Alexander

    2006-11-30

    Four modeling techniques, using topological descriptors to represent molecular structure, were employed to produce models of human serum protein binding (% bound) on a data set of 1008 experimental values, carefully screened from publicly available sources. To our knowledge, this data is the largest set on human serum protein binding reported for QSAR modeling. The data was partitioned into a training set of 808 compounds and an external validation test set of 200 compounds. Partitioning was accomplished by clustering the compounds in a structure descriptor space so that random sampling of 20% of the whole data set produced an external test set that is a good representative of the training set with respect to both structure and protein binding values. The four modeling techniques include multiple linear regression (MLR), artificial neural networks (ANN), k-nearest neighbors (kNN), and support vector machines (SVM). With the exception of the MLR model, the ANN, kNN, and SVM QSARs were ensemble models. Training set correlation coefficients and mean absolute error ranged from r2=0.90 and MAE=7.6 for ANN to r2=0.61 and MAE=16.2 for MLR. Prediction results from the validation set yielded correlation coefficients and mean absolute errors which ranged from r2=0.70 and MAE=14.1 for ANN to a low of r2=0.59 and MAE=18.3 for the SVM model. Structure descriptors that contribute significantly to the models are discussed and compared with those found in other published models. For the ANN model, structure descriptor trends with respect to their effects on predicted protein binding can assist the chemist in structure modification during the drug design process.
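
    The two figures of merit quoted above can be computed as follows. The abstract does not say whether r2 is the squared Pearson correlation or the coefficient of determination; this sketch assumes the former, and the observed and predicted values are invented.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def squared_pearson_r(observed, predicted):
    """Square of the Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_o * var_p)

# Hypothetical % bound values, not taken from the data set.
observed = [90.0, 50.0, 70.0, 30.0]
predicted = [80.0, 60.0, 75.0, 35.0]

mae = mean_absolute_error(observed, predicted)
r2 = squared_pearson_r(observed, predicted)
```

    MAE is in the same units as the response (% bound here), which makes it easier to interpret than r2 when comparing models on an external test set.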

  5. A novel feature ranking algorithm for biometric recognition with PPG signals.

    PubMed

    Reşit Kavsaoğlu, A; Polat, Kemal; Recep Bozkurt, M

    2014-06-01

    This study describes the application of the photoplethysmography (PPG) signal and the time domain features acquired from its first and second derivatives for biometric identification. For this purpose, a total of 40 features has been extracted and a feature-ranking algorithm is proposed. The proposed algorithm calculates the contribution of each feature to biometric recognition and ranks the features in descending order of contribution. The Euclidean distance and absolute distance formulas are used to determine the contribution of the features. The efficiency of the proposed algorithm is demonstrated by the results of applying the k-NN (k-nearest neighbor) classifier to the features. During application, 15-period PPG signals recorded at two different times from each of thirty healthy subjects with a PPG data acquisition card were used. The first PPG signals recorded from the subjects were evaluated as the 1st configuration; the PPG signals recorded later at a different time as the 2nd configuration and the combination of both were evaluated as the 3rd configuration. When the results were evaluated for the k-NN classifier model created along with the proposed algorithm, identification rates of 90.44% for the 1st configuration, 94.44% for the 2nd configuration, and 87.22% for the 3rd configuration were attained. The obtained results showed that both the proposed algorithm and the biometric identification model based on the PPG signal are very promising for contactless recognition of people. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Reliable Multi-Label Learning via Conformal Predictor and Random Forest for Syndrome Differentiation of Chronic Fatigue in Traditional Chinese Medicine

    PubMed Central

    Wang, Huazhen; Liu, Xin; Lv, Bing; Yang, Fan; Hong, Yanzhu

    2014-01-01

    Objective The etiology, pathophysiology, nomenclature and diagnostic criteria of Chronic Fatigue (CF) remain unclear in the medical community. Traditional Chinese medicine (TCM) adopts a unique diagnostic method, namely ‘bian zheng lun zhi’ or syndrome differentiation, to diagnose CF with a set of syndrome factors, which can be regarded as a Multi-Label Learning (MLL) problem in the machine learning literature. To obtain an effective and reliable diagnostic tool, we use Conformal Predictor (CP), Random Forest (RF) and Problem Transformation method (PT) for the syndrome differentiation of CF. Methods and Materials In this work, using the PT method, CP-RF is extended to handle the MLL problem. CP-RF applies RF to measure the confidence level (p-value) of each label being the true label, and then selects multiple labels whose p-values are larger than the pre-defined significance level as the region prediction. In this paper, we compare the proposed CP-RF with typical CP-NBC (Naïve Bayes Classifier), CP-KNN (K-Nearest Neighbors) and ML-KNN on a CF dataset, which consists of 736 cases. Specifically, 95 symptoms are used to identify CF, and four syndrome factors are employed in the syndrome differentiation, including ‘spleen deficiency’, ‘heart deficiency’, ‘liver stagnation’ and ‘qi deficiency’. Results CP-RF demonstrates an outstanding performance beyond CP-NBC, CP-KNN and ML-KNN under the general metrics of subset accuracy, hamming loss, one-error, coverage, ranking loss and average precision. Furthermore, the performance of CP-RF remains steady across a wide range of confidence levels from 80% to 100%, which indicates its robustness to the threshold determination. In addition, the confidence evaluation provided by CP is valid and well-calibrated. Conclusion CP-RF not only offers outstanding performance but also provides valid confidence evaluation for the CF syndrome differentiation. It would be well applicable to TCM practitioners and

  7. Improving RNA nearest neighbor parameters for helices by going beyond the two-state model.

    PubMed

    Spasic, Aleksandar; Berger, Kyle D; Chen, Jonathan L; Seetin, Matthew G; Turner, Douglas H; Mathews, David H

    2018-06-01

    RNA folding free energy change nearest neighbor parameters are widely used to predict folding stabilities of secondary structures. They were determined by linear regression to datasets of optical melting experiments on small model systems. Traditionally, the optical melting experiments are analyzed assuming a two-state model, i.e. a structure is either complete or denatured. Experimental evidence, however, shows that structures exist in an ensemble of conformations. Partition functions calculated with existing nearest neighbor parameters predict that secondary structures can be partially denatured, which also directly conflicts with the two-state model. Here, a new approach for determining RNA nearest neighbor parameters is presented. Available optical melting data for 34 Watson-Crick helices were fit directly to a partition function model that allows an ensemble of conformations. Fitting parameters were the enthalpy and entropy changes for helix initiation, terminal AU pairs, stacks of Watson-Crick pairs and disordered internal loops. The resulting set of nearest neighbor parameters shows a 38.5% improvement in the sum of residuals in fitting the experimental melting curves compared to the current literature set.

  8. Time trends in avoidable cancer mortality in Switzerland and neighbouring European countries 1996-2010.

    PubMed

    Feller, Anita; Mark, Michael Thomas; Steiner, Annik; Clough-Gorr, Kerri M

    2015-01-01

    What are the trends in avoidable cancer mortality in Switzerland and neighbouring countries? Mortality data and population estimates 1996-2010 were obtained from the Swiss Federal Statistical Office for Switzerland and the World Health Organization Mortality Database (http://www.who.int/healthinfo/mortality_data/en/) for Austria, Germany, France and Italy. Age standardised mortality rates (ASMRs, European standard) per 100 000 person-years were calculated for the population <75 years old by sex for the following groups of cancer deaths: (1) avoidable through primary prevention; (2) avoidable through early detection and treatment; (3) avoidable through improved treatment and medical care; and (4) remaining cancer deaths. To assess time trends in ASMRs, estimated annual percentage changes (EAPCs) with 95% confidence intervals (95% CIs) were calculated. In Switzerland and neighbouring countries cancer mortality in persons <75 years old continuously decreased 1996-2010. Avoidable cancer mortality decreased in all groups of avoidable cancer deaths in both sexes, with one exception. ASMRs for causes avoidable through primary prevention increased in females in all countries (in Switzerland from 16.2 to 20.3 per 100 000 person years, EAPC 2.0 [95% CI 1.4 to 2.6]). Compared with its neighbouring countries, Switzerland showed the lowest rates for all groups of avoidable cancer mortality in males 2008-2010. Overall avoidable cancer mortality decreased, indicating achievements in cancer care and related health policies. However, increasing trends in avoidable cancer mortality through primary prevention for females suggest there is a need in Switzerland and its European neighbouring countries to improve primary prevention.
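
    An estimated annual percentage change is conventionally obtained by fitting a straight line to the logarithm of the rate against calendar year and converting the slope b to 100 * (exp(b) - 1). The sketch below assumes that convention and an ordinary least-squares fit; the series is synthetic (a constant 2% yearly increase starting from 16.2) and is not the study data.

```python
import math

def eapc(years, rates):
    """Estimated annual percentage change from a least-squares fit of log(rate) on year."""
    n = len(years)
    logs = [math.log(r) for r in rates]
    mean_year = sum(years) / n
    mean_log = sum(logs) / n
    slope = sum((y - mean_year) * (v - mean_log) for y, v in zip(years, logs)) / sum(
        (y - mean_year) ** 2 for y in years
    )
    return 100.0 * (math.exp(slope) - 1.0)

# Synthetic rates per 100 000 person-years, growing by exactly 2% per year.
years = list(range(1996, 2011))
rates = [16.2 * 1.02 ** (y - 1996) for y in years]
change = eapc(years, rates)
```

    Confidence intervals such as those quoted in the abstract would additionally need the standard error of the slope, which this sketch omits.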

  9. Identification of Disease Critical Genes Using Collective Meta-heuristic Approaches: An Application to Preeclampsia.

    PubMed

    Biswas, Surama; Dutta, Subarna; Acharyya, Sriyankar

    2017-12-01

    Identifying a small subset of disease critical genes out of a large size of microarray gene expression data is a challenge in computational life sciences. This paper has applied four meta-heuristic algorithms, namely, honey bee mating optimization (HBMO), harmony search (HS), differential evolution (DE) and genetic algorithm (basic version GA) to find disease critical genes of preeclampsia which affects women during gestation. Two hybrid algorithms, namely, HBMO-kNN and HS-kNN have been newly proposed here where kNN (k nearest neighbor classifier) is used for sample classification. Performances of these new approaches have been compared with other two hybrid algorithms, namely, DE-kNN and SGA-kNN. Three datasets of different sizes have been used. In a dataset, the set of genes found common in the output of each algorithm is considered here as disease critical genes. In different datasets, the percentage of classification or classification accuracy of meta-heuristic algorithms varied between 92.46 and 100%. HBMO-kNN has the best performance (99.64-100%) in almost all data sets. DE-kNN secures the second position (99.42-100%). Disease critical genes obtained here match with clinically revealed preeclampsia genes to a large extent.

  10. The Effective Resistance of the N-Cycle Graph with Four Nearest Neighbors

    NASA Astrophysics Data System (ADS)

    Chair, Noureddine

    2014-02-01

    The exact expression for the effective resistance between any two vertices of the N-cycle graph with four nearest neighbors is given. It turns out that this expression is written in terms of the effective resistance of the N-cycle graph, the square of the Fibonacci numbers, and the bisected Fibonacci numbers. As a consequence, closed form formulas for the total effective resistance, the first passage time, and the mean first passage time for the simple random walk on the N-cycle graph with four nearest neighbors are obtained. Finally, a closed form formula for the effective resistance of this graph with all first neighbors removed is obtained.
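
    The quantities discussed above can be checked numerically on small graphs. The sketch below does not reproduce the paper's closed-form Fibonacci expression; it assumes unit resistors on every edge and computes the effective resistance directly from the graph Laplacian by grounding one vertex and injecting a unit current at the other. The function names are hypothetical.

```python
from fractions import Fraction

def circulant_laplacian(n, offsets):
    """Laplacian of the graph on n vertices in which i is joined to i + s and i - s (mod n)
    for every s in offsets; each s is assumed to satisfy 0 < s < n."""
    lap = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        for s in offsets:
            for j in ((i + s) % n, (i - s) % n):
                if lap[i][j] == 0:  # skip an edge that was already added
                    lap[i][j] = Fraction(-1)
                    lap[i][i] += 1
    return lap

def effective_resistance(lap, a, b):
    """Resistance between distinct vertices a and b with unit resistors on every edge."""
    n = len(lap)
    keep = [i for i in range(n) if i != b]  # ground vertex b by dropping its row and column
    aug = [[lap[i][j] for j in keep] + [Fraction(1 if i == a else 0)] for i in keep]
    m = len(keep)
    for col in range(m):  # Gauss-Jordan elimination in exact arithmetic
        pivot = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return aug[keep.index(a)][m]  # potential at a equals the resistance for a 1 A current

cycle = circulant_laplacian(6, (1,))       # plain 6-cycle, two nearest neighbours
four_nn = circulant_laplacian(6, (1, 2))   # 6-cycle with four nearest neighbours
r_cycle = effective_resistance(cycle, 0, 3)
r_four = effective_resistance(four_nn, 0, 3)
```

    Exact fractions keep the results comparable with closed-form expressions; adding the second-neighbour edges can only lower the resistance between two vertices.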

  11. Common neighbour structure and similarity intensity in complex networks

    NASA Astrophysics Data System (ADS)

    Hou, Lei; Liu, Kecheng

    2017-10-01

    Complex systems as networks always exhibit strong regularities, implying underlying mechanisms governing their evolution. In addition to the degree preference, the similarity has been argued to be another driver for networks. Assuming a network is randomly organised without similarity preference, the present paper studies the expected number of common neighbours between vertices. A symmetrical similarity index is accordingly developed by removing such expected number from the observed common neighbours. The developed index can not only describe the similarities between vertices, but also the dissimilarities. We further apply the proposed index to measure the influence of similarity on the wiring patterns of networks. Fifteen empirical networks as well as artificial networks are examined in terms of similarity intensity and degree heterogeneity. Results on real networks indicate that social networks are strongly governed by the similarity as well as the degree preference, while the biological networks and infrastructure networks show no apparent similarity governance. Particularly, classical network models, such as the Barabási-Albert model, the Erdős-Rényi model and the Ring Lattice, cannot well describe the social networks in terms of the degree heterogeneity and similarity intensity. The findings may shed some light on the modelling and link prediction of different classes of networks.

  12. Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

    PubMed

    Sehgal, Muhammad Shoaib B; Gondal, Iqbal; Dooley, Laurence S

    2005-05-15

    Microarray data are used in a range of application areas in biology, although they often contain considerable numbers of missing values. These missing values can significantly affect subsequent statistical analysis and machine learning algorithms so there is a strong motivation to estimate these values as accurately as possible before using these algorithms. While many imputation algorithms have been proposed, more robust techniques need to be developed so that further analysis of biological data can be accurately undertaken. In this paper, an innovative missing value imputation algorithm called collateral missing value estimation (CMVE) is presented which uses multiple covariance-based imputation matrices for the final prediction of missing values. The matrices are computed and optimized using least square regression and linear programming methods. The new CMVE algorithm has been compared with existing estimation techniques including Bayesian principal component analysis imputation (BPCA), least square impute (LSImpute) and K-nearest neighbour (KNN). All these methods were rigorously tested to estimate missing values in three separate non-time series (ovarian cancer based) and one time series (yeast sporulation) dataset. Each method was quantitatively analyzed using the normalized root mean square (NRMS) error measure, covering a wide range of randomly introduced missing value probabilities from 0.01 to 0.2. Experiments were also undertaken on the yeast dataset, which comprised 1.7% actual missing values, to test the hypothesis that CMVE performed better not only for randomly occurring but also for a real distribution of missing values. The results confirmed that CMVE consistently demonstrated superior and robust estimation capability of missing values compared with other methods for both series types of data, for the same order of computational complexity. A concise theoretical framework has also been formulated to validate the improved performance of the CMVE

  13. Integration of multimodal RNA-seq data for prediction of kidney cancer survival

    PubMed Central

    Schwartz, Matt; Park, Martin; Phan, John H.; Wang, May D.

    2016-01-01

    Kidney cancer is of prominent concern in modern medicine. Predicting patient survival is critical to patient awareness and to developing proper treatment regimens. Previous prediction models built upon molecular feature analysis are limited to just gene expression data. In this study we investigate the difference in predicting five-year survival between unimodal and multimodal analysis of RNA-seq data from gene, exon, junction, and isoform modalities. Our preliminary findings report higher predictive accuracy, as measured by area under the ROC curve (AUC), for multimodal learning when compared to unimodal learning with both support vector machine (SVM) and k-nearest neighbor (KNN) methods. The results of this study justify further research on the use of multimodal RNA-seq data to predict survival for other cancer types using a larger sample size and additional machine learning methods. PMID:27532026
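
    The AUC used as the accuracy measure above can be illustrated with a short sketch. It uses the standard pairwise interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half); the function name `roc_auc` and the toy scores are hypothetical and unrelated to the study's data.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: truthy for the positive class (e.g. survived five years).
    scores: classifier outputs, higher meaning "more likely positive".
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]
    neg = scores[~labels]
    if len(pos) == 0 or len(neg) == 0:
        raise ValueError("need at least one positive and one negative sample")
    diff = pos[:, None] - neg[None, :]      # every positive/negative pair
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(pos) * len(neg)))
```

    Comparing unimodal and multimodal models then amounts to computing this value for each model's scores on the same held-out patients.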

  14. Fuzzy Temporal Logic Based Railway Passenger Flow Forecast Model

    PubMed Central

    Dou, Fei; Jia, Limin; Wang, Li; Xu, Jie; Huang, Yakun

    2014-01-01

    Passenger flow forecast is of essential importance to the organization of railway transportation and is one of the most important bases for decision-making on transportation pattern and train operation planning. Passenger flow of high-speed railway features quasi-periodic variations over short time spans and complex nonlinear fluctuation because of the existence of many influencing factors. In this study, a fuzzy temporal logic based passenger flow forecast model (FTLPFFM) is presented based on fuzzy logic relationship recognition techniques that predicts the short-term passenger flow for high-speed railway, and the forecast accuracy is also significantly improved. An applied case that uses the real-world data illustrates the precision and accuracy of FTLPFFM. For this applied case, the proposed model performs better than the k-nearest neighbor (KNN) and autoregressive integrated moving average (ARIMA) models. PMID:25431586

  15. Multiwavelet grading of prostate pathological images

    NASA Astrophysics Data System (ADS)

    Soltanian-Zadeh, Hamid; Jafari-Khouzani, Kourosh

    2002-05-01

    We have developed image analysis methods to automatically grade pathological images of the prostate. The proposed method assigns Gleason grades to images, where each image is assigned a grade between 1 and 5. This is done using features extracted from multiwavelet transformations. We extract energy and entropy features from submatrices obtained in the decomposition. Next, we apply a k-NN classifier to grade the image. To find the optimal multiwavelet basis, preprocessing, and classifier, we use features extracted by different multiwavelets with either critically sampled preprocessing or repeated row preprocessing and different k-NN classifiers and compare their performances, evaluated by total misclassification rate (TMR). To evaluate sensitivity to noise, we add white Gaussian noise to images and compare the results (TMRs). We applied the proposed methods to 100 images. We evaluated the first and second levels of decomposition using Geronimo, Hardin, and Massopust (GHM), Chui and Lian (CL), and Shen (SA4) multiwavelets. We also evaluated the k-NN classifier for k=1,2,3,4,5. Experimental results illustrate that the first level of decomposition is quite noisy. They also show that critically sampled preprocessing outperforms repeated row preprocessing and has less sensitivity to noise. Finally, comparison studies indicate that the SA4 multiwavelet and k-NN classifier (k=1) generate optimal results (with smallest TMR of 3%).
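
    The pipeline in this abstract (energy and entropy features per submatrix, k-NN grading, TMR as the score) can be sketched as follows. The multiwavelet decomposition itself is not implemented, and the feature definitions are assumptions: energy is taken as the mean squared coefficient and entropy as the Shannon entropy of the normalised squared coefficients, since the abstract does not give the formulas. All function names are hypothetical.

```python
import numpy as np
from collections import Counter

def energy_entropy(submatrix):
    """Return (energy, entropy) of one block of transform coefficients."""
    squared = np.asarray(submatrix, dtype=float) ** 2
    energy = squared.mean()
    total = squared.sum()
    if total == 0:
        return energy, 0.0
    p = squared.ravel() / total
    p = p[p > 0]                      # 0 * log(0) is treated as 0
    entropy = -np.sum(p * np.log2(p))
    return energy, entropy

def knn_predict(train_x, train_y, query, k=1):
    """Majority vote among the k training vectors closest to the query."""
    train_x = np.asarray(train_x, dtype=float)
    dists = np.linalg.norm(train_x - np.asarray(query, dtype=float), axis=1)
    nearest = np.argsort(dists, kind="stable")[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

def total_misclassification_rate(true_labels, predicted_labels):
    """Fraction of images whose predicted grade differs from the true grade."""
    true_labels = np.asarray(true_labels)
    predicted_labels = np.asarray(predicted_labels)
    return float(np.mean(true_labels != predicted_labels))
```

    With k=1, the setting the abstract reports as best, `knn_predict` simply returns the grade of the single closest training image.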

  16. Comparison between air pollution concentrations measured at the nearest monitoring station to the delivery hospital and those measured at stations nearest the residential postal code regions of pregnant women in Fukuoka.

    PubMed

    Michikawa, Takehiro; Morokuma, Seiichi; Nitta, Hiroshi; Kato, Kiyoko; Yamazaki, Shin

    2017-06-13

    Numerous earlier studies examining the association of air pollution with maternal and foetal health estimated maternal exposure to air pollutants based on the women's residential addresses. However, residential addresses, which are personally identifiable information, are not always obtainable. Since a majority of pregnant women reside near their delivery hospitals, the concentrations of air pollutants at the respective delivery hospitals may be surrogate markers of pollutant exposure at home. We compared air pollutant concentrations measured at the nearest monitoring station to Kyushu University Hospital with those measured at the closest monitoring stations to the respective residential postal code regions of pregnant women in Fukuoka. Aggregated postal code data for the home addresses of pregnant women who delivered at Kyushu University Hospital in 2014 was obtained from Kyushu University Hospital. For each of the study's 695 women who resided in Fukuoka Prefecture, we assigned pollutant concentrations measured at the nearest monitoring station to Kyushu University Hospital and pollutant concentrations measured at the nearest monitoring station to their respective residential postal code regions. Among the 695 women, 584 (84.0%) resided in the proximity of the nearest monitoring station to hospital or one of the four other stations (as the nearest stations to their respective residential postal code region) in Fukuoka city. Pearson's correlation for daily mean concentrations among the monitoring stations in Fukuoka city was strong for fine particulate matter (PM2.5), suspended particulate matter (SPM), and photochemical oxidants (Ox) (coefficients ≥0.9), but moderate for coarse particulate matter (the result of subtracting the PM2.5 from the SPM concentrations), nitrogen dioxide, and sulphur dioxide. Hospital-based and residence-based concentrations of PM2.5, SPM, and Ox were comparable. For PM2.5, SPM, and Ox, exposure estimation based on the delivery

  17. Intraspecific chemical diversity among neighbouring plants correlates positively with plant size and herbivore load but negatively with herbivore damage.

    PubMed

    Bustos-Segura, Carlos; Poelman, Erik H; Reichelt, Michael; Gershenzon, Jonathan; Gols, Rieta

    2017-01-01

    Intraspecific plant diversity can modify the properties of associated arthropod communities and plant fitness. However, it is not well understood which plant traits determine these ecological effects. We explored the effect of intraspecific chemical diversity among neighbouring plants on the associated invertebrate community and plant traits. In a common garden experiment, intraspecific diversity among neighbouring plants was manipulated using three plant populations of wild cabbage that differ in foliar glucosinolates. Plants were larger, harboured more herbivores, but were less damaged when plant diversity was increased. Glucosinolate concentration differentially correlated with generalist and specialist herbivore abundance. Glucosinolate composition correlated with plant damage, while in polycultures, variation in glucosinolate concentrations among neighbouring plants correlated positively with herbivore diversity and negatively with plant damage levels. The results suggest that intraspecific variation in secondary chemistry among neighbouring plants is important in determining the structure of the associated insect community and positively affects plant performance. © 2016 The Authors. Ecology Letters published by CNRS and John Wiley & Sons Ltd.

  18. Exchange interactions in two-state systems: rare earth pyrochlores.

    PubMed

    Curnoe, S H

    2018-06-13

    The general form of the nearest neighbour exchange interaction for rare earth pyrochlores is derived based on symmetry. Generally, the rare earth angular momentum degeneracy is lifted by the crystal electric field (CEF) into singlets and doublets. When the CEF ground state is a doublet that is well-separated from the first excited state the CEF ground state doublet can be treated as a pseudo-spin of some kind. The general form of the nearest neighbour exchange interaction for pseudo-spins on the pyrochlore lattice is derived for three different types of pseudo-spins. The methodology presented in this paper can be applied to other two-state spin systems with a high space group symmetry.

  19. Exchange interactions in two-state systems: rare earth pyrochlores

    NASA Astrophysics Data System (ADS)

    Curnoe, S. H.

    2018-06-01

    The general form of the nearest neighbour exchange interaction for rare earth pyrochlores is derived based on symmetry. Generally, the rare earth angular momentum degeneracy is lifted by the crystal electric field (CEF) into singlets and doublets. When the CEF ground state is a doublet that is well-separated from the first excited state the CEF ground state doublet can be treated as a pseudo-spin of some kind. The general form of the nearest neighbour exchange interaction for pseudo-spins on the pyrochlore lattice is derived for three different types of pseudo-spins. The methodology presented in this paper can be applied to other two-state spin systems with a high space group symmetry.

  20. Distributed Adaptive Binary Quantization for Fast Nearest Neighbor Search.

    PubMed

    Xianglong Liu; Zhujin Li; Cheng Deng; Dacheng Tao

    2017-11-01

    Hashing has proved an attractive technique for fast nearest neighbor search over big data. Compared with the projection-based hashing methods, prototype-based ones have stronger power to generate discriminative binary codes for data with complex intrinsic structure. However, existing prototype-based methods, such as spherical hashing and K-means hashing, still suffer from ineffective coding that utilizes the complete binary codes in a hypercube. To address this problem, we propose an adaptive binary quantization (ABQ) method that learns a discriminative hash function with prototypes associated with small unique binary codes. Our alternating optimization adaptively discovers the prototype set and the code set of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes, and enjoys fast training that is linear in the number of training data. We further devise a distributed framework for large-scale learning, which can significantly speed up the training of ABQ in the distributed environments that are now widely deployed in many areas. Extensive experiments on four large-scale (up to 80 million) data sets demonstrate that our method significantly outperforms state-of-the-art hashing methods, with relative performance gains of up to 58.84%.
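
    For readers unfamiliar with hashing-based search, the sketch below shows the general idea the abstract builds on: points are mapped to short binary codes and neighbours are ranked by Hamming distance. It deliberately uses plain random-hyperplane codes (a projection-based baseline), not the ABQ method, whose prototype learning is not described in enough detail here to reproduce. The function names and the seed parameter are hypothetical.

```python
import numpy as np

def random_projection_codes(points, n_bits, seed=0):
    """Binary codes from the signs of random projections (baseline, not ABQ)."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    planes = rng.standard_normal((points.shape[1], n_bits))
    return (points @ planes > 0).astype(np.uint8)

def hamming_knn(codes, query_code, k=1):
    """Indices and Hamming distances of the k codes closest to the query."""
    codes = np.asarray(codes, dtype=np.uint8)
    query_code = np.asarray(query_code, dtype=np.uint8)
    dists = np.count_nonzero(codes != query_code, axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]
```

    Because comparing bit vectors is far cheaper than computing distances in the original feature space, the quality of the codes largely determines how useful the search is, which is the problem the abstract addresses.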

  1. Application of texture analysis method for mammogram density classification

    NASA Astrophysics Data System (ADS)

    Nithya, R.; Santhi, B.

    2017-07-01

    Mammographic density is considered a major risk factor for developing breast cancer. This paper proposes an automated approach to classify breast tissue types in digital mammogram. The main objective of the proposed Computer-Aided Diagnosis (CAD) system is to investigate various feature extraction methods and classifiers to improve the diagnostic accuracy in mammogram density classification. Texture analysis methods are used to extract the features from the mammogram. Texture features are extracted by using histogram, Gray Level Co-Occurrence Matrix (GLCM), Gray Level Run Length Matrix (GLRLM), Gray Level Difference Matrix (GLDM), Local Binary Pattern (LBP), Entropy, Discrete Wavelet Transform (DWT), Wavelet Packet Transform (WPT), Gabor transform and trace transform. These extracted features are selected using Analysis of Variance (ANOVA). The features selected by ANOVA are fed into the classifiers to characterize the mammogram into two-class (fatty/dense) and three-class (fatty/glandular/dense) breast density classification. This work has been carried out by using the mini-Mammographic Image Analysis Society (MIAS) database. Five classifiers are employed namely, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Support Vector Machine (SVM). Experimental results show that ANN provides better performance than LDA, NB, KNN and SVM classifiers. The proposed methodology has achieved 97.5% accuracy for three-class and 99.37% for two-class density classification.

  2. Aesthetic preference recognition of 3D shapes using EEG.

    PubMed

    Chew, Lin Hou; Teo, Jason; Mountstephens, James

    2016-04-01

    Recognition and identification of aesthetic preference is indispensable in industrial design. Humans tend to pursue products with aesthetic values and make buying decisions based on their aesthetic preferences. Neuromarketing exists to understand consumer responses toward marketing stimuli by using imaging techniques and recognition of physiological parameters. Numerous studies have been done to understand the relationship between human, art and aesthetics. In this paper, we present a novel preference-based measurement of user aesthetics using electroencephalogram (EEG) signals for virtual 3D shapes with motion. The 3D shapes are designed to appear like bracelets, which are generated by using the Gielis superformula. EEG signals were collected by using a medical grade device, the B-Alert X10 from Advanced Brain Monitoring, with a sampling frequency of 256 Hz and resolution of 16 bits. The signals obtained when viewing 3D bracelet shapes were decomposed into alpha, beta, theta, gamma and delta rhythms by using time-frequency analysis, then classified into two classes, namely like and dislike, by using support vector machines and K-nearest neighbors (KNN) classifiers respectively. Classification accuracy of up to 80% was obtained by using KNN with the alpha, theta and delta rhythms as the features extracted from frontal channels, Fz, F3 and F4, to classify two classes, like and dislike.

  3. The contribution of neighbouring countries to pesticide levels in Dutch surface waters.

    PubMed

    Van 'T Zelfde, M; Tamis, W L M; Vijver, M G; De Snoo, G R

    2011-01-01

    Compared with other European countries, Dutch consumption of pesticides is high, particularly in agriculture, with many of the compounds found in surface waters in high concentrations and various standards being exceeded. Surface water quality is routinely monitored and the data obtained are published in the Dutch Pesticides Atlas. One important mechanism for reducing pesticide levels in surface waters is authorisation policy, which proceeds on the assumption that the pollution concerned has taken place in the Netherlands. The country straddles the delta of several major European rivers, however, and as river basins do not respect national borders some of the water quality problems will derive from neighbouring countries. Against this background the general question addressed in this article is the following: To what extent do countries neighbouring on the Netherlands contribute to pesticide pollution of Dutch surface waters? To answer this question, data from the Pesticides Atlas for the period 2005-2009 were used. Border zones with Belgium and Germany were defined and the data for these zones compared with Dutch data. In the analyses, due allowance was also made for authorised and non-authorised compounds and for differences between flowing and stagnant waters. Monitoring efforts in the border zones and in the Netherlands were also characterised, showing that efforts in the former are similar to those in the rest of the country. In the border zone with Belgium the relative number of non-authorised pesticides exceeding the standards is clearly higher than in the rest of the Netherlands. These exceedances are observed mainly in flowing waters. In contrast, there is no difference in the relative number of standard-exceeding measurements between the border zones and the rest of the Netherlands. In the boundary zones the array of standard-exceeding compounds clearly deviates from that in the rest of the Netherlands, with compounds authorised in the neighbouring

  4. Thiamethoxam as a seed treatment alters the physiological response of maize (Zea mays) seedlings to neighbouring weeds.

    PubMed

    Afifi, Maha; Lee, Elizabeth; Lukens, Lewis; Swanton, Clarence

    2015-04-01

    Thiamethoxam is a broad-spectrum neonicotinoid insecticide that, when applied to seed, has been observed to enhance seedling vigour under environmental stress conditions. Stress created by the presence of neighbouring weeds is known to trigger the accumulation of hydrogen peroxide (H2O2) in maize seedling tissue. No previous work has explored the effect of thiamethoxam as a seed treatment on the physiological response of maize seedlings emerging in the presence of neighbouring weeds. Thiamethoxam was found to enhance seedling vigour and to overcome the expression of typical shade avoidance characteristics in the presence of neighbouring weeds. These results were attributed to maintenance of the total phenolics content, 1,1-diphenyl-2-picryl-hydrazyl (DPPH) radical scavenging activity and anthocyanin and lignin contents. These findings were also associated with the activation of scavenging genes, which reduced the accumulation of H2O2 and the subsequent damage caused by lipid peroxidation in maize seedlings originating from treated seeds even when exposed to neighbouring weeds. These results suggest the possibility of exploring new chemistries and modes of action as novel seed treatments to upregulate free radical scavenging genes and to maintain the antioxidant system within plants. Such an approach may provide an opportunity to enhance crop competitiveness with weeds. © 2014 Society of Chemical Industry.

  5. Estimation of Carcinogenicity using Hierarchical Clustering and Nearest Neighbor Methodologies

    EPA Science Inventory

    Previously a hierarchical clustering (HC) approach and a nearest neighbor (NN) approach were developed to model acute aquatic toxicity end points. These approaches were developed to correlate the toxicity for large, noncongeneric data sets. In this study these approaches applie...

  6. Plant colonization, succession and ecosystem development on Surtsey with reference to neighbouring islands

    NASA Astrophysics Data System (ADS)

    Magnússon, B.; Magnússon, S. H.; Ólafsson, E.; Sigurdsson, B. D.

    2014-06-01

    Plant colonization and succession on Surtsey volcanic island, formed in 1963, have been closely followed. In 2013, a total of 69 vascular plant species had been discovered on the island; of these 59 were present and 39 had established viable populations. Surtsey had more than twice the species of any of the comparable neighbouring islands and all their common species had established on Surtsey. The first colonizers were dispersed by sea, but after 1985 bird-dispersal became the principal pathway with the formation of a seagull colony on the island and consequent site amelioration. This allowed wind-dispersed species to establish after 1990. Since 2007 there has been a net loss of species on the island. A study of plant succession, soil formation and invertebrate communities in permanent plots on Surtsey and on two older neighbouring islands (plants and soil) has revealed that seabirds, through their transfer of nutrients from sea to land, are major drivers of development of these ecosystems. In the area impacted by seagulls dense grassland swards have developed and plant cover, species richness, diversity, plant biomass and soil carbon become significantly higher than in low-impact areas, which remained relatively barren. A similar difference was found for the invertebrate fauna. After 2000, the vegetation of the oldest part of the seagull colony became increasingly dominated by long-lived, rhizomatous grasses (Festuca, Poa, Leymus) with a decline in species richness and diversity. Old grasslands of the neighbouring islands Elliðaey (puffin colony, high nutrient input) and Heimaey (no seabirds, low nutrient input) contrasted sharply. The puffin grassland of Elliðaey was very dense and species-poor. Dominated by Festuca and Poa, it was very similar to the seagull grassland developing on Surtsey. The Heimaey grassland was significantly higher in species richness and diversity, and had a more even cover of dominants (Festuca/Agrostis/Ranunculus). We forecast that

  7. Observation of Dipolar Spin-Exchange Interactions with Polar Molecules in a Lattice

    DTIC Science & Technology

    2013-01-01

    extend beyond nearest neighbours. This allows coherent spin dynamics to persist even for gases with relatively high entropy and low lattice filling...dynamics to persist even for gases with relatively high entropy and low lattice filling. While measured effects of dipolar interactions in ultracold...limits superexchange to nearest-neighbor interactions and requires extremely low temperature and entropy. In contrast, long-range dipolar

  8. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications

    PubMed Central

    Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A.; Anaya, Maribel

    2017-01-01

    Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed. PMID:28230796

  9. A Sensor Data Fusion System Based on k-Nearest Neighbor Pattern Classification for Structural Health Monitoring Applications.

    PubMed

    Vitola, Jaime; Pozo, Francesc; Tibaduiza, Diego A; Anaya, Maribel

    2017-02-21

    Civil and military structures are susceptible and vulnerable to damage due to the environmental and operational conditions. Therefore, the implementation of technology to provide robust solutions in damage identification (by using signals acquired directly from the structure) is a requirement to reduce operational and maintenance costs. In this sense, the use of sensors permanently attached to the structures has demonstrated a great versatility and benefit since the inspection system can be automated. This automation is carried out with signal processing tasks with the aim of a pattern recognition analysis. This work presents the detailed description of a structural health monitoring (SHM) system based on the use of a piezoelectric (PZT) active system. The SHM system includes: (i) the use of a piezoelectric sensor network to excite the structure and collect the measured dynamic response, in several actuation phases; (ii) data organization; (iii) advanced signal processing techniques to define the feature vectors; and finally; (iv) the nearest neighbor algorithm as a machine learning approach to classify different kinds of damage. A description of the experimental setup, the experimental validation and a discussion of the results from two different structures are included and analyzed.

  10. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Chen, Kepi; Tang, Jing; Chen, Yan

    The effects of the calcination temperature of (K0.5Na0.5)NbO3 (KNN) powder on the sintering and piezoelectric properties of KNN ceramics have been investigated in this report. KNN powders are synthesized via the solid-state approach. Scanning electron microscopy and X-ray diffraction characterizations indicate that the incomplete reaction at 700 °C and 750 °C calcination results in the compositional inhomogeneity of the K-rich and Na-rich phases while the orthorhombic single phase is obtained after calcination at 900 °C. During the sintering, the presence of the liquid K-rich phase due to the lower melting point has a significant impact on the densification, the abnormal grain growth and the deteriorated piezoelectric properties. From the standpoint of piezoelectric properties, the optimal calcination temperature for KNN ceramics is determined to be 800 °C, with piezoelectric constant d33 = 128.3 pC/N, planar electromechanical coupling coefficient kp = 32.2%, mechanical quality factor Qm = 88, and dielectric loss tan δ = 2.1%.

  11. The effect of competition from neighbours on stomatal conductance in lettuce and tomato plants.

    PubMed

    Vysotskaya, Lidiya; Wilkinson, Sally; Davies, William J; Arkhipova, Tatyana; Kudoyarova, Guzel

    2011-05-01

    Competition decreased transpiration from young lettuce plants after 2 days, before any reductions in leaf area became apparent, and stomatal conductance (g(s)) of lettuce and tomato plants was also reduced. Stomatal closure was not due to hydraulic signals or competition for nutrients, as soil water content, leaf water status and leaf nitrate concentrations were unaffected by neighbours. Competition-induced stomatal closure was absent in an abscisic acid (ABA)-deficient tomato mutant, flacca, indicating a fundamental involvement of ABA. Although tomato xylem sap ABA concentrations were unaffected by the presence of neighbours, ABA/pH-based stomatal modulation is still likely to underlie the response to competition, as soil and xylem sap alkalization was observed in competing plants. Competition also modulated leaf ethylene production, and treatment of lettuce plants with an ethylene perception inhibitor (1-methylcyclopropene) diminished the difference in g(s) between single and competing plants grown in a controlled environment room, but increased it in plants grown in the greenhouse: ethylene altered the extent of the stomatal response to competition. Effects of competition on g(s) are discussed in terms of the detection of the absence of neighbours: increases in g(s) and carbon fixation may allow faster initial space occupancy within an emerging community/crop. © 2011 Blackwell Publishing Ltd.

  12. Frequency Diverse Array Radar: Signal Characterization and Measurement Accuracy

    DTIC Science & Technology

    2010-03-25

    $W_N^{kn}$ (C.14) and $f[n] = \sum_{k=0}^{N-1} F[k]\, W_N^{-kn}$ (C.15) where $f[n] = f(t)|_{t=nT_s}$, $F[k] = F(\omega)|_{\omega=k\Delta\omega}$, $W_N = \exp(-j2\pi/N)$, $T_s = f_s^{-1}$, $\Delta\omega = 2\pi/(NT_s)$, $f_s$ is the...Properties of the MIMO radar ambiguity function”. Proceedings 2008 International Conference on Acoustics, Speech and Signal Processing, 2309–2312. April 2008
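
    The fragment above appears to be the discrete Fourier transform pair, with (C.14) the forward transform and (C.15) the inverse. The sketch below evaluates both sums directly with the kernel W_N = exp(-j2π/N). The excerpt does not show where the 1/N normalisation sits, so placing it on the inverse is an assumption made here (it matches the convention of `numpy.fft`); the function names are hypothetical.

```python
import numpy as np

def dft(f):
    """Forward transform: F[k] = sum over n of f[n] * W_N**(k*n)."""
    f = np.asarray(f, dtype=complex)
    N = len(f)
    n = np.arange(N)
    W = np.exp(-2j * np.pi / N)
    return np.array([np.sum(f * W ** (k * n)) for k in range(N)])

def idft(F):
    """Inverse transform: f[n] = (1/N) * sum over k of F[k] * W_N**(-k*n).

    The 1/N factor is assumed to belong here; the excerpt does not show it.
    """
    F = np.asarray(F, dtype=complex)
    N = len(F)
    k = np.arange(N)
    W = np.exp(-2j * np.pi / N)
    return np.array([np.sum(F * W ** (-(k * n))) for n in range(N)]) / N
```

    The direct sums cost O(N^2) operations and are meant only to mirror the formulas; a fast Fourier transform would be used in practice.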

  13. Classification of EEG Signals Based on Pattern Recognition Approach.

    PubMed

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a "pattern recognition" approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90-7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11-89.63% and 91.60-81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy.
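
    Two preprocessing steps named in this abstract, normalisation to zero mean and unit variance and ranking by Fisher's discriminant ratio (FDR), are easy to sketch. The two-class FDR below uses a common textbook form, the squared difference of class means divided by the sum of class variances, computed per feature; the abstract does not state which variant the authors used, so treat the formula and the function names as assumptions.

```python
import numpy as np

def zscore(features):
    """Scale each feature column to zero mean and unit variance."""
    features = np.asarray(features, dtype=float)
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    std[std == 0] = 1.0               # leave constant columns at zero
    return (features - mean) / std

def fisher_discriminant_ratio(features, labels):
    """Per-feature FDR for two classes: (mu_a - mu_b)^2 / (var_a + var_b)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    a = features[labels == classes[0]]
    b = features[labels == classes[1]]
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0) + b.var(axis=0)
    # Zero within-class variance gives inf (or nan when the means also match).
    with np.errstate(divide="ignore", invalid="ignore"):
        return numerator / denominator
```

    Features with a high ratio separate the two conditions (for example task versus baseline) well and would be kept; low-ratio features would be dropped before classification.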

  14. Classification of EEG Signals Based on Pattern Recognition Approach

    PubMed Central

    Amin, Hafeez Ullah; Mumtaz, Wajid; Subhani, Ahmad Rauf; Saad, Mohamad Naufal Mohamad; Malik, Aamir Saeed

    2017-01-01

    Feature extraction is an important step in the process of electroencephalogram (EEG) signal classification. The authors propose a “pattern recognition” approach that discriminates EEG signals recorded during different cognitive conditions. Wavelet based feature extraction such as, multi-resolution decompositions into detailed and approximate coefficients as well as relative wavelet energy were computed. Extracted relative wavelet energy features were normalized to zero mean and unit variance and then optimized using Fisher's discriminant ratio (FDR) and principal component analysis (PCA). A high density EEG dataset validated the proposed method (128-channels) by identifying two classifications: (1) EEG signals recorded during complex cognitive tasks using Raven's Advance Progressive Metric (RAPM) test; (2) EEG signals recorded during a baseline task (eyes open). Classifiers such as, K-nearest neighbors (KNN), Support Vector Machine (SVM), Multi-layer Perceptron (MLP), and Naïve Bayes (NB) were then employed. Outcomes yielded 99.11% accuracy via SVM classifier for coefficient approximations (A5) of low frequencies ranging from 0 to 3.90 Hz. Accuracy rates for detailed coefficients were 98.57 and 98.39% for SVM and KNN, respectively; and for detailed coefficients (D5) deriving from the sub-band range (3.90–7.81 Hz). Accuracy rates for MLP and NB classifiers were comparable at 97.11–89.63% and 91.60–81.07% for A5 and D5 coefficients, respectively. In addition, the proposed approach was also applied on public dataset for classification of two cognitive tasks and achieved comparable classification results, i.e., 93.33% accuracy with KNN. The proposed scheme yielded significantly higher classification performances using machine learning classifiers compared to extant quantitative feature extraction. These results suggest the proposed feature extraction method reliably classifies EEG signals recorded during cognitive tasks with a higher degree of accuracy. PMID

  15. Parallel exploitation of a spatial-spectral classification approach for hyperspectral images on RVC-CAL

    NASA Astrophysics Data System (ADS)

    Lazcano, R.; Madroñal, D.; Fabelo, H.; Ortega, S.; Salvador, R.; Callicó, G. M.; Juárez, E.; Sanz, C.

    2017-10-01

    Hyperspectral Imaging (HI) assembles high resolution spectral information from hundreds of narrow bands across the electromagnetic spectrum, thus generating 3D data cubes in which each pixel gathers the spectral information of the reflectance of every spatial pixel. As a result, each image is composed of large volumes of data, which turns its processing into a challenge, as performance requirements have been continuously tightened. For instance, new HI applications demand real-time responses. Hence, parallel processing becomes a necessity to achieve this requirement, so the intrinsic parallelism of the algorithms must be exploited. In this paper, a spatial-spectral classification approach has been implemented using a dataflow language known as RVC-CAL. This language represents a system as a set of functional units, and its main advantage is that it simplifies the parallelization process by mapping the different blocks over different processing units. The spatial-spectral classification approach aims at refining the classification results previously obtained by using a K-Nearest Neighbors (KNN) filtering process, in which both the pixel spectral value and the spatial coordinates are considered. To do so, KNN needs two inputs: a one-band representation of the hyperspectral image and the classification results provided by a pixel-wise classifier. Thus, the spatial-spectral classification algorithm is divided into three different stages: a Principal Component Analysis (PCA) algorithm for computing the one-band representation of the image, a Support Vector Machine (SVM) classifier, and the KNN-based filtering algorithm. The parallelization of these algorithms shows promising results in terms of computational time, as mapping them onto different cores yields a speedup of 2.69x when using 3 cores. Consequently, experimental results demonstrate that real-time processing of hyperspectral images is achievable.
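
    The KNN-based filtering stage described above takes a one-band image and a pixel-wise label map as inputs. The sketch below shows one way such a filter could work, under loud assumptions: neighbours are found by brute force in a (band value, row, column) feature space, labels are refined by majority vote, and the names knn_filter and spatial_weight are hypothetical. The abstract does not say how the neighbours' results are combined, so the voting rule here is an illustration only.

```python
from collections import Counter


def knn_filter(one_band, labels, k=5, spatial_weight=1.0):
    """Refine a label map by majority vote over K nearest pixels.

    one_band: 2D list with the one-band (e.g. first PCA component) value
    labels:   2D list with the pixel-wise classifier output
    Distance mixes the band value with weighted spatial coordinates
    (assumed feature space; each pixel counts as its own neighbour).
    """
    rows, cols = len(one_band), len(one_band[0])
    pixels = [
        (one_band[r][c], spatial_weight * r, spatial_weight * c, labels[r][c])
        for r in range(rows)
        for c in range(cols)
    ]
    refined = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            q = (one_band[r][c], spatial_weight * r, spatial_weight * c)
            # Brute-force search: sort all pixels by squared distance to q.
            nearest = sorted(
                pixels,
                key=lambda p: (p[0] - q[0]) ** 2
                + (p[1] - q[1]) ** 2
                + (p[2] - q[2]) ** 2,
            )[:k]
            votes = Counter(p[3] for p in nearest)
            refined[r][c] = votes.most_common(1)[0][0]
    return refined
```

    Each output pixel depends only on the two read-only input maps, so the outer loops can be split across processing units, which is the kind of intrinsic parallelism the abstract refers to.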

  16. Detecting Paroxysmal Coughing from Pertussis Cases Using Voice Recognition Technology

    PubMed Central

    Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H.; Polgreen, Philip M.

    2013-01-01

    Background: Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. Methods: We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 kHz, a frame duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. Results: After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Conclusion: Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms. PMID:24391730
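
    The feature construction described in the Methods (three sections in proportion 3-4-3, 13 averaged MFCCs per section, 39 elements in total) can be sketched as below. The MFCC extraction itself is assumed to have been done already; how the authors rounded section boundaries is not stated, so the rounding here is an assumption, and the function names are hypothetical.

```python
def section_bounds(n_frames, proportions=(3, 4, 3)):
    """Split frame indices [0, n_frames) into consecutive sections.

    Boundaries are placed at the rounded cumulative proportions
    (assumed rounding rule).
    """
    total = sum(proportions)
    bounds, start, cumulative = [], 0, 0
    for p in proportions:
        cumulative += p
        end = round(n_frames * cumulative / total)
        bounds.append((start, end))
        start = end
    return bounds


def cough_feature_vector(mfcc_frames, proportions=(3, 4, 3)):
    """Average each MFCC within each section and concatenate.

    mfcc_frames: list of frames, each a list of coefficients
    (13 per frame gives a 3 * 13 = 39-element vector).
    """
    n_coeffs = len(mfcc_frames[0])
    features = []
    for start, end in section_bounds(len(mfcc_frames), proportions):
        section = mfcc_frames[start:end]
        if not section:
            raise ValueError("recording too short for the requested sections")
        for c in range(n_coeffs):
            features.append(sum(frame[c] for frame in section) / len(section))
    return features
```

    Averaging within fixed proportions gives every cough a vector of the same length regardless of its duration, which is what a distance-based classifier such as KNN requires.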

  17. The Emotion Recognition System Based on Autoregressive Model and Sequential Forward Feature Selection of Electroencephalogram Signals

    PubMed Central

    Hatamikia, Sepideh; Maghooli, Keivan; Nasrabadi, Ali Motie

    2014-01-01

    Electroencephalogram (EEG) is one of the useful biological signals to distinguish different brain diseases and mental states. In recent years, detecting different emotional states from biological signals has attracted more attention from researchers, and several feature extraction methods and classifiers have been suggested to recognize emotions from EEG signals. In this research, we introduce an emotion recognition system using an autoregressive (AR) model, sequential forward feature selection (SFS) and a K-nearest neighbor (KNN) classifier using EEG signals during emotional audio-visual inductions. The main purpose of this paper is to investigate the performance of AR features in the classification of emotional states. To achieve this goal, a distinguished AR method (Burg's method) based on Levinson-Durbin's recursive algorithm is used and AR coefficients are extracted as feature vectors. In the next step, two different feature selection methods based on the SFS algorithm and the Davies–Bouldin index are used in order to decrease the complexity of computing and redundancy of features; then, three different classifiers, namely KNN, quadratic discriminant analysis and linear discriminant analysis, are used to discriminate two and three different classes of valence and arousal levels. The proposed method is evaluated with EEG signals from the available database for emotion analysis using physiological signals, which are recorded from 32 participants during 40 one-minute audio-visual inductions. According to the results, AR features are efficient to recognize emotional states from EEG signals, and KNN performs better than the two other classifiers in discriminating both two and three valence/arousal classes. The results also show that the SFS method improves accuracies by almost 10-15% as compared to Davies–Bouldin based feature selection. The best accuracies are 72.33% and 74.20% for two classes of valence and arousal and 61.10% and 65.16% for three classes, respectively. PMID:25298928
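
    Sequential forward selection, as used above, is a greedy wrapper method: starting from an empty set, it repeatedly adds the single feature that most improves a score. The sketch below shows the generic loop only; the score callable is a hypothetical stand-in for whatever criterion is used (for example, cross-validated KNN accuracy on the AR coefficients), and nothing here reproduces the authors' settings.

```python
def sequential_forward_selection(n_features, score, n_select):
    """Greedy forward selection of feature indices.

    n_features: total number of candidate features
    score:      callable taking a list of feature indices and returning
                a number to maximize (hypothetical; e.g. classifier
                accuracy estimated on held-out data)
    n_select:   how many features to keep
    """
    if not 0 < n_select <= n_features:
        raise ValueError("n_select must be between 1 and n_features")
    selected = []
    remaining = list(range(n_features))
    while len(selected) < n_select:
        # Try each remaining feature alongside those already chosen.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

    Because each step evaluates the candidate together with the features already chosen, a feature that is redundant with the current subset tends not to be picked, which matches the stated aim of reducing redundancy.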

  18. Detecting paroxysmal coughing from pertussis cases using voice recognition technology.

    PubMed

    Parker, Danny; Picone, Joseph; Harati, Amir; Lu, Shuang; Jenkyns, Marion H; Polgreen, Philip M

    2013-01-01

    Pertussis is highly contagious; thus, prompt identification of cases is essential to control outbreaks. Clinicians experienced with the disease can easily identify classic cases, where patients have bursts of rapid coughing followed by gasps, and a characteristic whooping sound. However, many clinicians have never seen a case, and thus may miss initial cases during an outbreak. The purpose of this project was to use voice-recognition software to distinguish pertussis coughs from croup and other coughs. We collected a series of recordings representing pertussis, croup and miscellaneous coughing by children. We manually categorized coughs as either pertussis or non-pertussis, and extracted features for each category. We used Mel-frequency cepstral coefficients (MFCC), a sampling rate of 16 kHz, a frame duration of 25 msec, and a frame rate of 10 msec. The coughs were filtered. Each cough was divided into 3 sections of proportion 3-4-3. The average of the 13 MFCCs for each section was computed and made into a 39-element feature vector used for the classification. We used the following machine learning algorithms: Neural Networks, K-Nearest Neighbor (KNN), and a 200 tree Random Forest (RF). Data were reserved for cross-validation of the KNN and RF. The Neural Network was trained 100 times, and the averaged results are presented. After categorization, we had 16 examples of non-pertussis coughs and 31 examples of pertussis coughs. Over 90% of all pertussis coughs were properly classified as pertussis. The error rates were: Type I errors of 7%, 12%, and 25% and Type II errors of 8%, 0%, and 0%, using the Neural Network, Random Forest, and KNN, respectively. Our results suggest that we can build a robust classifier to assist clinicians and the public to help identify pertussis cases in children presenting with typical symptoms.

  19. IDM-PhyChm-Ens: intelligent decision-making ensemble methodology for classification of human breast cancer using physicochemical properties of amino acids.

    PubMed

    Ali, Safdar; Majid, Abdul; Khan, Asifullah

    2014-04-01

    Development of an accurate and reliable intelligent decision-making method for the construction of cancer diagnosis system is one of the fast growing research areas of health sciences. Such decision-making system can provide adequate information for cancer diagnosis and drug discovery. Descriptors derived from physicochemical properties of protein sequences are very useful for classifying cancerous proteins. Recently, several interesting research studies have been reported on breast cancer classification. To this end, we propose the exploitation of the physicochemical properties of amino acids in protein primary sequences such as hydrophobicity (Hd) and hydrophilicity (Hb) for breast cancer classification. Hd and Hb properties of amino acids, in recent literature, are reported to be quite effective in characterizing the constituent amino acids and are used to study protein foldings, interactions, structures, and sequence-order effects. Especially, using these physicochemical properties, we observed that proline, serine, tyrosine, cysteine, arginine, and asparagine amino acids offer high discrimination between cancerous and healthy proteins. In addition, unlike traditional ensemble classification approaches, the proposed 'IDM-PhyChm-Ens' method was developed by combining the decision spaces of a specific classifier trained on different feature spaces. The different feature spaces used were amino acid composition, split amino acid composition, and pseudo amino acid composition. Consequently, we have exploited different feature spaces using Hd and Hb properties of amino acids to develop an accurate method for classification of cancerous protein sequences. We developed ensemble classifiers using diverse learning algorithms such as random forest (RF), support vector machines (SVM), and K-nearest neighbor (KNN) trained on different feature spaces. We observed that ensemble-RF, in case of cancer classification, performed better than ensemble-SVM and ensemble-KNN. Our

  20. Combination of multivariate curve resolution and multivariate classification techniques for comprehensive high-performance liquid chromatography-diode array absorbance detection fingerprints analysis of Salvia reuterana extracts.

    PubMed

    Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad

    2014-01-24

    In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems that occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap, are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented into ten common chromatographic regions using local rank analysis, and then the corresponding segments are column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones, and twenty-four components out of thirty-one were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (three PCs accounting for 76.1% of the variance) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS), which quantifies the distance separating clusters in relation to the scatter within each cluster, is calculated for the four clusters and was in the range of 1.6-5.8. These results are then

  1. Geometry of ‘standoffs’ in lattice models of the spatial Prisoner’s Dilemma and Snowdrift games

    NASA Astrophysics Data System (ADS)

    Laird, Robert A.; Goyal, Dipankar; Yazdani, Soroosh

    2013-09-01

    The Prisoner’s Dilemma and Snowdrift games are the main theoretical constructs used to study the evolutionary dynamics of cooperation. In large, well-mixed populations, mean-field models predict a stable equilibrium abundance of all defectors in the Prisoner’s Dilemma and a stable mixed-equilibrium of cooperators and defectors in the Snowdrift game. In the spatial extensions of these games, which can greatly modify the fates of populations (including allowing cooperators to persist in the Prisoner’s Dilemma, for example), lattice models are typically used to represent space, individuals play only with their nearest neighbours, and strategy replacement is a function of the differences in payoffs between neighbours. Interestingly, certain values of the cost-benefit ratio of cooperation, coupled with particular spatial configurations of cooperators and defectors, can lead to ‘global standoffs’, a situation in which all cooperator-defector neighbours have identical payoffs, leading to the development of static spatial patterns. We start by investigating the conditions that can lead to ‘local standoffs’ (i.e., in which isolated pairs of neighbouring cooperators and defectors cannot overtake one another), and then use exhaustive searches of small square lattices (4×4 and 6×6) of degree k=3, k=4, and k=6 to show that two main types of global standoff patterns, ‘periodic’ and ‘aperiodic’, are possible by tiling local standoffs across entire spatially structured populations. Of these two types, we argue that only aperiodic global standoffs are likely to be potentially attracting, i.e., capable of emerging spontaneously from non-standoff conditions. Finally, we use stochastic simulation models with comparatively large lattices (100×100) to show that global standoffs in the Prisoner’s Dilemma and Snowdrift games do indeed only (but not always) emerge under the conditions predicted by the small-lattice analysis.

  2. Use of Cell Viability Assay Data Improves the Prediction Accuracy of Conventional Quantitative Structure–Activity Relationship Models of Animal Carcinogenicity

    PubMed Central

    Zhu, Hao; Rusyn, Ivan; Richard, Ann; Tropsha, Alexander

    2008-01-01

    Background: To develop efficient approaches for rapid evaluation of chemical toxicity and human health risk of environmental compounds, the National Toxicology Program (NTP) in collaboration with the National Center for Chemical Genomics has initiated a project on high-throughput screening (HTS) of environmental chemicals. The first HTS results for a set of 1,408 compounds tested for their effects on cell viability in six different cell lines have recently become available via PubChem. Objectives: We have explored these data in terms of their utility for predicting adverse health effects of the environmental agents. Methods and results: Initially, the classification k nearest neighbor (kNN) quantitative structure–activity relationship (QSAR) modeling method was applied to the HTS data only, for a curated data set of 384 compounds. The resulting models had prediction accuracies for training, test (containing 275 compounds together), and external validation (109 compounds) sets as high as 89%, 71%, and 74%, respectively. We then asked if HTS results could be of value in predicting rodent carcinogenicity. We identified 383 compounds for which data were available from both the Berkeley Carcinogenic Potency Database and NTP–HTS studies. We found that compounds classified by HTS as “actives” in at least one cell line were likely to be rodent carcinogens (sensitivity 77%); however, HTS “inactives” were far less informative (specificity 46%). Using chemical descriptors only, kNN QSAR modeling resulted in 62.3% prediction accuracy for rodent carcinogenicity applied to this data set. Importantly, the prediction accuracy of the model was significantly improved (72.7%) when chemical descriptors were augmented by HTS data, which were regarded as biological descriptors. Conclusions: Our studies suggest that combining NTP–HTS profiles with conventional chemical descriptors could considerably improve the predictive power of computational approaches in toxicology. PMID

  3. Does rational selection of training and test sets improve the outcome of QSAR modeling?

    PubMed

    Martin, Todd M; Harten, Paul; Young, Douglas M; Muratov, Eugene N; Golbraikh, Alexander; Zhu, Hao; Tropsha, Alexander

    2012-10-22

    Prior to using a quantitative structure-activity relationship (QSAR) model for external predictions, its predictive power should be established and validated. In the absence of a true external data set, the best way to validate the predictive ability of a model is to perform its statistical external validation. In statistical external validation, the overall data set is divided into training and test sets. Commonly, this splitting is performed using random division. Rational splitting methods can divide data sets into training and test sets in an intelligent fashion. The purpose of this study was to determine whether rational division methods lead to more predictive models compared to random division. A special data splitting procedure was used to facilitate the comparison between random and rational division methods. For each toxicity end point, the overall data set was divided into a modeling set (80% of the overall set) and an external evaluation set (20% of the overall set) using random division. The modeling set was then subdivided into a training set (80% of the modeling set) and a test set (20% of the modeling set) using rational division methods and by using random division. The Kennard-Stone, minimal test set dissimilarity, and sphere exclusion algorithms were used as the rational division methods. The hierarchical clustering, random forest, and k-nearest neighbor (kNN) methods were used to develop QSAR models based on the training sets. For kNN QSAR, multiple training and test sets were generated, and multiple QSAR models were built. The results of this study indicate that models based on rational division methods generate better statistical results for the test sets than models based on random division, but the predictive power of both types of models is comparable.
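
    Of the rational division methods named above, Kennard-Stone is the simplest to illustrate. The sketch below follows the usual textbook description (start from the two most distant points, then repeatedly add the point farthest from everything already chosen); it assumes Euclidean distance on descriptor vectors and is not taken from the study's software. The function name kennard_stone is hypothetical.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by Kennard-Stone.

    points: list of equal-length numeric tuples (descriptor vectors)
    The chosen indices would form the training set; the remaining
    indices would form the test set.
    """
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")

    def dist(i, j):
        return math.dist(points[i], points[j])

    # Seed with the pair of points that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda ij: dist(*ij),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Pick the candidate whose nearest selected point is farthest away.
        best = max(remaining, key=lambda i: min(dist(i, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```

    The training set chosen this way spans the descriptor space, so test compounds tend to lie close to training compounds. That is consistent with the abstract's finding that rational division improves test-set statistics without necessarily improving external predictive power.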

  4. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xu Xiaoying; Ho, Shirley; Trac, Hy

    We investigate machine learning (ML) techniques for predicting the number of galaxies (N_gal) that occupy a halo, given the halo's properties. These types of mappings are crucial for constructing the mock galaxy catalogs necessary for analyses of large-scale structure. The ML techniques proposed here distinguish themselves from traditional halo occupation distribution (HOD) modeling as they do not assume a prescribed relationship between halo properties and N_gal. In addition, our ML approaches are only dependent on parent halo properties (like HOD methods), which is advantageous over subhalo-based approaches as identifying subhalos correctly is difficult. We test two algorithms: support vector machines (SVM) and k-nearest-neighbor (kNN) regression. We take galaxies and halos from the Millennium simulation and predict N_gal by training our algorithms on the following six halo properties: number of particles, M_200, σ_v, v_max, half-mass radius, and spin. For Millennium, our predicted N_gal values have a mean-squared error (MSE) of ≈0.16 for both SVM and kNN. Our predictions match the overall distribution of halos reasonably well and the galaxy correlation function at large scales to ≈5%-10%. In addition, we demonstrate a feature selection algorithm to isolate the halo parameters that are most predictive, a useful technique for understanding the mapping between halo properties and N_gal. Lastly, we investigate these ML-based approaches in making mock catalogs for different galaxy subpopulations (e.g., blue, red, high M_star, low M_star). Given its non-parametric nature as well as its powerful predictive and feature selection capabilities, ML offers an interesting alternative for creating mock catalogs.
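
    kNN regression, one of the two algorithms tested above, predicts a target as the average of the targets of the k closest training examples. The sketch below is a brute-force illustration with unweighted averaging and Euclidean distance; both choices, the value of k, and the function names are assumptions rather than details from the abstract. In practice the six halo properties would need to be scaled to comparable ranges first.

```python
import math


def knn_regress(train_x, train_y, query, k=3):
    """Predict a value for query as the mean target of its k nearest
    training points (unweighted, Euclidean distance)."""
    if not 0 < k <= len(train_x):
        raise ValueError("k must be between 1 and the training set size")
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mean_squared_error(predicted, actual):
    """Average squared difference between predictions and true values."""
    if len(predicted) != len(actual):
        raise ValueError("inputs must have the same length")
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
```

    No functional form linking halo properties to the galaxy count is imposed; the prediction comes entirely from nearby training halos, which is the non-parametric property the abstract contrasts with HOD modeling.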

  5. Tree diversity and the role of non-host neighbour tree species in reducing fungal pathogen infestation

    PubMed Central

    Hantsch, Lydia; Bien, Steffen; Radatz, Stine; Braun, Uwe; Auge, Harald; Bruelheide, Helge

    2014-01-01

    The degree to which plant pathogen infestation occurs in a host plant is expected to be strongly influenced by the level of species diversity among neighbouring host and non-host plant species. Since pathogen infestation can negatively affect host plant performance, it can mediate the effects of local biodiversity on ecosystem functioning. We tested the effects of tree diversity and the proportion of neighbouring host and non-host species with respect to the foliar fungal pathogens of Tilia cordata and Quercus petraea in the Kreinitz tree diversity experiment in Germany. We hypothesized that fungal pathogen richness increases while infestation decreases with increasing local tree diversity. In addition, we tested whether fungal pathogen richness and infestation are dependent on the proportion of host plant species present or on the proportion of particular non-host neighbouring tree species. Leaves of the two target species were sampled across three consecutive years with visible foliar fungal pathogens on the leaf surface being identified macro- and microscopically. Effects of diversity among neighbouring trees were analysed: (i) for total fungal species richness and fungal infestation on host trees and (ii) for infestation by individual fungal species. We detected four and five fungal species on T. cordata and Q. petraea, respectively. High local tree diversity reduced (i) total fungal species richness and infestation of T. cordata and fungal infestation of Q. petraea and (ii) infestation by three host-specialized fungal pathogen species. These effects were brought about by local tree diversity and were independent of host species proportion. In general, host species proportion had almost no effect on fungal species richness and infestation. Strong effects associated with the proportion of particular non-host neighbouring tree species on fungal species richness and infestation were, however, recorded. Synthesis. For the first time, we experimentally

  6. Tree diversity and the role of non-host neighbour tree species in reducing fungal pathogen infestation.

    PubMed

    Hantsch, Lydia; Bien, Steffen; Radatz, Stine; Braun, Uwe; Auge, Harald; Bruelheide, Helge

    2014-11-01

    The degree to which plant pathogen infestation occurs in a host plant is expected to be strongly influenced by the level of species diversity among neighbouring host and non-host plant species. Since pathogen infestation can negatively affect host plant performance, it can mediate the effects of local biodiversity on ecosystem functioning. We tested the effects of tree diversity and the proportion of neighbouring host and non-host species with respect to the foliar fungal pathogens of Tilia cordata and Quercus petraea in the Kreinitz tree diversity experiment in Germany. We hypothesized that fungal pathogen richness increases while infestation decreases with increasing local tree diversity. In addition, we tested whether fungal pathogen richness and infestation are dependent on the proportion of host plant species present or on the proportion of particular non-host neighbouring tree species. Leaves of the two target species were sampled across three consecutive years with visible foliar fungal pathogens on the leaf surface being identified macro- and microscopically. Effects of diversity among neighbouring trees were analysed: (i) for total fungal species richness and fungal infestation on host trees and (ii) for infestation by individual fungal species. We detected four and five fungal species on T. cordata and Q. petraea, respectively. High local tree diversity reduced (i) total fungal species richness and infestation of T. cordata and fungal infestation of Q. petraea and (ii) infestation by three host-specialized fungal pathogen species. These effects were brought about by local tree diversity and were independent of host species proportion. In general, host species proportion had almost no effect on fungal species richness and infestation. Strong effects associated with the proportion of particular non-host neighbouring tree species on fungal species richness and infestation were, however, recorded. Synthesis. For the first time, we experimentally

  7. Nearest pattern interaction and global pattern formation

    NASA Astrophysics Data System (ADS)

    Jeong, Seong-Ok; Moon, Hie-Tae; Ko, Tae-Wook

    2000-12-01

    We studied the effect of nearest pattern interaction on a global pattern formation in a two-dimensional space, where patterns are to grow initially from a noise in the presence of a periodic supply of energy. Although our approach is general, we found that this study is relevant in particular to the pattern formation on a periodically vibrated granular layer, as it gives a unified perspective of the experimentally observed pattern dynamics such as oscillon and stripe formations, skew-varicose and crossroll instabilities, and also a kink formation and decoration.

  8. Iterative Neighbour-Information Gathering for Ranking Nodes in Complex Networks

    NASA Astrophysics Data System (ADS)

    Xu, Shuang; Wang, Pei; Lü, Jinhu

    2017-01-01

    Designing node influence ranking algorithms can provide insights into network dynamics, functions and structures. Increasing evidence reveals that a node’s spreading ability largely depends on its neighbours. We introduce an iterative neighbour-information gathering (Ing) process with three parameters: a transformation matrix, a priori information and an iteration time. The Ing process iteratively combines a priori information from neighbours via the transformation matrix, and iteratively assigns an Ing score to each node to evaluate its influence. The algorithm is appropriate for any type of network, and includes some traditional centralities as special cases, such as degree, semi-local and LeaderRank. The Ing process converges in strongly connected networks with speed relying on the first two largest eigenvalues of the transformation matrix. Interestingly, the eigenvector centrality corresponds to a limit case of the algorithm. By comparing with eight renowned centralities, simulations of the susceptible-infected-removed (SIR) model on real-world networks reveal that the Ing can offer more exact rankings, even without a priori information. We also observe that an optimal iteration time always exists that best characterizes node influence. The proposed algorithms bridge the gaps among some existing measures, and may have potential applications in infectious disease control and the design of optimal information spreading strategies.
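
    The abstract above names three ingredients: a transformation matrix, a priori information, and an iteration count. A minimal reading of that description is repeated multiplication of the score vector by the matrix, sketched below. The exact update rule and any normalization used by the authors are not given here, so this is an assumption-laden illustration; the function name gather is hypothetical.

```python
def gather(transform, prior, iterations):
    """Repeatedly combine neighbours' scores through a matrix.

    transform:  square matrix as a list of rows; transform[i][j] is the
                weight node i gives to the score of node j
    prior:      initial score per node (the a priori information)
    iterations: number of gathering rounds
    Assumed update: scores <- transform * scores, with no normalization.
    """
    scores = list(prior)
    for _ in range(iterations):
        scores = [sum(w * s for w, s in zip(row, scores)) for row in transform]
    return scores
```

    With the adjacency matrix as the transformation and a uniform prior of ones, a single round returns each node's degree, which illustrates how degree centrality can arise as a special case. Many rounds of multiplication by a matrix are power iteration, which is the standard route to the leading eigenvector and so relates to the eigenvector-centrality limit mentioned in the abstract; in practice the scores would need rescaling each round to avoid overflow.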

  9. Adaptation to local ultraviolet radiation conditions among neighbouring Daphnia populations

    PubMed Central

    Miner, Brooks E.; Kerr, Benjamin

    2011-01-01

    Understanding the historical processes that generated current patterns of phenotypic diversity in nature is particularly challenging in subdivided populations. Populations often exhibit heritable genetic differences that correlate with environmental variables, but the non-independence among neighbouring populations complicates statistical inference of adaptation. To understand the relative influence of adaptive and non-adaptive processes in generating phenotypes requires joint evaluation of genetic and phenotypic divergence in an integrated and statistically appropriate analysis. We investigated phenotypic divergence, population-genetic structure and potential fitness trade-offs in populations of Daphnia melanica inhabiting neighbouring subalpine ponds of widely differing transparency to ultraviolet radiation (UVR). Using a combination of experimental, population-genetic and statistical techniques, we separated the effects of shared population ancestry and environmental variables in predicting phenotypic divergence among populations. We found that native water transparency significantly predicted divergence in phenotypes among populations even after accounting for significant population structure. This result demonstrates that environmental factors such as UVR can at least partially account for phenotypic divergence. However, a lack of evidence for a hypothesized trade-off between UVR tolerance and growth rates in the absence of UVR prevents us from ruling out the possibility that non-adaptive processes are partially responsible for phenotypic differentiation in this system. PMID:20943691

  10. Measuring the excitations in a new S = 1/2 quantum spin chain material with competing interactions

    NASA Astrophysics Data System (ADS)

    Rule, K. C.; Mole, R. A.; Zanardo, J.; Krause-Heuer, A.; Darwish, T.; Lerch, M.; Yu, D.

    2018-05-01

    Recently a new one-dimensional (1D) quantum spin chain system has been reported: catena-dichloro(2-Cl-3Mpy)copper(II) (where 2-Cl-3Mpy = 2-chloro-3-methylpyridine). Preliminary calculations and bulk magnetic property measurements indicate that this system does not undergo magnetic ordering down to 1.8 K and is a prime candidate for investigating frustration in a J1/J2 system (where the nearest neighbour interactions, J1, are ferromagnetic and the next nearest neighbour interactions, J2, are antiferromagnetic). Calculations predicted three possible magnetic interaction strengths for J1 below 6 meV depending on the orientation of the ligand. For one of the predicted J1 values, the existence of a quantum critical point is implied. A deuterated sample of catena-dichloro(2-Cl-3Mpy)copper(II) was synthesised and the excitations measured using inelastic neutron scattering. Scattering indicated the most likely scenario involves spin-chains where each chain consists of only one of the three possible magnetic excitations in this material, rather than the completely random array of exchange interactions within each chain as predicted by Herringer et al (2014 Chem. Eur. J. 20 8355–62). This indicates the possibility of tuning the chemical structure to favour a system which may exhibit a quantum critical point.

  11. Measuring the excitations in a new S = 1/2 quantum spin chain material with competing interactions.

    PubMed

    Rule, K C; Mole, R A; Zanardo, J; Krause-Heuer, A; Darwish, T; Lerch, M; Yu, D

    2018-05-31

    Recently a new one-dimensional (1D) quantum spin chain system has been reported: catena-dichloro(2-Cl-3Mpy)copper(II), (where 2-Cl-3Mpy=2-chloro-3-methylpyridine). Preliminary calculations and bulk magnetic property measurements indicate that this system does not undergo magnetic ordering down to 1.8 K and is a prime candidate for investigating frustration in a J1/J2 system (where the nearest neighbour interactions, J1, are ferromagnetic and the next nearest neighbour interactions, J2, are antiferromagnetic). Calculations predicted three possible magnetic interaction strengths for J1 below 6 meV depending on the orientation of the ligand. For one of the predicted J1 values, the existence of a quantum critical point is implied. A deuterated sample of catena-dichloro(2-Cl-3Mpy)copper(II) was synthesised and the excitations measured using inelastic neutron scattering. Scattering indicated the most likely scenario involves spin-chains where each chain consists of only one of the three possible magnetic excitations in this material, rather than the completely random array of exchange interactions within each chain as predicted by Herringer et al (2014 Chem. Eur. J. 20 8355-62). This indicates the possibility of tuning the chemical structure to favour a system which may exhibit a quantum critical point.

  12. Sintering of Lead-Free Piezoelectric Sodium Potassium Niobate Ceramics

    PubMed Central

    Malič, Barbara; Koruza, Jurij; Hreščak, Jitka; Bernard, Janez; Wang, Ke; Fisher, John G.; Benčan, Andreja

    2015-01-01

    The potassium sodium niobate, K0.5Na0.5NbO3, solid solution (KNN) is considered as one of the most promising, environment-friendly, lead-free candidates to replace highly efficient, lead-based piezoelectrics. Since the first reports of KNN, it has been recognized that obtaining phase-pure materials with a high density and a uniform, fine-grained microstructure is a major challenge. For this reason the present paper reviews the different methods for consolidating KNN ceramics. The difficulties involved in the solid-state synthesis of KNN powder, i.e., obtaining phase purity, the stoichiometry of the perovskite phase, and the chemical homogeneity, are discussed. The solid-state sintering of stoichiometric KNN is characterized by poor densification and an extremely narrow sintering-temperature range, which is close to the solidus temperature. A study of the initial sintering stage revealed that coarsening of the microstructure without densification contributes to a reduction of the driving force for sintering. The influences of the (K + Na)/Nb molar ratio, the presence of a liquid phase, chemical modifications (doping, complex solid solutions) and different atmospheres (i.e., defect chemistry) on the sintering are discussed. Special sintering techniques, such as pressure-assisted sintering and spark-plasma sintering, can be effective methods for enhancing the density of KNN ceramics. The sintering behavior of KNN is compared to that of a representative piezoelectric lead zirconate titanate (PZT). PMID:28793702

  13. Assessing the varietal origin of extra-virgin olive oil using liquid chromatography fingerprints of phenolic compound, data fusion and chemometrics.

    PubMed

    Bajoub, Aadil; Medina-Rodríguez, Santiago; Gómez-Romero, María; Ajal, El Amine; Bagur-González, María Gracia; Fernández-Gutiérrez, Alberto; Carrasco-Pancorbo, Alegría

    2017-01-15

    High Performance Liquid Chromatography (HPLC) with diode array (DAD) and fluorescence (FLD) detection was used to acquire the fingerprints of the phenolic fraction of monovarietal extra-virgin olive oils (extra-VOOs) collected over three consecutive crop seasons (2011/2012-2013/2014). The chromatographic fingerprints of 140 extra-VOO samples, processed from olive fruits of seven olive varieties, were recorded and statistically treated for varietal authentication purposes. First, the DAD and FLD chromatographic-fingerprint datasets were processed separately and, subsequently, were joined using "Low-level" and "Mid-Level" data fusion methods. After a preliminary examination by principal component analysis (PCA), three supervised pattern recognition techniques, Partial Least Squares Discriminant Analysis (PLS-DA), Soft Independent Modeling of Class Analogies (SIMCA) and K-Nearest Neighbors (k-NN), were applied to the four chromatographic-fingerprinting matrices. The classification models built were very sensitive and selective, showing considerably good recognition and prediction abilities. The combination "chromatographic dataset+chemometric technique" allowing the most accurate classification for each monovarietal extra-VOO was highlighted. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Geographical origin discrimination of lentils (Lens culinaris Medik.) using 1H NMR fingerprinting and multivariate statistical analyses.

    PubMed

    Longobardi, Francesco; Innamorato, Valentina; Di Gioia, Annalisa; Ventrella, Andrea; Lippolis, Vincenzo; Logrieco, Antonio F; Catucci, Lucia; Agostiano, Angela

    2017-12-15

    Lentil samples coming from two different countries, i.e. Italy and Canada, were analysed using untargeted 1H NMR fingerprinting in combination with chemometrics in order to build models able to classify them according to their geographical origin. To this end, Soft Independent Modelling of Class Analogy (SIMCA), k-Nearest Neighbor (k-NN), Principal Component Analysis followed by Linear Discriminant Analysis (PCA-LDA) and Partial Least Squares-Discriminant Analysis (PLS-DA) were applied to the NMR data and the results were compared. The best combination of average recognition (100%) and cross-validation prediction abilities (96.7%) was obtained for the PCA-LDA. All the statistical models were validated both by using a test set and by carrying out a Monte Carlo Cross Validation: the obtained performances were found to be satisfactory for all the models, with prediction abilities higher than 95%, demonstrating the suitability of the developed methods. Finally, the metabolites that mostly contributed to the lentil discrimination were indicated. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Lactobacillus fabifermentans sp. nov. and Lactobacillus cacaonum sp. nov., isolated from Ghanaian cocoa fermentations.

    PubMed

    De Bruyne, Katrien; Camu, Nicholas; De Vuyst, Luc; Vandamme, Peter

    2009-01-01

    Two Gram-positive bacterial strains, LMG 24284T and LMG 24285T, were isolated from different spontaneous cocoa bean heap fermentations in Ghana. Analysis of their 16S rRNA gene sequences indicated that they were members of the Lactobacillus plantarum and Lactobacillus salivarius species groups, respectively. DNA-DNA hybridization experiments with their nearest phylogenetic neighbours demonstrated that both strains represented novel species that could be differentiated from their nearest neighbours by pheS sequence analysis, whole-cell protein electrophoresis, fluorescent amplified fragment length polymorphism analysis and biochemical characterization. Therefore, two novel Lactobacillus species are proposed, Lactobacillus fabifermentans sp. nov. (type strain LMG 24284T =DSM 21115T) and Lactobacillus cacaonum sp. nov. (type strain LMG 24285T =DSM 21116T).

  16. Multi-strategy based quantum cost reduction of linear nearest-neighbor quantum circuit

    NASA Astrophysics Data System (ADS)

    Tan, Ying-ying; Cheng, Xue-yun; Guan, Zhi-jin; Liu, Yang; Ma, Haiying

    2018-03-01

    With the development of reversible and quantum computing, the study of reversible and quantum circuits has also developed rapidly. Due to physical constraints, most quantum circuits require quantum gates to interact on adjacent quantum bits. However, many existing nearest-neighbor quantum circuits have a large quantum cost. Therefore, how to effectively reduce quantum cost is becoming a popular research topic. In this paper, we propose multiple optimization strategies to reduce the quantum cost of the circuit; that is, we reduce quantum cost through MCT gate decomposition, nearest-neighbor arrangement and circuit simplification, respectively. The experimental results show that the proposed strategies can effectively reduce the quantum cost, and the maximum optimization rate is 30.61% compared to the corresponding results.

  17. NMR-Based Metabolomic Study on Isatis tinctoria: Comparison of Different Accessions, Harvesting Dates, and the Effect of Repeated Harvesting.

    PubMed

    Guldbrandsen, Niels; Kostidis, Sarantos; Schäfer, Hartmut; De Mieri, Maria; Spraul, Manfred; Skaltsounis, Alexios-Leandros; Mikros, Emmanuel; Hamburger, Matthias

    2015-05-22

    Isatis tinctoria is an ancient dye and medicinal plant with potent anti-inflammatory and antiallergic properties. Metabolic differences were investigated by NMR spectroscopy of accessions from different origins that were grown under identical conditions on experimental plots. For these accessions, metabolite profiles at different harvesting dates were analyzed, and single and repeatedly harvested plants were compared. Leaf samples were shock-frozen in liquid N2 immediately after being harvested, freeze-dried, and cryomilled prior to extraction. Extracts were prepared by pressurized liquid extraction with ethyl acetate and 70% aqueous methanol. NMR spectra were analyzed using a combination of different methods of multivariate data analysis such as principal component analysis (PCA), canonical analysis (CA), and k-nearest neighbor concept (k-NN). Accessions and harvesting dates were well separated in the PCA/CA/k-NN analysis in both extracts. Pairwise statistical total correlation spectroscopy (STOCSY) revealed unsaturated fatty acids, porphyrins, carbohydrates, indole derivatives, isoprenoids, phenylpropanoids, and minor aromatic compounds as the cause of these differences. In addition, the metabolite profile was affected by the repeated harvest regime, causing a decrease of 1,5-anhydroglucitol, sucrose, unsaturated fatty acids, porphyrins, isoprenoids, and a flavonoid.

  18. Quaternion-Based Signal Analysis for Motor Imagery Classification from Electroencephalographic Signals.

    PubMed

    Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R; Guerra-Hernandez, Erick I; Almanza-Ojeda, Dora L; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J; Ibarra-Manzano, Mario A

    2016-03-05

    Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users were shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states.

  19. Quaternion-Based Signal Analysis for Motor Imagery Classification from Electroencephalographic Signals

    PubMed Central

    Batres-Mendoza, Patricia; Montoro-Sanjose, Carlos R.; Guerra-Hernandez, Erick I.; Almanza-Ojeda, Dora L.; Rostro-Gonzalez, Horacio; Romero-Troncoso, Rene J.; Ibarra-Manzano, Mario A.

    2016-01-01

    Quaternions can be used as an alternative to model the fundamental patterns of electroencephalographic (EEG) signals in the time domain. Thus, this article presents a new quaternion-based technique known as quaternion-based signal analysis (QSA) to represent EEG signals obtained using a brain-computer interface (BCI) device to detect and interpret cognitive activity. This quaternion-based signal analysis technique can extract features to represent brain activity related to motor imagery accurately in various mental states. Experimental tests in which users were shown visual graphical cues related to left and right movements were used to collect BCI-recorded signals. These signals were then classified using decision trees (DT), support vector machine (SVM) and k-nearest neighbor (KNN) techniques. The quantitative analysis of the classifiers demonstrates that this technique can be used as an alternative in the EEG-signal modeling phase to identify mental states. PMID:26959029

  20. An Individual Finger Gesture Recognition System Based on Motion-Intent Analysis Using Mechanomyogram Signal

    PubMed Central

    Ding, Huijun; He, Qing; Zhou, Yongjin; Dan, Guo; Cui, Song

    2017-01-01

    Motion-intent-based finger gesture recognition systems are crucial for many applications such as prosthesis control, sign language recognition, wearable rehabilitation systems, and human–computer interaction. In this article, a motion-intent-based finger gesture recognition system is designed to correctly identify the tapping of every finger for the first time. Two auto-event annotation algorithms are first applied and evaluated for detecting the finger tapping frame. Based on the truncated signals, the wavelet packet transform (WPT) coefficients are calculated and compressed as the features, followed by a feature selection method that improves performance by optimizing the feature set. Finally, three popular classifiers, naive Bayes (NBC), K-nearest neighbor (KNN), and support vector machine (SVM), are applied and evaluated. A recognition accuracy of up to 94% can be achieved. The design and the architecture of the system are presented with full system characterization results. PMID:29167655
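
    Several of the records above apply a K-nearest neighbor (KNN) classifier to extracted feature vectors. As a rough illustration only, the basic majority-vote rule can be sketched as follows. The function name `knn_predict`, the Euclidean distance, the two-dimensional toy features and the class labels are all assumptions made for this sketch and are not taken from any of the cited papers.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    Hypothetical helper for illustration; the cited studies may use other
    distance measures, feature sets and tie-breaking rules.
    """
    # Pair each training label with its Euclidean distance to the query.
    distances = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance so the k closest neighbours come first.
    distances.sort(key=lambda pair: pair[0])
    # Count the labels of the k closest neighbours and return the most common.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Toy feature vectors (made up) for two gesture classes.
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = ["index", "index", "index", "thumb", "thumb", "thumb"]

prediction = knn_predict(train_X, train_y, (0.15, 0.15), k=3)
print(prediction)
```

    An odd k is used here so that a vote between two classes cannot tie; in practice k is usually chosen by cross-validation.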